In graphic design and computer graphics, a drop shadow is a visual effect consisting of a drawing element which looks like the shadow of an object, giving the impression that the object is raised above the objects behind it. The drop shadow is often used for elements of a graphical user interface such as windows or menus, and for simple text. The text label for icons on desktops in many desktop environments has a drop shadow, as this effect effectively distinguishes the text from any colored background it may be in front of. A simple way of drawing a drop shadow of a rectangular object is to draw a gray or black area underneath and offset from the object. In general, a drop shadow is a copy in black or gray of the object, drawn in a slightly different position. Realism may be increased by: Darkening the colors of the pixels where the shadow casts instead of making them gray. This can be done with alpha blending the shadow with the area it is cast on. Softening the edges of the shadow. This can be done by adding Gaussian blur to the shadow's alpha channel before blending. Inset drop shadows are a type which draws the shadows inside the element. This allows the interface element to appear as if it is sunken into the interface. == Photo editing == In photo editing or photography post-production, a drop shadow may be added right beneath a model or product in the image. It is used to create contrast between the background and the subject. To add a drop shadow, retouchers use graphic editing tools like Adobe Photoshop. Drop shadows are often used as a visual effect in e-commerce. This is done to improve the presentation of product images and create depth in the image. == Use == Generally, window managers which are capable of compositing allow drop shadow effects, whereas incapable window managers do not. In some operating systems like macOS, drop shadow is used to differentiate between active and inactive windows. Websites are able to use drop shadow effects through the CSS properties box-shadow, text-shadow, and drop-shadow() filter function in filter. The first two are used for elements and text respectively, while the filter applies to the element's content, letting it support oddly shaped elements or transparent images.
Kuaishou
Kuaishou Technology is a Chinese publicly traded partly state-owned holding company based in Haidian District, Beijing, that was founded in 2011 by Hua Su (Chinese: 宿华) and Cheng Yixiao (Chinese: 程一笑). The company, listed on the Hong Kong Stock Exchange, is known for developing a mobile app for sharing users' short videos, a social network, and video special effects editor. The app is known as Kwai in many countries outside of China. It is also known as Snack Video in India, Pakistan and Indonesia. == Ownership and governance == Kuaishou's overseas team is led by the former CEO of the application 99, and staff from Google, Facebook, Netflix, and TikTok were recruited to lead the company's international expansion. The China Internet Investment Fund, a state-owned enterprise controlled by the Cyberspace Administration of China, holds a golden share ownership stake in Kuaishou. == History == Kuaishou is China's first short video platform that was developed in 2011 by engineer Hua Su and Cheng Yixiao. Prior to co-founding Kuaishou, Su Hua had worked for both Google and Baidu as a software engineer. The company is headquartered in Haidian District, Beijing. Kuaishou's predecessor "GIF Kuaishou" was founded in March 2011. GIF Kuaishou was a mobile app with which users could make and share GIF pictures. In 2013, Kuaishou became a short-video social platform. By 2013, the app had reached 100 million daily users. By 2019, it had exceeded 200 million active daily users. In March 2017, Kuaishou closed a US$350 million investment round that was led by Tencent. In January 2018, Forbes estimated the company's valuation to be US$18 billion. In April 2018, Kuaishou's app was briefly banned from Chinese app stores after China Central Television (CCTV) reported on the platform popularizing videos of teenage mothers. In 2019, the company announced a partnership with the People's Daily, an official newspaper of the Central Committee of the Chinese Communist Party, to help it experiment with the use of artificial intelligence in news. In June 2020, following the start of the 2020–2021 China–India skirmishes, the Government of India banned Kwai along with 58 other apps, citing "data and privacy issues". In January 2021, Kuaishou announced it was planning an initial public offering (IPO) to raise approximately US$5 billion. Kuaishou's stock completed its first day of trading at $300 Hong Kong dollars (HKD) (US$38.70), more than doubling its initial offer price, and causing its market value to rise to over $1 trillion HKD (US$159 billion). In February 2021, Kuaishou made a debut on the Hong Kong Stock Exchange, with its shares soaring by 194% at the opening. The company subsequently encountered major setbacks as a result of heightened regulatory restrictions on Chinese internet firms, which contributed to its share price falling by nearly 80% from its post-IPO peak. By December 2021, Kuaishou announced a major reorganization, including the layoff of 30% of its staff, primarily targeting mid-level employees earning an annual salary of $157,000 or more. This restructuring aimed to cut costs and mitigate financial losses. In October 2022, state-owned Beijing Radio and Television Station took a minority ownership stake in Kuaishou. In April 2024, a Financial Times article citing current and former Kuaishou employees stated that the company has been running an ageist redundancy programme known internally as "Limestone", culling workers in their mid-30s. In June 2024, Kuaishou and the Sichuan international communication center launched a branch center in São Paulo, Brazil. In June 2024, Kuaishou released its diffusion transformer text-to-video model, Kling, which they claimed could generate two minutes of video at 30 frames per second and in 1080p resolution. The model has been compared to that of OpenAI's Sora text-to-video model. It is accessible to the public on Kuaishou's video editing app KwaiCut via signing up for a waitlist with a Chinese phone number. In December 2025, Kuaishou came under a cyberattack which led to a temporary influx of violent and pornographic content. == Popularity == As of 2019, it had a worldwide user base of over 200 million, leading the "Most Downloaded" lists of the Google Play and Apple App Store in eight countries, such as Brazil, where it was introduced in 2019. Its main short-video platform competitor was Douyin, which is known as TikTok outside China. Compared to Douyin, Kuaishou is more popular with older users living outside China's Tier 1 cities. Its initial popularity came from videos of Chinese rural life. The app is particularly well known for its "rustic" aesthetic and is popular among rural people. Kuaishou also relied more on e-commerce revenue than on advertising revenue compared to its main competitor. == Reception == Kwai (as the app is called outside of China) was banned in India in 2020 along with other short video apps like TikTok. Kuaishou then released the clone SnackVideo, which was subsequently also banned. The app is one of the most popular social media platforms in Brazil, where Kuaishou partnered with creators to make telenovela style content, and appeals to football fans by working with football teams CR Flamengo and Santos FC and sponsoring the tournament Copa América. Kwai was notable in Brazil for spreading information (and misinformation) about the COVID-19 vaccine and political misinformation. === Manjiao Wenhua === "Manjiao wenhua" (慢脚文化) is a sarcasm term on Chinese internet on the unethical or illegal contents on Kuaishou. State broadcaster China Central Television (CCTV) reported that many contents are about child pregnancy. "Dating, pregnancy, bearing a child...these are strictly prohibited in the real time by a minor, but these contents can easily shown to audiences here." In addition, many students from primary or secondary schools make a pose of smoking. Wang Zhenhui (王贞会) from CUPSL stated that these kinds of bad values will give negative effects to the minors.
Contrast set learning
Contrast set learning is a form of association rule learning that seeks to identify meaningful differences between separate groups by reverse-engineering the key predictors that identify for each particular group. For example, given a set of attributes for a pool of students (labeled by degree type), a contrast set learner would identify the contrasting features between students seeking bachelor's degrees and those working toward PhD degrees. == Overview == A common practice in data mining is to classify, to look at the attributes of an object or situation and make a guess at what category the observed item belongs to. As new evidence is examined (typically by feeding a training set to a learning algorithm), these guesses are refined and improved. Contrast set learning works in the opposite direction. While classifiers read a collection of data and collect information that is used to place new data into a series of discrete categories, contrast set learning takes the category that an item belongs to and attempts to reverse engineer the statistical evidence that identifies an item as a member of a class. That is, contrast set learners seek rules associating attribute values with changes to the class distribution. They seek to identify the key predictors that contrast one classification from another. For example, an aerospace engineer might record data on test launches of a new rocket. Measurements would be taken at regular intervals throughout the launch, noting factors such as the trajectory of the rocket, operating temperatures, external pressures, and so on. If the rocket launch fails after a number of successful tests, the engineer could use contrast set learning to distinguish between the successful and failed tests. A contrast set learner will produce a set of association rules that, when applied, will indicate the key predictors of each failed tests versus the successful ones (the temperature was too high, the wind pressure was too high, etc.). Contrast set learning is a form of association rule learning. Association rule learners typically offer rules linking attributes commonly occurring together in a training set (for instance, people who are enrolled in four-year programs and take a full course load tend to also live near campus). Instead of finding rules that describe the current situation, contrast set learners seek rules that differ meaningfully in their distribution across groups (and thus, can be used as predictors for those groups). For example, a contrast set learner could ask, “What are the key identifiers of a person with a bachelor's degree or a person with a PhD, and how do people with PhD's and bachelor’s degrees differ?” Standard classifier algorithms, such as C4.5, have no concept of class importance (that is, they do not know if a class is "good" or "bad"). Such learners cannot bias or filter their predictions towards certain desired classes. As the goal of contrast set learning is to discover meaningful differences between groups, it is useful to be able to target the learned rules towards certain classifications. Several contrast set learners, such as MINWAL or the family of TAR algorithms, assign weights to each class in order to focus the learned theories toward outcomes that are of interest to a particular audience. Thus, contrast set learning can be thought of as a form of weighted class learning. === Example: Supermarket Purchases === The differences between standard classification, association rule learning, and contrast set learning can be illustrated with a simple supermarket metaphor. In the following small dataset, each row is a supermarket transaction and each "1" indicates that the item was purchased (a "0" indicates that the item was not purchased): Given this data, Association rule learning may discover that customers that buy onions and potatoes together are likely to also purchase hamburger meat. Classification may discover that customers that bought onions, potatoes, and hamburger meats were purchasing items for a cookout. Contrast set learning may discover that the major difference between customers shopping for a cookout and those shopping for an anniversary dinner are that customers acquiring items for a cookout purchase onions, potatoes, and hamburger meat (and do not purchase foie gras or champagne). == Treatment learning == Treatment learning is a form of weighted contrast-set learning that takes a single desirable group and contrasts it against the remaining undesirable groups (the level of desirability is represented by weighted classes). The resulting "treatment" suggests a set of rules that, when applied, will lead to the desired outcome. Treatment learning differs from standard contrast set learning through the following constraints: Rather than seeking the differences between all groups, treatment learning specifies a particular group to focus on, applies a weight to this desired grouping, and lumps the remaining groups into one "undesired" category. Treatment learning has a stated focus on minimal theories. In practice, treatment are limited to a maximum of four constraints (i.e., rather than stating all of the reasons that a rocket differs from a skateboard, a treatment learner will state one to four major differences that predict for rockets at a high level of statistical significance). This focus on simplicity is an important goal for treatment learners. Treatment learning seeks the smallest change that has the greatest impact on the class distribution. Conceptually, treatment learners explore all possible subsets of the range of values for all attributes. Such a search is often infeasible in practice, so treatment learning often focuses instead on quickly pruning and ignoring attribute ranges that, when applied, lead to a class distribution where the desired class is in the minority. === Example: Boston housing data === The following example demonstrates the output of the treatment learner TAR3 on a dataset of housing data from the city of Boston (a nontrivial public dataset with over 500 examples). In this dataset, a number of factors are collected for each house, and each house is classified according to its quality (low, medium-low, medium-high, and high). The desired class is set to "high", and all other classes are lumped together as undesirable. The output of the treatment learner is as follows: Baseline class distribution: low: 29% medlow: 29% medhigh: 21% high: 21% Suggested Treatment: [PTRATIO=[12.6..16), RM=[6.7..9.78)] New class distribution: low: 0% medlow: 0% medhigh: 3% high: 97% With no applied treatments (rules), the desired class represents only 21% of the class distribution. However, if one filters the data set for houses with 6.7 to 9.78 rooms and a neighborhood parent-teacher ratio of 12.6 to 16, then 97% of the remaining examples fall into the desired class (high-quality houses). == Algorithms == There are a number of algorithms that perform contrast set learning. The following subsections describe two examples. === STUCCO === The STUCCO contrast set learner treats the task of learning from contrast sets as a tree search problem where the root node of the tree is an empty contrast set. Children are added by specializing the set with additional items picked through a canonical ordering of attributes (to avoid visiting the same nodes twice). Children are formed by appending terms that follow all existing terms in a given ordering. The formed tree is searched in a breadth-first manner. Given the nodes at each level, the dataset is scanned and the support is counted for each group. Each node is then examined to determine if it is significant and large, if it should be pruned, and if new children should be generated. After all significant contrast sets are located, a post-processor selects a subset to show to the user - the low order, simpler results are shown first, followed by the higher order results which are "surprising and significantly different." The support calculation comes from testing a null hypothesis that the contrast set support is equal across all groups (i.e., that contrast set support is independent of group membership). The support count for each group is a frequency value that can be analyzed in a contingency table where each row represents the truth value of the contrast set and each column variable indicates the group membership frequency. If there is a difference in proportions between the contrast set frequencies and those of the null hypothesis, the algorithm must then determine if the differences in proportions represent a relation between variables or if it can be attributed to random causes. This can be determined through a chi-square test comparing the observed frequency count to the expected count. Nodes are pruned from the tree when all specializations of the node can never lead to a significant and large contrast set. The decision to prune is based on: The minimum deviation size: The maximum difference between the support
Brain Imaging Data Structure
The Brain Imaging Data Structure (BIDS) is a standard for organizing, annotating, and describing data collected during neuroimaging experiments. It is based on a formalized file and directory structure and metadata files (based on JSON and TSV) with controlled vocabulary. This standard has been adopted by a multitude of labs around the world as well as databases such as OpenNeuro, SchizConnect, Developing Human Connectome Project, and FCP-INDI, and is seeing uptake in an increasing number of studies. While originally specified for MRI data, BIDS has been extended to several other imaging modalities such as MEG, EEG, and intracranial EEG (see also BIDS Extension Proposals). == History == The project is a community-driven effort. BIDS, originally OBIDS (Open Brain Imaging Data Structure), was initiated during an INCF sponsored data sharing working group meeting (January 2015) at Stanford University. It was subsequently spearheaded and maintained by Chris Gorgolewski. Since October 2019, the project is headed by a Steering Group and maintained by a separate team of maintainers, the Maintainers Group, according to a governance document that was approved of by the BIDS community in a vote. BIDS has advanced under the direction and effort of contributors, the community of researchers that appreciate the value of standardizing neuroimaging data to facilitate sharing and analysis. == BIDS Extension Proposals == BIDS can be extended in a backwards compatible way and is evolving over time. This is accomplished through BIDS Extension Proposals (BEPs), which are community-driven processes following agreed-upon guidelines. A full list of finalized BEPs and BEPs in progress can be found on the BIDS website
Data analysis
Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays an important role in making decisions more scientific and helping businesses operate more effectively. It is widely used in fields such as business analytics, healthcare, and artificial intelligence to extract meaningful insights from data. Data mining is a particular data analysis technique that focuses on statistical modeling and knowledge discovery for predictive rather than purely descriptive purposes, while business intelligence covers data analysis that relies heavily on aggregation, focusing mainly on business information. In statistical applications, data analysis can be divided into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data, while CDA focuses on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a variety of unstructured data. All of the above are varieties of data analysis. == Data analysis process == Data analysis is a process for obtaining raw data, and subsequently converting it into information useful for decision-making by users. Statistician John Tukey, defined data analysis in 1961, as:"Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data." There are several phases, and they are iterative, in that feedback from later phases may result in additional work in earlier phases. === Data requirements === The data is necessary as inputs to the analysis, which is specified based upon the requirements of those directing the analytics (or customers, who will use the finished product of the analysis). The general type of entity upon which the data will be collected is referred to as an experimental unit (e.g., a person or population of people). Specific variables regarding a population (e.g., age and income) may be specified and obtained. Data may be numerical or categorical (i.e., a text label for numbers). === Data collection === Data may be collected from a variety of sources. A list of data sources are available for study & research. The requirements may be communicated by analysts to custodians of the data; such as, Information Technology personnel within an organization. Data collection or data gathering is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes. The data may also be collected from sensors in the environment, including traffic cameras, satellites, recording devices, etc. It may also be obtained through interviews, downloads from online sources, or reading documentation. === Data processing === Data integration is a precursor to data analysis: Data, when initially obtained, must be processed or organized for analysis. For instance, this may involve placing data into rows and columns in a table format (known as structured data) for further analysis, often through the use of spreadsheet (e.g. Excel) or statistical software. === Data cleaning === Once processed and organized, the data may be incomplete, contain duplicates, or contain errors. The need for data cleaning will arise from problems in the way that the data is entered and stored. Data cleaning is the process of preventing and correcting these errors. Common tasks include record matching, identifying inaccuracy of data, overall quality of existing data, deduplication, and column segmentation. Such data problems can also be identified through a variety of analytical techniques. For example; with financial information, the totals for particular variables may be compared against separately published numbers that are believed to be reliable. Unusual amounts, above or below predetermined thresholds, may also be reviewed. There are several types of data cleaning that are dependent upon the type of data in the set; this could be phone numbers, email addresses, employers, or other values. Quantitative data methods for outlier detection can be used to get rid of data that appears to have a higher likelihood of being input incorrectly. Text data spell checkers can be used to lessen the amount of mistyped words. However, it is harder to tell if the words are contextually (i.e., semantically and idiomatically) correct. === Exploratory data analysis === Once the datasets are cleaned, they can then begin to be analyzed using exploratory data analysis. The process of data exploration may result in additional data cleaning or additional requests for data; thus, the initialization of the iterative phases mentioned above. Descriptive statistics, such as the average, median, and standard deviation, are often used to broadly characterize the data. Data visualization is also used, in which the analyst is able to examine the data in a graphical format in order to obtain additional insights about messages within the data. === Modeling and algorithms === Mathematical formulas or mathematical models (supported by algorithms) may be applied to the data in order to identify relationships among the variables; for example, checking for correlation and by determining whether or not there is the presence of causality. In general terms, models may be developed to evaluate a specific variable based on other variable(s) contained within the dataset, with some residual error depending on the implemented model's accuracy (e.g., Data = Model + Error). Inferential statistics utilizes techniques that measure the relationships between particular variables. For example, regression analysis may be used to model whether a change in advertising (independent variable X), provides an explanation for the variation in sales (dependent variable Y), i.e. is Y a function of X? This can be described as (Y = aX + b + error), where the model is designed such that (a) and (b) minimize the error when the model predicts Y for a given range of values of X. === Data product === A data product is a computer application that takes data inputs and generates outputs, feeding them back into the environment. It may be based on a model or algorithm. For instance, an application that analyzes data about customer purchase history, and uses the results to recommend other purchases the customer might enjoy. === Communication === Once data is analyzed, it may be presented in many formats to the users of the analysis to support their requirements. The users may have feedback, which results in additional analysis. When determining how to communicate the results, the analyst may consider implementing a variety of data visualization techniques to help communicate the message more clearly and efficiently to the audience. Data visualization uses information displays (graphics such as, tables and charts) to help communicate key messages contained in the data. Tables are a valuable tool by enabling the ability of a user to query and focus on specific numbers; while charts (e.g., bar charts or line charts), may help explain the quantitative messages contained in the data. == Quantitative messages == Stephen Few described eight types of quantitative messages that users may attempt to communicate from a set of data, including the associated graphs. Time-series: A single variable is captured over a period of time, such as the unemployment rate over a 10-year period. A line chart may be used to demonstrate the trend. Ranking: Categorical subdivisions are ranked in ascending or descending order, such as a ranking of sales performance (the measure) by salespersons (the category, with each salesperson a categorical subdivision) during a single period. A bar chart may be used to show the comparison across the salespersons. Part-to-whole: Categorical subdivisions are measured as a ratio to the whole (i.e., a percentage out of 100%). A pie chart or bar chart can show the comparison of ratios, such as the market share represented by competitors in a market. Deviation: Categorical subdivisions are compared against a reference, such as a comparison of actual vs. budget expenses for several departments of a business for a given time period. A bar chart can show the comparison of the actual versus the reference amount. Frequency distribution:
Text Retrieval Conference
The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (part of the office of the Director of National Intelligence), and began in 1992 as part of the TIPSTER Text program. Its purpose is to support and encourage research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies and to increase the speed of lab-to-product transfer of technology. TREC's evaluation protocols have improved many search technologies. A 2010 study estimated that "without TREC, U.S. Internet users would have spent up to 3.15 billion additional hours using web search engines between 1999 and 2009." Hal Varian the Chief Economist at Google wrote that "The TREC data revitalized research on information retrieval. Having a standard, widely available, and carefully constructed set of data laid the groundwork for further innovation in this field." Each track has a challenge wherein NIST provides participating groups with data sets and test problems. Depending on track, test problems might be questions, topics, or target extractable features. Uniform scoring is performed so the systems can be fairly evaluated. After evaluation of the results, a workshop provides a place for participants to collect together thoughts and ideas and present current and future research work.Text Retrieval Conference started in 1992, funded by DARPA (US Defense Advanced Research Project) and run by NIST. Its purpose was to support research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies. == Goals == Encourage retrieval search based on large text collections Increase communication among industry, academia, and government by creating an open forum for the exchange of research ideas Speed the transfer of technology from research labs into commercial products by demonstrating substantial improvements retrieval methodologies on real world problems To increase the availability of appropriate evaluation techniques for use by industry and academia including development of new evaluation techniques more applicable to current systems TREC is overseen by a program committee consisting of representatives from government, industry, and academia. For each TREC, NIST provide a set of documents and questions. Participants run their own retrieval system on the data and return to NIST a list of retrieved top-ranked documents. NIST pools the individual result judges the retrieved documents for correctness and evaluates the results. The TREC cycle ends with a workshop that is a forum for participants to share their experiences. == Relevance judgments in TREC == TREC defines relevance as: "If you were writing a report on the subject of the topic and would use the information contained in the document in the report, then the document is relevant." Most TREC retrieval tasks use binary relevance: a document is either relevant or not relevant. Some TREC tasks use graded relevance, capturing multiple degrees of relevance. Most TREC collections are too large to perform complete relevance assessment; for these collections it is impossible to calculate the absolute recall for each query. To decide which documents to assess, TREC usually uses a method call pooling. In this method, the top-ranked n documents from each contributing run are aggregated, and the resulting document set is judged completely. == Various TRECs == In 1992 TREC-1 was held at NIST. The first conference attracted 28 groups of researchers from academia and industry. It demonstrated a wide range of different approaches to the retrieval of text from large document collections .Finally TREC1 revealed the facts that automatic construction of queries from natural language query statements seems to work. Techniques based on natural language processing were no better no worse than those based on vector or probabilistic approach. TREC2 Took place in August 1993. 31 group of researchers participated in this. Two types of retrieval were examined. Retrieval using an ‘ad hoc’ query and retrieval using a ‘routing' query In TREC-3 a small group experiments worked with Spanish language collection and others dealt with interactive query formulation in multiple databases TREC-4 they made even shorter to investigate the problems with very short user statements TREC-5 includes both short and long versions of the topics with the goal of carrying out deeper investigation into which types of techniques work well on various lengths of topics In TREC-6 Three new tracks speech, cross language, high precision information retrieval were introduced. The goal of cross language information retrieval is to facilitate research on system that are able to retrieve relevant document regardless of language of the source document TREC-7 contained seven tracks out of which two were new Query track and very large corpus track. The goal of the query track was to create a large query collection TREC-8 contain seven tracks out of which two –question answering and web tracks were new. The objective of QA query is to explore the possibilities of providing answers to specific natural language queries TREC-9 Includes seven tracks In TREC-10 Video tracks introduced Video tracks design to promote research in content based retrieval from digital video In TREC-11 Novelty tracks introduced. The goal of novelty track is to investigate systems abilities to locate relevant and new information within the ranked set of documents returned by a traditional document retrieval system TREC-12 held in 2003 added three new tracks; Genome track, robust retrieval track, HARD (Highly Accurate Retrieval from Documents) == Tracks == === Current tracks === New tracks are added as new research needs are identified, this list is current for TREC 2018. CENTRE Track – Goal: run in parallel CLEF 2018, NTCIR-14, TREC 2018 to develop and tune an IR reproducibility evaluation protocol (new track for 2018). Common Core Track – Goal: an ad hoc search task over news documents. Complex Answer Retrieval (CAR) – Goal: to develop systems capable of answering complex information needs by collating information from an entire corpus. Incident Streams Track – Goal: to research technologies to automatically process social media streams during emergency situations (new track for TREC 2018). The News Track – Goal: partnership with The Washington Post to develop test collections in news environment (new for 2018). Precision Medicine Track – Goal: a specialization of the Clinical Decision Support track to focus on linking oncology patient data to clinical trials. Real-Time Summarization Track (RTS) – Goal: to explore techniques for real-time update summaries from social media streams. === Past tracks === Chemical Track – Goal: to develop and evaluate technology for large scale search in chemistry-related documents, including academic papers and patents, to better meet the needs of professional searchers, and specifically patent searchers and chemists. Clinical Decision Support Track – Goal: to investigate techniques for linking medical cases to information relevant for patient care Contextual Suggestion Track – Goal: to investigate search techniques for complex information needs that are highly dependent on context and user interests. Crowdsourcing Track – Goal: to provide a collaborative venue for exploring crowdsourcing methods both for evaluating search and for performing search tasks. Genomics Track – Goal: to study the retrieval of genomic data, not just gene sequences but also supporting documentation such as research papers, lab reports, etc. Last ran on TREC 2007. Dynamic Domain Track – Goal: to investigate domain-specific search algorithms that adapt to the dynamic information needs of professional users as they explore in complex domains. Enterprise Track – Goal: to study search over the data of an organization to complete some task. Last ran on TREC 2008. Entity Track – Goal: to perform entity-related search on Web data. These search tasks (such as finding entities and properties of entities) address common information needs that are not that well modeled as ad hoc document search. Cross-Language Track – Goal: to investigate the ability of retrieval systems to find documents topically regardless of source language. After 1999, this track spun off into CLEF. FedWeb Track – Goal: to select best resources to forward a query to, and merge the results so that most relevant are on the top. Federated Web Search Track – Goal: to investigate techniques for the selection and combination of search results from a large number of real on-line web search services. Filtering Track – Goal: to binarily decide retrieval of new
Sex differences in social media use
Men and women use social media in different ways and with different frequencies. In general, several researchers have found that women tend to use social network services (SNSs) more than men and primiarly to socialize. == Differences == === Predilection for usage === Many studies have found that women are more likely to use either specific SNSs such as Facebook or MySpace or SNSs in general. In 2015, 73% of online men and 80% of online women used social networking sites. The gap in gender differences has become less apparent in LinkedIn. In 2015 about 26 percent of online men and 25% of online women used the business-and employee-oriented networking site. Researchers who have examined the gender of users of multiple SNSs have found contradictory results. Hargittai's groundbreaking 2007 study examining race, gender, and other differences between undergraduate college student users of SNSs found that women were not only more likely to have used SNSes than men but that they were also more likely to have used many different services, including Facebook, MySpace, and Friendster; these differences persisted in several models and analyses. Although she only surveyed students at one institution – the University of Illinois at Chicago – Hargittai selected that institution intentionally as "an ideal location for studies of how different kinds of people use online sites and services." In contrast, data collected by the Pew Internet & American Life Project found that men were more likely to have multiple SNS profiles. Although the sample sizes of the two surveys are comparable – 1,650 Internet users in the Pew survey compared with 1,060 in Hargittai's survey – the data from the Pew survey are newer and arguably more representative of the entire adult United States population. Pinterest, Facebook, and Instagram attract more females. Picture sharing sites overall are very popular among women. Pinterest alone attracts three times as many female users than male. However, use of Pinterest by men has increased from 5% in 2012. Facebook attracts about 77% of women online. Instagram is also more likely to attract women. Men are more likely to participate in online forums like Reddit, Digg or Slashdot. One in five men claim to be a part of an online forum. === Uses === In general, women seem to use SNSs more to explicitly foster social connections. A study conducted by Pew research centers found that women were more avid users of social media. In November 2010, the gap between men and women was as high as 15%. Female participants in a multi-stage study conducted in 2007 to discover the motivations of Facebook users scored higher on scales for social connection and posting of photographs. Studies have also been conducted on the differences between females and males with regards to blogging. The Pew Research Center found that younger females are more likely to blog than males their own age, even males that are older than them. Similarly, in a study of blogs maintained in MySpace, women were found to be more likely to not only write blogs but also write about family, romantic relationships, friendships, and health in those blogs. A study of Swedish SNS users found that women were more likely to have expressions of friendship, specifically in the areas of (a) publishing photos of their friends, (b) specifically naming their best friends, and (c) writing poems to and about their friends. Women were also more likely to have expressions related to family relationships and romantic relationships. One of the key findings of this research is that those men who do have expressions of romantic relationships in their profile had expressions just as strong as the women. However, the researcher speculated that this may be in part due to a desire to publicly express heterosexual behaviors and mannerisms instead of merely expressing romantic feelings. A large-scale study of gender differences in MySpace found that both men and women tended to have a majority of female Friends, and both men and women tended to have a majority of female "Top" Friends in the site. A later study found women to author disproportionately many (public) comments in MySpace, but an investigation into the role of emotion in public MySpace comments found that women both give and receive stronger positive emotion. It was hypothesised that women are simply more effective at using social networking sites because they are better able to harness positive emotion. A study focused on the influence of gender and personality on individuals' use of online social networking websites such as Facebook, reported that men use social networking sites with the intention of forming new relationships, whereas, women use them more for relationship maintenance. In addition to this, women are more likely to use Facebook or MySpace to compare themselves to others and also to search for information. Men, however, are more likely to look at other people's profiles with in the intention to find friends. Women were less successful at actually finding new friends, but more successful at "maintaining existing relationships, making new relationships, using for academic purposes and following specific agenda". Similarly, men also self-reported this motivation "while women reported using them more for relationship maintenance". === Personality === OCEAN personality traits are known to systematically vary between human males and females. In one study, the same women were more extraverted and agreeable, such as less neurotic while on social media than offline. Other studies associated neuroticism with female use of social media. === Privacy === Privacy has been the primary topic of many studies of SNS users, and many of these studies have found differences between male and female SNS users, although some studies have found results contradictory to those found in other studies. Some researchers have found that women are more protective of their personal information and more likely to have private profiles. Other researchers have found that women are less likely to post some types of information. Acquisti and Gross found that women in their sample were less likely to reveal their sexual orientation, personal address, or cell phone number. This is similar to Pew Internet & American Life research of children users of SNSs that found that boys and girls presented different views of privacy and behaviors, with girls being more concerned about and restrictive of information such as city, town, last name, and cell phone number that could be used to locate them. At least one group of researchers has found that women are less likely to share information that "identifies them directly – last name, cell phone number, and address or home phone number," linking that resistance to women's greater concerns about "cyberstalking", "cyberbullying", and security problems. Despite these concerns about privacy, researchers have found that women are more likely to maintain up-to-date photos of themselves. Further, Kolek and Saunders found in their sample of college student Facebook users that women were more likely to not only post a photograph of themselves in their profile but that they were more likely to have a publicly viewable Facebook account (a contradictory finding compared to many other studies), post photos, and post photo albums. Women were more likely to have: (a) a publicly viewable Facebook account, (b) more photo albums, (c) more photos, (d) a photo of themselves as their profile picture, (e) positive references to alcohol, partying, or drugs, and (f) more positive references to or about the institution or institution-related activities. In general, women were more likely to disclose information about themselves in their Facebook profile, with the primary exception of sharing their telephone number. Similarly, female respondents to Strano's study were more likely to keep their profile photo recent and choose a photo that made them appear attractive, happy, and fun-loving. Citing several examples, Strano opined that there may also be a difference in how men and women Facebook users display and interpret profile photos depicting relationships. Privacy has also been a concern for the SnapChat app, which allows you to send messages either text or photo or video which then disappear. One study has shown that security is not a major concern for the majority of users and that most do not use Snapchat to send sensitive content (although up to 25% may do so experimentally). As part of their research almost no statistically significant gender differences were found. === Cyberbullying === Past research carried out to investigate if there are any gender differences in cyber-bullying has found that boys commit more cyber verbal bullying, cyber forgery and more violence based on hidden identity or presenting themselves as other person. === Mansplaining === A 2021 article found that mansplaining could be seen more prominent online rather than offl