AI App UI Design

AI App UI Design — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • 2024 National Public Data breach

    In August 2024, three class-action lawsuits were filed against National Public Data, along with over 14 complaints in federal court, claiming that the company had permitted hackers to steal sensitive private information covering millions of individuals. The theft was alleged to have occurred in April 2024. One of the lawsuits specifically claims that in April a hacker going by the moniker "USDoD" posted a notice on the dark web offering the data for sale for US$3.5 million. The stolen information is alleged to include 2.9 billion records containing full names, current and past addresses, Social Security numbers, dates of birth, and telephone numbers, covering people in the US, UK, and Canada.

    National Public Data confirmed on August 16, 2024, that a breach had taken place: someone had been trying to break into its systems since December 2023, and the breach itself occurred from April 2024 and over the following few months. The company also confirmed that 2.9 billion records were obtained, though it was still working to determine how many people were affected and was working with law enforcement to identify the hacker.

    == Jerico Pictures ==

    Jerico Pictures, Inc., doing business as National Public Data, was a data broker that performed employee background checks. Its primary service was collecting information from public data sources, including criminal records, addresses, and employment history, and offering that information for sale. On October 2, 2024, Jerico Pictures filed for Chapter 11 bankruptcy, as it faced over a dozen lawsuits over the breach and was potentially liable "for credit monitoring for hundreds of millions of potentially impacted individuals." In December 2024, National Public Data shut down, showing a closure notice on its website.

  • Sufficient dimension reduction

    In statistics, sufficient dimension reduction (SDR) is a paradigm for analyzing data that combines the ideas of dimension reduction with the concept of sufficiency. Dimension reduction has long been a primary goal of regression analysis. Given a response variable $y$ and a $p$-dimensional predictor vector $\mathbf{x}$, regression analysis aims to study the distribution of $y \mid \mathbf{x}$, the conditional distribution of $y$ given $\mathbf{x}$. A dimension reduction is a function $R(\mathbf{x})$ that maps $\mathbf{x}$ to a subset of $\mathbb{R}^{k}$, $k < p$, thereby reducing the dimension of $\mathbf{x}$. For example, $R(\mathbf{x})$ may be one or more linear combinations of $\mathbf{x}$. A dimension reduction $R(\mathbf{x})$ is said to be sufficient if the distribution of $y \mid R(\mathbf{x})$ is the same as that of $y \mid \mathbf{x}$. In other words, no information about the regression is lost in reducing the dimension of $\mathbf{x}$ if the reduction is sufficient.

    == Graphical motivation ==

    In a regression setting, it is often useful to summarize the distribution of $y \mid \mathbf{x}$ graphically. For instance, one may consider a scatterplot of $y$ versus one or more of the predictors or a linear combination of the predictors. A scatterplot that contains all available regression information is called a sufficient summary plot. When $\mathbf{x}$ is high-dimensional, particularly when $p \geq 3$, it becomes increasingly challenging to construct and visually interpret sufficient summary plots without reducing the data.
    Even three-dimensional scatter plots must be viewed via a computer program, and the third dimension can only be visualized by rotating the coordinate axes. However, if there exists a sufficient dimension reduction $R(\mathbf{x})$ with small enough dimension, a sufficient summary plot of $y$ versus $R(\mathbf{x})$ may be constructed and visually interpreted with relative ease. Hence sufficient dimension reduction allows for graphical intuition about the distribution of $y \mid \mathbf{x}$, which might not have otherwise been available for high-dimensional data. Most graphical methodology focuses primarily on dimension reduction involving linear combinations of $\mathbf{x}$. The rest of this article deals only with such reductions.

    == Dimension reduction subspace ==

    Suppose $R(\mathbf{x}) = A^{T}\mathbf{x}$ is a sufficient dimension reduction, where $A$ is a $p \times k$ matrix with rank $k \leq p$. Then the regression information for $y \mid \mathbf{x}$ can be inferred by studying the distribution of $y \mid A^{T}\mathbf{x}$, and the plot of $y$ versus $A^{T}\mathbf{x}$ is a sufficient summary plot. Without loss of generality, only the space spanned by the columns of $A$ need be considered. Let $\eta$ be a basis for the column space of $A$, and let the space spanned by $\eta$ be denoted by $\mathcal{S}(\eta)$. It follows from the definition of a sufficient dimension reduction that

        $F_{y \mid x} = F_{y \mid \eta^{T}x},$

    where $F$ denotes the appropriate distribution function.
    Another way to express this property is

        $y \perp\!\!\!\perp \mathbf{x} \mid \eta^{T}\mathbf{x},$

    or $y$ is conditionally independent of $\mathbf{x}$, given $\eta^{T}\mathbf{x}$. Then the subspace $\mathcal{S}(\eta)$ is defined to be a dimension reduction subspace (DRS).

    === Structural dimensionality ===

    For a regression $y \mid \mathbf{x}$, the structural dimension, $d$, is the smallest number of distinct linear combinations of $\mathbf{x}$ necessary to preserve the conditional distribution of $y \mid \mathbf{x}$. In other words, the smallest dimension reduction that is still sufficient maps $\mathbf{x}$ to a subset of $\mathbb{R}^{d}$. The corresponding DRS will be $d$-dimensional.

    === Minimum dimension reduction subspace ===

    A subspace $\mathcal{S}$ is said to be a minimum DRS for $y \mid \mathbf{x}$ if it is a DRS and its dimension is less than or equal to that of all other DRSs for $y \mid \mathbf{x}$. A minimum DRS $\mathcal{S}$ is not necessarily unique, but its dimension is equal to the structural dimension $d$ of $y \mid \mathbf{x}$, by definition. If $\mathcal{S}$ has basis $\eta$ and is a minimum DRS, then a plot of $y$ versus $\eta^{T}\mathbf{x}$ is a minimal sufficient summary plot, and it is $(d + 1)$-dimensional.
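    As a small worked illustration of the structural dimension (a hypothetical model chosen for this sketch, not one taken from the text), suppose $p = 3$ and the response depends on the predictors only through the sum of the first two:

```latex
% Hypothetical example: p = 3, error independent of the predictors.
\[
  y = (x_1 + x_2)^2 + \varepsilon, \qquad \varepsilon \perp\!\!\!\perp \mathbf{x}.
\]
% Take a single direction as basis:
\[
  \eta = (1,\, 1,\, 0)^{T}, \qquad \eta^{T}\mathbf{x} = x_1 + x_2 .
\]
% The conditional distribution of y depends on x only through eta^T x, so
\[
  F_{y \mid \mathbf{x}} = F_{y \mid \eta^{T}\mathbf{x}},
  \qquad\text{i.e.}\qquad
  y \perp\!\!\!\perp \mathbf{x} \mid \eta^{T}\mathbf{x},
\]
% and S(eta) is a dimension reduction subspace of dimension 1.
```

    Assuming $x_1 + x_2$ is not degenerate, $y$ is not independent of $\mathbf{x}$, so no zero-dimensional reduction is sufficient. Under that assumption the structural dimension is $d = 1$, and a scatterplot of $y$ versus $x_1 + x_2$ is a two-dimensional minimal sufficient summary plot.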
    == Central subspace ==

    If a subspace $\mathcal{S}$ is a DRS for $y \mid \mathbf{x}$, and if $\mathcal{S} \subset \mathcal{S}_{\text{drs}}$ for all other DRSs $\mathcal{S}_{\text{drs}}$, then it is a central dimension reduction subspace, or simply a central subspace, and it is denoted by $\mathcal{S}_{y \mid x}$. In other words, a central subspace for $y \mid \mathbf{x}$ exists if and only if the intersection $\bigcap \mathcal{S}_{\text{drs}}$ of all dimension reduction subspaces is also a dimension reduction subspace, and that intersection is the central subspace $\mathcal{S}_{y \mid x}$. The central subspace $\mathcal{S}_{y \mid x}$ does not necessarily exist because the intersection $\bigcap \mathcal{S}_{\text{drs}}$ is not necessarily a DRS. However, if $\mathcal{S}_{y \mid x}$ does exist, then it is also the unique minimum dimension reduction subspace.

    === Existence of the central subspace ===

    While the existence of the central subspace $\mathcal{S}_{y \mid x}$ is not guaranteed in every regression situation, there are some rather broad conditions under which its existence follows directly. For example, consider the following proposition from Cook (1998):

        Let $\mathcal{S}_{1}$ and $\mathcal{S}_{2}$ be dimension reduction subspaces for $y \mid \mathbf{x}$. If $\mathbf{x}$ has density $f(a) > 0$ for all $a \in \Omega_{x}$ and $f(a) = 0$ everywhere else, where $\Omega_{x}$ is convex, then the intersection $\mathcal{S}_{1} \cap \mathcal{S}_{2}$ is also a dimension reduction subspace.
    It follows from this proposition that the central subspace $\mathcal{S}_{y \mid x}$ exists for such $\mathbf{x}$.

    == Methods for dimension reduction ==

    There are many existing methods for dimension reduction, both graphical and numeric. For example, sliced inverse regression (SIR) and sliced average variance estimation (SAVE) were introduced in the 1990s and continue to be widely used. Although SIR was originally designed to estimate an effective dimension reducing subspace, it is now understood that it estimates only the central subspace, which is generally different. More recent methods for dimension reduction include likelihood-based sufficient dimension reduction, estimating the central subspace based on the inverse third moment (or $k$th moment), estimating the central solution space, graphical regression, the envelope model, and the principal support vector machine. For more details on these and other methods, consult the statistical literature. Principal components analysis (PCA) and similar methods for dimension reduction are not based on the sufficiency principle.

    === Example: linear regression ===

    Consider the regression model

        $y = \alpha + \beta^{T}\mathbf{x} + \varepsilon, \text{ where } \varepsilon \perp\!\!\!\perp \mathbf{x}.$

    Note that the distribution of $y \mid \mathbf{x}$ is the same as the distribution of $y \mid \beta^{T}\mathbf{x}$

  • Local tangent space alignment

    Local tangent space alignment (LTSA) is a method for manifold learning that can efficiently learn a nonlinear embedding into low-dimensional coordinates from high-dimensional data, and can also reconstruct high-dimensional coordinates from embedding coordinates. It is based on the intuition that when a manifold is correctly unfolded, all of the tangent hyperplanes to the manifold become aligned. It begins by computing the k-nearest neighbors of every point. It then computes the tangent space at every point from the first d principal components of each local neighborhood. Finally, it optimizes to find an embedding that aligns the tangent spaces. Because it ignores the label information conveyed by the data samples, it cannot be used directly for classification.

  • Latent and observable variables

    In statistics, latent variables (from Latin: present participle of lateo 'lie hidden') are variables that can only be inferred indirectly through a mathematical model from other observable variables that can be directly observed or measured. Such latent variable models are used in many disciplines, including engineering, medicine, ecology, physics, machine learning/artificial intelligence, natural language processing, bioinformatics, chemometrics, demography, economics, management, political science, psychology and the social sciences.

    Latent variables may correspond to aspects of physical reality. These could in principle be measured, but may not be for practical reasons. Among the earliest expressions of this idea is Francis Bacon's polemic the Novum Organum, itself a challenge to the more traditional logic expressed in Aristotle's Organon:

        "But the latent process of which we speak, is far from being obvious to men’s minds, beset as they now are. For we mean not the measures, symptoms, or degrees of any process which can be exhibited in the bodies themselves, but simply a continued process, which, for the most part, escapes the observation of the senses."

    In this situation, the term hidden variables is commonly used, reflecting the fact that the variables are meaningful, but not observable. Other latent variables correspond to abstract concepts, like categories, behavioral or mental states, or data structures. The terms hypothetical variables or hypothetical constructs may be used in these situations.

    The use of latent variables can serve to reduce the dimensionality of data. Many observable variables can be aggregated in a model to represent an underlying concept, making it easier to understand the data. In this sense, they serve a function similar to that of scientific theories. At the same time, latent variables link observable "sub-symbolic" data in the real world to symbolic data in the modeled world.
    == Examples ==

    === Psychology ===

    Latent variables, as created by factor analytic methods, generally represent "shared" variance, or the degree to which variables "move" together. Variables that have no correlation cannot result in a latent construct based on the common factor model.

      • The "Big Five personality traits" have been inferred using factor analysis.
      • extraversion
      • spatial ability
      • wisdom: “Two of the more predominant means of assessing wisdom include wisdom-related performance and latent variable measures.”
      • Spearman's g, or the general intelligence factor in psychometrics

    === Economics ===

    Examples of latent variables from the field of economics include quality of life, business confidence, morale, happiness and conservatism: these are all variables which cannot be measured directly. However, by linking these latent variables to other, observable variables, the values of the latent variables can be inferred from measurements of the observable variables. Quality of life is a latent variable which cannot be measured directly, so observable variables are used to infer quality of life. Observable variables to measure quality of life include wealth, employment, environment, physical and mental health, education, recreation and leisure time, and social belonging.

    === Medicine ===

    Latent-variable methodology is used in many branches of medicine. A class of problems that naturally lend themselves to latent variable approaches are longitudinal studies where the time scale (e.g. age of participant or time since study baseline) is not synchronized with the trait being studied. For such studies, an unobserved time scale that is synchronized with the trait being studied can be modeled as a transformation of the observed time scale using latent variables. Examples of this include disease progression modeling and modeling of growth.
    == Inferring latent variables ==

    There exists a range of different model classes and methodologies that make use of latent variables and allow inference in the presence of latent variables.

    Models include:

      • linear mixed-effects models and nonlinear mixed-effects models
      • Hidden Markov models
      • Factor analysis
      • Item response theory

    Analysis and inference methods include:

      • Principal component analysis
      • Instrumented principal component analysis
      • Partial least squares regression
      • Latent semantic analysis and probabilistic latent semantic analysis
      • EM algorithms
      • Metropolis–Hastings algorithm

    === Bayesian algorithms and methods ===

    Bayesian statistics is often used for inferring latent variables.

      • Latent Dirichlet allocation
      • The Chinese restaurant process is often used to provide a prior distribution over assignments of objects to latent categories.
      • The Indian buffet process is often used to provide a prior distribution over assignments of latent binary features to objects.
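    To make the idea of inferring a latent variable concrete, here is a minimal sketch of one of the methods listed above, an EM algorithm, applied to a mixture of two one-dimensional Gaussians. The latent variable is the unobserved component each data point belongs to. The function names, the initialisation scheme, and the sample data are all assumptions made for this illustration; it is not a reference implementation.

```python
import math
from typing import List, Sequence, Tuple


def normal_pdf(x: float, mean: float, var: float) -> float:
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(
    data: Sequence[float], n_iter: int = 50
) -> Tuple[List[float], List[float], List[float], List[List[float]]]:
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns (means, variances, weights, responsibilities). The responsibilities
    are the inferred posterior probabilities of the latent component labels.
    """
    # Crude initialisation (an assumption of this sketch): the extremes as
    # means, the overall variance for both components, and equal weights.
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    mean = [min(data), max(data)]
    var = [overall_var, overall_var]
    weight = [0.5, 0.5]
    resp: List[List[float]] = []

    for _ in range(n_iter):
        # E-step: posterior probability of each latent label given the data.
        resp = []
        for x in data:
            p = [weight[k] * normal_pdf(x, mean[k], var[k]) for k in (0, 1)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])

        # M-step: re-estimate the parameters from the soft assignments.
        for k in (0, 1):
            n_k = sum(r[k] for r in resp)
            mean[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            spread = sum(r[k] * (x - mean[k]) ** 2 for r, x in zip(resp, data)) / n_k
            var[k] = max(spread, 1e-6)  # floor avoids a collapsed component
            weight[k] = n_k / len(data)

    return mean, var, weight, resp


# Illustrative data: two well-separated groups of observations.
observations = [0.8, 1.0, 1.2, 0.9, 1.1, 4.9, 5.0, 5.2, 5.1, 4.8]
means, variances, weights, responsibilities = em_two_gaussians(observations)
```

    Only `observations` is observed here. The group membership is never given to the algorithm and is recovered as `responsibilities`, which is the sense in which the latent variable is inferred from observable ones.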

  • ChatScript

    ChatScript is a combined natural language engine and dialog management system, designed initially for creating chatbots but now also used for various forms of NL processing. It is written in C++. The engine is an open source project on SourceForge and GitHub. ChatScript was written by Bruce Wilcox and originally released in 2011, after Suzette (written in ChatScript) won the 2010 Loebner Prize, fooling one of four human judges.

    == Features ==

    In general, ChatScript aims to make authoring extremely concise, since the limit on scaling hand-authored chatbots is how much, and how fast, one can write script. Because ChatScript is designed for interactive conversation, it automatically maintains user state across volleys. A volley is any number of sentences the user inputs at once, together with the chatbot's response.

    The basic element of scripting is the rule. A rule consists of a type, a label (optional), a pattern, and an output. There are three types of rules. Gambits are something a chatbot might say when it has control of the conversation. Rejoinders are rules that respond to a user remark tied to what the chatbot just said. Responders are rules that respond to arbitrary user input which is not necessarily tied to what the chatbot just said.

    Patterns describe conditions under which a rule may fire. Patterns range from extremely simple to deeply complex (analogous to regular expressions, but aimed at NL). Heavy use is typically made of concept sets, which are lists of words sharing a meaning. ChatScript contains some 2000 predefined concepts, and scripters can easily write their own. The output of a rule intermixes literal words to be sent to the user with common C-style programming code.

    Rules are bundled into collections called topics. Topics can have keywords, which allow the engine to automatically search the topic for relevant rules based on user input.

    == Example code ==

    Words starting with ~ are concept sets. For example, ~fruit is the list of all known fruits.
    The simple pattern (~fruit) reacts if any fruit is mentioned immediately after the chatbot asks for favorite food. The slightly more complex pattern for the rule labelled WHATMUSIC requires all the words what, music, you and any word or phrase meaning to like, but they may occur in any order. Responders come in three types: ?: rules react to user questions, s: rules react to user statements, and u: rules react to either. ChatScript code supports standard if-else, loops, user-defined functions and calls, and variable assignment and access.

    == Data ==

    Some data in ChatScript is transient, meaning it will disappear at the end of the current volley. Other data is permanent, lasting until explicitly killed off. Data can be local to a single user or shared across all users at the bot level. Internally all data is represented as text and is automatically converted to a numeric form as needed.

    === Variables ===

    User variables come in several kinds. Variables purely local to a topic or function are transient. Global variables can be declared as transient or permanent. A variable is generally declared merely by using it, and its type depends on its prefix ($, $$, $_).

    === Facts ===

    In addition to variables, ChatScript supports facts, which are triples of data that can also be transient or permanent. Functions can query for facts having particular values in some of the fields, making them act like an in-memory database. Fact retrieval is quick and efficient; the number of available in-memory facts is constrained largely by the available memory of the machine running the ChatScript engine. Facts can represent record structures and are how ChatScript represents JSON internally. Tables of information can be defined to generate appropriate facts. The above table links people to what they invented (1 per line) with Einstein getting a list of things he did.

    == External communication ==

    ChatScript embeds the Curl library and can directly read and write facts in JSON to a website.
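    The idea of facts as queryable triples, described in the Facts section, can be pictured with a short Python sketch. This is not ChatScript syntax and not the engine's API: the `FactStore` class, its method names, the subject/verb/object field naming, and the sample data are all hypothetical, chosen only to show how a query that fixes some fields and leaves others open behaves like a small in-memory database.

```python
from typing import List, Optional, Tuple

# A fact is modelled as a (subject, verb, object) triple of strings.
Fact = Tuple[str, str, str]


class FactStore:
    """Hypothetical in-memory store of triples, for illustration only."""

    def __init__(self) -> None:
        self._facts: List[Fact] = []

    def add(self, subject: str, verb: str, obj: str) -> None:
        """Record a triple once; repeated additions are ignored."""
        fact = (subject, verb, obj)
        if fact not in self._facts:
            self._facts.append(fact)

    def query(
        self,
        subject: Optional[str] = None,
        verb: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Fact]:
        """Return facts matching every field that is not None (None = wildcard)."""
        wanted = (subject, verb, obj)
        return [
            fact
            for fact in self._facts
            if all(w is None or w == value for w, value in zip(wanted, fact))
        ]


# Made-up sample data.
store = FactStore()
store.add("alice", "likes", "apples")
store.add("alice", "likes", "jazz")
store.add("bob", "likes", "apples")
store.add("bob", "owns", "radio")

alice_likes = store.query(subject="alice", verb="likes")
apple_fans = [s for s, _, _ in store.query(verb="likes", obj="apples")]
```

    A linear scan is used for brevity; an engine that needs fast retrieval over many facts would presumably index the fields instead.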
    == Server ==

    A ChatScript engine can run in local or server mode.

    == Pos-tagging, parsing, and ontology ==

    ChatScript comes with a copy of English WordNet embedded within, including its ontology, and creates and extends its own ontology via concept declarations. It has an English language pos-tagger and parser and supports integration with TreeTagger for pos-tagging a number of other languages (TreeTagger commercial license required).

    == Databases ==

    In addition to an internal fact database, ChatScript supports PostgreSQL, MySQL, MSSQL and MongoDB, both for access by scripts and, if desired, as a central filesystem so that ChatScript can be scaled horizontally. A common use case is to use a centralized database to host the user files and multiple servers to scale the ChatScript engine.

    == JavaScript ==

    ChatScript also embeds Duktape, a JavaScript engine with ECMAScript E5/E5.1 compatibility and some semantics updated from ES2015+.

    == Spelling Correction ==

    ChatScript has built-in automatic spell checking, which can be augmented in script with both simple word replacements and context-sensitive changes. With appropriate simple rules you can change perfectly legal words into other words or delete them. E.g., if you have a concept of ~electronic_goods and don't want an input of Radio Shack (a store name) to be detected as an electronic good, you can get the input to change to Radio_Shack (a single word), or allow the words to remain but block the detection of the concept. This is particularly useful when combined with imperfect speech-to-text code whose common failings you know and can compensate for in script.

    == Control flow ==

    A chatbot's control flow is managed by the control script. This is merely another ordinary topic of rules that invokes API functions of the engine. Thus control is fully configurable by the scripter (and functions exist to allow introspection into the engine).
There are pre-processing control flow and post-processing control flow options available, for special processing.

  • Dynamic time warping

    In time series analysis, dynamic time warping (DTW) is an algorithm for measuring similarity between two temporal sequences, which may vary in speed. For instance, similarities in walking could be detected using DTW, even if one person was walking faster than the other, or if there were accelerations and decelerations during the course of an observation. DTW has been applied to temporal sequences of video, audio, and graphics data; indeed, any data that can be turned into a one-dimensional sequence can be analyzed with DTW. A well-known application has been automatic speech recognition, to cope with different speaking speeds. Other applications include speaker recognition and online signature recognition. It can also be used in partial shape matching applications.

    In general, DTW is a method that calculates an optimal match between two given sequences (e.g. time series) with certain restrictions and rules:

      • Every index from the first sequence must be matched with one or more indices from the other sequence, and vice versa.
      • The first index from the first sequence must be matched with the first index from the other sequence (but it does not have to be its only match).
      • The last index from the first sequence must be matched with the last index from the other sequence (but it does not have to be its only match).
      • The mapping of the indices from the first sequence to indices from the other sequence must be monotonically increasing, and vice versa, i.e. if $j > i$ are indices from the first sequence, then there must not be two indices $l > k$ in the other sequence such that index $i$ is matched with index $l$ and index $j$ is matched with index $k$, and vice versa.

    We can plot each match between the sequences $1:M$ and $1:N$ as a path in an $M \times N$ matrix from $(1,1)$ to $(M,N)$, such that each step is one of $(0,1), (1,0), (1,1)$. In this formulation, we see that the number of possible matches is the Delannoy number.

    The optimal match is the match that satisfies all the restrictions and rules and that has the minimal cost, where the cost is computed as the sum of absolute differences, for each matched pair of indices, between their values.

    The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension. This sequence alignment method is often used in time series classification. Although DTW measures a distance-like quantity between two given sequences, it does not guarantee that the triangle inequality holds.

    In addition to a similarity measure between the two sequences, a so-called "warping path" is produced; by warping according to this path the two signals may be aligned in time. The signal with an original set of points X(original), Y(original) is transformed to X(warped), Y(warped). This finds applications in genetic sequence and audio synchronisation. In a related technique, sequences of varying speed may be averaged (see the Average sequence section). This is conceptually very similar to the Needleman–Wunsch algorithm.
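    The matching rules and the cost definition above can be sketched as a short dynamic program in Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the function name `dtw`, the default absolute-difference distance, the optional `window` band parameter, and the tie-breaking order used when tracing the path back are all choices made for this sketch.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def dtw(
    s: Sequence[float],
    t: Sequence[float],
    dist: Callable[[float, float], float] = lambda a, b: abs(a - b),
    window: Optional[int] = None,
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total cost, warping path) for sequences s and t.

    The path is a list of 0-based pairs (i, j), meaning s[i] is matched with t[j].
    """
    n, m = len(s), len(t)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    # Optional band |i - j| <= w; it is widened to at least |n - m| so that
    # the end point (n, m) always stays reachable.
    w = max(n, m) if window is None else max(window, abs(n - m))

    # cost[i][j] = best cost of aligning s[:i] with t[:j]; row/column 0 are borders.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            best_previous = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = dist(s[i - 1], t[j - 1]) + best_previous

    # Trace back from (n, m) to (1, 1), always moving to the cheapest predecessor.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# The second sequence repeats the middle value, so the best alignment matches
# s[1] with both t[1] and t[2] at no cost.
distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
```

    The returned path starts at the first pair of indices, ends at the last pair, and never moves backwards in either sequence, which corresponds to the boundary and monotonicity rules listed above. The full cost table takes memory proportional to the product of the two lengths; it is kept here because the path is traced back through it.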
    == Implementation ==

    This example illustrates the implementation of the dynamic time warping algorithm when the two sequences s and t are strings of discrete symbols. For two symbols x and y, $d(x, y)$ is a distance between the symbols, e.g., $d(x, y) = |x - y|$.

        int DTWDistance(s: array [1..n], t: array [1..m]) {
            DTW := array [0..n, 0..m]

            for i := 0 to n
                for j := 0 to m
                    DTW[i, j] := infinity
            DTW[0, 0] := 0

            for i := 1 to n
                for j := 1 to m
                    cost := d(s[i], t[j])
                    DTW[i, j] := cost + minimum(DTW[i-1, j  ],    // insertion
                                                DTW[i  , j-1],    // deletion
                                                DTW[i-1, j-1])    // match

            return DTW[n, m]
        }

    where DTW[i, j] is the distance between s[1:i] and t[1:j] with the best alignment.

    We sometimes want to add a locality constraint. That is, we require that if s[i] is matched with t[j], then $|i - j|$ is no larger than w, a window parameter. We can easily modify the above algorithm to add a locality constraint (the differences are the window parameter w and the restricted range of j). However, the modification works only if $|n - m|$ is no larger than w, i.e. the end point is within the window length from the diagonal. In order to make the algorithm work, the window parameter w must be adapted so that $|n - m| \leq w$ (see the line marked with (*) in the code).

        int DTWDistance(s: array [1..n], t: array [1..m], w: int) {
            DTW := array [0..n, 0..m]

            w := max(w, abs(n-m)) // adapt window size (*)

            for i := 0 to n
                for j := 0 to m
                    DTW[i, j] := infinity
            DTW[0, 0] := 0
            for i := 1 to n
                for j := max(1, i-w) to min(m, i+w)
                    DTW[i, j] := 0

            for i := 1 to n
                for j := max(1, i-w) to min(m, i+w)
                    cost := d(s[i], t[j])
                    DTW[i, j] := cost + minimum(DTW[i-1, j  ],    // insertion
                                                DTW[i  , j-1],    // deletion
                                                DTW[i-1, j-1])    // match

            return DTW[n, m]
        }

    == Warping properties ==

    The DTW algorithm produces a discrete matching between existing elements of one series to another.
    In other words, it does not allow time-scaling of segments within the sequence. Other methods allow continuous warping. For example, Correlation Optimized Warping (COW) divides the sequence into uniform segments that are scaled in time using linear interpolation, to produce the best matching warping. The segment scaling causes potential creation of new elements, by time-scaling segments either down or up, and thus produces a more sensitive warping than DTW's discrete matching of raw elements.

    == Complexity ==

    The time complexity of the DTW algorithm is $O(NM)$, where $N$ and $M$ are the lengths of the two input sequences. The 50-year-old quadratic time bound was broken in 2016: an algorithm due to Gold and Sharir enables computing DTW in $O(N^{2} / \log \log N)$ time and space for two input sequences of length $N$. This algorithm can also be adapted to sequences of different lengths. Despite this improvement, it was shown that a strongly subquadratic running time of the form $O(N^{2-\epsilon})$ for some $\epsilon > 0$ cannot exist unless the strong exponential time hypothesis fails.

    While the dynamic programming algorithm for DTW requires $O(NM)$ space in a naive implementation, the space consumption can be reduced to $O(\min(N, M))$ using Hirschberg's algorithm.

    == Fast computation ==

    Fast techniques for computing DTW include PrunedDTW, SparseDTW, FastDTW, and MultiscaleDTW.

    A common task, retrieval of similar time series, can be accelerated by using lower bounds such as LB_Keogh, LB_Improved, or LB_Petitjean. However, the Early Abandon and Pruned DTW algorithm reduces the degree of acceleration that lower bounding provides and sometimes renders it ineffective.

    In a survey, Wang et al.
    reported slightly better results with the LB_Improved lower bound than the LB_Keogh bound, and found that other techniques were inefficient. Subsequent to this survey, the LB_Enhanced bound was developed that is always tighter than LB_Keogh while also being more efficient to compute. LB_Petitjean is the tightest known lower bound that can be computed in linear time.

    == Average sequence ==

    Averaging for dynamic time warping is the problem of finding an average sequence for a set of sequences. NLAAF is an exact method to average two sequences using DTW. For more than two sequences, the problem is related to that of multiple alignment and requires heuristics. DBA is currently a reference method to average a set of sequences consistently with DTW. COMASA efficiently randomizes the search for the average sequence, using DBA as a local optimization process.

    == Supervised learning ==

    A nearest-neighbour classifier can achieve state-of-the-art performance when using dynamic time warping as a distance measure.

    == Amerced Dynamic Time Warping ==

    Amerced Dynamic Time Warping (ADTW) is a variant of DTW designed to better control DTW's permissiveness in the alignments that it allows. The windows that classical DTW uses to constrain alignments introduce a step function. Any warping of the path is allowed within the window and none beyond it. In contrast, ADTW employs an additive penalty that is incurred each time that the path is warped. Any amount of warping is allowed, but each warping action incurs a direct penalty. ADTW significantly outperforms DTW with windowing when applied as a nearest neighbor classifier on a set of benchmark time series classification tasks.

    == Alternative approaches ==

    In functional data analysis, time series are regarde

    Read more →
  • Relation network

    Relation network

    A relation network (RN) is an artificial neural network component with a structure that can reason about relations among objects. An example category of such relations is spatial relations (above, below, left, right, in front of, behind). RNs can infer relations, are data efficient, and operate on a set of objects without regard to the objects' order. == History == In June 2017, DeepMind announced the first relation network. It claimed that the technology had achieved "superhuman" performance on multiple question-answering problem sets. == Design == RNs constrain the functional form of a neural network to capture the common properties of relational reasoning. These properties are explicitly added to the system rather than established by learning, just as the capacity to reason about spatial, translation-invariant properties is explicitly part of convolutional neural networks (CNN). The data to be considered can be presented as a simple list or as a directed graph whose nodes are objects and whose edges are the pairs of objects whose relationships are to be considered. The RN is a composite function: $RN(O) = f_{\phi}\left(\sum_{i,j} g_{\theta}(o_i, o_j, q)\right)$, where the input is a set of "objects" $O = \{o_1, o_2, \ldots, o_n\}$, $o_i \in \mathbb{R}^m$ is the $i$th object, $f_{\phi}$ and $g_{\theta}$ are functions with parameters $\phi$ and $\theta$, respectively, and $q$ is the question. $f_{\phi}$ and $g_{\theta}$ are multilayer perceptrons, while the two parameters are learnable synaptic weights. RNs are differentiable. The output of $g_{\theta}$ is a "relation"; therefore, the role of $g_{\theta}$ is to infer any ways in which two objects are related. Image (128x128 pixel) processing is done with a 4-layer CNN. Outputs from the CNN are treated as the objects for relation analysis, without regard for what those "objects" explicitly represent. 
Questions were processed with a long short-term memory network.
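The composite function in the Design section can be sketched in a few lines. This is a toy illustration rather than the published architecture: each of $g_{\theta}$ and $f_{\phi}$ is reduced to a single dense layer with fixed, hand-picked weights (all hypothetical), objects are plain lists of floats, and no training is performed. It is meant only to show the sum over all ordered pairs and the resulting insensitivity to object order.

```python
import math
from itertools import product


def dense(weights, biases, vec):
    """One tanh dense layer, standing in for a full multilayer perceptron."""
    return [math.tanh(sum(w * v for w, v in zip(row, vec)) + b)
            for row, b in zip(weights, biases)]


def relation_network(objects, question, g, f):
    """Compute f(sum over all ordered pairs (i, j) of g(o_i, o_j, q))."""
    if not objects:
        raise ValueError("need at least one object")
    pooled = None
    for o_i, o_j in product(objects, repeat=2):
        relation = g(o_i + o_j + question)  # list concatenation
        if pooled is None:
            pooled = relation
        else:
            pooled = [p + r for p, r in zip(pooled, relation)]
    return f(pooled)


# Hypothetical fixed weights: objects in R^2, question in R^1,
# so g sees 2 + 2 + 1 = 5 inputs and emits a 3-dimensional "relation".
G_W = [[0.5, -0.2, 0.1, 0.3, 0.7],
       [-0.4, 0.6, 0.2, -0.1, 0.05],
       [0.3, 0.3, -0.5, 0.2, -0.6]]
G_B = [0.0, 0.1, -0.1]
F_W = [[0.2, -0.3, 0.5]]
F_B = [0.05]


def g_theta(vec):
    return dense(G_W, G_B, vec)


def f_phi(vec):
    return dense(F_W, F_B, vec)


objects = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
question = [1.0]
out = relation_network(objects, question, g_theta, f_phi)
out_reversed = relation_network(list(reversed(objects)), question, g_theta, f_phi)
```

Because the pairwise relations are summed before $f_{\phi}$ is applied, `out` and `out_reversed` agree up to floating-point rounding, which mirrors the statement that RNs operate on a set of objects without regard to order.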

    Read more →
  • Diffusion model

    Diffusion model

    In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable generative models. A diffusion model consists of two major components: the forward diffusion process and the reverse sampling process. The goal of diffusion models is to learn a diffusion process for a given dataset, such that the process can generate new elements that are distributed similarly to the original dataset. A diffusion model treats data as generated by a diffusion process, whereby a new datum performs a random walk with drift through the space of all possible data. A trained diffusion model can be sampled in many ways, with different efficiency and quality. There are various equivalent formalisms, including Markov chains, denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. They are typically trained using variational inference. The model responsible for denoising is typically called its "backbone". The backbone may be of any kind, but it is typically a U-net or a transformer. As of 2024, diffusion models are mainly used for computer vision tasks, including image denoising, inpainting, super-resolution, image generation, and video generation. These typically involve training a neural network to sequentially denoise images blurred with Gaussian noise. The model is trained to reverse the process of adding noise to an image. After training to convergence, it can be used for image generation by starting with an image composed of random noise, and applying the network iteratively to denoise the image. Diffusion-based image generators, such as Stable Diffusion and DALL-E, have seen widespread commercial interest. These models typically combine diffusion models with other models, such as text-encoders and cross-attention modules, to allow text-conditioned generation. 
Other than computer vision, diffusion models have also found applications in natural language processing such as text generation and summarization, sound generation, and reinforcement learning. == Denoising diffusion model == === Non-equilibrium thermodynamics === Diffusion models were introduced in 2015 as a method to train a model that can sample from a highly complex probability distribution. They used techniques from non-equilibrium thermodynamics, especially diffusion. Consider, for example, how one might model the distribution of all naturally occurring photos. Each image is a point in the space of all images, and the distribution of naturally occurring photos is a "cloud" in space, which, by repeatedly adding noise to the images, diffuses out to the rest of the image space, until the cloud becomes all but indistinguishable from a Gaussian distribution $\mathcal{N}(0, I)$. A model that can approximately undo the diffusion can then be used to sample from the original distribution. This is studied in "non-equilibrium" thermodynamics, as the starting distribution is not in equilibrium, unlike the final distribution. The equilibrium distribution is the Gaussian distribution $\mathcal{N}(0, I)$, with pdf $\rho(x) \propto e^{-\frac{1}{2}\|x\|^{2}}$. This is just the Maxwell–Boltzmann distribution of particles in a potential well $V(x) = \frac{1}{2}\|x\|^{2}$ at temperature 1. The initial distribution, being very much out of equilibrium, would diffuse towards the equilibrium distribution, making biased random steps that are a sum of pure randomness (like a Brownian walker) and gradient descent down the potential well. The randomness is necessary: if the particles were to undergo only gradient descent, then they would all fall to the origin, collapsing the distribution. 
=== Denoising Diffusion Probabilistic Model (DDPM) === A 2020 paper proposed the Denoising Diffusion Probabilistic Model (DDPM), which improves upon the previous method by variational inference. ==== Forward diffusion ==== To present the model, some notation is required. $\beta_1, \ldots, \beta_T \in (0,1)$ are fixed constants. $\alpha_t := 1 - \beta_t$; $\bar{\alpha}_t := \alpha_1 \cdots \alpha_t$; $\sigma_t := \sqrt{1 - \bar{\alpha}_t}$; $\tilde{\sigma}_t := \frac{\sigma_{t-1}}{\sigma_t}\sqrt{\beta_t}$; $\tilde{\mu}_t(x_t, x_0) := \frac{\sqrt{\alpha_t}(1 - \bar{\alpha}_{t-1}) x_t + \sqrt{\bar{\alpha}_{t-1}}(1 - \alpha_t) x_0}{\sigma_t^{2}}$. $\mathcal{N}(\mu, \Sigma)$ is the normal distribution with mean $\mu$ and variance $\Sigma$, and $\mathcal{N}(x | \mu, \Sigma)$ is the probability density at $x$. A vertical bar denotes conditioning. A forward diffusion process starts at some starting point $x_0 \sim q$, where $q$ is the probability distribution to be learned, then repeatedly adds noise to it by $x_t = \sqrt{1 - \beta_t}\, x_{t-1} + \sqrt{\beta_t}\, z_t$ where $z_1, \ldots, z_T$ are IID (independent and identically distributed) samples from $\mathcal{N}(0, I)$. 
The coefficients $\sqrt{1 - \beta_t}$ and $\sqrt{\beta_t}$ ensure that $\operatorname{Var}(X_t) = I$ assuming that $\operatorname{Var}(X_0) = I$. The values of $\beta_t$ are chosen such that for any starting distribution of $x_0$, if it has finite second moment, then $\lim_{t \to \infty} x_t | x_0$ converges to $\mathcal{N}(0, I)$. The entire diffusion process then satisfies $q(x_{0:T}) = q(x_0)\, q(x_1 | x_0) \cdots q(x_T | x_{T-1}) = q(x_0)\, \mathcal{N}(x_1 | \sqrt{\alpha_1}\, x_0, \beta_1 I) \cdots \mathcal{N}(x_T | \sqrt{\alpha_T}\, x_{T-1}, \beta_T I)$ or $\ln q(x_{0:T}) = \ln q(x_0) - \sum_{t=1}^{T} \frac{1}{2\beta_t} \|x_t - \sqrt{1 - \beta_t}\, x_{t-1}\|^{2} + C$ where $C$ is a normalization constant and often omitted. In particular, we note that $x_{1:T} | x_0$ is a Gaussian process, which affords us considerable freedom in reparameterization. 
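The forward noising rule $x_t = \sqrt{1 - \beta_t}\, x_{t-1} + \sqrt{\beta_t}\, z_t$ can be simulated directly. The sketch below is illustrative only: the linear schedule in `linear_betas` and its default endpoints are assumptions (the text only requires fixed constants in $(0,1)$), and the helper names are hypothetical. It also accumulates $\bar{\alpha}_t$ as defined in the notation, so that the empirical mean and variance of $x_T$ given $x_0$ can be compared with $\sqrt{\bar{\alpha}_T}\, x_0$ and $\sigma_T^{2} = 1 - \bar{\alpha}_T$.

```python
import math
import random


def linear_betas(T, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear schedule of T noise levels in (0, 1)."""
    if T == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (T - 1)
    return [beta_start + step * t for t in range(T)]


def alpha_bar(betas):
    """Cumulative products alpha_bar_t = prod_{s <= t} (1 - beta_s)."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


def forward_step(x_prev, beta, rng):
    """One step: x_t = sqrt(1 - beta) * x_{t-1} + sqrt(beta) * z, z ~ N(0, I)."""
    keep, noise = math.sqrt(1.0 - beta), math.sqrt(beta)
    return [keep * x + noise * rng.gauss(0.0, 1.0) for x in x_prev]


def run_forward(x0, betas, rng):
    """Apply every forward step in order and return x_T."""
    x = list(x0)
    for beta in betas:
        x = forward_step(x, beta, rng)
    return x


if __name__ == "__main__":
    rng = random.Random(0)
    betas = linear_betas(1000)
    print("alpha_bar_T:", alpha_bar(betas)[-1])
    print("x_T sample:", run_forward([3.0, -2.0], betas, rng))
```

With a schedule that drives $\bar{\alpha}_T$ close to zero, the starting point contributes almost nothing to $x_T$, which is the sense in which the chain forgets $x_0$.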
For example, by standard manipulation of Gaussian processes, $x_t | x_0 \sim \mathcal{N}\left(\sqrt{\bar{\alpha}_t}\, x_0, \sigma_t^{2} I\right)$ and $x_{t-1} | x_t, x_0 \sim \mathcal{N}(\tilde{\mu}_t(x_t, x_0), \tilde{\sigma}_t^{2} I)$. In particular, notice that for large $t$, the variable $x_t | x_0 \sim \mathcal{N}\left(\sqrt{\bar{\alpha}_t}\, x_0, \sigma_t^{2} I\right)$ converges to $\mathcal{N}(0, I)$. That is, after a long enough diffusion process, we end up with some $x_T$ that is very close to $\mathcal{N}(0, I)$, with all traces of the original $x_0 \sim q$ gone. For example, since $x_t | x_0 \sim \mathcal{N}\left(\sqrt{\bar{\alpha}_t}\, x_0, \sigma_t^{2} I\right)$ we can sample $x_t | x_0$ directly "in one step", instead of going through all the intermediate steps $x_1, x_2, \ldots, x_{t-1}$. ==== Backward diffusion ==== The key idea of DDPM is to use a neural network parametrized by $\theta$. The network takes in two arguments $x_t, t$, and outputs a vector $\mu_{\theta}(x_t, t)$ and a matrix $\Sigma_{\theta}(x_t, t)$, such that each step in the forward diffusion process can be approximately undone by $x_{t-1} \sim \mathcal{N}(\mu_{\theta}(x_t, t), \Sigma_{\theta}(x_t, t))$. This then gives us a backward diffusion process $p_{\theta}$ defined by $p_{\theta}(x_T) = \mathcal{N}(x_T | 0, I)$

    Read more →
  • Human–AI interaction

    Human–AI interaction

    Human–AI interaction is a developing field of research and a sub-field of human–computer interaction (HCI). HCI is a field of research that explores the interactions between humans and computer-based technology, focusing on design implementation, user experience, and psychological factors. With the proliferation of artificial intelligence (AI), there has developed a sub-section of HCI research dedicated specifically to artificial intelligence and how people interact with and are impacted by it. This is human–AI interaction, abbreviated either as HAX or HAII. == Introduction == Artificial intelligence (AI), in general, has fluid definitions and varied research applications, but in brief can be applied to mechanizing tasks that would require human intelligence to complete. AI are tools designed to replicate the human abilities of navigating uncertainty, active learning, and processing information in different contexts. Within the context of HCI and HAX research, artificial intelligence can be broken into two sub-fields, natural language processing (NLP) and computer vision (CV). AI technologies notably include machine-learning, deep-learning and neural networks, and large-language models (LLMs). As a new and rapidly developing technology, AI is changing how computers work and therefore changing how humans interact with computers. Unlike the traditional human-computer interaction, where a human directs a machine, human-AI interaction is characterized by a more collaborative relationship between the computer program (the AI) and the human user, as AI is perceived as an active agent rather than a tool. This changing dynamic creates new questions and necessitates new research methods that are not present in traditional HCI research. 
According to a scoping review on the state of the discipline, the HAX field comprises research on the "design, development, and evaluation of AI systems" and encompasses the themes of human-AI collaboration, human-AI competition, human-AI conflict, and human-AI symbiosis. == Design == Machine learning and artificial intelligence have been used for decades in targeted advertising and to recommend content in social media. == User Experience (UX) == === Cognitive Frameworks in AI Tool Users === AI has been viewed with various expectations, attributions, and often misconceptions. Many people understand AI exclusively as the LLM chatbots they interact with, like ChatGPT or Claude, or other generative AI programs. Most fundamentally, humans form a mental model of an AI's reasoning and of its motivation for the decisions it recommends, and building a holistic and precise mental model of AI helps people create prompts that receive more valuable responses. However, these mental models are incomplete because people can learn about AI only through their limited interaction with it; more interaction builds a better mental model, which in turn can produce better prompt outcomes. Research on human-AI interaction has emphasized that users develop mental models of AI systems and revise those models through repeated use, feedback, and explanation, while design research has stressed the importance of communicating capabilities and limitations early and supporting trust calibration through explanation and correction. 
In a 2025 SSRN working paper, John DeVadoss proposed "Hypothetico-Deductive Interaction" (HDI), a framework that describes human-AI interaction as a mutual process of conjecture and refutation in which users test assumptions about an AI system's capabilities while the system infers and updates assumptions about user goals through its responses and clarifying questions. DeVadoss argued that this framing helps explain prompt iteration, weak capability awareness, and trust miscalibration, and suggested design responses such as clearer communication of uncertainty, easier correction, actionable explanations, and safer failure modes. == Research themes == === Human-AI collaboration === Human-AI collaboration occurs when the human and AI supervise the task on the same level and to the same extent to achieve the same goal. Some collaboration occurs in the form of augmenting human capability. AI may aid human analysis and decision-making by providing and weighing a large volume of information, and by learning to defer to the human decision when it recognizes its own unreliability. It is especially beneficial when the human can identify tasks on which the AI can be trusted to make few errors, so that little checking is required on the human's end. Some findings show signs of human-AI augmentation, or human–AI symbiosis, in which AI enhances human ability such that co-working on a task with AI produces better outcomes than a human working alone. For example: the quality and speed of customer service tasks increase when a human agent collaborates with AI, training on specific models allows AI to improve diagnoses in clinical settings, and AI with human intervention can improve the creativity of artwork, while fully AI-generated haikus were rated negatively. Human-AI synergy, a concept in which human-AI collaboration would produce better outcomes than either human or AI working alone, could explain why AI does not always help with performance. 
Some AI features and developments may accelerate human-AI synergy, while others may stall it. For example, when an AI is updated for better performance, the update sometimes worsens human-AI team performance by reducing the compatibility between the new model and the mental model a user developed on the previous version. Research has found that AI often supports human capabilities in the form of human-AI augmentation and not human-AI synergy, potentially because people rely too much on AI and stop thinking on their own. Prompting people to actively engage in analysis and think about when to follow AI recommendations reduces their over-reliance, especially for individuals with higher need for cognition. === Human-AI competition === Robots and computers have substituted routine tasks historically completed by humans, but agentic AI has made it possible to also replace cognitive tasks, including taking phone calls for appointments and driving a car. As of 2016, research estimated that 45% of paid activities could be replaced by AI by 2030. Perceived autonomy of robots is known to increase people's negative attitude toward them, and worry about the technology taking over leads people to reject it. There has been a consistent tendency of algorithm aversion in which people prefer human advice over AI advice. However, people are not always able to tell apart tasks completed by AI or other humans. See AI takeover for more information. It is also notable that this sentiment is more prominent in Western cultures, as Westerners tend to show less positive views about AI compared to East Asians. == Research on the psychological impacts of AI == === Perception of others who use AI === As much as people perceive and make judgments about AI itself, they also form impressions of themselves and others who use AI. 
In the workplace, employees who disclose the use of AI in their tasks are more likely to receive feedback that they are not as hardworking as those who are in the same job who receive non-AI help to complete the same tasks. AI use disclosure diminishes the perceived legitimacy in the employee's task and decision making which ultimately leads observers to distrust people who use AI. Although these negative effects of AI use disclosure are weakened by the observers who use AI frequently themselves, the effect is still not attenuated by the observers' positive attitude towards AI. === Bias, AI, and human === Although AI provides a wide range of information and suggestions to its users, AI itself is not free of biases and stereotypes, and it does not always help people reduce their cognitive errors and biases. People are prone to such errors by failing to see other potential ideas and cases that are not listed by AI responses and committing to a decision suggested by AI that directly contradicts the correct information and directions that they are already aware of. Gender bias is also reflected as the female gendering of AI technologies which conceptualizes females as a helpful assistant. == Emotional connection with AI == Human-AI interaction has been theorized in the context of interpersonal relationships mainly in social psychology, communications and media studies, and as a technology interface through the lens of hu

    Read more →
  • Distributional Soft Actor Critic

    Distributional Soft Actor Critic

    Distributional Soft Actor Critic (DSAC) is a suite of model-free off-policy reinforcement learning algorithms, tailored for learning decision-making or control policies in complex systems with continuous action spaces. Distinct from traditional methods that focus solely on expected returns, DSAC algorithms are designed to learn a Gaussian distribution over stochastic returns, called value distribution. This focus on Gaussian value distribution learning notably diminishes value overestimations, which in turn boosts policy performance. Additionally, the value distribution learned by DSAC can also be used for risk-aware policy learning. From a technical standpoint, DSAC is essentially a distributional adaptation of the well-established soft actor-critic (SAC) method. To date, the DSAC family comprises two iterations: the original DSAC-v1 and its successor, DSAC-T (also known as DSAC-v2), with the latter demonstrating superior capabilities over the Soft Actor-Critic (SAC) in Mujoco benchmark tasks. The source code for DSAC-T can be found at the following URL: Jingliang-Duan/DSAC-T. Both iterations have been integrated into an advanced, Pytorch-powered reinforcement learning toolkit named GOPS: GOPS (General Optimal control Problem Solver).

    Read more →
  • List of datasets for machine-learning research

    List of datasets for machine-learning research

    These datasets are used in machine learning (ML) research and have been cited in peer-reviewed academic journals. Datasets are an integral part of the field of machine learning. Major advances in this field can result from advances in learning algorithms (such as deep learning), computer hardware, and, less intuitively, the availability of high-quality training datasets. High-quality labeled training datasets for supervised and semi-supervised machine-learning algorithms are usually difficult and expensive to produce because of the large amount of time needed to label the data. Although they do not need to be labeled, high-quality unlabeled datasets for unsupervised learning can also be difficult and costly to produce. Many organizations, including governments, publish and share their datasets, often using common metadata formats (such as Croissant). The datasets are classified, based on the licenses, into two groups: open data and non-open data. The datasets from various governmental-bodies are presented in List of open government data sites. The datasets are ported on open data portals. They are made available for searching, depositing and accessing through interfaces like Open API. The datasets are made available as various sorted types and subtypes. == List of sorting used for datasets == The data portal is classified based on its type of license. The open source license based data portals are known as open data portals which are used by many government organizations and academic institutions. == List of open data portals == == List of portals suitable for multiple types of applications == The data portal sometimes lists a wide variety of subtypes of datasets pertaining to many machine learning applications. == List of portals suitable for a specific subtype of applications == The data portals which are suitable for a specific subtype of machine learning application are listed in the subsequent sections. 
== Image data == == Text data == These datasets consist primarily of text for tasks such as natural language processing, sentiment analysis, translation, and cluster analysis. === Reviews === === News articles === === Messages === === Twitter and tweets === === Dialogues === === Legal === === Other text === == Sound data == These datasets consist of sounds and sound features used for tasks such as speech recognition and speech synthesis. === Speech === === Music === === Other sounds === == Signal data == Datasets containing electric signal information requiring some sort of signal processing for further analysis. === Electrical === === Motion-tracking === === Other signals === == Chemical data == Datasets from physical systems. === Chemical Reactions with transition states (TS) === === OpenReACT-CHON-EFH === OpenReACT-CHON-EFH (Open Reaction Dataset of Atomic ConfiguraTions comprising C, H, O and N with Energies, Forces and Hessians) is a 2025 open-access benchmark for machine-learning interatomic potentials. RTP set – 35,087 stationary-point geometries (reactant, transition state and product) drawn from 11,961 elementary reactions, each labeled with density-functional energies, atomic forces and full Hessian matrices at the ωB97X-D/6-31G(d) level. IRC set – 34,248 structures along 600 minimum-energy reaction paths, used to test extrapolation beyond trained stationary points. NMS set – 62,527 off-equilibrium geometries generated by normal-mode sampling to probe model robustness under thermal perturbations. The collection underpins the study Does Hessian Data Improve the Performance of Machine Learning Potentials? and was used to train and benchmark the machine-learning interatomic potentials reported therein. The dataset itself is distributed under a CC licence via Figshare. == Physical data == Datasets from physical systems. 
=== High-energy physics === === Systems === === Astronomy === === Earth science === === Other physical === == Biological data == Datasets from biological systems. === Human === === Animal === === Fungi === === Plant === === Microbe === === Drug discovery === == Anomaly data == == Question answering data == This section includes datasets that deal with structured data. == Dialog or instruction prompted data == This section includes datasets that contain multi-turn text with at least two actors, a "user" and an "agent". The user makes requests of the agent, which carries them out. == Cybersecurity == == Climate and sustainability == == Code data == == Multivariate data == === Financial === === Weather === === Census === === Transit === === Internet === === Games === === Other multivariate === == Curated repositories of datasets == As datasets come in myriad formats and can sometimes be difficult to use, considerable work has been put into curating and standardizing the format of datasets to make them easier to use for machine learning research. OpenML: Web platform with Python, R, Java, and other APIs for downloading hundreds of machine learning datasets, evaluating algorithms on datasets, and benchmarking algorithm performance against dozens of other algorithms. PMLB: A large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms. Provides classification and regression datasets in a standardized format that are accessible through a Python API. Metatext NLP: https://metatext.io/datasets web repository maintained by the community, containing nearly 1000 benchmark datasets, and counting. Provides many tasks from classification to QA, and various languages from English and Portuguese to Arabic. Appen: Off The Shelf and Open Source Datasets hosted and maintained by the company. 
These biological, image, physical, question answering, signal, sound, text, and video resources number over 250 and can be applied to over 25 different use cases.

    Read more →
  • Log-linear model

    Log-linear model

    A log-linear model is a mathematical model that takes the form of a function whose logarithm equals a linear combination of the parameters of the model, which makes it possible to apply (possibly multivariate) linear regression. That is, it has the general form $\exp\left(c + \sum_{i} w_i f_i(X)\right)$, in which the $f_i(X)$ are quantities that are functions of the variable $X$, in general a vector of values, while $c$ and the $w_i$ stand for the model parameters. The term may specifically be used for: a log-linear plot or graph, which is a type of semi-log plot; or Poisson regression for contingency tables, a type of generalized linear model. The specific applications of log-linear models are where the output quantity lies in the range 0 to ∞, for values of the independent variables $X$, or more immediately, the transformed quantities $f_i(X)$ in the range −∞ to +∞. This may be contrasted with logistic models, similar to the logistic function, for which the output quantity lies in the range 0 to 1. Thus the contexts where these models are useful or realistic often depend on the range of the values being modelled.
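As a small illustration of why taking logarithms makes linear regression applicable, the sketch below evaluates the general form and recovers the parameters of a one-feature model by ordinary least squares on the logarithm of the output. The function names and the noise-free toy data are assumptions chosen for the example; real data would be noisy and would typically involve several features.

```python
import math


def log_linear(c, weights, features):
    """Evaluate exp(c + sum_i w_i * f_i(X)) for already-computed features."""
    return math.exp(c + sum(w * f for w, f in zip(weights, features)))


def fit_single_feature(fs, ys):
    """Least-squares fit of ln(y) = c + w * f; returns (c, w).

    Requires every y to be positive and at least two distinct f values.
    """
    logs = [math.log(y) for y in ys]
    n = len(fs)
    mean_f = sum(fs) / n
    mean_l = sum(logs) / n
    sxx = sum((f - mean_f) ** 2 for f in fs)
    sxy = sum((f - mean_f) * (l - mean_l) for f, l in zip(fs, logs))
    w = sxy / sxx
    return mean_l - w * mean_f, w


# Toy data generated from known (hypothetical) parameters c = 0.5, w = 1.5.
fs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [log_linear(0.5, [1.5], [f]) for f in fs]
c_hat, w_hat = fit_single_feature(fs, ys)
```

Every generated output is strictly positive regardless of the sign of the feature, matching the stated output range of 0 to ∞, and the fit on the log scale returns the generating parameters up to rounding.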

    Read more →
  • Yahoo Mail

    Yahoo Mail

    Yahoo! Mail (also written as Yahoo Mail) is a mailbox provider by Yahoo. It is one of the largest email services worldwide, with 225 million users. It is accessible via a web browser (webmail), mobile app, or through third-party email clients via the POP, SMTP, and IMAP protocols. Users can also connect non-Yahoo e-mail accounts to their Yahoo Mail inbox. The service was launched on October 8, 1997. The service is free for personal use, with an optional monthly fee for additional features. It is also available in several languages other than English. == History == === 1997–2002 === On October 8, 1997, Yahoo announced its acquisition of online communications company Four11 for $92 million in stock. As part of the purchase, Yahoo received Four11's RocketMail webmail service. Yahoo Mail, based on the RocketMail technology, launched at the same time. Yahoo! chose acquisition rather than internal platform development, because, as Healy said, "Hotmail was growing at thousands and thousands users per week. We did an analysis. For us to build, it would have taken four to six months, and by then, so many users would have taken an email account. The speed of the market was critical." On March 21, 2002, Yahoo! eliminated free software client access and introduced the $29.99 per year Mail Forwarding Service. Mary Osako, a Yahoo! Spokeswoman, told CNET, "For-pay services on Yahoo!, originally launched in February 1999, have experienced great acceptance from our base of active registered users, and we expect this adoption to continue to grow." === 2002–2010 === During 2002, the Yahoo network was gradually redesigned, including the company website, Yahoo Mail and other services. Along with the new design, new features were implemented, including drop-down menus in DHTML and keyboard shortcuts. On July 9, 2004, Yahoo! acquired Oddpost, a webmail service which simulated a desktop email client. 
Oddpost had features such as drag-and-drop support, right-click menus, RSS feeds, a preview pane, and increased speed using email caching to shorten response time. Many of the features were incorporated into an updated Yahoo! Mail service. ==== Competition ==== On April 1, 2004, Google announced its Gmail service with 1 GB of storage, although Gmail's invitation-only accounts kept the other webmail services at the forefront. Most major webmail providers, including Yahoo! Mail, increased their mailbox storage in response. Yahoo! first announced 100 MB of storage for basic accounts and 2 GB of storage for premium users. However, soon Yahoo Mail increased its free storage quota to 1 GB, before eventually allowing unlimited storage from March 27, 2007, until October 8, 2013. === 2011–2021 === In May 2011, Yahoo Mail rolled out a new interface. It included updated design, enhanced performance, and improved Facebook integration. In 2013, Yahoo! redesigned the site and removed several features, such as simultaneously opening multiple emails in tabs, sorting by sender name, and dragging mails to folders. The new email interface was geared to give an improved user-experience for mobile devices, but was criticized for having an inferior desktop interface. Many users objected to the unannounced nature of the changes through an online post asking Yahoo! to bring back mail tabs with one hundred thousand voting and nearly ten thousand commenting. The redesign produced a problem that caused an unknown number of users to lose access to their accounts for several weeks. In December 2013, Yahoo! Mail suffered a major outage where approximately one million users, one percent of the site's total users, could not access their emails for several days. Yahoo!'s then-CEO Marissa Mayer publicly apologized to the site's users. China Yahoo Mail announced in April 2013 that it would shut down that August as part of Yahoo ceasing services in China since acquiring a stake in Alibaba in 2005. 
Users with email address suffixes @yahoo.com.cn and @yahoo.cn could transfer their accounts to AliCloud to continue receiving messages through the end of 2014. In January 2014, an undisclosed number of usernames and passwords were released to hackers, following a security breach that Yahoo! believed had occurred through a third-party website. Yahoo! contacted affected users and requested that passwords be changed. In October 2015, Yahoo! updated the mail service with a "more subtle" redesign, as well as improved mobile features. The same release introduced the Yahoo! Account Key, a smartphone-based replacement for password logins. The app also added support for third-party mail accounts. In 2017, Yahoo! again redesigned the web interface with a "more minimal" look, and introduced the option to customize it with different color themes and layouts. In 2019, Yahoo released a redesigned Yahoo Mail app to organize user inboxes, introducing features including a one-tap unsubscribe tool, package tracking, and travel updates. In 2020, Yahoo Mail users were able to fill Walmart shopping carts directly from their inboxes, an industry first. Yahoo! also added a feature to view NFL matches. === 2022–present === In 2022, updates to the Yahoo Mail mobile app added tools to help manage receipts, gift cards, and subscriptions. AI-based additions in 2023 included a feature that automates tracking coupon codes and credits for online shopping, as well as updates to search suggestions, message summaries and AI writing assistance. In 2024, updates to the desktop interface added more AI-based features, including a "priority inbox" tab with automatically generated summaries of important messages and automated suggestions of next actions based on message contents. In February 2025, Yahoo aired its first Super Bowl ad since 2002, in which Bill Murray invited viewers to contact him at his Yahoo Mail email address. 
The address received nearly 150,000 emails in the first two hours after broadcast. In June 2025, Yahoo Mail introduced a "Catch Up" feature that provides AI-generated summaries and email previews and prompts users to choose to delete or retain each one. As part of the feature's launch, Yahoo Mail collaborated with streetwear brand Anti Social Social Club on an apparel release. == User interface == As many as three web interfaces were available at any given time. The traditional "Yahoo! Mail Classic" preserved the availability of their original 1997 interface until July 2013 in North America. A 2005 version included a new Ajax interface, drag-and-drop, improved search, keyboard shortcuts, address auto-completion, and tabs. However, other features were removed, such as column widths and one click delete-move-to-next. In October 2010, Yahoo! released a beta version of Yahoo! Mail, which included improvements to performance, search, and Facebook integration. In May 2011, this became the default interface. Their current Webmail interface was introduced in 2017. == Spam policy == Yahoo! Mail is often used by spammers to provide a "remove me" email address. Often, these addresses are used to verify the recipient's address, thus opening the door for more spam. Yahoo! does not tolerate this practice and terminates accounts connected with spam-related activities without warning, causing spammers to lose access to any other Yahoo! services connected with their ID under the Terms of Service. Additionally, Yahoo! stresses that its servers are based in California and any spam-related activity which uses its servers could potentially violate that state's anti-spam laws. In February 2006, Yahoo! announced its decision (along with AOL) to give some organizations the option to "certify" mail by paying up to one cent for each outgoing message, allowing the mail in question to bypass inbound spam filters. 
Few mailers used it, and Goodmail, the company running the certification process, shut down in 2011. === Filters === In order to prevent abuse, in 2002 Yahoo! Mail activated filters which changed certain words (that could trigger unwanted JavaScript events) and word fragments into other words. "mocha" was changed to "espresso", "expression" became "statement", and "eval" (short for "evaluation") became "review". This resulted in many unintended corrections, such as "prreviewent" (prevalent), "reviewuation" (evaluation) and "medireview" (medieval). When asked about these changes, Yahoo! explained that the changed words were common terms used in web scripting and were blacklisted to prevent hackers from sending damaging commands via the program's HTML function. Starting before February 7, 2006, Yahoo! Mail ended the practice, and began to add an underscore as a prefix to certain suspicious words and word fragments. === Greylisting === Incoming mail to Yahoo! addresses can be subjected to deferred delivery as part of Yahoo's incoming spam controls. This can delay delivery of mail sent to Yahoo! addresses without the sender or recipients being aware of it. The deferral is typically of short duration, but

    Read more →
  • Amazon Rekognition

    Amazon Rekognition

    Amazon Rekognition is a cloud-based software as a service (SaaS) computer vision platform that was launched in 2016. It has been sold to, and used by, a number of United States government agencies, including U.S. Immigration and Customs Enforcement (ICE) and Orlando, Florida police, as well as private entities. == Capabilities == Rekognition provides a number of computer vision capabilities, which can be divided into two categories: Algorithms that are pre-trained on data collected by Amazon or its partners, and algorithms that a user can train on a custom dataset. As of July 2019, Rekognition provides the following computer vision capabilities. === Pre-trained algorithms === Celebrity recognition in images Facial attribute detection in images, including gender, age range, emotions (e.g. happy, calm, disgusted), whether the face has a beard or mustache, whether the face has eyeglasses or sunglasses, whether the eyes are open, whether the mouth is open, whether the person is smiling, and the location of several markers such as the pupils and jaw line. People Pathing enables tracking of people through a video. An advertised use-case of this capability is to track sports players for post-game analysis. Text detection and classification in images Unsafe visual content detection === Algorithms that a user can train on a custom dataset === SearchFaces enables users to import a database of images with pre-labeled faces, to train a machine learning model on this database, and to expose the model as a cloud service with an API. Then, the user can post new images to the API and receive information about the faces in the image. The API can be used to expose a number of capabilities, including identifying faces of known people, comparing faces, and finding similar faces in a database. Face-based user verification == History and use == === 2017 === In late 2017, the Washington County, Oregon Sheriff's Office began using Rekognition to identify suspects' faces. 
Rekognition was marketed as a general-purpose computer vision tool, and an engineer working for Washington County decided to use the tool for facial analysis of suspects. Rekognition was offered to the department for free, and Washington County became the first US law enforcement agency known to use Rekognition. In 2018, the agency logged over 1,000 facial searches. The county, according to the Washington Post, by 2019 was paying about $7 a month for all of its searches. The relationship was unknown to the public until May 2018. In 2018, Rekognition was also used to help identify celebrities during a royal wedding telecast. === 2018 === In April 2018, it was reported that FamilySearch was using Rekognition to enable their users to "see which of their ancestors they most resemble based on family photographs". In early 2018, the FBI also began using it as a pilot program for analyzing video surveillance. In May 2018, it was reported by the ACLU that Orlando, Florida was running a pilot using Rekognition for facial analysis in law enforcement, with that pilot ending in July 2019. After the report, on June 22, 2018, Gizmodo reported that Amazon workers had written a letter to CEO Jeff Bezos requesting he cease selling Rekognition to US law enforcement, particularly ICE and Homeland Security. A letter was also sent to Bezos by the ACLU. On June 26, 2018, it was reported that the Orlando police force had ceased using Rekognition after their trial contract expired, reserving the right to use it in the future. The Orlando Police Department said that they had "never gotten to the point to test images" due to old infrastructure and low bandwidth. In July 2018, the ACLU released a test showing that Rekognition had falsely matched 28 members of Congress with mugshot photos, particularly Congresspeople of color. 25 House members afterwards sent a letter to Bezos, expressing concern about Rekognition. 
Amazon responded saying the Rekognition test had generated 80 percent confidence, while it recommended law enforcement only use matches rated at 99 percent confidence. The Washington Post states that Oregon instead has officers pick a "best of five" result, instead of adhering to the recommendation. In September 2018, it was reported that Mapillary was using Rekognition to read the text on parking signs (e.g. no stopping, no parking, or specific parking hours) in cities. In October 2018, it was reported that Amazon had earlier that year pitched Rekognition to U.S. Immigration and Customs Enforcement agency. Amazon defended government use of Rekognition. On December 1, 2018, it was reported that 8 Democratic lawmakers had said in a letter that Amazon had "failed to provide sufficient answers" about Rekognition, writing that they had "serious concerns that this type of product has significant accuracy issues, places disproportionate burdens on communities of color, and could stifle Americans' willingness to exercise their First Amendment rights in public." === 2019 === In January 2019, MIT researchers published a peer-reviewed study asserting that Rekognition had more difficulty in identifying dark-skinned females than competitors such as IBM and Microsoft. In the study, Rekognition misidentified darker-skinned women as men 31% of the time, but made no mistakes for light-skinned men. Amazon called the report "misinterpreted results" of the research with an improper "default confidence threshold." In January 2019, Amazon's shareholders "urged Amazon to stop selling Rekognition software to law enforcement agencies." Amazon in response defended its use of Rekognition, but supported new federal oversight and guidelines to "make sure facial recognition technology cannot be used to discriminate." 
In February 2019, it was reported that Amazon was collaborating with the National Institute of Standards and Technology (NIST) on developing standardized tests to improve accuracy and remove bias with facial recognition. In March 2019, an open letter regarding Rekognition was sent by a group of prominent AI researchers to Amazon, criticizing its sale to law enforcement with around 50 signatures. In April 2019, Amazon was told by the Securities and Exchange Commission that they had to vote on two shareholder proposals seeking to limit Rekognition. Amazon argued that the proposals were an "insignificant public policy issue for the Company" not related to Amazon's ordinary business, but their appeal was denied. The vote was set for May. The first proposal was tabled by shareholders. On May 24, 2019, 2.4% of shareholders voted to stop selling Rekognition to government agencies, while a second proposal calling for a study into Rekognition and civil rights had 27.5% support. In August 2019, the ACLU again used Rekognition on members of government, with 26 of 120 lawmakers in California flagged as matches to mugshots. Amazon stated the ACLU was "misusing" the software in the tests, by not dismissing results that did not meet Amazon's recommended accuracy threshold of 99%. By August 2019, there had been protests against ICE's use of Rekognition to surveil immigrants. In March 2019, Amazon announced a Rekognition update that would improve emotional detection, and in August 2019, "fear" was added to emotions that Rekognition could detect. === 2020 === In June 2020, Amazon announced it was implementing a one-year moratorium on police use of Rekognition, in response to the George Floyd protests. === 2024 === The Department of Justice disclosed that the FBI is initiating the use of Amazon Rekognition. The DOJ's AI inventory revealed the FBI's "Project Tyr" aims to customize Rekognition to identify nudity, weapons, explosives, and other information from lawfully acquired media. 
=== 2025 === In late 2025, the New York Times reported that Dr. Jürgen Matthäus, a scientist and retired head of research at the U.S. Holocaust Memorial Museum in Washington, D.C., had used Amazon Rekognition to identify the shooter in the Holocaust photograph known as The Last Jew in Vinnitsa "with more than 99 percent certainty" as Jakobus Onnen (1906–1943), a teacher from Tichelwarf near Weener in East Frisia who had been a member of the SS since 1934 and was later killed in action near Zhitomir in 1943. The photographer and victim remain unidentified. == Controversy regarding facial analysis == === Racial and gender bias === In 2018, MIT researchers Joy Buolamwini and Timnit Gebru published a study called Gender Shades. In this study, a set of images was collected, and faces in the images were labeled with face position, gender, and skin tone information. The images were run through SaaS facial recognition platforms from Face++, IBM, and Microsoft. In all three of these platforms, the classifiers performed best on male faces (with error rates on female faces being 8.1% to 20.6% higher than error rates on male faces), and they performed worst on dark female faces (with error rates ranging from 20.8% to 30.4%). The authors hypothesized that this discr

    Read more →
  • Absorbing Markov chain

    Absorbing Markov chain

    In the mathematical theory of probability, an absorbing Markov chain is a Markov chain in which every state can reach an absorbing state. An absorbing state is a state that, once entered, cannot be left. Like general Markov chains, there can be continuous-time absorbing Markov chains with an infinite state space. However, this article concentrates on the discrete-time discrete-state-space case. == Formal definition == A Markov chain is an absorbing chain if there is at least one absorbing state and it is possible to go from any state to at least one absorbing state in a finite number of steps. In an absorbing Markov chain, a state that is not absorbing is called transient. === Canonical form === Let an absorbing Markov chain with transition matrix $P$ have $t$ transient states and $r$ absorbing states. The rows of $P$ represent sources, while columns represent destinations. By ordering the transient states before the absorbing states, it can be assumed that $P$ has the form $P = \begin{bmatrix} Q & R \\ \mathbf{0} & I_r \end{bmatrix},$ where $Q$ is a $t$-by-$t$ matrix, $R$ is a nonzero $t$-by-$r$ matrix, $\mathbf{0}$ is an $r$-by-$t$ zero matrix, and $I_r$ is the $r$-by-$r$ identity matrix. Thus, $Q$ describes the probability of transitioning from some transient state to another while $R$ describes the probability of transitioning from some transient state to some absorbing state. The probability of transitioning from $i$ to $j$ in exactly $k$ steps is the $(i,j)$-entry of $P^k$, further computed below. When considering only transient states, the probability is found in the upper left of $P^k$, the $(i,j)$-entry of $Q^k$. == Fundamental matrix == === Expected number of visits to a transient state === A basic property about an absorbing Markov chain is the expected number of visits to a transient state $j$ starting from a transient state $i$ (before being absorbed). This can be established to be given by the $(i,j)$-entry of so-called fundamental matrix $N$, obtained by summing $Q^k$ for all $k$ (from 0 to ∞). 
It can be proven that $N := \sum_{k=0}^{\infty} Q^k = (I_t - Q)^{-1},$ where $I_t$ is the $t$-by-$t$ identity matrix. The computation of this formula is the matrix equivalent of the geometric series of scalars, $\sum_{k=0}^{\infty} q^k = \tfrac{1}{1-q}$. With the matrix $N$ in hand, other properties of the Markov chain are also easy to obtain. === Expected number of steps before being absorbed === The expected number of steps before being absorbed in any absorbing state, when starting in transient state $i$, can be computed via a sum over transient states. The value is given by the $i$th entry of the vector $\mathbf{t} := N\mathbf{1},$ where $\mathbf{1}$ is a length-$t$ column vector whose entries are all 1. === Absorbing probabilities === By induction, $P^k = \begin{bmatrix} Q^k & (I_t - Q^k)NR \\ \mathbf{0} & I_r \end{bmatrix}.$ The probability of eventually being absorbed in the absorbing state $j$ when starting from transient state $i$ is given by the $(i,j)$-entry of the matrix $B := NR$. The number of columns of this matrix equals the number of absorbing states $r$. An approximation of those probabilities can also be obtained directly from the $(i,j)$-entry of $P^k$ for a large enough value of $k$, when $i$ is the index of a transient, and $j$ the index of an absorbing state. This is because $\left(\lim_{k\to\infty} P^k\right)_{i,t+j} = B_{i,j}$. === Transient visiting probabilities === The probability of visiting transient state $j$ when starting at a transient state $i$ is the $(i,j)$-entry of the matrix $H := (N - I_t)(N_{\operatorname{dg}})^{-1},$ where $N_{\operatorname{dg}}$ is the diagonal matrix with the same diagonal as $N$. 
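The quantities defined so far ($N$, $\mathbf{t}$, $B$ and $H$) can be computed mechanically once $Q$ and $R$ are known. The following is a minimal sketch in plain Python, using exact fractions so the results are not blurred by rounding. The helper names (`identity`, `matmul`, `inverse`, `absorbing_summary`) are hypothetical and chosen only for this illustration, and the sample `Q` and `R` are the transient and absorbing blocks of the coin-flip chain from the "String generation" example in this article.

```python
from fractions import Fraction


def identity(n):
    """Return the n-by-n identity matrix with Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(m)
    aug = [list(row) + id_row for row, id_row in zip(m, identity(n))]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def absorbing_summary(Q, R):
    """Return (N, steps, B, H) for transient block Q and absorbing block R."""
    size = len(Q)
    I = identity(size)
    # Fundamental matrix N = (I_t - Q)^{-1}.
    N = inverse([[I[i][j] - Q[i][j] for j in range(size)] for i in range(size)])
    # Expected steps before absorption: t = N 1, i.e. the row sums of N.
    steps = [sum(row) for row in N]
    # Absorption probabilities B = N R.
    B = matmul(N, R)
    # Visiting probabilities H = (N - I_t) (N_dg)^{-1}: scale column j by 1/N[j][j].
    H = [[(N[i][j] - I[i][j]) / N[j][j] for j in range(size)] for i in range(size)]
    return N, steps, B, H


half = Fraction(1, 2)
# Transient states: empty string, "H", "HT"; absorbing state: "HTH".
Q = [[half, half, 0], [0, half, half], [half, 0, 0]]
R = [[0], [0], [half]]
N, steps, B, H = absorbing_summary(Q, R)
print([str(x) for x in steps])  # ['10', '8', '6']
```

Exact rational arithmetic is a design choice for readability here; for large chains a numerical linear algebra library would normally be used instead, solving a linear system rather than forming the inverse explicitly.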
=== Variance on number of transient visits === The variance on the number of visits to a transient state $j$ when starting at a transient state $i$ (before being absorbed) is the $(i,j)$-entry of the matrix $N_2 := N(2N_{\operatorname{dg}} - I_t) - N_{\operatorname{sq}},$ where $N_{\operatorname{sq}}$ is the Hadamard product of $N$ with itself (i.e. each entry of $N$ is squared). === Variance on number of steps === The variance on the number of steps before being absorbed when starting in transient state $i$ is the $i$th entry of the vector $(2N - I_t)\mathbf{t} - \mathbf{t}_{\operatorname{sq}},$ where $\mathbf{t}_{\operatorname{sq}}$ is the Hadamard product of $\mathbf{t}$ with itself (i.e., as with $N_{\operatorname{sq}}$, each entry of $\mathbf{t}$ is squared). == Examples == === String generation === Consider the process of repeatedly flipping a fair coin until the sequence (heads, tails, heads) appears. This process is modeled by an absorbing Markov chain with transition matrix $P = \begin{bmatrix} 1/2 & 1/2 & 0 & 0 \\ 0 & 1/2 & 1/2 & 0 \\ 1/2 & 0 & 0 & 1/2 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$ The first state represents the empty string, the second state the string "H", the third state the string "HT", and the fourth state the string "HTH". Although in reality, the coin flips cease after the string "HTH" is generated, the perspective of the absorbing Markov chain is that the process has transitioned into the absorbing state representing the string "HTH" and, therefore, cannot leave. For this absorbing Markov chain, the fundamental matrix is $N = (I - Q)^{-1} = \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1/2 & 1/2 & 0 \\ 0 & 1/2 & 1/2 \\ 1/2 & 0 & 0 \end{bmatrix} \right)^{-1} = \begin{bmatrix} 1/2 & -1/2 & 0 \\ 0 & 1/2 & -1/2 \\ -1/2 & 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 4 & 4 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 2 \end{bmatrix}.$ 
The expected number of steps starting from each of the transient states is $\mathbf{t} = N\mathbf{1} = \begin{bmatrix} 4 & 4 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 10 \\ 8 \\ 6 \end{bmatrix}.$ Therefore, the expected number of coin flips before observing the sequence (heads, tails, heads) is 10, the entry for the state representing the empty string. === Games of chance === Games based entirely on chance can be modeled by an absorbing Markov chain. A classic example of this is the ancient Indian board game Snakes and Ladders. As the transition matrix is raised to larger and larger powers, the probability mass in the lone absorbing state, which represents the final square, can be plotted against the power. To determine the expected number of turns to complete the game, compute the vector $\mathbf{t}$ as described above and examine $t_{\text{start}}$, which is approximately 39.2. === Infectious disease testing === Infectious disease testing, either of blood products or in medical clinics, is often taught as an example of an absorbing Markov chain. The public U.S. Centers for Disease Control and Prevention (CDC) model for HIV and for hepatitis B, for example, illustrates the property that absorbing Markov chains can lead to the detection of disease, versus the loss of detection through other means. In the standard CDC model, the Markov chain has five states, a state in which the individual is uninfected, then a state with infected but undetectable virus, a state with detectable virus, and absorbing states of having quit/been lost from the clinic, or of having been detected (the goal). 
The typical rates of transition between the Markov states are the probability $p$ per unit time of being infected with the virus, $w$ for the rate of window period removal (time until virus is detectable), $q$ for quit/loss rate from the system, and $d$ for detection, assuming a typical rate $\lambda$ at which the health system administers tests of the blood product or patients in question. It follows that we can "walk along" the Markov model to identify the overall probability of detection for a person starting as undetected, by multiplying the probabilities of transition to each next state of the model as: $\frac{p}{(p+q)} \frac{w}{(w+q)} \frac{d}{(d+q)}$. The subsequent total absolute number of false negative tests—the primary CDC concern—would then be the rate of tests, multiplied by the probability of reaching the infected but undetectable state, times the duration of staying in the infected undetectable state: $\frac{p}{(p+q)} \frac{1}{(w+q)} \lambda$.
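The two closed-form expressions above are simple enough to evaluate directly. The sketch below does so in Python; the function names and the numeric rates are hypothetical placeholders chosen only to show the arithmetic, not values taken from any CDC model.

```python
from fractions import Fraction


def detection_probability(p, w, d, q):
    """Probability of eventually reaching the 'detected' absorbing state.

    Each factor is the chance of moving on to the next state rather than
    quitting: infection (p), end of window period (w), detection (d),
    each competing with the quit/loss rate q.
    """
    return (p / (p + q)) * (w / (w + q)) * (d / (d + q))


def false_negative_tests(p, w, q, test_rate):
    """Expected number of false negative tests.

    Probability of reaching the infected-but-undetectable state, times the
    expected time spent there (1 / (w + q)), times the testing rate.
    """
    return (p / (p + q)) * (1 / (w + q)) * test_rate


# Hypothetical illustrative rates per unit time (assumptions, not real data).
p = Fraction(1, 100)   # infection
w = Fraction(1, 2)     # window period removal
d = Fraction(1, 4)     # detection
q = Fraction(1, 100)   # quit/loss
lam = Fraction(2)      # tests administered per unit time

print(detection_probability(p, w, d, q))   # 625/1326
print(false_negative_tests(p, w, q, lam))  # 100/51
```

Using `Fraction` keeps the products exact so each factor can be checked by hand; ordinary floats work equally well for the same functions.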

    Read more →