Stockfish (chess)

Stockfish (chess)

Stockfish is a free and open-source chess engine, available for various desktop and mobile platforms. It can be used in chess software through the Universal Chess Interface. Stockfish has been one of the strongest chess engines in the world for several years. It has won all main events of the Top Chess Engine Championship (TCEC) and the Chess.com Computer Chess Championship (CCC) since 2020 and, as of May 2026, is the strongest CPU chess engine in the world with an estimated Elo rating of 3653 in a time control of 40/15 (15 minutes to make 40 moves), according to CCRL. The Stockfish engine was developed by Tord Romstad, Marco Costalba, and Joona Kiiski, and was derived from Glaurung, an open-source engine by Tord Romstad released in 2004. It is now being developed and maintained by the Stockfish community. Stockfish historically used only a classical hand-crafted function to evaluate board positions, but with the introduction of the efficiently updatable neural network (NNUE) in August 2020, Stockfish 12 adopted a hybrid evaluation system that primarily used the neural network and occasionally relied on the hand-crafted evaluation. In July 2023, Stockfish removed the hand-crafted evaluation and transitioned to a fully neural network-based approach. == Features == Stockfish uses a tree-search algorithm based on alpha–beta search with several hand-designed heuristics. Stockfish represents positions using bitboards. Stockfish supports Chess960, a feature it inherited from Glaurung. Support for Syzygy tablebases, previously available in a fork maintained by Ronald de Man, was integrated into Stockfish in 2014. In 2018, support for the 7-man Syzygy was added, shortly after the tablebase was made available. Stockfish supports an unlimited number of CPU threads in multiprocessor systems, with a maximum transposition table size of 32 TB. Stockfish has been a very popular engine on various platforms. On desktop, it is the default chess engine bundled with the Internet Chess Club interface programs BlitzIn and Dasher. On mobile, it has been bundled with the Stockfish app, SmallFish and Droidfish. Other Stockfish-compatible graphical user interfaces (GUIs) include Fritz, Arena, Stockfish for Mac, and PyChess. Stockfish can be compiled to WebAssembly or JavaScript, allowing it to run in the browser. Both Chess.com and Lichess provide Stockfish in this form in addition to a server-side program. Release versions and development versions are available as C++ source code and as precompiled versions for Microsoft Windows, macOS, Linux 32-bit/64-bit and Android. == History == The program originated from Glaurung, an open-source chess engine created by Tord Romstad and first released in 2004. Four years later, Marco Costalba forked the project, naming it Stockfish because it was "produced in Norway and cooked in Italy" (Romstad is Norwegian and Costalba is Italian). The first version, Stockfish 1.0, was released in November 2008. For a while, new ideas and code changes were transferred between the two programs in both directions, until Romstad decided to discontinue Glaurung in favor of Stockfish, which was the stronger engine at the time. The last Glaurung version (2.2) was released in December 2008. Around 2011, Romstad decided to abandon his involvement with Stockfish in order to spend more time on his new iOS chess app. On 18 June 2014 Marco Costalba announced that he had "decided to step down as Stockfish maintainer" and asked that the community create a fork of the current version and continue its development. An official repository, managed by a volunteer group of core Stockfish developers, was created soon after and currently manages the development of the project. === Fishtest === Since 2013, Stockfish has been developed using a distributed testing framework named Fishtest, where volunteers can donate CPU time for testing improvements to the program. Changes to game-playing code are accepted or rejected based on results of playing of tens of thousands of games on the framework against an older "reference" version of the program, using sequential probability ratio testing. Tests on the framework are verified using the chi-squared test, and only if the results are statistically significant are they deemed reliable and used to revise the software code. After the inception of Fishtest, Stockfish gained 120 Elo points in 12 months, propelling it to the top of all major rating lists. As of May 2026, the framework has used a total of more than 20,100 years of CPU time to play over 10 billion chess games. === NNUE === In June 2020, Stockfish introduced the efficiently updatable neural network (NNUE) approach, based on earlier work by computer shogi programmers. Instead of using manually designed heuristics to evaluate the board, this approach introduced a neural network trained on millions of positions which could be evaluated quickly on CPU. On 2 September 2020, the twelfth version of Stockfish was released, incorporating NNUE, and reportedly winning ten times more game pairs than it loses when matched against version eleven. In July 2023, the classical evaluation was completely removed in favor of the NNUE evaluation. == Competition results == === Top Chess Engine Championship === Stockfish is a TCEC multiple-time champion and the current leader in trophy count. Ever since TCEC restarted in 2013, Stockfish has finished first or second in every season except one. Stockfish finished second in TCEC Season 4 and 5, with scores of 23–25 first against Houdini 3 and later against Komodo 1142 in the Superfinal event. Season 5 was notable for the winning Komodo team as they accepted the award posthumously for the program's creator Don Dailey, who succumbed to an illness during the final stage of the event. In his honor, the version of Stockfish that was released shortly after that season was named "Stockfish DD". On 30 May 2014, Stockfish 170514 (a development version of Stockfish 5 with tablebase support) convincingly won TCEC Season 6, scoring 35.5–28.5 against Komodo 7x in the Superfinal. Stockfish 5 was released the following day. In TCEC Season 7, Stockfish again made the Superfinal, but lost to Komodo with a score of 30.5–33.5. In TCEC Season 8, despite losses on time caused by buggy code, Stockfish nevertheless qualified once more for the Superfinal, but lost 46.5–53.5 to Komodo. In Season 9, Stockfish defeated Houdini 5 with a score of 54.5–45.5. Stockfish finished third during season 10 of TCEC, the only season since 2013 in which Stockfish had failed to qualify for the superfinal. It did not lose a game but was still eliminated because it was unable to score enough wins against lower-rated engines. After this technical elimination, Stockfish went on a long winning streak, winning seasons 11 (59–41 against Houdini 6.03), 12 (60–40 against Komodo 12.1.1), and 13 (55–45 against Komodo 2155.00) convincingly. In Season 14, Stockfish faced a new challenger in Leela Chess Zero, eking out a win by one point (50.5–49.5). Its winning streak was finally ended in Season 15, when Leela qualified again and won 53.5–46.5, but Stockfish promptly won Season 16, defeating AllieStein 54.5–45.5, after Leela failed to qualify for the Superfinal. In Season 17, Stockfish faced Leela again in the superfinal, losing 52.5–47.5. However, Stockfish has won every Superfinal since: beating Leela 53.5–46.5 in Season 18, 54.5–45.5 in Season 19, 53–47 in Season 20, and 56–44 in Season 21. In Season 22, Komodo Dragon beat out Leela to qualify for the Superfinal, losing to Stockfish by a large margin 59.5–40.5. Stockfish did not lose an opening pair in this match. Leela made the Superfinal in Seasons 23 and 24, but was crushed by Stockfish both times (58.5–41.5 and 58–42). In Season 25, Stockfish once again defeated Leela, but this time by a narrower margin of 52–48. Stockfish also took part in the TCEC cup, winning the first edition, but was surprisingly upset by Houdini in the semifinals of the second edition. Stockfish recovered to beat Komodo in the third-place playoff. In the third edition, Stockfish made it to the finals, but was defeated by Leela Chess Zero after blundering in a 7-man endgame tablebase draw. It turned this result around in the fourth edition, defeating Leela in the final 4.5–3.5. In TCEC Cup 6, Stockfish finished third after losing to AllieStein in the semifinals, the first time it had failed to make the finals. Since then, Stockfish has consistently won the tournament, with the exception of the 11th edition which Leela won 8.5–7.5. === Chess.com Computer Chess Championship === Ever since Chess.com hosted its first Chess.com Computer Chess Championship in 2018, Stockfish has been the most successful engine. It dominated the earlier championships, winning six consecutive titles before finishing second in CCC7. Since then, its dominance has come under threat from the neural-network engines Leelenstein and Leela Chess Zero, but it has continued to perform w

Domain adaptation

Domain adaptation is a field associated with machine learning and transfer learning. It addresses the challenge of training a model on one data distribution (the source domain) and applying it to a related but different data distribution (the target domain). A common example is spam filtering, where a model trained on emails from one user (source domain) is adapted to handle emails for another user with significantly different patterns (target domain). Domain adaptation techniques can also leverage unrelated data sources to improve learning. When multiple source distributions are involved, the problem extends to multi-source domain adaptation. Domain adaptation is a specific type of transfer learning. According to the taxonomy laid out by Pan and Yang (2010), it falls into the category of transductive transfer learning. In this setting, the source and target tasks are the same (e.g., both are object recognition), but the domains differ (different marginal distributions). This distinguishes it from inductive transfer learning (where labeled data is available for the target task) and unsupervised transfer learning (where labels are unavailable in both domains). == Classification of domain adaptation problems == Domain adaptation setups are classified in two different ways: according to the distribution shift between the domains, and according to the available data from the target domain. === Distribution shifts === Common distribution shifts are classified as follows: Covariate Shift occurs when the input distributions of the source and destination change, but the relationship between inputs and labels remains unchanged. The above-mentioned spam filtering example typically falls in this category. Namely, the distributions (patterns) of emails may differ between the domains, but emails labeled as spam in the one domain should similarly be labeled in another. Prior Shift (Label Shift) occurs when the label distribution differs between the source and target datasets, while the conditional distribution of features given labels remains the same. An example is a classifier of hair color in images from Italy (source domain) and Norway (target domain). The proportions of hair colors (labels) differ, but images within classes like blond and black-haired populations remain consistent across domains. A classifier for the Norway population can exploit this prior knowledge of class proportions to improve its estimates. Concept Shift (Conditional Shift) refers to changes in the relationship between features and labels, even if the input distribution remains the same. For instance, in medical diagnosis, the same symptoms (inputs) may indicate entirely different diseases (labels) in different populations (domains). === Data available during training === Domain adaptation problems typically assume that some data from the target domain is available during training. Problems can be classified according to the type of this available data: Unsupervised: Unlabeled data from the target domain is available, but no labeled data. In the above-mentioned example of spam filtering, this corresponds to the case where emails from the target domain (user) are available, but they are not labeled as spam. Domain adaptation methods can benefit from such unlabeled data, by comparing its distribution (patterns) with the labeled source domain data. Semi-supervised: Most data that is available from the target domain is unlabelled, but some labeled data is also available. In the above-mentioned case of spam filter design, this corresponds to the case that the target user has labeled some emails as being spam or not. Supervised: All data that is available from the target domain is labeled. In this case, domain adaptation reduces to refinement of the source domain predictor. In the above-mentioned example classification of hair-color from images, this could correspond to the refinement of a network already trained on a large dataset of labeled images from Italy, using newly available labeled images from Norway. == Formalization == Let X {\displaystyle X} be the input space (or description space) and let Y {\displaystyle Y} be the output space (or label space). The objective of a machine learning algorithm is to learn a mathematical model (a hypothesis) h : X → Y {\displaystyle h:X\to Y} able to attach a label from Y {\displaystyle Y} to an example from X {\displaystyle X} . This model is learned from a learning sample S = { ( x i , y i ) ∈ ( X × Y ) } i = 1 m {\displaystyle S=\{(x_{i},y_{i})\in (X\times Y)\}_{i=1}^{m}} . Usually in supervised learning (without domain adaptation), we suppose that the examples ( x i , y i ) ∈ S {\displaystyle (x_{i},y_{i})\in S} are drawn i.i.d. from a distribution D S {\displaystyle D_{S}} of support X × Y {\displaystyle X\times Y} (unknown and fixed). The objective is then to learn h {\displaystyle h} (from S {\displaystyle S} ) such that it commits the least error possible for labelling new examples coming from the distribution D S {\displaystyle D_{S}} . The main difference between supervised learning and domain adaptation is that in the latter situation we study two different (but related) distributions D S {\displaystyle D_{S}} and D T {\displaystyle D_{T}} on X × Y {\displaystyle X\times Y} . The domain adaptation task then consists of the transfer of knowledge from the source domain D S {\displaystyle D_{S}} to the target one D T {\displaystyle D_{T}} . The goal is then to learn h {\displaystyle h} (from labeled or unlabelled samples coming from the two domains) such that it commits as little error as possible on the target domain D T {\displaystyle D_{T}} . The major issue is the following: if a model is learned from a source domain, what is its capacity to correctly label data coming from the target domain? == Four algorithmic principles == === Reweighting algorithms === The objective is to reweight the source labeled sample such that it "looks like" the target sample (in terms of the error measure considered). === Iterative algorithms === A method for adapting consists in iteratively "auto-labeling" the target examples. The principle is simple: a model h {\displaystyle h} is learned from the labeled examples; h {\displaystyle h} automatically labels some target examples; a new model is learned from the new labeled examples. Note that there exist other iterative approaches, but they usually need target labeled examples. === Search of a common representation space === The goal is to find or construct a common representation space for the two domains. The objective is to obtain a space in which the domains are close to each other while keeping good performances on the source labeling task. This can be achieved through the use of Adversarial machine learning techniques where feature representations from samples in different domains are encouraged to be indistinguishable. === Hierarchical Bayesian Model === The goal is to construct a Bayesian hierarchical model p ( n ) {\displaystyle p(n)} , which is essentially a factorization model for counts n {\displaystyle n} , to derive domain-dependent latent representations allowing both domain-specific and globally shared latent factors. == Software packages == Several compilations of domain adaptation and transfer learning algorithms have been implemented over the past decades: SKADA (Python) ADAPT (Python) TLlib (Python) Domain-Adaptation-Toolbox (MATLAB)

Integrated writing environment

An integrated writing environment (IWE) is software that provides comprehensive writing and knowledge management functionality for writers and information workers. IWEs enable writers and information workers to perform a variety of tasks related to the document in the IWE in a single environment. This provides a distraction-free workspace and streamlined writing experience. IWEs provide similar efficiency and functionality benefits to writers and information professionals that integrated development environments (IDEs) provide to software developers. == Overview == IWEs are designed to maximize productivity and help improve the quality of written work by integrating together tools that allow users to work effectively in a single application. The IWE features may include integrated content search, reversion management, outlining, note management, and reference management, as may be suitable for the target field of use. == List of IWEs == Celtx This IWE is intended for screenplay writers and has screenplay writing and management tools. Celtex provides tools for the pre-production work phase, story development, storyboarding, script breakdowns, production scheduling, and reports. Scrivener This IWE targets novel, research paper, and script writing. Scrivener provides tools to organize notes and research documents for easy access and referencing. After completing the writing, Scrivener allows the user to export the document to formats supported by common word processors, such as Microsoft Word. TeXstudio This IWE targets LaTeX documents and provides interactive spelling checker, code folding, and syntax highlighting.

Contextual image classification

Contextual image classification, a topic of pattern recognition in computer vision, is an approach of classification based on contextual information in images. "Contextual" means this approach is focusing on the relationship of the nearby pixels, which is also called neighbourhood. The goal of this approach is to classify the images by using the contextual information. == Introduction == Similar as processing language, a single word may have multiple meanings unless the context is provided, and the patterns within the sentences are the only informative segments we care about. For images, the principle is same. Find out the patterns and associate proper meanings to them. As the image illustrated below, if only a small portion of the image is shown, it is very difficult to tell what the image is about. Even try another portion of the image, it is still difficult to classify the image. However, if we increase the contextual of the image, then it makes more sense to recognize. As the full images shows below, almost everyone can classify it easily. During the procedure of segmentation, the methods which do not use the contextual information are sensitive to noise and variations, thus the result of segmentation will contain a great deal of misclassified regions, and often these regions are small (e.g., one pixel). Compared to other techniques, this approach is robust to noise and substantial variations for it takes the continuity of the segments into account. Several methods of this approach will be described below. == Applications == === Functioning as a post-processing filter to a labelled image === This approach is very effective against small regions caused by noise. And these small regions are usually formed by few pixels or one pixel. The most probable label is assigned to these regions. However, there is a drawback of this method. The small regions also can be formed by correct regions rather than noise, and in this case the method is actually making the classification worse. This approach is widely used in remote sensing applications. === Improving the post-processing classification === This is a two-stage classification process: For each pixel, label the pixel and form a new feature vector for it. Use the new feature vector and combine the contextual information to assign the final label to the === Merging the pixels in earlier stages === Instead of using single pixels, the neighbour pixels can be merged into homogeneous regions benefiting from contextual information. And provide these regions to classifier. === Acquiring pixel feature from neighbourhood === The original spectral data can be enriched by adding the contextual information carried by the neighbour pixels, or even replaced in some occasions. This kind of pre-processing methods are widely used in textured image recognition. The typical approaches include mean values, variances, texture description, etc. === Combining spectral and spatial information === The classifier uses the grey level and pixel neighbourhood (contextual information) to assign labels to pixels. In such case the information is a combination of spectral and spatial information. === Powered by the Bayes minimum error classifier === Contextual classification of image data is based on the Bayes minimum error classifier (also known as a naive Bayes classifier). Present the pixel: A pixel is denoted as x 0 {\displaystyle x_{0}} . The neighbourhood of each pixel x 0 {\displaystyle x_{0}} is a vector and denoted as N ( x 0 ) {\displaystyle N(x_{0})} . The values in the neighbourhood vector is denoted as f ( x i ) {\displaystyle f(x_{i})} . Each pixel is presented by the vector ξ = ( f ( x 0 ) , f ( x 1 ) , … , f ( x k ) ) {\displaystyle \xi =\left(f(x_{0}),f(x_{1}),\ldots ,f(x_{k})\right)} x i ∈ N ( x 0 ) ; i = 1 , … , k {\displaystyle x_{i}\in N(x_{0});\quad i=1,\ldots ,k} The labels (classification) of pixels in the neighbourhood N ( x 0 ) {\displaystyle N(x_{0})} are presented as a vector η = ( θ 0 , θ 1 , … , θ k ) {\displaystyle \eta =\left(\theta _{0},\theta _{1},\ldots ,\theta _{k}\right)} θ i ∈ { ω 0 , ω 1 , … , ω k } {\displaystyle \theta _{i}\in \left\{\omega _{0},\omega _{1},\ldots ,\omega _{k}\right\}} ω s {\displaystyle \omega _{s}} here denotes the assigned class. A vector presents the labels in the neighbourhood N ( x 0 ) {\displaystyle N(x_{0})} without the pixel x 0 {\displaystyle x_{0}} η ^ = ( θ 1 , θ 2 , … , θ k ) {\displaystyle {\hat {\eta }}=\left(\theta _{1},\theta _{2},\ldots ,\theta _{k}\right)} The neighbourhood: Size of the neighbourhood. There is no limitation of the size, but it is considered to be relatively small for each pixel x 0 {\displaystyle x_{0}} . A reasonable size of neighbourhood would be 3 × 3 {\displaystyle 3\times 3} of 4-connectivity or 8-connectivity ( x 0 {\displaystyle x_{0}} is marked as red and placed in the centre). The calculation: Apply the minimum error classification on a pixel x 0 {\displaystyle x_{0}} , if the probability of a class ω r {\displaystyle \omega _{r}} being presenting the pixel x 0 {\displaystyle x_{0}} is the highest among all, then assign ω r {\displaystyle \omega _{r}} as its class. θ 0 = ω r if P ( ω r ∣ f ( x 0 ) ) = max s = 1 , 2 , … , R P ( ω s ∣ f ( x 0 ) ) {\displaystyle \theta _{0}=\omega _{r}\quad {\text{ if }}\quad P(\omega _{r}\mid f(x_{0}))=\max _{s=1,2,\ldots ,R}P(\omega _{s}\mid f(x_{0}))} The contextual classification rule is described as below, it uses the feature vector x 1 {\displaystyle x_{1}} rather than x 0 {\displaystyle x_{0}} . θ 0 = ω r if P ( ω r ∣ ξ ) = max s = 1 , 2 , … , R P ( ω s ∣ ξ ) {\displaystyle \theta _{0}=\omega _{r}\quad {\text{ if }}\quad P(\omega _{r}\mid \xi )=\max _{s=1,2,\ldots ,R}P(\omega _{s}\mid \xi )} Use the Bayes formula to calculate the posteriori probability P ( ω s ∣ ξ ) {\displaystyle P(\omega _{s}\mid \xi )} P ( ω s ∣ ξ ) = p ( ξ ∣ ω s ) P ( ω s ) p ( ξ ) {\displaystyle P(\omega _{s}\mid \xi )={\frac {p(\xi \mid \omega _{s})P(\omega _{s})}{p\left(\xi \right)}}} The number of vectors is the same as the number of pixels in the image. For the classifier uses a vector corresponding to each pixel x i {\displaystyle x_{i}} , and the vector is generated from the pixel's neighbourhood. The basic steps of contextual image classification: Calculate the feature vector ξ {\displaystyle \xi } for each pixel. Calculate the parameters of probability distribution p ( ξ ∣ ω s ) {\displaystyle p(\xi \mid \omega _{s})} and P ( ω s ) {\displaystyle P(\omega _{s})} Calculate the posterior probabilities P ( ω r ∣ ξ ) {\displaystyle P(\omega _{r}\mid \xi )} and all labels θ 0 {\displaystyle \theta _{0}} . Get the image classification result. == Algorithms == === Template matching === The template matching is a "brute force" implementation of this approach. The concept is first create a set of templates, and then look for small parts in the image match with a template. This method is computationally high and inefficient. It keeps an entire templates list during the whole process and the number of combinations is extremely high. For a m × n {\displaystyle m\times n} pixel image, there could be a maximum of 2 m × n {\displaystyle 2^{m\times n}} combinations, which leads to high computation. This method is a top down method and often called table look-up or dictionary look-up. === Lower-order Markov chain === The Markov chain also can be applied in pattern recognition. The pixels in an image can be recognised as a set of random variables, then use the lower order Markov chain to find the relationship among the pixels. The image is treated as a virtual line, and the method uses conditional probability. === Hilbert space-filling curves === The Hilbert curve runs in a unique pattern through the whole image, it traverses every pixel without visiting any of them twice and keeps a continuous curve. It is fast and efficient. === Markov meshes === The lower-order Markov chain and Hilbert space-filling curves mentioned above are treating the image as a line structure. The Markov meshes however will take the two dimensional information into account. === Dependency tree === The dependency tree is a method using tree dependency to approximate probability distributions.

Colloquis

Colloquis, previously known as ActiveBuddy and Conversagent, was a company that created conversation-based interactive agents originally distributed via instant messaging platforms. The company had offices in New York, New York, and Sunnyvale, California. == History == Founded in 2000, the company was the brainchild of Robert Hoffer, Timothy Kay, and Peter Levitan. The idea for interactive agents (also known as Internet bots) came from the team's vision to add functionality to increasingly popular instant messaging services. The original implementation took shape as a word-based adventure game but quickly grew to include a wide range of database applications, including access to news, weather, stock information, movie times, Yellow Pages listings, and detailed sports data, as well as a variety of tools (calculators, translator, etc.). These various applications were bundled into one entity and launched as SmarterChild in 2001. SmarterChild acted as a showcase for the quick data access and possibilities for fun conversation that the company planned to turn into customized, niche-specific products. The rapid success of SmarterChild led to targeted promotional products for Radiohead, Austin Powers, The Sporting News, and others. ActiveBuddy sought to strengthen its hold on the interactive agent market for the future by filing for, and receiving, a controversial patent on their creation in 2002. The company also released the BuddyScript SDK, a free developer kit that allow programmers to design and launch their own interactive agents using ActiveBuddy's proprietary scripting language, in 2002. Ultimately, however, the decline in ad spending in 2001 and 2002 led to a shift in corporate strategy towards business focused Automated Service Agents, building products for clients including Cingular, Comcast and Cox Communications. The company subsequently changed its name from ActiveBuddy to Conversagent in 2003, and then again to Colloquis in 2006. Colloquis was purchased by Microsoft in October 2006.

Automated decision-making

Automated decision-making (ADM) is the use of data, machines and algorithms to make decisions in a range of contexts, including public administration, business, health, education, law, employment, transport, media and entertainment, with varying degrees of human oversight or intervention. ADM may involve large-scale data from a range of sources, such as databases, text, social media, sensors, images or speech, that is processed using various technologies including computer software, algorithms, machine learning, natural language processing, artificial intelligence, augmented intelligence and robotics. The increasing use of automated decision-making systems (ADMS) across a range of contexts presents many benefits and challenges to human society requiring consideration of the technical, legal, ethical, societal, educational, economic and health consequences. == Overview == There are different definitions of ADM based on the level of automation involved. Some definitions suggests ADM involves decisions made through purely technological means without human input, such as the EU's General Data Protection Regulation (Article 22). However, ADM technologies and applications can take many forms ranging from decision-support systems that make recommendations for human decision-makers to act on, sometimes known as augmented intelligence or 'shared decision-making', to fully automated decision-making processes that make decisions on behalf of individuals or organizations without human involvement. Models used in automated decision-making systems can be as simple as checklists and decision trees through to artificial intelligence and deep neural networks (DNN). Since the 1950s computers have gone from being able to do basic processing to having the capacity to undertake complex, ambiguous and highly skilled tasks such as image and speech recognition, gameplay, scientific and medical analysis and inferencing across multiple data sources. ADM is now being increasingly deployed across all sectors of society and many diverse domains from entertainment to transport. An ADM system (ADMS) may involve multiple decision points, data sets, and technologies (ADMT) and may sit within a larger administrative or technical system such as a criminal justice system or business process. == Data == Automated decision-making involves using data as input to be analyzed within a process, model, or algorithm or for learning and generating new models. ADM systems may use and connect a wide range of data types and sources depending on the goals and contexts of the system, for example, sensor data for self-driving cars and robotics, identity data for security systems, demographic and financial data for public administration, medical records in health, criminal records in law. This can sometimes involve vast amounts of data and computing power. === Data quality === The quality of the available data and its ability to be used in ADM systems is fundamental to the outcomes. It is often highly problematic for many reasons. Datasets are often highly variable; corporations or governments may control large-scale data, restricted for privacy or security reasons, incomplete, biased, limited in terms of time or coverage, measuring and describing terms in different ways, and many other issues. For machines to learn from data, large corpora are often required, which can be challenging to obtain or compute; however, where available, they have provided significant breakthroughs, for example, in diagnosing chest X-rays. == ADM technologies == Automated decision-making technologies (ADMT) are software-coded digital tools that automate the translation of input data to output data, contributing to the function of automated decision-making systems. There are a wide range of technologies in use across ADM applications and systems. ADMTs involving basic computational operations Search (includes 1-2-1, 1-2-many, data matching/merge) Matching (two different things) Mathematical Calculation (formula) ADMTs for assessment and grouping: User profiling Recommender systems Clustering Classification Feature learning Predictive analytics (includes forecasting) ADMTs relating to space and flows: Social network analysis (includes link prediction) Mapping Routing ADMTs for processing of complex data formats Image processing Audio processing Natural Language Processing (NLP) Other ADMT Business rules management systems Time series analysis Anomaly detection Modelling/Simulation === Machine learning === Machine learning (ML) involves training computer programs through exposure to large data sets and examples to learn from experience and solve problems. Machine learning can be used to generate and analyse data as well as make algorithmic calculations and has been applied to image and speech recognition, translations, text, data and simulations. While machine learning has been around for some time, it is becoming increasingly powerful due to recent breakthroughs in training deep neural networks (DNNs), and dramatic increases in data storage capacity and computational power with GPU coprocessors and cloud computing. Machine learning systems based on foundation models run on deep neural networks and use pattern matching to train a single huge system on large amounts of general data such as text and images. Early models tended to start from scratch for each new problem however since the early 2020s many are able to be adapted to new problems. Examples of these technologies include Open AI's DALL-E (an image creation program) and their various GPT language models, and Google's PaLM language model program. == Applications == ADM is being used to replace or augment human decision-making by both public and private-sector organisations for a range of reasons including to help increase consistency, improve efficiency, reduce costs and enable new solutions to complex problems. === Debate === Research and development are underway into uses of technology to assess argument quality, assess argumentative essays and judge debates. Potential applications of these argument technologies span education and society. Scenarios to consider, in these regards, include those involving the assessment and evaluation of conversational, mathematical, scientific, interpretive, legal, and political argumentation and debate. === Law === In legal systems around the world, algorithmic tools such as risk assessment instruments (RAI), are being used to supplement or replace the human judgment of judges, civil servants and police officers in many contexts. In the United States RAI are being used to generate scores to predict the risk of recidivism in pre-trial detention and sentencing decisions, evaluate parole for prisoners and to predict "hot spots" for future crime. These scores may result in automatic effects or may be used to inform decisions made by officials within the justice system. In Canada ADM has been used since 2014 to automate certain activities conducted by immigration officials and to support the evaluation of some immigrant and visitor applications. === Economics === Automated decision-making systems are used in certain computer programs to create buy and sell orders related to specific financial transactions and automatically submit the orders in the international markets. Computer programs can automatically generate orders based on predefined set of rules using trading strategies which are based on technical analyses, advanced statistical and mathematical computations, or inputs from other electronic sources. === Business === ==== Continuous auditing ==== Continuous auditing uses advanced analytical tools to automate auditing processes. It can be utilized in the private sector by business enterprises and in the public sector by governmental organizations and municipalities. As artificial intelligence and machine learning continue to advance, accountants and auditors may make use of increasingly sophisticated algorithms which make decisions such as those involving determining what is anomalous, whether to notify personnel, and how to prioritize those tasks assigned to personnel. === Media and entertainment === Digital media, entertainment platforms, and information services increasingly provide content to audiences via automated recommender systems based on demographic information, previous selections, collaborative filtering or content-based filtering. This includes music and video platforms, publishing, health information, product databases and search engines. Many recommender systems also provide some agency to users in accepting recommendations and incorporate data-driven algorithmic feedback loops based on the actions of the system user. Large-scale machine learning language models and image creation programs being developed by companies such as OpenAI and Google in the 2020s have restricted access however they are likely to have widespread application in fields such as advertising, copywriting, stock imagery and gra

Common data model

A common data model (CDM) can refer to any standardised data model which allows for data and information exchange between different applications and data sources. Common data models aim to standardise logical infrastructure so that related applications can "operate on and share the same data", and can be seen as a way to "organize data from many sources that are in different formats into a standard structure". A common data model has been described as one of the components of a "strong information system". A standardised common data model has also been described as a typical component of a well designed agile application besides a common communication protocol. Providing a single common data model within an organisation is one of the typical tasks of a data warehouse. == Examples of common data models == === Border crossings === X-trans.eu was a cross-border pilot project between the Free State of Bavaria (Germany) and Upper Austria with the aim of developing a faster procedure for the application and approval of cross-border large-capacity transports. The portal was based on a common data model that contained all the information required for approval. === Climate data === The Climate Data Store Common Data Model is a common data model set up by the Copernicus Climate Change Service for harmonising essential climate variables from different sources and data providers. === General information technology === Within service-oriented architecture, S-RAMP is a specification released by HP, IBM, Software AG, TIBCO, and Red Hat which defines a common data model for SOA repositories as well as an interaction protocol to facilitate the use of common tooling and sharing of data. Content Management Interoperability Services (CMIS) is an open standard for inter-operation of different content management systems over the internet, and provides a common data model for typed files and folders used with version control. The NetCDF software libraries for array-oriented scientific data implements a common data model called the NetCDF Java common data model, which consists of three layers built on top of each other to add successively richer semantics. === Health === Within genomic and medical data, the Observational Medical Outcomes Partnership (OMOP) research program established under the U.S. National Institutes of Health has created a common data model for claims and electronic health records which can accommodate data from different sources around the world. PCORnet, which was developed by the Patient-Centered Outcomes Research Institute, is another common data model for health data including electronic health records and patient claims. The Sentinel Common Data Model was initially started as Mini-Sentinel in 2008. It is used by the Sentinel Initiative of the USA's Food and Drug Administration. The Generalized Data Model was first published in 2019. It was designed to be a stand-alone data model as well as to allow for further transformation into other data models (e.g., OMOP, PCORNet, Sentinel). It has a hierarchical structure to flexibly capture relationships among data elements. The JANUS clinical trial data repository also provides a common data model which is based on the SDTM standard to represent clinical data submitted to regulatory agencies, such as tabulation datasets, patient profiles, listings, etc. === Logistics === SX000i is a specification developed jointly by the Aerospace and Defence Industries Association of Europe (ASD) and the American Aerospace Industries Association (AIA) to provide information, guidance and instructions to ensure compatibility and the commonality. The associated SX002D specification contains a common data model. === Microsoft Common Data Model === The Microsoft Common Data Model is a collection of many standardised extensible data schemas with entities, attributes, semantic metadata, and relationships, which represent commonly used concepts and activities in various businesses areas. It is maintained by Microsoft and its partners, and is published on GitHub. Microsoft's Common Data Model is used amongst others in Microsoft Dataverse and with various Microsoft Power Platform and Microsoft Dynamics 365 services. === Rail transport === RailTopoModel is a common data model for the railway sector. === Other === There are many more examples of various common data models for different uses published by different sources.