AARON is the collective name for a series of computer programs written by artist Harold Cohen that create original artistic images autonomously, which set it apart from previous programs. Proceeding from Cohen's initial question "What are the minimum conditions under which a set of marks functions as an image?", AARON was in development between 1972 and the 2010s. As the software is not open source, its development effectively ended with Cohen's death in 2016. The name "AARON" does not seem to be an acronym; rather, it was a name chosen to start with the letter "A" so that the names of successive programs could follow it alphabetically. However, Cohen did not create any other major programs. Initial versions of AARON created abstract drawings that grew more complex through the 1970s. More representational imagery was added in the 1980s; first rocks, then plants, then people. In the 1990s more representational figures set in interior scenes were added, along with color. AARON returned to more abstract imagery, this time in color, in the early 2000s. Cohen used machines that allowed AARON to produce physical artwork. The first machines drew in black and white using a succession of custom-built "turtle" and flatbed plotter devices. Cohen would sometimes color these images by hand in fabric dye (Procion), or scale them up to make larger paintings and murals. In the 1990s Cohen built a series of digital painting machines to output AARON's images in ink and fabric dye. His later work used a large-scale inkjet printer on canvas. Development of AARON began in the C programming language then switched to Lisp in the early 1990s. Cohen credits Lisp with helping him solve the challenges he faced in adding color capabilities to AARON. An article about Cohen appeared in Computer Answers that describes AARON and shows two line drawings that were exhibited at the Tate gallery. The article goes on to describe the workings of AARON, then running on a DEC VAX 750 minicomputer. Raymond Kurzweil's company has produced a downloadable screensaver of AARON for Microsoft Windows PCs. This version of AARON can also produce printable images. AARON's source code is not publicly available, but Cohen has described AARON's operations in various essays and it is discussed in abstract in Pamela McCorduck's book. AARON cannot learn new styles or imagery on its own; each new capability must be hand-coded by Cohen. It is capable of producing a practically infinite supply of distinct images in its own style. Examples of these images have been exhibited in galleries worldwide. AARON's artwork has been used as an artistic equivalent of the Turing test. It does seem however that AARON's output follows a noticeable formula (figures standing next to a potted plant, framed within a colored square is a common theme). Cohen is very careful not to claim that AARON is creative. But he does ask "If what AARON is making is not art, what is it exactly, and in what ways, other than its origin, does it differ from the 'real thing?' If it is not thinking, what exactly is it doing?" — The further exploits of AARON, Painter. The Whitney Museum featured AARON in 2024, showcasing the evolution of AARON as the earliest artificial intelligence (AI) program for artmaking.
Intelligent agent
In artificial intelligence, an intelligent agent is an entity that perceives its environment, takes actions autonomously to achieve goals, and may improve its performance through machine learning or by acquiring knowledge. AI textbooks define artificial intelligence as the "study and design of intelligent agents," emphasizing that goal-directed behavior is central to intelligence. A specialized subset of intelligent agents, agentic AI (also known as an AI agent or simply agent), expands this concept by proactively pursuing goals, making decisions, and taking actions over extended periods. Intelligent agents can range from simple to highly complex. A basic thermostat or control system is considered an intelligent agent, as is a human being, or any other system that meets the same criteria—such as a firm, a state, or a biome. Intelligent agents operate based on an objective function, which encapsulates their goals. They are designed to create and execute plans that maximize the expected value of this function upon completion. For example, a reinforcement learning agent has a reward function, which allows programmers to shape its desired behavior. Similarly, an evolutionary algorithm's behavior is guided by a fitness function. Intelligent agents in artificial intelligence are closely related to agents in economics, and versions of the intelligent agent paradigm are studied in cognitive science, ethics, and the philosophy of practical reason, as well as in many interdisciplinary socio-cognitive modeling and computer social simulations. Intelligent agents are often described schematically as abstract functional systems similar to computer programs . To distinguish theoretical models from real-world implementations, abstract descriptions of intelligent agents are called abstract intelligent agents. Intelligent agents are also closely related to software agents—autonomous computer programs that carry out tasks on behalf of users. They are also referred to using a term borrowed from economics: a "rational agent". == Intelligent agents as the foundation of AI == The concept of intelligent agents provides a foundational lens through which to define and understand artificial intelligence. For instance, the influential textbook Artificial Intelligence: A Modern Approach (Russell & Norvig) describes: Agent: Anything that perceives its environment (using sensors) and acts upon it (using actuators). E.g., a robot with cameras and wheels, or a software program that reads data and makes recommendations. Rational Agent: An agent that strives to achieve the best possible outcome based on its knowledge and past experiences. "Best" is defined by a performance measure – a way of evaluating how well the agent is doing. Artificial Intelligence (as a field): The study and creation of these rational agents. Other researchers and definitions build upon this foundation. Padgham & Winikoff emphasize that intelligent agents should react to changes in their environment in a timely way, proactively pursue goals, and be flexible and robust (able to handle unexpected situations). Some also suggest that ideal agents should be "rational" in the economic sense (making optimal choices) and capable of complex reasoning, like having beliefs, desires, and intentions (BDI model). Kaplan and Haenlein offer a similar definition, focusing on a system's ability to understand external data, learn from that data, and use what is learned to achieve goals through flexible adaptation. Defining AI in terms of intelligent agents offers several key advantages: Avoids Philosophical Debates: It sidesteps arguments about whether AI is "truly" intelligent or conscious, like those raised by the Turing test or Searle's Chinese Room. It focuses on behavior and goal achievement, not on replicating human thought. Objective Testing: It provides a clear, scientific way to evaluate AI systems. Researchers can compare different approaches by measuring how well they maximize a specific "goal function" (or objective function). This allows for direct comparison and combination of techniques. Interdisciplinary Communication: It creates a common language for AI researchers to collaborate with other fields like mathematical optimization and economics, which also use concepts like "goals" and "rational agents." == Objective function == An objective function (or goal function) specifies the goals of an intelligent agent. An agent is deemed more intelligent if it consistently selects actions that yield outcomes better aligned with its objective function. In effect, the objective function serves as a measure of success. The objective function may be: Simple: For example, in a game of Go, the objective function might assign a value of 1 for a win and 0 for a loss. Complex: It might require the agent to evaluate and learn from past actions, adapting its behavior based on patterns that have proven effective. The objective function encapsulates all of the goals the agent is designed to achieve. For rational agents, it also incorporates the trade-offs between potentially conflicting goals. For instance, a self-driving car's objective function might balance factors such as safety, speed, and passenger comfort. Different terms are used to describe this concept, depending on the context. These include: Utility function: Often used in economics and decision theory, representing the desirability of a state. Objective function: A general term used in optimization. Loss function: Typically used in machine learning, where the goal is to minimize the loss (error). Reward Function: Used in reinforcement learning. Fitness Function: Used in evolutionary systems. Goals, and therefore the objective function, can be: Explicitly defined: Programmed directly into the agent. Induced: Learned or evolved over time. In reinforcement learning, a "reward function" provides feedback, encouraging desired behaviors and discouraging undesirable ones. The agent learns to maximize its cumulative reward. In evolutionary systems, a "fitness function" determines which agents are more likely to reproduce. This is analogous to natural selection, where organisms evolve to maximize their chances of survival and reproduction. Some AI systems, such as nearest-neighbor, reason by analogy rather than being explicitly goal-driven. However, even these systems can have goals implicitly defined within their training data. Such systems can still be benchmarked by framing the non-goal system as one whose "goal" is to accomplish its narrow classification task. Systems not traditionally considered agents, like knowledge-representation systems, are sometimes included in the paradigm by framing them as agents with a goal of, for example, answering questions accurately. Here, the concept of an "action" is extended to encompass the "act" of providing an answer. As a further extension, mimicry-driven systems can be framed as agents optimizing a "goal function" based on how closely the agent mimics the desired behavior. In generative adversarial networks (GANs) of the 2010s, an "encoder"/"generator" component attempts to mimic and improvise human text composition. The generator tries to maximize a function representing how well it can fool an antagonistic "predictor"/"discriminator" component. While symbolic AI systems often use an explicit goal function, the paradigm also applies to neural networks and evolutionary computing. Reinforcement learning can generate intelligent agents that appear to act in ways intended to maximize a "reward function". Sometimes, instead of setting the reward function directly equal to the desired benchmark evaluation function, machine learning programmers use reward shaping to initially give the machine rewards for incremental progress. Yann LeCun stated in 2018, "Most of the learning algorithms that people have come up with essentially consist of minimizing some objective function." AlphaZero chess had a simple objective function: +1 point for each win, and -1 point for each loss. A self-driving car's objective function would be more complex. Evolutionary computing can evolve intelligent agents that appear to act in ways intended to maximize a "fitness function" influencing how many descendants each agent is allowed to leave. The mathematical formalism of AIXI was proposed as a maximally intelligent agent in this paradigm. However, AIXI is uncomputable. In the real world, an intelligent agent is constrained by finite time and hardware resources, and scientists compete to produce algorithms that achieve progressively higher scores on benchmark tests with existing hardware. == Agent function == An intelligent agent's behavior can be described mathematically by an agent function. This function determines what the agent does based on what it has seen. A percept refers to the agent's sensory inputs at a single point in time. For example, a self-driving car's percepts might include camera images, lidar data, GPS coordinates, and speed r
Quantum artificial life
Quantum artificial life is the application of quantum algorithms with the ability to simulate biological behavior. Quantum computers offer many potential improvements to processes performed on classical computers, including machine learning and artificial intelligence. Artificial intelligence applications are often inspired by the idea of mimicking human brains through closely related biomimicry. This has been implemented to a certain extent on classical computers (using neural networks), but quantum computers offer many advantages in the simulation of artificial life. Artificial life and artificial intelligence are extremely similar, with minor differences; the goal of studying artificial life is to understand living beings better, while the goal of artificial intelligence is to create intelligent beings. In 2016, Alvarez-Rodriguez et al. developed a proposal for a quantum artificial life algorithm with the ability to simulate life and Darwinian evolution. In 2018, the same research team led by Alvarez-Rodriguez performed the proposed algorithm on the IBM ibmqx4 quantum computer, and received optimistic results. The results accurately simulated a system with the ability to undergo self-replication at the quantum scale. == Artificial life on quantum computers == The growing advancement of quantum computers has led researchers to develop quantum algorithms for simulating life processes. Researchers have designed a quantum algorithm that can accurately simulate Darwinian Evolution. Since the complete simulation of artificial life on quantum computers has only been actualized by one group, this section shall focus on the implementation by Alvarez-Rodriguez, Sanz, Lomata, and Solano on an IBM quantum computer. Individuals were realized as two qubits, one representing the genotype of the individual and the other representing the phenotype. The genotype is copied to transmit genetic information through generations, and the phenotype is dependent on the genetic information as well as the individual's interactions with their environment. In order to set up the system, the state of the genotype is instantiated by some rotation of an ancillary state ( | 0 ⟩ ⟨ 0 | {\displaystyle |0\rangle \langle 0|} ). The environment is a two-dimensional spatial grid occupied by individuals and ancillary states. The environment is divided into cells that are able to possess one or more individuals. Individuals move throughout the grid and occupy cells randomly; when two or more individuals occupy the same cell they interact with each other. === Self replication === The ability to self-replicate is critical for simulating life. Self-replication occurs when the genotype of an individual interacts with an ancillary state, creating a genotype for a new individual; this genotype interacts with a different ancillary state in order to create the phenotype. During this interaction, one would like to copy some information about the initial state into the ancillary state, but by the no cloning theorem, it is impossible to copy an arbitrary unknown quantum state. However, physicists have derived different methods for quantum cloning which does not require the exact copying of an unknown state. The method that has been implemented by Alvarez-Rodriguez et al. is one that involves the cloning of the expectation value of some observable. For a unitary U {\displaystyle U} which copies the expectation value of some set of observables X {\displaystyle {\mathsf {X}}} of state ρ {\displaystyle \rho } into a blank state ρ e {\displaystyle \rho _{e}} , the cloning machine is defined by any ( U , ρ e , X ) {\displaystyle (U,\rho _{e},{\mathsf {X}})} that fulfill the following: ∀ ρ ∀ X ∈ X {\displaystyle \forall \rho \forall X\in {\mathsf {X}}} X ¯ = X 1 ¯ = X 2 ¯ {\displaystyle {\bar {X}}={\bar {X_{1}}}={\bar {X_{2}}}} Where X ¯ {\displaystyle {\bar {X}}} is the mean value of the observable in ρ {\displaystyle \rho } before cloning, X 1 ¯ {\displaystyle {\bar {X_{1}}}} is the mean value of the observable in ρ {\displaystyle \rho } after cloning, and X 2 ¯ {\displaystyle {\bar {X_{2}}}} is the mean value of the observable in ρ e {\displaystyle \rho _{e}} after cloning. Note that the cloning machine has no dependence on ρ {\displaystyle \rho } because we want to be able to clone the expectation of the observables for any initial state. It is important to note that cloning the mean value of the observable transmits more information than is allowed classically. The calculation of the mean value is defined naturally as: X ¯ = T r [ ρ X ] {\displaystyle {\bar {X}}=Tr[\rho X]} , X 1 ¯ = T r [ R X ⊗ I ] {\displaystyle {\bar {X_{1}}}=Tr[RX\otimes I]} , X 2 ¯ = T r [ R I ⊗ X ] {\displaystyle {\bar {X_{2}}}=Tr[RI\otimes X]} where R = U ρ ⊗ ρ e U † {\displaystyle R=U\rho \otimes \rho _{e}U^{\dagger }} The simplest cloning machine clones the expectation value of σ z {\displaystyle \sigma _{z}} in arbitrary state ρ = | ψ ⟩ ⟨ ψ | {\displaystyle \rho =|\psi \rangle \langle \psi |} to ρ e = | 0 ⟩ ⟨ 0 | {\displaystyle \rho _{e}=|0\rangle \langle 0|} using U = C N O T {\displaystyle U=CNOT} . This is the cloning machine implemented for self-replication by Alvarez-Rodriguez et al. The self-replication process clearly only requires interactions between two qubits, and therefore this cloning machine is the only one necessary for self replication. === Interactions === Interactions occur between individuals when the two take up the same space on the environmental grid. The presence of interactions between individuals provides an advantage for shorter-lifespan individuals. When two individuals interact, exchanges of information between the two phenotypes may or may not occur based on their existing values. When both individual's control qubits (genotypes) are alike, no information will be exchanged. When the control qubits differ, the target qubits (phenotype) will be exchanged between the two individuals. This procedure produces a constantly changing predator-prey dynamic in the simulation. Therefore, long-living qubits, with a larger genetic makeup in the simulation, are at a disadvantage. Since information is only exchanged when interacting with an individual of different genetic makeup, the short-lived population has the advantage. === Mutation === Mutations exist in the artificial world with limited probability, equivalent to their occurrence in the real world. There are two ways in which the individual can mutate: through random single qubit rotations and by errors in the self-replication process. There are two different operators that act on the individual and cause mutations. The M operation causes a spontaneous mutation within the individual by rotating a single qubit by parameter θ. The parameter θ is random for each mutation, which creates biodiversity within the artificial environment. The M operation is a unitary matrix which can be described as: M = ( cos ( θ ) s i n ( θ ) s i n ( θ ) − c o s ( θ ) ) {\displaystyle M={\begin{pmatrix}\cos(\theta )&sin(\theta )\\sin(\theta )&-cos(\theta )\end{pmatrix}}} The other possible way for mutations to occur is due to errors in the replication process. Due to the no-cloning theorem, it is impossible to produce perfect copies of systems that are originally in unknown quantum states. However, quantum cloning machines make it possible to create imperfect copies of quantum states, in other words, the process introduces some degree of error. The error that exists in current quantum cloning machines is the root cause for the second kind of mutations in the artificial life experiment. The imperfect cloning operation can be seen as: U M ( θ ) = I 4 + 1 2 ( 0 0 0 1 ) ⊗ ( − 1 1 1 − 1 ) ( c o s θ + i s i n θ + 1 ) {\displaystyle U_{M}(\theta )=\mathrm {I} _{4}+{\frac {1}{2}}{\begin{pmatrix}0&0\\0&1\end{pmatrix}}\otimes {\begin{pmatrix}-1&1\\1&-1\end{pmatrix}}(cos\theta +isin\theta +1)} The two kinds of mutations affect the individual differently. While the spontaneous M operation does not affect the phenotype of the individual, the self-replicating error mutation, UM, alters both the genotype of the individual, and its associated lifetime. The presence of mutations in the quantum artificial life experiment is critical for providing randomness and biodiversity. The inclusion of mutations helps to increase the accuracy of the quantum algorithm. === Death === At the instant the individual is created (when the genotype is copied into the phenotype), the phenotype interacts with the environment. As time evolves, the interaction of the individual with the environment simulates aging which eventually leads to the death of the individual. The death of an individual occurs when the expectation value of σ z {\displaystyle \sigma _{z}} is within some ϵ {\displaystyle \epsilon } of 1 in the phenotype, or, equivalently, when ρ p = | 0 ⟩ ⟨ 0 | {\displaystyle \rho _{p}=|0\rangle \langle 0|} The Lindbladian describes the interaction of the individual with the environment: ρ
Learning to rank
Learning to rank (LTR) or machine-learned ranking (MLR) is the application of machine learning, often supervised, semi-supervised or reinforcement learning, in the construction of ranking models for information retrieval and recommender systems. Training data may, for example, consist of lists of items with some partial order specified between items in each list. This order is typically induced by giving a numerical or ordinal score or a binary judgment (e.g. "relevant" or "not relevant") for each item. The goal of constructing the ranking model is to rank new, unseen lists in a similar way to rankings in the training data. == Applications == === In information retrieval === Ranking is a central part of many information retrieval problems, such as document retrieval, collaborative filtering, sentiment analysis, and online advertising. A possible architecture of a machine-learned search engine is shown in the accompanying figure. Training data consists of queries and documents matching them together with the relevance degree of each match. It may be prepared manually by human assessors (or raters, as Google calls them), who check results for some queries and determine relevance of each result. It is not feasible to check the relevance of all documents, and so typically a technique called pooling is used — only the top few documents, retrieved by some existing ranking models are checked. This technique may introduce selection bias. Alternatively, training data may be derived automatically by analyzing clickthrough logs (i.e. search results which got clicks from users), query chains, or such search engines' features as Google's (since-replaced) SearchWiki. Clickthrough logs can be biased by the tendency of users to click on the top search results on the assumption that they are already well-ranked. Training data is used by a learning algorithm to produce a ranking model which computes the relevance of documents for actual queries. Typically, users expect a search query to complete in a short time (such as a few hundred milliseconds for web search), which makes it impossible to evaluate a complex ranking model on each document in the corpus, and so a two-phase scheme is used. First, a small number of potentially relevant documents are identified using simpler retrieval models which permit fast query evaluation, such as the vector space model, Boolean model, weighted AND, or BM25. This phase is called top- k {\displaystyle k} document retrieval and many heuristics were proposed in the literature to accelerate it, such as using a document's static quality score and tiered indexes. In the second phase, a more accurate but computationally expensive machine-learned model is used to re-rank these documents. === In other areas === Learning to rank algorithms have been applied in areas other than information retrieval: In machine translation for ranking a set of hypothesized translations; In computational biology for ranking candidate 3-D structures in protein structure prediction problems; In recommender systems for identifying a ranked list of related news articles to recommend to a user after he or she has read a current news article. == Feature vectors == For the convenience of MLR algorithms, query-document pairs are usually represented by numerical vectors, which are called feature vectors. Such an approach is sometimes called bag of features and is analogous to the bag of words model and vector space model used in information retrieval for representation of documents. Components of such vectors are called features, factors or ranking signals. They may be divided into three groups (features from document retrieval are shown as examples): Query-independent or static features — those features, which depend only on the document, but not on the query. For example, PageRank or document's length. Such features can be precomputed in off-line mode during indexing. They may be used to compute document's static quality score (or static rank), which is often used to speed up search query evaluation. Query-dependent or dynamic features — those features, which depend both on the contents of the document and the query, such as TF-IDF score or other non-machine-learned ranking functions. Query-level features or query features, which depend only on the query. For example, the number of words in a query. Some examples of features, which were used in the well-known LETOR dataset: TF, TF-IDF, BM25, and language modeling scores of document's zones (title, body, anchors text, URL) for a given query; Lengths and IDF sums of document's zones; Document's PageRank, HITS ranks and their variants. Selecting and designing good features is an important area in machine learning, which is called feature engineering. == Evaluation measures == There are several measures (metrics) which are commonly used to judge how well an algorithm is doing on training data and to compare the performance of different MLR algorithms. Often a learning-to-rank problem is reformulated as an optimization problem with respect to one of these metrics. Examples of ranking quality measures: Mean average precision (MAP); DCG and NDCG; Precision@n, NDCG@n, where "@n" denotes that the metrics are evaluated only on top n documents; Mean reciprocal rank; Kendall's tau; Spearman's rho. DCG and its normalized variant NDCG are usually preferred in academic research when multiple levels of relevance are used. Other metrics such as MAP, MRR and precision, are defined only for binary judgments. Recently, there have been proposed several new evaluation metrics which claim to model user's satisfaction with search results better than the DCG metric: Expected reciprocal rank (ERR); Yandex's pfound. Both of these metrics are based on the assumption that the user is more likely to stop looking at search results after examining a more relevant document, than after a less relevant document. == Approaches == Learning to Rank approaches are often categorized using one of three approaches: pointwise (where individual documents are ranked), pairwise (where pairs of documents are ranked into a relative order), and listwise (where an entire list of documents are ordered). Tie-Yan Liu of Microsoft Research Asia has analyzed existing algorithms for learning to rank problems in his book Learning to Rank for Information Retrieval. He categorized them into three groups by their input spaces, output spaces, hypothesis spaces (the core function of the model) and loss functions: the pointwise, pairwise, and listwise approach. In practice, listwise approaches often outperform pairwise approaches and pointwise approaches. This statement was further supported by a large scale experiment on the performance of different learning-to-rank methods on a large collection of benchmark data sets. In this section, without further notice, x {\displaystyle x} denotes an object to be evaluated, for example, a document or an image, f ( x ) {\displaystyle f(x)} denotes a single-value hypothesis, h ( ⋅ ) {\displaystyle h(\cdot )} denotes a bi-variate or multi-variate function and L ( ⋅ ) {\displaystyle L(\cdot )} denotes the loss function. === Pointwise approach === In this case, it is assumed that each query-document pair in the training data has a numerical or ordinal score. Then the learning-to-rank problem can be approximated by a regression problem — given a single query-document pair, predict its score. Formally speaking, the pointwise approach aims at learning a function f ( x ) {\displaystyle f(x)} predicting the real-value or ordinal score of a document x {\displaystyle x} using the loss function L ( f ; x j , y j ) {\displaystyle L(f;x_{j},y_{j})} . A number of existing supervised machine learning algorithms can be readily used for this purpose. Ordinal regression and classification algorithms can also be used in pointwise approach when they are used to predict the score of a single query-document pair, and it takes a small, finite number of values. === Pairwise approach === In this case, the learning-to-rank problem is approximated by a classification problem — learning a binary classifier h ( x u , x v ) {\displaystyle h(x_{u},x_{v})} that can tell which document is better in a given pair of documents. The classifier shall take two documents as its input and the goal is to minimize a loss function L ( h ; x u , x v , y u , v ) {\displaystyle L(h;x_{u},x_{v},y_{u,v})} . The loss function typically reflects the number and magnitude of inversions in the induced ranking. In many cases, the binary classifier h ( x u , x v ) {\displaystyle h(x_{u},x_{v})} is implemented with a scoring function f ( x ) {\displaystyle f(x)} . As an example, RankNet adapts a probability model and defines h ( x u , x v ) {\displaystyle h(x_{u},x_{v})} as the estimated probability of the document x u {\displaystyle x_{u}} has higher quality than x v {\displaystyle x_{v}} : P u , v ( f ) = CDF ( f ( x u ) − f ( x v ) ) , {\displaystyle P_{u,v}(f)={\text{CDF}
Active learning (machine learning)
Active learning is a special case of machine learning in which a learning algorithm can interactively query a human user (or some other information source) to label new data points with the desired outputs. The human user must possess expertise in the problem domain, including the ability to consult authoritative sources when necessary. In statistics literature, it is sometimes also called optimal experimental design. The information source is also called teacher or oracle. There are situations in which unlabeled data is abundant but manual labeling is expensive. In such a scenario, learning algorithms can actively query the teacher for labels. Since the learner chooses the examples, the number of examples to learn a concept can often be much lower than the number required in normal supervised learning. However, there is a risk that the algorithm is overwhelmed by uninformative examples. Recent developments are dedicated to multi-label active learning, hybrid active learning and active learning in a single-pass (on-line) context, combining concepts from the field of machine learning (e.g. conflict and ignorance) with adaptive, incremental learning policies in the field of online machine learning. Using active learning allows for faster development of a machine learning algorithm, when comparative updates would require a quantum or super computer. Large-scale active learning projects may benefit from crowdsourcing frameworks such as Amazon Mechanical Turk that include many humans in the active learning loop. == Definitions == Let T be the total set of all data under consideration. For example, in a protein engineering problem, T would include all proteins that are known to have a certain interesting activity and all additional proteins that one might want to test for that activity. During each iteration, i, T is broken up into three subsets T K , i {\displaystyle \mathbf {T} _{K,i}} : Data points where the label is known. T U , i {\displaystyle \mathbf {T} _{U,i}} : Data points where the label is unknown. T C , i {\displaystyle \mathbf {T} _{C,i}} : A subset of TU,i that is chosen to be labeled. Most of the current research in active learning involves the best method to choose the data points for TC,i. == Scenarios == Pool-based sampling: In this approach, which is the most well known scenario, the learning algorithm attempts to evaluate the entire dataset before selecting data points (instances) for labeling. It is often initially trained on a fully labeled subset of the data using a machine-learning method such as logistic regression or SVM that yields class-membership probabilities for individual data instances. The candidate instances are those for which the prediction is most ambiguous. Instances are drawn from the entire data pool and assigned a confidence score, a measurement of how well the learner "understands" the data. The system then selects the instances for which it is the least confident and queries the teacher for the labels. The theoretical drawback of pool-based sampling is that it is memory-intensive and is therefore limited in its capacity to handle enormous datasets, but in practice, the rate-limiting factor is that the teacher is typically a (fatiguable) human expert who must be paid for their effort, rather than computer memory. Stream-based selective sampling: Here, each consecutive unlabeled instance is examined one at a time with the machine evaluating the informativeness of each item against its query parameters. The learner decides for itself whether to assign a label or query the teacher for each datapoint. As contrasted with Pool-based sampling, the obvious drawback of stream-based methods is that the learning algorithm does not have sufficient information, early in the process, to make a sound assign-label-vs ask-teacher decision, and it does not capitalize as efficiently on the presence of already labeled data. Therefore, the teacher is likely to spend more effort in supplying labels than with the pool-based approach. Membership query synthesis: This is where the learner generates synthetic data from an underlying natural distribution. For example, if the dataset are pictures of humans and animals, the learner could send a clipped image of a leg to the teacher and query if this appendage belongs to an animal or human. This is particularly useful if the dataset is small. The challenge here, as with all synthetic-data-generation efforts, is in ensuring that the synthetic data is consistent in terms of meeting the constraints on real data. As the number of variables/features in the input data increase, and strong dependencies between variables exist, it becomes increasingly difficult to generate synthetic data with sufficient fidelity. For example, to create a synthetic data set for human laboratory-test values, the sum of the various white blood cell (WBC) components in a white blood cell differential must equal 100, since the component numbers are really percentages. Similarly, the enzymes alanine transaminase (ALT) and aspartate transaminase (AST) measure liver function (though AST is also produced by other tissues, e.g., lung, pancreas) A synthetic data point with AST at the lower limit of normal range (8–33 units/L) with an ALT several times above normal range (4–35 units/L) in a simulated chronically ill patient would be physiologically impossible. == Query strategies == Algorithms for determining which data points should be labeled can be organized into a number of different categories, based upon their purpose: Balance exploration and exploitation: the choice of examples to label is seen as a dilemma between the exploration and the exploitation over the data space representation. This strategy manages this compromise by modelling the active learning problem as a contextual bandit problem. For example, Bouneffouf et al. propose a sequential algorithm named Active Thompson Sampling (ATS), which, in each round, assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for this sample point label. Expected model change: label those points that would most change the current model. Expected error reduction: label those points that would most reduce the model's generalization error. Exponentiated Gradient Exploration for Active Learning: In this paper, the author proposes a sequential algorithm named exponentiated gradient (EG)-active that can improve any active learning algorithm by an optimal random exploration. Uncertainty sampling: label those points for which the current model is least certain as to what the correct output should be. Query by committee: a variety of models are trained on the current labeled data, and vote on the output for unlabeled data; label those points for which the "committee" disagrees the most Querying from diverse subspaces or partitions: When the underlying model is a forest of trees, the leaf nodes might represent (overlapping) partitions of the original feature space. This offers the possibility of selecting instances from non-overlapping or minimally overlapping partitions for labeling. Variance reduction: label those points that would minimize output variance, which is one of the components of error. Conformal prediction: predicts that a new data point will have a label similar to old data points in some specified way and degree of the similarity within the old examples is used to estimate the confidence in the prediction. Mismatch-first farthest-traversal: The primary selection criterion is the prediction mismatch between the current model and nearest-neighbour prediction. It targets on wrongly predicted data points. The second selection criterion is the distance to previously selected data, the farthest first. It aims at optimizing the diversity of selected data. User-centered labeling strategies: Learning is accomplished by applying dimensionality reduction to graphs and figures like scatter plots. Then the user is asked to label the compiled data (categorical, numerical, relevance scores, relation between two instances). A wide variety of algorithms have been studied that fall into these categories. While the traditional AL strategies can achieve remarkable performance, it is often challenging to predict in advance which strategy is the most suitable in a particular situation. In recent years, meta-learning algorithms have been gaining in popularity. Some of them have been proposed to tackle the problem of learning AL strategies instead of relying on manually designed strategies. A benchmark which compares 'meta-learning approaches to active learning' to 'traditional heuristic-based Active Learning' may give intuitions if 'Learning active learning' is at the crossroads == Minimum marginal hyperplane == Some active learning algorithms are built upon support-vector machines (SVMs) and exploit the structure of the SVM to determine which data points to label. Such methods usually calculate the margin, W, of each u
DABUS
DABUS (Device for the Autonomous Bootstrapping of Unified Sentience) is an artificial intelligence (AI) system created by Stephen Thaler. It reportedly conceived of two novel products — a food container constructed using fractal geometry, which enables rapid reheating, and a flashing beacon for attracting attention in an emergency. The filing of patent applications designating DABUS as inventor has led to decisions by patent offices and courts on whether a patent can be granted for an invention reportedly made by an AI system. == History in different jurisdictions == === Australia === On 17 September 2019, Thaler filed an application to patent a "Food container and devices and methods for attracting enhanced attention," naming DABUS as the inventor. On 21 September 2020, IP Australia found that section 15(1) of the Patents Act 1990 (Cth) is inconsistent with an artificial intelligence machine being treated as an inventor, and Thaler's application had lapsed. Thaler sought judicial review, and on 30 July 2021, the Federal Court set aside IP Australia's decision and ordered IP Australia to reconsider the application. On 13 April 2022, the Full Court of the Federal Court set aside that decision, holding that only a natural person can be an inventor for the purposes of the Patents Act 1990 (Cth) and the Patents Regulations 1991 (Cth), and that such an inventor must be identified for any person to be entitled to a grant of a patent. On 11 November 2022, Thaler was refused special leave to appeal to the High Court. === European Patent Office === On 17 October 2018 and 7 November 2018, Thaler filed two European patent applications with the European Patent Office. The first claimed invention was a "Food Container" and the second was "Devices and Methods for Attracting Enhanced Attention." On 27 January 2020, the EPO rejected the applications on the grounds that the application listed an AI system named DABUS, and not a human, as the inventor, based on Article 81 and Rule 19(1) of the European Patent Convention (EPC). On 21 December 2021, the Board of Appeal of the EPO dismissed Thaler's appeal from the EPO's primary decision. The Board of Appeal confirmed that "under the EPC the designated inventor has to be a person with legal capacity. This is not merely an assumption on which the EPC was drafted. It is the ordinary meaning of the term inventor." === United Kingdom === Similar applications were filed by Thaler to the United Kingdom Intellectual Property Office on 17 October and 7 November 2018. The Office asked Thaler to file statements of inventorship and of right of grant to a patent (Patent Form 7) in respect of each invention within 16 months of the filing date. Thaler filed those forms naming DABUS as the inventor and explaining in some detail why he believed that machines should be regarded as inventors in the circumstances. His application was rejected on the grounds that: (1) naming a machine as inventor did not meet the requirements of the Patents Act 1977; and (2) the IPO was not satisfied as to the manner in which Thaler had acquired rights that would otherwise vest in the inventor. Thaler was not satisfied with the decision and asked for a hearing before an official known as the "hearing officer". By a decision dated 4 December 2019 the hearing officer rejected Thaler's appeal. Thaler appealed against the hearing officer's decision to the Patents Court (a specialist court within the Chancery Division of the High Court of England and Wales that determines patent disputes). On 21 September 2020, Mr Justice Marcus Smith upheld the decision of the hearing officer. On 21 September 2021, Thaler's further appeal to the Court of Appeal was dismissed by Arnold LJ and Laing LJ (Birss LJ dissenting). On 20 December 2023, the UK Supreme Court dismissed a further appeal by Thaler. In its judgment, the court held that an "inventor" under the Patents Act 1977 must be a natural person. === United States === The patent applications on the inventions were refused by the USPTO, which held that only natural persons can be named as inventors in a patent application. Thaler first fought this result by filing a complaint under the Administrative Procedure Act alleging that the decision was "arbitrary, capricious, an abuse of discretion and not in accordance with the law; unsupported by substantial evidence, and in excess of Defendants’ statutory authority." A month later on August 19, 2019, Thaler filed a petition with the USPTO as allowed in 37 C.F.R. § 1.181 stating that DABUS should be the inventor. The judge and Thaler agreed in this case that Thaler himself is unable to receive the patent on behalf of DABUS. In their August 5, 2022, Thaler decision, the US Court of Appeals for the Federal Circuit affirmed that only a natural person could be an inventor, which means that the AI that invents any other type of invention is not addressed by the "who" mentioned in the legislation. === New Zealand === On January 31, 2022, the Intellectual Property Office of New Zealand (IPONZ) decided that a patent application (776029) filed by Stephen Thaler was void, on the basis that no inventor was identified on the patent application. IPONZ determined that DABUS could not be "an actual devisor of the invention" as required by the Patents Act 2013, and that this must be a natural person as held by the previous patent offices above. The High Court of New Zealand confirmed the decision in 2023. === South Africa === On 24 June 2021, the South African Companies and Intellectual Property Commission (CIPC) accepted Dr Thaler's Patent Cooperation Treaty, for a patent in respect of inventions generated by DABUS. In July 2021, the CIPC released a notice of issuance for the patent. It is the first patent granted for an AI invention. === Switzerland === On June 26, 2025, the Swiss Federal Administrative Court ruled that artificial intelligence systems such as DABUS cannot be listed as inventors in patent applications. The court upheld the existing practice of the Swiss Federal Institute of Intellectual Property (IPI), which requires that only natural persons can be recognized as inventors under Swiss patent law. The case concerned a patent application, which sought to designate DABUS as the sole inventor of a food container designed with a fractal geometry to enhance heat distribution. The IPI had rejected the application, arguing that both the absence of a human inventor and the attribution of inventorship to an AI system were inadmissible. While the court dismissed Thaler's main request, it accepted a subsidiary request: if a human applicant recognizes and files a patent based on an AI-generated invention, that person may be considered the inventor. As a result, the application may proceed with Thaler listed as the inventor. The decision (B-2532/2024) can still be appealed to the Swiss Federal Supreme Court.
Three-factor learning
In neuroscience and machine learning, three-factor learning is the combination of Hebbian plasticity with a third modulatory factor to stabilise and enhance synaptic learning. This third factor can represent various signals such as reward, punishment, error, surprise, or novelty, often implemented through neuromodulators. == Description == Three-factor learning introduces the concept of eligibility traces, which flag synapses for potential modification pending the arrival of the third factor, and helps temporal credit assignement by bridging the gap between rapid neuronal firing and slower behavioral timescales, from which learning can be done. Biological basis for Three-factor learning rules have been supported by experimental evidence. This approach addresses the instability of classical Hebbian learning by minimizing autocorrelation and maximizing cross-correlation between inputs.