Best AI Image Generators

Best AI Image Generators — hands-on reviews, top picks, pricing, pros and cons and a practical how-to guide on Aizhi.

  • Multi-focus image fusion

    Multi-focus image fusion

    Multi-focus image fusion is a multiple image compression technique using input images with different focus depths to make one output image that preserves all information. == Overview == The main idea of image fusion is gathering important and the essential information from the input images into one single image which ideally has all of the information of the input images. The research history of image fusion spans over 30 years and many scientific papers. Image fusion generally has two aspects: image fusion methods and objective evaluation metrics. In visual sensor networks (VSN), sensors are cameras which record images and video sequences. In many applications of VSN, a camera can't give a perfect illustration including all details of the scene. This is because of the limited depth of focus of the optical lens of cameras. Therefore, just the object located in the focal length of camera is focused and clear, and other parts of the image are blurred. VSN captures images with different depths of focus using several cameras. Due to the large amount of data generated by cameras compared to other sensors such as pressure and temperature sensors and some limitations of bandwidth, energy consumption and processing time, it is essential to process the local input images to decrease the amount of transmitted data. == Multi-Focus image fusion in the spatial domain == Huang and Jing have reviewed and applied several focus measurements in the spatial domain for the multi-focus image fusion process, suitable for real-time applications. They mentioned some focus measurements including variance, energy of image gradient (EOG), Tenenbaum's algorithm (Tenengrad), energy of Laplacian (EOL), sum-modified-Laplacian (SML), and spatial frequency (SF). Their experiments showed that EOL gave better results than other methods like variance and spatial frequency. == Multi-Focus image fusion in multi-scale transform and DCT domain == Image fusion based on the multi-scale transform is the most commonly used and promising technique. Laplacian pyramid transform, gradient pyramid-based transform, morphological pyramid transform and the premier ones, discrete wavelet transform, shift-invariant wavelet transform (SIDWT), and discrete cosine harmonic wavelet transform (DCHWT) are some examples of image fusion methods based on multi-scale transform. These methods are complex and have some limitations e.g. processing time and energy consumption. For example, multi-focus image fusion methods based on DWT require a lot of convolution operations, so they take more time and energy to process. Therefore, most methods in multi-scale transform are not suitable for real-time applications. Moreover, these methods are not very successful along edges, due to the wavelet transform process missing the edges of the image. They create ringing artefacts in the output image and reduce its quality. Due to the aforementioned problems in the multi-scale transform methods, researchers are interested in multi-focus image fusion in the DCT domain. DCT-based methods are more efficient in terms of transmission and archiving images coded in Joint Photographic Experts Group (JPEG) standard to the upper node in the VSN agent. A JPEG system consists of a pair of an encoder and a decoder. In the encoder, images are divided into non-overlapping 8×8 blocks, and the DCT coefficients are calculated for each. Since the quantization of DCT coefficients is a lossy process, many of the small-valued DCT coefficients are quantized to zero, which corresponds to high frequencies. DCT-based image fusion algorithms work better when the multi-focus image fusion methods are applied in the compressed domain. In addition, in the spatial-based methods, the input images must be decoded and then transferred to the spatial domain. After implementation of the image fusion operations, the output fused images must again be encoded. DCT domain-based methods do not require complex and time-consuming consecutive decoding and encoding operations. Therefore, the image fusion methods based on DCT domain operate with much less energy and processing time. Recently, a lot of research has been carried out in the DCT domain. DCT+Variance, DCT+Corr_Eng, DCT+EOL, and DCT+VOL are some prominent examples of DCT based methods.

    Read more →
  • Three-factor learning

    Three-factor learning

    In neuroscience and machine learning, three-factor learning is the combination of Hebbian plasticity with a third modulatory factor to stabilise and enhance synaptic learning. This third factor can represent various signals such as reward, punishment, error, surprise, or novelty, often implemented through neuromodulators. == Description == Three-factor learning introduces the concept of eligibility traces, which flag synapses for potential modification pending the arrival of the third factor, and helps temporal credit assignement by bridging the gap between rapid neuronal firing and slower behavioral timescales, from which learning can be done. Biological basis for Three-factor learning rules have been supported by experimental evidence. This approach addresses the instability of classical Hebbian learning by minimizing autocorrelation and maximizing cross-correlation between inputs.

    Read more →
  • Cognitive philology

    Cognitive philology

    Cognitive philology is the science that studies written and oral texts as the product of human mental processes. Studies in cognitive philology compare documentary evidence emerging from textual investigations with results of experimental research, especially in the fields of cognitive and ecological psychology, neurosciences and artificial intelligence. "The point is not the text, but the mind that made it". Cognitive Philology aims to foster communication between literary, textual, philological disciplines on the one hand and researches across the whole range of the cognitive, evolutionary, ecological and human sciences on the other. Cognitive philology: investigates transmission of oral and written text, and categorization processes which lead to classification of knowledge, mostly relying on the information theory; studies how narratives emerge in so called natural conversation and selective process which lead to the rise of literary standards for storytelling, mostly relying on embodied semantics; explores the evolutive and evolutionary role played by rhythm and metre in human ontogenetic and phylogenetic development and the pertinence of the semantic association during processing of cognitive maps; Provides the scientific ground for multimedia critical editions of literary texts. Among the founding thinkers and noteworthy scholars devoted to such investigations are: Alan Richardson: Studies Theory of Mind in early-modern and contemporary literature. Anatole Pierre Fuksas Benoît de Cornulier David Herman: Professor of English at North Carolina State University and an adjunct professor of linguistics at Duke University. He is the author of "Universal Grammar and Narrative Form" and the editor of "Narratologies: New Perspectives on Narrative Analysis". Domenico Fiormonte François Recanati Gilles Fauconnier, a professor in Cognitive science at the University of California, San Diego. He was one of the founders of cognitive linguistics in the 1970s through his work on pragmatic scales and mental spaces. His research explores the areas of conceptual integration and compressions of conceptual mappings in terms of the emergent structure in language. Julián Santano Moreno Luca Nobile Manfred Jahn in Germany Mark Turner Paolo Canettieri

    Read more →
  • DABUS

    DABUS

    DABUS (Device for the Autonomous Bootstrapping of Unified Sentience) is an artificial intelligence (AI) system created by Stephen Thaler. It reportedly conceived of two novel products — a food container constructed using fractal geometry, which enables rapid reheating, and a flashing beacon for attracting attention in an emergency. The filing of patent applications designating DABUS as inventor has led to decisions by patent offices and courts on whether a patent can be granted for an invention reportedly made by an AI system. == History in different jurisdictions == === Australia === On 17 September 2019, Thaler filed an application to patent a "Food container and devices and methods for attracting enhanced attention," naming DABUS as the inventor. On 21 September 2020, IP Australia found that section 15(1) of the Patents Act 1990 (Cth) is inconsistent with an artificial intelligence machine being treated as an inventor, and Thaler's application had lapsed. Thaler sought judicial review, and on 30 July 2021, the Federal Court set aside IP Australia's decision and ordered IP Australia to reconsider the application. On 13 April 2022, the Full Court of the Federal Court set aside that decision, holding that only a natural person can be an inventor for the purposes of the Patents Act 1990 (Cth) and the Patents Regulations 1991 (Cth), and that such an inventor must be identified for any person to be entitled to a grant of a patent. On 11 November 2022, Thaler was refused special leave to appeal to the High Court. === European Patent Office === On 17 October 2018 and 7 November 2018, Thaler filed two European patent applications with the European Patent Office. The first claimed invention was a "Food Container" and the second was "Devices and Methods for Attracting Enhanced Attention." On 27 January 2020, the EPO rejected the applications on the grounds that the application listed an AI system named DABUS, and not a human, as the inventor, based on Article 81 and Rule 19(1) of the European Patent Convention (EPC). On 21 December 2021, the Board of Appeal of the EPO dismissed Thaler's appeal from the EPO's primary decision. The Board of Appeal confirmed that "under the EPC the designated inventor has to be a person with legal capacity. This is not merely an assumption on which the EPC was drafted. It is the ordinary meaning of the term inventor." === United Kingdom === Similar applications were filed by Thaler to the United Kingdom Intellectual Property Office on 17 October and 7 November 2018. The Office asked Thaler to file statements of inventorship and of right of grant to a patent (Patent Form 7) in respect of each invention within 16 months of the filing date. Thaler filed those forms naming DABUS as the inventor and explaining in some detail why he believed that machines should be regarded as inventors in the circumstances. His application was rejected on the grounds that: (1) naming a machine as inventor did not meet the requirements of the Patents Act 1977; and (2) the IPO was not satisfied as to the manner in which Thaler had acquired rights that would otherwise vest in the inventor. Thaler was not satisfied with the decision and asked for a hearing before an official known as the "hearing officer". By a decision dated 4 December 2019 the hearing officer rejected Thaler's appeal. Thaler appealed against the hearing officer's decision to the Patents Court (a specialist court within the Chancery Division of the High Court of England and Wales that determines patent disputes). On 21 September 2020, Mr Justice Marcus Smith upheld the decision of the hearing officer. On 21 September 2021, Thaler's further appeal to the Court of Appeal was dismissed by Arnold LJ and Laing LJ (Birss LJ dissenting). On 20 December 2023, the UK Supreme Court dismissed a further appeal by Thaler. In its judgment, the court held that an "inventor" under the Patents Act 1977 must be a natural person. === United States === The patent applications on the inventions were refused by the USPTO, which held that only natural persons can be named as inventors in a patent application. Thaler first fought this result by filing a complaint under the Administrative Procedure Act alleging that the decision was "arbitrary, capricious, an abuse of discretion and not in accordance with the law; unsupported by substantial evidence, and in excess of Defendants’ statutory authority." A month later on August 19, 2019, Thaler filed a petition with the USPTO as allowed in 37 C.F.R. § 1.181 stating that DABUS should be the inventor. The judge and Thaler agreed in this case that Thaler himself is unable to receive the patent on behalf of DABUS. In their August 5, 2022, Thaler decision, the US Court of Appeals for the Federal Circuit affirmed that only a natural person could be an inventor, which means that the AI that invents any other type of invention is not addressed by the "who" mentioned in the legislation. === New Zealand === On January 31, 2022, the Intellectual Property Office of New Zealand (IPONZ) decided that a patent application (776029) filed by Stephen Thaler was void, on the basis that no inventor was identified on the patent application. IPONZ determined that DABUS could not be "an actual devisor of the invention" as required by the Patents Act 2013, and that this must be a natural person as held by the previous patent offices above. The High Court of New Zealand confirmed the decision in 2023. === South Africa === On 24 June 2021, the South African Companies and Intellectual Property Commission (CIPC) accepted Dr Thaler's Patent Cooperation Treaty, for a patent in respect of inventions generated by DABUS. In July 2021, the CIPC released a notice of issuance for the patent. It is the first patent granted for an AI invention. === Switzerland === On June 26, 2025, the Swiss Federal Administrative Court ruled that artificial intelligence systems such as DABUS cannot be listed as inventors in patent applications. The court upheld the existing practice of the Swiss Federal Institute of Intellectual Property (IPI), which requires that only natural persons can be recognized as inventors under Swiss patent law. The case concerned a patent application, which sought to designate DABUS as the sole inventor of a food container designed with a fractal geometry to enhance heat distribution. The IPI had rejected the application, arguing that both the absence of a human inventor and the attribution of inventorship to an AI system were inadmissible. While the court dismissed Thaler's main request, it accepted a subsidiary request: if a human applicant recognizes and files a patent based on an AI-generated invention, that person may be considered the inventor. As a result, the application may proceed with Thaler listed as the inventor. The decision (B-2532/2024) can still be appealed to the Swiss Federal Supreme Court.

    Read more →
  • Apache OpenNLP

    Apache OpenNLP

    The Apache OpenNLP library is a machine learning based toolkit for the processing of natural language text. It supports the most common NLP tasks, such as language detection, tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing and coreference resolution. These tasks are usually required to build more advanced text processing services.

    Read more →
  • Connectionist expert system

    Connectionist expert system

    Connectionist expert systems are artificial neural network (ANN) based expert systems where the ANN generates inferencing rules e.g., fuzzy-multi layer perceptron where linguistic and natural form of inputs are used. Apart from that, rough set theory may be used for encoding knowledge in the weights better and also genetic algorithms may be used to optimize the search solutions better. Symbolic reasoning methods may also be incorporated (see hybrid intelligent system). (Also see expert system, neural network, clinical decision support system.)

    Read more →
  • Leakage (machine learning)

    Leakage (machine learning)

    In statistics and machine learning, leakage (also known as data leakage or target leakage) refers to the use of information during model training that would not be available at prediction time. This results in overly optimistic performance estimates, as the model appears to perform better during evaluation than it actually would in a production environment. Leakage is often subtle and indirect, making it difficult to detect and eliminate. It can lead a statistician or modeler to select a suboptimal model, which may be outperformed by a leakage-free alternative. == Leakage modes == Leakage can occur at multiple stages of the machine learning workflow. Broadly, its sources can be divided into two categories: those arising from features and those arising from training examples. === Feature leakage === Feature or column-wise leakage is caused by the inclusion of columns which are one of the following: a duplicate label, a proxy for the label, or the label itself. These features, known as anachronisms, will not be available when the model is used for predictions, and result in leakage if included when the model is trained. For example, including a "MonthlySalary" column when predicting "YearlySalary"; or "MinutesLate" when predicting "IsLate". === Training example leakage === Row-wise leakage is caused by improper sharing of information between rows of data. Types of row-wise leakage include: Premature featurization; leaking from premature featurization before Cross-validation/Train/Test split (must fit MinMax/ngrams/etc on only the train split, then transform the test set) Duplicate rows between train/validation/test (for example, oversampling a dataset to pad its size before splitting; or, different rotations/augmentations of a single image; bootstrap sampling before splitting; or duplicating rows to up sample the minority class) Non-independent and identically distributed random (non-IID) data Time leakage (for example, splitting a time-series dataset randomly instead of newer data in test set using a train/test split or rolling-origin cross-validation) Group leakage—not including a grouping split column (for example, Andrew Ng's group had 100k x-rays of 30k patients, meaning ~3 images per patient. The paper used random splitting instead of ensuring that all images of a patient were in the same split. Hence the model partially memorized the patients instead of learning to recognize pneumonia in chest x-rays.) A 2023 review found data leakage to be "a widespread failure mode in machine-learning (ML)-based science", having affected at least 294 academic publications across 17 disciplines, and causing a potential reproducibility crisis. == Detection == Data leakage in machine learning can be detected through various methods, focusing on performance analysis, feature examination, data auditing, and model behavior analysis. Performance-wise, unusually high accuracy or significant discrepancies between training and test results often indicate leakage. Inconsistent cross-validation outcomes may also signal issues. Feature examination involves scrutinizing feature importance rankings and ensuring temporal integrity in time series data. A thorough audit of the data pipeline is crucial, reviewing pre-processing steps, feature engineering, and data splitting processes. Detecting duplicate entries across dataset splits is also important. For language models, the Min-K% method can detect the presence of data in a pretraining dataset. It presents a sentence suspected to be present in the pretraining dataset, and computes the log-likelihood of each token, then compute the average of the lowest K of these. If this exceeds a threshold, then the sentence is likely present. This method is improved by comparing against a baseline of the mean and variance. Analyzing model behavior can reveal leakage. Models relying heavily on counter-intuitive features or showing unexpected prediction patterns warrant investigation. Performance degradation over time when tested on new data may suggest earlier inflated metrics due to leakage. Advanced techniques include backward feature elimination, where suspicious features are temporarily removed to observe performance changes. Using a separate hold-out dataset for final validation before deployment is advisable.

    Read more →
  • Wetware (brain)

    Wetware (brain)

    Wetware is a term drawn from the computer-related idea of hardware or software, but applied to biological life forms. == Usage == The prefix "wet" is a reference to the water found in living creatures. Wetware is used to describe the elements equivalent to hardware and software found in a person, especially the central nervous system (CNS) and the human mind. The term wetware finds use in works of fiction, in scholarly publications and in popularizations. The "hardware" component of wetware concerns the bioelectric and biochemical properties of the CNS, specifically the brain. If the sequence of impulses traveling across the various neurons are thought of symbolically as software, then the physical neurons would be the hardware. The amalgamated interaction of this software and hardware is manifested through continuously changing physical connections, and chemical and electrical influences that spread across the body. The process by which the mind and brain interact to produce the collection of experiences that we define as self-awareness is in question. == History == Although the exact definition has shifted over time, the term Wetware and its fundamental reference to "the physical mind" has been around at least since the mid-1950s. Mostly used in relatively obscure articles and papers, it was not until the heyday of cyberpunk, however, that the term found broad adoption. Among the first uses of the term in popular culture was the Bruce Sterling novel Schismatrix (1985) and the Michael Swanwick novel Vacuum Flowers (1987). Rudy Rucker references the term in a number of books, including one entitled Wetware (1988): ... all sparks and tastes and tangles, all its stimulus/response patterns – the whole bio-cybernetic software of mind. Rucker did not use the word to simply mean a brain, nor in the human-resources sense of employees. He used wetware to stand for the data found in any biological system, analogous perhaps to the firmware that is found in a ROM chip. In Rucker's sense, a seed, a plant graft, an embryo, or a biological virus are all wetware. DNA, the immune system, and the evolved neural architecture of the brain are further examples of wetware in this sense. Rucker describes his conception in a 1992 compendium The Mondo 2000 User's Guide to the New Edge, which he quotes in a 2007 blog entry. Early cyber-guru Arthur Kroker used the term in his blog. With the term getting traction in trendsetting publications, it became a buzzword in the early 1990s. In 1991, Dutch media theorist Geert Lovink organized the Wetware Convention in Amsterdam, which was supposed to be an antidote to the "out-of-body" experiments conducted in high-tech laboratories, such as experiments in virtual reality. Timothy Leary, in an appendix to Info-Psychology originally written in 1975–76 and published in 1989, used the term wetware, writing that "psychedelic neuro-transmitters were the hot new technology for booting-up the 'wetware' of the brain". Another common reference is: "Wetware has 7 plus or minus 2 temporary registers." The numerical allusion is to a classic 1957 article by George A. Miller, The magical number 7 plus or minus two: some limits in our capacity for processing information, which later gave way to Miller's law.

    Read more →
  • Catie Cuan

    Catie Cuan

    Catie Cuan is an artist, entrepeuneur, and innovator in the field of robotic art and human-robot interaction, where she specializes in choreorobotics, an emerging field at the intersection of choreographic dance and robotics. Catie Cuan is currently one of the academic researchers pioneering the field of choreorobotics and currently holds a post-doctoral fellowship at Stanford University. == Career == Catie Cuan earned a bachelor's degree from the University of California, Berkeley. She graduated with a Ph.D. from the Department of Mechanical Engineering at Stanford University, focusing in robotics. Her most cited publication is about how to improve robotic expressive systems using tools from dance theory, such as the Laban/Bartenieff Movement Analysis. In her most recent research projects, she explores a predictive model of imitation learning for robots moving around humans, a project that advances the field of social robotics. Cuan credits her work in robotics to the experience with her father when he had a stroke and was surrounded by many medical machines, which made her think about how people might feel empowered and hopeful rather than afraid. As a ballet dancer and choreographer, she has performed with the Metropolitan Opera Ballet and the Lyric Opera of Chicago. In 2020, she was the dancer and choreographer of the show Output, which was part of a collaboration with ThoughtWorks Arts and the Pratt Institute. In the production, she danced with an ABB IRB 6700 industrial robot. In 2022, she was named as an IF/THEN ambassador for the American Association for the Advancement of Science. The same year, she was appointed Futurist-in-Residence at the Smithsonian Arts and Industries Building, where she performed at the closing ceremonies of the FUTURES exhibit on July 6, 2022. Cuan has also contributed to product designs, working with IDEO and Dutch interior design firm moooi on their Piro project, which launched a dancing scent diffuser robot during Milan Design Week in June 2022. She is a TED speaker with talks about how to teach robots to dance, and what is coming up for dancing robots in the AI era.

    Read more →
  • Quantum artificial life

    Quantum artificial life

    Quantum artificial life is the application of quantum algorithms with the ability to simulate biological behavior. Quantum computers offer many potential improvements to processes performed on classical computers, including machine learning and artificial intelligence. Artificial intelligence applications are often inspired by the idea of mimicking human brains through closely related biomimicry. This has been implemented to a certain extent on classical computers (using neural networks), but quantum computers offer many advantages in the simulation of artificial life. Artificial life and artificial intelligence are extremely similar, with minor differences; the goal of studying artificial life is to understand living beings better, while the goal of artificial intelligence is to create intelligent beings. In 2016, Alvarez-Rodriguez et al. developed a proposal for a quantum artificial life algorithm with the ability to simulate life and Darwinian evolution. In 2018, the same research team led by Alvarez-Rodriguez performed the proposed algorithm on the IBM ibmqx4 quantum computer, and received optimistic results. The results accurately simulated a system with the ability to undergo self-replication at the quantum scale. == Artificial life on quantum computers == The growing advancement of quantum computers has led researchers to develop quantum algorithms for simulating life processes. Researchers have designed a quantum algorithm that can accurately simulate Darwinian Evolution. Since the complete simulation of artificial life on quantum computers has only been actualized by one group, this section shall focus on the implementation by Alvarez-Rodriguez, Sanz, Lomata, and Solano on an IBM quantum computer. Individuals were realized as two qubits, one representing the genotype of the individual and the other representing the phenotype. The genotype is copied to transmit genetic information through generations, and the phenotype is dependent on the genetic information as well as the individual's interactions with their environment. In order to set up the system, the state of the genotype is instantiated by some rotation of an ancillary state ( | 0 ⟩ ⟨ 0 | {\displaystyle |0\rangle \langle 0|} ). The environment is a two-dimensional spatial grid occupied by individuals and ancillary states. The environment is divided into cells that are able to possess one or more individuals. Individuals move throughout the grid and occupy cells randomly; when two or more individuals occupy the same cell they interact with each other. === Self replication === The ability to self-replicate is critical for simulating life. Self-replication occurs when the genotype of an individual interacts with an ancillary state, creating a genotype for a new individual; this genotype interacts with a different ancillary state in order to create the phenotype. During this interaction, one would like to copy some information about the initial state into the ancillary state, but by the no cloning theorem, it is impossible to copy an arbitrary unknown quantum state. However, physicists have derived different methods for quantum cloning which does not require the exact copying of an unknown state. The method that has been implemented by Alvarez-Rodriguez et al. is one that involves the cloning of the expectation value of some observable. For a unitary U {\displaystyle U} which copies the expectation value of some set of observables X {\displaystyle {\mathsf {X}}} of state ρ {\displaystyle \rho } into a blank state ρ e {\displaystyle \rho _{e}} , the cloning machine is defined by any ( U , ρ e , X ) {\displaystyle (U,\rho _{e},{\mathsf {X}})} that fulfill the following: ∀ ρ ∀ X ∈ X {\displaystyle \forall \rho \forall X\in {\mathsf {X}}} X ¯ = X 1 ¯ = X 2 ¯ {\displaystyle {\bar {X}}={\bar {X_{1}}}={\bar {X_{2}}}} Where X ¯ {\displaystyle {\bar {X}}} is the mean value of the observable in ρ {\displaystyle \rho } before cloning, X 1 ¯ {\displaystyle {\bar {X_{1}}}} is the mean value of the observable in ρ {\displaystyle \rho } after cloning, and X 2 ¯ {\displaystyle {\bar {X_{2}}}} is the mean value of the observable in ρ e {\displaystyle \rho _{e}} after cloning. Note that the cloning machine has no dependence on ρ {\displaystyle \rho } because we want to be able to clone the expectation of the observables for any initial state. It is important to note that cloning the mean value of the observable transmits more information than is allowed classically. The calculation of the mean value is defined naturally as: X ¯ = T r [ ρ X ] {\displaystyle {\bar {X}}=Tr[\rho X]} , X 1 ¯ = T r [ R X ⊗ I ] {\displaystyle {\bar {X_{1}}}=Tr[RX\otimes I]} , X 2 ¯ = T r [ R I ⊗ X ] {\displaystyle {\bar {X_{2}}}=Tr[RI\otimes X]} where R = U ρ ⊗ ρ e U † {\displaystyle R=U\rho \otimes \rho _{e}U^{\dagger }} The simplest cloning machine clones the expectation value of σ z {\displaystyle \sigma _{z}} in arbitrary state ρ = | ψ ⟩ ⟨ ψ | {\displaystyle \rho =|\psi \rangle \langle \psi |} to ρ e = | 0 ⟩ ⟨ 0 | {\displaystyle \rho _{e}=|0\rangle \langle 0|} using U = C N O T {\displaystyle U=CNOT} . This is the cloning machine implemented for self-replication by Alvarez-Rodriguez et al. The self-replication process clearly only requires interactions between two qubits, and therefore this cloning machine is the only one necessary for self replication. === Interactions === Interactions occur between individuals when the two take up the same space on the environmental grid. The presence of interactions between individuals provides an advantage for shorter-lifespan individuals. When two individuals interact, exchanges of information between the two phenotypes may or may not occur based on their existing values. When both individual's control qubits (genotypes) are alike, no information will be exchanged. When the control qubits differ, the target qubits (phenotype) will be exchanged between the two individuals. This procedure produces a constantly changing predator-prey dynamic in the simulation. Therefore, long-living qubits, with a larger genetic makeup in the simulation, are at a disadvantage. Since information is only exchanged when interacting with an individual of different genetic makeup, the short-lived population has the advantage. === Mutation === Mutations exist in the artificial world with limited probability, equivalent to their occurrence in the real world. There are two ways in which the individual can mutate: through random single qubit rotations and by errors in the self-replication process. There are two different operators that act on the individual and cause mutations. The M operation causes a spontaneous mutation within the individual by rotating a single qubit by parameter θ. The parameter θ is random for each mutation, which creates biodiversity within the artificial environment. The M operation is a unitary matrix which can be described as: M = ( cos ⁡ ( θ ) s i n ( θ ) s i n ( θ ) − c o s ( θ ) ) {\displaystyle M={\begin{pmatrix}\cos(\theta )&sin(\theta )\\sin(\theta )&-cos(\theta )\end{pmatrix}}} The other possible way for mutations to occur is due to errors in the replication process. Due to the no-cloning theorem, it is impossible to produce perfect copies of systems that are originally in unknown quantum states. However, quantum cloning machines make it possible to create imperfect copies of quantum states, in other words, the process introduces some degree of error. The error that exists in current quantum cloning machines is the root cause for the second kind of mutations in the artificial life experiment. The imperfect cloning operation can be seen as: U M ( θ ) = I 4 + 1 2 ( 0 0 0 1 ) ⊗ ( − 1 1 1 − 1 ) ( c o s θ + i s i n θ + 1 ) {\displaystyle U_{M}(\theta )=\mathrm {I} _{4}+{\frac {1}{2}}{\begin{pmatrix}0&0\\0&1\end{pmatrix}}\otimes {\begin{pmatrix}-1&1\\1&-1\end{pmatrix}}(cos\theta +isin\theta +1)} The two kinds of mutations affect the individual differently. While the spontaneous M operation does not affect the phenotype of the individual, the self-replicating error mutation, UM, alters both the genotype of the individual, and its associated lifetime. The presence of mutations in the quantum artificial life experiment is critical for providing randomness and biodiversity. The inclusion of mutations helps to increase the accuracy of the quantum algorithm. === Death === At the instant the individual is created (when the genotype is copied into the phenotype), the phenotype interacts with the environment. As time evolves, the interaction of the individual with the environment simulates aging which eventually leads to the death of the individual. The death of an individual occurs when the expectation value of σ z {\displaystyle \sigma _{z}} is within some ϵ {\displaystyle \epsilon } of 1 in the phenotype, or, equivalently, when ρ p = | 0 ⟩ ⟨ 0 | {\displaystyle \rho _{p}=|0\rangle \langle 0|} The Lindbladian describes the interaction of the individual with the environment: ρ

    Read more →
  • Discovery system (artificial intelligence)

    Discovery system (artificial intelligence)

    A discovery system is an artificial intelligence system that attempts to discover new scientific concepts or laws. The aim of discovery systems is to automate scientific data analysis and the scientific discovery process. Ideally, an artificial intelligence system should be able to search systematically through the space of all possible hypotheses and yield the hypothesis - or set of equally likely hypotheses - that best describes the complex patterns in data. During the era known as the second AI summer (approximately 1978–1987), various systems akin to the era's dominant expert systems were developed to tackle the problem of extracting scientific hypotheses from data, with or without interacting with a human scientist. These systems included Autoclass, Automated Mathematician, Eurisko, which aimed at general-purpose hypothesis discovery, and more specific systems such as Dalton, which uncovers molecular properties from data. The dream of building systems that discover scientific hypotheses was pushed to the background with the second AI winter and the subsequent resurgence of subsymbolic methods such as neural networks. Subsymbolic methods emphasize prediction over explanation, and yield models which works well but are difficult or impossible to explain which has earned them the name black box AI. A black-box model cannot be considered a scientific hypothesis, and this development has even led some researchers to suggest that the traditional aim of science - to uncover hypotheses and theories about the structure of reality - is obsolete. Other researchers disagree and argue that subsymbolic methods are useful in many cases, just not for generating scientific theories. == Discovery systems from the 1970s and 1980s == Autoclass was a Bayesian Classification System written in 1986 Automated Mathematician was one of the earliest successful discovery systems. It was written in 1977 and worked by generating a modifying small Lisp programs Eurisko was a Sequel to Automated Mathematician written in 1984 Dalton is a still maintained program capable of calculating various molecular properties initially launched in 1983 and available in open source since 2017 Glauber is a scientific discovery method written in the context of computational philosophy of science launched in 1983 == Modern discovery systems (2009–present) == After a couple of decades with little interest in discovery systems, the interest in using AI to uncover natural laws and scientific explanations was renewed by the work of Michael Schmidt, then a PhD student in Computational Biology at Cornell University. Schmidt and his advisor, Hod Lipson, invented Eureqa, which they described as a symbolic regression approach to "distilling free-form natural laws from experimental data". This work effectively demonstrated that symbolic regression was a promising way forward for AI-driven scientific discovery. Since 2009, symbolic regression has matured further, and today, various commercial and open source systems are actively used in scientific research. Notable examples include Eureqa, now a part of DataRobot AI Cloud Platform, AI Feynman, and QLattice.

    Read more →
  • Manifold regularization

    Manifold regularization

    In machine learning, manifold regularization is a technique for using the shape of a dataset to constrain the functions that should be learned on that dataset. In many machine learning problems, the data to be learned do not cover the entire input space. For example, a facial recognition system may not need to classify any possible image, but only the subset of images that contain faces. The technique of manifold learning assumes that the relevant subset of data comes from a manifold, a mathematical structure with useful properties. The technique also assumes that the function to be learned is smooth: data with different labels are not likely to be close together, and so the labeling function should not change quickly in areas where there are likely to be many data points. Because of this assumption, a manifold regularization algorithm can use unlabeled data to inform where the learned function is allowed to change quickly and where it is not, using an extension of the technique of Tikhonov regularization. Manifold regularization algorithms can extend supervised learning algorithms in semi-supervised learning and transductive learning settings, where unlabeled data are available. The technique has been used for applications including medical imaging, geographical imaging, and object recognition. == Manifold regularizer == === Motivation === Manifold regularization is a type of regularization, a family of techniques that reduces overfitting and ensures that a problem is well-posed by penalizing complex solutions. In particular, manifold regularization extends the technique of Tikhonov regularization as applied to Reproducing kernel Hilbert spaces (RKHSs). Under standard Tikhonov regularization on RKHSs, a learning algorithm attempts to learn a function f {\displaystyle f} from among a hypothesis space of functions H {\displaystyle {\mathcal {H}}} . The hypothesis space is an RKHS, meaning that it is associated with a kernel K {\displaystyle K} , and so every candidate function f {\displaystyle f} has a norm ‖ f ‖ K {\displaystyle \left\|f\right\|_{K}} , which represents the complexity of the candidate function in the hypothesis space. When the algorithm considers a candidate function, it takes its norm into account in order to penalize complex functions. Formally, given a set of labeled training data ( x 1 , y 1 ) , … , ( x ℓ , y ℓ ) {\displaystyle (x_{1},y_{1}),\ldots ,(x_{\ell },y_{\ell })} with x i ∈ X , y i ∈ Y {\displaystyle x_{i}\in X,y_{i}\in Y} and a loss function V {\displaystyle V} , a learning algorithm using Tikhonov regularization will attempt to solve the expression arg min f ∈ H 1 ℓ ∑ i = 1 ℓ V ( f ( x i ) , y i ) + γ ‖ f ‖ K 2 {\displaystyle {\underset {f\in {\mathcal {H}}}{\arg \!\min }}{\frac {1}{\ell }}\sum _{i=1}^{\ell }V(f(x_{i}),y_{i})+\gamma \left\|f\right\|_{K}^{2}} where γ {\displaystyle \gamma } is a hyperparameter that controls how much the algorithm will prefer simpler functions over functions that fit the data better. Manifold regularization adds a second regularization term, the intrinsic regularizer, to the ambient regularizer used in standard Tikhonov regularization. Under the manifold assumption in machine learning, the data in question do not come from the entire input space X {\displaystyle X} , but instead from a nonlinear manifold M ⊂ X {\displaystyle M\subset X} . The geometry of this manifold, the intrinsic space, is used to determine the regularization norm. === Laplacian norm === There are many possible choices for the intrinsic regularizer ‖ f ‖ I {\displaystyle \left\|f\right\|_{I}} . Many natural choices involve the gradient on the manifold ∇ M {\displaystyle \nabla _{M}} , which can provide a measure of how smooth a target function is. A smooth function should change slowly where the input data are dense; that is, the gradient ∇ M f ( x ) {\displaystyle \nabla _{M}f(x)} should be small where the marginal probability density P X ( x ) {\displaystyle {\mathcal {P}}_{X}(x)} , the probability density of a randomly drawn data point appearing at x {\displaystyle x} , is large. This gives one appropriate choice for the intrinsic regularizer: ‖ f ‖ I 2 = ∫ x ∈ M ‖ ∇ M f ( x ) ‖ 2 d P X ( x ) {\displaystyle \left\|f\right\|_{I}^{2}=\int _{x\in M}\left\|\nabla _{M}f(x)\right\|^{2}\,d{\mathcal {P}}_{X}(x)} In practice, this norm cannot be computed directly because the marginal distribution P X {\displaystyle {\mathcal {P}}_{X}} is unknown, but it can be estimated from the provided data. === Graph-based approach of the Laplacian norm === When the distances between input points are interpreted as a graph, then the Laplacian matrix of the graph can help to estimate the marginal distribution. Suppose that the input data include ℓ {\displaystyle \ell } labeled examples (pairs of an input x {\displaystyle x} and a label y {\displaystyle y} ) and u {\displaystyle u} unlabeled examples (inputs without associated labels). Define W {\displaystyle W} to be a matrix of edge weights for a graph, where W i j {\displaystyle W_{ij}} is a similarity built from distance measure between the data points x i {\displaystyle x_{i}} and x j {\displaystyle x_{j}} (so that more close implies higher W i j {\displaystyle W_{ij}} ). Define D {\displaystyle D} to be a diagonal matrix with D i i = ∑ j = 1 ℓ + u W i j {\displaystyle D_{ii}=\sum _{j=1}^{\ell +u}W_{ij}} and L {\displaystyle L} to be the Laplacian matrix D − W {\displaystyle D-W} . Then, as the number of data points ℓ + u {\displaystyle \ell +u} increases, L {\displaystyle L} converges to the Laplace–Beltrami operator Δ M {\displaystyle \Delta _{M}} , which is the divergence of the gradient ∇ M {\displaystyle \nabla _{M}} . Then, if f {\displaystyle \mathbf {f} } is a vector of the values of f {\displaystyle f} at the data, f = [ f ( x 1 ) , … , f ( x l + u ) ] T {\displaystyle \mathbf {f} =[f(x_{1}),\ldots ,f(x_{l+u})]^{\mathrm {T} }} , the intrinsic norm can be estimated: ‖ f ‖ I 2 = 1 ( ℓ + u ) 2 f T L f {\displaystyle \left\|f\right\|_{I}^{2}={\frac {1}{(\ell +u)^{2}}}\mathbf {f} ^{\mathrm {T} }L\mathbf {f} } As the number of data points ℓ + u {\displaystyle \ell +u} increases, this empirical definition of ‖ f ‖ I 2 {\displaystyle \left\|f\right\|_{I}^{2}} converges to the definition when P X {\displaystyle {\mathcal {P}}_{X}} is known. === Solving the regularization problem with graph-based approach === Using the weights γ A {\displaystyle \gamma _{A}} and γ I {\displaystyle \gamma _{I}} for the ambient and intrinsic regularizers, the final expression to be solved becomes: arg min f ∈ H 1 ℓ ∑ i = 1 ℓ V ( f ( x i ) , y i ) + γ A ‖ f ‖ K 2 + γ I ( ℓ + u ) 2 f T L f {\displaystyle {\underset {f\in {\mathcal {H}}}{\arg \!\min }}{\frac {1}{\ell }}\sum _{i=1}^{\ell }V(f(x_{i}),y_{i})+\gamma _{A}\left\|f\right\|_{K}^{2}+{\frac {\gamma _{I}}{(\ell +u)^{2}}}\mathbf {f} ^{\mathrm {T} }L\mathbf {f} } As with other kernel methods, H {\displaystyle {\mathcal {H}}} may be an infinite-dimensional space, so if the regularization expression cannot be solved explicitly, it is impossible to search the entire space for a solution. Instead, a representer theorem shows that under certain conditions on the choice of the norm ‖ f ‖ I {\displaystyle \left\|f\right\|_{I}} , the optimal solution f ∗ {\displaystyle f^{}} must be a linear combination of the kernel centered at each of the input points: for some weights α i {\displaystyle \alpha _{i}} , f ∗ ( x ) = ∑ i = 1 ℓ + u α i K ( x i , x ) {\displaystyle f^{}(x)=\sum _{i=1}^{\ell +u}\alpha _{i}K(x_{i},x)} Using this result, it is possible to search for the optimal solution f ∗ {\displaystyle f^{}} by searching the finite-dimensional space defined by the possible choices of α i {\displaystyle \alpha _{i}} . === Functional approach of the Laplacian norm === The idea beyond the graph-Laplacian is to use neighbors to estimate the Laplacian. This method is akin to local averaging methods, that are known to scale poorly in high-dimensional problems. Indeed, the graph Laplacian is known to suffer from the curse of dimensionality. Luckily, it is possible to leverage expected smoothness of the function to estimate thanks to more advanced functional analysis. This method consists of estimating the Laplacian operator using derivatives of the kernel reading ∂ 1 , j K ( x i , x ) {\displaystyle \partial _{1,j}K(x_{i},x)} where ∂ 1 , j {\displaystyle \partial _{1,j}} denotes the partial derivatives according to the j-th coordinate of the first variable. This second approach to the Laplacian norm is to put in relation with meshfree methods, that contrast with the finite difference method in PDE. == Applications == Manifold regularization can extend a variety of algorithms that can be expressed using Tikhonov regularization, by choosing an appropriate loss function V {\displaystyle V} and hypothesis space H {\displaystyle {\mathcal {H}}} . Two commonly used examples are the families of support vector machines and regularized least squares algorithm

    Read more →
  • Incremental heuristic search

    Incremental heuristic search

    Incremental heuristic search algorithms combine both incremental and heuristic search to speed up searches of sequences of similar search problems, which is important in domains that are only incompletely known or change dynamically. Incremental search has been studied at least since the late 1960s. Incremental search algorithms reuse information from previous searches to speed up the current search and solve search problems potentially much faster than solving them repeatedly from scratch. Similarly, heuristic search has also been studied at least since the late 1960s. Heuristic search algorithms, often based on A, use heuristic knowledge in the form of approximations of the goal distances to focus the search and solve search problems potentially much faster than uninformed search algorithms. The resulting search problems, sometimes called dynamic path planning problems, are graph search problems where paths have to be found repeatedly because the topology of the graph, its edge costs, the start vertex or the goal vertices change over time. So far, three main classes of incremental heuristic search algorithms have been developed: The first class restarts A at the point where its current search deviates from the previous one (example: Fringe Saving A). The second class updates the h-values (heuristic, i.e. approximate distance to goal) from the previous search during the current search to make them more informed (example: Generalized Adaptive A). The third class updates the g-values (distance from start) from the previous search during the current search to correct them when necessary, which can be interpreted as transforming the A search tree from the previous search into the A search tree for the current search (examples: Lifelong Planning A, D, D Lite). All three classes of incremental heuristic search algorithms are different from other replanning algorithms, such as planning by analogy, in that their plan quality does not deteriorate with the number of replanning episodes. == Applications == Incremental heuristic search has been extensively used in robotics, where a larger number of path planning systems are based on either D (typically earlier systems) or D Lite (current systems), two different incremental heuristic search algorithms.

    Read more →
  • Instance-based learning

    Instance-based learning

    In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. Because computation is postponed until a new instance is observed, these algorithms are sometimes referred to as "lazy." It is called instance-based because it constructs hypotheses directly from the training instances themselves. This means that the hypothesis complexity can grow with the data: in the worst case, a hypothesis is a list of n training items and the computational complexity of classifying a single new instance is O(n). One advantage that instance-based learning has over other methods of machine learning is its ability to adapt its model to previously unseen data. Instance-based learners may simply store a new instance or throw an old instance away. Examples of instance-based learning algorithms are the k-nearest neighbors algorithm, kernel machines and RBF networks. These store (a subset of) their training set; when predicting a value/class for a new instance, they compute distances or similarities between this instance and the training instances to make a decision. To battle the memory complexity of storing all training instances, as well as the risk of overfitting to noise in the training set, instance reduction algorithms have been proposed.

    Read more →
  • Hierarchical navigable small world

    Hierarchical navigable small world

    Hierarchical navigable small world (HNSW) is an algorithm for approximate nearest neighbor search. It is used to find items that are similar to a query item in a large collection, without comparing the query with every item one by one. The algorithm is commonly used for searching vector data. In these systems, an item such as a document, image, song, or user profile is represented by a list of numbers called a vector. Items with similar vectors are treated as similar according to the model that produced the vectors. HNSW provides a way to search these vectors quickly, especially in large datasets. HNSW stores vectors in a graph. Each vector is a node, and links connect it to some nearby vectors. The graph has several layers: upper layers contain fewer nodes and act like a rough map, while the bottom layer contains all nodes and gives a more detailed view. A search starts in an upper layer, follows links toward nodes that are closer to the query, and then repeats the process in lower layers until it finds a set of likely nearest neighbors. == Background == The nearest neighbor search problem asks which items in a dataset are closest to a query item. A direct search can compare the query with every item in the dataset, but this becomes slow when the dataset is large. Exact search methods based on spatial trees, such as the k-d tree and R-tree, can also become less effective for high-dimensional data, a problem often associated with the curse of dimensionality. Approximate nearest neighbor methods trade some exactness for speed or lower resource use. Instead of always guaranteeing the exact closest item, they try to return close items quickly. Other approximate methods include locality-sensitive hashing and product quantization. HNSW builds on research into small-world networks and navigable graphs. In a small-world graph, most nodes can be reached from other nodes through a short chain of links. In a navigable graph, a search procedure can use local information to move toward a target. Jon Kleinberg's work on navigation in small-world networks is an important example of this research area. Later work studied ways to add links that make graphs easier to navigate greedily. The HNSW algorithm extends earlier navigable small world methods for similarity search by adding a hierarchy of graph layers. This hierarchy helps the algorithm find a good region of the graph before doing a more detailed search in the bottom layer. == Algorithm == HNSW is based on a proximity graph. In this graph, nearby vectors are connected by edges. The algorithm uses these edges to move through the dataset, rather than scanning every vector. The graph is hierarchical. Every vector appears in the bottom layer. Some vectors are also placed in higher layers, with fewer vectors appearing as the layers go upward. The upper layers allow long-range movement across the dataset, while the lower layers allow a more detailed search near promising candidates. A typical search proceeds as follows: The search begins from an entry point in the highest layer. At each step, the algorithm looks at neighboring nodes and moves to a neighbor that is closer to the query. When it cannot find a closer neighbor in that layer, it moves down to the next layer. In the bottom layer, it explores a wider set of candidate nodes and returns the nearest candidates found. This search strategy is often described as greedy navigation. The algorithm repeatedly chooses locally better nodes, using the graph structure to approach the query point. == Construction and parameters == The HNSW graph is built incrementally. When a new vector is inserted, the algorithm assigns it a maximum layer, searches for nearby existing nodes, and connects the new node to selected neighbors in each layer where it appears. Implementations usually expose parameters that control the trade-off between speed, accuracy, memory use, and construction time. A higher number of graph connections can improve recall but requires more memory. A larger search candidate list can improve accuracy but makes queries slower. A larger construction candidate list can improve the quality of the graph but makes index building slower. Because HNSW is approximate, its results are not always identical to a full exact search. Its practical performance depends on the dataset, distance measure, implementation, and parameter settings. Benchmarking studies have found HNSW-based libraries to be strong performers among approximate nearest neighbor methods, although worst-case performance can differ from performance on common benchmark datasets. == Use in vector search systems == HNSW is used as an index in systems that store and search high-dimensional vectors. These systems include vector databases, search engines, and database extensions. Typical uses include semantic search, recommender systems, image similarity search, and retrieval-augmented generation. Several software projects implement or support HNSW. Libraries include hnswlib, which is associated with the original HNSW authors, and FAISS. Database and search systems that document HNSW support include Apache Lucene, Chroma, ClickHouse, DuckDB, MariaDB, Milvus, pgvector, Qdrant, and Redis.

    Read more →