Manifold hypothesis

Manifold hypothesis

The manifold hypothesis posits that many high-dimensional data sets that occur in the real world actually lie along low-dimensional latent manifolds inside that high-dimensional space. As a consequence of the manifold hypothesis, many data sets that appear to initially require many variables to describe, can actually be described by a comparatively small number of variables, linked to the local coordinate system of the underlying manifold. It is suggested that this principle underpins the effectiveness of machine learning algorithms in describing high-dimensional data sets by considering a few common features. The manifold hypothesis is related to the effectiveness of nonlinear dimensionality reduction techniques in machine learning. Many techniques of dimensional reduction make the assumption that data lies along a low-dimensional submanifold, such as manifold sculpting, manifold alignment, and manifold regularization. The major implications of this hypothesis is that Machine learning models only have to fit relatively simple, low-dimensional, highly structured subspaces within their potential input space (latent manifolds). Within one of these manifolds, it's always possible to interpolate between two inputs, that is to say, morph one into another via a continuous path along which all points fall on the manifold. The ability to interpolate between samples is the key to generalization in deep learning. == The information geometry of statistical manifolds == An empirically-motivated approach to the manifold hypothesis focuses on its correspondence with an effective theory for manifold learning under the assumption that robust machine learning requires encoding the dataset of interest using methods for data compression. This perspective gradually emerged using the tools of information geometry thanks to the coordinated effort of scientists working on the efficient coding hypothesis, predictive coding and variational Bayesian methods. The argument for reasoning about the information geometry on the latent space of distributions rests upon the existence and uniqueness of the Fisher information metric. In this general setting, we are trying to find a stochastic embedding of a statistical manifold. From the perspective of dynamical systems, in the big data regime this manifold generally exhibits certain properties such as homeostasis: We can sample large amounts of data from the underlying generative process. Machine Learning experiments are reproducible, so the statistics of the generating process exhibit stationarity. In a sense made precise by theoretical neuroscientists working on the free energy principle, the statistical manifold in question possesses a Markov blanket.

Spreading activation

Spreading activation is a method for searching associative networks, biological and artificial neural networks, or semantic networks. The search process is initiated by labeling a set of source nodes (e.g. concepts in a semantic network) with weights or "activation" and then iteratively propagating or "spreading" that activation out to other nodes linked to the source nodes. Most often these "weights" are real values that decay as activation propagates through the network. When the weights are discrete this process is often referred to as marker passing. Activation may originate from alternate paths, identified by distinct markers, and terminate when two alternate paths reach the same node. However brain studies show that several different brain areas play an important role in semantic processing. Spreading activation in semantic networks as a model were invented in cognitive psychology to model the fan out effect. Spreading activation can also be applied in information retrieval, by means of a network of nodes representing documents and terms contained in those documents. == Cognitive psychology == As it relates to cognitive psychology, spreading activation is the theory of how the brain iterates through a network of associated ideas to retrieve specific information. The spreading activation theory presents the array of concepts within our memory as cognitive units, each consisting of a node and its associated elements or characteristics, all connected together by edges. A spreading activation network can be represented schematically, in a sort of web diagram with shorter lines between two nodes meaning the ideas are more closely related and will typically be associated more quickly to the original concept. In memory psychology, the spreading activation model holds that people organize their knowledge of the world based on their personal experiences, which in turn form the network of ideas that is the person's knowledge of the world. When a word (the target) is preceded by an associated word (the prime) in word recognition tasks, participants seem to perform better in the amount of time that it takes them to respond. For instance, subjects respond faster to the word "doctor" when it is preceded by "nurse" than when it is preceded by an unrelated word like "carrot". This semantic priming effect with words that are close in meaning within the cognitive network has been seen in a wide range of tasks given by experimenters, ranging from sentence verification to lexical decision and naming. As another example, if the original concept is "red" and the concept "vehicles" is primed, they are much more likely to say "fire engine" instead of something unrelated to vehicles, such as "cherries". If instead "fruits" was primed, they would likely name "cherries" and continue on from there. The activation of pathways in the network has everything to do with how closely linked two concepts are by meaning, as well as how a subject is primed. == Algorithm == A directed graph is populated by Nodes[ 1...N ] each having an associated activation value A [ i ] which is a real number in the range [0.0 ... 1.0]. A Link[ i, j ] connects source node[ i ] with target node[ j ]. Each edge has an associated weight W [ i, j ] usually a real number in the range [0.0 ... 1.0]. Parameters: Firing threshold F, a real number in the range [0.0 ... 1.0] Decay factor D, a real number in the range [0.0 ... 1.0] Steps: Initialize the graph setting all activation values A [ i ] to zero. Set one or more origin nodes to an initial activation value greater than the firing threshold F. A typical initial value is 1.0. For each unfired node [ i ] in the graph having an activation value A [ i ] greater than the node firing threshold F: For each Link [ i, j ] connecting the source node [ i ] with target node [ j ], adjust A [ j ] = A [ j ] + (A [ i ] W [ i, j ] D) where D is the decay factor. If a target node receives an adjustment to its activation value so that it would exceed 1.0, then set its new activation value to 1.0. Likewise maintain 0.0 as a lower bound on the target node's activation value should it receive an adjustment to below 0.0. Once a node has fired it may not fire again, although variations of the basic algorithm permit repeated firings and loops through the graph. Nodes receiving a new activation value that exceeds the firing threshold F are marked for firing on the next spreading activation cycle. If activation originates from more than one node, a variation of the algorithm permits marker passing to distinguish the paths by which activation is spread over the graph The procedure terminates when either there are no more nodes to fire or in the case of marker passing from multiple origins, when a node is reached from more than one path. Variations of the algorithm that permit repeated node firings and activation loops in the graph, terminate after a steady activation state, with respect to some delta, is reached, or when a maximum number of iterations is exceeded. == Examples ==

METR

Model Evaluation and Threat Research (METR) (MEE-tər), is a nonprofit research institute, based in Berkeley, California, that evaluates frontier AI models' capabilities to carry out long-horizon, agentic tasks that some researchers argue could pose catastrophic risks to society. METR has worked with leading AI companies to conduct pre-deployment model evaluations and contribute to system cards, including OpenAI's o3, o4-mini, GPT-4o and GPT-4.5, and Anthropic's Claude models. METR's CEO and founder is Beth Barnes, a former alignment researcher at OpenAI who left in 2022 to form ARC Evals, the evaluation division of Paul Christiano's Alignment Research Center. In December 2023, ARC Evals was spun off into an independent 501(c)(3) nonprofit and renamed METR. == Research == A substantial amount of METR's research is focused on evaluating the capabilities of AI systems to conduct research and development of AI systems themselves, including RE-Bench, a benchmark designed to test whether AIs can "solve research engineering tasks and accelerate AI R&D". === Doubling time estimates === In March 2025, METR published a paper noting that the length of software engineering tasks that the leading AI model could complete had a doubling time of around 7 months between 2019 and 2024. In January 2026, METR released a new version of their time horizon estimates model (Time Horizon 1.1). According to the updated model, the rate of progress of AI capabilities has increased since 2023, with a post-2023 doubling time estimated at 130.8 days (4.3 months). Progress is thus estimated to be 20% more rapid. === Time horizon measurements === METR releases a "task-completion time horizon" for analysed AI models. This measures the "task duration (measured by human expert completion time) at which an AI agent is predicted to succeed with a given level of reliability." The metric is reported in two variants: the 50%-time horizon, which gives the task duration at which an AI model is estimated to succeed 50% of the time, and the 80%-time horizon, which gives the task duration at which an AI model is estimated to succeed 80% of the time. METR has published two versions of the underlying model: Time Horizon 1.0 and Time Horizon 1.1, the latter introduced in January 2026. As of 9 May 2026, the best-performing model is Claude Mythos, with a 50%-time horizon of likely at least 16 hours and an 80%-time horizon of 3 hours and 6 minutes. METR notes that "[m]easurements above 16 [hours] are unreliable with [their] current task suite". The following table provides time horizon estimates ordered by each model's release date:

Mistral AI

Mistral AI SAS (French: [mistʁal]) is a French artificial intelligence (AI) company, headquartered in Paris. Founded in 2023, it has open-weight large language models (LLMs), with both open-source and proprietary AI models. As of 2025 the company has a valuation of more than US$14 billion. == Namesake == The company is named after the mistral, a powerful, cold wind in southern France, a term which originates from the Occitan language. == History == Mistral AI was established in April 2023 by three French AI researchers, Arthur Mensch, Guillaume Lample and Timothée Lacroix. Mensch, an expert in advanced AI systems, is a former employee of Google DeepMind; Lample and Lacroix, meanwhile, are large-scale AI models specialists who had worked for Meta Platforms. The trio originally met during their studies at École Polytechnique. == Company operation == === Funding === In June 2023, the start-up carried out a first fundraising of €105 million ($117 million) with investors including the American fund Lightspeed Venture Partners, Eric Schmidt, Xavier Niel and JCDecaux. The valuation was then estimated by the Financial Times at €240 million ($267 million). On 10 December 2023, Mistral AI announced that it had raised €385 million ($428 million) as part of its second fundraising. This round of financing involves the Californian fund Andreessen Horowitz, BNP Paribas and the software publisher Salesforce. It was valued at over €2 billion. On 26 February 2024, Microsoft announced an investment of $16 million in Mistral AI. On 16 April 2024, reporting revealed that Mistral was in talks to raise €500 million, a deal that would more than double its current valuation to at least €5 billion. In June 2024, Mistral AI secured a €600 million ($645 million) funding round, increasing its valuation to €5.8 billion ($6.2 billion). Based on valuation, as of June 2024, the company was ranked fourth globally in the AI industry, and first outside the San Francisco Bay Area. In April 2025, Mistral AI announced a €100 million partnership with the shipping company CMA CGM. In August 2025, the Financial Times reported that Mistral was in talks to raise $1 billion at a $10 billion valuation. In September 2025, Bloomberg announced that Mistral AI has secured a €2 billion investment valuing it at €12 billion ($14 billion). This comes after $1.5 billion investment from Dutch company ASML, which owns 11% of Mistral. In February 2026, Mistral acquired Koyeb, a Paris-based AI startup. Later that month, Mistral AI announced a multi-year strategic partnership with Accenture to help enterprises deploy sovereign AI solutions at scale. In March 2026 Mistral raised $830 million in order to build new datacenters near Paris and in Sweden. == Services == On 19 November, 2024, the company announced updates for Le Chat (pronounced /lə ʃa/ in French, like the French word for "cat"). It added the ability to create images, using Black Forest Labs' Flux Pro model. On 6 February 2025, Mistral AI released Le Chat on iOS and Android mobile devices. Mistral AI also introduced a Pro subscription tier, priced at $14.99 per month, which provides access to more advanced models, unlimited messaging, and web browsing. At the end of May 2026, Le Chat was renamed Vibe, and new features were introduced at the same time. == Models == The following table lists the main model versions of Mistral, describing the significant changes included with each version: === Mistral 7B === Mistral AI claimed in the Mistral 7B release blog post that the model outperforms LLaMA 2 13B on all benchmarks tested, and is on par with LLaMA 34B on many benchmarks tested, despite having only 7 billion parameters, a small size compared to its competitors. === Mixtral 8x7B === Mistral AI claimed in 2023 that its model beat both LLaMA 70B, and GPT-3.5 in most benchmarks. In March 2024, research conducted by Patronus AI comparing performance of LLMs on a 100-question test with prompts to generate text from books protected under U.S. copyright law found that OpenAI's GPT-4, Mixtral, Meta AI's LLaMA-2, and Anthropic's Claude 2 generated copyrighted text verbatim in 44%, 22%, 10%, and 8% of responses respectively. === Mistral Small 3.1 === On 17 March 2025, Mistral released Mistral Small 3.1 as a smaller, more efficient model. === Mistral Medium 3 === On 7 May 2025, Mistral AI released Mistral Medium 3. === Magistral Small and Magistral Medium === On 10 June 2025, Mistral AI released their first AI reasoning models: Magistral Small (open-source), and Magistral Medium, models which are purported to have chain-of-thought capabilities. === Mistral Large 3 and Ministral 3 === On 2 December 2025, Mistral AI released Mistral Large 3, a sparse, mixture-of-experts model with 41 billion active parameters and 675 billion total parameters, and Ministral 3, three small, dense models with 3 billion, 7 billion and 14 billion parameters. === Devstral 2 and Devstral Small 2 === On 10 December 2025, Mistral AI released Devstral 2 and Devstral Small 2. Devstral Small 2, a 24B parameter model is claimed to achieve better performance at coding than Qwen 3 Coder Flash model which is a 30B parameter model.

Z.ai

Knowledge Atlas Technology Joint Stock Co., Ltd., branded internationally as Z.ai, is a Chinese technology company specializing in artificial intelligence (AI). The company was formerly known as Zhipu AI outside China until its rebranding in 2025. Z.ai's flagship product is the GLM (General Language Model) family of large language models, which the company has released under the free and open-source MIT License since July 2025. As of 2024, it is one of China's "AI tiger" companies by investors and considered to be the third-largest LLM market player in China's AI industry according to the International Data Corporation. In January 2025, the United States Commerce Department blacklisted the company in its Entity List due to national security concerns. == History == Founded in 2019, the startup company began from Tsinghua University and was later spun out as an independent company. Researchers published an Association for Computational Linguistics conference paper in May 2022 introducing the GLM (General Language Model) training algorithm, which uses an "autoregressive blank infilling" strategy that creates cloze tests by randomly removing segments of input text and trains the model to autoregressively regenerate the removed text. In 2023, it raised 2.5 billion yuan (approx. 350 million in USD) from Alibaba Group and Tencent, along with Meituan, Ant Group, Xiaomi, and HongShan. In March 2024, Zhipu AI announced it was developing a Sora-like technology to achieve artificial general intelligence (AGI). In May 2024, the Saudi Arabian finance firm Prosperity7 Ventures, LLC participated in a USD $400 million financing round for Zhipu AI with a valuation of approximately 3 billion USD. In July 2024, they debuted the Ying text-to-video model. Zhipu released GLM-4-Plus in August 2024. In October 2024, Zhipu released GLM-4-Voice, an end-to-end speech large language model that can adjust its tone or dialect. Zhipu disclosed in April 2025 that it had started preparing for its initial public offering (IPO) and released two models under the free and open-source MIT License. In May 2025, the company sealed a 61.28 million yuan deal from the Chinese government for city projects in Hangzhou. In July 2025, Zhipu AI released GLM-4.5 and GLM-4.5 Air, their next generation language models, and the company rebranded itself as Z.ai internationally. In August 2025, Z.ai announced that their GLM models are compatible with Huawei's Ascend processors. On August 11, 2025, Z.ai released a new vision-language model (VLM) with a total of 106B parameters, GLM-4.5V. In late September 2025, the company released GLM-4.6 using China's domestic chips such as those from Cambricon Technologies. Z.ai released GLM-4.6V and GLM-4.7 in December 2025. That same year, the company changed its official name to Knowledge Atlas Technology JSC Ltd. On 8 January 2026, Z.ai held its IPO on the Hong Kong Stock Exchange to become a listed company. It is considered to be China's first major LLM company that went through an IPO. On February 11, 2026, Z.ai released GLM-5. In late February 2026, Z.ai's shares fell by 23%, and had a shortage of compute resources, leading to user complaints and Z.ai issuing a public call for support. Z.ai also restricted new user signups. In late March, 2026, Z.ai released the GLM-5.1 model to subscription users. On April 8th, 2026, Z.ai released GLM-5.1 as open-source. The same day, Z.ai increased its API prices by 10%, but maintained a lower price than its United States competitor Anthropic's Opus 4.6 model. On release, the company's share price increased 11.5%. == Description == Z.ai provides the following products and services: General Language Model (commonly abbreviated as GLM; formerly known as ChatGLM), a series of pre-trained dialogue models initially developed by Zhipu AI and Tsinghua KEG in 2023. GLM 4.5, released in July 2025 by Z.ai, can run on eight NVIDIA H20 chips. The release of GLM-4.6 in late September 2025 marked the first integration of FP8 and Int4 quantization on Cambricon chips. It also supports native FP8 on Moore Threads GPUs. Ying, a text-to-video model that generates image and text prompts into a six-second video clip for around 30 seconds. AutoGLM, an AI agent application that uses voice commands to complete tasks within a smartphone. The app can analyze complex tasks such as ordering an item from a nearby store and repeating an order based from the user's shopping history. AMiner, created by Jie Tang (co-founder of Z.ai) in March 2006, now owned by Z.ai. Z.ai has offices in the Middle East, United Kingdom, Singapore, and Malaysia, along with innovation center projects across Southeast Asia (2025). In January 2025, the United States Commerce Department added the company to its Entity List, citing national security concerns. == Models ==

TensorFlow Hub

TensorFlow Hub (also styled TF Hub) is an open-source machine learning library and online repository that provides TensorFlow model components, called modules. It is maintained by Google as part of the TensorFlow ecosystem and allows developers to discover, publish, and reuse pretrained models for tasks such as computer vision, natural language processing, and transfer learning. == Overview == TensorFlow Hub provides a central platform where developers and researchers can access pre-trained models and integrate them directly into TensorFlow workflows. Each module encapsulates a computation graph and its trained weights, with standardized input and output signatures. Modules can be loaded using the hub.load() function or through Keras integration via hub.KerasLayer, enabling users to perform transfer learning or feature extraction. == History == TensorFlow Hub was announced by Google in March 2018, with the first public version released shortly after. Its introduction coincided with the growing adoption of transfer learning techniques and the need for standardized model packaging. Over time, the hub expanded to include models such as the BERT family, MobileNet, EfficientNet, and the Universal Sentence Encoder. In 2020, research on “Regret selection in TensorFlow Hub” explored the problem of identifying optimal models for downstream tasks given a large repository of alternatives. == Applications == TensorFlow Hub hosts a variety of models across machine learning domains: Natural language processing: BERT, ALBERT language model, and Universal Sentence Encoder. Computer vision: ResNet, Inception (deep learning), MobileNet, EfficientNet. Speech and audio: spectrogram feature extractors and automatic speech recognition models. Multilingual embeddings: cross-lingual and sentence-level representations for machine translation and semantic similarity. Modules are widely used in education, academic research, and industry for prototyping and production deployment.

Computational Intelligence (journal)

Computational Intelligence Journal is a peer-reviewed scientific journal covering research on artificial intelligence and computer science. The journal published novel research as well as innovative applications in a broad range of AI, covering Computational Intelligence is an artificial intelligence journal publishing novel research on a broad range of experimental and theoretical topics in AI and computer science. With a broad scope, the journal covers machine learning, knowledge mining, web intelligence, AI language, and philosophical implications. The journal was established in 1985 and is published by Wiley-Blackwell. Currently, the editors-in-chief is Diane Inkpen. The quality of the journal as an academic publishing venue is evaluated according to public citation impact metrics. in 2022, the Computational Intelligence Journal CiteScore of Scopus was 5.3, while Clarivate's Web of Science gives it 0.39 in the Journal Citation Indicator and 2,8 in the Journal Impact Factor.