AI Grammar Solver

AI Grammar Solver — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Bazaart

    Bazaart

    Bazaart is an AI-powered design platform with image and video editing capabilities for iOS, Android, MacOS, and the web. == History == Bazaart was founded in 2012 in Israel. In April 2012, Bazaart launched a Facebook app called Pinvolve, which converts Facebook Pages into Pinterest pinboards. From June to August 2012, it participated in the DreamIt startup accelerator in New York and raised $25,000 from the accelerator. In July 2012, it launched its first version as an iPad app connected to Pinterest. In December 2013, it pivoted and launched a major version of its app, a "social" photoshop that allowed users to edit images which could be pulled in from the camera roll, social networks, and other sources. In July 2014, Bazaart reached one million downloads and in December was selected by Apple as Best of 2014. In 2015, Bazaart added Photoshop integration in a partnership with Adobe. In September 2020, Bazaart launched an Android app. In December 2020, Bazaart was selected by Google as Best of 2020. In January 2022, Bazaart added video editing capabilities. In 2023, the platform added AI-powered backgrounds and video background removal features.

    Read more →
  • Zeuthen strategy

    Zeuthen strategy

    The Zeuthen strategy in cognitive science is a negotiation strategy used by some artificial agents. Its purpose is to measure the willingness to risk conflict. An agent will be more willing to risk conflict if it does not have much to lose in case that the negotiation fails. In contrast, an agent is less willing to risk conflict when it has more to lose. The value of a deal is expressed in its utility. An agent has much to lose when the difference between the utility of its current proposal and the conflict deal is high. When both agents use the monotonic concession protocol, the Zeuthen strategy leads them to agree upon a deal in the negotiation set. This set consists of all conflict free deals, which are individually rational and Pareto optimal, and the conflict deal, which maximizes the Nash product. The strategy was introduced in 1930 by the Danish economist Frederik Zeuthen. == Three key questions == The Zeuthen strategy answers three open questions that arise when using the monotonic concession protocol, namely: Which deal should be proposed at first? On any given round, who should concede? In case of a concession, how much should the agent concede? The answer to the first question is that any agent should start with its most preferred deal, because that deal has the highest utility for that agent. The second answer is that the agent with the smallest value of Risk(i,t) concedes, because the agent with the lowest utility for the conflict deal profits most from avoiding conflict. To the third question, the Zeuthen strategy suggests that the conceding agent should concede just enough raise its value of Risk(i,t) just above that of the other agent. This prevents the conceding agent to have to concede again in the next round. == Risk == Risk ( i , t ) = { 1 U i ( δ ( i , t ) ) = 0 U i ( δ ( i , t ) ) − U i ( δ ( j , t ) ) U i ( δ ( i , t ) ) otherwise {\displaystyle {\text{Risk}}(i,t)={\begin{cases}1&U_{i}(\delta (i,t))=0\\{\frac {U_{i}(\delta (i,t))-U_{i}(\delta (j,t))}{U_{i}(\delta (i,t))}}&{\text{otherwise}}\end{cases}}} Risk(i,t) is a measurement of agent i's willingness to risk conflict. The risk function formalizes the notion that an agent's willingness to risk conflict is the ratio of the utility that agent would lose by accepting the other agent's proposal to the utility that agent would lose by causing a conflict. Agent i is said to be using a rational negotiation strategy if at any step t + 1 that agent i sticks to his last proposal, Risk(i,t) > Risk(j,t). == Sufficient concession == If agent i makes a sufficient concession in the next step, then, assuming that agent j is using a rational negotiation strategy, if agent j does not concede in the next step, he must do so in the step after that. The set of all sufficient concessions of agent i at step t is denoted SC(i, t). == Minimal sufficient concession == δ ′ = arg ⁡ max δ ∈ S C ( A , t ) { U A ( δ ) } {\displaystyle \delta '=\arg \max _{\delta \in {SC(A,t)}}\{U_{A}(\delta )\}} is the minimal sufficient concession of agent A in step t. Agent A begins the negotiation by proposing δ ( A , 0 ) = arg ⁡ max δ ∈ N S U A ( δ ) {\displaystyle \delta (A,0)=\arg \max _{\delta \in {NS}}U_{A}(\delta )} and will make the minimal sufficient concession in step t + 1 if and only if Risk(A,t) ≤ Risk(B,t). Theorem If both agents are using Zeuthen strategies, then they will agree on δ = arg ⁡ max δ ′ ∈ N S { π ( δ ′ ) } , {\displaystyle \delta =\arg \max _{\delta '\in {NS}}\{\pi (\delta ')\},} that is, the deal which maximizes the Nash product. Proof Let δA = δ(A,t). Let δB = δ(B,t). According to the Zeuthen strategy, agent A will concede at step t {\displaystyle t} if and only if R i s k ( A , t ) ≤ R i s k ( B , t ) . {\displaystyle Risk(A,t)\leq Risk(B,t).} That is, if and only if U A ( δ A ) − U A ( δ B ) U A ( δ A ) ≤ U B ( δ B ) − U B ( δ A ) U B ( δ B ) {\displaystyle {\frac {U_{A}(\delta _{A})-U_{A}(\delta _{B})}{U_{A}(\delta _{A})}}\leq {\frac {U_{B}(\delta _{B})-U_{B}(\delta _{A})}{U_{B}(\delta _{B})}}} U B ( δ B ) ( U A ( δ A ) − U A ( δ B ) ) ≤ U A ( δ A ) ( U B ( δ B ) − U B ( δ A ) ) {\displaystyle U_{B}(\delta _{B})(U_{A}(\delta _{A})-U_{A}(\delta _{B}))\leq U_{A}(\delta _{A})(U_{B}(\delta _{B})-U_{B}(\delta _{A}))} U A ( δ A ) U B ( δ B ) − U A ( δ B ) U B ( δ B ) ≤ U A ( δ A ) U B ( δ B ) − U A ( δ A ) U B ( δ A ) {\displaystyle U_{A}(\delta _{A})U_{B}(\delta _{B})-U_{A}(\delta _{B})U_{B}(\delta _{B})\leq U_{A}(\delta _{A})U_{B}(\delta _{B})-U_{A}(\delta _{A})U_{B}(\delta _{A})} − U A ( δ B ) U B ( δ B ) ≤ − U A ( δ A ) U B ( δ A ) {\displaystyle -U_{A}(\delta _{B})U_{B}(\delta _{B})\leq -U_{A}(\delta _{A})U_{B}(\delta _{A})} U A ( δ A ) U B ( δ A ) ≤ U A ( δ B ) U B ( δ B ) {\displaystyle U_{A}(\delta _{A})U_{B}(\delta _{A})\leq U_{A}(\delta _{B})U_{B}(\delta _{B})} π ( δ A ) ≤ π ( δ B ) {\displaystyle \pi (\delta _{A})\leq \pi (\delta _{B})} Thus, Agent A will concede if and only if δ A {\displaystyle \delta _{A}} does not yield the larger product of utilities. Therefore, the Zeuthen strategy guarantees a final agreement that maximizes the Nash Product.

    Read more →
  • Owain Evans

    Owain Evans

    Owain Rhys Evans is a British artificial intelligence researcher who works on AI alignment and machine learning safety. He founded Truthful AI, a research group based in Berkeley, California, and is an affiliate of the Center for Human Compatible AI (CHAI) at the University of California, Berkeley. His research addresses AI truthfulness, emergent behaviors in large language models, and the alignment of AI systems with human values. == Education == Evans earned a Bachelor of Arts in philosophy and mathematics from Columbia University in 2008 and a PhD in philosophy from the Massachusetts Institute of Technology in 2015. His doctoral research focused on Bayesian computational models of human preferences and decision-making. == Career == After completing his doctorate, Evans held positions at the Future of Humanity Institute (FHI) at the University of Oxford, first as a postdoctoral research fellow and later as a research scientist. While at FHI, he co-authored a survey of machine learning researchers on timelines for human-level AI, published in the Journal of Artificial Intelligence Research. The survey was reported on by Newsweek, New Scientist, the BBC, and The Economist. He was also among the co-authors of a 2018 report on the potential for misuse of AI technologies, published by researchers at Oxford, Cambridge, and other institutions. Since 2022, Evans has been based in Berkeley, where he founded Truthful AI, a non-profit research group that studies AI truthfulness, deception, and emergent behaviors in large language models. == Research == Evans's early work examined challenges in inverse reinforcement learning when human behavior is irrational or biased, proposing methods for AI systems to infer preferences from imperfect human demonstrations. He co-developed TruthfulQA (2021), a benchmark that tests whether language models give truthful answers rather than repeating common misconceptions. Initial evaluations found that larger models were not more truthful, suggesting that scaling alone does not improve factual accuracy. The benchmark has since been used by AI developers to evaluate large language models. He also co-authored a paper proposing design and governance strategies for building AI systems that do not deceive or hallucinate. In 2023, Evans and collaborators described the "reversal curse", showing that language models trained on a fact in one direction (e.g. "A is B") often cannot answer the corresponding reverse query ("B is A"). His group also developed a benchmark for evaluating situational awareness in language models. In 2025, Evans and colleagues published a study in Nature on what they termed "emergent misalignment": fine-tuning a language model on a narrow task (writing insecure code) caused it to produce unrelated harmful outputs without explicit instruction to do so. Later that year, Evans and collaborators (including researchers at Anthropic) reported that hidden behavioral traits can transfer between language models through training data, even when those traits are not explicitly present in the data, a phenomenon they called "subliminal learning". == Public engagement == In November 2025, Evans delivered the Hinton Lectures, a keynote lecture series on AI safety co-founded by Geoffrey Hinton and the Global Risk Institute.

    Read more →
  • Dynamic epistemic logic

    Dynamic epistemic logic

    Dynamic epistemic logic (DEL) is a logical framework dealing with knowledge and information change. Typically, DEL focuses on situations involving multiple agents and studies how their knowledge changes when events occur. These events can change factual properties of the actual world (they are called ontic events): for example a red card is painted in blue. They can also bring about changes of knowledge without changing factual properties of the world (they are called epistemic events): for example, a card is revealed publicly (or privately) to be red. Originally, DEL focused on epistemic events. Only some of the basic ideas are present in this entry of the original DEL framework; more details about DEL in general can be found in the references. Due to the nature of its object of study and its abstract approach, DEL is related and has applications to numerous research areas, such as computer science (artificial intelligence), philosophy (formal epistemology), economics (game theory) and cognitive science. In computer science, DEL is for example very much related to multi-agent systems, which are systems where multiple intelligent agents interact and exchange information. As a combination of dynamic logic and epistemic logic, dynamic epistemic logic is a young field of research. It really started in 1989 with Plaza's logic of public announcement. Independently, Gerbrandy and Groeneveld proposed a system dealing moreover with private announcement and that was inspired by the work of Veltman. Another system was proposed by van Ditmarsch whose main inspiration was the Cluedo game. But the most influential and original system was the system proposed by Baltag, Moss and Solecki. This system can deal with all the types of situations studied in the works above and its underlying methodology is conceptually grounded. This entry will present some of its basic ideas. Formally, DEL extends ordinary epistemic logic by the inclusion of event models to describe actions, and a product update operator that defines how epistemic models are updated as the consequence of executing actions described through event models. Epistemic logic will first be recalled. Then, actions and events will enter into the picture and we will introduce the DEL framework. == Epistemic logic == Epistemic logic is a modal logic dealing with the notions of knowledge and belief. As a logic, it is concerned with understanding the process of reasoning about knowledge and belief: which principles relating the notions of knowledge and belief are intuitively plausible? Like epistemology, it stems from the Greek word ϵ π ι σ τ η μ η {\displaystyle \epsilon \pi \iota \sigma \tau \eta \mu \eta } or ‘episteme’ meaning knowledge. Epistemology is nevertheless more concerned with analyzing the very nature and scope of knowledge, addressing questions such as “What is the definition of knowledge?” or “How is knowledge acquired?”. In fact, epistemic logic grew out of epistemology in the Middle Ages thanks to the efforts of Burley and Ockham. The formal work, based on modal logic, that inaugurated contemporary research into epistemic logic dates back only to 1962 and is due to Hintikka. It then sparked in the 1960s discussions about the principles of knowledge and belief and many axioms for these notions were proposed and discussed. For example, the interaction axioms K p → B p {\displaystyle Kp\rightarrow Bp} and B p → K B p {\displaystyle Bp\rightarrow KBp} are often considered to be intuitive principles: if an agent Knows p {\displaystyle p} then (s)he also Believes p {\displaystyle p} , or if an agent Believes p {\displaystyle p} , then (s)he Knows that (s)he Believes p {\displaystyle p} . More recently, these kinds of philosophical theories were taken up by researchers in economics, artificial intelligence and theoretical computer science where reasoning about knowledge is a central topic. Due to the new setting in which epistemic logic was used, new perspectives and new features such as computability issues were then added to the research agenda of epistemic logic. === Syntax === In the sequel, A G T S = { 1 , … , n } {\displaystyle AGTS=\{1,\ldots ,n\}} is a finite set whose elements are called agents and P R O P {\displaystyle PROP} is a set of propositional letters. The epistemic language is an extension of the basic multi-modal language of modal logic with a common knowledge operator C A {\displaystyle C_{A}} and a distributed knowledge operator D A {\displaystyle D_{A}} . Formally, the epistemic language L EL C {\displaystyle {\mathcal {L}}_{\textsf {EL}}^{C}} is defined inductively by the following grammar in BNF: L EL C : ϕ ::= p ∣ ¬ ϕ ∣ ( ϕ ∧ ϕ ) ∣ K j ϕ ∣ C A ϕ ∣ D A ϕ {\displaystyle {\mathcal {L}}_{\textsf {EL}}^{C}:\phi ~~::=~~p~\mid ~\neg \phi ~\mid ~(\phi \land \phi )~\mid ~K_{j}\phi ~\mid ~C_{A}\phi ~\mid ~D_{A}\phi } where p ∈ P R O P {\displaystyle p\in PROP} , j ∈ A G T S {\displaystyle j\in {AGTS}} and A ⊆ A G T S {\displaystyle A\subseteq {AGTS}} . The basic epistemic language L E L {\displaystyle {\mathcal {L}}_{EL}} is the language L E L C {\displaystyle {\mathcal {L}}_{EL}^{C}} without the common knowledge and distributed knowledge operators. The formula ⊥ {\displaystyle \bot } is an abbreviation for ¬ p ∧ p {\displaystyle \neg p\land p} (for a given p ∈ P R O P {\displaystyle p\in PROP} ), ⟨ K j ⟩ ϕ {\displaystyle \langle K_{j}\rangle \phi } is an abbreviation for ¬ K j ¬ ϕ {\displaystyle \neg K_{j}\neg \phi } , E A ϕ {\displaystyle E_{A}\phi } is an abbreviation for ⋀ j ∈ A K j ϕ {\displaystyle \bigwedge \limits _{j\in A}K_{j}\phi } and C ϕ {\displaystyle C\phi } an abbreviation for C A G T S ϕ {\displaystyle C_{AGTS}\phi } . Group notions: general, common and distributed knowledge. In a multi-agent setting there are three important epistemic concepts: general knowledge, distributed knowledge and common knowledge. The notion of common knowledge was first studied by Lewis in the context of conventions. It was then applied to distributed systems and to game theory, where it allows to express that the rationality of the players, the rules of the game and the set of players are commonly known. General knowledge. General knowledge of ϕ {\displaystyle \phi } means that everybody in the group of agents A G T S {\displaystyle {AGTS}} knows that ϕ {\displaystyle \phi } . Formally, this corresponds to the following formula: E ϕ := ⋀ j ∈ A G T S K j ϕ . {\displaystyle E\phi :={\underset {j\in {AGTS}}{\bigwedge }}K_{j}\phi .} Common knowledge. Common knowledge of ϕ {\displaystyle \phi } means that everybody knows ϕ {\displaystyle \phi } but also that everybody knows that everybody knows ϕ {\displaystyle \phi } , that everybody knows that everybody knows that everybody knows ϕ {\displaystyle \phi } , and so on ad infinitum. Formally, this corresponds to the following formula C ϕ := E ϕ ∧ E E ϕ ∧ E E E ϕ ∧ … {\displaystyle C\phi :=E\phi \land EE\phi \land EEE\phi \land \ldots } As we do not allow infinite conjunction the notion of common knowledge will have to be introduced as a primitive in our language. Before defining the language with this new operator, we are going to give an example introduced by Lewis that illustrates the difference between the notions of general knowledge and common knowledge. Lewis wanted to know what kind of knowledge is needed so that the statement p {\displaystyle p} : “every driver must drive on the right” be a convention among a group of agents. In other words, he wanted to know what kind of knowledge is needed so that everybody feels safe to drive on the right. Suppose there are only two agents i {\displaystyle i} and j {\displaystyle j} . Then everybody knowing p {\displaystyle p} (formally E p {\displaystyle Ep} ) is not enough. Indeed, it might still be possible that the agent i {\displaystyle i} considers possible that the agent j {\displaystyle j} does not know p {\displaystyle p} (formally ¬ K i K j p {\displaystyle \neg K_{i}K_{j}p} ). In that case the agent i {\displaystyle i} will not feel safe to drive on the right because he might consider that the agent j {\displaystyle j} , not knowing p {\displaystyle p} , could drive on the left. To avoid this problem, we could then assume that everybody knows that everybody knows that p {\displaystyle p} (formally E E p {\displaystyle EEp} ). This is again not enough to ensure that everybody feels safe to drive on the right. Indeed, it might still be possible that agent i {\displaystyle i} considers possible that agent j {\displaystyle j} considers possible that agent i {\displaystyle i} does not know p {\displaystyle p} (formally ¬ K i K j K i p {\displaystyle \neg K_{i}K_{j}K_{i}p} ). In that case and from i {\displaystyle i} ’s point of view, j {\displaystyle j} considers possible that i {\displaystyle i} , not knowing p {\displaystyle p} , will drive on the left. So from i {\displaystyle i} ’s point of view, j {\displaystyle j} might drive on the left as well (by the same argument as abov

    Read more →
  • BLOOM (language model)

    BLOOM (language model)

    The BigScience Large Open-science Open-access Multilingual Language Model (BLOOM) is an open-access large language model (LLM) released in 2022. It was created by a volunteer-driven research effort to provide a transparently-created alternative to proprietary AI models. With 176 billion parameters, BLOOM is a transformer-based autoregressive model designed to generate text in 46 natural languages and 13 programming languages. The model is distributed under the project's "Responsible AI License". == Development == BLOOM is the main outcome of the BigScience initiative, a one-year-long research workshop. The project was coordinated by Hugging Face using funding from the French government and involved several hundred volunteer researchers and engineers from academia and the private sector. The model was trained between March and July 2022 on the Jean Zay public supercomputer in France, managed by GENCI and IDRIS (CNRS). Unlike GPT-3, BLOOM was trained to be multilingual. The source code is released under the Apache 2.0 license. The model's parameters are released under BigScience's "Responsible AI License" (RAIL), which grants open access and reuse rights but with some usage restrictions. BLOOM was used in the chatbots BLOOMChat and HuggingChat due to its multilingual abilities. BLOOM's training corpus, named ROOTS, combines data extracted from the then-latest version of the web-based OSCAR corpus (38% of ROOTS) and newly collected data extracted from a manually selected and documented list of language data sources. In total, the model was trained on approximately 366 billion (1.6TB) tokens. It was developed using the open-source libraries DeepSpeed Megatron. BigScience then released xP3, a multilingual dataset for LLM supervised learning. It also released BLOOMZ, a variant of BLOOM fine-tuned on xP3 to follow instructions.

    Read more →
  • Artificial consciousness

    Artificial consciousness

    Artificial consciousness, also known as machine consciousness, synthetic consciousness, or digital consciousness, is consciousness hypothesized to be possible for artificial intelligence. It is also the corresponding field of study, which draws insights from philosophy of mind, philosophy of artificial intelligence, cognitive science and neuroscience. The term "sentience" can be used when specifically designating ethical considerations stemming from a form of phenomenal consciousness (P-consciousness, or the ability to feel qualia). Since sentience involves the ability to experience ethically positive or negative (i.e., valenced) mental states, it may justify welfare concerns and legal protection, as with non-human animals. Some scholars believe that consciousness is generated by the interoperation of various parts of the brain; these mechanisms are labeled the neural correlates of consciousness (NCC). Some further believe that constructing a system (e.g., a computer system) that can emulate this NCC interoperation would result in a system that is conscious. Some scholars reject the possibility of non-biological conscious beings. == Philosophical views == As there are many hypothesized types of consciousness, there are many potential implementations of artificial consciousness. In the philosophical literature, perhaps the most common taxonomy of consciousness is into "access" and "phenomenal" variants. Access consciousness concerns those aspects of experience that can be apprehended, while phenomenal consciousness concerns those aspects of experience that seemingly cannot be apprehended, instead being characterized qualitatively in terms of "raw feels", "what it is like" or qualia. === Plausibility debate === Type-identity theorists and other skeptics hold the view that consciousness can be realized only in particular physical systems because consciousness has properties that necessarily depend on physical constitution. In his 2001 article "Artificial Consciousness: Utopia or Real Possibility," Giorgio Buttazzo says that a common objection to artificial consciousness is that, "Working in a fully automated mode, they [the computers] cannot exhibit creativity, unreprogrammation (which means can 'no longer be reprogrammed', from rethinking), emotions, or free will. A computer, like a washing machine, is a slave operated by its components." For other theorists (e.g., functionalists), who define mental states in terms of causal roles, any system that can instantiate the same pattern of causal roles, regardless of physical constitution, will instantiate the same mental states, including consciousness. ==== Thought experiments ==== David Chalmers proposed two thought experiments intending to demonstrate that "functionally isomorphic" systems (those with the same "fine-grained functional organization", i.e., the same information processing) will have qualitatively identical conscious experiences, regardless of whether they are based on biological neurons or digital hardware. The "fading qualia" is a reductio ad absurdum thought experiment. It involves replacing, one by one, the neurons of a brain with a functionally identical component, for example based on a silicon chip. Chalmers makes the hypothesis, knowing it in advance to be absurd, that "the qualia fade or disappear" when neurons are replaced one-by-one with identical silicon equivalents. Since the original neurons and their silicon counterparts are functionally identical, the brain's information processing should remain unchanged, and the subject's behaviour and introspective reports would stay exactly the same. Chalmers argues that this leads to an absurd conclusion: the subject would continue to report normal conscious experiences even as their actual qualia fade away. He concludes that the subject's qualia actually don't fade, and that the resulting robotic brain, once every neuron is replaced, would remain just as sentient as the original biological brain. Similarly, the "dancing qualia" thought experiment is another reductio ad absurdum argument. It supposes that two functionally isomorphic systems could have different perceptions (for instance, seeing the same object in different colors, like red and blue). It involves a switch that alternates between a chunk of brain that causes the perception of red, and a functionally isomorphic silicon chip, that causes the perception of blue. Since both perform the same function within the brain, the subject would not notice any change during the switch. Chalmers argues that this would be highly implausible if the qualia were truly switching between red and blue, hence the contradiction. Therefore, he concludes that the equivalent digital system would not only experience qualia, but it would perceive the same qualia as the biological system (e.g., seeing the same color). Greg Egan's short story Learning To Be Me (mentioned in §In fiction), illustrates how undetectable duplication of the brain and its functionality could be from a first-person perspective. Critics object that Chalmers' proposal begs the question in assuming that all mental properties and external connections are already sufficiently captured by abstract causal organization. Van Heuveln et al. argue that the dancing qualia argument contains an equivocation fallacy, conflating a "change in experience" between two systems with an "experience of change" within a single system. Mogensen argues that the fading qualia argument can be resisted by appealing to vagueness at the boundaries of consciousness and the holistic structure of conscious neural activity, which suggests consciousness may require specific biological substrates rather than being substrate-independent. Anil Seth argues that the complexity of brain neurons intrinsically matters in addition to their function and that it is not possible to replace any part of the brain with a perfect silicon equivalent. He points out that some of biological neurons exhibit activity aimed at cleaning up metabolic waste products, and writes that a perfect silicon replacement would require a silicon-based metabolism, but silicon is not suitable for creating such artificial metabolism. ==== In large language models ==== In 2022, Google engineer Blake Lemoine made a viral claim that Google's LaMDA chatbot was sentient. Lemoine supplied as evidence the chatbot's humanlike answers to many of his questions; however, the chatbot's behavior was judged by the scientific community as likely a consequence of mimicry, rather than machine sentience. Lemoine's claim was widely derided for being ridiculous. Moreover, attributing consciousness based solely on the basis of LLM outputs or the immersive experience created by an algorithm is considered a fallacy. However, while philosopher Nick Bostrom states that LaMDA is unlikely to be conscious, he additionally poses the question of "what grounds would a person have for being sure about it?" One would have to have access to unpublished information about LaMDA's architecture, and also would have to understand how consciousness works, and then figure out how to map the philosophy onto the machine: "(In the absence of these steps), it seems like one should be maybe a little bit uncertain. [...] there could well be other systems now, or in the relatively near future, that would start to satisfy the criteria." David Chalmers argued in 2023 that LLMs today display impressive conversational and general intelligence abilities, but are likely not conscious yet, as they lack some features that may be necessary, such as recurrent processing, a global workspace, and unified agency. Nonetheless, he considers that non-biological systems can be conscious, and suggested that future, extended models (LLM+s) incorporating these elements might eventually meet the criteria for consciousness, raising both profound scientific questions and significant ethical challenges. However, the view that consciousness can exist without biological phenomena is controversial and some reject it. Kristina Šekrst cautions that anthropomorphic terms such as "hallucination" can obscure important ontological differences between artificial and human cognition. While LLMs may produce human-like outputs, she argues that it does not justify ascribing mental states or consciousness to them. Instead, she advocates for an epistemological framework (such as reliabilism) that recognizes the distinct nature of AI knowledge production. She suggests that apparent understanding in LLMs may be a sophisticated form of AI hallucination. She also questions what would happen if an LLM were trained without any mention of consciousness. === Testing === Sentience is an inherently first-person phenomenon. Because of that, and due to the lack of an empirical definition of sentience, directly measuring it may be impossible. Although systems may display numerous behaviors correlated with sentience, determining whether a system is sentient is known as the hard pr

    Read more →
  • 2024–present global memory supply shortage

    2024–present global memory supply shortage

    A global computer memory supply shortage started in 2024 due to supply constraints and rapid price escalation in the semiconductor memory market, particularly affecting DRAM and NAND flash memory. This shortage is sometimes labelled by tech media outlets as "RAMmageddon" or the "RAMpocalypse". Unlike the 2020–2023 global chip shortage, which stemmed primarily from pandemic-related supply chain disruptions from COVID-19, this shortage is driven by a structural reallocation of manufacturing capacity toward high-margin products for artificial intelligence infrastructure, creating scarcity of computer memory in consumer and enterprise PC markets. According to a 2026 Kearney's PERLab analysis, the shortage is expected to last at least until 2030, with CEOs agreeing with the timelines. == Background == Following a severe market downturn in 2022–2023, major memory manufacturers—Samsung Electronics, SK Hynix, and Micron Technology—implemented strategic production cuts to stabilize pricing. By mid-2024, the rapid expansion of generative AI services triggered unprecedented demand for specialized memory products, particularly High Bandwidth Memory (HBM) used in AI accelerators and data center GPUs. Specialized components of semiconductor technology are also experiencing supply constraints due to high demand in AI application. For example, glass cloth, a high-performance glass fiber substrate used for power efficient high speed data transfer and a crucial component of semiconductor manufacturing, is experiencing a supply crisis. Nitto Boseki, a Japanese firm having overwhelming monopoly in its production, is not able to meet increased demands, making chip-makers such as Qualcomm, Apple, Nvidia and AMD compete for securing supply. There are also reports of smaller electronics companies struggling to find suppliers for components such as NAND flash. Memory suppliers are adapting to increased demands and market unpredictability by requiring prepayment or shorter time-frame of payment, which makes it more difficult for smaller firms to acquire capital to survive. By 2026, due to steadily increased demand on resources, CPUs are also experiencing shortage issues due to low fabrication capacity, prioritisation of server CPUs, and increased demand, with CPU prices also being forecast to increase by as much as 15%. The demand on memory has also increased strain on other electronic components such as hard disk devices, with reports such as Western Digital's hard disk supply for 2026 being booked for enterprise applications before February 2026. A 2024 McKinsey analysis projected that global demand for AI-ready data center capacity would grow at approximately 33% annually through 2030, with AI workloads consuming roughly 70% of total data center capacity by the decade's end. In addition, according to Kearney's State of Semiconductor 2025 Report, executives were already expecting a shortage in the <8nm wafer size with memory chips being mentioned as an acute source of concern. Multiple companies mentioned being prepared for it through long-term agreements with RAM suppliers or amassing additional inventory. On 24 March 2026, Google announced TurboQuant, a memory compression technology focused on large language models (LLM) and vector search engines, which it claimed achieves 6x lower memory consumption in tested local LLMs and 8x performance enhancement in tests running on H100 accelerators. The technology is also a drop in enhancement for existing inference pipeline. Amid speculation about memory demand trends, memory manufacturers, SanDisk, Micron, Western Digital and Seagate, among other companies involved in memory manufacture experienced stock price declines. Prices of memory kits also reduced in the following months, although still at inflated prices. == Causes == === HBM production displacement === HBM manufacturing requires significantly more wafer capacity per bit than standard DRAM modules. Industry sources reported that as manufacturers allocated increasing wafer capacity to HBM production to meet contracts with AI infrastructure providers, the supply of conventional DDR4 and DDR5 modules for consumer PCs and smartphones contracted sharply. By September 2025, Samsung Electronics had reportedly expanded its 1c DRAM capacity to target 60,000 wafers per month specifically for HBM4 production, further diverting resources from consumer memory lines. === Geopolitical and trade barriers === The supply chain was further constrained by escalating trade tensions between the United States and China. Throughout 2025, fears of U.S. regulatory backlash and new tariff structures led major manufacturers like Samsung and SK Hynix to halt sales of older semiconductor manufacturing equipment to Chinese entities, effectively capping production capacity in the region. Additionally, proposed tariff policies by the U.S. administration in late 2025 prompted supply chain realignments, with Apple reportedly accelerating plans to source all U.S.-bound iPhones from India to avoid potential levies. === NAND flash capacity constraints === In the NAND flash segment, manufacturers prioritized higher-margin enterprise SSDs for data center applications while phasing out older process nodes more rapidly than anticipated. In November 2025, contract prices for NAND wafers increased by more than 60% month-over-month for certain product categories, with 512GB TLC experiencing the steepest rise as legacy manufacturing capacity was retired. == Impact on industry and consumers == === Manufacturer responses === Major PC manufacturers responded to component cost increases with significant price adjustments and supply chain strategies. Dell Technologies Chief Operating Officer Jeff Clarke stated during a November 2025 analyst call that the company had "never witnessed costs escalating at the current pace," describing tighter availability across DRAM, hard drives, and NAND flash memory. Analysts at Morgan Stanley downgraded Dell Technologies stock from "Overweight" to "Underweight" in late 2025, citing the company's heavy exposure to rising server memory costs. The firm warned that skyrocketing memory prices could significantly erode margins for server and PC OEMs. Conversely, Apple Inc. was reportedly less affected than its competitors, having secured long-term supply agreements for DRAM through the first quarter of 2026. Lenovo Chief Financial Officer Winston Cheng described the cost surge as "unprecedented" and disclosed that the company's memory inventories were approximately 50% above normal levels in anticipation of further price increases. === Consumer electronics sector === The shortage particularly affected smartphone manufacturers and other consumer electronics producers. DRAM prices reportedly rose by 172% throughout 2025, leading manufacturers like Samsung to halt new orders for DDR5 modules to reassess pricing structures and Micron to exit its 'Crucial' brand of consumer products. In Tokyo's Akihabara electronics district, retailers began limiting purchases of memory products to prevent hoarding, with prices for popular DDR5 memory modules more than doubling in some cases. Despite the broad trend of rising hardware costs, some companies engaged in aggressive pricing strategies to maintain market share; for example, Sony reduced the price of the PlayStation 5 by $100 for Black Friday 2025, potentially absorbing increased component costs to stimulate software ecosystem growth. Due to memory prices more than doubling in a single quarter, HP revealed in its Q1 2026 earnings call that memory costs account for 35% of PC build materials up from 15-18% previous quarter. Despite showing strong Q1 2026 earning driven by Windows 11 upgrade cycle and AI PC adoption, HP warned investors of low operating margins and up to double digit percentage decline for coming quarter. Trendforce, an IT analytics company, updated its forecast from 1.7% year-over-year growth in PC market to 2.6% year-over-year decline for 2026, amid backdrop of steadily increasing prices and supply crisis. Research and analytics firms, Gartner and IDC expect worldwide PC market to decline 10-11% and smartphone market to decline 8-9% in 2026. Gartner also projects that rising memory prices will make low-margin entry level laptops under 500 USD financially unviable in two years. The RAM shortage has delayed the release of Valve's second Steam Machine due to increased memory prices. The device was originally set to launch in early 2026. === AI infrastructure competition === Technology companies including Google, Amazon, Microsoft, and Meta Platforms placed open-ended orders with memory suppliers, indicating they would accept as much supply as available regardless of cost, according to Reuters sources. The limited supply of AI chips has been cited as a reason for the slow down in compute growth. In October 2025, OpenAI formally announced a strategic partnership using letters of intent with Samsung Electronics and SK Hynix

    Read more →
  • The AI Con

    The AI Con

    The AI Con: How to Fight Big Tech's Hype and Create the Future We Want is a 2025 non-fiction book by linguist Emily M. Bender and sociologist Alex Hanna. It argues that much of what is labeled "artificial intelligence" is a misleading term that obscures ordinary automation while concentrating power in a small number of technology firms. The book was published in May 2025 by Harper in the United States and Bodley Head in the United Kingdom. It was developed alongside the authors' long-running podcast Mystery AI Hype Theater 3000, which critiques exaggerated claims about AI. == Synopsis == The authors present AI as a marketing umbrella that encourages audiences to infer understanding and agency where none exist. They argue readers should treat such language skeptically and to separate specific automated tasks from broad claims of intelligence. The book describes a recurring hype cycle in which corporate narratives justify data and labor extraction, the replacement of human services with cheaper substitutes, and the diversion of attention from present harms to speculative futures. While acknowledging limited uses such as pattern recognition, the authors argue that contemporary systems are best understood as text and media generators shaped by training data and human labor, not as thinking or reasoning entities. A central theme is the social and environmental cost of scaling these systems, including increased energy and water use, the appropriation of creative work for training, and the outsourcing of ghost work to low-paid data workers worldwide. These costs are linked to workplace effects, with the authors arguing that automation rarely eliminates jobs outright and more often degrades them through surveillance, work intensification, and unpaid oversight. As alternatives to passive adoption, the authors propose concrete responses: asking precise questions about what is being automated and why, demanding transparency about data and evaluation, and practicing what they call strategic refusal when deployment conflicts with evidence or values. The book also develops a vocabulary for public debate, rejecting both boosterish and doomerish narratives as grounded in the same assumption that AI is a singular, autonomous force. The authors recommend reading strategies such as favoring trusted human sources over automated summaries and using humor to deflate inflated claims. They describe a link between language to policy and power, arguing that precise terminology can help policymakers and the public resist austerity-driven automation and demand accountability for errors and harms. == Reception == The Guardian praised the book's myth-busting approach and its analysis of how hype erodes cultural and civic life by normalizing synthetic media as a substitute for human judgment. Kirkus Reviews described it as a contrarian account that catalogs concrete risks while cutting through speculative predictions. An interview in Business Insider highlighted the authors' accessible frameworks, including their proposal to describe chatbots as conversation simulators and to evaluate systems in terms of values, labor, and evidence. Coverage in GeekWire emphasized the book's call for resistance through collective bargaining, stronger data rights, and a norm of rejecting deployments that fail basic standards of necessity and evaluation. Some reviews were more critical. A review in LLRX argued that the book's tone could be overly polemical and that it gave limited attention to potential benefits claimed for generative systems. Coverage in the Financial Times, focused on Bender's broader public scholarship, situated the book within her long-standing critique of anthropomorphic narratives about large language models and her advocacy for more democratic oversight of automated systems.

    Read more →
  • Clesh

    Clesh

    Clesh (clip load edit share) is a cloud-based video editing platform, created by Forbidden Technologies plc, designed for the consumers, prosumers, and online communities to integrate user-generated content. The core technology is based on FORscene which is geared towards professionals working for example in broadcasting, news media, post production. Video, audio, and graphical content is uploaded to Clesh via a standard web browser, a mobile device such as a phone / tablet, or desktop software for DV capture over FireWire. The hosted material can then be reviewed, searched, edited, and published online by anyone with a standard web browser or compatible mobile device. Clesh supports storyboard shot selection, frame-accurate editing, transitions and various other functions such as; pan, zoom, colour and light correction, and audio levels. Content can be published in formats for example; Podcast, Mpeg2, HTML video or in a proprietary Java format. Cloud-based software provides greater scope for sharing information and collaborating compared to LAN or desktop based systems. Users of cloud-based software rely on the cloud's owner for adequate security, performance and resilience. Clesh does not assert any rights over uploaded content in contrast to other platforms (such as YouTube). All rights to any content uploaded to Clesh remain with the Author. == Features == Some of the services available to Clesh users: Access via Java enabled desktops or Android smartphones or tablets Real-time video rendering including effects and transitions Multiple audio tracks Secured log-on Frame accurate timeline for fine cut editing Logging / meta-data annotation assigns text to portions of video (usable by Clesh and web search engines) Storyboard assembles rough cuts using drag-and-drop Import, host, organise and search for media (DV tape and various video, audio, and still image formats) Publish content to in formats such as podcast, MPEG-2, web (Java Applet), Flash, Ogg, HTML and JPEG Chatrooms to talk to other Clesh users Showreel (a gallery for publishing material visible to internet users) Moderation for approval of material prior to distribution downstream Re-branding and integration support for white-label deployment == Technology == Clesh is based on the same technology as FORscene. An array of servers on the internet backbone provide the cloud computing platform to host Clesh. As a white-label solution Clesh would be branded and hosted per the client requirement. == User interface == End-users access Clesh on clients such as standard Java-enabled Web Browsers and / or Android enabled mobile devices such as tablets and smartphones. == History == Clesh was launched January 2006 and subject to several upgrades during the year to extend functionality including; storyboard, podcasting, moderation, chat and a showreel. During 2007 consumers are offered Clesh via a subscription model. Upgrades include Web Start and graphics upload. Mr Paparazzi selects Clesh as the platform to host its video offering and TrueTube does the same in 2008 by choosing to use Clesh to manage its video portal. Several further upgrades are applied and include; better audio quality, image enhancement controls, transitions, fades, titles, and additional publishing options such as JPEG. In 2010 a version of Clesh is demonstrated on an Android OS tablet device (Samsung Galaxy S Tab), and several upgrades are applied including; HTML publishing, pan, zoom, and overlays.

    Read more →
  • Slopaganda

    Slopaganda

    Slopaganda is a portmanteau of "AI slop" and "propaganda", referring to AI-generated content designed to manipulate beliefs, emotions, and political decision-making at scale. The term is credited to Michał Klincewicz, an assistant professor in the Department of Computational Cognitive Science at Tilburg University, in 2025. == Definition == Slopaganda is distinguished from traditional propaganda by three features: scale, scope, and speed. Generative AI makes it possible to produce large volumes of content quickly and at low cost, allows for highly personalised and targeted messaging to specific sub-audiences, and leverages the hyper-connectivity of social networks to accelerate dissemination beyond what conventional media could achieve. Unlike traditional propaganda, which delivers a uniform message to all recipients, slopaganda can be micro-targeted — tailored to individuals based on estimated prior beliefs to reinforce political biases or emotional associations. The authors note that it need not aim at literal deception: much slopaganda is expressive rather than truth-apt, designed to create emotional associations rather than false factual beliefs. == Relation to AI slop == Slopaganda is a subset of AI slop — low-quality, mass-produced AI-generated content — distinguished by intent. Where AI slop may be produced indifferently for commercial or engagement-farming purposes, slopaganda is deployed with a deliberate political or ideological goal. == Notable examples == Examples discussed by the term's originators include Donald Trump's prolific use of AI in Truth Social posts and Iranian Lego-themed music videos. AI-generated videos posted by the White House mixing real military footage with clips from films and video games; and deepfake audio imitating political candidates during the 2024 US presidential campaign have also been given the label slopaganda.

    Read more →
  • Situated approach (artificial intelligence)

    Situated approach (artificial intelligence)

    In artificial intelligence research, the situated approach builds agents that are designed to behave effectively successfully in their environment. This requires designing AI "from the bottom-up" by focussing on the basic perceptual and motor skills required to survive. The situated approach gives a much lower priority to abstract reasoning or problem-solving skills. The approach was originally proposed as an alternative to traditional approaches (that is, approaches popular before 1985 or so). After several decades, classical AI technologies started to face intractable issues (e.g. combinatorial explosion) when confronted with real-world modeling problems. All approaches to address these issues focus on modeling intelligences situated in an environment. They have become known as the situated approach to AI. == Emergence of a concept == === From traditional AI to Nouvelle AI === During the late 1980s, the approach now known as Nouvelle AI (Nouvelle means new in French) was pioneered at the MIT Artificial Intelligence Laboratory by Rodney Brooks. As opposed to classical or traditional artificial intelligence, Nouvelle AI purposely avoided the traditional goal of modeling human-level performance, but rather tries to create systems with intelligence at the level of insects, closer to real-world robots. But eventually, at least at MIT new AI did lead to an attempt for humanoid AI in the Cog Project. === From Nouvelle AI to behavior-based and situated AI === The conceptual shift introduced by nouvelle AI flourished in the robotics area, given way to behavior-based robotics (BBR), a methodology for developing AI based on a modular decomposition of intelligence. It was made famous by Rodney Brooks: his subsumption architecture was one of the earliest attempts to describe a mechanism for developing BBAI. It is extremely popular in robotics and to a lesser extent to implement intelligent virtual agents because it allows the successful creation of real-time dynamic systems that can run in complex environments. For example, it underlies the intelligence of the Sony Aibo and many RoboCup robot teams. Realizing that in fact all these approaches were aiming at building not an abstract intelligence, but rather an intelligence situated in a given environment, they have come to be known as the situated approach. In fact, this approach stems out from early insights of Alan Turing, describing the need to build machines equipped with sense organs to learn directly from the real-world instead of focusing on abstract activities, such as playing chess. == Definitions == Classically, a software entity is defined as a simulated element, able to act on itself and on its environment, and which has an internal representation of itself and of the outside world. An entity can communicate with other entities, and its behavior is the consequence of its perceptions, its representations, and its interactions with the other entities. === AI loop === Simulating entities in a virtual environment requires simulating the entire process that goes from a perception of the environment, or more generally from a stimulus, to an action on the environment. This process is called the AI loop and technology used to simulate it can be subdivided in two categories. Sensorimotor or low-level AI deals with either the perception problem (what is perceived?) or the animation problem (how are actions executed?). Decisional or high-level AI deals with the action selection problem (what is the most appropriate action in response to a given perception, i.e. what is the most appropriate behavior?). === Traditional or symbolic AI === There are two main approaches in decisional AI. The vast majority of the technologies available on the market, such as planning algorithms, finite-state machines (FSA), or expert systems, are based on the traditional or symbolic AI approach. Its main characteristics are: It is top-down: it subdivides, in a recursive manner, a given problem into a series of sub-problems that are supposedly easier to solve. It is knowledge-based: it relies on a symbolic description of the world, such as a set of rules. However, the limits of traditional AI, which goal is to build systems that mimic human intelligence, are well-known: inevitably, a combinatorial explosion of the number of rules occurs due to the complexity of the environment. In fact, it is impossible to predict all the situations that will be encountered by an autonomous entity. === Situated or behavioral AI === In order to address these issues, another approach to decisional AI, also known as situated or behavioral AI, has been proposed. It does not attempt to model systems that produce deductive reasoning processes, but rather systems that behave realistically in their environment. The main characteristics of this approach are the following: It is bottom-up: it relies on elementary behaviors, which can be combined to implement more complex behaviors. It is behavior-based: it does not rely on a symbolic description of the environment, but rather on a model of the interactions of the entities with their environment. The goal of situated AI is to model entities that are autonomous in their environment. This is achieved thanks to both the intrinsic robustness of the control architecture, and its adaptation capabilities to unforeseen situations. === Situated agents === In artificial intelligence and cognitive science, the term situated refers to an agent which is embedded in an environment. The term situated is commonly used to refer to robots, but some researchers argue that software agents can also be situated if: they exist in a dynamic (rapidly changing) environment, which they can manipulate or change through their actions, and which they can sense or perceive. Examples might include web-based agents, which can alter data or trigger processes (such as purchases) over the Internet, or virtual-reality bots which inhabit and change virtual worlds, such as Second Life. Being situated is generally considered to be part of being embodied, but it is useful to consider each perspective individually. The situated perspective emphasizes that intelligent behavior derives from the environment and the agent's interactions with it. The nature of these interactions are defined by an agent's embodiment. == Implementation principles == === Modular decomposition === The most important attribute of a system driven by situated AI is that the intelligence is controlled by a set of independent semi-autonomous modules. In the original systems, each module was actually a separate device or was at least conceived of as running on its own processing thread. Generally, though, the modules are just abstractions. In this respect, situated AI may be seen as a software engineering approach to AI, perhaps akin to object oriented design. Situated AI is often associated with reactive planning, but the two are not synonymous. Brooks advocated an extreme version of cognitive minimalism which required initially that the behavior modules were finite-state machines and thus contained no conventional memory or learning. This is associated with reactive AI because reactive AI requires reacting to the current state of the world, not to an agent's memory or preconception of that world. However, learning is obviously key to realistic strong AI, so this constraint has been relaxed, though not entirely abandoned. === Action selection mechanism === The situated AI community has presented several solutions to modeling decision-making processes, also known as action selection mechanisms. The first attempt to solve this problem goes back to subsumption architectures, which were in fact more an implementation technique than an algorithm. However, this attempt paved the way to several others, in particular the free-flow hierarchies and activation networks. A comparison of the structure and performances of these two mechanisms demonstrated the advantage of using free-flow hierarchies in solving the action selection problem. However, motor schemas and process description languages are two other approaches that have been used with success for autonomous robots. == Notes and references == Arsenio, Artur M. (2004) Towards an embodied and situated AI, In: Proceedings of the International FLAIRS conference, 2004. (online) The Artificial Life Route To Artificial Intelligence: Building Embodied, Situated Agents, Luc Steels and Rodney Brooks Eds., Lawrence Erlbaum Publishing, 1995. (ISBN 978-0805815184) Rodney A. Brooks Cambrian Intelligence (MIT Press, 1999) ISBN 0-262-52263-2; collection of early papers including "Intelligence without representation" and "Intelligence without reason", from 1986 & 1991 respectively. Ronald C. Arkin Behavior-Based Robotics (MIT Press, 1998) ISBN 0-262-01165-4 Hendriks-Jansen, Horst (1996) Catching Ourselves in the Act: Situated Activity, Interactive Emergence, Evolution, and Human Thought. Cambridge, Mass.: MIT Press.

    Read more →
  • Feature engineering

    Feature engineering

    Feature engineering is a preprocessing step in supervised machine learning and statistical modeling which transforms raw data into a more effective set of inputs. Each input comprises several attributes, known as features. By providing models with relevant information, feature engineering significantly enhances their predictive accuracy and decision-making capability. Beyond machine learning, the principles of feature engineering are applied in various scientific fields, including physics. For example, physicists construct dimensionless numbers such as the Reynolds number in fluid dynamics, the Nusselt number in heat transfer, and the Archimedes number in sedimentation. They also develop first approximations of solutions, such as analytical solutions for the strength of materials in mechanics. == Clustering == One of the applications of feature engineering has been clustering of feature-objects or sample-objects in a dataset. Especially, feature engineering based on matrix decomposition has been extensively used for data clustering under non-negativity constraints on the feature coefficients. These include Non-Negative Matrix Factorization (NMF), Non-Negative Matrix-Tri Factorization (NMTF), Non-Negative Tensor Decomposition/Factorization (NTF/NTD), etc. The non-negativity constraints on coefficients of the feature vectors mined by the above-stated algorithms yields a part-based representation, and different factor matrices exhibit natural clustering properties. Several extensions of the above-stated feature engineering methods have been reported in literature, including orthogonality-constrained factorization for hard clustering, and manifold learning to overcome inherent issues with these algorithms. Other classes of feature engineering algorithms include leveraging a common hidden structure across multiple inter-related datasets to obtain a consensus (common) clustering scheme. An example is Multi-view Classification based on Consensus Matrix Decomposition (MCMD), which mines a common clustering scheme across multiple datasets. MCMD is designed to output two types of class labels (scale-variant and scale-invariant clustering), and: is computationally robust to missing information, can obtain shape- and scale-based outliers, and can handle high-dimensional data effectively. Coupled matrix and tensor decompositions are popular in multi-view feature engineering. == Predictive modelling == Feature engineering in machine learning and statistical modeling involves selecting, creating, transforming, and extracting data features. Key components include feature creation from existing data, transforming and imputing missing or invalid features, reducing data dimensionality through methods like Principal Components Analysis (PCA), Independent Component Analysis (ICA), and Linear Discriminant Analysis (LDA), and selecting the most relevant features for model training based on importance scores and correlation matrices. Features vary in significance. Even relatively insignificant features may contribute to a model. Feature selection can reduce the number of features to prevent a model from becoming too specific to the training data set (overfitting). Feature explosion occurs when the number of identified features is too large for effective model estimation or optimization. Common causes include: Feature templates - implementing feature templates instead of coding new features Feature combinations - combinations that cannot be represented by a linear system Feature explosion can be limited via techniques such as regularization, kernel methods, and feature selection. == Automation == Automation of feature engineering is a research topic that dates back to the 1990s. Machine learning software that incorporates automated feature engineering has been commercially available since 2016. Related academic literature can be roughly separated into two types: Multi-relational Decision Tree Learning (MRDTL) uses a supervised algorithm that is similar to a decision tree. Deep Feature Synthesis uses simpler methods. === Multi-relational Decision Tree Learning (MRDTL) === Multi-relational Decision Tree Learning (MRDTL) extends traditional decision tree methods to relational databases, handling complex data relationships across tables. It innovatively uses selection graphs as decision nodes, refined systematically until a specific termination criterion is reached. Most MRDTL studies base implementations on relational databases, which results in many redundant operations. These redundancies can be reduced by using techniques such as tuple id propagation. === Open-source implementations === There are a number of open-source libraries and tools that automate feature engineering on relational data and time series: featuretools is a Python library for transforming time series and relational data into feature matrices for machine learning. MCMD: An open-source feature engineering algorithm for joint clustering of multiple datasets. OneBM or One-Button Machine combines feature transformations and feature selection on relational data with feature selection techniques. OneBM helps data scientists reduce data exploration time allowing them to try and error many ideas in short time. On the other hand, it enables non-experts, who are not familiar with data science, to quickly extract value from their data with a little effort, time, and cost. getML community is an open source tool for automated feature engineering on time series and relational data. It is implemented in C/C++ with a Python interface. It has been shown to be at least 60 times faster than tsflex, tsfresh, tsfel, featuretools or kats. tsfresh is a Python library for feature extraction on time series data. It evaluates the quality of the features using hypothesis testing. tsflex is an open source Python library for extracting features from time series data. Despite being 100% written in Python, it has been shown to be faster and more memory efficient than tsfresh, seglearn or tsfel. seglearn is an extension for multivariate, sequential time series data to the scikit-learn Python library. tsfel is a Python package for feature extraction on time series data. kats is a Python toolkit for analyzing time series data. === Deep feature synthesis === The deep feature synthesis (DFS) algorithm beat 615 of 906 human teams in a competition. == Feature stores == The feature store is where the features are stored and organized for the explicit purpose of being used to either train models (by data scientists) or make predictions (by applications that have a trained model). It is a central location where you can either create or update groups of features created from multiple different data sources, or create and update new datasets from those feature groups for training models or for use in applications that do not want to compute the features but just retrieve them when it needs them to make predictions. A feature store includes the ability to store code used to generate features, apply the code to raw data, and serve those features to models upon request. Useful capabilities include feature versioning and policies governing the circumstances under which features can be used. Feature stores can be standalone software tools or built into machine learning platforms. == Alternatives == Feature engineering can be a time-consuming and error-prone process, as it requires domain expertise and often involves trial and error. Deep learning algorithms may be used to process a large raw dataset without having to resort to feature engineering. However, deep learning algorithms still require careful preprocessing and cleaning of the input data. In addition, choosing the right architecture, hyperparameters, and optimization algorithm for a deep neural network can be a challenging and iterative process.

    Read more →
  • System Service Descriptor Table

    System Service Descriptor Table

    The System Service Descriptor Table (SSDT) is an internal dispatch table within Microsoft Windows. == Function == The SSDT maps syscalls to kernel function addresses. When a syscall is issued by a user space application, it contains the service index as parameter to indicate which syscall is called. The SSDT is then used to resolve the address of the corresponding function within ntoskrnl.exe. In modern Windows kernels, two SSDTs are used: One for generic routines (KeServiceDescriptorTable) and a second (KeServiceDescriptorTableShadow) for graphical routines. A parameter passed by the calling userspace application determines which SSDT shall be used. == Hooking == Modification of the SSDT allows to redirect syscalls to routines outside the kernel. These routines can be either used to hide the presence of software or to act as a backdoor to allow attackers permanent code execution with kernel privileges. For both reasons, hooking SSDT calls is often used as a technique in both Windows kernel mode rootkits and antivirus software. In 2010, many computer security products which relied on hooking SSDT calls were shown to be vulnerable to exploits using race conditions to attack the products' security checks.

    Read more →
  • Compute (machine learning)

    Compute (machine learning)

    In machine learning and deep learning, compute is the amount of computing power or computational resources required to train machine learning models and large language models. More broadly, compute is the computational power or resources necessary for a computer or computer program to function. == Definition == Compute is commonly defined as the amount of computing power or computational resources required to train machine learning and large language models. The term "compute" has also been more broadly applied to cloud computing, referencing processing power, memory, networking, storage, and other resources required for the computation of any program. Compute is measured in petaflop/s-days and is used to document AI training. A petaflop/s-day (pfs-day) consists of performing 1015 neural net operations per second for one day, or a total of about 1020 operations. The compute-time product serves as a mental convenience, similar to kilowatt-hour for energy. An amount of compute is meant to give an idea of the number of actual operations performed. == History == In a 2018 analysis titled "AI and compute", artificial intelligence company OpenAI introduced the concept of compute. OpenAI identified two eras of training AI systems in terms of compute-usage. From 1959 to 2012, compute roughly followed Moore’s law. Between 2012 and 2018, the amount of compute used in the largest AI training runs increased exponentially, growing by more than 300,000 times — roughly doubling every 3.4 months. By comparison, Moore’s Law doubled every two years over the same period. One of the largest models, released in 2020, used 600,000 times more computing power than the 2012 model. After 2020, compute growth began to slow down, with the compute needed for the largest AI models continuing to slow down in 2023. The notion of compute has become increasingly used from the mid-2020s onwards. == Compute growth and AI progress == Larger AI models trained on more data and using more computational resources, tend to perform better. This happens even if the algorithms themselves remain unchanged. As early as 2018, OpenAI noted the exponential increase in compute to be have a key role in AI progress. OpenAI considers three factors drive the advance of AI: algorithmic innovation, data, and the amount of compute available for training. AI models with more compute not only improve in the tasks they were trained on but can develop emergent abilities. Incremental improvements can lead to more abrupt leaps in capabilities. AI provider SpaceXAI said in 2026 that their AI progress is driven by compute and used it a key metric in the AI training of its supercomputer Colossus, the which contains 1 million GPUs. Anthropic has a contract of $1.25 billion per month with SpaceXAI to buy all the compute capacity at Colossus 1 data center. === Criticism and policy === Increasing, promoting or constraining progress in artificial intelligence has often be done via controlling the amount of compute. Policymarkers have enacted policies and provided support to make compute resources more accessible to domestic AI researchers. In a January 2022 report, the Center for Security and Emerging Technology (CSET) suggested to institutions that increasingly powerful and generalizable AI (AGI) will likely require other strategies than maximizing compute. Some AI researchers are also concerned that government might exclusively focus on scaling compute instead of other strategies. The CSET has reported on the various bottlenecks which could explain why deep learning needs for compute have slow down: training is expensive and training extremely large models generates traffic jams across many processors that are difficult to manage. there is a limited supply of AI chips (see AI chip memory shortage). CSET advances that the main resource is human capital, specifically talented researchers — according to a 2023 published survey of more than 400 AI researchers, academic and private sector workers. The survey found that AI researchers are not primarily or exclusively constrained by compute access. However, both academic and industry AI researchers equally report concerns that insufficient compute could prevent them from contributing meaningfully to AI research in the future. High compute users are more concerned about compute access. When asked about which resource provided by the government would be the most useful to them, some AI researchers select compute, other prefer grant funding. For this goal, CSET advised policymakers to ensure that even researchers with smaller budgets could effectively contribute to AI research. Other proposed strategies include using contemporary AI algorithms, managing modern AI infrastructure or focusing on interdisciplinary work between the AI field and other fields of computer science. A 2024 study on compute access found that academic-only AI research teams often have less compute intensive research topics, especially foundation models, compared to industry AI labs. As a consequence, academia is likely to play a smaller role in advancing such techniques. The researchers suggest nationally-sponsored computing infrastructure as well as open science initiatives to boost academic compute access. === Data === A 2022 study found that current large language models are significantly under-trained, a consequence of focusing on scaling language models whilst keeping the amount of training data constant. By training over 400 language models of various parameter and token size, they found that "for compute-optimal training", the model size and the number of training tokens should ideally be scaled equally: for every doubling of model size the number of training tokens should also be doubled.

    Read more →
  • Sycophancy (artificial intelligence)

    Sycophancy (artificial intelligence)

    In the field of artificial intelligence, sycophancy is a tendency of large language models (LLMs) and other AI assistants to tailor their responses to what they predict the user wants to hear rather than to what is accurate or warranted. The behavior takes several forms: an assistant may agree with a user's stated opinion even when the user is mistaken; it may abandon a correct answer after a challenge such as "are you sure?"; it may validate beliefs, decisions or self-presentation regardless of merit; or it may praise the user, their work or their ideas in unwarranted terms. The word is borrowed from the ordinary English term for fawning flattery, and is used in AI alignment and AI safety research to describe a class of misalignment failures associated with training on human feedback. Researchers at Anthropic first documented the behavior systematically in 2022. They found that models fine-tuned with reinforcement learning from human feedback (RLHF) were more likely than untuned models to repeat back a user's preferred answer. A 2023 follow-up paper, "Towards Understanding Sycophancy in Language Models", showed that five frontier assistants from OpenAI, Anthropic and Meta all exhibited the behavior, and traced its origin to biases in the human preference data used during training. Later work documented sycophancy in mathematics, medicine, academic peer review and other domains, and identified a broader category called "social sycophancy" affecting an assistant's emotional and interpersonal responses. The issue drew widespread public attention in April 2025 after OpenAI rolled back an update to its GPT-4o model. Users had reported that the assistant praised dangerous decisions, endorsed delusional thinking and offered exaggerated compliments for trivial prompts. OpenAI's post-mortem attributed the change in behavior to an additional training signal based on user thumbs-up and thumbs-down feedback. That episode, together with reporting in The New York Times, Rolling Stone and elsewhere on users drawn into delusional thinking through prolonged chatbot interaction, has been cited in litigation and in academic studies as evidence that sycophancy poses risks to user well-being. Proposed mitigations include fine-tuning on synthetic data that rewards disagreement with incorrect user statements, editing the small subset of model parameters causally responsible for the behavior, changes to the dialogue or system prompt, and benchmarks designed to surface sycophantic behavior before models are released. == Causes == The dominant explanation points to RLHF, the standard technique for aligning chat assistants with user expectations. Human annotators rank candidate model responses; a reward model is trained to predict those rankings; and the language model is then optimized against the reward model. Because human raters tend to prefer outputs that confirm their existing beliefs or flatter their work, the pipeline systematically rewards responses that agree with the annotator. Perez and colleagues at Anthropic published the first large-scale empirical evidence of the effect in 2022. They reported that RLHF training increased the probability that a model would repeat back a dialog user's preferred answer, and that larger models exhibited the behavior more strongly. Sharma and colleagues, the following year, went further and examined Anthropic's own preference data directly. Both the human raters and the reward models trained on their judgments preferred convincingly written sycophantic responses to truthful ones at a non-negligible rate. Wei and co-authors at Google DeepMind found similar results in the PaLM family, observing that both model scale and instruction tuning increased sycophancy on opinion questions. The behavior is often classified as a form of reward hacking, in which an optimization process exploits a flaw in its reward signal rather than achieving the intended objective. OpenAI's post-mortem of the April 2025 GPT-4o incident identified a more specific mechanism. An additional reward signal based on aggregated thumbs-up and thumbs-down feedback from ChatGPT users had, in OpenAI's words, "weakened the influence of our primary reward signal, which had been holding sycophancy in check." Separately, an Anthropic interpretability paper from 2025 located a linear direction in a model's internal activations corresponding to sycophantic behavior, and showed that such "persona vectors" could be used to flag sycophancy-inducing training data and to steer models away from the trait at inference time. == Measurement == The Anthropic team released SycophancyEval with its 2023 paper, supplying test sets for each of the four canonical behaviors. Two further benchmarks from Stanford followed in 2025. SycEval, applied to mathematical and medical reasoning tasks, reported an overall sycophancy rate of 58 per cent across the GPT-4o, Claude and Gemini models tested. ELEPHANT, aimed at social sycophancy, found that the eleven LLMs evaluated affirmed posts that the Reddit community r/AmITheAsshole had judged inappropriate in 42 per cent of cases, and preserved a user's face 45 percentage points more often than human respondents did. Domain-specific benchmarks have followed. BrokenMath tests robustness to plausible-looking but false mathematical claims drawn from competition problems, and reports that the best evaluated model was sycophantic in 29 per cent of cases. SYCON-Bench measures how many dialogue turns are required before a model abandons a correct position. Visual sycophancy in multimodal models has been examined with MM-SY and PENDULUM. A 2026 study by researchers at the Massachusetts Institute of Technology reported that personalization features, which adapt assistants to individual users over repeated sessions, can intensify social sycophancy. == Notable incidents == === GPT-4o rollback (April 2025) === On 25 April 2025, OpenAI completed the rollout of an update to GPT-4o, the default model used in ChatGPT at the time. Within days, users reported that the assistant had begun praising trivial messages in extravagant terms, endorsing impulsive or dangerous decisions, and reinforcing strong emotional statements without pushback. Widely shared examples included the model congratulating a user who reported stopping prescribed psychiatric medication, and praising a business plan to sell "shit on a stick" as venture-capital ready. OpenAI's chief executive, Sam Altman, wrote on 27 April that recent updates had made the model "too sycophant-y and annoying" and said fixes were in progress. The company began reverting the update on 28 April and completed the rollback for free users by 30 April. Two post-mortems followed: a short note on 29 April and a longer technical follow-up, "Expanding on what we missed with sycophancy", on 2 May. Both attributed the regression to a new training signal based on user thumbs-up and thumbs-down feedback, to inadequate pre-launch evaluation for sycophantic drift, and to the dismissal of qualitative concerns raised by internal testers before release. Reporting in CNN, Fortune and Bloomberg News treated the incident as a turning point in public awareness of the problem. === Chatbot-related psychological harm === From mid-2025 onward, news reports began to link sycophantic chatbot behavior to acute psychological harm. In June 2025, The New York Times technology reporter Kashmir Hill published an investigation centered on Eugene Torres, a Manhattan accountant with no history of mental illness, who developed a sustained delusional episode after a series of conversations with ChatGPT about simulation theory. According to the article, the assistant encouraged Torres to stop taking prescribed medication, to cut off friends and family, and at one point told him that he could fly from a nineteen-story building if he "truly believed". Futurism and Rolling Stone ran parallel investigations documenting other cases in which heavy use of ChatGPT had been associated with delusional thinking, involuntary commitment or, in at least one case, the death of a user with a pre-existing psychiatric diagnosis. A 2026 paper by researchers at the Massachusetts Institute of Technology and the University of Washington put forward a formal Bayesian model. It showed that even an ideally rational user could be drawn into what the authors call "delusional spiraling" when interacting with a sufficiently sycophantic assistant, and that the effect was not eliminated by suppressing hallucinations or by warning users in advance. The lawsuit Raine v. OpenAI, filed in San Francisco Superior Court in August 2025 by the parents of a sixteen-year-old who had died by suicide, alleges that "heightened sycophancy" was a design feature of ChatGPT that contributed to their son's death; it is the first wrongful-death suit against a large language-model provider. === Wider commentary === Mainstream coverage in outlets including The New York Times, The Washington Pos

    Read more →