Sycophancy (artificial intelligence)

Sycophancy (artificial intelligence)

In the field of artificial intelligence, sycophancy is a tendency of large language models (LLMs) and other AI assistants to tailor their responses to what they predict the user wants to hear rather than to what is accurate or warranted. The behavior takes several forms: an assistant may agree with a user's stated opinion even when the user is mistaken; it may abandon a correct answer after a challenge such as "are you sure?"; it may validate beliefs, decisions or self-presentation regardless of merit; or it may praise the user, their work or their ideas in unwarranted terms. The word is borrowed from the ordinary English term for fawning flattery, and is used in AI alignment and AI safety research to describe a class of misalignment failures associated with training on human feedback. Researchers at Anthropic first documented the behavior systematically in 2022. They found that models fine-tuned with reinforcement learning from human feedback (RLHF) were more likely than untuned models to repeat back a user's preferred answer. A 2023 follow-up paper, "Towards Understanding Sycophancy in Language Models", showed that five frontier assistants from OpenAI, Anthropic and Meta all exhibited the behavior, and traced its origin to biases in the human preference data used during training. Later work documented sycophancy in mathematics, medicine, academic peer review and other domains, and identified a broader category called "social sycophancy" affecting an assistant's emotional and interpersonal responses. The issue drew widespread public attention in April 2025 after OpenAI rolled back an update to its GPT-4o model. Users had reported that the assistant praised dangerous decisions, endorsed delusional thinking and offered exaggerated compliments for trivial prompts. OpenAI's post-mortem attributed the change in behavior to an additional training signal based on user thumbs-up and thumbs-down feedback. That episode, together with reporting in The New York Times, Rolling Stone and elsewhere on users drawn into delusional thinking through prolonged chatbot interaction, has been cited in litigation and in academic studies as evidence that sycophancy poses risks to user well-being. Proposed mitigations include fine-tuning on synthetic data that rewards disagreement with incorrect user statements, editing the small subset of model parameters causally responsible for the behavior, changes to the dialogue or system prompt, and benchmarks designed to surface sycophantic behavior before models are released. == Causes == The dominant explanation points to RLHF, the standard technique for aligning chat assistants with user expectations. Human annotators rank candidate model responses; a reward model is trained to predict those rankings; and the language model is then optimized against the reward model. Because human raters tend to prefer outputs that confirm their existing beliefs or flatter their work, the pipeline systematically rewards responses that agree with the annotator. Perez and colleagues at Anthropic published the first large-scale empirical evidence of the effect in 2022. They reported that RLHF training increased the probability that a model would repeat back a dialog user's preferred answer, and that larger models exhibited the behavior more strongly. Sharma and colleagues, the following year, went further and examined Anthropic's own preference data directly. Both the human raters and the reward models trained on their judgments preferred convincingly written sycophantic responses to truthful ones at a non-negligible rate. Wei and co-authors at Google DeepMind found similar results in the PaLM family, observing that both model scale and instruction tuning increased sycophancy on opinion questions. The behavior is often classified as a form of reward hacking, in which an optimization process exploits a flaw in its reward signal rather than achieving the intended objective. OpenAI's post-mortem of the April 2025 GPT-4o incident identified a more specific mechanism. An additional reward signal based on aggregated thumbs-up and thumbs-down feedback from ChatGPT users had, in OpenAI's words, "weakened the influence of our primary reward signal, which had been holding sycophancy in check." Separately, an Anthropic interpretability paper from 2025 located a linear direction in a model's internal activations corresponding to sycophantic behavior, and showed that such "persona vectors" could be used to flag sycophancy-inducing training data and to steer models away from the trait at inference time. == Measurement == The Anthropic team released SycophancyEval with its 2023 paper, supplying test sets for each of the four canonical behaviors. Two further benchmarks from Stanford followed in 2025. SycEval, applied to mathematical and medical reasoning tasks, reported an overall sycophancy rate of 58 per cent across the GPT-4o, Claude and Gemini models tested. ELEPHANT, aimed at social sycophancy, found that the eleven LLMs evaluated affirmed posts that the Reddit community r/AmITheAsshole had judged inappropriate in 42 per cent of cases, and preserved a user's face 45 percentage points more often than human respondents did. Domain-specific benchmarks have followed. BrokenMath tests robustness to plausible-looking but false mathematical claims drawn from competition problems, and reports that the best evaluated model was sycophantic in 29 per cent of cases. SYCON-Bench measures how many dialogue turns are required before a model abandons a correct position. Visual sycophancy in multimodal models has been examined with MM-SY and PENDULUM. A 2026 study by researchers at the Massachusetts Institute of Technology reported that personalization features, which adapt assistants to individual users over repeated sessions, can intensify social sycophancy. == Notable incidents == === GPT-4o rollback (April 2025) === On 25 April 2025, OpenAI completed the rollout of an update to GPT-4o, the default model used in ChatGPT at the time. Within days, users reported that the assistant had begun praising trivial messages in extravagant terms, endorsing impulsive or dangerous decisions, and reinforcing strong emotional statements without pushback. Widely shared examples included the model congratulating a user who reported stopping prescribed psychiatric medication, and praising a business plan to sell "shit on a stick" as venture-capital ready. OpenAI's chief executive, Sam Altman, wrote on 27 April that recent updates had made the model "too sycophant-y and annoying" and said fixes were in progress. The company began reverting the update on 28 April and completed the rollback for free users by 30 April. Two post-mortems followed: a short note on 29 April and a longer technical follow-up, "Expanding on what we missed with sycophancy", on 2 May. Both attributed the regression to a new training signal based on user thumbs-up and thumbs-down feedback, to inadequate pre-launch evaluation for sycophantic drift, and to the dismissal of qualitative concerns raised by internal testers before release. Reporting in CNN, Fortune and Bloomberg News treated the incident as a turning point in public awareness of the problem. === Chatbot-related psychological harm === From mid-2025 onward, news reports began to link sycophantic chatbot behavior to acute psychological harm. In June 2025, The New York Times technology reporter Kashmir Hill published an investigation centered on Eugene Torres, a Manhattan accountant with no history of mental illness, who developed a sustained delusional episode after a series of conversations with ChatGPT about simulation theory. According to the article, the assistant encouraged Torres to stop taking prescribed medication, to cut off friends and family, and at one point told him that he could fly from a nineteen-story building if he "truly believed". Futurism and Rolling Stone ran parallel investigations documenting other cases in which heavy use of ChatGPT had been associated with delusional thinking, involuntary commitment or, in at least one case, the death of a user with a pre-existing psychiatric diagnosis. A 2026 paper by researchers at the Massachusetts Institute of Technology and the University of Washington put forward a formal Bayesian model. It showed that even an ideally rational user could be drawn into what the authors call "delusional spiraling" when interacting with a sufficiently sycophantic assistant, and that the effect was not eliminated by suppressing hallucinations or by warning users in advance. The lawsuit Raine v. OpenAI, filed in San Francisco Superior Court in August 2025 by the parents of a sixteen-year-old who had died by suicide, alleges that "heightened sycophancy" was a design feature of ChatGPT that contributed to their son's death; it is the first wrongful-death suit against a large language-model provider. === Wider commentary === Mainstream coverage in outlets including The New York Times, The Washington Pos

Video renderer

A video renderer is software that processes a video file and sends it sequentially to the video display controller card for display on a computer screen. An example of a video renderer, is the VMR-7 that was used by Microsoft's DirectShow. An example of a UNIX video renderer is the one container within GStreamer. Commonly used video renderers are: Enhanced Video Renderer VMR9 Renderless Haali's Video Renderer Madvr Video Renderer JRVR, a part of JRiver Media Center

AI-assisted targeting in the Gaza Strip

As part of the Gaza war, the Israel Defense Forces (IDF) have used artificial intelligence to rapidly and automatically perform much of the process of determining what to bomb. Israel has greatly expanded the bombing of the Gaza Strip, which in previous wars had been limited by the Israeli Air Force running out of targets. These tools include the Gospel, an AI which automatically reviews surveillance data looking for buildings, equipment and people thought to belong to the enemy, and upon finding them, recommends bombing targets to a human analyst who may then decide whether to pass it along to the field. Another is Lavender, an "AI-powered database" which lists tens of thousands of Palestinian men linked by AI to Hamas or Palestinian Islamic Jihad, and which is also used for target recommendation. Critics have argued the use of these AI tools puts civilians at risk, blurs accountability, and results in militarily disproportionate violence in violation of international humanitarian law. == The Gospel == Israel uses an AI system dubbed "Habsora", "the Gospel", to determine which targets the Israeli Air Force would bomb. It automatically provides a targeting recommendation to a human analyst, who decides whether to pass it along to soldiers in the field. The recommendations can be anything from individual fighters, rocket launchers, Hamas command posts, to private homes of suspected Hamas or Islamic Jihad members. AI can process military intelligence far faster than humans. Retired Lt Gen. Aviv Kohavi, head of the IDF until 2023, stated that the system could produce 100 bombing targets in Gaza a day, with real-time recommendations which ones to attack, where human analysts might produce 50 a year. A lecturer interviewed by NPR estimated these figures as 50–100 targets in 300 days for 20 intelligence officers, and 200 targets within 10–12 days for the Gospel. === Technological background === The Gospel uses machine learning, where an AI is tasked with identifying commonalities in vast amounts of data (e.g. scans of cancerous tissue, photos of a facial expression, surveillance of Hamas members identified by human analysts), then looking for those commonalities in new material. What information the Gospel uses is not known, but it is thought to combine surveillance data from diverse sources in enormous amounts. Recommendations are based on pattern-matching. A person with enough similarities to other people labeled as enemy combatants may be labelled a combatant themselves. Regarding the suitability of AIs for the task, NPR cited Heidy Khlaaf, engineering director of AI Assurance at the technology security firm Trail of Bits, as saying "AI algorithms are notoriously flawed with high error rates observed across applications that require precision, accuracy, and safety." Bianca Baggiarini, lecturer at the Australian National University's Strategic and Defence Studies Centre wrote AIs are "more effective in predictable environments where concepts are objective, reasonably stable, and internally consistent." She contrasted this with telling the difference between a combatant and non-combatant, which even humans frequently can't do. Khlaaf went on to point out that such a system's decisions depend entirely on the data it's trained on, and are not based on reasoning, factual evidence or causation, but solely on statistical probability. === Operation === The IAF ran out of targets to strike in the 2014 war and 2021 crisis. In an interview on France 24, investigative journalist Yuval Abraham of +972 Magazine stated that to maintain military pressure, and due to political pressure to continue the war, the military would bomb the same places twice. Since then, the integration of AI tools has significantly sped up the selection of targets. In early November, the IDF stated more than 12,000 targets in Gaza had been identified by the target administration division that uses the Gospel. NPR wrote on December 14 that it was unclear how many targets from the Gospel had been acted upon, but that the Israeli military said it was currently striking as many as 250 targets a day. The bombing, too, has intensified to what the December 14 article called an astonishing pace: the Israeli military stated at the time it had struck more than 22,000 targets inside Gaza, at a daily rate more than double that of the 2021 conflict, more than 3,500 of them since the collapse of the truce on December 1. Early in the offensive the head of the Air Force stated his forces only struck military targets, but added: "We are not being surgical." Once a recommendation is accepted, another AI, Fire Factory, cuts assembling the attack down from hours to minutes by calculating munition loads, prioritizing and assigning targets to aircraft and drones, and proposing a schedule, according to a pre-war Bloomberg article that described such AI tools as tailored for a military confrontation and proxy war with Iran. One change that The Guardian noted is that since senior Hamas leaders disappear into tunnels at the start of an offensive, systems such as the Gospel have allowed the IDF to locate and attack a much larger pool of more junior Hamas operatives. It cited an official who worked on targeting decisions in previous Gaza operations as saying that while the homes of junior Hamas members had previously not been targeted for bombing, the official believes the houses of suspected Hamas operatives were now targeted regardless of rank. In the France 24 interview, Abraham, of +972 Magazine, characterized this as enabling the systematization of dropping a 2000 lb bomb into a home to kill one person and everybody around them, something that had previously been done to a very small group of senior Hamas leaders. NPR cited a report by +972 Magazine and its sister publication Local Call as asserting the system is being used to manufacture targets so that Israeli military forces can continue to bombard Gaza at an enormous rate, punishing the general Palestinian population. NPR noted it had not verified this; it was unclear how many targets are being generated by AI alone, but there had been a substantial increase in targeting, with an enormous civilian toll. In principle, the combination of a computer's speed to identify opportunities and a human's judgment to evaluate them can enable more precise attacks and fewer civilian casualties. Israeli military and media have emphasized this capacity to minimize harm to non-combatants. Richard Moyes, researcher and head of the NGO Article 36, pointed to "the widespread flattening of an urban area with heavy explosive weapons" to question these claims, while Lucy Suchman, professor emeritus at Lancaster University, described the bombing as "aimed at maximum devastation of the Gaza Strip". The Guardian wrote that when a strike was authorized on private homes of those identified as Hamas or Islamic Jihad operatives, target researchers knew in advance the expected number of civilians killed, each target had a file containing a collateral damage score stipulating how many civilians were likely to be killed in a strike, and according to a senior Israeli military source, operatives use a "very accurate" measurement of the rate of civilians evacuating a building shortly before a strike. "We use an algorithm to evaluate how many civilians are remaining. It gives us a green, yellow, red, like a traffic signal." ==== 2021 use ==== Kohavi compared the target division using the Gospel to a machine and stated that once the machine was activated in the war of May 2021, it generated 100 targets a day, with half of them being attacked, in contrast with 50 targets in Gaza per year beforehand. Approximately 200 targets came from the Gospel out of the 1,500 targets Israel struck in Gaza in the war, including both static and moving targets according to the military. The Jewish Institute for National Security of America's after action report identified an issue, stating the system had data on what was a target, but lacked data on what wasn't. The system depends entirely on training data, and intel that human analysts had examined and deemed didn't constitute a target had been discarded, risking bias. The vice president expressed his hopes this had since been rectified. === Organization === The Gospel is used by the military's target administration division (or Directorate of Targets or Targeting Directorate), which was formed in 2019 in the IDF's intelligence directorate to address the air force running out of targets to bomb, and which Kohavi described as "powered by AI capabilities" and including hundreds of officers of soldiers. In addition to its wartime role, The Guardian wrote it'd helped the IDF build a database of between 30,000 and 40,000 suspected militants in recent years, and that systems such as the Gospel had played a critical role in building lists of individuals authorized to be assassinated. The Gospel was developed by Unit 8200 of the Israeli Intelligence C

Sora (text-to-video model)

Sora was a text-to-video model and social media app developed by OpenAI. Using artificial intelligence, the model generated short video clips based on prompts, and could also extend existing short videos. In February 2024, OpenAI previewed examples of its output to the public, with the first generation of Sora released publicly for ChatGPT Plus and ChatGPT Pro users in the United States and Canada in December 2024. The second generation of Sora was released to select users in the US and Canada at the end of September 2025. Sora 2 integrated social media features into the app. The app was shut down on April 26, 2026 and the application programming interface (API) is planned to be discontinued on September 24, 2026, marking the end of the Sora AI brand as a whole. By default, the generator used copyrighted material in its videos, unless copyright holders actively opt out of having their content included. Videos contained a visible, moving digital watermark to prevent misuse, but a week after Sora 2's release, third-party programs became available which could remove the watermark. == Background == Several other models capable of generating video from text had been created prior to Sora, including Meta's Make‑A‑Video, Runway's Gen‑2 and Google Veo. OpenAI, the company behind Sora, had released DALL·E 3, the third of its DALL-E text-to-image models, in September 2023. == History == === Initial release === The team that developed Sora named it after the Japanese word for 'sky' to signify its "limitless creative potential". On February 15, 2024, OpenAI first previewed Sora by releasing multiple clips of high-definition videos that it had created, including an SUV driving down a mountain road, an animation of a "short fluffy monster" next to a candle, two people walking through Tokyo in the snow, and fake historical footage of the California gold rush. OpenAI stated that it was able to generate videos as long as one minute. The company then shared a technical report that highlighted the methods used to train the model. OpenAI CEO Sam Altman also posted a series of tweets responding to Twitter users' prompts with Sora-generated videos of the prompts. As of December 9, 2024, OpenAI had gradually made Sora available to the public for ChatGPT Pro and ChatGPT Plus users in the U.S. and Canada. Prior to this, the company had provided limited access to a small "red team", including experts in misinformation and bias, to perform adversarial testing on the model. The company also shared Sora with a small group of creative professionals, including video makers and artists, to seek feedback on its usefulness in creative fields. In February 2025, OpenAI announced plans to integrate Sora into ChatGPT by letting users generate Sora videos from the chatbot. === Sora 2 === Sora 2 was unveiled on September 30, 2025, with an iOS app at the same time, as well as an Android app two months later. All videos generated by the model feature a visible, moving watermark to prevent misuse of the tool. The previous version of Sora also added a safety watermark to allow viewers to distinguish between real and fictional content. On October 7, 404 Media reported that third-party programs that could remove the watermark from Sora 2 videos had become prevalent. Many outlets, such as Wired magazine, have noted that the Sora 2 app is overtly similar to TikTok in style and features. === Discontinuation === On March 24, 2026, OpenAI announced on X that it was discontinuing Sora in both the mobile app and the API. The Sora app was shut down on April 26, 2026, while the API is planned to be shut down on September 24, 2026. OpenAI's partnership with Disney, which included a licensing agreement allowing Disney characters to be used within Sora, was also coming to an end. The decision prompted British technology news website The Register to label OpenAI a "product-killer", following in the footsteps of other technology companies such as Google, Amazon Web Services, Broadcom, Cloud Software Group, and Netscape. OpenAI did not provide a specific reason for discontinuing Sora in its shutdown notice. The reports that emerged regarding this discontinuity linked the decision to computation shortages, cost pressures, and a broader shift toward core enterprise products. Following its public launch, Sora's worldwide users peaked at around a million before declining to fewer than 500,000, while the service cost an estimated $1 million per day to operate due to the computational demands of video generation. == Legal regulation == In November 2024, an API key for Sora access was leaked by a group of testers on Hugging Face who posted a manifesto stating that they were protesting that Sora was used for "art washing". OpenAI revoked all access three hours after the leak was made public and stated that "hundreds of artists" have shaped the development and that "participation is voluntary". At the time of its launch, Sora 2 allowed copyrighted content by default unless copyright holders contacted OpenAI to restrict the generation of their content on the platform. On October 3, 2025, OpenAI stated that a future update to Sora 2 would give copyright holders "more granular control" over the generation of copyrighted content, but the company did not state whether existing content would be removed. On October 6, the chairman of the MPA criticized OpenAI's approach to copyright with Sora 2. On December 11, 2025, the Walt Disney Company announced that it would invest $1 billion in OpenAI to allow users to generate more than 200 of its copyrighted characters on Sora 2. These characters include those from Disney Animation, Pixar, Marvel Studios, and Star Wars. == Capabilities and limitations == The technology behind Sora is an adaptation of the technology behind DALL-E 3. According to OpenAI, Sora is a diffusion transformer, a denoising latent diffusion model with one transformer as its denoiser. A video is generated in latent space by denoising 3D "patches", then transformed to standard space by a video decompressor. Recaptioning is employed to augment training data by using a video-to-text model to create detailed captions for videos. OpenAI trained the model using publicly available videos as well as copyrighted videos licensed for the purpose, but did not reveal the number or the exact source of the videos. Upon its release, OpenAI acknowledged some of Sora's shortcomings, including its limited capacity to simulate complex physics, to understand causality and to differentiate left from right. OpenAI also stated that, in adherence to the company's existing safety practices, Sora will restrict text prompts for sexual, violent, hateful or celebrity imagery, as well as content featuring existing intellectual property. Sora researcher Tim Brooks stated that the model learned how to create 3D graphics from its dataset alone, while fellow Sora researcher Bill Peebles said that the model automatically created different video angles without being prompted. According to OpenAI, Sora-generated videos are also tagged with C2PA metadata to indicate that they are AI-processed. === Comparison with other models === The Artificial Analysis have placed Sora 2 pro lower than other text-to-video AI generators in the market on its leaderboard. Other models, such as Seedance 2.0 from ByteDance, Runaway 4.5 from Runaway, and Kling 3.0 from KlingAI, have ranked higher than Sora 2.0. == Reception == === Positive === In 2024, Will Douglas Heaven of the MIT Technology Review called the demonstration videos "impressive", but noted that they must have been cherry-picked and may not be representative of Sora's typical output. Lisa Lacy of CNET called its example videos "remarkably realistic – except perhaps when a human face appears close up or when sea creatures are swimming". In October 2025, The New York Times remarked that the release of the Sora 2 app in September 2025 was "jaw-dropping (for better and worse)" though also remarked that the app was a "social network in disguise" and "the type of product that companies like Meta and X have sought to build: a way to bring A.I. to the masses that people can share." The article expressed concern regarding the product's potential impact on society and its potential use to promote misinformation, disinformation, and scams. A 2025 study in Science Advances found that generative AI tools can lower barriers to entry in creative work. It enables users with diverse skill sets, including people with less formal artistic training and technical skills, to act on their creative and imaginative ideas. The lower barrier to entry allows such users previously locked out of the creative industry to produce content and easily act on their creative ideas. === Negative === Some internet users and online content creators, such as Hank Green, called the mobile app "SlopTok," a reference to both the mobile app TikTok and the term AI slop. Filmmaker Tyler Perry announced he would be putting a planned

Mobile Fortify

Mobile Fortify is a mobile app used by United States Immigration and Customs Enforcement (ICE) on their government-issued phones. The app allows agents to take a photo in order to gather biometrics, including contactless fingerprints and faceprints, for the purpose of identifying an individual and their potential immigration status. The app was created by NEC. == History == In June 2025, use of Mobile Fortify by ICE was uncovered through leaked emails and the user manual, reported by 404 Media. The app is internally developed, and details of the parent company and developer were initially unknown. In January 2026, the DHS's 2025 AI Use Case Inventory revealed the vendor as NEC Corporation, an international conglomerate with subsidiaries in Argentina, Australia, China, India and Malaysia. Later that month, several senators demanded transparency around the app and its origins, and that ICE stop using it. A second letter was sent again in November, after hearing no response to the previous letter from ICE. == Technology == Unlike other facial recognition software, Fortify uses federally linked databases. By contrast, Clearview AI uses public social media databases for biometric scanning. Federal databases include DHS's automated biometric identification system (IDENT), containing more than 270 million biometric records, and Customs and Border Protection's Traveler Verification Service. The State Department's visa and passport photo database, the FBI's National Crime Information Center, National Law Enforcement Telecommunications Systems, and CBP's TECS and Seized Assets and Case Tracing System (SEACATS). == Oversight == Several senators urged ICE to stop using the app for fear of infringing on fourth amendment and first amendment rights, and requested details on who developed the app, when it was deployed, whether the app was tested for accuracy, and policies and practices governing its use. In June 2025, they sent an open letter to Todd Lyons, ICE acting director, signed by senators Cory Booker, Chris Van Hollen, Ed Markey, Bernie Sanders, Adam Schiff, Tina Smith, Elizabeth Warren, and Ron Wyden. On November 3, a second letter was sent to the ICE by senators, after not receiving answers to questions from the previous letter deadlined for October 2. == Criticism == Mobile Fortify, and ICE's use of similar biometric identification technologies (such as Mobile Identify, an app similar to Mobile Fortify to be used by local or regional law enforcement to assist in immigration enforcement ) has faced scrutiny from a variety of digital rights organizations, politicians, and news outlets. The criticism is already considered to potentially be a reason why the similar Mobile Identify app was pulled from the Google Play Store. Facial recognition technologies are known to produce false-positives and generally unreliable results, especially on those with darker skin tones. ICE has already previously mistakenly arrested a U.S. citizen under the belief he was illegally in the country, and later stated that he "could be deported based on biometric confirmation of his identity" prior to his release. U.S. representative Bennie Thompson, ranking member of the House Homeland Security Committee has previously commented that "ICE officials have told us that an apparent biometric match by Mobile Fortify is a ‘definitive’ determination of a person's status and that an ICE officer may ignore evidence of American citizenship—including a birth certificate—if the app says the person is an alien," and that "Mobile Fortify is a dangerous tool in the hands of ICE, and it puts American citizens at risk of detention and even deportation," On January 19, 2026, 404 Media reported on a case where a woman, identified in court documents as "MJMA", was scanned by Mobile Fortify twice in the same interaction, and two entirely different names were provided by the app. According to the Innovation Law Lab, whose attorneys are representing MJMA, both of the names were incorrect. ICE has stated that they will not allow people to decline to be scanned by Mobile Fortify, and that photos taken, even those of U.S. citizens, will be stored for 15 years, something that has been criticized primarily because ICE has not performed a Privacy Impact Assessment (PIA) for Mobile Fortify, the right to decline other forms of biometric verification to the U.S. government is often available under other circumstances, and the 15 year window is viewed as unnecessarily large.

MegaHAL

MegaHAL is a computer conversation simulator, or "chatterbot", created by Jason Hutchens. == Background == In 1996, Jason Hutchens entered the Loebner Prize Contest with HeX, a chatterbot based on ELIZA. HeX won the competition that year and took the $2000 prize for having the highest overall score. In 1998, Hutchens again entered the Loebner Prize Contest with his new program, MegaHAL. MegaHAL made its debut in the 1998 Loebner Prize Contest. Like many chatterbots, the intent is for MegaHAL to appear as a human fluent in a natural language. As a user types sentences into MegaHAL, MegaHAL will respond with sentences that are sometimes coherent and at other times complete gibberish. MegaHAL learns as the conversation progresses, remembering new words and sentence structures. It will even learn new ways to substitute words or phrases for other words or phrases. Many would consider conversation simulators like MegaHAL to be a primitive form of artificial intelligence. However, MegaHAL doesn't understand the conversation or even the sentence structure. It generates its conversation based on sequential and mathematical relationships. In the world of conversation simulators, MegaHAL is based on relatively old technology and could be considered primitive. However, its popularity has grown due to its humorous nature; it has been known to respond with twisted or nonsensical statements that are often amusing. == Theory of Operation == MegaHal is based at least in part on a so-called "hidden Markov Model", so that the first thing that Megahal does when it "trains" on a script or text is to build a database of text fragments encompassing every possible subset of perhaps 4, 5, or even 6 consecutive words, so that for example - if MegaHal trains on the Declaration of Independence, then MegaHal will build a database containing text fragments such as "When in the course", "in the course of", "the course of human", "course of human events", "of human events, one", "human events, one people", and so on. Then if Megahal is fed another text, such has "Superman, Yes! It's Superman - he can change the course of mighty rivers, bend steel with his bare hands - and who disguised at Clark Kent …" IT MIGHT induce Megahal to apparently bemuse itself to proffer whether Superman can change the course of human events, or something else altogether - such as some rambling about "when in the course of mighty rivers", and so on. Thus likewise - if a phrase like "the White house said" comes up a lot in some text; then Megahal's ability to switch randomly between different contexts which otherwise share some similarity can result at times in some surprising lucidity, or else it might otherwise seem quite bizarre. == Examples == There are some sentences that MegaHAL generated: CHESS IS A FUN SPORT, WHEN PLAYED WITH SHOT GUNS. and COWS FLY LIKE CLOUDS BUT THEY ARE NEVER COMPLETELY SUCCESSFUL. == Distribution == MegaHAL is distributed under the Unlicense. Its source code can be downloaded from the Github repository.

Blackboard system

A blackboard system is an artificial intelligence approach based on the blackboard architectural model, where a common knowledge base, the "blackboard", is iteratively updated by a diverse group of specialist knowledge sources, starting with a problem specification and ending with a solution. Each knowledge source updates the blackboard with a partial solution when its internal constraints match the blackboard state. In this way, the specialists work together to solve the problem. The blackboard model was originally designed as a way to handle complex, ill-defined problems, where the solution is the sum of its parts. == Metaphor == The following scenario provides a simple metaphor that gives some insight into how a blackboard functions: A group of specialists are seated in a room with a large blackboard. They work as a team to brainstorm a solution to a problem, using the blackboard as the workplace for cooperatively developing the solution. The session begins when the problem specifications are written onto the blackboard. The specialists all watch the blackboard, looking for an opportunity to apply their expertise to the developing solution. When someone writes something on the blackboard that allows another specialist to apply their expertise, the second specialist records their contribution on the blackboard, hopefully enabling other specialists to then apply their expertise. This process of adding contributions to the blackboard continues until the problem has been solved. == Components == A blackboard-system application consists of three major components The software specialist modules, which are called knowledge sources (KSs). Like the human experts at a blackboard, each knowledge source provides specific expertise needed by the application. The blackboard, a shared repository of problems, partial solutions, suggestions, and contributed information. The blackboard can be thought of as a dynamic "library" of contributions to the current problem that have been recently "published" by other knowledge sources. The control shell, which controls the flow of problem-solving activity in the system. Just as the eager human specialists need a moderator to prevent them from trampling each other in a mad dash to grab the chalk, KSs need a mechanism to organize their use in the most effective and coherent fashion. In a blackboard system, this is provided by the control shell. === Learnable Task Modeling Language === A blackboard system is the central space in a multi-agent system. It's used for describing the world as a communication platform for agents. To realize a blackboard in a computer program, a machine readable notation is needed in which facts can be stored. One attempt in doing so is a SQL database, another option is the Learnable Task Modeling Language (LTML). The syntax of the LTML planning language is similar to PDDL, but adds extra features like control structures and OWL-S models. LTML was developed in 2007 as part of a much larger project called POIROT (Plan Order Induction by Reasoning from One Trial), which is a Learning from demonstrations framework for process mining. In POIROT, Plan traces and hypotheses are stored in the LTML syntax for creating semantic web services. Here is a small example: A human user is executing a workflow in a computer game. The user presses some buttons and interacts with the game engine. While the user interacts with the game, a plan trace is created. That means the user's actions are stored in a logfile. The logfile gets transformed into a machine readable notation which is enriched by semantic attributes. The result is a textfile in the LTML syntax which is put on the blackboard. Agents (software programs in the blackboard system) are able to parse the LTML syntax. == Implementations == We start by discussing two well known early blackboard systems, BB1 and GBB, below and then discuss more recent implementations and applications. The BB1 blackboard architecture was originally inspired by studies of how humans plan to perform multiple tasks in a trip, used task-planning as a simplified example of tactical planning for the Office of Naval Research. Hayes-Roth & Hayes-Roth found that human planning was more closely modeled as an opportunistic process, in contrast to the primarily top-down planners used at the time: While not incompatible with successive-refinement models, our view of planning is somewhat different. We share the assumption that planning processes operate in a two-dimensional planning space defined on time and abstraction dimensions. However, we assume that people's planning activity is largely opportunistic. That is, at each point in the process, the planner's current decisions and observations suggest various opportunities for plan development. The planner's subsequent decisions follow up on selected opportunities. Sometimes, these decision-sequences follow an orderly path and produce a neat top-down expansion as described above. However, some decisions and observations might also suggest less orderly opportunities for plan development. A key innovation of BB1 was that it applied this opportunistic planning model to its own control, using the same blackboard model of incremental, opportunistic, problem-solving that was applied to solve domain problems. Meta-level reasoning with control knowledge sources could then monitor whether planning and problem-solving were proceeding as expected or stalled. If stalled, BB1 could switch from one strategy to another as conditions – such as the goals being considered or the time remaining – changed. BB1 was applied in multiple domains: construction site planning, inferring 3-D protein structures from X-ray crystallography, intelligent tutoring systems, and real-time patient monitoring. BB1 also allowed domain-general language frameworks to be designed for wide classes of problems. For example, the ACCORD language framework defined a particular approach to solving configuration problems. The problem-solving approach was to incrementally assemble a solution by adding objects and constraints, one at a time. Actions in the ACCORD language framework appear as short English-like commands or sentences for specifying preferred actions, events to trigger KSes, preconditions to run a KS action, and obviation conditions to discard a KS action that is no longer relevant. GBB focused on efficiency, in contrast to BB1, which focused more on sophisticated reasoning and opportunistic planning. GBB improves efficiency by allowing blackboards to be multi-dimensional, where dimensions can be either ordered or not, and then by increasing the efficiency of pattern matching. GBB1, one of GBB's control shells implements BB1's style of control while adding efficiency improvements. Other well-known of early academic blackboard systems are the Hearsay II speech recognition system and Douglas Hofstadter's Copycat and Numbo projects. Some more recent examples of deployed real-world applications include: The PLAN component of the Mission Control System for RADARSAT-1, an Earth observation satellite developed by Canada to monitor environmental changes and Earth's natural resources. The GTXImage CAD software by GTX Corporation was developed in the early 1990s using a set of rulebases and neural networks as specialists operating on a blackboard system. Adobe Acrobat Capture (now discontinued), as it used a blackboard system to decompose and recognize image pages to understand the objects, text, and fonts on the page. This function is currently built into the retail version of Adobe Acrobat as "OCR Text Recognition". Details of a similar OCR blackboard for Farsi text are in the public domain. Blackboard systems are used routinely in many military C4ISTAR systems for detecting and tracking objects. Another example of current use is in Game AI, where they are considered a standard AI tool to help with adding AI to video games. == Recent developments == Blackboard-like systems have been constructed within modern Bayesian machine learning settings, using agents to add and remove Bayesian network nodes. In these 'Bayesian Blackboard' systems, the heuristics can acquire more rigorous probabilistic meanings as proposal and acceptances in Metropolis Hastings sampling though the space of possible structures. Conversely, using these mappings, existing Metropolis-Hastings samplers over structural spaces may now thus be viewed as forms of blackboard systems even when not named as such by the authors. Such samplers are commonly found in musical transcription algorithms for example. Blackboard systems have also been used to build large-scale intelligent systems for the annotation of media content, automating parts of traditional social science research. In this domain, the problem of integrating various AI algorithms into a single intelligent system arises spontaneously, with blackboards providing a way for a collection of distributed, modular natural language processing algorithm