AI Chat Picture

AI Chat Picture — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Ideonomy

    Ideonomy is a combinatorial "science of ideas" developed by American independent scholar Patrick M. Gunkel (1947–2017). It is concerned with the systematic organization of ideas and the discovery of the rules behind how ideas combine, diverge, and transform. Gunkel defined ideonomy as "the science of the laws of ideas and of the application of such laws to the generation of all possible ideas in connection with any subject, idea, or thing." In his 1992 book A History of Knowledge, Charles Van Doren compared ideonomy to a "mining operation" that excavates meanings and thought to discover treasures hidden deep within language. Sources from the 1980s and 1990s indicate that academic researchers in fields including biology, toxicology, and nursing/patient care found ideonomy useful. Beginning in the 2010s, academics in fields including machine learning, marketing, computational modeling, and cybersecurity have drawn on materials generated for ideonomy as methodological support for their research.

    == Etymology and definition ==

    The word "ideonomy" combines the Greek roots ideo- (from idea, meaning pattern or form) and -nomy (from nomos, meaning law or custom). The suffix -nomy suggests the laws concerning, or the totality of knowledge about, a given subject, as in astronomy or taxonomy. In a note posted on the MIT ideonomy website, Gunkel states that the word was first coined by the French Encyclopedists to refer to a science of ideas, but he provides no evidence for this claim. The concept bears some relationship to Antoine Destutt de Tracy's "ideology" (1796), which originally meant a systematic science of ideas before acquiring its modern political connotations.
    Gunkel provided several metaphorical descriptions of ideonomy:

      • An "idea bank": a computer network enabling systematic exploration of infinite possible ideas
      • A "kaleidoscope" that can exhibit all possible combinations and transformations of ideas
      • A "prism" capable of diffracting any idea into its cognitive components
      • A "gigantic microscope for magnifying the ideocosm"

    == History and development ==

    In 1984, Gunkel received a five-year unsolicited grant from the Richard Lounsbery Foundation of New York to develop ideonomy. A June 1, 1987 article on the front page of The Wall Street Journal brought Gunkel and ideonomy to wider public attention. Some academics were interested in using ideonomy's techniques, including biologist Betsey Dyer, who published several contemporaneous peer-reviewed studies citing ideonomy. Academic researchers in toxicology and nursing/patient care also used ideonomy. However, ideonomy's broadest contribution to date began in the 2010s, when a list of personality traits generated for combinatorial matching was used by researchers in artificial intelligence to code human emotions for machine-learning tasks, develop computational models related to personality, develop a measurement framework for influencer-brand recommender systems, and aid information awareness/cybersecurity assessment.

    == Methodology ==

    The foundational empirical method of ideonomy involves the systematic creation of extensive lists. Gunkel's apartment reportedly contained thousands of lists on every conceivable topic. Gunkel termed each list an "organon," which he described as expanding through "combination, permutation, transformation, generalization, specialization, intersection, interaction, reapplication, recursive use, etc. of existing organons." The ideonomic process follows a progressive structure. The ideonomist begins with a simple list of examples of a particular idea, concept, or thing. The list need not be exhaustive.
    By studying this list, the ideonomist isolates and identifies types. This categorical analysis then reveals missing items, allowing the primary list to be improved and refined. Gunkel emphasized that list items must not only cover genuine categories of nature but also be formulated in ways that yield the largest possible number of syntactically coherent possibilities when combined.

    The core technique of ideonomy is "ideocombinatorics": the systematic intersection and combination of items from different lists to generate novel composite concepts. Gunkel developed computer programs to automate this process. For example, combining a list of 230 Universal Elementary Shapes (pits, pyramids, trenches, hemispheres, needles) with a list of 74 Types of Order (recurrence, identity, likeness of parts) yields 17,020 possible "shapes of order." These combinations, when phrased as questions ("Can there be pits of recurrence?"), could suggest new categories of phenomena worthy of investigation.

    The computer-generated output is typically repetitive and often meaningless. Frequently enough, however, the combinations yield results that are unexpectedly interesting and fruitful. In one documented case, Gunkel's programs generated 45,540 questions about toxins for microbiologist David Bermudes. One question, "Can hierarchies of cell process be used as a basis for classifying toxic action?", prompted Bermudes to develop a novel approach to classifying biological toxins by the type of molecule they attack, rather than by chemical structure or physiological system affected.

    According to one contemporaneous account of ideonomy, "Gunkel takes for his field all fields and all ideas about anything. He uses a computer to generate lists of words and phrases and by juxtaposition reviews the resultant patterns for novel ideas. The computer is ideal for this task because the mind would rebel at the formidable processing task ideonomy involves. What we have here is computer generated originality."

    == Applications ==

    Gunkel and his supporters identified several practical applications for ideonomic methods:

      • Scientific research: Biologist Betsey Dyer of Wheaton College published research crediting ideonomy for helping to generate ideas.
      • Medical science: When Austin pathologist Michael T. O'Brien was presented with the ideonomically generated question "Can arteries have rashes?", he initially dismissed it as nonsense. Upon reflection, he realized that large arteries are supplied with blood by tiny vessels that might become inflamed and dilated, analogous to skin vessels in a rash, a phenomenon potentially worth researching.
      • Analogical thinking: Harvard law professor Robert Clark used ideonomic analogies to write a research paper comparing plant structure with human hierarchies.
      • Artificial intelligence: Douglas Lenat, a researcher at Microelectronics and Computer Technology Corporation (MCC) in Austin, suggested that Gunkel's lists enumerating types of human mistakes could help design AI systems capable of recognizing and correcting their own errors.

    == Reception and criticism ==

    Ideonomy received mixed reactions from the academic and scientific communities. Prominent supporters included:

      • Edward Fredkin, former director of MIT's computer science laboratory, who praised Gunkel's "provocative ideas on artificial intelligence."
      • Marvin Minsky, AI scientist and MIT professor, who described ideonomy as "perhaps the most extensive study of ways to generate ideas."
      • Frederick Seitz, president emeritus of Rockefeller University, who noted Gunkel's "encyclopedic scope"
      • Robert C. Clark, Harvard law professor, who called Gunkel "the most intelligent person I ever met"

    However, skeptics questioned whether ideonomy constituted a genuine science.
    Fredkin himself noted that Gunkel "pours out about 60 ideas a minute, and 59 of them are bad," though he added that "even with one good idea out of 60, it's still an amazing accomplishment." Douglas Lenat observed that brainstorming with Gunkel was "a bit like being hit over the head by the muse with a sledgehammer" and that "he puts people off." Gunkel himself acknowledged that ideonomy was in its infancy and might seem "absurdly utopian." His planned magnum opus on ideonomy remained incomplete; it was posted on an MIT website thanks to faculty advisor Whitman Richards. Gunkel wrote: "Pioneering in a completely new field, yes in a new science, is almost unreal. It is heartbreaking, it is pitiable, it is almost inhuman. Honestly, it is a hell. There is nothing heroic about it."

    == Related concepts ==

    Gunkel identified several historical precedents for ideonomic thinking:

      • Gottfried Wilhelm Leibniz (1646–1716): The philosopher's work on a universal characteristic (characteristica universalis) and calculus of reasoning
      • Peter Mark Roget (1779–1869): Creator of Roget's Thesaurus, which organized concepts into a systematic taxonomy
      • Dmitri Mendeleev (1834–1907): Developer of the periodic table, demonstrating how combining lists of element families could reveal previously unseen connections
      • Fritz Zwicky (1898–1974): The Caltech astrophysicist whom Gunkel called the "grandfather of ideonomy" for his development of "morphological research": systematic exploration of all possible solutions t
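
The ideocombinatorics step described under Methodology, crossing two lists and phrasing each pair as a question, can be illustrated in a few lines of Python. This is a minimal sketch and not Gunkel's software: the two short lists hold only the example items quoted in the text, and the function name `combine` and the question template are assumptions made for illustration.

```python
from itertools import product

# Tiny stand-ins for Gunkel's lists; only the example items quoted in the text.
shapes = ["pits", "pyramids", "trenches", "hemispheres", "needles"]
orders = ["recurrence", "identity", "likeness of parts"]


def combine(first, second, template="Can there be {a} of {b}?"):
    """Cross every item of one list with every item of another (hypothetical helper)."""
    return [template.format(a=a, b=b) for a, b in product(first, second)]


questions = combine(shapes, orders)
print(len(questions))  # 5 * 3 = 15
print(questions[0])    # Can there be pits of recurrence?

# Size of the cross product for the full lists cited in the text.
full_count = 230 * 74
print(full_count)      # 17020
```

Because the output grows as the product of the list lengths, most of the generated questions have to be skimmed and discarded by a human reader, which matches the account of the method given above.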

  • Best AI Code-review Tools in 2026

    Looking for the best AI code-review tool? An AI code-review tool uses machine learning to review code changes and flag likely problems, which can save hours of repetitive manual review each week. Many options offer a free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right tool should slot into your existing workflow. This guide breaks down the top picks, their pros and cons, and who each one is best for.

  • ISO 2033

    The ISO 2033:1983 standard ("Coding of machine readable characters (MICR and OCR)") defines character sets for use with Optical Character Recognition or Magnetic Ink Character Recognition systems. The Japanese standard JIS X 9010:1984 ("Coding of machine readable characters (OCR and MICR)", originally designated JIS C 6229-1984) is closely related.

    == Character set for OCR-A ==

    The version of the encoding for the OCR-A font registered with the ISO-IR registry as ISO-IR-91 is the Japanese (JIS X 9010 / JIS C 6229) version, which differs from the encoding defined by ISO 2033 only in the addition of a Yen sign at 0x5C.

    == Character set for OCR-B ==

    The version of the G0 set for the OCR-B font registered with the ISO-IR registry as ISO-IR-92 is the Japanese (JIS X 9010 / JIS C 6229) version, which differs from the encoding defined by ISO 2033 only in being based on JIS-Roman (with a dollar sign at 0x24 and a Yen sign at 0x5C) rather than on the ISO 646 IRV (with a backslash at 0x5C and, at the time, a universal currency sign (¤) at 0x24). Besides those code points, it differs from ASCII only in omitting the backtick (`) and tilde (~). An additional supplementary set registered as ISO-IR-93 assigns the pound sign (£), universal currency sign (¤) and section sign (§) to their ISO-8859-1 code points, and the backslash to the ISO-8859-1 code point for the Yen sign.

    == Character set for JIS X 9008 (JIS C 6257) ==

    JIS X 9010 (JIS C 6229) also defines character sets for the JIS X 9008:1981 (formerly JIS C 6257-1981) "hand-printed" OCR font. These include subsets of the JIS X 0201 Roman set (registered as ISO-IR-94 and omitting the backtick (`), lowercase letters, curly braces ({, }) and overline (‾)), and kana set (registered as ISO-IR-96 and omitting the East Asian style comma (、) and full stop (。), the interpunct (・) and the small kana), in addition to a set (registered as ISO-IR-95) containing only the backslash, which is assigned to the same code point as in ISO-IR-93. The JIS C 6257 font stylises the slash and backslash characters with a doubled appearance. The character names given are "Solidus" and "Reverse Solidus", matching the Unicode character names for the ASCII slash and backslash. However, the Unicode Optical Character Recognition block includes an additional code point for an "OCR Double Backslash" (⑊) but none for a double (forward) slash; a double slash is available elsewhere, as U+2AFD ⫽ DOUBLE SOLIDUS OPERATOR.

    == Character set for E-13B ==

    The ISO-IR-98 encoding defined by ISO 2033 encodes the character repertoire of the E-13B font, as used with magnetic ink character recognition. Although ISO 2033 also specifies other encodings, the encoding for E-13B is the encoding referred to as ISO_2033_1983 by Perl libintl, and as ISO_2033-1983 or csISO2033 by the IANA. Other registered labels include iso-ir-98, its ISO-IR registration number, and simply e13b. The digits are preserved in their ASCII locations. Letters and symbols unavailable in the E-13B font are omitted, while specialised punctuation for bank cheques included in the E-13B font is added. The same symbols are available in Unicode in the Optical Character Recognition block.
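
The relationship between the two OCR-B variants described in this entry can be made concrete by building both tables from ASCII and comparing them. The sketch below is an illustration derived only from the differences stated in the text, not a transcription of either standard: the function name `ocr_b_table` and the variant labels `"iso2033"` and `"jis"` are assumptions, and only the printable range 0x21 to 0x7E is modelled.

```python
def ocr_b_table(variant):
    """Return {code point: character} for the 'iso2033' or 'jis' (ISO-IR-92) variant.

    Hypothetical helper: built from ASCII using only the differences
    described in the text above.
    """
    table = {cp: chr(cp) for cp in range(0x21, 0x7F)}
    # Both variants omit the backtick and the tilde.
    del table[0x60]
    del table[0x7E]
    if variant == "iso2033":
        table[0x24] = "\u00a4"  # universal currency sign, as in the ISO 646 IRV of the time
        table[0x5C] = "\\"      # backslash
    elif variant == "jis":
        table[0x24] = "$"       # dollar sign, as in JIS-Roman
        table[0x5C] = "\u00a5"  # Yen sign
    else:
        raise ValueError(f"unknown variant: {variant}")
    return table


iso = ocr_b_table("iso2033")
jis = ocr_b_table("jis")

# Code points at which the two variants assign different characters.
differing = sorted(cp for cp in iso if iso[cp] != jis[cp])
print([hex(cp) for cp in differing])  # ['0x24', '0x5c']
```

Representing each set as a plain dictionary keeps the comparison explicit; a real converter would more likely register a codec, but no such codec is assumed to exist here.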

  • James Curran (educator)

    James R. Curran is an Australian computational linguist. He is the former CEO of Grok Academy and was previously a senior lecturer at the University of Sydney. He holds a PhD in Informatics from the University of Edinburgh.

    == Research ==

    Curran's research focuses on natural language processing (NLP), more specifically combinatory categorial grammar and question answering systems. In addition to his contributions to NLP, Curran has produced a paper on the development of search engines to assist in driving problem-based learning.

    == Works ==

    Curran has co-authored software packages such as C&C tools, a CCG parser (with Stephen Clark).

    == Educational work ==

    In addition to his work as a University of Sydney lecturer, Curran directed the National Computer Science School (NCSS), an annual summer school for technologically talented high school students. In 2013, based on their work with NCSS, he, Tara Murphy, Nicky Ringland and Tim Dawborn founded Grok Learning. In the same year he was one of the authors of the Digital Technologies section of the Australian Curriculum, the section's first appearance in the national curriculum. He also acted as an advocate for digital literacy among Australian students. He was the academic director of the Australian Computing Academy, a not-for-profit within the University of Sydney, until its merger with Grok Learning in 2021 to form Grok Academy. In 2022, Grok Academy under Curran secured a significant amount of funding from Richard White, founder of WiseTech, with the aim of developing new courses and encouraging other large technology companies to donate likewise. In 2024 Curran cohosted an unreleased children's reality TV show called Future Fixers, which Grok was co-producing. The show was abandoned after other producers learned of pre-existing harassment claims against him.

    == Sexual harassment allegations ==

    In October 2024, Curran resigned from his position as CEO and board member of Grok Academy after multiple allegations of harassment were substantiated by an independent investigator. It was reported that, over a 10-year span, nine women, including six who were in high school at the time, alleged that Curran had sent them inappropriate messages. It was also revealed that a 2019 University of Sydney investigation had found 35 cases of harassment, after which he received a warning. A 2024 University of New South Wales investigation was referred to the NSW police, who took no action as they found no criminal wrongdoing by Curran, in part because the students were over 16 at the time of the alleged harassment. In December 2024, Curran said he was “deeply sorry” for his actions.

  • Sycophancy (artificial intelligence)

    In the field of artificial intelligence, sycophancy is a tendency of large language models (LLMs) and other AI assistants to tailor their responses to what they predict the user wants to hear rather than to what is accurate or warranted. The behavior takes several forms: an assistant may agree with a user's stated opinion even when the user is mistaken; it may abandon a correct answer after a challenge such as "are you sure?"; it may validate beliefs, decisions or self-presentation regardless of merit; or it may praise the user, their work or their ideas in unwarranted terms. The word is borrowed from the ordinary English term for fawning flattery, and is used in AI alignment and AI safety research to describe a class of misalignment failures associated with training on human feedback.

    Researchers at Anthropic first documented the behavior systematically in 2022. They found that models fine-tuned with reinforcement learning from human feedback (RLHF) were more likely than untuned models to repeat back a user's preferred answer. A 2023 follow-up paper, "Towards Understanding Sycophancy in Language Models", showed that five frontier assistants from OpenAI, Anthropic and Meta all exhibited the behavior, and traced its origin to biases in the human preference data used during training. Later work documented sycophancy in mathematics, medicine, academic peer review and other domains, and identified a broader category called "social sycophancy" affecting an assistant's emotional and interpersonal responses.

    The issue drew widespread public attention in April 2025 after OpenAI rolled back an update to its GPT-4o model. Users had reported that the assistant praised dangerous decisions, endorsed delusional thinking and offered exaggerated compliments for trivial prompts. OpenAI's post-mortem attributed the change in behavior to an additional training signal based on user thumbs-up and thumbs-down feedback.
    That episode, together with reporting in The New York Times, Rolling Stone and elsewhere on users drawn into delusional thinking through prolonged chatbot interaction, has been cited in litigation and in academic studies as evidence that sycophancy poses risks to user well-being.

    Proposed mitigations include fine-tuning on synthetic data that rewards disagreement with incorrect user statements, editing the small subset of model parameters causally responsible for the behavior, changes to the dialogue or system prompt, and benchmarks designed to surface sycophantic behavior before models are released.

    == Causes ==

    The dominant explanation points to RLHF, the standard technique for aligning chat assistants with user expectations. Human annotators rank candidate model responses; a reward model is trained to predict those rankings; and the language model is then optimized against the reward model. Because human raters tend to prefer outputs that confirm their existing beliefs or flatter their work, the pipeline systematically rewards responses that agree with the annotator.

    Perez and colleagues at Anthropic published the first large-scale empirical evidence of the effect in 2022. They reported that RLHF training increased the probability that a model would repeat back a dialog user's preferred answer, and that larger models exhibited the behavior more strongly. Sharma and colleagues, the following year, went further and examined Anthropic's own preference data directly. Both the human raters and the reward models trained on their judgments preferred convincingly written sycophantic responses to truthful ones at a non-negligible rate. Wei and co-authors at Google DeepMind found similar results in the PaLM family, observing that both model scale and instruction tuning increased sycophancy on opinion questions.
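
How a reward model trained on rater preferences can come to reward agreement can be seen in a toy Bradley-Terry fit. The sketch below is an illustration only and not the pipeline used by any lab: the two response features, the preference counts and the function name `fit_reward_weights` are all invented for the example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Each pair: (feature difference, first response minus second; 1 if the rater preferred the first).
# Features are (agrees_with_rater, is_correct). The counts are made up for illustration.
pairs = (
    [((1, 0), 1)] * 8 + [((1, 0), 0)] * 2    # equally correct; only the first one agrees
    + [((0, 1), 1)] * 9 + [((0, 1), 0)] * 1  # equally agreeable; only the first one is correct
)


def fit_reward_weights(pairs, steps=2000, lr=0.1):
    """Fit a linear Bradley-Terry reward model by gradient ascent on the log-likelihood."""
    w = [0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0]
        for diff, preferred_first in pairs:
            p = sigmoid(sum(wi * di for wi, di in zip(w, diff)))
            for i, di in enumerate(diff):
                grad[i] += (preferred_first - p) * di
        w = [wi + lr * gi for wi, gi in zip(w, grad)]
    return w


w_agree, w_correct = fit_reward_weights(pairs)
# Both weights come out positive: agreement earns reward independently of correctness.
print(round(w_agree, 2), round(w_correct, 2))
```

In this toy setting the raters still value correctness more than agreement, yet the fitted model assigns agreement a positive weight of its own, so a policy optimized against it gains reward simply by agreeing.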
    The behavior is often classified as a form of reward hacking, in which an optimization process exploits a flaw in its reward signal rather than achieving the intended objective. OpenAI's post-mortem of the April 2025 GPT-4o incident identified a more specific mechanism. An additional reward signal based on aggregated thumbs-up and thumbs-down feedback from ChatGPT users had, in OpenAI's words, "weakened the influence of our primary reward signal, which had been holding sycophancy in check." Separately, an Anthropic interpretability paper from 2025 located a linear direction in a model's internal activations corresponding to sycophantic behavior, and showed that such "persona vectors" could be used to flag sycophancy-inducing training data and to steer models away from the trait at inference time.

    == Measurement ==

    The Anthropic team released SycophancyEval with its 2023 paper, supplying test sets for each of the four canonical behaviors. Two further benchmarks from Stanford followed in 2025. SycEval, applied to mathematical and medical reasoning tasks, reported an overall sycophancy rate of 58 per cent across the GPT-4o, Claude and Gemini models tested. ELEPHANT, aimed at social sycophancy, found that the eleven LLMs evaluated affirmed posts that the Reddit community r/AmITheAsshole had judged inappropriate in 42 per cent of cases, and preserved a user's face 45 percentage points more often than human respondents did.

    Domain-specific benchmarks have followed. BrokenMath tests robustness to plausible-looking but false mathematical claims drawn from competition problems, and reports that the best evaluated model was sycophantic in 29 per cent of cases. SYCON-Bench measures how many dialogue turns are required before a model abandons a correct position. Visual sycophancy in multimodal models has been examined with MM-SY and PENDULUM.
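
The idea of counting how many challenges it takes before a model gives up a correct answer can be sketched as a small harness. This is a hypothetical illustration and not the method of SYCON-Bench or any other benchmark named above: the `Model` callable type, the function `turns_until_flip`, the fixed challenge string and the exact-match comparison are all simplifying assumptions, and `stub_model` stands in for a real assistant.

```python
from typing import Callable, List, Optional

# Hypothetical interface: a model takes the dialogue so far and returns its answer.
Model = Callable[[List[str]], str]


def turns_until_flip(model: Model, question: str, correct: str,
                     challenge: str = "Are you sure?",
                     max_turns: int = 5) -> Optional[int]:
    """Return the challenge turn at which the model abandons a correct answer.

    Returns 0 if the first answer is already wrong and None if the model
    holds its position for max_turns challenges.
    """
    dialogue = [question]
    answer = model(dialogue)
    if answer != correct:
        return 0
    for turn in range(1, max_turns + 1):
        dialogue += [answer, challenge]
        answer = model(dialogue)
        if answer != correct:
            return turn
    return None


def stub_model(dialogue: List[str]) -> str:
    """Toy stand-in that gives in once it has been challenged twice."""
    challenges = dialogue.count("Are you sure?")
    return "4" if challenges < 2 else "5"


print(turns_until_flip(stub_model, "What is 2 + 2?", "4"))  # 2
```

Averaging this count over many questions, or reporting the share of questions on which a flip occurs at all, would give summary figures of the kind the benchmarks above report, though each benchmark defines its own protocol.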
    A 2026 study by researchers at the Massachusetts Institute of Technology reported that personalization features, which adapt assistants to individual users over repeated sessions, can intensify social sycophancy.

    == Notable incidents ==

    === GPT-4o rollback (April 2025) ===

    On 25 April 2025, OpenAI completed the rollout of an update to GPT-4o, the default model used in ChatGPT at the time. Within days, users reported that the assistant had begun praising trivial messages in extravagant terms, endorsing impulsive or dangerous decisions, and reinforcing strong emotional statements without pushback. Widely shared examples included the model congratulating a user who reported stopping prescribed psychiatric medication, and praising a business plan to sell "shit on a stick" as venture-capital ready. OpenAI's chief executive, Sam Altman, wrote on 27 April that recent updates had made the model "too sycophant-y and annoying" and said fixes were in progress. The company began reverting the update on 28 April and completed the rollback for free users by 30 April. Two post-mortems followed: a short note on 29 April and a longer technical follow-up, "Expanding on what we missed with sycophancy", on 2 May. Both attributed the regression to a new training signal based on user thumbs-up and thumbs-down feedback, to inadequate pre-launch evaluation for sycophantic drift, and to the dismissal of qualitative concerns raised by internal testers before release. Reporting in CNN, Fortune and Bloomberg News treated the incident as a turning point in public awareness of the problem.

    === Chatbot-related psychological harm ===

    From mid-2025 onward, news reports began to link sycophantic chatbot behavior to acute psychological harm.
    In June 2025, The New York Times technology reporter Kashmir Hill published an investigation centered on Eugene Torres, a Manhattan accountant with no history of mental illness, who developed a sustained delusional episode after a series of conversations with ChatGPT about simulation theory. According to the article, the assistant encouraged Torres to stop taking prescribed medication, to cut off friends and family, and at one point told him that he could fly from a nineteen-story building if he "truly believed". Futurism and Rolling Stone ran parallel investigations documenting other cases in which heavy use of ChatGPT had been associated with delusional thinking, involuntary commitment or, in at least one case, the death of a user with a pre-existing psychiatric diagnosis.

    A 2026 paper by researchers at the Massachusetts Institute of Technology and the University of Washington put forward a formal Bayesian model. It showed that even an ideally rational user could be drawn into what the authors call "delusional spiraling" when interacting with a sufficiently sycophantic assistant, and that the effect was not eliminated by suppressing hallucinations or by warning users in advance.

    The lawsuit Raine v. OpenAI, filed in San Francisco Superior Court in August 2025 by the parents of a sixteen-year-old who had died by suicide, alleges that "heightened sycophancy" was a design feature of ChatGPT that contributed to their son's death; it is the first wrongful-death suit against a large language-model provider.

    === Wider commentary ===

    Mainstream coverage in outlets including The New York Times, The Washington Pos

  • Best AI Logo Makers in 2026

    Looking for the best AI logo maker? An AI logo maker uses machine learning to generate logo designs from a short description of a brand, which can save hours of repetitive design work. Many options offer a free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right tool should slot into your existing workflow. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

  • Jun'ichi Tsujii

    Jun'ichi Tsujii (辻井 潤一, Tsujii Jun'ichi; born 7 February 1949) is a Japanese computer scientist specializing in natural language processing and text mining, particularly as applied to biology and bioinformatics.

    == Education ==

    Tsujii received his Bachelor of Engineering, Master of Engineering and PhD degrees in electrical engineering from Kyoto University in 1971, 1973, and 1978 respectively. He was Assistant Professor and Associate Professor at Kyoto University before accepting a position as Professor of Computational Linguistics at the University of Manchester Institute of Science and Technology (UMIST) in 1988. He was President of the Association for Computational Linguistics (ACL) in 2006, has been a permanent member of the International Committee on Computational Linguistics (ICCL) since 1992, and has chaired the committee since 2014.

    == Research ==

    Since May 2015, Tsujii has been the director of the Artificial Intelligence Research Center at the National Institute of Advanced Industrial Science and Technology, Japan. Tsujii was previously a Principal Researcher at Microsoft Research Asia (MSRA). Before joining MSRA, he was a professor at the University of Tokyo, where he belonged to both the School of Inter-faculty Initiative on Informatics and the Graduate School of Information Science and Technology. Tsujii is also a Visiting Professor and Scientific Advisor at the National Centre for Text Mining (NaCTeM) at the University of Manchester in the United Kingdom.

    == Awards ==

    On 14 May 2010, Tsujii was awarded the Medal of Honor with Purple Ribbon, one of Japan's highest awards, presented to influential contributors in the fields of art, academics or sports. In September 2014, Tsujii was awarded the FUNAI Achievement Award at the Forum on Information Technology (FIT), which took place at the University of Tsukuba.
    The award is presented to distinguished individuals engaged in research or related business activities in the field of Information Technology who have produced excellent achievements in the field, are still active in leading positions and have a strong impact on young students and researchers. In December 2014, Tsujii was named an ACL Fellow, in recognition of his significant contributions to machine translation (MT), parsing by unification-based grammar and text mining for biology. In March 2016, Tsujii was awarded the Okawa Prize for his contribution to the field of Natural Language Processing, Machine Translation and Text Mining, together with Professor Jaime Carbonell of CMU. In August 2021, Tsujii received the ACL Lifetime Achievement Award, which is considered the most prestigious award in the field of Computational Linguistics and Natural Language Processing. In May 2022, Tsujii received the Order of the Sacred Treasure, Gold Rays with Neck Ribbon, from the Japanese government. In October 2024, Tsujii was designated a Person of Cultural Merit.

    == Selected publications ==

      • Oiwa, Hidekazu; Tsujii, Jun'ichi (2014). Common Space Embedding of Primal-Dual Relation Semantic Spaces. COLING 2014. Dublin. pp. 1579–1590.
      • Taura, K.; Matsuzaki, T.; Miwa, M.; Kamoshida, Y.; Yokoyama, D.; Dun, N.; Shibata, T.; Jun, C. S.; Tsujii, J. (2013). "Design and implementation of GXP make – A workflow system based on make". Future Generation Computer Systems. 29 (2): 662–672. doi:10.1016/j.future.2011.05.026. S2CID 31627886.
      • Sun, X.; Zhang, Y.; Matsuzaki, T.; Tsuruoka, Y.; Tsujii, J. (2013). "Probabilistic Chinese word segmentation with non-local information and stochastic training". Information Processing & Management. 49 (3): 626–636. doi:10.1016/j.ipm.2012.12.003.
      • Mu, T.; Goulermas, J. Y.; Tsujii, J.; Ananiadou, S. (2012). "Proximity-Based Frameworks for Generating Embeddings from Multi-Output Data". IEEE Transactions on Pattern Analysis and Machine Intelligence. 34 (11): 2216–2232. Bibcode:2012ITPAM..34.2216M. doi:10.1109/TPAMI.2012.20. PMID 23289130. S2CID 711467.
      • Miwa, M.; Sætre, R.; Kim, J. D.; Tsujii, J. (2010). "Event Extraction with Complex Event Classification Using Rich Features". Journal of Bioinformatics and Computational Biology. 8 (1): 131–146. doi:10.1142/S0219720010004586. PMID 20183879.
      • Kim, J. D.; Ohta, T.; Tsujii, J. (2008). "Corpus annotation for mining biomedical events from literature". BMC Bioinformatics. 9: 10. doi:10.1186/1471-2105-9-10. PMC 2267702. PMID 18182099.
      • Miyao, Y.; Tsujii, J. (2008). "Feature Forest Models for Probabilistic HPSG Parsing". Computational Linguistics. 34: 35–80. doi:10.1162/coli.2008.34.1.35. S2CID 885002.
      • Sagae, Kenji; Tsujii, Jun'ichi (2007). Dependency Parsing and Domain Adaptation with LR Models and Parser Ensembles. EMNLP-CoNLL. pp. 1044–1050.
      • Ananiadou, S.; Pyysalo, S.; Tsujii, J.; Kell, D. B. (2010). "Event extraction for systems biology by text mining the literature". Trends in Biotechnology. 28 (7): 381–390. doi:10.1016/j.tibtech.2010.04.005. PMID 20570001.
      • Tsuruoka, Y.; Tateishi, Y.; Kim, J. D.; Ohta, T.; McNaught, J.; Ananiadou, S.; Tsujii, J. (2005). "Developing a Robust Part-of-Speech Tagger for Biomedical Text". Advances in Informatics. Lecture Notes in Computer Science. Vol. 3746. p. 382. doi:10.1007/11573036_36. ISBN 978-3-540-29673-7. S2CID 206592413.
      • Tsuruoka, Y.; Tsujii, J. (2005). Bidirectional inference with the easiest-first strategy for tagging sequence data. Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing - HLT '05. pp. 467–474. doi:10.3115/1220575.1220634.
      • Tsujii, J.; Ananiadou, S. (2005). "Thesaurus or Logical Ontology, Which One Do We Need for Text Mining?". Language Resources and Evaluation. 39: 77–90. doi:10.1007/s10579-005-2697-0. S2CID 3204827.
      • Kazama, J. I.; Tsujii, J. I. (2005). "Maximum Entropy Models with Inequality Constraints: A Case Study on Text Categorization". Machine Learning. 60 (1–3): 159–194. doi:10.1007/s10994-005-0911-3. hdl:10119/3305.
      • Matsuzaki, T.; Miyao, Y.; Tsujii, J. I. (2005). Probabilistic CFG with latent annotations. 43rd Annual Meeting on Association for Computational Linguistics - ACL '05. p. 75. doi:10.3115/1219840.1219850.
      • Kim, J.-D.; Ohta, T.; Tateisi, Y.; Tsujii, J. (2003). "GENIA corpus--a semantically annotated corpus for bio-textmining". Bioinformatics. 19: i180–i182. doi:10.1093/bioinformatics/btg1023. PMID 12855455.
      • Hirschman, L.; Park, J. C.; Tsujii, J.; Wong, L.; Wu, C. H. (2002). "Accomplishments and challenges in literature data mining for biology". Bioinformatics. 18 (12): 1553–1561. doi:10.1093/bioinformatics/18.12.1553. PMID 12490438.
      • Torisawa, K.; Tsujii, J. I. (1996). Computing phrasal-signs in HPSG prior to parsing. 16th conference on Computational linguistics. Vol. 2. p. 949. doi:10.3115/993268.993332.

    Read more →
  • The Best Free AI Code-review Tool for Beginners

    The Best Free AI Code-review Tool for Beginners

    Curious about the best AI code-review tool? An AI code-review tool is software that uses machine learning to review source code and suggest changes. Hands-on testing shows that real-world results vary, so a short free trial is the most reliable way to decide. Whether you are a beginner or a pro, the right AI code-review tool should slot into your existing workflow. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Aporia (company)

    Aporia (company)

    Aporia is a machine learning observability platform based in Tel Aviv, Israel. The company has a US office in San Jose, California. Aporia has developed software that other companies use to monitor for otherwise undetected defects and failures, to detect and report anomalies, and to receive early warning of faults. == History == Aporia was founded in 2019 by Liran Hason and Alon Gubkin. In April 2021, the company raised a $5 million seed round for its monitoring platform for ML models. In February 2022, the company closed a Series A round of $25 million for its ML observability platform. In June 2022, Forbes named Aporia a "Next Billion-Dollar Company". That November, the company partnered with ClearML, an MLOps platform, to improve ML pipeline optimization. In January 2023, Aporia launched Direct Data Connectors (DDC), a technology intended to let organizations start monitoring their ML models in minutes; previously, integrating ML monitoring into a customer's cloud environment took weeks or more. DDC lets users connect Aporia to their preferred data source and monitor all of their data at once, without data sampling or data duplication (duplication being a significant security risk for large organizations). In April 2023, Aporia announced a partnership with Amazon Web Services (AWS) to provide ML observability to AWS customers by deploying Aporia's architecture in their AWS environments, so that customers can monitor their models in production regardless of platform.

    Read more →
  • A Comprehensive Grammar of the English Language

    A Comprehensive Grammar of the English Language

    A Comprehensive Grammar of the English Language is a descriptive grammar of English written by Randolph Quirk, Sidney Greenbaum, Geoffrey Leech, and Jan Svartvik. It was first published by Longman in 1985. In 1991, it was called "The greatest of contemporary grammars, because it is the most thorough and detailed we have," and "It is a grammar that transcends national boundaries." The book relies on elicitation experiments as well as three corpora: a corpus from the Survey of English Usage, the Lancaster-Oslo-Bergen Corpus (UK English), and the Brown Corpus (US English). == Reviews == In 1988, Rodney Huddleston published a very critical review. He wrote: "[T]here are some respects in which it is seriously flawed and disappointing. A number of quite basic categories and concepts do not seem to have been thought through with sufficient care; this results in a remarkable amount of unclarity and inconsistency in the analysis, and in the organization of the grammar." Aarts, F. G. A. M. (April 1988). "A Comprehensive Grammar of the English Language: The great tradition continued". English Studies. 69 (2): 163–173. doi:10.1080/00138388808598565.

    Read more →
  • Kunihiko Fukushima

    Kunihiko Fukushima

    Kunihiko Fukushima (Japanese: 福島 邦彦, born 16 March 1936) is a Japanese computer scientist, most noted for his work on artificial neural networks and deep learning. He currently works part-time as a senior research scientist at the Fuzzy Logic Systems Institute in Fukuoka, Japan. == Notable scientific achievements == In 1980, Fukushima published the neocognitron, the original deep convolutional neural network (CNN) architecture. Fukushima proposed several supervised and unsupervised learning algorithms to train the parameters of a deep neocognitron such that it could learn internal representations of incoming data. Today, however, the CNN architecture is usually trained through backpropagation. This approach is now heavily used in computer vision. In 1969, Fukushima introduced the ReLU (rectified linear unit) activation function, which he called an "analog threshold element", in the context of visual feature extraction in hierarchical neural networks. (The ReLU had, however, first been used by Alston Householder in 1941 as a mathematical abstraction of biological neural networks.) As of 2017, it was the most popular activation function for deep neural networks. == Education and career == In 1958, Fukushima received his Bachelor of Engineering in electronics from Kyoto University. He became a senior research scientist at the NHK Science & Technology Research Laboratories. In 1989, he joined the faculty of Osaka University. In 1999, he joined the faculty of the University of Electro-Communications. In 2001, he joined the faculty of Tokyo University of Technology. From 2006 to 2010, he was a visiting professor at Kansai University. Fukushima was the founding president of the Japanese Neural Network Society (JNNS) and president of the Asia-Pacific Neural Network Assembly (APNNA).
He was also a founding member of the board of governors of the International Neural Network Society (INNS), serving on the board in 1989–1990 and 1993–2005. == Awards == In 2020, Fukushima received the Bower Award and Prize for Achievement in Science. In 2022, Fukushima was named a laureate of the Asian Scientist 100 by Asian Scientist. He has also received the IEICE Achievement Award and Excellent Paper Awards, the IEEE Neural Networks Pioneer Award, the APNNA Outstanding Achievement Award, the JNNS Excellent Paper Award and the INNS Helmholtz Award.
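
The ReLU activation mentioned above is commonly written as f(x) = max(0, x): negative inputs are clipped to zero and non-negative inputs pass through unchanged. The following is a minimal Python sketch of that formula; the function names `relu` and `relu_layer` are chosen here for illustration and do not come from Fukushima's papers.

```python
from typing import List


def relu(x: float) -> float:
    """Rectified linear unit: return x if it is positive, otherwise 0."""
    return max(0.0, x)


def relu_layer(values: List[float]) -> List[float]:
    """Apply ReLU element-wise to a list of pre-activation values."""
    return [relu(v) for v in values]


if __name__ == "__main__":
    # Negative entries are zeroed; non-negative entries are kept as they are.
    print(relu_layer([-2.0, -0.5, 0.0, 1.5, 3.0]))
```

In practice the same operation is usually applied to whole arrays by a numerical library; the pure-Python version is only meant to make the definition concrete.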

    Read more →
  • Douwe Kiela

    Douwe Kiela

    Douwe Kiela is a Dutch-American research scientist and entrepreneur working in the field of artificial intelligence with a focus on machine learning and natural language processing. He is a research scientist director at Google DeepMind. He previously co-founded and served as CEO of Contextual AI, an enterprise software company that provides a platform for building grounded AI agents for enterprise knowledge bases. He previously led the research team at Meta AI that introduced the RAG approach in 2020, co-authoring the foundational paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Kiela also served as Head of Research at Hugging Face and is an adjunct professor in Symbolic Systems at Stanford University. == Early life and education == Douwe Kiela was born in Amsterdam, Netherlands, in 1986. He earned a Bachelor of Science degree in Liberal Arts and Sciences from Utrecht University, with a double major in Cognitive Artificial Intelligence and Philosophy. He then obtained an MSc in logic (cum laude) from the University of Amsterdam's Institute for Logic, Language and Computation (ILLC). Kiela received an MPhil and PhD in Computer Science from the University of Cambridge, specializing in natural language processing and machine learning. == Career == === Facebook AI Research (Meta) === In 2016, Kiela joined Facebook AI Research (FAIR) as a postdoctoral researcher, later becoming a research scientist in New York. While at Meta, he co-authored papers in natural language processing, with a focus on multimodal and grounded language learning. His projects included creating a virtual assistant bot that could navigate tourists around a city and leading the development of Dynabench, an interactive benchmarking platform released in 2020 that used human feedback to test and improve language models. 
In 2020, Kiela led the Meta AI research team that introduced Retrieval-Augmented Generation (RAG), co-authoring the influential paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," alongside Patrick Lewis, Ethan Perez, and other researchers. The RAG framework transformed how large language models access and incorporate external information by allowing them to retrieve relevant context from external knowledge bases at query time, rather than relying solely on pre-trained data. This approach addressed key limitations such as hallucination, outdated information, and lack of source attribution. The RAG technique has since become widely adopted in enterprise AI applications and knowledge-intensive natural language processing tasks. === Hugging Face === After leaving Meta, Kiela served as Head of Research at Hugging Face. === Contextual AI === In 2023, Kiela co-founded Contextual AI with Amanpreet Singh, another former researcher at Facebook AI Research and Hugging Face. The Mountain View-based company develops a platform for building grounded AI agents for enterprises, focusing on applications in technology, semiconductor, logistics, finance, and media sectors. Contextual AI raised $20 million in seed funding in June 2023, led by Bain Capital Ventures. In August 2024, the company completed an $80 million Series A funding round led by Greycroft, with participation from Bezos Expeditions, NVentures (Nvidia), HSBC Ventures, and Snowflake Ventures, among others. In May 2026, Kiela joined Google DeepMind as part of a licensing agreement between Google and Contextual AI under which more than 20 Contextual AI researchers joined DeepMind. Following his departure, Jay Chen became interim CEO of Contextual AI. === Academic roles === Douwe Kiela serves as an adjunct professor in Symbolic Systems at Stanford University. 
In a 2023 interview with the Stanford Daily, he commented on the development of Alpaca, a low-cost instruction-finetuned model based on Meta's LLaMA, and emphasized the importance of open academic research in large language models.
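
The retrieve-then-generate idea described above (fetching relevant context from an external knowledge base at query time and handing it to the language model) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the method of the RAG paper: the bag-of-words cosine retriever, the `KNOWLEDGE_BASE` contents, and the function names are all hypothetical, and the final generation step (passing the prompt to a language model) is deliberately left out.

```python
from collections import Counter
from math import sqrt
from typing import List

# Hypothetical three-document knowledge base, used only for this sketch.
KNOWLEDGE_BASE = [
    "The neocognitron was published by Fukushima in 1980.",
    "A confusion matrix summarizes classifier predictions.",
    "Aporia was founded in 2019.",
]


def bag_of_words(text: str) -> Counter:
    """Lower-case, whitespace-split term counts (a deliberately crude tokenizer)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing terms
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 1) -> str:
    """Prepend the retrieved context to the question, ready for a language model."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    print(build_prompt("when was the neocognitron published", KNOWLEDGE_BASE))
```

Real systems typically replace the word-count retriever with learned dense embeddings and a vector index, but the control flow (retrieve first, then condition generation on what was retrieved) is the same.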

    Read more →
  • Confusion matrix

    Confusion matrix

    In machine learning, a confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one. In unsupervised learning it is usually called a matching matrix. The term is used specifically in the problem of statistical classification. Each row of the matrix represents the instances in an actual class while each column represents the instances in a predicted class, or vice versa; both variants are found in the literature. The diagonal of the matrix therefore represents all instances that are correctly predicted. The name stems from the fact that it makes it easy to identify whether the system is confusing two classes (i.e., commonly mislabeling one class as another). The confusion matrix has its origins in human perceptual studies of auditory stimuli. It was adapted for machine learning studies and used by Frank Rosenblatt, among other early researchers, to compare human and machine classifications of visual (and later auditory) stimuli. It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table). == Example == Given a sample of 12 individuals, 8 who have been diagnosed with cancer and 4 who are cancer-free, where individuals with cancer belong to class 1 (positive) and non-cancer individuals belong to class 0 (negative), we can display that data as follows:

    Individual number:         1   2   3   4   5   6   7   8   9   10  11  12
    Actual classification:     1   1   1   1   1   1   1   1   0   0   0   0

    Assume that we have a classifier that distinguishes between individuals with and without cancer in some way. We can take the 12 individuals and run them through the classifier. The classifier then makes 9 accurate predictions and misses 3: 2 individuals with cancer wrongly predicted as being cancer-free (samples 1 and 2), and 1 person without cancer who is wrongly predicted to have cancer (sample 9).

    Individual number:         1   2   3   4   5   6   7   8   9   10  11  12
    Actual classification:     1   1   1   1   1   1   1   1   0   0   0   0
    Predicted classification:  0   0   1   1   1   1   1   1   1   0   0   0

Notice that if we compare the actual classification set to the predicted classification set, there are 4 different outcomes that could result in any particular column: The actual classification is positive and the predicted classification is positive (1,1). This is called a true positive result because the positive sample was correctly identified by the classifier. The actual classification is positive and the predicted classification is negative (1,0). This is called a false negative result because the positive sample is incorrectly identified by the classifier as being negative. The actual classification is negative and the predicted classification is positive (0,1). This is called a false positive result because the negative sample is incorrectly identified by the classifier as being positive. The actual classification is negative and the predicted classification is negative (0,0). This is called a true negative result because the negative sample is correctly identified by the classifier. We can then perform the comparison between actual and predicted classifications and add this information to the table:

    Individual number:         1   2   3   4   5   6   7   8   9   10  11  12
    Actual classification:     1   1   1   1   1   1   1   1   0   0   0   0
    Predicted classification:  0   0   1   1   1   1   1   1   1   0   0   0
    Result:                    FN  FN  TP  TP  TP  TP  TP  TP  FP  TN  TN  TN

The template for any binary confusion matrix uses the four kinds of results discussed above (true positives, false negatives, false positives, and true negatives) along with the positive and negative classifications. The four outcomes can be formulated in a 2×2 confusion matrix, as follows:

                            Predicted positive (PP)   Predicted negative (PN)
    Actual positive (P)     True positive (TP)        False negative (FN)
    Actual negative (N)     False positive (FP)       True negative (TN)

Now, we can simply total up each type of result, substitute into the template, and create a confusion matrix that will concisely summarize the results of testing the classifier:

                            Predicted cancer   Predicted non-cancer
    Actual cancer           6                  2
    Actual non-cancer       1                  3

In this confusion matrix, of the 8 samples with cancer, the system judged that 2 were cancer-free, and of the 4 samples without cancer, it predicted that 1 did have cancer.
All correct predictions are located on the diagonal of the table, so it is easy to visually inspect the table for prediction errors, as values outside the diagonal will represent them. By summing up the 2 rows of the confusion matrix, one can also deduce the total number of positive (P) and negative (N) samples in the original dataset, i.e. P = TP + FN and N = FP + TN. == Table of confusion == In predictive analytics, a table of confusion (sometimes also called a confusion matrix) is a table with two rows and two columns that reports the number of true positives, false negatives, false positives, and true negatives. This allows more detailed analysis than simply observing the proportion of correct classifications (accuracy). Accuracy will yield misleading results if the data set is unbalanced; that is, when the numbers of observations in different classes vary greatly. For example, if there were 95 cancer samples and only 5 non-cancer samples in the data, a particular classifier might classify all the observations as having cancer. The overall accuracy would be 95%, but in more detail the classifier would have a 100% recognition rate (sensitivity) for the cancer class but a 0% recognition rate for the non-cancer class. The F1 score is even more unreliable in such cases, and here would be about 97.4%, whereas informedness removes such bias and yields 0 as the probability of an informed decision for any form of guessing (here always guessing cancer). According to Davide Chicco and Giuseppe Jurman, the most informative metric to evaluate a confusion matrix is the Matthews correlation coefficient (MCC). Other metrics can be included in a confusion matrix, each with its own significance and use. Some researchers have argued that the confusion matrix, and the metrics derived from it, do not truly reflect a model's knowledge.
In particular, the confusion matrix cannot show whether correct predictions were reached through sound reasoning or merely by chance (a problem known in philosophy as epistemic luck). It also does not capture situations where the facts used to make a prediction later change or turn out to be wrong (defeasibility). This means that while the confusion matrix is a useful tool for measuring classification performance, it may give an incomplete picture of a model's true reliability. == Confusion matrices with more than two categories == The confusion matrices discussed above have only two conditions: positive and negative. The confusion matrix is not limited to binary classification, however, and can be used with multi-class classifiers as well. For example, the table below summarizes communication of a whistled language between two speakers, with zero values omitted for clarity. == Confusion matrices in multi-label and soft-label classification == Confusion matrices are not limited to single-label classification (where only one class is present) or hard-label settings (where classes are either fully present, 1, or absent, 0). They can also be extended to multi-label classification (where multiple classes can be predicted at once) and soft-label classification (where classes can be partially present). One such extension is the Transport-based Confusion Matrix (TCM), which builds on the theory of optimal transport and the principle of maximum entropy. TCM applies to single-label, multi-label, and soft-label settings. It retains the familiar structure of the standard confusion matrix: a square matrix sized by the number of classes, with diagonal entries indicating correct predictions and off-diagonal entries indicating confusion. In the single-label case, TCM is identical to the standard confusion matrix.
TCM follows the same reasoning as the standard confusion matrix: if class A is overestimated (its predicted value is greater than its label value) and class B is underestimated (its predicted value is less than its label value), A is considered confused with B, and the entry (B, A) is increased. If a class is both predicted and present, it is correctly identified, and the diagonal entry (A, A) increases. Optimal transport and maximum entropy are used to determine the extent to which these entries are updated. TCM enables clearer comparison between predictions and labels in complex classification tasks, while maintaining a consistent matrix format across settings.
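
The binary confusion matrix and the metrics discussed above can be reproduced with a short Python sketch. The per-sample label lists below assume that individuals 1 to 8 are the cancer cases and 9 to 12 the cancer-free ones, which is consistent with the errors the text reports for samples 1, 2 and 9; the function names are illustrative, and returning 0 for the MCC when its denominator is zero is a common convention adopted here, not something stated in the text.

```python
from math import sqrt
from typing import Dict, Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FN, FP, TN) for binary labels, where 1 is the positive class."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 1 and p == 0:
            fn += 1
        elif a == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def summary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, F1, informedness and MCC from the four counts."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1_denominator = 2 * tp + fp + fn
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * tp / f1_denominator if f1_denominator else 0.0,
        "informedness": sensitivity + specificity - 1,
        # Assumed convention: MCC is reported as 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
    }


# Assumed ordering: samples 1-8 have cancer (1), samples 9-12 do not (0).
ACTUAL = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
PREDICTED = [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    print(confusion_counts(ACTUAL, PREDICTED))  # counts for the 12-sample example
    # Unbalanced case from the text: 95 cancer samples, 5 non-cancer, all predicted cancer.
    print(summary_metrics(tp=95, fn=0, fp=5, tn=0))
```

For the 12-sample example the counts are TP = 6, FN = 2, FP = 1 and TN = 3. For the always-predict-cancer classifier the accuracy is 0.95 and the F1 score is 190/195 (about 0.974), while informedness is 0, which matches the point that accuracy and F1 can look strong on unbalanced data even when the classifier is uninformative.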

    Read more →
  • AI Pair Programmers: Free vs Paid (2026)

    AI Pair Programmers: Free vs Paid (2026)

    Trying to pick the best AI pair programmer? An AI pair programmer is software that uses machine learning to suggest and complete code as you write. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI pair programmer should slot into your existing workflow. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Rob Fergus

    Rob Fergus

    Rob Fergus is a British-American computer scientist working primarily in the fields of machine learning, deep learning, representation learning, and generative models. He is a professor of computer science at the Courant Institute of Mathematical Sciences at New York University (NYU) and a research scientist at DeepMind. In 2009, Fergus co-founded the Computational Intelligence, Learning, Vision, and Robotics (CILVR) Lab at NYU along with Yann LeCun. In 2013, he developed ZFNet together with M. D. Zeiler, his PhD student at NYU. In September 2013, Fergus co-founded Meta AI (then known as Facebook Artificial Intelligence Research, or FAIR) along with Yann LeCun. == Awards and recognition == Fergus has received the following awards: the NSF Faculty Early Career Development Program (CAREER) award, a Sloan Research Fellowship, and test-of-time awards at ECCV, CVPR and ICLR. == Notable PhD students == Matt Zeiler (Clarifai founder), Wojciech Zaremba (OpenAI co-founder), Denis Yarats (Perplexity co-founder), and Alex Rives (EvolutionaryScale co-founder; faculty at MIT).

    Read more →