AI Chat To Pdf

AI Chat To Pdf — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Adobe Presenter

    Adobe Presenter

    Adobe Presenter is eLearning software released by Adobe Systems available on the Microsoft Windows platform as a Microsoft PowerPoint plug-in, and on both Windows and OS X as the screencasting and video editing tool Adobe Presenter Video Express. It is mainly targeted towards learning professionals and trainers. In addition to recording one's computer desktop and speech, it also provides the option to add quizzes and track performance by integrating with learning management systems. Adobe Presenter was designed to replace the discontinued Adobe Ovation software, which had similar functions. == Predecessor == Adobe Ovation was originally released by Serious Magic. It converted PowerPoint slides into visual presentations with additional effects. Ovation included themes called PowerLooks that could add motion and polish the presentations. They were available in a variety of color variations complete with animated backgrounds and dynamic text effects. Ovation could make text with jagged edges more readable. TimeKeeper could be used to set the period of the presentation, and the PointPrompter scrolled down the notes. Ovation's development has been discontinued, nor does it support PowerPoint 2007. == Features == The main purpose of Adobe Presenter is to capture on-screen presentations and convert them into more interactive and engaging videos. Support is given to convert Microsoft PowerPoint 2010 and 2013 presentations into videos. It also allows for content authoring on PowerPoint and ActionScript 3, and offers integration with Adobe Captivate. Slide branching enables users to control slide navigation and titles and create complex slide branching to guide viewers through the content of the presentation. Video editing tools are also provided, and offer the ability to upload to video-sharing platforms such as YouTube, Vimeo and other sites. Multimedia features such as annotations, eLearning templates, actors, audio narration and drag-and-drop elements enrich users' presentations. Quizzes and surveys is another highlighted feature, which include generating question pools, importing questions from existing quizzes and in-course collaboration which allows presenters to receive feedback by allowing them to comment on specific content within a course or ask questions for more clarity. Presenters could opt to receive feedback from viewers through video analytics and create Experience API, SCORM and AICC-compliant content. Options to publish to Adobe Connect are provided. Other unique features include universal standards support, file size control, navigational restrictions among others.

    Read more →
  • Best AI Background Removers in 2026

    Best AI Background Removers in 2026

    Comparing the best AI background remover? An AI background remover is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • AI Voice Assistants Reviews: What Actually Works in 2026

    AI Voice Assistants Reviews: What Actually Works in 2026

    In search of the best AI voice assistant? An AI voice assistant is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI voice assistant slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Top 10 AI Text-to-image Tools Compared (2026)

    Top 10 AI Text-to-image Tools Compared (2026)

    Comparing the best AI text-to-image tool? An AI text-to-image tool is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI text-to-image tool slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • ChatGPT

    ChatGPT

    ChatGPT is a generative artificial intelligence chatbot developed by OpenAI. Originally released in November 2022, the product uses large language models—specifically generative pre-trained transformers (GPTs)—to generate text, speech, and images in response to user prompts. ChatGPT accelerated the AI boom, an ongoing period marked by rapid investment and public attention toward the field of artificial intelligence (AI). OpenAI operates the service on a freemium model. Users can interact with ChatGPT through text, audio, and image prompts. ChatGPT was quickly adopted, reaching 100 million monthly active users two months after its release and 900 million weekly active users in February 2026. It has been lauded for its potential to transform numerous professional fields, and has instigated public debate about the nature of creativity and the future of knowledge work. The chatbot has also been criticized for its limitations and potential for unethical use. It can generate plausible-sounding but incorrect or nonsensical answers, known as hallucinations. Biases in its training data have been reflected in its responses. The chatbot can facilitate academic dishonesty, generate misinformation, and create malicious code. The ethics of its development, particularly the use of copyrighted content as training data, have also drawn controversy. == Features == ChatGPT is a chatbot and AI assistant built on large language model (LLM) technology. It is designed to generate human-like text and can carry out a wide variety of tasks. These include, among many others, writing and debugging computer programs, composing music, scripts, fairy tales, and essays, answering questions (sometimes at a level exceeding that of an average human test-taker), and generating business concepts. ChatGPT is frequently used for translation and summarization tasks, and can simulate interactive environments such as a Linux terminal, a multi-user chat room, or simple text-based games such as tic-tac-toe. Users interact with ChatGPT through conversations which consist of text, audio, and image inputs and outputs. The user's inputs to these conversations are referred to as prompts. An optional "Memory" feature allows users to tell ChatGPT to memorize specific information. Another option allows ChatGPT to recall old conversations. GPT-based moderation classifiers are used to reduce the risk of harmful outputs being presented to users. In March 2023, OpenAI added support for plugins for ChatGPT. This includes both plugins made by OpenAI, such as web browsing and code interpretation, and external plugins from developers such as Expedia, OpenTable, and Zapier. From October to December 2024, ChatGPT Search was deployed. It allows ChatGPT to search the web in an attempt to make more accurate and up-to-date responses. It increased OpenAI's direct competition with major search engines. OpenAI allows businesses to tailor how their content appears in the ChatGPT Search results and influence what sources are used. In December 2024, OpenAI launched a new feature allowing users to call ChatGPT with a telephone for up to 15 minutes per month for free. In September 2025, OpenAI added a feature called Pulse, which generates a daily analysis of a user's chats and connected apps such as Gmail and Google Calendar. In October 2025, OpenAI launched ChatGPT Atlas, a browser integrating the ChatGPT assistant directly into web navigation, to compete with existing browsers such as Google Chrome. It has an additional feature called "agentic mode" that allows it to take online actions for the user. === Paid tier === ChatGPT was initially free to the public and remains free in a limited capacity. In February 2023, OpenAI launched a premium service, ChatGPT Plus, that costs US$20 per month. What was offered on the paid plan versus the free tier changed as OpenAI has continued to update ChatGPT, and a Pro tier at $200/mo was introduced in December 2024. The Pro launch coincided with the release of the o1 model. In August 2025, ChatGPT Go was offered in India for ₹399 per month. The plan has higher limits than the free version. === Mobile apps === In May-July 2023, OpenAI began offering ChatGPT iOS and Android apps. ChatGPT can also power Android's assistant. An app for Windows launched on the Microsoft Store on October 15, 2024. === Languages === OpenAI met Icelandic President Guðni Th. Jóhannesson in 2022. In 2023, OpenAI worked with a team of 40 Icelandic volunteers to fine-tune ChatGPT's Icelandic conversation skills as a part of Iceland's attempts to preserve the Icelandic language. ChatGPT (based on GPT-4) was better able to translate Japanese to English when compared to Bing, Bard, and DeepL Translator in 2023. In December 2023, the Albanian government decided to use ChatGPT for the rapid translation of European Union documents and the analysis of required changes needed for Albania's accession to the EU. Several studies have shown that ChatGPT can outperform Google Translate in some mainstream translation tasks. However, as of 2024, no machine translation services match human expert performance. In August 2024, a representative of the Asia Pacific wing of OpenAI made a visit to Taiwan, during which a demonstration of ChatGPT's Chinese abilities was made. ChatGPT's Mandarin Chinese abilities were lauded, but the ability of the AI to produce content in Mandarin Chinese in a Taiwanese accent was found to be "less than ideal" due to differences between mainland Mandarin Chinese and Taiwanese Mandarin. === GPT Store === In November 2023, OpenAI released GPT Builder, a tool allowing users to customize ChatGPT's behavior for a specific use case. The customized systems are referred to as GPTs. In January 2024, OpenAI launched the GPT Store, a marketplace for GPTs. At launch, OpenAI included more than 3 million GPTs created by GPT Builder users in the GPT Store. === ChatGPT Apps === In September 2025, OpenAI added support for Model Context Protocol (MCP) to ChatGPT apps. When enabled in developer mode, this allows for improved third-party access to ChatGPT tools and servers. === Deep Research === In February 2025, OpenAI released Deep Research, a feature that generates reports based on extensive web searches. It was initially based on the reasoning model o3 and took 5 to 30 minutes per report. === Images === In October 2023, OpenAI's image generation model DALL-E 3 was integrated into ChatGPT. The integration used ChatGPT to write prompts for DALL-E guided by conversations with users. In March 2025, OpenAI updated ChatGPT to generate images using GPT Image instead of DALL-E. One of the most significant improvements was in the generation of text within images, which is especially useful for branded content. However, this ability is noticeably worse in non-Latin alphabets. The model can also generate new images based on existing ones provided in the prompt. These images are generated with C2PA metadata, which can be used to verify that they are AI-generated. OpenAI has emplaced additional safeguards to prevent what the company deems to be harmful image generation. === Agents === In 2025, OpenAI added several features to make ChatGPT more agentic (capable of autonomously performing longer tasks). In January, Operator was released. It was capable of autonomously performing tasks through web browser interactions, including filling forms, placing online orders, scheduling appointments, and other browser-based tasks. It was controlling a software environment inside a virtual machine with limited internet connectivity and with safety restrictions. It struggled with complex user interfaces. In May 2025, OpenAI introduced an agent for coding named Codex. It is capable of writing software, answering codebase questions, running tests, and proposing pull requests. It is based on a fine-tuned version of OpenAI o3. It has two versions, one running in a virtual machine in the cloud, and one where the agent runs in the cloud, but performs actions on a local machine connected via API. In July 2025, OpenAI released ChatGPT agent, an AI agent that can perform multi-step tasks. Like Operator, it controls a virtual computer. It also inherits from Deep Research's ability to gather and summarize significant volumes of information. The user can interrupt tasks or provide additional instructions as needed. In September 2025, OpenAI partnered with Stripe, Inc. to release Agentic Commerce Protocol, enabling purchases through ChatGPT. At launch, the feature was limited to purchases on Etsy from US users with a payment method linked to their OpenAI account. OpenAI takes an undisclosed cut from the merchant's payment. === ChatGPT Health === On January 7, 2026, OpenAI introduced a feature called "ChatGPT Health", whereby ChatGPT can discuss the user's health in a way that is separate from other chats. The feature is not available for users in the United Kingdom, Switzerland, or the European Economic Area, and is available on a waitli

    Read more →
  • Maghi King

    Maghi King

    Margaret (Maghi) Daniel King is a retired British computational linguist known for her work on evaluating the quality of machine translation. She is an honorary professor in the Department of Translation Technology of the University of Geneva in Switzerland, and the former director of the Dalle Molle Institute for Semantic and Cognitive Studies at the University of Geneva. == Education and career == King read classics, Ancient History and Philosophy (Greats) at the University of Oxford, worked as a computer programmer, and became a lecturer in the Department of Computation at the University of Manchester Institute of Science and Technology. She moved to the Dalle Molle Institute for Semantic and Cognitive Studies (ISSCO) in 1974. In 1976, ISSCO became part of the University of Geneva, and she continued there, becoming ISSCO's director in 1978. She remained director until her retirement in 2006. == Recognition == King is a Fellow of the European Association for Artificial Intelligence (formerly ECCAI), elected in 1999.

    Read more →
  • Flex (lexical analyzer generator)

    Flex (lexical analyzer generator)

    Flex (fast lexical analyzer generator) is a free and open-source software alternative to lex. It is a computer program that generates lexical analyzers (also known as "scanners" or "lexers"). It is frequently used as the lex implementation together with Berkeley Yacc parser generator on BSD-derived operating systems (as both lex and yacc are part of POSIX), or together with GNU bison (a version of yacc) in BSD ports and in Linux distributions. Unlike Bison, flex is not part of the GNU Project and is not released under the GNU General Public License, although a manual for Flex was produced and published by the Free Software Foundation. == History == Flex was written in C around 1987 by Vern Paxson, with the help of many ideas and much inspiration from Van Jacobson. Original version by Jef Poskanzer. The fast table representation is a partial implementation of a design done by Van Jacobson. The implementation was done by Kevin Gong and Vern Paxson. == Example lexical analyzer == This is an example of a Flex scanner for the instructional programming language PL/0. The tokens recognized are: '+', '-', '', '/', '=', '(', ')', ',', ';', '.', ':=', '<', '<=', '<>', '>', '>='; numbers: 0-9 {0-9}; identifiers: a-zA-Z {a-zA-Z0-9} and keywords: begin, call, const, do, end, if, odd, procedure, then, var, while. == Internals == These programs perform character parsing and tokenizing via the use of a deterministic finite automaton (DFA). A DFA is a theoretical machine accepting regular languages, and is equivalent to read-only right moving Turing machines. The syntax is based on the use of regular expressions. See also nondeterministic finite automaton. == Issues == === Time complexity === A Flex lexical analyzer usually has time complexity O ( n ) {\displaystyle O(n)} in the length of the input. That is, it performs a constant number of operations for each input symbol. This constant is quite low: GCC generates 12 instructions for the DFA match loop. Note that the constant is independent of the length of the token, the length of the regular expression and the size of the DFA. However, using the REJECT macro in a scanner with the potential to match extremely long tokens can cause Flex to generate a scanner with non-linear performance. This feature is optional. In this case, the programmer has explicitly told Flex to "go back and try again" after it has already matched some input. This will cause the DFA to backtrack to find other accept states. The REJECT feature is not enabled by default, and because of its performance implications its use is discouraged in the Flex manual. === Reentrancy === By default the scanner generated by Flex is not reentrant. This can cause serious problems for programs that use the generated scanner from different threads. To overcome this issue there are options that Flex provides in order to achieve reentrancy. A detailed description of these options can be found in the Flex manual. === Usage under non-Unix environments === Normally the generated scanner contains references to the unistd.h header file, which is Unix specific. To avoid generating code that includes unistd.h, %option nounistd should be used. Another issue is the call to isatty (a Unix library function), which can be found in the generated code. The %option never-interactive forces flex to generate code that does not use isatty. === Using flex from other languages === Flex can only generate code for C and C++. To use the scanner code generated by flex from other languages a language binding tool such as SWIG can be used. === Unicode support === Flex is limited to matching 1-byte (8-bit) binary values and therefore does not support Unicode. RE/flex and other alternatives do support Unicode matching. == Flex++ == flex++ is a similar lexical scanner for C++ which is included as part of the flex package. The generated code does not depend on any runtime or external library except for a memory allocator (malloc or a user-supplied alternative) unless the input also depends on it. This can be useful in embedded and similar situations where traditional operating system or C runtime facilities may not be available. The flex++ generated C++ scanner includes the header file FlexLexer.h, which defines the interfaces of the two C++ generated classes.

    Read more →
  • Is an AI Background Remover Worth It in 2026?

    Is an AI Background Remover Worth It in 2026?

    Comparing the best AI background remover? An AI background remover is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Automated machine learning

    Automated machine learning

    Automated machine learning (AutoML) is the process of automating the tasks of applying machine learning to real-world problems. It is the combination of automation and ML. AutoML potentially includes every stage from beginning with a raw dataset to building a machine learning model ready for deployment. AutoML was proposed as an artificial intelligence-based solution to the growing challenge of applying machine learning. The high degree of automation in AutoML aims to allow non-experts to make use of machine learning models and techniques without requiring them to become experts in machine learning. Automating the process of applying machine learning end-to-end additionally offers the advantages of producing simpler solutions, faster creation of those solutions, and models that often outperform hand-designed models. Common techniques used in AutoML include hyperparameter optimization, meta-learning and neural architecture search. == Comparison to the standard approach == In a typical machine learning application, practitioners have a set of input data points to be used for training. The raw data may not be in a form that all algorithms can be applied to. To make the data amenable for machine learning, an expert may have to apply appropriate data pre-processing, feature engineering, feature extraction, and feature selection methods. After these steps, practitioners must then perform algorithm selection and hyperparameter optimization to maximize the predictive performance of their model. If deep learning is used, the architecture of the neural network must also be chosen manually by the machine learning expert. Each of these steps may be challenging, resulting in significant hurdles to using machine learning. AutoML aims to simplify these steps for non-experts, and to make it easier for them to use machine learning techniques correctly and effectively. AutoML plays an important role within the broader approach of automating data science, which also includes challenging tasks such as data engineering, data exploration and model interpretation and prediction. == Targets of automation == Automated machine learning can target various stages of the machine learning process. Steps to automate are: Data preparation and ingestion (from raw data and miscellaneous formats) Column type detection; e.g., Boolean, discrete numerical, continuous numerical, or text Column intent detection; e.g., target/label, stratification field, numerical feature, categorical text feature, or free text feature Task detection; e.g., binary classification, regression, clustering, or ranking Feature engineering Feature selection Feature extraction Meta-learning and transfer learning Detection and handling of skewed data and/or missing values Model selection - choosing which machine learning algorithm to use, often including multiple competing software implementations Ensembling - a form of consensus where using multiple models often gives better results than any single model Hyperparameter optimization of the learning algorithm and featurization Neural architecture search Pipeline selection under time, memory, and complexity constraints Selection of evaluation metrics and validation procedures Problem checking Leakage detection Misconfiguration detection Analysis of obtained results Creating user interfaces and visualizations == Challenges and Limitations == There are a number of key challenges being tackled around automated machine learning. A big issue surrounding the field is referred to as "development as a cottage industry". This phrase refers to the issue in machine learning where development relies on manual decisions and biases of experts. This is contrasted to the goal of machine learning which is to create systems that can learn and improve from their own usage and analysis of the data. Basically, it's the struggle between how much experts should get involved in the learning of the systems versus how much freedom they should be giving the machines. However, experts and developers must help create and guide these machines to prepare them for their own learning. To create this system, it requires labor intensive work with knowledge of machine learning algorithms and system design. Additionally, other challenges include meta-learning and computational resource allocation.

    Read more →
  • The Best Free AI Copywriting Tool for Beginners

    The Best Free AI Copywriting Tool for Beginners

    Curious about the best AI copywriting tool? An AI copywriting tool is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI copywriting tool slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Dilek Hakkani-Tür

    Dilek Hakkani-Tür

    Dilek Z. Hakkani-Tür is a Turkish-American computer scientist focusing on speech processing, speech recognition, and dialogue systems. She is a professor of computer science at the University of Illinois Urbana-Champaign. == Education and career == Hakkani-Tür is a 1994 graduate of Middle East Technical University in Ankara, Turkey. She continued her studies at Bilkent University, also in Ankara, where she earned a master's degree in 1996 and completed her Ph.D. in 2000. She worked as a researcher at AT&T Labs from 2001 to 2005, at the International Computer Science Institute from 2006 to 2010, at Microsoft Research from 2010 to 2016, at Google Research from 2016 to 2018, and at Amazon Alexa from 2018 to 2023. At Microsoft, she was in the team of scientists that built the first prototype of the Cortana virtual assistant. While working for Amazon Alexa, she also taught at the University of California, Santa Cruz as a distinguished visiting instructor. She joined the University of Illinois Urbana-Champaign faculty in 2023. She was editor-in-chief of IEEE/ACM Transactions on Audio, Speech and Language Processing from 2019 to 2021, and is president of the Special Interest Group on Discourse and Dialogue of the Association for Computational Linguistics for the 2023–2025 term. She has served as co-editor-in-chief of Transactions of the Association for Computational Linguistics since 2024. == Recognition == In 2014, Hakkani-Tür was elected as an IEEE Fellow "for contributions to spoken language processing", and as a Fellow of the International Speech Communication Association "for contributions to advancing the state-of-the-art in spoken language processing, especially for human/human and human/machine conversational understanding". In 2024, she was elected as a Fellow of the Association for Computational Linguistics for her contributions to spoken dialogue systems.

    Read more →
  • Best AI Voice Assistants in 2026

    Best AI Voice Assistants in 2026

    Trying to pick the best AI voice assistant? An AI voice assistant is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI voice assistant slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Facial age estimation

    Facial age estimation

    Facial age estimation is the use of artificial intelligence to estimate the age of a person based on their facial features. Computer vision techniques are used to analyse the facial features in the images of millions of people whose age is known and then deep learning is used to create an algorithm that tries to predict the age of an unknown person. The key use of the technology is to prevent access to age-restricted goods and services. Examples include restricting children from accessing internet pornography, checking that they meet a mandatory minimum age when registering for an account on social media, or preventing adults from accessing websites, online chat or games designed only for use by children. The technology is distinct from facial recognition systems as the software does not attempt to uniquely identify the individual. Researchers have applied neural networks for age estimation since at least 2010. == Evaluation == An ongoing study by the National Institute of Standards and Technology (NIST) entitled 'Face Analysis Technology Evaluation' seeks to establish the technical performance of prototype age estimation algorithms submitted by academic teams and software vendors including Brno University of Technology, Czech Technical University in Prague, Dermalog, IDEMIA, Incode Technologies Inc, Jumio, Nominder, Rank One Computing, Unissey and Yoti. == Public sector use == The UK government has explored using facial age estimation at the UK border as an alternative to bone X-rays and MRI scans when determining child status of asylum seekers. == Commercial use == Commercial users of facial age estimation include Instagram and OnlyFans. In January 2025, John Lewis & Partners announced that had started using the technology to check the age of people shopping for knives on its website, to comply with UK legislation to limit knife crime. In the UK, several supermarket chains have taken part in Home Office trials of the technology to automate the checking of a customer's age when buying age-restricted goods such as alcohol. UK legislation introduced in January 2025 mandates robust forms of age verification hosting adult content viewable in the UK by July 2025. Allowable methods include facial age estimation. == Criticism == Adam Schwartz, a lawyer for the Electronic Frontier Foundation, criticized the use of facial age estimation software, noting its inaccuracy especially in cases of minorities and women, as was found in NIST's 2024 report. Twenty organisations jointly under European Digital Rights called the practice a "systematic and invasive processing of young people's data" that risks discriminatory profiling.

    Read more →
  • AI Humanizers: Free vs Paid (2026)

    AI Humanizers: Free vs Paid (2026)

    Trying to pick the best AI humanizer? An AI humanizer is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI humanizer slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Law and Corpus Linguistics

    Law and Corpus Linguistics

    Law and corpus linguistics (LCL) is an academic sub-discipline that uses large databases of examples of language usage equipped with tools designed by linguists called corpora to better get at the meaning of words and phrases in legal texts (statutes, constitutions, contracts, etc.). Thus, LCL is the application of corpus linguistic tools, theories, and methodologies to issues of legal interpretation in much the same way law and economics is the application of economic tools, theories, and methodologies to various legal issues. == History == A 2005 law review article by Lawrence Solan noted in passing that corpus linguistics had potential for its application to interpreting legal texts. But the first systematic exploration and advocacy of applying the tools and methodologies of corpus linguistics to legal interpretive questions of law and corpus linguistics came in the fall of 2010, when the BYU Law Review published a note by Stephen Mouritsen, entitled The Dictionary is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning. The note argued that dictionaries are the primary linguistic tool used by judges to determine the plain or ordinary meaning of words and phrases, and highlighted the deficiencies of such an approach. In its stead, the note proposed using corpus linguistics. And the note would be later cited by Adam Liptak in a New York Times article on statutory construction. Law and corpus linguistics (LCL) gained greater legitimacy in July 2011 with the first judicial opinion in American history utilizing corpus linguistics to determine the meaning of a legal text: In re the Adoption of Baby E.Z. In a concurrence in part and in the judgment, Justice Thomas Lee wrote to put forth an alternative ground for the majority's holding—interpreting the phrase "custody determination" by using corpus linguistics. Justice Lee looked at 500 randomized sample sentences from the Corpus of Contemporary American English (COCA) and found that the most common sense of "custody" was in the context of divorce rather than adoption. Further, he found that "custody" is ten times more likely to co-occur (or collocate) with "divorce" than with "adoption". From that evidence Justice Lee concluded that he "would find that the custody proceedings covered by the Act are limited to proceedings resulting in the modifiable custody orders of a divorce", rather than the broader range of custody proceedings. Other jurisprudence and scholarship would follow. In a 2015 concurrence in State v. Rasabout, Justice Lee used a COCA search to determine that "discharge" when used with a firearm (or one of its synonyms) overwhelmingly referred to a single shot rather than emptying the entire magazine of the weapon. And in 2016, four of the five justices joined a footnote in a majority opinion by Justice Lee commending a party for using corpus linguistics in its briefing even though the Court found it unnecessary to resolve the related question. Finally, in 2016 the Michigan Supreme Court became the first court to use a linguist-designed corpus in a majority opinion (COCA), with both the majority and the dissent turning to COCA to determine the meaning of the word "information". In 2020, courts desiring to bolster the legal theory of original intent have sought the opportunity to undertake analyses of statutes utilizing corpus linguistics. In a Ninth Circuit Court of Appeals case, Jones v. Becerra (No. 20-56174), a case involving the Second Amendment and the constitutionality of a California statute which bans the sale of firearms to individuals under the age of 21, a Ninth Circuit panel requested that the parties address three questions: 1) “What is the original public meaning of the Second Amendment phrases: ‘A well regulated Militia’; ‘the right of the people’; and ‘shall not be infringed’? 2) How does the tool of corpus linguistics help inform the determination of the original public meaning of those Second Amendment phrases?” 3) How do the data yielded from corpus linguistics assist in the interpretation of the constitutionality of age-based restrictions under the Second Amendment? As to scholarship, in 2012, Mouritsen followed up his original work with an article in the Columbia Science and Technology Law Review, where he further refined and promoted the use of corpus-based methods for determining questions of legal ambiguity. Additionally, in 2016 two essays and an article on law and corpus linguistics were published. The Yale Law Journal Forum published Corpus Linguistics & Original Public Meaning: A New Tool to Make Originalism More Empirical. Written by Justice Lee and two co-authors, the essay urged originalists to turn to corpus linguistics to improve the rigor and accuracy of originalist scholarship. And in response, the Forum published an essay by Lawrence Solan (a Brooklyn Law professor with a PhD in linguistics), Can Corpus Linguistics Help Make Originalism Scientific? The Boston University Public Interest Law Journal published The Merciful Corpus: The Rule of Lenity, Ambiguity and Corpus Linguistics by Daniel Ortner. In the article Ortner applied corpus linguistics to determining whether sufficient ambiguity exists to trigger the rule of lenity in five Supreme Court cases. Looking forward, in 2017 two more articles are slated for publication. Lee Strang focuses on corpus linguistics and originalism in the U.C. Davis Law Review, and Lawrence Solan and Tammy Gales explore corpus linguistics in the context of finding ordinary meaning in statutory interpretation in the International Journal of Legal Discourse. Lawyers and journalists have also taken notice of corpus linguistics at it relates to the law. In 2010, Neal Goldfarb filed the first known brief in the Supreme Court using corpus linguistics (COCA) to determine whether the ordinary meaning of "personal" referred to corporations in the case FCC v. AT&T. The amicus brief looked at the top collocates (words that co-occur) of "personal" in COHA as well as BYU's Time Magazine Corpus. And writing for The Atlantic, Ben Zimmer took note of this new trend, referring to corpus linguistics in the courts as "Like Lexis on Steroids". On the academic front, in 2013 BYU Law School started the first class on law and corpus linguistics, co-taught by Mouritsen, Lee, and (now Dean) Gordon Smith. The class is currently in its fourth year. And in February 2016, BYU Law School hosted the inaugural conference on LCL, with over two dozen legal and linguistic scholars from around the country discussing and debating the next steps forward for the growing academic movement. The conference has been held regularly in subsequent years. At the 2016 conference BYU Law School announced its plans and progress on the Corpus of Founding Era American English (COFEA), a corpus that covers 1760–1799 and contains more than 120 million words have been collected from founding era letters, diaries, newspapers, non-fiction books, fiction, sermons, speeches, debates, legal cases, and other legal materials.

    Read more →