AI Content Repurposing Service

AI Content Repurposing Service — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Automated attendant

    Automated attendant

    In telephony, an automated attendant (also auto attendant, auto-attendant, autoattendant, automatic phone menus, AA, or virtual receptionist) allows callers to be automatically transferred to an extension without the intervention of an operator/receptionist. Many AAs will also offer a simple menu system ("for sales, press 1, for service, press 2," etc.). An auto attendant may also allow a caller to reach a live operator by dialing a number, usually "0". Typically the auto attendant is included in a business's phone system such as a PBX, but some services allow businesses to use an AA without such a system. Modern AA services (which now overlap with more complicated interactive voice response or IVR systems) can route calls to mobile phones, VoIP virtual phones, other AAs/IVRs, or other locations using traditional land-line phones or voice message machines. == Feature description == Telephone callers will recognize an automated attendant system as one that greets calls incoming to an organization with a recorded greeting of the form, "Thank you for calling .... If you know your party's extension, you may dial it any time during this message." Callers who have a touch-tone (DTMF) phone can dial an extension number or, in most cases, wait for operator ("attendant") assistance. Since the telephone network does not transmit the DC signals from rotary dial telephones (except for audible clicks), callers who have rotary dial phones have to wait for assistance. On a purely technical level it could be argued that an automated attendant is a very simple kind of IVR however, in the telecom industry the terms IVR and auto attendant are generally considered distinct. An automated attendant serves a very specific purpose (replace live operator and route calls), whereas an IVR can perform all sorts of functions (telephone banking, account inquiries, etc.). An AA will often include a directory which will allow a caller to dial by name in order to find a user on a system. There is no standard format to these directories, and they can use combinations of first name, last name, or both. The following lists common routing steps that are components of an automated attendant: Transfer to extension Transfer to voicemail Play message (i.e., "our address is ...") Go to a sub-menu Repeat choices In addition, an automated attendant would be expected to have values for the following: '0' – where to go when the caller dials '0' Timeout – what to do if the caller does nothing (usually go to the same place as '0') Default mailbox – where to send calls if '0' is not answered (or is not pointing to a live person) == Background == PBXs (private branch exchanges) or PABXs (private automatic branch exchanges) are telephone systems that serve an organization that has many telephone extensions but fewer telephone lines (sometimes called "trunks") that connect that organization to the rest of the global telecommunications network. While persons within an enterprise served by a PBX can call each other by dialing their extension numbers, incoming calls, i.e., calls originating from a telephone not served by the PBX but intended for a party served by the PBX, required assistance from a switchboard operator (also called a "switchboard attendant") or a telephone service called DID ("direct inward dialing"). Direct inward dialing has advantages such as rapid connection to the destination party and disadvantages including cost, lack of identification of the called organization and use of ten-digit telephone numbers. Automated attendants provide, among many other things, a way for an external caller to be directed to an extension or department served by a PBX system without using direct inward dialing or without switchboard attendant assistance. == History == Automated attendants are not part of voicemail systems. Voice messaging (or voicemail or VM) technology has existed since the late 1970s; in the early 1980s companies provided voice-prompting systems that allowed callers to reach (route the call) to an intended party, not necessarily to leave a message. Automated attendant systems are also referred to as automated menu systems and much early work in this field was done by Michael J. Freeman, Ph.D. == Time-based routing == Many auto attendants will have options to allow for time-of-day routing, as well as weekend and holiday routing. The specifics of these features will depend entirely on the particular automated attendant, but typically there would be a normal greeting and routing steps that would take place during normal business hours, and a different greeting and routing for non-business hours.

    Read more →
  • Sudip Roy (computer scientist)

    Sudip Roy (computer scientist)

    Sudip Roy is a computer scientist and technology executive. He is the co-founder and chief technology officer of Adaption. He has worked on large-scale machine learning systems at organizations including Google DeepMind and Cohere. == Education == Roy earned a PhD in Computer Science from Cornell University. He holds a B.Tech in Computer Science and Engineering from the Indian Institute of Technology (IIT), Kharagpur. == Career == Sudip worked at Google Brain (now part of Google DeepMind) on systems research and large-scale data management. During his tenure, he contributed to infrastructure projects including Pathways and TensorFlow Extended, which support training and inference workflows for production machine learning models. He later served as Senior Director of Engineering at Cohere, leading work on inference infrastructure and fine-tuning systems. In late 2025, he co-founded the company Adaption Labs with Sara Hooker. The company focuses on developing AI systems designed for continuous learning and adaptation. Roy’s research spans systems for AI and AI for systems, including work on optimizing system performance and compilers. His publications have appeared in conferences such as MLSys, NeurIPS, SIGMOD, and KDD. He has been a program committee member or reviewer for the conferences SIGMOD, VLDB, ICDE, and MLSys. == Awards == He is the recipient of the MLSys Outstanding Paper Award (2022) and the SIGMOD Best Paper Award (2011). He holds multiple patents in machine learning systems, including methods for learned graph optimizations and neural network-based device placement.

    Read more →
  • Aravind Joshi

    Aravind Joshi

    Aravind Krishna Joshi (August 5, 1929 – December 31, 2017) was the Henry Salvatori Professor of Computer and Cognitive Science in the computer science department of the University of Pennsylvania. Joshi defined the tree-adjoining grammar formalism which is often used in computational linguistics and natural language processing. Joshi studied at Pune University and the Indian Institute of Science, where he was awarded a BE in electrical engineering and a DIISc in communication engineering respectively. Joshi's graduate work was done in the electrical engineering department at the University of Pennsylvania, and he was awarded his PhD in 1960. He became a professor at Penn and was the co-founder and co-director of the Institute for Research in Cognitive Science. == Awards and recognitions == Guggenheim fellow, 1971–72 Fellow of the Institute of Electrical and Electronics Engineers (IEEE), 1976 Best Paper Award at the National Conference on Artificial Intelligence, 1987 Founding Fellow of the American Association for Artificial Intelligence (AAAI), 1990 IJCAI Award for Research Excellence, 1997 Fellow of the Association for Computing Machinery, 1998 Elected to the National Academy of Engineering, 1999 First to be awarded the Association for Computational Linguistics Lifetime Achievement Award at the 40th anniversary meeting of the ACL, 2002 Awarded the Rumelhart Prize, 2003 Benjamin Franklin Medal in Computer and Cognitive Science, 2005 Doctor honoris causa of mathematical and physical sciences, Charles University in Prague, October 30, 2013 S.-Y. Kuroda Prize of the SIG Mathematics of Language of the ACL, 2013 === Awarded history === On April 21, 2005, Joshi was awarded the Franklin Institute's Benjamin Franklin Medal in Computer and Cognitive Science. The Franklin Institute citation states that he was awarded the medal "for his fundamental contributions to our understanding of how language is represented in the mind, and for developing techniques that enable computers to process efficiently the wide range of human languages. These advances have led to new methods for computer translation."

    Read more →
  • Datacap

    Datacap

    Datacap (an IBM Company), a privately owned company, manufactures and sells computer software, and services. Datacap's first product, Paper Keyboard, was a "forms processing" product and shipped in 1989. In August 2010, IBM announced that it had acquired Datacap for an undisclosed amount. == Overview == Datacap sells products through a value-added distribution network worldwide. The software is classified as "enterprise software", meaning that it requires trained professionals to install and configure. Although the Company has focused on providing solutions for scanning paper documents, most recently Company materials have emphasized customer requirements to handle electronic documents ("eDocs"), documents being received into an organization electronically (usually email). Datacap claims that its software is unique because of the rules engine ("Rulerunner") used for processing inbound documents, including performing the image processing (deskew, noise removal, etc.), optical character recognition (OCR), intelligent character recognition (ICR), validations, and export-release formatting of extracted data to target ERP and line of business application.

    Read more →
  • Text Retrieval Conference

    Text Retrieval Conference

    The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (part of the office of the Director of National Intelligence), and began in 1992 as part of the TIPSTER Text program. Its purpose is to support and encourage research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies and to increase the speed of lab-to-product transfer of technology. TREC's evaluation protocols have improved many search technologies. A 2010 study estimated that "without TREC, U.S. Internet users would have spent up to 3.15 billion additional hours using web search engines between 1999 and 2009." Hal Varian the Chief Economist at Google wrote that "The TREC data revitalized research on information retrieval. Having a standard, widely available, and carefully constructed set of data laid the groundwork for further innovation in this field." Each track has a challenge wherein NIST provides participating groups with data sets and test problems. Depending on track, test problems might be questions, topics, or target extractable features. Uniform scoring is performed so the systems can be fairly evaluated. After evaluation of the results, a workshop provides a place for participants to collect together thoughts and ideas and present current and future research work.Text Retrieval Conference started in 1992, funded by DARPA (US Defense Advanced Research Project) and run by NIST. Its purpose was to support research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies. == Goals == Encourage retrieval search based on large text collections Increase communication among industry, academia, and government by creating an open forum for the exchange of research ideas Speed the transfer of technology from research labs into commercial products by demonstrating substantial improvements retrieval methodologies on real world problems To increase the availability of appropriate evaluation techniques for use by industry and academia including development of new evaluation techniques more applicable to current systems TREC is overseen by a program committee consisting of representatives from government, industry, and academia. For each TREC, NIST provide a set of documents and questions. Participants run their own retrieval system on the data and return to NIST a list of retrieved top-ranked documents. NIST pools the individual result judges the retrieved documents for correctness and evaluates the results. The TREC cycle ends with a workshop that is a forum for participants to share their experiences. == Relevance judgments in TREC == TREC defines relevance as: "If you were writing a report on the subject of the topic and would use the information contained in the document in the report, then the document is relevant." Most TREC retrieval tasks use binary relevance: a document is either relevant or not relevant. Some TREC tasks use graded relevance, capturing multiple degrees of relevance. Most TREC collections are too large to perform complete relevance assessment; for these collections it is impossible to calculate the absolute recall for each query. To decide which documents to assess, TREC usually uses a method call pooling. In this method, the top-ranked n documents from each contributing run are aggregated, and the resulting document set is judged completely. == Various TRECs == In 1992 TREC-1 was held at NIST. The first conference attracted 28 groups of researchers from academia and industry. It demonstrated a wide range of different approaches to the retrieval of text from large document collections .Finally TREC1 revealed the facts that automatic construction of queries from natural language query statements seems to work. Techniques based on natural language processing were no better no worse than those based on vector or probabilistic approach. TREC2 Took place in August 1993. 31 group of researchers participated in this. Two types of retrieval were examined. Retrieval using an ‘ad hoc’ query and retrieval using a ‘routing' query In TREC-3 a small group experiments worked with Spanish language collection and others dealt with interactive query formulation in multiple databases TREC-4 they made even shorter to investigate the problems with very short user statements TREC-5 includes both short and long versions of the topics with the goal of carrying out deeper investigation into which types of techniques work well on various lengths of topics In TREC-6 Three new tracks speech, cross language, high precision information retrieval were introduced. The goal of cross language information retrieval is to facilitate research on system that are able to retrieve relevant document regardless of language of the source document TREC-7 contained seven tracks out of which two were new Query track and very large corpus track. The goal of the query track was to create a large query collection TREC-8 contain seven tracks out of which two –question answering and web tracks were new. The objective of QA query is to explore the possibilities of providing answers to specific natural language queries TREC-9 Includes seven tracks In TREC-10 Video tracks introduced Video tracks design to promote research in content based retrieval from digital video In TREC-11 Novelty tracks introduced. The goal of novelty track is to investigate systems abilities to locate relevant and new information within the ranked set of documents returned by a traditional document retrieval system TREC-12 held in 2003 added three new tracks; Genome track, robust retrieval track, HARD (Highly Accurate Retrieval from Documents) == Tracks == === Current tracks === New tracks are added as new research needs are identified, this list is current for TREC 2018. CENTRE Track – Goal: run in parallel CLEF 2018, NTCIR-14, TREC 2018 to develop and tune an IR reproducibility evaluation protocol (new track for 2018). Common Core Track – Goal: an ad hoc search task over news documents. Complex Answer Retrieval (CAR) – Goal: to develop systems capable of answering complex information needs by collating information from an entire corpus. Incident Streams Track – Goal: to research technologies to automatically process social media streams during emergency situations (new track for TREC 2018). The News Track – Goal: partnership with The Washington Post to develop test collections in news environment (new for 2018). Precision Medicine Track – Goal: a specialization of the Clinical Decision Support track to focus on linking oncology patient data to clinical trials. Real-Time Summarization Track (RTS) – Goal: to explore techniques for real-time update summaries from social media streams. === Past tracks === Chemical Track – Goal: to develop and evaluate technology for large scale search in chemistry-related documents, including academic papers and patents, to better meet the needs of professional searchers, and specifically patent searchers and chemists. Clinical Decision Support Track – Goal: to investigate techniques for linking medical cases to information relevant for patient care Contextual Suggestion Track – Goal: to investigate search techniques for complex information needs that are highly dependent on context and user interests. Crowdsourcing Track – Goal: to provide a collaborative venue for exploring crowdsourcing methods both for evaluating search and for performing search tasks. Genomics Track – Goal: to study the retrieval of genomic data, not just gene sequences but also supporting documentation such as research papers, lab reports, etc. Last ran on TREC 2007. Dynamic Domain Track – Goal: to investigate domain-specific search algorithms that adapt to the dynamic information needs of professional users as they explore in complex domains. Enterprise Track – Goal: to study search over the data of an organization to complete some task. Last ran on TREC 2008. Entity Track – Goal: to perform entity-related search on Web data. These search tasks (such as finding entities and properties of entities) address common information needs that are not that well modeled as ad hoc document search. Cross-Language Track – Goal: to investigate the ability of retrieval systems to find documents topically regardless of source language. After 1999, this track spun off into CLEF. FedWeb Track – Goal: to select best resources to forward a query to, and merge the results so that most relevant are on the top. Federated Web Search Track – Goal: to investigate techniques for the selection and combination of search results from a large number of real on-line web search services. Filtering Track – Goal: to binarily decide retrieval of new

    Read more →
  • AI Art Generators: Free vs Paid (2026)

    AI Art Generators: Free vs Paid (2026)

    In search of the best AI art generator? An AI art generator is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI art generator slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Roni Rosenfeld

    Roni Rosenfeld

    Roni Rosenfeld (Hebrew: רוני רוזנפלד) is an Israeli-American computer scientist and computational epidemiologist, currently serving as the head of the Machine Learning Department at Carnegie Mellon University. He is an international expert in machine learning, infectious disease forecasting, statistical language modeling and artificial intelligence. == Education == Rosenfeld received his B.Sc. in mathematics and physics from Tel Aviv University in 1985. He received his Ph.D. in computer science from Carnegie Mellon University in 1994. While a graduate student, he developed and open-sourced a statistical language-modeling toolkit to allow anyone to create statistical language models from their own corpora and experiment with and extend the toolkit's capabilities. The toolkit has been used by more than 100 NLP laboratories in more than 20 countries. Rosenfeld's Ph.D. thesis, A Maximum Entropy Approach to Adaptive Statistical Language Modeling, was advised by Raj Reddy and Xuedong Huang and won the 2001 Computer, Speech and Language award for "Most Influential Paper in the Last 5 Years." == Career == Shortly after receiving his Ph.D., Rosenfeld joined the faculty of the Carnegie Mellon School of Computer Science as an assistant professor. He was promoted to the rank of associate professor in 1999 and received tenure in 2001. In 2005 he was promoted to professor of language technologies, machine learning computer science and computational biology in the School of Computer Science at Carnegie Mellon University. Rosenfeld also holds adjunct appointments at the University of Pittsburgh School of Medicine, department of computational and systems biology. From 2002 to 2003, Rosenfeld was a visiting professor at the University of Hong Kong. Rosenfeld is the director of Carnegie Mellon's Machine Learning for Social Good (ML4SG) program. He has held educational leadership positions in a variety of programs, including the M.S. in computational finance (1997–1999), graduate computational and statistical learning (2001–2003), M.S. in machine learning (2017) and undergraduate minor in machine learning. Rosenfeld was appointed Head of Carnegie Mellon's Machine Learning Department in 2018. == Research == Rosenfeld's research interests include epidemiological forecasting, information and communication technologies for development (ICT4D), and machine learning for social good. === Epidemiological forecasting === Rosenfeld is a world expert in epidemiological forecasting. He founded and directs the Delphi research group, which has won most of the epidemiological forecasting challenges organized by the U.S. CDC and other U.S. government agencies. In December 2016, the CDC named his group the "Most Accurate Forecaster" for 2015–2016, and in October 2017, the Delphi group's two systems took the top two spots in the 2016-2017 flu forecasting challenge. The CDC recognized Rosenfeld's Delphi group at Carnegie Mellon University as having contributed the most accurate national-, regional-, and state-level influenza-like illness forecasts and national-level hospitalization forecasts to the site. In 2019, the CDC recognized forecasts provided by the Delphi group at Carnegie Mellon as having been the most accurate for five seasons in a row, and named the Delphi group an Influenza Forecasting Center of Excellence, a five-year designation that includes $3 million in research funding. Rosenfeld describes his forecasting research goal as "to make epidemiological forecasting as universally accepted and useful as weather forecasting is today." His recent work in the area has focused on selecting high value epidemiological forecasting targets (e.g. Influenza and Dengue); creating baseline forecasting methods for them; establishing metrics for measuring and tracking forecasting accuracy; estimating the limits of forecastability for each target; and identifying new sources of data that could be helpful to the forecasting goal. == Honors and awards == 2017 Joel and Ruth Spira Teaching Award 2017 CDC Influenza Forecasting Challenge "Most Accurate Forecaster" 1992 Allen Newell Medal for Research Excellence

    Read more →
  • Best AI Headshot Generators in 2026

    Best AI Headshot Generators in 2026

    In search of the best AI headshot generator? An AI headshot generator is software that uses machine learning to help you get more done — it turns a rough idea into a polished result in seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI headshot generator slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Structured sparsity regularization

    Structured sparsity regularization

    Structured sparsity regularization is a class of methods, and an area of research in statistical learning theory, that extend and generalize sparsity regularization learning methods. Both sparsity and structured sparsity regularization methods seek to exploit the assumption that the output variable Y {\displaystyle Y} (i.e., response, or dependent variable) to be learned can be described by a reduced number of variables in the input space X {\displaystyle X} (i.e., the domain, space of features or explanatory variables). Sparsity regularization methods focus on selecting the input variables that best describe the output. Structured sparsity regularization methods generalize and extend sparsity regularization methods, by allowing for optimal selection over structures like groups or networks of input variables in X {\displaystyle X} . Common motivation for the use of structured sparsity methods are model interpretability, high-dimensional learning (where dimensionality of X {\displaystyle X} may be higher than the number of observations n {\displaystyle n} ), and reduction of computational complexity. Moreover, structured sparsity methods allow to incorporate prior assumptions on the structure of the input variables, such as overlapping groups, non-overlapping groups, and acyclic graphs. Examples of uses of structured sparsity methods include face recognition, magnetic resonance image (MRI) processing, socio-linguistic analysis in natural language processing, and analysis of genetic expression in breast cancer. == Definition and related concepts == === Sparsity regularization === Consider the linear kernel regularized empirical risk minimization problem with a loss function V ( y i , f ( x ) ) {\displaystyle V(y_{i},f(x))} and the ℓ 0 {\displaystyle \ell _{0}} "norm" as the regularization penalty: min w ∈ R d 1 n ∑ i = 1 n V ( y i , ⟨ w , x i ⟩ ) + λ ‖ w ‖ 0 , {\displaystyle \min _{w\in \mathbb {R} ^{d}}{\frac {1}{n}}\sum _{i=1}^{n}V(y_{i},\langle w,x_{i}\rangle )+\lambda \|w\|_{0},} where x , w ∈ R d {\displaystyle x,w\in \mathbb {R^{d}} } , and ‖ w ‖ 0 {\displaystyle \|w\|_{0}} denotes the ℓ 0 {\displaystyle \ell _{0}} "norm", defined as the number of nonzero entries of the vector w {\displaystyle w} . f ( x ) = ⟨ w , x i ⟩ {\displaystyle f(x)=\langle w,x_{i}\rangle } is said to be sparse if ‖ w ‖ 0 = s < d {\displaystyle \|w\|_{0}=s 0 {\displaystyle w_{j}>0} . However, as in this case groups may overlap, we take the intersection of the complements of those groups that are not set to zero. This intersection of complements selection criteria implies the modeling choice that we allow some coefficients within a particular group g {\displaystyle g} to be set to zero, while others within the same group g {\displaystyle g} may remain positive. In other words, coefficients within a group may differ depending on the several group memberships that each variable within the group may have. ==== Union of groups: latent group Lasso ==== A different approach is to consider union of groups for variable selection. This approach captures the modeling situation where variables can be selected as long as they belong at least to one group with positive coefficients. This modeling perspective implies that we want to preserve group structure. The formulation of the union of groups approach is also referred to as latent group Lasso, and requires to modify the group ℓ 2 {\displaystyle \ell _{2}} norm considered above and introduce the following regularizer R ( w ) = i n f { ∑ g ‖ w g ‖ g : w = ∑ g = 1 G w ¯ g } {\displaystyle R(w)=inf\left\{\sum _{g}\|w_{g}\|_{g}:w=\sum _{g=1}^{G}{\bar {w}}_{g}\right\}} where w ∈ R d {\displaystyle w\in {\mathbb {R^{d}} }} , w g ∈ G g {\displaystyle w_{g}\in G_{g}} is the vector of coefficients of group g, and w ¯ g ∈ R d {\displaystyle {\bar {w}}_{g}\in {\mathbb {R^{d}} }} is a vector with coefficients w g j {\displaystyle w_{g}^{j}} for all variables j {

    Read more →
  • Heikki Mannila

    Heikki Mannila

    Heikki Olavi Mannila (born 4 January 1960 in Espoo) is a Finnish computer scientist, the president of the Academy of Finland. Mannila earned his Ph.D. in 1985 from the University of Helsinki under the supervision of Esko Ukkonen and for many years he was a professor at the University of Helsinki himself. From 2004 to 2008 he was Academy Professor at the Academy of Finland. He became Vice President for Academic Affairs at Aalto University in 2009, and was appointed by the Finnish government as president of the Academy of Finland for a term lasting from 2012 to 2017. The appointment was renewed for the period 2017–2022. Mannila is known for his research in data mining, and has published highly cited papers on association rule learning and sequence mining. With David Hand and Padhraic Smyth, he is the co-author of the book Principles of Data Mining (MIT Press, 2001). Heikki Mannila is son to the professor Elina Haavio-Mannila.

    Read more →
  • Barbara Di Eugenio

    Barbara Di Eugenio

    Barbara Di Eugenio is an Italian-American computer scientist, the Collegiate Warren S. McCulloch Professor of Computer Science at the University of Illinois Chicago. Her research focuses on natural language processing and its applications to human–computer interaction, educational technology, and artificial intelligence in healthcare. == Education and career == Di Eugenio is originally from Turin. After an undergraduate education in Italy, she completed her Ph.D. in computer and information science in 1993 at the University of Pennsylvania. Her dissertation, Understanding Natural Language Instructions: A Computational Approach to Purpose Clauses, was supervised by Bonnie Webber. She became a faculty member at the University of Illinois Chicago in 1999, and at that time was the only woman faculty member in the Department of Electrical Engineering and Computer Science. == Recognition == In 2022, Di Eugenio received the Zenith Award of the Association for Women in Science. She was named as a Fellow of the Association for Computational Linguistics in 2023, "for outstanding contributions to natural language generation; intelligent tutoring systems; discourse; intercoder agreement; and applying multimodal interactive systems to health".

    Read more →
  • Self-verifying finite automaton

    Self-verifying finite automaton

    In automata theory, a self-verifying finite automaton (SVFA) is a special kind of a nondeterministic finite automaton (NFA) with a symmetric kind of nondeterminism introduced by Hromkovič and Schnitger. Generally, in self-verifying nondeterminism, each computation path is concluded with any of the three possible answers: yes, no, and I do not know. For each input string, no two paths may give contradictory answers, namely both answers yes and no on the same input are not possible. At least one path must give answer yes or no, and if it is yes then the string is considered accepted. SVFA accept the same class of languages as deterministic finite automata (DFA) and NFA but have different state complexity. == Formal definition == An SVFA is represented formally by a 6-tuple, A=(Q, Σ, Δ, q0, Fa, Fr) such that (Q, Σ, Δ, q0, Fa) is an NFA, and Fa, Fr are disjoint subsets of Q. For each word w = a1a2 … an, a computation is a sequence of states r0,r1, …, rn, in Q with the following conditions: r0 = q0 ri+1 ∈ Δ(ri, ai+1), for i = 0, …, n−1. If rn ∈ Fa then the computation is accepting, and if rn ∈ Fr then the computation is rejecting. There is a requirement that for each w there is at least one accepting computation or at least one rejecting computation but not both. == Results == Each DFA is a SVFA, but not vice versa. Jirásková and Pighizzini proved that for every SVFA of n states, there exists an equivalent DFA of g ( n ) = Θ ( 3 n / 3 ) {\displaystyle g(n)=\Theta (3^{n/3})} states. Furthermore, for each positive integer n, there exists an n-state SVFA such that the minimal equivalent DFA has exactly g ( n ) {\displaystyle g(n)} states. Other results on the state complexity of SVFA were obtained by Jirásková and her colleagues.

    Read more →
  • Speech recognition

    Speech recognition

    Speech recognition (automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT)) is a sub-field of computational linguistics concerned with methods and technologies that translate spoken language into text or other interpretable forms. Speech recognition applications include voice user interfaces, where the user speaks to a device, which "listens" and processes the audio. Common voice applications include interpreting commands for calling, call routing, home automation, and aircraft control. These applications are called direct voice input. Productivity applications include searching audio recordings, creating transcripts, and dictation. Speech recognition can be used to analyse speaker characteristics, such as identifying native language using pronunciation assessment. Voice recognition (speaker identification) refers to identifying the speaker, rather than speech contents. Recognizing the speaker can simplify the task of translating speech in systems trained on a specific person's voice. It can also be used to authenticate the speaker as part of a security process. == History == Applications for speech recognition developed over many decades, with progress accelerated due to advances in deep learning and the use of big data. These advances are reflected in an increase in academic papers, and greater system adoption. Key areas of growth include vocabulary size, more accurate recognition for unfamiliar speakers (speaker independence), and faster processing speed. === Pre-1970 === 1952 – Bell Labs researchers, Stephen Balashek, R. Biddulph, and K. H. Davis, built Audrey for single-speaker digit recognition. Their system located the formants in the power spectrum of each utterance. 1960 – Gunnar Fant developed and published the source–filter model of speech production. 1962 – IBM's 16-word "Shoebox" machine's speech recognition debuted at the 1962 World's Fair. 1966 – Linear predictive coding, a speech coding method, was proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone. 1969 – Funding at Bell Labs came to a halt for several years after the company's head engineer, John R. Pierce, wrote an open letter criticizing speech recognition research. This defunding lasted until Pierce retired and James L. Flanagan took over. Raj Reddy was the first person to work on continuous speech recognition, as a graduate student at Stanford University in the late 1960s. Previous systems required users to pause after each word. Reddy's system issued spoken commands for playing chess. Around this time, Soviet researchers invented the dynamic time warping (DTW) algorithm and used it to create a recognizer capable of operating on a 200-word vocabulary. DTW processed speech by dividing it into short frames (e.g. 10 ms segments) and treating each frame as a unit. Speaker independence, however, remained unsolved. === 1970–1990 === 1971 – DARPA funded a five-year speech recognition research project, Speech Understanding Research, seeking a minimum vocabulary size of 1,000 words. The project considered speech understanding a key to achieving progress in speech recognition, which was later disproved. BBN, IBM, Carnegie Mellon (CMU), and Stanford Research Institute participated. 1972 – The IEEE Acoustics, Speech, and Signal Processing group held a conference in Newton, Massachusetts. 1976 – The first ICASSP was held in Philadelphia, which became a major venue for publishing on speech recognition. During the late 1960s, Leonard Baum developed the mathematics of Markov chains at the Institute for Defense Analysis. A decade later, at CMU, Raj Reddy's students James Baker and Janet M. Baker began using the hidden Markov model (HMM) for speech recognition. James Baker had learned about HMMs while at the Institute for Defense Analysis. HMMs enabled researchers to combine sources of knowledge, such as acoustics, language, and syntax, in a unified probabilistic model. By the mid-1980s, Fred Jelinek's team at IBM created a voice-activated typewriter called Tangora, which could handle a 20,000-word vocabulary. Jelinek's statistical approach placed less emphasis on emulating human brain processes in favor of statistical modelling. (Jelinek's group independently discovered the application of HMMs to speech.) This was controversial among linguists since HMMs are too simplistic to account for many features of human languages. However, the HMM proved to be a highly useful way for modelling speech and replaced dynamic time warping as the dominant speech recognition algorithm in the 1980s. 1982 – Dragon Systems, founded by James and Janet M. Baker, was one of IBM's few competitors. === Practical speech recognition === The 1980s also saw the introduction of the n-gram language model. 1987 – The back-off model enabled language models to use multiple-length n-grams, and CSELT used HMM to recognize languages (in software and hardware, e.g. RIPAC). At the end of the DARPA program in 1976, the best computer available to researchers was the PDP-10 with 4 MB of RAM. It could take up to 100 minutes to decode 30 seconds of speech. Practical products included: 1984 – the Apricot Portable was released with up to 4096 words support, of which only 64 could be held in RAM at a time. 1987 – a recognizer from Kurzweil Applied Intelligence 1990 – Dragon Dictate, a consumer product released in 1990. AT&T deployed the Voice Recognition Call Processing service in 1992 to route telephone calls without a human operator. The technology was developed by Lawrence Rabiner and others at Bell Labs. By the early 1990s, the vocabulary of the typical commercial speech recognition system had exceeded the average human vocabulary. Reddy's former student, Xuedong Huang, developed the Sphinx-II system at CMU. Sphinx-II was the first to do speaker-independent, large vocabulary, continuous speech recognition, and it won DARPA's 1992 evaluation. Handling continuous speech with a large vocabulary was a major milestone. Huang later founded the speech recognition group at Microsoft in 1993. Reddy's student Kai-Fu Lee joined Apple, where, in 1992, he helped develop the Casper speech interface prototype. Lernout & Hauspie, a Belgium-based speech recognition company, acquired other companies, including Kurzweil Applied Intelligence in 1997 and Dragon Systems in 2000. L&H was used in Windows XP. L&H was an industry leader until an accounting scandal destroyed it in 2001. L&H speech technology was bought by ScanSoft, which became Nuance in 2005. Apple licensed Nuance software for its digital assistant Siri. ==== 2000s ==== In the 2000s, DARPA sponsored two speech recognition programs: Effective Affordable Reusable Speech-to-Text (EARS) in 2002, followed by Global Autonomous Language Exploitation (GALE) in 2005. Four teams participated in EARS: IBM; a team led by BBN with LIMSI and the University of Pittsburgh; Cambridge University; and a team composed of ICSI, SRI, and the University of Washington. EARS funded the collection of the Switchboard telephone speech corpus, which contained 260 hours of recorded conversations from over 500 speakers. The GALE program focused on Arabic and Mandarin broadcast news. Google's first effort at speech recognition came in 2007 after recruiting Nuance researchers. Its first product, GOOG-411, was a telephone-based directory service. Since at least 2006, the U.S. National Security Agency has employed keyword spotting, allowing analysts to index large volumes of recorded conversations and identify speech containing "interesting" keywords. Other government research programs focused on intelligence applications, such as DARPA's EARS program and IARPA's Babel program. In the early 2000s, speech recognition was dominated by hidden Markov models combined with feed-forward artificial neural networks (ANN). Later, speech recognition was taken over by long short-term memory (LSTM), a recurrent neural network (RNN) published by Sepp Hochreiter & Jürgen Schmidhuber in 1997. LSTM RNNs avoid the vanishing gradient problem and can learn "Very Deep Learning" tasks that require memories of events that happened thousands of discrete time steps earlier, which is important for speech. Around 2007, LSTMs trained with Connectionist Temporal Classification (CTC) began to outperform. In 2015, Google reported a 49 percent error‑rate reduction in its speech recognition via CTC‑trained LSTM. Transformers, a type of neural network based solely on attention, were adopted in computer vision and language modelling, and then to speech recognition. Deep feed-forward (non-recurrent) networks for acoustic modelling were introduced in 2009 by Geoffrey Hinton and his students at the University of Toronto, and by Li Deng and colleagues at Microsoft Research. In contrast to the prioer incremental improvements, deep learning decreased error rates by 30%. Both shallow and deep forms (e.g., recurrent nets) of ANNs had been explored since the 1980s. Howev

    Read more →
  • Language and Computers

    Language and Computers

    Language and Computers: Studies in Practical Linguistics (ISSN 0921-5034) is a book series on corpus linguistics and related areas. As studies in linguistics, volumes in the series have, by definition, their foundations in linguistic theory; however, they are not concerned with theory for theory's sake, but always with a definite direct or indirect interest in the possibilities of practical application in the dynamic area where language and computers meet. The book series was founded in 1988, and is published by Brill|Rodopi. == Editors == Christian Mair Charles F. Meyer == Volumes == Volumes include: # 77. English Corpus Linguistics: Variation in Time, Space and Genre. Selected papers from ICAME 32., Edited by Gisle Andersen and Kristin Bech. ISBN 978-90-420-3679-6 E-ISBN 978-94-012-0940-3 # 76. English Corpus Linguistics: Crossing Paths., Edited by Merja Kytö. ISBN 978-90-420-3518-8 E-ISBN 978-94-012-0793-5 # 75. Corpus Linguistics and Variation in English.Theory and Description., Edited by Joybrato Mukherjee and Magnus Huber. ISBN 978-90-420-3495-2 E-ISBN 978-94-012-0771-3 # 74. English Corpus Linguistics: Looking back, Moving forward. Papers from the 30th International Conference on English Language Research on Computerized Corpora (ICAME 30), Lancaster, UK, 27–31 May 2009., Edited by Sebastian Hoffmann, Paul Rayson and Geoffrey Leech. ISBN 978-90-420-3466-2 E-ISBN 978-94-012-0747-8 #73. Corpus-based Studies in Language Use, Language Learning, and Language Documentation., Edited by John Newman, Harald Baayen and Sally Rice. ISBN 978-90-420-3401-3 E-ISBN 978-94-012-0688-4 #72. The Progressive in Modern English. A Corpus-Based Study of Grammaticalization and Related Changes., by Svenja Kranich. ISBN 978-90-420-3143-2 E-ISBN 978-90-420-3144-9 #71. Corpus-linguistic applications. Current studies, new directions, Edited by Stefan Th. Gries, Stefanie Wulff, and Mark Davies.. ISBN 978-90-420-2800-5 #70. A resource-light approach to morpho-syntactic tagging., by Anna Feldman and Jirka Hana. ISBN 978-90-420-2768-8 #69. Corpus Linguistics. Refinements and Reassessments., Edited by Antoinette Renouf and Andrew Kehoe. ISBN 978-90-420-2597-4 #68. Corpora: Pragmatics and Discourse. Papers from the 29th International Conference on English Language Research on Computerized Corpora (ICAME 29). Ascona, Switzerland, 14–18 May 2008., Edited by Andreas H. Jucker, Daniel Schreier and Marianne Hundt. ISBN 978-90-420-2592-9 #67. Modals and Quasi-modals in English., by Peter Collins. ISBN 978-90-420-2532-5 #66. Linking up contrastive and learner corpus research., Edited by Gaëtanelle Gilquin, Szilvia Papp and María Belén Díez-Bedmar. ISBN 978-90-420-2446-5 #64. Language, People, Numbers. Corpus Linguistics and Society., Edited by Andrea Gerbig and Oliver Mason. ISBN 978-90-420-2350-5 #63. Variation and change in the lexicon. A corpus-based analysis of adjectives in English ending in –ic and –ical. , by Mark Kaunisto. ISBN 978-90-420-2233-1 #62. Corpus Linguistics 25 Years on., Edited by Roberta Facchinetti. ISBN 978-90-420-2195-2 #61. Corpora in the Foreign Language Classroom. Selected papers from the Sixth International Conference on Teaching and Language Corpora (TaLC 6), Edited by Encarnación Hidalgo, Luis Quereda and Juan Santana. ISBN 978-90-420-2142-6 #60. Corpus Linguistics Beyond the Word. Corpus Research from Phrase to Discourse, Edited by Eileen Fitzpatrick. ISBN 978-90-420-2135-8 #59. Corpus Linguistics and the Web., Edited by Marianne Hundt, Nadja Nesselhauf and Carolin Biewer. ISBN 978-90-420-2128-0 #58. English mediopassive constructions. A cognitive, corpus-based study of their origin, spread, and current status, by Marianne Hundt. ISBN 978-90-420-2127-3 / ISBN 90-420-2127-6

    Read more →
  • AI Video Generators Reviews: What Actually Works in 2026

    AI Video Generators Reviews: What Actually Works in 2026

    Comparing the best AI video generator? An AI video generator is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI video generator slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →