AI For Business Research

AI For Business Research — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Noisy text analytics

    Noisy text analytics

    Noisy text analytics is a process of information extraction whose goal is to automatically extract structured or semistructured information from noisy unstructured text data. While text analytics is a growing and mature field that has great value because of the huge amounts of data being produced, processing of noisy text is gaining in importance because many common applications produce noisy text data. Noisy unstructured text data is found in informal settings such as online chat, text messages, e-mails, message boards, newsgroups, blogs, wikis and web pages. Text produced by processing spontaneous speech with automatic speech recognition, and printed or handwritten text with optical character recognition, also contains processing noise. Text produced under such circumstances is typically highly noisy, containing spelling errors, abbreviations, non-standard words, false starts, repetitions, missing punctuation, missing letter case information, pause-filling words such as “um” and “uh”, and other texting and speech disfluencies. Such text can be seen in large amounts in contact centers, chat rooms, optical character recognition (OCR) of text documents, short message service (SMS) text, etc. Documents with historical language can also be considered noisy with respect to today's knowledge about the language. Such text contains useful historical, religious and ancient medical knowledge. The nature of the noisy text produced in all these contexts warrants moving beyond traditional text analysis techniques.

    == Techniques for noisy text analysis ==

    Missing punctuation and the use of non-standard words can often hinder standard natural language processing tools such as part-of-speech tagging and parsing. Techniques both to learn from noisy data and to process it are only now being developed.
    == Possible sources of noisy text ==

    World Wide Web: Poorly written text is found in web pages, online chat, blogs, wikis, discussion forums and newsgroups. Most of this data is unstructured, and the style of writing is very different from, say, well-written news articles. Analysis of web data is important because it is a source for market buzz analysis, market review, trend estimation, etc. Because of the large amount of data, efficient methods of information extraction, classification, automatic summarization and analysis are also necessary.

    Contact centers: This is a general term for help desks, information lines and customer service centers operating in domains ranging from computer sales and support to mobile phones to apparel. On average, a person in the developed world interacts at least once a week with a contact center agent. A typical contact center agent handles over a hundred calls per day. Contact centers operate in various modes such as voice, online chat and e-mail. The contact center industry produces gigabytes of data in the form of e-mails, chat logs, voice conversation transcriptions, customer feedback, etc. The bulk of contact center data is voice conversations. Transcribing these with state-of-the-art automatic speech recognition results in text with a 30–40% word error rate. Even written modes of communication, such as online chat and e-mail between customers and agents, tend to be noisy. Analysis of contact center data is essential for customer relationship management, customer satisfaction analysis, call modeling, customer profiling, agent profiling, etc., and it requires sophisticated techniques to handle poorly written text.

    Printed documents: Many libraries, government organizations and national defence organizations have vast repositories of hard copy documents. To retrieve and process the content of such documents, they need to be processed using optical character recognition (OCR). In addition to printed text, these documents may also contain handwritten annotations. OCRed text can be highly noisy depending on the font size, quality of the print, etc. Word error rates can range from 2–3% to as high as 50–60%. Handwritten annotations can be particularly hard to decipher, and error rates can be quite high in their presence.

    Short Messaging Service (SMS): Language usage in computer-mediated discourse, such as chats, e-mails and SMS texts, differs significantly from the standard form of the language. An urge towards shorter messages that allow faster typing, together with the need for semantic clarity, shapes the structure of this non-standard form, known as texting language.
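
    The word error rates quoted above for speech recognition and OCR output can be made concrete with a small calculation. The source does not define the metric; the sketch below assumes the common definition, the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference text. The function name and the sample sentences are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length (assumed definition)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word and one dropped word out of six.
reference = "please reset my account password today"
hypothesis = "please reset my count password"
print(f"{word_error_rate(reference, hypothesis):.2f}")  # 0.33
```

    Under this definition a 30–40% word error rate means roughly one word in three needs correcting, which is why downstream tools trained on clean text struggle with such transcripts.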

  • With Folded Hands ...

    With Folded Hands ...

    "With Folded Hands ..." is a 1947 science fiction novelette by American writer Jack Williamson (1908–2006). In writing it, Williamson was influenced by the aftermath of World War II, the atomic bombings of Hiroshima and Nagasaki, and his concern that "some of the technological creations we had developed with the best intentions might have disastrous consequences in the long run." The novelette first appeared in the July 1947 issue of Astounding Science Fiction and was later included in The Science Fiction Hall of Fame, Volume Two (1973) after being voted one of the best novellas up to 1965. In 1950, it was the first of several Astounding stories adapted for NBC's radio series Dimension X. == Rewrite and sequel == The 1947 publication was followed by a novel-length rewrite, with a different setting and inventor. At the behest of Astounding editor-in-chief John W. Campbell, a new ending had the robots defeated by means of what Williamson and Campbell would later christen "psionics". This novel was serialized, also in Astounding (March, April, May 1948), as ... And Searching Mind, and finally published in hardback book form as The Humanoids (1949). Much later, in 1980, Williamson followed with another sequel, The Humanoid Touch. == Plot summary == Underhill, a seller of "Mechanicals" (unthinking robots that perform menial tasks) in the small town of Two Rivers, is startled to find a competitor's store on his way home. The competitors are not humans but are small black robots who appear more advanced than anything Underhill has encountered before. They describe themselves as "humanoids". Disturbed at his encounter, Underhill rushes home to discover that his wife has taken in a new lodger, a mysterious old man named Sledge. In the course of the next day, the new Mechanicals have appeared everywhere in town. They state that they only follow the Prime Directive: "to serve and obey and guard men from harm". 
Offering their services free of charge, they replace humans as police officers, bank tellers, and more, and eventually drive Underhill out of business. Despite the humanoids' benign appearance and mission, Underhill soon realizes that, in the name of their Prime Directive, the mechanicals have essentially taken over every aspect of human life. No humans may engage in any behavior that might endanger them, and every human action is carefully scrutinized. Suicide is prohibited. Humans who resist the Prime Directive are taken away and lobotomized, so that they may live happily under the direction of the humanoids. Underhill learns that his lodger Sledge is the creator of the humanoids and is on the run from them. Sledge explains that 60 years earlier he had discovered the force of "rhodomagnetics" on the planet Wing IV and that his discovery resulted in a war that destroyed his planet. In his grief, Sledge designed the humanoids to help humanity and be invulnerable to human exploitation. However, he eventually realized that they had instead taken control of humanity, in the name of their Prime Directive, to make humans happy. The humanoids are spreading out from Wing IV to every human-occupied planet to implement their Prime Directive. Sledge and Underhill attempt to stop the humanoids by aiming a rhodomagnetic beam at Wing IV, but fail. The humanoids take Sledge away for surgery. He returns with no memory of his prior life, stating that he is now happy under the humanoids' care. Underhill is driven home by the humanoids, sitting "with folded hands," as there is nothing left to do. == Origins == In a 1991 interview, Williamson revealed how the story construction reflected events of his childhood in addition to technological extrapolations: I wrote "With Folded Hands" immediately after World War II, when the shadow of the atomic bomb had just fallen over SF and was just beginning to haunt the imaginations of people in the US. 
The story grows out of that general feeling that some of the technological creations we had developed with the best intentions might have disastrous consequences in the long run (that idea, of course, still seems relevant today). The notion I was consciously working on specifically came out of a fragment of a story I had worked on for a while about an astronaut in space who is accompanied by a robot obviously superior to him physically—i.e., the robot wasn't hurt by gravity, extremes of temperature, radiation, or whatever. Just looking at the fragment gave me the sense of how inferior humanity is in many ways to mechanical creations. That basic recognition was the essence of the story, and as I wrote it up in my notes the theme was that the perfect machine would prove to be perfectly destructive... It was only when I looked back at the story much later on that I was able to realize that the emotional reach of the story undoubtedly derived from my own early childhood, when people were attempting to protect me from all those hazardous things a kid is going to encounter in the isolated frontier setting I grew up in. As a result, I felt frustrated and over protected by people whom I couldn't hate because I loved them. A sort of psychological trap. Specifically, the first three years of my life were spent on a ranch at the top of the Sierra Madre Mountains on the headwaters of the Yaqui River in Sonora, Mexico. ... [My mother] was terrified by this environment. My father built a crib that became a psychological prison for me, particularly because my mother apparently kept me in it too long, when I needed to get out and crawl on the floor. ... In retrospect, I'm certain I projected my fears and suspicions of this kind of conditioning, and these projections became the governing emotional principle of "With Folded Hands" and The Humanoids. == Reception == In 2024, Robert Silverberg wrote an essay in which he asserted that "With Folded Hands..." 
is "probably the best story ever written about robots" and suggested that Elon Musk's Optimus Generation 2 is the realization of the "humanoids" along with their worst drawbacks.

  • Clanker

    Clanker

    Clanker is a derogatory term for robots and artificial intelligence (AI) software. The term has been used in Star Wars media, first appearing in the franchise's 2005 video game Star Wars: Republic Commando. In 2025, the term became widely used to express hatred or distaste for machines ranging from delivery robots to large language models. This trend has been attributed to anxiety around the negative societal effects of AI. == In science fiction == The term has been previously used in science fiction literature, first appearing in a 1958 article by William Tenn in which he uses it to describe robots from science fiction films like Metropolis. The Star Wars franchise began using the term as a slur against droids in the 2005 video game Star Wars: Republic Commando before being prominently used in the animated series Star Wars: The Clone Wars, which follows a galaxy-wide war between the Galactic Republic's clone troopers and the Confederacy of Independent Systems' battle droids. In Star Wars media, robots—more commonly known as droids—are routinely depicted as the subjects of discrimination. For example, in the original Star Wars film, C-3PO and R2-D2 are abducted by Jawas and sold to the family of Luke Skywalker. When visiting a cantina in Mos Eisley, both droids are refused service by the bartender, who remarks that "We don't serve their kind." In Star Wars lore, the term clanker had entered use by the time of the franchise's High Republic Era and became prominent during the Clone Wars, in which clone troopers regularly use the phrase against battle droids. == AI backlash == The growing popularity of the term clanker reflects an increase in direct contact between people and AI systems. On sidewalks, delivery robots impede mobility and cause safety issues. In digital spaces, cybersecurity experts have raised concerns about the rising number of bots online, which now make up a large portion of internet traffic. 
A 2025 report estimated that about one in five social media accounts are automated. The term is also a reaction to AI advocacy from industrialists like Elon Musk and Sam Altman, who have championed the integration of AI into nearly every aspect of modern life. This includes efforts by major companies and startups alike, such as Amazon's development of humanoid robots to replace human workers in service industries. Such initiatives have further fueled public skepticism, reinforcing the association of clanker with unease over automation and the displacement of human roles. A global survey conducted by the research firm Gartner in December 2023 found that 64% of customers would prefer companies to avoid using AI in customer service, with another 53% stating they would consider switching to a different company if they discovered AI was handling their service interactions. Another report by Ernst & Young, published in July 2025, found that 42% of employees across Europe are worried that the use of AI in the workplace may threaten their employment. Criticism has also been directed at the technology itself. Some of the backlash stems from concerns about the resource consumption of AI systems, their frequent reliance on copyrighted material without consent, and questions about the intentions of the corporations behind them. There are also concerns about the potential cognitive effects of relying heavily on AI. A study, authored by researchers at Microsoft and Carnegie Mellon University, warns that regular dependence on AI may leave users mentally unprepared for real-world problem solving, likening the effect to cognitive atrophy. 
In June 2025, United States Senator Ruben Gallego tweeted that his "new bill makes sure you don't have to talk to a clanker if you don't want to", referring to proposed legislation that would require call centers to disclose their use of automated customer service agents to callers in the United States and offer the option to switch to a human representative. == Analysis == Linguist Adam Aleksic has described clanker as an evolution of racial slurs that anthropomorphize robotic systems. Internet memes incorporating the term often reference historical discrimination against marginalized groups such as African Americans. Based on the work of linguist Geoffrey Nunberg, American news website Axios has argued that clanker is merely a derogatory word, rather than a slur, because it does not perpetuate social inequities. NPR has noted the irony that the word robot was coined by Karel Čapek for his 1920 science-fiction play R.U.R. as a similar criticism of industrialization forcing workers to become devoid of their humanity. Aleksic has observed that robot can be further traced to the Proto-Slavic noun orbъ, which means 'slave'. While other science fiction media include pejoratives for androids and robots, such as skinjob and toaster from the Blade Runner and Battlestar Galactica franchises, respectively, clanker is believed to have gained popularity because its usage is intuitive and flexible. Whereas AI slop describes low-quality output from artificial intelligence, clanker belittles the underlying computer systems.

  • Pommerman Challenge

    Pommerman Challenge

    The Pommerman Challenge is a multi-agent game designed to test autonomous artificial intelligence systems.

    == Game structure ==

    Two-agent teams compete against each other on an 11 × 11 board. Each agent can observe only part of the board, and the agents cannot communicate. The goal is to knock down the opponents. Agents place explosives to destroy walls and collect power-ups that appear from those walls, while avoiding death. Game objects can move unpredictably or be moved by an agent.

    == Play ==

    The game involves real-time decision making. Agents must choose moves in about 0.1 seconds.

    == Algorithms ==

    The real-time requirement limits the use of compute-heavy techniques such as Monte Carlo tree search. The branching factor at each move can be as large as 1,296 (6^4), because all four agents act in each step, each choosing among six possibilities. Agents must account for explosions, which have lifetimes of 10 steps. Explosions derail tree search techniques: searches fewer than 10 levels deep ignore explosions, while deeper searches consider too many choices (given the branching factor).

    A hybrid approach uses a limited-depth tree search followed by exploration of a deterministic, pessimistic scenario. Limiting the depth keeps the search tree small. The deterministic approach can predict far into the future by omitting branching. "Good" actions are often those that perform well under pessimistic scenarios, particularly if safety is important. Identifying the worst sequence of positions for an object can suggest where to move it. After generating pessimistic scenarios, the agent quantifies the survivability of each move, notionally the number of positions in which the agent can then remain safely (without encountering other agents).

    == Competitions ==

    Three competitions were organized during 2018–2019, with slightly changing rules.

    === Online - FFA ===

    This round was a warm-up online event, where each competitor controlled only one agent.

    Results:
      1st: Agent47Agent by Yichen Gong
      2nd: aiKiller by Márton Görög

    === NeurIPS 2018 - Team ===

    The first Pommerman competition with in-person finals.

    Results:
      1st: hakozakijunctions by Toshihiro Takahashi
      2nd: eisenach by Márton Görög
      3rd: dypm by Takayuki Osogami

    The three best-performing solutions used online tree search.

    === NeurIPS 2019 - Team Radio ===

    The second competition with in-person finals improved communication between teammate agents.

    Results:
      1st: Márton Görög
      2nd: Paul Jasek
      3rd: Yifan Zhang
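
    The branching-factor figure in the Algorithms section follows from simple arithmetic, and the same arithmetic shows why a full search to the depth of an explosion's lifetime is impractical. The sketch below uses only the numbers given in the text (six choices per agent, four agents, ten-step explosions). The function names are illustrative, and treating every level as fully branching is a simplifying assumption.

```python
ACTIONS_PER_AGENT = 6    # from the text: each agent chooses among six possibilities
AGENTS = 4               # from the text: all four agents act in each step
EXPLOSION_LIFETIME = 10  # from the text: explosions have lifetimes of 10 steps


def branching_factor(actions: int = ACTIONS_PER_AGENT, agents: int = AGENTS) -> int:
    """Number of joint actions per step when every agent picks independently."""
    return actions ** agents


def leaf_count(depth: int) -> int:
    """Leaves of a fully branching search tree of the given depth (worst case)."""
    return branching_factor() ** depth


print(branching_factor())  # 1296

# Leaf counts for a shallow search versus one deep enough to see an explosion resolve.
for depth in (1, 2, 3, EXPLOSION_LIFETIME):
    print(depth, f"{leaf_count(depth):.2e}")
```

    With roughly 0.1 seconds per move, even a depth of three (about two billion leaves in the worst case) is out of reach without pruning, which is the motivation for the shallow-search-plus-pessimistic-rollout hybrid described above.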

  • System appreciation

    System appreciation

    System appreciation is an activity often included in the maintenance phase of software engineering projects. Key deliverables from this phase include documentation that describes what the system does in terms of its functional features, and how it achieves those features in terms of its architecture and design. Software architecture recovery is often the first step in system appreciation.

  • Mata v. Avianca, Inc.

    Mata v. Avianca, Inc.

    Mata v. Avianca, Inc. was a U.S. District Court for the Southern District of New York case in which the Court dismissed a personal injury case against the airline Avianca and issued a $5,000 fine to the plaintiff's lawyers, who had submitted fake precedents generated by ChatGPT in their legal briefs.

    == Background ==

    In February 2022, Roberto Mata filed a personal injury lawsuit in the U.S. District Court for the Southern District of New York against Avianca, alleging that he was injured when a metal serving cart struck his knee during an international flight. The plaintiff's lawyers used ChatGPT to generate a legal motion, which cited numerous fake legal cases involving fictitious airlines, with fabricated quotations and internal citations. Avianca's lawyers notified the Court that they had been "unable to locate" a few legal cases cited in the motion. The Court could not locate the cases either and ordered the plaintiff's lawyers to provide copies of them. Mata's lawyers provided copies of documents purportedly containing all but one of the cases, after ChatGPT assured them that the cases "indeed exist" and "can be found in reputable legal databases such as LexisNexis and Westlaw."

    == Opinion ==

    In June 2023, Judge P. Kevin Castel dismissed the personal injury case against Avianca and ordered the plaintiff's attorneys to pay a $5,000 fine. Judge Castel noted numerous inconsistencies in the opinion summaries, describing one of the legal analyses as "gibberish." He held that Mata's lawyers had acted with "subjective bad faith" sufficient for sanctions under Rule 11 of the Federal Rules of Civil Procedure.

    == Impact ==

    In July 2024, the American Bar Association issued its first formal ethics opinion on the responsibilities of lawyers using generative AI (GAI). The 15-page opinion outlines how the Rules of Professional Conduct apply to the use of GAI in the practice of law. Experts caution that lawyers cannot reasonably rely on the accuracy, completeness, or validity of content generated by GAI tools. Because GAI continues to be used in the practice of law, Mata has been described by legal professionals as a landmark case: courts frequently cite it when the use of GAI during proceedings leads to the creation and citation of nonexistent case law.

  • Mobile Fortify

    Mobile Fortify

    Mobile Fortify is a mobile app used by United States Immigration and Customs Enforcement (ICE) agents on their government-issued phones. The app allows agents to take a photo in order to gather biometrics, including contactless fingerprints and faceprints, for the purpose of identifying an individual and their potential immigration status. The app was created by NEC.

    == History ==

    In June 2025, use of Mobile Fortify by ICE was uncovered through leaked emails and the user manual, as reported by 404 Media. The app is internally developed, and details of the parent company and developer were initially unknown. In January 2026, the DHS's 2025 AI Use Case Inventory revealed the vendor as NEC Corporation, an international conglomerate with subsidiaries in Argentina, Australia, China, India and Malaysia. Later that month, several senators demanded transparency around the app and its origins, and that ICE stop using it. A second letter was sent in November, after the senators received no response from ICE to the first.

    == Technology ==

    Unlike other facial recognition software, Fortify uses federally linked databases. By contrast, Clearview AI uses public social media databases for biometric scanning. The federal databases include DHS's Automated Biometric Identification System (IDENT), containing more than 270 million biometric records, and Customs and Border Protection's (CBP) Traveler Verification Service. Others are the State Department's visa and passport photo database, the FBI's National Crime Information Center, National Law Enforcement Telecommunications Systems, and CBP's TECS and Seized Assets and Case Tracing System (SEACATS).

    == Oversight ==

    Several senators urged ICE to stop using the app for fear of infringing on Fourth Amendment and First Amendment rights, and requested details on who developed the app, when it was deployed, whether it was tested for accuracy, and the policies and practices governing its use.
    In June 2025, they sent an open letter to Todd Lyons, ICE acting director, signed by senators Cory Booker, Chris Van Hollen, Ed Markey, Bernie Sanders, Adam Schiff, Tina Smith, Elizabeth Warren, and Ron Wyden. On November 3, the senators sent ICE a second letter, after receiving no answers to the questions in the previous letter, which had set a deadline of October 2.

    == Criticism ==

    Mobile Fortify, and ICE's use of similar biometric identification technologies (such as Mobile Identify, a similar app intended for local or regional law enforcement assisting in immigration enforcement), has faced scrutiny from a variety of digital rights organizations, politicians, and news outlets. The criticism is considered a possible reason why Mobile Identify was pulled from the Google Play Store. Facial recognition technologies are known to produce false positives and generally unreliable results, especially for people with darker skin tones. ICE has previously mistakenly arrested a U.S. citizen under the belief that he was in the country illegally, and later stated that he "could be deported based on biometric confirmation of his identity" prior to his release. U.S. representative Bennie Thompson, ranking member of the House Homeland Security Committee, has commented that "ICE officials have told us that an apparent biometric match by Mobile Fortify is a ‘definitive’ determination of a person's status and that an ICE officer may ignore evidence of American citizenship—including a birth certificate—if the app says the person is an alien," and that "Mobile Fortify is a dangerous tool in the hands of ICE, and it puts American citizens at risk of detention and even deportation."

    On January 19, 2026, 404 Media reported on a case where a woman, identified in court documents as "MJMA", was scanned by Mobile Fortify twice in the same interaction, and the app provided two entirely different names. According to the Innovation Law Lab, whose attorneys are representing MJMA, both of the names were incorrect.

    ICE has stated that it will not allow people to decline to be scanned by Mobile Fortify, and that photos taken, even those of U.S. citizens, will be stored for 15 years. This has been criticized primarily on three grounds: ICE has not performed a Privacy Impact Assessment (PIA) for Mobile Fortify; the right to decline other forms of biometric verification by the U.S. government is often available under other circumstances; and the 15-year window is viewed as unnecessarily large.

  • Speech synthesis

    Speech synthesis

    Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output. The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood clearly. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written words on a home computer. The earliest computer operating system to have included a speech synthesizer was Unix in 1974, through the Unix speak utility. In 2000, Microsoft Sam was the default text-to-speech voice synthesizer used by the narrator accessibility feature, which shipped with all Windows 2000 operating systems, and subsequent Windows XP systems. A text-to-speech system (or "engine") is composed of two parts: a front-end and a back-end. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization. 
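
    The first front-end task described above, text normalization, can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not how any particular speech synthesizer works: the lookup tables, the function names and the restriction to whole-token abbreviations and integers from 0 to 99 are all hypothetical simplifications.

```python
import re

# Hypothetical, deliberately tiny lookup table; a real front-end would need far
# more entries plus context to disambiguate (e.g. "St." as Street or Saint).
ABBREVIATIONS = {"Dr.": "Doctor", "St.": "Street", "etc.": "et cetera"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0..99")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def normalize(text: str) -> str:
    """Replace known abbreviations and one- or two-digit numbers with words."""
    words = []
    for token in text.split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif re.fullmatch(r"\d{1,2}", token):
            words.append(number_to_words(int(token)))
        else:
            words.append(token)  # leave everything else untouched
    return " ".join(words)


print(normalize("Dr. Smith lives at 42 Elm St."))
# Doctor Smith lives at forty-two Elm Street
```

    The output of a step like this is what the later stages consume: every token is now a written-out word that can be given a phonetic transcription.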
The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end. The back-end—often referred to as the synthesizer—then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations), which is then imposed on the output speech. == History == Long before the invention of electronic signal processing, some people tried to build machines to emulate human speech. There were also legends of the existence of "Brazen Heads", such as those involving Pope Silvester II (d. 1003 AD), Albertus Magnus (1198–1280), and Roger Bacon (1214–1294). In 1779, the German-Danish scientist Christian Gottlieb Kratzenstein won the first prize in a competition announced by the Russian Imperial Academy of Sciences and Arts for models he built of the human vocal tract that could produce the five long vowel sounds (in International Phonetic Alphabet notation: [aː], [eː], [iː], [oː] and [uː]). There followed the bellows-operated "acoustic-mechanical speech machine" of Wolfgang von Kempelen of Pressburg, Hungary, described in a 1791 paper. This machine added models of the tongue and lips, enabling it to produce consonants as well as vowels. In 1837, Charles Wheatstone produced a "speaking machine" based on von Kempelen's design, and in 1846, Joseph Faber exhibited the "Euphonia". In 1923, Paget resurrected Wheatstone's design. In the 1930s, Bell Labs developed the vocoder, which automatically analyzed speech into its fundamental tones and resonances. 
From his work on the vocoder, Homer Dudley developed a keyboard-operated voice-synthesizer called The Voder (Voice Demonstrator), which he exhibited at the 1939 New York World's Fair. Franklin S. Cooper and his colleagues at Haskins Laboratories built the pattern playback in the late 1940s and completed it in 1950. There were several different versions of this hardware device; only one currently survives. The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. Using this device, Alvin Liberman and colleagues discovered acoustic cues for the perception of phonetic segments (consonants and vowels). === Electronic devices === The first computer-based speech-synthesis systems originated in the late 1950s. Noriko Umeda et al. developed the first general English text-to-speech system in 1968, at the Electrotechnical Laboratory in Japan. In 1961, physicist John Larry Kelly, Jr and his colleague Louis Gerstman used an IBM 704 computer to synthesize speech, an event among the most prominent in the history of Bell Labs. Kelly's voice recorder synthesizer (vocoder) recreated the song "Daisy Bell", with musical accompaniment from Max Mathews. Coincidentally, Arthur C. Clarke was visiting his friend and colleague John Pierce at the Bell Labs Murray Hill facility. Clarke was so impressed by the demonstration that he used it in the climactic scene of his screenplay for his novel 2001: A Space Odyssey, where the HAL 9000 computer sings the same song as astronaut Dave Bowman puts it to sleep. Despite the success of purely electronic speech synthesis, research into mechanical speech-synthesizers continues. Linear predictive coding (LPC), a form of speech coding, began development with the work of Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone (NTT) in 1966. Further developments in LPC technology were made by Bishnu S. Atal and Manfred R. Schroeder at Bell Labs during the 1970s. 
LPC was later the basis for early speech synthesizer chips, such as the Texas Instruments LPC Speech Chips used in the Speak & Spell toys from 1978. In 1975, Fumitada Itakura developed the line spectral pairs (LSP) method for high-compression speech coding, while at NTT. From 1975 to 1981, Itakura studied problems in speech analysis and synthesis based on the LSP method. In 1980, his team developed an LSP-based speech synthesizer chip. LSP is an important technology for speech synthesis and coding, and in the 1990s was adopted by almost all international speech coding standards as an essential component, contributing to the enhancement of digital speech communication over mobile channels and the internet. In 1975, MUSA was released, and was one of the first Speech Synthesis systems. It consisted of a stand-alone computer hardware and a specialized software that enabled it to read Italian. A second version, released in 1978, was also able to sing Italian in an "a cappella" style. Dominant systems in the 1980s and 1990s were the DECtalk system, based largely on the work of Dennis Klatt at MIT, and the Bell Labs system; the latter was one of the first multilingual language-independent systems, making extensive use of natural language processing methods. Handheld electronics featuring speech synthesis began emerging in the 1970s. One of the first was the Telesensory Systems Inc. (TSI) Speech+ portable calculator for the blind in 1976. Other devices had primarily educational purposes, such as the Speak & Spell toy produced by Texas Instruments in 1978. Fidelity released a speaking version of its electronic chess computer in 1979. The first video game to feature speech synthesis was the 1980 shoot 'em up arcade game, Stratovox (known in Japan as Speak & Rescue), from Sun Electronics. 
The first personal computer game with speech synthesis was Manbiki Shoujo (Shoplifting Girl), released in 1980 for the PET 2001, for which the game's developer, Hiroshi Suzuki, developed a "zero cross" programming technique to produce a synthesized speech waveform. Another early example, the arcade version of Berzerk, also dates from 1980. The Milton Bradley Company produced the first multi-player electronic game using voice synthesis, Milton, in the same year. In 1976, Computalker Consultants released their CT-1 Speech Synthesizer. Designed by D. Lloyd Rice and Jim Cooper, it was an analog synthesizer built to work with microcomputers using the S-100 bus standard. Synthesized voices typically sounded male until 1990, when Ann Syrdal, at AT&T Bell Laboratories, created a female voice. Ray Kurzweil predicted in 2005 that as the cost-performance ratio caused speech synthesizers to become cheaper and more accessible, more people would benefit from the use of text-to-speech programs. === Artificial intelligence === In September 2016, DeepMind released WaveNet, which demonstrated that deep learning models are capable of modeling raw waveforms and generating speech from acoustic features like spectrograms or mel-spectrograms, starting the field of deep learning speech synthesis. Although WaveNet was initially considered too computationally expensive and slow to be used in consumer products at the time, a year after its

  • Screenpal

    Screenpal

    ScreenPal (formerly known as Screencast-O-Matic) is cross-platform screen capture and screen recording software originally developed in 2006. == History == The company was founded by AJ Gregory in 2006 as Screencast-O-Matic. The software includes features for screen recording, screenshot capture, video editing, image editing, and a video and image hosting service. It is available for Windows and Mac operating systems, and has mobile apps for iOS and Android. The company launched a video editor in 2015. It began offering free video and image hosting in 2019, with premium hosting options for subscribers. In 2023, it was rebranded as ScreenPal.

  • Fuzzy finite element

    Fuzzy finite element

    The fuzzy finite element method combines the well-established finite element method with the concept of fuzzy numbers, the latter being a special case of a fuzzy set. The advantage of using fuzzy numbers instead of real numbers lies in the incorporation of uncertainty (on material properties, parameters, geometry, initial conditions, etc.) in the finite element analysis. One way to establish a fuzzy finite element (FE) analysis is to use existing FE software (in-house or commercial) as an inner-level module to compute a deterministic result, and to add an outer-level loop to handle the fuzziness (uncertainty). This outer-level loop comes down to solving an optimization problem. If the inner-level deterministic module produces monotonic behavior with respect to the input variables, then the outer-level optimization problem is greatly simplified, since in this case the extrema will be located at the vertices of the domain.
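    The monotonic case described above can be sketched in a few lines. This is a minimal illustration, not a real FE solver: `tip_deflection` is a hypothetical stand-in for the inner-level deterministic module, the inputs are assumed to be triangular fuzzy numbers `(low, peak, high)`, and the numeric values are made up. For each membership level (alpha-cut), the outer loop evaluates the model only at the vertices of the input box and keeps the minimum and maximum.

```python
from itertools import product


def alpha_cut(tri, alpha):
    """Interval [lo, hi] of a triangular fuzzy number (a, m, b) at level alpha."""
    a, m, b = tri
    return (a + alpha * (m - a), b - alpha * (b - m))


def tip_deflection(load, stiffness):
    """Hypothetical stand-in for the deterministic FE solve (monotonic in both inputs)."""
    return load / stiffness


def fuzzy_output(model, fuzzy_inputs, alphas):
    """Outer loop: per alpha level, evaluate the model at every vertex of the input box."""
    result = {}
    for alpha in alphas:
        intervals = [alpha_cut(t, alpha) for t in fuzzy_inputs]
        # Valid only if the model is monotonic in each input: extrema sit at vertices.
        values = [model(*vertex) for vertex in product(*intervals)]
        result[alpha] = (min(values), max(values))
    return result


load = (90.0, 100.0, 110.0)       # assumed fuzzy load
stiffness = (40.0, 50.0, 60.0)    # assumed fuzzy stiffness
cuts = fuzzy_output(tip_deflection, [load, stiffness], [0.0, 0.5, 1.0])
for alpha, (lo, hi) in cuts.items():
    print(f"alpha={alpha:.1f}: deflection in [{lo:.3f}, {hi:.3f}]")
```

    With n uncertain inputs this costs 2^n model runs per alpha level. If the model is not monotonic, the vertex shortcut no longer applies and a genuine optimization is needed in the outer loop.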

  • Xaitment

    Xaitment

    xaitment is a German-based company that develops and sells artificial intelligence (AI) software to video game developers and simulation developers. The company was founded in 2004 by Dr. Andreas Gerber, and is a spin-off of the German Research Centre for Artificial Intelligence, or DFKI. xaitment has its main office in Quierschied, Germany, and field offices in San Francisco and China. == Products == xaitment currently sells two AI software modules: xaitMap and xaitControl. xaitMap provides runtime libraries and graphical tools for navigation mesh generation (also called NavMesh generation), pathfinding, dynamic collision avoidance, and individual and crowd movement. xaitControl is a finite-state machine for game logic and character behavior modeling that also includes a real-time debugger. On January 11, 2012, xaitment announced that it was making its source code for these modules available to "all current and future US and European licensees". On February 22, 2012, xaitment released two new plug-ins, xaitMap and xaitControl, for the Unity Game Engine. The full versions are available for PC (Windows and Linux), PlayStation 3, Xbox 360 and Wii. The pathfinding plug-in is available with a Windows dev environment, but can be deployed on iOS, Mac, Android and the Unity Web Player. == Partners == xaitment's AI software is currently integrated into the Unity game engine, Havok's Vision Engine, Bohemia Interactive's VBS2 Simulation Engine, and GameBase's Gamebryo game engine. == Customers == xaitment sells its AI software products to video game developers and military and civil simulation developers. Current customers include Tencent, gamania, TML Studios, Emobi Games, IP Keys and others. A full list of customers can be found on xaitment's website.

  • Context-sensitive user interface

    Context-sensitive user interface

    A context-sensitive user interface offers the user options based on the state of the active program. Context sensitivity is ubiquitous in current graphical user interfaces, often in context menus. A user interface may also provide context-sensitive feedback, such as changing the appearance of the mouse pointer or cursor, changing the menu color, or giving auditory or tactile feedback. == Reasoning and advantages of context sensitivity == The primary reason for introducing context sensitivity is to simplify the user interface. Advantages include: a reduced number of commands that the user must know for a given level of productivity; a reduced number of clicks or keystrokes required to carry out a given operation; consistent behaviour that can be pre-programmed or altered by the user; and fewer options needed on screen at one time. === Disadvantages === Context-sensitive actions may be perceived as dumbing down of the user interface, leaving the operator at a loss as to what to do when the computer decides to perform an unwanted action. Additionally, non-automatic procedures may be hidden or obscured by the context-sensitive interface, causing an increase in user workload for operations the designers did not foresee. A poor implementation can be more annoying than helpful – a classic example of this is Office Assistant. == Implementation == At the simplest level each possible action is reduced to a single most likely action – the action performed is based on a single variable (such as file extension). In more complicated implementations multiple factors can be assessed, such as the user's previous actions, the size of the file, the programs in current use, metadata, etc. The method is not limited to the response to imperative button presses and mouse clicks – pop-up menus can be pruned and/or altered, or a web search can focus results based on previous searches. 
At higher levels of implementation, context-sensitive actions require either larger amounts of metadata, extensive case-analysis-based programming, or other artificial intelligence algorithms. === In computer and video games === Context sensitivity is important in video games, especially those controlled by a gamepad, joystick or computer mouse in which the number of buttons available is limited. It is primarily applied when the player is in a certain place and is used to interact with a person or object. For example, if the player is standing next to a non-player character, an option may come up allowing the player to talk with them. Implementations range from the embryonic 'Quick Time Event' to context-sensitive sword combat in which the attack used depends on the position and orientation of both the player and opponent, as well as the virtual surroundings. A similar range of use is found in the 'action button' which, depending upon the in-game position of the player's character, may cause it to pick something up, open a door, grab a rope, punch a monster or opponent, or smash an object. The response does not have to be player activated – an on-screen device may only be shown in certain circumstances, e.g. 'targeting' cross hairs in a flight combat game may indicate the player should fire. An alternative implementation is to monitor the input from the player (e.g. level of button pressing activity) and use that to control the pace of the game in an attempt to maximize enjoyment or to control the excitement or ambience. The method has become increasingly important as more complex games are designed for machines with few buttons (keyboard-less consoles). Bennet Ring commented (in 2006) that "Context-sensitive is the new lens flare". === Context-sensitive help === Context-sensitive help is a common implementation of context sensitivity: a single help button is pressed, and the help page or menu opens at a specific page or related topic.
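
    The simplest level described under Implementation, where one variable selects the single most likely action, can be sketched as a lookup table. The tables, action names and function names below are illustrative assumptions, not taken from any particular toolkit or game engine. The same pattern covers the game 'action button', where the variable is whatever the player's character is standing next to.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the single most likely action.
DEFAULT_ACTIONS = {
    ".txt": "open in text editor",
    ".png": "open in image viewer",
    ".zip": "extract archive",
}

# Hypothetical mapping for a game's single 'action button'.
ACTION_BUTTON = {
    "npc": "talk",
    "door": "open",
    "rope": "grab",
    "item": "pick up",
}


def default_action(filename):
    """Pick an action from one variable: the (case-insensitive) file extension."""
    ext = Path(filename).suffix.lower()
    return DEFAULT_ACTIONS.get(ext, "show 'open with' dialog")


def press_action_button(nearby):
    """Pick the action-button behaviour from what is next to the player (or None)."""
    return ACTION_BUTTON.get(nearby, "no action")


print(default_action("notes.TXT"))
print(default_action("data.bin"))
print(press_action_button("door"))
```

    More complicated implementations would replace the single lookup key with several factors (previous actions, file size, running programs), which is where the case analysis grows quickly.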

  • Camera interface

    Camera interface

    The Camera Interface block or CAMIF is the hardware block that interfaces with different image sensor interfaces and provides a standard output that can be used for subsequent image processing. A typical Camera Interface supports at least a parallel interface, although many camera interfaces now also support the Mobile Industry Processor Interface (MIPI) Camera Serial Interface (CSI). == Electrical connections == The camera interface's parallel interface consists of the following lines. 8 to 12 bits parallel data line: these are parallel data lines that carry pixel data. The data transmitted on these lines changes with every Pixel Clock (PCLK). Horizontal Sync (HSYNC): this is a special signal that goes from the camera sensor or ISP to the camera interface. An HSYNC indicates that one line of the frame has been transmitted. Vertical Sync (VSYNC): this signal is transmitted after the entire frame is transferred, and is often used to indicate that one entire frame has been transmitted. Pixel Clock (PCLK): this is the pixel clock, and it changes on every pixel. NOTE: The above lines are all treated as input lines to the Camera Interface hardware.
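
    The role of the four signals can be illustrated with a small software model. This is a sketch under simplifying assumptions, not a description of any specific sensor: each sample stands for one PCLK tick, HSYNC and VSYNC are assumed active-high, HSYNC is assumed to be asserted on the last pixel of a line and VSYNC on the last pixel of the frame. Real sensors differ in polarity and timing, so the datasheet decides the actual rule.

```python
def assemble_frame(samples):
    """Group per-PCLK samples (data, hsync, vsync) into a list of pixel rows.

    Assumptions (illustrative only): one sample per pixel clock tick,
    hsync == 1 on the tick that ends a line, vsync == 1 on the tick
    that ends the frame.
    """
    frame, line = [], []
    for data, hsync, vsync in samples:
        line.append(data)          # pixel data is valid on every PCLK tick
        if hsync or vsync:         # end of line (or end of frame closes the line)
            frame.append(line)
            line = []
        if vsync:                  # whole frame received
            break
    return frame


# A made-up 2x2 frame: pixel values 1..4.
samples = [(1, 0, 0), (2, 1, 0), (3, 0, 0), (4, 1, 1)]
frame = assemble_frame(samples)
print(frame)
```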

  • SQLf

    SQLf

    SQLf is an extension of SQL that applies fuzzy set theory to express flexible (fuzzy) queries against traditional (or "regular") relational databases. Among the extensions proposed to SQL to date, it is the most complete, because it allows the use of diverse fuzzy elements in all the constructions of the SQL language. SQLf is the only known proposal for a flexible query system that allows linguistic quantification over sets of rows in queries, achieved by extending the SQL nesting and partitioning structures with fuzzy quantifiers. It also allows the use of quantifiers to qualify the number of search criteria satisfied by single rows. Several mechanisms are proposed for query evaluation, the most important being the one based on the derivation principle. This consists of deriving classic queries that produce, given a threshold t, a t-cut of the result of the fuzzy query, so that the additional processing cost of using a fuzzy language is diminished. == Basic block == The fundamental querying structure of SQLf is the multi-relational block. The conception of this structure is based on the three basic operations of the relational algebra (projection, Cartesian product and selection) and the application of fuzzy set concepts. The result of a SQLf query is a fuzzy set of rows, that is, a fuzzy relation instead of a regular relation. A basic block in SQLf consists of a SELECT clause, a FROM clause and an optional WHERE clause. The semantics of this query structure are as follows. The SELECT clause corresponds to the projection. It specifies the relations' attributes (or attribute expressions) that will be selected. The resulting table is a fuzzy set and is given in decreasing order of satisfaction degree. The SELECT clause also specifies a calibration that is intended to restrict the set of rows retrieved. There are two kinds of calibration: quantitative and qualitative. 
In quantitative calibration the user specifies the number of results to be retrieved, so that the query retrieves the rows with the highest membership degrees up to the number of required answers. In qualitative calibration the user specifies a minimum level of satisfaction that every retrieved row must have. The FROM clause corresponds to the Cartesian product. The query is evaluated on the Cartesian product of the relations specified in this clause. The WHERE clause corresponds to the selection. It specifies the condition for which the satisfaction degree will be calculated. Rows that do not satisfy the condition at all are rejected. This condition is a fuzzy predicate that may involve any attribute of the relations. An example is a SELECT query that returns a list of hotels that are cheap: the query retrieves all rows from the Hotels table that satisfy the fuzzy predicate cheap defined by the fuzzy set μ=(∞, ∞, 25, 30). The result is sorted in descending order by the membership degree of the query.
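
    The evaluation of the cheap-hotels example, including both kinds of calibration, can be sketched in Python. Several things here are assumptions made for illustration: the fuzzy set is read as an open left shoulder (full membership up to a price of 25, falling linearly to zero at 30), the hotel rows and the `price` column name are invented, and `fuzzy_select` is a hypothetical helper rather than part of any SQLf implementation. The function `t_cut_bound` illustrates the derivation principle for this one predicate: a threshold t turns the fuzzy condition into the classic condition `price <= 30 - 5t`.

```python
def cheap(price):
    """Assumed membership function: 1 up to 25, linear down to 0 at 30."""
    if price <= 25:
        return 1.0
    if price >= 30:
        return 0.0
    return (30 - price) / 5


def fuzzy_select(rows, predicate, key, top_k=None, threshold=None):
    """Score rows, reject degree 0, apply calibration, sort by descending degree."""
    scored = [(predicate(row[key]), row) for row in rows]
    scored = [(mu, row) for mu, row in scored if mu > 0]
    if threshold is not None:                  # qualitative calibration
        scored = [(mu, row) for mu, row in scored if mu >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    if top_k is not None:                      # quantitative calibration
        scored = scored[:top_k]
    return scored


def t_cut_bound(t):
    """Derived classic condition for 'cheap' at threshold t in (0, 1]: price <= bound."""
    return 30 - 5 * t


hotels = [  # invented sample data
    {"name": "A", "price": 20},
    {"name": "B", "price": 27},
    {"name": "C", "price": 29},
    {"name": "D", "price": 40},
]
for mu, row in fuzzy_select(hotels, cheap, "price"):
    print(row["name"], mu)
```

    Calling `fuzzy_select(hotels, cheap, "price", threshold=0.5)` keeps the same rows as the crisp filter `price <= t_cut_bound(0.5)`, which is the point of the derivation principle: the database can evaluate an ordinary query and the membership degrees are computed only for the surviving rows.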

  • Belief–desire–intention software model

    Belief–desire–intention software model

    The belief–desire–intention software model (BDI) is a software model developed for programming intelligent agents. Superficially characterized by the implementation of an agent's beliefs, desires and intentions, it actually uses these concepts to solve a particular problem in agent programming. In essence, it provides a mechanism for separating the activity of selecting a plan (from a plan library or an external planner application) from the execution of currently active plans. Consequently, BDI agents are able to balance the time spent on deliberating about plans (choosing what to do) and executing those plans (doing it). A third activity, creating the plans in the first place (planning), is not within the scope of the model, and is left to the system designer and programmer. == Overview == In order to achieve this separation, the BDI software model implements the principal aspects of Michael Bratman's theory of human practical reasoning (also referred to as Belief-Desire-Intention, or BDI). That is to say, it implements the notions of belief, desire and (in particular) intention, in a manner inspired by Bratman. For Bratman, desire and intention are both pro-attitudes (mental attitudes concerned with action). He identifies commitment as the distinguishing factor between desire and intention, noting that it leads to (1) temporal persistence in plans and (2) further plans being made on the basis of those to which it is already committed. The BDI software model partially addresses these issues. Temporal persistence, in the sense of explicit reference to time, is not explored. The hierarchical nature of plans is more easily implemented: a plan consists of a number of steps, some of which may invoke other plans. The hierarchical definition of plans itself implies a kind of temporal persistence, since the overarching plan remains in effect while subsidiary plans are being executed. 
An important aspect of the BDI software model (in terms of its research relevance) is the existence of logical models through which it is possible to define and reason about BDI agents. Research in this area has led, for example, to the axiomatization of some BDI implementations, as well as to formal logical descriptions such as Anand Rao and Michael Georgeff's BDICTL. The latter combines a multiple-modal logic (with modalities representing beliefs, desires and intentions) with the temporal logic CTL. More recently, Michael Wooldridge has extended BDICTL to define LORA (the Logic Of Rational Agents), by incorporating an action logic. In principle, LORA allows reasoning not only about individual agents, but also about communication and other interaction in a multi-agent system. The BDI software model is closely associated with intelligent agents, but does not, of itself, ensure all the characteristics associated with such agents. For example, it allows agents to have private beliefs, but does not force them to be private. It also has nothing to say about agent communication. Ultimately, the BDI software model is an attempt to solve a problem that has more to do with plans and planning (the choice and execution thereof) than it has to do with the programming of intelligent agents. This approach has recently been proposed by Steven Umbrello and Roman Yampolskiy as a means of designing autonomous vehicles for human values. == BDI agents == A BDI agent is a particular type of bounded rational software agent, imbued with particular mental attitudes, viz: Beliefs, Desires and Intentions (BDI). === Architecture === This section defines the idealized architectural components of a BDI system. Beliefs: Beliefs represent the informational state of the agent–its beliefs about the world (including itself and other agents). Beliefs can also include inference rules, allowing forward chaining to lead to new beliefs. 
Using the term belief rather than knowledge recognizes that what an agent believes may not necessarily be true (and in fact may change in the future). Beliefset: Beliefs are stored in a database (sometimes called a belief base or a belief set), although that is an implementation decision. Desires: Desires represent the motivational state of the agent. They represent objectives or situations that the agent would like to accomplish or bring about. Examples of desires might be: find the best price, go to the party or become rich. Goals: A goal is a desire that has been adopted for active pursuit by the agent. Usage of the term goals adds the further restriction that the set of active desires must be consistent. For example, one should not have concurrent goals to go to a party and to stay at home – even though they could both be desirable. Intentions: Intentions represent the deliberative state of the agent – what the agent has chosen to do. Intentions are desires to which the agent has to some extent committed. In implemented systems, this means the agent has begun executing a plan. Plans: Plans are sequences of actions (recipes or knowledge areas) that an agent can perform to achieve one or more of its intentions. Plans may include other plans: my plan to go for a drive may include a plan to find my car keys. This reflects that in Bratman's model, plans are initially only partially conceived, with details being filled in as they progress. Events: These are triggers for reactive activity by the agent. An event may update beliefs, trigger plans or modify goals. Events may be generated externally and received by sensors or integrated systems. Additionally, events may be generated internally to trigger decoupled updates or plans of activity. BDI was also extended with an obligations component, giving rise to the BOID agent architecture to incorporate obligations, norms and commitments of agents that act within a social environment. 
=== BDI interpreter === This section defines an idealized BDI interpreter that provides the basis of SRI's PRS lineage of BDI systems:
    initialize-state
    repeat
        options: option-generator(event-queue)
        selected-options: deliberate(options)
        update-intentions(selected-options)
        execute()
        get-new-external-events()
        drop-unsuccessful-attitudes()
        drop-impossible-attitudes()
    end repeat
=== Limitations and criticisms === The BDI software model is one example of a reasoning architecture for a single rational agent, and one concern in a broader multi-agent system. This section bounds the scope of concerns for the BDI software model, highlighting known limitations of the architecture. Learning: BDI agents lack any specific mechanisms within the architecture to learn from past behavior and adapt to new situations. Three attitudes: Classical decision theorists and planning research question the necessity of having all three attitudes, while distributed AI research questions whether the three attitudes are sufficient. Logics: The multi-modal logics that underlie BDI (that do not have complete axiomatizations and are not efficiently computable) have little relevance in practice. Multiple agents: In addition to not explicitly supporting learning, the framework may not be appropriate to learning behavior. Further, the BDI model does not explicitly describe mechanisms for interaction with other agents and integration into a multi-agent system. Explicit goals: Most BDI implementations do not have an explicit representation of goals. Lookahead: The architecture does not have (by design) any lookahead deliberation or forward planning. This may not be desirable because adopted plans may use up limited resources, actions may not be reversible, task execution may take longer than forward planning, and actions may have undesirable side effects if unsuccessful. 
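
The idealized interpreter loop given under "BDI interpreter" can be made concrete with a toy Python agent. This is a sketch under heavy simplifications and is not the PRS implementation: the class and method names mirror the pseudocode but are otherwise invented, the plan library is a plain dictionary from events to candidate plans, deliberation just commits to the first option, and the two drop steps are collapsed into removing intentions whose plans have finished.

```python
from collections import deque


class ToyBDIAgent:
    def __init__(self, plan_library):
        self.beliefs = set()
        self.events = deque()             # event queue
        self.intentions = []              # each intention: remaining actions of a plan
        self.plan_library = plan_library  # event -> list of candidate plans
        self.log = []                     # actions executed so far

    def option_generator(self):
        """Turn queued events into candidate plans (options)."""
        options = []
        while self.events:
            event = self.events.popleft()
            self.beliefs.add(event)       # an event may update beliefs
            options.extend(self.plan_library.get(event, []))
        return options

    def deliberate(self, options):
        """Toy policy (assumption): commit to the first applicable plan only."""
        return options[:1]

    def update_intentions(self, selected):
        self.intentions.extend(list(plan) for plan in selected)

    def execute(self):
        """Run one action of the first active intention."""
        if self.intentions:
            self.log.append(self.intentions[0].pop(0))

    def drop_finished(self):
        """Simplified stand-in for the drop-unsuccessful/impossible steps."""
        self.intentions = [i for i in self.intentions if i]

    def step(self, external_events=()):
        options = self.option_generator()
        self.update_intentions(self.deliberate(options))
        self.execute()
        self.events.extend(external_events)   # get-new-external-events
        self.drop_finished()


agent = ToyBDIAgent({"thirsty": [["find cup", "fill cup", "drink"]]})
agent.events.append("thirsty")
for _ in range(4):
    agent.step()
print(agent.log)
```

Because only one action runs per cycle, the agent interleaves deliberation and execution, which is the separation the model is built around. Nothing in the loop looks ahead or learns, matching the limitations listed above.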
== BDI agent implementations == === 'Pure' BDI === Procedural Reasoning System (PRS); IRMA (not implemented but can be considered as PRS with non-reconsideration); UM-PRS; OpenPRS; Distributed Multi-Agent Reasoning System (dMARS); AgentSpeak(L) – see Jason below; AgentSpeak(RT); Agent Real-Time System (ARTS); JAM; JACK Intelligent Agents; JADEX (open source project); JaKtA; JASON; GORITE; SPARK; 3APL; 2APL; GOAL agent programming language; CogniTAO (Think-As-One); Living Systems Process Suite; PROFETA; Gwendolen (part of the Model Checking Agent Programming Languages Framework). === Extensions and hybrid systems === JACK Teams; CogniTAO (Think-As-One); Living Systems Process Suite; Brahms; JaCaMo.
