Chinese speech synthesis

Chinese speech synthesis

Chinese speech synthesis is the application of speech synthesis to the Chinese language (usually Standard Chinese). It poses additional difficulties due to Chinese characters frequently having different pronunciations in different contexts and the complex prosody, which is essential to convey the meaning of words, and sometimes the difficulty in obtaining agreement among native speakers concerning what the correct pronunciation is of certain phonemes. == Concatenation (Ekho and KeyTip) == Recordings can be concatenated in any desired combination, but the joins sound forced (as is usual for simple concatenation-based speech synthesis) and this can severely affect prosody; these synthesizers are also inflexible in terms of speed and expression. However, because these synthesizers do not rely on a corpus, there is no noticeable degradation in performance when they are given more unusual or awkward phrases. Ekho is an open source TTS which simply concatenates sampled syllables. It currently supports Cantonese, Mandarin, and experimentally Korean. Some of the Mandarin syllables have been pitched-normalised in Praat. A modified version of these is used in Gradint's "synthesis from partials". cjkware.com used to ship a product called KeyTip Putonghua Reader which worked similarly; it contained 120 Megabytes of sound recordings (GSM-compressed to 40 Megabytes in the evaluation version), comprising 10,000 multi-syllable dictionary words plus single-syllable recordings in 6 different prosodies (4 tones, neutral tone, and an extra third-tone recording for use at the end of a phrase). == Lightweight synthesizers (eSpeak and Yuet) == The lightweight open-source speech project eSpeak, which has its own approach to synthesis, has experimented with Mandarin and Cantonese. eSpeak was used by Google Translate from May 2010 until December 2010. The commercial product "Yuet" is also lightweight (it is intended to be suitable for resource-constrained environments like embedded systems); it was written from scratch in ANSI C starting from 2013. Yuet claims a built-in NLP model that does not require a separate dictionary; the speech synthesised by the engine claims clear word boundaries and emphasis on appropriate words. Communication with its author is required to obtain a copy. Both eSpeak and Yuet can synthesis speech for Cantonese and Mandarin from the same input text, and can output the corresponding romanisation (for Cantonese, Yuet uses Yale and eSpeak uses Jyutping; both use Pinyin for Mandarin). eSpeak does not concern itself with word boundaries when these don't change the question of which syllable should be spoken. == Corpus-based == A "corpus-based" approach can sound very natural in most cases but can err in dealing with unusual phrases if they can't be matched with the corpus. The synthesiser engine is typically very large (hundreds or even thousands of megabytes) due to the size of the corpus. === iFlyTek === Anhui USTC iFlyTek Co., Ltd (iFlyTek) published a W3C paper in which they adapted Speech Synthesis Markup Language to produce a mark-up language called Chinese Speech Synthesis Markup Language (CSSML) which can include additional markup to clarify the pronunciation of characters and to add some prosody information. The amount of data involved is not disclosed by iFlyTek but can be seen from the commercial products that iFlyTek have licensed their technology to; for example, Bider's SpeechPlus is a 1.3 Gigabyte download, 1.2 Gigabytes of which is used for the highly compressed data for a single Chinese voice. iFlyTek's synthesiser can also synthesise mixed Chinese and English text with the same voice (e.g. Chinese sentences containing some English words); they claim their English synthesis to be "average". The iFlyTek corpus appears to be heavily dependent on Chinese characters, and it is not possible to synthesize from pinyin alone. It is sometimes possible by means of CSSML to add pinyin to the characters to disambiguate between multiple possible pronunciations, but this does not always work. === NeoSpeech === There is an online interactive demonstration for NeoSpeech speech synthesis, which accepts Chinese characters and also pinyin if it's enclosed in their proprietary "VTML" markup. === Mac OS === Mac OS had Chinese speech synthesizers available up to version 9. This was removed in 10.0 and reinstated in 10.7 (Lion). === Historical corpus-based synthesizers (no longer available) === A corpus-based approach was taken by Tsinghua University in SinoSonic, with the Harbin dialect voice data taking 800 Megabytes. This was planned to be offered as a download but the link was never activated. Nowadays, only references to it can be found on Internet Archive. Bell Labs' approach, which was demonstrated online in 1997 but subsequently removed, was described in a monograph "Multilingual Text-to-Speech Synthesis: The Bell Labs Approach" (Springer, October 31, 1997, ISBN 978-0-7923-8027-6), and the former employee who was responsible for the project, Chilin Shih (who subsequently worked at the University of Illinois) put some notes about her methods on her website.

Software requirements

Software requirements for a system are the description of what the system should do, the service or services that it provides and the constraints on its operation. The IEEE Standard Glossary of Software Engineering Terminology defines a requirement as: A condition or capability needed by a user to solve a problem or achieve an objective A condition or capability that must be met or possessed by a system or system component to satisfy a contract, standard, specification, or other formally imposed document A documented representation of a condition or capability as in 1 or 2 The activities related to working with software requirements can broadly be broken down into elicitation, analysis, specification, and management. Note that the wording Software requirements is additionally used in software release notes to explain, which depending on software packages are required for a certain software to be built/installed/used. == Elicitation == Elicitation is the gathering and discovery of requirements from stakeholders and other sources. A variety of techniques can be used such as joint application design (JAD) sessions, interviews, document analysis, focus groups, etc. Elicitation is the first step of requirements development. == Analysis == Analysis is the logical breakdown that proceeds from elicitation. Analysis involves reaching a richer and more precise understanding of each requirement and representing sets of requirements in multiple, complementary ways. Requirements Triage or prioritization of requirements is another activity which often follows analysis. This relates to Agile software development in the planning phase, e.g. by Planning poker, however it might not be the same depending on the context and nature of the project and requirements or product/service that is being built. == Specification == Specification involves representing and storing the collected requirements knowledge in a persistent and well-organized fashion that facilitates effective communication and change management. Use cases, user stories, functional requirements, and visual analysis models are popular choices for requirements specification. == Validation == Validation involves techniques to confirm that the correct set of requirements has been specified to build a solution that satisfies the project's business objectives, and to detect and correct errors in the requirements before implementation. == Management == Requirements change during projects and there are often many of them. Management of this change becomes paramount to ensuring that the correct software is built for the stakeholders. == Tool support for Requirements Engineering == === Tools for Requirements Elicitation, Analysis and Validation === Taking into account that these activities may involve some artifacts such as observation reports (user observation), questionnaires (interviews, surveys and polls), use cases, user stories; activities such as requirement workshops (charrettes), brainstorming, mind mapping, role-playing; and even, prototyping; software products providing some or all of these capabilities can be used to help achieve these tasks. There is at least one author who advocates, explicitly, for mind mapping tools such as FreeMind; and, alternatively, for the use of specification by example tools such as Concordion. Additionally, the ideas and statements resulting from these activities may be gathered and organized with wikis and other collaboration tools such as Trello. The features actually implemented and standards compliance vary from product to product. === Tools for Requirements Specification === A Software requirements specification (SRS) document might be created using general-purpose software like a word processor or one of several specialized tools. Some of these tools can import, edit, export and publish SRS documents. It may help to make SRS documents while following a standardised structure and methodology, such as ISO/IEC/IEEE 29148:2018. Likewise, software may or not use some standard to import or export requirements (such as ReqIF) or not allow these exchanges at all. === Tools for Requirements Document Verification === Tools of this kind verify if there are any errors in a requirements document according to some expected structure or standard. === Tools for Requirements Comparison === Tools of this kind compare two requirement sets according to some expected document structure and standard. === Tools for Requirements Merge and Update === Tools of this kind allow the merging and update of requirement documents. === Tools for Requirements Traceability === Tools of this kind allow tracing requirements to other artifacts such as models and source code (forward traceability) or, to previous ones such as business rules and constraints (backwards traceability). === Tools for Model-Based Software or Systems Requirement Engineering === Model-based systems engineering (MBSE) is the formalised application of modelling to support system requirements, design, analysis, verification and validation activities beginning in the conceptual design phase and continuing throughout development and later lifecycle phases. It is also possible to take a model-based approach for some stages of the requirements engineering and, a more traditional one, for others. Very many combinations might be possible. The level of formality and complexity depends on the underlying methodology involved (for instance, i is much more formal than SysML and, even more formal than UML) === Tools for general Requirements Engineering === Tools in this category may provide some mix of the capabilities mentioned previously and others such as requirement configuration management and collaboration. The features actually implemented and standards compliance vary from product to product. There are even more capable or general tools that support other stages and activities. They are classified as ALM tools.

Exploratory search

Exploratory search is a specialization of information exploration which represents the activities carried out by searchers who are: unfamiliar with the domain of their goal (i.e. need to learn about the topic in order to understand how to achieve their goal) or unsure about the ways to achieve their goals (either the technology or the process) or unsure about their goals in the first place. Exploratory search is distinguished from known-item search, for which the searcher has a particular target in mind. Consequently, exploratory search covers a broader class of activities than typical information retrieval, such as investigating, evaluating, comparing, and synthesizing, where new information is sought in a defined conceptual area; exploratory data analysis is another example of an information exploration activity. Typically, therefore, such users generally combine querying and browsing strategies to foster learning and investigation. == History == Exploratory search is a topic that has grown from the fields of information retrieval and information seeking but has become more concerned with alternatives to the kind of search that has received the majority of focus (returning the most relevant documents to a Google-like keyword search). The research is motivated by questions like "What if the user doesn't know which keywords to use?" or "What if the user isn't looking for a single answer?" Consequently, research has begun to focus on defining the broader set of information behaviors in order to learn about the situations when a user is, or feels, limited by only having the ability to perform a keyword search. In the last few years, a series of workshops has been held at various related and key events. In 2005, the Exploratory Search Interfaces workshop focused on beginning to define some of the key challenges in the field. Since then a series of other workshops has been held at related conferences: Evaluating Exploratory Search at SIGIR06 and Exploratory Search and HCI at CHI07 (in order to meet with the experts in human–computer interaction). In March 2008, an Information Processing and Management special issue focused particularly on the challenges of evaluating exploratory search, given the reduced assumptions that can be made about scenarios of use. In June 2008, the National Science Foundation sponsored an invitational workshop to identify a research agenda for exploratory search and similar fields for the coming years. == Research challenges == === Important scenarios === With the majority of research in the information retrieval community focusing on typical keyword search scenarios, one challenge for exploratory search is to further understand the scenarios of use for when keyword search is not sufficient. An example scenario, often used to motivate the research by mSpace, states: if a user does not know much about classical music, how should they even begin to find a piece that they might like. Similarly, for patients or their carers, if they don't know the right keywords for their health problems, how can they effectively find useful health information for themselves? === Designing new interfaces === With one of the motivations being to support users when keyword search is not enough, some research has focused on identifying alternative user interfaces and interaction models that support the user in different ways. An example is faceted search which presents diverse category-style options to the users, so that they can choose from a list instead of guess a possible keyword query. Many of the interactive forms of search, including faceted browsers, are being considered for their support of exploratory search conditions. Computational cognitive models of exploratory search have been developed to capture the cognitive complexities involved in exploratory search. Model-based dynamic presentation of information cues are proposed to facilitate exploratory search performance. === Evaluating interfaces === As the tasks and goals involved with exploratory search are largely undefined or unpredictable, it is very hard to evaluate systems with the measures often used in information retrieval. Accuracy was typically used to show that a user had found a correct answer, but when the user is trying to summarize a domain of information, the correct answer is near impossible to identify, if not entirely subjective (for example: possible hotels to stay in Paris). In exploration, it is also arguable that spending more time (where time efficiency is typically desirable) researching a topic shows that a system provides increased support for investigation. Finally, and perhaps most importantly, giving study participants a well specified task could immediately prevent them from exhibiting exploratory behavior. === Models of exploratory search behavior === There have been recent attempts to develop a process model of exploratory search behavior, especially in social information system (e.g., see models of collaborative tagging. The process model assumes that user-generated information cues, such as social tags, can act as navigational cues that facilitate exploration of information that others have found and shared with other users on a social information system (such as social bookmarking system). These models provided extension to existing process model of information search that characterizes information-seeking behavior in traditional fact-retrievals using search engines. Recent development in exploratory search is often concentrated in predicting users' search intents in interaction with the user. Such predictive user modeling, also referred as intent modeling, can help users to get accustomed to a body of domain knowledge and help users to make sense of the potential directions to be explored around their initial, often vague, expression of information needs. == Major figures == Key figures, including experts from both information seeking and human–computer interaction, are: Marcia Bates Nicholas Belkin Gary Marchionini m.c. schraefel Ryen W. White

Energy informatics

Energy informatics is a research field covering the use of information and communication technology to address energy utilization and management challenges. Methods used for "smart" implementations often combine IoT sensors with artificial intelligence and machine learning. Energy Informatics is founded on flow networks that are the major suppliers and consumers of energy. Their efficiency can be improved by collecting and analyzing information. == Application areas == The field among other consider application areas within: Smart Buildings by developing ICT-centred solutions for improving the energy-efficiency of buildings. Smart Cities by investigating the synergies between demand patterns and supply availability of energy flows in cities and communities to improve energy efficiency, increase integration of renewable sources, and provide resilience towards system faults caused by extreme situations, like hurricanes and flooding. Smart Industries including the development of ICT-centred solutions for improving the energy efficiency and predictability of energy intensive industrial processes, without compromising process and product quality. Smart Energy Networks by developing ICT-centred solutions for coordinating the supply and demand in environmentally sustainable energy networks.

Kinodynamic planning

In robotics and motion planning, kinodynamic planning is a class of problems for which velocity, acceleration, and force/torque bounds must be satisfied, together with kinematic constraints such as avoiding obstacles. The term was coined by Bruce Donald, Pat Xavier, John Canny, and John Reif. Donald et al. developed the first polynomial-time approximation schemes (PTAS) for the problem. By providing a provably polynomial-time ε-approximation algorithm, they resolved a long-standing open problem in optimal control. Their first paper considered time-optimal control ("fastest path") of a point mass under Newtonian dynamics, amidst polygonal (2D) or polyhedral (3D) obstacles, subject to state bounds on position, velocity, and acceleration. Later they extended the technique to many other cases, for example, to 3D open-chain kinematic robots under full Lagrangian dynamics. == Modern approaches == Since the foundational theoretical work of the 1990s, the field has evolved significantly with new algorithmic approaches that address the computational and practical limitations of early methods. === Sampling-based methods === Many practical heuristic algorithms based on stochastic optimization and iterative sampling have been developed by a wide range of authors to address the kinodynamic planning problem. Popular approaches include extensions of RRT algorithms such as RRT for kinodynamic systems, and sampling-based methods like Model Predictive Path Integral (MPPI) control. These stochastic techniques have been shown to work well in practice and can handle complex, high-dimensional state spaces more efficiently than deterministic methods. However, all motion planning methods are subject to the PSPACE-hardnesss of classical motion planning even without dynamics, which means (assuming the usual structural complexity conjectures) they all can be worst-case exponential-time in the state-space dimension (the number of degrees of freedom). On the other hand, the deterministic methods have provable guarantees of completeness, accuracy, and complexity (for fixed dimension, they are polynomial-time not only in the geometric complexity, but also in ( 1 / ε ) {\displaystyle (1/\varepsilon )} , the closeness of the desired approximation), whereas most of the recent heuristic/stochastic methods sacrifice at least one of these criteria. === Mixed-integer optimization approaches === Recent advances in mixed-integer programming have enabled new deterministic approaches to kinodynamic planning. These methods formulate the planning problem as an optimization task that simultaneously determines the spatial path and control sequence while respecting all kinodynamic constraints. By using techniques such as McCormick envelopes to handle bilinear constraints, these approaches can provide globally optimal solutions with mathematical guarantees while achieving significant computational speedups over traditional methods. === Genetic algorithm approaches === Genetic algorithms have also been adapted for kinodynamic planning, particularly for gradient-free optimization in challenging terrain. These methods use evolutionary computation to optimize trajectories over receding horizons, with specialized mutation operators that ensure vehicle controls remain within operational limits. This approach is particularly useful when dealing with non-differentiable cost functions or when gradient information is unavailable or unreliable. === Three-dimensional terrain planning === The foundational theoretical work of the 1990s was extended to higher degrees of freedom, and even to n {\displaystyle n} -link, 3D open-chain kinematic robots under full Lagrangian dynamics. However, many of the subsequent heuristic techniques (typically employing stochastic optimization) were confined to planar environments. More recent kinodynamic planning has extended beyond these planar environments to handle complex 3D terrains represented as simplicial complexes or triangular meshes. This advancement is particularly important for applications such as autonomous vehicle navigation in off-road environments, where elevation changes and terrain geometry significantly impact vehicle dynamics. These methods must account for pitch angles, surface curvature, and the coupling between terrain geometry and vehicle kinodynamic constraints. == Performance and guarantees == The landscape of performance guarantees in kinodynamic planning has evolved considerably. While early heuristic methods could not guarantee optimality, recent mixed-integer approaches have demonstrated the ability to find globally optimal solutions with proven constraint satisfaction. Experimental comparisons have shown that modern optimization-based planners can achieve execution times several orders of magnitude faster than sampling-based methods while maintaining strict adherence to kinodynamic constraints. However, the choice of method often depends on the specific application requirements. Sampling-based methods remain valuable for their ability to quickly find feasible solutions in high-dimensional spaces and their robustness to modeling uncertainties. Optimization-based methods excel when optimality guarantees and constraint compliance are critical, particularly in safety-critical applications. == Applications == Kinodynamic planning finds applications across numerous domains including: Autonomous vehicles: Path planning for cars, trucks, and other ground vehicles that must respect acceleration, steering, and velocity limits Aerial robotics: Trajectory planning for quadrotors and other unmanned aerial vehicles with dynamic constraints Manipulation: Planning for robotic arms where joint velocities, accelerations, and torques are limited Legged locomotion: Footstep and trajectory planning for walking and running robots Space robotics: Planning under thrust and fuel constraints for spacecraft and rovers

Owain Evans

Owain Rhys Evans is a British artificial intelligence researcher who works on AI alignment and machine learning safety. He founded Truthful AI, a research group based in Berkeley, California, and is an affiliate of the Center for Human Compatible AI (CHAI) at the University of California, Berkeley. His research addresses AI truthfulness, emergent behaviors in large language models, and the alignment of AI systems with human values. == Education == Evans earned a Bachelor of Arts in philosophy and mathematics from Columbia University in 2008 and a PhD in philosophy from the Massachusetts Institute of Technology in 2015. His doctoral research focused on Bayesian computational models of human preferences and decision-making. == Career == After completing his doctorate, Evans held positions at the Future of Humanity Institute (FHI) at the University of Oxford, first as a postdoctoral research fellow and later as a research scientist. While at FHI, he co-authored a survey of machine learning researchers on timelines for human-level AI, published in the Journal of Artificial Intelligence Research. The survey was reported on by Newsweek, New Scientist, the BBC, and The Economist. He was also among the co-authors of a 2018 report on the potential for misuse of AI technologies, published by researchers at Oxford, Cambridge, and other institutions. Since 2022, Evans has been based in Berkeley, where he founded Truthful AI, a non-profit research group that studies AI truthfulness, deception, and emergent behaviors in large language models. == Research == Evans's early work examined challenges in inverse reinforcement learning when human behavior is irrational or biased, proposing methods for AI systems to infer preferences from imperfect human demonstrations. He co-developed TruthfulQA (2021), a benchmark that tests whether language models give truthful answers rather than repeating common misconceptions. Initial evaluations found that larger models were not more truthful, suggesting that scaling alone does not improve factual accuracy. The benchmark has since been used by AI developers to evaluate large language models. He also co-authored a paper proposing design and governance strategies for building AI systems that do not deceive or hallucinate. In 2023, Evans and collaborators described the "reversal curse", showing that language models trained on a fact in one direction (e.g. "A is B") often cannot answer the corresponding reverse query ("B is A"). His group also developed a benchmark for evaluating situational awareness in language models. In 2025, Evans and colleagues published a study in Nature on what they termed "emergent misalignment": fine-tuning a language model on a narrow task (writing insecure code) caused it to produce unrelated harmful outputs without explicit instruction to do so. Later that year, Evans and collaborators (including researchers at Anthropic) reported that hidden behavioral traits can transfer between language models through training data, even when those traits are not explicitly present in the data, a phenomenon they called "subliminal learning". == Public engagement == In November 2025, Evans delivered the Hinton Lectures, a keynote lecture series on AI safety co-founded by Geoffrey Hinton and the Global Risk Institute.

Uncertain database

An uncertain database is a kind of database studied in database theory. The goal of uncertain databases is to manage information on which there is some uncertainty. Uncertain databases make it possible to explicitly represent and manage uncertainty on the data, usually in a succinct way. == Formal definition == At the basis of uncertain databases is the notion of possible world. Specifically, a possible world of an uncertain database is a (certain) database which is one of the possible realizations of the uncertain database. A given uncertain database typically has more than one, and potentially infinitely many, possible worlds. A formalism to represent uncertain databases then explains how to succinctly represent a set of possible worlds into one uncertain database. == Types of uncertain databases == Uncertain database models differ in how they represent and quantify these possible worlds: Incomplete databases are a compact representation of the set of possible worlds – the use of NULL in SQL, arguably the most commonplace instantiation of uncertain databases, is an example of incomplete database model. Probabilistic databases are a compact representation of a probability distribution over the set of possible worlds. Fuzzy databases are a compact representation of a fuzzy set of the possible worlds. Though mostly studied in the relational setting, uncertain database models can also be defined in other relational models such as graph databases or XML databases. === Incomplete database === The most common database model is the relational model. Multiple incomplete database models have been defined over the relational model, that form extensions to the relational algebra. These have been called Imieliński–Lipski algebras: Relations with NULL values, also called Codd tables c-tables v-tables === Example === The following table is a relation of an incomplete database, described in the formalism of NULL values: There are infinitely many possible worlds for this incomplete database, obtained by replacing the "NULL" values with concrete values. For instance, the following relation is a possible world: