AI Chatbot Quill

AI Chatbot Quill — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Peanut App

    Peanut App

    Peanut, a product of Peanut App Ltd. is an online community for women who are planning to become pregnant, women who are pregnant, women who have had children, and women who are experiencing menopause. Profiles of potential friends are displayed to users who can swipe up to show intent to connect. Users can also connect via discussion threads, groups, and live audio conversations. The app allows users to select their stage of life (trying to conceive, pregnancy, motherhood, or menopause), so as to meet women at a similar life stage, and to discover relevant content. Peanut was founded by Michelle Kennedy shortly after she left Bumble, a female-first dating app. She has described Peanut as, "the app she wishes she had when she first became a mother". == History == Peanut was initially launched in 2017 for mothers and pregnant women. The app focuses on helping users find others with shared interests, such as spoken languages, occupations, and hobbies. It also displays a woman's life stage, such as the age of her children, or the stage of pregnancy. In 2018, it launched a community discussion feature that intended to give women an "alternative to other social platforms". In 2019, it started to serve women who are trying to conceive. In April 2021, it integrated live audio, in response to the COVID-19 pandemic, and the restrictions around in-person socializing. in September 2021, it started to include women who are navigating perimenopause, menopause, and postmenopausal. Although it had initially catered for younger women navigating into new families, a large number of users had undergone surgically or chemically induced menopause due to medical conditions. In July 2021, Peanut launched an investment micro fund, Peanut StartHER, focused on investing in women-owned businesses, as well as other historically excluded founders. == Operation == The Peanut app is a social network exclusively for women, focusing on topics of pregnancy, motherhood, fertility, and menopause. It is available on iOS and Android devices. Users must prove their identity, in keeping with the primary function of in-app safety, and then they can create a profile to interact with other users. For pregnant users, the “Bump Buddies” feature helps connect them with other Peanut users who have a similar due date, which aimed to help expecting mothers combat loneliness during the COVID-19 pandemic. Peanut users also have the option to join “Groups” ‒ sub-sections of users focused on specific topics, including (but not limited to) location, life stage, pregnancy due date, and interests or hobbies. The live voice chat feature “Pods”, enables Peanut users to socialize without the pressure of photos or video chat. It offers features such as a muted audience of listeners who need to virtually raise their hand to speak, emoji reactions, and hosts who can moderate the conversations and invite people to speak.

    Read more →
  • Semantic translation

    Semantic translation

    Semantic translation is the process of using semantic information to aid in the translation of data in one representation or data model to another representation or data model. Semantic translation takes advantage of semantics that associate meaning with individual data elements in one dictionary to create an equivalent meaning in a second system. An example of semantic translation is the conversion of XML data from one data model to a second data model using formal ontologies for each system such as the Web Ontology Language (OWL). This is frequently required by intelligent agents that wish to perform searches on remote computer systems that use different data models to store their data elements. The process of allowing a single user to search multiple systems with a single search request is also known as federated search. Semantic translation should be differentiated from data mapping tools that do simple one-to-one translation of data from one system to another without actually associating meaning with each data element. Semantic translation requires that data elements in the source and destination systems have "semantic mappings" to a central registry or registries of data elements. The simplest mapping is of course where there is equivalence. There are three types of Semantic equivalence: Class Equivalence - indicating that class or "concepts" are equivalent. For example: "Person" is the same as "Individual" Property Equivalence - indicating that two properties are equivalent. For example: "PersonGivenName" is the same as "FirstName" Instance Equivalence - indicating that two individual instances of objects are equivalent. For example: "Dan Smith" is the same person as "Daniel Smith" Semantic translation is very difficult if the terms in a particular data model do not have direct one-to-one mappings to data elements in a foreign data model. In that situation, an alternative approach must be used to find mappings from the original data to the foreign data elements. This problem can be alleviated by centralized metadata registries that use the ISO-11179 standards such as the National Information Exchange Model (NIEM).

    Read more →
  • AI-driven design automation

    AI-driven design automation

    AI-driven design automation is the use of artificial intelligence (AI) to automate and improve different parts of the electronic design automation (EDA) process. It is particularly important in the design of integrated circuits (chips) and complex electronic systems, where it can potentially increase productivity, decrease costs, and speed up design cycles. AI Driven Design Automation uses several methods, including machine learning, expert systems, and reinforcement learning. These are used for many tasks, from planning a chip's architecture and logic synthesis to its physical design and final verification. == History == === 1980s–1990s: Expert systems and early experiments === The use of AI for design automation originated in the 1980s and 1990s, mainly with the creation of expert systems. These systems tried to capture the knowledge and practical rules used by human design experts, and used these rules, along with reasoning engines, to direct the design process. A notable early project was the ULYSSES system from Carnegie Mellon University. ULYSSES was a CAD tool integration environment that let expert designers turn their design methods into scripts that could be run automatically. It treated design tools as sources of knowledge that a scheduler could manage. Another example was the ADAM (Advanced Design AutoMation) system at the University of Southern California, which used an expert system called the Design Planning Engine. This engine figured out design strategies on the fly and handled different design jobs by organizing specialized knowledge into structured formats called frames. Other systems like DAA (Design Automation Assistant) used a rule-based approach for specific jobs, such as register transfer level (RTL) design for systems like the IBM 370. Researchers at Carnegie Mellon University also created TALIB, an expert system for mask layout that used over 1200 rules, and EMUCS/DAA for CPU architectural design which used about 70 rules. These projects showed that AI worked better for problems where relatively few rules were required to describe much larger amounts of data. At the same time, there was a surge of tools called silicon compilers like MacPitts, Arsenic, and Palladio. They used algorithms and search techniques to explore different design paradigms. This was another way to automate design, even if it was not always based on expert systems. Early tests with neural networks in VLSI design also happened during this time, although they were not as common as systems based on rules. === 2000s: Introduction of machine learning === In the 2000s, interest in AI for design automation increased. This was mostly because of better machine learning (ML) algorithms and more available data from design and manufacturing. For example, they were used to model and reduce the effects of small manufacturing differences in semiconductor devices. This became very important as the size of components on chips became smaller. The large amount of data created during chip design provided the foundation needed to train smarter ML models. This allowed for predicting outcomes and optimizing in areas that were hard to automate before. === 2016–2020: Reinforcement learning and large scale initiatives === A major turning point happened in the mid to late 2010s, sparked by successes in other areas of AI. The success of DeepMind's AlphaGo in mastering the game of Go inspired researchers. They began to apply reinforcement learning (RL) to difficult EDA problems. These problems often require searching through many options and making a series of decisions. In 2018, the U.S. DARPA started the Intelligent Design of Electronic Assets (IDEA) program. A main goal of IDEA was to create a fully automated layout generator that required no human intervention, able to produce a chip design ready for manufacturing from RTL specifications in 24 hours. Another big initiative was the OpenROAD project, a large effort under IDEA led by UC San Diego with industry and university partners, aimed to build an open source, independent toolchain. It used machine learning, parallelization and divide and conquer approaches. A much-publicized but controversial demonstration of RL's potential came from Google researchers between 2020 and 2021. They created a deep reinforcement learning method for planning the layout of a chip, known as floorplanning. They reported that this method created layouts that were as good as or better than those made by human experts, and it did so in less than six hours. This method used a type of network called a graph convolutional neural network. It showed that it could learn general patterns that could be applied to new problems, getting better as it saw more chip designs. The technology was later used to design Google's Tensor Processing Unit (TPU) accelerators. However, in the original paper, the improvement (if any) from AI was not demonstrated. There was no comparison with existing non-AI tools performing the same task, and since the data is proprietary, no ability for anyone else to perform this comparison. Various efforts to reproduce the AI algorithm, and compare its results with various commercial and academic tools, have yielded mixed results with no conclusive advantage to AI. === 2020s: Autonomous systems and agents === Entering the 2020s, the industry saw the commercial launch of autonomous AI driven EDA systems. For example, Synopsys launched DSO.ai (Design Space Optimization AI) in early 2020, calling it the first autonomous artificial intelligence application for chip design in the industry. This system uses reinforcement learning to search for the best ways to optimize a design within the huge number of possible solutions, trying to improve power, performance, and area (PPA). By 2023, DSO.ai had been used to produce over 100 commercial chips, showing mainstream adoption. Synopsys later grew its AI tools into a suite called Synopsys.ai. The goal was to use AI in the entire EDA workflow, including verification and testing. These advancements, which combine modern AI methods with cloud computing and large data resources, have led to talks about a new phase in EDA. Industry experts and participants sometimes call this 'EDA 4.0'. This new era is defined by the widespread use of AI and machine learning to deal with growing design complexity, automate more of the design process, and help engineers handle the huge amounts of data that EDA tools create. The purpose of EDA 4.0 is to optimize product performance, get products to market faster and make development and manufacturing smoother through intelligent automation. == Applications == Artificial intelligence (AI) is now used in many stages of the electronic design workflow. It aims to improve productivity, get better results, and handle the growing complexity of modern integrated circuits. AI helps designers from the very first ideas about architecture all the way to manufacturing and testing. === High level synthesis and architectural exploration === In the first phases of chip design, AI helps with High Level Synthesis (HLS) and exploring different system level design options (DSE). These processes are key for turning general ideas into detailed hardware plans. AI algorithms, often using supervised learning, are used to build simpler, substitute models. These models can quickly guess important design measurements like area, performance, and power for many different architectural options or HLS settings. For example, the Ithemal tool uses deep neural networks to estimate how fast basic code blocks will run, which helps in making processor architecture decisions. Similarly, PRIMAL uses machine learning estimate power use at the register transfer level (RTL), giving early information about how much power the chip will use. Reinforcement learning (RL) and Bayesian optimization are also used to guide the DSE process. They help search through the many parameters to find the best HLS settings or architectural details like cache sizes. LLMs are also being tested for creating architectural plans or initial C code for HLS, as seen with GPT4AIGChip. === Logic synthesis and optimization === Logic synthesis starts from a high level hardware description and generates an optimized list of electronic gates, known as a gate level netlist, that is ready for placement, routing, and then construction in a specific manufacturing process. AI methods help with different parts of this process, including logic optimization, technology mapping, and making improvements after mapping. Supervised learning, especially with Graph Neural Networks (GNNs), is good at handling data or problems that can be represented as graphs. Since circuit diagrams are instances of directed graphs, supervised learning can help create models that predict design properties like power or error rates in circuits. In logic synthesis and optimization reinforcement learning is used to perform logic optimization directly. In some cases ag

    Read more →
  • Terminology extraction

    Terminology extraction

    Terminology extraction (also known as term extraction, glossary extraction, term recognition, or terminology mining) is a subtask of information extraction. The goal of terminology extraction is to automatically extract relevant terms from a given corpus. In the semantic web era, a growing number of communities and networked enterprises started to access and interoperate through the internet. Modeling these communities and their information needs is important for several web applications, like topic-driven web crawlers, web services, recommender systems, etc. The development of terminology extraction is also essential to the language industry. One of the first steps to model a knowledge domain is to collect a vocabulary of domain-relevant terms, constituting the linguistic surface manifestation of domain concepts. Several methods to automatically extract technical terms from domain-specific document warehouses have been described in the literature. Typically, approaches to automatic term extraction make use of linguistic processors (part of speech tagging, phrase chunking) to extract terminological candidates, i.e. syntactically plausible terminological noun phrases. Noun phrases include compounds (e.g. "credit card"), adjective noun phrases (e.g. "local tourist information office"), and prepositional noun phrases (e.g. "board of directors"). In English, the first two (compounds and adjective noun phrases) are the most frequent. Terminological entries are then filtered from the candidate list using statistical and machine learning methods. Once filtered, because of their low ambiguity and high specificity, these terms are particularly useful for conceptualizing a knowledge domain or for supporting the creation of a domain ontology or a terminology base. Furthermore, terminology extraction is a very useful starting point for semantic similarity, knowledge management, human translation and machine translation, etc. == Bilingual terminology extraction == The methods for terminology extraction can be applied to parallel corpora. Combined with e.g. co-occurrence statistics, candidates for term translations can be obtained. Bilingual terminology can be extracted also from comparable corpora (corpora containing texts within the same text type, domain but not translations of documents between each other).

    Read more →
  • Deconvolution

    Deconvolution

    In mathematics, deconvolution is the inverse of convolution. Both operations are used in signal processing and image processing. For example, it may be possible to recover the original signal after a filter (convolution) by using a deconvolution method with a certain degree of accuracy. Due to the measurement error of the recorded signal or image, it can be demonstrated that the worse the signal-to-noise ratio (SNR), the worse the reversing of a filter will be; hence, inverting a filter is not always a good solution as the error amplifies. Deconvolution offers a solution to this problem. The foundations for deconvolution and time-series analysis were largely laid by Norbert Wiener of the Massachusetts Institute of Technology in his book Extrapolation, Interpolation, and Smoothing of Stationary Time Series (1949). The book was based on work Wiener had done during World War II but that had been classified at the time. Some of the early attempts to apply these theories were in the fields of weather forecasting and economics. == Description == In general, the objective of deconvolution is to find the solution f of a convolution equation of the form: f ∗ g = h {\displaystyle fg=h\,} Usually, h is some recorded signal, and f is some signal that we wish to recover, but has been convolved with a filter or distortion function g, before we recorded it. Usually, h is a distorted version of f and the shape of f can't be easily recognized by the eye or simpler time-domain operations. The function g represents the impulse response of an instrument or a driving force that was applied to a physical system. If we know g, or at least know the form of g, then we can perform deterministic deconvolution. However, if we do not know g in advance, then we need to estimate it. This can be done using methods of statistical estimation or building the physical principles of the underlying system, such as the electrical circuit equations or diffusion equations. There are several deconvolution techniques, depending on the choice of the measurement error and deconvolution parameters: === Raw deconvolution === When the measurement error is very low (ideal case), deconvolution collapses into a filter reversing. This kind of deconvolution can be performed in the Laplace domain. By computing the Fourier transform of the recorded signal h and the system response function g, you get H and G, with G as the transfer function. Using the convolution theorem, F = H / G {\displaystyle F=H/G\,} where F is the estimated Fourier transform of f. Finally, the inverse Fourier transform of the function F is taken to find the estimated deconvolved signal f. Note that G is at the denominator and could amplify elements of the error model if present. === Deconvolution with noise === In physical measurements, the situation is usually closer to ( f ∗ g ) + ε = h {\displaystyle (fg)+\varepsilon =h\,} In this case ε is noise that has entered our recorded signal. If a noisy signal or image is assumed to be noiseless, the statistical estimate of g will be incorrect. In turn, the estimate of ƒ will also be incorrect. The lower the signal-to-noise ratio, the worse the estimate of the deconvolved signal will be. That is the reason why inverse filtering the signal (as in the "raw deconvolution" above) is usually not a good solution. However, if at least some knowledge exists of the type of noise in the data (for example, white noise), the estimate of ƒ can be improved through techniques such as Wiener deconvolution. == Applications == === Seismology === The concept of deconvolution had an early application in reflection seismology. In 1950, Enders Robinson was a graduate student at MIT. He worked with others at MIT, such as Norbert Wiener, Norman Levinson, and economist Paul Samuelson, to develop the "convolutional model" of a reflection seismogram. This model assumes that the recorded seismogram s(t) is the convolution of an Earth-reflectivity function e(t) and a seismic wavelet w(t) from a point source, where t represents recording time. Thus, our convolution equation is s ( t ) = ( e ∗ w ) ( t ) . {\displaystyle s(t)=(ew)(t).\,} The seismologist is interested in e, which contains information about the Earth's structure. By the convolution theorem, this equation may be Fourier transformed to S ( ω ) = E ( ω ) W ( ω ) {\displaystyle S(\omega )=E(\omega )W(\omega )\,} in the frequency domain, where ω {\displaystyle \omega } is the frequency variable. By assuming that the reflectivity is white, we can assume that the power spectrum of the reflectivity is constant, and that the power spectrum of the seismogram is the spectrum of the wavelet multiplied by that constant. Thus, | S ( ω ) | ≈ k | W ( ω ) | . {\displaystyle |S(\omega )|\approx k|W(\omega )|.\,} If we assume that the wavelet is minimum phase, we can recover it by calculating the minimum phase equivalent of the power spectrum we just found. The reflectivity may be recovered by designing and applying a Wiener filter that shapes the estimated wavelet to a Dirac delta function (i.e., a spike). The result may be seen as a series of scaled, shifted delta functions (although this is not mathematically rigorous): e ( t ) = ∑ i = 1 N r i δ ( t − τ i ) , {\displaystyle e(t)=\sum _{i=1}^{N}r_{i}\delta (t-\tau _{i}),} where N is the number of reflection events, r i {\displaystyle r_{i}} are the reflection coefficients, t − τ i {\displaystyle t-\tau _{i}} are the reflection times of each event, and δ {\displaystyle \delta } is the Dirac delta function. In practice, since we are dealing with noisy, finite bandwidth, finite length, discretely sampled datasets, the above procedure only yields an approximation of the filter required to deconvolve the data. However, by formulating the problem as the solution of a Toeplitz matrix and using Levinson recursion, we can relatively quickly estimate a filter with the smallest mean squared error possible. We can also do deconvolution directly in the frequency domain and get similar results. The technique is closely related to linear prediction. === Optics and other imaging === In optics and imaging, the term "deconvolution" is specifically used to refer to the process of reversing the optical distortion that takes place in an optical microscope, electron microscope, telescope, or other imaging instrument, thus creating clearer images. It is usually done in the digital domain by a software algorithm, as part of a suite of microscope image processing techniques. Deconvolution is also practical to sharpen images that suffer from fast motion or jiggles during capturing. Early Hubble Space Telescope images were distorted by a flawed mirror and were sharpened by deconvolution. The usual method is to assume that the optical path through the instrument is optically perfect, convolved with a point spread function (PSF), that is, a mathematical function that describes the distortion in terms of the pathway a theoretical point source of light (or other waves) takes through the instrument. Usually, such a point source contributes a small area of fuzziness to the final image. If this function can be determined, it is then a matter of computing its inverse or complementary function, and convolving the acquired image with that. The result is the original, undistorted image. In practice, finding the true PSF is impossible, and usually an approximation of it is used, theoretically calculated or based on some experimental estimation by using known probes. Real optics may also have different PSFs at different focal and spatial locations, and the PSF may be non-linear. The accuracy of the approximation of the PSF will dictate the final result. Different algorithms can be employed to give better results, at the price of being more computationally intensive. Since the original convolution discards data, some algorithms use additional data acquired at nearby focal points to make up some of the lost information. Regularization in iterative algorithms (as in expectation-maximization algorithms) can be applied to avoid unrealistic solutions. When the PSF is unknown, it may be possible to deduce it by systematically trying different possible PSFs and assessing whether the image has improved. This procedure is called blind deconvolution. Blind deconvolution is a well-established image restoration technique in astronomy, where the point nature of the objects photographed exposes the PSF thus making it more feasible. It is also used in fluorescence microscopy for image restoration, and in fluorescence spectral imaging for spectral separation of multiple unknown fluorophores. The most common iterative algorithm for the purpose is the Richardson–Lucy deconvolution algorithm; the Wiener deconvolution (and approximations) are the most common non-iterative algorithms. For some specific imaging systems such as laser pulsed terahertz systems, PSF can be modeled mathematically. As a result, as shown in the figure, deconvolution of the modeled PS

    Read more →
  • Kunstweg

    Kunstweg

    Bürgi's Kunstweg is a set of algorithms developed by Jost Bürgi in the late 16th century. They are used to calculate sines to arbitrary precision.. Bürgi used these algorithms to calculate a Canon Sinuum, a sine table in increments of 2 arc seconds. It is believed that the table featured values accurate to eight sexagesimal places. Some authors have speculated that the table only covered the range from 0° to 45°, although there is no evidence supporting this claim. Such tables were crucial for maritime navigation. Johannes Kepler described the Canon Sinuum as the most precise sine table known at the time. Bürgi explained his algorithms in his work Fundamentum Astronomiae, which he presented to Emperor Rudolf II in 1592. The Kunstweg algorithm calculates sine values iteratively. In each step, the value of a cell is the sum of the two preceding cells in the same column. The final cell's value is halved before beginning the next iteration. Ultimately, the values in the last column are normalized. Accurate sine approximations are achieved after only a few iterations. In 2015, Menso Folkerts and coworkers demonstrated that this iterative process does indeed converge toward the true sine values. According to them this was the first step towards differential calculus.

    Read more →
  • Ontology engineering

    Ontology engineering

    In computer science, information science and systems engineering, ontology engineering is a field which studies the methods and methodologies for building ontologies, which encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities of a given domain of interest. In a broader sense, this field also includes a knowledge construction of the domain using formal ontology representations such as OWL/RDF. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central in conceptual modeling. Ontology engineering aims at making explicit the knowledge contained within software applications, and within enterprises and business procedures for a particular domain. Ontology engineering offers a direction towards solving the inter-operability problems brought about by semantic obstacles, i.e. the obstacles related to the definitions of business terms and software classes. Ontology engineering is a set of tasks related to the development of ontologies for a particular domain. Automated processing of information not interpretable by software agents can be improved by adding rich semantics to the corresponding resources, such as video files. One of the approaches for the formal conceptualization of represented knowledge domains is the use of machine-interpretable ontologies, which provide structured data in, or based on, RDF, RDFS, and OWL. Ontology engineering is the design and creation of such ontologies, which can contain more than just the list of terms (controlled vocabulary); they contain terminological, assertional, and relational axioms to define concepts (classes), individuals, and roles (properties) (TBox, ABox, and RBox, respectively). Ontology engineering is a relatively new field of study concerning the ontology development process, the ontology life cycle, the methods and methodologies for building ontologies, and the tool suites and languages that support them. A common way to provide the logical underpinning of ontologies is to formalize the axioms with description logics, which can then be translated to any serialization of RDF, such as RDF/XML or Turtle. Beyond the description logic axioms, ontologies might also contain SWRL rules. The concept definitions can be mapped to any kind of resource or resource segment in RDF, such as images, videos, and regions of interest, to annotate objects, persons, etc., and interlink them with related resources across knowledge bases, ontologies, and LOD datasets. This information, based on human experience and knowledge, is valuable for reasoners for the automated interpretation of sophisticated and ambiguous contents, such as the visual content of multimedia resources. Application areas of ontology-based reasoning include, but are not limited to, information retrieval, automated scene interpretation, and knowledge discovery. == Languages == An ontology language is a formal language used to encode the ontology. There are a number of such languages for ontologies, both proprietary and standards-based: Common logic is ISO standard 24707, a specification for a family of ontology languages that can be accurately translated into each other. The Cyc project has its own ontology language called CycL, based on first-order predicate calculus with some higher-order extensions. The Gellish language includes rules for its own extension and thus integrates an ontology with an ontology language. IDEF5 is a software engineering method to develop and maintain usable, accurate, domain ontologies. KIF is a syntax for first-order logic that is based on S-expressions. Rule Interchange Format (RIF), F-Logic and its successor ObjectLogic combine ontologies and rules. OWL is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML and DAML+OIL. OWL is intended to be used over the World Wide Web, and all its elements (classes, properties and individuals) are defined as RDF resources, and identified by URIs. OntoUML is a well-founded language for specifying reference ontologies. SHACL (RDF SHapes Constraints Language) is a language for describing structure of RDF data. It can be used together with RDFS and OWL or it can be used independently from them. XBRL (Extensible Business Reporting Language) is a syntax for expressing business semantics. == Methodologies and tools == DOGMA KAON OntoClean HOZO Protégé (software) Large language models == In life sciences == Life sciences is flourishing with ontologies that biologists use to make sense of their experiments. For inferring correct conclusions from experiments, ontologies have to be structured optimally against the knowledge base they represent. The structure of an ontology needs to be changed continuously so that it is an accurate representation of the underlying domain. Recently, an automated method was introduced for engineering ontologies in life sciences such as Gene Ontology (GO), one of the most successful and widely used biomedical ontology. Based on information theory, it restructures ontologies so that the levels represent the desired specificity of the concepts. Similar information theoretic approaches have also been used for optimal partition of Gene Ontology. Given the mathematical nature of such engineering algorithms, these optimizations can be automated to produce a principled and scalable architecture to restructure ontologies such as GO. Open Biomedical Ontologies (OBO), a 2006 initiative of the U.S. National Center for Biomedical Ontology, provides a common 'foundry' for various ontology initiatives, amongst which are: The Generic Model Organism Project (GMOD) Gene Ontology Consortium Sequence Ontology Ontology Lookup Service The Plant Ontology Consortium Standards and Ontologies for Functional Genomics and more

    Read more →
  • Data quality

    Data quality

    Data quality refers to the state of qualitative or quantitative pieces of information. There are many definitions of data quality, but data is generally considered high quality if it is "fit for [its] intended uses in operations, decision making and planning". Data is deemed of high quality if it correctly represents the real-world construct to which it refers. Apart from these definitions, as the number of data sources increases, the question of internal data consistency becomes significant, regardless of fitness for use for any particular external purpose. People's views on data quality can often be in disagreement, even when discussing the same set of data used for the same purpose. When this is the case, businesses may adopt recognised international standards for data quality (See #International Standards for Data Quality below). Data governance can also be used to form agreed upon definitions and standards, including international standards, for data quality. In such cases, data cleansing, including standardization, may be required in order to ensure data quality. == Definitions == Defining data quality is difficult due to the many contexts data are used in, as well as the varying perspectives among end users, producers, and custodians of data. From a consumer perspective, data quality is: "data that are fit for use by data consumers" data "meeting or exceeding consumer expectations" data that "satisfies the requirements of its intended use" From a business perspective, data quality is: data that are "'fit for use' in their intended operational, decision-making and other roles" or that exhibits "'conformance to standards' that have been set, so that fitness for use is achieved" data that "are fit for their intended uses in operations, decision making and planning" "the capability of data to satisfy the stated business, system, and technical requirements of an enterprise" From a standards-based perspective, data quality is: the "degree to which a set of inherent characteristics (quality dimensions) of an object (data) fulfills requirements" "the usefulness, accuracy, and correctness of data for its application" Arguably, in all these cases, "data quality" is a comparison of the actual state of a particular set of data to a desired state, with the desired state being typically referred to as "fit for use," "to specification," "meeting consumer expectations," "free of defect," or "meeting requirements." These expectations, specifications, and requirements are usually defined by one or more individuals or groups, standards organizations, laws and regulations, business policies, or software development policies. == Dimensions of data quality == Drilling down further, those expectations, specifications, and requirements are stated in terms of characteristics or dimensions of the data, such as: accessibility or availability accuracy or correctness comparability completeness or comprehensiveness consistency, coherence, or clarity credibility, reliability, or reputation flexibility plausibility relevance, pertinence, or usefulness timeliness or latency uniqueness validity or reasonableness A systematic scoping review of the literature suggests that data quality dimensions and methods with real world data are not consistent in the literature, and as a result quality assessments are challenging due to the complex and heterogeneous nature of these data. == International standards for data quality == ISO 8000 is an international standard for data quality. Managed by the International Organization for Standardization, the ISO 8000 standards address and describe general aspects of data quality including principles, vocabulary and measurement data governance data quality management data quality assessment quality of master data, including exchange of characteristic data and identifiers quality of industrial data == History == Before the rise of the inexpensive computer data storage, massive mainframe computers were used to maintain name and address data for delivery services. This was so that mail could be properly routed to its destination. The mainframes used business rules to correct common misspellings and typographical errors in name and address data, as well as to track customers who had moved, died, gone to prison, married, divorced, or experienced other life-changing events. Government agencies began to make postal data available to a few service companies to cross-reference customer data with the National Change of Address registry (NCOA). This technology saved large companies millions of dollars in comparison to manual correction of customer data. Large companies saved on postage, as bills and direct marketing materials made their way to the intended customer more accurately. Initially sold as a service, data quality moved inside the walls of corporations, as low-cost and powerful server technology became available. Companies with an emphasis on marketing often focused their quality efforts on name and address information, but data quality is recognized as an important property of all types of data. Principles of data quality can be applied to supply chain data, transactional data, and nearly every other category of data found. For example, making supply chain data conform to a certain standard has value to an organization by: 1) avoiding overstocking of similar but slightly different stock; 2) avoiding false stock-out; 3) improving the understanding of vendor purchases to negotiate volume discounts; and 4) avoiding logistics costs in stocking and shipping parts across a large organization. For companies with significant research efforts, data quality can include developing protocols for research methods, reducing measurement error, bounds checking of data, cross tabulation, modeling and outlier detection, verifying data integrity, etc. == Overview == There are a number of theoretical frameworks for understanding data quality. A systems-theoretical approach influenced by American pragmatism expands the definition of data quality to include information quality, and emphasizes the inclusiveness of the fundamental dimensions of accuracy and precision on the basis of the theory of science (Ivanov, 1972). One framework, dubbed "Zero Defect Data" (Hansen, 1991) adapts the principles of statistical process control to data quality. Another framework seeks to integrate the product perspective (conformance to specifications) and the service perspective (meeting consumers' expectations) (Kahn et al. 2002). Another framework is based in semiotics to evaluate the quality of the form, meaning and use of the data (Price and Shanks, 2004). One highly theoretical approach analyzes the ontological nature of information systems to define data quality rigorously (Wand and Wang, 1996). A considerable amount of data quality research involves investigating and describing various categories of desirable attributes (or dimensions) of data. Nearly 200 such terms have been identified and there is little agreement in their nature (are these concepts, goals or criteria?), their definitions or measures (Wang et al., 1993). Software engineers may recognize this as a similar problem to "ilities". MIT has an Information Quality (MITIQ) Program, led by Professor Richard Wang, which produces a large number of publications and hosts a significant international conference in this field (International Conference on Information Quality, ICIQ). This program grew out of the work done by Hansen on the "Zero Defect Data" framework (Hansen, 1991). In practice, data quality is a concern for professionals involved with a wide range of information systems, ranging from data warehousing and business intelligence to customer relationship management and supply chain management. One industry study estimated the total cost to the U.S. economy of data quality problems at over U.S. $600 billion per annum (Eckerson, 2002). Incorrect data – which includes invalid and outdated information – can originate from different data sources – through data entry, or data migration and conversion projects. In 2002, the USPS and PricewaterhouseCoopers released a report stating that 23.6 percent of all U.S. mail sent is incorrectly addressed. One reason contact data becomes stale very quickly in the average database – more than 45 million Americans change their address every year. In fact, the problem is such a concern that companies are beginning to set up a data governance team whose sole role in the corporation is to be responsible for data quality. In some organizations, this data governance function has been established as part of a larger Regulatory Compliance function - a recognition of the importance of Data/Information Quality to organizations. Problems with data quality don't only arise from incorrect data; inconsistent data is a problem as well. Eliminating data shadow systems and centralizing data in a warehouse is one of the initiatives a company can take to ensure data consistency. En

    Read more →
  • Comparison of video editing software

    Comparison of video editing software

    This is a comparison of non-linear video editing software applications. See also a more complete list of video editing software. == General information == This table gives basic general information about the different editors: === Active === === Discontinued / Inactive === ==== Definition ==== professional: used for full length Hollywood movies; professional (small): mainly used for paid commercials, short films or podcasts/YouTube channels; prosumer: Mainly targeting private use, anything that can do more than just trimming a film; basic: trimming a film; == System requirements == This table lists the operating systems that different editors can run on without emulation, as well as other system requirements. Note that minimum system requirements are listed; some features (like High Definition support) may be unavailable with these specifications. "Unix" includes the similar Linux, BSD and Unix-like operating systems. == High definition/High resolution import == The table below indicates the ability of each program to import various High Definition video or High resolution video formats for editing. == Feature set == == Output options == Please note that recording to Blu-ray does not imply 1080@50p/60p . Most only support up to 1080i 25/30 frames per second recording. Also not all formats can be output.

    Read more →
  • Retention period

    Retention period

    A retention period (associated with a retention schedule or retention program) is an aspect of records and information management (RIM) and the records life cycle that identifies the duration of time for which the information should be maintained or "retained", irrespective of format (paper, electronic, or other). Retention periods vary with different types of information, based on content and a variety of other factors, including internal organizational need, regulatory requirements for inspection or audit, legal statutes of limitation, involvement in litigation, and taxation and financial reporting needs, as well as other factors as defined by local, regional, state, national, and/or international governing entities. Once an applicable retention period has elapsed for a given type or series of information, and all holds/moratoriums have been released, the information is typically destroyed using an approved and effective destruction method, which renders the information completely and irreversibly unusable via any means. Alternatively, it may be converted from one form to another (e.g. from paper to electronic), depending on the defined retention period per format. Information with historical value beyond its "usable value" may be accessioned to the custody of an archive organization for permanent or extended long-term preservation. == Defensible disposition == Defensible disposition refers to the ability of an identified and applied retention period to effectively provide for the defense of the record, and its eventual destruction or accessioning when scrutinized within a court of law or by other review. It is commonly advised by records and information management (RIM) professionals that any and all retention periods applied to organizational information should be reviewed and approved for use by competent legal counsel, which represents the organization, and is familiar with the specific business needs and legal and regulatory requirements of the organization. Additionally, a practical approach to information assessment/classification, proper documentation of the disposition program, strategic review of disposition policy over time for efficacy are required for proper defensible disposition. == Guidance and education organizations == ARMA International Information and Records Management Society filerskeepers records retention FAQ

    Read more →
  • Algorithmic mechanism design

    Algorithmic mechanism design

    Algorithmic mechanism design (AMD) lies at the intersection of economic game theory, optimization, and computer science. The prototypical problem in mechanism design is to design a system for multiple self-interested participants, such that the participants' self-interested actions at equilibrium lead to good system performance. Typical objectives studied include revenue maximization and social welfare maximization. Algorithmic mechanism design differs from classical economic mechanism design in several respects. It typically employs the analytic tools of theoretical computer science, such as worst case analysis and approximation ratios, in contrast to classical mechanism design in economics which often makes distributional assumptions about the agents. It also considers computational constraints to be of central importance: mechanisms that cannot be efficiently implemented in polynomial time are not considered to be viable solutions to a mechanism design problem. This often, for example, rules out the classic economic mechanism, the Vickrey–Clarke–Groves auction. == History == Noam Nisan and Amir Ronen first coined "Algorithmic mechanism design" in a research paper published in 1999.

    Read more →
  • Data (word)

    Data (word)

    The word data is most often used as a singular collective mass noun in educated everyday usage. However, due to the history and etymology of the word, considerable controversy has existed on whether it should be considered a mass noun used with verbs conjugated in the singular, or should be treated as the plural of the now-rarely-used datum. == Usage in English == In one sense, data is the plural form of datum. Datum actually can also be a count noun with the plural datums (see usage in datum article) that can be used with cardinal numbers (e.g., "80 datums"); data (originally a Latin plural) is not used like a normal count noun with cardinal numbers and can be plural with plural determiners such as these and many, or it can be used as a mass noun with a verb in the singular form. Even when a very small quantity of data is referenced (one number, for example), the phrase piece of data is often used, as opposed to datum. The debate over appropriate usage continues, but "data" as a singular form is far more common. In English, the word datum is still used in the general sense of "an item given". In cartography, geography, nuclear magnetic resonance and technical drawing, it is often used to refer to a single specific reference datum from which distances to all other data are measured. Any measurement or result is a datum, though data point is now far more common. Data is indeed most often used as a singular mass noun in educated everyday usage. Some major newspapers, such as The New York Times, use it either in the singular or plural. In The New York Times, the phrases "the survey data are still being analyzed" and "the first year for which data is available" have appeared within one day. The Wall Street Journal explicitly allows this usage in its style guide. The Associated Press style guide classifies data as a collective noun that takes the singular when treated as a unit but the plural when referring to individual items (e.g., "The data is sound" and "The data have been carefully collected"). In scientific writing, data is often treated as a plural, as in These data do not support the conclusions, but the word is also used as a singular mass entity like information (e.g., in computing and related disciplines). British usage now widely accepts treating data as singular in standard English, including everyday newspaper usage at least in non-scientific use. UK scientific publishing still prefers treating it as a plural. Some UK university style guides recommend using data for both singular and plural use, and others recommend treating it only as a singular in connection with computers. The IEEE Computer Society allows usage of data as either a mass noun or plural based on author preference, while IEEE in the editorial style manual indicates to always use the plural form. Some professional organizations and style guides require that authors treat data as a plural noun. For example, the Air Force Flight Test Center once stated that the word data is always plural, never singular.

    Read more →
  • UpScrolled

    UpScrolled

    UpScrolled is an Australian social media platform for microblogging and short-form online video sharing that was launched in June 2025 by Recursive Methods Pty Ltd. It was founded by Issam Hijazi. == History == UpScrolled was launched in June 2025 by Recursive Methods Pty Ltd. It was founded by Issam Hijazi, a Palestinian-Australian app developer. UpScrolled is backed by the Tech for Palestine incubator. In January 2026, UpScrolled saw increased attention and number of downloads after the acquisition of TikTok by a group of pro-Donald Trump US investors, including Larry Ellison, which led to calls to boycott TikTok and migrate to other apps. TikTok was alleged to be suppressing pro-Palestinian content, as well as news surrounding the killing of Alex Pretti in Minneapolis on the platform. UpScrolled subsequently climbed to the top 10 of Apple's App Store list of free apps. The app saw a reported 2,850% increase in downloads between 22 and 24 January 2026. As of 27 January 2026, UpScrolled "had been downloaded about 400,000 times in the US and 700,000 globally since launching in June 2025". The app became the most downloaded app in the Apple App store on 29 January 2026, following allegations that TikTok was suppressing videos and content opposed to Immigration and Customs Enforcement (ICE) under its new ownership. By 2 February 2026, UpScrolled had reached 2.5 million users. According to the Google Play Store and the Apple App Store, it has become the most downloaded social media app in the United States and Canada, with rising interest in the United Kingdom, France, Germany and Italy. On 14 February, UpScrolled was suspended from the Google Play Store; the suspension was reverted by 15 February. == Founder == Hijazi was born in Jordan. His parents and grandparents are from Safad, a northern Israeli city near the Lebanese border. He worked for IBM and Oracle prior to starting UpScrolled. Hijazi told Rest of World that he launched UpScrolled in response to Israel's genocide in Gaza which followed the October 7 attacks. He said, "I couldn't take it anymore. I lost family members in Gaza, and I didn't want to be complicit. So I was like, I'm done with this, I want to feel useful. I found this gap in the market, with a lot of people asking why there is no alternative to the Big Tech platforms for their content, which was getting censored." Hijazi also alleges that social media accounts that were posting pro-Palestinian content were getting shadow banned on larger platforms, and alleges that even his account was not exempt from being targeted by censors. Hijazi has further elaborated on the importance of social media independence to further the Palestinian cause. In January 2026, Web Summit Qatar announced that Hijazi would be an opening night speaker. Following the announcement, there was a surge in ticket sales for the summit. Hijazi lives in Sydney with his wife and daughter. He lost 60 family members during the Gaza war. == Features == UpScrolled's algorithm allows users to discover posts based on likes, comments, and shares with time decay and some randomness, all chronologically, with "no manipulation" according to the app's website. UpScrolled has an interface resembling a mix of Instagram and Twitter, allowing users to post and view text posts, photos, and videos. It also lets users send private messages to each other. The app is currently available for iOS and Android devices, with plans to upscale. UpScrolled does not include Israel as an option in its location selection menu. Cities such as Tel Aviv are included under "Occupied Territories of Palestine", and Palestine can also be set as the location. UpScrolled says that it is against censorship and shadow banning, and describes itself as "belong[ing] to the people who use it — not to hidden algorithms or outside agendas". Hijazi said, "The other platforms claim to be free speech platforms. But when it comes to anything on Palestine, that's a different story." UpScrolled states that it "does not tolerate hate speech, propaganda, or bad-faith behaviour, but it also refuses to silence voices quietly or without explanation". == User base and content == Al Jazeera reported that posts expressing pro-Palestinian sentiment or depicting the continued suffering in the Gaza Strip were "flooding" the app. Political and global issues such as the Gaza war are prominent. Content includes updates from the Gaza Freedom Flotilla, posts by doctors working in Gaza, video essays about Palantir’s influence within the military and calls for boycotts of Israel. It has been used by Gazans to crowdfund and record daily life. Celebrity users of UpScrolled include American labour activist Chris Smalls and actor Jacob Berger, both of whom were on the July 2025 Gaza Freedom Flotilla. Political figures have also joined UpScrolled, such as South African politician and Economic Freedom Fighters leader Julius Malema, and Islamic Revolutionary Guard Corps commander Esmail Qaani. One user said that most early users were attracted to the platform for the opportunity to criticize Zionism. The Jewish Telegraphic Agency (JTA) reported that UpScrolled was observed to be "flooded" with antisemitic and anti-Israel content, including Holocaust denial and accusations that Israel carried out the 9/11 attacks. In a statement, UpScrolled said, "Our content moderation hasn't been able to keep up with the massive rise of users this week. We're working with digital rights experts to grow our Trust & Safety team and are beefing up our content moderation to prevent this. We apologise to all impacted users, thank you for being part of Upscrolled." The Times reported in February 2026 that UpScrolled was hosting content that could potentially breach UK law, including antisemitic content and posts promoting Hamas, Hezbollah, Islamic State and Al-Qaeda, as well as footage of the 2019 Christchurch mosque shootings and content praising the perpetrators of the 2019 Halle synagogue shooting and 2018 Pittsburgh synagogue shooting. Antisemitic influencers Lucas Gage, Jake Shields, Stew Peters and Anastasia Maria Loupis have accounts on UpScrolled. UpScrolled’s policies prohibit threats, glorification of harm or support for terrorist or violent groups. Hijazi said harmful content was being uploaded to UpScrolled and the company had expanded its content moderation team and upgraded its technology infrastructure to deal with the issue. In May 2026, Moment magazine said that users had identified some antisemitic content, pornography and extremist videos on the platform. The magazine said there were gaps in content moderation due to the small size of the developer team. == Reception == In January 2026, the Council on American–Islamic Relations (CAIR) praised UpScrolled for "pledging to protect the free flow of ideas on its platform, including both support for and opposition to the Israeli government's human rights abuses." Guy Christensen, a pro-Palestinian social media celebrity, has encouraged his audience to download UpScrolled. Christensen characterized UpScrolled as having "no censorship, no ownership by billionaires who put their interests and biases onto you to control you". He compared the platform to others like TikTok, saying that Israel is behind censorship that wouldn't happen on UpScrolled. Jaigris Hodson, an associate professor of Interdisciplinary Studies at Royal Roads University in Canada, has argued that "Network effects mean that unless UpScrolled continues its explosive growth, people are unlikely to continue to choose it over the more established TikTok. At best, we might see a Twitter/X effect, which is where TikTok will host more pro-U.S. government content creators and those people who want to follow them, and UpScrolled will host more critical content creators and their followers."

    Read more →
  • AIVA

    AIVA

    AIVA (Artificial Intelligence Virtual Artist) is an electronic composer recognized by the SACEM. == Description == Created in February 2016, AIVA specializes in classical and symphonic music composition. It became the world's first virtual composer to be recognized by a music society (SACEM). By reading a large collection of existing works of classical music (written by human composers such as Bach, Beethoven, Mozart) AIVA is capable of detecting regularities in music and on this base composing on its own. The algorithm AIVA is based on deep learning and reinforcement learning architectures. Since January 2019, the company offers a commercial product, Music Engine, capable of generating short (up to 3 minutes) compositions in various styles (rock, pop, jazz, fantasy, shanty, tango, 20th century cinematic, modern cinematic, and Chinese). AIVA was presented at TED by Pierre Barreau. == Discography == AIVA is a published composer; its first studio album "Genesis" was released in November 2016. Second album "Among the Stars" in 2018. 2016 CD album « Genesis » Hv-Com – LEPM 048427. Track listing "Genesis": 2018 CD album « Among the Stars » Hv-Com – LEPM 048708 Avignon Symphonic Orchestra [ORAP] also performed Aiva's compositions [2] in April 2017.

    Read more →
  • Algorithmic transparency

    Algorithmic transparency

    Algorithmic transparency is the principle that the factors that influence the decisions made by algorithms should be visible, or transparent, to the people who use, regulate, and are affected by systems that employ those algorithms. Although the phrase was coined in 2016 by Nicholas Diakopoulos and Michael Koliska about the role of algorithms in deciding the content of digital journalism services, the underlying principle dates back to the 1970s and the rise of automated systems for scoring consumer credit. The phrases "algorithmic transparency" and "algorithmic accountability" are sometimes used interchangeably – especially since they were coined by the same people – but they have subtly different meanings. Specifically, "algorithmic transparency" states that the inputs to the algorithm and the algorithm's use itself must be known, but they need not be fair. "Algorithmic accountability" implies that the organizations that use algorithms must be accountable for the decisions made by those algorithms, even though the decisions are being made by a machine, and not by a human being. Current research around algorithmic transparency interested in both societal effects of accessing remote services running algorithms, as well as mathematical and computer science approaches that can be used to achieve algorithmic transparency. In the United States, the Federal Trade Commission's Bureau of Consumer Protection studies how algorithms are used by consumers by conducting its own research on algorithmic transparency and by funding external research. In the European Union, the data protection laws that came into effect in May 2018 include a "right to explanation" of decisions made by algorithms, though it is unclear what this means. Furthermore, the European Union founded The European Center for Algorithmic Transparency (ECAT).

    Read more →