AI Coding Meta

AI Coding Meta — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Snapshot isolation

    Snapshot isolation

    In databases, and transaction processing (transaction management), snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot. Snapshot isolation has been adopted by several major database management systems, such as InterBase, Firebird, Oracle, MySQL, PostgreSQL, SQL Anywhere, MongoDB and Microsoft SQL Server (2005 and later). The main reason for its adoption is that it allows better performance than serializability, yet still avoids most of the concurrency anomalies that serializability avoids (but not all). In practice snapshot isolation is implemented within multiversion concurrency control (MVCC), where generational values of each data item (versions) are maintained: MVCC is a common way to increase concurrency and performance by generating a new version of a database object each time the object is written, and allowing transactions' read operations of several last relevant versions (of each object). Snapshot isolation has been used to criticize the ANSI SQL-92 standard's definition of isolation levels, as it exhibits none of the "anomalies" that the SQL standard prohibited, yet is not serializable (the anomaly-free isolation level defined by ANSI). In spite of its distinction from serializability, snapshot isolation is sometimes referred to as serializable by Oracle. == Definition == A transaction executing under snapshot isolation appears to operate on a personal snapshot of the database, taken at the start of the transaction. When the transaction concludes, it will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was taken. Such a write–write conflict will cause the transaction to abort. 
In a write skew anomaly, two transactions (T1 and T2) concurrently read an overlapping data set (e.g. values V1 and V2), concurrently make disjoint updates (e.g. T1 updates V1, T2 updates V2), and finally concurrently commit, neither having seen the update performed by the other. Were the system serializable, such an anomaly would be impossible, as either T1 or T2 would have to occur "first", and be visible to the other. In contrast, snapshot isolation permits write skew anomalies. As a concrete example, imagine V1 and V2 are two balances held by a single person, Phil. The bank will allow either V1 or V2 to run a deficit, provided the total held in both is never negative (i.e. V1 + V2 ≥ 0). Both balances are currently $100. Phil initiates two transactions concurrently, T1 withdrawing $200 from V1, and T2 withdrawing $200 from V2. If the database guaranteed serializable transactions, the simplest way of coding T1 is to deduct $200 from V1, and then verify that V1 + V2 ≥ 0 still holds, aborting if not. T2 similarly deducts $200 from V2 and then verifies V1 + V2 ≥ 0. Since the transactions must serialize, either T1 happens first, leaving V1 = −$100, V2 = $100, and preventing T2 from succeeding (since V1 + (V2 − $200) is now −$200), or T2 happens first and similarly prevents T1 from committing. If the database is under snapshot isolation(MVCC), however, T1 and T2 operate on private snapshots of the database: each deducts $200 from an account, and then verifies that the new total is zero, using the other account value that held when the snapshot was taken. Since neither update conflicts, both commit successfully, leaving V1 = V2 = −$100, and V1 + V2 = −$200. Some systems built using multiversion concurrency control (MVCC) may support (only) snapshot isolation to allow transactions to proceed without worrying about concurrent operations, and more importantly without needing to re-verify all read operations when the transaction finally commits. 
This is convenient because MVCC maintains a series of recent, consistent historical states. The only information that must be stored during the transaction is a list of updates made, which can be scanned for conflicts fairly easily before being committed. However, some MVCC systems (such as MarkLogic) use locks to serialize writes together with MVCC to obtain some of the performance gains and still support the stronger "serializability" level of isolation. == Workarounds == Potential inconsistency problems arising from write skew anomalies can be fixed by adding (otherwise unnecessary) updates to the transactions in order to enforce the serializability property. There are two common techniques. Materialize the conflict: add a special conflict table, which both transactions update in order to create a direct write–write conflict. Promotion: have one transaction "update" a read-only location (replacing a value with the same value) in order to create a direct write–write conflict (or use an equivalent promotion, e.g. Oracle's SELECT FOR UPDATE). In the example above, we can materialize the conflict by adding a new table which makes the hidden constraint explicit, mapping each person to their total balance. Phil would start off with a total balance of $200, and each transaction would attempt to subtract $200 from this, creating a write–write conflict that would prevent the two from succeeding concurrently. However, this approach violates database normalization, because the stored total duplicates information that can be derived from the individual balances. Alternatively, we can promote one of the transaction's reads to a write. For instance, T2 could set V1 = V1, creating an artificial write–write conflict with T1 and, again, preventing the two from succeeding concurrently. This solution may not always be possible. In general, therefore, snapshot isolation puts some of the problem of maintaining non-trivial constraints onto the user, who may not appreciate either the potential pitfalls or the possible solutions. The upside to this transfer is better performance. 
== Terminology == Snapshot isolation is called "serializable" mode in Oracle and PostgreSQL versions prior to 9.1, which may cause confusion with the "real serializability" mode. There are arguments both for and against this decision; what is clear is that users must be aware of the distinction to avoid possible undesired anomalous behavior in their database system logic. == History == Snapshot isolation arose from work on multiversion concurrency control databases, where multiple versions of the database are maintained concurrently to allow readers to execute without colliding with writers. Such a system allows a natural definition and implementation of such an isolation level. InterBase, later owned by Borland, was acknowledged to provide SI rather than full serializability in version 4, and likely permitted write-skew anomalies since its first release in 1985. Unfortunately, the ANSI SQL-92 standard was written with a lock-based database in mind, and hence is rather vague when applied to MVCC systems. Berenson et al. wrote a paper in 1995 critiquing the SQL standard, and cited snapshot isolation as an example of an isolation level that did not exhibit the standard anomalies described in the ANSI SQL-92 standard, yet still had anomalous behaviour when compared with serializable transactions. In 2008, Cahill et al. showed that write-skew anomalies could be prevented by detecting and aborting "dangerous" triplets of concurrent transactions. This implementation of serializability is well-suited to multiversion concurrency control databases, and has been adopted in PostgreSQL 9.1, where it is known as Serializable Snapshot Isolation (SSI). When used consistently, this eliminates the need for the above workarounds. The downside over snapshot isolation is an increase in aborted transactions. This can perform better or worse than snapshot isolation with the above workarounds, depending on workload.
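The write skew example with Phil's two balances, and the promotion workaround, can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Store`, `Txn`, `withdraw` and `run` names are hypothetical, the store copies the whole database as its snapshot, and conflicts are detected with a simple "first committer wins" check on written keys. It is not how any particular database implements snapshot isolation.

```python
class WriteConflict(Exception):
    """Raised at commit time when a written key was changed since the snapshot."""


class Store:
    """Toy key-value store (hypothetical, for illustration only)."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # commit number of last write per key
        self.clock = 0                           # number of commits so far

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.start = store.clock          # commit number when the snapshot was taken
        self.snapshot = dict(store.data)  # private snapshot of committed values
        self.writes = {}                  # list of updates made by this transaction

    def read(self, key):
        # A transaction sees its own writes, otherwise its snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Write-write conflict check: abort if any key we wrote has been
        # committed by another transaction since our snapshot.
        for key in self.writes:
            if self.store.version[key] > self.start:
                raise WriteConflict(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.clock


def withdraw(txn, account, amount, promote=()):
    """Deduct, optionally promote reads to writes, then check V1 + V2 >= 0."""
    txn.write(account, txn.read(account) - amount)
    for key in promote:
        txn.write(key, txn.read(key))  # "update" the value with itself
    if txn.read("V1") + txn.read("V2") < 0:
        raise ValueError("constraint V1 + V2 >= 0 violated")


def run(promote=()):
    store = Store({"V1": 100, "V2": 100})
    t1, t2 = store.begin(), store.begin()  # concurrent: same snapshot
    withdraw(t1, "V1", 200)
    withdraw(t2, "V2", 200, promote=promote)
    t1.commit()
    t2.commit()
    return store.data
```

Calling `run()` lets both transactions commit and leaves both balances at −100, which is the write skew. Calling `run(promote=("V1",))` makes T2 also write V1, so its commit raises `WriteConflict` after T1 has committed.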

  • Ross Quinlan

    Ross Quinlan

    John Ross Quinlan is a computer science researcher in data mining and decision theory. He has contributed extensively to the development of decision tree algorithms, including inventing the canonical C4.5 and ID3 algorithms. He also contributed to early ILP literature with First Order Inductive Learner (FOIL). He is currently running the company RuleQuest Research which he founded in 1997. == Education == He received his BSc degree in Physics and Computing from the University of Sydney in 1965 and his computer science doctorate at the University of Washington in 1968. He has held positions at the University of New South Wales, University of Sydney, University of Technology Sydney, and RAND Corporation. == Artificial intelligence == Quinlan is a specialist in artificial intelligence, particularly in the aspect involving machine learning and its application to data mining. He is a Founding Fellow of the Association for the Advancement of Artificial Intelligence. === ID3 === Ross Quinlan invented the Iterative Dichotomiser 3 (ID3) algorithm which is used to generate decision trees. ID3 follows the principle of Occam's razor in attempting to create the smallest decision tree possible. === C4.5 === He then expanded upon the principles used in ID3 to create C4.5. C4.5 improved: discrete and continuous attributes, missing attribute values, attributes with differing costs, pruning trees (replacing irrelevant branches with leaf nodes). === C5.0 === C5.0, which Quinlan is commercially selling (single-threaded version is distributed under the terms of the GNU General Public License), is an improvement on C4.5. The advantages are speed (several orders of magnitude faster), memory efficiency, smaller decision trees, boosting (more accuracy), ability to weight different attributes, and winnowing (reducing noise). == Selected works == === Books === 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers. ISBN 1-55860-238-0. === Articles === Quinlan, J. R. 
(1982). Semi-autonomous acquisition of pattern-based knowledge. In J. E. Hayes, D. Michie, and Y.-H. Pao (Eds.), Machine intelligence 10. Ellis Horwood, Chichester. Quinlan, J. R. (1985). Decision trees and multi-valued attributes. In J. E. Hayes & D. Michie (Eds.), Machine intelligence 11. Oxford University Press. Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1):81-106. 2008. (with Qiang Yang, Philip S. Yu, Zhou Zhihua, and David Hand et al.). Top 10 algorithms in data mining. Knowledge and Information Systems 14.1: 1-37. Quinlan, J. R. (1990). Learning logical definitions from relations. Machine Learning, 5:239-266.

  • AI Voice Assistants: Free vs Paid (2026)

    AI Voice Assistants: Free vs Paid (2026)

    Shopping for the best AI voice assistant? An AI voice assistant is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI voice assistant slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • Nathalie Japkowicz

    Nathalie Japkowicz

    Nathalie Japkowicz is a Canadian computer scientist specializing in machine learning. She is a professor and department chair of computer science at the American University College of Arts and Sciences. == Life == Nathalie Japkowicz completed a B.Sc. at McGill University in 1988. She earned an M.Sc. from the University of Toronto in 1990. She completed a Ph.D. at Rutgers University in 1999. Her dissertation was titled Concept-learning in the absence of counter-examples: an autoassociation-based approach to classification. Stephen José Hanson and Casimir Alexander Kulikowski were her doctoral advisors. Japkowicz worked at the University of Ottawa in the School of Electrical Engineering and Computer Science. She was the lead of its laboratory for research on machine learning for defense security. From 2003 to 2005, Japkowicz was the secretary of the Canadian Artificial Intelligence Association (CAIAC). She was CAIAC vice president from 2009 to 2014, president from 2013 to 2015, and past president from 2015 to 2017. Japkowicz is a professor and department chair of computer science at the American University College of Arts and Sciences. She researches artificial intelligence, machine learning, data mining, and big data analysis. == Selected works == Gao, Yong; Japkowicz, Nathalie, eds. (2009). Advances in Artificial Intelligence: 22nd Canadian Conference on Artificial Intelligence, Canadian AI 2009 Kelowna, Canada, May 25–27, 2009 Proceedings. Lecture Notes in Computer Science. Vol. 5549. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-01818-3. ISBN 978-3-642-01817-6. S2CID 27083226. Japkowicz, Nathalie; Shah, Mohak (2011). Evaluating Learning Algorithms: A Classification Perspective (1 ed.). Cambridge University Press. doi:10.1017/cbo9780511921803. ISBN 978-0-511-92180-3. Japkowicz, Nathalie; Matwin, Stan, eds. (2015). Discovery Science: 18th International Conference, DS 2015, Banff, AB, Canada, October 4–6, 2015. Proceedings. 
Lecture Notes in Computer Science. Vol. 9356. Cham: Springer International Publishing. doi:10.1007/978-3-319-24282-8. ISBN 978-3-319-24281-1. S2CID 1302223. Japkowicz, Nathalie; Stefanowski, Jerzy, eds. (2016). Big Data Analysis: New Algorithms for a New Society. Studies in Big Data. Vol. 16. Cham: Springer International Publishing. doi:10.1007/978-3-319-26989-4. ISBN 978-3-319-26987-0. Ceci, Michelangelo; Japkowicz, Nathalie; Liu, Jiming; Papadopoulos, George A.; Raś, Zbigniew W., eds. (2018). Foundations of Intelligent Systems: 24th International Symposium, ISMIS 2018, Limassol, Cyprus, October 29–31, 2018, Proceedings. Lecture Notes in Computer Science. Vol. 11177. Cham: Springer International Publishing. doi:10.1007/978-3-030-01851-1. ISBN 978-3-030-01850-4. S2CID 53038780.

  • Contrastive Language-Image Pre-training

    Contrastive Language-Image Pre-training

    Contrastive Language-Image Pre-training (CLIP) is a technique for training a pair of neural network models, one for image understanding and one for text understanding, using a contrastive objective. This method has enabled broad applications across multiple domains, including cross-modal retrieval, text-to-image generation, and aesthetic ranking. == Algorithm == The CLIP method trains a pair of models contrastively. One model takes in a piece of text as input and outputs a single vector representing its semantic content. The other model takes in an image and similarly outputs a single vector representing its visual content. The models are trained so that the vectors corresponding to semantically similar text-image pairs are close together in the shared vector space, while those corresponding to dissimilar pairs are far apart. To train a pair of CLIP models, one would start by preparing a large dataset of image-caption pairs. During training, the models are presented with batches of $N$ image-caption pairs. Let the outputs from the text and image models be respectively $v_1, \dots, v_N$ and $w_1, \dots, w_N$. Two vectors are considered "similar" if their dot product is large. The loss incurred on this batch is the multi-class N-pair loss, which is a symmetric cross-entropy loss over similarity scores:
    $$-\frac{1}{N}\sum_{i}\ln\frac{e^{v_i\cdot w_i/T}}{\sum_{j}e^{v_i\cdot w_j/T}}-\frac{1}{N}\sum_{j}\ln\frac{e^{v_j\cdot w_j/T}}{\sum_{i}e^{v_i\cdot w_j/T}}$$
    In essence, this loss function encourages the dot product between matching image and text vectors ($v_i\cdot w_i$) to be high, while discouraging high dot products between non-matching pairs. The parameter $T>0$ is the temperature, which is parameterized in the original CLIP model as $T=e^{-\tau}$, where $\tau\in\mathbb{R}$ is a learned parameter. Other loss functions are possible. For example, Sigmoid CLIP (SigLIP) proposes the following loss function:
    $$L=\frac{1}{N}\sum_{i,j\in 1:N}f\big((2\delta_{i,j}-1)(e^{\tau}\,w_i\cdot v_j+b)\big)$$
    where $f(x)=\ln(1+e^{-x})$ is the negative log sigmoid loss, and the Kronecker delta $\delta_{i,j}$ is 1 if $i=j$ and 0 otherwise. == CLIP models == While the original model was developed by OpenAI, subsequent models have been trained by other organizations as well. === Image model === The image encoding models used in CLIP are typically vision transformers (ViT). The naming convention for these models often reflects the specific ViT architecture used. For instance, "ViT-L/14" means a "vision transformer large" (compared to other models in the same series) with a patch size of 14, meaning that the image is divided into 14-by-14 pixel patches before being processed by the transformer. The size indicators are, in increasing order, B, L, H, and G (base, large, huge, giant). Other than ViT, the image model is typically a convolutional neural network, such as ResNet (in the original series by OpenAI), or ConvNeXt (in the OpenCLIP model series by LAION). Since the output vectors of the image model and the text model must have exactly the same length, both models have fixed-length vector outputs, whose length the original report calls the "embedding dimension". For example, in the original OpenAI model, the ResNet models have embedding dimensions ranging from 512 to 1024, and for the ViTs, from 512 to 768. 
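The symmetric contrastive loss from the Algorithm section can be sketched in plain Python. This is a minimal illustration rather than the OpenAI implementation: the function names `dot` and `clip_loss` are hypothetical, vectors are plain lists of floats, and the two cross-entropy terms are summed exactly as in the formula in the text (some implementations average them instead).

```python
import math


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def clip_loss(v, w, temperature):
    """Symmetric cross-entropy over similarity scores for a batch of N pairs.

    v[i] and w[i] are the text and image vectors of the i-th matching pair.
    """
    n = len(v)
    # logits[i][j] = v_i . w_j / T
    logits = [[dot(v[i], w[j]) / temperature for j in range(n)] for i in range(n)]

    def mean_log_prob_of_match(rows):
        # Mean over i of ln( e^{rows[i][i]} / sum_j e^{rows[i][j]} ),
        # computed with the usual max-subtraction for numerical stability.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += row[i] - log_z
        return total / n

    columns = [list(col) for col in zip(*logits)]  # transpose: columns[j][i] = logits[i][j]
    # First term normalizes over j for each i, second term over i for each j.
    return -mean_log_prob_of_match(logits) - mean_log_prob_of_match(columns)
```

With two orthogonal unit vectors, the loss is lower when each text vector is paired with its own image vector than when the image vectors are swapped, which is the behaviour the loss is meant to reward.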
OpenAI's implementation of ViT was the same as the original one, with one modification: after position embeddings are added to the initial patch embeddings, there is a LayerNorm. Its implementation of ResNet was the same as the original one, with three modifications: (1) at the start of the CNN (the "stem"), three stacked 3x3 convolutions are used instead of a single 7x7 convolution; (2) there is an average pooling of stride 2 at the start of each downsampling convolutional layer (which they called rect-2 blur pooling), which has the effect of blurring images before downsampling, for antialiasing; (3) the final convolutional layer is followed by a multiheaded attention pooling. ALIGN, a model with similar capabilities trained by researchers from Google, used EfficientNet, a kind of convolutional neural network. === Text model === The text encoding models used in CLIP are typically Transformers. The original OpenAI report describes a Transformer (63M-parameter, 12-layer, 512-wide, 8 attention heads) with lower-cased byte pair encoding (BPE) with 49152 vocabulary size. Context length was capped at 76 for efficiency. Like GPT, it was decoder-only, with only causally-masked self-attention. Its architecture is the same as GPT-2. Like BERT, the text sequence is bracketed by two special tokens [SOS] and [EOS] ("start of sequence" and "end of sequence"). The text encoding of the input sequence is obtained by taking the activations of the highest layer of the transformer at the [EOS] token, applying LayerNorm, and then applying a final linear map. The final linear map has output dimension equal to the embedding dimension of whatever image encoder it is paired with. These models all had context length 77 and vocabulary size 49408. ALIGN used BERT of various sizes. 
== Dataset == === WebImageText === The CLIP models released by OpenAI were trained on a dataset called "WebImageText" (WIT) containing 400 million pairs of images and their corresponding captions scraped from the internet. The total number of words in this dataset is similar in scale to the WebText dataset used for training GPT-2, which contains about 40 gigabytes of text data. The dataset contains 500,000 text-queries, with up to 20,000 (image, text) pairs per query. The text-queries were generated by starting with all words occurring at least 100 times in English Wikipedia, then extended by bigrams with high mutual information, names of all Wikipedia articles above a certain search volume, and WordNet synsets. The dataset is private and has not been released to the public, and there is no further information on it. ==== Data preprocessing ==== For the CLIP image models, the input images are preprocessed by first dividing each of the R, G, B values of an image by the maximum possible value, so that these values fall between 0 and 1, then subtracting by [0.48145466, 0.4578275, 0.40821073], and dividing by [0.26862954, 0.26130258, 0.27577711]. The rationale was that these are the mean and standard deviations of the images in the WebImageText dataset, so this preprocessing step roughly whitens the image tensor. These numbers slightly differ from the standard preprocessing for ImageNet, which uses [0.485, 0.456, 0.406] and [0.229, 0.224, 0.225]. If the input image does not have the same resolution as the native resolution (224×224 for all except ViT-L/14@336px, which has 336×336 resolution), then the input image is first scaled by bicubic interpolation, so that its shorter side is the same as the native resolution, then the central square of the image is cropped out. === Others === ALIGN used over one billion image-text pairs, obtained by extracting images and their alt-tags from online crawling. 
The method was described as similar to how the Conceptual Captions dataset was constructed, but instead of complex filtering, only frequency-based filtering was applied. Later models trained by other organizations used published datasets. For example, LAION trained OpenCLIP with the published datasets LAION-400M, LAION-2B, and DataComp-1B. == Training == The original OpenAI CLIP report describes training 5 ResNet and 3 ViT models (ViT-B/32, ViT-B/16, ViT-L/14). Each was trained for 32 epochs. The largest ResNet model took 18 days to train on 592 V100 GPUs. The largest ViT model took 12 days on 256 V100 GPUs. All ViT models were trained on 224×224 image resolution. The ViT-L/14 was then boosted to 336×336 resolution by FixRes, resulting in the ViT-L/14@336px model, which they found to be the best-performing model. In the OpenCLIP series, the ViT-L/14 model was trained on 384 A100 GPUs on the LAION-2B dataset, for 160 epochs for a total of 32B samples seen. == Applications == === Cross-modal retrieval === CLIP's cross-modal retrieval enables the alignment of visual and textual data in a shared latent space, allowing users to retrieve images based on text descriptions and vice versa, without the need for explicit image annotations. In text-to-image retrieval, users input descriptive text, and CLIP retrieves images with matching embeddings. In image-to-text retrieval, images are used to find related text content. CLIP’s ability to connect vis

  • How to Choose an AI Customer-support Bot

    How to Choose an AI Customer-support Bot

    Comparing the best AI customer-support bot? An AI customer-support bot is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI customer-support bot slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • Interactive machine translation

    Interactive machine translation

    Interactive machine translation (IMT) is a sub-field of computer-aided translation. Under this translation paradigm, the software that assists the human translator attempts to predict the text the user is going to input by taking into account all the information it has available. Whenever such a prediction is wrong and the user provides feedback to the system, a new prediction is made considering the new information available. This process is repeated until the translation provided matches the user's expectations. Interactive machine translation is especially interesting when translating texts in domains where it is not admissible to output a translation containing errors, so that a human user must amend the translations provided by the system. In such cases, interactive machine translation has been shown to benefit potential users. Nevertheless, little commercial software implements interactive machine translation, and work in the field is mostly restricted to academic research. == History == Historically, interactive machine translation was born as an evolution of the computer-aided translation paradigm, in which the human translator and the machine translation system were intended to work in tandem. This first work was extended within the TransType research project, funded by the Canadian government. In this project, the human interaction was aimed at producing the target text for the first time by embedding data-driven machine translation techniques within the interactive translation environment, with the goal of achieving the best of both actors: the efficiency of the automatic system and the reliability of human translators. 
Later, a larger-scale research project, TransType2, funded by the European Commission, extended this work by analyzing the incorporation of a complete machine translation system into the process, with the goal of producing a complete translation hypothesis, which the human user is allowed to amend or accept. If the user decides to amend the hypothesis, the system then attempts to make the best use of such feedback in order to produce a new translation hypothesis that takes into account the modifications introduced by the user. More recently, CASMACAT, also funded by the European Commission, aimed at developing novel types of assistance to human translators and integrated them into a new workbench, consisting of an editor, a server, and analysis and visualisation tools. The workbench was designed in a modular fashion and can be combined with existing computer-aided translation tools. Furthermore, the CASMACAT workbench can learn from the interaction with the human translator by updating and adapting its models instantly based on the translation choices of the user. Recent work involving an extensive evaluation with human users revealed that interactive machine translation may even be used by users who do not speak the source language in order to achieve near-professional translation quality. Moreover, it showed that an interactive scenario is more beneficial than a classic post-editing scenario. The previously described approaches rely on a tightly coupled underlying corpus-based machine translation system (usually a statistical machine translation system) that is used as a glass box, therefore inheriting the shortcomings of the translation systems and limiting the usage of interactive machine translation in some scenarios. For this reason, an approach that uses any kind of bilingual resource (not limited to machine translation) as a black box to provide interactive machine translation was developed. 
This approach is not able to extract as much information from the bilingual resources used, due to the black-box nature of the interaction, but can use any resource available to the user. Forecat is a black-box interactive machine translation implementation that is available both as a web application (that includes a webpage and a web services interface) and as a plugin for OmegaT (Forecat-OmegaT). == Process == The interactive machine translation process starts with the system suggesting a translation hypothesis to the user. Then, the user may accept the complete sentence as correct, or may modify it if he considers there is some error. Typically, when modifying a given word, it is assumed that the prefix until that word is correct, leading to a left-to-right interaction scheme. Once the user has changed the word considered incorrect, the system then proposes a new suffix, i.e. the remainder of the sentence. Such process continues until the translation provided satisfies the user. Although explained at the word level, the previous process may also be implemented at the character level, and hence the system provides a suffix whenever the human translator types in a single character. In addition, there is ongoing effort towards changing the typical left-to-right interaction scheme in order to make human-machine interaction easier. A similar approach is used in the Caitra translation tool. == Evaluation == Evaluation is a difficult issue in interactive machine translation. Ideally, evaluation should take place in experiments involving human users. However, given the high monetary cost this would imply, this is seldom the case. 
Moreover, even when human translators take part in a true evaluation of interactive machine translation techniques, it is not clear what should be measured in such experiments, since many variables should be taken into account but cannot be controlled, for instance the time the user takes to get used to the process. In the CASMACAT project, some field trials have been carried out to study some of these variables. For quick evaluations in laboratory conditions, interactive machine translation is measured using the key stroke ratio or the word stroke ratio. These criteria attempt to measure how many keystrokes or words the user needed to enter before producing the final translated document. == Differences with classical computer-aided translation == Although interactive machine translation is a sub-field of computer-aided translation, the main attraction of the former with respect to the latter is its interactivity. In classical computer-aided translation, the translation system may suggest one translation hypothesis in the best case, and the user is then required to post-edit that hypothesis. In contrast, in interactive machine translation the system produces a new translation hypothesis each time the user interacts with the system, i.e. after each word (or letter) has been entered.
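The left-to-right interaction scheme and a word-stroke-style measure can be sketched with a simulated user. Everything here is an assumption made for illustration: `make_toy_system` is a hypothetical stand-in for a real translation engine (it only completes prefixes from a fixed list of candidate sentences), the simulated user always corrects the first wrong word towards a known reference translation, and the ratio is normalized by the reference length, which the text above does not specify.

```python
def make_toy_system(candidates):
    """Return a suggest(prefix) function backed by a fixed list of sentences.

    Hypothetical stand-in for a translation engine: it proposes the suffix of
    the first candidate that starts with the validated prefix, or nothing.
    """
    tokenized = [sentence.split() for sentence in candidates]

    def suggest(prefix):
        for words in tokenized:
            if words[:len(prefix)] == prefix:
                return words[len(prefix):]
        return []

    return suggest


def word_stroke_ratio(reference, suggest):
    """Simulate prefix-based interaction and count the words the user types."""
    ref = reference.split()
    if not ref:
        return 0.0
    prefix = []   # words validated so far, always a prefix of the reference
    strokes = 0   # words the simulated user had to type
    while True:
        hypothesis = prefix + suggest(prefix)
        # Length of the longest hypothesis prefix that agrees with the reference.
        k = 0
        while k < len(ref) and k < len(hypothesis) and hypothesis[k] == ref[k]:
            k += 1
        if k == len(ref):
            # The user accepts; any extra trailing words are simply discarded.
            return strokes / len(ref)
        # The user corrects the first wrong (or missing) word; everything
        # before it is treated as a validated prefix for the next suggestion.
        strokes += 1
        prefix = ref[:k + 1]
```

For example, with the candidates "the house is small" and "the home is very small" and the reference "the home is very small", the simulated user types only the word "home": the system's first suggestion is wrong at the second word, and the new prefix "the home" lets it complete the rest.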

  • Krohn–Rhodes theory

    Krohn–Rhodes theory

    In mathematics and computer science, the Krohn–Rhodes theory (or algebraic automata theory) is an approach to the study of finite semigroups and automata that seeks to decompose them in terms of elementary components. These components correspond to finite aperiodic semigroups and finite simple groups that are combined in a feedback-free manner (called a "wreath product" or "cascade"). Krohn and Rhodes found a general decomposition for finite automata. The authors discovered and proved an unexpected major result in finite semigroup theory, revealing a deep connection between finite automata and semigroups. Decidability of Krohn-Rhodes complexity long motivated much work in semigroup theory. In June 2024, Stuart Margolis, John Rhodes, and Anne Schilling announced a proof that the complexity is decidable. == Definitions and description of the Krohn–Rhodes theorem == Let T {\displaystyle T} be a semigroup. A semigroup S {\displaystyle S} that is a homomorphic image of a subsemigroup of T {\displaystyle T} is said to be a divisor of T {\displaystyle T} . The Krohn–Rhodes theorem for finite semigroups states that every finite semigroup S {\displaystyle S} is a divisor of a finite alternating wreath product of finite simple groups, each a divisor of S {\displaystyle S} , and finite aperiodic semigroups (which contain no nontrivial subgroups). In the automata formulation, the Krohn–Rhodes theorem for finite automata states that given a finite automaton A {\displaystyle A} with states Q {\displaystyle Q} and input alphabet I {\displaystyle I} , output alphabet U {\displaystyle U} , then one can expand the states to Q ′ {\displaystyle Q'} such that the new automaton A ′ {\displaystyle A'} embeds into a cascade of "simple", irreducible automata: In particular, A {\displaystyle A} is emulated by a feed-forward cascade of (1) automata whose transformation semigroups are finite simple groups and (2) automata that are banks of flip-flops running in parallel. 
The new automaton $A'$ has the same input and output symbols as $A$. Here, both the states and inputs of the cascaded automata have a very special hierarchical coordinate form. Moreover, each simple group (prime) or non-group irreducible semigroup (subsemigroup of the flip-flop monoid) that divides the transformation semigroup of $A$ must divide the transformation semigroup of some component of the cascade, and the only primes that must occur as divisors of the components are those that divide the transformation semigroup of $A$. == Group complexity == The Krohn–Rhodes complexity (also called group complexity or just complexity) of a finite semigroup $S$ is the least number of groups in a wreath product of finite groups and finite aperiodic semigroups of which $S$ is a divisor. All finite aperiodic semigroups have complexity 0, while non-trivial finite groups have complexity 1. In fact, there are semigroups of every non-negative integer complexity. For example, for any $n$ greater than 1, the multiplicative semigroup of all $(n+1) \times (n+1)$ upper-triangular matrices over any fixed finite field has complexity $n$ (Kambites, 2007). A major open problem in finite semigroup theory is the decidability of complexity: is there an algorithm that will compute the Krohn–Rhodes complexity of a finite semigroup, given its multiplication table? Upper bounds and ever more precise lower bounds on complexity have been obtained (see, e.g., Rhodes & Steinberg, 2009). Rhodes has conjectured that the problem is decidable. In June 2024, Stuart Margolis, John Rhodes, and Anne Schilling announced a proof in the affirmative of the conjecture, though as of 2025 the result has yet to be confirmed. == History and applications == At a conference in 1962, Kenneth Krohn and John Rhodes announced a method for decomposing a (deterministic) finite automaton into "simple" components that are themselves finite automata. 
This joint work, which has implications for philosophy, comprised both Krohn's doctoral thesis at Harvard University and Rhodes' doctoral thesis at MIT. Simpler proofs, and generalizations of the theorem to infinite structures, have been published since then (see Chapter 4 of Rhodes and Steinberg's 2009 book The q-Theory of Finite Semigroups for an overview). In the 1965 paper by Krohn and Rhodes, the proof of the theorem on the decomposition of finite automata (or, equivalently sequential machines) made extensive use of the algebraic semigroup structure. Later proofs contained major simplifications using finite wreath products of finite transformation semigroups. The theorem generalizes the Jordan–Hölder decomposition for finite groups (in which the primes are the finite simple groups), to all finite transformation semigroups (for which the primes are again the finite simple groups plus all subsemigroups of the "flip-flop" (see above)). Both the group and more general finite automata decomposition require expanding the state-set of the general, but allow for the same number of input symbols. In the general case, these are embedded in a larger structure with a hierarchical "coordinate system". One must be careful in understanding the notion of "prime" as Krohn and Rhodes explicitly refer to their theorem as a "prime decomposition theorem" for automata. The components in the decomposition, however, are not prime automata (with prime defined in a naïve way); rather, the notion of prime is more sophisticated and algebraic: the semigroups and groups associated to the constituent automata of the decomposition are prime (or irreducible) in a strict and natural algebraic sense with respect to the wreath product (Eilenberg, 1976). Also, unlike earlier decomposition theorems, the Krohn–Rhodes decompositions usually require expansion of the state-set, so that the expanded automaton covers (emulates) the one being decomposed. 
These facts have made the theorem difficult to understand and challenging to apply in a practical way—until recently, when computational implementations became available (Egri-Nagy & Nehaniv 2005, 2008). H.P. Zeiger (1967) proved an important variant called the holonomy decomposition (Eilenberg 1976). The holonomy method appears to be relatively efficient and has been implemented computationally by A. Egri-Nagy (Egri-Nagy & Nehaniv 2005). Meyer and Thompson (1969) give a version of Krohn–Rhodes decomposition for finite automata that is equivalent to the decomposition previously developed by Hartmanis and Stearns, but for useful decompositions, the notion of expanding the state-set of the original automaton is essential (for the non-permutation automata case). Many proofs and constructions now exist of Krohn–Rhodes decompositions (e.g., [Krohn, Rhodes & Tilson 1968], [Ésik 2000], [Diekert et al. 2012]), with the holonomy method the most popular and efficient in general (although not in all cases). [Zimmermann 2010] gives an elementary proof of the theorem. Owing to the close relation between monoids and categories, a version of the Krohn–Rhodes theorem is applicable to category theory. This observation and a proof of an analogous result were offered by Wells (1980). The Krohn–Rhodes theorem for semigroups/monoids is an analogue of the Jordan–Hölder theorem for finite groups (for semigroups/monoids rather than groups). As such, the theorem is a deep and important result in semigroup/monoid theory. The theorem was also surprising to many mathematicians and computer scientists since it had previously been widely believed that the semigroup/monoid axioms were too weak to admit a structure theorem of any strength, and prior work (Hartmanis & Stearns) was only able to show much more rigid and less general decomposition results for finite automata. 
Work by Egri-Nagy and Nehaniv (2005, 2008–) continues to further automate the holonomy version of the Krohn–Rhodes decomposition extended with the related decomposition for finite groups (so-called Frobenius–Lagrange coordinates) using the computer algebra system GAP. Applications outside of the semigroup and monoid theories are now computationally feasible. They include computations in biology and biochemical systems (e.g. Egri-Nagy & Nehaniv 2008), artificial intelligence, finite-state physics, psychology, and game theory (see, for example, Rhodes 2009).
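
The group complexity section notes that finite aperiodic semigroups have complexity 0. As a rough illustration, the sketch below builds the transformation semigroup generated by an automaton's input letters and tests whether it is aperiodic. It is not a Krohn–Rhodes decomposition and does not compute complexity in general. The function names and the encoding of state maps as tuples are assumptions made for this example.

```python
def compose(f, g):
    """Apply state map f first, then g. Maps are tuples over states 0..n-1."""
    return tuple(g[f[q]] for q in range(len(f)))


def generated_semigroup(generators):
    """Close a collection of state maps under composition."""
    elements = set(generators)
    frontier = list(elements)
    while frontier:
        x = frontier.pop()
        for g in generators:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def is_aperiodic(elements):
    """True if every element x satisfies x^m == x^(m+1) for some m."""
    for x in elements:
        seen = []
        power = x
        while power not in seen:
            seen.append(power)
            power = compose(power, x)
        # 'power' is the first repeated power of x; the cycle of powers
        # has length 1 exactly when it equals the last new power found.
        if power != seen[-1]:
            return False
    return True


# Flip-flop on two states: reset to 0, reset to 1, and "do nothing".
flip_flop = generated_semigroup([(0, 0), (1, 1), (0, 1)])
# A 3-state counter whose single input cycles the states (a cyclic group).
counter = generated_semigroup([(1, 2, 0)])

print(is_aperiodic(flip_flop))  # True
print(is_aperiodic(counter))    # False
```

The flip-flop semigroup contains no nontrivial subgroup, so by the statement above it has complexity 0, whereas the counter's semigroup is a non-trivial group.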

  • XLeratorDB

    XLeratorDB

    XLeratorDB is a suite of database function libraries that enable Microsoft SQL Server to perform a wide range of additional (non-native) business intelligence and ad hoc analytics. The libraries, which are embedded and run centrally on the database, include more than 450 individual functions similar to those found in Microsoft Excel spreadsheets. The individual functions are grouped and sold as six separate libraries based on usage: finance, statistics, math, engineering, unit conversions and strings. WestClinTech, the company that developed XLeratorDB, claims it is "the first commercial function package add-in for Microsoft SQL Server." == Company history == WestClinTech (LLC), founded by software industry veterans Charles Flock and Joe Stampf in 2008, is located in Irvington, New York, United States. Flock was a co-founder of The Frustum Group, developer of the OPICS enterprise banking and trading platform, which was acquired by London-based Misys, PLC in 1996. Stampf joined Frustum in 1994 and, with Flock, remained active with the company after the acquisition, helping to develop successive generations of OPICS, now employed by over 150 leading financial institutions worldwide. Following a full year of research, development and testing, WestClinTech introduced and recorded its first commercial sale of XLeratorDB in April 2009. In September 2009, XLeratorDB became available to all Federal agencies through NASA's Strategic Enterprise-Wide Procurement (SEWP-IV) program, a government-wide acquisition contract. == Technology == XLeratorDB uses Microsoft SQL CLR (Common Language Runtime) technology. SQL CLR allows managed code to be hosted by, and run in, the Microsoft SQL Server environment. SQL CLR relies on the creation, deployment and registration of .NET Framework assemblies that are physically stored in managed code dynamic-link libraries (DLL). The assemblies may contain .NET namespaces, classes, functions, and properties. 
Because managed code compiles to native code prior to execution, functions using SQL CLR can achieve significant performance increases versus the equivalent functions written in T-SQL in some scenarios. XLeratorDB requires Microsoft SQL Server 2005 or SQL Server 2005 Express editions, or later (compatibility mode 90 or higher). The product installs with PERMISSION_SET=SAFE. SAFE mode, the most restrictive permission set, is accessible by all users. Code executed by an assembly with SAFE permissions cannot access external system resources such as files, the network, the internet, environment variables, or the registry. == Functions == In computer science, a function is a portion of code within a larger program which performs a specific task and is relatively independent of the remaining code. As used in database and spreadsheet applications these functions generally represent mathematical formulas widely used across a variety of fields. While this code may be user-generated, it is also embedded as a pre-written sub-routine in applications. These functions are typically identified by common nomenclature which corresponds to their underlying operations: e.g. IRR identifies the function which calculates Internal Rate of Return on a series of periodic cash flows. === Function uses === As subroutines, functions can be integrated and used in a variety of ways, and as part of larger, more complicated applications. Within large enterprise applications they may, for example, play an important role in defining business rules or risk management parameters, while remaining virtually invisible to end users. Within database management systems and spreadsheets, however, these kinds of functions also represent discrete sets of tools; they can be accessed directly and utilized on a stand-alone basis, or in more complex, user-defined configurations. 
In this context, functions can be used for business intelligence and ad hoc analysis of data in fields such as finance, statistics, engineering, math, etc. === Function types === XLeratorDB uses three kinds of functions to perform analytic operations: scalar, aggregate, and a hybrid form which WestClinTech calls Range Queries. Scalar functions take a single value, perform an operation and return a single value. An example of this type of function is LOG, which returns the logarithm of a number to a specified base. Aggregate functions operate on a series of values but return a single, summarizing value. An example of this type of function is AVG, which returns the average of values in a specified group. In XLeratorDB there are some functions which have characteristics of aggregate functions (operating on multiple series of values) but cannot be processed in SQL CLR using single column inputs, such as AVG does. For example, irregular internal rate of return (XIRR), a financial function, operates on a collection of cash flow values from one column, but must also apply variable period lengths from another column and an initial iterative assumption from a third, in order to return a single, summarizing value. WestClinTech documentation notes that Range Queries specify the data to be included in the result set of the function independently of the WHERE clause associated with the T-SQL statement, by incorporating a SELECT statement into the function as a string argument; the function then traps that SELECT statement, executes it internally and processes the result. Some XLeratorDB functions that employ Range Queries are: NPV, XNPV, IRR, XIRR, MIRR, MULTINOMIAL, and SERIESSUM. Within the application these functions are identified by a "_q" naming convention: e.g. NPV_q, IRR_q, etc. == Analytic functions == === SQL Server functions === Microsoft SQL Server is the #3 selling database management system (DBMS), behind Oracle and IBM. 
(While versions of SQL Server have been on the market since 1987, XLeratorDB is compatible with only the 2005 edition and later.) Like all major DBMS, SQL Server performs a variety of data mining operations by returning or arraying data in different views (also known as drill-down). In addition, SQL Server uses Transact-SQL (T-SQL) to execute four major classes of pre-defined functions in native mode. Functions operating on the DBMS offer several advantages over client layer applications like Excel: they utilize the most up-to-date data available; they can process far larger quantities of data; and the data is not subject to exporting and transcription errors. SQL Server 2008 includes a total of 58 functions that perform relatively basic aggregation (12), math (23) and string manipulation (23) operations useful for analytics; it includes no native functions that perform more complex operations directly related to finance, statistics or engineering. === Excel functions === Microsoft Excel, a component of the Microsoft Office suite, is one of the most widely used spreadsheet applications on the market today. In addition to its inherent utility as a stand-alone desktop application, Excel overlaps and complements the functionality of DBMS in several ways: storing and arraying data in rows and columns; performing certain basic tasks such as building pivot tables and aggregating values; and facilitating sharing, importing and exporting of database data. Excel's chief limitation relative to a true database is capacity: Excel 2003 is limited to some 65k rows and 256 columns; Excel 2007 extends this capacity to roughly 1 million rows and 16k columns. By comparison, SQL Server is able to manage over 500k terabytes of memory. Excel offers, however, an extensive library of specialized pre-written functions which are useful for performing ad hoc analysis on database data. 
Excel 2007 includes over 300 of these pre-defined functions, although customized functions can also be created by users, or imported from third party developers as add-ons. Excel functions are grouped by type: === Excel business intelligence functions === Operating on the client computing layer, Excel plays an important role as a business intelligence tool because it: performs a wide array of complex analytic functions not native to most DBMS software; offers far greater ad hoc reporting and analytic flexibility than most enterprise software; and provides a medium for sharing and collaborating because of its ubiquity throughout the enterprise. Microsoft reinforces this positioning with Business Intelligence documentation that positions Excel in a clearly pivotal role. === XLeratorDB vs. Excel functions === While operating within the database environment, XLeratorDB functions utilize the same naming conventions and input formats, and in most cases, return the same calculation results as Excel functions. XLeratorDB, coupled with SQL Server's native capabilities, compares to Excel's function sets as follows:

  • Jollo

    Jollo

    Jollo was an online machine translation service where users could instantly translate texts into 23 languages, request human translations from a community of volunteers around the world, and compare the correctness of several leading machine translation websites. It was discontinued in 2012. == System == Jollo was a free Web 2.0 website that attempted to improve the way in which people translate online through the use of existing machine translation websites and a community of volunteers who correct and rate translations. The system relied on a similar methodology as computer-assisted translation to ensure translation quality, and featured a public translation memory that records past translations. Jollo received some notable media attention, including in The Daily Telegraph. According to the blog KillerStartups, Jollo combined the benefits of the speed of machine translations and human reviews to ensure translation quality. According to Jeffrey Hill from The English Blog, the community features made Jollo an interesting alternative to other online translation services. == Development == The Jollo website was classified as beta. It was developed using LAMP and was praised for its colorful graphics and simple user interface. Jollo offered a simple web-based API that could be used for translations. For example, the URL: http://www.jollo.com/translate.php?st=I%20love%20you&sl=en&tl=zh was used to translate the sentence "I love you" from English into Chinese.
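
The URL above shows the shape of a request: the source text in `st`, the source language in `sl` and the target language in `tl`. The sketch below only reconstructs such a URL with Python's standard library. The function name `build_jollo_url` is hypothetical, the parameter meanings are inferred from the single example in the text, and because the service was discontinued in 2012 the code deliberately makes no network request.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the example URL in the text.
JOLLO_ENDPOINT = "http://www.jollo.com/translate.php"


def build_jollo_url(text, source_lang, target_lang):
    """Build a translate.php URL in the style of the documented example.

    Assumes st = source text, sl = source language, tl = target language,
    as suggested by the one example given in the article.
    """
    params = {"st": text, "sl": source_lang, "tl": target_lang}
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    return JOLLO_ENDPOINT + "?" + urlencode(params, quote_via=quote)


print(build_jollo_url("I love you", "en", "zh"))
```

Percent-encoding the text with `quote` rather than the default `quote_plus` matches the `%20` form used in the article's example.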

  • Is an AI Avatar Generator Worth It in 2026?

    Is an AI Avatar Generator Worth It in 2026?

    Looking for the best AI avatar generator? An AI avatar generator is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI avatar generator slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

  • Co-Büchi automaton

    Co-Büchi automaton

    In automata theory, a co-Büchi automaton is a variant of Büchi automaton. The only difference is the accepting condition: a co-Büchi automaton accepts an infinite word $w$ if there exists a run such that all the states occurring infinitely often in the run are in the final state set $F$. In contrast, a Büchi automaton accepts a word $w$ if there exists a run such that at least one state occurring infinitely often is in the final state set $F$. (Deterministic) Co-Büchi automata are strictly weaker than (nondeterministic) Büchi automata. == Formal definition == Formally, a deterministic co-Büchi automaton is a tuple $\mathcal{A}=(Q,\Sigma,\delta,q_{0},F)$ that consists of the following components: $Q$ is a finite set. The elements of $Q$ are called the states of $\mathcal{A}$. $\Sigma$ is a finite set called the alphabet of $\mathcal{A}$. $\delta : Q\times\Sigma \rightarrow Q$ is the transition function of $\mathcal{A}$. $q_{0}$ is an element of $Q$, called the initial state. $F\subseteq Q$ is the final state set. $\mathcal{A}$ accepts exactly those words $w$ with the run $\rho(w)$, in which all of the infinitely often occurring states in $\rho(w)$ are in $F$. In a non-deterministic co-Büchi automaton, the transition function $\delta$ is replaced with a transition relation $\Delta$. The initial state $q_{0}$ is replaced with an initial state set $Q_{0}$. Generally, the term co-Büchi automaton refers to the non-deterministic co-Büchi automaton. For more comprehensive formalism see also ω-automaton. 
== Acceptance Condition == The acceptance condition of a co-Büchi automaton is formally $\exists i\,\forall j:\; j\geq i \quad \rho(w_{j})\in F.$ The Büchi acceptance condition is the complement of the co-Büchi acceptance condition: $\forall i\,\exists j:\; j\geq i \quad \rho(w_{j})\in F.$ == Properties == Co-Büchi automata are closed under union, intersection, projection and determinization.
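
To make the two acceptance conditions concrete, the following sketch evaluates them for a deterministic automaton on an ultimately periodic word, that is, a finite prefix followed by a cycle repeated forever. Restricting to such words is an assumption of this example (it is what makes the set of infinitely visited states computable by simulation), and the function names and dictionary encoding of $\delta$ are illustrative only.

```python
def infinitely_visited(delta, q0, prefix, cycle):
    """States visited infinitely often on the word prefix . cycle^omega.

    delta maps (state, letter) pairs to states (deterministic automaton).
    """
    if not cycle:
        raise ValueError("cycle must be non-empty")
    q = q0
    for letter in prefix:
        q = delta[(q, letter)]
    pass_index_at = {}   # state at the start of a pass -> index of that pass
    visited_in_pass = [] # set of states entered during each pass of the cycle
    while q not in pass_index_at:
        pass_index_at[q] = len(visited_in_pass)
        states = set()
        for letter in cycle:
            q = delta[(q, letter)]
            states.add(q)
        visited_in_pass.append(states)
    # From the first pass that started in state q, the run repeats forever.
    first = pass_index_at[q]
    return set().union(*visited_in_pass[first:])


def co_buchi_accepts(delta, q0, final, prefix, cycle):
    """All infinitely visited states lie in the final state set."""
    return infinitely_visited(delta, q0, prefix, cycle) <= set(final)


def buchi_accepts(delta, q0, final, prefix, cycle):
    """At least one infinitely visited state lies in the final state set."""
    return bool(infinitely_visited(delta, q0, prefix, cycle) & set(final))


# Two states: 0 after reading 'a', 1 after reading 'b'; F = {1}.
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}

print(co_buchi_accepts(delta, 0, {1}, "aa", "b"))  # True: only finitely many a's
print(co_buchi_accepts(delta, 0, {1}, "", "ab"))   # False: state 0 recurs forever
print(buchi_accepts(delta, 0, {1}, "", "ab"))      # True: state 1 recurs forever
```

With this automaton and $F=\{1\}$, the co-Büchi condition describes words with finitely many occurrences of `a`, while the Büchi condition describes words with infinitely many occurrences of `b`.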

  • Pyramid (image processing)

    Pyramid (image processing)

    Pyramid, or pyramid representation, is a type of multi-scale signal representation developed by the computer vision, image processing and signal processing communities, in which a signal or an image is subject to repeated smoothing and subsampling. Pyramid representation is a predecessor to scale-space representation and multiresolution analysis. == Pyramid generation == There are two main types of pyramids: lowpass and bandpass. A lowpass pyramid is made by smoothing the image with an appropriate smoothing filter and then subsampling the smoothed image, usually by a factor of 2 along each coordinate direction. The resulting image is then subjected to the same procedure, and the cycle is repeated multiple times. Each cycle of this process results in a smaller image with increased smoothing, but with decreased spatial sampling density (that is, decreased image resolution). If illustrated graphically, the entire multi-scale representation will look like a pyramid, with the original image on the bottom and each cycle's resulting smaller image stacked one atop the other. A bandpass pyramid is made by forming the difference between images at adjacent levels in the pyramid and performing image interpolation between adjacent levels of resolution, to enable computation of pixelwise differences. == Pyramid generation kernels == A variety of different smoothing kernels have been proposed for generating pyramids. Among the suggestions that have been given, the binomial kernels arising from the binomial coefficients stand out as a particularly useful and theoretically well-founded class. Thus, given a two-dimensional image, we may apply the (normalized) binomial filter (1/4, 1/2, 1/4) typically twice or more along each spatial dimension and then subsample the image by a factor of two. This operation may then proceed as many times as desired, leading to a compact and efficient multi-scale representation. 
If motivated by specific requirements, intermediate scale levels may also be generated where the subsampling stage is sometimes left out, leading to an oversampled or hybrid pyramid. With the increasing computational efficiency of CPUs available today, it is in some situations also feasible to use wider supported Gaussian filters as smoothing kernels in the pyramid generation steps. === Gaussian pyramid === In a Gaussian pyramid, subsequent images are weighted down using a Gaussian average (Gaussian blur) and scaled down. Each pixel containing a local average corresponds to a neighborhood pixel on a lower level of the pyramid. This technique is used especially in texture synthesis. === Laplacian pyramid === A Laplacian pyramid is very similar to a Gaussian pyramid but saves the difference image of the blurred versions between adjacent levels. Only the smallest level is not a difference image, which enables reconstruction of the high-resolution image using the difference images on higher levels. This technique can be used in image compression. === Steerable pyramid === A steerable pyramid, developed by Simoncelli and others, is an implementation of a multi-scale, multi-orientation band-pass filter bank used for applications including image compression, texture synthesis, and object recognition. It can be thought of as an orientation-selective version of a Laplacian pyramid, in which a bank of steerable filters is used at each level of the pyramid instead of a single Laplacian or Gaussian filter. == Applications of pyramids == === Alternative representation === In the early days of computer vision, pyramids were used as the main type of multi-scale representation for computing multi-scale image features from real-world image data. 
More recent techniques include scale-space representation, which has been popular among some researchers due to its theoretical foundation, the ability to decouple the subsampling stage from the multi-scale representation, the more powerful tools for theoretical analysis as well as the ability to compute a representation at any desired scale, thus avoiding the algorithmic problems of relating image representations at different resolutions. Nevertheless, pyramids are still frequently used for expressing computationally efficient approximations to scale-space representation. === Detail manipulation === Levels of a Laplacian pyramid can be added to or removed from the original image to amplify or reduce detail at different scales. However, detail manipulation of this form is known to produce halo artifacts in many cases, leading to the development of alternatives such as the bilateral filter. Some image compression file formats use the Adam7 algorithm or some other interlacing technique. These can be seen as a kind of image pyramid. Because those file formats store the "large-scale" features first, and fine-grain details later in the file, a particular viewer displaying a small "thumbnail" or on a small screen can quickly download just enough of the image to display it in the available pixels, so one file can support many viewer resolutions, rather than having to store or generate a different file for each resolution.
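
The lowpass and Laplacian constructions described above can be sketched in a few lines. This is a minimal pure-Python illustration, not a reference implementation: it applies the binomial filter (1/4, 1/2, 1/4) once per axis, replicates edge pixels at the border, and upsamples by pixel repetition instead of a smoother interpolation. All three choices, and the function names, are assumptions made to keep the example short.

```python
KERNEL = (0.25, 0.5, 0.25)  # normalized binomial filter from the text


def smooth_rows(image):
    """Filter each row with the binomial kernel, replicating edge pixels."""
    out = []
    for row in image:
        n = len(row)
        out.append([
            KERNEL[0] * row[max(c - 1, 0)]
            + KERNEL[1] * row[c]
            + KERNEL[2] * row[min(c + 1, n - 1)]
            for c in range(n)
        ])
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def smooth(image):
    """Separable smoothing: along rows, then along columns."""
    return transpose(smooth_rows(transpose(smooth_rows(image))))


def subsample(image):
    """Keep every second pixel along each coordinate direction."""
    return [row[::2] for row in image[::2]]


def lowpass_pyramid(image, levels):
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(subsample(smooth(pyramid[-1])))
    return pyramid


def upsample(image, height, width):
    """Pixel-repetition upsampling to the requested size."""
    hs, ws = len(image), len(image[0])
    return [[image[min(r // 2, hs - 1)][min(c // 2, ws - 1)]
             for c in range(width)] for r in range(height)]


def laplacian_pyramid(lowpass):
    """Difference images between adjacent levels, plus the smallest level."""
    levels = []
    for fine, coarse in zip(lowpass, lowpass[1:]):
        up = upsample(coarse, len(fine), len(fine[0]))
        levels.append([[f - u for f, u in zip(fr, ur)]
                       for fr, ur in zip(fine, up)])
    levels.append(lowpass[-1])  # the smallest level is kept as an image
    return levels


def reconstruct(laplacian):
    """Rebuild the full-resolution image from the difference images."""
    image = laplacian[-1]
    for diff in reversed(laplacian[:-1]):
        up = upsample(image, len(diff), len(diff[0]))
        image = [[d + u for d, u in zip(dr, ur)] for dr, ur in zip(diff, up)]
    return image


image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
lowpass = lowpass_pyramid(image, 3)
print([(len(level), len(level[0])) for level in lowpass])
restored = reconstruct(laplacian_pyramid(lowpass))
```

Because each difference image is formed against the same upsampled coarse level that reconstruction later adds back, `restored` matches `image` up to floating-point rounding regardless of the interpolation used.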

  • How to Choose an AI Customer-support Bot

    How to Choose an AI Customer-support Bot

    Comparing the best AI customer-support bot? An AI customer-support bot is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI customer-support bot slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • Lin-Shan Lee

    Lin-Shan Lee

    Lin-Shan Lee (Chinese: 李琳山; born 23 September 1952) is a Taiwanese computer scientist. == Education and career == Lee earned a bachelor's degree in electrical engineering from National Taiwan University in 1974, and pursued a doctorate in the same subject at Stanford University, graduating in 1977. He subsequently returned to Taiwan and joined the NTU faculty in 1982. Lee is a 1993 fellow of the Institute of Electrical and Electronics Engineers, recognized "[f]or contributions to computer voice input/output techniques for Mandarin Chinese and to engineering education." The International Speech Communication Association elevated him to fellow status in 2010 "[f]or his contributions to Chinese spoken language processing and speech information retrieval, and his service to the speech language community." In 2016, Lee was elected a member of Academia Sinica.
