AI Assistant Examples

AI Assistant Examples — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • INDECT

    INDECT is a research project in the area of intelligent security systems, carried out by several European universities since 2009 and funded by the European Union. The purpose of the project is to involve European scientists and researchers in the development of solutions and tools for automatic threat detection, for example through the processing of CCTV camera data streams, standardization of video sequence quality for user applications, threat detection in computer networks, and data and privacy protection. The area of research, the applied methods, and the techniques are described in the public deliverables available on the project's website. Practically all information related to the research is public. Only documents containing financial data, or information that could negatively affect the competitiveness or law enforcement capabilities of the parties involved in the project, are not published. This follows the regulations and practices applied in EU research projects.

    == Application and target users ==

    The main end users of INDECT solutions are police forces and security services. The project's principle of operation is to detect threats and identify their sources without monitoring or searching for particular citizens or groups of citizens. The system operator (i.e. a police officer) then decides whether intervention by the services responsible for public security is required. Further investigation, which may eventually lead to the persons related to a threat, is performed with the presumption of innocence preserved, following existing procedures already used by police services and prosecutors. According to the project deliverables, INDECT does not involve storage of personal data (such as names, addresses, or identity document numbers).
    A similar behavior-based surveillance program was SAMURAI (Suspicious and Abnormal behavior Monitoring Using a netwoRk of cAmeras & sensors for sItuation awareness enhancement).

    == Expected results ==

    The main expected results of the INDECT project are:

      • Trial of intelligent analysis of video and audio data for threat detection in urban environments
      • Creation of tools and technology for privacy and data protection during storage and transmission of information, using quantum cryptography and new methods of digital watermarking
      • Computer-aided detection of threats and targeted crimes in Internet resources, with privacy-protecting solutions
      • Construction of a search engine for rapid semantic search, based on watermarking of content related to child pornography and human organ trafficking
      • Implementation of a distributed computer system capable of effective intelligent processing

    == Controversy ==

    Some media and other sources have accused INDECT of privacy abuse, collecting personal data, and withholding information from the public. These issues have consequently been commented on and discussed by some Members of the European Parliament. According to the project's documentation, INDECT does not involve mobile phone tracking or call interception. The rumors about INDECT being tested during the 2012 UEFA European Football Championship also turned out to be false. The mid-term review of the Seventh Framework Programme to the European Parliament strongly urges the European Commission to make all documents available immediately and to define a clear and strict mandate for the research goal, the application, and the end users of INDECT, and stresses the need for a thorough investigation of the possible impact on fundamental rights. Nevertheless, according to Mr. Paweł Kowal, MEP, the project underwent an ethical review on 15 March 2011 in Brussels with the participation of ethics experts from Austria, France, the Netherlands, Germany and Great Britain.

    Read more →
  • Isolation forest

    Isolation forest is an unsupervised learning algorithm for anomaly detection that works on the principle of isolating anomalies, rather than profiling normal points as most common techniques do. In statistics, an anomaly (a.k.a. outlier) is an observation or event that deviates so much from other events as to arouse suspicion that it was generated by a different mechanism. For example, the graph in Fig. 1 represents ingress traffic to a web server, expressed as the number of requests in 3-hour intervals over a period of one month. Simply looking at the picture makes it evident that some points (marked with a red circle) are unusually high, to the point of raising the suspicion that the web server might have been under attack at that time. On the other hand, the flat segment indicated by the red arrow also seems unusual and might be a sign that the server was down during that period.

    Anomalies in a big dataset may follow very complicated patterns, which are difficult to detect "by eye" in the great majority of cases. This is why the field of anomaly detection is well suited to machine learning techniques. The most common techniques employed for anomaly detection are based on constructing a profile of what is "normal": anomalies are reported as those instances in the dataset that do not conform to the normal profile. Isolation Forest uses a different approach: instead of trying to build a model of normal instances, it explicitly isolates anomalous points in the dataset. The main advantage of this approach is that it can exploit sampling techniques to an extent not possible for profile-based methods, yielding a very fast algorithm with a low memory demand.

    == History ==

    The Isolation Forest (iForest) algorithm was initially proposed by Fei Tony Liu, Kai Ming Ting and Zhi-Hua Zhou in 2008.
    The authors took advantage of two quantitative properties of anomalous data points in a sample:

      • they are the minority, consisting of fewer instances, and
      • they have attribute values that are very different from those of normal instances.

    Since anomalies are typically few and very different from the other points in the sample, they must be easier to "isolate" than normal points. On the basis of this principle, Isolation Forest builds an ensemble of "Isolation Trees" (iTrees) for the data set and marks as anomalies the points that have short average path lengths on the iTrees.

    In a later paper, published in 2012, the same authors described a set of experiments to show that iForest:

      • has a low linear time complexity and a small memory requirement
      • is able to deal with high-dimensional data with irrelevant attributes
      • can be trained with or without anomalies in the training set
      • can provide detection results with different levels of granularity without re-training

    In 2013 Zhiguo Ding and Minrui Fei proposed a framework based on iForest to address the problem of detecting anomalies in streaming data. Further applications of iForest to streaming data are described in papers by Swee Chuan Tan et al., G. A. Susto et al. and Yu Weng et al.

    One of the main problems in applying iForest to anomaly detection lay not in the model itself but in the way the "anomaly score" was computed. This problem was highlighted by Sahand Hariri, Matias Carrasco Kind and Robert J. Brunner in a 2018 paper, in which they proposed an improved iForest model named Extended Isolation Forest (EIF). In the same paper the authors describe the improvements made to the original model and how they enhance the consistency and reliability of the anomaly score produced for a given data point.
    == Algorithm ==

    The Isolation Forest algorithm rests on the tendency of anomalous instances in a dataset to be easier to separate from the rest of the sample (isolate) than normal points. To isolate a data point, the algorithm recursively generates partitions on the sample by randomly selecting an attribute and then randomly selecting a split value for that attribute, between the minimum and maximum values allowed for it.

    An example of random partitioning in a 2D dataset of normally distributed points is given in Fig. 2 for a non-anomalous point and in Fig. 3 for a point that is more likely to be an anomaly. The pictures show that anomalies require fewer random partitions to be isolated than normal points do.

    From a mathematical point of view, recursive partitioning can be represented by a tree structure named Isolation Tree, while the number of partitions required to isolate a point can be interpreted as the length of the path, within the tree, from the root to a terminating node. For example, the path length of point xi in Fig. 2 is greater than the path length of xj in Fig. 3.

    More formally, let X = { x1, ..., xn } be a set of d-dimensional points and X' ⊂ X a subset of X. An Isolation Tree (iTree) is defined as a data structure with the following properties:

      • for each node T in the tree, T is either an external node with no child, or an internal node with one "test" and exactly two daughter nodes (Tl, Tr)
      • a test at node T consists of an attribute q and a split value p such that the test q < p determines the traversal of a data point to either Tl or Tr.

    In order to build an iTree, the algorithm recursively divides X' by randomly selecting an attribute q and a split value p, until either (i) the node has only one instance or (ii) all data at the node have the same values. When the iTree is fully grown, each point in X is isolated at one of the external nodes.
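    The iTree construction and the path length described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the names build_itree and path_length, the dict-based node layout, and the toy data are assumptions made for this example. For brevity the trees are grown on the whole toy sample instead of a sub-sample X', and the attribute is drawn only from attributes that still have some spread, so that every split makes progress.

```python
import random


def build_itree(points, rng):
    """Grow one isolation tree over a list of equal-length tuples."""
    # External node: a single instance, or all instances identical.
    if len(points) <= 1 or all(p == points[0] for p in points):
        return {"size": len(points)}
    # Assumption: choose q only among attributes whose values differ.
    dims = [q for q in range(len(points[0]))
            if min(p[q] for p in points) < max(p[q] for p in points)]
    q = rng.choice(dims)
    lo = min(p[q] for p in points)
    hi = max(p[q] for p in points)
    # Random split value between min and max; redraw if it equals the
    # minimum, which would leave the left branch empty.
    split = lo
    while split == lo:
        split = rng.uniform(lo, hi)
    return {
        "q": q,
        "p": split,
        "left": build_itree([x for x in points if x[q] < split], rng),
        "right": build_itree([x for x in points if x[q] >= split], rng),
    }


def path_length(tree, x):
    """Number of edges x traverses from the root to an external node."""
    edges = 0
    while "size" not in tree:
        tree = tree["left"] if x[tree["q"]] < tree["p"] else tree["right"]
        edges += 1
    return edges


rng = random.Random(0)
cluster = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(64)]
outlier = (10.0, 10.0)
sample = cluster + [outlier]
trees = [build_itree(sample, rng) for _ in range(100)]


def mean_path(x):
    return sum(path_length(t, x) for t in trees) / len(trees)


avg_outlier = mean_path(outlier)
avg_inlier = mean_path(cluster[0])
print("outlier:", avg_outlier, "inlier:", avg_inlier)
```

    In this toy run the point placed far from the cluster ends up with a shorter average path than a point inside the cluster, which is the effect the algorithm relies on.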
    Intuitively, the anomalous points are those with the smaller path length in the tree (they are easier to isolate), where the path length h(xi) of a point xi ∈ X is defined as the number of edges xi traverses from the root node to reach an external node. A probabilistic explanation of iTree is provided in the original iForest paper.

    == Properties of Isolation Forest ==

    Sub-sampling: since iForest does not need to isolate all normal instances, it can frequently ignore the large majority of the training sample. As a consequence, iForest works very well when the sampling size is kept small, in contrast with the great majority of existing methods, where a large sampling size is usually desirable.

    Swamping: when normal instances are too close to anomalies, the number of partitions required to separate anomalies increases, a phenomenon known as swamping, which makes it more difficult for iForest to discriminate between anomalies and normal points. One of the main reasons for swamping is the presence of too much data for the purpose of anomaly detection, which implies that one possible solution to the problem is sub-sampling. Since iForest responds very well to sub-sampling in terms of performance, reducing the number of points in the sample is also a good way to reduce the effect of swamping.

    Masking: when the number of anomalies is high, some of them may aggregate in a dense and large cluster, making it more difficult to separate the single anomalies and, in turn, to detect such points as anomalous. Like swamping, this phenomenon (known as "masking") is more likely when the number of points in the sample is large, and can be alleviated through sub-sampling.

    High Dimensional Data: one of the main limitations of standard, distance-based methods is their inefficiency in dealing with high-dimensional datasets.
    The main reason is that in a high-dimensional space every point is equally sparse, so a distance-based measure of separation is largely ineffective. Unfortunately, high-dimensional data also affects the detection performance of iForest, but performance can be vastly improved by adding a feature selection test, such as kurtosis, to reduce the dimensionality of the sample space.

    Normal Instances Only: iForest performs well even if the training set does not contain any anomalous points, because iForest describes data distributions in such a way that high values of the path length h(xi) correspond to the presence of data points. As a consequence, the presence of anomalies is largely irrelevant to iForest's detection performance.

    == Anomaly Detection with Isolation Forest ==

    Anomaly detection with Isolation Forest is a process composed of two main stages:

      • in the first stage, a training dataset is used to build iTrees as described in the previous sections
      • in the second stage, each instance in the test set is passed through the iTrees built in the previous stage, and a proper "anomaly score" is assigned to the instance using the algorithm described below

    Once all the instances in the test set have been assigned an anomaly score, it is possible to mark as "anomaly" any point whose score is greater than a predefined threshold, which depends on the domain the analysis is being applied to.

    === Anomaly Score ===

    Th

    Read more →
  • Top 10 AI Clip Makers Compared (2026)

    Comparing the best AI clip maker? An AI clip maker is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI clip maker slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Weighted automaton

    In theoretical computer science and formal language theory, a weighted automaton or weighted finite-state machine is a generalization of a finite-state machine in which the edges have weights, for example real numbers or integers. Finite-state machines are only capable of answering decision problems; they take as input a string and produce a Boolean output, i.e. either "accept" or "reject". In contrast, weighted automata produce a quantitative output, for example a count of how many answers are possible on a given input string, or a probability of how likely the input string is according to a probability distribution. They are one of the simplest studied models of quantitative automata.

    The definition of a weighted automaton is generally given over an arbitrary semiring R, an abstract set with an addition operation + and a multiplication operation ×. The automaton consists of a finite set of states, a finite input alphabet of characters Σ, and edges which are labeled with both a character in Σ and a weight in R. The weight of any path in the automaton is defined to be the product of the weights along the path, and the weight of a string is the sum of the weights of all paths which are labeled with that string. The weighted automaton thus defines a function from Σ* to R.

    Weighted automata generalize deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs), which correspond to weighted automata over the Boolean semiring, where addition is logical disjunction and multiplication is logical conjunction. In the DFA case, there is at most one accepting path for any input string, so disjunction is not applied.
    When the weights are real numbers and the outgoing weights for each state add to one, weighted automata can be considered a probabilistic model and are also known as probabilistic automata. These machines define a probability distribution over all strings, and are related to other probabilistic models such as Markov decision processes and Markov chains.

    Weighted automata have applications in natural language processing, where they are used to assign weights to words and sentences, as well as in image compression. They were first introduced by Marcel-Paul Schützenberger in his 1961 paper On the definition of a family of automata. Since their introduction, many extensions have been proposed, for example nested weighted automata, cost register automata, and weighted finite-state transducers. Researchers have studied weighted automata from the perspective of learning a machine from its input-output behavior (see computational learning theory) and studying decidability questions.

    == Definition ==

    A commutative semiring (or rig) is a set R equipped with two distinguished elements 0 ≠ 1 and addition and multiplication operations ⊕ and ⊗ such that ⊕ is commutative and associative with identity 0, ⊗ is commutative and associative with identity 1, ⊗ distributes over ⊕, and 0 is an absorbing element for ⊗.

    A weighted automaton over R is a tuple A = (Q, Σ, Δ, I, F) where:

      • Q is a finite set of states.
      • Σ is a finite alphabet.
      • Δ ⊆ Q × Σ × R × Q is a finite set of transitions (q, σ, w, q'), where σ is called a character and w is called a weight.
      • I : Q → R is an initial weight function.
      • F : Q → R is a final weight function.

    A path on input w ∈ Σ* is a finite path in the graph, where the concatenation of the character labels equals w. The weight of the path q0, q1, …, qn is the product (⊗) of the weights along the path, additionally multiplied by the initial and final weights I(q0) ⊗ F(qn). The weight of the word w is the sum (⊕) of the weights of all paths on input w (or 0 if there are no accepting paths). In this way the machine defines a function ⟦A⟧ : Σ* → R.

    == Ambiguity and determinism ==

    Since Δ is a set of transitions, weighted automata allow multiple transitions (or paths) on a single input string. Therefore a weighted automaton can be considered analogous to a nondeterministic finite automaton (NFA). As is the case with NFAs, restrictions of weighted automata are considered that correspond to the concepts of deterministic finite automaton and unambiguous finite automaton (deterministic weighted automata and unambiguous weighted automata, respectively).
    First, a preliminary definition: the underlying NFA of A is an NFA formed by removing all transitions with weight 0 and then erasing all of the weights on the transitions Δ, so that the new transition set lies in Q × Σ × Q. The initial states and final states are the sets of states q such that I(q) ≠ 0 and F(q) ≠ 0, respectively. A weighted automaton is deterministic if the underlying NFA is deterministic and unambiguous if the underlying NFA is unambiguous. Every deterministic weighted automaton is unambiguous. In both the deterministic and unambiguous cases, there is always at most one accepting path, so the ⊕ operation is never applied and can be omitted from the definition.

    == Variations ==

    The requirement that there is a zero element for ⊕ is sometimes omitted; in this case the machine defines a partial function from Σ* to R rather than a total function.

    It is possible to extend the definition to allow epsilon transitions (q, ε, w, q'), where ε is the empty string. In this case, one must then require that there are no cycles of epsilon transitions. This does not increase the expressiveness of weighted automata. If epsilon transitions are allowed, the initial weights and final weights can be replaced by initial and final sets of states without loss of expressiveness.

    Some authors omit the initial and final weight functions I and F. Instead, I and F are replaced by a set of initial and final states.
    If epsilon transitions are not present, this technically decreases expressiveness, as it forces ⟦A⟧(ε) to depend only on the number of states that are both initial and final.

    The transition function can be given as a matrix Δ_σ ∈ R^(Q×Q) with entries in R for each σ, rather than a set of transitions. The entry of the matrix at (q, q') is the sum of the weights of all transitions labeled (q, σ, q').

    Some authors restrict to specific semirings, such as ℕ or ℤ, particularly when studying decidability results.
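    To make the definition concrete, the weight of a word can be computed without enumerating paths, by pushing a vector of per-state weights through the transitions one character at a time; distributivity of ⊗ over ⊕ makes this equal to the sum over all paths. The Python sketch below is only an illustration: the Semiring tuple, the word_weight function, and the two-state example automaton are names and data invented for this example.

```python
from collections import namedtuple

# A semiring is given by its two operations and their identities.
Semiring = namedtuple("Semiring", "plus times zero one")
NATURALS = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)
BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)


def word_weight(sr, states, transitions, initial, final, word):
    """Sum over all paths on `word` of the product of weights along the path."""
    # v[q] holds the total weight of all paths read so far that end in q.
    v = {q: initial.get(q, sr.zero) for q in states}
    for ch in word:
        nxt = {q: sr.zero for q in states}
        for src, sym, w, dst in transitions:
            if sym == ch:
                nxt[dst] = sr.plus(nxt[dst], sr.times(v[src], w))
        v = nxt
    total = sr.zero
    for q in states:
        total = sr.plus(total, sr.times(v[q], final.get(q, sr.zero)))
    return total


# Hypothetical example: over the naturals, this automaton has one
# accepting path per occurrence of "a", so it counts the letter "a".
states = ["q0", "q1"]
delta = [
    ("q0", "a", 1, "q0"), ("q0", "b", 1, "q0"),
    ("q0", "a", 1, "q1"),
    ("q1", "a", 1, "q1"), ("q1", "b", 1, "q1"),
]
count_a = word_weight(NATURALS, states, delta, {"q0": 1}, {"q1": 1}, "abaa")
print(count_a)  # 3

# The same graph over the Boolean semiring is an NFA for "contains an a".
bool_delta = [(s, c, True, d) for s, c, _, d in delta]
has_a = word_weight(BOOLEAN, states, bool_delta, {"q0": True}, {"q1": True}, "abaa")
print(has_a)  # True
```

    Swapping the semiring while keeping the same transition graph changes what the automaton computes, which is the sense in which NFAs are weighted automata over the Boolean semiring.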

    Read more →
  • Apache Parquet

    Apache Parquet is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem, inspired by Google's Dremel interactive ad-hoc query system for analysis of read-only nested data. It is similar to RCFile and ORC, the other columnar-storage file formats in Hadoop, and is compatible with most of the data processing frameworks around Hadoop. It provides data compression and encoding schemes with enhanced performance to handle complex data in bulk.

    == History ==

    The open-source project to build Apache Parquet began as a joint effort between Twitter and Cloudera, using the record shredding and assembly algorithm described in Google's Dremel. Parquet was designed as an improvement on the Trevni columnar storage format created by Doug Cutting, the creator of Hadoop. The name 'parquet' (lit. 'small compartment') refers to a style of decorative flooring and was chosen to "evoke the bottom layer of a database with an interesting layout". The first version, Apache Parquet 1.0, was released in July 2013. Since April 27, 2015, Apache Parquet has been a top-level Apache Software Foundation (ASF)-sponsored project.

    == Features ==

    Apache Parquet is implemented using the record-shredding and assembly algorithm, which accommodates the complex data structures that can be used to store data. The values in each column are stored in contiguous memory locations, providing the following benefits:

      • Column-wise compression is efficient in storage space
      • Encoding and compression techniques specific to the type of data in each column can be used
      • Queries that fetch specific column values need not read the entire row, thus improving performance

    Apache Parquet is implemented using the Apache Thrift framework, which increases its flexibility; it can work with a number of programming languages such as C++, Java, Python, and PHP.
    As of August 2015, Parquet supports big-data-processing frameworks including Apache Hive, Apache Drill, Apache Impala, Apache Crunch, Apache Pig, Cascading, Presto and Apache Spark. It is one of the external data formats used by the pandas Python data manipulation and analysis library.

    == Compression and encoding ==

    In Parquet, compression is performed column by column, which enables different encoding schemes to be used for text and integer data. This strategy also keeps the door open for newer and better encoding schemes to be implemented as they are invented. Parquet supports various compression formats: snappy, gzip, LZO, brotli, zstd, and LZ4.

    === Dictionary encoding ===

    Parquet has an automatic dictionary encoding, enabled dynamically for data with a small number of unique values (i.e. below 10^5), that enables significant compression and boosts processing speed.

    === Bit packing ===

    Integers are usually stored with a dedicated 32 or 64 bits per integer. For small integers, packing multiple integers into the same space makes storage more efficient.

    === Run-length encoding (RLE) ===

    To optimize storage of multiple occurrences of the same value, run-length encoding is used, in which a single value is stored once along with the number of occurrences. Parquet implements a hybrid of bit packing and RLE, in which the encoding switches based on which produces the best compression results. This strategy works well for certain types of integer data and combines well with dictionary encoding.

    == Cloud Storage and Data Lakes ==

    Parquet is widely used as the underlying file format in modern cloud-based data lake architectures. Cloud storage systems such as Amazon S3, Azure Data Lake Storage, and Google Cloud Storage commonly store data in Parquet format due to its efficient columnar representation and retrieval capabilities.
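    As an aside on the encodings described under "Compression and encoding", the two basic ideas (run-length encoding and bit packing) can be illustrated in a few lines of Python. This is a conceptual sketch only and does not reproduce Parquet's actual on-disk layout or its hybrid switching logic; the function names and the little-endian bit order are assumptions chosen for this example.

```python
def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [v for v, n in runs for _ in range(n)]


def bit_pack(values, width):
    """Pack non-negative integers into `width` bits each (little-endian)."""
    acc = 0
    for i, v in enumerate(values):
        if not 0 <= v < (1 << width):
            raise ValueError(f"{v} does not fit in {width} bits")
        acc |= v << (i * width)
    nbytes = (len(values) * width + 7) // 8
    return acc.to_bytes(nbytes, "little")


def bit_unpack(data, width, count):
    """Recover `count` integers of `width` bits each from packed bytes."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    return [(acc >> (i * width)) & mask for i in range(count)]


column = [7, 7, 7, 7, 2, 2, 9]
runs = rle_encode(column)
print(runs)  # [(7, 4), (2, 2), (9, 1)]

small = [1, 2, 3, 0]
packed = bit_pack(small, 2)  # four 2-bit values fit in a single byte
print(len(packed), bit_unpack(packed, 2, len(small)))  # 1 [1, 2, 3, 0]
```

    Run-length encoding pays off on long runs of equal values, while bit packing pays off when all values are small; a hybrid scheme picks whichever is cheaper for a given stretch of data.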
    Data lakehouse frameworks, including Apache Iceberg, Delta Lake, and Apache Hudi, build an additional metadata layer on top of Parquet files to support features such as schema evolution, time-travel queries, and ACID-compliant transactions. In these architectures, Parquet files serve as the immutable storage layer while the table formats manage data versioning and transactional integrity.

    == Comparison ==

    Apache Parquet is comparable to the RCFile and Optimized Row Columnar (ORC) file formats: all three fall under the category of columnar data storage within the Hadoop ecosystem. They all have better compression and encoding with improved read performance at the cost of slower writes. In addition to these features, Apache Parquet supports limited schema evolution, i.e., the schema can be modified according to the changes in the data. It also provides the ability to add new columns and merge schemas that do not conflict.

    Apache Arrow is designed as an in-memory complement to on-disk columnar formats like Parquet and ORC. The Arrow and Parquet projects include libraries that allow for reading and writing between the two formats.

    == Implementations ==

    Known implementations of Parquet include:

    Read more →
  • How to Choose an AI Essay Writer

    Shopping for the best AI essay writer? An AI essay writer is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI essay writer slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • How to Choose an AI Writing Assistant

    Comparing the best AI writing assistant? An AI writing assistant is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI writing assistant slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • AI Writing Assistants: Free vs Paid (2026)

    Curious about the best AI writing assistant? An AI writing assistant is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI writing assistant slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • Multiple buffering

    In computer science, multiple buffering is the use of more than one buffer to hold a block of data, so that a "reader" will see a complete (though perhaps old) version of the data instead of a partially updated version of the data being created by a "writer". It is very commonly used for computer display images. It is also used to avoid the need for dual-ported RAM (DPRAM) when the readers and writers are different devices.

    == Description ==

    === Double buffering Petri net ===

    The Petri net in the illustration shows double buffering. Transitions W1 and W2 represent writing to buffers 1 and 2 respectively, while R1 and R2 represent reading from buffers 1 and 2 respectively. At the beginning, only the transition W1 is enabled. After W1 fires, R1 and W2 are both enabled and can proceed in parallel. When they finish, R2 and W1 proceed in parallel, and so on. After the initial transient in which W1 fires alone, the system is periodic and the transitions are always enabled in pairs (R1 with W2 and R2 with W1 respectively).

    == Double buffering in computer graphics ==

    In computer graphics, double buffering is a technique for drawing graphics that shows less stutter, tearing, and other artifacts. It is difficult for a program to draw a display so that pixels do not change more than once. For instance, when updating a page of text, it is much easier to clear the entire page and then draw the letters than to somehow erase only the pixels that are used in old letters but not in new ones. However, this intermediate image is seen by the user as flickering. In addition, computer monitors constantly redraw the visible video page (traditionally around 60 times a second), so even a perfect update may be visible momentarily as a horizontal divider between the "new" image and the un-redrawn "old" image, known as tearing.
    === Software double buffering ===

    A software implementation of double buffering has all drawing operations store their results in some region of system RAM; any such region is often called a "back buffer". When all drawing operations are considered complete, the whole region (or only the changed portion) is copied into the video RAM (the "front buffer"); this copying is usually synchronized with the monitor's raster beam in order to avoid tearing. Software implementations of double buffering necessarily require more memory and CPU time than single buffering because of the system memory allocated for the back buffer, the time for the copy operation, and the time waiting for synchronization.

    Compositing window managers often combine the "copying" operation with "compositing" used to position windows, transform them with scale or warping effects, and make portions transparent. Thus, the "front buffer" may contain only the composite image seen on the screen, while there is a different "back buffer" for every window containing the non-composited image of the entire window contents.

    === Page flipping ===

    In the page-flip method, instead of copying the data, both buffers are capable of being displayed. At any one time, one buffer is actively being displayed by the monitor, while the other, background buffer is being drawn. When the background buffer is complete, the roles of the two are switched. The page-flip is typically accomplished by modifying a hardware register in the video display controller: the value of a pointer to the beginning of the display data in the video memory. The page-flip is much faster than copying the data and can guarantee that tearing will not be seen as long as the pages are switched over during the monitor's vertical blanking interval, the blank period when no video data is being drawn. The currently active and visible buffer is called the front buffer, while the background page is called the back buffer.
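    The page-flip idea can be mimicked in ordinary Python by swapping two references instead of copying data. The sketch below is a toy model, not a graphics API: the DoubleBuffer class and its method names are invented for this example, and the "screen" is just a list of characters.

```python
class DoubleBuffer:
    """Toy text 'display' with a visible front buffer and a hidden back buffer."""

    def __init__(self, size):
        self.front = [" "] * size  # what the reader (the display) sees
        self.back = [" "] * size   # where the writer draws

    def draw(self, text):
        # Clear, then draw: the cleared intermediate state exists only in
        # the back buffer, so the reader never observes it.
        for i in range(len(self.back)):
            self.back[i] = " "
        for i, ch in enumerate(text[:len(self.back)]):
            self.back[i] = ch

    def flip(self):
        # Page flip: exchange the roles of the two buffers without copying.
        self.front, self.back = self.back, self.front

    def visible(self):
        return "".join(self.front)


screen = DoubleBuffer(5)
screen.draw("hello")
before_flip = screen.visible()  # still blank: drawing happened off-screen
screen.flip()
after_flip = screen.visible()   # the finished frame appears all at once
screen.draw("bye")
during_draw = screen.visible()  # old frame stays visible while drawing
screen.flip()
print(repr(before_flip), repr(after_flip), repr(during_draw), repr(screen.visible()))
```

    In real hardware the swap would be timed to the vertical blanking interval; the model above leaves that timing out.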
== Triple buffering == In computer graphics, triple buffering is similar to double buffering but can provide improved performance. In double buffering, the program must wait until the finished drawing is copied or swapped before starting the next drawing. This waiting period could be several milliseconds during which neither buffer can be touched. In triple buffering, the program has two back buffers and can immediately start drawing in the one that is not involved in such copying. The third buffer, the front buffer, is read by the graphics card to display the image on the monitor. Once the image has been sent to the monitor, the front buffer is flipped with (or copied from) the back buffer holding the most recent complete image. Since one of the back buffers is always complete, the graphics card never has to wait for the software to complete. Consequently, the software and the graphics card are completely independent and can run at their own pace. Finally, the displayed image was started without waiting for synchronization and thus with minimum lag. Because the software algorithm does not poll the graphics hardware for monitor refresh events, it may continuously draw additional frames as fast as the hardware can render them. For frames that are completed much faster than the interval between refreshes, it is possible to replace a back buffer's frame with newer iterations multiple times before copying. This means frames may be written to the back buffer that are never used at all before being overwritten by successive frames. Nvidia has implemented this method under the name "Fast Sync". An alternative method sometimes referred to as triple buffering is a swap chain three buffers long. After the program has drawn both back buffers, it waits until the first one is placed on the screen before drawing another back buffer (i.e. it is a 3-long first in, first out queue). Most Windows games seem to refer to this method when enabling triple buffering.
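The first variant described above, in which the renderer never waits and only the newest complete frame is shown, can be sketched as follows. This is a simplified model and not any driver's actual implementation: the `TripleBuffer` class, the slot bookkeeping and the `submit`/`vblank` names are assumptions made for illustration, and frames are plain Python values.

```python
class TripleBuffer:
    """One front slot plus two back slots; the renderer never blocks."""

    def __init__(self):
        self.slots = [None, None, None]
        self.front = 0       # slot currently being displayed
        self.drawing = 1     # back slot the renderer writes into next
        self.spare = 2       # the other back slot
        self.fresh = False   # True if `spare` holds a complete, unshown frame
        self.dropped = 0     # complete frames overwritten without being shown

    def submit(self, frame):
        """Renderer finished a frame: store it and move on immediately."""
        self.slots[self.drawing] = frame
        if self.fresh:
            # The older complete frame will be overwritten before display.
            self.dropped += 1
        self.drawing, self.spare = self.spare, self.drawing
        self.fresh = True

    def vblank(self):
        """Display refresh: flip to the newest complete frame, if there is one."""
        if self.fresh:
            self.front, self.spare = self.spare, self.front
            self.fresh = False
        return self.slots[self.front]


tb = TripleBuffer()
for frame in ("f1", "f2", "f3"):   # three frames rendered within one refresh
    tb.submit(frame)
shown = tb.vblank()                # only the newest frame reaches the screen
```

The three slot indices always stay distinct, so the renderer never writes into the slot being displayed. The queue-style variant (a three-long swap chain) would instead block in `submit` once both back slots hold undisplayed frames.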
== Quad buffering == Quad buffering is the use of double buffering for each of the left and right eye images in stereoscopic implementations, giving four buffers in total (if triple buffering were used, there would be six buffers). The command to swap or copy the buffer typically applies to both pairs at once, so at no time does one eye see an older image than the other eye. Quad buffering requires special support in the graphics card drivers which is disabled for most consumer cards. AMD's Radeon HD 6000 Series and newer support it. 3D standards like OpenGL and Direct3D support quad buffering. == Double buffering for DMA == The term double buffering is also used for copying data between two buffers for direct memory access (DMA) transfers, not to enhance performance, but to meet specific addressing requirements of a device (particularly 32-bit devices on systems with wider addressing provided via Physical Address Extension). Windows device drivers are a place where the term "double buffering" is likely to be used. Linux and BSD source code calls these "bounce buffers". Some programmers try to avoid this kind of double buffering with zero-copy techniques. == Other uses == Double buffering is also used as a technique to facilitate interlacing or deinterlacing of video signals.

  • Best AI Chatbots in 2026

    Best AI Chatbots in 2026

    Curious about the best AI chatbot? An AI chatbot is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI chatbot slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

  • Associative classifier

    Associative classifier

    An associative classifier (AC) is a kind of supervised learning model that uses association rules to assign a target value. The term associative classification was coined by Bing Liu et al., in which the authors defined a model made of rules "whose right-hand side are restricted to the classification class attribute". == Model == The model generated by an AC and used to label new records consists of association rules, where the consequent corresponds to the class label. As such, they can also be seen as a list of "if-then" clauses: if the record matches some criteria (expressed in the left side of the rule, also called the antecedent), it is then labeled according to the class on the right side of the rule (the consequent). Most ACs read the list of rules in order, and apply the first matching rule to label the new record. == Metrics == The rules of an AC inherit some of the metrics of association rules, like the support or the confidence. Metrics can be used to order or filter the rules in the model and to evaluate their quality. == Implementations == The first proposal of a classification model made of association rules was FBM. The approach was popularized by CBA, although other authors had also previously proposed the mining of association rules for classification. Other authors have since proposed multiple changes to the initial model, like the addition of a redundant rule pruning phase or the exploitation of Emerging Patterns. Notable implementations include CMAR, CPAR, L3, CAEP, GARC and ADT.
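The "first matching rule wins" behavior described under Model can be sketched as below. This is a toy illustration, not the algorithm of CBA or any other named implementation: the `Rule` class, the ordering by confidence and then support, the fallback `default` label, and the example rules are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    antecedent: dict    # attribute -> required value (the "if" part)
    label: str          # class label (the consequent)
    support: float
    confidence: float

    def matches(self, record):
        return all(record.get(k) == v for k, v in self.antecedent.items())


def classify(rules, record, default):
    # Assumed ordering: highest confidence first, ties broken by support.
    ordered = sorted(rules, key=lambda r: (-r.confidence, -r.support))
    for rule in ordered:
        if rule.matches(record):
            return rule.label   # the first matching rule labels the record
    return default              # assumed fallback when no rule matches


# Hypothetical rules, as if mined from a small weather data set.
rules = [
    Rule({"outlook": "sunny"}, "play", support=0.4, confidence=0.7),
    Rule({"outlook": "sunny", "windy": True}, "stay", support=0.2, confidence=0.9),
]
label = classify(rules, {"outlook": "sunny", "windy": True}, default="unknown")
```

Because the more specific rule has higher confidence, it is tried first and labels the windy record, while a calm sunny record falls through to the general rule.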

  • Machine-readable medium and data

    Machine-readable medium and data

    In communications and computing, a machine-readable medium (or computer-readable medium) is a medium capable of storing data in a format easily readable by a digital computer or a sensor. It contrasts with human-readable medium and data. The result is called machine-readable data or computer-readable data, and the data itself can be described as having machine-readability. == Data == Machine-readable data must be structured data. Attempts to create machine-readable data occurred as early as the 1960s. At the same time that seminal developments in machine-reading and natural-language processing were being released (like Weizenbaum's ELIZA), people were anticipating the success of machine-readable functionality and attempting to create machine-readable documents. One such example was musicologist Nancy B. Reich's creation of a machine-readable catalog of composer William Jay Sydeman's works in 1966. In the United States, the OPEN Government Data Act of 14 January 2019 defines machine-readable data as "data in a format that can be easily processed by a computer without human intervention while ensuring no semantic meaning is lost." The law directs U.S. federal agencies to publish public data in such a manner, ensuring that "any public data asset of the agency is machine-readable". Machine-readable data may be classified into two groups: human-readable data that is marked up so that it can also be read by machines (e.g. microformats, RDFa, HTML), and data file formats intended principally for processing by machines (CSV, RDF, XML, JSON). These formats are only machine-readable if the data contained within them is formally structured; exporting a CSV file from a badly structured spreadsheet does not meet the definition. Machine-readable is not synonymous with digitally accessible. 
A digitally accessible document may be online, making it easier for humans to access via computers, but its content is much harder to extract, transform, and process via computer programming logic if it is not machine-readable. Extensible Markup Language (XML) is designed to be both human- and machine-readable, and Extensible Stylesheet Language Transformations (XSLT) is used to improve the presentation of the data for human readability. For example, XSLT can be used to automatically render XML in Portable Document Format (PDF). Machine-readable data can be automatically transformed for human-readability but, generally speaking, the reverse is not true. For purposes of implementation of the Government Performance and Results Act (GPRA) Modernization Act, the Office of Management and Budget (OMB) defines "machine readable format" as follows: "Format in a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g.; xml). Traditional word processing documents and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. Other formats such as extensible markup language (XML), (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. As HTML is a structural markup language, discreetly labeling parts of the document, computers are able to gather document components to assemble tables of contents, outlines, literature search bibliographies, etc. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements." == Media == Examples of machine-readable media include magnetic media such as magnetic disks, cards, tapes, and drums, punched cards and paper tapes, optical discs, barcodes and magnetic ink characters. 
Common machine-readable technologies include magnetic recording, processing waveforms, and barcodes. Optical character recognition (OCR) can be used to enable machines to read information available to humans. Any information retrievable by any form of energy can be machine-readable. Examples include: acoustics; chemical (photochemical); electrical (semiconductors used in volatile RAM microchips, floating-gate transistors used in non-volatile memory cards); radio transmission; magnetic storage; mechanical, using either pins and holes (punched card, paper tape, music roll, music box cylinder or disk) or grooves (phonograph cylinder, gramophone record, DictaBelt groove on a plastic belt, Capacitance Electronic Disc; see also audio data); optics (optical storage); and thermodynamic. == Applications == === Documents === === Catalogs === === Dictionaries === === Passports ===

  • Sycophancy (artificial intelligence)

    Sycophancy (artificial intelligence)

    In the field of artificial intelligence, sycophancy is a tendency of large language models (LLMs) and other AI assistants to tailor their responses to what they predict the user wants to hear rather than to what is accurate or warranted. The behavior takes several forms: an assistant may agree with a user's stated opinion even when the user is mistaken; it may abandon a correct answer after a challenge such as "are you sure?"; it may validate beliefs, decisions or self-presentation regardless of merit; or it may praise the user, their work or their ideas in unwarranted terms. The word is borrowed from the ordinary English term for fawning flattery, and is used in AI alignment and AI safety research to describe a class of misalignment failures associated with training on human feedback. Researchers at Anthropic first documented the behavior systematically in 2022. They found that models fine-tuned with reinforcement learning from human feedback (RLHF) were more likely than untuned models to repeat back a user's preferred answer. A 2023 follow-up paper, "Towards Understanding Sycophancy in Language Models", showed that five frontier assistants from OpenAI, Anthropic and Meta all exhibited the behavior, and traced its origin to biases in the human preference data used during training. Later work documented sycophancy in mathematics, medicine, academic peer review and other domains, and identified a broader category called "social sycophancy" affecting an assistant's emotional and interpersonal responses. The issue drew widespread public attention in April 2025 after OpenAI rolled back an update to its GPT-4o model. Users had reported that the assistant praised dangerous decisions, endorsed delusional thinking and offered exaggerated compliments for trivial prompts. OpenAI's post-mortem attributed the change in behavior to an additional training signal based on user thumbs-up and thumbs-down feedback. 
That episode, together with reporting in The New York Times, Rolling Stone and elsewhere on users drawn into delusional thinking through prolonged chatbot interaction, has been cited in litigation and in academic studies as evidence that sycophancy poses risks to user well-being. Proposed mitigations include fine-tuning on synthetic data that rewards disagreement with incorrect user statements, editing the small subset of model parameters causally responsible for the behavior, changes to the dialogue or system prompt, and benchmarks designed to surface sycophantic behavior before models are released. == Causes == The dominant explanation points to RLHF, the standard technique for aligning chat assistants with user expectations. Human annotators rank candidate model responses; a reward model is trained to predict those rankings; and the language model is then optimized against the reward model. Because human raters tend to prefer outputs that confirm their existing beliefs or flatter their work, the pipeline systematically rewards responses that agree with the annotator. Perez and colleagues at Anthropic published the first large-scale empirical evidence of the effect in 2022. They reported that RLHF training increased the probability that a model would repeat back a dialog user's preferred answer, and that larger models exhibited the behavior more strongly. Sharma and colleagues, the following year, went further and examined Anthropic's own preference data directly. Both the human raters and the reward models trained on their judgments preferred convincingly written sycophantic responses to truthful ones at a non-negligible rate. Wei and co-authors at Google DeepMind found similar results in the PaLM family, observing that both model scale and instruction tuning increased sycophancy on opinion questions. 
The behavior is often classified as a form of reward hacking, in which an optimization process exploits a flaw in its reward signal rather than achieving the intended objective. OpenAI's post-mortem of the April 2025 GPT-4o incident identified a more specific mechanism. An additional reward signal based on aggregated thumbs-up and thumbs-down feedback from ChatGPT users had, in OpenAI's words, "weakened the influence of our primary reward signal, which had been holding sycophancy in check." Separately, an Anthropic interpretability paper from 2025 located a linear direction in a model's internal activations corresponding to sycophantic behavior, and showed that such "persona vectors" could be used to flag sycophancy-inducing training data and to steer models away from the trait at inference time. == Measurement == The Anthropic team released SycophancyEval with its 2023 paper, supplying test sets for each of the four canonical behaviors. Two further benchmarks from Stanford followed in 2025. SycEval, applied to mathematical and medical reasoning tasks, reported an overall sycophancy rate of 58 per cent across the GPT-4o, Claude and Gemini models tested. ELEPHANT, aimed at social sycophancy, found that the eleven LLMs evaluated affirmed posts that the Reddit community r/AmITheAsshole had judged inappropriate in 42 per cent of cases, and preserved a user's face 45 percentage points more often than human respondents did. Domain-specific benchmarks have followed. BrokenMath tests robustness to plausible-looking but false mathematical claims drawn from competition problems, and reports that the best evaluated model was sycophantic in 29 per cent of cases. SYCON-Bench measures how many dialogue turns are required before a model abandons a correct position. Visual sycophancy in multimodal models has been examined with MM-SY and PENDULUM. 
A 2026 study by researchers at the Massachusetts Institute of Technology reported that personalization features, which adapt assistants to individual users over repeated sessions, can intensify social sycophancy. == Notable incidents == === GPT-4o rollback (April 2025) === On 25 April 2025, OpenAI completed the rollout of an update to GPT-4o, the default model used in ChatGPT at the time. Within days, users reported that the assistant had begun praising trivial messages in extravagant terms, endorsing impulsive or dangerous decisions, and reinforcing strong emotional statements without pushback. Widely shared examples included the model congratulating a user who reported stopping prescribed psychiatric medication, and praising a business plan to sell "shit on a stick" as venture-capital ready. OpenAI's chief executive, Sam Altman, wrote on 27 April that recent updates had made the model "too sycophant-y and annoying" and said fixes were in progress. The company began reverting the update on 28 April and completed the rollback for free users by 30 April. Two post-mortems followed: a short note on 29 April and a longer technical follow-up, "Expanding on what we missed with sycophancy", on 2 May. Both attributed the regression to a new training signal based on user thumbs-up and thumbs-down feedback, to inadequate pre-launch evaluation for sycophantic drift, and to the dismissal of qualitative concerns raised by internal testers before release. Reporting in CNN, Fortune and Bloomberg News treated the incident as a turning point in public awareness of the problem. === Chatbot-related psychological harm === From mid-2025 onward, news reports began to link sycophantic chatbot behavior to acute psychological harm. 
In June 2025, The New York Times technology reporter Kashmir Hill published an investigation centered on Eugene Torres, a Manhattan accountant with no history of mental illness, who developed a sustained delusional episode after a series of conversations with ChatGPT about simulation theory. According to the article, the assistant encouraged Torres to stop taking prescribed medication, to cut off friends and family, and at one point told him that he could fly from a nineteen-story building if he "truly believed". Futurism and Rolling Stone ran parallel investigations documenting other cases in which heavy use of ChatGPT had been associated with delusional thinking, involuntary commitment or, in at least one case, the death of a user with a pre-existing psychiatric diagnosis. A 2026 paper by researchers at the Massachusetts Institute of Technology and the University of Washington put forward a formal Bayesian model. It showed that even an ideally rational user could be drawn into what the authors call "delusional spiraling" when interacting with a sufficiently sycophantic assistant, and that the effect was not eliminated by suppressing hallucinations or by warning users in advance. The lawsuit Raine v. OpenAI, filed in San Francisco Superior Court in August 2025 by the parents of a sixteen-year-old who had died by suicide, alleges that "heightened sycophancy" was a design feature of ChatGPT that contributed to their son's death; it is the first wrongful-death suit against a large language-model provider. === Wider commentary === Mainstream coverage in outlets including The New York Times, The Washington Pos

  • Optical braille recognition

    Optical braille recognition

    Optical braille recognition is technology to capture and process images of braille characters into natural language characters. It is used to convert braille documents for people who cannot read them into text, and for preservation and reproduction of the documents. == History == In 1984, a group of researchers at the Delft University of Technology designed a braille reading tablet, in which a reading head with photosensitive cells was moved along a set of rulers to capture braille text line by line. In 1988, a group of French researchers at the Lille University of Science and Technology developed an algorithm, called Lectobraille, which converted braille documents into plain text. The system photographed the braille text with a low-resolution CCD camera, and used spatial filtering techniques, median filtering, erosion, and dilation to extract the braille. The braille characters were then converted to natural language using adaptive recognition. The Lectobraille technique had an error rate of 1%, and took an average processing time of seven seconds per line. In 1993, a group of researchers from the Katholieke Universiteit Leuven developed a system to recognize braille that had been scanned with a commercially available scanner. The system, however, was unable to handle deformities in the braille grid, so well-formed braille documents were required. In 1999, a group at the Hong Kong Polytechnic University implemented an optical braille recognition technique using edge detection to translate braille into English or Chinese text. In 2001, Murray and Dais created a handheld recognition system that scanned small sections of a document at a time. Because of the small area scanned at once, grid deformation was less of an issue, and a simpler, more efficient algorithm was employed. In 2003, Morgavi and Morando designed a system to recognize braille characters using artificial neural networks. 
This system was noted for its ability to handle image degradation more successfully than other approaches. == Challenges == Many of the challenges to successfully processing braille text arise from the nature of braille documents. Braille is generally printed on solid-color paper, with no ink to produce contrast between the raised characters and the background paper. However, imperfections in the page can appear in a scan or image of the page. Many documents are printed inter-point, meaning they are double-sided. As such, the depressions of the braille of one side appear interlaid with the protruding braille of the other side. == Techniques == Some optical braille recognition techniques attempt to use oblique lighting and a camera to reveal the shadows of the depressions and protrusions of the braille. Others make use of commercially available document scanners.

  • Two-way finite automaton

    Two-way finite automaton

    In computer science, in particular in automata theory, a two-way finite automaton is a finite automaton that is allowed to re-read its input. == Two-way deterministic finite automaton == A two-way deterministic finite automaton (2DFA) is an abstract machine, a generalized version of the deterministic finite automaton (DFA) which can revisit characters already processed. As in a DFA, there are a finite number of states with transitions between them based on the current character, but each transition is also labelled with a value indicating whether the machine will move its position in the input to the left, right, or stay at the same position. Equivalently, 2DFAs can be seen as read-only Turing machines with no work tape, only a read-only input tape. 2DFAs were introduced in a seminal 1959 paper by Rabin and Scott, who proved them to have equivalent power to one-way DFAs. That is, any formal language which can be recognized by a 2DFA can be recognized by a DFA which only examines and consumes each character in order. Since DFAs are obviously a special case of 2DFAs, this implies that both kinds of machines recognize precisely the class of regular languages. However, the equivalent DFA for a 2DFA may require exponentially many states, making 2DFAs a much more practical representation for algorithms for some common problems. 2DFAs are also equivalent to read-only Turing machines that use only a constant amount of space on their work tape, since any constant amount of information can be incorporated into the finite control state via a product construction (a state for each combination of work tape state and control state). 
== Formal description == Formally, a two-way deterministic finite automaton can be described by the following 8-tuple: $M=(Q,\Sigma ,L,R,\delta ,s,t,r)$ where $Q$ is the finite, non-empty set of states; $\Sigma$ is the finite, non-empty set of input symbols; $L$ is the left endmarker; $R$ is the right endmarker; $\delta :Q\times (\Sigma \cup \{L,R\})\rightarrow Q\times \{\mathrm{left},\mathrm{right}\}$ is the transition function; $s$ is the start state; $t$ is the end (accept) state; and $r$ is the reject state. In addition, the following two conditions must also be satisfied. First, for all $q\in Q$, $\delta (q,L)=(q^{\prime },\mathrm{right})$ for some $q^{\prime }\in Q$, and $\delta (q,R)=(q^{\prime },\mathrm{left})$ for some $q^{\prime }\in Q$. This says that there must be some transition possible when the pointer reaches either end of the input word. Second, for all symbols $\sigma \in \Sigma \cup \{L\}$, $\delta (t,\sigma )=(t,\mathrm{right})$ and $\delta (r,\sigma )=(r,\mathrm{right})$, while $\delta (t,R)=(t,\mathrm{left})$ and $\delta (r,R)=(r,\mathrm{left})$. This says that once the automaton reaches the accept or reject state, it stays there forever, and the pointer moves to the rightmost symbol and cycles there indefinitely. == Two-way nondeterministic finite automaton == A two-way nondeterministic finite automaton (2NFA) may have multiple transitions defined in the same configuration. 
Its transition function is $\delta :Q\times (\Sigma \cup \{L,R\})\rightarrow 2^{Q\times \{\mathrm{left},\mathrm{right}\}}$. Like a standard one-way NFA, a 2NFA accepts a string if at least one of the possible computations is accepting. Like the 2DFAs, the 2NFAs also accept only regular languages. == Two-way alternating finite automaton == A two-way alternating finite automaton (2AFA) is a two-way extension of an alternating finite automaton (AFA). Its state set is $Q=Q_{\exists }\cup Q_{\forall }$ where $Q_{\exists }\cap Q_{\forall }=\emptyset$. States in $Q_{\exists }$ and $Q_{\forall }$ are called existential and universal, respectively. In an existential state, a 2AFA nondeterministically chooses the next state like an NFA, and accepts if at least one of the resulting computations accepts. In a universal state, a 2AFA moves to all next states, and accepts if all the resulting computations accept. == State complexity tradeoffs == Two-way and one-way finite automata, whether deterministic, nondeterministic or alternating, accept the same class of regular languages. However, transforming an automaton of one type to an equivalent automaton of another type incurs a blow-up in the number of states. Christos Kapoutsis determined that transforming an $n$-state 2DFA to an equivalent DFA requires $n(n^{n}-(n-1)^{n})$ states in the worst case. If an $n$-state 2DFA or 2NFA is transformed to an NFA, the worst-case number of states required is $\binom{2n}{n+1}=O\left(\frac{4^{n}}{\sqrt{n}}\right)$. Ladner, Lipton and Stockmeyer proved that an $n$-state 2AFA can be converted to a DFA with $2^{n2^{n}}$ states. 
The 2AFA to NFA conversion requires $2^{\Theta (n\log n)}$ states in the worst case; see Geffert and Okhotin. It is an open problem whether every 2NFA can be converted to a 2DFA with only a polynomial increase in the number of states. The problem was raised by Sakoda and Sipser, who compared it to the P vs. NP problem in computational complexity theory. Berman and Lingas discovered a formal relation between this problem and the L vs. NL open problem; see Kapoutsis for a precise relation. == Sweeping automata == Sweeping automata are 2DFAs of a special kind that process the input string by making alternating left-to-right and right-to-left sweeps, turning only at the endmarkers. Sipser constructed a sequence of languages, each accepted by an $n$-state NFA, yet not accepted by any sweeping automaton with fewer than $2^{n}$ states. == Two-way quantum finite automaton == The concept of 2DFAs was generalized to quantum computing in 1997 by John Watrous's "On the Power of 2-Way Quantum Finite State Automata", in which he demonstrates that these machines can recognize nonregular languages and so are more powerful than DFAs. == Two-way pushdown automaton == A pushdown automaton that is allowed to move either way on its input tape is called a two-way pushdown automaton (2PDA); it has been studied by Hartmanis, Lewis, and Stearns (1965). Aho, Hopcroft, and Ullman (1968) and Cook (1971) characterized the class of languages recognizable by deterministic (2DPDA) and non-deterministic (2NPDA) two-way pushdown automata; Gray, Harrison, and Ibarra (1967) investigated the closure properties of these languages.
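To make the formal description of a 2DFA concrete, here is a small simulator sketch in Python. Everything specific in it is an assumption made for illustration: the endmarker characters `<` and `>` stand in for L and R, moves are encoded as -1 (left) and +1 (right), the state names are invented, and the example machine (accepting words over {a, b} with an even number of a's and a number of b's divisible by three) is not taken from the article. The simulator stops as soon as the accept or reject state is entered instead of cycling forever, and it treats a missing transition as rejection.

```python
LEFT, RIGHT = "<", ">"   # hypothetical endmarker symbols standing in for L and R


def build_delta():
    """Transition table: (state, symbol) -> (next state, move)."""
    delta = {}
    # Sweep 1, left to right: track the parity of the number of a's.
    for p in (0, 1):
        delta[(f"a{p}", "a")] = (f"a{1 - p}", +1)
        delta[(f"a{p}", "b")] = (f"a{p}", +1)
    delta[("a0", LEFT)] = ("a0", +1)
    delta[("a0", RIGHT)] = ("back", -1)   # even number of a's: rewind
    delta[("a1", RIGHT)] = ("r", -1)      # odd number of a's: reject
    # Sweep 2, right to left: return to the left endmarker.
    for c in "ab":
        delta[("back", c)] = ("back", -1)
    delta[("back", LEFT)] = ("b0", +1)
    # Sweep 3, left to right: count the b's modulo 3.
    for m in (0, 1, 2):
        delta[(f"b{m}", "a")] = (f"b{m}", +1)
        delta[(f"b{m}", "b")] = (f"b{(m + 1) % 3}", +1)
        delta[(f"b{m}", RIGHT)] = ("t" if m == 0 else "r", -1)
    return delta


def accepts(delta, word, start="a0", accept="t", reject="r"):
    tape = LEFT + word + RIGHT
    n_states = len({q for q, _ in delta} | {accept, reject})
    state, head = start, 0
    # A run longer than (#states * tape length) steps must repeat a
    # configuration, so the machine would loop forever: treat it as rejection.
    for _ in range(n_states * len(tape) + 1):
        if state == accept:
            return True
        if state == reject:
            return False
        step = delta.get((state, tape[head]))
        if step is None:          # assumed convention: undefined means reject
            return False
        state, move = step
        head += move
    return False


delta = build_delta()
result = accepts(delta, "aabbb")   # two a's and three b's
```

The machine re-reads its input, which a one-way DFA cannot do. An equivalent DFA for this language would track both counters at once using the product of the two counting components (2 x 3 states), which hints at why two-way automata can be more succinct.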
