AI Humanizer

AI Humanizer — hands-on reviews, top picks, pricing, pros and cons and a practical how-to guide on Aizhi.

  • Fluency Voice Technology

    Fluency Voice Technology

    Fluency Voice Technology was a company that developed and sold packaged speech recognition solutions for use in call centers. Fluency's Speech Recognition solutions are used by call centers worldwide to improve customer service and significantly reduce costs and are available on-premises and hosted. == History == 1998 – Fluency was created as a spin-off from the Voice Research & Development team of a company called netdecisions. This R&D operation was established in Cambridge UK. The focus of the development was speech recognition systems based on the VXML standard. 2001 – Fluency became a separate entity in May 2001. Fluency began the creation of a software development platform specifically aimed at automating call center activities. This platform became Fluency's VoiceRunner. 2002 to 2004 – Fluency establishes accomplishes many successful deployments in customer sites such as National Express and Barclaycard. 2003 – Fluency expanded into the USA. Fluency also acquires Vocalis of Cambridge, UK in August 2003. 2004 – Fluency receives £6 million investment from leading European Venture Capitalists and establishes a global OEM partnership with Avaya, and the acquisition of SRC Telecom. 2008 – Fluency is acquired by Syntellect Ltd == Customers == Call Centers around the world use Fluency to improve service and reduce costs. They include Travelodge, Standard Life Bank, Sutton and East Surrey Water, Pizza Hut, CWT, Barclays, Powergen, First Choice, OutRight, J D Williams, Capital Blue Cross, Chelsea Building Society, EDF, bss, TV Licensing and Capita Software Services.

    Read more →
  • VITAL (machine learning software)

    VITAL (machine learning software)

    VITAL (Validating Investment Tool for Advancing Life Sciences) was a Board Management Software machine learning proprietary software developed by Aging Analytics, a company registered in Bristol (England) and dissolved in 2017. Andrew Garazha (the firm's Senior Analyst) declared that the project aimed "through iterative releases and updates to create a piece of software capable of making autonomous investment decisions." According to Nick Dyer-Witheford, VITAL 1.0 was a "basic algorithm". On 13 May 2014, Deep Knowledge Ventures, a Hong Kong venture capital firm, claimed to have appointed VITAL to its board of directors in order to prove that artificial intelligence could be an instrument for investment decision-making. The announcement received great press coverage despite the fact commentators consider this a publicity stunt. Fortune reported in 2019 that VITAL is no longer used. == Criticism == Academics and journalists viewed VITAL's board appointment with skepticism. University of Sheffield computer science professor Noel Sharkey called it "a publicity hype". Michael Osborne, a University of Oxford associate professor in machine learning, found it is "a gimmick to call that an actual board member". Simon Sharwood of The Register, wrote there is "a strong whiff of stunt and/or promotion about this". In a 2019 speech, the Chief Scientist of Australia, Alan Finkel, commented, "At the time, most of us probably dismissed Vital as a PR exercise. I admit, I used her story three years ago to get a laugh in one of my speeches." Florian Möslein, a law professor at the University of Marburg, wrote in 2018 that "Vital has widely been acknowledged as the 'world's first artificial intelligence company director'". Vice journalist Jason Koebler suggested that the software did not have any article intelligence capabilities and concluded "VITAL can’t talk, and it can’t hear, and it can’t be a real, functional executive of a company." Sharwood of The Register noted that because VITAL was not a natural person, it could not be a board member under Hong Kong's corporate governance laws. However, in a 2017 interview to The Nikkei, Dmitry Kaminskiy, managing partner of Deep Knowledge Ventures, stated that VITAL had observer status on the board and no voting rights. University of Sheffield computer science professor Noel Sharkey said of VITAL, "On first sight, it looks like a futuristic idea but on reflection it is really a little bit of publicity hype." Vice journalist Jason Koebler said "this is a gimmick" and said "There is literally nothing to suggest that VITAL has any sort of capabilities beyond any other proprietary analysis software". Michael Osborne, a University of Oxford associate professor in machine learning, found VITAL's appointment to be noncredible, saying it is "a bit of a gimmick to call that an actual board member". Osborne said that a core duty of board members to converse with each other, which the algorithm is incapable of doing, so its more likely functionality is to serve as a springboard for conversation among other board members. In a 2019 speech, the Chief Scientist of Australia, Alan Finkel, commented, "At the time, most of us probably dismissed Vital as a PR exercise. I admit, I used her story three years ago to get a laugh in one of my speeches." == Machine intelligence as board member == VITAL was created by a group of programmers employed by Aging Analytics According to Andrew Garazh, Aging Analytics Senior Analyst, VITAL was not a machine learning algorithm as the necessary datasets on investment rounds, intellectual property and clinical trial outcomes are generally not disclosed. Rather, VITAL used fuzzy logic based on 50 parameters to assess risk factors. Aging Analytics licensed the software to Deep Knowledge Ventures. It was used to help the human board members of Deep Knowledge Venture make investment decisions in biotechnology companies. For instance, it supported investments in Insilico Medicine, which creates ways for computers to help find drugs in research into aging. VITAL also supported investing in Pathway Pharmaceuticals, which uses the OncoFinder algorithm to choose and appraise cancer treatments. According to Dmitry Kaminskiy, managing partner of Deep Knowledge Ventures, the motivation for using VITAL was the large number of failed investments in the biotechnology sector and the desire to avoid investing in companies likely to fail. == Ethical and legal implications == Scholars addressed questions around the safety, privacy, accountability transparency and bias in algorithms. Writing in the philosophical journal Multitudes, the academic Ariel Kyrou raised questions about the consequences of a mistake made by an algorithm recommending a dangerous investment. He raised the hypothetical where VITAL was able to persuade the board to invest in a startup that had the facade of doing research into treatment for age-associated ills, but in actuality was run by terrorists who were raising funds. Kyrou raised a series of questions about who society would fault for VITAL's mistake. As the owner of VITAL, should Deep Knowledge Ventures be held accountable, or rather should the companies that supplied data to VITAL or the people who created VITAL be held liable? Simon Sharwood of The Register wrote that because the appointment of a software program to the board directors is not legally feasible in Hong Kong, there is "a strong whiff of stunt and/or promotion about this". Quoting a Thomson Reuters website describing Hong Kong legislation related to corporate governance, Sharwood pointed out that in Hong Kong "the board comprises all of the directors of the company" and "a director must normally be a natural person, except that a private company may have a body corporate as its director if the company is not a member of a listed group." He concluded that since VITAL cannot be considered a "natural person", it is merely a "cosmetic" appointment to the board and that "this software is no more a Board member than Caligula's horse was a senator". Sharwood further argued that corporations frequently purchase directors and officers liability insurance but that it would be practically impossible to get such insurance for VITAL. Sharwood also wrote that were VITAL to be hacked, any misinformation it outputs could be considered "false and misleading communications". In the book Research Handbook on the Law of Artificial Intelligence, Florian Mölein wrote that VITAL could not become a director as defined in Hong Kong's corporate laws, so the other directors just were approaching it as "a member of [the] board with observer status". Lin Shaowei raised concerns in a Journal of East China University of Political Science and Law article about how the software's appearance inspired a complex question about the relationship between corporate law and artificial intelligence. VITAL could be considered either a board director who has voting rights or an observer who does not. Lin said either choice raised questions about whether VITAL is subject to corporate law and who would be held accountable if VITAL recommends a choice that turns out to be damaging to the company. David Theo Goldberg in the Critical Times, a peer reviewed journal in Critical Global Theory, argues that VITAL processed a dataset to predict the most remunerative investment opportunities. Drawing his analysis on an article from Business Insider, Goldberg describes VITAL's decision-making predictiveness based "on surface pattern recognition and the identification of regularities and/or irregularities". In other words, Goldberg asserts that "the normativity of the surface" explains algorithmic knowledge of a "product" like VITAL. In Homo Deus, Yuval Noah Harari mentions VITAL as an example of the future risks that humankind faces. Harari argues that the human mind is being replaced by a world in which algorithms and data make the decisions. Specifically, it is argued that "as algorithms push humans out of the job market," executive boards driven by artificial intelligence are more likely to give priority to algorithms over the humans.

    Read more →
  • Log-linear model

    Log-linear model

    A log-linear model is a mathematical model that takes the form of a function whose logarithm equals a linear combination of the parameters of the model, which makes it possible to apply (possibly multivariate) linear regression. That is, it has the general form exp ⁡ ( c + ∑ i w i f i ( X ) ) {\displaystyle \exp \left(c+\sum _{i}w_{i}f_{i}(X)\right)} , in which the fi(X) are quantities that are functions of the variable X, in general a vector of values, while c and the wi stand for the model parameters. The term may specifically be used for: A log-linear plot or graph, which is a type of semi-log plot. Poisson regression for contingency tables, a type of generalized linear model. The specific applications of log-linear models are where the output quantity lies in the range 0 to ∞, for values of the independent variables X, or more immediately, the transformed quantities fi(X) in the range −∞ to +∞. This may be contrasted to logistic models, similar to the logistic function, for which the output quantity lies in the range 0 to 1. Thus the contexts where these models are useful or realistic often depends on the range of the values being modelled.

    Read more →
  • Averaged one-dependence estimators

    Averaged one-dependence estimators

    Averaged one-dependence estimators (AODE) is a probabilistic classification learning technique. It was developed to address the attribute-independence problem of the popular naive Bayes classifier. It frequently develops substantially more accurate classifiers than naive Bayes at the cost of a modest increase in the amount of computation. == The AODE classifier == AODE seeks to estimate the probability of each class y given a specified set of features x1, ... xn, P(y | x1, ... xn). To do so it uses the formula P ^ ( y ∣ x 1 , … x n ) = ∑ i : 1 ≤ i ≤ n ∧ F ( x i ) ≥ m P ^ ( y , x i ) ∏ j = 1 n P ^ ( x j ∣ y , x i ) ∑ y ′ ∈ Y ∑ i : 1 ≤ i ≤ n ∧ F ( x i ) ≥ m P ^ ( y ′ , x i ) ∏ j = 1 n P ^ ( x j ∣ y ′ , x i ) {\displaystyle {\hat {P}}(y\mid x_{1},\ldots x_{n})={\frac {\sum _{i:1\leq i\leq n\wedge F(x_{i})\geq m}{\hat {P}}(y,x_{i})\prod _{j=1}^{n}{\hat {P}}(x_{j}\mid y,x_{i})}{\sum _{y^{\prime }\in Y}\sum _{i:1\leq i\leq n\wedge F(x_{i})\geq m}{\hat {P}}(y^{\prime },x_{i})\prod _{j=1}^{n}{\hat {P}}(x_{j}\mid y^{\prime },x_{i})}}} where P ^ ( ⋅ ) {\displaystyle {\hat {P}}(\cdot )} denotes an estimate of P ( ⋅ ) {\displaystyle P(\cdot )} , F ( ⋅ ) {\displaystyle F(\cdot )} is the frequency with which the argument appears in the sample data and m is a user specified minimum frequency with which a term must appear in order to be used in the outer summation. In recent practice m is usually set at 1. == Derivation of the AODE classifier == We seek to estimate P(y | x1, ... xn). By the definition of conditional probability P ( y ∣ x 1 , … x n ) = P ( y , x 1 , … x n ) P ( x 1 , … x n ) . {\displaystyle P(y\mid x_{1},\ldots x_{n})={\frac {P(y,x_{1},\ldots x_{n})}{P(x_{1},\ldots x_{n})}}.} For any 1 ≤ i ≤ n {\displaystyle 1\leq i\leq n} , P ( y , x 1 , … x n ) = P ( y , x i ) P ( x 1 , … x n ∣ y , x i ) . {\displaystyle P(y,x_{1},\ldots x_{n})=P(y,x_{i})P(x_{1},\ldots x_{n}\mid y,x_{i}).} Under an assumption that x1, ... xn are independent given y and xi, it follows that P ( y , x 1 , … x n ) = P ( y , x i ) ∏ j = 1 n P ( x j ∣ y , x i ) . {\displaystyle P(y,x_{1},\ldots x_{n})=P(y,x_{i})\prod _{j=1}^{n}P(x_{j}\mid y,x_{i}).} This formula defines a special form of One Dependence Estimator (ODE), a variant of the naive Bayes classifier that makes the above independence assumption that is weaker (and hence potentially less harmful) than the naive Bayes' independence assumption. In consequence, each ODE should create a less biased estimator than naive Bayes. However, because the base probability estimates are each conditioned by two variables rather than one, they are formed from less data (the training examples that satisfy both variables) and hence are likely to have more variance. AODE reduces this variance by averaging the estimates of all such ODEs. == Features of the AODE classifier == Like naive Bayes, AODE does not perform model selection and does not use tuneable parameters. As a result, it has low variance. It supports incremental learning whereby the classifier can be updated efficiently with information from new examples as they become available. It predicts class probabilities rather than simply predicting a single class, allowing the user to determine the confidence with which each classification can be made. Its probabilistic model can directly handle situations where some data are missing. AODE has computational complexity O ( l n 2 ) {\displaystyle O(ln^{2})} at training time and O ( k n 2 ) {\displaystyle O(kn^{2})} at classification time, where n is the number of features, l is the number of training examples and k is the number of classes. This makes it infeasible for application to high-dimensional data. However, within that limitation, it is linear with respect to the number of training examples and hence can efficiently process large numbers of training examples. == Implementations == The free Weka machine learning suite includes an implementation of AODE.

    Read more →
  • Tiimo

    Tiimo

    Tiimo is an app designed to help neurodivergent individuals with planning their life. In August 2024 the company raised €1.4 million, bringing their total funding to €4.3 million. At that point they had over 500,000 users, including 50,000 paid users. The app has Apple Watch support and a learning platform that includes courses on well-being and neurodiversity. The app was founded by Helene Lassen Nørlem and Melissa Würtz Azari in 2015. After being a finalist in 2024, in December 2025 Tiimo was won Apple’s iPhone App of the Year. The premium version is $10/mo and features an AI chatbot alongside the daily planner.

    Read more →
  • Detrended correspondence analysis

    Detrended correspondence analysis

    Detrended correspondence analysis (DCA) is a multivariate statistical technique widely used by ecologists to find the main factors or gradients in large, species-rich but usually sparse data matrices that typify ecological community data. DCA is frequently used to suppress artifacts inherent in most other multivariate analyses when applied to gradient data. == History == DCA was created in 1979 by Mark Hill of the United Kingdom's Institute for Terrestrial Ecology (now merged into Centre for Ecology and Hydrology) and implemented in FORTRAN code package called DECORANA (Detrended Correspondence Analysis), a correspondence analysis method. DCA is sometimes erroneously referred to as DECORANA; however, DCA is the underlying algorithm, while DECORANA is a tool implementing it. == Issues addressed == According to Hill and Gauch, DCA suppresses two artifacts inherent in most other multivariate analyses when applied to gradient data. An example is a time-series of plant species colonising a new habitat; early successional species are replaced by mid-successional species, then by late successional ones (see example below). When such data are analysed by a standard ordination such as a correspondence analysis: the ordination scores of the samples will exhibit the 'edge effect', i.e. the variance of the scores at the beginning and the end of a regular succession of species will be considerably smaller than that in the middle, when presented as a graph the points will be seen to follow a horseshoe shaped curve rather than a straight line ('arch effect'), even though the process under analysis is a steady and continuous change that human intuition would prefer to see as a linear trend. Outside ecology, the same artifacts occur when gradient data are analysed (e.g. soil properties along a transect running between 2 different geologies, or behavioural data over the lifespan of an individual) because the curved projection is an accurate representation of the shape of the data in multivariate space. Ter Braak and Prentice (1987, p. 121) cite a simulation study analysing two-dimensional species packing models resulting in a better performance of DCA compared to CA. == Method == DCA is an iterative algorithm that has shown itself to be a highly reliable and useful tool for data exploration and summary in community ecology (Shaw 2003). It starts by running a standard ordination (CA or reciprocal averaging) on the data, to produce the initial horse-shoe curve in which the 1st ordination axis distorts into the 2nd axis. It then divides the first axis into segments (default = 26), and rescales each segment to have mean value of zero on the 2nd axis - this effectively squashes the curve flat. It also rescales the axis so that the ends are no longer compressed relative to the middle, so that 1 DCA unit approximates to the same rate of turnover all the way through the data: the rule of thumb is that 4 DCA units mean that there has been a total turnover in the community. Ter Braak and Prentice (1987, p. 122) warn against the non-linear rescaling of the axes due to robustness issues and recommend using detrending-by-polynomials only. == Drawbacks == No significance tests are available with DCA, although there is a constrained (canonical) version called DCCA in which the axes are forced by Multiple linear regression to correlate optimally with a linear combination of other (usually environmental) variables; this allows testing of a null model by Monte-Carlo permutation analysis. == Example == The example shows an ideal data set: The species data is in rows, samples in columns. For each sample along the gradient, a new species is introduced but another species is no longer present. The result is a sparse matrix. Ones indicate the presence of a species in a sample. Except at the edges each sample contains five species. The plot of the first two axes of the correspondence analysis result on the right hand side clearly shows the disadvantages of this procedure: the edge effect, i.e. the points are clustered at the edges of the first axis, and the arch effect. == Software == An open source implementation of DCA, based on the original FORTRAN code, is available in the vegan R-package.

    Read more →
  • Genetic representation

    Genetic representation

    In computer programming, genetic representation is a way of presenting solutions/individuals in evolutionary computation methods. The term encompasses both the concrete data structures and data types used to realize the genetic material of the candidate solutions in the form of a genome, and the relationships between search space and problem space. In the simplest case, the search space corresponds to the problem space (direct representation). The choice of problem representation is tied to the choice of genetic operators, both of which have a decisive effect on the efficiency of the optimization. Genetic representation can encode appearance, behavior, physical qualities of individuals. Difference in genetic representations is one of the major criteria drawing a line between known classes of evolutionary computation. Terminology is often analogous with natural genetics. The block of computer memory that represents one candidate solution is called an individual. The data in that block is called a chromosome. Each chromosome consists of genes. The possible values of a particular gene are called alleles. A programmer may represent all the individuals of a population using binary encoding, permutational encoding, encoding by tree, or any one of several other representations. == Representations in some popular evolutionary algorithms == Genetic algorithms (GAs) are typically linear representations; these are often, but not always, binary. Holland's original description of GA used arrays of bits. Arrays of other types and structures can be used in essentially the same way. The main property that makes these genetic representations convenient is that their parts are easily aligned due to their fixed size. This facilitates simple crossover operation. Depending on the application, variable-length representations have also been successfully used and tested in evolutionary algorithms (EA) in general and genetic algorithms in particular, although the implementation of crossover is more complex in this case. Evolution strategy uses linear real-valued representations, e.g., an array of real values. It uses mostly gaussian mutation and blending/averaging crossover. Genetic programming (GP) pioneered tree-like representations and developed genetic operators suitable for such representations. Tree-like representations are used in GP to represent and evolve functional programs with desired properties. Human-based genetic algorithm (HBGA) offers a way to avoid solving hard representation problems by outsourcing all genetic operators to outside agents, in this case, humans. The algorithm has no need for knowledge of a particular fixed genetic representation as long as there are enough external agents capable of handling those representations, allowing for free-form and evolving genetic representations. === Common genetic representations === binary array integer or real-valued array binary tree natural language parse tree directed graph == Distinction between search space and problem space == Analogous to biology, EAs distinguish between problem space (corresponds to phenotype) and search space (corresponds to genotype). The problem space contains concrete solutions to the problem being addressed, while the search space contains the encoded solutions. The mapping from search space to problem space is called genotype-phenotype mapping. The genetic operators are applied to elements of the search space, and for evaluation, elements of the search space are mapped to elements of the problem space via genotype-phenotype mapping. == Relationships between search space and problem space == The importance of an appropriate choice of search space for the success of an EA application was recognized early on. The following requirements can be placed on a suitable search space and thus on a suitable genotype-phenotype mapping: === Completeness === All possible admissible solutions must be contained in the search space. === Redundancy === When more possible genotypes exist than phenotypes, the genetic representation of the EA is called redundant. In nature, this is termed a degenerate genetic code. In the case of a redundant representation, neutral mutations are possible. These are mutations that change the genotype but do not affect the phenotype. Thus, depending on the use of the genetic operators, there may be phenotypically unchanged offspring, which can lead to unnecessary fitness determinations, among other things. Since the evaluation in real-world applications usually accounts for the lion's share of the computation time, it can slow down the optimization process. In addition, this can cause the population to have higher genotypic diversity than phenotypic diversity, which can also hinder evolutionary progress. In biology, the Neutral Theory of Molecular Evolution states that this effect plays a dominant role in natural evolution. This has motivated researchers in the EA community to examine whether neutral mutations can improve EA functioning by giving populations that have converged to a local optimum a way to escape that local optimum through genetic drift. This is discussed controversially and there are no conclusive results on neutrality in EAs. On the other hand, there are other proven measures to handle premature convergence. === Locality === The locality of a genetic representation corresponds to the degree to which distances in the search space are preserved in the problem space after genotype-phenotype mapping. That is, a representation has a high locality exactly when neighbors in the search space are also neighbors in the problem space. In order for successful schemata not to be destroyed by genotype-phenotype mapping after a minor mutation, the locality of a representation must be high. === Scaling === In genotype-phenotype mapping, the elements of the genotype can be scaled (weighted) differently. The simplest case is uniform scaling: all elements of the genotype are equally weighted in the phenotype. A common scaling is exponential. If integers are binary coded, the individual digits of the resulting binary number have exponentially different weights in representing the phenotype. Example: The number 90 is written in binary (i.e., in base two) as 1011010. If now one of the front digits is changed in the binary notation, this has a significantly greater effect on the coded number than any changes at the rear digits (the selection pressure has an exponentially greater effect on the front digits). For this reason, exponential scaling has the effect of randomly fixing the "posterior" locations in the genotype before the population gets close enough to the optimum to adjust for these subtleties. == Hybridization and repair in genotype-phenotype mapping == When mapping the genotype to the phenotype being evaluated, domain-specific knowledge can be used to improve the phenotype and/or ensure that constraints are met. This is a commonly used method to improve EA performance in terms of runtime and solution quality. It is illustrated below by two of the three examples. == Examples == === Example of a direct representation === An obvious and commonly used encoding for the traveling salesman problem and related tasks is to number the cities to be visited consecutively and store them as integers in the chromosome. The genetic operators must be suitably adapted so that they only change the order of the cities (genes) and do not cause deletions or duplications. Thus, the gene order corresponds to the city order and there is a simple one-to-one mapping. === Example of a complex genotype-phenotype mapping === In a scheduling task with heterogeneous and partially alternative resources to be assigned to a set of subtasks, the genome must contain all necessary information for the individual scheduling operations or it must be possible to derive them from it. In addition to the order of the subtasks to be executed, this includes information about the resource selection. A phenotype then consists of a list of subtasks with their start times and assigned resources. In order to be able to create this, as many allocation matrices must be created as resources can be allocated to one subtask at most. In the simplest case this is one resource, e.g., one machine, which can perform the subtask. An allocation matrix is a two-dimensional matrix, with one dimension being the available time units and the other being the resources to be allocated. Empty matrix cells indicate availability, while an entry indicates the number of the assigned subtask. The creation of allocation matrices ensures firstly that there are no inadmissible multiple allocations. Secondly, the start times of the subtasks can be read from it as well as the assigned resources. A common constraint when scheduling resources to subtasks is that a resource can only be allocated once per time unit and that the reservation must be for a contiguous period of time. To achieve this in a timely manner, which is a c

    Read more →
  • Training, validation, and test data sets

    Training, validation, and test data sets

    In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. These input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and testing sets. The model is initially fit on a training data set, which is a set of examples used to fit the parameters (e.g. weights of connections between neurons in artificial neural networks) of the model. The model (e.g. a naive Bayes classifier) is trained on the training data set using a supervised learning method, for example using optimization methods such as gradient descent or stochastic gradient descent. In practice, the training data set often consists of pairs of an input vector (or scalar) and the corresponding output vector (or scalar), where the answer key is commonly denoted as the target (or label). The current model is run with the training data set and produces a result, which is then compared with the target, for each input vector in the training data set. Based on the result of the comparison and the specific learning algorithm being used, the parameters of the model are adjusted. The model fitting can include both variable selection and parameter estimation. Successively, the fitted model is used to predict the responses for the observations in a second data set called the validation data set. The validation data set provides an unbiased evaluation of a model fit on the training data set while tuning the model's hyperparameters (e.g. the number of hidden units—layers and layer widths—in a neural network). Validation data sets can be used for regularization by early stopping (stopping training when the error on the validation data set increases, as this is a sign of over-fitting to the training data set). This simple procedure is complicated in practice by the fact that the validation data set's error may fluctuate during training, producing multiple local minima. This complication has led to the creation of many ad-hoc rules for deciding when over-fitting has truly begun. Finally, the test data set is a data set used to provide an unbiased evaluation of a model fit on the training data set. When the data in the test data set has never been used (for example in cross-validation), the test data set is called a holdout data set. The term "validation set" is sometimes used instead of "test set" in some literature (e.g., if the original data set was partitioned into only two subsets, the test set might be referred to as the validation set). Deciding the sizes and strategies for data set division in training, test and validation sets is very dependent on the problem and data available. == Training data set == A training data set is a data set of examples used during the learning process and is used to fit the parameters (e.g., weights) of, for example, a classifier. For classification tasks, a supervised learning algorithm looks at the training data set to determine, or learn, the optimal combinations of variables that will generate a good predictive model. The goal is to produce a trained (fitted) model that generalizes well to new, unknown data. The fitted model is evaluated using “new” examples from the held-out data sets (validation and test data sets) to estimate the model’s accuracy in classifying new data. To reduce the risk of issues such as over-fitting, the examples in the validation and test data sets should not be used to train the model. Most approaches that search through training data for empirical relationships tend to overfit the data, meaning that they can identify and exploit apparent relationships in the training data that do not hold in general. When a training set is continuously expanded with new data, then this is incremental learning. == Validation data set == A validation data set is a data set of examples used to tune the hyperparameters (i.e. the architecture) of a model. It is sometimes also called the development set or the "dev set". An example of a hyperparameter for artificial neural networks includes the number of hidden units in each layer. It, as well as the testing set (as mentioned below), should follow the same probability distribution as the training data set. In order to avoid overfitting, when any classification parameter needs to be adjusted, it is necessary to have a validation data set in addition to the training and test data sets. For example, if the most suitable classifier for the problem is sought, the training data set is used to train the different candidate classifiers, the validation data set is used to compare their performances and decide which one to take and, finally, the test data set is used to obtain the performance characteristics such as accuracy, sensitivity, specificity, F-measure, and so on. The validation data set functions as a hybrid: it is training data used for testing, but neither as part of the low-level training nor as part of the final testing. The basic process of using a validation data set for model selection (as part of training data set, validation data set, and test data set) is: Since our goal is to find the network having the best performance on new data, the simplest approach to the comparison of different networks is to evaluate the error function using data which is independent of that used for training. Various networks are trained by minimization of an appropriate error function defined with respect to a training data set. The performance of the networks is then compared by evaluating the error function using an independent validation set, and the network having the smallest error with respect to the validation set is selected. This approach is called the hold out method. Since this procedure can itself lead to some overfitting to the validation set, the performance of the selected network should be confirmed by measuring its performance on a third independent set of data called a test set. An application of this process is in early stopping, where the candidate models are successive iterations of the same network, and training stops when the error on the validation set grows, choosing the previous model (the one with minimum error). == Test data set == A test data set is a data set that is independent of the training data set, but that follows the same probability distribution as the training data set. A test set is therefore a set of examples used only to assess the performance (i.e. generalization) of a specified classifier on unseen data. To do this, the model is used to predict classifications of examples in the test set. Those predictions are compared to the examples' true classifications to assess the model's accuracy. If a model fit to the training and validation data set also fits the test data set well, minimal overfitting has taken place (see figure below). A better fitting of the training or validation data sets as opposed to the test data set usually points to overfitting. In the scenario where a data set has a low number of samples, it is usually partitioned into a training set and a validation data set, where the model is trained on the training set and refined using the validation set to improve accuracy, but this approach will lead to overfitting. The holdout method can also be employed, where the test set is used at the end, after training on the training set. Other techniques, such as cross-validation and bootstrapping, are used on small data sets. The bootstrap method generates numerous simulated data sets of the same size by randomly sampling with replacement from the original data, allowing the random data points to serve as test sets for evaluating model performance. Cross-validation splits the data set into multiple folds, with a single sub-fold used as test data; the model is trained on the remaining folds, and all folds are cross-validated (with results averaged and models consolidated) to estimate final model performance. Note that some sources advise against using a single split, as it can lead to overfitting as well as biased model performance estimates. For this reason, data sets are split into three partitions: training, validation and test data sets. The standard machine learning practice is to train on the training set and tune hyperparameters using the validation set, where the validation process selects the model with the lowest validation loss, which is then tested on the test data set (normally held out) to assess the final model. The holdout method for the test set reduces computation by avoiding using the test set after each epoch. The test data set should never be used for validating the training model or fine-tuning hyperparameters, as it provides an accurate and honest evaluation of the model's final performance on unseen dat

    Read more →
  • AI Security Institute

    AI Security Institute

    The AI Security Institute (AISI) is a research organisation under the Department for Science, Innovation and Technology, UK, that aims "to equip governments with a scientific understanding of the risks posed by advanced AI". It conducts research and develop and test mitigations. Previously, it was known as the AI Safety Institute. Its creation followed world's first major AI Safety Summit that was held in Bletchley Park in 2023. The institute's professed goal is "building the world's leading understanding of advanced AI risks and solutions, to inform governments so they can keep the public safe". It is designed like a startup in the government "combining the authority of government with the expertise and agility of the private sector". AISI has made access agreements with Anthropic, Google and OpenAI to test their models before release. It has an open source platform called Inspect that permits companies, governments and academics to run standardised safety tests for AI usage. Among the works AISI has done is the reported detection of multiple serious vulnerabilities that could enable development of biological weapons; the vulnerabilities were fixed before the model was launched. It conducts research on diverse fields of AI application. One study by AISI found that LLMs post-trained for political persuasiveness became systematically less accurate and up to 51% more persuasive on political issues. AISI has also worked on the usage of AI for emotional needs. It found that nearly 10 percent of UK citizens used systems like chatbots for emotional purposes on a weekly basis. It found that "systems are now outperforming PhD-level researchers on scientific knowledge tests and helping non-experts succeed at lab work that would previously have been out of reach" in a report published in December 2025. Former chief AI officer of GCHQ Adam Beaumont is the institution's interim director. UK prime minister's AI advisor Jade Leung is the chief technology officer.

    Read more →
  • Relationship square

    Relationship square

    In statistics, the relationship square is a graphical representation for use in the factorial analysis of a table individuals x variables. This representation completes classical representations provided by principal component analysis (PCA) or multiple correspondence analysis (MCA), namely those of individuals, of quantitative variables (correlation circle) and of the categories of qualitative variables (at the centroid of the individuals who possess them). It is especially important in factor analysis of mixed data (FAMD) and in multiple factor analysis (MFA). == Definition of relationship square in the MCA frame == The first interest of the relationship square is to represent the variables themselves, not their categories, which is all the more valuable as there are many variables. For this, we calculate for each qualitative variable j {\displaystyle j} and each factor F s {\displaystyle F_{s}} ( F s {\displaystyle F_{s}} , rank s {\displaystyle s} factor, is the vector of coordinates of the individuals along the axis of rank s {\displaystyle s} ; in PCA, F s {\displaystyle F_{s}} is called principal component of rank s {\displaystyle s} ), the square of the correlation ratio between the F s {\displaystyle F_{s}} and the variable j {\displaystyle j} , usually denoted : η 2 ( j , F s ) {\displaystyle \eta ^{2}(j,F_{s})} Thus, to each factorial plane, we can associate a representation of qualitative variables themselves. Their coordinates being between 0 and 1, the variables appear in the square having as vertices the points (0,0), ( 0,1), (1,0) and (1,1). == Example in MCA == Six individuals ( i 1 , … , i 6 ) {\displaystyle i_{1},\ldots ,i_{6})} are described by three variables ( q 1 , q 2 , q 3 ) {\displaystyle (q_{1},q_{2},q_{3})} having respectively 3, 2 and 3 categories. Example : the individual i 1 {\displaystyle i_{1}} possesses the category a {\displaystyle a} of q 1 {\displaystyle q_{1}} , d {\displaystyle d} of q 2 {\displaystyle q_{2}} and f {\displaystyle f} of q 3 {\displaystyle q_{3}} . Applied to these data, the MCA function included in the R Package FactoMineR provides to the classical graph in Figure 1. The relationship square (Figure 2) makes easier the reading of the classic factorial plane. It indicates that: The first factor is related to the three variables but especially q 3 {\displaystyle q_{3}} (which have a very high coordinate along the first axis) and then q 2 {\displaystyle q_{2}} . The second factor is related only to q 1 {\displaystyle q_{1}} and q 3 {\displaystyle q_{3}} (and not to q 2 {\displaystyle q_{2}} which has a coordinate along axis 2 equal to 0) and that in a strong and equal manner. All this is visible on the classic graphic but not so clearly. The role of the relationship square is first to assist in reading a conventional graphic. This is precious when the variables are numerous and possess numerous coordinates. == Extensions == This representation may be supplemented with those of quantitative variables, the coordinates of the latter being the square of correlation coefficients (and not of correlation ratios). Thus, the second advantage of the relationship square lies in the ability to represent simultaneously quantitative and qualitative variables. The relationship square can be constructed from any factorial analysis of a table individuals x variables. In particular, it is (or should be) used systematically: in multiple correspondences analysis (MCA); in principal components analysis (PCA) when there are many supplementary variables; in factor analysis of mixed data (FAMD). An extension of this graphic to groups of variables (how to represent a group of variables by a single point ?) is used in Multiple Factor Analysis (MFA) == History == The idea of representing the qualitative variables themselves by a point (and not the categories) is due to Brigitte Escofier. The graphic as it is used now has been introduced by Brigitte Escofier and Jérôme Pagès in the framework of multiple factor analysis == Conclusion == In MCA, the relationship square provides a synthetic view of the connections between mixed variables, all the more valuable as there are many variables having many categories. This representation iscan be useful in any factorial analysis when there are numerous mixed variables, active and/or supplementary.

    Read more →
  • Mutation (evolutionary algorithm)

    Mutation (evolutionary algorithm)

    Mutation is a genetic operator used to maintain genetic diversity of the chromosomes of a population of an evolutionary algorithm (EA), including genetic algorithms in particular. It is analogous to biological mutation. The classic example of a mutation operator of a binary coded genetic algorithm (GA) involves a probability that an arbitrary bit in a genetic sequence will be flipped from its original state. A common method of implementing the mutation operator involves generating a random variable for each bit in a sequence. This random variable tells whether or not a particular bit will be flipped. This mutation procedure, based on the biological point mutation, is called single point mutation. Other types of mutation operators are commonly used for representations other than binary, such as floating-point encodings or representations for combinatorial problems. The purpose of mutation in EAs is to introduce diversity into the sampled population. Mutation operators are used in an attempt to avoid local minima by preventing the population of chromosomes from becoming too similar to each other, thus slowing or even stopping convergence to the global optimum. This reasoning also leads most EAs to avoid only taking the fittest of the population in generating the next generation, but rather selecting a random (or semi-random) set with a weighting toward those that are fitter. The following requirements apply to all mutation operators used in an EA: every point in the search space must be reachable by one or more mutations. there must be no preference for parts or directions in the search space (no drift). small mutations should be more probable than large ones. For different genome types, different mutation types are suitable. Some mutations are Gaussian, Uniform, Zigzag, Scramble, Insertion, Inversion, Swap, and so on. An overview and more operators than those presented below can be found in the introductory book by Eiben and Smith or in. == Bit string mutation == The mutation of bit strings ensue through bit flips at random positions. Example: The probability of a mutation of a bit is 1 l {\displaystyle {\frac {1}{l}}} , where l {\displaystyle l} is the length of the binary vector. Thus, a mutation rate of 1 {\displaystyle 1} per mutation and individual selected for mutation is reached. == Mutation of real numbers == Many EAs, such as the evolution strategy or the real-coded genetic algorithms, work with real numbers instead of bit strings. This is due to the good experiences that have been made with this type of coding. The value of a real-valued gene can either be changed or redetermined. A mutation that implements the latter should only ever be used in conjunction with the value-changing mutations and then only with comparatively low probability, as it can lead to large changes. In practical applications, the respective value range of the decision variables to be changed of the optimisation problem to be solved is usually limited. Accordingly, the values of the associated genes are each restricted to an interval [ x min , x max ] {\displaystyle [x_{\min },x_{\max }]} . Mutations may or may not take these restrictions into account. In the latter case, suitable post-treatment is then required as described below. === Mutation without consideration of restrictions === A real number x {\displaystyle x} can be mutated using normal distribution N ( 0 , σ ) {\displaystyle {\mathcal {N}}(0,\sigma )} by adding the generated random value to the old value of the gene, resulting in the mutated value x ′ {\displaystyle x'} : x ′ = x + N ( 0 , σ ) {\displaystyle x'=x+{\mathcal {N}}(0,\sigma )} In the case of genes with a restricted range of values, it is a good idea to choose the step size of the mutation σ {\displaystyle \sigma } so that it reasonably fits the range [ x min , x max ] {\displaystyle [x_{\min },x_{\max }]} of the gene to be changed, e.g.: σ = x max − x min 6 {\displaystyle \sigma ={\frac {x_{\text{max}}-x_{\text{min}}}{6}}} The step size can also be adjusted to the smaller permissible change range depending on the current value. In any case, however, it is likely that the new value x ′ {\displaystyle x'} of the gene will be outside the permissible range of values. Such a case must be considered a lethal mutation, since the obvious repair by using the respective violated limit as the new value of the gene would lead to a drift. This is because the limit value would then be selected with the entire probability of the values beyond the limit of the value range. The evolution strategy works with real numbers and mutation based on normal distribution. The step sizes are part of the chromosome and are subject to evolution together with the actual decision variables. === Mutation with consideration of restrictions === One possible form of changing the value of a gene while taking its value range [ x min , x max ] {\displaystyle [x_{\min },x_{\max }]} into account is the mutation relative parameter change of the evolutionary algorithm GLEAM (General Learning Evolutionary Algorithm and Method), in which, as with the mutation presented earlier, small changes are more likely than large ones. First, an equally distributed decision is made as to whether the current value x {\displaystyle x} should be increased or decreased and then the corresponding total change interval is determined. Without loss of generality, an increase is assumed for the explanation and the total change interval is then [ x , x max ] {\displaystyle [x,x_{\max }]} . It is divided into k {\displaystyle k} sub-areas of equal size with the width δ {\displaystyle \delta } , from which k {\displaystyle k} sub-change intervals of different size are formed: i {\displaystyle i} -th sub-change interval: [ x , x + δ ⋅ i ] {\displaystyle [x,x+\delta \cdot i]} with δ = ( x max − x ) k {\displaystyle \delta ={\frac {(x_{\text{max}}-x)}{k}}} and i = 1 , … , k {\displaystyle i=1,\dots ,k} Subsequently, one of the k {\displaystyle k} sub-change intervals is selected in equal distribution and a random number, also equally distributed, is drawn from it as the new value x ′ {\displaystyle x'} of the gene. The resulting summed probabilities of the sub-change intervals result in the probability distribution of the k {\displaystyle k} sub-areas shown in the adjacent figure for the exemplary case of k = 10 {\displaystyle k=10} . This is not a normal distribution as before, but this distribution also clearly favours small changes over larger ones. This mutation for larger values of k {\displaystyle k} , such as 10, is less well suited for tasks where the optimum lies on one of the value range boundaries. This can be remedied by significantly reducing k {\displaystyle k} when a gene value approaches its limits very closely. === Common properties === For both mutation operators for real-valued numbers, the probability of an increase and decrease is independent of the current value and is 50% in each case. In addition, small changes are considerably more likely than large ones. For mixed-integer optimization problems, rounding is usually used. == Mutation of permutations == Mutations of permutations are specially designed for genomes that are themselves permutations of a set. These are often used to solve combinatorial tasks. In the two mutations presented, parts of the genome are rotated or inverted. === Rotation to the right === The presentation of the procedure is illustrated by an example on the right: === Inversion === The presentation of the procedure is illustrated by an example on the right: === Variants with preference for smaller changes === The requirement raised at the beginning for mutations, according to which small changes should be more probable than large ones, is only inadequately fulfilled by the two permutation mutations presented, since the lengths of the partial lists and the number of shift positions are determined in an equally distributed manner. However, the longer the partial list and the shift, the greater the change in gene order. This can be remedied by the following modifications. The end index j {\displaystyle j} of the partial lists is determined as the distance d {\displaystyle d} to the start index i {\displaystyle i} : j = ( i + d ) mod | P 0 | {\displaystyle j=(i+d){\bmod {\left|P_{0}\right|}}} where d {\displaystyle d} is determined randomly according to one of the two procedures for the mutation of real numbers from the interval [ 0 , | P 0 | − 1 ] {\displaystyle \left[0,\left|P_{0}\right|-1\right]} and rounded. For the rotation, k {\displaystyle k} is determined similarly to the distance d {\displaystyle d} , but the value 0 {\displaystyle 0} is forbidden. For the inversion, note that i ≠ j {\displaystyle i\neq j} must hold, so for d {\displaystyle d} the value 0 {\displaystyle 0} must be excluded.

    Read more →
  • Multidimensional analysis

    Multidimensional analysis

    In statistics, econometrics and related fields, multidimensional analysis (MDA) is a data analysis process that groups data into two categories: data dimensions and measurements. For example, a data set consisting of the number of wins for a single football team at each of several years is a single-dimensional (in this case, longitudinal) data set. A data set consisting of the number of wins for several football teams in a single year is also a single-dimensional (in this case, cross-sectional) data set. A data set consisting of the number of wins for several football teams over several years is a two-dimensional data set. == Higher dimensions == In many disciplines, two-dimensional data sets are also called panel data. While, strictly speaking, two- and higher-dimensional data sets are "multi-dimensional", the term "multidimensional" tends to be applied only to data sets with three or more dimensions. For example, some forecast data sets provide forecasts for multiple target periods, conducted by multiple forecasters, and made at multiple horizons. The three dimensions provide more information than can be gleaned from two-dimensional panel data sets. == Software == Computer software for MDA include Online analytical processing (OLAP) for data in relational databases, pivot tables for data in spreadsheets, and Array DBMSs for general multi-dimensional data (such as raster data) in science, engineering, and business.

    Read more →
  • ReRites

    ReRites

    ReRites (also known as RERITES, ReadingRites, Big Data Poetry) is a literary work of "Human + A.I. poetry" by David Jhave Johnston that used neural network models trained to generate poetry which the author then edited. ReRites won the Robert Coover Award for a Work of Electronic Literature in 2022. == About the project == The ReRites project began as a daily rite of writing with a neural network, expanded into a series of performances from which video documentation has been published online, and concluded with a set of 12 books and an accompanying book of essays published by Anteism Books in 2019. In Electronic Literature, Scott Rettberg describes the early phases of the project in 2016, when it bore the preliminary name Big Data Poetry. Jhave (the artist name that David Jhave Johnston goes by) describes the process of writing ReRites as a rite: "Every morning for 2 hours (normally 6:30–8:30am) I get up and edit the poetic output of a neural net. Deleting, weaving, conjugating, lineating, cohering. Re-writing. Re-wiring authorship: hybrid augmented enhanced evolutionary". There is video documentation of the writing process. The human editing of the neural network's output is fundamental to this project, and Jhave gives examples of both unedited text extracts and his edited versions in publications about the project. Kyle Booten describes ReRites as "simultaneously dusty and outrageously verdant, monotonously sublime and speckled with beautiful and rare specimens". === Performances === ReRites was first shared with an audience through a series of performances where audience members and poets would participate in reading the automatically generated texts, which appeared on screen so fast that human readers could barely keep up. This has been described as allowing participants to "re-discover[..] the peculiar pleasures of being embodied", or, in Jhave's own words, as a space where human participants were "playing their wits and voices against an evocative infinite deep-learning muse". The first performance was at Brown University's Interrupt Festival in 2019. It has been performed many times since, including at the Barbican Centre in London and Anteism Books. === Print publications === For a single year Jhave published one book of poetry from the ReRites project each month. These twelve volumes are accompanied by a book of essays, all published by Anteism Books. The accompanying essays provide critical responses to the project from poets and scholars including Allison Parrish, Johanna Drucker, Kyle Booten, Stephanie Strickland, John Cayley, Lai-Tze Fan, Nick Montfort, Mairéad Byrne, and Chris Funkhouser. Allison Parrish notes elsewhere that these paratexts to ReRites serve a legitimising function for a genre of poetry that is not yet institutionally acknowledged. === Technical details === Starting in 2016 under the name Big Data Poetry, Jhave generated poems using, in his own words, "neural network code (..) adapted from three corporate github-hosted machine-learning libraries: TensorFlow (Google), PyTorch (Facebook), and AWD-LSTM (SalesForce)". He explains that the "models were trained on a customised corpus of 600,000 lines of poetry ranging from the romantic epoch to the 20th century avant garde". Jhave maintains a GitHub repository with some of the code supporting ReRites. == Reception == ReRites is described by John Cayley as "one of the most thorough and beautiful" poetic responses to machine learning. The work's influence on the field of electronic literature was acknowledged in 2022, when the work won the Electronic Literature Organization's Robert Coover Award for a Work of Electronic Literature. The jury described ReRites as particularly poignant in the time of the pandemic, as it was "a documentation of the performance of the private ritual of writing and the obsessive-compulsive need for writers to communicate — even when no one else is reading". The question of authorship and voice in ReRites has been raised by several critics. Although generated poetry is an established genre in electronic literature, Cayley notes that unlike the combinatory poems created by authors like Nick Montfort, where the author explicitly defines which words and phrases will be recombined, ReRites has "not been directed by literary preconceptions inscribed in the program itself, but only by patterns and rhythms pre-existing in the corpora". In an essay for the Australian journal TEXT, David Thomas Henry Wright asks how to understand authorship and authority in ReRites: "Who or what is the authority of the work? The original data fed into the machine, that is not currently retrievable or discernible from the final works? The code that was taken and adapted for his purposes? Or Jhave, the human editor?" Wright concludes that Jhave is the only actor with any intentionality and therefore the authority of the work. The centrality of the human editor is also emphasised by other scholars. In a chapter analysing ReRites Malthe Stavning Erslev argues that the machine learning misrepresents the dataset it is trained on. While ReRites uses 21st century neural networks, it has been compared to earlier literary traditions. Poet Victoria Stanton, who read at one of the ReRites performances, has compared ReRites to found poetry, while David Thomas Henry Wright compares it to the Oulipo movement and Mark Amerika to the cut-up technique. Scholars also position ReRites firmly within the long tradition of generative poetry both in electronic literature and print, stretching from the I Ching, Queneau's Cent Mille Milliards de Poemes and Nabokov's Pale Fire to computer-generated poems like Christopher Strachey's Love Letter Generator (1952) and more contemporary examples. Jhave describes the process of working with the output from the neural network as "carving". In his book My Life as an Artificial Creative Intelligence, Mark Amerika writes that the "method of carving the digital outputs provided by the language model as part of a collaborative remix jam session with GPT-2, where the language artist and the language model play off each other’s unexpected outputs as if caught in a live postproduction set, is one I share with electronic literature composer David Jhave Johnston, whose AI poetry experiments precede my own investigations."

    Read more →
  • PVLV

    PVLV

    The primary value learned value (PVLV) model is a possible explanation for the reward-predictive firing properties of dopamine (DA) neurons. It simulates behavioral and neural data on Pavlovian conditioning and the midbrain dopaminergic neurons that fire in proportion to unexpected rewards. It is an alternative to the temporal-differences (TD) algorithm. It is used as part of Leabra.

    Read more →
  • Joseph Nechvatal

    Joseph Nechvatal

    Joseph Nechvatal (born January 15, 1951) is an American post-conceptual digital artist and art theoretician who creates computer-assisted paintings and computer animations, often using custom computer viruses. == Life and work == Joseph Nechvatal was born in Chicago. He studied fine art and philosophy at Southern Illinois University Carbondale, Cornell University, and Columbia University. He earned a Doctor of Philosophy in Philosophy of Art and Technology at the Planetary Collegium at University of Wales, Newport and has taught art theory and art history at the School of Visual Arts. He has had many solo exhibitions and is one of five artists that art historian Patrick Frank examines in his 2024 book Art of the 1980s: As If the Digital Mattered. His work in the late 1970s and early 1980s chiefly consisted of postminimal gray palimpsest-like drawings that were often photo-mechanically enlarged. Beginning in 1979 he became associated with the artist group Colab, organized the Public Arts International/Free Speech series, and helped established the non-profit group ABC No Rio. In 1983 he co-founded the avant-garde electronic art music audio project Tellus Audio Cassette Magazine. In 1984, Nechvatal began work on an opera called XS: The Opera Opus (1984-6) with the no wave musical composer Rhys Chatham. He began using computers and robotics to make post-conceptual paintings in 1986 and later, in his signature work, began to employ self-created computer viruses. From 1991 to 1993, he was artist-in-residence at the Louis Pasteur Atelier in Arbois, France and at the Saline Royale/Ledoux Foundation's computer lab. There he worked on The Computer Virus Project, his first artistic experiment with computer viruses and computer virus animation. He exhibited computer-robotic paintings at Documenta 8 in 1987. In 2002 he extended his experimentation into viral artificial life through a collaboration with the programmer Stephane Sikora of music2eye in a work called the Computer Virus Project II. Nechvatal has also created a noise music work called viral symphOny, a collaborative sound symphony created by using his computer virus software at the Institute for Electronic Arts at Alfred University. In 2021 Pentiments released Nechvatal's retrospective audio cassette called Selected Sound Works (1981-2021) and in 2022 his The Viral Tempest, a double vinyl LP of new audio work. In 2025, he joined the roster of artists/musicians at Table of the Elements with two CD/book releases: Selected Sound Works (1981-2021) and The Marriage of Orlando and Artaud, Even. From 1999 to 2013, Nechvatal taught art theories of immersive virtual reality and the viractual at the School of Visual Arts in New York City (SVA). A book of his collected essays entitled Towards an Immersive Intelligence: Essays on the Work of Art in the Age of Computer Technology and Virtual Reality (1993–2006) was published by Edgewise Press in 2009. Also in 2009, his virtual reality art theory and art history book Immersive Ideals / Critical Distances was published. In 2011, his book Immersion Into Noise was published by Open Humanities Press in conjunction with the University of Michigan Library's Scholarly Publishing Office. Nechvatal has also published three books with Punctum Books: Minóy (noise music—ed.—2014), Destroyer of Naivetés (poetry—2015), and Styling Sagaciousness (poetry—2022). In 2023 his art theory cybersex farce novella venus©~Ñ~vibrator, even was published by Orbis Tertius Press The Joseph Nechvatal archive is housed at The Fales Library Downtown Collection at the NYU Special Collections Library in New York City. === Viractualism === Viractualism is an art theory concept developed by Nechvatal in 1999 from Ph.D. research Nechvatal conducted at the Planetary Collegium at University of Wales, Newport. There he developed his concept of the viractual, which strives to create an interface between the actual and the virtual.

    Read more →