AI Data Center Zoning

AI Data Center Zoning — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Security.txt

    Security.txt

    security.txt is an accepted standard for website security information that allows security researchers to report security vulnerabilities easily. The standard prescribes a text file named security.txt in the well known location, similar in syntax to robots.txt but intended to be machine and human readable, for those wishing to contact a website's owner about security issues. security.txt files have been adopted by Google, GitHub, LinkedIn, and Facebook. == History == The Internet Draft was first submitted by Edwin Foudil in September 2017. At that time it covered four directives, "Contact", "Encryption", "Disclosure" and "Acknowledgement". Foudil expected to add further directives based on feedback. In addition, web security expert Scott Helme said he had seen positive feedback from the security community while use among the top 1 million websites was "as low as expected right now". In 2019, the Cybersecurity and Infrastructure Security Agency (CISA) published a draft binding operational directive that requires all US federal agencies to publish a security.txt file within 180 days. The Internet Engineering Steering Group (IESG) issued a Last Call for security.txt in December 2019 which ended on January 6, 2020. A study in 2021 found that over ten percent of top-100 websites published a security.txt file, with the percentage of sites publishing the file decreasing as more websites were considered. The study also noted a number of discrepancies between the standard and the content of the file. In April 2022 the security.txt file has been accepted by Internet Engineering Task Force (IETF) as RFC 9116. == File format == security.txt files can be served under the /.well-known/ directory (i.e. /.well-known/security.txt) or the top-level directory (i.e. /security.txt) of a website. The file must be served over HTTPS and in plaintext format.

    Read more →
  • Statistical learning theory

    Statistical learning theory

    Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. Statistical learning theory deals with the statistical inference problem of finding a predictive function based on data. Statistical learning theory has led to successful applications in fields such as computer vision, speech recognition, and bioinformatics. == Introduction == The goals of learning are understanding and prediction. Learning falls into many categories, including supervised learning, unsupervised learning, online learning, and reinforcement learning. From the perspective of statistical learning theory, supervised learning is best understood. Supervised learning involves learning from a training set of data. Every point in the training is an input–output pair, where the input maps to an output. The learning problem consists of inferring the function that maps between the input and the output, such that the learned function can be used to predict the output from future input. Depending on the type of output, supervised learning problems are either problems of regression or problems of classification. If the output takes a continuous range of values, it is a regression problem. Using Ohm's law as an example, a regression could be performed with voltage as input and current as an output. The regression would find the functional relationship between voltage and current to be R {\displaystyle R} , such that V = I R {\displaystyle V=IR} Classification problems are those for which the output will be an element from a discrete set of labels. Classification is very common for machine learning applications. In facial recognition, for instance, a picture of a person's face would be the input, and the output label would be that person's name. The input would be represented by a large multidimensional vector whose elements represent pixels in the picture. After learning a function based on the training set data, that function is validated on a test set of data, data that did not appear in the training set. == Formal description == Take X {\displaystyle X} to be the vector space of all possible inputs, and Y {\displaystyle Y} to be the vector space of all possible outputs. Statistical learning theory takes the perspective that there is some unknown probability distribution over the product space Z = X × Y {\displaystyle Z=X\times Y} , i.e. there exists some unknown p ( z ) = p ( x , y ) {\displaystyle p(z)=p(\mathbf {x} ,y)} . The training set is made up of n {\displaystyle n} samples from this probability distribution, and is notated S = { ( x 1 , y 1 ) , … , ( x n , y n ) } = { z 1 , … , z n } {\displaystyle S=\{(\mathbf {x} _{1},y_{1}),\dots ,(\mathbf {x} _{n},y_{n})\}=\{\mathbf {z} _{1},\dots ,\mathbf {z} _{n}\}} Every x i {\displaystyle \mathbf {x} _{i}} is an input vector from the training data, and y i {\displaystyle y_{i}} is the output that corresponds to it. In this formalism, the inference problem consists of finding a function f : X → Y {\displaystyle f:X\to Y} such that f ( x ) ∼ y {\displaystyle f(\mathbf {x} )\sim y} . Let H {\displaystyle {\mathcal {H}}} be a space of functions f : X → Y {\displaystyle f:X\to Y} called the hypothesis space. The hypothesis space is the space of functions the algorithm will search through. Let V ( f ( x ) , y ) {\displaystyle V(f(\mathbf {x} ),y)} be the loss function, a metric for the difference between the predicted value f ( x ) {\displaystyle f(\mathbf {x} )} and the actual value y {\displaystyle y} . The expected risk is defined to be I [ f ] = ∫ X × Y V ( f ( x ) , y ) p ( x , y ) d x d y {\displaystyle I[f]=\int _{X\times Y}V(f(\mathbf {x} ),y)\,p(\mathbf {x} ,y)\,d\mathbf {x} \,dy} The target function, the best possible function f {\displaystyle f} that can be chosen, is given by the f {\displaystyle f} that satisfies f = argmin h ∈ H ⁡ I [ h ] {\displaystyle f=\mathop {\operatorname {argmin} } _{h\in {\mathcal {H}}}I[h]} Because the probability distribution p ( x , y ) {\displaystyle p(\mathbf {x} ,y)} is unknown, a proxy measure for the expected risk must be used. This measure is based on the training set, a sample from this unknown probability distribution. It is called the empirical risk I S [ f ] = 1 n ∑ i = 1 n V ( f ( x i ) , y i ) {\displaystyle I_{S}[f]={\frac {1}{n}}\sum _{i=1}^{n}V(f(\mathbf {x} _{i}),y_{i})} A learning algorithm that chooses the function f S {\displaystyle f_{S}} that minimizes the empirical risk is called empirical risk minimization. == Loss functions == The choice of loss function is a determining factor on the function f S {\displaystyle f_{S}} that will be chosen by the learning algorithm. The loss function also affects the convergence rate for an algorithm. It is important for the loss function to be convex. Different loss functions are used depending on whether the problem is one of regression or one of classification. === Regression === The most common loss function for regression is the square loss function (also known as the L2-norm). This familiar loss function is used in Ordinary Least Squares regression. The form is: V ( f ( x ) , y ) = ( y − f ( x ) ) 2 {\displaystyle V(f(\mathbf {x} ),y)=(y-f(\mathbf {x} ))^{2}} The absolute value loss (also known as the L1-norm) is also sometimes used: V ( f ( x ) , y ) = | y − f ( x ) | {\displaystyle V(f(\mathbf {x} ),y)=|y-f(\mathbf {x} )|} === Classification === In some sense the 0-1 indicator function is the most natural loss function for classification. It takes the value 0 if the predicted output is the same as the actual output, and it takes the value 1 if the predicted output is different from the actual output. For binary classification with Y = { − 1 , 1 } {\displaystyle Y=\{-1,1\}} , this is: V ( f ( x ) , y ) = θ ( − y f ( x ) ) {\displaystyle V(f(\mathbf {x} ),y)=\theta (-yf(\mathbf {x} ))} where θ {\displaystyle \theta } is the Heaviside step function. == Regularization == In machine learning problems, a major problem that arises is that of overfitting. Because learning is a prediction problem, the goal is not to find a function that most closely fits the (previously observed) data, but to find one that will most accurately predict output from future input. Empirical risk minimization runs this risk of overfitting: finding a function that matches the data exactly but does not predict future output well. Overfitting is symptomatic of unstable solutions; a small perturbation in the training set data would cause a large variation in the learned function. It can be shown that if the stability for the solution can be guaranteed, generalization and consistency are guaranteed as well. Regularization can solve the overfitting problem and give the problem stability. Regularization can be accomplished by restricting the hypothesis space H {\displaystyle {\mathcal {H}}} . A common example would be restricting H {\displaystyle {\mathcal {H}}} to linear functions: this can be seen as a reduction to the standard problem of linear regression. H {\displaystyle {\mathcal {H}}} could also be restricted to polynomial of degree p {\displaystyle p} , exponentials, or bounded functions on L1. Restriction of the hypothesis space avoids overfitting because the form of the potential functions are limited, and so does not allow for the choice of a function that gives empirical risk arbitrarily close to zero. One example of regularization is Tikhonov regularization. This consists of minimizing 1 n ∑ i = 1 n V ( f ( x i ) , y i ) + γ ‖ f ‖ H 2 {\displaystyle {\frac {1}{n}}\sum _{i=1}^{n}V(f(\mathbf {x} _{i}),y_{i})+\gamma \left\|f\right\|_{\mathcal {H}}^{2}} where γ {\displaystyle \gamma } is a fixed and positive parameter, the regularization parameter. Tikhonov regularization ensures existence, uniqueness, and stability of the solution. == Bounding empirical risk == Consider a binary classifier f : X → { 0 , 1 } {\displaystyle f:{\mathcal {X}}\to \{0,1\}} . We can apply Hoeffding's inequality to bound the probability that the empirical risk deviates from the true risk to be a Sub-Gaussian distribution. P ( | R ^ ( f ) − R ( f ) | ≥ ϵ ) ≤ 2 e − 2 n ϵ 2 {\displaystyle \mathbb {P} (|{\hat {R}}(f)-R(f)|\geq \epsilon )\leq 2e^{-2n\epsilon ^{2}}} But generally, when we do empirical risk minimization, we are not given a classifier; we must choose it. Therefore, a more useful result is to bound the probability of the supremum of the difference over the whole class. P ( sup f ∈ F | R ^ ( f ) − R ( f ) | ≥ ϵ ) ≤ 2 S ( F , n ) e − n ϵ 2 / 8 ≈ n d e − n ϵ 2 / 8 {\displaystyle \mathbb {P} {\bigg (}\sup _{f\in {\mathcal {F}}}|{\hat {R}}(f)-R(f)|\geq \epsilon {\bigg )}\leq 2S({\mathcal {F}},n)e^{-n\epsilon ^{2}/8}\approx n^{d}e^{-n\epsilon ^{2}/8}} where S ( F , n ) {\displaystyle S({\mathcal {F}},n)} is the shattering number and n {\displaystyle n} is the number of samples in your dataset. The exponential term comes from Hoeffding but there is an extra cost of taking the supremum over the whole cla

    Read more →
  • Machine learning

    Machine learning

    Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without being explicitly programmed. Advances in the field of deep learning have allowed neural networks, a class of statistical algorithms, to surpass many previous machine learning approaches in performance. Statistics and mathematical optimisation methods compose the foundations of machine learning. Data mining is a related field of study, focusing on exploratory data analysis (EDA) through unsupervised learning. From a theoretical viewpoint, probably approximately correct learning provides a mathematical and statistical framework for describing machine learning. Most traditional machine learning and deep learning algorithms can be described as empirical risk minimisation under this framework. == History == The term machine learning was coined in 1959 by Arthur Samuel, an IBM employee and pioneer in the field of computer gaming and artificial intelligence. The synonym self-teaching computers was also used during this time period. The earliest machine learning program was introduced in the 1950s, when Samuel invented a computer program that calculated the chance of winning in checkers for each side, but the history of machine learning is rooted in decades of efforts to study human cognitive processes. In 1949, Canadian psychologist Donald Hebb published the book The Organization of Behavior, in which he introduced a theoretical neural structure formed by certain interactions among nerve cells. The Hebbian theory of neuron interaction set the groundwork for how many machine learning algorithms work, with connected artificial neurons changing the strength of their connections based on data. Other researchers who have studied human cognitive systems contributed to the modern machine learning technologies as well, including Walter Pitts and Warren McCulloch, who proposed the first mathematical model of neural networks including algorithms that mirror human thought processes. By the early 1960s, an experimental "learning machine" with punched tape memory, called Cybertron, had been developed by Raytheon Company to analyse sonar signals, electrocardiograms, and speech patterns using rudimentary reinforcement learning. It was repetitively "trained" by a human operator/teacher to recognise patterns and equipped with a "goof" button to cause it to reevaluate incorrect decisions. A representative book on research into machine learning during the 1960s was Nils Nilsson's book "Learning Machines", dealing mostly with machine learning for pattern classification. Interest related to pattern recognition continued into the 1970s, as described by Duda and Hart in 1973. In 1981, a report was given on using teaching strategies so that an artificial neural network learns to recognise 40 characters (26 letters, 10 digits, and 4 special symbols) from a computer terminal. Tom M. Mitchell provided a widely quoted, more formal definition of the algorithms studied in the machine learning field: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E." This definition of the tasks in which machine learning is concerned is fundamentally operational rather than defining the field in cognitive terms. This follows Alan Turing's proposal in his paper "Computing Machinery and Intelligence", in which the question, "Can machines think?", is replaced by asking whether machines can convincingly imitate a human in its responses to human-posed questions. In 2014 Ian Goodfellow and others introduced generative adversarial networks (GANs) which could produce realistic synthetic data. By 2016 AlphaGo had won against top human players in Go using reinforcement learning techniques. == Relationships to other fields == === Artificial intelligence === As a scientific endeavour, machine learning grew out of the quest for artificial intelligence (AI). In the early days of AI as an academic discipline, some researchers were interested in having machines learn from data. They attempted to approach the problem with various symbolic methods, as well as what were then termed "neural networks"; these were mostly perceptrons and other models that were later found to be reinventions of the generalised linear models of statistics. Probabilistic reasoning was also employed, especially in automated medical diagnosis. However, an increasing emphasis on the logical, knowledge-based approach caused a rift between AI and machine learning. Probabilistic systems were plagued by theoretical and practical problems of data acquisition and representation. By 1980, expert systems had come to dominate AI, and statistics was out of favour. Work on symbolic/knowledge-based learning continued within AI, leading to inductive logic programming (ILP), but the more statistical line of research was now outside the field of AI proper, in pattern recognition and information retrieval. Neural network research was abandoned by AI and computer science around the same time. This subfield, termed "connectionism", was continued by researchers from other disciplines, including John Hopfield, David Rumelhart, and Geoffrey Hinton. Their main success came in the mid-1980s with the reinvention of backpropagation. Machine learning (ML), reorganised and recognised as its own field, started to flourish in the 1990s. The field changed its goal from achieving artificial intelligence to tackling solvable problems of a practical nature. It shifted focus away from the symbolic approaches it had inherited from AI, and toward methods and models borrowed from statistics, fuzzy logic, and probability theory. === Data compression === === Data mining === Machine learning and data mining often employ the same methods and overlap significantly, but while machine learning focuses on prediction based on known properties learned from the training data, data mining focuses on the discovery of previously unknown properties in the data (this is the analysis step of knowledge discovery in databases). Data mining uses many machine learning methods, but with different goals; on the other hand, machine learning also employs data mining methods as "unsupervised learning" or as a preprocessing step to improve learner accuracy. Much of the confusion between these two research communities comes from the basic assumptions they work with: in machine learning, performance is usually evaluated with respect to the ability to reproduce known knowledge, while in knowledge discovery and data mining (KDD) the key task is the discovery of previously unknown knowledge. Evaluated with respect to known knowledge, an uninformed (unsupervised) method will easily be outperformed by other supervised methods, while in a typical KDD task, supervised methods cannot be used due to the unavailability of training data. Machine learning also has intimate ties to optimization: Many learning problems are formulated as minimisation of some loss function on a training set of examples. Loss functions express the discrepancy between the predictions of the model being trained and the actual problem instances (for example, in classification, one wants to assign a label to instances, and models are trained to correctly predict the preassigned labels of a set of examples). === Generalization === Characterizing the generalisation of various learning algorithms is an active topic of current research, especially for deep learning algorithms. === Statistics === Machine learning and statistics are closely related fields in terms of methods, but distinct in their principal goal: statistics draws population inferences from a sample, while machine learning finds generalisable predictive patterns. Conventional statistical analyses require the a priori selection of a model most suitable for the study data set. In addition, only significant or theoretically relevant variables based on previous experience are included for analysis. In contrast, machine learning is not built on a pre-structured model; rather, the data shape the model by detecting underlying patterns. The more variables (input) used to train the model, the more accurate the ultimate model will be. Leo Breiman distinguished two statistical modelling paradigms: the data model and the algorithmic model, wherein "algorithmic model" means more or less the machine learning algorithms like Random forest. Some statisticians have adopted methods from machine learning, producing the field of statistical learning. === Statistical physics === Analytical and computational techniques derived from deep-rooted physics of disordered systems can be extended to large-scale problems, including machine learning, e.g., to analyse the weight space of deep neural networks. Statistical physics is thus

    Read more →
  • Operation Serenata de Amor

    Operation Serenata de Amor

    Operation Serenata de Amor is an artificial intelligence project designed to analyze public spending in Brazil. The project has been funded by a recurrent financing campaign since September 7, 2016, and came in the wake of major scandals of misappropriation of public funds in Brazil, such as the Mensalão scandal and what was revealed in the Operation Car Wash investigations. The analysis began with data from the National Congress then expanded to other types of budget and instances of government, such as the Federal Senate. The project is built through collaboration on GitHub and using a public group with more than 600 participants on Telegram. The name "Serenata de Amor," which means "serenade of love," was taken from a popular cashew cream bonbon produced by Chocolates Garoto in Brazil. == Modules == Throughout development of the project, new modules have been newly introduced in addition to the main repository: The main repository, serenata-de-amor, serves as the starting point for investigative work. Rosie is the robot programmed to identify public funds expenses with discrepancies, starting with CEAP (Quota for Exercise of Parliamentary Activity); it analyzes each of the reimbursements requested by the deputies and senators, indicating the reasons that lead it to believe they are suspicious. From Rosie was born whistleblower, which tweets under the name of @RosieDaSerenata, distributing the results found on social media. Jarbas (Github repository) is a data visualization tool which shows a complete list of reimbursements made available by the Chamber of Deputies and mined by Rosie. Toolbox is a Python installable package that supports the development of Serenata de Amor and Rosie. == History == Operation Serenata de Amor is an Artificial intelligence project for analysis of public expenditures. It was conceived in March 2016 by data scientist Irio Musskopf, sociologist Eduardo Cuducos and entrepreneur Felipe Cabral. The project was financed collectively in the Catarse platform, where it reached 131% of the collection goal paying 3 months of project development. Ana Schwendler, also a data scientist, Pedro Vilanova "Tonny", data journalist, Bruno Pazzim, software engineer, Filipe Linhares, a frontend engineer, Leandro Devegili, an entrepreneur and André Pinho took the first steps towards constructing the platform, such as collecting and structuring the first datasets. Jessica Temporal, data scientist and Yasodara Córdova "Yaso", researcher, Tatiana Balachova "Russa", UX designer, joined the project after the financing took place. The members created a recurring financing campaign, expanding the analysis of public spending to the Federal Senate. Donors make monthly payments ranging from 5 BRL to 200 BRL to maintain group activities. The monthly amount collected is around 10,000 BRL. == Results == In January 2017, concluding the period financed by the initial campaign, the group carried out an investigation into the suspicious activities found by the data analysis system. 629 complaints were made to the Ombudsman's Office of the Chamber of Deputies, questioning expenses of 216 federal deputies. In addition, the Facebook project page has more than 25,000 followers, and users frequently cite the operation as a benchmark in transparency in the Brazilian government. One of the examples of results obtained by the operation is the case of the Deputy who had to return about 700 BRL to the House after his expenses were analyzed by the platform. The platform was able to analyze more than 3 million notes, raising about 8,000 suspected cases in public spending. The community that supports the work of the team benefits from open source repositories, with licenses open for the collaboration. So much so that the two main data scientists of the project presented it at the CivicTechFest in Taipei, obtaining several mentions even in the international press. The technical leader presented the project in Poland during DevConf2017 in Kraków. It was also presented in the Google News Lab in 2017. It was presented by Yaso, when she was the Director of the initiative, at the MIT Media Lab/Berkman Klein Center Initiative for Artificial Intelligence ethics, and at the Artificial Intelligence and Inclusion Symposium, an initiative of the Global Network of Internet & Society Centers (NoC). It was also presented both by Irio and Yaso at the Digital Harvard Kennedy School, over a lunch seminar, where the transparency of the platform and the main solutions found were discussed, so that the code and data are always available to verify its suitability. This infographic provides information about the first results of Operation Serenata de Amor, a project that analyzes open data on public spending to find discrepancies. The project was presented by Yaso to the House Audit and Control Committee of the Chamber of Deputies in August 2017, and raised the interest of House officials who work with open data. The operation has been a source of inspiration for other civic projects that aim to work with similar goals, demonstrating the broader impact of artificial intelligence also in industry in Brazil. Participation of several team members in events throughout Brazil and abroad can be found on the Internet, such as presentation at OpenDataDay, held at Calango Hackerspace in the Federal District, Campus Party Bahia, Campus Party Brasilia, Friends of Tomorrow, XIII National Meeting of Internal Control, in the event USP Talks Hackfest against corruption in João Pessoa, the latter being also highlighted in the National Press.

    Read more →
  • Neural field

    Neural field

    In machine learning, a neural field (also known as implicit neural representation, neural implicit, or coordinate-based neural network), is a mathematical field that is fully or partially parametrized by a neural network. Initially developed to tackle visual computing tasks, such as rendering or reconstruction (e.g., neural radiance fields), neural fields emerged as a promising strategy to deal with a wider range of problems, including surrogate modelling of partial differential equations, such as in physics-informed neural networks. Differently from traditional machine learning algorithms, such as feed-forward neural networks, convolutional neural networks, or transformers, neural fields do not work with discrete data (e.g. sequences, images, tokens), but map continuous inputs (e.g., spatial coordinates, time) to continuous outputs (i.e., scalars, vectors, etc.). This makes neural fields not only discretization independent, but also easily differentiable. Moreover, dealing with continuous data allows for a significant reduction in space complexity, which translates to a much more lightweight network. == Formulation and training == According to the universal approximation theorem, provided adequate learning, sufficient number of hidden units, and the presence of a deterministic relationship between the input and the output, a neural network can approximate any function to any degree of accuracy. Hence, in mathematical terms, given a field y = Φ ( x ) {\textstyle {\boldsymbol {y}}=\Phi ({\boldsymbol {x}})} , with x ∈ R n {\displaystyle {\boldsymbol {x}}\in \mathbb {R} ^{n}} and y ∈ R m {\displaystyle {\boldsymbol {y}}\in \mathbb {R} ^{m}} , a neural field Ψ θ {\displaystyle \Psi _{\theta }} , with parameters θ {\displaystyle {\boldsymbol {\theta }}} , is such that: Ψ θ ( x ) = y ^ ≈ y {\displaystyle \Psi _{\theta }({\boldsymbol {x}})={\hat {\boldsymbol {y}}}\approx {\boldsymbol {y}}} === Training === For supervised tasks, given N {\displaystyle N} examples in the training dataset (i.e., ( x i , y i ) ∈ D t r a i n , i = 1 , … , N {\displaystyle ({\boldsymbol {x_{i}}},{\boldsymbol {y_{i}}})\in {\mathcal {D_{train}}},i=1,\dots ,N} ), the neural field parameters can be learned by minimizing a loss function L {\displaystyle {\mathcal {L}}} (e.g., mean squared error). The parameters θ ~ {\displaystyle {\tilde {\theta }}} that satisfy the optimization problem are found as: θ ~ = argmin θ 1 N ∑ ( x i , y i ) ∈ D t r a i n L ( Ψ θ ( x i ) , y i ) {\displaystyle {\tilde {\boldsymbol {\theta }}}={\underset {\boldsymbol {\theta }}{\text{argmin}}}\;{\frac {1}{N}}\sum _{({\boldsymbol {x_{i}}},{\boldsymbol {y_{i}}})\in {\mathcal {D_{train}}}}{\mathcal {L}}(\Psi _{\theta }({\boldsymbol {x}}_{i}),{\boldsymbol {y}}_{i})} Notably, it is not necessary to know the analytical expression of Φ {\displaystyle \Phi } , for the previously reported training procedure only requires input-output pairs. Indeed, a neural field is able to offer a continuous and differentiable surrogate of the true field, even from purely experimental data. Moreover, neural fields can be used in unsupervised settings, with training objectives that depend on the specific task. For example, physics-informed neural networks may be trained on just the residual. === Spectral bias === As for any artificial neural network, neural fields may be characterized by a spectral bias (i.e., the tendency to preferably learn the low frequency content of a field), possibly leading to a poor representation of the ground truth. In order to overcome this limitation, several strategies have been developed. For example, SIREN uses sinusoidal activations, while the Fourier-features approach embeds the input through sines and cosines. == Conditional neural fields == In many real-world cases, however, learning a single field is not enough. For example, when reconstructing 3D vehicle shapes from Lidar data, it is desirable to have a machine learning model that can work with arbitrary shapes (e.g., a car, a bicycle, a truck, etc.). The solution is to include additional parameters, the latent variables (or latent code) z ∈ R d {\displaystyle {\boldsymbol {z}}\in \mathbb {R} ^{d}} , to vary the field and adapt it to diverse tasks. === Latent code production === When dealing with conditional neural fields, the first design choice is represented by the way in which the latent code is produced. Specifically, two main strategies can be identified: Encoder: the latent code is the output of a second neural network, acting as an encoder. During training, the loss function is the objective used to learn the parameters of both the neural field and the encoder. Auto-decoding: each training example has its own latent code, jointly trained with the neural field parameters. When the model has to process new examples (i.e., not originally present in the training dataset), a small optimization problem is solved, keeping the network parameters fixed and only learning the new latent variables. Since the latter strategy requires additional optimization steps at inference time, it sacrifices speed, but keeps the overall model smaller. Moreover, despite being simpler to implement, an encoder may harm the generalization capabilities of the model. For example, when dealing with a physical scalar field f : R 2 → R {\displaystyle f:\mathbb {R} ^{2}\rightarrow \mathbb {R} } (e.g., the pressure of a 2D fluid), an auto-decoder-based conditional neural field can map a single point to the corresponding value of the field, following a learned latent code z {\displaystyle {\boldsymbol {z}}} . However, if the latent variables were produced by an encoder, it would require access to the entire set of points and corresponding values (e.g. as a regular grid or a mesh graph), leading to a less robust model. === Global and local conditioning === In a neural field with global conditioning, the latent code does not depend on the input and, hence, it offers a global representation (e.g., the overall shape of a vehicle). However, depending on the task, it may be more useful to divide the domain of x {\displaystyle {\boldsymbol {x}}} in several subdomains, and learn different latent codes for each of them (e.g., splitting a large and complex scene in sub-scenes for a more efficient rendering). This is called local conditioning. === Conditioning strategies === There are several strategies to include the conditioning information in the neural field. In the general mathematical framework, conditioning the neural field with the latent variables is equivalent to mapping them to a subset θ ∗ {\displaystyle {\boldsymbol {\theta }}^{}} of the neural field parameters: θ ∗ = Γ ( z ) {\displaystyle {\boldsymbol {\theta }}^{}=\Gamma ({\boldsymbol {z}})} In practice, notable strategies are: Concatenation: the neural field receives, as input, the concatenation of the original input x {\displaystyle {\boldsymbol {x}}} with the latent codes z {\displaystyle {\boldsymbol {z}}} . For feed-forward neural networks, this is equivalent to setting θ ∗ {\displaystyle {\boldsymbol {\theta }}^{}} as the bias of the first layer and Γ ( z ) {\displaystyle \Gamma ({\boldsymbol {z}})} as an affine transformation. Hypernetworks: a hypernetwork is a neural network that outputs the parameters of another neural network. Specifically, it consists of approximating Γ ( z ) {\displaystyle \Gamma ({\boldsymbol {z}})} with a neural network Γ ^ γ ( z ) {\displaystyle {\hat {\Gamma }}_{\gamma }({\boldsymbol {z}})} , where γ {\displaystyle {\boldsymbol {\gamma }}} are the trainable parameters of the hypernetwork. This approach is the most general, as it allows to learn the optimal mapping from latent codes to neural field parameters. However, hypernetworks are associated to larger computational and memory complexity, due to the large number of trainable parameters. Hence, leaner approaches have been developed. For example, in the Feature-wise Linear Modulation (FiLM), the hypernetwork only produces scale and bias coefficients for the neural field layers. === Meta-learning === Instead of relying on the latent code to adapt the neural field to a specific task, it is also possible to exploit gradient-based meta-learning. In this case, the neural field is seen as the specialization of an underlying meta-neural-field, whose parameters are modified to fit the specific task, through a few steps of gradient descent. An extension of this meta-learning framework is the CAVIA algorithm, that splits the trainable parameters in context-specific and shared groups, improving parallelization and interpretability, while reducing meta-overfitting. This strategy is similar to the auto-decoding conditional neural field, but the training procedure is substantially different. == Applications == Thanks to the possibility of efficiently modelling diverse mathematical fields with neural networks, neural fields have been applied to a wide range of problems: 3D scene reconstruction: neural fields can be used to model t

    Read more →
  • AI Security Institute

    AI Security Institute

    The AI Security Institute (AISI) is a research organisation under the Department for Science, Innovation and Technology, UK, that aims "to equip governments with a scientific understanding of the risks posed by advanced AI". It conducts research and develop and test mitigations. Previously, it was known as the AI Safety Institute. Its creation followed world's first major AI Safety Summit that was held in Bletchley Park in 2023. The institute's professed goal is "building the world's leading understanding of advanced AI risks and solutions, to inform governments so they can keep the public safe". It is designed like a startup in the government "combining the authority of government with the expertise and agility of the private sector". AISI has made access agreements with Anthropic, Google and OpenAI to test their models before release. It has an open source platform called Inspect that permits companies, governments and academics to run standardised safety tests for AI usage. Among the works AISI has done is the reported detection of multiple serious vulnerabilities that could enable development of biological weapons; the vulnerabilities were fixed before the model was launched. It conducts research on diverse fields of AI application. One study by AISI found that LLMs post-trained for political persuasiveness became systematically less accurate and up to 51% more persuasive on political issues. AISI has also worked on the usage of AI for emotional needs. It found that nearly 10 percent of UK citizens used systems like chatbots for emotional purposes on a weekly basis. It found that "systems are now outperforming PhD-level researchers on scientific knowledge tests and helping non-experts succeed at lab work that would previously have been out of reach" in a report published in December 2025. Former chief AI officer of GCHQ Adam Beaumont is the institution's interim director. UK prime minister's AI advisor Jade Leung is the chief technology officer.

    Read more →
  • Cross-entropy method

    Cross-entropy method

    The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases: Draw a sample from a probability distribution. Minimize the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration. Reuven Rubinstein developed the method in the context of rare-event simulation, where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems. The method has also been applied to the traveling salesman, quadratic assignment, DNA sequence alignment, max-cut and buffer allocation problems. == Estimation via importance sampling == Consider the general problem of estimating the quantity ℓ = E u [ H ( X ) ] = ∫ H ( x ) f ( x ; u ) d x {\displaystyle \ell =\mathbb {E} _{\mathbf {u} }[H(\mathbf {X} )]=\int H(\mathbf {x} )\,f(\mathbf {x} ;\mathbf {u} )\,{\textrm {d}}\mathbf {x} } , where H {\displaystyle H} is some performance function and f ( x ; u ) {\displaystyle f(\mathbf {x} ;\mathbf {u} )} is a member of some parametric family of distributions. Using importance sampling this quantity can be estimated as ℓ ^ = 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) g ( X i ) {\displaystyle {\hat {\ell }}={\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{g(\mathbf {X} _{i})}}} , where X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} is a random sample from g {\displaystyle g\,} . For positive H {\displaystyle H} , the theoretically optimal importance sampling density (PDF) is given by g ∗ ( x ) = H ( x ) f ( x ; u ) / ℓ {\displaystyle g^{}(\mathbf {x} )=H(\mathbf {x} )f(\mathbf {x} ;\mathbf {u} )/\ell } . This, however, depends on the unknown ℓ {\displaystyle \ell } . The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the Kullback–Leibler sense) to the optimal PDF g ∗ {\displaystyle g^{}} . == Generic CE algorithm == Choose initial parameter vector v ( 0 ) {\displaystyle \mathbf {v} ^{(0)}} ; set t = 1. Generate a random sample X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} from f ( ⋅ ; v ( t − 1 ) ) {\displaystyle f(\cdot ;\mathbf {v} ^{(t-1)})} Solve for v ( t ) {\displaystyle \mathbf {v} ^{(t)}} , where v ( t ) = argmax v ⁡ 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) f ( X i ; v ( t − 1 ) ) log ⁡ f ( X i ; v ) {\displaystyle \mathbf {v} ^{(t)}=\mathop {\textrm {argmax}} _{\mathbf {v} }{\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})}}\log f(\mathbf {X} _{i};\mathbf {v} )} If convergence is reached then stop; otherwise, increase t by 1 and reiterate from step 2. In several cases, the solution to step 3 can be found analytically. Situations in which this occurs are When f {\displaystyle f\,} belongs to the natural exponential family When f {\displaystyle f\,} is discrete with finite support When H ( X ) = I { x ∈ A } {\displaystyle H(\mathbf {X} )=\mathrm {I} _{\{\mathbf {x} \in A\}}} and f ( X i ; u ) = f ( X i ; v ( t − 1 ) ) {\displaystyle f(\mathbf {X} _{i};\mathbf {u} )=f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})} , then v ( t ) {\displaystyle \mathbf {v} ^{(t)}} corresponds to the maximum likelihood estimator based on those X k ∈ A {\displaystyle \mathbf {X} _{k}\in A} . == Continuous optimization—example == The same CE algorithm can be used for optimization, rather than estimation. Suppose the problem is to maximize some function S {\displaystyle S} , for example, S ( x ) = e − ( x − 2 ) 2 + 0.8 e − ( x + 2 ) 2 {\displaystyle S(x)={\textrm {e}}^{-(x-2)^{2}}+0.8\,{\textrm {e}}^{-(x+2)^{2}}} . To apply CE, one considers first the associated stochastic problem of estimating P θ ( S ( X ) ≥ γ ) {\displaystyle \mathbb {P} _{\boldsymbol {\theta }}(S(X)\geq \gamma )} for a given level γ {\displaystyle \gamma \,} , and parametric family { f ( ⋅ ; θ ) } {\displaystyle \left\{f(\cdot ;{\boldsymbol {\theta }})\right\}} , for example the 1-dimensional Gaussian distribution, parameterized by its mean μ t {\displaystyle \mu _{t}\,} and variance σ t 2 {\displaystyle \sigma _{t}^{2}} (so θ = ( μ , σ 2 ) {\displaystyle {\boldsymbol {\theta }}=(\mu ,\sigma ^{2})} here). Hence, for a given γ {\displaystyle \gamma \,} , the goal is to find θ {\displaystyle {\boldsymbol {\theta }}} so that D K L ( I { S ( x ) ≥ γ } ‖ f θ ) {\displaystyle D_{\mathrm {KL} }({\textrm {I}}_{\{S(x)\geq \gamma \}}\|f_{\boldsymbol {\theta }})} is minimized. This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above. It turns out that parameters that minimize the stochastic counterpart for this choice of target distribution and parametric family are the sample mean and sample variance corresponding to the elite samples, which are those samples that have objective function value ≥ γ {\displaystyle \geq \gamma } . The worst of the elite samples is then used as the level parameter for the next iteration. This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an estimation of distribution algorithm. === Pseudocode === // Initialize parameters μ := −6 σ2 := 100 t := 0 maxits := 100 N := 100 Ne := 10 // While maxits not exceeded and not converged while t < maxits and σ2 > ε do // Obtain N samples from current sampling distribution X := SampleGaussian(μ, σ2, N) // Evaluate objective function at sampled points S := exp(−(X − 2) ^ 2) + 0.8 exp(−(X + 2) ^ 2) // Sort X by objective function values in descending order X := sort(X, S) // Update parameters of sampling distribution via elite samples μ := mean(X(1:Ne)) σ2 := var(X(1:Ne)) t := t + 1 // Return mean of final sampling distribution as solution return μ == Related methods == Simulated annealing Genetic algorithms Harmony search Estimation of distribution algorithm Tabu search Natural Evolution Strategy Ant colony optimization algorithms

    Read more →
  • Agent verification

    Agent verification

    Agent verification is activity to gain assurances that purposeful artificial constructs act in accordance with their specifications. While primitive forms of inorganic agents have been used in manufacturing for centuries, the study of artificial agents did not begin until the mid 20th century. Foundational work on such agents was closely bound with the emergence of artificial intelligence as an academic discipline. Early agents deployed for industrial control systems and in computing were often controlled by quite simple logic however, not involving artificial intelligence as such. When deployed as part of a multi-agent system, even such simple agents could require special agent orientated testing methods, as their collective behaviour was challenging to verify with traditional testing techniques. Difficulties in providing assurances that agents will not behave in dangerous ways became more prevalent after the introduction of LLM agents, especially after the rapid acceleration of their deployment in 2025. The verification of agent behaviour can be conducted by formal or informal methods. Informal verification requires less mathematical skill. But when agents are part of systems where errors have significant risks — such as danger to human life, environmental damage or major financial loss — formal verification is preferred. Both regulators and system designers themselves like formal verification as it provides a high degree of mathematical certainty. It is not however always possible to formally test all aspects of an agent based system's behaviour, especially where newer LLM based agents are concerned, due in part to their high degree of autonomy. Accordingly, agent verification for low impact deployments might be carried out only with informal methods, while for high impact deployments, it may be performed with a mix of formal and informal techniques. == Terminology == In academia, the term agent verification is often defined to mean activity concerned with gaining assurance that the agent behaves in accordance with its specification - whether by processes such as testing or simulation. 'Verification' is typically contrasted with 'validation', the latter meaning activity concerned with checking that the specification itself meets user or real world needs. Such definitions are not universally adhered to however - for example, in some workplaces and documents, the words 'verification' and 'validation' can be used synonymously. Efforts to gain confidence in Agents have intensified sharply since 2025 due to the rapid roll out of LLM agents; different terms are sometimes used in the commercial sector. Here the term 'agent verification' can be used in the same sense as it is in academia, but sometimes the same activity can be covered by more ambiguous and wider ranging terms such as 'Agent governance' , 'Agent observability' or 'AI agent policing'. == History == === Classical agents === The theoretical underpinnings for artificial (inorganic) agents emerged in the mid 20th century, with establishment of cybernetics and artificial intelligence. Oliver Selfridge's 1958 Pandemonium - A Paradigm for Learning paper was an important early theoretical contribution in establishing agent oriented architecture. Practical implementations of agents for real world applications began to become widespread in the 1990s, after the introduction of the belief–desire–intention software model (BDI), and agent-oriented programming. Pure digital agents were deployed in computer infrastructure for purposes such as monitoring, while agents connected to real-world sensors and actuators were increasingly used in industrial control systems. While the concept of artificial agents was interwoven with early artificial intelligence studies right from the start, early agents lacked general purpose reasoning capabilities, often only having simple if then logic. Even a device as simple as a thermostat, which has a sensor and a means of acting, can be considered a proto agent in this sense. Verifying the behaviours of a simple single agent system is not generally especially difficult, but it can be a different matter when several simple agents coexist in the same system. Craig Reynolds's work on boids showed that relatively complex, "intelligent" behaviour can emerge from a number of such simple agents working together in a Multi-agent system (MAS). By the 1990s, even the behaviour of a single agent system could sometimes be quite complex; in accordance with the Belief–desire–intention software model, agents could have believes that might evolve over time. Agents were increasingly introduced that were controlled by quite large decision tree models, which had new vulnerabilities to adversarial attack. It was becoming increasingly apparent that traditional software verification methods had limitations for testing such agents, or even for the more primitive type of agents when they were deployed as part of a MAS. It was the use of agents for industrial control systems, sometimes associated with robotics, that lent urgency to the practice of agent verification. Informal testing might be acceptable for digital agents used say to monitor whether each of an organisation's computers are properly licensed. But with an increasing potential for faulty agents to result in a failure that might cause a large fire to break out at a chemical manufacturing plant, a botched medical operation, or even a crashed aircraft, the need to develop reliable means of verifying behaviour of such agents was considered urgent. The Foundation for Intelligent Physical Agents was established in 1996. From the late 90s, a growing number of industry and university based scientists began working on the problem, with researchers publishing papers on the verification of both single and multi agent systems. Much of this work showed how formal verification techniques like model checking could be used to gain a high level of assurance that agent based systems would conform with their specification. A 2018 systematic review covering 231 studies found that model checking was the most common technique for agent verification, with theorem proving the second most commonly used formal verification method. In the first two decades of the 20th century, agents run by AI became more common, with Siri and Alexa being well known examples. But such agents still lacked general reasoning capabilities and did not pose new pressing problems for agent verification. === General purpose reasoning agents === The advent of LLMs created huge potential for further use of artificial agents, as agents based on them could have general purpose cognitive abilities. Agents run by LLMs (and occasionally non-LLM foundation models) have similar vulnerability to adversarial attack as those run by decision tree models. The wider scope of actions for LLM agents has created new challenges for their verification, over and above those present for classical agents. For example, the LLM's neural network endows it with infinite domains, an especial challenge for traditional formal verification techniques. Academics began to study the problems involved in verifying LLM agents from 2018. Deployment of such agents began to accelerate in late 2023 after OpenAI's "function-calling" API was made available, and especially after Anthropic's late 2024 introduction of Model Context Protocol (MCP), a standardised way for LLM agents to gain contextual awareness, and to act on the world by calling various external tools. The rapid rollout of LLM agents following MCP's release has seen the task of agent verification receive increased attention within academia, and also from the private sector. In 2024 and 2025 several startups focusing on LLM agent verification have been founded in both Europe and the US to meet growing demand. == Approaches == === Formal verification === Formal verification involves proving the correctness of some or all aspects of a system using mathematical methods. Such methods can range from manual formal proof, to verification assisted with automated theorem provers like Isabelle. For agent verification, model checking is by far the most frequently used formal verification method; for pre-LLM models it was often complemented with techniques using computation tree logic. Another common method is theorem proving. Formal verification provides a higher degree of confidence than informal methods, but it is not always used, even when it is possible. Sometimes a person or organisation developing software agents won't have the necessary skills, or may not see it as worth the effort if the agent(s) will not have the ability to cause much harm even if they malfunction. When agents are deployed in systems where errors could have serious consequences, the ability of formal verification methods to provide mathematical certainty tends to be strongly preferred by both regulators and designers themselves. But even for high impact systems, formal verificatio

    Read more →
  • CloudMinds

    CloudMinds

    CloudMinds is an operator of cloud-based systems for cognitive robotics. == History == CloudMinds was founded in 2015 and is backed by SoftBank, Foxconn, Walden Venture Investments, and Keytone Ventures. CloudMinds has developed research in smart devices, robot control, high-speed security networks, and cloud intelligence integration. CloudMinds developed the Mobile Intranet Cloud Services (MCS) based on these technologies in order to increase the information security of the cloud robot remote control. The technology has been applied in the fields of finance, medicine, the military, public safety, and large-scale manufacturing. == U.S. sanctions == In May 2020, CloudMinds was added to the Bureau of Industry and Security's Entity List due to U.S. national security concerns.

    Read more →
  • Inductive programming

    Inductive programming

    Inductive programming (IP) is a special area of automatic programming, covering research from artificial intelligence and programming, which addresses learning of typically declarative (logic or functional) and often recursive programs from incomplete specifications, such as input/output examples or constraints. Depending on the programming language used, there are several kinds of inductive programming. Inductive functional programming, which uses functional programming languages such as Lisp or Haskell, and most especially inductive logic programming, which uses logic programming languages such as Prolog and other logical representations such as description logics, have been more prominent, but other (programming) language paradigms have also been used, such as constraint programming or probabilistic programming. == Definition == Inductive programming incorporates all approaches which are concerned with learning programs or algorithms from incomplete (formal) specifications. Possible inputs in an IP system are a set of training inputs and corresponding outputs or an output evaluation function, describing the desired behavior of the intended program, traces or action sequences which describe the process of calculating specific outputs, constraints for the program to be induced concerning its time efficiency or its complexity, various kinds of background knowledge such as standard data types, predefined functions to be used, program schemes or templates describing the data flow of the intended program, heuristics for guiding the search for a solution or other biases. Output of an IP system is a program in some arbitrary programming language containing conditionals and loop or recursive control structures, or any other kind of Turing-complete representation language. In many applications the output program must be correct with respect to the examples and partial specification, and this leads to the consideration of inductive programming as a special area inside automatic programming or program synthesis, usually opposed to 'deductive' program synthesis, where the specification is usually complete. In other cases, inductive programming is seen as a more general area where any declarative programming or representation language can be used and we may even have some degree of error in the examples, as in general machine learning, the more specific area of structure mining or the area of symbolic artificial intelligence. A distinctive feature is the number of examples or partial specification needed. Typically, inductive programming techniques can learn from just a few examples. The diversity of inductive programming usually comes from the applications and the languages that are used: apart from logic programming and functional programming, other programming paradigms and representation languages have been used or suggested in inductive programming, such as functional logic programming, constraint programming, probabilistic programming, abductive logic programming, modal logic, action languages, agent languages and many types of imperative languages. == History == The early works of Plotkin, and his "relative least general generalization (rlgg)", had an enormous impact in inductive logic programming. There were some encouraging results on learning recursive Prolog programs such as quicksort from examples together with suitable background knowledge, for example with GOLEM. However, after initial success, the community got disappointed by limited progress about the induction of recursive programs with ILP less and less focusing on recursive programs and leaning more and more towards a machine learning setting with applications in relational data mining and knowledge discovery. In parallel to work in ILP, Koza proposed genetic programming in the early 1990s as a generate-and-test based approach to learning programs. The idea of genetic programming was further developed into the inductive programming system ADATE and the systematic-search-based system MagicHaskeller. Here again, functional programs are learned from sets of positive examples together with an output evaluation (fitness) function which specifies the desired input/output behavior of the program to be learned. The early work in grammar induction (also known as grammatical inference) is related to inductive programming, as rewriting systems or logic programs can be used to represent production rules. In fact, early works in inductive inference considered grammar induction and Lisp program inference as basically the same problem. The results in terms of learnability were related to classical concepts, such as identification-in-the-limit, as introduced in the seminal work of Gold. More recently, the language learning problem was addressed by the inductive programming community. In the recent years, the classical approaches have been resumed and advanced with great success. Therefore, the synthesis problem has been reformulated on the background of constructor-based term rewriting systems taking into account modern techniques of functional programming, as well as moderate use of search-based strategies and usage of background knowledge as well as automatic invention of subprograms. Many new and successful applications have recently appeared beyond program synthesis, most especially in the area of data manipulation, programming by example and cognitive modelling (see below). Other ideas have also been explored with the common characteristic of using declarative languages for the representation of hypotheses. For instance, the use of higher-order features, schemes or structured distances have been advocated for a better handling of recursive data types and structures; abstraction has also been explored as a more powerful approach to cumulative learning and function invention. One powerful paradigm that has been recently used for the representation of hypotheses in inductive programming (generally in the form of generative models) is probabilistic programming (and related paradigms, such as stochastic logic programs and Bayesian logic programming). == Application areas == The first workshop on Approaches and Applications of Inductive Programming (AAIP) Archived 2016-03-03 at the Wayback Machine held in conjunction with ICML 2005 identified all applications where "learning of programs or recursive rules are called for, [...] first in the domain of software engineering where structural learning, software assistants and software agents can help to relieve programmers from routine tasks, give programming support for end users, or support of novice programmers and programming tutor systems. Further areas of application are language learning, learning recursive control rules for AI-planning, learning recursive concepts in web-mining or for data-format transformations". Since then, these and many other areas have shown to be successful application niches for inductive programming, such as end-user programming, the related areas of programming by example and programming by demonstration, and intelligent tutoring systems. Other areas where inductive inference has been recently applied are knowledge acquisition, artificial general intelligence, reinforcement learning and theory evaluation, and cognitive science in general. There may also be prospective applications in intelligent agents, games, robotics, personalisation, ambient intelligence and human interfaces.

    Read more →
  • Instance (computer science)

    Instance (computer science)

    In computer science, an instance or token (from metalogic and metamathematics) is a specific occurrence of a software element that is based on a type definition. When created, an occurrence is said to have been instantiated, and both the creation process and the result of creation are called instantiation. == Examples == Chat AI instance In chat-based AI systems, an assistant can be invoked across many independent conversation sessions (often called a thread), each with its own message history. A specific execution of the assistant over that session may be represented as a run (an execution on a thread). Class instance In object-oriented programming, an object created from a class type. Each instance of a class shares the class-defined structure and behavior but has its own identity and state. Procedural instance In some contexts (including Simula), each procedure call can be viewed as an instance of that procedure—an activation with its own parameters and local variables. Computer instance In cloud computing and virtualization, an instance commonly refers to a provisioned virtual machine or virtual server with an allocated combination of compute, memory, network, and storage resources. Polygonal model In computer graphics, a model may be instanced so it can be drawn multiple times with different transforms and parameters, improving performance by reusing shared geometry data. Program instance In a POSIX-oriented operating system, a running process is an instance of a program. It can be instantiated via system calls such as fork() and exec(). Each executing process is an instance of a program it has been instantiated from.

    Read more →
  • Cost-sensitive machine learning

    Cost-sensitive machine learning

    Cost-sensitive machine learning is an approach within machine learning that considers varying costs associated with different types of errors. This method diverges from traditional approaches by introducing a cost matrix, explicitly specifying the penalties or benefits for each type of prediction error. The inherent difficulty which cost-sensitive machine learning tackles is that minimizing different kinds of classification errors is a multi-objective optimization problem. == Overview == Cost-sensitive machine learning optimizes models based on the specific consequences of misclassifications, making it a valuable tool in various applications. It is especially useful in problems with a high imbalance in class distribution and a high imbalance in associated costs Cost-sensitive machine learning introduces a scalar cost function in order to find one (of multiple) Pareto optimal points in this multi-objective optimization problem (similar to the Weighted sum model) == Cost Matrix == The cost matrix is a crucial element within cost-sensitive modeling, explicitly defining the costs or benefits associated with different prediction errors in classification tasks. Represented as a table, the matrix aligns true and predicted classes, assigning a cost value to each combination. For instance, in binary classification, it may distinguish costs for false positives and false negatives. The utility of the cost matrix lies in its application to calculate the expected cost or loss. The formula, expressed as a double summation, utilizes joint probabilities: Expected Loss = ∑ i ∑ j P ( Actual i , Predicted j ) ⋅ Cost Actual i , Predicted j {\displaystyle {\text{Expected Loss}}=\sum _{i}\sum _{j}P({\text{Actual}}_{i},{\text{Predicted}}_{j})\cdot {\text{Cost}}_{{\text{Actual}}_{i},{\text{Predicted}}_{j}}} Here, P ( Actual i , Predicted j ) {\displaystyle P({\text{Actual}}_{i},{\text{Predicted}}_{j})} denotes the joint probability of actual class i {\displaystyle i} and predicted class j {\displaystyle j} , providing a nuanced measure that considers both the probabilities and associated costs. This approach allows practitioners to fine-tune models based on the specific consequences of misclassifications, adapting to scenarios where the impact of prediction errors varies across classes. == Applications == === Fraud Detection === In the realm of data science, particularly in finance, cost-sensitive machine learning is applied to fraud detection. By assigning different costs to false positives and false negatives, models can be fine-tuned to minimize the overall financial impact of misclassifications. === Medical Diagnostics === In healthcare, cost-sensitive machine learning plays a role in medical diagnostics. The approach allows for customization of models based on the potential harm associated with misdiagnoses, ensuring a more patient-centric application of machine learning algorithms. == Challenges == A typical challenge in cost-sensitive machine learning is the reliable determination of the cost matrix which may evolve over time. == Literature == Cost-Sensitive Machine Learning. USA, CRC Press, 2011. ISBN 9781439839287 Abhishek, K., Abdelaziz, D. M. (2023). Machine Learning for Imbalanced Data: Tackle Imbalanced Datasets Using Machine Learning and Deep Learning Techniques. (n.p.): Packt Publishing. ISBN 9781801070881

    Read more →
  • Kuaishou

    Kuaishou

    Kuaishou Technology is a Chinese publicly traded partly state-owned holding company based in Haidian District, Beijing, that was founded in 2011 by Hua Su (Chinese: 宿华) and Cheng Yixiao (Chinese: 程一笑). The company, listed on the Hong Kong Stock Exchange, is known for developing a mobile app for sharing users' short videos, a social network, and video special effects editor. The app is known as Kwai in many countries outside of China. It is also known as Snack Video in India, Pakistan and Indonesia. == Ownership and governance == Kuaishou's overseas team is led by the former CEO of the application 99, and staff from Google, Facebook, Netflix, and TikTok were recruited to lead the company's international expansion. The China Internet Investment Fund, a state-owned enterprise controlled by the Cyberspace Administration of China, holds a golden share ownership stake in Kuaishou. == History == Kuaishou is China's first short video platform that was developed in 2011 by engineer Hua Su and Cheng Yixiao. Prior to co-founding Kuaishou, Su Hua had worked for both Google and Baidu as a software engineer. The company is headquartered in Haidian District, Beijing. Kuaishou's predecessor "GIF Kuaishou" was founded in March 2011. GIF Kuaishou was a mobile app with which users could make and share GIF pictures. In 2013, Kuaishou became a short-video social platform. By 2013, the app had reached 100 million daily users. By 2019, it had exceeded 200 million active daily users. In March 2017, Kuaishou closed a US$350 million investment round that was led by Tencent. In January 2018, Forbes estimated the company's valuation to be US$18 billion. In April 2018, Kuaishou's app was briefly banned from Chinese app stores after China Central Television (CCTV) reported on the platform popularizing videos of teenage mothers. In 2019, the company announced a partnership with the People's Daily, an official newspaper of the Central Committee of the Chinese Communist Party, to help it experiment with the use of artificial intelligence in news. In June 2020, following the start of the 2020–2021 China–India skirmishes, the Government of India banned Kwai along with 58 other apps, citing "data and privacy issues". In January 2021, Kuaishou announced it was planning an initial public offering (IPO) to raise approximately US$5 billion. Kuaishou's stock completed its first day of trading at $300 Hong Kong dollars (HKD) (US$38.70), more than doubling its initial offer price, and causing its market value to rise to over $1 trillion HKD (US$159 billion). In February 2021, Kuaishou made a debut on the Hong Kong Stock Exchange, with its shares soaring by 194% at the opening. The company subsequently encountered major setbacks as a result of heightened regulatory restrictions on Chinese internet firms, which contributed to its share price falling by nearly 80% from its post-IPO peak. By December 2021, Kuaishou announced a major reorganization, including the layoff of 30% of its staff, primarily targeting mid-level employees earning an annual salary of $157,000 or more. This restructuring aimed to cut costs and mitigate financial losses. In October 2022, state-owned Beijing Radio and Television Station took a minority ownership stake in Kuaishou. In April 2024, a Financial Times article citing current and former Kuaishou employees stated that the company has been running an ageist redundancy programme known internally as "Limestone", culling workers in their mid-30s. In June 2024, Kuaishou and the Sichuan international communication center launched a branch center in São Paulo, Brazil. In June 2024, Kuaishou released its diffusion transformer text-to-video model, Kling, which they claimed could generate two minutes of video at 30 frames per second and in 1080p resolution. The model has been compared to that of OpenAI's Sora text-to-video model. It is accessible to the public on Kuaishou's video editing app KwaiCut via signing up for a waitlist with a Chinese phone number. In December 2025, Kuaishou came under a cyberattack which led to a temporary influx of violent and pornographic content. == Popularity == As of 2019, it had a worldwide user base of over 200 million, leading the "Most Downloaded" lists of the Google Play and Apple App Store in eight countries, such as Brazil, where it was introduced in 2019. Its main short-video platform competitor was Douyin, which is known as TikTok outside China. Compared to Douyin, Kuaishou is more popular with older users living outside China's Tier 1 cities. Its initial popularity came from videos of Chinese rural life. The app is particularly well known for its "rustic" aesthetic and is popular among rural people. Kuaishou also relied more on e-commerce revenue than on advertising revenue compared to its main competitor. == Reception == Kwai (as the app is called outside of China) was banned in India in 2020 along with other short video apps like TikTok. Kuaishou then released the clone SnackVideo, which was subsequently also banned. The app is one of the most popular social media platforms in Brazil, where Kuaishou partnered with creators to make telenovela style content, and appeals to football fans by working with football teams CR Flamengo and Santos FC and sponsoring the tournament Copa América. Kwai was notable in Brazil for spreading information (and misinformation) about the COVID-19 vaccine and political misinformation. === Manjiao Wenhua === "Manjiao wenhua" (慢脚文化) is a sarcasm term on Chinese internet on the unethical or illegal contents on Kuaishou. State broadcaster China Central Television (CCTV) reported that many contents are about child pregnancy. "Dating, pregnancy, bearing a child...these are strictly prohibited in the real time by a minor, but these contents can easily shown to audiences here." In addition, many students from primary or secondary schools make a pose of smoking. Wang Zhenhui (王贞会) from CUPSL stated that these kinds of bad values will give negative effects to the minors.

    Read more →
  • Incremental heuristic search

    Incremental heuristic search

    Incremental heuristic search algorithms combine both incremental and heuristic search to speed up searches of sequences of similar search problems, which is important in domains that are only incompletely known or change dynamically. Incremental search has been studied at least since the late 1960s. Incremental search algorithms reuse information from previous searches to speed up the current search and solve search problems potentially much faster than solving them repeatedly from scratch. Similarly, heuristic search has also been studied at least since the late 1960s. Heuristic search algorithms, often based on A, use heuristic knowledge in the form of approximations of the goal distances to focus the search and solve search problems potentially much faster than uninformed search algorithms. The resulting search problems, sometimes called dynamic path planning problems, are graph search problems where paths have to be found repeatedly because the topology of the graph, its edge costs, the start vertex or the goal vertices change over time. So far, three main classes of incremental heuristic search algorithms have been developed: The first class restarts A at the point where its current search deviates from the previous one (example: Fringe Saving A). The second class updates the h-values (heuristic, i.e. approximate distance to goal) from the previous search during the current search to make them more informed (example: Generalized Adaptive A). The third class updates the g-values (distance from start) from the previous search during the current search to correct them when necessary, which can be interpreted as transforming the A search tree from the previous search into the A search tree for the current search (examples: Lifelong Planning A, D, D Lite). All three classes of incremental heuristic search algorithms are different from other replanning algorithms, such as planning by analogy, in that their plan quality does not deteriorate with the number of replanning episodes. == Applications == Incremental heuristic search has been extensively used in robotics, where a larger number of path planning systems are based on either D (typically earlier systems) or D Lite (current systems), two different incremental heuristic search algorithms.

    Read more →
  • Grammar systems theory

    Grammar systems theory

    Grammar systems theory is a field of theoretical computer science that studies systems of finite collections of formal grammars generating a formal language. Each grammar works on a string, a so-called sequential form that represents an environment. Grammar systems can thus be used as a formalization of decentralized or distributed systems of agents in artificial intelligence. Let A {\displaystyle \mathbb {A} } be a simple reactive agent moving on the table and trying not to fall down from the table with two reactions, t for turning and ƒ for moving forward. The set of possible behaviors of A {\displaystyle \mathbb {A} } can then be described as formal language L A = { ( f m t n f r ) + : 1 ≤ m ≤ k ; 1 ≤ n ≤ ℓ ; 1 ≤ r ≤ k } , {\displaystyle \mathbb {L_{A}} =\{(f^{m}t^{n}f^{r})^{+}:1\leq m\leq k;1\leq n\leq \ell ;1\leq r\leq k\},} where ƒ can be done maximally k times and t can be done maximally ℓ times considering the dimensions of the table. Let G A {\displaystyle \mathbb {G_{A}} } be a formal grammar which generates language L A {\displaystyle \mathbb {L_{A}} } . The behavior of A {\displaystyle \mathbb {A} } is then described by this grammar. Suppose the A {\displaystyle \mathbb {A} } has a subsumption architecture; each component of this architecture can be then represented as a formal grammar, too, and the final behavior of the agent is then described by this system of grammars. The schema on the right describes such a system of grammars which shares a common string representing an environment. The shared sequential form is sequentially rewritten by each grammar, which can represent either a component or generally an agent. If grammars communicate together and work on a shared sequential form, it is called a Cooperating Distributed (DC) grammar system. Shared sequential form is a similar concept to the blackboard approach in AI, which is inspired by an idea of experts solving some problem together while they share their proposals and ideas on a shared blackboard. Each grammar in a grammar system can also work on its own string and communicate with other grammars in a system by sending their sequential forms on request. Such a grammar system is then called a Parallel Communicating (PC) grammar system. PC and DC are inspired by distributed AI. If there is no communication between grammars, the system is close to the decentralized approaches in AI. These kinds of grammar systems are sometimes called colonies or Eco-Grammar systems, depending (besides others) on whether the environment is changing on its own (Eco-Grammar system) or not (colonies).

    Read more →