A chatbot is a software application or web interface that is designed to mimic human conversation through text or voice interactions. Modern chatbots are typically online and use generative artificial intelligence systems that are capable of maintaining a conversation with a user in natural language and simulating the way a human would behave as a conversational partner. Such chatbots often use large language models (LLMs) and natural language processing, but simpler chatbots have existed for decades. == LLM chatbots == == General chatbots == == Historical chatbots ==
Image stitching
Image stitching or photo stitching is the process of combining multiple photographic images with overlapping fields of view to produce a segmented panorama or high-resolution image. Commonly performed through the use of computer software, most approaches to image stitching require nearly exact overlaps between images and identical exposures to produce seamless results, although some stitching algorithms actually benefit from differently exposed images by doing high-dynamic-range imaging in regions of overlap. Some digital cameras can stitch their photos internally. == Applications == Image stitching is widely used in modern applications, such as the following: Document mosaicing Image stabilization feature in camcorders that use frame-rate image alignment High-resolution image mosaics in digital maps and satellite imagery Medical imaging Multiple-image super-resolution imaging Video stitching Object insertion == Process == The image stitching process can be divided into three main components: image registration, calibration, and blending. === Image stitching algorithms === In order to estimate image alignment, algorithms are needed to determine the appropriate mathematical model relating pixel coordinates in one image to pixel coordinates in another. Algorithms that combine direct pixel-to-pixel comparisons with gradient descent (and other optimization techniques) can be used to estimate these parameters. Distinctive features can be found in each image and then efficiently matched to rapidly establish correspondences between pairs of images. When multiple images exist in a panorama, techniques have been developed to compute a globally consistent set of alignments and to efficiently discover which images overlap one another. A final compositing surface onto which to warp or projectively transform and place all of the aligned images is needed, as are algorithms to seamlessly blend the overlapping images, even in the presence of parallax, lens distortion, scene motion, and exposure differences. === Image stitching issues === Since the illumination in two views cannot be guaranteed to be identical, stitching two images could create a visible seam. Other reasons for seams could be the background changing between two images for the same continuous foreground. Other major issues to deal with are the presence of parallax, lens distortion, scene motion, and exposure differences. In a non-ideal real-life case, the intensity varies across the whole scene, and so does the contrast and intensity across frames. Additionally, the aspect ratio of a panorama image needs to be taken into account to create a visually pleasing composite. For panoramic stitching, the ideal set of images will have a reasonable amount of overlap (at least 15–30%) to overcome lens distortion and have enough detectable features. The set of images will have consistent exposure between frames to minimize the probability of seams occurring. === Keypoint detection === Feature detection is necessary to automatically find correspondences between images. Robust correspondences are required in order to estimate the necessary transformation to align an image with the image it is being composited on. Corners, blobs, Harris corners, and differences of Gaussians of Harris corners are good features since they are repeatable and distinct. One of the first operators for interest point detection was developed by Hans Moravec in 1977 for his research involving the automatic navigation of a robot through a clustered environment. Moravec also defined the concept of "points of interest" in an image and concluded these interest points could be used to find matching regions in different images. The Moravec operator is considered to be a corner detector because it defines interest points as points where there are large intensity variations in all directions. This often is the case at corners. However, Moravec was not specifically interested in finding corners, just distinct regions in an image that could be used to register consecutive image frames. Harris and Stephens improved upon Moravec's corner detector by considering the differential of the corner score with respect to direction directly. They needed it as a processing step to build interpretations of a robot's environment based on image sequences. Like Moravec, they needed a method to match corresponding points in consecutive image frames, but were interested in tracking both corners and edges between frames. SIFT and SURF are recent key-point or interest point detector algorithms but a point to note is that SURF is patented and its commercial usage restricted. Once a feature has been detected, a descriptor method like SIFT descriptor can be applied to later match them. === Registration === Image registration involves matching features in a set of images or using direct alignment methods to search for image alignments that minimize the sum of absolute differences between overlapping pixels. When using direct alignment methods one might first calibrate one's images to get better results. Additionally, users may input a rough model of the panorama to help the feature matching stage, so that e.g. only neighboring images are searched for matching features. Since there are smaller group of features for matching, the result of the search is more accurate and execution of the comparison is faster. To estimate a robust model from the data, a common method used is known as RANSAC. The name RANSAC is an abbreviation for "RANdom SAmple Consensus". It is an iterative method for robust parameter estimation to fit mathematical models from sets of observed data points which may contain outliers. The algorithm is non-deterministic in the sense that it produces a reasonable result only with a certain probability, with this probability increasing as more iterations are performed. It being a probabilistic method means that different results will be obtained for every time the algorithm is run. The RANSAC algorithm has found many applications in computer vision, including the simultaneous solving of the correspondence problem and the estimation of the fundamental matrix related to a pair of stereo cameras. The basic assumption of the method is that the data consists of "inliers", i.e., data whose distribution can be explained by some mathematical model, and "outliers" which are data that do not fit the model. Outliers are considered points which come from noise, erroneous measurements, or simply incorrect data. For the problem of homography estimation, RANSAC works by trying to fit several models using some of the point pairs and then checking if the models were able to relate most of the points. The best model – the homography, which produces the highest number of correct matches – is then chosen as the answer for the problem; thus, if the ratio of number of outliers to data points is very low, the RANSAC outputs a decent model fitting the data. === Calibration === Image calibration aims to minimize differences between an ideal lens models and the camera-lens combination that was used, optical defects such as distortions, exposure differences between images, vignetting, camera response and chromatic aberrations. If feature detection methods were used to register images and absolute positions of the features were recorded and saved, stitching software may use the data for geometric optimization of the images in addition to placing the images on the panosphere. Panotools and its various derivative programs use this method. ==== Alignment ==== Alignment may be necessary to transform an image to match the view point of the image it is being composited with. Alignment, in simple terms, is a change in the coordinates system so that it adopts a new coordinate system which outputs image matching the required viewpoint. The types of transformations an image may go through are pure translation, pure rotation, a similarity transform which includes translation, rotation and scaling of the image which needs to be transformed, Affine or projective transform. Projective transformation is the farthest an image can transform (in the set of two dimensional planar transformations), where only visible features that are preserved in the transformed image are straight lines whereas parallelism is maintained in an affine transform. Projective transformation can be mathematically described as x ′ = H ⋅ x , {\displaystyle x'=H\cdot x,} where x {\displaystyle x} is points in the old coordinate system, x ′ {\displaystyle x'} is the corresponding points in the transformed image and H {\displaystyle H} is the homography matrix. Expressing the points x {\displaystyle x} and x ′ {\displaystyle x'} using the camera intrinsics ( K {\displaystyle K} and K ′ {\displaystyle K'} ) and its rotation and translation [ R t ] {\displaystyle [R\,t]} to the real-world coordinates X {\displaystyle X} and < m a t h > x {\displaystyle , we get Using the abo
Random indexing
Random indexing is a dimensionality reduction method and computational framework for distributional semantics, based on the insight that very-high-dimensional vector space model implementations are impractical, that models need not grow in dimensionality when new items (e.g. new terminology) are encountered, and that a high-dimensional model can be projected into a space of lower dimensionality without compromising L2 distance metrics if the resulting dimensions are chosen appropriately. This is the original point of the random projection approach to dimension reduction first formulated as the Johnson–Lindenstrauss lemma, and locality-sensitive hashing has some of the same starting points. Random indexing, as used in representation of language, originates from the work of Pentti Kanerva on sparse distributed memory, and can be described as an incremental formulation of a random projection. It can be also verified that random indexing is a random projection technique for the construction of Euclidean spaces—i.e. L2 normed vector spaces. In Euclidean spaces, random projections are elucidated using the Johnson–Lindenstrauss lemma. The TopSig technique extends the random indexing model to produce bit vectors for comparison with the Hamming distance similarity function. It is used for improving the performance of information retrieval and document clustering. In a similar line of research, Random Manhattan Integer Indexing (RMII) is proposed for improving the performance of the methods that employ the Manhattan distance between text units. Many random indexing methods primarily generate similarity from co-occurrence of items in a corpus. Reflexive Random Indexing (RRI) generates similarity from co-occurrence and from shared occurrence with other items.
Elastic map
Elastic maps provide a tool for nonlinear dimensionality reduction. By their construction, they are a system of elastic springs embedded in the data space. This system approximates a low-dimensional manifold. The elastic coefficients of this system allow the switch from completely unstructured k-means clustering (zero elasticity) to the estimators located closely to linear PCA manifolds (for high bending and low stretching modules). With some intermediate values of the elasticity coefficients, this system effectively approximates non-linear principal manifolds. This approach is based on a mechanical analogy between principal manifolds, that are passing through "the middle" of the data distribution, and elastic membranes and plates. The method was developed by A.N. Gorban, A.Y. Zinovyev and A.A. Pitenko in 1996–1998. == Energy of elastic map == Let S {\displaystyle {\mathcal {S}}} be a data set in a finite-dimensional Euclidean space. Elastic map is represented by a set of nodes w j {\displaystyle {\bf {w}}_{j}} in the same space. Each datapoint s ∈ S {\displaystyle s\in {\mathcal {S}}} has a host node, namely the closest node w j {\displaystyle {\bf {w}}_{j}} (if there are several closest nodes then one takes the node with the smallest number). The data set S {\displaystyle {\mathcal {S}}} is divided into classes K j = { s | w j is a host of s } {\displaystyle K_{j}=\{s\ |\ {\bf {w}}_{j}{\mbox{ is a host of }}s\}} . The approximation energy D is the distortion D = 1 2 ∑ j = 1 k ∑ s ∈ K j ‖ s − w j ‖ 2 {\displaystyle D={\frac {1}{2}}\sum _{j=1}^{k}\sum _{s\in K_{j}}\|s-{\bf {w}}_{j}\|^{2}} , which is the energy of the springs with unit elasticity which connect each data point with its host node. It is possible to apply weighting factors to the terms of this sum, for example to reflect the standard deviation of the probability density function of any subset of data points { s i } {\displaystyle \{s_{i}\}} . On the set of nodes an additional structure is defined. Some pairs of nodes, ( w i , w j ) {\displaystyle ({\bf {w}}_{i},{\bf {w}}_{j})} , are connected by elastic edges. Call this set of pairs E {\displaystyle E} . Some triplets of nodes, ( w i , w j , w k ) {\displaystyle ({\bf {w}}_{i},{\bf {w}}_{j},{\bf {w}}_{k})} , form bending ribs. Call this set of triplets G {\displaystyle G} . The stretching energy is U E = 1 2 λ ∑ ( w i , w j ) ∈ E ‖ w i − w j ‖ 2 {\displaystyle U_{E}={\frac {1}{2}}\lambda \sum _{({\bf {w}}_{i},{\bf {w}}_{j})\in E}\|{\bf {w}}_{i}-{\bf {w}}_{j}\|^{2}} , The bending energy is U G = 1 2 μ ∑ ( w i , w j , w k ) ∈ G ‖ w i − 2 w j + w k ‖ 2 {\displaystyle U_{G}={\frac {1}{2}}\mu \sum _{({\bf {w}}_{i},{\bf {w}}_{j},{\bf {w}}_{k})\in G}\|{\bf {w}}_{i}-2{\bf {w}}_{j}+{\bf {w}}_{k}\|^{2}} , where λ {\displaystyle \lambda } and μ {\displaystyle \mu } are the stretching and bending moduli respectively. The stretching energy is sometimes referred to as the membrane, while the bending energy is referred to as the thin plate term. For example, on the 2D rectangular grid the elastic edges are just vertical and horizontal edges (pairs of closest vertices) and the bending ribs are the vertical or horizontal triplets of consecutive (closest) vertices. The total energy of the elastic map is thus U = D + U E + U G . {\displaystyle U=D+U_{E}+U_{G}.} The position of the nodes { w j } {\displaystyle \{{\bf {w}}_{j}\}} is determined by the mechanical equilibrium of the elastic map, i.e. its location is such that it minimizes the total energy U {\displaystyle U} . == Expectation-maximization algorithm == For a given splitting of dataset S {\displaystyle {\mathcal {S}}} in classes K j {\displaystyle K_{j}} , minimization of the quadratic functional U {\displaystyle U} is a linear problem with the sparse matrix of coefficients. Therefore, similar to principal component analysis or k-means, a splitting method is used: For given { w j } {\displaystyle \{{\bf {w}}_{j}\}} find { K j } {\displaystyle \{K_{j}\}} ; For given { K j } {\displaystyle \{K_{j}\}} minimize U {\displaystyle U} and find { w j } {\displaystyle \{{\bf {w}}_{j}\}} ; If no change, terminate. This expectation-maximization algorithm guarantees a local minimum of U {\displaystyle U} . For improving the approximation various additional methods are proposed. For example, the softening strategy is used. This strategy starts with a rigid grids (small length, small bending and large elasticity modules λ {\displaystyle \lambda } and μ {\displaystyle \mu } coefficients) and finishes with soft grids (small λ {\displaystyle \lambda } and μ {\displaystyle \mu } ). The training goes in several epochs, each epoch with its own grid rigidness. Another adaptive strategy is growing net: one starts from a small number of nodes and gradually adds new nodes. Each epoch goes with its own number of nodes. == Applications == Most important applications of the method and free software are in bioinformatics for exploratory data analysis and visualisation of multidimensional data, for data visualisation in economics, social and political sciences, as an auxiliary tool for data mapping in geographic informational systems and for visualisation of data of various nature. The method is applied in quantitative biology for reconstructing the curved surface of a tree leaf from a stack of light microscopy images. This reconstruction is used for quantifying the geodesic distances between trichomes and their patterning, which is a marker of the capability of a plant to resist to pathogenes. Recently, the method is adapted as a support tool in the decision process underlying the selection, optimization, and management of financial portfolios. The method of elastic maps has been systematically tested and compared with several machine learning methods on the applied problem of identification of the flow regime of a gas-liquid flow in a pipe. There are various regimes: Single phase water or air flow, Bubbly flow, Bubbly-slug flow, Slug flow, Slug-churn flow, Churn flow, Churn-annular flow, and Annular flow. The simplest and most common method used to identify the flow regime is visual observation. This approach is, however, subjective and unsuitable for relatively high gas and liquid flow rates. Therefore, the machine learning methods are proposed by many authors. The methods are applied to differential pressure data collected during a calibration process. The method of elastic maps provided a 2D map, where the area of each regime is represented. The comparison with some other machine learning methods is presented in Table 1 for various pipe diameters and pressure. Here, ANN stands for the backpropagation artificial neural networks, SVM stands for the support vector machine, SOM for the self-organizing maps. The hybrid technology was developed for engineering applications. In this technology, elastic maps are used in combination with Principal Component Analysis (PCA), Independent Component Analysis (ICA) and backpropagation ANN. The textbook provides a systematic comparison of elastic maps and self-organizing maps (SOMs) in applications to economic and financial decision-making.
Nearest neighbor search
Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest (or most similar) to a given point. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values. Formally, the nearest neighbor (NN) search problem is defined as follows: given a set S of points in a space M and a query point q ∈ M {\displaystyle q\in M} , find the closest point in S to q. Donald Knuth in volume 3 of The Art of Computer Programming (1973) called it the post-office problem, referring to an application of assigning to a residence the nearest post office. A direct generalization of this problem is a k-NN search, where we need to find the k closest points. Most commonly M is a metric space and dissimilarity is expressed as a distance metric, which is symmetric and satisfies the triangle inequality. Even more common, M is taken to be the d-dimensional vector space where dissimilarity is measured using the Euclidean distance, Manhattan distance or other distance metric. However, the dissimilarity function can be arbitrary. One example is asymmetric Bregman divergence, for which the triangle inequality does not hold. == Applications == The nearest neighbor search problem arises in numerous fields of application, including: Pattern recognition – in particular for optical character recognition Statistical classification – see k-nearest neighbor algorithm Computer vision – for point cloud registration Computational geometry – see Closest pair of points problem Cryptanalysis – for lattice problem Databases – e.g. content-based image retrieval Coding theory – see maximum likelihood decoding Semantic search Vector databases, where nearest-neighbor lookup over embeddings is used to retrieve semantically similar records Retrieval-augmented generation systems, where nearest-neighbor retrieval over embeddings is used to fetch candidate passages or documents before generation Data compression – see MPEG-2 standard Robotic sensing Recommendation systems, e.g. see Collaborative filtering Internet marketing – see contextual advertising and behavioral targeting DNA sequencing Spell checking – suggesting correct spelling Plagiarism detection Similarity scores for predicting career paths of professional athletes. Cluster analysis – assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense, usually based on Euclidean distance Chemical similarity Sampling-based motion planning == Methods == Various solutions to the NNS problem have been proposed. The quality and usefulness of the algorithms are determined by the time complexity of queries as well as the space complexity of any search data structures that must be maintained. The informal observation usually referred to as the curse of dimensionality states that there is no general-purpose exact solution for NNS in high-dimensional Euclidean space using polynomial preprocessing and polylogarithmic search time. === Exact methods === ==== Linear search ==== The simplest solution to the NNS problem is to compute the distance from the query point to every other point in the database, keeping track of the "best so far". This algorithm, sometimes referred to as the naive approach, has a running time of O(dN), where N is the cardinality of S and d is the dimensionality of S. There are no search data structures to maintain, so the linear search has no space complexity beyond the storage of the database. Naive search can, on average, outperform space partitioning approaches on higher dimensional spaces. The absolute distance is not required for distance comparison, only the relative distance. In geometric coordinate systems the distance calculation can be sped up considerably by omitting the square root calculation from the distance calculation between two coordinates. The distance comparison will still yield identical results. ==== Space partitioning ==== Since the 1970s, the branch and bound methodology has been applied to the problem. In the case of Euclidean space, this approach encompasses spatial index or spatial access methods. Several space-partitioning methods have been developed for solving the NNS problem. Perhaps the simplest is the k-d tree, which iteratively bisects the search space into two regions containing half of the points of the parent region. Queries are performed via traversal of the tree from the root to a leaf by evaluating the query point at each split. Depending on the distance specified in the query, neighboring branches that might contain hits may also need to be evaluated. For constant dimension query time, average complexity is O(log N) in the case of randomly distributed points, worst case complexity is O(kN^(1-1/k)) Alternatively the R-tree data structure was designed to support nearest neighbor search in dynamic context, as it has efficient algorithms for insertions and deletions such as the R tree. R-trees can yield nearest neighbors not only for Euclidean distance, but can also be used with other distances. In the case of general metric space, the branch-and-bound approach is known as the metric tree approach. Particular examples include vp-tree and BK-tree methods. Using a set of points taken from a 3-dimensional space and put into a BSP tree, and given a query point taken from the same space, a possible solution to the problem of finding the nearest point-cloud point to the query point is given in the following description of an algorithm. (Strictly speaking, no such point may exist, because it may not be unique. But in practice, usually we only care about finding any one of the subset of all point-cloud points that exist at the shortest distance to a given query point.) The idea is, for each branching of the tree, guess that the closest point in the cloud resides in the half-space containing the query point. This may not be the case, but it is a good heuristic. After having recursively gone through all the trouble of solving the problem for the guessed half-space, now compare the distance returned by this result with the shortest distance from the query point to the partitioning plane. This latter distance is that between the query point and the closest possible point that could exist in the half-space not searched. If this distance is greater than that returned in the earlier result, then clearly there is no need to search the other half-space. If there is such a need, then you must go through the trouble of solving the problem for the other half space, and then compare its result to the former result, and then return the proper result. The performance of this algorithm is nearer to logarithmic time than linear time when the query point is near the cloud, because as the distance between the query point and the closest point-cloud point nears zero, the algorithm needs only perform a look-up using the query point as a key to get the correct result. === Approximation methods === An approximate nearest neighbor search algorithm is allowed to return points whose distance from the query is at most c {\displaystyle c} times the distance from the query to its nearest points. The appeal of this approach is that, in many cases, an approximate nearest neighbor is almost as good as the exact one. In particular, if the distance measure accurately captures the notion of user quality, then small differences in the distance should not matter. ==== Greedy search in proximity neighborhood graphs ==== Proximity graph methods (such as navigable small world graphs and HNSW) are considered the current state-of-the-art for the approximate nearest neighbors search. The methods are based on greedy traversing in proximity neighborhood graphs G ( V , E ) {\displaystyle G(V,E)} in which every point x i ∈ S {\displaystyle x_{i}\in S} is uniquely associated with vertex v i ∈ V {\displaystyle v_{i}\in V} . The search for the nearest neighbors to a query q in the set S takes the form of searching for the vertex in the graph G ( V , E ) {\displaystyle G(V,E)} . The basic algorithm – greedy search – works as follows: search starts from an enter-point vertex v i ∈ V {\displaystyle v_{i}\in V} by computing the distances from the query q to each vertex of its neighborhood { v j : ( v i , v j ) ∈ E } {\displaystyle \{v_{j}:(v_{i},v_{j})\in E\}} , and then finds a vertex with the minimal distance value. If the distance value between the query and the selected vertex is smaller than the one between the query and the current element, then the algorithm moves to the selected vertex, and it becomes new enter-point. The algorithm stops when it reaches a local minimum: a vertex whose neighborhood does not contain a vertex that is closer to the query than the vertex itself. The idea of proximity neighborhood graphs was exploited in multiple publications, including the seminal paper by Arya and Mount, in the VoroNet syst
Zynn
Zynn was a Chinese video-sharing social networking service owned by Kuaishou, a Beijing-based internet technology company established in 2011 by Su Hua and Cheng Yixiao. It was used to create and share short videos, and it pays its users for using the app and referring others. Zynn was launched on May 7, 2020. It became the most-downloaded app in the App Store in the same month. It has also been criticized for being a "pyramid scheme", and it has faced accusations of plagiarism and stealing content. Aside from Zynn in North America, Kuaishou is available under the name Kwai in Russia, South Korea, Japan, Thailand, Vietnam, Philippines, Malaysia, Indonesia, Brazil, America, India, and the Middle East. Kwai used to be available in Australia and the United States on the App Store, but was removed at an unknown date. Zynn was permanently shut down on the 20th of August, 2021. == History == In 2011, entrepreneur Su Hua co-founded Kuaishou with business partner Cheng Yixiao. Originally a GIF-making app, Kuaishou soon moved to short video content. Su Hua also serves as the current Kuaishou CEO. In December 2019, Chinese internet conglomerate Tencent invested $2 billion in Kuaishou, reportedly to compete with rival ByteDance. In December 2019, Kuaishou acquired an app developer called Owlii, which is the developer of Zynn. Zynn was developed to be a North American Market edition of Kuaishou. On May 7, 2020, the app was launched and it was downloaded over 2 million times in that month. On May 12, 2020, Kuaishou filed a lawsuit seeking compensation for "unfair competition", and accused Douyin, the sister app of TikTok, of "interfering" with search results on app stores. Zynn shut down on the 20th of August, 2021. == Features == Zynn allows its users to create, edit and share short videos of themselves. Its interface has been described as a "complete clone" of TikTok, its main competitor. The Zynn app was unique in the way that they paid users for using the platform. Each user earned $1 for signing up, and they could earn money for referring users to the platform. Watching videos resulted in earning "points", which could be redeemed for gift cards or be cashed out via PayPal.[1] == Criticisms and controversies == Multiple TikTok users had reported seeing their entire accounts plagiarized, with one account pretending to be Addison Rae. Despite being launched in May, many videos were posted in February. Zynn has employed "intermittent variable rewards" in its point system, which has been criticized as being the "same reinforcement strategy used to addict people to slot machines". Cash payouts for using the app have resulted in criticism and accusations of anti-competitive behavior. The app was taken down from the Google Play store on June 10. Zynn blamed it on an "isolated incident". Six days later, it was taken down from the App Store as well. US Senator Josh Hawley has criticized the platform, calling it "predatory" and "anti-competitive" in a letter to the Federal Trade Commission asking for an investigation into Zynn. He said "[Zynn] smacks of a textbook predatory-pricing scheme, one calculated to attain immediate market dominance for Zynn by driving competitors out of the market."
Receptron
The receptron (short for "reservoir perceptron") is a neuromorphic data processing model — specifically neuromorphic computing — that generalizes the traditional perceptron, by incorporating non-linear interactions between inputs. Unlike classical perceptron, which rely on linearly independent weights, the receptron leverages complexity in physical substrates, such as the electric conduction properties of nanostructured materials or optical speckle fields, to perform classification tasks. The receptron bridges unconventional computing and neural network principles, enabling solutions that do not require the training approaches typical of artificial neural networks based on the perceptron model. == Algorithm == The receptron is an algorithm for supervised learning of binary classifiers, so a classification algorithm that makes its predictions based on a predictor function, combining a set of weights with the feature vector. The mathematical model is based on the sum of inputs with non-linear interactions: S = ∑ k = 1 n x j w ~ j ( x → ) | S ∈ R {\displaystyle S=\sum _{k=1}^{n}x_{j}{\widetilde {w}}_{j}({\vec {x}})|S\in R} (1) where j ∈ [ 1 , n ] {\displaystyle j\in [1,n]} and w ~ j {\displaystyle {\widetilde {w}}_{j}} are non-linear weight functions depending on the inputs, x → {\displaystyle {\vec {x}}} . Nonlinearity will typically make the system extremely complex, and allowing for the solution of problems not solvable through the simpler rules of a linear system, such as the perceptron or McCulloch Pitts neurons, which is based on the sum of linearly independent weights: S = ∑ k = 1 n x j w j p {\displaystyle S=\sum _{k=1}^{n}x_{j}w_{j}^{p}} (2) where w j {\displaystyle w_{j}} are constant real values. A consequence of this simplicity is the limitation to linearly separable functions, which necessitates multi-layer architectures and training algorithms like backpropagation As in the perceptron case, the summation in Eq. 1 origins the activation of the receptron output through the thresholding process, Y ( x 1 , . . . , x n ) = { 1 if S > th 0 if S ≤ th {\displaystyle Y(x_{1},...,x_{n})={\begin{cases}1&{\text{if }}S>{\text{th}}\\0&{\text{if }}S\leq {\text{th}}\end{cases}}} (3) where th is a constant threshold parameter. Equation 3 can be written by using the Heaviside step function. The weight functions w ~ ( x → ) {\displaystyle {\widetilde {w}}({\vec {x}})} can be written with a finite number of parameters w j 1 . . . j n {\displaystyle w_{j_{1}...j_{n}}} , simplifying the model representation. One can Taylor-expand w ~ ( x → ) {\displaystyle {\widetilde {w}}({\vec {x}})} and use the idempotency of Boolean variables ( x j ) q = x j ∀ q ≥ 1 {\displaystyle (x_{j})^{q}=x_{j}\forall q\geq 1} such that S ′ = b + ∑ k = 1 n x j w ~ j ( x → ) {\displaystyle S'=b+\sum _{k=1}^{n}x_{j}{\widetilde {w}}_{j}({\vec {x}})} can be written as S ′ ( x → ) = b + ∑ j w j x j + ∑ j < k w j k x j x k + ∑ j < k < l w j k l x j x k x l + . . . {\displaystyle S'({\vec {x}})=b+\sum _{j}w_{j}x_{j}+\sum _{j