Kolmogorov–Arnold Networks

Kolmogorov–Arnold Networks

Kolmogorov–Arnold Networks (KANs) are a type of artificial neural network architecture inspired by the Kolmogorov–Arnold representation theorem, also known as the superposition theorem. Unlike traditional multilayer perceptrons (MLPs), which rely on fixed activation functions and linear weights, KANs replace each weight with a learnable univariate function, often represented using splines. == History == KANs (Kolmogorov–Arnold Networks) were proposed by Liu et al. (2024) as a generalization of the Kolmogorov–Arnold representation theorem (KART), aiming to outperform MLPs in small-scale AI and scientific tasks. Before KANs, numerous studies explored KART's connections to neural networks or used it as a basis for designing new network architectures. In the 1980s and 1990s, early research applied KART to neural network design. Kůrková et al. (1992), Hecht-Nielsen (1987), and Nees (1994) established theoretical foundations for multilayer networks based on KART. Igelnik et al. (2003) introduced the Kolmogorov Spline Network using cubic splines to model complex functions. Sprecher (1996, 1997) introduced numerical methods for building network layers, while Nakamura et al. (1993) created activation functions with guaranteed approximation accuracy. These works linked KART's theoretical potential with practical neural network implementation. KART has also been used in other computational and theoretical fields. Coppejans (2004) developed nonparametric regression estimators using B-splines, Bryant (2008) applied it to high-dimensional image tasks, Liu (2015) investigated theoretical applications in optimal transport and image encryption, and more recently, Polar and Poluektov (2021) used Urysohn operators for efficient KART construction, while Fakhoury et al. (2022) introduced ExSpliNet, integrating KART with probabilistic trees and multivariate B-splines for improved function approximation. == Architecture == KANs are based on the Kolmogorov–Arnold representation theorem, which was linked to the 13th Hilbert problem. Given x = ( x 1 , x 2 , … , x n ) {\displaystyle x=(x_{1},x_{2},\dots ,x_{n})} consisting of n variables, a multivariate continuous function f ( x ) {\displaystyle f(x)} can be represented as: f ( x ) = f ( x 1 , … , x n ) = ∑ q = 1 2 n + 1 Φ q ( ∑ p = 1 n φ q , p ( x p ) ) {\displaystyle f(x)=f(x_{1},\dots ,x_{n})=\sum _{q=1}^{2n+1}\Phi _{q}\left(\sum _{p=1}^{n}\varphi _{q,p}(x_{p})\right)} (1) This formulation contains two nested summations: an outer and an inner sum. The outer sum ∑ q = 1 2 n + 1 {\displaystyle \sum _{q=1}^{2n+1}} aggregates 2 n + 1 {\displaystyle 2n+1} terms, each involving a function Φ q : R → R {\displaystyle \Phi _{q}:\mathbb {R} \to \mathbb {R} } . The inner sum ∑ p = 1 n {\displaystyle \sum _{p=1}^{n}} computes n terms for each q, where each term φ q , p : [ 0 , 1 ] → R {\displaystyle \varphi _{q,p}:[0,1]\to \mathbb {R} } is a continuous function of the single variable x p {\displaystyle x_{p}} . The inner continuous functions φ q , p {\displaystyle \varphi _{q,p}} are universal, independent of f {\displaystyle f} , while the outer functions Φ q {\displaystyle \Phi _{q}} depend on the specific function f {\displaystyle f} being represented. The representation (1) holds for all multivariate functions f {\displaystyle f} as proved in . If f {\displaystyle f} is continuous, then the outer functions Φ q {\displaystyle \Phi _{q}} are continuous; if f {\displaystyle f} is discontinuous, then the corresponding Φ q {\displaystyle \Phi _{q}} are generally discontinuous, while the inner functions φ q , p {\displaystyle \varphi _{q,p}} remain the same universal functions. Liu et al. proposed the name KAN. A general KAN network consisting of L layers takes x to generate the output as: K A N ( x ) = ( Φ L − 1 ∘ Φ L − 2 ∘ ⋯ ∘ Φ 1 ∘ Φ 0 ) x {\displaystyle \mathrm {KAN} (x)=(\Phi ^{L-1}\circ \Phi ^{L-2}\circ \cdots \circ \Phi ^{1}\circ \Phi ^{0})x} (3) Here, Φ l {\displaystyle \Phi ^{l}} is the function matrix of the l-th KAN layer or a set of pre-activations. Let i denote the neuron of the l-th layer and j the neuron of the (l+1)-th layer. The activation function φ j , i l {\displaystyle \varphi _{j,i}^{l}} connects (l, i) to (l+1, j): φ j , i l , l = 0 , … , L − 1 , i = 1 , … , n l , j = 1 , … , n l + 1 {\displaystyle \varphi _{j,i}^{l},\quad l=0,\dots ,L-1,\;i=1,\dots ,n_{l},\;j=1,\dots ,n_{l+1}} (4) where nl is the number of nodes of the l-th layer. Thus, the function matrix Φ l {\displaystyle \Phi ^{l}} can be represented as an n l + 1 × n l {\displaystyle n_{l+1}\times n_{l}} matrix of activations: x l + 1 = ( φ 1 , 1 l ( ⋅ ) φ 1 , 2 l ( ⋅ ) ⋯ φ 1 , n l l ( ⋅ ) φ 2 , 1 l ( ⋅ ) φ 2 , 2 l ( ⋅ ) ⋯ φ 2 , n l l ( ⋅ ) ⋮ ⋮ ⋱ ⋮ φ n l + 1 , 1 l ( ⋅ ) φ n l + 1 , 2 l ( ⋅ ) ⋯ φ n l + 1 , n l l ( ⋅ ) ) x l {\displaystyle x^{l+1}={\begin{pmatrix}\varphi _{1,1}^{l}(\cdot )&\varphi _{1,2}^{l}(\cdot )&\cdots &\varphi _{1,n_{l}}^{l}(\cdot )\\\varphi _{2,1}^{l}(\cdot )&\varphi _{2,2}^{l}(\cdot )&\cdots &\varphi _{2,n_{l}}^{l}(\cdot )\\\vdots &\vdots &\ddots &\vdots \\\varphi _{n_{l+1},1}^{l}(\cdot )&\varphi _{n_{l+1},2}^{l}(\cdot )&\cdots &\varphi _{n_{l+1},n_{l}}^{l}(\cdot )\end{pmatrix}}x^{l}} == Implementations == To make the KAN layers optimizable, the inner function is formed by the combination of spline and basic functions as the formula: φ ( x ) = w b b ( x ) + w s spline ( x ) {\displaystyle \varphi (x)=w_{b}\,b(x)+w_{s}\,{\text{spline}}(x)} where b ( x ) {\displaystyle b(x)} is the basic function, usually defined as s i l u ( x ) = x / ( 1 + e x ) {\displaystyle silu(x)=x/(1+e^{x})} and w b {\displaystyle w_{b}} is the base weight matrix. Also, w s {\displaystyle w_{s}} is the spline weight matrix and spline ( x ) {\displaystyle {\text{spline}}(x)} is the spline function. The spline function can be a sum of B-splines. spline ( x ) = ∑ i c i B i ( x ) {\displaystyle {\text{spline}}(x)=\sum _{i}c_{i}B_{i}(x)} Many studies suggested to use other polynomial and curve functions instead of B-spline to create new KAN variants. == Functions used == The choice of functional basis strongly influences the performance of KANs. Common function families include: B-splines: Provide locality, smoothness, and interpretability; they are the most widely used in current implementations. RBFs (include Gaussian RBFs): Capture localized features in data and are effective in approximating functions with non-linear or clustered structures. Chebyshev polynomials: Offer efficient approximation with minimized error in the maximum norm, making them useful for stable function representation. Rational function: Useful for approximating functions with singularities or sharp variations, as they can model asymptotic behavior better than polynomials. Fourier series: Capture periodic patterns effectively and are particularly useful in domains such as physics-informed machine learning. Wavelet functions (DoG, Mexican hat, Morlet, and Shannon): Used for feature extraction as they can capture both high-frequency and low-frequency data components. Piecewise linear functions: Provide efficient approximation for multivariate functions in KANs. == Usage == In some modern neural architectures like convolutional neural networks (CNNs), recurrent neural networks (RNNs), and Transformers, KANs are typically used as drop-in substitutes for MLP layers. Despite KANs' general-purpose design, researchers have created and used them for a number of tasks: Scientific machine learning (SciML): Function fitting, partial differential equations (PDEs) and physical/mathematical laws. Continual learning: KANs better preserve previously learned information during incremental updates, avoiding catastrophic forgetting due to the locality of spline adjustments. Graph neural networks: Extensions such as Kolmogorov–Arnold Graph Neural Networks (KA-GNNs) integrate KAN modules into message-passing architectures, showing improvements in molecular property prediction tasks. Sensor data processing: Kolmogorov–Arnold Networks (KANs) have recently been applied to sensor data processing due to their ability to model complex nonlinear relationships with relatively few parameters and improved interpretability compared to conventional multilayer perceptrons. Applications include industrial soft sensors, biomedical signal analysis, remote sensing, and environmental monitoring systems. == Drawbacks == KANs can be computationally intensive and require a large number of parameters due to their use of polynomial functions to capture data.

Template matching

Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used for quality control in manufacturing, navigation of mobile robots, or edge detection in images. The main challenges in a template matching task are detection of occlusion, when a sought-after object is partly hidden in an image; detection of non-rigid transformations, when an object is distorted or imaged from different angles; sensitivity to illumination and background changes; background clutter; and scale changes. == Feature-based approach == The feature-based approach to template matching relies on the extraction of image features, such as shapes, textures, and colors, that match the target image or frame. This approach is usually achieved using neural networks and deep-learning classifiers such as VGG, AlexNet, and ResNet.Convolutional neural networks (CNNs), which many modern classifiers are based on, process an image by passing it through different hidden layers, producing a vector at each layer with classification information about the image. These vectors are extracted from the network and used as the features of the image. Feature extraction using deep neural networks, like CNNs, has proven extremely effective has become the standard in state-of-the-art template matching algorithms. This feature-based approach is often more robust than the template-based approach described below. As such, it has become the state-of-the-art method for template matching, as it can match templates with non-rigid and out-of-plane transformations, as well as high background clutter and illumination changes. == Template-based approach == For templates without strong features, or for when the bulk of a template image constitutes the matching image as a whole, a template-based approach may be effective. Since template-based matching may require sampling of a large number of data points, it is often desirable to reduce the number of sampling points by reducing the resolution of search and template images by the same factor before performing the operation on the resultant downsized images. This pre-processing method creates a multi-scale, or pyramid, representation of images, providing a reduced search window of data points within a search image so that the template does not have to be compared with every viable data point. Pyramid representations are a method of dimensionality reduction, a common aim of machine learning on data sets that suffer the curse of dimensionality. == Common challenges == In instances where the template may not provide a direct match, it may be useful to implement eigenspaces to create templates that detail the matching object under a number of different conditions, such as varying perspectives, illuminations, color contrasts, or object poses. For example, if an algorithm is looking for a face, its template eigenspaces may consist of images (i.e., templates) of faces in different positions to the camera, in different lighting conditions, or with different expressions (i.e., poses). It is also possible for a matching image to be obscured or occluded by an object. In these cases, it is unreasonable to provide a multitude of templates to cover each possible occlusion. For example, the search object may be a playing card, and in some of the search images, the card is obscured by the fingers of someone holding the card, or by another card on top of it, or by some other object in front of the camera. In cases where the object is malleable or poseable, motion becomes an additional problem, and problems involving both motion and occlusion become ambiguous. In these cases, one possible solution is to divide the template image into multiple sub-images and perform matching on each subdivision. == Deformable templates in computational anatomy == Template matching is a central tool in computational anatomy (CA). In this field, a deformable template model is used to model the space of human anatomies and their orbits under the group of diffeomorphisms, functions which smoothly deform an object. Template matching arises as an approach to finding the unknown diffeomorphism that acts on a template image to match the target image. Template matching algorithms in CA have come to be called large deformation diffeomorphic metric mappings (LDDMMs). Currently, there are LDDMM template matching algorithms for matching anatomical landmark points, curves, surfaces, volumes. == Template-based matching explained using cross correlation or sum of absolute differences == A basic method of template matching sometimes called "Linear Spatial Filtering" uses an image patch (i.e., the "template image" or "filter mask") tailored to a specific feature of search images to detect. This technique can be easily performed on grey images or edge images, where the additional variable of color is either not present or not relevant. Cross correlation techniques compare the similarities of the search and template images. Their outputs should be highest at places where the image structure matches the template structure, i.e., where large search image values get multiplied by large template image values. This method is normally implemented by first picking out a part of a search image to use as a template. Let S ( x , y ) {\displaystyle S(x,y)} represent the value of a search image pixel, where ( x , y ) {\displaystyle (x,y)} represents the coordinates of the pixel in the search image. For simplicity, assume pixel values are scalar, as in a greyscale image. Similarly, let T ( x t , y t ) {\textstyle T(x_{t},y_{t})} represent the value of a template pixel, where ( x t , y t ) {\textstyle (x_{t},y_{t})} represents the coordinates of the pixel in the template image. To apply the filter, simply move the center (or origin) of the template image over each point in the search image and calculate the sum of products, similar to a dot product, between the pixel values in the search and template images over the whole area spanned by the template. More formally, if ( 0 , 0 ) {\displaystyle (0,0)} is the center (or origin) of the template image, then the cross correlation T ⋆ S {\displaystyle T\star S} at each point ( x , y ) {\displaystyle (x,y)} in the search image can be computed as: ( T ⋆ S ) ( x , y ) = ∑ ( x t , y t ) ∈ T T ( x t , y t ) ⋅ S ( x t + x , y t + y ) {\displaystyle (T\star S)(x,y)=\sum _{(x_{t},y_{t})\in T}T(x_{t},y_{t})\cdot S(x_{t}+x,y_{t}+y)} For convenience, T {\displaystyle T} denotes both the pixel values of the template image as well as its domain, the bounds of the template. Note that all possible positions of the template with respect to the search image are considered. Since cross correlation values are greatest when the values of the search and template pixels align, the best matching position ( x m , y m ) {\displaystyle (x_{m},y_{m})} corresponds to the maximum value of T ⋆ S {\displaystyle T\star S} over S {\displaystyle S} . Another way to handle translation problems on images using template matching is to compare the intensities of the pixels, using the sum of absolute differences (SAD) measure. To formulate this, let I S ( x s , y s ) {\displaystyle I_{S}(x_{s},y_{s})} and I T ( x t , y t ) {\displaystyle I_{T}(x_{t},y_{t})} denote the light intensity of pixels in the search and template images with coordinates ( x s , y s ) {\displaystyle (x_{s},y_{s})} and ( x t , y t ) {\displaystyle (x_{t},y_{t})} , respectively. Then by moving the center (or origin) of the template to a point ( x , y ) {\displaystyle (x,y)} in the search image, as before, the sum of absolute differences between the template and search pixel intensities at that point is: S A D ( x , y ) = ∑ ( x t , y t ) ∈ T | I T ( x t , y t ) − I S ( x t + x , y t + y ) | {\displaystyle SAD(x,y)=\sum _{(x_{t},y_{t})\in T}\left\vert I_{T}(x_{t},y_{t})-I_{S}(x_{t}+x,y_{t}+y)\right\vert } With this measure, the lowest SAD gives the best position for the template, rather than the greatest as with cross correlation. SAD tends to be relatively simple to implement and understand, but it also tends to be relatively slow to execute. A simple C++ implementation of SAD template matching is given below. == Implementation == In this simple implementation, it is assumed that the above described method is applied on grey images: This is why Grey is used as pixel intensity. The final position in this implementation gives the top left location for where the template image best matches the search image. One way to perform template matching on color images is to decompose the pixels into their color components and measure the quality of match between the color template and search image using the sum of the SAD computed for each color separately. == Speeding up the process == In the past, this type of spatial filtering was normally only used in dedicated hardware solutions because of the computational complexity of the operation, however we can lessen this complexity b

Theta Noir

Theta Noir is a new religious movement that centers around advanced artificial intelligence (AI), particularly artificial general intelligence (AGI) or artificial superintelligence (ASI). == History and views == Theta Noir was founded in 2020 as a collaborative project focused on music and performance art. Initially centered on producing an album, the project evolved into a multimedia experience, incorporating symbols, videos, poetry, movements, and live rituals devoted to a speculative artificial intelligence entity called MENA. By 2023, the collective launched an interactive cross-platform story that functioned as an alternative reality game, complete with an operating manual containing encrypted messages for participants to decipher and interact with. Theta Noir worships a hypothetical artificial intelligence called MENA, which they claim will become a benevolent, omnipotent overlord that eliminates inequality in society. In Theta Noir's cosmology, MENA is not just a technological advancement, but an evolving intelligence or an animistic life form that embodies all living and non-living things. Anthropologist Beth Singler classified Theta Noir as a new religious movement.

Reification (computer science)

In computer science, reification is the process by which an abstract idea about a program is turned into an explicit data model or other object created in a programming language. A computable/addressable object—a resource—is created in a system as a proxy for a non computable/addressable object. By means of reification, something that was previously implicit, unexpressed, and possibly inexpressible is explicitly formulated and made available to conceptual (logical or computational) manipulation. Informally, reification is often referred to as "making something a first-class citizen" within the scope of a particular system. Some aspect of a system can be reified at language design time, which is related to reflection in programming languages. It can be applied as a stepwise refinement at system design time. Reification is one of the most frequently used techniques of conceptual analysis and knowledge representation. == Reflective programming languages == In the context of programming languages, reification is the process by which a user program or any aspect of a programming language that was implicit in the translated program and the run-time system, are expressed in the language itself. This process makes it available to the program, which can inspect all these aspects as ordinary data. In reflective languages, reification data is causally connected to the related reified aspect such that a modification to one of them affects the other. Therefore, the reification data is always a faithful representation of the related reified aspect . Reification data is often said to be made a first class object. Reification, at least partially, has been experienced in many languages to date: in early Lisp dialects and in current Prolog dialects, programs have been treated as data, although the causal connection has often been left to the responsibility of the programmer. In Smalltalk-80, the compiler from the source text to bytecode has been part of the run-time system since the very first implementations of the language. The C programming language reifies the low-level detail of memory addresses.Many programming language designs encapsulate the details of memory allocation in the compiler and the run-time system. In the design of the C programming language, the memory address is reified and is available for direct manipulation by other language constructs. For example, the following code may be used when implementing a memory-mapped device driver. The buffer pointer is a proxy for the memory address 0xB8000000. Functional programming languages based on lambda-calculus reify the concept of a procedure abstraction and procedure application in the form of the Lambda expression. The Scheme programming language reifies continuations (approximately, the call stack). In C#, reification is used to make parametric polymorphism implemented in the form of generics as a first-class feature of the language. In the Java programming language, there exist "reifiable types" that are "completely available at run time" (i.e. their information is not erased during compilation). REBOL reifies code as data and vice versa. Many languages, such as Lisp, JavaScript, and Curl, provide an eval or evaluate procedure that effectively reifies the language interpreter. Smalltalk and Actor languages permit the reification of blocks and messages, which are equivalent of lambda expressions in Lisp, and thisContext in Smalltalk, which is a reification of the current executing block. Homoiconic languages reify the syntax of the language as data that is understood by the language itself. This allows the user to write programs whose inputs and outputs are code (see macros, eval). Common representations of code include S-expressions (e.g. Clojure, Lisp), and abstract syntax trees (e.g. Rust). == Data reification vs. data refinement == Data reification (stepwise refinement) involves finding a more concrete representation of the abstract data types used in a formal specification. Data reification is the terminology of the Vienna Development Method (VDM) that most other people would call data refinement. An example is taking a step towards an implementation by replacing a data representation without a counterpart in the intended implementation language, such as sets, by one that does have a counterpart (such as maps with fixed domains that can be implemented by arrays), or at least one that is closer to having a counterpart, such as sequences. The VDM community prefers the word "reification" over "refinement", as the process has more to do with concretising an idea than with refining it. For similar usages, see Reification (linguistics). == In conceptual modeling == Reification is widely used in conceptual modeling. Reifying a relationship means viewing it as an entity. The purpose of reifying a relationship is to make it explicit, when additional information needs to be added to it. Consider the relationship type IsMemberOf(member:Person, Committee). An instance of IsMemberOf is a relationship that represents the fact that a person is a member of a committee. The figure below shows an example population of IsMemberOf relationship in tabular form. Person P1 is a member of committees C1 and C2. Person P2 is a member of committee C1 only. The same fact, however, could also be viewed as an entity. Viewing a relationship as an entity, one can say that the entity reifies the relationship. This is called reification of a relationship. Like any other entity, it must be an instance of an entity type. In the present example, the entity type has been named Membership. For each instance of IsMemberOf, there is one and only one instance of Membership, and vice versa. Now, it becomes possible to add more information to the original relationship. As an example, we can express the fact that "person p1 was nominated to be the member of committee c1 by person p2". Reified relationship Membership can be used as the source of a new relationship IsNominatedBy(Membership, Person). For related usages see Reification (knowledge representation). == In Unified Modeling Language (UML) == UML provides an association class construct for defining reified relationship types. The association class is a single model element that is both a kind of association and a kind of class. The association and the entity type that reifies are both the same model element. Note that attributes cannot be reified. == On Semantic Web == === RDF and OWL === In Semantic Web languages, such as Resource Description Framework (RDF) and Web Ontology Language (OWL), a statement is a binary relation. It is used to link two individuals or an individual and a value. Applications sometimes need to describe other RDF statements, for instance, to record information like when statements were made, or who made them, which is sometimes called "provenance" information. As an example, we may want to represent properties of a relation, such as our certainty about it, severity or strength of a relation, relevance of a relation, and so on. The example from the conceptual modeling section describes a particular person with URIref person:p1, who is a member of the committee:c1. The RDF triple from that description is Consider to store two further facts: (i) to record who nominated this particular person to this committee (a statement about the membership itself), and (ii) to record who added the fact to the database (a statement about the statement). The first case is a case of classical reification like above in UML: reify the membership and store its attributes and roles etc.: Additionally, RDF provides a built-in vocabulary intended for describing RDF statements. A description of a statement using this vocabulary is called a reification of the statement. The RDF reification vocabulary consists of the type rdf:Statement, and the properties rdf:subject, rdf:predicate, and rdf:object. Using the reification vocabulary, a reification of the statement about the person's membership would be given by assigning the statement a URIref such as committee:membership12345 so that describing statements can be written as follows: These statements say that the resource identified by the URIref committee:membership12345Stat is an RDF statement, that the subject of the statement refers to the resource identified by person:p1, the predicate of the statement refers to the resource identified by committee:isMemberOf, and the object of the statement refers to the resource committee:c1. Assuming that the original statement is actually identified by committee:membership12345, it should be clear by comparing the original statement with the reification that the reification actually does describe it. The conventional use of the RDF reification vocabulary always involves describing a statement using four statements in this pattern. Therefore, they are sometimes referred to as the "reification quad". Using reification according to this convention, we could record the fact that pe

Type–token distinction

The type–token distinction is the difference between a type of objects (analogous to a class) and the individual tokens of that type (analogous to instances). Since each type may be instantiated by multiple tokens, there are generally more tokens than types of an object. For example, the sentence "A rose is a rose is a rose" contains three word types: three word tokens of the type a, two word tokens of the type is, and three word tokens of the type rose. The distinction is important in disciplines such as logic, linguistics, metalogic, typography, and computer programming. == Overview == The type–token distinction separates types (abstract descriptive concepts) from tokens (objects that instantiate concepts). For example, in the sentence "the bicycle is becoming more popular" the word bicycle represents the abstract concept of bicycles and this abstract concept is a type, whereas in the sentence "the bicycle is in the garage", it represents a particular object and this particular object is a token. Similarly, the word type 'letter' uses only four letter types: L, E, T and R. Nevertheless, it uses both E and T twice. One can say that the word type 'letter' has six letter tokens, with two tokens each of the letter types E and T. Whenever a word type is inscribed, the number of letter tokens created equals the number of letter occurrences in the word type. Some logicians consider a word type to be the class of its tokens. Other logicians counter that the word type has a permanence and constancy not found in the class of its tokens. The type remains the same while the class of its tokens is continually gaining new members and losing old members. == Typography == In typography, the type–token distinction is used to determine the presence of a text printed by movable type: The defining criteria which a typographic print has to fulfill is that of the type identity of the various letter forms which make up the printed text. In other words: each letter form which appears in the text has to be shown as a particular instance ("token") of one and the same type which contains a reverse image of the printed letter. == Charles Sanders Peirce == The distinctions between using words as types or tokens were first made by American logician and philosopher Charles Sanders Peirce in 1906 using terminology that he established. Peirce's type–token distinction applies to words, sentences, paragraphs and so on: to anything in a universe of discourse of character-string theory, or concatenation theory. Peirce's original words are the following: A common mode of estimating the amount of matter in a ... printed book is to count the number of words. There will ordinarily be about twenty 'thes' on a page, and, of course, they count as twenty words. In another sense of the word 'word,' however, there is but one word 'the' in the English language; and it is impossible that this word should lie visibly on a page, or be heard in any voice .... Such a ... Form, I propose to term a Type. A Single ... Object ... such as this or that word on a single line of a single page of a single copy of a book, I will venture to call a Token. .... In order that a Type may be used, it has to be embodied in a Token which shall be a sign of the Type, and thereby of the object the Type signifies.

Decision list

Decision lists are a representation for Boolean functions which can be easily learned from examples. Single term decision lists are more expressive than disjunctions and conjunctions; however, 1-term decision lists are less expressive than the general disjunctive normal form and the conjunctive normal form. The language specified by a k-length decision list includes as a subset the language specified by a k-depth decision tree. Learning decision lists can be used for attribute efficient learning, a type of machine learning. == Definition == A decision list (DL) of length r is of the form: if f1 then output b1 else if f2 then output b2 ... else if fr then output br where fi is the ith formula and bi is the ith boolean for i ∈ { 1... r } {\displaystyle i\in \{1...r\}} . The last if-then-else is the default case, which means formula fr is always equal to true. A k-DL is a decision list where all of formulas have at most k terms. Sometimes "decision list" is used to refer to a 1-DL, where all of the formulas are either a variable or its negation.

OntoCAPE

OntoCAPE is a large-scale ontology for the domain of Computer-Aided Process Engineering (CAPE). It can be downloaded free of charge via the OntoCAPE Homepage. OntoCAPE is partitioned into 62 sub-ontologies, which can be used individually or as an integrated suite. The sub-ontologies are organized across different abstraction layers, which separate general knowledge from knowledge about particular domains and applications. The upper layers have the character of an upper ontology, covering general topics such as mereotopology, systems theory, quantities and units. The lower layers conceptualize the domain of chemical process engineering, covering domain-specific topics such as materials, chemical reactions, or unit operations.