SPKAC

SPKAC

SPKAC (Signed Public Key and Challenge, also known as Netscape SPKI) is a format for sending a certificate signing request (CSR): it encodes a public key, that can be manipulated using OpenSSL. It is created using the little documented HTML keygen element inside a number of Netscape compatible browsers. == Standardisation == There exists an ongoing effort to standardise SPKAC through an Internet Draft in the Internet Engineering Task Force (IETF). The purpose of this work has been to formally define what has existed prior as a de facto standard, and to address security deficiencies, particular with respect to historic insecure use of MD5 that has since been declared unsafe for use with digital signatures. == Implementations == HTML5 originally specified the element to support SPKAC in the browser to make it easier to create client side certificates through a web service for protocols such as WebID; however, subsequent work for HTML 5.1 placed the keygen element "at-risk", and the first public working draft of HTML 5.2 removes the keygen element entirely. The removal of the keygen element is due to non-interoperability and non-conformity from a standards perspective in addition to security concerns. The World Wide Web Consortium (W3C) Web Authentication Working Group developed the WebAuthn (Web Authentication) API to replace the keygen element. Bouncy Castle provides a Java class. An implementation for Erlang/OTP exists too. An implementation for Python is named pyspkac. PHP OpenSSL extension as of version 5.6.0. Node.js implementation. === Deficiencies === The user interface needs to be improved in browsers, to make it more obvious to users when a server is asking for the client certificate.

Text simplification

Text simplification is an aspect of natural language processing that involves modifying, organizing, or categorizing existing text to make it easier to understand while retaining its original meaning. This process is essential in today's world, where communication is increasingly complex due to advancements in science, technology, and media. Human languages are inherently intricate, with extensive vocabularies and complex structures that can be challenging for machines to handle efficiently. Researchers have found that semantic compression techniques can help streamline and simplify text by reducing linguistic diversity and simplifying the vocabulary used in a given context. == Example == Text simplification involves modifying complex sentences into simpler ones to enhance readability and comprehension. Siddharthan (2006) provides an example to illustrate this process. The original sentence contains multiple clauses and phrases, which can be broken down into simpler sentences for better understanding. Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold. Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today. An approach to text simplification involves lexical simplification via lexical substitution, a process that replaces complex words with simpler synonyms. Identifying complex words is a challenge addressed by machine learning classifiers trained on labeled data. Researchers have found that asking labelers to sort words by complexity levels yields more consistent results than the traditional method of categorizing words as simple or complex.

Margin classifier

In machine learning (ML), a margin classifier is a type of classification model which is able to give an associated distance from the decision boundary for each data sample. For instance, if a linear classifier is used, the distance (typically Euclidean, though others may be used) of a sample from the separating hyperplane is the margin of that sample. The notion of margins is important in several ML classification algorithms, as it can be used to bound the generalization error of these classifiers. These bounds are frequently shown using the VC dimension. The generalization error bound in boosting algorithms and support vector machines is particularly prominent. == Margin for boosting algorithms == The margin for an iterative boosting algorithm given a dataset with two classes can be defined as follows: the classifier is given a sample pair ( x , y ) {\displaystyle (x,y)} , where x ∈ X {\displaystyle x\in X} is a domain space and y ∈ Y = { − 1 , + 1 } {\displaystyle y\in Y=\{-1,+1\}} is the sample's label. The algorithm then selects a classifier h j ∈ C {\displaystyle h_{j}\in C} at each iteration j {\displaystyle j} where C {\displaystyle C} is a space of possible classifiers that predict real values. This hypothesis is then weighted by α j ∈ R {\displaystyle \alpha _{j}\in R} as selected by the boosting algorithm. At iteration t {\displaystyle t} , the margin of a sample x {\displaystyle x} can thus be defined as y ∑ j t α j h j ( x ) ∑ | α j | . {\displaystyle {\frac {y\sum _{j}^{t}\alpha _{j}h_{j}(x)}{\sum |\alpha _{j}|}}.} By this definition, the margin is positive if the sample is labeled correctly, or negative if the sample is labeled incorrectly. This definition may be modified and is not the only way to define the margin for boosting algorithms. However, there are reasons why this definition may be appealing. == Examples of margin-based algorithms == Many classifiers can give an associated margin for each sample. However, only some classifiers utilize information of the margin while learning from a dataset. Many boosting algorithms rely on the notion of a margin to assign weight to samples. If a convex loss is utilized (as in AdaBoost or LogitBoost, for instance) then a sample with a higher margin will receive less (or equal) weight than a sample with a lower margin. This leads the boosting algorithm to focus weight on low-margin samples. In non-convex algorithms (e.g., BrownBoost), the margin still dictates the weighting of a sample, though the weighting is non-monotone with respect to the margin. == Generalization error bounds == One theoretical motivation behind margin classifiers is that their generalization error may be bound by the algorithm parameters and a margin term. An example of such a bound is for the AdaBoost algorithm. Let S {\displaystyle S} be a set of m {\displaystyle m} data points, sampled independently at random from a distribution D {\displaystyle D} . Assume the VC-dimension of the underlying base classifier is d {\displaystyle d} and m ≥ d ≥ 1 {\displaystyle m\geq d\geq 1} . Then, with probability 1 − δ {\displaystyle 1-\delta } , we have the bound: P D ( y ∑ j t α j h j ( x ) ∑ | α j | ≤ 0 ) ≤ P S ( y ∑ j t α j h j ( x ) ∑ | α j | ≤ θ ) + O ( 1 m d log 2 ⁡ ( m / d ) / θ 2 + log ⁡ ( 1 / δ ) ) {\displaystyle P_{D}\left({\frac {y\sum _{j}^{t}\alpha _{j}h_{j}(x)}{\sum |\alpha _{j}|}}\leq 0\right)\leq P_{S}\left({\frac {y\sum _{j}^{t}\alpha _{j}h_{j}(x)}{\sum |\alpha _{j}|}}\leq \theta \right)+O\left({\frac {1}{\sqrt {m}}}{\sqrt {d\log ^{2}(m/d)/\theta ^{2}+\log(1/\delta )}}\right)} for all θ > 0 {\displaystyle \theta >0} .

Vapnik–Chervonenkis dimension

In Vapnik–Chervonenkis theory, the Vapnik–Chervonenkis (VC) dimension is a measure of the size (capacity, complexity, expressive power, richness, or flexibility) of a class of sets. The notion can be extended to classes of binary functions. It is defined as the cardinality of the largest set of points that the function class can shatter—that is, for which all possible binary labelings can be realized by some function in the class. It was originally defined by Vladimir Vapnik and Alexey Chervonenkis. Informally, the capacity of a classification model is related to how complicated it can be. For example, consider the thresholding of a high-degree polynomial: if the polynomial evaluates above zero, that point is classified as positive, otherwise as negative. A high-degree polynomial can be wiggly, so that it can fit a given set of training points well. Such a polynomial has a high capacity. A much simpler alternative is to threshold a linear function. This function may not fit the training set well, because it has a low capacity. This notion of capacity is made rigorous below. == Definitions == === VC dimension of a set-family === Let C = { C } C ∈ C {\displaystyle {\mathcal {C}}=\{C\}_{C\in {\mathcal {C}}}} be a family of sets (also called set family, collection of sets or set of sets) and X {\displaystyle X} a set. Their intersection is defined as the following set family: C ∩ X := { C ∩ X ∣ C ∈ C } . {\displaystyle {\mathcal {C}}\cap X:=\{C\cap X\mid C\in {\mathcal {C}}\}.} Here typically X {\displaystyle X} and each C ∈ C {\displaystyle C\in {\mathcal {C}}} are subsets of a big "universe" of possibilities U {\displaystyle U} where intersection takes place. We say that a set X {\displaystyle X} is shattered by C {\displaystyle {\mathcal {C}}} if P ( X ) = C ∩ X {\displaystyle {\mathcal {P}}(X)={\mathcal {C}}\cap X} i.e. the set of intersections contains (hence is equal to) all the subsets of X {\displaystyle X} . For finite sets X {\displaystyle X} this is equivalent to | C ∩ X | = 2 | X | . {\displaystyle |{\mathcal {C}}\cap X|=2^{|X|}.} The VC dimension D {\displaystyle D} of C {\displaystyle {\mathcal {C}}} is the cardinality of the largest set that is shattered by C {\displaystyle {\mathcal {C}}} . If arbitrarily large sets can be shattered, the VC dimension of C {\displaystyle {\mathcal {C}}} is ∞ {\displaystyle \infty } . === VC dimension of a classification model === A binary classification model f {\displaystyle f} with some parameter vector θ {\displaystyle \theta } is said to shatter a set of generally positioned data points ( x 1 , x 2 , … , x n ) {\displaystyle (x_{1},x_{2},\ldots ,x_{n})} if, for every assignment of labels to those points, there exists a θ {\displaystyle \theta } such that the model f {\displaystyle f} makes no errors when evaluating that set of data points. The VC dimension of a model f {\displaystyle f} is the maximum number of points that can be arranged so that f {\displaystyle f} shatters them. More formally, it is the maximum cardinal D {\displaystyle D} such that there exists a generally positioned data point set of cardinality D {\displaystyle D} that can be shattered by f {\displaystyle f} . == Examples == f {\displaystyle f} is a constant classifier (with no parameters); Its VC dimension is 0 since it cannot shatter even a single point. In general, the VC dimension of a finite classification model, which can return at most 2 d {\displaystyle 2^{d}} different classifiers, is at most d {\displaystyle d} (this is an upper bound on the VC dimension; the Sauer–Shelah lemma gives a lower bound on the dimension). f {\displaystyle f} is a single-parametric threshold classifier on real numbers; i.e., for a certain threshold θ {\displaystyle \theta } , the classifier f θ {\displaystyle f_{\theta }} returns 1 if the input number is larger than θ {\displaystyle \theta } and 0 otherwise. The VC dimension of f {\displaystyle f} is 1 because: (a) It can shatter a single point. For every point x {\displaystyle x} , a classifier f θ {\displaystyle f_{\theta }} labels it as 0 if θ > x {\displaystyle \theta >x} and labels it as 1 if θ < x {\displaystyle \theta x + 2 {\displaystyle \theta >x+2} , as (1,0) if θ ∈ [ x − 4 , x − 2 ) {\displaystyle \theta \in [x-4,x-2)} , as (1,1) if θ ∈ [ x − 2 , x ] {\displaystyle \theta \in [x-2,x]} , and as (0,1) if θ ∈ ( x , x + 2 ] {\displaystyle \theta \in (x,x+2]} . (b) It cannot shatter any set of three points. For every set of three numbers, if the smallest and the largest are labeled 1, then the middle one must also be labeled 1, so not all labelings are possible. f {\displaystyle f} is a straight line as a classification model on points in a two-dimensional plane (this is the model used by a perceptron). The line should separate positive data points from negative data points. There exist sets of 3 points that can indeed be shattered using this model (any 3 points that are not collinear can be shattered). However, no set of 4 points can be shattered: by Radon's theorem, any four points can be partitioned into two subsets with intersecting convex hulls, so it is not possible to separate one of these two subsets from the other. Thus, the VC dimension of this particular classifier is 3. It is important to remember that while one can choose any arrangement of points, the arrangement of those points cannot change when attempting to shatter for some label assignment. Note, only 3 of the 23 = 8 possible label assignments are shown for the three points. f {\displaystyle f} is a single-parametric sine classifier, i.e., for a certain parameter θ {\displaystyle \theta } , the classifier f θ {\displaystyle f_{\theta }} returns 1 if the input number x {\displaystyle x} has sin ⁡ ( θ x ) > 0 {\displaystyle \sin(\theta x)>0} and 0 otherwise. The VC dimension of f {\displaystyle f} is infinite, since it can shatter any finite subset of the set { 2 − m ∣ m ∈ N } {\displaystyle \{2^{-m}\mid m\in \mathbb {N} \}} . == Uses == === In statistical learning theory === The VC dimension can predict a probabilistic upper bound on the test error of a classification model. Vapnik proved that the probability of the test error (i.e., risk with 0–1 loss function) distancing from an upper bound (on data that is drawn i.i.d. from the same distribution as the training set) is given by: Pr ( test error ⩽ training error + 1 N [ D ( log ⁡ ( 2 N D ) + 1 ) − log ⁡ ( η 4 ) ] ) = 1 − η , {\displaystyle \Pr \left({\text{test error}}\leqslant {\text{training error}}+{\sqrt {{\frac {1}{N}}\left[D\left(\log \left({\tfrac {2N}{D}}\right)+1\right)-\log \left({\tfrac {\eta }{4}}\right)\right]}}\,\right)=1-\eta ,} where D {\displaystyle D} is the VC dimension of the classification model, 0 < η ⩽ 1 {\displaystyle 0<\eta \leqslant 1} , and N {\displaystyle N} is the size of the training set (restriction: this formula is valid when D ≪ N {\displaystyle D\ll N} . When D {\displaystyle D} is larger, the test-error may be much higher than the training-error. This is due to overfitting). The VC dimension also appears in sample-complexity bounds. A space of binary functions with VC dimension D {\displaystyle D} can be learned with: N = Θ ( D + ln ⁡ 1 δ ε 2 ) {\displaystyle N=\Theta \left({\frac {D+\ln {1 \over \delta }}{\varepsilon ^{2}}}\right)} samples, where ε {\displaystyle \varepsilon } is the learning error and δ {\displaystyle \delta } is the failure probability. Thus, the sample-complexity is a linear function of the VC dimension of the hypothesis space. === In computational geometry === The VC dimension is one of the critical parameters in the size of ε-nets, which determines the complexity of approximation algorithms based on them; range sets without finite VC dimension may not have finite ε-nets at all. == Bounds == The VC dimension of the dual set-family of C {\displaystyle {\mathcal {C}}} is strictly less than 2 vc ⁡ ( C ) + 1 {\displaystyle 2^{\operatorname {vc} ({\mathcal {C}})+1}} , and this is best possible. The VC dimension of a finite set-family C {\displaystyle {\mathcal {C}}} is at most log 2 ⁡ | C | {\displaystyle \log _{2}|{\mathcal {C}}|} . This is because | C ∩ X | ≤ | X | {\displaystyle |{\mathcal {C}}\cap X|\leq |X|} by definition. Given a set-fa

Independent component analysis

In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents. This is done by assuming that at most one subcomponent is Gaussian and that the subcomponents are statistically independent from each other. ICA was invented by Jeanny Hérault and Christian Jutten in 1985. ICA is a special case of blind source separation. A common example application of ICA is the "cocktail party problem" of listening in on one person's speech in a noisy room. == Introduction == Independent component analysis attempts to decompose a multivariate signal into independent non-Gaussian signals. As an example, sound is usually a signal that is composed of the numerical addition, at each time t, of signals from several sources. The question then is whether it is possible to separate these contributing sources from the observed total signal. When the statistical independence assumption is correct, blind ICA separation of a mixed signal gives very good results. It is also used for signals that are not supposed to be generated by mixing for analysis purposes. A simple application of ICA is the "cocktail party problem", where the underlying speech signals are separated from a sample data consisting of people talking simultaneously in a room. Usually the problem is simplified by assuming no time delays or echoes. Note that a filtered and delayed signal is a copy of a dependent component, and thus the statistical independence assumption is not violated. Mixing weights for constructing the M {\textstyle M} observed signals from the N {\textstyle N} components can be placed in an M × N {\textstyle M\times N} matrix. An important thing to consider is that if N {\textstyle N} sources are present, at least N {\textstyle N} observations (e.g. microphones if the observed signal is audio) are needed to recover the original signals. When there are an equal number of observations and source signals, the mixing matrix is square ( M = N {\textstyle M=N} ). Other cases of underdetermined ( M < N {\textstyle M N {\textstyle M>N} ) have been investigated. The success of ICA separation of mixed signals relies on two assumptions and three effects of mixing source signals. Two assumptions: The source signals are independent of each other. The values in each source signal have non-Gaussian distributions. Three effects of mixing source signals: Independence: As per assumption 1, the source signals are independent; however, their signal mixtures are not. This is because the signal mixtures share the same source signals. Normality: According to the Central Limit Theorem, the distribution of a sum of independent random variables with finite variance tends towards a Gaussian distribution.Loosely speaking, a sum of two independent random variables usually has a distribution that is closer to Gaussian than any of the two original variables. Here we consider the value of each signal as the random variable. Complexity: The temporal complexity of any signal mixture is greater than that of its simplest constituent source signal. Those principles contribute to the basic establishment of ICA. If the signals extracted from a set of mixtures are independent and have non-Gaussian distributions or have low complexity, then they must be source signals. Another common example is image steganography, where ICA is used to embed one image within another. For instance, two grayscale images can be linearly combined to create mixed images in which the hidden content is visually imperceptible. ICA can then be used to recover the original source images from the mixtures. This technique underlies digital watermarking, which allows the embedding of ownership information into images, as well as more covert applications such as undetected information transmission. The method has even been linked to real-world cyberespionage cases. In such applications, ICA serves to unmix the data based on statistical independence, making it possible to extract hidden components that are not apparent in the observed data. Steganographic techniques, including those potentially involving ICA-based analysis, have been used in real-world cyberespionage cases. In 2010, the FBI uncovered a Russian spy network known as the "Illegals Program" (Operation Ghost Stories), where agents used custom-built steganography tools to conceal encrypted text messages within image files shared online. In another case, a former General Electric engineer, Xiaoqing Zheng, was convicted in 2022 for economic espionage. Zheng used steganography to exfiltrate sensitive turbine technology by embedding proprietary data within image files for transfer to entities in China. == Defining component independence == ICA finds the independent components (also called factors, latent variables or sources) by maximizing the statistical independence of the estimated components. We may choose one of many ways to define a proxy for independence, and this choice governs the form of the ICA algorithm. The two broadest definitions of independence for ICA are Minimization of mutual information Maximization of non-Gaussianity The Minimization-of-Mutual information (MMI) family of ICA algorithms uses measures like Kullback-Leibler Divergence and maximum entropy. The non-Gaussianity family of ICA algorithms, motivated by the central limit theorem, uses kurtosis and negentropy. Typical algorithms for ICA use centering (subtract the mean to create a zero mean signal), whitening (usually with the eigenvalue decomposition), and dimensionality reduction as preprocessing steps in order to simplify and reduce the complexity of the problem for the actual iterative algorithm. == Mathematical definitions == Linear independent component analysis can be divided into noiseless and noisy cases, where noiseless ICA is a special case of noisy ICA. Nonlinear ICA should be considered as a separate case. === General Derivation === In the classical ICA model, it is assumed that the observed data x i ∈ R m {\displaystyle \mathbf {x} _{i}\in \mathbb {R} ^{m}} at time t i {\displaystyle t_{i}} is generated from source signals s i ∈ R m {\displaystyle \mathbf {s} _{i}\in \mathbb {R} ^{m}} via a linear transformation x i = A s i {\displaystyle \mathbf {x} _{i}=A\mathbf {s} _{i}} , where A {\displaystyle A} is an unknown, invertible mixing matrix. To recover the source signals, the data is first centered (zero mean), and then whitened so that the transformed data has unit covariance. This whitening reduces the problem from estimating a general matrix A {\displaystyle A} to estimating an orthogonal matrix V {\displaystyle V} , significantly simplifying the search for independent components. If the covariance matrix of the centered data is Σ x = A A ⊤ {\displaystyle \Sigma _{x}=AA^{\top }} , then using the eigen-decomposition Σ x = Q D Q ⊤ {\displaystyle \Sigma _{x}=QDQ^{\top }} , the whitening transformation can be taken as D − 1 / 2 Q ⊤ {\displaystyle D^{-1/2}Q^{\top }} . This step ensures that the recovered sources are uncorrelated and of unit variance, leaving only the task of rotating the whitened data to maximize statistical independence. This general derivation underlies many ICA algorithms and is foundational in understanding the ICA model. ==== Reduced Mixing Problem ==== Independent component analysis (ICA) addresses the problem of recovering a set of unobserved source signals s i = ( s i 1 , s i 2 , … , s i m ) T {\displaystyle s_{i}=(s_{i1},s_{i2},\dots ,s_{im})^{T}} from observed mixed signals x i = ( x i 1 , x i 2 , … , x i m ) T {\displaystyle x_{i}=(x_{i1},x_{i2},\dots ,x_{im})^{T}} , based on the linear mixing model: x i = A s i , {\displaystyle x_{i}=A\,s_{i},} where the A {\displaystyle A} is an m × m {\displaystyle m\times m} invertible matrix called the mixing matrix, s i {\displaystyle s_{i}} represents the m‑dimensional vector containing the values of the sources at time t i {\displaystyle t_{i}} , and x i {\displaystyle x_{i}} is the corresponding vector of observed values at time t i {\displaystyle t_{i}} . The goal is to estimate both A {\displaystyle A} and the source signals { s i } {\displaystyle \{s_{i}\}} solely from the observed data { x i } {\displaystyle \{x_{i}\}} . After centering, the Gram matrix is computed as: ( X ∗ ) T X ∗ = Q D Q T , {\displaystyle (X^{})^{T}X^{}=Q\,D\,Q^{T},} where D is a diagonal matrix with positive entries (assuming X ∗ {\displaystyle X^{}} has maximum rank), and Q is an orthogonal matrix. Writing the SVD of the mixing matrix A = U Σ V T {\displaystyle A=U\Sigma V^{T}} and comparing with A A T = U Σ 2 U T {\displaystyle AA^{T}=U\Sigma ^{2}U^{T}} the mixing A has the form A = Q D 1 / 2 V T . {\displaystyle A=Q\,D^{1/2}\,V^{T}.} So, the normalized source values satisfy s i ∗ = V y i ∗ {\displaystyle s_{i}^{}=V\,y_{i}^{}} , where y i ∗ = D − 1 2 Q T x i ∗ . {\displaystyle y_{i}^{}=D^{-{\tfrac {1}{2}}}Q^{T}x_{i}^{}.} Thus, ICA reduces

Video renderer

A video renderer is software that processes a video file and sends it sequentially to the video display controller card for display on a computer screen. An example of a video renderer, is the VMR-7 that was used by Microsoft's DirectShow. An example of a UNIX video renderer is the one container within GStreamer. Commonly used video renderers are: Enhanced Video Renderer VMR9 Renderless Haali's Video Renderer Madvr Video Renderer JRVR, a part of JRiver Media Center

Physical neural network

A physical neural network is a type of artificial neural network in which an electrically adjustable material is used to emulate the function of a neural synapse or a higher-order (dendritic) neuron model. "Physical" neural network is used to emphasize the reliance on physical hardware used to emulate neurons as opposed to software-based approaches. More generally the term is applicable to other artificial neural networks in which a memristor or other electrically adjustable resistance material is used to emulate a neural synapse. == Types of physical neural networks == === ADALINE === In the 1960s Bernard Widrow and Ted Hoff developed ADALINE (Adaptive Linear Neuron) which used electrochemical cells called memistors (memory resistors) to emulate synapses of an artificial neuron. The memistors were implemented as 3-terminal devices operating based on the reversible electroplating of copper such that the resistance between two of the terminals is controlled by the integral of the current applied via the third terminal. The ADALINE circuitry was briefly commercialized by the Memistor Corporation in the 1960s enabling some applications in pattern recognition. However, since the memistors were not fabricated using integrated circuit fabrication techniques the technology was not scalable and was eventually abandoned as solid-state electronics became mature. === Analog VLSI === In 1989 Carver Mead published his book Analog VLSI and Neural Systems, which spun off perhaps the most common variant of analog neural networks. The physical realization is implemented in analog VLSI. This is often implemented as field effect transistors in low inversion. Such devices can be modelled as translinear circuits. This is a technique described by Barrie Gilbert in several papers around mid 1970th, and in particular his Translinear Circuits from 1981. With this method circuits can be analyzed as a set of well-defined functions in steady-state, and such circuits assembled into complex networks. === Physical Neural Network === Alex Nugent describes a physical neural network as one or more nonlinear neuron-like nodes used to sum signals and nanoconnections formed from nanoparticles, nanowires, or nanotubes which determine the signal strength input to the nodes. Alignment or self-assembly of the nanoconnections is determined by the history of the applied electric field performing a function analogous to neural synapses. Numerous applications for such physical neural networks are possible. For example, a temporal summation device can be composed of one or more nanoconnections having an input and an output thereof, wherein an input signal provided to the input causes one or more of the nanoconnection to experience an increase in connection strength thereof over time. Another example of a physical neural network is taught by U.S. Patent No. 7,039,619 entitled "Utilized nanotechnology apparatus using a neural network, a solution and a connection gap," which issued to Alex Nugent by the U.S. Patent & Trademark Office on May 2, 2006. A further application of physical neural network is shown in U.S. Patent No. 7,412,428 entitled "Application of hebbian and anti-hebbian learning to nanotechnology-based physical neural networks," which issued on August 12, 2008. Nugent and Molter have shown that universal computing and general-purpose machine learning are possible from operations available through simple memristive circuits operating the AHaH plasticity rule. More recently, it has been argued that also complex networks of purely memristive circuits can serve as neural networks. === Phase change neural network === In 2002, Stanford Ovshinsky described an analog neural computing medium in which phase-change material has the ability to cumulatively respond to multiple input signals. An electrical alteration of the resistance of the phase change material is used to control the weighting of the input signals. === Memristive neural network === Greg Snider of HP Labs describes a system of cortical computing with memristive nanodevices. The memristors (memory resistors) are implemented by thin film materials in which the resistance is electrically tuned via the transport of ions or oxygen vacancies within the film. DARPA's SyNAPSE project has funded IBM Research and HP Labs, in collaboration with the Boston University Department of Cognitive and Neural Systems (CNS), to develop neuromorphic architectures which may be based on memristive systems. === Protonic artificial synapses === In 2022, researchers reported the development of nanoscale brain-inspired artificial synapses, using the ion proton (H+), for 'analog deep learning'.