Universal approximation theorem

Universal approximation theorem

In the field of machine learning, the universal approximation theorems (UATs) state that neural networks with a certain structure can, in principle, approximate any continuous function to any desired degree of accuracy. These theorems provide a mathematical justification for using neural networks, assuring researchers that a sufficiently large or deep network can model the complex, non-linear relationships often found in real-world data. The best-known version of the theorem applies to feedforward networks with a single hidden layer. It states that if the layer's activation function is non-polynomial (which is true for common choices like the sigmoid function or ReLU), then the network can act as a "universal approximator." Universality is achieved by increasing the number of neurons in the hidden layer, making the network "wider." Other versions of the theorem show that universality can also be achieved by keeping the network's width fixed but increasing its number of layers, making it "deeper." These are existence theorems. They guarantee that a network with the right structure exists, but they do not provide a method for finding the network's parameters (training it), nor do they specify exactly how large the network must be for a given function. Finding a suitable network remains a practical challenge that is typically addressed with optimization algorithms like backpropagation. == Setup == Artificial neural networks are combinations of multiple simple mathematical functions that implement more complicated functions from (typically) real-valued vectors to real-valued vectors. The spaces of multivariate functions that can be implemented by a network are determined by the structure of the network, the set of simple functions, and its multiplicative parameters. A great deal of theoretical work has gone into characterizing these function spaces. Most universal approximation theorems are in one of two classes. The first quantifies the approximation capabilities of neural networks with an arbitrary number of artificial neurons ("arbitrary width" case) and the second focuses on the case with an arbitrary number of hidden layers, each containing a limited number of artificial neurons ("arbitrary depth" case). In addition to these two classes, there are also universal approximation theorems for neural networks with bounded number of hidden layers and a limited number of neurons in each layer ("bounded depth and bounded width" case). == History == === Arbitrary width === The first results concerned the arbitrary width case. Ken-ichi Funahashi (May 1989) showed that Rumelhart–Hinton–Williams type backpropagation networks possess universal approximation capability with a class of sigmoidal activation functions, extending the result to multi-output mappings as well. Kurt Hornik, Maxwell Stinchcombe, and Halbert White (July 1989) showed that multilayer feed-forward networks with as few as one hidden layer are universal approximators, provided that the activation function satisfies certain conditions. George Cybenko (December 1989) independently established a related result for sigmoid activation functions using functional-analytic methods. Hornik also showed in 1991 that it is not the specific choice of the activation function but rather the multilayer feed-forward architecture itself that gives neural networks the potential of being universal approximators. Moshe Leshno et al in 1993 and later Allan Pinkus in 1999 showed that the universal approximation property is equivalent to having a nonpolynomial activation function. === Arbitrary depth === The arbitrary depth case was also studied by a number of authors such as Gustaf Gripenberg in 2003, Dmitry Yarotsky, Zhou Lu et al in 2017, Boris Hanin and Mark Sellke in 2018 who focused on neural networks with ReLU activation function. In 2020, Patrick Kidger and Terry Lyons extended those results to neural networks with general activation functions such, e.g. tanh or GeLU. One special case of arbitrary depth is that each composition component comes from a finite set of mappings. In 2024, Cai constructed a finite set of mappings, named a vocabulary, such that any continuous function can be approximated by compositing a sequence from the vocabulary. This is similar to the concept of compositionality in linguistics, which is the idea that a finite vocabulary of basic elements can be combined via grammar to express an infinite range of meanings. === Bounded depth and bounded width === The bounded depth and bounded width case was first studied by Maiorov and Pinkus in 1999. They showed that there exists an analytic sigmoidal activation function such that two hidden layer neural networks with bounded number of units in hidden layers are universal approximators. In 2018, Guliyev and Ismailov constructed a smooth sigmoidal activation function providing universal approximation property for two hidden layer feedforward neural networks with fewer units in hidden layers. In 2018, they also constructed single hidden layer networks with bounded width that are still universal approximators for univariate functions. However, this does not apply for multivariable functions. In 2022, Shen et al. obtained precise quantitative information on the depth and width required to approximate a target function by deep and wide ReLU neural networks. === Quantitative bounds === The question of minimal possible width for universality was first studied in 2021, Park et al obtained the minimum width required for the universal approximation of Lp functions using feed-forward neural networks with ReLU as activation functions. Similar results that can be directly applied to residual neural networks were also obtained in the same year by Paulo Tabuada and Bahman Gharesifard using control-theoretic arguments. In 2023, Cai obtained the optimal minimum width bound for the universal approximation. For the arbitrary depth case, Leonie Papon and Anastasis Kratsios derived explicit depth estimates depending on the regularity of the target function and of the activation function. === Kolmogorov network === The Kolmogorov–Arnold representation theorem is similar in spirit. Indeed, certain neural network families can directly apply the Kolmogorov–Arnold theorem to yield a universal approximation theorem. Robert Hecht-Nielsen showed that a three-layer neural network can approximate any continuous multivariate function. This was extended to the discontinuous case by Vugar Ismailov. In 2024, Ziming Liu and co-authors showed a practical application. === Reservoir computing and quantum reservoir computing === In reservoir computing a sparse recurrent neural network with fixed weights equipped of fading memory and echo state property is followed by a trainable output layer. Its universality has been demonstrated separately for what concerns networks of rate neurons and spiking neurons, respectively. In 2024, the framework has been generalized and extended to quantum reservoirs where the reservoir is based on qubits defined over Hilbert spaces. === Variants === Variants include discontinuous activation functions, noncompact domains, certifiable networks, random neural networks, and alternative network architectures and topologies. The universal approximation property of width-bounded networks has been studied as a dual of classical universal approximation results on depth-bounded networks. For input dimension d x {\displaystyle d_{x}} and output dimension d y {\displaystyle d_{y}} the minimum width required for the universal approximation of the Lp functions is exactly m a x { d x + 1 , d y } {\displaystyle max\{d_{x}+1,d_{y}\}} (for a ReLU network). More generally this also holds if both ReLU and a threshold activation function are used. Universal function approximation on graphs (or rather on graph isomorphism classes) by popular graph convolutional neural networks (GCNs or GNNs) can be made as discriminative as the Weisfeiler–Leman graph isomorphism test. In 2020, a universal approximation theorem result was established by Brüel-Gabrielsson, showing that graph representation with certain injective properties is sufficient for universal function approximation on bounded graphs and restricted universal function approximation on unbounded graphs, with an accompanying O ( | V | ⋅ | E | ) {\displaystyle {\mathcal {O}}(\left|V\right|\cdot \left|E\right|)} -runtime method that performed at state of the art on a collection of benchmarks (where V {\displaystyle V} and E {\displaystyle E} are the sets of nodes and edges of the graph respectively). There are also a variety of results between non-Euclidean spaces and other commonly used architectures and, more generally, algorithmically generated sets of functions, such as the convolutional neural network (CNN) architecture, radial basis functions, or neural networks with specific properties. == Arbitrary-width case == A universal approximation theorem formally states that a family of neural network funct

Simulation noise

Simulation noise is a function that creates a divergence-free vector field. This signal can be used in artistic simulations for the purpose of increasing the perception of extra detail. The function can be calculated in three dimensions by dividing the space into a regular lattice grid. With each edge is associated a random value, indicating a rotational component of material revolving around the edge. By following rotating material into and out of faces, one can quickly sum the flux passing through each face of the lattice. Flux values at lattice faces are then interpolated to create a field value for all positions. Perlin noise is the earliest form of lattice noise, which has become very popular in computer graphics. Perlin Noise is not suited for simulation because it is not divergence-free. Noises based on lattices, such as simulation noise and Perlin noise, are often calculated at different frequencies and summed together to form band-limited fractal signals. Other approaches developed later that use vector calculus identities to produce divergence free fields, such as "Curl-Noise" as suggested by Rook Bridson, and "Divergence-Free Noise" due to Ivan DeWolf. These often require calculation of lattice noise gradients, which sometimes are not readily available. A naive implementation would call a lattice noise function several times to calculate its gradient, resulting in more computation than is strictly necessary. Unlike these noises, simulation noise has a geometric rationale in addition to its mathematical properties. It simulates vortices scattered in space, to produce its pleasing aesthetic. == Curl noise == The vector field is created as follows, for every point (x,y,z) in the space a vector field G is created, every component x, y and z of the vector field (Gx, Gy, Gz) is defined by a 3D perlin or simplex noise function with x, y and z as parameters. The partial derivative of Gx, Gy, and Gz respect to x, y and z is obtained with the gradient of the perlin or simplex noise by finite differences of implicit calculation inside the simplex noise. The partial derivatives are used to calculate F as the curl of G given by F = ( ∂ G z ∂ y − ∂ G y ∂ z , ∂ G x ∂ z − ∂ G z ∂ x , ∂ G y ∂ x − ∂ G x ∂ y ) {\displaystyle F=({\frac {\partial Gz}{\partial y}}-{\frac {\partial Gy}{\partial z}},{\frac {\partial Gx}{\partial z}}-{\frac {\partial Gz}{\partial x}},{\frac {\partial Gy}{\partial x}}-{\frac {\partial Gx}{\partial y}})} == Bitangent noise == This method is based in the fact that the curl of the gradient of scalar field is zero and the identity that expand the divergence of a cross product of two vectors A and B as the difference of the dot products of each vector with the curl of the other: ∇ × ( ∇ φ ) = 0 . {\displaystyle \nabla \times (\nabla \varphi )=\mathbf {0} .} ∇ ⋅ ( A × B ) = ( ∇ × A ) ⋅ B − A ⋅ ( ∇ × B ) {\displaystyle \nabla \cdot (\mathbf {A} \times \mathbf {B} )=\ (\nabla {\times }\mathbf {A} )\cdot \mathbf {B} \,-\,\mathbf {A} \cdot (\nabla {\times }\mathbf {B} )} which means that if the curl of both vector fields is zero then the divergence of the product of two vectors that are the gradients of scalar fields is zero too. This result in a divergence free vector field by construction only calling two noise functions to create the scalar fields. The vector field es created as follows, two scalar fields are calculated ϕ {\displaystyle \phi } and ψ {\displaystyle \psi } using 3D perlin or simplex noise functions, then the gradients A and B of each of this fields is calculated, the cross product of A and B gives a divergence free vector field. == Signed distance noise == The vector field is created based on a closed and differentiable implicit surface S = F(x,y,z) = 0. For every point in the space, frequently outside or near the surface, we get a vector g that is normal to the surface, this is the gradient of S or the partial derivatives respect to x, y and z, this vector is not unitary, but we can get a unitary normal n by dividing each component of the point by the magnitude of the gradient g. Outside of the surface all these normals point away from the surface. g = ∇ F ( x , y , z ) = ( ∂ F ∂ x , ∂ F ∂ y , ∂ F ∂ z ) {\displaystyle g=\nabla F(x,y,z)=\left({\frac {\partial F}{\partial x}},{\frac {\partial F}{\partial y}},{\frac {\partial F}{\partial z}}\right)} n = g ( x , y , z ) ‖ ∇ F ( x , y , z ) ‖ {\displaystyle \mathbf {n} ={\frac {g(x,y,z)}{\|\nabla F(x,y,z)\|}}} ‖ ∇ F ( x , y , z ) ‖ = ( ∂ F ∂ x ) 2 + ( ∂ F ∂ y ) 2 + ( ∂ F ∂ z ) 2 {\displaystyle \|\nabla F(x,y,z)\|={\sqrt {\left({\frac {\partial F}{\partial x}}\right)^{2}+\left({\frac {\partial F}{\partial y}}\right)^{2}+\left({\frac {\partial F}{\partial z}}\right)^{2}}}} Afterwards we calculate a scalar value p for that point in the space using a 3D perlin or simplex noise function. Now we create a vector field V = pn pointing outside of the surface. The curl of this vector field gives the direction in every point in the space where the particles should move. S D N = ( ∂ V z ∂ y − ∂ V y ∂ z , ∂ V x ∂ z − ∂ V z ∂ x , ∂ V y ∂ x − ∂ V x ∂ y ) {\displaystyle SDN=({\frac {\partial Vz}{\partial y}}-{\frac {\partial Vy}{\partial z}},{\frac {\partial Vx}{\partial z}}-{\frac {\partial Vz}{\partial x}},{\frac {\partial Vy}{\partial x}}-{\frac {\partial Vx}{\partial y}})} By construction this vector SDN will point in a tangent direction to an isosurface at the level of the signed distance to the original surface and can be used to confine the movements of the particles to stay in that surface.

Active networking

Active networking is a communication pattern that allows packets flowing through a telecommunications network to dynamically modify the operation of the network. Active network architecture is composed of execution environments (similar to a unix shell that can execute active packets), a node operating system capable of supporting one or more execution environments. It also consists of active hardware, capable of routing or switching as well as executing code within active packets. This differs from the traditional network architecture which seeks robustness and stability by attempting to remove complexity and the ability to change its fundamental operation from underlying network components. Network processors are one means of implementing active networking concepts. Active networks have also been implemented as overlay networks. == What does it offer? == Active networking allows the possibility of highly tailored and rapid "real-time" changes to the underlying network operation. This enables such ideas as sending code along with packets of information allowing the data to change its form (code) to match the channel characteristics. The smallest program that can generate a sequence of data can be found in the definition of Kolmogorov complexity. The use of real-time genetic algorithms within the network to compose network services is also enabled by active networking. == How it relates to other networking paradigms == Active networking relates to other networking paradigms primarily based upon how computing and communication are partitioned in the architecture. === Active networking and software-defined networking === Active networking is an approach to network architecture with in-network programmability. The name derives from a comparison with network approaches advocating minimization of in-network processing, based on design advice such as the "end-to-end argument". Two major approaches were conceived: programmable network elements ("switches") and capsules, a programmability approach that places computation within packets traveling through the network. Treating packets as programs later became known as "active packets". Software-defined networking decouples the system that makes decisions about where traffic is sent (the control plane) from the underlying systems that forward traffic to the selected destination (the data plane). The concept of a programmable control plane originated at the University of Cambridge in the Systems Research Group, where (using virtual circuit identifiers available in Asynchronous Transfer Mode switches) multiple virtual control planes were made available on a single physical switch. Control Plane Technologies (CPT) was founded to commercialize this concept. == Fundamental challenges == Active network research addresses the nature of how best to incorporate extremely dynamic capability within networks. In order to do this, active network research must address the problem of optimally allocating computation versus communication within communication networks. A similar problem related to the compression of code as a measure of complexity is addressed via algorithmic information theory. One of the challenges of active networking has been the inability of information theory to mathematically model the active network paradigm and enable active network engineering. This is due to the active nature of the network in which communication packets contain code that dynamically change the operation of the network. Fundamental advances in information theory are required in order to understand such networks. == Nanoscale active networks == As the limit in reduction of transistor size is reached with current technology, active networking concepts are being explored as a more efficient means accomplishing computation and communication. More on this can be found in nanoscale networking.

Scalable Video Coding

Scalable Video Coding (SVC) is a video compression standard developed jointly by the ITU-T and the ISO/IEC. The two organizations formed the Joint Video Team (JVT) to create the H.264/MPEG-4 AVC standard (ITU-T Rec. H.264 | ISO/IEC 14496-10 AVC). SVC aims to provide adaptable or scalable content, allowing a single encoded video stream to be decoded at various bitrates, resolutions, and quality levels, thus catering to diverse devices and network conditions. == History == In October 2003, the Moving Picture Experts Group (MPEG) issued a Call for Proposals on SVC Technology. Fourteen proposals were submitted, twelve of which utilized wavelet compression, while the remaining two were extensions of H.264/MPEG-4 AVC. The proposal from the Heinrich-Hertz-Institut (HHI) was selected by MPEG as the foundation for the SVC standardization project. In January 2005, MPEG and the Video Coding Experts Group (VCEG) agreed to finalize SVC as an amendment to the H.264/MPEG-4 AVC standard. In November 2008, Google launched Gmail Video Chat, which employed an H.264/SVC codec, marking the first consumer application of the standard. This service was succeeded by Google+ Hangouts in 2012. In 2011, Google Code highlighted SVC as the successor to the open-source RVC video chat engine, noting its prominence in 2010. == Principles of scalability == === Overview === Scalability refers to the ability to represent a video signal at multiple levels of detail within a single encoded bitstream. This enables decoding of a base layer for basic quality and additional enhancement layers for progressively higher quality. SVC defines three types of scalability: Spatial scalability: Supports multiple resolution levels. Temporal scalability: Enables varying frame rates. Quality scalability: Provides different image quality levels. === Spatial scalability === Spatial scalability allows the reconstruction of video at different resolutions, such as QCIF, CIF, or SD. This is achieved through a pyramidal decomposition into multiple spatial layers. === Temporal scalability === Temporal scalability adjusts the frame rate of the decoded video stream. Various frame rates are supported using a hierarchical structure of video frames. === Quality scalability === Quality scalability, or Signal-to-Noise Ratio (SNR) scalability, improves the signal-to-noise ratio of a layer, reducing quantization distortion between the original and reconstructed images. SVC supports two approaches: Fine Grain Scalability (FGS) and Coarse Grain Scalability (CGS). ==== Coarse Grain Scalability (CGS) ==== CGS incorporates quality scalability across spatial resolutions. Each spatial resolution is encoded as a separate layer, refining texture and motion data. For a given resolution, quality scalability is achieved by encoding multiple quality layers with progressively finer quantization steps, starting from a base layer with minimal quality. ==== Fine Grain Scalability (FGS) ==== FGS enables progressive refinement of transformed coefficients within a single spatial layer. The base quality layer is encoded using the AVC standard with an initial quantization parameter (QP) ensuring minimal acceptable quality. Subsequent refinement layers reduce the QP by six, halving the quantization step. The refinement data stream can be truncated at any point, allowing fine-grained quality scalability.

Data plan

A data plan is a subscription plan from a cellular or other mobile service provider to provide internet data and connectivity. == Formatting == Data plans are usually created by a contract between the telecommunications carrier and the user of their service. This contract outlines a maximum amount of usable data, usually highlighted in either megabytes or gigabytes, allotted per month for the user. In most cases companies will allow a user to surpass the amount of data allowed in the contract, however, will have to pay a per-gigabyte fee, ranging anywhere from five to fifteen U.S. dollars. === Popularization of unlimited plans === Unlimited data plans have seen a large increase in usage by consumers since their initial introduction by U.S. network T-Mobile. These plans, instead of setting an overall maximum for the user, have an amount set-up that, when surpassed, will slow the speed of the network for that user. Unlimited plans typically cost significantly more than the traditional shared data plans, which is a major reason that carriers have set large boundaries and fees. The limits imposed on unlimited plans are designed to fight against attempts to misuse the network, such as a DDoS attack, but are more commonly reasoned as a method to increase the number of people that can use one tower simultaneously. === Data speed changes === When a network is near reaching peak capacity data speeds may be slowed down by carriers as part of most major telecom contracts. This, as stated previously, allows for more people to be utilizing one tower, reducing needed capital for the company. Since speed changes are allowed at the company's will, the user has no official guarantee of speed on most major networks. === Costs brought upon by additional data === In many cases both the user and carrier have to incur additional costs when a user utilizes more of a given data package, which has helped in the proliferation of data caps and other forms of shared data plans. Most of the charges that the carrier has to incur for additional data usage is partially or fully given to the user of the network. ==== Users ==== Users are required to pay flat-rate additional fees that occur when they go above the amount of data given to them in their contract, utility, or prepaid plan. The cost per gigabyte of this fee is usually higher than what the contract itself offers, which discourages users from over-utilizing data and incurring a charge for the carrier. Certain contracts, which do not offer paying additional fees for an increase in data, may result in a shutdown of service, or in extremely rare cases, termination of the service as a whole. ==== Carriers ==== Carriers incur costs for additional data usage, as it limits the number of customers, and associated contracts, that they can handle on one network. Creating more cell phone towers in a given area would be costly, and largely useless until particular spikes in traffic. When the peak usable amount of one tower is reached, it may cause negative public relations towards the reliability of the corporation as a whole.

AppValley

AppValley is an independent American digital distribution service operated and trademarked by AppValley LLC. It serves as an alternative app store for the iOS mobile operating system, which allows users to download applications that are not available on the App Store, most commonly tweaked "++" apps, jailbreak apps, and apps including paid apps on the app store. == Legality == AppValley is among several services that violate enterprise developer certificates from Apple. The terms under which these are granted make clear that they are for companies who wish to distribute apps to their employees. AppValley uses these certificates to distribute software directly to non-employees, thereby bypassing the AppStore. AppValley's conduct had implications in U.S. sanctioned markets like Iran, Iraq, North Korea, Cuba, and Venezuela, which have all been subject to commercial sanctions. Among the software offered by AppValley and other services is pirated software, including paid apps on the app store and premium versions of Instagram, Spotify, Pokémon Go, and others. For instance, AppValley distributes an ad-free version of the music streaming app Spotify even on the free tier. == History == The website was founded in May 2017, releasing late that month with a very basic version of the app. There were less than 100 apps available for download at this time. On Jan 19, 2018, a new version dubbed AppValley 2.0 was released bringing dark mode, more categories, a search, and a much faster interface. On February 14, 2019, a Chinese partner "Jason Wu" allegedly took control of the main Twitter account and domain, causing the original AppValley developers to migrate to the domain app-valley.vip and the Twitter account handle @App_Valley_vip. As of September 2024, the app-valley.vip domain now redirects to appvalley.signulous.com. Today, AppValley continues to offer an alternative to Apple's App Store where app developers can publish their applications. == Features == AppValley is a mobile app installer which can also support iOS version that can be installed and downloaded on the mobile or the devices of the people who wish to get access to many different applications available. AppValley also contains apps that have been modified or tweaked for user preferences, and allows the user to by pass national restrictions on the use of apps, without having to resort to jailbreaking. As of June 2, 2020, there are over 1300 apps available for download.

Spintronics

Spintronics (a portmanteau of spin transport electronics), also known as spin electronics, is the study of the intrinsic spin of the electron and its associated magnetic moment, in addition to its fundamental electronic charge, in solid-state devices. The field of spintronics concerns spin-charge coupling in metallic systems. The analogous effects in insulators fall into the field of multiferroics. Spintronics fundamentally differs from traditional electronics in that, in addition to charge state, electron spins are used as a further degree of freedom, with implications in the efficiency of data storage and transfer. Spintronic systems are most often realised in dilute magnetic semiconductors (DMS) and Heusler alloys and are of particular interest in the field of quantum computing, such as atomtronics computation. == History == Spintronics emerged from discoveries in the 1980s concerning spin-dependent electron transport phenomena in solid-state devices. This includes the observation of spin-polarized electron injection from a ferromagnetic metal to a normal metal by Johnson and Silsbee (1985) and the discovery of giant magnetoresistance independently by Albert Fert et al. and Peter Grünberg et al. (1988). The origin of spintronics can be traced to the ferromagnet/superconductor tunneling experiments pioneered by Meservey and Tedrow and initial experiments on magnetic tunnel junctions by Julliere in the 1970s. The use of semiconductors for spintronics began with the theoretical proposal of a spin field-effect-transistor by Datta and Das in 1990 and of the electric dipole spin resonance by Rashba in 1960. In 2012, persistent spin helices of synchronized electrons were made to persist for more than a nanosecond, a 30-fold increase over earlier efforts, and longer than the duration of a modern processor clock cycle. In 2025, at 60 K (−213.2 °C; −351.7 °F) crystalline nickel(II) iodide (NiI2) was reported to exhibit p-wave magnetism, in which the spins of nickel atoms became arranged in a spiral pattern in two orientations. The orientations can be switched via a small electrical current. Applied in digital devices, this spintronics behavior requires far less current than the conventional charge-based electronics that powers devices such as computers and phones. == Theory == The spin of the electron is an intrinsic angular momentum that is separate from the angular momentum due to its orbital motion. The magnitude of the projection of the electron's spin along an arbitrary axis is 1 2 ℏ {\displaystyle {\tfrac {1}{2}}\hbar } , implying that the electron acts as a fermion by the spin-statistics theorem. Like orbital angular momentum, the spin has an associated magnetic moment, the magnitude of which is expressed as μ = 3 2 q m e ℏ {\displaystyle \mu ={\tfrac {\sqrt {3}}{2}}{\frac {q}{m_{e}}}\hbar } . In a solid, the spins of many electrons can act together to affect the magnetic and electronic properties of a material, for example endowing it with a permanent magnetic moment as in a ferromagnet. In many materials, electron spins are equally present in both the up and the down state, and no transport properties are dependent on spin. A spintronic device requires generation or manipulation of a spin-polarized population of electrons, resulting in an excess of spin up or spin down electrons. The polarization of any spin dependent property X can be written as P X = X ↑ − X ↓ X ↑ + X ↓ {\displaystyle P_{X}={\frac {X_{\uparrow }-X_{\downarrow }}{X_{\uparrow }+X_{\downarrow }}}} . A net spin polarization can be achieved either through creating an equilibrium energy split between spin up and spin down. Methods include putting a material in a large magnetic field (Zeeman effect), the exchange energy present in a ferromagnet or forcing the system out of equilibrium. The period of time that such a non-equilibrium population can be maintained is known as the spin lifetime, τ {\displaystyle \tau } . In a diffusive conductor, a spin diffusion length λ {\displaystyle \lambda } can be defined as the distance over which a non-equilibrium spin population can propagate. Spin lifetimes of conduction electrons in metals are relatively short (typically less than 1 nanosecond). An important research area is devoted to extending this lifetime to technologically relevant timescales. The mechanisms of decay for a spin polarized population can be broadly classified as spin-flip scattering and spin dephasing. Spin-flip scattering is a process inside a solid that does not conserve spin, and can therefore switch an incoming spin up state into an outgoing spin down state. Spin dephasing is the process wherein a population of electrons with a common spin state becomes less polarized over time due to different rates of electron spin precession. In confined structures, spin dephasing can be suppressed, leading to spin lifetimes of milliseconds in semiconductor quantum dots at low temperatures. Superconductors can enhance central effects in spintronics such as magnetoresistance effects, spin lifetimes and dissipationless spin-currents. The simplest method of generating a spin-polarised current in a metal is to pass the current through a ferromagnetic material. The most common applications of this effect involve giant magnetoresistance (GMR) devices. A typical GMR device consists of at least two layers of ferromagnetic materials separated by a spacer layer. When the two magnetization vectors of the ferromagnetic layers are aligned, the electrical resistance will be lower (so a higher current flows at constant voltage) than if the ferromagnetic layers are anti-aligned. This constitutes a magnetic field sensor. Two variants of GMR have been applied in devices: Current-in-plane (CIP), where the electric current flows parallel to the layers and, Current-perpendicular-to-plane (CPP), where the electric current flows in a direction perpendicular to the layers. Other metal-based spintronics devices: Tunnel magnetoresistance (TMR), where CPP transport is achieved by using quantum-mechanical tunneling of electrons through a thin insulator separating ferromagnetic layers. Spin-transfer torque, where a current of spin-polarized electrons is used to control the magnetization direction of ferromagnetic electrodes in the device. Spin-wave logic devices carry information in the phase. Interference and spin-wave scattering can perform logic operations. == Device types == === Spintronic-logic === Non-volatile spin-logic devices to enable scaling are being extensively studied. Spin-transfer, torque-based logic devices that use spins and magnets for information processing have been proposed. These devices are part of the ITRS exploratory road map. Logic-in memory applications are already in the development stage. A 2017 review article can be found in Materials Today. A generalized circuit theory for spintronic integrated circuits has been proposed so that the physics of spin transport can be utilized by SPICE developers and subsequently by circuit and system designers for the exploration of spintronics for "beyond CMOS computing". === Semiconductor === Doped semiconductor materials display dilute ferromagnetism. In recent years, dilute magnetic oxides (DMOs) including ZnO based DMOs and TiO2-based DMOs have been the subject of numerous experimental and computational investigations. N`0 sources (like manganese-doped gallium arsenide (Ga,Mn)As), increase the interface resistance with a tunnel barrier, or using hot-electron injection. Spin detection in semiconductors has been addressed with multiple techniques: Faraday/Kerr rotation of transmitted/reflected photons Circular polarization analysis of electroluminescence Nonlocal spin valve (adapted from Johnson and Silsbee's work with metals) Ballistic spin filtering The latter technique was used to overcome the lack of spin-orbit interaction and materials issues to achieve spin transport in silicon. Because external magnetic fields (and stray fields from magnetic contacts) can cause large Hall effects and magnetoresistance in semiconductors (which mimic spin-valve effects), the only conclusive evidence of spin transport in semiconductors is demonstration of spin precession and dephasing in a magnetic field non-collinear to the injected spin orientation, called the Hanle effect. === Storage media === Antiferromagnetic storage media have been studied as an alternative to ferromagnetism, especially since with antiferromagnetic material the bits can be stored as well as with ferromagnetic material. Instead of the usual definition 0 ↔ 'magnetisation upwards', 1 ↔ 'magnetisation downwards', the states can be, e.g., 0 ↔ 'vertically alternating spin configuration' and 1 ↔ 'horizontally-alternating spin configuration'.). The main advantages of antiferromagnetic material are: insensitivity to data-damaging perturbations by stray fields due to zero net external magnetization; no effect on near particles, implying that antiferromagnetic device elements wo