Meta-learning (computer science)

Meta-learning (computer science)

Meta-learning is a subfield of machine learning where automatic learning algorithms are applied to metadata about machine learning experiments. As of 2017, the term had not found a standard interpretation, however the main goal is to use such metadata to understand how automatic learning can become flexible in solving learning problems, hence to improve the performance of existing learning algorithms or to learn (induce) the learning algorithm itself, hence the alternative term learning to learn. Flexibility is important because each learning algorithm is based on a set of assumptions about the data, its inductive bias. This means that it will only learn well if the bias matches the learning problem. A learning algorithm may perform very well in one domain, but not on the next. This poses strong restrictions on the use of machine learning or data mining techniques, since the relationship between the learning problem (often some kind of database) and the effectiveness of different learning algorithms is not yet understood. By using different kinds of metadata, like properties of the learning problem, algorithm properties (like performance measures), or patterns previously derived from the data, it is possible to learn, select, alter or combine different learning algorithms to effectively solve a given learning problem. Critiques of meta-learning approaches bear a strong resemblance to the critique of metaheuristic, a possibly related problem. A good analogy to meta-learning, and the inspiration for Jürgen Schmidhuber's early work (1987) and Yoshua Bengio et al.'s work (1991), considers that genetic evolution learns the learning procedure encoded in genes and executed in each individual's brain. In an open-ended hierarchical meta-learning system using genetic programming, better evolutionary methods can be learned by meta evolution, which itself can be improved by meta meta evolution, etc. == Definition == A proposed definition for a meta-learning system combines three requirements: The system must include a learning subsystem. Experience is gained by exploiting meta knowledge extracted in a previous learning episode on a single dataset, or from different domains. Learning bias must be chosen dynamically. Bias refers to the assumptions that influence the choice of explanatory hypotheses and not the notion of bias represented in the bias-variance dilemma. Meta-learning is concerned with two aspects of learning bias. Declarative bias specifies the representation of the space of hypotheses, and affects the size of the search space (e.g., represent hypotheses using linear functions only). Procedural bias imposes constraints on the ordering of the inductive hypotheses (e.g., preferring smaller hypotheses). == Common approaches == There are three common approaches: using (cyclic) networks with external or internal memory (model-based) learning effective distance metrics (metrics-based) explicitly optimizing model parameters for fast learning (optimization-based). === Model-Based === Model-based meta-learning models updates its parameters rapidly with a few training steps, which can be achieved by its internal architecture or controlled by another meta-learner model. ==== Memory-Augmented Neural Networks ==== A Memory-Augmented Neural Network, or MANN for short, is claimed to be able to encode new information quickly and thus to adapt to new tasks after only a few examples. ==== Meta Networks ==== Meta Networks (MetaNet) learns a meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization. === Metric-Based === The core idea in metric-based meta-learning is similar to nearest neighbors algorithms, which weight is generated by a kernel function. It aims to learn a metric or distance function over objects. The notion of a good metric is problem-dependent. It should represent the relationship between inputs in the task space and facilitate problem solving. ==== Convolutional Siamese Neural Network ==== Siamese neural network is composed of two twin networks whose output is jointly trained. There is a function above to learn the relationship between input data sample pairs. The two networks are the same, sharing the same weight and network parameters. ==== Matching Networks ==== Matching Networks learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. ==== Relation Network ==== The Relation Network (RN), is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting. ==== Prototypical Networks ==== Prototypical Networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve satisfied results. === Optimization-Based === What optimization-based meta-learning algorithms intend for is to adjust the optimization algorithm so that the model can be good at learning with a few examples. ==== LSTM Meta-Learner ==== LSTM-based meta-learner is to learn the exact optimization algorithm used to train another learner neural network classifier in the few-shot regime. The parametrization allows it to learn appropriate parameter updates specifically for the scenario where a set amount of updates will be made, while also learning a general initialization of the learner (classifier) network that allows for quick convergence of training. ==== Temporal Discreteness ==== Model-Agnostic Meta-Learning (MAML) is a fairly general optimization algorithm, compatible with any model that learns through gradient descent. ==== Reptile ==== Reptile is a remarkably simple meta-learning optimization algorithm, given that both of its components rely on meta-optimization through gradient descent and both are model-agnostic. == Examples == Some approaches which have been viewed as instances of meta-learning: Recurrent neural networks (RNNs) are universal computers. In 1993, Jürgen Schmidhuber showed how "self-referential" RNNs can in principle learn by backpropagation to run their own weight change algorithm, which may be quite different from backpropagation. In 2001, Sepp Hochreiter & A.S. Younger & P.R. Conwell built a successful supervised meta-learner based on Long short-term memory RNNs. It learned through backpropagation a learning algorithm for quadratic functions that is much faster than backpropagation. Researchers at Deepmind (Marcin Andrychowicz et al.) extended this approach to optimization in 2017. In the 1990s, Meta Reinforcement Learning or Meta RL was achieved in Schmidhuber's research group through self-modifying policies written in a universal programming language that contains special instructions for changing the policy itself. There is a single lifelong trial. The goal of the RL agent is to maximize reward. It learns to accelerate reward intake by continually improving its own learning algorithm which is part of the "self-referential" policy. An extreme type of Meta Reinforcement Learning is embodied by the Gödel machine, a theoretical construct which can inspect and modify any part of its own software which also contains a general theorem prover. It can achieve recursive self-improvement in a provably optimal way. Model-Agnostic Meta-Learning (MAML) was introduced in 2017 by Chelsea Finn et al. Given a sequence of tasks, the parameters of a given model are trained such that few iterations of gradient descent with few training data from a new task will lead to good generalization performance on that task. MAML "trains the model to be easy to fine-tune." MAML was successfully applied to few-shot image classification benchmarks and to policy-gradient-based reinforcement learning. Variational Bayes-Adaptive Deep RL (VariBAD) was introduced in 2019. While MAML is optimization-based, VariBAD is a model-based method for meta reinforcement learning, and leverages a variational autoencoder to capture the task information in an internal memory, thus conditioning its decision making on the task. When addressing a set of tasks, most meta learning approaches optimize the average score across all tasks. Hence, certain tasks may be sacrificed in favor of the average score, which is often unacceptable in real-world applications. By contrast, Robust Meta Reinforcement Learning (RoML) focuses on improving low-score tasks, increasing robustness to the selection of task. RoML works as a meta-algorithm, as it can be applied on top of other meta learning algorithms (such as MAML and VariBAD) to increase their robustness. It is applicable to both supervised meta learning and meta reinforcement learning. Discovering meta-knowledge works by inducing knowledge

Truth discovery

Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g birthday of a person, capital city of a country). While in the second case multiple true values are allowed (e.g. cast of a movie, authors of a book). Typically, truth discovery is the last step of a data integration pipeline, when the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes more and more probable to find that different sources provide (partially or completely) different values for the same data item. This, together with the fact that we are increasing our reliance on data to derive important decisions, motivates the need of developing good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies, have shown that, if we rely only on majority voting, we could get wrong results even in 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be exploited to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible since the number of needed labeled examples should be proportional to the number of sources, and in many applications the number of sources can be prohibitive. == Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose to each other; values and sources can either be correct or erroneous. While in the multi-truth case the following properties hold: the truth is composed by a set of values; different values could provide a partial truth; claiming one value for a given data item does not imply opposing to all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences of the two methods. Knowing that in both examples the truth is provided by source 1, in the single truth case (first table) we can say that sources 2 and 3 oppose to the truth and as a result provide wrong values. On the other hand, in the second case (second table), sources 2 and 3 are neither correct nor erroneous, they instead provide a subset of the true values and at the same time they do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is select as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources, more importance is indeed given to votes coming from trusted sources. Source trustworthiness usually is not known a priori but estimated with an iterative approach. At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values that in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as accuracy of provided values, copying values from other sources and domain coverage. Detecting copying behaviors is very important, in fact, copy allows to spread false values easily making truth discovery very hard, since many sources would vote for the wrong values. Usually systems decrease the weight of votes associated to copied values or even don’t count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. Below are reported some of the characteristics of the most relevant typologies of single-truth methods and how different systems model source trustworthiness. === Majority voting === Majority voting is the simplest method, the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performances of more complex methods. === Web-link based === These methods estimate source trustworthiness exploiting a similar technique to the one used to measure authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. Source trustworthiness is computed as the cosine similarity (or other similarity measures) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. P ( v ∣ ψ ( o ) ) = P ( ψ ( o ) ∣ v ) ⋅ P ( v ) P ( ψ ( o ) ) {\displaystyle P(v\mid \psi (o))={\frac {P(\psi (o)\mid v)\cdot P(v)}{P(\psi (o))}}} where v {\displaystyle \textstyle v} is a value provided for a data item o {\displaystyle \textstyle o} and ψ ( o ) {\displaystyle \textstyle \psi (o)} is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of the multi-truth discovery Below are reported two typologies of multi-truth methods and their characteristics. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between set of provided values and sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. === Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of given data item and also to assess source quality without need of any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include: healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could be also used to revolutionize the way in which web pages are ranked in search engines, going from current methods based on link analysis like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.

Unspent transaction output

In cryptocurrencies, an unspent transaction output (UTXO, often capitalized as UTxO) is a distinctive element in a subset of digital currency models. A UTXO represents a certain amount of cryptocurrency that has been authorized by a sender and is available to be spent by a recipient. The utilization of UTXOs in transaction processes is a key feature of many cryptocurrencies, but it primarily characterizes those implementing the UTXO model. UTXOs employ public key cryptography to ascertain and transfer ownership. More specifically, the recipient's public key is formatted into the UTXO, thereby limiting the capability to spend the UTXO to the account that can demonstrate ownership of the corresponding private key. A valid digital signature associated with the public key must be included for the UTXO to be spent. In the UTXO model, each unit of currency is treated as a discrete object. The history of a UTXO is documented only within the blocks where it is transferred. To ascertain the total balance of an account, one must scan each block to find the latest UTXOs linked to that account. While all nodes within a blockchain network must consent to the block history, the blocks relevant to an account's balance are unique to that account. UTXOs constitute a chain of ownership depicted as a series of digital signatures dating back to the coin's inception, regardless of whether the coin was minted via mining, staking, or another procedure determined by the cryptocurrency protocol. The UTXO model was invented for Bitcoin. Cardano uses an extended version of the UTXO model known as EUTXO. == Origins == The conceptual framework of the UTXO model can be traced back to Hal Finney's Reusable Proofs of Work proposal, which itself was based on Adam Back's 1997 Hashcash proposal. Bitcoin, released in 2009, was the first widespread implementation of the UTXO model in practice. == UTXO model vs. account Model == Cryptocurrencies that utilize the UTXO model function differently compared to those using the account model. In the UTXO model, individual units of cryptocurrency, termed as unspent transaction outputs (UTXOs), are transferred between users, analogous to the exchange of physical cash. This model impacts how transactions and ownership are recorded and verified within the blockchain network. The account model preserves a record of each account and its corresponding balance for every block added to the network. This setup enables quicker balance verification without the need to scan historical blocks, but it increases the raw size of each block (though data compression techniques can be utilized to alleviate this). However, both models necessitate the inspection of past blocks to fully authenticate the origin of coins. In the UTXO model, each object is immutable - units of coins cannot be 'edited' in the same way an account balance is modified when a transaction occurs. Rather, the balance is computed from the transaction history dating back to when the coins were first minted. This simplicity enhances security, as a UTXO either exists in its anticipated form or it does not. In contrast, the account model requires meticulous verification of the account's status during transactions, which can lead to oversights if not conducted correctly. In valid blockchain transactions, only unspent outputs (UTXOs) are permissible for funding subsequent transactions. This requirement is critical to prevent double-spending and fraud. Accordingly, inputs in a transaction are removed from the UTXO set, while outputs create new UTXOs that are added to the set. The holders of private keys, such as those with cryptocurrency wallets, can utilize these UTXOs for future transactions.

Visible (mobile app)

Visible is a health tracking mobile app for people with long COVID and myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). The company was founded by a Harry Leeming, an engineer from London living with long Covid since 2020, and Luke Martin-Fuller. In November 2022, Visible released an open beta of an app that aims to help people pace their activities to avoid post-exertional malaise. The app gathers data on exertion levels, symptom severity, and heart-rate variability. HRV is approximated using a smartphone's camera via a technique called photoplethysmography, and according to the app's developers, can indicate how much someone needs rest. The app is currently free, but is expected to be freemium in the future. Users can also opt to allow their data be used for research purposes. In July 2023, Visible and Imperial College London announced the start of the first two studies. One is on the effects of the menstrual cycle on long COVID symptoms, and the other is on the condition's epidemiology and economic impact. Visible has announced plans to couple the app with activity trackers for continuous monitoring of heart-rate and actimetry data, which the developers claim will be more effective. As of 2022, no clinical trials on Visible's effectiveness have been conducted.

Truth discovery

Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g birthday of a person, capital city of a country). While in the second case multiple true values are allowed (e.g. cast of a movie, authors of a book). Typically, truth discovery is the last step of a data integration pipeline, when the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes more and more probable to find that different sources provide (partially or completely) different values for the same data item. This, together with the fact that we are increasing our reliance on data to derive important decisions, motivates the need of developing good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies, have shown that, if we rely only on majority voting, we could get wrong results even in 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be exploited to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible since the number of needed labeled examples should be proportional to the number of sources, and in many applications the number of sources can be prohibitive. == Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose to each other; values and sources can either be correct or erroneous. While in the multi-truth case the following properties hold: the truth is composed by a set of values; different values could provide a partial truth; claiming one value for a given data item does not imply opposing to all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences of the two methods. Knowing that in both examples the truth is provided by source 1, in the single truth case (first table) we can say that sources 2 and 3 oppose to the truth and as a result provide wrong values. On the other hand, in the second case (second table), sources 2 and 3 are neither correct nor erroneous, they instead provide a subset of the true values and at the same time they do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is select as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources, more importance is indeed given to votes coming from trusted sources. Source trustworthiness usually is not known a priori but estimated with an iterative approach. At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values that in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as accuracy of provided values, copying values from other sources and domain coverage. Detecting copying behaviors is very important, in fact, copy allows to spread false values easily making truth discovery very hard, since many sources would vote for the wrong values. Usually systems decrease the weight of votes associated to copied values or even don’t count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. Below are reported some of the characteristics of the most relevant typologies of single-truth methods and how different systems model source trustworthiness. === Majority voting === Majority voting is the simplest method, the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performances of more complex methods. === Web-link based === These methods estimate source trustworthiness exploiting a similar technique to the one used to measure authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. Source trustworthiness is computed as the cosine similarity (or other similarity measures) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. P ( v ∣ ψ ( o ) ) = P ( ψ ( o ) ∣ v ) ⋅ P ( v ) P ( ψ ( o ) ) {\displaystyle P(v\mid \psi (o))={\frac {P(\psi (o)\mid v)\cdot P(v)}{P(\psi (o))}}} where v {\displaystyle \textstyle v} is a value provided for a data item o {\displaystyle \textstyle o} and ψ ( o ) {\displaystyle \textstyle \psi (o)} is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of the multi-truth discovery Below are reported two typologies of multi-truth methods and their characteristics. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between set of provided values and sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. === Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of given data item and also to assess source quality without need of any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include: healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could be also used to revolutionize the way in which web pages are ranked in search engines, going from current methods based on link analysis like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.

Neighborhood operation

In computer vision and image processing a neighborhood operation is a commonly used class of computations on image data which implies that it is processed according to the following pseudo code: Visit each point p in the image data and do { N = a neighborhood or region of the image data around the point p result(p) = f(N) } This general procedure can be applied to image data of arbitrary dimensionality. Also, the image data on which the operation is applied does not have to be defined in terms of intensity or color, it can be any type of information which is organized as a function of spatial (and possibly temporal) variables in p. The result of applying a neighborhood operation on an image is again something which can be interpreted as an image, it has the same dimension as the original data. The value at each image point, however, does not have to be directly related to intensity or color. Instead it is an element in the range of the function f, which can be of arbitrary type. Normally the neighborhood N is of fixed size and is a square (or a cube, depending on the dimensionality of the image data) centered on the point p. Also the function f is fixed, but may in some cases have parameters which can vary with p, see below. In the simplest case, the neighborhood N may be only a single point. This type of operation is often referred to as a point-wise operation. == Examples == The most common examples of a neighborhood operation use a fixed function f which in addition is linear, that is, the computation consists of a linear shift invariant operation. In this case, the neighborhood operation corresponds to the convolution operation. A typical example is convolution with a low-pass filter, where the result can be interpreted in terms of local averages of the image data around each image point. Other examples are computation of local derivatives of the image data. It is also rather common to use a fixed but non-linear function f. This includes median filtering, and computation of local variances. The Nagao-Matsuyama filter is an example of a complex local neighbourhood operation that uses variance as an indicator of the uniformity within a pixel group. The result is similar to a convolution with a low-pass filter with the added effect of preserving sharp edges. There is also a class of neighborhood operations in which the function f has additional parameters which can vary with p: Visit each point p in the image data and do { N = a neighborhood or region of the image data around the point p result(p) = f(N, parameters(p)) } This implies that the result is not shift invariant. Examples are adaptive Wiener filters. == Implementation aspects == The pseudo code given above suggests that a neighborhood operation is implemented in terms of an outer loop over all image points. However, since the results are independent, the image points can be visited in arbitrary order, or can even be processed in parallel. Furthermore, in the case of linear shift-invariant operations, the computation of f at each point implies a summation of products between the image data and the filter coefficients. The implementation of this neighborhood operation can then be made by having the summation loop outside the loop over all image points. An important issue related to neighborhood operation is how to deal with the fact that the neighborhood N becomes more or less undefined for points p close to the edge or border of the image data. Several strategies have been proposed: Compute result only for points p for which the corresponding neighborhood is well-defined. This implies that the output image will be somewhat smaller than the input image. Zero padding: Extend the input image sufficiently by adding extra points outside the original image which are set to zero. The loops over the image points described above visit only the original image points. Border extension: Extend the input image sufficiently by adding extra points outside the original image which are set to the image value at the closest image point. The loops over the image points described above visit only the original image points. Mirror extension: Extend the image sufficiently much by mirroring the image at the image boundaries. This method is less sensitive to local variations at the image boundary than border extension. Wrapping: The image is tiled, so that going off one edge wraps around to the opposite side of the image. This method assumes that the image is largely homogeneous, for example a stochastic image texture without large textons.

Path tracing

Path tracing is a rendering algorithm in computer graphics that simulates how light interacts with objects and participating media to generate realistic (physically plausible) images. It is based on earlier, more limited, ray tracing algorithms. Path tracing is used to create photorealistic images for artistic purposes, and for applications such as architectural rendering and product design. It is also used to render frames for animated films, and visual effects for film and television. Because it can be very accurate and unbiased, it is commonly used to generate reference images when testing the quality of other rendering algorithms. The technique uses the Monte Carlo method to compute estimates of global illumination and simulate the ways different materials reflect (or scatter), transmit, absorb, and emit light. It can incorporate simple modeling of the effects of aperture and lens (depth of field, and bokeh) and shutter speed (motion blur), or more realistic simulation of the optical components in a camera. The algorithm works by describing illumination in a scene using the rendering equation, or light transport equation, and finding an approximate solution using Monte Carlo integration. An inefficient (but accurate) version of the algorithm can be very simple, and involves tracing a ray from the camera, allowing this ray to bounce in random directions as it hits different objects in the scene, and computing the amount of light transmitted along the path to the camera whenever the path encounters a light source. This process is repeated many times for each pixel (each repetition, with generated path and transmitted light, is called a sample), and the results are averaged. One main difference between this algorithm and standard ray tracing is that a single unbranching path is traced each time, while "Whitted-style" or "Cook-style" ray tracing recursively samples branching paths (e.g. when light is both reflected and refracted by a glass object). More practical versions incorporate improvements such as quasi-Monte Carlo methods (techniques that distribute samples more evenly), importance sampling (take more samples of paths that are likely to transport more light), and next event estimation (allow a very limited form of branching, and sample additional paths that connect to the lights more directly). Because path tracing uses random samples there is noise in the final image, which decreases as more samples are taken. Images commonly require many thousands of samples per pixel (spp) to reduce noise to an acceptable level, and denoising techniques (e.g. based on neural networks) are often used. Denoising is usually necessary when path tracing is used for real-time rendering in video games, because relatively few samples can be taken. Many alternative algorithms for path tracing have been developed, although they do not always outperform more straightforward implementations. These include bidirectional path tracing (which traces paths forwards from the light source as well as backwards from the camera), Metropolis light transport, and ways of combining path tracing with photon mapping. Video games often use biased versions of path tracing to improve performance (e.g. limiting the number of bounces in each path). A family of techniques called ReSTIR has been developed that can help real-time path tracing by sharing data between nearby pixels and consecutive frames. == History == Like all ray tracing methods, path tracing is based on ray casting, which Arthur Appel used for computer graphics rendering in the late 1960s. In 1980, John Turner Whitted published a recursive ray tracing algorithm that allows rendering images of scenes containing mirrored surfaces and refractive transparent objects. In 1984, Cook et al. described a form of ray tracing called distributed ray tracing, which uses Monte Carlo integration to render effects such as depth of field, motion blur, reflection from rough surfaces, and area lights. The same year, the radiosity method (not a ray tracing method) was published, which was the first physically based method for rendering diffuse global illumination. In 1986, Jim Kajiya published a paper exploring how to use distributed ray tracing to render physically-based global illumination, and this paper also introduced and named the method called "path tracing". Path tracing and other distributed ray tracing techniques were further refined in the late 1980s and early 1990s by researchers such as James Arvo and Peter Shirley, and by Greg Ward in the open source Radiance software. Despite being theoretically able to render any lighting, the original form of path tracing can sometimes be very inefficient (or noisy) for rendering light that is reflected or refracted before illuminating a visible surface, including diffuse global illumination where light enters an area through narrow gaps, because it traces paths only from the camera. To address this, variations of path tracing that trace paths from both the camera and from light sources, called bidirectional path tracing, were published in 1993 by Eric Lafortune and Yves Willems, and in 1997 by Eric Veach and Leonidas Guibas. In 1997 Veach and Guibas also published an alternative method called Metropolis light transport, which combines bidirectional path tracing with the Metropolis method. Veach's lengthy Ph.D. dissertation described both techniques, along with the theoretical background of path tracing; later, the book Physically Based Rendering (which won an Academy Award for Technical Achievement in 2014) helped to make information about path tracing more widely available. Path tracing requires tracing a large number of paths of light in order to produce an image with a visually acceptable amount of noise. This made path tracing very slow on computers available in the 1980s and 1990s, and noise remained a problem when trying to reproduce the style of earlier computer graphics animated films. Most animated films produced until around 2010, by studios such as Pixar, used rasterization-based rendering, with ray tracing used selectively for reflections (and later for precomputed or cached global illumination). However the speed of computers rapidly increased during the 1990s. Blue Sky Studios pioneered using Monte Carlo ray tracing for global illumination in animation, including in the 1998 short film "Bunny", but they did not disclose the precise techniques used. Path tracing gradually become more practical for film production in the early 2000s. The Arnold renderer, developed by Marcos Fajardo, was used by Sony Pictures Imageworks to produce the feature-length film Monster House, released in 2006. Pixar rewrote their RenderMan software to use path tracing, and released their first feature-length path-traced film Finding Dory in 2016. Although path tracing still had a large computational cost, animation studios discovered that less human labor was required when using it, for example because global illumination no longer needed to be faked by manually placing lights. The amount of noise present in path traced images still caused difficulties, particularly when rendering motion blur (which was used extensively by earlier animated films) but denoising techniques were developed to address this. New techniques were also needed for rendering hair and fur, and to handle the extremely large scenes sometimes required by films. Renderers such as Arnold, and Disney's Hyperion, originally only used CPUs for rendering, but as GPUs became more capable (and APIs such as CUDA, OpenCL, and OptiX were released) researchers and developers began adapting algorithms and implementations to use GPUs. GPUs can dramatically reduce rendering time: for example using a high-end GPU to accelerate portions of the rendering code can make it over 30 times faster than using only a high-end CPU. == Description == Kajiya's 1986 paper defined a recursive integral equation called the rendering equation, which describes a simplified form of light transport. Using Monte Carlo integration for the integral on the right side of the equation leads fairly directly to the path tracing algorithm: I ( x , x ′ ) = g ( x , x ′ ) [ ϵ ( x , x ′ ) + ∫ S ρ ( x , x ′ , x ″ ) I ( x ′ , x ″ ) d x ″ ] {\displaystyle I(x,x')=g(x,x')\left[\epsilon (x,x')+\int _{S}\rho (x,x',x'')I(x',x'')dx''\right]} This expresses I(x,x'), the light arriving at point x from point x', as the product of a geometry term, g(x,x'), which is 0 if there is something blocking the light between the two points and 1 otherwise, and the amount of light leaving point x' and traveling towards x. The light leaving point x' is the sum of the light emitted by the surface at x', and the integral of the light arriving at x' from all other points in the scene (the integration domain S) and being reflected towards x. The factor ρ(x,x',x''), which calculates how much light is reflected, must take into account the angles at which the light is arriving and leaving, and