gCube is an open source software system specifically designed and developed to enact the building and operation of a Data Infrastructure providing their users with a rich array of services suitable for supporting the co-creation of Virtual Research Environments and promoting the implementation of open science workflows and practices. It is at the heart of the D4Science Data Infrastructure. == Overview == It is primarily organised in a number of web service called to offer functionality supporting the phases of knowledge production and sharing. In addition, it consists of a set of software libraries supporting service development, service-to-service integration, and service capabilities extension, and a set of portlets dedicated to realise user interface constituents facilitating the exploitation of one or more services. It is designed and conceived to enact system of systems. In fact, its gCube services rely on standards and mediators to interact with other services as well as are made available by standard and APIs to make it possible for clients to use them. For instance, the DataMiner service implements the Web Processing Service protocol to facilitate clients to execute processes. The set of components dealing with Identity and Access Management rely on Keycloak and federates other IDMs thus making the overall Authentication and the Authorization management compliant with open standards such as OAuth2, User-Managed Access (UMA), and OpenID Connect (OIDC)protocols. The Catalogue relies on DCAT, OAI-PMH, and Catalogue Service for the Web to collect contents from other catalogues and data sources and offers its content by DCAT, OAI-PMH, and a proprietary REST API (gCat REST API). Its Continuous Integration/Continuous Delivery pipeline implemented by Jenkins represents an innovative approach to software delivering conceived to be scalable and easy to maintain and upgrade at a minimal cost. == History == gCube has been developed in the context of the D4Science initiative with the support of several EU projects.
Physical access
Physical access is a term in computer security that refers to the ability of people to physically gain access to a computer system. According to Gregory White, "Given physical access to an office, the knowledgeable attacker will quickly be able to find the information needed to gain access to the organization's computer systems and network." == Attacks and countermeasures == === Attacks === Physical access opens up a variety of avenues for hacking. Michael Meyers notes that "the best network software security measures can be rendered useless if you fail to physically protect your systems," since an intruder could simply walk off with a server and crack the password at his leisure. Physical access also allows hardware keyloggers to be installed. An intruder may be able to boot from a CD or other external media and then read unencrypted data on the hard drive. They may also exploit a lack of access control in the boot loader; for instance, pressing F8 while certain versions of Microsoft Windows are booting, specifying 'init=/bin/sh' as a boot parameter to Linux (usually done by editing the command line in GRUB), etc. One could also use a rogue device to access a poorly secured wireless network; if the signal were sufficiently strong, one might not even need to breach the perimeter. === Countermeasures === IT security standards in the United States typically call for physical access to be limited by locked server rooms, sign-in sheets, etc. Physical access systems and IT security systems have historically been administered by separate departments of organizations, but are increasingly being seen as having interdependent functions needing a single, converged security policy. An IT department could, for instance, check security log entries for suspicious logons occurring after business hours, and then use keycard swipe records from a building access control system to narrow down the list of suspects to those who were in the building at that time. Surveillance cameras might also be used to deter or detect unauthorized access.
Random forest
Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that works by creating a multitude of decision trees during training. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the output is the average of the predictions of the trees. Random forests correct for decision trees' habit of overfitting to their training set. The first algorithm for random decision forests was created in 1995 by Tin Kam Ho using the random subspace method, which, in Ho's formulation, is a way to implement the "stochastic discrimination" approach to classification proposed by Eugene Kleinberg. An extension of the algorithm was developed by Leo Breiman and Adele Cutler, who registered "Random Forests" as a trademark in 2006 (as of 2019, owned by Minitab, Inc.). The extension combines Breiman's "bagging" idea and random selection of features, introduced first by Ho and later independently by Amit and Geman in order to construct a collection of decision trees with controlled variance. == History == The general method of random decision forests was first proposed by Salzberg and Heath in 1993, with a method that used a randomized decision tree algorithm to create multiple trees and then combine them using majority voting. This idea was developed further by Ho in 1995. Ho established that forests of trees splitting with oblique hyperplanes can gain accuracy as they grow without suffering from overtraining, as long as the forests are randomly restricted to be sensitive to only selected feature dimensions. A subsequent work along the same lines concluded that other splitting methods behave similarly, as long as they are randomly forced to be insensitive to some feature dimensions. This observation that a more complex classifier (a larger forest) gets more accurate nearly monotonically is in sharp contrast to the common belief that the complexity of a classifier can only grow to a certain level of accuracy before being hurt by overfitting. The explanation of the forest method's resistance to overtraining can be found in Kleinberg's theory of stochastic discrimination. The early development of Breiman's notion of random forests was influenced by the work of Amit and Geman who introduced the idea of searching over a random subset of the available decisions when splitting a node, in the context of growing a single tree. The idea of random subspace selection from Ho was also influential in the design of random forests. This method grows a forest of trees, and introduces variation among the trees by projecting the training data into a randomly chosen subspace before fitting each tree or each node. Finally, the idea of randomized node optimization, where the decision at each node is selected by a randomized procedure, rather than a deterministic optimization was first introduced by Thomas G. Dietterich. The proper introduction of random forests was made in a paper by Leo Breiman, that has become one of the world's most cited papers. This paper describes a method of building a forest of uncorrelated trees using a CART like procedure, combined with randomized node optimization and bagging. In addition, this paper combines several ingredients, some previously known and some novel, which form the basis of the modern practice of random forests, in particular: Using out-of-bag error as an estimate of the generalization error. Measuring variable importance through permutation. The report also offers the first theoretical result for random forests in the form of a bound on the generalization error which depends on the strength of the trees in the forest and their correlation. == Algorithm == === Preliminaries: decision tree learning === Decision trees are a popular method for various machine learning tasks. Tree learning is almost "an off-the-shelf procedure for data mining", say Hastie et al., "because it is invariant under scaling and various other transformations of feature values, is robust to inclusion of irrelevant features, and produces inspectable models. However, they are seldom accurate". In particular, trees that are grown very deep tend to learn highly irregular patterns: they overfit their training sets, i.e. have low bias, but very high variance. Random forests are a way of averaging multiple deep decision trees, trained on different parts of the same training set, with the goal of reducing the variance. This comes at the expense of a small increase in the bias and some loss of interpretability, but generally greatly boosts the performance in the final model. === Bagging === The training algorithm for random forests applies the general technique of bootstrap aggregating, or bagging, to tree learners. Given a training set X = x1, ..., xn with responses Y = y1, ..., yn, bagging repeatedly (B times) selects a random sample with replacement of the training set and fits trees to these samples: After training, predictions for unseen samples x' can be made by averaging the predictions from all the individual regression trees on x': f ^ = 1 B ∑ b = 1 B f b ( x ′ ) {\displaystyle {\hat {f}}={\frac {1}{B}}\sum _{b=1}^{B}f_{b}(x')} or by taking the plurality vote in the case of classification trees. This bootstrapping procedure leads to better model performance because it decreases the variance of the model, without increasing the bias. This means that while the predictions of a single tree are highly sensitive to noise in its training set, the average of many trees is not, as long as the trees are not correlated. Simply training many trees on a single training set would give strongly correlated trees (or even the same tree many times, if the training algorithm is deterministic); bootstrap sampling is a way of de-correlating the trees by showing them different training sets. Additionally, an estimate of the uncertainty of the prediction can be made as the standard deviation of the predictions from all the individual regression trees on x′: σ = ∑ b = 1 B ( f b ( x ′ ) − f ^ ) 2 B − 1 . {\displaystyle \sigma ={\sqrt {\frac {\sum _{b=1}^{B}(f_{b}(x')-{\hat {f}})^{2}}{B-1}}}.} The number B of samples (equivalently, of trees) is a free parameter. Typically, a few hundred to several thousand trees are used, depending on the size and nature of the training set. B can be optimized using cross-validation, or by observing the out-of-bag error: the mean prediction error on each training sample xi, using only the trees that did not have xi in their bootstrap sample. The training and test error tend to level off after some number of trees have been fit. === From bagging to random forests === The above procedure describes the original bagging algorithm for trees. Random forests also include another type of bagging scheme: they use a modified tree learning algorithm that selects, at each candidate split in the learning process, a random subset of the features. This process is sometimes called "feature bagging". The reason for doing this is the correlation of the trees in an ordinary bootstrap sample: if one or a few features are very strong predictors for the response variable (target output), these features will be selected in many of the B trees, causing them to become correlated. An analysis of how bagging and random subspace projection contribute to accuracy gains under different conditions is given by Ho. Typically, for a classification problem with p {\displaystyle p} features, p {\displaystyle {\sqrt {p}}} (rounded down) features are used in each split. For regression problems the inventors recommend p / 3 {\displaystyle p/3} (rounded down) with a minimum node size of 5 as the default. In practice, the best values for these parameters should be tuned on a case-to-case basis for every problem. === ExtraTrees === Adding one further step of randomization yields extremely randomized trees, or ExtraTrees. As with ordinary random forests, they are an ensemble of individual trees, but there are two main differences: (1) each tree is trained using the whole learning sample (rather than a bootstrap sample), and (2) the top-down splitting is randomized: for each feature under consideration, a number of random cut-points are selected, instead of computing the locally optimal cut-point (based on, e.g., information gain or the Gini impurity). The values are chosen from a uniform distribution within the feature's empirical range (in the tree's training set). Then, of all the randomly chosen splits, the split that yields the highest score is chosen to split the node. Similar to ordinary random forests, the number of randomly selected features to be considered at each node can be specified. Default values for this parameter are p {\displaystyle {\sqrt {p}}} for classification and p {\displaystyle p} for regression, where p {\displaystyle p} is the number of features in the model. === Random forests for high-dimensional data === The basic random forest procedure may
Learning classifier system
Learning classifier systems, or LCS, are a paradigm of rule-based machine learning methods that combine a discovery component (e.g. typically a genetic algorithm in evolutionary computation) with a learning component (performing either supervised learning, reinforcement learning, or unsupervised learning). Learning classifier systems seek to identify a set of context-dependent rules that collectively store and apply knowledge in a piecewise manner in order to make predictions (e.g. behavior modeling, classification, data mining, regression, function approximation, or game strategy). This approach allows complex solution spaces to be broken up into smaller, simpler parts for the reinforcement learning that is inside artificial intelligence research. The founding concepts behind learning classifier systems came from attempts to model complex adaptive systems, using rule-based agents to form an artificial cognitive system (i.e. artificial intelligence). == Methodology == The architecture and components of a given learning classifier system can be quite variable. It is useful to think of an LCS as a machine consisting of several interacting components. Components may be added or removed, or existing components modified/exchanged to suit the demands of a given problem domain (like algorithmic building blocks) or to make the algorithm flexible enough to function in many different problem domains. As a result, the LCS paradigm can be flexibly applied to many problem domains that call for machine learning. The major divisions among LCS implementations are as follows: (1) Michigan-style architecture vs. Pittsburgh-style architecture, (2) reinforcement learning vs. supervised learning, (3) incremental learning vs. batch learning, (4) online learning vs. offline learning, (5) strength-based fitness vs. accuracy-based fitness, and (6) complete action mapping vs best action mapping. These divisions are not necessarily mutually exclusive. For example, XCS, the best known and best studied LCS algorithm, is Michigan-style, was designed for reinforcement learning but can also perform supervised learning, applies incremental learning that can be either online or offline, applies accuracy-based fitness, and seeks to generate a complete action mapping. === Elements of a generic LCS algorithm === Keeping in mind that LCS is a paradigm for genetic-based machine learning rather than a specific method, the following outlines key elements of a generic, modern (i.e. post-XCS) LCS algorithm. For simplicity let us focus on Michigan-style architecture with supervised learning. See the illustrations on the right laying out the sequential steps involved in this type of generic LCS. ==== Environment ==== The environment is the source of data upon which an LCS learns. It can be an offline, finite training dataset (characteristic of a data mining, classification, or regression problem), or an online sequential stream of live training instances. Each training instance is assumed to include some number of features (also referred to as attributes, or independent variables), and a single endpoint of interest (also referred to as the class, action, phenotype, prediction, or dependent variable). Part of LCS learning can involve feature selection, therefore not all of the features in the training data need to be informative. The set of feature values of an instance is commonly referred to as the state. For simplicity let's assume an example problem domain with Boolean/binary features and a Boolean/binary class. For Michigan-style systems, one instance from the environment is trained on each learning cycle (i.e. incremental learning). Pittsburgh-style systems perform batch learning, where rule sets are evaluated in each iteration over much or all of the training data. ==== Rule/classifier/population ==== A rule is a context dependent relationship between state values and some prediction. Rules typically take the form of an {IF:THEN} expression, (e.g. {IF 'condition' THEN 'action'}, or as a more specific example, {IF 'red' AND 'octagon' THEN 'stop-sign'}). A critical concept in LCS and rule-based machine learning alike, is that an individual rule is not in itself a model, since the rule is only applicable when its condition is satisfied. Think of a rule as a "local-model" of the solution space. Rules can be represented in many different ways to handle different data types (e.g. binary, discrete-valued, ordinal, continuous-valued). Given binary data LCS traditionally applies a ternary rule representation (i.e. rules can include either a 0, 1, or '#' for each feature in the data). The 'don't care' symbol (i.e. '#') serves as a wild card within a rule's condition allowing rules, and the system as a whole to generalize relationships between features and the target endpoint to be predicted. Consider the following rule (#1###0 ~ 1) (i.e. condition ~ action). This rule can be interpreted as: IF the second feature = 1 AND the sixth feature = 0 THEN the class prediction = 1. We would say that the second and sixth features were specified in this rule, while the others were generalized. This rule, and the corresponding prediction are only applicable to an instance when the condition of the rule is satisfied by the instance. This is more commonly referred to as matching. In Michigan-style LCS, each rule has its own fitness, as well as a number of other rule-parameters associated with it that can describe the number of copies of that rule that exist (i.e. the numerosity), the age of the rule, its accuracy, or the accuracy of its reward predictions, and other descriptive or experiential statistics. A rule along with its parameters is often referred to as a classifier. In Michigan-style systems, classifiers are contained within a population [P] that has a user defined maximum number of classifiers. Unlike most stochastic search algorithms (e.g. evolutionary algorithms), LCS populations start out empty (i.e. there is no need to randomly initialize a rule population). Classifiers will instead be initially introduced to the population with a covering mechanism. In any LCS, the trained model is a set of rules/classifiers, rather than any single rule/classifier. In Michigan-style LCS, the entire trained (and optionally, compacted) classifier population forms the prediction model. ==== Matching ==== One of the most critical and often time-consuming elements of an LCS is the matching process. The first step in an LCS learning cycle takes a single training instance from the environment and passes it to [P] where matching takes place. In step two, every rule in [P] is now compared to the training instance to see which rules match (i.e. are contextually relevant to the current instance). In step three, any matching rules are moved to a match set [M]. A rule matches a training instance if all feature values specified in the rule condition are equivalent to the corresponding feature value in the training instance. For example, assuming the training instance is (001001 ~ 0), these rules would match: (###0## ~ 0), (00###1 ~ 0), (#01001 ~ 1), but these rules would not (1##### ~ 0), (000##1 ~ 0), (#0#1#0 ~ 1). Notice that in matching, the endpoint/action specified by the rule is not taken into consideration. As a result, the match set may contain classifiers that propose conflicting actions. In the fourth step, since we are performing supervised learning, [M] is divided into a correct set [C] and an incorrect set [I]. A matching rule goes into the correct set if it proposes the correct action (based on the known action of the training instance), otherwise it goes into [I]. In reinforcement learning LCS, an action set [A] would be formed here instead, since the correct action is not known. ==== Covering ==== At this point in the learning cycle, if no classifiers made it into either [M] or [C] (as would be the case when the population starts off empty), the covering mechanism is applied (fifth step). Covering is a form of online smart population initialization. Covering randomly generates a rule that matches the current training instance (and in the case of supervised learning, that rule is also generated with the correct action. Assuming the training instance is (001001 ~ 0), covering might generate any of the following rules: (#0#0## ~ 0), (001001 ~ 0), (#010## ~ 0). Covering not only ensures that each learning cycle there is at least one correct, matching rule in [C], but that any rule initialized into the population will match at least one training instance. This prevents LCS from exploring the search space of rules that do not match any training instances. ==== Parameter updates/credit assignment/learning ==== In the sixth step, the rule parameters of any rule in [M] are updated to reflect the new experience gained from the current training instance. Depending on the LCS algorithm, a number of updates can take place at this step. For supervised learning, we can simply update the accuracy/error of a
Defining length
In the field of genetic algorithms, a schema (plural: schemata) is a template that represents a subset of potential solutions. These templates use fixed symbols (e.g., `0` or `1`) for specific positions and a wildcard or "don't care" symbol (often `#` or ``) for others. The defining length of a schema, denoted as L(H), measures the distance between the outermost fixed positions in the template. According to the Schema theorem, a schema with a shorter defining length is less likely to be disrupted by the genetic operator of crossover. As a result, short schemata are considered more robust and are more likely to be propagated to the next generation. In genetic programming, where solutions are often represented as trees, the defining length is the number of links in the minimum tree fragment that includes all the non-wildcard symbols within a schema H. == Example == The defining length is calculated by subtracting the position of the first fixed symbol from the position of the last one. Using 1-based indexing for a string of length 5: The schema `1##0#` has its first fixed symbol (`1`) at position 1 and its last fixed symbol (`0`) at position 4. Its defining length is 4 − 1 = 3. The schema `00##0` has its first fixed symbol at position 1 and its last at position 5. Its defining length is 5 − 1 = 4. The schema `##0##` has only one fixed symbol at position 3. The first and last fixed positions are the same, so its defining length is 3 − 3 = 0.
List of .NET libraries and frameworks
This article contains a list of libraries that can be used in .NET languages. These languages require .NET Framework, Mono, or .NET, which provide a basis for software development, platform independence, language interoperability and extensive framework libraries. Standard Libraries (including the Base Class Library) are not included in this article. == Introduction == Apps created with .NET Framework or .NET run in a software environment known as the Common Language Runtime (CLR), an application virtual machine that provides services such as security, memory management, and exception handling. The framework includes a large class library called Framework Class Library (FCL). Thanks to the hosting virtual machine, different languages that are compliant with the .NET Common Language Infrastructure (CLI) can operate on the same kind of data structures. These languages can therefore use the FCL and other .NET libraries that are also written in one of the CLI compliant languages. When the source code of such languages are compiled, the compiler generates platform-independent code in the Common Intermediate Language (CIL, also referred to as bytecode), which is stored in CLI assemblies. When a .NET app runs, the just-in-time compiler (JIT) turns the CIL code into platform-specific machine code. To improve performance, .NET Framework also comes with the Native Image Generator (NGEN), which performs ahead-of-time compilation to machine code. This architecture provides language interoperability. Each language can use code written in other languages. Calls from one language to another are exactly the same as would be within a single programming language. If a library is written in one CLI language, it can be used in other CLI languages. Moreover, apps that consist only of pure .NET assemblies, can be transferred to any platform that contains an implementation of CLI and run on that platform. For example, apps written using .NET can run on Windows, macOS, and various versions of Linux. .NET apps or their libraries, however, may depend on native platform features, e.g. COM. As such, platform independence of .NET apps depends on the ability to transfer necessary native libraries to target platforms. In 2019, the Windows Forms and Windows Presentation Foundation portions of .NET Framework were made open source. === .NET implementations === There are four primary .NET implementations that are actively developed and maintained: .NET Framework: The original .NET implementation that has existed since 2002. While not yet discontinued, Microsoft does not plan on releasing its next major version, 5.0. Mono: A cross-platform implementation of .NET Framework by Ximian, introduced in 2004. It is free and open-source. It is now developed by Xamarin, a subsidiary of Microsoft. Universal Windows Platform (UWP): An implementation of .NET used for building UWP apps. It's designed to unify development for different targeted types of devices, including PCs, tablets, phablets, phones, and the Xbox. .NET: A cross-platform re-implementation of .NET Framework, introduced in 2016 and initially called .NET Core. It is free and open-source. .NET superseded .NET Framework with the release of .NET 5. Each implementation of .NET includes the following components: One or more runtime environments, e.g. Common Language Runtime (CLR) for .NET Framework and CoreCLR for .NET A class library The .NET Standard is a set of common APIs that are implemented in the Base Class Library of any .NET implementation. The class library of each implementation must implement the .NET Standard, but may also implement additional APIs. Traditionally, .NET apps targeted a certain version of a .NET implementation, e.g. .NET Framework 4.6. Starting with the .NET Standard, an app can target a version of the .NET Standard and then it could be used (without recompiling) by any implementation that supports that level of the standard. This enables portability across different .NET implementations. The following table lists the .NET implementations that adhere to the .NET Standard and the version number at which each implementation became compliant with a given version of .NET Standard. For example, according to this table, .NET Core 3.0 was the first version of .NET Core that adhered to .NET Standard 2.1. This means that any version of .NET Core bigger than 3.0 (e.g. .NET Core 3.1) also adheres to .NET Standard 2.1. == Web frameworks == === ASP.NET === First released in 2002, ASP.NET is an open-source server-side web application framework designed for web development to produce dynamic web pages. It is the successor to Microsoft's Active Server Pages (ASP) technology, built on the Common Language Runtime (CLR). === ASP.NET Core === ASP.NET was completely rewritten in 2016 as a modular web framework, together with other frameworks like Entity Framework. The re-written framework uses the new open-source .NET Compiler Platform (also known by its codename "Roslyn") and is cross platform. The programming models ASP.NET MVC, ASP.NET Web API, and ASP.NET Web Pages (a model using only Razor pages) were merged into a unified MVC 6. === Blazor === Blazor is a free and open-source web framework that enables developers to create Single-page Web apps using C# and HTML in ASP.NET Razor pages ("components"). Blazor is part of the ASP.NET Core framework. Blazor Server apps are hosted on a web server, while Blazor WebAssembly apps are downloaded to the client's web browser before running. In addition, a Blazor Hybrid framework is available with server-based and client-based application components. == Numerical libraries == === Open-source numerical libraries === ==== AForge.NET ==== This is a computer vision and artificial intelligence library. It implements a number of genetic, fuzzy logic and machine learning algorithms with several architectures of artificial neural networks with corresponding training algorithms. ==== ALGLIB ==== This is a cross-platform open source numerical analysis and data processing library. It consists of algorithm collections written in different programming languages (C++, C#, FreePascal, Delphi, VBA) and has dual licensing – commercial and GPL. ==== Math.NET Numerics ==== This library aims to provide methods and algorithms for numerical computations in science, engineering and everyday use. Covered topics include special functions, linear algebra, probability models, random numbers, interpolation, integral transforms and more. MIT/X11 license. ==== Meta.Numerics ==== This is a library for advanced scientific computation in the .NET Framework. ==== ML.NET ==== This is a free software machine learning library. The preview release of ML.NET included transforms for feature engineering like n-gram creation, and learners to handle binary classification, multi-class classification, and regression tasks. Additional ML tasks like anomaly detection and recommendation systems have since been added, and other approaches like deep learning will be included in future versions. === Proprietary numerical libraries === ==== ILNumerics.Net ==== This is a high performance, typesafe numerical array set of classes and functions for general math, FFT and linear algebra. The library, developed for .NET/Mono, aims to provide 32- and 64-bit script-like syntax in C#, 2D & 3D plot controls, and efficient memory management. It is released under GPLv3 or commercial license. ==== Measurement Studio ==== This is an integrated suite of UI controls and class libraries for use in developing test and measurement applications. The analysis class libraries provide various digital signal processing, signal filtering, signal generation, peak detection, and other general mathematical functionality. ==== NMath ==== This is a numerical component library for the .NET platform developed by CenterSpace Software. It includes signal processing (FFT) classes, a linear algebra (LAPACK & BLAS) framework, and a statistics package. == 3D graphics == === Open-source 3D graphics === ==== Open Toolkit (OpenTK) ==== This is a low-level C# binding for OpenGL, OpenGL ES and OpenAL. It runs on Windows, Linux, Mac OS X, BSD, Android and iOS. It can be used standalone or integrated into a GUI. ==== Windows Presentation Foundation (WPF) ==== This is a graphical subsystem for rendering user interfaces, developed by Microsoft. It also contains a 3D rendering engine. In addition, interactive 2D content can be overlaid on 3D surfaces natively. It only runs on Windows operating systems. === Proprietary 3D graphics === ==== Unity ==== This is a cross-platform game engine developed by Unity Technologies and used to develop video games for PC, consoles, mobile devices and websites. == Image processing == === AForge.NET === This is a computer vision and artificial intelligence library. It implements a number of image processing algorithms and filters. It is released under the LGPLv3 and partly GPLv3 license. Majority of the library is written in C# and th
Markov model
In probability theory, a Markov model is a stochastic model used to model pseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes the Markov property). Generally, this assumption enables reasoning and computation with the model that would otherwise be intractable. For this reason, in the fields of predictive modelling and probabilistic forecasting, it is desirable for a given model to exhibit the Markov property. == Introduction == Andrey Andreyevich Markov (14 June 1856 – 20 July 1922) was a Russian mathematician best known for his work on stochastic processes. A primary subject of his research later became known as the Markov chain. There are four common Markov models used in different situations, depending on whether every sequential state is observable or not, and whether the system is to be adjusted on the basis of observations made: == Markov chain == The simplest Markov model is the Markov chain. It models the state of a system with a random variable that changes through time. In this context, the Markov property indicates that the distribution for this variable depends only on the distribution of a previous state. An example use of a Markov chain is Markov chain Monte Carlo, which uses the Markov property to prove that a particular method for performing a random walk will sample from the joint distribution. == Hidden Markov model == A hidden Markov model is a Markov chain for which the state is only partially observable or noisily observable. In other words, observations are related to the state of the system, but they are typically insufficient to precisely determine the state. Several well-known algorithms for hidden Markov models exist. For example, given a sequence of observations, the Viterbi algorithm will compute the most-likely corresponding sequence of states, the forward algorithm will compute the probability of the sequence of observations, and the Baum–Welch algorithm will estimate the starting probabilities, the transition function, and the observation function of a hidden Markov model. One common use is for speech recognition, where the observed data is the speech audio waveform and the hidden state is the spoken text. In this example, the Viterbi algorithm finds the most likely sequence of spoken words given the speech audio. == Markov decision process == A Markov decision process is a Markov chain in which state transitions depend on the current state and an action vector that is applied to the system. Typically, a Markov decision process is used to compute a policy of actions that will maximize some utility with respect to expected rewards. == Partially observable Markov decision process == A partially observable Markov decision process (POMDP) is a Markov decision process in which the state of the system is only partially observed. POMDPs are known to be NP complete, but recent approximation techniques have made them useful for a variety of applications, such as controlling simple agents or robots. == Markov random field == A Markov random field, or Markov network, may be considered to be a generalization of a Markov chain in multiple dimensions. In a Markov chain, state depends only on the previous state in time, whereas in a Markov random field, each state depends on its neighbors in any of multiple directions. A Markov random field may be visualized as a field or graph of random variables, where the distribution of each random variable depends on the neighboring variables with which it is connected. More specifically, the joint distribution for any random variable in the graph can be computed as the product of the "clique potentials" of all the cliques in the graph that contain that random variable. Modeling a problem as a Markov random field is useful because it implies that the joint distributions at each vertex in the graph may be computed in this manner. == Hierarchical Markov models == Hierarchical Markov models can be applied to categorize human behavior at various levels of abstraction. For example, a series of simple observations, such as a person's location in a room, can be interpreted to determine more complex information, such as in what task or activity the person is performing. Two kinds of Hierarchical Markov Models are the Hierarchical hidden Markov model and the Abstract Hidden Markov Model. Both have been used for behavior recognition and certain conditional independence properties between different levels of abstraction in the model allow for faster learning and inference. == Tolerant Markov model == A Tolerant Markov model (TMM) is a probabilistic-algorithmic Markov chain model. It assigns the probabilities according to a conditioning context that considers the last symbol, from the sequence to occur, as the most probable instead of the true occurring symbol. A TMM can model three different natures: substitutions, additions or deletions. Successful applications have been efficiently implemented in DNA sequences compression. == Markov-chain forecasting models == Markov-chains have been used as a forecasting methods for several topics, for example price trends, wind power and solar irradiance. The Markov-chain forecasting models utilize a variety of different settings, from discretizing the time-series to hidden Markov-models combined with wavelets and the Markov-chain mixture distribution model (MCM).