Manifold regularization

Manifold regularization

In machine learning, manifold regularization is a technique for using the shape of a dataset to constrain the functions that should be learned on that dataset. In many machine learning problems, the data to be learned do not cover the entire input space. For example, a facial recognition system may not need to classify any possible image, but only the subset of images that contain faces. The technique of manifold learning assumes that the relevant subset of data comes from a manifold, a mathematical structure with useful properties. The technique also assumes that the function to be learned is smooth: data with different labels are not likely to be close together, and so the labeling function should not change quickly in areas where there are likely to be many data points. Because of this assumption, a manifold regularization algorithm can use unlabeled data to inform where the learned function is allowed to change quickly and where it is not, using an extension of the technique of Tikhonov regularization. Manifold regularization algorithms can extend supervised learning algorithms in semi-supervised learning and transductive learning settings, where unlabeled data are available. The technique has been used for applications including medical imaging, geographical imaging, and object recognition. == Manifold regularizer == === Motivation === Manifold regularization is a type of regularization, a family of techniques that reduces overfitting and ensures that a problem is well-posed by penalizing complex solutions. In particular, manifold regularization extends the technique of Tikhonov regularization as applied to Reproducing kernel Hilbert spaces (RKHSs). Under standard Tikhonov regularization on RKHSs, a learning algorithm attempts to learn a function f {\displaystyle f} from among a hypothesis space of functions H {\displaystyle {\mathcal {H}}} . The hypothesis space is an RKHS, meaning that it is associated with a kernel K {\displaystyle K} , and so every candidate function f {\displaystyle f} has a norm ‖ f ‖ K {\displaystyle \left\|f\right\|_{K}} , which represents the complexity of the candidate function in the hypothesis space. When the algorithm considers a candidate function, it takes its norm into account in order to penalize complex functions. Formally, given a set of labeled training data ( x 1 , y 1 ) , … , ( x ℓ , y ℓ ) {\displaystyle (x_{1},y_{1}),\ldots ,(x_{\ell },y_{\ell })} with x i ∈ X , y i ∈ Y {\displaystyle x_{i}\in X,y_{i}\in Y} and a loss function V {\displaystyle V} , a learning algorithm using Tikhonov regularization will attempt to solve the expression arg min f ∈ H 1 ℓ ∑ i = 1 ℓ V ( f ( x i ) , y i ) + γ ‖ f ‖ K 2 {\displaystyle {\underset {f\in {\mathcal {H}}}{\arg \!\min }}{\frac {1}{\ell }}\sum _{i=1}^{\ell }V(f(x_{i}),y_{i})+\gamma \left\|f\right\|_{K}^{2}} where γ {\displaystyle \gamma } is a hyperparameter that controls how much the algorithm will prefer simpler functions over functions that fit the data better. Manifold regularization adds a second regularization term, the intrinsic regularizer, to the ambient regularizer used in standard Tikhonov regularization. Under the manifold assumption in machine learning, the data in question do not come from the entire input space X {\displaystyle X} , but instead from a nonlinear manifold M ⊂ X {\displaystyle M\subset X} . The geometry of this manifold, the intrinsic space, is used to determine the regularization norm. === Laplacian norm === There are many possible choices for the intrinsic regularizer ‖ f ‖ I {\displaystyle \left\|f\right\|_{I}} . Many natural choices involve the gradient on the manifold ∇ M {\displaystyle \nabla _{M}} , which can provide a measure of how smooth a target function is. A smooth function should change slowly where the input data are dense; that is, the gradient ∇ M f ( x ) {\displaystyle \nabla _{M}f(x)} should be small where the marginal probability density P X ( x ) {\displaystyle {\mathcal {P}}_{X}(x)} , the probability density of a randomly drawn data point appearing at x {\displaystyle x} , is large. This gives one appropriate choice for the intrinsic regularizer: ‖ f ‖ I 2 = ∫ x ∈ M ‖ ∇ M f ( x ) ‖ 2 d P X ( x ) {\displaystyle \left\|f\right\|_{I}^{2}=\int _{x\in M}\left\|\nabla _{M}f(x)\right\|^{2}\,d{\mathcal {P}}_{X}(x)} In practice, this norm cannot be computed directly because the marginal distribution P X {\displaystyle {\mathcal {P}}_{X}} is unknown, but it can be estimated from the provided data. === Graph-based approach of the Laplacian norm === When the distances between input points are interpreted as a graph, then the Laplacian matrix of the graph can help to estimate the marginal distribution. Suppose that the input data include ℓ {\displaystyle \ell } labeled examples (pairs of an input x {\displaystyle x} and a label y {\displaystyle y} ) and u {\displaystyle u} unlabeled examples (inputs without associated labels). Define W {\displaystyle W} to be a matrix of edge weights for a graph, where W i j {\displaystyle W_{ij}} is a similarity built from distance measure between the data points x i {\displaystyle x_{i}} and x j {\displaystyle x_{j}} (so that more close implies higher W i j {\displaystyle W_{ij}} ). Define D {\displaystyle D} to be a diagonal matrix with D i i = ∑ j = 1 ℓ + u W i j {\displaystyle D_{ii}=\sum _{j=1}^{\ell +u}W_{ij}} and L {\displaystyle L} to be the Laplacian matrix D − W {\displaystyle D-W} . Then, as the number of data points ℓ + u {\displaystyle \ell +u} increases, L {\displaystyle L} converges to the Laplace–Beltrami operator Δ M {\displaystyle \Delta _{M}} , which is the divergence of the gradient ∇ M {\displaystyle \nabla _{M}} . Then, if f {\displaystyle \mathbf {f} } is a vector of the values of f {\displaystyle f} at the data, f = [ f ( x 1 ) , … , f ( x l + u ) ] T {\displaystyle \mathbf {f} =[f(x_{1}),\ldots ,f(x_{l+u})]^{\mathrm {T} }} , the intrinsic norm can be estimated: ‖ f ‖ I 2 = 1 ( ℓ + u ) 2 f T L f {\displaystyle \left\|f\right\|_{I}^{2}={\frac {1}{(\ell +u)^{2}}}\mathbf {f} ^{\mathrm {T} }L\mathbf {f} } As the number of data points ℓ + u {\displaystyle \ell +u} increases, this empirical definition of ‖ f ‖ I 2 {\displaystyle \left\|f\right\|_{I}^{2}} converges to the definition when P X {\displaystyle {\mathcal {P}}_{X}} is known. === Solving the regularization problem with graph-based approach === Using the weights γ A {\displaystyle \gamma _{A}} and γ I {\displaystyle \gamma _{I}} for the ambient and intrinsic regularizers, the final expression to be solved becomes: arg min f ∈ H 1 ℓ ∑ i = 1 ℓ V ( f ( x i ) , y i ) + γ A ‖ f ‖ K 2 + γ I ( ℓ + u ) 2 f T L f {\displaystyle {\underset {f\in {\mathcal {H}}}{\arg \!\min }}{\frac {1}{\ell }}\sum _{i=1}^{\ell }V(f(x_{i}),y_{i})+\gamma _{A}\left\|f\right\|_{K}^{2}+{\frac {\gamma _{I}}{(\ell +u)^{2}}}\mathbf {f} ^{\mathrm {T} }L\mathbf {f} } As with other kernel methods, H {\displaystyle {\mathcal {H}}} may be an infinite-dimensional space, so if the regularization expression cannot be solved explicitly, it is impossible to search the entire space for a solution. Instead, a representer theorem shows that under certain conditions on the choice of the norm ‖ f ‖ I {\displaystyle \left\|f\right\|_{I}} , the optimal solution f ∗ {\displaystyle f^{}} must be a linear combination of the kernel centered at each of the input points: for some weights α i {\displaystyle \alpha _{i}} , f ∗ ( x ) = ∑ i = 1 ℓ + u α i K ( x i , x ) {\displaystyle f^{}(x)=\sum _{i=1}^{\ell +u}\alpha _{i}K(x_{i},x)} Using this result, it is possible to search for the optimal solution f ∗ {\displaystyle f^{}} by searching the finite-dimensional space defined by the possible choices of α i {\displaystyle \alpha _{i}} . === Functional approach of the Laplacian norm === The idea beyond the graph-Laplacian is to use neighbors to estimate the Laplacian. This method is akin to local averaging methods, that are known to scale poorly in high-dimensional problems. Indeed, the graph Laplacian is known to suffer from the curse of dimensionality. Luckily, it is possible to leverage expected smoothness of the function to estimate thanks to more advanced functional analysis. This method consists of estimating the Laplacian operator using derivatives of the kernel reading ∂ 1 , j K ( x i , x ) {\displaystyle \partial _{1,j}K(x_{i},x)} where ∂ 1 , j {\displaystyle \partial _{1,j}} denotes the partial derivatives according to the j-th coordinate of the first variable. This second approach to the Laplacian norm is to put in relation with meshfree methods, that contrast with the finite difference method in PDE. == Applications == Manifold regularization can extend a variety of algorithms that can be expressed using Tikhonov regularization, by choosing an appropriate loss function V {\displaystyle V} and hypothesis space H {\displaystyle {\mathcal {H}}} . Two commonly used examples are the families of support vector machines and regularized least squares algorithm

Mastodon (social network)

Mastodon is a free and open-source software platform for decentralized social networking with microblogging features similar to Twitter. It operates as a federated network of independently managed servers that communicate using the ActivityPub protocol, allowing users to connect across different instances within the Fediverse. Each Mastodon instance establishes its own moderation policies and content guidelines, distinguishing it from centrally controlled social media platforms. First released in 2016 by Eugen Rochko, Mastodon has positioned itself as an alternative to mainstream social media, particularly for users seeking decentralized, community-driven spaces. The platform has experienced multiple surges in adoption, most notably following the Twitter acquisition by Elon Musk in 2022, as users sought alternatives to Twitter. It is part of a broader shift toward decentralized social networks, including Bluesky and Lemmy. Mastodon emphasizes user privacy and moderation flexibility, offering features such as granular post visibility controls, content warning options, and local community-driven moderation. The software is written in Ruby on Rails and Node.js, with a web interface built using React and Redux. It is interoperable with other ActivityPub-based platforms, such as Threads, and supports various third-party applications on desktop and mobile devices. == Functionality == Users post short-form status messages, historically known as "toots", for others to see and interact with. On a standard Mastodon instance, these messages can include up to 500 text-based characters, greater than Twitter's 280-character limit. Some instances support even longer messages. Images, audio files, videos or polls can also be added to a message. Users join a specific Mastodon server, rather than a single centralized website or application. The servers are connected as nodes in a network, and each server can administer its own rules, account privileges, and whether to share messages to and from other servers. Users can communicate and follow each other across connected Mastodon servers with usernames similar in format to full email addresses. Since version 2.9.0, Mastodon's web user interface has offered a single-column mode for new users by default. In advanced mode, the interface approximates the microblogging interface of TweetDeck. === Privacy === Mastodon includes a number of specific privacy features. Each message has a variety of privacy options available, and users can choose whether the message is public or private. Messages can display public on a global feed, known as a timeline, or can be shared only to the user's followers. Messages can also be marked as unlisted from timelines or direct between users. Users can also mark their accounts as completely private. In the timeline, messages can display with an optional content warning feature, which requires readers to click on the hidden main body of the message to reveal it. Mastodon servers have used this feature to hide spoilers, trigger warnings, and not safe for work (NSFW) content, though some accounts use the feature to hide links and thoughts others might not want to read. Mastodon aggregates messages in local and federated timelines in real time. The local timeline shows messages from users on a singular server, while the federated timeline shows messages across all participating Mastodon servers. === Content moderation === In early 2017, journalists like Sarah Jeong distinguished Mastodon from Twitter for its approach to combating harassment. Mastodon uses community-based moderation, in which each server can limit or filter out undesirable types of content, while Twitter uses a single, global policy on content moderation. Servers can choose to limit or filter out messages with disparaging content. The founder of Mastodon, Eugen Rochko, believes that small, closely related communities deal with unwanted behavior more effectively than a large company's small safety team. In Move Slowly and Build Bridges, Robert W. Gehl argues that predominantly white participation has shaped Mastodon in ways that affect how reports of racism are received and limit its ability to replicate Black Twitter on Twitter. Users can also block and report others to administrators, much like on Twitter. Instance administrators can block other instances from interacting with their own, an action called defederation. By posting toots hashtagged with #fediblock, some instance administrators and users alert others of issues requiring moderation. === Searching === Mastodon by default allows searching for hashtags and mentioned accounts in the Fediverse. Server administrators can optionally enable Elasticsearch to search the full-text of public posts that have opted in to being indexed. == Versions == In September 2018, with the release of version 2.5 with redesigned public profile pages, Mastodon marked its 100th release. Mastodon 2.6 was released in October 2018, introducing the possibilities of verified profiles and live, in-stream link previews for images and videos. Version 2.7, in January 2019, made it possible to search for multiple hashtags at once, instead of searching for just a single hashtag, with more robust moderation capabilities for server administrators and moderators, while accessibility, such as contrast for users with sight issues, was improved. The ability for users to create and vote in polls, as well as a new invitation system to manage registrations was integrated in April 2019. Mastodon 2.8.1, released in May 2019, made images with content warnings blurred instead of completely hidden. In version 2.9 in June 2019, an optional single-column view was added. This view became the default displayed to new users, with a user "preferences" option to switch to a multiple-column-based view. In August 2020, Mastodon 3.2 was released. It included a redesigned audio player with custom thumbnails and the ability to add personal notes to one's profile. In July 2021, an official client for iOS devices was released. According to the project's then CEO, Eugen Rochko, the release was part of an effort to attract new users. Mastodon 4.0 was released in November 2022, including language support for translating posts, editing posts and following hashtags. Mastodon 4.5 was released in November 2025. Among other features it introduced quote posts, which were previously rejected from being implemented due to concerns about toxicity and harassment. To mitigate these issues Mastodon's quote post feature has been designed in a way that lets users decide if and by whom their posts can be quoted. == Software == Mastodon is published as free and open-source software under the Affero GPL license, allowing anyone to use the software or modify it as they wish. Servers can be run by any individual or organization, and users can join these servers as they wish. The server software itself is powered by Ruby on Rails and Node.js, with its web client being written in React.js and Redux. The only database software supported is PostgreSQL, with Redis being used for job processing and various actions that Mastodon needs to process. The service is interoperable with the fediverse, a collection of social networking services which use the ActivityPub protocol for communication between each other, with previous versions containing support for OStatus. Client apps for interacting with the Mastodon API are available for desktop computer operating systems, including Windows, macOS and the Linux family of operating systems, as well as mobile phones running iOS and Android. The API is open for anyone to utilize, allowing clients to be built for any operating system that can connect to the internet. === Integration with Fediverse === Mastodon uses the ActivityPub protocol for federation; this allows users to communicate between independent Mastodon instances and other ActivityPub compatible services. Thus, Mastodon is generally considered to be a part of the Fediverse. Services utilizing the ActivityPub protocol exist which allow for searching all posts on all instances as long as users opt-in. For similar reasons, only hashtags can appear in a Mastodon instance's trending topics, not arbitrary popular words. Trending topics vary between instances, since individual instances are aware of different subsets of posts from the whole fediverse. === Security concerns === While Mastodon's decentralized structure is one of its most distinctive features, it also poses additional security challenges. Since many Mastodon instances are run by volunteers, some security experts are concerned about data security and responsiveness to new threats and vulnerabilities across the network, considering the difficulty of configuring and maintaining an instance as well as uneven skill levels among administrators. Administrators of an instance also have access to the private information of any users that are either registered with that instance or have federated

Neuromorphic computing

Neuromorphic computing is a computing approach inspired by the human brain's structure and function. It uses artificial neurons to perform computations, mimicking neural systems for tasks such as perception, motor control, and multisensory integration. These systems, implemented in analog, digital, or mixed-mode VLSI, prioritize robustness, adaptability, and learning by emulating the brain’s distributed processing across small computing elements. This interdisciplinary field integrates biology, physics, mathematics, computer science, and electronic engineering to develop systems that emulate the brain’s morphology and computational strategies. Neuromorphic systems aim to enhance energy efficiency and computational power for applications including artificial intelligence, pattern recognition, and sensory processing. == History == Carver Mead proposed one of the first applications for neuromorphic engineering in the late 1980s. In 2006, researchers at Georgia Tech developed a field programmable neural array, a silicon-based chip modeling neuron channel-ion characteristics. In 2011, MIT researchers created a chip mimicking synaptic communication using 400 transistors and standard CMOS techniques. In 2012 HP Labs researchers reported that Mott memristors exhibit volatile behavior at low temperatures, enabling the creation of neuristors that mimic neuron behavior and support Turing machine components. Also in 2012, Purdue University researchers presented a neuromorphic chip design using lateral spin valves and memristors, noted for energy efficiency. The 2013 Blue Brain Project creates detailed digital models of rodent brains. Neurogrid, developed by Brains in Silicon at Stanford University, used 16 NeuroCore chips to emulate 65,536 neurons with high energy efficiency in 2014. The 2014 BRAIN Initiative and IBM’s TrueNorth chip contributed to neuromorphic advancements. The 2016 BrainScaleS project, a hybrid neuromorphic supercomputer at University of Heidelberg, operated 864 times faster than biological neurons. In 2017, Intel unveiled its Loihi chip, using an asynchronous artificial neural network for efficient learning and inference. Also in 2017 IMEC’s self-learning chip, based on OxRAM, demonstrated music composition by learning from minuets. In 2022, MIT researchers developed artificial synapses using protons for analog deep learning. In 2019, the European Union funded neuromorphic quantum computing to explore quantum operations using neuromorphic systems. Also in 2022, researchers at the Max Planck Institute for Polymer Research developed an organic artificial spiking neuron for in-situ neuromorphic sensing and biointerfacing. Researchers reported in 2024 that chemical systems in liquid solutions can detect sound at various wavelengths, offering potential for neuromorphic applications. == Neurological inspiration == Neuromorphic engineering emulates the brain’s structure and operations, focusing on the analog nature of biological computation and the role of neurons in cognition. The brain processes information via neurons using chemical signals, abstracted into mathematical functions. Neuromorphic systems distribute computation across small elements, similar to neurons, using methods guided by anatomical and functional neural maps from electron microscopy and neural connection studies. == Implementation == Neuromorphic systems employ hardware such as oxide-based memristors, spintronic memories, threshold switches, and transistors. Software implementations train spiking neural networks using error backpropagation. === Neuromemristive systems === Neuromemristive systems use memristors to implement neuroplasticity, focusing on abstract neural network models rather than detailed biological mimicry. These systems enable applications in speech recognition, face recognition, and object recognition, and can replace conventional digital logic gates. The Caravelli-Traversa-Di Ventra equation describes memristive memory evolution, revealing tunneling phenomena and Lyapunov functions. === Neuromorphic sensors === Neuromorphic principles extend to sensors, such as the retinomorphic sensor or event camera, which mimic human vision by registering brightness changes individually, optimizing power consumption. An example of this applied to detecting light is the retinomorphic sensor or, when employed in an array, an event camera. == Ethical considerations == Neuromorphic systems raise the same ethical questions as those for other approaches to artificial intelligence. Daniel Lim argued that advanced neuromorphic systems could lead to machine consciousness, raising concerns about whether civil rights and other protocols should be extended to them. Legal debates, such as in Acohs Pty Ltd v. Ucorp Pty Ltd, question ownership of work produced by neuromorphic systems, as non-human-generated outputs may not be copyrightable.

Stanford Research Institute Problem Solver

The Stanford Research Institute Problem Solver, known by its acronym STRIPS, is an automated planner developed by Richard Fikes and Nils Nilsson in 1971 at SRI International. The same name was later used to refer to the formal language of the inputs to this planner. This language is the base for most of the languages for expressing automated planning problem instances in use today; such languages are commonly known as action languages. This article only describes the language, not the planner. == Definition == A STRIPS instance is composed of: An initial state; The specification of the goal states – situations that the planner is trying to reach; A set of actions. For each action, the following are included: preconditions (what must be established before the action is performed); postconditions (what is established after the action is performed). Mathematically, a STRIPS instance is a quadruple ⟨ P , O , I , G ⟩ {\displaystyle \langle P,O,I,G\rangle } , in which each component has the following meaning: P {\displaystyle P} is a set of conditions (i.e., propositional variables); O {\displaystyle O} is a set of operators (i.e., actions); each operator is itself a quadruple ⟨ α , β , γ , δ ⟩ {\displaystyle \langle \alpha ,\beta ,\gamma ,\delta \rangle } , each element being a set of conditions. These four sets specify, in order, which conditions must be true for the action to be executable, which ones must be false, which ones are made true by the action and which ones are made false; I {\displaystyle I} is the initial state, given as the set of conditions that are initially true (all others are assumed false); G {\displaystyle G} is the specification of the goal state; this is given as a pair ⟨ N , M ⟩ {\displaystyle \langle N,M\rangle } , which specify which conditions are true and false, respectively, in order for a state to be considered a goal state. A plan for such a planning instance is a sequence of operators that can be executed from the initial state and that leads to a goal state. Formally, a state is a set of conditions: a state is represented by the set of conditions that are true in it. Transitions between states are modeled by a transition function, which is a function mapping states into new states that result from the execution of actions. Since states are represented by sets of conditions, the transition function relative to the STRIPS instance ⟨ P , O , I , G ⟩ {\displaystyle \langle P,O,I,G\rangle } is a function succ : 2 P × O → 2 P , {\displaystyle \operatorname {succ} :2^{P}\times O\rightarrow 2^{P},} where 2 P {\displaystyle 2^{P}} is the set of all subsets of P {\displaystyle P} , and is therefore the set of all possible states. The transition function succ {\displaystyle \operatorname {succ} } for a state C ⊆ P {\displaystyle C\subseteq P} , can be defined as follows, using the simplifying assumption that actions can always be executed but have no effect if their preconditions are not met: The function succ {\displaystyle \operatorname {succ} } can be extended to sequences of actions by the following recursive equations: succ ⁡ ( C , [ ] ) = C {\displaystyle \operatorname {succ} (C,[\ ])=C} succ ⁡ ( C , [ a 1 , a 2 , … , a n ] ) = succ ⁡ ( succ ⁡ ( C , a 1 ) , [ a 2 , … , a n ] ) {\displaystyle \operatorname {succ} (C,[a_{1},a_{2},\ldots ,a_{n}])=\operatorname {succ} (\operatorname {succ} (C,a_{1}),[a_{2},\ldots ,a_{n}])} A plan for a STRIPS instance is a sequence of actions such that the state that results from executing the actions in order from the initial state satisfies the goal conditions. Formally, [ a 1 , a 2 , … , a n ] {\displaystyle [a_{1},a_{2},\ldots ,a_{n}]} is a plan for G = ⟨ N , M ⟩ {\displaystyle G=\langle N,M\rangle } if F = succ ⁡ ( I , [ a 1 , a 2 , … , a n ] ) {\displaystyle F=\operatorname {succ} (I,[a_{1},a_{2},\ldots ,a_{n}])} satisfies the following two conditions: N ⊆ F {\displaystyle N\subseteq F} M ∩ F = ∅ {\displaystyle M\cap F=\varnothing } == Extensions == The above language is actually the propositional version of STRIPS; in practice, conditions are often about objects: for example, that the position of a robot can be modeled by a predicate A t {\displaystyle At} , and A t ( r o o m 1 ) {\displaystyle At(room1)} means that the robot is in Room1. In this case, actions can have free variables, which are implicitly existentially quantified. In other words, an action represents all possible propositional actions that can be obtained by replacing each free variable with a value. The initial state is considered fully known in the language described above: conditions that are not in I {\displaystyle I} are all assumed false. This is often a limiting assumption, as there are natural examples of planning problems in which the initial state is not fully known. Extensions of STRIPS have been developed to deal with partially known initial states. == A sample STRIPS problem == A monkey is at location A in a lab. There is a box in location C. The monkey wants the bananas that are hanging from the ceiling in location B, but it needs to move the box and climb onto it in order to reach them. Initial state: At(A), Level(low), BoxAt(C), BananasAt(B) Goal state: Have(bananas) Actions: // move from X to Y _Move(X, Y)_ Preconditions: At(X), Level(low) Postconditions: not At(X), At(Y) // climb up on the box _ClimbUp(Location)_ Preconditions: At(Location), BoxAt(Location), Level(low) Postconditions: Level(high), not Level(low) // climb down from the box _ClimbDown(Location)_ Preconditions: At(Location), BoxAt(Location), Level(high) Postconditions: Level(low), not Level(high) // move monkey and box from X to Y _MoveBox(X, Y)_ Preconditions: At(X), BoxAt(X), Level(low) Postconditions: BoxAt(Y), not BoxAt(X), At(Y), not At(X) // take the bananas _TakeBananas(Location)_ Preconditions: At(Location), BananasAt(Location), Level(high) Postconditions: Have(bananas) == Complexity == Deciding whether any plan exists for a propositional STRIPS instance is PSPACE-complete. Various restrictions can be enforced in order to decide if a plan exists in polynomial time or at least make it an NP-complete problem. == Macro operator == In the monkey and banana problem, the robot monkey has to execute a sequence of actions to reach the banana at the ceiling. A single action provides a small change in the game. To simplify the planning process, it make sense to invent an abstract action, which isn't available in the normal rule description. The super-action consists of low level actions and can reach high-level goals. The advantage is that the computational complexity is lower, and longer tasks can be planned by the solver. Identifying new macro operators for a domain can be realized with genetic programming. The idea is, not to plan the domain itself, but in the pre-step, a heuristics is created that allows the domain to be solved much faster. In the context of reinforcement learning, a macro-operator is called an option. Similar to the definition within AI planning, the idea is, to provide a temporal abstraction (span over a longer period) and to modify the game state directly on a higher layer.

Shane Legg

Shane Legg (born 1973 or 1974) is a machine learning researcher and entrepreneur. With Demis Hassabis and Mustafa Suleyman, he cofounded DeepMind Technologies (later bought by Google and now called Google DeepMind), and works there as the chief AGI scientist. He is also known for his academic work on artificial general intelligence, including his thesis supervised by Marcus Hutter. == Early life and education == Legg attended Rotorua Lakes High School in Rotorua, on New Zealand's North Island. He completed his undergraduate studies at Waikato University in 1996. Also in 1996, he obtained his MSc degree with a thesis entitled "Solomonoff Induction", with Cristian S. Calude at the University of Auckland. == Research interests == In the early 2000s, Legg re-introduced and popularized with Ben Goertzel the term "artificial general intelligence" (AGI), to describe an AI that can do practically any cognitive task a human can do. At that time, talking about AGI "would put you on the lunatic fringe". Legg is known for his concern of existential risk from AI, highlighted in 2011 in an interview on LessWrong and in 2023 he signed the statement on AI risk of extinction. == Career == Before his PhD and before cofounding DeepMind, Shane Legg worked at "a number of software development positions at private companies", including the "big data firm Adaptive Intelligence" and the startup WebMind founded by Ben Goertzel. === Research === Legg later obtained a PhD at the Dalle Molle Institute for Artificial Intelligence Research (IDSIA), a joint research institute of USI Università della Svizzera italiana and SUPSI. He worked on theoretical models of super intelligent machines (AIXI) with Marcus Hutter, and completed in 2008 his doctoral thesis entitled "Machine Super Intelligence". He then went on to complete a postdoctoral fellowship in finance at USI, and began a further fellowship at University College London's Gatsby Computational Neuroscience Unit. === DeepMind === Demis Hassabis and Shane Legg first met in 2009 at University College London, where Legg was a postdoctoral researcher. In 2010, Legg cofounded the start-up DeepMind Technologies along with Demis Hassabis and Mustafa Suleyman. DeepMind Technologies was bought in 2014 by Google. After the merge with Google Brain in 2023, the company is now known as Google DeepMind. According to a 2017 article, a significant part of his job as the chief scientist was to supervise recruitment, to decide where DeepMind should focus its efforts, and to lead DeepMind's AI safety work. As of July 2023, Legg works at Google DeepMind as the Chief AGI Scientist. == Awards and honors == Legg was awarded the $10,000 prize of the Singularity Institute for Artificial Intelligence for his PhD done in 2008. Legg was appointed Commander of the Order of the British Empire (CBE) in the 2019 Birthday Honours for services to the science and technology sector and to investment.

IMPACT (computer graphics)

IMPACT (sometimes spelled Impact) is a computer graphics architecture for Silicon Graphics computer workstations. IMPACT Graphics was developed in 1995 and was available as a high-end graphics option on workstations released during the mid-1990s. IMPACT graphics gives the workstation real-time 2D and 3D graphics rendering capability similar to that of even high-end PCs made well after IMPACT's introduction. IMPACT graphics systems consist of either one or two Geometry Engines and one or two Raster Engines in various configurations. IMPACT graphics consists of five graphics subsystems: the Command Engine, Geometry Subsystem, Raster Engine, framebuffer and Display Subsystem. IMPACT Graphics can produce resolutions up to 1600 x 1200 pixels with 32-bit color and can also process unencoded NTSC and PAL analog television signals. IMPACT graphics subsystems come in three configurations for SGI Indigo2 IMPACT workstations: Solid IMPACT, High IMPACT, and Maximum IMPACT. The equivalent configurations also exist for the SGI Octane workstation but are referred to as SI, SSI, and MXI (I-series). Later Octane workstations used a similar configuration but with updated ASIC chips and are referred to as SE, SSE, and MXE (E-series). IMPACT uses Rambus RDRAM for texture memory. The IMPACT graphics architecture was superseded by SGI's VPro graphics architecture in 1997.

MuZero

MuZero is a computer program developed by artificial intelligence research company DeepMind, a subsidiary of Google, to master games without knowing their rules and underlying dynamics. Its release in 2019 included benchmarks of its performance in Go, chess, shogi, and a suite of 57 different Atari games. The algorithm uses an approach similar to AlphaZero, where a combination of a tree-based search and a learned model is deployed. It matched AlphaZero's performance in chess and shogi, improved on its performance in Go, and improved on the state of the art in mastering a suite of 57 Atari games (the Arcade Learning Environment), a visually-complex domain. MuZero was trained via self-play, with no access to rules, opening books, or endgame tablebases. The trained algorithm used the same convolutional and residual architecture as AlphaZero, but with 20 percent fewer computation steps per node in the search tree. == History == MuZero really is discovering for itself how to build a model and understand it just from first principles. On November 19, 2019, the DeepMind team released a preprint introducing MuZero. === Derivation from AlphaZero === MuZero (MZ) is a combination of the high-performance planning of the AlphaZero (AZ) algorithm with approaches to model-free reinforcement learning. The combination allows for more efficient training in classical planning regimes, such as Go, while also handling domains with much more complex inputs at each stage, such as visual video games. MuZero was derived directly from AZ code, sharing its rules for setting hyperparameters. Differences between the approaches include: AZ's planning process uses a simulator. The simulator knows the rules of the game. It has to be explicitly programmed. A neural network then predicts the policy and value of a future position. Perfect knowledge of game rules is used in modeling state transitions in the search tree, actions available at each node, and termination of a branch of the tree. MZ does not have access to the rules, and instead learns one with neural networks. AZ has a single model for the game (from board state to predictions); MZ has separate models for representation of the current state (from board state into its internal embedding), dynamics of states (how actions change representations of board states), and prediction of policy and value of a future position (given a state's representation). MZ's hidden model may be complex, and it may turn out it can host computation; exploring the details of the hidden model in a trained instance of MZ is a topic for future exploration. MZ does not expect a two-player game where winners take all. It works with standard reinforcement-learning scenarios, including single-agent environments with continuous intermediate rewards, possibly of arbitrary magnitude and with time discounting. AZ was designed for two-player games that could be won, drawn, or lost. === Comparison with R2D2 === The previous state of the art technique for learning to play the suite of Atari games was R2D2, the Recurrent Replay Distributed DQN. MuZero surpassed both R2D2's mean and median performance across the suite of games, though it did not do better in every game. == Training and results == MuZero used 16 third-generation tensor processing units (TPUs) for training, and 1000 TPUs for selfplay for board games, with 800 simulations per step and 8 TPUs for training and 32 TPUs for selfplay for Atari games, with 50 simulations per step. AlphaZero used 64 second-generation TPUs for training, and 5000 first-generation TPUs for selfplay. As TPU design has improved (third-generation chips are 2x as powerful individually as second-generation chips, with further advances in bandwidth and networking across chips in a pod), these are comparable training setups. R2D2 was trained for 5 days through 2M training steps. === Initial results === MuZero matched AlphaZero's performance in chess and shogi after roughly 1 million training steps. It matched AZ's performance in Go after 500,000 training steps and surpassed it by 1 million steps. It matched R2D2's mean and median performance across the Atari game suite after 500 thousand training steps and surpassed it by 1 million steps, though it never performed well on 6 games in the suite. == Reactions and related work == MuZero was viewed as a significant advancement over AlphaZero, and a generalizable step forward in unsupervised learning techniques. The work was seen as advancing understanding of how to compose systems from smaller components, a systems-level development more than a pure machine-learning development. While only pseudocode was released by the development team, Werner Duvaud produced an open source implementation based on that. MuZero has been used as a reference implementation in other work, for instance as a way to generate model-based behavior. In late 2021, a more efficient variant of MuZero was proposed, named EfficientZero. It "achieves 194.3 percent mean human performance and 109.0 percent median performance on the Atari 100k benchmark with only two hours of real-time game experience". In early 2022, a variant of MuZero was proposed to play stochastic games (for example 2048, backgammon), called Stochastic MuZero, which uses afterstate dynamics and chance codes to account for the stochastic nature of the environment when training the dynamics network.