The RFPolicy outlines a method for contacting vendors about security vulnerabilities found in their products. It was initially written in 2000 by hacker and security consultant Rain Forest Puppy. It was perhaps the second disclosure policy, following Simple Nomad's. The policy gives the vendor five working days to respond to the reporter of the bug. If the vendor fails to contact the reporter within those five days, the issue is recommended to be disclosed to the general community. The reporter should help the vendor reproduce the bug and work out a fix. The reporter should delay notifying the general community about the bug if the vendor provides feasible reasons for requiring so. If the vendor fails to respond or shuts down communication with the reporter of the problem within five working days, the reporter should disclose the issue to the general community. When issuing an alert or fix, the vendor should give the reporter proper credit for reporting the bug. Context for the history of vulnerability disclosure is available in a history article.
Cloem
Cloem is a company based in Cannes, France, which applies natural language processing (NLP) technologies to assist patent applicants in creating variants of patent claims, called "cloems". According to the company, these "computer-generated claims can be published to keep potential competitors from attempting to file adjacent patent claims." == Technology == According to Cloem, dictionaries, ontologies and proprietary claim-drafting algorithms are used to draft alternative claims based on a client's original set of claims. In particular, the original set of claims is subject to various permutations and linguistic manipulations "by considering alternative definitions for terms as well as “synonyms, hyponyms, hyperonyms, meronyms, holonyms, and antonyms.”" == Possible uses == Cloem can optionally publish one or more created texts, as electronic publications or as paper-printed publications. These can potentially serve – through a defensive publication – as prior art to prevent another party for obtaining a patent on the subject-matter at stake. In other words, after an initial patent filing, an "improvement" patent (adjacent invention) can be applied for by another party, such as a competitor. By publishing variants of a patent claim, the risk of adverse patenting may potentially be decreased (improvement inventions may no longer be patentable). Cloems may also be potentially patentable. One of the issues of patentability, however, is that only a natural person can be a listed as an inventor on a patent. Since cloems are produced by a computer based on a person's input, it is not clear if the computer or the person is the inventor. The inventorship of Cloem texts is an open question.
Curse of dimensionality
The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman when considering problems in dynamic programming. The curse generally refers to issues that arise when the number of datapoints is small (in a suitably defined sense) relative to the intrinsic dimension of the data. Dimensionally cursed phenomena occur in domains such as numerical analysis, sampling, combinatorics, machine learning, data mining and databases. The common theme of these problems is that when the dimensionality increases, the volume of the space increases so fast that the available data becomes sparse. In order to obtain a reliable result, the amount of data needed often grows exponentially with the dimensionality. Also, organizing and searching data often relies on detecting areas where objects form groups with similar properties; in high dimensional data, however, all objects appear to be sparse and dissimilar in many ways, which prevents common data organization strategies from being efficient. == Domains == === Combinatorics === In some problems, each variable can take one of several discrete values, or the range of possible values is divided to give a finite number of possibilities. Taking the variables together, a huge number of combinations of values must be considered. This effect is also known as the combinatorial explosion. Even in the simplest case of d {\displaystyle d} binary variables, the number of possible combinations already is 2 d {\displaystyle 2^{d}} , exponential in the dimensionality. Naively, each additional dimension doubles the effort needed to try all combinations. === Sampling === There is an exponential increase in volume associated with adding extra dimensions to a mathematical space. For example, 102 = 100 evenly spaced sample points suffice to sample a unit interval (try to visualize a "1-dimensional" cube, i.e. a line) with no more than 10−2 = 0.01 distance between points; an equivalent sampling of a 10-dimensional unit hypercube with a lattice that has a spacing of 10−2 = 0.01 between adjacent points would require 1020 = [(102)10] sample points. In general, with a spacing distance of 10−n the 10-dimensional hypercube appears to be a factor of 10n(10−1) = [(10n)10/(10n)] "larger" than the 1-dimensional hypercube, which is the unit interval. In the above example n = 2: when using a sampling distance of 0.01 the 10-dimensional hypercube appears to be 1018 "larger" than the unit interval. This effect is a combination of the combinatorics problems above and the distance function problems explained below. === Optimization === When solving dynamic optimization problems by numerical backward induction, the objective function must be computed for each combination of values. This is a significant obstacle when the dimension of the "state variable" is large. === Machine learning === In machine learning problems that involve learning a "state-of-nature" from a finite number of data samples in a high-dimensional feature space with each feature having a range of possible values, typically an enormous amount of training data is required to ensure that there are several samples with each combination of values. In an abstract sense, as the number of features or dimensions grows, the amount of data we need to generalize accurately grows exponentially. A typical rule of thumb is that there should be at least 5 training examples for each dimension in the representation. In machine learning and insofar as predictive performance is concerned, the curse of dimensionality is used interchangeably with the peaking phenomenon, which is also known as Hughes phenomenon. This phenomenon states that with a fixed number of training samples, the average (expected) predictive power of a classifier or regressor first increases as the number of dimensions or features used is increased but beyond a certain dimensionality it starts deteriorating instead of improving steadily. Nevertheless, in the context of a simple classifier (e.g., linear discriminant analysis in the multivariate Gaussian model under the assumption of a common known covariance matrix), Zollanvari et al. showed both analytically and empirically that as long as the relative cumulative efficacy of an additional feature set (with respect to features that are already part of the classifier) is greater (or less) than the size of this additional feature set, the expected error of the classifier constructed using these additional features will be less (or greater) than the expected error of the classifier constructed without them. In other words, both the size of additional features and their (relative) cumulative discriminatory effect are important in observing a decrease or increase in the average predictive power. In metric learning, higher dimensions can sometimes allow a model to achieve better performance. After normalizing embeddings to the surface of a hypersphere, FaceNet achieves the best performance using 128 dimensions as opposed to 64, 256, or 512 dimensions in one ablation study. A loss function for unitary-invariant dissimilarity between word embeddings was found to be minimized in high dimensions. === Data mining === In data mining, the curse of dimensionality refers to a data set with too many features. Consider the first table, which depicts 200 individuals and 2000 genes (features) with a 1 or 0 denoting whether or not they have a genetic mutation in that gene. A data mining application to this data set may be finding the correlation between specific genetic mutations and creating a classification algorithm such as a decision tree to determine whether an individual has cancer or not. A common practice of data mining in this domain would be to create association rules between genetic mutations that lead to the development of cancers. To do this, one would have to loop through each genetic mutation of each individual and find other genetic mutations that occur over a desired threshold and create pairs. They would start with pairs of two, then three, then four until they result in an empty set of pairs. The complexity of this algorithm can lead to calculating all permutations of gene pairs for each individual or row. Given the formula for calculating the permutations of n items with a group size of r is: n ! ( n − r ) ! {\displaystyle {\frac {n!}{(n-r)!}}} , calculating the number of three pair permutations of any given individual would be 7988004000 different pairs of genes to evaluate for each individual. The number of pairs created will grow by an order of factorial as the size of the pairs increase. The growth is depicted in the permutation table (see right). As we can see from the permutation table above, one of the major problems data miners face regarding the curse of dimensionality is that the space of possible parameter values grows exponentially or factorially as the number of features in the data set grows. This problem critically affects both computational time and space when searching for associations or optimal features to consider. Another problem data miners may face when dealing with too many features is that the number of false predictions or classifications tends to increase as the number of features grows in the data set. In terms of the classification problem discussed above, keeping every data point could lead to a higher number of false positives and false negatives in the model. This may seem counterintuitive, but consider the genetic mutation table from above, depicting all genetic mutations for each individual. Each genetic mutation, whether they correlate with cancer or not, will have some input or weight in the model that guides the decision-making process of the algorithm. There may be mutations that are outliers or ones that dominate the overall distribution of genetic mutations when in fact they do not correlate with cancer. These features may be working against one's model, making it more difficult to obtain optimal results. This problem is up to the data miner to solve, and there is no universal solution. The first step any data miner should take is to explore the data, in an attempt to gain an understanding of how it can be used to solve the problem. One must first understand what the data means, and what they are trying to discover before they can decide if anything must be removed from the data set. Then they can create or use a feature selection or dimensionality reduction algorithm to remove samples or features from the data set if they deem it necessary. One example of such methods is the interquartile range method, used to remove outliers in a data set by calculating the standard deviation of a feature or occurrence. === Distance function === When a measure such as a Euclidean distance is defined using many coordinat
Automated decision-making
Automated decision-making (ADM) is the use of data, machines and algorithms to make decisions in a range of contexts, including public administration, business, health, education, law, employment, transport, media and entertainment, with varying degrees of human oversight or intervention. ADM may involve large-scale data from a range of sources, such as databases, text, social media, sensors, images or speech, that is processed using various technologies including computer software, algorithms, machine learning, natural language processing, artificial intelligence, augmented intelligence and robotics. The increasing use of automated decision-making systems (ADMS) across a range of contexts presents many benefits and challenges to human society requiring consideration of the technical, legal, ethical, societal, educational, economic and health consequences. == Overview == There are different definitions of ADM based on the level of automation involved. Some definitions suggests ADM involves decisions made through purely technological means without human input, such as the EU's General Data Protection Regulation (Article 22). However, ADM technologies and applications can take many forms ranging from decision-support systems that make recommendations for human decision-makers to act on, sometimes known as augmented intelligence or 'shared decision-making', to fully automated decision-making processes that make decisions on behalf of individuals or organizations without human involvement. Models used in automated decision-making systems can be as simple as checklists and decision trees through to artificial intelligence and deep neural networks (DNN). Since the 1950s computers have gone from being able to do basic processing to having the capacity to undertake complex, ambiguous and highly skilled tasks such as image and speech recognition, gameplay, scientific and medical analysis and inferencing across multiple data sources. ADM is now being increasingly deployed across all sectors of society and many diverse domains from entertainment to transport. An ADM system (ADMS) may involve multiple decision points, data sets, and technologies (ADMT) and may sit within a larger administrative or technical system such as a criminal justice system or business process. == Data == Automated decision-making involves using data as input to be analyzed within a process, model, or algorithm or for learning and generating new models. ADM systems may use and connect a wide range of data types and sources depending on the goals and contexts of the system, for example, sensor data for self-driving cars and robotics, identity data for security systems, demographic and financial data for public administration, medical records in health, criminal records in law. This can sometimes involve vast amounts of data and computing power. === Data quality === The quality of the available data and its ability to be used in ADM systems is fundamental to the outcomes. It is often highly problematic for many reasons. Datasets are often highly variable; corporations or governments may control large-scale data, restricted for privacy or security reasons, incomplete, biased, limited in terms of time or coverage, measuring and describing terms in different ways, and many other issues. For machines to learn from data, large corpora are often required, which can be challenging to obtain or compute; however, where available, they have provided significant breakthroughs, for example, in diagnosing chest X-rays. == ADM technologies == Automated decision-making technologies (ADMT) are software-coded digital tools that automate the translation of input data to output data, contributing to the function of automated decision-making systems. There are a wide range of technologies in use across ADM applications and systems. ADMTs involving basic computational operations Search (includes 1-2-1, 1-2-many, data matching/merge) Matching (two different things) Mathematical Calculation (formula) ADMTs for assessment and grouping: User profiling Recommender systems Clustering Classification Feature learning Predictive analytics (includes forecasting) ADMTs relating to space and flows: Social network analysis (includes link prediction) Mapping Routing ADMTs for processing of complex data formats Image processing Audio processing Natural Language Processing (NLP) Other ADMT Business rules management systems Time series analysis Anomaly detection Modelling/Simulation === Machine learning === Machine learning (ML) involves training computer programs through exposure to large data sets and examples to learn from experience and solve problems. Machine learning can be used to generate and analyse data as well as make algorithmic calculations and has been applied to image and speech recognition, translations, text, data and simulations. While machine learning has been around for some time, it is becoming increasingly powerful due to recent breakthroughs in training deep neural networks (DNNs), and dramatic increases in data storage capacity and computational power with GPU coprocessors and cloud computing. Machine learning systems based on foundation models run on deep neural networks and use pattern matching to train a single huge system on large amounts of general data such as text and images. Early models tended to start from scratch for each new problem however since the early 2020s many are able to be adapted to new problems. Examples of these technologies include Open AI's DALL-E (an image creation program) and their various GPT language models, and Google's PaLM language model program. == Applications == ADM is being used to replace or augment human decision-making by both public and private-sector organisations for a range of reasons including to help increase consistency, improve efficiency, reduce costs and enable new solutions to complex problems. === Debate === Research and development are underway into uses of technology to assess argument quality, assess argumentative essays and judge debates. Potential applications of these argument technologies span education and society. Scenarios to consider, in these regards, include those involving the assessment and evaluation of conversational, mathematical, scientific, interpretive, legal, and political argumentation and debate. === Law === In legal systems around the world, algorithmic tools such as risk assessment instruments (RAI), are being used to supplement or replace the human judgment of judges, civil servants and police officers in many contexts. In the United States RAI are being used to generate scores to predict the risk of recidivism in pre-trial detention and sentencing decisions, evaluate parole for prisoners and to predict "hot spots" for future crime. These scores may result in automatic effects or may be used to inform decisions made by officials within the justice system. In Canada ADM has been used since 2014 to automate certain activities conducted by immigration officials and to support the evaluation of some immigrant and visitor applications. === Economics === Automated decision-making systems are used in certain computer programs to create buy and sell orders related to specific financial transactions and automatically submit the orders in the international markets. Computer programs can automatically generate orders based on predefined set of rules using trading strategies which are based on technical analyses, advanced statistical and mathematical computations, or inputs from other electronic sources. === Business === ==== Continuous auditing ==== Continuous auditing uses advanced analytical tools to automate auditing processes. It can be utilized in the private sector by business enterprises and in the public sector by governmental organizations and municipalities. As artificial intelligence and machine learning continue to advance, accountants and auditors may make use of increasingly sophisticated algorithms which make decisions such as those involving determining what is anomalous, whether to notify personnel, and how to prioritize those tasks assigned to personnel. === Media and entertainment === Digital media, entertainment platforms, and information services increasingly provide content to audiences via automated recommender systems based on demographic information, previous selections, collaborative filtering or content-based filtering. This includes music and video platforms, publishing, health information, product databases and search engines. Many recommender systems also provide some agency to users in accepting recommendations and incorporate data-driven algorithmic feedback loops based on the actions of the system user. Large-scale machine learning language models and image creation programs being developed by companies such as OpenAI and Google in the 2020s have restricted access however they are likely to have widespread application in fields such as advertising, copywriting, stock imagery and gra
Artificial general intelligence
Artificial general intelligence (AGI) is a hypothetical type of artificial intelligence that matches or surpasses human capabilities across virtually all cognitive tasks. Beyond AGI, artificial superintelligence (ASI) would outperform the best human abilities across every domain by a wide margin. Unlike artificial narrow intelligence (ANI), whose competence is confined to well‑defined tasks, an AGI system can generalise knowledge, transfer skills between domains, and solve novel problems without task‑specific reprogramming. Creating AGI is a stated goal of technology companies such as OpenAI, Google, xAI, and Meta. A 2020 survey identified 72 active AGI research and development projects across 37 countries. AGI is a common topic in science fiction and futures studies. Contention exists over whether AGI represents an existential risk. Some AI experts and industry figures have stated that mitigating the risk of human extinction posed by AGI should be a global priority. Others find the development of AGI to be in too remote a stage to present such a risk. == Terminology == AGI is also known as strong AI, full AI, human-level AI, human-level intelligent AI, or general intelligent action. The term "artificial general intelligence" was used in 1997 by Mark Gubrud in a discussion of the implications of fully automated military production and operations. A mathematical formalism of AGI named AIXI was proposed in 2000 by Marcus Hutter, who defines intelligence as "an agent’s ability to achieve goals or succeed in a wide range of environments". This type of AGI has also been called "universal artificial intelligence". The term AGI was re-introduced and popularized by Shane Legg and Ben Goertzel around 2002. Some academic sources reserve the term "strong AI" for computer programs that will experience sentience or consciousness. In contrast, weak AI (or narrow AI) can solve a specific problem but lacks general cognitive abilities. Some academic sources use "weak AI" to refer more broadly to any programs that neither experience consciousness nor have a mind in the same sense as humans. Related concepts include artificial superintelligence and transformative AI. An artificial superintelligence (ASI) is a hypothetical type of AGI that is much more generally intelligent than humans, while the notion of transformative AI relates to AI having a large impact on society, for example, similar to the agricultural or industrial revolution. A framework for classifying AGI was proposed in 2023 by Google DeepMind researchers. They define five performance levels of AGI: emerging, competent, expert, virtuoso, and superhuman. For example, a competent AGI is defined as an AI that outperforms 50% of skilled adults in a wide range of non-physical tasks, and a superhuman AGI (i.e., an artificial superintelligence) is similarly defined but with a threshold of 100%. They consider large language models like ChatGPT or LLaMA 2 to be instances of emerging AGI (comparable to unskilled humans). Regarding the autonomy of AGI and associated risks, they define five levels: tool (fully in human control), consultant, collaborator, expert, and agent (fully autonomous). == Characteristics == There is no single agreed-upon definition of intelligence as applied to computers. Computer scientist John McCarthy wrote in 2007: "We cannot yet characterize in general what kinds of computational procedures we want to call intelligent." === Intelligence traits === Researchers generally hold that a system is required to do all of the following to be regarded as an AGI: reason, use strategy, solve puzzles, and make judgments under uncertainty, represent knowledge, including common sense knowledge, plan, learn, communicate in natural language, if necessary, integrate these skills in completion of any given goal. Many interdisciplinary approaches (e.g. cognitive science, computational intelligence, and decision making) consider additional traits such as imagination (the ability to form novel mental images and concepts) and autonomy. Computer-based systems exhibiting these capabilities are now widespread, with modern large language models demonstrating computational creativity, automated reasoning, and decision support simultaneously across domains. === Physical traits === Other capabilities are considered desirable in intelligent systems, as they may affect intelligence or aid in its expression. These include: the ability to sense (e.g. see, hear, etc.), and the ability to act (e.g. move and manipulate objects, change location to explore, etc.) This includes the ability to detect and respond to hazard. === Tests for human-level AGI === Several tests meant to confirm human-level AGI have been considered. ==== Turing test ==== The Turing test was proposed by Alan Turing in his 1950 paper "Computing Machinery and Intelligence". This test involves a human judge engaging in natural language conversations with both a human and a machine designed to generate human-like responses. The machine passes the test if it can convince the judge that it is human a significant fraction of the time. Turing proposed this as a practical measure of machine intelligence, focusing on the ability to produce human-like responses rather than on the internal workings of the machine. The idea of the test is that the machine has to try and pretend to be a man, by answering questions put to it, and it will only pass if the pretence is reasonably convincing. A considerable portion of a jury, who should not be experts about machines, must be taken in by the pretence. In 2014, a chatbot named Eugene Goostman, designed to imitate a 13-year-old Ukrainian boy, reportedly passed a Turing Test event by convincing 33% of judges that it was human. However, this claim was met with significant skepticism from the AI research community, who questioned the test's implementation and its relevance to AGI. A 2025 pre‑registered, three‑party Turing‑test study by Cameron R. Jones and Benjamin K. Bergen showed that GPT-4.5 was judged to be the human in 73% of five‑minute text conversations—surpassing the 67% humanness rate of real confederates and meeting the researchers' criterion for having passed the test. ==== Ikea test ==== The "Ikea test", also known as the Flat Pack Furniture Test, involves an AI controlling a robot which attempts to assemble an Ikea flat-pack furniture product after having been shown the parts and instructions. As early as 2013, MIT's IkeaBot demonstrated fully autonomous multi-robot assembly of an IKEA Lack table in ten minutes, with no human intervention and no pre-programmed assembly instructions. The robots inferred the assembly sequence from the geometry of the parts alone. ==== Coffee test ==== Steve Wozniak proposed a test where a machine is required to enter an average American home and figure out how to make coffee. It must find the coffee machine, find the coffee, add water, find a mug, and brew the coffee by pushing the proper buttons. This test has been substantially approached across multiple systems. In January 2024, Figure AI's Figure 01 humanoid learned to operate a Keurig coffee machine autonomously after watching video demonstrations, using end-to-end neural networks to translate visual input into motor actions. In 2025, researchers at the University of Edinburgh published the ELLMER framework in Nature Machine Intelligence, demonstrating a robotic arm that interprets verbal instructions, analyses its surroundings, and autonomously makes coffee in dynamic kitchen environments — adapting to unforeseen obstacles in real time rather than following pre-programmed sequences. ==== Suleyman's test ==== Mustafa Suleyman's test proposes giving an AI model US$100,000 and asking it to obtain US$1 million. ==== Use of video-games ==== Adams, et al. propose that the ability to learn and succeed in a wide range of video games can be used to test AI intelligence. This range would include games unknown to the AGI developers before the test is administered. === AI-complete problems === A problem is informally called "AI-complete" or "AI-hard" if it is believed that AGI would be needed to solve it, because the solution is beyond the capabilities of a purpose-specific algorithm. == History == === Classical AI === Modern AI research began in the mid-1950s. The first generation of AI researchers were convinced that artificial general intelligence was possible and that it would exist in just a few decades. AI pioneer Herbert A. Simon wrote in 1965: "machines will be capable, within twenty years, of doing any work a man can do". Their predictions were the inspiration for Stanley Kubrick and Arthur C. Clarke's fictional character HAL 9000, who embodied what AI researchers believed they could create by the year 2001. AI pioneer Marvin Minsky was a consultant on the project of making HAL 9000 as realistic as possible according to the consensus predictions of the time. He said in 1967, "Within a generation... the problem of
WomanStats Project
The WomanStats Project is a donor-funded research and database project housed at Brigham Young University that "seeks to collect detailed statistical data on the status of women around the world, and to connect that data with data on the security of states." The WomanStats Database aims to provide a comprehensive compilation of information on the status of women in the world. Coders comb the extant literature and conduct expert interviews to find qualitative and quantitative information on over 300 indicators of women's status in 174 countries with populations of at least 200,000. Access to the online database is free. == History and structure == WomanStats began as an outgrowth of a paper Dr. Valerie M. Hudson (of the Brigham Young University Political Science department) and one of her graduate students, Andrea den Boer, published in International Security on the association between national security and the abnormal sex ratio in Asia. After the success and influence of their first article, (later added as one of their top twenty national security articles of that journal of all time), Hudson and den Boer did further research on the connection between the status of women and national security, but found that there was no single database that covered the range of topics that they needed for their research. Consequently, they began compiling information on variables regarding the status of women around the world. The database was officially formed in 2001 and grew exponentially as it later added more variables. The Project went live on the Internet in July 2007. The principal investigators are: Valerie M. Hudson (International Relations), Bonnie Ballif-Spanvill (Psychology, emeritus), and Chad F. Emmett (Geography) all from Brigham Young University, Mary Caprioli from the University of Minnesota, Duluth (International Relations), Rose McDermott from Brown University (International Relations), Andrea Den Boer from the University of Kent at Canterbury in the United Kingdom (International Relations) and S. Matthew Stearmer from the Ohio State University (Sociology; doctoral student). Approximately a dozen undergraduate and graduate students at Brigham Young University and Texas A&M University work at any one time as coders for the project. The coders take the raw quantitative and qualitative data collected in government reports, news articles, research papers, etc. and sort the applicable information on women into categories. They may also implement scales developed by the principal investigators, or that they (the students) themselves have developed. == Database == As of February 2011, the database has 307 variables, covers 174 nations with populations over 200,000, uses 18,015 sources and contains over 111,000 individual data points. All data is referenced to original sources. Not every variable has information for each country; similarly, not all countries have information for each variable: overall, about 70% of country-variable combinations have information. These database coding gaps exist where information is not available or is incomplete, or variables are not collected and reported by governments or international organizations. At times, information from different sources may be contradictory, and the WomanStats Database records this discrepant information for triangulation purposes. == Users and role of the database == The database is meant to help fill a hole in the extant data on the situation of women around the world. WomanStats data and research has been vetted and/or used by the United Nations, the United States Department of Defense, the Central Intelligence Agency, and the World Bank. Their data and research were also used by the United States Senate Committee on Foreign Relations in crafting the International Violence Against Women’s Act. The Inter-Agency Network on Women and Gender Equality (IANWGE) of the United Nations has stated that the WomanStats project "filled a major gap in the availability of data on women" (2007). Victor Asal and Mitchell Brown, researchers not affiliated with WomanStats, stated in an article published in Politics and Policy that "one of the most significant challenges of cross-national empirical studies of the prevalence of interpersonal violence is the paucity of available data, particularly reliable data," and that "WomanStats has allowed for an important first glimpse at analyzing the factors related to interpersonal violence." They conclude by stating that "Our findings suggest that, in the same way that larger disciplinary resources have invested in interstate and intrastate war, disciplinary resources need to be expended in creating a data set exploring interpersonal violence. Until the rights and the lives of women and children are taken as seriously as the survival of states by more proactively collaborating on projects like WomanStats, we will continue to only have a small lens through which to understand problems like this." Princeton University professor Evan S. Liberman wrote, "Although data on political regimes and group conflict have been in far greater demand by political scientists than data on gender politics and policies, two gender-related databases provide...examples of innovative HIRDs. Both the Womanstats database project (Hudson et al. 2009) and the Research Network on Gender Politics and the State (RNGS) project (McBride et al. 2008) are well-integrated presentations of quantitative and qualitative data characterizing the quality of gender relations around the world and, in particular, analytic descriptions of the treatment of women."." == Research == The research component of WomanStats focuses on exploring the relationship between the situation of women and the behavior and security of states. Current research initiatives include: Exploring the relationship between violent instability and inequity and family law. Examining the effect of polygyny and marriage market dislocations on the rise of suicide terrorism. Documenting discrepancies between laws on the books and cultural practices on the ground concerning gender issues. Investigating how well the situation of women predicts the peacefulness of nations-states, compared to their variables such as democracy, wealth, and civilization. The Project has published articles in International Security, International Studies Quarterly, Peace and Conflict, Journal of Peace Research, Political Psychology, Cumberland Law Review, and World Political Review, and has a forthcoming book from Columbia University Press.
Meta-learning (computer science)
Meta-learning is a subfield of machine learning where automatic learning algorithms are applied to metadata about machine learning experiments. As of 2017, the term had not found a standard interpretation, however the main goal is to use such metadata to understand how automatic learning can become flexible in solving learning problems, hence to improve the performance of existing learning algorithms or to learn (induce) the learning algorithm itself, hence the alternative term learning to learn. Flexibility is important because each learning algorithm is based on a set of assumptions about the data, its inductive bias. This means that it will only learn well if the bias matches the learning problem. A learning algorithm may perform very well in one domain, but not on the next. This poses strong restrictions on the use of machine learning or data mining techniques, since the relationship between the learning problem (often some kind of database) and the effectiveness of different learning algorithms is not yet understood. By using different kinds of metadata, like properties of the learning problem, algorithm properties (like performance measures), or patterns previously derived from the data, it is possible to learn, select, alter or combine different learning algorithms to effectively solve a given learning problem. Critiques of meta-learning approaches bear a strong resemblance to the critique of metaheuristic, a possibly related problem. A good analogy to meta-learning, and the inspiration for Jürgen Schmidhuber's early work (1987) and Yoshua Bengio et al.'s work (1991), considers that genetic evolution learns the learning procedure encoded in genes and executed in each individual's brain. In an open-ended hierarchical meta-learning system using genetic programming, better evolutionary methods can be learned by meta evolution, which itself can be improved by meta meta evolution, etc. == Definition == A proposed definition for a meta-learning system combines three requirements: The system must include a learning subsystem. Experience is gained by exploiting meta knowledge extracted in a previous learning episode on a single dataset, or from different domains. Learning bias must be chosen dynamically. Bias refers to the assumptions that influence the choice of explanatory hypotheses and not the notion of bias represented in the bias-variance dilemma. Meta-learning is concerned with two aspects of learning bias. Declarative bias specifies the representation of the space of hypotheses, and affects the size of the search space (e.g., represent hypotheses using linear functions only). Procedural bias imposes constraints on the ordering of the inductive hypotheses (e.g., preferring smaller hypotheses). == Common approaches == There are three common approaches: using (cyclic) networks with external or internal memory (model-based) learning effective distance metrics (metrics-based) explicitly optimizing model parameters for fast learning (optimization-based). === Model-Based === Model-based meta-learning models updates its parameters rapidly with a few training steps, which can be achieved by its internal architecture or controlled by another meta-learner model. ==== Memory-Augmented Neural Networks ==== A Memory-Augmented Neural Network, or MANN for short, is claimed to be able to encode new information quickly and thus to adapt to new tasks after only a few examples. ==== Meta Networks ==== Meta Networks (MetaNet) learns a meta-level knowledge across tasks and shifts its inductive biases via fast parameterization for rapid generalization. === Metric-Based === The core idea in metric-based meta-learning is similar to nearest neighbors algorithms, which weight is generated by a kernel function. It aims to learn a metric or distance function over objects. The notion of a good metric is problem-dependent. It should represent the relationship between inputs in the task space and facilitate problem solving. ==== Convolutional Siamese Neural Network ==== Siamese neural network is composed of two twin networks whose output is jointly trained. There is a function above to learn the relationship between input data sample pairs. The two networks are the same, sharing the same weight and network parameters. ==== Matching Networks ==== Matching Networks learn a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. ==== Relation Network ==== The Relation Network (RN), is trained end-to-end from scratch. During meta-learning, it learns to learn a deep distance metric to compare a small number of images within episodes, each of which is designed to simulate the few-shot setting. ==== Prototypical Networks ==== Prototypical Networks learn a metric space in which classification can be performed by computing distances to prototype representations of each class. Compared to recent approaches for few-shot learning, they reflect a simpler inductive bias that is beneficial in this limited-data regime, and achieve satisfied results. === Optimization-Based === What optimization-based meta-learning algorithms intend for is to adjust the optimization algorithm so that the model can be good at learning with a few examples. ==== LSTM Meta-Learner ==== LSTM-based meta-learner is to learn the exact optimization algorithm used to train another learner neural network classifier in the few-shot regime. The parametrization allows it to learn appropriate parameter updates specifically for the scenario where a set amount of updates will be made, while also learning a general initialization of the learner (classifier) network that allows for quick convergence of training. ==== Temporal Discreteness ==== Model-Agnostic Meta-Learning (MAML) is a fairly general optimization algorithm, compatible with any model that learns through gradient descent. ==== Reptile ==== Reptile is a remarkably simple meta-learning optimization algorithm, given that both of its components rely on meta-optimization through gradient descent and both are model-agnostic. == Examples == Some approaches which have been viewed as instances of meta-learning: Recurrent neural networks (RNNs) are universal computers. In 1993, Jürgen Schmidhuber showed how "self-referential" RNNs can in principle learn by backpropagation to run their own weight change algorithm, which may be quite different from backpropagation. In 2001, Sepp Hochreiter & A.S. Younger & P.R. Conwell built a successful supervised meta-learner based on Long short-term memory RNNs. It learned through backpropagation a learning algorithm for quadratic functions that is much faster than backpropagation. Researchers at Deepmind (Marcin Andrychowicz et al.) extended this approach to optimization in 2017. In the 1990s, Meta Reinforcement Learning or Meta RL was achieved in Schmidhuber's research group through self-modifying policies written in a universal programming language that contains special instructions for changing the policy itself. There is a single lifelong trial. The goal of the RL agent is to maximize reward. It learns to accelerate reward intake by continually improving its own learning algorithm which is part of the "self-referential" policy. An extreme type of Meta Reinforcement Learning is embodied by the Gödel machine, a theoretical construct which can inspect and modify any part of its own software which also contains a general theorem prover. It can achieve recursive self-improvement in a provably optimal way. Model-Agnostic Meta-Learning (MAML) was introduced in 2017 by Chelsea Finn et al. Given a sequence of tasks, the parameters of a given model are trained such that few iterations of gradient descent with few training data from a new task will lead to good generalization performance on that task. MAML "trains the model to be easy to fine-tune." MAML was successfully applied to few-shot image classification benchmarks and to policy-gradient-based reinforcement learning. Variational Bayes-Adaptive Deep RL (VariBAD) was introduced in 2019. While MAML is optimization-based, VariBAD is a model-based method for meta reinforcement learning, and leverages a variational autoencoder to capture the task information in an internal memory, thus conditioning its decision making on the task. When addressing a set of tasks, most meta learning approaches optimize the average score across all tasks. Hence, certain tasks may be sacrificed in favor of the average score, which is often unacceptable in real-world applications. By contrast, Robust Meta Reinforcement Learning (RoML) focuses on improving low-score tasks, increasing robustness to the selection of task. RoML works as a meta-algorithm, as it can be applied on top of other meta learning algorithms (such as MAML and VariBAD) to increase their robustness. It is applicable to both supervised meta learning and meta reinforcement learning. Discovering meta-knowledge works by inducing knowledge