Markovian discrimination

Markovian discrimination

Markovian discrimination is a class of spam filtering methods used in CRM114 and other spam filters to filter based on statistical patterns of transition probabilities between words or other lexical tokens in spam messages that would not be captured using simple bag-of-words naive Bayes spam filtering. == Markovian Discrimination vs. Bag-of-Words Discrimination == A bag-of-words model contains only a dictionary of legal words and their relative probabilities in spam and genuine messages. A Markovian model additionally includes the relative transition probabilities between words in spam and in genuine messages, where the relative transition probability is the likelihood that a given word will be written next, based on what the current word is. Put another way, a bag-of-words filter discriminates based on relative probabilities of single words alone regardless of phrase structure, while a Markovian word-based filter discriminates based on relative probabilities of either pairs of words, or, more commonly, short sequences of words. This allows the Markovian filter greater sensitivity to phrase structure. Neither naive Bayes nor Markovian filters are limited to the word level for tokenizing messages. They may also process letters, partial words, or phrases as tokens. In such cases, specific bag-of-words methods would correspond to general bag-of-tokens methods. Modelers can parameterize Markovian spam filters based on the relative probabilities of any such tokens' transitions appearing in spam or in legitimate messages. == Visible and Hidden Markov Models == There are two primary classes of Markov models, visible Markov models and hidden Markov models, which differ in whether the Markov chain generating token sequences is assumed to have its states fully determined by each generated token (the visible Markov models) or might also have additional state (the hidden Markov models). With a visible Markov model, each current token is modeled as if it contains the complete information about previous tokens of the message relevant to the probability of future tokens, whereas a hidden Markov model allows for more obscure conditional relationships. Since those more obscure conditional relationships are more typical of natural language messages including both genuine messages and spam, hidden Markov models are generally preferred over visible Markov models for spam filtering. Due to storage constraints, the most commonly employed model is a specific type of hidden Markov model known as a Markov random field, typically with a 'sliding window' or clique size ranging between four and six tokens.

Vismon

Vismon was the Bell Labs system which displayed authors' faces on one of their internal e-mail systems. The name was a pun on the sysmon program used at Bell to show the load on computer systems. It can also be interpreted as "visual monitor". The system inspired Rich Burridge to develop the similar but more widespread faces system, which spread with Unix distributions in the 1980s. This in turn inspired Steve Kinzler to develop the Picons, or personal icons, which have the goal of offering symbols and other images, as well as faces, to represent individuals and institutions in email messages. Other systems such as the faces available on the LAN email functions of the NeXTSTEP platform also seem to have been influenced by the original Vismon capabilities. The faces program in Plan 9 is the direct descendant of this system. Vismon was the work of Rob Pike and Dave Presotto. It was based on some early experiments by Luca Cardelli. Many other scientists and engineers of the Computing Science Research Center of the Murray Hill facility were also involved. All had been spurred by the introduction in 1983 of the new Blit graphics terminal developed by Pike and Bart Locanthi and marketed by Teletype Corporation of Skokie, Illinois as the DMD 5620. Pike was eager, along with his colleagues, to exploit the new graphic capabilities. Pike and company went around their Center, convincing everybody, from directors and administrative assistants to engineers and scientists, to pose as they got out a 4×5 view camera with a Polaroid back and took black-and-white photos (Polaroid type 52) of their faces. Their efforts yielded nearly 100 faces, which they digitised with a scanner from graphics colleagues. They wrote several programs to transform the faces, store them and serve them on several machines at the lab. As time went by, they added faces from outside their Center and outside Bell Labs. This database also led to the pico image editor (originally named zunk) which was used for image transformations, many of them with colleagues as the preferred target. The first programs built around vismon were used to announce incoming mail in a dedicated window, using the 48 by 48 pixel faces. Later on the faces were also used to decorate line printer banners.

Radiant AI

The Radiant AI is a technology developed by Bethesda Softworks for The Elder Scrolls video games. It allows non-player characters (NPCs) to make choices and engage in behaviors more complex than in past titles. The technology was developed for The Elder Scrolls IV: Oblivion and expanded in The Elder Scrolls V: Skyrim; it is also used in Fallout 3, Fallout: New Vegas and Fallout 4, also published by Bethesda, with 3 and 4 being developed by them as well. == Technology == The Radiant AI technology, as it evolved in its iteration developed for Skyrim, comprises two parts: === Radiant AI === The Radiant AI system deals with NPC interactions and behavior. It allows non-player characters to dynamically react to and interact with the world around them. General goals, such as "Eat in this location at 2pm" are given to NPCs, and NPCs are left to determine how to achieve them. The absence of individual scripting for each character allows for the construction of a world on a much larger scale than other games had developed, and aids in the creation of what Todd Howard described as an "organic feel" for the game. === Radiant Story === The Radiant Story system deals with how the game itself reacts to the player behavior, such as the creation of new dynamic quests. Dynamically generated quests are placed by the game in locations the player hasn't visited yet and are related to earlier adventures.

Chinese room

The Chinese room argument holds that a computer executing a program cannot have a mind, understanding, or consciousness, regardless of how intelligently or human-like the program may make the computer behave. The argument was presented in a 1980 paper by the American philosopher John Searle, entitled "Minds, Brains, and Programs" and published in the journal Behavioral and Brain Sciences. Similar arguments had been made previously by others, including Gottfried Wilhelm Leibniz, Peter Winch, and Anatoly Dneprov. Searle's version has been widely discussed in the years since. The centerpiece of Searle's argument is a thought experiment known as the "Chinese room". The argument is directed against the philosophical positions of functionalism and computationalism, which hold that the mind may be viewed as an information-processing system operating on formal symbols, and that simulation of a given mental state is sufficient for its presence. Specifically, the argument is intended to refute a position Searle calls the strong AI hypothesis: "The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds." Although its proponents originally presented the argument in reaction to statements of artificial intelligence (AI) researchers, it is not an argument against the goals of mainstream AI research because it does not show a limit in the amount of intelligent behavior a machine can display. The argument applies only to digital computers running programs and does not apply to machines in general. While widely discussed, the argument has been subject to significant criticism and remains controversial among philosophers of mind and AI researchers. == Chinese room thought experiment == Suppose that artificial intelligence research has succeeded in programming a computer to behave as if it understands Chinese. The machine accepts Chinese characters as input, carries out each instruction of the program step by step, and then produces Chinese characters as output. The machine does this so perfectly that no one can tell that they are communicating with a machine and not a hidden Chinese speaker. The questions at issue are these: does the machine actually understand the conversation, or is it just simulating the ability to understand the conversation? Does the machine have a mind in exactly the same sense that people do, or is it just acting as if it had a mind? Now suppose that Searle is in a room with an English version of the program, along with sufficient pencils, paper, erasers and filing cabinets. Chinese characters are slipped in under the door, and he follows the program step-by-step, which eventually instructs him to slide other Chinese characters back out under the door. If the computer had passed the Turing test this way, it follows that Searle would do so as well, simply by running the program by hand. Searle can see no essential difference between the roles of the computer and himself in the experiment. Each simply follows a program, step-by-step, producing behavior that makes them appear to understand. However, Searle would not be able to understand the conversation. Therefore, he argues, it follows that the computer would not be able to understand the conversation either. Searle argues that, without "understanding" (or "intentionality"), we cannot describe what the machine is doing as "thinking" and, since it does not think, it does not have a "mind" in the normal sense of the word. Therefore, he concludes that the strong AI hypothesis is false: a computer running a program that simulates a mind would not have a mind in the same sense that human beings have a mind. == History == Gottfried Wilhelm Leibniz made a similar argument in 1713 against mechanism, the idea that everything that makes up a human being could, in principle, be explained in mechanical terms—in other words, that a person, including their mind, is merely a very complex machine. Leibniz used the thought experiment of expanding the brain until it was the size of a mill. He found it difficult to imagine that a "mind" capable of "perception" could be constructed using only mechanical processes. British philosopher Peter Winch made the same point in his 1958 book The Idea of a Social Science and its Relation to Philosophy, in which he argues that "a man who understands Chinese is not a man who has a firm grasp of the statistical probabilities for the occurrence of the various words in the Chinese language" (p. 108). Soviet cyberneticist Anatoly Dneprov made an essentially identical argument in 1961, in the form of his short story "The Game". In it, a stadium of people act as switches and memory cells implementing a program to translate a sentence from Portuguese, a language none of them know. The game was organized by a "Professor Zarubin" to answer the question "Can mathematical machines think?" Speaking through Zarubin, Dneprov writes that "the only way to prove that machines can think is to turn yourself into a machine and examine your thinking process", and he concludes, as Searle does, that "even the most perfect simulation of machine thinking is not the thinking process itself." In 1974, Lawrence H. Davis imagined duplicating the brain using telephone lines and offices staffed by people, and in 1978, Ned Block envisioned the entire population of China involved in such a brain simulation. This is known as the China brain thought experiment. Searle's version appeared in his 1980 article "Minds, Brains, and Programs", published in Behavioral and Brain Sciences. It eventually became the journal's "most influential target article", generating an enormous number of commentaries and responses in the ensuing decades, and Searle had continued to defend and refine the argument in multiple papers, popular articles, and books. David Cole writes that "the Chinese Room argument has probably been the most widely discussed philosophical argument in cognitive science to appear in the past 25 years". Most of the discussion consists of attempts to refute it. "The overwhelming majority", notes Behavioral and Brain Sciences editor Stevan Harnad, "still think that the Chinese Room Argument is dead wrong". The sheer volume of the literature that has grown up around it inspired Pat Hayes to comment that the field of cognitive science ought to be redefined as "the ongoing research program of showing Searle's Chinese Room Argument to be false". Searle's argument has become "something of a classic in cognitive science", according to Harnad. Varol Akman agrees, and has described the original paper as "an exemplar of philosophical clarity and purity". == Philosophy == Although the Chinese Room argument was originally presented in reaction to the statements of artificial intelligence researchers, philosophers have come to consider it as an important part of the philosophy of mind. It is a challenge to functionalism and the computational theory of mind, and is related to such questions as the mind–body problem, the problem of other minds, the symbol grounding problem, and the hard problem of consciousness. === Strong AI === Searle identified a philosophical position he calls "strong AI": The appropriately programmed computer with the right inputs and outputs would thereby have a mind in exactly the same sense human beings have minds. The definition depends on the distinction between simulating a mind and actually having one. Searle writes that "according to Strong AI, the correct simulation really is a mind. According to Weak AI, the correct simulation is a model of the mind." The claim is implicit in some of the statements of early AI researchers and analysts. For example, in 1957, the economist and psychologist Herbert A. Simon declared that "there are now in the world machines that think, that learn and create". Simon, together with Allen Newell and Cliff Shaw, after having completed the first program that could do formal reasoning (the Logic Theorist), claimed that they had "solved the venerable mind–body problem, explaining how a system composed of matter can have the properties of mind." John Haugeland wrote that "AI wants only the genuine article: machines with minds, in the full and literal sense. This is not science fiction, but real science, based on a theoretical conception as deep as it is daring: namely, we are, at root, computers ourselves." Searle also ascribes the following claims to advocates of strong AI: AI systems can be used to explain the mind; The study of the brain is irrelevant to the study of the mind; and The Turing test is adequate for establishing the existence of mental states. === Strong AI as computationalism or functionalism === In more recent presentations of the Chinese room argument, Searle has identified "strong AI" as "computer functionalism" (a term he attributes to Daniel Dennett). Functionalism is a position in modern philosophy of mind that holds that we can define menta

Social History and Industrial Classification

Social History and Industrial Classification (SHIC) is a classification system used by many British museums for social history and industrial collections. It was first published in 1983. == Purpose == SHIC classifies materials (books, objects, recordings etc.) by their interaction with the people who used them. For example, a carpenter's hammer is classified with other tools of the carpenter, and not with a blacksmith's hammer. In contrast other classification systems, for example the Dewey Decimal Classification, might class all hammers together and close to the classification for other percussive tools. The specialist subject network, Social History Curator's Group (SHCG), obtained funding in 2012 to develop an on-line version, now on their website http://www.shcg.org.uk/ == Scheme == Materials are classified under four major category numbers: Community life Domestic and family life Personal life Working life Further classification within a category is by the use of further numbers after the decimal point. It is permissible to assign more than one classification in cases where the object had more than one use.

Grammar systems theory

Grammar systems theory is a field of theoretical computer science that studies systems of finite collections of formal grammars generating a formal language. Each grammar works on a string, a so-called sequential form that represents an environment. Grammar systems can thus be used as a formalization of decentralized or distributed systems of agents in artificial intelligence. Let A {\displaystyle \mathbb {A} } be a simple reactive agent moving on the table and trying not to fall down from the table with two reactions, t for turning and ƒ for moving forward. The set of possible behaviors of A {\displaystyle \mathbb {A} } can then be described as formal language L A = { ( f m t n f r ) + : 1 ≤ m ≤ k ; 1 ≤ n ≤ ℓ ; 1 ≤ r ≤ k } , {\displaystyle \mathbb {L_{A}} =\{(f^{m}t^{n}f^{r})^{+}:1\leq m\leq k;1\leq n\leq \ell ;1\leq r\leq k\},} where ƒ can be done maximally k times and t can be done maximally ℓ times considering the dimensions of the table. Let G A {\displaystyle \mathbb {G_{A}} } be a formal grammar which generates language L A {\displaystyle \mathbb {L_{A}} } . The behavior of A {\displaystyle \mathbb {A} } is then described by this grammar. Suppose the A {\displaystyle \mathbb {A} } has a subsumption architecture; each component of this architecture can be then represented as a formal grammar, too, and the final behavior of the agent is then described by this system of grammars. The schema on the right describes such a system of grammars which shares a common string representing an environment. The shared sequential form is sequentially rewritten by each grammar, which can represent either a component or generally an agent. If grammars communicate together and work on a shared sequential form, it is called a Cooperating Distributed (DC) grammar system. Shared sequential form is a similar concept to the blackboard approach in AI, which is inspired by an idea of experts solving some problem together while they share their proposals and ideas on a shared blackboard. Each grammar in a grammar system can also work on its own string and communicate with other grammars in a system by sending their sequential forms on request. Such a grammar system is then called a Parallel Communicating (PC) grammar system. PC and DC are inspired by distributed AI. If there is no communication between grammars, the system is close to the decentralized approaches in AI. These kinds of grammar systems are sometimes called colonies or Eco-Grammar systems, depending (besides others) on whether the environment is changing on its own (Eco-Grammar system) or not (colonies).

UMBEL

UMBEL (Upper Mapping and Binding Exchange Layer) is a logically organized knowledge graph of 34,000 concepts and entity types that can be used in information science for relating information from disparate sources to one another. It was retired at the end of 2019. UMBEL was first released in July 2008. Version 1.00 was released in February 2011. Its current release is version 1.50. The grounding of this information occurs by common reference to the permanent URIs for the UMBEL concepts; the connections within the UMBEL upper ontology enable concepts from sources at different levels of abstraction or specificity to be logically related. Since UMBEL is an open-source extract of the OpenCyc knowledge base, it can also take advantage of the reasoning capabilities within Cyc. UMBEL has two means to promote the semantic interoperability of information:. It is: An ontology of about 35,000 reference concepts, designed to provide common mapping points for relating different ontologies or schema to one another, and A vocabulary for aiding that ontology mapping, including expressions of likelihood relationships distinct from exact identity or equivalence. This vocabulary is also designed for interoperable domain ontologies. UMBEL is written in the Semantic Web languages of SKOS and OWL 2. It is a class structure used in Linked Data, along with OpenCyc, YAGO, and the DBpedia ontology. Besides data integration, UMBEL has been used to aid concept search, concept definitions, query ranking, ontology integration, and ontology consistency checking. It has also been used to build large ontologies and for online question answering systems. Including OpenCyc, UMBEL has about 65,000 formal mappings to DBpedia, PROTON, GeoNames, and schema.org, and provides linkages to more than 2 million Wikipedia pages (English version). All of its reference concepts and mappings are organized under a hierarchy of 31 different "super types", which are mostly disjoint from one another. Each of these "super types" has its own typology of entity classes to provide flexible tie-ins for external content. 90% of UMBEL is contained in these entity classes.