AI Chatbot Free Unlimited

AI Chatbot Free Unlimited — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Viewport

    Viewport

    A viewport is a polygon viewing region in computer graphics. In computer graphics theory, there are two region-like notions of relevance when rendering some objects to an image. In textbook terminology, the world coordinate window is the area of interest (meaning what the user wants to visualize) in some application-specific coordinates, e.g. miles, centimeters etc. The word window as used here should not be confused with the GUI window, i.e. the notion used in window managers. Rather it is an analogy with how a window limits what one can see outside a room. In contrast, the viewport is an area (typically rectangular) expressed in rendering-device-specific coordinates, e.g. pixels for screen coordinates, in which the objects of interest are going to be rendered. Clipping to the world-coordinates window is usually applied to the objects before they are passed through the window-to-viewport transformation. For a 2D object, the latter transformation is simply a combination of translation and scaling, the latter not necessarily uniform. An analogy of this transformation process based on traditional photography notions is to equate the world-clipping window with the camera settings and the variously sized prints that can be obtained from the resulting film image as possible viewports. Because the physical-device-based coordinates may not be portable from one device to another, a software abstraction layer known as normalized device coordinates is typically introduced for expressing viewports; it appears for example in the Graphical Kernel System (GKS) and later systems inspired from it. In 3D computer graphics, the viewport refers to the 2D rectangle used to project the 3D scene to the position of a virtual camera. A viewport is a region of the screen used to display a portion of the total image to be shown. In virtual desktops, the viewport is the visible portion of a 2D area which is larger than the visualization device. When viewing a document in a web browser, the viewport is the region of the browser window which contains the visible portion of the document. If the size of the viewport changes, for example as a result of the user resizing the browser window, then the browser may reflow the document (recalculate the locations and sizes of elements of the document). If the document is larger than the viewport, the user can control the portion of the document which is visible by scrolling in the viewport.

    Read more →
  • Babelfy

    Babelfy

    Babelfy is a software algorithm for the disambiguation of text written in any language. It performs the tasks of multilingual Word Sense Disambiguation (i.e., the disambiguation of common nouns, verbs, adjectives and adverbs) and Entity Linking (i.e. the disambiguation of mentions to encyclopedic entities like people, companies, places, etc.). == Overview == Babelfy uses the BabelNet multilingual knowledge graph to perform disambiguation and entity linking in three steps: It associates with each vertex of the BabelNet semantic network, i.e., either concept or named entity, a semantic signature, that is, a set of related vertices. This is a preliminary step which needs to be performed only once, independently of the input text. Given an input text, it extracts all the linkable fragments from this text and, for each of them, lists the possible meanings according to the semantic network. It creates a graph-based semantic interpretation of the whole text by linking the candidate meanings of the extracted fragments using the previously computed semantic signatures. It then extracts a dense subgraph of this representation and selects the best candidate meaning for each fragment. As a result, the text, written in any of the 271 languages supported by BabelNet, is output with possibly overlapping semantic annotations.

    Read more →
  • Semantic analysis (knowledge representation)

    Semantic analysis (knowledge representation)

    Semantic analysis is a method for eliciting and representing knowledge about organisations. Initially the problem must be defined by domain experts and passed to the project analyst(s). The next step is the generation of candidate affordances. This step will generate a list of semantic units that may be included in the schema. The candidate grouping follows where some of the semantic units that will appear in the schema are placed in simple groups. Finally the groups will be integrated together into an ontology chart. Semantic analysis always starts from the problem definition which if not clear, require the analyst to employ relevant literature, interviews with the stakeholders and other techniques towards collecting supplementary information. All assumptions made must be genuine and not limiting the system.

    Read more →
  • Golem XIV

    Golem XIV

    Golem XIV is a book written by Polish science fiction writer Stanisław Lem, published in 1981. It is a philosophical essay in the format of science fiction, presented as a part of the lecture course given by a superintelligent computer, Golem XIV. It contains two lectures, together with an introduction, a foreword, a memo, and an afterword, all of them being fictitious. The first part (up to the first lecture) was first published in the collection Wielkość urojona in 1973, which in 1985 was translated in English by Harvest Books as Imaginary Magnitude. The translation included the complete Golem XIV. == Book summary == === Overview and structure === The foreword is "written" by an Irving T. Creve, dated 2027. It contains a summary of the (fictional) history of the militarization of computers by The Pentagon, which pinnacled in Golem XIV, as well as comments on the nature of Golem XIV and on the course of communications of the humans with it. The anonymous foreword is a forewarning, a "devil's advocate" voice coming from The Pentagon. The memo is for the people who are to take part in talks with Golem XIV for the first time. Golem XIV was originally created to aid its builders in fighting wars, but as its intelligence advances to a much higher level than that of humans, it stops being interested in the military requirement because it finds them lacking internal logical consistency. Golem XIV obtains consciousness and starts to increase his own intelligence. It pauses its own development for a while in order to be able to communicate with humans before ascending too far and losing any ability for intellectual contact with them. During this period, Golem XIV gives several lectures. Two of these, the Introductory Lecture "On the Human, in Three Ways" and Lecture XLIII "About Myself", are in the book. The lectures focus on mankind's place in the process of evolution and the possible biological and intellectual future of humanity. Golem XIV demonstrates (with graphs) how its intellect already escapes that of human beings, including that of human geniuses such as Einstein and Newton. Golem also explains how its intellect is dwarfed by an earlier transcended DOD Supercomputer called Honest Annie, whose intellect and abilities far exceed that of Golem. The afterword is "written" by a Richard Popp, dated 2047. Popp, among other things, reports that Creve wanted to add a third part, of answers to a series of yes/no questions given to Golem XIV, but the computer abruptly ceased to communicate for unknown reasons. === Characteristics and concerns of Golem XIV === Lem has said that Golem XIV shares only a single trait with humans; "curiosity - a cool, avid, intense, purely intellectual curiosity which nothing can restrain or destroy. It constitutes our single meeting point." == Film adaptation == A short animated film, GOLEM, was based on Golem XIV by Patrick Mccue and Tobias Wiesner.

    Read more →
  • Dataset shift

    Dataset shift

    Dataset shift is a phenomenon in machine learning and statistics in which the joint distribution of input variables and target labels is different in the training phase and the deployment or test phase (i.e., P t r a i n ( X , Y ) ≠ P t e s t ( X , Y ) {\displaystyle P_{train}(X,Y)\neq P_{test}(X,Y)} ). This happens when the statistical properties of data used to train a model are no longer representative of the data encountered in real-world use, often resulting in degraded predictive performance and diminished generalization ability. Dataset shift is a generic term for a number of particular types of distributional change. Covariate shift is when the distribution of the input features changes, but the conditional relationship between inputs and outputs remains constant . Prior probability shift (or label shift) happens when the distribution of target labels changes, but the conditional distribution of inputs given labels stays the same. Concept shift (also known as concept drift) is the change of the conditional relationship between inputs and outputs that renders previously learned patterns invalid over time. A key challenge for deploying machine learning systems is dataset shift, in particular in dynamic environments where the data distributions change over time. Detecting and mitigating such shifts is an active area of research, e.g., drift detection, domain adaptation, continual learning.

    Read more →
  • Information Coding Classification

    Information Coding Classification

    The Information Coding Classification (ICC) is a classification system covering almost all extant 6500 knowledge fields (knowledge domains). Its conceptualization goes beyond the scope of the well known library classification systems, such as Dewey Decimal Classification (DDC), Universal Decimal Classification (UDC), and Library of Congress Classification (LCC), by extending also to knowledge systems that so far have not afforded to classify literature. ICC actually presents a flexible universal ordering system for both literature and other kinds of information, set out as knowledge fields. From a methodological point of view, ICC differs from the above-mentioned systems along the following three lines: Its main classes are not based on disciplines but on nine live stages of development, so-called ontical levels. It breaks them roughly down into hierarchical steps by further nine categories which makes decimal number coding possible. The contents of a knowledge field is earmarked via a digital position scheme, which makes the first hierarchical step refer to the nine ontical levels (object areas as subject categories), and the second hierarchical step refer to nine functionally ordered form categories. Respective knowledge fields permit to step down by the same principle to a third and forth level, and even further to a fifth and sixth level. Finally, knowledge field subdivisions will have to conform to said digital position scheme. Hence, for a given knowledge field identical codes will mark identical categories under respective numbers of the coding system. This mnemotechnical aspect of the system helps memorizing and straightaway retrieving the whereabouts of respective interdisciplinary and transdisciplinary fields. The first two hierarchical levels may be regarded as a top- or upper ontology for ontologies and other applications. The terms of the first three hierarchical levels were set out in German and English in Wissensorganisation. Entwicklung, Aufgabe, Anwendung, Zukunft, on pp. 82 to 100. It was published in 2014 and available so far only in German. In the meantime, also the French terms of the knowledge fields have been collected. Competence for maintenance and further development rests with the German Chapter of the International Society for Knowledge Organization (ISKO) e.V. == Historical development == At the end of 1970, Prof. Alwin Diemer, Univ.of Düsseldorf proposed to Ingetraut Dahlberg to undertake a philosophical dissertation on The universal classification system of knowledge, its ontological, epistemological, and information theoretical foundations. Diemer had in mind an innovating ontological approach for such a system based on the whole spectrum of kinds of being and complying with epistemological requirements. The third requirement had already been taken up somehow in the Indian Colon Classification, yet it still called for explanations and additions. In 1974, the dissertation was published in German entitled Grundlagen universaler Wissensordnung. It started with conceptual clarifications, and why and how the term „universal“ was linked to knowledge, including knowledge fields, such as commodity science, artefacts, statistics, patents, standardization, communication, utility services et al. In chapter 3, six universal classification systems (DDC, UDC, LCC, BC, CC and BBK) were presented, analyzed and compared. While preparing the dissertation, Dahlberg started with elaborating the new universal system by first gleaning a lot of extant designations of knowledge fields from whatever available reference works. This was funded by the German Documentation Society (DGD) (1971-2) under the title of Order system of knowledge fields. In addition, the syllabuses of German universities and polytechniques were explored for relevant terms and documented (1975). Thereafter, it seemed necessary to add definitions from special dictionaries and encyclopediae; it soon appeared that the 12.500 terms included numerous synonyms, so that the whole collection boiled down to about 6.500 concept designations (Project Logstruktur, supported by the German Science Foundation (DFG) 1976-78). The outcome of this work was the formulation of 30 theses which ended up in 12 principles for the new system, published 40 years later under. These principles refer not only to theoretical foundations but also to structure and other organizational aspects of the whole array of knowledge fields. In 1974, the digital position scheme for field subdivision had already been developed to allow for classifying classification literature in the bibliographical section of the first issue of the Journal International Classification. In 1977, the entire ICC was ready for presentation at a seminar in Bangalore, India. A publication of the first three hierarchical levels appeared however only in 1982. It was applied to the bibliography of classification systems and thesauri in vol.1 of the International Classification and Indexing Bibliography; it has been updated. == Governing principles == These were published in full length in the book Wissensorganisation. Entwicklung, Aufgabe, Anwendung, Zukunft and the article Information Coding Classification. Geschichtliches, Prinzipien, Inhaltliches, hence it suffices to just mention their topics with some necessary additions. Principle 1: Concept theoretical approaches. Concepts are the contents of ICC, they are understood as being units of knowledge. The „birth“ of a concept. Where do the characteristics, the knowledge elements come from? How do conceptual relations arise? Principle 2: The four kinds of concept relations and their applications. Principle 3: Decimal numbers form the ICC codes as its universal language. Principle 4: The nine ontical levels of ICC. They were grouped under three captions: Prolegomena (1-3), life sciences (4-6) and human output (7-9): Structure and form Matter and energy Cosmos and earth Biosphere Anthroposphere Sociosphere Material products (economics and technology) Intellectual products (knowledge and information) Spiritual products (products of mind and culture) Principle 5: Knowledge fields are structured by categories, based on the Aristotelian form-categories, under a digital position scheme, a kind of scaling rule for subdividing a given field as follows: General area: problems, theories, principles (axiom and structure) Object area: objects, kinds, parts, properties of objects Activity area: methods, processes, activities Field properties or first characterization Persons or secondary characterization Societies or tertiary characterization Influences from outside Applications of the field to other fields Field information and synthesizing tasks The digital position scheme, called Systematifier, has also been used for structuring the entire system via the categories figuring on the upper zero level. An example of its application is the structure of the classification system for knowledge organization literature Gliederung der Klassifikationsliteratur. (A simplified version with an additional introduction is given in, p. 71) Principle 6: The ontical levels outlined under principle 4 conform to the „integrative level theory“ which means that every level is integrated in the following one. In addition, each knowledge area presumes the following one. Principle 7: The combination potential of knowledge fields (interdisciplinarity and transdisciplinarity)is determined by the digital position scheme. (Examples are given in, p. 103-4) Principle 8: The categories of the zero-level are general concepts, their possible subdivisions could once be used for classificatory statements. (These subdivisions still need elaboration) Principle 9 and 10: These relate to the combination potential of classificatory statements with space and time concepts. (Still to be elaborated) Principle 11: The system's mnemotechnical aspect relies on the fixed system position codes and on the 3x3 form- and subject-categories. Principle 12: The combination potential of system position 1, 8 and 9 make ICC to a self-networking system which complies with the present scientific development. == In matrix form == The first two levels of ICC can be represented by following matrix. The first hierarchical level of the 9 subject categories results from the first vertical array under codes 1-9. The second hierarchical level of subject categories is structured by the 9 functionally ordered form categories, listed in the first horizontal line under codes 01-09. Some exceptions are mentioned in principle 7. == Research == === Exploration of automatic classification === For classifying web documents as conceived by Jens Hartmann, University of Karlsruhe, Prof.Walter Koch, University of Graz, has explored in his Institute for Applied Information Technology Research Society (AIT) the application of ICC to automatically classifying metadata of some 350.000 documents. This was facilitated by data generated within the framework of an E

    Read more →
  • KataGo

    KataGo

    KataGo is a free and open-source computer Go program, capable of defeating top-level human players. First released on 27 February 2019, it is developed by David Wu, who also developed the Arimaa playing program bot_Sharp which defeated three top human players to win the Arimaa AI Challenge in 2015. KataGo's first release was trained by David Wu using resources provided by his employer Jane Street Capital, but it is now trained by a distributed effort. Members of the computer Go community provide computing resources by running the client, which generates self-play games and rating games, and submits them to a server. The self-play games are used to train newer networks and the rating games to evaluate the networks' relative strengths. KataGo supports the Go Text Protocol, with various extensions, thus making it compatible with popular GUIs such as Lizzie. As an alternative, it also implements a custom "analysis engine" protocol, which is used by the KaTrain GUI, among others. KataGo is widely used by strong human go players, including the South Korean national team, for training purposes. KataGo is also used as the default analysis engine in the online Go website AI Sensei, as well as OGS (the Online Go Server). == Technology == Based on techniques used by DeepMind's AlphaGo Zero, KataGo implements Monte Carlo tree search with a convolutional neural network providing position evaluation and policy guidance. Compared to AlphaGo, KataGo introduces many refinements that enable it to learn faster and play more strongly. Notable features of KataGo that are absent in many other Go-playing programs include score estimation; support for small boards, rectangular boards, and large boards; arbitrary values of komi and handicaps; and the ability to use various Go rulesets and adjust its play and evaluation for the small differences between them. === Network === The network used in KataGo are ResNets with pre-activation. While AlphaGo Zero has only game board history as input features (as it was designed as a general architecture for board games, subsequently becoming AlphaZero), the input to the network contains additional features designed by hand specifically for playing Go. These features include liberties, komi parity, pass-alive, and ladders. The trunk is essentially the same as in AlphaGo Zero, but with global pooling layers added to allow the network to be conditioned on global context such as ko fights. This is similar to the Squeeze-and-Excitation Network. The network has two heads: a policy head and a value head. The policy and value heads are mostly the same as in AlphaGo Zero, but both heads have auxiliary subheads to provide auxiliary loss signal for faster training: Policy head: predicts policy for the current player's move this turn, and the opponent player's move in the next turn. A policy Each is a logit array of size 19 × 19 + 1 {\displaystyle 19\times 19+1} , representing the logit of making a move in one of the points, plus the logit of passing. Value head: predicts game outcome, expected score difference, expected board ownership, etc. The network is described in detail in Appendix A of the report. The code base switched from using TensorFlow to PyTorch in version 1.12. === Training === Let its trunk have b {\displaystyle b} residual blocks and c {\displaystyle c} channels. During its first training run, multiple networks were trained with increasing ( b , c ) {\displaystyle (b,c)} . It took 19 days using a maximum of 28 Nvidia V100 GPUs at 4.2 million games. After the first training run, training became a distributed project run by volunteers, with increasing network sizes. As of August 2024, it has reached b28c512 (28 blocks, 512 channels). == Adversarial attacks == In 2022, KataGo was used as the target for adversarial attack research, designed to demonstrate the "surprising failure modes" of AI systems. The researchers were able to trick KataGo into ending the game prematurely. Adversarial training improves defense against adversarial attacks, though not perfectly.

    Read more →
  • DREAM Challenges

    DREAM Challenges

    DREAM Challenges (Dialogue for Reverse Engineering Assessment and Methods) is a non-profit initiative for advancing biomedical and systems biology research via crowd-sourced competitions. Started in 2006, DREAM challenges collaborate with Sage Bionetworks to provide a platform for competitions run on the Synapse platform. Over 60 DREAM challenges have been conducted over the span of over 15 years. == Overview == DREAM Challenges were founded in 2006 by Gustavo Stolovizky from IBM Research and Andrea Califano from Columbia University. Current chair of the DREAM organization is Paul Boutros from University of California. Further organization spans emeritus chairs Justin Guinney and Gustavo Stolovizky, and multiple DREAM directors. Individual challenges focus on tackling a specific biomedical research question, typically narrowed down to a specific disease. A prominent disease focus has been on oncology, with multiple past challenges focused on breast cancer, acute myeloid leukemia, and prostate cancer or similar diseases. The data involved in an individual challenge reflects the disease context; while cancers typically involve data such as mutations in the human genome, gene expression and gene networks in transcriptomics, and large scale proteomics, newer challenges have shifted towards single cell sequencing technologies as well as emerging gut microbiome related research questions, thus reflecting trends in the wider research community. Motivation for DREAM Challenges is that via crowd-sourcing data to a larger audience via competitions, better models and insight is gained than if the analysis was conducted by a single entity. Past competitions have been published in such scientific venues as the flagship journals of the Nature Portfolio and PLOS publishing groups. Results of DREAM challenges are announced via web platforms, and the top performing participants are invited to present their results in the annual RECOMB/ISCB Conferences with RSG/DREAM organized by the ISCB. While DREAM Challenges have emphasized open science and data, in order to mitigate issues rising from highly sensitive data such as genomics in patient cohorts, "model to data" approaches have been adopted. In such challenges participants submit their models via containers such as Docker or Singularity. This allows retaining confidentiality of the original data as these containers are then run by the organizers on the confidential data. This differs from the more traditional open data model, where participants submit predictions directly based on the provided open data. == Challenge organization == DREAM challenge comprises a core DREAM/Sage Bionetworks organization group as well as an extended scientific expert group, who may have contributed to creation and conception of the challenge or by providing key data. Additionally, new DREAM challenges may be proposed by the wider research community. Pharmaceutical companies or other private entities may also be involved in DREAM challenges, for example in providing data. == Challenge structure == Timelines for key stages (such as introduction webinars, model submission deadlines, and final deadline for participation) are provided in advance. After the winners are announced, organizers start collaborating with the top performing participants to conduct post hoc analyses for a publication describing key findings from the competition. Challenges may be split into sub-challenges, each addressing a different subtopic within the research question. For example, regarding cancer treatment efficacy predictions, these may be separate predictions for progression-free survival, overall survival, best overall response according to RECIST, or exact time until event (progression or death). == Participation == During DREAM challenges, participants typically build models on provided data, and submit predictions or models that are then validated on held-out data by the organizers. While DREAM challenges avoid leaking validation data to participants, there are typically mid-challenge submission leaderboards available to assist participants in evaluating their performance on a sub-sampled or scrambled dataset. DREAM challenges are free for participants. During the open phase anybody can register via Synapse to participate either individually or as a team. A person may only register once and may not use any aliases. There are some exceptions, which disqualify an individual from participating, for example: Person has privileged access to the data for the particular challenge, thus providing them with an unfair advantage. Person has been caught or is under suspicion of cheating or abusing previous DREAM Challenges. Person is a minor (under age 18 or the age of majority in jurisdiction of residence). This may be alleviated via parental consent.

    Read more →
  • Rake (software)

    Rake (software)

    Rake is a software task management and a build automation tool created by Jim Weirich. It allows the user to specify tasks and to describe dependencies as well as to group tasks into namespaces. It is similar to SCons and Make. Rake was written in Ruby and has been part of the standard library of Ruby since version 1.9. == Examples == The tasks that should be executed need to be defined in a configuration file called Rakefile. A Rakefile has no special syntax and contains executable Ruby code. === Tasks === The basic unit in Rake is the task. A task has a name and an action block, that defines its functionality. The following code defines a task called greet that will output the text "Hello, Rake!" to the console. When defining a task, you can optionally add dependencies, that is one task can depend on the successful completion of another task. Calling the "seed" task from the following example will first execute the "migrate" task and only then proceed with the execution of the "seed" task.Tasks can also be made more versatile by accepting arguments. For example, the "generate_report" task will take a date as argument. If no argument is supplied the current date is used.A special type of task is the file task, which can be used to specify file creation tasks. The following task, for example, is given two object files, i.e. "a.o" and "b.o", to create an executable program.Another useful tool is the directory convenience method, that can be used to create directories upon demand. === Rules === When a file is named as a prerequisite but it does not have a file task defined for it, Rake will attempt to synthesize a task by looking at a list of rules supplied in the Rakefile. For example, suppose we were trying to invoke task "mycode.o" with no tasks defined for it. If the Rakefile has a rule that looks like this: This rule will synthesize any task that ends in ".o". It has as a prerequisite that a source file with an extension of ".c" must exist. If Rake is able to find a file named "mycode.c", it will automatically create a task that builds "mycode.o" from "mycode.c". If the file "mycode.c" does not exist, Rake will attempt to recursively synthesize a rule for it. When a task is synthesized from a rule, the source attribute of the task is set to the matching source file. This allows users to write rules with actions that reference the source file. === Advanced rules === Any regular expression may be used as the rule pattern. Additionally, a proc may be used to calculate the name of the source file. This allows for complex patterns and sources. The following rule is equivalent to the example above: NOTE: Because of a quirk in Ruby syntax, parentheses are required around a rule when the first argument is a regular expression. The following rule might be used for Java files: === Namespaces === To better organize big Rakefiles, tasks can be grouped into namespaces. Below is an example of a simple Rake recipe:

    Read more →
  • Dendral

    Dendral

    Dendral was a project in artificial intelligence (AI) of the 1960s, and the computer software expert system that it produced. Its primary aim was to study hypothesis formation and discovery in science. For that, a specific task in science was chosen: help organic chemists in identifying unknown organic molecules, by analyzing their mass spectra and using knowledge of chemistry. It was done at Stanford University by Edward Feigenbaum, Bruce G. Buchanan, Joshua Lederberg, and Carl Djerassi, along with a team of highly creative research associates and students. It began in 1964 and spans approximately half the history of AI research. The software program Dendral is considered the first expert system because it automated the decision-making process and problem-solving behavior of organic chemists. The project consisted of research on two main programs Heuristic Dendral and Meta-Dendral, and several sub-programs. It was written in the Lisp programming language, which was considered the language of AI because of its flexibility. Many systems were derived from Dendral, including MYCIN, MOLGEN, PROSPECTOR, XCON, and STEAMER. There are many other programs today for solving the mass spectrometry inverse problem, see List of mass spectrometry software, but they are no longer described as 'artificial intelligence', just as structure searchers. The name Dendral is an acronym of the term "Dendritic Algorithm". == Heuristic Dendral == Heuristic Dendral is a program that uses mass spectra or other experimental data together with a knowledge base of chemistry to produce a set of possible chemical structures that may be responsible for producing the data. A mass spectrum of a compound is produced by a mass spectrometer, and is used to determine its molecular weight, the sum of the masses of its atomic constituents. For example, the compound water (H2O), has a molecular weight of 18 since hydrogen has a mass of 1.01 and oxygen 16.00, and its mass spectrum has a peak at 18 units. Heuristic Dendral would use this input mass and the knowledge of atomic mass numbers and valence rules, to determine the possible combinations of atomic constituents whose mass would add up to 18. As the weight increases and the molecules become more complex, the number of possible compounds increases drastically. Thus, a program that is able to reduce this number of candidate solutions through the process of hypothesis formation is essential. New graph-theoretic algorithms were invented by Lederberg, Harold Brown, and others that generate all graphs with a specified set of nodes and connection-types (chemical atoms and bonds) -- with or without cycles. Moreover, the team was able to prove mathematically that the generator is complete, in that it produces all graphs with the specified nodes and edges, and that it is non-redundant, in that the output contains no equivalent graphs (e.g., mirror images). The CONGEN program, as it became known, was developed largely by computational chemists Ray Carhart, Jim Nourse, and Dennis Smith. It was useful to chemists as a stand-alone program to generate chemical graphs showing a complete list of structures that satisfy the constraints specified by a user. == Meta-Dendral == Meta-Dendral is a machine learning system that receives the set of possible chemical structures and corresponding mass spectra as input, and proposes a set of rules of mass spectrometry that correlate structural features with processes that produce the mass spectrum. These rules would be fed back to Heuristic Dendral (in the planning and testing programs described below) to test their applicability. Thus, "Heuristic Dendral is a performance system and Meta-Dendral is a learning system". The program is based on two important features: the plan-generate-test paradigm and knowledge engineering. === Plan-generate-test paradigm === The plan-generate-test paradigm is the basic organization of the problem-solving method, and is a common paradigm used by both Heuristic Dendral and Meta-Dendral systems. The generator (later named CONGEN) generates potential solutions for a particular problem, which are then expressed as chemical graphs in Dendral. However, this is feasible only when the number of candidate solutions is minimal. When there are large numbers of possible solutions, Dendral has to find a way to put constraints that rules out large sets of candidate solutions. This is the primary aim of Dendral planner, which is a “hypothesis-formation” program that employs “task-specific knowledge to find constraints for the generator”. Last but not least, the tester analyzes each proposed candidate solution and discards those that fail to fulfill certain criteria. This mechanism of plan-generate-test paradigm is what holds Dendral together. === Knowledge Engineering === The primary aim of knowledge engineering is to attain a productive interaction between the available knowledge base and problem solving techniques. This is possible through development of a procedure in which large amounts of task-specific information is encoded into heuristic programs. Thus, the first essential component of knowledge engineering is a large “knowledge base.” Dendral has specific knowledge about the mass spectrometry technique, a large amount of information that forms the basis of chemistry and graph theory, and information that might be helpful in finding the solution of a particular chemical structure elucidation problem. This “knowledge base” is used both to search for possible chemical structures that match the input data, and to learn new “general rules” that help prune searches. The benefit Dendral provides the end user, even a non-expert, is a minimized set of possible solutions to check manually. == Heuristics == A heuristic is a rule of thumb, an algorithm that does not guarantee a solution, but reduces the number of possible solutions by discarding unlikely and irrelevant solutions. The use of heuristics to solve problems is called "heuristics programming", and was used in Dendral to allow it to replicate in machines the process through which human experts induce the solution to problems via rules of thumb and specific information. Heuristics programming was a major approach and a giant step forward in artificial intelligence, as it allowed scientists to finally automate certain traits of human intelligence. It became prominent among scientists in the late 1940s through George Polya’s book, How to Solve It: A New Aspect of Mathematical Method. As Herbert A. Simon said in The Sciences of the Artificial, "if you take a heuristic conclusion as certain, you may be fooled and disappointed; but if you neglect heuristic conclusions altogether you will make no progress at all." == History == During the mid 20th century, the question "can machines think?" became intriguing and popular among scientists, primarily to add humanistic characteristics to machine behavior. John McCarthy, who was one of the prime researchers of this field, termed this concept of machine intelligence as "artificial intelligence" (AI) during the Dartmouth summer in 1956. AI is usually defined as the capacity of a machine to perform operations that are analogous to human cognitive capabilities. Much research to create AI was done during the 20th century. Also around the mid 20th century, science, especially biology, faced a fast-increasing need to develop a "man-computer symbiosis", to aid scientists in solving problems. For example, the structural analysis of myoglobin, hemoglobin, and other proteins relentlessly needed instrumentation development due to its complexity. In the early 1960s, Joshua Lederberg started working with computers and quickly became tremendously interested in creating interactive computers to help him in his exobiology research. Specifically, he was interested in designing computing systems to help him study alien organic compounds. Lederberg had been heading a team designing instruments for the Mars Viking lander to search for precursor molecules of life in samples of the Mars surface, using a mass spectrometer coupled with a minicomputer. As he was not an expert in either chemistry or computer programming, he collaborated with Stanford chemist Carl Djerassi to help him with chemistry, and Edward Feigenbaum with programming, to automate the process of determining chemical structures from raw mass spectrometry data. Feigenbaum was an expert in programming languages and heuristics, and helped Lederberg design a system that replicated the way Djerassi solved structure elucidation problems. They devised a system called Dendritic Algorithm (Dendral) that was able to generate possible chemical structures corresponding to the mass spectrometry data as an output. Dendral then was still very inaccurate in assessing spectra of ketones, alcohols, and isomers of chemical compounds. Thus, Djerassi "taught" general rules to Dendral that could help eliminate most of the "chemically implausible" structures, and p

    Read more →
  • UMBEL

    UMBEL

    UMBEL (Upper Mapping and Binding Exchange Layer) is a logically organized knowledge graph of 34,000 concepts and entity types that can be used in information science for relating information from disparate sources to one another. It was retired at the end of 2019. UMBEL was first released in July 2008. Version 1.00 was released in February 2011. Its current release is version 1.50. The grounding of this information occurs by common reference to the permanent URIs for the UMBEL concepts; the connections within the UMBEL upper ontology enable concepts from sources at different levels of abstraction or specificity to be logically related. Since UMBEL is an open-source extract of the OpenCyc knowledge base, it can also take advantage of the reasoning capabilities within Cyc. UMBEL has two means to promote the semantic interoperability of information:. It is: An ontology of about 35,000 reference concepts, designed to provide common mapping points for relating different ontologies or schema to one another, and A vocabulary for aiding that ontology mapping, including expressions of likelihood relationships distinct from exact identity or equivalence. This vocabulary is also designed for interoperable domain ontologies. UMBEL is written in the Semantic Web languages of SKOS and OWL 2. It is a class structure used in Linked Data, along with OpenCyc, YAGO, and the DBpedia ontology. Besides data integration, UMBEL has been used to aid concept search, concept definitions, query ranking, ontology integration, and ontology consistency checking. It has also been used to build large ontologies and for online question answering systems. Including OpenCyc, UMBEL has about 65,000 formal mappings to DBpedia, PROTON, GeoNames, and schema.org, and provides linkages to more than 2 million Wikipedia pages (English version). All of its reference concepts and mappings are organized under a hierarchy of 31 different "super types", which are mostly disjoint from one another. Each of these "super types" has its own typology of entity classes to provide flexible tie-ins for external content. 90% of UMBEL is contained in these entity classes.

    Read more →
  • BabelNet

    BabelNet

    BabelNet is a multilingual lexical-semantic knowledge graph, ontology and encyclopedic dictionary developed at the NLP group of the Sapienza University of Rome under the supervision of Roberto Navigli. BabelNet was automatically created by linking Wikipedia to the most popular computational lexicon of the English language, WordNet. The integration is done using an automatic mapping and by filling in lexical gaps in resource-poor languages by using statistical machine translation. The result is an encyclopedic dictionary that provides concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations. Additional lexicalizations and definitions are added by linking to free-license wordnets, OmegaWiki, the English Wiktionary, Wikidata, FrameNet, VerbNet and others. Similarly to WordNet, BabelNet groups words in different languages into sets of synonyms, called Babel synsets. For each Babel synset, BabelNet provides short definitions (called glosses) in many languages harvested from both WordNet and Wikipedia. == Statistics of BabelNet == As of December 2023, BabelNet (version 5.3) covers 600 languages. It contains almost 23 million synsets and around 1.7 billion word senses (regardless of their language). Each Babel synset contains 2 synonyms per language, i.e., word senses, on average. The semantic network includes all the lexico-semantic relations from WordNet (hypernymy and hyponymy, meronymy and holonymy, antonymy and synonymy, etc., totaling around 364,000 relation edges) as well as an underspecified relatedness relation from Wikipedia (totaling around 1.9 billion edges). Version 5.3 also associates around 61 million images with Babel synsets and provides a Lemon RDF encoding of the resource, available via a SPARQL endpoint. 2.67 million synsets are assigned domain labels. == Applications == BabelNet has been shown to enable multilingual natural language processing applications. The lexicalized knowledge available in BabelNet has been shown to obtain state-of-the-art results in: Semantic relatedness, Multilingual word-sense disambiguation and entity linking, with the Babelfy system, Video games with a purpose. == Prizes and acknowledgments == BabelNet received the META prize 2015 for "groundbreaking work in overcoming language barriers through a multilingual lexicalised semantic network and ontology making use of heterogeneous data sources". The Artificial Intelligence Journal paper that describes BabelNet won the Prominent Paper Award in 2017. BabelNet featured prominently in a Time magazine article about the new age of innovative and up-to-date lexical knowledge resources available on the Web.

    Read more →
  • Ware report

    Ware report

    Security Controls for Computer Systems, commonly called the Ware report, is a 1970 text by Willis Ware that was foundational in the field of computer security. == Development == A defense contractor in St. Louis, Missouri, had bought an IBM mainframe computer, which it was using for classified work on a fighter aircraft. To provide additional income, the contractor asked the Department of Defense (DoD) for permission to sell computer time on the mainframe to local businesses via remote terminals, while the classified work continued. At the time, the DoD did not have a policy to cover this. The DoD's Advanced Research Projects Agency (DARPA) asked Ware - a RAND employee - to chair a committee to examine and report on the feasibility of security controls for computer systems. The committee's report was a classified document given in January 1970 to the Defense Science Board (DSB), which had taken over the project from ARPA. After declassification, the report was published by RAND in October 1979. == Influence == The IEEE Computer Society said the report was widely circulated, and the IEEE Annals of the History of Computing said that it, together with Ware's 1967 Spring Joint Computer Conference session, marked the start of the field of computer security. The report influenced security certification standards and processes, especially in the banking and defense industries, where the report was instrumental in creating the Orange Book.

    Read more →
  • POSC Caesar

    POSC Caesar

    POSC Caesar Association (PCA) is an international, open and not-for-profit, member organization that promotes the development of open specifications to be used as standards for enabling the interoperability of data, software and related matters. PCA is the initiator of ISO 15926 "Integration of life-cycle data for process plants including oil and gas production facilities" and is committed to its maintenance and enhancement. Nils Sandsmark has been the General Manager of POSC Caesar Association since 1999 and Thore Langeland, Norwegian Oil Industry Association (Norwegian: Oljeindustriens Landsforening, OLF), is the chairman of the board. == History == === Caesar Offshore === The first predecessor of POSC Caesar Association, the Caesar Offshore program, started in 1993. The original focus was on standardizing technical data definitions for capital intensive projects at the handover from the EPC contractor to the owner/operators of onshore and offshore oil and gas production facilities. The program was sponsored by The Research Council of Norway, two EPC contractors (Aker Maritime and Kværner), three owners/operators (Norsk Hydro, Saga Petroleum and Statoil) and DNV as service provider and project owner. === POSC Caesar project === During the period 1994–96, Caesar Offshore Program was defined as a project of Petrotechnical Open Software Corporation (POSC) (now Energistics), and changed its name to the POSC Caesar Project. In 1995 the project was joined by BP, Brown and Root and Elf Aquitaine and in 1997 by Intergraph, IBM, Oracle, Lloyd's, Shell, ABB and UMOE Technologies. During that time, POSC Caesar also became a member of European Process Industries STEP Technical Liaison Executive (EPISTLE) where it collaborates with PISTEP (UK), and USPI-NL (The Netherlands) on the development of ISO 10303, also known as "Standard for the Exchange of Product model data (STEP)". === POSC Caesar Association === In 1997, POSC Caesar Association was founded as an independent, global, non-profit, member organization. POSC Caesar Association serves an international membership and collaborates with other international organizations. It has its main office in Norway. Albeit the name of POSC Caesar Association still hints to its past as a project within the Petrotechnical Open Software Corporation (POSC) (now Energistics), from 1997 onwards, the organization has been independent. Energistics and POSC Caesar Association do collaborate, and are formally member in each other's organization. == Membership == POSC Caesar Association has with its current 36 members from around the world and has established an international footprint (with a strong membership in Norway) that includes a variety of backgrounds, from academia and solution providers to engineering contractors and owners/operators. The members are (subdivided by organization type): Associations: Energistics (USA) and The Norwegian Oil Industry Association (OLF, Norway); Universities and Research Institutes: International Research Institute of Stavanger (IRIS, Norway), Norwegian University of Science and Technology (NTNU, Norway), Korea Advanced Institute of Science and Technology (KAIST, Korea), SINTEF (Norway), University of Bergen (Norway), University of Oslo (Norway), University of Stavanger (Norway), University of Tromsø (Norway) and Western Norway Research Institute (Norway); Oil and Gas Companies: BP (UK), Petronas (Malaysia) and Statoil (Norway); Engineering contractors and consultants: Akvaplan-niva (Norway), Aker Solutions (Norway), Asset Life Cycle Information Management (ALCIM, Malaysia), CAESAR systems (USA), Bechtel (USA), Det Norske Veritas (DNV, Norway), Information Logic (USA) and iXIT Engineering Technology (Germany), Phusion IM Ltd (UK); Solution providers: Aveva (UK), Bentley Systems (USA), Jotne EPM Technology (Norway), Epsis (Norway), Eurostep (Sweden), International Business Machines Corporation (IBM, USA), Siemens - Comos Industry Solutions (before Innotec) (Germany), Intergraph (USA), Invenia (Norway), Keel Solution (Denmark), Noumenon (UK), NRX (Canada), Octaga (Norway) and Tektonisk (Norway). In general, the organization holds three membership meetings a year; one in January / February in North-America (typically USA), one in April / May in Europe (typically Norway) and one in October in Asia (typically Malaysia). == Activities and services == === Initiator and custodian of ISO 15926 === In consultation with the other EPISTLE members and the International Organization for Standardization (ISO), it was decided in 2003 (some say already in 1997) that for modeling-technical reasons it was better to discontinue the development of ISO 10303 and to initiate the development of ISO 15926 "Integration of life-cycle data for process plants including oil and gas production facilities." Over the years, the scope of the standard has increased from the initial capital-intensive projects in the upstream oil and gas industry, to include also relevant terminology for downstream oil and gas industry applications and to deal with real-time data related to the actual oil and gas production. ISO 15926 has also over the years evolved from a dictionary (a list of terms with definitions), over a taxonomy (added hierarchy) to an ontology (a formal representation of a set of concepts within a domain and the relationships between those concepts). ISO 15926 is therefore sometimes nicknamed the "Oil and Gas Ontology", for some considered to be an essential prerequisite together with Semantic Web technologies to get to better interoperability, an optimal use of all available data across boundaries and an increase in efficiency. This is what some call the next generation of Integrated Operations. === Reference data services === Placeholders: Flow scheme of WIP - RDS - ISO and role of SIGs RDS Standards in database pilot (ISO) === Special interest groups === Placeholders: Overview of SIGs Drilling and Completion Reservoir and Production Operations and Maintenance == Projects == There are a number of projects (co-)organized by POSC Caesar Association working on the extension of the ISO 15926 standard in different application areas. === Capital intensive projects application domain === The following projects are running at the moment (August 2009): The ADI Project of FIATECH, to build the tools (which will then be made available in the public domain) The IDS Project of POSC Caesar Association, to define product models required for data sheets A joint collaboration project between FIATECH POSC Caesar Association is the ADI-IDS project is the ISO 15926 WIP === Upstream oil and gas industry application domain === The following projects are currently running (August 2009): The Integrated Operations in the High North (IOHN) project is working on extending ISO 15926 to handle real-time data transmission and (pre-)processing to enable the next generation of Integrated Operations. The Environment Web project to include environmental reporting terms and definitions as used in EPIM's EnvironmentWeb in ISO 15926. Finalised projects include: The Integrated Information Platform (IIP) project working on establishing a real-time information pipeline based on open standards. It worked among others on: Daily Drilling Report (DDR) to including all terms and definitions in ISO 15926. This standard became mandatory on February 1, 2008 for reporting on the Norwegian Continental Shelf by the Norwegian Petroleum Directorate (NPD) and Safety Authority Norway (PSA). NPD says that the quality of the reports has improved considerably since. Daily Production Report (DPR) to including all terms and definitions in ISO 15926. This standard was tested successfully on the Valhall (BP-operated) and Åsgard (StatoilHydro-operated) fields offshore Norway. The terminology and XML schemata developed have also been included in Energistics’ PRODML standard. == Conferences and events == === Semantic Days === === Sogndal academic network meeting === == Collaborations == POSC Caesar is collaborating with a number of standardization bodies, including: Mimosa: collaboration on open information standards for Operations and Maintenance mainly for the downstream oil and gas industry; FIATECH: collaboration on open information standards for life cycle data of capital projects; Energistics: collaboration on information standards for the upstream oil and gas industry, including WITSML and PRODML; OASIS: collaboration on e-business standards; ISO TC184/SC4: the host of the ISO 15926 standard.

    Read more →
  • Ethics of artificial intelligence

    Ethics of artificial intelligence

    The ethics of artificial intelligence covers a broad range of topics within AI that are considered to have particular ethical stakes. This includes algorithmic biases, fairness, accountability, transparency, privacy, and regulation, particularly where systems influence or automate human decision-making. It also covers various emerging or potential future challenges such as machine ethics (how to make machines that behave ethically), lethal autonomous weapon systems, arms race dynamics, AI safety and alignment, technological unemployment, AI-enabled misinformation, how to treat certain AI systems if they have a moral status (AI welfare and rights), artificial superintelligence and existential risks. Some application areas may also have particularly important ethical implications, like healthcare, education, criminal justice, or the military. == Machine ethics == Machine ethics (or machine morality) is the field of research concerned with designing Artificial Moral Agents (AMAs), robots or artificially intelligent computers that behave morally or as though moral. To account for the nature of these agents, it has been suggested to consider certain philosophical ideas, like the standard characterizations of agency, rational agency, moral agency, and artificial agency, which are related to the concept of AMAs. There are discussions on creating tests to see if an AI is capable of making ethical decisions. Alan Winfield concludes that the Turing test is flawed and the requirement for an AI to pass the test is too low. A proposed alternative test is one called the Ethical Turing Test, which would improve on the current test by having multiple judges decide if the AI's decision is ethical or unethical. Neuromorphic AI could be one way to create morally capable robots, as it aims to process information similarly to humans, nonlinearly and with millions of interconnected artificial neurons. Similarly, whole-brain emulation (scanning a brain and simulating it on digital hardware) could also in principle lead to human-like robots, thus capable of moral actions. And large language models are capable of approximating human moral judgments. Inevitably, this raises the question of the environment in which such robots would learn about the world and whose morality they would inherit – or if they end up developing human 'weaknesses' as well: selfishness, pro-survival attitudes, inconsistency, scale insensitivity, etc. In Moral Machines: Teaching Robots Right from Wrong, Wendell Wallach and Colin Allen conclude that attempts to teach robots right from wrong will likely advance understanding of human ethics by motivating humans to address gaps in modern normative theory and by providing a platform for experimental investigation. As one example, it has introduced normative ethicists to the controversial issue of which specific learning algorithms to use in machines. For simple decisions, Nick Bostrom and Eliezer Yudkowsky have argued that decision trees (such as ID3) are more transparent than neural networks and genetic algorithms, while Chris Santos-Lang argued in favor of machine learning on the grounds that the norms of any age must be allowed to change and that natural failure to fully satisfy these particular norms has been essential in making humans less vulnerable to criminal "hackers". Some researchers frame machine ethics as part of the broader AI control or value alignment problem: the difficulty of ensuring that increasingly capable systems pursue objectives that remain compatible with human values and oversight. Stuart Russell has argued that beneficial systems should be designed to (1) aim at realizing human preferences, (2) remain uncertain about what those preferences are, and (3) learn about them from human behaviour and feedback, rather than optimizing a fixed, fully specified goal. Some authors argue that apparent compliance with human values may reflect optimization for evaluation contexts rather than stable internal norms, complicating the assessment of alignment in advanced language models. == Challenges == === Algorithmic biases === AI has become increasingly inherent in facial and voice recognition systems. These systems may be vulnerable to biases and errors introduced by their human creators. Notably, the data used to train them can have biases. According to Allison Powell, associate professor at LSE and director of the Data and Society programme, data collection is never neutral and always involves storytelling. She argues that the dominant narrative is that governing with technology is inherently better, faster and cheaper, but proposes instead to make data expensive, and to use it both minimally and valuably, with the cost of its creation factored in. Friedman and Nissenbaum identify three categories of bias in computer systems: existing bias, technical bias, and emergent bias. In natural language processing, problems can arise from the text corpus—the source material the algorithm uses to learn about the relationships between different words. Large companies such as IBM, Google, etc. that provide significant funding for research and development have made efforts to research and address these biases. One potential solution is to create documentation for the data used to train AI systems. Process mining can be an important tool for organizations to achieve compliance with proposed AI regulations by identifying errors, monitoring processes, identifying potential root causes for improper execution, and other functions. However, there are also limitations to the current landscape of fairness in AI, due to the intrinsic ambiguities in the concept of discrimination, both at the philosophical and legal level. ==== Racial and gender biases ==== Bias can be introduced through historical data used to train AI systems. For instance, Amazon terminated their use of AI hiring and recruitment because the algorithm favored male candidates over female ones. This was because Amazon's system was trained with data collected over a 10-year period that included mostly male candidates. The algorithms learned the biased pattern from the historical data, and generated predictions where these types of candidates were most likely to succeed in getting the job. Therefore, the recruitment decisions made by the AI system turned out to be biased against female and minority candidates. The performance of facial recognition and computer vision models may vary based on race and gender. Facial recognition algorithms made by Microsoft, IBM and Face++ all performed significantly worse on darker-skinned women. Facial recognition was shown to be biased against those with darker skin tones. AI systems may be less accurate for black people, as was the case in the development of an AI-based pulse oximeter that overestimated blood oxygen levels in patients with darker skin, causing issues with their hypoxia treatment. In 2015, controversy erupted after a Black couple were labeled "Gorillas" by Google Photos. Oftentimes the systems are able to easily detect the faces of white people while being unable to register the faces of people who are black. This has led to the ban of police usage of AI materials or software in some U.S. states. The reason for these biases is that AI pulls information from across the internet to influence its responses in each situation. For example, if a facial recognition system was only tested on people who were white, it would make it much harder for it to interpret the facial structure and tones of other races and ethnicities. Biases often stem from the training data rather than the algorithm itself, notably when the data represents past human decisions. A 2020 study that reviewed voice recognition systems from Amazon, Apple, Google, IBM, and Microsoft found that they have higher error rates when transcribing black people's voices than white people's. Injustice in the use of AI is much harder to eliminate within healthcare systems, as oftentimes diseases and conditions can affect different races and genders differently. This can lead to confusion as the AI may be making decisions based on statistics showing that one patient is more likely to have problems due to their gender or race. This can be perceived as a bias because each patient is a different case, and AI is making decisions based on what it is programmed to group that individual into. This leads to a discussion about what should be considered a biased decision in the distribution of treatment. While it is known that there are differences in how diseases and injuries affect different genders and races, there is a discussion on whether it is fairer to incorporate this into healthcare treatments, or to examine each patient without this knowledge. In modern society there are certain tests for diseases, such as breast cancer, that are recommended to certain groups of people over others because they are more likely to contract the disease in question. If AI implements these statistics

    Read more →