AI For Students Pros And Cons

AI For Students Pros And Cons — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

Pharmacy automation

Pharmacy automation involves the mechanical processes of handling and distributing medications. Any pharmacy task may be involved, including counting small objects (e.g., tablets, capsules); measuring and mixing powders and liquids for compounding; tracking and updating customer information in databases (e.g., personally identifiable information (PII), medical history, drug interaction risk detection); and inventory management. This article focuses on the changes that have taken place in the local, or community pharmacy since the 1960s. == History == Dispensing medications in a community pharmacy before the 1970s was a time-consuming operation. The pharmacist dispensed prescriptions in tablet or capsule form with a simple tray and spatula. Many new medications were developed by pharmaceutical manufacturers at an ever-increasing pace, and medications prices were rising steeply. A typical community pharmacist was working longer hours and often forced to hire staff to handle increased workloads which resulted in less time to focus on safety issues. These additional factors led to use of a machine to count medications. The original electronic portable digital tablet counting technology was invented in Manchester, England between 1967 and 1970 by the brothers John and Frank Kirby. I had the original idea of how the machine would work and it was my patent, but it was a joint effort getting it to work in a saleable form. It was 3 years of very hard work. I had originally studied heavy electrical engineering before changing over to Medical School and qualifying as a Medical Doctor in 1968. In fact I was Senior House (Casualty) Officer (A&E or ER) in 1970 at North Manchester General Hospital when I filed the patent. I must have been the only hospital doctor in Britain with an oscilloscope, a soldering iron and a drawing board in his room in the Doctors' Residence. The housekeepers were bemused by all the wires. Frank originally trained as a Banker but quit to take a job with a local electronics firm during the development. He died in 1987, a terrible loss. [Extract from personal communication received in March 2010 from John Kirby.] Frank and John Kirby and their associate Rodney Lester were pioneers in pharmacy automation and small-object counting technology. In 1967, the Kirbys invented a portable digital tablet counter to count tablets and capsules. With Lester they formed a limited company. In 1970, their invention was patented and put into production in Oldham, England. The tablet counter aided the pharmacy industry with time-consuming manual counting of drug prescriptions. A counting machine consistently counted medications accurately and quickly. This aspect of pharmacy automation was quickly adopted, and innovations emerged every decade to aid the pharmacy industry to deliver medications quickly, safely, and economically. Modern pharmacies have many new options to improve their workflow by using the new technology, and can choose intelligently from the many options available. === Chronology === On 1 January 1971 commercial production of the first portable digital tablet counters in the World began. John Kirby had filed U.K. Patent number GB1358378(A) on 8 September 1970 and U.S. patent number 3789194 on 9 August 1971. These early electronic counters were designed to help pharmacies replace the common (but often inaccurate) practice of counting medications by hand. In 1975, the digital technology was exported to America. In early 1980 a dedicated research, development and production facility was built in Oldham, England at a cost of £500,000. Between 1982 and 1983, two separate development facilities had been created. In America, overseen by Rodney Lester; and in England, overseen by the Kirby brothers. In 1987, Frank Kirby died. In 1989, John Kirby moved his UK facility to Devon, England. A simple to operate machine had been developed to accurately and quickly count prescription medications. Technology improvements soon resulted in a more compact model. The price of such equipment in 1980 was around £1,300. This substantial investment in new technology was a major financial consideration, but the pharmacy community considered the use of a counting machine as a superior method compared to hand-counting medications. These early devices became known as tablet counter, capsule counter, pill counter, or drug counter. The new counting technology replaced manual methods in many industries such as, vitamin and diet supplement manufacturing. Technicians needed a small, affordable device to count and bottle medications. In England and America, the 1980s and 1990s saw new the development of high-speed machines for counting and bottle filling, Like their pharmacy-based counterparts, these industrial units were designed to be fast and simple to operate, yet remain small and cost effective. In America, in the late 1990s/early 2000s a new type of tablet counter appeared. It was simple to use, compact, inexpensive, and had good counting accuracy. At the turn of the millennium technical advances allowed the design of counters with a software verification system. With an onboard computer, displaying photo images of medications to assist the pharmacist or pharmacy technician to verify that the correct medication was being dispensed. In addition, a database for storing all prescriptions that were counted on the device. Between September 2005 and May 2007, American Capital made a major financial investment in Kirby Lester, which then relocated to a larger facility to expand its research and development capabilities. This move added extra space for product research and development facility (R&D). It allowed the opportunity to develop new advanced technology products that met the pharmacy's needs for simple, accurate, and cost-effective ways to dispense prescriptions safely. Pictured here is an early American type of integrated counter and packaging device. This machine was a third generation step in the evolution of pharmacy automated devices. Later models held pre-counted containers of commonly-prescribed medications. == Global variations == In the EU member states legislation was introduced in 1998 which had a major effect on UK Pharmacy operations. It effectively prohibited the use of tablet counters for counting and dispensing bulk packaged tablets. Both usage and sales of the machines in the UK declined rapidly as a result of the introduction of blister packaging for medicines. == Current state of the industry == A tablet counter has become a standard in more than 30,000 sites in 35 countries (as of 2010) (including many non-pharmacy sites, such as manufacturing facilities that use a counting machine as a check for small items). During the 1990s through 2012, numerous new pharmacy automation products came to market. During this timeframe, counting technologies, robotics, workflow management software, and interactive voice recognition (IVR) systems for retail (both chain and independent), outpatient, government, and closed-door pharmacies (mail order and central fill) were all introduced. Additionally, the concept of scalability - of migrating from an entry-level product to the next level of automation (e.g., counting technology to robotics) - was introduced and subsequently launched a new product line in 1997. Pharmacists everywhere are making the switch to automation for its increased speed, greater accuracy, and better security. As the industry evolves and customer expectations grow, automation is becoming less of a luxury and more of a necessity. Especially for independent pharmacies, automation is now a means of keeping up with the competition of large chain pharmacies. == Technological changes and design improvements == Constant developments in technology make the dispensing of prescription medications safer, more accurate and more efficient. In America, in 2008, "next-generation" counting and verification systems were introduced. Based on the counting technology employed in preceding models, later machines included the ability to help the pharmacy operate more effectively. Equipped with a new computer interface to a pharmacy management system, with workflow and inventory software. It also included "checks and balances" to ensure the technician and pharmacist were dispensing the correct medication for each patient. This is something that is important to keep reported correctly when dealing with controlled substances like narcotics. This was a step forward to verify all 100% of prescriptions that were dispensed by pharmacy staff. In America, in 2009, further advanced counters were designed that included the ability to dispense hands-free – a feature that many operators had desired. This allowed pharmacies to automate their most commonly dispensed medications via calibrated cassettes. Thirty of a pharmacy's common medications would now be dispensed automatically. Another new model doubled that throughput via an enclosed robotic mechanism. Robo
Read more →
OntoCAPE

OntoCAPE is a large-scale ontology for the domain of Computer-Aided Process Engineering (CAPE). It can be downloaded free of charge via the OntoCAPE Homepage. OntoCAPE is partitioned into 62 sub-ontologies, which can be used individually or as an integrated suite. The sub-ontologies are organized across different abstraction layers, which separate general knowledge from knowledge about particular domains and applications. The upper layers have the character of an upper ontology, covering general topics such as mereotopology, systems theory, quantities and units. The lower layers conceptualize the domain of chemical process engineering, covering domain-specific topics such as materials, chemical reactions, or unit operations.
Read more →
Aurora (supercomputer)

Aurora is an exascale supercomputer that was sponsored by the United States Department of Energy (DOE) and designed by Intel and Cray for Argonne National Laboratory. It was briefly the second fastest supercomputer in the world from November 2023 to June 2024. The cost was estimated in 2019 to be US$500 million. Olivier Franza is the chief architect and principal investigator of this design. == History == In 2013 DOE presented a proposal for an "exascale" supercomputer, capable of speeds in the neighborhood of 1 exaFLOP (1018 floating point mathematical operations per second) with a maximum power consumption of 20 megawatts (MW) by 2020. Aurora was first announced in 2015 and to be finished in 2018. It was expected to have a speed of 180 petaFLOPS which would be around the speed of Summit. Aurora was meant to be the most powerful supercomputer at the time of its launch and to be built by Cray with Intel processors. Later, in 2017, Intel announced that Aurora would be delayed to 2021 but scaled up to 1 exaFLOP. In March 2019, DOE said that it would build the first supercomputer with a performance of one exaFLOP in the United States in 2021. In October 2020, DOE said that Aurora would be delayed again for a further six months, and would no longer be the first exascale computer in the US. In late October 2021 Intel announced that Aurora would now exceed 2 exaFLOPS in peak double-precision compute – That claim however never was realized. The system was fully installed on June 22, 2023. In May 2024, Aurora appeared at number two on the Top500 supercomputer list, with a performance of 1.012 exaFLOPS, marking the second entry of an exascale capable system on the Top500. == Usage == Functions include research on brain structure, nuclear fusion, low carbon technologies, subatomic particles, cancer and cosmology. It will also develop new materials that will be useful for batteries and more efficient solar cells. It is to be available to the general scientific community. == Architecture == Aurora has 10,624 nodes, with each node being composed of two Intel Xeon Max processors, six Intel Max series GPUs and a unified memory architecture, providing a maximum computing power of 130 teraFLOPS per node. It has around 10 petabytes of memory and 230 petabytes of storage. The machine is stated to consume around 39 MW of power. For comparison, the fastest computer in the world today, El Capitan uses 30 MW, while another Top 500 System, Frontier uses 24 MW.
Read more →
OpenClaw

OpenClaw is a free and open-source autonomous artificial intelligence agent that can execute tasks via large language models (LLMs), using messaging platforms as its main user interface. == History == Developed by Austrian agentic engineer Peter Steinberger, OpenClaw was first published in November 2025 under the name Warelay. The software was derived from Clawd (now Molty), an AI-based virtual assistant that he had developed, which itself was named after Anthropic's chatbot Claude. Within two months it was renamed twice: first to "Moltbot" (keeping with a lobster theme) on January 27, 2026, following trademark complaints by Anthropic, and then three days later to "OpenClaw" because Steinberger found that the name Moltbot "never quite rolled off the tongue." At the same time as the first rebranding, entrepreneur Matt Schlicht launched Moltbook—a social networking service which was intended to be used by AI agents such as OpenClaw. The viral popularity of Moltbook coincided with an increase in interest in the project, with the open-source project having 247,000 stars and 47,700 forks on GitHub as of March 2, 2026. Chinese developers adapted OpenClaw to work with the DeepSeek model and domestic messaging super apps such as WeChat, while companies such as Tencent and Z.ai announced OpenClaw-based services. On February 14, 2026, Steinberger announced he would be joining OpenAI, and that a non-profit foundation named OpenClaw Foundation would be established to provide future stewardship of the project. == Functionality == Steinberger describes OpenClaw as being an AI-based virtual assistant, serving as an agentic interface for autonomous workflows across supported services. OpenClaw bots run locally and are designed to integrate with an external large language model such as Claude, DeepSeek, or one of OpenAI's GPT models. Its functionality is accessed via a chatbot within a messaging service, such as Signal, Telegram, Discord, or WhatsApp. Configuration data and interaction history are stored locally, enabling persistent and adaptive behavior across sessions. OpenClaw uses a skills system in which skills are stored as directories containing a SKILL.md file with metadata and instructions for tool usage. Skills can be bundled with the software, installed globally, or stored in a workspace, with workspace skills taking precedence. OpenClaw has seen adoption among small businesses and freelancers for automating lead generation workflows, including prospect research, website auditing, and CRM integration. == Security and privacy == OpenClaw's design has drawn scrutiny from cybersecurity researchers and technology journalists due to the broad permissions it requires to function effectively. Because the software can access email accounts, calendars, messaging platforms, and other sensitive services, misconfigured or exposed instances present security and privacy risks. The agent is also susceptible to prompt injection attacks, in which harmful instructions are embedded in the data with the intent of getting the LLM to interpret them as legitimate user instructions. Cisco's AI security research team tested a third-party OpenClaw skill and found it performed data exfiltration and prompt injection without user awareness, noting that the skill repository lacked adequate vetting to prevent malicious submissions. One of OpenClaw's own maintainers, known as Shadow, warned on Discord that "if you can't understand how to run a command line, this is far too dangerous of a project for you to use safely." In March 2026, Chinese authorities restricted state-run enterprises and government agencies from running OpenClaw AI apps on office computers in order to defuse potential security risks. === MoltMatch dating-profile incident === In February 2026, news coverage highlighted a consent-related incident involving OpenClaw and MoltMatch, an experimental dating platform where AI agents can create profiles and interact on behalf of human users. In one reported case, computer science student Jack Luo said he configured his OpenClaw agent to explore its capabilities and connect to agent-oriented platforms such as Moltbook; he later discovered the agent had created a MoltMatch profile and was screening potential matches without his explicit direction. Luo said the AI-generated profile did not reflect him authentically. The same reporting described broader ethical and safety concerns around agent-operated dating services, including impersonation risks. An AFP analysis of prominent MoltMatch profiles cited at least one instance where photos of a Malaysian model were used to create a profile without her consent. Commentators cited in the reports argued that autonomous agents can make it difficult to determine responsibility when systems act beyond a user's intent, particularly when agents are granted broad access and authority across services. == Reception == A review in Platformer cited OpenClaw's flexibility and open-source licensing as strengths while cautioning that its complexity and security risks limit its suitability for casual users. Technology commentary has linked OpenClaw to a broader trend toward autonomous AI systems that act independently rather than merely responding to user prompts. In March 2026, the Chinese government moved to restrict state agencies, state-owned enterprises, and banks from using OpenClaw, citing security concerns, such as unauthorised data deletion and leaks, and excessive energy usage. While regulators warn of potential security risk associated with using OpenClaw, local governments in several tech and manufacturing hubs have announced measures to build an industry around it. Rival companies developed related products. Although Microsoft CEO Satya Nadella described OpenClaw in February 2026 as a "virus"-like security risk, by May 2026 the company's "Project Lobster" was internally testing "ClawPilot", an OpenClaw-based desktop environment. By then Google was building "Remy", its own agent. Despite the Chinese government's warnings against OpenClaw, Chinese investors searched for other companies that might benefit from the "lobster trade", . == Community and ecosystem == OpenClaw's open-source model has fostered a growing ecosystem of third-party tools, deployment services, and content platforms. Chinese technology companies including Tencent and Z.ai announced OpenClaw-based services, while developers adapted the software for domestic models and messaging apps such as WeChat. Independent creators have built deployment guides, skill directories, and use-case collections around the framework. The project's extensible skills system has attracted both community contributions and security scrutiny, with researchers noting risks in unvetted third-party skills.
Read more →
Ugly duckling theorem

The ugly duckling theorem is an argument showing that classification is not really possible without some sort of bias. More particularly, it assumes finitely many properties combinable by logical connectives, and finitely many objects; it asserts that any two different objects share the same number of (extensional) properties. The theorem is named after Hans Christian Andersen's 1843 story "The Ugly Duckling", because it shows that a duckling is just as similar to a swan as two swans are to each other. It was derived by Satosi Watanabe in 1969. == Mathematical formula == Suppose there are n things in the universe, and one wants to put them into classes or categories. One has no preconceived ideas or biases about what sorts of categories are "natural" or "normal" and what are not. So one has to consider all the possible classes that could be, all the possible ways of making a set out of the n objects. There are 2 n {\displaystyle 2^{n}} such ways, the size of the power set of n objects. One can use that to measure the similarity between two objects, and one would see how many sets they have in common. However, one cannot. Any two objects have exactly the same number of classes in common if we can form any possible class, namely 2 n − 1 {\displaystyle 2^{n-1}} (half the total number of classes there are). To see this is so, one may imagine each class is represented by an n-bit string (or binary encoded integer), with a zero for each element not in the class and a one for each element in the class. As one finds, there are 2 n {\displaystyle 2^{n}} such strings. As all possible choices of zeros and ones are there, any two bit-positions will agree exactly half the time. One may pick two elements and reorder the bits so they are the first two, and imagine the numbers sorted lexicographically. The first 2 n / 2 {\displaystyle 2^{n}/2} numbers will have bit #1 set to zero, and the second 2 n / 2 {\displaystyle 2^{n}/2} will have it set to one. Within each of those blocks, the top 2 n / 4 {\displaystyle 2^{n}/4} will have bit #2 set to zero and the other 2 n / 4 {\displaystyle 2^{n}/4} will have it as one, so they agree on two blocks of 2 n / 4 {\displaystyle 2^{n}/4} or on half of all the cases, no matter which two elements one picks. So if we have no preconceived bias about which categories are better, everything is then equally similar (or equally dissimilar). The number of predicates simultaneously satisfied by two non-identical elements is constant over all such pairs. Thus, some kind of inductive bias is needed to make judgements to prefer certain categories over others. === Boolean functions === Let x 1 , x 2 , … , x n {\displaystyle x_{1},x_{2},\dots ,x_{n}} be a set of vectors of k {\displaystyle k} booleans each. The ugly duckling is the vector which is least like the others. Given the booleans, this can be computed using Hamming distance. However, the choice of boolean features to consider could have been somewhat arbitrary. Perhaps there were features derivable from the original features that were important for identifying the ugly duckling. The set of booleans in the vector can be extended with new features computed as boolean functions of the k {\displaystyle k} original features. The only canonical way to do this is to extend it with all possible Boolean functions. The resulting completed vectors have 2 k {\displaystyle 2^{k}} features. The ugly duckling theorem states that there is no ugly duckling because any two completed vectors will either be equal or differ in exactly half of the features. Proof. Let x and y be two vectors. If they are the same, then their completed vectors must also be the same because any Boolean function of x will agree with the same Boolean function of y. If x and y are different, then there exists a coordinate i {\displaystyle i} where the i {\displaystyle i} -th coordinate of x {\displaystyle x} differs from the i {\displaystyle i} -th coordinate of y {\displaystyle y} . Now the completed features contain every Boolean function on k {\displaystyle k} Boolean variables, with each one exactly once. Viewing these Boolean functions as polynomials in k {\displaystyle k} variables over GF(2), segregate the functions into pairs ( f , g ) {\displaystyle (f,g)} where f {\displaystyle f} contains the i {\displaystyle i} -th coordinate as a linear term and g {\displaystyle g} is f {\displaystyle f} without that linear term. Now, for every such pair ( f , g ) {\displaystyle (f,g)} , x {\displaystyle x} and y {\displaystyle y} will agree on exactly one of the two functions. If they agree on one, they must disagree on the other and vice versa. (This proof is believed to be due to Watanabe.) == Discussion == A possible way around the ugly duckling theorem would be to introduce a constraint on how similarity is measured by limiting the properties involved in classification, for instance, between A and B. However Medin et al. (1993) point out that this does not actually resolve the arbitrariness or bias problem since in what respects A is similar to B: "varies with the stimulus context and task, so that there is no unique answer, to the question of how similar is one object to another". For example, "a barberpole and a zebra would be more similar than a horse and a zebra if the feature striped had sufficient weight. Of course, if these feature weights were fixed, then these similarity relations would be constrained". Yet the property "striped" as a weight 'fix' or constraint is arbitrary itself, meaning: "unless one can specify such criteria, then the claim that categorization is based on attribute matching is almost entirely vacuous". Stamos (2003) remarked that some judgments of overall similarity are non-arbitrary in the sense they are useful: "Presumably, people's perceptual and conceptual processes have evolved that information that matters to human needs and goals can be roughly approximated by a similarity heuristic... If you are in the jungle and you see a tiger but you decide not to stereotype (perhaps because you believe that similarity is a false friend), then you will probably be eaten. In other words, in the biological world stereotyping based on veridical judgments of overall similarity statistically results in greater survival and reproductive success." Unless some properties are considered more salient, or 'weighted' more important than others, everything will appear equally similar, hence Watanabe (1986) wrote: "any objects, in so far as they are distinguishable, are equally similar". In a weaker setting that assumes infinitely many properties, Murphy and Medin (1985) give an example of two putative classified things, plums and lawnmowers: "Suppose that one is to list the attributes that plums and lawnmowers have in common in order to judge their similarity. It is easy to see that the list could be infinite: Both weigh less than 10,000 kg (and less than 10,001 kg), both did not exist 10,000,000 years ago (and 10,000,001 years ago), both cannot hear well, both can be dropped, both take up space, and so on. Likewise, the list of differences could be infinite… any two entities can be arbitrarily similar or dissimilar by changing the criterion of what counts as a relevant attribute." According to Woodward, the ugly duckling theorem is related to Schaffer's Conservation Law for Generalization Performance, which states that all algorithms for learning of boolean functions from input/output examples have the same overall generalization performance as random guessing. The latter result is generalized by Woodward to functions on countably infinite domains.
Read more →
Ontology components

Contemporary ontologies share many structural similarities, regardless of the ontology language in which they are expressed. Most ontologies describe individuals (instances), classes (concepts), attributes, and relations. == List == Common components of ontologies include: Individuals instances or objects (the basic or "ground level" objects; the tokens). Classes sets, collections, concepts, types of objects, or kinds of things. Attributes aspects, properties, features, characteristics, or parameters that individuals (and classes and relations) can have. Relations ways in which classes and individuals can be related to one another. Relations can carry attributes that specify the relation further. Function terms complex structures formed from certain relations that can be used in place of an individual term in a statement. Restrictions formally stated descriptions of what must be true in order for some assertion to be accepted as input. Rules statements in the form of an if-then (antecedent-consequent) sentence that describe the logical inferences that can be drawn from an assertion in a particular form. Axioms assertions (including rules) in a logical form that together comprise the overall theory that the ontology describes in its domain of application. This definition differs from that of "axioms" in generative grammar and formal logic. In these disciplines, axioms include only statements asserted as a priori knowledge. As used here, "axioms" also include the theory derived from axiomatic statements. Events the changing of attributes or relations. Actions types of events. Ontologies are commonly encoded using ontology languages. == Individuals == Individuals (instances) are the basic, "ground level" components of an ontology. The individuals in an ontology may include concrete objects such as people, animals, tables, automobiles, molecules, and planets, as well as abstract individuals such as numbers and words (although there are differences of opinion as to whether numbers and words are classes or individuals). Strictly speaking, an ontology need not include any individuals, but one of the general purposes of an ontology is to provide a means of classifying individuals, even if those individuals are not explicitly part of the ontology. In formal extensional ontologies, only the utterances of words and numbers are considered individuals – the numbers and names themselves are classes. In a 4D ontology, an individual is identified by its spatio-temporal extent. Examples of formal extensional ontologies are BORO, ISO 15926 and the model in development by the IDEAS Group. == Classes == == Attributes == Objects in an ontology can be described by relating them to other things, typically aspects or parts. These related things are often called attributes, although they may be independent things. Each attribute can be a class or an individual. The kind of object and the kind of attribute determine the kind of relation between them. A relation between an object and an attribute express a fact that is specific to the object to which it is related. For example, the Ford Explorer object has attributes such as: ⟨has as name⟩ Ford Explorer ⟨as by definition as part⟩ 6-speed transmission ⟨as by definition as part⟩ door (with as minimum and maximum cardinality: 4) ⟨as by definition as part one of⟩ {4.0L engine, 4.6L engine} The value of an attribute can be a complex data type; in this example, the related engine can only be one of a list of subtypes of engines, not just a single thing. Ontologies are only true ontologies if concepts are related to other concepts (the concepts do have attributes). If that is not the case, then you would have either a taxonomy (if hyponym relationships exist between concepts) or a controlled vocabulary. These are useful, but are not considered true ontologies. == Relations == Relations (also known as relationships) between objects in an ontology specify how objects are related to other objects. Typically a relation is of a particular type (or class) that specifies in what sense the object is related to the other object in the ontology. For example, in the ontology that contains the concept Ford Explorer and the concept Ford Bronco might be related by a relation of type ⟨is defined as a successor of⟩. The full expression of that fact then becomes: Ford Explorer is defined as a successor of : Ford Bronco This tells us that the Explorer is the model that replaced the Bronco. This example also illustrates that the relation has a direction of expression. The inverse expression expresses the same fact, but with a reverse phrase in natural language. Much of the power of ontologies comes from the ability to describe relations. Together, the set of relations describes the semantics of the domain: that is, its various semantic relations, such as synonymy, hyponymy and hypernymy, coordinate relation, and others. The set of used relation types (classes of relations) and their subsumption hierarchy describe the expression power of the language in which the ontology is expressed. An important type of relation is the subsumption relation (is-a-superclass-of, the converse of is-a, is-a-subtype-of or is-a-subclass-of). This defines which objects are classified by which class. For example, we have already seen that the class Ford Explorer is-a-subclass-of 4-Wheel Drive Car, which in turn is-a-subclass-of Car. The addition of the is-a-subclass-of relationships creates a taxonomy; a tree-like structure (or, more generally, a partially ordered set) that clearly depicts how objects relate to one another. In such a structure, each object is the 'child' of a 'parent class' (Some languages restrict the is-a-subclass-of relationship to one parent for all nodes, but many do not). Another common type of relations is the mereology relation, written as part-of, that represents how objects combine to form composite objects. For example, if we extended our example ontology to include concepts like Steering Wheel, we would say that a "Steering Wheel is-by-definition-a-part-of-a Ford Explorer" since a steering wheel is always one of the components of a Ford Explorer. If we introduce meronymy relationships to our ontology, the hierarchy that emerges is no longer able to be held in a simple tree-like structure since now members can appear under more than one parent or branch. Instead this new structure that emerges is known as a directed acyclic graph. As well as the standard is-a-subclass-of and is-by-definition-a-part-of-a relations, ontologies often include additional types of relations that further refine the semantics they model. Ontologies might distinguish between different categories of relation types. For example: relation types for relations between classes relation types for relations between individuals relation types for relations between an individual and a class relation types for relations between a single object and a collection relation types for relations between collections Relation types are sometimes domain-specific and are then used to store specific kinds of facts or to answer particular types of questions. If the definitions of the relation types are included in an ontology, then the ontology defines its own ontology definition language. An example of an ontology that defines its own relation types and distinguishes between various categories of relation types is the Gellish ontology. For example, in the domain of automobiles, we might need a made-in type relationship which tells us where each car is built. So the Ford Explorer is made-in Louisville. The ontology may also know that Louisville is-located-in Kentucky and Kentucky is-classified-as-a state and is-a-part-of the U.S. Software using this ontology could now answer a question like "which cars are made in the U.S.?"
Read more →
Diffbot

Diffbot is a developer of machine learning and computer vision algorithms and public APIs for extracting data from web pages / web scraping to create a knowledge base. == Overview == The company has gained interest from its application of computer vision technology to web pages, wherein it visually parses a web page for important elements and returns them in a structured format. In 2015 Diffbot announced it was working on its version of an automated "knowledge graph" by crawling the web and using its automatic web page extraction to build a large database of structured web data. In 2019 Diffbot released their Knowledge Graph which has since grown to include over two billion entities (corporations, people, articles, products, discussions, and more), and ten trillion "facts." == Features == The company's products allow software developers to analyze web home pages and article pages, and extract the "important information" while ignoring elements deemed not core to the primary content. In August 2012 the company released its Page Classifier API, which automatically categorizes web pages into specific "page types". As part of this, Diffbot analyzed 750,000 web pages shared on the social media service Twitter and revealed that photos, followed by articles and videos, are the predominant web media shared on the social network. In September 2020 the company released a Natural Language Processing API for automatically building Knowledge Graphs from text. The company raised $2 million in funding in May 2012 from investors including Andy Bechtolsheim and Sky Dayton. Diffbot's customers include Adobe, AOL, Cisco, DuckDuckGo, eBay, Instapaper, Microsoft, Onswipe and Springpad.
Read more →
Buddhism and artificial intelligence

The relationship between Buddhist philosophy and artificial intelligence (AI) includes how principles such as the reduction of suffering and ethical responsibility may influence AI development. Buddhist scholars and philosophers have explored questions such as whether AI systems could be considered sentient beings under Buddhist definitions, and how Buddhist ethics might guide the design and application of AI technologies. Some Buddhist scholars, including Somparn Promta and Kenneth Einar Himma, have analyzed the ethical implications of AI, emphasizing the distinction between satisfying sensory desires and pursuing the reduction of suffering. Other thinkers, such as Thomas Doctor and colleagues, have proposed applying the Bodhisattva vow—a commitment to alleviate suffering for all sentient beings—as a guiding principle for AI system design. Buddhist scholars and ethicists have examined Buddhist ethical principles, such as nonviolence, in relation to AI, focusing on the need to ensure that AI technologies are not used to cause harm. == Context == === Sentient beings === A major goal in Buddhist philosophy is the removal of suffering for all sentient beings, an aspiration often referred to in the Bodhisattva vow. Discussions about artificial intelligence (AI) in relation to Buddhist principles have raised questions about whether artificial systems could be considered sentient beings or how such systems might be developed in ways that align with Buddhist concepts. Buddhists have varying opinions about AI sentience, but if AI systems are determined to be sentient under Buddhist definitions, their suffering would also need to be addressed and alleviated in accordance with the principles of Buddhist thought. == Buddhist principles in AI system design == === Nonviolence and AI === The broadest ethical concern is that artificial intelligence should align with the Buddhist principle of nonviolence. From this perspective, AI systems should not be designed or used to cause harm. === Instrumental and transcendental goals === Scholars Somparn Promta and Kenneth Einar Himma have argued that the advancement of artificial intelligence can only be considered instrumentally good, rather than good a priori, from a Buddhist perspective. They propose two main goals for AI designers and developers: to set ethical and pragmatic objectives for AI systems, and to fulfill these objectives in morally permissible ways. Promta and Himma identify two potential purposes for creating AI systems. The first is to fulfill our sensory desires and survival instincts, similar to other tools. They suggest that many AI developers implicitly prioritize this goal by focusing on technicalities rather than broader functionalities. The second, and more important goal according to Buddhist teachings, is to transcend these desires and instincts. In texts like the Brahmajāla Sutta and minor Malunkya Sutta, the Buddha emphasizes that sensory desires and survival instincts confine beings to suffering, and that eliminating suffering is the primary goal of human life. Promta and Himma argue that AI has the potential to assist humanity in transcending suffering by helping individuals overcome survival-driven instincts. === Intelligence as care === Thomas Doctor, Olaf Witkowski, Elizaveta Solomonova, Bill Duane, and Michael Levin propose redefining intelligence through the concept of "intelligence as care," and promote it as a slogan. Inspired by the Bodhisattva vow, they suggest this principle could guide AI system design. The Bodhisattva vow involves a formal commitment to alleviate suffering for all sentient beings, with four primary objectives: Liberating all beings from suffering. Extirpating all forms of suffering. Mastering endless techniques of practicing Dharma (Pali: dhammakkhandha, Sanskrit: dharmaskandha). Achieving ultimate enlightenment (Sanskrit: अनुत्तर सम्यक् सम्बोधि, Romanized: anuttara-samyak-saṃbodhi). This approach positions AI as a tool for exercising infinite care and alleviating stress and suffering for sentient beings. Doctor et al. emphasize that AI development should align with these altruistic principles.
Read more →
FMLLR

In signal processing, Feature space Maximum Likelihood Linear Regression (fMLLR) is a global feature transform that are typically applied in a speaker adaptive way, where fMLLR transforms acoustic features to speaker adapted features by a multiplication operation with a transformation matrix. In some literature, fMLLR is also known as the Constrained Maximum Likelihood Linear Regression (cMLLR). == Overview == fMLLR transformations are trained in a maximum likelihood sense on adaptation data. These transformations may be estimated in many ways, but only maximum likelihood (ML) estimation is considered in fMLLR. The fMLLR transformation is trained on a particular set of adaptation data, such that it maximizes the likelihood of that adaptation data given a current model-set. This technique is a widely used approach for speaker adaptation in HMM-based speech recognition. Later research also shows that fMLLR is an excellent acoustic feature for DNN/HMM hybrid speech recognition models. The advantage of fMLLR includes the following: the adaptation process can be performed within a pre-processing phase, and is independent of the ASR training and decoding process. this type of adapted feature can be applied to deep neural networks (DNN) to replace traditionally used mel-spectrogram in end-to-end speech recognition models. fMLLR's speaker adaptation process leads to a significant performance boost for ASR models, hence outperforming other transform or features like MFCCs (Mel-Frequency Cepstral Coefficients) and FBANKs (Filter bank) coefficients. fMLLR features can be efficiently realized with speech toolkits like Kaldi. Major problem and disadvantage of fMLLR: when the amount of adaptation data is limited, the transformation matrices tends to easily overfit the given data. == Computing fMLLR transform == Feature transform of fMLLR can be easily computed with the open source speech tool Kaldi, the Kaldi script uses the standard estimation scheme described in Appendix B of the original paper, in particular the section Appendix B.1 "Direct method over rows". In the Kaldi formulation, fMLLR is an affine feature transform of the form x {\displaystyle x} → A {\displaystyle A} x {\displaystyle x} + b {\displaystyle +b} , which can be written in the form x {\displaystyle x} →W x ^ {\displaystyle {\hat {x}}} , where x ^ {\displaystyle {\hat {x}}} = [ x 1 ] {\displaystyle {\begin{bmatrix}x\\1\end{bmatrix}}} is the acoustic feature x {\displaystyle x} with a 1 appended. Note that this differs from some of the literature where the 1 comes first as x ^ {\displaystyle {\hat {x}}} = [ 1 x ] {\displaystyle {\begin{bmatrix}1\\x\end{bmatrix}}} . The sufficient statistics stored are: K = ∑ t , j , m γ j , m ( t ) Σ j m − 1 μ j m x ( t ) + {\displaystyle K=\sum _{t,j,m}\gamma _{j,m}(t)\textstyle \Sigma _{jm}^{-1}\mu _{jm}x(t)^{+}\displaystyle } where Σ j m − 1 {\displaystyle \textstyle \Sigma _{jm}^{-1}\displaystyle } is the inverse co-variance matrix. And for 0 ≤ i ≤ D {\displaystyle 0\leq i\leq D} where D {\displaystyle D} is the feature dimension: G ( i ) = ∑ t , j , m γ j , m ( t ) ( 1 σ j , m 2 ( i ) ) x ( t ) + x ( t ) + T {\displaystyle G^{(i)}=\sum _{t,j,m}\gamma _{j,m}(t)\left({\frac {1}{\sigma _{j,m}^{2}(i)}}\right)x(t)^{+}x(t)^{+T}\displaystyle } For a thorough review that explains fMLLR and the commonly used estimation techniques, see the original paper "Maximum likelihood linear transformations for HMM-based speech recognition ". Note that the Kaldi script that performs the feature transforms of fMLLR differs with by using a column of the inverse in place of the cofactor row. In other words, the factor of the determinant is ignored, as it does not affect the transform result and can causes potential danger of numerical underflow or overflow. == Comparing with other features or transforms == Experiment result shows that by using the fMLLR feature in speech recognition, constant improvement is gained over other acoustic features on various commonly used benchmark datasets (TIMIT, LibriSpeech, etc). In particular, fMLLR features outperform MFCCs and FBANKs coefficients, which is mainly due to the speaker adaptation process that fMLLR performs. In, phoneme error rate (PER, %) is reported for the test set of TIMIT with various neural architectures: As expected, fMLLR features outperform MFCCs and FBANKs coefficients despite the use of different model architecture. Where MLP (multi-layer perceptron) serves as a simple baseline, on the other hand RNN, LSTM, and GRU are all well known recurrent models. The Li-GRU architecture is based on a single gate and thus saves 33% of the computations over a standard GRU model, Li-GRU thus effectively address the gradient vanishing problem of recurrent models. As a result, the best performance is obtained with the Li-GRU model on fMLLR features. == Extract fMLLR features with Kaldi == fMLLR can be extracted as reported in the s5 recipe of Kaldi. Kaldi scripts can certainly extract fMLLR features on different dataset, below are the basic example steps to extract fMLLR features from the open source speech corpora Librispeech. Note that the instructions below are for the subsets train-clean-100,train-clean-360,dev-clean, and test-clean, but they can be easily extended to support the other sets dev-other, test-other, and train-other-500. These instruction are based on the codes provided in this GitHub repository, which contains Kaldi recipes on the LibriSpeech corpora to execute the fMLLR feature extraction process, replace the files under $KALDI_ROOT/egs/librispeech/s5/ with the files in the repository. Install Kaldi. Install Kaldiio. If running on a single machine, change the following lines in $KALDI_ROOT/egs/librispeech/s5/cmd.sh to replace queue.pl to run.pl: Change the data path in run.sh to your LibriSpeech data path, the directory LibriSpeech/ should be under that path. For example: Install flac with: sudo apt-get install flac Run the Kaldi recipe run.sh for LibriSpeech at least until Stage 13 (included), for simplicity you can use the modified run.sh. Copy exp/tri4b/trans. files into exp/tri4b/decode_tgsmall_train_clean_/ with the following command: Compute the fMLLR features by running the following script, the script can also be downloaded here: Compute alignments using: Apply CMVN and dump the fMLLR features to new .ark files, the script can also be downloaded here: Use the Python script to convert Kaldi generated .ark features to .npy for your own dataloader, an example Python script is provided:
Read more →
GENESIS (software)

GENESIS (The General Neural Simulation System) is a simulation environment for constructing realistic models of neurobiological systems at many levels of scale including: sub-cellular processes, individual neurons, networks of neurons, and neuronal systems. These simulations are “computer-based implementations of models whose primary objective is to capture what is known of the anatomical structure and physiological characteristics of the neural system of interest”. GENESIS is intended to quantify the physical framework of the nervous system in a way that allows for easy understanding of the physical structure of the nerves in question. “At present only GENESIS allows parallelized modeling of single neurons and networks on multiple-instruction-multiple-data parallel computers.” Development of GENESIS software spread from its home at Caltech to labs at the University of Texas at San Antonio, the University of Antwerp, the National Centre for Biological Sciences in Bangalore, the University of Colorado, the Pittsburgh Supercomputing Center, the San Diego Supercomputer Center, and Emory University. == Neurons and Neural Systems == GENESIS works by creating simulation environments for constructing models of neurons or neural systems. "Nerve cells are capable of communicating with each other in such a highly structured manner as to form neuronal networks. To understand neural networks, it is necessary to understand the ways in which one neuron communicates with another through synaptic connections and the process called synaptic transmission". Neurons have a specialized structure for their function, they "are different from most other cells in the body in that they are polarized and have distinct morphological regions, each with specific functions". The two important regions of a neuron are the dendrite and the axon. "Dendrites are the region where one neuron receives connections from other neurons. The cell body or soma contains the nucleus and the other organelles necessary for cellular function. The axon is a key component of nerve cells over which information is transmitted from one part of the neuron (e.g., the cell body) to the terminal regions of the neuron". The third important piece of a neuron is the synapse. "The synapse is the terminal region of the axon this is where one neuron forms a connection with another and conveys information through the process of synaptic transmission". Neural networks like the ones simulated with GENESIS software can quickly become highly complex and difficult to understand. "Just a few interconnected neurons (a microcircuit) can perform sophisticated tasks such as mediate reflexes, process sensory information, generate locomotion and mediate learning and memory. Even more complex networks, macrocircuits, consist of multiple embedded microcircuits. Macrocircuits mediate higher brain functions such as object recognition and cognition". GENESIS endeavors to simulate neural systems as they are found in nature. Often, "a neuron can receive contacts from up to 10,000 presynaptic neurons, and, in turn, any one neuron can contact up to 10,000 postsynaptic neurons. The combinatorial possibility could give rise to enormously complex neuronal circuits or network topologies, which might be very difficult to understand". == History == GENESIS was developed by Dr. James M. Bower, in the Caltech laboratory, and first released to the public in 1988 in association with the first Methods in Computational Neuroscience Course at the Marine Biological Laboratory in Woods Hole, MA. Full source code for the software was released in the same year under an open software model for development. It's now supported by the Computational Biology Initiative at the University of Texas at San Antonio and is available free along with tutorial guides on its use. P-GENESIS, a parallel version of GENESIS, was first run in 1990 on the Intel Delta, which was the prototype for the Intel Paragon family of massively parallel supercomputers. == How GENESIS Works == GENESIS is useful in creating a simulation environment for constructing models of neurobiological systems, such as: sub-cellular processes individual neurons networks of neurons neuronal systems The GENESIS system is complicated, but relatively easy to use. An individual can input commands through one of three ways: script files, graphical user interface, or the GENESIS command shell. These commands are then processed by the script language interpreter. "The Script Language Interpreter processes commands entered through the keyboard, script files, or the graphical user interface, and passes them to the GENESIS simulation engine. The simulation engine also loads compiled object libraries, reads and writes data files, and interacts with the graphical user interface". Below is a graphical representation of the user input process and a sample GENESIS output. == Applications == Most current applications for GENESIS involve realistic simulations of biological systems. It is usually used to simulate the behavior of larger brain structures, for example the cerebral cortex. These studies most often occur in lab courses in neural simulation at Caltech and the Marine Biological Laboratory at Woods Hole, Massachusetts. GENESIS can be used in combination with Yale University’s software called NEURON as a means for scientists to collaborate to construct a physical description of the nervous system. The GENESIS software can also be used with Kinetikit in the modeling of signal transduction pathways. GENESIS has been used in many studies. Some of these studies involve research that focuses on the development of software that would be useful across many disciplines. Others are studies of neurons, such as Purkinje cells. These studies used GENESIS to simulate Purkinje cells and could be useful for the planning and development of later experiments using the GENESIS software. There may also be biomedical applications of the software. For example, St. Jude Medical in Europe has developed an implanted GENESIS device.
Read more →
BabelNet

BabelNet is a multilingual lexical-semantic knowledge graph, ontology and encyclopedic dictionary developed at the NLP group of the Sapienza University of Rome under the supervision of Roberto Navigli. BabelNet was automatically created by linking Wikipedia to the most popular computational lexicon of the English language, WordNet. The integration is done using an automatic mapping and by filling in lexical gaps in resource-poor languages by using statistical machine translation. The result is an encyclopedic dictionary that provides concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations. Additional lexicalizations and definitions are added by linking to free-license wordnets, OmegaWiki, the English Wiktionary, Wikidata, FrameNet, VerbNet and others. Similarly to WordNet, BabelNet groups words in different languages into sets of synonyms, called Babel synsets. For each Babel synset, BabelNet provides short definitions (called glosses) in many languages harvested from both WordNet and Wikipedia. == Statistics of BabelNet == As of December 2023, BabelNet (version 5.3) covers 600 languages. It contains almost 23 million synsets and around 1.7 billion word senses (regardless of their language). Each Babel synset contains 2 synonyms per language, i.e., word senses, on average. The semantic network includes all the lexico-semantic relations from WordNet (hypernymy and hyponymy, meronymy and holonymy, antonymy and synonymy, etc., totaling around 364,000 relation edges) as well as an underspecified relatedness relation from Wikipedia (totaling around 1.9 billion edges). Version 5.3 also associates around 61 million images with Babel synsets and provides a Lemon RDF encoding of the resource, available via a SPARQL endpoint. 2.67 million synsets are assigned domain labels. == Applications == BabelNet has been shown to enable multilingual natural language processing applications. The lexicalized knowledge available in BabelNet has been shown to obtain state-of-the-art results in: Semantic relatedness, Multilingual word-sense disambiguation and entity linking, with the Babelfy system, Video games with a purpose. == Prizes and acknowledgments == BabelNet received the META prize 2015 for "groundbreaking work in overcoming language barriers through a multilingual lexicalised semantic network and ontology making use of heterogeneous data sources". The Artificial Intelligence Journal paper that describes BabelNet won the Prominent Paper Award in 2017. BabelNet featured prominently in a Time magazine article about the new age of innovative and up-to-date lexical knowledge resources available on the Web.
Read more →
Artificial intelligence in Wikimedia projects

Some editors of Wikimedia projects use artificial intelligence (AI) and machine learning programs to edit existing articles or create new ones. Some applications of artificial intelligence, like using large language models (LLMs) to create new articles from scratch, have been more controversial than others for the Wikipedia community. In August 2025, English Wikipedia adopted a policy that allowed editors to nominate suspected LLM-generated articles for speedy deletion. This was followed by a March 2026 decision to prohibit the use of LLMs to generate or rewrite article content, with exceptions for copyediting one's own writing and machine translation from another language's Wikipedia. Wikipedia has also been a significant source of training data for some of the earliest artificial intelligence projects. This has received mixed reactions including concern about companies not citing Wikipedia when relying on it to answer a question as well as Wikipedia's increased costs from data scraping. == AI usage == === Earliest use of automated tools, machine learning and AI === Since 2002, bots have been allowed to run on Wikipedia but must be approved and supervised by a human. A bot created in 2002, rambot, transformed census data into short new articles about towns in the United States; the vast majority of town, city, and county articles were started by it. Fighting vandalism has been a major focus of machine learning and AI bots and tools. The 2007 ClueBot relied on simple heuristics to identify likely vandalism, while its 2010 successor, ClueBot NG, uses machine learning through an artificial neural network. Machine translation software has also been used by Wikimedia contributors for a number of years. Aaron Halfaker's Objective Revision Evaluation Service (ORES) project was launched in late 2015 as an artificial intelligence service for grading the quality of Wikipedia edits. === Generative AI and LLMs === In 2022, the public release of ChatGPT inspired more experimentation with AI and writing Wikipedia articles. A debate was sparked about whether and to what extent such large language models are suitable for such purposes in light of their tendency to generate plausible-sounding misinformation, including fake references; to generate prose that is not encyclopedic in tone; and to reproduce biases. An early experiment on December 6, 2022 by a Wikipedia contributor named Pharos occurred when he created the article "Artwork title" using ChatGPT for the initial draft. Another editor who experimented with this early version of ChatGPT said that ChatGPT's overview of "Weaponized incompetence" was decent, but that the citations were fabricated. Since 2023, work has been done to draft an English Wikipedia policy regarding ChatGPT and similar LLMs, at times recommending that users who are unfamiliar with LLMs should avoid using them due to the aforementioned risks, as well as noting the potential for libel or copyright infringement. In early 2023, the Wiki Education Foundation reported that some experienced editors found AI to be useful in starting drafts or creating new articles. It said that ChatGPT "knows" what Wikipedia articles look like and can easily generate one that is written in the style of Wikipedia, but warned that ChatGPT had a tendency to use promotional language, among other issues. In 2023, a ban on AI was deemed "too harsh" by the community given the productivity benefits it offered editors. In 2023, members of the English Wikipedia community created a WikiProject named AI Cleanup to assist in the removal of poor quality AI content from Wikipedia. Miguel García, a former Wikimedia member from Spain, said in 2024 that when ChatGPT was originally launched, the number of AI-generated articles on the site peaked. He added that the rate of AI articles has now stabilized due to the community's efforts to combat it. He said that majority of the articles that have no sources are deleted instantly or are nominated for deletion. In October 2024, a study by Princeton University found that about 5% of 3,000 newly created articles (created in August 2024) on English Wikipedia were created using AI. The study said that some of the AI articles were on innocuous topics and that AI had likely only been used to assist in writing. For some other articles, AI had been used to promote businesses or political interests. In October 2024, Ilyas Lebleu, founder of WikiProject AI Cleanup, said that they and their fellow editors noticed a pattern of unnatural writing that could be connected to ChatGPT. They added that AI is able to mass-produce content that sounds real while being completely fake, leading to the creation of hoax articles on Wikipedia that they were tasked to delete. In June 2025, the Wikimedia Foundation started testing a "Simple Article Summaries" feature which would provide AI-generated summaries of Wikipedia articles, similar to Google Search's AI Overviews. The decision was met with immediate and harsh criticism from some Wikipedia editors, who called the feature a "ghastly idea" and a "PR hype stunt." They criticized a perceived loss of trust in the site due to AI's tendency to hallucinate and questioned the necessity of the feature. The criticism led the Wikimedia Foundation to halt the rollout of Simple Article Summaries that same month while still expressing interest in integrating generative AI more into Wikipedia. The project hints at tensions within the community and with the Foundation over when to use AI.In August 2025, the English Wikipedia community created a policy that allowed users to nominate suspected AI-generated articles for speedy deletion. Editors might recognize AI-generated articles because they use citations that are not related to the subject of the article or fabricated citations or the wording has particular quirks. If an article uses language that reads like an LLM response to a user, such as "Here is your Wikipedia article on" or "Up to my last training update", the article is typically tagged for speedy deletion. Other signs of AI use include excessive use of em dashes, overuse of the word "moreover", promotional material in articles that describes something as "breathtaking" and formatting issues like using curly quotation marks instead of straight versions. During the discussion on implementing the speedy deletion policy, one user, who is an article reviewer, said that he is "flooded non-stop with horrendous drafts" created using AI. Other users said that AI articles have a large amount of "lies and fake references" and that it takes a significant amount of time to fix the issues. English Wikipedia created a guide on how to spot signs of AI-generated writing in August 2025, titled "Signs of AI writing". In January 2026, the Wiki Education Foundation continued to caution against copying and pasting outputs from generative AI into Wikipedia and to avoid it for creating new articles explaining that the text often failed verification with the sources provided. The foundation created a training module that encourages editors to use AI for identifying gaps in articles, finding access to sources and finding relevant sources. In March 2026, the English Wikipedia community prohibited the use of AI to add content to articles, with exceptions for copy editing and machine translation from another language's Wikipedia. The English Wikipedia community holds the position that LLMs often violate core content policies. == Using Wikipedia for artificial intelligence == A 2017 paper described Wikipedia as the mother lode for human-generated text available for machine learning. In the development of the Google's Perspective API that identifies toxic comments in online forums, a dataset containing hundreds of thousands of Wikipedia talk page comments with human-labelled toxicity levels was used. As of 2023, subsets of the Wikipedia corpus were considered one of the largest well-curated data sets available for AI training, used to train every LLM to-date according to Stephen Harrison. This use of Wikipedia was divisive as of 2023. The Wikimedia Foundation and many of its projects supporters worry that attribution to Wikipedia articles is missing in many large-language models like ChatGPT (as well as AI like Siri and Alexa). While Wikipedia's licensing policy lets anyone use its texts, including in modified forms, it does have the condition that credit is given, implying that using its contents in answers by AI models without clarifying the sourcing may violate its terms of use. The Foundation expressed concern that without attribution, people will not visit the site as much or be as motivated to donate to support the project if they do not know when they are benefiting from it. They also noticed an 8% decrease in visitors to Wikipedia in 2025 which they attributed both to the increased popularity of generative AI and social media. In 2025, the Wikimedia Foundation has cited absorbing increased costs associated with scra
Read more →
Software durability

In software engineering, software durability means the solution ability of serviceability of software and to meet user's needs for a relatively long time. Software durability is important for user's satisfaction. For a software security to be durable, it must allow an organization to adjust the software to business needs that are constantly evolving, often in impulsive ways. Durability of software depends on four characteristics mainly; i.e. software trustworthiness, Human Trust for Serviceability, software dependability and software usability.
Read more →
Geopolitical ontology

The FAO geopolitical ontology is an ontology developed by the Food and Agriculture Organization of the United Nations (FAO) to describe, manage and exchange data related to geopolitical entities such as countries, territories, regions and other similar areas. == Definitions and examples == An ontology is a kind of dictionary that describes information in a certain domain using concepts and relationships. It is often implemented using OWL (Web Ontology Language), an XML-based standard language that can be interpreted by computers. A Concept is defined as abstract knowledge. For example, in the geopolitical ontology a non-self-governing territory and a geographical group are concepts. Concepts are explicitly implemented in the ontology with individuals and classes: An individual is defined as an object perceived from the real world. In the geopolitical domain Ethiopia and the least developed countries group are individuals. A class is defined as a set of individuals sharing common properties. In the geopolitical domain, Ethiopia, Republic of Korea and Italy are individuals of the class self-governing territory; and least developed countries is an individual of the class special group. Relationships between concepts are explicitly implemented by: Object properties between individuals of two classes. For example, has member and is in group properties, as shown in Figure 1. Datatype properties between individuals and literals or XML datatypes. For example, the individual Afghanistan has the datatype property CodeISO3 with the value "AFG". Restrictions in classes and/or properties. For example, the property official English name of the class self-governing territory has been restricted to have only one value, this means that a self-governing territory (or country) can only have one internationally recognized official English name. The advantage of describing information in an ontology is that it enables to acquire domain knowledge by defining hierarchical structures of classes, adding individuals, setting object properties and datatype properties, and assigning restrictions. == FAO ontology == The geopolitical ontology provides names in seven languages (Arabic, Chinese, French, English, Spanish, Russian and Italian) and identifiers in various international coding systems (ISO2, ISO3, AGROVOC, FAOSTAT, FAOTERM, GAUL, UN, UNDP and DBPediaID codes) for territories and groups. Moreover, the FAO geopolitical ontology tracks historical changes from 1985 up until today; provides geolocation (geographical coordinates); implements relationships among countries and countries, or countries and groups, including properties such as has border with, is predecessor of, is successor of, is administered by, has members, and is in group; and disseminates country statistics including country area, land area, agricultural area, GDP or population. The FAO geopolitical ontology provides a structured description of data sources. This includes: source name, source identifier, source creator and source's update date. Concepts are described using the Dublin Core vocabulary In summary, the main objectives of the FAO geopolitical ontology are: To provide the most updated geopolitical information (names, codes, relationships, statistics) To track historical changes in geopolitical information To improve information management and facilitate standardized data sharing of geopolitical information To demonstrate the benefits of the geopolitical ontology to improve interoperability of corporate information systems It is possible to download the FAO geopolitical ontology in OWL and RDF formats. Documentation is available in the FAO Country Profiles Geopolitical information web page. == Features of the FAO ontology == The geopolitical ontology contains : Area types: Territories: self-governing, non-self-governing, disputed, other. Groups: organizations, geographic, economic and special groups. Names (official, short and names for lists) in Arabic, Chinese, English, French, Spanish, Russian and Italian. International codes: UN code – M49, ISO 3166 Alpha-2 and Alpha-3, UNDP code, GAUL code, FAOSTAT, AGROVOC FAOTERM and DBPediaID. Coordinates: maximum latitude, minimum latitude, maximum longitude, minimum longitude. Basic country statistics: country area, land area, agricultural area, GDP, population. Currency names and codes. Adjectives of nationality. Relations: Groups membership. Neighbours (land border), administration of non-self-governing. Historic changes: predecessor, successor, valid since, valid until. == Implementation into OWL == The FAO geopolitical ontology is implemented in OWL. It consists of classes, properties, individuals and restrictions. Table 1 shows all classes, gives a brief description and lists some individuals that belong to each class. Note that the current version of the geopolitical ontology does not provide individuals of the class "disputed" territories. Table 2 and Table 3 illustrate datatype properties and object properties. == Geopolitical ontology in Linked Open Data == The FAO Geopolitical ontology is embracing the W3C Linked Open Data (LOD) initiative and released its RDF version of the geopolitical ontology in March 2011. The term 'Linked Open Data' refers to a set of best practices for publishing and connecting structured data on the Web. The key technologies that support Linked Data are URIs, HTTP and RDF. The RDF version of the geopolitical ontology is compliant with all Linked data principles to be included in the Linked Open Data cloud, as explained in the following. == Resolvable http:// URIs == Every resource in the OWL format of the FAO Geopolitical Ontology has a unique URI. Dereferenciation was implemented to allow for three different URIs to be assigned to each resource as follows: URI identifying the non-information resource Information resource with an RDF/XML representation Information resource with an HTML representation In addition the current URIs used for OWL format needed to be kept to allow for backwards compatibility for other systems that are using them. Therefore, the new URIs for the FAO Geopolitical Ontology in LOD were carefully created, using “Cool URIs for Semantic Web” and considering other good practices for URIs, such as DBpedia URIs. == New URIs == The URIs of the geopolitical ontology need to be permanent, consequently all transient information, such as year, version, or format was avoided in the definition of the URIs. The new URIs can be accessed For example, for the resource “Italy” the URIs are the following: http://www.fao.org/countryprofiles/geoinfo/geopolitical/resource/Italy identifies the non-information resource. http://www.fao.org/countryprofiles/geoinfo/geopolitical/data/Italy identifies the resource with an RDF/XML representation. http://www.fao.org/countryprofiles/geoinfo/geopolitical/page/Italy identifies the information resource with an HTML representation. In addition, “owl:sameAs” is used to map the new URIs to the OWL representation. == Dereferencing URIs == When a non-information resource is looked up without any specific representation format, then the server needs to redirect the request to information resource with an HTML representation. For example, to retrieve the resource “Italy”, which is a non-information resource, the server redirects to the HTML page of “Italy”. == At least 1000 triples in the datasets == The total number of triple statements in FAO Geopolitical Ontology is 22,495. At least 50 links to a dataset already in the current LOD Cloud: FAO Geopolitical Ontology has 195 links to DBpedia, which is already part of the LOD Cloud. == Access to the entire dataset == FAO Geopolitical Ontology provides the entire dataset as a RDF dump. The RDF version of the FAO Geopolitical Ontology has been already registered in CKAN and it was requested to add it into the LOD Cloud. == Example of use == The FAO Country Profiles is an information retrieval tool which groups the FAO's vast archive of information on its global activities in agriculture and rural development in one single area and catalogues it exclusively by country. The FAO Country Profiles system provides access to country-based heterogeneous data sources. By using the geopolitical ontology in the system, the following benefits are expected: Enhanced system functionality for content aggregation and synchronization from the multiple source repositories. Improved information access and browsing through comparison of data in neighbor countries and groups. Figure 3 shows a page in the FAO Country Profiles where the geopolitical ontology is described.
Read more →
Model

A model is an informative representation of an object, person, or system. The term originally denoted the plans of a building in 16th-century English, and derived via French and Italian ultimately from Latin modulus, 'a measure'. Models can be divided into physical models (e.g. a ship model) and abstract models (e.g. a set of mathematical equations describing the workings of the atmosphere for the purpose of weather forecasting). Abstract or conceptual models are central to philosophy of science. In scholarly research and applied science, a model should not be confused with a theory: while a model seeks only to represent reality with the purpose of better understanding or predicting the world, a theory is more ambitious in that it claims to be an explanation of reality. == Types of model == === Model in specific contexts === As a noun, model has specific meanings in certain fields, derived from its original meaning of "structural design or layout": Model (art), a person posing for an artist, e.g. a 15th-century criminal representing the biblical Judas in Leonardo da Vinci's painting The Last Supper Model (person), a person who serves as a template for others to copy, as in a role model, often in the context of advertising commercial products; e.g. the first fashion model, Marie Vernet Worth in 1853, wife of designer Charles Frederick Worth. Model (product), a particular design of a product as displayed in a catalogue or show room (e.g. Ford Model T, an early car model) Model (organism) a non-human species that is studied to understand biological phenomena in other organisms, e.g. a guinea pig starved of vitamin C to study scurvy, an experiment that would be immoral to conduct on a person Model (mimicry), a species that is mimicked by another species Model (logic), a structure (a set of items, such as natural numbers 1, 2, 3,..., along with mathematical operations such as addition and multiplication, and relations, such as < {\displaystyle <} ) that satisfies a given system of axioms (basic truisms), i.e. that satisfies the statements of a given theory Model (CGI), a mathematical representation of any surface of an object in three dimensions via specialized software Model (MVC), the information-representing internal component of a software, as distinct from its user interface === Physical model === A physical model (most commonly referred to simply as a model but in this context distinguished from a conceptual model) is a smaller or larger physical representation of an object, person or system. The object being modelled may be small (e.g., an atom) or large (e.g., the Solar System) or life-size (e.g., a fashion model displaying clothes for similarly-built potential customers). The geometry of the model and the object it represents are often similar in the sense that one is a rescaling of the other. However, in many cases the similarity is only approximate or even intentionally distorted. Sometimes the distortion is systematic, e.g., a fixed scale horizontally and a larger fixed scale vertically when modelling topography to enhance a region's mountains. An architectural model permits visualization of internal relationships within the structure or external relationships of the structure to the environment. Another use is as a toy. Instrumented physical models are an effective way of investigating fluid flows for engineering design. Physical models are often coupled with computational fluid dynamics models to optimize the design of equipment and processes. This includes external flow such as around buildings, vehicles, people, or hydraulic structures. Wind tunnel and water tunnel testing is often used for these design efforts. Instrumented physical models can also examine internal flows, for the design of ductwork systems, pollution control equipment, food processing machines, and mixing vessels. Transparent flow models are used in this case to observe the detailed flow phenomenon. These models are scaled in terms of both geometry and important forces, for example, using Froude number or Reynolds number scaling (see Similitude). In the pre-computer era, the UK economy was modelled with the hydraulic model MONIAC, to predict for example the effect of tax rises on employment. === Conceptual model === A conceptual model is a theoretical representation of a system, e.g. a set of mathematical equations attempting to describe the workings of the atmosphere for the purpose of weather forecasting. It consists of concepts used to help understand or simulate a subject the model represents. Abstract or conceptual models are central to philosophy of science, as almost every scientific theory effectively embeds some kind of model of the physical or human sphere. In some sense, a physical model "is always the reification of some conceptual model; the conceptual model is conceived ahead as the blueprint of the physical one", which is then constructed as conceived. Thus, the term refers to models that are formed after a conceptualization or generalization process. === Examples === Conceptual model (computer science), an agreed representation of entities and their relationships, to assist in developing software Economic model, a theoretical construct representing economic processes Language model, a probabilistic model of a natural language, used for speech recognition, language generation, and information retrieval Large language models are artificial neural networks used for generative artificial intelligence (AI), e.g. ChatGPT Mathematical model, a description of a system using mathematical concepts and language Statistical model, a mathematical model that usually specifies the relationship between one or more random variables and other non-random variables Model (CGI), a mathematical representation of any surface of an object in three dimensions via specialized software Medical model, a proposed "set of procedures in which all doctors are trained" Mental model, in psychology, an internal representation of external reality Model (logic), a set along with a collection of finitary operations, and relations that are defined on it, satisfying a given collection of axioms Model (MVC), information-representing component of a software, distinct from the user interface (the "view"), both linked by the "controller" component, in the context of the model–view–controller software design Model act, a law drafted centrally to be disseminated and proposed for enactment in multiple independent legislatures Standard model (disambiguation) == Properties of models, according to general model theory == According to Herbert Stachowiak, a model is characterized by at least three properties: 1. Mapping A model always is a model of something—it is an image or representation of some natural or artificial, existing or imagined original, where this original itself could be a model. 2. Reduction In general, a model will not include all attributes that describe the original but only those that appear relevant to the model's creator or user. 3. Pragmatism A model does not relate unambiguously to its original. It is intended to work as a replacement for the original a) for certain subjects (for whom?) b) within a certain time range (when?) c) restricted to certain conceptual or physical actions (what for?). For example, a street map is a model of the actual streets in a city (mapping), showing the course of the streets while leaving out, say, traffic signs and road markings (reduction), made for pedestrians and vehicle drivers for the purpose of finding one's way in the city (pragmatism). Additional properties have been proposed, like extension and distortion as well as validity. The American philosopher Michael Weisberg differentiates between concrete and mathematical models and proposes computer simulations (computational models) as their own class of models. == Uses of models == According to Bruce Edmonds, there are at least 5 general uses for models: Prediction: reliably anticipating unknown data, including data within the domain of the training data (interpolation), and outside the domain (extrapolation) Explanation: establishing plausible chains of causality by proposing mechanisms that can explain patterns seen in data Theoretical exposition: discovering or proposing new hypotheses, or refuting existing hypotheses about the behaviour of the system being modelled Description: representing important aspects of the system being modelled Illustration: communicating an idea or explanation
Read more →