AI Coding Kiro

AI Coding Kiro — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Wilkinson's Grammar of Graphics

    Wilkinson's Grammar of Graphics

    The Grammar of Graphics (GoG) is a grammar-based system for representing graphics to provide grammatical constraints on the composition of data and information visualizations. A graphical grammar differs from a graphics pipeline as it focuses on semantic components such as scales and guides, statistical functions, coordinate systems, marks and aesthetic attributes. For example, a bar chart can be converted into a pie chart by specifying a polar coordinate system without any other change in graphical specification. The grammar of graphics concept was launched by Leland Wilkinson in 2001 (Wilkinson et al., 2001; Wilkinson, 2005) and graphical grammars have since been written in a variety of languages with various parameterisations and extensions. The major implementations of graphical grammars are nViZn created by a team at SPSS/IBM, followed by Polaris focusing on multidimensional relational databases which is commercialised as Tableau, a revised Layered Grammar of Graphics by Hadley Wickham in Ggplot2, and Vega-Lite which is a visualisation grammar with added interactivity. The grammar of graphics continues to evolve with alternate parameterisations, extensions, or new specifications. == Wilkinson's Grammar of Graphics == === Theory === Wilkinson conceived the seven elements of a graphics to be Variables: mapping of objects to values represented in a graphic Algebra: operations to combine variables and specify dimensions of graphs Geometry: creation of geometric graphs from variables Aesthetics: sensory attributes Statistics: functions to change the appearance and representation of graphs Scales: represent variables on measured dimensions Coordinates: mapping to coordinate systems With these, Wilkinson hypothesised that These seven constructs are orthogonal and virtually all known statistical charts can be generated relatively parsimoniously This computational system is not a taxonomy of charts and rather it describes the meaning of what we do when we construct statistical graphics. === Implementations === Wilkinson wrote SYSTAT, a statistical software package, in the early 1980s. This program was noted for its comprehensive graphics, including the first software implementation of the heatmap display now widely used among biologists. After his company grew to 50 employees, he sold it to SPSS in 1995. At SPSS, he assembled a team of graphics programmers who developed the nViZn platform that produces the visualizations in SPSS, Clementine, and other analytics products. While at Stanford, Tableau founders Hanrahan and Stolte, as well as Diane Tang, created the predecessor to Tableau, named Polaris. Polaris was a data visualization software tool, built with the support of a United States Department of Energy defense program, the Accelerated Strategic Computing Initiative (ASCI). The main differences between Wilkinson's system and Polaris are the use of SQL relational algebra for database services and using shelves instead of cross and nest operators. == Wickham's Layered Grammar of Graphics == === Theory === Hadley Wickham conceived an alternate parameterisation of the syntax Wilkinson had derived, creating a layered grammar of graphics which he implemented as ggplot2 for R (programming language) users. This added a hierarchy of defaults based around the idea of building up a graphic from multiple layers. Wickham conceived these elements to be: Defaults: consists of data and mapping Data: dataset Mapping: aesthetic mappings Layer: consists of data, mapping, geom, stat, and position Data: dataset, or inherit from defaults Mapping: aesthetic mappings, or inherit from defaults Geom: geometric object Stat: statistical transformation Position: position adjustment Scale: mapping of data to aesthetic attributes Coord: mapping of data to the plane of the plot Facet: split up the data === Reception === Wilkinson is generally positive on Wickham's parameterisation and implementation of ggplot2, praising its elegance and expressivity whilst claiming that his original Grammar of Graphics is capable of representing a wider range of statistical graphics. === Implementations === ggplot2 is the first implementation of a layered grammar of graphics in R and implementations in other programming languages have ensued. These include direct ports plotnine for Python, gramm for MATLAB, Lets-Plot for Kotlin and gadfly for Julia. Projects inspired by elements of Wickham's grammar include Vega-Lite which specifies plots in JSON and uses a JavaScript engine. Implementations for Python include Vega-Altair (built on top of Vega-Lite). == Vega-Lite: A Grammar of Interactive Graphics == === Theory === Vega-Lite combines ideas from Wilkinson's Grammar of Graphics and Wickham's Layered Grammar of Graphics with a composition algebra for layered and multi-view displays with a grammar of interaction. The Vega-Lite specification is instantiated in JSON and rendered by the lower-level Vega. The graphical grammar implemented by Vega-Lite is composed of the following: Unit: consists of data, transforms, mark-type and encoding Data: relational table consisting of records (rows) and named attributes (columns) Transforms: data transformations Mark-type: geometric object for visual encoding Encodings: mapping of data attributes to visual marks properties where each encoding consists of: Channel: e.g. colour, shape, size, or text Field: data attribute Data-type: e.g. nominal, ordinal, quantitative, or temporal Value: use a literal instead of a data-type Functions: e.g. binning, aggregation, and sorting Scale: maps from data domain to visual range Guide: axis or legend for visualising scale Composite Views: compose views from multiple unit specifications with operators: Layer: charts plotted on top of each other Hconcat/Vconcat: place views side-by-side Facet: subset data to produce a trellis plot Repeat: multiple plots similar to facet but with full data replication in each cell Interaction: selections identify the set of points a user is interested in manipulating, with components: Selection: get the minimal number of backing points Name: reference Type: how many backing values are stored Predicate: determine the set of selected points e.g. single, list, interval Domain|Range: store data domain or visual range Event: e.g. mouseover, mousedown, mouseup, Init: initialise with specific backing points Transforms: e.g. project, toggle, translate, zoom, and nearest Resolve: resolve selections to union or intersect ==== Implementations ==== Whilst Vega-Lite is the sole implementation of this graphics grammar specification with compilation to Vega, other implementations do create JSON files which can be interpreted by Vega-Lite. == Related projects == Ggplot2 is an R package for plotting Tableau Software (originally known as Polaris) is a commercial software built using the Grammar of Graphics nViZn built by Wilkinson. SYSTAT (statistics package) built by Wilkinson ggpy, ggplot for Python, but has not been updated since 20 November 2016 plotnine started as an effort to improve the scalability of ggplot for Python and is largely compatible with ggplot2 syntax. Plotly - Interactive, online ggplot2 graphs gramm, a plotting class for MATLAB inspired by ggplot2 gadfly, a system for plotting and visualization written in Julia, based largely on ggplot2 Chart::GGPlot - ggplot2 port in Perl, but has not been updated since 16 March 2023 The Lets-Plot for Python library includes a native backend and a Python API, which was mostly based on the ggplot2 package. Lets-Plot Kotlin API is an open-source plotting library for statistical data implemented using the Kotlin programming language, and is built on the principles of layered graphics first described in the Leland Wilkinson's work The Grammar of Graphics. ggplotnim, plotting library using the Nim programming language inspired by ggplot2. Vega and Vega-Lite are plotting libraries that use JSON to specify plots. Vega-Altair, a Python library built on top of Vega-Lite chart-parts - React-friendly Grammar of Graphics, but has not been updated since 10 Dec 2021 g2 - a JavaScript library

    Read more →
  • Artificial intelligence and elections

    Artificial intelligence and elections

    As artificial intelligence (AI) has become more mainstream, there is growing concern about how this will influence elections. Potential targets of AI include election processes, election offices, election officials and election vendors. There are also global efforts to improve elections using AI. == Tactics == Generative AI capabilities allow creation of misleading content. Examples of this include text-to-video, deepfake videos, text-to-image, AI-altered images, text-to-speech, voice cloning, and text-to-text. In the context of an election, a deepfake video of a candidate may propagate information that the candidate does not endorse. Chatbots could spread misinformation related to election locations, times or voting methods. In contrast to malicious actors in the past, these techniques require little technical skill and can spread rapidly. LLM-generated messages have the capacity to persuade humans on political issues. Researchers have begun to investigate how people rate messages that LLMs generate for how persuasive they are. When it came to policy issues, the LLM-generated messages received a 2.91 compared to a 2.80 when it came to smartness between the AI and humans. The LLM-generated messages were often more technical and analytical than human-generated messages. Generative AI has been used to micro-target people during tight political elections. The generation of targeted large language models has triggered concern that they will be used to leverage readily scale microtargeting. Rephrasing inputs have been used to generate fraudulent emails and phishing websites. Rephrasing inputs in a microtargeting does not violate the terms of OpenAI usage. There are no safeguards to prevent the use of rephrasing and creation of fraudulent emails. Political campaign managers have access to this allowing for them to create targeted content. == Usage by country == === Argentina === ==== 2023 elections ==== During the 2023 Argentine primary elections, Javier Milei's team distributed AI generated images including a fabricated image of his rival Sergio Massa and drew 3 million views. The team also created an unofficial Instagram account entitled "AI for the Homeland." Sergio Massa's team also distributed AI generated images and videos. === Bangladesh === ==== 2024 elections ==== In the run up to the 2024 Bangladeshi general election, deepfake videos of female opposition politicians appeared. Rumin Farhana was pictured in a bikini while Nipun Ray was shown in a swimming pool. === Canada === ==== 2025 elections ==== In the run up to the 2025 Canadian federal election, the use of AI tools is likely to figure prominently. India, Pakistan and Iran are all expected to make efforts to subvert the national vote using disinformation campaigns to deceive voters and sway diaspora communities. In a report by the Canadian Centre for Cyber Security called "Cyber Threats to Canada's Democratic Process: 2025 Update", it states that malicious actors including China and Russia: "are most likely to use generative AI as a means of creating and spreading disinformation, designed to sow division among Canadians and push narratives conducive to the interests of foreign states". === France === ==== 2024 elections ==== In the 2024 French legislative election, deepfake videos appeared claiming: i) That they showed the family of Marine le Pen. In the videos, young women, supposedly Le Pen's nieces, are seen skiing, dancing and at the beach "while making fun of France’s racial minorities": However, the family members don't exist. On social media there were over 2 million views. ii) In a video seen on social media, a deepfake video of a France24 broadcast appeared to report that the Ukrainian leadership had "tried to lure French president Emmanuel Macron to Ukraine to assassinate him and then blame his death on Russia". === Ghana === ==== 2024 elections ==== During the months before the December 2024 Ghanaian general election, a network of at least 171 fake accounts has been used to spam social media. Posts have been used by a group identified as "@TheTPatriots" to promote the New Patriotic Party, although it is not known whether the two are connected. All the networks' posts were "highly likely" to have been generated by ChatGPT and appear to be the "first secretly partisan network using AI to influence elections in Ghana". The opposition National Democratic Congress was also criticized with its leader John Mahama being called a drunkard. === India === ==== 2024 elections ==== In the 2024 Indian general election, politicians used deepfakes in their campaign materials. These deepfakes included politicians who had died prior to the election. Mathuvel Karunanidhi's party posted with his likeness even though he had died 2018. A video The All-India Anna Dravidian Progressive Federation party posted showed an audio clip of Jayaram Jayalalithaa even though she had died in 2016. The Deepfakes Analysis Unit (DAU) is an open source platform created in March 2024 for the public to share misleading content and assess if it had been AI-generated. AI was also used to translate political speeches in real time. This translating ability was widely used to reach more voters. === Indonesia === ==== 2024 elections ==== In the 2024 Indonesian presidential election, Prabowo Subianto made extensive use of AI-generated art in his campaign, which ranged from images of himself as an adorable child to various child portrayals in his advertisements. The Indonesian Children's Protection Commission condemned these ads, labeling them as a form of misuse. Other candidates, Anies Baswedan and Ganjar Pranowo, also incorporated AI art into their campaigns. Throughout the election period, all presidential candidates faced attacks from deepfakes, both in video and audio formats. === Ireland === ==== 2024 elections ==== In the last weeks of the 2024 Irish general election a spoof election poster appeared in Dublin featuring "an AI-generated candidate with three arms". The candidate is called Aidan Irwin, but no-one stood in the election with that name. A slogan on the poster says "put matters into artificial intelligence’s hands". The convincing election poster shows a man that "has six fingers on one hand, three arms, and a distorted thumb". === New Zealand === ==== 2023 elections ==== In May 2023, ahead of the 2023 New Zealand general election in October 2023, the New Zealand National Party published a "series of AI-generated political advertisements" on its Instagram account. After confirming that the images were faked, a party spokesperson said that it was "an innovative way to drive our social media". === Pakistan === ==== 2024 elections ==== AI has been used by the imprisoned ex-Prime Minister Imran Khan and his media team in the 2024 Pakistani general election: i) An AI generated audio of his voice was added to a video clip and was broadcast at a virtual rally. ii) An op-ed in The Economist written by Khan was later claimed by himself to have been written by AI which was later denied by his team. The article was liked and shared on social media by thousands of users. === South Africa === ==== 2024 elections ==== In the 2024 South African general election, there were several uses of AI content: i) A deepfaked video of Joe Biden emerged on social media showing him saying that "The U.S. would place sanctions on SA and declare it an enemy state if the African National Congress (ANC) won". ii) In a deepfake video, Donald Trump was shown endorsing the uMkhonto weSizwe party. It was posted to social media and was viewed more than 158,000 times. iii) Less than 3 months before the elections, a deepfake video showed U.S. rapper Eminem endorsing the Economic Freedom Fighters party while criticizing the ANC. The deepfake was viewed on social media more than 173,000 times. === South Korea === ==== 2022 elections ==== In the 2022 South Korean presidential election, a committee for one presidential candidate Yoon Suk Yeol released an AI avatar 'Al Yoon Seok-yeol' that would campaign in places the candidate could not go. The other presidential candidate Lee Jae-myung introduced a chatbot that provided information about the candidate's pledges. ==== 2024 elections ==== Deepfakes were used to spread misinformation before the 2024 South Korean legislative election with one source reporting 129 deepfake violations of election laws within a two week period. Seoul hosted the 2024 Summit for Democracy, a virtual gathering of world leaders initiated by US President Joe Biden in 2021. The focus of the summit was on digital threats to democracy including artificial intelligence and deepfakes. === Taiwan === ==== 2024 elections ==== AI-generated content was used during the 2024 Taiwanese presidential election. Among the media were: i) A deepfake video of General Secretary of the Chinese Communist Party Xi Jinping which showed him supporting the presidential elections. Created on social media, the video was "widely circulated

    Read more →
  • Manifold regularization

    Manifold regularization

    In machine learning, manifold regularization is a technique for using the shape of a dataset to constrain the functions that should be learned on that dataset. In many machine learning problems, the data to be learned do not cover the entire input space. For example, a facial recognition system may not need to classify any possible image, but only the subset of images that contain faces. The technique of manifold learning assumes that the relevant subset of data comes from a manifold, a mathematical structure with useful properties. The technique also assumes that the function to be learned is smooth: data with different labels are not likely to be close together, and so the labeling function should not change quickly in areas where there are likely to be many data points. Because of this assumption, a manifold regularization algorithm can use unlabeled data to inform where the learned function is allowed to change quickly and where it is not, using an extension of the technique of Tikhonov regularization. Manifold regularization algorithms can extend supervised learning algorithms in semi-supervised learning and transductive learning settings, where unlabeled data are available. The technique has been used for applications including medical imaging, geographical imaging, and object recognition. == Manifold regularizer == === Motivation === Manifold regularization is a type of regularization, a family of techniques that reduces overfitting and ensures that a problem is well-posed by penalizing complex solutions. In particular, manifold regularization extends the technique of Tikhonov regularization as applied to Reproducing kernel Hilbert spaces (RKHSs). Under standard Tikhonov regularization on RKHSs, a learning algorithm attempts to learn a function f {\displaystyle f} from among a hypothesis space of functions H {\displaystyle {\mathcal {H}}} . The hypothesis space is an RKHS, meaning that it is associated with a kernel K {\displaystyle K} , and so every candidate function f {\displaystyle f} has a norm ‖ f ‖ K {\displaystyle \left\|f\right\|_{K}} , which represents the complexity of the candidate function in the hypothesis space. When the algorithm considers a candidate function, it takes its norm into account in order to penalize complex functions. Formally, given a set of labeled training data ( x 1 , y 1 ) , … , ( x ℓ , y ℓ ) {\displaystyle (x_{1},y_{1}),\ldots ,(x_{\ell },y_{\ell })} with x i ∈ X , y i ∈ Y {\displaystyle x_{i}\in X,y_{i}\in Y} and a loss function V {\displaystyle V} , a learning algorithm using Tikhonov regularization will attempt to solve the expression arg min f ∈ H 1 ℓ ∑ i = 1 ℓ V ( f ( x i ) , y i ) + γ ‖ f ‖ K 2 {\displaystyle {\underset {f\in {\mathcal {H}}}{\arg \!\min }}{\frac {1}{\ell }}\sum _{i=1}^{\ell }V(f(x_{i}),y_{i})+\gamma \left\|f\right\|_{K}^{2}} where γ {\displaystyle \gamma } is a hyperparameter that controls how much the algorithm will prefer simpler functions over functions that fit the data better. Manifold regularization adds a second regularization term, the intrinsic regularizer, to the ambient regularizer used in standard Tikhonov regularization. Under the manifold assumption in machine learning, the data in question do not come from the entire input space X {\displaystyle X} , but instead from a nonlinear manifold M ⊂ X {\displaystyle M\subset X} . The geometry of this manifold, the intrinsic space, is used to determine the regularization norm. === Laplacian norm === There are many possible choices for the intrinsic regularizer ‖ f ‖ I {\displaystyle \left\|f\right\|_{I}} . Many natural choices involve the gradient on the manifold ∇ M {\displaystyle \nabla _{M}} , which can provide a measure of how smooth a target function is. A smooth function should change slowly where the input data are dense; that is, the gradient ∇ M f ( x ) {\displaystyle \nabla _{M}f(x)} should be small where the marginal probability density P X ( x ) {\displaystyle {\mathcal {P}}_{X}(x)} , the probability density of a randomly drawn data point appearing at x {\displaystyle x} , is large. This gives one appropriate choice for the intrinsic regularizer: ‖ f ‖ I 2 = ∫ x ∈ M ‖ ∇ M f ( x ) ‖ 2 d P X ( x ) {\displaystyle \left\|f\right\|_{I}^{2}=\int _{x\in M}\left\|\nabla _{M}f(x)\right\|^{2}\,d{\mathcal {P}}_{X}(x)} In practice, this norm cannot be computed directly because the marginal distribution P X {\displaystyle {\mathcal {P}}_{X}} is unknown, but it can be estimated from the provided data. === Graph-based approach of the Laplacian norm === When the distances between input points are interpreted as a graph, then the Laplacian matrix of the graph can help to estimate the marginal distribution. Suppose that the input data include ℓ {\displaystyle \ell } labeled examples (pairs of an input x {\displaystyle x} and a label y {\displaystyle y} ) and u {\displaystyle u} unlabeled examples (inputs without associated labels). Define W {\displaystyle W} to be a matrix of edge weights for a graph, where W i j {\displaystyle W_{ij}} is a similarity built from distance measure between the data points x i {\displaystyle x_{i}} and x j {\displaystyle x_{j}} (so that more close implies higher W i j {\displaystyle W_{ij}} ). Define D {\displaystyle D} to be a diagonal matrix with D i i = ∑ j = 1 ℓ + u W i j {\displaystyle D_{ii}=\sum _{j=1}^{\ell +u}W_{ij}} and L {\displaystyle L} to be the Laplacian matrix D − W {\displaystyle D-W} . Then, as the number of data points ℓ + u {\displaystyle \ell +u} increases, L {\displaystyle L} converges to the Laplace–Beltrami operator Δ M {\displaystyle \Delta _{M}} , which is the divergence of the gradient ∇ M {\displaystyle \nabla _{M}} . Then, if f {\displaystyle \mathbf {f} } is a vector of the values of f {\displaystyle f} at the data, f = [ f ( x 1 ) , … , f ( x l + u ) ] T {\displaystyle \mathbf {f} =[f(x_{1}),\ldots ,f(x_{l+u})]^{\mathrm {T} }} , the intrinsic norm can be estimated: ‖ f ‖ I 2 = 1 ( ℓ + u ) 2 f T L f {\displaystyle \left\|f\right\|_{I}^{2}={\frac {1}{(\ell +u)^{2}}}\mathbf {f} ^{\mathrm {T} }L\mathbf {f} } As the number of data points ℓ + u {\displaystyle \ell +u} increases, this empirical definition of ‖ f ‖ I 2 {\displaystyle \left\|f\right\|_{I}^{2}} converges to the definition when P X {\displaystyle {\mathcal {P}}_{X}} is known. === Solving the regularization problem with graph-based approach === Using the weights γ A {\displaystyle \gamma _{A}} and γ I {\displaystyle \gamma _{I}} for the ambient and intrinsic regularizers, the final expression to be solved becomes: arg min f ∈ H 1 ℓ ∑ i = 1 ℓ V ( f ( x i ) , y i ) + γ A ‖ f ‖ K 2 + γ I ( ℓ + u ) 2 f T L f {\displaystyle {\underset {f\in {\mathcal {H}}}{\arg \!\min }}{\frac {1}{\ell }}\sum _{i=1}^{\ell }V(f(x_{i}),y_{i})+\gamma _{A}\left\|f\right\|_{K}^{2}+{\frac {\gamma _{I}}{(\ell +u)^{2}}}\mathbf {f} ^{\mathrm {T} }L\mathbf {f} } As with other kernel methods, H {\displaystyle {\mathcal {H}}} may be an infinite-dimensional space, so if the regularization expression cannot be solved explicitly, it is impossible to search the entire space for a solution. Instead, a representer theorem shows that under certain conditions on the choice of the norm ‖ f ‖ I {\displaystyle \left\|f\right\|_{I}} , the optimal solution f ∗ {\displaystyle f^{}} must be a linear combination of the kernel centered at each of the input points: for some weights α i {\displaystyle \alpha _{i}} , f ∗ ( x ) = ∑ i = 1 ℓ + u α i K ( x i , x ) {\displaystyle f^{}(x)=\sum _{i=1}^{\ell +u}\alpha _{i}K(x_{i},x)} Using this result, it is possible to search for the optimal solution f ∗ {\displaystyle f^{}} by searching the finite-dimensional space defined by the possible choices of α i {\displaystyle \alpha _{i}} . === Functional approach of the Laplacian norm === The idea beyond the graph-Laplacian is to use neighbors to estimate the Laplacian. This method is akin to local averaging methods, that are known to scale poorly in high-dimensional problems. Indeed, the graph Laplacian is known to suffer from the curse of dimensionality. Luckily, it is possible to leverage expected smoothness of the function to estimate thanks to more advanced functional analysis. This method consists of estimating the Laplacian operator using derivatives of the kernel reading ∂ 1 , j K ( x i , x ) {\displaystyle \partial _{1,j}K(x_{i},x)} where ∂ 1 , j {\displaystyle \partial _{1,j}} denotes the partial derivatives according to the j-th coordinate of the first variable. This second approach to the Laplacian norm is to put in relation with meshfree methods, that contrast with the finite difference method in PDE. == Applications == Manifold regularization can extend a variety of algorithms that can be expressed using Tikhonov regularization, by choosing an appropriate loss function V {\displaystyle V} and hypothesis space H {\displaystyle {\mathcal {H}}} . Two commonly used examples are the families of support vector machines and regularized least squares algorithm

    Read more →
  • Business process automation

    Business process automation

    Business process automation (BPA), also known as business automation, refers to the technology-enabled automation of business processes. == Development approaches == There are three main approaches to developing BPA: traditional business process automation involves developing BPA software in a programming language for integrating relevant applications in the digital ecosystem to execute a given process; robotic process automation uses software robots (also called agents, bots, or workers) to emulate human-computer interaction for executing a combination of processes, activities, transactions, and tasks in one or more unrelated software systems; hyperautomation (also called intelligent automation (IA), intelligent process automation (IPA), integrated automation platform (IAP), and cognitive automation (CA) combines business process automation, artificial intelligence (AI), and machine learning (ML) to discover, validate, and execute organizational processes automatically with no or minimal human intervention. == Deployment == BPA toolsets vary in capability. With the increasing adoption of artificial intelligence (AI), organizations are implementing AI-driven technologies that can process natural language, interpret unstructured datasets, and interact with users. These systems are designed to adapt to new types of problems with reduced reliance on human intervention. == Business process management implementation == A business process management system differs from BPA. However, it is possible to implement automation based on a BPM implementation. The methods to achieve this vary, from writing custom application code to using specialist BPA tools. == Robotic process automation == Robotic process automation (RPA) involves the deployment of attended or unattended software agents in an organization's environment. These software agents, or robots, are programmed to perform predefined structured and repetitive sets of business tasks or processes. Robotic process automation is designed to streamline workflows by delegating repetitive tasks to software agents, allowing human workers to focus on more complex and strategic activities. BPA providers typically focus on different industry sectors, but the underlying approach is generally similar in that they aim to provide the shortest route to automation by interacting with the user interface rather than modifying the application code or database behind it. == Use of artificial intelligence == Artificial intelligence software robots are used to handle unstructured data sets (like images, texts, audios) and are often deployed after implementing robotic process automation. They can, for instance, generate an automatic transcript from a video. The combination of automation and artificial intelligence (AI) enables autonomy for robots, along with the capability to perform cognitive tasks. At this stage, robots can learn and improve processes by analyzing and adapting them.

    Read more →
  • Digital Image Processing with Sound

    Digital Image Processing with Sound

    DIPS (Digital Image Processing with Sound) is a set of plug-in objects that handle real-time digital image processing in Max/MSP programming environment. Combining with the built-in objects of the environment, DIPS enables to program the interaction between audio and visual events with ease, and supports the realization of interactive multimedia art as well as interactive computer music. == Summary of Features == A plug-in software for Max/MSP (Max 5 and 6) More than 300 Max external objects and abstractions More than 90 OpenGL objects included More than 110 visual effect objects (Dfx library, Core Image Filters) A utility library for the easy of programming (prefix Dlib) A comprehensive set of sample patches, and a detailed tutorial Handling images & movie files (QuickTime, OpenGL) Render and move 3D models (OpenGL) Video signal input (QuickTime, video texture) Video input analysis: motion detect, face tracking (OpenCV, OpenGL) Importing 3D models (.obj file) Importing Quartz Composer files OpenGL Shading Language (GLSL) programming interface Easy integration of visual events using DIPSWindowMixer (OpenGL) == Description == DIPS is a free plug-in software (a set of external objects) for Max/MSP. It supports the designing of the interaction between sound and visual events in Max using Apple’s Core Image, OpenGL and OpenCV technologies, and consequently, provides a powerful and user-friendly programming environment for the creation of interactive multimedia art. DIPS can be used to detect a performer’s motions and to track positions of subtle details, such as the face, mouth, and eyes. It can also be used to measure the distance between objects and a Kinect sensor system, and offers powerful tools for realtime image processing of incoming video stream and stored movie files. In addition, it can be used to create complex images in a virtual three-dimensional space. The DIPS consists of a library of more than 300 Max external objects and abstractions, a comprehensive set of sample patches, and a detailed tutorial. Some of its strong points, in comparison with other similar plug-ins and software, are its ease of programming, power, and efficiency. The sample patches and tutorial contained in the installation package allows composers and artists who are interested in the creation of interactive art to realize sophisticated realtime video effects on a live video signal at their first practice. And because of its ease of programming, it is likely that one will soon acquire skills needed to create state-of-the-art interactive performance works, multimedia installations, interactive multimedia artworks, and Max VJ applications using DIPS. == History == Initially developed by Shu Matsuda in 1997, DIPS was a plug-in software for Max/FTS running on SGI Octane and O2 computers. Since 2000, it has been developed by the DIPS Development Group supervised by Takayuki Rai. Current active group members are Shu Matsuda, Yota Morimoto, Takuto Fukuda, and Keitaro Takahashi. Previously, Chikashi Miyama, Daichi Ando and Takayuki Hamano also contributed to its development. 2013 DIPS5 for Max (Mac OS X) 2009 DIPS4 for Max/MSP (Mac OS X) 2006 DIPS3 for Max/MSP (Mac OS X) 2003 DIPS2 for jMax4 (Mac OS X) 2002 DIPS for jMax2 (Mac OS X & Linux) 2000 DIPS for jMax (Linux)

    Read more →
  • The AI Con

    The AI Con

    The AI Con: How to Fight Big Tech's Hype and Create the Future We Want is a 2025 non-fiction book by linguist Emily M. Bender and sociologist Alex Hanna. It argues that much of what is labeled "artificial intelligence" is a misleading term that obscures ordinary automation while concentrating power in a small number of technology firms. The book was published in May 2025 by Harper in the United States and Bodley Head in the United Kingdom. It was developed alongside the authors' long-running podcast Mystery AI Hype Theater 3000, which critiques exaggerated claims about AI. == Synopsis == The authors present AI as a marketing umbrella that encourages audiences to infer understanding and agency where none exist. They argue readers should treat such language skeptically and to separate specific automated tasks from broad claims of intelligence. The book describes a recurring hype cycle in which corporate narratives justify data and labor extraction, the replacement of human services with cheaper substitutes, and the diversion of attention from present harms to speculative futures. While acknowledging limited uses such as pattern recognition, the authors argue that contemporary systems are best understood as text and media generators shaped by training data and human labor, not as thinking or reasoning entities. A central theme is the social and environmental cost of scaling these systems, including increased energy and water use, the appropriation of creative work for training, and the outsourcing of ghost work to low-paid data workers worldwide. These costs are linked to workplace effects, with the authors arguing that automation rarely eliminates jobs outright and more often degrades them through surveillance, work intensification, and unpaid oversight. As alternatives to passive adoption, the authors propose concrete responses: asking precise questions about what is being automated and why, demanding transparency about data and evaluation, and practicing what they call strategic refusal when deployment conflicts with evidence or values. The book also develops a vocabulary for public debate, rejecting both boosterish and doomerish narratives as grounded in the same assumption that AI is a singular, autonomous force. The authors recommend reading strategies such as favoring trusted human sources over automated summaries and using humor to deflate inflated claims. They describe a link between language to policy and power, arguing that precise terminology can help policymakers and the public resist austerity-driven automation and demand accountability for errors and harms. == Reception == The Guardian praised the book's myth-busting approach and its analysis of how hype erodes cultural and civic life by normalizing synthetic media as a substitute for human judgment. Kirkus Reviews described it as a contrarian account that catalogs concrete risks while cutting through speculative predictions. An interview in Business Insider highlighted the authors' accessible frameworks, including their proposal to describe chatbots as conversation simulators and to evaluate systems in terms of values, labor, and evidence. Coverage in GeekWire emphasized the book's call for resistance through collective bargaining, stronger data rights, and a norm of rejecting deployments that fail basic standards of necessity and evaluation. Some reviews were more critical. A review in LLRX argued that the book's tone could be overly polemical and that it gave limited attention to potential benefits claimed for generative systems. Coverage in the Financial Times, focused on Bender's broader public scholarship, situated the book within her long-standing critique of anthropomorphic narratives about large language models and her advocacy for more democratic oversight of automated systems.

    Read more →
  • Description logic

    Description logic

    Description logics (DL) are a family of formal knowledge representation languages. Many DLs are more expressive than propositional logic but less expressive than first-order logic. In contrast to the latter, the core reasoning problems for DLs are (usually) decidable, and efficient decision procedures have been designed and implemented for these problems. There are general, spatial, temporal, spatiotemporal, and fuzzy description logics, and each description logic features a different balance between expressive power and reasoning complexity by supporting different sets of mathematical constructors. DLs are used in artificial intelligence to describe and reason about the relevant concepts of an application domain (known as terminological knowledge). It is of particular importance in providing a logical formalism for ontologies and the Semantic Web: the Web Ontology Language (OWL) and its profiles are based on DLs. A major area of application of DLs and OWL is in biomedical informatics, where they assist in the codification of biomedical knowledge. DLs and OWL are also applied in other domains, including defense, climate modeling, and large-scale industrial knowledge graphs. == Introduction == A DL models concepts, roles and individuals, and their relationships. The fundamental modeling concept of a DL is the axiom—a logical statement relating roles and/or concepts. This is a key difference from the frames paradigm where a frame specification declares and completely defines a class. == Nomenclature == === Terminology compared to FOL and OWL === The description logic community uses different terminology than the first-order logic (FOL) community for operationally equivalent notions; some examples are given below. The Web Ontology Language (OWL) uses again a different terminology, also given in the table below. === Naming convention === There are many varieties of description logics and there is an informal naming convention, roughly describing the operators allowed. The expressivity is encoded in the label for a logic starting with one of the following basic logics: Followed by any of the following extensions: ==== Exceptions ==== Some canonical DLs that do not exactly fit this convention are: ==== Examples ==== As an example, A L C {\displaystyle {\mathcal {ALC}}} is a centrally important description logic from which comparisons with other varieties can be made. A L C {\displaystyle {\mathcal {ALC}}} is simply A L {\displaystyle {\mathcal {AL}}} with complement of any concept allowed, not just atomic concepts. A L C {\displaystyle {\mathcal {ALC}}} is used instead of the equivalent A L U E {\displaystyle {\mathcal {ALUE}}} . A further example, the description logic S H I Q {\displaystyle {\mathcal {SHIQ}}} is the logic A L C {\displaystyle {\mathcal {ALC}}} plus extended cardinality restrictions, and transitive and inverse roles. The naming conventions aren't purely systematic so that the logic A L C O I N {\displaystyle {\mathcal {ALCOIN}}} might be referred to as A L C N I O {\displaystyle {\mathcal {ALCNIO}}} and other abbreviations are also made where possible. The Protégé ontology editor supports S H O I N ( D ) {\displaystyle {\mathcal {SHOIN}}^{\mathcal {(D)}}} . Three major biomedical informatics terminology bases, SNOMED CT, GALEN, and GO, are expressible in E L {\displaystyle {\mathcal {EL}}} (with additional role properties). OWL 2 provides the expressiveness of S R O I Q ( D ) {\displaystyle {\mathcal {SROIQ}}^{\mathcal {(D)}}} , OWL-DL is based on S H O I N ( D ) {\displaystyle {\mathcal {SHOIN}}^{\mathcal {(D)}}} , and for OWL-Lite it is S H I F ( D ) {\displaystyle {\mathcal {SHIF}}^{\mathcal {(D)}}} . == History == Description logic was given its current name in the 1980s. Previous to this it was called (chronologically): terminological systems, and concept languages. === Knowledge representation === Frames and semantic networks lack formal (logic-based) semantics. DL was first introduced into knowledge representation (KR) systems to overcome this deficiency. The first DL-based KR system was KL-ONE (by Ronald J. Brachman and Schmolze, 1985). During the '80s other DL-based systems using structural subsumption algorithms were developed including KRYPTON (1983), LOOM (1987), BACK (1988), K-REP (1991) and CLASSIC (1991). This approach featured DL with limited expressiveness but relatively efficient (polynomial time) reasoning. In the early '90s, the introduction of a new tableau based algorithm paradigm allowed efficient reasoning on more expressive DL. DL-based systems using these algorithms — such as KRIS (1991) — show acceptable reasoning performance on typical inference problems even though the worst case complexity is no longer polynomial. From the mid '90s, reasoners were created with good practical performance on very expressive DL with high worst case complexity. Examples from this period include FaCT, RACER (2001), CEL (2005), and KAON 2 (2005). DL reasoners, such as FaCT, FaCT++, RACER, DLP and Pellet, implement the method of analytic tableaux. KAON2 is implemented by algorithms which reduce a SHIQ(D) knowledge base to a disjunctive datalog program. === Semantic web === The DARPA Agent Markup Language (DAML) and Ontology Inference Layer (OIL) ontology languages for the Semantic Web can be viewed as syntactic variants of DL. In particular, the formal semantics and reasoning in OIL use the S H I Q {\displaystyle {\mathcal {SHIQ}}} DL. The DAML+OIL DL was developed as a submission to—and formed the starting point of—the World Wide Web Consortium (W3C) Web Ontology Working Group. In 2004, the Web Ontology Working Group completed its work by issuing the OWL recommendation. The design of OWL is based on the S H {\displaystyle {\mathcal {SH}}} family of DL with OWL DL and OWL Lite based on S H O I N ( D ) {\displaystyle {\mathcal {SHOIN}}^{\mathcal {(D)}}} and S H I F ( D ) {\displaystyle {\mathcal {SHIF}}^{\mathcal {(D)}}} respectively. The W3C OWL Working Group began work in 2007 on a refinement of - and extension to - OWL. In 2009, this was completed by the issuance of the OWL2 recommendation. OWL2 is based on the description logic S R O I Q ( D ) {\displaystyle {\mathcal {SROIQ}}^{\mathcal {(D)}}} . Practical experience demonstrated that OWL DL lacked several key features necessary to model complex domains. == Modeling == === TBox vs Abox === In DL, a distinction is drawn between the so-called TBox (terminological box) and the ABox (assertional box). In general, the TBox contains sentences describing concept hierarchies (i.e., relations between concepts) while the ABox contains ground sentences stating where in the hierarchy, individuals belong (i.e., relations between individuals and concepts). For example, the statement: belongs in the TBox, while the statement: belongs in the ABox. Note that the TBox/ABox distinction is not significant, in the same sense that the two "kinds" of sentences are not treated differently in first-order logic (which subsumes most DL). When translated into first-order logic, a subsumption axiom like (1) is simply a conditional restriction to unary predicates (concepts) with only variables appearing in it. Clearly, a sentence of this form is not privileged or special over sentences in which only constants ("grounded" values) appear like (2). === Motivation for having Tbox and Abox === So why was the distinction introduced? The primary reason is that the separation can be useful when describing and formulating decision-procedures for various DL. For example, a reasoner might process the TBox and ABox separately, in part because certain key inference problems are tied to one but not the other one ('classification' is related to the TBox, 'instance checking' to the ABox). Another example is that the complexity of the TBox can greatly affect the performance of a given decision-procedure for a certain DL, independently of the ABox. Thus, it is useful to have a way to talk about that specific part of the knowledge base. The secondary reason is that the distinction can make sense from the knowledge base modeler's perspective. It is plausible to distinguish between our conception of terms/concepts in the world (class axioms in the TBox) and particular manifestations of those terms/concepts (instance assertions in the ABox). In the above example: when the hierarchy within a company is the same in every branch but the assignment to employees is different in every department (because there are other people working there), it makes sense to reuse the TBox for different branches that do not use the same ABox. There are two features of description logic that are not shared by most other data description formalisms: DL does not make the unique name assumption (UNA) or the closed-world assumption (CWA). Not having UNA means that two concepts with different names may be allowed by some inference to be shown to be equivalent. Not having CWA, or rather having the open world assumption (OWA) means that

    Read more →
  • Schema-agnostic databases

    Schema-agnostic databases

    Schema-agnostic databases or vocabulary-independent databases aim at supporting users to be abstracted from the representation of the data, supporting the automatic semantic matching between queries and databases. Schema-agnosticism is the property of a database of mapping a query issued with the user terminology and structure, automatically mapping it to the dataset vocabulary. The increase in the size and in the semantic heterogeneity of database schemas bring new requirements for users querying and searching structured data. At this scale it can become unfeasible for data consumers to be familiar with the representation of the data in order to query it. At the center of this discussion is the semantic gap between users and databases, which becomes more central as the scale and complexity of the data grows. == Description == The evolution of data environments towards the consumption of data from multiple data sources and the growth in the schema size, complexity, dynamicity and decentralisation (SCoDD) of schemas increases the complexity of contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications have a demand for more complete data, produced by independent data sources, under different semantic assumptions and contexts of use, which is the typical scenario for Semantic Web Data applications. The evolution of databases in the direction of heterogeneous data environments strongly impacts the usability, semiotics and semantic assumptions behind existing data accessibility methods such as structured queries, keyword-based search and visual query systems. With schema-less databases containing potentially millions of dynamically changing attributes, it becomes unfeasible for some users to become aware of the 'schema' or vocabulary in order to query the database. At this scale, the effort in understanding the schema in order to build a structured query can become prohibitive. == Schema-agnostic queries == Schema-agnostic queries can be defined as query approaches over structured databases which allow users satisfying complex information needs without the understanding of the representation (schema) of the database. Similarly, Tran et al. defines it as "search approaches, which do not require users to know the schema underlying the data". Approaches such as keyword-based search over databases allow users to query databases without employing structured queries. However, as discussed by Tran et al.: "From these points, users however have to do further navigation and exploration to address complex information needs. Unlike keyword search used on the Web, which focuses on simple needs, the keyword search elaborated here is used to obtain more complex results. Instead of a single set of resources, the goal is to compute complex sets of resources and their relations." The development of approaches to support natural language interfaces (NLI) over databases have aimed towards the goal of schema-agnostic queries. Complementarily, some approaches based on keyword search have targeted keyword-based queries which express more complex information needs. Other approaches have explored the construction of structured queries over databases where schema constraints can be relaxed. All these approaches (natural language, keyword-based search and structured queries) have targeted different degrees of sophistication in addressing the problem of supporting a flexible semantic matching between queries and data, which vary from the completely absence of the semantic concern to more principled semantic models. While the demand for schema-agnosticism has been an implicit requirement across semantic search and natural language query systems over structured data, it is not sufficiently individuated as a concept and as a necessary requirement for contemporary database management systems. Recent works have started to define and model the semantic aspects involved on schema-agnostic queries. === Schema-agnostic structured queries === Consist of schema-agnostic queries following the syntax of a structured standard (for example SQL, SPARQL). The syntax and semantics of operators are maintained, while different terminologies are used. ==== Example 1 ==== SELECT ?y { BillClinton hasDaughter ?x . ?x marriedTo ?y . } which maps to the following SPARQL query in the dataset vocabulary: ==== Example 2 ==== which maps to the following SPARQL query in the dataset vocabulary: === Schema-agnostic keyword queries === Consist of schema-agnostic queries using keyword queries. In this case the syntax and semantics of operators are different from the structured query syntax. ==== Example ==== "Bill Clinton daughter married to" "Books by William Goldman with more than 300 pages" == Semantic complexity == As of 2016 the concept of schema-agnostic queries has been developed primarily in academia. Most of schema-agnostic query systems have been investigated in the context of Natural Language Interfaces over databases or over the Semantic Web. These works explore the application of semantic parsing techniques over large, heterogeneous and schema-less databases. More recently, the individuation of the concept of schema-agnostic query systems and databases have appeared more explicitly within the literature. Freitas et al. provide a probabilistic model on the semantic complexity of mapping schema-agnostic queries.

    Read more →
  • Human-centered AI

    Human-centered AI

    Human-centered AI is the initiative at the intersection of the fields of artificial intelligence (AI) and human-computer interaction (HCI) to develop AI systems in a way that prioritizes human values, needs, and general flourishing. Emphasis is placed on the recognition that artificial intelligence systems are rapidly changing, and will continue to influence, many aspects of the human experience, in areas ranging from scientific inquiry, governance and policy, labor and the economy, and creative expression, with an aim set to adapt current developments and guide future developments on a trajectory which is most beneficial to the human population at large, with the goal of augmenting human intelligence and capacities across these areas, as opposed to replacing them. Particular attention is paid to mitigating negative effects of AI automation on the livelihoods of the labor force, the use of AI in healthcare fields, and imbuing AI systems with societal values. Human-centered AI is linked to related endeavors in AI alignment and AI safety, but while these fields primarily focus on mitigating risks posed by AI that is unaligned to human values and/or uncontrollable AI self-development, human-centered AI places significant focus in exploring how AI systems can augment human capacities and serve as collaborators. == Conceptual history == The importance of the alignment of artificial intelligence development towards human values in some sense predates artificial intelligence itself, as before the modern conception of artificial intelligence as coined at the 1956 Dartmouth Workshop, the conception of robots as constructed, autonomous agents entered the cultural consciousness as early as the 1920s, with Karel Capek's Rossum's Universal Robots. The imagined issues relating to robots' aims and values requiring intentional alignment and direction with those of humans followed soon after, most widely known from science fiction author Isaac Asimov’s Three Laws of Robotics, dating to his 1942 short story “Runaround”. Two of the three eponymous laws are directly concerned with robots’ interaction with and positioned deference towards humans, and have in recent times been reexamined in the face of modern AI. In 1985, after artificial intelligence research had taken off and its effects were more acutely conceptualized, Asimov added a Rule Zero, treating robots' relationship with humanity as a whole, distinct from individual humans. While modern artificial intelligence is largely distinct from robotics, the conceptualization of both robots and AI systems as autonomous agents positions this as a foundation for conceptions of human-centered AI. Aside from robots, artificially intelligent autonomous agents in interaction with humans have been conceived of for at least 75 years. In 1950, Alan Turing published his famous "Imitation Game", often also called the Turing Test, a thought experiment that uses human-machine interaction as an assessor for the intelligence of a system. In recent times, artificial intelligence researchers such as Stanford's Erik Brynjolfsson have conceived of rapid AI development leading to a so-called "Turing Trap". == Augmentation and automation == A major stated aim of human-centered AI is to promote the development of AI in ways that augment human capabilities, rather than replacing them. To this end, organizations and initiatives that take a human-centered approach to AI development focus on frameworks that encourage collaboration between humans and artificial intelligence systems to build towards even greater progress, rather than attempting to automate tasks currently handled by humans. Such avenues include everything from data visualization for big data, allowing human engineers to better understand extremely large datasets, allowing for the design of better machine learning models to handle them, to AI-powered sensors to monitor vitals, allowing for better responsiveness from healthcare providers. Many human-centered AI initiatives often position it as a better alternative to the apparent mainstream in AI development, which is primarily concerned with automation. Driven by the pressures of the market economy, AI development that does replace tasks currently performed by humans with automated processes is incentivized, as it allows for greater profit margins; this often comes at the detriment of the human whose performance is replaced, thus leading to an environment wherein human workers are outcompeted by AI systems across various service-sector and technology-based industries. At the same time, automation and augmentation are not always incompatible; a major aim of human-centered AI is towards the automation of rote tasks that would otherwise hinder a human’s productivity or creativity, freeing them to direct their energy and intelligence towards higher-level tasks, thus achieving augmentation through automation. Empirical research in pharmaceutical sales has shown that a human-centered implementation - where work procedures, training, and incentives are designed around individuals' cognitive needs - improves augmentation performance, while implementation without such adaptation can worsen outcomes relative to a legacy system. == Research == Much of the work done on human-centered AI comes from research institutes, within universities, companies, and as freestanding organizations. The Stanford Institute for Human-Centered AI (abbreviated to HAI) is one such group, engaging academics, industry professionals, and policymakers centered in Stanford University to conduct research and inform policy in various areas in human-centered AI, including on aspects of the intelligence itself, augmentation, and on measuring the impacts of AI systems on sociopolitcal and cultural institutions. Similar groups exist at other universities, including the Chicago Human + AI (CHAI) Lab at the University of Chicago, the HCAI@GU group at the University of Gothenburg, and the Human-Centered AI (HAI) Lab at the University of Oxford. Outside of the academy, companies such as IBM have research initiatives dedicated to advancements in human-centered AI. At Kenyon College, the Integrated Program for Humane Studies (IPHS) launched a human-centered AI program in 2016 integrating artificial intelligence research with humanities and social science inquiry. This approach treats computation and humanistic scholarship as a single unified field of research rather than as separate disciplines requiring collaboration. The program's researchers have published in both AI venues (such as the International Conference on Machine Learning and Frontiers of Computer Science) and humanities journals (such as PMLA and Poetics Today), and the lab was selected in December 2025 by Schmidt Sciences for its Humanities and AI Virtual Institute to apply AI methods to cultural heritage preservation.

    Read more →
  • Situated approach (artificial intelligence)

    Situated approach (artificial intelligence)

    In artificial intelligence research, the situated approach builds agents that are designed to behave effectively successfully in their environment. This requires designing AI "from the bottom-up" by focussing on the basic perceptual and motor skills required to survive. The situated approach gives a much lower priority to abstract reasoning or problem-solving skills. The approach was originally proposed as an alternative to traditional approaches (that is, approaches popular before 1985 or so). After several decades, classical AI technologies started to face intractable issues (e.g. combinatorial explosion) when confronted with real-world modeling problems. All approaches to address these issues focus on modeling intelligences situated in an environment. They have become known as the situated approach to AI. == Emergence of a concept == === From traditional AI to Nouvelle AI === During the late 1980s, the approach now known as Nouvelle AI (Nouvelle means new in French) was pioneered at the MIT Artificial Intelligence Laboratory by Rodney Brooks. As opposed to classical or traditional artificial intelligence, Nouvelle AI purposely avoided the traditional goal of modeling human-level performance, but rather tries to create systems with intelligence at the level of insects, closer to real-world robots. But eventually, at least at MIT new AI did lead to an attempt for humanoid AI in the Cog Project. === From Nouvelle AI to behavior-based and situated AI === The conceptual shift introduced by nouvelle AI flourished in the robotics area, given way to behavior-based robotics (BBR), a methodology for developing AI based on a modular decomposition of intelligence. It was made famous by Rodney Brooks: his subsumption architecture was one of the earliest attempts to describe a mechanism for developing BBAI. It is extremely popular in robotics and to a lesser extent to implement intelligent virtual agents because it allows the successful creation of real-time dynamic systems that can run in complex environments. For example, it underlies the intelligence of the Sony Aibo and many RoboCup robot teams. Realizing that in fact all these approaches were aiming at building not an abstract intelligence, but rather an intelligence situated in a given environment, they have come to be known as the situated approach. In fact, this approach stems out from early insights of Alan Turing, describing the need to build machines equipped with sense organs to learn directly from the real-world instead of focusing on abstract activities, such as playing chess. == Definitions == Classically, a software entity is defined as a simulated element, able to act on itself and on its environment, and which has an internal representation of itself and of the outside world. An entity can communicate with other entities, and its behavior is the consequence of its perceptions, its representations, and its interactions with the other entities. === AI loop === Simulating entities in a virtual environment requires simulating the entire process that goes from a perception of the environment, or more generally from a stimulus, to an action on the environment. This process is called the AI loop and technology used to simulate it can be subdivided in two categories. Sensorimotor or low-level AI deals with either the perception problem (what is perceived?) or the animation problem (how are actions executed?). Decisional or high-level AI deals with the action selection problem (what is the most appropriate action in response to a given perception, i.e. what is the most appropriate behavior?). === Traditional or symbolic AI === There are two main approaches in decisional AI. The vast majority of the technologies available on the market, such as planning algorithms, finite-state machines (FSA), or expert systems, are based on the traditional or symbolic AI approach. Its main characteristics are: It is top-down: it subdivides, in a recursive manner, a given problem into a series of sub-problems that are supposedly easier to solve. It is knowledge-based: it relies on a symbolic description of the world, such as a set of rules. However, the limits of traditional AI, which goal is to build systems that mimic human intelligence, are well-known: inevitably, a combinatorial explosion of the number of rules occurs due to the complexity of the environment. In fact, it is impossible to predict all the situations that will be encountered by an autonomous entity. === Situated or behavioral AI === In order to address these issues, another approach to decisional AI, also known as situated or behavioral AI, has been proposed. It does not attempt to model systems that produce deductive reasoning processes, but rather systems that behave realistically in their environment. The main characteristics of this approach are the following: It is bottom-up: it relies on elementary behaviors, which can be combined to implement more complex behaviors. It is behavior-based: it does not rely on a symbolic description of the environment, but rather on a model of the interactions of the entities with their environment. The goal of situated AI is to model entities that are autonomous in their environment. This is achieved thanks to both the intrinsic robustness of the control architecture, and its adaptation capabilities to unforeseen situations. === Situated agents === In artificial intelligence and cognitive science, the term situated refers to an agent which is embedded in an environment. The term situated is commonly used to refer to robots, but some researchers argue that software agents can also be situated if: they exist in a dynamic (rapidly changing) environment, which they can manipulate or change through their actions, and which they can sense or perceive. Examples might include web-based agents, which can alter data or trigger processes (such as purchases) over the Internet, or virtual-reality bots which inhabit and change virtual worlds, such as Second Life. Being situated is generally considered to be part of being embodied, but it is useful to consider each perspective individually. The situated perspective emphasizes that intelligent behavior derives from the environment and the agent's interactions with it. The nature of these interactions are defined by an agent's embodiment. == Implementation principles == === Modular decomposition === The most important attribute of a system driven by situated AI is that the intelligence is controlled by a set of independent semi-autonomous modules. In the original systems, each module was actually a separate device or was at least conceived of as running on its own processing thread. Generally, though, the modules are just abstractions. In this respect, situated AI may be seen as a software engineering approach to AI, perhaps akin to object oriented design. Situated AI is often associated with reactive planning, but the two are not synonymous. Brooks advocated an extreme version of cognitive minimalism which required initially that the behavior modules were finite-state machines and thus contained no conventional memory or learning. This is associated with reactive AI because reactive AI requires reacting to the current state of the world, not to an agent's memory or preconception of that world. However, learning is obviously key to realistic strong AI, so this constraint has been relaxed, though not entirely abandoned. === Action selection mechanism === The situated AI community has presented several solutions to modeling decision-making processes, also known as action selection mechanisms. The first attempt to solve this problem goes back to subsumption architectures, which were in fact more an implementation technique than an algorithm. However, this attempt paved the way to several others, in particular the free-flow hierarchies and activation networks. A comparison of the structure and performances of these two mechanisms demonstrated the advantage of using free-flow hierarchies in solving the action selection problem. However, motor schemas and process description languages are two other approaches that have been used with success for autonomous robots. == Notes and references == Arsenio, Artur M. (2004) Towards an embodied and situated AI, In: Proceedings of the International FLAIRS conference, 2004. (online) The Artificial Life Route To Artificial Intelligence: Building Embodied, Situated Agents, Luc Steels and Rodney Brooks Eds., Lawrence Erlbaum Publishing, 1995. (ISBN 978-0805815184) Rodney A. Brooks Cambrian Intelligence (MIT Press, 1999) ISBN 0-262-52263-2; collection of early papers including "Intelligence without representation" and "Intelligence without reason", from 1986 & 1991 respectively. Ronald C. Arkin Behavior-Based Robotics (MIT Press, 1998) ISBN 0-262-01165-4 Hendriks-Jansen, Horst (1996) Catching Ourselves in the Act: Situated Activity, Interactive Emergence, Evolution, and Human Thought. Cambridge, Mass.: MIT Press.

    Read more →
  • Argumentation framework

    Argumentation framework

    In artificial intelligence and related fields, an argumentation framework is a way to deal with contentious information and draw conclusions from it using formalized arguments. In an abstract argumentation framework, entry-level information is a set of abstract arguments that, for instance, represent data or a proposition. Conflicts between arguments are represented by a binary relation on the set of arguments. In concrete terms, an argumentation framework is represented with a directed graph such that the nodes are the arguments, and the arrows represent the attack relation. There exist some extensions of the Dung's framework, like the logic-based argumentation frameworks or the value-based argumentation frameworks. == Abstract argumentation frameworks == === Formal framework === Abstract argumentation frameworks, also called argumentation frameworks à la Dung, are defined formally as a pair: A set of abstract elements called arguments, denoted A {\displaystyle A} A binary relation on A {\displaystyle A} , called attack relation, denoted R {\displaystyle R} For instance, the argumentation system S = ⟨ A , R ⟩ {\displaystyle S=\langle A,R\rangle } with A = { a , b , c , d } {\displaystyle A=\{a,b,c,d\}} and R = { ( a , b ) , ( b , c ) , ( d , c ) } {\displaystyle R=\{(a,b),(b,c),(d,c)\}} contains four arguments ( a , b , c {\displaystyle a,b,c} and d {\displaystyle d} ) and three attacks ( a {\displaystyle a} attacks b {\displaystyle b} , b {\displaystyle b} attacks c {\displaystyle c} and d {\displaystyle d} attacks c {\displaystyle c} ). Dung defines some notions : an argument a ∈ A {\displaystyle a\in A} is acceptable with respect to E ⊆ A {\displaystyle E\subseteq A} if and only if E {\displaystyle E} defends a {\displaystyle a} , that is ∀ b ∈ A {\displaystyle \forall b\in A} such that ( b , a ) ∈ R , ∃ c ∈ E {\displaystyle (b,a)\in R,\exists c\in E} such that ( c , b ) ∈ R {\displaystyle (c,b)\in R} , a set of arguments E {\displaystyle E} is conflict-free if there is no attack between its arguments, formally : ∀ a , b ∈ E , ( a , b ) ∉ R {\displaystyle \forall a,b\in E,(a,b)\not \in R} , a set of arguments E {\displaystyle E} is admissible if and only if it is conflict-free and all its arguments are acceptable with respect to E {\displaystyle E} . === Different semantics of acceptance === ==== Extensions ==== To decide if an argument can be accepted or not, or if several arguments can be accepted together, Dung defines several semantics of acceptance that allows, given an argumentation system, sets of arguments (called extensions) to be computed. For instance, given S = ⟨ A , R ⟩ {\displaystyle S=\langle A,R\rangle } , E {\displaystyle E} is a complete extension of S {\displaystyle S} only if it is an admissible set and every acceptable argument with respect to E {\displaystyle E} belongs to E {\displaystyle E} , E {\displaystyle E} is a preferred extension of S {\displaystyle S} only if it is a maximal element (with respect to the set-theoretical inclusion) among the admissible sets with respect to S {\displaystyle S} , E {\displaystyle E} is a stable extension of S {\displaystyle S} only if it is a conflict-free set that attacks every argument that does not belong in E {\displaystyle E} (formally, ∀ a ∈ A ∖ E , ∃ b ∈ E {\displaystyle \forall a\in A\backslash E,\exists b\in E} such that ( b , a ) ∈ R {\displaystyle (b,a)\in R} , E {\displaystyle E} is the (unique) grounded extension of S {\displaystyle S} only if it is the smallest element (with respect to set inclusion) among the complete extensions of S {\displaystyle S} . There exists some inclusions between the sets of extensions built with these semantics : Every stable extension is preferred, Every preferred extension is complete, The grounded extension is complete, If the system is well-founded (there exists no infinite sequence a 0 , a 1 , … , a n , … {\displaystyle a_{0},a_{1},\dots ,a_{n},\dots } such that ∀ i > 0 , ( a i + 1 , a i ) ∈ R {\displaystyle \forall i>0,(a_{i+1},a_{i})\in R} ), all these semantics coincide—only one extension is grounded, stable, preferred, and complete. Some other semantics have been defined. One introduce the notation E x t σ ( S ) {\displaystyle Ext_{\sigma }(S)} to note the set of σ {\displaystyle \sigma } -extensions of the system S {\displaystyle S} . In the case of the system S {\displaystyle S} in the figure above, E x t σ ( S ) = { { a , d } } {\displaystyle Ext_{\sigma }(S)=\{\{a,d\}\}} for every Dung's semantic—the system is well-founded. That explains why the semantics coincide, and the accepted arguments are: a {\displaystyle a} and d {\displaystyle d} . ==== Labellings ==== Labellings are a more expressive way than extensions to express the acceptance of the arguments. Concretely, a labelling is a mapping that associates every argument with a label in (the argument is accepted), out (the argument is rejected), or undec (the argument is undefined—not accepted or refused). One can also note a labelling as a set of pairs ( a r g u m e n t , l a b e l ) {\displaystyle ({\mathit {argument}},{\mathit {label}})} . Such a mapping does not make sense without additional constraint. The notion of reinstatement labelling guarantees the sense of the mapping. L {\displaystyle L} is a reinstatement labelling on the system S = ⟨ A , R ⟩ {\displaystyle S=\langle A,R\rangle } if and only if : ∀ a ∈ A , L ( a ) = i n {\displaystyle \forall a\in A,L(a)={\mathit {in}}} if and only if ∀ b ∈ A {\displaystyle \forall b\in A} such that ( b , a ) ∈ R , L ( b ) = o u t {\displaystyle (b,a)\in R,L(b)={\mathit {out}}} ∀ a ∈ A , L ( a ) = o u t {\displaystyle \forall a\in A,L(a)={\mathit {out}}} if and only if ∃ b ∈ A {\displaystyle \exists b\in A} such that ( b , a ) ∈ R {\displaystyle (b,a)\in R} and L ( b ) = i n {\displaystyle L(b)={\mathit {in}}} ∀ a ∈ A , L ( a ) = u n d e c {\displaystyle \forall a\in A,L(a)={\mathit {undec}}} if and only if L ( a ) ≠ i n {\displaystyle L(a)\neq {\mathit {in}}} and L ( a ) ≠ o u t {\displaystyle L(a)\neq {\mathit {out}}} One can convert every extension into a reinstatement labelling: the arguments of the extension are in, those attacked by an argument of the extension are out, and the others are undec. Conversely, one can build an extension from a reinstatement labelling just by keeping the arguments in. Indeed, Caminada proved that the reinstatement labellings and the complete extensions can be mapped in a bijective way. Moreover, the other Datung's semantics can be associated to some particular sets of reinstatement labellings. Reinstatement labellings distinguish arguments not accepted because they are attacked by accepted arguments from undefined arguments—that is, those that are not defended cannot defend themselves. An argument is undec if it is attacked by at least another undec. If it is attacked only by arguments out, it must be in, and if it is attacked some argument in, then it is out. The unique reinstatement labelling that corresponds to the system S {\displaystyle S} above is L = { ( a , i n ) , ( b , o u t ) , ( c , o u t ) , ( d , i n ) } {\displaystyle L=\{(a,{\mathit {in}}),(b,{\mathit {out}}),(c,{\mathit {out}}),(d,{\mathit {in}})\}} . === Inference from an argumentation system === In the general case when several extensions are computed for a given semantic σ {\displaystyle \sigma } , the agent that reasons from the system can use several mechanisms to infer information: Credulous inference: the agent accepts an argument if it belongs to at least one of the σ {\displaystyle \sigma } -extensions—in which case, the agent risks accepting some arguments that are not acceptable together ( a {\displaystyle a} attacks b {\displaystyle b} , and a {\displaystyle a} and b {\displaystyle b} each belongs to an extension) Skeptical inference: the agent accepts an argument only if it belongs to every σ {\displaystyle \sigma } -extension. In this case, the agent risks deducing too little information (if the intersection of the extensions is empty or has a very small cardinal). For these two methods to infer information, one can identify the set of accepted arguments, respectively C r σ ( S ) {\displaystyle Cr_{\sigma }(S)} the set of the arguments credulously accepted under the semantic σ {\displaystyle \sigma } , and S c σ ( S ) {\displaystyle Sc_{\sigma }(S)} the set of arguments accepted skeptically under the semantic σ {\displaystyle \sigma } (the σ {\displaystyle \sigma } can be missed if there is no possible ambiguity about the semantic). Of course, when there is only one extension (for instance, when the system is well-founded), this problem is very simple: the agent accepts arguments of the unique extension and rejects others. The same reasoning can be done with labellings that correspond to the chosen semantic : an argument can be accepted if it is in for each labelling and refused if it is out for each labelling, the others being in an undecided state (the status of the arguments can remind the

    Read more →
  • AI literacy

    AI literacy

    AI literacy or artificial intelligence literacy is "a set of competencies that enables individuals to critically evaluate AI technologies; communicate and collaborate effectively with AI; and use AI as a tool online, at home, and in the workplace." AI is employed in a variety of applications, including self-driving automobiles, virtual assistants and text generation by generative AI models. Users of these tools should be able to make informed decisions. AI literacy may have an impact on students' future employment prospects. With the rise of generative AI platforms, AI literacy has become a topic of conversation in the field of education. Some think AI literacy is essential for school and college students, while others restrict or prohibit the use of AI in assignments, viewing it as a form of academic dishonesty. However, many researchers and educational institutions promote a more nuanced approach, encouraging critical engagement with AI while developing policies that balance academic integrity with opportunities for learning. == Definitions == Other definitions of AI literacy include the ability to understand, use, monitor, and critically reflect on AI applications. That use of the term usually refers to teaching skills and knowledge to the general public, particularly those who are not adept in AI and the ability to understand, use, evaluate, and ethically navigate AI. As research into AI literacy is still emerging and focused on developing context-specific skills, there is not yet a single, broadly agreed-upon definition. AI literacy is linked to other forms of literacy. AI literacy requires digital literacy, whereas scientific and computational literacy may inform it. Data literacy also significantly overlaps with it. == Categories == AI literacy encompasses multiple categories, including a theoretical understanding of how artificial intelligence works, the usage of artificial intelligence technologies, and the critical appraisal of artificial intelligence, and its ethics. === Know and understand AI === Knowledge and understanding of AI refers to a basic understanding of what artificial intelligence is and how it works. This includes familiarity with machine learning algorithms and the limitations and biases present in AI systems. Users who know and understand AI should be familiar with various technologies that use artificial intelligence, including cognitive systems, robotics and machine learning. This includes recognizing that large language models (LLMs) are machine learning models trained on extensive datasets which generate new text rather than retrieving pre-written responses. === Use and apply AI === Using and applying AI refers to the ability to use AI tools to solve problems and perform tasks such as programming and analyzing big data. Some consider prompt engineering, the practice of designing effective prompts to guide generative AI platforms more effectively, as another competency within AI literacy. === Evaluate and create AI === Evaluation and creation refers to the ability to critically evaluate the quality and reliability of AI systems. It also refers to designing and building fair and ethical AI systems. To evaluate correctly, users should also learn in which areas AI is strong, and in which areas it is weak. === AI ethics === AI ethics refers to understanding the moral implications of AI, and the making informed decisions regarding the use of AI tools. This area includes considerations such as: Accountability: Hold AI actors accountable for the operation of AI systems and adherence to ethical ideals. Accuracy: Identify and report sources of error and uncertainty in algorithms and data. Auditability: Enable other parties to audit and assess algorithm behavior via transparent information sharing. Explainability: Make sure that algorithmic judgments and the underlying data can be presented in simple language. Fairness: Prevent biases and consider varied viewpoints. To do so, increase the diversity of researchers in the field. Human Centricity and Well-being: Prioritize human well-being in AI development and deployment. Human rights Alignment: Ensure that technology do not infringe internationally recognized human rights. Inclusivity: Make AI accessible to everyone. Progress: Choose high value initiatives. Responsibility, accountability, and transparency: Foster trust via responsibility, accountability, and fairness. Robustness and Security: Make AI systems safe, secure, and resistant to manipulation or data breach. Sustainability: Choose implementations that generate long-term, useful benefits. Environmental Implications: How this tool impacts the environment, any restrictions or laws, if this impact is worth the effects or not. === Enabling AI === Support AI by developing associated knowledge and skills such as programming and statistics. == Promoting AI literacy == Several governments have recognized the need to promote AI literacy, including among adults. Such programs have been published in the United States, China, Germany and Finland. Programs intended for the general public usually consist of short and easy to understand online study units. Programs intended for children are usually project-based. Programs for students at colleges and universities often address the specific professional needs of the student, depending on their field of study. Beyond the education system, AI literacy can also be developed in the community, for example in museums. === Schools === Schools use diverse pedagogies to promote AI literacy. These include: Performing a Turing test with an intelligent agent Creating chatbots Building apps using Blockly-based programming Project-based learning Building robots Data visualization Training AI models Artificial intelligence curricula can improve students' understanding of topics such as machine learning, neural networks, and deep learning. === Higher education === Before the second decade of the 21st century, artificial intelligence was studied mainly in STEM courses. Later, projects emerged to increase artificial intelligence education, specifically to promote AI literacy. Most courses start with one or more study units that deal with basic questions such as what artificial intelligence is, where it comes from, what it can do and what it can't do. Most courses also refer to machine learning and deep learning. Some of the courses deal with moral issues in artificial intelligence. In Ireland, the Higher Education Authority published Generative AI in Higher Education Teaching & Learning: Policy Framework in December 2025, which encouraged higher education institutions to embed AI literacy across programmes as a core graduate attribute. ==== Disciplinary policy ==== As a response to the increase of generative AI use in education, several disciplines formed committees or task forces to examine context-specific approaches toward AI literacy. In spring 2025, the Modern Language Association and Conference on College Composition and Communication Joint Task Force finished development of three working papers, a guide on AI literacy for students, and a collection of resources addressing AI use in writing. The task force emphasized the need for "a culture of critical AI literacy" and included guidelines not only for students but also educators and institutions, highlighting the need for modeling ethical AI use in planning processes. Similarly, a committee formed by the American Historical Association Council published "Guiding Principles for Artificial Intelligence in History Education" which encouraged "clear and transparent engagement with generative AI." The guidelines demonstrate the value of criticality when working with generative AI in thinking and research.

    Read more →
  • Empirical risk minimization

    Empirical risk minimization

    In statistical learning theory, the principle of empirical risk minimization defines a family of learning algorithms based on evaluating performance over a known and fixed dataset. The core idea is based on an application of the law of large numbers; more specifically, we cannot know exactly how well a predictive algorithm will work in practice (i.e. the "true risk") because we do not know the true distribution of the data, but we can instead estimate and optimize the performance of the algorithm on a known set of training data. The performance over the known set of training data is referred to as the "empirical risk". == Background == The following situation is a general setting of many supervised learning problems. There are two spaces of objects X {\displaystyle X} and Y {\displaystyle Y} and we would like to learn a function h : X → Y {\displaystyle \ h:X\to Y} (often called hypothesis) which outputs an object y ∈ Y {\displaystyle y\in Y} , given x ∈ X {\displaystyle x\in X} . To do so, there is a training set of n {\displaystyle n} examples ( x 1 , y 1 ) , … , ( x n , y n ) {\displaystyle \ (x_{1},y_{1}),\ldots ,(x_{n},y_{n})} where x i ∈ X {\displaystyle x_{i}\in X} is an input and y i ∈ Y {\displaystyle y_{i}\in Y} is the corresponding response that is desired from h ( x i ) {\displaystyle h(x_{i})} . To put it more formally, assuming that there is a joint probability distribution P ( x , y ) {\displaystyle P(x,y)} over X {\displaystyle X} and Y {\displaystyle Y} , and that the training set consists of n {\displaystyle n} instances ( x 1 , y 1 ) , … , ( x n , y n ) {\displaystyle \ (x_{1},y_{1}),\ldots ,(x_{n},y_{n})} drawn i.i.d. from P ( x , y ) {\displaystyle P(x,y)} . The assumption of a joint probability distribution allows for the modelling of uncertainty in predictions (e.g. from noise in data) because y {\displaystyle y} is not a deterministic function of x {\displaystyle x} , but rather a random variable with conditional distribution P ( y | x ) {\displaystyle P(y|x)} for a fixed x {\displaystyle x} . It is also assumed that there is a non-negative real-valued loss function L ( y ^ , y ) {\displaystyle L({\hat {y}},y)} which measures how different the prediction y ^ {\displaystyle {\hat {y}}} of a hypothesis is from the true outcome y {\displaystyle y} . For classification tasks, these loss functions can be scoring rules. The risk associated with hypothesis h ( x ) {\displaystyle h(x)} is then defined as the expectation of the loss function: R ( h ) = E [ L ( h ( x ) , y ) ] = ∫ L ( h ( x ) , y ) d P ( x , y ) . {\displaystyle R(h)=\mathbf {E} [L(h(x),y)]=\int L(h(x),y)\,dP(x,y).} A loss function commonly used in theory is the 0-1 loss function: L ( y ^ , y ) = { 1 if y ^ ≠ y 0 if y ^ = y {\displaystyle L({\hat {y}},y)={\begin{cases}1&{\mbox{ if }}\quad {\hat {y}}\neq y\\0&{\mbox{ if }}\quad {\hat {y}}=y\end{cases}}} . The ultimate goal of a learning algorithm is to find a hypothesis h ∗ {\displaystyle h^{}} among a fixed class of functions H {\displaystyle {\mathcal {H}}} for which the risk R ( h ) {\displaystyle R(h)} is minimal: h ∗ = a r g m i n h ∈ H R ( h ) . {\displaystyle h^{}={\underset {h\in {\mathcal {H}}}{\operatorname {arg\,min} }}\,{R(h)}.} For classification problems, the Bayes classifier is defined to be the classifier minimizing the risk defined with the 0–1 loss function. == Formal definition == In general, the risk R ( h ) {\displaystyle R(h)} cannot be computed because the distribution P ( x , y ) {\displaystyle P(x,y)} is unknown to the learning algorithm. However, given a sample of iid training data points, we can compute an estimate, called the empirical risk, by computing the average of the loss function over the training set; more formally, computing the expectation with respect to the empirical measure: R emp ( h ) = 1 n ∑ i = 1 n L ( h ( x i ) , y i ) . {\displaystyle \!R_{\text{emp}}(h)={\frac {1}{n}}\sum _{i=1}^{n}L(h(x_{i}),y_{i}).} The empirical risk minimization principle states that the learning algorithm should choose a hypothesis h ^ {\displaystyle {\hat {h}}} which minimizes the empirical risk over the hypothesis class H {\displaystyle {\mathcal {H}}} : h ^ = a r g m i n h ∈ H R emp ( h ) . {\displaystyle {\hat {h}}={\underset {h\in {\mathcal {H}}}{\operatorname {arg\,min} }}\,R_{\text{emp}}(h).} Thus, the learning algorithm defined by the empirical risk minimization principle consists in solving the above optimization problem. == Properties == Guarantees for the performance of empirical risk minimization depend strongly on the function class selected as well as the distributional assumptions made. In general, distribution-free methods are too coarse, and do not lead to practical bounds. However, they are still useful in deriving asymptotic properties of learning algorithms, such as consistency. In particular, distribution-free bounds on the performance of empirical risk minimization given a fixed function class can be derived using bounds on the VC complexity of the function class. For simplicity, considering the case of binary classification tasks, it is possible to bound the probability of the selected classifier, ϕ n {\displaystyle \phi _{n}} being much worse than the best possible classifier ϕ ∗ {\displaystyle \phi ^{}} . Consider the risk L {\displaystyle L} defined over the hypothesis class C {\displaystyle {\mathcal {C}}} with growth function S ( C , n ) {\displaystyle {\mathcal {S}}({\mathcal {C}},n)} given a dataset of size n {\displaystyle n} . Then, for every ϵ > 0 {\displaystyle \epsilon >0} : P ( L ( ϕ n ) − L ( ϕ ∗ ) > ϵ ) ≤ 8 S ( C , n ) exp ⁡ { − n ϵ 2 / 32 } {\displaystyle \mathbb {P} \left(L(\phi _{n})-L(\phi ^{})>\epsilon \right)\leq {\mathcal {8}}S({\mathcal {C}},n)\exp\{-n\epsilon ^{2}/32\}} Similar results hold for regression tasks. These results are often based on uniform laws of large numbers, which control the deviation of the empirical risk from the true risk, uniformly over the hypothesis class. === Impossibility results === It is also possible to show lower bounds on algorithm performance if no distributional assumptions are made. This is sometimes referred to as the No free lunch theorem. Even though a specific learning algorithm may provide the asymptotically optimal performance for any distribution, the finite sample performance is always poor for at least one data distribution. This means that no classifier can improve on the error for a given sample size for all distributions. Specifically, let ϵ > 0 {\displaystyle \epsilon >0} and consider a sample size n {\displaystyle n} and classification rule ϕ n {\displaystyle \phi _{n}} , there exists a distribution of ( X , Y ) {\displaystyle (X,Y)} with risk L ∗ = 0 {\displaystyle L^{}=0} (meaning that perfect prediction is possible) such that: E L n ≥ 1 / 2 − ϵ . {\displaystyle \mathbb {E} L_{n}\geq 1/2-\epsilon .} It is further possible to show that the convergence rate of a learning algorithm is poor for some distributions. Specifically, given a sequence of decreasing positive numbers a i {\displaystyle a_{i}} converging to zero, it is possible to find a distribution such that: E L n ≥ a i {\displaystyle \mathbb {E} L_{n}\geq a_{i}} for all n {\displaystyle n} . This result shows that universally good classification rules do not exist, in the sense that the rule must be low quality for at least one distribution. === Computational complexity === Empirical risk minimization for a classification problem with a 0-1 loss function is known to be an NP-hard problem even for a relatively simple class of functions such as linear classifiers. Nevertheless, it can be solved efficiently when the minimal empirical risk is zero, i.e., data is linearly separable. In practice, machine learning algorithms cope with this issue either by employing a convex approximation to the 0–1 loss function (like hinge loss for SVM), which is easier to optimize, or by imposing assumptions on the distribution P ( x , y ) {\displaystyle P(x,y)} (and thus stop being agnostic learning algorithms to which the above result applies). In the case of convexification, Zhang's lemma majors the excess risk of the original problem using the excess risk of the convexified problem. Minimizing the latter using convex optimization also allow to control the former. == Tilted empirical risk minimization == Tilted empirical risk minimization is a machine learning technique used to modify standard loss functions like squared error, by introducing a tilt parameter. This parameter dynamically adjusts the weight of data points during training, allowing the algorithm to focus on specific regions or characteristics of the data distribution. Tilted empirical risk minimization is particularly useful in scenarios with imbalanced data or when there is a need to emphasize errors in certain parts of the prediction space.

    Read more →
  • JAX (software)

    JAX (software)

    JAX is a Python library for accelerator-oriented array computation and program transformation, designed for high-performance numerical computing and large-scale machine learning. It is developed by Google with contributions from Nvidia and other community contributors. It is described as bringing together a modified version of the automatic differentiation system autograd and OpenXLA's XLA (Accelerated Linear Algebra). It is designed to follow the structure and workflow of NumPy as closely as possible and works with various existing frameworks such as TensorFlow and PyTorch. The primary features of JAX are: Providing a unified NumPy-like interface to computations that run on CPU, GPU, or TPU, in local or distributed settings. Built-in Just-In-Time (JIT) compilation via OpenXLA, an open-source machine learning compiler ecosystem. Efficient evaluation of gradients via its automatic differentiation transformations. Automatic vectorization to efficiently map functions over arrays representing batches of inputs. == Libraries using Jax == Flax Equinox Optax

    Read more →
  • Neuro-symbolic AI

    Neuro-symbolic AI

    Neuro-symbolic AI is a subfield of artificial intelligence that integrates neural methods (e.g., neural networks and deep learning) with symbolic methods (e.g., formal logic, knowledge representation, and automated reasoning). The goal is to combine the strengths of both approaches, resulting in AI systems that can be trained from raw data and demonstrate robustness against outliers or errors in the base data, while preserving explainability, explicit use of expert knowledge, and explicit cognitive reasoning. As argued by Leslie Valiant and others, the effective construction of rich computational cognitive models demands the combination of symbolic reasoning and efficient machine learning. Gary Marcus argued, "We cannot construct rich cognitive models in an adequate, automated way without the triumvirate of hybrid architecture, rich prior knowledge, and sophisticated techniques for reasoning." Further, "To build a robust, knowledge-driven approach to AI we must have the machinery of symbol manipulation in our toolkit. Too much of useful knowledge is abstract to make do without tools that represent and manipulate abstraction, and to date, the only known machinery that can manipulate such abstract knowledge reliably is the apparatus of symbol manipulation." Angelo Dalli, Henry Kautz, Francesca Rossi, and Bart Selman also argued for such a synthesis. Their arguments attempt to address the two kinds of thinking, as discussed in Daniel Kahneman's book Thinking, Fast and Slow. It describes cognition as encompassing two components: System 1 is fast, reflexive, intuitive, and unconscious. System 2 is slower, step-by-step, and explicit. System 1 is used for pattern recognition. System 2 handles planning, deduction, and deliberative thinking. In this view, deep learning best handles the first kind of cognition, while symbolic reasoning best handles the second kind. Both are necessary for the development of a robust and reliable AI system capable of learning, reasoning, and interacting with humans to accept advice and answer questions. Since the 1990s, dual-process models with explicit references to the two contrasting systems have been the focus of research in both the fields of AI and cognitive science by numerous researchers. In 2025, the adoption of neurosymbolic AI, an approach that integrates neural networks with symbolic reasoning, increased in response to the need to address hallucination issues in large language models. For example, Amazon implemented Neurosymbolic AI in its Vulcan warehouse robots and Rufus shopping assistant to enhance accuracy and decision-making. == Approaches == Approaches for integration are diverse. Henry Kautz's taxonomy of neuro-symbolic architectures follows, along with some examples: Symbolic Neural symbolic is the current approach of many neural models in natural language processing, where words or subword tokens are the ultimate input and output of large language models. Examples include BERT, RoBERTa, and GPT-3. Symbolic[Neural] is exemplified by AlphaGo, where symbolic techniques are used to invoke neural techniques. In this case, the symbolic approach is Monte Carlo tree search and the neural techniques learn how to evaluate game positions. Neural | Symbolic uses a neural architecture to interpret perceptual data as symbols and relationships that are reasoned about symbolically. Neural-Concept Learner is an example. Neural: Symbolic → Neural relies on symbolic reasoning to generate or label training data that is subsequently learned by a deep learning model, e.g., to train a neural model for symbolic computation by using a Macsyma-like symbolic mathematics system to create or label examples. NeuralSymbolic uses a neural net that is generated from symbolic rules. An example is the Neural Theorem Prover, which constructs a neural network from an AND-OR proof tree generated from knowledge base rules and terms. Logic Tensor Networks also fall into this category. Neural[Symbolic] according to Kautz, this approach embeds true symbolic reasoning inside a neural network. These are tightly-coupled neural-symbolic systems, in which the logical inference rules are internal to the neural network. This way, the neural network internally computes the inference from the premises and learns to reason based on logical inference systems. Early work on connectionist modal and temporal logics by Garcez, Lamb, and Gabbay is aligned with this approach. These categories are not exhaustive, as they do not consider multi-agent systems. In 2005, Bader and Hitzler presented a more fine-grained categorization that took into account, e.g., whether the use of symbols included logic and, if so, whether the logic was propositional or first-order logic. The 2005 categorization and Kautz's taxonomy above are compared and contrasted in a 2021 article. Sepp Hochreiter argued that Graph Neural Networks "...are the predominant models of neural-symbolic computing" since "[t]hey describe the properties of molecules, simulate social networks, or predict future states in physical and engineering applications with particle-particle interactions." == Artificial general intelligence == Gary Marcus argues that "...hybrid architectures that combine learning and symbol manipulation are necessary for robust intelligence, but not sufficient", and that there are ...four cognitive prerequisites for building robust artificial intelligence: hybrid architectures that combine large-scale learning with the representational and computational powers of symbol manipulation, large-scale knowledge bases—likely leveraging innate frameworks—that incorporate symbolic knowledge along with other forms of knowledge, reasoning mechanisms capable of leveraging those knowledge bases in tractable ways, and rich cognitive models that work together with those mechanisms and knowledge bases. This echoes earlier calls for hybrid models as early as the 1990s. == History == Garcez and Lamb described research in this area as ongoing, at least since the 1990s. During that period, the terms symbolic and sub-symbolic AI were popular. A series of workshops on neuro-symbolic AI has been held annually since 2005 Neuro-Symbolic Artificial Intelligence. In the early 1990s, an initial set of workshops on this topic were organized. == Research == Key research questions remain, such as: What is the best way to integrate neural and symbolic architectures? How should symbolic structures be represented within neural networks and extracted from them? How should common-sense knowledge be learned and reasoned about? How can abstract knowledge that is hard to encode logically be handled? == Implementations == Implementations of neuro-symbolic approaches include: AllegroGraph: an integrated Knowledge Graph based platform for neuro-symbolic application development. Scallop: a language based on Datalog that supports differentiable logical and relational reasoning. Scallop can be integrated in Python and with a PyTorch learning module. Logic Tensor Networks: encode logical formulas as neural networks and simultaneously learn term encodings, term weights, and formula weights. DeepProbLog: combines neural networks with the probabilistic reasoning of ProbLog. Abductive Learning: integrates machine learning and logical reasoning in a balanced-loop via abductive reasoning, enabling them to work together in a mutually beneficial way. SymbolicAI: a compositional differentiable programming library.

    Read more →