AI App Editing Video

AI App Editing Video — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • SUPS

    SUPS

    In computational neuroscience, SUPS (for Synaptic Updates Per Second) or formerly CUPS (Connections Updates Per Second) is a measure of a neuronal network performance, useful in fields of neuroscience, cognitive science, artificial intelligence, and computer science. == Computing == For a processor or computer designed to simulate a neural network SUPS is measured as the product of simulated neurons N {\displaystyle N} and average connectivity c {\displaystyle c} (synapses) per neuron per second: S U P S = c × N {\displaystyle SUPS=c\times N} Depending on the type of simulation it is usually equal to the total number of synapses simulated. In an "asynchronous" dynamic simulation if a neuron spikes at υ {\displaystyle \upsilon } Hz, the average rate of synaptic updates provoked by the activity of that neuron is υ c N {\displaystyle \upsilon cN} . In a synchronous simulation with step Δ t {\displaystyle \Delta t} the number of synaptic updates per second would be c N Δ t {\displaystyle {\frac {cN}{\Delta t}}} . As Δ t {\displaystyle \Delta t} has to be chosen much smaller than the average interval between two successive afferent spikes, which implies Δ t < 1 υ N {\displaystyle \Delta t<{\frac {1}{\upsilon N}}} , giving an average of synaptic updates equal to υ c N 2 {\displaystyle \upsilon cN^{2}} . Therefore, spike-driven synaptic dynamics leads to a linear scaling of computational complexity O(N) per neuron, compared with the O(N2) in the "synchronous" case. == Records == Developed in the 1980s Adaptive Solutions' CNAPS-1064 Digital Parallel Processor chip is a full neural network (NNW). It was designed as a coprocessor to a host and has 64 sub-processors arranged in a 1D array and operating in a SIMD mode. Each sub-processor can emulate one or more neurons and multiple chips can be grouped together. At 25 MHz it is capable of 1.28 GMAC. After the presentation of the RN-100 (12 MHz) single neuron chip at Seattle 1991 Ricoh developed the multi-neuron chip RN-200. It had 16 neurons and 16 synapses per neuron. The chip has on-chip learning ability using a proprietary backdrop algorithm. It came in a 257-pin PGA encapsulation and drew 3.0 W at a maximum. It was capable of 3 GCPS (1 GCPS at 32 MHz). In 1991–97, Siemens developed the MA-16 chip, SYNAPSE-1 and SYNAPSE-3 Neurocomputer. The MA-16 was a fast matrix-matrix multiplier that can be combined to form systolic arrays. It could process 4 patterns of 16 elements each (16-bit), with 16 neuron values (16-bit) at a rate of 800 MMAC or 400 MCPS at 50 MHz. The SYNAPSE3-PC PCI card contained 2 MA-16 with a peak performance of 2560 MOPS (1.28 GMAC); 7160 MOPS (3.58 GMAC) when using three boards. In 2013, the K computer was used to simulate a neural network of 1.73 billion neurons with a total of 10.4 trillion synapses (1% of the human brain). The simulation ran for 40 minutes to simulate 1 s of brain activity at a normal activity level (4.4 on average). The simulation required 1 Petabyte of storage.

    Read more →
  • Space-based data center

    Space-based data center

    Space-based data centers or orbital AI infrastructure are proposed concepts to build AI data centers in the sun-synchronous orbit or other orbits utilizing space-based solar power. Electric power has become the main bottleneck for terrestrial AI infrastructure. Space-based edge computing has historical roots in military architectures designed to bypass the latency of ground-based targeting networks. In the 1980s, the Strategic Defense Initiative's Brilliant Pebbles program first envisioned autonomous on-orbit data processing for missile defense. In 2019, the Space Development Agency (SDA) began to revive this decentralized approach through its Proliferated Warfighter Space Architecture (PWSA). This ambitious "sensor-to-shooter" infrastructure is treated as a prerequisite for the modern Golden Dome program, which would rely on space-based data processing to continuously track targets. == History == Early thinking about space-based computing infrastructure grew out of mid-20th-century visions for large orbital industrial systems, most notably proposals for space-based solar power, which were popularized in both technical literature and science writing by figures such as Isaac Asimov in the 1940s. These ideas emphasized exploiting the vacuum, continuous solar energy, and thermal characteristics of space to support power-intensive activities that would be difficult or inefficient on Earth. In the 21st century, advances in small satellites, reusable launch vehicles, and high-performance computing revived interest in space-based data centers, with governments and private companies exploring orbital or near-space platforms for edge computing, secure data handling, and low-latency processing of Earth-observation data. In September 2024, Y Combinator-backed Starcloud released a white paper detailing plans to build multiple gigawatts of AI compute in orbit. It was the first widely cited proposal to actually start building large orbital data centers. In 2025, Starcloud deployed an NVIDIA H100-class system and became the first company to train an LLM in space and run a version of Google Gemini in space. In March 2025, Lonestar deployed a data backup machine on the surface of the moon. In early January 2026, a team from the University of Pennsylvania presented a tether-based architecture for orbital data centers at the AIAA SciTech conference. The design relied on gravity gradient tension and solar-pressure-based passive attitude stabilization to minimize the mass of MW-scale orbital data centers. In January 2026, SpaceX filed plans with the Federal Communications Commission (FCC) for millions of satellites, leveraging reusable launches and Starlink integration to extend cloud and AI computing into orbit. Around the same time, Blue Origin announced the TeraWave constellation of about 5,400 satellites, designed to provide high‑throughput networking for data centers, enterprise, and government customers. Meanwhile, China announced a 200,000‑satellite constellation, focusing on state coordination, data sovereignty, and in-orbit processing for secure, time-critical applications. In February 2026, Starcloud submitted a proposal to the FCC for a constellation of up to 88,000 satellites for orbital data centers. In March, it announced intentions to be the first to mine Bitcoin in space, flying bitcoin mining ASICs on its second satellite, Starcloud-2. In May 2026, Edge Aerospace was awarded a contract by the European Space Agency under its Space Cloud program to study use cases, architectures and implementation roadmap for orbital data centers. == Feasibility == In October 2025, Nature Electronics published a study led by a research group at Nanyang Technological University on the development of carbon-neutral data centres in space. In November 2025, Google published a feasibility study on space-based data centers. The authors argued that if launch costs to low earth orbit reached US$200/kg, the launch cost for data center satellites could be cost effective relative to current energy costs for ground-based data centers. They project this may occur around 2035 if SpaceX's Starship project scales to 180 launches/year by then. == Advantages == Some sun-synchronous orbit (SSO) planes have constant sunlight in the dawn/dusk which could provide continuous solar energy. SSO is a limited resource and proper management and sharing of it is required. Solar irradiance is 36% higher in Earth orbit than on the surface No Earth weather storms or clouds, however more exposed to Solar storms. No property tax or land-use regulation. Saves space for other land use. Ample space for scalability. Won't strain the power grid. Direct access to power source without additional infrastructure. == Disadvantages == The deployment of space-based data centers raises several technical, economic, and environmental concerns. Existing launch costs are substantial and remains main cost of space infrastructure deployment Cooling is limited to heat dissipation through radiation only, which made in inefficient in comparison to convection in terrestrial data centers Space infrastructure must be designed to survive launch and to work under environment conditions of radiation, wide range of temperatures, in vacuum and in microgravity In-space assembly is on early development stage to enable deployment of mega-structures Megastructures are particularly exposed to orbital debris Solar arrays efficiency decrease 0.5% to 0.8% per year due to exposure of ultraviolet rays, space weather and orbital thermal cycles Hardware is designed for limited lifespan. Maintenance and repair in space (known as On-Orbit Servicing (OOS)) is still on early stage of practical implementation. Disposable data centre: technology obsolescence of AI data centre being a concern and difficult maintenance in space imply the single-use purpose of those space data centres. To extend lifetime, space infrastructure will require either refueling or orbit rasie by the servicer, which is going to increase its operational costs The environmental impact on Earth has its own challenges: The environmental impact of launches need to be addressed. Deployment consumes Earth resources that cannot be recovered or recycled. Computers require lots of resources, some of which are strategic. Recycling e-waste is already a challenge on Earth and extremely unlikely in space. Space debris (orbit pollution) is another sustainability challenge for space: Orbits are, like any resources, a limited physical and electromagnetic resource and available for all mankind. The accumulation of satellites on a particular orbit reduces the use of space for other purposes. A consequence of the increase of satellite in orbit is a higher risk of the runaway of space debris (see Kessler syndrome). This means some orbits could become unusable. Latency and bandwidth are constrained in space, and consumes limited electromagnetic resources. Satellite flares could inhibit ground-based and space-based observational astronomy. == Size and power generated == It would take ~1 square mile solar array in earth orbit to produce 1 gigawatt of power at 30% cell efficiency. == Companies pursuing space-based AI infrastructure == Blue Origin Cowboy Space Corporation (formerly Aetherflux) Edge Aerospace Google – Project Suncatcher Nvidia OpenAI SpaceX Starcloud

    Read more →
  • Smart object

    Smart object

    A smart object is an object that enhances the interaction with not only people but also with other smart objects. Also known as smart connected products or smart connected things (SCoT), they are products, assets and other things embedded with processors, sensors, software and connectivity that allow data to be exchanged between the product and its environment, manufacturer, operator/user, and other products and systems. Connectivity also enables some capabilities of the product to exist outside the physical device, in what is known as the product cloud. The data collected from these products can be then analysed to inform decision-making, enable operational efficiencies and continuously improve the performance of the product. It can not only refer to interaction with physical world objects but also to interaction with virtual (computing environment) objects. A smart physical object may be created either as an artifact or manufactured product or by embedding electronic tags such as RFID tags or sensors into non-smart physical objects. Smart virtual objects are created as software objects that are intrinsic when creating and operating a virtual or cyber world simulation or game. The concept of a smart object has several origins and uses, see History. There are also several overlapping terms, see also smart device, tangible object or tangible user interface and Thing as in the Internet of things. == History == In the early 1990s, Mark Weiser, from whom the term ubiquitous computing originated, referred to a vision "When almost every object either contains a computer or can have a tab attached to it, obtaining information will be trivial", Although Weiser did not specifically refer to an object as being smart, his early work did imply that smart physical objects are smart in the sense that they act as digital information sources. Hiroshi Ishii and Brygg Ullmer refer to tangible objects in terms of tangibles bits or tangible user interfaces that enable users to "grasp & manipulate" bits in the center of users' attention by coupling the bits with everyday physical objects and architectural surfaces. The smart object concept was introduced by Marcelo Kallman and Daniel Thalmann as an object that can describe its own possible interactions. The main focus here is to model interactions of smart virtual objects with virtual humans, agents, in virtual worlds. The opposite approach to smart objects is 'plain' objects that do not provide this information. The additional information provided by this concept enables far more general interaction schemes, and can greatly simplify the planner of an artificial intelligence agent. In contrast to smart virtual objects used in virtual worlds, Lev Manovich focuses on physical space filled with electronic and visual information. Here, "smart objects" are described as "objects connected to the Net; objects that can sense their users and display smart behaviour". More recently in the early 2010s, smart objects are being proposed as a key enabler for the vision of the Internet of things. The combination of the Internet and emerging technologies such as near field communications, real-time localization, and embedded sensors enables everyday objects to be transformed into smart objects that can understand and react to their environment. Such objects are building blocks for the Internet of things and enable novel computing applications. In 2018, one of the world's first smart houses was built in Klaukkala, Finland in the form of a five-floor apartment block, using the Kone Residential Flow solution created by KONE, allowing even a smartphone to act as a home key. == Characteristics == Although we can view interaction with physical smart object in the physical world as distinct from interaction with virtual smart objects in a virtual simulated world, these can be related. Poslad considers the progression of: how humans use models of smart objects situated in the physical world to enhance human to physical world interaction; versus how smart physical objects situated in the physical world can model human interaction in order to lessen the need for human to physical world interaction; versus how virtual smart objects by modelling both physical world objects and modelling humans as objects and their subsequent interactions can form a predominantly smart virtual object environment. === Smart physical objects === The concept smart for a smart physical object simply means that it is active, digital, networked, can operate to some extent autonomously, is reconfigurable and has local control of the resources it needs such as energy, data storage, etc. Note, a smart object does not necessarily need to be intelligent as in exhibiting a strong essence of artificial intelligence—although it can be designed to also be intelligent. Physical world smart objects can be described in terms of three properties: Awareness: is a smart object's ability to understand (that is, sense, interpret, and react to) events and human activities occurring in the physical world. Representation: refers to a smart object's application and programming model—in particular, programming abstractions. Interaction: denotes the object's ability to converse with the user in terms of input, output, control, and feedback. Based upon these properties, these have been classified into three types: Activity-Aware Smart Objects: Are objects that can record information about work activities and its own use. Policy-Aware Smart Objects: Are objects that are activity-aware Objects can interpret events and activities with respect to predefined organizational policies. Process-Aware Smart Objects: Processes play a fundamental role in industrial work management and operation. A process is a collection of related activities or tasks that are ordered according to their position in time and space. === Smart virtual objects === For the virtual object in a virtual world case, an object is called smart when it has the ability to describe its possible interactions. This focuses on constructing a virtual world using only virtual objects that contain their own interaction information. There are four basic elements to constructing such a smart virtual object framework. Object properties: physical properties and a text description Interaction information: position of handles, buttons, grips, and the like Object behavior: different behaviors based on state variables Agent behaviors: description of the behavior an agent should follow when using the object Some versions of smart objects also include animation information in the object information, but this is not considered to be an efficient approach, since this can make objects inappropriately oversized. === Categorization === The terms smart, connected product or smart product can be confusing as it is used to cover a broad range of different products, ranging from smart home appliances (e.g., smart bathroom scales or smart light bulbs) to smart cars (e.g., Tesla). While these products share certain similarities, they often differ substantially in their capabilities. Raff et al. developed a conceptual framework that distinguishes different smart products based on their capabilities, which features 4 types of smart product archetypes (in ascending order of "smartness"). Digital Connected Responsive Intelligent == Advantages == Smart, connected products have three primary components: Physical – made up of the product's mechanical and electrical parts. Smart – made up of sensors, microprocessors, data storage, controls, software, and an embedded operating system with enhanced user interface. Connectivity – made up of ports, antennae, and protocols enabling wired/wireless connections that serve two purposes, it allows data to be exchanged with the product and enables some functions of the product to exist outside the physical device. Each component expands the capabilities of one another resulting in "a virtuous cycle of value improvement". First, the smart components of a product amplify the value and capabilities of the physical components. Then, connectivity amplifies the value and capabilities of the smart components. These improvements include: Monitoring of the product's conditions, its external environment, and its operations and usage. Control of various product functions to better respond to changes in its environment, as well as to personalize the user experience. Optimization of the product's overall operations based on actual performance data, and reduction of downtimes through predictive maintenance and remote service. Autonomous product operation, including learning from their environment, adapting to users' preferences and self-diagnosing and service. === The Internet of things (IoT) === The Internet of things is the network of physical objects that contain embedded technology to communicate and sense or interact with their internal states or the external environment. The phrase "Internet of things" reflects the gro

    Read more →
  • Software agent

    Software agent

    In computer science, a software agent is a computer program that acts for a user or another program in a relationship of agency. The term agent is derived from the Latin agere (to do): an agreement to act on one's behalf. Such "action on behalf of" implies the authority to decide which, if any, action is appropriate. Some agents are colloquially known as bots, from robot. They may be embodied, as when execution is paired with a robot body, or as software such as a chatbot executing on a computer, such as a mobile device, e.g. Siri. Software agents may be autonomous or work together with other agents or people. Software agents interacting with people (e.g. chatbots, human-robot interaction environments) may possess human-like qualities such as natural language understanding and speech, personality or embody humanoid form (see Asimo). Related and derived concepts include intelligent agents (in particular exhibiting some aspects of artificial intelligence, such as reasoning), autonomous agents (capable of modifying the methods of achieving their objectives), distributed agents (being executed on physically distinct computers), multi-agent systems (distributed agents that work together to achieve an objective that could not be accomplished by a single agent acting alone), and mobile agents (agents that can relocate their execution onto different processors). == Concepts == The basic attributes of an autonomous software agent are that agents: are not strictly invoked for a task, but activate themselves, may reside in wait status on a host, perceiving context, may get to run status on a host upon starting conditions, do not require interaction of user, may invoke other tasks including communication. The concept of an agent provides a method of describing a complex software entity that is capable of acting with a certain degree of autonomy in order to accomplish tasks on behalf of its host. But unlike objects, which are defined in terms of methods and attributes, an agent is defined in terms of its behavior. Various authors have proposed different definitions of agents, these commonly include concepts such as: persistence: code is not executed on demand but runs continuously and decides for itself when it should perform some activity; autonomy: agents have capabilities of task selection, prioritization, goal-directed behavior, decision-making without human intervention; social ability: agents are able to engage other components through some sort of communication and coordination, they may collaborate on a task; reactivity: agents perceive the context in which they operate and react to it appropriately. === Distinguishing agents from programs === All agents are programs, but not all programs are agents. Contrasting the term with related concepts may help clarify its meaning. Franklin & Graesser (1997) discuss four key notions that distinguish agents from arbitrary programs: reaction to the environment, autonomy, goal-orientation and persistence. === Intuitive distinguishing agents from objects === Agents are more autonomous than objects. Agents have flexible behavior: reactive, proactive, social. Agents have at least one thread of control but may have more. === Distinguishing agents from expert systems === Expert systems are not coupled to their environment. Expert systems are not designed for reactive, proactive behavior. Expert systems do not consider social ability. === Distinguishing intelligent software agents from intelligent agents in AI === Intelligent agents (also known as rational agents) are not just computer programs: they may also be machines, human beings, communities of human beings (such as firms) or anything that is capable of goal-directed behavior. == Impact of software agents == Software agents may offer various benefits to their end users by automating complex or repetitive tasks. However, there are organizational and cultural impacts of this technology that need to be considered prior to implementing software agents. === Organizational impact === === Work contentment and job satisfaction impact === People like to perform easy tasks providing the sensation of success unless the repetition of the simple tasking is affecting the overall output. In general implementing software agents to perform administrative requirements provides a substantial increase in work contentment, as administering their own work does never please the worker. The effort freed up serves for a higher degree of engagement in the substantial tasks of individual work. Hence, software agents may provide the basics to implement self-controlled work, relieved from hierarchical controls and interference. Such conditions may be secured by application of software agents for required formal support. === Cultural impact === The cultural effects of the implementation of software agents include trust affliction, skills erosion, privacy attrition and social detachment. Some users may not feel entirely comfortable fully delegating important tasks to software applications. Those who start relying solely on intelligent agents may lose important skills, for example, relating to information literacy. In order to act on a user's behalf, a software agent needs to have a complete understanding of a user's profile, including his/her personal preferences. This, in turn, may lead to unpredictable privacy issues. When users start relying on their software agents more, especially for communication activities, they may lose contact with other human users and look at the world with the eyes of their agents. These consequences are what agent researchers and users must consider when dealing with intelligent agent technologies. === History === The concept of an agent can be traced back to Hewitt's Actor Model (Hewitt, 1977) - "A self-contained, interactive and concurrently-executing object, possessing internal state and communication capability." To be more academic, software agent systems are a direct evolution of Multi-Agent Systems (MAS). MAS evolved from Distributed Artificial Intelligence (DAI), Distributed Problem Solving (DPS) and Parallel AI (PAI), thus inheriting all characteristics (good and bad) from DAI and AI. John Sculley's 1987 "Knowledge Navigator" video portrayed an image of a relationship between end-users and agents. Being an ideal first, this field experienced a series of unsuccessful top-down implementations, instead of a piece-by-piece, bottom-up approach. The range of agent types is now (from 1990) broad: WWW, search engines, etc. == Examples of intelligent software agents == === Buyer agents (shopping bots) === Buyer agents travel around a network (e.g. the internet) retrieving information about goods and services. These agents, also known as 'shopping bots', work very efficiently for commodity products such as CDs, books, electronic components, and other one-size-fits-all products. Buyer agents are typically optimized to allow for digital payment services used in e-commerce and traditional businesses. === User agents (personal agents) === User agents, or personal agents, are intelligent agents that take action on your behalf. In this category belong those intelligent agents that already perform, or will shortly perform, the following tasks: Check your e-mail, sort it according to the user's order of preference, and alert you when important emails arrive. Play computer games as your opponent or patrol game areas for you. Assemble customized news reports for you. There are several versions of these, including CNN. Find information for you on the subject of your choice. Fill out forms on the Web automatically for you, storing your information for future reference Scan Web pages looking for and highlighting text that constitutes the "important" part of the information there Discuss topics with you ranging from your deepest fears to sports Facilitate with online job search duties by scanning known job boards and sending the resume to opportunities who meet the desired criteria Profile synchronization across heterogeneous social networks === Monitoring-and-surveillance (predictive) agents === Monitoring and surveillance agents are used to observe and report on equipment, usually computer systems. The agents may keep track of company inventory levels, observe competitors' prices and relay them back to the company, watch stock manipulation by insider trading and rumors, etc. For example, NASA's Jet Propulsion Laboratory has an agent that monitors inventory, planning, schedules equipment orders to keep costs down, and manages food storage facilities. These agents usually monitor complex computer networks that can keep track of the configuration of each computer connected to the network. A special case of monitoring-and-surveillance agents are organizations of agents used to automate decision-making process during tactical operations. The agents monitor the status of assets (ammunition, weapons available, platforms for transport, etc.) and receive goals from hi

    Read more →
  • Huawei Mobile Services

    Huawei Mobile Services

    Huawei Mobile Services (HMS) is a collection of proprietary services and high level application programming interfaces (APIs) developed by Huawei Technologies Co., Ltd. Its hub known as HMS Core serves as a toolkit for app development on Huawei devices. HMS is typically installed on Huawei devices on top of running HarmonyOS 4.x and earlier operating system on its earlier devices running the Android operating system with EMUI including devices already distributed with Google Mobile Services. Alongside, HMS Core Wear Engine for Android phones with lightweight based LiteOS wearable middleware app framework integration connectivity like notifications, status etc. HMS consists of seven key services and the HMS Core. The key services are Huawei ID, Huawei Cloud, AppGallery, Themes, Huawei Video, Browser, and Assistant. The web browser is based on Chromium. Huawei Quick Apps is the alternative to Google Instant Apps. By January 2020, over 50,000 apps had been integrated with HMS Core. Its rival, Google Mobile Services has 3 million apps on Google's Play Store. The AppGallery claimed 180 billion downloads in 2019. In March 2020, HMS was used by 650 million monthly active users across 170 countries. A Chinese phone manufacturer, LeTV, hosted a smartphone business communication meeting in Beijing on September 27, 2021, to demonstrate its phone, the LeTV S1. This was the first smartphone from a third-party manufacturer to include Huawei Mobile Services (HMS). == HMS on Android and HarmonyOS == Huawei Mobile Services on Android goes all the way back to August 2016 as Huawei ID services for phones, basic functionalities for Huawei P9 series. However, in May 2019 proved to be a significant change to HMS when Google was prohibited from working with Huawei on any new devices extending ecosystem for AppGallery store front launched in April 2018, year prior. This also included bundling Google's Apps, including Gmail, Maps and YouTube. Any new Huawei devices launched after 16 May 2019 were unable to receive updates from Google services and would be considered 'uncertified' meaning Huawei's only solution at the time was to turn HMS into a genuine competitor to Google and incentivize app developers to utilize the platform. Huawei officially launched Huawei Mobile Services in China on December 24, 2019, as a beta. Huawei expanded Huawei Mobile Services in Europe in February 2020 and other markets in Asia, Latin America, Middle East & Africa, Canada, Mexico followed outside banned US market. HMS is available on the Honor 9X Pro, View 30 Pro, Huawei Mate XS. HMS is also available, alongside GMS, on many other Huawei models launched before the ban. Huawei promised developers it would take, “less than 10 minutes", to port their app over to HMS - to illustrate the ease of portability between Google's Play Store and the HMS AppGallery. On January 15, 2020, HMS Core 4.0 (Huawei Mobile Services Core 4.0) was officially launched. Huawei announced that at this time, there were already 1.3 million developers and 55,000 applications on board. The next day, Huawei held a developer day event in London and invested £20 million to encourage developers in the United Kingdom and Ireland to use HMS. On July 15, 2021, Huawei expanded HMS with classic HarmonyOS dual-framework that provided Java support and eventually with JavaScript and ArkTS (eTS) language support with HMS Core 6.0 for app development with primarily Android apps, alongside limited HAP imperative developed based apps that shares AOSP file system libraries in all types of devices from smartphones, tablets, smart screens, smartwatches, and car machines. Including various third-party development frameworks, such as React Native, Cordova, etc. At HDC 2023, Huawei unveiled HarmonyOS 5, marking a total break from the hybrid Android derived platform. This shift replaced the legacy Android and classic HarmonyOS-based HMS SDK with a full native API developer kit SDK built solely on OpenHarmony. The architecture moved from middleware services to vertical integration path. In this new model, HMS Core libraries are no longer external add-ons but are bundled directly into the system and DevEco Studio as native HarmonyOS Kits. == HMS Core == HMS Core is a hub for Huawei Mobile Services and serves as a toolkit for app development on Huawei devices. The core comprises Development, Growth and Monetizing and was created as a replacement for Google Mobile Services (GMS) Core. HMS core services were available in more than 55,000 apps in June 2020; HMS Core 5.0 debuted in September 2020. HMS Core 6.0 was launched in June 2021 with extended support for Huawei Cloud services. In June 2021, the number of registered developers within the HMS ecosystem was 4 million, and the number of apps integrated with the HMS Core had reached 134,000. As of July 2022, registered developers within HMS ecosystem had grown to 5 million, and the number of apps integrated with the HMS Core reached 203,000. The number of apps had grown to 220,000 by 30 September 2022. == AppGallery == The AppGallery has a key rival, Google's Play Store on Android. The AppGallery is available in 170 countries, across 78 languages. == Reception == The reception of HMS is mixed, with the majority of discussion based around the key Google/Android apps which are not yet present on the AppGallery and whether or not this presents a significant problem to users. The open development of HMS Core has been regarded by some as benefiting the Android project as a whole, "If Huawei continues to invest in a holistically open approach ... the result could be that we could all end up a bit less beholden to Google".

    Read more →
  • Knowledge graph embedding

    Knowledge graph embedding

    In representation learning, knowledge graph embedding (KGE), also called knowledge representation learning (KRL), or multi-relation learning, is a machine learning task of learning a low-dimensional representation of a knowledge graph's entities and relations while preserving their semantic meaning. Leveraging their embedded representation, knowledge graphs can be used for various applications such as link prediction, triple classification, entity recognition, clustering, and relation extraction. == Definition == A knowledge graph G = { E , R , F } {\displaystyle {\mathcal {G}}=\{E,R,F\}} is a collection of entities E {\displaystyle E} , relations R {\displaystyle R} , and facts F {\displaystyle F} . A fact is a triple ( h , r , t ) ∈ F {\displaystyle (h,r,t)\in F} that denotes a link r ∈ R {\displaystyle r\in R} between the head h ∈ E {\displaystyle h\in E} and the tail t ∈ E {\displaystyle t\in E} of the triple. Another notation that is often used in the literature to represent a triple (or fact) is ⟨ head , relation , tail ⟩ {\displaystyle \langle {\text{head}},{\text{relation}},{\text{tail}}\rangle } . This notation is called the Resource Description Framework (RDF). A knowledge graph represents the knowledge related to a specific domain; leveraging this structured representation, it is possible to infer a piece of new knowledge from it after some refinement steps. However, nowadays, people have to deal with the sparsity of data and the computational inefficiency to use them in a real-world application. The embedding of a knowledge graph is a function that translates each entity and each relation into a vector of a given dimension d {\displaystyle d} , called embedding dimension. It is even possible to embed the entities and relations with different dimensions. The embedding vectors can then be used for other tasks. A knowledge graph embedding is characterized by four aspects: Representation space: The low-dimensional space in which the entities and relations are represented. Scoring function: A measure of the goodness of a triple-embedded representation. Encoding models: The modality in which the embedded representation of the entities and relations interact with each other. Additional information: Any additional information coming from the knowledge graph that can enrich the embedded representation. Usually, an ad hoc scoring function is integrated into the general scoring function for each additional piece of information. == Embedding procedure == All algorithms for creating a knowledge graph embedding follow the same approach. First, the embedding vectors are initialized to random values. Then, they are iteratively optimized using a training set of triples. In each iteration, a batch of size b {\displaystyle b} triples is sampled from the training set, and a triple from it is sampled and corrupted—i.e., a triple that does not represent a true fact in the knowledge graph. The corruption of a triple involves substituting the head or the tail (or both) of the triple with another entity that makes the fact false. The original triple and the corrupted triple are added in the training batch, and then the embeddings are updated, optimizing a scoring function. Iteration stops when a stop condition is reached. Usually, the stop condition depends on the overfitting of the training set. At the end, the learned embeddings should have extracted semantic meaning from the training triples and should correctly predict unseen true facts in the knowledge graph. === Pseudocode === The following is the pseudocode for the general embedding procedure. algorithm Compute entity and relation embeddings input: The training set S = { ( h , r , t ) } {\displaystyle S=\{(h,r,t)\}} , entity set E {\displaystyle E} , relation set R {\displaystyle R} , embedding dimension k {\displaystyle k} output: Entity and relation embeddings initialization: the entities e {\displaystyle e} and relations r {\displaystyle r} embeddings (vectors) are randomly initialized while stop condition do S b a t c h ← s a m p l e ( S , b ) {\displaystyle S_{batch}\leftarrow sample(S,b)} // Sample a batch from the training set for each ( h , r , t ) {\displaystyle (h,r,t)} in S b a t c h {\displaystyle S_{batch}} do ( h ′ , r , t ′ ) ← s a m p l e ( S ′ ) {\displaystyle (h',r,t')\leftarrow sample(S')} // Sample a corrupted fact T b a t c h ← T b a t c h ∪ { ( ( h , r , t ) , ( h ′ , r , t ′ ) ) } {\displaystyle T_{batch}\leftarrow T_{batch}\cup \{((h,r,t),(h',r,t'))\}} end for Update embeddings by minimizing the loss function end while == Performance indicators == These indexes are often used to measure the embedding quality of a model. The simplicity of the indexes makes them very suitable for evaluating the performance of an embedding algorithm even on a large scale. Given Q {\displaystyle {\ce {Q}}} as the set of all ranked predictions of a model, it is possible to define three different performance indexes: Hits@K, MR, and MRR. === Hits@K === Hits@K or in short, H@K, is a performance index that measures the probability to find the correct prediction in the first top K model predictions. Usually, it is used k = 10 {\displaystyle k=10} . Hits@K reflects the accuracy of an embedding model to predict the relation between two given triples correctly. Hits@K = | { q ∈ Q : q < k } | | Q | ∈ [ 0 , 1 ] {\displaystyle ={\frac {|\{q\in Q:q Read more →

  • AI data center

    AI data center

    An AI data center is a specialized data center facility designed for the computationally intensive tasks of training and running inference for artificial intelligence (AI) and machine learning models. Unlike general-purpose data centers, they are optimized for the parallel processing demands of AI workloads, typically using hardware such as AI accelerators (e.g., GPUs, TPUs) and high-speed interconnects. The global push to construct these specialized facilities accelerated dramatically during the AI boom of the 2020s. Memory manufacturers prioritized production of High Bandwidth Memory (HBM) essential for AI servers, which led to a global memory supply shortage amid a broader competition for advanced chips, power, and infrastructure. Major tech companies are estimated to spend $650 billion on AI data centers in 2026. == Architecture == Data centers for building and running large machine learning models contain specialized computer chips, GPUs, that use 2 to 4 times as much energy as their regular CPU counterparts (250-500 watts). AI data centers use 60 or more kilowatts per server rack, whereas more standard data centers typically use 5 to 10 kilowatts per rack. == Operators == As of August 2025, The Information tracked 18 planned or existing AI data centers in the United States, operated by Amazon Web Services, CoreWeave, Crusoe, Meta, Microsoft/OpenAI, Oracle, Tesla, and xAI. Other AI data center operators include Digital Realty and Alibaba. Data centers are also being built in China, India, Europe, Saudi Arabia, and Canada. The New Yorker described CoreWeave as the most prominent AI data center operator in the United States. Two types of data center providers for machine learning have been noted: hyperscalers and neoclouds. The Verge listed large technology companies such as Google, Meta, Microsoft, Oracle and Amazon as hyperscalers. The New York Times described neoclouds as "a new generation of data center providers". CoreWeave, Nebius, Nscale, and Lambda have been described as examples of neoclouds. In January 2025, OpenAI, in partnership with Oracle and Softbank, announced the Stargate project, which as of September 2025 is composed of six built or proposed AI data centers in the United States. In response to the Stargate project, Amazon launched in October 2025 an AI data center on 1,200 acres of farmland in Indiana. This data center, known as Project Rainier, is one of the largest AI data centers in the world, with Amazon spending $11 billion on the project. Rainier is specifically intended for training and running machine learning models from Anthropic. As of that time, this facility contains seven data centers (out of an estimated 30 planned) and will use 2.2 gigawatts of electricity (equivalent to 1 million households) and millions of gallons of water per year. Computer chips from Annapurna Labs and Anthropic, Trainium 2, were designed for use in such facilities. Amazon pumped millions of gallons of water out of the ground to construct the data center, and as of June 2025, Indiana state officials are investigating whether this dewatering process led to dry wells for local residents. In November 2025, Anthropic announced a plan in partnership with Fluidstack to develop artificial intelligence infrastructure in the United States, including data centers in New York and Texas, worth $50 billion. Other AI data center projects include the Colossus supercomputer from xAI, a Louisiana-based project from Meta, Hyperion, expected to use 5 GW of power, and a second Ohio-based Meta project, Prometheus, with a capacity of 1 GW. A 3,200-acre AI data center, capable of 4.4-4.5 GW of power and located on the decommissioned Homer City Generating Station, is under construction as of 2025, and will use seven 30-acre gas generating stations supplied by EQT. As of December 2025, CRH is working on over 100 data centers in the United States. In 2025, ExxonMobil and NextEra announced plans to build a data center powered by natural gas and using carbon capture technology, with 1.2 GW of power capacity. They previously purchased 2,500 acres of land in the Southeastern United States and plan to market the data center to an artificial intelligence company. The increased interest in AI data centers has led to several executives from companies in that space becoming billionaires, including CoreWeave, QTS, Nebius, Astera Labs, Groq, Fermi (which is connected to former United States Secretary of Energy Rick Perry), Snowflake and Cipher Mining. Several companies involved in cryptocurrency mining, such as Bitdeer, CoreWeave, Cipher Mining, TeraWulf, IREN, Core Scientific, and CleanSpark have also been involved with AI data centers. == Finances == Between January and August 2024, Microsoft, Meta, Google and Amazon collectively spent $125 billion on AI data centers. Citigroup forecasted that $2.8 trillion would be spent on AI data centers by 2030, while McKinsey and Company estimated that almost $7 trillion would be spent globally by that time. According to S&P Global, $61 billion has been spent on the data center market as a whole in 2025, while debt issuance for data centers was $182 billion during the same year. Large technology companies have offloaded the financial risks of building AI data centers by setting up special purpose vehicles or by contracting with neoclouds. For example, Meta's Hyperion was mostly funded by Blue Owl Capital, which did so using a bond offering from PIMCO. Those bonds were sold to a number of clients, including BlackRock. Meta did not borrow money itself and instead established a special purpose vehicle from which it would rent the data center. This deal was structured by Morgan Stanley for $30 billion, the largest known private capital transaction as of 2025. Neoclouds such as CoreWeave have gone into debt to buy computer chips from Nvidia for their data centers, and the chips themselves have been used for loan collateral. As of December 2025, CoreWeave took out three GPU-backed loans, collectively worth $12.4 billion, from private credit firms (Blackstone, Coatue, BlackRock, PIMCO) and from banks (Goldman Sachs, JPMorgan Chase, Wells Fargo). Thus, these companies provide an indirect connection between private credit and established banks. Data centers have also established asset-backed securities, and debt for data centers has its own derivative financial products. The real estate industry, including asset managers, public companies and private investors, has also invested in data centers. == Energy sourcing == == Environmental footprint == Average AI data centers have an electricity footprint equivalent to 100,000 households, and use billions of gallons of water for cooling their hardware. In 2025, the International Energy Agency estimated that the larger AI data centers currently under construction could consume as much electricity as 2 million households. A 2024 report from the United States Department of Energy stated that data centers overall used 17 billion gallons of water per year in the United States, primarily due to "rapid proliferation of AI servers", and that this usage was forecasted to grow to nearly 80 billion gallons by 2028. Researchers estimated that AI data centers in the United States would emit 24-44 million metric tons of carbon dioxide and use 731–1,125 million cubic meters of water per year between 2024 and 2030. Peaking power plants, which have been proposed as a power source for AI data centers, emit sulfur dioxide and have historically been located disproportionately near communities of color in the United States. Reciprocating internal combustion engines, proposed as another power source for a data center, emit PM 2.5, nitrogen oxides, and volatile organic compounds. == AI data centers in the United States == In the United States, both the Biden administration and second Trump administration supported the construction of AI data centers. In January 2025, then-president Joe Biden signed an executive order for federal government agencies to support AI data centers on federal sites built by private companies, study their effect on energy prices, and encourage their use of renewable energy. In April 2025, the United States Department of Energy suggested 16 possible sites, including Los Alamos National Laboratory, Sandia National Laboratories and Oak Ridge National Laboratory. In its July 2025 AI Action Plan, the second Trump administration supported increased production of AI data centers. Several US states have incentivized local data center construction. For example, in 2024, lawmakers in Michigan approved tax breaks for data center equipment and construction material. Some data center companies have also invested or promised to invest in the infrastructure of local communities. In December 2025, Democratic senators Elizabeth Warren, Chris Van Hollen, and Richard Blumenthal wrote to seven technology companies (Google, Microsoft, Amazon, Meta, CoreWeave, Digital Realty, and Equinix) that they w

    Read more →
  • Human–AI interaction

    Human–AI interaction

    Human–AI interaction is a developing field of research and a sub-field of human–computer interaction (HCI). HCI is a field of research that explores the interactions between humans and computer-based technology, focusing on design implementation, user experience, and psychological factors. With the proliferation of artificial intelligence (AI), there has developed a sub-section of HCI research dedicated specifically to artificial intelligence and how people interact with and are impacted by it. This is human–AI interaction, abbreviated either as HAX or HAII. == Introduction == Artificial intelligence (AI), in general, has fluid definitions and varied research applications, but in brief can be applied to mechanizing tasks that would require human intelligence to complete. AI are tools designed to replicate the human abilities of navigating uncertainty, active learning, and processing information in different contexts. Within the context of HCI and HAX research, artificial intelligence can be broken into two sub-fields, natural language processing (NLP) and computer vision (CV). AI technologies notably include machine-learning, deep-learning and neural networks, and large-language models (LLMs). As a new and rapidly developing technology, AI is changing how computers work and therefore changing how humans interact with computers. Unlike the traditional human-computer interaction, where a human directs a machine, human-AI interaction is characterized by a more collaborative relationship between the computer program (the AI) and the human user, as AI is perceived as an active agent rather than a tool. This changing dynamic creates new questions and necessitates new research methods that are not present in traditional HCI research. According to a scoping review on the state of the discipline, the HAX field comprises research on the "design, development, and evaluation of AI systems" and encompasses the themes of human-AI collaboration, human-AI competition, human-AI conflict, and human-AI symbiosis. == Design == Machine learning and artificial intelligence have been used for decades in targeted advertising and to recommend content in social media. Ethical Guidelines (Framework for ethical AI development) == User Experience (UX) == This section should handle research on how users interact with tools. What techniques do they use, do they develop habits, what types of programs and devices are they using to access these tools, what do they use these tools to do exactly. === Cognitive Frameworks in AI Tool Users === AI has been viewed with various expectations, attributions, and often misconceptions. Many people exclusively understand AI as the LLM chatbots they interact with, like ChatGPT or Claude, or other generative AI programs. [Insert section: discuss how people interact with these specific AI tools as a connection to the following paragraphs] Most fundamentally, humans have a mental model of understanding AI's reasoning and motivation for its decision recommendations, and building a holistic and precise mental model of AI helps people create prompts to receive more valuable responses from AI. However, these mental models are not whole because people can only gain more information about AI through their limited interaction with it; more interaction with AI builds a better mental model that a person may build to produce better prompt outcomes. Research on human-AI interaction has emphasized that users develop mental models of AI systems and revise those models through repeated use, feedback, and explanation, while design research has stressed the importance of communicating capabilities and limitations early and supporting trust calibration through explanation and correction. In a 2025 SSRN working paper, John DeVadoss proposed "Hypothetico-Deductive Interaction" (HDI), a framework that describes human-AI interaction as a mutual process of conjecture and refutation in which users test assumptions about an AI system's capabilities while the system infers and updates assumptions about user goals through its responses and clarifying questions. DeVadoss argued that this framing helps explain prompt iteration, weak capability awareness, and trust miscalibration, and suggested design responses such as clearer communication of uncertainty, easier correction, actionable explanations, and safer failure modes. == Research themes == === Human-AI collaboration === Human-AI collaboration occurs when the human and AI supervise the task on the same level and extent to achieve the same goal. Some collaboration occurs in the form of augmenting human capability. AI may help human ability in analysis and decision-making through providing and weighing a volume of information, and learning to defer to the human decision when it recognizes its unreliability. It is especially beneficial when the human can detect a task that AI can be trusted to make few errors so that there is not a lot of excessive checking process required on the human's end. Some findings show signs of human-AI augmentation, or human–AI symbiosis, in which AI enhances human ability in a way that co-working on a task with AI produces better outcomes than a human working alone. For example: the quality and speed of customer service tasks increase when a human agent collaborates with AI, training on specific models allows AI to improve diagnoses in clinical settings, and AI with human-intervention can improve creativity of artwork while fully AI-generated haikus were rated negatively. Human-AI synergy, a concept in which human-AI collaboration would produce more optimal outcomes than either human or AI working alone could explain why AI does not always help with performance. Some AI features and development may accelerate human-AI synergy, while others may stagnate it. For example, when AI updates for better performance, it sometimes worsens the team performance with human and AI by reducing the compatibility with the new model and the mental model a user has developed on the previous version. Research has found that AI often supports human capabilities in the form of human-AI augmentation and not human-AI synergy, potentially because people rely too much on AI and stop thinking on their own. Prompting people to actively engage in analysis and think when to follow AI recommendations reduces their over-reliance, especially for individuals with higher need for cognition. === Human-AI competition === Robots and computers have substituted routine tasks historically completed by humans, but agentic AI has made it possible to also replace cognitive tasks including taking phone calls for appointments and driving a car. At the point of 2016, research has estimated that 45% of paid activities could be replaced by AI by 2030. Perceived autonomy of robots is known to increase people's negative attitude toward them, and worry about the technology taking over leads people to reject it. There has been a consistent tendency of algorithm aversion in which people prefer human advice over AI advice. However, people are not always able to tell apart tasks completed by AI or other humans. See AI takeover for more information. It is also notable that this sentiment is more prominent in the Western cultures as Westerners tend to show less positive views about AI compared to East Asians. == Research on the psychological impacts of AI == === Perception on others who use AI === As much as people perceive and make judgment about AI itself, they also form impressions of themselves and others who use AI. In the workplace, employees who disclose the use of AI in their tasks are more likely to receive feedback that they are not as hardworking as those who are in the same job who receive non-AI help to complete the same tasks. AI use disclosure diminishes the perceived legitimacy in the employee's task and decision making which ultimately leads observers to distrust people who use AI. Although these negative effects of AI use disclosure are weakened by the observers who use AI frequently themselves, the effect is still not attenuated by the observers' positive attitude towards AI. === Bias, AI, and human === Although AI provides a wide range of information and suggestions to its users, AI itself is not free of biases and stereotypes, and it does not always help people reduce their cognitive errors and biases. People are prone to such errors by failing to see other potential ideas and cases that are not listed by AI responses and committing to a decision suggested by AI that directly contradicts the correct information and directions that they are already aware of. Gender bias is also reflected as the female gendering of AI technologies which conceptualizes females as a helpful assistant. == Emotional connection with AI == Human-AI interaction has been theorized in the context of interpersonal relationships mainly in social psychology, communications and media studies, and as a technology interface through the lens of hu

    Read more →
  • Video Super Resolution

    Video Super Resolution

    RTX Video Super Resolution (RTX VSR) is a video scaling feature by Nvidia. It was released on February 28, 2023. == History == The feature was first unveiled during CES 2023 as RTX Video Super Resolution. It uses the on-board Tensor Cores to upscale browser video content in real time. Video Super Resolution was initially only available on RTX 30 and 40 series GPUs, while support for 20 series GPUs was added afterwards; it is now available on all Nvidia RTX-branded GPUs. The feature supports input resolutions from 360p to 1440p and a max output of 4K and comes without support for HDR content although that could be likely added in the future. Nvidia released RTX Video Super Resolution 1.5 with improved video quality and RTX 20 series support on October 17, 2023. == Reception == According to ComputerBase, although "the algorithm is not yet working flawlessly", the feature is "overall recommendable".

    Read more →
  • Wadhwani Institute for Artificial Intelligence

    Wadhwani Institute for Artificial Intelligence

    Wadhwani AI, based in Mumbai, Maharashtra, is an independent, non-profit institute. Founded in 2018, it is dedicated to developing Artificial intelligence solutions for social good. Their mission is to build AI-based innovations and solutions for underserved communities in developing countries, for a wide range of domains including agriculture, education, financial inclusion, healthcare, and infrastructure. == History and funding == The institute was founded with a $30 million philanthropic effort by the Wadhwani brothers, Romesh Wadhwani and Sunil Wadhwani. The institute was inaugurated and dedicated to the nation by Narendra Modi, the 14th Prime Minister of India. In 2019, the institute received a $2 million grant from Google.org to create technologies to help reduce crop losses in cotton farming, through integrated pest management. The United States Agency for International Development awarded $2 million to the institute in 2020 to develop tools, using mathematical modeling techniques and digital technologies such as artificial intelligence and machine learning, to forecast COVID-19 disease patterns, estimate resources needed, and plan interventions. == Collaboration == With assistance from Google, the Ministry of Agriculture and Farmers' Welfare and the Wadhwani AI developed Krishi 24/7, the first AI-powered automated agricultural news monitoring and analysis tool. Through better decision-making, Krishi 24/7 will support the identification of valuable news, provide timely notifications, and respond quickly to safeguard farmers' interests and advance sustainable agricultural growth. The application converts news articles into English after scanning them in several languages. It ensures that the ministry is informed in a timely manner about pertinent occurrences that are published online by extracting key information from news items, including the headline, crop name, event type, date, location, severity, summary, and source link. The National Center for Disease Control has effectively implemented a comparable automated surveillance and analysis tool for disease outbreaks.

    Read more →
  • Artificial Inventor Project

    Artificial Inventor Project

    The Artificial Inventor Project (AIP) is a global legal initiative headed by Professor Ryan Abbott dedicated to pursuing intellectual property (IP) rights for inventions and creative works generated autonomously by artificial intelligence (AI) systems without traditional human inventorship or authorship. The project coordinates a series of pro bono test cases worldwide, aiming to prompt law reform and public debate on how IP law should accommodate non-human creators. == History == In 2019, AIP filed patent applications in multiple jurisdictions, including the United States, United Kingdom, European Patent Office, Australia, Switzerland, and South Africa, naming the AI system DABUS (Device for the Autonomous Bootstrapping of Unified Sentience), created by Stephen Thaler, as the inventor. The aim was to challenge legal norms that require inventors to be natural persons and highlight pressing policy questions about AI-generated innovation and IP regimes. == Legal proceedings by jurisdiction == === Australia === In July 2021, a Federal Court of Australia judge (Beach J) ruled that AI can be considered an inventor under the Patents Act 1990, ordering IP Australia to reinstate the relevant patent. However, the full court then overturned this ruling on appeal and denied further review. === European Patent Office === The EPO Board of Appeal determined in 2022 that only a human inventor may be named, rendering DABUS‑based applications unacceptable. === South Africa === In 2021, a patent was granted listing DABUS as the inventor. As South Africa’s procedural system does not involve substantive inventorship review, the grant proceeded on formal grounds alone. === Switzerland === On 26 June 2025, the Swiss Federal Administrative Court ruled that artificial intelligence systems such as DABUS cannot be listed as inventors on patent applications. The court upheld the existing practice of the Swiss Federal Institute of Intellectual Property (IPI), affirming that only natural persons may be recognized as inventors under Swiss patent law. === United Kingdom === In December 2023, the UK Supreme Court unanimously held that AI systems cannot be legally recognized as inventors, affirming that "an inventor must be a person" under current British law. === United States === In Thaler v. Hirshfeld (2021), a U.S. federal court agreed with the USPTO that inventors must be natural persons, rejecting the DABUS application and setting a precedent consistent with existing statute and administrative policy. == Criticism and impact == The project has fueled substantial discourse. Critics caution that allowing AI inventorship may complicate notions of accountability and ownership. Proponents argue that legal recognition must evolve to avoid disincentivizing innovation produced by AI and to maintain honesty about the true source of invention.

    Read more →
  • PagedAttention

    PagedAttention

    PagedAttention is an attention algorithm for efficient serving of large language models (LLMs). It was introduced in 2023 by Woosuk Kwon and colleagues in the paper Efficient Memory Management for Large Language Model Serving with PagedAttention, alongside the vLLM serving engine. The method stores the key–value cache used during autoregressive decoding in fixed-size blocks that can be mapped to non-contiguous physical memory, borrowing ideas from virtual memory, paging, and operating system design. == Background == In transformer inference, the key–value cache grows with sequence length and the number of concurrent requests. Kwon et al. argued that earlier serving systems typically reserved contiguous cache regions in advance, which caused reserved space, internal fragmentation, and external fragmentation. In their experiments, the paper reported that the effective memory utilization of previous systems could fall as low as 20.4%. == Description == PagedAttention partitions the cache of each sequence into fixed-size KV blocks. A request's cache is represented as a sequence of logical blocks, while a block table maps those logical blocks to physical GPU-memory blocks. As a result, neighboring logical blocks do not need to be contiguous in physical memory, and new blocks can be allocated on demand as generation proceeds. The design also makes it easier to share cache state across related decoding paths. In vLLM, physical blocks can be reference-counted and shared among requests or branches, with block-granularity copy-on-write used when a shared block must be modified. The original paper applied this design to parallel sampling, beam search, and prompts with shared prefixes. == Mathematical formulation == For a query token i {\displaystyle i} in causal self-attention, the standard attention output can be written as a i j = exp ⁡ ( q i ⊤ k j / d ) ∑ t = 1 i exp ⁡ ( q i ⊤ k t / d ) , o i = ∑ j = 1 i a i j v j {\displaystyle a_{ij}={\frac {\exp(\mathbf {q} _{i}^{\top }\mathbf {k} _{j}/{\sqrt {d}})}{\sum _{t=1}^{i}\exp(\mathbf {q} _{i}^{\top }\mathbf {k} _{t}/{\sqrt {d}})}},\;\mathbf {o} _{i}=\sum _{j=1}^{i}a_{ij}\mathbf {v} _{j}} where q i {\displaystyle \mathbf {q} _{i}} , k j {\displaystyle \mathbf {k} _{j}} , and v j {\displaystyle \mathbf {v} _{j}} are the query, key, and value vectors, and d {\displaystyle d} is the attention dimension. If the cache is partitioned into blocks of size B {\displaystyle B} , the key and value blocks may be written as K j = ( k ( j − 1 ) B + 1 , … , k j B ) , V j = ( v ( j − 1 ) B + 1 , … , v j B ) {\displaystyle \mathbf {K} _{j}=(\mathbf {k} _{(j-1)B+1},\ldots ,\mathbf {k} _{jB}),\;\mathbf {V} _{j}=(\mathbf {v} _{(j-1)B+1},\ldots ,\mathbf {v} _{jB})} PagedAttention then performs the computation blockwise: A i j = exp ⁡ ( q i ⊤ K j / d ) ∑ t = 1 ⌈ i / B ⌉ exp ⁡ ( q i ⊤ K t / d ) , o i = ∑ j = 1 ⌈ i / B ⌉ V j A i j ⊤ {\displaystyle \mathbf {A} _{ij}={\frac {\exp(\mathbf {q} _{i}^{\top }\mathbf {K} _{j}/{\sqrt {d}})}{\sum _{t=1}^{\lceil i/B\rceil }\exp(\mathbf {q} _{i}^{\top }\mathbf {K} _{t}/{\sqrt {d}})}},\;\mathbf {o} _{i}=\sum _{j=1}^{\lceil i/B\rceil }\mathbf {V} _{j}\mathbf {A} _{ij}^{\top }} where A i j {\displaystyle \mathbf {A} _{ij}} is the vector of attention scores for the j {\displaystyle j} -th KV block. In the formulation given by Kwon et al., this preserves the causal attention calculation while allowing the key and value blocks to reside in non-contiguous physical memory. == Performance and use == The vLLM paper reported that, on its evaluated workloads, the use of PagedAttention and the associated memory-management design improved serving throughput by 2–4× over the compared baselines, including FasterTransformer and Orca, while preserving model outputs. In experiments on OPT-13B with the Alpaca trace, the paper also reported memory savings of 6.1–9.8% for parallel sampling and 37.6–55.2% for beam search through KV-block sharing. A 2024 survey of LLM serving systems described PagedAttention as having become an industry norm in LLM serving frameworks, citing support in TGI, vLLM, and TensorRT-LLM. == Limitations and alternatives == Subsequent work has described trade-offs in the approach. The 2025 vAttention paper argued that PagedAttention requires attention kernels to be rewritten to support paging and increases software complexity, portability issues, redundancy, and execution overhead, proposing instead a memory manager that keeps the cache contiguous in virtual memory while relying on demand paging for physical allocation. === vAttention === Unlike PagedAttention, vAttention does not introduce a different attention rule; it retains the standard attention computation Attention ⁡ ( q i , K , V ) = softmax ⁡ ( q i K ⊤ s c a l e ) V . {\displaystyle \operatorname {Attention} (q_{i},K,V)=\operatorname {softmax} \left({\frac {q_{i}K^{\top }}{\mathrm {scale} }}\right)V.} In the notation of Prabhu et al., the key and value tensors for a request seen so far are K , V ∈ R L ′ × ( H × D ) {\displaystyle K,V\in \mathbb {R} ^{L'\times (H\times D)}} , where L ′ {\displaystyle L'} is the context length seen so far, H {\displaystyle H} is the number of KV heads on a worker, and D {\displaystyle D} is the dimension of each KV head. In systems prior to PagedAttention, the K cache (or V cache) at each layer of a worker is typically allocated as a 4D tensor of shape [ B , L , H , D ] , {\displaystyle [B,L,H,D],} where B {\displaystyle B} is batch size and L {\displaystyle L} is the maximum context length supported by the model. vAttention preserves this contiguous virtual-memory view while deferring physical-memory allocation to runtime. A serving framework maintains separate K and V tensors for each layer, so vAttention reserves 2 N {\displaystyle 2N} virtual-memory buffers on a worker, where N {\displaystyle N} is the number of layers managed by that worker. The maximum size of one virtual-memory buffer is B S = B × S , {\displaystyle BS=B\times S,} where S {\displaystyle S} is the maximum size of a single request's per-layer K cache (or V cache) on a worker. The paper defines S = L × H × D × P , {\displaystyle S=L\times H\times D\times P,} where P {\displaystyle P} is the number of bytes needed to store one element. In this formulation, vAttention keeps the KV cache contiguous in virtual memory and relies on demand paging for physical allocation, rather than modifying the attention kernel to operate over non-contiguous KV-cache blocks.

    Read more →
  • ARIS Express

    ARIS Express

    ARIS Express is a free-of-charge modeling tool for business process analysis and management. It supports different modeling notations such as BPMN 2, Event-driven Process Chains (EPC), Organizational charts, process landscapes, whiteboards, etc. ARIS Express was initially developed by IDS Scheer, which was bought by Software AG in December 2010. The tool is provided as freeware on the ARIS Community webpage. ARIS Express is notable - having been mentioned in research published by Schumm, Garcia, Krumnow and Greenwood amongst others. == History == ARIS Express was first announced on April 28, 2009 in a press release by IDS Scheer. The first release was on July 28, 2009 in a public beta test on ARIS Community. Only people, who registered before for the beta test were allowed to download and test this beta version. This closed beta test was followed with another public beta test. The official release of ARIS Express 1.0 was on September 9, 2009. In this first stable version, features such as Microsoft Visio import were added, which were not present in the version for the public beta test. On February 26, 2010, ARIS Express 2.0 was released. Major changes compared to version 1.0 include BPMN 2 support, integrated spellchecking and ARISalign integration. On May 25, 2010, version 2.1 of ARIS Express was released. This update improves BPMN 2 support, provides a new online help system for instant feedback, better ARISalign integration and some new symbols in different diagrams. Along with the release, a poster showing the most important modeling concepts supported by ARIS Express was released. In addition, an executable setup is provided for Microsoft Windows-based systems. Beginning of July, an update was released as ARIS Express 2.2, providing bug fixes only. ARIS Express version 2.2 is the current stable release. An official press release published mid of August 2010 said there are more than 50,000 downloads of ARIS Express. On February 2, 2011, version 2.3 of ARIS Express was released. This new version changes the file format of ARIS Express so that models can be shown in an interactive model viewer in ARIS Community. The release announcement contained no details about additional features or changes. == Functionality == === Overview === ARIS Express is a standalone single-user application. It is divided in a home screen and a modeling environment. The home screen is used to create new models or open recently edited ones. The modeling environment is used to edit diagrams. === Supported notations === The following notations are supported by ARIS Express. Users can create diagrams containing an unlimited number of modeling objects. BPMN 2 Collaboration Diagrams Event-driven Process Chains (EPC) Organizational charts Process landscape (value-added chain diagram) Data model in ERM notation IT infrastructure (network diagram) System landscape (component diagram) Whiteboard General diagram === Noteworthy features === Besides common features such as creating new diagrams, saving them as files or adding objects to the modeling canvas, ARIS Express also provides some noteworthy features, which can't be found in most comparable modeling tools. fragments - Often used modeling constructs such as an exclusive decision in a process model can be stored as fragments so that they are available for direct reuse in another model. smart designs - The flow of a process model or hierarchies of other models can be captured in a spreadsheet-like interface. While entering the data in the spreadsheet, the model is generated and laid out in the background while typing. mini toolbar - While moving the mouse pointer over an object in a diagram, a small toolbar is shown allowing quick access to the most important modeling actions. Microsoft Visio import - Diagrams created with Microsoft Visio 2007 or above can be imported to and edited in ARIS Express. A Microsoft Visio export is not provided. ARISalign import - Models created on the online collaboration platform ARISalign can be opened and edited in ARIS Express. === Exports === ARIS Express can export diagrams to different formats such as: PDF JPEG PNG EMF ADF ADF is the file format of ARIS Express. The professional tools of ARIS Platform are able to import diagrams stored in the ADF format. Yet, there are major limitations during import - namely, each object in diagram will be treated as unique object, despite having same type and name, forcing redrawing large sections of diagrams after import. Besides export formats, it is also possible to use the clipboard to copy and paste an ARIS Express diagram into typical office suites such as Microsoft PowerPoint. == Technology == ARIS Express is a Java-based application, which shares some of the features of ARIS Platform products such as ARIS Business Architect and ARIS Business Designer. In contrast to ARIS Platform products, ARIS Express doesn't use a central database for model storage. Instead, each diagram is stored in an ADF file. ARIS Express uses Java Web Start. After download, the application can be started immediately without installation procedure. For Microsoft Windows based systems, an ordinary setup is provided, too. ARIS Express requires Java 1.6.10 or above. On first startup, the user must enter a valid ARIS Community account to register the application. Creating an ARIS Community account is free-of-charge. After installation, no Internet connection is needed to use ARIS Express. ARIS Express uses a mechanism provided by Java Web Start to automatically update the application as soon as a new version becomes available and the user is connected to the Internet during startup. There are reports that this automated update failed while upgrading from version 1.0 to version 2.0. As ARIS Express is based on Java Web Start, it can be installed on any platform supported by Java. The ARIS Community and other Internet sources have reports of successful deployment of ARIS Express on other operating systems than Microsoft Windows. However, ARIS Express is officially supported only on Microsoft Windows. == Miscellaneous == A quick reference sheet is available for ARIS Express. The poster shows all supported diagrams plus the most important modelling concepts for each supported modelling language. ARIS Express contains a hidden game, a so-called Easter Egg. The game can be started by clicking several times on the product logo in the about dialog. Highscores achieved in the game can be submitted to a special page in ARIS Community. A Firefox Personas is available for ARIS Express.

    Read more →
  • Empowerment (artificial intelligence)

    Empowerment (artificial intelligence)

    Empowerment in the field of artificial intelligence formalises and quantifies (via information theory) the potential an agent perceives that it has to influence its environment. An agent which follows an empowerment maximising policy, acts to maximise future options (typically up to some limited horizon). Empowerment can be used as a (pseudo) utility function that depends only on information gathered from the local environment to guide action, rather than seeking an externally imposed goal, thus is a form of intrinsic motivation. The empowerment formalism depends on a probabilistic model commonly used in artificial intelligence. An autonomous agent operates in the world by taking in sensory information and acting to change its state, or that of the environment, in a cycle of perceiving and acting known as the perception-action loop. Agent state and actions are modelled by random variables ( S : s ∈ S , A : a ∈ A {\displaystyle S:s\in {\mathcal {S}},A:a\in {\mathcal {A}}} ) and time ( t {\displaystyle t} ). The choice of action depends on the current state, and the future state depends on the choice of action, thus the perception-action loop unrolled in time forms a causal bayesian network. == Definition == Empowerment ( E {\displaystyle {\mathfrak {E}}} ) is defined as the channel capacity ( C {\displaystyle C} ) of the actuation channel of the agent, and is formalised as the maximal possible information flow between the actions of the agent and the effect of those actions some time later. Empowerment can be thought of as the future potential of the agent to affect its environment, as measured by its sensors. E := C ( A t ⟶ S t + 1 ) ≡ max p ( a t ) I ( A t ; S t + 1 ) {\displaystyle {\mathfrak {E}}:=C(A_{t}\longrightarrow S_{t+1})\equiv \max _{p(a_{t})}I(A_{t};S_{t+1})} In a discrete time model, Empowerment can be computed for a given number of cycles into the future, which is referred to in the literature as 'n-step' empowerment. E ( A t n ⟶ S t + n ) = max p ( a t , . . . , a t + n − 1 ) I ( A t , . . . , A t + n − 1 ; S t + n ) {\displaystyle {\mathfrak {E}}(A_{t}^{n}\longrightarrow S_{t+n})=\max _{p(a_{t},...,a_{t+n-1})}I(A_{t},...,A_{t+n-1};S_{t+n})} The unit of empowerment depends on the logarithm base. Base 2 is commonly used in which case the unit is bits. === Contextual Empowerment === In general the choice of action (action distribution) that maximises empowerment varies from state to state. Knowing the empowerment of an agent in a specific state is useful, for example to construct an empowerment maximising policy. State-specific empowerment can be found using the more general formalism for 'contextual empowerment'. C {\displaystyle C} is a random variable describing the context (e.g. state). E ( A t n ⟶ S t + n ∣ C ) = ∑ c ∈ C p ( c ) E ( A t n ⟶ S t + n ∣ C = c ) {\displaystyle {\mathfrak {E}}(A_{t}^{n}\longrightarrow S_{t+n}{\mid }C)=\sum _{c{\in }C}p(c){\mathfrak {E}}(A_{t}^{n}\longrightarrow S_{t+n}{\mid }C=c)} == Application == Empowerment maximisation can be used as a pseudo-utility function to enable agents to exhibit intelligent behaviour without requiring the definition of external goals, for example balancing a pole in a cart-pole balancing scenario where no indication of the task is provided to the agent. Empowerment has been applied in studies of collective behaviour and in continuous domains. As is the case with Bayesian methods in general, computation of empowerment becomes computationally expensive as the number of actions and time horizon extends, but approaches to improve efficiency have led to usage in real-time control. Empowerment has been used for intrinsically motivated reinforcement learning agents playing video games, and in the control of underwater vehicles.

    Read more →
  • TinyML

    TinyML

    TinyML (short for tiny machine learning) is an area of machine learning that focuses on deploying and running models on low-power, resource-constrained embedded systems such as microcontrollers and edge devices. TinyML supports on-device inference with low latency and minimal reliance on cloud connectivity, which makes it suitable for applications in the Internet of Things (IoT), wearable devices, and real-time systems. == History == The idea of running machine learning models on embedded systems has gained traction in the late 2010s, as model compression, quantization, and efficient neural network architectures progressed. The term TinyML was popularized in 2019 with the publication of the book TinyML by Pete Warden and Daniel Situnayake and the creation of the TinyML Foundation.

    Read more →