AI Data Center Zoning

AI Data Center Zoning — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Image fusion

    Image fusion

    The image fusion process is defined as gathering all the important information from multiple images, and their inclusion into fewer images, usually a single one. This single image is more informative and accurate than any single source image, and it consists of all the necessary information. The purpose of image fusion is not only to reduce the amount of data but also to construct images that are more appropriate and understandable for the human and machine perception. In computer vision, multisensor image fusion is the process of combining relevant information from two or more images into a single image. The resulting image will be more informative than any of the input images. In remote sensing applications, the increasing availability of space borne sensors gives a motivation for different image fusion algorithms. Several situations in image processing require high spatial and high spectral resolution in a single image. Most of the available equipment is not capable of providing such data convincingly. Image fusion techniques allow the integration of different information sources. The fused image can have complementary spatial and spectral resolution characteristics. However, the standard image fusion techniques can distort the spectral information of the multispectral data while merging. In satellite imaging, two types of images are available. The panchromatic image acquired by satellites is transmitted with the maximum resolution available and the multispectral data are transmitted with coarser resolution. This will usually be two or four times lower. At the receiver station, the panchromatic image is merged with the multispectral data to convey more information. Many methods exist to perform image fusion. The very basic one is the high-pass filtering technique. Later techniques are based on Discrete Wavelet Transform, uniform rational filter bank, and Laplacian pyramid. == Motivation == Multi sensor data fusion has become a discipline which demands more general formal solutions to a number of application cases. Several situations in image processing require both high spatial and high spectral information in a single image. This is important in remote sensing. However, the instruments are not capable of providing such information either by design or because of observational constraints. One possible solution for this is data fusion. == Methods == Image fusion methods can be broadly classified into two groups – spatial domain fusion and transform domain fusion. The fusion methods such as averaging, Brovey method, principal component analysis (PCA) and IHS based methods fall under spatial domain approaches. Another important spatial domain fusion method is the high-pass filtering based technique. Here the high frequency details are injected into upsampled version of MS images. The disadvantage of spatial domain approaches is that they produce spatial distortion in the fused image. Spectral distortion becomes a negative factor while we go for further processing, such as classification problem. Spatial distortion can be very well handled by frequency-domain approaches on image fusion. The multiresolution analysis has become a very useful tool for analysing remote sensing images. The discrete wavelet transform has become a very useful tool for fusion. Some other fusion methods are also there, such as Laplacian pyramid based, curvelet transform based etc. These methods show a better performance in spatial and spectral quality of the fused image compared to other spatial methods of fusion. The images used in image fusion should already be registered. Misregistration is a major source of error in image fusion. Some well-known image fusion methods are: High-pass filtering technique IHS transform based image fusion PCA-based image fusion Wavelet transform image fusion Pair-wise spatial frequency matching Comparative analysis of image fusion methods demonstrates that different metrics support different user needs, sensitive to different image fusion methods, and need to be tailored to the application. Categories of image fusion metrics are based on information theory features, structural similarity, or human perception. === Multi-focus image fusion === Multi-focus image fusion is used to collect useful and necessary information from input images with different focus depths in order to create an output image that ideally has all information from input images. In visual sensor network (VSN), sensors are cameras which record images and video sequences. In many applications of VSN, a camera can’t give a perfect illustration including all details of the scene. This is because of the limited depth of focus exists in the optical lens of cameras. Therefore, just the object located in the focal length of camera is focused and cleared and the other parts of image are blurred. VSN has an ability to capture images with different depth of focuses in the scene using several cameras. Due to the large amount of data generated by camera compared to other sensors such as pressure and temperature sensors and some limitation such as limited band width, energy consumption and processing time, it is essential to process the local input images to decrease the amount of transmission data. The aforementioned reasons emphasize the necessary of multi-focus images fusion. Multi-focus image fusion is a process which combines the input multi-focus images into a single image including all important information of the input images and it’s more accurate explanation of the scene than every single input image. == Applications == === In remote sensing === Image fusion in remote sensing has several application domains. An important domain is the multi-resolution image fusion (commonly referred to pan-sharpening). In satellite imagery we can have two types of images: Panchromatic images – An image collected in the broad visual wavelength range but rendered in black and white. Multispectral images – Images optically acquired in more than one spectral or wavelength interval. Each individual image is usually of the same physical area and scale but of a different spectral band. The SPOT PAN satellite provides high resolution (10m pixel) panchromatic data. While the LANDSAT TM satellite provides low resolution (30m pixel) multispectral images. Image fusion attempts to merge these images and produce a single high resolution multispectral image. The standard merging methods of image fusion are based on Red–Green–Blue (RGB) to Intensity–Hue–Saturation (IHS) transformation. The usual steps involved in satellite image fusion are as follows: Resize the low resolution multispectral images to the same size as the panchromatic image. Transform the R, G and B bands of the multispectral image into IHS components. Modify the panchromatic image with respect to the multispectral image. This is usually performed by histogram matching of the panchromatic image with Intensity component of the multispectral images as reference. Replace the intensity component by the panchromatic image and perform inverse transformation to obtain a high resolution multispectral image. Pan-sharpening can be done with Photoshop. Other applications of image fusion in remote sensing are available. === In medical imaging === Image fusion has become a common term used within medical diagnostics and treatment. The term is used when multiple images of a patient are registered and overlaid or merged to provide additional information. Fused images may be created from multiple images from the same imaging modality, or by combining information from multiple modalities, such as magnetic resonance image (MRI), computed tomography (CT), positron emission tomography (PET), and single-photon emission computed tomography (SPECT). In radiology and radiation oncology, these images serve different purposes. For example, CT images are used more often to ascertain differences in tissue density while MRI images are typically used to diagnose brain tumors. For accurate diagnosis, radiologists must integrate information from multiple image formats. Fused, anatomically consistent images are especially beneficial in diagnosing and treating cancer. With the advent of these new technologies, radiation oncologists can take full advantage of intensity modulated radiation therapy (IMRT). Being able to overlay diagnostic images into radiation planning images results in more accurate IMRT target tumor volumes.

    Read more →
  • RealSense

    RealSense

    RealSense is an American technology company that develops depth cameras and computer-vision systems used in robotics, access control, industrial automation and healthcare. The company’s stereoscopic 3D cameras and software are marketed as a perception platform for “physical AI”, particularly for humanoid robots and autonomous mobile robots (AMRs). RealSense was incubated for more than a decade inside Intel’s perceptual computing and depth-sensing group before being spun out as an independent company in July 2025 with a US$50 million Series A round backed by a semiconductor-focused private equity firm and strategic investors including Intel Capital and the MediaTek Innovation Fund. Following the spin-out, RealSense announced a strategic collaboration with Nvidia to integrate its AI depth cameras with the Nvidia Jetson Thor robotics platform, the Isaac Sim simulation environment and the Holoscan Sensor Bridge for low-latency sensor fusion. In November 2025, Swiss access-solutions provider dormakaba acquired a minority stake in RealSense and formed a partnership to develop AI-powered biometric access-control and security systems for data centres, airports and other critical infrastructure. == History == === Origins in Intel Perceptual Computing === Intel began developing depth-sensing and perceptual-computing technologies in the early 2010s under the Perceptual Computing brand, with research spanning gesture control, facial recognition and eye-tracking systems. The work led to a series of 3D cameras and developer challenge programmes intended to stimulate software ecosystems for natural-user interfaces. In 2014 Intel rebranded the effort as Intel RealSense, positioning the technology as a family of depth cameras and vision processors for PCs, mobile devices and embedded systems. Early devices such as the F200 and R200 were integrated into laptops and tablets from OEMs including Asus, HP, Dell, Lenovo and Acer, and were also sold as standalone webcams by partners such as Razer and Creative. === Refocus on robotics and near-closure === By the late 2010s Intel had steered RealSense away from mainstream PC peripherals toward robotics, industrial and embedded applications, adding stereo and lidar-based depth cameras to the portfolio. In August 2021, trade publication CRN reported that Intel planned to wind down the RealSense business as part of a broader restructuring, raising questions about the future of the product line. Despite that announcement, Intel continued to invest in new custom silicon for depth cameras, and RealSense remained widely used in mobile robots and automation projects. === Spin-out as RealSense Inc. (2025) === On 11 July 2025, Intel completed the spin-out of its RealSense 3D-camera business into a new privately held company, RealSense Inc., and the new entity announced a US$50 million Series A funding round. The round was led by a semiconductor-focused private equity investor with participation from Intel Capital, MediaTek Innovation Fund and other strategics. Independent coverage described RealSense as serving more than 3,000 active customers and supplying depth cameras to a large share of global AMR and humanoid robot platforms. The company stated that it would continue to support the existing Intel RealSense product roadmap while accelerating development of AI-enabled cameras and perception software. === Strategic partnerships and investments === In October 2025 RealSense and Nvidia announced a strategic collaboration centered on integrating RealSense AI depth cameras with Nvidia’s Jetson Thor robotics compute modules, the Isaac Sim simulation environment and the Holoscan Sensor Bridge for multi-sensor streaming. The collaboration is positioned as enabling “physical AI” workloads such as whole-body humanoid control, real-time mapping and safety-critical human–robot interaction. On 19 November 2025, dormakaba announced that it had acquired a minority stake in RealSense and entered into a partnership to co-develop intelligent access-control solutions, including biometric gates for airports and enterprise facilities. The partnership aims to combine RealSense’s depth and facial-authentication technology with dormakaba’s installed base of sensors, doors and turnstiles. == Products == === Depth-camera families === RealSense’s products are sold as modular components (depth modules, vision processors and complete cameras) and as integrated systems with on-device AI. The company continues to offer and support the Intel RealSense D400 family of active-stereo depth cameras (including the D415, D435 and D455), which are widely used in robotics and automation. These devices combine a RealSense Vision Processor from the D4 family with dual infrared imagers and, on some models, an RGB camera. Earlier generations of Intel RealSense cameras, including the F200, R200, SR300 and the L515 lidar camera, remain in use in niche and legacy applications but are no longer the focus of the independent company’s roadmap. === D555 PoE depth camera === The first new hardware platform announced after the spin-out was the RealSense Depth Camera D555, a ruggedised stereo-depth device aimed at industrial and robotics deployments. The D555 uses the longer-range D450 optical module with a global shutter and integrates RealSense’s Vision SoC V5, a new generation of vision processor optimised for neural-network inference and depth computation. Key features highlighted in technical coverage include: Power over Ethernet (PoE), allowing power and data to be delivered over a single cable and supporting both RJ45 and ruggedised M12 connections; an IP-rated enclosure designed for harsh indoor and outdoor environments; a built-in inertial measurement unit (IMU) to support simultaneous localisation and mapping (SLAM) and motion tracking; native support for ROS 2 and integration with the open-source RealSense SDK. According to independent reporting, the D555 is used in AI-enabled embedded-vision applications in mobile robots and fixed industrial systems, and was among the first RealSense products to be tightly integrated with Nvidia’s Jetson Thor and Holoscan platforms for low-latency sensor fusion. === Software and SDK === RealSense cameras are supported by a cross-platform, open-source software stack historically branded as Intel RealSense SDK 2.0. The SDK provides device drivers, depth and point-cloud processing, tracking and calibration tools, and bindings for languages such as C++, Python and C#. The independent company has continued to maintain and extend the SDK for new hardware, including D555 and other Vision SoC V5-based devices, and publishes reference integrations for ROS 2 and industrial-automation frameworks. === Biometrics and access-control products === In addition to general-purpose depth cameras, RealSense offers facial-authentication hardware and software, commonly referred to as RealSense ID, for biometric access control and identity verification. These products combine an active depth sensor with a dedicated neural-network pipeline running on embedded processors, aimed at applications such as secure doors, turnstiles and kiosks. Use-case material published by partners describes deployments of RealSense-based biometric readers in school lunch programmes, agricultural biosecurity checkpoints and enterprise facilities. The dormakaba partnership announced in 2025 extends this portfolio to integrated biometric gates and sensor-equipped doors in airports and data centres. == Applications == === Robotics and automation === RealSense depth cameras are used in autonomous mobile robots, humanoid robots, drones and industrial automation systems for tasks such as obstacle avoidance, navigation and manipulation. Reuters reported in 2025 that RealSense cameras were embedded in around 60 percent of the world’s AMRs and humanoid robots, citing customers including Unitree Robotics and ANYbotics. Developers and integrators use RealSense systems with platforms such as Nvidia Jetson, ROS and proprietary motion-planning stacks. === Biometrics and security === RealSense technology is also applied in biometric access control and surveillance, where depth and infrared imaging are used to improve anti-spoofing performance for facial recognition. The dormakaba investment and collaboration is aimed at integrating these capabilities into boarding gates, staff entrances and secure facilities, with RealSense providing perception hardware and algorithms and dormakaba providing access-control infrastructure and global distribution. == Reception == Early coverage of Intel RealSense for consumer PCs noted that the technology’s impact would depend on the availability of compelling software and use cases for depth-sensing cameras. Later reporting on the spin-out has characterised the new company as part of a broader wave of investment in robotics and physical AI, with some analysts suggesting that RealSense’s installed base and patent portfolio give it an advantage as dep

    Read more →
  • Pinakes

    Pinakes

    The Pinakes (Ancient Greek: Πίνακες 'tables', plural of πίναξ pinax) is a lost bibliographic work composed by Callimachus (310/305–240 BCE) that is popularly considered to be the first library catalog in the West; its contents were based upon the holdings of the Library of Alexandria during Callimachus's tenure there during the third century BCE. == History == The Library of Alexandria had been founded by Ptolemy I Soter about 306 BCE. The first recorded librarian was Zenodotus of Ephesus. During Zenodotus' tenure, Callimachus, who was never the head librarian, compiled many catalogues/lists, each called Pinakes. His most famous one listed authors and their works; thus he became the first known bibliographer and the scholar who organized the library by authors and subjects about 245 BCE. His work was 120 volumes long. Apollonius of Rhodes was the successor to Zenodotus. Eratosthenes of Cyrene succeeded Apollonius in 235 BCE and compiled his tetagmenos epi teis megaleis bibliothekeis, the 'scheme of the great bookshelves'. In 195 BCE Aristophanes of Byzantium, Eratosthenes' successor, was the librarian and updated the Pinakes, although it is also possible that his work was not a supplement of Callimachus' Pinakes themselves, but an independent polemic against, or commentary upon, their contents. == Description == The collection at the Library of Alexandria contained nearly 500,000 papyrus scrolls, which were grouped together by subject matter and stored in bins. Each bin carried a label with painted tablets hung above the stored papyri. Pinakes was named after these tablets and are a set of index lists. The bins gave bibliographical information for every roll. A typical entry started with a title and also provided the author's name, birthplace, father's name, any teachers trained under, and educational background. It contained a brief biography of the author and a list of the author's publications. The entry had the first line of the work, a summary of its contents, the name of the author, and information about the origin of the roll, as well as any doubts about the genuineness of the ascription. Callimachus' system divided works into six genres of poetry and five sections of prose: rhetoric, law, epic, tragedy, comedy, lyric poetry, history, medicine, mathematics, natural science, and miscellanies. Each category was alphabetized by author. Callimachus composed two other works that were referred as pinakes and were probably somewhat similar in format to the Pinakes (of which they "may or may not be subsections"), but were concerned with individual topics. These are listed by the Suda as: A Chronological Pinax and Description of Didaskaloi from the Beginning and Pinax of the Vocabulary and Treatises of Democritus. == Later bibliographic pinakes == The term pinax was used for bibliographic catalogs beyond Callimachus. For example, Ptolemy-el-Garib's catalog of Aristotle's writings comes to us with the title Pinax (catalog) of Aristotle's writings. == Legacy == The Pinakes proved indispensable to librarians for centuries, and they became a model for organizing knowledge throughout the Mediterranean. Their later influence can be traced to medieval times, even to the Arabic counterpart of the tenth century: Ibn al-Nadim's Al-Fihrist ("Index"). Local variations for cataloging and library classification continued through the late 19th century, when Anthony Panizzi and Melvil Dewey paved the way for more shared and standardized approaches.

    Read more →
  • Simple Knowledge Organization System

    Simple Knowledge Organization System

    Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data. == History == === DESIRE II project (1997–2000) === The most direct ancestor to SKOS was the RDF Thesaurus work undertaken in the second phase of the EU DESIRE project . Motivated by the need to improve the user interface and usability of multi-service browsing and searching, a basic RDF vocabulary for Thesauri was produced. As noted later in the SWAD-Europe workplan, the DESIRE work was adopted and further developed in the SOSIG and LIMBER projects. A version of the DESIRE/SOSIG implementation was described in W3C's QL'98 workshop, motivating early work on RDF rule and query languages: A Query and Inference Service for RDF. === LIMBER (1999–2001) === SKOS built upon the output of the Language Independent Metadata Browsing of European Resources (LIMBER) project funded by the European Community, and part of the Information Society Technologies programme. In the LIMBER project CCLRC further developed an RDF thesaurus interchange format which was demonstrated on the European Language Social Science Thesaurus (ELSST) at the UK Data Archive as a multilingual version of the English language Humanities and Social Science Electronic Thesaurus (HASSET) which was planned to be used by the Council of European Social Science Data Archives CESSDA. === SWAD-Europe (2002–2004) === SKOS as a distinct initiative began in the SWAD-Europe project, bringing together partners from both DESIRE, SOSIG (ILRT) and LIMBER (CCLRC) who had worked with earlier versions of the schema. It was developed in the Thesaurus Activity Work Package, in the Semantic Web Advanced Development for Europe (SWAD-Europe) project. SWAD-Europe was funded by the European Community, and part of the Information Society Technologies programme. The project was designed to support W3C's Semantic Web Activity through research, demonstrators and outreach efforts conducted by the five project partners, ERCIM, the ILRT at Bristol University, HP Labs, CCLRC and Stilo. The first release of SKOS Core and SKOS Mapping were published at the end of 2003, along with other deliverables on RDF encoding of multilingual thesauri and thesaurus mapping. === Semantic web activity (2004–2005) === Following the termination of SWAD-Europe, SKOS effort was supported by the W3C Semantic Web Activity in the framework of the Best Practice and Deployment Working Group. During this period, focus was put both on consolidation of SKOS Core, and development of practical guidelines for porting and publishing thesauri for the Semantic Web. === Development as W3C Recommendation (2006–2009) === The SKOS main published documents — the SKOS Core Guide, the SKOS Core Vocabulary Specification, and the Quick Guide to Publishing a Thesaurus on the Semantic Web — were developed through the W3C Working Draft process. Principal editors of SKOS were Alistair Miles, initially Dan Brickley, and Sean Bechhofer. The Semantic Web Deployment Working Group, chartered for two years (May 2006 – April 2008), put in its charter to push SKOS forward on the W3C Recommendation track. The roadmap projected SKOS as a Candidate Recommendation by the end of 2007, and as a Proposed Recommendation in the first quarter of 2008. The main issues to solve were determining its precise scope of use, and its articulation with other RDF languages and standards used in libraries (such as Dublin Core). === Formal release (2009) === On August 18, 2009, W3C released the new standard that builds a bridge between the world of knowledge organization systems – including thesauri, classifications, subject headings, taxonomies, and folksonomies – and the linked data community, bringing benefits to both. Libraries, museums, newspapers, government portals, enterprises, social networking applications, and other communities that manage large collections of books, historical artifacts, news reports, business glossaries, blog entries, and other items can now use SKOS to leverage the power of linked data. === Historical view of components === SKOS was originally designed as a modular and extensible family of languages, organized as SKOS Core, SKOS Mapping, and SKOS Extensions, and a Metamodel. The entire specification is now complete within the namespace http://www.w3.org/2004/02/skos/core#. == Overview == In addition to the reference itself, the SKOS Primer (a W3C Working Group Note) summarizes the Simple Knowledge Organization System. The SKOS defines the classes and properties sufficient to represent the common features found in a standard thesaurus. It is based on a concept-centric view of the vocabulary, where primitive objects are not terms, but abstract notions represented by terms. Each SKOS concept is defined as an RDF resource. Each concept can have RDF properties attached, including: one or more preferred index terms (at most one in each natural language) alternative terms or synonyms definitions and notes, with specification of their language Concepts can be organized in hierarchies using broader-narrower relationships, or linked by non-hierarchical (associative) relationships. Concepts can be gathered in concept schemes, to provide consistent and structured sets of concepts, representing whole or part of a controlled vocabulary. === Element categories === The principal element categories of SKOS are concepts, labels, notations, documentation, semantic relations, mapping properties, and collections. The associated elements are listed in the table below. === Concepts === The SKOS vocabulary is based on concepts. Concepts are the units of thought—ideas, meanings, or objects and events (instances or categories)—which underlie many knowledge organization systems. As such, concepts exist in the mind as abstract entities which are independent of the terms used to label them. In SKOS, a Concept (based on the OWL Class) is used to represent items in a knowledge organization system (terms, ideas, meanings, etc.) or such a system's conceptual or organizational structure. A ConceptScheme is analogous to a vocabulary, thesaurus, or other way of organizing concepts. SKOS does not constrain a concept to be within a particular scheme, nor does it provide any way to declare a complete scheme—there is no way to say the scheme consists only of certain members. A topConcept is (one of) the upper concept(s) in a hierarchical scheme. === Labels and notations === Each SKOS label is a string of Unicode characters, optionally with language tags, that are associated with a concept. The prefLabel is the preferred human-readable string (maximum one per language tag), while altLabel can be used for alternative strings, and hiddenLabel can be used for strings that are useful to associate, but not meant for humans to read. A SKOS notation is similar to a label, but this literal string has a datatype, like integer, float, or date; the datatype can even be made up (see 6.5.1 Notations, Typed Literals and Datatypes in the SKOS Reference). The notation is useful for classification codes and other strings not recognizable as words. === Documentation === The Documentation or Note properties provide basic information about SKOS concepts. All the properties are considered a type of skos:note; they just provide more specific kinds of information. The property definition, for example, should contain a full description of the subject resource. More specific note types can be defined in a SKOS extension, if desired. A query for skos:note ? will obtain all the notes about , including definitions, examples, and scope, history and change, and editorial documentation. Any of these SKOS Documentation properties can refer to several object types: a literal (e.g., a string); a resource node that has its own properties; or a reference to another document, for example using a URI. This enables the documentation to have its own metadata, like creator and creation date. Specific guidance on SKOS documentation properties can be found in the SKOS Primer Documentary Notes. === Semantic relations === SKOS semantic relations are intended to provide ways to declare relationships between concepts within a concept scheme. While there are no restrictions precluding their use with two concepts from separate schemes, this is discouraged because it is likely to overstate what can be known about the two schemes, and perhaps link them inappropriately. The property related simply makes an association relationship between two concepts; no hierarchy or generality relation is implied. The properties broader and narrower are used to assert a direct hierarchical link between two concepts. The meaning may be unexpected; the relat

    Read more →
  • Clarizen

    Clarizen

    Clarizen, Inc. is a project management software and collaborative work management company. Clarizen uses a software as a service business model. Clarizen's features include attaching CAD drawings to a project, moving between the project view and design view and an E-mail reporting feature. In May 2014 Clarizen raised $35 million in venture capital investment led by Goldman Sachs. The round brought investment to $90 million. Previous investors, including Benchmark Capital, Carmel Ventures, DAG Ventures, Opus Capital and Vintage Investment Partners participated. In April 2020, Clarizen appointed Matt Zilli as its new CEO, replacing Boaz Chalamish who is appointed as Executive Chairman. In January 2021 Clarizen was acquired by Planview.

    Read more →
  • Moral Machine

    Moral Machine

    Moral Machine is an online platform, developed by Iyad Rahwan's Scalable Cooperation group at the Massachusetts Institute of Technology, that generates moral dilemmas and collects information on the decisions that people make between two destructive outcomes. The platform is the idea of Iyad Rahwan and social psychologists Azim Shariff and Jean-François Bonnefon, who conceived of the idea ahead of the publication of their article about the ethics of self-driving cars. The key contributors to building the platform were MIT Media Lab graduate students Edmond Awad and Sohan Dsouza. The presented scenarios are often variations of the trolley problem, and the information collected would be used for further research regarding the decisions that machine intelligence must make in the future. For example, as artificial intelligence plays an increasingly significant role in autonomous driving technology, research projects like Moral Machine help to find solutions for challenging life-and-death decisions that will face self-driving vehicles. Moral Machine was active from January 2016 to July 2020. The Moral Machine continues to be available on their website for people to experience. == The experiment == The Moral Machine was an ambitious project; it was the first attempt at using such an experimental design to test a large number of humans in over 200 countries worldwide. The study was approved by the Institute Review Board (IRB) at Massachusetts Institute of Technology (MIT). The setup of the experiment asks the viewer to make a decision on a single scenario in which a self-driving car is about to hit pedestrians. The user can decide to have the car either swerve to avoid hitting the pedestrians or keep going straight to preserve the lives it is transporting. Participants can complete as many scenarios as they want to, however the scenarios themselves are generated in groups of thirteen. Within this thirteen, a single scenario is entirely random while the other twelve are generated from a space in a database of 26 million different possibilities. They are chosen with two dilemmas focused on each of six dimensions of moral preferences: character gender, character age, character physical fitness, character social status, character species, and character number. The experiment setup remains the same throughout multiple scenarios but each scenario tests a different set of factors. Most notably, the characters involved in the scenario are different in each one. Characters may include ones such as: Stroller, girl, boy, pregnant, Male Doctor, Female Doctor, Female Athlete, Executive Female, Male Athlete, Executive Male, Large Woman, Large Man, homeless, old man, old woman, dog, criminal, and a cat. Through these different characters researchers were able to understand how a wide variety of people will judge scenarios based on those involved. == Analysis == The Moral Machine collected 40 million moral decisions from 4 million participants in 233 countries, analysis of which revealed trends within individual countries and humanity as a whole. It tested for nine factors: preference for sparing humans versus pets, passengers versus pedestrians, men versus women, young versus elderly, fit versus overweight, higher versus lower social status, jaywalkers versus law abiders, larger versus smaller groups, and inaction (i.e. staying on course) versus swerving. Globally, participants favored human lives over lives of animals like dogs and cats. They preferred to spare more lives if possible, and younger lives as opposed to older. Babies were most often spared with cats being the least spared. In terms of gender variations, people tended to spare men over women for doctors and the elderly. All countries generally shared the preference to spare pedestrians over passengers and law-abiders over criminals. Participants from less wealthy countries showed a higher tendency of sparing pedestrians who crossed illegally compared to those from more wealthy and developed countries. This is most likely due to their experience living in a society where individuals are more likely to deviate from rules due to less stringent enforcement of laws. Countries of higher economic inequality overwhelmingly prefer to save wealthier individuals over poorer ones. === Cultural differences === Researchers subdivided 130 countries with similar results into three ‘cultural clusters’. North America and European countries with significant Christian populations had a higher preference for inaction on the part of the driver and thus had less of a preference for sparing pedestrians as compared to other clusters. East Asian and Islamic countries, together constituting the second cluster, did not have as much preference to spare younger humans compared to the other two clusters and had a higher preference for sparing law-abiding humans. Latin America and Francophone countries had a higher preference for sparing women, the young, the fit, and those of higher status, but a lower preference for sparing humans over pets or other animals. Individualistic cultures tended to spare larger groups, and collectivist cultures had a stronger preference for sparing the lives of older people. For instance, China ranked far below the world average for preference to spare the younger over elderly, while the average respondent from the US exhibited a much higher tendency to save younger lives and larger groups. == Applications of the data == The findings from the moral machine can help decision makers when designing self-driving automotive systems. Designers must make sure that these vehicles are able to solve problems on the road that aligns with the moral values of humans around it. This is a challenge because of the complex nature of humans who may all make different decisions based on their personal values. However, by collecting a large amount of decisions from humans all over the world, researchers can begin to understand patterns in the context of a particular culture, community, and people. == Other features == The Moral Machine was deployed in June 2016. In October 2016, a feature was added that offered users the option to fill a survey about their demographics, political views, and religious beliefs. Between November 2016 and March 2017, the website was progressively translated into nine languages in addition to English (Arabic, Chinese, French, German, Japanese, Korean, Portuguese, Russian, and Spanish). Overall, the Moral Machine offers four different modes, with the focus being on the data-gathering feature of the website, called the Judge mode. This means that the Moral Machine, in addition to providing their own scenarios for users to judge, also invites users to create their own scenarios to be submitted and approved so that other people may also judge those scenarios. Data is also open sourced for anyone to explore via an interactive map that is featured on the Moral Machine website. == In the literature == Studies and research on the Moral Machine have taken a wide variety of approaches. However, theological examinations of the topic are still scarce where two bodies of work that examine such perspective currently exist in this regard: One is Buddhist while the other is Christian.

    Read more →
  • John Schulman

    John Schulman

    John Schulman (born 1987 or 1988) is an American artificial intelligence researcher and co-founder of OpenAI. In August 2024, he announced he would be joining Anthropic. In February 2025, he announced he was leaving to join Thinking Machines Lab, where he is chief scientist. == Early life and education == Schulman had an interest in science and math from a young age. He enjoyed science fiction, especially the work of Isaac Asimov. When he was in seventh grade, he became deeply interested in the television program BattleBots, which featured combat between remote-controlled robots. In what he said was his first self-directed study, he read extensively in subject areas that would help him design a superior robot, but the robot he and his friends worked on was never built. He attended Great Neck South High School. He was a member of the US Physics olympiad Team in 2005. In 2010, he graduated from Caltech with a degree in physics. He has a PhD in electrical engineering and computer sciences from the University of California, Berkeley, where he was advised by Pieter Abbeel. == Career == In December 2015, shortly before finishing his PhD, Schulman co-founded OpenAI with Sam Altman, Elon Musk, Ilya Sutskever, Greg Brockman, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, Pamela Vagata, and Wojciech Zaremba, with Sam Altman and Elon Musk as the co-chairs. There, he led the reinforcement learning team that created ChatGPT. He has been referred to as the "architect" of ChatGPT. In August 2024, Schulman announced he would be joining Anthropic. He stated his move was to allow him to deepen his focus on AI alignment and return to more hands-on technical work. In February 2025, he announced he was leaving to join Thinking Machines Lab, where he is chief scientist. == Awards and honors == In 2025, Schulman received the Mark Bingham Award for Excellence in Achievement by Young Alumni from his alma mater, UC Berkeley.

    Read more →
  • Mustafa Suleyman

    Mustafa Suleyman

    Mustafa Suleyman (born in August 1984) is a British artificial intelligence (AI) entrepreneur. He is the CEO of Microsoft AI, and the co-founder and former head of applied AI at DeepMind, an AI company which was acquired by Google. After leaving DeepMind, he co-founded Inflection AI, a machine learning and generative AI company, in 2022. == Early life and education == Suleyman's Syrian father worked as a taxi driver and his English mother was a nurse. He grew up off Caledonian Road, London, where he lived with his parents and his two younger brothers. Suleyman went to Thornhill Primary School, a state school in Islington, followed by Queen Elizabeth's School, Barnet, a boys' grammar school. Around that time, he met his DeepMind co-founder, Demis Hassabis, through his best friend, who was Demis's younger brother. Suleyman shared that he and Hassabis often discussed how they could make a positive impact on the world. Suleyman enrolled to study philosophy and theology at the University of Oxford where he was an undergraduate student at Mansfield College, Oxford, before dropping out at 19. == Career == In August 2001, while still a teenager and a "strong atheist", Suleyman helped Mohammed Mamdani establish a telephone counselling service called the Muslim Youth Helpline. The organization would later become one of the largest mental health support services. Suleyman subsequently worked as a policy officer on human rights for Ken Livingstone, the Mayor of London, before going on to start Reos Partners, a "systemic change" consultancy that uses methods from conflict resolution to navigate social problems. As a negotiator and facilitator, Mustafa worked for a wide range of clients such as the United Nations, the Dutch government, and the World Wide Fund for Nature. === DeepMind and Google === In 2010 Suleyman co-founded DeepMind Technologies, an artificial intelligence (AI) and machine learning company, and became its chief product officer. The company quickly established itself as one of the leaders in the AI sector. In 2014 DeepMind was acquired by Google for a reported £400 million, the company's largest acquisition in Europe at that time. Following the acquisition, Suleyman became head of applied AI at DeepMind, taking on responsibility for integrating the company's technology across a wide range of Google products. In February 2016 Suleyman launched DeepMind Health at the Royal Society of Medicine. DeepMind Health builds clinician-led technology for the National Health Service (NHS) and other partners to improve frontline healthcare services. Under Suleyman, DeepMind also developed research collaborations with healthcare organizations in the United Kingdom, including Moorfields Eye Hospital NHS foundation trust. In 2016, Suleyman led an effort to apply DeepMind's machine learning algorithms to help reduce the energy required to cool Google's data centres. The system evaluated the billions of possible combinations of actions that the data centre operators could take, and came up with recommendations based on the predicted power usage. The system discovered novel methods of cooling, leading to a reduction of up to 40% of the amount of energy used for cooling, and a 15% improvement in the buildings' overall energy efficiency. Since June 2019, Suleyman has served on the board of The Economist Group, which publishes The Economist newspaper. In August 2019, Suleyman was placed on administrative leave following allegations of bullying employees. The company hired an external lawyer to investigate, and shortly thereafter Suleyman left to take a VP role at parent company Google. An email circulated by DeepMind's leadership to staff after the story broke, as well as additional details published by Business Insider, said Suleyman's "management style fell short" of expected standards. In December 2019, Suleyman announced he would be leaving DeepMind to join Google, working in a policy role. === Inflection AI === Suleyman left Google in January 2022 and joined Greylock Partners as a venture partner and in March 2022, Suleyman co-founded Inflection AI, a new AI lab venture with Greylock's Reid Hoffman. The company was founded with the goal of leveraging "AI to help humans 'talk' to computers," recruited former staff from companies such as Google and Meta and raised $225 million in its first funding round. In 2023, Inflection AI launched a chatbot named “Pi” for Personal Intelligence. The bot “remembers” past conversations and seems to get to know its users over time. According to Suleyman, the long-term goal for Pi is to be a digital “Chief of Staff”, with the initial design focused on maintaining conversational dialogue with users, asking questions, and offering emotional support. === Microsoft AI === In March 2024, Microsoft appointed Suleyman as Executive Vice President (EVP) and CEO of its newly created consumer AI unit, Microsoft AI. Several members of Inflection AI's team were also appointed to the division, including co-founder Karen Simonyan. === Awards and honours === Suleyman was appointed a Commander of the British Empire (CBE) in the 2019 New Year Honours. Suleyman was named by Time as one of the 100 most influential people in artificial intelligence in 2023 and in 2024. === Views on AI ethics === Suleyman is prominent in the debate over the ethics of AI and has spoken widely about the need for companies, governments and civil society to join in holding technologists accountable for the impacts of their work. He has advocated redesigning incentives in the technology industry to steer business leaders toward prioritising social responsibility alongside their fiduciary duties. Within DeepMind he set up a research unit called DeepMind Ethics & Society to study the real-world impacts of AI and help technologists put ethics into practice. Suleyman is also a founding co-chair of the Partnership on AI – an organisation that includes representatives from companies such as Amazon, Apple, DeepMind, Meta, Google, IBM, and Microsoft. The organisation studies and formulates best practices for AI technologies, advances the public's understanding of AI, and serves as an open platform for discussion and engagement about AI and how it affects people and society. Its board of directors has equal representation from non-profit and for profit entities. In September 2023, Suleyman, in collaboration with researcher Michael Bhaskar, published The Coming Wave, Technology, Power and the 21st Century's Greatest Dilemma, a book that examines the transformative and potentially perilous impact of advanced technologies, particularly AI and synthetic biology. According to Suleyman, AI notably has the potential to bring "radical abundance", address climate change and empower people with its cheap problem-solving capabilities. But it may also improve its own design and manufacturing processes, leading to a period of dangerously rapid AI progress. And it could enable catastrophic misuse, from bioengineered pathogens to autonomous weapons, making global oversight and containment essential to avoid unintended consequences. It was shortlisted for the 2023 Financial Times Business Book of the Year Award. In June 2024, in an interview with Andrew Ross Sorkin at the Aspen Ideas Festival, Suleyman expressed the view that unless a website explicitly specifies otherwise, for "content that is already on the open web, the social contract of that content since the 90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been freeware, if you like. That's been the understanding." The statement sparked controversy over the use of Internet data for training AI models. == Personal life == A Business Insider profile in 2017 described Suleyman as being liberal.

    Read more →
  • Time-compressed speech

    Time-compressed speech

    Time-compressed speech refers to an audio recording of verbal text in which the text is presented in a much shorter time interval than it would through normally-paced real time speech. The basic purpose is to make recorded speech contain more words in a given time, yet still be understandable. For example: a paragraph that might normally be expected to take 20 seconds to read, might instead be presented in 15 seconds, which would represent a time-compression of 25% (5 seconds out of 20). The term "time-compressed speech" should not be confused with "speech compression", which controls the volume range of a sound, but does not alter its time envelope. == Methods == While some voice talents are capable of speaking at rates significantly in excess of general norms, the term "time-compressed speech" most usually refers to examples in which the time-reduction has been accomplished through some form of electronic processing of the recorded speech. In general, recorded speech can be electronically time-compressed by: increasing its speed (linear compression); removing silences (selective editing); a combination of the two (non-linear compression). The speed of a recording can be increased, which will cause the material to be presented at a faster rate (and hence in a shorter amount of time), but this has the undesirable side-effect of increasing the frequency of the whole passage, raising the pitch of the voices, which can reduce intelligibility. There are normally silences between words and sentences, and even small silences within certain words, both of which can be reduced or removed ("edited-out") which will also reduce the amount of time occupied by the full speech recording. However, this can also have the effect of removing verbal "punctuation" from the speech, causing words and sentences to run together unnaturally, again reducing intelligibility. Vowels are typically held a minimum of 20 milliseconds, over many cycles of the fundamental pitch. DSP systems can detect the beginning and end of each cycle and then skip over some fraction of those cycles, causing the material to be presented at a faster rate, without changing the pitch, maintaining a "normal" tone of voice. The current preferred method of time-compression is called "non-linear compression", which employs a combination of selectively removing silences; speeding up the speech to make the reduced silences sound normally-proportioned to the text; and finally applying various data algorithms to bring the speech back down to the proper pitch. This produces a more acceptable result than either of the two earlier techniques; however, if unrestrained, removing the silences and increasing the speed can make a selection of speech sound more insistent, possibly to the point of unpleasantness. == Applications == === Advertising === Time-compressed speech is frequently used in television and radio advertising. The advantage of time-compressed speech is that the same number of words can be compressed into a smaller amount of time, reducing advertising costs, and/or allowing more information to be included in a given radio or TV advertisement. It is usually most noticeable in the information-dense caveats and disclaimers presented (usually by legal requirement) at the end of commercials—the aural equivalent of the "fine print" in a printed contract. This practice, however, is not new: before electronic methods were developed, spokespeople who could talk extremely quickly and still be understood were widely used as voice talents for radio and TV advertisements, and especially for recording such disclaimers. === Education === Time-compressed speech has educational applications such as increasing the information density of trainings, and as a study aid. A number of studies have demonstrated that the average person is capable of relatively easily comprehending speech delivered at higher-than-normal rates, with the peak occurring at around 25% compression (that is, 25% faster than normal); this facility has been demonstrated in several languages. Conversational speech (in English) takes place at a rate of around 150 wpm (words per minute), but the average person is able to comprehend speech presented at rates of up to 200-250 wpm without undue difficulty. Blind and severely visually impaired subjects scored similar comprehension levels at even higher rates, up to 300-350 wpm. Blind people have been found to use time-compressed speech extensively, for example, when reviewing recorded lectures from high school and college classes, or professional trainings. Comprehension rates in older blind subjects have been found to be as good, or in some cases better than those found in younger sighted subjects. Other studies have determined that the ability to comprehend highly time-compressed speech tends to fall off with increased age, and is also reduced when the language of the time-compressed speech is not the listener's native language. Non-native speakers can, however, improve their comprehension level of time-compressed speech with multiday training. === Voice Mail === Voice mail systems have employed time-compressed speech since as far back as the 1970s. In this application, the technology enables the rapid review of messages in high-traffic systems, by a relatively small number of people. === Streaming Multimedia === Time-compressed speech has been explored as one of a variety of interrelated factors which may be manipulated to increase the efficiency of streaming multimedia presentations, by significantly reducing the latency times involved in the transfer of large digitally encoded media files.

    Read more →
  • Colossus (supercomputer)

    Colossus (supercomputer)

    Colossus is a supercomputer developed by xAI. Construction began in 2024 in Memphis, Tennessee; the system became operational in July 2024. It is currently the world's largest AI supercomputer. Colossus's primary purpose is to train the company's chatbot, Grok. In addition, Colossus provides computing support to the social-media platform X and to other projects of Elon Musk, such as SpaceX. In 2025, it expanded to neighboring Southaven, Mississippi across the Tennessee–Mississippi border. As of May 6, 2026, Anthropic has agreed to rent all compute capacity at the Colossus 1 data center. == Background == Colossus was launched in September 2024 at a former Electrolux site in South Memphis to train the AI language model Grok. Within 19 days of the project's conception, xAI was ready to begin construction. The site was chosen because the abandoned Electrolux building could be repurposed to expedite construction and its proximity to a nearby wastewater treatment facility provided a water source. As of February 2025, xAI plans to build an $80 million facility to process additional wastewater for use at the supercomputer. === xAI === Musk incorporated xAI in March 2023 with the stated purpose of understanding the "nature of the universe". The team includes former members of OpenAI, DeepMind, Microsoft, and Tesla. Musk was one of the founding members of the company OpenAI, investing up to US$45 million in 2015. He left OpenAI in 2018, reportedly to avoid conflicts of interest with Tesla. It has also been reported that he had made a bid for leadership at OpenAI and left when his proposal was rejected. The exact reasons for his departure from the company are unclear. Both Dell Technologies and Supermicro partnered with xAI to build the supercomputer. It was originally powered by 100,000 Nvidia graphics processing units (GPUs) and was constructed in 122 days. 3 months after the first 100,000 GPUs were deployed, xAI announced that they had increased the system to 200,000 GPUs and that they intended to continue increasing the computer's processing power to 1 million GPUs. As of April 2025, xAI claimed Colossus was the largest AI training platform in the world. == Choice of location == xAI selected Memphis, in southwestern Tennessee, as the site for Colossus in part because an existing industrial facility allowed the project to proceed more quickly than constructing a new data center. Elon Musk was initially told that building a data center would take 18–24 months. The company instead searched for a vacant facility and selected the former Electrolux factory in Memphis. Electrolux opened the facility in 2012 and operated it for about eight years before closing it in 2020 after relocating operations to Springfield, Tennessee. The building covered 785,000 sq ft (72,900 m2) and had been purchased by Phoenix Investors in December 2023 for $35 million . Because the structure was already in place, work on the supercomputer could begin immediately rather than waiting for a new facility to be constructed. According to Forbes, xAI considered seven or eight other sites before selecting Memphis, and Musk finalized the decision to build in Memphis in about a week. The decision was finalized in March 2024, after which construction began. xAI publicly announced in June 2024 that Colossus would be built in Memphis. The building itself was not the only reason xAI selected Memphis. According to the Greater Memphis Chamber, the company chose the city because of its "reliable power grid, ability to create a water recycling facility, proximity to the Mississippi River and ample land". The city was also able to provide the large amounts of electricity and water needed to operate the supercomputer. At full capacity, the system was expected to require 150 megawatts of electricity and millions of gallons of water per day. The project also relied on partnerships with local and regional organizations including Memphis Light, Gas and Water (MLGW), Tennessee Valley Authority (TVA), the City of Memphis, and Shelby County. The city also provided financial incentives for the project. == Environmental impact == AI data centers consume large amounts of energy. At the site of Colossus in South Memphis, the grid connection was only 8 MW, so xAI applied to temporarily set up more than a dozen gas turbines (Voltagrid’s 2.5 MW units and Solar Turbines’ 16 MW SMT-130s) which would steadily burn methane gas from a 16-inch natural gas main. Aerial imagery in April 2025 showed 35 gas turbines had been set up at a combined 422 MW. These turbines have been estimated to generate about "72 megawatts, which is approximately 3% of the (TVA) power grid". The higher number of gas turbines and the subsequent emissions requires xAI to have a major source permit. In Memphis, xAI was able to avoid some environmental rules in the construction of Colossus, such as operating without permits for the on-site methane gas turbines because they are "portable". The Shelby County Health Department told NPR that "it only regulates gas-burning generators if they're in the same location for more than 364 days". However, in a January 2026 ruling, the EPA revised its New Source Performance Standard and announced that large methane gas turbines require permits even for temporary operations. In November 2024, the grid connection was upgraded to 150 MW, and some turbines were removed. Along with high electricity needs, the expected water demand is over five million gallons of water per day. While xAI has stated they plan to work with MLGW on a wastewater treatment facility and the installation of 50 megawatts of large battery storage facilities, there are currently no concrete plans in place aside from a one-page factsheet shared by MLGW. == Community response == The plan to build Colossus in Memphis was unknown to residents, City Council members, and environmental agencies. Many did not find out about the project until the day before, or the day of, as they watched the announcement on the local news. Keshaun Pearson, president of Memphis Community Against Pollution, stated that there is a historical lack of transparency and communication surrounding environmental issues in Memphis. Some community members in Memphis have expressed concern about the potential for additional air and water pollution caused by the supercomputer. In a letter to the Shelby County Health Department, the Southern Environmental Law Center stated the emissions from the turbines make the facility "...likely the largest industrial emitter of NOx in Memphis..." This is due to data supplied by the manufacturer showing that "...xAI emits between 1,200 and 2,000 tons of smog-forming nitrogen oxides (NOx)..." At a public Shelby County Commissioner's hearing on April 9, 2025, residents living near the site of Colossus voiced complaints about air quality, noting that they have chronic respiratory issues related to living in a polluted section of Memphis. One woman said she smells "everything but the right thing and the right thing is the clean air." Other residents voiced frustration that Brent Mayo, the senior xAI official responsible for building out xAI's infrastructure, did not attend the meeting to discuss community concerns. Keshaun Pearson also stated that "We're getting more and more days a year where it is unhealthy for us to go outside." People living near the site of Colossus have said they were not offered the opportunity for a public review of the plans, nor were they provided with information on how their community could potentially benefit. The community is also concerned about the strain on the power grid. Memphis's peak demand is around 3 GW. In November 2024, TVA approved xAI's request for access to more than 100 megawatts of power to Colossus which is supplied by MLGW. In December 2022, MLGW imposed (then rescinded) rolling blackouts during several days of extreme cold, straining the power grid. In a letter to the TVA, the SELC "urged the agency to 'prioritize Memphis families' access to reliable power over the 'secondary purpose' of serving xAI". == Current progress == In early December 2024, Ted Townsend detailed how the power of Colossus doubled in its processing capability. When it first went online in September 2024, it was using "100,000 Nvidia H100 processing chips". This initial launch demonstrated Colossus to be the largest supercomputer globally. The maximum power consumption increased from 150 to 250 MW. As of June 2025, the supercomputer consists of 150,000 H100 GPUs, 50,000 H200 GPUs, and 30,000 GB200 GPUs. Another 110,000 GB200 GPUs are to be brought online at a second data center, also in the Memphis area. The expansion of this supercomputer has already been discussed and will be the second phase of the project. xAI also plans to increase Colossus to 1 million GPUs. Because the supercomputer currently utilizes gas turbines for power, alongside 168 Tesla Megapack battery storage units. xAI is also looking to add more

    Read more →
  • Semantic knowledge management

    Semantic knowledge management

    In computer science, semantic knowledge management is a set of practices that seeks to classify content so that the knowledge it contains may be immediately accessed and transformed for delivery to the desired audience, in the required format. This classification of content is semantic in its nature – identifying content by its type or meaning within the content itself and via external, descriptive metadata – and is achieved by employing XML technologies. The specific outcomes of these practices are: Maintain content for multiple audiences together in a single document Transform content into various delivery formats without re-authoring Search for content more effectively Involve more subject-matter experts in the creation of content without reducing quality Reduce production costs for delivery formats Reduce the manual administration of getting the right knowledge to the right people Reduce the cost and time to localize content == Notable semantic knowledge management systems == Learn eXact Thinking Cap LCMS Thinking Cap LMS Xyleme LCMS iMapping

    Read more →
  • VP-Expert

    VP-Expert

    VP-Expert is an artificial intelligence development tool that gained popularity in the late 1980s and early 1990s. Published by Paperback Software, VP-Expert was designed to facilitate the creation of rule-based expert systems, primarily for applications in business and industry. It was the best-selling expert-system software for microcomputers in the late 1980s. == History == VP-Expert was created by Brian Sawyer and published by Paperback Software in 1987. VP-Expert was widely adopted during the late 1980s. By April 1989, InfoWorld described it as "the best-selling expert-system software for personal computers." In June 1991, ownership of VP-Expert transferred from Paperback Software to WordTech Systems, Inc. following Paperback Software’s liquidation after a legal dispute with Lotus Development Corporation regarding its VP-Planner spreadsheet. VP-Expert continued to receive positive reviews with InfoWorld stating in 1992 "for automatically creating simple expert systems and being able to edit them into more sophisticated applications, hardly a better product exists than VP-Expert". == Features == VP-Expert used an inference engine based on backward chaining to reach conclusions through English-like if/then rules. It operated through a text interface and included an explanation facility that showed the reasoning steps used to justify its conclusions. == Applications == VP-Expert found applications across various domains. In environmental analysis, researchers used VP-Expert to develop a knowledge-based system for analyzing the impact of particulate matter air pollution on human health. In engineering design, VP-Expert was utilized in the creation of a prototype expert system to assist in fishway design. In aviation management, the tool was employed to develop an expert system aimed at maximizing airport capacity while adhering to noise-mitigation plans. == Limitations == While VP-Expert offered certain advantages, it also had limitations. Its rule-based approach could become challenging to manage for large and complex knowledge bases, and the process of eliciting and encoding knowledge from experts could be time-consuming and difficult.

    Read more →
  • Direct Graphics Access

    Direct Graphics Access

    Direct Graphics Access is a plug-in for the X display servers that allows client programs direct access to the frame buffer. Graphics hardware communicates via a chunk of memory called a frame buffer. This is an array of values that represent pixel color values on the screen. Writing the appropriate values into the frame buffer therefore allows a program to paint areas of the screen. However, as with any shared resource, problems occur when multiple programs attempt to access the same resource, as they tend to write over each other's work. In the X Window System, this is solved by having a central display server that mediates between programs that want to draw on the screen. The display server also used to perform a lot of the drawing work, allowing programs to say Draw me a circle of this radius filled with this pattern or draw this text in this font. The X server does all this work, freeing programmers from having to write their own drawing code. Another advantage of the X architecture is that it works over a network, allowing programs on one machine to display output on the screen of another. Direct Graphics Access allows direct access to the frame buffer and the X-server hands over control of the frame buffer to the client program and waits for the client to hand it back. This means that the client program has control of the whole screen, and so it is mostly used for full-screen video/games.

    Read more →
  • Retrieval-based Voice Conversion

    Retrieval-based Voice Conversion

    Retrieval-based Voice Conversion (RVC) is an open source voice conversion AI algorithm that enables realistic speech-to-speech transformations, accurately preserving the intonation and audio characteristics of the original speaker. == Overview == In contrast to text-to-speech systems such as ElevenLabs, RVC differs by providing speech-to-speech outputs instead. It maintains the modulation, timbre and vocal attributes of the original speaker, making it suitable for applications where emotional tone is crucial. The algorithm enables both pre-processed and real-time voice conversion with low latency. This real-time capability marks a significant advancement over previous AI voice conversion technologies, such as So-vits SVC. Its speed and accuracy have led many to note that its generated voices sound near-indistinguishable from "real life", provided that sufficient computational specifications and resources (e.g., a powerful GPU and ample RAM) are available when running it locally and that a high-quality voice model is used. == Technical foundation == Retrieval-based Voice Conversion (RVC) utilizes a hybrid approach that integrates feature extraction with retrieval-based synthesis. Instead of directly mapping source speaker features to the target speaker using statistical models, RVC retrieves relevant segments from a target speech database, aiming to enhance the naturalness and speaker fidelity of the converted speech. At a high level, the RVC system typically comprises three main components: (1) a content feature extractor, such as a phonetic posteriorgram (PPG) encoder or self-supervised models like HuBERT; (2) a vector retrieval module that searches a target voice database for the most similar speech units; and (3) a vocoder or neural decoder that synthesizes waveform output from the retrieved representations. The retrieval-based paradigm aims to mitigate the oversmoothing effect commonly observed in fully neural sequence-to-sequence models, potentially leading to more expressive and natural-sounding speech. Furthermore, with the incorporation of high-dimensional embeddings and k-nearest-neighbor search algorithms, the model can perform efficient matching across large-scale databases without significant computational overhead. Recent RVC frameworks have incorporated adversarial learning strategies and GAN-based vocoders, such as HiFi-GAN, to enhance synthesis quality. These integrations have been shown to produce clearer harmonics and reduce reconstruction errors. == Research developments == Research on RVC has recently explored the use of self-supervised learning (SSL) encoders such as wav2vec 2.0 and HuBERT to replace hand-engineered features like MFCCs. These encoders improve content preservation, especially when source and target speakers have dissimilar speaking styles or accents. Moreover, modern RVC models leverage vector quantization methods to discretize the acoustic space, improving both synthesis accuracy and generalization across unseen speakers. For example, retrieval-augmented VQ models can condition the synthesis stage on quantized speech tokens, which enhances controllability and style transfer. Despite its strengths, RVC still faces limitations related to database coverage, especially in real-time or few-shot settings. Inadequate diversity in the target voice corpus may lead to suboptimal retrieval or unnatural prosody. These advances demonstrate the viability of RVC as a strong alternative to conventional deep learning VC systems, balancing both flexibility and efficiency in diverse voice synthesis applications. == Training process == The training pipeline for retrieval-based voice conversion typically includes a preprocessing step where the target speaker's dataset is segmented and normalized. A pitch extractor such as librosa or DDSP-DDC may be used to obtain fundamental frequency (F0) features. During training, the model learns to map content features from the source speaker to the acoustic representation of the target speaker while maintaining pitch and prosody. The training objective often combines reconstruction loss with feature consistency loss across intermediate layers, and may incorporate cycle consistency loss to preserve speaker identity. Fine-tuning on small datasets is feasible due to the use of pre-trained models, particularly for the SSL encoder and content extractor components. This approach allows transfer learning to be applied effectively, enabling the model to converge faster and generalize better to unseen inputs. Most open implementations support batch training, gradient accumulation, and mixed-precision acceleration (e.g., FP16), especially when utilizing NVIDIA CUDA-enabled GPUs. == Real-time deployment == RVC systems can be deployed in real-time scenarios through WebUI interfaces and streaming audio frameworks. Optimizations include converting the inference graph to ONNX or TensorRT formats, reducing latency. Audio buffers are typically processed in chunks of 0.2–0.5 seconds to ensure minimal delay and seamless conversion. Cross-platform compatibility with tools such as OBS Studio and Voicemeeter enables integration into live streaming, video production, or virtual avatar environments. == Applications and concerns == The technology enables voice changing and mimicry, allowing users to create accurate models of others using only a negligible amount of minutes of clear audio samples. These voice models can be saved as .pth (PyTorch) files. While this capability facilitates numerous creative applications, it has also raised concerns about potential misuse as deepfake software for identity theft and malicious impersonation through voice calls. == Ethical and legal considerations == As with other deep generative models, the rise of RVC technology has led to increasing debate about copyright, consent, and authorship. While some jurisdictions may allow parody or fair use in creative contexts, impersonating living individuals without permission may infringe upon privacy and likeness rights. As a result, some platforms have begun issuing takedown notices against AI-generated voice content that closely mimics celebrities or musicians. === In pop culture === RVC inference has been used to create realistic depictions of song covers, such as replacing original vocals with characters like Twilight Sparkle and Mordecai to have them sing duets of popular music like "Airplanes" and "Somebody That I Used to Know." These AI-generated covers, which can sound strikingly similar to the voice imitated, have gained popularity on platforms like YouTube as humorous memes.

    Read more →
  • Microelectronics and Computer Technology Corporation

    Microelectronics and Computer Technology Corporation

    Microelectronics and Computer Technology Corporation, originally the Microelectronics and Computer Consortium and widely seen by the acronym MCC, was the first, and at one time one of the largest, computer industry research and development consortia in the United States. MCC ceased operations in 2000 and was formally dissolved in 2004. == Divisions == MCC did research and development in the following areas: [1] System Architecture and Design (optimise hardware and software design, provide for scalability and interoperability, allow rapid prototyping for improved time-to-market, and support the re-engineering of existing systems for open systems). Advanced Microelectronics Packaging and Interconnection (smaller, faster, more powerful, and cost-competitive). Hardware Systems Engineering (tools and methodologies for cost-efficient, up-front design of advanced electronic systems, including modelling and design-for-test techniques to improve cost, yield, quality, and time-to-market). Environmentally Conscious Technologies (process control and optimisation tools, information management and analysis capabilities, and non-hazardous material alternatives supporting cost-efficient production, waste minimisation, and reduced environmental impact). Distributed Information Technology (managing and maintaining physically distributed corporate information resources on different platforms, building blocks for the national information infrastructure, networking tools and services for integration within and between companies, and electronic commerce). Intelligent Systems (systems that "intelligently" support business processes and enhance performance, including decision support, data management, forecasting and prediction). == History == The MCC was a response to the announcement of Japan's Fifth Generation Project, a large Japanese research project launched in 1982 aimed at producing a new kind of computer by 1991. The Japanese had formed similar industrial research consortia as early as 1956.[2] Many European and American computer companies saw this new Japanese initiative as an attempt to take full control of the world's high-end computer market, and MCC was created, in part, as a defensive move against that threat. In late 1982, several major computer and semiconductor manufacturers in the United States banded together and founded MCC under the leadership of Admiral Bobby Ray Inman, whose previous positions had been Director of the National Security Agency and deputy director of the Central Intelligence Agency. Such formations were illegal in the United States until the 1984 Congressional passage of the "National Cooperative Research Act". Several sites with relevant universities were considered, including Atlanta, Georgia (Georgia Tech), the Research Triangle, N.C. (UNC), the Washington, D.C. area (George Mason), Stanford University and Austin, Texas (UT) which was the final selection. The University of Texas offered land upon which they would construct a new building specifically designed for the MCC within their Austin campus. Ross Perot also offered the use of his private plane for 2 years for staff recruitment. Austin was selected as the site for MCC in 1983. Despite this purpose and the background of Inman and his senior staff, MCC accepted no government funding for many years and was a refuge for some avoiding work on Strategic Defense Initiative projects. MCC was part of the Artificial Intelligence boom of the 1980s, reportedly the single largest customer of both Symbolics and Lisp Machines, Inc. (and like Symbolics, was one of the first companies to register a .com domain). In the 1980s its major programs were packaging, software engineering, CAD, and advanced computer architectures. The latter comprised artificial intelligence, human interface, database, and parallel processing, the latter two merging in the late 1980s. Many of the early shareholder companies were mainframe computer companies under stress in the 1980s. Over the years, MCC's membership diversified to include a broad range of high-profile corporations involved in information technology products, as well as government research and development agencies and leading universities. In June, 2000 the MCC Board of Directors voted to dissolve the consortium, and the few remaining employees held a wake at Scholz's Beer Garden in Austin on October 25. Formal dissolution papers were reportedly not filed until 2004. == Spinoffs == While multiple technologies were transferred to member companies and government agencies in the final years, fourteen companies were spun out of MCC. Those spinoffs include: TeraVicta Technologies, Austin's first MEMS company; its focus was to develop microscopic switch technology for fiber optic switching and radiofrequency switching in mobile phones specifically to dynamically switch between the future 3G-4GLTE-future5G wireless communication frequencies and ensure mobile phones were communicating over the strongest wireless signal to reduce dropped calls. Robert Miracky was the founding CEO who spun out the first commercial metal micromachining technology developed by MCC researchers Brent Lunceford, Jason Reed, Richard Nelson, K.Hu, and C. Hilbert in a collaborative development program with IBM in a novel implementation and operational paradigm for solid-state integrated circuit coolers integrated with conductive MEMS switches. TeraVicta was liquidated under Chapter 7 bankruptcy proceedings in 2015. The Austin region subsequently built up a MEMS & Sensors value chain in the billions of dollars comprising companies such as 3M, Cypress Semiconductor, NXP Semiconductor, Cirrus Logic, Silicon Labs, and the Austin division of the now-defunct Silicon Valley Technology Center. Portelligent, a company that provides reverse engineering teardown services. At the time, Portelligent was the first company to commercialize such services; they had been provided by MCC to its member companies. Today, there are at least twelve companies worldwide that sell reports known as "reverse engineering teardown reports." Modern day teardown reports provide detailed information about technology products such as the bill of materials, microchip, and printed circuit board design specifics, manufacturing details including manufacturing location details for the entire value chain responsible for making electronics, including the iPhone and Samsung Galaxy smartphones. Portelligent was acquired by CMP Technology in 2007. Evolutionary Technologies International, a company focused on developing database tools and data warehousing. It was spun off from MCC in 1990.

    Read more →