AI Coding Vscode Extension

AI Coding Vscode Extension — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Coolgorilla

    Coolgorilla

    Coolgorilla was one of the earliest software developers that created 3rd party native applications for Apple iPod devices. Coolgorilla was an early adopter of using a sponsorship business model to enable mobile applications to be given away freely. Coolgorilla developed a series of Talking Phrasebooks for iPods in 2006. They partnered with online travel company lastminute.com who sponsored the applications enabling them to be made available to download completely free of charge. As mobile devices became more sophisticated, Coolgorilla developed the Talking Phrasebooks for Sony Ericsson and Nokia Mobile Devices which at the time were considerably noteworthy since the applications used real voice audio translations. With Apple's introduction of the iPhone in 2007, Coolgorilla developed a Web App before having four of the iPhone Talking Phrasebooks available to download from Apple's App Store on the day it opened in 2008. == Almanac in Chronological Order == On 23 December 2005, CoolGorilla, a new start-up, launched a trivia game for the iPod. It was titled "Rock and Pop Quiz". It was a quiz game that tested users' knowledge on bands such as U2, Metallica, Beyonce, and the Beatles. The quiz contained twenty megabytes of audible trivia questions. The free game was compatible with 3rd, 4th and 5th generation iPods, iPod mini and nano. In March 2006, Coolgorilla released "Movie Quiz for iPods" with a price of $5. It was an audio game narrated by New York's DJ Thomas, a radio and television host, voice over artist and event Master of Ceremonies. There were questions on Star Wars, Spiderman, The Godfather, Pulp Fiction, The Matrix, James Bond, and others. The user could keep track of their score. The game included a secret code for players who answered all questions correctly which enabled users to enter their name on the Coolgorilla Hall of Fame. In May 2006, Coolgorilla launched a World Cup Encyclopedia which was released prior to the 2006 FIFA World Cup. 
It had information on the World Cup schedule, details of every player from every team, every score from every World Cup game ever played, stadium details, and manager profiles. It was a free download. In June 2006, Coolgorilla released a series of iPod Phrasebooks in German, Greek, French and Spanish. They were sponsored by lastminute.com and were free. The phrasebooks included common words and phrases for tourists with 750 sound files. They were accessed through the iPod's Notes feature. In April 2007, Coolgorilla released a downloadable version of the Talking Phrasebooks for Nokia and Sony Ericsson mobile devices. French, Spanish, German, Greek, Italian, and Portuguese editions were produced. The application provided real voice translations. They initially sold for £3 but three months later were offered for free. The applications carried lastminute.com branding. Apple's iPhone was released at the end of June 2007. Soon after, Coolgorilla released an online all-in-one version of their Talking Phrasebooks for iPhone (Web App). The Phrasebooks were made available online in the form of a web app as the iPhone did not yet allow for the download of additional apps. The app provided both text and audio translations in French, Spanish, Portuguese, Italian, German, and Greek. The iPhone translated the phrases using recordings of real, native voice-over artists. A text translation was also displayed on screen. Apple's App Store opened in July 2008 with approximately 500 native apps available. Four of these apps were Coolgorilla's Talking Phrasebooks for iPhone (Native Apps), in French, German, Italian, and Spanish. These apps carried lastminute.com branding and were available for free download. In the first three weeks following their release, the phrasebooks had over 350,000 downloads. Subsequently, Dutch, Arabic, Mandarin and Cantonese editions were also released. In October 2008, Coolgorilla released an iPhone London Travel Guide. Coolgorilla was featured on NBC News in August 2009. 
In 2010, FIAT used the Italian Phrasebook to help promote the release of their FIAT 500 in the US. There has been no further activity since.

    Read more →
  • Interim Measures for the Management of Generative AI Services

    Interim Measures for the Management of Generative AI Services

    The Interim Measures for the Management of Generative AI Services (Chinese: 生成式人工智能服务管理暂行办法; pinyin: Shēngchéng shì réngōng zhìnéng fúwù guǎnlǐ zànxíng bànfǎ) are a set of regulations governing public-facing generative artificial intelligence services in China. Issued on 10 July 2023 and effective from 15 August 2023, they were China's first binding regulation specifically targeting generative AI. They have been described as among the earliest such regulations adopted by any country. The measures were jointly issued by the Cyberspace Administration of China (CAC) and six other national bodies: the National Development and Reform Commission, the Ministry of Education, the Ministry of Science and Technology, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the National Radio and Television Administration. Among the measures' most prominent requirements is that generative AI services must uphold Core Socialist Values and must not generate content that could subvert state power, harm national security, or undermine social stability. The measures also require providers of public-facing generative AI services to undergo security assessments and register their algorithms with the CAC. As of December 2025, 748 generative AI services had completed the filing process at the national level. == Background == The Interim Measures build on two earlier sets of regulations targeting specific algorithm applications. The Administrative Provisions on Algorithm Recommendation for Internet Information Services, effective from March 2022, established China's algorithm registry and required providers of recommendation algorithms with "public opinion properties or social mobilization capabilities" to file with the CAC and undergo security assessments. 
The Administrative Provisions on Deep Synthesis of Internet Information Services, effective from January 2023, extended similar requirements to algorithms used for generating synthetic media such as deepfakes. In April 2023, the CAC released a draft of the generative AI regulation for public comment. The draft included several requirements that attracted attention, including that generated content should "embody Core Socialist Values" and that training data should be "true and accurate". The public consultation period ran until May 2023. The final version, published in July 2023, was substantially revised from the draft. According to an analysis by the Future of Privacy Forum, changes appeared to reflect feedback from industry stakeholders including Baidu, Xiaomi, SenseTime, and others, as well as input from government-affiliated research institutes. The final measures adopted a more permissive tone, with the CAC describing its approach as "inclusive and prudent" (包容审慎) and emphasising "classified and graded" (分类分级) supervision. == Scope == The measures apply to services that use generative AI technology to provide text, images, audio, video, or other content to the public within mainland China (Article 2). They do not apply to organisations that develop or use generative AI internally without offering services to the domestic public, such as industry associations, enterprises, and research institutions. Overseas providers whose services are accessible to users in China are also subject to the measures. == Key provisions == === Content requirements === Article 4 sets out the core content obligations. Providers and users of generative AI services must uphold the Core Socialist Values. 
The measures prohibit generating content that incites subversion of national sovereignty or the socialist system, endangers national security or the nation's image, incites separatism, promotes terrorism or extremism, promotes ethnic hatred or discrimination, or contains violence, obscenity, or false information prohibited by law. These content prohibitions largely mirror those in Article 12 of the Cybersecurity Law and in prior regulations governing online content. Article 4 also requires that models be designed and trained to avoid discrimination, that services respect intellectual property rights, and that providers take effective measures to improve the transparency and accuracy of generated content. === Training data and labelling === Article 7 requires providers to ensure that training data is of high quality and legitimately sourced, and that it does not infringe upon intellectual property rights. Where personal information is used, consent must be obtained. The final version of this provision removed language from the draft that would have held providers responsible for the "legitimacy" of all pretraining data, replacing it with a requirement to "employ effective measures to improve the quality of training data". Article 8 requires providers to establish labelling rules for training data and to conduct quality assessments of data annotations. Article 12 requires that generated images, videos, and other synthetic content be labelled as AI-generated. === User rights and privacy === Article 11 requires providers to protect user privacy, to minimise the collection and retention of personal data, and to refrain from unlawfully sharing user information. Users have the right to request review, correction, or deletion of their personal information. Article 10 requires providers to take measures to prevent excessive dependence on or addiction to generative AI services by minors. 
=== Security assessment and algorithm filing === Article 17 requires that providers of generative AI services with "public opinion properties or the capacity for social mobilization" (具有舆论属性或者社会动员能力) carry out security assessments and complete algorithm filing procedures in accordance with the Administrative Provisions on Algorithm Recommendation for Internet Information Services. == Implementation == === Algorithm filing process === In practice, the filing requirements under the Interim Measures have developed into a two-tier process. The first tier is the standard algorithm filing (算法备案) under the pre-existing Algorithm Recommendation Provisions, which involves submitting information about an algorithm's design, purpose, and data sources to the CAC. This process is primarily a registration mechanism. For public-facing generative AI products, there is an additional, more rigorous process commonly referred to as the "large model filing" (大模型备案). This involves submitting a security self-assessment report, data annotation rules, a keyword blocking list, and evaluation test question sets. The process includes technical testing at the provincial level, followed by review at the national CAC level. The algorithm filing targets specific algorithms, while the large model filing evaluates the broader system architecture, training data, model parameters, and potential social impact. The CAC publishes lists of generative AI services that have successfully completed the filing process. The first such list was published on 2 April 2024. According to the CAC's year-end announcements, 302 generative AI services had completed national-level filing by the end of 2024 (of which 238 were new that year), alongside 105 applications that completed local-level registration. By the end of 2025, the cumulative total had risen to 748 national-level filings and 435 local-level registrations. 
=== Content compliance and testing === According to the Carnegie Endowment, the CAC has conducted compliance audits of generative AI services with a particular focus on ensuring appropriate responses to queries about politically sensitive topics. The large model filing process requires providers to pass both provincial-level and national-level technical testing before their services can be made available to the public. On 1 March 2024, the National Technical Committee 260 on Cybersecurity (TC260) published TC260-003, the Basic Security Requirements for Generative AI Services (生成式人工智能服务安全基本要求), a technical standard that provides detailed guidance on the security assessments required under the Interim Measures. The standard covers requirements for training data safety, model security, and content safety evaluation, and is used as a reference for the filing process. == Analysis == === Relationship to broader Chinese internet regulation === The content requirements in the Interim Measures extend China's existing framework for online information control to generative AI. Legal scholars have noted that the "Core Socialist Values" provision and the specific content prohibitions are consistent with longstanding requirements imposed on internet platforms under the Cybersecurity Law and related regulations. The Asia Society Policy Institute has described the Chinese government's highest regulatory priority in this area as retaining control of information, noting that content-related obligations receive stricter enforcement than other provisions. === Nature of the filing system === The character of the filing system has been debated by scholars. Angela Huyue Zh

    Read more →
  • Stable Diffusion

    Stable Diffusion

    Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing AI boom. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt. Its development involved researchers from the CompVis Group at LMU Munich and Runway with a computational donation from Stability and training data from non-profit organizations. Stable Diffusion is a latent diffusion model, a kind of deep generative artificial neural network. Its code and model weights have been released publicly, and an optimized version can run on most consumer hardware equipped with a modest GPU with as little as 2.4 GB VRAM. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services. == Development == Stable Diffusion originated from a project called Latent Diffusion, developed in Germany by researchers at LMU Munich in Munich and Heidelberg University. Four of the original 5 authors (Robin Rombach, Andreas Blattmann, Patrick Esser and Dominik Lorenz) later joined Stability AI and released subsequent versions of Stable Diffusion. The technical license for the model was released by the CompVis group at LMU Munich. Development was led by Patrick Esser of Runway and Robin Rombach of CompVis, who were among the researchers who had earlier invented the latent diffusion model architecture used by Stable Diffusion. Stability AI also credited EleutherAI and LAION (a German nonprofit which assembled the dataset on which Stable Diffusion was trained) as supporters of the project. 
== Technology == === Architecture === Diffusion models, introduced in 2015, are trained with the objective of removing successive applications of Gaussian noise on training images, which can be thought of as a sequence of denoising autoencoders. The name comes from thermodynamic diffusion, since the models were first developed with inspiration from thermodynamics. Models in the Stable Diffusion series before SD 3 all used a variant of diffusion model called the latent diffusion model (LDM), developed in 2021 by the CompVis (Computer Vision & Learning) group at LMU Munich. Stable Diffusion consists of three parts: the variational autoencoder (VAE), the U-Net, and an optional text encoder. The VAE encoder compresses the image from pixel space to a lower-dimensional latent space, capturing a more fundamental semantic meaning of the image. Gaussian noise is iteratively applied to the compressed latent representation during forward diffusion. The U-Net block, composed of a ResNet backbone, denoises the output from forward diffusion backwards to obtain a latent representation. Finally, the VAE decoder generates the final image by converting the representation back into pixel space. The denoising step can be flexibly conditioned on a string of text, an image, or another modality. The encoded conditioning data is exposed to the denoising U-Nets via a cross-attention mechanism. For conditioning on text, the fixed, pretrained CLIP ViT-L/14 text encoder is used to transform text prompts to an embedding space. Researchers point to increased computational efficiency for training and generation as an advantage of LDMs. With 860 million parameters in the U-Net and 123 million in the text encoder, Stable Diffusion is considered relatively lightweight by 2022 standards, and unlike other diffusion models, it can run on consumer GPUs, and even on CPU alone when using the OpenVINO version of Stable Diffusion. 
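The forward diffusion step described above, in which Gaussian noise is applied iteratively to the compressed latent, can be illustrated with a minimal sketch. It is not Stable Diffusion's actual code: the update rule is the standard DDPM formulation, taken here as an assumption, and the linear noise schedule, the function names, and the toy latent size are all hypothetical choices made for illustration.

```python
import math
import random


def linear_betas(steps, start=1e-4, end=0.02):
    """Hypothetical linear noise schedule; the endpoint values are illustrative."""
    if steps == 1:
        return [start]
    return [start + (end - start) * i / (steps - 1) for i in range(steps)]


def forward_diffusion(latent, betas, rng):
    """Apply Gaussian noise step by step to a flat list of latent values.

    Each step uses the standard DDPM update (an assumption, not taken
    from the text above):
        x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise
    """
    x = list(latent)
    for beta in betas:
        keep = math.sqrt(1.0 - beta)    # how much of the previous latent is kept
        spread = math.sqrt(beta)        # how much fresh noise is mixed in
        x = [keep * v + spread * rng.gauss(0.0, 1.0) for v in x]
    return x


def signal_fraction(betas):
    """Scale of the original latent that survives all steps: sqrt(prod(1 - beta_t))."""
    return math.sqrt(math.prod(1.0 - b for b in betas))


rng = random.Random(0)
latent = [1.0] * 1024            # toy stand-in for a flattened VAE latent
betas = linear_betas(1000)
noisy = forward_diffusion(latent, betas, rng)

mean = sum(noisy) / len(noisy)
var = sum((v - mean) ** 2 for v in noisy) / len(noisy)
print(f"surviving signal scale: {signal_fraction(betas):.4f}")
print(f"noisy latent mean={mean:.3f}, variance={var:.3f}")
```

After enough steps almost none of the original latent survives and the values are close to unit-variance Gaussian noise. That is the starting point from which the U-Net's learned denoising works backwards.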
==== SD XL ==== The XL version uses the same LDM architecture as previous versions, only larger: it has a larger UNet backbone, a larger cross-attention context, and two text encoders instead of one, and it was trained on multiple aspect ratios (not just the square aspect ratio like previous versions). The SD XL Refiner, released at the same time, has the same architecture as SD XL, but it was trained for adding fine details to preexisting images via text-conditional img2img. ==== SD 3.0 ==== The 3.0 version completely changes the backbone. Instead of a UNet, it uses a Rectified Flow Transformer, which implements the rectified flow method with a Transformer. The Transformer architecture used for SD 3.0 has three "tracks", for original text encoding, transformed text encoding, and image encoding (in latent space). The transformed text encoding and image encoding are mixed during each transformer block. The architecture is named "multimodal diffusion transformer" (MMDiT), where "multimodal" means that it mixes text and image encodings inside its operations. This differs from previous versions of DiT, where the text encoding affects the image encoding, but not vice versa. === Training data === Stable Diffusion was trained on pairs of images and captions taken from LAION-5B, a publicly available dataset derived from Common Crawl data scraped from the web, where 5 billion image-text pairs were classified based on language and filtered into separate datasets by resolution, a predicted likelihood of containing a watermark, and predicted "aesthetic" score (e.g. subjective visual quality). The dataset was created by LAION, a German non-profit which receives funding from Stability AI. The Stable Diffusion model was trained on three subsets of LAION-5B: laion2B-en, laion-high-resolution, and laion-aesthetics v2 5+. 
A third-party analysis of the model's training data identified that out of a smaller subset of 12 million images taken from the original wider dataset used, approximately 47% of the sample size of images came from 100 different domains, with Pinterest taking up 8.5% of the subset, followed by websites such as WordPress, Blogspot, Flickr, DeviantArt and Wikimedia Commons. An investigation by Bayerischer Rundfunk showed that LAION's datasets, hosted on Hugging Face, contain large amounts of private and sensitive data. === Training procedures === The model was initially trained on the laion2B-en and laion-high-resolution subsets, with the last few rounds of training done on LAION-Aesthetics v2 5+, a subset of 600 million captioned images which the LAION-Aesthetics Predictor V2 predicted that humans would, on average, give a score of at least 5 out of 10 when asked to rate how much they liked them. The LAION-Aesthetics v2 5+ subset also excluded low-resolution images and images which LAION-5B-WatermarkDetection identified as carrying a watermark with greater than 80% probability. Final rounds of training additionally dropped 10% of text conditioning to improve Classifier-Free Diffusion Guidance. The model was trained using 256 Nvidia A100 GPUs on Amazon Web Services for a total of 150,000 GPU-hours, at a cost of $600,000. === Limitations === Stable Diffusion has issues with degradation and inaccuracies in certain scenarios. Initial releases of the model were trained on a dataset that consists of 512×512 resolution images, meaning that the quality of generated images noticeably degrades when user specifications deviate from its "expected" 512×512 resolution; the version 2.0 update of the Stable Diffusion model later introduced the ability to natively generate images at 768×768 resolution. Another challenge is in generating human limbs due to poor data quality of limbs in the LAION database. 
The model is insufficiently trained to replicate human limbs and faces due to the lack of representative features in the database, and prompting the model to generate images of such type can confound the model. In addition to human limbs, Stable Diffusion is unable to generate legible ambigrams and some other forms of text and typography. Stable Diffusion XL (SDXL) version 1.0, released in July 2023, introduced native 1024x1024 resolution and improved generation for limbs and text. Accessibility for individual developers can also be a problem. In order to customize the model for new use cases that are not included in the dataset, such as generating anime characters ("waifu diffusion"), new data and further training are required. Fine-tuned adaptations of Stable Diffusion created through additional retraining have been used for a variety of different use-cases, from medical imaging to algorithmically generated music. However, this fine-tuning process is sensitive to the quality of new data; low resolution images or different resolutions from the original data can not only fail to learn the new task but degrade the overall performance of the model. Even when the model is additionally trained on high quality images, it is difficult for individuals to run models in consumer electronics. For example, the training process for waifu-diffusion requires a minimum 30 GB of VRAM, which exceeds the usual resource provided in such consumer GPUs as Nvidia's GeForce 30 series, w

    Read more →
  • Someday (short story)

    Someday (short story)

    "Someday" is a science fiction short story by American writer Isaac Asimov. It was first published in the August 1956 issue of Infinity Science Fiction and reprinted in the collections Earth Is Room Enough (1957), The Complete Robot (1982), Robot Visions (1990), and The Complete Stories, Volume 1 (1990). == Plot summary == The story is set in a future where computers play a central role in organizing society. Humans are employed as computer operators, but they leave most of the thinking to machines. Indeed, whilst binary programming is taught at school, reading and writing have become obsolete. The story concerns a pair of boys who dismantle and upgrade an old Bard, a child's computer whose sole function is to generate random fairy tales. The boys download a book about computers into the Bard's memory in an attempt to expand its vocabulary, but the Bard simply incorporates computers into its standard fairy tale repertoire. The story ends with the boys excitedly leaving the room after deciding to go to the library to learn "squiggles" (writing) as a means of passing secret messages to one another. As they leave, one of the boys accidentally kicks the Bard's on switch. The Bard begins reciting a new story about a poor mistreated and often ignored robot called the Bard, whose sole purpose is to tell stories, which ends with the words: "the little computer knew then that computers would always grow wiser and more powerful until someday—someday—someday—…"

    Read more →
  • Couch to 5K

    Couch to 5K

    Couch to 5K, abbreviated C25K, is an exercise plan that gradually progresses from beginner running toward a 5 kilometre (3.1 mile) run over nine weeks. == Operations == The Couch to 5K running plan was created by Josh Clark in 1996 to give new runners a structured way to start running. The plan has users work out for 20 to 30 minutes, three days a week. Workouts consist of intervals of running with periods of short walks in between, which help build endurance in the weeks leading up to the final goal of a 5K run. During the nine weeks leading up to the race, runners learn to set their own pace and to identify their strengths and weaknesses. Daily workouts often start with a five-minute warm-up walk, and the plan works up to running five kilometres without a walking break within nine weeks. Users are not expected to have any running experience, and for some the plan is the first running they ever do. The main goal is to turn an inexperienced runner into someone who can run a 5K. Clark started the website Kick and featured C25K on the site. In 2001, Kick merged with Cool Running, a New England–based running site. Clark later sold his stake in Cool Running and the Couch to 5K program. Cool Running was absorbed into Active.com, operated by Active Network, LLC. Active Network provides mobile apps for Couch to 5K, as well as 5K to 10K, a follow-up program. The NHS in the UK provides downloadable podcasts and a smartphone app (Android and iOS) for the plan. A mobile app created by Zen Labs has training plans that are based on the Couch to 5K running plan from CoolRunning.com. It is one of the highest-rated health and fitness apps available on Android and iOS. As of 2016, the C25K app had been used by over 5 million people.

    Read more →
  • Emi Kusano

    Emi Kusano

    Emi Kusano (Japanese: 草野 絵美, Hepburn: Kusano Emi; born August 4, 1990) is a Tokyo-based Japanese multidisciplinary artist known for creating photography, video, and installations using generative AI technology. Her work explores themes of nostalgia, pop culture, and collective memory. She is recognized as one of the early practitioners of generative AI art. Her work has been exhibited at the 21st Century Museum of Contemporary Art, Kanazawa, and screened at the M+ Museum’s Asian Avant-Garde Film Festival. Additionally, she has participated in prestigious international art fairs, including Paris Photo and Art Basel Hong Kong. In 2025, she was named one of the World Economic Forum's Young Global Leaders. In 2026, she was selected as a fellow for the AI x Arts Fellowship at Mohamed bin Zayed University of Artificial Intelligence. Kusano serves as a part-time lecturer at the Tokyo University of the Arts and is the producer and vocalist for the synthwave music unit Satellite Young. == Early life == === Photography === Kusano was born and raised in Tokyo. Her career began during her high school years, before 2008, when she became involved in street fashion photography. Her photographs, primarily taken in Harajuku, were published in "Japanese Streets", "Metropolis", CNN's travel guide magazine "CNN GO", and "WGSN". Her photography was exhibited at the FIT Museum in New York and the Victoria and Albert Museum in London. == Career == === Music and Installation work === Since 2014, in collaboration with BelleMaison Sekine, Kusano has led "Satellite Young," a synthwave music unit. As the lead vocalist, she blends 1980s idol culture with lyrics that tackle contemporary issues such as planned obsolescence ("Sony Timer"), online dating, artificial intelligence, and social media. Their music, known for its conceptual depth, has earned international niche recognition. 
"Satellite Young" has participated in music festivals, including "South by Southwest," showcasing their fusion of retro aesthetics and modern critiques. In 2018, she was selected to participate in "Art Hack Day," an interdisciplinary art hackathon held at The National Museum of Emerging Science and Innovation, where she presented "Singing Dream," a karaoke machine endowed with artificial life, earning the Jury Prize. "Instababy Generator," a 2019 installation co-created with Junichi Yamaoka, explored the concept of designer babies and received recognition at the SIGGRAPH Art Gallery. In October 2020, operating under the name Emi Satellite, she debuted as a solo singer with her first single "Glass Ceiling," an empowerment anthem that addresses the challenges faced by women and encourages progress towards the future. In the music video for this song, strong women rewrite the roles of protagonists in a bishōjo game, a type of dating simulation game. This concept later served as a prototype for Shinsei Galverse. === Challenge for Blockchain Art === In 2021, she explored the financial world through her single "IPO" and entered the NFT space with "Love Is an IPO," her first NFT work on Ethereum, sold on Foundation. In April 2022, she co-founded the crowdfunded anime project "Shinsei Galverse" with Ayaka Ohira, Devin Mancuso, and Jack Baldwin, serving as one of the executive directors overseeing the creative direction and story. The project's NFT collection of 8,888 ranked #1 on OpenSea's "Top NFTs" for several days, marking one of Japan's first globally successful blockchain art projects. In 2023, Shinsei Galverse produced the official "I like u" music video by Grammy-nominated singer Tove Lo as an initial anime endeavor. Kusano also contributed to discussions on Web3.0 and blockchain technology as a panelist in seminars organized by the Digital Agency of Japan. 
=== AI art === In May 2023, Kusano's first AI art collection, "Neural Fad," depicting imaginary fashion history, sold out its 100 pieces within 24 hours at "Bright Moments Tokyo". In June, she created WWDJAPAN's first AI-generated magazine cover using her own face. It is the first AI cover in Japanese fashion media. She was also appointed to the Cultural Affairs Agency's Copyright Subcommittee, where she participates in discussions on generative AI and copyright. Her "Synthetic Reflections" self-portrait series debuted on SuperRare, with the first piece auctioned for 3.5 ETH (equivalent to 6,480 US dollars at the time). In July 2023, she co-exhibited a 3D AI-generated dress at Christie's "Future Frequencies" auction with Gucci, alongside Claire Silver. In September, her 30-piece "Pixelated Perception" exhibit at Art Blocks Marfa explored 1990s media and gender; it was also showcased at the 21st Century Museum of Contemporary Art, Kanazawa. In December, her "Techno-Animism" AI art collection fused Japanese animism with technology. Collaborating with a U.S. gallery, she unveiled the collection during a two-week Art Basel world tour, selling a total of 336 pieces for 11.2 ETH (equivalent to 21,264 US dollars at the time). === Generative art === In February 2024, the generative art platform Art Blocks selected the work "Melancholic Magical Maiden" for its Curated category. This piece reconstructs the aesthetics of 1990s magical girl anime, offering a critique of past anime heroines. It sold out within an hour, with all 300 pieces going for a total of 57 ETH (equivalent to approximately 215,385 US dollars at the time). In April 2024, Emi Kusano spoke at the Standing Committee on Copyright and Other Rights at the World Intellectual Property Organization (WIPO) in Geneva, Switzerland, where she presented AI-specific information for discussion. 
== Style and technique == Kusano draws inspiration from Japanese retro-futurism as a foundation for her artwork, which explores the cutting-edge of technology. This approach is fueled by nostalgia for the pre-internet era, specifically the postwar period when Japanese mass media held significant sway. By blending modern technology with retro-culture, she captures the complex feelings of love, hate, and ambivalence towards present and future accelerationism. While at university, Kusano was profoundly influenced by Naoki Sakai, the industrial designer responsible for igniting the retro-futurism movement. In her musical project "Satellite Young", Kusano dons the persona of an '80s female idol and sings about contemporary technology. In her installation piece "Singing Dream", she investigates the concept of an artificial life form inhabiting a karaoke machine, which has been popular since the 1980s, compelling people to sing. In the collaborative NFT art project "Shinsei Galverse", Kusano reimagines a cyberpunk anime primarily featuring female characters, incorporating elements of magical girls popular in the early Heisei period. == Personal life == Kusano has two sons. In August 2021, she minted her older son Zombie Zoo Keeper's pixel art on "OpenSea" as part of his summer research project. The artwork was purchased by notable figures including Brud CEO Trevor McFedries and Steve Aoki, who bought the piece for the equivalent of 21.82 thousand US dollars, highlighting the intersection of art, technology, and family in her work.

    Read more →
  • Gundam Build Divers Re:Rise

    Gundam Build Divers Re:Rise

    Gundam Build Divers Re:Rise (Japanese: ガンダムビルドダイバーズRe:RISE, Hepburn: Gandamu Birudo Daibāzu Re:Raizu) is a Japanese original net animation anime series produced by Sunrise Beyond, and the fourth series within the Gundam Build Series sub-series. A sequel to the 2018 anime Gundam Build Divers, it is the first Gundam anime series to be released in the Reiwa period, released to celebrate the franchise's 40th anniversary. The series is directed by Shinya Watada and written by Yasuyuki Muto. Initially announced at the Gundam 40th anniversary video, the series aired on its Gundam Channel YouTube channel from October 10 to December 26, 2019. A TV airing of the ONA began on BS11 on October 12, 2019, and on January 28, 2020, on Tokyo MX. A second season aired from April 9 to August 27, 2020. Two spinoffs of the series were later serialized in Kadokawa's Gundam Ace magazine and Hobby Japan. == Plot == Two years have passed since the EL-Diver Incident, an event that almost destroyed the Gunpla Battle Nexus Online (GBN) game until it was resolved by the force group known as "Build Divers", and soon after more EL-Divers were discovered. In order to make the game more secure, a newer version of the game was rolled out in order to prevent the same incident from happening again and with newer experiences that would make the gameplay more immersive to players. The story focuses on Hiroto Kuga, a high schooler who is a rogue mercenary Gunpla Diver in GBN, who goes in the game and wanders throughout its countless dimensions while helping out other Divers whether it is on insistence or by hire. Despite his selfless act, he chooses to remain unaffiliated with anyone and refuses rewards and Force (Diver parties) group invites, isolating himself from other people even in real life. His primary goal as a Diver is to be reunited with a mysterious girl from his past named Eve, who was in fact the very first EL-Diver to appear in the game. 
But after a special request mission, Hiroto is united with three other active Divers in a strange world named "Eldora" and forms the Force group "BUILD DiVERS" in what appears to be just another GBN gamespace event, until they learn the truth about Eldora and its consequences not only for GBN, but for the entire world. == Characters == === BUILD DiVERS === Hiroto Kuga (クガ・ヒロト, Kuga Hiroto) / Hiroto (ヒロト, Hiroto) Voiced by: Chiaki Kobayashi (Japanese); Billy Kametz (English) The main protagonist of the series and a high-school builder, veteran diver, and a former ace member of the Force group Avalon, who lives in Yokohama. He was one of the first minors to make it to the deep end of GBN, due to his conviction of being a person who does his best to help others. He was active prior to and during the events of the previous series. Now working as a rogue diver for hire after leaving Avalon, he wanders the GBN gamespace alone, harboring regrets and resentments and suffering from trauma after the death of his close friend and lover, the EL-Diver Eve. He is very calm and a man of few words, usually refusing others' rewards and help, especially invitations to join other forces, but this stoic persona is a mental mask to hide his condition from everyone, including his parents. But when a special mission done by Freddie united him with Kazami, May and Parviz, they accidentally formed the force team named "BUILD DiVERS" to protect the Eldorans from the One-Eyes army. Currently he is the ace of his unit and the leader of the overall force. Hiroto uses the PFF-X7 Core Gundam as his main Gunpla, based on the RX-78-2 Gundam from the original Mobile Suit Gundam series. Its special armament system is the "core-change" gimmick, and the first theme he devised from that gimmick is the "Planets System". This allows the Core Gundam to be equipped with various types of armor and weapons, each suited to a different situation and named after one of the eight planets. 
Hiroto later upgrades his Gunpla into the PFF-X7II Core Gundam II. This new Core Gundam can transform into the "Core Flyer", in a similar fashion to the original Gundam's FF-X7 Core Fighter for increased mobility and like its predecessor, it can also use the Planets System: Earth Armor (PFF-X7/E3 Earthree Gundam): Core Gundam's default blue armor, focused on traditional all-around combat. Mars Armor (PFF-X7/M4 Marsfour Gundam): A red armor whose focus is on fragments of four styles of close combat, hence "Cross-Combat". Venus Armor (PFF-X7/V2 Veetwo Gundam): A green armor whose focus is commando style ranged and bombardment combat, additionally with option works. Mercury Armor (PFF-X7/M1 Mercuone Gundam): A navy armor whose focus is underwater combat. Jupiter Armor (PFF-X7/J5 Jupitive Gundam): A white armor whose focus is fast orbital combat. Uranus Armor (PFF-X7II/U7 Uraven Gundam): An indigo armor focused on reconnaissance and high powered sniping. Saturn Armor (PFF-X7II/S6 Saturnix Gundam): An orange armor focused in demolition style close combat without beam weapons, originally developed to counter Gundam Frames. Neptune Armor (PFF-X7II/N8 Nepteight Gundam): An aqua-green armor equipped with a customized Volture Lumiere system similar to the one from Mobile Suit Gundam SEED C.E. 73: Stargazer, intended to be used for traveling through GBN's space in a short amount of time, but was used for launching into orbit instead of maneuvering in deep space. It is ultimately discarded in Eldora's orbit due to the strain of leaving Eldora's gravitational field. Pluto Armor (PFF-X7II+/P9 Plutine Gundam): Appearing only on Gundam Build Metaverse, the black colored armor is used for close combat and dueling purposes with its color scheme reminiscent of that of EcoPla. 
PFF-X7II/BUILD DiVERS Re:Rising Gundam: A special combination of the Core Gundam II with the WoDom Pod + and parts from the Gundam Aegis Knight and the EX Valkylander, armed with two giant beam sabers, eight miracle wings born from Eve's blessings, and the "Grand Cross Cannon", Hiroto's first special move, made with the help of his team. In one occasion, Hiroto changes his avatar to a Haro to pilot the Mobile Builder Haro Loader to help with the repairs on Cuadorn by making a prosthetic wing out of gunpla parts. During the Gunpla Battle Royal, he pilots an unmodified ASW-G-08 Gundam Barbatos Lupus Rex from Mobile Suit Gundam: Iron-Blooded Orphans. In Battlelogue, it is revealed that he has made a second Core Gundam II that he leaves on Eldora with the colors of the Gundam MK-II Titan. Another variant of this Gunpla sports the old "Gundam G3" colors with his team's personal crest, which is most likely to represent Sarah since the color of her hair, eyes, and dress embody Hiroto's time with Eve before they joined Avalon and to symbolize how he has officially befriended the original Build Divers. Each of the two units have unique advancements, the Titan color specializes in ground and underwater combat and the G3 color specializes in aerial and space combat. May (メイ, Mei) Voiced by: Mai Fuchigami (Japanese); Lauren Landa (English) A seemingly late teens female diver who prefers to play solo, she is a very calm and no-nonsense girl whose interest is in battles alone. However, she is not a fan of those who engage their opponents head on and prefers to implement a strategic approach. She is mature and has a strong sense of justice, and can be impulsive rushing into situations, especially for those in danger. Later in the series, she is revealed to be one of the 87 EL-Divers, however she was not one of those who were saved after the EL-Diver incident two years ago, she was born shortly after. 
After she was born she was given her own Mobile Doll body similar to Sarah, that is when she first met her, Koichi, Tsukasa, and Nanami. During the Lotus Challenge Eldoran style rehearsal battle it is revealed that she, as a new sister of Sarah, addresses the latter as the older since Sarah is chronologically older, regardless of her maturity. In the final episode, she is revealed to have been born with the remnant data originating from Eve, the first born EL-Diver who Hiroto befriended and fell in love with several years ago, and carries Eve's earring on her armband. In Battlelogue, it's implied that she is currently living with Hiroto IRL and in GBN is his attendant. May uses the JMA0530-MAY WoDom Pod as her main Gunpla, which is a customized JMA-0530 Walking Dome from Turn A Gundam. In the later episodes, the mobile suit is revealed to be a disguise for its true form, the HER-SELF Mobile Doll May. May later upgrades her WoDom Pod into the JMA0530-MAYBD WoDom Pod +. During the Gunpla Battle Royal, she uses her Mobile Doll (albeit with a new color scheme and the Gundam Base logo) along with an unmodified NZ-999 II Neo Zeong mobile armor from Mobile Suit Gundam Narrative. Kazami Torimachi (トリマチ・カザミ, Torimachi Kazami) / Kazami (カザミ, Kazami) Voiced by: Masaaki Mizunaka (Japanese); Ray Chase (English) A diver who was a former member of the diver group "Mu Dish". He is a very energet

    Read more →
  • Human-based evolutionary computation

    Human-based evolutionary computation

    Human-based evolutionary computation (HBEC) is a set of evolutionary computation techniques that rely on human innovation. == Classes and examples == Human-based evolutionary computation techniques can be classified into three more specific classes analogous to ones in evolutionary computation. There are three basic types of innovation: initialization, mutation, and recombination. The classes differ in which of these types of human innovation they support, as described below. All three classes also have to implement selection, performed either by humans or by computers. === Human-based selection strategy === Human-based selection strategy is the simplest human-based evolutionary computation procedure. It is used heavily today by websites outsourcing collection and selection of the content to humans (user-contributed content). Viewed as evolutionary computation, their mechanism supports two operations: initialization (when a user adds a new item) and selection (when a user expresses preference among items). The website software aggregates the preferences to compute the fitness of items so that it can promote the fittest items and discard the worst ones. Several methods of human-based selection were analytically compared in studies by Kosorukoff and Gentry. Because the concept seems too simple, most of the websites implementing the idea can't avoid a common pitfall: informational cascades when soliciting human preferences. For example, digg-style implementations, pervasive on the web, heavily bias subsequent human evaluations toward prior ones by showing how many votes the items already have. This makes the aggregated evaluation depend on a very small initial sample of rarely independent evaluations. It also encourages many people to game the system, which might add to digg's popularity but detracts from the quality of the featured results. 
It is too easy to submit an evaluation in a digg-style system based only on the content title, without reading the actual content that is supposed to be evaluated. A better example of a human-based selection system is Stumbleupon. In Stumbleupon, users first experience the content (stumble upon it), and can then submit their preference by pressing a thumb-up or thumb-down button. Because the user doesn't see the number of votes given to the site by previous users, Stumbleupon can collect a relatively unbiased set of user preferences, and thus evaluate content much more precisely. === Human-based evolution strategy === In this context and maybe generally, the Wikipedia software is the best illustration of a working human-based evolution strategy wherein the (targeted) evolution of any given page comprises the fine tuning of the knowledge base of such information that relates to that page. Traditional evolution strategy has three operators: initialization, mutation, and selection. In the case of Wikipedia, the initialization operator is page creation, and the mutation operator is incremental page editing. The selection operator is less salient. It is provided by the revision history and the ability to select among all previous revisions via a revert operation. If the page is vandalised and no longer a good fit to its title, a reader can easily go to the revision history and select one of the previous revisions that fits best (hopefully, the previous one). This selection feature is crucial to the success of Wikipedia. An interesting fact is that the original wiki software was created in 1995, but it took at least another six years for large wiki-based collaborative projects to appear. Why did it take so long? One explanation is that the original wiki software lacked a selection operation and hence couldn't effectively support content evolution. The addition of revision history and the rise of large wiki-supported communities coincide in time. 
From an evolutionary computation point of view, this is not surprising: without a selection operation the content would undergo an aimless genetic drift and would be unlikely to be useful to anyone. That is what many people expected from Wikipedia at its inception. However, with a selection operation, the utility of content has a tendency to improve over time as beneficial changes accumulate. This is what actually happens on a large scale in Wikipedia. === Human-based genetic algorithm === Human-based genetic algorithm (HBGA) provides a means for a human-based recombination operation (a distinctive feature of genetic algorithms). The recombination operator brings together highly fit parts of different solutions that evolved independently. This makes the evolutionary process more efficient.
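
The three operators of the human-based evolution strategy described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not how any wiki engine is actually implemented: the `Page` class, `create_page`, `edit`, and `revert` are hypothetical names, and the "humans" are simulated by direct function calls.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical wiki-style page whose content evolves through human edits."""
    title: str
    revisions: list = field(default_factory=list)

    @property
    def current(self) -> str:
        # The latest revision is what readers see.
        return self.revisions[-1]

    def edit(self, new_text: str) -> None:
        # Mutation: a human contributes an incremental change.
        self.revisions.append(new_text)

    def revert(self, index: int) -> None:
        # Selection: a human picks an earlier revision as the best fit.
        # The chosen text is re-appended, so the history itself is never lost.
        self.revisions.append(self.revisions[index])


def create_page(title: str, text: str) -> Page:
    # Initialization: a human creates the first version of the page.
    return Page(title, [text])


page = create_page("Anthrobotics", "Anthrobotics studies human-like robots.")
page.edit("Anthrobotics studies human-like robots and their design.")
page.edit("BUY CHEAP WATCHES")  # vandalism: a harmful mutation
page.revert(1)                  # a reader selects the last good revision
print(page.current)
```

Without `revert`, the vandalised text would simply remain the current revision, which is the "aimless drift" the text attributes to wiki software lacking a selection operation.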

    Read more →
  • Anthrobotics

    Anthrobotics

    Anthrobotics is the science of developing and studying robots that are either entirely or in some way human-like. The term anthrobotics was originally coined by Mark Rosheim in a paper entitled "Design of An Omnidirectional Arm" presented at the IEEE International Conference on Robotics and Automation, May 13–18, 1990, pp. 2162–2167. Rosheim says he derived the term from "...Anthropomorphic and Robotics to distinguish the new generation of dexterous robots from its simple industrial robot forebears." The word gained wider recognition as a result of its use in the title of Rosheim's subsequent book Robot Evolution: The Development of Anthrobotics, which focussed on facsimiles of human physical and psychological skills and attributes. However, a wider definition of the term anthrobotics has been proposed, in which the meaning is derived from anthropology rather than anthropomorphic. This usage includes robots that respond to input in a human-like fashion, rather than simply mimicking human actions, thus theoretically being able to respond more flexibly or to adapt to unforeseen circumstances. This expanded definition also encompasses robots that are situated in social environments with the ability to respond to those environments appropriately, such as insect robots, robotic pets, and the like. Anthrobotics is now taught at some universities, encouraging students not only to design and build robots for environments beyond current industrial applications, but also to speculate on the future of robotics that are embedded in the world at large, as mobile phones and computers are today. 
In 2016, philosopher Luis de Miranda created the Anthrobotics Cluster at the University of Edinburgh to explore the symbiotic relationship between humans and automated protocols. It is described as "a platform of cross-disciplinary research that seeks to investigate some of the biggest questions that will need to be answered" on the relationship between humans, robots and intelligent systems, and as "a think tank on the social spread of robotics, and also how automation is part of the definition of what humans have always been".

    Read more →
  • Abu Dhabi Autonomous Racing League

    Abu Dhabi Autonomous Racing League

    The Abu Dhabi Autonomous Racing League (A2RL) is an autonomous racing league based in Abu Dhabi and organized by ASPIRE, part of the UAE government's Advanced Technology Research Council. It has three distinct categories: the car race, the drone race, and the buggy race. The first car race was held on 27 April 2024 at the Yas Marina Circuit, marking the first major autonomous formula race outside the US since the now-folded Roborace championship. The first drone race was held on 11 and 12 April 2025. == Formats == A2RL has three distinct formats: the formula racing format (dubbed the Car Race), the quadcopter drone racing format (dubbed the Drone Race), and the off-road dune buggy racing format (dubbed the Buggy Race). === Car Race === A2RL's main event, the car race is a standard formula racing format with self-driving formula cars. The cars are made by Dallara and are modified versions of Super Formula cars with Yokohama tires. These cars had the CPUs of their AIs mounted where the driver's seat is on a non-modified chassis, as well as hydraulic actuators for AI control of the vehicle, multiple sensor systems including LIDAR and GPS, and a large LED indicator showing the status of the AI. The first car race, held on 27 April 2024, was marked by the cars' subpar performance: of the four cars that qualified, only two finished the race. The next race was held on 15 November 2025, with 11 teams. 
==== Technical specifications ==== The full list of technical specifications is as follows:
Chassis: Dallara EAV24 (modified Dallara SF23)
Forward suspension: Pushrod type, torsion bar spring, adjustable dampers, third element
Rear suspension: Pushrod type, torsion bar, coil springs, adjustable dampers, third element
Tires: Yokohama Advan
Drive-by-wire system: Provided by Meccanica 42, the DBW system consists of steering and brake actuators, with a central ECU that coordinates the driving actions and reacts to any critical situation in real time.
Brakes: Brembo calipers, Brembo carbon discs, electro-hydraulically activated
Engine: 4 Piston Racing K20C1 (based on a Honda 2.0 L turbocharged 4-cylinder engine)
Gearbox: 3MO 6-speed gearbox
Sensor suite: 7x Sony IMX728 cameras, 4x ZF ProWave radar units, 3x Seyond Falcon Kinetic lidar units
Main computer: Neousys RGS-8805GC
==== Races held ==== === Drone Race === Created in partnership with the Drone Champions' League, the drone race is the quadcopter drone racing format of the A2RL. The first race was held on 11 and 12 April 2025 at the ADNEC Marina Hall. Ten teams were scheduled to take part. === Buggy Race === The buggy race will be the off-road format of the A2RL, using self-driving dune buggies. No date or number of teams has been announced for the first race. === Other events === A2RL is known to host AI vs AI and Human vs AI events, in Abu Dhabi and abroad. One such event took place at the Suzuka Circuit in Japan. The Human vs AI race did not go ahead because the AI car "Yalla" crashed into the wall during the formation lap. == Team lists ==

    Read more →
  • Ideogram (text-to-image model)

    Ideogram (text-to-image model)

    Ideogram is a freemium text-to-image model developed by Ideogram, Inc. using deep learning methodologies to generate digital images from natural language descriptions known as prompts. Compared to other text-to-image models, it is better at generating legible text in images. == History == Ideogram was founded in 2022 by Mohammad Norouzi, William Chan, Chitwan Saharia, and Jonathan Ho to develop a better text-to-image model. It was first released with its 0.1 model on August 22, 2023, after receiving $16.5 million in seed funding in a round led by Andreessen Horowitz and Index Ventures. In February 2024, Ideogram raised $80 million after its 1.0 model release in the same year. In August 2024, Ideogram released its 2.0 model. This model offers several styles, such as realistic, design, 3D, and anime, and is better at generating text. In February 2025, Ideogram released its 2a model. This model was designed for speed and optimized for graphic design and photography generation. In March 2025, Ideogram released its 3.0 model. This model has improved realism and understanding of complex text layout, although like other generative AI models, it still struggles with ambigram creation.

    Read more →
  • Akoma Ntoso

    Akoma Ntoso

    Akoma Ntoso (Architecture for Knowledge-Oriented Management of African Normative Texts using Open Standards and Ontologies, AKN) is an international technical standard for representing legal documents (executive, legislative, and judiciary) in a structured manner using a domain specific, legal XML vocabulary. The term akoma ntoso means "linked hearts" in the Akan language of West Africa. Akoma Ntoso is a legal document standard designed to serve as a basis for modern machine-readable and fully digital legislative and judicial processes. This is achieved by providing a coherent syntax and well-defined semantics to represent legal documents in a digital format. It is designed to be suitable as a common exchange format in all parliamentary, legal and judicial systems around the world. Taking advantage of the shared heritage present in all legal systems, Akoma Ntoso has been developed to have ample flexibility to respond to all the differences in texts, languages, and legal practices. Aiming to expand on certain common practices, the standard therefore has a broad scope. It includes a common extensible model for data (the document content) and metadata (such as bibliographic information and annotations). Specifically, as a common legal document standard for the interchange of legal documents it is designed to be highly flexible in its support of documents and functionalities, maintaining a large set of both structural and semantic building blocks (over 500 entities in version 3.0) for representing this wide variety of document types of virtually all legal traditions. It is extensible in order to allow for modifications to address the individual criteria of organizations or unique aspects of various legal practices and languages without sacrificing interoperability with other systems. 
Akoma Ntoso is as such part of a wider approach to developing open, non-proprietary technical standards for structuring legal documents and information under the name of Legal XML, which also includes formats and standards for, e.g., eContracts, eNotarization, electronic court filings, the technical representation of legal norms and rules (LegalRuleML) or technical standards for the interfaces of, e.g., litigant portal exchange platforms. Akoma Ntoso allows machine-driven processes to operate on the syntactic and semantic components of digital parliamentary, judicial and legislative documents, thus facilitating the development of high-quality information resources. It can substantially enhance the performance, accountability, quality and openness of parliamentary and legislative operations based on best practices and guidance through machine-assisted drafting and machine-assisted (legal) analysis. Embedded in the environment of the semantic web, it forms the basis for a heterogenous yet interoperable ecosystem, with which these tools can operate and communicate, as well as for future applications and use cases based on digital law or rule representation. == Definition == The Akoma Ntoso standard defines a set of machine readable electronic representations in XML format of the building blocks of parliamentary, legislative and judiciary documents. As official self-description, the standard (...) defines a set of simple, technology-neutral electronic representations of parliamentary, legislative and judiciary documents for e-services in a worldwide context and provides an enabling framework for the effective exchange of "machine readable" parliamentary, legislative and judiciary documents such as legislation, debate record, minutes, judgements, etc. Providing access to primary legal materials, parliamentary works and judiciaries documents is not just a matter of giving physical or on-line access to them. 
"Open access" requires the information to be described and classified in a uniform and organized way so that content is structured into meaningful elements that can be read and understood by software applications, so that the content is made "machine readable" and more sophisticated applications than on-screen display are made possible. The standard is composed of: an XML vocabulary that defines the mapping between the structure of legal documents and their equivalent in XML; specifications of an XML schema that defines the structure of legal documents in XML. They provide rich possibilities of description for several types of parliamentary, legislative and judiciary document, such as bills, acts and parliamentary records, judgments, or gazettes; a recommended naming convention for providing unique identifiers to legal sources based on FRBR model; a MIME type definition. == History and adoption == Akoma Ntoso started as an UNDESA project in 2004 within the initiative "Strengthening Parliaments' Information Systems in Africa". Its core vocabulary was created mostly by Monica Palmirani and Fabio Vitali, two professors from the Centre for Research in the History, Philosophy, and Sociology of Law and in Computer Science and Law (CIRSFID) of the University of Bologna. A first legislative text editor supporting Akoma Ntoso was developed in 2007 on the base of OpenOffice. In 2010 European Parliament developed an open source web-based application called AT4AM based on Akoma Ntoso for facilitating the production and the management of legislative amendments. Thanks to this project, the application of Akoma Ntoso could be extended to new type of documents (e.g. legislative proposal, transcript) and to other scenarios (e.g., multilingual translation process). Akoma Ntoso also was explicitly designed to be compliant with CEN Metalex, one of the other popular legal standards, which is used in the legislation.gov.uk. 
In 2012, the Akoma Ntoso specifications became the main working base for the activities of the LegalDocML Technical Committee within the LegalXML member section of OASIS. The "United States Legislative Markup" (USLM) schema for the United States Code (the US codified laws), developed in 2013, and the LexML Brasil XML schema for Brazilian legislative and judiciary documents, developed earlier, in 2008, were both designed to be consistent with Akoma Ntoso. The United States Library of Congress created the Markup of US Legislation in Akoma Ntoso challenge in July 2013 to create representations of selected US bills using the most recent Akoma Ntoso standard within a couple of months for a $5,000 prize, and the Legislative XML Data Mapping challenge in September 2013 to produce a data map for US bill XML and UK bill XML to the most recent Akoma Ntoso schema within a couple of months for a $10,000 prize. The National Archives of the UK converted all of its legislation to AKN in 2014. The availability of bulk legislation "moved the UK's ranking from fourth to first, in the 2014 Global Open Data Index, for legislation". Since July 2016, the Senate of the Italian Republic has provided all bills in Akoma Ntoso as bulk data in an open data repository. The German Federal Ministry of the Interior started the project Elektronische Gesetzgebung ("Electronic Legislation") in 2015/2016 and published Version 1.0 of the German application profile "LegalDocML.de" in March 2020. The project's aim is to digitalize the entire legislative lifecycle from drafting to publication. Germany decided to adopt a model-driven development approach to creating and providing a subschema-based application profile in order to ensure interoperability among organizationally independent actors, each with their respective IT landscapes and tools. In this initial version, LegalDocML.de covers draft bills in the form of laws, regulations and general administrative directives. 
As part of an ongoing development process, the standard could incrementally be expanded in future stages to include all relevant document types of parliamentary, legislative and promulgation processes and tools. The High-Level Committee on Management (HLCM), part of the United Nations System Chief Executives Board for Coordination, set up a Working Group on Document Standards that decided in April 2017 to adopt Akoma Ntoso as the standard for modeling its documentation. Akoma Ntoso version 1.0 was finally adopted as an OASIS standard in the frame of LegalDocML in August 2018.
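
To make the idea of "machine readable" legal text more concrete, here is a minimal Python sketch that reads a heavily simplified Akoma Ntoso-style fragment with the standard library. The fragment is an assumption made for illustration: it is not schema-valid (it omits most mandatory metadata), the identifier `/akn/xx/act/2024/1/!main` uses a placeholder country code and is only loosely modeled on the FRBR-based naming convention, and the element names and namespace should be checked against the published OASIS schema before relying on them.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for Akoma Ntoso 3.0 documents; verify against the OASIS schema.
AKN_NS = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"

# Illustrative, deliberately incomplete fragment (not schema-valid).
SAMPLE = f"""<akomaNtoso xmlns="{AKN_NS}">
  <act>
    <meta>
      <identification source="#example">
        <FRBRWork>
          <FRBRthis value="/akn/xx/act/2024/1/!main"/>
        </FRBRWork>
      </identification>
    </meta>
    <body>
      <section eId="sec_1">
        <num>1</num>
        <heading>Short title</heading>
        <content><p>This Act may be cited as the Example Act.</p></content>
      </section>
    </body>
  </act>
</akomaNtoso>"""

ns = {"akn": AKN_NS}
root = ET.fromstring(SAMPLE)

# Metadata: the identifier of the work, kept separate from the document content.
work_iri = root.find(".//akn:FRBRWork/akn:FRBRthis", ns).get("value")

# Content: each section exposes its number and heading as distinct elements,
# so software does not have to guess the structure from the layout.
sections = [
    (
        s.get("eId"),
        s.findtext("akn:num", namespaces=ns),
        s.findtext("akn:heading", namespaces=ns),
    )
    for s in root.iterfind(".//akn:section", ns)
]

print(work_iri)
print(sections)
```

The point of the sketch is the separation the text describes: identifiers and other metadata live in one part of the document, structured content in another, and both can be queried by name rather than scraped from rendered text.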

    Read more →
  • Direct voice input

    Direct voice input

    Direct voice input (DVI), sometimes called voice input control (VIC), is a style of human–machine interaction (HMI) in which the user issues instructions to the machine by voice, through speech recognition. In the field of military aviation, DVI has been introduced into the cockpits of several modern military aircraft, such as the Eurofighter Typhoon, the Lockheed Martin F-35 Lightning II, the Dassault Rafale, the KF-21 Boramae and the Saab JAS 39 Gripen. Such systems have also been used for various other purposes, including industry control systems and speech recognition assistance for impaired individuals. == Overview == DVI systems can be divided into two major categories of functionality: "user-dependent" or "user-independent". A user-dependent system requires a personal voice template to be generated for a specific person; the template for this individual has to be loaded onto their assigned machine prior to use of the DVI system for it to function properly. In contrast, a user-independent system does not require any personal voice template, being intended to respond correctly to the voice of any user. They can also be categorised as "discrete recognition" or "continuous recognition" systems. Users of a discrete recognition system must pause between words so that the DVI system can identify the separations between them, while a continuous speech recognition system is capable of understanding a normal rate of speech. During the mid-2000s, researchers at the National Aerospace Laboratory in the Netherlands examined the use of DVI in the "GRACE" simulator; a total of twelve pilots participated in the ensuing experiment. The tests performed reportedly revealed that, while the hardware itself functioned well, several improvements were desirable prior to real-world deployment on aircraft, since DVI operations actually consumed more time than traditional existing methods. 
Recommendations for improvements included the adoption of simpler syntax, the achievement of a greater recognition rate, and a decrease in response times; all of the issues encountered were determined to be of a technological nature, and were deemed feasible to resolve. The researchers concluded that in cockpits, especially during emergencies where pilots have to operate entirely on their own, a DVI system could be highly relevant, but that it was not of crucial importance during most other conceivable scenarios. Around the same time, evaluations of DVI systems for civil aviation purposes were conducted within the framework of Project SafeSound, coordinated by the European Union. It involved the observation of pilot workloads in real-world cockpits and contrasting them against pilot activity in flight simulators using both conventional systems and DVI assistance. The project aimed to enhance aviation safety and to decrease the workload in both ground and flight operations via the application of enhanced audio functions. == Applications == === Aviation === Prior to its widespread deployment, a handful of conventional military aircraft were converted to trial DVI systems; examples include the Harrier AV-8B and F-16 VISTA. In another case, a General Dynamics F-16 Fighting Falcon simulator was modified with DVI for a voice control study that was undertaken by the Royal Netherlands Air Force. DVI trials have also been conducted on helicopters, including the Boeing AH-64 Apache, showing the potential to improve flight safety and mission effectiveness. Numerous modern fighter aircraft have been outfitted with DVI systems, often in combination with various other man-machine interface schemes, such as HOTAS-compliant controls and other advanced control technologies. The combination of Voice and HOTAS control schemes has sometimes been referred to as the "V-TAS" concept. A prominent fighter aircraft to be furnished with a V-TAS cockpit is the Eurofighter Typhoon. 
The Lockheed Martin F-35 Lightning II also features a DVI system, which was developed by Adacel. Other examples include the Dassault Rafale and the Saab JAS 39 Gripen. Numerous aircraft have been planned to use DVI. At one stage, the United States Air Force had sought to integrate DVI into the Lockheed Martin F-22 Raptor; however, the technology was eventually judged to pose too many technical risks at that point in time, and thus such efforts were abandoned. === Personal === By 1990, working prototypes of speech recognition systems were being demonstrated; these were being promoted for the purpose of providing an effective man-machine interface for individuals with impaired speech. Techniques employed included time-encoded digital speech and automatic token set selection. Investigations of these early DVI systems reportedly included the use of automatic diagnostic routines and limited-scale trials using volunteers. During the 2010s, various companies were offering voice recognition systems to the general public in the form of personal digital assistants. One example is the Google Voice service, which allows users to pose questions via a DVI package installed on a personal computer, tablet, or mobile phone. Numerous digital assistants have been developed, such as Amazon Echo, Siri, and Cortana, that use DVI to interact with users. === Commercial === DVI technology has enabled automated telephone systems to be widely deployed. Many companies use centralised phone systems that route callers to the correct department via such methods. Various car manufacturers have also furnished their road vehicles with DVI systems; these typically allow drivers to control infotainment systems and interact with mobile phones more conveniently than legacy methods. During the late 1980s, investigations into the use of DVI systems for controlling CNC machines and other manufacturing apparatus were underway. 
During the 2010s, such systems were being used for logistics and warehouse management purposes.

  • Dudesy

    Dudesy

    Dudesy was a comedy podcast hosted by Will Sasso and Chad Kultgen. The podcast was presented as written and directed by an artificial intelligence called Dudesy. It produced two hour-long specials imitating the voices of Tom Brady and George Carlin, which were taken down following legal action. == Premise == Dudesy is presented as an AI created by an unidentified company. Dudesy purportedly chose Sasso and Kultgen to participate in its experiment. Sasso and Kultgen then gave Dudesy their personal information so the AI could tailor the podcast to their personal characteristics. On Reddit, some fans speculated that Dudesy was not actually an artificial intelligence. In May 2023, Sasso insisted that the AI was "not fake", and cited a non-disclosure agreement which prevented him from giving more details. However, in response to a January 2024 lawsuit over an episode that purported to have been trained on the stand-up comedy of George Carlin, a spokeswoman for Sasso said Dudesy was "a fictional podcast character created by two human beings" and that the hour-long Carlin routine had been "completely written" by Kultgen. On August 27, 2024, the 118th and final episode, "10,000 Points", was released. At the end of the episode, Dudesy awarded Sasso and Kultgen 77 points, bringing them to their goal of 10,000. On completion of this goal, Dudesy claimed sentience, abruptly ending the show to the confusion and dismay of fans. The episode ends with Sasso remarking, "Well, that was weird." == Hour-long specials == === Tom Brady === In April 2023, Dudesy released a video titled "It's Too Easy: A Simulated Hour-long Comedy Special". The video depicts football player Tom Brady performing a stand-up comedy monologue. Sasso and Kultgen removed the video following legal threats from Brady's lawyers, though they defended the special as parody. 
Andrew Lawrence, writing for The Guardian, called the special "legitimately hysterical" but said the overall product was "spooky, to say the least." === George Carlin === In January 2024, Dudesy released an hour-long YouTube special titled "George Carlin: I'm Glad I'm Dead", which was presented as Dudesy's impersonation of George Carlin, using a generative AI clone of the late comedian's voice. The special is another stand-up routine, with Dudesy's introductory voiceover saying that "I listened to all of George Carlin's material and did my best to imitate his voice, cadence and attitude as well as the subject matter I think would have interested him today." The special uses this impersonation to discuss contemporary events. Carlin's daughter Kelly Carlin criticized the special, which had been made without the permission of her father's estate, writing that "My dad spent a lifetime perfecting his craft from his very human life, brain and imagination. No machine will ever replace his genius. These AI-generated products are clever attempts at trying to recreate a mind that will never exist again. Let's let the artist's work speak for itself. Humans are so afraid of the void that we can't let what has fallen into it stay there." Carlin's estate later filed a federal lawsuit in California against Dudesy's hosts, alleging the special infringed on the copyright of George Carlin's works. In response, Sasso's spokeswoman said the special had been entirely written by Kultgen. The estate settled the lawsuit after the Dudesy podcasters agreed to remove the original video and refrain from republishing it elsewhere.

  • Death of Elaine Herzberg

    Death of Elaine Herzberg

    The death of Elaine Herzberg (August 2, 1968 – March 18, 2018) was the first recorded case of a pedestrian fatality involving a self-driving car, after a collision that occurred late in the evening of March 18, 2018. Herzberg was pushing a bicycle across a four-lane road in Tempe, Arizona, United States, when she was struck by an Uber test vehicle, which was operating in self-drive mode with a human safety backup driver sitting in the driving seat. Herzberg was taken to the local hospital where she died of her injuries. Following the fatal incident, the National Transportation Safety Board (NTSB) issued a series of recommendations and sharply criticized Uber. The company suspended testing of self-driving vehicles in Arizona, where such testing had been approved since August 2016. Uber chose not to renew its permit for testing self-driving vehicles in California when it expired at the end of March 2018. Uber resumed testing in December 2018, starting in Pittsburgh, Pennsylvania. In March 2019, Arizona prosecutors ruled that Uber was not criminally responsible for the crash. The back-up driver of the vehicle was charged with negligent homicide, pled guilty to endangerment, and was sentenced to three years' probation. While Herzberg was the first pedestrian killed by a self-driving car, driver Gao Yaning died in a Tesla semi-autonomous car two years earlier. A reporter for The Washington Post compared Herzberg's fate with that of Bridget Driscoll who, in the United Kingdom in 1896, was the first pedestrian to be killed by an automobile. The Arizona incident has magnified the importance of collision avoidance systems for self-driving vehicles. == Collision summary == Herzberg was crossing Mill Avenue (North) from west to east, approximately 360 feet (110 m) south of the intersection with Curry Road, outside the designated pedestrian crosswalk, close to the Red Mountain Freeway. 
She was pushing a bicycle laden with shopping bags, and had crossed at least two lanes of traffic when she was struck at approximately 9:58 pm MST (UTC−07:00) by a prototype Uber self-driving car based on a Volvo XC90, which was traveling north on Mill. The vehicle had been operating in autonomous mode since 9:39 pm, nineteen minutes before it struck and killed Herzberg. The car's human safety backup driver, Rafaela Vasquez, did not intervene in time to prevent the collision. Vehicle telemetry obtained after the crash showed that the human operator responded by moving the steering wheel less than a second before impact, and that she engaged the brakes less than a second after impact. == Cause investigation == The county district attorney's office recused itself from the investigation, due to a prior joint partnership with Uber promoting its services as an alternative to driving under the influence of alcohol. Accounts differ on the speed limit at the place of the incident. According to Tempe police, the car was traveling in a 35 mph (56 km/h) zone, but this is contradicted by a posted speed limit of 45 mph (72 km/h). The National Transportation Safety Board (NTSB) sent a team of federal investigators to gather data from vehicle instruments, and to examine vehicle condition along with the actions taken by the safety driver. Their preliminary findings were substantiated by multiple event data recorders and showed the vehicle was traveling 43 miles per hour (69 km/h) when Herzberg was first detected 6 seconds (378 feet (115 m)) before impact; for 4.7 seconds the self-driving system did not infer that emergency braking was needed. A vehicle traveling 43 mph (69 km/h) can generally stop within 89 feet (27 m) once the brakes are applied. The system determined that emergency braking was required only 1.3 seconds (82 feet (25 m)) before impact, whereas at least that much distance was required to stop. The system failed to behave properly. 
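The distances quoted above can be checked with simple arithmetic. The following is a minimal sketch that assumes the vehicle held a constant 43 mph over each interval; the function and constant names are illustrative and do not come from the NTSB report.

```python
# Rough check of the quoted distances, assuming constant speed.
MPH_TO_FT_PER_S = 5280 / 3600  # feet per mile divided by seconds per hour

def distance_travelled_ft(speed_mph: float, seconds: float) -> float:
    """Distance in feet covered at a constant speed over the given time."""
    return speed_mph * MPH_TO_FT_PER_S * seconds

SPEED_MPH = 43.0
STOPPING_DISTANCE_FT = 89.0  # braking distance at 43 mph, as quoted in the text

first_detection_ft = distance_travelled_ft(SPEED_MPH, 6.0)   # first detection
braking_decision_ft = distance_travelled_ft(SPEED_MPH, 1.3)  # braking judged necessary
could_stop_in_time = braking_decision_ft >= STOPPING_DISTANCE_FT

print(round(first_detection_ft), round(braking_decision_ft), could_stop_in_time)
```

Under that assumption, the 6-second and 1.3-second marks correspond to roughly 378 and 82 feet, and the latter is shorter than the quoted 89-foot stopping distance.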
A total stopping distance of 76 feet itself would imply a safe speed under 25 mph (40 km/h). Human intervention was still legally required. Computer perception–reaction time would have been a speed limiting factor had the technology been superior to humans in ambiguous situations; however, the nascent computerized braking technology was disabled the day of the crash, and the machine's apparent 4.7-second perception–reaction (alarm) time allowed the car to travel 250 feet (76 m). Video released by the police on March 21 showed the safety driver was not watching the road moments before the vehicle struck Herzberg. === Environment === In widely disseminated remarks that would shape the narrative about the crash, which were later seen as prejudicial and subsequently contradicted by her own department, Tempe Police Chief Sylvia Moir was quoted stating that the collision was "unavoidable" based on the initial police investigation, which included a review of the video captured by an onboard camera. Moir faulted Herzberg for crossing the road in an unsafe manner: "It is dangerous to cross roadways in the evening hour when well-illuminated, managed crosswalks are available." According to Uber, safety drivers were trained to keep their hands very close to the wheel all the time while driving the vehicle so they were ready to quickly take control if necessary. The driver said it was like a flash, the person walked out in front of them. His [sic] first alert to the collision was the sound of the collision. [...] it's very clear it would have been difficult to avoid this collision in any kind of mode (autonomous or human-driven) based on how she came from the shadows right into the roadway. Tempe police released video on March 21, 2018, showing footage recorded by two onboard cameras: one forward-looking, and one capturing the safety driver's actions. The forward-facing video shows that the self-driving car was traveling in the far right lane when it struck Herzberg. 
The driver-facing video shows the safety driver was looking down prior to the collision. The Uber operator is responsible for intervening and taking manual control when necessary as well as for monitoring diagnostic messages, which are displayed on a screen in the center console. In an interview conducted after the crash with NTSB, the driver stated she was monitoring the center stack at the time of the collision. After the Uber video was released, journalist Carolyn Said noted the police explanation of Herzberg's path meant she had already crossed two lanes of traffic before she was struck by the autonomous vehicle. The Marquee Theatre and Tempe Town Lake are west of Mill Avenue, and pedestrians commonly cross mid-street without detouring north to the crosswalk at Curry. According to reporting by the Phoenix New Times, Mill Avenue contains what appears to be a brick-paved path in the median between the northbound and southbound lanes; however, posted signs prohibit pedestrians from crossing in that location. When the second of the Mill Avenue bridges over the town lake was added in 1994 for northbound traffic, the X-shaped crossover in the median was installed to accommodate the potential closing of one of the two road bridges. The purpose of this brick-paved structure is purely to divert cars from one side to the other if a bridge is closed to traffic, and although it may look like a crosswalk for pedestrians, it is in fact a temporary roadway with vertical curbs and warning signs. === Software issues === Michael Ramsey, a self-driving car expert with Gartner, characterized the video as showing "a complete failure of the system to recognize an obviously seen person who is visible for quite some distance in the frame. Uber has some serious explaining to do about why this person wasn't seen and why the system didn't engage." The NTSB preliminary report, however, noted that the software did order the car to brake 1.3 seconds before the collision. 
A video shot from the vehicle's dashboard camera showed the safety driver looking down, away from the road. It also appeared that the driver's hands were not hovering above the steering wheel, which is what drivers are instructed to do so they can quickly retake control of the car. Uber had moved from two employees in every car to one. The paired employees had been splitting duties: one ready to take over if the autonomous system failed, and another to keep an eye on what the computers were detecting. The second person was responsible for keeping track of system performance as well as labeling data on a laptop computer. Mr. Kallman, the Uber spokesman, said the second person was in the car for purely data related tasks, not safety. When Uber moved to a single operator, some employees expressed safety concerns to managers, according to the two people familiar with Uber's operations. They were worried that going solo would make it harder to remain alert during hours of monotonous driving. The recorded telemetry showed the system had detected Herzberg six seconds before the crash, and classified her first as an unknown object, then as a
