AI Grammar Detection

AI Grammar Detection — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Syman

    Syman

    SYMAN is an artificial intelligence technology that uses data from social media profiles to identify trends in the job market. SYMAN is designed to organize actionable data for products and services including recruiting, human capital management, CRM, and marketing. SYMAN was developed with a $21 million series B financing round secured by Identified, which was led by VantagePoint Capital Partners and Capricorn Investment Group.

    Read more →
  • Loab

    Loab

    Loab ( LOBE) is a fictional character that artist and writer Steph Maj Swanson claimed to have discovered with a text-to-image AI model in April 2022. In a viral Twitter thread, Swanson described the images of Loab as an unexpectedly emergent property of the software, saying they discovered them when asking the model to produce something "as different from the prompt as possible". == History == The Sweden-based artist Steph Maj Swanson said that they first generated these images in April 2022 by using the algorithmic technique of "negative prompt weights" accessing latent space. The initial prompt - 'Brando::-1', requesting the opposite of actor Marlon Brando - generated a "skyline logo" with the cryptic lettering "DIGITA PNTICS". Attempting to generate the opposite of this image using the prompt "DIGITA PNTICS skyline logo::-1" yielded what Swanson described as "off-putting images, all of the same devastated-looking older woman with defined triangles of rosacea(?) on her cheeks". Swanson nicknamed the character "Loab", after one of the generated images resembled an album cover that included the printed word "loab". Swanson says that using the image as a prompt for further images produced increasingly violent and gory results. Swanson speculated that something about the image could be "adjacent to extremely gory and macabre imagery in the distribution of the AI's world knowledge". Swanson says that when they combined images of Loab with other pictures, the subsequent results consistently return an image including Loab, regardless of how much distortion they added to the prompts to try and remove her visage. Swanson speculated that the latent space region of the AI map that Loab is located in, in addition to being near gruesome imagery, must be isolated enough that any combinations with other images could only use Loab from her area and no related images due to its isolation. After enough crossbreeding of images and dilution attempts, Swanson was able to eventually generate images without Loab, but found that crossbreeding those diluted images would also eventually lead to a version of Loab to reappear in the resulting images. Swanson has said that "for various reasons" they declined to disclose the software used to create the images. Loab has been referred to as the "first AI-generated cryptid" and as such has gone viral. Despite hyping up the cryptid nature of the discovery in their wording, Swanson admitted that "Loab isn't really haunted, of course", but noted that the mythos that has sprung up around the AI-generated character has gone beyond their initial involvement. Swanson speculated that people sharing pictures and memes of Loab would lead future AIs to use those images as a part of their latent space maps, making her an innate part of the internet landscape, with Swanson adding "If we want to get rid of her, it's already too late." == Response == There has been discussion of whether the Loab series of images are "a legitimate quirk of AI art software, or a cleverly disguised creepypasta." Smithsonian magazine has written that "Loab sparked some lengthy ethical conversations around visual aesthetics, art and technology," and some have criticized the labeling of a woman with rosacea as a horror image, considering this to be "stigmatizing disability". Swanson responded that if the AI map is combining Loab with violent imagery, then that is a "social bias" in the data being used for the image modeling software. The Atlantic writer Stephen Marche described Loab as a "form of expression that has never existed before" whose authorship is unclear and that exists as an "emanation of the collective imagistic heritage, the unconscious visual mind". Laurens Verhagen in de Volkskrant commented that rather than showing that there are "dark horror creatures hidden deep within AI", the existence of Loab instead implies that our current "understanding of AI is limited". Mhairi Aitken at the Alan Turing Institute stated that rather than a "creepy" emergent property, output results like Loab were representative of the "limitations of AI image-generator models" and was more concerned about the urban legends that are born from such "boring" innocuous things and how easily "other people take these things seriously". Carly Cassella for ScienceAlert described Loab as a "modern day tronie" (a style of Dutch painting) that is not representative of an actual person, but just a concept or idea, similar but distinct from works like the Girl With A Pearl Earring. Wired's Joel Warner argued that Loab was only the beginning and that, with AI text generators such as ChatGPT becoming more commonplace, a "linguistic version of Loab" would emerge in that space as well and begin creating ideas through "intentional prompts" or otherwise that will be as disturbing as The 120 Days of Sodom.

    Read more →
  • Transdermal optical imaging

    Transdermal optical imaging

    Transdermal optical imaging, also known as transdermal optical imagery or TOI, is a method of detecting blood flow of the face by measuring hemoglobin concentration using a digital video camera. Because of the translucent property of skin, light can travel beneath the skin and re-emit. The re-emitted light from underneath the skin is affected by chromophores, mainly hemoglobin and melanin, which differ in color. The color difference allows TOI machine learning software to separate the images into layers, which are known as bitplanes. It extracts signals rich in hemoglobin and signals rich in melanin, then discards the melanin-rich signals to obtain a recording of hemoglobin changes under the skin. Transdermal optical imaging has been proposed as an alternative to cuff-based methods of measuring blood pressure because it is able to measure heart rate accurately in a "contactless and non-invasive" way. Transdermal optical imaging may be able to detect hidden emotions using the patterns of blood flow in the face.

    Read more →
  • Overwatch

    Overwatch

    Overwatch (abbreviated as OW) is a multimedia franchise centered on a series of multiplayer first-person shooter (FPS) video games developed by Blizzard Entertainment. Overwatch was released in 2016. Overwatch 2 was released in 2022 and the original game was taken offline upon its release, though Blizzard renamed it back to Overwatch in 2026. Overwatch features hero-based combat between two teams of players fighting over various objectives, along with other traditional gameplay modes. Released in 2016, Overwatch lacked a traditional story mode. Instead, Blizzard employed a transmedia storytelling strategy to disseminate lore regarding the game's characters, releasing comics and other literary media, as well as animated media that includes short films. The game enjoyed both critical and commercial success, and garnered a devoted following. The fan community around the franchise has produced a large amount of content including art, cosplay, fan fiction, anime-influenced music videos, Internet memes, and pornography. Blizzard helped launch and promote an esports scene surrounding the game, including an annual Overwatch World Cup, Overwatch League a minor league, and the Overwatch Champions Series which borrowed elements found in traditional American sports leagues. == Gameplay == Both games in the Overwatch series are team-based hero shooters. Players select a hero character from a large roster (52 as of Season 2), divided among three class types. These are: Tanks, who have higher health and generally meant to help protect their teammates from damage, but are larger and easier to hit; Damage, who act as the team's offensive leads; and Support, who heal, provide buffs for teammates, or de-buff the opposing team. Each role also features sub-roles with extra passives. These sub-roles include 'Initiator', 'Stalwart', and 'Bruiser' for Tank. 'Specialist', 'Flanker', 'Recon', and 'Sharpshooter' for Damage. 'Medic', 'Tactician', and 'Survivor' for Support. Players are generally free to change to different heroes while inside their spawn room during the course of a match in response to the current tactics employed by other players. As of the development of Overwatch 2, a standard game features one tank player, two damage players and two support players, a change from having two of each class in its predecessor. Players choose their class before the match, and can only pick characters within that class for the duration of the game. There are different styles of game modes, however, that allow players to choose characters from any class throughout the game. Each hero has a skill kit that includes a primary attack, active skills that require a cooldown period before they can be used again, passive skills that remain active at all times, and an Ultimate skill that can only be used once they fill their Ultimate meter either by damaging opponents, mitigating damage, healing teammates or by passively generating it over time. An update in 2025 saw each hero receive a total of four unique abilities known as perks. Each hero has two minor and two major perks; minor perks consist of smaller changes to a hero's kit, while major perks are intended to affect the match more significantly. At the beginning of each match, all heroes are set to level 1 for each player. As the match progresses, players can individually level up their respective heroes, minor perks are unlocked at level 2, and major perks are unlocked at the maximum level 3. When perks become available, players may only select one of each type of perk; a selected perk becomes irreversibly attached to the current hero for the remainder of the match. If a player switches to another hero mid-match, the previously selected hero retains their level and perk progress. Game types of Overwatch are split between standard matches, competitive play, custom games, and arcade modes. Standard matches have matchmaking based loosely on the player's skill level as measured by the game. Competitive mode uses more strict matchmaking based on a player's current rank on the competitive ladder, with their rank increasing or decreasing when they win or lose a game, respectively. Arcade modes do not use matchmaking and are generally more experimental modes compared to standard and competitive modes. Custom games are created via the workshop and can be utilised to make game modes that are very different from the base game. The workshop, is the software in Overwatch which creates the game using either presets and settings or rules and conditions made by code. These game modes can be published directly onto Overwatch’s custom browse tab or shared off platform using a 5 digit alphanumeric code. Standard and competitive game modes are randomly selected at the start of each match, and are objective based, requiring teams to control a fixed objective point for a duration of time, or escort a payload to a target zone before match time expires. These modes include: Assault (introduced in Overwatch): Also known as 2 Capture Points (or 2CP), Assault has the attacking team tasked with capturing two target points in sequence on the map, while the defending team must stop them. Assault-style maps were removed from main gameplay rotation after Overwatch 2 released but available in the game's arcade mode. It is still available in the game's custom game modes. Since Season 2, Assault-style maps are available in Arcade Mode daily routines. Escort (introduced in Overwatch): Also known as "Payload" by the community, The attacking team is tasked with escorting a payload to a certain delivery point before time runs out, while the defending team must stop them. The payload vehicle moves along a fixed track when any player on the attacking team is close to it, increasing in speed if multiple attackers are present, the increase capping at 3, but will stop if a defending player is nearby; should no attacker be near the vehicle, it will start to move backwards along the track. The payload will also heal any attacking players by 10 health per second while they are near the payload. Passing specific checkpoints will extend the match time and prevent the payload from moving backwards from that point. Hybrid (Assault/Escort) (introduced in Overwatch): The attacking team has to capture the payload (as if it were a target point from Assault) and escort it to its destination, while the defending team tries to hold them back. Control (introduced in Overwatch): Each team tries to capture and maintain a common control point until their capture percentage reaches 100%. This game mode is played in a best-of-three format. Control maps are laid out in a symmetric fashion so no team has an intrinsic position advantage. Push (introduced in Overwatch 2's launch): Each team attempts to secure control of a large robot that pushes one of two barriers to the opposing team's side of the map, whilst being escorted by at least one team member, stopping when enemy players are nearby, similar to the payload movement system in Escort. The team that pushes the payload fully to the other side, or furthest into the enemy territory before the time runs out, wins the match. Flashpoint (introduced in Overwatch 2 in 2023): Similar to Control, each team attempts to capture and maintain a common control point until their capture percentage reaches 100%. This game mode takes place on significantly larger maps with five separate control points, which take a shorter amount of time to capture as compared to a standard Control map. A central control point is always activated first; after it is secured by one team, the remaining four are activated in a random order. The first team to secure three control points wins. Clash (introduced in Overwatch 2 in 2024): Clash maps feature symmetrical maps with five control points. Teams initially vie for control of the central point, with the winning team progressing to the next control point, towards the opponent's base. Opponents can push back by winning control points and shifting the next point away from their base. If a team captures the point closest to the opponent's base, they win. Otherwise the match plays out until one team wins control five times. Arcade modes may include variations of the above modes with experimental rules, and can also include modes like Deathmatch and Capture the Flag. Other common arcade modes include: Elimination (introduced in Overwatch in 2016): Two teams face off in a series of rounds, attempting to wipe out the other team; once a player is killed they remain out of the game until the next round, though they can be revived by Mercy's 'Resurrect' ability. If no team has won a round by a certain time, then the winners are decided by the team that can first take a neutral control point. Players cannot change heroes until the next round. Some of these can be played in "lockout" mode, in which the heroes selected by the winning team for a round are "locked" and cannot be selected in future rounds. Total Mayhem (i

    Read more →
  • CEITON

    CEITON

    CEITON is a web-based software system for facilitating and automating business processes such as planning, scheduling, and payroll using workflow technologies. The system is used by several media companies such as MDR, Yle, RAI and Red Bull Media House. In December 2018, the first CEITON User Group Meeting took place in Leipzig, Germany. == Architecture == The software runs on a server (on premises) or in the cloud and is scalable on parallel servers. Data security is warranted by role-based access control (RBAC). The software is used via web-browsers and not dependent on particular system software. == Structure and Features == CEITON combines the two classical approaches of production planning and control and workflow management. === Project Management === The scheduling system plans, manages, bills, and analyzes projects or tasks. It manages human and technical resources, material, and locations on a single GUI. The system uses a gantt chart to assign tasks to be done to available and eligible resources (i.e. staff), automatically or by drag-and-drop. The scheduling module includes material management, resource management/ human resource management, integration of freelancers, clients and suppliers, long-term budget planning, time-tracking, shift scheduling, quality management, delivery and logistics, document management, archive, analysis and controlling, business reporting, as well as all accounting and documentation processes. === Workflow === The workflow management system module coordinates business processes. Processes are defined once as a workflow and then repeatedly executed. Human resources are automatically assigned to steps (tasks) and integrated in workflow forms. Systems are integrated with an EAI/SOAP module, allowing data exchange with arbitrary external systems which are also involved in the business process. It also features a 3-D workflow overview in which the status of each project step can be determined by its color in the overview. === Process Management === For project and order processing management, business processes are designed as workflows, and coordinate communication automatically. Different user interfaces for staff, customers or suppliers can be created so each gets only relevant information. Different workflow forms are associated with different log-ins. The main application for the system is knowledge-based business processes, in which many people are involved and virtual results are produced, e.g. in research, or development of media products, such as TV and movies. Broadcasters and media companies such as MDR and Yle use CEITON to control their production processes for products and services and coordinate complex workflows with all kinds of resources. === Integrations === An integrated EAI module allows CEITON to integrate every external system in any business process without programming, using SOAP and similar technologies. Aspera and FileCatalyst were integrated for faster data transfer, yet complex ERP systems and numerous SAP modules have also been integrated, for example, to extract working times to payroll. === Mobile Working === Since Version 7, released in 2015, CEITON includes a time-tracking module allowing employees to enter their times from mobile devices such as tablets running Android, iPhones etc. == History == Ceiton Technologies (SME tech firm), the company developing CEITON, was founded in Leipzig, Germany in 2000, staffing solutions for the Bureau of Internal Revenue in Manila, Philippines, were implemented in 2000 together with the Deutsche Gesellschaft für Technische Zusammenarbeit of the German government. The first version (1.0) of the software was released in July 2001. The product was originally developed for German broadcasting companies. CEITON is named after the Japanese concept Seiton, one of the principles of Japanese workplace design methodology known as 5S. Since version 7, released in 2015, CEITON includes a time-tracking module allowing employees to enter their times from mobile devices such as tablets running Android, iPhones etc. In May 2005 CEITON won the IQ innovation award, sponsored by Siemens, in the category Excellent innovation in the IT-sector. Since 2007, CEITON has been present at the broadcast trade fairs NAB in Las Vegas and IBC in Amsterdam. In 2020, the company celebrated its 20th anniversary.

    Read more →
  • European Conference on Artificial Intelligence

    European Conference on Artificial Intelligence

    The European Conference on Artificial Intelligence (ECAI) is the leading conference in the field of Artificial Intelligence in Europe, and is commonly listed together with IJCAI and AAAI as one of the three major general AI conferences worldwide. The conference series has been held without interruption since 1974, originally under the name AISB. The conference was originally held biennially, but has been organized annually since ECAI 2022. The conferences are held under the auspices of the European Coordinating Committee for Artificial Intelligence (ECCAI) and organized by one of the member societies. The journal AI Communications, sponsored by the same society, regularly publishes special issues in which conference attendees report on the conference. Publication of a paper in ECAI is considered by some journals to be archival: the paper should be considered equivalent to a journal publication and that the contents of ECAI papers cannot be reformulated as separate journal submissions unless a significant amount of new material is added. == List of ECAI conferences == ECAI-1992 took place in Vienna, Austria. ECAI-1996 took place in Budapest, Hungary. ECAI-1998 tool place in Brighton, United Kingdom. ECAI-2000 took place in Berlin, Germany. ECAI-2004 took place in Valencia, Spain. ECAI-2006 took place in Riva del Garda, Italy. ECAI-2008 took place in Patras, Greece. ECAI-2010 took place in Lisbon, Portugal. ECAI-2012 took place in Montpellier, France. ECAI-2014 took place in Prague, Czech Republic. ECAI-2016 took place in The Hague, Netherlands. ECAI-2018 took place in Stockholm, Sweden. ECAI-2020 took place in Santiago de Compostela, Spain. ECAI-2022 took place in Vienna, Austria. ECAI-2023 took place in Kraków, Poland. ECAI-2024 took place in Santiago de Compostela, Spain. ECAI-2025 took place in Bologna, Italy.

    Read more →
  • Niceaunties

    Niceaunties

    Niceaunties is the pseudonym of a Singapore-based artist and designer whose work incorporates generative artificial intelligence, video, and digital installation. Her practice centers around the figure of the "auntie", a common term for older women in Southeast Asian contexts, and explores themes such as aging, care, domesticity, and gender roles. Her work has been featured in exhibitions and media platforms including TED, Christie's Art + Tech, Expanded.Art, and publications such as The Guardian, The Straits Times. == Early life and career == Niceaunties was born in 1981 in Singapore. She attributes her inspiration for "auntie culture" to the matriarchal environment and older women of her household, including her grandmother, while growing up. She is also an architectural designer with Spark Architect. The Niceaunties project began in 2023 after she encountered AI-generated images in her work as an architect. It draws inspiration from women in the artist's family and broader Southeast Asian cultural dynamics. Her work often features AI-generated visuals created with tools such as DALL-E, Krea, RunwayML, and SORA. Her imagery and narratives center on the fictional "Auntieverse", which features older women in imagined settings involving community, ecology, and labor. Her notable works include 'Auntlantis', a five-part video series imagining older women engaged in ocean clean-up and collective ritual, and 'Goddess,' a video created with Sora, featuring a character who gradually forgets her divine identity through years of domestic labor. == Exhibitions == 2024 – Expanded.Art, Berlin – Auntiedote solo exhibition 2024 – TED (conference), Vancouver – Speaker and screening 2024 – Victoria and Albert Museum, London – Digital Art Weekend 2024 – Louisiana Museum of Modern Art, Denmark – Ocean exhibition 2025 – Christie's Augmented Intelligence Auction, New York == Reception == In 2024, Niceaunties gave a TED Talk titled The Weird and Wonderful Art of Niceaunties. Journalist Rebecca Ratcliffe, writing for The Guardian, described her work as combining AI with "the surreal and the political," noting her focus on older women as central characters. Her work has also received criticism for being reliant on generative AI, which many feel exploits and steals from traditional artists.

    Read more →
  • Resistance Database Initiative

    Resistance Database Initiative

    HIV Resistance Response Database Initiative (RDI) was formed in 2002 to use artificial intelligence (AI) to predict how patients will respond to HIV drugs using data from more 250,000 patients from around 50 countries around the world. The RDI used its models to power its HIV Treatment Response Prediction System (HIV-TRePS). Launched in 2010, this free online tool enabled healthcare professionals to upload their patient’s data and obtain highly accurate predictions of how they would respond to different combinations of the 30 or more drugs available. The tool enabled physicians to individualize their patients’ treatment, using these predictions based on more than a million patient-years of treatment experience. HIV-TRePS was possibly the first ever AI-based system for medical decision-making to be developed, successfully tested, and used in clinical practice. It has since been used by thousands of healthcare professionals to optimise the treatment of tens of thousands of patients. Since the RDI’s inception the treatment of HIV infection has progressed enormously, with more effective and better tolerated drugs available in ever more convenient combination formulations. In most countries HIV is now considered a chronic, manageable condition. Moreover, the success of the drugs in reducing the amount of virus is substantially reducing the onward transmission of the virus and cases of new infections are falling in many settings. This improvement in HIV treatment means the need for sophisticated AI to support HIV treatment decisions has significantly reduced. In response, the RDI ceased development of further models and, in March 2024, withdrew its HIV-TRePS system. == Background == Human immunodeficiency virus (HIV) is the virus that causes acquired immunodeficiency syndrome (AIDS), a condition in which the immune system begins to fail, leading to life-threatening opportunistic infections. There are approximately 30 HIV antiretroviral drugs that have been approved for the treatment of HIV infection, from six different classes, based on the point in the HIV life-cycle at which they act. They are used in combination; typically 3 or more drugs from 2 or more different classes, a form of therapy known as highly active antiretroviral therapy (HAART). The aim of therapy is to suppress the virus to very low, ideally undetectable, levels in the blood. This prevents the virus from depleting the immune cells that it preferentially attacks CD4 cells and prevents or delays illness and death. Despite the expanding availability of these drugs and the impact of their use, treatments continue to fail, often involving to the development of resistance. During drug therapy, low-level virus replication may still occur, particularly when a patient misses a dose. HIV makes errors in copying its genetic material and, if a mutation makes the virus resistant to one or more of the drugs in the patient's treatment, it may begin to replicate more successfully in the presence of that drug and undermine the effect of the treatment. If this happens, the treatment needs to be changed to re-establish control over the virus. == RDI's Approach == The RDI’s approach was to use artificial intelligence (including neural network and random forest models), trained with data from hundreds of thousands of patients, treated with different drugs in a variety of clinical settings all over the world, to predict how an individual patient will respond to any new combination of HIV drugs. The models were tested with independent data sets and consistently achieved accuracy of approximately 80%.

    Read more →
  • List of assembly software and tools

    List of assembly software and tools

    This is a list of assembly software and tools, including software used for assembly language programming, machine code generation, disassembly, debugging, binary analysis, reverse engineering, and instruction-set simulation. == Assemblers and machine-code generators == == Disassemblers and binary-analysis tools == == Debuggers with assembly-level features == == Educational IDEs, simulators and emulators == == Portable and intermediate assembly-like languages == == Assembly language families == Assembly language is not a single programming language, but a family of low-level languages associated with particular instruction set architectures and processor families. Examples include:

    Read more →
  • Tensor network

    Tensor network

    Tensor networks or tensor network states are a class of variational wave functions used in the study of many-body quantum systems and fluids. Tensor networks extend one-dimensional matrix product states to higher dimensions while preserving some of their useful mathematical properties. The wave function is encoded as a tensor contraction of a network of individual tensors. The structure of the individual tensors can impose global symmetries on the wave function (such as antisymmetry under exchange of fermions) or restrict the wave function to specific quantum numbers, like total charge, angular momentum, or spin. It is also possible to derive strict bounds on quantities like entanglement and correlation length using the mathematical structure of the tensor network. This has made tensor networks useful in theoretical studies of quantum information in many-body systems. They have also proved useful in variational studies of ground states, excited states, and dynamics of strongly correlated many-body systems. == Diagrammatic notation == In general, a tensor network diagram (Penrose diagram) can be viewed as a graph where nodes (or vertices) represent individual tensors, while edges represent summation over an index. Free indices are depicted as edges (or legs) attached to a single vertex only. Sometimes, there is also additional meaning to a node's shape. For instance, one can use trapezoids for unitary matrices or tensors with similar behaviour. This way, flipped trapezoids would be interpreted as complex conjugates to them. == History == Foundational research on tensor networks began in 1971 with a paper by Roger Penrose. In "Applications of negative dimensional tensors" Penrose developed tensor diagram notation, describing how the diagrammatic language of tensor networks could be used in applications in physics. In 1992, Steven R. White developed the density matrix renormalization group (DMRG) for quantum lattice systems. The DMRG was the first successful tensor network and associated algorithm. In 2002, Guifré Vidal and Reinhard Werner attempted to quantify entanglement, laying the groundwork for quantum resource theories. This was also the first description of the use of tensor networks as mathematical tools for describing quantum systems. In 2004, Frank Verstraete and Ignacio Cirac developed the theory of matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. In 2006, Vidal developed the multi-scale entanglement renormalization ansatz (MERA). In 2007 he developed entanglement renormalization for quantum lattice systems. In 2010, Ulrich Schollwock developed the density-matrix renormalization group for the simulation of one-dimensional strongly correlated quantum lattice systems. In 2014, Román Orús introduced tensor networks for complex quantum systems and machine learning, as well as tensor network theories of symmetries, fermions, entanglement and holography. == Connection to machine learning == Tensor networks have been adapted for supervised learning, taking advantage of similar mathematical structure in variational studies in quantum mechanics and large-scale machine learning. This crossover has spurred collaboration between researchers in artificial intelligence and quantum information science. In June 2019, Google, the Perimeter Institute for Theoretical Physics, and X (company), released TensorNetwork, an open-source library for efficient tensor calculations. The main interest in tensor networks and their study from the perspective of machine learning is to reduce the number of trainable parameters (in a layer) by approximating a high-order tensor with a network of lower-order ones. Using the so-called tensor train technique (TT), one can reduce an N-order tensor (containing exponentially many trainable parameters) to a chain of N tensors of order 2 or 3, which gives us a polynomial number of parameters.

    Read more →
  • Smart speaker

    Smart speaker

    A smart speaker is a type of loudspeaker and voice command device with an integrated virtual assistant that offers interactive actions and hands-free activation with the help of one "wake word" (or several "wake words"). Some smart speakers also act as smart home hubs by using Wi-Fi, Bluetooth, Thread, and other protocol standards to extend usage beyond audio playback and control home automation devices connected through a local area network. == History == Early voice-activated devices began in 2013 with MIT's Jasper project, which used multiple microphones and cloud software to power hands-free interactions from across a room. The first commercial smart speaker was the Amazon Echo, which was released in 2014 powered by Alexa and a ring of far-field microphones. Google followed in 2016 with Home, powered by Google Assistant. By 2017, devices like the Echo Show and Home Hub (later called Nest Hub) added touchscreens and video, creating the "smart display" subcategory. In 2018, Apple joined the smart speaker trend by launching the HomePod, which focused on high-quality audio alongside their built-in assistant Siri. ASUS release its own smart Speaker Xiao-Bu in 2019 with Artificial Intelligence, it terminates the Cloud Service on June 1st, 2025, which means all real-time service such as weather, news, currency conversion is affected. Sonos's 1st smart speaker Sonos One released in 2017, powered by Alexa. Invoke by Harman Kardon was powered by Microsoft's intelligent personal assistant, Cortana. In the early 2020s, smart speakers gained on-device voice processing for faster responses and improved privacy. New standards such as Matter and Thread allowed multitudes of smart-home devices (even from completely different brands) to work together. == Features == === Audio and Voice === Smart speakers use multiple microphones along with noise-cancelling software to pick up your voice from across the room, even when music is playing or the assistant is already talking. Noise suppression and echo cancellation is also used by the speaker so it can focus in on who is talking and ignore any background noises. Most smart speaker models can recognize who is speaking by voiceprint, which allows the speaker to grab information from that person's calendar, preferences, or music playlists. Listening to music on a speaker is when importance for good audio quality becomes apparent. Entry-level (cheaper) speakers such as the Home Mini or the Echo Dot have a single full-range driver. These lower-end speakers typically aren't great for listening to music as the audio quality is pretty poor. More advanced units such as the Home Max or Echo Studio have separate tweeters and woofers meant for listening to music in high quality. === Connectivity and smart-home control === Most connect over Wi-Fi or Bluetooth and support hub protocols like Thread and Matter. That lets them not only stream and play music but also allows you to control various brands of smart lights, thermostats, door locks, cameras, and much more-all from one point of control. Each can have its own designated interface and features in-house, usually launched or controlled via application or home automation software. These devices are able to communicate with each other via peer-to-peer connection through mesh networking. These speakers and related smart devices are typically controlled with one smartphone application. === Assistant services and skills === The built-in assistants handle timers, alarms, reminders, news briefings, weather updates, send messages to other smart devices, send texts, make calls, and simple questions. You can combine actions together in what are typically known as routines (for example saying "good morning" turns on lights, starts the coffee, says the weather, and reads the news) and add extra functions known as skills or actions (for things like ordering food or playing trivia games). This hands-free use of smart speakers can help assist those with disabilities. Most other technologies need the user to be able to physically interact with the device. Smart speakers are not bound by these limitations and can serve as an excellent tool for those who are unable to use their arms or legs or have vision issues. Although these tasks can be completed by a phone or computer, consumers tend to lean towards smart speakers due to factors such as their range being much greater than that of a phone and the need to not have to physically interact with the speaker to get the voice assistant as with most smartphones, certain parts of a phone may need to be interacted with to activate the speaking assistant. === Smart displays === Some smart speakers also include a screen to show the user a visual response. A smart speaker with a touchscreen is known as a smart display; these integrate a conversational user interface with display screens to augment voice interaction with images and video. They are powered by one of the common voice assistants and offer additional controls for smart home devices, feature streaming apps, and web browsers with touch controls for selecting content. The first smart displays were introduced in 2017 by Amazon (Amazon Echo Show) and Google (Google/Nest Home Hub). Hotel chain Marriott International partnered with Amazon to install Echo devices in select hotels since 2018. A Taiwanese startup, Aiello, launched the Aiello Voice Assistant (AVA) in the Asian hotel market in 2019, claiming it is powered by a multi-AI model system. Angie by Nomadix, which is similar to the Amazon Echo, launched its first product in 2017, specifically targeting hotel properties in the North America. In May 2019, Angie Hospitality acquired the assets of Roxy, a competitor that also built its own speech-enabled virtual assistant technology for hotels. This acquisition merged two proprietary NLP stacks into the current Nomadix product. === Artificial intelligence === The newest speakers can use on-device AI or cloud-based generative models to allow the smart speaker to carry on much more natural conversations, draft emails or recipes, suggest ideas based on context, or even create short pieces of music or art. This AI evolution allows these speakers to do far more than what they could do before. == Accuracy == According to a study by Proceedings of the National Academy of Sciences of the United States of America released In March 2020, the six biggest tech development companies, Amazon, Apple, Google, Yandex, IBM and Microsoft, have misidentified more words spoken by "black people" than "white people". The systems tested errors and unreadability, with a 19 and 35 percent discrepancy for the former and a 2 and 20 percent discrepancy for the latter. The North American Chapter of the Association for Computational Linguistics (NAACL) also identified a discrepancy between male and female voices. According to their research, Google's speech recognition software is 13 percent more accurate for men than women. It performs better than the systems used by Bing, AT&T, and IBM. == Privacy concerns == The built-in microphone in smart speakers is continuously listening for wake words followed by a command. However, these continuously listening microphones also raise privacy concerns among users. According to a survey taken by 1,007 people in Western Europe, it is clear that privacy is the biggest concern holding consumers back from buying "smart" products. these concerns include what is being recorded, how the data will be used, how it will be protected, and whether it will be used for invasive advertising. Furthermore, an analysis of Amazon Echo Dots showed that 30–38% of "spurious audio recordings were human conversations", suggesting that these devices capture audio other than strictly detection of the wake word. === As a wiretap === There are strong concerns that the ever-listening microphone of smart speakers presents a perfect candidate for wiretapping. In 2017, British security researcher Mark Barnes showed that pre-2017 Echos have exposed pins which allow for a compromised OS to be booted. According to Umar Iqbal, an assistant professor at Washington University in St. Louis, research indicates that data from consumer interactions with Alexa was used to targeted advertisements and products to consumer with over 40% of transmitted data lacking proper encryption raising privacy concerns. Further data indicates that due to the Smart Speakers ability to always capture audio, it begins to pick up on external conversations from consumers not related to commands given to the smart speaker. Things such as other members in the household, consumers on the phone and even TV audio can be picked up by these speakers and stored for future use by companies. === Voice assistance vs privacy === While voice assistants provide a valuable service, there can be some hesitation towards using them in various social contexts, such as in public or around other users. However, only more recently have users begun interac

    Read more →
  • Mivar-based approach

    Mivar-based approach

    The Mivar-based approach is a mathematical tool for designing artificial intelligence (AI) systems. Mivar (Multidimensional Informational Variable Adaptive Reality) was developed by combining production and Petri nets. The Mivar-based approach was developed for semantic analysis and adequate representation of humanitarian epistemological and axiological principles in the process of developing artificial intelligence. The Mivar-based approach incorporates computer science, informatics and discrete mathematics, databases, expert systems, graph theory, matrices and inference systems. The Mivar-based approach involves two technologies: Information accumulation is a method of creating global evolutionary data-and-rules bases with variable structure. It works on the basis of adaptive, discrete, mivar-oriented information space, unified data and rules representation, based on three main concepts: “object, property, relation”. Information accumulation is designed to store any information with possible evolutionary structure and without limitations concerning the amount of information and forms of its presentation. Data processing is a method of creating a logical inference system or automated algorithm construction from modules, services or procedures on the basis of a trained mivar network of rules with linear computational complexity. Mivar data processing includes logical inference, computational procedures and services. Mivar networks allow us to develop cause-effect dependencies (“If-then”) and create an automated, trained, logical reasoning system. Representatives of Russian association for artificial intelligence (RAAI) – for example, V. I. Gorodecki, doctor of technical science, professor at SPIIRAS and V. N. Vagin, doctor of technical science, professor at MPEI declared that the term is incorrect and suggested that the author should use standard terminology. == History == While working in the Russian Ministry of Defense, O. O. Varlamov started developing the theory of “rapid logical inference” in 1985. He was analyzing Petri nets and productions to construct algorithms. Generally, mivar-based theory represents an attempt to combine entity-relationship models and their problem instance – semantic networks and Petri networks. The abbreviation MIVAR was introduced as a technical term by O. O. Varlamov, Doctor of Technical Science, professor at Bauman MSTU in 1993 to designate a “semantic unit” in the process of mathematical modeling. The term has been established and used in all of his further works. The first experimental systems operating according to mivar-based principles were developed in 2000. Applied mivar systems were introduced in 2015. == Mivar == Mivar is the smallest structural element of discrete information space. == Object-property-relation == Object-Property-Relation (VSO) is a graph, the nodes of which are concepts and arcs are connections between concepts. Mivar space represents a set of axes, a set of elements, a set of points of space and a set of values of points. A = { a n } , n = 1 , … , N , {\displaystyle A=\{a_{n}\},n=1,\ldots ,N,} where: A {\displaystyle A} is a set of mivar space axis names; N {\displaystyle N} is a number of mivar space axes. Then: ∀ a n ∃ F n = { f n i n } , n = 1 , … , N , i n = 1 , … , I n , {\displaystyle \forall a_{n}\exists F_{n}=\{f_{{ni}_{n}}\},n=1,\ldots ,N,i_{n}=1,\ldots ,I_{n},} where: F n {\displaystyle F_{n}} is a set of axis a n {\displaystyle a_{n}} elements; i n {\displaystyle i_{n}} is a set F n {\displaystyle F_{n}} element identifier; I n = | F n | . {\displaystyle I_{n}=|F_{n}|.} F n {\displaystyle F_{n}} sets form multidimensional space: M = F 1 × F 2 × ⋯ × F n . {\displaystyle M=F_{1}\times F_{2}\times \cdots \times F_{n}.} m = ( i 1 , i 2 , … , i N ) , {\displaystyle m=(i_{1},i_{2},\ldots ,i_{N}),} where: m ∈ M {\displaystyle m\in M} ; m {\displaystyle m} is a point of multidimensional space; ( i 1 , i 2 , … , i N ) {\displaystyle (i_{1},i_{2},\ldots ,i_{N})} are coordinates of point m {\displaystyle m} . There is a set of values of multidimensional space points of M {\displaystyle M} : C M = { c i 1 , i 2 , … , i N ∣ i 1 = 1 , … , I 1 , i 2 = 1 , … , I 2 , … , i n = 1 , … , I N } , {\displaystyle C_{M}=\{c_{i_{1},i_{2},\ldots ,i_{N}}\mid i_{1}=1,\ldots ,I_{1},i_{2}=1,\ldots ,I_{2},\ldots ,i_{n}=1,\ldots ,I_{N}\},} where: c i 1 , i 2 , … , i N {\displaystyle c_{i_{1},i_{2},\ldots ,i_{N}}} is a value of the point of multidimensional space M {\displaystyle M} is a value of the point of multidimensional space ( i 1 , i 2 , … , i N ) {\displaystyle (i_{1},i_{2},\ldots ,i_{N})} . For every point of space M {\displaystyle M} there is a single value from C M {\displaystyle C_{M}} set or there is no such value. Thus, C M {\displaystyle C_{M}} is a set of data model state changes represented in multidimensional space. To implement a transition between multidimensional space and set of points values the relation μ {\displaystyle \mu } has been introduced: C x = μ ( M x ) , {\displaystyle C_{x}=\mu (M_{x}),} where: M x ⊆ M ; {\displaystyle M_{x}\subseteq M;} M x = F 1 x × F 2 x × ⋯ × F N x . {\displaystyle M_{x}=F_{1x}\times F_{2x}\times \cdots \times F_{Nx}.} To describe a data model in mivar information space it is necessary to identify three axes: The axis of relations « O {\displaystyle O} »; The axis of attributes (properties) « S {\displaystyle S} »; The axis of elements (objects) of subject domain « V {\displaystyle V} ». These sets are independent. The mivar space can be represented by the following tuple: ⟨ V , S , O ⟩ {\displaystyle \langle V,S,O\rangle } Thus, mivar is described by « V S O {\displaystyle VSO} » formula, in which « V {\displaystyle V} » denotes an object or a thing, « S {\displaystyle S} » denotes properties, « O {\displaystyle O} » variety of relations between other objects of a particular subject domain. The category “Relations” can describe dependencies of any complexity level: formulae, logical transitions, text expressions, functions, services, computational procedures and even neural networks. A wide range of capabilities complicates description of modeling interconnections, but can take into consideration all the factors. Mivar computations use mathematical logic. In a simplified form they can be represented as implication in the form of an "if…, then …” formula. The result of mivar modeling can be represented in the form of a bipartite graph binding two sets of objects: source objects and resultant objects. == Mivar network == Mivar network is a method for representing objects of the subject domain and their processing rules in the form of a bipartite directed graph consisting of objects and rules. A Mivar network is a bipartite graph that can be described in the form of a two-dimensional matrix, in that records information about the subject domain of the current task. Generally, mivar networks provide formalization and representation of human knowledge in the form of a connected multidimensional space. That is, a mivar network is a method of representing a piece of mivar space information in the form of a bipartite, directed graph. The mivar space information is formed by objects and connections, which in total represent the data model of the subject domain. Connections include rules for objects processing. Thus, a mivar network of a subject domain is a part of the mivar space knowledge for that domain. The graph can consist of objects-variables and rules-procedures. First, two lists are made that form two nonintersecting partitions: the list of objects and the list of rules. Objects are denoted by circles. Each rule in a mivar network is an extension of productions, hyper-rules with multi-activators or computational procedures. It is proved that from the perspective of further processing, these formalisms are identical and in fact are nodes of the bipartite graph, denoted by rectangles. === Multi-dimensional binary matrices === Mivar networks can be implemented on single computing systems or service-oriented architectures. Certain constraints restrict their application, in particular, the dimension of matrix of linear matrix method for determining logical inference path on the adaptive rule networks. The matrix dimension constraint is due to the fact that implementation requires sending a general matrix to multiple processors. Since every matrix value is initially represented in symbol form, the amount of sent data is crucial when obtaining, for example, 10000 rules/variables. Classical mivar-based method requires storing three values in each matrix cell: 0 – no value; x – input variable for the rule; y – output variable for the rule. The analysis of possibility of firing a rule is separated from determining output variables according to stages after firing the rule. Consequently, it is possible to use different matrices for “search for fired rules” and “setting values for output variables”. This allowsthe use of multidimensional binary m

    Read more →
  • LIVAC Synchronous Corpus

    LIVAC Synchronous Corpus

    LIVAC is an uncommon language corpus dynamically maintained since 1995. Different from other existing corpora, LIVAC has adopted a rigorous and regular "Windows" approach in processing and filtering massive media texts from representative Chinese speech communities such as Beijing, Hong Kong, Macau, Taipei, Singapore, Shanghai, as well as Guangzhou, and Shenzhen. The contents are thus deliberately repetitive in most cases, represented by textual samples drawn from editorials, local and international news, cross-Taiwan Strait news, as well as news on finance, sports and entertainment. By 2023, more than 3 billion characters of news media texts have been filtered, of which 700 million characters have been processed and analyzed and have yielded an expanding Pan-Chinese dictionary of 2.5 million words from the Pan-Chinese printed media. Through rigorous analysis based on computational linguistic methodology, LIVAC has at the same time accumulated a large amount of accurate and meaningful statistical data on the Chinese language and on their diverse speech communities in the Pan-Chinese context, and the results show considerable and important long standing as well as evolving variations. The "Windows" approach is the most innovative feature of LIVAC and has enabled Pan-Chinese media texts to be quantitatively analyzed according to various attributes such as locations, time and subject domains. Thus, various types of comparative studies and applications in information technology as well as development of often related innovative applications have been possible. Moreover, LIVAC has allowed longitudinal developments to be taken into account, facilitating Key Word in Context (KWIC) search and comprehensive study of target words and their underlying concepts as well as linguistic structures over the past 25 years, based on the above mentioned variables of location, time and subject. Results from the extensive and accumulative data analysis contained in LIVAC have enabled the cultivation of textual databases of proper names, place names, organization names, new words, and bi-weekly and annual rosters of media figures. Related applications have included the establishment of verb and adjective databases, the formulation of sentiment indices, and related opinion mining, to measure and compare the popularity of global media figures in the Chinese media (LIVAC Annual Pan-Chinese Celebrity Rosters, later renamed as the Pan-Chinese Newsmaker Rosters). Notable among these are the decades long periodic reviews of the 25 years of annual pan-Chinese rosters since 2000 and compilation of new word databases (LIVAC Annual Pan-Chinese New Word Rosters). On this basis, the analysis of the emergence, diffusion and transformation of new words, and the publication of dictionaries of neologisms have been made possible. A recent focus is on the relative balance between disyllabic words and growing trisyllabic words in the Chinese language, and the comparative study of light verbs in three Chinese speech communities. as well as the link between the language use and use of language as a reflection of epochal change in China. A new LIVAC version 3.1 was launched in February 2024. == Corpus data processing == Accessing media texts, manual input, etc. Text unification including conversion from simplified to traditional Chinese characters, stored as Big5 and Unicode versions Automatic word segmentation Automatic alignment of parallel texts Manual verification, part-of-speech tagging Extraction of words and addition to regional sub-corpora Combination of regional sub-corpora to update the LIVAC corpus, and master lexical database == Labeling for data curation == Categories used include general terms and proper names, such as: general names, surnames, semi titles; geographical, organizations and commercial entities, etc.; time, prepositions, locations, etc.; stack-words; loanwords; case-word; numerals, etc. Construction of databases of proper names, place names, and specific terms, etc. Generate rosters: "new word rosters", "celebrity or media personality rosters", "place name rosters", compound words and matched words Other parts of speech tagging for sub-database, such as common nouns, numerals, numeral classifiers, different types of verbs, and of adjectives, pronouns, adverbs, prepositions, conjunctions, particles marking mood, onomatopoeia, interjection, etc. == Applications == Compilation of Pan-Chinese dictionaries or local dictionaries Information technology research, such as predictive Chinese text input for mobile phones, automatic speech to text conversion, opinion mining Comparative studies on linguistic and cultural developments in the Pan-Chinese regions, especially in a critical period of history in modern China. Language teaching and learning research, and speech-to-text conversion Customized service on linguistic research and lexical search for international corporations and government agencies The above applications are provided by the following functions: Word Segmentation Search Phrase Search Example Sentence Selection Multi-word Comparison Word Cloud

    Read more →
  • Whisper (speech recognition system)

    Whisper (speech recognition system)

    Whisper is a machine learning model for speech recognition and transcription, created by OpenAI and first released as open-source software in September 2022. It is capable of transcribing speech in English and multiple other languages, and can translate several non-English languages into English. Whisper is a weakly-supervised deep learning acoustic model, made using an encoder-decoder transformer architecture. OpenAI claims that the combination of different training data and post-training filtering used in its development has led to improved recognition of accents, background noise, and jargon compared to previous approaches. While the model does not outperform larger, more specialized models and still experiences AI hallucination, it has been showed to be useful for general sound recognition and has many applications across different industries. == Background == Speech recognition has had a long history in research; the first approaches made use of statistical methods, such as dynamic time warping, and later hidden Markov models. At around the 2010s, deep neural network approaches became more common for speech recognition models, which were enabled by the availability of large datasets ("big data") and increased computational performance. Early approaches to deep learning in speech recognition included convolutional neural networks, which were limited due to their inability to capture sequential data, which later led to developments of Seq2seq approaches, which include recurrent neural networks, which made use of long short-term memory. Transformers, introduced in 2017 by Google, displaced many prior state-of-the-art approaches across a wide range in machine learning, and started becoming the core neural architecture in fields such as language modeling and computer vision. Weakly-supervised approaches to training acoustic models were recognized in the early 2020s as promising for speech recognition approaches using deep neural networks. According to a NYT report, in 2021 OpenAI believed they exhausted sources of higher-quality data to train their large language models and decided to complement scraped web text with transcriptions of YouTube videos and podcasts, and developed Whisper to solve this task. Whisper Large V2 was released on December 8, 2022, followed by Whisper Large V3 being released in November 2023, during the OpenAI Dev Day. In March 2025, OpenAI released new transcription models based on GPT-4o and GPT-4o mini, both of which have lower error rates than Whisper. == Architecture == The Whisper architecture is based on an encoder-decoder transformer. Input audio is resampled to 16,000 Hertz (Hz) and converted to an 80-channel Log-magnitude Mel spectrogram using 25 ms windows with a 10 ms stride. The spectrogram is then normalized to a [-1, 1] range with near-zero mean. The encoder takes this Mel spectrogram as input and processes it. It first passes through two convolutional layers. Sinusoidal positional embeddings are added. It is then processed by a series of Transformer encoder blocks (with pre-activation residual connections). The encoder's output is layer normalized. The decoder is a standard transformer decoder. It has the same width and Transformer blocks as the encoder. It uses learned positional embeddings and tied input-output token representations (using the same weight matrix for both the input and output embeddings). It uses a byte-pair encoding tokenizer, of the same kind as used in GPT-2. English-only models use the GPT-2 vocabulary, while multilingual models employ a re-trained multilingual vocabulary with the same number of words. Special tokens are used to allow the decoder to perform multiple tasks: Tokens that denote language (one unique token per language). Tokens that specify task (<|transcribe|> or <|translate|>). Tokens that specify if no timestamps are present (<|notimestamps|>). If the token is not present, then the decoder predicts timestamps relative to the segment, and quantized to 20 ms intervals. <|nospeech|> for voice activity detection. <|startoftranscript|>, and <|endoftranscript|> . Any text that appears before <|startoftranscript|> is not generated by the decoder, but given to the decoder as context. Loss is only computed over non-contextual parts of the sequence, i.e. tokens between these two special tokens. == Training data == The training dataset consists of 680,000 hours of labeled audio-transcript pairs sourced from the internet using semi-supervised learning. This includes 117,000 hours in 96 non-English languages and 125,000 hours of X→English translation data, where X stands for any non-English language. Preprocessing involved standardization of transcripts, filtering to remove machine-generated transcripts using heuristics (e.g., punctuation, capitalization), language identification and matching with transcripts, fuzzy deduplication, and deduplication with evaluation datasets to avoid data contamination. Speechless segments were also included to allow voice activity detection training. For the files still remaining after the filtering process, audio files were then broken into 30-second segments paired with the subset of the transcript that occurs within that time. If this predicted spoken language differed from the language of the text transcript associated with the audio, that audio-transcript pair was not used for training the speech recognition models, but instead for training translation. The model was trained using the AdamW optimizer with gradient norm clipping and a linear learning rate decay with warmup, with batch size 256 segments. Training proceeded for 1 million updates (approximately 2-3 epochs). No data augmentation or regularization, except for the Large V2 model, which used SpecAugment, Stochastic Depth, and BPE Dropout. The training used data parallelism with float16, dynamic loss scaling, and activation checkpointing. === Post-training filtering === After training the first model, researchers ran it on different subsets of the training data, each representing a distinct source. Data sources were ranked by a combination of their error rate and size. Manual inspection of the top-ranked sources (high error, large size) helped determine if the source was low quality (e.g., partial transcriptions, inaccurate alignment). After training, it was fine-tuned to suppress the prediction of speaker names and low-quality sources were then removed. == Capacity == While Whisper does not outperform models which specialize in the LibriSpeech dataset, when tested across many datasets, it is more robust and makes 55.2% fewer errors than other models. Whisper has a differing error rate with respect to transcribing different languages, with a higher word error rate in languages not well-represented in the training data. The authors found that multi-task learning improved overall performance compared to models specialized to one task. They conjectured that the best Whisper model trained is still underfitting the dataset, and larger models and longer training can result in better models. Third-party evaluations have found varying levels of AI hallucination. A study of transcripts of public meetings found hallucinations in eight out of every 10 transcripts, while an engineer discovered hallucinations in "about half" of 100 hours of transcriptions and a developer identified them in "nearly every one" of 26,000 transcripts. A study of 13,140 short audio segments (averaging 10 seconds) found 187 hallucinations (1.4%), 38% of which generated text that could be harmful because it inserted false references to things like race, non-existent medications, or violent events that were not in the audio. == Applications == The model has been used as the base for many applications, such as a unified model for speech recognition and more general sound recognition. Whisper has also been integrated into the workflow of biomedical research. In 2025, a study on Alzheimer's disease detection used the model to transcribe spontaneous speech recordings. The transcripts that were generated by the model were combined with LLM vector embeddings and traditional classifiers to help classify the patients' health. Another application is when OVALYTICS incorporated Whisper to transcribe YouTube videos and automate content moderation systems, which improved its detection of offensive content. The model has also been used in academic libraries and cultral heritage institutions to generate transcripts and captions for their digitized audiovisual collections. In a 2025 case study, Emory University Libraries found that Whisper reduced the labor used in transcription by around 30-35%, shifting work from text creation to text correction. However, human review is still necessary to make sure accuracy, formatting, and accessibility are all standard.

    Read more →
  • Fuzzy architectural spatial analysis

    Fuzzy architectural spatial analysis

    Fuzzy architectural spatial analysis (FASA) (also fuzzy inference system (FIS) based architectural space analysis or fuzzy spatial analysis) is a spatial analysis method of analysing the spatial formation and architectural space intensity within any architectural organization. Fuzzy architectural spatial analysis is used in architecture, interior design, urban planning and similar spatial design fields. == Overview == Fuzzy architectural spatial analysis was developed by Burcin Cem Arabacioglu (2010) from the architectural theories of space syntax and visibility graph analysis, and is applied with the help of a fuzzy system with a Mamdani inference system based on fuzzy logic within any architectural space. Fuzzy architectural spatial analysis model analyses the space by considering the perceivable architectural element by their boundary and stress characteristics and intensity properties. The method is capable of taking all sensorial factors into account during analyses in conformably with the perception process of architectural space which is a multi-sensorial act.

    Read more →