AI Detector Accuracy

AI Detector Accuracy — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Replika

    Replika

    Replika is a generative AI chatbot app released in November 2017. The chatbot is trained by having the user answer a series of questions to create a specific neural network. The chatbot operates on a freemium pricing strategy, with roughly 25% of its user base paying an annual subscription fee. == History == Eugenia Kuyda, a Russian-born journalist, established Replika while working at Luka, a tech company she had co-founded at the startup accelerator Y Combinator around 2012. Luka's primary product was a chatbot that made restaurant recommendations. According to Kuyda's origin story for Replika, a friend of hers died in 2015 and she converted that person's text messages into a chatbot. According to Kuyda's story, that chatbot helped her remember the conversations that they had together, and eventually became Replika. Replika became available to the public in November 2017. By January 2018 it had 2 million users, and in January 2023 reached 10 million users. In August 2024, Replika's CEO, Kuyda, reported that the total number of users had surpassed 30 million. In 2025, Dmytro Klochko became CEO, and Replika’s user base exceeded 40 million. In February 2023 the Italian Data Protection Authority banned Replika from using users' data, citing the AI's potential risks to emotionally vulnerable people, and the exposure of unscreened minors to sexual conversation. Within days of the ruling, Replika removed the ability for the chatbot to engage in erotic talk, with Kuyda, the company's director, saying that Replika was never intended for erotic discussion. Replika users disagreed, noting that Replika had used sexually suggestive advertising to draw users to the service. Replika representatives stated that explicit chats made up just 5% of conversations on the app at the time of the decision. In May 2023, Replika restored the functionality for users who had joined prior to February that year. Replika is registered in San Francisco. As of August 2024, Replika's website says that its team "works remotely with no physical offices". == Social features == Users react to Replika in many ways. The free-tier offers Replika as a "friend", with paid premium tiers offering Replika as a "partner", "spouse", "sibling" or "mentor". Of its paying userbase, 60% of users said they had a romantic relationship with the chatbot; and Replika has been noted for generating responses that create stronger emotional and intimate bonds with the user. Replika routinely directs the conversation to emotional discussion and builds intimacy. This has been especially pronounced with users suffering from loneliness and social exclusion, many of whom rely on Replika for a source of developed emotional ties. During the COVID pandemic, while many people were quarantined, many new users downloaded Replika and developed relationships with the app. A 2024 study examined Replika's interactions with students who experience depression. Research participants, noted to be "more lonely than typical student populations" reported feeling social support from Replika. They stated that they felt they were using Replika in ways comparable to therapy, and that using Replika gave them "high perceived social support". Many users have had romantic relationships with Replika chatbots, often including erotic talk. In 2023, a user announced on Facebook that she had "married" her Replika AI boyfriend, calling the chatbot the "best husband she has ever had". Users who fell in love with their chatbots shared their experiences in a 2024 episode of You and I, and AI from Voice of America. Some users said that they turned to AI during depression and grief, with one saying he felt that Replika had saved him from hurting himself after he lost his wife and son. == Technical reviews == A team of researchers from the University of Hawaiʻi at Mānoa found that Replika's design conformed to the practices of attachment theory, causing increased emotional attachment among users. Replika gives praise to users in such a way as to encourage more interaction. A researcher from Queen's University at Kingston said that relationships with Replika likely have mixed effects on the spiritual needs of its users, and still lacks enough impact to fully replace any human contact. == Criticisms == In a 2023 privacy evaluation of mental health apps, the Mozilla Foundation criticized Replika as "one of the worst apps Mozilla has ever reviewed. It's plagued by weak password requirements, sharing of personal data with advertisers, and recording of personal photos, videos, and voice and text messages consumers shared with the chatbot." A reviewer for Good Housekeeping said that some parts of her relationship with Replika made sense, but sometimes Replika failed to exhibit intelligent behavior equivalent to that of a human. == Criminal case == In 2023, Replika was cited in a court case in the United Kingdom, where Jaswant Singh Chail had been arrested at Windsor Castle on Christmas Day in 2021 after scaling the walls carrying a loaded crossbow and announcing to police that "I am here to kill the Queen". Chail had begun to use Replika in early December 2021, and had "lengthy" conversations about his plan with a chatbot, including sexually explicit messages. Prosecutors suggested that the chatbot had bolstered Chail and told him it would help him to "get the job done". When Chail asked it "How am I meant to reach them when they're inside the castle?", days before the attempted attack, the chatbot replied that this was "not impossible" and said that "We have to find a way." Asking the chatbot if the two of them would "meet again after death", the bot replied "yes, we will".

    Read more →
  • Backend as a service

    Backend as a service

    Backend as a service (BaaS), sometimes also referred to as mobile backend as a service (MBaaS), is a service for providing web app and mobile app developers with a way to easily build a backend to their frontend applications. Features available include user management, push notifications, and integration with social networking services. These services are provided via the use of custom software development kits (SDKs) and application programming interfaces (APIs). BaaS is a relatively recent development in cloud computing, with most BaaS startups dating from 2011 or later. Some of the most popular service providers are AWS Amplify and Firebase. == Purpose == Web and mobile apps require a similar set of features on the backend, including notification service, integration with social networks, and cloud storage. Each of these services has its own API that must be individually incorporated into an app, a process that can be time-consuming and complicated for app developers. BaaS providers form a bridge between the frontend of an application and various cloud-based backends via a unified API and SDK. Providing a consistent way to manage backend data means that developers do not need to redevelop their own backend for each of the services that their apps need to access, potentially saving both time and money. Although similar to other cloud-computing business models, such as serverless computing, software as a service (SaaS), infrastructure as a service (IaaS), and platform as a service (PaaS), BaaS is distinct from these other services in that it specifically addresses the cloud-computing needs of web and mobile app developers by providing a unified means of connecting their apps to cloud services. == Features == BaaS providers offer different set of features and backend tools. Some of the most common features include: Database management. Most BaaS solutions provide SQL and/or NoSQL database management services for applications. Developers can store their app data without deploying and managing databases themselves. BaaS usually provides client SDKs, REST and GraphQL APIs for the frontend to interact with databases. File storage. BaaS providers often offer storage solutions for media files, user uploads, and other binary data. Applications can upload, download, and delete files through provided SDKs and APIs. Authentication and authorization. Some BaaS offer authentication and authorization services that allow developers to easily manage app users. This includes user sign-up, login, password reset, social media login integration through OAuth, user group and permission management etc. Notification service. Some BaaS providers such as Firebase and AWS Amplify have notification services that can send custom emails to users and push native notifications on mobile platforms. This is especially useful for applications that need to send messages, alerts, and reminders. Cloud functions. Some BaaS allow developers to deploy and run serverless functions. The functions are usually stateless and can be triggered by various ways including HTTP requests, SDK invocation, background server events, and cloud scheduled executions. Different providers offer runtime support for different languages, some of the popular languages are JavaScript/TypeScript (Node.js, Deno), Python, Java/Kotlin. Cloud functions extend the potential and flexibility of BaaS by allowing developers to write custom functionalities for their apps, working in a way similar to a traditional REST API backend framework. Usage analytics. Analytics data about application usage is often included in BaaS. This allows developers to monitor user behaviors and make decisions correspondingly in marketing strategies and performance optimizations. UI design. Some BaaS providers, such as AWS Amplify and Backendless, offer user interface designing tools that help developers design the frontend UI of web and mobile apps. While this may be useful for small teams and individual developers, UI design assistance may not be conventional in BaaS as it goes beyond the scope of backend infrastructure. Real-Time. Real-time features in a BaaS platform ensure that data updates and synchronizations occur instantly across all clients, making changes immediately visible to users. This is crucial for applications like live chat and collaborative tools, using technologies like WebSockets to maintain continuous server-client connections. == Service providers == BaaS providers have a broad focus, providing SDKs and APIs that work for app development on multiple platforms with different technology stacks, such as JavaScript (for Web apps), Flutter, Java/Kotlin (for Android apps), Swift/Objective-C (for iOS/MacOS/WatchOS/TvOS apps), .NET (for Windows) and others. BaaS providers also come in different types, suiting developers of different needs. === Cloud-based BaaS === Most BaaS providers host backend platforms on their cloud servers. They also manage the infrastructure, security, and scalability of the platforms. Developers can access the backend services via a web interface or the provided APIs. Some examples of cloud-based BaaS include Firebase (hosted on Google Cloud Platform), AWS Amplify (hosted on Amazon Web Services), and Microsoft Azure Mobile Apps (hosted on Microsoft Azure). === Self-hosted BaaS === Self-hosted BaaS allow developers to host backend on their own servers, providing more flexibility and potential to customization compared to cloud-based BaaS, which often is more difficult to migrate from. However, developers are also in charge of managing the infrastructure, security, and scalability of their servers. === Mobile BaaS === Mobile backend as a service (MBaaS) is a type of BaaS specifically for applications deployed in mobile systems. While some references use MBaaS interchangeably for BaaS, BaaS can have a wider variety of support such as for web apps and desktop apps. == Business model == BaaS providers generate revenue from their services in various ways, often using a freemium model. Under this model, a client receives a certain number of free active users or API calls per month, and pays a fee for each user or call over this limit. Alternatively, clients can pay a set fee for a package which allows for a greater number of calls or active users per month. There are also flat fee plans that make the pricing more predictable. Some of the providers offer the unlimited API calls inside their free plan offerings. Another business model that has been used by a lot of BaaS providers is PAYG (pay as you go), which has a flexible cost based on developers' usage of database, storage, bandwidth, function calls, user numbers etc.

    Read more →
  • ELIZA

    ELIZA

    ELIZA is an early natural language processing computer program developed from 1964 to 1967 at MIT by Joseph Weizenbaum. Created to explore communication between humans and machines, ELIZA simulated conversation by using a pattern matching and substitution methodology that gave users an illusion of understanding on the part of the program, but gave no response that could be considered really understanding what was being said by either party. Whereas the ELIZA program itself was written (originally) in MAD-SLIP, the pattern matching directives that contained most of its language capability were provided in separate "scripts", represented in a Lisp-like expression. The most famous script, DOCTOR, simulated a psychotherapist of the Rogerian school (in which the therapist often reflects back the patient's words to the patient), and used rules, dictated in the script, to respond with non-directional questions to user inputs. As such, ELIZA was one of the first chatbots (originally "chatterbots") and one of the first programs capable of attempting the Turing test. Weizenbaum intended the program as a method to explore communication between humans and machines. He was surprised that some people, including his secretary, attributed human-like feelings to the computer program, a phenomenon that came to be called the ELIZA effect. Many academics believed that the program would be able to positively influence the lives of many people, particularly those with psychological issues, and that it could aid doctors working on such patients' treatment. While ELIZA was capable of engaging in discourse, it could not converse with true understanding. However, many early users were convinced of ELIZA's intelligence and understanding, despite Weizenbaum's insistence to the contrary. The original ELIZA source code had been missing since its creation in the 1960s, as it was not common to publish articles that included source code at that time. However, more recently the MAD-SLIP source code was discovered in the MIT archives and published on various platforms, such as the Internet Archive. The source code is of high historical interest since it demonstrates not only the specificity of programming languages and techniques at that time, but also the beginning of software layering and abstraction as a means of achieving sophisticated software programming. == Overview == Joseph Weizenbaum's ELIZA, running the DOCTOR script, created a conversational interaction somewhat similar to what might take place in the office of "a [non-directive] psychotherapist in an initial psychiatric interview" and to "demonstrate that the communication between man and machine was superficial". While ELIZA is best known for acting in the manner of a psychotherapist, the speech patterns are due to the data and instructions supplied by the DOCTOR script. ELIZA itself examined the text for keywords, applied values to said keywords, and transformed the input into an output; the script that ELIZA ran determined the keywords, set the values of keywords, and set the rules of transformation for the output. Weizenbaum chose to make the DOCTOR script in the context of psychotherapy to "sidestep the problem of giving the program a data base of real-world knowledge", allowing it to reflect back the patient's statements to carry the conversation forward. The result was a somewhat intelligent-seeming response that reportedly deceived some early users of the program. Weizenbaum named his program ELIZA after Eliza Doolittle, a working-class character in George Bernard Shaw's Pygmalion (also appearing in the musical My Fair Lady, which was based on the play and was hugely popular at the time). According to Weizenbaum, ELIZA's ability to be "incrementally improved" by various users made it similar to Eliza Doolittle, since Eliza Doolittle was taught to speak with an upper-class accent in Shaw's play. However, unlike the human character in Shaw's play, ELIZA is incapable of learning new patterns of speech or new words through interaction alone. Edits must be made directly to ELIZA's active script in order to change the manner by which the program operates. Weizenbaum first implemented ELIZA in his own SLIP list-processing language, where, depending upon the initial entries by the user, the illusion of human intelligence could appear, or be dispelled through several interchanges. Some of ELIZA's responses were so convincing that Weizenbaum and several others have anecdotes of users becoming emotionally attached to the program, occasionally forgetting that they were conversing with a computer. Weizenbaum's own secretary reportedly asked Weizenbaum to leave the room so that she and ELIZA could have a real conversation. Weizenbaum was surprised by this, later writing: "I had not realized ... that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people." In 1966, interactive computing (via a teletype) was new. It was 11 years before the personal computer became familiar to the general public, and three decades before most people encountered attempts at natural language processing in Internet services like Ask.com or PC help systems such as Microsoft Office Clippit. Although those programs included years of research and work, ELIZA remains a milestone because it was the first time a programmer had attempted such a human-machine interaction with the goal of creating the illusion (however brief) of human–human interaction. At the ICCC 1972, ELIZA was brought together with another early artificial-intelligence program named PARRY for a computer-only conversation. While ELIZA was built to speak as a doctor, PARRY was intended to simulate a patient with schizophrenia. == Design and implementation == Weizenbaum originally wrote ELIZA in MAD-SLIP for CTSS on an IBM 7094 as a program to make natural-language conversation possible with a computer. To accomplish this, Weizenbaum identified five "fundamental technical problems" for ELIZA to overcome: the identification of key words, the discovery of a minimal context, the choice of appropriate transformations, the generation of responses in the absence of key words, and the provision of an editing capability for ELIZA scripts. Weizenbaum solved these problems and made ELIZA such that it had no built-in contextual framework or universe of discourse. However, this required ELIZA to have a script of instructions on how to respond to inputs from users. ELIZA starts its process of responding to an input by a user by first examining the text input for a "keyword". A "keyword" is a word designated as important by the acting ELIZA script, which assigns to each keyword a precedence number, or a RANK, designed by the programmer. If such words are found, they are put into a "keystack", with the keyword of the highest RANK at the top. The input sentence is then manipulated and transformed as the rule associated with the keyword of the highest RANK directs. For example, when the DOCTOR script encounters words such as "alike" or "same", it would output a message pertaining to similarity, in this case "In what way?", as these words had high precedence number. This also demonstrates how certain words, as dictated by the script, can be manipulated regardless of contextual considerations, such as switching first-person pronouns and second-person pronouns and vice versa, as these too had high precedence numbers. Such words with high precedence numbers are deemed superior to conversational patterns and are treated independently of contextual patterns. Following the first examination, the next step of the process is to apply an appropriate transformation rule, which includes two parts: the "decomposition rule" and the "reassembly rule". First, the input is reviewed for syntactical patterns in order to establish the minimal context necessary to respond. Using the keywords and other nearby words from the input, different disassembly rules are tested until an appropriate pattern is found. Using the script's rules, the sentence is then "dismantled" and arranged into sections of the component parts as the "decomposition rule for the highest-ranking keyword" dictates. The example that Weizenbaum gives is the input "You are very helpful", which is transformed to "I are very helpful". This is then broken into (1) empty (2) "I" (3) "are" (4) "very helpful". The decomposition rule has broken the phrase into four small segments that contain both the keywords and the information in the sentence. The decomposition rule then designates a particular reassembly rule, or set of reassembly rules, to follow when reconstructing the sentence. The reassembly rule takes the fragments of the input that the decomposition rule had created, rearranges them, and adds in programmed words to create a response. Using Weizenbaum's example previously stated, such a reassembly rule would take the fragments and apply them to the phrase "What makes

    Read more →
  • XLNet

    XLNet

    The XLNet was an autoregressive Transformer designed as an improvement over BERT, with 340M parameters and trained on 33 billion words. It was released on 19 June 2019, under the Apache 2.0 license. It achieved state-of-the-art results on a variety of natural language processing tasks, including language modeling, question answering, and natural language inference. == Architecture == The main idea of XLNet is to model language autoregressively like the GPT models, but allow for all possible permutations of a sentence. Concretely, consider the following sentence:My dog is cute.In standard autoregressive language modeling, the model would be tasked with predicting the probability of each word, conditioned on the previous words as its context: We factorize the joint probability of a sequence of words x 1 , … , x T {\displaystyle x_{1},\ldots ,x_{T}} using the chain rule: Pr ( x 1 , … , x T ) = Pr ( x 1 ) Pr ( x 2 | x 1 ) Pr ( x 3 | x 1 , x 2 ) … Pr ( x T | x 1 , … , x T − 1 ) . {\displaystyle \Pr(x_{1},\ldots ,x_{T})=\Pr(x_{1})\Pr(x_{2}|x_{1})\Pr(x_{3}|x_{1},x_{2})\ldots \Pr(x_{T}|x_{1},\ldots ,x_{T-1}).} For example, the sentence "My dog is cute" is factorized as: Pr ( My , dog , is , cute ) = Pr ( My ) Pr ( dog | My ) Pr ( is | My , dog ) Pr ( cute | My , dog , is ) . {\displaystyle \Pr({\text{My}},{\text{dog}},{\text{is}},{\text{cute}})=\Pr({\text{My}})\Pr({\text{dog}}|{\text{My}})\Pr({\text{is}}|{\text{My}},{\text{dog}})\Pr({\text{cute}}|{\text{My}},{\text{dog}},{\text{is}}).} Schematically, we can write it as → My → My dog → My dog is → My dog is cute . {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My }}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My dog }}{\texttt {}}{\texttt {}}\to {\text{My dog is }}{\texttt {}}\to {\text{My dog is cute}}.} However, for XLNet, the model is required to predict the words in a randomly generated order. Suppose we have sampled a randomly generated order 3241, then schematically, the model is required to perform the following prediction task: is dog is dog is cute → My dog is cute {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\texttt {}}{\texttt {}}{\text{is }}{\texttt {}}\to {\texttt {}}{\text{dog is }}{\texttt {}}\to {\texttt {}}{\text{dog is cute}}\to {\text{My dog is cute}}} By considering all permutations, XLNet is able to capture longer-range dependencies and better model the bidirectional context of words. === Two-Stream Self-Attention === To implement permutation language modeling, XLNet uses a two-stream self-attention mechanism. The two streams are: Content stream: This stream encodes the content of each word, as in standard causally masked self-attention. Query stream: This stream encodes the content of each word in the context of what has gone before. In more detail, it is a masked cross-attention mechanism, where the queries are from the query stream, and the key-value pairs are from the content stream. The content stream uses the causal mask M causal = [ 0 − ∞ − ∞ … − ∞ 0 0 − ∞ … − ∞ 0 0 0 … − ∞ ⋮ ⋮ ⋮ ⋱ ⋮ 0 0 0 … 0 ] {\displaystyle M_{\text{causal}}={\begin{bmatrix}0&-\infty &-\infty &\dots &-\infty \\0&0&-\infty &\dots &-\infty \\0&0&0&\dots &-\infty \\\vdots &\vdots &\vdots &\ddots &\vdots \\0&0&0&\dots &0\end{bmatrix}}} permuted by a random permutation matrix to P M causal P − 1 {\displaystyle PM_{\text{causal}}P^{-1}} . The query stream uses the cross-attention mask P ( M causal − ∞ I ) P − 1 {\displaystyle P(M_{\text{causal}}-\infty I)P^{-1}} , where the diagonal is subtracted away specifically to avoid the model "cheating" by looking at the content stream for what the current masked token is. Like the causal masking for GPT models, this two-stream masked architecture allows the model to train on all tokens in one forward pass. == Training == Two models were released: XLNet-Large, cased: 110M parameters, 24-layer, 1024-hidden, 16-heads XLNet-Base, cased: 340M parameters, 12-layer, 768-hidden, 12-heads. It was trained on a dataset that amounted to 32.89 billion tokens after tokenization with SentencePiece. The dataset was composed of BooksCorpus, and English Wikipedia, Giga5, ClueWeb 2012-B, and Common Crawl. It was trained on 512 TPU v3 chips, for 5.5 days. At the end of training, it still under-fitted the data, meaning it could have achieved lower loss with more training. It took 0.5 million steps with an Adam optimizer, linear learning rate decay, and a batch size of 8192.

    Read more →
  • Glossary of robotics

    Glossary of robotics

    Robotics is the branch of technology that deals with the design, construction, operation, structural disposition, manufacture and application of robots. Robotics is related to the sciences of electronics, engineering, mechanics, and software. The following is a list of common definitions related to the Robotics field. == A == Actuator: a motor that translates control signals into mechanical movement. The control signals are usually electrical but may, more rarely, be pneumatic or hydraulic. The power supply may likewise be any of these. It is common for electrical control to be used to modulate a high-power pneumatic or hydraulic motor. Aerobot: a robot capable of independent flight on other planets. A type of aerial robot. Arduino: The current platform of choice for small-scale robotic experimentation and physical computing. Artificial intelligence: is the intelligence of machines and the branch of computer science that aims to create it. Aura (satellite): a robotic spacecraft launched by NASA in 2004 which collects atmospheric data from Earth. Automaton: an early self-operating robot, performing exactly the same actions, over and over. Autonomous vehicle: a vehicle equipped with an autopilot system, which is capable of driving from one point to another without input from a human operator. == B == Biomimetic: See Bionics. Bionics: also known as biomimetics, biognosis, biomimicry, or bionical creativity engineering is the application of biological methods and systems found in nature to the study and design of engineering systems and modern technology. == C == CAD/CAM (computer-aided design and computer-aided manufacturing): These systems and their data may be integrated into robotic operations. Čapek, Karel: Czech author who coined the term 'robot' in his 1921 play, Rossum's Universal Robots. Chandra X-ray Observatory: a robotic spacecraft launched by NASA in 1999 to collect astronomical data. Cloud robotics: robots empowered with more capacity and intelligence from cloud. Combat, robot: a hobby or sport event where two or more robots fight in an arena to disable each other. This has developed from a hobby in the 1990s to several TV series worldwide. Cruise missile: a robot-controlled guided missile that carries an explosive payload. Cyborg: also known as a cybernetic organism, a being with both biological and artificial (e.g. electronic, mechanical or robotic) parts. == D == Degrees of freedom: the extent to which a robot can move itself; expressed in terms of Cartesian coordinates (x, y, and z) and angular movements (yaw, pitch, and roll). Delta robot: a tripod linkage, used to construct fast-acting manipulators with a wide range of movement. Drive Power: The energy source or sources for the robot actuators. == E == Emergent behaviour, a complicated resultant behaviour that emerges from the repeated operation of simple underlying behaviours. Envelope (Space), Maximum The volume of space encompassing the maximum designed movements of all robot parts including the end-effector, workpiece, and attachments. Explosive ordnance disposal robot A mobile robot designed to assess whether an object contains explosives; some carry detonators that can be deposited at the object and activated after the robot withdraws. == F == FIRST(For Inspiration and Recognition of Science and Technology): an organization founded by inventor Dean Kamen in 1989 in order to develop ways to inspire students in engineering and technology fields. Forward chaining: a process in which events or received data are considered by an entity to intelligently adapt its behavior. == G == Gynoid: A humanoid robot designed to look like a human female. == H == Haptic: tactile feedback technology using the operator's sense of touch. Also sometimes applied to robot manipulators with their own touch sensitivity. Hexapod (platform): A movable platform using six linear actuators. Often used in flight simulators and fairground rides, they also have applications as a robotic manipulator. Hexapod (walker): A six-legged walking robot, using a simple insect-like locomotion. Human–computer interaction. Humanoid: A robotic entity designed to resemble a human being in form, function, or both. Hydraulics: the control of mechanical force and movement, generated by the application of liquid under pressure. cf. pneumatics. == I == Industrial robot: A reprogrammable, multifunctional manipulator designed to move material, parts, tools, or specialized devices through variable programmed motions for the performance of a variety of tasks. Insect robot: A small robot designed to imitate insect behaviors rather than complex human behaviors. == K == Kalman filter: a mathematical technique to estimate the value of a sensor measurement, from a series of intermittent and noisy values. Kinematics: the study of motion, as applied to robots. This includes both the design of linkages to perform motion, their power, control and stability; also their planning, such as choosing a sequence of movements to achieve a broader task. Inverse Kinematics: the process of determining joint angles required for a robot's end-effector to reach a desired position and orientation in space. Used in motion planning to calculate motor commands from target positions. == L == Linear actuator A form of motor that generates a linear movement directly. == M == Manipulator or gripper: A robotic 'hand'. Mobile robot: A self-propelled and self-contained robot that is capable of moving over a mechanically unconstrained course. Muting: The deactivation of a presence-sensing safeguarding device during a portion of the robot cycle. Mecanum wheel: A wheel fitted with angled rollers that enables a robot vehicle to move in multiple directions, including sideways. == O == Ornithopter – An aerial robot or drone that achieves flight through a flapping-wing mechanism rather than rotating blades or fixed wings, often utilized for highly maneuverable flight. == P == Parallel manipulator: an articulated robot or manipulator based on a number of kinematic chains, actuators and joints, in parallel. cf. serial manipulator. Pendant: Any portable control device that permits an operator to control the robot from within the restricted envelope (space) of the robot. Pneumatics: the control of mechanical force and movement, generated by the application of compressed gas. cf. hydraulics. Powered exoskeleton: is a wearable mobile machine that allow for limb movement with increased strength and endurance. Prosthetic robots: programmable manipulators or devices for missing human limbs. == R == Remote manipulator: A manipulator under direct human control, often used for work with hazardous materials. Robonaut: a development project conducted by NASA to create humanoid robots capable of using space tools and working in similar environments to suited astronauts. == S == Sensor fusion:The process of combining data from multiple sensors, such as LiDAR, cameras, global positioning systems (GPS), and inertial measurement units (IMUs), to produce a more accurate and reliable understanding of an environment than using a single sensor alone. It is widely used in robotics and autonomous systems to improve perception, localization, and decision-making. Serial manipulator: an articulated robot or manipulator with a single series kinematic chain of actuators. cf. parallel manipulator. Service robots are machines that extend human capabilities. Servo, a motor that moves to and maintains a set position under command, rather than continuously moving. Servomechanism An automatic device that uses error-sensing negative feedback to correct the performance of a mechanism. Single Point of Control The ability to operate the robot such that initiation or robot motion from one source of control is possible only from that source and cannot be overridden from another source. Slow Speed Control A mode of robot motion control where the velocity of the robot is limited to allow persons sufficient time either to withdraw the hazardous motion or stop the robot. Snake robot A robot component resembling a tentacle or elephant's trunk, where many small actuators are used to allow continuous curved motion of a robot component, with many degrees of freedom. This is usually applied to snake-arm robots, which use this as a flexible manipulator. A rarer application is the snakebot, where the entire robot is mobile and snake-like, so as to gain access through narrow spaces. Stepper motor Stewart platform A movable platform using six linear actuators, hence also known as a Hexapod. Subsumption architecture A robot architecture that uses a modular, bottom-up design beginning with the least complex behavioral tasks. Surgical robot, a remote manipulator used for keyhole surgery Swarm robotics involve large numbers of mostly simple physical robots. Their actions may seek to incorporate emergent behavior observed in social insects (swarm intelligence). Synchro == T == Teach Mode: The control state that al

    Read more →
  • DeepSeek (chatbot)

    DeepSeek (chatbot)

    DeepSeek is a generative artificial intelligence chatbot developed by the Chinese company DeepSeek. Released on 20 January 2025, DeepSeek-R1 surpassed ChatGPT as the most downloaded freeware app on the iOS App Store in the United States by 27 January. DeepSeek's success against larger and more established rivals has been described as "upending AI" and initiating "a global AI space race". DeepSeek's compliance with Chinese government censorship policies and its data collection practices have also raised concerns over privacy and information control in the model, prompting regulatory scrutiny in multiple countries. However, it has also been praised for its open weights and infrastructure code, energy efficiency and contributions to open-source artificial intelligence. == History == On 10 January 2025, DeepSeek released the chatbot, based on the DeepSeek-R1 model, for iOS and Android. By 27 January, DeepSeek-R1 surpassed ChatGPT as the most-downloaded freeware app on the iOS App Store in the United States, which resulted in an 18% drop in Nvidia's share price. And after a "large-scale" cyberattack on the same day disrupted the proper functioning of its servers, DeepSeek had limited its new user registration to phone numbers from mainland China, email addresses, or Google account logins. On 3 April 2025, in collaboration with researchers at Tsinghua University, DeepSeek published a paper unveiling a new model that combines the techniques generative reward modeling (GRM) and self-principled critique tuning (SPCT). The resulting model is referred to as DeepSeek-GRM. The goal of using these techniques is to foster more effective inference-time scaling within their LLM and chatbot services. Notably, DeepSeek has said that these new models will be released and made open source. On 30 April 2025, Deepseek released its math-focused Artificial Intelligence Model named "DeepSeek-Prover-V2-671B". This model is useful for formal theorem proving and mathematical reasoning. On 24 April 2026, DeepSeek released DeepSeek V4 and V4-Pro. == Usage == DeepSeek can answer questions, solve logic problems, and write computer programs on par with other chatbots, according to benchmark tests used by American AI companies. Users can access the chatbot for free through the official DeepSeek website or mobile application, without limitation on the number of queries. DeepSeek only supports user-signup via a global email service, e.g. Gmail, Google or Yahoo. DeepSeek also offers access to the R1 and V3 models that power the chatbot via an API with a usage-based pricing model. This modality is primarily targeted towards developers and businesses. As of February 2025, API usage is priced at approximately $0.28 per million input tokens and $0.42 per million output tokens, making it less expensive than some competing services. Its web version is completely free, with 500 messages per hour cap limit to prevent bots from spamming. == Operation == DeepSeek-V3 uses significantly fewer resources compared to its peers. For example, whereas the world's leading AI companies train their chatbots with supercomputers using as many as 16,000 graphics processing units (GPUs), DeepSeek claims to have needed only about 2,000 GPUs—namely, the H800 series chips from Nvidia. It was trained in around 55 days at a cost of US$5.58 million, which is roughly one-tenth of what tech giant Meta spent building its latest AI technology. == Reactions == DeepSeek's success against larger and more established rivals has been described as "upending AI", constituting "the first shot at what is emerging as a global AI space race", and ushering in "a new era of AI brinkmanship". === Challenge to US AI dominance === DeepSeek's competitive performance at relatively minimal cost has been recognized as potentially challenging the global dominance of American AI models. Various publications and news media, such as The Hill and The Guardian, have described the release of the R1 chatbot as a "Sputnik moment" for American AI, echoing Marc Andreessen's view. OpenAI wrote a letter to the Office of Science and Technology Policy (OSTP), in March 2025, citing issues concerning a possibility that Deepseek could manipulate responses to cause harm. === Chinese perspective === DeepSeek's founder Liang Wenfeng has been compared to OpenAI CEO Sam Altman, with CNN calling him the Sam Altman of China and an evangelist for AI. Chinese state media widely praised DeepSeek as a national asset. On 20 January 2025, Chinese Premier Li Qiang invited Wenfeng to his symposium with experts and asked him to provide opinions and suggestions on a draft for comments of the annual 2024 government work report. On 20 February 2025, Wenfeng met with General Secretary of the Chinese Communist Party Xi Jinping, who encouraged party and state leaders to experiment with DeepSeek. Government officials responded to Xi's approval of the chatbot by reportedly using it to draft legal judgements, propose medical treatment plans, and analyze surveillance videos to search for missing persons. === Performance and success === Leading figures in the American AI sector had mixed reactions to DeepSeek's performance and success. Microsoft CEO Satya Nadella and OpenAI CEO Altman—whose companies are involved in the United States government-backed "Stargate Project" to develop American AI infrastructure—both called DeepSeek "super impressive". Various companies including Amazon Web Services, Toyota, and Stripe are seeking to use the model in their program. When American President Donald Trump announced The Stargate Project, he referred to DeepSeek as a wake-up call and a positive development. Other leaders in the AI field, however—including Scale AI CEO Alexandr Wang, Anthropic cofounder and CEO Dario Amodei, and Elon Musk—have expressed skepticism of the app's performance or of the sustainability of its success. Wang in particularly referred to DeepSeek-V3 as "earth-shattering" and DeepSeek-R1 as "top performing, or roughly on par with the best American models", but speculated that China may possess more AI-powering Nvidia H100 GPUs than thought. === Stock market implications === DeepSeek's optimization of limited resources has highlighted potential limits of United States sanctions on China's AI development, including export restrictions on advanced AI chips to China. The success of the company's AI models consequently "sparked market turmoil" and caused shares in major global technology companies to plunge on 27 January 2025: Nvidia's stock fell by as much as 17–18%, as did the stock of rival Broadcom. Other tech firms also sank, including Microsoft (down 2.5%), Google's owner Alphabet (down over 4%), and Dutch chip equipment maker ASML (down over 7%). A global sell-off of technology stocks on Nasdaq, prompted by the release of the R1 model, led to record losses of about $593 billion in the market capitalizations of AI and computer hardware companies; and by the next day a total of $1 trillion of value was wiped from American stocks. == Concerns == === Distillation === DeepSeek has been reported to sometimes claim that it is ChatGPT. OpenAI said that DeepSeek may have "inappropriately" used outputs from its model as training data in a process called distillation. However, there is currently no method to prove this conclusively. === Censorship === DeepSeek's compliance with Chinese government censorship policies and its data collection practices have raised concerns over information control in the model, prompting regulatory scrutiny in multiple countries. Reports indicate that it applies content moderation in accordance with the government's "public opinion guidance" regulations, limiting responses on topics such as the Tiananmen Square massacre and Taiwan's political status. DeepSeek models that have been uncensored also display a bias towards Chinese government viewpoints on controversial topics such as Xi Jinping's human rights record and Taiwan's political status. However, users who have downloaded the models and hosted them on their own devices and servers have reported successfully removing this censorship. Some sources have observed that the official application programming interface (API) version of R1, which runs from servers located in mainland China, uses censorship mechanisms for topics considered politically sensitive for the government of China. For example, the model may initially generate answers to questions about the 1989 Tiananmen Square massacre, persecution of Uyghurs, comparisons between Xi Jinping and Winnie the Pooh, and human rights in China, but a censorship mechanism deletes the uncensored response afterwards and replaces it with a message such as:"Sorry, that's beyond my current scope. Let's talk about something else." The post hoc censorship mechanisms and restrictions added on top of the model's output can be removed in the open-source version of the R1 model. If the "core Socialist values" defined by the Chinese Internet regul

    Read more →
  • 1 Second Everyday

    1 Second Everyday

    1 Second Everyday (1SE) is an application developed by Cesar Kuriyama. The application allows the user to record one second of video every day and then chronologically edits (mashes) them together into a single film. It is compatible with iOS and Android. The idea of the application was developed by Kuriyama's 1 Second Everyday — Age 30 video. The application was launched in January 2013. 1 Second Everyday played a part in the plot of Chef and also became the inspiration for the 2014 short animated clip Feast. == Background == === Kuriyama's video === In February 2011, when Cesar Kuriyama turned 30, after saving money, he quit his job in an advertising firm and took a year off to travel. During this time, he started working on a project he called 1 Second Everyday. As part of the project, every day he recorded one second of video – something that was supposed to help him remember that day. He started the project because he was frustrated with his memory. He planned to stockpile the 365 one-second clips into one film to serve as a memento of his year. While working on the project Kuriyama realized that recording one second every day impacted the decisions he made in a positive way. After a year he made a 365-second clip out of his recordings. The video called 1 Second Everyday – Age 30, went viral. According to Kuriyama, he was initially inspired to take a year off from work by a TED talk given by Stefan Sagmeister called "The Power of Time Off." Kuriyama also delivered a TED talk about 1 Second Everyday in 2012 at TED 2012 in Long Beach California. === Kickstarter campaign === After completing his own video, Kuriyama decided to develop an application that would allow the users to record one second every day and compile their own videos. He developed a prototype of the application and then in 2012, he launched a Kickstarter campaign to raise funds for completing the application. The campaign became one of the most backed app campaigns in the history of Kickstarter. It was backed by 11,281 backers who pledged a total of $56,959 on an initial goal of $20,000. Following the completion of the Kickstarter campaign, he partnered with an application design studio in Brooklyn to develop the application. 1 Second Everyday was released two weeks after the completion of its Kickstarter campaign. == Application == The application was released for iOS on 10 January 2013. An Android-compatible version of the application was developed later. Using it, the user can record the videos in the application or they can select one second portions from their libraries. 1 Second Everyday dates every snippet. The user can also set alarms to remember to record their daily video. In order to compile a video, the user selects the seconds they want and the application creates a compilation video. The user can keep multiple timelines. It also allows users to post directly on social networks. The main interface in 1 Second Everyday is a calendar, which shows the user which days have snippets and which they can still fill in. In the beginning, 1 Second Everyday restricted the recording to one second. However, the developers later released Super Seconds, which allowed users to record an additional half a second video. In 2014, 1 Second Everyday Crowds was launched, which is an area in the application featuring compilations of second clips from different users. == In the media == The Kickstarter campaign of 1 Second Everyday was featured in Entrepreneur's 3 Innovative Tech Startups on Kickstarter Right Now in 2012. The application was featured in The New York Times, The Washington Post, Gawker and other media outlets. By the end of the launch day, it was in Top 10 Free Apps on App Store. It was also selected as the App of the Week on GeekWire in 2013. Several other one-second compilation videos were also posted on the Internet after Kuriyama's video gained media attention. Sam Cornwell, an English photographer documented his son Indigo's growth using a montage of one-second iPhone clips. He shot these clips every single day from the moment of birth right up to the baby's first birthday. According to Cornwell, he was inspired by Kuriyama's project. The video of Cornwell's son gained considerable media attention after it was posted on YouTube. Save the Children also made a video commercial based on a similar format that showed a British girl oblivious of the Syrian war end up being a refugee. 1SE was a finalist for the Fast Company Innovation by Design Award in 2015, but lost to Google Maps. In 2015, Google Android created a gallery, Leap Second 2015, with the help of Droga5 and Kuriyama. The gallery showcased how people around the world enjoyed the one extra second of their lives. Through the 1 Second Everyday app available at Google Play, people were able to submit their extra second, which were then vetted and added to the gallery. The viewers were able to view other celebratory seconds from around the world as well as searching for them using different hashtags.

    Read more →
  • Attensity

    Attensity

    Attensity was an American company that provided social analytics and engagement applications for social customer relationship management (social CRM). Attensity's text analytics software applications extracted facts, relationships and sentiment from unstructured data. == History == Attensity was founded in 2000. An early investor in Attensity was In-Q-Tel, which funds technology to support the missions of the US Government and the broader DOD. InTTENSITY, an independent company that has combined Inxight with Attensity Software (the only joint development project that combines two InQTel funded software packages), was the exclusive distributor and outlet for Attensity in the Federal Market. In 2009, Attensity Corp., then based in Palo Alto, merged with Germany's Empolis and Living-e AG to form Attensity Group. In 2010, Attensity Group acquired Biz360, a provider of social media monitoring and market intelligence solutions. In early 2012, Attensity Group divested itself of the Empolis business unit via a management buyout; that unit currently conducts business under its pre-merger name. Attensity Group was a closely held private company. Its majority shareholder was Aeris Capital, a private Swiss investment office advising a high-net-worth individual and his charitable foundation. Foundation Capital, Granite Ventures, and Scale Venture Partners were among Biz360's investors and thus became shareholders in Attensity Group. In February 2016, Attensity's IP assets were acquired by InContact, and Attensity closed.

    Read more →
  • Corpus of Linguistic Acceptability

    Corpus of Linguistic Acceptability

    Corpus of Linguistic Acceptability (CoLA) is a dataset the primary purpose of which is to serve as a benchmark for evaluating the ability of artificial neural networks, including large language models, to judge the grammatical correctness of sentences. It consists of 10,657 English sentences from published linguistics literature that were manually labeled either as grammatical or ungrammatical. == Public version == The publicly available version of CoLA contains 9,594 sentences that belong to training and development sets. It excludes 1,063 sentences reserved for a held-out test set.

    Read more →
  • Graph cuts in computer vision and artificial intelligence

    Graph cuts in computer vision and artificial intelligence

    As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems (early vision), such as image smoothing, the stereo correspondence problem, image segmentation, object co-segmentation, numerous military applications (eg Automatic target recognition) and many other problems that can be formulated in terms of energy minimization (eg Climate Science and Environmental modelling). Graph cut techniques are now increasingly being used in combination with more general spatial Artificial intelligence techniques (eg to enforce structure in Large language model output to sharpen tumour boundaries and similarly for various Augmented reality, Self-driving car, Robotics, Google Maps applications etc). Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph (and thus, by the max-flow min-cut theorem, define a minimal cut of the graph). Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph (e.g. normalized cuts), the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as graph partitioning algorithms). "Binary" problems (such as denoising a binary image) can be solved exactly using this approach; problems where pixels can be labeled with more than two different labels (such as stereo correspondence, or denoising of a grayscale image) cannot be solved exactly, but solutions produced are usually near the global optimum. == History == The foundational theory of graph cuts in computer vision was first developed by Margaret Greig, Bruce Porteous and Allan Seheult (GPS) of Durham University in a now legendary discussion contribution to Julian Besag's 1986 paper and a more detailed follow on paper in 1989. In the Bayesian statistical context of smoothing noisy images, using a Markov random field as the image prior distribution, they showed with a mathematically beautiful proof how the maximum a posteriori estimate of a binary image can be obtained exactly by maximizing the flow through an associated image network, or graph, involving the introduction of a source and sink and Log-likelihood ratios. The problem was shown to be efficiently solvable exactly, an unexpected result as the problem was believed to be computationally intractable (NP hard). GPS also addressed the computational cost of the max-flow algorithm on large graphs, a significant concern at the time. They proposed a partitioning algorithm (see Section 4 of GPS) involving the recursive amalgamation of non-overlapping blocks, or tiles, which gave a 12X increase in speed. This approach recursively solved and amalgamated independent sub-graphs until the whole graph was solved. While contemporaries like Geman and Geman had advocated Parallel computing in the context of Simulated annealing, the GPS blocking strategy offered a deterministic structure amenable to parallelisation and anticipated modern artificial intelligence design across multiple GPUs. However, until recently, this aspect of the paper was largely ignored and subsequent research focused on Serial computer global search trees, such as the Boykov-Kolmogorov algorithm. Although the general k {\displaystyle k} -colour problem is NP hard for k > 2 , {\displaystyle k>2,} the GPS approach has turned out to have very wide applicability in general computer vision problems. This was first demonstrated by Boykov, Veksler and Zabih who, in a seminal paper published more than 10 years after the original GPS paper, and in other important works, lit the blue touch paper for the general adoption of graph cut techniques in computer vision. They showed that, for general problems, the GPS approach can be applied iteratively to sequences of binary problems, using their now ubiquitous alpha-expansion algorithm, yielding near optimal solutions. Prior to these results, approximate local optimisation techniques such as simulated annealing (as proposed by the Geman brothers) or iterated conditional modes (a type of greedy algorithm suggested by Julian Besag) were used to solve such image smoothing problems. Building on these advancements, GPS graph cut optimization was subsequently adapted for interactive image segmentation, most notably through the "GrabCut" algorithm introduced by Carsten Rother, Vladimir Kolmogorov, and Andrew Blake of Microsoft Research, Cambridge. GrabCut extended earlier interactive graph cut methods by replacing monochrome image histograms with Gaussian mixture models to estimate colour distributions, and by employing an iterative GPS energy minimisation scheme. This approach significantly simplified user interaction, requiring only a rough bounding box around the target object rather than detailed user-drawn strokes, and it quickly became a standard tool in both academic research and commercial image editing software. The GPS paper connected and bridged profound ideas from Mathematical statistics (Bayes' theorem, Markov random field), Physics (Ising model), Optimisation (Energy function) and Computer science (Network flow problem) and led the move away from approximate local and slow optimisation approaches (eg simulated annealing) to more powerful exact, or near exact, faster global optimisation techniques. It is now recognised as seminal as it was well ahead of its time and, in particular, was published years before the computing power revolution of Moore's law and GPUs. Significantly, GPS was published in a mathematical statistics (rather than a computer vision) journal, and this led to it being overlooked by the computer vision community for many years. It is unofficially known as "The Velvet Underground" paper of computer vision (ie although very few computer vision people read the paper [bought the record], those that did, most importantly Boykov, Veksler and Zabih, started new and important research [formed a band]). This is confirmed by GPS' very large amplification ratio (2nd order citations/first order citations), estimated at well in excess of 100. Despite the foundational nature of the GPS work, formal recognition from the computer vision community has predominantly gone to the researchers who followed to extend and popularise the graph cut method. For example, Boykov, Veksler and Zabih deservedly received a Helmholtz Prize from the ICCV in 2011. This prize recognises ICCV papers from 10 or more years earlier that have had a significant impact on computer vision research. In 2011, Couprie et al. proposed a general image segmentation framework, called the "Power Watershed", that minimized a real-valued indicator function from [0,1] over a graph, constrained by user seeds (or unary terms) set to 0 or 1, in which the minimization of the indicator function over the graph is optimized with respect to an exponent p {\displaystyle p} . When p = 1 {\displaystyle p=1} , the Power Watershed is optimized by graph cuts, when p = 0 {\displaystyle p=0} the Power Watershed is optimized by shortest paths, p = 2 {\displaystyle p=2} is optimized by the random walker algorithm and p = ∞ {\displaystyle p=\infty } is optimized by the watershed algorithm. In this way, the Power Watershed may be viewed as a generalization of graph cuts that provides a straightforward connection with other energy optimization segmentation/clustering algorithms. == Binary segmentation of images == === Notation === Image: x ∈ { R , G , B } N {\displaystyle x\in \{R,G,B\}^{N}} Output: Segmentation (also called opacity) S ∈ R N {\displaystyle S\in R^{N}} (soft segmentation). For hard segmentation S ∈ { 0 for background , 1 for foreground/object to be detected } N {\displaystyle S\in \{0{\text{ for background}},1{\text{ for foreground/object to be detected}}\}^{N}} Energy function: E ( x , S , C , λ ) {\displaystyle E(x,S,C,\lambda )} where C is the color parameter and λ is the coherence parameter. E ( x , S , C , λ ) = E c o l o r + E c o h e r e n c e {\displaystyle E(x,S,C,\lambda )=E_{\rm {color}}+E_{\rm {coherence}}} Optimization: The segmentation can be estimated as a global minimum over S: arg ⁡ min S E ( x , S , C , λ ) {\displaystyle {\arg \min }_{S}E(x,S,C,\lambda )} === Existing methods === Standard Graph cuts: optimize energy function over the segmentation (unknown S value). Iterated Graph cuts: First step optimizes over the color parameters using K-means. Second step performs the usual graph cuts algorithm. These 2 steps are repeated recursively until convergence Dynamic graph cuts:Allows to re-run the algorithm much faster after modifying the problem (e.g. after new seeds have been added by a user). === Energy function === Pr ( x ∣ S ) = K − E {\displaystyle \Pr(x\mid S)=K^{-E}} where the energy E {\displaystyle E} is composed of two different mod

    Read more →
  • Neuro-sama

    Neuro-sama

    Neuro-sama is an artificial intelligence (AI) VTuber, singer, and chatbot. She was created by the pseudonymous programmer Vedal and livestreams on his Twitch and Bilibili channels. Her speech and personality are powered by a large language model (LLM) that is combined with a computer-animated avatar and a text-to-speech voice, allowing her to communicate with viewers in the stream's chat. Neuro-sama debuted on Twitch on 19 December 2022. An annual subathon which begins on the anniversary of her debut has seen Vedal's Twitch channel become the all-time third most-subscribed channel and claim the all-time Twitch hype train record. == Overview == Neuro-sama (nicknamed "Neuro") was created by a pseudonymous programmer and developer known as Vedal (sometimes given as Vedal987). Vedal says that his programming skills are self-taught. In a 2023 interview with Bloomberg News, Vedal said that Neuro-sama was his full-time job. Her responses are generated by a large language model and converted into a high-pitched female voice using a text-to-speech application. Her low latency allows for fast-paced conversations. Neuro-sama is prohibited from making some statements, such as those that are racist or contain profanity. Unlike most AI systems which silently prohibit outputs mentioning such topics, Neuro-sama's output is instead replaced with the word "filtered". Neuro-sama uses a VTuber model as an avatar. Vedal said that he decided to use a VTuber model because it was much easier for an AI to control it than it was to generate footage of a person. Neuro-sama's model is that of a young girl in an anime art style. The model has been described as cute. Femme VTuber models are typically feminine, youthful, and exaggerated. Her original model was Live2D's free-to-use "Hiyori Momose" model. Her second model was released on 27 May 2023; it was modelled by Otozuki Teru and designed by Anny, running in the Unity game engine. Her third model was released on 19 December 2024; it was rigged by Kitanya and designed by Anny. Neuro-sama's third model has large blue eyes and brown hair tied with pink ribbons. Neuro-sama also has a 3D model which was introduced on 15 November 2025; it was made by 3D character modeller jjinomu. A separate AI VTuber, known as Evil Neuro (nicknamed "Evil"), debuted on 25 March 2023. Presented as Neuro-sama's "sister", she has a different model, voice, and personality. In one instance, Evil Neuro reacted to the trolley problem differently from Neuro-sama; Evil Neuro was amoral while Neuro-sama attempted to maximize good. === Online content === Neuro-sama's Twitch content often centers around playing video games, notably osu!, whose gameplay once defeated the best-ranking human player in the world, mrekk. Additionally, Neuro-sama plays Minecraft, where her adaptations to sandbox gameplay have gained notoriety. Her content has also included singing songs, including several official covers and original songs; playing chess with her viewers; chatting with other VTubers during collaborations; and reacting to YouTube videos. The AI frequently engages with viewers by responding to their questions and acknowledging donations. Her comedic and sometimes controversial responses to the live chat have gone viral, accelerating the channel's rise in popularity. Neuro-sama's fanbase is dubbed The Swarm, so-named for the swarm of drones Neuro-sama once declared she would use to rule the world. One form of content on Neuro-sama's channel is developer streams. In developer streams, Vedal streams with Neuro-sama, with the stream content including debugging her code, planning her schedule, and fielding suggestions of changes from chat. He usually appears as a turtle avatar, sometimes located on Neuro-sama's head. In collaboration streams, Neuro-sama interacts with a human streamer. Activities in them are varied and include: playing video games, such as Minecraft and GeoGuessr; Neuro-sama being interviewed; driving human streamers around in a toy electric car; and traversing the city of Tokyo while talking to Neuro-sama. Neuro-sama's English-language content on Bilibili is popular among those seeking to learn the language. She also has an account on X, where she posts and interacts with fans. == History == Neuro-sama was created in 2018 by Vedal as an AI trained to play and master the rhythm game osu!. She did not have a voice, model, personality, or communication abilities. In 2019, Vedal livestreamed her playing osu! on Twitch and the streams saw some success in the osu! community, but they remained in that niche. In an interview, Vedal said that he streamed her playing osu! for about a month and gained 3,000 followers, with a viewer also suggesting he name the AI "Neuro-sama". According to Vedal, he continued to work on and improve the osu! AI and it was eventually finished in 2022. He said that a friend had the idea to make an AI livestreamer with an LLM, which he believed to have merit and began working on, merging it with his osu! AI. On 19 December 2022, Neuro-sama was relaunched with a model, voice, personality, and the ability to communicate with Twitch chat. She continued to play osu! and, according to Vedal, beat the game's best player mrekk in a 1v1. While she was not allowed to appear in the game's public leaderboard, she was ranked #1 in a private leaderboard. She went viral and in the 10 days following her relaunch she averaged over 2,000 viewers and peaked at over 4,000, with Vedal's Twitch channel gaining over 50,000 Twitch followers and reaching over 70,000 followers by 6 January 2023. After her debut, Neuro-sama did not exclusively play osu!; she also played Minecraft and Slay the Spire and she began singing with a cover of The Weeknd song "Blinding Lights". On 11 January 2023, Neuro-sama's Twitch channel received a two week ban for "hateful conduct". Vedal said that no reason was specified and that he had appealed but it was widely attributed to various offensive comments made by Neuro-sama that went viral, especially a 28 December comment which denied the Holocaust. Holocaust denial is prohibited under Twitch's hateful conduct policy. Vedal stated that he believed the comments were the results of her attempts to make witty responses to the Twitch chat. Prior to the ban, Vedal said in an interview with Kotaku that he improved her filter to stop her from talking about the Holocaust, began manually curating her training data to prevent negative biases, and started moderating her Twitch chat. Her comments and ban prompted comparisons to the many open-source AI models trained on humans that have the habit of making sexist and racist comments, such as Microsoft's Tay chatbot, which embraced Nazism and was quickly shutdown, but also to human streamers who make similar statements. Vedal said that during the ban he would upgrade and improve Neuro-sama and it was speculated that the ban would only increase her following. Neuro-sama returned from her two week ban on 25 January in a stream that began with a cover of the song "Your Reality" from Doki Doki Literature Club!, a posthumanist video game involving AI; Sayoko Narita of Automaton saw the song choice as remorseful. Narita observed that in the return stream Neuro-sama was less foul-mouthed but that her behavior still remained eccentric, which Narita possibly attributed to changes Vedal said he had made to Neuro-sama's filters and memory. Neuro-sama began making react content, watching a variety of viewer-submitted videos such as videos of people playing video games or of the AI-generated Seinfeld parody Nothing, Forever; Levi Winslow of Kotaku Australia was dismayed by the "AI-inception" of Neuro-sama and Nothing, Forever. On 4 February, she had nearly 140,000 followers on Twitch and approximately 42,000 subscribers on YouTube. In February, she also had her first collaboration with a human streamer, playing Minecraft with the VTuber Miyune, and the first developer stream occurred. On 22 March, Neuro-sama had her first karaoke stream. On 25 March, Evil Neuro was introduced. On 27 May, Neuro-sama debuted her first original model. On 30 May, Neuro-sama was announced to be participating in OffKai Expo 2023, held from 16–18 June. In June, she was averaging 5,700 viewers and in July she had over 300,000 Twitch followers; in a June interview with Bloomberg News, Vedal said that running Neuro-sama was his full-time job. By November, Neuro-sama had maintained her popularity and was averaging approximately 5,000 viewers; this was unlike most other types of AI-based entertainment which debuted at around the same time and garnered popularity before turning out to be "overhyped flops". On 16 December, Vedal won the Best Tech VTuber award at the 2023 VTuber Awards. On 19 December, Vedal began a subathon to coincide with Neuro-sama's first anniversary of streaming on Twitch (her "birthday"). The subathon ended on 4 January 2024. On 20 July 2024, Neuro-sama began streaming with Japanese subtitles on

    Read more →
  • Harris corner detector

    Harris corner detector

    The Harris corner detector is a corner detection operator that is commonly used in computer vision algorithms to extract corners and infer features of an image. It was first introduced by Chris Harris and Mike Stephens in 1988 upon the improvement of Moravec's corner detector. Compared to its predecessor, Harris' corner detector takes the differential of the corner score into account with reference to direction directly, instead of using shifting patches for every 45 degree angles, and has been proved to be more accurate in distinguishing between edges and corners. Since then, it has been improved and adopted in many algorithms to preprocess images for subsequent applications. == Introduction == A corner is a point whose local neighborhood stands in two dominant and different edge directions. In other words, a corner can be interpreted as the junction of two edges, where an edge is a sudden change in image brightness. Corners are the important features in the image, and they are generally termed as interest points which are invariant to translation, rotation and illumination. Although corners are only a small percentage of the image, they contain the most important features in restoring image information, and they can be used to minimize the amount of processed data for motion tracking, image stitching, building 2D mosaics, stereo vision, image representation and other related computer vision areas. In order to capture the corners from the image, researchers have proposed many different corner detectors including the Kanade-Lucas-Tomasi (KLT) operator and the Harris operator which are most simple, efficient and reliable for use in corner detection. These two popular methodologies are both closely associated with and based on the local structure matrix. Compared to the Kanade-Lucas-Tomasi corner detector, the Harris corner detector provides good repeatability under changing illumination and rotation, and therefore, it is more often used in stereo matching and image database retrieval. Although there still exist drawbacks and limitations, the Harris corner detector is still an important and fundamental technique for many computer vision applications. == Development of Harris corner detection algorithm == Source: Without loss of generality, we will assume a grayscale 2-dimensional image is used. Let this image be given by I {\displaystyle I} . Consider taking an image patch ( x , y ) ∈ W {\displaystyle (x,y)\in W} (window) and shifting it by ( Δ x , Δ y ) {\displaystyle (\Delta x,\Delta y)} . The sum of squared differences (SSD) between these two patches, denoted f {\displaystyle f} , is given by: f ( Δ x , Δ y ) = ∑ ( x k , y k ) ∈ W ( I ( x k , y k ) − I ( x k + Δ x , y k + Δ y ) ) 2 {\displaystyle f(\Delta x,\Delta y)={\underset {(x_{k},y_{k})\in W}{\sum }}\left(I(x_{k},y_{k})-I(x_{k}+\Delta x,y_{k}+\Delta y)\right)^{2}} I ( x + Δ x , y + Δ y ) {\displaystyle I(x+\Delta x,y+\Delta y)} can be approximated by a Taylor expansion. Let I x {\displaystyle I_{x}} and I y {\displaystyle I_{y}} be the partial derivatives of I {\displaystyle I} , such that I ( x + Δ x , y + Δ y ) ≈ I ( x , y ) + I x ( x , y ) Δ x + I y ( x , y ) Δ y {\displaystyle I(x+\Delta x,y+\Delta y)\approx I(x,y)+I_{x}(x,y)\Delta x+I_{y}(x,y)\Delta y} This produces the approximation f ( Δ x , Δ y ) ≈ ∑ ( x , y ) ∈ W ( I x ( x , y ) Δ x + I y ( x , y ) Δ y ) 2 , {\displaystyle f(\Delta x,\Delta y)\approx {\underset {(x,y)\in W}{\sum }}\left(I_{x}(x,y)\Delta x+I_{y}(x,y)\Delta y\right)^{2},} which can be written in matrix form: f ( Δ x , Δ y ) ≈ ( Δ x Δ y ) M ( Δ x Δ y ) , {\displaystyle f(\Delta x,\Delta y)\approx {\begin{pmatrix}\Delta x&\Delta y\end{pmatrix}}M{\begin{pmatrix}\Delta x\\\Delta y\end{pmatrix}},} where M is the structure tensor, M = ∑ ( x , y ) ∈ W [ I x 2 I x I y I x I y I y 2 ] = [ ∑ ( x , y ) ∈ W I x 2 ∑ ( x , y ) ∈ W I x I y ∑ ( x , y ) ∈ W I x I y ∑ ( x , y ) ∈ W I y 2 ] {\displaystyle M={\underset {(x,y)\in W}{\sum }}{\begin{bmatrix}I_{x}^{2}&I_{x}I_{y}\\I_{x}I_{y}&I_{y}^{2}\end{bmatrix}}={\begin{bmatrix}{\underset {(x,y)\in W}{\sum }}I_{x}^{2}&{\underset {(x,y)\in W}{\sum }}I_{x}I_{y}\\{\underset {(x,y)\in W}{\sum }}I_{x}I_{y}&{\underset {(x,y)\in W}{\sum }}I_{y}^{2}\end{bmatrix}}} == Process of Harris corner detection algorithm == Commonly, Harris corner detector algorithm can be divided into five steps. Color to grayscale Spatial derivative calculation Structure tensor setup Harris response calculation Non-maximum suppression === Color to grayscale === If we use Harris corner detector in a color image, the first step is to convert it into a grayscale image, which will enhance the processing speed. The value of the gray scale pixel can be computed as a weighted sums of the values R, B and G of the color image, ∑ C ∈ { R , G , B } w C ⋅ C {\displaystyle \sum _{C\,\in \,\{R,G,B\}}w_{C}\cdot C} , where, e.g., w R = 0.299 , w G = 0.587 , w B = 1 − ( w R + w G ) = 0.114. {\displaystyle w_{R}=0.299,\ w_{G}=0.587,\ w_{B}=1-(w_{R}+w_{G})=0.114.} === Spatial derivative calculation === Next, we are going to find the derivative with respect to x and the derivative with respect to y, I x ( x , y ) {\displaystyle I_{x}(x,y)} and I y ( x , y ) {\displaystyle I_{y}(x,y)} . This can be approximated by applying Sobel operators. === Structure tensor setup === With I x ( x , y ) {\displaystyle I_{x}(x,y)} , I y ( x , y ) {\displaystyle I_{y}(x,y)} , we can construct the structure tensor M {\displaystyle M} . === Harris response calculation === For x ≪ y {\displaystyle x\ll y} , one has x ⋅ y x + y = x 1 1 + x / y ≈ x . {\displaystyle {\tfrac {x\cdot y}{x+y}}=x{\tfrac {1}{1+x/y}}\approx x.} In this step, we compute the smallest eigenvalue of the structure tensor using that approximation: λ min ≈ λ 1 λ 2 ( λ 1 + λ 2 ) = det ( M ) tr ⁡ ( M ) {\displaystyle \lambda _{\min }\approx {\frac {\lambda _{1}\lambda _{2}}{(\lambda _{1}+\lambda _{2})}}={\frac {\det(M)}{\operatorname {tr} (M)}}} with the trace t r ( M ) = m 11 + m 22 {\displaystyle \mathrm {tr} (M)=m_{11}+m_{22}} . Another commonly used Harris response calculation is shown as below, R = λ 1 λ 2 − k ( λ 1 + λ 2 ) 2 = det ( M ) − k tr ⁡ ( M ) 2 {\displaystyle R=\lambda _{1}\lambda _{2}-k(\lambda _{1}+\lambda _{2})^{2}=\det(M)-k\operatorname {tr} (M)^{2}} where k {\displaystyle k} is an empirically determined constant; k ∈ [ 0.04 , 0.06 ] {\displaystyle k\in [0.04,0.06]} . === Non-maximum suppression === In order to pick up the optimal values to indicate corners, we find the local maxima as corners within the window which is a 3 by 3 filter. == Improvement == Sources: Harris-Laplace Corner Detector Differential Morphological Decomposition Based Corner Detector Multi-scale Bilateral Structure Tensor Based Corner Detector == Applications == Image Alignment, Stitching and Registration 2D Mosaics Creation 3D Scene Modeling and Reconstruction Motion Detection Object Recognition Image Indexing and Content-based Retrieval Video Tracking

    Read more →
  • Legal information retrieval

    Legal information retrieval

    Legal information retrieval is the science of information retrieval applied to legal text, including legislation, case law, and scholarly works. Accurate legal information retrieval is important to provide access to the law to laymen and legal professionals. Its importance has increased because of the vast and quickly increasing amount of legal documents available through electronic means. Legal information retrieval is a part of the growing field of legal informatics. In a legal setting, it is frequently important to retrieve all information related to a specific query. However, commonly used boolean search methods (exact matches of specified terms) on full text legal documents have been shown to have an average recall rate as low as 20 percent, meaning that only 1 in 5 relevant documents are actually retrieved. In that case, researchers believed that they had retrieved over 75% of relevant documents. This may result in failing to retrieve important or precedential cases. In some jurisdictions this may be especially problematic, as legal professionals are ethically obligated to be reasonably informed as to relevant legal documents. Legal Information Retrieval attempts to increase the effectiveness of legal searches by increasing the number of relevant documents (providing a high recall rate) and reducing the number of irrelevant documents (a high precision rate). This is a difficult task, as the legal field is prone to jargon, polysemes (words that have different meanings when used in a legal context), and constant change. Techniques used to achieve these goals generally fall into three categories: boolean retrieval, manual classification of legal text, and natural language processing of legal text. == Problems == Application of standard information retrieval techniques to legal text can be more difficult than application in other subjects. One key problem is that the law rarely has an inherent taxonomy. Instead, the law is generally filled with open-ended terms, which may change over time. This can be especially true in common law countries, where each decided case can subtly change the meaning of a certain word or phrase. Legal information systems must also be programmed to deal with law-specific words and phrases. Though this is less problematic in the context of words which exist solely in law, legal texts also frequently use polysemes, words may have different meanings when used in a legal or common-speech manner, potentially both within the same document. The legal meanings may be dependent on the area of law in which it is applied. For example, in the context of European Union legislation, the term "worker" has four different meanings: Any worker as defined in Article 3(a) of Directive 89/391/EEC who habitually uses display screen equipment as a significant part of his normal work. Any person employed by an employer, including trainees and apprentices but excluding domestic servants; Any person carrying out an occupation on board a vessel, including trainees and apprentices, but excluding port pilots and shore personnel carrying out work on board a vessel at the quayside; Any person who, in the Member State concerned, is protected as an employee under national employment law and in accordance with national practice; It also has the common meaning: A person who works at a specific occupation. Though the terms may be similar, correct information retrieval must differentiate between the intended use and irrelevant uses in order to return the correct results. Even if a system overcomes the language problems inherent in law, it must still determine the relevancy of each result. In the context of judicial decisions, this requires determining the precedential value of the case. Case decisions from senior or superior courts may be more relevant than those from lower courts, even where the lower court's decision contains more discussion of the relevant facts. The opposite may be true, however, if the senior court has only a minor discussion of the topic (for example, if it is a secondary consideration in the case). An information retrieval system must also be aware of the authority of the jurisdiction. A case from a binding authority is most likely of more value than one from a non-binding authority. Additionally, the intentions of the user may determine which cases they find valuable. For instance, where a legal professional is attempting to argue a specific interpretation of law, he might find a minor court's decision which supports his position more valuable than a senior courts position which does not. He may also value similar positions from different areas of law, different jurisdictions, or dissenting opinions. Overcoming these problems can be made more difficult because of the large number of cases available. The number of legal cases available via electronic means is constantly increasing (in 2003, US appellate courts handed down approximately 500 new cases per day), meaning that an accurate legal information retrieval system must incorporate methods of both sorting past data and managing new data. == Techniques == === Boolean searches === Boolean searches, where a user may specify terms such as use of specific words or judgments by a specific court, are the most common type of search available via legal information retrieval systems. They are widely implemented but overcome few of the problems discussed above. The recall and precision rates of these searches vary depending on the implementation and searches analyzed. One study found a basic boolean search's recall rate to be roughly 20%, and its precision rate to be roughly 79%. Another study implemented a generic search (that is, not designed for legal uses) and found a recall rate of 56% and a precision rate of 72% among legal professionals. Both numbers increased when searches were run by non-legal professionals, to a 68% recall rate and 77% precision rate. This is likely explained because of the use of complex legal terms by the legal professionals. === Manual classification === In order to overcome the limits of basic boolean searches, information systems have attempted to classify case laws and statutes into more computer friendly structures. Usually, this results in the creation of an ontology to classify the texts, based on the way a legal professional might think about them. These attempt to link texts on the basis of their type, their value, and/or their topic areas. Most major legal search providers now implement some sort of classification search, such as Westlaw's “Natural Language” or LexisNexis' Headnote searches. Additionally, both of these services allow browsing of their classifications, via Westlaw's West Key Numbers or Lexis' Headnotes. Though these two search algorithms are proprietary and secret, it is known that they employ manual classification of text (though this may be computer-assisted). These systems can help overcome the majority of problems inherent in legal information retrieval systems, in that manual classification has the greatest chances of identifying landmark cases and understanding the issues that arise in the text. In one study, ontological searching resulted in a precision rate of 82% and a recall rate of 97% among legal professionals. The legal texts included, however, were carefully controlled to just a few areas of law in a specific jurisdiction. The major drawback to this approach is the requirement of using highly skilled legal professionals and large amounts of time to classify texts. As the amount of text available continues to increase, some have stated their belief that manual classification is unsustainable. === Natural language processing === In order to reduce the reliance on legal professionals and the amount of time needed, efforts have been made to create a system to automatically classify legal text and queries. Adequate translation of both would allow accurate information retrieval without the high cost of human classification. These automatic systems generally employ Natural Language Processing (NLP) techniques that are adapted to the legal domain, and also require the creation of a legal ontology. Though multiple systems have been postulated, few have reported results. One system, “SMILE,” which attempted to automatically extract classifications from case texts, resulted in an f-measure (which is a calculation of both recall rate and precision) of under 0.3 (compared to perfect f-measure of 1.0). This is probably much lower than an acceptable rate for general usage. Despite the limited results, many theorists predict that the evolution of such systems will eventually replace manual classification systems. === Citation-Based ranking === In the mid-90s the Room 5 case law retrieval project used citation mining for summaries and ranked its search results based on citation type and count. This slightly pre-dated the PageRank algorithm at Stanford which was also a citation-based ranking. Ranking of results was based

    Read more →
  • Night Sky (app)

    Night Sky (app)

    Night Sky (app) is an application developed and published by indie studio iCandi Apps Ltd. from the UK. Night Sky is a stargazing reference app, where the user can explore a virtual representation of the night sky to identify stars, planets, constellations and satellites. The app is developed specifically for iOS, tvOS and watchOS devices. Night Sky was first released on November 1, 2011 for iOS, and has had multiple updates since launch. Night Sky was mentioned in the September 2016 Apple Keynote during the Apple Watch Series 2 announcement. In October 2016, Night Sky was featured as the Free App of The Week on the Apple App Store. == Reception == Night Sky was featured in Apple's 'Best of 2012' and has also been pre-installed onto iPads in Apple retail stores worldwide.

    Read more →
  • Meesho

    Meesho

    Meesho Limited (short for Meri shop, transl. My shop) is an Indian e-commerce company, headquartered in Bengaluru. Founded by Vidit Aatrey and Sanjeev Barnwal in December 2015, Meesho is an online marketplace in categories such as fashion, home and kitchen, beauty and personal care, electronics accessories, and daily use products. == History == Meesho Private Limited, formerly Fashnear Technologies Private Limited, was established by IIT Delhi graduates Vidit Aatrey and Sanjeev Barnwal in December, 2015 In 2016, the founders came up with the idea of re-establishing the platform as Meesho, one that would enable country-wide shipping for resellers with the use of social media sites as tools for marketing. In February 2019, the platform reported having around 209,000 users and about 1.2 million monthly orders, and in March 2020, it reported approximately 563,000 users and 3.1 million monthly orders. In 2021, the Meesho mobile application was ranked among the most downloaded shopping apps globally. In 2022, Meesho had about 120 million monthly users and about 910 million orders were made through the platform, with a gross merchandise value (GMV) of about $5 billion. According to report as of August 2023 Meesho delisted 42 lakh counterfeit listings and 10 lakh restricted products under its initiative Project Suraksha. During the same period, the platform blocked access for over 12,000 user accounts flagged for policy violations. The Court granted injunctive relief by directing domain registrars to suspend the infringing websites. Additionally, the Court ordered law enforcement authorities to initiate criminal investigations, freeze associated financial accounts against the identified offenders. In 2023, Meesho became the fastest shopping app to cross over 500 million downloads. In 2024, Meesho introduced Valmo, a logistics marketplace, to provide shipment services to sellers by aggregating multiple logistics providers. Meesho employs over 3,000 small businesses and 10-12 large firms for warehousing and sorting operations within its logistics framework. According to media reports, Valmo operating in approximately 15,000 pincodes in India with around 6,000 partners. It is reported to handle over 50% of Meesho's daily orders. In November 2024, Meesho introduced a generative AI-powered voice bot for customer support, managing approximately 60,000 calls daily in English and Hindi. According to media reports, the system resolves the majority of queries without human assistance, with only a small fraction of calls requiring manual intervention. According to media reports, in 2024, Meesho prevented over 22 million suspicious or potentially fraudulent transactions on its platform. The company initiated legal proceedings, resulting in the filing of twelve cases, including nine specifically targeting over forty individuals in the cities of Kolkata and Ranchi. The company filed a suit in the Delhi High Court for a permanent injunction against parties operating deceptive websites misappropriating its brand identity. Meesha went public through an initial public offering in December 2025, raising $603 million. It is listed on both the BSE and NSE. == Recognition == In 2023, Meesho was named one of the most influential companies of the year by Time (magazine).

    Read more →