AI Code Tester

AI Code Tester — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Application software

    Application software

    Application software is software that is intended for end-user use – not operating, administering or programming a computer. It includes programs such as word processors, web browsers, media players, and mobile applications used in daily tasks. An application (app, application program, software application) is any program that can be categorized as application software. Application is a subjective classification that is often used to differentiate from system and utility software. Application software represents the user-facing layer of computing systems, designed to translate complex system capabilities into task-oriented, goal-driven workflows. Unlike system software, which focuses on hardware orchestration and resource management, application software is centered on problem abstraction, user interaction, and domain-specific functionality. The abbreviation app became popular with the 2008 introduction of the iOS App Store, to refer to applications for mobile devices such as smartphones and tablets. Later, with the release of the Mac App Store in 2010 and the Windows Store in 2011, it began to be used to refer to end-user software in general, regardless of platform. Applications may be bundled with the computer and its system software or published separately. Applications may be proprietary or open-source. == Terminology == === Meaning program and software === When used as an adjective, application can have a broader meaning than that described in this article. For example, concepts such as application programming interface (API), application server, application virtualization, application lifecycle management and portable application refer to programs and software in general. === Distinction between system and application software === The distinction between system and application software is subjective and has been the subject of controversy. For example, one of the key questions in the United States v. Microsoft Corp. antitrust trial was whether Microsoft's Internet Explorer web browser was part of its Windows operating system or a separate piece of application software. As another example, the GNU/Linux naming controversy is, in part, due to disagreement about the relationship between the Linux kernel and the operating systems built over this kernel. In some types of embedded systems, the application software and the operating system software may be indistinguishable by the user, as in the case of software used to control a VCR, DVD player, or microwave oven. The above definitions may exclude some applications that may exist on some computers in large organizations. For an alternative definition of an app: see Application Portfolio Management. === Killer application === A killer application (killer app, coined in the late 1980s) is an application that is so popular that it causes demand for its host platform to increase. For example, VisiCalc was the first modern spreadsheet software for the Apple II and helped sell the then-new personal computers into offices. For the BlackBerry, it was its email software. === Software suite === As software suite consists of multiple applications bundled together. They usually have related functions, features, and user interfaces, and may be able to interact with each other, e.g. open each other's files. Business applications often come in suites, e.g. Microsoft Office, LibreOffice and iWork, which bundle together a word processor, a spreadsheet, etc.; but suites exist for other purposes, e.g. graphics or music. == Ways to classify == As there so many applications and since their attributes vary so dramatically, there are many different ways to classify them. === By legal aspects === Proprietary software is protected under an exclusive copyright, and a software license grants limited usage rights. Such applications may allow add-ons from third parties. Free and open-source software (FOSS) can be run, distributed, sold, and extended for any purpose. FOSS software released under a free license may be perpetual and also royalty-free. Perhaps, the owner, the holder or third-party enforcer of any right (copyright, trademark, patent, or ius in re aliena) are entitled to add exceptions, limitations, time decays or expiring dates to the license terms of use. Public-domain software is a type of FOSS that is royalty-free and can be run, distributed, modified, reversed, republished, or created in derivative works without any copyright attribution and therefore revocation. It can even be sold, but without transferring the public domain property to other single subjects. Public-domain software can be released under a (un)licensing legal statement, which enforces those terms and conditions for an indefinite duration (for a lifetime, or forever). === By platform === An application can be categorized by the host platform on which it runs. Notable platforms include operating system (native), web browser, cloud computing and mobile. For example a web application runs in a web browser whereas a more traditional, native application runs in the environment of a computer's operating system. There has been a contentious debate regarding web applications replacing native applications for many purposes, especially on mobile devices such as smartphones and tablets. Web apps have indeed greatly increased in popularity for some uses, but the advantages of applications make them unlikely to disappear soon, if ever. Furthermore, the two can be complementary, and even integrated. === Horizontal vs. vertical === Application software can be seen as either horizontal or vertical. Horizontal applications are more popular and widespread, because they are general purpose, for example word processors or databases. Vertical applications are niche products, designed for a particular type of industry or business, or department within an organization. Integrated suites of software will try to handle every specific aspect possible of, for example, manufacturing or banking worker, accounting, or customer service. === By purpose === There are many types of application software: Enterprise Addresses the needs of an entire organization's processes and data flows, across several departments, often in a large distributed environment. Examples include enterprise resource planning systems, customer relationship management (CRM) systems, data replication engines, and supply chain management software. Departmental Software is a sub-type of enterprise software with a focus on smaller organizations or groups within a large organization. (Examples include travel expense management and IT Helpdesk.) Enterprise infrastructure Provides common capabilities needed to support enterprise software systems. (Examples include databases, email servers, and systems for managing networks and security.) Application platform as a service (aPaaS) A cloud computing service that offers development and deployment environments for application services. Knowledge worker Lets users create and manage information, often for and individual media editors may aid in multiple information worker tasks. Content access Used primarily to access content without editing, but may include software that allows for content editing. Such software addresses the needs of individuals and groups to consume digital entertainment and published digital content. (Examples include media players, web browsers, and help browsers.) Educational Related to content access software, but has the content or features adapted for use by educators or students. For example, it may deliver evaluations (tests), track progress through material, or include collaborative capabilities. Simulation Simulates physical or abstract systems for either research, training, or entertainment purposes. Media development Generates print and electronic media for others to consume, most often in a commercial or educational setting. This includes graphic-art software, desktop publishing software, multimedia development software, HTML editors, digital-animation editors, digital audio and video composition, and many others. Engineering Used in developing hardware and software products. This includes computer-aided design (CAD), computer-aided engineering (CAE), computer language editing and compiling tools, integrated development environments, and application programmer interfaces. Entertainment Refers to video games, screen savers, programs to display motion pictures or play recorded music, and other forms of entertainment which can be experienced through the use of a computing device. == Taxonomy == This section is a taxonomy of kinds of applications. This organization is but one of many different ways to organize them. A kind is included in only one category even if it logically fits in multiple. === General-purpose === Calculator Spreadsheet Web browser Web mapping E-commerce Social media === Communication === Chat Email Presentation software Phone Messages Networking software Web conferencing === Documentation === Desktop

    Read more →
  • Spleak

    Spleak

    Spleak was an IM platform where users could publish and rate content. It existed in the form of six bots covering as many subject areas: CelebSpleak, SportSpleak, VoteSpleak, TVSpleak, GameSpleak, and StyleSpleak. == Overview == Users can add a "multi-Spleak" (which contains all of the different Spleak bots in one) or add the separate bots to their IM buddy lists on MSN and AIM. Users are also allowed access to Spleak online by using a CelebSpleak, SportSpleak, or VoteSpleak widget, or through the CelebSpleak and SportSpleak applications with Facebook. Spleak was an alternate reality game and is moving to its own company, Spleak Media Network. "Celebrate Spleak" was introduced throughout 2007, launched in 2008, and was forced to retire in 2009. == Key people == Spleak was co-founded by Morten Lund and Nicolaj Reffstrup. The company's chief executive officer is Morrie Eisenburg; Josh Scott is Vice President in Product and Tyler Wells is Vice President in Engineering.

    Read more →
  • Point distribution model

    Point distribution model

    The point distribution model is a model for representing the mean geometry of a shape and some statistical modes of geometric variation inferred from a training set of shapes. == Background == The point distribution model concept has been developed by Cootes, Taylor et al. and became a standard in computer vision for the statistical study of shape and for segmentation of medical images where shape priors really help interpretation of noisy and low-contrasted pixels/voxels. The latter point leads to active shape models (ASM) and active appearance models (AAM). Point distribution models rely on landmark points. A landmark is an annotating point posed by an anatomist onto a given locus for every shape instance across the training set population. For instance, the same landmark will designate the tip of the index finger in a training set of 2D hands outlines. Principal component analysis (PCA), for instance, is a relevant tool for studying correlations of movement between groups of landmarks among the training set population. Typically, it might detect that all the landmarks located along the same finger move exactly together across the training set examples showing different finger spacing for a flat-posed hands collection. == Details == First, a set of training images are manually landmarked with enough corresponding landmarks to sufficiently approximate the geometry of the original shapes. These landmarks are aligned using the generalized procrustes analysis, which minimizes the least squared error between the points. k {\displaystyle k} aligned landmarks in two dimensions are given as X = ( x 1 , y 1 , … , x k , y k ) {\displaystyle \mathbf {X} =(x_{1},y_{1},\ldots ,x_{k},y_{k})} . It's important to note that each landmark i ∈ { 1 , … k } {\displaystyle i\in \lbrace 1,\ldots k\rbrace } should represent the same anatomical location. For example, landmark #3, ( x 3 , y 3 ) {\displaystyle (x_{3},y_{3})} might represent the tip of the ring finger across all training images. Now the shape outlines are reduced to sequences of k {\displaystyle k} landmarks, so that a given training shape is defined as the vector X ∈ R 2 k {\displaystyle \mathbf {X} \in \mathbb {R} ^{2k}} . Assuming the scattering is gaussian in this space, PCA is used to compute normalized eigenvectors and eigenvalues of the covariance matrix across all training shapes. The matrix of the top d {\displaystyle d} eigenvectors is given as P ∈ R 2 k × d {\displaystyle \mathbf {P} \in \mathbb {R} ^{2k\times d}} , and each eigenvector describes a principal mode of variation along the set. Finally, a linear combination of the eigenvectors is used to define a new shape X ′ {\displaystyle \mathbf {X} '} , mathematically defined as: X ′ = X ¯ + P b {\displaystyle \mathbf {X} '={\overline {\mathbf {X} }}+\mathbf {P} \mathbf {b} } where X ¯ {\displaystyle {\overline {\mathbf {X} }}} is defined as the mean shape across all training images, and b {\displaystyle \mathbf {b} } is a vector of scaling values for each principal component. Therefore, by modifying the variable b {\displaystyle \mathbf {b} } an infinite number of shapes can be defined. To ensure that the new shapes are all within the variation seen in the training set, it is common to only allow each element of b {\displaystyle \mathbf {b} } to be within ± {\displaystyle \pm } 3 standard deviations, where the standard deviation of a given principal component is defined as the square root of its corresponding eigenvalue. PDM's can be extended to any arbitrary number of dimensions, but are typically used in 2D image and 3D volume applications (where each landmark point is R 2 {\displaystyle \mathbb {R} ^{2}} or R 3 {\displaystyle \mathbb {R} ^{3}} ). == Discussion == An eigenvector, interpreted in euclidean space, can be seen as a sequence of k {\displaystyle k} euclidean vectors associated to corresponding landmark and designating a compound move for the whole shape. Global nonlinear variation is usually well handled provided nonlinear variation is kept to a reasonable level. Typically, a twisting nematode worm is used as an example in the teaching of kernel PCA-based methods. Due to the PCA properties: eigenvectors are mutually orthogonal, form a basis of the training set cloud in the shape space, and cross at the 0 in this space, which represents the mean shape. Also, PCA is a traditional way of fitting a closed ellipsoid to a Gaussian cloud of points (whatever their dimension): this suggests the concept of bounded variation. The idea behind PDMs is that eigenvectors can be linearly combined to create an infinity of new shape instances that will 'look like' the one in the training set. The coefficients are bounded alike the values of the corresponding eigenvalues, so as to ensure the generated 2n/3n-dimensional dot will remain into the hyper-ellipsoidal allowed domain—allowable shape domain (ASD).

    Read more →
  • LMArena

    LMArena

    Arena (formerly LMArena and Chatbot Arena) is a public, web-based platform that evaluates large language models (LLMs). Users enter prompts for two anonymous models to respond to and vote on the model that gave the better response, after which the models' identities are revealed. Users can also choose models to test themselves via the "Direct" selection. Companies which have supplied the company with their large language models include OpenAI, Google DeepMind, and Anthropic. The website has been used for preview releases of upcoming models. Chinese company DeepSeek tested its prototype models in the Arena months before its R1 model gained attention in Western media. Other notable pre-release models include OpenAI's GPT-5 under the codename "summit" and Google DeepMind's Gemini 2.5 Flash Image (an image-generation and editing model) under the codename "Nano Banana". Research has identified specific limitations in Arena's methodology. == History == Chatbot Arena was released on April 24, 2023. In June 2024, Chatbot Arena added image support. In September 2024, Chatbot Arena moved to its own dedicated domain name, lmarena.ai (or LMArena). In April 2025, Meta released Llama 4. Llama 4 Maverick beat GPT-4o and Gemini 2.0 Flash on LMArena, but the version of Maverick on LMArena unfairly differed from the publicly available version. LMArena updated their policies in response. In April 2025, LMArena incorporated as an independent company. That May, LMArena raised $100 million in a seed funding round, valuing the company at $600 million. Participants in the seed funding round included Andreessen Horowitz, UC Investments, Lightspeed Venture Partners, Felicis Ventures, and Kleiner Perkins. On January 6, 2026, LMArena announced the closing of a $150 million Series A funding round, bringing the company’s post-money valuation to approximately $1.7 billion. The round was led by Felicis and UC Investments (University of California), with participation from Andreessen Horowitz, The House Fund, LDVP, Kleiner Perkins, Lightspeed Venture Partners, and Laude Ventures. In January 2026, LMArena added video support. On January 28, 2026, LMArena rebranded to "Arena".

    Read more →
  • Automation

    Automation

    Automation describes a wide range of technologies that reduce human intervention in processes, mainly by predetermining decision criteria, subprocess relationships, and related actions, as well as embodying those predeterminations in machines. Automation has been achieved by various means including mechanical, hydraulic, pneumatic, electrical, electronic devices, and computers, usually in combination. Complicated systems, such as modern factories, airplanes, and ships typically use combinations of all of these techniques. The benefits of automation includes labor savings, reducing waste, savings in electricity costs, savings in material costs, and improvements to quality, accuracy, and precision. Automation includes the use of various equipment and control systems such as machinery, processes in factories, boilers, and heat-treating ovens, switching on telephone networks, steering, stabilization of ships, aircraft and other applications and vehicles with reduced human intervention. Examples range from a household thermostat controlling a boiler to a large industrial control system with tens of thousands of input measurements and output control signals. In the simplest type of an automatic control loop, a controller compares a measured value of a process with a desired set value and processes the resulting error signal to change some input to the process, in such a way that the process stays at its set point despite disturbances. This closed-loop control is an application of negative feedback to a system. The mathematical basis of control theory began in the 18th century and advanced rapidly in the 20th. The term automation, inspired by the earlier word automatic (coming from automaton), was not widely used before 1947, when Ford established an automation department. It was during this time that the industry was rapidly adopting feedback controllers, Technological advancements introduced in the 1930s revolutionized various industries significantly. The World Bank's World Development Report of 2019 shows evidence that the new industries and jobs in the technology sector outweigh the economic effects of workers being displaced by automation. Job losses and downward mobility blamed on automation have been cited as one of many factors in the resurgence of nationalist, protectionist and populist politics in the US, UK and France, among other countries since the 2010s. == History == === Early history === It was a preoccupation of the Greeks and Arabs (in the period between about 300 BC and about 1200 AD) to keep an accurate track of time. In Ptolemaic Egypt, about 270 BC, Ctesibius described a float regulator for a water clock, a device not unlike the ball and cock in a modern flush toilet. This was the earliest feedback-controlled mechanism. The appearance of the mechanical clock in the 14th century made the water clock and its feedback control system obsolete. The Persian Banū Mūsā brothers, in their Book of Ingenious Devices (850 AD), described a number of automatic controls. Two-step level controls for fluids, a form of discontinuous variable structure controls, were developed by the Banu Musa brothers. They also described a feedback controller. The design of feedback control systems up through the Industrial Revolution was by trial-and-error, together with a great deal of engineering intuition. It was not until the mid-19th century that the stability of feedback control systems was analyzed using mathematics, the formal language of automatic control theory. The centrifugal governor was invented by Christiaan Huygens in the seventeenth century, and used to adjust the gap between millstones. === Industrial Revolution in Western Europe === The introduction of prime movers, or self-driven machines advanced grain mills, furnaces, boilers, and the steam engine created a new requirement for automatic control systems including temperature regulators (invented in 1624; see Cornelius Drebbel), pressure regulators (1681), float regulators (1700) and speed control devices. Another control mechanism was used to tent the sails of windmills. It was patented by Edmund Lee in 1745. Also in 1745, Jacques de Vaucanson invented the first automated loom. Around 1800, Joseph Marie Jacquard created a punch-card system to program looms. In 1771 Richard Arkwright invented the first fully automated spinning mill driven by water power, known at the time as the water frame. An automatic flour mill was developed by Oliver Evans in 1785, making it the first completely automated industrial process. A centrifugal governor was used by Mr. Bunce of England in 1784 as part of a model steam crane. The centrifugal governor was adopted by James Watt for use on a steam engine in 1788 after Watt's partner Boulton saw one at a flour mill Boulton & Watt were building. The governor could not actually hold a set speed; the engine would assume a new constant speed in response to load changes. The governor was able to handle smaller variations such as those caused by fluctuating heat load to the boiler. Also, there was a tendency for oscillation whenever there was a speed change. As a consequence, engines equipped with this governor were not suitable for operations requiring constant speed, such as cotton spinning. Several improvements to the governor, plus improvements to valve cut-off timing on the steam engine, made the engine suitable for most industrial uses before the end of the 19th century. Advances in the steam engine stayed well ahead of science, both thermodynamics and control theory. The governor received relatively little scientific attention until James Clerk Maxwell published a paper that established the beginning of a theoretical basis for understanding control theory. === 20th century === Relay logic was introduced with factory electrification, which underwent rapid adaptation from 1900 through the 1920s. Central electric power stations were also undergoing rapid growth and the operation of new high-pressure boilers, steam turbines and electrical substations created a great demand for instruments and controls. Central control rooms became common in the 1920s, but as late as the early 1930s, most process controls were on-off. Operators typically monitored charts drawn by recorders that plotted data from instruments. To make corrections, operators manually opened or closed valves or turned switches on or off. Control rooms also used color-coded lights to send signals to workers in the plant to manually make certain changes. The development of the electronic amplifier during the 1920s, which was important for long-distance telephony, required a higher signal-to-noise ratio, which was solved by negative feedback noise cancellation. This and other telephony applications contributed to the control theory. In the 1940s and 1950s, German mathematician Irmgard Flügge-Lotz developed the theory of discontinuous automatic controls, which found military applications during the Second World War to fire control systems and aircraft navigation systems. Controllers, which were able to make calculated changes in response to deviations from a set point rather than on-off control, began being introduced in the 1930s. Controllers allowed manufacturing to continue showing productivity gains to offset the declining influence of factory electrification. Factory productivity was greatly increased by electrification in the 1920s. U.S. manufacturing productivity growth fell from 5.2%/yr 1919–29 to 2.76%/yr 1929–41. Alexander Field notes that spending on non-medical instruments increased significantly from 1929 to 1933 and remained strong thereafter. The First and Second World Wars saw major advancements in the field of mass communication and signal processing. Other key advances in automatic controls include differential equations, stability theory and system theory (1938), frequency domain analysis (1940), ship control (1950), and stochastic analysis (1941). Starting in 1958, various systems based on solid-state digital logic modules for hard-wired programmed logic controllers (the predecessors of programmable logic controllers [PLC]) emerged to replace electro-mechanical relay logic in industrial control systems for process control and automation, including early Telefunken/AEG Logistat, Siemens Simatic, Philips/Mullard/Valvo Norbit, BBC Sigmatronic, ACEC Logacec, Akkord Estacord, Krone Mibakron, Bistat, Datapac, Norlog, SSR, or Procontic systems. In 1959 Texaco's Port Arthur Refinery became the first chemical plant to use digital control. Conversion of factories to digital control began to spread rapidly in the 1970s as the price of computer hardware fell. === Significant applications === The automatic telephone switchboard was introduced in 1892 along with dial telephones. By 1929, 31.9% of the Bell system was automatic. Automatic telephone switching originally used vacuum tube amplifiers and electro-mechanical switches, which consumed a large amount of electricity. Call volume eve

    Read more →
  • Cloem

    Cloem

    Cloem is a company based in Cannes, France, which applies natural language processing (NLP) technologies to assist patent applicants in creating variants of patent claims, called "cloems". According to the company, these "computer-generated claims can be published to keep potential competitors from attempting to file adjacent patent claims." == Technology == According to Cloem, dictionaries, ontologies and proprietary claim-drafting algorithms are used to draft alternative claims based on a client's original set of claims. In particular, the original set of claims is subject to various permutations and linguistic manipulations "by considering alternative definitions for terms as well as “synonyms, hyponyms, hyperonyms, meronyms, holonyms, and antonyms.”" == Possible uses == Cloem can optionally publish one or more created texts, as electronic publications or as paper-printed publications. These can potentially serve – through a defensive publication – as prior art to prevent another party for obtaining a patent on the subject-matter at stake. In other words, after an initial patent filing, an "improvement" patent (adjacent invention) can be applied for by another party, such as a competitor. By publishing variants of a patent claim, the risk of adverse patenting may potentially be decreased (improvement inventions may no longer be patentable). Cloems may also be potentially patentable. One of the issues of patentability, however, is that only a natural person can be a listed as an inventor on a patent. Since cloems are produced by a computer based on a person's input, it is not clear if the computer or the person is the inventor. The inventorship of Cloem texts is an open question.

    Read more →
  • Photo-consistency

    Photo-consistency

    In computer vision, photo-consistency determines whether a given voxel is occupied. A voxel is considered to be photo consistent when its color appears to be similar to all the cameras that can see it. Most voxel coloring or space carving techniques require using photo consistency as a check condition in Image-based modeling and rendering applications. == Usage == 3D Volumetric Reconstruction. Image registration. Multi-view reconstruction.

    Read more →
  • Information extraction

    Information extraction

    Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. Typically, this involves processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing like automatic annotation and content extraction out of images/audio/video/documents could be seen as information extraction. Recent advances in NLP techniques have allowed for significantly improved performance compared to previous years. An example is the extraction from newswire reports of corporate mergers, such as denoted by the formal relation: MergerBetween ⁡ ( c o m p a n y 1 , c o m p a n y 2 , d a t e ) {\displaystyle \operatorname {MergerBetween} (\mathrm {company} _{1},\mathrm {company} _{2},\mathrm {date} )} , from an online news sentence such as: "Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp." A broad goal of IE is to allow computation to be done on the previously unstructured data. A more specific goal is to allow automated reasoning about the logical form of the input data. Structured data is semantically well-defined data from a chosen target domain, interpreted with respect to category and context. Information extraction is the part of a greater puzzle which deals with the problem of devising automatic methods for text management, beyond its transmission, storage and display. The discipline of information retrieval (IR) has developed automatic methods, typically of a statistical flavor, for indexing large document collections and classifying documents. Another complementary approach is that of natural language processing (NLP) which has solved the problem of modelling human language processing with considerable success when taking into account the magnitude of the task. In terms of both difficulty and emphasis, IE deals with tasks in between both IR and NLP. In terms of input, IE assumes the existence of a set of documents in which each document follows a template, i.e. describes one or more entities or events in a manner that is similar to those in other documents but differing in the details. An example, consider a group of newswire articles on Latin American terrorism with each article presumed to be based upon one or more terroristic acts. We also define for any given IE task a template, which is a(or a set of) case frame(s) to hold the information contained in a single document. For the terrorism example, a template would have slots corresponding to the perpetrator, victim, and weapon of the terroristic act, and the date on which the event happened. An IE system for this problem is required to "understand" an attack article only enough to find data corresponding to the slots in this template. == History == Information extraction dates back to the late 1970s in the early days of NLP. An early commercial system from the mid-1980s was JASPER built for Reuters by the Carnegie Group Inc with the aim of providing real-time financial news to financial traders. Beginning in 1987, IE was spurred by a series of Message Understanding Conferences. MUC is a competition-based conference that focused on the following domains: MUC-1 (1987), MUC-3 (1989): Naval operations messages. MUC-3 (1991), MUC-4 (1992): Terrorism in Latin American countries. MUC-5 (1993): Joint ventures and microelectronics domain. MUC-6 (1995): News articles on management changes. MUC-7 (1998): Satellite launch reports. Considerable support came from the U.S. Defense Advanced Research Projects Agency (DARPA), who wished to automate mundane tasks performed by government analysts, such as scanning newspapers for possible links to terrorism. == Present significance == The present significance of IE pertains to the growing amount of information available in unstructured form. Tim Berners-Lee, inventor of the World Wide Web, refers to the existing Internet as the web of documents and advocates that more of the content be made available as a web of data. Until this transpires, the web largely consists of unstructured documents lacking semantic metadata. Knowledge contained within these documents can be made more accessible for machine processing by means of transformation into relational form, or by marking-up with XML tags. An intelligent agent monitoring a news data feed requires IE to transform unstructured data into something that can be reasoned with. A typical application of IE is to scan a set of documents written in a natural language and populate a database with the information extracted. == Tasks and subtasks == Applying information extraction to text is linked to the problem of text simplification in order to create a structured view of the information present in free text. The overall goal being to create a more easily machine-readable text to process the sentences. Typical IE tasks and subtasks include: Template filling: Extracting a fixed set of fields from a document, e.g. extract perpetrators, victims, time, etc. from a newspaper article about a terrorist attack. Event extraction: Given an input document, output zero or more event templates. For instance, a newspaper article might describe multiple terrorist attacks. Knowledge Base Population: Fill a database of facts given a set of documents. Typically the database is in the form of triplets, (entity 1, relation, entity 2), e.g. (Barack Obama, Spouse, Michelle Obama) Named entity recognition: recognition of known entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions, by employing existing knowledge of the domain or information extracted from other sentences. Typically the recognition task involves assigning a unique identifier to the extracted entity. A simpler task is named entity detection, which aims at detecting entities without having any existing knowledge about the entity instances. For example, in processing the sentence "M. Smith likes fishing", named entity detection would denote detecting that the phrase "M. Smith" does refer to a person, but without necessarily having (or using) any knowledge about a certain M. Smith who is (or, "might be") the specific person whom that sentence is talking about. Coreference resolution: detection of coreference and anaphoric links between text entities. In IE tasks, this is typically restricted to finding links between previously extracted named entities. For example, "International Business Machines" and "IBM" refer to the same real-world entity. If we take the two sentences "M. Smith likes fishing. But he doesn't like biking", it would be beneficial to detect that "he" is referring to the previously detected person "M. Smith". Relationship extraction: identification of relations between entities, such as: PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.") PERSON located in LOCATION (extracted from the sentence "Bill is in France.") Semi-structured information extraction which may refer to any IE that tries to restore some kind of information structure that has been lost through publication, such as: Table extraction: finding and extracting tables from documents. Table information extraction : extracting information in structured manner from the tables. This task is more complex than table extraction, as table extraction is only the first step, while understanding the roles of the cells, rows, columns, linking the information inside the table and understanding the information presented in the table are additional tasks necessary for table information extraction. Comments extraction : extracting comments from the actual content of articles in order to restore the link between authors of each of the sentences Language and vocabulary analysis Terminology extraction: finding the relevant terms for a given corpus Audio extraction Template-based music extraction: finding relevant characteristic in an audio signal taken from a given repertoire; for instance time indexes of occurrences of percussive sounds can be extracted in order to represent the essential rhythmic component of a music piece. Note that this list is not exhaustive and that the exact meaning of IE activities is not commonly accepted and that many approaches combine multiple sub-tasks of IE in order to achieve a wider goal. Machine learning, statistical analysis and/or natural language processing are often used in IE. IE on non-text documents is becoming an increasingly interesting topic in research, and information extracted from multimedia documents can now be expressed in a high level structure as it is done on text. This naturally leads to the fusion of extracted information from multiple kinds of documents and sources. == World Wide Web applications == IE has been the focus of the MUC conferences. The proliferation of the Web, however, intensified the need for developing IE systems that help people

    Read more →
  • Yap (company)

    Yap (company)

    Yap Speech Cloud was a multimodal speech recognition system developed by American technology company Yap Inc. It offered a fully cloud-based speech-to-text transcription platform that was used by customers such as Microsoft. The Company was a contestant at the inaugural TechCrunch conference and was subsequently acquired by Amazon in September 2011 to help develop products such as Alexa Voice Service, Echo, and Fire TV.

    Read more →
  • Albert One

    Albert One

    Albert One is an artificial intelligence chatbot created by Robby Garner and designed to mimic the way humans make conversations using a multi-faceted approach in natural language programming. == History == In both 1998 and 1999, Albert One won the Loebner Prize Contest, a competition between chatterbots. Some parts of Albert were deployed on the internet beginning in 1995, to gather information about what kinds of things people would say to a chatterbot. Another element of Albert One involved the building of a large database of human statements, and associated replies. This portion of the project was tested at the 1994-1997 Loebner Prize contests. Albert was the first of Robby Garner's multifaceted bots. The Albert One system was composed of several subsystems. Among those were a version of Eliza, the therapist, Elivs, another Eliza-like bot, and several other helper applications working together in a hierarchical arrangement. As a continuation of the stimulus-response library, various other database queries and assertions were tested to arrive at each of Albert's responses. Robby went on to develop networked examples of this kind of hierarchical "glue" at The Turing Hub.

    Read more →
  • ChatScript

    ChatScript

    ChatScript is a combination Natural Language engine and dialog management system designed initially for creating chatbots, but is currently also used for various forms of NL processing. It is written in C++. The engine is an open source project at SourceForge. and GitHub. ChatScript was written by Bruce Wilcox and originally released in 2011, after Suzette (written in ChatScript) won the 2010 Loebner Prize, fooling one of four human judges. == Features == In general ChatScript aims to author extremely concisely, since the limiting scalability of hand-authored chatbots is how much/fast one can write the script. Because ChatScript is designed for interactive conversation, it automatically maintains user state across volleys. A volley is any number of sentences the user inputs at once and the chatbots response. The basic element of scripting is the rule. A rule consists of a type, a label (optional), a pattern, and an output. There are three types of rules. Gambits are something a chatbot might say when it has control of the conversation. Rejoinders are rules that respond to a user remark tied to what the chatbot just said. Responders are rules that respond to arbitrary user input which is not necessarily tied to what the chatbot just said. Patterns describe conditions under which a rule may fire. Patterns range from extremely simplistic to deeply complex (analogous to Regex but aimed for NL). Heavy use is typically made of concept sets, which are lists of words sharing a meaning. ChatScript contains some 2000 predefined concepts and scripters can easily write their own. Output of a rule intermixes literal words to be sent to the user along with common C-style programming code. Rules are bundled into collections called topics. Topics can have keywords, which allows the engine to automatically search the topic for relevant rules based on user input. == Example code == Words starting with ~ are concept sets. For example, ~fruit is the list of all known fruits. The simple pattern (~fruit) reacts if any fruit is mentioned immediately after the chatbot asks for favorite food. The slightly more complex pattern for the rule labelled WHATMUSIC requires all the words what, music, you and any word or phrase meaning to like, but they may occur in any order. Responders come in three types. ?: rules react to user questions. s: rules react to user statements. u: rules react to either. ChatScript code supports standard if-else, loops, user-defined functions and calls, and variable assignment and access. == Data == Some data in ChatScript is transient, meaning it will disappear at the end of the current volley. Other data is permanent, lasting forever until explicitly killed off. Data can be local to a single user or shared across all users at the bot level. Internally all data is represented as text and is automatically converted to a numeric form as needed. === Variables === User variables come in several kinds. Variables purely local to a topic or function are transient. Global variables can be declared as transient or permanent. A variable is generally declared merely by using it, and its type depends on its prefix ($, $$, $_). === Facts === In addition to variables, ChatScript supports facts – triples of data, which can also be transient or permanent. Functions can query for facts having particular values of some of the fields, making them act like an in-memory database. Fact retrieval is very quick and efficient the number of available in-memory facts is largely constrained to the available memory of the machine running the ChatScript engine. Facts can represent record structures and are how ChatScript represents JSON internally. Tables of information can be defined to generate appropriate facts. The above table links people to what they invented (1 per line) with Einstein getting a list of things he did. == External communication == ChatScript embeds the Curl library and can directly read and write facts in JSON to a website. == Server == A ChatScript engine can run in local or server mode. == Pos-tagging, parsing, and ontology == ChatScript comes with a copy of English WordNet embedded within, including its ontology, and creates and extends its own ontology via concept declarations. It has an English language pos-tagger and parser and supports integration with TreeTagger for pos-tagging a number of other languages (TreeTagger commercial license required). == Databases == In addition to an internal fact database, ChatScript supports PostgreSQL, MySQL, MSSQL and MongoDB both for access by scripts, but also as a central filesystem if desired so ChatScript can be scaled horizontally. A common use case is to use a centralized database to host the user files and multiple servers to scale the ChatScript engine. == JavaScript == ChatScript also embeds DukTape, ECMAScript E5/E5.1 compatibility, with some semantics updated from ES2015+. == Spelling Correction == ChatScript has built-in automatic spell checking, which can be augmented in script as both simple word replacements or context sensitive changes. With appropriate simple rules you can change perfect legal words into other words or delete them. E.g., if you have a concept of ~electronic_goods and don't want an input of Radio Shack (a store name) to be detected as an electronic good, you can get the input to change to Radio_Shack (a single word), or allow the words to remain but block the detection of the concept. This is particularly useful when combined with speech-to-text code that is imperfect, but you are familiar with common failings of it and can compensate for them in script. == Control flow == A chatbot's control flow is managed by the control script. This is merely another ordinary topic of rules, that invokes API functions of the engine. Thus control is fully configurable by the scripter (and functions exist to allow introspection into the engine). There are pre-processing control flow and post-processing control flow options available, for special processing.

    Read more →
  • Graph cuts in computer vision and artificial intelligence

    Graph cuts in computer vision and artificial intelligence

    As applied in the field of computer vision, graph cut optimization can be employed to efficiently solve a wide variety of low-level computer vision problems (early vision), such as image smoothing, the stereo correspondence problem, image segmentation, object co-segmentation, numerous military applications (eg Automatic target recognition) and many other problems that can be formulated in terms of energy minimization (eg Climate Science and Environmental modelling). Graph cut techniques are now increasingly being used in combination with more general spatial Artificial intelligence techniques (eg to enforce structure in Large language model output to sharpen tumour boundaries and similarly for various Augmented reality, Self-driving car, Robotics, Google Maps applications etc). Many of these energy minimization problems can be approximated by solving a maximum flow problem in a graph (and thus, by the max-flow min-cut theorem, define a minimal cut of the graph). Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the maximum a posteriori estimate of a solution. Although many computer vision algorithms involve cutting a graph (e.g. normalized cuts), the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as graph partitioning algorithms). "Binary" problems (such as denoising a binary image) can be solved exactly using this approach; problems where pixels can be labeled with more than two different labels (such as stereo correspondence, or denoising of a grayscale image) cannot be solved exactly, but solutions produced are usually near the global optimum. == History == The foundational theory of graph cuts in computer vision was first developed by Margaret Greig, Bruce Porteous and Allan Seheult (GPS) of Durham University in a now legendary discussion contribution to Julian Besag's 1986 paper and a more detailed follow on paper in 1989. In the Bayesian statistical context of smoothing noisy images, using a Markov random field as the image prior distribution, they showed with a mathematically beautiful proof how the maximum a posteriori estimate of a binary image can be obtained exactly by maximizing the flow through an associated image network, or graph, involving the introduction of a source and sink and Log-likelihood ratios. The problem was shown to be efficiently solvable exactly, an unexpected result as the problem was believed to be computationally intractable (NP hard). GPS also addressed the computational cost of the max-flow algorithm on large graphs, a significant concern at the time. They proposed a partitioning algorithm (see Section 4 of GPS) involving the recursive amalgamation of non-overlapping blocks, or tiles, which gave a 12X increase in speed. This approach recursively solved and amalgamated independent sub-graphs until the whole graph was solved. While contemporaries like Geman and Geman had advocated Parallel computing in the context of Simulated annealing, the GPS blocking strategy offered a deterministic structure amenable to parallelisation and anticipated modern artificial intelligence design across multiple GPUs. However, until recently, this aspect of the paper was largely ignored and subsequent research focused on Serial computer global search trees, such as the Boykov-Kolmogorov algorithm. Although the general k {\displaystyle k} -colour problem is NP hard for k > 2 , {\displaystyle k>2,} the GPS approach has turned out to have very wide applicability in general computer vision problems. This was first demonstrated by Boykov, Veksler and Zabih who, in a seminal paper published more than 10 years after the original GPS paper, and in other important works, lit the blue touch paper for the general adoption of graph cut techniques in computer vision. They showed that, for general problems, the GPS approach can be applied iteratively to sequences of binary problems, using their now ubiquitous alpha-expansion algorithm, yielding near optimal solutions. Prior to these results, approximate local optimisation techniques such as simulated annealing (as proposed by the Geman brothers) or iterated conditional modes (a type of greedy algorithm suggested by Julian Besag) were used to solve such image smoothing problems. Building on these advancements, GPS graph cut optimization was subsequently adapted for interactive image segmentation, most notably through the "GrabCut" algorithm introduced by Carsten Rother, Vladimir Kolmogorov, and Andrew Blake of Microsoft Research, Cambridge. GrabCut extended earlier interactive graph cut methods by replacing monochrome image histograms with Gaussian mixture models to estimate colour distributions, and by employing an iterative GPS energy minimisation scheme. This approach significantly simplified user interaction, requiring only a rough bounding box around the target object rather than detailed user-drawn strokes, and it quickly became a standard tool in both academic research and commercial image editing software. The GPS paper connected and bridged profound ideas from Mathematical statistics (Bayes' theorem, Markov random field), Physics (Ising model), Optimisation (Energy function) and Computer science (Network flow problem) and led the move away from approximate local and slow optimisation approaches (eg simulated annealing) to more powerful exact, or near exact, faster global optimisation techniques. It is now recognised as seminal as it was well ahead of its time and, in particular, was published years before the computing power revolution of Moore's law and GPUs. Significantly, GPS was published in a mathematical statistics (rather than a computer vision) journal, and this led to it being overlooked by the computer vision community for many years. It is unofficially known as "The Velvet Underground" paper of computer vision (ie although very few computer vision people read the paper [bought the record], those that did, most importantly Boykov, Veksler and Zabih, started new and important research [formed a band]). This is confirmed by GPS' very large amplification ratio (2nd order citations/first order citations), estimated at well in excess of 100. Despite the foundational nature of the GPS work, formal recognition from the computer vision community has predominantly gone to the researchers who followed to extend and popularise the graph cut method. For example, Boykov, Veksler and Zabih deservedly received a Helmholtz Prize from the ICCV in 2011. This prize recognises ICCV papers from 10 or more years earlier that have had a significant impact on computer vision research. In 2011, Couprie et al. proposed a general image segmentation framework, called the "Power Watershed", that minimized a real-valued indicator function from [0,1] over a graph, constrained by user seeds (or unary terms) set to 0 or 1, in which the minimization of the indicator function over the graph is optimized with respect to an exponent p {\displaystyle p} . When p = 1 {\displaystyle p=1} , the Power Watershed is optimized by graph cuts, when p = 0 {\displaystyle p=0} the Power Watershed is optimized by shortest paths, p = 2 {\displaystyle p=2} is optimized by the random walker algorithm and p = ∞ {\displaystyle p=\infty } is optimized by the watershed algorithm. In this way, the Power Watershed may be viewed as a generalization of graph cuts that provides a straightforward connection with other energy optimization segmentation/clustering algorithms. == Binary segmentation of images == === Notation === Image: x ∈ { R , G , B } N {\displaystyle x\in \{R,G,B\}^{N}} Output: Segmentation (also called opacity) S ∈ R N {\displaystyle S\in R^{N}} (soft segmentation). For hard segmentation S ∈ { 0 for background , 1 for foreground/object to be detected } N {\displaystyle S\in \{0{\text{ for background}},1{\text{ for foreground/object to be detected}}\}^{N}} Energy function: E ( x , S , C , λ ) {\displaystyle E(x,S,C,\lambda )} where C is the color parameter and λ is the coherence parameter. E ( x , S , C , λ ) = E c o l o r + E c o h e r e n c e {\displaystyle E(x,S,C,\lambda )=E_{\rm {color}}+E_{\rm {coherence}}} Optimization: The segmentation can be estimated as a global minimum over S: arg ⁡ min S E ( x , S , C , λ ) {\displaystyle {\arg \min }_{S}E(x,S,C,\lambda )} === Existing methods === Standard Graph cuts: optimize energy function over the segmentation (unknown S value). Iterated Graph cuts: First step optimizes over the color parameters using K-means. Second step performs the usual graph cuts algorithm. These 2 steps are repeated recursively until convergence Dynamic graph cuts:Allows to re-run the algorithm much faster after modifying the problem (e.g. after new seeds have been added by a user). === Energy function === Pr ( x ∣ S ) = K − E {\displaystyle \Pr(x\mid S)=K^{-E}} where the energy E {\displaystyle E} is composed of two different mod

    Read more →
  • CrocBITE

    CrocBITE

    CrocBITE (currently CrocAttack) was an online database of wild crocodilian attacks reported on humans in the world. The non-profit online research tool helped to scientifically analyze crocodilian behavior via complex models. Users were encouraged to feed information in a crowdsourcing manner. This website excludes captive crocodilian attacks, as well as non-fatal bites on professional handlers, rangers, staff, or researchers, and crocodilian attacks on pets and livestock, because its primary goal is to analyze natural human-crocodilian conflict in the wild for conservation and management purposes, and that these incidents do are not considered indicative of natural species behavior or typical human-wildlife conflict, as well as not providing enough useful data and helping researchers understand wild population behavior or typical human-wildlife conflict dynamics and helps create safety strategies for people living or working near wild crocodilians, rather than tracking workplace accidents in zoos or farms. While fatal incidents involving handlers are sometimes included on the website, typical captive incidents (such as handlers being bitten by them in zoos) are excluded because they are considered manageable professional risks rather than general public safety threats. == About == The online database was established in 2013 (2013) by Dr Adam Britton, a researcher at Charles Darwin University, his student Brandon Sideleau and Erin Britton. It was a compilation of government records, individual reports, registered contributors and historical data. Dr Simon Pooley, Junior Research fellow, Imperial College London joined hands to further the studies. The collaboration culminated when Dr Pooley met Dr Britton at the IUCN Crocodile Specialist Group, in Louisiana in 2014. The program received funds from Economic and Social Research Council, United Kingdom to the tune of A$30,000 and unspecified resourced plus amount from Big Gecko Crocodilian Research, Crocodillian.com and Charles Darwin University. The research yielded pertinent observations that provide inside into crocodile attacks. It was observed that most attacks on humans occur from bites of Saltwater crocodile as against the popular understanding of Nile crocodiles taking the top spot. This is not, however, believed to be the actual case, as most attacks by the Nile crocodile are believed to go unreported or only reported on a local level. The broad category of Nile crocodile attacks were segmented into West African crocodile and Crocodylus niloticus (the Nile Crocodile) species to get a clear understanding of their respective attack zones. The objective was that the information would be used by communities and conservation managers to help inform and educate people about how to keep safe. The information was vital for Australia and Africa where such attacks are more likely than in other parts of the world. This was the only database of its kind with such comprehensive collection of information made available online. The database is no longer online, and its founder Adam Britton is in custody having pleaded guilty to charges of bestiality on September 25, 2023. It has been rebranded and renamed CrocAttack, and serves as a updated database focusing on human-crocodilian conflict and records over 8,500 incidents from the past decades.

    Read more →
  • CMU Pronouncing Dictionary

    CMU Pronouncing Dictionary

    The CMU Pronouncing Dictionary (also known as CMUdict) is an open-source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research. CMUdict provides a mapping orthographic/phonetic for English words in their North American pronunciations. It is commonly used to generate representations for speech recognition (ASR), e.g. the CMU Sphinx system, and speech synthesis (TTS), e.g. the Festival system. CMUdict can be used as a training corpus for building statistical grapheme-to-phoneme (g2p) models that will generate pronunciations for words not yet included in the dictionary. The most recent release is 0.7b; it contains over 134,000 entries. An interactive lookup version is available. == Database format == The database is distributed as a plain text file with one entry to a line in the format "WORD " with a two-space separator between the parts. If multiple pronunciations are available for a word, variants are identified using numbered versions (e.g. WORD(1)). The pronunciation is encoded using a modified form of the ARPABET system, with the addition of stress marks on vowels of levels 0, 1, and 2. A line-initial ;;; token indicates a comment. A derived format, directly suitable for speech recognition engines is also available as part of the distribution; this format collapses stress distinctions (typically not used in ASR). The following is a table of phonemes used by CMU Pronouncing Dictionary. == History == == Applications == The Unifon converter is based on the CMU Pronouncing Dictionary. The Natural Language Toolkit contains an interface to the CMU Pronouncing Dictionary. The Carnegie Mellon Logios tool incorporates the CMU Pronouncing Dictionary. PronunDict, a pronunciation dictionary of American English, uses the CMU Pronouncing Dictionary as its data source. Pronunciation is transcribed in IPA symbols. This dictionary also supports searching by pronunciation. Some singing voice synthesizer software like CeVIO Creative Studio and Synthesizer V uses modified version of CMU Pronouncing Dictionary for synthesizing English singing voices. Transcriber, a tool for the full text phonetic transcription, uses the CMU Pronouncing Dictionary 15.ai, a real-time text-to-speech tool using artificial intelligence, uses the CMU Pronouncing Dictionary

    Read more →
  • PatchMatch

    PatchMatch

    PatchMatch is an algorithm used to quickly find correspondences (or matches) between small square regions (or patches) of an image. It has various applications in image editing, such as reshuffling or removing objects from images or altering their aspect ratios without cropping or noticeably stretching them. PatchMatch was first presented in a 2011 paper by researchers at Princeton University. == Algorithm == The goal of the algorithm is to find the patch correspondence by defining a nearest-neighbor field (NNF) as a function f : R 2 → R 2 {\displaystyle f:\mathbb {R} ^{2}\to \mathbb {R} ^{2}} of offsets, which is over all possible matches of patch (location of patch centers) in image A, for some distance function of two patches D {\displaystyle D} . So, for a given patch coordinate a {\displaystyle a} in image A {\displaystyle A} and its corresponding nearest neighbor b {\displaystyle b} in image B {\displaystyle B} , f ( a ) {\displaystyle f(a)} is simply b − a {\displaystyle b-a} . However, if we search for every point in image B {\displaystyle B} , the work will be too hard to complete. So the following algorithm is done in a randomized approach in order to accelerate the calculation speed. The algorithm has three main components. Initially, the nearest-neighbor field is filled with either random offsets or some prior information. Next, an iterative update process is applied to the NNF, in which good patch offsets are propagated to adjacent pixels, followed by random search in the neighborhood of the best offset found so far. Independent of these three components, the algorithm also uses a coarse-to-fine approach by building an image pyramid to obtain the better result. === Initialization === When initializing with random offsets, we use independent uniform samples across the full range of image B {\displaystyle B} . This algorithm avoids using an initial guess from the previous level of the pyramid because in this way the algorithm can avoid being trapped in local minima. === Iteration === After initialization, the algorithm attempted to perform iterative process of improving the N N F {\displaystyle NNF} . The iterations examine the offsets in scan order (from left to right, top to bottom), and each undergoes propagation followed by random search. === Propagation === We attempt to improve f ( x , y ) {\displaystyle f(x,y)} using the known offsets of f ( x − 1 , y ) {\displaystyle f(x-1,y)} and f ( x , y − 1 ) {\displaystyle f(x,y-1)} , assuming that the patch offsets are likely to be the same. That is, the algorithm will take new value for f ( x , y ) {\displaystyle f(x,y)} to be arg ⁡ min ( x , y ) D ( f ( x , y ) ) , D ( f ( x − 1 , y ) ) , D ( f ( x , y − 1 ) ) {\displaystyle \arg \min \limits _{(x,y)}{D(f(x,y)),D(f(x-1,y)),D(f(x,y-1))}} . So if f ( x , y ) {\displaystyle f(x,y)} has a correct mapping and is in a coherent region R {\displaystyle R} , then all of R {\displaystyle R} below and to the right of f ( x , y ) {\displaystyle f(x,y)} will be filled with the correct mapping. Alternatively, on even iterations, the algorithm search for different direction, fill the new value to be arg ⁡ min ( x , y ) { D ( f ( x , y ) ) , D ( f ( x + 1 , y ) ) , D ( f ( x , y + 1 ) ) } {\displaystyle \arg \min \limits _{(x,y)}\{D(f(x,y)),D(f(x+1,y)),D(f(x,y+1))\}} . === Random search === Let v 0 = f ( x , y ) {\displaystyle v_{0}=f(x,y)} , we attempt to improve f ( x , y ) {\displaystyle f(x,y)} by testing a sequence of candidate offsets at an exponentially decreasing distance from v 0 {\displaystyle v_{0}} u i = v 0 + w α i R i {\displaystyle u_{i}=v_{0}+w\alpha ^{i}R_{i}} where R i {\displaystyle R_{i}} is a uniform random in [ − 1 , 1 ] × [ − 1 , 1 ] {\displaystyle [-1,1]\times [-1,1]} , w {\displaystyle w} is a large window search radius which will be set to maximum picture size, and α {\displaystyle \alpha } is a fixed ratio often assigned as 1/2. This part of the algorithm allows the f ( x , y ) {\displaystyle f(x,y)} to jump out of local minimum through random process. === Halting criterion === The often used halting criterion is set the iteration times to be about 4~5. Even with low iteration, the algorithm works well.

    Read more →