AI Data Jobs

AI Data Jobs — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • DryvIQ

    DryvIQ

    DryvIQ is a software application that enables businesses to migrate on-site system files and associated data across storage and content management platforms, as well as create synchronized hybrid storage systems. == History == Before it was DryvIQ, the software SkySync was released in 2013 by Ann Arbor, Michigan based company, Portal Architects, Inc. The company created SkySync, a back-end, administrative application designed to transfer content across storage platforms, after abandoning 18 months of development on a desktop application called SkyBrary in 2011. Between 2014 and 2015, Portal Architects established partnerships with the following companies: Autodesk, Box, Dropbox, Egnyte, EMC, Google, Syncplicity, Huddle, IBM, Microsoft, OpenText, Oracle, Citrix ShareFile, Hightail and Internet2. SkySync (currently DryvIQ) was named a "Cool Vendor in Content Management" by Gartner in 2015. In 2022, SkySync changed its name to DryvIQ, which is now what the company is currently known as. == Overview == DryvIQ is a software application that syncs, migrates or backs up files including their associated properties, metadata, versions, user accounts and permissions across on-premises and Cloud-based storage platforms. The software deploys on a server, virtual machine or within Microsoft Azure, Amazon Web Services or other cloud computing services.

    Read more →
  • The Best Free AI Writing Assistant for Beginners

    The Best Free AI Writing Assistant for Beginners

    Shopping for the best AI writing assistant? An AI writing assistant is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI writing assistant slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • AI Essay Writers: Free vs Paid (2026)

    AI Essay Writers: Free vs Paid (2026)

    Looking for the best AI essay writer? An AI essay writer is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI essay writer slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Mark Heimann

    Mark Heimann

    Mark A. Heimann is an American chess grandmaster and machine learning researcher. == Chess career == Heimann began playing chess at the age of 5 after his father bought him and his twin brother Alexander a chess set. He then won several national grade-level championships as well as the Pennsylvania and Ohio state championships in middle school and high school. In October 2007, he was ranked as the national #2 under-14 player, only behind future grandmaster Marc Tyler Arnold. In the February 2008 national rankings, he moved up to being the top-ranked under-14 player. In December 2012, he played for Washington University St. Louis' "A" team in the Pan-American Intercollegiate Chess Championships, where he was the second-most successful player, recording 4 wins, 1 draw, and 1 loss. The university's team also won the Division II championship title. In three tournaments between September and December 2022, Heimann earned three international master title norms, earning the international master title at the age of 29. In November 2024, he scored a GM norm at the U.S. Masters Chess Championship. He finished the event in joint-6th place. The following week, at the Saint Louis Masters tournament, he earned his final grandmaster norm and crossed 2500 in live rating, achieving the Grandmaster title. It was formally awarded to him in April 2025. == Research career == He obtained a bachelor's degree from Washington University in St. Louis in the School of Arts and Sciences and got his PhD from the University of Michigan. He is a machine learning researcher at Lawrence Livermore National Laboratory. == Personal life == Outside of chess and research, he also plays several instruments and is a competitive powerlifter.

    Read more →
  • Gamma (app)

    Gamma (app)

    Gamma is a web-based software platform that uses artificial intelligence to generate presentations, documents, webpages, and other visual content. The platform allow users to create structured layouts and draft text based on prompts or uploaded material. It operates as an online application and provides tools for editing, organizing, and sharing content. == History == Gamma was established in the early 2020s by Grant Lee, James Fox, and Jon Noronha during a period of increased development in artificial intelligence–based productivity software. The platform was introduced as a web-based format designed to present information through structured visual layouts rather than traditional slide-based presentations. Its interface was developed to adapt content to different screen sizes and devices. In later updates, Gamma expanded its functionality to support additional formats, including documents and simple webpages. By November 2025, the company reported that the platform had reached approximately 70 million users. Gamma has raised venture capital funding from a number of technology-focused investors since its founding. == Features == Gamma allows users to create presentations, documents, and webpages by entering prompts, pasting text, or uploading source files. The platform uses artificial intelligence to generate draft text, organize information, and apply structured layouts. Users can edit generated material manually and adjust formatting, structure, and visual elements. The software also supports collaborative editing, allowing multiple users to contribute to and revise the same project. Instead of relying only on fixed slide-based formats, Gamma presents content in scrollable layouts designed for web viewing across different screen sizes. Projects created on the platform can be shared through web links or exported to formats compatible with other software. Gamma also provides integration options and developer access through an application programming interface (API). == Technology == Gamma uses generative artificial intelligence models to interpret user input and generate structured content. The software automates elements of layout selection, formatting, and visual presentation. As with other AI-assisted tools, output produced by the system may require human review and revision to ensure accuracy and appropriate context. == Funding == Gamma has raised venture capital funding from a number of technology-focused investors since its founding. In November 2025, the company announced a Series B funding round that raised $68 million at a reported valuation of approximately $2.1 billion. Investors in the round included Andreessen Horowitz, Accel, and Uncork Capital, among others. == Controversy == In 2025, cybersecurity researchers reported that Gamma had been used in a phishing campaign targeting Microsoft accounts. Attackers shared links to presentations hosted on the platform that redirected users to a spoofed Microsoft SharePoint login page intended to collect credentials. Researchers noted that the incident reflected the broader misuse of legitimate online services in phishing schemes.

    Read more →
  • Flex (lexical analyzer generator)

    Flex (lexical analyzer generator)

    Flex (fast lexical analyzer generator) is a free and open-source software alternative to lex. It is a computer program that generates lexical analyzers (also known as "scanners" or "lexers"). It is frequently used as the lex implementation together with Berkeley Yacc parser generator on BSD-derived operating systems (as both lex and yacc are part of POSIX), or together with GNU bison (a version of yacc) in BSD ports and in Linux distributions. Unlike Bison, flex is not part of the GNU Project and is not released under the GNU General Public License, although a manual for Flex was produced and published by the Free Software Foundation. == History == Flex was written in C around 1987 by Vern Paxson, with the help of many ideas and much inspiration from Van Jacobson. Original version by Jef Poskanzer. The fast table representation is a partial implementation of a design done by Van Jacobson. The implementation was done by Kevin Gong and Vern Paxson. == Example lexical analyzer == This is an example of a Flex scanner for the instructional programming language PL/0. The tokens recognized are: '+', '-', '', '/', '=', '(', ')', ',', ';', '.', ':=', '<', '<=', '<>', '>', '>='; numbers: 0-9 {0-9}; identifiers: a-zA-Z {a-zA-Z0-9} and keywords: begin, call, const, do, end, if, odd, procedure, then, var, while. == Internals == These programs perform character parsing and tokenizing via the use of a deterministic finite automaton (DFA). A DFA is a theoretical machine accepting regular languages, and is equivalent to read-only right moving Turing machines. The syntax is based on the use of regular expressions. See also nondeterministic finite automaton. == Issues == === Time complexity === A Flex lexical analyzer usually has time complexity O ( n ) {\displaystyle O(n)} in the length of the input. That is, it performs a constant number of operations for each input symbol. This constant is quite low: GCC generates 12 instructions for the DFA match loop. Note that the constant is independent of the length of the token, the length of the regular expression and the size of the DFA. However, using the REJECT macro in a scanner with the potential to match extremely long tokens can cause Flex to generate a scanner with non-linear performance. This feature is optional. In this case, the programmer has explicitly told Flex to "go back and try again" after it has already matched some input. This will cause the DFA to backtrack to find other accept states. The REJECT feature is not enabled by default, and because of its performance implications its use is discouraged in the Flex manual. === Reentrancy === By default the scanner generated by Flex is not reentrant. This can cause serious problems for programs that use the generated scanner from different threads. To overcome this issue there are options that Flex provides in order to achieve reentrancy. A detailed description of these options can be found in the Flex manual. === Usage under non-Unix environments === Normally the generated scanner contains references to the unistd.h header file, which is Unix specific. To avoid generating code that includes unistd.h, %option nounistd should be used. Another issue is the call to isatty (a Unix library function), which can be found in the generated code. The %option never-interactive forces flex to generate code that does not use isatty. === Using flex from other languages === Flex can only generate code for C and C++. To use the scanner code generated by flex from other languages a language binding tool such as SWIG can be used. === Unicode support === Flex is limited to matching 1-byte (8-bit) binary values and therefore does not support Unicode. RE/flex and other alternatives do support Unicode matching. == Flex++ == flex++ is a similar lexical scanner for C++ which is included as part of the flex package. The generated code does not depend on any runtime or external library except for a memory allocator (malloc or a user-supplied alternative) unless the input also depends on it. This can be useful in embedded and similar situations where traditional operating system or C runtime facilities may not be available. The flex++ generated C++ scanner includes the header file FlexLexer.h, which defines the interfaces of the two C++ generated classes.

    Read more →
  • Top 10 AI Bug Finders Compared (2026)

    Top 10 AI Bug Finders Compared (2026)

    Trying to pick the best AI bug finder? An AI bug finder is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI bug finder slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →
  • General Regionally Annotated Corpus of Ukrainian

    General Regionally Annotated Corpus of Ukrainian

    General Regionally Annotated Corpus of the Ukrainian Language (GRAC, Ukrainian: Генеральний регіонально анотований корпус української мови, romanized: Heneralnyi rehionalno anotovanyi korpus ukrainskoi movy, ГРАК, Ukrainian грак for rook) is a text corpus of the Ukrainian language comprising more than 2 billion tokens, intended for linguistic research in grammar, vocabulary, and the history of the Ukrainian literary language, as well as for use in compiling dictionaries and grammars. The corpus can be used for language study and also for preparing teaching materials, textbooks, learner’s dictionaries, and exercises using examples from real texts, taking into account frequency and collocational patterns, and so on. The corpus is not a model of standard Ukrainian: it may contain words and combinations that do not match current norms of the literary language. The corpus covers the period from 1816 to 2025, and as of 29 November 2025 it contains more than 812,000 texts by about 35,000 authors. == Composition of the corpus == In the 10th version of the corpus, available for searching from 20 October 2020, 35% consists of fiction. Some fiction genres are highlighted separately: children’s literature, folklore, dramatic works, and scripts. Among non-fiction texts: journalistic writing, including newspaper collections from 1888–1893, 1905, 1913–1918, 1919–1943, modern newspapers from different regions, and texts from online news/information sites; memoirs, letters, and diaries, including a sizeable corpus of Facebook texts representing blogs by people from all regions of Ukraine and the diaspora; scholarly and educational texts: monographs, dissertations, academic articles, textbooks; large subcorpora of academic literature in history, ethnography, philosophy, and law are singled out separately; religious texts, including two Ukrainian translations of the Bible; speeches and interviews. Some dictionaries that include phrasal examples and phraseology have also been incorporated, including the Ukrainian dictionary by Borys Hrinchenko and the Russian-Ukrainian idiomatic dictionary by I. Vyrhan and M. Pylynska. Using the corpus tools, these dictionaries can be searched not only for words, but also for lexico-grammatical patterns within examples and phraseological expressions. About 20% of the texts in the corpus are translations. The corpus includes translations from more than 80 languages, most of all from English and Russian. == Dating == Texts in the corpus are dated by the year of writing, or by the latest year in which a work could have been written; translated texts are dated by the year the translation was produced. A publication year may also be indicated, corresponding to the edition from which the text is taken. == Regional annotation == The corpus’s regional annotation is based on the modern administrative division of Ukraine. The corpus includes texts from all oblasts of Ukraine and from Crimea. A single text may belong to several regional subcorpora (if the author or translator was born, studied, or lived for a long time in different regions). In addition to regional subcorpora, there are subcorpora of works by authors of the Ukrainian diaspora (USA, Canada, Poland, Germany, the United Kingdom, France, etc.). These are mostly texts by emigrants of the 1940s, and to a lesser extent of 1917–1920s. == Morphological annotation == GRAC is based on the morphological analysis system nlp_uk, developed by specialists from the r2u group. The program analyzes the text and, for each word form, determines the lemma (lexeme) and tags (grammatical features). == Research based on the corpus == Research on the Ukrainian language has been carried out using the corpus, including studies of the historical dynamics of language norms, and letter and letter-combination frequencies for font development.

    Read more →
  • Scanimate

    Scanimate

    Scanimate is an analog computer animation (video synthesizer) system created by Lee Harrison III of Denver, Colorado. Harrison had developed its predecessor, ANIMAC, which generated used a motion capture system, based on a body suit with potentiometers. In contrast, Scanimate included TV technology. Scanimate's successor was called Caesar, and used a digital computer to control the analog system. The eight Scanimate systems were used to produce much of the video-based animation seen on television between most of the 1970s and early 1980s in commercials, promotions, and show openings. One of the major advantages the Scanimate system had over film-based animation and computer animation was the ability to create animations in real time. The speed with which animation could be produced on the system because of this, as well as its range of possible effects, helped it to supersede film-based animation techniques for television graphics. By the mid-1980s, it was superseded by digital computer animation, which produced sharper images and more sophisticated 3D imagery. Animations created on Scanimate and similar analog computer animation systems have a number of characteristic features that distinguish them from film-based animation: the motion is extremely fluid, using all 60 fields per second (in NTSC format video) or 50 fields (in PAL format video) rather than the 24 frames per second that film uses; the colors are much brighter and more saturated; and the images have a very "electronic" look that results from the direct manipulation of video signals through which the Scanimate produces the images. == How it works == A special high-resolution (around 945 lines) monochrome camera records high-contrast artwork. The image is then displayed on a high-resolution screen. Unlike a normal monitor, its deflection signals are passed through a special analog computer that enables the operator to bend the image in a variety of ways. The image is then shot from the screen by either a film camera or a video camera. In the case of a video camera, this signal is then fed into a colorizer, a device that takes certain shades of grey and turns it into color as well as transparency. The idea behind this is that the output of the Scanimate itself is always monochrome. Another advantage of the colorizer is that it gives the operator the ability to continuously add layers of graphics. This makes possible the creation of very complex graphics. This is done by using two video recorders. The background is played by one recorder and then recorded by another one. This process is repeated for every layer. This requires very high-quality video recorders (such as both the Ampex VR-2000 or IVC's IVC-9000 of Scanimate's era, the IVC-9000 being used quite frequently for Scanimate composition due to its very high generational quality between re-recordings). == Current usage == Two of the Scanimates are still in use at ZFx studios in Asheville, North Carolina. The original "Black Swan" R&D machine has been updated with more modern power supplies and can produce material in standard or 1080P high definition video. The "white Pearl" machine is the last one produced and is being kept in its original configuration for historical purposes by David Sieg at ZFx inc. The machines are installed in a working production environment with Grass Valley switchers, Kaleidoscope digital video effects systems, and Accom digital disk recorders for layering. == Use in television, music and films == === Music videos === Let's Groove by Earth, Wind & Fire Get Down on It by Kool & the Gang Blame It on the Boogie by The Jacksons Knock on Wood by Amii Stewart Popcorn Love by New Edition === TV programs/movies === === TV channels/home video/TV productions ===

    Read more →
  • Laws of Form

    Laws of Form

    Laws of Form (hereinafter LoF) is a book by G. Spencer-Brown, written by August 1967 and published in 1969. The book straddles the boundary between mathematics and philosophy. LoF describes three distinct logical systems: The primary arithmetic (described in Chapter 4 of LoF), whose models include Boolean arithmetic; The primary algebra (Chapter 6 of LoF), whose models include the two-element Boolean algebra (hereinafter abbreviated 2), Boolean logic, and the classical propositional calculus; Equations of the second degree (Chapter 11), whose interpretations include finite automata and Alonzo Church's Restricted Recursive Arithmetic (RRA). "Boundary algebra" is a Meguire (2011) term for the union of the primary algebra and the primary arithmetic. Laws of Form sometimes loosely refers to the "primary algebra" as well as to LoF. == Contents == The preface states that the work was first explored in 1959, and Spencer Brown cites Bertrand Russell as being supportive of his endeavour. He also thanks J. C. P. Miller of University College London for helping with the proofreading and offering other guidance. In 1963 Spencer Brown was invited by Harry Frost, staff lecturer in the physical sciences at the department of Extra-Mural Studies of the University of London, to deliver a course on the mathematics of logic. LoF emerged from work in electronic engineering its author did around 1960. Key ideas of the LOF were first outlined in his 1961 manuscript Design with the Nor, which remained unpublished until 2021, and further refined during subsequent lectures on mathematical logic he gave under the auspices of the University of London's extension program. LoF has appeared in several editions. The second series of editions appeared in 1972 with the "Preface to the First American Edition", which emphasised the use of self-referential paradoxes, and the most recent being a 1997 German translation. LoF has never gone out of print. LoF's mystical and declamatory prose and its love of paradox make it a challenging read for all. Spencer-Brown was influenced by Ludwig Wittgenstein and R. D. Laing. LoF also echoes a number of themes from the writings of Charles Sanders Peirce, Bertrand Russell, and Alfred North Whitehead. The work has had curious effects on some classes of its readership; for example, on obscure grounds, it has been claimed that the entire book is written in an operational way, giving instructions to the reader instead of telling them what "is", and that in accordance with G. Spencer-Brown's interest in paradoxes, the only sentence that makes a statement that something is, is the statement which says no such statements are used in this book. Furthermore, the claim asserts that except for this one sentence the book can be seen as an example of E-Prime. What prompted such a claim, is obscure, either in terms of incentive, logical merit, or as a matter of fact, because the book routinely and naturally uses the verb to be throughout, and in all its grammatical forms, as may be seen both in the original and in quotes shown below. == Reception == Ostensibly a work of formal mathematics and philosophy, LoF became something of a cult classic: it was praised by Heinz von Foerster when he reviewed it for the Whole Earth Catalog. Those who agree point to LoF as embodying an enigmatic "mathematics of consciousness", its algebraic symbolism capturing an (perhaps even "the") implicit root of cognition: the ability to "distinguish". LoF argues that primary algebra reveals striking connections among logic, Boolean algebra, and arithmetic, and the philosophy of language and mind. Stafford Beer wrote in a review for Nature in 1969, "When one thinks of all that Russell went through sixty years ago, to write the Principia, and all we his readers underwent in wrestling with those three vast volumes, it is almost sad". Banaschewski (1977) argues that the primary algebra is nothing but new notation for Boolean algebra. Indeed, the two-element Boolean algebra 2 can be seen as the intended interpretation of the primary algebra. Yet the notation of the primary algebra: Fully exploits the duality characterizing not just Boolean algebras but all lattices; Highlights how syntactically distinct statements in logic and 2 can have identical semantics; Dramatically simplifies Boolean algebra calculations, and proofs in sentential and syllogistic logic. Moreover, the syntax of the primary algebra can be extended to formal systems other than 2 and sentential logic, resulting in boundary mathematics. LoF has influenced, among others, Heinz von Foerster, Louis Kauffman, Niklas Luhmann, Humberto Maturana, Francisco Varela and William Bricken. Some of these authors have modified the primary algebra in a variety of interesting ways. LoF claimed that certain well-known mathematical conjectures of very long standing, such as the four color theorem, Fermat's Last Theorem, and the Goldbach conjecture, are provable using extensions of the primary algebra. Spencer-Brown eventually circulated a purported proof of the four color theorem, but it was met with skepticism. == The form (Chapter 1) == The symbol: Also called the "mark" or "cross", is the essential feature of the Laws of Form. In Spencer-Brown's inimitable and enigmatic fashion, the Mark symbolizes the root of cognition, i.e., the dualistic Mark indicates the capability of differentiating a "this" from "everything else but this". In LoF, a Cross denotes the drawing of a "distinction", and can be thought of as signifying the following, all at once: The act of drawing a boundary around something, thus separating it from everything else; That which becomes distinct from everything by drawing the boundary; Crossing from one side of the boundary to the other. All three ways imply an action on the part of the cognitive entity (e.g., person) making the distinction. As LoF puts it: "The first command: Draw a distinction can well be expressed in such ways as: Let there be a distinction, Find a distinction, See a distinction, Describe a distinction, Define a distinction, Or: Let a distinction be drawn". (LoF, Notes to chapter 2) The counterpoint to the Marked state is the Unmarked state, which is simply nothing, the void, or the un-expressable infinite represented by a blank space. It is simply the absence of a Cross. No distinction has been made and nothing has been crossed. The Marked state and the void are the two primitive values of the Laws of Form. The Cross can be seen as denoting the distinction between two states, one "considered as a symbol" and another not so considered. From this fact arises a curious resonance with some theories of consciousness and language. Paradoxically, the Form is at once Observer and Observed, and is also the creative act of making an observation. LoF (excluding back matter) closes with the words: ...the first distinction, the Mark and the observer are not only interchangeable, but, in the form, identical. C. S. Peirce came to a related insight in the 1890s; see § Related work. == The primary arithmetic (Chapter 4) == The syntax of the primary arithmetic goes as follows. There are just two atomic expressions: The empty Cross ; All or part of the blank page (the "void"). There are two inductive rules: A Cross may be written over any expression; Any two expressions may be concatenated. The semantics of the primary arithmetic are perhaps nothing more than the sole explicit definition in LoF: "Distinction is perfect continence". Let the "unmarked state" be a synonym for the void. Let an empty Cross denote the "marked state". To cross is to move from one value, the unmarked or marked state, to the other. We can now state the "arithmetical" axioms A1 and A2, which ground the primary arithmetic (and hence all of the Laws of Form): "A1. The law of Calling". Calling twice from a state is indistinguishable from calling once. To make a distinction twice has the same effect as making it once. For example, saying "Let there be light" and then saying "Let there be light" again, is the same as saying it once. Formally: = {\displaystyle \ =} "A2. The law of Crossing". After crossing from the unmarked to the marked state, crossing again ("recrossing") starting from the marked state returns one to the unmarked state. Hence recrossing annuls crossing. Formally: = {\displaystyle \ =} In both A1 and A2, the expression to the right of '=' has fewer symbols than the expression to the left of '='. This suggests that every primary arithmetic expression can, by repeated application of A1 and A2, be simplified to one of two states: the marked or the unmarked state. This is indeed the case, and the result is the expression's "simplification". The two fundamental metatheorems of the primary arithmetic state that: Every finite expression has a unique simplification. (T3 in LoF); Starting from an initial marked or unmarked state, "complicating" an expression by a finite number of repeated application of A1 and A2 cannot yield

    Read more →
  • Christopher K. I. Williams

    Christopher K. I. Williams

    Christopher Kenneth Ingle Williams (born 1960) is a professor at the School of Informatics, University of Edinburgh, working in Artificial intelligence, and particularly the areas of Machine learning and Computer vision. == Education == Williams received a BA in Physics and Theoretical Physics from the University of Cambridge in 1982, followed by Part III Mathematics (1983). He did a MSc in Water Resources at the University of Newcastle-Upon-Tyne, then worked in Lesotho on low-cost sanitation. In 1988, he studied at the Department of Computer Science of the University of Toronto under the supervision of Geoffrey Hinton. He obtained his MSc and PhD both in computer science, in 1990 and 1994, respectively. == Career and research == In 1994, Williams moved to Aston University as a Research Fellow. He became a Lecturer in August 1995. He moved to the University of Edinburgh in July 1998 and became Reader in 2000. He obtained a Personal Chair in Machine Learning in 2005 in the School of Informatics. Williams has been a Fellow of the European Laboratory for Learning and Intelligent Systems (ELLIS) since 2019. Williams' research interests are in machine learning and computer vision. He has worked on new models for understanding time-series and images, and for finding structure in data. He is best known for his work on Gaussian processes and for the book Gaussian Processes for Machine Learning, co-authored with Carl Rasmussen. The book received the 2009 DeGroot Prize of the International Society for Bayesian Analysis. Williams was an organizer of the PASCAL Visual Object Classes (VOC) project (2005–2012) along with Mark Everingham, Luc van Gool, John Winn, and Andrew Zisserman. == Awards and honours == In 2021 Williams was elected a Fellow of the Royal Society of Edinburgh (FRSE).

    Read more →
  • Steve Omohundro

    Steve Omohundro

    Stephen Malvern Omohundro (born 1959) is an American computer scientist whose areas of research include Hamiltonian physics, dynamical systems, programming languages, machine learning, machine vision, and the social implications of artificial intelligence. His current work uses rational economics to develop safe and beneficial intelligent technologies for better collaborative modeling, understanding, innovation, and decision making. == Education == Omohundro has degrees in physics and mathematics from Stanford University (Phi Beta Kappa) and a Ph.D. in physics from the University of California, Berkeley. == Learning algorithms == Omohundro started the "Vision and Learning Group" at the University of Illinois, which produced 4 Masters and 2 Ph.D. theses. His work in learning algorithms included a number of efficient geometric algorithms, the manifold learning task and various algorithms for accomplishing this task, other related visual learning and modelling tasks, the best-first model merging approach to machine learning (including the learning of Hidden Markov Models and Stochastic Context-free Grammars), and the Family Discovery Learning Algorithm, which discovers the dimension and structure of a parameterized family of stochastic models. == Self-improving artificial intelligence and AI safety == Omohundro started Self-Aware Systems in Palo Alto, California to research the technology and social implications of self-improving artificial intelligence. He is an advisor to the Machine Intelligence Research Institute on artificial intelligence. He argues that rational systems exhibit problematic natural "drives" that will need to be countered in order to build intelligent systems safely. His papers, talks, and videos on AI safety have generated extensive interest. He has given many talks on self-improving artificial intelligence, cooperative technology, AI safety, and connections with biological intelligence. == Programming languages == At Thinking Machines Corporation, Cliff Lasser and Steve Omohundro developed Star Lisp, the first programming language for the Connection Machine. Omohundro joined the International Computer Science Institute (ICSI) in Berkeley, California, where he led the development of the open source programming language Sather. Sather is featured in O'Reilly's History of Programming Languages poster. == Physics and dynamical systems theory == Omohundro's book Geometric Perturbation Theory in Physics describes natural Hamiltonian symplectic structures for a wide range of physical models that arise from perturbation theory analyses. He showed that there exist smooth partial differential equations which stably perform universal computation by simulating arbitrary cellular automata. The asymptotic behavior of these PDEs is therefore logically undecidable. With John David Crawford he showed that the orbits of three-dimensional period doubling systems can form an infinite number of topologically distinct torus knots and described the structure of their stable and unstable manifolds. == Mathematica and Apple tablet contest == From 1986 to 1988, he was an Assistant Professor of Computer science at the University of Illinois at Urbana-Champaign and cofounded the Center for Complex Systems Research with Stephen Wolfram and Norman Packard. While at the University of Illinois, he worked with Stephen Wolfram and five others to create the symbolic mathematics program Mathematica. He and Wolfram led a team of students that won an Apple Computer contest to design "The Computer of the Year 2000." Their design entry "Tablet" was a touchscreen tablet with GPS and other features that finally appeared when the Apple iPad was introduced 22 years later. == Other contributions == Subutai Ahmad and Steve Omohundro developed biologically realistic neural models of selective attention. As a research scientist at the NEC Research Institute, Omohundro worked on machine learning and computer vision, and was a co-inventor of U.S. Patent 5,696,964, "Multimedia Database Retrieval System Which Maintains a Posterior Probability Distribution that Each Item in the Database is a Target of a Search." === Pirate puzzle === Omohundro developed an extension to the game theoretic pirate puzzle featured in Scientific American. == Outreach == Omohundro has sat on the Machine Intelligence Research Institute board of advisors. He has written extensively on artificial intelligence, and has warned that "an autonomous weapons arms race is already taking place" because "military and economic pressures are driving the rapid development of autonomous systems".

    Read more →
  • Oracle Cloud Platform

    Oracle Cloud Platform

    Oracle Cloud Platform refers to a Platform as a Service (PaaS) offerings by Oracle Corporation as part of Oracle Cloud Infrastructure. These offerings are used to build, deploy, integrate and extend applications in the cloud. The offerings support a variety of programming languages, databases, tools and frameworks including Oracle-specific, open source and third-party software and systems. == Deployment models == Oracle Cloud Platform offers public, private and hybrid cloud deployment models. == Architecture == Oracle Cloud Platform provides both Infrastructure as a Service (IaaS) and Platform as a Service (PaaS). The infrastructure is offered through a global network of Oracle managed data centers. Oracle deploys their cloud in Regions. Inside each Region are at least three fault-independent Availability Domains. Each of these Availability Domains contains an independent data center with power, thermal and network isolation. Oracle Cloud is generally available in North America, EMEA, APAC and Japan with announced South America and US Govt. regions coming soon.

    Read more →
  • TAUM system

    TAUM system

    TAUM (Traduction Automatique à l'Université de Montréal) is the name of a research group which was set up at the Université de Montréal in 1965. Most of its research was done between 1968 and 1980. It gave birth to the TAUM-73 and TAUM-METEO machine translation prototypes, using the Q-Systems programming language created by Alain Colmerauer, which were among the first attempts to perform automatic translation through linguistic analysis. The prototypes were never used in actual production. The TAUM-METEO name has been erroneously used for many years to designate the METEO System subsequently developed by John Chandioux.

    Read more →
  • Vector quantization

    Vector quantization

    Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. Developed in the early 1980s by Robert M. Gray, it was originally used for data compression. It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms. In simpler terms, vector quantization chooses a set of points to represent a larger set of points. The density matching property of vector quantization is powerful, especially for identifying the density of large and high-dimensional data. Since data points are represented by the index of their closest centroid, commonly occurring data have low error, and rare data high error. This is why VQ is suitable for lossy data compression. It can also be used for lossy data correction and density estimation. Vector quantization is based on the competitive learning paradigm, so it is closely related to the self-organizing map model and to sparse coding models used in deep learning algorithms such as autoencoder. == Training == One simple training algorithm for vector quantization is: Pick a sample point at random Move the nearest quantization vector centroid towards this sample point, by a small fraction of the distance Repeat A more sophisticated algorithm reduces the bias in the density matching estimation and ensures that all points are used, by including an extra sensitivity parameter: Increase each centroid's sensitivity s i {\displaystyle s_{i}} by a small amount Pick a sample point P {\displaystyle P} at random For each quantization vector centroid c i {\displaystyle c_{i}} , let d ( P , c i ) {\displaystyle d(P,c_{i})} denote the distance of P {\displaystyle P} and c i {\displaystyle c_{i}} Find the centroid c i {\displaystyle c_{i}} for which d ( P , c i ) − s i {\displaystyle d(P,c_{i})-s_{i}} is the smallest Move c i {\displaystyle c_{i}} towards P {\displaystyle P} by a small fraction of the distance Set s i {\displaystyle s_{i}} to zero Repeat It is desirable to use a cooling schedule to produce convergence: see Simulated annealing. Another simple method is LBG, which is based on k-means. The algorithm can be iteratively updated with "live" data, rather than by picking random points from a data set, but this will introduce some bias if the data are temporally correlated over many samples. == Applications == Vector quantization is used for lossy data compression, lossy data correction, pattern recognition, density estimation and clustering. Lossy data correction, or prediction, is used to recover data missing from some dimensions. It is done by finding the nearest group with the data dimensions available, then predicting the result based on the values for the missing dimensions, assuming that they will have the same value as the group's centroid. For density estimation, the area/volume that is closer to a particular centroid than to any other is inversely proportional to the density (due to the density matching property of the algorithm). === Use in data compression === Vector quantization, also called "block quantization" or "pattern matching quantization" is often used in lossy data compression. It works by encoding values from a multidimensional vector space into a finite set of values from a discrete subspace of lower dimension. A lower-space vector requires less storage space, so the data is compressed. Due to the density matching property of vector quantization, the compressed data has errors that are inversely proportional to density. The transformation is usually done by projection or by using a codebook. In some cases, a codebook can be also used to entropy code the discrete value in the same step, by generating a prefix coded variable-length encoded value as its output. The set of discrete amplitude levels is quantized jointly rather than each sample being quantized separately. Consider a k-dimensional vector [ x 1 , x 2 , . . . , x k ] {\displaystyle [x_{1},x_{2},...,x_{k}]} of amplitude levels. It is compressed by choosing the nearest matching vector from a set of n-dimensional vectors [ y 1 , y 2 , . . . , y n ] {\displaystyle [y_{1},y_{2},...,y_{n}]} , with n < k. All possible combinations of the n-dimensional vector [ y 1 , y 2 , . . . , y n ] {\displaystyle [y_{1},y_{2},...,y_{n}]} form the vector space to which all the quantized vectors belong. Only the index of the codeword in the codebook is sent instead of the quantized values. This conserves space and achieves more compression. Twin vector quantization (VQF) is part of the MPEG-4 standard dealing with time domain weighted interleaved vector quantization. === Video codecs based on vector quantization === Bink video Cinepak Daala is transform-based but uses pyramid vector quantization on transformed coefficients Digital Video Interactive: Production-Level Video and Real-Time Video Indeo Microsoft Video 1 QuickTime: Apple Video (RPZA) and Graphics Codec (SMC) Sorenson SVQ1 and SVQ3 Smacker video VQA format, used in many games The usage of video codecs based on vector quantization has declined significantly in favor of those based on motion compensated prediction combined with transform coding, e.g. those defined in MPEG standards, as the low decoding complexity of vector quantization has become less relevant. === Audio codecs based on vector quantization === AMR-WB+ CELP CELT (now part of Opus) is transform-based but uses pyramid vector quantization on transformed coefficients Codec 2 DTS G.729 iLBC Ogg Vorbis TwinVQ === Use in pattern recognition === VQ was also used in the eighties for speech and speaker recognition. Recently it has also been used for efficient nearest neighbor search and on-line signature recognition. In pattern recognition applications, one codebook is constructed for each class (each class being a user in biometric applications) using acoustic vectors of this user. In the testing phase the quantization distortion of a testing signal is worked out with the whole set of codebooks obtained in the training phase. The codebook that provides the smallest vector quantization distortion indicates the identified user. The main advantage of VQ in pattern recognition is its low computational burden when compared with other techniques such as dynamic time warping (DTW) and hidden Markov model (HMM). The main drawback when compared to DTW and HMM is that it does not take into account the temporal evolution of the signals (speech, signature, etc.) because all the vectors are mixed up. In order to overcome this problem a multi-section codebook approach has been proposed. The multi-section approach consists of modelling the signal with several sections (for instance, one codebook for the initial part, another one for the center and a last codebook for the ending part). === Use as clustering algorithm === As VQ is seeking for centroids as density points of nearby lying samples, it can be also directly used as a prototype-based clustering method: each centroid is then associated with one prototype. By aiming to minimize the expected squared quantization error and introducing a decreasing learning gain fulfilling the Robbins-Monro conditions, multiple iterations over the whole data set with a concrete but fixed number of prototypes converges to the solution of k-means clustering algorithm in an incremental manner. === Generative adversarial networks (GAN) === VQ has been used to quantize a feature representation layer in the discriminator of generative adversarial networks. The feature quantization (FQ) technique performs implicit feature matching. It improves the GAN training, and yields an improved performance on a variety of popular GAN models: BigGAN for image generation, StyleGAN for face synthesis, and U-GAT-IT for unsupervised image-to-image translation.

    Read more →