How to Solve it by Computer

How to Solve it by Computer is a computer science book by R. G. Dromey, first published by Prentice-Hall in 1982. It is occasionally used as a textbook, especially in India. It is an introduction to the whys of algorithms and data structures. The book covers the design factors associated with problems; the creative process behind coming up with innovative solutions for algorithms and data structures; and the line of reasoning behind the constraints, factors and design choices made. The fundamental algorithms in the book are mostly presented in pseudocode and/or Pascal notation.

Truth discovery

Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g. the birthday of a person or the capital city of a country), while in the second case multiple true values are allowed (e.g. the cast of a movie or the authors of a book). Typically, truth discovery is the last step of a data integration pipeline, performed once the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes it increasingly likely that different sources provide (partially or completely) different values for the same data item. This, together with the growing reliance on data to make important decisions, motivates the need for good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies have shown that relying only on majority voting can give wrong results for as many as 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be used to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible, since the number of labeled examples needed is proportional to the number of sources, and in many applications the number of sources is prohibitively large.
== Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose each other; values and sources can be either correct or erroneous. In the multi-truth case, by contrast, the following properties hold: the truth is composed of a set of values; different values can each provide a partial truth; claiming one value for a given data item does not imply opposing all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and that should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences between the two problems. Knowing that in both examples the truth is provided by source 1, in the single-truth case (first table) we can say that sources 2 and 3 oppose the truth and as a result provide wrong values. On the other hand, in the multi-truth case (second table), sources 2 and 3 are neither correct nor erroneous: they provide a subset of the true values and do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is selected as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources: more importance is given to votes coming from trusted sources. Source trustworthiness is usually not known a priori but is estimated with an iterative approach.
At each step of the truth discovery algorithm, the trustworthiness score of each data source is refined, improving the assessment of the true values, which in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as the accuracy of provided values, copying of values from other sources, and domain coverage. Detecting copying behaviors is very important: copying allows false values to spread easily, which makes truth discovery very hard, since many sources would vote for the wrong values. Systems usually decrease the weight of votes associated with copied values, or do not count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. The main types of single-truth methods, and the ways in which different systems model source trustworthiness, are described below. === Majority voting === Majority voting is the simplest method: the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performance of more complex methods. === Web-link based === These methods estimate source trustworthiness using a technique similar to the one used to measure the authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval.
Source trustworthiness is computed as the cosine similarity (or another similarity measure) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources: $$P(v \mid \psi(o)) = \frac{P(\psi(o) \mid v) \cdot P(v)}{P(\psi(o))}$$ where $v$ is a value provided for a data item $o$ and $\psi(o)$ is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that it provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of multi-truth discovery. Two types of multi-truth methods and their characteristics are described below. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between sets of provided values and the sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that it provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness.
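The voting schemes described for the single-truth case can be sketched in a few lines of Python. This is a minimal illustration, not any published system: the function names, the `CLAIMS` data, the fixed iteration count and the normalisation of trust scores by their maximum are all assumptions made for the example. `majority_vote` implements the baseline, and `weighted_truth_discovery` alternates the two sums described under "Web-link based" (value vote = sum of source trust, source trust = sum of value votes).

```python
from collections import defaultdict


def majority_vote(by_source):
    """Baseline: return the most frequent value claimed for ONE data item.

    by_source maps source name -> claimed value. Ties are broken
    alphabetically so the result is deterministic.
    """
    counts = defaultdict(int)
    for value in by_source.values():
        counts[value] += 1
    return max(sorted(counts), key=lambda v: counts[v])


def weighted_truth_discovery(claims, iterations=20):
    """Iterative trust-weighted voting (illustrative sketch).

    claims maps item -> {source -> value}. Returns (truths, trust).
    """
    sources = {s for by_source in claims.values() for s in by_source}
    trust = {s: 1.0 for s in sources}  # assumption: all sources start equal
    truths = {}
    for _ in range(iterations):
        # Vote of a value = sum of the trust of the sources providing it.
        votes = {}
        for item, by_source in claims.items():
            item_votes = defaultdict(float)
            for source, value in by_source.items():
                item_votes[value] += trust[source]
            votes[item] = item_votes
        truths = {item: max(sorted(iv), key=lambda v: iv[v])
                  for item, iv in votes.items()}
        # Trust of a source = sum of the votes of the values it provides.
        new_trust = defaultdict(float)
        for item, by_source in claims.items():
            for source, value in by_source.items():
                new_trust[source] += votes[item][value]
        # Assumption: rescale by the maximum so scores stay in (0, 1].
        top = max(new_trust.values())
        trust = {s: new_trust[s] / top for s in sources}
    return truths, trust


# Hypothetical data: s1 and s4 agree on items a-c; s2 and s3 are wrong
# there (with different values) but agree with each other on item d.
CLAIMS = {
    "a": {"s1": "a_true", "s4": "a_true", "s2": "a_err2", "s3": "a_err3"},
    "b": {"s1": "b_true", "s4": "b_true", "s2": "b_err2", "s3": "b_err3"},
    "c": {"s1": "c_true", "s4": "c_true", "s2": "c_err2", "s3": "c_err3"},
    "d": {"s1": "X", "s2": "Y", "s3": "Y"},
}

if __name__ == "__main__":
    truths, trust = weighted_truth_discovery(CLAIMS)
    print("majority on d:", majority_vote(CLAIMS["d"]))
    print("weighted on d:", truths["d"])
    print("trust:", trust)
```

On this toy data, plain majority voting picks `Y` for item `d`, while the weighted scheme ends up preferring `X`, because the sources voting for `Y` lose trust through their isolated claims on the other items.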
=== Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of a given data item and also to assess source quality without the need for any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could also be used to change the way in which web pages are ranked in search engines, moving from current methods based on link analysis, like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.

ISO 2033

The ISO 2033:1983 standard ("Coding of machine readable characters (MICR and OCR)") defines character sets for use with Optical Character Recognition or Magnetic Ink Character Recognition systems. The Japanese standard JIS X 9010:1984 ("Coding of machine readable characters (OCR and MICR)", originally designated JIS C 6229-1984) is closely related. == Character set for OCR-A == The version of the encoding for the OCR-A font registered with the ISO-IR registry as ISO-IR-91 is the Japanese (JIS X 9010 / JIS C 6229) version, which differs from the encoding defined by ISO 2033 only in the addition of a Yen sign at 0x5C. == Character set for OCR-B == The version of the G0 set for the OCR-B font registered with the ISO-IR registry as ISO-IR-92 is the Japanese (JIS X 9010 / JIS C 6229) version, which differs from the encoding defined by ISO 2033 only in being based on JIS-Roman (with a dollar sign at 0x24 and a Yen sign at 0x5C) rather than on the ISO 646 IRV (with a backslash at 0x5C and, at the time, a universal currency sign (¤) at 0x24). Besides those code points, it differs from ASCII only in omitting the backtick (`) and tilde (~). An additional supplementary set registered as ISO-IR-93 assigns the pound sign (£), universal currency sign (¤) and section sign (§) to their ISO-8859-1 code points, and the backslash to the ISO-8859-1 code point for the Yen sign. == Character set for JIS X 9008 (JIS C 6257) == JIS X 9010 (JIS C 6229) also defines character sets for the JIS X 9008:1981 (formerly JIS C 6257-1981) "hand-printed" OCR font. These include subsets of the JIS X 0201 Roman set (registered as ISO-IR-94 and omitting the backtick (`), lowercase letters, curly braces ({, }) and overline (‾)) and kana set (registered as ISO-IR-96 and omitting the East Asian style comma (、) and full stop (。), the interpunct (・) and the small kana), in addition to a set (registered as ISO-IR-95) containing only the backslash, which is assigned to the same code point as in ISO-IR-93.
The JIS C 6257 font stylises the slash and backslash characters with a doubled appearance. The character names given are "Solidus" and "Reverse Solidus", matching the Unicode character names for the ASCII slash and backslash. However, the Unicode Optical Character Recognition block includes an additional code point for an "OCR Double Backslash" (⑊). It has no counterpart for a double (forward) slash, although a double slash is available elsewhere, as U+2AFD ⫽ DOUBLE SOLIDUS OPERATOR. == Character set for E-13B == The ISO-IR-98 encoding defined by ISO 2033 encodes the character repertoire of the E-13B font, as used with magnetic ink character recognition. Although ISO 2033 also specifies other encodings, the encoding for E-13B is the encoding referred to as ISO_2033_1983 by Perl libintl, and as ISO_2033-1983 or csISO2033 by the IANA. Other registered labels include iso-ir-98, its ISO-IR registration number, and simply e13b. The digits are preserved in their ASCII locations. Letters and symbols unavailable in the E-13B font are omitted, while specialised punctuation for bank cheques included in the E-13B font is added. The same symbols are available in Unicode in the Optical Character Recognition block.
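The difference between the two OCR-B variants described above (ISO 2033 versus the ISO-IR-92 registration) comes down to two code points, which a short Python sketch can make concrete. The tables here are built only from the prose description in this article (ASCII without backtick and tilde, plus the 0x24 and 0x5C differences), not from the standards' code charts, and the names `ocr_b_table` and `decode` are hypothetical helpers, not part of any library.

```python
def ocr_b_table(variant):
    """Build a code point -> character map for the OCR-B G0 set (0x21-0x7E).

    variant is "iso2033" (based on the ISO 646 IRV of the time) or
    "iso-ir-92" (based on JIS-Roman). Illustrative reconstruction only.
    """
    table = {code: chr(code) for code in range(0x21, 0x7F)}
    del table[0x60]  # backtick is omitted
    del table[0x7E]  # tilde is omitted
    if variant == "iso2033":
        table[0x24] = "\u00A4"  # universal currency sign instead of dollar
        # 0x5C stays a backslash
    elif variant == "iso-ir-92":
        table[0x5C] = "\u00A5"  # Yen sign instead of backslash
        # 0x24 stays a dollar sign
    else:
        raise ValueError(f"unknown variant: {variant!r}")
    return table


def decode(data, variant):
    """Decode bytes with the chosen table; unassigned bytes become U+FFFD."""
    table = ocr_b_table(variant)
    return "".join(table.get(byte, "\uFFFD") for byte in data)


if __name__ == "__main__":
    sample = b"\x24\x5c"
    print(decode(sample, "iso2033"))    # currency sign, backslash
    print(decode(sample, "iso-ir-92"))  # dollar sign, Yen sign
```

The same two bytes therefore read as a currency sign and a backslash under one variant and as a dollar sign and a Yen sign under the other, while all remaining assigned code points decode identically.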

Cognitive computer

A cognitive computer is a computer that hardwires artificial intelligence and machine learning algorithms into an integrated circuit that closely reproduces the behavior of the human brain. It generally adopts a neuromorphic engineering approach. Synonyms include neuromorphic chip and cognitive chip. In 2023, IBM's proof-of-concept NorthPole chip (optimized for 2-, 4- and 8-bit precision) achieved remarkable performance in image recognition. In 2013, IBM developed Watson, a cognitive computer that uses neural networks and deep learning techniques. The following year, it developed the TrueNorth microchip architecture, which is designed to be closer in structure to the human brain than the von Neumann architecture used in conventional computers. In 2017, Intel also announced its version of a cognitive chip, Loihi, which it intended to make available to university and research labs in 2018. Intel (most notably with its Pohoiki Beach and Springs systems), Qualcomm, and others are steadily improving neuromorphic processors. == IBM TrueNorth chip == TrueNorth was a neuromorphic CMOS integrated circuit produced by IBM in 2014. It is a manycore processor network-on-a-chip design with 4096 cores, each having 256 programmable simulated neurons, for a total of just over a million neurons. In turn, each neuron has 256 programmable "synapses" that convey the signals between them. Hence, the total number of programmable synapses is just over 268 million ($2^{28}$). Its basic transistor count is 5.4 billion. In 2023, Zhejiang University and Alibaba developed Darwin, a neuromorphic chip. The Darwin3 chip was designed around 2023, so it is fairly modern compared to IBM's TrueNorth or Intel's Loihi.
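The neuron and synapse totals quoted for TrueNorth follow directly from the per-core figures, since each factor is a power of two. As a worked check:

```latex
\begin{align*}
\text{neurons}  &= 4096 \times 256 = 2^{12} \cdot 2^{8} = 2^{20} = 1{,}048{,}576 \\
\text{synapses} &= 2^{20} \times 256 = 2^{20} \cdot 2^{8} = 2^{28} = 268{,}435{,}456
\end{align*}
```

This matches the "just over a million" neurons and "just over 268 million" synapses stated above.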
=== Details === Because memory, computation, and communication are handled in each of the 4096 neurosynaptic cores, TrueNorth circumvents the von Neumann-architecture bottleneck and is very energy-efficient, with IBM claiming a power consumption of 70 milliwatts and a power density that is 1/10,000th of conventional microprocessors. The SyNAPSE chip operates at lower temperatures and power because it only draws the power necessary for computation. Skyrmions have been proposed as models of the synapse on a chip. The neurons are emulated using a Linear-Leak Integrate-and-Fire (LLIF) model, a simplification of the leaky integrate-and-fire model. According to IBM, it does not have a clock, operates on unary numbers, and computes by counting to a maximum of 19 bits. The cores are event-driven, using both synchronous and asynchronous logic, and are interconnected through an asynchronous packet-switched mesh network on chip (NoC). IBM developed a new network to program and use TrueNorth. It included a simulator, a new programming language, an integrated programming environment, and libraries. This lack of backward compatibility with any previous technology (e.g., C++ compilers) poses serious vendor lock-in risks and other adverse consequences that may prevent its commercialization in the future. === Research === In 2018, a cluster of TrueNorth chips network-linked to a master computer was used in stereo vision research that attempted to extract the depth of rapidly moving objects in a scene. == IBM NorthPole chip == In 2023, IBM released its NorthPole chip, a proof of concept for dramatically improving performance by intertwining compute with memory on-chip, thus eliminating the von Neumann bottleneck. It blends approaches from IBM's 2014 TrueNorth system with modern hardware designs to achieve speeds about 4,000 times faster than TrueNorth.
It can run ResNet-50 or Yolo-v4 image recognition tasks about 22 times faster, with 25 times less energy and 5 times less space, when compared to GPUs that use the same 12-nm node process with which it was fabricated. It includes 224 MB of RAM and 256 processor cores and can perform 2,048 operations per core per cycle at 8-bit precision, and 8,192 operations at 2-bit precision. It runs at between 25 and 425 MHz. This is an inferencing chip, but it cannot yet handle GPT-4 because of memory and accuracy limitations. == Intel Loihi chip == === Pohoiki Springs === Pohoiki Springs is a system that incorporates Intel's self-learning neuromorphic chip, named Loihi, introduced in 2017, perhaps named after the Hawaiian seamount Lōʻihi. Intel claims Loihi is about 1000 times more energy efficient than general-purpose computing systems used to train neural networks. In theory, Loihi supports both machine learning training and inference on the same silicon independently of a cloud connection, and more efficiently than convolutional neural networks or deep learning neural networks. As an example, Intel points to a system for monitoring a person's heartbeat, which takes readings after events such as exercise or eating and uses the chip to normalize the data and work out the "normal" heartbeat. It can then spot abnormalities and deal with new events or conditions. The first iteration of the chip was made using Intel's 14 nm fabrication process and houses 128 clusters of 1,024 artificial neurons each, for a total of 131,072 simulated neurons. This offers around 130 million synapses, far fewer than the human brain's 800 trillion synapses, and fewer than IBM's TrueNorth. Loihi is available in a USB form factor for research purposes to more than 40 academic research groups. In October 2019, researchers from Rutgers University published a research paper demonstrating the energy efficiency of Intel's Loihi in solving simultaneous localization and mapping.
In March 2020, Intel and Cornell University published a research paper demonstrating the ability of Intel's Loihi to recognize different hazardous materials, which could eventually help to "diagnose diseases, detect weapons and explosives, find narcotics, and spot signs of smoke and carbon monoxide". === Pohoiki Beach === Intel's Loihi 2, named Pohoiki Beach, was released in September 2021 with 64 cores. It boasts faster speeds, higher-bandwidth inter-chip communications for enhanced scalability, increased capacity per chip, a more compact size due to process scaling, and improved programmability. === Hala Point === Hala Point packages 1,152 Loihi 2 processors produced on the Intel 4 process node in a six-rack-unit chassis. The system supports up to 1.15 billion neurons and 128 billion synapses distributed over 140,544 neuromorphic processing cores, consuming 2,600 watts of power. It includes over 2,300 embedded x86 processors for ancillary computations. Intel claimed in 2024 that Hala Point was the world's largest neuromorphic system. It is claimed to offer 10 times more neuron capacity and up to 12 times higher performance. The Darwin3 chip exceeds these specs. Hala Point provides up to 20 quadrillion operations per second (20 petaops), with an efficiency exceeding 15 trillion 8-bit operations per second per watt on conventional deep neural networks. Hala Point integrates processing, memory and communication channels in a massively parallelized fabric, providing 16 PB/s of memory bandwidth, 3.5 PB/s of inter-core communication bandwidth, and 5 TB/s of inter-chip bandwidth. The system can process its 1.15 billion neurons 20 times faster than a human brain. Its neuron capacity is roughly equivalent to that of an owl brain or the cortex of a capuchin monkey. Loihi-based systems can perform inference and optimization using 100 times less energy at speeds as much as 50 times faster than CPU/GPU architectures. Intel claims that Hala Point can create LLMs.
Much further research is needed. == SpiNNaker == SpiNNaker (Spiking Neural Network Architecture) is a massively parallel, manycore supercomputer architecture designed by the Advanced Processor Technologies Research Group at the Department of Computer Science, University of Manchester. == Criticism == Critics argue that a room-sized computer, as in the case of IBM's Watson, is not a viable alternative to a three-pound human brain. Some also cite the difficulty for a single system to bring so many elements together, such as the disparate sources of information as well as computing resources. In 2021, The New York Times published Steve Lohr's article "What Ever Happened to IBM's Watson?", which described some costly failures of IBM Watson. One of them, a cancer-related project called the Oncology Expert Advisor, was abandoned in 2016. During the collaboration, Watson could not use patient data, and it struggled to decipher doctors' notes and patient histories. The development of LLMs has placed a new emphasis on cognitive computers, because the Transformer technology that underpins LLMs demands huge amounts of energy for GPUs and PCs. Cognitive computers use significantly less energy, but learning based on spike-timing-dependent plasticity (STDP) and spiking neuron models cannot yet match the accuracy of backpropagation. As a result, translating weights from artificial neural networks (ANNs) to spiking neural networks (SNNs), using techniques such as quantization-aware training (QAT), post-training quantization (PTQ) or progressive quantization, is becoming popular, although these techniques have their own limitations.

ω-automaton

In automata theory, a branch of theoretical computer science, an ω-automaton (or stream automaton) is a variation of a finite automaton that runs on infinite, rather than finite, strings as input. Since ω-automata do not stop, they have a variety of acceptance conditions rather than simply a set of accepting states. ω-automata are useful for specifying behavior of systems that are not expected to terminate, such as hardware, operating systems and control systems. For such systems, one may want to specify a property such as "for every request, an acknowledge eventually follows", or its negation "there is a request that is not followed by an acknowledge". The former is a property of infinite words: one cannot say of a finite sequence that it satisfies this property. Classes of ω-automata include the Büchi automata, Rabin automata, Streett automata, parity automata and Muller automata, each deterministic or non-deterministic. These classes of ω-automata differ only in terms of acceptance condition. They all recognize precisely the regular ω-languages, except for the deterministic Büchi automata, which are strictly weaker than all the others. Although all these types of automata recognize the same set of ω-languages, they nonetheless differ in succinctness of representation for a given ω-language. == Deterministic ω-automata == Formally, a deterministic ω-automaton is a tuple $A = (Q, \Sigma, \delta, q_0, \text{Acc})$ that consists of the following components: $Q$ is a finite set, whose elements are called the states of $A$. $\Sigma$ is a finite set called the alphabet of $A$. $\delta \colon Q \times \Sigma \rightarrow Q$ is a function, called the transition function of $A$. $q_0$ is an element of $Q$, called the initial state.
$\text{Acc}$ is the acceptance condition of $A$, formally a subset of $Q^{\omega}$. An input for $A$ is an infinite string over the alphabet $\Sigma$, i.e. an infinite sequence $\alpha = (a_1, a_2, a_3, \ldots)$. The run of $A$ on such an input is an infinite sequence $\rho = (r_0, r_1, r_2, \ldots)$ of states, defined as follows: $r_0 = q_0$. $r_1 = \delta(r_0, a_1)$. $r_2 = \delta(r_1, a_2)$. ... that is, for every $i \geq 1$: $r_i = \delta(r_{i-1}, a_i)$. The main purpose of an ω-automaton is to define a subset of the set of all inputs: the set of accepted inputs. Whereas in the case of an ordinary finite automaton every run ends with a state $r_n$ and the input is accepted if and only if $r_n$ is an accepting state, the definition of the set of accepted inputs is more complicated for ω-automata. Here we must look at the entire run $\rho$. The input is accepted if the corresponding run is in $\text{Acc}$. The set of accepted input ω-words is called the ω-language recognized by the automaton, and is denoted $L(A)$. The definition of $\text{Acc}$ as a subset of $Q^{\omega}$ is purely formal and not suitable for practice, because normally such sets are infinite. The difference between the various types of ω-automata (Büchi, Rabin etc.) consists in how they encode certain subsets $\text{Acc}$ of $Q^{\omega}$ as finite sets, and therefore in which such subsets they can encode.
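Although a run is infinite, it can be computed exactly when the input is ultimately periodic, that is, of the form $u \cdot v^{\omega}$ for finite words $u$ and $v$. The following Python sketch is an illustration under that assumption; the function names, the dictionary encoding of $\delta$ and the two-state example automaton are all hypothetical. It follows the definition $r_i = \delta(r_{i-1}, a_i)$ and returns the set of states that the run visits infinitely often. The final check uses a Büchi-style finite encoding of $\text{Acc}$ (some state of a chosen set `accepting` recurs forever), which is only one of the encodings mentioned above.

```python
def infinitely_visited(delta, q0, prefix, cycle):
    """Return the states visited infinitely often on input prefix · cycle^ω.

    delta maps (state, symbol) -> state; `cycle` must be non-empty.
    """
    if not cycle:
        raise ValueError("cycle must be non-empty")
    state = q0
    for symbol in prefix:          # r_i = delta(r_{i-1}, a_i) on the prefix
        state = delta[(state, symbol)]
    seen_at = {}                   # state at the start of a round -> round index
    visited_per_round = []         # states reached while reading one copy of cycle
    round_no = 0
    while state not in seen_at:
        seen_at[state] = round_no
        visited = set()
        for symbol in cycle:
            state = delta[(state, symbol)]
            visited.add(state)
        visited_per_round.append(visited)
        round_no += 1
    # From round seen_at[state] onward the run repeats forever.
    recurring = set()
    for visited in visited_per_round[seen_at[state]:]:
        recurring |= visited
    return recurring


def buchi_accepts(delta, q0, accepting, prefix, cycle):
    """Büchi-style check: does some state in `accepting` recur forever?"""
    return bool(infinitely_visited(delta, q0, prefix, cycle) & set(accepting))


# Hypothetical example over {a, b}: state 1 means "the last symbol was a".
DELTA = {(0, "a"): 1, (1, "a"): 1, (0, "b"): 0, (1, "b"): 0}

if __name__ == "__main__":
    print(buchi_accepts(DELTA, 0, {1}, "b", "ab"))   # infinitely many a's
    print(buchi_accepts(DELTA, 0, {1}, "aa", "b"))   # only finitely many a's
```

Because the automaton is deterministic and has finitely many states, the state at the start of each copy of `cycle` must eventually repeat, so the loop terminates after at most as many rounds as there are states.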
== Nondeterministic ω-automata == Formally, a nondeterministic ω-automaton is a tuple $A = (Q, \Sigma, \Delta, Q_0, \text{Acc})$ that consists of the following components: $Q$ is a finite set, whose elements are called the states of $A$. $\Sigma$ is a finite set called the alphabet of $A$. $\Delta$ is a subset of $Q \times \Sigma \times Q$ and is called the transition relation of $A$. $Q_0$ is a subset of $Q$, called the initial set of states. $\text{Acc}$ is the acceptance condition, a subset of $Q^{\omega}$. Unlike a deterministic ω-automaton, which has a transition function $\delta$, the non-deterministic version has a transition relation $\Delta$. Note that $\Delta$ can be regarded as a function $Q \times \Sigma \rightarrow \mathcal{P}(Q)$ from $Q \times \Sigma$ to the power set $\mathcal{P}(Q)$. Thus, given a state $q_n$ and a symbol $a_n$, the next state $q_{n+1}$ is not necessarily determined uniquely; rather, there is a set of possible next states. A run of $A$ on the input $\alpha = (a_1, a_2, a_3, \ldots)$ is any infinite sequence $\rho = (r_0, r_1, r_2, \ldots)$ of states that satisfies the following conditions: $r_0$ is an element of $Q_0$. $r_1$ is an element of $\Delta(r_0, a_1)$. $r_2$ is an element of $\Delta(r_1, a_2)$. ...
that is, for every $i \geq 1$: $r_i$ is an element of $\Delta(r_{i-1}, a_i)$. A nondeterministic ω-automaton may admit many different runs on any given input, or none at all. The input is accepted if at least one of the possible runs is accepting. Whether a run is accepting depends only on $\text{Acc}$, as for deterministic ω-automata. Every deterministic ω-automaton can be regarded as a nondeterministic ω-automaton by taking $\Delta$ to be the graph of $\delta$. The definitions of runs and acceptance for deterministic ω-automata are then special cases of the nondeterministic ones. == Acceptance conditions == Acceptance conditions may be infinite sets of ω-words. However, acceptance conditions that are finitely representable are the ones mostly studied. A variety of popular acceptance conditions are listed below, preceded by the following observation. In the case of infinitely running systems, one is often interested in whether certain behavior is repeated infinitely often. For example, if a network card receives infinitely many ping requests, then it may fail to respond to some of the requests but should respond to an infinite subset of the received ping requests. This motivates the following definition: for any run $\rho$, let $\text{Inf}(\rho)$ be the set of states that occur infinitely often in $\rho$. This notion of certain states being visited infinitely often will be helpful in defining the following acceptance conditions. A Büchi automaton is an ω-automaton $A$ that uses the following acceptance condition, for some subset $F$ of $Q$: Büchi condition: $A$ accepts exactly those runs $\rho$ for which $\text{Inf}(\rho) \cap F \neq \emptyset$, i.e.
there is an accepting state that occurs infinitely often in $\rho$. A Rabin automaton is an ω-automaton $A$ that uses the following acceptance condition, for some set $\Omega$ of pairs $(B_i, G_i)$ of sets of states: Rabin condition: $A$ accepts exactly those runs $\rho$ for which there exists a pair $(B_i, G_i)$ in $\Omega$ such that $B_i \cap \text{Inf}(\rho) = \emptyset$ and $G_i \cap \text{Inf}(\rho) \neq \emptyset$. A Streett automaton is an ω-automaton $A$ that uses the following acceptance condition, for some set $\Omega$ of pairs $(B_i, G_i)$ of sets of states: Streett condition: $A$ accepts exactly those runs $\rho$ such that for all pairs $(B_i, G_i)$ in $\Omega$, $B_i \cap \text{Inf}(\rho) \neq \emptyset$ or $G_i \cap \text{Inf}(\rho) = \emptyset$. A parity automaton is an automaton $A$ whose set of states is $Q = \{0, 1, 2, \ldots, k\}$ for some natural number $k$

Smart data capture

Smart data capture (SDC), also known as 'intelligent data capture' or 'automated data capture', describes the branch of technology concerned with using computer vision techniques like optical character recognition (OCR), barcode scanning, object recognition and other similar technologies to extract and process information from semi-structured and unstructured data sources. IDC characterizes smart data capture as an integrated hardware, software, and connectivity strategy to help organizations enable the capture of data in an efficient, repeatable, scalable, and future-proof way. Data is captured visually from barcodes, text, IDs and other objects, often from many sources simultaneously, before being converted and prepared for digital use, typically by artificial intelligence-powered software. An important feature of SDC is that it focuses not just on capturing data more efficiently but also on serving up easy-to-access, actionable insights at the instant of data collection to both frontline and desk-based workers, aiding decision-making and making it a two-way process. Smart data capture automates and accelerates capture, applying insights in real time and automating processes based on extracted input. Smart data capture is designed to be repeatable and scalable, in order to reduce low-level manual tasks and eliminate human error. To achieve this goal, smart data capture solutions are often made available as specialist software installed on commodity hardware such as smartphones. However, some solutions may rely on specialized hardware such as dedicated scanning devices, wearables or shop floor robots. == Differences from OCR == Optical character recognition applications are typically concerned with the actual data capture process; they are intended to faithfully reproduce text, words, letters and symbols from a printed document.
Smart data capture is multimodal, capable of extracting data from a wider range of semi-structured and unstructured sources, going beyond basic text recognition to offer a wider scope of applications. By extending functionality to provide actionable insights at the point of capture, SDC is also a two-way process (capture-display), while OCR is more commonly one-way (capture only), primarily used for data input. Smart data capture solutions typically have two parts: (1) data capture, which includes OCR, barcode scanning and object recognition; and (2) functionality that then uses this data to provide actionable insights at the point of capture. == Applications == Smart data capture can be applied to almost any industry and application that requires visual information capture and interpretation. This may include: retail; warehouse inventory control; logistics, handling and shipping; manufacturing; field service; healthcare; transport and travel; fraud detection.

Best AI Background Removers in 2026

Comparing the best AI background removers? An AI background remover is software that uses machine learning to separate the subject of an image from its background, lowering the barrier so that anyone can produce professional output. Privacy matters too: check whether your data is used to train the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI background remover slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.