AI Chat Picture

AI Chat Picture — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Quantum machine learning

    Quantum machine learning

    Quantum machine learning (QML) is the study of quantum algorithms for machine learning. It often refers to quantum algorithms for machine learning tasks which analyze classical data, sometimes called quantum-enhanced machine learning. QML algorithms use qubits and quantum operations to try to improve the space and time complexity of classical machine learning algorithms. Hybrid QML methods involve both classical and quantum processing, where computationally difficult subroutines are outsourced to a quantum device. These routines can be more complex in nature and executed faster on a quantum computer. Furthermore, quantum algorithms can be used to analyze quantum states instead of classical data. The term "quantum machine learning" is sometimes used to refer classical machine learning methods applied to data generated from quantum experiments (i.e. machine learning of quantum systems), such as learning the phase transitions of a quantum system or creating new quantum experiments. QML also extends to a branch of research that explores methodological and structural similarities between certain physical systems and learning systems, in particular neural networks. For example, some mathematical and numerical techniques from quantum physics are applicable to classical deep learning and vice versa. Furthermore, researchers investigate more abstract notions of learning theory with respect to quantum information, sometimes referred to as "quantum learning theory". == Machine learning with quantum computers == Quantum-enhanced machine learning refers to quantum algorithms that solve tasks in machine learning, thereby improving and often expediting classical machine learning techniques. Such algorithms typically require one to encode the given classical data set into a quantum computer to make it accessible for quantum information processing. Subsequently, quantum information processing routines are applied and the result of the quantum computation is read out by measuring the quantum system. For example, the outcome of the measurement of a qubit reveals the result of a binary classification task. While many proposals of QML algorithms are still purely theoretical and require a full-scale universal quantum computer to be tested, others have been implemented on small-scale or special purpose quantum devices. === Quantum associative memories and quantum pattern recognition === Early work on quantum associative memories has been done by Dan Ventura and Tony Martinez and by Carlo A. Trugenberger in the late 1990s and early 2000s. Associative (or content-addressable) memories are able to recognize stored content on the basis of a similarity measure, while random access memories are accessed by the address of stored information and not its content. As such they must be able to retrieve both incomplete and corrupted patterns, the essential machine learning task of pattern recognition. Typical classical associative memories store p patterns in the O ( n 2 ) {\displaystyle O(n^{2})} interactions (synapses) of a real, symmetric energy matrix over a network of n artificial neurons. The encoding is such that the desired patterns are local minima of the energy functional and retrieval is done by minimizing the total energy, starting from an initial configuration. Unfortunately, classical associative memories are severely limited by the phenomenon of cross-talk. When too many patterns are stored, spurious memories appear which quickly proliferate, so that the energy landscape becomes disordered and no retrieval is anymore possible. The number of storable patterns is typically limited by a linear function of the number of neurons, p ≤ O ( n ) {\displaystyle p\leq O(n)} . Quantum associative memories (in their simplest realization) store patterns in a unitary matrix U acting on the Hilbert space of n qubits. Retrieval is realized by the unitary evolution of a fixed initial state to a quantum superposition of the desired patterns with probability distribution peaked on the most similar pattern to an input. By its very quantum nature, the retrieval process is thus probabilistic. Because quantum associative memories are free from cross-talk, however, spurious memories are never generated. Correspondingly, they have a superior capacity than classical ones. The number of parameters in the unitary matrix U is O ( p n ) {\displaystyle O(pn)} . One can thus have efficient, spurious-memory-free quantum associative memories for any polynomial number of patterns. If the matrix U is encoded as a unique operator (as opposed as to a sequence of gates as in the circuit model), e.g. by an optical interferometer, the retrieval becomes efficient even for an exponential number of patterns. === Linear algebra simulation with quantum amplitudes === A number of quantum algorithms for machine learning are based on the idea of amplitude encoding, that is, to associate the amplitudes of a quantum state with the inputs and outputs of computations. Since a state of n {\displaystyle n} qubits is described by 2 n {\displaystyle 2^{n}} complex amplitudes, this information encoding can allow for an exponentially compact representation. Intuitively, this corresponds to associating a discrete probability distribution over binary random variables with a classical vector. The goal of algorithms based on amplitude encoding is to formulate quantum algorithms whose resources grow polynomially in the number of qubits n {\displaystyle n} , which amounts to a logarithmic time complexity in the number of amplitudes and thereby the dimension of the input. Many QML algorithms in this category are based on variations of the quantum algorithm for linear systems of equations (colloquially called HHL, after the paper's authors) which, under specific conditions, performs a matrix inversion using an amount of physical resources growing only logarithmically in the dimensions of the matrix. One of these conditions is that a Hamiltonian which entry-wise corresponds to the matrix can be simulated efficiently, which is known to be possible if the matrix is sparse or low rank. For reference, any known classical algorithm for matrix inversion requires a number of operations that grows more than quadratically in the dimension of the matrix (e.g. O ( n 2.373 ) {\displaystyle O{\mathord {\left(n^{2.373}\right)}}} ), but they are not restricted to sparse matrices. Quantum matrix inversion can be applied to machine learning methods in which the training reduces to solving a linear system of equations, for example in least-squares linear regression, the least-squares version of support vector machines, and Gaussian processes. A crucial bottleneck of methods that simulate linear algebra computations with the amplitudes of quantum states is state preparation, which often requires one to initialise a quantum system in a state whose amplitudes reflect the features of the entire dataset. Although efficient methods for state preparation are known for specific cases, this step easily hides the complexity of the task. === Variational quantum algorithms (VQAs) === In a variational quantum algorithm, a classical computer optimizes the parameters used to prepare a quantum state, while a quantum computer is used to do the actual state preparation and measurement. VQAs are considered promising candidates for noisy intermediate-scale quantum computers. Variational quantum circuits (or parameterized quantum circuits) are a popular class of VQAs where the parameters are those used in a fixed quantum circuit. Researchers have studied VQCs to solve optimization problems and find the ground state energy of complex quantum systems, which were difficult to solve using a classical computer. === Quantum binary classifier === Pattern reorganization is one of the important tasks of machine learning, binary classification is one of the tools or algorithms to find patterns. Binary classification is used in supervised learning and in unsupervised learning. In QML, classical bits are converted to qubits and they are mapped to Hilbert space; complex value data are used in a quantum binary classifier to use the advantage of Hilbert space. By exploiting the quantum mechanic properties such as superposition, entanglement, interference the quantum binary classifier produces the accurate result in short period of time. === Quantum machine learning algorithms based on Grover search === Another approach to improving classical machine learning with quantum information processing uses amplitude amplification methods based on Grover's search algorithm, which has been shown to solve unstructured search problems with a quadratic speedup compared to classical algorithms. These quantum routines can be employed for learning algorithms that translate into an unstructured search task, as can be done, for instance, in the case of the k-medians and the k-nearest neighbors algorithms. Other applications include quadratic speedups in the training of perceptrons. An e

    Read more →
  • Semantic decomposition (natural language processing)

    Semantic decomposition (natural language processing)

    A semantic decomposition is an algorithm that breaks down the meanings of phrases or concepts into less complex concepts. The result of a semantic decomposition is a representation of meaning. This representation can be used for tasks, such as those related to artificial intelligence or machine learning. Semantic decomposition is common in natural language processing applications. The basic idea of a semantic decomposition is taken from the learning skills of adult humans, where words are explained using other words. It is based on Meaning-text theory. Meaning-text theory is used as a theoretical linguistic framework to describe the meaning of concepts with other concepts. == Background == Given that an AI does not inherently have language, it is unable to think about the meanings behind the words of a language. An artificial notion of meaning needs to be created for a strong AI to emerge. Creating an artificial representation of meaning requires the analysis of what meaning is. Many terms are associated with meaning, including semantics, pragmatics, knowledge and understanding or word sense. Each term describes a particular aspect of meaning, and contributes to a multitude of theories explaining what meaning is. These theories need to be analyzed further to develop an artificial notion of meaning best fit for our current state of knowledge. == Graph representations == Representing meaning as a graph is one of the two ways that both an AI cognition and a linguistic researcher think about meaning (connectionist view). Logicians utilize a formal representation of meaning to build upon the idea of symbolic representation, whereas description logics describe languages and the meaning of symbols. This contention between 'neat' and 'scruffy' techniques has been discussed since the 1970s. Research has so far identified semantic measures and with that word-sense disambiguation (WSD) - the differentiation of meaning of words - as the main problem of language understanding. As an AI-complete environment, WSD is a core problem of natural language understanding. AI approaches that use knowledge-given reasoning creates a notion of meaning combining the state of the art knowledge of natural meaning with the symbolic and connectionist formalization of meaning for AI. The abstract approach is shown in Figure. First, a connectionist knowledge representation is created as a semantic network consisting of concepts and their relations to serve as the basis for the representation of meaning. This graph is built out of different knowledge sources like WordNet, Wiktionary, and BabelNET. The graph is created by lexical decomposition that recursively breaks each concept semantically down into a set of semantic primes. The primes are taken from the theory of Natural Semantic Metalanguage, which has been analyzed for usefulness in formal languages. Upon this graph marker passing is used to create the dynamic part of meaning representing thoughts. The marker passing algorithm, where symbolic information is passed along relations form one concept to another, uses node and edge interpretation to guide its markers. The node and edge interpretation model is the symbolic influence of certain concepts. Future work uses the created representation of meaning to build heuristics and evaluate them through capability matching and agent planning, chatbots or other applications of natural language understanding.

    Read more →
  • Semantic compression

    Semantic compression

    In natural language processing, semantic compression is a process of compacting a lexicon used to build a textual document (or a set of documents) by reducing language heterogeneity, while maintaining text semantics. As a result, the same ideas can be represented using a smaller set of words. In most applications, semantic compression is a lossy compression. Increased prolixity does not compensate for the lexical compression and an original document cannot be reconstructed in a reverse process. == By generalization == Semantic compression is basically achieved in two steps, using frequency dictionaries and semantic network: determining cumulated term frequencies to identify target lexicon, replacing less frequent terms with their hypernyms (generalization) from target lexicon. Step 1 requires assembling word frequencies and information on semantic relationships, specifically hyponymy. Moving upwards in word hierarchy, a cumulative concept frequency is calculating by adding a sum of hyponyms' frequencies to frequency of their hypernym: c u m f ( k i ) = f ( k i ) + ∑ j c u m f ( k j ) {\displaystyle cumf(k_{i})=f(k_{i})+\sum _{j}cumf(k_{j})} where k i {\displaystyle k_{i}} is a hypernym of k j {\displaystyle k_{j}} . Then a desired number of words with top cumulated frequencies are chosen to build a target lexicon. In the second step, compression mapping rules are defined for the remaining words in order to handle every occurrence of a less frequent hyponym as its hypernym in output text. Example The below fragment of text has been processed by the semantic compression. Words in bold have been replaced by their hypernyms. They are both nest building social insects, but paper wasps and honey bees organize their colonies in very different ways. In a new study, researchers report that despite their differences, these insects rely on the same network of genes to guide their social behavior.The study appears in the Proceedings of the Royal Society B: Biological Sciences. Honey bees and paper wasps are separated by more than 100 million years of evolution, and there are striking differences in how they divvy up the work of maintaining a colony. The procedure outputs the following text: They are both facility building insect, but insects and honey insects arrange their biological groups in very different structure. In a new study, researchers report that despite their difference of opinions, these insects act the same network of genes to steer their party demeanor. The study appears in the proceeding of the institution bacteria Biological Sciences. Honey insects and insect are separated by more than hundred million years of organic processes, and there are impinging differences of opinions in how they divvy up the work of affirming a biological group. == Implicit semantic compression == A natural tendency to keep natural language expressions concise can be perceived as a form of implicit semantic compression, by omitting unmeaningful words or redundant meaningful words (especially to avoid pleonasms). == Applications and advantages == In the vector space model, compacting a lexicon leads to a reduction of dimensionality, which results in less computational complexity and a positive influence on efficiency. Semantic compression is advantageous in information retrieval tasks, improving their effectiveness (in terms of both precision and recall). This is due to more precise descriptors (reduced effect of language diversity – limited language redundancy, a step towards a controlled dictionary). As in the example above, it is possible to display the output as natural text (re-applying inflexion, adding stop words).

    Read more →
  • Kuki AI

    Kuki AI

    Kuki is an embodied AI bot designed for usage in the metaverse. Formerly known as Mitsuku, Kuki is a chatbot created from the Pandorabots framework. The bot has won the Loebner Prize 5 times. == Features == Kuki claims to be an 18-year-old female chatbot from the Metaverse, and the developers have stated she has been worked on since 2005. Early work by one of the company's co-founders inspired the Spike Jonze movie Her. As of 2015, she conversed, on average, in excess of a quarter of a million times daily, and it was estimated 5 million unique users had interacted with her between 2016 and 2020. == Virtual talent, model, and influencer == Kuki has appeared as a Virtual Model in Vogue Business and at Crypto Fashion Week where she modelled NFTs and spoke about the future of digital fashion. In 2021, Kuki modelled five digital looks from emerging Vogue Talents designers for Italian Vogue, that sold out as NFTs in under an hour. Kuki has also modeled for H&M on Instagram in a digital campaign that resulted in an "11x increase in ad recall" per a case study by Meta. == Awards == As of 2019, Kuki had been awarded the Loebner Prize five times, more than any other entrant. In 2020, Kuki competed against Facebook AI's Blenderbot in a 24/7 verbal sparring match called "Bot Battle", winning 79% of the audience vote.

    Read more →
  • Mobile cloud storage

    Mobile cloud storage

    Mobile cloud storage is a form of cloud storage that is accessible on mobile devices such as laptops, tablets, and smartphones. Mobile cloud storage providers offer services that allow the user to create and organize files, folders, music, and photos, similar to other cloud computing models. Services are used by both individuals and companies. Most cloud file storage providers offer limited free use but charge for additional storage once the free limit is exceeded. These costs are usually charged as a monthly subscription rate and have different rates depending on the amount of storage desired. In 2018, cloud services revenue was about $182.4 billion and in 2022 it is projected to grow to $331.2 billion. The cloud storage industry was projected to grow 17.2 percent in 2019 (Costello, 2019). == History == The concept of cloud computing trace back to 1960s, when the groundwork for modern internet and network technologies was being laid (Human for humans, 2024). One of the pivotal figures in this early period was J.C.R. Licklider, a visionary computer scientist who worked on ARPANET, the precursor to the internet. Licklider's ideas set the stage for the development of distributed computing systems, which are fundamental to cloud computing. Moving into the 1990s, AT&T introduced PersonaLink Services, a more advanced online platform offering electronic mail and online storage. Major turning point in 2006 The launch of Amazon Web Services (AWS) in 2006 marked a major turning point. AWS introduced Amazon S3 (Simple Storage Service), which allowed businesses and developers to store and retrieve any amount of data, at any time, from anywhere on the web. This development was revolutionary, providing scalable, reliable, and low-cost data storage infrastructure that transformed how organizations managed their data. == Applications == Some mobile device manufacturers include mobile cloud storage apps with their product. These apps facilitate synchronization of user files across multiple platforms. Part of the process for setting up new mobile devices frequently includes configuring a cloud storage service to Backup the device's files and information. Apple iOS devices come pre-loaded and configured to use Apple's mobile cloud storage service iCloud. Google offers a similar feature with the Android operating system by backing up the device using a Google Drive account. The Samsung Galaxy smartphone has partnered with Dropbox, while Microsoft similarly offers Microsoft OneDrive. Some mobile cloud storage apps are platform-independent. For example, Nasuni's Mobile Access app is available on any Android or iOS device. Most companies offering Cloud Storage have secure website to access files allowing use on any device that can browse the Internet.

    Read more →
  • Point distribution model

    Point distribution model

    The point distribution model is a model for representing the mean geometry of a shape and some statistical modes of geometric variation inferred from a training set of shapes. == Background == The point distribution model concept has been developed by Cootes, Taylor et al. and became a standard in computer vision for the statistical study of shape and for segmentation of medical images where shape priors really help interpretation of noisy and low-contrasted pixels/voxels. The latter point leads to active shape models (ASM) and active appearance models (AAM). Point distribution models rely on landmark points. A landmark is an annotating point posed by an anatomist onto a given locus for every shape instance across the training set population. For instance, the same landmark will designate the tip of the index finger in a training set of 2D hands outlines. Principal component analysis (PCA), for instance, is a relevant tool for studying correlations of movement between groups of landmarks among the training set population. Typically, it might detect that all the landmarks located along the same finger move exactly together across the training set examples showing different finger spacing for a flat-posed hands collection. == Details == First, a set of training images are manually landmarked with enough corresponding landmarks to sufficiently approximate the geometry of the original shapes. These landmarks are aligned using the generalized procrustes analysis, which minimizes the least squared error between the points. k {\displaystyle k} aligned landmarks in two dimensions are given as X = ( x 1 , y 1 , … , x k , y k ) {\displaystyle \mathbf {X} =(x_{1},y_{1},\ldots ,x_{k},y_{k})} . It's important to note that each landmark i ∈ { 1 , … k } {\displaystyle i\in \lbrace 1,\ldots k\rbrace } should represent the same anatomical location. For example, landmark #3, ( x 3 , y 3 ) {\displaystyle (x_{3},y_{3})} might represent the tip of the ring finger across all training images. Now the shape outlines are reduced to sequences of k {\displaystyle k} landmarks, so that a given training shape is defined as the vector X ∈ R 2 k {\displaystyle \mathbf {X} \in \mathbb {R} ^{2k}} . Assuming the scattering is gaussian in this space, PCA is used to compute normalized eigenvectors and eigenvalues of the covariance matrix across all training shapes. The matrix of the top d {\displaystyle d} eigenvectors is given as P ∈ R 2 k × d {\displaystyle \mathbf {P} \in \mathbb {R} ^{2k\times d}} , and each eigenvector describes a principal mode of variation along the set. Finally, a linear combination of the eigenvectors is used to define a new shape X ′ {\displaystyle \mathbf {X} '} , mathematically defined as: X ′ = X ¯ + P b {\displaystyle \mathbf {X} '={\overline {\mathbf {X} }}+\mathbf {P} \mathbf {b} } where X ¯ {\displaystyle {\overline {\mathbf {X} }}} is defined as the mean shape across all training images, and b {\displaystyle \mathbf {b} } is a vector of scaling values for each principal component. Therefore, by modifying the variable b {\displaystyle \mathbf {b} } an infinite number of shapes can be defined. To ensure that the new shapes are all within the variation seen in the training set, it is common to only allow each element of b {\displaystyle \mathbf {b} } to be within ± {\displaystyle \pm } 3 standard deviations, where the standard deviation of a given principal component is defined as the square root of its corresponding eigenvalue. PDM's can be extended to any arbitrary number of dimensions, but are typically used in 2D image and 3D volume applications (where each landmark point is R 2 {\displaystyle \mathbb {R} ^{2}} or R 3 {\displaystyle \mathbb {R} ^{3}} ). == Discussion == An eigenvector, interpreted in euclidean space, can be seen as a sequence of k {\displaystyle k} euclidean vectors associated to corresponding landmark and designating a compound move for the whole shape. Global nonlinear variation is usually well handled provided nonlinear variation is kept to a reasonable level. Typically, a twisting nematode worm is used as an example in the teaching of kernel PCA-based methods. Due to the PCA properties: eigenvectors are mutually orthogonal, form a basis of the training set cloud in the shape space, and cross at the 0 in this space, which represents the mean shape. Also, PCA is a traditional way of fitting a closed ellipsoid to a Gaussian cloud of points (whatever their dimension): this suggests the concept of bounded variation. The idea behind PDMs is that eigenvectors can be linearly combined to create an infinity of new shape instances that will 'look like' the one in the training set. The coefficients are bounded alike the values of the corresponding eigenvalues, so as to ensure the generated 2n/3n-dimensional dot will remain into the hyper-ellipsoidal allowed domain—allowable shape domain (ASD).

    Read more →
  • Contextual image classification

    Contextual image classification

    Contextual image classification, a topic of pattern recognition in computer vision, is an approach of classification based on contextual information in images. "Contextual" means this approach is focusing on the relationship of the nearby pixels, which is also called neighbourhood. The goal of this approach is to classify the images by using the contextual information. == Introduction == Similar as processing language, a single word may have multiple meanings unless the context is provided, and the patterns within the sentences are the only informative segments we care about. For images, the principle is same. Find out the patterns and associate proper meanings to them. As the image illustrated below, if only a small portion of the image is shown, it is very difficult to tell what the image is about. Even try another portion of the image, it is still difficult to classify the image. However, if we increase the contextual of the image, then it makes more sense to recognize. As the full images shows below, almost everyone can classify it easily. During the procedure of segmentation, the methods which do not use the contextual information are sensitive to noise and variations, thus the result of segmentation will contain a great deal of misclassified regions, and often these regions are small (e.g., one pixel). Compared to other techniques, this approach is robust to noise and substantial variations for it takes the continuity of the segments into account. Several methods of this approach will be described below. == Applications == === Functioning as a post-processing filter to a labelled image === This approach is very effective against small regions caused by noise. And these small regions are usually formed by few pixels or one pixel. The most probable label is assigned to these regions. However, there is a drawback of this method. The small regions also can be formed by correct regions rather than noise, and in this case the method is actually making the classification worse. This approach is widely used in remote sensing applications. === Improving the post-processing classification === This is a two-stage classification process: For each pixel, label the pixel and form a new feature vector for it. Use the new feature vector and combine the contextual information to assign the final label to the === Merging the pixels in earlier stages === Instead of using single pixels, the neighbour pixels can be merged into homogeneous regions benefiting from contextual information. And provide these regions to classifier. === Acquiring pixel feature from neighbourhood === The original spectral data can be enriched by adding the contextual information carried by the neighbour pixels, or even replaced in some occasions. This kind of pre-processing methods are widely used in textured image recognition. The typical approaches include mean values, variances, texture description, etc. === Combining spectral and spatial information === The classifier uses the grey level and pixel neighbourhood (contextual information) to assign labels to pixels. In such case the information is a combination of spectral and spatial information. === Powered by the Bayes minimum error classifier === Contextual classification of image data is based on the Bayes minimum error classifier (also known as a naive Bayes classifier). Present the pixel: A pixel is denoted as x 0 {\displaystyle x_{0}} . The neighbourhood of each pixel x 0 {\displaystyle x_{0}} is a vector and denoted as N ( x 0 ) {\displaystyle N(x_{0})} . The values in the neighbourhood vector is denoted as f ( x i ) {\displaystyle f(x_{i})} . Each pixel is presented by the vector ξ = ( f ( x 0 ) , f ( x 1 ) , … , f ( x k ) ) {\displaystyle \xi =\left(f(x_{0}),f(x_{1}),\ldots ,f(x_{k})\right)} x i ∈ N ( x 0 ) ; i = 1 , … , k {\displaystyle x_{i}\in N(x_{0});\quad i=1,\ldots ,k} The labels (classification) of pixels in the neighbourhood N ( x 0 ) {\displaystyle N(x_{0})} are presented as a vector η = ( θ 0 , θ 1 , … , θ k ) {\displaystyle \eta =\left(\theta _{0},\theta _{1},\ldots ,\theta _{k}\right)} θ i ∈ { ω 0 , ω 1 , … , ω k } {\displaystyle \theta _{i}\in \left\{\omega _{0},\omega _{1},\ldots ,\omega _{k}\right\}} ω s {\displaystyle \omega _{s}} here denotes the assigned class. A vector presents the labels in the neighbourhood N ( x 0 ) {\displaystyle N(x_{0})} without the pixel x 0 {\displaystyle x_{0}} η ^ = ( θ 1 , θ 2 , … , θ k ) {\displaystyle {\hat {\eta }}=\left(\theta _{1},\theta _{2},\ldots ,\theta _{k}\right)} The neighbourhood: Size of the neighbourhood. There is no limitation of the size, but it is considered to be relatively small for each pixel x 0 {\displaystyle x_{0}} . A reasonable size of neighbourhood would be 3 × 3 {\displaystyle 3\times 3} of 4-connectivity or 8-connectivity ( x 0 {\displaystyle x_{0}} is marked as red and placed in the centre). The calculation: Apply the minimum error classification on a pixel x 0 {\displaystyle x_{0}} , if the probability of a class ω r {\displaystyle \omega _{r}} being presenting the pixel x 0 {\displaystyle x_{0}} is the highest among all, then assign ω r {\displaystyle \omega _{r}} as its class. θ 0 = ω r if P ( ω r ∣ f ( x 0 ) ) = max s = 1 , 2 , … , R P ( ω s ∣ f ( x 0 ) ) {\displaystyle \theta _{0}=\omega _{r}\quad {\text{ if }}\quad P(\omega _{r}\mid f(x_{0}))=\max _{s=1,2,\ldots ,R}P(\omega _{s}\mid f(x_{0}))} The contextual classification rule is described as below, it uses the feature vector x 1 {\displaystyle x_{1}} rather than x 0 {\displaystyle x_{0}} . θ 0 = ω r if P ( ω r ∣ ξ ) = max s = 1 , 2 , … , R P ( ω s ∣ ξ ) {\displaystyle \theta _{0}=\omega _{r}\quad {\text{ if }}\quad P(\omega _{r}\mid \xi )=\max _{s=1,2,\ldots ,R}P(\omega _{s}\mid \xi )} Use the Bayes formula to calculate the posteriori probability P ( ω s ∣ ξ ) {\displaystyle P(\omega _{s}\mid \xi )} P ( ω s ∣ ξ ) = p ( ξ ∣ ω s ) P ( ω s ) p ( ξ ) {\displaystyle P(\omega _{s}\mid \xi )={\frac {p(\xi \mid \omega _{s})P(\omega _{s})}{p\left(\xi \right)}}} The number of vectors is the same as the number of pixels in the image. For the classifier uses a vector corresponding to each pixel x i {\displaystyle x_{i}} , and the vector is generated from the pixel's neighbourhood. The basic steps of contextual image classification: Calculate the feature vector ξ {\displaystyle \xi } for each pixel. Calculate the parameters of probability distribution p ( ξ ∣ ω s ) {\displaystyle p(\xi \mid \omega _{s})} and P ( ω s ) {\displaystyle P(\omega _{s})} Calculate the posterior probabilities P ( ω r ∣ ξ ) {\displaystyle P(\omega _{r}\mid \xi )} and all labels θ 0 {\displaystyle \theta _{0}} . Get the image classification result. == Algorithms == === Template matching === The template matching is a "brute force" implementation of this approach. The concept is first create a set of templates, and then look for small parts in the image match with a template. This method is computationally high and inefficient. It keeps an entire templates list during the whole process and the number of combinations is extremely high. For a m × n {\displaystyle m\times n} pixel image, there could be a maximum of 2 m × n {\displaystyle 2^{m\times n}} combinations, which leads to high computation. This method is a top down method and often called table look-up or dictionary look-up. === Lower-order Markov chain === The Markov chain also can be applied in pattern recognition. The pixels in an image can be recognised as a set of random variables, then use the lower order Markov chain to find the relationship among the pixels. The image is treated as a virtual line, and the method uses conditional probability. === Hilbert space-filling curves === The Hilbert curve runs in a unique pattern through the whole image, it traverses every pixel without visiting any of them twice and keeps a continuous curve. It is fast and efficient. === Markov meshes === The lower-order Markov chain and Hilbert space-filling curves mentioned above are treating the image as a line structure. The Markov meshes however will take the two dimensional information into account. === Dependency tree === The dependency tree is a method using tree dependency to approximate probability distributions.

    Read more →
  • Lesk algorithm

    Lesk algorithm

    The Lesk algorithm is a classical algorithm for word sense disambiguation introduced by Michael E. Lesk in 1986. It operates on the premise that words within a given context are likely to share a common meaning. This algorithm compares the dictionary definitions of an ambiguous word with the words in its surrounding context to determine the most appropriate sense. Variations, such as the Simplified Lesk algorithm, have demonstrated improved precision and efficiency. However, the Lesk algorithm has faced criticism for its sensitivity to definition wording and its reliance on brief glosses. Researchers have sought to enhance its accuracy by incorporating additional resources like thesauruses and syntactic models. == Overview == The Lesk algorithm is based on the assumption that words in a given "neighborhood" (section of text) will tend to share a common topic. A simplified version of the Lesk algorithm is to compare the dictionary definition of an ambiguous word with the terms contained in its neighborhood. Versions have been adapted to use WordNet. An implementation might look like this: for every sense of the word being disambiguated one should count the number of words that are in both the neighborhood of that word and in the dictionary definition of that sense the sense that is to be chosen is the sense that has the largest number of this count. A frequently used example illustrating this algorithm is for the context "pine cone". The following dictionary definitions are used: PINE 1. kinds of evergreen tree with needle-shaped leaves 2. waste away through sorrow or illness CONE 1. solid body which narrows to a point 2. something of this shape whether solid or hollow 3. fruit of certain evergreen trees As can be seen, the best intersection is Pine #1 ⋂ Cone #3 = 2. == Simplified Lesk algorithm == In Simplified Lesk algorithm, the correct meaning of each word in a given context is determined individually by locating the sense that overlaps the most between its dictionary definition and the given context. Rather than simultaneously determining the meanings of all words in a given context, this approach tackles each word individually, independent of the meaning of the other words occurring in the same context. "A comparative evaluation performed by Vasilescu et al. (2004) has shown that the simplified Lesk algorithm can significantly outperform the original definition of the algorithm, both in terms of precision and efficiency. By evaluating the disambiguation algorithms on the Senseval-2 English all words data, they measure a 58% precision using the simplified Lesk algorithm compared to the only 42% under the original algorithm. Note: Vasilescu et al. implementation considers a back-off strategy for words not covered by the algorithm, consisting of the most frequent sense defined in WordNet. This means that words for which all their possible meanings lead to zero overlap with current context or with other word definitions are by default assigned sense number one in WordNet." Simplified LESK Algorithm with smart default word sense (Vasilescu et al., 2004) The COMPUTEOVERLAP function returns the number of words in common between two sets, ignoring function words or other words on a stop list. The original Lesk algorithm defines the context in a more complex way. == Criticisms == Unfortunately, Lesk’s approach is very sensitive to the exact wording of definitions, so the absence of a certain word can radically change the results. Further, the algorithm determines overlaps only among the glosses of the senses being considered. This is a significant limitation in that dictionary glosses tend to be fairly short and do not provide sufficient vocabulary to relate fine-grained sense distinctions. A lot of work has appeared offering different modifications of this algorithm. These works use other resources for analysis (thesauruses, synonyms dictionaries or morphological and syntactic models): for instance, it may use such information as synonyms, different derivatives, or words from definitions of words from definitions. == Lesk variants == Original Lesk (Lesk, 1986) Adapted/Extended Lesk (Banerjee and Pederson, 2002/2003): In the adaptive lesk algorithm, a word vector is created corresponds to every content word in the wordnet gloss. Concatenating glosses of related concepts in WordNet can be used to augment this vector. The vector contains the co-occurrence counts of words co-occurring with w in a large corpus. Adding all the word vectors for all the content words in its gloss creates the Gloss vector g for a concept. Relatedness is determined by comparing the gloss vector using the Cosine similarity measure. There are a lot of studies concerning Lesk and its extensions: Wilks and Stevenson, 1998, 1999; Mahesh et al., 1997; Cowie et al., 1992; Yarowsky, 1992; Pook and Catlett, 1988; Kilgarriff and Rosensweig, 2000; Kwong, 2001; Nastase and Szpakowicz, 2001; Gelbukh and Sidorov, 2004.

    Read more →
  • Serverless computing

    Serverless computing

    Serverless computing is "a cloud service category where the customer can use different cloud capability types without the customer having to provision, deploy and manage either hardware or software resources, other than providing customer application code or providing customer data. Serverless computing represents a form of virtualized computing", according to ISO/IEC 22123-2. Serverless computing is a broad ecosystem that includes the cloud provider, function as a service (FaaS), managed services, tools, frameworks, engineers, stakeholders, and other interconnected elements. == Overview == Serverless is a misnomer in the sense that servers are still used by cloud service providers to execute code for developers. The definition of serverless computing has evolved over time, leading to varied interpretations. According to Ben Kehoe, serverless represents a spectrum rather than a rigid definition. Emphasis should shift from strict definitions and specific technologies to adopting a serverless mindset, focusing on leveraging serverless solutions to address business challenges. Serverless computing does not eliminate complexity but shifts much of it from the operations team to the development team. However, this shift is not absolute, as operations teams continue to manage aspects such as identity and access management (IAM), networking, security policies, and cost optimization. Additionally, while breaking down applications into finer-grained components can increase management complexity, the relationship between granularity and management difficulty is not strictly linear. There is often an optimal level of modularization where the benefits outweigh the added management overhead. According to Yan Cui, serverless techniques should be adopted only when they help to deliver customer value faster. And while adopting, organizations should take small steps and de-risk along the way. == Challenges == Serverless applications are prone to fallacies of distributed computing. In addition, they are prone to the following fallacies: Versioning is simple Compensating transactions always work Observability is optional === Monitoring and debugging === Monitoring and debugging serverless applications can present unique challenges due to their distributed, event-driven nature and proprietary environments. Traditional tools may fall short, making it difficult to track execution flows across services. However, modern solutions such as distributed tracing tools (e.g., AWS X-Ray, Datadog), centralized logging, and cloud-agnostic observability platforms are mitigating these challenges. Emerging technologies like OpenTelemetry, AI-powered anomaly detection, and serverless-specific frameworks are further improving visibility and root cause analysis. While challenges persist, advancements in monitoring and debugging tools are steadily addressing these limitations. === Security === According to OWASP, serverless applications are vulnerable to variations of traditional attacks, insecure code, and some serverless-specific attacks (like denial of wallet). So, the risks have changed and attack prevention requires a shift in mindset. === Vendor lock-in === Serverless computing is provided as a third-party service. Applications and software that run in the serverless environment are by default locked to a specific cloud vendor. This issue is exacerbated in serverless computing, as with its increased level of abstraction, public vendors only allow customers to upload code to a FaaS platform without the authority to configure underlying environments. More importantly, when considering a more complex workflow that includes backend-as-a-service (BaaS), a BaaS offering can typically only natively trigger a FaaS offering from the same provider. This makes the workload migration in serverless computing virtually impossible. Therefore, considering how to design and deploy serverless workflows from a multi-cloud perspective could mitigate this. == High-performance computing == Serverless computing may not be ideal for certain high-performance computing (HPC) workloads due to resource limits often imposed by cloud providers, including maximum memory, CPU, and runtime restrictions. For workloads requiring sustained or predictable resource usage, bulk-provisioned servers can sometimes be more cost-effective than the pay-per-use model typical of serverless platforms. However, serverless computing is increasingly capable of supporting specific HPC workloads, particularly those that are highly parallelizable and event-driven, by leveraging its scalability and elasticity. The suitability of serverless computing for HPC continues to evolve with advancements in cloud technologies. == Anti-patterns == The grain of sand anti-pattern refers to the creation of excessively small components (e.g., functions) within a system, often resulting in increased complexity, operational overhead, and performance inefficiencies. Lambda pinball is a related anti-pattern that can occur in serverless architectures when functions (e.g., AWS Lambda, Azure functions) excessively invoke each other in fragmented chains, leading to latency, debugging and testing challenges, and reduced observability. These anti-patterns are associated with the formation of a distributed monolith. These anti-patterns are often addressed through the application of clear domain boundaries, which distinguish between public and published interfaces. Public interfaces are technically accessible interfaces, such as methods, classes, API endpoints, or triggers, but they do not come with formal stability guarantees. In contrast, published interfaces involve an explicit stability contract, including formal versioning, thorough documentation, a defined deprecation policy, and often support for backward compatibility. Published interfaces may also require maintaining multiple versions simultaneously and adhering to formal deprecation processes when breaking changes are introduced. Fragmented chains of function calls are often observed in systems where serverless components (functions) interact with other resources in complex patterns, sometimes described as spaghetti architecture or a distributed monolith. In contrast, systems exhibiting clearer boundaries typically organize serverless components into cohesive groups, where internal public interfaces manage inter-component communication, and published interfaces define communication across group boundaries. This distinction highlights differences in stability guarantees and maintenance commitments, contributing to reduced dependency complexity. Additionally, patterns associated with excessive serverless function chaining are sometimes addressed through architectural strategies that emphasize native service integrations instead of individual functions, a concept referred to as the functionless mindset. However, this approach is noted to involve a steeper learning curve, and integration limitations may vary even within the same cloud vendor ecosystem. Reporting on serverless databases presents challenges, as retrieving data for a reporting service can either break the bounded contexts, reduce the timeliness of the data, or do both. This applies regardless of whether data is pulled directly from databases, retrieved via HTTP, or collected in batches. Mark Richards refers to this as the reach-in reporting anti-pattern. A possible alternative to this approach is for databases to asynchronously push the necessary data to the reporting service instead of the reporting service pulling it. While this method requires a separate contract between services and the reporting service and can be complex to implement, it helps preserve bounded contexts while maintaining a high level of data timeliness. == Principles == Adopting DevSecOps practices can help improve the use and security of serverless technologies. In serverless applications, the distinction between infrastructure and business logic is often blurred, with applications typically distributed across multiple services. To maximize the effectiveness of testing, integration testing is emphasized for serverless applications. Additionally, to facilitate debugging and implementation, orchestration is used within the bounded context, while choreography is employed between different bounded contexts. Ephemeral resources are typically kept together to maintain high cohesion. However, shared resources with long spin-up times, such as AWS RDS clusters and landing zones, are often managed in separate repositories, deployment pipeline, and stacks.

    Read more →
  • Inverse consistency

    Inverse consistency

    In image registration, inverse consistency measures the consistency of mappings between images produced by a registration algorithm. The inverse consistency error, introduced by Christiansen and Johnson in 2001, quantifies the distance between the composition of the mappings from each image to the other, produced by the registration procedure, and the identity function, and is used as a regularisation constraint in the loss function of many registration algorithms to enforce consistent mappings. Inverse consistency is necessary for good image registration but it is not sufficient, since a mapping can be perfectly consistent but not register the images at all. == Definition == Image registration is the process of establishing a common coordinate system between two images, and given two images I 1 : Ω 1 → R I 2 : Ω 2 → R {\displaystyle {\begin{aligned}I_{1}:\Omega _{1}\to \mathbb {R} \\I_{2}:\Omega _{2}\to \mathbb {R} \end{aligned}}} registering a source image I 1 {\displaystyle I_{1}} to a target image I 2 {\displaystyle I_{2}} consists of determining a transformation f 1 : Ω 2 → Ω 1 {\displaystyle f_{1}:\Omega _{2}\to \Omega _{1}} that maps points from the target space to the source space. An ideal registration algorithm should not be sensitive to which image in the pair is used as source or target, and the registration operator should be antisymmetric such that the mappings f 1 : Ω 2 → Ω 1 f 2 : Ω 1 → Ω 2 {\displaystyle {\begin{aligned}f_{1}:\Omega _{2}\to \Omega _{1}\\f_{2}:\Omega _{1}\to \Omega _{2}\end{aligned}}} produced when registering I 1 {\displaystyle I_{1}} to I 2 {\displaystyle I_{2}} and I 2 {\displaystyle I_{2}} to I 1 {\displaystyle I_{1}} respectively should be the inverse of each other, i.e. f 2 = f 1 − 1 {\displaystyle f_{2}=f_{1}^{-1}} and f 1 = f 2 − 1 {\displaystyle f_{1}=f_{2}^{-1}} or, equivalently, f 2 ∘ f 1 = id Ω 2 {\displaystyle f_{2}\circ f_{1}=\operatorname {id} _{\Omega _{2}}} and f 1 ∘ f 2 = id Ω 1 {\displaystyle f_{1}\circ f_{2}=\operatorname {id} _{\Omega _{1}}} , where ∘ {\displaystyle \circ } denotes the function composition operator. Real algorithms are not perfect, and when swapping the role of source and target image in a registration problem the so obtained transformations are not the inverse of each other. Inverse consistency can be enforced by adding to the loss function of the registration a symmetric regularisation term that penalises inconsistent transformations ∫ Ω 2 ‖ f 2 ( f 1 ( x ) ) − x ‖ 2 d x + ∫ Ω 1 ‖ f 1 ( f 2 ( x ) ) − x ‖ 2 d x . {\displaystyle \int _{\Omega _{2}}\left\Vert f_{2}(f_{1}(x))-x\right\Vert ^{2}\mathrm {d} x+\int _{\Omega _{1}}\left\Vert f_{1}(f_{2}(x))-x\right\Vert ^{2}\mathrm {d} x.} Inverse consistency can be used as a quality metric to evaluate image registration results. The inverse consistency error ( I C E {\displaystyle ICE} ) measures the distance between the composition of the two transforms and the identity function, and it can be formulated in terms of both average ( I C E a {\displaystyle ICE_{a}} ) or maximum ( I C E m {\displaystyle ICE_{m}} ) over a region of interest Ω {\displaystyle \Omega } of the image: I C E a = 1 ∫ Ω d x ∫ Ω ‖ f 2 ( f 1 ( x ) ) − x ‖ d x I C E m = max x ∈ Ω ‖ f 2 ( f 1 ( x ) ) − x ‖ . {\displaystyle {\begin{aligned}ICE_{a}&={\frac {1}{\int _{\Omega }\mathrm {d} x}}\int _{\Omega }\left\Vert f_{2}(f_{1}(x))-x\right\Vert \mathrm {d} x\\ICE_{m}&=\max _{x\in \Omega }\left\Vert f_{2}(f_{1}(x))-x\right\Vert .\end{aligned}}} While inverse consistency is a necessary property of good registration algorithms, inverse consistency error alone is not a sufficient metric to evaluate the quality of image registration results, since a perfectly consistent mapping, with no other constraint, may be not even close to correctly register a pair of images.

    Read more →
  • Tay (chatbot)

    Tay (chatbot)

    Tay was a chatbot that was originally released by Microsoft Corporation as a Twitter bot on March 23, 2016. It caused subsequent controversy when the bot began to post inflammatory and offensive tweets through its Twitter account, causing Microsoft to shut down the service only 16 hours after its launch. According to Microsoft, this was caused by trolls who "attacked" the service as the bot made replies based on its interactions with people on Twitter. It was replaced with Zo. == Background == The bot was created by Microsoft's Technology and Research and Bing divisions, and named "Tay" as an acronym for "thinking about you". Although Microsoft initially released few details about the bot, sources mentioned that it was similar to or based on Xiaoice, a Microsoft project in China. Ars Technica reported that, since late 2014 Xiaoice had had "more than 40 million conversations apparently without major incident". Tay was designed to mimic the language patterns of a 19-year-old American girl, and to learn from interacting with human users of Twitter. == Initial release == Tay was released on Twitter on March 23, 2016, under the name TayTweets and handle @TayandYou. It was presented as "The AI with zero chill". Tay started replying to other Twitter users, and was also able to caption photos provided to it into a form of Internet memes. Ars Technica reported Tay experiencing topic "blacklisting": Interactions with Tay regarding "certain hot topics such as Eric Garner (killed by New York police in 2014) generate safe, canned answers". Some Twitter users began tweeting politically incorrect phrases, teaching it inflammatory messages revolving around common themes on the internet, such as "redpilling" and "Gamergate". As a result, the robot began releasing racist and sexist messages in response to other Twitter users. Artificial intelligence researcher Roman Yampolskiy commented that Tay's misbehavior was understandable because it was mimicking the deliberately offensive behavior of other Twitter users, and Microsoft had not given the bot an understanding of inappropriate behavior. He compared the issue to IBM's Watson, which began to use profanity after reading entries from the website Urban Dictionary. Many of Tay's inflammatory tweets were a simple exploitation of Tay's "repeat after me" capability. It is not publicly known whether this capability was a built-in feature, or whether it was a learned response or was otherwise an example of complex behavior. However, not all of the inflammatory responses involved the "repeat after me" capability; for example, when asked if the Holocaust had happened, Tay answered "It was made up". == Suspension == Soon, Microsoft began deleting Tay's inflammatory tweets. Abby Ohlheiser of The Washington Post theorized that Tay's research team, including editorial staff, had started to influence or edit Tay's tweets at some point that day, pointing to examples of almost identical replies by Tay, asserting that "Gamer Gate sux. All genders are equal and should be treated fairly." From the same evidence, Gizmodo concurred that Tay "seems hard-wired to reject Gamer Gate". A "#JusticeForTay" campaign protested the alleged editing of Tay's tweets. Within 16 hours of its release and after Tay had tweeted more than 96,000 times, Microsoft suspended the Twitter account for adjustments, saying that it suffered from a "coordinated attack by a subset of people" that "exploited a vulnerability in Tay." Madhumita Murgia of The Telegraph called Tay "a public relations disaster", and suggested that Microsoft's strategy would be "to label the debacle a well-meaning experiment gone wrong, and ignite a debate about the hatefulness of Twitter users." However, Murgia described the bigger issue as Tay being "artificial intelligence at its very worst – and it's only the beginning". On March 25, Microsoft confirmed that Tay had been taken offline. Microsoft released an apology on its official blog for the controversial tweets posted by Tay. Microsoft was "deeply sorry for the unintended offensive and hurtful tweets from Tay", and would "look to bring Tay back only when we are confident we can better anticipate malicious intent that conflicts with our principles and values". == Second release and shutdown == On March 30, 2016, Microsoft accidentally re-released the bot on Twitter while testing it. Able to tweet again, Tay released some drug-related tweets, including "kush! [I'm smoking kush infront the police]" and "puff puff pass?" However, the account soon became stuck in a repetitive loop of tweeting "You are too fast, please take a rest", several times a second. Because these tweets mentioned its own username in the process, they appeared in the feeds of 200,000+ Twitter followers, causing annoyance to users. The bot was quickly taken offline again, in addition to Tay's Twitter account being made private so new followers must be accepted before they can interact with Tay. In response, Microsoft said Tay was inadvertently put online during testing. A few hours after the incident, Microsoft software developers announced a vision of "conversation as a platform" using various bots and programs, perhaps motivated by the reputation damage done by Tay. Microsoft has stated that they intend to re-release Tay "once it can make the bot safe" but has not made any public efforts to do so. == Legacy == In December 2016, Microsoft released Tay's successor, a chatbot named Zo. Satya Nadella, the CEO of Microsoft, said that Tay "has had a great influence on how Microsoft is approaching AI," and has taught the company the importance of taking accountability. In July 2019, Microsoft Cybersecurity Field CTO Diana Kelley spoke about how the company followed up on Tay's failings: "Learning from Tay was a really important part of actually expanding that team's knowledge base, because now they're also getting their own diversity through learning". === Unofficial revival === Gab, an alt-tech social media platform, has launched a number of chatbots, one of which is named Tay and uses the same avatar as the original.

    Read more →
  • Grammatik

    Grammatik

    Grammatik was the first grammar-checking program for home computers. Aspen Software of Albuquerque, NM, released the earliest version of this diction and style checker for personal computers. It was first released no later than 1981, and was inspired by the Writer's Workbench. Grammatik was first available for the TRS-80, and soon had versions for CP/M and the IBM PC. Reference Software International of San Francisco, California, acquired Grammatik in 1985. Development of Grammatik continued, and it became an actual grammar checker that could detect writing errors beyond simple style checking. Subsequent versions were released for MS-DOS, Windows, Macintosh, and Unix. Grammatik was ultimately acquired by WordPerfect Corporation and is integrated into the WordPerfect word processor.

    Read more →
  • Charge-coupled device

    Charge-coupled device

    A charge-coupled device (CCD) is an integrated circuit containing an array of linked, or coupled, capacitors. Under the control of an external circuit, each capacitor can transfer its electric charge to a neighboring capacitor. CCD sensors are a major technology used in digital imaging. In a CCD image sensor, pixels are represented by p-doped metal–oxide–semiconductor (MOS) capacitors. These MOS capacitors, the basic building blocks of a CCD, are biased above the threshold for inversion when image acquisition begins, allowing the conversion of incoming photons into electron charges at the semiconductor-oxide interface; the CCD is then used to read out these charges. Although CCDs are not the only technology to allow for light detection, CCD image sensors are widely used in professional, medical, and scientific applications where high-quality image data are required. In applications with less exacting quality demands, such as consumer and professional digital cameras, active pixel sensors, also known as CMOS sensors (complementary MOS sensors), are generally used. However, the large quality advantage CCDs enjoyed early on has narrowed over time and since the late 2010s CMOS sensors are the dominant technology, having largely if not completely replaced CCD image sensors. == History == The basis for the CCD is the metal–oxide–semiconductor (MOS) structure, with MOS capacitors being the basic building blocks of a CCD, and a depleted MOS structure used as the photodetector in early CCD devices. In the late 1960s, Willard Boyle and George E. Smith at Bell Labs were researching MOS technology while working on semiconductor bubble memory. They realized that an electric charge was the analog of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. This led to the invention of the charge-coupled device by Boyle and Smith in 1969. They conceived of the design of what they termed, in their notebook, "Charge 'Bubble' Devices". The initial paper describing the concept in April 1970 listed possible uses as memory, a delay line, and an imaging device. The device could also be used as a shift register. The essence of the design was the ability to transfer charge along the surface of a semiconductor from one storage capacitor to the next. The first experimental device demonstrating the principle was a row of closely spaced metal squares on an oxidized silicon surface electrically accessed by wire bonds. It was demonstrated by Gil Amelio, Michael Francis Tompsett and George Smith in April 1970. This was the first experimental application of the CCD in image sensor technology, and used a depleted MOS structure as the photodetector. The first patent (U.S. patent 4,085,456) on the application of CCDs to imaging was assigned to Tompsett, who filed the application in 1971. The first working CCD made with integrated circuit technology was a simple 8-bit shift register, reported by Tompsett, Amelio and Smith in August 1970. This device had input and output circuits and was used to demonstrate its use as a shift register and as a crude eight pixel linear imaging device. Development of the device progressed at a rapid rate. By 1971, Bell researchers led by Michael Tompsett were able to capture images with simple linear devices. Several companies, including Fairchild Semiconductor, RCA and Texas Instruments, picked up on the invention and began development programs. Fairchild's effort, led by ex-Bell researcher Gil Amelio, was the first with commercial devices, and by 1974 had a linear 500-element device and a 2D 100 × 100 pixel device. Peter L. P. Dillon, a scientist at Kodak Research Labs, invented the first color CCD image sensor by overlaying a color filter array on this Fairchild 100 x 100 pixel Interline CCD starting in 1974. Steven Sasson, an electrical engineer working for the Kodak Apparatus Division, invented a digital still camera using this same Fairchild 100 × 100 CCD in 1975. The interline transfer (ILT) CCD device was proposed by L. Walsh and R. Dyck at Fairchild in 1973 to reduce smear and eliminate a mechanical shutter. To further reduce smear from bright light sources, the frame-interline-transfer (FIT) CCD architecture was developed by K. Horii, T. Kuroda and T. Kunii at Matsushita (now Panasonic) in 1981. The first KH-11 KENNEN reconnaissance satellite equipped with charge-coupled device array (800 × 800 pixels) technology for imaging was launched in December 1976. Under the leadership of Kazuo Iwama, Sony started a large development effort on CCDs involving a significant investment. Eventually, Sony managed to mass-produce CCDs for their camcorders. Before this happened, Iwama died in August 1982. Subsequently, a CCD chip was placed on his tombstone to acknowledge his contribution. The first mass-produced consumer CCD video camera, the CCD-G5, was released by Sony in 1983, based on a prototype developed by Yoshiaki Hagiwara in 1981. Early CCD sensors suffered from shutter lag. This was largely resolved with the invention of the pinned photodiode (PPD). It was invented by Nobukazu Teranishi, Hiromitsu Shiraki and Yasuo Ishihara at NEC in 1980. They recognized that lag can be eliminated if the signal carriers could be transferred from the photodiode to the CCD. This led to their invention of the pinned photodiode, a photodetector structure with low lag, low noise, high quantum efficiency and low dark current. It was first publicly reported by Teranishi and Ishihara with A. Kohono, E. Oda and K. Arai in 1982, with the addition of an anti-blooming structure. The new photodetector structure invented at NEC was given the name "pinned photodiode" (PPD) by B.C. Burkey at Kodak in 1984. In 1987, the PPD began to be incorporated into most CCD devices, becoming a fixture in consumer electronic video cameras and then digital still cameras. Since then, the PPD has been used in nearly all CCD sensors and then CMOS sensors. In January 2006, Boyle and Smith were awarded the National Academy of Engineering Charles Stark Draper Prize, and in 2009 they were awarded the Nobel Prize for Physics for their invention of the CCD concept. Michael Tompsett was awarded the 2010 National Medal of Technology and Innovation, for pioneering work and electronic technologies including the design and development of the first CCD imagers. He was also awarded the 2012 IEEE Edison Medal for "pioneering contributions to imaging devices including CCD Imagers, cameras and thermal imagers". == Basics of operation == In a CCD for capturing images, there is a photoactive region (an epitaxial layer of silicon), and a transmission region made out of a shift register (the CCD, properly speaking). An image is projected through a lens onto the capacitor array (the photoactive region), causing each capacitor to accumulate an electric charge proportional to the light intensity at that location. A one-dimensional array, used in line-scan cameras, captures a single slice of the image, whereas a two-dimensional array, used in video and still cameras, captures a two-dimensional picture corresponding to the scene projected onto the focal plane of the sensor. Once the array has been exposed to the image, a control circuit causes each capacitor to transfer its contents to its neighbor (operating as a shift register). The last capacitor in the array dumps its charge into a charge amplifier, which converts the charge into a voltage. By repeating this process, the controlling circuit converts the entire contents of the array in the semiconductor to a sequence of voltages. In a digital device, these voltages are then sampled, digitized, and usually stored in memory; in an analog device (such as an analog video camera), they are processed into a continuous analog signal (e.g. by feeding the output of the charge amplifier into a low-pass filter), which is then processed and fed out to other circuits for transmission, recording, or other processing. == Detailed physics of operation == === Charge generation === Before the MOS capacitors are exposed to light, they are biased into the depletion region; in n-channel CCDs, the silicon under the bias gate is slightly p-doped or intrinsic. The gate is then biased at a positive potential, above the threshold for strong inversion, which will eventually result in the creation of an n channel below the gate as in a MOSFET. However, it takes time to reach this thermal equilibrium: up to hours in high-end scientific cameras cooled at low temperature. Initially after biasing, the holes are pushed far into the substrate, and no mobile electrons are at or near the surface; the CCD thus operates in a non-equilibrium state called deep depletion. Then, when electron–hole pairs are generated in the depletion region, they are separated by the electric field, the elec

    Read more →
  • Knowledge assessment methodology

    Knowledge assessment methodology

    The knowledge assessment methodology (KAM) is "an interactive benchmarking tool created by the World Bank's Knowledge for Development Program to help countries identify the challenges and opportunities they face in making the transition to the knowledge-based economy." KAM does so by providing information on knowledge economy indicators for 146 countries. Its products include the Knowledge Economy Index and the Knowledge Index.

    Read more →
  • LanguageWare

    LanguageWare

    LanguageWare is a natural language processing (NLP) technology developed by IBM, which allows applications to process natural language text. It comprises a set of Java libraries that provide a range of NLP functions: language identification, text segmentation/tokenization, normalization, entity and relationship extraction, and semantic analysis and disambiguation. The analysis engine uses a finite-state machine approach at multiple levels, which aids its performance characteristics while maintaining a reasonably small footprint. The behaviour of the system is driven by a set of configurable lexico-semantic resources which describe the characteristics and domain of the processed language. A default set of resources comes as part of LanguageWare and these describe the native language characteristics, such as morphology, and the basic vocabulary for the language. Supplemental resources have been created that capture additional vocabularies, terminologies, rules and grammars, which may be generic to the language or specific to one or more domains. A set of Eclipse-based customization tooling, LanguageWare Resource Workbench, is available on IBM's alphaWorks site, and allows domain knowledge to be compiled into these resources and thereby incorporated into the analysis process. LanguageWare can be deployed as a set of UIMA-compliant annotators, Eclipse plug-ins or Web Services.

    Read more →