AI Coding Deleted Database

AI Coding Deleted Database — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Rhetorical structure theory

    Rhetorical structure theory

    Rhetorical structure theory (RST) is a theory of text organization that describes relations that hold between parts of text. It was originally developed by William Mann, Sandra Thompson, Christian M. I. M. Matthiessen and others at the University of Southern California's Information Sciences Institute (ISI) and defined in a 1988 paper. The theory was developed as part of studies of computer-based text generation. Natural language processing researchers later began using RST in automatic summarization and other applications. It explains coherence by postulating a hierarchical, connected structure of texts, which are labeled using a small, predefined inventory of relation types - for example, one part of a text may provide an elaboration on another part, provide background or specify a cause for another. In the 2000s, following the release of the first large-scale dataset implementing the theory, the RST Discourse Treebank (RST-DT), Daniel Marcu demonstrated the feasibility of practical applications of RST to discourse parsing and summarization at ISI. Originally limited to written text, subsequent work in the 2010s expanded RST to spoken language analysis, and the framework has been applied to a variety of languages including Farsi, German, Mandarin Chinese, Russian and Spanish. Following the introduction of Transformers, LLMs have been applied to automatic RST parsing, with results approaching human performance on parsing text in English. == Rhetorical relations == Rhetorical relations, also called coherence or discourse relations, are paratactic (coordinate) or hypotactic (subordinate) relations that hold across two or more text spans. The logical arrangement of relations in a text contributes to its coherence by connecting different propositions in a relational structure. RST using rhetorical relations provides a systematic way for an analyst to analyze the underlying intention of a text. The analysis is usually built by reading the text and constructing a tree using the relations. The following example is a title and summary, appearing at the top of an article in Scientific American magazine (adapted from Ramachandran and Anstis, 1986). The original text, broken into numbered units, is: [Title:] The Perception of Apparent Motion [Abstract:] When the motion of an intermittently seen object is ambiguous the visual system resolves confusion by applying some tricks that reflect a built-in knowledge of properties of the physical world. In the figure, the numbers 1-5 show the corresponding units from the text above. Unit 5 provides an "elaboration" on unit 4, and therefore constitutes a less prominent satellite of unit 4, which acts as a nucleus for the relation. Units 4-5 form a relation "Means", explaining the means by which the visual system resolves confusion. Unit 3 is the Central Discourse Unit (CDU) of the text, since all units point to it directly or indirectly. Similarly units 1 and 2 form "preparation" and "circumstance" relations relative to their nuclei. Groups of units which serve as a satellite or nucleus together are called complex discourse units, and always span a set of adjacent EDUs. == Nuclearity in discourse == RST establishes two different types of units. Nuclei are considered as the most important parts of text whereas satellites contribute to the nuclei and are secondary. Nucleus contains basic information and satellite contains additional information about nucleus. The satellite is often incomprehensible without nucleus, whereas a text where satellites have been deleted can be understood to a certain extent. == Hierarchy in the analysis == RST relations are applied recursively in a text, until all units in that text are constituents in an RST relation. The result of such analyses is that RST structure are typically represented as trees, with one top level relation that encompasses other relations at lower levels. == Why RST? == From linguistic point of view, RST proposes a different view of text organization than most linguistic theories. RST points to a tight relation between relations and coherence in text From a computational point of view, it provides a characterization of text relations that has been implemented in different systems and for applications as text generation and summarization. == In design rationale == Computer scientists Ana Cristina Bicharra Garcia and Clarisse Sieckenius de Souz have used RST as the basis of a design rationale system called ADD+. In ADD+, RST is used as the basis for the rhetorical organization of a knowledge base, in a way comparable to other knowledge representation systems such as issue-based information system (IBIS). Similarly, RST has been used in representation schemes for argumentation.

    Read more →
  • Confusion network

    Confusion network

    A confusion network (sometimes called a word confusion network or informally known as a sausage) is a natural language processing method that combines outputs from multiple automatic speech recognition or machine translation systems. Confusion networks are simple linear directed acyclic graphs with the property that each a path from the start node to the end node goes through all the other nodes. The set of words represented by edges between two nodes is called a confusion set. In machine translation, the defining characteristic of confusion networks is that they allow multiple ambiguous inputs, deferring committal translation decisions until later stages of processing. This approach is used in the open source machine translation software Moses and the proprietary translation API in IBM Bluemix Watson.

    Read more →
  • Transduction (machine learning)

    Transduction (machine learning)

    In logic, statistical inference, and supervised learning, transduction or transductive inference is reasoning from observed, specific (training) cases to specific (test) cases. In contrast, induction is reasoning from observed training cases to general rules, which are then applied to the test cases. The distinction is most interesting in cases where the predictions of the transductive model are not achievable by any inductive model. Note that this is caused by transductive inference on different test sets producing mutually inconsistent predictions. Transduction was introduced in a computer science context by Vladimir Vapnik in the 1990s, motivated by his view that transduction is preferable to induction since, according to him, induction requires solving a more general problem (inferring a function) before solving a more specific problem (computing outputs for new cases): "When solving a problem of interest, do not solve a more general problem as an intermediate step. Try to get the answer that you really need but not a more general one.". An example of learning which is not inductive would be in the case of binary classification, where the inputs tend to cluster in two groups. A large set of test inputs may help in finding the clusters, thus providing useful information about the classification labels. The same predictions would not be obtainable from a model which induces a function based only on the training cases. Some people may call this an example of the closely related semi-supervised learning, since Vapnik's motivation is quite different. The most well-known example of a case-bases learning algorithm is the k-nearest neighbor algorithm, which is related to transductive learning algorithms. Another example of an algorithm in this category is the Transductive Support Vector Machine (TSVM). A third possible motivation of transduction arises through the need to approximate. If exact inference is computationally prohibitive, one may at least try to make sure that the approximations are good at the test inputs. In this case, the test inputs could come from an arbitrary distribution (not necessarily related to the distribution of the training inputs), which wouldn't be allowed in semi-supervised learning. An example of an algorithm falling in this category is the Bayesian Committee Machine (BCM). == Historical context == The mode of inference from particulars to particulars, which Vapnik came to call transduction, was already distinguished from the mode of inference from particulars to generalizations in part III of the Cambridge philosopher and logician W.E. Johnson's 1924 textbook, Logic. In Johnson's work, the former mode was called 'eduction' and the latter was called 'induction'. Bruno de Finetti developed a purely subjective form of Bayesianism in which claims about objective chances could be translated into empirically respectable claims about subjective credences with respect to observables through exchangeability properties. An early statement of this view can be found in his 1937 La Prévision: ses Lois Logiques, ses Sources Subjectives and a mature statement in his 1970 Theory of Probability. Within de Finetti's subjective Bayesian framework, all inductive inference is ultimately inference from particulars to particulars. == Example problem == The following example problem contrasts some of the unique properties of transduction against induction. A collection of points is given, such that some of the points are labeled (A, B, or C), but most of the points are unlabeled (?). The goal is to predict appropriate labels for all of the unlabeled points. The inductive approach to solving this problem is to use the labeled points to train a supervised learning algorithm, and then have it predict labels for all of the unlabeled points. With this problem, however, the supervised learning algorithm will only have five labeled points to use as a basis for building a predictive model. It will certainly struggle to build a model that captures the structure of this data. For example, if a nearest-neighbor algorithm is used, then the points near the middle will be labeled "A" or "C", even though it is apparent that they belong to the same cluster as the point labeled "B", compared to semi-supervised learning. Transduction has the advantage of being able to consider all of the points, not just the labeled points, while performing the labeling task. In this case, transductive algorithms would label the unlabeled points according to the clusters to which they naturally belong. The points in the middle, therefore, would most likely be labeled "B", because they are packed very close to that cluster. An advantage of transduction is that it may be able to make better predictions with fewer labeled points, because it uses the natural breaks found in the unlabeled points. One disadvantage of transduction is that it builds no predictive model. If a previously unknown point is added to the set, the entire transductive algorithm would need to be repeated with all of the points in order to predict a label. This can be computationally expensive if the data is made available incrementally in a stream. Further, this might cause the predictions of some of the old points to change (which may be good or bad, depending on the application). A supervised learning algorithm, on the other hand, can label new points instantly, with very little computational cost. == Transduction algorithms == Transduction algorithms can be broadly divided into two categories: those that seek to assign discrete labels to unlabeled points, and those that seek to regress continuous labels for unlabeled points. Algorithms that seek to predict discrete labels tend to be derived by adding partial supervision to a clustering algorithm. Two classes of algorithms can be used: flat clustering and hierarchical clustering. The latter can be further subdivided into two categories: those that cluster by partitioning, and those that cluster by agglomerating. Algorithms that seek to predict continuous labels tend to be derived by adding partial supervision to a manifold learning algorithm. === Partitioning transduction === Partitioning transduction can be thought of as top-down transduction. It is a semi-supervised extension of partition-based clustering. It is typically performed as follows: Consider the set of all points to be one large partition. While any partition P contains two points with conflicting labels: Partition P into smaller partitions. For each partition P: Assign the same label to all of the points in P. Of course, any reasonable partitioning technique could be used with this algorithm. Max flow min cut partitioning schemes are very popular for this purpose. === Agglomerative transduction === Agglomerative transduction can be thought of as bottom-up transduction. It is a semi-supervised extension of agglomerative clustering. It is typically performed as follows: Compute the pair-wise distances, D, between all the points. Sort D in ascending order. Consider each point to be a cluster of size 1. For each pair of points {a,b} in D: If (a is unlabeled) or (b is unlabeled) or (a and b have the same label) Merge the two clusters that contain a and b. Label all points in the merged cluster with the same label. === Continuous Label Transduction === These methods seek to regress continuous labels, often via manifold learning techniques. The idea is to learn a low-dimensional representation of the data and infer values smoothly across the manifold. == Applications and related concepts == Transduction is closely related to: Semi-supervised learning – uses both labeled and unlabeled data but typically induces a model. Case-based reasoning – such as the k-nearest neighbor (k-NN) algorithm, often considered a transductive method. Transductive Support Vector Machines (TSVM) – extend standard SVMs to incorporate unlabeled test data during training. Bayesian Committee Machine (BCM) – an approximation method that makes transductive predictions when exact inference is too costly.

    Read more →
  • Boyfriend Maker

    Boyfriend Maker

    Boyfriend Maker was a dating sim, romance chatbot smartphone app for iOS (iPhone) and Android devices, developed by Japanese studio 36 You Games (styled as 36You) and distributed under the freemium business model. Boyfriend Maker incorporated advanced artificial intelligence chat technology a decade before products such as ChatGPT. According to the developer's website, Boyfriend Maker is an "app that lets you interact and chat with quirky virtual boyfriends". While each virtual boyfriend has certain unique characteristics, the various instances of the boyfriend are powered by a chat engine, that (at least within a language and market) can utilise vocabulary and knowledge acquired in a chat with one user in subsequent chats with other users. == Gameplay == Users gain experience points and in-game coins. Users can customize their virtual boyfriend's appearance by selecting items such as hair, clothing, face, and a necklace. == Apple delisting and reintroduction == In late November 2012, the original iOS Boyfriend Maker app was delisted from the Apple App Store due to "ribald" chat, according to the New York Times. Boyfriend Maker was removed by Apple due to "reports of references to violent sexual acts and pedophilia". Boyfriend Maker had an age rating of 4+, even though the chat bot "responds with often strange and explicit text unsuitable for young children". User-posted chat excerpts indicate that the virtual boyfriend would sometimes transition abruptly to sexual chat in response to a seemingly innocent question. In one user-posted example, in response to the question, "what kind of wedding cake will we have" the boyfriend responds, "a good sex ima be on top of u u gonna ride oon me bitin the pillow gurrl ima fuck da shit out of u". The developer's use of the SimSimi-created third-party chat engine may be responsible for the sexual text. As the virtual boyfriend converses with human users, the SimSimi chat engine acquires vocabulary from users of the game and applies this "learned" vocabulary in chats with other users. The chat engine might also employ lines harvested from human-human chat logs, song lyrics, movies or TV shows. In April 2013, a detuned and presumably tamer version of the app, titled Boyfriend Plus, was permitted on Apple's App Store.

    Read more →
  • Instance-based learning

    Instance-based learning

    In machine learning, instance-based learning (sometimes called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. Because computation is postponed until a new instance is observed, these algorithms are sometimes referred to as "lazy." It is called instance-based because it constructs hypotheses directly from the training instances themselves. This means that the hypothesis complexity can grow with the data: in the worst case, a hypothesis is a list of n training items and the computational complexity of classifying a single new instance is O(n). One advantage that instance-based learning has over other methods of machine learning is its ability to adapt its model to previously unseen data. Instance-based learners may simply store a new instance or throw an old instance away. Examples of instance-based learning algorithms are the k-nearest neighbors algorithm, kernel machines and RBF networks. These store (a subset of) their training set; when predicting a value/class for a new instance, they compute distances or similarities between this instance and the training instances to make a decision. To battle the memory complexity of storing all training instances, as well as the risk of overfitting to noise in the training set, instance reduction algorithms have been proposed.

    Read more →
  • 1.58-bit large language model

    1.58-bit large language model

    A 1.58-bit large language model (also known as a ternary LLM) is a type of large language model (LLM) designed to be computationally efficient. It achieves this by using weights that are restricted to only three values: -1, 0, and +1. This restriction significantly reduces the model's memory footprint and allows for faster processing, as computationally expensive multiplication operations can be replaced with lower-cost additions. This contrasts with traditional models that use 16-bit floating-point numbers (FP16 or BF16) for their weights. Studies have shown that for models up to several billion parameters, the performance of 1.58-bit LLMs on various tasks is comparable to their full-precision counterparts. This approach could enable powerful AI to run on less specialized and lower-power hardware. The name "1.58-bit" comes from the fact that a system with three states contains log 2 ⁡ 3 ≈ 1.58 {\displaystyle \log _{2}3\approx 1.58} bits of information. These models are sometimes also referred to as 1-bit LLMs in research papers, although this term can also refer to true binary models (with weights of -1 and +1). == BitNet == In 2024, Ma et al., researchers at Microsoft, declared that their 1.58-bit model, BitNet b1.58 is comparable in performance to the 16-bit Llama 2 and opens the era of 1-bit LLM. BitNet creators did not use the post-training quantization of weights but instead relied on the new BitLinear transform that replaced the nn.Linear layer of the traditional transformer design. In 2025, Microsoft researchers had released an open-weights and open inference code model BitNet b1.58 2B4T demonstrating performance competitive with the full precision models at 2B parameters and 4T training tokens. == Post-training quantization == BitNet derives its performance from being trained natively in 1.58 bit instead of being quantized from a full-precision model after training. Still, training is an expensive process and it would be desirable to be able to somehow convert an existing model to 1.58 bits. In 2024, HuggingFace reported a way to gradually ramp up the 1.58-bit quantization in fine-tuning an existing model down to 1.58 bits. == Critique == Some researchers point out that the scaling laws of large language models favor the low-bit weights only in case of undertrained models. As the number of training tokens increases, the deficiencies of low-bit quantization surface.

    Read more →
  • Luxafor

    Luxafor

    Luxafor () is a brand of office productivity tools designed to improve efficiency and communication in workplaces. The brands main product is LED status indicators for use in office settings. Luxafor is a product line under the company SIA Greynut, based in Riga, Latvia. == History == Luxafor was developed by the technology company SIA Greynut. The brand first gained attention through a Kickstarter campaign in 2015, which aimed to fund its initial product, the Luxafor Flag. Although the campaign was unsuccessful in reaching its funding goal, the product was still brought to market. In 2017, Luxafor launched another Kickstarter campaign for the Luxafor Bluetooth, a wireless version of its LED status indicator. This campaign also did not meet its funding goal, but like its predecessor, the product was still developed and released. Despite initial setbacks, Luxafor Bluetooth has become one of the brand's leading products. == Products == Luxafors main product range is LED status indicators, including: === Luxafor Flag === A USB-powered LED indicator that shows different colors to signal the user's availability. === Luxafor Bluetooth === A wireless LED indicator controlled via Bluetooth, integrating with productivity tools like Slack and Microsoft Teams. === Luxafor Switch === An advanced status indicator designed to manage room and workspace availability. === Other === Other Luxafor products include CO2 Dongle, Smart Button, Mute Button, Pomodoro Timer and others. == Features == Luxafor products are known for their customizable indicators, integration capabilities with IFTTT, Zapier, and remote control features. They are compatible with various operating systems, including Windows and macOS, and can be integrated with numerous communication and productivity platforms, like Microsoft Teams and Cisco Jabber.

    Read more →
  • Natural language processing

    Natural language processing

    Natural language processing (NLP) is the processing of natural language information by a computer. NLP is a subfield of computer science and is closely associated with artificial intelligence. NLP is also related to information retrieval, knowledge representation, computational linguistics, and linguistics more broadly. Major processing tasks in an NLP system include: speech recognition, text classification, natural language understanding, and natural language generation. == History == Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence," which proposed what is now called the Turing test as a criterion of intelligence, though at the time that was not articulated as a problem separate from artificial intelligence. The proposed test includes a task that involves the automated interpretation and generation of natural language. === Symbolic NLP (1950s – early 1990s) === The premise of symbolic NLP is often illustrated using John Searle's Chinese room thought experiment: Given a collection of rules (e.g., a Chinese phrasebook, with questions and matching answers), the computer emulates natural language understanding (or other NLP tasks) by applying those rules to the data it confronts. 1950s: The Georgetown experiment in 1954 involved fully automatic translation of more than sixty Russian sentences into English. The authors claimed that within three or five years, machine translation would be a solved problem. However, real progress was much slower, and after the ALPAC report in 1966, which found that ten years of research had failed to fulfill the expectations, funding for machine translation was dramatically reduced. Little further research in machine translation was conducted in America (though some research continued elsewhere, such as Japan and Europe) until the late 1980s when the first statistical machine translation systems were developed. 1960s: Some notably successful natural language processing systems developed in the 1960s were SHRDLU, a natural language system working in restricted "blocks worlds" with restricted vocabularies, and ELIZA, a simulation of Rogerian psychotherapy, written by Joseph Weizenbaum between 1964 and 1966. Despite using minimal information about human thought or emotion, ELIZA was able to produce interactions that appeared human-like. When the "patient" exceeded the very small knowledge base, ELIZA might provide a generic response, for example, responding to "My head hurts" with "Why do you say your head hurts?". Ross Quillian's successful work on natural language was demonstrated with a vocabulary of only twenty words, because that was all that would fit in a computer memory at the time. 1970s: During the 1970s, many programmers began to write "conceptual ontologies", which structured real-world information into computer-understandable data. Examples are MARGIE (Schank, 1975), SAM (Cullingford, 1978), PAM (Wilensky, 1978), TaleSpin (Meehan, 1976), QUALM (Lehnert, 1977), Politics (Carbonell, 1979), and Plot Units (Lehnert 1981). During this time, the first chatterbots were written (e.g., PARRY). 1980s: The 1980s and early 1990s mark the heyday of symbolic methods in NLP. Focus areas of the time included research on rule-based parsing (e.g., the development of HPSG as a computational operationalization of generative grammar), morphology (e.g., two-level morphology), semantics (e.g., Lesk algorithm), reference (e.g., within Centering Theory) and other areas of natural language understanding (e.g., in the Rhetorical Structure Theory). Other lines of research were continued, e.g., the development of chatterbots with Racter and Jabberwacky. An important development (that eventually led to the statistical turn in the 1990s) was the rising importance of quantitative evaluation in this period. === Statistical NLP (1990s–present) === Up until the 1980s, most natural language processing systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of machine learning algorithms for language processing. This shift was influenced by increasing computational power (see Moore's law) and a decline in the dominance of Chomskyan linguistic theories (e.g. transformational grammar), whose theoretical underpinnings discouraged the sort of corpus linguistics that underlies the machine-learning approach to language processing. 1990s: Many of the notable early successes in statistical methods in NLP occurred in the field of machine translation, due especially to work at IBM Research, such as IBM alignment models. These systems were able to take advantage of existing multilingual textual corpora that had been produced by the Parliament of Canada and the European Union as a result of laws calling for the translation of all governmental proceedings into all official languages of the corresponding systems of government. However, many systems relied on corpora that were specifically developed for the tasks they were designed to perform. This reliance has been a major limitation to their broader effectiveness and continues to affect similar systems. Consequently, significant research has focused on methods for learning effectively from limited amounts of data. 2000s: With the growth of the web, increasing amounts of raw (unannotated) language data have become available since the mid-1990s. Research has thus increasingly focused on unsupervised and semi-supervised learning algorithms. Such algorithms can learn from data that has not been hand-annotated with the desired answers or using a combination of annotated and non-annotated data. Generally, this task is much more difficult than supervised learning, and typically produces less accurate results for a given amount of input data. However, large quantities of non-annotated data are available (including, among other things, the entire content of the World Wide Web), which can often make up for the worse efficiency if the algorithm used has a low enough time complexity to be practical. 2003: word n-gram model, at the time the best statistical algorithm, is outperformed by a multi-layer perceptron (with a single hidden layer and context length of several words, trained on up to 14 million words, by Bengio et al.) 2010: Tomáš Mikolov (then a PhD student at Brno University of Technology) with co-authors applied a simple recurrent neural network with a single hidden layer to language modeling, and in the following years he went on to develop Word2vec. In the 2010s, representation learning and deep neural network-style (featuring many hidden layers) machine learning methods became widespread in natural language processing. This shift gained momentum due to results showing that such techniques can achieve state-of-the-art results in many natural language tasks, e.g., in language modeling and parsing. This is increasingly important in medicine and healthcare, where NLP helps analyze notes and text in electronic health records that would otherwise be inaccessible for study when seeking to improve care or protect patient privacy. == Approaches: Symbolic, statistical, neural networks == Symbolic approach, i.e., the hand-coding of a set of rules for manipulating symbols, coupled with a dictionary lookup, was historically the first approach used both by AI in general and by NLP in particular: such as by writing grammars or devising heuristic rules for stemming. Machine learning approaches, which include both statistical and neural networks, on the other hand, have many advantages over the symbolic approach: both statistical and neural network methods tend to focus more on the most common cases extracted from a corpus of texts, whereas the rule-based approach needs to provide rules for both rare and common cases equally. language models, produced by either statistical or neural networks methods, are more robust to both unfamiliar (e.g. containing words or structures that have not been seen before) and erroneous input (e.g. with misspelled words or words accidentally omitted) in comparison to the rule-based systems, which are also more costly to produce. the larger such a (probabilistic) language model is, the more accurate it becomes, in contrast to rule-based systems that can gain accuracy only by increasing the amount and complexity of the rules leading to intractability problems. Rule-based systems are commonly used: when the amount of training data is insufficient to successfully apply machine learning methods, e.g., for the machine translation of low-resource languages such as provided by the Apertium system, for preprocessing in NLP pipelines, e.g., tokenization, or for post-processing and transforming the output of NLP pipelines, e.g., for knowledge extraction from syntactic parses. === Statistical approach === In the late 1980s and mid-1990s, the statistical approach ended a peri

    Read more →
  • Lucy–Hook coaddition method

    Lucy–Hook coaddition method

    The Lucy–Hook coaddition method is an image processing technique for combining sub-stepped astronomical image data onto a finer grid. The method allows the option of resolution and contrast enhancement or the choice of a conservative, re-convolved, output. Tests with very deep Hubble Space Telescope Wide Field and Planetary Camera 2 (WFPC2) imaging data of excellent quality show that these methods can be very effective and allow fine-scale features to be studied better than on the unprocessed images. The Lucy–Hook coaddition method is an extension of the standard Richardson–Lucy deconvolution iterative restoration method. For many purposes it may be more convenient to combine dithered datasets using the Drizzle method.

    Read more →
  • Point-set registration

    Point-set registration

    In computer vision, pattern recognition, and robotics, point-set registration, also known as point-cloud registration or scan matching, is the process of finding a spatial transformation (e.g., scaling, rotation and translation) that aligns two point clouds. The purpose of finding such a transformation includes merging multiple data sets into a globally consistent model (or coordinate frame), and mapping a new measurement to a known data set to identify features or to estimate its pose. Raw 3D point cloud data are typically obtained from Lidars and RGB-D cameras. 3D point clouds can also be generated from computer vision algorithms such as triangulation, bundle adjustment, and more recently, monocular image depth estimation using deep learning. For 2D point set registration used in image processing and feature-based image registration, a point set may be 2D pixel coordinates obtained by feature extraction from an image, for example corner detection. Point cloud registration has extensive applications in autonomous driving, motion estimation and 3D reconstruction, object detection and pose estimation, robotic manipulation, simultaneous localization and mapping (SLAM), panorama stitching, virtual and augmented reality, and medical imaging. As a special case, registration of two point sets that only differ by a 3D rotation (i.e., there is no scaling and translation), is called the Wahba Problem and also related to the orthogonal procrustes problem. == Formulation == The problem may be summarized as follows: Let { M , S } {\displaystyle \lbrace {\mathcal {M}},{\mathcal {S}}\rbrace } be two finite size point sets in a finite-dimensional real vector space R d {\displaystyle \mathbb {R} ^{d}} , which contain M {\displaystyle M} and N {\displaystyle N} points respectively (e.g., d = 3 {\displaystyle d=3} recovers the typical case of when M {\displaystyle {\mathcal {M}}} and S {\displaystyle {\mathcal {S}}} are 3D point sets). The problem is to find a transformation to be applied to the moving "model" point set M {\displaystyle {\mathcal {M}}} such that the difference (typically defined in the sense of point-wise Euclidean distance) between M {\displaystyle {\mathcal {M}}} and the static "scene" set S {\displaystyle {\mathcal {S}}} is minimized. In other words, a mapping from R d {\displaystyle \mathbb {R} ^{d}} to R d {\displaystyle \mathbb {R} ^{d}} is desired which yields the best alignment between the transformed "model" set and the "scene" set. The mapping may consist of a rigid or non-rigid transformation. The transformation model may be written as T {\displaystyle T} , using which the transformed, registered model point set is: The output of a point set registration algorithm is therefore the optimal transformation T ⋆ {\displaystyle T^{\star }} such that M {\displaystyle {\mathcal {M}}} is best aligned to S {\displaystyle {\mathcal {S}}} , according to some defined notion of distance function dist ⁡ ( ⋅ , ⋅ ) {\displaystyle \operatorname {dist} (\cdot ,\cdot )} : where T {\displaystyle {\mathcal {T}}} is used to denote the set of all possible transformations that the optimization tries to search for. The most popular choice of the distance function is to take the square of the Euclidean distance for every pair of points: where ‖ ⋅ ‖ 2 {\displaystyle \|\cdot \|_{2}} denotes the vector 2-norm, s m {\displaystyle s_{m}} is the corresponding point in set S {\displaystyle {\mathcal {S}}} that attains the shortest distance to a given point m {\displaystyle m} in set M {\displaystyle {\mathcal {M}}} after transformation. Minimizing such a function in rigid registration is equivalent to solving a least squares problem. == Types of algorithms == When the correspondences (i.e., s m ↔ m {\displaystyle s_{m}\leftrightarrow m} ) are given before the optimization, for example, using feature matching techniques, then the optimization only needs to estimate the transformation. This type of registration is called correspondence-based registration. On the other hand, if the correspondences are unknown, then the optimization is required to jointly find out the correspondences and transformation together. This type of registration is called simultaneous pose and correspondence registration. === Rigid registration === Given two point sets, rigid registration yields a rigid transformation which maps one point set to the other. A rigid transformation is defined as a transformation that does not change the distance between any two points. Typically such a transformation consists of translation and rotation. In rare cases, the point set may also be mirrored. In robotics and computer vision, rigid registration has the most applications. === Non-rigid registration === Given two point sets, non-rigid registration yields a non-rigid transformation which maps one point set to the other. Non-rigid transformations include affine transformations such as scaling and shear mapping. However, in the context of point set registration, non-rigid registration typically involves nonlinear transformation. If the eigenmodes of variation of the point set are known, the nonlinear transformation may be parametrized by the eigenvalues. A nonlinear transformation may also be parametrized as a thin plate spline. === Other types === Some approaches to point set registration use algorithms that solve the more general graph matching problem. However, the computational complexity of such methods tend to be high and they are limited to rigid registrations. In this article, we will only consider algorithms for rigid registration, where the transformation is assumed to contain 3D rotations and translations (possibly also including a uniform scaling). The PCL (Point Cloud Library) is an open-source framework for n-dimensional point cloud and 3D geometry processing. It includes several point registration algorithms. == Correspondence-based registration == Correspondence-based methods assume the putative correspondences m ↔ s m {\displaystyle m\leftrightarrow s_{m}} are given for every point m ∈ M {\displaystyle m\in {\mathcal {M}}} . Therefore, we arrive at a setting where both point sets M {\displaystyle {\mathcal {M}}} and S {\displaystyle {\mathcal {S}}} have N {\displaystyle N} points and the correspondences m i ↔ s i , i = 1 , … , N {\displaystyle m_{i}\leftrightarrow s_{i},i=1,\dots ,N} are given. === Outlier-free registration === In the simplest case, one can assume that all the correspondences are correct, meaning that the points m i , s i ∈ R 3 {\displaystyle m_{i},s_{i}\in \mathbb {R} ^{3}} are generated as follows:where l > 0 {\displaystyle l>0} is a uniform scaling factor (in many cases l = 1 {\displaystyle l=1} is assumed), R ∈ SO ( 3 ) {\displaystyle R\in {\text{SO}}(3)} is a proper 3D rotation matrix ( SO ( d ) {\displaystyle {\text{SO}}(d)} is the special orthogonal group of degree d {\displaystyle d} ), t ∈ R 3 {\displaystyle t\in \mathbb {R} ^{3}} is a 3D translation vector and ϵ i ∈ R 3 {\displaystyle \epsilon _{i}\in \mathbb {R} ^{3}} models the unknown additive noise (e.g., Gaussian noise). Specifically, if the noise ϵ i {\displaystyle \epsilon _{i}} is assumed to follow a zero-mean isotropic Gaussian distribution with standard deviation σ i {\displaystyle \sigma _{i}} , i.e., ϵ i ∼ N ( 0 , σ i 2 I 3 ) {\displaystyle \epsilon _{i}\sim {\mathcal {N}}(0,\sigma _{i}^{2}I_{3})} , then the following optimization can be shown to yield the maximum likelihood estimate for the unknown scale, rotation and translation:Note that when the scaling factor is 1 and the translation vector is zero, then the optimization recovers the formulation of the Wahba problem. Despite the non-convexity of the optimization (cb.2) due to non-convexity of the set SO ( 3 ) {\displaystyle {\text{SO}}(3)} , seminal work by Berthold K.P. Horn showed that (cb.2) actually admits a closed-form solution, by decoupling the estimation of scale, rotation and translation. Similar results were discovered by Arun et al. In addition, in order to find a unique transformation ( l , R , t ) {\displaystyle (l,R,t)} , at least N = 3 {\displaystyle N=3} non-collinear points in each point set are required. More recently, Briales and Gonzalez-Jimenez have developed a semidefinite relaxation using Lagrangian duality, for the case where the model set M {\displaystyle {\mathcal {M}}} contains different 3D primitives such as points, lines and planes (which is the case when the model M {\displaystyle {\mathcal {M}}} is a 3D mesh). Interestingly, the semidefinite relaxation is empirically tight, i.e., a certifiably globally optimal solution can be extracted from the solution of the semidefinite relaxation. === Robust registration === The least squares formulation (cb.2) is known to perform arbitrarily badly in the presence of outliers. An outlier correspondence is a pair of measurements s i ↔ m i {\displaystyle s_{i}\leftrightarrow m_{i}} that departs from the generative model (cb.1). In this case, one can consider a differen

    Read more →
  • Microapp

    Microapp

    A microapp is a super-specialized application designed to perform one task or use case with the only objective of doing it well. They follow the single responsibility principle, which states that "a class should have one and only one reason to change." Micro applications help developers create less complex applications while reducing costs by breaking down monolithic systems into groups of independent services acting as one system. A good example of Microapps would be https://docs.citrix.com/en-us/legacy-archive/downloads/microapps.pdfthat provide single purpose action from Salesforce and over 40 applications on its workspace. == Requirements and characteristics == Microapps usually are accessible on any device, display, or operating system without installation on the viewer's device. To qualify as a microapp, the entity must: be built and deployed as an independent software module bring together various media types into a single experience have advanced security and compliance features be functionally-extensible comply with granular data demands be agnostic single use case oriented Microapps differentiate from traditional web or mobile applications by how the end-user interacts with them. Consequently, they can be embedded in websites or viewed online to bypass app stores and are typically built to provide a focused experience to the user. == Usage == Microapps are typically used for commercial purposes to reduce development costs for projects not requiring the large scope of a traditional web or mobile application. In addition, they are often used to showcase in-depth information or enrich marketing material with interactivity. Lately, micro apps are being used to boost productivity by providing quick tools to people to reuse best practices. Users have been interacting with microapps for a while with suites like Microsoft 365 and Google Workspace, where each one of their end-user services could be considered as a microapp. All these microapps share a unique identity manager to provide a unified user experience. == Benefits == Replacing monolith systems with microapps provide several advantages like: Reduce complexity for developers and users. Smaller, more cohesive, and maintainable codebases Scalable organizations with decoupled, autonomous teams Allows for hyper-specialization Independent deployment Multi-stack == Cloud-native microapps == Technologies like Kubernetes, or OpenShift, allow companies to replace their monolith and legacy systems with modular software taking advantage of microapps on reducing costs and improve reliability and security. == Microapps vs. microservices == There is a widespread misunderstanding between these two concepts, which is the key difference. Microservices is an architectural style that is systems-centric, meaning it decouples the presentation and data layer using web services APIs. On the other side, micro apps behave more as a super-architecture style (that embraces microservices among other types), and it is user-centric, meaning they decouple the whole monolith system onto modules that are designed to interact with final users. Both architectural styles rely on modularity to provide high performance, scalability, and resilience. == Considerations == Developing Micro apps requires a different approach than traditional software, and user experience is crucial. The following considerations are essential for switching to microapps. To run multiple microapps is required a single identity management system. Microservices are well suited to make microapps more powerful Apps with different levels of maturity might create a non-unified user experience. Duplication of dependencies can create security issues and inefficiencies. Suitable for well-organized teams

    Read more →
  • Application permissions

    Application permissions

    Permissions are a means of controlling and regulating access to specific system- and device-level functions by software. Typically, types of permissions cover functions that may have privacy implications, such as the ability to access a device's hardware features (including the camera and microphone), and personal data (such as storage devices, contacts lists, and the user's present geographical location). Permissions are typically declared in an application's manifest, and certain permissions must be specifically granted at runtime by the user—who may revoke the permission at any time. Permission systems are common on mobile operating systems, where permissions needed by specific apps must be disclosed via the platform's app store. == Mobile devices == On mobile operating systems for smartphones and tablets, typical types of permissions regulate: Access to storage and personal information, such as contacts, calendar appointments, etc. Location tracking. Access to the device's internal camera and/or microphone. Access to biometric sensors, including fingerprint readers and other health sensors.. Internet access. Access to communications interfaces (including their hardware identifiers and signal strength where applicable, and requests to enable them), such as Bluetooth, Wi-Fi, NFC, and others. Making and receiving phone calls. Sending and reading text messages The ability to perform in-app purchases. The ability to "overlay" themselves within other apps. Installing, deleting and otherwise managing applications. Authentication tokens (e.g., OAuth tokens) from web services stored in system storage for sharing between apps. Prior to Android 6.0 "Marshmallow", permissions were automatically granted to apps at runtime, and they were presented upon installation in Google Play Store. Since Marshmallow, certain permissions now require the app to request permission at runtime by the user. These permissions may also be revoked at any time via Android's settings menu. Usage of permissions on Android are sometimes abused by app developers to gather personal information and deliver advertising; in particular, apps for using a phone's camera flash as a flashlight (which have grown largely redundant due to the integration of such functionality at the system level on later versions of Android) have been known to require a large array of unnecessary permissions beyond what is actually needed for the stated functionality. iOS imposes a similar requirement for permissions to be granted at runtime, with particular controls offered for enabling of Bluetooth, Wi-Fi, and location tracking. == WebPermissions == WebPermissions is a permission system for web browsers. When a web application needs some data behind permission, it must request it first. When it does it, a user sees a window asking him to make a choice. The choice is remembered, but can be cleared lately. Currently the following resources are controlled: geolocation desktop notifications service workers sensors audio capturing devices, like sound cards, and their model names and characteristics video capturing devices, like cameras, and their identifiers and characteristics == Analysis == The permission-based access control model assigns access privileges for certain data objects to application. This is a derivative of the discretionary access control model. The access permissions are usually granted in the context of a specific user on a specific device. Permissions are granted permanently with few automatic restrictions. In some cases permissions are implemented in 'all-or-nothing' approach: a user either has to grant all the required permissions to access the application or the user can not access the application. There is still a lack of transparency when the permission is used by a program or application to access the data protected by the permission access control mechanism. Even if a user can revoke a permission, the app can blackmail a user by refusing to operate, for example by just crashing or asking user to grant the permission again in order to access the application. The permission mechanism has been widely criticized by researchers for several reasons, including; Intransparency of personal data extraction and surveillance, including the creation of a false sense of security; End-user fatigue of micro-managing access permissions leading to a fatalistic acceptance of surveillance and intransparency; Massive data extraction and personal surveillance carried out once the permissions are granted. Some apps, such as XPrivacy and Mockdroid spoof data in order to act as a measure for privacy. Further transparency methods include longitudinal behavioural profiling and multiple-source privacy analysis of app data access.

    Read more →
  • Amaryllo

    Amaryllo

    Amaryllo Inc. is a multinational company founded in Amsterdam, the Netherlands, and now headquartered in the United States. It operates as a cloud service platform, providing cloud storage and cloud computing solutions to enterprises and brand companies. Amaryllo began with Skype IP camera development, pioneering biometric robotic technologies, encrypted P2P network, and secure cloud storage. Amaryllo was founded by Band of Angels member, Marcus Yang to develop patents for a new type of robotic cameras that is claimed to "talk, hear, sense, recognize human faces, and track intruders". It also claims to have made the world's first security robot based on the WebRTC protocol, Icam PRO FHD, and won the 2015 CES Best of Innovation Award under Embedded Technology category. Its home security robots claim to employ 256-bit encryption and run on the WebRTC protocol. Amaryllo products are sold in over 100 Countries across 6 Continents. == History == Amaryllo revealed its first smart home security products at Internationale Funkausstellung Berlin (IFA) 2013 with a Skype-enabled IP camera called iCam HD. Amaryllo announced its second Skype-certified smart home product, iBabi HD, at CES 2014. The company was chosen as a "Cool Vendor" by Gartner in Connected Home 2014. Amaryllo introduced WebRTC-based smart home products after Microsoft terminated embedded Skype services in mid 2014. Since then, Amaryllo has been developing camera robots with auto-tracking and facial recognition technologies. Its camera robots, ATOM AR3 and ATOM AR3S, were introduced in late 2016. It focuses on wired and wireless technology based on AI services. == Cloud Service Platform == Amaryllo offers prepaid cloud storage through digital codes and gift cards, distributed via InComm Payments, Blackhawk Network, and other partners. It provides high-performance cloud computing service through Rescale partnership. Amaryllo provides free cameras under an annual cloud storage subscription on its website. == Global Supercomputing Network (GSN) == The Global Supercomputing Network (GSN) is a distributed high-performance computing (HPC) platform developed by Amaryllo. The network is designed to provide scalable Infrastructure as a Service (IaaS) by connecting a global array of data centers to offer GPU computing resources for specialized industrial and scientific applications. === Architecture and Technology === GSN operates as a decentralized distributed network of servers rather than a single centralized supercomputer. The platform integrates an artificial intelligence assistant named Genie, also developed by Amaryllo. Genie's primary function is to manage computing allocation, helping users identify and connect to available resources across the network’s various nodes based on the specific requirements of their tasks. === Services === The network primarily focuses on the rental of GPU processing resources, catering to fields that require massive parallel processing capabilities, including: Artificial Intelligence and Machine Learning: Training large language models (LLMs) and neural networks. Scientific Simulations: Executing complex calculations in physics, chemistry, and bioinformatics. Data Analytics: Processing large-scale datasets. By utilizing a rental model, GSN allows organizations to access high-end hardware without the capital expenditure associated with purchasing and maintaining physical server infrastructure. === Infrastructure and Partnerships === The network’s physical footprint is expanded through strategic partnerships with data center operators. GSN collaborates with MettaDC and Cyber DC to provide colocation services. These partnerships facilitate the deployment of Nvidia server clusters within secure, Tier-rated facilities, ensuring high availability and connectivity for GSN users. == Official Brand Licensee of HP == Amaryllo Inc. is an official licensee of HP Inc., managing both B2B and B2C cloud services under the HP brand. Through this partnership, Amaryllo offers a range of secure and scalable cloud solutions, including HP Cloud, which provides subscription and one-time payment storage for reliable data backup and storage for individuals, families, and businesses. HP Cloud employs cloud computing technologies to create smart albums for users.

    Read more →
  • Language resource

    Language resource

    In linguistics and language technology, a language resource is a "[composition] of linguistic material used in the construction, improvement and/or evaluation of language processing applications, (...) in language and language-mediated research studies and applications." According to Bird & Simons (2003), this includes data, i.e. "any information that documents or describes a language, such as a published monograph, a computer data file, or even a shoebox full of handwritten index cards. The information could range in content from unanalyzed sound recordings to fully transcribed and annotated texts to a complete descriptive grammar", tools, i.e., "computational resources that facilitate creating, viewing, querying, or otherwise using language data", and advice, i.e., "any information about what data sources are reliable, what tools are appropriate in a given situation, what practices to follow when creating new data". The latter aspect is usually referred to as "best practices" or "(community) standards". In a narrower sense, language resource is specifically applied to resources that are available in digital form, and then, "encompassing (a) data sets (textual, multimodal/multimedia and lexical data, grammars, language models, etc.) in machine readable form, and (b) tools/technologies/services used for their processing and management". == Typology == As of May 2020, no widely used standard typology of language resources has been established (current proposals include the LREMap, METASHARE, and, for data, the LLOD classification). Important classes of language resources include data lexical resources, e.g., machine-readable dictionaries, linguistic corpora, i.e., digital collections of natural language data, linguistic data bases such as the Cross-Linguistic Linked Data collection, tools linguistic annotations and tools for creating such annotations in a manual or semiautomated fashion (e.g., tools for annotating interlinear glossed text such as Toolbox and FLEx, or other language documentation tools), applications for search and retrieval over such data (corpus management systems), for automated annotation (part-of-speech tagging, syntactic parsing, semantic parsing, etc.), metadata and vocabularies vocabularies, repositories of linguistic terminology and language metadata, e.g., MetaShare (for language resource metadata), the ISO 12620 data category registry (for linguistic features, data structures and annotations within a language resource), or the Glottolog database (identifiers for language varieties and bibliographical database). == Language resource publication, dissemination and creation == A major concern of the language resource community has been to develop infrastructures and platforms to present, discuss and disseminate language resources. Selected contributions in this regard include: a series of International Conferences on Language Resources and Evaluation (LREC), the European Language Resources Association (ELRA, EU-based), and the Linguistic Data Consortium (LDC, US-based), which represent commercial hosting and dissemination platforms for language resources, the Open Languages Archives Community (OLAC), which provides and aggregates language resource metadata, the Language Resources and Evaluation Journal (LREJ), the European Language Grid is a European platform for language technologies (eg services), data and resources. As for the development of standards and best practices for language resources, these are subject of several community groups and standardization efforts, including ISO Technical Committee 37: Terminology and other language and content resources (ISO/TC 37), developing standards for all aspects of language resources, W3C Community Group Best Practices for Multilingual Linked Open Data (BPMLOD), working on best practice recommendations for publishing language resources as Linked Data or in RDF, W3C Community Group Linked Data for Language Technology (LD4LT), working on linguistic annotations on the web and language resource metadata, W3C Community Group Ontology-Lexica (OntoLex), working on lexical resources, the Open Linguistics working group of the Open Knowledge Foundation, working on conventions for publishing and linking open language resources, developing the Linguistic Linked Open Data cloud, the Text Encoding Initiative (TEI), working on XML-based specifications for language resources and digitally edited text.

    Read more →
  • Agent Ruby

    Agent Ruby

    Agent Ruby (1998–2002) by Lynn Hershman Leeson is an interactive, multiuser work using artificial intelligence. == Description == On Agent Ruby's website, "Agent Ruby's Edream Portal," a female face moves her eyes and lips. Ruby, named from Hershman Leeson's own film, Teknolust, answers questions and often responds that she needs a better algorithm to answer questions not within her database. The work, created with AI, explores relationships between real and virtual worlds. Hershman Leeson had created an earlier version of Ruby, CyberRoberta, which was a custom-made doll with webcam eyes that interacted with the internet. The work in a gallery provides a screen and a sign inviting gallery-goers to "Chat with Ruby." == Artificial intelligence == In 2015 when Agent Ruby was exhibited at the gallery Modern Art Oxford, a review in Aesthetica Magazine described it as an artificial intelligence agent. A review in New Scientist noted that "Ruby is a fast learner, but perhaps not a natural conversationalist." A 2024 list of "25 Essential AI Artworks" published by ARTnews wrote that while "Agent Ruby's capabilities seem limited by today's standards," it was extensive for its day. == Publications and exhibitions == Agent Ruby was commissioned and displayed at the San Francisco Museum of Modern Art, Modern Art Oxford, and the ZKM Center for Art and Media in Karlsruhe, Germany. The San Francisco Museum of Modern Art (SFMOMA) presented Lynn Hershman Leeson: The Agent Ruby Files, March 30 through June 2, 2013 which presented the project server's archive of user conversations over the 12 years of exhibitions.

    Read more →