AI Coding Meta

AI Coding Meta — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Fragment (computer graphics)

    Fragment (computer graphics)

    In computer graphics, a fragment is the data necessary to generate a single pixel's worth of a drawing primitive in the frame buffer. These data may include, but are not limited to: raster position depth interpolated attributes (color, texture coordinates, etc.) stencil alpha window ID As a scene is drawn, drawing primitives (the basic elements of graphics output, such as points, lines, circles, text etc.) are rasterized into fragments which are textured and combined with the existing frame buffer. How a fragment is combined with the data already in the frame buffer depends on various settings. In a typical case, a fragment may be discarded if it is further away than the pixel which is already at that location (according to the depth buffer). If it is nearer than the existing pixel, it may replace what is already there, or, if alpha blending is in use, the pixel's color may be replaced with a mixture of the fragment's color and the pixel's existing color, as in the case of drawing a translucent object. In general, a fragment can be thought of as the data needed to shade the pixel, plus the data needed to test whether the fragment survives to become a pixel (depth, alpha, stencil, scissor, window ID, etc.). Shading a fragment is done through a fragment shader (or pixel shaders in Direct3D). In computer graphics, a fragment is not necessarily opaque, and could contain an alpha value specifying its degree of transparency. The alpha is typically normalized to the range of [0, 1], with 0 denotes totally transparent and 1 denotes totally opaque. If the fragment is not totally opaque, then part of its background object could show through, which is known as alpha blending.

    Read more →
  • BFR algorithm

    BFR algorithm

    The BFR algorithm, named after its inventors Bradley, Fayyad and Reina, is a variant of k-means algorithm that is designed to cluster data in a high-dimensional Euclidean space. It makes a very strong assumption about the shape of clusters: they must be normally distributed about a centroid. The mean and standard deviation for a cluster may differ for different dimensions, but the dimensions must be independent. In other words, the data must take the shape of axis-aligned ellipses.

    Read more →
  • Top 10 AI Marketing Tools Compared (2026)

    Top 10 AI Marketing Tools Compared (2026)

    Comparing the best AI marketing tool? An AI marketing tool is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI marketing tool slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Optical braille recognition

    Optical braille recognition

    Optical braille recognition is technology to capture and process images of braille characters into natural language characters. It is used to convert braille documents for people who cannot read them into text, and for preservation and reproduction of the documents. == History == In 1984, a group of researchers at the Delft University of Technology designed a braille reading tablet, in which a reading head with photosensitive cells was moved along set of rulers to capture braille text line-by-line. In 1988, a group of French researchers at the Lille University of Science and Technology developed an algorithm, called Lectobraille, which converted braille documents into plain text. The system photographed the braille text with a low-resolution CCD camera, and used spatial filtering techniques, median filtering, erosion, and dilation to extract the braille. The braille characters were then converted to natural language using adaptive recognition. The Lectobraille technique had an error rate of 1%, and took an average processing time of seven seconds per line. In 1993, a group of researchers from the Katholieke Universiteit Leuven developed a system to recognize braille that had been scanned with a commercially available scanner. The system, however, was unable to handle deformities in the braille grid, so well-formed braille documents were required. In 1999, a group at the Hong Kong Polytechnic University implemented an optical braille recognition technique using edge detection to translate braille into English or Chinese text. In 2001, Murray and Dais created a handheld recognition system, that scanned small sections of a document at once. Because of the small area scanned at once, grid deformation was less of an issue, and a simpler, more efficient algorithm was employed. In 2003, Morgavi and Morando designed a system to recognize braille characters using artificial neural networks. This system was noted for its ability to handle image degradation more successfully than other approaches. == Challenges == Many of the challenges to successfully processing braille text arise from the nature of braille documents. Braille is generally printed on solid-color paper, with no ink to produce contrast between the raised characters and the background paper. However, imperfections in the page can appear in a scan or image of the page. Many documents are printed inter-point, meaning they are double-sided. As such, the depressions of the braille of one side appear interlaid with the protruding braille of the other side. == Techniques == Some optical braille recognition techniques attempt to use oblique lighting and a camera to reveal the shadows of the depressions and protrusions of the braille. Others make use of commercially available document scanners.

    Read more →
  • Iubenda

    Iubenda

    iubenda (stylized in lowercase; Italian pronunciation: [juˈbɛnda]) is an Italian software company that develops tools intended to support website and application compliance with data protection and privacy regulations, including consent management platforms. The company was founded in 2011 in Milan by Andrea Giannangelo. In February 2022, the company was acquired by team.blue. == History == iubenda was founded in 2011 in Milan, Italy, initially focusing on automated privacy policy generation. In 2015, the company expanded its services to include cookie compliance tools following the implementation of ePrivacy regulations in Italy. In 2018, following the introduction of the General Data Protection Regulation (GDPR) in the European Union, iubenda expanded its products to include consent management and compliance documentation services. In February 2022, iubenda was acquired by team.blue, which obtained a majority stake in the company. Italian media described the acquisition as one of the largest Italian technology startup exits in recent years. In October 2022, iubenda acquired consentmanager, a Sweden-based consent management provider. In 2025, the company acquired CookieFirst, a Netherlands-based consent management platform. In 2025, iubenda partnered with AccessiWay, a digital accessibility company owned by team.blue. == Activities == iubenda develops software tools intended to support compliance with data protection and privacy regulations. Its products include generators for privacy policies, cookie banners, terms and conditions documents, and consent management platforms. The company’s consent management platform integrates with frameworks used for online advertising and privacy compliance, including Google's Consent Mode. The platform is designed to support compliance with regulatory frameworks including the GDPR in the European Union, the UK GDPR, Brazil’s LGPD, Switzerland’s FADP and privacy laws in the United States. Its tools can be integrated with content management systems, web applications, and other digital platforms, including WordPress. The company operates internationally, with a customer base of more than 150,000 organisations, primarily in Europe and the Americas.

    Read more →
  • Co-Büchi automaton

    Co-Büchi automaton

    In automata theory, a co-Büchi automaton is a variant of Büchi automaton. The only difference is the accepting condition: a Co-Büchi automaton accepts an infinite word w {\displaystyle w} if there exists a run, such that all the states occurring infinitely often in the run are in the final state set F {\displaystyle F} . In contrast, a Büchi automaton accepts a word w {\displaystyle w} if there exists a run, such that at least one state occurring infinitely often in the final state set F {\displaystyle F} . (Deterministic) Co-Büchi automata are strictly weaker than (nondeterministic) Büchi automata. == Formal definition == Formally, a deterministic co-Büchi automaton is a tuple A = ( Q , Σ , δ , q 0 , F ) {\displaystyle {\mathcal {A}}=(Q,\Sigma ,\delta ,q_{0},F)} that consists of the following components: Q {\displaystyle Q} is a finite set. The elements of Q {\displaystyle Q} are called the states of A {\displaystyle {\mathcal {A}}} . Σ {\displaystyle \Sigma } is a finite set called the alphabet of A {\displaystyle {\mathcal {A}}} . δ : Q × Σ → Q {\displaystyle \delta :Q\times \Sigma \rightarrow Q} is the transition function of A {\displaystyle {\mathcal {A}}} . q 0 {\displaystyle q_{0}} is an element of Q {\displaystyle Q} , called the initial state. F ⊆ Q {\displaystyle F\subseteq Q} is the final state set. A {\displaystyle {\mathcal {A}}} accepts exactly those words w {\displaystyle w} with the run ρ ( w ) {\displaystyle \rho (w)} , in which all of the infinitely often occurring states in ρ ( w ) {\displaystyle \rho (w)} are in F {\displaystyle F} . In a non-deterministic co-Büchi automaton, the transition function δ {\displaystyle \delta } is replaced with a transition relation Δ {\displaystyle \Delta } . The initial state q 0 {\displaystyle q_{0}} is replaced with an initial state set Q 0 {\displaystyle Q_{0}} . Generally, the term co-Büchi automaton refers to the non-deterministic co-Büchi automaton. For more comprehensive formalism see also ω-automaton. == Acceptance Condition == The acceptance condition of a co-Büchi automaton is formally ∃ i ∀ j : j ≥ i ρ ( w j ) ∈ F . {\displaystyle \exists i\forall j:\;j\geq i\quad \rho (w_{j})\in F.} The Büchi acceptance condition is the complement of the co-Büchi acceptance condition: ∀ i ∃ j : j ≥ i ρ ( w j ) ∈ F . {\displaystyle \forall i\exists j:\;j\geq i\quad \rho (w_{j})\in F.} == Properties == Co-Büchi automata are closed under union, intersection, projection and determinization.

    Read more →
  • Top 10 AI Customer-support Bots Compared (2026)

    Top 10 AI Customer-support Bots Compared (2026)

    Trying to pick the best AI customer-support bot? An AI customer-support bot is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI customer-support bot slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Xu Li (computer scientist)

    Xu Li (computer scientist)

    Xu Li is a Chinese computer scientist and co-founder and current CEO of SenseTime, an artificial intelligence (AI) company. Xu has led SenseTime since the company's incorporation and helped it independently develop its proprietary deep learning platform. == Education and research == Xu obtained both his bachelor's and master's degrees in computer science from Shanghai Jiao Tong University. He received his doctorate in computer science from the Chinese University of Hong Kong. Xu has published more than 50 papers at international conferences and in journals in the field of computer vision and won the Best Paper Award at the international conference on Non-Photorealistic Rendering and Animation (NPAR) 2012 and the Best Reviewer Award at the international conferences Asian Conference on Computer Vision ACCV 2012 and International Conference on Computer Vision (ICCV) 2015. He has three algorithms that have been included into the visual open-source platform OpenCV, and his "L0 Smoothing" algorithm garnered the most citations in research papers over a span of five years (2011–2015) within the ACM Transactions on Graphics (TOG), a scientific journal that Thomson Reuters InCites has placed first among software engineering journals. == Career == Previously, Xu worked at Lenovo Corporate Research & Development. He was also a visiting researcher at Motorola China R&D Institute, Omron Research Institute, and Microsoft Research. == Selected publications == Jimmy Ren, Xiaohao Chen, Jianbo Liu, Wenxiu Sun, Li Xu, Jiahao Pang, Qiong Yan, Yu-wing Tai, "Accurate Single Stage Detector Using Recurrent Rolling Convolution", (CVPR), 2017. Jimmy SJ. Ren, Yongtao Hu, Yu-Wing Tai, Chuan Wang, Li Xu, Wenxiu Sun, Qiong Yan, "Look, Listen and Learn – A Multimodal LSTM for Speaker Identification", The 30th AAAI Conference on Artificial Intelligence (AAAI), 2016 Jimmy SJ. Ren, Li Xu, Qiong Yan, Wenxiu Sun, "Shepard Convolutional Neural Networks" Advances in Neural Information Processing Systems (NIPS), 2015. Xiaoyong Shen, Chao Zhou, Li Xu, Jiaya Jia, "Mutual-Structure for Joint Filtering" International Conference on Computer Vision (ICCV), (oral presentation), 2015. Jianping Shi, Qiong Yan, Li Xu, Jiaya Jia, "Hierarchical Image Saliency Detection on Extended CSSD" IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015. Jianping Shi, Xin Tao, Li Xu, Jiaya Jia, "Break Ames Room Illusion: Depth from General Single Images" ACM Transactions on Graphics (TOG), (Proc. ACM SIGGRAPH ASIA2015). Yongtao Hu, Jimmy SJ. Ren, Jingwen Dai, Chang Yuan, Li Xu, Wenping Wang, "Deep Multimodal Speaker Naming" ACM International Conference on Multimedia (MM), 2015. Li Xu, Jimmy SJ. Ren, Qiong Yan, Renjie Liao, Jiaya Jia "Deep Edge-Aware Filters" International Conference on Machine Learning (ICML), 2015. Jianping Shi, Li Xu, Jiaya Jia "Just Noticeable Defocus Blur Detection and Estimation" IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. Ziyang Ma, Renjie Liao, Xin Tao, Li Xu, Jiaya Jia, Enhua Wu "Handling Motion Blur in Multi-Frame Super-Resolution" IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. Xiaoyong Shen, Qiong Yan, Li Xu, Lizhuang Ma, Jiaya Jia"Multispectral Joint Image Restoration via Optimizing a Scale Map" IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2015. Jimmy SJ. Ren, Li Xu, "On Vectorization of Deep Convolutional Neural Networks for Vision Tasks" AAAI Conference on Artificial Intelligence (AAAI), 2015. == Awards and honors == Xu was ranked 7th in Fortune magazine's 2018 edition of its 40 Under 40. He was also named "China's Outstanding AI Industry Leader" by The Economic Observer, received the "Innovative Business Leader" Award under NetEase's "Future Technology Talent Awards", and was honored as Sina's "2017 Top Ten Economic Figures". In 2018, Xu was named EY's "Entrepreneur of the Year China" in the Technology category.

    Read more →
  • Noom

    Noom

    Noom is an American privately held digital health company that provides weight management and behavioral health services through a subscription-based mobile application. Founded in 2008, the company combines behavior change psychology with access to weight loss medications and dietary supplements. The platform incorporates elements of cognitive behavioral therapy (CBT) and goal-setting strategies, and its programs are designed to support users in developing healthier habits. In addition to its weight management services, Noom has expanded to offer products related to stress management and general wellness. Noom has received both praise and criticism. Supporters cite its focus on mental and behavioral aspects of health, while critics have raised concerns about the accuracy of its calorie goals, the use of algorithmically determined weight loss targets, and questions about the qualifications of some of its coaching staff. == History == Noom was founded in 2008 by friends Artem Petakov and Saeju Jeong. The company's mobile app officially launched in 2016. In 2025, Noom relocated its headquarters from New York City to Princeton, New Jersey. Petakov, a former software engineer at Google, currently leads Noom Ventures, while Jeong serves as Noom's Chairman. In 2023, Geoff Cook was appointed CEO of Noom. In 2019, Noom partnered with Novo Nordisk to offer patients prescribed the diabetes medication Saxenda one year of free access to the Noom platform. In 2020, Noom reported $400 million in revenue. As of April 2021, the company stated it employed approximately 3,000 people, including 2,700 coaches. == Services == === Noom App === The Noom app is the primary platform through which users engage with the company's services. Upon creating an account, users are prompted to provide physical information such as weight, height, and age, along with experiential data including lifestyle habits, personal goals, and perceived obstacles. Users log their meals and physical activity, and in return, the app delivers feedback through multiple channels: algorithmically generated insights, guidance from a human coach, peer interaction, educational articles, and interactive quizzes. The app has been reviewed by a range of media outlets, including newspapers such as the Chicago Tribune and USA Today; health information sources such as WebMD; and lifestyle magazines including Good Housekeeping. === Other services === In 2024, Noom launched Noom Vibe, a mobile application that encourages users to develop healthy habits by awarding "vibes"—a form of points—for activities such as walking or meeting step goals. That same year, Noom introduced a 3D body scanning feature within its app, designed to help users monitor physical changes and prevent muscle atrophy during weight loss. Also in 2024, Noom began offering a compounded GLP-1 medication as part of its weight management program. The formulation includes the same active ingredient found in the anti-obesity medications Wegovy and Ozempic. == Research == In 2016, a study published in Scientific Reports analyzed data from approximately 36,000 users of the Noom app, of whom 78% were female and 22% male. The data were collected between October 2012 and April 2014. To be included in the analysis, users had to log their weight at least twice per month over a period of six consecutive months. The study found that 78% of participants self-reported weight loss while using the app. The median duration of weight reporting was 267 days (approximately nine months). The frequency of data logging was positively correlated with weight loss. Additionally, male users had a higher average starting BMI and reported greater average weight loss compared to female users. In 2017, the Centers for Disease Control and Prevention (CDC) recognized Noom as a certified diabetes prevention program, making it the first mobile health application to receive such designation. == Criticisms == === Health programs === Noom has been criticized for promoting elements of diet culture in its advertising campaigns. The app has also faced criticism for setting calorie goals that some users and experts have deemed inappropriately low, and for employing coaches who may lack formal qualifications as registered dietitians. Coaching has been described as relying heavily on canned responses. Upon sign-up, users are prompted to complete a questionnaire consisting of over 50 questions, which is used to generate a personalized program. In 2021, the UK-based organization Privacy International alleged that Noom, along with other diet platforms, used such lengthy surveys to attract users but did not always tailor the resulting programs to the collected data. The organization claimed that many users received the same or highly similar programs regardless of their answers. It also raised concerns about the handling of potentially sensitive health data, alleging a lack of transparency regarding the sharing of such data with third parties, including Facebook, potentially in violation of the European General Data Protection Regulation (GDPR). In a follow-up investigation in 2023, Privacy International reported that Noom had made "significant positive changes" to its data handling practices. However, the organization noted that data was still being shared with Facebook and concluded that "there is still room for improvement." === Billing issues lawsuit === In August 2020, the Better Business Bureau (BBB) issued a warning to consumers regarding Noom's subscription practices. The BBB reported that numerous customers had filed complaints about difficulties canceling their subscriptions after the free trial period, as well as challenges in contacting the company to request refunds. In February 2022, Noom agreed to a $62 million settlement in a class-action lawsuit that alleged the company had used deceptive billing practices related to automatic subscription renewals. Qualifying claimants received approximately $167 each. During the case, a former senior software engineer at Noom testified that the cancellation process was intentionally designed to be difficult, with the goal of generating revenue from customers who failed to cancel in time. In response, Noom stated that it had taken steps to improve transparency around its pricing and policies, including the implementation of self-service cancellation tools.

    Read more →
  • Generalized nondeterministic finite automaton

    Generalized nondeterministic finite automaton

    In the theory of computation, a generalized nondeterministic finite automaton (GNFA), also known as an expression automaton or a generalized nondeterministic finite state machine, is a variation of a nondeterministic finite automaton (NFA) where each transition is labeled with any regular expression. The GNFA reads blocks of symbols from the input which constitute a string as defined by the regular expression on the transition. There are several differences between a standard finite state machine and a generalized nondeterministic finite state machine. A GNFA must have only one start state and one accept state, and these cannot be the same state, whereas an NFA or DFA both may have several accept states, and the start state can be an accept state. A GNFA must have only one transition between any two states, whereas a NFA or DFA both allow for numerous transitions between states. In a GNFA, a state has a single transition to every state in the machine, although often it is a convention to ignore the transitions that are labelled with the empty set when drawing generalized nondeterministic finite state machines. == Formal definition == A GNFA can be defined as a 5-tuple, (S, Σ, T, s, a), consisting of a finite set of states (S); a finite set called the alphabet (Σ); a transition function (T : (S ∖ {\displaystyle \setminus } {a}) × (S ∖ {\displaystyle \setminus } {s}) → R); a start state (s ∈ S); an accept state (a ∈ S); where R is the collection of all regular expressions over the alphabet Σ. The transition function takes as its argument a pair of two states and outputs a regular expression (the label of the transition). This differs from other finite state machines, which take as input a single state and an input from the alphabet (or the empty string in the case of nondeterministic finite state machines) and outputs the next state (or the set of possible states in the case of nondeterministic finite state machines). A DFA or NFA can easily be converted into a GNFA and then the GNFA can be easily converted into a regular expression by repeatedly collapsing parts of it to single edges until S = {s, a}. Similarly, GNFAs can be reduced to NFAs by changing regular expression operators into new edges until each edge is labelled with a regular expression matching a single string of length at most 1. NFAs, in turn, can be reduced to DFAs using the powerset construction. This shows that GNFAs recognize the same set of formal languages as DFAs and NFAs.

    Read more →
  • Best AI Paraphrasing Tools in 2026

    Best AI Paraphrasing Tools in 2026

    Curious about the best AI paraphrasing tool? An AI paraphrasing tool is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI paraphrasing tool slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Best AI Subtitle Generators in 2026

    Best AI Subtitle Generators in 2026

    Comparing the best AI subtitle generator? An AI subtitle generator is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI subtitle generator slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • Underwater computer vision

    Underwater computer vision

    Underwater computer vision is a subfield of computer vision. In recent years, with the development of underwater vehicles ( ROV, AUV, gliders), the need to be able to record and process huge amounts of information has become increasingly important. Applications range from inspection of underwater structures for the offshore industry to the identification and counting of fishes for biological research. However, no matter how big the impact of this technology can be to industry and research, it still is in a very early stage of development compared to traditional computer vision. One reason for this is that, the moment the camera goes into the water, a whole new set of challenges appear. On one hand, cameras have to be made waterproof, marine corrosion deteriorates materials quickly and access and modifications to experimental setups are costly, both in time and resources. On the other hand, the physical properties of the water make light behave differently, changing the appearance of a same object with variations of depth, organic material, currents, temperature etc. == Applications == Seafloor survey Vehicle navigation and positioning Biological monitoring {possibly aquatic biomonitoring) Video mosaics as visual navigation maps Submarine pipeline inspection Wreckage visualization Maintenance of underwater structures Drowning detection systems == Medium differences == === Illumination === In air, light comes from the whole hemisphere on cloudy days, and is dominated by the sun. In water direct lighting comes from a cone about 96° wide above the scene. This phenomenon is called Snell's window. Artificial lighting can be used where natural light levels are insufficient and where the light path is too long to produce acceptable colour, as the loss of colour is a function of the total distance through water from the source to the camera lens port. === Light attenuation === Unlike air, water attenuates light exponentially. This results in hazy images with very low contrast. The main reasons for light attenuation are light absorption (where energy is removed from the light) and light scattering, by which the direction of light is changed. Light scattering can further be divided into forward scattering, which results in an increased blurriness and backward scattering that limits the contrast and is responsible for the characteristic veil of underwater images. Both scattering and attenuation are heavily influenced by the amount of organic matter dissolved or suspended in the water. Light attenuation in water is also a function of the wavelength. This means that different colours are attenuated at different rates, leading to colour degradation.with depth and distance. Red and orange light are attenuated faster, followed by yellows and greens. Blue is the least attenuated visible wavelength. === Artificial lighting === == Challenges == In high level computer vision, human structures are frequently used as image features for image matching in different applications. However, the sea bottom lacks such features, making it hard to find correspondences in two images. In order to be able to use a camera in the water, a watertight housing is required. However, refraction will happen at the water-glass and glass-air interface due to differences in density of the materials. This has the effect of introducing a non-linear image deformation. The motion of the vehicle presents another special challenge. Underwater vehicles are constantly moving due to currents and other phenomena. This introduces another uncertainty to algorithms, where small motions may appear in all directions. This can be specially important for video tracking. In order to reduce this problem image stabilization algorithms may be applied. == Relevant technology == === Image restoration === Image restoration< techniques are intended to model the degradation process and then invert it, obtaining the new image after solving. It is generally a complex approach that requires plenty of parameters that vary a lot between different water conditions. === Image enhancement === Image enhancement only tries to provide a visually more appealing image without taking the physical image formation process into account. These methods are usually simpler and less computational intensive. === Color correction === Various algorithms exist that perform automatic color correction. The UCM (Unsupervised Color Correction Method), for example, does this in the following steps: It firstly reduces the color cast by equalizing the color values. Then it enhances contrast by stretching the red histogram towards the maximum and finally saturation and intensity components are optimized. == Underwater stereo vision == It is usually assumed that stereo cameras have been calibrated previously, geometrically and radiometrically. This leads to the assumption that corresponding pixels should have the same color. However this can not be guaranteed in an underwater scene, because of dispersion and backscatter. However, it is possible to digitally model this phenomenon and create a virtual image with those effects removed == Other application fields == Imaging sonars have become more and more accessible and gained resolution, delivering better images. Sidescan sonars are used to produce complete maps of regions of the sea floor stitching together sequences of sonar images. However, sonar images often lack proper contrast and are degraded by artefacts and distortions due to noise, attitude changes of the AUV/ROV carrying the sonar or non uniform beam patterns. Another common problem with sonar computer vision is the comparatively low frame rate of sonar images.

    Read more →
  • Deepset

    Deepset

    deepset is an enterprise software vendor that provides developers with the tools to build production-ready Artificial Intelligence (AI) and natural language processing (NLP) systems, using architectures such as agents, retrieval augmented generation (RAG) and multimodal AI. It was founded in 2018 in Berlin by Milos Rusic, Malte Pietsch, and Timo Möller. deepset authored and maintains the open source software Haystack and its commercial SaaS and self-hosted (VPC, on-prem, air gapped) offering, Haystack Enterprise Platform. (formerly known as deepset Cloud and deepset AI Platform) == History == In June 2018, Milos Rusic, Malte Pietsch, and Timo Möller co-founded deepset in Berlin, Germany. In the same year, the company served first customers who wanted to implement NLP services by tailoring BERT language models to their domain. In July 2019, the company released the initial version of the open source software FARM. In November 2019, the company released the initial version of the open source software Haystack. Throughout 2020 and 2021 deepset published several applied research papers at EMNLP, COLING and ACL, the leading conferences in the area of NLP. In 2020, the research contributions comprised German language models named GBERT and GELECTRA, and a question answering dataset addressing the COVID-19 pandemic called COVID-QA, which was created in collaboration with Intel and has been annotated by biomedical experts. In 2021, the research contributions comprised German models and datasets for question answering and passage retrieval named GermanQuAD and GermanDPR, a semantic answer similarity metric, and an approach for multimodal retrieval of texts and tables to enable question answering on tabular data. Haystack contains implementations of all three contributions, enabling the use of the research through the open source framework. In November 2021, the development of the FARM framework was discontinued and its main features were integrated into the Haystack framework. In April 2022, the company announced its commercial SaaS offering deepset Cloud, which was rebranded in 2025 as Haystack Enterprise Platform supporting SaaS and on-premise deployment options. As of August 2023, the most popular finetuned language model created by deepset was downloaded more than 52 million times. In 2024, deepset was named a Gartner Cool Vendor in AI Engineering. In 2025, deepset was recognized for its growth by WirtschaftsWoche and Sifted and shared partnership integrations and announcements with Meta Llama Stack, MongoDB, NVIDIA, Amazon Web Services (AWS), and PwC. As of September 2025, the Haystack open source AI orchestration framework has more than 24,000 GitHub stars. == Products and applications == Haystack is an open source Python AI Orchestration framework for building custom AI agents and applications with large language models. With its modular building block components, software developers and AI engineers can implement pipelines to build and customize various AI architectures over large document and multimodal data collections, such as agents, retrieval augmented generation (RAG), intelligent document processing (IDP), text-to-SQL as well as document retrieval, semantic search, text generation, question answering, or summarization. Haystack emphasizes context engineering, an approach to AI system design that focuses on explicit control over how contextual information is retrieved, structured, routed to language models, and evaluated after generation. This allows developers to build AI systems with transparent data flow, tool usage, and configurable reasoning processes. Haystack integrates with 90+ model and technology providers including Hugging Face Transformers, Elasticsearch, OpenSearch, OpenAI, Cohere, Anthropic, Mistral and others. Developers can extend these integrations with their own custom components. The framework has an active community on Discord with more than 4k members and GitHub, where so far more than 300 people have contributed to its continuous development, and engage on Meetup. Thousands of organizations use the framework, including public sector leaders like the European Commission and Global 500 enterprises like Airbus, Intel, NVIDIA, Lufthansa, Netflix, Apple, Infineon, Alcatel-Lucent Enterprise, BetterUp, Etalab, Sooth.ai, and Lego. On top of the Haystack open source framework, deepset offers two enterprise offerings to organizations. Haystack Enterprise Starter provides enterprise support on the open source framework from the Haystack engineering team as well as a private GitHub repository with production use case templates and Kubernetes deployment guides. The Haystack Enterprise Platform supports customers at building scalable AI applications by covering the entire process of prototyping, experimentation, deployment, monitoring, and governance. It is built on the Haystack open source framework and is available for hosting in the cloud and self-hosted via VPC, on-premise, or air gapped environments. deepset's enterprise tools are used by organizations including The European Commission, The Economist, Oxford University Press, the German Federal Ministry of Research, Technology, and Space (BMFTR), Manz Verlag, and the German Armed Forces. FARM was an earlier framework for adapting representation models. One of its core concepts was the implementation of adaptive models, which comprised language models and an arbitrary number of prediction heads. FARM supported domain-adaptation and finetuning of these models with advanced options, for example gradient accumulation, cross-validation or automatic mixed-precision training. Its main features were integrated into Haystack in November 2021, and its development was discontinued at that time. == Funding == On August 9, 2023, deepset announced a Series B investment round of $30 million led by Balderton Capital and including participation from existing investors GV, System.One, Lunar Ventures and Harpoon Ventures. On April 28, 2022, deepset announced a Series A investment round of $14 million led by GV, with the participation of Harpoon Ventures, Acequia Capital and a team of experienced commercial open source software and machine learning founders, such as Alex Ratner (Snorkel AI), Mustafa Suleyman (Deepmind), Spencer Kimball (Cockroach Labs), Jeff Hammerbacher (Cloudera) and Emil Eifrem (Neo4j). A previous pre-seed investment round of $1.6 million on March 8, 2021, was led by System.One and Lunar Ventures, who also participated in the subsequent Series A round.

    Read more →
  • Noisy channel model

    Noisy channel model

    The noisy channel model is a framework used in spell checkers, question answering, speech recognition, and machine translation. In this model, the goal is to find the intended word given a word where the letters have been scrambled in some manner. == In spell-checking == See Chapter B of. Given an alphabet Σ {\displaystyle \Sigma } , let Σ ∗ {\displaystyle \Sigma ^{}} be the set of all finite strings over Σ {\displaystyle \Sigma } . Let the dictionary D {\displaystyle D} of valid words be some subset of Σ ∗ {\displaystyle \Sigma ^{}} , i.e., D ⊆ Σ ∗ {\displaystyle D\subseteq \Sigma ^{}} . The noisy channel is the matrix Γ w s = Pr ( s | w ) {\displaystyle \Gamma _{ws}=\Pr(s|w)} , where w ∈ D {\displaystyle w\in D} is the intended word and s ∈ Σ ∗ {\displaystyle s\in \Sigma ^{}} is the scrambled word that was actually received. The goal of the noisy channel model is to find the intended word given the scrambled word that was received. The decision function σ : Σ ∗ → D {\displaystyle \sigma :\Sigma ^{}\to D} is a function that, given a scrambled word, returns the intended word. Methods of constructing a decision function include the maximum likelihood rule, the maximum a posteriori rule, and the minimum distance rule. In some cases, it may be better to accept the scrambled word as the intended word rather than attempt to find an intended word in the dictionary. For example, the word schönfinkeling may not be in the dictionary, but might in fact be the intended word. === Example === Consider the English alphabet Σ = { a , b , c , . . . , y , z , A , B , . . . , Z , . . . } {\displaystyle \Sigma =\{a,b,c,...,y,z,A,B,...,Z,...\}} . Some subset D ⊆ Σ ∗ {\displaystyle D\subseteq \Sigma ^{}} makes up the dictionary of valid English words. There are several mistakes that may occur while typing, including: Missing letters, e.g., leter instead of letter Accidental letter additions, e.g., misstake instead of mistake Swapping letters, e.g., recieved instead of received Replacing letters, e.g., fimite instead of finite To construct the noisy channel matrix Γ {\displaystyle \Gamma } , we must consider the probability of each mistake, given the intended word ( Pr ( s | w ) {\displaystyle \Pr(s|w)} for all w ∈ D {\displaystyle w\in D} and s ∈ Σ ∗ {\displaystyle s\in \Sigma ^{}} ). These probabilities may be gathered, for example, by considering the Damerau–Levenshtein distance between s {\displaystyle s} and w {\displaystyle w} or by comparing the draft of an essay with one that has been manually edited for spelling. == In machine translation == One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: 'This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode. See chapter 1, and chapter 25 of. Suppose we want to translate a foreign language to English, we could model P ( E | F ) {\displaystyle P(E|F)} directly: the probability that we have English sentence E given foreign sentence F, then we pick the most likely one E ^ = arg ⁡ max E P ( E | F ) {\displaystyle {\hat {E}}=\arg \max _{E}P(E|F)} . However, by Bayes law, we have the equivalent equation: E ^ = argmax E ∈ English P ( F ∣ E ) ⏞ translation model P ( E ) ⏞ language model {\displaystyle {\hat {E}}={\underset {E\in {\text{ English }}}{\operatorname {argmax} }}\overbrace {P(F\mid E)} ^{\text{translation model }}\overbrace {P(E)} ^{\text{language model}}} The benefit of the noisy-channel model is in terms of data: If collecting a parallel corpus is costly, then we would have only a small parallel corpus, so we can only train a moderately good English-to-foreign translation model, and a moderately good foreign-to-English translation model. However, we can collect a large corpus in the foreign language only, and a large corpus in the English language only, to train two good language models. Combining these four models, we immediately get a good English-to-foreign translator and a good foreign-to-English translator. The cost of noisy-channel model is that using Bayesian inference is more costly than using a translation model directly. Instead of reading out the most likely translation by arg ⁡ max E P ( E | F ) {\displaystyle \arg \max _{E}P(E|F)} , it would have to read out predictions by both the translation model and the language model, multiply them, and search for the highest number. == In speech recognition == Speech recognition can be thought of as translating from a sound-language to a text-language. Consequently, we have T ^ = argmax T ∈ Text P ( S ∣ T ) ⏞ speech model P ( T ) ⏞ language model {\displaystyle {\hat {T}}={\underset {T\in {\text{ Text }}}{\operatorname {argmax} }}\overbrace {P(S\mid T)} ^{\text{speech model }}\overbrace {P(T)} ^{\text{language model}}} where P ( S | T ) {\displaystyle P(S|T)} is the probability that a speech sound S is produced if the speaker is intending to say text T. Intuitively, this equation states that the most likely text is a text that's both a likely text in the language, and produces the speech sound with high probability. The utility of the noisy-channel model is not in capacity. Theoretically, any noisy-channel model can be replicated by a direct P ( T | S ) {\displaystyle P(T|S)} model. However, the noisy-channel model factors the model into two parts which are appropriate for the situation, and consequently it is generally more well-behaved. When a human speaks, it does not produce the sound directly, but first produces the text it wants to speak in the language centers of the brain, then the text is translated into sound by the motor cortex, vocal cords, and other parts of the body. The noisy-channel model matches this model of the human, and so it is appropriate. This is justified in the practical success of noisy-channel model in speech recognition. === Example === Consider the sound-language sentence (written in IPA for English) S = aɪ wʊd laɪk wʌn tuː. There are three possible texts T 1 , T 2 , T 3 {\displaystyle T_{1},T_{2},T_{3}} : T 1 = {\displaystyle T_{1}=} I would like one to. T 2 = {\displaystyle T_{2}=} I would like one too. T 3 = {\displaystyle T_{3}=} I would like one two. that are equally likely, in the sense that P ( S | T 1 ) = P ( S | T 2 ) = P ( S | T 3 ) {\displaystyle P(S|T_{1})=P(S|T_{2})=P(S|T_{3})} . With a good English language model, we would have P ( T 2 ) > P ( T 1 ) > P ( T 3 ) {\displaystyle P(T_{2})>P(T_{1})>P(T_{3})} , since the second sentence is grammatical, the first is not quite, but close to a grammatical one (such as "I would like one to [go]."), while the third one is far from grammatical. Consequently, the noisy-channel model would output T 2 {\displaystyle T_{2}} as the best transcription.

    Read more →