Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information from external data sources. With RAG, LLMs first refer to a specified set of documents, then respond to user queries. These documents supplement information from the LLM's pre-existing training data. This allows LLMs to use domain-specific and/or updated information that is not available in the training data. For example, this enables LLM-based chatbots to access internal company data or generate responses based on authoritative sources. RAG improves LLMs by incorporating information retrieval before generating responses. Unlike LLMs that rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources. According to Ars Technica, "RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process to help LLMs stick to the facts." This method helps reduce AI hallucinations, which have caused chatbots to describe policies that don't exist, or recommend nonexistent legal cases to lawyers that are looking for citations to support their arguments. RAG also reduces the need to retrain LLMs with new data, saving on computational and financial costs. Beyond efficiency gains, RAG also allows LLMs to include sources in their responses, so users can verify the cited sources. This provides greater transparency, as users can cross-check retrieved content to ensure accuracy and relevance. The term retrieval-augmented generation (RAG) was introduced in a 2020 paper that described combining a parametric language model with a non-parametric external memory accessed through retrieval at inference time. == RAG and LLM limitations == LLMs can provide incorrect information. 
For example, when Google first demonstrated its LLM tool "Google Bard" (later re-branded to Gemini), the LLM provided incorrect information about the James Webb Space Telescope. This error contributed to a $100 billion decline in Google's stock value. RAG is used to prevent these errors, but it does not solve all the problems. For example, LLMs can generate misinformation even when pulling from factually correct sources if they misinterpret the context. MIT Technology Review gives the example of an AI-generated response stating, "The United States has had one Muslim president, Barack Hussein Obama." The model retrieved this from an academic book rhetorically titled Barack Hussein Obama: America's First Muslim President? The LLM did not "know" or "understand" the context of the title, generating a false statement. LLMs with RAG are programmed to prioritize new information. This technique has been called "prompt stuffing." Without prompt stuffing, the LLM's input is generated by a user; with prompt stuffing, additional relevant context is added to this input to guide the model's response. This approach provides the LLM with key information early in the prompt, encouraging it to prioritize the supplied data over pre-existing training knowledge. == Process == Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating an information-retrieval mechanism that allows models to access and utilize additional data beyond their original training set. Ars Technica notes that "when new information becomes available, rather than having to retrain the model, all that's needed is to augment the model's external knowledge base with the updated information" ("augmentation"). IBM states that "in the generative phase, the LLM draws from the augmented prompt and its internal representation of its training data to synthesize" an answer. 
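The augmented prompt described above can be illustrated with a short sketch. The function name `build_augmented_prompt`, the instruction wording, and the numbered-source layout are assumptions made for illustration; the sources quoted here do not prescribe a specific prompt format.

```python
def build_augmented_prompt(question: str, passages: list[str]) -> str:
    """Place retrieved passages ahead of the user's question ("prompt stuffing").

    Hypothetical helper: the template below is one plausible layout,
    not a format taken from any particular RAG system.
    """
    # Number each passage so the model can refer to its sources in the answer.
    context = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(passages, start=1)
    )
    # Supplied context comes first, so the model sees it before the question.
    return (
        "Answer the question using only the sources below. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    retrieved = [
        "The returns policy allows refunds within thirty days of purchase.",
        "Refunds are issued to the original payment method.",
    ]
    print(build_augmented_prompt("How long do I have to return an item?", retrieved))
```

Numbering the passages is a design choice that makes it easier for the generated answer to cite which source supports which statement.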
=== RAG key stages === Typically, the data to be referenced is converted into LLM embeddings, numerical representations in the form of a large vector space. RAG can be used on unstructured (usually text), semi-structured, or structured data (for example knowledge graphs). These embeddings are then stored in a vector database to allow for document retrieval. Given a user query, a document retriever is first called to select the most relevant documents that will be used to augment the query. This comparison can be done using a variety of methods, which depend in part on the type of indexing used. The model feeds this relevant retrieved information into the LLM via prompt engineering of the user's original query. Newer implementations (as of 2023) can also incorporate specific augmentation modules with abilities such as expanding queries into multiple domains and using memory and self-improvement to learn from previous retrievals. Finally, the LLM can generate output based on both the query and the retrieved documents. Some models incorporate extra steps to improve output, such as the re-ranking of retrieved information, context selection, and fine-tuning. == Applications == Retrieval-augmented generation is used in applications where generated responses need to be grounded in external or frequently updated information. Commonly cited use cases include search engines, question-answering systems, customer support chatbots, enterprise knowledge assistants, content generation, recommendation systems, retail and e-commerce, and industrial or manufacturing workflows. In healthcare, RAG has been studied as a way to ground large language model outputs in external medical knowledge sources, although reviews have noted continuing challenges around evaluation, ethics, and clinical reliability. == Improvements == Improvements to the basic process above can be applied at different stages in the RAG flow. 
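The retrieval step of the basic process can be sketched as follows. This is a toy illustration under stated assumptions: `embed` uses word counts as a stand-in for learned LLM embeddings, an in-memory list stands in for a vector database, and the names `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a sparse bag-of-words count vector (assumption,
    standing in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Return the k documents whose vectors are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


if __name__ == "__main__":
    documents = [
        "The warranty covers battery replacement for two years.",
        "Returns are accepted within thirty days of purchase.",
        "Our office is closed on public holidays.",
    ]
    # Indexing: embed every document once and keep the vectors alongside the text.
    index = [(doc, embed(doc)) for doc in documents]
    print(retrieve("How long is the battery warranty?", index, k=1))
```

The retrieved documents would then be combined with the user's query and passed to the LLM for generation. A production system would typically replace the exhaustive sort with an approximate nearest neighbor index.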
=== Encoder === These methods focus on the encoding of text as either dense or sparse vectors. Sparse vectors, which encode the identity of a word, are typically dictionary-length and contain mostly zeros. Dense vectors, which encode meaning, are more compact and contain fewer zeros. Several enhancements target the way similarities are calculated in the vector stores (databases). Dot products provide a fast similarity score, while approximate nearest neighbor (ANN) searches improve retrieval efficiency over exact K-nearest neighbors (KNN) searches. Accuracy may be improved with late interactions, which allow the system to compare words more precisely after retrieval. This helps refine document ranking and improve search relevance. Hybrid vector approaches may be used to combine dense vector representations with sparse one-hot vectors, taking advantage of the computational efficiency of sparse dot products over dense vector operations. Other retrieval techniques focus on improving accuracy by refining how documents are selected. Some retrieval methods combine sparse representations, such as SPLADE, with query expansion strategies to improve search accuracy and recall. === Retriever-centric methods === These methods aim to enhance the quality of document retrieval in vector databases: Pre-training the retriever using the Inverse Cloze Task (ICT), a technique that helps the model learn retrieval patterns by predicting masked text within documents. Supervised retriever optimization aligns retrieval probabilities with the generator model's likelihood distribution. This involves retrieving the top-k vectors for a given prompt, scoring the generated response's perplexity, and minimizing the KL divergence between the retriever's selections and the model's likelihoods to refine retrieval.
Reranking techniques can refine retriever performance by prioritizing the most relevant retrieved documents during training. === Language model === By redesigning the language model with the retriever in mind, a network 25 times smaller can achieve perplexity comparable to that of its much larger counterparts. Because it is trained from scratch, this method (Retro) incurs the high cost of training runs that the original RAG scheme avoided. The hypothesis is that by supplying domain knowledge during training, Retro needs to devote less capacity to the domain and can use its smaller set of weights for language semantics alone. It has been reported that Retro is not reproducible, so modifications were made to address this. The more reproducible version is called Retro++ and includes in-context RAG. === Chunking === Chunking involves various strategies for breaking the data into chunks that are embedded as vectors, so the retriever can find details in it. Three types of chunking strategies are: Fixed length with overlap. This is fast and easy. Overlapping consecutive chunks helps to maintain semantic context across chunks. Syntax-based chunking. This can break the document up into sentences. Libraries such as spaCy or NLTK can help. File format-based chunking. Certain file types have natural chunks built in, and it's best to respect them. For example, code files are best chunked and vectorized as whole functions or classes. HTML files should leave
or base64 encoded
elements
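The fixed-length strategy with overlap described in the chunking section can be sketched as below. This is a minimal illustration, assuming character-based lengths; the function name `chunk_fixed` and its default sizes are hypothetical, and real systems often measure chunks in tokens instead of characters.

```python
def chunk_fixed(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most `size` characters, where each chunk
    repeats the last `overlap` characters of the previous one."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_fixed("abcdefghij", size=4, overlap=2):
        print(chunk)
```

Each chunk would then be embedded and stored separately. A larger overlap preserves more context across chunk boundaries at the cost of storing more vectors.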
Pydio Cells, previously known as just Pydio and formerly known as AjaXplorer, is an open-source file-sharing and synchronisation software that runs on the user's own server or in the cloud. == Presentation == The project was created by musician Charles Du Jeu (current CEO and CTO) in 2007 under the name AjaXplorer. The name was changed in 2013 and became Pydio (an acronym for Put Your Data in Orbit). In May 2018, Pydio switched from PHP to Go with the release of Pydio Cells. The PHP version reached end-of-life state on 31 December 2019. Pydio Cells runs on any server supporting a recent Go version. Windows/Linux/macOS on the Intel architecture are directly supported; a fully functional working ARM implementation is under active development. Pydio Cells has been developed from scratch using the Go programming language; release 4.0.0 introduced code refactoring to fully support the Go modular structure as well as grid computing. Nevertheless, the web-based interface of Cells is very similar to the one from Pydio 8 (in PHP), and it successfully replicates most of its features, while adding a few more. There is also a new synchronisation client (also written in Go). The PHP version has been phased out as the company's focus is moving to Pydio Cells, with community feedback on the new features. According to the company, the switch to the new environment was made "to overcome inherent PHP limitations and provide you with a future-proof and modern solution for collaborating on documents". From a technical point of view, Pydio differs from solutions such as Google Drive or Dropbox. 
Pydio is not based on a public cloud; instead, the software connects to the user's existing storage (such as SAN / Local FS, SAMBA / CIFS, (s)FTP, NFS, S3-compatible cloud storage, Azure Blob Storage, Google Cloud Storage) as well as to the existing user directories (LDAP / AD, OAuth2 / OIDC SSO, SAML / Azure ADFS SSO, RADIUS, Shibboleth...), which allows companies to keep their data inside their infrastructure, according to their data security policy and user rights management. The software has a modular design; up to Pydio 8, various plugins allowed administrators to implement extra features. On the server side, Pydio Cells is deployed as a collection of independent microservices communicating among themselves using gRPC and logging user actions via Activity Streams 2.0 (AS2). Pydio Cells microservices are built with the Go Micro framework (using an embedded NATS server). A standard installation will deploy all required services on the same physical server, but for the purposes of performance, reliability and high availability, these can now be spread across several different servers (even in geographically separate locations) according to the twelve-factor architecture pattern. Pydio Cells is available either through a free and open-source community distribution (Pydio Cells Home), or a commercially-licensed enterprise distribution (in two variants, Pydio Cells Connect and Pydio Cells Enterprise), which adds features not available in the community distribution as well as additional levels of support beyond the community forums. == Features == File sharing between different internal users and across other Pydio instances; SSL/TLS encryption; WebDAV file server; creation of dedicated workspaces for each line of business, project, or client, with dedicated user rights management for each workspace; file sharing with external users (private links, public links, password protection, download limitation, etc.); online viewing and editing of documents with Collabora Office (Pydio Cells Enterprise also offers OnlyOffice integration); preview and editing of image files; integrated audio and video reader; activity stream ('timeline') for all actions taken by users; integrated chat platform. Client applications are available for all major desktop and mobile platforms.
Deblurring is the process of removing blurring artifacts from images. Deblurring recovers a sharp image S from a blurred image B, where S is convolved with K (the blur kernel) to generate B. Mathematically, this can be represented as B = S ∗ K (where ∗ represents convolution). While this process is sometimes known as unblurring, deblurring is the correct technical term. The blur K is typically modeled as a point spread function and is convolved with a hypothetical sharp image S to get B, where both S (which is to be recovered) and the point spread function K are unknown. This is an example of an inverse problem. In almost all cases, there is insufficient information in the blurred image to uniquely determine a plausible original image, making it an ill-posed problem. In addition, the blurred image contains noise, which complicates the task of determining the original image. This is generally addressed by using a regularization term to eliminate implausible solutions. This problem is analogous to echo removal in the signal processing domain. Nevertheless, when a coherent beam is used for imaging, the point spread function can be modeled mathematically. By proper deconvolution of the point spread function K and the blurred image B, the blurred image can be deblurred and the sharp image S can be recovered.
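The relationship B = S ∗ K and its inversion can be illustrated with a small sketch. Several assumptions are made for brevity and are not taken from the text: the signal is one-dimensional, the convolution is circular, the kernel K is known (the non-blind case), and the regularization is a Tikhonov-style inverse filter, which is only one of many possible regularizers. The function names `dft`, `blur`, and `deblur` are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]
    return [v / n for v in out] if inverse else out


def blur(sharp, kernel):
    """Circular convolution B = S * K for two equal-length lists."""
    n = len(sharp)
    return [
        sum(sharp[j] * kernel[(i - j) % n] for j in range(n))
        for i in range(n)
    ]


def deblur(blurred, kernel, lam=1e-9):
    """Regularized inverse filter in the frequency domain:
    S_hat = conj(K) * B / (|K|^2 + lam).
    The term `lam` keeps the division stable where |K| is small."""
    B, K = dft(blurred), dft(kernel)
    S_hat = [k.conjugate() * b / (abs(k) ** 2 + lam) for b, k in zip(B, K)]
    return [v.real for v in dft(S_hat, inverse=True)]


if __name__ == "__main__":
    sharp = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
    # Symmetric three-tap blur, wrapped around for circular convolution.
    kernel = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]
    blurred = blur(sharp, kernel)
    print([round(v, 3) for v in blurred])
    print([round(v, 3) for v in deblur(blurred, kernel)])
```

With noisy data, a larger `lam` would trade sharpness for stability. When K is unknown, as in the general case described above, both S and K must be estimated, which this sketch does not attempt.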
Marq (formerly Lucidpress) is a cloud-based software platform for brand management and templated content creation. The platform integrates with digital asset management (DAM) systems, including Aprimo and Bynder, and with customer relationship management (CRM) tools such as Salesforce and HubSpot. Marq also includes AI-assisted features for brand compliance and content automation. Trade publications have described the product as a brand templating and creative automation platform. == History == In October 2013, Lucid Software, Inc. announced Lucidpress as a public beta version. Following its release, Lucidpress was featured in TechCrunch, VentureBeat and PC World, with TechCrunch noting: "I had a chance to test the app before its launch and it is indeed very easy to use. If you've ever used a desktop publishing app in the past, you'll feel right at home with Marq, as it features the same kind of standard top-bar menu and layout options as most other publishing apps. In terms of features, it can also hold its own against similar desktop-based apps." In May 2021, Lucidpress announced that it had been acquired by Charles Thayne Capital ("CTC"), a growth-oriented and technology-focused private investment firm. Following the acquisition, Lucidpress became fully independent. Owen Fuller, who had served as General Manager since 2017, was appointed Chief Executive Officer. In 2022, Lucidpress was rebranded as Marq to reflect the company’s shift toward brand templating and creative automation tools, while continuing to support its publishing features. == Features == Marq integrates with customer relationship management (CRM) platforms such as Salesforce and HubSpot, enabling the creation of personalized, on-brand sales and marketing materials. The platform also connects with multiple digital asset management (DAM) systems, including Bynder, Aprimo, MediaValet, PhotoShelter, Acquia, and Canto.
== Investment == Lucid Software raised $1 million in seed funding in 2011, in a round led by Google Ventures. In May 2014, the company received a $5 million investment in a round led by Salt Lake-based Kickstart Seed Fund. In September 2016, the company received a $36 million investment from Spectrum Equity.
Webmail (or web-based email) is an email service that can be accessed using a standard web browser. It contrasts with email service accessible through a specialised email client software. Additionally, many internet service providers (ISP) provide webmail as part of their internet service package. Similarly, some web hosting providers also provide webmail as a part of their hosting package. As with any web application, webmail's main advantage over the use of a desktop email client is the ability to send and receive email anywhere from a web browser. == History == === Early implementations === The first Web Mail implementation was developed at CERN in 1993 by Phillip Hallam-Baker as a test of the HTTP protocol stack, but was not developed further. In the next two years, however, several people produced working webmail applications. In Europe, there were three implementations, Søren Vejrum's "WWW Mail", Luca Manunza's "WebMail", and Remy Wetzels' "WebMail". Søren Vejrum's "WWW Mail" was written when he was studying and working at the Copenhagen Business School in Denmark, and was released on February 28, 1995. Luca Manunza's "WebMail" was written while he was working at CRS4 in Sardinia, from an idea of Gianluigi Zanetti, with the first source release on March 30, 1995. Remy Wetzels' "WebMail" was written while he was studying at the Eindhoven University of Technology in the Netherlands for the DSE and was released early January 1995. In the United States, Matt Mankins wrote "Webex", and Bill Fitler, while at Lotus cc:Mail, began working on an implementation which he demonstrated publicly at Lotusphere on January 24, 1995. Customers who saw the cc:Mail demonstration were very enthusiastic, one recalling that they were "like an angry mob. People were yelling, 'We want this now!'". Matt Mankins, under the supervision of Dr. 
Burt Rosenberg at the University of Miami, released his "Webex" application source code in a post to comp.mail.misc on August 8, 1995, although it had been in use as the primary email application at the School of Architecture where Mankins worked for some months prior. Bill Fitler's webmail implementation was further developed as a commercial product, which Lotus announced and released in the fall of 1995 as cc:Mail for the World Wide Web 1.0; thereby providing an alternative means of accessing a cc:Mail message store (the usual means being a cc:Mail desktop application that operated either via dialup or within the confines of a local area network). Early commercialization of webmail was also achieved when "Webex" began to be sold by Mankins' company, DotShop, Inc., at the end of 1995. Within DotShop, "Webex" changed its name to "EMUmail"; which would be sold to companies like UPS and Rackspace until its sale to Accurev in 2001. EMUmail was one of the first applications to feature a free version that included embedded advertising, as well as a licensed version that did not. Hotmail and Four11's RocketMail both launched in 1996 as free services and immediately became very popular. === Widespread deployment === As the 1990s progressed, and into the 2000s, it became more common for the general public to have access to webmail because: many Internet service providers (such as EarthLink) and web hosting providers (such as Verio) began bundling webmail into their service offerings (often in parallel with POP/SMTP services); many other enterprises (such as universities and large corporations) also started offering webmail as a way for their user communities to access their email (either locally managed or outsourced); webmail service providers (such as Hotmail and RocketMail) emerged in 1996 as a free service to the general public, and rapidly gained in popularity. 
In some cases, webmail application software is developed in-house by the organizations running and managing the application, and in some cases it is obtained from software companies that develop and sell such applications, usually as part of an integrated mail server package (an early example being Netscape Messaging Server). The market for webmail application software has continued into the 2010s. == Rendering and compatibility == Email users may find the use of both a webmail client and a desktop client using the POP3 protocol presents some difficulties. For example, email messages that are downloaded by the desktop client and are removed from the server will no longer be available on the webmail client. The user is limited to previewing messages using the web client before they are downloaded by the desktop email client. However, one may choose to leave the emails on the server, in which case this problem does not occur. The use of both a webmail client and a desktop client using the IMAP4 protocol allows the contents of the mailbox to be consistently displayed in both the webmail and desktop clients and any action the user performs on messages in one interface will be reflected when the email is accessed via the other interface. There are significant differences in rendering capabilities for many popular webmail services such as Gmail, Outlook.com and Yahoo! Mail. Due to the varying treatment of HTML tags, such as