Comparison of OLAP servers

The following tables compare general and technical information for a number of online analytical processing (OLAP) servers. See the individual products' articles for further information. == General information == == Data storage modes == == APIs and query languages == APIs and query languages that OLAP servers support. == OLAP distinctive features == A list of OLAP features that are not supported by all vendors. All vendors support features such as parent-child and multilevel hierarchies and drilldown. == System limits == == Security == == Operating systems == The OLAP servers can run on the following operating systems: Note (1): Server availability depends on the Java Virtual Machine, not on the operating system. == Support information ==

Truth discovery

Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g. the birthday of a person or the capital city of a country), while in the second case multiple true values are allowed (e.g. the cast of a movie or the authors of a book). Typically, truth discovery is the last step of a data integration pipeline, performed once the schemas of the different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes it increasingly likely that different sources provide (partially or completely) different values for the same data item. This, together with our increasing reliance on data to make important decisions, motivates the need for good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. However, recent studies have shown that relying only on majority voting can produce wrong results for as many as 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be used to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is often infeasible, since the number of labeled examples needed grows in proportion to the number of sources, and in many applications the number of sources is prohibitively large. 
== Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose each other; values and sources can be either correct or erroneous. In the multi-truth case, by contrast, the following properties hold: the truth is composed of a set of values; different values can each provide a partial truth; claiming one value for a given data item does not imply opposing all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and that should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences between the two problems. Given that in both examples the truth is provided by source 1, in the single-truth case (first table) sources 2 and 3 oppose the truth and therefore provide wrong values. In the multi-truth case (second table), on the other hand, sources 2 and 3 are neither correct nor erroneous: they provide a subset of the true values and do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is selected as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources: more importance is given to votes coming from trusted sources. Source trustworthiness is usually not known a priori but is estimated with an iterative approach. 
At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values, which in turn leads to a better estimate of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as the accuracy of provided values, copying of values from other sources, and domain coverage. Detecting copying behavior is very important: copying allows false values to spread easily, which makes truth discovery very hard, since many sources would vote for the wrong values. Systems usually decrease the weight of votes associated with copied values, or do not count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. The main types of single-truth methods, and the ways in which different systems model source trustworthiness, are described below. === Majority voting === Majority voting is the simplest method: the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performance of more complex methods. === Web-link based === These methods estimate source trustworthiness using a technique similar to the one used to measure the authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. 
Source trustworthiness is computed as the cosine similarity (or another similarity measure) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. $$P(v \mid \psi(o)) = \frac{P(\psi(o) \mid v) \cdot P(v)}{P(\psi(o))}$$ where $v$ is a value provided for a data item $o$ and $\psi(o)$ is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that it provides. Other, more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of multi-truth discovery. Two types of multi-truth methods and their characteristics are described below. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between sets of provided values and the sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that it provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. 
=== Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of a given data item and also to assess source quality without the need for any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could also be used to change the way in which web pages are ranked in search engines, moving from current methods based on link analysis, such as PageRank, to procedures that rank web pages based on the accuracy of the information they provide.
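The voting schemes described in the single-truth sections above can be made concrete with a small sketch. It contrasts plain majority voting with an iterative, web-link style scheme in which the vote of a value is the sum of the trust of the sources providing it, and the trust of a source is the sum of the votes of the values it provides. The function names, the normalization by the largest trust score, the stopping tolerance and the toy data are all assumptions made for illustration; they are not taken from any particular published system.

```python
from collections import Counter, defaultdict


def majority_vote(claims):
    """claims maps source -> {item: value}; returns {item: most frequent value}."""
    tallies = defaultdict(Counter)
    for items in claims.values():
        for item, value in items.items():
            tallies[item][value] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in tallies.items()}


def weighted_vote(claims, max_iterations=100, tolerance=1e-9):
    """Iteratively refine source trust and value votes (illustrative sketch)."""
    trust = {source: 1.0 for source in claims}  # assumption: all sources start equal
    truths = {}
    for _ in range(max_iterations):
        # Vote of a value = sum of the trust of the sources that provide it.
        votes = defaultdict(lambda: defaultdict(float))
        for source, items in claims.items():
            for item, value in items.items():
                votes[item][value] += trust[source]
        truths = {item: max(vals, key=vals.get) for item, vals in votes.items()}
        # Trust of a source = sum of the votes of the values it provides,
        # rescaled so the most trusted source has trust 1 (keeps scores bounded).
        raw = {
            source: sum(votes[item][value] for item, value in items.items())
            for source, items in claims.items()
        }
        top = max(raw.values())
        new_trust = {source: score / top for source, score in raw.items()}
        converged = all(abs(new_trust[s] - trust[s]) < tolerance for s in trust)
        trust = new_trust
        if converged:
            break
    return truths, trust


# Hypothetical data: A and B are always right; C, D and E are each right on
# one item only, but all three agree on the same wrong value "w4" for item i4.
claims = {
    "A": {"i1": "t1", "i2": "t2", "i3": "t3", "i4": "t4"},
    "B": {"i1": "t1", "i2": "t2", "i3": "t3", "i4": "t4"},
    "C": {"i1": "t1", "i2": "x2", "i3": "c3", "i4": "w4"},
    "D": {"i1": "y1", "i2": "t2", "i3": "d3", "i4": "w4"},
    "E": {"i1": "e1", "i2": "e2", "i3": "t3", "i4": "w4"},
}

baseline = majority_vote(claims)
truths, trust = weighted_vote(claims)
```

On this toy data majority voting picks the wrong value for item i4, because three of the five sources agree on it, whereas the weighted scheme lowers the trust of C, D and E over a few iterations and then selects the value given by A and B. Real systems typically add further refinements, such as copy detection, that this sketch omits.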

KidDesk

KidDesk is an alternative desktop software application. The early childhood learning company Hatch Early Childhood created KidDesk; it subsequently went to Edmark, which was bought by IBM and then sold to Riverdeep (now Houghton Mifflin Harcourt Learning Technology). KidDesk is compatible with Microsoft Windows 95 and newer, as well as Apple System 7 and newer. KidDesk can be set to start when the computer starts up, and can only be exited through password entry. Adults choose which programs are included for the child to use and which icon represents the desk, and customize the software programs available for use. == History == Edmark first started shipping KidDesk in 1992. In 1993, Edmark updated KidDesk with KidDesk Family Edition for Macintosh and DOS, adding more desk accessories and desk styles (it was sometimes included as a free exclusive offer with the Early Learning House and Thinkin' Things Series). In 1995, KidDesk Family Edition was enhanced for Windows 95, and released one month after the new operating system shipped. In 1998, Edmark developed KidDesk Internet Safe. The Internet Safe edition was written for Windows 95, Windows 98, and Macintosh (including OS8). In 2008, HMH ported KidDesk Family Edition to run on Windows Vista, and in 2011 version 3.07 of KidDesk Family Edition was released as part of the 'Young Explorer' suite, which is fully supported on Windows XP, Windows Vista and Windows 7. == Features == A picture editor incorporated into the desk, used both in the adult settings menu and in the desk itself. KidDesk users can edit their user logo with a pixel grid paint program. A calendar incorporated into the desk. This allows the user to set dates that the user finds important, and allows the date to be marked with a picture or text. A password exit feature. For security reasons, the adult can set a password so that KidDesk can only be exited if it is entered. 
As an extra security measure, the password exit function can only be accessed by pressing the Ctrl + Alt + A keys simultaneously. A skin changer with several themes (farm, princess, sports, ocean, etc.). These themes can be changed. E-mail and voicemail features that are customizable depending on the KidDesk installation. The ability to add websites that can be accessed from KidDesk, and to block hyperlinks, JavaScript, data entry, etc., on those sites, was added for the 'Internet Safe' edition released in 1998. The KidDesk Internet Safe edition is available in Spanish and Brazilian Portuguese versions. == Reception == KidDesk was given a platinum award at the 1994 Oppenheim Toy Portfolio Awards. The judges praised the program's security features allowing "configur[ation] so that kids never have access to the possibly destructive DOS prompt", and concluded that "[i]f you and your kids share a computer, you need to install Kiddesk immediately!" === Awards === Since 1992, KidDesk has won 15 major awards.

Deep Instinct

Deep Instinct is a cybersecurity company that applies deep learning to cybersecurity. The company applies artificial intelligence to the task of preventing and detecting malware. The company was named a Technology Pioneer by the World Economic Forum in 2017. Lane Bess has been CEO of the company since 2022. == Overview == Deep Instinct was founded in 2015 by Guy Caspi, Dr. Eli David, and Nadav Maman. The company is headquartered in New York City. In July 2017, NVIDIA became an investor. According to Tom's Hardware, NVIDIA's investment gave the company access to a GPU-based neural network and the CUDA platform, which it was using to maximize vulnerability detection rates. As of February 2020, the company had raised $43 million in a Series C funding round. In April 2021, Deep Instinct raised $100 million in Series D funding to accelerate growth. == Partnerships == In April 2019, Deep Instinct partnered with Chinese artist Guo O. Dong on an art project titled The Persistence of Chaos, consisting of a laptop infected with 6 pieces of malware that represented $95 billion in damages. The artwork was auctioned with a final bid of $1,345,000. In the same year, Globes reported that HP Inc had partnered with Deep Instinct to launch its security solution HP SureSense, which has been applied to the EliteBook and ZBook devices.

Meesho

Meesho Limited (short for Meri shop, transl. My shop) is an Indian e-commerce company headquartered in Bengaluru. Founded by Vidit Aatrey and Sanjeev Barnwal in December 2015, Meesho is an online marketplace in categories such as fashion, home and kitchen, beauty and personal care, electronics accessories, and daily use products. == History == Meesho Private Limited, formerly Fashnear Technologies Private Limited, was established by IIT Delhi graduates Vidit Aatrey and Sanjeev Barnwal in December 2015. In 2016, the founders came up with the idea of re-establishing the platform as Meesho, one that would enable country-wide shipping for resellers, using social media sites as marketing tools. In February 2019, the platform reported having around 209,000 users and about 1.2 million monthly orders, and in March 2020, it reported approximately 563,000 users and 3.1 million monthly orders. In 2021, the Meesho mobile application was ranked among the most downloaded shopping apps globally. In 2022, Meesho had about 120 million monthly users and about 910 million orders were made through the platform, with a gross merchandise value (GMV) of about $5 billion. According to a report, as of August 2023 Meesho had delisted 42 lakh counterfeit listings and 10 lakh restricted products under its initiative Project Suraksha. During the same period, the platform blocked access for over 12,000 user accounts flagged for policy violations. In 2023, Meesho became the fastest shopping app to cross 500 million downloads. In 2024, Meesho introduced Valmo, a logistics marketplace, to provide shipment services to sellers by aggregating multiple logistics providers. 
Meesho employs over 3,000 small businesses and 10-12 large firms for warehousing and sorting operations within its logistics framework. According to media reports, Valmo operates in approximately 15,000 pincodes in India with around 6,000 partners. It is reported to handle over 50% of Meesho's daily orders. In November 2024, Meesho introduced a generative AI-powered voice bot for customer support, managing approximately 60,000 calls daily in English and Hindi. According to media reports, the system resolves the majority of queries without human assistance, with only a small fraction of calls requiring manual intervention. According to media reports, in 2024, Meesho prevented over 22 million suspicious or potentially fraudulent transactions on its platform. The company initiated legal proceedings, resulting in the filing of twelve cases, including nine specifically targeting over forty individuals in the cities of Kolkata and Ranchi. The company filed a suit in the Delhi High Court for a permanent injunction against parties operating deceptive websites misappropriating its brand identity. The Court granted injunctive relief by directing domain registrars to suspend the infringing websites. Additionally, the Court ordered law enforcement authorities to initiate criminal investigations and to freeze the financial accounts associated with the identified offenders. Meesho went public through an initial public offering in December 2025, raising $603 million. It is listed on both the BSE and NSE. == Recognition == In 2023, Meesho was named one of the most influential companies of the year by Time magazine.

List & Label

List & Label is a professional reporting tool for software developers. It provides comprehensive design, print and export functions. The software component runs on Microsoft Windows and can be implemented in desktop, cloud and web applications. List & Label can be used to create user-defined dashboards, lists, invoices, forms and labels. It supports many development environments, frameworks and programming languages, such as Microsoft Visual Studio, Embarcadero RAD Studio, .NET Framework, .NET Core, ASP.NET, C++, Delphi, Java and C#, among others. List & Label either retrieves data from various sources via data binding or works independently of a database. Reports are designed and created in the List & Label Designer and then exported into a multitude of formats such as PDF, Excel, XHTML and RTF. Since version 27, a web report designer for ASP.NET MVC has been available. == History == The product was first released in 1992 by combit. The current version is 30. A new major version of List & Label is released every fall, usually in October. Updates are available several times a year via Service Pack. == Features == === Report Designer === The Designer enables users to lay out the report graphically. It offers report objects such as tables, charts, crosstabs, gauges, HTML, conditionally formatted text, barcodes, matrix codes, and graphics, and is extensible using third-party add-ons. User applications can interact with the report via the programmable object model of the report. The real-time preview functionality allows users to view changes instantly. Usability features include layer and appearance management, enabling conditional logic to dynamically control the visibility of objects in reports. The Designer also supports the inclusion of multiple report containers in a single project, accommodating complex layouts such as parallel tables and charts. A formula wizard and support for scripting languages such as C# facilitate advanced calculations and logic. 
The Designer's object model (DOM) provides developers with the ability to modify layouts and behaviors programmatically. === Web Report Designer === The web report designer runs in the browser and is independent of printer drivers and spoolers, which simplifies deployment to the cloud and use of the Visual Studio deployment pipeline. === Data Sources === Depending on the programming language, the product offers automatic support for the following data sources: databases such as Microsoft SQL Server, Oracle, MySQL, PostgreSQL, IBM Db2, SQLite, MariaDB, MongoDB and Cosmos DB; XML data and CSV; business objects; data sources that can be accessed via OLE DB, ODBC or ADO.NET; LINQ data and data from web services; GraphQL. Additionally, the product offers support for unbound data and can be extended to support other data sources via interfaces. === Output Options === Printer; image formats (JPEG, BMP, EMF, TIFF, PNG, SVG, HEIF, WebP); document formats (PDF, PDF/A, Word (DOCX), Excel (XLS), PowerPoint (PPTX)); HTML, XHTML, MHTML; barcodes; plain text, RTF, CSV, JSON; XML, ZIP, email; List & Label preview file. === Target Audience === List & Label can be used in Windows development environments. While it competes most notably on the Microsoft .NET platform with other products such as Crystal Reports, SQL Server Reporting Services and ActiveReports, there are few competing products for other programming languages (e.g. Progress, Alaska Xbase++, Visual DataFlex). == Awards == Reader's Choice Award 2005–2008; Stevie Awards 2021: Best Technology for Data Visualization; Top 100 Publisher Award Component Source 2013-2014, 2014-2015, 2016, 2018, 2019, 2020, 2021, 2022.

Speculative decoding

Speculative decoding is an inference-time optimization for autoregressive large language models (LLMs) that generates multiple tokens per decoding step instead of one. A smaller draft model proposes a sequence of candidate tokens, and the larger target model verifies them in a single forward pass using a modified rejection sampling scheme. The verification preserves the target model's original output distribution, so the technique samples from the same distribution as standard decoding while reducing latency by roughly a factor of two to three. The name is an analogy to speculative execution in CPU design, where a processor runs instructions along a predicted branch before the outcome is known. == Background == Standard autoregressive decoding in large language models generates one token at a time. The model computes a probability distribution over its vocabulary, samples the next token, and feeds that token back as input. For large models, this process is bottlenecked by memory bandwidth rather than arithmetic throughput: loading the model's parameters from high-bandwidth memory (HBM) to the processor takes up most of the wall-clock time at each step. Because of this, a forward pass over one token and a forward pass over several tokens in a batch take roughly the same time. Speculative decoding relies on this property. == Mechanism == The technique alternates between two phases: drafting and verification. During drafting, a fast approximation model generates a short run of K candidate tokens, typically between 3 and 12. The draft model is usually a much smaller version of the target model or a lightweight auxiliary network. During verification, the target model scores the entire draft sequence in one batched forward pass. A modified rejection sampling algorithm compares the draft and target probabilities at each position. 
If the target model assigns a drafted token at least as much probability as the draft model did, that token is always accepted; otherwise it is accepted with probability equal to the ratio of the target probability to the draft probability. The first token that is rejected is resampled from a corrected distribution, and everything after it is thrown out. The result is that the output distribution is the same as if each token had been generated one at a time. How many tokens get accepted per cycle depends on how well the draft model matches the target. For common words and predictable continuations the match tends to be good, so the target model can confirm several tokens at once. == History == An early precursor was blockwise parallel decoding, proposed in 2018 by Stern, Shazeer, and Uszkoreit. Their method predicted multiple future tokens through auxiliary prediction heads and validated them against the autoregressive model, but it only worked with greedy decoding and did not preserve the full sampling distribution. The modern form of the technique came from Yaniv Leviathan, Matan Kalman, and Yossi Matias at Google Research, who posted "Fast Inference from Transformers via Speculative Decoding" on arXiv in November 2022. Separately and at about the same time, Charlie Chen and colleagues at DeepMind arrived at a closely related method they called speculative sampling, published in February 2023. Both papers introduced the use of rejection sampling to guarantee that the output distribution is unchanged. Leviathan et al. showed roughly 2–3x speedup on T5-XXL (11 billion parameters); Chen et al. reported 2–2.5x on the Chinchilla model (70 billion parameters). The Leviathan et al. paper was presented as an oral at the International Conference on Machine Learning in July 2023. == Variants == SpecInfer (Miao et al., 2024) uses multiple small language models to jointly build a tree of candidate continuations rather than a single chain. The target model verifies the whole tree in parallel and keeps the longest valid path, with reported speedups of 1.5–3.5x. 
Medusa (Cai et al., 2024) takes a different approach by not using a separate draft model at all. Extra lightweight decoding heads are attached to the target model itself, and each one predicts a token at a different future position. The candidates are evaluated through a tree-structured attention mechanism. The authors measured 2.2–3.6x speedup. EAGLE (Li et al., 2024) performs autoregression on the target model's internal feature representations (specifically the second-to-top layer) rather than on tokens directly. On LLaMA 2 Chat 70B, this gave a 2.7–3.5x latency reduction. Later versions added dynamic draft trees (EAGLE-2) and further optimizations (EAGLE-3), reaching 3–6.5x speedup. == Adoption == By 2024, speculative decoding had become a standard part of production LLM serving. Google uses it in the AI Overviews feature of Google Search. Open-source inference frameworks such as vLLM, NVIDIA's TensorRT-LLM, and SGLang all include built-in support for speculative decoding and its variants. Apple, AWS, and Meta have also published research extending the method or deploying it at scale.
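The draft-then-verify cycle described under Mechanism can be illustrated with a short sketch on toy distributions. Everything named here is hypothetical: `speculative_step`, the callable interface (a function from a token prefix to a list of next-token probabilities) and the three-token vocabulary are assumptions for illustration, not the API of any inference framework. The sketch also calls the target distribution once per position for readability, whereas a real system would score all drafted positions in one batched forward pass.

```python
import random


def sample(weights, rng):
    """Draw an index with probability proportional to the given weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def speculative_step(prefix, draft_dist, target_dist, k, rng):
    """One drafting and verification cycle; returns the newly emitted tokens."""
    # Drafting: the cheap model proposes k tokens autoregressively.
    drafted, draft_probs = [], []
    context = list(prefix)
    for _ in range(k):
        q = draft_dist(context)
        token = sample(q, rng)
        drafted.append(token)
        draft_probs.append(q)
        context.append(token)

    # Verification: accept each drafted token with probability min(1, p/q).
    emitted = []
    context = list(prefix)
    for token, q in zip(drafted, draft_probs):
        p = target_dist(context)
        if rng.random() < min(1.0, p[token] / q[token]):
            emitted.append(token)
            context.append(token)
        else:
            # First rejection: resample from the corrected distribution,
            # proportional to max(0, p - q), and discard the rest of the draft.
            residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
            emitted.append(sample(residual, rng))
            return emitted

    # Every drafted token was accepted: take one extra token from the target.
    emitted.append(sample(target_dist(context), rng))
    return emitted


# Toy, context-independent distributions over a 3-token vocabulary (assumed).
def toy_draft(context):
    return [0.6, 0.3, 0.1]


def toy_target(context):
    return [0.2, 0.5, 0.3]


tokens = speculative_step([], toy_draft, toy_target, k=4, rng=random.Random(0))
```

Each call emits between one and k + 1 tokens. Because accepted tokens and the corrected resample together reproduce the target probabilities, the tokens are distributed as if they had been sampled from the target one at a time, which is the guarantee the rejection sampling step is meant to provide.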