AI Essay Writer Undetectable Free

AI Essay Writer Undetectable Free — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Perplexity AI

    Perplexity AI

    Perplexity AI, Inc., or simply Perplexity, is an American privately held software company offering a web search engine that processes user queries and synthesizes responses. Perplexity products use large language models and incorporate real-time web search capabilities, providing responses based on current Internet content, citing sources used. Its real-time search engine is called Sonar and is based on Meta's Llama model. A free public version is available, while a paid Pro subscription offers access to more advanced language models and additional features. Perplexity AI, Inc., was founded in August 2022 by Aravind Srinivas, Denis Yarats, Johnny Ho, and Andy Konwinski. As of September 2025, the company was valued at US$20 billion. Perplexity AI has attracted legal scrutiny over allegations of copyright infringement, unauthorized content use, and trademark issues from several major media organizations, including the BBC, Dow Jones, and The New York Times. According to separate analyses by Wired and later Cloudflare, Perplexity uses undisclosed web crawlers with spoofed user-agent strings to scrape the content of websites which prohibit, or explicitly block, web scraping. == History == In August 2022, Perplexity AI, Inc., was founded by Aravind Srinivas, Denis Yarats, Johnny Ho, and Andy Konwinski, engineers with backgrounds in back-end systems, artificial intelligence (AI) and machine learning. It launched its main search engine on December 7, 2022, and has since released a Google Chrome extension and apps for iOS and Android. In February 2023, Perplexity reported two million unique visitors. By April 2024, Perplexity had raised $165 million in funding, valuing the company at over $1 billion. As of June 2025, Perplexity closed a $500 million round of funding that elevated its valuation to $14 billion. Investors in Perplexity AI have included Jeff Bezos, Tobias Lütke, Nat Friedman, Nvidia, and Databricks. Perplexity has also received funding from 1789 Capital, a venture capital firm notable for its association with Donald Trump Jr. During Bloomberg’s Tech Summit 2025, Srinivas shared that the company processed 780 million queries in May 2025, experiencing more than 20% month-over-month growth, processing around 30 million queries daily. In July 2024, Perplexity announced the launch of a new publishers' program to share advertising revenue with partners. On January 18, 2025, the day before the impending U.S. ban on the social media app TikTok, Perplexity submitted a proposal for a merger with TikTok US. On August 12, 2025, Perplexity made a bid to buy Chrome from Google for $34.5 billion. Perplexity stated that the sale could remedy anti-trust litigation against Google, in which a judge was considering compelling the sale of Chrome. In December 2025, Cristiano Ronaldo took an undisclosed stake in Perplexity AI and entered a global brand partnership with the company. === Business Strategy and Finance (2026) === As of early 2026, Perplexity AI reached a valuation of $21.21 billion following its Series E-6 funding round. The company's Annual Recurring Revenue (ARR) grew from $80 million in late 2024 to an estimated $200 million by February 2026. In January 2026, the company entered into a three-year, $750 million commitment with Microsoft Azure to secure the GPU capacity required for its advanced "Deep Research" and "Model Council" features. In February 2026, Perplexity transitioned to a subscription-first model by discontinuing its AI-integrated advertising strategy. Leadership stated the move was intended to preserve user trust in the "answer engine," prioritizing objective results over ad revenue. The company also introduced the "Model Council" feature on February 5, 2026, which allows users to compare outputs from multiple large language models, such as GPT-5.2 and Claude 4.6, simultaneously. To expand its user base, Perplexity began offering a free year of Pro access to students, U.S. Military Veterans, and government employees. == Products and services == === Search engine web portal === Perplexity’s primary offering is an online information retrieval system (search engine) that uses large language models to generate responses to user queries by searching and summarizing web-based content. Perplexity offers a feature known as Perplexity Pages that generates structured summaries and report-like content from user queries by aggregating cited sources. Perplexity is available without charge or registration to Web users, a freemium model. === Perplexity Pro === Perplexity Pro is a subscription tier, a more capable paid "enterprise" service, including stronger security and data protection and additional tools, including the ability to search uploaded documents alongside web content and access to a programmatic application programming interface (API). It allows the user to select between backend models such as GPT-5.4, Claude 4.6 and Gemini 3.1 Pro. The company has also developed its own models, Sonar (based on Llama 3.3) and R1 1776 (based on DeepSeek R1). === Internal Knowledge Search === Internal Knowledge Search enables Pro and Enterprise Pro users to simultaneously search across web content and internal documents. Users can upload and search through Excel, Word, PDF, and other common file formats. Enterprise Pro users can upload and index up to 500 files. === Search API === Perplexity's Search API provides AI developers with programmatic access to the company's search infrastructure. The September 2025 release includes a software development kit, an open-source evaluation framework called search_evals, and documentation detailing the API's design and optimization. === Shopping hub === Perplexity's Shopping Hub is an online shopping platform that provides AI-generated product recommendations, and enables users to purchase products directly through Perplexity's interface. It was launched in November 2024 with backing by Amazon and Nvidia. === Finance === In October 2024, Perplexity AI introduced new finance-related features, including looking up stock prices and company earnings data. The tool provides real-time stock quotes and price tracking, industry peer comparisons and basic financial analysis tools. The platform sources its financial data from Financial Modeling Prep. === Assistant === In January 2025, Perplexity launched the Perplexity Assistant, an AI-powered tool designed to enhance the functionality of its search engine. It can perform tasks across multiple apps, such as hailing a ride or searching for a song, and can maintain context across actions. The assistant is also multi-modal, meaning it can use a phone's camera to provide answers about the user's surroundings or on-screen content. Perplexity has acknowledged that the assistant is still in development and may not always function as expected. For instance, certain features, such as summarizing unread emails or upcoming calendar events, require users to enable a workaround based on notifications. === Comet === In July 2025, Perplexity launched Comet, an AI browser based on Chromium. Initially, access to the browser was limited to users subscribed to the most expensive subscription tier. The browser was later released for free download in October 2025. A key feature is integration of the Perplexity search engine, which can perform a variety of tasks such as generating article summaries, describing an image, conducting research about a topic and composing emails. === Truth Social chatbot === Perplexity has been contracted to produce a chatbot for Donald Trump's social media platform Truth Social. == Leadership == Aravind Srinivas is the CEO and co-founder of Perplexity AI. He previously held research positions at OpenAI, Google DeepMind, and other AI research institutions focusing on machine learning and artificial intelligence. In a March 2026 All-In episode, Srinivas said the incoming AI-related layoffs were "glorious future" to "look forward", as it freed people from jobs they didn't like and gave them opportunities to pursue entrepreneurship. == Controversies == === Copyright and trademark infringement allegations === In June 2024, Forbes publicly criticized Perplexity for using their content. According to Forbes, Perplexity published a story largely copied from a proprietary Forbes article without mentioning or prominently citing Forbes. In response, Srinivas said that the feature had some "rough edges" and accepted feedback but maintained that Perplexity only "aggregates" rather than plagiarizes information. In October 2024, The New York Times sent a cease-and-desist notice to Perplexity to stop accessing and using NYT content, claiming that Perplexity is violating its copyright by scraping data from its website. In June 2024, Dow Jones and New York Post filed a lawsuit against Perplexity, alleging copyright infringement. The lawsuit also alleged that Perplexity harmed their brand by attributing hallucinated quotes, for example on F-16 jets for Ukraine, to artic

    Read more →
  • Pseudonymization

    Pseudonymization

    Pseudonymization is a data management and de-identification procedure by which personally identifiable information fields within a data record are replaced by one or more artificial identifiers, or pseudonyms. A single pseudonym for each replaced field or collection of replaced fields makes the data record less identifiable while remaining suitable for data analysis and data processing. Pseudonymization (or pseudonymisation, the spelling under European guidelines) is one way to comply with the European Union's General Data Protection Regulation (GDPR) demands for secure data storage of personal information. Pseudonymized data can be restored to its original state with the addition of information which allows individuals to be re-identified. In contrast, anonymization is intended to prevent re-identification of individuals within the dataset. Clause 18, Module Four, footnote 2 of the Adoption by the European Commission of the Implementing Decisions (EU) 2021/914 "requires rendering the data anonymous in such a way that the individual is no longer identifiable by anyone ... and that this process is irreversible." == Impact of Schrems II ruling == The European Data Protection Supervisor (EDPS) on 9 December 2021 highlighted pseudonymization as the top technical supplementary measure for Schrems II compliance. Less than two weeks later, the EU Commission highlighted pseudonymization as an essential element of the equivalency decision for South Korea, which is the status that was lost by the United States under the Schrems II ruling by the Court of Justice of the European Union (CJEU). The importance of GDPR-compliant pseudonymization increased dramatically in June 2021 when the European Data Protection Board (EDPB) and the European Commission highlighted GDPR-compliant pseudonymization as the state-of-the-art technical supplementary measure for the ongoing lawful use of EU personal data when using third country (i.e., non-EU) cloud processors or remote service providers under the "Schrems II" ruling by the CJEU. Under the GDPR and final EDPB Schrems II Guidance, the term pseudonymization requires a new protected "state" of data, producing a protected outcome that: Protects direct, indirect, and quasi-identifiers, together with characteristics and behaviors; Protects at the record and data set level versus only the field level so that the protection travels wherever the data goes, including when it is in use; and Protects against unauthorized re-identification via the mosaic effect by generating high entropy (uncertainty) levels by dynamically assigning different tokens at different times for various purposes. The combination of these protections is necessary to prevent the re-identification of data subjects without the use of additional information kept separately, as required under GDPR Article 4(5) and as further underscored by paragraph 85(4) of the final EDPB Schrems II guidance: Article 4(5) "Definitions" of the GDPR defines pseudonymization as "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person." "Use Case 2: Transfer of pseudonymised Data Paragraph 85(4)" of the final EDPB Schrems II Guidance requires that “the controller has established by means of a thorough analysis of the data in question – taking into account any information that the public authorities of the recipient country may be expected to possess and use – that the pseudonymised personal data cannot be attributed to an identified or identifiable natural person even if cross-referenced with such information." GDPR-compliant pseudonymization requires that data is "anonymous" in the strictest EU sense of the word – globally anonymous – but for the additional information held separately and made available under controlled conditions as authorized by the data controller for permitted re-identification of individual data subjects. Clause 18, Module Four, footnote 2 of the Adoption by the European Commission of the Implementing Decision (EU) 2021/914 "requires rendering the data anonymous in such a way that the individual is no longer identifiable by anyone, in line with recital 26 of Regulation (EU) 2016/679, and that this process is irreversible." Before the Schrems II ruling, pseudonymization was a technique used by security experts or government officials to hide personally identifiable information to maintain data structure and privacy of information. Some common examples of sensitive information include postal code, location of individuals, names of individuals, race and gender, etc. After the Schrems II ruling, GDPR-compliant pseudonymization must satisfy the above-noted elements as an "outcome" versus merely a technique. == Data fields == The choice of which data fields are to be pseudonymized is partly subjective. Less selective fields, such as birth date or postal code are often also included because they are usually available from other sources and therefore make a record easier to identify. Pseudonymizing these less identifying fields removes most of their analytic value and is therefore normally accompanied by the introduction of new derived and less identifying forms, such as year of birth or a larger postal code region. Data fields that are less identifying, such as date of attendance, are usually not pseudonymized. This is because too much statistical utility is lost in doing so, not because the data cannot be identified. For example, given prior knowledge of a few attendance dates it is easy to identify someone's data in a pseudonymized dataset by selecting only those people with that pattern of dates. This is an example of an inference attack. The weakness of pre-GDPR pseudonymized data to inference attacks is commonly overlooked. A famous example is the AOL search data scandal. The AOL example of unauthorized re-identification did not require access to separately kept "additional information" that was under the control of the data controller as is now required for GDPR-compliant pseudonymization, outlined below under the section "New Definition for Pseudonymization Under GDPR". Protecting statistically useful pseudonymized data from re-identification requires: a sound information security base controlling the risk that the analysts, researchers or other data workers cause a privacy breach The pseudonym allows tracking back of data to its origins, which distinguishes pseudonymization from anonymization, where all person-related data that could allow backtracking has been purged. Pseudonymization is an issue in, for example, patient-related data that has to be passed on securely between clinical centers. The application of pseudonymization to e-health intends to preserve the patient's privacy and data confidentiality. It allows primary use of medical records by authorized health care providers and privacy preserving secondary use by researchers. In the US, HIPAA provides guidelines on how health care data must be handled and data de-identification or pseudonymization is one way to simplify HIPAA compliance. However, plain pseudonymization for privacy preservation often reaches its limits when genetic data are involved (see also genetic privacy). Due to the identifying nature of genetic data, depersonalization is often not sufficient to hide the corresponding person. Potential solutions are the combination of pseudonymization with fragmentation and encryption. An example of application of pseudonymization procedure is creation of datasets for de-identification research by replacing identifying words with words from the same category (e.g. replacing a name with a random name from the names dictionary), however, in this case it is in general not possible to track data back to its origins. == New definition under GDPR == Effective as of May 25, 2018, the EU General Data Protection Regulation (GDPR) defines pseudonymization for the very first time at the EU level in Article 4(5). Under Article 4(5) definitional requirements, data is pseudonymized if it cannot be attributed to a specific data subject without the use of separately kept "additional information". Pseudonymized data embodies the state of the art in Data Protection by Design and by Default because it requires protection of both direct and indirect identifiers (not just direct). GDPR Data Protection by Design and by Default principles as embodied in pseudonymization require protection of both direct and indirect identifiers so that personal data is not cross-referenceable (or re-identifiable) via the "mosaic effect" without access to "additional information" that is kept separately by the controller. Because access to separately kept "additional information" is required

    Read more →
  • Gen (software)

    Gen (software)

    Gen is a Computer Aided Software Engineering (CASE) application development environment marketed by Broadcom Inc. Gen was previously known as CA Gen, IEF (Information Engineering Facility), Composer by IEF, Composer, COOL:Gen, Advantage:Gen and AllFusion Gen. The toolset originally supported the information technology engineering methodology developed by Clive Finkelstein, James Martin and others in the early 1980s. Early versions supported IBM's DB2 database, 3270 'block mode' screens and generated COBOL code. In the intervening years the toolset has been expanded to support additional development techniques such as component-based development; creation of client/server and web applications and generation of C, Java and C#. In addition, other platforms are now supported such as many variants of Unix-like Operating Systems (AIX, HP-UX, Solaris, Linux) as well as Windows. Its range of supported database technologies have widened to include ORACLE, Microsoft SQL Server, ODBC, JDBC as well as the original DB2. The toolset is fully integrated - objects identified during analysis carry forward into design without redefinition. All information is stored in a repository (central encyclopedia). The encyclopedia allows for large team development - controlling access so that multiple developers may not change the same object simultaneously. == History == === 1985-1997: Texas Instruments === It was initially produced by Texas Instruments, with input from James Martin and his consultancy firm James Martin Associates, and was based on the Information Engineering Methodology (IEM). The first version was launched in 1985. IEF (Information Engineering Facility) became popular among large government departments and public utilities. It initially supported a CICS/COBOL/DB2 target environment. However, it now supports a wider range of relational databases and operating systems. IEF was intended to shield the developer from the complexities of building complete multi-tier cross-platform applications. In 1995, Texas Instruments decided to change their marketing focus for the product. Part of this change included a new name - "Composer". By 1996, IEF had become a popular tool. However, it was criticized by some IT professionals for being too restrictive, as well as for having a high per-workstation cost ($15K USD). But it is claimed that IEF reduces development time and costs by removing complexity and allowing rapid development of large scale enterprise transaction processing systems. === 1997-2000: Sterling Software === In 1997, Composer had another change of branding, Texas Instruments sold the Texas Instruments Software division, including the Composer rights, to Sterling Software. Sterling software changed the well known name "Information Engineering Facility" to "COOL:Gen". COOL was an acronym for "Common Object Oriented Language" - despite the fact that there was little object orientation in the product. === 2000-2018: Computer Associates === In 2000, Sterling Software was acquired by Computer Associates (now CA). CA has rebranded the product three times to date and the product is still used widely today. Under CA, recent releases of the tool added support for the CA-Datacom DBMS, the Linux operating system, C# code generation and ASP.NET web clients. The current version is known as CA Gen - version 8 being released in May 2010, with support for customised web services, and more of the toolset being based around the Eclipse framework. === 2018-current: Broadcom === As of 2020, CA Gen is owned and marketed by Broadcom Inc., which rebranded the product to Gen to avoid confusion with the former owner of the product. There are a variety of "add-on" tools available for Gen, including GuardIEn - a Configuration Management and Developer Productivity Suite, QAT Wizard, an interview style wizard that takes advantage of the meta model in Gen, products for multi-platform application reporting and XML/SOAP enabling of Gen applications., and developer productivity tools such as Access Gen, APMConnect, QA Console and Upgrade Console from Response Systems Version 8.6 of CA Gen came to market in June 2016. Version 8.6.3 of CA Gen was released in 2021. Following this release, Broadcom have switched to a continuous delivery model with new features to be delivered as patches.

    Read more →
  • Information logistics

    Information logistics

    Information Logistics (IL) deals with the flow of information between human or machine actors within or between any number of organizations that in turn form a value creating network (see, e.g.). IL is closely related to information management, information operations and information technology. == Definition == The term Information Logistics (IL) may be used in either of two ways: Firstly, it can be defined as "managing and controlling information handling processes optimally with respect to time (flow time and capacity), storage, distribution and presentation in such a way that it contributes to company results in concurrence with the costs of capturing (creation, searching, maintenance etc)." (Petri,2017) Thus IL utilizes logistic principles to optimize information handling. Secondly, IL can be seen as a concept using information technology to optimize logistics. A term which is closely related to the first meaning of Information Logistics is Data Logistics, a concept used in Computer Networking. "The study of solutions to problems in Computer Systems that flexibly span resources and services relating to Data Movement, Data Storage and Data Processing." [ref?] Systems that support general Data Logistics solutions thus must span the traditionally separate fields of Networking, File/Database Systems and Process Management. Data Logistics is a more general form of the term Logistical Networking, used as the name of a particular network storage architecture and software stack. == Goal == The goal of Information Logistics is to deliver the right product, consisting of the right information element, in the right format, at the right place at the right time for the right people at the right price and all of this is customer demand driven. If this goal is to be achieved, knowledge workers are best equipped with information for the task at hand for improved interaction with its customers and machines are enabled to respond automatically to meaningful information. Methods for achieving the goal are: the analysis of information demand intelligent information storage the optimization of the flow of information maintaining both security and organizational flexibility integrated information and billing solutions The expression was formed by the Indian mathematician and librarian S. R. Ranganathan . The supply of a product is part of the discipline Logistics. The purpose of this discipline is described as follows: Logistics is the teachings of the plans and the effective and efficient run of supply. The contemporary logistics focuses on the organization, planning, control and implementation of the flow of goods, money, information and people. Information Logistics focusses on information. Information (from Latin informare: "shape, shapes, instruct") means in a general sense everything that adds knowledge and thus reduce ignorance or lack of precision. In a stricter sense, raw data only becomes information to those who can interpret it. Interpreting relevant, related information produces insight that either leads to existing, or eventually builds new, knowledge. == Information element == An information element (IE) is an information component that is located in the organizational value chain. The combination of certain IEs leads to an information product (IP), which is any final product in the form of information that a person needs to have. When a higher number of different IEs are required, it often results in more planning problems in capacity and inherently leads to a non-delivery of the IP. To illustrate the concept of an IP, an example is shown of a bottleneck analysis in HR (by J. Willems 2008). Here, the illustration shows how the information elements (e.g. qualifications) build up the information product (e.g. HR file). == Data logistics == Data logistics is a concept that developed independently of information logistics in the 1990s, in response to the explosion of Internet content and traffic due to the invention of the World Wide Web (WWW). Some motivations for the emergence of interest in Data Logistics included: The incorporation of network hyperlinks into content encoded in HTML encouraged users to freely dereference those links without regard to, or in many cases without even having any knowledge of, the identity (much less the geographical or network topological location of) the target Web server. The growth in the volume of Web hits, combined with the steady increase in the size of Web-delivered objects such as images, audio and video clips resulted in the localized overloading of the bandwidth and processing resources of the local and/or wide area network and/or the Web server infrastructure. The resulting Internet bottleneck can cause Web clients to experience poor performance or complete denial of access to servers that host high volume sites (the so-called Slashdot effect). The growth in all Internet traffic, especially across international telecommunication links, resulted in stress to institutional infrastructure and high costs on networks that billed Internet traffic on a per-use basis. Much of this traffic was redundant, the results of repeated requests by many independent users to access the same stored files and content. Large files and content retrieved from distant Web servers was often delayed due to high delays experienced over long and complex Internet paths. These factors led to interest in the use of large scale storage (and to a lesser extent, processing) resources to cache the response to network requests, first at the Internet endpoint using a Web browser cache and later at intermediate network locations using shared network caches. This line of development also gave rise to Web server replication and other techniques for offloading and distributing the work of delivering large volume Web services to widely dispersed client communities, ultimately resulting in the creation of modern Content delivery networks. At the same time, research efforts in server replication and content delivery gave rise to a number of related projects and strategies, including Logistical Networking (LN). The name LN was intended as an analogy to physical supply chain logistics, in which goods are not only carried from source to destination on networks of roads, but are also stored at warehouses located throughout the transportation infrastructure. This led to a nomenclature in which LN network storage resources are termed "storage depots". The principles that underpin LN have been abstracted into the more general study of scheduling and optimization across the traditional infrastructure silos of Storage, Networking and Processing which was named Data Logistics. === Illustrative examples of data logistics === Data Caching and Replication are classic examples of Data Logistics solutions to problems in Computer Systems and Networking with high data access latencies or data transfer resource limitations. It works mainly across the areas of data transfer and data storage. Dynamic Compression in data transfer is another example which uses computational resources to minimize the bandwidth requirements of data transfer.

    Read more →
  • Elements of AI

    Elements of AI

    Elements of AI is a massive open online course (MOOC) teaching the basics of artificial intelligence. The course, originally launched in 2018, is designed and organized by the University of Helsinki and learning technology company MinnaLearn. The course includes modules on machine learning, neural networks, the philosophy of artificial intelligence, and using artificial intelligence to solve problems. It consists of two parts: Introduction to AI and its sequel, Building AI, that was released in late 2020. In November 2019, the course was named one of four winners of MIT’s Inclusive Innovation Challenge. University of Helsinki's computer science department is known as the alma mater of Linus Torvalds, a Finnish-American software engineer who is the creator of the Linux kernel, which is the kernel for Linux operating systems. == EU’s AI pledge == The government of Finland has pledged to offer the course for all EU citizens by the end of 2021, as the course is made available in all the official EU languages. The initiative was launched as part of Finland's Presidency of the Council of the European Union in 2019, with the European Commission providing translations of the course materials. In 2017, Finland launched an AI strategy to stay competitive in the field of AI amid growing competition between China and the United States. With the support of private companies and the government, Finland's now-realized goal was to get 1 percent of its citizens to participate in Elements of AI. Other governments have also given their support to the course. For instance, Germany's Federal Minister for Economic Affairs and Energy Peter Altmeier has encouraged citizens to take part in the course to help Germany gain a competitive advantage in AI. Sweden's Minister for Energy and Minister for Digital Development Anders Ygeman has said that Sweden aims to teach 1 percent of its population the basics of AI like Finland has. == Participants == Elements of AI had enrolled more than 1 million students from more than 110 countries by May 2023. A quarter of the course's participants are aged 45 and over, and some 40 percent are women. Among Nordic participants, the share of women is nearly 60 percent. In September 2022, the course was available in Finnish, Swedish, Estonian, English, German, Latvian, Norwegian, French, Belgian, Czech, Greek, Slovakian, Slovenian, Latvian, Lithuanian, Portuguese, Spanish, Irish, Icelandic, Maltese, Croatian, Romanian, Italian, Dutch, Polish, and Danish.

    Read more →
  • Algorithmic mechanism design

    Algorithmic mechanism design

    Algorithmic mechanism design (AMD) lies at the intersection of economic game theory, optimization, and computer science. The prototypical problem in mechanism design is to design a system for multiple self-interested participants, such that the participants' self-interested actions at equilibrium lead to good system performance. Typical objectives studied include revenue maximization and social welfare maximization. Algorithmic mechanism design differs from classical economic mechanism design in several respects. It typically employs the analytic tools of theoretical computer science, such as worst case analysis and approximation ratios, in contrast to classical mechanism design in economics which often makes distributional assumptions about the agents. It also considers computational constraints to be of central importance: mechanisms that cannot be efficiently implemented in polynomial time are not considered to be viable solutions to a mechanism design problem. This often, for example, rules out the classic economic mechanism, the Vickrey–Clarke–Groves auction. == History == Noam Nisan and Amir Ronen first coined "Algorithmic mechanism design" in a research paper published in 1999.

    Read more →
  • Brian Deer Classification System

    Brian Deer Classification System

    The Brian Deer Classification System (BDC) is a library classification system used to organize materials in libraries with specialized Indigenous collections. The system was created in the mid-1970s by Canadian librarian A. Brian Deer, a Kahnawake Mohawk. It has been adapted for use in a British Columbia version, and also by a small number of First Nations libraries in Canada. == History and usage == Deer designed his classification system while working in the library of the National Indian Brotherhood (now the Assembly of First Nations) from 1974 to 1976. Instead of using a standard library classification scheme, such as that of the Library of Congress, he created a new system to organize the library's historic indigenous research materials and papers. He later worked at the library of the Union of British Columbia Indian Chiefs, where he developed a system for its holdings. He returned to Kahnawake, working at its Cultural Centre at Kahnawake and the Kahnawake Branch branch of the Mohawk Nation Office. His system was flexible, and he created new forms for their collections. The new systems Deer created were designed specifically for the materials in each collection according to the concerns of local Indigenous people at the time (for example, categories included land claims, treaty rights, resource management, and Elders' stories). Between 1978 and 1980, the system was adapted for use in British Columbia by Gene Joseph and Keltie McCall while they were working at the Union of British Columbia Indian Chiefs, becoming known as BDC-BC. Joseph later adapted it further for use in the Xwi7xwa Library at University of British Columbia, Vancouver. Though the Brian Deer Classification was not created as a universal classification solution for Indigenous resources, the system has provided a foundation for specialized libraries to create their own localized classification schemes. Variations of the Brian Deer Classification System are used in a small number of Canadian libraries. One prominent library using BDC is the X̱wi7x̱wa Library at the University of British Columbia, which uses a British Columbia-focused version of BDC along with First Nations House of Learning subject headings. The Union of British Columbia Indian Chiefs Resource Centre issued a revised BDC-BC in 2014, with the goal of providing users with a more flexible and culturally appropriate approach to organizing their resources. The Aanischaaukamikw Cree Cultural Institute in Oujé-Bougoumou, Quebec, implemented a local adaptation of BDC when they opened in 2012. In 2020 the Carrier Sekani Tribal Council in Prince George, British Columbia, shifted from organizing its library with the Dewey Decimal Classification to using a version of the BDC. They added new subject heading categories for topics of local interest such as the crisis of Missing and murdered Indigenous women. Simon Fraser University Library began developing the Indigenous Curriculum Resource Centre (ICRC) in 2020, with the physical space opening in 2023. The ICRC is Call to Action 21 of SFU's Aboriginal Reconciliation Council's final report, Walk This Path With Us. Through its collection, the ICRC supports those interested in learning about how and why decolonizing pedagogy and teaching practices are important. The physical items in the collection are catalogued using a modified Brian Deer Classification system. In 2022 Kwantlen Polytechnic University’s χʷəχʷéy̓əm Indigenous Collection released a revised BDC-BC System. This BDC contains works exclusively with Indigenous authored materials and expands the cuttering systems of previous BDC, with the result that much of the collection reflects a spatial relationality. The implementation of this BDC was possible due to the tireless work at Xwi7xwa Library, Union of British Columbia Indian Chiefs Resource Centre, and Simon Fraser University Library's Indigenous Curriculum Resource Centre. == Structure == The high-level organizational structure of BDC reflects a First Nations worldview, with an emphasis on relationships between and among people, animals, and the land. Subcategories demonstrate the relationships among First Nations by grouping them geographically as opposed to alphabetically; the latter is a practice frequently used for specific topics in the Library of Congress Classification. The top-level hierarchy of the X̱wi7x̱wa Library adaptation of BDC-BC demonstrates the emphasis on access to subjects prioritized by a First Nation collection: Reference Materials Local History History International Education Economic Development Housing and Community Development Criminal Justice System Constitution (Canada) and First Nations Self Government Rights and Title Natural Resources Community Resources Health World View Fine Arts Languages Literature The system is not designed to provide a comprehensive description of all topics of interest to North American Indigenous peoples; in addition, its use is limited in scope, being intended for small and specialized libraries. While English is used in the classification scheme as a common language among First Nations peoples and non-Indigenous library users, Indigenous spellings and terminology that local library users would expect to find are used to provide access. Short and easily remembered call numbers are used to facilitate use by both library workers and patrons, with the recognition that Indigenous libraries often have a small staff and limited resources to devote to cataloging. Beyond its simplicity, one potential drawback of the system is its shortage of clear guidelines for application, which provides flexibility but can also result in inconsistencies within and between library catalogs. Because few libraries use the BDC and there are limited examples for use as case studies, implementing the system and keeping it up-to-date can prove a challenge for libraries with limited resources. However, X̱wi7x̱wa Library head librarian Ann Doyle describes the system as "an important part of the body of Indigenous scholarship" that should be retained as a reflection of Indigenous worldviews, as well as for ease of access for Indigenous library users.

    Read more →
  • Ontology merging

    Ontology merging

    Ontology merging defines the act of bringing together two conceptually divergent ontologies or the instance data associated to two ontologies. This is similar to work in database merging (schema matching). This merging process can be performed in a number of ways, manually, semi automatically, or automatically. Manual ontology merging although ideal is extremely labour-intensive and current research attempts to find semi or entirely automated techniques to merge ontologies. These techniques are statistically driven often taking into account similarity of concepts and raw similarity of instances through textual string metrics and semantic knowledge. These techniques are similar to those used in information integration employing string metrics from open source similarity libraries.

    Read more →
  • OrCam device

    OrCam device

    OrCam devices such as OrCam MyEye are portable, artificial vision devices that allow visually impaired people to understand text and identify objects through audio feedback, describing what they are unable to see. Reuters described an important part of how it works as "a wireless smartcamera" which, when attached outside eyeglass frames, can read and verbalize text, and also supermarket barcodes. This information is converted to spoken words and entered "into the user’s ear." Face-recognition is also part of OrCam's feature set. == Devices == OrCam Technologies Ltd has created three devices; OrCam MyEye 2.0, OrCam MyEye 1, and OrCam MyReader. OrCam My Eye 2.0: OrCam debuted the second-generation model, the OrCam MyEye 2.0 in December 2017. About the size of a finger, the MyEye 2.0 is battery-powered, and has been compressed into a self-contained device. The device snaps onto any eyeglass frame magnetically. Orcam 2.0 is small and light (22.5 grams/0.8 ounces) with functionality to restore independence to the visually impaired. It comes in two versions. The basic model can read text, and a more advanced one adds features such as face recognition and barcode reading. As of July 2023, the retail cost is between $4000 and $6000 (USD). == Clinical Studies == JAMA Ophthalmology: In 2016 JAMA Ophthalmology conducted a study involving 12 legally blind participants to evaluate the usefulness of a portable artificial vision device (OrCam) for patients with low vision. The results showed that the OrCam device improved the patient's ability to perform tasks simulating those of daily living, such as reading a message on an electronic device, a newspaper article or a menu. Wills Eye: Wills Eye was a clinical study designed to measure the impact of the OrCam device on the quality of life of patients with End-stage Glaucoma. The conclusion was that OrCam, a novel artificial vision device using a mini-camera mounted on eyeglasses, allowed legally blind patients with end-stage glaucoma to read independently, subsequently improving their quality of life. == Employee testing == The New York Times described how a pre-release OrCam device was used by a Coloboma-impaired employee of the device's developer in 2013 for grocery shopping. It was the small size of the prototype rather than the functionality that gave her added mobility in an Israeli store's aisles. Added life-enhancement was described: "to both recognize and speak .. bus numbers .. traffic lights." == Social aspects == In contrast to an early version of Google Glass, which "failed ... because .. Glass wearers were ..mocked", early OrCam devices used designs that "clip unobtrusively on your shirt or perhaps your belt." In addition, it does not record sounds or images, what was called "the privacy puzzle that stumped Google. One 2018 technology reviewer wrote that he wished it had a headphone jack "so it would be less disruptive in places where others are working." An attempt was made to use bone conduction. == USA introduction == In 2018 a team headed by New York Assemblyman Dov Hikind introduced use of OrCam devices to ten individuals screened for what he termed "new Israeli technology that really makes a difference to the blind." Although not the first USA success, it was more focused than a publicly funded project that was authorized in 2016 by a California government agency. Also in 2016 the Chicago Lighthouse for the Blind demonstrated its use. == Technology == In the area of hardware, miniaturization has been quite important, but one major area, software, was mentioned by Assemblyman Hikind, and reported by The Times of Israel is the "AI-driven algorithms" that "reports .. how many people are in a room. In addition to reading printed text, it can also aid in "seeing" what is on a television or computer screen. Although OrCam can't help with handwritten information, it can reuse information, the basis of recognizing "US currency, and even faces." === Features === While early language support was for English, French, German, Hebrew and Spanish, others now available include Danish, Dutch, Finnish, Italian, Norwegian, Portuguese and Swedish. == History == OrCam Technologies Ltd was founded in 2010 by Professor Amnon Shashua and Ziv Aviram. Before co-founding OrCam, the two in 1999 co-founded Mobileye, an Israeli company that develops vision-based advanced driver-assistance systems (ADAS) providing warnings for collision prevention and mitigation, which was acquired by Intel for $15.3 billion in 2017. OrCam launched OrCam MyEye in 2013 after years of development and testing, and began selling it commercially in 2015. In its early years, the company raised $22 million, $6 million of which came from Intel Capital. By 2014, Intel, which was also investing in Google Glass, had invested $15 million in Orcam. In March 2017, OrCam had raised $41 million in capital, making it worth $600 million. === Marketing === One outcome of initial marketing in the USA was that they "reached a deal with the California Department of Rehabilitation, ...qualifying blind and visually impaired state residents." == OrCam Technologies Ltd == OrCam Technologies Ltd. is the Israeli-based company producing these OrCam devices, which are wearable artificial intelligence space. The company develops and manufactures assistive technology devices for individuals who are visually impaired, partially sighted, blind, print disabilities, or have other disabilities. OrCam headquarters is located in Jerusalem, operating under the company name OrCam Technologies Ltd. OrCam has over 150 employees, is headquartered in Jerusalem, and has offices in New York, Toronto, and London. == Awards == 2018 Last Gadget Standing Winner 2018 CES Innovation Awards Honoree in Accessible Tech 2017 NAIDEX Innovation Award 2016 Louise Braille Corporate Recognition Award 2016 Silmo-d-Or Award

    Read more →
  • Operational system

    Operational system

    An operational system is a term used in data warehousing to refer to a system that is used to process the day-to-day transactions of an organization. These systems are designed in a manner that processing of day-to-day transactions is performed efficiently and the integrity of the transactional data is preserved. == Synonyms == Sometimes operational systems are referred to as operational databases, transaction processing systems, or online transaction processing systems (OLTP). However, the use of the last two terms as synonyms may be confusing, because operational systems can be batch processing systems as well. Any enterprise must necessarily maintain a lot of data about its operation.

    Read more →
  • Voiceverse NFT plagiarism scandal

    Voiceverse NFT plagiarism scandal

    In January 2022, 15—the pseudonymous Massachusetts Institute of Technology (MIT) artificial intelligence researcher and creator of the non-commercial generative artificial intelligence voice synthesis research project 15.ai—discovered that the blockchain-based technology company Voiceverse had plagiarized from their platform. Voiceverse marketed itself as a service that offered AI voice cloning technology that could be purchased and traded as non-fungible tokens (NFTs). Amid heightened controversy over NFTs in the gaming industry, voice actor Troy Baker (who has been described as one of the most famous voice actors in video games) announced his partnership with Voiceverse on January 14, 2022, triggering immediate backlash over concerns about the environmental impact of NFTs, potential for fraud, predatory monetization in video games, and the potential of AI displacing jobs for human voice actors. Later that same day, 15 revealed through server logs that Voiceverse had generated voice lines using 15's free text-to-speech platform, pitch-shifted the audio to make them unrecognizable, and falsely marketed the samples as their own technology before selling them as NFTs. Within an hour of being confronted with evidence, Voiceverse confessed and stated that their marketing team had used 15.ai without proper attribution while rushing to create a technology demo to coincide with Baker's partnership announcement, further exacerbating the already negative reception to the original announcement. In response, 15 replied "Go fuck yourself"; the interaction went viral and garnered a large amount of support for the developer. News publications universally characterized this incident as Voiceverse having "stolen" from 15.ai. The next day, Baker appeared on a podcast and stated that his motivation had been to help independent creators who were unable to afford professional voice actors. Following continued backlash and the plagiarism revelation, Baker ended his partnership with Voiceverse on January 31, 2022. Subsequently, the incident was documented in multiple AI ethics databases, criticisms of predatory monetization in video games, and retrospectives as one of the earliest instances of plagiarism and theft stemming from artificial intelligence during the AI boom. == Background == === Troy Baker === Troy Baker is a prominent voice actor in the video game industry best known for his performances as Joel Miller in The Last of Us franchise. Baker has been described as "ubiquitous" by Polygon, "one of the most high-profile and prolific voice actors in video games" by Eurogamer, and "arguably the most famous voice actor in the gaming industry" by GameGuru. His other prominent roles include voicing Agent John "Jonesy" Jones in Fortnite, Booker DeWitt in BioShock Infinite, and both Batman and Joker in multiple Batman video games. As of October 2025, Baker holds the record for the most acting nominations at the BAFTA Games Awards, with five between 2013 and 2021. === Voiceverse === Voiceverse is a blockchain-based startup founded by the Bored Ape Yacht Club that marketed itself as offering AI voice cloning technology in the form of NFTs. Prior to the announcement of their partnership with Baker, Voiceverse had partnered with LOVO, Inc., an AI voice platform that, according to LOVO, could generate human-like voices. Voiceverse stated that any user who purchases a voice NFT would have unlimited and perpetual access to the voice model, which could be used to create content such as audiobooks, YouTube videos, podcasts, e-learning materials, in-game voice chat, and Zoom calls. Voiceverse promised that buyers would "OWN [sic] all of the IP" of content they created using these voices. Voiceverse's roadmap included plans to release 8,888 initial voice NFTs, a feature to add emotions to existing voices, and the ability for users to mint their own voices as NFTs. Prior to Baker's partnership, Voiceverse had also partnered with voice actors Charlet Chung, who voices D.Va in Overwatch, and Andy Milonakis of The Andy Milonakis Show. === 15.ai === 15.ai is a free web application launched in 2020 that uses artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Created by a pseudonymous artificial intelligence researcher known as 15, who began developing the technology as a freshman during their undergraduate research at MIT, it was an early example of an application of generative artificial intelligence during the initial stages of the AI boom. The platform showed that deep neural networks could generate emotionally expressive speech with only 15 seconds of speech; the name "15.ai" references the creator's statement that a voice can be convincingly cloned with just 15 seconds of audio, as opposed to the tens of hours of data previously required. 15.ai became an Internet phenomenon in early 2021 when content utilizing it went viral on social media and quickly gained widespread use among various Internet fandoms. 15 has emphasized that it remain free and non-commercial; it only requires users to give proper credit when using the service for content creation. === NFTs in the video game industry === By early 2022, NFTs had become highly controversial within the gaming industry. Critics raised concerns about their environmental impact due to the significant energy consumption of blockchain technology. In addition, the prevalence of scams, fraud, and potential money laundering associated with NFT sales, as well as fears that NFTs were a new form of predatory monetization following the increasing frequency of loot boxes, caused vocal pushback from the gaming community. Several major gaming companies had begun exploring NFT integration into their products, though fan backlash had already forced some projects to be cancelled. On December 16, 2021, the developers of S.T.A.L.K.E.R. 2: Heart of Chernobyl announced that they would be including NFTs in the game, but cancelled within an hour of the announcement due to immediate universal backlash. Simultaneously, the rise of AI voice technology raised concerns among voice actors about potential job displacement and the devaluation of their work amidst the voice acting industry's ongoing struggles for better compensation and working conditions. == Partnership announcement and backlash == On January 14, 2022, 1:02 a.m. EST, Baker announced on Twitter that he was partnering with Voiceverse "to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create." The announcement concluded with the statement "You can hate. Or you can create." Baker's specific role with Voiceverse remained unclear at the time of the announcement. Along with Baker's announcement, Voiceverse promoted their supposed voice AI technology on Twitter by posting animated videos that featured a cat character created by NFT firm Chubbiverse. The videos concluded with text that read "The Voice Powered By Voiceverse"; Voiceverse stated on Twitter that the voices in the animations had been generated using their own AI voice synthesis technology and presented the videos as a technology demonstration of their voice NFT capabilities. The announcement provoked immediate and widespread backlash from the gaming community. Baker's tweet received thousands of replies and quote retweets (the vast majority of which were negative), far more than the number of likes; Michael McWhertor of Polygon described it as a "textbook example of being ratioed" and commented that reactions had been amplified by the final part of Baker's announcement. Michael Beckwith of Metro called Baker's approach "bizarrely aggressive". Later that day, Baker responded to the backlash by apologizing for his choice of words. He said he appreciated people's thoughts and acknowledged that the "hate/create part might have been a bit antagonistic," calling it a "bad attempt to bring levity". Despite the apology, Baker and his fellow voice actors did not distance themselves from Voiceverse at this point. At the same time, Voiceverse attempted to address the criticisms, stating that they were working to move to more environmentally friendly blockchain technology and that voice actors would receive royalties from NFT sales, with actors benefiting from any increase in NFT value. == Plagiarism revelation == On December 13, 2021, amidst the increasingly negative reactions toward NFTs among the general public, the creator of 15.ai (known pseudonymously as 15) announced that they had "no interest in incorporating NFTs into any aspect of [their] work." On January 14, 2022, 11:17 a.m. EST (10 hours after Baker's initial announcement), 15 commented on the Voiceverse venture, stating that it "sounds like a scam". Two hours later, at 1:20 p.m., 15 explicitly accused Voiceverse of "actively attempting to appropriate [15's] work for [Voiceverse's] own benefit." 15 provided evidence through

    Read more →
  • Collision problem

    Collision problem

    The r-to-1 collision problem is an important theoretical problem in complexity theory, quantum computing, and computational mathematics. The collision problem most often refers to the 2-to-1 version: given n {\displaystyle n} even and a function f : { 1 , … , n } → { 1 , … , n } {\displaystyle f:\,\{1,\ldots ,n\}\rightarrow \{1,\ldots ,n\}} , we are promised that f is either 1-to-1 or 2-to-1. We are only allowed to make queries about the value of f ( i ) {\displaystyle f(i)} for any i ∈ { 1 , … , n } {\displaystyle i\in \{1,\ldots ,n\}} . The problem then asks how many such queries we need to make to determine with certainty whether f is 1-to-1 or 2-to-1. == Classical solutions == === Deterministic === Solving the 2-to-1 version deterministically requires n 2 + 1 {\textstyle {\frac {n}{2}}+1} queries, and in general distinguishing r-to-1 functions from 1-to-1 functions requires n r + 1 {\textstyle {\frac {n}{r}}+1} queries. This is a straightforward application of the pigeonhole principle: if a function is r-to-1, then after n r + 1 {\textstyle {\frac {n}{r}}+1} queries we are guaranteed to have found a collision. If a function is 1-to-1, then no collision exists. Thus, n r + 1 {\textstyle {\frac {n}{r}}+1} queries suffice. If we are unlucky, then the first n / r {\displaystyle n/r} queries could return distinct answers, so n r + 1 {\textstyle {\frac {n}{r}}+1} queries is also necessary. === Randomized === If we allow randomness, the problem is easier. By the birthday paradox, if we choose (distinct) queries at random, then with high probability we find a collision in any fixed 2-to-1 function after Θ ( n ) {\displaystyle \Theta ({\sqrt {n}})} queries. == Quantum solution == The BHT algorithm, which uses Grover's algorithm, solves this problem optimally by only making O ( n 1 / 3 ) {\displaystyle O(n^{1/3})} queries to f. The matching lower bound of Ω ( n 1 / 3 ) {\displaystyle \Omega (n^{1/3})} was proved by Aaronson and Shi using the polynomial method.

    Read more →
  • Nextcloud

    Nextcloud

    Nextcloud is a modular workspace platform designed to provide teams and businesses with a comprehensive environment for digital collaboration. Beyond central data management, it integrates office suites like Collabora Online and EuroOffice office suites. for seamless, cooperative workflows. The platform features built-in tools for chat, videoconferencing, and a privacy-focused AI assistant capable of running entirely on local LLMs. Supported by a rich ecosystem of apps, it can be hosted in the cloud or on premises and can scale up to millions of users. It has been translated into over 100 languages. == Features == Nextcloud files are stored in conventional directory structures, accessible via WebDAV if necessary. A SQLite, MySQL/MariaDB or PostgreSQL database is required to provide additional functionality like permissions, shares, and comments. Nextcloud can synchronize with local clients running Windows (Windows 8.1 and above), macOS (10.14 or later), Linux and FreeBSD. Nextcloud permits user and group administration locally or via different backends like OpenID or LDAP. Content can be shared inside the system by defining granular read/write permissions between users and groups. Nextcloud users can create public URLs when sharing files. Logging of file-related actions, as well as disallowing access based on file access rules is also available. Security options like brute-force protection and multi-factor authentication using TOTP, WebAuthn, Oauth2, and OpenID Connect are available. Nextcloud has planned new features such as monitoring capabilities, full-text search and Kerberos authentication, as well as audio/video conferencing, expanded federation and smaller user interface improvements. == History == In April 2016 Frank Karlitschek and most core contributors left ownCloud Inc. These included some of ownCloud's staff according to sources near to the ownCloud community. Karlitschek and many of these contributors went on to fork ownCloud, creating Nextcloud. The fork was preceded by a blog post of Karlitschek announcing his departure and raising questions about the management of the ownCloud, its community, and priorities between growth, money, and sustainability. There have been no official statements about the reason for the fork. However, Karlitschek mentioned the fork several times in a talk at the 2018 FOSDEM conference and in two appearances on the FLOSS Weekly podcast, emphasizing cultural mismatch between open source developers and business oriented people not used to the open source community. On June 2, within 12 hours of the announcement of the fork, the American entity "ownCloud Inc." announced that it is shutting down with immediate effect, stating that "[...] main lenders in the US have cancelled our credit. Following American law, we are forced to close the doors of ownCloud, Inc. with immediate effect and terminate the contracts of 8 employees." ownCloud Inc. accused Karlitschek of poaching developers, while Nextcloud developers such as Arthur Schiwon stated that he "decided to quit because not everything in the ownCloud Inc. company world evolved as I imagined". ownCloud GmbH continued operations, secured financing from new investors and took over the business of ownCloud Inc. In April 2018 Informationstechnikzentrum Bund (ITZBund) reported Nextcloud won the tender for "Bundescloud" (Germany government cloud) project. In August 2019 it was announced that the governments of France, Sweden and the Netherlands would use Nextcloud for file transfer. In January 2020 Nextcloud 18 "Nextcloud Hub" was released. The major change was direct integration with an Office suite (OnlyOffice) and Nextcloud announced that their goal was to compete with Office 365 and Google Docs. A partnership with Ionos was revealed – its hosting location in Germany and compliance with GDPR should support the goal of data sovereignty. In spring 2020 remote work and web conferencing usage increased due to the COVID-19 pandemic and Nextcloud released version 19 with chat and videoconferencing Talk app integrated into the application core. Communication with an optional "high performance back-end" allows self-hosting of web conferences with more than 10 participants. Collabora Online was introduced as another integrated office suite. In August 2021 Nextcloud was chosen as a collaboration platform for European cloud software GAIA-X. In a September 2021 European Commission report it was mentioned as "the most widely deployed Open Source content collaboration platform" Following the 2025 United States tariffs against the European Union, fear of overreliance on US cloud providers such as Microsoft 365 and Google Workspace increased, with Nextcloud being one of the foremost contenders to replace them. Some governmental organisations including the European Data Protection Supervisor and the German state of Schleswig-Holstein have since switched from Microsoft's Sharepoint to Nextcloud. According to Nextcloud, during the first 5 months of 2025, customer interest in the software had tripled.

    Read more →
  • Run-time algorithm specialization

    Run-time algorithm specialization

    In computer science, run-time algorithm specialization is a methodology for creating efficient algorithms for costly computation tasks of certain kinds. The methodology originates in the field of automated theorem proving and, more specifically, in the Vampire theorem prover project. The idea is inspired by the use of partial evaluation in optimising program translation. Many core operations in theorem provers exhibit the following pattern. Suppose that we need to execute some algorithm a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} in a situation where a value of A {\displaystyle A} is fixed for potentially many different values of B {\displaystyle B} . In order to do this efficiently, we can try to find a specialization of a l g {\displaystyle {\mathit {alg}}} for every fixed A {\displaystyle A} , i.e., such an algorithm a l g A {\displaystyle {\mathit {alg}}_{A}} , that executing a l g A ( B ) {\displaystyle {\mathit {alg}}_{A}(B)} is equivalent to executing a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} . The specialized algorithm may be more efficient than the generic one, since it can exploit some particular properties of the fixed value A {\displaystyle A} . Typically, a l g A ( B ) {\displaystyle {\mathit {alg}}_{A}(B)} can avoid some operations that a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} would have to perform, if they are known to be redundant for this particular parameter A {\displaystyle A} . In particular, we can often identify some tests that are true or false for A {\displaystyle A} , unroll loops and recursion, etc. == Difference from partial evaluation == The key difference between run-time specialization and partial evaluation is that the values of A {\displaystyle A} on which a l g {\displaystyle {\mathit {alg}}} is specialised are not known statically, so the specialization takes place at run-time. There is also an important technical difference. Partial evaluation is applied to algorithms explicitly represented as codes in some programming language. At run-time, we do not need any concrete representation of a l g {\displaystyle {\mathit {alg}}} . We only have to imagine a l g {\displaystyle {\mathit {alg}}} when we program the specialization procedure. All we need is a concrete representation of the specialized version a l g A {\displaystyle {\mathit {alg}}_{A}} . This also means that we cannot use any universal methods for specializing algorithms, which is usually the case with partial evaluation. Instead, we have to program a specialization procedure for every particular algorithm a l g {\displaystyle {\mathit {alg}}} . An important advantage of doing so is that we can use some powerful ad hoc tricks exploiting peculiarities of a l g {\displaystyle {\mathit {alg}}} and the representation of A {\displaystyle A} and B {\displaystyle B} , which are beyond the reach of any universal specialization methods. == Specialization with compilation == The specialized algorithm has to be represented in a form that can be interpreted. In many situations, usually when a l g A ( B ) {\displaystyle {\mathit {alg}}_{A}(B)} is to be computed on many values of B {\displaystyle B} in a row, a l g A {\displaystyle {\mathit {alg}}_{A}} can be written as machine code instructions for a special abstract machine, and it is typically said that A {\displaystyle A} is compiled. The code itself can then be additionally optimized by answer-preserving transformations that rely only on the semantics of instructions of the abstract machine. The instructions of the abstract machine can usually be represented as records. One field of such a record, an instruction identifier (or instruction tag), would identify the instruction type, e.g. an integer field may be used, with particular integer values corresponding to particular instructions. Other fields may be used for storing additional parameters of the instruction, e.g. a pointer field may point to another instruction representing a label, if the semantics of the instruction require a jump. All instructions of the code can be stored in a traversable data structure such as an array, linked list, or tree. Interpretation (or execution) proceeds by fetching instructions in some order, identifying their type, and executing the actions associated with said type. In many programming languages, such as C and C++, a simple switch statement may be used to associate actions with different instruction identifiers. Modern compilers usually compile a switch statement with constant (e.g. integer) labels from a narrow range by storing the address of the statement corresponding to a value i {\displaystyle i} in the i {\displaystyle i} -th cell of a special array, as a means of efficient optimisation. This can be exploited by taking values for instruction identifiers from a small interval of values. == Data-and-algorithm specialization == There are situations when many instances of A {\displaystyle A} are intended for long-term storage and the calls of a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} occur with different B {\displaystyle B} in an unpredictable order. For example, we may have to check a l g ( A 1 , B 1 ) {\displaystyle {\mathit {alg}}(A_{1},B_{1})} first, then a l g ( A 2 , B 2 ) {\displaystyle {\mathit {alg}}(A_{2},B_{2})} , then a l g ( A 1 , B 3 ) {\displaystyle {\mathit {alg}}(A_{1},B_{3})} , and so on. In such circumstances, full-scale specialization with compilation may not be suitable due to excessive memory usage. However, we can sometimes find a compact specialized representation A ′ {\displaystyle A^{\prime }} for every A {\displaystyle A} , that can be stored with, or instead of, A {\displaystyle A} . We also define a variant a l g ′ {\displaystyle {\mathit {alg}}^{\prime }} that works on this representation and any call to a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} is replaced by a l g ′ ( A ′ , B ) {\displaystyle {\mathit {alg}}^{\prime }(A^{\prime },B)} , intended to do the same job faster.

    Read more →
  • Single source of truth

    Single source of truth

    In information science and information technology, single source of truth (SSOT) architecture, or single point of truth (SPOT) architecture, for information systems is the practice of structuring information models and associated data schemas such that every data element is mastered (or edited) in only one place, providing data normalization to a canonical form (for example, in database normalization or content transclusion). There are several scenarios with respect to copies and updates: The master data is never copied and instead only references to it are made; this means that all reads and updates go directly to the SSOT. The master data is copied but the copies are only read and only the master data is updated; if requests to read data are only made on copies, this is an instance of CQRS. The master data is copied and the copies are updated; this needs a reconciliation mechanism when there are concurrent updates. Updates on copies can be thrown out whenever a concurrent update is made on the master, so they are not considered fully committed until propagated to the master. (many blockchains work that way.) Concurrent updates are merged. (if an automatic merge fails, it could fall back on another strategy, which could be the previous strategy or something else like manual intervention, which most source version control systems do.) The advantages of SSOT architectures include easier prevention of mistaken inconsistencies (such as a duplicate value/copy somewhere being forgotten), and greatly simplified version control. Without a SSOT, dealing with inconsistencies implies either complex and error-prone consensus algorithms, or using a simpler architecture that's liable to lose data in the face of inconsistency (the latter may seem unacceptable but it is sometimes a very good choice; it is how most blockchains operate: a transaction is actually final only if it was included in the next block that is mined). Ideally, SSOT systems provide data that are authentic (and authenticatable), relevant, and referable. Deployment of an SSOT architecture is becoming increasingly important in enterprise settings where incorrectly linked duplicate or de-normalized data elements (a direct consequence of intentional or unintentional denormalization of any explicit data model) pose a risk for retrieval of outdated, and therefore incorrect, information. Common examples (i.e., example classes of implementation) are as follows: In electronic health records (EHRs), it is imperative to accurately validate patient identity against a single referential repository, which serves as the SSOT. Duplicate representations of data within the enterprise would be implemented by the use of pointers rather than duplicate database tables, rows, or cells. This ensures that data updates to elements in the authoritative location are comprehensively distributed to all federated database constituencies in the larger overall enterprise architecture. EHRs are an excellent class for exemplifying how SSOT architecture is both poignantly necessary and challenging to achieve: it is challenging because inter-organization health information exchange is inherently a cybersecurity competence hurdle, and nonetheless it is necessary, to prevent medical errors, to prevent the wasted costs of inefficiency (such as duplicated work or rework), and to make the primary care and medical home concepts feasible (to achieve competent care transitions). Single-source publishing as a general principle or ideal in content management relies on having SSOTs, via transclusion or (otherwise, at least) substitution. Substitution happens via libraries of objects that can be propagated as static copies which are later refreshed when necessary (that is, when refreshing of the copy-paste or import is triggered by a larger updating event). Component content management systems are a class of content management systems that aim to provide competence on this level. == Implementation == === Ontologic interactions === An acknowledged prerequisite (of the notion that any given single source of truth can exist) is that it depends on the ontologic condition that no more than a single truth (about any particular fact or idea) exists, an assertion that is ontologic in both the IT sense and the general sense of that word. In many instances, this presents no problem (for example, within particular namespaces, or even across them, as long as naming collisions or broader name conflicts are adequately handled). The broadest contexts (and thus thorniest, regarding ontologic discrepancies) require adequate epistemic regime comparison and reconciliation (or at least negotiation or transactional exchanges). An archetypal example of this class of reconciliation is that two theological seminary libraries, from two different religions (X and Y), could exchange information with an SSOT architecture, but the unification of truth would reside on the level of the statement that "religion X asserts that God is purple whereas religion Y asserts that God is green", rather than on the level of "God is purple" or "God is green". === Architectures or architectural features === An ideal implementation of SSOT is rarely possible in most enterprises. This is because many organisations have multiple information systems, each of which needs access to data relating to the same entities (e.g., customer). Often these systems are purchased as commercial off-the-shelf products from vendors and cannot be modified in trivial ways. Each of these various systems therefore needs to store its own version of common data or entities, and therefore each system must retain its own copy of a record (hence immediately violating the SSOT approach defined above). For example, an enterprise resource planning (ERP) system (such as SAP or Oracle e-Business Suite) may store a customer record; the customer relationship management (CRM) system also needs a copy of the customer record (or part of it) and the warehouse dispatch system might also need a copy of some or all of the customer data (e.g., shipping address). In cases where vendors do not support such modifications, it is not always possible to replace these records with pointers to the SSOT. For organisations (with more than one information system) wishing to implement a Single Source of Truth (without modifying all but one master system to store pointers to other systems for all entities), some supporting architectures are: Master data management (MDM) Event store and event sourcing (ES) ==== Master data management (MDM) ==== A master data management system typically serves as the source of truth for an organization's metadata, helping to ensure accuracy and consistency throughout that organizations multiple data sources. Typically the MDM acts as a hub for multiple systems, many of which could allow (be the source of truth for) updates to different aspects of information on a given entity. For example, the CRM system may be the "source of truth" for most aspects of the customer, and is updated by a call centre operator. However, a customer may (for example) also update their address via a customer service web site, with a different back-end database from the CRM system. The MDM application receives updates from multiple sources, acts as a broker to determine which updates are to be regarded as authoritative (the golden record) and then syndicates this updated data to all subscribing systems. The MDM application normally requires an ESB to syndicate its data to multiple subscribing systems. ==== Event store and event sourcing (ES) ==== In event oriented architectures, it has become increasingly common to find an implementation of the Event Sourcing pattern which stores the system state as an ordered sequence of state changes. To do this, you need an Event Store, a particular type of database designed to hold all the events that change the state of the system. The event store in an Event Sourcing + Command Query Responsibility Separation + Domain Driven Design + Messaging architecture is in fact a "single source of truth", with the additional advantage that it can also act as an Enterprise Service Bus as it can listen directly to the event store for status changes as everything passes by. In addition, by saving all the events, it also plays the role of Data Warehouse. One last advantage is that through this system the Shared Database pattern can be implemented, another technique not mentioned to obtain a single source of truth. ==== Data warehouse (DW) ==== While the primary purpose of a data warehouse is to support reporting and analysis of data that has been combined from multiple sources, the fact that such data has been combined (according to business logic embedded in the data transformation and integration processes) means that the data warehouse is often used as a de facto SSOT. Generally, however, the data available from the data warehouse are not used to update other systems; rather the DW becomes

    Read more →