AI Chatbot Quill


  • Auralization


    Auralization is a procedure designed to model and simulate the experience of acoustic phenomena rendered as a soundfield in a virtualized space. It is useful for configuring the soundscape of architectural structures, concert venues, and public spaces, as well as for creating coherent sound environments within virtual immersion systems.

    == History ==

    The English term auralization was first used by Kleiner et al. in an article in the Journal of the AES in 1991. The increase of computational power allowed the development of the first acoustic simulation software towards the end of the 1960s.

    == Principles ==

    Auralizations are experienced through systems that render virtual acoustic models. These models are made by convolving or mixing acoustic events recorded 'dry' (or in an anechoic chamber) with a virtual model of an acoustic space, whose characteristics are determined by sampling its impulse response (IR). Once this impulse response $h(t)$ has been determined, the soundfield that a dry signal $s(t)$ produces in the target environment is simulated by convolution:

    $$r(t) = h(t) * s(t)$$

    The resulting sound $r(t)$ is heard as it would be if emitted in that acoustic space.

    == Binaurality ==

    For auralizations to be perceived as realistic, it is critical to emulate human hearing in terms of the position and orientation of the listener's head with respect to the sources of sound. For IR data to be convolved convincingly, the acoustic events are captured either with a dummy head fitted with two microphones, one on each side of the head, to record an emulation of sound arriving at the locations of human ears, or with an ambisonics microphone array whose output is mixed down for binaurality. Head-related transfer function (HRTF) datasets can simplify the process: a monaural IR is measured or simulated, and the audio content is then convolved with the IR of its target acoustic space. In rendering the experience, the transfer function corresponding to the orientation of the head is applied to simulate the corresponding spatial emanation of sound.
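The convolution described under Principles can be sketched in a few lines. This is a minimal illustration in plain Python, assuming the signals are already sampled into short lists; the function name `convolve` and the toy values for `s` and `h` are hypothetical, and a real auralization system would use measured impulse responses and an FFT-based routine.

```python
def convolve(h, s):
    """Discrete convolution r = h * s of an impulse response h with a dry signal s."""
    r = [0.0] * (len(h) + len(s) - 1)
    for i, h_i in enumerate(h):
        for j, s_j in enumerate(s):
            # Each sample of the dry signal excites a scaled, delayed copy of h.
            r[i + j] += h_i * s_j
    return r


# Hypothetical toy data: a single click as the dry signal, and a "room" with
# a direct path plus one echo at half amplitude two samples later.
s = [1.0, 0.0, 0.0, 0.0]
h = [1.0, 0.0, 0.5]
r = convolve(h, s)
print(r)
```

Because the dry signal here is a single click, the output simply reproduces the impulse response padded with silence, which is a quick sanity check on the implementation.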

  • Voiceverse NFT plagiarism scandal


    In January 2022, 15—the pseudonymous Massachusetts Institute of Technology (MIT) artificial intelligence researcher and creator of the non-commercial generative artificial intelligence voice synthesis research project 15.ai—discovered that the blockchain-based technology company Voiceverse had plagiarized from their platform. Voiceverse marketed itself as a service that offered AI voice cloning technology that could be purchased and traded as non-fungible tokens (NFTs). Amid heightened controversy over NFTs in the gaming industry, voice actor Troy Baker (who has been described as one of the most famous voice actors in video games) announced his partnership with Voiceverse on January 14, 2022, triggering immediate backlash over concerns about the environmental impact of NFTs, potential for fraud, predatory monetization in video games, and the potential of AI displacing jobs for human voice actors. Later that same day, 15 revealed through server logs that Voiceverse had generated voice lines using 15's free text-to-speech platform, pitch-shifted the audio to make them unrecognizable, and falsely marketed the samples as their own technology before selling them as NFTs. Within an hour of being confronted with evidence, Voiceverse confessed and stated that their marketing team had used 15.ai without proper attribution while rushing to create a technology demo to coincide with Baker's partnership announcement, further exacerbating the already negative reception to the original announcement. In response, 15 replied "Go fuck yourself"; the interaction went viral and garnered a large amount of support for the developer. News publications universally characterized this incident as Voiceverse having "stolen" from 15.ai. The next day, Baker appeared on a podcast and stated that his motivation had been to help independent creators who were unable to afford professional voice actors. 
Following continued backlash and the plagiarism revelation, Baker ended his partnership with Voiceverse on January 31, 2022. Subsequently, the incident was documented in multiple AI ethics databases, criticisms of predatory monetization in video games, and retrospectives as one of the earliest instances of plagiarism and theft stemming from artificial intelligence during the AI boom. == Background == === Troy Baker === Troy Baker is a prominent voice actor in the video game industry best known for his performances as Joel Miller in The Last of Us franchise. Baker has been described as "ubiquitous" by Polygon, "one of the most high-profile and prolific voice actors in video games" by Eurogamer, and "arguably the most famous voice actor in the gaming industry" by GameGuru. His other prominent roles include voicing Agent John "Jonesy" Jones in Fortnite, Booker DeWitt in BioShock Infinite, and both Batman and Joker in multiple Batman video games. As of October 2025, Baker holds the record for the most acting nominations at the BAFTA Games Awards, with five between 2013 and 2021. === Voiceverse === Voiceverse is a blockchain-based startup founded by the Bored Ape Yacht Club that marketed itself as offering AI voice cloning technology in the form of NFTs. Prior to the announcement of their partnership with Baker, Voiceverse had partnered with LOVO, Inc., an AI voice platform that, according to LOVO, could generate human-like voices. Voiceverse stated that any user who purchases a voice NFT would have unlimited and perpetual access to the voice model, which could be used to create content such as audiobooks, YouTube videos, podcasts, e-learning materials, in-game voice chat, and Zoom calls. Voiceverse promised that buyers would "OWN [sic] all of the IP" of content they created using these voices. Voiceverse's roadmap included plans to release 8,888 initial voice NFTs, a feature to add emotions to existing voices, and the ability for users to mint their own voices as NFTs. 
Prior to Baker's partnership, Voiceverse had also partnered with voice actors Charlet Chung, who voices D.Va in Overwatch, and Andy Milonakis of The Andy Milonakis Show. === 15.ai === 15.ai is a free web application launched in 2020 that uses artificial intelligence to generate text-to-speech voices of fictional characters from popular media. Created by a pseudonymous artificial intelligence researcher known as 15, who began developing the technology as a freshman during their undergraduate research at MIT, it was an early example of an application of generative artificial intelligence during the initial stages of the AI boom. The platform showed that deep neural networks could generate emotionally expressive speech with only 15 seconds of speech; the name "15.ai" references the creator's statement that a voice can be convincingly cloned with just 15 seconds of audio, as opposed to the tens of hours of data previously required. 15.ai became an Internet phenomenon in early 2021 when content utilizing it went viral on social media and quickly gained widespread use among various Internet fandoms. 15 has emphasized that it remain free and non-commercial; it only requires users to give proper credit when using the service for content creation. === NFTs in the video game industry === By early 2022, NFTs had become highly controversial within the gaming industry. Critics raised concerns about their environmental impact due to the significant energy consumption of blockchain technology. In addition, the prevalence of scams, fraud, and potential money laundering associated with NFT sales, as well as fears that NFTs were a new form of predatory monetization following the increasing frequency of loot boxes, caused vocal pushback from the gaming community. Several major gaming companies had begun exploring NFT integration into their products, though fan backlash had already forced some projects to be cancelled. On December 16, 2021, the developers of S.T.A.L.K.E.R. 
2: Heart of Chernobyl announced that they would be including NFTs in the game, but cancelled within an hour of the announcement due to immediate universal backlash. Simultaneously, the rise of AI voice technology raised concerns among voice actors about potential job displacement and the devaluation of their work amidst the voice acting industry's ongoing struggles for better compensation and working conditions. == Partnership announcement and backlash == On January 14, 2022, 1:02 a.m. EST, Baker announced on Twitter that he was partnering with Voiceverse "to explore ways where together we might bring new tools to new creators to make new things, and allow everyone a chance to own & invest in the IP's they create." The announcement concluded with the statement "You can hate. Or you can create." Baker's specific role with Voiceverse remained unclear at the time of the announcement. Along with Baker's announcement, Voiceverse promoted their supposed voice AI technology on Twitter by posting animated videos that featured a cat character created by NFT firm Chubbiverse. The videos concluded with text that read "The Voice Powered By Voiceverse"; Voiceverse stated on Twitter that the voices in the animations had been generated using their own AI voice synthesis technology and presented the videos as a technology demonstration of their voice NFT capabilities. The announcement provoked immediate and widespread backlash from the gaming community. Baker's tweet received thousands of replies and quote retweets (the vast majority of which were negative), far more than the number of likes; Michael McWhertor of Polygon described it as a "textbook example of being ratioed" and commented that reactions had been amplified by the final part of Baker's announcement. Michael Beckwith of Metro called Baker's approach "bizarrely aggressive". Later that day, Baker responded to the backlash by apologizing for his choice of words. 
He said he appreciated people's thoughts and acknowledged that the "hate/create part might have been a bit antagonistic," calling it a "bad attempt to bring levity". Despite the apology, Baker and his fellow voice actors did not distance themselves from Voiceverse at this point. At the same time, Voiceverse attempted to address the criticisms, stating that they were working to move to more environmentally friendly blockchain technology and that voice actors would receive royalties from NFT sales, with actors benefiting from any increase in NFT value. == Plagiarism revelation == On December 13, 2021, amidst the increasingly negative reactions toward NFTs among the general public, the creator of 15.ai (known pseudonymously as 15) announced that they had "no interest in incorporating NFTs into any aspect of [their] work." On January 14, 2022, 11:17 a.m. EST (10 hours after Baker's initial announcement), 15 commented on the Voiceverse venture, stating that it "sounds like a scam". Two hours later, at 1:20 p.m., 15 explicitly accused Voiceverse of "actively attempting to appropriate [15's] work for [Voiceverse's] own benefit." 15 provided evidence through

  • Schema crosswalk


    A schema crosswalk is a table that shows equivalent elements (or "fields") in more than one database schema. It maps the elements in one schema to the equivalent elements in another. Crosswalk tables are often employed within or in parallel to enterprise systems, especially when multiple systems are interfaced or when the system includes legacy system data. In the context of interfaces, they function as an internal extract, transform, load (ETL) mechanism. A common example is a metadata crosswalk from MARC standards to Dublin Core.

    Crosswalks show people where to put the data from one scheme into a different scheme. They are often used by libraries, archives, museums, and other cultural institutions to translate data to or from MARC standards, Dublin Core, Text Encoding Initiative (TEI), and other metadata schemes. For example, an archive has a MARC record in its catalog describing a manuscript. Suppose the archive makes a digital copy of that manuscript and wants to display it on the web along with the information from the catalog. In that case, it will have to translate the data from the MARC catalog record into a different format, such as Metadata Object Description Schema (MODS), that is viewable on a webpage. Because MARC has different fields than MODS, decisions must be made about where to put the data into MODS. This type of "translating" from one format to another is often called "metadata mapping" or "field mapping," and is related to "data mapping" and "semantic mapping".

    Crosswalks also have several technical capabilities. They help databases using different metadata schemes to share information. They help metadata harvesters create union catalogs. They enable search engines to search multiple databases simultaneously with a single query.

    == Challenges for crosswalks ==

    One of the biggest challenges for crosswalks is that no two metadata schemes are 100% equivalent. One scheme may have a field that doesn't exist in another scheme or a field that is split into two different fields in another scheme; this is why data is often lost when mapping from a complex scheme to a simpler one. For example, when mapping from MARC to Simple Dublin Core, the distinction between types of titles is lost: Simple Dublin Core only has one "Title" element, so all of the different types of MARC titles get lumped together without further distinctions. A future attempt to convert the metadata back into MARC would enter the information in the basic MARC 245 Title Statement field, with none of the original distinctions. This is why crosswalks are said to be "lateral" (one-way) mappings from one scheme to another. Separate crosswalks would be required to map from scheme A to scheme B and from scheme B to scheme A.

    === Difficulties in mapping ===

    Other mapping problems arise when:

      • One scheme has one element that needs to be split up, with different parts of it placed in multiple other elements in the second scheme ("one-to-many" mapping)
      • One scheme allows an element to be repeated more than once while another only allows that element to appear once with multiple terms in it
      • Schemes have different data formats (e.g. John Doe or Doe, John)
      • An element in one scheme is indexed, but the equivalent element in the other scheme is not
      • Schemes use different controlled vocabularies
      • Schemes change their standards over time

    Some of these problems are not fixable. As Karen Coyle says in "Crosswalking Citation Metadata: The University of California's Experience," "The more metadata experience we have, the more it becomes clear that metadata perfection is not attainable, and anyone who attempts it will be sorely disappointed. When metadata is crosswalked between two or more unrelated sources, there will be data elements that cannot be reconciled in an ideal manner. The key to a successful metadata crosswalk is intelligent flexibility. It is essential to focus on the important goals and be willing to compromise to reach a practical conclusion to projects."
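The loss of title distinctions described above can be made concrete with a small sketch. The crosswalk table, the function names and the placeholder record values below are all hypothetical, chosen only for illustration; the MARC tags are the standard 245 (Title Statement), 246 (Varying Form of Title) and 240 (Uniform Title), but the table is not an official crosswalk.

```python
# Illustrative, hypothetical crosswalk table: MARC tag -> Simple Dublin Core element.
MARC_TO_DC = {
    "245": "title",  # Title Statement
    "246": "title",  # Varying Form of Title
    "240": "title",  # Uniform Title
}


def marc_to_dc(record):
    """Map a list of (MARC tag, value) pairs to a dict of Dublin Core elements."""
    dc = {}
    for tag, value in record:
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # no equivalent element: the data is dropped
        dc.setdefault(element, []).append(value)
    return dc


def dc_to_marc(dc):
    """A separate reverse crosswalk: every title can only go back to 245."""
    return [("245", value) for value in dc.get("title", [])]


# Placeholder values, not a real catalog record.
record = [("245", "Main title"), ("246", "Alternate title"), ("240", "Uniform title")]
dc = marc_to_dc(record)
round_trip = dc_to_marc(dc)
print(dc)
print(round_trip)
```

After the round trip every value carries the tag 245, so the original distinction between title types cannot be recovered. A real converter would also have to respect MARC rules about which fields may repeat, which this sketch ignores.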

  • Automatic image annotation


    Automatic image annotation (also known as automatic image tagging or linguistic indexing) is the process by which a computer system automatically assigns metadata in the form of captioning or keywords to a digital image. This application of computer vision techniques is used in image retrieval systems to organize and locate images of interest from a database. This method can be regarded as a type of multi-class image classification with a very large number of classes, as large as the vocabulary size. Typically, image analysis in the form of extracted feature vectors and the training annotation words are used by machine learning techniques to attempt to automatically apply annotations to new images. The first methods learned the correlations between image features and training annotations. Subsequently, techniques were developed using machine translation to attempt to translate the textual vocabulary into the 'visual vocabulary,' represented by clustered regions known as blobs. Subsequent work has included classification approaches, relevance models, and other related methods. The advantage of automatic image annotation over content-based image retrieval (CBIR) is that the user can specify queries more naturally. At present, CBIR generally requires users to search by image concepts such as color and texture or by finding example queries. However, certain image features in example images may override the concept that the user is truly focusing on. Traditional methods of image retrieval, such as those used by libraries, have relied on manual annotation of images, which is expensive and time-consuming, especially given the large and constantly growing image databases in existence.
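As a rough illustration of applying training annotations to a new image from its feature vector, the sketch below uses nearest-neighbour label transfer. This is a deliberately simple stand-in, not one of the specific methods named above; the function name `annotate`, the two-dimensional feature vectors and the keywords are all hypothetical.

```python
import math
from collections import Counter


def annotate(query, training, k=3, n_words=2):
    """Return the n_words most common keywords among the k nearest training images."""
    # Rank training images by Euclidean distance between feature vectors.
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    counts = Counter(word for _, words in nearest for word in words)
    return [word for word, _ in counts.most_common(n_words)]


# Hypothetical training set: (feature vector, annotation words).
training = [
    ([0.90, 0.10], ["sky", "sea"]),
    ([0.80, 0.20], ["sky", "beach"]),
    ([0.85, 0.15], ["sky", "sea"]),
    ([0.10, 0.90], ["forest", "tree"]),
]
tags = annotate([0.88, 0.12], training)
print(tags)
```

With a vocabulary of realistic size the same idea becomes the large multi-class problem described above, which is why practical systems rely on learned models rather than a direct neighbour vote.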

  • Lexalytics


    Lexalytics, Inc. provides sentiment and intent analysis to an array of companies using SaaS and cloud-based technology. Salience 6, the engine behind Lexalytics, was built as an on-premises, multilingual text analysis engine. It is leased to other companies, which use it to power filtering and reputation management programs. In July 2014, Lexalytics acquired Semantria to serve as a cloud option for its technology. In September 2021, Lexalytics was acquired by the CX company InMoment.

    == History ==

    Lexalytics was spun out in January 2003 from a content management startup called Lightspeed. Lightspeed consolidated on America's West Coast. Jeff Catlin, a Lightspeed General Manager, and Mike Marshall, a Lightspeed Principal Engineer, convinced investors to give them the East Coast company so as to avoid shutdown costs. Catlin and Marshall renamed the operation Lexalytics. Catlin took on the role of Chief Executive Officer, with Marshall working as Chief Technology Officer. Lexalytics opted not to accept venture capital. Instead, the company initially shared sales and marketing expenses with the U.K.-based document management company Infonic. The partner companies formed a joint venture in July 2008, which was later dissolved. Since then, Lexalytics has worked with many other companies, such as Bottlenose, Salesforce, Thomson Reuters, Oracle and DataSift. In partnerships with social media monitoring companies such as DataSift, Lexalytics' Salience engine is typically built into the product itself. Lexalytics is used similarly to monitor sentiment as it relates to stock trading. In December 2014, Lexalytics announced the latest iteration of its sentiment analysis engine, Salience 6. Earlier that year, Lexalytics acquired Semantria in a bid to appeal to a wider variety of business models. Created by former Lexalytics Marketing Director Oleg Rogynskyy, Semantria is a SaaS text mining service, offered as an API and an Excel-based plugin, that measures sentiment. The goal of the acquisition, which cost Lexalytics less than US$10 million, was to expand the customer base both within the United States and abroad with multilingual support. Salience, the engine that powers Semantria, is grounded in deep learning. An example of this is its concept matrix, which gives Salience an understanding of concepts and the relationships between them, based on a detailed reading of the entire repository of Wikipedia. This matrix allows Salience to use Wikipedia for automatic categorization. Along with features like the concept matrix, Salience supports 16 international languages. The engine has earned Lexalytics a spot on EContent's "Top 100 Companies in the Digital Content Industry" list for 2014–2015. In September 2018, Lexalytics entered the document data extraction market using natural language processing (NLP).

  • QuickPar


    QuickPar is a computer program that creates parchives, which serve as verification and recovery information for a file or group of files. If recovery information is available, it uses that information to attempt to reconstruct the originals from the damaged files and the PAR volumes. Designed for the Microsoft Windows operating system, it was often used in the past to recover damaged or missing files downloaded through Usenet. QuickPar may also be used under Linux via Wine. There are two main versions of PAR files: PAR and PAR2. The PAR2 file format lifts many of the restrictions of the original PAR format. QuickPar is freeware but not open-source. It uses the Reed-Solomon error correction algorithm internally to create the error-correcting information.

    == Replacement ==

    Since QuickPar hasn't been updated in 21 years, it is considered abandonware. Currently, MultiPar is accepted as the software that replaces QuickPar. MultiPar is actively being developed by Yutaka Sawada.

    == 64-bit versions ==

    At present, the command-line version of QuickPar for Linux is available as a 64-bit version. None of the GUI versions currently available is 64-bit.
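The idea of storing extra recovery data alongside a set of files can be shown with a much simpler scheme than the Reed-Solomon coding QuickPar uses. The sketch below substitutes a single XOR parity block, which can rebuild exactly one missing block of equal length; it is not the PAR or PAR2 format, and the function names and sample blocks are hypothetical.

```python
def make_parity(blocks):
    """XOR equal-length data blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover(blocks, parity):
    """Rebuild the single missing block (marked None) from the others and the parity."""
    missing = blocks.index(None)
    rebuilt = bytearray(parity)
    for idx, block in enumerate(blocks):
        if idx != missing:
            for i, byte in enumerate(block):
                rebuilt[i] ^= byte
    return bytes(rebuilt)


# Hypothetical data: three four-byte blocks, the second of which is lost.
blocks = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(blocks)
damaged = [b"abcd", None, b"ijkl"]
restored = recover(damaged, parity)
print(restored)
```

Reed-Solomon coding generalizes this idea so that several recovery blocks can repair several damaged blocks, at the cost of more involved arithmetic.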

  • Snap rounding


    Snap rounding is a method of approximating line segment locations by creating a grid and placing each point in the centre of a cell (pixel) of the grid. The method preserves certain topological properties of the arrangement of line segments. Drawbacks include the potential interpolation of additional vertices in line segments (lines become polylines), the arbitrary closeness of a point to a non-incident edge, and arbitrary numbers of intersections between input line segments. The three-dimensional case is worse, with a polyhedral subdivision of complexity n becoming complexity O(n^4). There are more refined algorithms to cope with some of these issues; for example, iterated snap rounding guarantees a "large" separation between points and non-incident edges.

    == Algorithm ==

    See https://www.cgal.org/

    == Properties ==

      • Canonicity
      • Efficiency: a number of efficient implementations exist.

    Conversely, there are undesirable properties:

      • Non-idempotence: repeated applications can cause arbitrary drift of points. "Stable snap rounding" algorithms are an exception; see https://doi.org/10.1016/j.comgeo.2012.02.011
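The basic snapping step, moving a point to the centre of the grid cell that contains it, can be sketched as follows. This covers only segment endpoints; a full snap rounding algorithm also reroutes segments through the cells that contain vertices or intersections, which is omitted here. The function names and the `cell` parameter are hypothetical.

```python
import math


def snap_point(x, y, cell=1.0):
    """Move (x, y) to the centre of the grid cell (pixel) that contains it."""
    cx = (math.floor(x / cell) + 0.5) * cell
    cy = (math.floor(y / cell) + 0.5) * cell
    return (cx, cy)


def snap_segment(segment, cell=1.0):
    """Snap both endpoints of a segment given as ((x1, y1), (x2, y2))."""
    (x1, y1), (x2, y2) = segment
    return (snap_point(x1, y1, cell), snap_point(x2, y2, cell))


snapped = snap_segment(((0.2, 1.9), (2.7, 0.1)))
print(snapped)
```

Snapping a lone point twice gives the same result as snapping it once; the drift noted under Properties comes from the rerouting of whole segments, which this sketch does not model.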

  • Education by algorithm


    Education by algorithm refers to automated solutions that algorithmic agents or social bots offer to education, to assist with mundane educational tasks. These are often instrumentalist "educational reforms" or "curriculum transformations", which have been implemented by policy makers and are supported by proprietary education technologies. New educational policies, mandated by transnational governance forums (like the OECD), have manufactured a connection between economies and education. Governments, schools and universities are expected to prepare students for an "unknown future", to "future proof" them against an identified issue or to mitigate a national crisis. Technologies are seen as a catalyst to effect these changes. However, these policies mask deeper problems, which include the assetization of education and the use of technologies as a means for surveillance and behavior modification. The traces that students leave through cookies, logins, learning activities, assignments and tests are collected, facetted, and shared with commercial organizations by these agents, both to predict future behavior and to shape it. Techno-solutionist thinking has led managers to adopt educational policies and reforms, and to look towards technologies to act as disrupters, liberators or agents to improve efficiency. During the COVID-19 pandemic, many more students had to modify their learning and working circumstances to protect themselves. Academics shifted their assessment practices from the dominant "assessment of learning" paradigm to an orientation that saw value in "assessment for learning". Big tech assisted, teaching infrastructure became further privatized, and the unbundling of education provision went a step further. Following the return to class, this assessment paradigm became rationalised in education, leaving space for algorithmic agents to step in. Academics' work was increasingly driven by learning experience platforms, and student understanding was extended through interleaving, behavior modification nudges and rewards, and scheduled high-stakes assessments. This data collection may also be construed as surveillance, or perceived as evidence of a Fourth Industrial Revolution.

  • Teamwork (project management)


    Teamwork.com is an Irish, privately owned, web-based software company headquartered in Cork, Ireland. Teamwork creates task management and team collaboration software. Founded in 2007, as of 2016 the company stated that its software was in use by over 370,000 organisations worldwide (including Disney, Spotify and HP), and that it had over 2.4m users. == History == Peter Coppinger and Dan Mackey founded a company, Digital Crew, in 2007. This company built websites, intranets and custom web-based solutions for clients in Cork, Ireland. Frustrated by whiteboards and software management tools, Coppinger wanted a software system that would help manage client projects and which would be easy to use and generic enough to be used by different types of companies. Originally 37signals Basecamp users themselves, Coppinger and Mackey were frustrated by the limited feature set, and by Basecamp's apparent inaction on their feedback. In October 2007, Coppinger and Mackey launched Teamwork Project Manager, nicknamed TeamworkPM. In March 2015, this was renamed as Teamwork Projects. In 2014, after two years of negotiations, TeamworkPM bought the domain name 'Teamwork.com' for US$675,000 (€500,000). At the time this was one of the most expensive domain name purchases by an Irish company, and involved the transfer of a domain name which had been dormant since it was first acquired by the original owner in 1999. In 2015, Teamwork.com was named by Gartner to be one of their "Cool Vendors" in the Program and Portfolio Management Category. This was followed by the launch of a new real-time messaging product, Teamwork Chat, in January 2015. In June 2015, the company announced a drive to recruit for 40 positions by the end of the year. This was followed by the announcement that the company was investing more than €1 million in a new office, and had leased office space in Park House, Blackpool. In June 2016, Teamwork.com undertook a further recruitment drive to entice developers to Cork. 
In July 2021, the company announced that it had raised an investment of $70 million (€59.1 million) from venture capital firm Bregal Milestone to fund further growth. == Products == Teamwork markets a number of cloud-based applications, including Teamwork, Teamwork Desk, Teamwork Spaces, Teamwork CRM and Teamwork Chat. Teamwork was launched on 4 October 2007, at which time it had time management, milestone management, file sharing, time tracking, and messaging features. Teamwork's platform reportedly integrates with martech software like HubSpot, as well as other productivity tools like Slack, G Suite, MS Teams, Zapier, Dropbox and QuickBooks. == Awards == In 2016, Teamwork was awarded Cork's Best SME in the Cork Chamber of Commerce "Company of the Year" awards. In 2016, Teamwork was ranked number 7 in Deloitte's Fast 50 list of technology companies. In 2015, Teamwork was identified as a Gartner "Cool Vendor" in the Program and Portfolio Management Category.

  • Information


    Information is an abstract concept that refers to something which has the power to inform. At the most fundamental level, it pertains to the interpretation (perhaps formally) of that which may be sensed, or their abstractions. Any natural process that is not completely random and any observable pattern in any medium can be said to convey some amount of information. Whereas digital signals and other data use discrete signs to convey information, other phenomena and artifacts such as analogue signals, poems, pictures, music or other sounds, and currents convey information in a more continuous form. Information is not knowledge itself, but the meaning that may be derived from a representation through interpretation. The concept of information is relevant to and connected with various concepts, including constraint, communication, control, data, form, education, knowledge, meaning, understanding, mental stimuli, pattern, perception, proposition, representation, and entropy. Information is often processed iteratively: Data available at one step are processed into information to be interpreted and processed at the next step. For example, in written text each symbol or letter conveys information relevant to the word it is part of, each word conveys information relevant to the phrase it is part of, each phrase conveys information relevant to the sentence it is part of, and so on until at the final step information is interpreted and becomes knowledge in a given domain. In a digital signal, bits may be interpreted into the symbols, letters, numbers, or structures that convey the information available at the next level up. The key characteristic of information is that it is subject to interpretation and processing. The derivation of information from a signal or message may be thought of as the resolution of ambiguity or uncertainty that arises during the interpretation of patterns within the signal or message. Information may be structured as data. 
Redundant data can be compressed up to an optimal size, which is the theoretical limit of compression. The information available through a collection of data may be derived by analysis. For example, a restaurant collects data from every customer order. That information may be analyzed to produce knowledge that is put to use when the business subsequently wants to identify the most popular or least popular dish. Information can be transmitted in time, via data storage, and space, via communication and telecommunication. Information is expressed either as the content of a message or through direct or indirect observation. That which is perceived can be construed as a message in its own right, and in that sense, all information is always conveyed as the content of a message. Information can be encoded into various forms for transmission and interpretation (for example, information may be encoded into a sequence of signs, or transmitted via a signal). It can also be encrypted for safe storage and communication. The uncertainty of an event is measured by its probability of occurrence. Uncertainty is proportional to the negative logarithm of the probability of occurrence. Information theory takes advantage of this by concluding that more uncertain events require more information to resolve their uncertainty. The bit is the standard unit of information. It is 'that which reduces uncertainty by half'. Other units such as the nat may be used. For example, the information encoded in one "fair" coin flip is log2(2/1) = 1 bit, and in two fair coin flips is log2(4/1) = 2 bits. A 2011 Science article estimates that 97% of technologically stored information was already in digital bits in 2007 and that the year 2002 was the beginning of the digital age for information storage (with digital storage capacity bypassing analogue for the first time). 
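The coin-flip arithmetic above follows from the negative logarithm of the probability of an outcome. A minimal sketch, assuming equally likely outcomes and using a hypothetical helper name `self_information_bits`:

```python
import math


def self_information_bits(p):
    """Information, in bits, gained by observing an event of probability p."""
    return -math.log2(p)


# One fair coin flip has 2 equally likely outcomes; two flips have 4.
one_flip = self_information_bits(1 / 2)
two_flips = self_information_bits(1 / 4)
print(one_flip, two_flips)
```

Replacing `math.log2` with the natural logarithm `math.log` would give the same quantity measured in nats instead of bits.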
== Etymology and history of the concept == The English word "information" comes from Middle French enformacion/informacion/information 'a criminal investigation' and its etymon, Latin informatiō(n) 'conception, teaching, creation'. In English, "information" is an uncountable mass noun. References on "formation or molding of the mind or character, training, instruction, teaching" date from the 14th century in both English (according to Oxford English Dictionary) and other European languages. In the transition from Middle Ages to Modernity the use of the concept of information reflected a fundamental turn in epistemological basis – from "giving a (substantial) form to matter" to "communicating something to someone". Peters (1988, pp. 12–13) concludes: Information was readily deployed in empiricist psychology (though it played a less important role than other words such as impression or idea) because it seemed to describe the mechanics of sensation: objects in the world inform the senses. But sensation is entirely different from "form" – the one is sensual, the other intellectual; the one is subjective, the other objective. My sensation of things is fleeting, elusive, and idiosyncratic. For Hume, especially, sensory experience is a swirl of impressions cut off from any sure link to the real world... In any case, the empiricist problematic was how the mind is informed by sensations of the world. At first informed meant shaped by; later it came to mean received reports from. As its site of action drifted from cosmos to consciousness, the term's sense shifted from unities (Aristotle's forms) to units (of sensation). Information came less and less to refer to internal ordering or formation, since empiricism allowed for no preexisting intellectual forms outside of sensation itself. Instead, information came to refer to the fragmentary, fluctuating, haphazard stuff of sense. 
Information, like the early modern worldview in general, shifted from a divinely ordered cosmos to a system governed by the motion of corpuscles. Under the tutelage of empiricism, information gradually moved from structure to stuff, from form to substance, from intellectual order to sensory impulses. In the modern era, the most important influence on the concept of information is derived from the information theory developed by Claude Shannon and others. This theory, however, reflects a fundamental contradiction. Northrup (1993) wrote: Thus, actually two conflicting metaphors are being used: The well-known metaphor of information as a quantity, like water in the water-pipe, is at work, but so is a second metaphor, that of information as a choice, a choice made by an information provider, and a forced choice made by an information receiver. Actually, the second metaphor implies that the information sent isn't necessarily equal to the information received, because any choice implies a comparison with a list of possibilities, i.e., a list of possible meanings. Here, meaning is involved, thus spoiling the idea of information as a pure "Ding an sich." Thus, much of the confusion regarding the concept of information seems to be related to the basic confusion of metaphors in Shannon's theory: is information an autonomous quantity, or is information always per se information to an observer? Actually, I don't think that Shannon himself chose one of the two definitions. Logically speaking, his theory implied information as a subjective phenomenon. But this had so wide-ranging epistemological impacts that Shannon didn't seem to fully realize this logical fact. Consequently, he continued to use metaphors about information as if it were an objective substance. This is the basic, inherent contradiction in Shannon's information theory. (Northrup, 1993, p. 5)
In their seminal book The Study of Information: Interdisciplinary Messages, Machlup and Mansfield (1983) collected key views on the interdisciplinary controversy in computer science, artificial intelligence, library and information science, linguistics, psychology, and physics, as well as in the social sciences. Machlup (1983, p. 660) himself disagrees with the use of the concept of information in the context of signal transmission, the basic senses of information in his view all referring "to telling something or to the something that is being told. Information is addressed to human minds and is received by human minds." All other senses, including its use with regard to nonhuman organisms as well as to society as a whole, are, according to Machlup, metaphoric and, as in the case of cybernetics, anthropomorphic. Hjørland (2007) describes the fundamental difference between objective and subjective views of information and argues that the subjective view has been supported by, among others, Bateson, Yovits, Span-Hansen, Brier, Buckland, Goguen, and Hjørland. Hjørland provided the following example: A stone on a field could contain different information for different people (or from one situation to another). It is not possible for information systems to map all the stone's possible information for every individual. Nor is any one mapping the one "true" mapping. But peop

    Read more →
  • MarkLogic Server

    MarkLogic Server

    MarkLogic Server is a document-oriented database developed by MarkLogic. It is a NoSQL multi-model database that evolved from an XML database to natively store JSON documents and RDF triples, the data model for semantics. MarkLogic is designed to be a data hub for operational and analytical data. == History == MarkLogic Server was built to address shortcomings with existing search and data products. The product first focused on using XML as the document markup standard and XQuery as the query standard for accessing collections of documents up to hundreds of terabytes in size. Currently the MarkLogic platform is widely used in publishing, government, finance and other sectors. MarkLogic's customers are mostly Global 2000 companies. == Technology == MarkLogic uses documents without upfront schemas to maintain a flexible data model. In addition to having a flexible data model, MarkLogic uses a distributed, scale-out architecture that can handle hundreds of billions of documents and hundreds of terabytes of data. It has received Common Criteria certification, and has high availability and disaster recovery. MarkLogic is designed to run on-premises and within public or private cloud environments like Amazon Web Services. == Features == Indexing: MarkLogic indexes the content and structure of documents including words, phrases, relationships, and values in over 200 languages with tokenization, collation, and stemming for core languages. Range indexes, geospatial indexes, the RDF triple index, and reverse indexes can each be toggled on or off depending on the data, the kinds of queries to be run, and the desired performance. Full-text search: MarkLogic supports search across its data and metadata using a word or phrase and incorporates Boolean logic, stemming, wildcards, case sensitivity, punctuation sensitivity, diacritic sensitivity, and search term weighting. Data can be searched using JavaScript, XQuery, SPARQL, and SQL. 
Semantics: MarkLogic uses RDF triples to provide semantics for ease of storing metadata and querying. ACID: Unlike many other NoSQL databases, MarkLogic maintains ACID consistency for transactions. Replication: MarkLogic provides high availability with replica sets. Scalability: MarkLogic scales horizontally using sharding. MarkLogic can run over multiple servers, balancing the load or replicating data to keep the system up and running in the event of hardware failure. Security: MarkLogic has built-in security features, including redaction, encryption, and element-level security (allowing control of read and write rights on parts of a document). Optic API for Relational Operations: An API that lets developers view their data as documents, graphs or rows. == Applications == Banking; big data; fraud prevention; insurance claims management and underwriting; master data management; recommendation engines. == Licensing == MarkLogic is available under various licensing and delivery models, namely a free Developer or an Essential Enterprise license. Licenses are available from MarkLogic or directly from cloud marketplaces such as Amazon Web Services and Microsoft Azure. 
== Releases ==
2001 – Cerisent XQE 1: ACID transactions, Full-text search, XML Storage, XQuery, Role-based security
2004 – Cerisent XQE 2: Scale-out architecture, Enhanced search (stemming, thesaurus, wildcard), Backup and restore
2005 – MarkLogic Server 3: Continuing search improvements, Content Processing Framework (including PDF, Word, Excel, PPT), Failover
2008 – MarkLogic Server 4: Geospatial search, entity extraction, advanced XQuery, performance, scalability enhancements, auditing
2011 – MarkLogic Server 5: Flexible replication / DDIL, real-time indexing, advanced search, improved analytics, concurrency enhancements
2012 – MarkLogic Server 6: REST and Java APIs, App Builder, enhanced UI, improved search
2013 – MarkLogic Server 7: Semantic graph, bitemporal data, tiered storage, improved search, better management
2015 – MarkLogic Server 8: Native JSON storage, Server-side JavaScript, Bitemporal, Node.js client API, Incremental backup, Flexible replication
2017 – MarkLogic Server 9: Data integration across Relational and Non-Relational data, Advanced Encryption, Element Level Security, Redaction
2019 – MarkLogic Server 10: Enhanced Data Hub, improved SQL, security, analytics performance, cloud support
2022 – MarkLogic Server 11: MarkLogic Ops Director (Monitoring and Administration Improvements), expanded PKI
2025 – MarkLogic Server 12: Generative AI and Native Vector Search, Graph Algorithm Support, Virtual TDEs (relational views on the fly)

    Read more →
  • Single customer view

    Single customer view

    A single customer view is an aggregated, consistent and holistic representation of the data held by an organisation about its customers that can be viewed in one place, such as a single page. The advantage to an organisation of attaining this unified view comes from the ability it gives to analyse past behaviour in order to better target and personalise future customer interactions. A single customer view is also considered especially relevant where organisations engage with customers through multichannel marketing, since customers expect those interactions to reflect a consistent understanding of their history and preferences. However, some commentators have challenged the idea that a single view of customers across an entire organisation is either natural or meaningful, proposing that the priority should instead be consistency between the multiple views that arise in different contexts. Where representations of a customer are held in more than one data set, achieving a single customer view can be difficult: firstly because customer identity must be traceable between the records held in those systems, and secondly because anomalies or discrepancies in the customer data must be resolved through data cleansing to ensure data quality. As such, the acquisition by an organisation of a single customer view is one potential outcome of successful master data management. Since 31 December 2010, maintaining a single customer view, and submitting it within 72 hours, has been mandatory for financial institutions in the United Kingdom under rules introduced by the Financial Services Compensation Scheme.
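The two difficulties named above, tracing identity between systems and cleansing discrepancies, can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an email address identifies a customer across data sets; the function names `normalise_email` and `build_single_view` and the sample records are hypothetical.

```python
def normalise_email(email: str) -> str:
    """Cleanse the matching key so trivially different spellings compare equal."""
    return email.strip().lower()


def build_single_view(*sources):
    """Merge customer records from several data sets into one record per customer.

    Records are matched on the normalised email address. For every other
    field, the first non-empty value encountered is kept.
    """
    view = {}
    for source in sources:
        for record in source:
            key = normalise_email(record["email"])
            merged = view.setdefault(key, {"email": key})
            for field, value in record.items():
                if field == "email" or value in (None, ""):
                    continue  # skip the key itself and empty values
                merged.setdefault(field, value)
    return view


crm = [{"email": " Ada@Example.com ", "name": "Ada Lovelace", "phone": ""}]
shop = [{"email": "ada@example.com", "phone": "555-0100", "last_order": "2024-01-05"}]
single_view = build_single_view(crm, shop)
```

Real deployments usually need fuzzier matching (names, addresses, account numbers) and explicit rules for conflicting values; keeping the first non-empty value is only the simplest possible policy.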

    Read more →
  • Color science

    Color science

    Color science is the scientific study of color including lighting and optics; measurement of light and color; the physiology, psychophysics, and modeling of color vision; and color reproduction. It is the modern extension of traditional color theory. == Organizations == International Commission on Illumination (CIE); Illuminating Engineering Society (IES); Inter-Society Color Council (ISCC); Society for Imaging Science and Technology (IS&T); International Colour Association (AIC); Optica, formerly the Optical Society of America (OSA); The Colour Group; Society of Dyers and Colourists (SDC); American Association of Textile Chemists and Colorists (AATCC); Association for Research in Vision and Ophthalmology (ARVO); ACM SIGGRAPH; Vision Sciences Society (VSS); Council for Optical Radiation Measurements (CORM). == Journals == The preeminent scholarly journal publishing research papers in color science is Color Research and Application, started in 1975 by founding editor-in-chief Fred Billmeyer, along with Gunter Wyszecki, Michael Pointer and Rolf Kuehni, as a successor to the Journal of Colour (1964–1974). Previously most color science work had been split between journals with broader or partially overlapping focus such as the Journal of the Optical Society of America (JOSA), Photographic Science and Engineering (1957–1984), and the Journal of the Society of Dyers and Colourists (renamed Coloration Technology in 2001). Other journals where color science papers are published include the Journal of Imaging Science & Technology, the Journal of Perceptual Imaging, the Journal of the International Colour Association (JAIC), the Journal of the Color Science Association of Japan, Applied Optics, and the Journal of Vision. == Conferences == Congress of the International Color Association; IS&T Color and Imaging Conference (CIC); SIGGRAPH; International Symposium for Color Science and Art. == Selected books == Berns, Roy S. (2019). 
Billmeyer and Saltzman's Principles of Color Technology (4th ed.). Wiley. doi:10.1002/9781119367314. 3rd ed. (2000). Daw, Nigel (2012). How Vision Works: The Physiological Mechanisms Behind What We See. Oxford. doi:10.1093/acprof:oso/9780199751617.001.0001. Elliot, Andrew J.; Fairchild, Mark D.; Franklin, Anna, eds. (2015). Handbook of Color Psychology. Cambridge. doi:10.1017/CBO9781107337930. Fairchild, Mark D. (2013). Color Appearance Models (3rd ed.). Wiley. doi:10.1002/9781118653128. 2nd ed. (2005). Hunt, Robert W. G. (2004). The Reproduction of Colour (6th ed.). Wiley. doi:10.1002/0470024275. Kuehni, Rolf G. (2012). Color: An Introduction to Practice and Principles (3rd ed.). Wiley. doi:10.1002/9781118533567. 1st ed. (1997). Luo, Ming R., ed. (2016). Encyclopedia of Color Science and Technology. Springer. doi:10.1007/978-1-4419-8071-7. MacAdam, David L., ed. (1970). Sources of Color Science. MIT Press. Reinhard, Erik; Khan, Erum Arif; Akyuz, Ahmet Oguz; Johnson, Garrett (2008). Color Imaging: Fundamentals and Applications. CRC Press. doi:10.1201/b10637. Schanda, János, ed. (2007). Colorimetry: Understanding the CIE System. Wiley. doi:10.1002/9780470175637. Shamey, Renzo; Kuehni, Rolf G. (2020). Pioneers of Color Science. Springer. doi:10.1007/978-3-319-30811-1. Wyszecki, Günter; Stiles, Walter S. (1982). Color Science: Concepts and Methods, Quantitative Data and Formulae (2nd ed.). Wiley.

    Read more →
  • Pseudonymization

    Pseudonymization

    Pseudonymization is a data management and de-identification procedure by which personally identifiable information fields within a data record are replaced by one or more artificial identifiers, or pseudonyms. A single pseudonym for each replaced field or collection of replaced fields makes the data record less identifiable while remaining suitable for data analysis and data processing. Pseudonymization (or pseudonymisation, the spelling under European guidelines) is one way to comply with the European Union's General Data Protection Regulation (GDPR) demands for secure data storage of personal information. Pseudonymized data can be restored to its original state with the addition of information which allows individuals to be re-identified. In contrast, anonymization is intended to prevent re-identification of individuals within the dataset. Clause 18, Module Four, footnote 2 of the Adoption by the European Commission of the Implementing Decisions (EU) 2021/914 "requires rendering the data anonymous in such a way that the individual is no longer identifiable by anyone ... and that this process is irreversible." == Impact of Schrems II ruling == The European Data Protection Supervisor (EDPS) on 9 December 2021 highlighted pseudonymization as the top technical supplementary measure for Schrems II compliance. Less than two weeks later, the EU Commission highlighted pseudonymization as an essential element of the equivalency decision for South Korea, which is the status that was lost by the United States under the Schrems II ruling by the Court of Justice of the European Union (CJEU). 
The importance of GDPR-compliant pseudonymization increased dramatically in June 2021 when the European Data Protection Board (EDPB) and the European Commission highlighted GDPR-compliant pseudonymization as the state-of-the-art technical supplementary measure for the ongoing lawful use of EU personal data when using third-country (i.e., non-EU) cloud processors or remote service providers under the "Schrems II" ruling by the CJEU. Under the GDPR and final EDPB Schrems II Guidance, the term pseudonymization requires a new protected "state" of data, producing a protected outcome that: (1) protects direct, indirect, and quasi-identifiers, together with characteristics and behaviors; (2) protects at the record and data set level versus only the field level so that the protection travels wherever the data goes, including when it is in use; and (3) protects against unauthorized re-identification via the mosaic effect by generating high entropy (uncertainty) levels by dynamically assigning different tokens at different times for various purposes. The combination of these protections is necessary to prevent the re-identification of data subjects without the use of additional information kept separately, as required under GDPR Article 4(5) and as further underscored by paragraph 85(4) of the final EDPB Schrems II guidance. Article 4(5) "Definitions" of the GDPR defines pseudonymization as "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person." 
"Use Case 2: Transfer of pseudonymised Data Paragraph 85(4)" of the final EDPB Schrems II Guidance requires that "the controller has established by means of a thorough analysis of the data in question – taking into account any information that the public authorities of the recipient country may be expected to possess and use – that the pseudonymised personal data cannot be attributed to an identified or identifiable natural person even if cross-referenced with such information." GDPR-compliant pseudonymization requires that data is "anonymous" in the strictest EU sense of the word – globally anonymous – but for the additional information held separately and made available under controlled conditions as authorized by the data controller for permitted re-identification of individual data subjects. Clause 18, Module Four, footnote 2 of the Adoption by the European Commission of the Implementing Decision (EU) 2021/914 "requires rendering the data anonymous in such a way that the individual is no longer identifiable by anyone, in line with recital 26 of Regulation (EU) 2016/679, and that this process is irreversible." Before the Schrems II ruling, pseudonymization was a technique used by security experts or government officials to hide personally identifiable information to maintain data structure and privacy of information. Common examples of sensitive information include postal code, location, name, race, and gender. After the Schrems II ruling, GDPR-compliant pseudonymization must satisfy the above-noted elements as an "outcome" rather than merely a technique. == Data fields == The choice of which data fields are to be pseudonymized is partly subjective. Less selective fields, such as birth date or postal code, are often also included because they are usually available from other sources and therefore make a record easier to identify. 
Pseudonymizing these less identifying fields removes most of their analytic value and is therefore normally accompanied by the introduction of new derived and less identifying forms, such as year of birth or a larger postal code region. Data fields that are less identifying, such as date of attendance, are usually not pseudonymized. This is because too much statistical utility is lost in doing so, not because the data cannot be identified. For example, given prior knowledge of a few attendance dates it is easy to identify someone's data in a pseudonymized dataset by selecting only those people with that pattern of dates. This is an example of an inference attack. The vulnerability of pre-GDPR pseudonymized data to inference attacks is commonly overlooked. A famous example is the AOL search data scandal. The AOL example of unauthorized re-identification did not require access to separately kept "additional information" that was under the control of the data controller, as is now required for GDPR-compliant pseudonymization, outlined below under the section "New definition under GDPR". Protecting statistically useful pseudonymized data from re-identification requires a sound information security base and control of the risk that analysts, researchers or other data workers cause a privacy breach. The pseudonym allows tracking back of data to its origins, which distinguishes pseudonymization from anonymization, where all person-related data that could allow backtracking has been purged. Pseudonymization is an issue in, for example, patient-related data that has to be passed on securely between clinical centers. The application of pseudonymization to e-health intends to preserve the patient's privacy and data confidentiality. It allows primary use of medical records by authorized health care providers and privacy-preserving secondary use by researchers. 
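The field-level treatment described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a GDPR-compliant implementation: the record layout, the function names `make_pseudonym` and `pseudonymize`, the example key, and the choice of a keyed hash (HMAC-SHA256) to generate pseudonyms are all assumptions made here.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be generated securely and stored
# separately from the pseudonymized data, under the controller's control.
SECRET_KEY = b"example-key-kept-separately"


def make_pseudonym(value: str, key: bytes = SECRET_KEY) -> str:
    """Derive a stable pseudonym for an identifier with a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened only for readability in this sketch


def pseudonymize(record: dict, lookup: dict) -> dict:
    """Replace the name with a pseudonym and generalize less selective fields.

    `lookup` plays the role of the separately kept additional information:
    it maps each pseudonym back to the original identifier.
    """
    token = make_pseudonym(record["name"])
    lookup[token] = record["name"]
    return {
        "pseudonym": token,
        "birth_year": record["birth_date"][:4],      # year of birth only
        "postal_region": record["postal_code"][:2],  # larger postal region
        "attendance_date": record["attendance_date"],  # kept for utility
    }


lookup_table = {}
patient = {
    "name": "Jane Doe",
    "birth_date": "1984-03-17",
    "postal_code": "90210",
    "attendance_date": "2023-05-02",
}
safe_record = pseudonymize(patient, lookup_table)
```

Because the attendance date is retained and the pseudonym is static, a record produced this way remains exposed to the inference attack described above; the sketch shows only the replacement and generalization steps.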
In the US, HIPAA provides guidelines on how health care data must be handled, and data de-identification or pseudonymization is one way to simplify HIPAA compliance. However, plain pseudonymization for privacy preservation often reaches its limits when genetic data are involved (see also genetic privacy). Due to the identifying nature of genetic data, depersonalization is often not sufficient to hide the corresponding person. Potential solutions are the combination of pseudonymization with fragmentation and encryption. An example of the application of a pseudonymization procedure is the creation of datasets for de-identification research by replacing identifying words with words from the same category (e.g. replacing a name with a random name from a names dictionary). In this case, however, it is in general not possible to track data back to its origins. == New definition under GDPR == Effective as of May 25, 2018, the EU General Data Protection Regulation (GDPR) defines pseudonymization for the very first time at the EU level in Article 4(5). Under Article 4(5) definitional requirements, data is pseudonymized if it cannot be attributed to a specific data subject without the use of separately kept "additional information". Pseudonymized data embodies the state of the art in Data Protection by Design and by Default because it requires protection of both direct and indirect identifiers (not just direct). GDPR Data Protection by Design and by Default principles as embodied in pseudonymization require protection of both direct and indirect identifiers so that personal data is not cross-referenceable (or re-identifiable) via the "mosaic effect" without access to "additional information" that is kept separately by the controller. Because access to separately kept "additional information" is required

    Read more →
  • Zassenhaus algorithm

    Zassenhaus algorithm

    In mathematics, the Zassenhaus algorithm is a method to calculate a basis for the intersection and sum of two subspaces of a vector space. It is named after Hans Zassenhaus, but no publication of this algorithm by him is known. It is used in computer algebra systems. == Algorithm == === Input === Let $V$ be a vector space and $U$, $W$ two finite-dimensional subspaces of $V$ with the following spanning sets: $U=\langle u_{1},\ldots ,u_{n}\rangle$ and $W=\langle w_{1},\ldots ,w_{k}\rangle$. Finally, let $B_{1},\ldots ,B_{m}$ be linearly independent vectors so that $u_{i}$ and $w_{i}$ can be written as $u_{i}=\sum _{j=1}^{m}a_{i,j}B_{j}$ and $w_{i}=\sum _{j=1}^{m}b_{i,j}B_{j}$. === Output === The algorithm computes a basis of the sum $U+W$ and a basis of the intersection $U\cap W$. === Algorithm === The algorithm creates the following block matrix of size $(n+k)\times (2m)$:

$${\begin{pmatrix}a_{1,1}&a_{1,2}&\cdots &a_{1,m}&a_{1,1}&a_{1,2}&\cdots &a_{1,m}\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\a_{n,1}&a_{n,2}&\cdots &a_{n,m}&a_{n,1}&a_{n,2}&\cdots &a_{n,m}\\b_{1,1}&b_{1,2}&\cdots &b_{1,m}&0&0&\cdots &0\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\b_{k,1}&b_{k,2}&\cdots &b_{k,m}&0&0&\cdots &0\end{pmatrix}}$$

Using elementary row operations, this matrix is transformed to row echelon form. Then it has the following shape:

$${\begin{pmatrix}c_{1,1}&c_{1,2}&\cdots &c_{1,m}&\bullet &\bullet &\cdots &\bullet \\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\c_{q,1}&c_{q,2}&\cdots &c_{q,m}&\bullet &\bullet &\cdots &\bullet \\0&0&\cdots &0&d_{1,1}&d_{1,2}&\cdots &d_{1,m}\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\0&0&\cdots &0&d_{\ell ,1}&d_{\ell ,2}&\cdots &d_{\ell ,m}\\0&0&\cdots &0&0&0&\cdots &0\\\vdots &\vdots &&\vdots &\vdots &\vdots &&\vdots \\0&0&\cdots &0&0&0&\cdots &0\end{pmatrix}}$$

Here, $\bullet$ stands for arbitrary numbers, and the vectors $(c_{p,1},c_{p,2},\ldots ,c_{p,m})$ for every $p\in \{1,\ldots ,q\}$ and $(d_{p,1},\ldots ,d_{p,m})$ for every $p\in \{1,\ldots ,\ell \}$ are nonzero. Then $(y_{1},\ldots ,y_{q})$ with $y_{i}:=\sum _{j=1}^{m}c_{i,j}B_{j}$ is a basis of $U+W$ and $(z_{1},\ldots ,z_{\ell })$ with $z_{i}:=\sum _{j=1}^{m}d_{i,j}B_{j}$ is a basis of $U\cap W$. === Proof of correctness === First, we define $\pi _{1}:V\times V\to V,\ (a,b)\mapsto a$ to be the projection to the first component. Let $H:=\{(u,u)\mid u\in U\}+\{(w,0)\mid w\in W\}\subseteq V\times V$. Then $\pi _{1}(H)=U+W$ and $H\cap (0\times V)=0\times (U\cap W)$. Also, $H\cap (0\times V)$ is the kernel of ${\pi _{1}|}_{H}$, the projection restricted to $H$. Therefore, $\dim(H)=\dim(U+W)+\dim(U\cap W)$. The Zassenhaus algorithm calculates a basis of $H$. In the first $m$ columns of this matrix, there is a basis $y_{i}$ of $U+W$. The rows of the form $(0,z_{i})$ (with $z_{i}\neq 0$) are obviously in $H\cap (0\times V)$. Because the matrix is in row echelon form, they are also linearly independent. All rows which are different from zero ($(y_{i},\bullet )$ and $(0,z_{i})$) are a basis of $H$, so there are $\dim(U\cap W)$ such $z_{i}$. Therefore, the $z_{i}$ form a basis of $U\cap W$. == Example == Consider the two subspaces

$$U=\left\langle \left({\begin{array}{r}1\\-1\\0\\1\end{array}}\right),\left({\begin{array}{r}0\\0\\1\\-1\end{array}}\right)\right\rangle \quad {\text{and}}\quad W=\left\langle \left({\begin{array}{r}5\\0\\-3\\3\end{array}}\right),\left({\begin{array}{r}0\\5\\-3\\-2\end{array}}\right)\right\rangle$$

of the vector space $\mathbb {R} ^{4}$. Using the standard basis, we create the following matrix of dimension $(2+2)\times (2\cdot 4)$:

$$\left({\begin{array}{rrrrrrrrr}1&-1&0&1&&1&-1&0&1\\0&0&1&-1&&0&0&1&-1\\\\5&0&-3&3&&0&0&0&0\\0&5&-3&-2&&0&0&0&0\end{array}}\right).$$

Using elementary row operations, we transform this matrix into the following matrix:

$$\left({\begin{array}{rrrrrrrrr}1&0&0&0&&\bullet &\bullet &\bullet &\bullet \\0&1&0&-1&&\bullet &\bullet &\bullet &\bullet \\0&0&1&-1&&\bullet &\bullet &\bullet &\bullet \\\\0&0&0&0&&1&-1&0&1\end{array}}\right)$$

(Some entries have been replaced by "$\bullet$" because they are irrelevant to the result.) Therefore

$$\left(\left({\begin{array}{r}1\\0\\0\\0\end{array}}\right),\left({\begin{array}{r}0\\1\\0\\-1\end{array}}\right),\left({\begin{array}{r}0\\0\\1\\-1\end{array}}\right)\right)$$

is a basis of $U+W$, and

$$\left(\left({\begin{array}{r}1\\-1\\0\\1\end{array}}\right)\right)$$

is a basis of $U\cap W$.
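The procedure can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: it assumes the vectors are given as coordinate lists with respect to the chosen basis, uses exact rational arithmetic from the standard `fractions` module, and reduces all the way to reduced row echelon form (a special case of row echelon form). The function names `rref` and `zassenhaus` are hypothetical.

```python
from fractions import Fraction


def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        if pivot_row == len(m):
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [x / pivot for x in m[pivot_row]]
        # Clear the pivot column in every other row.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m


def zassenhaus(U, W):
    """Return (basis of U + W, basis of U ∩ W) for two spanning sets."""
    m = len(U[0])
    # Block matrix: rows (u, u) for U and rows (w, 0) for W.
    block = [list(u) + list(u) for u in U] + [list(w) + [0] * m for w in W]
    reduced = rref(block)
    sum_basis = [row[:m] for row in reduced if any(row[:m])]
    intersection_basis = [
        row[m:] for row in reduced if not any(row[:m]) and any(row[m:])
    ]
    return sum_basis, intersection_basis


U = [[1, -1, 0, 1], [0, 0, 1, -1]]
W = [[5, 0, -3, 3], [0, 5, -3, -2]]
sum_basis, intersection_basis = zassenhaus(U, W)
```

Applied to the example above, the sketch returns the vectors (1, 0, 0, 0), (0, 1, 0, −1), (0, 0, 1, −1) for the sum and (1, −1, 0, 1) for the intersection, as `Fraction` values.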

    Read more →