AI Art Modifier

AI Art Modifier — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Private cloud computing infrastructure


    Private cloud computing infrastructure is a category of cloud computing that provides benefits comparable to those of public cloud systems, such as self-service and scalability, but does so via a proprietary framework. In contrast to public clouds, which cater to multiple entities, a private cloud is designed specifically for the requirements and objectives of one organization.

    == Definition ==

    A private cloud computing infrastructure is a model of cloud computing that provides a secure, separate cloud environment in which only the intended client can operate. It can either be physically housed in the organization's in-house data center or be managed by a third-party provider. In a private cloud, the infrastructure and services are always maintained on a private network, and both the hardware and software are dedicated exclusively to a single organization.

    == History ==

    The concept of private cloud infrastructure started to take shape around the mid-2000s, coinciding with the rise of other forms of cloud computing. It emerged as a response to the shortcomings of public clouds, particularly concerns over data control, security, and network performance. IT departments began to mirror the automation and self-service features of the public cloud in their own data centers. Over time, these services became more advanced, and private cloud technology has been refined to address the diverse needs of businesses and organizations.

    == Architecture ==

    Private cloud computing infrastructure generally involves a mix of hardware, network infrastructure, and virtualization software. The hardware, often referred to as a cloud server or cloud array, consists of a server rack or a collection of server racks containing the storage and processors that constitute the cloud. The virtualization software, such as Hyper-V, OpenStack, or VMware, creates and manages the virtual machines with which users interact.
    The network infrastructure connects the private cloud to users and may provide connectivity with other on-premises data centers or clouds.

    == Applications ==

    Private cloud infrastructures are usually used by medium to large businesses and organizations that need robust control over their data, have extensive computing needs, or have specific regulatory or compliance obligations. These include healthcare organizations, government agencies, financial institutions, and any business that needs to process and store large volumes of data.

  • Five safes


    The Five Safes is a framework for helping make decisions about making effective use of data which is confidential or sensitive. It is mainly used to describe or design research access to statistical data held by government and health agencies, and by data archives such as the UK Data Service. It is not an internationally accepted standard. Two of the Five Safes refer to statistical disclosure control, and so the Five Safes is usually used to contrast statistical and non-statistical controls when comparing data management options.

    == Concept ==

    The Five Safes proposes that data management decisions be considered as solving problems in five 'dimensions': projects, people, settings, data and outputs. The combination of the controls leads to 'safe use'. These are most commonly expressed as questions.

    These dimensions are scales, not limits. That is, solutions can have a mix of more or fewer controls in each dimension, but the overall aim of 'safe use' is independent of the particular mix. For example, a public use file available for open download cannot control who uses it, where or for what purpose, and so all the control (protection) must be in the data itself. In contrast, a file which is only accessed through a secure environment with certified users can contain very sensitive information: the non-statistical controls allow the data to be 'unsafe'. One academic likened the process to a graphic equalizer, where bass and treble can be combined independently to produce a sound the listener likes, which has proven to be a very useful metaphor. A 2023 Data Foundation webinar offers an expert discussion of how the elements interact, including an introductory representation.

    There is no 'order' to the Five Safes, in the sense that no one dimension is necessarily more important than the others. However, Ritchie argued that the 'managerial' controls (projects, people, setting) should be addressed before the 'statistical' controls (data, output).
    The Five Safes concept is associated with other topics which developed from the same programme at ONS, although these are not necessarily implemented. Safe people is associated with 'active researcher management', while safe outputs is linked with principles-based output statistical disclosure control. The Five Safes is a positive framework, describing what is and is not. The EDRU ('evidence-based, default-open, risk-managed, user-centred') attitudinal model is sometimes used to give a normative context.

    == The 'data access spectrum' ==

    From 2003 the Five Safes was also represented in a simpler form as a 'Data Access Spectrum'. The non-data controls (project, people, setting, outputs) tend to work together, in that organisations often see these as a complementary set of restrictions on access. These can then be contrasted with choices about data anonymisation to present a linear representation of data access options. This presentation is consistent with the idea of 'data as a residual', as well as data protection laws of the time which often characterised data simply as anonymous or not anonymous. A similar idea, the 'continuum of access', had already been developed independently in 2001 by Chuck Humphrey of the Canadian RDC network. More recently, the Open Data Institute has developed a 'Data Spectrum toolkit' which includes industry-specific examples.

    == History and terminology ==

    The Five Safes was devised in the winter of 2002/2003 by Felix Ritchie at the UK Office for National Statistics (ONS) to describe its secure remote-access Virtual Microdata Laboratory (VML). It was described at this time as the 'VML Security Model'. This was adopted by the NORC data enclave, and more widely in the US, as the 'portfolio model' (although this is now also used to refer to a slightly different legal/statistical/educational breakdown).
    In 2012 the framework was still being referred to as the 'VML security model', but its increasing use among non-UK organisations led to the adoption of the more general and informative phrase 'Five Safes'. The original framework only had four safes (projects, people, settings and outputs): the framework was used to describe highly detailed data access through a secure environment, and so the 'data' dimension was irrelevant. From 2007 onwards, 'safe data' was included as the framework was used to describe a wider range of ONS activities. As the US version was based upon the 2005 specification, some US iterations have the original four dimensions. Some discussions, such as those of the OECD, use the term 'secure' instead of 'safe'. However, the use of both these terms can cause presentational problems: less control in a particular dimension could be seen to imply 'unsafe users' or 'insecure settings', for example, which distracts from the main message. Hence, the Australian government uses the term "five data sharing principles".

    The 'Anonymisation Decision-Making Framework' uses a framework based on the Five Safes but relabels "projects", "people", and "settings" as "governance", "agency" and "infrastructure", respectively; "output" is omitted, and "safe use" becomes "functional anonymisation". There is no reference to the Five Safes or any associated literature. The Australian version was required to include references to the Five Safes, and presented it as an alternative without comment.

    == Application ==

    The framework has had three uses: pedagogical, descriptive, and design. Since 2016, it has also been used, directly and indirectly, in legislation.

    === Pedagogy ===

    The first significant use of the framework, other than internal administrative use, was to structure researcher training courses at the UK Office for National Statistics from 2003.
    The UK Data Archive, Administrative Data Research Network, Eurostat, Statistics New Zealand, the Mexican National Institute of Statistics and Geography, NORC, Statistics Canada and the Australian Bureau of Statistics, amongst others, have also used this framework. Most of these courses are for researchers using restricted-access facilities; the Eurostat courses are unusual in that they are designed for all users of sensitive data.

    === Description ===

    The framework is often used to describe existing data access solutions (e.g. UK HMRC Data Lab, UK Data Service, Statistics New Zealand) or planned/conceptualised ones (e.g. Eurostat in 2011). An early use was to help identify areas where ONS still had 'irreducible risks' in its provision of secure remote access. The framework is mostly used for confidential social science data. To date it appears to have made little impact on medical research planning, although it is now included in the revised guidelines on implementing HIPAA regulations in the US, and is used by Cancer Research UK and the Health Foundation in the UK. It has also been used to describe a security model for the Scottish Health Informatics Programme.

    === Design ===

    In general the Five Safes has been used to describe solutions post-factum, and to explain or justify choices made, but an increasing number of organisations have used the framework to design data access solutions. For example, the Hellenic Statistical Agency developed a data strategy built around the Five Safes in 2016; the UK Health Foundation used the Five Safes to design its data management and training programmes. Use in the private sector is less common but some organisations have incorporated the Five Safes into consulting services. In 2015 the UK Data Service organized a workshop to encourage data users from the academic and private sectors to think about how to manage confidential research data, using the Five Safes to demonstrate alternative options and best practice.
    Early adopters for strategic design use were in Australia: both the Australian Bureau of Statistics and the Australian Department of Social Services used the Five Safes as an ex ante design tool. In 2017 the Australian Productivity Commission recommended adopting a version of the framework to support cross-government data sharing and re-use. This underwent extensive consultation and culminated in the DAT Act 2022. Since 2020 the Five Safes has been the overriding framework for the design of new secure facilities and data sharing arrangements in the UK for public health and social sciences. This has been promoted by the Office for Statistics Regulation, the UK Statistics Authority, NHS Digital, and the research funding bodies Administrative Data Research UK and DARE UK.

    === Regulation and legislation ===

    Three laws have incorporated the Five Safes. They are explicit in the South Australian Public Sector (Data Sharing) Act 2016, and implicit in the research provisions of the UK Digital Economy Act 2017. The Australian Data Availability and Transparency Act 2022 renames the Five Safes as the Five Data Sharing Principles. A 2025 statutory review of the DAT Act 2022 found "that the DAT Act has not been effective in achieving its objectives." The review includes specific referen

  • Manufacture Modules Technologies


    Manufacture Modules Technologies Sarl (MMT) is a Swiss company established in Geneva in 2015 which originally specialised in the development and commercialization of "Horological Smartwatch modules", firmware, apps and cloud services. Located at Geneva's Skylab high-tech hub, it expanded into the development and manufacturing of "E-Straps" operated with a mobile application. Philippe Fraboulet is the CEO.

    == History ==

    In June 2015, Fullpower Technologies and Union Horlogère Suisse (Swiss Watchmakers Corporation) formed MMT as a joint venture, which then launched the MotionX Horological Smartwatch Open Platform for the Swiss watch industry. The initial licensees were Frederique Constant, Alpina and Mondaine, brands owned by Union Horlogère Suisse. Fullpower created and managed the circuit design, firmware, smartphone applications (including sleep activity), as well as the cloud infrastructure. MMT managed the Swiss watch movement development and production as well as licensing and support. In July 2016, Union Horlogere Holding and MMT were spun out of the Frédérique Constant Group. Fullpower Technologies' 19.99% share was acquired by Union Horlogere Holding BV, giving it 100% of MMT's shares.

    == Business ==

    The company offers firmware, a cloud, manufacturing, service and over-the-air facilities for upgrades. The company also offers its own apps, which bear the label “Swiss Made software”.

  • Driver scheduling problem


    The driver scheduling problem (DSP) is a type of problem in operations research and theoretical computer science. The DSP consists of selecting a set of duties (assignments) for the drivers or pilots of vehicles (e.g., buses, trains, boats, or planes) involved in the transportation of passengers or goods, within the constraints of various legislative and logistical criteria.

    == Criteria and modelling ==

    This very complex problem involves several constraints related to labour and company rules, as well as different evaluation criteria and objectives. Being able to solve this problem efficiently can have a great impact on costs and quality of service for public transportation companies. There is a large number of different rules that a feasible duty might be required to satisfy, such as:

    • Minimum and maximum stretch duration
    • Minimum and maximum break duration
    • Minimum and maximum work duration
    • Minimum and maximum total duration
    • Maximum extra work duration
    • Maximum number of vehicle changes
    • Minimum driving duration of a particular vehicle

    Operations research has provided optimization models and algorithms that lead to efficient solutions for this problem. Among the most common models proposed to solve the DSP are the Set Covering and Set Partitioning Models (SCP/SPP). In the SPP model, each work piece (task) is covered by exactly one duty. In the SCP model, it is possible to have more than one duty covering a given work piece. In both models, the set of work pieces that needs to be covered is laid out in rows, and the set of previously defined feasible duties available for covering specific work pieces is arranged in columns. The DSP resolution, based on either of these models, is the selection of the set of feasible duties that guarantees that there is exactly one (SPP) or at least one (SCP) duty covering each work piece while minimizing the total cost of the final schedule.

  • Digital supply chain security


    Digital supply chain security refers to efforts to enhance cyber security within the supply chain. It is a subset of supply chain security and is focused on the management of cyber security requirements for information technology systems, software and networks, which are driven by threats such as cyber-terrorism, malware, data theft and the advanced persistent threat (APT). Typical supply chain cyber security activities for minimizing risks include buying only from trusted vendors, disconnecting critical machines from outside networks, and educating users on the threats and protective measures they can take. The acting deputy undersecretary for the National Protection and Programs Directorate for the United States Department of Homeland Security, Greg Schaffer, stated at a hearing that he is aware that there are instances where malware has been found on imported electronic and computer devices sold within the United States.

    == Examples of supply chain cyber security threats ==

    • Network or computer hardware that is delivered with malware already installed on it
    • Malware that is inserted into software or hardware (by various means)
    • Vulnerabilities in software applications and networks within the supply chain that are discovered by malicious hackers
    • Counterfeit computer hardware

    == Related U.S. government efforts ==

    • Comprehensive National Cyber Initiative
    • Defense Procurement Regulations: Noted in section 806 of the National Defense Authorization Act
    • International Strategy for Cyberspace: The White House lays out for the first time the U.S.’s vision for a secure and open Internet. The strategy outlines three main themes: diplomacy, development and defense.
      • Diplomacy: The strategy sets out to “promote an open, interoperable, secure and reliable information and communication infrastructure” by establishing norms of acceptable state behavior built through consensus among nations.
      • Development: Through this strategy the government seeks to “facilitate cybersecurity capacity-building abroad, bilaterally and through multilateral organizations.” The objective is to protect the global IT infrastructure and to build closer international partnerships to sustain open and secure networks.
      • Defense: The strategy states that the government “will ensure that the risks associated with attacking or exploiting our networks vastly outweigh the potential benefits” and calls for all nations to investigate, apprehend and prosecute criminals and non-state actors who intrude on and disrupt network systems.

    == Related government efforts around the world ==

    • Common Criteria: At Evaluation Assurance Level (EAL) 4, Common Criteria offers an opportunity to evaluate all relevant aspects of digital supply chain security, such as the product, the development environment, IT systems security, human resources processes and physical security, and, with the module ALC_FLR.3 (Systematic Flaw Remediation), security update processes and methods as well, including through physical site visits. EAL 4 is mutually recognized in countries that signed the SOGIS-MRA, and up to EAL 2 (but including ALC_FLR.3) in countries that signed the CCRA.
    • Russia: Russia has had non-disclosed functionality certification requirements for several years and has recently initiated the National Software Platform effort based on open-source software. This reflects the apparent desire for national autonomy, reducing dependence on foreign suppliers.
    • India: Recognition of supply chain risk in its draft National Cybersecurity Strategy. Rather than targeting specific products for exclusion, it is considering Indigenous Innovation policies, giving preferences to domestic ICT suppliers in order to create a robust, globally competitive national presence in the sector.
    • China: Deriving from goals in the 11th Five Year Plan (2006–2010), China introduced and pursued a mix of security-focused and aggressive Indigenous Innovation policies.
      China is requiring that an indigenous innovation product catalog be used for its government procurement and is implementing a Multi-level Protection Scheme (MLPS), which requires (among other things) that product developers and manufacturers be Chinese citizens or legal persons, and that product core technology and key components have independent Chinese or indigenous intellectual property rights.

    == Private sector efforts ==

    SLSA (Supply-chain Levels for Software Artifacts) is an end-to-end framework for ensuring the integrity of software artifacts throughout the software supply chain. The requirements are inspired by Google’s internal "Binary Authorization for Borg", which has been in use for the past 8+ years and is mandatory for all of Google's production workloads. The goal of SLSA is to improve the state of the industry, particularly open source, to defend against the most pressing integrity threats. With SLSA, consumers can make informed choices about the security posture of the software they consume.
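    As a generic illustration of the artifact-integrity idea that frameworks such as SLSA build on, the sketch below checks that a downloaded artifact matches the digest recorded for it by its producer. It does not implement SLSA or any of its levels: the record layout (`subject`, `sha256`) and the function names are hypothetical, and real provenance additionally needs to be signed and verified.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def artifact_matches_record(artifact: bytes, record: dict) -> bool:
    """Check an artifact against the digest stored in a (hypothetical) record.

    The record layout used here is an assumption made for this example:
    {"subject": {"name": ..., "sha256": ...}}.
    """
    expected = record["subject"]["sha256"]
    # compare_digest avoids leaking the position of the first mismatch.
    return hmac.compare_digest(sha256_hex(artifact), expected)


# Usage: the producer records the digest, the consumer re-computes it.
built = b"example build output"
record = {"subject": {"name": "example.tar.gz", "sha256": sha256_hex(built)}}

print(artifact_matches_record(built, record))
print(artifact_matches_record(built + b" tampered", record))
```

    A digest check of this kind only shows that the artifact is the one the record describes; it says nothing about how the artifact was built, which is the part that provenance requirements are meant to address.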

  • Two-phase commit protocol


    In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC, tupac) is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort (roll back) the transaction. This protocol (a specialised type of consensus protocol) achieves its goal even in many cases of temporary system failure (involving either process, network node, communication, etc. failures), and is thus widely used. However, it is not resilient to all possible failure configurations, and in rare cases, manual intervention is needed to remedy an outcome. To accommodate recovery from failure (automatic in most cases) the protocol's participants use logging of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Many protocol variants exist that primarily differ in logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures compose a substantial portion of the protocol, due to many possible failure scenarios to be considered and supported by the protocol. 
    In a "normal execution" of any single distributed transaction (i.e., when no failure occurs, which is typically the most frequent situation), the protocol consists of two phases:

    • The commit-request phase (or voting phase), in which a coordinator process attempts to prepare all the transaction's participating processes (named participants, cohorts, or workers) to take the necessary steps for either committing or aborting the transaction and to vote, either "Yes": commit (if the transaction participant's local portion execution has ended properly), or "No": abort (if a problem has been detected with the local portion), and
    • The commit phase, in which, based on the voting of the participants, the coordinator decides whether to commit (only if all have voted "Yes") or abort the transaction (otherwise), and notifies all the participants of the result. The participants then follow with the needed actions (commit or abort) with their local transactional resources (also called recoverable resources; e.g., database data) and their respective portions in the transaction's other output (if applicable).

    The two-phase commit (2PC) protocol should not be confused with the two-phase locking (2PL) protocol, a concurrency control protocol.

    == Assumptions ==

    The protocol works in the following manner: one node is a designated coordinator, which is the master site, and the rest of the nodes in the network are designated the participants. The protocol assumes that:

    • there is stable storage at each node with a write-ahead log,
    • no node crashes forever,
    • the data in the write-ahead log is never lost or corrupted in a crash, and
    • any two nodes can communicate with each other.

    The last assumption is not too restrictive, as network communication can typically be rerouted. The first two assumptions are much stronger; if a node is totally destroyed then data can be lost.

    The protocol is initiated by the coordinator after the last step of the transaction has been reached.
    The participants then respond with an agreement message or an abort message depending on whether the transaction has been processed successfully at the participant.

    == Basic algorithm ==

    === Commit request (or voting) phase ===

    1. The coordinator sends a query-to-commit message to all participants and waits until it has received a reply from all participants.
    2. The participants execute the transaction up to the point where they will be asked to commit. They each write an entry to their undo log and an entry to their redo log.
    3. Each participant replies with either an agreement message (participant votes Yes to commit), if the participant's actions succeeded, or an abort message (participant votes No to commit), if the participant experiences a failure that will make it impossible to commit.

    === Commit (or completion) phase ===

    ==== Success ====

    If the coordinator received an agreement message from all participants during the commit-request phase:

    1. The coordinator sends a commit message to all the participants.
    2. Each participant completes the operation, and releases all the locks and resources held during the transaction.
    3. Each participant sends an acknowledgement to the coordinator.
    4. The coordinator completes the transaction when all acknowledgements have been received.

    ==== Failure ====

    If any participant votes No during the commit-request phase (or the coordinator's timeout expires):

    1. The coordinator sends a rollback message to all the participants.
    2. Each participant undoes the transaction using the undo log, and releases the resources and locks held during the transaction.
    3. Each participant sends an acknowledgement to the coordinator.
    4. The coordinator undoes the transaction when all acknowledgements have been received.
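    The two phases described above can be sketched as a small in-memory simulation. This is an illustration under simplifying assumptions rather than a usable implementation: there is no network, no timeout handling, no durable log and no crash recovery, and the `Participant` class, its `can_commit` switch and the `two_phase_commit` function are names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    name: str
    can_commit: bool = True  # hypothetical switch simulating local success or failure
    state: str = "working"
    log: list = field(default_factory=list)  # stands in for a write-ahead log

    def prepare(self) -> bool:
        """Voting phase: record the vote, then reply Yes (True) or No (False)."""
        if self.can_commit:
            self.log.append("prepare")
            self.state = "prepared"
            return True
        self.log.append("abort")
        self.state = "aborted"
        return False

    def commit(self) -> str:
        """Commit phase, success path: complete the work and acknowledge."""
        self.log.append("commit")
        self.state = "committed"
        return "ack"

    def rollback(self) -> str:
        """Commit phase, failure path: undo the work and acknowledge."""
        if self.state != "aborted":
            self.log.append("abort")
        self.state = "aborted"
        return "ack"


def two_phase_commit(participants) -> str:
    """Play the coordinator role for one transaction."""
    # Phase 1: send the query to commit and collect every vote.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only if all voted Yes, otherwise roll back everywhere.
    if all(votes):
        acks = [p.commit() for p in participants]
        outcome = "committed"
    else:
        acks = [p.rollback() for p in participants]
        outcome = "aborted"
    # The coordinator finishes once every acknowledgement has arrived.
    assert len(acks) == len(participants)
    return outcome


ok = [Participant("db1"), Participant("db2")]
print(two_phase_commit(ok), [p.state for p in ok])

mixed = [Participant("db1"), Participant("db2", can_commit=False)]
print(two_phase_commit(mixed), [p.state for p in mixed])
```

    Because every call here returns immediately, the sketch cannot show the blocking behaviour of the protocol: a prepared participant in a real system has to wait for the coordinator's decision before it can release its locks.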
    ==== Message flow ====

    Coordinator                                         Participant
                          QUERY TO COMMIT
                    -------------------------------->
                          VOTE YES/NO                   prepare*/abort*
                    <--------------------------------
    commit*/abort*        COMMIT/ROLLBACK
                    -------------------------------->
                          ACKNOWLEDGEMENT               commit*/abort*
                    <--------------------------------
    end

    An * next to the record type means that the record is forced to stable storage.

    == Disadvantages ==

    The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some participants will never resolve their transactions: after a participant has sent an agreement message as a response to the commit-request message from the coordinator, it will block until a commit or rollback is received.

    A two-phase commit protocol cannot dependably recover from a failure of both the coordinator and a cohort member during the commit phase. If only the coordinator had failed, and no cohort members had received a commit message, it could safely be inferred that no commit had happened. If, however, both the coordinator and a cohort member failed, it is possible that the failed cohort member was the first to be notified, and had actually done the commit. Even if a new coordinator is selected, it cannot confidently proceed with the operation until it has received an agreement from all cohort members, and hence must block until all cohort members respond.

    == Implementing the two-phase commit protocol ==

    === Common architecture ===

    In many cases the 2PC protocol is distributed in a computer network. It is easily distributed by implementing multiple dedicated 2PC components similar to each other, typically named transaction managers (TMs; also referred to as 2PC agents or Transaction Processing Monitors), that carry out the protocol's execution for each transaction (e.g., The Open Group's X/Open XA).
    The databases involved in a distributed transaction (the participants, including the coordinator) register with nearby TMs (typically residing on the same network nodes as the respective participants) to terminate that transaction using 2PC. Each distributed transaction has an ad hoc set of TMs, the TMs with which the transaction participants register. A leader, the coordinator TM, exists for each transaction to coordinate 2PC for it, typically the TM of the coordinator database. However, the coordinator role can be transferred to another TM for performance or reliability reasons. Rather than exchanging 2PC messages among themselves, the participants exchange the messages with their respective TMs. The relevant TMs communicate among themselves to execute the 2PC protocol schema above, "representing" the respective participants, to terminate that transaction. With this architecture the protocol is fully distributed (it does not need any central processing component or data structure), and scales effectively with the number of network nodes (network size).

    This common architecture is also effective for the distribution of other atomic commitment protocols besides 2PC, since all such protocols use the same voting mechanism and outcome propagation to protocol participants.

    === Protocol optimizations ===

    Database research has been done on ways to get most of the benefits of the two-phase commit protocol while reducing costs, both by optimizing the protocol and by saving protocol operations under certain assumptions about system behavior.

    ==== Presumed abort and presumed commit ====

    Presumed abort and presumed commit are two common optimizations of this kind. An assumption about the outcome of transactions, either commit or abort, can save both messages and logging operations by the participants during the 2PC protocol's execution.
    For example, under presumed abort, if during system recovery from failure the recovery procedure finds no logged evidence that some transaction committed, it assumes that the transaction has been aborted and acts accordingly. This means that it does not matter whether aborts are logged at all, and such logging can be omitted under this assumption. Typical

  • Enterprise architecture


    Enterprise architecture (EA) is a business function concerned with the structures and behaviours of a business, especially business roles and processes that create and use business data. The international definition according to the Federation of Enterprise Architecture Professional Organizations is "a well-defined practice for conducting enterprise analysis, design, planning, and implementation, using a comprehensive approach at all times, for the successful development and execution of strategy. Enterprise architecture applies architecture principles and practices to guide organizations through the business, information, process, and technology changes necessary to execute their strategies. These practices utilize the various aspects of an enterprise to identify, motivate, and achieve these changes."

    The United States Federal Government is an example of an organization that practices EA, in this case with its Capital Planning and Investment Control processes. Companies such as Independence Blue Cross, Intel, Volkswagen AG, and InterContinental Hotels Group also use EA to improve their business architectures as well as to improve business performance and productivity. Additionally, the Federal Enterprise Architecture's reference guide aids federal agencies in the development of their architectures.

    == Introduction ==

    As a discipline, EA "proactively and holistically lead[s] enterprise responses to disruptive forces by identifying and analyzing the execution of change" towards organizational goals. EA gives business and IT leaders recommendations for policy adjustments and provides best strategies to support and enable business development and change within the information systems the business depends on. EA provides a guide for decision making towards these objectives.
    The National Computing Centre's EA best practice guidance states that an EA typically "takes the form of a comprehensive set of cohesive models that describe the structure and functions of an enterprise. The individual models in an EA are arranged in a logical manner that provides an ever-increasing level of detail about the enterprise."

    Important players within EA include enterprise architects and solutions architects. Enterprise architects are at the top level of the architect hierarchy, meaning they have more responsibilities than solutions architects. While solutions architects focus on their own relevant solutions, enterprise architects focus on solutions for and the impact on the whole organization. Enterprise architects oversee many solution architects and business functions. As practitioners of EA, enterprise architects support an organization's strategic vision by acting to align people, process, and technology decisions with actionable goals and objectives that result in quantifiable improvements toward achieving that vision. The practice of EA "analyzes areas of common activity within or between organizations, where information and other resources are exchanged to guide future states from an integrated viewpoint of strategy, business, and technology."

    === Definitions ===

    The term enterprise can be defined as an organizational unit, organization, or collection of organizations that share a set of common goals and collaborate to provide specific products or services to customers. In that sense, the term enterprise covers various types of organizations, regardless of their size, ownership model, operational model, or geographical distribution. It includes those organizations' complete sociotechnical system, including people, information, processes, and technologies. Enterprise as a sociotechnical system defines the scope of EA.
The term architecture refers to fundamental concepts or properties of a system in its environment; and embodied in its elements, relationships, and in the principles of its design and evolution. A methodology for developing and using architecture to guide the transformation of a business from a baseline state to a target state, sometimes through several transition states, is usually known as an enterprise architecture framework. A framework provides a structured collection of processes, techniques, artifact descriptions, reference models, and guidance for the production and use of an enterprise-specific architecture description. Open-source tools supporting EA practice, such as the Essential Project, have also been evaluated for suitability in academic and commercial training contexts. Paramount to changing the EA is the identification of a sponsor. Their mission, vision, strategy, and the governance framework define all roles, responsibilities, and relationships involved in the anticipated transformation. Changes considered by enterprise architects typically include innovations in the structure or processes of an organization; innovations in the use of information systems or technologies; the integration and/or standardization of business processes; and improvement of the quality and timeliness of business information. According to the standard ISO/IEC/IEEE 42010, the product used to describe the architecture of a system is called an architectural description. In practice, an architectural description contains a variety of lists, tables, and diagrams. These are models known as views. In the case of EA, these models describe the logical business functions or capabilities, business processes, human roles and actors, the physical organization structure, data flows and data stores, business applications and platform applications, hardware, and communications infrastructure. 
The first use of the term "enterprise architecture" is often incorrectly attributed to John Zachman's 1987 A framework for information systems architecture. The first publication to use it was instead a National Institute of Standards (NIST) Special Publication on the challenges of information system integration. The NIST article describes EA as consisting of several levels. Business unit architecture is the top level and might be a total corporate entity or a sub-unit. It establishes for the whole organization necessary frameworks for "satisfying both internal information needs" as well as the needs of external entities, which include cooperating organizations, customers, and federal agencies. The lower levels of the EA that provide information to higher levels are more attentive to detail on behalf of their superiors. In addition to this structure, business unit architecture establishes standards, policies, and procedures that either enhance or stymie the organization's mission. The main difference between these two definitions is that Zachman's concept was the creation of individual information systems optimized for business, while NIST's described the management of all information systems within a business unit. The definitions in both publications, however, agreed that due to the "increasing size and complexity of the [i]mplementations of [i]nformation systems... logical construct[s] (or architecture) for defining and controlling the interfaces and... [i]ntegration of all the components of a system" is necessary. Zachman in particular urged for a "strategic planning methodology." == Overview == === Schools of thought === Within the field of enterprise architecture, there are three overarching schools: Enterprise IT Design, Enterprise Integrating, and Enterprise Ecosystem Adaption. 
Which school one subscribes to will affect how one sees the purpose and scope of EA, as well as the means of achieving it, the skills needed to conduct it, and the locus of responsibility for conducting it. Under Enterprise IT Design, the main purpose of EA is to guide the process of planning and designing an enterprise's IT/IS capabilities to meet the desired organizational objectives, often by greater alignment between IT/IS and business concerns. Architecture proposals and decisions are limited to the IT/IS aspects of the enterprise, and other aspects serve only as inputs. The Enterprise Integrating school believes that the purpose of EA is to create a greater coherency between the various concerns of an enterprise (HR, IT, Operations, etc.), including the link between strategy formulation and execution. Architecture proposals and decisions here encompass all aspects of the enterprise. The Enterprise Ecosystem Adaption school states that the purpose of EA is to foster and maintain the learning capabilities of enterprises so they may be sustainable. Consequently, a great deal of emphasis is put on improving the capabilities of the enterprise to improve itself, to innovate, and to coevolve with its environment. Typically, proposals and decisions encompass both the enterprise and its environment. === Benefits, challenges, and criticisms === The benefits of EA are achieved through its direct and indirect contributions to organizational goals. Notable benefits include support in the areas related to design and re-design of the organizational structures during mergers, acquisitions, or

  • Single-source publishing

    Single-source publishing

    Single-source publishing, also known as single-sourcing publishing, is a content management method which allows the same source content to be used across different forms of media and more than one time. The labor-intensive and expensive work of editing need only be carried out once, on only one document; that source document (the single source of truth) can then be stored in one place and reused. This reduces the potential for error, as corrections are only made one time in the source document. The benefits of single-source publishing primarily relate to the editor rather than the user. The user benefits from the consistency that single-sourcing brings to terminology and information. This assumes the content manager has applied an organized conceptualization to the underlying content (A poor conceptualization can make single-source publishing less useful). Single-source publishing is sometimes used synonymously with multi-channel publishing though whether or not the two terms are synonymous is a matter of discussion. == Definition == While there is a general definition of single-source publishing, there is no single official delineation between single-source publishing and multi-channel publishing, nor are there any official governing bodies to provide such a delineation. Single-source publishing is most often understood as the creation of one source document in an authoring tool and converting that document into different file formats or human languages (or both) multiple times with minimal effort. Multi-channel publishing can either be seen as synonymous with single-source publishing, or similar in that there is one source document but the process itself results in more than a mere reproduction of that source. == History == The origins of single-source publishing lie, indirectly, with the release of Windows 3.0 in 1990. 
With the eclipsing of MS-DOS by graphical user interfaces, help files went from being unreadable text along the bottom of the screen to hypertext systems such as WinHelp. On-screen help interfaces allowed software companies to cease the printing of large, expensive help manuals with their products, reducing costs for both producer and consumer. This system raised opportunities as well, and many developers fundamentally changed the way they thought about publishing. Writers of software documentation did not simply move from being writers of traditional bound books to writers of electronic publishing, but rather they became authors of central documents which could be reused multiple times across multiple formats. The first single-source publishing project was started in 1993 by Cornelia Hofmann at Schneider Electric in Seligenstadt, using software based on Interleaf to automatically create paper documentation in multiple languages based on a single original source file. XML, developed during the mid- to late-1990s, was also significant to the development of single-source publishing as a method. XML, a markup language, allows developers to separate their documentation into two layers: a shell-like layer based on presentation and a core-like layer based on the actual written content. This method allows developers to write the content only one time while switching it in and out of multiple different formats and delivery methods. In the mid-1990s, several firms began creating and using single-source content for technical documentation (Boeing Helicopter, Sikorsky Aviation and Pratt & Whitney Canada) and user manuals (Ford owners manuals) based on tagged SGML and XML content generated using the Arbortext Epic editor with add-on functions developed by a contractor. 
The concept behind this usage was that complex, hierarchical content that did not lend itself to discrete componentization could be used across a variety of requirements by tagging the differences within a single document using the capabilities built into SGML and XML. Ford, for example, was able to tag its single owner's manual files so that 12 model years could be generated via a resolution script running on the single completed file. Pratt & Whitney, likewise, was able to tag up to 20 subsets of its jet engine manuals in single-source files, calling out the desired version at publication time. World Book Encyclopedia also used the concept to tag its articles for American and British versions of English. Starting from the early 2000s, single-source publishing was used with an increasing frequency in the field of technical translation. It is still regarded as the most efficient method of publishing the same material in different languages. Once a printed manual was translated, for example, the online help for the software program which the manual accompanies could be automatically generated using the method. Metadata could be created for an entire manual and individual pages or files could then be translated from that metadata with only one step, removing the need to recreate information or even database structures. Although single-source publishing is now decades old, its importance has increased urgently as of the 2010s. As consumption of information products rises and the number of target audiences expands, so does the work of developers and content creators. Within the industry of software and its documentation, there is a perception that the choice is to embrace single-source publishing or render one's operations obsolete. == Criticism == Editors using single-source publishing have been criticized for below-standard work quality, leading some critics to describe single-source publishing as the "conveyor belt assembly" of content creation. 
Although single-source publishing is heavily used in technical translation, it carries a risk of indexing errors. Two words that are synonyms in English may not be synonyms in another language. In a document produced via single-sourcing, the index is translated automatically, so the two words are rendered as synonyms in the target language even though they are synonyms only in the source language.
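The variant tagging described in the History section (for example, one owner's manual file resolved into several model years) can be illustrated with a minimal sketch. The `years` attribute, the element names, and the helper functions below are hypothetical conventions chosen for illustration; they are not the schema or tooling used by any of the companies mentioned.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source manual. The "years" attribute is an
# illustrative tagging convention, not a real vendor schema.
SOURCE = """
<manual>
  <step>Park the vehicle on level ground.</step>
  <step years="2001">Check the oil level with the dipstick.</step>
  <step years="2002 2003">Read the oil level on the dashboard display.</step>
  <step>Close the hood.</step>
</manual>
"""


def resolve(source_xml, year):
    """Return the text of every step that applies to the given model year."""
    root = ET.fromstring(source_xml.strip())
    steps = []
    for step in root.iter("step"):
        years = step.get("years")
        # Untagged steps are shared by all variants.
        if years is None or year in years.split():
            steps.append(step.text)
    return steps


def to_text(steps):
    """Render the resolved content as a plain-text numbered list."""
    return "\n".join(f"{i}. {s}" for i, s in enumerate(steps, 1))


def to_html(steps):
    """Render the same resolved content as an HTML ordered list."""
    items = "".join(f"<li>{s}</li>" for s in steps)
    return f"<ol>{items}</ol>"


if __name__ == "__main__":
    for year in ("2001", "2002"):
        print(year)
        print(to_text(resolve(SOURCE, year)))
```

The content is written once; the variant is chosen at publication time by `resolve`, and the output format is chosen separately by `to_text` or `to_html`.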

  • LumenVox

    LumenVox

    LumenVox is a privately held speech recognition software company based in San Diego, California. LumenVox has been described as one of the market leaders in the speech recognition software industry. == History == LumenVox was founded in 2001 as a subsidiary of Progressive Computing. According to LumenVox CEO Edward Miller, when Progressive had initially looked to add speech recognition to its own phone system, it found the existing offerings too expensive and recognized a niche in the market for a more affordable speech recognition product. This led to the development of LumenVox with the aim of bringing speech recognition to small-to-midsized businesses. LumenVox is one of the major providers of automatic speech recognition for telephone systems and, as of 2006, was the second-largest provider of speech recognition software. == Products == The primary LumenVox product is the LumenVox Speech Engine. It is a speaker-independent automatic speech recognizer that uses the Speech Recognition Grammar Specification for building and defining grammars. It has been integrated with several of the major voice platforms, including Avaya Voice Portal/Interactive Response, Aculab, and BroadSoft's BroadWorks. The Speech Engine was originally derived from CMU Sphinx, but LumenVox has invested considerable development effort to make it a commercial-ready product. LumenVox also offers a product called the Speech Tuner, which provides a graphical means of testing and troubleshooting speech recognition applications. == Open source support == LumenVox was recognized as one of the top VoIP companies in 2008 for its work in providing its offerings to the open source community, an effort by the company that began in 2006 when it partnered with Digium. At that time, Digium, maintainer of the open source Asterisk PBX, integrated the LumenVox Speech Engine into Asterisk. This made LumenVox the first commercially available speech recognition engine for Asterisk. 
As one of the earlier commercial software integrations with Asterisk, the LumenVox integration has been described as one of the applications that helped to mainstream Asterisk. In 2009, LumenVox also began offering access to the Speech Engine as a monthly subscription, bringing the cost of entry down even lower for open source users. LumenVox is also integrated with the open source UniMRCP project, which provides open source client and server libraries for the Media Resource Control Protocol.

  • Reference data

    Reference data

    Reference data is data used to classify or categorize other data. Typically, it is static or changes slowly over time. Examples of reference data include: units of measurement; country codes; corporate codes; fixed conversion rates (e.g., for weight, temperature, and length); and calendar structure and constraints. Reference data sets are sometimes alternatively referred to as a "controlled vocabulary" or "lookup" data. Reference data differs from master data. While both provide context for business transactions, reference data is concerned with classification and categorisation, while master data is concerned with business entities. A further difference between reference data and master data is that a change to the reference data values may require an associated change in business process to support the change, while a change in master data will always be managed as part of existing business processes. For example, adding a new customer or sales product is part of the standard business process. However, adding a new product classification (e.g. "restricted sales item") or a new customer type (e.g. "gold level customer") will result in a modification to the business processes to manage those items. == Externally-defined reference data == For most organisations, most or all reference data is defined and managed within that organisation. Some reference data, however, may be externally defined and managed, for example by standards organizations. An example of externally defined reference data is the set of country codes as defined in ISO 3166-1. == Reference data management == Curating and managing reference data is key to ensuring its quality and thus fitness for purpose. All aspects of an organisation, operational and analytical, are greatly dependent on the quality of an organization's reference data. Without consistency across business process or applications, for example, similar things may be described in quite different ways. 
Reference data gains in value when it is widely re-used and widely referenced. Examples of good practice in reference data management include: formalizing reference data management; using external reference data as much as possible; governing the reference data specific to the enterprise; managing reference data at enterprise level; and keeping reference data under version control.
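The distinction between reference data and master data can be sketched in a few lines. In this illustrative example the country codes are a small subset of ISO 3166-1 alpha-2, while the `Customer` class, the customer-type sets, and the `validate` helper are hypothetical names introduced only for the sketch.

```python
from dataclasses import dataclass

# Externally defined reference data: a small subset of ISO 3166-1 alpha-2.
COUNTRY_CODES = {"DE": "Germany", "FR": "France", "JP": "Japan"}

# Internally defined reference data, kept as explicit versions.
# Adding "gold" is a reference data change, which usually implies
# a business process change as well (hypothetical values).
CUSTOMER_TYPES_V1 = frozenset({"standard"})
CUSTOMER_TYPES_V2 = frozenset({"standard", "gold"})


@dataclass(frozen=True)
class Customer:
    """Master data: a business entity classified by reference data."""
    name: str
    country: str
    customer_type: str


def validate(customer, customer_types):
    """Return a list of classification errors for one customer record."""
    errors = []
    if customer.country not in COUNTRY_CODES:
        errors.append(f"unknown country code: {customer.country}")
    if customer.customer_type not in customer_types:
        errors.append(f"unknown customer type: {customer.customer_type}")
    return errors


if __name__ == "__main__":
    acme = Customer("Acme", "DE", "gold")
    print(validate(acme, CUSTOMER_TYPES_V1))
    print(validate(acme, CUSTOMER_TYPES_V2))
```

Adding a new `Customer` record leaves the lookup sets untouched, whereas introducing the `gold` type requires a new version of the reference data itself.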

  • Literature review

    Literature review

    A literature review is an overview of previously published works on a particular topic. The term can refer to a full scholarly paper or a section of a scholarly work such as books or articles. Either way, a literature review provides the researcher/author and the audiences with general information of an existing knowledge of a particular topic. A good literature review has a proper research question, a proper theoretical framework, and/or a chosen research method. It serves to situate the current study within the body of the relevant literature and provides context for the reader. In such cases, the review usually precedes the methodology and results sections of the work. Producing a literature review is often part of a graduate and post-graduate requirement, included in the preparation of a thesis, dissertation, or a journal article. Literature reviews are also common in a research proposal or prospectus (the document approved before a student formally begins a dissertation or thesis). A literature review can be a type of a review article. In this sense, it is a scholarly paper that presents the current knowledge including substantive findings as well as theoretical and methodological contributions to a particular topic. Literature reviews are secondary sources and do not report new or original experimental work. Most often associated with academic-oriented literature, such reviews are found in academic journals and are not to be confused with book reviews, which may also appear in the same publication. Literature reviews are a basis for research in nearly every academic field. == Types == Since the concept of a systematic review was formalized in the 1970s, a basic division among types of reviews is the dichotomy of narrative reviews versus systematic reviews. The main types of narrative reviews are evaluative, exploratory, and instrumental. 
A fourth type of review of the scientific literature is the systematic review, but it is not usually called a literature review; absent further specification, that term conventionally refers to narrative reviews. A systematic review focuses on a specific research question to identify, appraise, select, and synthesize all high-quality research evidence and arguments relevant to that question. A meta-analysis is typically a systematic review that uses statistical methods to combine the data from all selected studies to produce a more reliable result. Torraco (2016) describes an integrative literature review. The purpose of an integrative literature review is to generate new knowledge on a topic through the process of review, critique, and synthesis of the literature under investigation. George et al. (2023) offer an extensive overview of review approaches. They also propose a model for selecting an approach by looking at the purpose, object, subject, community, and practices of the review. They describe six different types of review, each with its own purpose: exploratory or scoping reviews, which focus on breadth as opposed to depth; systematic or integrative reviews, which integrate empirical studies on a topic; meta-narrative reviews, which are qualitative and use literature to compare research or practice communities; problematizing or critical reviews, which propose new perspectives on a concept by association with other literature; meta-analyses and meta-regressions, which integrate quantitative studies and identify moderators; and mixed research syntheses, which combine other review approaches in the same paper. == Process and product == Shields and Rangarajan (2013) distinguish between the process of reviewing the literature and a finished work or product known as a literature review. The process of reviewing the literature is often ongoing and informs many aspects of the empirical research project. 
The process of reviewing the literature requires different kinds of activities and ways of thinking. Shields and Rangarajan (2013) and Granello (2001) link the activities of doing a literature review with Benjamin Bloom's revised taxonomy of the cognitive domain (ways of thinking: remembering, understanding, applying, analyzing, evaluating, and creating). === Use of artificial intelligence in a literature review === Artificial intelligence (AI) is reshaping traditional literature reviews across various disciplines. Generative pre-trained transformers, such as ChatGPT, are often used by students and academics for review purposes. Since 2023, an increasing number of tools powered by large language models and other artificial intelligence technologies have been developed to assist, automate, or generate literature reviews. Nevertheless, the employment of ChatGPT in academic reviews is problematic due to ChatGPT's propensity to "hallucinate". In response, efforts are being made to mitigate these hallucinations through the integration of plugins. For instance, Rad et al. (2023) used ScholarAI for review in cardiothoracic surgery.

  • TurboQuant

    TurboQuant

    TurboQuant is an online vector quantization algorithm for compressing high-dimensional Euclidean vectors while preserving their geometric structure. It was proposed in 2025 by Amir Zandieh, Majid Daliri, Majid Hadian, and Vahab Mirrokni in the paper TurboQuant: Online Vector Quantization with Near-optimal Distortion Rate. The paper lists Zandieh and Mirrokni as affiliated with Google Research, Daliri with New York University, and Hadian with Google DeepMind. The method was developed for applications including large language model (LLM) inference, key–value (KV) cache compression, vector databases, and nearest neighbor search. TurboQuant consists of two related algorithms: TurboQuantmse, which is optimized for mean squared error (MSE), and TurboQuantprod, which is optimized for unbiased inner product estimation. The algorithm uses a random rotation of input vectors, applies scalar quantizers to the rotated coordinates, and, for inner-product estimation, applies a one-bit Quantized Johnson–Lindenstrauss (QJL) transform to the residual error. == Background == Vector quantization is a compression method that maps high-dimensional vectors to a finite set of codewords. The problem has roots in Shannon's source coding theory and rate–distortion theory. In machine learning and information retrieval, vector quantization is used to reduce the memory required to store embeddings, activation vectors, and other numerical representations. In Transformer-based large language models, the KV cache stores key and value vectors from previous tokens during autoregressive decoding. The size of this cache grows with context length, the number of attention heads, and the number of concurrent requests, making it a major memory bottleneck in LLM serving. Similar compression problems appear in vector search, where large collections of embedding vectors must be stored and searched efficiently. 
Earlier approaches to vector quantization include product quantization, scalar quantization, and data-dependent k-means codebook construction. The TurboQuant paper argues that many existing methods either require offline preprocessing and calibration or suffer from suboptimal distortion guarantees in online settings. == Algorithm == === TurboQuantmse === TurboQuantmse is the version of the algorithm optimized for mean-squared error. For a unit vector $x \in S^{d-1}$, the algorithm first applies a random rotation matrix $\Pi \in \mathbb{R}^{d \times d}$ and sets $z = \Pi x$. Each coordinate of the rotated vector follows a shifted and scaled beta distribution, which converges to a normal distribution in high dimensions. In high dimensions, distinct coordinates also become nearly independent, allowing the algorithm to apply scalar quantizers independently to each coordinate. The scalar quantizer is constructed by solving a one-dimensional continuous k-means or Lloyd–Max quantization problem. If the centroids are $c_1, c_2, \ldots, c_{2^b}$, the quantization step stores, for each coordinate $j$, the index $\mathrm{idx}_j = \operatorname*{arg\,min}_{k \in [2^b]} |z_j - c_k|$. During dequantization, the stored index for each coordinate is replaced by the corresponding centroid, giving a reconstructed rotated vector $\tilde{z}$. The algorithm then rotates back: $\tilde{x} = \Pi^{\top} \tilde{z}$. The paper gives the following bound for TurboQuantmse: $D_{\mathrm{mse}} \leq \frac{\sqrt{3}\,\pi}{2} \cdot \frac{1}{4^b}$, where the constant $\sqrt{3}\,\pi/2$ is approximately 2.72. It also reports finer-grained MSE values of approximately 0.36, 0.117, 0.03, and 0.009 for bit-widths $b = 1, 2, 3, 4$, respectively. 
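The rotate, quantize, and rotate-back steps of the MSE variant can be sketched in plain Python. This is a simplified illustration rather than the authors' implementation: it assumes each rotated coordinate is approximately $N(0, 1/d)$ (the high-dimensional limit) instead of using the exact beta distribution, it hard-codes the standard Lloyd–Max centroids for a Gaussian at 1 and 2 bits, and all function names are hypothetical.

```python
import math
import random

# Lloyd-Max centroids for a standard normal variable at 1 and 2 bits.
# Assumption: rotated coordinates are treated as N(0, 1/d), so these
# values are rescaled by 1/sqrt(d) instead of solving the beta case.
GAUSS_CENTROIDS = {
    1: [-0.7979, 0.7979],
    2: [-1.5104, -0.4528, 0.4528, 1.5104],
}


def random_rotation(d, rng):
    """Random orthogonal d x d matrix via Gram-Schmidt on Gaussian rows."""
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for u in rows:
            dot = sum(a * b for a, b in zip(v, u))
            v = [a - dot * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-9:
            rows.append([a / norm for a in v])
    return rows


def random_unit_vector(d, rng):
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def centroids(d, b):
    return [c / math.sqrt(d) for c in GAUSS_CENTROIDS[b]]


def quantize(x, rot, b):
    """Rotate x, then store the nearest-centroid index per coordinate."""
    cents = centroids(len(x), b)
    z = [sum(r * xi for r, xi in zip(row, x)) for row in rot]
    return [min(range(len(cents)), key=lambda k: abs(zj - cents[k])) for zj in z]


def dequantize(idx, rot, b):
    """Replace indices by centroids, then apply the transposed rotation."""
    d = len(idx)
    cents = centroids(d, b)
    z_hat = [cents[k] for k in idx]
    return [sum(rot[i][j] * z_hat[i] for i in range(d)) for j in range(d)]


def squared_error(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


if __name__ == "__main__":
    rng = random.Random(0)
    d = 64
    rot = random_rotation(d, rng)
    x = random_unit_vector(d, rng)
    for b in (1, 2):
        x_hat = dequantize(quantize(x, rot, b), rot, b)
        print(b, squared_error(x, x_hat))
```

For a moderately large dimension, the printed squared errors should land in the neighbourhood of the values the paper reports for 1 and 2 bits, although a single random vector will fluctuate around them.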
=== TurboQuantprod === TurboQuantprod is optimized for unbiased inner-product estimation. The authors note that an MSE-optimized quantizer may introduce bias when used to estimate inner products. To address this, TurboQuantprod first applies TurboQuantmse with bit-width $b - 1$, then applies a one-bit Quantized Johnson–Lindenstrauss transform to the remaining residual vector. Let $r = x - Q_{\mathrm{mse}}^{-1}(Q_{\mathrm{mse}}(x))$ be the residual after MSE quantization, and let $\gamma = \|r\|_2$. The QJL step stores a sign vector for the residual. For $\gamma \neq 0$, this can be written using the normalized residual $u = r/\gamma$: $\mathrm{qjl} = \operatorname{sign}(Su)$, where $S \in \mathbb{R}^{d \times d}$ is a random projection matrix. Since the sign function is invariant under positive rescaling, this is equivalent to $\operatorname{sign}(Sr)$ when $r \neq 0$. If $\gamma = 0$, the residual correction is zero. TurboQuantprod stores the MSE quantization, the QJL sign vector, and the residual norm: $Q_{\mathrm{prod}}(x) = \left[Q_{\mathrm{mse}}(x), \mathrm{qjl}, \gamma\right]$. The dequantized vector is reconstructed as $\tilde{x} = \tilde{x}_{\mathrm{mse}} + \frac{\sqrt{\pi/2}}{d}\,\gamma\, S^{\top} \mathrm{qjl}$. The paper proves that TurboQuantprod is unbiased for inner-product estimation: $\mathbb{E}_{\tilde{x}}\left[\langle y, \tilde{x} \rangle\right] = \langle y, x \rangle$. It also gives the distortion bound $D_{\mathrm{prod}} \leq \frac{\sqrt{3}\,\pi^{2}\,\|y\|_2^2}{d} \cdot \frac{1}{4^b}$.
== Performance and applications == The TurboQuant paper reports that the algorithm achieves near-optimal distortion rates within a small constant factor of information-theoretic lower bounds. The authors report that, for KV cache quantization, TurboQuant achieved quality neutrality at 3.5 bits per channel and marginal degradation at 2.5 bits per channel. In long-context LLM experiments using Llama 3.1 8B Instruct, the paper evaluated the method on a "needle-in-a-haystack" retrieval task with document lengths from 4,000 to 104,000 tokens. It reported that TurboQuant matched the uncompressed full-precision baseline while using more than 4× compression, and compared the method against PolarQuant, SnapKV, PyramidKV, and KIVI. Google Research stated that TurboQuant was evaluated on long-context benchmarks including LongBench, Needle in a Haystack, ZeroSCROLLS, RULER, and L-Eval using open-source models including Gemma and Mistral. According to a report in Tom's Hardware, Google described the method as reducing KV-cache memory by at least six times and achieving up to an eightfold improvement in attention-logit computation on Nvidia H100 GPUs compared with unquantized 32-bit keys. TurboQuant has also been applied to nearest-neighbor vector search. The original paper reports experiments on DBpedia entity embeddings and GloVe embeddings, comparing TurboQuant with product quantization and other vector-search quantization baselines. 
== Relationship to other methods == TurboQuant is related to several methods for efficient large language model inference and high-dimensional search: product quantization – a vector quantization technique widely used for approximate nearest-neighbor search; quantization (machine learning) – reducing the numerical precision of weights, activations, or cached tensors in machine learning models; PagedAttention – a memory-management algorithm for LLM serving that reduces fragmentation in the KV cache; the Johnson–Lindenstrauss lemma – a result in high-dimensional geometry used in random projection methods; and Lloyd's algorithm – an algorithm for scalar and vector quantization, including k-means-style codebook construction. Unlike PagedAttention, which focuses on memory allocation and cache layout, TurboQuant reduces the numerical storage cost of the vectors themselves. Unlike many product-quantization methods, TurboQuant is designed to be data-oblivious and online, avoiding dataset-specific codebook training. == Limitations == The strongest performance claims for TurboQuant come from the original paper and Google Research's own publication. Coverage in technology media has noted that the broader impact of the method will depend on real-world implementation details, workloads, and hardware architectures.

  • Normal distributions transform

    Normal distributions transform

    The normal distributions transform (NDT) is a point cloud registration algorithm introduced by Peter Biber and Wolfgang Straßer in 2003, while working at the University of Tübingen. The algorithm registers two point clouds by first associating a piecewise normal distribution with the first point cloud, which gives the probability of sampling a point belonging to the cloud at a given spatial coordinate, and then finding a transform that maps the second point cloud to the first by maximising the likelihood of the second point cloud under that distribution as a function of the transform parameters. Originally introduced for 2D point cloud map matching in simultaneous localization and mapping (SLAM) and relative position tracking, the algorithm was extended to 3D point clouds and has wide applications in computer vision and robotics. NDT is very fast and accurate, making it suitable for application to large scale data, but it is also sensitive to initialisation, requiring a sufficiently accurate initial guess, and for this reason it is typically used in a coarse-to-fine alignment strategy. == Formulation == The NDT function associated with a point cloud is constructed by partitioning the space into regular cells. For each cell, it is possible to define the mean $\mathbf{q} = \frac{1}{n} \sum_i \mathbf{x}_i$ and covariance $\mathbf{S} = \frac{1}{n} \sum_i (\mathbf{x}_i - \mathbf{q})(\mathbf{x}_i - \mathbf{q})^{\top}$ of the $n$ points of the cloud $\mathbf{x}_1, \dots, \mathbf{x}_n$ that fall within the cell. 
The probability density of sampling a point at a given spatial location $\mathbf{x}$ within the cell is then given by the normal distribution $e^{-\frac{1}{2} (\mathbf{x} - \mathbf{q})^{\top} \mathbf{S}^{-1} (\mathbf{x} - \mathbf{q})}$. Two point clouds can be mapped by a Euclidean transformation $f$ with rotation matrix $\mathbf{R}$ and translation vector $\mathbf{t}$, $f_{\mathbf{R},\mathbf{t}}(\mathbf{x}) = \mathbf{R}\mathbf{x} + \mathbf{t}$, that maps from the second cloud to the first, parametrised by the rotation angles and translation components. The algorithm registers the two point clouds by optimising the parameters of the transformation that maps the second cloud to the first, with respect to a loss function based on the NDT of the first point cloud, solving the following problem: $\arg\min_{\mathbf{R},\mathbf{t}} \left\{ -\sum_i \operatorname{NDT}\left(f_{\mathbf{R},\mathbf{t}}(\mathbf{x}_i)\right) \right\}$, where the loss function represents the negated likelihood, obtained by applying the transformation to all points in the second cloud and summing the value of the NDT at each transformed point $f_{\mathbf{R},\mathbf{t}}(\mathbf{x})$. The loss is piecewise continuous and differentiable, and can be optimised with gradient-based methods (in the original formulation, the authors use Newton's method). In order to reduce the effect of cell discretisation, a technique consists of partitioning the space into multiple overlapping grids, shifted by half cell size along the spatial directions, and computing the likelihood at a given location as the sum of the NDTs induced by each grid.
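The formulation above can be illustrated with a minimal 2D sketch. To stay short it makes several simplifying assumptions that are not part of the original algorithm description: the whole first cloud is treated as a single cell, the loss is only evaluated rather than optimised, and the function names are hypothetical.

```python
import math


def cell_stats(points):
    """Mean q and covariance S of the 2D points that fall in one cell."""
    n = len(points)
    qx = sum(p[0] for p in points) / n
    qy = sum(p[1] for p in points) / n
    sxx = sum((p[0] - qx) ** 2 for p in points) / n
    sxy = sum((p[0] - qx) * (p[1] - qy) for p in points) / n
    syy = sum((p[1] - qy) ** 2 for p in points) / n
    return (qx, qy), ((sxx, sxy), (sxy, syy))


def ndt_value(x, mean, cov):
    """exp(-1/2 (x - q)^T S^-1 (x - q)) for a 2x2 covariance S."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Closed-form inverse of a symmetric 2x2 matrix.
    maha = (dx * (syy * dx - sxy * dy) + dy * (-sxy * dx + sxx * dy)) / det
    return math.exp(-0.5 * maha)


def transform(p, theta, t):
    """Euclidean transformation f(x) = R x + t with rotation angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + t[0], s * p[0] + c * p[1] + t[1])


def loss(points, theta, t, mean, cov):
    """Negated sum of NDT values at the transformed second-cloud points."""
    return -sum(ndt_value(transform(p, theta, t), mean, cov) for p in points)


if __name__ == "__main__":
    first = [(0.0, 0.0), (1.0, 0.2), (2.0, -0.1), (3.0, 0.1), (1.5, 0.5)]
    mean, cov = cell_stats(first)
    # Second cloud: the first cloud displaced by one unit along y.
    second = [(x, y - 1.0) for x, y in first]
    print(loss(second, 0.0, (0.0, 0.0), mean, cov))  # misaligned
    print(loss(second, 0.0, (0.0, 1.0), mean, cov))  # aligned, lower loss
```

A full implementation would keep one mean and covariance per grid cell, look up the cell containing each transformed point, and hand the loss to an optimiser such as Newton's method.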

    Read more →
  • Tuple

    Tuple

    In mathematics, a tuple is a finite sequence (or ordered list) of numbers. More generally, it is a sequence of mathematical objects, called the elements of the tuple. An n-tuple is a tuple of n elements, where n is a non-negative integer. There is only one 0-tuple, called the empty tuple. A 1-tuple and a 2-tuple are commonly called a singleton and an ordered pair, respectively. The term "infinite tuple" is occasionally used for "infinite sequences". Tuples are usually written by listing the elements within parentheses "( )" and separated by commas; for example, (2, 7, 4, 1, 7) denotes a 5-tuple. Other types of brackets are sometimes used, although they may have a different meaning. An n-tuple can be formally defined as the image of a function that has the set of the first n natural numbers, {1, 2, ..., n}, as its domain. Tuples may also be defined from ordered pairs by a recurrence starting from an ordered pair; indeed, an n-tuple can be identified with the ordered pair of its first (n − 1) elements and its nth element, for example, $(((1,2),3),4) = (1,2,3,4)$. In computer science, tuples come in many forms. Most typed functional programming languages implement tuples directly as product types, tightly associated with algebraic data types, pattern matching, and destructuring assignment. Many programming languages offer an alternative to tuples, known as record types, featuring unordered elements accessed by label. A few programming languages combine ordered tuple product types and unordered record types into a single construct, as in C structs and Haskell records. Relational databases may formally identify their rows (records) as tuples. Tuples also occur in relational algebra; when programming the semantic web with the Resource Description Framework (RDF); in linguistics; and in philosophy.
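As a small illustration of the computer-science usage described above, the following Python sketch contrasts positional access and destructuring on a plain tuple with label-based access on a record-like type. The `Point` class is a hypothetical example, and Python's `NamedTuple` is only one of several ways a language can combine ordered and labelled access.

```python
from typing import NamedTuple

point = (2, 7, 4)   # a 3-tuple; elements are identified by position
x, y, z = point     # destructuring assignment binds each position to a name

class Point(NamedTuple):
    """Hypothetical record-like type: ordered like a tuple, but fields also have labels."""
    x: int
    y: int

p = Point(x=2, y=7)
first_by_label = p.x      # access by label, as with a record
second_by_position = p[1] # access by position, as with a tuple

empty = ()          # the unique 0-tuple
```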
== Etymology == The term originated as an abstraction of the sequence: single, couple/double, triple, quadruple, quintuple, sextuple, septuple, octuple, ..., n‑tuple, ..., where the prefixes are taken from the Latin names of the numerals. The unique 0-tuple is called the null tuple or empty tuple. A 1‑tuple is called a single (or singleton), a 2‑tuple is called an ordered pair or couple, and a 3‑tuple is called a triple (or triplet). The number n can be any nonnegative integer. For example, a complex number can be represented as a 2‑tuple of reals, a quaternion can be represented as a 4‑tuple, an octonion can be represented as an 8‑tuple, and a sedenion can be represented as a 16‑tuple. Although these uses treat ‑tuple as the suffix, the original suffix was ‑ple as in "triple" (three-fold) or "decuple" (ten‑fold). This originates from medieval Latin plus (meaning "more") related to Greek ‑πλοῦς, which replaced the classical and late antique ‑plex (meaning "folded"), as in "duplex". == Properties == The general rule for the identity of two n-tuples is $(a_1, a_2, \ldots, a_n) = (b_1, b_2, \ldots, b_n)$ if and only if $a_1 = b_1,\ a_2 = b_2,\ \ldots,\ a_n = b_n$. Thus a tuple has properties that distinguish it from a set: A tuple may contain multiple instances of the same element, so tuple $(1,2,2,3) \neq (1,2,3)$; but set $\{1,2,2,3\} = \{1,2,3\}$. Tuple elements are ordered: tuple $(1,2,3) \neq (3,2,1)$, but set $\{1,2,3\} = \{3,2,1\}$. A tuple has a finite number of elements, while a set or a multiset may have an infinite number of elements.
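The distinguishing properties listed above can be checked directly with Python's built-in `tuple` and `set` types. This is only an illustrative sketch; the helper `tuples_equal` is a hypothetical name that spells out the component-wise identity rule rather than relying on the built-in `==`.

```python
def tuples_equal(a, b):
    """Component-wise identity: same length and a_i == b_i for every position i."""
    return len(a) == len(b) and all(ai == bi for ai, bi in zip(a, b))

# Repeated elements matter in a tuple but not in a set.
repeats_tuple = tuples_equal((1, 2, 2, 3), (1, 2, 3))   # False
repeats_set = {1, 2, 2, 3} == {1, 2, 3}                 # True

# Order matters in a tuple but not in a set.
order_tuple = tuples_equal((1, 2, 3), (3, 2, 1))        # False
order_set = {1, 2, 3} == {3, 2, 1}                      # True
```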
== Definitions == There are several definitions of tuples that give them the properties described in the previous section. === Tuples as functions === The $0$-tuple may be identified as the empty function. For $n \geq 1$, the $n$-tuple $(a_1, \ldots, a_n)$ may be identified with the surjective function $F : \{1, \ldots, n\} \to \{a_1, \ldots, a_n\}$ with domain $\operatorname{domain} F = \{1, \ldots, n\} = \{i \in \mathbb{N} : 1 \leq i \leq n\}$ and with codomain $\operatorname{codomain} F = \{a_1, \ldots, a_n\}$, that is defined at $i \in \operatorname{domain} F = \{1, \ldots, n\}$ by $F(i) := a_i$. That is, $F$ is the function defined by $1 \mapsto a_1,\ \ldots,\ n \mapsto a_n$, in which case the equality $(a_1, a_2, \dots, a_n) = (F(1), F(2), \dots, F(n))$ necessarily holds. ==== Tuples as sets of ordered pairs ==== Functions are commonly identified with their graphs, which are certain sets of ordered pairs. Indeed, many authors use graphs as the definition of a function. Using this definition of "function", the above function $F$ can be defined as $F := \{(1, a_1), \ldots, (n, a_n)\}$. === Tuples as nested ordered pairs === Another way of modeling tuples in set theory is as nested ordered pairs.
This approach assumes that the notion of ordered pair has already been defined. The 0-tuple (i.e. the empty tuple) is represented by the empty set $\emptyset$. An n-tuple, with n > 0, can be defined as an ordered pair of its first entry and an (n − 1)-tuple (which contains the remaining entries when n > 1): $(a_1, a_2, a_3, \ldots, a_n) = (a_1, (a_2, a_3, \ldots, a_n))$. This definition can be applied recursively to the (n − 1)-tuple: $(a_1, a_2, a_3, \ldots, a_n) = (a_1, (a_2, (a_3, (\ldots, (a_n, \emptyset) \ldots))))$. Thus, for example: $(1,2,3) = (1,(2,(3,\emptyset)))$ and $(1,2,3,4) = (1,(2,(3,(4,\emptyset))))$. A variant of this definition starts "peeling off" elements from the other end: The 0-tuple is the empty set $\emptyset$. For n > 0: $(a_1, a_2, a_3, \ldots, a_n) = ((a_1, a_2, a_3, \ldots, a_{n-1}), a_n)$. This definition can be applied recursively: $(a_1, a_2, a_3, \ldots, a_n) = ((\ldots(((\emptyset, a_1), a_2), a_3), \ldots), a_n)$. Thus, for example: $(1,2,3) = (((\emptyset,1),2),3)$ and $(1,2,3,4) = ((((\emptyset,1),2),3),4)$. === Tuples as nested sets === Using Kuratowski's representation for an ordered pair, the second definition above can be reformulated in terms of pure set theory: The 0-tuple (i.e. the empty tuple) is represented by the empty set $\emptyset$. Let $x$ be an n-tuple $(a_1, a_2, \ldots, a_n)$, and let $x \rightarrow b \equiv (a_1, a_2, \ldots, a_n, b)$. Then, $x \rightarrow b \equiv \{\{x\}, \{x, b\}\}$. (The right arrow, $\rightarrow$, could be read as "adjoined with".) In this formulation: $() = \emptyset$; $(1) = () \rightarrow 1 = \{\{()\},\{(),1\}\} = \{\{\emptyset\},\{\emptyset,1\}\}$; $(1,2) = (1) \rightarrow 2 = \{\{(1)\},\{(1),2\}\} = \{\{\{\{\emptyset\},\{\emptyset,1\}\}\},\{\{\{\emptyset\},\{\emptyset,1\}\},2\}\}$; $(1,2,3) = (1,2) \rightarrow 3 = \{\{(1,2)\},\{(1,2),3\}\} = \{\{\{\{\{\{\emptyset\},\{\emptyset,1\}\}\},\{\{\{\emptyset\},\{\emptyset,1\}\},2\}\}\},\{\{\{\{\{\emptyset\},\{\emptyset,1\}\}\},\{\{\{\emptyset\},\{\emptyset,1\}\},2\}\},3\}\}$.
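The set-theoretic definitions in this section can be mimicked in Python as a rough sketch. The assumptions here are that a `dict` stands in for the function F from {1, ..., n}, a Python 2-tuple stands in for an already-defined ordered pair, and `frozenset` stands in for a pure set; all function names are hypothetical and chosen for illustration only.

```python
EMPTY = frozenset()  # stands in for the empty set

def as_function(elems):
    """Tuple as a function: the mapping i -> a_i on the domain {1, ..., n}."""
    return {i: a for i, a in enumerate(elems, start=1)}

def nest_from_first(elems):
    """(a1, (a2, (..., (an, EMPTY)))): peel elements off the front."""
    result = EMPTY
    for a in reversed(elems):
        result = (a, result)
    return result

def nest_from_last(elems):
    """((((EMPTY, a1), a2), ...), an): peel elements off the back."""
    result = EMPTY
    for a in elems:
        result = (result, a)
    return result

def kuratowski_pair(x, b):
    """Kuratowski's ordered pair {{x}, {x, b}}."""
    return frozenset({frozenset({x}), frozenset({x, b})})

def nest_as_sets(elems):
    """Apply x -> b = {{x}, {x, b}} repeatedly, starting from the empty set."""
    result = EMPTY
    for a in elems:
        result = kuratowski_pair(result, a)
    return result

f = as_function(("a", "b", "c"))
right = nest_from_first((1, 2, 3))
left = nest_from_last((1, 2, 3))
one = nest_as_sets((1,))
```

Because the encodings preserve order, `nest_as_sets((1, 2))` and `nest_as_sets((2, 1))` produce different sets, matching the property that tuple elements are ordered.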

    Read more →
  • Information

    Information

    Information is an abstract concept that refers to something which has the power to inform. At the most fundamental level, it pertains to the interpretation (perhaps formally) of that which may be sensed, or their abstractions. Any natural process that is not completely random and any observable pattern in any medium can be said to convey some amount of information. Whereas digital signals and other data use discrete signs to convey information, other phenomena and artifacts such as analogue signals, poems, pictures, music or other sounds, and currents convey information in a more continuous form. Information is not knowledge itself, but the meaning that may be derived from a representation through interpretation. The concept of information is relevant to and connected with various concepts, including constraint, communication, control, data, form, education, knowledge, meaning, understanding, mental stimuli, pattern, perception, proposition, representation, and entropy. Information is often processed iteratively: Data available at one step are processed into information to be interpreted and processed at the next step. For example, in written text each symbol or letter conveys information relevant to the word it is part of, each word conveys information relevant to the phrase it is part of, each phrase conveys information relevant to the sentence it is part of, and so on until at the final step information is interpreted and becomes knowledge in a given domain. In a digital signal, bits may be interpreted into the symbols, letters, numbers, or structures that convey the information available at the next level up. The key characteristic of information is that it is subject to interpretation and processing. The derivation of information from a signal or message may be thought of as the resolution of ambiguity or uncertainty that arises during the interpretation of patterns within the signal or message. Information may be structured as data. 
Redundant data can be compressed up to an optimal size, which is the theoretical limit of compression. The information available through a collection of data may be derived by analysis. For example, a restaurant collects data from every customer order. That information may be analyzed to produce knowledge that is put to use when the business subsequently wants to identify the most popular or least popular dish. Information can be transmitted in time, via data storage, and space, via communication and telecommunication. Information is expressed either as the content of a message or through direct or indirect observation. That which is perceived can be construed as a message in its own right, and in that sense, all information is always conveyed as the content of a message. Information can be encoded into various forms for transmission and interpretation (for example, information may be encoded into a sequence of signs, or transmitted via a signal). It can also be encrypted for safe storage and communication. The uncertainty of an event is measured by its probability of occurrence. Uncertainty is proportional to the negative logarithm of the probability of occurrence. Information theory takes advantage of this by concluding that more uncertain events require more information to resolve their uncertainty. The bit is the standard unit of information. It is 'that which reduces uncertainty by half'. Other units such as the nat may be used. For example, the information encoded in one "fair" coin flip is log2(2/1) = 1 bit, and in two fair coin flips is log2(4/1) = 2 bits. A 2011 Science article estimates that 97% of technologically stored information was already in digital bits in 2007 and that the year 2002 was the beginning of the digital age for information storage (with digital storage capacity bypassing analogue for the first time). 
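The coin-flip figures above follow from the negative-logarithm relationship between probability and information. A minimal sketch, assuming equally likely outcomes and a hypothetical helper name `self_information`:

```python
import math

def self_information(probability, base=2.0):
    """Information content -log_base(p) of an event with probability p.

    Base 2 gives bits; base e gives nats.
    """
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    if base == 2.0:
        return -math.log2(probability)
    return -math.log(probability) / math.log(base)

one_flip_bits = self_information(1 / 2)    # one fair coin flip
two_flips_bits = self_information(1 / 4)   # two fair coin flips
one_flip_nats = self_information(1 / 2, base=math.e)
```

A less probable event, such as one face of a fair six-sided die (probability 1/6), carries more information than a single coin flip, in line with the statement that more uncertain events require more information to resolve.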
== Etymology and history of the concept == The English word "information" comes from Middle French enformacion/informacion/information 'a criminal investigation' and its etymon, Latin informatiō(n) 'conception, teaching, creation'. In English, "information" is an uncountable mass noun. References on "formation or molding of the mind or character, training, instruction, teaching" date from the 14th century in both English (according to the Oxford English Dictionary) and other European languages. In the transition from the Middle Ages to Modernity the use of the concept of information reflected a fundamental turn in epistemological basis – from "giving a (substantial) form to matter" to "communicating something to someone". Peters (1988, pp. 12–13) concludes: Information was readily deployed in empiricist psychology (though it played a less important role than other words such as impression or idea) because it seemed to describe the mechanics of sensation: objects in the world inform the senses. But sensation is entirely different from "form" – the one is sensual, the other intellectual; the one is subjective, the other objective. My sensation of things is fleeting, elusive, and idiosyncratic. For Hume, especially, sensory experience is a swirl of impressions cut off from any sure link to the real world... In any case, the empiricist problematic was how the mind is informed by sensations of the world. At first informed meant shaped by; later it came to mean received reports from. As its site of action drifted from cosmos to consciousness, the term's sense shifted from unities (Aristotle's forms) to units (of sensation). Information came less and less to refer to internal ordering or formation, since empiricism allowed for no preexisting intellectual forms outside of sensation itself. Instead, information came to refer to the fragmentary, fluctuating, haphazard stuff of sense.
Information, like the early modern worldview in general, shifted from a divinely ordered cosmos to a system governed by the motion of corpuscles. Under the tutelage of empiricism, information gradually moved from structure to stuff, from form to substance, from intellectual order to sensory impulses. In the modern era, the most important influence on the concept of information is derived from the information theory developed by Claude Shannon and others. This theory, however, reflects a fundamental contradiction. Northrup (1993) wrote: Thus, actually two conflicting metaphors are being used: The well-known metaphor of information as a quantity, like water in the water-pipe, is at work, but so is a second metaphor, that of information as a choice, a choice made by an information provider, and a forced choice made by an information receiver. Actually, the second metaphor implies that the information sent isn't necessarily equal to the information received, because any choice implies a comparison with a list of possibilities, i.e., a list of possible meanings. Here, meaning is involved, thus spoiling the idea of information as a pure "Ding an sich." Thus, much of the confusion regarding the concept of information seems to be related to the basic confusion of metaphors in Shannon's theory: is information an autonomous quantity, or is information always per se information to an observer? Actually, I don't think that Shannon himself chose one of the two definitions. Logically speaking, his theory implied information as a subjective phenomenon. But this had so wide-ranging epistemological impacts that Shannon didn't seem to fully realize this logical fact. Consequently, he continued to use metaphors about information as if it were an objective substance. This is the basic, inherent contradiction in Shannon's information theory (Northrup, 1993, p. 5).
In their seminal book The Study of Information: Interdisciplinary Messages, Machlup and Mansfield (1983) collected key views on the interdisciplinary controversy in computer science, artificial intelligence, library and information science, linguistics, psychology, and physics, as well as in the social sciences. Machlup (1983, p. 660) himself disagrees with the use of the concept of information in the context of signal transmission, the basic senses of information in his view all referring "to telling something or to the something that is being told. Information is addressed to human minds and is received by human minds." All other senses, including its use with regard to nonhuman organisms as well as to society as a whole, are, according to Machlup, metaphoric and, as in the case of cybernetics, anthropomorphic. Hjørland (2007) describes the fundamental difference between objective and subjective views of information and argues that the subjective view has been supported by, among others, Bateson, Yovits, Spang-Hanssen, Brier, Buckland, Goguen, and Hjørland. Hjørland provided the following example: A stone on a field could contain different information for different people (or from one situation to another). It is not possible for information systems to map all the stone's possible information for every individual. Nor is any one mapping the one "true" mapping. But peop

    Read more →