AI Chat Sites

AI Chat Sites — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • The Business Cloud

    The Business Cloud

    The Business Cloud is an API enabled self-service platform, developed by Domo, that provides an array of services like data connection and data visualization. == History == Domo, Inc. was founded in 2010 by Josh James who also co-founded the web analytics software company Omniture in 1996, which he took public in 2006. Domo launched the Domo Appstore, with 1,000 apps with social and mobile capabilities, in 2016. This appstore creates a network of business apps and an ecosystem of companies into a single, integrated business cloud. This decision came after Domo announced a $131 million round of funding from BlackRock. According to the company, the concept behind The Business Cloud is to connect smaller clouds relating to apps or other functional areas of a business into a single business cloud that allows self-service and other social features to customers. == Services == The Business Cloud is offered as a free service, claimed to be the world's first business cloud with Domo appstore as one of its core services. This free package includes all of the Domo's features and functionality including Domo platform, Domo Apps, visualizations, alerts, company directories, org charts, profiles, tasks and Domo Mobile. The Business Cloud allows customers to leverage their preferred cloud as well as on-premises software and monitor all aspects of their business in routine. The company is supported by a $500 million fund from investors all over the world.

    Read more →
  • SQL/PSM

    SQL/PSM

    SQL/PSM (SQL/Persistent Stored Modules) is an ISO standard mainly defining an extension of SQL with a procedural language for use in stored procedures. Initially published in 1996 as an extension of SQL-92 (ISO/IEC 9075-4:1996, a version sometimes called PSM-96 or even SQL-92/PSM), SQL/PSM was later incorporated into the multi-part SQL:1999 standard, and has been part 4 of that standard since then, most recently in SQL:2023. The SQL:1999 part 4 covered less than the original PSM-96 because the SQL statements for defining, managing, and invoking routines were actually incorporated into part 2 SQL/Foundation, leaving only the procedural language itself as SQL/PSM. The SQL/PSM facilities are still optional as far as the SQL standard is concerned; most of them are grouped in Features P001-P008. SQL/PSM standardizes syntax and semantics for control flow, exception handling (called "condition handling" in SQL/PSM), local variables, assignment of expressions to variables and parameters, and (procedural) use of cursors. It also defines an information schema (metadata) for stored procedures. SQL/PSM is one language in which methods for the SQL:1999 structured types can be defined. The other is Java, via SQL/JRT. SQL/PSM is derived, seemingly directly, from Oracle's PL/SQL. Oracle developed PL/SQL and released it in 1991, basing the language on the US Department of Defense's Ada programming language. However, Oracle has maintained a distance from the standard in its documentation. IBM's SQL PL (used in DB2) and Mimer SQL's PSM were the first two products officially implementing SQL/PSM. It is commonly thought that these two languages, and perhaps also MySQL/MariaDB's procedural language, are closest to the SQL/PSM standard. However, a PostgreSQL addon implements SQL/PSM (alongside its other procedural languages like the PL/SQL-derived plpgsql), although it is not part of the core product. RDF functionality in OpenLink Virtuoso was developed entirely through SQL/PSM, combined with custom datatypes (e.g., ANY for handling URI and Literal relation objects), sophisticated indexing, and flexible physical storage choices (column-wise or row-wise).

    Read more →
  • Overcategorization

    Overcategorization

    Overcategorization or category clutter is a phenomenon during classification where too many categories or classes are assigned to a document, record, or item. Overcategorization is related to the library and information science (LIS) concepts of document classification and subject indexing. It is also related to online shopping where excessive product categories can overwhelm users with too many choices or make it more difficult for customers to find the products they need. Although these categories are intended to improve organization and ease of navigation when shipping online, too many categories can lower customer satisfaction, increase difficulty navigating the online store, and reduce future shopping intentions. In LIS, the ideal number of terms that should be assigned to classify an item are measured by the variables precision and recall. Assigning few category labels that are most closely related to the content of the item being classified will result in searches that have high precision, I.e., where a high proportion of the results are closely related to the query. Assigning more category labels to each item will reduce the precision of each search, but increase the recall, retrieving more relevant results. Related LIS concepts include exhaustivity of indexing and information overload. == Basic principles == If too many categories are assigned to a given document, the implications for users depend on how informative the links are. If the user is able to distinguish between useful and not useful links, the damage is limited: The user only wastes time selecting links. In many cases, however, the user cannot judge whether or not a given link will turn out to be fruitful. In that case he or she has to follow the link and to read or skim another document. The worst case scenario is, of course, that even after reading the new document the user is unable to decide whether or not it might be useful if its subject matter is not thoroughly investigated. Overcategorization also has another unpleasant implication: It makes the system (for example in Wikipedia) difficult to maintain in a consistent way. If the system is inconsistent, it means that when the user considers the links in a given category, he or she will not find all documents relevant to that category. Basically, the problem of overcategorization should be understood from the perspective of relevance and the traditional measures of recall and precision. If too few relevant categories are assigned to a document, recall may decrease. If too many non-relevant categories are assigned, precision becomes lower. The hard job is to say which categories are fruitful or relevant for future use of the document.

    Read more →
  • Certifying algorithm

    Certifying algorithm

    In theoretical computer science, a certifying algorithm is an algorithm that outputs, together with a solution to the problem it solves, a proof that the solution is correct. A certifying algorithm is said to be efficient if the combined runtime of the algorithm and a proof checker is slower by at most a constant factor than the best known non-certifying algorithm for the same problem. The proof produced by a certifying algorithm should be in some sense simpler than the algorithm itself, for otherwise any algorithm could be considered certifying (with its output verified by running the same algorithm again). Sometimes this is formalized by requiring that a verification of the proof take less time than the original algorithm, while for other problems (in particular those for which the solution can be found in linear time) simplicity of the output proof is considered in a less formal sense. For instance, the validity of the output proof may be more apparent to human users than the correctness of the algorithm, or a checker for the proof may be more amenable to formal verification. Implementations of certifying algorithms that also include a checker for the proof generated by the algorithm may be considered to be more reliable than non-certifying algorithms. For, whenever the algorithm is run, one of three things happens: it produces a correct output (the desired case), it detects a bug in the algorithm or its implication (undesired, but generally preferable to continuing without detecting the bug), or both the algorithm and the checker are faulty in a way that masks the bug and prevents it from being detected (undesired, but unlikely as it depends on the existence of two independent bugs). == Examples == Many examples of problems with checkable algorithms come from graph theory. For instance, a classical algorithm for testing whether a graph is bipartite would simply output a Boolean value: true if the graph is bipartite, false otherwise. In contrast, a certifying algorithm might output a 2-coloring of the graph in the case that it is bipartite, or a cycle of odd length if it is not. Any graph is bipartite if and only if it can be 2-colored, and non-bipartite if and only if it contains an odd cycle. Both checking whether a 2-coloring is valid and checking whether a given odd-length sequence of vertices is a cycle may be performed more simply than testing bipartiteness. Analogously, it is possible to test whether a given directed graph is acyclic by a certifying algorithm that outputs either a topological order or a directed cycle. It is possible to test whether an undirected graph is a chordal graph by a certifying algorithm that outputs either an elimination ordering (an ordering of all vertices such that, for every vertex, the neighbors that are later in the ordering form a clique) or a chordless cycle. And it is possible to test whether a graph is planar by a certifying algorithm that outputs either a planar embedding or a Kuratowski subgraph. The extended Euclidean algorithm for the greatest common divisor of two integers x and y is certifying: it outputs three integers g (the divisor), a, and b, such that ax + by = g. This equation can only be true of multiples of the greatest common divisor, so testing that g is the greatest common divisor may be performed by checking that g divides both x and y and that this equation is correct.

    Read more →
  • Gooch shading

    Gooch shading

    Gooch shading is a non-photorealistic rendering technique for shading objects. It is also known as "cool to warm" shading, and is widely used in technical illustration. == History == Gooch shading was developed by Amy Gooch et al. at the University of Utah School of Computing and first presented at the 1998 SIGGRAPH conference. It has since been implemented in shader libraries, software, and games released by Autodesk, Nvidia, and Valve. == Process == Gooch shading defines an additional two colors in conjunction with the original model color: a warm color (such as yellow) and a cool color (such as blue). The warm color indicates surfaces that are facing toward the light source while the cool color indicates surfaces facing away. This allows shading to occur only in mid-tones so that edge lines and highlights remain visually prominent. The Gooch shader is typically implemented in two passes: all objects in the scene are first drawn with the "cool to warm" shading, and in the second pass the object's edges are rendered in black.

    Read more →
  • Data management plan

    Data management plan

    A data management plan or DMP is a formal document that outlines how data are to be handled both during a research project, and after the project is completed. The goal of a data management plan is to consider the many aspects of data management, metadata generation, data preservation, and analysis before the project begins; this may lead to data being well-managed in the present, and prepared for preservation in the future. DMPs were originally used in 1966 to manage aeronautical and engineering projects' data collection and analysis, and expanded across engineering and scientific disciplines in the 1970s and 1980s. Up until the early 2000s, DMPs were used "for projects of great technical complexity, and for limited mid-study data collection and processing purposes". In the 2000s and later, E-research and economic policies drove the development and uptake of DMPs. == Importance == Preparing a data management plan before data are collected is claimed to ensure that data are in the correct format, organized well, and better annotated. This could arguably save time in the long term because there is no need to re-organize, re-format, or try to remember details about data. It is also claimed to increase research efficiency since both the data collector and other researchers might be able to understand and use well-annotated data in the future. One component of a data management plan is data archiving and preservation. By deciding on an archive ahead of time, the data collector can format data during collection to make its future submission to a database easier. If data are preserved, they are more relevant since they can be re-used by other researchers. It also allows the data collector to direct requests for data to the database, rather than address requests individually. A frequent argument in favor of preservation is that data that are preserved have the potential to lead to new, unanticipated discoveries, and they prevent duplication of scientific studies that have already been conducted. Data archiving also provides insurance against loss by the data collector. In the 2010s, funding agencies increasingly required data management plans as part of the proposal and evaluation process, despite little or no evidence of their efficacy. == Major components == "There is no general and definitive list of topics that should be covered in a DMP for a research project", and researchers are often left to their own devices as to how to fill out a DMP. === Information about data and data format === A description of data to be produced by the project. This might include (but is not limited to) data that are: Experimental Observational Raw or derived Physical collections Models Simulations Curriculum materials Software Images How will the data be acquired? When and where will they be acquired? After collection, how will the data be processed? Include information about Software used Algorithms Scientific workflows File formats that will be used, justify those formats, and describe the naming conventions used. Quality assurance & quality control measures that will be taken during sample collection, analysis, and processing. If existing data are used, what are their origins? How will the data collected be combined with existing data? What is the relationship between the data collected and existing data? How will the data be managed in the short-term? Consider the following: Version control for files Backing up data and data products Security & protection of data and data products Who will be responsible for management === Metadata content and format === Metadata are the contextual details, including any information important for using data. This may include descriptions of temporal and spatial details, instruments, parameters, units, files, etc. Metadata is commonly referred to as "data about data". Issues to be considered include: How detailed has the metadata to be in order to make the data meaningful? How will the metadata be created and/or captured? Examples include lab notebooks, GPS hand-held units, Auto-saved files on instruments, etc. What format will be used for the metadata? What are the metadata standards commonly used in the respective scientific discipline? There should be justification for the format chosen. === Policies for access, sharing, and re-use === Describe any obligations that exist for sharing data collected. These may include obligations from funding agencies, institutions, other professional organizations, and legal requirements. Include information about how data will be shared, including when the data will be accessible, how long the data will be available, how access can be gained, and any rights that the data collector reserves for using data. Address any ethical or privacy issues with data sharing Address intellectual property & copyright issues. Who owns the copyright? What are the institutional, publisher, and/or funding agency policies associated with intellectual property? Are there embargoes for political, commercial, or patent reasons? Describe the intended future uses/users for the data Indicate how the data should be cited by others. How will the issue of persistent citation be addressed? For example, if the data will be deposited in a public archive, will the dataset have a persistent identifier (e.g., ARK, DOI, Handle, PURL, URN) assigned to it? === Long-term storage and data management === Researchers should identify an appropriate archive for the long-term preservation of their data. By identifying the archive early in the project, the data can be formatted, transformed, and documented appropriately to meet the requirements of the archive. Researchers should consult colleagues and professional societies in their discipline to determine the most appropriate database, and include a backup archive in their data management plan in case their first choice goes out of existence. Early in the project, the primary researcher should identify what data will be preserved in an archive. Usually, preserving the data in its most raw form is desirable, although data derivatives and products can also be preserved. An individual should be identified as the primary contact person for archived data, and ensure contact information is always kept up-to-date in case there are requests for data or information about data. === Budget === Data management and preservation costs may be considerable, depending on the nature of the project. By anticipating costs ahead of time, researchers ensure that the data will be properly managed and archived. Potential expenses that should be considered are Human resources and staff as they handle data preparation, management, documentation, and preservation Hardware and/or software needed for data management, backing up, security, documentation, and preservation Costs associated with submitting the data to an archive The data management plan should include how these costs will be paid. == NSF Data Management Plan == All grant proposals submitted to National Science Foundation (NSF) must include a Data Management Plan that is no more than two pages. This is a supplement (not part of the 15-page proposal) and should describe how the proposal will conform to the Award and Administration Guide policy (see below). It may include the following: The types of data The standards to be used for data and metadata format and content Policies for access and sharing Policies and provisions for re-use Plans for archiving data Policy summarized from the NSF Award and Administration Guide, Section 4 (Dissemination and Sharing of Research Results): Promptly publish with appropriate authorship Share data, samples, physical collections, and supporting materials with others, within a reasonable time frame Share software and inventions Investigators can keep their legal rights over their intellectual property, but they still have to make their results, data, and collections available to others Policies will be implemented via Proposal review Award negotiations and conditions Support/incentives == ESRC Data Management Plan == Since 1995, the UK's Economic and Social Research Council (ESRC) have had a research data policy in place. The current ESRC Research Data Policy states that research data created as a result of ESRC-funded research should be openly available to the scientific community to the maximum extent possible, through long-term preservation and high-quality data management. ESRC requires a data management plan for all research award applications where new data are being created. Such plans are designed to promote a structured approach to data management throughout the data lifecycle, resulting in better quality data that is ready to archive for sharing and re-use. The UK Data Service, the ESRC's flagship data service, provides practical guidance on research data management planning suitable for social science researchers in the UK and around the world. ESRC has a longstanding arrangement with the UK Data A

    Read more →
  • Algorithmic paradigm

    Algorithmic paradigm

    An algorithmic paradigm or algorithm design paradigm is a generic model or framework which underlies the design of a class of algorithms. An algorithmic paradigm is an abstraction higher than the notion of an algorithm, just as an algorithm is an abstraction higher than a computer program. == List of well-known paradigms == === General === Backtracking Branch and bound Brute-force search Divide and conquer Dynamic programming Greedy algorithm Recursion Prune and search === Parameterized complexity === Kernelization Iterative compression === Computational geometry === Sweep line algorithms Rotating calipers Randomized incremental construction

    Read more →
  • AI-driven design automation

    AI-driven design automation

    AI-driven design automation is the use of artificial intelligence (AI) to automate and improve different parts of the electronic design automation (EDA) process. It is particularly important in the design of integrated circuits (chips) and complex electronic systems, where it can potentially increase productivity, decrease costs, and speed up design cycles. AI Driven Design Automation uses several methods, including machine learning, expert systems, and reinforcement learning. These are used for many tasks, from planning a chip's architecture and logic synthesis to its physical design and final verification. == History == === 1980s–1990s: Expert systems and early experiments === The use of AI for design automation originated in the 1980s and 1990s, mainly with the creation of expert systems. These systems tried to capture the knowledge and practical rules used by human design experts, and used these rules, along with reasoning engines, to direct the design process. A notable early project was the ULYSSES system from Carnegie Mellon University. ULYSSES was a CAD tool integration environment that let expert designers turn their design methods into scripts that could be run automatically. It treated design tools as sources of knowledge that a scheduler could manage. Another example was the ADAM (Advanced Design AutoMation) system at the University of Southern California, which used an expert system called the Design Planning Engine. This engine figured out design strategies on the fly and handled different design jobs by organizing specialized knowledge into structured formats called frames. Other systems like DAA (Design Automation Assistant) used a rule-based approach for specific jobs, such as register transfer level (RTL) design for systems like the IBM 370. Researchers at Carnegie Mellon University also created TALIB, an expert system for mask layout that used over 1200 rules, and EMUCS/DAA for CPU architectural design which used about 70 rules. These projects showed that AI worked better for problems where relatively few rules were required to describe much larger amounts of data. At the same time, there was a surge of tools called silicon compilers like MacPitts, Arsenic, and Palladio. They used algorithms and search techniques to explore different design paradigms. This was another way to automate design, even if it was not always based on expert systems. Early tests with neural networks in VLSI design also happened during this time, although they were not as common as systems based on rules. === 2000s: Introduction of machine learning === In the 2000s, interest in AI for design automation increased. This was mostly because of better machine learning (ML) algorithms and more available data from design and manufacturing. For example, they were used to model and reduce the effects of small manufacturing differences in semiconductor devices. This became very important as the size of components on chips became smaller. The large amount of data created during chip design provided the foundation needed to train smarter ML models. This allowed for predicting outcomes and optimizing in areas that were hard to automate before. === 2016–2020: Reinforcement learning and large scale initiatives === A major turning point happened in the mid to late 2010s, sparked by successes in other areas of AI. The success of DeepMind's AlphaGo in mastering the game of Go inspired researchers. They began to apply reinforcement learning (RL) to difficult EDA problems. These problems often require searching through many options and making a series of decisions. In 2018, the U.S. DARPA started the Intelligent Design of Electronic Assets (IDEA) program. A main goal of IDEA was to create a fully automated layout generator that required no human intervention, able to produce a chip design ready for manufacturing from RTL specifications in 24 hours. Another big initiative was the OpenROAD project, a large effort under IDEA led by UC San Diego with industry and university partners, aimed to build an open source, independent toolchain. It used machine learning, parallelization and divide and conquer approaches. A much-publicized but controversial demonstration of RL's potential came from Google researchers between 2020 and 2021. They created a deep reinforcement learning method for planning the layout of a chip, known as floorplanning. They reported that this method created layouts that were as good as or better than those made by human experts, and it did so in less than six hours. This method used a type of network called a graph convolutional neural network. It showed that it could learn general patterns that could be applied to new problems, getting better as it saw more chip designs. The technology was later used to design Google's Tensor Processing Unit (TPU) accelerators. However, in the original paper, the improvement (if any) from AI was not demonstrated. There was no comparison with existing non-AI tools performing the same task, and since the data is proprietary, no ability for anyone else to perform this comparison. Various efforts to reproduce the AI algorithm, and compare its results with various commercial and academic tools, have yielded mixed results with no conclusive advantage to AI. === 2020s: Autonomous systems and agents === Entering the 2020s, the industry saw the commercial launch of autonomous AI driven EDA systems. For example, Synopsys launched DSO.ai (Design Space Optimization AI) in early 2020, calling it the first autonomous artificial intelligence application for chip design in the industry. This system uses reinforcement learning to search for the best ways to optimize a design within the huge number of possible solutions, trying to improve power, performance, and area (PPA). By 2023, DSO.ai had been used to produce over 100 commercial chips, showing mainstream adoption. Synopsys later grew its AI tools into a suite called Synopsys.ai. The goal was to use AI in the entire EDA workflow, including verification and testing. These advancements, which combine modern AI methods with cloud computing and large data resources, have led to talks about a new phase in EDA. Industry experts and participants sometimes call this 'EDA 4.0'. This new era is defined by the widespread use of AI and machine learning to deal with growing design complexity, automate more of the design process, and help engineers handle the huge amounts of data that EDA tools create. The purpose of EDA 4.0 is to optimize product performance, get products to market faster and make development and manufacturing smoother through intelligent automation. == Applications == Artificial intelligence (AI) is now used in many stages of the electronic design workflow. It aims to improve productivity, get better results, and handle the growing complexity of modern integrated circuits. AI helps designers from the very first ideas about architecture all the way to manufacturing and testing. === High level synthesis and architectural exploration === In the first phases of chip design, AI helps with High Level Synthesis (HLS) and exploring different system level design options (DSE). These processes are key for turning general ideas into detailed hardware plans. AI algorithms, often using supervised learning, are used to build simpler, substitute models. These models can quickly guess important design measurements like area, performance, and power for many different architectural options or HLS settings. For example, the Ithemal tool uses deep neural networks to estimate how fast basic code blocks will run, which helps in making processor architecture decisions. Similarly, PRIMAL uses machine learning estimate power use at the register transfer level (RTL), giving early information about how much power the chip will use. Reinforcement learning (RL) and Bayesian optimization are also used to guide the DSE process. They help search through the many parameters to find the best HLS settings or architectural details like cache sizes. LLMs are also being tested for creating architectural plans or initial C code for HLS, as seen with GPT4AIGChip. === Logic synthesis and optimization === Logic synthesis starts from a high level hardware description and generates an optimized list of electronic gates, known as a gate level netlist, that is ready for placement, routing, and then construction in a specific manufacturing process. AI methods help with different parts of this process, including logic optimization, technology mapping, and making improvements after mapping. Supervised learning, especially with Graph Neural Networks (GNNs), is good at handling data or problems that can be represented as graphs. Since circuit diagrams are instances of directed graphs, supervised learning can help create models that predict design properties like power or error rates in circuits. In logic synthesis and optimization reinforcement learning is used to perform logic optimization directly. In some cases ag

    Read more →
  • Oculus Quill

    Oculus Quill

    Quill is a painting and animation software for virtual reality. It runs on Microsoft Windows with Oculus Rift headsets. It is used to create 3D paintings and animated cartoons. Quill was released on November 29, 2016, on the Oculus Store. Theater Elsewhere(formerly Quill Theater), an application for viewing creations made in Quill, was later made available following the release of the Oculus Quest. In September 2021, Facebook, now known as Meta Platforms, and the owner of Oculus, sold Quill to its original creator, who continues to develop and support the app. == Development == Quill was originally developed by Oculus Story Studio as an internal tool for the creative needs of the studio's project Dear Angelica directed by Saschka Unseld along with its art-director Wesley Allsbrook. == Controls == The software works on Oculus Rift utilizing its 6DoF motion controllers. Users can paint in 3D space using their hands naturally, and animate those paintings with keyframes. They can also capture videos and photos of their creations. == Reception == Dear Angelica, a VR story fully painted in Quill, was nominated for an Emmy Award in 2017.

    Read more →
  • Run-time algorithm specialization

    Run-time algorithm specialization

    In computer science, run-time algorithm specialization is a methodology for creating efficient algorithms for costly computation tasks of certain kinds. The methodology originates in the field of automated theorem proving and, more specifically, in the Vampire theorem prover project. The idea is inspired by the use of partial evaluation in optimising program translation. Many core operations in theorem provers exhibit the following pattern. Suppose that we need to execute some algorithm a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} in a situation where a value of A {\displaystyle A} is fixed for potentially many different values of B {\displaystyle B} . In order to do this efficiently, we can try to find a specialization of a l g {\displaystyle {\mathit {alg}}} for every fixed A {\displaystyle A} , i.e., such an algorithm a l g A {\displaystyle {\mathit {alg}}_{A}} , that executing a l g A ( B ) {\displaystyle {\mathit {alg}}_{A}(B)} is equivalent to executing a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} . The specialized algorithm may be more efficient than the generic one, since it can exploit some particular properties of the fixed value A {\displaystyle A} . Typically, a l g A ( B ) {\displaystyle {\mathit {alg}}_{A}(B)} can avoid some operations that a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} would have to perform, if they are known to be redundant for this particular parameter A {\displaystyle A} . In particular, we can often identify some tests that are true or false for A {\displaystyle A} , unroll loops and recursion, etc. == Difference from partial evaluation == The key difference between run-time specialization and partial evaluation is that the values of A {\displaystyle A} on which a l g {\displaystyle {\mathit {alg}}} is specialised are not known statically, so the specialization takes place at run-time. There is also an important technical difference. Partial evaluation is applied to algorithms explicitly represented as codes in some programming language. At run-time, we do not need any concrete representation of a l g {\displaystyle {\mathit {alg}}} . We only have to imagine a l g {\displaystyle {\mathit {alg}}} when we program the specialization procedure. All we need is a concrete representation of the specialized version a l g A {\displaystyle {\mathit {alg}}_{A}} . This also means that we cannot use any universal methods for specializing algorithms, which is usually the case with partial evaluation. Instead, we have to program a specialization procedure for every particular algorithm a l g {\displaystyle {\mathit {alg}}} . An important advantage of doing so is that we can use some powerful ad hoc tricks exploiting peculiarities of a l g {\displaystyle {\mathit {alg}}} and the representation of A {\displaystyle A} and B {\displaystyle B} , which are beyond the reach of any universal specialization methods. == Specialization with compilation == The specialized algorithm has to be represented in a form that can be interpreted. In many situations, usually when a l g A ( B ) {\displaystyle {\mathit {alg}}_{A}(B)} is to be computed on many values of B {\displaystyle B} in a row, a l g A {\displaystyle {\mathit {alg}}_{A}} can be written as machine code instructions for a special abstract machine, and it is typically said that A {\displaystyle A} is compiled. The code itself can then be additionally optimized by answer-preserving transformations that rely only on the semantics of instructions of the abstract machine. The instructions of the abstract machine can usually be represented as records. One field of such a record, an instruction identifier (or instruction tag), would identify the instruction type, e.g. an integer field may be used, with particular integer values corresponding to particular instructions. Other fields may be used for storing additional parameters of the instruction, e.g. a pointer field may point to another instruction representing a label, if the semantics of the instruction require a jump. All instructions of the code can be stored in a traversable data structure such as an array, linked list, or tree. Interpretation (or execution) proceeds by fetching instructions in some order, identifying their type, and executing the actions associated with said type. In many programming languages, such as C and C++, a simple switch statement may be used to associate actions with different instruction identifiers. Modern compilers usually compile a switch statement with constant (e.g. integer) labels from a narrow range by storing the address of the statement corresponding to a value i {\displaystyle i} in the i {\displaystyle i} -th cell of a special array, as a means of efficient optimisation. This can be exploited by taking values for instruction identifiers from a small interval of values. == Data-and-algorithm specialization == There are situations when many instances of A {\displaystyle A} are intended for long-term storage and the calls of a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} occur with different B {\displaystyle B} in an unpredictable order. For example, we may have to check a l g ( A 1 , B 1 ) {\displaystyle {\mathit {alg}}(A_{1},B_{1})} first, then a l g ( A 2 , B 2 ) {\displaystyle {\mathit {alg}}(A_{2},B_{2})} , then a l g ( A 1 , B 3 ) {\displaystyle {\mathit {alg}}(A_{1},B_{3})} , and so on. In such circumstances, full-scale specialization with compilation may not be suitable due to excessive memory usage. However, we can sometimes find a compact specialized representation A ′ {\displaystyle A^{\prime }} for every A {\displaystyle A} , that can be stored with, or instead of, A {\displaystyle A} . We also define a variant a l g ′ {\displaystyle {\mathit {alg}}^{\prime }} that works on this representation and any call to a l g ( A , B ) {\displaystyle {\mathit {alg}}(A,B)} is replaced by a l g ′ ( A ′ , B ) {\displaystyle {\mathit {alg}}^{\prime }(A^{\prime },B)} , intended to do the same job faster.

    Read more →
  • Enterprise information system

    Enterprise information system

    An Enterprise Information System (EIS) is any kind of information system which improves the functions of enterprise business processes through integration. This means typically offering high quality service, dealing with large volumes of data and capable of supporting some large and possibly complex organization or enterprise. An EIS must be able to be used by all parts and all levels of an enterprise. The word enterprise can have various connotations. Frequently the term is used only to refer to very large organizations such as multi-national companies or public-sector organizations. However, the term may be used to mean virtually anything, by virtue of it having become a corporate-speak buzzword. == Purpose == Enterprise information systems provide a technology platform that enables organizations to integrate and coordinate their business processes on a robust foundation. An EIS is currently used in conjunction with customer relationship management and supply chain management to automate business processes. An enterprise information system provides a single system that is central to the organization that ensuring information can be shared across all functional levels and management hierarchies. An EIS can be used to increase business productivity and reduce service cycles, product development cycles and marketing life cycles. It may be used to amalgamate existing applications. Other outcomes include higher operational efficiency and cost savings. Financial value is not usually a direct outcome from the implementation of an enterprise information system. == Design stage == At the design stage the main characteristic of EIS efficiency evaluation is the probability of timely delivery of various messages such as command, service, etc. == Information systems == Enterprise systems create a standard data structure and are invaluable in eliminating the problem of information fragmentation caused by multiple information systems within an organization. An EIS differentiates itself from legacy systems in that it is self-transactional, self-helping and adaptable to general and specialist conditions. Unlike an enterprise information system, legacy systems are limited to department-wide communications. A typical enterprise information system would be housed in one or more data centers, would run enterprise software, and could include applications that typically cross organizational borders such as content management systems.

    Read more →
  • Single-source publishing

    Single-source publishing

    Single-source publishing, also known as single-sourcing publishing, is a content management method which allows the same source content to be used across different forms of media and more than one time. The labor-intensive and expensive work of editing need only be carried out once, on only one document; that source document (the single source of truth) can then be stored in one place and reused. This reduces the potential for error, as corrections are only made one time in the source document. The benefits of single-source publishing primarily relate to the editor rather than the user. The user benefits from the consistency that single-sourcing brings to terminology and information. This assumes the content manager has applied an organized conceptualization to the underlying content (A poor conceptualization can make single-source publishing less useful). Single-source publishing is sometimes used synonymously with multi-channel publishing though whether or not the two terms are synonymous is a matter of discussion. == Definition == While there is a general definition of single-source publishing, there is no single official delineation between single-source publishing and multi-channel publishing, nor are there any official governing bodies to provide such a delineation. Single-source publishing is most often understood as the creation of one source document in an authoring tool and converting that document into different file formats or human languages (or both) multiple times with minimal effort. Multi-channel publishing can either be seen as synonymous with single-source publishing, or similar in that there is one source document but the process itself results in more than a mere reproduction of that source. == History == The origins of single-source publishing lie, indirectly, with the release of Windows 3.0 in 1990. With the eclipsing of MS-DOS by graphical user interfaces, help files went from being unreadable text along the bottom of the screen to hypertext systems such as WinHelp. On-screen help interfaces allowed software companies to cease the printing of large, expensive help manuals with their products, reducing costs for both producer and consumer. This system raised opportunities as well, and many developers fundamentally changed the way they thought about publishing. Writers of software documentation did not simply move from being writers of traditional bound books to writers of electronic publishing, but rather they became authors of central documents which could be reused multiple times across multiple formats. The first single-source publishing project was started in 1993 by Cornelia Hofmann at Schneider Electric in Seligenstadt, using software based on Interleaf to automatically create paper documentation in multiple languages based on a single original source file. XML, developed during the mid- to late-1990s, was also significant to the development of single-source publishing as a method. XML, a markup language, allows developers to separate their documentation into two layers: a shell-like layer based on presentation and a core-like layer based on the actual written content. This method allows developers to write the content only one time while switching it in and out of multiple different formats and delivery methods. In the mid-1990s, several firms began creating and using single-source content for technical documentation (Boeing Helicopter, Sikorsky Aviation and Pratt & Whitney Canada) and user manuals (Ford owners manuals) based on tagged SGML and XML content generated using the Arbortext Epic editor with add-on functions developed by a contractor. The concept behind this usage was that complex, hierarchical content that did not lend itself to discrete componentization could be used across a variety of requirements by tagging the differences within a single document using the capabilities built into SGML and XML. Ford, for example, was able to tag its single owner's manual files so that 12 model years could be generated via a resolution script running on the single completed file. Pratt & Whitney, likewise, was able to tag up to 20 subsets of its jet engine manuals in single-source files, calling out the desired version at publication time. World Book Encyclopedia also used the concept to tag its articles for American and British versions of English. Starting from the early 2000s, single-source publishing was used with an increasing frequency in the field of technical translation. It is still regarded as the most efficient method of publishing the same material in different languages. Once a printed manual was translated, for example, the online help for the software program which the manual accompanies could be automatically generated using the method. Metadata could be created for an entire manual and individual pages or files could then be translated from that metadata with only one step, removing the need to recreate information or even database structures. Although single-source publishing is now decades old, its importance has increased urgently as of the 2010s. As consumption of information products rises and the number of target audiences expands, so does the work of developers and content creators. Within the industry of software and its documentation, there is a perception that the choice is to embrace single-source publishing or render one's operations obsolete. == Criticism == Editors using single-source publishing have been criticized for below-standard work quality, leading some critics to describe single-source publishing as the "conveyor belt assembly" of content creation. While heavily used in technical translation, there are risks of error in regard to indexing. While two words might be synonyms in English, they may not be synonyms in another language. In a document produced via single-sourcing, the index will be translated automatically and the two words will be rendered as synonyms. This is because they are synonyms in the source language, while in the target language they are not.

    Read more →
  • YaDICs

    YaDICs

    YaDICs is a program written to perform digital image correlation on 2D and 3D tomographic images. The program was designed to be both modular, by its plugin strategy and efficient, by it multithreading strategy. It incorporates different transformations (Global, Elastic, Local), optimizing strategy (Gauss-Newton, Steepest descent), Global and/or local shape functions (Rigid-body motions, homogeneous dilatations, flexural and Brazilian test models)... == Theoretical background == === Context === In solid mechanics, digital image correlation is a tool that allows to identify the displacement field to register a reference image (called herein fixed image) to images during an experiment (mobile image). For example, it is possible to observe the face of a specimen with a painted speckle on it in order to determine its displacement fields during a tensile test. Before the appearance of such methods, researchers usually used strain gauges to measure the mechanical state of the material but strain gauges only measure the strain on a point and don't allow to understand material with an heterogeneous behavior. One can obtain a full in plane strain tensor by derivation of the displacement fields. Many methods are based upon the optical flow. In fluid mechanics a similar method is used, called Particle Image Velocimetry (PIV); the algorithms are similar to those of DIC but it is impossible to ensure that the optical flow is conserved so a vast majority of the software used the normalized cross correlation metric. In mechanics the displacement or velocity fields are the only concern, registering images is just a side effect. There is another process called image registration using the same algorithms (on monomodal images) but where the goal is to register images and thereby identifying the displacement field is just a side effect. YaDICs uses the general principle of image registration with a particular attention to the displacement fields basis. === Image registration principle === YaDICs can be explained using the classical image registration framework: === Image registration general scheme === The common idea of image registration and digital image correlation is to find the transformation between a fixed image and a moving one for a given metric using an optimization scheme. While there are many methods to achieve such a goal, Yadics focuses on registering images with the same modality. The idea behind the creation of this software is to be able to process data that comes from a μ-tomograph; i.e.: data cube over 10003 voxels. With such a size it is not possible to use naive approach usually used in a two-dimensional context. In order to get sufficient performances OpenMP parallelism is used and data are not globally stored in memory. As an extensive description of the different algorithms is given in. === Sampling === Contrary to image registration, Digital Image Correlation targets the transformation, one wants to extracted the most accurate transformation from the two images and not just match the images. Yadics uses the whole image as a sampling grid: it is thus a total sampling. === Interpolator === It is possible to choose between bilinear interpolation and bicubic interpolation for the grey level evaluation at non integer coordinates. The bi-cubic interpolation is the recommended one. === Metrics === ==== Sum of squared differences (SSD) ==== The SSD is also known as mean squared error. The equation below defines the SSD metric: S S D ( μ , I F , I M ) = 1 | Ω F | ∑ x i ∈ Ω F ( I F ( x i ) − I M ( T μ ( x i ) ) ) 2 , {\displaystyle SSD(\mu ,{\mathcal {I_{F}}},{\mathcal {I_{M}}})={\dfrac {1}{\left|\Omega _{F}\right|}}\sum _{x_{i}\in \Omega _{F}}\left({\mathcal {I_{F}}}(x_{i})-{\mathcal {I_{M}}}({T}_{\mu }(x_{i}))\right)^{2},} where I F {\displaystyle {\mathcal {I_{F}}}} is the fixed image, I M {\displaystyle {\mathcal {I_{M}}}} the moving one, Ω F {\displaystyle \Omega _{F}} the integration area | Ω F | {\displaystyle \left|\Omega _{F}\right|} the number of pi(vo)xels (cardinal) and T μ {\displaystyle {T}_{\mu }} the transformation parametrized by μ The transformation can be written as: T μ ( x ) = x + { Φ ( x ) } t { μ } . {\displaystyle T_{\mu }(x)=x+\left\{\Phi (x)\right\}^{t}\left\{\mu \right\}.} This metric is the main one used in the YaDICs as it works well with same modality images. One has to find the minimum of this metric ==== Normalized cross-correlation ==== The normalized cross-correlation (NCC) is used when one cannot assure the optical flow conservation; it happens in case of change of lighting or if particles disappear from the scene can occur in particle images velocimetry (PIV). The NCC is defined by: N C C ( μ , I F , I M ) = ∑ x i ∈ Ω F ( I F ( x i ) − I F ¯ ) ( I M ( T μ ( x i ) ) − I M ¯ ) ∑ x i ∈ Ω F ( I F ( x i ) − I F ¯ ) 2 ∑ x i ∈ Ω F ( I M ( T μ ( x i ) ) − I M ¯ ) 2 , {\displaystyle NCC(\mu ,{\mathcal {I_{F}}},{\mathcal {I_{M}}})={\dfrac {\sum _{x_{i}\in \Omega _{F}}\left({\mathcal {I_{F}}}(x_{i})-{\overline {\mathcal {I_{F}}}}\right)\left({\mathcal {I_{M}}}({T}_{\mu }(x_{i}))-{\overline {\mathcal {I_{M}}}}\right)}{\sqrt {\sum _{x_{i}\in \Omega _{F}}\left({\mathcal {I_{F}}}(x_{i})-{\overline {\mathcal {I_{F}}}}\right)^{2}\sum _{x_{i}\in \Omega _{F}}\left({\mathcal {I_{M}}}({T}_{\mu }(x_{i}))-{\overline {\mathcal {I_{M}}}}\right)^{2}}}},} where I F ¯ {\displaystyle {\overline {\mathcal {I_{F}}}}} and I M ¯ {\displaystyle {\overline {\mathcal {I_{M}}}}} are the mean values of the fixed and mobile images. This metric is only used to find local translation in Yadics. This metric with translation transform can be solved using cross-correlation methods, which are non iterative and can be accelerated using Fast Fourier Transform . === Classification of transformations === There are three categories of parametrization: elastic, global and local transformation. The elastic transformations respect the partition of unity, there are no holes created or surfaces counted several times. This is commonly used in Image Registration by the use of B-Spline functions and in solid mechanics with finite element basis. The global transformations are defined on the whole picture using rigid body or affine transformation (which is equivalent to homogeneous strain transformation). More complex transformations can be defined such as mechanically based one. These transformations have been used for stress intensity factor identification by and for rod strain by. The local transformation can be considered as the same global transformation defined on several Zone Of Interest (ZOI) of the fixed image. ==== Global ==== Several global transforms have been implemented: Rigid and homogeneous (Tx,Ty,Rz in 2D; Tx,Ty,Tz,Rx,Ry,Rz,Exx,Eyy,Ezz,Eyz,Exz,Exy in 3D) Brazilian (Only in 2D), Dynamic Flexion, ==== Elastic ==== First-order quadrangular finite elements Q4P1 are used in Yadics. ===== Local ===== Every global transform can be used on a local mesh. === Optimization === The YaDICs optimization process follows a gradient descent scheme. The first step is to compute the gradient of the metric regarding the transform parameters ∂ S S D ( μ , I F , I M ) ∂ μ = 2 | Ω F | ∑ x i ∈ Ω F ( I F ( x i ) − I M ( T μ ( x i ) ) ) ∂ I M ( T μ ( x i ) ∂ μ = 2 | Ω F | ∑ x i ∈ Ω F ( I F ( x i ) − I M ( T μ ( x i ) ) ) ( ∂ T μ ( x i ) ∂ μ ) t ∂ I M ( T μ ( x i ) ) ∂ x {\displaystyle {\begin{array}{lcl}{\dfrac {\partial SSD(\mu ,{\mathcal {I_{F}}},{\mathcal {I_{M}}})}{\partial \mu }}&=&{\dfrac {2}{\left|\Omega _{F}\right|}}\sum _{x_{i}\in \Omega _{F}}\left({\mathcal {I_{F}}}(x_{i})-{\mathcal {I_{M}}}({T}_{\mu }(x_{i}))\right){\dfrac {\partial {\mathcal {I_{M}}}({T}_{\mu }(x_{i})}{\partial \mu }}\\&=&{\dfrac {2}{\left|\Omega _{F}\right|}}\sum _{x_{i}\in \Omega _{F}}\left({\mathcal {I_{F}}}(x_{i})-{\mathcal {I_{M}}}({T}_{\mu }(x_{i}))\right)\left({\dfrac {\partial {T}_{\mu }(x_{i})}{\partial \mu }}\right)^{t}{\dfrac {\partial {\mathcal {I_{M}}}({T}_{\mu }(x_{i}))}{\partial x}}\\\end{array}}} ==== Gradient method ==== Once the metric gradient has been computed, one has to find an optimization strategy The gradient method principle is explained below: μ k + 1 = μ k + α k d k {\displaystyle \mu _{k+1}=\mu _{k}+\alpha _{k}d_{k}} The gradient step can be constant or updated at every iteration. d k = − γ k ∂ C ( μ , I F , I M ) ∂ μ {\displaystyle d_{k}=-\gamma _{k}{\dfrac {\partial {\mathcal {C}}(\mu ,{\mathcal {I_{F}}},{\mathcal {I_{M}}})}{\partial \mu }}} , γ k {\displaystyle \gamma _{k}} allows one to choose between the following methods : γ k {\displaystyle \gamma _{k}} ⟹ {\displaystyle \Longrightarrow } steepest descent, γ k = [ ∂ C ( μ , I F , I M ) ∂ μ ∂ C ( μ , I F , I M ) ∂ μ t ] − 1 {\displaystyle \gamma _{k}=\left[{\dfrac {\partial {\mathcal {C}}(\mu ,{\mathcal {I_{F}}},{\mathcal {I_{M}}})}{\partial \mu }}{\dfrac {\partial {\mathcal {C}}(\mu ,{\mathcal {I_{F}}},{\mathcal {I_{M}}})}{\partial \mu }}^{t}\right]^{-1}} ⟹ {\displaystyle \Longrightarrow } Gauss-Newto

    Read more →
  • Traité de Documentation

    Traité de Documentation

    Traité de documentation: le livre sur le livre, théorie et pratique is a landmark book by Belgian author Paul Otlet, first published in 1934. == Legacy == The book is considered a landmark in the history of information science, with concepts predicting the rise of the World Wide Web and search engines. In [Otlet's] most famous publication of 1934, Traité de Documentation, he wrote of a desk in the form of a wheel from which different projects (workspaces) could be switched as they rotated — foreshadowing the multiple desktops and tabs of contemporary computer interfaces. Inspired by the arrival of radio, phonograph, cinema, and television, Otlet also posited that there were as yet many “inventions to be discovered,” including the reading and annotation of remote documents and computer speech.

    Read more →
  • Web data integration

    Web data integration

    Web data integration (WDI) is the process of aggregating and managing data from different websites into a single, homogeneous workflow. This process includes data access, transformation, mapping, quality assurance and fusion of data. Data that is sourced and structured from websites is referred to as "web data". WDI is an extension and specialization of data integration that views the web as a collection of heterogeneous databases. Data integration techniques in the context of the web, forms the foundation for businesses taking advantage of data available on the ever-increasing number of publicly-accessible websites. Corporate spending on this area amounted to about USD 2.5bn in 2017, and it is expected that by 2020 the market will reach almost USD 7bn.

    Read more →