AI Assistant For Acrobat Cost


  • Comparison of vector graphics editors


    A number of vector graphics editors exist for various platforms. Potential users of these editors will make comparisons based on factors such as the availability for the user's platform, the software license, the feature set, the merits of the user interface (UI) and the focus of the program. Some programs are more suitable for artistic work while others are better for technical drawings. Another important factor is the application's support of various vector and bitmap image formats for import and export. The tables in this article compare general and technical information for a number of vector graphics editors. See the article on each editor for further information. This article is neither all-inclusive nor necessarily up-to-date. == Some editors in detail == Adobe Fireworks (formerly Macromedia Fireworks) is a vector editor with bitmap editing capabilities with its main purpose being the creation of graphics for Web and screen. Fireworks supports RGB color scheme and has no CMYK support. This means it is mostly used for screen design. The native Fireworks file format is editable PNG (FWPNG or PNG). Adobe Fireworks has a competitive price, but its features can seem limited in comparison with other products. It is easier to learn than other products and can produce complex vector artwork. The Fireworks editable PNG file format is not supported by other Adobe products. Fireworks can manage the PSD and AI file formats which enables it to be integrated with other Adobe apps. Fireworks can also open FWPNG/PNG, PSD, AI, EPS, JPG, GIF, BMP, TIFF file formats, and save/export to FWPNG/PNG, PSD, AI (v.8), FXG (v.2.0), JPG, GIF, PDF, SWF and some others. Some support for exporting to SVG is available via a free Export extension. On May 6, 2013, Adobe announced that Fireworks would be phased out. Adobe Flash (formerly a Macromedia product) has straightforward vector editing tools that make it easier for designers and illustrators to use. 
The most important of these tools are vector lines and fills with bitmap-like selectable areas, simple modification of curves via the "selection" or the control points/handles through "direct selection" tools. Flash uses Actionscript for OOP, and has full XML functionality through E4X support. Adobe FreeHand (formerly Macromedia Freehand and Aldus Freehand) is mainly used by professional graphic designers. The functionality of FreeHand includes the flexibility of the application in the wide design environment, catering to the output needs of both traditional image reproduction methods and to contemporary print and digital media with its page-layout capabilities and text attribute controls. Specific functions of FreeHand include a superior image-tracing operation for vector editing, page layout features within multiple-page documents, and embedding custom print-settings (such as variable halftone-screen specifications within a single graphic, etc.) to each document independent of auxiliary printer-drivers. User-operation is considered to be more suited for designers with an artistic background compared to designers with a technical background. When being marketed, FreeHand lacked the promotional backing, development and PR support in comparison to other similar products. FreeHand was transferred to the classic print group after Macromedia was purchased by Adobe in 2005. On May 16, 2007, Adobe announced that no further updates to Freehand would be developed but continues to sell FreeHand MX as a Macromedia product. FreeHand continues to run on Mac OS X Snow Leopard (using an Adobe fix) and on Windows 7. For macOS, Affinity Designer is able to open version 10 & MX Freehand files. Adobe Illustrator is a commonly used editor because of Adobe's market dominance, but is more expensive than other similar products. It is primarily developed consistently in line with other Adobe products and is best integrated with Adobe's Creative Suite packages. 
The AI file format is proprietary, but some vector editors can open and save in that format. Illustrator imports over two dozen formats, including PSD, PDF and SVG, and exports AI, PDF, SVG, SVGZ, GIF, JPG, PNG, WBMP, and SWF. However, the user must remember to uncheck the "Preserve Illustrator Editing Capabilities" option if interoperable SVG files are desired. Affinity Designer by Serif Europe (the successor to their previous product, DrawPlus) is non-subscription-based software that is often described as an alternative to Adobe Illustrator. The application can open Portable Document Format (PDF), Adobe Photoshop, and Adobe Illustrator files, as well as export to those formats and to the Scalable Vector Graphics (SVG) and Encapsulated PostScript (EPS) formats. It also supports import from some Adobe FreeHand files (specifically versions 10 & MX). Apache OpenOffice Draw is the vector graphics editor of the Apache OpenOffice open source office suite. It supports many import and export file formats and is available for multiple desktop operating systems. Boxy SVG is a Chromium-based vector graphics editor for creating illustrations, as well as logos, icons, and other elements of graphic design. It is primarily focused on editing drawings in the SVG file format. The program is available as both a web app and a desktop application for Windows, macOS, ChromeOS, and Linux-based operating systems. Collabora Online Draw is the vector graphics editor of the Collabora Online open source office suite. It supports many import and export file formats, is accessible via any modern web browser, and also supports desktop editing features. Collabora Office, the enterprise-ready version of LibreOffice, is available for desktop and mobile operating systems. ConceptDraw PRO is a business diagramming tool and vector graphics editor available for both Windows and macOS. It supports multi-page documents and includes an integrated presentation mode. 
ConceptDraw PRO imports and exports several formats, including Microsoft Visio and Microsoft PowerPoint. Corel Designer (originally Micrografx Designer) is one of the earliest vector-based graphics editors for the Microsoft Windows platform. The product is mainly used for the creation of engineering drawings and is shipped with extensive libraries for the needs of engineers. It is also flexible enough for most vector graphics design applications. CorelDRAW is an editor used in the graphic design, sign making and fashion design industries. CorelDRAW is capable of limited interoperation by reading file formats from Adobe Illustrator. CorelDRAW has over 50 import and export filters, on-screen and dialog box editing and the ability to create multi-page documents. It can also generate TrueType and Type 1 fonts, although refined typographic control is better suited to a more specific application. Some other features of CorelDRAW include the creation and execution of VBA macros, viewing of colour separations in print preview mode and integrated professional imposing options. Dia is a free and open-source diagramming and vector graphics editor available for Windows, Linux and other Unix-based computer operating systems. Dia has a modular design and several shape packages for flowcharting, network diagrams and circuit diagrams. Its design was inspired by Microsoft Visio, although it uses a Single Document Interface similar to other GNOME software (such as GIMP). DrawPlus, first built for the Windows platform in 1993, has matured into a full-featured vector graphics editor for home and professional users. It is also available as a feature-limited free 'starter edition', DrawPlus SE. The DrawPlus developers, Serif Europe, have now ceased its development in order to focus on its successor, Affinity Designer. Edraw Max is cross-platform diagram software and a vector graphics editor available for Windows, Mac and Linux. It supports various diagram types. 
Edraw Max imports and exports SVG, PDF, HTML, multi-page TIFF, Microsoft Visio and Microsoft PowerPoint. Embroidermodder is a free machine embroidery software tool that supports a variety of formats and allows the user to add custom modifications to their embroidery designs. Fatpaint is a free, lightweight, browser-based graphic design application with built-in vector drawing tools. It can be accessed through any browser with Flash 9 installed. Its integration with Zazzle makes it particularly suitable for people who want to create graphics for custom printed products such as T-shirts, mugs, iPhone cases, flyers and other promotional products. Figma is a collaborative web-based vector graphics editor, used primarily for UX design and prototyping. GIMP, which works mainly with raster images, offers a limited set of features to create and record SVG files. It can also load and handle SVG files created with other software like Inkscape. Inkscape is a free and open-source vector editor with the primary native format being SVG. Inkscape is available for Linux, Windows, Mac OS X, and

  • Spatial computing


    Spatial computing refers to 3D human–computer interaction techniques that are perceived by users as taking place in the real world, in and around their bodies and physical environments, instead of constrained to and perceptually behind computer screens or in purely virtual worlds. This concept inverts the long-standing practice of teaching people to interact with computers in digital environments, and instead teaches computers to better understand and interact with people more naturally in the human world. This concept overlaps with and encompasses others including extended reality, augmented reality, mixed reality, natural user interface, contextual computing, affective computing, and ubiquitous computing. The usage for labeling and discussing these adjacent technologies is imprecise. Spatial computing devices include sensors—such as RGB cameras, depth cameras, 3D trackers, inertial measurement units, or other tools—to sense and track nearby human bodies (including hands, arms, eyes, legs, mouths) during ordinary interactions with people and computers in a 3D space. They further use computer vision to attempt to understand real world scenes, such as rooms, streets or stores, to read labels, to recognize objects, create 3D maps, and more. Quite often they also use extended reality and mixed reality to superimpose virtual 3D graphics and virtual 3D audio onto the human visual and auditory system as a way of providing information more naturally and contextually than traditional 2D screens. Spatial computing often refers to personal computing devices like headsets and headphones, but other human-computer interactions that leverage real-time spatial positioning for displays, like projection mapping or cave automatic virtual environment displays, can also be considered spatial computing if they leverage human-computer input for the participants. 
== History == The term "spatial computing" apparently originated in the field of GIS around 1985 or earlier to describe computations on large-scale geospatial information. Early examples of spatial computing in GIS include ArcInfo and its iterations, initially released in 1981, a part of ArcGIS along with ArcEditor, which together provide mapping, analysis, editing, and geoprocessing for geodatabases. This is somewhat related to the modern use, but on the scale of continents, cities, and neighborhoods. Modern spatial computing is more centered on the human scale of interaction, around the size of a living room or smaller, although it is not limited to that scale in the aggregate. In the early 1990s, as the field of virtual reality was beginning to be commercialized beyond academic and military labs, a startup called Worldesign in Seattle used the term Spatial Computing to describe the interaction between individual people and 3D spaces, operating more at the human end of the scale than previous GIS examples may have contemplated. The company built a CAVE-like environment it called the Virtual Environment Theater, whose 3D experience was of a virtual flyover of the Giza Plateau, circa 3000 BC. Robert Jacobson, CEO of Worldesign, attributes the origins of the term to experiments at the Human Interface Technology Lab, at the University of Washington, under the direction of Thomas A. Furness III. Jacobson was a co-founder of that lab before spinning off this early VR startup. In 1997, an academic publication by T. Caelli, Peng Lam, and H. Bunke called "Spatial Computing: Issues in Vision, Multimedia and Visualization Technologies" introduced the term more broadly for academic audiences, focusing on a variety of topics such as image processing, dead reckoning navigation, object recognition, and visualizing spatial data. 
The specific term "spatial computing" was later referenced again in 2003 by Simon Greenwold, as "human interaction with a machine in which the machine retains and manipulates referents to real objects and spaces". MIT Media Lab alumnus John Underkoffler gave a TED talk in 2010 giving a live demo of the multi-screen, multi-user spatial computing systems being developed by Oblong Industries, which sought to bring to life the futuristic interfaces conceptualized by Underkoffler in the films Minority Report and Iron Man. Google Earth, initially released by Keyhole Inc. in 2001 and re-released by Google in 2005 can be considered a capable GIS and includes advanced geospatial tools and capabilities. == Notable instances of the use of spatial computing == In 2019, Microsoft HoloLens released a video outlining Airbus' partnership with Microsoft Azure to utilize the latter's mixed reality services for streamlining and improving the aircraft design process, as well as reducing the error in development. Airbus utilized the HoloLens 2 to this end, and the executive vice president of engineering claimed that their design process' validation phases were "hugely accelerated by 80 percent", as well as "strongly believe[d]" that up to 30% improvements in their industrial tasks could be attained with the HoloLens 2. During the presentational video, Airbus cited the maturity of Microsoft Azure services as "key" for their usage of the HoloLens 2. Also in 2019, the U.S. army partnered with Microsoft to produce a HoloLens based Integrated Visual Augmentation System (IVAS) to enhance infantry members by giving troops various abilities, including but not limited to using holographs to train, projecting 3D maps into their vision, and seeing through smoke and corners. Microsoft received tens of thousands of hours of feedback for their systems by 2021. 
Sergeant Marc Krugh at the time claimed that Microsoft's partnership had already caused the army to rethink some of its troops' operation strategy. == Products == === Apple Vision Pro === Apple announced Apple Vision Pro, a device it markets as a "spatial computer", on June 5, 2023. It includes several features such as Spatial Audio, two 4K micro-OLED displays, the Apple R1 chip and eye tracking, and it was released in the United States on February 2, 2024. In announcing the platform, Apple invoked its history of popularizing 2D graphical user interfaces that supplanted prior human-computer interface mechanisms such as the command line. Apple presents spatial computing as a new category of interactive device, on the same level of importance as the introduction of the 2D GUI. Apple Vision Pro runs on a new operating system called visionOS, which combines eye tracking, gesture recognition, and voice input to enable immersive interaction without physical controllers. The platform is aimed at productivity, entertainment, collaboration, and enterprise use cases. === Magic Leap === Magic Leap had also previously used the term “spatial computing” to describe its own devices. Its first headset, the Magic Leap 1, was released on August 8, 2018. Magic Leap’s technology displays content in the real world using an optical see-through head-mounted display, which projects an overlay of a virtual world into the user’s field of view. This allows for an experience where the physical and digital worlds are perceived simultaneously. === Microsoft HoloLens === On February 24, 2019, Microsoft released the HoloLens 2, which includes mixed reality tools and can generate interactable, manipulatable holograms in 3D space. The holograms in question can be related to a physical object or completely independent and free-floating. 
The Azure Spatial Anchors cloud service was released simultaneously; it lets holograms persist across time and across many individuals' devices. === Meta Quest === The Meta Quest 3, a mixed reality gaming headset that includes spatial audio and two color cameras and lets users interact with virtual characters, was released on October 9, 2023, at a notably lower price than the Apple Vision Pro but with reduced capabilities. === Snap Spectacles === Spectacles are augmented reality glasses developed by Snap Inc. The latest generation includes a 46-degree stereoscopic display, adjustable tint, and Snapdragon processors. Spectacles allow users to interact with a collection of augmented reality experiences designed for education, entertainment, and utility. Currently, the device is in the hands of selected developers and creators, as part of an experimental AR ecosystem focused on creativity, use case exploration and expression.

  • Online analytical processing


    In computing, online analytical processing (OLAP) is an approach to quickly answer multi-dimensional analytical (MDA) queries. The term OLAP was created as a slight modification of the traditional database term online transaction processing (OLTP). OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications emerging, such as agriculture. OLAP tools enable users to analyse multidimensional data interactively from multiple perspectives. OLAP consists of three basic analytical operations: consolidation (roll-up), drill-down, and slicing and dicing. Consolidation involves the aggregation of data that can be accumulated and computed in one or more dimensions. For example, all sales offices are rolled up to the sales department or sales division to anticipate sales trends. By contrast, drill-down is a technique that allows users to navigate through the details. For instance, users can view the sales by individual products that make up a region's sales. Slicing and dicing is a feature whereby users can take out (slicing) a specific set of data of the OLAP cube and view (dicing) the slices from different viewpoints. These viewpoints are sometimes called dimensions (such as looking at the same sales by salesperson, or by date, or by customer, or by product, or by region, etc.). Databases configured for OLAP use a multidimensional data model, allowing for complex analytical and ad hoc queries with a rapid execution time. They borrow aspects of navigational databases, hierarchical databases and relational databases. 
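The three operations named above can be illustrated with a minimal Python sketch. The fact rows, dimension names, and the helper functions `roll_up` and `slice_cube` are all hypothetical and chosen only for illustration; they are not the API of any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, product, quarter, sales). Values are made up.
FACTS = [
    ("North", "Widget", "Q1", 100),
    ("North", "Widget", "Q2", 120),
    ("North", "Gadget", "Q1", 80),
    ("South", "Widget", "Q1", 90),
    ("South", "Gadget", "Q2", 70),
]
DIMENSIONS = ("region", "product", "quarter")


def roll_up(facts, keep):
    """Consolidate: sum sales over every dimension not listed in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]
    return dict(totals)


def slice_cube(facts, dimension, value):
    """Slice: keep only the rows where one dimension has a fixed value."""
    position = DIMENSIONS.index(dimension)
    return [row for row in facts if row[position] == value]


# Roll-up: total sales per region.
by_region = roll_up(FACTS, ["region"])

# Drill-down: break the North region's total into its products.
north_by_product = roll_up(slice_cube(FACTS, "region", "North"), ["product"])

# Slice and dice: take the Q1 slice, then view it by region.
q1_by_region = roll_up(slice_cube(FACTS, "quarter", "Q1"), ["region"])
```

Keys of the returned dictionaries are tuples, so the same helper works for one or several retained dimensions, for example `roll_up(FACTS, ["region", "quarter"])`.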
OLAP is typically contrasted to OLTP (online transaction processing), which is generally characterized by much less complex queries, in a larger volume, to process transactions rather than for the purpose of business intelligence or reporting. Whereas OLAP systems are mostly optimized for read, OLTP has to process all kinds of queries (read, insert, update and delete). == Overview of OLAP systems == At the core of any OLAP system is an OLAP cube (also called a 'multidimensional cube' or a hypercube). It consists of numeric facts called measures that are categorized by dimensions. The measures are placed at the intersections of the hypercube, which is spanned by the dimensions as a vector space. The usual interface to manipulate an OLAP cube is a matrix interface, like Pivot tables in a spreadsheet program, which performs projection operations along the dimensions, such as aggregation or averaging. The cube metadata is typically created from a star schema or snowflake schema or fact constellation of tables in a relational database. Measures are derived from the records in the fact table and dimensions are derived from the dimension tables. Each measure can be thought of as having a set of labels, or meta-data associated with it. A dimension is what describes these labels; it provides information about the measure. A simple example would be a cube that contains a store's sales as a measure, and Date/Time as a dimension. Each Sale has a Date/Time label that describes more about that sale. 
    For example:

     Sales Fact Table
    +-------------+----------+
    | sale_amount | time_id  |
    +-------------+----------+            Time Dimension
    |      930.10 |     1234 |----+     +---------+-------------------+
    +-------------+----------+    |     | time_id | timestamp         |
                                  |     +---------+-------------------+
                                  +---->|    1234 | 20080902 12:35:43 |
                                        +---------+-------------------+

    === Multidimensional databases ===

    Multidimensional structure is defined as "a variation of the relational model that uses multidimensional structures to organize data and express the relationships between data". The structure is broken into cubes and the cubes are able to store and access data within the confines of each cube. "Each cell within a multidimensional structure contains aggregated data related to elements along each of its dimensions". Even when data is manipulated it remains easy to access and continues to constitute a compact database format. The data still remains interrelated. Multidimensional structure is quite popular for analytical databases that use online analytical processing (OLAP) applications. Analytical databases use these databases because of their ability to deliver answers to complex business queries swiftly. Data can be viewed from different angles, which gives a broader perspective of a problem unlike other models.

    === Aggregations ===

    It has been claimed that for complex queries OLAP cubes can produce an answer in around 0.1% of the time required for the same query on OLTP relational data. The most important mechanism in OLAP which allows it to achieve such performance is the use of aggregations. Aggregations are built from the fact table by changing the granularity on specific dimensions and aggregating up data along these dimensions, using an aggregate function (or aggregation function). The number of possible aggregations is determined by every possible combination of dimension granularities.
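As a rough sketch of the last two points, the Python below counts granularity combinations and builds one aggregation by coarsening the time dimension. The level names, the `store` dimension, the sample rows, and both helper functions are assumptions made for illustration; amounts are stored in integer cents to keep the arithmetic exact.

```python
from collections import defaultdict
from itertools import product

# Hypothetical granularity levels per dimension, finest first.
# "all" means the dimension is aggregated away completely.
LEVELS = {
    "time": ["day", "month", "year", "all"],
    "store": ["store", "city", "all"],
}


def granularity_combinations(levels):
    """Every combination of one level per dimension (includes the base data)."""
    return list(product(*levels.values()))


# Hypothetical fact rows: (ISO date, sale amount in cents).
FACTS = [
    ("2008-09-02", 93010),
    ("2008-09-15", 6990),
    ("2008-10-01", 5000),
]

# How many leading characters of the ISO date identify each time level.
PREFIX_LENGTH = {"day": 10, "month": 7, "year": 4, "all": 0}


def aggregate_by_time(facts, level):
    """Change the time granularity and sum the measure at the new level."""
    totals = defaultdict(int)
    for date, cents in facts:
        totals[date[:PREFIX_LENGTH[level]]] += cents
    return dict(totals)


combinations = granularity_combinations(LEVELS)   # 4 * 3 = 12 combinations
monthly = aggregate_by_time(FACTS, "month")
yearly = aggregate_by_time(FACTS, "year")
```

With four time levels and three store levels there are twelve combinations, one of which is the unaggregated base data; each added dimension multiplies the count by its number of levels.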
The combination of all possible aggregations and the base data contains the answers to every query which can be answered from the data. Because usually there are many aggregations that can be calculated, often only a predetermined number are fully calculated; the remainder are solved on demand. The problem of deciding which aggregations (views) to calculate is known as the view selection problem. View selection can be constrained by the total size of the selected set of aggregations, the time to update them from changes in the base data, or both. The objective of view selection is typically to minimize the average time to answer OLAP queries, although some studies also minimize the update time. View selection is NP-complete. Many approaches to the problem have been explored, including greedy algorithms, randomized search, genetic algorithms and the A* search algorithm. Some aggregation functions can be computed for the entire OLAP cube by precomputing values for each cell, and then computing the aggregation for a roll-up of cells by aggregating these aggregates, applying a divide and conquer algorithm to the multidimensional problem to compute them efficiently. For example, the overall sum of a roll-up is just the sum of the sub-sums in each cell. Functions that can be decomposed in this way are called decomposable aggregation functions, and include COUNT, MAX, MIN, and SUM, which can be computed for each cell and then directly aggregated; these are known as self-decomposable aggregation functions. In other cases, the aggregate function can be computed by computing auxiliary numbers for cells, aggregating these auxiliary numbers, and finally computing the overall number at the end; examples include AVERAGE (tracking sum and count, dividing at the end) and RANGE (tracking max and min, subtracting at the end). 
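A small Python sketch may make the distinction concrete. The `Partial` record and the helper names are hypothetical; the sketch only assumes that each cell keeps COUNT, SUM, MIN and MAX, so that AVERAGE and RANGE can be derived after cells are merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partial:
    """Auxiliary numbers stored per cell (all self-decomposable)."""
    count: int
    total: float
    minimum: float
    maximum: float


def summarize(values):
    """Precompute the auxiliary numbers for one cell."""
    return Partial(len(values), sum(values), min(values), max(values))


def merge(a, b):
    """Roll two cells up by aggregating their aggregates."""
    return Partial(
        a.count + b.count,
        a.total + b.total,
        min(a.minimum, b.minimum),
        max(a.maximum, b.maximum),
    )


def average(p):
    """AVERAGE: divide only at the end."""
    return p.total / p.count


def value_range(p):
    """RANGE: subtract only at the end."""
    return p.maximum - p.minimum


cell_a = [2, 4, 6]
cell_b = [10, 20]
rolled = merge(summarize(cell_a), summarize(cell_b))

# Averaging the per-cell averages gives a different (wrong) answer,
# which is why sum and count are tracked instead.
naive = (average(summarize(cell_a)) + average(summarize(cell_b))) / 2
```

Here `average(rolled)` matches the average of the five base values, whereas `naive` does not, because the two cells hold different numbers of values.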
In other cases, the aggregate function cannot be computed without analyzing the entire set at once, though in some cases approximations can be computed; examples include DISTINCT COUNT, MEDIAN, and MODE; for example, the median of a set is not the median of medians of subsets. These latter are difficult to implement efficiently in OLAP, as they require computing the aggregate function on the base data, either computing them online (slow) or precomputing them for possible rollouts (large space). == Types == OLAP systems have been traditionally categorized using the following taxonomy. === Multidimensional OLAP (MOLAP) === MOLAP (multi-dimensional online analytical processing) is the classic form of OLAP and is sometimes referred to as just OLAP. MOLAP stores this data in an optimized multi-dimensional array storage, rather than in a relational database. Some MOLAP tools require the pre-computation and storage of derived data, such as consolidations – the operation known as processing. Such MOLAP tools generally utilize a pre-calculated data set referred to as a data cube. The data cube contains all the possible answers to a given range of questions. As a result, they have a very fast response to queries. On the other hand, updating can take a long time depending on the degree of pre-computation. Pre-computation can also lead to what is known as data explosion. Other MOLAP tools, particularly those that implement the functional database model do not pre-compute derived data but make all calculations on demand other than those that were previously requested and stored in a cache. Advantages of MOLAP Fast query performance due to optimized storage, multidimensional indexing and caching. Smaller on-disk size of data compared to data stored in relational database due to compression techniques. Automated computation of higher-level aggregates of the data. It is very compact for low dimension data se

  • Overcategorization


    Overcategorization or category clutter is a phenomenon during classification where too many categories or classes are assigned to a document, record, or item. Overcategorization is related to the library and information science (LIS) concepts of document classification and subject indexing. It is also related to online shopping, where excessive product categories can overwhelm users with too many choices or make it more difficult for customers to find the products they need. Although these categories are intended to improve organization and ease of navigation when shopping online, too many categories can lower customer satisfaction, increase difficulty navigating the online store, and reduce future shopping intentions. In LIS, the ideal number of terms that should be assigned to classify an item is measured by the variables precision and recall. Assigning few category labels that are most closely related to the content of the item being classified will result in searches that have high precision, i.e., where a high proportion of the results are closely related to the query. Assigning more category labels to each item will reduce the precision of each search, but increase the recall, retrieving more relevant results. Related LIS concepts include exhaustivity of indexing and information overload. == Basic principles == If too many categories are assigned to a given document, the implications for users depend on how informative the links are. If the user is able to distinguish between useful and not useful links, the damage is limited: the user only wastes time selecting links. In many cases, however, the user cannot judge whether or not a given link will turn out to be fruitful. In that case he or she has to follow the link and read or skim another document. The worst case scenario is, of course, that even after reading the new document the user is unable to decide whether or not it might be useful if its subject matter is not thoroughly investigated. 
Overcategorization also has another unpleasant implication: It makes the system (for example in Wikipedia) difficult to maintain in a consistent way. If the system is inconsistent, it means that when the user considers the links in a given category, he or she will not find all documents relevant to that category. Basically, the problem of overcategorization should be understood from the perspective of relevance and the traditional measures of recall and precision. If too few relevant categories are assigned to a document, recall may decrease. If too many non-relevant categories are assigned, precision becomes lower. The hard job is to say which categories are fruitful or relevant for future use of the document.
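The trade-off between precision and recall can be shown with a short Python sketch. The document identifiers and the two labelling scenarios are invented for illustration, and `precision_recall` is a hypothetical helper implementing the standard definitions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one category lookup.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical: four documents are truly relevant to a category.
relevant_docs = {"d1", "d2", "d3", "d4"}

# Sparse labelling: only the two most closely related documents carry the label.
sparse = precision_recall({"d1", "d2"}, relevant_docs)

# Liberal labelling: the label is also put on four non-relevant documents.
liberal = precision_recall(
    {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}, relevant_docs
)
```

In this made-up case sparse labelling yields full precision but half the recall, while liberal labelling yields full recall but half the precision.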

  • Progress in artificial intelligence


    Progress in artificial intelligence (AI) refers to the advances, milestones, and breakthroughs that have been achieved in the field of artificial intelligence over time. AI is a branch of computer science that aims to create machines and systems capable of performing tasks that typically require human intelligence. AI applications have been used in a wide range of fields including medical diagnosis, finance, robotics, law, video games, agriculture, and scientific discovery. Society as a whole expects artificial intelligence to be a key factor in the upcoming years because of its potential. However, many AI applications are not perceived as AI: "A lot of cutting-edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore." "Many thousands of AI applications are deeply embedded in the infrastructure of every industry." In the late 1990s and early 2000s, AI technology became widely used as elements of larger systems, but the field was rarely credited for these successes at the time. Kaplan and Haenlein structure artificial intelligence along three evolutionary stages: Artificial narrow intelligence – AI capable only of specific tasks; Artificial general intelligence – AI with ability in several areas, and able to autonomously solve problems it was never even designed for; Artificial superintelligence – AI capable of general tasks, including scientific creativity, social skills, and general wisdom. To allow comparison with human performance, artificial intelligence can be evaluated on constrained and well-defined problems. Such tests have been termed subject-matter expert Turing tests. Also, smaller problems provide more achievable goals and there are an ever-increasing number of positive results. In 2023, humans still substantially outperformed both GPT-4 and other models tested on the ConceptARC benchmark. 
Those models scored 60% on most, and 77% on one category, while humans scored 91% on all and 97% on one category. However, later research in 2025 showed that human-generated output grids were only accurate 73% of the time, while AI models available that year managed to score above 77%. == History == Increasing, promoting or constraining AI progress has often been done via controlling or increasing the amount of compute. == Current performance in specific areas == There are many useful abilities that can be described as showing some form of intelligence. This gives better insight into the comparative success of artificial intelligence in different areas. AI, like electricity or the steam engine, is a general-purpose technology. There is no consensus on how to characterize which tasks AI tends to excel at. Some versions of Moravec's paradox observe that humans are more likely to outperform machines in areas such as physical dexterity that have been the direct target of natural selection. While projects such as AlphaZero have succeeded in generating their own knowledge from scratch, many other machine learning projects require large training datasets. Researcher Andrew Ng has suggested, as a "highly imperfect rule of thumb", that "almost anything a typical human can do with less than one second of mental thought, we can probably now or in the near future automate using AI." Games provide a high-profile benchmark for assessing rates of progress; many games have a large professional player base and a well-established competitive rating system. AlphaGo brought the era of classical board-game benchmarks to a close in 2016, when artificial intelligence proved its competitive edge over humans: DeepMind's AlphaGo program defeated Lee Sedol, one of the world's best professional Go players. Games of imperfect knowledge provide new challenges to AI in the area of game theory; the most prominent milestone in this area was Libratus' poker victory in 2017. 
E-sports continue to provide additional benchmarks; Facebook AI, DeepMind, and others have engaged with the popular StarCraft franchise of video games. Broad classes of outcome for an AI test may be given as:
      • optimal: it is not possible to perform better (note: some of these entries were solved by humans)
      • super-human: performs better than all humans
      • high-human: performs better than most humans
      • par-human: performs similarly to most humans
      • sub-human: performs worse than most humans
    === Optimal ===
      • Tic-tac-toe
      • Connect Four: 1988
      • Checkers (aka 8x8 draughts): Weakly solved (2007)
      • Rubik's Cube: Mostly solved (2010)
      • Heads-up limit hold'em poker: Statistically optimal in the sense that "a human lifetime of play is not sufficient to establish with statistical significance that the strategy is not an exact solution" (2015)
    === Super-human ===
      • Othello (aka reversi): c. 1997
      • Scrabble: 2006
      • Backgammon: c. 1995–2002
      • Chess: Supercomputer (c. 1997); Personal computer (c. 2006); Mobile phone (c. 2009); Computer defeats human + computer (c. 2017)
      • Jeopardy!: Question answering, although the machine did not use speech recognition (2011)
      • Arimaa: 2015
      • Shogi: c. 2017
      • Go: 2017
      • Heads-up no-limit hold'em poker: 2017
      • Six-player no-limit hold'em poker: 2019
      • Gran Turismo Sport: 2022
    === High-human ===
      • Crosswords: c. 2012
      • Freeciv: 2016
      • Dota 2: 2018
      • Bridge card-playing: According to a 2009 review, "the best programs are attaining expert status as (bridge) card players", excluding bidding.
      • StarCraft II: 2019
      • Mahjong: 2019
      • Stratego: 2022
      • No-Press Diplomacy: 2022
      • Hanabi: 2022
      • Natural language processing
    === Par-human ===
      • Optical character recognition for ISO 1073-1:1976 and similar special characters
      • Classification of images
      • Handwriting recognition
      • Facial recognition
      • Visual question answering
      • SQuAD 2.0 English reading-comprehension benchmark (2019)
      • SuperGLUE English-language understanding benchmark (2020)
      • Some school science exams (2019)
      • Some tasks based on Raven's Progressive Matrices
      • Many Atari 2600 games (2015)
    === Sub-human ===
      • Optical character recognition for printed text (nearing par-human for Latin-script typewritten text)
      • Object recognition
      • Various robotics tasks that may require advances in robot hardware as well as AI, including:
        • Stable bipedal locomotion: Bipedal robots can walk, but are less stable than human walkers (as of 2017)
        • Humanoid soccer
      • Speech recognition: "nearly equal to human performance" (2017)
      • Explainability: Current medical systems can diagnose certain medical conditions well, but cannot explain to users why they made the diagnosis.
      • Many tests of fluid intelligence (2020)
      • Bongard visual cognition problems, such as the Bongard-LOGO benchmark (2020)
      • Visual Commonsense Reasoning (VCR) benchmark (as of 2020)
      • Stock market prediction: Financial data collection and processing using machine learning algorithms
      • Angry Birds video game, as of 2020
      • Various tasks that are difficult to solve without contextual knowledge, including:
        • Translation
        • Word-sense disambiguation
    == Proposed tests of artificial intelligence ==
    In his famous Turing test, Alan Turing picked language, the defining feature of human beings, for its basis. The Turing test is now considered too exploitable to be a meaningful benchmark. The Feigenbaum test, proposed by the inventor of expert systems, tests a machine's knowledge and expertise about a specific subject. A paper by Jim Gray of Microsoft in 2003 suggested extending the Turing test to speech understanding, speaking and recognizing objects and behavior. Proposed "universal intelligence" tests aim to compare how well machines, humans, and even non-human animals perform on problem sets that are as generic as possible.
At an extreme, the test suite can contain every possible problem, weighted by Kolmogorov complexity; however, these problem sets tend to be dominated by impoverished pattern-matching exercises where a tuned AI can easily exceed human performance levels. == Exams == According to OpenAI, in 2023 GPT-4 achieved high scores on several standardized and professional examinations, including around the 90th percentile on the Uniform Bar Exam, the 89th percentile on the mathematics section of the SAT, the 93rd percentile on SAT Reading and Writing, the 54th percentile on the analytical writing section of the GRE, the 88th percentile on GRE quantitative reasoning, and the 99th percentile on GRE verbal reasoning. OpenAI also reported that GPT-4 scored in the 99th to 100th percentile on the 2020 USA Biology Olympiad semifinal exam and earned top scores on several AP exams. Independent researchers found in 2023 that ChatGPT based on GPT-3.5 performed "at or near the passing threshold" on all three parts of the United States Medical Licensing Examination (USMLE), suggesting that large language models could reach passing-level performance on some medical knowledge assessments even without domain-specific fine-tuning. GPT-3.5 was also reported to attain a low but passing grade on examinations for four law school courses at the University of Minnes

  • Environmental informatics

    Environmental informatics

    Environmental informatics is the science of information applied to environmental science. As such, it provides the information processing and communication infrastructure to the interdisciplinary field of environmental sciences aiming at data, information and knowledge integration, the application of computational intelligence to environmental data as well as the identification of environmental impacts of information technology. Environmental informatics thus acts as a bridge, providing an interdisciplinary means of analysing, describing and understanding the complex interactions between humans, nature and technology. Since each field of applied computer science has its own subject matter, terminology and methods, specialised disciplines, such as environmental, bio- and geoinformatics have emerged, each of which combines computer science with a specific field of application such as environmental, bio- or geosciences. Environmental informatics, bioinformatics and geoinformatics all deal with computer-based processing of environmental phenomena. However, environmental informatics is the only field that pursues normative goals (e.g., political goals of environmental protection, environmental planning, and sustainability). This also influences the choice of methods. This also distinguishes it from application areas such as numerical weather prediction, which is considered an early and important example of computer simulation of environmental phenomena. The UK Natural Environment Research Council defines environmental informatics as the "research and system development focusing on the environmental sciences relating to the creation, collection, storage, processing, modelling, interpretation, display and dissemination of data and information." Kostas Karatzas defined environmental informatics as the "creation of a new 'knowledge-paradigm' towards serving environmental management needs." 
Karatzas argued further that environmental informatics "is an integrator of science, methods and techniques and not just the result of using information and software technology methods and tools for serving environmental engineering needs." Environmental informatics emerged in the early 1990s in Central Europe. Current initiatives to effectively manage, share, and reuse environmental and ecological data are indicative of the increasing importance of fields like environmental informatics and ecoinformatics in developing the foundations for effectively managing ecological information. Examples of these initiatives are the National Science Foundation DataNet projects DataONE and Data Conservancy. == Subject matter and objectives == The subject of environmental informatics is environmental information systems (EIS). An EIS 'is a computer-based system that integrates and stores data collected about the natural environment and provides powerful methods for accessing and evaluating it.' This allows environmental data to be processed by computers for environmental protection, planning, research and technology. According to Jaeschke and Bossel, environmental informatics has three interrelated objectives. First, it serves to procure data and information for describing the state and development of the environment. Of particular importance is information that is needed to prevent or limit undesirable changes and to support desirable changes. Second, based on the evaluation and analysis of data, it improves our understanding of the environment and the interactions between nature, technology and society. Third, it supports environmentally relevant decisions. This makes it possible to influence development (system correction), to assess the effects and side effects of potential measures, and to create tools for the routine planning, implementation and monitoring of measures.
== History == The simulation model World3, which formed the basis of the highly acclaimed study The Limits to Growth, is considered the starting point of environmental informatics. It incorporated environmental information, among other things, to calculate scenarios for global development. In the mid-1980s, interest grew in structuring environmental protection as an area of application for computer science. One of the first publications in German was the book Informatik im Umweltschutz. Anwendungen und Perspektiven (Computer science in environmental protection. Applications and perspectives) from 1986. The term 'environmental informatics' did not appear until around 1993, which is why the development of environmental informatics is usually referred to as having taken place in the 1990s. In 1993, the first university chair for environmental informatics was established in Cottbus. In 1994, the anthology Umweltinformatik. Informatikmethoden für Umweltschutz und Umweltforschung (Environmental Informatics: Informatics Methods for Environmental Protection and Environmental Research) was published. The development of environmental informatics was 'primarily initiated by German computer science.' In the English-speaking world, the volume Environmental Informatics was published in 1995, mainly based on the German anthology of 1994. An article in the conference proceedings of the World Computer Congress of the International Federation for Information Processing (IFIP) in Hamburg in 1994 describes the initial situation of environmental informatics as follows: 'On the one hand, we suffer from the huge amount of available data – people sometimes speak of data graveyards – on the other hand, the really relevant data may still be missing.' This statement indicates the need that led to the emergence of environmental informatics as a specialised discipline of applied computer science. 
Furthermore, the specific characteristics and processing requirements of environmental data necessitated the emergence of environmental informatics. The special features of environmental data include the following. The data structures required are highly heterogeneous due to specific processes and differing perspectives on environmental aspects (e.g., water protection, emission control, hazardous substances). In addition to the heterogeneity of the data, heterogeneous databases also play a role, as environmental data is often obtained and presented in an interdisciplinary manner. Obligations change frequently as a result of new legislation, whether regional (e.g. state regulations on water protection), national (e.g. federal emission control regulations) or international (e.g. Registration, Evaluation, Authorisation and Restriction of Chemicals, REACH). The objects represented are often multidimensional and, therefore, require complex geometric representation using curves or polygons. It is often necessary to process uncertain, imprecise or incomplete data, which is, for example, the result of extrapolations or forecasts. A new "knowledge paradigm" has emerged to meet the requirements of environmental management. Environmental informatics produces its own concepts, methods and techniques and is not merely the result of using information and communication technology methods and tools to meet environmental requirements. The development of environmental informatics since the 1990s has been significantly influenced by the newly established conferences EnviroInfo, ISESS and ITEE and is documented in the respective proceedings. Aspects of sustainability and sustainable development were increasingly integrated into environmental informatics after 2000, thereby expanding the field. In 2004, the Working Group on Sustainable Information Society of the Gesellschaft für Informatik e. V.
(German Informatics Society, GI) published the Memorandum on a Sustainable Information Society, which formulates recommendations for an information society that is compatible with human, social and natural needs. Since 2007, environmental informatics has often been described in more detail as informatics for environmental protection, sustainable development and risk management. The increased focus on sustainability has also contributed to the formation of the research focus Information and Communications Technology for Sustainability (ICT4S) and to the emergence of the international conference ICT4S in 2013. ICT-ENSURE, the European Commission's funding measure for the establishment of a European research area on "ICT for Environmental Sustainability Research" (2008–2010), has also contributed to the structuring of environmental informatics. == Environmental informatics and sustainable development == Efforts to place environmental informatics within the context of sustainable development have been growing since 2000 and were significantly influenced by the Memorandum on a Sustainable Information Society. According to this Memorandum, the information society offers great but unevenly distributed opportunities for education, participation and intercultural understanding. In addition, the Memorandum highlighted the material and energy consumption of inf

  • Microsoft Office PerformancePoint Server

    Microsoft Office PerformancePoint Server

    Microsoft Office PerformancePoint Server is a business intelligence software product released in 2007 by Microsoft. The product was generally an integration of the acquisitions from ProClarity - the Planning Server and Monitoring Server - into Microsoft's SharePoint server product line. Although the product was discontinued in 2009, its dashboard, scorecard, and analytics capabilities were incorporated into SharePoint 2010 and later versions. PerformancePoint Server also provided a planning and budgeting component directly integrated with Excel. == History == Microsoft offered preview releases of PerformancePoint Server starting in mid-2006. Previews of the product were formed from Business Scorecard Manager 2005 and the Planning Server component. The acquisitions of ProClarity and Great Plains brought additional analytics and planning/reporting capabilities, as well as the companion products ProClarity 6.3 and FRx. PerformancePoint Server was officially released in November 2007. Microsoft discontinued PerformancePoint Server as an independent product in 2009 and folded its dashboard, scorecard and analytics capabilities into PerformancePoint Services in SharePoint Server 2010. == Monitoring Server Component == Business monitoring capabilities, including dashboards, scorecards and key performance indicators, navigable reports for deeper analysis, strategy maps, and linked filtering, were provided by PerformancePoint's Monitoring Server component. A Dashboard Designer application distributed from Monitoring Server enabled business analysts or IT administrators to create and test data source connections, create views that use those data connections, assemble the views into a dashboard, and deploy the dashboard as a SharePoint page. Dashboard Designer saved content and security information back to the Monitoring Server. Data source connections, such as OLAP cubes or relational tables, were also made through Monitoring Server.
After a dashboard had been published to the Monitoring Server database, it could be deployed as a SharePoint page and shared with other users as such. When the pages were opened in a web browser, Monitoring Server updated the data in the views by connecting back to the original data sources. == Planning Server Component == PerformancePoint's Planning Server component supported the maintenance of logical business models, budget and approval workflows, and enterprise data sources, and it followed Generally Accepted Accounting Principles. Planning Server made use of Excel for input and line-of-business reporting, as well as SQL Server for storing and processing business models. == Management Reporter Component == The Management Reporter component was designed to perform financial reporting and could read PerformancePoint Planning models directly. A development kit was also available to allow this component to read other models.

  • Timeline of algorithms

    Timeline of algorithms

    The following timeline of algorithms outlines the development of algorithms (mainly "mathematical recipes") since their inception.
    == Antiquity ==
      • Before – writing about "recipes" (on cooking, rituals, agriculture and other themes)
      • c. 1700–2000 BC – Egyptians develop earliest known algorithms for multiplying two numbers
      • c. 1600 BC – Babylonians develop earliest known algorithms for factorization and finding square roots
      • c. 300 BC – Euclid's algorithm
      • c. 200 BC – the Sieve of Eratosthenes
      • 263 AD – Gaussian elimination described by Liu Hui
    == Medieval Period ==
      • 628 – Chakravala method described by Brahmagupta
      • c. 820 – Al-Khawarizmi described algorithms for solving linear equations and quadratic equations in his Algebra; the word algorithm comes from his name
      • 825 – Al-Khawarizmi described the algorism, algorithms for using the Hindu–Arabic numeral system, in his treatise On the Calculation with Hindu Numerals, which was translated into Latin as Algoritmi de numero Indorum, where "Algoritmi", the translator's rendition of the author's name, gave rise to the word algorithm (Latin algorithmus) with the meaning "calculation method"
      • c. 850 – cryptanalysis and frequency analysis algorithms developed by Al-Kindi (Alkindus) in A Manuscript on Deciphering Cryptographic Messages, which contains algorithms on breaking encryptions and ciphers
      • c. 1025 – Ibn al-Haytham (Alhazen) was the first mathematician to derive the formula for the sum of the fourth powers and, in turn, developed an algorithm for determining the general formula for the sum of any integral powers
      • c. 1400 – Ahmad al-Qalqashandi gives a list of ciphers in his Subh al-a'sha which include both substitution and transposition, and for the first time, a cipher with multiple substitutions for each plaintext letter; he also gives an exposition on and worked example of cryptanalysis, including the use of tables of letter frequencies and sets of letters which cannot occur together in one word
    == Before 1940 ==
      • 1540 – Lodovico Ferrari discovered a method to find the roots of a quartic polynomial
      • 1545 – Gerolamo Cardano published Cardano's method for finding the roots of a cubic polynomial
      • 1614 – John Napier develops method for performing calculations using logarithms
      • 1671 – Newton–Raphson method developed by Isaac Newton
      • 1690 – Newton–Raphson method independently developed by Joseph Raphson
      • 1706 – John Machin develops a quickly converging inverse-tangent series for π and computes π to 100 decimal places
      • 1768 – Leonhard Euler publishes his method for numerical integration of ordinary differential equations in problem 85 of Institutiones calculi integralis
      • 1789 – Jurij Vega improves Machin's formula and computes π to 140 decimal places
      • 1805 – FFT-like algorithm known by Carl Friedrich Gauss
      • 1842 – Ada Lovelace writes the first algorithm for a computing engine
      • 1903 – A fast Fourier transform algorithm presented by Carle David Tolmé Runge
      • 1918 – Soundex
      • 1926 – Borůvka's algorithm
      • 1926 – Primary decomposition algorithm presented by Grete Hermann
      • 1927 – Hartree–Fock method developed for simulating a quantum many-body system in a stationary state
      • 1934 – Delaunay triangulation developed by Boris Delaunay
      • 1936 – Turing machine, an abstract machine developed by Alan Turing, who with others developed the modern notion of algorithm
    == 1940s ==
      • 1942 – A fast Fourier transform algorithm developed by G.C. Danielson and Cornelius Lanczos
      • 1945 – Merge sort developed by John von Neumann
      • 1947 – Simplex algorithm developed by George Dantzig
    == 1950s ==
      • 1950 – Hamming codes developed by Richard Hamming
      • 1952 – Huffman coding developed by David A. Huffman
      • 1953 – Simulated annealing introduced by Nicholas Metropolis
      • 1954 – Radix sort computer algorithm developed by Harold H. Seward
      • 1956 – Kruskal's algorithm developed by Joseph Kruskal
      • 1956 – Ford–Fulkerson algorithm developed and published by R. Ford Jr. and D. R. Fulkerson
      • 1957 – Prim's algorithm developed by Robert Prim
      • 1957 – Bellman–Ford algorithm developed by Richard E. Bellman and L. R. Ford, Jr.
      • 1958 – Box–Muller transform for fast generation of normally distributed numbers published by George Edward Pelham Box and Mervin Edgar Muller; independently pre-discovered by Raymond E. A. C. Paley and Norbert Wiener in 1934
      • 1959 – Dijkstra's algorithm developed by Edsger Dijkstra
      • 1959 – Shell sort developed by Donald L. Shell
      • 1959 – De Casteljau's algorithm developed by Paul de Casteljau
      • 1959 – QR factorization algorithm developed independently by John G.F. Francis and Vera Kublanovskaya
      • 1959 – Rabin–Scott powerset construction for converting NFA into DFA published by Michael O. Rabin and Dana Scott
    == 1960s ==
      • 1960 – Karatsuba multiplication
      • 1961 – CRC (Cyclic redundancy check) invented by W. Wesley Peterson
      • 1962 – AVL trees
      • 1962 – Quicksort developed by C. A. R. Hoare
      • 1962 – Bresenham's line algorithm developed by Jack E. Bresenham
      • 1962 – Gale–Shapley 'stable-marriage' algorithm developed by David Gale and Lloyd Shapley
      • 1964 – Heapsort developed by J. W. J. Williams
      • 1964 – multigrid methods first proposed by R. P. Fedorenko
      • 1965 – Cooley–Tukey algorithm rediscovered by James Cooley and John Tukey
      • 1965 – Levenshtein distance developed by Vladimir Levenshtein
      • 1965 – Cocke–Younger–Kasami (CYK) algorithm independently developed by Tadao Kasami
      • 1965 – Buchberger's algorithm for computing Gröbner bases developed by Bruno Buchberger
      • 1965 – LR parsers invented by Donald Knuth
      • 1966 – Dantzig algorithm for shortest path in a graph with negative edges
      • 1967 – Viterbi algorithm proposed by Andrew Viterbi
      • 1967 – Cocke–Younger–Kasami (CYK) algorithm independently developed by Daniel H. Younger
      • 1968 – A* graph search algorithm described by Peter Hart, Nils Nilsson, and Bertram Raphael
      • 1968 – Risch algorithm for indefinite integration developed by Robert Henry Risch
      • 1969 – Strassen algorithm for matrix multiplication developed by Volker Strassen
    == 1970s ==
      • 1970 – Dinic's algorithm for computing maximum flow in a flow network by Yefim (Chaim) A. Dinitz
      • 1970 – Knuth–Bendix completion algorithm developed by Donald Knuth and Peter B. Bendix
      • 1970 – BFGS method of the quasi-Newton class
      • 1970 – Needleman–Wunsch algorithm published by Saul B. Needleman and Christian D. Wunsch
      • 1972 – Edmonds–Karp algorithm published by Jack Edmonds and Richard Karp, essentially identical to Dinic's algorithm from 1970
      • 1972 – Graham scan developed by Ronald Graham
      • 1972 – Red–black trees and B-trees discovered
      • 1973 – RSA encryption algorithm discovered by Clifford Cocks
      • 1973 – Jarvis march algorithm developed by R. A. Jarvis
      • 1973 – Hopcroft–Karp algorithm developed by John Hopcroft and Richard Karp
      • 1974 – Pollard's p − 1 algorithm developed by John Pollard
      • 1974 – Quadtree developed by Raphael Finkel and J.L. Bentley
      • 1975 – Genetic algorithms popularized by John Holland
      • 1975 – Pollard's rho algorithm developed by John Pollard
      • 1975 – Aho–Corasick string matching algorithm developed by Alfred V. Aho and Margaret J. Corasick
      • 1975 – Cylindrical algebraic decomposition developed by George E. Collins
      • 1976 – Salamin–Brent algorithm independently discovered by Eugene Salamin and Richard Brent
      • 1976 – Knuth–Morris–Pratt algorithm developed by Donald Knuth and Vaughan Pratt and independently by J. H. Morris
      • 1977 – Boyer–Moore string-search algorithm for finding occurrences of one string within another
      • 1977 – RSA encryption algorithm rediscovered by Ron Rivest, Adi Shamir, and Len Adleman
      • 1977 – LZ77 algorithm developed by Abraham Lempel and Jacob Ziv
      • 1977 – multigrid methods developed independently by Achi Brandt and Wolfgang Hackbusch
      • 1978 – LZ78 algorithm developed from LZ77 by Abraham Lempel and Jacob Ziv
      • 1978 – Bruun's algorithm proposed for powers of two by Georg Bruun
      • 1979 – Khachiyan's ellipsoid method developed by Leonid Khachiyan
      • 1979 – ID3 decision tree algorithm developed by Ross Quinlan
    == 1980s ==
      • 1980 – Brent's algorithm for cycle detection developed by Richard P. Brent
      • 1981 – Quadratic sieve developed by Carl Pomerance
      • 1981 – Smith–Waterman algorithm developed by Temple F. Smith and Michael S. Waterman
      • 1983 – Simulated annealing developed by S. Kirkpatrick, C. D. Gelatt and M. P. Vecchi
      • 1983 – Classification and regression tree (CART) algorithm developed by Leo Breiman, et al.
      • 1984 – LZW algorithm developed from LZ78 by Terry Welch
      • 1984 – Karmarkar's interior-point algorithm developed by Narendra Karmarkar
      • 1984 – ACORN PRNG discovered by Roy Wikramaratna and used privately
      • 1985 – Simulated annealing independently developed by V. Cerny
      • 1985 – Car–Parrinello molecular dynamics developed by Roberto Car and Michele Parrinello
      • 1985 – Splay trees discovered by Sleator and Tarjan
      • 1986 – Blum Blum Shub proposed by L. Blum, M. Blum, and M. Shub
      • 1986 – Push relabel maximum flow algorithm by Andrew Goldberg and Robert Tarjan
      • 1986 – Barnes–Hut tree method developed by Josh Barnes and Piet Hut for fast approximate simulation of n-body problems
      • 1987 – Fast multipole method developed by Leslie Greengard and Vladimir

  • Exercism

    Exercism

    Exercism is an online, open-source, free coding platform that offers code practice and mentorship in 77 different programming languages. == History == Software developer Katrina Owen created Exercism while she was teaching programming at Jumpstart Labs. The platform was developed as an internal tool to solve the problem of her own students not receiving feedback on the coding problems they were practicing. Owen put the site publicly online and found that people were sharing it with their friends, practicing together and giving each other feedback. Within 12 months, the site had grown organically: over 6,000 users had submitted code or feedback, and hundreds of volunteers were contributing to the languages or tooling on the platform. In 2016, Jeremy Walker joined as co-founder and CEO. In July 2018, the site was relaunched with a new design centered around a formal mentoring mode, at which point Owen stepped back from day-to-day involvement. == Product == In the past, the website differed from other coding platforms by requiring students to download exercises through a command line client, solve them on their own computers, and then submit the solution for feedback, at which point they could also view others' solutions to the same problem. Since its second relaunch in 2021, solutions can be edited and submitted through a web editor, though the command line client remains available. Exercism has tracks for 74 programming languages. Among the notable languages taught are ABAP, C, C#, C++, CoffeeScript, Delphi, Elm, Erlang, F#, Gleam, Go, Java, JavaScript, Julia, Kotlin, Objective-C, PHP, Python, Raku, Red, Ruby, Rust, Scala, Swift, and V (Vlang). In 2023, the site launched a "12 in 23" challenge for users to learn the basics of 12 different languages, one per month in 2023. == Open source == The Exercism codebase is open source.
In April 2016, it consisted of 50 repositories including website code, API code, command-line code and, most of all, over 40 stand-alone repositories for different language tracks. As of February 2024, Exercism has 14,344 contributors and 19,603 mentors and maintains 366 repositories.

  • Collective operation

    Collective operation

    Collective operations are building blocks for interaction patterns that are often used in SPMD algorithms in the parallel programming context. Hence, there is an interest in efficient realizations of these operations. A realization of the collective operations is provided by the Message Passing Interface (MPI). == Definitions == In all asymptotic runtime functions, we denote the latency (or startup time per message, independent of message size) by $\alpha$, the communication cost per word by $\beta$, the number of processing units by $p$ and the input size per node by $n$. In cases where we have initial messages on more than one node we assume that all local messages are of the same size. To address individual processing units we use $p_i \in \{p_0, p_1, \dots, p_{p-1}\}$. If we do not have an equal distribution, i.e. node $p_i$ has a message of size $n_i$, we get an upper bound for the runtime by setting $n = \max(n_0, n_1, \dots, n_{p-1})$. A distributed memory model is assumed. The concepts are similar for the shared memory model. However, shared memory systems can provide hardware support for some operations, such as broadcast (§ Broadcast), which allows convenient concurrent read. Thus, new algorithmic possibilities can become available. == Broadcast == The broadcast pattern is used to distribute data from one processing unit to all processing units, which is often needed in SPMD parallel programs to dispense input or global values. Broadcast can be interpreted as an inverse version of the reduce pattern (§ Reduce). Initially only the root $r$ with id $0$ stores message $m$.
During broadcast $m$ is sent to the remaining processing units, so that eventually $m$ is available to all processing units. Since an implementation by means of a sequential for-loop with $p-1$ iterations becomes a bottleneck, divide-and-conquer approaches are common. One possibility is to utilize a binomial tree structure with the requirement that $p$ has to be a power of two. When a processing unit is responsible for sending $m$ to processing units $i..j$, it sends $m$ to processing unit $\lceil (i+j)/2 \rceil$ and delegates responsibility for the processing units $\lceil (i+j)/2 \rceil..j$ to it, while its own responsibility is cut down to $i..\lceil (i+j)/2 \rceil - 1$. Binomial trees have a problem with long messages $m$. The receiving unit of $m$ can only propagate the message to other units after it has received the whole message. In the meantime, the communication network is not utilized. Therefore pipelining on binary trees is used, where $m$ is split into an array of $k$ packets of size $\lceil n/k \rceil$. The packets are then broadcast one after another, so that data is distributed fast in the communication network. Pipelined broadcast on a balanced binary tree is possible in $\mathcal{O}(\alpha \log p + \beta n)$, whereas the non-pipelined case takes $\mathcal{O}((\alpha + \beta n) \log p)$. == Reduce == The reduce pattern is used to collect data or partial results from different processing units and to combine them into a global result by a chosen operator.
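
As a rough illustration, the range-halving tree scheme from the Broadcast section can be simulated in Python, and the same schedule run in reverse gives a tree-based reduce. This is a minimal single-process sketch under stated assumptions, not an MPI implementation: the function names `broadcast_schedule` and `tree_reduce` are hypothetical, processing units are plain list indices, and every send in a round is assumed to happen in parallel.

```python
def broadcast_schedule(p):
    """Simulate the range-halving broadcast for units 0..p-1.

    Returns a list of rounds; each round is a list of (sender, receiver)
    pairs that are assumed to be sent in parallel.
    """
    rounds = []
    # Each (i, j) means: unit i already holds m and is responsible for i..j.
    ranges = [(0, p - 1)]
    while any(i < j for i, j in ranges):
        sends, next_ranges = [], []
        for i, j in ranges:
            if i == j:
                next_ranges.append((i, j))  # nothing left to delegate
                continue
            mid = (i + j + 1) // 2          # integer form of ceil((i + j) / 2)
            sends.append((i, mid))          # i sends m to unit mid
            next_ranges.append((i, mid - 1))  # i keeps i..mid-1
            next_ranges.append((mid, j))      # mid takes over mid..j
        rounds.append(sends)
        ranges = next_ranges
    return rounds


def tree_reduce(values, op):
    """Reduce by replaying the broadcast schedule backwards.

    Each former receiver sends its partial result to its former sender.
    Operand order is preserved, so op only needs to be associative.
    """
    vals = list(values)
    for sends in reversed(broadcast_schedule(len(vals))):
        for parent, child in sends:
            vals[parent] = op(vals[parent], vals[child])
    return vals[0]  # the result ends up on unit 0


if __name__ == "__main__":
    for number, sends in enumerate(broadcast_schedule(8), start=1):
        print("round", number, sends)
    print(tree_reduce([3, 1, 4, 1, 5, 9, 2, 6], max))
```

For $p = 8$ the simulated schedule has three rounds, which matches the $\log p$ factor in the non-pipelined bound; pipelining is deliberately left out of the sketch.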
Given $p$ processing units, message $m_{i}$ is on processing unit $p_{i}$ initially. All $m_{i}$ are aggregated by $\otimes$ and the result is eventually stored on $p_{0}$. The reduction operator $\otimes$ must be at least associative. Some algorithms require a commutative operator with a neutral element. Operators like sum, min and max are common. Implementation considerations are similar to broadcast (§ Broadcast). For pipelining on binary trees the message must be representable as a vector of smaller objects for component-wise reduction. Pipelined reduce on a balanced binary tree is possible in $\mathcal{O}(\alpha \log p+\beta n)$. == All-Reduce == The all-reduce pattern (also called allreduce) is used if the result of a reduce operation (§ Reduce) must be distributed to all processing units. Given $p$ processing units, message $m_{i}$ is on processing unit $p_{i}$ initially. All $m_{i}$ are aggregated by an operator $\otimes$ and the result is eventually stored on all $p_{i}$. Analogous to the reduce operation, the operator $\otimes$ must be at least associative. All-reduce can be interpreted as a reduce operation with a subsequent broadcast (§ Broadcast). For long messages a corresponding implementation is suitable, whereas for short messages, the latency can be reduced by using a hypercube (Hypercube (communication pattern) § All-Gather/ All-Reduce) topology, if $p$ is a power of two. All-reduce can also be implemented with a butterfly algorithm and achieve optimal latency and bandwidth.
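
The hypercube exchange mentioned for short messages can be illustrated with a single-process simulation. This is a minimal sketch under stated assumptions, not an MPI implementation: the exchange scheme (each unit combines with the partner whose rank differs in one bit, one bit per round) is the usual recursive-doubling pattern and is not spelled out in the text above, and the name `hypercube_all_reduce` is hypothetical.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def hypercube_all_reduce(values: List[T], op: Callable[[T, T], T]) -> List[T]:
    """Simulate all-reduce on p = len(values) units arranged as a hypercube.

    values[i] plays the role of message m_i on unit p_i. The operator only
    needs to be associative: operands are always combined in rank order.
    """
    p = len(values)
    if p == 0 or p & (p - 1):
        raise ValueError("p must be a power of two")
    current = list(values)
    dim = 1  # bit that distinguishes the two partners in this round
    while dim < p:
        nxt = []
        for rank in range(p):
            partner = rank ^ dim
            lo, hi = min(rank, partner), max(rank, partner)
            # Both partners end up with the same combined value.
            nxt.append(op(current[lo], current[hi]))
        current = nxt
        dim <<= 1  # log2(p) rounds in total
    return current
```

Each round stands for one message exchange per unit, so the simulation runs for log2(p) rounds, and afterwards every entry holds the full reduction.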
All-reduce is possible in $\mathcal{O}(\alpha \log p+\beta n)$, since reduce and broadcast are possible in $\mathcal{O}(\alpha \log p+\beta n)$ with pipelining on balanced binary trees. All-reduce implemented with a butterfly algorithm achieves the same asymptotic runtime. == Prefix-Sum/Scan == The prefix-sum or scan operation is used to collect data or partial results from different processing units and to compute intermediate results by an operator, which are stored on those processing units. It can be seen as a generalization of the reduce operation (§ Reduce). Given $p$ processing units, message $m_{i}$ is on processing unit $p_{i}$. The operator $\otimes$ must be at least associative, whereas some algorithms require also a commutative operator and a neutral element. Common operators are sum, min and max. Eventually processing unit $p_{i}$ stores the prefix sum $\otimes_{i'\le i}m_{i'}$. In the case of the so-called exclusive prefix sum, processing unit $p_{i}$ stores the prefix sum $\otimes_{i'<i}$ …
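
The binomial-tree delegation rule from the Broadcast section can be sketched as a small schedule generator. This is an illustrative simulation rather than a message-passing implementation: it only records who sends to whom in which round, and the names `binomial_broadcast_schedule` and `delegate` are hypothetical.

```python
from typing import List, Tuple


def binomial_broadcast_schedule(p: int) -> List[Tuple[int, int, int]]:
    """Return (round, sender, receiver) triples for a broadcast from unit 0.

    A unit responsible for units i..j sends the message to unit
    ceil((i + j) / 2), hands it the upper part of the range, and keeps
    the lower part for itself.
    """
    schedule: List[Tuple[int, int, int]] = []

    def delegate(i: int, j: int, rnd: int) -> None:
        if i == j:  # only the unit itself is left, nothing to send
            return
        mid = (i + j + 1) // 2  # integer form of ceil((i + j) / 2)
        schedule.append((rnd, i, mid))
        delegate(i, mid - 1, rnd + 1)  # sender keeps i..mid-1
        delegate(mid, j, rnd + 1)      # receiver takes over mid..j

    delegate(0, p - 1, 0)
    return sorted(schedule)
```

For p = 8 the sketch yields seven sends spread over three rounds, beginning with unit 0 sending to unit 4, which matches the log p depth of the tree.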

  • XOR swap algorithm

    XOR swap algorithm

    In computer programming, the exclusive or swap (sometimes shortened to XOR swap) is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required. The algorithm is primarily a novelty and a way of demonstrating properties of the exclusive or operation. It is sometimes discussed as a program optimization, but there are almost no cases where swapping via exclusive or provides benefit over the standard, obvious technique. == The algorithm == Conventional swapping requires the use of a temporary storage variable. Using the XOR swap algorithm, however, no temporary storage is needed. The algorithm consists of three steps: X := X XOR Y; Y := X XOR Y; X := X XOR Y. Since XOR is a commutative operation, either X XOR Y or Y XOR X can be used interchangeably in any of these three steps. Note that on some architectures the first operand of the XOR instruction specifies the target location at which the result of the operation is stored, preventing this interchangeability. The algorithm typically corresponds to three machine-code instructions, one per step of the pseudocode. In System/370 assembly, R1 and R2 are distinct registers, and each XR operation leaves its result in the register named in the first argument. In x86 assembly, values X and Y are in registers eax and ebx (respectively), and xor places the result of the operation in the first register (note: x86 supports the XCHG instruction, so using a triple XOR does not make sense on this architecture). In RISC-V assembly, values X and Y are in registers x10 and x11, and xor places the result of the operation in the first operand.
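
As a rough illustration of the three XOR steps in a high-level language, the following Python sketch swaps two cells of a list in place. It rests on some assumptions: list indices stand in for storage locations, and the function names `xor_swap` and `xor_swap_unguarded` are hypothetical. The guarded version returns early when both indices name the same cell; the unguarded version is included only to show what happens without that check.

```python
from typing import List


def xor_swap(cells: List[int], i: int, j: int) -> None:
    """Swap cells[i] and cells[j] in place using three XOR steps."""
    if i == j:
        # Same storage location: nothing to swap, and the XOR steps
        # would zero the cell, so exit early.
        return
    cells[i] ^= cells[j]  # X := X XOR Y
    cells[j] ^= cells[i]  # Y := Y XOR X, Y now holds the original X
    cells[i] ^= cells[j]  # X := X XOR Y, X now holds the original Y


def xor_swap_unguarded(cells: List[int], i: int, j: int) -> None:
    """Same three steps without the guard (for demonstration only)."""
    cells[i] ^= cells[j]
    cells[j] ^= cells[i]
    cells[i] ^= cells[j]


if __name__ == "__main__":
    data = [5, 9]
    xor_swap(data, 0, 1)
    print(data)  # the two values have traded places

    same = [7]
    xor_swap_unguarded(same, 0, 0)
    print(same)  # the cell has been zeroed
```

In ordinary Python code a tuple assignment such as `a, b = b, a` is the idiomatic swap; the sketch exists only to make the three steps concrete.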
However, in the pseudocode or high-level language version or implementation, the algorithm fails if x and y use the same storage location, since the value stored in that location will be zeroed out by the first XOR instruction, and then remain zero; it will not be "swapped with itself". This is not the same as if x and y have the same values. The trouble only comes when x and y use the same storage location, in which case their values must already be equal. That is, if x and y use the same storage location, then the step X := X XOR Y sets x to zero (because x = y, so X XOR Y is zero) and sets y to zero (since it uses the same storage location), causing x and y to lose their original values. == Proof of correctness == The binary operation XOR over bit strings of length $N$ exhibits the following properties (where $\oplus$ denotes XOR): L1. Commutativity: $A\oplus B=B\oplus A$. L2. Associativity: $(A\oplus B)\oplus C=A\oplus (B\oplus C)$. L3. Identity exists: there is a bit string, 0, (of length $N$) such that $A\oplus 0=A$ for any $A$. L4. Each element is its own inverse: for each $A$, $A\oplus A=0$. Suppose that we have two distinct registers R1 and R2 with initial values $A$ and $B$ respectively. We perform the three steps in sequence and reduce the results using the properties listed above. Step 1, R1 := R1 XOR R2, leaves R1 = $A\oplus B$ and R2 = $B$. Step 2, R2 := R1 XOR R2, gives R2 = $(A\oplus B)\oplus B=A\oplus (B\oplus B)=A\oplus 0=A$ by L2, L4 and L3. Step 3, R1 := R1 XOR R2, gives R1 = $(A\oplus B)\oplus A=(B\oplus A)\oplus A=B\oplus (A\oplus A)=B\oplus 0=B$ by L1, L2, L4 and L3. === Linear algebra interpretation === As XOR can be interpreted as binary addition and a pair of bits can be interpreted as a vector in a two-dimensional vector space over the field with two elements, the steps in the algorithm can be interpreted as multiplication by 2×2 matrices over the field with two elements. For simplicity, assume initially that x and y are each single bits, not bit vectors.
For example, the step X := X XOR Y, which also has the implicit Y := Y, corresponds to the matrix $\left(\begin{smallmatrix}1&1\\0&1\end{smallmatrix}\right)$ as $\begin{pmatrix}1&1\\0&1\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}=\begin{pmatrix}x+y\\y\end{pmatrix}$. The sequence of operations is then expressed as: $\begin{pmatrix}1&1\\0&1\end{pmatrix}\begin{pmatrix}1&0\\1&1\end{pmatrix}\begin{pmatrix}1&1\\0&1\end{pmatrix}=\begin{pmatrix}0&1\\1&0\end{pmatrix}$ (working with binary values, so $1+1=0$), which expresses the elementary matrix of switching two rows (or columns) in terms of the transvections (shears) of adding one element to the other. To generalize to where X and Y are not single bits, but instead bit vectors of length n, these 2×2 matrices are replaced by 2n×2n block matrices such as $\left(\begin{smallmatrix}I_{n}&I_{n}\\0&I_{n}\end{smallmatrix}\right)$. These matrices are operating on values, not on variables (with storage locations), hence this interpretation abstracts away from issues of storage location and the problem of both variables sharing the same storage location. == Code example == A C function that implements the XOR swap algorithm first checks whether the two addresses are distinct and uses a guard clause to exit the function early if they are equal. Without that check, if they were equal, the algorithm would fold to a triple x ^= x resulting in zero. == Reasons for avoidance in practice == On modern CPU architectures, the XOR technique can be slower than using a temporary variable to do swapping. At least on recent x86 CPUs, both by AMD and Intel, moving between registers regularly incurs zero latency. (This is called MOV-elimination.)
Even if there is not any architectural register available to use, the XCHG instruction will be at least as fast as the three XORs taken together. Another reason is that modern CPUs strive to execute instructions in parallel via instruction pipelines. In the XOR technique, the inputs to each operation depend on the results of the previous operation, so they must be executed in strictly sequential order, negating any benefits of instruction-level parallelism. === Aliasing === The XOR swap is also complicated in practice by aliasing. If an attempt is made to XOR-swap the contents of some location with itself, the result is that the location is zeroed out and its value lost. Therefore, XOR swapping must not be used blindly in a high-level language if aliasing is possible. This issue does not apply if the technique is used in assembly to swap the contents of two registers. Similar problems occur with call by name, as in Jensen's Device, where swapping i and A[i] via a temporary variable yields incorrect results due to the arguments being related: swapping via temp = i; i = A[i]; A[i] = temp changes the value for i in the second statement, which then results in the incorrect i value for A[i] in the third statement. == Variations == The underlying principle of the XOR swap algorithm can be applied to any operation meeting criteria L1 through L4 above. Replacing XOR by addition and subtraction gives various slightly different, but largely equivalent, formulations. For example: X := X + Y; Y := X − Y; X := X − Y. Unlike the XOR swap, this variation requires that the underlying processor or programming language uses a method such as modular arithmetic or bignums to guarantee that the computation of X + Y cannot cause an error due to integer overflow. Therefore, it is seen even more rarely in practice than the XOR swap.
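
The addition and subtraction variant can be sketched in Python with explicit modular arithmetic. Python integers never overflow, so the 32-bit mask below is an assumption introduced purely to imitate fixed-width unsigned arithmetic; the names `MASK_32` and `add_swap_u32` are hypothetical.

```python
from typing import Tuple

MASK_32 = 0xFFFFFFFF  # assumed word size: 32-bit unsigned values


def add_swap_u32(x: int, y: int) -> Tuple[int, int]:
    """Swap two 32-bit unsigned values using addition and subtraction.

    Every intermediate result is reduced modulo 2**32, so a sum that
    would not fit in 32 bits wraps around instead of causing an error.
    """
    x = (x + y) & MASK_32  # X := X + Y
    y = (x - y) & MASK_32  # Y := X - Y, Y now holds the original X
    x = (x - y) & MASK_32  # X := X - Y, X now holds the original Y
    return x, y
```

Even when the first sum wraps around, for instance with `x = 0xFFFFFFFF` and `y = 2`, the two values still come back exchanged, because all three steps take place in the same modular arithmetic.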
However, an implementation of AddSwap in the C programming language that operates on unsigned integers always works, even in case of integer overflow, since, according to the C standard, addition and subtraction of unsigned integers follow the rules of modular arithmetic, i.e. are done in the cyclic group $\mathbb{Z}/2^{s}\mathbb{Z}$ where $s$ is the number of bits of unsigned int. Indeed, the correctness of the algorithm follows from the fact that the formulas $(x+y)-y=x$ and $(x+y)-((x+y)-y)=y$ hold in any abelian group. This generalizes the proof for the XOR swap algorithm: XOR is both the addition and subtraction in the abelian group $(\mathbb{Z}/2\mathbb{Z})^{s}$ (which is the direct sum of $s$ copies of $\mathbb{Z}/2\mathbb{Z}$). This doesn't hold when dealing with the signed int type (the default for int). Signed integer overflow is undefined behavior in C and thus modular arithmetic is not guaranteed by the standard, which may lead to incorrect results. The sequence of operations in AddSwap can be expressed via matrix multiplication as: $\begin{pmatrix}1&-1\\0&1\end{pmatrix}\begin{pmatrix}1&0\\1&-1\end{pmatrix}\begin{pmatrix}1&1\\0&1\end{pmatrix}=\begin{pmatrix}0&1\\1&0\end{pmatrix}$ == Application to register allocation == On architectures lacking a dedicated swap instruction, because it avoids the extra temporary register, the XOR swap algorithm is required for optimal register allocatio

  • Environmental informatics

    Environmental informatics

    Environmental informatics is the science of information applied to environmental science. As such, it provides the information processing and communication infrastructure to the interdisciplinary field of environmental sciences aiming at data, information and knowledge integration, the application of computational intelligence to environmental data as well as the identification of environmental impacts of information technology. Environmental informatics thus acts as a bridge, providing an interdisciplinary means of analysing, describing and understanding the complex interactions between humans, nature and technology. Since each field of applied computer science has its own subject matter, terminology and methods, specialised disciplines, such as environmental, bio- and geoinformatics have emerged, each of which combines computer science with a specific field of application such as environmental, bio- or geosciences. Environmental informatics, bioinformatics and geoinformatics all deal with computer-based processing of environmental phenomena. However, environmental informatics is the only field that pursues normative goals (e.g., political goals of environmental protection, environmental planning, and sustainability). This also influences the choice of methods. This also distinguishes it from application areas such as numerical weather prediction, which is considered an early and important example of computer simulation of environmental phenomena. The UK Natural Environment Research Council defines environmental informatics as the "research and system development focusing on the environmental sciences relating to the creation, collection, storage, processing, modelling, interpretation, display and dissemination of data and information." Kostas Karatzas defined environmental informatics as the "creation of a new 'knowledge-paradigm' towards serving environmental management needs." 
Karatzas argued further that environmental informatics "is an integrator of science, methods and techniques and not just the result of using information and software technology methods and tools for serving environmental engineering needs." Environmental informatics emerged in the early 1990s in Central Europe. Current initiatives to effectively manage, share, and reuse environmental and ecological data are indicative of the increasing importance of fields like environmental informatics and ecoinformatics to develop the foundations for effectively managing ecological information. Examples of these initiatives are National Science Foundation Datanet projects, DataONE and Data Conservancy. == Subject matter and objectives == The subject of environmental informatics is environmental information systems (EIS). An EIS 'is a computer-based system that integrates and stores data collected about the natural environment and provides powerful methods for accessing and evaluating it.' This allows environmental data to be processed by computers for environmental protection, planning, research and technology. According to Jaeschke and Bossel, environmental informatics has three interrelated objectives: Environmental informatics serves to procure data and information for describing the state and development of the environment. Of particular importance is information that is needed to prevent or limit undesirable changes and to support desirable changes. Based on the evaluation and analysis of data, environmental informatics improves our understanding of the environment and the interactions between nature, technology and society. It thus supports environmentally relevant decisions. This enables the influence of development (system correction), the assessment of the effects and side effects of potential measures, and the creation of tools for the routine planning, implementation and monitoring of measures.
== History == The simulation model World3, which formed the basis of the highly acclaimed study The Limits to Growth, is considered the starting point of environmental informatics. It incorporated environmental information, among other things, to calculate scenarios for global development. In the mid-1980s, interest grew in structuring environmental protection as an area of application for computer science. One of the first publications in German was the book Informatik im Umweltschutz. Anwendungen und Perspektiven (Computer science in environmental protection. Applications and perspectives) from 1986. The term 'environmental informatics' did not appear until around 1993, which is why the development of environmental informatics is usually referred to as having taken place in the 1990s. In 1993, the first university chair for environmental informatics was established in Cottbus. In 1994, the anthology Umweltinformatik. Informatikmethoden für Umweltschutz und Umweltforschung (Environmental Informatics: Informatics Methods for Environmental Protection and Environmental Research) was published. The development of environmental informatics was 'primarily initiated by German computer science.' In the English-speaking world, the volume Environmental Informatics was published in 1995, mainly based on the German anthology of 1994. An article in the conference proceedings of the World Computer Congress of the International Federation for Information Processing (IFIP) in Hamburg in 1994 describes the initial situation of environmental informatics as follows: 'On the one hand, we suffer from the huge amount of available data – people sometimes speak of data graveyards – on the other hand, the really relevant data may still be missing.' This statement indicates the need that led to the emergence of environmental informatics as a specialised discipline of applied computer science. 
Furthermore, the specific characteristics and processing requirements of environmental data necessitated the emergence of environmental informatics. The special features of environmental data include: The data structures required are highly heterogeneous due to specific processes and differing perspectives on environmental aspects (e.g., water protection, emission control, hazardous substances). In addition to the heterogeneity of the data, heterogeneous databases also play a role, as environmental data is often obtained and presented in an interdisciplinary manner. Obligations change frequently as a result of new legislation, whether regional (e.g. state regulations on water protection), national (e.g. federal emission control regulations) or international (e.g. REACH, the Registration, Evaluation, Authorisation and Restriction of Chemicals regulation). The objects represented are often multidimensional and, therefore, require complex geometric representation using curves or polygons. It is often necessary to process uncertain, imprecise or incomplete data, which is, for example, the result of extrapolations or forecasts. A new "knowledge paradigm" has emerged to meet the requirements of environmental management. Environmental informatics produces its own concepts, methods and techniques and is not merely the result of using information and communication technology methods and tools to meet environmental requirements. The development of environmental informatics since the 1990s has been significantly influenced by the newly established conferences EnviroInfo, ISESS and ITEE and is documented in the respective proceedings. Aspects of sustainability and sustainable development were increasingly integrated into environmental informatics after 2000, thereby expanding the field. In 2004, the Working Group on Sustainable Information Society of the Gesellschaft für Informatik e. V.
(German Informatics Society, GI) published the Memorandum on a Sustainable Information Society, which formulates recommendations for an information society that is compatible with human, social and natural needs. Since 2007, environmental informatics has often been described in more detail as informatics for environmental protection, sustainable development and risk management. The increased focus on sustainability has also contributed to the formation of the research focus Information and Communications Technology for Sustainability (ICT4S) and to the emergence of the international conference ICT4S in 2013. ICT-ENSURE, the European Commission's funding measure for the establishment of a European research area on "ICT for Environmental Sustainability Research" (2008–2010), has also contributed to the structuring of environmental informatics. == Environmental informatics and sustainable development == Efforts to place environmental informatics within the context of sustainable development have been growing since 2000 and were significantly influenced by the Memorandum on a Sustainable Information Society. According to this Memorandum, the information society offers great but unevenly distributed opportunities for education, participation and intercultural understanding. In addition, the Memorandum highlighted the material and energy consumption of inf

  • Springpad

    Springpad

    Springpad was a free online application and web service that allowed its registered users to save, organize and share collected ideas and information. As users added content to their Springpad accounts, the application automatically identified and categorized it, then generated additional snippets based on the types of objects added, for example by listing price comparisons for products and showtimes for movies. Springpad was also available as apps on the iPad, iPhone and Android that synchronized with the Web interface. Springpad was bundled on new Toshiba notebook computers through a Web application subscription service. On May 23, 2014, Springpad announced that it would cease operations on June 25, 2014. The company then allowed users to export their data (as JSON and read-only HTML formats), or to automatically migrate it to Evernote accounts before the expiration date. == Features == Springpad users could use the main site interface, which used HTML5, from most browsers, or use the smartphone app to capture notes, tasks, or lists, which were then added to the user's "My Stuff", the user's personal database or collection. Additionally, Springpad let users look up items of interest, which were then automatically categorized based on type or manually categorized by the user. Category types included recipes, movies, products, restaurants and wine. Events could also be added to Springpad, and if the user used Google Calendar, they could opt to sync the event to it. In addition to the smartphone app and site, Springpad could be used via a browser extension for Google Chrome, or via the Springpad Clipper, a bookmarklet to analyze webpages and clip relevant information from them (for example, the ingredients needed for a recipe) or to add the site as a normal bookmark. Another way users could add content to their Springpad "My Stuff" was by emailing entries to an email address specified on Springpad registration.
Springpad's smartphone apps could be used to scan barcodes to identify products, save them to the user's "My Stuff", and automatically generate additional product information and links. The mobile app could also save images taken with the phone's camera, and locate nearby businesses. With most of the content added to a user's "My Stuff", relevant news, useful links and other helpful information could be viewed. Users could also attach additional notes and images to content they had already saved, and could add reminders and alerts which could be emailed to the user or texted to their phone. Springpad also added alerts to its own Alerts section for relevant news, deals or coupons for specific products users added. For additional organization, anything added to Springpad could also be tagged. Users could also add entries to "Notebooks" to separate content by projects, or any other way they wished. Each Notebook included a section called a "Board", which acted as a pin board where users could "pin" content they'd added to the Notebook, allowing them to visually lay out items. If the user added a map to the Board and had entries that included an address, Springpad could automatically point out entries on the map. By default, everything added to Springpad was private. However, users could change the privacy settings for each of the types of items added, decide to make specific items public and shareable on Facebook and Twitter, add them to their public page, or keep them private but share links to them with specific people.

  • MarkLogic Server

    MarkLogic Server

    MarkLogic Server is a document-oriented database developed by MarkLogic. It is a NoSQL multi-model database that evolved from an XML database to natively store JSON documents and RDF triples, the data model for semantics. MarkLogic is designed to be a data hub for operational and analytical data. == History == MarkLogic Server was built to address shortcomings with existing search and data products. The product first focused on using XML as the document markup standard and XQuery as the query standard for accessing collections of documents up to hundreds of terabytes in size. Currently the MarkLogic platform is widely used in publishing, government, finance and other sectors. MarkLogic's customers are mostly Global 2000 companies. == Technology == MarkLogic uses documents without upfront schemas to maintain a flexible data model. In addition to having a flexible data model, MarkLogic uses a distributed, scale-out architecture that can handle hundreds of billions of documents and hundreds of terabytes of data. It has received Common Criteria certification, and has high availability and disaster recovery. MarkLogic is designed to run on-premises and within public or private cloud environments like Amazon Web Services. == Features == Indexing: MarkLogic indexes the content and structure of documents including words, phrases, relationships, and values in over 200 languages with tokenization, collation, and stemming for core languages. Functionality includes the ability to toggle range indexes, geospatial indexes, the RDF triple index, and reverse indexes on or off based on your data, the kinds of queries that you will run, and your desired performance. Full-text search: MarkLogic supports search across its data and metadata using a word or phrase and incorporates Boolean logic, stemming, wildcards, case sensitivity, punctuation sensitivity, diacritic sensitivity, and search term weighting. Data can be searched using JavaScript, XQuery, SPARQL, and SQL.
Semantics: MarkLogic uses RDF triples to provide semantics for ease of storing metadata and querying. ACID: Unlike other NoSQL databases, MarkLogic maintains ACID consistency for transactions. Replication: MarkLogic provides high availability with replica sets. Scalability: MarkLogic scales horizontally using sharding. MarkLogic can run over multiple servers, balancing the load or replicating data to keep the system up and running in the event of hardware failure. Security: MarkLogic has built-in security features, including redaction, encryption, and element-level security (allowing control of read and write rights on parts of a document). Optic API for Relational Operations: An API that lets developers view their data as documents, graphs or rows. == Applications == Banking; Big Data; Fraud prevention; Insurance claims management and underwriting; Master data management; Recommendation engines. == Licensing == MarkLogic is available under various licensing and delivery models, namely a free Developer or an Essential Enterprise license.[3] Licenses are available from MarkLogic or directly from cloud marketplaces such as Amazon Web Services and Microsoft Azure.
== Releases == 2001 – Cerisent XQE 1: ACID transactions, Full-text search, XML Storage, XQuery, Role-based security; 2004 – Cerisent XQE 2: Scale-out architecture, Enhanced search (stemming, thesaurus, wildcard), Backup and restore; 2005 – MarkLogic Server 3: Continuing search improvements, Content Processing Framework (including PDF, Word, Excel, PPT), Failover; 2008 – MarkLogic Server 4: Geospatial search, entity extraction, advanced XQuery, performance, scalability enhancements, auditing; 2011 – MarkLogic Server 5: Flexible replication / DDIL, real-time indexing, advanced search, improved analytics, concurrency enhancements; 2012 – MarkLogic Server 6: REST and Java APIs, App Builder, enhanced UI, improved search; 2013 – MarkLogic Server 7: Semantic graph, bitemporal data, tiered storage, improved search, better management; 2015 – MarkLogic Server 8: Native JSON storage, Server-side JavaScript, Bitemporal, Node.js client API, Incremental backup, Flexible replication[16]; 2017 – MarkLogic Server 9: Data integration across Relational and Non-Relational data, Advanced Encryption, Element Level Security, Redaction; 2019 – MarkLogic Server 10: Enhanced Data Hub, improved SQL, security, analytics performance, cloud support; 2022 – MarkLogic Server 11: MarkLogic Ops Director (Monitoring and Administration Improvements), expanded PKI; 2025 – MarkLogic Server 12: Generative AI and Native Vector Search, Graph Algorithm Support, Virtual TDEs (relational views on the fly)

  • Enumeration algorithm

    Enumeration algorithm

    In computer science, an enumeration algorithm is an algorithm that enumerates the answers to a computational problem. Formally, such an algorithm applies to problems that take an input and produce a list of solutions, similarly to function problems. For each input, the enumeration algorithm must produce the list of all solutions, without duplicates, and then halt. The performance of an enumeration algorithm is measured in terms of the time required to produce the solutions, either in terms of the total time required to produce all solutions, or in terms of the maximal delay between two consecutive solutions and in terms of a preprocessing time, counted as the time before outputting the first solution. This complexity can be expressed in terms of the size of the input, the size of each individual output, or the total size of the set of all outputs, similarly to what is done with output-sensitive algorithms. == Formal definitions == An enumeration problem $P$ is defined as a relation $R$ over strings of an arbitrary alphabet $\Sigma$: $R\subseteq \Sigma^{*}\times \Sigma^{*}$. An algorithm solves $P$ if for every input $x$ the algorithm produces the (possibly infinite) sequence $y$ such that $y$ has no duplicate and $z\in y$ if and only if $(x,z)\in R$. The algorithm should halt if the sequence $y$ is finite. == Common complexity classes == Enumeration problems have been studied in the context of computational complexity theory, and several complexity classes have been introduced for such problems. A very general such class is EnumP, the class of problems for which the correctness of a possible output can be checked in polynomial time in the input and output.
Formally, for such a problem, there must exist an algorithm A which takes as input the problem input x, the candidate output y, and solves the decision problem of whether y is a correct output for the input x, in polynomial time in x and y. For instance, this class contains all problems that amount to enumerating the witnesses of a problem in the class NP. Other classes that have been defined include the following. In the case of problems that are also in EnumP, these problems are ordered from least to most specific: Output polynomial, the class of problems whose complete output can be computed in polynomial time. Incremental polynomial time, the class of problems where, for all i, the i-th output can be produced in polynomial time in the input size and in the number i. Polynomial delay, the class of problems where the delay between two consecutive outputs is polynomial in the input (and independent from the output). Strongly polynomial delay, the class of problems where the delay before each output is polynomial in the size of this specific output (and independent from the input or from the other outputs). The preprocessing is generally assumed to be polynomial. Constant delay, the class of problems where the delay before each output is constant, i.e., independent from the input and output. The preprocessing phase is generally assumed to be polynomial in the input. == Common techniques == Backtracking: The simplest way to enumerate all solutions is by systematically exploring the space of possible results (partitioning it at each successive step). However, performing this may not give good guarantees on the delay, i.e., a backtracking algorithm may spend a long time exploring parts of the space of possible results that do not give rise to a full solution. 
Flashlight search: This technique improves on backtracking by exploring the space of all possible solutions while solving, at each step, the problem of whether the current partial solution can be extended to a complete solution. If the answer is no, then the algorithm can immediately backtrack and avoid wasting time, which makes it easier to show guarantees on the delay between any two complete solutions. In particular, this technique applies well to self-reducible problems. Closure under set operations: If we wish to enumerate the disjoint union of two sets, then we can solve the problem by enumerating the first set and then the second set. If the union is not disjoint but the sets can be enumerated in sorted order, then the enumeration can be performed in parallel on both sets while eliminating duplicates on the fly. If the union is not disjoint and the sets are not sorted, then duplicates can be eliminated at the expense of higher memory usage, e.g., using a hash table. Likewise, the Cartesian product of two sets can be enumerated efficiently by enumerating one set and joining each result with all results obtained when enumerating the second set. == Examples of enumeration problems == The vertex enumeration problem, where we are given a polytope described as a system of linear inequalities and we must enumerate the vertices of the polytope. Enumerating the minimal transversals of a hypergraph. This problem is related to monotone dualization and is connected to many applications in database theory and graph theory. Enumerating the answers to a database query, for instance a conjunctive query or a query expressed in monadic second-order logic. There have been characterizations in database theory of which conjunctive queries can be enumerated with linear preprocessing and constant delay.
The problem of enumerating maximal cliques in an input graph, e.g., with the Bron–Kerbosch algorithm. Listing all elements of structures such as matroids and greedoids. Several problems on graphs, e.g., enumerating independent sets, paths, cuts, etc. Enumerating the satisfying assignments of representations of Boolean functions, e.g., a Boolean formula written in conjunctive normal form or disjunctive normal form, a binary decision diagram such as an OBDD, or a Boolean circuit in restricted classes studied in knowledge compilation, e.g., NNF. == Connection to computability theory == The notion of enumeration algorithms is also used in the field of computability theory to define some high complexity classes such as RE, the class of all recursively enumerable problems. This is the class of sets for which there exists an enumeration algorithm that will produce all elements of the set: the algorithm may run forever if the set is infinite, but each element must be produced by the algorithm after a finite time.
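
As an illustration of flashlight search, the following is a minimal Python sketch that enumerates the satisfying assignments of a formula in disjunctive normal form. It is not taken from the article: the representation of a term as a dictionary from variable index to required value, and the names `extendable` and `enumerate_dnf`, are assumptions made for this example. The extension test is cheap for DNF because a partial assignment can be completed to a satisfying one exactly when some term does not contradict it, so every branch that is explored leads to at least one solution and the delay between solutions stays polynomial in the formula size.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical representation: a term maps a variable index to the value it requires.
Term = Dict[int, bool]


def extendable(terms: List[Term], partial: Dict[int, bool]) -> bool:
    """True if some term agrees with `partial` on every variable that both mention."""
    return any(
        all(partial.get(var, val) == val for var, val in term.items())
        for term in terms
    )


def enumerate_dnf(terms: List[Term], n_vars: int) -> Iterator[Tuple[bool, ...]]:
    """Yield each satisfying assignment of the DNF exactly once."""
    partial: Dict[int, bool] = {}

    def search(var: int) -> Iterator[Tuple[bool, ...]]:
        if var == n_vars:
            # Every variable is fixed, and the extension test already passed.
            yield tuple(partial[i] for i in range(n_vars))
            return
        for value in (False, True):
            partial[var] = value
            # Flashlight step: descend only if a full solution exists below.
            if extendable(terms, partial):
                yield from search(var + 1)
            del partial[var]

    if extendable(terms, partial):
        yield from search(0)


# (x0 AND NOT x1) OR x2, over three variables.
formula = [{0: True, 1: False}, {2: True}]
solutions = list(enumerate_dnf(formula, 3))
```

Plain backtracking would correspond to dropping the `extendable` check and testing the formula only at the leaves, which can spend exponential time in subtrees that contain no solution.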

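The closure-under-union technique for two sets that can each be enumerated in sorted order can be sketched in the same way. This is an illustrative sketch rather than a reference implementation: the function name `sorted_union` is hypothetical, and it assumes both inputs yield mutually comparable items in ascending order. It relies on the standard-library `heapq.merge`, which merges sorted iterables lazily, so only the most recently emitted item needs to be remembered in order to drop duplicates.

```python
import heapq
from itertools import count, islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def sorted_union(first: Iterable[T], second: Iterable[T]) -> Iterator[T]:
    """Enumerate the union of two ascending enumerations without duplicates."""
    previous = object()  # sentinel that compares unequal to every real item
    for item in heapq.merge(first, second):
        # Equal items are adjacent in the merged order, so one comparison suffices.
        if item != previous:
            yield item
            previous = item


# Works on infinite enumerations too: multiples of 2 merged with multiples of 3.
first_eight = list(islice(sorted_union(count(0, 2), count(0, 3)), 8))
```

When the inputs are not sorted, the same interface could instead keep a set of the items already emitted, trading memory for the ability to detect duplicates.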