AI Coding Github

AI Coding Github — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • FIRST Global Challenge

    FIRST Global Challenge

    The FIRST Global Challenge is a yearly robotics competition organized by the International First Committee Association. It promotes STEM education and careers for youth and was created by Dean Kamen in 2016 as an expansion of FIRST, an organization with similar objectives. == History == FIRST Global is a trade name for the International First Committee Association, a nonprofit corporation based in Manchester, New Hampshire, with a 501(c)(3) designation from the IRS. The nonprofit was founded by the co-founder of FIRST, Dean Kamen, with the objective of promoting STEM education and careers in the developing world through Olympics-style robotics competitions. Former US Congressman, Joe Sestak was the organization's president in 2017, but left after the 2017 Challenge. Each year, the FIRST Global Challenge is held in a different city. For example, Mexico City was selected to host the 2018 Challenge after the United States hosted the 2017 edition in Washington, DC. This is a change from FIRST's system of championships, where one city hosts for several years at a time. In May 2020, it was announced that FIRST Global would not host a traditional challenge in 2020 due to the COVID-19 pandemic and shifted to a remote model. One of the three champions were Team Bangladesh. In 2022, FIRST Global returned to in-person events with the 2022 Challenge in Geneva, Switzerland. == Editions == === Washington, D.C. 2017 === The 2017 FIRST Global Challenge was held in Washington, D.C., from July 16–18, and the challenge was the use of robots to separate different colored balls, representing clean water and impurities in water, symbolizing the Engineering Grand Challenge (based on the Millennium Development Goal) of improving access to clean water in the developing world. Around 160 teams composed of 15- to 18-year-olds from 157 countries participated, and around 60% of teams were created or led by young women. Six continental teams also participated. === Mexico City 2018 === The 2018 FIRST Global Challenge was held in Mexico City from August 15–18. The 2018 Challenge was called Energy Impact and explored the impact of various types of energy on the world and how they can be made more sustainable. In the challenge, robots worked together in teams of three to give cubes to human players, turn a crank, and score cubes in goals in order to generate electrical power. The challenge was based on three Engineering Grand Challenges; making solar energy affordable, making fusion energy a reality, and creating carbon sequestration methods. === Dubai 2019 === The 2019 challenge, called Ocean Opportunities, was held in Dubai from October 24–27 and was the first challenge hosted outside of North America. The challenge was themed around clearing the ocean of pollutants, and had two alliances of three teams each attempting to score large and small balls representing pollutants into processing areas and a processing barge. The processing barge had multiple levels, with higher levels worth more points. At the end of the match, robots "docked" with the barge by driving onto or climbing up it, with climbing worth more points. The event was opened by Sheikh Hamdan bin Mohammed Al Maktoum, Crown Prince of Dubai. === Geneva 2022 === The 2022 challenge called Carbon Capture, was held in Geneva from October 13–16. The challenge was themed around removing carbon dioxide (CO2) emissions from the atmosphere. In the Carbon Capture game, six different countries worked together to capture and store black balls representing carbon particles. The storage tower had multiple cantilevered bars that the robots mounted to, with the higher bars worth a greater multiplier. At the end of a match, robots "docked" on the storage tower's base or climbed the bars with their alliance indicator ball. Each match started with a "global alliance" of six countries, then divided into two "regional alliances" each consisting of three countries. The event was opened by Dr. Martina Hirayama, Switzerland State Secretary for Education, Research and Innovation (SERI). === Singapore 2023 === The 2023 challenge, called Hydrogen Horizons, was held in Singapore from October 7–10. The challenge is themed around renewable energy with a focus on hydrogen technologies. === Athens 2024 === The 2024 challenge was hosted in the Peace and Friendship Stadium in Attica, Greece. === Panama 2025 === The 2025 challenge, Eco Equilibrium, was hosted in the Panama Convention Centre in Panama City, Panama. == Subordinate programs == === Global STEM Corps === The Global STEM Corps is a FIRST Global initiative that connects qualified volunteer mentors with students in developing countries to prepare them for competitions. === New Technology Experience === The New Technology Experience (NTE) is an annual component of the FIRST Global Challenge that was added to the organization's offerings in 2021. It was established as a means for the student community to stay current with cutting-edge technology and is integrated with each year's theme. The 2021 NTE was the CubeSat Prototype Challenge. The 2022 NTE, Carbon Countermeasures, was presented in partnership with XPRIZE.

    Read more →
  • Data quality

    Data quality

    Data quality refers to the state of qualitative or quantitative pieces of information. There are many definitions of data quality, but data is generally considered high quality if it is "fit for [its] intended uses in operations, decision making and planning". Data is deemed of high quality if it correctly represents the real-world construct to which it refers. Apart from these definitions, as the number of data sources increases, the question of internal data consistency becomes significant, regardless of fitness for use for any particular external purpose. People's views on data quality can often be in disagreement, even when discussing the same set of data used for the same purpose. When this is the case, businesses may adopt recognised international standards for data quality (See #International Standards for Data Quality below). Data governance can also be used to form agreed upon definitions and standards, including international standards, for data quality. In such cases, data cleansing, including standardization, may be required in order to ensure data quality. == Definitions == Defining data quality is difficult due to the many contexts data are used in, as well as the varying perspectives among end users, producers, and custodians of data. From a consumer perspective, data quality is: "data that are fit for use by data consumers" data "meeting or exceeding consumer expectations" data that "satisfies the requirements of its intended use" From a business perspective, data quality is: data that are "'fit for use' in their intended operational, decision-making and other roles" or that exhibits "'conformance to standards' that have been set, so that fitness for use is achieved" data that "are fit for their intended uses in operations, decision making and planning" "the capability of data to satisfy the stated business, system, and technical requirements of an enterprise" From a standards-based perspective, data quality is: the "degree to which a set of inherent characteristics (quality dimensions) of an object (data) fulfills requirements" "the usefulness, accuracy, and correctness of data for its application" Arguably, in all these cases, "data quality" is a comparison of the actual state of a particular set of data to a desired state, with the desired state being typically referred to as "fit for use," "to specification," "meeting consumer expectations," "free of defect," or "meeting requirements." These expectations, specifications, and requirements are usually defined by one or more individuals or groups, standards organizations, laws and regulations, business policies, or software development policies. == Dimensions of data quality == Drilling down further, those expectations, specifications, and requirements are stated in terms of characteristics or dimensions of the data, such as: accessibility or availability accuracy or correctness comparability completeness or comprehensiveness consistency, coherence, or clarity credibility, reliability, or reputation flexibility plausibility relevance, pertinence, or usefulness timeliness or latency uniqueness validity or reasonableness A systematic scoping review of the literature suggests that data quality dimensions and methods with real world data are not consistent in the literature, and as a result quality assessments are challenging due to the complex and heterogeneous nature of these data. == International standards for data quality == ISO 8000 is an international standard for data quality. Managed by the International Organization for Standardization, the ISO 8000 standards address and describe general aspects of data quality including principles, vocabulary and measurement data governance data quality management data quality assessment quality of master data, including exchange of characteristic data and identifiers quality of industrial data == History == Before the rise of the inexpensive computer data storage, massive mainframe computers were used to maintain name and address data for delivery services. This was so that mail could be properly routed to its destination. The mainframes used business rules to correct common misspellings and typographical errors in name and address data, as well as to track customers who had moved, died, gone to prison, married, divorced, or experienced other life-changing events. Government agencies began to make postal data available to a few service companies to cross-reference customer data with the National Change of Address registry (NCOA). This technology saved large companies millions of dollars in comparison to manual correction of customer data. Large companies saved on postage, as bills and direct marketing materials made their way to the intended customer more accurately. Initially sold as a service, data quality moved inside the walls of corporations, as low-cost and powerful server technology became available. Companies with an emphasis on marketing often focused their quality efforts on name and address information, but data quality is recognized as an important property of all types of data. Principles of data quality can be applied to supply chain data, transactional data, and nearly every other category of data found. For example, making supply chain data conform to a certain standard has value to an organization by: 1) avoiding overstocking of similar but slightly different stock; 2) avoiding false stock-out; 3) improving the understanding of vendor purchases to negotiate volume discounts; and 4) avoiding logistics costs in stocking and shipping parts across a large organization. For companies with significant research efforts, data quality can include developing protocols for research methods, reducing measurement error, bounds checking of data, cross tabulation, modeling and outlier detection, verifying data integrity, etc. == Overview == There are a number of theoretical frameworks for understanding data quality. A systems-theoretical approach influenced by American pragmatism expands the definition of data quality to include information quality, and emphasizes the inclusiveness of the fundamental dimensions of accuracy and precision on the basis of the theory of science (Ivanov, 1972). One framework, dubbed "Zero Defect Data" (Hansen, 1991) adapts the principles of statistical process control to data quality. Another framework seeks to integrate the product perspective (conformance to specifications) and the service perspective (meeting consumers' expectations) (Kahn et al. 2002). Another framework is based in semiotics to evaluate the quality of the form, meaning and use of the data (Price and Shanks, 2004). One highly theoretical approach analyzes the ontological nature of information systems to define data quality rigorously (Wand and Wang, 1996). A considerable amount of data quality research involves investigating and describing various categories of desirable attributes (or dimensions) of data. Nearly 200 such terms have been identified and there is little agreement in their nature (are these concepts, goals or criteria?), their definitions or measures (Wang et al., 1993). Software engineers may recognize this as a similar problem to "ilities". MIT has an Information Quality (MITIQ) Program, led by Professor Richard Wang, which produces a large number of publications and hosts a significant international conference in this field (International Conference on Information Quality, ICIQ). This program grew out of the work done by Hansen on the "Zero Defect Data" framework (Hansen, 1991). In practice, data quality is a concern for professionals involved with a wide range of information systems, ranging from data warehousing and business intelligence to customer relationship management and supply chain management. One industry study estimated the total cost to the U.S. economy of data quality problems at over U.S. $600 billion per annum (Eckerson, 2002). Incorrect data – which includes invalid and outdated information – can originate from different data sources – through data entry, or data migration and conversion projects. In 2002, the USPS and PricewaterhouseCoopers released a report stating that 23.6 percent of all U.S. mail sent is incorrectly addressed. One reason contact data becomes stale very quickly in the average database – more than 45 million Americans change their address every year. In fact, the problem is such a concern that companies are beginning to set up a data governance team whose sole role in the corporation is to be responsible for data quality. In some organizations, this data governance function has been established as part of a larger Regulatory Compliance function - a recognition of the importance of Data/Information Quality to organizations. Problems with data quality don't only arise from incorrect data; inconsistent data is a problem as well. Eliminating data shadow systems and centralizing data in a warehouse is one of the initiatives a company can take to ensure data consistency. En

    Read more →
  • Novell File Reporter

    Novell File Reporter

    Novell File Reporter (NFR) is software that allows network administrators to identify files stored on the network and generates reports regarding the size of individual files, file type, when files were last accessed, and where duplicates exist. Additionally, the File Reporter tracks storage volume capacity and usage. It is a component of the Novell File Management Suite. == How it works == Novell File Reporter examines and reports on terabytes of data via a central reporting engine (NFR Engine) and distributed agents (NFR Agents). The NFR Engine schedules the scans of file instances conducted by NFR Agents, processes and compiles the scans for reporting purposes, and provides report information to the user interface. In addition to the standard reports it can generate, the NFR Engine can also produce "trigger reports" in response to specific events (a server volume crossing a capacity threshold, for example). Accordingly, the NFR Engine monitors the data gathered by the NFR Agents in order to identify these "triggers." The NFR Engine when working in either eDirectory or Active Directory connects to the directory via a Directory Services Interface (DSI) and thus can monitor and check file permissions.

    Read more →
  • Document

    Document

    A document is a written, drawn, presented, or memorialized representation of thought, often the manifestation of non-fictional, as well as fictional, content. The etymology of the word "document" derives from the Latin documentum, which denotes a "teaching" or "lesson": the verb doceō denotes "to teach". Historically, the term "document" was usually used to indicate written proof useful as evidence of a truth or fact. In the Computer Age, the term "document" typically refers to a primarily textual computer file, encompassing its structural and format elements, such as fonts, colors, and images. In the contemporary era, the definition of "document" has expanded beyond its traditional medium, such as paper, to encompass electronic documents as well. History, events, examples, opinions, stories, and creativity can all be expressed in documents. "Documentation" is distinct because it has more denotations than "document". Documents are also distinguished from "realia", which are three-dimensional objects that would otherwise satisfy the definition of "document" because they memorialize or represent thought. Documents are usually considered to be two-dimensional representations. == Abstract definitions == The concept of "document" has been defined by Suzanne Briet as "any concrete or symbolic indication, preserved or recorded, for reconstructing or for proving a phenomenon, whether physical or mental." An often-cited article concludes that "the evolving notion of document" among Jonathan Priest, Paul Otlet, Briet, Walter Schürmeyer, and the other documentalists increasingly emphasized whatever functioned as a document rather than traditional physical forms of documents. The shift to digital technology would seem to make this distinction even more important. David M. Levy has said that an emphasis on the technology of digital documents has impeded our understanding of digital documents as documents. A conventional document, such as a mail message or a technical report, exists physically in digital technology as a string of bits, as does everything else in a digital environment. As an object of study, it has been made into a document. It has become physical evidence by those who study it. "Document" is defined in library and information science and documentation science as a fundamental, abstract idea: the word denotes everything that may be represented or memorialized to serve as evidence. The classic example provided by Briet is an antelope: "An antelope running wild on the plains of Africa should not be considered a document[;] she rules. But if it were to be captured, taken to a zoo and made an object of study, it has been made into a document. It has become physical evidence being used by those who study it. Indeed, scholarly articles written about the antelope are secondary documents, since the antelope itself is the primary document." This opinion has been interpreted as an early expression of actor–network theory. == Kinds == A document can be structured, like tabular documents, lists, forms, or scientific charts, semi-structured like a book or a newspaper article, or unstructured like a handwritten note. Documents are sometimes classified as secret, private, or public. They may also be described as drafts or proofs. When a document is copied, the source is denominated the "original". Documents are used in numerous fields, e.g.: Academia: manuscript, thesis, paper, journal, chart, and technical drawing Media: mock-up, script, image, photography, and newspaper article Administration, law, and politics: application, brief, certificate, commission, constitutional document, form, gazette, identity document, license, manifesto, summons, census, and white paper Business: invoice, request for proposal, proposal, contract, packing slip, manifest, report (detailed and summary), spreadsheet, material safety data sheet, waybill, bill of lading, financial statement, nondisclosure agreement (NDA), mutual nondisclosure agreement, and user guide Geography and planning: topographic map, cadastre, legend, and architectural plan Such standard documents can be drafted based on a template. == Drafting == The page layout of a document is how information is graphically arranged in the space of the document, e.g., on a page. If the appearance of the document is of concern, the page layout is generally the responsibility of a graphic designer. Typography concerns the design of letter and symbol forms and their physical arrangement in the document (see typesetting). Information design concerns the effective communication of information, especially in industrial documents and public signs. Simple textual documents may not require visual design and may be drafted only by an author, clerk, or transcriber. Forms may require a visual design for their initial fields, but not to complete the forms. == Media == Traditionally, the medium of a document was paper and the information was applied to it in ink, either by handwriting (to make a manuscript) or by a mechanical process (e.g., a printing press or laser printer). Today, some short documents also may consist of sheets of paper stapled together. Historically, documents were inscribed with ink on papyrus (starting in ancient Egypt) or parchment; scratched as runes or carved on stone using a sharp tool, e.g., the Tablets of Stone described in the Bible; stamped or incised in clay and then baked to make clay tablets, e.g., in the Sumerian and other Mesopotamian civilizations. The papyrus or parchment was often rolled into a scroll or cut into sheets and bound into a codex (book). Contemporary electronic means of memorializing and displaying documents include: Monitor of a desktop computer, laptop, tablet; optionally with a printer to produce a hard copy; Personal digital assistant; Dedicated e-book device; Electronic paper, typically, using the Portable Document Format (PDF); Information appliance; Digital audio player; and Radio and television service provider. Digital documents usually require a specific file format to be presentable in a specific medium. == In law == Documents in all forms frequently serve as material evidence in criminal and civil proceedings. The forensic analysis of such a document is within the scope of questioned document examination. To catalog and manage the large number of documents that may be produced during litigation, Bates numbering is often applied to all documents in the lawsuit so that each document has a unique, arbitrary, identification number.

    Read more →
  • And–or tree

    And–or tree

    An and–or tree is a graphical representation of the reduction of problems (or goals) to conjunctions and disjunctions of subproblems (or subgoals). == Example == The and–or tree: represents the search space for solving the problem P, using the goal-reduction methods: P if Q and R P if S Q if T Q if U == Definitions == Given an initial problem P0 and set of problem solving methods of the form: P if P1 and … and Pn the associated and–or tree is a set of labelled nodes such that: The root of the tree is a node labelled by P0. For every node N labelled by a problem or sub-problem P and for every method of the form P if P1 and ... and Pn, there exists a set of children nodes N1, ..., Nn of the node N, such that each node Ni is labelled by Pi. The nodes are conjoined by an arc, to distinguish them from children of N that might be associated with other methods. A node N, labelled by a problem P, is a success node if there is a method of the form P if nothing (i.e., P is a "fact"). The node is a failure node if there is no method for solving P. If all of the children of a node N, conjoined by the same arc, are success nodes, then the node N is also a success node. Otherwise the node is a failure node. == Search strategies == An and–or tree specifies only the search space for solving a problem. Different search strategies for searching the space are possible. These include searching the tree depth-first, breadth-first, or best-first using some measure of desirability of solutions. The search strategy can be sequential, searching or generating one node at a time, or parallel, searching or generating several nodes in parallel. == Relationship with logic programming == The methods used for generating and–or trees are propositional logic programs (without variables). In the case of logic programs containing variables, the solutions of conjoint sub-problems must be compatible. Subject to this complication, sequential and parallel search strategies for and–or trees provide a computational model for executing logic programs. == Relationship with two-player games == And–or trees can also be used to represent the search spaces for two-person games. The root node of such a tree represents the problem of one of the players winning the game, starting from the initial state of the game. Given a node N, labelled by the problem P of the player winning the game from a particular state of play, there exists a single set of conjoint children nodes, corresponding to all of the opponents responding moves. For each of these children nodes, there exists a set of non-conjoint children nodes, corresponding to all of the player's defending moves. For solving game trees with proof-number search family of algorithms, game trees are to be mapped to and–or trees. MAX-nodes (i.e. maximizing player to move) are represented as OR nodes, MIN-nodes map to AND nodes. The mapping is possible, when the search is done with only a binary goal, which usually is "player to move wins the game".

    Read more →
  • Algorism

    Algorism

    Algorism is the technique of performing basic arithmetic by writing numbers in place value form and applying a set of memorized rules and facts to the digits. One who practices algorism is known as an algorist. This positional notation system has largely superseded earlier calculation systems that used a different set of symbols for each numerical magnitude, such as Roman numerals, and in some cases required a device such as an abacus. == Etymology == The word algorism comes from the name Al-Khwārizmī (c. 780–850), a Persian mathematician, astronomer, geographer and scholar in the House of Wisdom in Baghdad, whose name means "the native of Khwarezm", which is now in modern-day Uzbekistan. He wrote a treatise in Arabic language in the 9th century, which was translated into Latin in the 12th century under the title Algoritmi de numero Indorum. This title means "Algoritmi on the numbers of the Indians", where "Algoritmi" was the translator's Latinization of Al-Khwarizmi's name. Al-Khwarizmi was the most widely read mathematician in Europe in the late Middle Ages, primarily through his other book, the Algebra. In late medieval Latin, algorismus, the corruption of his name, simply meant the "decimal number system" that is still the meaning of modern English algorism. During the 17th century, the French form for the word – but not its meaning – was changed to algorithm, following the model of the word logarithm, this form alluding to the ancient Greek arithmos = number. English adopted the French very soon afterwards, but it wasn't until the late 19th century that "algorithm" took on the meaning that it has in modern English. In English, it was first used about 1230 and then by Chaucer in 1391. Another early use of the word is from 1240, in a manual titled Carmen de Algorismo composed by Alexandre de Villedieu. It begins thus: Haec algorismus ars praesens dicitur, in qua / Talibus Indorum fruimur bis quinque figuris. which translates as: This present art, in which we use those twice five Indian figures, is called algorismus. The word algorithm also derives from algorism, a generalization of the meaning to any set of rules specifying a computational procedure. Occasionally algorism is also used in this generalized meaning, especially in older texts. == History == Starting with the integer arithmetic developed in India using base 10 notation, Al-Khwārizmī along with other mathematicians in medieval Islam, documented new arithmetic methods and made many other contributions to decimal arithmetic (see the articles linked below). These included the concept of the decimal fractions as an extension of the notation, which in turn led to the notion of the decimal point. This system was popularized in Europe by Leonardo of Pisa, now known as Fibonacci.

    Read more →
  • AI-assisted reverse engineering

    AI-assisted reverse engineering

    AI-assisted reverse engineering (AIARE) is a branch of computer science that leverages artificial intelligence (AI), notably machine learning (ML) strategies, to augment and automate the process of reverse engineering. The latter involves breaking down a product, system, or process to comprehend its structure, design, and functionality. AIARE was primarily introduced in the early years of the 21st century, witnessing substantial advancements from the mid-2010s onwards. == Overview == Conventionally, reverse engineering is conducted by specialists who dismantle a system to grasp its working principles, often for the purposes of reproduction, modification, enhancement of compatibility, or forensic examination. This method, while efficient, can be laborious and time-intensive, particularly when dealing with intricate software or hardware systems. AIARE integrates machine learning algorithms to either partially automate or augment this process. It is capable of detecting patterns, relationships, structures, and potential vulnerabilities within the analyzed system, frequently surpassing human experts in speed and accuracy. This has rendered AIARE a critical tool in numerous fields, including cybersecurity, software development, and hardware design and analysis. == Techniques == AIARE encompasses several AI methodologies: === Supervised learning === Supervised learning employs tagged data to train models to recognize system components, their operations, and their interconnections. This method is particularly helpful in software analysis to discover vulnerabilities or enhance compatibility. === Unsupervised learning === Unsupervised learning is utilized to detect concealed patterns and structures in untagged data. It proves beneficial in comprehending complex systems where there's no evident labeling or mapping of components. === Reinforcement learning === Reinforcement learning is employed to build models that progressively refine their system understanding through a process of trial and error. This method is often implemented when deciphering a system's functionality under various circumstances or configurations. === Deep learning === Deep learning is employed for analysis of high-dimensional data. For instance, deep learning techniques can aid in examining the layout and connections of integrated circuits (ICs), substantially reducing the manual effort required for reverse engineering. == Benefits == === Usable Security === AIARE expands usable security as reverse engineering is traditionally slow and highly specialized as it produces dense, low-level information (usually in Assembly or C) when using tools like Ghidra. The use of multiple different methods to interface with models today (such as through chat bots like ChatGPT) greatly reduces the barrier to entry by providing a clear way to interact with the user and even providing meaningful decompiled source code. In addition, either done automatically or through prompt engineering, a model is capable of producing a high-level summary and explanation of its reverse engineering efforts in human-readable form that doesn't require much knowledge on code. === Speedup === AIARE is capable of processing data much faster than humans, providing a boost in speed when analyzing said data. In the context of computer security, this can greatly speed up incident management or response and malware detection as AIARE can be automated to drastically reduce the manual effort usually associated with reverse engineering. == Limitations == In an effort to improve readability for reverse engineering, AI-generated code may introduce erroneous bugs not present in the source. This compromises the correctness of the code if not carefully validated and will throw off reverse engineering efforts. Additionally, AIARE's weakness in zero-shot prompting makes gathering accurate data without reference data in the prompt more inconsistent, thus requiring a user to provide some quality data of their own that hurts its usability.

    Read more →
  • Energy informatics

    Energy informatics

    Energy informatics is a research field covering the use of information and communication technology to address energy utilization and management challenges. Methods used for "smart" implementations often combine IoT sensors with artificial intelligence and machine learning. Energy Informatics is founded on flow networks that are the major suppliers and consumers of energy. Their efficiency can be improved by collecting and analyzing information. == Application areas == The field among other consider application areas within: Smart Buildings by developing ICT-centred solutions for improving the energy-efficiency of buildings. Smart Cities by investigating the synergies between demand patterns and supply availability of energy flows in cities and communities to improve energy efficiency, increase integration of renewable sources, and provide resilience towards system faults caused by extreme situations, like hurricanes and flooding. Smart Industries including the development of ICT-centred solutions for improving the energy efficiency and predictability of energy intensive industrial processes, without compromising process and product quality. Smart Energy Networks by developing ICT-centred solutions for coordinating the supply and demand in environmentally sustainable energy networks.

    Read more →
  • Drops (app)

    Drops (app)

    Drops is a language learning app that was created in Estonia by Daniel Farkas and Mark Szulyovszky in 2015. It is the second product from the company, after their first app, LearnInvisible, had issues in retaining a user's engagement over the required time period. The languages available include Native Hawaiian and Māori, and was classified as one of the fifty "Most Innovative Companies" for 2019 by Fast Company. The company partnered with Global Eagle Entertainment to include Travel Talk, a feature intended to focus on words and phrases frequently used by travelers. At the beginning of the COVID-19 pandemic in March 2020, the number of users increased by 55 percent in the United States and 92 percent in the United Kingdom. Droplets, a language app for children, includes profiles for multiple teachers working with remote students. The company also produces an app called Scripts, intended to help users learn to write alphabets. The app was purchased by the Norwegian company Kahoot! on 24 November 2020.

    Read more →
  • March algorithm

    March algorithm

    The March algorithm is a widely used algorithm that tests SRAM memory by filling all its entries test patterns. It carries out several passes through an SRAM checking the patterns and writing new patterns. The SRAM read and write operations performed on each pass are called a March element and each element is repeated for each entry. The March algorithm is often used to find functional faults in SRAM during testing such as: Stuck-at Faults (SAFs) Transition Faults (TFs) Address Decoder Faults (AFs) Coupling Faults (CFs), such as Inversion (CFin), Idempotent (CFid), and State (CFst) coupling faults It has been suggested to test SRAM modules using the algorithm before sale using a built-in self-test mechanism. == Notation == Each pass in a test sequence is represented by an "element". An element consists of a vertical arrow to indicate the direction in which the memory is scanned followed by a list of read/write operations to be applied to each memory cell. Multiple elements can be listed, separated by semicolons, to form a "test". For example, { ⇕ ( w 0 ) ; ⇑ ( r 0 , w 1 ) ; ⇓ ( r 1 , w 0 , r 0 ) } {\displaystyle \{\Updownarrow (w0);\Uparrow (r0,w1);\Downarrow (r1,w0,r0)\}} specifies to: Scan in both directions, writing 0. Scan from lowest to highest address, reading 0 and writing 1. Scan from highest to lowest address, reading 1, writing 0 and reading 0. == Variants == Many variants of the March algorithm exist with different sequences of tests. Each variant makes a different tradeoff between what faults it can detect and the complexity of the algorithm. Several variants have been given names:

    Read more →
  • Bibliographic database

    Bibliographic database

    A bibliographic database is a database of bibliographic records. This is an organised online collection of references to published written works like journal and newspaper articles, conference proceedings, reports, government and legal publications, patents and books. In contrast to library catalogue entries, a majority of the records in bibliographic databases describe articles and conference papers rather than complete monographs, and they generally contain very rich subject descriptions in the form of keywords, subject classification terms, or abstracts. A bibliographic database may cover a wide range of topics or one academic field like computer science. A significant number of bibliographic databases are marketed under a trade name by licensing agreement from vendors, or directly from their makers: the indexing and abstracting services. Many bibliographic databases have evolved into digital libraries, providing the full text of the organised contents:for instance CORE also organises and mirrors scholarly articles and OurResearch develops a search engine for open access content in Unpaywall. Others merge with non-bibliographic and scholarly databases to create more complete disciplinary search engine systems, such as Chemical Abstracts or Entrez. == History == Prior to the mid-20th century, individuals searching for published literature had to rely on printed bibliographic indexes, generated manually from index cards. During the early 1960s computers were used to digitize text for the first time; the purpose was to reduce the cost and time required to publish two American abstracting journals, the Index Medicus of the National Library of Medicine and the Scientific and Technical Aerospace Reports of the National Aeronautics and Space Administration (NASA). By the late 1960s, such bodies of digitized alphanumeric information, known as bibliographic and numeric databases, constituted a new type of information resource. Online interactive retrieval became commercially viable in the early 1970s over private telecommunications networks. The first services offered a few databases of indexes and abstracts of scholarly literature. These databases contained bibliographic descriptions of journal articles that were searchable by keywords in author and title, and sometimes by journal name or subject heading. The user interfaces were crude, the access was expensive, and searching was done by librarians on behalf of "end users".

    Read more →
  • Pseudonymization

    Pseudonymization

    Pseudonymization is a data management and de-identification procedure by which personally identifiable information fields within a data record are replaced by one or more artificial identifiers, or pseudonyms. A single pseudonym for each replaced field or collection of replaced fields makes the data record less identifiable while remaining suitable for data analysis and data processing. Pseudonymization (or pseudonymisation, the spelling under European guidelines) is one way to comply with the European Union's General Data Protection Regulation (GDPR) demands for secure data storage of personal information. Pseudonymized data can be restored to its original state with the addition of information which allows individuals to be re-identified. In contrast, anonymization is intended to prevent re-identification of individuals within the dataset. Clause 18, Module Four, footnote 2 of the Adoption by the European Commission of the Implementing Decisions (EU) 2021/914 "requires rendering the data anonymous in such a way that the individual is no longer identifiable by anyone ... and that this process is irreversible." == Impact of Schrems II ruling == The European Data Protection Supervisor (EDPS) on 9 December 2021 highlighted pseudonymization as the top technical supplementary measure for Schrems II compliance. Less than two weeks later, the EU Commission highlighted pseudonymization as an essential element of the equivalency decision for South Korea, which is the status that was lost by the United States under the Schrems II ruling by the Court of Justice of the European Union (CJEU). The importance of GDPR-compliant pseudonymization increased dramatically in June 2021 when the European Data Protection Board (EDPB) and the European Commission highlighted GDPR-compliant pseudonymization as the state-of-the-art technical supplementary measure for the ongoing lawful use of EU personal data when using third country (i.e., non-EU) cloud processors or remote service providers under the "Schrems II" ruling by the CJEU. Under the GDPR and final EDPB Schrems II Guidance, the term pseudonymization requires a new protected "state" of data, producing a protected outcome that: Protects direct, indirect, and quasi-identifiers, together with characteristics and behaviors; Protects at the record and data set level versus only the field level so that the protection travels wherever the data goes, including when it is in use; and Protects against unauthorized re-identification via the mosaic effect by generating high entropy (uncertainty) levels by dynamically assigning different tokens at different times for various purposes. The combination of these protections is necessary to prevent the re-identification of data subjects without the use of additional information kept separately, as required under GDPR Article 4(5) and as further underscored by paragraph 85(4) of the final EDPB Schrems II guidance: Article 4(5) "Definitions" of the GDPR defines pseudonymization as "the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person." "Use Case 2: Transfer of pseudonymised Data Paragraph 85(4)" of the final EDPB Schrems II Guidance requires that “the controller has established by means of a thorough analysis of the data in question – taking into account any information that the public authorities of the recipient country may be expected to possess and use – that the pseudonymised personal data cannot be attributed to an identified or identifiable natural person even if cross-referenced with such information." GDPR-compliant pseudonymization requires that data is "anonymous" in the strictest EU sense of the word – globally anonymous – but for the additional information held separately and made available under controlled conditions as authorized by the data controller for permitted re-identification of individual data subjects. Clause 18, Module Four, footnote 2 of the Adoption by the European Commission of the Implementing Decision (EU) 2021/914 "requires rendering the data anonymous in such a way that the individual is no longer identifiable by anyone, in line with recital 26 of Regulation (EU) 2016/679, and that this process is irreversible." Before the Schrems II ruling, pseudonymization was a technique used by security experts or government officials to hide personally identifiable information to maintain data structure and privacy of information. Some common examples of sensitive information include postal code, location of individuals, names of individuals, race and gender, etc. After the Schrems II ruling, GDPR-compliant pseudonymization must satisfy the above-noted elements as an "outcome" versus merely a technique. == Data fields == The choice of which data fields are to be pseudonymized is partly subjective. Less selective fields, such as birth date or postal code are often also included because they are usually available from other sources and therefore make a record easier to identify. Pseudonymizing these less identifying fields removes most of their analytic value and is therefore normally accompanied by the introduction of new derived and less identifying forms, such as year of birth or a larger postal code region. Data fields that are less identifying, such as date of attendance, are usually not pseudonymized. This is because too much statistical utility is lost in doing so, not because the data cannot be identified. For example, given prior knowledge of a few attendance dates it is easy to identify someone's data in a pseudonymized dataset by selecting only those people with that pattern of dates. This is an example of an inference attack. The weakness of pre-GDPR pseudonymized data to inference attacks is commonly overlooked. A famous example is the AOL search data scandal. The AOL example of unauthorized re-identification did not require access to separately kept "additional information" that was under the control of the data controller as is now required for GDPR-compliant pseudonymization, outlined below under the section "New Definition for Pseudonymization Under GDPR". Protecting statistically useful pseudonymized data from re-identification requires: a sound information security base controlling the risk that the analysts, researchers or other data workers cause a privacy breach The pseudonym allows tracking back of data to its origins, which distinguishes pseudonymization from anonymization, where all person-related data that could allow backtracking has been purged. Pseudonymization is an issue in, for example, patient-related data that has to be passed on securely between clinical centers. The application of pseudonymization to e-health intends to preserve the patient's privacy and data confidentiality. It allows primary use of medical records by authorized health care providers and privacy preserving secondary use by researchers. In the US, HIPAA provides guidelines on how health care data must be handled and data de-identification or pseudonymization is one way to simplify HIPAA compliance. However, plain pseudonymization for privacy preservation often reaches its limits when genetic data are involved (see also genetic privacy). Due to the identifying nature of genetic data, depersonalization is often not sufficient to hide the corresponding person. Potential solutions are the combination of pseudonymization with fragmentation and encryption. An example of application of pseudonymization procedure is creation of datasets for de-identification research by replacing identifying words with words from the same category (e.g. replacing a name with a random name from the names dictionary), however, in this case it is in general not possible to track data back to its origins. == New definition under GDPR == Effective as of May 25, 2018, the EU General Data Protection Regulation (GDPR) defines pseudonymization for the very first time at the EU level in Article 4(5). Under Article 4(5) definitional requirements, data is pseudonymized if it cannot be attributed to a specific data subject without the use of separately kept "additional information". Pseudonymized data embodies the state of the art in Data Protection by Design and by Default because it requires protection of both direct and indirect identifiers (not just direct). GDPR Data Protection by Design and by Default principles as embodied in pseudonymization require protection of both direct and indirect identifiers so that personal data is not cross-referenceable (or re-identifiable) via the "mosaic effect" without access to "additional information" that is kept separately by the controller. Because access to separately kept "additional information" is required

    Read more →
  • SwissCovid

    SwissCovid

    SwissCovid is a COVID-19 contact tracing app used for digital contact tracing in Switzerland. Use of the app is voluntary and based on a decentralized approach using Bluetooth Low Energy and Decentralized Privacy-Preserving Proximity Tracing (dp3t). == Development == The app was developed in collaboration with the FOPH by Federal Office for Information Technology, Systems and Communications FOITT, École polytechnique fédérale de Lausanne (EPFL) and the Swiss Federal Institute of Technology in Zurich (ETH) as well as other experts. == Non-interoperability with applications in European countries == There is an agreement between EU countries to make applications compatible. However, there is no legal basis for the SwissCovid application to be part of this portal even though technically speaking it is ready, according to Sang-Ill Kim, head of the digital transformation department of the Federal Office of Public Health. == Criticism == === Not full open source and dependence on Google and Apple === In June 2020, researchers Serge Vaudenay and Martin Vuagnoux published a critical analysis of the application, noting that it relies heavily on Google and Apple's exposure notification system, which is integrated into their respective Android and iOS operating systems. Since Google and Apple have not released the full source code of this system, this would call into question the truly open source nature of the application. The researchers note that the dp3t collective, which includes the developers of the application, has asked Google and Apple to release their code. Moreover, they criticize the official description of the application and its functionalities, as well as the adequacy of the legal basis for its effective operation. === Cyber attacks === Professor Serge Vaudenay and Martin Vuagnoux identify also various security vulnerabilities in the application. The system would thus allow a third party to trace the movements of a phone using the application by means of Bluetooth sensors scattered along its path, for example in a building. Another possible attack would be to copy identifiers from the phones of people who may be ill (for example, in a hospital), and to reproduce those identifiers in order to receive notification of exposure to COVID-19 and illegitimately benefit from quarantine (thus entitling them to paid leave, a postponed examination, or other benefits). The system would also allow a third party to use a phone using the application by means of Bluetooth sensors scattered along the way. Paul-Olivier Dehaye of Personaldata.io and professor Joel Reardon of the University of Calgary published in June 2020 several examples of AEM (Associated Encrypted Metadata) replay and manipulation attacks via software development kits (SDKs) found in benign third-party mobile applications downloaded by the general public and having the phone's Bluetooth access permissions and in September 2020 a paper indicating that "Bluetooth-based proximity tracing apps are fundamentally insecure with respect to an attacker leveraging a malevolent app or SDK". === Costs === According to a publication by the federal administration, "the costs of developing the software for the mobile phone application, the GR back-end and the code management system as well as the costs for access management for the cantonal doctors' services are estimated at a one-off amount of 1.65 million francs. However, the Zurich-based company Ubique, responsible for the development of the application, was finally awarded the mandate to develop the application for an amount of 1.8 million francs. Through the Botnar Foundation based in Basel, École polytechnique fédérale de Lausanne received 3.5 million Swiss francs for the development of the application

    Read more →
  • Emotion-sensitive software

    Emotion-sensitive software

    Emotion-sensitive software (ESS) is software specifically designed to target and monitor emotional response in a human being. Some software measures anger by comparing the pitch of a voice to a regular, or calm, pitch. Another approach is the measurement of physical appearance. If a camera or similar recording device picks up a certain amount of red pigmentation in the skin the system can be alerted that this person is angered. The competitive landscape in the Electronic Surveillance Software (ESS) industry is marked by a high level of secrecy regarding the operational details of these software systems. Many producers deliberately withhold information about the inner workings of their ESS products, a strategy that serves dual purposes: firstly, it intensifies competition among companies in the sector, as each strives to maintain a unique edge without revealing trade secrets that could be leveraged by competitors; secondly, this secrecy acts as a deterrent against individuals or entities who might try to circumvent the surveillance mechanisms. One application of ESS was developed by University of Notre Dame Assistant Professor of Psychology Sidney D'Mello, Art Graesser from the University of Memphis and a colleague from Massachusetts Institute of Technology. They used the technology to create an electronic tutor that could assess a student's level of boredom and frustration based on facial expression and body language, and react accordingly.

    Read more →
  • Information audit

    Information audit

    The information audit (IA) extends the concept of auditing from a traditional scope of accounting and finance to the organisational information management system. Information is representative of a resource which requires effective management and this led to the development of interest in the use of an IA. Prior the 1990s and the methodologies of Orna, Henczel, Wood, Buchanan and Gibb, IA approaches and methodologies focused mainly upon an identification of formal information resources (IR). Later approaches included an organisational analysis and the mapping of the information flow. This gave context to analysis within an organisation's information systems and a holistic view of their IR and as such could contribute to the development of the information systems architecture (ISA). In recent years the IA has been overlooked in favour of the systems development process which can be less expensive than the IA, yet more heavily technically focused, project specific (not holistic) and does not favour the top-down analysis of the IA. == Definition == A definition for the Information Audit cannot be universally agreed-upon amongst scholars, however the definition offered by ASLIB received positive support from a few notable scholars including Henczel, Orna and Wood; “(the IA is a) systematic examination of information use, resources and flows, with a verification by reference to both people and existing documents, in order to establish the extent to which they are contributing to an organisation’s objectives” In summary, the term audit itself implies a counting, the IA being much the same yet it counts IR and analyses how they are used and how critical they are to the success of a given task. == Role and scope of an IA == In much the same way as the IA is difficult to define, it can be utilised in a range of contexts by the information professional, from complying with freedom of information legislation to identifying any existing gaps, duplications, bottlenecks or other inefficiencies in information flows and to understand how existing channels can be used for knowledge transfer In 2007 Buchanan and Gibb developed upon their 1998 examination of the IA process by outlining a summary of its main objectives: To identify an organisation’s information resource To identify an organisation’s information needs Furthermore, Buchanan and Gibb went on to state that the IA also had to meet the following additional objectives: To identify the cost/benefits of information resources To identify the opportunities to use the information resources for strategic competitive advantage To integrate IT investment with strategic business initiatives To identify information flow and processes To develop an integrated information strategy and/or policy To create an awareness of the importance of Information Resource Management (IRM) To monitor/evaluate conformance to information related standards, legislations, policy and guidelines. == Methodology evolution == === Overview === In 1976 Riley first published a definition of IA as a way of analysing IR based on a cost-benefit model. Since Riley, scholars have outlined further developed methodologies. Henderson took a cost-benefit approach hoping to draw focus from manpower-costing to information storage and acquisition which he felt was being overlooked. In 1985 Gillman focused upon identifying the relationships which existed between various components in order to map them to one another. Neither Henderson nor Gillman’s methods offered alternative approaches beyond the existing organisational frameworks. Quinn took a hybrid-approach combining Gillman and Henderson’s methods to identify the purpose of existing IR and to position them within the organisation, as did Worlock. The differentiator between Quinn and Worlock lay in Worlock’s consideration of solutions outside of the current organisational structure. These approaches had thus far had paid little attention to the needs of the user or in making structured recommendations for the development of a corporate information strategy. Therefore, here follows a brief outline and overall comparison of four published strategic approaches in order that one might understand the development of the IA methodology. === Burk and Horton === In 1988 Burk and Horton developed InfoMap, the first IA methodology developed for widespread use. It aimed to discover, map and evaluate the IR within an organisation using a 4-stage process: Survey staff using questionnaires/interviews Measure the IR against cost/value Analyse resources Synthesise the findings and map the strengths and weaknesses of the IR against the objectives of the organisation. Although the method inventoried all IR (and therefore met standard ISO 1779) this bottom-up approach revealed limited analysis of the organisation holistically and the steps were not explicit enough. === Orna === Orna produced a top-down methodology in contrast to Burk and Horton, placing emphasis upon the importance of organisational analysis and aimed to assist in the production of a corporate information policy. Initially the method had just 4-stages, this later revised to a 10-stage process which included pre and post-audit stages as below: Conduct a preliminary review to confirm operational/strategic direction Gain support/resource from management Gain commitment from the other stakeholders (staff) Planning including the project, team, tools and techniques Identify the IR, information flow and produce a cost/value assessment Interpret findings based upon current versus desired state Produce a report to present findings Implement recommendations Monitor effects of change Repeat the IA Orna’s method introduced the need for a cyclical IA to be put in place in order for the IR to be continually tracked and improvements made regularly. Again this method was criticised for lacking some practical application and in 2004 Orna revised the methodology once more to try to rectify this problem === Buchanan and Gibb === In 1998, similarly to Orna's earlier publication, Buchanan and Gibb took a top-down approach, drawing techniques from established management disciplines to provide a framework and a level of familiarity for information professionals. This set of techniques was a notable contribution to IA methodologies and understood the need to be flexible for each organisation. Theirs was a 5-stage process: Promote benefits of the IA through seminars/surveys/CEO letter for cooperation Identify the mission objectives of the organisation, define environment (PEST), map information flow and examine organisation culture. Analyse and formulate action plan for problem areas, flow diagrams and a report of findings and recommendations Account for cost of IR and related services using Activity Based Costing (ABC) and Output Based Specification (OBS). Synthesise the whole process in final audit report and provide an information strategy (strategic direction) in relation to the organisation’s mission statement. This was the introduction of a new approach to costing the IR and had an integrated strategic direction, yet the scholars admitted that this method may be impractical for smaller organisations. === Henczel === Henczel’s methodology drew upon the strengths of Orna and Buchanan and Gibb to produce a 7-stage process: Planning and submission of business case for approval to proceed Data collection and development of an IR database and population through survey techniques Structured data analysis Data evaluation, interpretation and formulation of recommendations Communication of recommendations through a report Implementing recommendations through a devised programme The IA as a continuum-establishment of a cyclical process Focus was made once more on the strategic direction of the organisation conducting the IA. Furthermore, Henczel made examination into the use of the IA as a first-step in the development of a knowledge audit or knowledge management strategy as discussed in the later section. == Case studies == Scholars and information professionals have since tested the above methodologies with varied results. An early case study produced by Soy and Bustelo in a Spanish financial institution in 1999 aimed to identify the use of information resources for qualitative and quantitative data analysis due to the rapid expansion of the organisation within a six-year period. Although the methodology was not explicitly credited to any of the above-mentioned scholars, it did follow a strategic (post 1990's) IA process including gaining support from management, the use of questionnaires for data collection, analysis and evaluation of the data, identification and mapping of the IR, cost-analysis and outlining recommendations to assist with the establishment of an Information policy. In addition the IA report suggested that the process would need to be continual (cyclical as Orna, Henczel and Buchanan and Gibb suggest). Conclusions of this case-study stated that th

    Read more →