AI App Jobs

AI App Jobs — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • List of computer graphics journals

    List of computer graphics journals

    List of computer graphics journals includes notable peer-reviewed scientific and academic journals that focus on computer graphics, visualization, and related areas such as rendering, animation, image processing, and geometric modeling. == Journals == ACM Transactions on Graphics Computers & Graphics IEEE Computer Graphics and Applications IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Graphical Models Journal of Computer Graphics Techniques Presence: Teleoperators and Virtual Environments Virtual Reality Simulation & Gaming

    Read more →
  • Chaotic cryptology

    Chaotic cryptology

    Chaotic cryptology is the application of mathematical chaos theory to the practice of cryptography, the study or techniques used to privately and securely transmit information with the presence of a third-party or adversary. Since first being investigated by Robert Matthews in 1989, the use of chaos in cryptography has attracted much interest. However, long-standing concerns about its security and implementation speed continue to limit its implementation. Chaotic cryptology consists of two opposite processes: Chaotic cryptography and Chaotic cryptanalysis. Cryptography refers to encrypting information for secure transmission, whereas cryptanalysis refers to decrypting and deciphering encoded encrypted messages. In order to use chaos theory efficiently in cryptography, the chaotic maps are implemented such that the entropy generated by the map can produce required Confusion and diffusion. Properties in chaotic systems and cryptographic primitives share unique characteristics that allow for the chaotic systems to be applied to cryptography. If chaotic parameters, as well as cryptographic keys, can be mapped symmetrically or mapped to produce acceptable and functional outputs, it will make it next to impossible for an adversary to find the outputs without any knowledge of the initial values. Since chaotic maps in a real life scenario require a set of numbers that are limited, they may, in fact, have no real purpose in a cryptosystem if the chaotic behavior can be predicted. One of the most important issues for any cryptographic primitive is the security of the system. However, in numerous cases, chaos-based cryptography algorithms are proved insecure. The main issue in many of the cryptanalyzed algorithms is the inadequacy of the chaotic maps implemented in the system. == Types == Chaos-based cryptography has been divided into two major groups: Symmetric chaos cryptography, where the same secret key is used by sender and receiver. Asymmetric chaos cryptography, where one key of the cryptosystem is public. Some of the few proposed systems have been broken. The majority of chaos-based cryptographic algorithms are symmetric. Many use discrete chaotic maps in their process. == Applications == === Image encryption === Bourbakis and Alexopoulos in 1991 proposed supposedly the earliest fully intended digital image encryption scheme which was based on SCAN language. Later on, with the emergence of chaos-based cryptography hundreds of new image encryption algorithms, all with the aim of improving the security of digital images were proposed. However, there were three main aspects of the design of an image encryption that was usually modified in different algorithms (chaotic map, application of the map and structure of algorithm). The initial and perhaps most crucial point was the chaotic map applied in the design of the algorithms. The speed of the cryptosystem is always an important parameter in the evaluation of the efficiency of a cryptography algorithm, therefore, the designers were initially interested in using simple chaotic maps such as tent map, and the logistic map. However, in 2006 and 2007, the new image encryption algorithms based on more sophisticated chaotic maps proved that application of chaotic map with higher dimension could improve the quality and security of the cryptosystems. === Hash function === Chaotic behavior can generate hash functions, such as applying the Chirikov/Julia 3D trajectory translation into a SHA-512 hash. === Random number generation === The unpredictable behavior of the chaotic maps can be used in the generation of random numbers. Some of the earliest chaos-based random number generators tried to directly generate random numbers from the logistic map. Many more recent works did so using the numerical solutions of hyperchaotic systems of differential equations, either at the integer-order, or the fractional-order.

    Read more →
  • Code (cryptography)

    Code (cryptography)

    In cryptology, a code is a method used to encrypt a message that operates at the level of meaning; that is, words or phrases are converted into something else. A code might transform "change" into "CVGDK" or "cocktail lounge". The U.S. National Security Agency defined a code as "A substitution cryptosystem in which the plaintext elements are primarily words, phrases, or sentences, and the code equivalents (called "code groups") typically consist of letters or digits (or both) in otherwise meaningless combinations of identical length." A codebook is needed to encrypt, and decrypt the phrases or words. By contrast, ciphers encrypt messages at the level of individual letters, or small groups of letters, or even, in modern ciphers, individual bits. Messages can be transformed first by a code, and then by a cipher. Such multiple encryption, or "superencryption" aims to make cryptanalysis more difficult. Another comparison between codes and ciphers is that a code typically represents a letter or groups of letters directly without the use of mathematics. As such the numbers are configured to represent these three values: 1001 = A, 1002 = B, 1003 = C, ... . The resulting message, then would be 1001 1002 1003 to communicate ABC. Ciphers, however, utilize a mathematical formula to represent letters or groups of letters. For example, A = 1, B = 2, C = 3, ... . Thus the message ABC results by multiplying each letter's value by 13. The message ABC, then would be 13 26 39. Codes have a variety of drawbacks, including susceptibility to cryptanalysis and the difficulty of managing the cumbersome codebooks, so ciphers are now the dominant technique in modern cryptography. In contrast, because codes are representational, they are not susceptible to mathematical analysis of the individual codebook elements. In the example, the message 13 26 39 can be cracked by dividing each number by 13 and then ranking them alphabetically. However, the focus of codebook cryptanalysis is the comparative frequency of the individual code elements matching the same frequency of letters within the plaintext messages using frequency analysis. In the above example, the code group, 1001, 1002, 1003, might occur more than once and that frequency might match the number of times that ABC occurs in plain text messages. (In the past, or in non-technical contexts, code and cipher are often used to refer to any form of encryption). == One- and two-part codes == Codes are defined by "codebooks" (physical or notional), which are dictionaries of codegroups listed with their corresponding plaintext. Codes originally had the codegroups assigned in 'plaintext order' for convenience of the code designed, or the encoder. For example, in a code using numeric code groups, a plaintext word starting with "a" would have a low-value group, while one starting with "z" would have a high-value group. The same codebook could be used to "encode" a plaintext message into a coded message or "codetext", and "decode" a codetext back into plaintext message. In order to make life more difficult for codebreakers, codemakers designed codes with no predictable relationship between the codegroups and the ordering of the matching plaintext. In practice, this meant that two codebooks were now required, one to find codegroups for encoding, the other to look up codegroups to find plaintext for decoding. Such "two-part" codes required more effort to develop, and twice as much effort to distribute (and discard safely when replaced), but they were harder to break. The Zimmermann Telegram in January 1917 used the German diplomatic "0075" two-part code system which contained upwards of 10,000 phrases and individual words. == One-time code == A one-time code is a prearranged word, phrase or symbol that is intended to be used only once to convey a simple message, often the signal to execute or abort some plan or confirm that it has succeeded or failed. One-time codes are often designed to be included in what would appear to be an innocent conversation. Done properly they are almost impossible to detect, though a trained analyst monitoring the communications of someone who has already aroused suspicion might be able to recognize a comment like "Aunt Bertha has gone into labor" as having an ominous meaning. Famous example of one time codes include: In the Bible, Jonathan prearranges a code with David, who is going into hiding from Jonathan's father, King Saul. If, during archery practice, Jonathan tells the servant retrieving arrows "the arrows are on this side of you," David may safely return to court; if the command is "the arrows are beyond you," David must flee. "One if by land; two if by sea" in "Paul Revere's Ride" made famous in the poem by Henry Wadsworth Longfellow "Climb Mount Niitaka" - the signal to Japanese planes to begin the attack on Pearl Harbor During World War II the British Broadcasting Corporation's overseas service frequently included "personal messages" as part of its regular broadcast schedule. The seemingly nonsensical stream of messages read out by announcers were actually one time codes intended for Special Operations Executive (SOE) agents operating behind enemy lines. An example might be "The princess wears red shoes" or "Mimi's cat is asleep under the table". Each code message was read out twice. By such means, the French Resistance were instructed to start sabotaging rail and other transport links the night before D-day. "Over all of Spain, the sky is clear" was a signal (broadcast on radio) to start the nationalist military revolt in Spain on July 17, 1936. Sometimes messages are not prearranged and rely on shared knowledge hopefully known only to the recipients. An example is the telegram sent to U.S. President Harry Truman, then at the Potsdam Conference to meet with Soviet premier Joseph Stalin, informing Truman of the first successful test of an atomic bomb. "Operated on this morning. Diagnosis not yet complete but results seem satisfactory and already exceed expectations. Local press release necessary as interest extends great distance. Dr. Groves pleased. He returns tomorrow. I will keep you posted." == Idiot code == An idiot code is a code that is created by the parties using it. This type of communication is akin to the hand signals used by armies in the field. Example: Any sentence where 'day' and 'night' are used means 'attack'. The location mentioned in the following sentence specifies the location to be attacked. Plaintext: Attack X. Codetext: We walked day and night through the streets but couldn't find it! Tomorrow we'll head into X. An early use of the term appears to be by George Perrault, a character in the science fiction book Friday by Robert A. Heinlein: The simplest sort [of code] and thereby impossible to break. The first ad told the person or persons concerned to carry out number seven or expect number seven or it said something about something designated as seven. This one says the same with respect to code item number ten. But the meaning of the numbers cannot be deduced through statistical analysis because the code can be changed long before a useful statistical universe can be reached. It's an idiot code... and an idiot code can never be broken if the user has the good sense not to go too often to the well. Terrorism expert Magnus Ranstorp said that the men who carried out the September 11 attacks on the United States used basic e-mail and what he calls "idiot code" to discuss their plans. == Cryptanalysis of codes == While solving a monoalphabetic substitution cipher is easy, solving even a simple code is difficult. Decrypting a coded message is a little like trying to translate a document written in a foreign language, with the task basically amounting to building up a "dictionary" of the codegroups and the plaintext words they represent. One fingerhold on a simple code is the fact that some words are more common than others, such as "the" or "a" in English. In telegraphic messages, the codegroup for "STOP" (i.e., end of sentence or paragraph) is usually very common. This helps define the structure of the message in terms of sentences, if not their meaning, and this is cryptanalytically useful. Further progress can be made against a code by collecting many codetexts encrypted with the same code and then using information from other sources spies newspapers diplomatic cocktail party chat the location from where a message was sent where it was being sent to (i.e., traffic analysis) the time the message was sent, events occurring before and after the message was sent the normal habits of the people sending the coded messages etc. For example, a particular codegroup found almost exclusively in messages from a particular army and nowhere else might very well indicate the commander of that army. A codegroup that appears in messages preceding an attack on a particular location may very well stand for that location. Cribs can be an immediate giveaway to the definiti

    Read more →
  • Brain Imaging Data Structure

    Brain Imaging Data Structure

    The Brain Imaging Data Structure (BIDS) is a standard for organizing, annotating, and describing data collected during neuroimaging experiments. It is based on a formalized file and directory structure and metadata files (based on JSON and TSV) with controlled vocabulary. This standard has been adopted by a multitude of labs around the world as well as databases such as OpenNeuro, SchizConnect, Developing Human Connectome Project, and FCP-INDI, and is seeing uptake in an increasing number of studies. While originally specified for MRI data, BIDS has been extended to several other imaging modalities such as MEG, EEG, and intracranial EEG (see also BIDS Extension Proposals). == History == The project is a community-driven effort. BIDS, originally OBIDS (Open Brain Imaging Data Structure), was initiated during an INCF sponsored data sharing working group meeting (January 2015) at Stanford University. It was subsequently spearheaded and maintained by Chris Gorgolewski. Since October 2019, the project is headed by a Steering Group and maintained by a separate team of maintainers, the Maintainers Group, according to a governance document that was approved of by the BIDS community in a vote. BIDS has advanced under the direction and effort of contributors, the community of researchers that appreciate the value of standardizing neuroimaging data to facilitate sharing and analysis. == BIDS Extension Proposals == BIDS can be extended in a backwards compatible way and is evolving over time. This is accomplished through BIDS Extension Proposals (BEPs), which are community-driven processes following agreed-upon guidelines. A full list of finalized BEPs and BEPs in progress can be found on the BIDS website

    Read more →
  • Accelerated Linear Algebra

    Accelerated Linear Algebra

    XLA (Accelerated Linear Algebra) is an open-source compiler for machine learning developed by the OpenXLA project. XLA is designed to improve the performance of machine learning models by optimizing the computation graphs at a lower level, making it particularly useful for large-scale computations and high-performance machine learning models. Key features of XLA include: Compilation of Computation Graphs: Compiles computation graphs into efficient machine code. Optimization Techniques: Applies operation fusion, memory optimization, and other techniques. Hardware Support: Optimizes models for various hardware, including CPUs, GPUs, and NPUs. Improved Model Execution Time: Aims to reduce machine learning models' execution time for both training and inference. Seamless Integration: Can be used with existing machine learning code with minimal changes. XLA represents a significant step in optimizing machine learning models, providing developers with tools to enhance computational efficiency and performance. == OpenXLA Project == OpenXLA Project is an open-source machine learning compiler and infrastructure initiative intended to provide a common set of tools for compiling and deploying machine learning models across different frameworks and hardware platforms. It provides a modular compilation stack that can be used by major deep learning frameworks like JAX, PyTorch, and TensorFlow. The project focuses on supplying shared components for optimization, portability, and execution across CPUs, GPUs, and specialized accelerators. Its design emphasizes interoperability between frameworks and a standardized set of representations for model computation. == Components == The OpenXLA ecosystem includes several core components: XLA – A deep learning compiler that optimizes computational graphs for multiple hardware targets. PJRT – A runtime interface that allows different back-ends to connect to XLA through a consistent API. StableHLO – A high-level operator set intended to serve as a stable, portable representation for ML models across compilers and frameworks. Shardy – An MLIR-based system for describing and transforming models that run in distributed or multi-device environments. Additional profiling, testing, and integration tools maintained under the OpenXLA organization. == Users and adopters == Several machine learning frameworks can use or interoperate with OpenXLA components, including JAX, TensorFlow, and parts of the PyTorch ecosystem. The project is developed with participation from multiple hardware and software organizations that contribute back-end integrations, testing, or specifications for their devices. This includes Alibaba, Amazon Web Services, AMD, Anyscale, Apple, Arm, Cerebras, Google, Graphcore, Hugging Face, Intel, Meta, NVIDIA and SiFive. == Supported target devices == x86-64 ARM64 NVIDIA GPU AMD GPU Intel GPU Apple GPU Google TPU AWS Trainium, Inferentia Cerebras Graphcore IPU == Governance == OpenXLA is developed as a community project with its work carried out in public repositories, discussion forums, and design meetings. Some components, such as StableHLO, began with stewardship from specific organizations and have outlined plans for more formal and distributed governance models as the project matures. == History == The project was announced in 2022 as an effort to coordinate development of ML compiler technologies across major AI companies, notably: Alibaba, Amazon Web Services, AMD, Anyscale, Apple, Arm, Cerebras, Google, Graphcore, Hugging Face, Intel, Meta, NVIDIA and SiFive.. It consolidated the XLA compiler, introduced StableHLO as a portable operator set, and created a unified structure for additional tools. Development continues within multiple repositories under the OpenXLA umbrella. It was founded by Eugene Burmako, James Rubin, Magnus Hyttsten, Mehdi Amini, Navid Khajouei, and Thea Lamkin from Google's Machine Learning organization.

    Read more →
  • Classora

    Classora

    Classora is a knowledge base for the Internet oriented to data analysis. From a practical point of view, Classora is a digital repository that stores structured information and allows it to be displayed in multiple formats: analytically, graphically, geographically (through maps); as well as carry out OLAP analysis. The information contained in Classora comes from public sources and is uploaded into the system through bots and ETL processes. The Knowledge Base has a commercial API for semantic enhancement, and an open web through which any user can access to part of the information collected (it also allows users to complete data and share opinions). Internally, Classora is organized into Knowledge Units and Reports. A «Knowledge Unit» is any element of the World about which information may be stored and presented in the form of a data sheet (a person, a company, a country, etc.) A «Report» is a group of Knowledge Units: a ranking of companies, a sport classification table, a survey about people, etc. In fact, one of the technical capabilities of Classora is that it allows the comparison of reports and knowledge units gathered from different sources, thereby generating an added value for the media in which this information is published: digital media, interactive TV, etc. == Key definitions == === Knowledge unit === The units of knowledge (also known as entries) in Classora are data sheets that have a certain semantic equivalence with the articles on the Wikipedia: they store information about any element of the world, be it a film, a country, a company or an animal. However, they differ from Wikipedia in that Classora stores structured information, enriched with a metadata layer; and therefore it is able to automatically interpret the meaning of each unit of knowledge. === Data report === A report is a group of units of knowledge in which the repetition of elements is not allowed. This definition includes any list, poll, ranking, etc.; and, in general, any consultation that involves more than one unit of knowledge. Classora excels at the reports management due to its visualization capabilities, being able to display data in the form of tables, graphs and maps. Types of reports: Sports scores: Sports competitions results sanctioned by the competent institution. Rankings and lists: All types of interesting and curious lists, whether they have an implicit order or not. Polls: Units of knowledge that are ranked according to users’ votes. Queries to the Knowledge Base: Questions from users using CQL. Networks of connections: automatically calculated from the reports and the taxonomy of each Knowledge Unit. === Organizational taxonomy === An organizational taxonomy (also referred to as entry type) is a data sheet that brings together the common attributes of a set of units of knowledge. For instance, the organizational taxonomy F1 Driver displays attributes such as date of debut, team, etc.; and the organizational taxonomy Football Club presents attributes such as city, stadium, etc. In Classora, taxonomies are hierarchically organized, so that they inherit attributes from their parent taxonomies. For instance, F1 Driver is a subsidiary taxonomy of Sportsperson, which is a subsidiary taxonomy of Person, which in turn is a subsidiary taxonomy of Organism. The simplest type of entry in Classora is Classora Object. All the other taxonomies are its subsidiaries and inherit its attributes. In fact, the only attribute Classora Object possesses is name (all units of knowledge are required to have one name at least). == Architecture of Classora == === Data Extraction Module === The Data Extraction Module consists of a set of robots coordinated by software that also manages the potential incidents. Most of the information available in Classora is automatically uploaded through those robots, which connect to the main online public sources to gather all types of data. There are three categories of robots: Extraction robots: responsible for the massive uploading of reports from official public sources (FIFA, CIA, IMF, Eurostat...). They are used for either absolute or incremental data uploading. Data scanner robots: responsible for looking for and updating the data of a unit of knowledge. They use specific sources to perform this task: Wikipedia, IMDB, World Bank, etc. Content aggregators: they don’t connect to external sources. Instead, they generate new information using Classora’s internal database. === Participatory Module === In Classora’s Open Website, Internet users may participate providing their knowledge as they would on the Wikipedia. There are different ways to participate: adding or correcting data in the Knowledge Base, voting in surveys (participatory rankings) and creating new Knowledge Units and Data Reports. === Connectivity Module === The Knowledge Base is designed to be embedded in multi-platform, multi-channel systems, thus enabling its integration into mobile devices, tablets, interactive TV, etc. This integration may be carried out through specific plugins (for navigators or other devices) or an API REST that provides content in XML or JSON formats. The API is divided into three blocks of operations. The first one is the block of general utility tools (ranging from autosuggest components about geographical hierarchies to operations to obtain the list of today’s celebrity birthdays, using CQL). The second one is the block of operations for widget generation (graphs, maps, rankings) using information from the knowledge base. Finally, there is a block of operations designed for the publication of free-source content. == Project statistics == As of April 2012, 2,000,000 Knowledge Units, 15,000 Reports, around 10,000 Maps and several million potential Comparative Analyses had been added to Classora. According to the site of web metrics Alexa, Classora Open Website is ranked at 100,557 globally and at 2,880 in the Spanish traffic ranking. Users spend an average of 9 ½ minutes in Classora.

    Read more →
  • Smart-ID

    Smart-ID

    Smart-ID is an electronic authentication tool developed by SK ID Solutions, an Estonian company. Users can log in to various electronic services and sign documents with an electronic signature. Smart-ID meets the European Union's eIDAS Regulation and the European Central Bank's standards for a secure authentication solution. Smart-ID is a Qualified Signature Creator Device (QSCD) that can issue a Qualified Electronic Signature (QES). The Smart-ID app is compatible with both iOS and Android devices and does not require a SIM card. By 2021, the Smart-ID application was launched in the Huawei AppGallery. As of May 2023, Smart-ID has 3,298,969 active users across the Baltic States (Latvia, Lithuania, and Estonia). Every month, the Smart-ID processes 79 million transactions. In March 2023, Smart-ID users made an exceptional 85 million transactions. == History == In November 2016, SK ID Solutions debuted the Smart-ID tool for the first time at its annual conference. In February 2017, eKool, Starman, and Tallinn Kaubamaja Grupp were the first to implement Smart-ID authentication in their e-services. In March 2017, Smart-ID was added as an authentication option to SEB bank and Swedbank's online banking in all three Baltic States. Dokobit, previously known as DigiDoc, began offering its clients the ability to use e-services using Smart-ID in April 2017. More than 100 service providers had implemented Smart-ID as an authentication solution for their services by November 2019. At its annual conference on November 8, 2018, SK ID Solutions revealed that Smart-ID had been certified as compatible with the QSCD[8] level, the highest level of qualified electronic signature in the European Union, following a rigorous certification process. As a result, the Smart-QES-level ID's electronic signature, the digital counterpart of a handwritten signature, is now available to all users who have registered with the tool. This signature is accepted by all European Union member states. On August 26, 2019, Estonian Information Systems Supervisory Authority experts reviewed Smart-ID (ISSA). Based on the methods provided in the eIDAS Regulation, the expert committee concluded that Smart-ID offers a high level of electronic identification assurance. SK ID Solutions and RIA struck an agreement in September 2019 that allows Smart-ID to authenticate Estonian state e-services via RIA's central authentication service, which is used by over 60 public authorities. Smart-ID accounts created three years ago have expired in January 2020. Therefore, renewing them and performing mandatory updates was necessary. In February 2020, SK ID Solutions announced that Smart-ID could be used to give digital signatures in the national digital signature software DigiDoc4, which up until this moment was only possible with ID cards via Mobile-ID. Users must have at least version 4.2.4.71 or later of the DigiDoc4 software installed on their computers to use this feature. Since February 2020, Smart-ID accounts can now be created with biometric information from an ID card or passport, but only by users who have previously used a Smart-ID account. Since October 2022, 13–17 years old minors in Lithuania are able to create a Smart-ID account using biometric information too. A parent or legal guardian must approve the registration. SK ID Solutions collaborated on the new solution with iProov from the United Kingdom and InnoValor from the Netherlands. TÜV Informationstechnik GmbH, a German certification company, assessed it. Since May 2023, Smart-ID can be used to submit company's annual reports in Estonia and digitally sign anything in the e-business register using your PIN2. == Overview == The Smart-ID app is available for download on Google Play and Apple's App Store. Android 4.4 and iOS 11 are the oldest supported operating system versions for Smart-ID. Smart-ID works on the premise of two-factor authentication, combining an intelligent device (something the user owns) with PINs (something the user knows). A new user must first authenticate themselves with an ID card or a mobile phone number and then confirm a PIN1 and PIN2 code, either manually or automatically produced. The first PIN is used to authenticate a person's identity when accessing e-banking or e-services, while the second PIN is used to support electronic signatures and authenticate transactions (e.g., transfers). The PIN1 code must be four digits long, while the PIN2 code must be five digits long. To log in to an e-service, the user must use Smart-ID as the authentication method and enter their unique Smart-ID user ID. A notification will open on the user's smart device where the software is installed and display a verification code. If the code matches the code presented to the user by the e-service, then the user can confirm the match by entering their PIN1 code. The user must verify the action with their PIN2 code when giving digital signatures. A Smart-ID account is valid for three years. The report can be updated, changed, and deleted at any given time, free of charge. Smart-ID is available in five languages: Estonian, Latvian, Lithuanian, Russian, and English. An international survey conducted in 2021 revealed that Smart-ID is the most reliable authentication solution in Baltic countries. In January 2023, the number of times Smart-ID was used to access State Authentication Service (TARA) in Estonia has surpassed those of Mobile-ID and ID-cards for the first time since July 2022. == Security == Smart-ID is based on Cybernetica's SplitKey authentication and digital signature platform technology, for which the company has filed a patent application. Public key cryptography, digital signature methods, and critical public infrastructures are all used in the technology. The user's PIN is not saved on the device and is only needed to decrypt the private key in the Smart-ID app. When the user inputs the PIN, the private key is cracked, and the answer is transmitted to the Smart-ID server, where a portion of the key given by the app is joined with the server's encrypted key. The app will block the user from accessing it for three hours if they input the incorrect PIN three times in a row. If this happens once again, the app will lock for 24 hours. If this happens a third time, the account will be permanently disabled. PINs cannot be changed or recovered once an account has been created. The user must create a new account if the account is permanently blocked. Smart-ID uses the Apple and Google messaging networks to notify the app when new data is saved on its servers. == Phishing == In February 2019, unknown criminals attempted to create Smart-ID accounts with stolen IDs obtained via phishing customers' text messages and website addresses, according to a monthly report by the Estonian Information System Manager in April 2019. The Latvian Information Technology Security Incident Assessment Body Cert was also notified of these intrusions on March 1. Fraudsters sent emails to potential victims pretending to be bank representatives. The mails linked users to a phishing page after redirecting them to a phony bank login page. Victims were asked to log in using their identification information and PIN1 code. The fraudsters then began the process of generating a new Smart-ID account. As a result, the victim had to input a PIN2 number, which permitted the fraudster to finish setting up a new tab with the victim's personal information. Fraudsters in Estonia were able to log in to multiple e-services utilizing Smart-ID using a Smart-ID account and the victim's data. On behalf of the victims, fraudsters also employed online banking services. Later, the Estonian Information System Manager identified several victims, some of whom had also experienced financial losses. The Estonian Information System Manager requested a full report on the event from SK ID Solutions. The organization opted not to criticize the corporation after receiving the information, although it did propose that the procedure of creating Smart-ID accounts be reviewed. According to the Estonian Banking Association, Estonian banks have not discontinued using Smart-ID and do not think it is required. Smart-ID was exposed to a thorough review process in September 2019 to determine this authentication instrument's level of security. Reviewers discovered no flaws, and SK ID Solutions and the Estonian Information System Manager signed a contract. Estonia later introduced Smart-ID and other authentication mechanisms to the central public services portal.

    Read more →
  • Computer-aided software engineering

    Computer-aided software engineering

    Computer-aided software engineering (CASE) is a domain of software tools used to design and implement applications. CASE tools are similar to and are partly inspired by computer-aided design (CAD) tools used for designing hardware products. CASE tools are intended to help develop high-quality, defect-free, and maintainable software. CASE software was often associated with methods for the development of information systems together with automated tools that could be used in the software development process. == History == The Information System Design and Optimization System (ISDOS) project, started in 1968 at the University of Michigan, initiated a great deal of interest in the whole concept of using computer systems to help analysts in the very difficult process of analysing requirements and developing systems. Several papers by Daniel Teichroew fired a whole generation of enthusiasts with the potential of automated systems development. His Problem Statement Language / Problem Statement Analyzer (PSL/PSA) tool was a CASE tool although it predated the term. Another major thread emerged as a logical extension to the data dictionary of a database. By extending the range of metadata held, the attributes of an application could be held within a dictionary and used at runtime. This "active dictionary" became the precursor to the more modern model-driven engineering capability. However, the active dictionary did not provide a graphical representation of any of the metadata. It was the linking of the concept of a dictionary holding analysts' metadata, as derived from the use of an integrated set of techniques, together with the graphical representation of such data that gave rise to the earlier versions of CASE. The next entrant into the market was Excelerator from Index Technology in Cambridge, Mass. While DesignAid ran on Convergent Technologies and later Burroughs Ngen networked microcomputers, Index launched Excelerator on the IBM PC/AT platform. While, at the time of launch, and for several years, the IBM platform did not support networking or a centralized database as did the Convergent Technologies or Burroughs machines, the allure of IBM was strong, and Excelerator came to prominence. Hot on the heels of Excelerator were a rash of offerings from companies such as Knowledgeware (James Martin, Fran Tarkenton and Don Addington), Texas Instrument's CA Gen and Andersen Consulting's FOUNDATION toolset (DESIGN/1, INSTALL/1, FCP). CASE tools were at their peak in the early 1990s. According to the PC Magazine of January 1990, over 100 companies were offering nearly 200 different CASE tools. At the time IBM had proposed AD/Cycle, which was an alliance of software vendors centered on IBM's Software repository using IBM DB2 in mainframe and OS/2: The application development tools can be from several sources: from IBM, from vendors, and from the customers themselves. IBM has entered into relationships with Bachman Information Systems, Index Technology Corporation, and Knowledgeware wherein selected products from these vendors will be marketed through an IBM complementary marketing program to provide offerings that will help to achieve complete life-cycle coverage. With the decline of the mainframe, AD/Cycle and the Big CASE tools died off, opening the market for the mainstream CASE tools of today. Many of the leaders of the CASE market of the early 1990s ended up being purchased by Computer Associates, including IEW, IEF, ADW, Cayenne, and Learmonth & Burchett Management Systems (LBMS). The other trend that led to the evolution of CASE tools was the rise of object-oriented methods and tools. Most of the various tool vendors added some support for object-oriented methods and tools. In addition new products arose that were designed from the bottom up to support the object-oriented approach. Andersen developed its project Eagle as an alternative to Foundation. Several of the thought leaders in object-oriented development each developed their own methodology and CASE tool set: Jacobson, Rumbaugh, Booch, etc. Eventually, these diverse tool sets and methods were consolidated via standards led by the Object Management Group (OMG). The OMG's Unified Modelling Language (UML) is currently widely accepted as the industry standard for object-oriented modeling. == CASE software == === Tools === CASE tools support specific tasks in the software development life-cycle. They can be divided into the following categories: Business and analysis modeling: Graphical modeling tools. E.g., E/R modeling, object modeling, etc. Development: Design and construction phases of the life-cycle. Debugging environments. E.g., IISE LKO. Verification and validation: Analyze code and specifications for correctness, performance, etc. Configuration management: Control the check-in and check-out of repository objects and files. E.g., SCCS, IISE. Metrics and measurement: Analyze code for complexity, modularity (e.g., no "go to's"), performance, etc. Project management: Manage project plans, task assignments, scheduling. Another common way to distinguish CASE tools is the distinction between Upper CASE and Lower CASE. Upper CASE Tools support business and analysis modeling. They support traditional diagrammatic languages such as ER diagrams, Data flow diagram, Structure charts, Decision Trees, Decision tables, etc. Lower CASE Tools support development activities, such as physical design, debugging, construction, testing, component integration, maintenance, and reverse engineering. All other activities span the entire life-cycle and apply equally to upper and lower CASE. === Workbenches === Workbenches integrate two or more CASE tools and support specific software-process activities. Hence they achieve: A homogeneous and consistent interface (presentation integration) Seamless integration of tools and toolchains (control and data integration) An example workbench is Microsoft's Visual Basic programming environment. It incorporates several development tools: a GUI builder, a smart code editor, debugger, etc. Most commercial CASE products tended to be such workbenches that seamlessly integrated two or more tools. Workbenches also can be classified in the same manner as tools; as focusing on Analysis, Development, Verification, etc. as well as being focused on the upper case, lower case, or processes such as configuration management that span the complete life-cycle. === Environments === An environment is a collection of CASE tools or workbenches that attempts to support the complete software process. This contrasts with tools that focus on one specific task or a specific part of the life-cycle. CASE environments are classified by Fuggetta as follows: Toolkits: Loosely coupled collections of tools. These typically build on operating system workbenches such as the Unix Programmer's Workbench or the VMS VAX set. They typically perform integration via piping or some other basic mechanism to share data and pass control. The strength of easy integration is also one of the drawbacks. Simple passing of parameters via technologies such as shell scripting can't provide the kind of sophisticated integration that a common repository database can. Fourth generation: These environments are also known as 4GL standing for fourth generation language environments due to the fact that the early environments were designed around specific languages such as Visual Basic. They were the first environments to provide deep integration of multiple tools. Typically these environments were focused on specific types of applications. For example, user-interface driven applications that did standard atomic transactions to a relational database. Examples are Informix 4GL, and Focus. Language-centered: Environments based on a single often object-oriented language such as the Symbolics Lisp Genera environment or VisualWorks Smalltalk from Parcplace. In these environments all the operating system resources were objects in the object-oriented language. This provides powerful debugging and graphical opportunities but the code developed is mostly limited to the specific language. For this reason, these environments were mostly a niche within CASE. Their use was mostly for prototyping and R&D projects. A common core idea for these environments was the model–view–controller user interface that facilitated keeping multiple presentations of the same design consistent with the underlying model. The MVC architecture was adopted by the other types of CASE environments as well as many of the applications that were built with them. Integrated: These environments are an example of what most IT people tend to think of first when they think of CASE. Environments such as IBM's AD/Cycle, Andersen Consulting's FOUNDATION, the ICL CADES system, and DEC Cohesion. These environments attempt to cover the complete life-cycle from analysis to maintenance and provide an integrated database repository for storing all artifacts of the software pr

    Read more →
  • MultiValue database

    MultiValue database

    A MultiValue database is a type of NoSQL and multidimensional database. It is typically considered synonymous with PICK, a database originally developed as the Pick operating system. MultiValue databases include commercial products from Rocket Software, Revelation, InterSystems, Northgate Information Solutions, ONgroup, and other companies. These databases differ from a relational database in that they have features that support and encourage the use of attributes which can take a list of values, rather than all attributes being single-valued. They are often categorized with MUMPS within the category of post-relational databases, although the data model actually pre-dates the relational model. Unlike SQL-DBMS tools, most MultiValue databases can be accessed both with or without SQL. == History == Don Nelson designed the MultiValue data model in the early to mid-1960s. Dick Pick, a developer at TRW, worked on the first implementation of this model for the US Army in 1965. Pick considered the software to be in the public domain because it was written for the military, this was but the first dispute regarding MultiValue databases that was addressed by the courts. Ken Simms wrote DataBASIC, sometimes known as S-BASIC, in the mid-1970s. It was based on Dartmouth BASIC, but had enhanced features for data management. Simms played a lot of Star Trek (a text-based early computer game originally written in Dartmouth BASIC) while developing the language, to ensure that DataBASIC functioned to his satisfaction. Three of the implementations of MultiValue - PICK version R77, Microdata Reality 3.x, and Prime Information 1.0 - were very similar. In spite of attempts to standardize, particularly by International Spectrum and the Spectrum Manufacturers Association, who designed a logo for all to use, there are no standards across MultiValue implementations. Subsequently, these flavors diverged, although with some cross-over. These streams of MultiValue database development could be classified as one stemming from PICK R83, one from Microdata Reality, and one from Prime Information. Because of the differences, some implementations have provisions for supporting several flavors of the languages. An attempt to document the similarities and differences can be found at the Post-Relational Database Reference (PRDB). One reasonable hypothesis for this data model lasting 50 years, with new database implementations of the model even in the 21st century is that it provides inexpensive database solutions. == Data model example == In a MultiValue database system: a database or schema is called an "account" a table or collection is called a "file" a column or field is called a field or an "attribute", which is composed of "multi-value attributes" and "sub-value attributes" to store multiple values in the same attribute. a row or document is called a "record" or "item" Data is stored using two separate files: a "file" to store raw data and a "dictionary" to store the format for displaying the raw data. For example, assume there's a file (table) called "PERSON". In this file, there is an attribute called "eMailAddress". The eMailAddress field can store a variable number of email address values in a single record. The list [[email protected], [email protected], [email protected]] can be stored and accessed via a single query when accessing the associated record. Achieving the same (one-to-many) relationship within a traditional relational database system would include creating an additional table to store the variable number of email addresses associated with a single "PERSON" record. However, modern relational database systems support this multi-value data model too. For example, in PostgreSQL, a column can be an array of any base type. == MultiValue Basic Language == Multivalue Basic (now commonly styled as mvBasic) is a family of programming languages more or less common (and portable) to all the multivalue databases derived from the original Pick Operating System. The variations between implementations are known as flavours. The language originates from Dartmouth Basic and the earliest implementation of PickBASIC (now D3 FlashBasic). Over time various customisations and extensions have been added to take advantage of capabilities added to the different flavours while staying mainly in sync. mvBasic statements and functions are designed to access and take advantage of the multivalue database model and providing the usual capabilities of most modern languages. For example, cryptography and communications. mvBasic is typeless and lends itself to structured programming techniques. Example code is available but limited. Whilst there are commercial applications and tools available, the multivalue database community has not embraced the open source library/package model to the degree seen with other languages. The typical mvBasic compiler compiles program source to a P-code executable object and runs in an interpreter, with D3 FlashBasic and jBASE being notable exceptions. == MultiValue Query Language == Known as ENGLISH, ACCESS, AQL, UniQuery, Retrieve, CMQL, and by many other names over the years, corresponding to the different MultiValue implementations, the MultiValue query language differs from SQL in several respects. Each query is issued against a single dictionary within the schema, which could be understood as a virtual file or a portal to the database through which to view the data. LIST PEOPLE LAST_NAME FIRST_NAME EMAIL_ADDRESSES WITH LAST_NAME LIKE "Van..." The above statement would list all e-mail addresses for each person whose last name starts with "Van". A single entry would be output for each person, with multiple lines showing the multiple e-mail addresses (without repeating other data about the person).

    Read more →
  • Semi-Automatic Ground Environment

    Semi-Automatic Ground Environment

    The Semi-Automated Ground Environment (SAGE) was a system of large computers and associated networking equipment that coordinated data from many radar sites and processed it to produce a single unified image of the airspace over a wide area. SAGE directed and controlled the NORAD response to a possible Soviet air attack, operating in this role from the late 1950s into the 1980s. The processing power behind SAGE was supplied by the largest discrete component-based computer ever built, the AN/FSQ-7, manufactured by IBM. Each SAGE Direction Center (DC) housed an FSQ-7 which occupied an entire floor, approximately 22,000 square feet (2,000 m2) not including supporting equipment. The FSQ-7 was actually two computers, "A" side and "B" side. Computer processing was switched from "A" side to "B" side on a regular basis, allowing maintenance on the unused side. Information was fed to the DCs from a network of radar stations as well as readiness information from various defense sites. The computers, based on the raw radar data, developed "tracks" for the reported targets, and automatically calculated which defenses were within range. Operators used light guns to select targets on-screen for further information, select one of the available defenses, and issue commands to attack. These commands would then be automatically sent to the defense site via teleprinter. Connecting the various sites was an enormous network of telephones, modems and teleprinters. Later additions to the system allowed SAGE's tracking data to be sent directly to CIM-10 Bomarc missiles and some of the US Air Force's interceptor aircraft in-flight, directly updating their autopilots to maintain an intercept course without operator intervention. Each DC also forwarded data to a Combat Center (CC) for "supervision of the several sectors within the division" ("each combat center [had] the capability to coordinate defense for the whole nation"). SAGE became operational in the late 1950s and early 1960s at an estimated total cost between 8 and 12 billion dollars, four times the cost of the Manhattan Project. Throughout its development, there were continual concerns about its real ability to deal with large attacks, and the Operation Sky Shield tests showed that only about one-fourth of enemy bombers would have been intercepted. Nevertheless, SAGE was the backbone of NORAD's air defense system into the 1980s, by which time the tube-based FSQ-7s were increasingly costly to maintain and completely outdated. Today the same command and control task is carried out by microcomputers, based on the same basic underlying data. == Background == === Earlier systems === Just prior to World War II, Royal Air Force (RAF) tests with the new Chain Home (CH) radars had demonstrated that relaying information to the fighter aircraft directly from the radar sites was not feasible. The radars determined the map coordinates of the enemy, but could generally not see the fighters at the same time. This meant the fighters had to be able to determine where to fly to perform an interception but were often unaware of their own exact location and unable to calculate an interception while also flying their aircraft. The solution was to send all of the radar information to a central control station where operators collated the reports into single tracks, and then reported these tracks to the airbases, or sectors. The sectors used additional systems to track their own aircraft, plotting both on a single large map. Operators viewing the map could then see what direction their fighters would have to fly to approach their targets and relay that simply by telling them to fly along a certain heading or vector. This Dowding system was the first ground-controlled interception (GCI) system of large scale, covering the entirety of the UK. It proved enormously successful during the Battle of Britain, and is credited as being a key part of the RAF's success. The system was slow, often providing information that was up to five minutes out of date. Against propeller driven bombers flying at perhaps 225 miles per hour (362 km/h) this was not a serious concern, but it was clear the system would be of little use against jet-powered bombers flying at perhaps 600 miles per hour (970 km/h). The system was extremely expensive in manpower terms, requiring hundreds of telephone operators, plotters and trackers in addition to the radar operators. This was a serious drain on manpower, making it difficult to expand the network. The idea of using a computer to handle the task of taking reports and developing tracks had been explored beginning late in the war. By 1944, analog computers had been installed at the CH stations to automatically convert radar readings into map locations, eliminating two people. Meanwhile, the Royal Navy began experimenting with the Comprehensive Display System (CDS), another analog computer that took X and Y locations from a map and automatically generated tracks from repeated inputs. Similar systems began development with the Royal Canadian Navy, DATAR, and the US Navy, the Naval Tactical Data System (NTDS). A similar system was also specified for the Nike SAM project, specifically referring to a US version of CDS, coordinating the defense over a battle area so that multiple batteries did not fire on a single target. All of these systems were relatively small in geographic scale, generally tracking within a city-sized area. === Valley Committee === When the Soviet Union tested its first atomic bomb in August 1949, the topic of air defense of the US became important for the first time. A study group, the "Air Defense Systems Engineering Committee", was set up under the direction of Dr. George Valley to consider the problem and is known to history as the "Valley Committee". Their December report noted a key problem in air defense using ground-based radars. A bomber approaching a radar station would detect the signals from the radar long before the reflection off the bomber was strong enough to be detected by the station. The committee suggested that when this occurred, the bomber would descend to low altitude, thereby greatly limiting the radar horizon, allowing the bomber to fly past the station undetected. Although flying at low altitude greatly increased fuel consumption, the team calculated that the bomber would only need to do this for about 10% of its flight, making the fuel penalty acceptable. The only solution to this problem was to build a huge number of stations with overlapping coverage. At that point the problem became one of managing the information. Manual plotting was ruled out as too slow, and a computerized solution was the only possibility. To handle this task, the computer would need to be fed information directly, eliminating any manual translation by phone operators, and it would have to be able to analyze that information and automatically develop tracks. A system tasked with defending cities against the predicted future Soviet bomber fleet would have to be dramatically more powerful than the models used in the NTDS or DATAR. The Committee then had to consider whether or not such a computer was possible. The Valley Committee was introduced to Jerome Wiesner, associate director of the Research Laboratory of Electronics at MIT. Wiesner noted that the Servomechanisms Laboratory had already begun development of a machine that might be fast enough. This was the Whirlwind I, originally developed for the Office of Naval Research as a general purpose flight simulator that could simulate any current or future aircraft by changing its software. Wiesner introduced the Valley Committee to Whirlwind's project lead, Jay Forrester, who convinced him that Whirlwind was sufficiently capable. In September 1950, an early microwave early-warning radar system at Hanscom Field was connected to Whirlwind using a custom interface developed by Forrester's team. An aircraft was flown past the site, and the system digitized the radar information and successfully sent it to Whirlwind. With this demonstration, the technical concept was proven. Forrester was invited to join the committee. === Project Charles === With this successful demonstration, Louis Ridenour, chief scientist of the Air Force, wrote a memo stating "It is now apparent that the experimental work necessary to develop, test, and evaluate the systems proposals made by ADSEC will require a substantial amount of laboratory and field effort." Ridenour approached MIT President James Killian with the aim of beginning a development lab similar to the war-era Radiation Laboratory that made enormous progress in radar technology. Killian was initially uninterested, desiring to return the school to its peacetime civilian charter. Ridenour eventually convinced Killian the idea was sound by describing the way the lab would lead to the development of a local electronics industry based on the needs of the lab and the students who would leave the lab to start their

    Read more →
  • Campus network

    Campus network

    A campus network, campus area network, corporate area network or CAN is a computer network made up of an interconnection of local area networks (LANs) within a limited geographical area. The networking equipments (switches, routers) and transmission media (optical fiber, copper plant, Cat5 cabling etc.) are almost entirely owned by the campus tenant / owner: an enterprise, university, government etc. A campus area network is larger than a local area network but smaller than a metropolitan area network (MAN) or wide area network (WAN). == University campuses == College or university campus area networks often interconnect a variety of buildings, including administrative buildings, academic buildings, laboratories, university libraries, or student centers, residence halls, gymnasiums, and other outlying structures, like conference centers, technology centers, and training institutes. Early examples include the Stanford University Network at Stanford University, Project Athena at MIT, and the Andrew Project at Carnegie Mellon University. == Corporate campuses == Much like a university campus network, a corporate campus network serves to connect buildings. Examples of such are the networks at Googleplex and Microsoft's campus. Campus networks are normally interconnected with high speed Ethernet links operating over optical fiber such as gigabit Ethernet and 10 Gigabit Ethernet. == Area range == The range of CAN is 1 to 5 km (1 to 3 mi). If two buildings have the same domain and they are connected with a network, then it will be considered as CAN only. Though the CAN is mainly used for corporate campuses so the link will be high speed.

    Read more →
  • Storyful

    Storyful

    Storyful (stylized as storyful.) is a social media intelligence company headquartered in Dublin, Ireland that is a subsidiary of News Corp, offering services such as social news monitoring, video licensing, and reputation risk management tools for corporate clients. The startup was launched as the first social media newswire, a content aggregator, verifying news sources and online content in Dublin in 2010 by Mark Little, a former journalist with RTÉ News. Storyful was acquired by News Corp in 2013 for USD$25 million. == Background == Mark Little, who had worked as a television journalist for RTÉ One, founded startup Storyful in Dublin, Ireland, in 2010, as a service that "verified news sources and online content". According to Nieman Lab, Storyful had a reputation for content aggregation as a social news agency—finding, verifying, distributing, licensing, and commercializing user-generated content, social media and online content from social networking services, including videos about stories in the news, such as the Syrian Civil War, Arab Spring protests, as well as "smaller viral moments". Storyful aimed to provide authority through its verification and monitoring tools while providing authenticity through user-generated content. On 20 December 2013 News Corp purchased Storyful for US$25 million and opened a New York office in the same building as Fox News' main studios. Little left Storyful in 2015 and Gavin Sheridan, Storyful's director of innovation left in 2014. News Corp CEO Robert Thomson said that through Storyful, News Corp would "define the opportunities that the digital landscape presents, rather than simply adapt to them." After the acquisition, the company expanded its service to include "commercial and creative work". After Murdoch acquired the company, from 2014 through to February 2018, losses "swelled", requiring a series of cash injections from News Corp. During that time the company expanded aggressively globally with a staff of about 200 worldwide up from about 30 in 2014. According to The Guardian, in 2016, journalists were encouraged by Storyful to use the social media monitoring software called Verify developed by Storyful. By installing Verify's web browser extension on their computers, Verify would inform the journalists when social media content had been "verified and cleared". The Guardian revealed that through the Verify plugin, dozens of staff in four offices had access to the journalists browsing activity without them knowing. This data allowed Storyful to actively monitor its own clients' activities on social media and to "turn it into an internal feed" at Storyful that "updates in real time". In November 2018, when a video circulated by Infowars' Paul Joseph Watson appeared to prove that CNN's Jim Acosta's contact with a White House intern was a physical blow, Storyful was able to prove that the 15-second-long clip had been doctored. According to a 21 January 2019 article in CNN Business, Rob McDonagh, the editor of Storyful's U.S. news team, had proven that one of the viral videos that served as catalysts in the January 2019 Lincoln Memorial confrontation at 18 January 2019 Indigenous Peoples March, was posted by a suspicious account, under the handle @2020fight. McDonagh's team validates videos and posts before adding them to their "digest", distinguishing true stories from those that are not. Storyful attempts to validate each post or video before including it in its digest. McDonagh reviewed previous content from @2020fight's account, and found it suspicious because it had a high follower count, a "highly polarized and yet inconsistent political messaging", an "unusually high rate of tweets", and "the use of someone else's image in the profile photo." reporter Donie O'Sullivan said that the @2020fight video that had been posted on 18 January, which had 2.5 million views by 22 January, was the one that "helped frame the news cycle". Currently the website offers a service by which video can be commercially brokered. == Services == Services include a newswire service—one of their "core pillars"—and social news monitoring. By February 2018, Storyful was developing "risk and reputation monitoring" services through which they would source and verify social news, fact-checking it and contextualising it for corporate clients. They were "developing tech tools" to "explore obscure or closed networks" for their intelligence team. can use to explore obscure or closed networks. They "track deviations in social conversations around brands and organisations and catch potential risks before they blow up. Like an alerts system." The company "released a re-booted version of its Newswire platform in 2018. According to FORA, Storyful was developing new tools to combat fake news online. == Clients == When Storyful was acquired by News Corp in 2013, the company already had the Wall Street Journal, the BBC, New York Times, YouTube, ITN and Channel 4 News as clients. By 2018 their clients included CNN, ABC News and Fox News, The New York Times, the Washington Post, in the United States, the Australian Broadcasting Corporation and all of News Corp’s own publications. Most of their "reputation-conscious corporate customers" clients prefer to not be named.

    Read more →
  • Rademacher complexity

    Rademacher complexity

    In computational learning theory (machine learning and theory of computation), Rademacher complexity, named after Hans Rademacher, measures richness of a class of sets with respect to a probability distribution. The concept can also be extended to real valued functions. == Definitions == === Rademacher complexity of a set === Given a set A ⊆ R m {\displaystyle A\subseteq \mathbb {R} ^{m}} , the Rademacher complexity of A is defined as follows: Rad ⁡ ( A ) := 1 m E σ [ sup a ∈ A ∑ i = 1 m σ i a i ] {\displaystyle \operatorname {Rad} (A):={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{a\in A}\sum _{i=1}^{m}\sigma _{i}a_{i}\right]} where σ 1 , σ 2 , … , σ m {\displaystyle \sigma _{1},\sigma _{2},\dots ,\sigma _{m}} are independent random variables drawn from the Rademacher distribution i.e. Pr ( σ i = + 1 ) = Pr ( σ i = − 1 ) = 1 / 2 {\displaystyle \Pr(\sigma _{i}=+1)=\Pr(\sigma _{i}=-1)=1/2} for i ∈ { 1 , 2 , … , m } {\displaystyle i\in \{1,2,\dots ,m\}} , and a = ( a 1 , … , a m ) ∈ A {\displaystyle a=(a_{1},\ldots ,a_{m})\in A} . Some authors take the absolute value of the sum before taking the supremum, but if A {\displaystyle A} is symmetric this makes no difference. === Rademacher complexity of a function class === Let S = { z 1 , z 2 , … , z m } ⊆ Z {\displaystyle S=\{z_{1},z_{2},\dots ,z_{m}\}\subseteq Z} be a sample of points and consider a function class F {\displaystyle {\mathcal {F}}} of real-valued functions over Z {\displaystyle Z} . Then, the empirical Rademacher complexity of F {\displaystyle {\mathcal {F}}} given S {\displaystyle S} is defined as: Rad S ⁡ ( F ) = 1 m E σ [ sup f ∈ F | ∑ i = 1 m σ i f ( z i ) | ] {\displaystyle \operatorname {Rad} _{S}({\mathcal {F}})={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{f\in {\mathcal {F}}}\left|\sum _{i=1}^{m}\sigma _{i}f(z_{i})\right|\right]} This can also be written using the previous definition: Rad S ⁡ ( F ) = Rad ⁡ ( F ∘ S ) {\displaystyle \operatorname {Rad} _{S}({\mathcal {F}})=\operatorname {Rad} ({\mathcal {F}}\circ S)} where F ∘ S {\displaystyle {\mathcal {F}}\circ S} denotes function composition, i.e.: F ∘ S := { ( f ( z 1 ) , … , f ( z m ) ) ∣ f ∈ F } {\displaystyle {\mathcal {F}}\circ S:=\{(f(z_{1}),\ldots ,f(z_{m}))\mid f\in {\mathcal {F}}\}} The worst case empirical Rademacher complexity is Rad ¯ m ( F ) = sup S = { z 1 , … , z m } Rad S ⁡ ( F ) {\displaystyle {\overline {\operatorname {Rad} }}_{m}({\mathcal {F}})=\sup _{S=\{z_{1},\dots ,z_{m}\}}\operatorname {Rad} _{S}({\mathcal {F}})} Let P {\displaystyle P} be a probability distribution over Z {\displaystyle Z} . The Rademacher complexity of the function class F {\displaystyle {\mathcal {F}}} with respect to P {\displaystyle P} for sample size m {\displaystyle m} is: Rad P , m ⁡ ( F ) := E S ∼ P m [ Rad S ⁡ ( F ) ] {\displaystyle \operatorname {Rad} _{P,m}({\mathcal {F}}):=\mathbb {E} _{S\sim P^{m}}\left[\operatorname {Rad} _{S}({\mathcal {F}})\right]} where the above expectation is taken over an identically independently distributed (i.i.d.) sample S = ( z 1 , z 2 , … , z m ) {\displaystyle S=(z_{1},z_{2},\dots ,z_{m})} generated according to P {\displaystyle P} . == Intuition == The Rademacher complexity is typically applied on a function class of models that are used for classification, with the goal of measuring their ability to classify points drawn from a probability space under arbitrary labellings. When the function class is rich enough, it contains functions that can appropriately adapt for each arrangement of labels, simulated by the random draw of σ i {\displaystyle \sigma _{i}} under the expectation, so that this quantity in the sum is maximized. The Rademacher complexity of a set A {\displaystyle A} can be rewritten as Rad ⁡ ( A ) := 1 m E σ [ sup a ∈ A ∑ i = 1 m σ i a i ] = 1 m 2 m ∑ σ ∈ { − 1 / m , + 1 / m } m [ sup a ∈ A ⟨ σ , a ⟩ ] . {\displaystyle \operatorname {Rad} (A):={\frac {1}{m}}\mathbb {E} _{\sigma }\left[\sup _{a\in A}\sum _{i=1}^{m}\sigma _{i}a_{i}\right]={\frac {1}{{\sqrt {m}}2^{m}}}\sum _{\sigma \in \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}}\left[\sup _{a\in A}\langle \sigma ,a\rangle \right].} Each term in the summation is the farthest distance of the set A {\displaystyle A} from the origin, along a unit-length direction σ {\displaystyle \sigma } . The directions are along the vertices of a hypercube. Thus, we can also write it as Rad ⁡ ( A ) = 1 2 m 1 2 m − 1 ∑ σ ∈ { − 1 / m , + 1 / m } m / { − 1 , + 1 } [ sup a ∈ A ⟨ σ , a ⟩ − inf a ∈ A ⟨ σ , a ⟩ ] {\displaystyle \operatorname {Rad} (A)={\frac {1}{2{\sqrt {m}}}}{\frac {1}{2^{m-1}}}\sum _{\sigma \in \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}/\{-1,+1\}}\left[\sup _{a\in A}\langle \sigma ,a\rangle -\inf _{a\in A}\langle \sigma ,a\rangle \right]} Here, the set { − 1 / m , + 1 / m } m / { − 1 , + 1 } {\displaystyle \{-1/{\sqrt {m}},+1/{\sqrt {m}}\}^{m}/\{-1,+1\}} denotes half of the vertices of a hypercube, selected so that each diagonal has exactly one vertex selected. In words, this states that 2 m Rad ⁡ ( A ) {\displaystyle 2{\sqrt {m}}\operatorname {Rad} (A)} is precisely the average width of the set A {\displaystyle A} along all diagonal directions of a hypercube. == Examples == A singleton set has 0 width in any direction, so it has Rademacher complexity 0. The set A = { ( 1 , 1 ) , ( 1 , 2 ) } ⊆ R 2 {\displaystyle A=\{(1,1),(1,2)\}\subseteq \mathbb {R} ^{2}} has average width 1 / 2 {\displaystyle 1/{\sqrt {2}}} along the two diagonal directions of the square, so it has Rademacher complexity 1 / 4 {\displaystyle 1/4} . The unit cube [ 0 , 1 ] m {\displaystyle [0,1]^{m}} has constant width m {\displaystyle {\sqrt {m}}} along the diagonal directions, so it has Rademacher complexity 1 / 2 {\displaystyle 1/2} . Similarly, the unit cross-polytope { x ∈ R m : ‖ x ‖ 1 ≤ 1 } {\displaystyle \{x\in \mathbb {R} ^{m}:\|x\|_{1}\leq 1\}} has constant width 2 / m {\displaystyle 2/{\sqrt {m}}} along the diagonal directions, so it has Rademacher complexity 1 / m {\displaystyle 1/m} . == Using the Rademacher complexity == The Rademacher complexity can be used to derive data-dependent upper-bounds on the learnability of function classes. Intuitively, a function-class with smaller Rademacher complexity is easier to learn. === Bounding the representativeness === In machine learning, it is desired to have a training set that represents the true distribution of some sample data S {\displaystyle S} . This can be quantified using the notion of representativeness. Denote by P {\displaystyle P} the probability distribution from which the samples are drawn. Denote by H {\displaystyle H} the set of hypotheses (potential classifiers) and denote by F {\displaystyle {\mathcal {F}}} the corresponding set of error functions, i.e., for every hypothesis h ∈ H {\displaystyle h\in H} , there is a function f h ∈ F {\displaystyle f_{h}\in F} , that maps each training sample (features,label) to the error of the classifier h {\displaystyle h} (note in this case hypothesis and classifier are used interchangeably). For example, in the case that h {\displaystyle h} represents a binary classifier, the error function is a 0–1 loss function, i.e. the error function f h {\displaystyle f_{h}} returns 0 if h {\displaystyle h} correctly classifies a sample and 1 else. We omit the index and write f {\displaystyle f} instead of f h {\displaystyle f_{h}} when the underlying hypothesis is irrelevant. Define: L P ( f ) := E z ∼ P [ f ( z ) ] {\displaystyle L_{P}(f):=\mathbb {E} _{z\sim P}[f(z)]} – the expected error of some error function f ∈ F {\displaystyle f\in {\mathcal {F}}} on the real distribution P {\displaystyle P} ; L S ( f ) := 1 m ∑ i = 1 m f ( z i ) {\displaystyle L_{S}(f):={1 \over m}\sum _{i=1}^{m}f(z_{i})} – the estimated error of some error function f ∈ F {\displaystyle f\in {\mathcal {F}}} on the sample S {\displaystyle S} . The representativeness of the sample S {\displaystyle S} , with respect to P {\displaystyle P} and F {\displaystyle {\mathcal {F}}} , is defined as: Rep P ⁡ ( F , S ) := sup f ∈ F ( L P ( f ) − L S ( f ) ) {\displaystyle \operatorname {Rep} _{P}({\mathcal {F}},S):=\sup _{f\in F}(L_{P}(f)-L_{S}(f))} Smaller representativeness is better, since it provides a way to avoid overfitting: it means that the true error of a classifier is not much higher than its estimated error, and so selecting a classifier that has low estimated error will ensure that the true error is also low. Note however that the concept of representativeness is relative and hence can not be compared across distinct samples. The expected representativeness of a sample can be bounded above by the Rademacher complexity of the function class: If F {\displaystyle {\mathcal {F}}} is a set of functions with range within [ 0 , 1 ] {\displaystyle [0,1]} , then Rad P , m ⁡ ( F ) − ln ⁡ 2 2 m ≤ E S ∼ P m [ Rep P ⁡ ( F , S ) ] ≤ 2 Rad P , m ⁡ ( F ) {\displaystyle \operatorname {Rad} _{P,m}({\mathcal {F}})-{\sqrt {\frac {\ln 2}{2m}}}\leq \mathbb {E} _{S\sim P^{m}}[\operatorname {Rep} _{P}({\

    Read more →
  • Payment tokenization

    Payment tokenization

    Payment tokenization is a data security process that replaces sensitive payment information, such as credit card numbers, with a unique identifier or "token." This token can be used in place of actual data during transactions but has no exploitable value if breached, thereby reducing the risk of data theft and fraud. == Overview == Payment tokenization is generally categorized into two types: security tokens and payment tokens. Security tokens, also known as post-authorization tokens, are used to replace sensitive information like Primary Account Numbers (PANs), such as credit card numbers either after a payment is authorized or for storing data securely (data-at-rest), such as in merchant databases. These models have been in use since the mid-2000s, following the introduction of the Payment Card Industry Data Security Standard in 2004, which established standards for safeguarding cardholder data. The Payment Card Industry Security Standards Council's 2011 Tokenization Guidelines and the proposed American National Standards Institute X9 standards emphasize using tokens primarily to secure sensitive information, not as replacements for payment credentials processed over financial networks. Traditionally, merchants stored PANs to support backend operations such as settlements, reconciliations, chargebacks, loyalty programs, and customer service. However, with the adoption of security tokenization, merchants can substitute PANs with tokens in their systems. This not only reduces their exposure to fraud but also helps minimize the scope and cost of PCI-DSS compliance, offering a more secure and efficient way to manage cardholder data. == Applications == Payment tokenization is widely used by mobile wallets such as Apple Pay, Google Pay, and Samsung Pay use tokenization to safely store card data on devices. E-commerce platforms rely on it to securely retain customer payment details for recurring purchases. At the physical point of sale, EMV-enabled systems use tokenization to protect card information during in-store transactions. Also, subscription billing services implement tokenization to manage and safeguard payment credentials for ongoing charges.

    Read more →
  • Virtual influencer

    Virtual influencer

    A virtual influencer, sometimes described as a virtual persona or virtual model, is a computer-generated fictional character that can be used for a variety of marketing-related purposes, but most frequently for social media marketing, in lieu of online human "influencers". Most virtual influencers are designed using computer graphics and motion capture technology to resemble real people in realistic situations. Common derivatives of virtual influencers include VTubers, which broadly refer to online entertainers and YouTubers who represent themselves using virtual avatars instead of their physical selves. == History == Virtual influencers are fundamentally synonymous with virtual idols, which originate from Japan's anime and Japanese idol culture that dates back to the 1980s. The first virtual idol created was Lynn Minmay, a fictional singer and main character of the anime television series Super Dimension Fortress Macross (1982) and the animated film adaptation Macross: Do You Remember Love? (1984). Minmay's success led to the production of more Japanese virtual idols, such as EVE from the Japanese cyberpunk anime Megazone 23 (1985), and Sharon Apple in Macross Plus (1994). Virtual idols were not always well received – in 1995, Japanese talent agency Horipro created Kyoko Date, which was inspired by the Macross franchise and dating sim games such as Tokimeki Memorial (1994). Date failed to gain commercial success despite drawing headlines for her debut as a CGI idol, largely due to technical limitations leading to issues such as unnatural movements, an issue also known as the uncanny valley. Since their inception, many virtual idols created have achieved continual success, with notable names including the Vocaloid singer Hatsune Miku, and the VTuber Kizuna AI. Technological advancements have also enabled production teams to use artificial intelligence and advanced techniques to customize the personalities and behavior of virtual idols. Due to modern-day advancements in technology, many virtual idols have held real-life tours and events. Notable ones include Hatsune Miku's titular tour Miku Expo and Hololive's concerts with many of their idols from their English, Japanese and Indonesian branches. Some notable events including virtual singers and influencers have included: Hatsune Miku opening for Lady Gaga in 2014 and Hoshimachi Suisei's concerts at the famous Budokan venue in Japan and her addition to the Forbes Japan list of '30 Under 30' individuals who are changing the world in their respective fields. == Benefits and criticism == From a branding perspective, virtual influencers are perceived to be much less likely to be mired in scandals. In China, celebrities caught in bad publicity such as singer Wang Leehom and entertainer Kris Wu have heightened the appeal of virtual influencers, since their existence relies entirely on computer-generated imagery and they are therefore unlikely to cause any damage to a brand's image by association. Some studies have also suggested that Generation Z consumers have a unique appetite for virtual idols and influencers, since they grew up in the age of the internet. Studies also show that human-like appearance of virtual influencers show higher message credibility than anime-like virtual influencers. Scholars and commentators have also questioned the ethics and cultural impact of virtual influencers, arguing that computer-generated personas can entrench unrealistic beauty standards while diffusing accountability for labor, identity, and consent. Business and marketing analysts have also warned that disclosure and governance remain inconsistent, recommending clearer guardrails and transparency when brands deploy synthetic spokespeople. In 2025, reporting highlighted concerns that AI-driven "virtual humans" could displace human creators and sales workers, intensifying debates over the future of creative labor and authenticity online. == Notable examples == === Virtual bands === Eternity - A South Korean virtual idol group formed by Pulse9. Gorillaz - A virtual band formed in 1998. K/DA - A virtual K-pop girl group created as part of the League of Legends video game franchise. MAVE: - A South Korean virtual girl group formed in 2023 by Metaverse Entertainment. Pentakill - A virtual heavy metal band created as part of the League of Legends video game franchise. Plave (band) - A South Korean virtual boy band formed by VLast. Squid Sisters and Off the Hook - Two virtual pop idol duos as part of the Splatoon series. Studio Killers - A Finnish-Danish-British virtual band formed in 2011. === Vocaloids === Hatsune Miku (modeled after Saki Fujita) Kagamine Rin/Len (modeled after Asami Shimoda) Megurine Luka (modeled after Yū Asakawa) Meiko (modeled after Meiko Haigō) Kaito (modeled after Naoto Fūga) === VTubers === Kano Kizuna AI Neuro-sama VShojo Ironmouse Projekt Melody Nijisanji Hololive Akai Haato Gawr Gura Hoshimachi Suisei Natsuiro Matsuri === Other examples === Ami Yamato Crazy Frog FN Meka IA Kuki AI Kyoko Date Kyra Miquela Naevis Shudu Gram

    Read more →