AI Chat Hpt

AI Chat Hpt — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Biometric device


    A biometric device is a security identification and authentication device. Such devices use automated methods of verifying or recognising the identity of a living person based on a physiological or behavioral characteristic. These characteristics include fingerprints, facial images, iris patterns and voice. == History == Biometric devices have been in use for thousands of years. Non-automated biometric devices have been in use since 500 BC, when ancient Babylonians would sign their business transactions by pressing their fingertips into clay tablets. Automation in biometric devices was first seen in the 1960s, when the Federal Bureau of Investigation (FBI) introduced the Identimat, which started checking for fingerprints to maintain criminal records. The first systems measured the shape of the hand and the length of the fingers. Although discontinued in the 1980s, the system set a precedent for future biometric devices. == Subgroups == Biometric devices use characteristics of the human body to grant users access to information. According to these characteristics, the devices are divided into the following subgroups: Chemical biometric devices: analyse segments of DNA to grant access to users. Visual biometric devices: analyse visual features of humans to grant access; this includes iris recognition, face recognition, finger recognition and retina recognition. Behavioral biometric devices: analyse gait and signatures (velocity, width and pressure of the signature), which are distinct to every human. Olfactory biometric devices: analyse odor to distinguish between users. Auditory biometric devices: analyse the voice to determine the identity of a speaker for access control. == Uses == === Workplace === Biometrics are being used to establish better and more accessible records of the hours employees work. 
With the increase in "buddy punching" (where employees clock in or out for coworkers and fraudulently inflate their work hours), employers have looked towards new technology such as fingerprint recognition to reduce such fraud. Employers are also faced with the task of properly collecting data such as entry and exit times. Biometric devices offer a largely foolproof and reliable way of collecting this data, as employees have to be present to enter biometric details which are unique to them. === Immigration === As the demand for air travel grows and more people travel, modern-day airports have to implement technology in such a way that there are no long queues. Biometrics are being implemented in more and more airports as they enable quick recognition of passengers and hence lead to fewer people standing in queues. One example is Dubai International Airport, which plans to make immigration counters a relic of the past by implementing iris-on-the-move (IOM) technology, which should help make departures and arrivals of passengers at the airport seamless. === Handheld and personal devices === Fingerprint sensors can be found on mobile devices. The fingerprint sensor is used to unlock the device and authorize actions such as money and file transfers. It can be used to prevent a device from being used by an unauthorized person. Fingerprint sensors are also used to record attendance in a number of colleges and universities. == Present day biometric devices == === Personal signature verification systems === This is one of the most widely recognised and accepted biometrics in corporate surroundings. This verification has been taken one step further by capturing the signature while taking into account many related parameters, such as the pressure applied while signing, the speed of the hand movement and the angle between the surface and the pen used to make the signature. 
This system also has the ability to learn from users, as signature styles vary for the same user. Hence, by taking a sample of data, the system is able to increase its own accuracy. === Iris recognition system === Iris recognition involves the device scanning the iris of the subject and then cross-referencing it with data stored in a database. It is one of the most secure forms of authentication: while fingerprints can be left behind on surfaces, iris prints are extremely hard to steal. Iris recognition is widely applied by organisations dealing with the masses, one example being the Aadhaar identification system issued by the Government of India to keep records of its population. The reason for this is that iris recognition makes use of iris prints, which change little over the course of one's lifetime. == Problems with present day biometric devices == === Biometric spoofing === Biometric spoofing is a method of fooling a biometric identification management system, in which a counterfeit mold is presented to the biometric scanner. This counterfeit mold emulates the unique biometric attributes of an individual so as to make the system confuse the artifact with the real biological target, and so gain access to sensitive data or materials. One high-profile case of biometric spoofing came to light when it was found that the fingerprint of the German Defence Minister, Ursula von der Leyen, had been successfully replicated by the Chaos Computer Club. The group used high-quality camera lenses and shot images from 6 feet away. They used professional fingerprint software and mapped the contours of the minister's thumbprint. Progress has been made to stop spoofing: using the principle of pulse oximetry, the liveness of the test subject is taken into account by measuring blood oxygenation and heart rate. This reduces attacks like the one mentioned above, although these methods are not yet commercially applicable as the costs of implementation are high. 
This reduces their real-world application and hence leaves biometrics insecure until these methods are commercially viable. === Accuracy === Accuracy is a major issue with biometric recognition. Passwords are still extremely popular because a password is static in nature, while biometric data can be subject to change (such as one's voice becoming deeper due to puberty, or an accident to the face, which could lead to improper reading of facial scan data). When testing voice recognition as a substitute for PIN-based systems, Barclays reported that its voice recognition system is 95 percent accurate. This statistic means that the voices of many of its customers might still not be recognised even when they are genuine. The uncertainty surrounding the system could lead to slower adoption of biometric devices and continued reliance on traditional password-based methods. == Benefits of biometric devices over traditional methods of authentication == Biometric data cannot be lent, and hacking biometric data is complicated; this makes it safer to use than traditional methods of authentication such as passwords, which can be lent and shared. Passwords cannot judge the user but rely only on the data the user provides, which can easily be stolen, while biometrics work on the uniqueness of each individual. Passwords can be forgotten and recovering them can take time, whereas biometric devices rely on biometric data which tends to be unique to a person, so there is no risk of forgetting the authentication data. A study conducted among Yahoo! users found that at least 1.5 percent of them forgot their passwords every month, which makes accessing services slower for consumers, as the process of recovering passwords is lengthy. These shortcomings of passwords make biometric devices more efficient and reduce effort for the end user. 
== Future == Researchers are targeting the drawbacks of present-day biometric devices and developing techniques to reduce problems like biometric spoofing and inaccurate intake of data. Technologies being developed include the following. The United States Military Academy is developing an algorithm that allows identification through the way each individual interacts with their own computer; this algorithm considers unique traits like typing speed, rhythm of writing and common spelling mistakes. This data allows the algorithm to create a unique profile for each user by combining multiple pieces of behavioral and stylometric information, which can be very difficult to replicate collectively. A recent innovation by Kenneth Okereafor and, presented an optimized and secure design for applying a biometric liveness detection technique using a trait randomization approach. This novel concept potentially opens up new ways of mitigating biometric spoofing more accurately, and of making impostor predictions intractable or very difficult in future biometric devices. A simulation of Kenneth Okereafor's biometric liveness detection algorithm, using a 3D multi-biometric framework consisting of 15 liveness parameters from facial print, fingerprint and iris pattern traits, resulted in a system efficiency of 99.2% over a cardinality of 125 distinct randomization combinat

  • Data definition specification


    In computing, a data definition specification (DDS) is a guideline to ensure comprehensive and consistent data definition. It represents the attributes required to quantify data definition. A comprehensive data definition specification encompasses enterprise data, the hierarchy of data management, prescribed guidance enforcement and criteria to determine compliance. == Overview == A data definition specification may be developed for any organization or specialized field, improving the quality of its products through consistency and transparency. It eliminates redundancy (since all contributing areas are referencing the same specification) and provides standardization and degrees of compliance, making it easier and more efficient to create, modify, verify, analyze and share information across the enterprise. To understand how a data definition specification works in an enterprise, we must look at the elements of a DDS. Writing data definitions, defining business terms (or rules) in the context of a particular environment, provides structure for an organization's data architecture. In developing these definitions, the words used must be traceable to clearly defined data. A data definition specification may be used in the following activities: Business intelligence Business process modeling Business rules management Data analysis and modeling Information architecture Metadata modeling Data mastering Report generation == Criteria == A data definition specification requires data definitions to be: Atomic – singular, describing only one concept. Commonly used and ambiguous terms should be defined. While a term refers to one concept, several words may be used in a term: File – A concept identifiable with one word File extension – A concept identifiable with more than one word Traceable – Mapped to a specific data element. In business, a term may be traced to an entity (for example, a customer) or an attribute (such as a customer's name). 
A term may be a value in a data set (such as gender), or designate the data set itself. Traceability indicates relationships in the data hierarchy. Consistent - Used in a standard syntax; if used in a specific context, the context is noted Accurate - Precise, correct and unambiguous, stating what the term is and is not Clear - Readily understood by the reader Complete - With the term, its description and contextual references Concise - To avoid circular references == Applications == === Enterprise data === A data definition specification was produced by the Open Mobile Alliance to document charging data. The document, the centralized catalog of data elements defined for interfaces, specifies the mapping of these data elements to protocol fields in the interfaces. Created for the exchange of financial data, Market Data Definition Language (MDDL) is an XML specification designed to enable the interchange of information necessary to account, to analyze, and to trade financial instruments of the world's markets. It defines an XML-based interchange format and common data dictionary on the fields needed to describe: (1) financial instruments, (2) corporate events affecting value and tradability, and (3) market-related, economic and industrial indicators. The principal function of MDDL is to allow entities to exchange market data by standardizing formats and definitions. MDDL provides a common format for market data so that it can be efficiently passed from one processing system to another and provides a common understanding of market data content by standardizing terminology and by normalizing the relationships of various data elements to one another ... 
From the user perspective, the goal of MDDL is to enable users to integrate data from multiple sources by standardizing both the input feeds used for data warehousing (i.e., define what's being provided by vendors) and the output methods by which client applications request the data (i.e., ensure compatibility on how to get data in and out of applications)." === Clinical submissions === The Clinical Data Interchange Standards Consortium, a global, multidisciplinary, non-profit organization, has established standards to support the acquisition, exchange, submission and archiving of clinical research data and metadata. CDISC standards are vendor-neutral, platform-independent and freely available from the CDISC website. The Case Report Tabulation Data Definition Specification (define.xml) draft version 2.0, the oldest data definition specification, is part of the evolution from the 1999 FDA electronic submission (eSub) guidance and electronic Common Technical Document (eCTD) documents specifying that a document describing the content and structure of included data be included in a submission. Define.xml was developed to automate the review process by generating a machine-readable data-definition document. Define.xml has standardized submissions to the Food and Drug Administration, reducing review times from over two years to several months. === Archival data === A data definition specification is the foundation of metadata for scientific data archiving. The Metadata Encoding and Transmission Standard (METS) uses one principle of a DDS: consistent use of key terms to catalog digital objects for global use. The METS schema is a flexible mechanism for encoding descriptive, administrative and structural metadata for a digital library object and expressing complex links between metadata, and can provide a useful standard for the exchange of digital-library objects between repositories. 
A similar effort is underway to preserve complex data associated with video-game archiving. Preserving Virtual Worlds attempted to address archival-format deficiencies, citing the lack of suitable documentation for interactive fiction and games at the bit level: specifically, the absence of "representation information" needed to map raw bits into higher-level data constructs. Preserving Virtual Worlds 2 is a research project expanding on initial efforts in this field.

  • Merit Network


    Merit Network, Inc., is a nonprofit member-governed organization providing high-performance computer networking and related services to educational, government, health care, and nonprofit organizations, primarily in Michigan. Created in 1966, Merit operates the longest running regional computer network in the United States. == Organization == Created in 1966 as the Michigan Educational Research Information Triad by Michigan State University (MSU), the University of Michigan (U-M), and Wayne State University (WSU), Merit was created to investigate resource sharing by connecting the mainframe computers at these three Michigan public research universities. Merit's initial three node packet-switched computer network was operational in October 1972 using custom hardware based on DEC PDP-11 minicomputers and software developed by the Merit staff and the staffs at the three universities. Over the next dozen years the initial network grew as new services such as dial-in terminal support, remote job submission, remote printing, and file transfer were added; as gateways to the national and international Tymnet, Telenet, and Datapac networks were established, as support for the X.25 and TCP/IP protocols was added; as additional computers such as WSU's MVS system and the UM's electrical engineering's VAX running UNIX were attached; and as new universities became Merit members. Merit's involvement in national networking activities started in the mid-1980s with connections to the national supercomputing centers and work on the 56 kbit/s National Science Foundation Network (NSFNET), the forerunner of today's Internet. From 1987 until April 1995, Merit re-engineered and managed the NSFNET backbone service. 
MichNet, Merit's regional network in Michigan was attached to NSFNET and in the early 1990s Merit began extending "the Internet" throughout Michigan, offering both direct connect and dial-in services, and upgrading the statewide network from 56 kbit/s to 1.5 Mbit/s, and on to 45, 155, 622 Mbit/s, and eventually 1 and 10 Gbit/s. In 2003 Merit began its transition to a facilities based network, using fiber optic facilities that it shares with its members, that it purchases or leases under long-term agreements, or that it builds. In addition to network connectivity services, Merit offers a number of related services within Michigan and beyond, including: Internet2 connectivity, VPN, Network monitoring, Voice over IP (VOIP), Cloud storage, E-mail, Domain Name, Network Time, VMware and Zimbra software licensing, Colocation, and professional development seminars, workshops, classes, conferences, and meetings. == History == === Creating the network: 1966 to 1973 === The Michigan Educational Research Information Triad (MERIT) was formed in the fall of 1966 by Michigan State University (MSU), University of Michigan (U-M), and Wayne State University (WSU). More often known as the Merit Computer Network or simply Merit, it was created to design and implement a computer network connecting the mainframe computers at the universities. In the fall of 1969, after funding for the initial development of the network had been secured, Bertram Herzog was named director for MERIT. Eric Aupperle was hired as senior engineer, and was charged with finding hardware to make the network operational. The National Science Foundation (NSF) and the State of Michigan provided the initial funding for the network. In June 1970, the Applied Dynamics Division of Reliance Electric in Saline, Michigan was contracted to build three Communication Computers or CCs. 
Each would consist of a Digital Equipment Corporation (DEC) PDP-11 computer, dataphone interfaces, and interfaces that would attach them directly to the mainframe computers. The cost was to be slightly less than the $300,000 ($2,487,100, adjusted for inflation) originally budgeted. Merit staff wrote the software that ran on the CCs, while staff at each of the universities wrote the mainframe software to interface to the CCs. The first completed connection linked the IBM S/360-67 mainframe computers running the Michigan Terminal System at WSU and U-M, and was publicly demonstrated on December 14, 1971. The MSU node was completed in October 1972, adding a CDC 6500 mainframe running Scope/Hustler. The network was officially dedicated on May 15, 1973. === Expanding the network: 1974 to 1985 === In 1974, Herzog returned to teaching in the University of Michigan's Industrial Engineering Department, and Aupperle was appointed as director. Use of the all uppercase name "MERIT" was abandoned in favor of the mixed case "Merit". The first network connections were host to host interactive connections which allowed person to remote computer or local computer to remote computer interactions. To this, terminal to host connections, batch connections (remote job submission, remote printing, batch file transfer), and interactive file copy were added. And, in addition to connecting to host computers over custom hardware interfaces, the ability to connect to hosts or other networks over groups of asynchronous ports and via X.25 were added. Merit interconnected with Telenet (later SprintNet) in 1976 to give Merit users dial-in access from locations around the United States. Dial-in access within the U.S. and internationally was further expanded via Merit's interconnections to Tymnet, ADP's Autonet, and later still the IBM Global Network as well as Merit's own expanding network of dial-in sites in Michigan, New York City, and Washington, D.C. 
In 1978, Western Michigan University (WMU) became the fourth member of Merit (prompting a name change, as the acronym Merit no longer made sense as the group was no longer a triad). To expand the network, the Merit staff developed new hardware interfaces for the Digital PDP-11 based on printed circuit technology. The new system became known as the Primary Communications Processor (PCP), with the earliest PCPs connecting a PDP-10 located at WMU and a DEC VAX running UNIX at U-M's Electrical Engineering department. A second hardware technology initiative in 1983 produced the smaller Secondary Communication Processors (SCP) based on DEC LSI-11 processors. The first SCP was installed at the Michigan Union in Ann Arbor, creating UMnet, which extended Merit's network connectivity deeply into the U-M campus. In 1983 Merit's PCP and SCP software was enhanced to support TCP/IP and Merit interconnected with the ARPANET. === National networking, NSFNET, and the Internet: 1986 to 1995 === In 1986 Merit engineered and operated leased lines and satellite links that allowed the University of Michigan to access the supercomputing facilities at Pittsburgh, San Diego, and NCAR. In 1987, Merit, IBM and MCI submitted a winning proposal to NSF to implement a new NSFNET backbone network. The new NSFNET backbone network service began July 1, 1988. It interconnected supercomputing centers around the country at 1.5 megabits per second (T1), 24 times faster than the 56 kilobits-per-second speed of the previous network. The NSFNET backbone grew to link scientists and educators on university campuses nationwide and connect them to their counterparts around the world. The NSFNET project caused substantial growth at Merit, nearly tripling the staff and leading to the establishment of a new 24-hour Network Operations Center at the U-M Computer Center. 
In September 1990 in anticipation of the NSFNET T3 upgrade and the approaching end of the 5-year NSFNET cooperative agreement, Merit, IBM, and MCI formed Advanced Network and Services (ANS), a new non-profit corporation with a more broadly based Board of Directors than the Michigan-based Merit Network. Under its cooperative agreement with NSF, Merit remained ultimately responsible for the operation of NSFNET, but subcontracted much of the engineering and operations work to ANS. In 1991 the NSFNET backbone service was expanded to additional sites and upgraded to a more robust 45 Mbit/s (T3) based network. The new T3 backbone was named ANSNet and provided the physical infrastructure used by Merit to deliver the NSFNET Backbone Service. On April 30, 1995, the NSFNET project came to an end, when the NSFNET backbone service was decommissioned and replaced by a new Internet architecture with commercial Internet service providers (ISPs) interconnected at Network Access Points provided by multiple providers across the country. === Bringing the Internet to Michigan: 1985 to 2001 === During the 1980s, Merit Network grew to serve eight member universities, with Oakland University joining in 1985 and Central Michigan University, Eastern Michigan University, and Michigan Technological University joining in 1987. In 1990, Merit's board of directors formally changed the organization's name to Merit Network, Inc., and created the name MichNet to refer to Merit's statewide network. The board also approved a staff proposal to allow organizations other than publicly supported universities, referred to as aff

  • Netsukuku


    Netsukuku is an experimental peer-to-peer routing system, developed by the FreakNet MediaLab in 2005, created to build a distributed, anonymous and censorship-free network that is fully independent of, but not necessarily separated from, the Internet, without the support of any server, Internet service provider or central authority. Netsukuku is designed to handle up to 2^128 nodes without any servers or central systems, with minimal CPU and memory resources. This mesh network can be built using existing network infrastructure components such as Wi-Fi. The project has been in slow development since 2005 and has never left the beta stage. It has also never been tested on a large scale. == Operation == As of December 2011, the latest theoretical work on Netsukuku could be found in the author's master's thesis, Scalable Mesh Networks and the Address Space Balancing problem. The following description takes into account only the basic concepts of the theory. Netsukuku uses a custom routing protocol called QSPN (Quantum Shortest Path Netsukuku) that strives to be efficient and not taxing on the computational capabilities of each node. The current version of the protocol is QSPNv2. It adopts a hierarchical structure: 256 nodes are grouped inside a gnode (group node), 256 gnodes are grouped in a single ggnode (group of group nodes), 256 ggnodes are grouped in a single gggnode, and so on. This offers a set of advantages that are described in the main documentation. The protocol relies on the fact that the nodes are not mobile and that the network structure does not change quickly, as several minutes may be required before a change in the network is propagated. However, a node that joins the network is immediately able to communicate using the routes of its neighbors. When a node joins the mesh network, Netsukuku automatically adapts and all other nodes come to know the fastest and most efficient routes to communicate with the newcomer. No node has more privileges or restrictions than any other node. 
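The 256-way grouping described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming nodes are numbered with a flat integer ID and that groups are formed by integer division; both the numbering scheme and the function names are hypothetical and are not taken from the Netsukuku code base.

```python
GROUP_SIZE = 256  # nodes per gnode, gnodes per ggnode, and so on (from the text)


def group_path(node_id: int, levels: int) -> list[int]:
    """Return [gnode index, ggnode index, ...] for a hypothetical flat node ID."""
    if node_id < 0 or levels < 0:
        raise ValueError("node_id and levels must be non-negative")
    # At level k, every GROUP_SIZE**k consecutive IDs share the same group.
    return [node_id // GROUP_SIZE ** level for level in range(1, levels + 1)]


def levels_needed(total_nodes: int) -> int:
    """Number of grouping levels needed before one group contains every node."""
    levels = 0
    capacity = 1  # how many nodes a single group at the current level can hold
    while capacity < total_nodes:
        capacity *= GROUP_SIZE
        levels += 1
    return levels
```

Under these assumptions, node 70000 falls in gnode 273, which in turn belongs to ggnode 1, and an address space of 2^128 nodes needs 16 levels, because 256^16 = 2^128.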
The Domain Name System (DNS) is replaced by a decentralised and distributed system called ANDNA (Abnormal Netsukuku Domain Name Anarchy). The ANDNA database is included in the Netsukuku system, so each node holds a database that occupies at most 355 kilobytes of memory. Simplifying, ANDNA works as follows: to resolve a symbolic name, the host applies a hash function to the name. The hash function returns an address, which the host contacts to ask for the resolution of the name. The contacted node receives the request, searches its ANDNA database for the address associated with the name and returns it to the requesting host. Registration works in a similar way: for example, suppose that node X wants to register the name FreakNet.andna; X calculates the hash of the name and obtains the address 11.22.33.44, associated with node Y. Node X contacts Y asking to register the name as its own. Y stores the request in its database and answers any request for resolution of the hash 11.22.33.44 with X's address. The protocol is a little more complex than this, as the system uses public/private key pairs to authenticate hosts and prevent unauthorized changes to the ANDNA database. Furthermore, the protocol provides redundancy in the database to make it resistant to failure, and also provides for the migration of the database if the network topology changes. The protocol does not provide for the possibility of revoking a symbolic name; after a certain period of inactivity (currently 3 days) the name is simply deleted from the database. The protocol also prevents a single host from registering an excessive number of symbolic names (at present 256 names), in order to prevent spammers from storing a large number of terms for cybersquatting.
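The simplified registration and resolution flow described above can be sketched as follows. This is a toy model under stated assumptions, not the ANDNA implementation: SHA-256 stands in for the unspecified hash function, the hash is mapped onto a fixed list of known node addresses, the per-host counter is kept in one place, and authentication, redundancy and the 3-day expiry are left out. The names `ToyAndna`, `Node`, `register` and `resolve` are hypothetical.

```python
import hashlib

MAX_NAMES_PER_HOST = 256  # per-host limit mentioned in the text


class Node:
    """A node holding the part of the name database it is responsible for."""

    def __init__(self, address: str):
        self.address = address
        self.records: dict[str, str] = {}  # symbolic name -> registrant address


class ToyAndna:
    def __init__(self, addresses):
        self.nodes = [Node(a) for a in sorted(addresses)]
        # Assumption: a single shared counter replaces the distributed bookkeeping.
        self.names_per_host: dict[str, int] = {}

    def responsible_node(self, name: str) -> Node:
        """Hash the name to pick the node that stores its record."""
        digest = hashlib.sha256(name.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]

    def register(self, name: str, owner: str) -> bool:
        keeper = self.responsible_node(name)
        current = keeper.records.get(name)
        if current is not None and current != owner:
            return False  # name already registered by another host
        if current is None:
            count = self.names_per_host.get(owner, 0)
            if count >= MAX_NAMES_PER_HOST:
                return False  # host has reached its name quota
            self.names_per_host[owner] = count + 1
        keeper.records[name] = owner
        return True

    def resolve(self, name: str):
        """Ask the responsible node for the address registered under the name."""
        return self.responsible_node(name).records.get(name)
```

Because both `register` and `resolve` derive the responsible node from the same hash, any host can find a record without a central server, which is the property the description relies on.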

  • Abdul Majid Bhurgri Institute of Language Engineering


    Abdul Majid Bhurgri Institute of Language Engineering (Sindhi: عبدالماجد ڀرڳڙي انسٽيٽيوٽ آف لئنگئيج انجنيئرنگ) is an autonomous body under the administrative control of the Culture, Tourism and Antiquities Department, Government of Sindh, established to bring the Sindhi language on par with national and international languages in all computational processes and natural language processing. == Establishment == In recognition of the services of Abdul-Majid Bhurgri, the founder of Sindhi computing, the Government of Sindh established the institute in his name. The institute was initiated on a concept presented by the language engineer and linguist Amar Fayaz Buriro in a briefing to the Minister for Culture, Tourism and Antiquities, Government of Sindh, Syed Sardar Ali Shah, on 21 February 2017, during the celebration of International Mother Language Day at the Sindhi Language Authority, Hyderabad, Sindh. After the presentation, the minister announced the institute. The Government of Sindh then added the development scheme to the budget for fiscal year 2017-2018. == Projects == The institute has developed several projects aimed at advancing the Sindhi language and promoting linguistic research. Notable initiatives include the AMBILE Hamiz Ali Sindhi Optical character recognition, which allows for the accurate digitization of Sindhi text, and the ongoing Sindhi WordNet System, a project to build a comprehensive lexical database for natural language processing. The institute has also created the Font, which integrates symbols from the Indus script, Khudabadi script, and modern Perso-Arabic Script Code for Information Interchange into a single resource for researchers. Additionally, the institute has developed online converter tools that automatically transliterate between the Perso-Arabic script and the Devanagari script, improving linguistic accessibility. 
Another key project is Bhittaipedia, a digital platform dedicated to the preservation and dissemination of the poetry of Shah Abdul Latif Bhittai, one of Sindh's most renowned poets. == Location == The institute is located behind the Sindh Museum and the Sindhi Language Authority, N-5 National Highway, Qasimabad, Hyderabad, Sindh.

  • Data Transformation Services


    Data Transformation Services (DTS) is a Microsoft database tool with a set of objects and utilities to allow the automation of extract, transform and load operations to or from a database. The objects are DTS packages and their components, and the utilities are called DTS tools. DTS was included with earlier versions of Microsoft SQL Server, and was almost always used with SQL Server databases, although it could be used independently with other databases. DTS allows data to be transformed and loaded from heterogeneous sources using OLE DB, ODBC, or text-only files, into any supported database. DTS can also allow automation of data import or transformation on a scheduled basis, and can perform additional functions such as FTPing files and executing external programs. In addition, DTS provides an alternative method of version control and backup for packages when used in conjunction with a version control system, such as Microsoft Visual SourceSafe. DTS has been superseded by SQL Server Integration Services in later releases of Microsoft SQL Server though there was some backwards compatibility and ability to run DTS packages in the new SSIS for a time. == History == In SQL Server versions 6.5 and earlier, database administrators (DBAs) used SQL Server Transfer Manager and Bulk Copy Program, included with SQL Server, to transfer data. These tools had significant shortcomings, and many DBAs used third-party tools such as Pervasive Data Integrator to transfer data more flexibly and easily. With the release of SQL Server 7 in 1998, "Data Transformation Services" was packaged with it to replace all these tools. The concept, design, and implementation of the Data Transformation Services was led by Stewart P. MacLeod (SQL Server Development Group Program Manager), Vij Rajarajan (SQL Server Lead Developer), and Ted Hart (SQL Server Lead Developer). 
The goal was to make it easier to import, export, and transform heterogeneous data and simplify the creation of data warehouses from operational data sources. SQL Server 2000 expanded DTS functionality in several ways. It introduced new types of tasks, including the ability to FTP files, move databases or database components, and add messages into Microsoft Message Queue. DTS packages can be saved as a Visual Basic file in SQL Server 2000, and this can be expanded to save into any COM-compliant language. Microsoft also integrated packages into Windows 2000 security and made DTS tools more user-friendly; tasks can accept input and output parameters. DTS comes with all editions of SQL Server 7 and 2000, but was superseded by SQL Server Integration Services in the Microsoft SQL Server 2005 release in 2005. == DTS packages == The DTS package is the fundamental logical component of DTS; every DTS object is a child component of the package. Packages are used whenever one modifies data using DTS. All the metadata about the data transformation is contained within the package. Packages can be saved directly in a SQL Server, or can be saved in the Microsoft Repository or in COM files. SQL Server 2000 also allows a programmer to save packages in a Visual Basic or other language file (when stored to a VB file, the package is actually scripted—that is, a VB script is executed to dynamically create the package objects and its component objects). A package can contain any number of connection objects, but does not have to contain any. These allow the package to read data from any OLE DB-compliant data source, and can be expanded to handle other sorts of data. The functionality of a package is organized into tasks and steps. A DTS Task is a discrete set of functionalities executed as a single step in a DTS package. Each task defines a work item to be performed as part of the data movement and data transformation process or as a job to be executed. 
Data Transformation Services supplies a number of tasks that are part of the DTS object model and that can be accessed graphically through the DTS Designer or programmatically. These tasks, which can be configured individually, cover a wide variety of data copying, data transformation and notification situations. For example, DTS tasks can execute a single SQL statement, send an email, or transfer a file with FTP. A step within a DTS package describes the order in which tasks are run and the precedence constraints that describe what to do in the case of failure. These steps can be executed sequentially or in parallel. Packages can also contain global variables which can be used throughout the package. SQL Server 2000 allows input and output parameters for tasks, greatly expanding the usefulness of global variables. DTS packages can be edited, password protected, scheduled for execution, and retrieved by version. == DTS tools == DTS tools packaged with SQL Server include the DTS wizards, DTS Designer, and DTS Programming Interfaces. === DTS wizards === The DTS wizards can be used to perform simple or common DTS tasks. These include the Import/Export Wizard and the Copy Database Wizard. They provide the simplest method of copying data between OLE DB data sources. A great deal of functionality is not available through the wizards alone. However, a package created with a wizard can be saved and later altered with one of the other DTS tools. A Create Publishing Wizard is also available to schedule packages to run at certain times. This only works if SQL Server Agent is running; otherwise the package will be scheduled, but will not be executed. === DTS Designer === The DTS Designer is a graphical tool used to build complex DTS Packages with workflows and event-driven logic.
DTS Designer can also be used to edit and customize DTS Packages created with the DTS wizard. Each connection and task in DTS Designer is shown with a specific icon. These icons are joined with precedence constraints, which specify the order and requirements for tasks to be run. One task may run, for instance, only if another task succeeds (or fails). Other tasks may run concurrently. The DTS Designer has been criticized for having unusual quirks and limitations, such as the inability to visually copy and paste multiple tasks at one time. Many of these shortcomings have been overcome in SQL Server Integration Services, DTS's successor. === DTS Query Designer === A graphical tool used to build queries in DTS. === DTS Run Utility === DTS Packages can be run from the command line using the DTSRUN Utility. The utility is invoked using the following syntax: dtsrun /S server_name[\instance_name] { {/[~]U user_name [/[~]P password]} | /E } ] { {/[~]N package_name } | {/[~]G package_guid_string} | {/[~]V package_version_guid_string} } [/[~]M package_password] [/[~]F filename] [/[~]R repository_database_name] [/A global_variable_name:typeid=value] [/L log_file_name] [/W NT_event_log_completion_status] [/Z] [/!X] [/!D] [/!Y] [/!C] ] When passing in parameters which are mapped to Global Variables, you are required to include the typeid. This is rather difficult to find on the Microsoft site. Below are the TypeIds used in passing in these values.
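As a rough illustration of the /A switch described above, the following Python sketch assembles a dtsrun argument list for a trusted connection (/E). The helper name, the server and package names, and the variable names are hypothetical. The numeric type identifiers are an assumption: they are taken to be the standard COM VARIANT type codes (for example 8 for a string and 3 for a 4-byte integer), since the list itself is not reproduced in this excerpt.

```python
# Assumed type identifiers for the /A switch: standard COM VARIANT codes.
# They are not listed in the text above, so treat them as an assumption.
TYPE_IDS = {
    "string": 8,   # VT_BSTR
    "int32": 3,    # VT_I4
    "double": 5,   # VT_R8
    "bool": 11,    # VT_BOOL
}


def build_dtsrun_args(server, package, global_vars=None):
    """Build an argument list for dtsrun using a trusted connection (/E).

    global_vars maps a global variable name to a (type_name, value) pair.
    """
    args = ["dtsrun", "/S", server, "/E", "/N", package]
    for name, (type_name, value) in (global_vars or {}).items():
        # Each global variable is passed as name:typeid=value.
        args += ["/A", f"{name}:{TYPE_IDS[type_name]}={value}"]
    return args


example_args = build_dtsrun_args(
    "MYSERVER",
    "NightlyLoad",
    {"BatchDate": ("string", "2004-01-31"), "MaxRows": ("int32", 500)},
)
print(" ".join(example_args))
```

On a machine with the SQL Server 2000 client tools installed, a list like this could be handed to `subprocess.run`; the sketch only builds the arguments and does not execute anything.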

  • Social television

    Social television

    Social television is the union of television and social media. Millions of people now share their TV experience with other viewers on social media such as Twitter and Facebook using smartphones and tablets. TV networks and rights holders are increasingly sharing video clips on social platforms to monetise engagement and drive tune-in. The social TV market covers the technologies that support communication and social interaction around TV as well as companies that study television-related social behavior and measure social media activities tied to specific TV broadcasts – many of which have attracted significant investment from established media and technology companies. The market is also seeing numerous tie-ups between broadcasters and social networking players such as Twitter and Facebook. The market is expected to be worth $256bn by 2017. Social TV was named one of the 10 most important emerging technologies by MIT Technology Review in 2010. In 2011, David Rowan, the editor of Wired magazine, ranked social TV third of six in his preview of the technology trends expected to gain traction that year. Ynon Kreiz, CEO of the Endemol Group, told the audience at the Digital Life Design (DLD) conference in January 2011: "Everyone says that social television will be big. I think it's not going to be big; it's going to be huge". Much of the investment in the earlier years of social TV went into standalone social TV apps. The industry believed these apps would provide an appealing and complementary consumer experience which could then be monetized with ads. These apps featured TV listings, check-ins, stickers and synchronised second-screen content but struggled to attract users away from Twitter and Facebook. 
Most of these companies have since gone out of business or been acquired amid a wave of consolidation and the market has instead focused on the activities of the social media channels themselves – such as Twitter Amplify, Facebook Suggested Videos and Snapchat Discover – and the technologies that support them. == Twitter == Twitter and Facebook are both helping users connect around media, which can provoke strong debate and engagement. Both social platforms want to be the 'digital watercooler' and host conversation around TV because the engagement and data about what media people consume can then be used to generate advertising revenue. As an open platform, conversation on Twitter is closely aligned with real-time events. In May 2013, it launched Twitter Amplify – an advertising product for media and consumer brands. With Amplify, Twitter runs video highlights from major live broadcasts, with advertisers' names and messages playing before the clip. By February 2014, all four major U.S. TV networks had signed up to the Amplify program, bringing a variety of premium TV content onto the social platform in the form of in-tweet real-time video clips. In June 2014, Twitter acquired its Twitter Amplify partner in the U.S. SnappyTV, a company that was helping broadcasters and rights holders to share video content both organically across social and via Twitter's Amplify program. Twitter continues to rely on Grabyo, which has also struck numerous deals with some of the largest broadcasters and rights holders in Europe and North America to share video content across Facebook and Twitter. == Facebook == Facebook made significant changes to its platform in 2014 including updates to its algorithm to enhance how it serves video in users' feeds. It also launched video autoplay to get users to watch the videos in their feeds. 
It rapidly surpassed Twitter and by the end of 2014 it was enjoying three billion video views a day on its platform and had announced a partnership with the NFL, one of Twitter's most active Twitter Amplify partners. In April 2015, at its F8 Developer Conference, it revealed it was working with Grabyo among other technology partners to bring video onto its platform. Then in July it announced it would be launching Facebook Suggested Videos, bringing related videos and ads to anyone who clicks on a video – a move that not only competed with Twitter's commercial video offering but also put it in direct competition with YouTube. == TV Time == TV Time is a television-dedicated social network that allows users to keep track of the television series they watch, as well as films. It also allows them to express their reactions to the media they have seen, with episode-specific voting for favorite characters and emotional reactions to episodes, as well as commenting on episode-specific pages. This way users are able to avoid spoilers while also finding a precise audience and community for each of their interactions, as opposed to bigger, non-television-dedicated social media platforms such as Facebook and Twitter, where the likelihood of unintentionally reading spoilers is much higher. TV Time offers an analytics service called "TVLytics" where the votes and reactions collected from users can be studied for research and television production purposes. == Advertising == According to Business Insider, there are a variety of applications for social TV, including support for TV ad sales, optimizing TV ad buys, making ad buys more efficient, as a complement to audience measurement, and eventually, audience forecasting and real-time optimization. Social TV data can ease access to focus groups and may create a positive feedback loop for generating ultra-sticky TV programming and multi-screen ad campaigns. 
== In numbers == Viewers share their TV experience on social media in real time as events unfold: between 88 million and 100 million Facebook users log in to the platform during the primetime hours of 8pm – 11pm in the US. The volume of social media engagement in TV is also rising – according to Nielsen SocialGuide, there was a 38% increase in tweets about TV in 2013 to 263m. For the 2014 Super Bowl, Twitter reported that a record 24.9 million tweets about the game were sent during the telecast, peaking at 381,605 tweets per minute. Facebook reported that 50 million people discussed the Super Bowl, generating 185 million interactions. The 2014 Oscars generated 5m tweets, viewed by an audience of 37m unique Twitter users and delivering 3.3bn impressions globally as conversation and key moments were shared virally across the platform. In 2014 the All England Lawn Tennis Club (AELTC), hosts of Wimbledon, used Grabyo to share video content across social. The videos were viewed 3.5 million times across Facebook and Twitter. It partnered with Grabyo again in 2015 and the videos generated over 48 million views across Facebook and Twitter. == Television shows with social integration == Here are some examples of how TV executives are integrating social elements with TV shows: C-SPAN streamed tweets from US Senators and Representatives during the quorum call. The Voice had its judges tweet during the show, with the posts scrolling along the bottom of the screen; the use of Twitter also led to an increase in viewers. For Glee, Entertainment Weekly created a second-screen viewing platform for the season 3 premiere. == Related publications == Erika Jonietz. "Making TV Social, Virtually" MIT Technology Review. (January 11, 2010) AmigoTV (Alcatel-Lucent; Coppens et al.) – 2004 www.ist-ipmedianet.org/Alcatel_EuroiTV2004_AmigoTV_short_paper_S4-2.pdf Nextream (MIT Media Lab, Martin et al.) – 2010 Social Interactive Television: Immersive Shared Experiences and Perspectives (P. Cesar, D. 
Geerts, and K. Chorianopoulos (eds.)) – 2009 Social TV and the Emergence of Interactive TV – Multimedia Research Group – November 2010 Interactive Social TV on Service Oriented Environments: Challenges and Enablers (May 2011) == Systems == Boxee – acquired by Samsung GetGlue – acquired by i.TV Grabyo KIT digital Miso TV Tank Top TV WiO Xbox Live

  • Reverse proxy

    Reverse proxy

    In computer networks, a reverse proxy or surrogate server is a proxy server that appears to any client to be an ordinary web server, but in reality merely acts as an intermediary that forwards the client's requests to one or more ordinary web servers. Reverse proxies help increase scalability, performance, resilience, and security, but they also carry a number of risks. Companies that run web servers often set up reverse proxies to facilitate the communication between an Internet user's browser and the web servers. An important advantage of doing so is that the web servers can be hidden behind a firewall on a company-internal network, and only the reverse proxy needs to be directly exposed to the Internet. Reverse proxy servers are implemented in popular open-source web servers. Dedicated reverse proxy servers are used by some of the biggest websites on the Internet. A reverse proxy is capable of tracking IP addresses of requests that are relayed through it as well as reading and/or modifying any non-encrypted traffic. However, this implies that anyone who has compromised the server could do so as well. Reverse proxies differ from forward proxies, which are used when the client is restricted to a private, internal network and asks a forward proxy to retrieve resources from the public Internet. == Uses == Large websites and content delivery networks use reverse proxies, together with other techniques, to balance the load between internal servers. Reverse proxies can keep a cache of static content, which further reduces the load on these internal servers and the internal network. It is also common for reverse proxies to add features such as compression or TLS encryption to the communication channel between the client and the reverse proxy. Reverse proxies can inspect HTTP headers, which, for example, allows them to present a single IP address to the Internet while relaying requests to different internal servers based on the URL of the HTTP request. 
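The URL-based relaying and load balancing described above can be sketched in a few lines of Python. This is only an illustration of the routing decision, not a working proxy: the `PathRouter` class, the path prefixes and the backend addresses are all hypothetical, and round-robin is just one of several possible balancing strategies.

```python
from itertools import cycle


class PathRouter:
    """Pick an internal backend for a request path (illustrative only)."""

    def __init__(self, routes):
        # routes maps a URL path prefix to a list of backend addresses.
        # Longer prefixes are checked first so "/api/" wins over "/".
        self._routes = sorted(
            ((prefix, cycle(backends)) for prefix, backends in routes.items()),
            key=lambda item: len(item[0]),
            reverse=True,
        )

    def pick_backend(self, path):
        for prefix, backends in self._routes:
            if path.startswith(prefix):
                # Round-robin across the backends registered for this prefix.
                return next(backends)
        return None  # no route configured for this path


router = PathRouter({
    "/": ["10.0.0.10:8080"],
    "/api/": ["10.0.0.21:9000", "10.0.0.22:9000"],
})
print(router.pick_backend("/api/users"))   # first API backend
print(router.pick_backend("/api/orders"))  # second API backend
print(router.pick_backend("/index.html"))  # default backend
```

Clients only ever see the proxy's public address; the private addresses in the routing table stay inside the internal network.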
Reverse proxies can hide the existence and characteristics of origin servers. This can make it more difficult to determine the actual location of the origin server / website and, for instance, more challenging to initiate legal action such as takedowns or block access to the website, as the IP address of the website may not be immediately apparent. Additionally, the reverse proxy may be located in a different jurisdiction with different legal requirements, further complicating the takedown process. Application firewall features can protect against common web-based attacks, like a denial-of-service attack (DoS) or distributed denial-of-service attacks (DDoS). Without a reverse proxy, removing malware or initiating takedowns (while simultaneously dealing with the attack) on one's own site, for example, can be difficult. In the case of secure websites, a web server may not perform TLS encryption itself, but instead offload the task to a reverse proxy that may be equipped with TLS acceleration hardware. (See TLS termination proxy.) A reverse proxy can distribute the load from incoming requests to several servers, with each server supporting its own application area. In the case of reverse proxying web servers, the reverse proxy may have to rewrite the URL in each incoming request in order to match the relevant internal location of the requested resource. A reverse proxy can reduce load on its origin servers by caching static content and dynamic content, known as web acceleration. Proxy caches of this sort can often satisfy a considerable number of website requests, greatly reducing the load on the origin server(s). A reverse proxy can optimize content by compressing it in order to speed up loading times. In a technique named "spoon-feeding", a dynamically generated page can be produced in its entirety and served to the reverse proxy, which can feed the page to the client as the connection allows. 
The program that generates the page need not remain open, thus releasing server resources during the possibly extended time the client requires to complete the transfer. Reverse proxies can operate wherever multiple web-servers must be accessible via a single public IP address. The web servers listen on different ports in the same machine, with the same local IP address or, possibly, on different machines with different local IP addresses. The reverse proxy analyzes each incoming request and delivers it to the right server within the local area network. Reverse proxies can perform A/B testing and multivariate testing without requiring application code to handle the logic of which version is served to a client. A reverse proxy can add access authentication to a web server that does not have any authentication. == Risks == When the transit traffic is encrypted and the reverse proxy needs to filter/cache/compress or otherwise modify or improve the traffic, the proxy first must decrypt and re-encrypt communications. This requires the proxy to possess the TLS certificate and its corresponding private key, extending the number of systems that can have access to non-encrypted data and making it a more valuable target for attackers. The vast majority of external data breaches happen either when hackers succeed in abusing an existing reverse proxy that was intentionally deployed by an organization, or when hackers succeed in converting an existing Internet-facing server into a reverse proxy server. Compromised or converted systems allow external attackers to specify where they want their attacks proxied to, enabling their access to internal networks and systems. Applications that were developed for the internal use of a company are not typically hardened to public standards and are not necessarily designed to withstand all hacking attempts. 
When an organization allows external access to such internal applications via a reverse proxy, they might unintentionally increase their own attack surface and invite hackers. If a reverse proxy is not configured to filter attacks or it does not receive daily updates to keep its attack signature database up to date, a zero-day vulnerability can pass through unfiltered, enabling attackers to gain control of the system(s) that are behind the reverse proxy server. Giving the reverse proxy of a third party access to private keys (for caching or optimizing content) places the entire triad of confidentiality, integrity and availability in the hands of the third party who operates the proxy. A reverse proxy is a single point of failure for the back-end services it fronts: an outage caused by misconfiguration, a denial-of-service attack, or a software fault can make every fronted service unreachable to outside clients, even when the back-end services themselves remain healthy. For example, a 2020 outage at Cloudflare briefly took down major sites and services that relied on its reverse-proxy edge, including Discord.

  • Language-Theoretic Security

    Language-Theoretic Security

    Language-theoretic security, or LangSec, is an approach to software security that focuses on input handling, complexity, and program design as strategies to improve the verifiability of computer programs. It was introduced in 2005 by Robert J. Hansen and Meredith L. Patterson at BlackHat and in 2011 by Len Sassaman and Patterson. It aims to create a formal description of which software is likely to have security vulnerabilities of particular classes, and why. It considers programs to have an inherent parser component, whether or not explicit, composed of that part of the program which operates on external input before that input is fully parsed. A central hypothesis of language-theoretic security is that vulnerabilities in software increase according to the computational power of the notional input-accepting automaton equivalent to this parser, using the definitions of automata theory. The lower bound on this computational power is the input language complexity of the program. The extent to which reducing this complexity is possible is a function of the specification of the communication protocol or file format the program takes as input. == Parsing as a security mechanism == The behaviour of a program is defined with reference to its expected input. Unexpected input being used by a program is a factor in numerous security bugs, including the so-called Android master key vulnerability (CVE-2013-4787), because accepting unexpected input renders the program's specification ambiguous. In that instance, the unexpected ambiguity came in the form of a ZIP file with duplicate filenames. If a program fully parses its input and only acts on input that unambiguously meets the specification, it follows that the program will avoid these types of vulnerabilities. This is an intentional inversion of the Postel principle. 
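The idea of acting only on input that unambiguously meets the specification can be illustrated with a small Python sketch. As an assumption made for brevity, it uses JSON objects with duplicate keys as a stand-in for the duplicate-filename ZIP case: Python's default JSON parser silently keeps the last value, while another consumer might keep the first, which is exactly the kind of ambiguity a strict recognizer rejects. The names `AmbiguousInput` and `parse_strict` are hypothetical.

```python
import json


class AmbiguousInput(ValueError):
    """Raised when the input admits more than one interpretation."""


def _reject_duplicates(pairs):
    # json calls this hook with the (key, value) pairs of every object,
    # in document order, before building the resulting dict.
    result = {}
    for key, value in pairs:
        if key in result:
            raise AmbiguousInput(f"duplicate key: {key!r}")
        result[key] = value
    return result


def parse_strict(text):
    """Parse JSON fully, refusing objects that repeat a key."""
    return json.loads(text, object_pairs_hook=_reject_duplicates)


# The lenient default quietly picks one of the two values.
lenient = json.loads('{"name": "a.txt", "name": "b.txt"}')
print(lenient)

# The strict recognizer refuses to guess.
try:
    parse_strict('{"name": "a.txt", "name": "b.txt"}')
except AmbiguousInput as exc:
    print("rejected:", exc)
```

The rejection happens during parsing, before any application logic sees the data, so no later code has to decide which of the two values was meant.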
Accepting only unambiguous and valid input is a more formal requirement than input validation or sanitization, and narrows the number of possible but unanticipated program states that can be induced in an application via user input. Conversely, failure to do this is associated with security vulnerabilities. Input sanitization in particular is held to be an inadequate approach to avoiding malicious input because it inherently ignores context-sensitive properties of the input; it can therefore result in paradoxical effects, such as sanitization code activating otherwise inert cross-site scripting payloads in browsers. === Parser differentials === If the language of accepted program input is sufficiently simple, it is possible to verify that two implementations parse the same input language consistently. This is advantageous because it shows no parser differential exists between the two implementations. The requisite level of simplicity is theoretically that for which there is a solution to the equivalence problem. If the two parsers involved in CVE-2013-4787 were equivalent - that is, if they rendered the same output state given the same input state - the vulnerability could not have existed. One strategy for doing this is to publish machine-readable specifications of a format or protocol, and then use a parser generator to generate the parser code. An example of a parser generator built for this purpose is DaeDaLus. The combination of Lex with any of GNU Bison, ANTLR, or Yacc also accomplishes this. However, many parser generators allow the mixing of general purpose code with the parsing definitions, which weakens the guarantees provided by parsing. === Analysis of injection attacks === Injection attacks are generally the result of differences between the serializer (or "unparser") and the corresponding parser at a layer boundary in a system; therefore, they are a special case of parser differentials. 
In a SQL injection attack, for example, an attacker is able to cause the application with which they are interacting to serialize a SQL query that has different semantics than intended. In the simplest case where the payload ends a string and adds new code, the payload has crossed the code-data boundary in SQL. In language-theoretic security, this is treated as a bug in the serializer of the SQL query, which should instead be written in a way that constrains its possible outputs to those within the scope of the intended query. === Parser combinators === If a parser generator is not used, it is still possible to avoid implementation bugs by using a parser combinator library such as Nom to implement the parser code. This has the drawback of relying on a programmer correctly translating the specification into the language of the parser combinator library, though this task is still less error-prone than hand-coding a parser. == Input format complexity == Complexity in computer programs is associated with security vulnerabilities. Within the domain of language-theoretic security, complexity is described with reference to the computational power of the abstract machine necessary to implement the program, or more particularly, to implement the parser for its input language. This complexity describes whether it is possible to show that there is no unintended or undesired functionality in the program which might be exploitable by an attacker. To be bounded in complexity, the program's input must be well-defined both in terms of form and of semantics. === Weird machines === A weird machine is a model of computation in a program that exists in parallel with, but is distinct from, the intended abstract model of computation in that program. Some classes of weird machine arise from the multi-layered nature of computer programs, or the context in which the programs run; others result from the unanticipated functionality a program has due to its complexity or to software bugs. 
The more complex the computation model of a program, the more likely it is to implement a weird machine. Depending on context, the weird machine may or may not be concretely useful for an attacker. Since the space of weird machines in the context of some program is the universe of all possible states that are not within the program's intended states, many exploited states including remote code execution and injection attacks belong to the domain of weird machines. A reduction in weird machines is therefore a likely correlate with reduced program vulnerability. === SafeDocs project === SafeDocs is a DARPA project undertaken in 2018 to take existing file formats, create safer subsets of them, and develop programming tools to work for the safer formats. The initial test case for this was PDF. The purpose of creating safer subsets in this case is to lower the minimum bound on parser complexity so that it becomes possible to create tools that will generate correct, normative parsers for them. == Relation to programming languages == The analytic framework of language-theoretic security assumes programs to be virtual machines that execute their input. A document that is read by an application is in this sense a form of machine code, in a generalization of the data as code idea, following the automata theory description of parsers. === Type-safe programming languages === Parsing input and serializing output are operations that consume one data type and emit another. A programming language can therefore check that data is correctly parsed and contains the expected structure by checking data types, and correct serializing (or unparsing) can be implemented as operations on the data types that are relevant to the program's output. This approach can be used to show that the recognizer and unparser patterns have been implemented. 
It is also possible to implement type checking across a distributed system to enforce parsing and unparsing of the expected structures and to verify that the assumptions made in designing the compositional properties of a distributed system have been followed. === Memory-safe programming languages === In the general case, spatial memory correctness is undecidable. If any proof of spatial memory correctness is to be made, it is therefore necessary to bound the complexity of the code. Languages with managed runtimes, such as Java and Python, effectively accomplish this via runtime bounds checking, and frameworks for runtime bounds checking also exist for C. The effect of these strategies is to create a halt state in place of a spatial memory correctness violation; therefore, it can be shown that the program will not violate spatial memory correctness, but in exchange, it cannot be shown in the general case that programs will not have runtime bounds checking exceptions. Some programming languages, such as Rust, accomplish this using borrow checking. The borrow checker acts to assure spatial memory correctness by tracking the ownership and lifetimes of references at compile time. Code that cannot be shown to preserve spatial memory correctness therefore does not compile, inherently limiting the complexity of the spatial memory correctness of the program to what is decidable.
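Returning to the analysis of injection attacks earlier in this article, the following Python sketch contrasts a serializer that lets data cross the code-data boundary with one whose output is constrained to the intended query shape. It assumes an in-memory SQLite database and a hypothetical `users` table; the function names are illustrative, and a parameterized query is used here as one common way to constrain the serializer, not as the only one.

```python
import sqlite3


def find_user_unsafe(conn, name):
    # The query text is built by string concatenation, so a quote character
    # in `name` can end the string literal and add new SQL code.
    query = f"SELECT id FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # The query shape is fixed; `name` is bound as data and can never be
    # interpreted as SQL syntax.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, payload)  # condition becomes always true
safe_rows = find_user_safe(conn, payload)      # payload is compared as a plain string

print(len(unsafe_rows), len(safe_rows))
```

In the unsafe version the serializer and the SQL parser disagree about where the string ends, which is the parser differential the article describes; in the safe version no input can change the structure of the query.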

  • Dashboard (computing)

    Dashboard (computing)

    In computer information systems, a dashboard is a type of graphical user interface which often provides at-a-glance views of data relevant to a particular objective or process through a combination of visualizations and summary information. In other usage, "dashboard" is another name for "progress report" or "report" and is considered a form of data visualization. The dashboard is often accessible by a web browser and is typically linked to regularly updating data sources. Dashboards are often interactive and let users explore the data themselves, usually by clicking into elements to view more detailed information. The term dashboard originates from the automobile dashboard where drivers monitor the major functions at a glance via the instrument panel. == History == The idea of digital dashboards followed the study of decision support systems in the 1970s. Early predecessors of the modern business dashboard were first developed in the 1980s in the form of Executive Information Systems (EISs). Due to problems primarily with data refreshing and handling, it was soon realized that the approach was not practical, as information was often incomplete, unreliable, and spread across too many disparate sources. EISs therefore lay dormant until the 1990s, when the information age quickened its pace and data warehousing and online analytical processing (OLAP) allowed dashboards to function adequately. Despite the availability of enabling technologies, dashboard use did not become popular until later in that decade, with the rise of key performance indicators (KPIs) and the introduction of Robert S. Kaplan and David P. Norton's balanced scorecard. In the late 1990s, Microsoft promoted a concept known as the Digital Nervous System and "digital dashboards" were described as being one leg of that concept. Today, the use of dashboards forms an important part of Business Performance Management (BPM). 
Initially, dashboards were used for monitoring purposes; with the advancement of technology, they are now also used for more analytical purposes. Dashboards now incorporate scenario analysis, drill-down capabilities, and presentation format flexibility. == Benefits == Digital dashboards allow managers to monitor the contribution of the various departments in their organization. In addition, they enable "rolling up" of information to present a consolidated view across an organization. To gauge exactly how well an organization is performing overall, digital dashboards allow you to capture and report specific data points from each department within the organization, thus providing a "snapshot" of performance. Benefits of using digital dashboards include: visual presentation of performance measures; the ability to identify and correct negative trends; measurement of efficiencies and inefficiencies; the ability to generate detailed reports showing new trends; and the ability to make more informed decisions based on collected business intelligence. Dashboards offer a holistic view of the entire business, giving the manager a bird's-eye view of the performance of sales, data inventory, web traffic, social media analytics and other associated data, visually presented on a single dashboard. Dashboards lead to better management of marketing and financial strategies, as a dashboard that displays marketing data makes the marketing process easier and more reliable than doing it manually. Web analytics play a crucial role in shaping the marketing strategy of many businesses. Dashboards also facilitate better tracking of sales and financial reporting, as the data is more precise and gathered in one place. Lastly, dashboards support better customer service through monitoring, because they keep both managers and clients updated on project progress through automated emails and notifications. 
== Align strategies and organizational goals == Gain total visibility of all systems instantly; quick identification of data outliers and correlations; consolidated reporting into one location; available on mobile devices to quickly access metrics. == Classification == Dashboards can be broken down according to role and are either strategic, analytical, operational, or informational. Dashboards are the third step on the information ladder, demonstrating the conversion of data to increasingly valuable insights. Strategic dashboards support managers at any level in an organization and provide the quick overview that decision-makers need to monitor the health and opportunities of the business. Dashboards of this type focus on high-level measures of performance and forecasts. Strategic dashboards benefit from static snapshots of data (daily, weekly, monthly, and quarterly) that are not constantly changing from one moment to the next. Dashboards for analytical purposes often include more context, comparisons, and history, along with subtler performance evaluators. In addition, analytical dashboards typically support interactions with the data, such as drilling down into the underlying details. Dashboards for monitoring operations are often designed differently from those that support strategic decision making or data analysis and often require monitoring of activities and events that are constantly changing and might require attention and response at a moment's notice. == Types of dashboards == Digital dashboards may be laid out to track the flows inherent in the business processes that they monitor. Graphically, users may see the high-level processes and then drill down into low-level data. This level of detail is often buried deep within the corporate enterprise and otherwise unavailable to the senior executives. 
Three main types of digital dashboards dominate the market today: desktop software applications, web-browser-based applications, and desktop applications also known as desktop widgets. The last are driven by a widget engine. Both desktop and browser-based providers enable the distribution of dashboards via a web browser. An example of the latter is the web-based Asana, which helps teams orchestrate their work, from daily tasks to strategic cross-functional initiatives. With it, teams can manage everything from company objectives to digital transformation to product launches and marketing campaigns. Specialized dashboards may track all corporate functions. Examples include human resources, recruiting, sales, operations, security, information technology, project management, customer relationship management, digital marketing and many more departmental dashboards. For a smaller organization such as a startup, a compact startup scorecard dashboard tracks important activities across many domains, ranging from social media to sales. Digital dashboard projects involve business units as the driver and the information technology department as the enabler. Therefore, the success of dashboard projects depends on the relevance and importance of the information provided within the dashboard. This includes the metrics chosen to monitor and the timeliness of the data forming those metrics; data must be up to date and accurate. Key performance indicators, balanced scorecards, and sales performance figures are some of the content appropriate on business dashboards. === Performance Dashboards === Dashboards involve the combination of visual and functional features. This combination of features helps improve cognition and interpretation. A performance dashboard sits at the intersection of two powerful disciplines: business intelligence and performance management. Therefore, there are different users who could use these dashboards for different reasons. 
For example, workers at one level could use a dashboard to monitor inventory, while those in more managerial roles can look at lagging measures. Executives could then use the dashboard to evaluate strategic performance against objectives. == Dashboards and scorecards == Balanced scorecards and dashboards have been linked together as if they were interchangeable. However, although both visually display critical information, the difference is in the format: scorecards can reveal the quality of an operation while dashboards provide calculated direction. A balanced scorecard has what is called a "prescriptive" format. It should always contain these components: Perspectives – group; Objectives – verb-noun phrases pulled from a strategy plan; Measures – also called metrics or key performance indicators (KPIs); Spotlight indicators – red, yellow, or green symbols that provide an at-a-glance view of a measure's performance. Each of these sections ensures that a balanced scorecard is essentially connected to the business's critical strategic needs. The design of a dashboard is more loosely defined. Dashboards are usually a series of graphics, charts, gauges and other visual indicators that can be monitored and interpreted. Even when there is a strategic link, on a dashboard, it may not be noticed as such since objectives are not normally pre

  • Semi-Automatic Ground Environment

    Semi-Automatic Ground Environment

    The Semi-Automatic Ground Environment (SAGE) was a system of large computers and associated networking equipment that coordinated data from many radar sites and processed it to produce a single unified image of the airspace over a wide area. SAGE directed and controlled the NORAD response to a possible Soviet air attack, operating in this role from the late 1950s into the 1980s. The processing power behind SAGE was supplied by the largest discrete component-based computer ever built, the AN/FSQ-7, manufactured by IBM. Each SAGE Direction Center (DC) housed an FSQ-7 which occupied an entire floor, approximately 22,000 square feet (2,000 m²) not including supporting equipment. The FSQ-7 was actually two computers, "A" side and "B" side. Computer processing was switched from "A" side to "B" side on a regular basis, allowing maintenance on the unused side. Information was fed to the DCs from a network of radar stations, along with readiness information from various defense sites. The computers, based on the raw radar data, developed "tracks" for the reported targets, and automatically calculated which defenses were within range. Operators used light guns to select targets on-screen for further information, select one of the available defenses, and issue commands to attack. These commands would then be automatically sent to the defense site via teleprinter. Connecting the various sites was an enormous network of telephones, modems and teleprinters. Later additions to the system allowed SAGE's tracking data to be sent directly to CIM-10 Bomarc missiles and some of the US Air Force's interceptor aircraft in-flight, directly updating their autopilots to maintain an intercept course without operator intervention. Each DC also forwarded data to a Combat Center (CC) for "supervision of the several sectors within the division" ("each combat center [had] the capability to coordinate defense for the whole nation"). 
SAGE became operational in the late 1950s and early 1960s at an estimated total cost between 8 and 12 billion dollars, four times the cost of the Manhattan Project. Throughout its development, there were continual concerns about its real ability to deal with large attacks, and the Operation Sky Shield tests showed that only about one-fourth of enemy bombers would have been intercepted. Nevertheless, SAGE was the backbone of NORAD's air defense system into the 1980s, by which time the tube-based FSQ-7s were increasingly costly to maintain and completely outdated. Today the same command and control task is carried out by microcomputers, based on the same basic underlying data. == Background == === Earlier systems === Just prior to World War II, Royal Air Force (RAF) tests with the new Chain Home (CH) radars had demonstrated that relaying information to the fighter aircraft directly from the radar sites was not feasible. The radars determined the map coordinates of the enemy, but could generally not see the fighters at the same time. This meant the fighters had to be able to determine where to fly to perform an interception but were often unaware of their own exact location and unable to calculate an interception while also flying their aircraft. The solution was to send all of the radar information to a central control station where operators collated the reports into single tracks, and then reported these tracks to the airbases, or sectors. The sectors used additional systems to track their own aircraft, plotting both on a single large map. Operators viewing the map could then see what direction their fighters would have to fly to approach their targets and relay that simply by telling them to fly along a certain heading or vector. This Dowding system was the first ground-controlled interception (GCI) system of large scale, covering the entirety of the UK. 
It proved enormously successful during the Battle of Britain, and is credited as being a key part of the RAF's success. The system was slow, often providing information that was up to five minutes out of date. Against propeller driven bombers flying at perhaps 225 miles per hour (362 km/h) this was not a serious concern, but it was clear the system would be of little use against jet-powered bombers flying at perhaps 600 miles per hour (970 km/h). The system was extremely expensive in manpower terms, requiring hundreds of telephone operators, plotters and trackers in addition to the radar operators. This was a serious drain on manpower, making it difficult to expand the network. The idea of using a computer to handle the task of taking reports and developing tracks had been explored beginning late in the war. By 1944, analog computers had been installed at the CH stations to automatically convert radar readings into map locations, eliminating two people. Meanwhile, the Royal Navy began experimenting with the Comprehensive Display System (CDS), another analog computer that took X and Y locations from a map and automatically generated tracks from repeated inputs. Similar systems began development with the Royal Canadian Navy, DATAR, and the US Navy, the Naval Tactical Data System (NTDS). A similar system was also specified for the Nike SAM project, specifically referring to a US version of CDS, coordinating the defense over a battle area so that multiple batteries did not fire on a single target. All of these systems were relatively small in geographic scale, generally tracking within a city-sized area. === Valley Committee === When the Soviet Union tested its first atomic bomb in August 1949, the topic of air defense of the US became important for the first time. A study group, the "Air Defense Systems Engineering Committee", was set up under the direction of Dr. George Valley to consider the problem and is known to history as the "Valley Committee". 
Their December report noted a key problem in air defense using ground-based radars. A bomber approaching a radar station would detect the signals from the radar long before the reflection off the bomber was strong enough to be detected by the station. The committee suggested that when this occurred, the bomber would descend to low altitude, thereby greatly limiting the radar horizon, allowing the bomber to fly past the station undetected. Although flying at low altitude greatly increased fuel consumption, the team calculated that the bomber would only need to do this for about 10% of its flight, making the fuel penalty acceptable. The only solution to this problem was to build a huge number of stations with overlapping coverage. At that point the problem became one of managing the information. Manual plotting was ruled out as too slow, and a computerized solution was the only possibility. To handle this task, the computer would need to be fed information directly, eliminating any manual translation by phone operators, and it would have to be able to analyze that information and automatically develop tracks. A system tasked with defending cities against the predicted future Soviet bomber fleet would have to be dramatically more powerful than the models used in the NTDS or DATAR. The Committee then had to consider whether or not such a computer was possible. The Valley Committee was introduced to Jerome Wiesner, associate director of the Research Laboratory of Electronics at MIT. Wiesner noted that the Servomechanisms Laboratory had already begun development of a machine that might be fast enough. This was the Whirlwind I, originally developed for the Office of Naval Research as a general purpose flight simulator that could simulate any current or future aircraft by changing its software. Wiesner introduced the Valley Committee to Whirlwind's project lead, Jay Forrester, who convinced him that Whirlwind was sufficiently capable. 
In September 1950, an early microwave early-warning radar system at Hanscom Field was connected to Whirlwind using a custom interface developed by Forrester's team. An aircraft was flown past the site, and the system digitized the radar information and successfully sent it to Whirlwind. With this demonstration, the technical concept was proven. Forrester was invited to join the committee. === Project Charles === With this successful demonstration, Louis Ridenour, chief scientist of the Air Force, wrote a memo stating "It is now apparent that the experimental work necessary to develop, test, and evaluate the systems proposals made by ADSEC will require a substantial amount of laboratory and field effort." Ridenour approached MIT President James Killian with the aim of beginning a development lab similar to the war-era Radiation Laboratory that made enormous progress in radar technology. Killian was initially uninterested, desiring to return the school to its peacetime civilian charter. Ridenour eventually convinced Killian the idea was sound by describing the way the lab would lead to the development of a local electronics industry based on the needs of the lab and the students who would leave the lab to start their

  • Sentiment analysis

    Sentiment analysis

    Sentiment analysis (also known as opinion mining) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models, such as RoBERTa, more difficult data domains can be analyzed, e.g., news texts where authors typically express their opinion/sentiment less explicitly. == Types == A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level, that is, deciding whether the opinion expressed in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as enjoyment, anger, disgust, sadness, fear, and surprise. Precursors to sentiment analysis include the General Inquirer, which provided hints toward quantifying patterns in text and, separately, psychological research that examined a person's psychological state based on analysis of their verbal behavior. Subsequently, the method described in a patent by Volcani and Fogel looked specifically at sentiment and identified individual words and phrases in text with respect to different emotional scales. A current system based on their work, called EffectCheck, presents synonyms that can be used to increase or decrease the level of evoked emotion in each scale. Many other subsequent efforts were less sophisticated, using a mere polar view of sentiment, from positive to negative, such as work by Turney and by Pang, who applied different methods for detecting the polarity of product reviews and movie reviews respectively. 
This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder among others: Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predict star ratings on either a 3- or a 4-star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). First steps to bringing together various approaches (learning, lexical, knowledge-based, etc.) were taken in the 2004 AAAI Spring Symposium, where linguists, computer scientists, and other interested researchers first aligned interests and proposed shared tasks and benchmark data sets for the systematic computational research on affect, appeal, subjectivity, and sentiment in text. Even though the neutral class is ignored in most statistical classification methods, under the assumption that neutral texts lie near the boundary of the binary classifier, several researchers suggest that, as in every polarity problem, three categories must be identified. Moreover, it can be proven that specific classifiers such as Max Entropy and SVMs can benefit from the introduction of a neutral class and improve the overall accuracy of the classification. There are in principle two ways of operating with a neutral class. Either the algorithm proceeds by first identifying the neutral language, filtering it out and then assessing the rest in terms of positive and negative sentiments, or it builds a three-way classification in one step. This second approach often involves estimating a probability distribution over all categories (e.g. naive Bayes classifiers as implemented by the NLTK). 
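The one-step, three-way approach can be illustrated with a small sketch. The code below is a hand-rolled multinomial naive Bayes classifier with add-one smoothing, written only for illustration; it is not NLTK's implementation, and the function names `train` and `predict_proba` as well as the toy training sentences are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count classes and per-class word frequencies from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_proba(model, text):
    """Return a probability distribution over all classes, neutral included."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in class_counts:
        # Add-one (Laplace) smoothing so unseen words do not zero out a class.
        denom = sum(word_counts[label].values()) + len(vocab)
        log_p = math.log(class_counts[label] / total_docs)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                log_p += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = log_p
    # Normalise in a numerically stable way (subtract the maximum first).
    peak = max(log_scores.values())
    exp_scores = {k: math.exp(v - peak) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


# Toy training data, invented for this sketch.
TOY_EXAMPLES = [
    ("great food", "positive"),
    ("awful service", "negative"),
    ("the menu is printed", "neutral"),
]
```

Calling `predict_proba(train(TOY_EXAMPLES), "great")` yields one probability per class, so the neutral class competes directly with the positive and negative classes instead of being filtered out in a separate first step.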
Whether and how to use a neutral class depends on the nature of the data: if the data is clearly clustered into neutral, negative and positive language, it makes sense to filter the neutral language out and focus on the polarity between positive and negative sentiments. If, in contrast, the data are mostly neutral with small deviations towards positive and negative affect, this strategy would make it harder to clearly distinguish between the two poles. A different method for determining sentiment is the use of a scaling system whereby words commonly associated with having a negative, neutral, or positive sentiment are given an associated number on a −10 to +10 scale (most negative up to most positive) or simply from 0 to a positive upper limit such as +4. This makes it possible to adjust the sentiment of a given term relative to its environment (usually on the level of the sentence). When a piece of unstructured text is analyzed using natural language processing, each concept in the specified environment is given a score based on the way sentiment words relate to the concept and its associated score. This allows movement to a more sophisticated understanding of sentiment, because it is now possible to adjust the sentiment value of a concept relative to modifications that may surround it. Words, for example, that intensify, relax or negate the sentiment expressed by the concept can affect its score. Alternatively, texts can be given a positive and negative sentiment strength score if the goal is to determine the sentiment in a text rather than the overall polarity and strength of the text. There are various other types of sentiment analysis, such as aspect-based sentiment analysis, grading sentiment analysis (positive, negative, neutral), multilingual sentiment analysis and detection of emotions. 
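The scaling approach described above can be sketched as follows. This is a minimal illustration, assuming a tiny hand-made lexicon on the −10 to +10 scale; the word lists, the multiplier values, and the function name `score_sentence` are all hypothetical and not taken from any published lexicon.

```python
# Hypothetical lexicon: word -> score on a -10..+10 scale (illustrative values).
LEXICON = {"good": 5, "great": 8, "bad": -5, "terrible": -9, "okay": 0}
# Hypothetical modifier lists; the multipliers are arbitrary choices.
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}
RELAXERS = {"slightly": 0.5, "somewhat": 0.7}
NEGATORS = {"not", "never", "no"}


def clamp(value, low=-10.0, high=10.0):
    """Keep a score inside the -10..+10 scale."""
    return max(low, min(high, value))


def score_sentence(sentence):
    """Return one adjusted score per sentiment word found in the sentence.

    Modifiers seen before a sentiment word are applied to that word and
    then reset, so each concept is scored relative to its own environment.
    """
    tokens = sentence.lower().replace(".", " ").replace(",", " ").split()
    scores = []
    multiplier = 1.0
    negate = False
    for token in tokens:
        if token in NEGATORS:
            negate = not negate
        elif token in INTENSIFIERS:
            multiplier *= INTENSIFIERS[token]
        elif token in RELAXERS:
            multiplier *= RELAXERS[token]
        elif token in LEXICON:
            value = LEXICON[token] * multiplier
            if negate:
                value = -value
            scores.append(clamp(value))
            multiplier, negate = 1.0, False
    return scores
```

With this sketch, "very good" scores higher than "good", "slightly bad" is milder than "bad", and "not good" flips sign. Simply flipping the sign on negation is a simplification; real systems often shift or dampen the score instead.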
=== Subjectivity/objectivity identification === This task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification. The subjectivity of words and phrases may depend on their context, and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Moreover, as mentioned by Su, results are largely dependent on the definition of subjectivity used when annotating texts. However, Pang showed that removing objective sentences from a document before classifying its polarity helped improve performance. Subjective and objective identification is an emerging subtask of sentiment analysis that uses syntactic and semantic features, together with machine learning, to identify whether a sentence or document contains facts or opinions. Awareness of the distinction between facts and opinions is not recent, having possibly first been presented by Carbonell at Yale University in 1979. The term objective refers to text that carries factual information. Example of an objective sentence: 'To be elected president of the United States, a candidate must be at least thirty-five years of age.' The term subjective describes text that contains non-factual information in various forms, such as personal opinions, judgments, and predictions, also known as 'private states'. In the example below, 'We Americans' reflects a private state. Moreover, the target entity commented on by the opinions can take several forms, from tangible products to intangible topics, as stated in Liu (2010). Furthermore, three types of attitudes were observed by Liu (2010): 1) positive opinions, 2) neutral opinions, and 3) negative opinions. Example of a subjective sentence: 'We Americans need to elect a president who is mature and who is able to make wise decisions.' This analysis is a classification problem. 
For each class, collections of indicator words or phrases are defined in order to locate the desired patterns in unannotated text. For subjective expressions, a separate word list has been created. Lists of subjective indicators in words or phrases have been developed by multiple researchers in the linguistics and natural language processing fields, as stated in Riloff et al. (2003). A dictionary of extraction rules has to be created for measuring given expressions. Over the years, feature extraction in subjectivity detection has progressed from hand-curated features to automated feature learning. At present, automated learning methods can be further divided into supervised and unsupervised machine learning. Pattern extraction from annotated and unannotated text using machine learning has been explored extensively by academic researchers. However, researchers have recognized several challenges in developing fixed sets of rules for expressions. Many of the challenges in rule development stem from the nature of textual information. Six challenges have been recognized by several researchers: 1) metaphorical expressions, 2) discrepancies in writing, 3) context sensitivity, 4) words with fewer usages, 5) time sensitivity, and 6) ever-growing volume. Metaphorical expressions. Text that contains metaphorical expressions may impact the performance of the extraction. Besides, metaphors take different forms, which may have been contribu

  • Robomart

    Robomart

    Robomart is an American technology company headquartered in Santa Monica, California that builds autonomous smart shops for cafes, ice cream parlors, and quick-service restaurants. The company’s white label platform gives retailers the option to expand their footprint at a significantly lower cost than traditional brick-and-mortar real-estate. Robomarts are equipped with a proprietary checkout-free system, temperature controlled compartments, sensors for autonomous operation, and external cameras for added security. The company licenses its technology and white label applications to retailers who manage their fleet of stores and deploy them to their consumers’ locations. After consumers have taken goods from the robomart, their order is automatically calculated, their card on file is charged and they are sent a receipt. The company has announced partnerships with Unilever, Mars, and Fatty Mart. == History == Robomart was founded by Ali Ahmed, Tigran Shahverdyan, and Emad Suhail Rahim. The company debuted at CES 2018 where it unveiled its concept of a self-driving store. At GITEX 2018 the company presented its first functional prototype of a fully driverless Robomart. At the 2019 Consumer Electronics Show the company demonstrated the technology behind its autonomous stores and checkout-free shopping experience. In January 2019, Robomart announced its first partnership with U.S. grocery chain Stop & Shop to test its driverless stores. In December 2020, Robomart deployed the Pharmacy Robomart in a trial in West Hollywood. In June 2021, the company launched its commercial service with a fleet of Pharmacy and Snacks Robomarts operating within West Hollywood and Central Hollywood. In August 2023, Robomart announced a $2 million seed round, putting its to-date funding at $3.4 million. == Partnerships == In September 2019, Robomart partnered with Avery Dennison to source the RFID tags used to enable its checkout-free shopping experience. 
In December 2020, Robomart partnered with Zeeba Vans to provide vehicles for its growing fleet. In June 2021, Robomart partnered with REEF Technology to provide inventory management and restocking services. In addition, REEF's Light Speed grocery division serves as the first merchant selling products through Robomart. == Products == The company currently offers three Robomart types: the frozen Robomart, which stocks ice cream; the refrigerated Robomart, which stocks perishable foods; and the ambient Robomart, which stocks shelf-stable goods.

  • Chunked transfer encoding

    Chunked transfer encoding

    Chunked transfer encoding is a streaming data transfer mechanism available in Hypertext Transfer Protocol (HTTP) version 1.1, defined in RFC 9112 §7.1. In chunked transfer encoding, the data stream is divided into a series of non-overlapping "chunks". The chunks are sent out and received independently of one another. At any given time, no knowledge of the data stream outside the currently-being-processed chunk is necessary for either the sender or the receiver. Each chunk is preceded by its size in bytes and transmission ends when a zero-length chunk is received. The chunked keyword in the Transfer-Encoding header is used to indicate chunked transfer. Chunked transfer encoding is not supported in HTTP/2, which provides its own mechanisms for data streaming. == Rationale == The introduction of chunked encoding provided various benefits: Chunked transfer encoding allows a server to maintain an HTTP persistent connection for dynamically generated content. In this case, the HTTP Content-Length header cannot be used to delimit the content and the next HTTP request/response, as the content size is not yet known. Chunked encoding has the benefit that it is not necessary to generate the full content before writing the header, as it allows streaming of content as chunks and explicitly signaling the end of the content, making the connection available for the next HTTP request/response. Chunked encoding allows the sender to send additional header fields after the message body. This is important in cases where values of a field cannot be known until the content has been produced, such as when the content of the message must be digitally signed. Without chunked encoding, the sender would have to buffer the content until it was complete in order to calculate a field value and send it before the content. 
== Applicability == For version 1.1 of the HTTP protocol, the chunked transfer mechanism is considered to be always acceptable, even if not listed in the TE request header field, and when used with other transfer mechanisms, should always be applied last to the transferred data and never more than once. This transfer encoding method also allows additional entity header fields to be sent after the last chunk if the client specified the "trailers" parameter as an argument of the TE request field. The origin server of the response can also decide to send additional entity trailers even if the client did not specify the "trailers" parameter, but only if the metadata is optional (i.e. the client can use the received entity without them). Whenever the trailers are used, the server should list their names in the Trailer header field; three header field types are specifically prohibited from appearing as a trailer field: Content-Length, Trailer, and Transfer-Encoding. == Format == If a Transfer-Encoding field with a value of "chunked" is specified in an HTTP message (either a request sent by a client or the response from the server), the body of the message consists of one or more chunks and one terminating chunk with an optional trailer before the final ␍␊ sequence (i.e. carriage return followed by line feed). Each chunk starts with the number of octets of the data it embeds expressed as a hexadecimal number in ASCII followed by optional parameters (chunk extension) and a terminating ␍␊ sequence, followed by the chunk data. The chunk is terminated by ␍␊. If chunk extensions are provided, the chunk size is terminated by a semicolon and followed by the parameters, each also delimited by semicolons. Each parameter is encoded as an extension name followed by an optional equal sign and value. These parameters could be used for a running message digest or digital signature, or to indicate an estimated transfer progress, for instance. 
The terminating chunk is a special chunk of zero length. It may contain a trailer, which consists of a (possibly empty) sequence of entity header fields. Normally, such header fields would be sent in the message's header; however, it may be more efficient to determine them after processing the entire message entity. In that case, it is useful to send those headers in the trailer. Header fields that regulate the use of trailers are TE with the "trailers" parameter (used in requests) and Trailer (used in responses). == Use with compression == HTTP servers often use compression to optimize transmission, for example with Content-Encoding: gzip or Content-Encoding: deflate. If both compression and chunked encoding are enabled, then the content stream is first compressed, then chunked; so the chunk encoding itself is not compressed, and the data in each chunk is compressed holistically (i.e. based on the whole content). The remote endpoint then decodes the stream by concatenating the chunks and uncompressing the result. == Example == === Encoded data === The following example contains three chunks of size 4, 7, and 11 (hexadecimal "B") octets of data. 4␍␊Wiki␍␊7␍␊pedia i␍␊B␍␊n ␍␊chunks.␍␊0␍␊␍␊ Below is an annotated version of the encoded data. 4␍␊ (chunk size is four octets) Wiki (four octets of data) ␍␊ (end of chunk) 7␍␊ (chunk size is seven octets) pedia i (seven octets of data) ␍␊ (end of chunk) B␍␊ (chunk size is eleven octets) n ␍␊chunks. (eleven octets of data) ␍␊ (end of chunk) 0␍␊ (chunk size is zero octets, no more chunks) ␍␊ (end of final chunk with zero data octets) Note: Each chunk's size excludes the two ␍␊ bytes that terminate the data of each chunk. === Decoded data === Decoding the above example produces the following octets: Wikipedia in ␍␊chunks. The bytes above are typically displayed as Wikipedia in chunks.
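The format described above can be exercised with a short sketch. The two functions below, `encode_chunked` and `decode_chunked`, are hypothetical helpers written for this illustration and are not part of any HTTP library. They handle a complete message body held in memory, ignore the content of chunk extensions, and return trailer lines unparsed; a production parser would also need size limits and stricter validation.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for part in parts:
        if not part:
            continue  # a zero-length chunk would signal the end of the body
        # Chunk size in hexadecimal ASCII, CRLF, the data, CRLF.
        out += format(len(part), "X").encode("ascii") + b"\r\n"
        out += part + b"\r\n"
    out += b"0\r\n\r\n"  # terminating chunk with an empty trailer
    return bytes(out)


def decode_chunked(data):
    """Decode a complete chunked body; return (payload, trailer_lines)."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension.
        size_field = data[pos:eol].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        if data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("chunk data is not terminated by CRLF")
        pos += size + 2
    # Optional trailer fields follow, ended by an empty line.
    trailers = []
    while True:
        eol = data.index(b"\r\n", pos)
        line = data[pos:eol]
        pos = eol + 2
        if not line:
            break
        trailers.append(line)
    return bytes(body), trailers


# The encoded example from the text, written with explicit CR LF bytes.
EXAMPLE = b"4\r\nWiki\r\n7\r\npedia i\r\nB\r\nn \r\nchunks.\r\n0\r\n\r\n"
```

Decoding `EXAMPLE` returns the 22 payload octets `Wikipedia in \r\nchunks.` and an empty trailer list, and encoding the three parts `b"Wiki"`, `b"pedia i"` and `b"n \r\nchunks."` reproduces `EXAMPLE`. Because the decoder skips ahead by the declared size, the CR LF inside the third chunk is treated as data, not as a delimiter.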

  • Tokenization (data security)

    Tokenization (data security)

    Tokenization, when applied to data security, is the process of substituting a sensitive data element with a non-sensitive equivalent, referred to as a token, that has no intrinsic or exploitable meaning or value. The token is a reference (i.e. identifier) that maps back to the sensitive data through a tokenization system. The mapping from original data to a token uses methods that render tokens infeasible to reverse in the absence of the tokenization system, for example using tokens created from random numbers. A one-way cryptographic function is used to convert the original data into tokens, making it difficult to recreate the original data without obtaining entry to the tokenization system's resources. To deliver such services, the system maintains a vault database of tokens that are connected to the corresponding sensitive data. Protecting the system vault is vital to the system, and improved processes must be put in place to offer database integrity and physical security. The tokenization system must be secured and validated using security best practices applicable to sensitive data protection, secure storage, audit, authentication and authorization. The tokenization system provides data processing applications with the authority and interfaces to request tokens, or detokenize back to sensitive data. The security and risk reduction benefits of tokenization require that the tokenization system is logically isolated and segmented from data processing systems and applications that previously processed or stored sensitive data replaced by tokens. Only the tokenization system can tokenize data to create tokens, or detokenize back to redeem sensitive data under strict security controls. The token generation method must be proven to have the property that there is no feasible means through direct attack, cryptanalysis, side channel analysis, token mapping table exposure or brute force techniques to reverse tokens back to live data. 
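As a rough illustration of the vault model described above, the following sketch maps sensitive values to random tokens and back. The class `TokenVault`, the `tok_` prefix, and the boolean `authorized` flag are all assumptions made for this example; a real tokenization system would use persistent, access-controlled storage and proper authentication and authorization in place of an in-memory dictionary and a flag.

```python
import secrets


class TokenVault:
    """In-memory stand-in for a tokenization system's vault (illustrative only)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value):
        """Return a random token for value, reusing an existing mapping."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        while True:
            # The token is random, so it carries no information about the
            # original value and cannot be reversed without the vault.
            token = "tok_" + secrets.token_hex(8)
            if token not in self._token_to_value:
                break
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token, authorized=False):
        """Redeem a token for its original value; callers must be authorized."""
        if not authorized:
            raise PermissionError("caller is not permitted to detokenize")
        return self._token_to_value[token]
```

An application would store and pass around only the value returned by `tokenize`, while `detokenize` stays restricted to the few trusted callers. This sketch uses random tokens looked up in the vault; it does not show the variant in which tokens are derived from the data by a one-way cryptographic function.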
Replacing live data with tokens is intended to minimize the exposure of sensitive data to applications, data stores, people and processes, reducing the risk of compromise, accidental exposure and unauthorized access. Applications can operate using tokens instead of live data, with the exception of a small number of trusted applications explicitly permitted to detokenize when strictly necessary for an approved business purpose. Tokenization systems may be operated in-house within a secure, isolated segment of the data center, or as a service from a secure service provider. Tokenization may be used to safeguard sensitive data involving, for example, bank accounts, financial statements, medical records, criminal records, driver's licenses, loan applications, stock trades, voter registrations, and other types of personally identifiable information (PII). Tokenization is often used in credit card processing. The PCI Council defines tokenization as "a process by which the primary account number (PAN) is replaced with a surrogate value called a token. A PAN may be linked to a reference number through the tokenization process. In this case, the merchant simply has to retain the token and a reliable third party controls the relationship and holds the PAN. The token may be created independently of the PAN, or the PAN can be used as part of the data input to the tokenization technique. The communication between the merchant and the third-party supplier must be secure to prevent an attacker from intercepting to gain the PAN and the token. De-tokenization is the reverse process of redeeming a token for its associated PAN value. The security of an individual token relies predominantly on the infeasibility of determining the original PAN knowing only the surrogate value".
The choice of tokenization as an alternative to other techniques such as encryption depends on varying regulatory requirements, their interpretation, and acceptance by the respective auditing or assessment entities. This is in addition to any technical, architectural or operational constraints that tokenization imposes in practical use.

== Concepts and origins ==

The concept of tokenization, as adopted by the industry today, has existed since the first currency systems emerged centuries ago as a means to reduce risk in handling high-value financial instruments by replacing them with surrogate equivalents. In the physical world, coin tokens have a long history of use in place of minted coins and banknotes. In more recent history, subway tokens and casino chips were adopted in their respective systems to replace physical currency and reduce cash-handling risks such as theft. Exonumia and scrip are terms synonymous with such tokens.

In the digital world, similar substitution techniques have been used since the 1970s as a means to isolate real data elements from exposure to other data systems. In databases, for example, surrogate key values have been used since 1976 to isolate data associated with the internal mechanisms of databases and their external equivalents for a variety of uses in data processing. More recently, these concepts have been extended to use this isolation tactic as a security mechanism for data protection.

In the payment card industry, tokenization is one means of protecting sensitive cardholder data in order to comply with industry standards and government regulations. Tokenization was applied to payment card data by Shift4 Corporation and released to the public during an industry Security Summit in Las Vegas, Nevada in 2005. The technology is meant to prevent the theft of credit card information in storage.
Shift4 defines tokenization as: "The concept of using a non-decryptable piece of data to represent, by reference, sensitive or secret data. In payment card industry (PCI) context, tokens are used to reference cardholder data that is managed in a tokenization system, application or off-site secure facility."

To protect data over its full lifecycle, tokenization is often combined with end-to-end encryption to secure data in transit to the tokenization system or service, with a token replacing the original data on return. For example, to avoid the risk of malware stealing data from low-trust systems such as point of sale (POS) systems, as in the Target breach of 2013, cardholder data must be encrypted before card data enters the POS, not after. Encryption takes place within the confines of a security-hardened and validated card reading device, and data remains encrypted until received by the processing host. This approach was pioneered by Heartland Payment Systems as a means to secure payment data from advanced threats and is now widely adopted by payment processing and technology companies. The PCI Council has also specified end-to-end encryption (certified point-to-point encryption, or P2PE) for various service implementations in its Point-to-point Encryption documents.

== The tokenization process ==

The process of tokenization consists of the following steps:

1. The application sends the tokenization data and authentication information to the tokenization system.
2. If authentication fails, the process stops and the data is delivered to an event management system, so that administrators can discover problems and manage the system effectively. If authentication succeeds, the system moves on to the next phase.
3. Using one-way cryptographic or random generation techniques, a token is generated and kept in a highly secure data vault.
4. The new token is provided to the application for further use, replacing the sensitive data for processing and storage.

Tokenization systems share several components according to established standards:

1. Token generation: the process of producing a token using any means, such as one-way, non-reversible cryptographic functions (e.g., a hash function with a strong, secret salt) or assignment of a randomly generated number. Random number generator (RNG) techniques are often the best choice for generating token values.
2. Token mapping: the process of assigning the created token value to its original value. To enable permitted look-ups of the original value using the token as the index, a secure cross-reference database must be constructed.
3. Token data store: a central repository for the token mapping process that holds the original sensitive values and their related token values. Sensitive data and token values must be kept securely in an encrypted format.
4. Management of cryptographic keys: strong key management procedures are required for sensitive data encryption on token data stores.

== Difference from encryption ==

Tokenization and "classic" encryption effectively protect data if implemented properly, and a computer security system may use both. While similar in certain regards, tokenization and classic encryption differ in a few key aspects. Both are cryptographic data security methods and the
