Machine-readable medium and data

Machine-readable medium and data

In communications and computing, a machine-readable medium (or computer-readable medium) is a medium capable of storing data in a format easily readable by a digital computer or a sensor. It contrasts with human-readable medium and data. The result is called machine-readable data or computer-readable data, and the data itself can be described as having machine-readability. == Data == Machine-readable data must be structured data. Attempts to create machine-readable data occurred as early as the 1960s. At the same time that seminal developments in machine-reading and natural-language processing were releasing (like Weizenbaum's ELIZA), people were anticipating the success of machine-readable functionality and attempting to create machine-readable documents. One such example was musicologist Nancy B. Reich's creation of a machine-readable catalog of composer William Jay Sydeman's works in 1966. In the United States, the OPEN Government Data Act of 14 January 2019 defines machine-readable data as "data in a format that can be easily processed by a computer without human intervention while ensuring no semantic meaning is lost." The law directs U.S. federal agencies to publish public data in such a manner, ensuring that "any public data asset of the agency is machine-readable". Machine-readable data may be classified into two groups: human-readable data that is marked up so that it can also be read by machines (e.g. microformats, RDFa, HTML), and data file formats intended principally for processing by machines (CSV, RDF, XML, JSON). These formats are only machine readable if the data contained within them is formally structured; exporting a CSV file from a badly structured spreadsheet does not meet the definition. Machine readable is not synonymous with digitally accessible. A digitally accessible document may be online, making it easier for humans to access via computers, but its content is much harder to extract, transform, and process via computer programming logic if it is not machine-readable. Extensible Markup Language (XML) is designed to be both human- and machine-readable, and Extensible Stylesheet Language Transformations (XSLT) is used to improve the presentation of the data for human readability. For example, XSLT can be used to automatically render XML in Portable Document Format (PDF). Machine-readable data can be automatically transformed for human-readability but, generally speaking, the reverse is not true. For purposes of implementation of the Government Performance and Results Act (GPRA) Modernization Act, the Office of Management and Budget (OMB) defines "machine readable format" as follows: "Format in a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g.; xml). Traditional word processing documents and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. Other formats such as extensible markup language (XML), (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. As HTML is a structural markup language, discreetly labeling parts of the document, computers are able to gather document components to assemble tables of contents, outlines, literature search bibliographies, etc. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements." == Media == Examples of machine-readable media include magnetic media such as magnetic disks, cards, tapes, and drums, punched cards and paper tapes, optical discs, barcodes and magnetic ink characters. Common machine-readable technologies include magnetic recording, processing waveforms, and barcodes. Optical character recognition (OCR) can be used to enable machines to read information available to humans. Any information retrievable by any form of energy can be machine-readable. Examples include: Acoustics Chemical Photochemical Electrical Semiconductor used in volatile RAM microchips Floating-gate transistor used in non-volatile memory cards Radio transmission Magnetic storage Mechanical Tins And Swins Punched card Paper tape Music roll Music box cylinder or disk Grooves (See also: Audio Data) Phonograph cylinder Gramophone record DictaBelt (groove on plastic belt) Capacitance Electronic Disc Optics Optical storage Thermodynamic == Applications == === Documents === === Catalogs === === Dictionaries === === Passports ===

AirPair

AirPair is a service and eponymous company that connects people who need help with programming issues (usually, programmers at small technology companies or at finance companies that use technology products) and people who can help them. Unlike services such as oDesk and Elance, AirPair is not a service for outsourcing programming tasks, but rather a service that facilitates one-off knowledge transfers from people with highly specialized knowledge of particular technology stacks or programming issues to people who are in need of specialized help. == History == AirPair launched in March 2013, with founder Jonathon Kresner, who hails from Australia, working full-time, and it soon hired three other part-time developers to work alongside him. Kresner had previously founded two other startups: Preparty, a social invitation and event-booking service based in Australia, and ClimbFind, an online rock-climbing community that reached a million users. Kresner was inspired to work on AirPair because he saw the need for outside expert assistance with programming issues arise regularly at these startups. In November 2013, founder Kresner describes the company's initial success at bootstrapping itself to "Ramen profitability" in a blog post. In December 2013, AirPair was accepted into the Winter 2014 Y Combinator batch. In March 2014, AirPair announced it would launch partnerships with Stripe, Twilio, and other companies that had their own application programming interfaces, allowing developers having trouble with the APIs to seek help over AirPair from experts on the APIs. AirPair presented at the Y Combinator Winter 2014 Demo Day on March 25, 2014, and successfully raised over $1 million within the next 48 hours. == Reception == A review of AirPair by Will Lam stressed that because payment was based on time rather than results, it was important to use it for clearly thought-out questions where one had high confidence that the session would help. Dennis Beatty, who met AirPair founder Jonathon Kresner in March 2014, wrote in April 2014 a glowing review of AirPair's vision of connecting people and its business success. AirPair has been compared with other peer-to-peer coding help sites such as Codementor and HackHands.

Knapsack cryptosystems

Knapsack cryptosystems are cryptosystems whose security is based on the hardness of solving the knapsack problem. They remain quite unpopular because simple versions of these algorithms have been broken for several decades. However, that type of cryptosystem is a good candidate for post-quantum cryptography. The most famous knapsack cryptosystem is the Merkle-Hellman Public Key Cryptosystem, one of the first public key cryptosystems, published the same year as the RSA cryptosystem. However, this system has been broken by several attacks: one from Shamir, one by Adleman, and the low density attack. However, there exist modern knapsack cryptosystems that are considered secure so far: among them is Nasako-Murakami 2006. Knapsack cryptosystems, when not subject to classical cryptoanalysis, are believed to be difficult even for quantum computers. That is not the case for systems that rely on factoring large integers, like RSA, or computing discrete logarithms, like ECDSA, problems solved in polynomial time with Shor's algorithm.

Corporate surveillance

Corporate surveillance describes the practice of businesses monitoring and extracting information from their users, clients, or staff. This information may consist of online browsing history, email correspondence, phone calls, location data, and other private details. Acts of corporate surveillance frequently look to boost results, detect potential security problems, or adjust advertising strategies. These practices have been criticized for violating ethical standards and invading personal privacy. Critics and privacy activists have called for businesses to incorporate rules and transparency surrounding their monitoring methods to ensure they are not misusing their position of authority or breaching regulatory standards. Monitoring can feel intrusive and give the impression that the business does not promote ethical behavior among its personnel. Staff satisfaction, productivity, and staff turnover may all suffer as a result of the invasion of privacy. == Monitoring methods == Employers may be authorized to gather information through keystroke logging and mouse tracking, which involves recording the keys individuals interact with and cursor position on computers. In cases where employment contracts permit it, they may also monitor webcam activity on company-provided computers. Employers may be able to view the emails sent from business accounts and may be able to see the websites visited when using a corporate internet connection. The screenshot capability is another tool that enables companies to see what remote workers are doing. This feature, which can be found in tracking software, takes screenshots throughout the day at predetermined or arbitrary intervals. Additionally, people who don't work in offices are observed. For instance, it has been claimed that Amazon has incorporated tracking technology to monitor warehouse staff and delivery drivers. == Use of collected information == Information collected by corporations can be used for a variety of uses including marketing research, targeting advertising, fraud detection and prevention, ensuring policy adherence, preventing lawsuits, and safeguarding records and company assets. == Privacy concerns == Concerns over corporate privacy have become more important due to companies collection and manipulation of personal data. Since these practices have been recognized there has been a rising concern about both the security and the possible mishandling of the data accumulated. Social Media data collection and monitoring has been one of the most concerned areas regarding corporate surveillance. Recently, many employers on CareerBuilder have checked their potential candidates' social media activities before the hiring process. This approach can be excusable since it is important to be aware of a future employee or applicant's online presence, and how it might affect the company's reputation in the future. This is crucial since employers are often made legally responsible for their worker's digital actions. These data can also be used to enact political gains. The Facebook-Cambridge Analytica data scandal in 2018 revealed that its British branch to have surreptitiously sold American psychological data to the Trump campaign. This information was supposed to be private, but Facebook's inability to protect user information had reportedly not been a top priority of the company at the time. == Laws and regulations == The National Labor and Relations Act (NLRA) safeguards workplace democracy by giving workers in the private sector the basic freedom to demand better working conditions and choice of representation without fear of retaliation. General Data Protection Regulation (GDPR) outlines the broad responsibilities of data controllers and the "processors" that handle personal data on their behalf. They must adopt the necessary security measures in accordance with the risk involved in the data processing operations they carry out.[1] Electronics Communication Privacy Act (ECPA), as amended, provides protection for electronic, oral, and wire communications while they are being created, while they are being sent, and while they are being stored on computers. Email, phone calls, and electronically stored data are covered by the Act. == Sale of customer data == If it is business intelligence, data collected on individuals and groups can be sold to other corporations, so that they can use it for the aforementioned purpose. It can be used for direct marketing purposes, such as targeted advertisements on Google and Yahoo. These ads are tailored to the individual user of the search engine by analyzing their search history and emails (if they use free webmail services). For example, the world's most popular web search engine stores identifying information for each web search. Google stores an IP address and the search phrase used in a database for up to 2 years. Google also scans the content of emails of users of its Gmail webmail service, in order to create targeted advertising based on what people are talking about in their personal email correspondences. Google is, by far, the largest web advertising agency. Their revenue model is based on receiving payments from advertisers for each page-visit resulting from a visitor clicking on a Google AdWords ad, hosted either on a Google service or a third-party website. Millions of sites place Google's advertising banners and links on their websites, in order to share this profit from visitors who click on the ads. Each page containing Google advertisements adds, reads, and modifies cookies on each visitor's computer. These cookies track the user across all of these sites, and gather information about their web surfing habits, keeping track of which sites they visit, and what they do when they are on these sites. This information, along with the information from their email accounts, and search engine histories, is stored by Google to use for building a profile of the user to deliver better-targeted advertising. == Surveillance of workers == In 1993, David Steingard and Dale Fitzgibbons argued that modern management, far from empowering workers, had features of neo-Taylorism, where teamwork perpetuated surveillance and control. They argued that employees had become their own "thought police" and the team gaze was the equivalent of Bentham's panopticon guard tower. A critical evaluation of the Hawthorne Plant experiments has in turn given rise to the notion of a Hawthorne effect, where workers increase their productivity in response to their awareness of being observed or because they are gratified for being chosen to participate in a project. According to the American Management Association and the ePolicy Institute, who undertook a quantitative survey in 2007 about electronic monitoring and surveillance with approximately 300 US companies, "more than one fourth of employers have fired workers for misusing email and nearly one third have fired employees for misusing the Internet." Furthermore, about 30 percent of the companies had also fired employees for usage of "inappropriate or offensive language" and "viewing, downloading, or uploading inappropriate/offensive content." More than 40 percent of the companies monitor email traffic of their workers, and 66 percent of corporations monitor Internet connections. In addition, most companies use software to block websites such as sites with games, social networking, entertainment, shopping, and sports. The American Management Association and the ePolicy Institute also stress that companies track content that is being written about them, for example by monitoring blogs and social media, and scanning all files that are stored in a filesystem. == Government use of corporate surveillance data == The United States government often gains access to corporate databases, either by producing a warrant for it, or by asking. The Department of Homeland Security has openly stated that it uses data collected from consumer credit and direct marketing agencies—such as Google—for augmenting the profiles of individuals that it is monitoring. The US government has gathered information from grocery store discount card programs, which track customers' shopping patterns and store them in databases, in order to look for terrorists by analyzing shoppers' buying patterns. == Corporate surveillance of citizens == According to Dennis Broeders, "Big Brother is joined by big business". He argues that corporations are in any event interested in data on their potential customers and that placing some forms of surveillance in the hands of companies, results in companies owning video surveillance data for stores and public places. The commercial availability of surveillance systems has led to their rapid spread. Therefore it is almost impossible for citizens to maintain their anonymity. When businesses can monitor their customers, such customers run the risk of facing prejudice when applying for housing, loans, jobs, and other economic opportun

Brooklyn Bridge (software)

The Brooklyn Bridge from White Crane Systems was a data transfer enabler. Although it came with some hardware, it was the software which was the basis of the product. It also could transform the data's format. == Overview == The New York Times described its category as being among "communications packages used to transfer files." In an era of 300 baud, Brooklyn Bridge operated at "115,200 baud" so that a transfer which "at 300 baud took 4 minutes and 36 seconds" only needed 5 seconds. Unlike some communications packages, this one retains the original version-date, so as not to alarm people when they seem to have what looks like an update, when it's not. == Description == Once the software is installed, users comfortable with typing the word "COPY" can do so as readily as they sneakernet. An earlier review described it as "less cumbersome than conventional communications software" The use of neither specialized hardware nor specialized software is ideal in an era when this can be done using online or other "outside" services.

Plug computer

A plug computer is a small-form-factor computer whose chassis contains the AC power plug, and thus plugs directly into the wall. Alternatively, the computer may resemble an AC adapter or a similarly small device. Plug computers are often configured for use in the home or office as compact computer. == Description == Plug computers consist of a high-performance, low-power system-on-a-chip processor, with several I/O hardware ports (USB ports, Ethernet connectors, etc.). Most versions do not have provisions for connecting a display and are best suited to running media servers, back-up services, or file sharing and remote access functions; thus acting as a bridge between in-home protocols (such as Digital Living Network Alliance (DLNA) and Server Message Block (SMB)) and cloud-based services. There are, however, plug computer offerings that have analog VGA monitor and/or HDMI connectors, which, along with multiple USB ports, permit the use of a display, keyboard, and mouse, thus making them full-fledged, low-power alternatives to desktop and laptop computers. They typically run any of a number of Linux distributions. Plug computers typically consume little power and are inexpensive. == History == A number of other devices of this type began to appear at the 2009 Consumer Electronics Show. On January 6, 2009 CTERA Networks launched a device called CloudPlug that provides online backup at local disk speeds and overlays a file sharing service. The device also transforms any external USB hard drive into a network-attached storage device. On January 7, 2009, Cloud Engines unveiled the Pogoplug network access server. On January 8, 2009, Axentra announced availability of their HipServ platform. On February 23, 2009, Marvell Technology Group announced its plans to build a mini-industry around plug computers. On August 19, 2009, CodeLathe announced availability of their TonidoPlug network access server. On November 13, 2009 QuadAxis launched its plug computing device product line and development platform, featuring the QuadPlug and QuadPC and running QuadMix, a modified Linux. On January 5, 2010, Iomega announced their iConnect network access server. On January 7, 2010 Pbxnsip launched its plug computing device the sipJack running pbxnsip: an IP Communications platform.

Data security

Data security or data protection is the process of securing digital information to protect it from online threats. Data security or protection means protecting digital data, such as those in a database, from destructive forces and from the unwanted actions of unauthorized users, such as a cyberattack or a data breach. Data security protects computer hardware, software, storage devices, and the data of user devices. Data security also protects the data of organizations, companies and administrative controls. Data security guarantees the protection of individual data, such as identity documents and bank data, and protects against unauthorized access, theft and loss of individual data. Data security also protects data breaches that occurs in companies and industries. Good security measures in industries reduce the probability of data breaches, and employees can rely on the company with their data and private information to be kept secured while companies can continue to maintain a stable reputation. The CIA Triad (Confidentiality, Integrity, and Availability) is what is used to practice what an information security is required to follow. Confidentiality, protects information from being accessed by unauthorized persons. Integrity, makes sure data is trustworthy; and Availability, meaning that data can be accessed by approved users when it is needed; are three goals for data security. Non-repudiation in data security definition, is a device/service that shows where the data originated from and the proof of integrity. == Technologies == === Disk encryption === Disk encryption refers to encryption technology that encrypts data on a hard disk drive. It takes data from a storage device and coverts it into an unreadable format. Disk encryption typically takes form in either software (see disk encryption software) or hardware (see disk encryption hardware) which can be used together. Disk encryption is often referred to as on-the-fly encryption (OTFE) or transparent encryption. Full disk encryption encrypts each individual sector of a disk volume. Files and user data are encrypted to hinder unauthorized users from accessing without a decryption key. A diversifier permits a plaintext of a specific disk sector to be encrypted into different ciphertexts, which does not require additional storage, such as an initialization vector (IV) or message authentication code (MAC). === Software versus hardware-based mechanisms for protecting data === Software-based security solutions encrypt the data to protect it from theft. However, a malicious program or a hacker could corrupt the data to make it unrecoverable, making the system unusable. Hardware-based security solutions prevent read and write access to data, which provides very strong protection against tampering and unauthorized access. Hardware-based security or assisted computer security offers an alternative to software-only computer security. Security tokens such as those using PKCS#11 or a mobile phone may be more secure due to the physical access required in order to be compromised. Access is enabled only when the token is connected and the correct PIN is entered (see two-factor authentication). However, dongles can be used by anyone who can gain physical access to it. Newer technologies in hardware-based security solve this problem by offering full proof of security for data. Working off hardware-based security: A hardware device allows a user to log in, log out and set different levels through manual actions. Many devices use biometric technology to prevent malicious users from logging in, logging out, and changing privilege levels. The current state of a user of the device is read by controllers in peripheral devices such as hard disks. Illegal access by a malicious user or a malicious program is interrupted based on the current state of a user by hard disk and DVD controllers making illegal access to data impossible. Hardware-based access control is more secure than the protection provided by the operating systems as operating systems are vulnerable to malicious attacks by viruses and hackers. The data on hard disks can be corrupted after malicious access is obtained. With hardware-based protection, the software cannot manipulate the user privilege levels. A hacker or a malicious program cannot gain access to secure data protected by hardware or perform unauthorized privileged operations. This assumption is broken only if the hardware itself is malicious or contains a backdoor. The hardware protects the operating system image and file system privileges from being tampered with. Therefore, a completely secure system can be created using a combination of hardware-based security and secure system administration policies. === Backups === Backup is the process of reproducing copies of essential data and storing in a separate, secured place. It is used to ensure data that is lost can be recovered from another source. Backups contains a minimum of one copy of the data that requires preservation. It is considered essential to keep a backup of any data in most industries and the process is recommended for any files of importance to a user. There are 3 types of backups; full backups, incremental backups, and differential backups. Full backups secure all data from a production system, such as a server, database, or other connected data source. It is impossible to lose all data in a full backup if a breach or corruption were to occur. Full backups require a significantly large amount of time to back up and may be time-consuming taking hours to days to complete. Incremental backups only secures changed data since last backup. While all backups are done in full backups, incremental backups only save data that is recently or frequently changed. Incremental backups require lower storage costs making it a prominent solution for growing datasets. === Data Privacy === Data privacy (or information privacy) is the right for individual's data to be secured to obstruct the use of unauthorized access. It gives individuals control over their data and how it can be shared to third parties. The U.S Privacy Protection Law (see Privacy laws of the United States) requires organizations to inform individuals of how their data is collected and when a data breach occurs. By implementing an encryption, it ensures that private data is unreadable to cybercriminals. === Data masking === Data masking of structured data is the process of obscuring (masking) specific data within a database table or cell to ensure that data security is maintained and sensitive information is not exposed to unauthorized personnel. This may include masking the data from users (for example so banking customer representatives can only see the last four digits of a customer's national identity number), developers (who need real production data to test new software releases but should not be able to see sensitive financial data), outsourcing vendors, etc. Data masking is a form of encryption, as it obscures data by modifying particular letters and numbers to keep data concealed and protected from potential hackers. The individual that has access to the code that decrypts the replaced characters are the only ones that can uncover the data. === Data erasure === Data erasure (or data deletion, data destruction) is a method of software-based overwriting that permanently clears all electronic data residing on a hard drive or other digital media to ensure that no sensitive data is lost when an asset is retired or reused. Article 17: Right to be Forgotten states that users have the right to permanently remove all of their private information from their old devices/services to give people more control over their data. Users are able to switch between devices efficiently. == Threats == === Malware === Malware (or malicious software) is designed to destroy, corrupt or gain unauthorized access to a computer for the purpose of stealing, or destroying data. Hackers who use malware typically utilize many types of malware, which includes computer virus, computer worms, ransomware, spyware and Trojan horse to create a vast system of disruption and cause easy data theft. One of the victims of the vast system of disruption includes healthcare workers, who are targeted by compromised systems by infections and then having their data attacked. === Phishing === Phishing is a type of scam that allows hackers to hoax people using psychological and social engineering (using human emotions such as their trust and fear) tactics into giving personal data through emails and messages, and install computer viruses if the individual were to click on a malicious link unknowingly. Attackers are able to create websites that are very similar to original websites, which makes it difficult to detect a fake website, causing individuals to fall for giving in information. Phishing attackers use human emotion to exploit them, such as making them feel fear, urgency, sympathy with the message