Optical character recognition (OCR) or optical character reader is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example: from a television broadcast). Widely used as a form of data entry from printed paper data records – whether passport documents, invoices, bank statements, computerized receipts, business cards, mail, printed data, or any suitable documentation – it is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed online, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision. Early versions needed to be trained with images of each character, and worked on one font at a time. Advanced systems capable of producing a high degree of accuracy for most fonts are now common, and with support for a variety of image file format inputs. Some systems are capable of reproducing formatted output that closely approximates the original page including images, columns, and other non-textual components. == History == Early optical character recognition may be traced to technologies involving telegraphy and creating reading devices for the blind. In 1914, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code. Concurrently, Edmund Fournier d'Albe developed the Optophone, a handheld scanner that when moved across a printed page, produced tones that corresponded to specific letters or characters. In the late 1920s and into the 1930s, Emanuel Goldberg developed what he called a "Statistical Machine" for searching microfilm archives using an optical code recognition system. In 1931, he was granted US Patent number 1,838,389 for the invention. The patent was acquired by IBM. === Visually impaired users === In 1974, Ray Kurzweil started the company Kurzweil Computer Products, Inc. and continued development of omni-font OCR, which could recognize text printed in virtually any font. (Kurzweil is often credited with inventing omni-font OCR, but it was in use by companies, including CompuScan, in the late 1960s and 1970s.) Kurzweil used the technology to create a reading machine for blind people to have a computer read text to them out loud. The device included a CCD-type flatbed scanner and a text-to-speech synthesizer. On January 13, 1976, the finished product was unveiled during a widely reported news conference headed by Kurzweil and the leaders of the National Federation of the Blind. In 1978, Kurzweil Computer Products began selling a commercial version of the optical character recognition computer program. LexisNexis was one of the first customers, and bought the program to upload legal paper and news documents onto its nascent online databases. Two years later, Kurzweil sold his company to Xerox, which eventually spun it off as Scansoft, which merged with Nuance Communications. In the 2000s, OCR was made available online as a service (WebOCR), in a cloud computing environment, and in mobile applications like real-time translation of foreign-language signs on a smartphone. With the advent of smartphones and smartglasses, OCR can be used in internet connected mobile device applications that extract text captured using the device's camera. These devices that do not have built-in OCR functionality will typically use an OCR API to extract the text from the image file captured by the device. The OCR API returns the extracted text, along with information about the location of the detected text in the original image back to the device app for further processing (such as text-to-speech) or display. Various commercial and open source OCR systems are available for most common writing systems, including Latin, Cyrillic, Arabic, Hebrew, Indic, Bengali (Bangla), Devanagari, Tamil, Chinese, Japanese, and Korean characters. == Applications == OCR engines have been developed into software applications specializing in various subjects such as receipts, invoices, checks, and legal billing documents. The software can be used for: Entering data for business documents, e.g. checks, passports, invoices, bank statements and receipts Automatic number-plate recognition Passport recognition and information extraction in airports Automatically extracting key information from insurance documents Traffic-sign recognition Extracting business card information into a contact list Creating textual versions of printed documents, e.g. book scanning for Project Gutenberg Making electronic images of printed documents searchable, e.g. Google Books Converting handwriting in real-time to control a computer (pen computing) Defeating or testing the robustness of CAPTCHA anti-bot systems, though these are specifically designed to prevent OCR. Assistive technology for blind and visually impaired users Writing instructions for vehicles by identifying CAD images in a database that are appropriate to the vehicle design as it changes in real time Making scanned documents searchable by converting them to PDFs == Types == Optical character recognition (OCR) – targets typewritten text, one glyph or character at a time. Optical word recognition – targets typewritten text, one word at a time (for languages that use a space as a word divider). Usually just called "OCR". Intelligent character recognition (ICR) – also targets handwritten printscript or cursive text one glyph or character at a time, usually involving machine learning. Intelligent word recognition (IWR) – also targets handwritten printscript or cursive text, one word at a time. This is especially useful for languages where glyphs are not separated in cursive script. OCR is generally an offline process, which analyses a static document. There are cloud based services which provide an online OCR API service. Handwriting movement analysis can be used as input to handwriting recognition. Instead of merely using the shapes of glyphs and words, this technique is able to capture motion, such as the order in which segments are drawn, the direction, and the pattern of putting the pen down and lifting it. This additional information can make the process more accurate. This technology is also known as "online character recognition", "dynamic character recognition", "real-time character recognition", and "intelligent character recognition". == Techniques == === Pre-processing === OCR software often pre-processes images to improve the chances of successful recognition. Techniques include: De-skewing – if the document was not aligned properly when scanned, it may need to be tilted a few degrees clockwise or counterclockwise in order to make lines of text perfectly horizontal or vertical. Despeckling – removal of positive and negative spots, smoothing edges Binarization – conversion of an image from color or greyscale to black-and-white (called a binary image because there are two colors). The task is performed as a simple way of separating the text (or any other desired image component) from the background. The task of binarization is necessary since most commercial recognition algorithms work only on binary images, as it is simpler to do so. In addition, the effectiveness of binarization influences to a significant extent the quality of character recognition, and careful decisions are made in the choice of the binarization employed for a given input image type; since the quality of the method used to obtain the binary result depends on the type of image (scanned document, scene text image, degraded historical document, etc.). Line removal – Cleaning up non-glyph boxes and lines Layout analysis or zoning – Identification of columns, paragraphs, captions, etc. as distinct blocks. Especially important in multi-column layouts and tables. Line and word detection – Establishment of a baseline for word and character shapes, separating words as necessary. Script recognition – In multilingual documents, the script may change at the level of the words and hence, identification of the script is necessary, before the right OCR can be invoked to handle the specific script. Character isolation or segmentation – For per-character OCR, multiple characters that are connected due to image artifacts must be separated; single characters that are broken into multiple pieces due to artifacts must be connected. Normalization of aspect ratio and scale Segmentation of fixed-pitch fonts is accomplished relatively simply by aligning the image to a uniform grid based on where vertical grid lines will least often intersect black areas. For proportional fonts, more sophisticated techniques are needed because whitespace bet
Graphics processing unit
A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a component on a discrete graphics card or embedded on motherboards, mobile phones, personal computers, workstations, and game consoles. GPUs are increasingly being used for artificial intelligence (AI) processing due to linear algebra acceleration, which is also used extensively in graphics processing. Although there is no single definition of the term, and it may be used to describe any video display system, in modern use a GPU includes the ability to internally perform the calculations needed for various graphics tasks, like rotating and scaling 3D images, and often the additional ability to run custom programs known as shaders. This contrasts with earlier graphics controllers known as video display controllers which had no internal calculation capabilities, or blitters, which performed only basic memory movement operations. The modern GPU emerged during the 1990s, adding the ability to perform operations like drawing lines and text without CPU help, and later adding 3D functionality. Graphics functions are generally independent and this lends these tasks to being implemented on separate calculation engines. Modern GPUs include hundreds, or thousands, of calculation units. This made them useful for non-graphic calculations involving embarrassingly parallel problems due to their parallel structure. The ability of GPUs to rapidly perform vast numbers of calculations has led to their adoption in diverse fields including artificial intelligence (AI) where they excel at handling data-intensive and computationally demanding tasks. Other non-graphical uses include the training of neural networks and cryptocurrency mining. == History == === 1960s === Dedicated 3D graphics hardware dates back to graphic terminals such as the Adage AGT-30 from 1967 with analog matrix processors. In 1969 Evans & Sutherland (E&S) introduced the Line Drawing System-1 (LDS-1), which was the first all-digital system to provide matrix multiplication. Also in 1969, the low-cost graphics terminal IMLAC PDS-1 was introduced. It later saw use as an early 3D gaming machine with the likes of Maze War. === 1970s === In professional hardware, in 1972 PLATO IV system becomes operational at the University of Illinois Urbana-Champaign. Between around 1973 and 1978, several networked multiplayer wireframe 3D games are implemented and popularized by users of the system. Also in 1972, the E&S Continuous Tone 1 (CT1) "Watkins box" system (consisting of an E&S LDS-2 and Shaded Picture System) is delivered to Case Western Reserve University. It offered the first real-time Gouraud shading. In 1975, a joint effort between Evans & Sutherland Computer Corporation and the University of Utah's computer graphics department results in the first ever MOSFET video framebuffer, capable of color and smooth shading. E&S Continuous Tone 3 (CT3) system was delivered in 1977 to Lufthansa for pilot training using computer simulation. It was the first graphics system capable of real-time texture mapping. Ikonas made graphics systems with 8- and 24-bit graphics and 3D acceleration in the late 70s. Arcade system boards have used specialized 2D graphics circuits since the 1970s. In early video game hardware, RAM for frame buffers was expensive, so video chips composited data together as the display was being scanned out on the monitor. A specialized barrel shifter circuit helped the CPU animate the framebuffer graphics for various 1970s arcade video games from Midway and Taito, such as Gun Fight (1975), Sea Wolf (1976), and Space Invaders (1978). The Namco Galaxian arcade system in 1979 used specialized graphics hardware that supported RGB color, multi-colored sprites, and tilemap backgrounds. The Galaxian hardware was widely used during the golden age of arcade video games, by game companies such as Namco, Centuri, Gremlin, Irem, Konami, Midway, Nichibutsu, Sega, and Taito. The Atari 2600 in 1977 used a video shifter called the Television Interface Adaptor. Atari 8-bit computers (1979) had ANTIC, a video processor which interpreted instructions describing a "display list"—the way the scan lines map to specific bitmapped or character modes and where the memory is stored (so there did not need to be a contiguous frame buffer). 6502 machine code subroutines could be triggered on scan lines by setting a bit on a display list instruction. ANTIC also supported smooth vertical and horizontal scrolling independent of the CPU. === 1980s === In the 1980s significant advancements were made in professional 3D graphics hardware. Perhaps most impactful was the 1981 development of the Geometry Engine, a VLSI vector processor ASIC designed by Jim Clark and Marc Hannah at Stanford University. This processor is the forerunner of modern tensor cores and other similar processors marketed for graphics and AI. The Geometry Engine went on to be used in Silicon Graphics workstations for many years. Silicon Graphics's first product, shipped in November 1983, was the IRIS 1000, a terminal with hardware-accelerated 3D graphics based on the Geometry Engine. The Geometry Engine was capable of approximately 6 million operations per second. The 1981 NEC μPD7220 was the first implementation of a personal computer graphics display processor as a single large-scale integration (LSI) integrated circuit chip. This enabled the design of low-cost, high-performance video graphics cards such as those from Number Nine Visual Technology. It became the best-known GPU until the mid-1980s. It was the first fully integrated VLSI (very large-scale integration) metal–oxide–semiconductor (NMOS) graphics display processor for PCs, supported up to 1024×1024 resolution, and laid the foundations for the PC graphics market. It was used in a number of graphics cards and was licensed for clones such as the Intel 82720, the first of Intel's graphics processing units. The Williams Electronics arcade games Robotron: 2084, Joust, Sinistar, and Bubbles, all released in 1982, contain custom blitter chips for operating on 16-color bitmaps. In 1984, Hitachi released the ARTC HD63484, the first major CMOS graphics processor for personal computers. The ARTC could display up to 4K resolution when in monochrome mode. It was used in a number of graphics cards and terminals during the late 1980s. In 1985, the Amiga was released with a custom graphics chip called Agnus including a blitter for bitmap manipulation, line drawing, and area fill. It also included a coprocessor with its own simple instruction set, that was capable of manipulating graphics hardware registers in sync with the video beam (e.g. for per-scanline palette switches, sprite multiplexing, and hardware windowing), or driving the blitter. Also in 1985, IBM released the Professional Graphics Controller, designed by later to be Nvidia co-founder Curtis Priem, which was a rudimentary 3D card with 640 × 480 256-color graphics which used a dedicated CPU to draw graphics independently of the main system. It was used as the basis of cards by a number of makers (including Matrox) and its analog RGB signaling led directly to the VGA video standard. Priem later in the 80s worked on the influential Sun Microsystems GX (also known as cgsix) accelerated 2D graphics card. In 1986, Texas Instruments released the TMS34010, the first fully programmable graphics processor. It could run general-purpose code but also had a graphics-oriented instruction set. During 1990–1992, this chip became the basis of the Texas Instruments Graphics Architecture ("TIGA") Windows accelerator cards. Following in 1987, the IBM 8514 graphics system was released. It was one of the first video cards for IBM PC compatibles that implemented fixed-function 2D primitives in electronic hardware. Sharp's X68000, released in 1987, used a custom graphics chipset with a 65,536 color palette and hardware support for sprites, scrolling, and multiple playfields. It served as a development machine for Capcom's CP System arcade board. Fujitsu's FM Towns computer, released in 1989, had support for a 16,777,216 color palette. For context, IBM also introduced its Video Graphics Array (VGA) display system in 1987, with a maximum resolution of 640 × 480 pixels. Unlike 8514/A, VGA had no hardware acceleration features. In November 1988, NEC Home Electronics announced its creation of the Video Electronics Standards Association (VESA) to develop and promote a Super VGA (SVGA) computer display standard as a successor to VGA. Super VGA enabled graphics display resolutions up to 800 × 600 pixels, a 56% increase. In 1988 SGI sold IRIS workstation graphics with 10-12 Geometry Engines and introduced the IrisVision add-in board for IBM MicroChannel bus (RS/6000) based on the Geometry Engine as well. In 1988 as well, the first dedicated polygonal 3D graphics boards in arcade machines were introduced wit
Ultra (cryptography)
Ultra was the designation adopted by British military intelligence in June 1941 for wartime signals intelligence obtained by breaking high-level encrypted enemy radio and teleprinter communications at the Government Code and Cypher School (GC&CS) at Bletchley Park. Ultra eventually became the standard designation among the western Allies for all such intelligence. The name arose because the intelligence obtained was considered more important than that designated by the highest British security classification then used (Most Secret) and so was regarded as being Ultra Secret. Several other cryptonyms had been used for such intelligence. The code name "Boniface" was used as a cover name for Ultra. In order to ensure that the successful code-breaking did not become apparent to the Germans, British intelligence created a fictional MI6 master spy, Boniface, who controlled a fictional series of agents throughout Germany. Information obtained through code-breaking was often attributed to the human intelligence from the Boniface network. The U.S. used the codename Magic for its decrypts from Japanese sources, including the "Purple" cipher. Much of the German cipher traffic was encrypted on the Enigma machine. Used properly, the German military Enigma would have been virtually unbreakable; in practice, shortcomings in operation allowed it to be broken. The term "Ultra" has often been used almost synonymously with "Enigma decrypts". However, Ultra also encompassed decrypts of the German Lorenz SZ 40/42 machines that were used by the German High Command, and the Hagelin machine. Many observers, at the time and later, regarded Ultra as immensely valuable to the Allies. Winston Churchill was reported to have told King George VI, when presenting to him Stewart Menzies (head of the Secret Intelligence Service and the person who controlled distribution of Ultra decrypts to the government): "It is thanks to the secret weapon of General Menzies, put into use on all the fronts, that we won the war!" F. W. Winterbotham quoted the western Supreme Allied Commander, Dwight D. Eisenhower, at war's end describing Ultra as having been "decisive" to Allied victory. Sir Harry Hinsley, Bletchley Park veteran and official historian of British Intelligence in World War II, made a similar assessment of Ultra, saying that while the Allies would have won the war without it, "the war would have been something like two years longer, perhaps three years longer, possibly four years longer than it was." However, Hinsley and others have emphasized the difficulties of counterfactual history in attempting such conclusions, and some historians, such as John Keegan, have said the shortening might have been as little as the three months it took the United States to deploy the atomic bomb. == Sources of intelligence == Most Ultra intelligence was derived from reading radio messages that had been encrypted with cipher machines, complemented by material from radio communications using traffic analysis and direction finding. In the early phases of the war, particularly during the eight-month Phoney War, the Germans could transmit most of their messages using land lines and so had no need to use radio. This meant that those at Bletchley Park had some time to build up experience of collecting and starting to decrypt messages on the various radio networks. German Enigma messages were the main source, with those of the German air force (the Luftwaffe) predominating, as they used radio more and their operators were particularly ill-disciplined. === German === ==== Enigma ==== "Enigma" refers to a family of electro-mechanical rotor cipher machines. These produced a polyalphabetic substitution cipher and were widely thought to be unbreakable in the 1920s, when a variant of the commercial Model D was first used by the Reichswehr. The German Army (Heer), Navy, Air Force, Nazi party, Gestapo and German diplomats used Enigma machines in several variants. Abwehr (German military intelligence) used a four-rotor machine without a plugboard and Naval Enigma used different key management from that of the army or air force, making its traffic far more difficult to cryptanalyse; each variant required different cryptanalytic treatment. The commercial versions were not as secure and Dilly Knox of GC&CS is said to have broken one before the war. German military Enigma was first broken in December 1932 by Marian Rejewski and the Polish Cipher Bureau, using a combination of brilliant mathematics, the services of a spy in the German office responsible for administering encrypted communications, and good luck. The Poles read Enigma to the outbreak of World War II and beyond, in France. At the turn of 1939, the Germans made the systems ten times more complex, which required a tenfold increase in Polish decryption equipment, which they could not meet. On 25 July 1939, the Polish Cipher Bureau handed reconstructed Enigma machines and their techniques for decrypting ciphers to the French and British. Gordon Welchman wrote, Ultra would never have got off the ground if we had not learned from the Poles, in the nick of time, the details both of the German military Enigma machine, and of the operating procedures that were in use. At Bletchley Park, some of the key people responsible for success against Enigma included mathematicians Alan Turing and Hugh Alexander and, at the British Tabulating Machine Company, chief engineer Harold Keen. After the war, interrogation of German cryptographic personnel led to the conclusion that German cryptanalysts understood that cryptanalytic attacks against Enigma were possible but were thought to require impracticable amounts of effort and investment. The Poles' early start at breaking Enigma and the continuity of their success gave the Allies an advantage when World War II began. ==== Lorenz cipher ==== In June 1941, the Germans started to introduce on-line stream cipher teleprinter systems for strategic point-to-point radio links, to which the British gave the code-name Fish. Several systems were used, principally the Lorenz SZ 40/42 (codenamed "Tunny" by the British) and Geheimfernschreiber ("Sturgeon"). These cipher systems were cryptanalysed, particularly Tunny, which the British thoroughly penetrated. It was eventually attacked using Colossus machines, which were the first digital programme-controlled electronic computers. In many respects the Tunny work was more difficult than for the Enigma, since the British codebreakers had no knowledge of the machine producing it and no head-start such as that the Poles had given them against Enigma. Although the volume of intelligence derived from this system was much smaller than that from Enigma, its importance was often far higher because it produced primarily high-level, strategic intelligence that was sent between Wehrmacht high command (Oberkommando der Wehrmacht, OKW). The eventual bulk decryption of Lorenz-enciphered messages contributed significantly, and perhaps decisively, to the defeat of Nazi Germany. Nevertheless, the Tunny story has become much less well known among the public than the Enigma one. At Bletchley Park, some of the key people responsible for success in the Tunny effort included mathematicians W. T. "Bill" Tutte and Max Newman and electrical engineer Tommy Flowers. === Italian === In June 1940, the Italians were using book codes for most of their military messages, except for the Italian Navy, which in early 1941 had started using a version of the Hagelin rotor-based cipher machine C-38. This was broken from June 1941 onwards by the Italian subsection of GC&CS at Bletchley Park. === Japanese === In the Pacific theatre, a Japanese cipher machine, called "Purple" by the Americans, was used for highest-level Japanese diplomatic traffic. It produced a polyalphabetic substitution cipher, but unlike Enigma, was not a rotor machine, being built around electrical stepping switches. It was broken by the US Army Signal Intelligence Service and disseminated as Magic. Detailed reports by the Japanese ambassador to Germany were encrypted on the Purple machine. His reports included reviews of German assessments of the military situation, reviews of strategy and intentions, reports on direct inspections by the ambassador (in one case, of Normandy beach defences), and reports of long interviews with Hitler. The Japanese are said to have obtained an Enigma machine in 1937, although it is debated whether they were given it by the Germans or bought a commercial version, which, apart from the plugboard and internal wiring, was the German Heer/Luftwaffe machine. Having developed a similar machine, the Japanese did not use the Enigma machine for their most secret communications. The chief fleet communications code system used by the Imperial Japanese Navy was called JN-25 by the Americans, and by early 1942 the US Navy had made considerable progress in decrypting Japanese naval messages. The US Army also made progress on the
Cover-coding
Cover-coding is a technique for obscuring the data that is transmitted over an insecure link, to reduce the risks of snooping. An example of cover-coding would be for the sender to perform a bitwise XOR (exclusive OR) of the original data with a password or random number which is known to both sender and receiver. The resulting cover-coded data is then transmitted from sender to the receiver, who uncovers the original data by performing a further bitwise XOR (exclusive OR) operation on the received data using the same password or random number. ISO 18000-6C (EPC Class 1 Generation 2) RFID tags protect some operations with a cover code. The reader requests a random number from the tag, and the tag responds with a new random number. The reader then encrypts future communications with this number, using bitwise XOR, to the data it sends. Cover coding is secure if the tag signal can't be intercepted and the random number is not re-used. Compared to the loud transmissions from the reader, tag backscatter is much weaker and difficult -- but not impossible -- to intercept.
Cut, copy, and paste
Cut, copy, and paste are essential commands of modern human–computer interaction and user interface design. They offer an interprocess communication technique for transferring data through a computer's user interface. The cut command removes the selected data from its original position, and the copy command creates a duplicate; in both cases the selected data is kept in temporary storage called the clipboard. Clipboard data is later inserted wherever a paste command is issued. The data remains available to any application supporting the feature, thus allowing easy data transfer between applications. The command names are a (skeuomorphic) interface metaphor based on the physical procedure used in manuscript print editing to create a page layout, like with paper. The commands were pioneered into computing by Xerox PARC in 1974, popularized by Apple Computer in the 1983 Lisa workstation and the 1984 Macintosh computer, and in a few home computer applications such as the 1984 word processor Cut & Paste. This interaction technique has close associations with related techniques in graphical user interfaces (GUIs) that use pointing devices such as a computer mouse (by drag and drop, for example). Typically, clipboard support is provided by an operating system as part of its GUI and widget toolkit. The capability to replicate information with ease, changing it between contexts and applications, involves privacy concerns because of the risks of disclosure when handling sensitive information. Terms like cloning, copy forward, carry forward, or re-use refer to the dissemination of such information through documents, and may be subject to regulation by administrative bodies. == History == === Origins === The term "cut and paste" comes from the traditional practice in manuscript editing, whereby people cut paragraphs from a page with scissors and paste them onto another page. This practice remained standard into the 1980s. Stationery stores sold "editing scissors" with blades long enough to cut an 8½"-wide page. The advent of photocopiers made the practice easier and more flexible. The act of copying or transferring text from one part of a computer-based document ("buffer") to a different location within the same or different computer-based document was a part of the earliest on-line computer editors. As soon as computer data entry moved from punch-cards to online files (in the mid/late 1960s) there were "commands" for accomplishing this operation. This mechanism was often used to transfer frequently-used commands or text snippets from additional buffers into the document, as was the case with the QED text editor. === Early methods === The earliest editors (designed for teleprinter terminals) provided keyboard commands to delineate a contiguous region of text, then delete or move it. Since moving a region of text requires first removing it from its initial location and then inserting it into its new location, various schemes had to be invented to allow for this multi-step process to be specified by the user. Often this was done with a "move" command, but some text editors required that the text be first put into some temporary location for later retrieval/placement. In 1983, the Apple Lisa became the first text editing system to call that temporary location "the clipboard". Earlier control schemes such as NLS used a verb—object command structure, where the command name was provided first and the object to be copied or moved was second. The inversion from verb—object to object—verb on which copy and paste are based, where the user selects the object to be operated before initiating the operation, was an innovation crucial for the success of the desktop metaphor as it allowed copy and move operations based on direct manipulation. === Popularization === Inspired by early line and character editors, such as Pentti Kanerva's TV-Edit, that broke a move or copy operation into two steps—between which the user could invoke a preparatory action such as navigation—Lawrence G. "Larry" Tesler proposed the names "cut" and "copy" for the first step and "paste" for the second step. Beginning in 1974, he and colleagues at Xerox PARC implemented several text editors that used cut/copy-and-paste commands to move and copy text. Apple Computer popularized this paradigm with its Lisa (1983) and Macintosh (1984) operating systems and applications. The functions were mapped to key combinations using the ⌘ Command key as a special modifier, which is held down while also pressing X for cut, C for copy, or V for paste. These few keyboard shortcuts allow the user to perform all the basic editing operations, and the keys are clustered at the left end of the bottom row of the standard QWERTY keyboard. These are the standard shortcuts: Control-Z (or ⌘ Command+Z) to undo Control-X (or ⌘ Command+X) to cut Control-C (or ⌘ Command+C) to copy Control-V (or ⌘ Command+V) to paste The IBM Common User Access (CUA) standard also uses combinations of the Insert, Del, Shift and Control keys. Early versions of Windows used the IBM standard. Microsoft later also adopted the Apple key combinations with the introduction of Windows, using the control key as modifier key. Similar patterns of key combinations, later borrowed by others, are widely available in most GUI applications. The original cut, copy, and paste workflow, as implemented at PARC, utilizes a unique workflow: With two windows on the same screen, the user could use the mouse to pick a point at which to make an insertion in one window (or a segment of text to replace). Then, by holding shift and selecting the copy source elsewhere on the same screen, the copy would be made as soon as the shift was released. Similarly, holding shift and control would copy and cut (delete) the source. This workflow requires many fewer keystrokes/mouse clicks than the current multi-step workflows, and did not require an explicit copy buffer. It was dropped, one presumes, because the original Apple and IBM GUIs were not high enough density to permit multiple windows, as were the PARC machines, and so multiple simultaneous windows were rarely used. == Cut and paste == Computer-based editing can involve very frequent use of cut-and-paste operations. Most software-suppliers provide several methods for performing such tasks, and this can involve (for example) key combinations, pulldown menus, pop-up menus, or toolbar buttons. The user selects or "highlights" the text or file for moving by some method, typically by dragging over the text or file name with the pointing-device or holding down the Shift key while using the arrow keys to move the text cursor. The user performs a "cut" operation via key combination Ctrl+x (⌘+x for Macintosh users), menu, or other means. Visibly, "cut" text immediately disappears from its location. "Cut" files typically change color to indicate that they will be moved. Conceptually, the text has now moved to a location often called the clipboard. The clipboard typically remains invisible. On most systems only one clipboard location exists, hence another cut or copy operation overwrites the previously stored information. Many UNIX text-editors provide multiple clipboard entries, as do some Macintosh programs such as Clipboard Master, and Windows clipboard-manager programs such as the one in Microsoft Office. The user selects a location for insertion by some method, typically by clicking at the desired insertion point. A paste operation takes place which visibly inserts the clipboard text at the insertion point. (The paste operation does not typically destroy the clipboard text: it remains available in the clipboard and the user can insert additional copies at other points). Whereas cut-and-paste often takes place with a mouse-equivalent in Windows-like GUI environments, it may also occur entirely from the keyboard, especially in UNIX text editors, such as Pico or vi. Cutting and pasting without a mouse can involve a selection (for which Ctrl+x is pressed in most graphical systems) or the entire current line, but it may also involve text after the cursor until the end of the line and other more sophisticated operations. The clipboard usually stays invisible, because the operations of cutting and pasting, while actually independent, usually take place in quick succession, and the user (usually) needs no assistance in understanding the operation or maintaining mental context. Some application programs provide a means of viewing, or sometimes even editing, the data on the clipboard. == Copy and paste == The term "copy-and-paste" refers to the popular, simple method of reproducing text or other data from a source to a destination. It differs from cut and paste in that the original source text or data does not get deleted or removed. The popularity of this method stems from its simplicity and the ease with which users can move data between various applications visually – without resorting to permanent storage. Use in healthcare do
Inception score
The Inception Score (IS) is an algorithm used to assess the quality of images created by a generative image model such as a generative adversarial network (GAN). The score is calculated based on the output of a separate, pretrained Inception v3 image classification model applied to a sample of (typically around 30,000) images generated by the generative model. The Inception Score is maximized when the following conditions are true: The entropy of the distribution of labels predicted by the Inceptionv3 model for the generated images is minimized. In other words, the classification model confidently predicts a single label for each image. Intuitively, this corresponds to the desideratum of generated images being "sharp" or "distinct". The predictions of the classification model are evenly distributed across all possible labels. This corresponds to the desideratum that the output of the generative model is "diverse". It has been somewhat superseded by the related Fréchet inception distance. While the Inception Score only evaluates the distribution of generated images, the FID compares the distribution of generated images with the distribution of a set of real images ("ground truth"). == Definition == Let there be two spaces, the space of images Ω X {\displaystyle \Omega _{X}} and the space of labels Ω Y {\displaystyle \Omega _{Y}} . The space of labels is finite. Let p g e n {\displaystyle p_{gen}} be a probability distribution over Ω X {\displaystyle \Omega _{X}} that we wish to judge. Let a discriminator be a function of type p d i s : Ω X → M ( Ω Y ) {\displaystyle p_{dis}:\Omega _{X}\to M(\Omega _{Y})} where M ( Ω Y ) {\displaystyle M(\Omega _{Y})} is the set of all probability distributions on Ω Y {\displaystyle \Omega _{Y}} . For any image x {\displaystyle x} , and any label y {\displaystyle y} , let p d i s ( y | x ) {\displaystyle p_{dis}(y|x)} be the probability that image x {\displaystyle x} has label y {\displaystyle y} , according to the discriminator. It is usually implemented as an Inception-v3 network trained on ImageNet. The Inception Score of p g e n {\displaystyle p_{gen}} relative to p d i s {\displaystyle p_{dis}} is I S ( p g e n , p d i s ) := exp ( E x ∼ p g e n [ D K L ( p d i s ( ⋅ | x ) ‖ ∫ p d i s ( ⋅ | x ) p g e n ( x ) d x ) ] ) {\displaystyle IS(p_{gen},p_{dis}):=\exp \left(\mathbb {E} _{x\sim p_{gen}}\left[D_{KL}\left(p_{dis}(\cdot |x)\|\int p_{dis}(\cdot |x)p_{gen}(x)dx\right)\right]\right)} Equivalent rewrites include ln I S ( p g e n , p d i s ) := E x ∼ p g e n [ D K L ( p d i s ( ⋅ | x ) ‖ E x ∼ p g e n [ p d i s ( ⋅ | x ) ] ) ] {\displaystyle \ln IS(p_{gen},p_{dis}):=\mathbb {E} _{x\sim p_{gen}}\left[D_{KL}\left(p_{dis}(\cdot |x)\|\mathbb {E} _{x\sim p_{gen}}[p_{dis}(\cdot |x)]\right)\right]} ln I S ( p g e n , p d i s ) := H [ E x ∼ p g e n [ p d i s ( ⋅ | x ) ] ] − E x ∼ p g e n [ H [ p d i s ( ⋅ | x ) ] ] {\displaystyle \ln IS(p_{gen},p_{dis}):=H[\mathbb {E} _{x\sim p_{gen}}[p_{dis}(\cdot |x)]]-\mathbb {E} _{x\sim p_{gen}}[H[p_{dis}(\cdot |x)]]} ln I S {\displaystyle \ln IS} is nonnegative by Jensen's inequality. Pseudocode:INPUT discriminator p d i s {\displaystyle p_{dis}} . INPUT generator g {\displaystyle g} . Sample images x i {\displaystyle x_{i}} from generator. Compute p d i s ( ⋅ | x i ) {\displaystyle p_{dis}(\cdot |x_{i})} , the probability distribution over labels conditional on image x i {\displaystyle x_{i}} . Sum up the results to obtain p ^ {\displaystyle {\hat {p}}} , an empirical estimate of ∫ p d i s ( ⋅ | x ) p g e n ( x ) d x {\displaystyle \int p_{dis}(\cdot |x)p_{gen}(x)dx} . Sample more images x i {\displaystyle x_{i}} from generator, and for each, compute D K L ( p d i s ( ⋅ | x i ) ‖ p ^ ) {\displaystyle D_{KL}\left(p_{dis}(\cdot |x_{i})\|{\hat {p}}\right)} . Average the results, and take its exponential. RETURN the result. === Interpretation === A higher inception score is interpreted as "better", as it means that p g e n {\displaystyle p_{gen}} is a "sharp and distinct" collection of pictures. ln I S ( p g e n , p d i s ) ∈ [ 0 , ln N ] {\displaystyle \ln IS(p_{gen},p_{dis})\in [0,\ln N]} , where N {\displaystyle N} is the total number of possible labels. ln I S ( p g e n , p d i s ) = 0 {\displaystyle \ln IS(p_{gen},p_{dis})=0} iff for almost all x ∼ p g e n {\displaystyle x\sim p_{gen}} p d i s ( ⋅ | x ) = ∫ p d i s ( ⋅ | x ) p g e n ( x ) d x {\displaystyle p_{dis}(\cdot |x)=\int p_{dis}(\cdot |x)p_{gen}(x)dx} That means p g e n {\displaystyle p_{gen}} is completely "indistinct". That is, for any image x {\displaystyle x} sampled from p g e n {\displaystyle p_{gen}} , discriminator returns exactly the same label predictions p d i s ( ⋅ | x ) {\displaystyle p_{dis}(\cdot |x)} . The highest inception score N {\displaystyle N} is achieved if and only if the two conditions are both true: For almost all x ∼ p g e n {\displaystyle x\sim p_{gen}} , the distribution p d i s ( y | x ) {\displaystyle p_{dis}(y|x)} is concentrated on one label. That is, H y [ p d i s ( y | x ) ] = 0 {\displaystyle H_{y}[p_{dis}(y|x)]=0} . That is, every image sampled from p g e n {\displaystyle p_{gen}} is exactly classified by the discriminator. For every label y {\displaystyle y} , the proportion of generated images labelled as y {\displaystyle y} is exactly E x ∼ p g e n [ p d i s ( y | x ) ] = 1 N {\displaystyle \mathbb {E} _{x\sim p_{gen}}[p_{dis}(y|x)]={\frac {1}{N}}} . That is, the generated images are equally distributed over all labels.
Symmetric Boolean function
In mathematics, a symmetric Boolean function is a Boolean function whose value does not depend on the order of its input bits, i.e., it depends only on the number of ones (or zeros) in the input. For this reason they are also known as Boolean counting functions. There are 2n+1 symmetric n-ary Boolean functions. Instead of the truth table, traditionally used to represent Boolean functions, one may use a more compact representation for an n-variable symmetric Boolean function: the (n + 1)-vector, whose i-th entry (i = 0, ..., n) is the value of the function on an input vector with i ones. Mathematically, the symmetric Boolean functions correspond one-to-one with the functions that map n+1 elements to two elements, f : { 0 , 1 , . . . , n } → { 0 , 1 } {\displaystyle f:\{0,1,...,n\}\rightarrow \{0,1\}} . Symmetric Boolean functions are used to classify Boolean satisfiability problems. == Special cases == A number of special cases are recognized: Majority function: their value is 1 on input vectors with more than n/2 ones Threshold functions: their value is 1 on input vectors with k or more ones for a fixed k All-equal and not-all-equal function: their values is 1 when the inputs do (not) all have the same value Exact-count functions: their value is 1 on input vectors with k ones for a fixed k One-hot or 1-in-n function: their value is 1 on input vectors with exactly one one One-cold function: their value is 1 on input vectors with exactly one zero Congruence functions: their value is 1 on input vectors with the number of ones congruent to k mod m for fixed k, m Parity function: their value is 1 if the input vector has odd number of ones The n-ary versions of AND, OR, XOR, NAND, NOR and XNOR are also symmetric Boolean functions. == Properties == In the following, f k {\displaystyle f_{k}} denotes the value of the function f : { 0 , 1 } n → { 0 , 1 } {\displaystyle f:\{0,1\}^{n}\rightarrow \{0,1\}} when applied to an input vector of weight k {\displaystyle k} . === Weight === The weight of the function can be calculated from its value vector: | f | = ∑ k = 0 n ( n k ) f k {\displaystyle |f|=\sum _{k=0}^{n}{\binom {n}{k}}f_{k}} === Algebraic normal form === The algebraic normal form either contains all monomials of certain order m {\displaystyle m} , or none of them; i.e. the Möbius transform f ^ {\displaystyle {\hat {f}}} of the function is also a symmetric function. It can thus also be described by a simple (n+1) bit vector, the ANF vector f ^ m {\displaystyle {\hat {f}}_{m}} . The ANF and value vectors are related by a Möbius relation: f ^ m = ⨁ k 2 ⊆ m 2 f k {\displaystyle {\hat {f}}_{m}=\bigoplus _{k_{2}\subseteq m_{2}}f_{k}} where k 2 ⊆ m 2 {\displaystyle k_{2}\subseteq m_{2}} denotes all the weights k whose base-2 representation is covered by the base-2 representation of m (a consequence of Lucas’ theorem). Effectively, an n-variable symmetric Boolean function corresponds to a log(n)-variable ordinary Boolean function acting on the base-2 representation of the input weight. For example, for three-variable functions: f ^ 0 = f 0 f ^ 1 = f 0 ⊕ f 1 f ^ 2 = f 0 ⊕ f 2 f ^ 3 = f 0 ⊕ f 1 ⊕ f 2 ⊕ f 3 {\displaystyle {\begin{array}{lcl}{\hat {f}}_{0}&=&f_{0}\\{\hat {f}}_{1}&=&f_{0}\oplus f_{1}\\{\hat {f}}_{2}&=&f_{0}\oplus f_{2}\\{\hat {f}}_{3}&=&f_{0}\oplus f_{1}\oplus f_{2}\oplus f_{3}\end{array}}} So the three variable majority function with value vector (0, 0, 1, 1) has ANF vector (0, 0, 1, 0), i.e.: Maj ( x , y , z ) = x y ⊕ x z ⊕ y z {\displaystyle {\text{Maj}}(x,y,z)=xy\oplus xz\oplus yz} === Unit hypercube polynomial === The coefficients of the real polynomial agreeing with the function on { 0 , 1 } n {\displaystyle \{0,1\}^{n}} are given by: f m ∗ = ∑ k = 0 m ( − 1 ) | k | + | m | ( m k ) f k {\displaystyle f_{m}^{}=\sum _{k=0}^{m}(-1)^{|k|+|m|}{\binom {m}{k}}f_{k}} For example, the three variable majority function polynomial has coefficients (0, 0, 1, -2): Maj ( x , y , z ) = ( x y + x z + y z ) − 2 ( x y z ) {\displaystyle {\text{Maj}}(x,y,z)=(xy+xz+yz)-2(xyz)} == Examples ==