The Great Automatic Grammatizator (published in the U.S. as The Umbrella Man and Other Stories) is a posthumous 1998 collection of thirteen short stories written by British author Roald Dahl. The stories were selected for teenagers from Dahl's adult works. All the stories included were published elsewhere originally; their sources are noted below. The stories, with the exception of the war story "Katina", possess a deadpan, ironic, bizarre, or even macabre sense of humor. They generally end with unexpected plot twists. == Stories == "The Great Automatic Grammatizator" (from Someone Like You): A mechanically-minded man reasons that the rules of grammar are fixed by certain, almost mathematical principles. By exploiting this idea, he is able to create a mammoth machine that can write a prize-winning novel in roughly fifteen minutes. The story ends on a fearful note, as more and more of the world's writers are forced into licensing their names—and all hope of human creativity—to the machine. "Mrs. Bixby and the Colonel's Coat" (from Kiss Kiss): Mrs. Bixby cheats on her dentist husband with a rich, dashing colonel. When their relationship breaks off, the colonel offers Mrs. Bixby a gorgeous and expensive mink coat. In an attempt to explain the coat away, Mrs. Bixby sets up an elaborate trick with the help of a pawn shop—but her husband learns of the ruse and manages to turn the tables. "The Butler" (from More Tales of the Unexpected): An obnoxious and newly wealthy couple employs a butler and chef to impress dinner guests. The butler recommends that the husband buy expensive wines to please his guests, and the man slavishly follows the idea. The butler and the chef reap the rewards of this idea, while making fools of the "fashionable" couple. "Man from the South" (from Someone Like You): At a seaside resort in Jamaica, a strange old man makes a bet with an American man in his late teens. If the young man's cigarette lighter can spark ten times without fail, the American will win a brand-new Cadillac car—but failure means losing the little finger of his right hand. The high-tension wager ensues, and with only a few sparks left, a woman—who knows only too well the cost of the old man's bets—appears and stops the madness. "The Landlady" (from Kiss Kiss): A young man traveling to London on business stops at a bed and breakfast along the way, where a strange and slightly dotty landlady eagerly welcomes him. The eccentric nature of the house, and the news that only two other young men have ever stayed there, confuse and frighten the young man. In the end, the landlady—who indulges in the hobby of taxidermy—and the boy share a drink of tea that tastes of bitter almonds, and the landlady softly smiles at what may be her latest stuffing project. "Parson's Pleasure" (from Kiss Kiss): A man discovers an extremely rare piece of Chippendale furniture at the farm of some boorish ranchers. He desperately attempts to buy the piece cheap, in the hope of selling it at auction to earn a huge profit. He manages to buy the piece "for firewood", only for the ranchers to destroy it in an attempt to make it fit into his car. "The Umbrella Man" (from More Tales of the Unexpected): On a rainy day, a mother and daughter meet a gentlemanly old man on a street corner, who offers them a beautiful silk umbrella in exchange for a pound note. They trade, and the daughter notices that the "feeble" old man suddenly seems much sprier. They follow him, and discover that the gentleman is a con artist who visits various pubs, has a drink, and then steals another umbrella to continue the cycle. "Katina" (from Over to You: Ten Stories of Flyers and Flying): A group of RAF pilots stationed in Greece during World War II discover a hauntingly beautiful young girl, whose "family is beneath the rubble." She becomes their squadron's unofficial "mascot". In the end, her fragile life is taken as she stands defiantly against a rain of bullets from Nazi aircraft, shaking her fists at the heavens. "The Way Up to Heaven" (from Kiss Kiss): Mrs. Foster suffers from a chronic phobia of being late for appointments. Her husband enjoys the cruel sport of purposely delaying their activities, just to rile his wife. On the day when Mrs. Foster is due to fly to Paris to visit her grandchildren, her husband engages in his usual tricks. But as Mrs. Foster rushes from their taxi to the house to find him, she hears a strange noise—and turns triumphantly toward her cab. It is only when she returns, and calls a man to "repair the lift" that was stuck between floors in the house, that readers guess Mr. Foster's fate. "Royal Jelly" (from Kiss Kiss): New parents fear for the life of their little girl, who is sickly and dangerously underweight. The husband, a beekeeper, remembers hearing of the miraculous royal jelly used by bees to transform one particular larva into a queen. He adds the mixture to his daughter's bottles, and she puts on weight at an astonishing rate. The mother senses that something is amiss, and the husband confesses his actions—along with the fact that he himself swallowed buckets of the jelly for months in an attempt to cure his impotence. The royal jelly did the trick—but the strange side-effects include a disturbing metamorphosis for both father and daughter. "Vengeance is Mine Inc." (from More Tales of the Unexpected): Two brothers who are short of cash bemoan their fate over breakfast while reading the society column of a newspaper. They hit upon a scheme to take revenge on cruel tabloid writers in exchange for money from wealthy patrons. The unconventional plan works, and the brothers line their pockets with the spoils of their plans. "Taste" (from Someone Like You): A rich man with a beautiful young daughter hosts a dinner party, inviting a famous connoisseur of fine wines. When the rich man boasts that he has a wine that the expert cannot identify, the stakes become frighteningly high: if he can guess the name and vintage of the wine, he will win his daughter's hand. After an elaborate show, the expert guesses correctly; however, the family's maid appears and inadvertently exposes the guest as a cheat, thus saving the girl. "Neck" (from Someone Like You): A newspaper heir finds himself suddenly engaged to the voluptuous and controlling Lady Tutton. He loses all control of his life, and only his trusted butler and friends realize how broken he is by her control. A weekend trip to their estate, however, proves the perfect opportunity for Lord Tutton to engage in revenge against his wicked wife: her head is trapped in a valuable piece of wooden sculpture, and he must decide whether to use a saw or an axe to cut her free. == Publication details == Dahl, Roald (19 January 2004). The Umbrella Man and Other Stories. Speak. ISBN 9780142400876. == Reception == Groff Conklin in 1954 called the short story "The Great Automatic Grammatizator" "an awe-inspiring fantasy-satire ... an unforgettable bit of biting nonsense".
Commitment ordering
Commitment ordering (CO) is a class of interoperable serializability techniques in concurrency control of databases, transaction processing, and related applications. It allows optimistic (non-blocking) implementations. With the proliferation of multi-core processors, CO has also been increasingly utilized in concurrent programming, transactional memory, and software transactional memory (STM) to achieve serializability optimistically. CO is also the name of the resulting transaction schedule (history) property, defined in 1988 with the name dynamic atomicity. In a CO compliant schedule, the chronological order of commitment events of transactions is compatible with the precedence order of the respective transactions. CO is a broad special case of conflict serializability and effective means (reliable, high-performance, distributed, and scalable) to achieve global serializability (modular serializability) across any collection of database systems that possibly use different concurrency control mechanisms (CO also makes each system serializability compliant, if not already). Each not-CO-compliant database system is augmented with a CO component (the commitment order coordinator—COCO) which orders the commitment events for CO compliance, with neither data-access nor any other transaction operation interference. As such, CO provides a low overhead, general solution for global serializability (and distributed serializability), instrumental for global concurrency control (and distributed concurrency control) of multi-database systems and other transactional objects, possibly highly distributed (e.g., within cloud computing, grid computing, and networks of smartphones). An atomic commitment protocol (ACP; of any type) is a fundamental part of the solution, utilized to break global cycles in the conflict (precedence, serializability) graph. CO is the most general property (a necessary condition) that guarantees global serializability, if the database systems involved do not share concurrency control information beyond atomic commitment protocol (unmodified) messages and have no knowledge of whether transactions are global or local (the database systems are autonomous). Thus CO (with its variants) is the only general technique that does not require the typically costly distribution of local concurrency control information (e.g., local precedence relations, locks, timestamps, or tickets). It generalizes the popular strong strict two-phase locking (SS2PL) property, which in conjunction with the two-phase commit protocol (2PC), is the de facto standard to achieve global serializability across (SS2PL based) database systems. As a result, CO compliant database systems (with any different concurrency control types) can transparently join such SS2PL based solutions for global serializability. In addition, locking based global deadlocks are resolved automatically in a CO based multi-database environment, a vital side-benefit (including the special case of a completely SS2PL based environment; a previously unnoticed fact for SS2PL). Furthermore, strict commitment ordering (SCO; Raz 1991c), the intersection of Strictness and CO, provides better performance (shorter average transaction completion time and resulting in better transaction throughput) than SS2PL whenever read-write conflicts are present (identical blocking behavior for write-read and write-write conflicts; comparable locking overhead). The advantage of SCO is especially during lock contention. Strictness allows both SS2PL and SCO to use the same effective database recovery mechanisms. Two major generalizing variants of CO exist, extended CO (ECO; Raz 1993a) and multi-version CO (MVCO; Raz 1993b). They also provide global serializability without local concurrency control information distribution, can be combined with any relevant concurrency control, and allow optimistic (non-blocking) implementations. Both use additional information for relaxing CO constraints and achieving better concurrency and performance. Vote ordering (VO or Generalized CO (GCO); Raz 2009) is a container schedule set (property) and technique for CO and all its variants. Local VO is necessary for guaranteeing global serializability if the atomic commitment protocol (ACP) participants do not share concurrency control information (have the generalized autonomy property). CO and its variants inter-operate transparently, guaranteeing global serializability and automatic global deadlock resolution together in a mixed, heterogeneous environment with different variants. == Overview == The Commitment ordering (CO; Raz 1990, 1992, 1994, 2009) schedule property has been referred to also as Dynamic atomicity (since 1988), commit ordering, commit order serializability, and strong recoverability (since 1991). The latter is a misleading name since CO is incomparable with recoverability, and the term "strong" implies a special case. This means that a substantial recoverability property does not necessarily have the CO property and vice versa. In 2009 CO has been characterized as a major concurrency control method, together with the previously known (since the 1980s) three major methods: Locking, Time-stamp ordering, and Serialization graph testing, and as an enabler for the interoperability of systems using different concurrency control mechanisms. In a federated database system or any other more loosely defined multidatabase system, which are typically distributed in a communication network, transactions span multiple and possibly Distributed databases. Enforcing global serializability in such system is problematic. Even if every local schedule of a single database is still serializable, the global schedule of a whole system is not necessarily serializable. The massive communication exchanges of conflict information needed between databases to reach conflict serializability would lead to unacceptable performance, primarily due to computer and communication latency. The problem of achieving global serializability effectively had been characterized as open until the public disclosure of CO in 1991 by its inventor Yoav Raz (Raz 1991a; see also Global serializability). Enforcing CO is an effective way to enforce conflict serializability globally in a distributed system since enforcing CO locally in each database (or other transactional objects) also enforces it globally. Each database may use any, possibly different, type of concurrency control mechanism. With a local mechanism that already provides conflict serializability, enforcing CO locally does not cause any other aborts, since enforcing CO locally does not affect the data access scheduling strategy of the mechanism (this scheduling determines the serializability related aborts; such a mechanism typically does not consider the commitment events or their order). The CO solution requires no communication overhead since it uses (unmodified) atomic commitment protocol messages only, already needed by each distributed transaction to reach atomicity. An atomic commitment protocol plays a central role in the distributed CO algorithm, which enforces CO globally by breaking global cycles (cycles that span two or more databases) in the global conflict graph. CO, its special cases, and its generalizations are interoperable and achieve global serializability while transparently being utilized together in a single heterogeneous distributed environment comprising objects with possibly different concurrency control mechanisms. As such, Commitment ordering, including its special cases, and together with its generalizations (see CO variants below), provides a general, high performance, fully distributed solution (no central processing component or central data structure are needed) for guaranteeing global serializability in heterogeneous environments of multidatabase systems and other multiple transactional objects (objects with states accessed and modified only by transactions; e.g., in the framework of transactional processes, and within Cloud computing and Grid computing). The CO solution scales up with network size and the number of databases without any negative impact on performance (assuming the statistics of a single distributed transaction, e.g., the average number of databases involved with a single transaction, are unchanged). With the proliferation of Multi-core processors, Optimistic CO (OCO) has also been increasingly utilized to achieve serializability in software transactional memory, and numerous STM articles and patents utilizing "commit order" have already been published (e.g., Zhang et al. 2006). == The commitment ordering solution for global serializability == === General characterization of CO === Commitment ordering (CO) is a special case of conflict serializability. CO can be enforced with non-blocking mechanisms (each transaction can complete its task without having its data-access blocked, which allows optimistic concurrency control; however, commitment could be blo
Glossary of machine vision
The following are common definitions related to the machine vision field. General related fields Machine vision Computer vision Image processing Signal processing == 0-9 == 1394. FireWire is Apple Inc.'s brand name for the IEEE 1394 interface. It is also known as i.Link (Sony's name) or IEEE 1394 (although the 1394 standard also defines a backplane interface). It is a personal computer (and digital audio/digital video) serial bus interface standard, offering high-speed communications and isochronous real-time data services. 1D. One-dimensional. 2D computer graphics. The computer-based generation of digital images—mostly from two-dimensional models (such as 2D geometric models, text, and digital images) and by techniques specific to them. 3D computer graphics. 3D computer graphics are different from 2D computer graphics in that a three-dimensional representation of geometric data is stored in the computer for the purposes of performing calculations and rendering 2D images. Such images may be for later display or for real-time viewing. Despite these differences, 3D computer graphics rely on many of the same algorithms as 2D computer vector graphics in the wire frame model and 2D computer raster graphics in the final rendered display. In computer graphics software, the distinction between 2D and 3D is occasionally blurred; 2D applications may use 3D techniques to achieve effects such as lighting, and primarily 3D may use 2D rendering techniques. 3D scanner. This is a device that analyzes a real-world object or environment to collect data on its shape and possibly color. The collected data can then be used to construct digital, three dimensional models useful for a wide variety of applications. == A == Aberration. Optically, defocus refers to a translation along the optical axis away from the plane or surface of best focus. In general, defocus reduces the sharpness and contrast of the image. What should be sharp, high-contrast edges in a scene become gradual transitions. Algebraic distance or algebraic error. The algebraic distance from a point xi to a curve or surface defined by f ( x , β ) = 0 {\displaystyle f(x,\beta )=0} is the value of f ( x i , β ) {\displaystyle f(x_{i},\beta )} , i.e. the residual in the least squares problem with data point (xi, 0) and model function f. This term is mainly used in computer vision.[1][2] Aperture. In context of photography or machine vision, aperture refers to the diameter of the aperture stop of a photographic lens. The aperture stop can be adjusted to control the amount of light reaching the film or image sensor. aspect ratio (image). The aspect ratio of an image is its displayed width divided by its height (usually expressed as "x:y"). Angular resolution. Describes the resolving power of any image forming device such as an optical or radio telescope, a microscope, a camera, or an eye. Automated optical inspection. == B == Barcode. A barcode (also bar code) is a machine-readable representation of information in a visual format on a surface. Blob discovery. Inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks. These blobs frequently represent optical targets for machining, robotic capture, or manufacturing failure. Bitmap. A raster graphics image, digital image, or bitmap, is a data file or structure representing a generally rectangular grid of pixels, or points of color, on a computer monitor, paper, or other display device. == C == Camera. A camera is a device used to take pictures, either singly or in sequence. A camera that takes pictures singly is sometimes called a photo camera to distinguish it from a video camera. Camera Link. Camera Link is a serial communication protocol designed for computer vision applications based on the National Semiconductor interface Channel-link. It was designed for the purpose of standardizing scientific and industrial video products including cameras, cables and frame grabbers. The standard is maintained and administered by the Automated Imaging Association, or AIA, the global machine vision industry's trade group. Charge-coupled device. A charge-coupled device (CCD) is a sensor for recording images, consisting of an integrated circuit containing an array of linked, or coupled, capacitors. CCD sensors and cameras tend to be more sensitive, less noisy, and more expensive than CMOS sensors and cameras. CIE 1931 Color Space. In the study of the perception of color, one of the first mathematically defined color spaces was the CIE XYZ color space (also known as CIE 1931 color space), created by the International Commission on Illumination (CIE) in 1931. CMOS. CMOS ("see-moss")stands for complementary metal-oxide semiconductor, is a major class of integrated circuits. CMOS imaging sensors for machine vision are cheaper than CCD sensors but more noisy. CoaXPress. CoaXPress (CXP) is an asymmetric high speed serial communication standard over coaxial cable. CoaXPress combines high speed image data, low speed camera control and power over a single coaxial cable. The standard is maintained by JIIA, the Japan Industrial Imaging Association. Color. The perception of the frequency (or wavelength) of light, and can be compared to how pitch (or a musical note) is the perception of the frequency or wavelength of sound. Color blindness. Also known as color vision deficiency, in humans is the inability to perceive differences between some or all colors that other people can distinguish Color temperature. "White light" is commonly described by its color temperature. A traditional incandescent light source's color temperature is determined by comparing its hue with a theoretical, heated black-body radiator. The lamp's color temperature is the temperature in kelvins at which the heated black-body radiator matches the hue of the lamp. Color vision. CV is the capacity of an organism or machine to distinguish objects based on the wavelengths (or frequencies) of the light they reflect or emit. computer vision. The study and application of methods which allow computers to "understand" image content. Contrast. In visual perception, contrast is the difference in visual properties that makes an object (or its representation in an image) distinguishable from other objects and the background. C-Mount. Standardized adapter for optical lenses on CCD - cameras. C-Mount lenses have a back focal distance 17.5 mm vs. 12.5 mm for "CS-mount" lenses. A C-Mount lens can be used on a CS-Mount camera through the use of a 5 mm extension adapter. C-mount is a 1" diameter, 32 threads per inch mounting thread (1"-32UN-2A.) CS-Mount. Same as C-Mount but the focal point is 5 mm shorter. A CS-Mount lens will not work on a C-Mount camera. CS-mount is a 1" diameter, 32 threads per inch mounting thread. == D == Data matrix. A two dimensional Barcode. Depth of field. In optics, particularly photography and machine vision, the depth of field (DOF) is the distance in front of and behind the subject which appears to be in focus. Depth perception. DP is the visual ability to perceive the world in three dimensions. It is a trait common to many higher animals. Depth perception allows the beholder to accurately gauge the distance to an object. Diaphragm. In optics, a diaphragm is a thin opaque structure with an opening (aperture) at its centre. The role of the diaphragm is to stop the passage of light, except for the light passing through the aperture. == E == Edge detection. ED marks the points in a digital image at which the luminous intensity changes sharply. It also marks the points of luminous intensity changes of an object or spatial-taxon silhouette. Electromagnetic interference. Radio Frequency Interference (RFI) is electromagnetic radiation which is emitted by electrical circuits carrying rapidly changing signals, as a by-product of their normal operation, and which causes unwanted signals (interference or noise) to be induced in other circuits. == F == FireWire. FireWire (also known as i. Link or IEEE 1394) is a personal computer (and digital audio/video) serial bus interface standard, offering high-speed communications. It is often used as an interface for industrial cameras. Fixed-pattern noise. Flat-field correction. Frame grabber. An electronic device that captures individual, digital still frames from an analog video signal or a digital video stream. Fringe Projection Technique. 3D data acquisition technique employing projector displaying fringe pattern on a surface of measured piece, and one or more cameras recording image(s). Field of view. The field of view (FOV) is the part which can be seen by the machine vision system at one moment. The field of view depends from the lens of the system and from the working distance between object and camera. Focus. An image, or image point or region, is said to be in focus if light from object points is converged about as well as possible in the image; conversely, it is out of focus if light is not w
Adversarial stylometry
Adversarial stylometry is the practice of altering writing style to reduce the potential for stylometry to discover the author's identity or their characteristics. This task is also known as authorship obfuscation or authorship anonymisation. Stylometry poses a significant privacy challenge in its ability to unmask anonymous authors or to link pseudonyms to an author's other identities, which, for example, creates difficulties for whistleblowers, activists, and hoaxers and fraudsters. The privacy risk is expected to grow as machine learning techniques and text corpora develop. All adversarial stylometry shares the core idea of faithfully paraphrasing the source text so that the meaning is unchanged but the stylistic signals are obscured. Such a faithful paraphrase is an adversarial example for a stylometric classifier. Several broad approaches to this exist, with some overlap: imitation, substituting the author's own style for another's; translation, applying machine translation with the hope that this eliminates characteristic style in the source text; and obfuscation, deliberately modifying a text's style to make it not resemble the author's own. Manually obscuring style is possible, but laborious; in some circumstances, it is preferable or necessary. Automated tooling, either semi- or fully-automatic, could assist an author. How best to perform the task and the design of such tools is an open research question. While some approaches have been shown to be able to defeat particular stylometric analyses, particularly those that do not account for the potential of adversariality, establishing safety in the face of unknown analyses is an issue. Ensuring the faithfulness of the paraphrase is a critical challenge for automated tools. It is uncertain if the practice of adversarial stylometry is detectable in itself. Some studies have found that particular methods produced signals in the output text, but a stylometrist who is uncertain of what methods may have been used may not be able to reliably detect them. == History == Rao & Rohatgi (2000), an early work in adversarial stylometry, identified machine translation as a possibility, but noted that the quality of translators available at the time presented severe challenges. Kacmarcik & Gamon (2006) is another early work. Brennan, Afroz & Greenstadt (2012) performed the first evaluation of adversarial stylometric methods on actual texts. Brennan & Greenstadt (2009) introduced the first corpus of adversarially authored texts specifically for evaluating stylometric methods; other corpora include the International Imitation Hemingway Competition, the Faux Faulkner contest, and the hoax blog A Gay Girl in Damascus. == Motivations == Rao & Rohatgi (2000) suggest that short, unattributed documents (i.e., anonymous posts) are not at risk of stylometric identification, but pseudonymous authors who have not practiced adversarial stylometry in producing corpuses of thousands of words may be vulnerable. Narayanan et al. (2012) attempted large-scale deanonymisation of 100,000 blog authors with mixed results: the identifications were significantly better than chance, but only accurately matched the blog and author a fifth of the time; identification improved with the number of posts written by the author in the corpus. Even if an author is not identified, some of their characteristics may still be deduced stylometrically, or stylometry may narrow the anonymity set of potential authors sufficiently for other information to complete the identification. Detecting author characteristics (e.g., gender or age) is often simpler than identifying an author from a large, possibly open, set of candidates. Modern machine learning techniques offer powerful tools for identification; further development of corpora and computational stylometric techniques are likely to raise further privacy issues. Gröndahl & Asokan (2020a) say that the general validity of the hypothesis underlying stylometry—that authors have invariant, content-independent 'style fingerprints'—is uncertain, but "the deanonymisation attack is a real privacy concern". Those interested in practicing adversarial stylometry and stylistic deception include whistleblowers avoiding retribution; journalists and activists; perpetrators of frauds and hoaxes; authors of fake reviews; literary forgers; criminals disguising their identity from investigators; and, generally, anyone with a desire for anonymity or pseudonymity. Authors, or agents acting on behalf of authors, may also attempt to remove stylistic clues to author characteristics (e.g., race or gender) so that knowledge of those characteristics cannot be used for discrimination (e.g., through algorithmic bias). Another possible use for adversarial stylometry is in disguising automatically generated text as human-authored. == Methods == With imitation, the author attempts to mislead stylometry by matching their style to another author's. An incomplete imitation, where some of the true author's unique characteristics appear alongside the imitated author's, can be a detectable signal for the use of adversarial stylometry. Imitation can be performed automatically with style transfer systems, though this typically requires a large corpus in the target style for the system to learn from. Another approach is translation, which employs machine translation of a source text to eliminate characteristic style, often through multiple translators in sequence to produce a round-trip translation. Such chained translation can lead to texts being significantly altered, even to the point of incomprehensibility; improved translation tools reduce this risk. More simply-structured texts can be easier to machine translate without losing the original meaning. Machine translation blurs into direct stylistic imitation or obfuscation achieved through automated style transfer, which can be viewed as a "translation" with the same language as input and output. With low-quality translation tools, an author can be required to manually correct major translation errors while avoiding the hazard of re-introducing stylistic characteristics. Wang, Juola & Riddell (2022) found that gross errors introduced by Google Translate were rare, but more common with several intermediate translations—however, occasional simple or short sentences and misspellings in the source text appeared verbatim in the output, potentially providing an identifying signal. Chain translation can leave characteristic traces of its application in a document, which may allow reconstruction of the intermediate languages used and the number of translation steps performed. Obfuscation involves deliberately changing the style of a text to reduce its similarity to other texts by some metric; this may be performed at the time of writing by conscious modification, or as part of a revision process with feedback from the metric being targeted as an input to decide when the text has been sufficiently obfuscated. In contrast to translation, complex texts can offer more opportunities for effective obfuscation without altering meaning, and likewise genres with more permissible variation allow more obfuscation. However, longer texts are harder to thoroughly obfuscate. Obfuscation can blend into imitation if the author develops a novel target style, distinct from their original style. With respect to masking author characteristics, obfuscation may aim to achieve a union (adding signals for imitated characteristics) or an intersection (removing signals and normalising) of other authors' styles. Avoiding the author's own idiosyncrasies and producing a "normalised" text is a critical obfuscatory step: an author may have a unique tendency to misspell certain words, use particular variants, or to format a document in a characteristic way. Stylometric signals vary in how simply they can be adversarially masked; an author may easily change their vocabulary by conscious choice, but altering the pattern of grammar or the letter frequency in their text may be harder to achieve, though Juola & Vescovi (2011) report that imitation typically succeeds at masking more characteristics than obfuscation. Automated obfuscation may require large amounts of training data written by the author. Concerning automated implementations of adversarial stylometry, two possible implementations are rule-based systems for paraphrasing; and encoder–decoder architectures, where the text passes through an intermediate format that is (intended to be) style-neutral. Another division in automated methods is whether there is feedback from an identification system or not. With such feedback, finding paraphrases for author masking has been characterised as a heuristic search problem, exploring textual variants until the result is stylistically sufficiently far (in the case of obfuscation) or near (in the case of imitation), which then constitutes an adversarial example for that identification system. == Evaluation == How
OrCam device
OrCam devices such as OrCam MyEye are portable, artificial vision devices that allow visually impaired people to understand text and identify objects through audio feedback, describing what they are unable to see. Reuters described an important part of how it works as "a wireless smartcamera" which, when attached outside eyeglass frames, can read and verbalize text, and also supermarket barcodes. This information is converted to spoken words and entered "into the user’s ear." Face-recognition is also part of OrCam's feature set. == Devices == OrCam Technologies Ltd has created three devices; OrCam MyEye 2.0, OrCam MyEye 1, and OrCam MyReader. OrCam My Eye 2.0: OrCam debuted the second-generation model, the OrCam MyEye 2.0 in December 2017. About the size of a finger, the MyEye 2.0 is battery-powered, and has been compressed into a self-contained device. The device snaps onto any eyeglass frame magnetically. Orcam 2.0 is small and light (22.5 grams/0.8 ounces) with functionality to restore independence to the visually impaired. It comes in two versions. The basic model can read text, and a more advanced one adds features such as face recognition and barcode reading. As of July 2023, the retail cost is between $4000 and $6000 (USD). == Clinical Studies == JAMA Ophthalmology: In 2016 JAMA Ophthalmology conducted a study involving 12 legally blind participants to evaluate the usefulness of a portable artificial vision device (OrCam) for patients with low vision. The results showed that the OrCam device improved the patient's ability to perform tasks simulating those of daily living, such as reading a message on an electronic device, a newspaper article or a menu. Wills Eye: Wills Eye was a clinical study designed to measure the impact of the OrCam device on the quality of life of patients with End-stage Glaucoma. The conclusion was that OrCam, a novel artificial vision device using a mini-camera mounted on eyeglasses, allowed legally blind patients with end-stage glaucoma to read independently, subsequently improving their quality of life. == Employee testing == The New York Times described how a pre-release OrCam device was used by a Coloboma-impaired employee of the device's developer in 2013 for grocery shopping. It was the small size of the prototype rather than the functionality that gave her added mobility in an Israeli store's aisles. Added life-enhancement was described: "to both recognize and speak .. bus numbers .. traffic lights." == Social aspects == In contrast to an early version of Google Glass, which "failed ... because .. Glass wearers were ..mocked", early OrCam devices used designs that "clip unobtrusively on your shirt or perhaps your belt." In addition, it does not record sounds or images, what was called "the privacy puzzle that stumped Google. One 2018 technology reviewer wrote that he wished it had a headphone jack "so it would be less disruptive in places where others are working." An attempt was made to use bone conduction. == USA introduction == In 2018 a team headed by New York Assemblyman Dov Hikind introduced use of OrCam devices to ten individuals screened for what he termed "new Israeli technology that really makes a difference to the blind." Although not the first USA success, it was more focused than a publicly funded project that was authorized in 2016 by a California government agency. Also in 2016 the Chicago Lighthouse for the Blind demonstrated its use. == Technology == In the area of hardware, miniaturization has been quite important, but one major area, software, was mentioned by Assemblyman Hikind, and reported by The Times of Israel is the "AI-driven algorithms" that "reports .. how many people are in a room. In addition to reading printed text, it can also aid in "seeing" what is on a television or computer screen. Although OrCam can't help with handwritten information, it can reuse information, the basis of recognizing "US currency, and even faces." === Features === While early language support was for English, French, German, Hebrew and Spanish, others now available include Danish, Dutch, Finnish, Italian, Norwegian, Portuguese and Swedish. == History == OrCam Technologies Ltd was founded in 2010 by Professor Amnon Shashua and Ziv Aviram. Before co-founding OrCam, the two in 1999 co-founded Mobileye, an Israeli company that develops vision-based advanced driver-assistance systems (ADAS) providing warnings for collision prevention and mitigation, which was acquired by Intel for $15.3 billion in 2017. OrCam launched OrCam MyEye in 2013 after years of development and testing, and began selling it commercially in 2015. In its early years, the company raised $22 million, $6 million of which came from Intel Capital. By 2014, Intel, which was also investing in Google Glass, had invested $15 million in Orcam. In March 2017, OrCam had raised $41 million in capital, making it worth $600 million. === Marketing === One outcome of initial marketing in the USA was that they "reached a deal with the California Department of Rehabilitation, ...qualifying blind and visually impaired state residents." == OrCam Technologies Ltd == OrCam Technologies Ltd. is the Israeli-based company producing these OrCam devices, which are wearable artificial intelligence space. The company develops and manufactures assistive technology devices for individuals who are visually impaired, partially sighted, blind, print disabilities, or have other disabilities. OrCam headquarters is located in Jerusalem, operating under the company name OrCam Technologies Ltd. OrCam has over 150 employees, is headquartered in Jerusalem, and has offices in New York, Toronto, and London. == Awards == 2018 Last Gadget Standing Winner 2018 CES Innovation Awards Honoree in Accessible Tech 2017 NAIDEX Innovation Award 2016 Louise Braille Corporate Recognition Award 2016 Silmo-d-Or Award
Smoothing
In statistics and image processing, to smooth a data set is to create an approximating function that attempts to capture important patterns in the data, while leaving out noise or other fine-scale structures/rapid phenomena. In smoothing, the data points of a signal are modified so individual points higher than the adjacent points (presumably because of noise) are reduced, and points that are lower than the adjacent points are increased, leading to a smoother signal. Reducing noise by smoothing may aid in data analysis in two notable ways: Help uncover more meaningful information from the underlying data, such as trends. Provide analyses that are both flexible and robust. Many different algorithms are used in smoothing, most commonly binning, kernels, and local weighted regression. == Compared to curve fitting == Smoothing may be distinguished from the related and partially overlapping concept of curve fitting in the following ways: curve fitting often involves the use of an explicit function form for the result, whereas the immediate results from smoothing are the "smoothed" values with no later use made of a functional form if there is one; the aim of smoothing is to give a general idea of relatively slow changes of value with little attention paid to the close matching of data values, while curve fitting concentrates on achieving as close a match as possible. smoothing methods often have an associated tuning parameter which is used to control the extent of smoothing. Curve fitting will adjust any number of parameters of the function to obtain the 'best' fit. == Linear smoothers == In the case that the smoothed values can be written as a linear transformation of the observed values, the smoothing operation is known as a linear smoother; the matrix representing the transformation is known as a smoother matrix or hat matrix. The operation of applying such a matrix transformation is called convolution. Thus the matrix is also called convolution matrix or a convolution kernel. In the case of simple series of data points (rather than a multi-dimensional image), the convolution kernel is a one-dimensional vector. == Algorithms == One of the most common algorithms is the "moving average", often used to try to capture important trends in repeated statistical surveys. In image processing and computer vision, smoothing ideas are used in scale space representations. The simplest smoothing algorithm is the "rectangular" or "unweighted sliding-average smooth". This method replaces each point in the signal with the average of "m" adjacent points, where "m" is a positive integer called the "smooth width". Usually m is an odd number. The triangular smooth is like the rectangular smooth except that it implements a weighted smoothing function. Some specific smoothing and filter types, with their respective uses, pros and cons are:
Keyword extraction
Keyword extraction is tasked with the automatic identification of terms that best describe the subject of a document. Key phrases, key terms, key segments or just keywords are the terminology which is used for defining the terms that represent the most relevant information contained in the document. Although the terminology is different, function is the same: characterization of the topic discussed in a document. The task of keyword extraction is an important problem in text mining, information extraction, information retrieval and natural language processing (NLP). == Keyword assignment vs. extraction == Keyword assignment methods can be roughly divided into: keyword assignment (keywords are chosen from controlled vocabulary or taxonomy) and keyword extraction (keywords are chosen from words that are explicitly mentioned in original text). Methods for automatic keyword extraction can be supervised, semi-supervised, or unsupervised. Unsupervised methods can be further divided into simple statistics, linguistics or graph-based, or ensemble methods that combine some or most of these methods.