AI Detector Zero Chatgpt

AI Detector Zero Chatgpt — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Lexical Markup Framework


    Language resource management – Lexical markup framework (LMF; ISO 24613), produced by ISO/TC 37, is the ISO standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. The scope is standardization of principles and methods relating to language resources in the contexts of multilingual communication. == Objectives == The goals of LMF are to provide a common model for the creation and use of lexical resources, to manage the exchange of data between and among these resources, and to enable the merging of a large number of individual electronic resources to form extensive global electronic resources. Types of individual instantiations of LMF can include monolingual, bilingual or multilingual lexical resources. The same specifications are to be used for both small and large lexicons, for both simple and complex lexicons, and for both written and spoken lexical representations. The descriptions range from morphology, syntax and computational semantics to computer-assisted translation. The covered languages are not restricted to European languages but cover all natural languages. The range of targeted NLP applications is not restricted. LMF is able to represent most lexicons, including WordNet, EDR and PAROLE lexicons. == History == In the past, lexicon standardization has been studied and developed by a series of projects like GENELEX, EDR, EAGLES, MULTEXT, PAROLE, SIMPLE and ISLE. Then, the ISO/TC 37 national delegations decided to address standards dedicated to NLP and lexicon representation. Work on LMF started in summer 2003 with a new work item proposal issued by the US delegation. In fall 2003, the French delegation issued a technical proposal for a data model dedicated to NLP lexicons. In early 2004, the ISO/TC 37 committee decided to form a common ISO project with Nicoletta Calzolari (CNR-ILC Italy) as convenor and Gil Francopoulo (Tagmatica France) and Monte George (ANSI, United States) as editors.
The first step in developing LMF was to design an overall framework based on the general features of existing lexicons and to develop a consistent terminology to describe the components of those lexicons. The next step was the actual design of a comprehensive model that best represented all of the lexicons in detail. A large panel of 60 experts contributed a wide range of requirements for LMF that covered many types of NLP lexicons. The editors of LMF worked closely with the panel of experts to identify the best solutions and reach a consensus on the design of LMF. Special attention was paid to morphology in order to provide powerful mechanisms for handling problems in several languages that were known to be difficult to handle. Thirteen versions were written, dispatched to the nationally nominated experts, commented on and discussed during various ISO technical meetings. After five years of work, including numerous face-to-face meetings and e-mail exchanges, the editors arrived at a coherent UML model. In conclusion, LMF should be considered a synthesis of the state of the art in the NLP lexicon field. == Current stage == The ISO number is 24613. The LMF specification was published officially as an International Standard on 17 November 2008. == As one of the members of the ISO/TC 37 family of standards == The ISO/TC 37 standards are currently elaborated as high level specifications and deal with word segmentation (ISO 24614), annotations (ISO 24611 a.k.a. MAF, ISO 24612 a.k.a. LAF, ISO 24615 a.k.a. SynAF, and ISO 24617-1 a.k.a. SemAF/Time), feature structures (ISO 24610), multimedia containers (ISO 24616 a.k.a. MLIF), and lexicons (ISO 24613). These standards are based on low level specifications dedicated to constants, namely data categories (revision of ISO 12620), language codes (ISO 639), script codes (ISO 15924), country codes (ISO 3166) and Unicode (ISO 10646).
The two-level organization forms a coherent family of standards with the following common and simple rules: the high level specification provides structural elements that are adorned by the standardized constants; the low level specifications provide standardized constants as metadata. == Key standards == Linguistic constants like /feminine/ or /transitive/ are not defined within LMF but are recorded in the Data Category Registry (DCR) that is maintained as a global resource by ISO/TC 37 in compliance with ISO/IEC 11179-3:2003. These constants are used to adorn the high level structural elements. The LMF specification complies with the modeling principles of Unified Modeling Language (UML) as defined by Object Management Group (OMG). The structure is specified by means of UML class diagrams. The examples are presented by means of UML instance (or object) diagrams. An XML DTD is given in an annex of the LMF document. == Model structure == LMF is composed of the following components: the core package, which is the structural skeleton that describes the basic hierarchy of information in a lexical entry; and extensions of the core package, which are expressed in a framework that describes the reuse of the core components in conjunction with the additional components required for a specific lexical resource. The extensions are specifically dedicated to morphology, MRD, NLP syntax, NLP semantics, NLP multilingual notations, NLP morphological patterns, multiword expression patterns, and constraint expression patterns. == Example == In the following example, the lexical entry is associated with a lemma clergyman and two inflected forms clergyman and clergymen. The language coding is set for the whole lexical resource, and the language value is set for the whole lexicon. The elements Lexical Resource, Global Information, Lexicon, Lexical Entry, Lemma, and Word Form define the structure of the lexicon.
They are specified within the LMF document. By contrast, languageCoding, language, partOfSpeech, commonNoun, writtenForm, grammaticalNumber, singular and plural are data categories that are taken from the Data Category Registry. These marks adorn the structure. The values ISO 639-3, clergyman and clergymen are plain character strings. The value eng is taken from the list of languages as defined by ISO 639-3. With some additional information like dtdVersion and feat, the same data can be expressed as an XML fragment. This example is rather simple; LMF can represent much more complex linguistic descriptions, for which the XML tagging is correspondingly complex. == Selected publications about LMF == The first publication about the LMF specification as it has been ratified by ISO (in 2015 this paper became the 9th most cited paper among LREC conference papers): Language Resources and Evaluation LREC-2006/Genoa: Gil Francopoulo, Monte George, Nicoletta Calzolari, Monica Monachini, Nuria Bel, Mandy Pet, Claudia Soria: Lexical Markup Framework (LMF) About semantic representation: Gesellschaft für linguistische Datenverarbeitung GLDV-2007/Tübingen: Gil Francopoulo, Nuria Bel, Monte George, Nicoletta Calzolari, Monica Monachini, Mandy Pet, Claudia Soria: Lexical Markup Framework ISO standard for semantic information in NLP lexicons About African languages: Traitement Automatique des langues naturelles, Marseille, 2014: Mouhamadou Khoule, Mouhamad Ndiankho Thiam, El Hadj Mamadou Nguer: Toward the establishment of a LMF-based Wolof language lexicon (Vers la mise en place d'un lexique basé sur LMF pour la langue wolof) [in French] About Asian languages: Lexicography, Journal of ASIALEX, Springer 2014: Gil Francopoulo, Chu-Ren Huang: Lexical Markup Framework: An ISO Standard for Electronic Lexicons and its Implications for Asian Languages DOI 10.1007/s40607-014-0006-z About European languages: COLING 2010: Verena Henrich, Erhard 
Hinrichs: Standardizing Wordnets in the ISO Standard LMF: Wordnet-LMF for GermaNet EACL 2012: Judith Eckle-Kohler, Iryna Gurevych: Subcat-LMF: Fleshing out a standardized format for subcategorization frame interoperability EACL 2012: Iryna Gurevych, Judith Eckle-Kohler, Silvana Hartmann, Michael Matuschek, Christian M Meyer, Christian Wirth: UBY - A Large-Scale Unified Lexical-Semantic Resource Based on LMF. About Semitic languages: Journal of Natural Language Engineering, Cambridge University Press (to appear in Spring 2015): Aida Khemakhem, Bilel Gargouri, Abdelmajid Ben Hamadou, Gil Francopoulo: ISO Standard Modeling of a large Arabic Dictionary. Proceedings of the seventh Global Wordnet Conference 2014: Nadia B M Karmani, Hsan Soussou, Adel M Alimi: Building a standardized Wordnet in the ISO LMF for aeb language. Proceedings of the workshop: HLT & NLP within Arabic world, LREC 2008: Noureddine Loukil, Kais Haddar, Abdelmajid Ben Hamadou: Towards a syntactic lexicon of Arabic Verbs. Traitement Automatique des Langues Naturelles, Toulouse (in French) 2007: Khemakhem A, Gargouri B, Abdelwahed A, Francopoulo G: Modélisation des paradigmes de fl
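
The clergyman example described above names its structural elements and data categories but the XML fragment itself is not reproduced. The following is a minimal sketch of how such a fragment could be built with Python's standard `xml.etree.ElementTree`. The element names come from the text; the `feat` element carrying `att` and `val` attributes, the helper names `add_feat` and `build_clergyman_resource`, and the `dtdVersion` value are assumptions made for illustration, not a reproduction of the standard's DTD.

```python
import xml.etree.ElementTree as ET


def add_feat(parent, att, val):
    """Attach one data category and its value to a structural element.

    Assumption: each pair is serialized as <feat att="..." val="..."/>.
    """
    return ET.SubElement(parent, "feat", att=att, val=val)


def build_clergyman_resource():
    # The dtdVersion value below is a placeholder, not taken from the standard.
    resource = ET.Element("LexicalResource", dtdVersion="0")

    # Language coding is set once for the whole lexical resource.
    global_info = ET.SubElement(resource, "GlobalInformation")
    add_feat(global_info, "languageCoding", "ISO 639-3")

    # The language value is set once for the whole lexicon.
    lexicon = ET.SubElement(resource, "Lexicon")
    add_feat(lexicon, "language", "eng")

    entry = ET.SubElement(lexicon, "LexicalEntry")
    add_feat(entry, "partOfSpeech", "commonNoun")

    lemma = ET.SubElement(entry, "Lemma")
    add_feat(lemma, "writtenForm", "clergyman")

    # Two inflected forms: singular and plural.
    for written, number in (("clergyman", "singular"), ("clergymen", "plural")):
        form = ET.SubElement(entry, "WordForm")
        add_feat(form, "writtenForm", written)
        add_feat(form, "grammaticalNumber", number)
    return resource


resource = build_clergyman_resource()
print(ET.tostring(resource, encoding="unicode"))
```

The structural elements (Lexical Resource, Lexicon, Lexical Entry and so on) become XML elements, while the data categories and their values appear only as attribute-value pairs, which mirrors the split between structure and adornment described in the text.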

  • Hardware-based encryption


    Hardware-based encryption is the use of computer hardware to assist software, or sometimes replace software, in the process of data encryption. Typically, this is implemented as part of the processor's instruction set. For example, the AES encryption algorithm (a modern cipher) can be implemented using the AES instruction set on the ubiquitous x86 architecture. Such instructions also exist on the ARM architecture. However, more unusual systems exist where the cryptography module is separate from the central processor, instead being implemented as a coprocessor, in particular a secure cryptoprocessor or cryptographic accelerator, of which an example is the IBM 4758, or its successor, the IBM 4764. Hardware implementations can be faster and less prone to exploitation than traditional software implementations, and furthermore can be protected against tampering. == History == Prior to the use of computer hardware, cryptography could be performed through various mechanical or electro-mechanical means. An early example is the scytale used by the Spartans. The Enigma machine was an electro-mechanical cipher machine notably used by the Germans in World War II. After World War II, purely electronic systems were developed. In 1987 the ABYSS (A Basic Yorktown Security System) project was initiated. The aim of this project was to protect against software piracy. However, the application of computers to cryptography in general dates back to the 1940s and Bletchley Park, where the Colossus computer was used to break the encryption used by German High Command during World War II. The use of computers to encrypt, however, came later. In particular, until the development of the integrated circuit, of which the first was produced in 1960, computers were impractical for encryption, since, in comparison to the portable form factor of the Enigma machine, computers of the era took the space of an entire building.
It was only with the development of the microcomputer that computer encryption became feasible, outside of niche applications. The development of the World Wide Web led to the need for consumers to have access to encryption, as online shopping became prevalent. The key concerns for consumers were security and speed. This led to the eventual inclusion of the key algorithms into processors as a way of increasing both speed and security. == Implementations == === In the instruction set === ==== x86 ==== The x86 architecture, as a CISC (Complex Instruction Set Computer) architecture, typically implements complex algorithms in hardware. Cryptographic algorithms are no exception. The x86 architecture implements significant components of the AES (Advanced Encryption Standard) algorithm, which can be used by the NSA for Top Secret information. The architecture also includes support for the SHA hashing algorithms through the Intel SHA extensions. Whereas AES is a cipher, which is useful for encrypting documents, hashing is used for verification, such as of passwords (see PBKDF2). ==== ARM ==== ARM processors can optionally support Security Extensions. Although ARM is a RISC (Reduced Instruction Set Computer) architecture, there are several optional extensions specified by ARM Holdings. === As a coprocessor === IBM 4758 – The predecessor to the IBM 4764. This includes its own specialised processor, memory and a random number generator. IBM 4764 and IBM 4765, identical except for the connection used. The former uses PCI-X, while the latter uses PCI-e. Both are peripheral devices that plug into the motherboard. === Proliferation === Advanced Micro Devices (AMD) processors are also x86 devices, and have supported the AES instructions since the 2011 Bulldozer processor iteration. Due to the existence of encryption instructions on modern processors provided by both Intel and AMD, the instructions are present on most modern computers.
They also exist on many tablets and smartphones due to their implementation in ARM processors. == Advantages == Implementing cryptography in hardware means that part of the processor is dedicated to the task. This can lead to a large increase in speed. In particular, modern processor architectures that support pipelining can often perform other instructions concurrently with the execution of the encryption instruction. Furthermore, hardware can have methods of protecting data from software. Consequently, even if the operating system is compromised, the data may still be secure (see Software Guard Extensions). == Disadvantages == If, however, the hardware implementation is compromised, major issues arise. Malicious software can retrieve the data from the (supposedly) secure hardware; a large class of such methods is the timing attack. This is far more problematic to solve than a software bug, even within the operating system. Microsoft regularly deals with security issues through Windows Update. Similarly, regular security updates are released for Mac OS X and Linux, as well as mobile operating systems like iOS, Android, and Windows Phone. However, hardware is a different issue. Sometimes, the issue will be fixable through updates to the processor's microcode (a low level type of software). However, other issues may only be resolvable through replacing the hardware, or through a workaround in the operating system which reduces the performance benefit of the hardware implementation, as with the Spectre exploit.
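
Whether a given processor exposes the AES and SHA instructions discussed above can often be read from the feature list the operating system reports. The sketch below parses text in the style of Linux's `/proc/cpuinfo`. It assumes the flag names `aes` and `sha_ni` on x86 and `aes`, `sha1` and `sha2` on ARM, as commonly reported by the Linux kernel; the function name `cpu_crypto_features` and the sample strings are hypothetical.

```python
def cpu_crypto_features(cpuinfo_text):
    """Report AES and SHA instruction support from /proc/cpuinfo-style text.

    Assumption: x86 kernels list features on a "flags" line and ARM kernels
    on a "Features" line, as space-separated names.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return {
        "aes": "aes" in flags,
        "sha": bool(flags & {"sha_ni", "sha1", "sha2"}),
    }


# Hypothetical sample inputs; on a Linux machine the text could instead be
# read from /proc/cpuinfo.
x86_sample = "model name\t: Example CPU\nflags\t\t: fpu vme sse2 aes avx sha_ni\n"
arm_sample = "Features\t: fp asimd aes pmull sha1 sha2 crc32\n"
old_sample = "flags\t\t: fpu vme sse2\n"

print(cpu_crypto_features(x86_sample))
print(cpu_crypto_features(arm_sample))
print(cpu_crypto_features(old_sample))
```

Application code rarely needs this check directly, since cryptographic libraries usually detect the instructions themselves and fall back to software implementations when they are absent.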

  • Fansly


    Fansly is a subscription-based social media platform that allows content creators to monetize exclusive content, including photos, videos, live streams, and direct messages. Operated by Select Media LLC, the platform is headquartered in Baltimore, Maryland. While the platform hosts a variety of content genres, it is primarily known for adult content and is frequently compared to OnlyFans. == History == Fansly was launched in 2020 by Micheal Etelis under Select Media LLC, which was incorporated in February 2020. The platform also operates through CY Media LTD, registered in Kamares, Cyprus, established in May 2021. The company has remained privately held with no disclosed external funding rounds or official valuation, operating as a bootstrapped entity. Based on Fansly's social media presence, which was created in November 2020, the platform did not begin gaining traction until early 2021 when creators started to become concerned about potential content policy changes at OnlyFans. In August 2021, OnlyFans announced it would ban sexually explicit content effective October 2021, citing pressure from banks involved in its payment processing. Although OnlyFans reversed the decision six days later, the announcement triggered a massive influx of users to Fansly; the platform received nearly 4,000 new creator applications in a single hour, causing its servers to crash from the surge in traffic. By August 21, 2021, Fansly had reached 2.1 million users. == Features and business model == Fansly operates as a B2C marketplace, taking a 20% commission on all transactions conducted on the platform, with creators retaining the remaining 80%. This commission rate is the same as that charged by its main competitor, OnlyFans. A distinguishing feature of Fansly is its tiered subscription model, which allows creators to set multiple subscription levels at different price points, each offering different perks such as exclusive content, chat access, or custom requests. 
By contrast, OnlyFans historically relied on a single-tier subscription model. Revenue streams on the platform include recurring subscriptions, one-time pay-per-view content purchases, tips, paid messaging, and live-streaming fees. The platform also features an algorithmic "For You" feed that helps users discover new creators, addressing a limitation of competitors that lack internal content promotion mechanisms. Additional features include content watermarking, geolocation blocking to control where content is visible, two-factor authentication, community polls, 24-hour stories, and social media integration with platforms such as Twitter and Twitch. Payouts are processed within one to two business days and support multiple methods, including bank transfers, Skrill, Paxum, and cryptocurrency. In December 2025, Fansly expanded its live-streaming capabilities, introducing ticketed access, private list gating, configurable chat permissions, stream goals, and interactive device integration. == Controversies == === OnlyFans anti-competitive allegations === In August 2022, a series of lawsuits were filed in the United States alleging that OnlyFans had bribed employees of Meta Platforms to place Instagram accounts of creators who also sold content on competitor platforms, including Fansly, onto a terrorist blacklist. The lawsuits alleged that adult performers had traffic driven away from their Instagram accounts after being falsely tagged as terror-related. OnlyFans denied awareness of such activity. The plaintiffs withdrew the bribery claim in July 2023, and the case was dismissed in August 2023. === Privacy class action === In June 2025, Select Media LLC (operating as Fansly) was the subject of a digital privacy class action lawsuit filed in Massachusetts District Court. The lawsuit alleged that the platform secretly collected and shared users' sensitive viewing data with Google and other third parties without consent. 
The case was brought on behalf of an estimated class of over 10,000 users across multiple states.

  • Digital data


    Digital data or digital information, in information theory and information systems, is data or information represented as a string of discrete symbols, each of which can take on one of only a finite number of values from some alphabet, such as letters or digits. An example is a text document, which consists of a string of alphanumeric characters. The most common form of digital data in modern information systems is binary data, which is represented by a string of binary digits (bits) each of which can have one of two values, either 0 or 1. Digital data can be contrasted with analog data, which is represented by a value from a continuous range of real numbers. Analog data is transmitted by an analog signal, which not only takes on continuous values but can vary continuously with time, a continuous real-valued function of time. An example is the air pressure variation in a sound wave. Data requires interpretation to become information. In modern (post-1960) computer systems, all data is digital. The word digital comes from the same source as the words digit and digitus (the Latin word for finger), as fingers are often used for counting. Mathematician George Stibitz of Bell Telephone Laboratories used the word digital in reference to the fast electric pulses emitted by a device designed to aim and fire anti-aircraft guns in 1942. The term is most commonly used in computing and electronics, especially where real-world information is converted to binary numeric form as in digital audio and digital photography. == Symbol to digital conversion == Since symbols (for example, alphanumeric characters) are not continuous, representing symbols digitally is rather simpler than conversion of continuous or analog information to digital. Instead of sampling and quantization as in analog-to-digital conversion, such techniques as polling and encoding are used. 
A symbol input device usually consists of a group of switches that are polled at regular intervals to see which switches are switched. Data will be lost if, within a single polling interval, two switches are pressed, or a switch is pressed, released, and pressed again. This polling can be done by a specialized processor in the device to prevent burdening the main CPU. When a new symbol has been entered, the device typically sends an interrupt, in a specialized format, so that the CPU can read it. For devices with only a few switches (such as the buttons on a joystick), the status of each can be encoded as bits (usually 0 for released and 1 for pressed) in a single word. This is useful when combinations of key presses are meaningful, and is sometimes used for passing the status of modifier keys on a keyboard (such as shift and control). But it does not scale to support more keys than the number of bits in a single byte or word. Devices with many switches (such as a computer keyboard) usually arrange these switches in a scan matrix, with the individual switches on the intersections of x and y lines. When a switch is pressed, it connects the corresponding x and y lines together. Polling (often called scanning in this case) is done by activating each x line in sequence and detecting which y lines then have a signal, thus which keys are pressed. When the keyboard processor detects that a key has changed state, it sends a signal to the CPU indicating the scan code of the key and its new state. The symbol is then encoded or converted into a number based on the status of modifier keys and the desired character encoding. A custom encoding can be used for a specific application with no loss of data. However, using a standard encoding such as ASCII is problematic if a symbol such as 'ß' needs to be converted but is not in the standard. 
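
The three mechanisms described above (packing a few switch states into one word, scanning a key matrix, and encoding a symbol as a number) can be sketched in a few lines of Python. The function names `pack_buttons` and `scan_matrix`, and the `is_connected` callback standing in for the electrical test of an x/y intersection, are assumptions for illustration only.

```python
def pack_buttons(states):
    """Encode button states as bits of one word: bit i is 1 if button i is pressed."""
    word = 0
    for i, pressed in enumerate(states):
        if pressed:
            word |= 1 << i
    return word


def scan_matrix(is_connected, n_x, n_y):
    """Poll a scan matrix and return the (x, y) positions of pressed keys.

    is_connected(x, y) is a hypothetical stand-in for activating line x and
    checking whether line y then carries a signal.
    """
    pressed = []
    for x in range(n_x):        # activate each x line in sequence
        for y in range(n_y):    # detect which y lines have a signal
            if is_connected(x, y):
                pressed.append((x, y))
    return pressed


# Three buttons, first and third pressed: bits 0 and 2 are set.
print(pack_buttons([True, False, True]))

# A 3x2 matrix with two keys held down.
closed = {(0, 1), (2, 0)}
print(scan_matrix(lambda x, y: (x, y) in closed, 3, 2))

# A symbol outside the chosen standard cannot be encoded in it.
try:
    "ß".encode("ascii")
except UnicodeEncodeError:
    print("'ß' is not representable in ASCII")
print("ß".encode("latin-1"))
```

The packed word makes key combinations easy to test with bit masks, but its width caps the number of switches, which is why larger keyboards report individual scan codes instead.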
It is estimated that in the year 1986, less than 1% of the world's technological capacity to store information was digital and in 2007 it was already 94%. The year 2002 is assumed to be the year when humankind was able to store more information in digital than in analog format (the "beginning of the digital age"). == States == Digital data come in these three states: data at rest, data in transit, and data in use. The confidentiality, integrity, and availability have to be managed during the entire lifecycle from 'birth' to the destruction of the data. === Data at rest === Data at rest in information technology means data that is housed physically on computer data storage in any digital form (e.g. cloud storage, file hosting services, databases, data warehouses, spreadsheets, archives, tapes, off-site or cloud backups, mobile devices etc.). Data at rest includes both structured and unstructured data. This type of data is subject to threats from hackers and other malicious threats to gain access to the data digitally or physical theft of the data storage media. To prevent this data from being accessed, modified or stolen, organizations will often employ security protection measures such as password protection, data encryption, or a combination of both. The security options used for this type of data are broadly referred to as data-at-rest protection (DARP). Definitions include: "...all data in computer storage while excluding data that is traversing a network or temporarily residing in computer memory to be read or updated." "...all data in storage but excludes any data that frequently traverses the network or that which resides in temporary memory. Data at rest includes but is not limited to archived data, data which is not accessed or changed frequently, files stored on hard drives, USB thumb drives, files stored on backup tape and disks, and also files stored off-site or on a storage area network (SAN)." While it is generally accepted that archive data (i.e. 
which never changes), regardless of its storage medium, is data at rest and active data subject to constant or frequent change is data in use, “inactive data” could be taken to mean data which may change, but infrequently. The imprecise nature of terms such as “constant” and “frequent” means that some stored data cannot be comprehensively defined as either data at rest or in use. These definitions could be taken to assume that data at rest is a superset of data in use; however, data in use, subject to frequent change, has distinct processing requirements from data at rest, whether completely static or subject to occasional change. ==== Security ==== Because of its nature, data at rest is of increasing concern to businesses, government agencies and other institutions. Mobile devices are often subject to specific security protocols to protect data at rest from unauthorized access when lost or stolen, and there is an increasing recognition that database management systems and file servers should also be considered as at risk; the longer data is left unused in storage, the more likely it might be retrieved by unauthorized individuals outside the network. Data encryption, which prevents data visibility in the event of its unauthorized access or theft, is commonly used to protect data in motion and increasingly promoted for protecting data at rest. The encryption of data at rest should only include strong encryption methods such as AES or RSA. Encrypted data should remain encrypted when access controls such as usernames and passwords fail. Applying encryption at multiple levels is recommended. Cryptography can be implemented on the database housing the data and on the physical storage where the databases are stored. Data encryption keys should be updated on a regular basis. Encryption keys should be stored separately from the data. Encryption also enables crypto-shredding at the end of the data or hardware lifecycle.
Periodic auditing of sensitive data should be part of policy and should occur on a regular schedule. Finally, only store the minimum possible amount of sensitive data. Tokenization is a non-mathematical approach to protecting data at rest that replaces sensitive data with non-sensitive substitutes, referred to as tokens, which have no extrinsic or exploitable meaning or value. This process does not alter the type or length of data, which means it can be processed by legacy systems such as databases that may be sensitive to data length and type. Tokens require significantly fewer computational resources to process and less storage space in databases than traditionally encrypted data. This is achieved by keeping specific data fully or partially visible for processing and analytics while sensitive information is kept hidden. Lower processing and storage requirements make tokenization an ideal method of securing data at rest in systems that manage large volumes of data. A further method of preventing unwanted access to data at rest is the use of data federation, especially when data is distributed globally (e.g. in off-shore archives). An example of this would be a European organisation which stores its archived data off-site in the US. Under the terms of the USA PATRIOT Act the American authorities can demand

  • CLAWS (linguistics)


    The Constituent Likelihood Automatic Word-tagging System (CLAWS) is a program that performs part-of-speech tagging. It was developed in the 1980s at Lancaster University by the University Centre for Computer Corpus Research on Language. It has an overall accuracy rate of 96–97% with the latest version (CLAWS4) tagging around 100 million words of the British National Corpus. == History == A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other tokens), such as noun, verb, adjective, etc., although generally computational applications use more fine-grained POS tags like 'noun-plural'. Developed in the early 1980s, CLAWS was built to meet continually changing POS tagging needs. Originally created to add part-of-speech tags to the LOB corpus of British English, the CLAWS tagset has since been adapted to other languages as well, including Urdu and Arabic. Since its inception, CLAWS has been praised for its functionality and adaptability. Still, it is not without flaws: though it has an error rate of only 1.5% when judged in major categories, CLAWS still leaves about 3.3% of ambiguities unresolved. Ambiguity arises in cases such as the word flies, which may be classified as a noun or a verb. Ambiguities of this kind have driven the successive upgrades and tagsets of CLAWS. == Rules and processing == CLAWS uses a Hidden Markov model to determine the likelihood of sequences of words in anticipating each part-of-speech label. === Sample output === This excerpt from Bram Stoker's Dracula (1897) has been tagged using both the CLAWS C5 and C7 tagsets. This is what a CLAWS output will generally look like, with the most likely part-of-speech tag following each word. == Tagsets == === CLAWS1 tagset === The first tagset developed in CLAWS, the CLAWS1 tagset, has 132 word tags.
In terms of form and application, the C1 tagset is similar to the Brown Corpus tags. === CLAWS2 tagset === From 1983 to 1986, updated versions leading to CLAWS2 were part of a larger attempt to deal with aspects such as recognizing sentence breaks, in order to avoid the need for manual pre-processing of a text before the tags were applied, moving instead to optional manual post-editing to adjust the output of the automatic annotation, if needed. The CLAWS2 tagset has 166 word tags. === CLAWS4 tagset === CLAWS4 was used for the 100-million-word British National Corpus (BNC). A general-purpose grammatical tagger, it is a successor of the CLAWS1 tagger. In tagging the BNC, the many rounds of work that went into CLAWS4 focused on making the CLAWS program independent from the tagsets. For example, the BNC project used two tagset versions: "a main tagset (C5) with 62 tags with which the whole of the corpus has been tagged, and a larger (C7) tagset with 152 tags, which has been used to make a selected 'core' sample corpus of two million words." The latest version of CLAWS4 is offered by UCREL, a research center of Lancaster University. === CLAWS5 tagset === The CLAWS5 tagset, which was used for the BNC, has over 60 tags. === CLAWS6 tagset === The CLAWS6 tagset was used for the BNC sampler corpus and the COLT corpus. It has over 160 tags, including 13 determiner subtypes. === CLAWS7 tagset === The standard CLAWS7 tagset is the one currently in use. It differs from the CLAWS6 tagset only in the punctuation tags. === CLAWS8 tagset === The CLAWS8 tagset was extended from the C7 tagset with further distinctions in the determiner and pronoun categories, as well as 37 new auxiliary tags for forms of be, do, and have.
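
To illustrate how a hidden Markov model can resolve an ambiguity such as flies (noun or verb), the sketch below runs the Viterbi algorithm over a toy model. The three tags and all probabilities are invented for the example, and the function name `viterbi` is hypothetical; this is not CLAWS's actual model, tagset or data.

```python
def viterbi(words, tags, start_p, trans_p, emit_p):
    """Return the most probable tag sequence for words under a toy HMM."""
    # best[i][t]: probability of the best tag sequence for words[:i+1] ending in t
    best = [{t: start_p[t] * emit_p[t].get(words[0], 0.0) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Pick the previous tag that maximizes the path probability into t.
            prev = max(tags, key=lambda p: best[i - 1][p] * trans_p[p][t])
            best[i][t] = (best[i - 1][prev] * trans_p[prev][t]
                          * emit_p[t].get(words[i], 0.0))
            back[i][t] = prev
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Invented toy parameters (assumptions, not CLAWS data).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 1.0},
    "NOUN": {"time": 0.6, "flies": 0.4},
    "VERB": {"time": 0.3, "flies": 0.7},
}

print(viterbi(["the", "flies"], TAGS, START, TRANS, EMIT))   # flies as a noun
print(viterbi(["time", "flies"], TAGS, START, TRANS, EMIT))  # flies as a verb
```

The same word receives different tags because the transition probabilities from the preceding tag outweigh the word's own emission probabilities, which is the general idea behind choosing the most likely tag in context.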

  • Paperless society


    A paperless society is a society in which paper communication (written documents, mail, letters, etc.) is replaced by electronic communication and storage. The concept was first introduced by Frederick Wilfrid Lancaster in 1978. According to Lancaster, libraries would no longer be needed to handle printed documents: "Librarians will, in time, become information specialists in a deinstitutionalized setting". Lancaster also stated that computers and libraries will not always give us the information that other people and lived experience will. == Literature == Brodman, E. (1979). Review of Toward Paperless Information Systems. Bulletin of the Medical Library Association, 67(4), 437–439. Buckland, M. K. (1980). Review of Toward Paperless Information Systems. Journal of Academic Librarianship, 5(6), 349. Grosch, A. (1979). Review of Toward Paperless Information Systems. College & Research Libraries, 40(1), 88–89. Kohl, D. F. (2004). From the editor . . . The paperless society . . . Not quite yet. Journal of Academic Librarianship, 30(3), 177–178. Lancaster, F. W. (1978a). Toward paperless information systems. New York: Academic Press. Lancaster, F. W. (1980b). The future of the librarian lies outside of the library. Catholic Library World, 51, 388–391. Lancaster, F. W. (1982a). Libraries and librarians in an age of electronics. Arlington, VA: Information Resources Press. Lancaster, F. W. (1982b). The evolving paperless society and its implications for libraries. International Forum on Information and Documentation, 7(4), 3–10. Lancaster, F. W. (1983). Future librarianship: Preparing for an unconventional career. Wilson Library Bulletin, 57, 747–753. Lancaster, F. W. (1985). The paperless society revisited. American Libraries, 16, 553–555. Lancaster, F. W. (1993). Libraries and the future: Essays on the library in the twenty-first century. New York: Haworth Press. Lancaster, F. W. (1999). Second thoughts on the paperless society. Library Journal, 124(15), 48–50. Lancaster, F. W., & Smith, L. C. (1980c). On-Line systems in the communication process: Projections. Journal of the American Society for Information Science, 31(3), 193–200. Miall, D. S. (2001). The library versus the Internet: Literary studies under siege? Proceedings of the Modern Language Association, 116(5), 1405–1414. Salton, G. (1979). Review of Toward Paperless Information Systems. Journal of Documentation, 35(3), 250–252. Sellen, A. J., & Harper, R. H. R. (2003). The myth of the paperless office. Cambridge, MA: MIT Press. Stevens, N. D. (2006). The fully electronic academic library. College & Research Libraries, 67(1), 5–14. Young, A. P. (2008). Aftermath of a prediction: F. W. Lancaster and the paperless society. Library Trends, 56(4) (“The Evaluation and Transformation of Information Systems: Essays Honoring the Legacy of F. W. Lancaster,” edited by Lorraine J. Haricombe and Keith Russell), 843–858.

    Read more →
  • Server-sent events

    Server-sent events

    Server-Sent Events (SSE) is a server push technology that lets a client receive automatic updates from a server over an HTTP connection. It describes how servers can initiate data transmission towards clients once an initial client connection has been established. SSE is commonly used to send message updates or continuous data streams to a browser client. It was designed to enhance native, cross-browser streaming through a JavaScript API called EventSource, through which a client requests a particular URL in order to receive an event stream. The EventSource API is standardized as part of the HTML Living Standard by the WHATWG. The media type for SSE is text/event-stream. All modern browsers support server-sent events: Firefox 6+, Google Chrome 6+, Opera 11.5+, Safari 5+, Microsoft Edge 79+, Brave. Since SSE requires neither persistent connections nor chunked transfer encoding, HTTP/1.1 is not a technical requirement.

    == History ==

    The SSE mechanism was first specified by Ian Hickson as part of the "WHATWG Web Applications 1.0" proposal starting in 2004. In September 2006, the Opera web browser implemented the experimental technology in a feature called "Server-Sent Events". The W3C published Server-Sent Events as a Recommendation on February 3, 2015, after years of development through Working Drafts and Candidate Recommendations.

    == Technology ==

    When sending high-frequency data, the server must manage backpressure to prevent saturating clients. This is mitigated in the following ways:

    * Client-side buffering: browsers have limited buffer space for incoming server-sent events.
    * Adaptive rate limiting: servers can adjust event frequency and monitor connection health.
    * Event batching: multiple events are combined into larger and less frequent transmissions.
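
    The text/event-stream format and the event batching idea described above can be sketched in a few lines of Python. This is an illustrative sketch, not a server implementation: the function names `format_sse` and `batch_events`, the event name "batch", and the choice of JSON for the payload are all assumptions made for the example. What it relies on from the format itself is only that an event is a series of `field: value` lines (`event`, `id`, `data`) terminated by a blank line.

```python
import json


def format_sse(data, event=None, event_id=None):
    """Serialize one event in the text/event-stream wire format.

    The function name and signature are hypothetical; only the
    "field: value" lines and the terminating blank line come from
    the event stream format.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # A multi-line payload is sent as one "data:" field per line.
    for part in str(data).split("\n"):
        lines.append(f"data: {part}")
    # An empty line marks the end of the event.
    return "\n".join(lines) + "\n\n"


def batch_events(payloads, batch_size):
    """Event batching: combine payloads into fewer, larger events.

    Each batch is encoded as a JSON array (an assumption for this
    sketch) and sent under the hypothetical event name "batch".
    """
    for start in range(0, len(payloads), batch_size):
        chunk = payloads[start:start + batch_size]
        yield format_sse(json.dumps(chunk), event="batch")


if __name__ == "__main__":
    for message in batch_events([1, 2, 3], batch_size=2):
        print(message, end="")
```

    A server would write these strings to a response whose media type is text/event-stream; on the browser side, the EventSource API mentioned above would receive each batch as one event. Batching trades a little latency for fewer transmissions, which is one way to ease the backpressure problem.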

    Read more →
  • New media

    New media

    New media are communication technologies that enable or enhance interaction between users, as well as interaction between users and content. In the middle of the 1990s, the phrase "new media" became widely used as part of a sales pitch for the influx of interactive CD-ROMs for entertainment and education. The new media technologies, sometimes known as Web 2.0, include a wide range of web-related communication tools such as blogs, wikis, online social networking, virtual worlds, and other social media platforms.

    The phrase "new media" refers to computational media that share material online and through computers. New media inspire new ways of thinking about older media. Media do not replace one another in a clear, linear succession; instead, they evolve in a more complicated network of interconnected feedback loops. What is different about new media is how they specifically refashion traditional media and how older media refashion themselves to meet the challenges of new media.

    Unless they contain technologies that enable digital generative or interactive processes, broadcast television programs, non-interactive news websites, feature films, magazines, and books are not considered to be new media. The term "new media" stands in contrast to old media, which dominated the media landscape as a form of mass media for many years.

    == History ==

    In the 1950s, connections between computing and radical art began to grow stronger. It was not until the 1980s that Alan Kay and his co-workers at Xerox PARC began to give the computing power of a personal computer to the individual, rather than leaving a big organization in charge of it. In the late 1980s and early 1990s, however, we seem to witness a different kind of parallel relationship between social changes and computer design. Although the two are causally unrelated, it makes sense conceptually that the end of the Cold War and the design of the Web took place at the same time.
    Writers and philosophers such as Marshall McLuhan were instrumental in the development of media theory during this period. McLuhan's now-famous declaration in Understanding Media: The Extensions of Man that "the medium is the message" drew attention to the too-often ignored influence that media and technology themselves, rather than their "content", have on humans' experience of the world and on society broadly.

    Until the 1980s, media relied primarily upon print and analog broadcast models such as television and radio. The last twenty-five years have seen a rapid transformation into media predicated upon the use of digital technologies such as the Internet and video games. However, these examples are only a small representation of new media. The use of digital computers has transformed the remaining "old" media, as suggested by the advent of digital television and online publications. Even traditional media forms such as the printing press have been transformed through the use of image manipulation software like Adobe Photoshop and desktop publishing tools.

    Andrew L. Shapiro argues that the "emergence of new, digital technologies signals a potentially radical shift of who is in control of information, experience and resources". W. Russell Neuman suggests that whilst the "new media" have technical capabilities to pull in one direction, economic and social forces pull back in the opposite direction. According to Neuman, "We are witnessing the evolution of a universal interconnected network of audio, video, and electronic text communications that will blur the distinction between interpersonal and mass communication; and between public and private communication". Neuman argues that new media will:

    * Alter the meaning of geographic distance.
    * Allow for a huge increase in the volume of communication.
    * Provide the possibility of increasing the speed of communication.
    * Provide opportunities for interactive communication.
    * Allow forms of communication that were previously separate to overlap and interconnect.

    Consequently, scholars such as Douglas Kellner and James Bohman have contended that new media, and particularly the Internet, will provide the potential for a democratic postmodern public sphere in which citizens can participate in well-informed, non-hierarchical debate pertaining to their social structures. Contradicting these positive appraisals of the potential social impacts of new media are scholars such as Edward S. Herman and Robert McChesney, who have suggested that the transition to new media has seen a handful of powerful transnational telecommunications corporations achieve a level of global influence that was hitherto unimaginable.

    Scholars have highlighted both the positive and negative potential and actual implications of new media technologies, suggesting that some of the early work in new media studies was guilty of technological determinism, whereby the effects of media were taken to be determined by the technologies themselves, rather than traced through the complex social networks that governed the development, funding, implementation, and future evolution of any technology.

    Based on the argument that people have a limited amount of time to spend on the consumption of different media, displacement theory argues that the viewership or readership of one particular outlet reduces the amount of time the individual spends on another. The introduction of new media, such as the Internet, therefore reduces the amount of time individuals spend on existing "old" media, which could ultimately lead to the end of such traditional media.
    == Definition ==

    Although there are several ways that new media may be described, Lev Manovich, in an introduction to The New Media Reader, defines new media by using eight propositions:

    * New media versus cyberculture – Cyberculture is the various social phenomena that are associated with the Internet and network communications (blogs, online multi-player gaming), whereas new media is concerned more with cultural objects and paradigms (the transition from analog to digital television, smartphones).
    * New media as computer technology used as a distribution platform – New media are the cultural objects which use digital computer technology for distribution and exhibition, e.g. (at least for now) the Internet, Web sites, computer multimedia, Blu-ray disks, etc. The problem with this is that the definition must be revised every few years. The term "new media" will not be "new" anymore, as most forms of culture will be distributed through computers.
    * New media as digital data controlled by software – The language of new media is based on the assumption that all cultural objects that rely on digital representation and computer-based delivery share a number of common qualities. New media is reduced to digital data that can be manipulated by software like any other data. Media operations can now create several versions of the same object. An example is an image stored as matrix data, which can be manipulated and altered by additional algorithms such as color inversion, gray-scaling, sharpening, and rasterizing.
    * New media as the mix between existing cultural conventions and the conventions of software – New media today can be understood as the mix between older cultural conventions for data representation, access, and manipulation and newer conventions of data representation, access, and manipulation. The "old" data are representations of visual reality and human experience, and the "new" data is numerical data. The computer is kept out of the key "creative" decisions and is delegated to the position of a technician. For example, in film, software is used in some areas of production, while in others content is created using computer animation.
    * New media as the aesthetics that accompanies the early stage of every new modern media and communication technology – While ideological tropes indeed seem to be reappearing rather regularly, many aesthetic strategies may reappear two or three times ... In order for this approach to be truly useful it would be insufficient to simply name the strategies and tropes and to record the moments of their appearance; instead, we would have to develop a much more comprehensive analysis which would correlate the history of technology with the social, political, and economic histories of the modern period.
    * New media as faster execution of algorithms previously executed manually or through other technologies – Computers are a huge speed-up of what were previously manual techniques, e.g. calculators. Dramatically speeding up the execution makes possible previously non-existent representational techniques. It also makes possible many new forms of media art such as interactive multimedia and video games. Although on one level a modern digital computer is just a faster calculator, we should not ignore its other identity: that of a cybernetic control device.
    * New media as the encoding of modernist avant-garde; new media as metamedia – Manovi

    Read more →
  • Cyclodisparity

    Cyclodisparity

    In vision science, cyclodisparity is the difference in the rotation angle of an object or scene viewed by the left and right eyes. Cyclodisparity can result from the eyes' torsional rotation (cyclorotation) or can be created artificially by presenting to the eyes two images that need to be rotated relative to each other for binocular fusion to take place.

    == Human and animal vision ==

    The eyes and visual system can compensate for cyclodisparity up to a certain point; if the cyclodisparity is larger than a threshold, the images cannot be fused, resulting in stereoblindness, and in double vision in subjects who otherwise have full stereo vision.

    When a human subject is presented with images that have artificial cyclodisparity, cyclovergence is evoked, that is, a motor response of the eye muscles that rotates the two eyes in opposite directions, thereby reducing cyclodisparity. Visually induced cyclovergence of up to 8 degrees has been observed in normal subjects. Furthermore, up to about 8 degrees can usually be compensated by purely sensory means, that is, without physical eye rotation. This means that the normal human observer can achieve binocular image fusion in the presence of cyclodisparity of up to approximately 16 degrees.

    Cyclodisparity due to images having been rotated inward can be compensated better when the gaze is directed downwards, and cyclodisparity due to an outward rotation can be compensated better when the gaze is directed upwards. A proposed explanation for this phenomenon is that the motor system is coordinated in such a way that the eyes perform a torsional movement to reduce the size of the search zones and thus the computational load required for solving the correspondence problem. The resulting cyclovergence at near gaze is smaller than the cyclovergence predicted by Listing's law.

    == Video processing and computer vision ==

    Active camera torsion can be used in machine and computer vision for several purposes. For instance, camera torsion can be used to make better use of the search range over which matching detectors or stereo matching algorithms operate, or to make a 3D slanted surface appear frontoparallel for further stereo processing. For image compression purposes, images with cyclodisparity are advantageously encoded by global motion compensation with a rotational motion model.
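
    The arithmetic and the rotational motion model mentioned above can be illustrated with a short Python sketch. Everything named here is hypothetical: the function names, the sign convention (left angle minus right angle), and the treatment of the roughly 8 + 8 degree figures from the text as a hard cutoff are simplifying assumptions made only for illustration, not a model of the visual system or of any particular codec.

```python
import math

# Approximate figures quoted in the text, treated here as hard limits
# (a simplification made for this sketch).
MOTOR_LIMIT_DEG = 8.0    # visually induced cyclovergence
SENSORY_LIMIT_DEG = 8.0  # purely sensory compensation


def cyclodisparity(left_deg, right_deg):
    """Difference in rotation angle between the two views.

    The sign convention (left minus right) is an assumption.
    """
    return left_deg - right_deg


def is_fusible(disparity_deg):
    """True if motor plus sensory compensation could cover the disparity."""
    return abs(disparity_deg) <= MOTOR_LIMIT_DEG + SENSORY_LIMIT_DEG


def rotate_about(point, center, angle_deg):
    """Rotational motion model: rotate (x, y) about center by angle_deg."""
    a = math.radians(angle_deg)
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (center[0] + dx * math.cos(a) - dy * math.sin(a),
            center[1] + dx * math.sin(a) + dy * math.cos(a))


def compensate(points, center, disparity_deg):
    """Undo a global rotation so the two views line up before matching."""
    return [rotate_about(p, center, -disparity_deg) for p in points]
```

    In this sketch, a single angle describes the motion of the whole image, which is what makes a rotational global motion model compact: one parameter is stored instead of a motion vector per block.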

    Read more →
  • Media Auxiliary Memory

    Media Auxiliary Memory

    Media Auxiliary Memory or Medium Auxiliary Memory (MAM) refers to a chip embedded in a digital media device (usually a tape cartridge) that stores a small amount of data or metadata that a computer can read without having to read the actual tape. MAMs can be used by the tape driver to increase efficiency, or by custom software to store and retrieve custom data. Some examples of MAMs are Cartridge Memory (HP/Seagate/IBM LTO) and MIC (Sony AIT).

    Read more →
  • Digital cassettes

    Digital cassettes

    Digital audio cassette formats introduced to the professional audio and consumer markets:

    * Digital Audio Tape (DAT) is the best known, and had some success as an audio storage format among professionals and "prosumers" before the prices of hard-drive and solid-state flash memory-based digital recording devices dropped in the late 1990s. Hard-drive recording has mostly made DAT obsolete, as hard disk recorders offer more editing versatility than tape, and easier importation into digital audio workstations (DAWs) and non-linear video editing (NLE) systems.
    * Digital Compact Cassette was intended as a digital replacement for the mass-market analog cassette tape, but received very little attention or adoption. Its failure is generally attributed to higher production costs than audio CDs, durability concerns, and indifferent reception by consumers.

    Digital video cassettes include:

    * Betacam IMX (Sony)
    * D-VHS (JVC)
    * D1 (Sony)
    * D2 (Sony)
    * D3
    * D5 HD
    * Digital-S D9 (JVC)
    * Digital Betacam (Sony)
    * Digital8 (Sony)
    * DV
    * HDV
    * ProHD (JVC)
    * MiniDV
    * MicroMV

    == Analog cassettes used as digital data storage ==

    Historically, the compact audio cassette, which was originally designed for analog storage of music, was used as an alternative to disk drives in the late 1970s and early 1980s to provide data storage for home computers. There are a number of unique and incompatible cassette tape data storage formats that all use the same analog compact audio cassette tape media.

    The ADAT system uses Super VHS tapes to record 8 synchronized digital audio tracks at once.

    There have also been several audio recording systems that used VHS video recorders as storage devices and video tape transports, generally by encoding the digital data to be recorded into an analog composite video signal (which resembles static) and then recording this to magnetic tape. These systems were often used as "mixdown" recorders, to record the finished mix from a multi-track recorder in preparation for the manufacture of a vinyl record, cassette tape, or CD. Examples are the Dbx Model 700 and the Sony PCM adaptor series.

    Several companies sold VHS backup solutions in the 1980s and 1990s in which data was converted to a video image that was then saved onto a VHS tape:

    * the Corvus "Mirror" (U.S. patent 4380047A)
    * the Metrum Model 64 on S-VHS tape
    * the Danmere Backer tape backup system
    * the Alpha Microsystems Videotrax
    * the Legacy Storage Systems International VAST (Variable Array Storage)
    * the ArVid
    * the Video Backup System Amiga

    The S2 VLBI system at three NASA Deep Space Network complexes and over 20 other radio telescopes stores digital data on S-VHS tapes.

    Read more →
  • Event cinema

    Event cinema

    Event cinema, sometimes called alternative content cinema or livecasts, refers to the use of movie theaters to display a varied range of live and recorded entertainment other than traditional films, such as sport, opera, musicals, ballet, music, one-off TV specials, current affairs, comedy and religious services.

    == History and development ==

    Event Cinema emerged at the start of the 21st century, with rock concerts by Bon Jovi (2001), David Bowie (2003), and Robbie Williams (2005) bringing non-film audiences into cinemas that had newly installed digital equipment. The Metropolitan Opera in New York, through its partnership with Fathom Events, is acknowledged as the trailblazer in this area, aggressively seeking out new markets and setting high standards for live broadcasts via satellite. It has been emulated by other opera houses worldwide, with the Royal Opera House following a close second, along with Glyndebourne, La Scala and the Sydney Opera House. The genre of opera within the Event Cinema industry has been a huge success and has, for the first time, brought new, younger audiences into cash-strapped opera houses dependent on state funding and wealthy benefactors, an unforeseen and happy consequence of digitisation.

    Ballet and theater have also been very successful, as have rock concerts, both live and recorded. The UK's National Theatre has had great success with its season of live broadcasts under the banner "NT Live", featuring big-name casts such as Helen Mirren, whose recent turn as Queen Elizabeth II in The Audience was a sell-out everywhere. (This was in partnership with another West End theatre, and the NT is keen to help other theatres maximise their potential through live broadcasts.) The Globe and the Royal Shakespeare Company are also producing work for live broadcast and recorded exhibition.

    As digitisation of cinemas matures, the Event Cinema industry is growing. The strongest territory is the US, followed by the UK and mainland European territories. Latin America is also a very strong market. Recent additions include Pompeii Live, an exhibition by the UK's British Museum, featuring celebrities and curators taking the audience on a live tour around the recreated set of Pompeii within the museum itself. The museum is also exploring the schools market for the first time, following the live broadcast on June 18 with a daytime broadcast aimed at UK schools. If successful, this may prove a model for other museums to emulate.

    An added incentive for exhibitors is the ability to show alternative content, i.e. content that is an alternative to mainstream, studio-driven content, such as live special events, sports, pre-show advertising and other digital or video content. In industry terms this became known as "Alternative Content", but it has recently become known more widely as "Event Cinema".

    === Expanding markets ===

    Some low-budget films that would normally not have a theatrical release because of distribution costs might be shown in smaller engagements than the typical large-release studio pictures. The cost of duplicating a digital "print" is very low, so adding more theaters to a release has a small additional cost to the distributor. Movies that start with a small release could scale to a much larger release quickly if they were sufficiently successful, opening up the possibility that smaller movies could achieve box office success previously out of their reach.

    ==== Technical specifications ====

    Event Cinema is also finding a market in developing countries in which the higher costs and quality of DCI equipment are not yet affordable, because, crucially, there are no DCI specifications for Alternative Content as there are for mainstream studio content. This has led to an explosion in the variety of content on offer, but the lack of standardisation has led to questionable quality at times. As the industry matures, this lack of regulation is expected to change, and there are moves afoot to introduce codes of practice and technical specifications.

    Recorded content complements mainstream studio content by filling the "downtime" that plagues the cinema industry, where screens worldwide spend a large proportion of their time in darkness and cinemas stand empty. Some cinema chains have targeted pensioners in particular, offering free tea and coffee at afternoon matinees of recorded opera, for example. Digital Cinema Packages (DCPs) have been useful to cinemas not yet equipped with satellite broadcasting capability and have enabled exhibitors to build their Event Cinema audience, which is not generally the 18-24 demographic that multiplexes are targeting.

    ==== New Audiences ====

    Event Cinema has seen the return of an older, affluent audience previously turned off by the multiplex experience. Cinemas are starting to capitalise on this by offering waiter-served, high-class finger food and alcoholic beverages, complete with bars and restaurants, a world away from the traditional popcorn and soft drink model. Art house cinemas are increasingly marketing themselves as "destination" venues, somewhere to spend an entire evening rather than just a couple of hours.

    As exhibition admissions have plateaued in recent years due to the explosion in VOD, tablet and mobile content technology, this new revenue stream has been a surprising and welcome addition to the cinema industry, though the US studios have so far been cautious in embracing the change. The thrill of live broadcasts means they are generally regarded as more popular than recorded events, but there are exceptions: artists with a loyal cult or teenage following tend to do particularly well in recorded form, as concert films featuring artists such as the Grateful Dead, Pearl Jam, JLS, Led Zeppelin and the Rolling Stones have shown.
    ==== The Future ====

    As more and more distributors emerge, offering an increasingly broad range of content to cinemas worldwide, the landscape itself is shifting: screen advertising companies, technical providers, and exhibitors are reinventing themselves as Alternative Content or Event Cinema distributors, and the industry is witnessing a re-evaluation of business models and practices worldwide. Predictions are that this industry could be worth in excess of US$1bn by 2015. An illustration of the growth of this industry is the establishment of a European trade association, the Event Cinema Association, which promotes the industry to the general public and supports those involved in it.

    Read more →
  • Confused deputy problem

    Confused deputy problem

    In information security, a confused deputy is a computer program that is tricked by another program (with fewer privileges or rights) into misusing its authority on the system. It is a specific type of privilege escalation. The confused deputy problem is often cited as an example of why capability-based security is important. Capability systems can mitigate the confused deputy problem by eliminating ambient authority, allowing programs to act only on resources for which they hold explicit capabilities, whereas access-control list–based systems are more susceptible to it. However, this protection depends on correct implementation; in formally verified capability systems such as seL4, it can be shown that the kernel enforces capability constraints correctly, preventing such behavior at the system level.

    == Example ==

    In the original example of a confused deputy, there was a compiler program provided on a commercial timesharing service. Users could run the compiler and optionally specify a filename where it would write debugging output, and the compiler would be able to write to that file if the user had permission to write there.

    The compiler also collected statistics about language feature usage. Those statistics were stored in a file called "(SYSX)STAT", in the directory "SYSX". To make this possible, the compiler program was given permission to write to files in SYSX.

    But there were other files in SYSX: in particular, the system's billing information was stored in a file "(SYSX)BILL". A user ran the compiler and named "(SYSX)BILL" as the desired debugging output file.

    This produced a confused deputy problem. The compiler made a request to the operating system to open (SYSX)BILL. Even though the user did not have access to that file, the compiler did, so the open succeeded. The compiler wrote the debugging output to the file (here "(SYSX)BILL") as normal, overwriting it, and the billing information was destroyed.

    === The confused deputy ===

    In this example, the compiler program is the deputy because it is acting at the request of the user. The program is seen as "confused" because it was tricked into overwriting the system's billing file.

    Whenever a program tries to access a file, the operating system needs to know two things: which file the program is asking for, and whether the program has permission to access the file. In the example, the file is designated by its name, "(SYSX)BILL". The program receives the file name from the user, but does not know whether the user had permission to write the file. When the program opens the file, the system uses the program's permission, not the user's. When the file name was passed from the user to the program, the permission did not go along with it; the permission was increased by the system silently and automatically.

    It is not essential to the attack that the billing file be designated by a name represented as a string. The essential points are that:

    * the designator for the file does not carry the full authority needed to access the file;
    * the program's own permission to access the file is used implicitly.

    == Other examples ==

    A cross-site request forgery (CSRF) is an example of a confused deputy attack that uses the web browser to perform sensitive actions against a web application. A common form of this attack occurs when a web application uses a cookie to authenticate all requests transmitted by a browser. Using JavaScript, an attacker can force a browser into transmitting authenticated HTTP requests.

    The Samy computer worm used cross-site scripting (XSS) to turn the browser's authenticated MySpace session into a confused deputy. Using XSS, the worm forced the browser into posting an executable copy of the worm as a MySpace message, which was then viewed and executed by friends of the infected user.

    Clickjacking is an attack where the user acts as the confused deputy. In this attack a user thinks they are harmlessly browsing a website (an attacker-controlled website) but they are in fact tricked into performing sensitive actions on another website.

    An FTP bounce attack can allow an attacker to connect indirectly to TCP ports to which the attacker's machine has no access, using a remote FTP server as the confused deputy.

    Another example relates to personal firewall software, which can restrict Internet access for specific applications. Some applications circumvent this by starting a browser with instructions to access a specific URL. The browser has authority to open a network connection, even though the application does not. Firewall software can attempt to address this by prompting the user in cases where one program starts another which then accesses the network. However, the user frequently does not have sufficient information to determine whether such an access is legitimate; false positives are common, and there is a substantial risk that even sophisticated users will become habituated to clicking "OK" on these prompts.

    Not every program that misuses authority is a confused deputy. Sometimes misuse of authority is simply the result of a program error. The confused deputy problem occurs when the designation of an object is passed from one program to another, and the associated permission changes unintentionally, without any explicit action by either party. It is insidious because neither party did anything explicit to change the authority.

    Another example is when an administrator authorizes an AI agent to act on their behalf, and that AI subsequently delegates authority to another AI agent neither vetted nor authorized by the original administrator. The unvetted AI can then act without permission or oversight from the original administrator.

    == Solutions ==

    In some systems it is possible to ask the operating system to open a file using the permissions of another client. This solution has some drawbacks:

    * It requires explicit attention to security by the server. A naive or careless server might not take this extra step.
    * It becomes more difficult to identify the correct permission if the server is in turn the client of another service and wants to pass along access to the file.
    * It requires the client to trust the server not to abuse the borrowed permissions.

    Note that intersecting the server's and the client's permissions does not solve the problem either, because the server may then have to be given very wide permissions (all of the time, rather than those needed for a given request) in order to act for arbitrary clients.

    The simplest way to solve the confused deputy problem is to bundle together the designation of an object and the permission to access that object. This is exactly what a capability is.

    Using capability security in the compiler example, the client would pass to the server a capability to the output file, such as a file descriptor, rather than the name of the file. Since the client lacks a capability to the billing file, it cannot designate that file for output. In the cross-site request forgery example, a URL supplied "cross"-site would include its own authority independent of that of the client of the web browser.
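
    The compiler example and its capability-based fix can be modelled with a toy Python sketch. This is an illustration under stated assumptions, not a reconstruction of the original timesharing system: the classes `ToyFileSystem` and `WriteCapability`, the principal names, the file "user.dbg", and the permission table are all hypothetical. The only elements taken from the text are the file names (SYSX)STAT and (SYSX)BILL and the idea that a capability bundles designation and permission, much like a file descriptor.

```python
class WriteCapability:
    """Designation and permission bundled together, like a file descriptor."""

    def __init__(self, fs, name):
        self._fs = fs
        self._name = name

    def write(self, text):
        self._fs.files[self._name] = text


class ToyFileSystem:
    """Hypothetical file store with a per-principal write permission table."""

    def __init__(self):
        self.files = {"(SYSX)STAT": "statistics", "(SYSX)BILL": "billing records"}
        self.acl = {
            "compiler": {"(SYSX)STAT", "(SYSX)BILL", "user.dbg"},
            "user": {"user.dbg"},
        }

    def open_for_write(self, principal, name):
        # The check uses the rights of whoever performs the open.
        if name not in self.acl.get(principal, set()):
            raise PermissionError(f"{principal} may not write {name}")
        return WriteCapability(self, name)


def confused_compile(fs, debug_name):
    """Vulnerable deputy: opens a user-supplied name with its own authority."""
    fs.open_for_write("compiler", debug_name).write("debug output")


def capability_compile(debug_cap):
    """Capability style: the caller must already hold access to the file."""
    debug_cap.write("debug output")


if __name__ == "__main__":
    fs = ToyFileSystem()
    confused_compile(fs, "(SYSX)BILL")      # the open succeeds, billing data is lost
    print(fs.files["(SYSX)BILL"])

    fs = ToyFileSystem()
    try:
        capability_compile(fs.open_for_write("user", "(SYSX)BILL"))
    except PermissionError as err:          # the user cannot even designate the file
        print(err)
```

    In the first function the file name travels from user to compiler without any permission attached, so the compiler's ambient authority is applied implicitly. In the second, the open is performed by the user, so a request for the billing file fails before the compiler is ever involved.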

    Read more →
  • Digital asset

    Digital asset

    A digital asset is anything that exists only in digital form and comes with a distinct usage right or distinct permission for use. Data that do not possess those rights are not considered assets.

    Digital assets include, but are not limited to: digital documents, audio content, motion pictures, and other relevant digital data currently in circulation or stored on digital appliances, such as personal computers, laptops, portable media players, tablets, data storage devices, and telecommunication devices. This encompasses any apparatus that currently exists or will exist as technology progresses to accommodate new modalities capable of carrying digital assets. This holds true regardless of the ownership of the physical device on which the digital asset is located.

    == Types ==

    Types of digital assets include, but are not limited to: software, photography, logos, illustrations, animations, audiovisual media, presentations, spreadsheets, digital paintings, word documents, electronic mails, websites, and various other digital formats with their respective metadata. The number of different types of digital assets is increasing exponentially due to the rising number of devices that leverage these assets, such as smartphones, serving as conduits for digital media. At the Intel Developer Forum 2013, Intel introduced several new types of digital assets related to medicine, education, voting, friendships, conversations, and reputation, among others.

    == Digital asset management system ==

    A digital asset management (DAM) system is an integrated structure that combines software, hardware, and/or other services to manage, store, ingest, organize, and retrieve digital assets. These systems enable users to find and use content when needed.

    == Digital asset metadata ==

    Metadata is data about other data. Any structured information that defines a specification of any form of data is referred to as metadata.
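
    As a rough illustration of metadata as "data about other data", the following Python sketch keeps a small descriptive record for each asset and uses it to retrieve assets without opening the files themselves. The record layout, the field names, the `find_assets` helper and the sample entries are all hypothetical; they are not taken from any particular DAM product or metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class AssetMetadata:
    """Descriptive record kept for one digital asset (fields are hypothetical)."""
    asset_id: str
    title: str
    media_type: str
    usage_rights: str
    keywords: set = field(default_factory=set)


def find_assets(catalog, keyword):
    """Return the ids of assets whose metadata carries the given keyword."""
    return sorted(m.asset_id for m in catalog if keyword in m.keywords)


# Sample catalog entries, invented for this example.
catalog = [
    AssetMetadata("a1", "Company logo", "image/svg+xml", "internal use", {"logo", "brand"}),
    AssetMetadata("a2", "Launch video", "video/mp4", "licensed", {"brand", "video"}),
    AssetMetadata("a3", "Budget sheet", "text/csv", "confidential", {"finance"}),
]

if __name__ == "__main__":
    print(find_assets(catalog, "brand"))
```

    The lookup only reads the metadata records, which is the sense in which a well-described asset is easier to find and manage than one with no description attached.
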
    Metadata is also a claimed relationship between two entities, often used to establish connections or associations. Librarian Lorcan Dempsey says "Think of metadata as data which removes from a user (human or machine) the need to have full advance knowledge of the existence or characteristics of things of potential interest in the environment".

    At first, the term metadata was used for digital data exclusively, but nowadays metadata can apply to both physical and digital data. Catalogs, inventories, registers, and other similar standardized forms of organizing, managing, and retrieving resources contain metadata. Metadata can be stored directly within the file it refers to, or independently of it with the help of other forms of data management such as a DAM system.

    The more metadata is assigned to an asset, the easier it becomes to categorize, especially as the amount of information grows. An asset's value rises with the amount of metadata it carries, because the asset becomes more accessible, easier to manage, and more complex.

    Structured metadata can be shared with open protocols like OAI-PMH to allow further aggregation and processing. Open data sources like institutional repositories have thus been aggregated to form large datasets and academic search engines comprising tens of millions of open access works, like BASE, CORE, and Unpaywall.

    == Issues ==

    Due to a lack of either legislation or legal precedent, there is limited governmental control and regulation of digital assets in the United States and other large economies. Many of the control issues relating to access and transferability are handled by individual companies. This leaves open questions such as what becomes of the assets once their owner is deceased, and whether, and if so how, they may be inherited.
    The subject was raised by a false news story claiming that Bruce Willis planned to sue Apple because the end-user agreement prevented him from bequeathing his iTunes collection to his children. In another case, a soldier died on duty and the family requested access to the soldier's Yahoo! account. When Yahoo! refused, a probate judge ordered the company to hand over the emails to the family, but Yahoo! was still not required to grant access to the account itself.

    The Music Modernization Act was passed in September 2018 by the U.S. Congress to create a new music licensing system, with the aim of helping songwriters get paid more.

  • Mean opinion score

    Mean opinion score (MOS) is a measure used in the domain of Quality of Experience and telecommunications engineering, representing the overall quality of a stimulus or system. It is the arithmetic mean over all individual "values on a predefined scale that a subject assigns to his opinion of the performance of a system quality". Such ratings are usually gathered in a subjective quality evaluation test, but they can also be algorithmically estimated. MOS is a commonly used measure for video, audio, and audiovisual quality evaluation, but it is not restricted to those modalities. ITU-T has defined several ways of referring to a MOS in Recommendation ITU-T P.800.1, depending on whether the score was obtained from audiovisual, conversational, listening, talking, or video quality tests.

    == Rating scales and mathematical definition ==

    The MOS is expressed as a single rational number, typically in the range 1–5, where 1 is the lowest perceived quality and 5 is the highest perceived quality. Other MOS ranges are also possible, depending on the rating scale that has been used in the underlying test. The Absolute Category Rating (ACR) scale is very commonly used; it maps ratings between Bad and Excellent to numbers between 1 and 5 (Bad = 1, Poor = 2, Fair = 3, Good = 4, Excellent = 5). Other standardized quality rating scales exist in ITU-T Recommendations (such as ITU-T P.800 or ITU-T P.910). For example, one could use a continuous scale ranging between 1 and 100. Which scale is used depends on the purpose of the test. In certain contexts there are no statistically significant differences between ratings for the same stimuli when they are obtained using different scales.

    The MOS is calculated as the arithmetic mean over single ratings performed by human subjects for a given stimulus in a subjective quality evaluation test. Thus:

    $$\mathrm{MOS} = \frac{\sum_{n=1}^{N} R_n}{N}$$

    where $R_n$ are the individual ratings for a given stimulus by $N$ subjects.
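As an illustration of the formula above, the following minimal Python sketch averages a set of ACR ratings for one stimulus. The function name `mean_opinion_score`, the `ACR_SCALE` mapping, and the sample ratings are assumptions made for this example; they are not taken from any ITU-T Recommendation or library.

```python
# Absolute Category Rating (ACR) labels mapped to their numeric scores.
ACR_SCALE = {"Bad": 1, "Poor": 2, "Fair": 3, "Good": 4, "Excellent": 5}


def mean_opinion_score(ratings):
    """Return the arithmetic mean of the individual ratings R_n for one stimulus."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(ratings) / len(ratings)


# Hypothetical ratings from N = 5 subjects for a single stimulus.
labels = ["Good", "Fair", "Excellent", "Good", "Poor"]
scores = [ACR_SCALE[label] for label in labels]  # [4, 3, 5, 4, 2]
mos = mean_opinion_score(scores)
print(f"MOS = {mos:.2f}")  # prints "MOS = 3.60"
```

The sketch covers only the averaging step; a real subjective test would also need to record the scale, the test design, and the context in which the ratings were gathered.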
    == Properties of the MOS ==

    The MOS is subject to certain mathematical properties and biases. In general, there is an ongoing debate on the usefulness of the MOS to quantify Quality of Experience in a single scalar value.

    When the MOS is acquired using a categorical rating scale, it is based on an ordinal scale, similar to Likert scales. In this case, the ranking of the scale items is known, but the interval between them is not. Strictly speaking, it is therefore mathematically incorrect to calculate a mean over individual ratings to obtain the central tendency; the median should be used instead. However, in practice and in the definition of MOS, it is considered acceptable to calculate the arithmetic mean.

    It has been shown that for categorical rating scales (such as ACR), the individual items are not perceived as equidistant by subjects. For example, there may be a larger "gap" between Good and Fair than there is between Good and Excellent. The perceived distance may also depend on the language into which the scale is translated. However, some studies found no significant impact of scale translation on the obtained results.

    Several other biases are present in the way MOS ratings are typically acquired. In addition to the above-mentioned issues with scales that are perceived non-linearly, there is a so-called "range-equalization bias": subjects, over the course of a subjective experiment, tend to give scores that span the entire rating scale. This makes it impossible to compare two different subjective tests if the range of presented quality differs. In other words, the MOS is never an absolute measure of quality, but only relative to the test in which it has been acquired.

    For the above reasons, and due to several other contextual factors influencing the perceived quality in a subjective test, a MOS value should only be reported if the context in which the values were collected is known and reported as well.
    MOS values gathered from different contexts and test designs therefore should not be directly compared. Recommendation ITU-T P.800.2 prescribes how MOS values should be reported. Specifically, P.800.2 says: "it is not meaningful to directly compare MOS values produced from separate experiments, unless those experiments were explicitly designed to be compared, and even then the data should be statistically analysed to ensure that such a comparison is valid."

    == MOS for speech and audio quality estimation ==

    MOS historically originates from subjective measurements where listeners would sit in a "quiet room" and score the quality of a telephone call as they perceived it. This kind of test methodology had been in use in the telephony industry for decades and was standardized in Recommendation ITU-T P.800. It specifies that "the talker should be seated in a quiet room with volume between 30 and 120 m³ and a reverberation time less than 500 ms (preferably in the range 200–300 ms). The room noise level must be below 30 dBA with no dominant peaks in the spectrum." Requirements for other modalities were similarly specified in later ITU-T Recommendations.

    == MOS estimation using quality models ==

    Obtaining MOS ratings may be time-consuming and expensive, as it requires the recruitment of human assessors. For use cases such as codec development or service quality monitoring, where quality should be estimated repeatedly and automatically, MOS values can also be predicted by objective quality models, which typically have been developed and trained using human MOS ratings. A question that arises from using such models is whether the MOS differences they produce are noticeable to users. For example, when rating images on a five-point MOS scale, an image with a MOS equal to 5 is expected to be noticeably better in quality than one with a MOS equal to 1.
    By contrast, it is not evident whether an image with a MOS equal to 3.8 is noticeably better in quality than one with a MOS equal to 3.6. Research on the smallest MOS difference that is perceptible to users for digital photographs showed that a MOS difference of approximately 0.46 is required for 75% of users to be able to detect the higher-quality image. Nevertheless, expectations of image quality, and hence MOS values, change over time as user expectations change. As a result, minimum noticeable MOS differences determined using analytical methods may change over time.
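The comparison described above can be sketched as a simple threshold check. The function name `is_noticeable` is hypothetical, and the default threshold of 0.46 is the approximate figure quoted above for digital photographs; as noted, it may not hold for other content types or remain valid over time.

```python
def is_noticeable(mos_a, mos_b, threshold=0.46):
    """Return True if the gap between two MOS values reaches the threshold.

    The default threshold is the approximate value reported for digital
    photographs (detected by about 75% of users); treat it as an assumption.
    """
    return abs(mos_a - mos_b) >= threshold


print(is_noticeable(5.0, 1.0))  # prints "True": a gap of 4.0 is well above 0.46
print(is_noticeable(3.8, 3.6))  # prints "False": a gap of about 0.2 is below 0.46
```

Such a check is only meaningful when both MOS values come from the same test or from tests explicitly designed to be compared.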
