A-i Generator Reviews

A-i Generator Reviews — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Apache Parquet

    Apache Parquet

    Apache Parquet is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem inspired by Google Dremel interactive ad-hoc query system for analysis of read-only nested data. It is similar to RCFile and ORC, the other columnar-storage file formats in Hadoop, and is compatible with most of the data processing frameworks around Hadoop. It provides data compression and encoding schemes with enhanced performance to handle complex data in bulk. == History == The open-source project to build Apache Parquet began as a joint effort between Twitter and Cloudera using the record shredding and assembly algorithm as described in Google's Dremel. Parquet was designed as an improvement on the Trevni columnar storage format created by Doug Cutting, the creator of Hadoop. The name 'parquet' (lit. 'small compartment') refers to a style of decorative flooring and was chosen to "evoke the bottom layer of a database with an interesting layout". The first version, Apache Parquet 1.0, was released in July 2013. Since April 27, 2015, Apache Parquet has been a top-level Apache Software Foundation (ASF)-sponsored project. == Features == Apache Parquet is implemented using the record-shredding and assembly algorithm, which accommodates the complex data structures that can be used to store data. The values in each column are stored in contiguous memory locations, providing the following benefits: Column-wise compression is efficient in storage space Encoding and compression techniques specific to the type of data in each column can be used Queries that fetch specific column values need not read the entire row, thus improving performance Apache Parquet is implemented using the Apache Thrift framework, which increases its flexibility; it can work with a number of programming languages like C++, Java, Python, PHP, etc. 
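The benefit of storing each column's values contiguously can be illustrated with a minimal Python sketch. It is not Parquet's actual on-disk format and does not implement record shredding for nested data; the sample records, the field names and the `to_columns` helper are hypothetical and exist only for illustration.

```python
# Hypothetical sample records; the field names are illustrative only.
rows = [
    {"id": 1, "country": "DE", "amount": 9.5},
    {"id": 2, "country": "DE", "amount": 3.0},
    {"id": 3, "country": "FR", "amount": 7.25},
]


def to_columns(records):
    """Pivot a list of flat records into one contiguous list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# A query that needs only "amount" reads one list instead of every record.
total = sum(columns["amount"])
```

Because all values of a column sit next to each other and share a type, a per-column encoding or compression scheme can be chosen independently for each list.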
As of August 2015, Parquet supports big-data-processing frameworks including Apache Hive, Apache Drill, Apache Impala, Apache Crunch, Apache Pig, Cascading, Presto and Apache Spark. It is one of the external data formats used by the pandas Python data manipulation and analysis library. == Compression and encoding == In Parquet, compression is performed column by column, which enables different encoding schemes to be used for text and integer data. This strategy also keeps the door open for newer and better encoding schemes to be implemented as they are invented. Parquet supports various compression formats: snappy, gzip, LZO, brotli, zstd, and LZ4. === Dictionary encoding === Parquet has an automatic dictionary encoding enabled dynamically for data with a small number of unique values (i.e. below 10^5) that enables significant compression and boosts processing speed. === Bit packing === Storage of integers is usually done with dedicated 32 or 64 bits per integer. For small integers, packing multiple integers into the same space makes storage more efficient. === Run-length encoding (RLE) === To optimize storage of multiple occurrences of the same value, run-length encoding is used, in which a single value is stored once along with the number of occurrences. Parquet implements a hybrid of bit packing and RLE, in which the encoding switches based on which produces the best compression results. This strategy works well for certain types of integer data and combines well with dictionary encoding. == Cloud Storage and Data Lakes == Parquet is widely used as the underlying file format in modern cloud-based data lake architectures. Cloud storage systems such as Amazon S3, Azure Data Lake Storage, and Google Cloud Storage commonly store data in Parquet format due to its efficient columnar representation and retrieval capabilities. 
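The three techniques named under "Compression and encoding" (dictionary encoding, run-length encoding and bit packing) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the output is not Parquet's binary layout, and the logic Parquet uses to switch between RLE and bit packing is not reproduced here.

```python
def dictionary_encode(values):
    """Replace each value with an index into a list of distinct values."""
    dictionary, index_of, indices = [], {}, []
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        indices.append(index_of[value])
    return dictionary, indices


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def bit_pack(values, width):
    """Pack non-negative integers into one Python int, `width` bits each."""
    packed = 0
    for position, value in enumerate(values):
        if value < 0 or value >= (1 << width):
            raise ValueError("value does not fit in the given bit width")
        packed |= value << (position * width)
    return packed


def bit_unpack(packed, width, count):
    """Recover `count` integers of `width` bits each from a packed int."""
    mask = (1 << width) - 1
    return [(packed >> (position * width)) & mask for position in range(count)]


# Hypothetical column with few distinct values.
column = ["DE", "DE", "DE", "FR", "FR", "DE"]
dictionary, indices = dictionary_encode(column)
runs = run_length_encode(indices)

# Smallest bit width that can hold every dictionary index (at least 1).
width = max(1, (len(dictionary) - 1).bit_length())
packed = bit_pack(indices, width)
```

Here the six strings shrink to a two-entry dictionary plus either three runs or six one-bit indices, which shows why the encodings combine well on columns with few distinct values.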
Data lakehouse frameworks—including Apache Iceberg, Delta Lake, and Apache Hudi —build an additional metadata layer on top of Parquet files to support features such as schema evolution, time-travel queries, and ACID-compliant transactions. In these architectures, Parquet files serve as the immutable storage layer while the table formats manage data versioning and transactional integrity. == Comparison == Apache Parquet is comparable to RCFile and Optimized Row Columnar (ORC) file formats — all three fall under the category of columnar data storage within the Hadoop ecosystem. They all have better compression and encoding with improved read performance at the cost of slower writes. In addition to these features, Apache Parquet supports limited schema evolution, i.e., the schema can be modified according to the changes in the data. It also provides the ability to add new columns and merge schemas that do not conflict. Apache Arrow is designed as an in-memory complement to on-disk columnar formats like Parquet and ORC. The Arrow and Parquet projects include libraries that allow for reading and writing between the two formats. == Implementations == Known implementations of Parquet include:

  • Group (online social networking)

    Group (online social networking)

    A group (often termed a community, e-group or club) is a feature in many social networking services which allows users to create, post, comment to and read from their own interest- and niche-specific forums, often within the realm of virtual communities. Groups, which may allow for open or closed access, invitation and/or joining by other users outside the group, are formed to provide mini-networks within the larger, more diverse social network service. Much like electronic mailing lists, they are also owned and maintained by owners, moderators, or managers, who can edit posts to discussion threads and regulate member behavior within the group. However, unlike traditional Internet forums and mailing lists, groups in social networking services allow owners and moderators alike to share account credentials between groups without having to log in to every group. == History == The rise of the World Wide Web resulted in an expansion of the varieties of methods for communication on the Internet, much of which was limited in the 1980s to discussion in newsgroups, BBS and chat rooms. While the initial rise of web-based mass communication took place in the form of early Internet forums in the mid-1990s, a few services such as MSN Groups, Yahoo! Groups and eGroups pioneered the combination of web-based mailing list archives with user profiles; by 2000, such services doubled as full-fledged mailing lists and Internet forums, allowing users to create an extremely large variety of discussion and networking mediums with comparatively low thresholds of complexity. Further features included chat rooms (often Java-based), image and video galleries, and group calendars. The second spurt of e-group networking, one which was less dependent upon mailing list-related features and more upon Internet forum features, began in the early- to mid-2000s in the form of such services as LiveJournal, Friendster, MySpace and Facebook. 
These services continued the evolution of the web-based e-group as a discussion and organization medium. In the late 2000s, services such as Yammer and Micromobs further advanced e-group communication by taking advantage of microblog-style activity streams. == In virtual worlds == In Second Life, groups are centered less around discussion forums (as such, an asynchronous conferencing feature is not built into the Second Life network as of 2009) and common interest, and are more centered on maintenance of a particular geographic location inside the network. Such groups are often created by the owners of areas such as buildings, plots of land or whole islands in order to cater to the most frequent visitors and patrons of the regions. With the limited asynchronous messaging capability of Second Life, groups are also a means of mass-emailing announcements pertinent to the group, but are not completely capable of hosting discussion or deliberation of such announcement messages. == The importance of online social networking groups == Before people expanded their social life to the internet, they had small circles. These included the networks gained from rural areas or villages, such as family, friends and neighbors, and community groups such as churches. These networks represented a social safety net to support individuals. Since we have moved a huge part of our social life to the internet, online social networking groups have become a way to maintain a structure in social life. Online networking is made up of clusters of people binding themselves together on the World Wide Web. To sort out the many different clusters we belong to, we use online groups to help us arrange and make sense of all our contacts. This sense-making is rooted within us: we sort people into compartments or categories to understand our relationships to the people around us. 
Online social networking groups therefore enable us to do the same thing online. Online social networks have a huge impact on people’s lives. Because the social network revolution has given people looser ties and more diversity in their relationships, it creates both stress and opportunities. Furthermore, the Internet revolution has transformed the contact point from a household to the individual. In addition, people are in constant communication with each other due to the mobile revolution. All in all, the mentioned revolutions created a new social operating system: "networked individualism". The way that people currently connect, communicate and exchange information can be described as a form of operating system because of the similarities between the structure of computer systems and the networked individualism that has taken form in society. These structures consist of unwritten rules, norms, constraints and opportunities which are apparent for those who are part of a specific network. == Concerns == There is some research claiming that fake news is infiltrating online social networking. A recent study claimed that people exposed to fake news generally revert to their original opinion even after finding out the information they were given was false.

  • Algorithmic radicalization

    Algorithmic radicalization

    Algorithmic radicalization is the concept that recommender algorithms on popular social media sites, such as YouTube and Facebook, drive users toward progressively more extreme content over time, leading to the development of radicalized extremist political views. Algorithms meticulously record user interactions, encompassing likes, dislikes and the duration of time watching content, with the objective of generating an endless stream of media designed to sustain user engagement. The phenomenon of echo chamber channels has been demonstrated to exacerbate the polarization of consumers, primarily through the reinforcement of media preferences and the validation of one's existing beliefs. Algorithmic radicalization remains a controversial phenomenon as it is often not in the best interest of social media companies to remove echo chamber channels. To what extent recommender algorithms are actually responsible for radicalization remains disputed. Studies have found contradictory results regarding the promotion of extremist content by algorithms. == Social media echo chambers and filter bubbles == Social media platforms learn the interests and likes of the user to modify their experiences in their feed to keep them engaged and scrolling, known as a filter bubble. An echo chamber is formed when users come across beliefs that magnify or reinforce their thoughts and form a group of like-minded users in a closed system. Echo chambers spread information without any opposing beliefs and can possibly lead to confirmation bias. According to group polarization theory, an echo chamber can potentially lead users and groups towards more extreme radicalized positions. According to the National Library of Medicine, "Users online tend to prefer information adhering to their worldviews, ignore dissenting information, and form polarized groups around shared narratives. Furthermore, when polarization is high, misinformation quickly proliferates." 
== By site == === Facebook === Facebook's algorithm focuses on recommending content that makes the user want to interact. It ranks content by prioritizing popular posts by friends, viral content, and sometimes divisive content. Each feed is personalized to the user's specific interests, which can sometimes lead users towards an echo chamber of troublesome content. Users can find the list of interests the algorithm uses by going to the "Your ad Preferences" page. According to a Pew Research study, 74% of Facebook users did not know that list existed until they were directed towards that page in the study. It is also relatively common for Facebook to assign political labels to its users. In recent years, Facebook has started using artificial intelligence to change the content users see in their feed and what is recommended to them. A document known as The Facebook Files has revealed that its AI system prioritizes user engagement over everything else. The Facebook Files has also demonstrated that controlling the AI systems has proven difficult. In an August 2019 internal memo leaked in 2021, Facebook admitted that "the mechanics of our platforms are not neutral", concluding that in order to reach maximum profits, optimization for engagement is necessary. In order to increase engagement, algorithms have found that hate, misinformation, and politics are instrumental for app activity. As referenced in the memo, "The more incendiary the material, the more it keeps users engaged, the more it is boosted by the algorithm." According to a 2018 study, "false rumors spread faster and wider than true information... They found falsehoods are 70% more likely to be retweeted on Twitter than the truth, and reach their first 1,500 people six times faster. This effect is more pronounced with political news than other categories." === YouTube === YouTube has been around since 2005 and has more than 2.5 billion monthly users. 
YouTube discovery content systems focus on the user's personal activity (watched, favorites, likes) to direct them to recommended content. YouTube's algorithm is accountable for roughly 70% of users' recommended videos and what drives people to watch certain content. According to a 2022 study by the Mozilla Foundation, users have little power to keep unsolicited videos out of their suggested recommended content. This includes videos about hate speech, livestreams, etc. YouTube has been identified as an influential platform for spreading radicalized content. Al-Qaeda and similar extremist groups have been linked to using YouTube for recruitment videos and engaging with international media outlets. A research study published in the American Behavioral Scientist journal examined "whether it is possible to identify a set of attributes that may help explain part of the YouTube algorithm's decision-making process". The results of the study showed that YouTube's algorithm recommendations for extremist content factor in the presence of radical keywords in a video's title. In February 2023, in the case of Gonzalez v. Google, the question at hand was whether or not Google, the parent company of YouTube, is protected from lawsuits claiming that the site's algorithms aided terrorists in recommending ISIS videos to users. Section 230 is known to generally protect online platforms from civil liability for the content posted by their users. Multiple studies have found little to no evidence to suggest that YouTube's algorithms direct attention towards far-right content to those not already engaged with it. === TikTok === TikTok is a platform that recommends videos to a user's 'For You Page' (FYP), making every user's page different. With the nature of the algorithm behind the app, TikTok's FYP has been linked to showing more explicit and radical videos over time based on users' previous interactions on the app. 
Since TikTok's inception, the app has been scrutinized for misinformation and hate speech as those forms of media usually generate more interactions to the algorithm. Various extremist groups, including jihadist organizations, have utilized TikTok to disseminate propaganda, recruit followers, and incite violence. The platform's algorithm, which recommends content based on user engagement, can expose users to extremist content that aligns with their interests or interactions. As of 2022, TikTok's head of US Security has put out a statement that "81,518,334 videos were removed globally between April – June for violating our Community Guidelines or Terms of Service" to cut back on hate speech, harassment, and misinformation. Studies have noted instances where individuals were radicalized through content encountered on TikTok. For example, in early 2023, Austrian authorities thwarted a plot against an LGBTQ+ pride parade that involved two teenagers and a 20-year-old who were inspired by jihadist content on TikTok. The youngest suspect, 14 years old, had been exposed to videos created by Islamist influencers glorifying jihad. These videos led him to further engagement with similar content, eventually resulting in his involvement in planning an attack. Another case involved the arrest of several teenagers in Vienna, Austria, in 2024, who were planning to carry out a terrorist attack at a Taylor Swift concert. The investigation revealed that some of the suspects had been radicalized online, with TikTok being one of the platforms used to disseminate extremist content that influenced their beliefs and actions. == Self-radicalization == The U.S. Department of Justice defines 'Lone-wolf' (self) terrorism as "someone who acts alone in a terrorist attack without the help or encouragement of a government or a terrorist organization". Through social media outlets on the internet, 'Lone-wolf' terrorism has been on the rise, being linked to algorithmic radicalization. 
Through echo-chambers on the internet, viewpoints typically seen as radical were accepted and quickly adopted by other extremists. These viewpoints are encouraged by forums, group chats, and social media to reinforce their beliefs. == References in media == === The Social Dilemma === The Social Dilemma is a 2020 docudrama about how the algorithms behind social media enable addiction, while possessing abilities to manipulate people's views, emotions, and behavior to spread conspiracy theories and disinformation. The film repeatedly uses buzzwords such as 'echo chambers' and 'fake news' to prove psychological manipulation on social media, therefore leading to political manipulation. In the film, Ben falls deeper into a social media addiction as the algorithm finds that his social media page has a 62.3% chance of long-term engagement. This leads to more videos on the recommended feed for Ben, and he eventually becomes more immersed in propaganda and conspiracy theories, becoming more polarized with each video. == Proposed solutions == === United States: Weakening Section 230 protections === In the Communications Decency Act, Section 230 states t

  • Telecommunications device for the deaf

    Telecommunications device for the deaf

    A telecommunications device for the deaf (TDD) is a teleprinter, an electronic device for text communication over a telephone line, that is designed for use by persons with hearing or speech difficulties. Other names for the device include teletypewriter (TTY), textphone (common in Europe), and minicom (United Kingdom). The typical TDD is a device about the size of a typewriter or laptop computer with a QWERTY keyboard and small screen that uses an LED, LCD, or VFD screen to display typed text electronically. In addition, TDDs commonly have a small spool of paper on which text is also printed – old versions of the device had only a printer and no screen. The text is transmitted live, via a telephone line, to a compatible device, i.e. one that uses a similar communication protocol. Special telephone services have been developed to carry the TDD functionality even further. In certain countries, there are systems in place so that a deaf person can communicate with a hearing person on an ordinary voice phone using a human relay operator. There are also "carry-over" services, enabling people who can hear but cannot speak ("hearing carry-over", a.k.a. "HCO"), or people who cannot hear but are able to speak ("voice carry-over", a.k.a. "VCO") to use the telephone. The term TDD is sometimes discouraged because people who are deaf are increasingly using mainstream devices and technologies to carry out most of their communication. The devices described here were developed for use on the partially-analog Public Switched Telephone Network (PSTN). They do not work well on the new internet protocol (IP) networks. Thus as society increasingly moves toward IP based telecommunication, the telecommunication devices used by people who are deaf will not be TDDs. In the US and Canada, the devices are referred to as TTYs. 
Teletype Corporation, of Skokie, Illinois, made page printers for text, notably for news wire services and telegrams, but these used standards different from those for deaf communication, and although in quite widespread use, were technically incompatible. Furthermore, these were sometimes referred to by the "TTY" initialism, short for "Teletype". When computers had keyboard input mechanisms and page printer output, before CRT terminals came into use, Teletypes were the most widely used devices. They were called "console typewriters". (Telex used similar equipment, but was a separate international communication network.) == History == === APCOM acoustic coupler or MODEM device === The TDD concept was developed by James C. Marsters (1924–2009), a dentist and private airplane pilot who became deaf as an infant because of scarlet fever, and Robert Weitbrecht, a deaf physicist. In 1964, Marsters, Weitbrecht and Andrew Saks, an electrical engineer and grandson of the founder of the Saks Fifth Avenue department store chain, founded APCOM (Applied Communications Corp.), located in the San Francisco Bay area, to develop the acoustic coupler, or modem; their first product was named the PhoneType. APCOM collected old teleprinter machines (TTYs) from the Department of Defense and junkyards. Acoustic couplers were cabled to TTYs enabling the AT&T standard Model 500 telephone to couple, or fit, into the rubber cups on the coupler, thus allowing the device to transmit and receive a unique sequence of tones generated by the different corresponding TTY keys. The entire configuration of teleprinter machine, acoustic coupler, and telephone set became known as the TTY. Weitbrecht invented the acoustic coupler modem in 1964. The actual mechanism for TTY communications was accomplished electro-mechanically through frequency-shift keying (FSK) allowing only half-duplex communication, where only one person at a time can transmit. 
=== Paul Taylor TTY device === During the late 1960s, Paul Taylor combined Western Union Teletype machines with modems to create teletypewriters, known as TTYs. He distributed these early, non-portable devices to the homes of many in the deaf community in St. Louis, Missouri. He worked with others to establish a local telephone wake-up service. In the early 1970s, these small successes in St. Louis evolved into the nation's first local telephone relay system for the deaf. === Micon Industries MCM device === In 1973, the Manual Communications Module (MCM), which was the world's first electronic portable TTY allowing two-way telecommunications, premiered at the California Association of the Deaf convention in Sacramento, California. The battery-powered MCM was invented and designed by a deaf news anchor and interpreter, Kit Patrick Corson, in conjunction with Michael Cannon and physicist Art Ogawa. It was manufactured by Michael Cannon's company, Micon Industries, and initially marketed by Kit Corson's company, Silent Communications. In order to be compatible with the existing TTY network, the MCM was designed around the five-bit Baudot code established by the older TTY machines instead of the ASCII code used by computers. The MCM was an instant success with the deaf community despite the drawback of a $599 cost. Within six months there were more MCMs in use by the deaf and hard of hearing than TTY machines. After a year Micon took over the marketing of the MCM and subsequently concluded a deal with Pacific Bell (who coined the term "TDD") to purchase MCMs and rent them to deaf telephone subscribers for $30 per month. After Micon formed an alliance with APCOM, Michael Cannon (Micon), Paul Conover (Micon), and Andrea Saks (APCOM) successfully petitioned the California Public Utilities Commission (CPUC), resulting in a tariff that paid for TTY devices to be distributed free of cost to deaf persons. 
Micon produced over 1,000 MCMs per month, resulting in approximately 50,000 MCMs being disseminated into the deaf community. Before he left Micon in 1980, Michael Cannon developed several computer compatible variations of the MCM and a portable, battery operated printing TTY, but they were never as popular as the original MCM. Newer model TTYs could communicate with selectable codes that allow communications at a higher bit rate on those models similarly equipped. However, the lack of true computer interface functionality spelled the demise of the original TTY and its clones. During the mid-1970s, other so-called portable telephone devices were being cloned by other companies, and this was the time period when the term "TDD" began being used largely by those outside the deaf community. === Text messaging and the Def-Tone System (DTS) === This relay system became known commonly as the Def-Tone System (DTS) because the tones representing letters of the alphabet were eventually carried in tones outside the range of human hearing. Today, this is commonly called multi-tap because you press a number 1, 2 or 3 times to get a corresponding letter. In 1994 Joseph Alan Poirier, a college student-worker, recommended using the system to send texts to forklifts to improve delivery of parts to the assembly line at GM Powertrain in Toledo, Ohio, and sending a text to pagers. He recommended taking pagers to alphanumeric displays incorporating the same system in discussions with the pager supplier for Outback Steakhouse and having relays put in the forklifts to ping alert messages to the pagers used in that system. He called it text messaging, coining the phrase. It is theorized that when Toyota forklift was allegedly hired by GM for this work, one of the subcontractors, Kyocera, utilized the work for the Toyota forklift company to create text messaging for cell phones. === Marsters Award === In 2009, AT&T received the James C. 
Marsters Promotion Award from TDI (formerly Telecommunications for the Deaf, Inc.) for its efforts to increase accessibility to communication for people with disabilities. The award holds some irony; it was AT&T that, in the 1960s, resisted efforts to implement TTY technology, claiming it would damage its communication equipment. In 1968, the Federal Communications Commission struck down AT&T's policy and forced it to offer TTY access to its network. == Protocols == There are many different standards for TDDs and textphones. === Original 5-bit Baudot code === The original standard used by TTYs is a variant of the Baudot code. The maximum speed of this protocol is 10 characters per second. This is a half-duplex protocol, which means that only one person at a time may transmit characters. If both try to transmit at the same time, the characters will be garbled on the other end. This protocol is commonly used in the United States. This is a variant of the Baudot code, implemented as 5-bits per character transmitted asynchronously using frequency-shift key-modulation at either 45.5 or 50 baud, 1 start bit, 5 data bits, and 1.5 stop bits. Details of the protocol implementation are available in TIA-825-A and also in T-REC V.18 Annex A "5-bit operational mode". === Turbo Code === The UltraTec company implements another protocol known as Enh

  • Speech recognition

    Speech recognition

    Speech recognition (automatic speech recognition (ASR), computer speech recognition, or speech-to-text (STT)) is a sub-field of computational linguistics concerned with methods and technologies that translate spoken language into text or other interpretable forms. Speech recognition applications include voice user interfaces, where the user speaks to a device, which "listens" and processes the audio. Common voice applications include interpreting commands for calling, call routing, home automation, and aircraft control. These applications are called direct voice input. Productivity applications include searching audio recordings, creating transcripts, and dictation. Speech recognition can be used to analyse speaker characteristics, such as identifying native language using pronunciation assessment. Voice recognition (speaker identification) refers to identifying the speaker, rather than speech contents. Recognizing the speaker can simplify the task of translating speech in systems trained on a specific person's voice. It can also be used to authenticate the speaker as part of a security process. == History == Applications for speech recognition developed over many decades, with progress accelerated due to advances in deep learning and the use of big data. These advances are reflected in an increase in academic papers, and greater system adoption. Key areas of growth include vocabulary size, more accurate recognition for unfamiliar speakers (speaker independence), and faster processing speed. === Pre-1970 === 1952 – Bell Labs researchers, Stephen Balashek, R. Biddulph, and K. H. Davis, built Audrey for single-speaker digit recognition. Their system located the formants in the power spectrum of each utterance. 1960 – Gunnar Fant developed and published the source–filter model of speech production. 1962 – IBM's 16-word "Shoebox" machine's speech recognition debuted at the 1962 World's Fair. 
1966 – Linear predictive coding, a speech coding method, was proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone. 1969 – Funding at Bell Labs came to a halt for several years after the company's head engineer, John R. Pierce, wrote an open letter criticizing speech recognition research. This defunding lasted until Pierce retired and James L. Flanagan took over. Raj Reddy was the first person to work on continuous speech recognition, as a graduate student at Stanford University in the late 1960s. Previous systems required users to pause after each word. Reddy's system issued spoken commands for playing chess. Around this time, Soviet researchers invented the dynamic time warping (DTW) algorithm and used it to create a recognizer capable of operating on a 200-word vocabulary. DTW processed speech by dividing it into short frames (e.g. 10 ms segments) and treating each frame as a unit. Speaker independence, however, remained unsolved. === 1970–1990 === 1971 – DARPA funded a five-year speech recognition research project, Speech Understanding Research, seeking a minimum vocabulary size of 1,000 words. The project considered speech understanding a key to achieving progress in speech recognition, which was later disproved. BBN, IBM, Carnegie Mellon (CMU), and Stanford Research Institute participated. 1972 – The IEEE Acoustics, Speech, and Signal Processing group held a conference in Newton, Massachusetts. 1976 – The first ICASSP was held in Philadelphia, which became a major venue for publishing on speech recognition. During the late 1960s, Leonard Baum developed the mathematics of Markov chains at the Institute for Defense Analysis. A decade later, at CMU, Raj Reddy's students James Baker and Janet M. Baker began using the hidden Markov model (HMM) for speech recognition. James Baker had learned about HMMs while at the Institute for Defense Analysis. 
HMMs enabled researchers to combine sources of knowledge, such as acoustics, language, and syntax, in a unified probabilistic model. By the mid-1980s, Fred Jelinek's team at IBM created a voice-activated typewriter called Tangora, which could handle a 20,000-word vocabulary. Jelinek's statistical approach placed less emphasis on emulating human brain processes in favor of statistical modelling. (Jelinek's group independently discovered the application of HMMs to speech.) This was controversial among linguists since HMMs are too simplistic to account for many features of human languages. However, the HMM proved to be a highly useful way for modelling speech and replaced dynamic time warping as the dominant speech recognition algorithm in the 1980s. 1982 – Dragon Systems, founded by James and Janet M. Baker, was one of IBM's few competitors. === Practical speech recognition === The 1980s also saw the introduction of the n-gram language model. 1987 – The back-off model enabled language models to use multiple-length n-grams, and CSELT used HMM to recognize languages (in software and hardware, e.g. RIPAC). At the end of the DARPA program in 1976, the best computer available to researchers was the PDP-10 with 4 MB of RAM. It could take up to 100 minutes to decode 30 seconds of speech. Practical products included: 1984 – the Apricot Portable was released with up to 4096 words support, of which only 64 could be held in RAM at a time. 1987 – a recognizer from Kurzweil Applied Intelligence 1990 – Dragon Dictate, a consumer product released in 1990. AT&T deployed the Voice Recognition Call Processing service in 1992 to route telephone calls without a human operator. The technology was developed by Lawrence Rabiner and others at Bell Labs. By the early 1990s, the vocabulary of the typical commercial speech recognition system had exceeded the average human vocabulary. Reddy's former student, Xuedong Huang, developed the Sphinx-II system at CMU. 
Sphinx-II was the first to do speaker-independent, large vocabulary, continuous speech recognition, and it won DARPA's 1992 evaluation. Handling continuous speech with a large vocabulary was a major milestone. Huang later founded the speech recognition group at Microsoft in 1993. Reddy's student Kai-Fu Lee joined Apple, where, in 1992, he helped develop the Casper speech interface prototype. Lernout & Hauspie, a Belgium-based speech recognition company, acquired other companies, including Kurzweil Applied Intelligence in 1997 and Dragon Systems in 2000. L&H was used in Windows XP. L&H was an industry leader until an accounting scandal destroyed it in 2001. L&H speech technology was bought by ScanSoft, which became Nuance in 2005. Apple licensed Nuance software for its digital assistant Siri. ==== 2000s ==== In the 2000s, DARPA sponsored two speech recognition programs: Effective Affordable Reusable Speech-to-Text (EARS) in 2002, followed by Global Autonomous Language Exploitation (GALE) in 2005. Four teams participated in EARS: IBM; a team led by BBN with LIMSI and the University of Pittsburgh; Cambridge University; and a team composed of ICSI, SRI, and the University of Washington. EARS funded the collection of the Switchboard telephone speech corpus, which contained 260 hours of recorded conversations from over 500 speakers. The GALE program focused on Arabic and Mandarin broadcast news. Google's first effort at speech recognition came in 2007 after recruiting Nuance researchers. Its first product, GOOG-411, was a telephone-based directory service. Since at least 2006, the U.S. National Security Agency has employed keyword spotting, allowing analysts to index large volumes of recorded conversations and identify speech containing "interesting" keywords. Other government research programs focused on intelligence applications, such as DARPA's EARS program and IARPA's Babel program. 
In the early 2000s, speech recognition was dominated by hidden Markov models combined with feed-forward artificial neural networks (ANN). Later, speech recognition was taken over by long short-term memory (LSTM), a type of recurrent neural network (RNN) published by Sepp Hochreiter and Jürgen Schmidhuber in 1997. LSTM RNNs avoid the vanishing gradient problem and can learn "Very Deep Learning" tasks that require memories of events that happened thousands of discrete time steps earlier, which is important for speech. Around 2007, LSTMs trained with Connectionist Temporal Classification (CTC) began to outperform earlier approaches. In 2015, Google reported a 49 percent error-rate reduction in its speech recognition via CTC-trained LSTM. Transformers, a type of neural network based solely on attention, were adopted in computer vision and language modelling, and then in speech recognition. Deep feed-forward (non-recurrent) networks for acoustic modelling were introduced in 2009 by Geoffrey Hinton and his students at the University of Toronto, and by Li Deng and colleagues at Microsoft Research. In contrast to the prior incremental improvements, deep learning decreased error rates by 30%. Both shallow and deep forms (e.g., recurrent nets) of ANNs had been explored since the 1980s.

  • Harmonic

    Harmonic

    In physics, acoustics, and telecommunications, a harmonic is a sinusoidal wave with a frequency that is a positive integer multiple of the fundamental frequency of a periodic signal. The fundamental frequency is also called the 1st harmonic; the other harmonics are known as higher harmonics. As all harmonics are periodic at the fundamental frequency, the sum of harmonics is also periodic at that frequency. The set of harmonics forms a harmonic series. The term is employed in various disciplines, including music, physics, acoustics, electronic power transmission, radio technology, and other fields. For example, if the fundamental frequency is 50 Hz, a common AC power supply frequency, the frequencies of the first three higher harmonics are 100 Hz (2nd harmonic), 150 Hz (3rd harmonic), and 200 Hz (4th harmonic), and any addition of waves with these frequencies is periodic at 50 Hz. An nth characteristic mode, for n > 1, will have nodes that are not vibrating. For example, the 3rd characteristic mode will have nodes at L/3 and 2L/3, where L is the length of the string. In fact, each nth characteristic mode, for n not a multiple of 3, will not have nodes at these points. These other characteristic modes will be vibrating at the positions L/3 and 2L/3. If the player gently touches one of these positions, then these other characteristic modes will be suppressed. The tonal harmonics from these other characteristic modes will then also be suppressed. Consequently, the tonal harmonics from the nth characteristic modes, where n is a multiple of 3, will be made relatively more prominent.
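The arithmetic in the paragraph above can be checked with a short Python sketch. It assumes an ideal string, where the nth mode has interior nodes at k/n of the string length; the function names are illustrative and do not come from any library.

```python
from fractions import Fraction


def harmonic_frequencies(fundamental_hz, count):
    """Return the first `count` harmonics; the 1st harmonic is the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


def node_positions(mode):
    """Interior node positions of a mode on an ideal string, as fractions of its length."""
    return [Fraction(k, mode) for k in range(1, mode)]


def modes_surviving_touch(position, max_mode):
    """Modes up to `max_mode` that have a node at `position`.

    Lightly touching the string there suppresses every other mode.
    """
    return [n for n in range(1, max_mode + 1) if position in node_positions(n)]


print(harmonic_frequencies(50, 4))               # [50, 100, 150, 200]
print(node_positions(3))                         # [Fraction(1, 3), Fraction(2, 3)]
print(modes_surviving_touch(Fraction(1, 3), 9))  # [3, 6, 9]
```

Exact fractions are used instead of floats so that positions such as 2/6 and 1/3 compare as equal.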
In music, harmonics are used on string instruments and wind instruments as a way of producing sound on the instrument, particularly to play higher notes and, with strings, obtain notes that have a unique sound quality or "tone colour". On strings, bowed harmonics have a "glassy", pure tone. On stringed instruments, harmonics are played by touching (but not fully pressing down the string) at an exact point on the string while sounding the string (plucking, bowing, etc.); this allows the harmonic to sound, a pitch which is always higher than the fundamental frequency of the string. == Terminology == Harmonics may be called "overtones", "partials", or "upper partials", and in some music contexts, the terms "harmonic", "overtone" and "partial" are used fairly interchangeably. But more precisely, the term "harmonic" includes all pitches in a harmonic series (including the fundamental frequency) while the term "overtone" only includes pitches above the fundamental. == Characteristics == A whizzing, whistling tonal character, distinguishes all the harmonics both natural and artificial from the firmly stopped intervals; therefore their application in connection with the latter must always be carefully considered. Most acoustic instruments emit complex tones containing many individual partials (component simple tones or sinusoidal waves), but the untrained human ear typically does not perceive those partials as separate phenomena. Rather, a musical note is perceived as one sound, the quality or timbre of that sound being a result of the relative strengths of the individual partials. 
Many acoustic oscillators, such as the human voice or a bowed violin string, produce complex tones that are more or less periodic, and thus are composed of partials that are nearly matched to the integer multiples of fundamental frequency and therefore resemble the ideal harmonics and are called "harmonic partials" or simply "harmonics" for convenience (although it's not strictly accurate to call a partial a harmonic, the first being actual and the second being theoretical). Oscillators that produce harmonic partials behave somewhat like one-dimensional resonators, and are often long and thin, such as a guitar string or a column of air open at both ends (as with the metallic modern orchestral transverse flute). Wind instruments whose air column is open at only one end, such as trumpets and clarinets, also produce partials resembling harmonics. However they only produce partials matching the odd harmonics—at least in theory. In practical use, no real acoustic instrument behaves as perfectly as the simplified physical models predict; for example, instruments made of non-linearly elastic wood, instead of metal, or strung with gut instead of brass or steel strings, tend to have not-quite-integer partials. Partials whose frequencies are not integer multiples of the fundamental are referred to as inharmonic partials. Some acoustic instruments emit a mix of harmonic and inharmonic partials but still produce an effect on the ear of having a definite fundamental pitch, such as pianos, strings plucked pizzicato, vibraphones, marimbas, and certain pure-sounding bells or chimes. Antique singing bowls are known for producing multiple harmonic partials or multiphonics. Other oscillators, such as cymbals, drum heads, and most percussion instruments, naturally produce an abundance of inharmonic partials and do not imply any particular pitch, and therefore cannot be used melodically or harmonically in the same way other instruments can. 
Building on the work of Sethares (2004), dynamic tonality introduces the notion of pseudo-harmonic partials, in which the frequency of each partial is aligned to match the pitch of a corresponding note in a pseudo-just tuning, thereby maximizing the consonance of that pseudo-harmonic timbre with notes of that pseudo-just tuning. == Partials, overtones, and harmonics == An overtone is any partial higher than the lowest partial in a compound tone. The relative strengths and frequency relationships of the component partials determine the timbre of an instrument. The similarity between the terms overtone and partial sometimes leads to their being loosely used interchangeably in a musical context, but they are counted differently, leading to some possible confusion. In the special case of instrumental timbres whose component partials closely match a harmonic series (such as with most strings and winds) rather than being inharmonic partials (such as with most pitched percussion instruments), it is also convenient to call the component partials "harmonics", but not strictly correct, because harmonics are numbered the same even when missing, while partials and overtones are only counted when present. Assuming that all the harmonics are present, the three types of names (partial, overtone, and harmonic) are counted as follows: the fundamental is both the 1st partial and the 1st harmonic, and each higher partial n is the nth harmonic and the (n − 1)th overtone. In many musical instruments, it is possible to play the upper harmonics without the fundamental note being present. In a simple case (e.g., recorder) this has the effect of making the note go up in pitch by an octave, but in more complex cases many other pitch variations are obtained. In some cases it also changes the timbre of the note. This is part of the normal method of obtaining higher notes in wind instruments, where it is called overblowing. The extended technique of playing multiphonics also produces harmonics.
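The counting of partials, overtones, and harmonics discussed in this section can be sketched in a few lines of Python. The sketch assumes every harmonic is present, and the 440 Hz fundamental is an arbitrary illustrative choice rather than a value taken from the text.

```python
def ordinal(n):
    """Format an integer as an English ordinal, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def naming_table(fundamental_hz, count):
    """Rows of (frequency, partial name, overtone name, harmonic name).

    Assumes all harmonics are present, so the nth partial is the nth harmonic
    and the (n - 1)th overtone; the lowest partial is the fundamental tone.
    """
    rows = []
    for n in range(1, count + 1):
        overtone = "fundamental tone" if n == 1 else f"{ordinal(n - 1)} overtone"
        rows.append((n * fundamental_hz, f"{ordinal(n)} partial", overtone, f"{ordinal(n)} harmonic"))
    return rows


for row in naming_table(440, 4):
    print(row)
```

If a harmonic were missing from the tone, the harmonic numbers would stay the same while the partial and overtone numbers above the gap would each drop by one, which is the source of the confusion the text mentions.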
On string instruments it is possible to produce very pure sounding notes, called harmonics or flageolets by string players, which have an eerie quality, as well as being high in pitch. Harmonics may be used to check at a unison the tuning of strings that are not tuned to the unison. For example, lightly fingering the node found halfway down the highest string of a cello produces the same pitch as lightly fingering the node 1/3 of the way down the second highest string. For the human voice see Overtone singing, which uses harmonics. While it is true that electronically produced periodic tones (e.g. square waves or other non-sinusoidal waves) have "harmonics" that are whole number multiples of the fundamental frequency, practical instruments do not all have this characteristic. For example, higher "harmonics" of piano notes are not true harmonics but are "overtones" and can be very sharp, i.e. a higher frequency than given by a pure harmonic series. This is especially true of instruments other than strings, brass, or woodwinds. Examples of these "other" instruments are xylophones, drums, bells, chimes, etc.; not all of their overtone frequencies make a simple whole number ratio with the fundamental frequency. (The fundamental frequency is the reciprocal of the longest time period of the collection of vibrations in some single periodic phenomenon.) == On stringed instruments == Harmonics may be singly produced [on stringed instruments] (1) by varying the point of contact with the bow, or (2) by slightly pressing the string at the nodes, or divisions of its aliquot parts (1/2, 1

  • Enterprise bookmarking

    Enterprise bookmarking

    Enterprise bookmarking is a method for Web 2.0 users to tag, organize, store, and search bookmarks of both web pages on the Internet and data resources stored in a distributed database or fileserver. This is done collectively and collaboratively in a process by which users add tag (metadata) and knowledge tags. In early versions of the software, these tags are applied as non-hierarchical keywords, or terms assigned by a user to a web page, and are collected in tag clouds. Examples of this software are Connectbeam and Dogear. New versions of the software such as Jumper 2.0 and Knowledge Plaza expand tag metadata in the form of knowledge tags that provide additional information about the data and are applied to structured and semi-structured data and are collected in tag profiles. == History == Enterprise bookmarking is derived from Social bookmarking that got its modern start with the launch of the website del.icio.us in 2003. The first major announcement of an enterprise bookmarking platform was the IBM Dogear project, developed in Summer 2006. Version 1.0 of the Dogear software was announced at Lotusphere 2007, and shipped later that year on June 27 as part of IBM Lotus Connections. The second significant commercial release was Cogenz in September 2007. Since these early releases, Enterprise bookmarking platforms have diverged considerably. The most significant new release was the Jumper 2.0 platform, with expanded and customizable knowledge tagging fields. == Differences == === Versus social bookmarking === In a social bookmarking system, individuals create personal collections of bookmarks and share their bookmarks with others. These centrally stored collections of Internet resources can be accessed by other users to find useful resources. Often these lists are publicly accessible, so that other people with similar interests can view the links by category or by the tags themselves. 
Most social bookmarking sites allow users to search for bookmarks which are associated with given "tags", and rank the resources by the number of users which have bookmarked them. Enterprise bookmarking is a method of tagging and linking any information using an expanded set of tags to capture knowledge about data. It collects and indexes these tags in a web-infrastructure knowledge base server residing behind the firewall. Users can share knowledge tags with specified people or groups, shared only inside specific networks, typically within an organization. Enterprise bookmarking is a knowledge management discipline that embraces Enterprise 2.0 methodologies to capture specific knowledge and information that organizations consider proprietary and are not shared on the public Internet. === Tag management === Enterprise bookmarking tools also differ from social bookmarking tools in the way that they often face an existing taxonomy. Some of these tools have evolved to provide Tag management which is the combination of uphill abilities (e.g. faceted classification, predefined tags, etc.) and downhill gardening abilities (e.g. tag renaming, moving, merging) to better manage the bottom-up folksonomy generated from user tagging.
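The two mechanisms described above, ranking bookmarks by how many users saved them and "gardening" a folksonomy by merging tags, can be illustrated with a small in-memory sketch. The `TagIndex` class, its method names, and the example URLs are hypothetical and do not reflect the design of any product named in this article.

```python
from collections import defaultdict


class TagIndex:
    """Toy tag store: tag -> URL -> set of users who applied that tag."""

    def __init__(self):
        self._users = defaultdict(lambda: defaultdict(set))

    def add(self, user, url, tag):
        """Record that `user` bookmarked `url` under `tag`."""
        self._users[tag][url].add(user)

    def search(self, tag):
        """URLs carrying `tag`, most-bookmarked first (ties broken by URL)."""
        urls = self._users.get(tag, {})
        return sorted(urls, key=lambda u: (-len(urls[u]), u))

    def merge(self, source, target):
        """Fold `source` into `target`; renaming is a merge into an unused tag."""
        for url, users in self._users.pop(source, {}).items():
            self._users[target][url] |= users


index = TagIndex()
index.add("ana", "https://wiki.example/vpn", "networking")
index.add("ben", "https://wiki.example/vpn", "network")
index.add("ben", "https://wiki.example/dns", "networking")
index.merge("network", "networking")
print(index.search("networking"))
```

After the merge, the VPN page is counted once per distinct user across both former tags, so it ranks ahead of the DNS page.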

  • Horus Music

    Horus Music

    Horus Music Limited is a global digital distribution and label services company. Established in 2006, Horus Music allows artists, labels and right-holders to send their music to over 200 download, streaming, and interactive platforms including iTunes, Google Play, Amazon, VEVO, 7digital, Spotify, Beatport, Deezer, and Tidal, and also offers digital marketing and playlisting opportunities. == History == The company were named Best Business Partner of 2014 by Huawei Technology of China, and were also a finalist in the International Trade category as part of the Leicester Mercury Business Awards during that same year. Their client base consists of unsigned and independent musicians and record labels, as well as well-known recording artists. In November 2015, Horus Music sponsored the UK's first Independent Label Week, in order to highlight the music that is released by the UK's indie labels. In 2016, Horus Music celebrated their 10th anniversary. Horus Music's sister companies Help for Bands and Help For Writers provide advice and opportunities for musicians and e-book distribution for writers, respectively. Anara Publishing opened in 2017, allowing the company to work closely with a handpicked roster of musicians to provide royalty administration and sync licensing services. On 21 April 2017, Her Majesty Queen Elizabeth II's 91st birthday, Horus Music was awarded the Queen's Award for Enterprise in International Trade. In 2021, Horus Music, UnitedMasters, and Symphonic Distribution partnered with the music fintech company beatBread to offer clients access to more capital. beatBread's chordCashAI technology provides an automated advance experience for independent musicians while enabling clients to choose their own terms and retain ownership of their music.
== Clients == Horus Music has partnered with a number of charities, including Save the Children, for the recording "Look into Your Heart", featuring Beverley Knight with the Rolling Stones' Mick Jagger and Ronnie Wood, with 100% of proceeds from the single donated to the charity; The Pixel Project, which produced songs about violence against women; and the blood cancer charity Bloodwise. The company have spoken openly about the state of the music industry and artists' rights and were one of the first distributors to remove their catalogue from Rdio after the streaming service was acquired by Pandora. Their relationships with artists and labels, as well as leading industry contacts, mean they have the ability to work with musicians in a myriad of ways, including offering performance opportunities and even local auditions for TV shows such as The Voice UK. == Horus Music India == Horus Music India opened in 2016 and is based in Mumbai. By opening Horus Music India, the company are able to expand on their local connections as well as provide a much more personalised service to musicians based in this area. The appointment of two Business Development Managers in India cemented their move.

  • Shepp–Logan phantom

    Shepp–Logan phantom

    The Shepp–Logan phantom is a standard test image created by Larry Shepp and Benjamin F. Logan for their 1974 paper "The Fourier Reconstruction of a Head Section". It serves as the model of a human head in the development and testing of image reconstruction algorithms. == Definition == The function describing the phantom is defined as the sum of 10 ellipses inside a 2×2 square:
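A minimal pure-Python sketch of this definition follows. The ellipse parameters are the commonly reproduced values for the original phantom, quoted here from memory, so they should be checked against the 1974 paper before being relied on. The rotation sign convention and the function names are assumptions made for illustration.

```python
import math

# (gray level, semi-axis a, semi-axis b, center x, center y, rotation in degrees)
# Commonly cited values; verify against the original paper before use.
ELLIPSES = [
    (2.00, 0.6900, 0.9200, 0.00, 0.0000, 0),
    (-0.98, 0.6624, 0.8740, 0.00, -0.0184, 0),
    (-0.02, 0.1100, 0.3100, 0.22, 0.0000, -18),
    (-0.02, 0.1600, 0.4100, -0.22, 0.0000, 18),
    (0.01, 0.2100, 0.2500, 0.00, 0.3500, 0),
    (0.01, 0.0460, 0.0460, 0.00, 0.1000, 0),
    (0.01, 0.0460, 0.0460, 0.00, -0.1000, 0),
    (0.01, 0.0460, 0.0230, -0.08, -0.6050, 0),
    (0.01, 0.0230, 0.0230, 0.00, -0.6060, 0),
    (0.01, 0.0230, 0.0460, 0.06, -0.6050, 0),
]


def phantom_value(x, y):
    """Sum the gray levels of every ellipse that contains the point (x, y)."""
    total = 0.0
    for gray, a, b, x0, y0, degrees in ELLIPSES:
        theta = math.radians(degrees)
        dx, dy = x - x0, y - y0
        # Rotate the offset into the ellipse's own axes (assumed convention).
        u = dx * math.cos(theta) + dy * math.sin(theta)
        v = -dx * math.sin(theta) + dy * math.cos(theta)
        if (u / a) ** 2 + (v / b) ** 2 <= 1.0:
            total += gray
    return total


def phantom_image(n):
    """Sample the phantom at pixel centers of an n-by-n grid over [-1, 1] x [-1, 1]."""
    coords = [-1 + (2 * i + 1) / n for i in range(n)]
    # Row 0 is the top of the image (largest y).
    return [[phantom_value(x, y) for x in coords] for y in reversed(coords)]


image = phantom_image(64)
print(len(image), len(image[0]))
```

Because the gray levels add where ellipses overlap, a point at the center of the head lies inside the two largest ellipses and takes their summed value, while points outside every ellipse stay at zero.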

  • FreePBX Distro

    FreePBX Distro

    The FreePBX Distro was a freeware unified communications software system that consisted of FreePBX, a graphical user interface (GUI) for configuring, controlling and managing Asterisk PBX software. The FreePBX Distro included packages that offer VoIP, PBX, Fax, IVR, voice-mail and email functions. The FreePBX Distro Linux distribution was based on CentOS, which maintains binary compatibility with Red Hat Enterprise Linux. FreePBX has contributed to the popularity of Asterisk. As a result of CentOS Linux being discontinued and the last version of CentOS 7 going out of support on June 30, 2024, FreePBX 17 has moved over to and is supported on Debian Linux. FreePBX will no longer be providing a pre-configured FreePBX Distro, but will provide a script to install FreePBX on a fresh install of Debian Linux. In-place migration will not be possible, but systems can be migrated by restoring a backup from the previous version on the new version. As FreePBX 16 will be supported until the release of FreePBX 18, FreePBX on this distribution will still work and be supported; however, there will be no further support for the underlying operating system. == Installation == The Official FreePBX Distro is installed from an ISO image available by web download, which includes the CentOS operating system, Asterisk, the FreePBX GUI and assorted dependencies. The image can then either be burned to DVD or written to a USB stick for installation. == Support for telephony hardware == The FreePBX Distro has built-in support for cards from multiple vendors, including Digium, OpenVox, Alto, Rhino Equipment, Xorcom and Sangoma. The FreePBX Distro supports a large number of phone models via open-source modules. Supported VoIP phone manufacturers include Algo, AND, AudioCodes, Cisco, Cyberdata, Digium, Grandstream, Mitel/Aastra, Nortel/Avaya, Panasonic, Polycom, Sangoma, Snom, Xorcom and Yealink. == Development == FreePBX made its debut in 2004 as the AMP project (Asterisk Management Portal).
The FreePBX Distro was released in 2011 as a turnkey solution for building a PBX using Asterisk, CentOS and FreePBX. FreePBX has over 1 million active production PBXs and over 20,000 new systems added each month. The core telephony engine is Asterisk, as configured by the Open Source FreePBX GUI. The last stable release is FreePBX Distro Stable SNG7-PBX16-64bit-2302-1, based on these main components: FreePBX 16; CentOS 7.8; and Asterisk 16, 18, 19 (20 supported by upgrade once installed).

  • Media preservation

    Media preservation

    Preservation of documents, pictures, recordings, digital content, etc., is a major aspect of archival science. It is also an important consideration for people who are creating time capsules, family history, historical documents, scrapbooks and family trees. Common storage media are not permanent, and there are few reliable methods of preserving documents and pictures for the future. == Paper/prints (photos) == Color negatives and ordinary color prints may fade away to nothing in a relatively short period if not stored and handled properly. This happens even if the negatives and prints are kept in the dark, because ambient light is not the determining factor, but heat and humidity are. The color degradation is the result of the dyes used in the color processes. Because color processing results in a less stable image than traditional black-and-white processing, black-and-white pictures from the 1920s are more likely to survive long-term than color films and photographs from after the middle 20th century. Black-and-white photographic films using silver halide emulsions are the only film types that have proven to last for archival storage. The determining factors for longevity include the film base type, proper processing (develop, stop, fix and wash) and proper storage. Early films used a Cellulose nitrate base which was prone to decomposition and highly flammable. Nitrate film was replaced with acetate-base films. These Cellulose acetate films were later discovered to outgass acids (also referred to as vinegar syndrome). Acetate films were replaced in the early 1980s by polyester film base materials which have been determined to be more stable than film stocks with a nitrate or acetate base. Color prints made on most inkjet printers look very good at first but they have a very short lifespan, measured in months rather than in years. 
Even prints from commercial photo labs will start to fade in a matter of years if not processed properly and stored in cool, dry environments. == Documents/books == With documents for which the media are not so critical as what the documents contain, the information in documents can be copied by using photocopiers and image scanners. Books and manuscripts can also have their information saved without destruction by using a book scanner. Where the medium itself needs to be preserved, for example if a document is a crayon sketch by a famous artist on paper, a complex process of preservation may be used. Depending on the condition and importance of the item this can include gluing the media onto more stable media, or protective enclosing of the media. Polyester sleeves, acid-free folders, and pH buffered document boxes are common supportive protective enclosures whose selection must match the media's chemical and physical properties. Other considerations in preserving paper/books are: Damaging light, particularly UV light, which fades and destroys media over time by breaking down the molecules. Atmosphere contains small traces of sulfur dioxide and nitric acid which turn media yellow and break the fibers down. Humidity and moisture also aid in the breakdown of media. If there is too much, the document can be attacked by bacteria, and if too little, cellulose material breaks down. Temperature, particularly elevated ones, can destroy some media. Low temperatures can cause the water to form crystals which expands destroying the structure of paper-based documents. == Online photo albums == Although there are many websites that allow the upload of photographs and videos, digital preservation for the long-term is still considered an issue. There is a lack of confidence that such websites are capable of storing data for long periods of time (ex. 50 years) without data degradation or loss. 
== Optical media - CD, DVD, Blu-ray, M-Disc == Write-once optical media, such as CD-Rs and DVD-Rs, typically contain an organic dye that distinguishes data reading from data writing based on the dye's transparency along the disc. Conventional CDs and DVDs have finite shelf-life due to natural degradation of the dye; the newer M-DISC uses inorganic material technology to produce molded DVDs and Blu-Rays (up to 3-layer 100GB BDXL) with a claimed lifespan of 100-1000 years if stored correctly with most BD & BDXL rated read/writers enabling the higher power mode for the M-Disc format after 2011. The National Archives and Records Administration lists published life expectancies to be 10 or 25 years or more for normal CDs and DVDs and conservative life expectancies to be between 2 and 5 years. Storage environments, such as temperature and humidity, as well as handling conditions such as frequency of media use and compatibility between the recorder and media, affect media shelf-life. Improvements in media storage and migrations to new recording technologies can make certain formats obsolete within their respective lifespan. Technologists have pointed to internet streaming services, where services such as video-on-demand have contributed to the 33 percent decline in DVD sales the past 5 years, as a challenge for digital preservation. == Magnetic media - video cassettes, tapes, hard drives == Magnetic media such as audio and video tape and floppy disks also have limited life spans. Audio and video tapes require specific care and handling to ensure that the recorded information will be preserved. For information that must be preserved indefinitely, periodic transcription from old media to new ones is necessary, not only because the media are unstable but also because the recording technology may become obsolete. Magnetic media also deteriorates naturally with typical shelf lives between 10 and 20 years. 
Magnetic tape can degrade from binder hydrolysis or magnetic remanence decay. Binder hydrolysis, also known as sticky-shed syndrome, refers to the breakdown of the binder, or glue, that holds the magnetic particles to the polyester base of the tape. Tapes which have been stored in hot, humid conditions are particularly vulnerable to this phenomenon and may suffer from accelerated degradation. Severe binder hydrolysis can cause the magnetic material to fall off or shed from the base, leaving a pile of dust and clear backing. Archivists can bake the tape, which evaporates water molecules on the tape, to temporarily restore the binder before making a copy. Magnetic tape can also be destabilized by magnetic remanence decay, which refers to the weakening of the tape's magnetization over time. This weakens the affected tape's readability, leading to reduced sound clarity and volume or picture hue and contrast. Baking the tape will not restore magnetization. Media at risk include recorded media such as master audio recordings of symphonies and videotape recordings of the news gathered over the last 40 years. Threats to media that must be considered when archiving important record media include accidental erasure, physical loss due to disasters such as fires and floods, and media degradation. Along with the actual media being degraded over the years, the machines that are available to play back or reproduce the audio sources are becoming archaic themselves. Manufacturers and their support (parts, technical updates) for their machines have disappeared throughout the years. Even if the medium is vaulted and archived correctly, the mechanical properties of the machines have deteriorated to the point that they could do more harm than good to the tape being played. Many major film studios are now backing up their libraries by converting them to electronic media files, such as .AIFF or .WAV-based files, via digital audio workstations.
That way, even if the digital platform manufacturer goes out of business or no longer supports their product, the files can still be played on any common computer. There is a detailed process that must take place previous to the final archival product now that a digital solution is in place. Sample rates and their conversion and reference speed are both critical in this process. In floppy disks, the lubricants inside the plastic jackets of many older floppies promote the decay of the magnetic medium. Also, the alignment of the magnetic particles of the disk substrate may gradually degrade, leading to a loss of formatting and data. Early laser disk media were prone to degradation as the layers of the disk substrate were bonded with an adhesive that was vulnerable to decay and would crumble over time. This would lead the different layers of the disk to peel apart, damaging the pitted data surface and rendering the disk unreadable.

  • Media preservation

    Media preservation

    Preservation of documents, pictures, recordings, digital content, etc., is a major aspect of archival science. It is also an important consideration for people who are creating time capsules, family histories, historical documents, scrapbooks and family trees. Common storage media are not permanent, and there are few reliable methods of preserving documents and pictures for the future. == Paper/prints (photos) == Color negatives and ordinary color prints may fade away to nothing in a relatively short period if not stored and handled properly. This happens even if the negatives and prints are kept in the dark, because heat and humidity, not ambient light, are the determining factors. The color degradation is the result of the dyes used in the color processes. Because color processing results in a less stable image than traditional black-and-white processing, black-and-white pictures from the 1920s are more likely to survive long-term than color films and photographs from after the mid-20th century. Black-and-white photographic films using silver halide emulsions are the only film types that have proven to last in archival storage. The determining factors for longevity include the film base type, proper processing (develop, stop, fix and wash) and proper storage. Early films used a cellulose nitrate base, which was prone to decomposition and highly flammable. Nitrate film was replaced with acetate-base films. These cellulose acetate films were later discovered to outgas acids (a process also referred to as vinegar syndrome). Acetate films were replaced in the early 1980s by polyester film base materials, which have been determined to be more stable than film stocks with a nitrate or acetate base. Color prints made on most inkjet printers look very good at first but have a very short lifespan, measured in months rather than years.
Even prints from commercial photo labs will start to fade in a matter of years if not processed properly and stored in cool, dry environments. == Documents/books == For documents whose content matters more than the medium, the information can be copied using photocopiers and image scanners. Books and manuscripts can also have their information saved without destruction by using a book scanner. Where the medium itself needs to be preserved, for example if a document is a crayon sketch by a famous artist on paper, a complex process of preservation may be used. Depending on the condition and importance of the item, this can include gluing the media onto more stable media, or enclosing the media protectively. Polyester sleeves, acid-free folders, and pH-buffered document boxes are common supportive protective enclosures whose selection must match the media's chemical and physical properties. Other considerations in preserving paper/books are: Damaging light, particularly UV light, which fades and destroys media over time by breaking down the molecules. The atmosphere contains small traces of sulfur dioxide and nitric acid, which turn media yellow and break the fibers down. Humidity and moisture also aid in the breakdown of media. If there is too much, the document can be attacked by bacteria, and if too little, cellulose material breaks down. Temperatures, particularly elevated ones, can destroy some media. Low temperatures can cause water to form crystals, which expand and destroy the structure of paper-based documents. == Online photo albums == Although there are many websites that allow the upload of photographs and videos, long-term digital preservation is still considered an issue. There is a lack of confidence that such websites are capable of storing data for long periods of time (e.g., 50 years) without data degradation or loss.
== Optical media - CD, DVD, Blu-ray, M-Disc == Write-once optical media, such as CD-Rs and DVD-Rs, typically contain an organic dye that distinguishes data reading from data writing based on the dye's transparency along the disc. Conventional CDs and DVDs have a finite shelf life due to natural degradation of the dye; the newer M-DISC uses inorganic material technology to produce molded DVDs and Blu-rays (up to 3-layer 100 GB BDXL) with a claimed lifespan of 100-1000 years if stored correctly. Most BD- and BDXL-rated reader/writers have enabled the higher power mode for the M-Disc format since 2011. The National Archives and Records Administration lists published life expectancies of 10 or 25 years or more for normal CDs and DVDs and conservative life expectancies of between 2 and 5 years. Storage conditions, such as temperature and humidity, as well as handling conditions such as frequency of media use and compatibility between the recorder and media, affect media shelf life. Improvements in media storage and migrations to new recording technologies can make certain formats obsolete within their respective lifespans. Technologists have pointed to internet streaming services, where services such as video-on-demand have contributed to the 33 percent decline in DVD sales over the past 5 years, as a challenge for digital preservation. == Magnetic media - video cassettes, tapes, hard drives == Magnetic media such as audio and video tape and floppy disks also have limited life spans. Audio and video tapes require specific care and handling to ensure that the recorded information will be preserved. For information that must be preserved indefinitely, periodic transcription from old media to new ones is necessary, not only because the media are unstable but also because the recording technology may become obsolete. Magnetic media also deteriorate naturally, with typical shelf lives between 10 and 20 years.
Magnetic tape can degrade from binder hydrolysis or magnetic remanence decay. Binder hydrolysis, also known as sticky-shed syndrome, refers to the breakdown of the binder, or glue, that holds the magnetic particles to the polyester base of the tape. Tapes that have been stored in hot, humid conditions are particularly vulnerable to this phenomenon and may suffer from accelerated degradation. Severe binder hydrolysis can cause the magnetic material to fall off or shed from the base, leaving a pile of dust and clear backing. Archivists can bake the tape, which evaporates water molecules on the tape, to temporarily restore the binder before making a copy. Magnetic tape can also be destabilized by magnetic remanence decay, which refers to the weakening of the tape's magnetization over time. This weakens the affected tape's readability, leading to reduced sound clarity and volume or picture hue and contrast. Baking the tape will not restore magnetization. Media at risk include recorded media such as master audio recordings of symphonies and videotape recordings of the news gathered over the last 40 years. Threats to media that must be considered when archiving important record media include accidental erasure, physical loss due to disasters such as fires and floods, and media degradation. Along with the actual media degrading over the years, the machines that are available to play back or reproduce the audio sources are becoming archaic themselves. Manufacturers and their support (parts, technical updates) for their machines have disappeared over the years. Even if the medium is vaulted and archived correctly, the mechanical properties of the machines may have deteriorated to the point that they could do more harm than good to the tape being played. Many major film studios are now backing up their libraries by converting them to electronic media files, such as .AIFF or .WAV-based files via digital audio workstations.
That way, even if the digital platform manufacturer goes out of business or no longer supports their product, the files can still be played on any common computer. Now that a digital solution is in place, a detailed process must take place before the final archival product is created. Sample rates and their conversion and reference speed are both critical in this process. In floppy disks, the lubricants inside the plastic jackets of many older floppies promote the decay of the magnetic medium. Also, the alignment of the magnetic particles of the disk substrate may gradually degrade, leading to a loss of formatting and data. Early laser disk media were prone to degradation as the layers of the disk substrate were bonded with an adhesive that was vulnerable to decay and would crumble over time. This would lead the different layers of the disk to peel apart, damaging the pitted data surface and rendering the disk unreadable.

  • Histogram of oriented displacements

    Histogram of oriented displacements

    Histogram of oriented displacements (HOD) is a 2D trajectory descriptor. The trajectory is described using a histogram of the directions between each two consecutive points. Given a trajectory T = {P1, P2, P3, ..., Pn}, where Pt is the 2D position at time t, the direction angle θ(t, t+1) is calculated for each pair of consecutive positions Pt and Pt+1. The value of θ is between 0 and 360 degrees. A histogram of the quantized values of θ is created. If the histogram has 8 bins, the first bin represents all values of θ between 0 and 45. The histogram accumulates the lengths of the consecutive moves. For each θ, a specific histogram bin is determined. The length of the line between Pt and Pt+1 is then added to that histogram bin. To show the intuition behind the descriptor, consider the action of waving hands. At the end of the action, the hand falls down. When describing this downward movement, the descriptor does not care about the position from which the hand started to fall. This fall will affect the histogram with the appropriate angles and lengths, regardless of the position where the hand started to fall. HOD records, for each moving point, how much it moves in each range of directions. HOD has a clear physical interpretation. It proposes that a simple way to describe the motion of an object is to indicate how much distance it moves in each direction. If the movement in all directions is saved accurately, the movement can be repeated from the initial position to the final destination regardless of the order of the displacements. However, the temporal information will be lost, as the order of movements is not stored; this limitation is addressed by applying a temporal pyramid. If the angle quantization range is small, classifiers that use the descriptor will overfit. Generalization needs some slack in directions, which can be provided by increasing the quantization range.
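The accumulation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `hod`, the convention of measuring θ counterclockwise from the positive x-axis with `atan2`, the half-open bin boundaries, and the choice to skip zero-length moves are all assumptions made for the example.

```python
import math


def hod(trajectory, n_bins=8):
    """Histogram of oriented displacements for a list of (x, y) points.

    Each bin covers 360 / n_bins degrees; every move adds its length
    to the bin that contains its direction angle.
    """
    hist = [0.0] * n_bins
    bin_width = 360.0 / n_bins
    # Walk over consecutive pairs (P_t, P_t+1).
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # assumption: a stationary step has no direction
        # Direction angle mapped into [0, 360).
        theta = math.degrees(math.atan2(dy, dx)) % 360.0
        # The extra modulo guards against theta rounding up to 360.0.
        index = int(theta // bin_width) % n_bins
        hist[index] += length
    return hist


# Three unit moves: right, up, left.
square_path = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hod(square_path))
```

Because only displacements enter the histogram, translating the whole trajectory leaves the result unchanged, which matches the waving-hand intuition. A larger `n_bins` gives a narrower quantization range, and a smaller one gives the slack in directions mentioned above.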

  • The Culture of Connectivity

    The Culture of Connectivity

    The Culture of Connectivity: A Critical History of Social Media is a book by José van Dijck published by Oxford University Press in 2013 on social media platforms and their history. The author considers the histories of five social media platforms: Facebook, Twitter, Flickr, YouTube, and Wikipedia. She focuses on how their technological, social and cultural dimensions contribute to their current status.

  • GPU switching

    GPU switching

    GPU switching is a mechanism used on computers with multiple graphics controllers. This mechanism allows the user to either maximize graphics performance or prolong battery life by switching between the graphics cards. It is mostly used on gaming laptops, which usually have an integrated graphics device and a discrete video card. == Basic components == Most computers using this feature contain integrated graphics processors and dedicated graphics cards, which fall into the following categories. === Integrated graphics === Also known as: shared graphics solutions, integrated graphics processors (IGP) or unified memory architecture (UMA). This kind of graphics processor usually has far fewer processing units and shares memory with the CPU. Sometimes the graphics processor is integrated onto the motherboard; this is commonly known as on-board graphics. A motherboard with an on-board graphics processor doesn't require a discrete graphics card or a CPU with an integrated graphics processor to operate. === Dedicated graphics cards === Also known as: discrete graphics cards. Unlike integrated graphics, dedicated graphics cards have many more processing units and have their own RAM with much higher memory bandwidth. In some cases, a dedicated graphics chip can be integrated onto the motherboard, B150-GP104 for example. Even though the graphics chip is integrated, it is still counted as a dedicated graphics card system because the chip is integrated with its own memory. == Theory == Most personal computers have a motherboard that uses a Southbridge and Northbridge structure. === Northbridge control === The Northbridge is one of the core logic chipset components; it handles communications between the CPU, GPU, RAM and the Southbridge. The discrete graphics card is usually installed in a graphics card slot such as PCI-Express, and the integrated graphics is integrated onto the CPU itself or occasionally onto the Northbridge.
The Northbridge is chiefly responsible for switching between GPUs. It usually works as follows (refer to Figure 1): (1) The Northbridge receives input from the Southbridge through the internal bus. (2) The Northbridge signals the CPU through the front-side bus. (3) The CPU runs the task assignment application (usually the graphics card driver) to determine which GPU core to use. (4) The CPU passes the command down to the Northbridge. (5) The Northbridge passes the command down to the corresponding GPU core. (6) The GPU core processes the command and returns the rendered data to the Northbridge. (7) The Northbridge sends the rendered data back to the Southbridge. === Southbridge control === The Southbridge is a set of integrated circuits such as Intel's I/O Controller Hub (ICH). It handles all of a computer's I/O functions, such as receiving keyboard input and outputting data to the screen. It usually works in two steps: (1) Take in the user input and pass it down to the Northbridge. (2) (Optional) Receive the rendered data from the Northbridge and output it. The second step is optional because the rendered data is sometimes output directly from the discrete graphics card in the graphics card slot, so there is no need to output the data through the Southbridge. == Main purpose == GPU switching is mostly used to save energy by switching between graphics cards. Dedicated graphics cards consume much more power than integrated graphics but also provide higher 3D performance, which is needed for a better gaming and CAD experience. Following is a list of the TDPs of the most popular CPUs with integrated graphics and dedicated graphics cards. The dedicated graphics cards exhibit much higher power consumption than the integrated graphics on both platforms. Disabling them when no heavy graphics processing is needed can significantly lower power consumption.
== Technologies == === Nvidia Optimus === Nvidia Optimus™ is a GPU switching technology created by Nvidia that can dynamically and seamlessly switch between two graphics cards based on the running programs. === AMD Enduro === AMD Enduro™ is a collective brand developed by AMD that covers a number of power-saving technologies. It was previously named PowerXpress and Dynamic Switchable Graphics (DSG). The technology implements a system that predicts the potential need for each graphics card and switches between graphics cards based on the predicted need. It also introduces a new power control plan that allows the discrete graphics card to consume no energy when idling. == Manufacturers == === Integrated graphics === In personal computers, IGPs (integrated graphics processors) are mostly manufactured by Intel and AMD and are integrated onto their CPUs. They are commonly known as: Intel HD and Iris Graphics - also called the HD series and Iris series; AMD Accelerated Processing Unit (APU) - formerly known as Fusion. === Dedicated graphics cards === The most popular dedicated graphics cards are manufactured by AMD and Nvidia. They are commonly known as: AMD Radeon and Nvidia GeForce. == Drivers and OS support == Most common operating systems have built-in support for this feature. However, users may download updated drivers from Nvidia or AMD for a better experience. === Windows support === Windows 7 has built-in support for this feature. The system automatically switches between GPUs depending on the program that's running. However, the user may switch GPUs manually through Device Manager or the power manager. === Linux === Modern Linux systems handle hybrid graphics in two parts: power/control for the inactive GPU, and optional render offloading for individual applications. vga_switcheroo (in the kernel since 2.6.34) coordinates power and mux control on systems with multiple GPUs.
It was designed primarily for muxed designs (hardware display switch), and on muxless laptops it is typically used only for power control. A display server restart is no longer required for offloading on muxless systems. DRI PRIME (Mesa) enables per-process render offload on muxless systems: an app renders on the discrete GPU and the integrated GPU presents the result. Users can opt in via the DRI_PRIME environment variable (e.g., DRI_PRIME=1) or desktop integration. On GNOME, the switcheroo-control service exposes the discrete GPU to the shell, adding a “Launch using Discrete Graphics Card” entry to app menus on supported systems (Wayland or Xorg), which invokes render offload under the hood. With the proprietary Nvidia driver, render offload is provided as PRIME Render Offload (supported since driver 435.xx). Distributions commonly ship a helper like prime-run or desktop menu entries that set the required environment for offloading. ==== Notes and limitations (Linux) ==== On muxless systems the internal display is hard-wired to the integrated GPU; the discrete GPU cannot directly drive that panel and instead renders offscreen for composition by the iGPU. External displays connected to the dGPU may allow direct output depending on the laptop’s wiring. Power-saving behavior varies by driver and distro defaults. Some setups need explicit configuration to power down the inactive GPU when idle. Desktop integrations (e.g., GNOME's menu item) simply opt an app into offload; they do not "auto-switch" the whole session. Users can still launch apps on either GPU as needed.
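The per-process opt-in described in the Linux section amounts to setting a few environment variables before launching a program. The sketch below shows one way a launcher might do this in Python. The helper names `offload_env` and `launch_on_discrete_gpu` are hypothetical, and the two Nvidia variables (`__NV_PRIME_RENDER_OFFLOAD` and `__GLX_VENDOR_LIBRARY_NAME`) are taken from Nvidia's driver documentation rather than from the text above, so treat them as an assumption to verify against the installed driver version.

```python
import os
import subprocess


def offload_env(driver, base_env=None):
    """Return a copy of the environment with render offload enabled.

    driver: "mesa" for DRI PRIME, "nvidia" for PRIME Render Offload.
    """
    env = dict(os.environ if base_env is None else base_env)
    if driver == "mesa":
        # Mesa: ask for the non-default (discrete) GPU.
        env["DRI_PRIME"] = "1"
    elif driver == "nvidia":
        # Assumed variable names from the Nvidia driver README.
        env["__NV_PRIME_RENDER_OFFLOAD"] = "1"
        env["__GLX_VENDOR_LIBRARY_NAME"] = "nvidia"
    else:
        raise ValueError(f"unknown driver: {driver!r}")
    return env


def launch_on_discrete_gpu(command, driver="mesa"):
    """Start `command` (a list of arguments) with offload variables set."""
    return subprocess.Popen(command, env=offload_env(driver))


# Example (hypothetical command; only this one process is offloaded):
# launch_on_discrete_gpu(["glxinfo", "-B"], driver="mesa")
```

Only the launched process and its children see the modified environment, which mirrors the point that desktop integrations opt a single app into offload rather than switching the whole session.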
