AI Generator Text To Human

AI Generator Text To Human — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Artificial intelligence

    Artificial intelligence

    Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in engineering, mathematics and computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines, chatbots, virtual assistants, autonomous vehicles, and play and analysis in strategy games (e.g., chess and Go). Since the 2020s, generative AI has become widely available to generate images, audio, and videos from text prompts. The traditional goals of AI research include learning, reasoning, knowledge representation, planning, natural language processing, and perception, as well as support for robotics. To reach these goals, AI researchers have used techniques including state space search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics. AI also draws upon psychology, linguistics, philosophy, neuroscience, and other fields. Some companies, such as OpenAI, Google DeepMind and Meta, aim to create artificial general intelligence (AGI) – AI that can complete virtually any cognitive task at least as well as a human. Artificial intelligence was founded as an academic discipline in 1956, and the field went through multiple cycles of optimism throughout its history, followed by periods of disappointment and loss of funding, known as AI winters. Funding and interest increased substantially after 2012, when graphics processing units began being used to accelerate neural networks, and deep learning outperformed previous AI techniques. This growth accelerated further after 2017 with the transformer architecture. In the 2020s, an AI boom has coincided with advances in generative AI, which allowed for the creation and modification of media. In addition to AI safety and unintended consequences and harms from the use of AI, ethical concerns, AI's long-term effects, and potential existential risks have prompted discussions of AI regulation. == Goals == The general problem of simulating (or creating) intelligence has been broken into subproblems. These consist of particular traits or capabilities that researchers expect an intelligent system to display. The traits described below have received the most attention and cover the scope of AI research. === Reasoning and problem-solving === Early researchers developed algorithms that imitated step-by-step reasoning that humans use when they solve puzzles or make logical deductions. By the late 1980s and 1990s, methods were developed for dealing with uncertain or incomplete information, employing concepts from probability and economics. Many of these algorithms are insufficient for solving large reasoning problems because they experience a "combinatorial explosion": They become exponentially slower as the problems grow. Even humans rarely use the step-by-step deduction that early AI research could model. They solve most of their problems using fast, intuitive judgments. Accurate and efficient reasoning is an unsolved problem. === Knowledge representation === Knowledge representation and knowledge engineering allow AI programs to answer questions intelligently and make deductions about real-world facts. Formal knowledge representations are used in content-based indexing and retrieval, scene interpretation, clinical decision support, knowledge discovery (mining "interesting" and actionable inferences from large databases), and other areas. A knowledge base is a body of knowledge represented in a form that can be used by a program. An ontology is the set of objects, relations, concepts, and properties used by a particular domain of knowledge. Knowledge bases need to represent things such as objects, properties, categories, and relations between objects; situations, events, states, and time; causes and effects; knowledge about knowledge (what we know about what other people know); default reasoning (things that humans assume are true until they are told differently and will remain true even when other facts are changing); and many other aspects and domains of knowledge. Among the most difficult problems in knowledge representation are the breadth of commonsense knowledge (the set of atomic facts that the average person knows is enormous); and the sub-symbolic form of most commonsense knowledge (much of what people know is not represented as "facts" or "statements" that they could express verbally). There is also the difficulty of knowledge acquisition, the problem of obtaining knowledge for AI applications. === Planning and decision-making === An "agent" is any entity (artificial or not) that perceives and takes actions in the world. A rational agent has goals or preferences and takes actions to make them happen. In automated planning, the agent has a specific goal. In automated decision-making, the agent has preferences—there are some situations it would prefer to be in, and some situations it is trying to avoid. The decision-making agent assigns a number to each situation (called the "utility") that measures how much the agent prefers it. For each possible action, it can calculate the "expected utility": the utility of all possible outcomes of the action, weighted by the probability that the outcome will occur. It can then choose the action with the maximum expected utility. In classical planning, the agent knows exactly what the effect of any action will be. In most real-world problems, however, the agent may not be certain about the situation they are in (it is "unknown" or "unobservable") and it may not know for certain what will happen after each possible action (it is not "deterministic"). It must choose an action by making a probabilistic guess and then reassess the situation to see if the action worked. Alongside thorough testing and improvement based on previous decisions, having an explanation for why the agent took certain decisions is a way to build trust, especially when the decisions have to be relied upon. In some problems, the agent's preferences may be uncertain, especially if there are other agents or humans involved. These can be learned (e.g., with inverse reinforcement learning), or the agent can seek information to improve its preferences. Information value theory can be used to weigh the value of exploratory or experimental actions. The space of possible future actions and situations is typically intractably large, so the agents must take actions and evaluate situations while being uncertain of what the outcome will be. A Markov decision process has a transition model that describes the probability that a particular action will change the state in a particular way and a reward function that supplies the utility of each state and the cost of each action. A policy associates a decision with each possible state. The policy could be calculated (e.g., by iteration), be heuristic, or it can be learned. Game theory describes the rational behavior of multiple interacting agents and is used in AI programs that make decisions that involve other agents. === Learning === Machine learning is the study of programs that can improve their performance on a given task automatically. It has been a part of AI from the beginning. There are several kinds of machine learning. Unsupervised learning analyzes a stream of data and finds patterns and makes predictions without any other guidance. Supervised learning requires labeling the training data with the expected answers, and comes in two main varieties: classification (where the program must learn to predict what category the input belongs in) and regression (where the program must deduce a numeric function based on numeric input). In reinforcement learning, the agent is rewarded for good responses and punished for bad ones. The agent learns to choose responses that are classified as "good". Transfer learning is when the knowledge gained from one problem is applied to a new problem. Deep learning is a type of machine learning that runs inputs through biologically inspired artificial neural networks for all of these types of learning. Computational learning theory can assess learners by computational complexity, by sample complexity (how much data is required), or by other notions of optimization. === Natural language processing === Natural language processing (NLP) allows programs to read, write and communicate in human languages. Specific problems include speech recognition, speech synthesis, machine translation, information extraction, information retrieval and question answering. Early work, based on Noam Chomsky's generative grammar and semantic networks, had difficulty with word-sense disambiguation unless

    Read more →
  • Control communications

    Control communications

    In telecommunications, control communications is the branch of technology devoted to the design, development, and application of communications facilities used specifically for control purposes, such as for controlling (a) industrial processes, (b) movement of resources, (c) electric power generation, distribution, and utilization, (d) communications networks, and (e) transportation systems.

    Read more →
  • Asymmetric follow

    Asymmetric follow

    An asymmetric follow social network is one which allows many people to follow an individual or account without having to follow them back. It is also known as asynchronous follow or sometimes asymmetric friendship. Asymmetric follow is a common pattern on Twitter, where someone may have thousands of followers, but themselves follow few (or no) accounts. In September 2010 Facebook started experimenting with a similar feature, which Facebook calls "Subscribe To."

    Read more →
  • Open Mashup Alliance

    Open Mashup Alliance

    The Open Mashup Alliance (OMA) is a non-profit consortium that promotes the adoption of mashup solutions in the enterprise through the evolution of enterprise mashup standards like EMML. The initial members of the OMA include some large technology companies such as Adobe Systems, Hewlett-Packard, and Intel and some major technology users such as Bank of America and Capgemini. According to Dion Hinchcliffe, "Ultimately, the OMA creates a standardized approach to enterprise mashups that creates an open and vibrant market for competing runtimes, mashups, and an array of important aftermarket services such as development/testing tools, management and administration appliances, governance frameworks, education, professional services, and so on." == Specification development == The initial focus of the OMA is developing EMML, which is a declarative mashup domain-specific language (DSL) aimed at creating enterprise mashups. The EMML language provides a comprehensive set of high-level mashup-domain vocabulary to consume and mash a variety of web data sources. EMML provides a uniform syntax to invoke heterogeneous service styles: REST, WSDL, RSS/ATOM, RDBMS, and POJO. EMML also provides the ability to mix and match diverse data formats: XML, JSON, JDBC, JavaObjects, and primitive types. The OMA website provides the EMML specification, the EMML schema, a reference runtime implementation capable of running EMML scripts, sample EMML mashup scripts, and technical documentation. The OMA is developing EMML under a Creative Commons Attribution No Derivatives license. The eventual objective of the OMA is to submit the EMML specification and any other OMA specifications to a recognized industry standards body.

    Read more →
  • Large language model

    Large language model

    A large language model (LLM) is a neural network trained on a vast amount of text for natural language processing tasks, especially language generation. LLMs can typically generate, summarize, translate and analyze text in many contexts, and are a foundational technology behind modern chatbots. Biased or inaccurate training data can make an LLM's output less reliable. As of 2026, the most capable LLMs are based on transformer architectures, which, according to the 2017 paper "Attention Is All You Need", can be more efficient and parallelizable than earlier statistical and recurrent neural network models. Benchmark evaluations for LLMs attempt to measure model reasoning, factual accuracy, alignment, and safety. == History == Before the emergence of transformer-based models in 2017, some language models were considered large relative to the computational and data constraints of their time. In the early 1990s, IBM's statistical models pioneered word alignment techniques for machine translation, laying the groundwork for corpus-based language modeling. In 2001, a smoothed n-gram model, such as those employing Kneser–Ney smoothing, trained on 300 million words, achieved state-of-the-art perplexity on benchmark tests. During the 2000s, with the rise of widespread internet access, researchers began compiling massive text datasets from the web ("web as corpus") to train statistical language models. Moving beyond n-gram models, researchers started in 2000 to use neural networks as language models. Following the breakthrough of deep neural networks in image classification around 2012, similar architectures were adapted for language tasks. This shift was marked by the development of word embeddings (e.g., Word2Vec by Mikolov in 2013) and sequence-to-sequence (seq2seq) models using LSTM. In 2016, Google transitioned its translation service to neural machine translation (NMT), replacing statistical phrase-based models with deep recurrent neural networks. These early NMT systems used LSTM-based encoder-decoder architectures, as they preceded the invention of transformers. At the 2017 NeurIPS conference, Google researchers introduced the transformer architecture in their landmark paper "Attention Is All You Need". This paper's goal was to improve upon 2014 seq2seq technology, and was based mainly on the attention mechanism developed by Bahdanau et al. in 2014. The following year in 2018, BERT was introduced and quickly became "ubiquitous". Though the original transformer has both encoder and decoder blocks, BERT is an encoder-only model. Academic and research usage of BERT began to decline in 2023, following rapid improvements in the abilities of decoder-only models (such as GPT) to solve tasks via prompting. Although decoder-only GPT-1 was introduced in 2018, it was GPT-2 in 2019 that caught widespread attention because OpenAI claimed to have initially deemed it too powerful to release publicly, out of fear of malicious use. GPT-3 in 2020 went a step further and as of 2025 is available only via API with no offering of downloading the model to execute locally. But it was the consumer-facing chatbot ChatGPT in late 2022 that received extensive media coverage and public attention by 2023. The 2023 GPT-4 was praised for its increased accuracy and as a "holy grail" for its multimodal capabilities. OpenAI did not reveal the high-level architecture and the number of parameters of GPT-4. The release of ChatGPT led to an uptick in LLM usage across several research subfields of computer science, including robotics, software engineering, and societal impact work. In 2024, OpenAI released the reasoning model OpenAI o1, which generates long chains of thought before returning a final answer. Many LLMs with parameter counts comparable to those of OpenAI's GPT series have been developed. Since 2022, weights-available models have been gaining popularity, especially at first with BLOOM and LLaMA, though both have restrictions on usage and deployment. Mistral AI's open-weight models Mistral 7B and Mixtral 8x7B have a more permissive Apache License. In January 2025, DeepSeek released DeepSeek R1, a 671-billion-parameter open-weight model that performs comparably to OpenAI o1 but at a much lower price per token for users. Since 2023, many LLMs have been trained to be multimodal, having the ability to also process or generate other types of data, such as images, audio, or 3D meshes. Open-weight LLMs have become more influential since 2023. Per Vake et al. (2025), community-driven contributions to open-weight models improve their efficiency and performance via collaborative platforms such as Hugging Face. == Dataset preprocessing == === Tokenization === As machine learning algorithms process numbers rather than text, the text must be converted to numbers. In the first step, a vocabulary is decided upon, then integer indices are arbitrarily but uniquely assigned to each vocabulary entry, and finally, an embedding is associated with the integer index. Algorithms include byte-pair encoding (BPE) and WordPiece. There are also special tokens serving as control characters, such as [MASK] for masked-out token (as used in BERT), and [UNK] ("unknown") for characters not appearing in the vocabulary. Also, some special symbols are used to denote special text formatting. For example, "Ġ" denotes a preceding whitespace in RoBERTa and GPT and "##" denotes continuation of a preceding word in BERT. For example, the BPE tokenizer used by the legacy version of GPT-3 would split tokenizer: texts -> series of numerical "tokens" as Tokenization also compresses the datasets. Because LLMs generally require input to be an array that is not jagged, the shorter texts must be "padded" until they match the length of the longest one. ==== Byte-pair encoding ==== As an example, consider a tokenizer based on byte-pair encoding. In the first step, all unique characters (including blanks and punctuation marks) are treated as an initial set of n-grams (i.e. initial set of uni-grams). Successively the most frequent pair of adjacent characters is merged into a bi-gram and all instances of the pair are replaced by it. All occurrences of adjacent pairs of (previously merged) n-grams that most frequently occur together are then again merged into even lengthier n-gram, until a vocabulary of prescribed size is obtained. After a tokenizer is trained, any text can be tokenized by it, as long as it does not contain characters not appearing in the initial-set of uni-grams. === Dataset cleaning === In the context of training LLMs, datasets are typically cleaned by removing low-quality, duplicated, or toxic data. Cleaned datasets can increase training efficiency and lead to improved downstream performance. A trained LLM can be used to clean datasets for training a further LLM. With the increasing proportion of LLM-generated content on the web, data cleaning in the future may include filtering out such content. LLM-generated content can pose a problem if the content is similar to human text (making filtering difficult) but of lower quality (degrading performance of models trained on it). === Synthetic data === Training of largest language models might need more linguistic data than naturally available, or that the naturally occurring data is of insufficient quality. In these cases, synthetic data might be used. == Training == An LLM is a type of foundation model (large X model) trained on language. LLMs can be trained in different ways. In particular, GPT models are first pretrained to predict the next word on a large amount of data, before being fine-tuned. === Cost === Substantial infrastructure is necessary for training the largest models. The tendency towards larger models is visible in the list of large language models. For example, the training of GPT-2 (i.e. a 1.5-billion-parameter model) in 2019 cost $50,000, while training of the PaLM (i.e. a 540-billion-parameter model) in 2022 cost $8 million, and Megatron-Turing NLG 530B (in 2021) cost around $11 million. The qualifier "large" in "large language model" is inherently vague, as there is no definitive threshold for the number of parameters required to qualify as "large". === Fine-tuning === Before being fine-tuned, most LLMs are next-token predictors. The fine-tuning shapes the LLM's behavior via techniques like reinforcement learning from human feedback (RLHF) or constitutional AI. Instruction fine-tuning is a form of supervised learning used to teach LLMs to follow user instructions. In 2022, OpenAI demonstrated InstructGPT, a version of GPT-3 similarly fine-tuned to follow instructions. Reinforcement learning from human feedback (RLHF) involves training a reward model to predict which text humans prefer. Then, the LLM can be fine-tuned through reinforcement learning to better satisfy this reward model. Since humans typically prefer truthful, helpful and harmless answers, RLHF favors such answers. == Architecture == LLMs are generally based on the tra

    Read more →
  • Bootstrap (front-end framework)

    Bootstrap (front-end framework)

    Bootstrap (formerly Twitter Bootstrap) is a free and open-source CSS framework directed at responsive, mobile-first front-end web development. It contains HTML, CSS and (optionally) JavaScript-based design templates for typography, forms, buttons, navigation, and other interface components. As of May 2023, Bootstrap is the 17th most starred project (4th most starred library) on GitHub, with over 164,000 stars. According to W3Techs, Bootstrap is used by 19.2% of all websites. == Features == Bootstrap is an HTML, CSS and JS library that focuses on simplifying the development of informative web pages (as opposed to web applications). The primary purpose of adding it to a web project is to apply Bootstrap's choices of color, size, font and layout to that project. As such, the primary factor is whether the developers in charge find those choices to their liking. Once added to a project, Bootstrap provides basic style definitions for all HTML elements. The result is a uniform appearance for prose, tables and form elements across web browsers. In addition, developers can take advantage of CSS classes defined in Bootstrap to further customize the appearance of their contents. For example, Bootstrap has provisioned for light- and dark-colored tables, page headings, more prominent pull quotes, and text with a highlight. Bootstrap also comes with several JavaScript components which do not require other libraries like jQuery. They provide additional user interface elements such as dialog boxes, tooltips, progress bars, navigation drop-downs, and carousels. Each Bootstrap component consists of an HTML structure, CSS declarations, and in some cases accompanying JavaScript code. They also extend the functionality of some existing interface elements, including for example an auto-complete function for input fields. The most prominent components of Bootstrap are its layout components, as they affect an entire web page. The basic layout component is called "Container", as every other element in the page is placed in it. Developers can choose between a fixed-width container and a fluid-width container. While the latter always fills the width with the web page, the former uses one of the five predefined fixed widths, depending on the size of the screen showing the page: Smaller than 576 pixels 576–768 pixels 768–992 pixels 992–1200 pixels 1200–1400 pixels Larger than 1400 pixels Once a container is in place, other Bootstrap layout components implement a CSS Flexbox layout through defining rows and columns. A precompiled version of Bootstrap is available in the form of one CSS file and three JavaScript files that can be readily added to any project. The raw form of Bootstrap, however, enables developers to implement further customization and size optimizations. This raw form is modular, meaning that the developer can remove unneeded components, apply a theme and modify the uncompiled Sass files. == History == === Early beginnings === Bootstrap, originally named Twitter Blueprint, was developed by Mark Otto and Jacob Thornton at Twitter in 2010 as a framework to encourage consistency across internal tools. Before Bootstrap, various libraries were used for interface development, which led to inconsistencies and a high maintenance burden. According to Otto: A super small group of developers and I got together to design and build a new internal tool and saw an opportunity to do something more. Through that process, we saw ourselves build something much more substantial than another internal tool. Months later, we ended up with an early version of Bootstrap as a way to document and share common design patterns and assets within the company. After a few months of development by a small group, many developers at Twitter began to contribute to the project as a part of Hack Week, a hackathon-style week for the Twitter development team. It was renamed from Twitter Blueprint to Twitter Bootstrap and released as an open-source project on August 19, 2011. It has continued to be maintained by Otto, Thornton, a small group of core developers, and a large community of contributors. === Bootstrap 2 === On January 31, 2012, Bootstrap 2 was released, which added built-in support for Glyphicons, several new components, as well as changes to many of the existing components. This version supports responsive web design, meaning the layout of web pages adjusts dynamically, taking into account the characteristics of the device used (whether desktop, tablet, mobile phone). Shortly before the release of Bootstrap 2.1.2, Otto and Thornton left Twitter, but committed to continue to work on Bootstrap as an independent project. === Bootstrap 3 === On August 19, 2013, Bootstrap 3 was released. It redesigned components to use flat design and a mobile first approach. Bootstrap 3 features new plugin system with namespaced events. Bootstrap 3 dropped Internet Explorer 7 and Firefox 3.6 support, but there is an optional polyfill for these browsers. Bootstrap 3 was also the first version released under the twbs organization on GitHub instead of the Twitter one. === Bootstrap 4 === Otto announced Bootstrap 4 on October 29, 2014. The first alpha version of Bootstrap 4 was released on August 19, 2015. The first beta version was released on August 10, 2017. Otto suspended work on Bootstrap 3 on September 6, 2016, to free up time to work on Bootstrap 4. Bootstrap 4 was finalized on January 18, 2018. Significant changes include: Major rewrite of the code Replacing Less with Sass Addition of Reboot, a collection of element-specific CSS changes in a single file, based on Normalize Dropping support for IE8, IE9, and iOS 6 CSS Flexible Box support Adding navigation customization options Adding responsive spacing and sizing utilities Switching from the pixels unit in CSS to root ems Increasing global font size from 14px to 16px for enhanced readability Dropping the panel, thumbnail, pager, and well components Dropping the Glyphicons icon font Huge number of utility classes Improved form styling, buttons, drop-down menus, media objects and image classes Bootstrap 4 supports the latest versions of Google Chrome, Firefox, Internet Explorer, Opera, and Safari (except on Windows). It additionally supports back to IE10 and the latest Firefox Extended Support Release (ESR). === Bootstrap 5 === Bootstrap 5 was officially released on May 5, 2021. Major changes include: New offcanvas menu component Removing dependence on jQuery in favor of vanilla JavaScript Rewriting the grid to support responsive gutters and columns placed outside of rows Migrating the documentation from Jekyll to Hugo Dropping support for Internet Explorer Moving testing infrastructure from QUnit to Jasmine Adding custom set of SVG icons Adding CSS custom properties Improved API Enhanced grid system Improved customizing docs Updated forms RTL support Built in darkmode support

    Read more →
  • Interstellar communication

    Interstellar communication

    Interstellar communication is the transmission of signals between planetary systems. Sending interstellar messages is potentially much easier than interstellar travel, being possible with technologies and equipment which are currently available. However, the distances from Earth to other potentially inhabited systems introduce prohibitive delays, assuming the limitations of the speed of light. Even an immediate reply to radio communications sent to stars tens of thousands of light-years away would take many human generations to arrive. == Radio == The SETI project has for the past several decades been conducting a search for signals being transmitted by extraterrestrial life located outside the Solar System, primarily in the radio frequencies of the electromagnetic spectrum. Special attention has been given to the Water Hole, the frequency of one of neutral hydrogen's absorption lines, due to the low background noise at this frequency and its symbolic association with the basis for what is likely to be the most common system of biochemistry (but see alternative biochemistry). The regular radio pulses emitted by pulsars were briefly thought to be potential intelligent signals; the first pulsar to be discovered was originally designated "LGM-1", for "Little Green Men." They were quickly determined to be of natural origin, however. Several attempts have been made to transmit signals to other stars as well. (See "Realized projects" at Active SETI.) One of the earliest and most famous was the 1974 radio message sent from the largest radio telescope in the world, the Arecibo Observatory in Puerto Rico. An extremely simple message was aimed at a globular cluster of stars known as M13 in the Milky Way Galaxy and at a distance of 30,000 light years from the Solar System. These efforts have been more symbolic than anything else, however. Further, a possible answer needs double the travel time, i.e. tens of years (near stars) or 60,000 years (M13). == Other methods == It has also been proposed that higher frequency signals, such as lasers operating at visible light frequencies, may prove to be a fruitful method of interstellar communication; at a given frequency it takes surprisingly small energy output for a laser emitter to outshine its local star from the perspective of its target. Other more exotic methods of communication have been proposed, such as modulated neutrino or gravitational wave emissions. These would have the advantage of being essentially immune to interference by intervening matter. Sending physical mail packets between stars may prove to be optimal for many applications. While mail packets would likely be limited to speeds far below that of electromagnetic or other light-speed signals (resulting in very high latency), the amount of information that could be encoded in only a few tons of physical matter could more than make up for it in terms of average bandwidth. The possibility of using interstellar messenger probes for interstellar communication — known as Bracewell probes — was first suggested by Ronald N. Bracewell in 1960, and the technical feasibility of this approach was demonstrated by the British Interplanetary Society's starship study Project Daedalus in 1978. Starting in 1979, Robert Freitas advanced arguments for the proposition that physical space-probes provide a superior mode of interstellar communication to radio signals, then undertook telescopic searches for such probes in 1979 and 1982.

    Read more →
  • Electronics

    Electronics

    Electronics is a scientific and engineering discipline that studies and applies the principles of physics to design, create, and operate devices that manipulate electrons and other electrically charged particles. It is a subfield of physics and electrical engineering which uses active devices such as transistors, diodes, and integrated circuits to control and amplify the flow of electric current and to convert it from one form to another, such as from alternating current (AC) to direct current (DC) or from analog signals to digital signals. Electronic devices have significantly influenced the development of many aspects of modern society, such as telecommunications, entertainment, education, health care, industry, and security. The main driving force behind the advancement of electronics is the semiconductor industry, which continually produces ever-more sophisticated electronic devices and circuits in response to global demand. The semiconductor industry is one of the global economy's largest and most profitable industries, with annual revenues exceeding $481 billion in 2018. The electronics industry also encompasses other branches that rely on electronic devices and systems, such as e-commerce, which generated over $29 trillion in online sales in 2017. == History and development == Karl Ferdinand Braun's development of the crystal detector, the first semiconductor device, in 1874 and the identification of the electron in 1897 by Sir Joseph John Thomson, along with the subsequent invention of the vacuum tube which could amplify and rectify small electrical signals, inaugurated the field of electronics and the electron age. Practical applications started with the invention of the diode by Ambrose Fleming and the triode by Lee De Forest in the early 1900s, which made the detection of small electrical voltages, such as radio signals from a radio antenna, practicable. Vacuum tubes (thermionic valves) were the first active electronic components which controlled current flow by influencing the flow of individual electrons, and enabled the construction of equipment that used current amplification and rectification to give us radio, television, radar, long-distance telephony and much more. The early growth of electronics was rapid, and by the 1920s, commercial radio broadcasting and telecommunications were becoming widespread and electronic amplifiers were being used in such diverse applications as long-distance telephony and the music recording industry. The next big technological step took several decades to appear, when the first working point-contact transistor was invented by John Bardeen and Walter Houser Brattain at Bell Labs in 1947. However, vacuum tubes continued to play a leading role in the field of microwave and high power transmission as well as television receivers until the middle of the 1980s. Since then, solid-state devices have all but completely taken over. Vacuum tubes are still used in some specialist applications such as high power RF amplifiers, cathode-ray tubes, specialist audio equipment, guitar amplifiers and some microwave devices. In April 1955, the IBM 608 was the first IBM product to use transistor circuits without any vacuum tubes and is believed to be the first all-transistorized calculator to be manufactured for the commercial market. The 608 contained more than 3,000 germanium transistors. Thomas J. Watson Jr. ordered all future IBM products to use transistors in their design. From that time on, transistors were almost exclusively used for computer logic circuits and peripheral devices. However, early junction transistors were relatively bulky devices that were difficult to manufacture on a mass-production basis, which limited them to a number of specialised applications. The MOSFET was invented at Bell Labs between 1955 and 1960. It was the first truly compact transistor that could be miniaturised and mass-produced for a wide range of uses. Its advantages include high scalability, affordability, low power consumption, and high density. It revolutionized the electronics industry, becoming the most widely used electronic device in the world. The MOSFET is the basic element in most modern electronic equipment. As the complexity of circuits grew, problems arose. One problem was the size of the circuit. A complex circuit like a computer was dependent on speed. If the components were large, the wires interconnecting them must be long. The electric signals took time to go through the circuit, thus slowing the computer. The invention of the integrated circuit by Jack Kilby and Robert Noyce solved this problem by making all the components and the chip out of the same block (monolith) of semiconductor material. The circuits could be made smaller, and the manufacturing process could be automated. This led to the idea of integrating all components on a single-crystal silicon wafer, which led to small-scale integration (SSI) in the early 1960s, and then medium-scale integration (MSI) in the late 1960s, followed by VLSI. In 2008, billion-transistor processors became commercially available. == Subfields == == Devices and components == An electronic component is any component, either active or passive, in an electronic system or electronic device. Components are connected together, usually by being soldered to a printed circuit board (PCB), to create an electronic circuit with a particular function. Components may be packaged singly or in more complex groups as integrated circuits. Passive electronic components are capacitors, inductors, resistors, whilst active components are such as semiconductor devices; transistors and thyristors, which control current flow at electron level. == Types of circuits == Electronic circuit functions can be divided into two function groups: analog and digital. A particular device may consist of circuitry that has either or a mix of the two types. Analog circuits are becoming less common, as many of their functions are being digitized. === Analog circuits === Analog circuits use a continuous range of voltage or current for signal processing, as opposed to the discrete levels used in digital circuits. Analog circuits were common throughout electronic devices in the early years, in devices such as radio receivers and transmitters. Analog electronic computers were valuable for solving problems with continuous variables until digital processing advanced. As semiconductor technology developed, many of the functions of analog circuits were taken over by digital circuits, and modern circuits that are entirely analog are less common; their functions being replaced by hybrid approach which, for instance, uses analog circuits at the front end of a device receiving an analog signal, and then use digital processing using microprocessor techniques thereafter. Sometimes it may be difficult to classify some circuits that have elements of both linear and non-linear operation. An example is the voltage comparator, which receives a continuous range of voltage but only outputs one of two levels, as in a digital circuit. Similarly, an overdriven transistor amplifier can take on the characteristics of a controlled switch, having essentially two levels of output. Analog circuits are still widely used for signal amplification, such as in the entertainment industry, and conditioning signals from analog sensors, such as in industrial measurement and control. === Digital circuits === Digital circuits are electric circuits based on discrete voltage levels. Digital circuits use Boolean algebra and are the basis of all digital computers and microprocessor devices. They range from simple logic gates to large integrated circuits, employing millions of such gates. Digital circuits use a binary system with two voltage levels labelled 0 and 1 to indicate logical status. Often logic 0 will be a lower voltage and referred to as Low while logic 1 is referred to as High. However, some systems use the reverse definition (0 is High) or are current based. Quite often, the logic designer may reverse these definitions from one circuit to the next as they see fit to facilitate their design. The definition of the levels as 0 or 1 is arbitrary. Ternary (with three states) logic has been studied, and some prototype computers made, but have not gained any significant practical acceptance. Universally, computers and digital signal processors are constructed with digital logic circuits using transistors such as MOSFETs in the electronic logic gates to generate binary states. Logic gates Adders Flip-flops Counters Registers Multiplexers Schmitt triggers Highly integrated devices: Memory chip Microprocessors Microcontrollers Application-specific integrated circuit (ASIC) Digital signal processor (DSP) Field-programmable gate array (FPGA) Field-programmable analog array (FPAA) System on chip (SOC) == Design == Electronic systems design deals with the multi-disciplinary design issues of complex electronic devices and systems, such as mob

    Read more →
  • Purged cross-validation

    Purged cross-validation

    Purged cross-validation is a variant of k-fold cross-validation designed to prevent look-ahead bias in time series and other structured data, developed in 2017 by Marcos López de Prado at Guggenheim Partners and Cornell University. It is primarily used in financial machine learning to ensure the independence of training and testing samples when labels depend on future events. It provides an alternative to conventional cross-validation and walk-forward backtesting methods, which often yield overly optimistic performance estimates due to information leakage and overfitting. == Motivation == Standard cross-validation assumes that observations are independently and identically distributed (IID), which often does not hold in time series or financial datasets. If the label of a test sample overlaps in time with the features or labels in the training set, the result may be data leakage and overfitting. Purged cross-validation addresses this issue by removing overlapping observations and, optionally, adding a temporal buffer ("embargo") around the test set to further reduce the risk of leakage. The figure below illustrates standard 5 Fold Cross-Validation == Purging == Purging removes from the training set any observation whose timestamp falls within the time range of formation of a label in the test set. This can be the case for train set observations before and after the test set. Their removal ensures that the algorithm cannot learn during train time information that will be used to assess the performance of the algorithm. See the figure below for an illustration of purging. == Embargoing == Embargoing addresses a more subtle form of leakage: even if an observation does not directly overlap the test set, it may still be affected by test events due to market reaction lag or downstream dependencies. To guard against this, a percentage-based embargo is imposed after each test fold. For example, with a 5% embargo and 1000 observations, the 50 observations following each test fold are excluded from training. Unlike purging, embargoing can only occur after the test set. The figure below illustrates the application of embargo: == Applications == Purged and embargoed cross-validation has been useful in: Backtesting of trading strategies Validation of classifiers on labeled event-driven returns Any machine learning task with overlapping label horizons == Example == To illustrate the effect of purging and embargoing, consider the figures below. Both diagrams show the structure of 5-fold cross-validation over a 20-day period. In each row, blue squares indicate training samples and red squares denote test samples. Each label is defined based on the value of the next two observations, hence creating an overlap. If this overlap is left untreated, test set information leaks into the train set. The second figure applies the Purged CV procedure. Notice how purging removes overlapping observations from the training set and the embargo widens the gap between test and training data. This approach ensures that the evaluation more closely resembles a true out-of-sample test and reduces the risk of backtest overfitting. == Combinatorial Purged Cross-Validation == Walk-forward backtesting analysis, another common cross-validation technique in finance, preserves temporal order but evaluates the model on a single sequence of test sets. This leads to high variance in performance estimation, as results are contingent on a specific historical path. Combinatorial Purged Cross-Validation (CPCV) addresses this limitation by systematically constructing multiple train-test splits, purging overlapping samples, and enforcing an embargo period to prevent information leakage. The result is a distribution of out-of-sample performance estimates, enabling robust statistical inference and more realistic assessment of a model's predictive power. === Methodology === CPCV divides a time-series dataset into N sequential, non-overlapping groups. These groups preserve the temporal order of observations. Then, all combinations of k groups (where k < N) are selected as test sets, with the remaining N − k groups used for training. For each combination, the model is trained and evaluated under strict controls to prevent leakage. To eliminate potential contamination between training and test sets, CPCV introduces two additional mechanisms: Purging: Any training observations whose label horizon overlaps with the test period are excluded. This ensures that future information does not influence model training. Embargoing: After the end of each test period, a fixed number of observations (typically a small percentage) are removed from the training set. This prevents leakage due to delayed market reactions or auto-correlated features. Each data point appears in multiple test sets across different combinations. Because test groups are drawn combinatorially, this process produces multiple backtest "paths," each of which simulates a plausible market scenario. From these paths, practitioners can compute a distribution of performance statistics such as the Sharpe ratio, drawdown, or classification accuracy. === Formal definition === Let N be the number of sequential groups into which the dataset is divided, and let k be the number of groups selected as the test set for each split. Then: The number of unique train-test combinations is given by the binomial coefficient: ( N k ) {\displaystyle {\binom {N}{k}}} Each observation is used in k {\displaystyle k} test sets and contributes to φ [ N , k ] {\displaystyle \varphi [N,k]} unique backtest paths: φ [ N , k ] = k N ( N k ) {\displaystyle \varphi [N,k]={\frac {k}{N}}{\binom {N}{k}}} This yields a distribution of performance metrics rather than a single point estimate, making it possible to apply Monte Carlo-based or probabilistic techniques to assess model robustness. === Illustrative example === Consider the case where N = 6 and k = 2. The number of possible test set combinations is ( 6 2 ) = 15 {\displaystyle {\binom {6}{2}}=15} . Each of the six groups appears in five test splits. Consequently, five distinct backtest paths can be constructed, each incorporating one appearance from every group. ==== Test group assignment matrix ==== This table shows the 15 test combinations. An "x" indicates that the corresponding group is included in the test set for that split. ==== Backtest path assignment ==== Each group contributes to five different backtest paths. The number in each cell indicates the path to which the group's result is assigned for that split. === Advantages === Combinatorial Purged Cross-Validation offers several key benefits over conventional methods: It produces a distribution of performance metrics, enabling more rigorous statistical inference. The method systematically eliminates lookahead bias through purging and embargoing. By simulating multiple historical scenarios, it reduces the dependence on any single market regime or realization. It supports high-confidence comparisons between competing models or strategies. CPCV is commonly used in quantitative strategy research, especially for evaluating predictive models such as classifiers, regressors, and portfolio optimizers. It has been applied to estimate realistic Sharpe ratios, assess the risk of overfitting, and support the use of statistical tools such as the Deflated Sharpe Ratio (DSR). === Limitations === The main limitation of CPCV stems from its high computational cost. However, this cost can be managed by sampling a finite number of splits from the space of all possible combinations.

    Read more →
  • Downloadable content

    Downloadable content

    Downloadable content (DLC) is additional content created for an already released video game, distributed through the Internet by the game's publisher. It can be added for no extra cost or as a form of video game monetization, enabling the publisher to gain additional revenue from a title after it has been purchased, often using a microtransaction system. DLC can range from cosmetic content, such as skins, to new in-game content, like characters, levels, modes, and larger expansions that may contain a mix of such content as a continuation of the base game. In some games, multiple DLCs (including future DLC not yet released) may be bundled as part of a "season pass"—typically at a discount rather than purchasing each DLC individually. While the Dreamcast was the first home console to support DLC (albeit in a limited form due to hardware and internet connection limitations), Microsoft's Xbox helped popularize the concept. Since the seventh generation of video game consoles, DLC has been a prevalent feature of major video game platforms with internet connectivity. == Etymology == Since the popularization of microtransactions in online distribution platforms such as Steam, the term DLC has become a synonymous for any form of paid content in video games, regardless of whether they constitute the download of new content. Furthermore, this led to the creation of the oxymoronic term "on-disc DLC" for content included on the game's original files but locked behind a paywall. == History == === Precursors to DLC === The earliest form of downloadable content were offerings of full games, such as on the Atari 2600's GameLine service, which allowed users to download games using a telephone line. A similar service, Sega Channel, allowed for the downloading of games to the Sega Genesis over a cable line. While the GameLine and Sega Channel services allowed for the distribution of entire titles, they did not provide downloadable content for existing titles. Expansion packs were sold at retail for some PC games, which featured content such as additional levels, characters, or maps for a base game. They often required an installation of the original game in order to function, but some games (such as Half-Life) had "standalone" expansions, which were essentially spin-off games that reused engine code and assets from the original game. === On consoles === The Dreamcast was the first console to feature online support as a standard; DLC was available, though limited in size due to the narrowband connection and the 200 block limit of the Visual Memory Unit memory card. These online features were still considered a breakthrough in video games. With the release of the Xbox, Microsoft was the second company to implement downloadable content. Many Xbox titles, including Splinter Cell, Halo 2, and Ninja Gaiden, offered varying amounts of extra content, available for download through the Xbox Live service. Most of this content was available free. With the advent of the GameCube, Nintendo was the third company to implement downloadable content. Many GameCube titles offered varying amounts of extra content from Game Boy Advance titles with the GameCube – Game Boy Advance link cable. All of this content was available free. The Xbox 360 (2005) included more robust support for digital distribution, including DLC downloads and purchases, via its Xbox Live Marketplace service. Microsoft believed that publishers would benefit by offering small pieces of content at a small cost ($1 to $5), rather than full expansion packs (~$20), as this would allow players to pick and chose what content they desired, providing revenue to the publishers. Microsoft also utilized a digital currency known as "Microsoft Points" for transactions, which could also be purchased through physical gift cards to avoid the banking fees associated with the small price points. The PlayStation 3 (2006) adopted the same approach with their downloadable hub, the PlayStation Store. Sony planned on having the bulk of its content be purchased separately via many separate online microtransactions for PlayStation Network titles, including Gran Turismo HD Concept and Gran Turismo 5 Prologue. The Wii (2006) featured a sparser amount of downloadable content on their Wii Shop Channel, the bulk of which is accounted for by digital distribution of emulated Nintendo titles from previous generations. Music video games, such as titles from the Guitar Hero and Rock Band franchises, took significant advantage of downloadable content as a means of offering new songs to be played in-game. Harmonix claimed that Guitar Hero II would feature "more online content than anyone has ever seen in a game to this date." Rock Band features the largest number of downloadable items of any console video game, with a steady number of new songs that were added weekly between 2007 and 2013. Acquiring all the downloadable content for Rock Band would, as of July 12, 2012, cost $5,880.10. === On personal computers === As the popularity and speed of internet connections rose, so did the popularity of using the internet for digital distribution of media. User-created game mods and maps were distributed exclusively online, as they were mainly created by people without the infrastructure capable of distributing the content through physical media. In 1997, Cavedog offered a new unit every month as free downloadable content for their real-time strategy computer game Total Annihilation. Later PC digital distribution platforms, such as Games for Windows Marketplace and Steam, would add support for DLC in a similar manner to consoles. === On handhelds === Nokia phones of the late 1990s and early 2000s shipped with side-scrolling shooter Space Impact, available on various models. With the introduction of WAP in 2000, additional downloadable content for the game, with extra levels, became available. The Nintendo Wi-Fi Connection service on the Nintendo DS could be used to obtain a form of DLC for certain games, such as Picross DS—where players could download puzzle "packs" of classic puzzles from previous Picross series games (such as Mario's Picross). as well as downloadable user generated content. Due to the Nintendo DS's use of cartridges and lack of dedicated storage, most "DLC" for DS games was limited in scope, or in some cases (such as Professor Layton and the Curious Village and Moero! Nekketsu Rhythm Damashii Osu! Tatakae! Ouendan 2), was already part of the game's data on the cartridge, and merely unlocked. Its successor, the Nintendo 3DS, natively supported the purchase of DLC for supported titles via Nintendo eShop. Starting with iPhone OS 3, downloadable content became available for the platform via applications bought from the App Store. While this ability was initially only available to developers for paid applications, Apple eventually allowed for developers to offer this in free applications as well in October 2009. == On-disc DLC == In some cases, a purchased DLC may not actually download new content to the device, but merely consists of data used to enable associated content that is already present within the game's data. DLC of this nature revealed via data mining is typically referred to as "on-disc DLC" or PULC (premium unlockable content). This practice has sometimes been considered controversial, with publishers being accused of using what is effectively a microtransaction to lock access to content that was already contained within the game as sold at retail. Data relating to future DLC may be included on-disc or downloaded during updates for technical reasons as well, either to ensure online multiplayer compatibility for existing content between players who have not yet purchased the new DLC, or as dormant support code for planned content that is still in development at the time of the release. == Monetization == Downloadable content is often offered for a price. Since Facebook games popularized the business model of microtransactions, some have criticized downloadable content as being overpriced and an incentive for developers to leave items out of the initial release, with The Elder Scrolls IV: Oblivion's horse armor DLC having faced a mixed reception upon its release for that reason. However, by 2009, the Horse Armor DLC was one of the top ten content packs that Bethesda had sold, which justified the DLC model for future games. Where a normal software disc may allow its license sold or traded, DLC is generally locked to a specific user's account and does not come with the ability to transfer that license to another user. In addition to individual content downloads, video game publishers sometimes offer a "season pass", which allows users to pre-order a selection of upcoming content over a specific time period, and ensuring the customer's ability to immediately obtain the content upon release. As users do not have the ability to fully preview the content before their purchase, there is a chance that the content of a season

    Read more →
  • Locative media

    Locative media

    Locative media or location-based media (LBM) is a virtual medium of communication functionally bound to a location. The physical implementation of locative media, however, is not bound to the same location to which the content refers. Location-based media delivers multimedia and other content directly to the user of a mobile device dependent upon their location. Location information determined by means such as mobile phone tracking and other emerging real-time locating system technologies like Wi-Fi or RFID can be used to customize media content presented on the device. Locative media are digital media applied to real places and thus triggering real social interactions. While mobile technologies such as the Global Positioning System (GPS), laptop computers and mobile phones enable locative media, they are not the goal for the development of projects in this field. == Description == Media content is managed and organized externally of the device on a standard desktop, laptop, server, or cloud computing system. The device then downloads this formatted content with GPS or other RTLS coordinate-based triggers applied to each media sequence. As the location-aware device enters the selected area, centralized services trigger the assigned media, designed to be of optimal relevance to the user and their surroundings. Use of locative technologies "includes a range of experimental uses of geo-technologies including location-based games, artistic critique of surveillance technologies, experiential mapping, and spatial annotation." Location based media allows for the enhancement of any given environment offering explanation, analysis and detailed commentary on what the user is looking at through a combination of video, audio, images and text. The location-aware device can deliver interpretation of cities, parklands, heritage sites, sporting events or any other environment where location based media is required. The content production and pre-production are integral to the overall experience that is created and must have been performed with ultimate consideration of the location and the users position within that location. The media offers a depth to the environment beyond that which is immediately apparent, allowing revelations about background, history and current topical feeds. == Locative, ubiquitous and pervasive computing == The term 'locative media' was coined by Karlis Kalnins. Locative media is closely related to augmented reality (reality overlaid with virtual reality) and pervasive computing (computers everywhere, as in ubiquitous computing). Whereas augmented reality strives for technical solutions, and pervasive computing is interested in embedded computers, locative media concentrates on social interaction with a place and with technology. Many locative media projects have a social, critical or personal (memory) background. While strictly spoken, any kind of link to additional information set up in space (together with the information that a specific place supplies) would make up location-dependent media, the term locative media is strictly bound to technical projects. Locative media works on locations and yet many of its applications are still location-independent in a technical sense. As in the case of digital media, where the medium itself is not digital but the content is digital, in locative media the medium itself might not be location-oriented, whereas the content is location-oriented. Japanese mobile phone culture embraces location-dependent information and context-awareness. It is projected that in the near future locative media will develop to a significant factor in everyday life. == Enabling technologies == Locative media projects use technology such as Global Positioning System (GPS), laptop computers, the mobile phone, Geographic Information System (GIS), and web map services such as Mapbox, OpenStreetMap, and Google Maps among others. Whereas GPS allows for the accurate detection of a specific location, mobile computers allow interactive media to be linked to this place. The GIS supplies arbitrary information about the geological, strategic or economic situation of a location. Web maps like Google Maps give a visual representation of a specific place. Another important new technology that links digital data to a specific place is radio-frequency identification (RFID), a successor to barcodes like Semacode. Research that contributes to the field of locative media happens in fields such as pervasive computing, context awareness and mobile technology. The technological background of locative media is sometimes referred to as "location-aware computing". == Creative representation == Place is often seen as central to creativity; in fact, "for some—regional artists, citizen journalists and environmental organizations for example—a sense of place is a particularly important aspect of representation, and the starting point of conversations." Locative media can propel such conversations in its function as a "poetic form of data visualization," as its output often traces how people move in, and by proxy, make sense of, urban environments. Given the dynamism and hybridity of cities and the networks which comprise them, locative media extends the internet landscape to physical environments where people forge social relations and actions which can be "mobile, plural, differentiated, adventurous, innovative, but also estranged, alienated, impersonalized." Moreover, in using locative technologies, users can expand how they communicate and assert themselves in their environment and, in doing so, explore this continuum of urban interactions. Furthermore, users can assume a more active role in constructing the environments they are situated in accordingly. In turn, artists have been intrigued with locative media as a means of "user-led mapping, social networking and artistic interventions in which the fabric of the urban environment and the contours of the earth become a 'canvas.'" Such projects demystify how resident behaviors in a given city contribute to the culture and sense of personality that cities are often perceived to take on. Design scholars Anne Galloway and Matthew Ward state that "various online lists of pervasive computing and locative media projects draw out the breadth of current classification schema: everything from mobile games, place-based storytelling, spatial annotation and networked performances to device-specific applications." A prominent use of locative media is in locative art. A sub-category of interactive art or new media art, locative art explores the relationships between the real world and the virtual or between people, places or objects in the real world. == Examples == Notable locative media projects include Bio Mapping by Christian Nold in 2004, locative art projects such as the SpacePlace ZKM/ZKMax bluecasting and participatory urban media access in Munich in 2005 and Britglyph by Alfie Dennen in 2009, and location-based games such as AR Quake by the Wearable Computer Lab at the University of South Australia and Can You See Me Now? in 2001 by Blast Theory in collaboration with the Mixed Reality Lab at the University of Nottingham. In 2005, the Silicon Valley–based collaborators of C5 first exhibited the C5 Landscape Initiative, a suite of four GPS inspired projects that investigate perception of landscape in light of locative media. In William Gibson's 2007 novel Spook Country, locative art is one of the main themes and set pieces in the story. Narrative projects which engage with locative media are sometimes referred to as Location-Aware Fiction, as explored in "Data and Narrative: Location Aware Fiction" a 2003 essay by Kate Armstrong. This location-aware fiction is also known as locative literature, where locative stories and poems can be experienced via digital portals, apps, QR codes and e-books, as well as via analogue forms such as labelling tape, Scrabble tiles, fridge magnets or Post-It notes, and these are forms often used by the writer and artist Matt Blackwood. The Transborder Immigrant Tool by the Electronic Disturbance Theater is a locative media project aimed at providing life saving directions to water for people trying to cross the US / Mexico border. The project attracted global media attention in 2009 and 2010. Articles included a Los Angeles Times cover story focusing on Ricardo Dominguez and an AP story interviewing Micha Cárdenas and Brett Stalbaum. The articles focused on concerns over the legality of the project and the ensuing investigations of the group, which are still underway. The Transborder Immigrant Tool has recently been included in a number of major exhibitions including Here, Not There at the Museum of Contemporary Art San Diego and the 2010 California Biennial at the Orange County Museum of Art. Invisible Threads by Stephanie Rothenberg and Jeff Crouse is a locative media project aimed at creating embodied awareness of sweatshops and just-in-time production t

    Read more →
  • Digital divide

    Digital divide

    Digital divide is inequitable access to and use of digital technology, encompassing four interrelated dimensions: motivational, material, skills, and usage access. The digital divide worsens inequality in access to information and resources. According to 2026 data from the U.S. Census Bureau, a significant 'digital divide' persists, with over 15.7 million Americans lacking access to high-speed broadband. Students from low-income households often face limited access to reliable internet and digital devices, which negatively affects their educational opportunities. In the Information Age, people without access to the Internet and other technology are at a disadvantage, for they are less able to connect with others, find and apply for jobs, shop, and learn. People living in poverty, in insecure housing or who are homeless, elderly people, and those living in rural communities may have limited access to the Internet; in contrast, urban middle class people have easy access to the Internet. Another divide is between producers and consumers of Internet content, which could be a result of educational disparities. While social media use varies across age groups, a US 2010 study reported no racial divide. == History == The historical roots of the digital divide in the United States refer to the increasing gap that occurred during the early modern period between those who could and could not access the real time forms of calculation, decision-making, and visualization offered via written and printed media. "Over time, focus has shifted from binary access to differentiated use, where quality and purpose of engagement vary across socio-economic groups." Within this context, ethical discussions regarding the relationship between education and the free distribution of information were raised by thinkers such as Immanuel Kant, Jean Jacques Rousseau, and Mary Wollstonecraft (1712–1778). The latter advocated that governments should intervene to ensure that any society's economic benefits should be fairly and meaningfully distributed. Amid the Industrial Revolution in Great Britain, Rousseau's idea helped to justify poor laws that created a safety net for those who were harmed by new forms of production. Later, when telegraph and postal systems evolved, many used Rousseau's ideas to argue for full access to those services, even if it meant subsidizing hard-to-serve citizens. Thus, "universal services" referred to innovations in regulation and taxation that would allow phone services such as AT&T in the United States to serve hard-to-serve rural users. In 1996, as telecommunications companies merged with Internet companies, the Federal Communications Commission adopted Telecommunications Act of 1996 to consider regulatory strategies and taxation policies to close the digital divide. Though the term "digital divide" was coined among consumer groups that sought to tax and regulate information and communications technology (ICeT) companies to close the digital divide, the topic soon moved onto a global stage. The focus was the World Trade Organization which passed the Telecommunications Services Act, which resisted regulation of ICT companies so that they would be required to serve hard-to-serve individuals and communities. In 1999, to assuage anti-globalization forces, the WTO hosted the "Financial Solutions to Digital Divide" in Seattle, US, co-organized by Craig Warren Smith of Digital Divide Institute and Bill Gates Sr. the chairman of the Bill and Melinda Gates Foundation. It catalyzed a full-scale global movement to close the digital divide, which quickly spread to all sectors of the global economy. In 2000, US president Bill Clinton mentioned the term in the State of the Union Address. Since the early 2000s, the international community has transitioned from a focus on domestic infrastructure to a global, multi-dimensional framework for digital equity. This shift was formalized through the World Summit on the Information Society (WSIS) in Geneva (2003) and Tunis (2005), where the International Telecommunication Union (ITU) established a roadmap for bridging the Global North-South disparity as part of the Sustainable Development Goals. Academic and policy discourse has since evolved to distinguish between the first-level divide (physical access), the second-level divide (digital literacy), and the third-level divide (the ability to translate technology use into socio-economic capital). By the 2020s, critical reflections on national development emphasized that the divide is fundamentally a socio-institutional gap. Research by Tiwari, Kostenko, and Yekhanurov (2025) identifies four pillars for achieving national digital maturity which are digital governance capacity, institutional design to prevent adverse digital incorporation, infrastructure resilience, and citizen capability. This modern era is characterized by the pursuit of meaningful connectivity, a standard that requires internet access to be not only available but affordable, high-speed, and supportive of active content creation. === During the COVID-19 pandemic === At the outset of the COVID-19 pandemic, governments worldwide issued stay-at-home orders that imposed lockdowns, quarantines, restrictions, and closures. The resulting interruptions to schooling, public services, and business operations drove nearly half of the world's population into seeking alternative methods to live while in isolation. These methods included telemedicine, virtual classrooms, online shopping, technology-based social interactions and working remotely, all of which require access to high-speed or broadband internet access and digital technologies. A Pew Research Centre study reports that 90% of Americans describe the use of the Internet as "essential" during the pandemic. The accelerated use of digital technologies created a landscape where the ability, or lack thereof, to access digital spaces became a crucial factor in everyday life. According to the Pew Research Center, 59% of children from lower-income families were likely to face digital obstacles in completing school assignments. These obstacles included the use of a cellphone to complete homework, having to use public Wi-Fi because of unreliable internet service in the home and lack of access to a computer in the home. This difficulty, titled the homework gap, affects more than 30% of K-12 students living below the poverty threshold, and disproportionally affects American Indian/Alaska Native, Black, and Hispanic students. These types of interruptions or privilege gaps in education exemplify problems in the systemic marginalization of historically oppressed individuals in primary education. The pandemic exposed inequity causing discrepancies in learning. "Large-scale events such as COVID-19 intensify both access and skills gaps, underlining the need for resilient digital inclusion policies. Studies during COVID-19 reveal first-level (access) and second-level (skills) divides, with underserved students struggling with reliable internet, devices, and platform navigation ” A lack of "tech readiness", that is, confident and independent use of devices, was reported among the US elderly population; with more than 50% reporting an inadequate knowledge of devices and more than one-third reporting a lack of confidence. "Older adults often face skills and confidence barriers, illustrating later-stage divides in van Dijk’s model." Moreover, according to a UN research paper, similar results can be found across various Asian countries, with those aged over 74, reporting less confident or inconsistent use of digital devices. This aspect of the digital divide and the elderly occurred during the pandemic as healthcare providers increasingly relied upon telemedicine to manage chronic and acute health conditions. == Aspects == There are various definitions of the digital divide, all with slightly different emphasis, which is evidenced by related concepts like digital inclusion, digital participation, digital skills, media literacy, and digital accessibility.“Van Dijk’s model identifies sequential barriers—motivational, material, skills, and usage—that must be addressed to bridge the divide.” === Infrastructure === The infrastructure by which individuals, households, businesses, and communities connect to the Internet addresses the physical mediums that people use to connect to the Internet such as desktop computers, laptops, basic mobile phones or smartphones, MP3 players, gaming consoles, electronic book readers, and tablets. Traditionally, the nature of the divide has been measured in terms of the existing numbers of subscriptions and digital devices. Given the increasing number of such devices, some have concluded that the digital divide among individuals has increasingly been closing as the result of a natural and almost automatic process. Others point to persistent lower levels of connectivity among women, racial and ethnic minorities, people with lower incomes, rura

    Read more →
  • Two-phase locking

    Two-phase locking

    In databases and transaction processing, two-phase locking (2PL) is a pessimistic concurrency control method that guarantees conflict-serializability. It is also the name of the resulting set of database transaction schedules (histories). The protocol uses locks, applied by a transaction to data, which may block (interpreted as signals to stop) other transactions from accessing the same data during the transaction's life. By the 2PL protocol, locks are applied and removed in two phases: Expanding phase: locks are acquired and no locks are released. Shrinking phase: locks are released and no locks are acquired. Two types of locks are used by the basic protocol: Shared and Exclusive locks. Refinements of the basic protocol may use more lock types. Using locks that block processes, 2PL, S2PL, and SS2PL may be subject to deadlocks that result from the mutual blocking of two or more transactions. == Read and write locks == Locks are used to guarantee serializability. A transaction is holding a lock on an object if that transaction has acquired a lock on that object which has not yet been released. For 2PL, the only used data-access locks are read-locks (shared locks) and write-locks (exclusive locks). Below are the rules for read-locks and write-locks: A transaction is allowed to read an object if and only if it is holding a read-lock or write-lock on that object. A transaction is allowed to write an object if and only if it is holding a write-lock on that object. A schedule (i.e., a set of transactions) is allowed to hold multiple locks on the same object simultaneously if and only if none of those locks are write-locks. If a disallowed lock attempts on being held simultaneously, it will be blocked. == Variants == Note that all conflict serializable schedules are also view serializable (but not vice-versa). === Two-phase locking === According to the two-phase locking protocol, each transaction handles its locks in two distinct, consecutive phases during the transaction's execution: Expanding phase (aka Growing phase): locks are acquired and no locks are released (the number of locks can only increase). Shrinking phase (aka Contracting phase): locks are released and no locks are acquired. The two phase locking rules can be summarized as: each transaction must never acquire a lock after it has released a lock. The serializability property is guaranteed for a schedule with transactions that obey this rule. Typically, without explicit knowledge in a transaction on end of phase 1, the rule is safely determined only when a transaction has completed processing and requested commit. In this case, all the locks can be released at once (phase 2). === Conservative two-phase locking === Conservative two-phase locking (C2PL) differs from 2PL in that transactions obtain all the locks they need before the actual execution begins. This is to ensure that a transaction that already holds some locks will not block waiting for other locks. C2PL prevents deadlocks. In cases of heavy lock contention, C2PL reduces the time locks are held on average, relative to 2PL and Strict 2PL, because transactions that hold locks are never blocked. In light lock contention, C2PL holds more locks than is necessary, because it is difficult to predict which locks will be needed in the future, thus leading to higher overhead. A C2PL transaction will not obtain any locks if it cannot obtain all the locks it needs in its initial request. Furthermore, each transaction needs to declare its read and write set (the data items that will be read/written), which is not always possible. Because of these limitations, C2PL is not used very frequently. === Strict two-phase locking === To comply with the strict two-phase locking (S2PL) protocol, a transaction needs to comply with 2PL, and release its write (exclusive) locks only after the transaction has ended (i.e., either committed or aborted). On the other hand, read (shared) locks are released regularly during the shrinking phase. Unlike 2PL, S2PL provides strictness (a special case of cascade-less recoverability). This protocol is not appropriate in B-trees because it causes Bottleneck (while B-trees always starts searching from the parent root). === Strong strict two-phase locking === or Rigorousness, or Rigorous scheduling, or Rigorous two-phase locking To comply with strong strict two-phase locking (SS2PL), a transaction's read and write locks are released only after that transaction has ended (i.e., either committed or aborted). A transaction obeying SS2PL has only a phase 1 and lacks a phase 2 until the transaction has completed. Every SS2PL schedule is also an S2PL schedule, but not vice versa.

    Read more →
  • Deluxe Media

    Deluxe Media

    Deluxe Media Inc., also known simply as Deluxe and formerly Deluxe Entertainment Services Group, Inc., is an American multinational multimedia and entertainment service provisions company owned by Platinum Equity, founded in 1915 by Hungarian-born American film producer William Fox and headquartered in Burbank, California. The company services multiple clients in the film, television, digital content and advertising industries across the globe, and has been recognized with 10 Academy Awards for scientific and technical achievements, including developments in CinemaScope pictures (as part of 20th Century Fox) and more recently for a process of creating archival separations from digital image data. == History == Deluxe began as a film processing laboratory established in 1915 by William Fox under the name De Luxe as part of his eponymous film conglomerate corporation in Fort Lee, New Jersey. In 1916, Fox Film Corporation opened its studio in Hollywood on 13 acres at Sunset and Western. The first Deluxe film laboratory on the west coast was built on the south side of the lot (Fernwood and Serrano), and the laboratory was moved to the new Fox studios building on Manhattan's west side in 1919, where it remained for over 40 years. The "business manager" (later president) of the laboratory was Alan E. Freedman, who guided the company into the 1960s. In 1927, Fox (Deluxe) received a patent for sound-on-film, the Fox Movietone system. In 1927, "Sunrise: A Song of Two Humans," an early Movietone film, opened. Fox Movietone News, ran weekly in theaters until 1963. During the Great Depression, Fox Film Corporation encountered financial difficulties. Among the actions taken to maintain liquidity, Fox sold the laboratories in 1932 to Freedman, who renamed the operation Deluxe. Under Freedman's leadership, Deluxe added two more plants in Chicago and Toronto. In January 1934, Fox was granted an option to rebuy DeLuxe before December 31, 1938. On 31 May 1935, under Sidney Kent, Fox merged his film company with Twentieth Century Pictures to form The Twentieth Century-Fox Film Corporation following a bank-infused reorganisation. The merged company then exercised this option in July 1936, with Freedman remaining as president. In 1953, Deluxe developed the widescreen format CinemaScope. Titles included "There's No Business Like Show Business" (1954) and "The Seven Year Itch" (1955). Other innovations included the processing and sound striping of CinemaScope, and were patented and/or received Academy awards. In 1962 Freedman retired. In the 1960s, Deluxe closed its New York plant, followed by its plants in Chicago and Toronto, as motion picture production declined on the East Coast. In 1972, Deluxe began large volume videocassette production, with a billion by 1996. In 1990, The Rank Organisation acquired Deluxe from Fox. In 2000, Deluxe began large volume DVD production. In 2006, The Rank Organisation sold Deluxe Film Group to MacAndrews & Forbes, renamed Deluxe Entertainment Services Group. On 9 February 2012, Deluxe acquired Hong Kong–based visual effects and post-production company, Centro Digital Pictures, with its founder John Chu remaining as president while reporting to Alaric McAusland, managing director for Deluxe in Australia. In May 2014, Deluxe shut down its Los Angeles plant at Sunset & Western Studios complex, where other studios themselves were demolished way back in 1971. Also that same year, Deluxe closed the Hollywood film labs, and they gave thousands of orphaned film elements to the Academy Film Archive. The Deluxe Laboratories Collection at the Academy Film Archive consists of over 7,500 35mm and 16mm film elements of various motion pictures dating back to the early 1960s. On 22 April 2015, Deluxe and its longtime competitor, Technicolor S.A., announced that they had entered into a binding agreement to create a new joint venture known as Deluxe Technicolor Digital Cinema which will specialize in cinema mastering, distribution and management services. Deluxe got acquired on 4 September 2019 by creditors in a debt-for-equity swap to avoid bankruptcy. On 3 October 2019, Deluxe filed for bankruptcy, pending in the Southern District of New York. The same month on the 24th, the company received court approval to emerge from bankruptcy with a comprehensive restructuring plan. On July 1, 2020, Platinum Equity agreed to acquire the distribution division of Deluxe and re-unite with former CEO Cyril Drabinsky who would merge CineVizion, a film distribution company he founded after leaving Deluxe in 2016, into it. The companies Company 3 and Method Studios which formed the creative divisions of Deluxe were sold to Framestore in November 2020.

    Read more →
  • Gnowit

    Gnowit

    Gnowit (pronounced "know it") is a Canadian software company that provides automated, near-real-time monitoring of legislative, regulatory, and political activity across Canada. Its platform aggregates and analyzes information from government publications, parliamentary debates, committee, and proceedings to provide searchable alerts and reports for organizations monitoring public policy and regulatory developments. The system uses natural-language processing and machine learning techniques to organize and filter large volumes of public information.; the company reports that new publication documents are captured and millions of items are added to its repository daily. == History, Founders and Leadership == Gnowit was co-founded in Ottawa in 2010 by Shahzad Khan and Mohammad Al-Azzouni; Khan serves as chief executive officer. Khan holds a PhD in Computer Science from the University of Cambridge, has more than two decades of experience in AI/ML and computational linguistics, and has authored or co-authored 37 peer-reviewed publications and five patents. Traditionally, companies performed this analysis manually; Gnowit has delivered efficiencies achieved through AI innovations. The company has participated in several Canadian startup and accelerator programs, including Carleton University's Lead To Win initiative, the University of Ottawa's Startup Garage, the Invest Ottawa incubator, and the League of Innovators' BOOST program. === Kubernetes validation (2019–2020) === As part of a Canada's Centre of Excellence in Next Generation Networks (CENGN) project, Gnowit validated a containerized version of its web-intelligence software on Kubernetes. Between 2019 and 2020, Gnowit participated in a project with Canada’s Centre of Excellence in Next Generation Networks (CENGN) to test and scale its platform using containerized infrastructure based on Kubernetes. The initiative focused on improving scalability and supporting the company’s transition from a monolithic software architecture to a cloud-native deployment model. == Products and services == Gnowit markets several modules for public-affairs, compliance, and market-intelligence teams. Legislative & Regulatory Monitoring (vAnalyst). vAnalyst is a monitoring platform that tracks legislative and regulatory activity across Canadian federal, provincial, and territorial jurisdictions. The system aggregates parliamentary debates, bills, committee proceedings, and regulatory publications and provides searchable alerts and reporting tools. The product monitors more than two million web sources to surface relevant items quickly. Parliamentary Live (vAnalyst). Monitors live video feeds from parliamentary sessions and committees with same-day transcripts, AI-generated summaries, witness summaries, and motion detection; municipal coverage is offered as an option. Gnowit can avail transcripts up to two weeks before official releases. These transcripts enable users to navigate and review lengthy parliamentary sittings and committee discussions through searchable text. Municipal Monitoring (vAnalyst). The platform also tracks council meetings, agendas, bylaws, and other municipal government publications from hundreds of Canadian municipalities. The platform aggregates these sources into a single searchable interface for reviewing local government decisions. Curation Edge (analyst service). Curation Edge is an add-on service in which expert analysts work and collaborate with clients to develop a tailored curation guide and deliver daily newsletters or briefs on legislation and media. These reports provide concise summaries, relevant links, and optional metadata, prioritizing key updates with additional context and analysis. The service is customizable, including branding and formatting for executive audiences, and is intended to reduce information overload, support decision-making, and streamline the synthesis and distribution of information. === Coverage and sources === Gnowit monitors sources span Canadian government materials across federal, provincial, and territorial jurisdictions Hansard transcripts (All Jurisdictions, including committees), order papers, committee transcripts, gazettes, bills, acts and regulations, consultations, regulatory-agency publications, and global news media as well as press releases and council-meeting materials from hundreds of municipalities. == Partnerships and support == Gnowit reports collaborations with Canadian academic and ecosystem partners, including: Algonquin College Carleton University McGill University University of Ottawa Université du Québec en Outaouais (UQO) Queen's University The company also participated in the accelerator program at Invest Ottawa and has received support from Canadian research and innovation programs, including: NRC Industrial Research Assistance Program (NRC-IRAP) Mitacs Ontario Centre of Innovation (OCI) (formerly OCE) Gnowit has also referenced membership in the Southern Ontario Smart Computing Innovation Platform (Government of Canada profile: FedDev Ontario – SOSCIP overview). == Technology == Gnowit develops technology intended to support timely decision-making by delivering updates from monitored web sources as they are published. The platform applies artificial intelligence (AI) and machine learning (ML) techniques to monitor, capture, clean, analyze, filter, and organize text, and to generate concise briefs. Its technical approach combines Boolean queries, shallow language processing techniques, and machine learning classifiers within a self-service interface. The company has described its longer-term development framework in relation to a belief–desire–intention (BDI) model of intelligent agents on the web. Gnowit and its founder are listed as inventors/assignees on patents concerning multi-document clustering, salient-content extraction, and sentiment analysis methods that are consistent with these features: US 9,600,470 – Method and system relating to re-labelling multi-document clusters (assignee: Whyz Technologies Ltd.). US 9,336,202 – Method and system relating to salient content extraction for information retrieval (assignee: Whyz Technologies Ltd.). CA 2,865,184 C – Method and system relating to re-labelling multi-document clusters. CA 2,865,186 C – Procédé et système concernant l'analyse de sentiment d'un contenu (sentiment analysis; French record). CA 2,865,187 C – Method and system relating to salient content extraction for information retrieval. == Research and community == In January 2025, Gnowit personnel contributed to regulatory NLP by co-authoring a peer-reviewed paper at the 1st Regulatory NLP Workshop (RegNLP 2025), co-located with COLING in Abu Dhabi. Titled Unifying Large Language Models and Knowledge Graphs for Efficient Regulatory Information Retrieval and Answer Generation, the work introduces PolicyInsight, a framework that joins a dynamic policy data model and knowledge graph with LLMs to monitor policy texts, detect changes, and support retrieval and answer generation; the author list includes Shahzad Khan (CEO, Gnowit Inc.). (ACL Anthology, aclweb.org). Similar information-retrieval technologies are widely used for competitive intelligence, policy monitoring, and media analysis. == White paper == Gnowit has published a practical guide, Automated Government Information Monitoring, which outlines how GR and regulatory teams can design a monitoring and briefing workflow and describes Gnowit's automation features and export options (PDF, email, dashboards, CSV/JSON/XML/API).

    Read more →