AI App Studio

AI App Studio — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Speculative decoding

    Speculative decoding

    Speculative decoding is an inference-time optimization for autoregressive large language models (LLMs) that generates multiple tokens per decoding step instead of one. A smaller draft model proposes a sequence of candidate tokens, and the larger target model verifies them in a single forward pass through a modified rejection sampling scheme. The verification preserves the target model's original output distribution, so the technique produces the same results as standard decoding while cutting latency by roughly two to three times. The name is an analogy to speculative execution in CPU design, where a processor runs instructions along a predicted branch before the outcome is known. == Background == Standard autoregressive decoding in large language models generates one token at a time. The model computes a probability distribution over its vocabulary, samples the next token, and feeds that token back as input. For large models, this process is bottlenecked by memory bandwidth rather than arithmetic throughput: loading the model's parameters from high-bandwidth memory (HBM) to the processor takes up most of the wall-clock time at each step. Because of this, a forward pass over one token and a forward pass over several tokens in a batch take roughly the same time. Speculative decoding relies on this property. == Mechanism == The technique alternates between two phases: drafting and verification. During drafting, a fast approximation model generates a short run of K candidate tokens, typically between 3 and 12. The draft model is usually a much smaller version of the target model or a lightweight auxiliary network. During verification, the target model scores the entire draft sequence in one batched forward pass. A modified rejection sampling algorithm compares the draft and target probabilities at each position. If the target model would have been at least as likely to produce a given token, that token is accepted; the first token that fails is resampled from a corrected distribution, and everything after it is thrown out. The result is that the output distribution is the same as if each token had been generated one at a time. How many tokens get accepted per cycle depends on how well the draft model matches the target. For common words and predictable continuations the match tends to be good, so the target model can confirm several tokens at once. == History == An early precursor was blockwise parallel decoding, proposed in 2018 by Stern, Shazeer, and Uszkoreit. Their method predicted multiple future tokens through auxiliary prediction heads and validated them against the autoregressive model, but it only worked with greedy decoding and did not preserve the full sampling distribution. The modern form of the technique came from Yaniv Leviathan, Matan Kalman, and Yossi Matias at Google Research, who posted "Fast Inference from Transformers via Speculative Decoding" on arXiv in November 2022. Separately and at about the same time, Charlie Chen and colleagues at DeepMind arrived at a closely related method they called speculative sampling, published in February 2023. Both papers introduced the use of rejection sampling to guarantee that the output distribution is unchanged. Leviathan et al. showed roughly 2–3x speedup on T5-XXL (11 billion parameters); Chen et al. reported 2–2.5x on the Chinchilla model (70 billion parameters). The Leviathan et al. paper was presented as an oral at the International Conference on Machine Learning in July 2023. == Variants == SpecInfer (Miao et al., 2024) uses multiple small language models to jointly build a tree of candidate continuations rather than a single chain. The target model verifies the whole tree in parallel and keeps the longest valid path, with reported speedups of 1.5–3.5x. Medusa (Cai et al., 2024) takes a different approach by not using a separate draft model at all. Extra lightweight decoding heads are attached to the target model itself, and each one predicts a token at a different future position. The candidates are evaluated through a tree-structured attention mechanism. The authors measured 2.2–3.6x speedup. EAGLE (Li et al., 2024) performs autoregression on the target model's internal feature representations (specifically the second-to-top layer) rather than on tokens directly. On LLaMA 2 Chat 70B, this gave a 2.7–3.5x latency reduction. Later versions added dynamic draft trees (EAGLE-2) and further optimizations (EAGLE-3), reaching 3–6.5x speedup. == Adoption == By 2024, speculative decoding had become a standard part of production LLM serving. Google uses it in the AI Overviews feature of Google Search. Open-source inference frameworks such as vLLM, NVIDIA's TensorRT-LLM, and SGLang all include built-in support for speculative decoding and its variants. Apple, AWS, and Meta have also published research extending the method or deploying it at scale.

    Read more →
  • Errored second

    Errored second

    In telecommunications and data communication systems, an errored second is an interval of a second during which any error whatsoever has occurred, regardless of whether that error was a single bit error or a complete loss of communication for that entire second. The type of error is not important for the purpose of counting errored seconds. In communication systems with very low uncorrected bit error rates, such as modern fiber-optic transmission systems, or systems with higher low-level error rates that are corrected using large amounts of forward error correction, errored seconds are often a better measure of the effective user-visible error rate than the raw bit error rate. For many modern packet-switched communication systems, even a single uncorrected bit error is enough to cause the loss of a data packet by causing its CRC check to fail; whether that packet loss was caused by a single bit error or a hundred-bit-long error burst is irrelevant. For systems using large amounts of forward error correction, the reverse applies; a single low-level bit error will almost never occur, since any small errors will almost always be corrected, but any error sufficiently large to cause the forward error correction to fail will almost always result in a large burst error. More specialist and precise definitions of errored seconds exist in standards such as the T1 and DS1 transport systems.

    Read more →
  • Dynamic web page

    Dynamic web page

    A dynamic web page is a web page constructed at runtime (during software execution), as opposed to a static web page, delivered as it is stored. A server-side dynamic web page is a web page whose construction is controlled by an application server processing server-side scripts. In server-side scripting, parameters determine how the assembly of every new web page proceeds, and including the setting up of more client-side processing. A client-side dynamic web page processes the web page using JavaScript running in the browser as it loads. JavaScript can interact with the page via Document Object Model (DOM), to query page state and modify it. Even though a web page can be dynamic on the client-side, it can still be hosted on a static hosting service such as GitHub Pages or Amazon S3 as long as there is not any server-side code included. A dynamic web page is then reloaded by the user or by a computer program to change some variable content. The updating information could come from the server, or from changes made to that page's DOM. This may or may not truncate the browsing history or create a saved version to go back to, but a dynamic web page update using AJAX technologies will neither create a page to go back to, nor truncate the web browsing history forward of the displayed page. Using AJAX, the end user gets one dynamic page managed as a single page in the web browser while the actual web content rendered on that page can vary. The AJAX engine sits only on the browser requesting parts of its DOM, the DOM, for its client, from an application server. A particular application server could offer a standardized REST style interface to offer services to the web application. DHTML is the umbrella term for technologies and methods used to create web pages that are not static web pages, though it has fallen out of common use since the popularization of AJAX, a term which is now itself rarely used. Client-side-scripting, server-side scripting, or a combination of these make for the dynamic web experience in a browser. == Basic concepts == Classical hypertext navigation, with HTML or XHTML alone, provides "static" content, meaning that the user requests a web page and simply views the page and the information on that page. However, a web page can also provide a "live", "dynamic", or "interactive" user experience. Content (text, images, form fields, etc.) on a web page can change, in response to different contexts or conditions. There are two ways to create this kind of effect: Using client-side scripting to change interface behaviors within a specific web page, in response to mouse or keyboard actions, data received from a web API, websocket or at specified timing events. In this case the dynamic behavior occurs within the presentation. Using server-side scripting to change the supplied page source code between pages, adjusting the sequence or reload of the web pages or web content supplied to the browser. Server responses may be determined by such conditions as data in a posted HTML form, parameters in the URL, the type of browser being used, the passage of time, or a database or server state. Web pages that use client-side scripting must use presentation technology broadly called rich interfaced pages. Client-side scripting languages like JavaScript or ActionScript, used for Dynamic HTML (DHTML) and Flash technologies respectively, are frequently used to orchestrate media types (sound, animations, changing text, etc.) of the presentation. The scripting also allows use of remote scripting, a technique by which the DHTML page requests additional information from a server, using a hidden Frame, XMLHttpRequests, or a web service. It is also possible to use a web framework to create a web API, which the client, via the use of JavaScript, uses to obtain data and alter its appearance or behavior dynamically depending on the data. Web pages that use server-side scripting are often created with the help of server-side languages such as PHP, Perl, ASP, JSP, ColdFusion and other languages. These server-side languages typically use the Common Gateway Interface (CGI) to produce dynamic web pages. These kinds of pages can also use, on the client-side, the first kind (DHTML, etc.). == History == It is difficult to be precise about "dynamic web page beginnings" or chronology because the precise concept makes sense only after the "widespread development of web pages". HTTP has existed since 1989, HTML, publicly standardized since 1996. The web browser's rise in popularity started with Mosaic in 1993. Between 1995 and 1996, multiple dynamic web products were introduced to the market, including Coldfusion, WebObjects, PHP, and Active Server Pages. The introduction of JavaScript (then known as LiveScript) enabled the production of client-side dynamic web pages, with JavaScript code executed in the client's browser. The letter "J" in the term AJAX originally indicated the use of JavaScript, as well as XML. With the rise of server side JavaScript processing, for example, Node.js, originally developed in 2009, JavaScript is also used to dynamically create pages on the server that are sent fully formed to clients. MediaWiki, the content management system that powers Wikipedia, is an example for an originally server-side dynamic web page, interacted with through form submissions and URL parameters. Throughout time, progressively enhancing extensions such as the visual editor have also added elements that are dynamic on the client side, while the original dynamic server-side elements such as the classic edit form remain available to be fallen back on (graceful degradation) in case of error or incompatibility. == Server-side scripting == A program running on a web server is used to generate the web content on various web pages, manage user sessions, and control workflow. Server responses may be determined by such conditions as data in a posted HTML form, parameters in the URL, the type of browser being used, the passage of time, or a database or server state. Such web pages are often created with the help of server-side languages such as ASP, ColdFusion, Java, JavaScript, Perl, PHP, Ruby, Python, and other languages, by a support server that can run on the same hardware as the web server. These server-side languages often use the Common Gateway Interface (CGI) to produce dynamic web pages. Two notable exceptions are ASP.NET, and JSP, which reuse CGI concepts in their APIs but actually dispatch all web requests into a shared virtual machine. The server-side languages are used to embed tags or markers within the source file of the web page on the web server. When a user on a client computer requests that web page, the web server interprets these tags or markers to perform actions on the server. For example, the server may be instructed to insert information from a database or information such as the current date. Dynamic web pages are often cached when there are few or no changes expected and the page is anticipated to receive considerable amount of web traffic that would wastefully strain the server and slow down page loading if it had to generate the pages on the fly for each request. == Client-side scripting == Client-side scripting is changing interface behaviors within a specific web page in response to input device actions, or at specified timing events. In this case, the dynamic behavior occurs within the presentation. The client-side content is generated on the user's local computer system. Such web pages use presentation technology called rich interfaced pages. Client-side scripting languages like JavaScript or ActionScript, used for Dynamic HTML (DHTML) and Flash technologies respectively, are frequently used to orchestrate media types (sound, animations, changing text, etc.) of the presentation. Client-side scripting also allows the use of remote scripting, a technique by which the DHTML page requests additional information from a server, using a hidden frame, XMLHttpRequests, or a Web service. The first public use of JavaScript was in 1995, when the language was implemented in Netscape Navigator 2, standardized as ECMAScript two years later. Example The client-side content is generated on the client's computer. The web browser retrieves a page from the server, then processes the code embedded in the page (typically written in JavaScript) and displays the retrieved page's content to the user. The innerHTML property (or write command) can illustrate the client-side dynamic page generation: two distinct pages, A and B, can be regenerated (by an "event response dynamic") as document.innerHTML = A and document.innerHTML = B; or "on load dynamic" by document.write(A) and document.write(B). == Combination technologies == All of the client and server components that collectively build a dynamic web page are called a web application. Web applications manage user interactions, state, security, and performance. Ajax uses a combination of both client-side script

    Read more →
  • GlTF

    GlTF

    glTF (Graphics Library Transmission Format or GL Transmission Format and formerly known as WebGL Transmissions Format or WebGL TF) is a standard file format for three-dimensional scenes and models. A glTF file uses one of two possible file extensions: .gltf (JSON/ASCII) or .glb (binary). Both .gltf and .glb files may reference external binary and texture resources. Alternatively, both formats may be self-contained by directly embedding binary data buffers (as base64-encoded strings in .gltf files or as raw byte arrays in .glb files). An open standard developed and maintained by the Khronos Group, it supports 3D model geometry, appearance, scene graph hierarchy, and animation. It is intended to be a streamlined, interoperable format for the delivery of 3D assets, while minimizing file size and runtime processing by apps. As such, its creators have described it as the "JPEG of 3D". == Overview == The glTF format stores data primarily in JSON. The JSON may also contain blobs of binary data known as buffers, and refer to external files, for storing mesh data, images, etc. The binary .glb format also contains JSON text, but serialized with binary chunk headers to allow blobs to be directly appended to the file. The fundamental building blocks of a glTF scene are nodes. Nodes are organized into a hierarchy, such that a node may have other nodes defined as children. Nodes may have transforms relative to their parent. Nodes may refer to resources, such as meshes, skins, and cameras. Meshes may refer to materials, which refer to textures, which refer to images. Scenes are defined using an array of root nodes. Most of the top-level glTF properties use a flat hierarchy for storage. Nodes are saved in an array and are referred to by index, including by other nodes. A glTF scene refers to its root nodes by index. Furthermore, nodes refer to meshes by index, which refer to materials by index, which refer to textures by index, which refer to images by index. All glTF data structures support being extended using a JSON property, allowing arbitrary JSON data to be added. == Releases == === glTF 1.0 === Members of the COLLADA working group conceived the file format in 2012. At SIGGRAPH 2012, Khronos presented a demo of glTF, which was then called WebGL Transmissions Format (WebGL TF). On October 19, 2015, Khronos released the glTF 1.0 specification. ==== Adoption of glTF 1.0 ==== At SIGGRAPH 2016, Oculus announced their adoption of glTF citing the similarities to their ovrscene format. In October 2016, Microsoft joined the 3D Formats working group at Khronos to collaborate on glTF. === glTF 2.0 === The second version, glTF 2.0, was released in June 2017, and is a complete overhaul of the file format from version 1.0, with most tools adopting the 2.0 version. Based on a proposal by Fraunhofer originally presented at SIGGRAPH 2016, physically based rendering (PBR) was added, replacing WebGL shaders used in glTF 1.0. glTF 2.0 added the GLB binary format into the base specification. Other upgrades include sparse accessors and morph targets for techniques such as facial animation, and schema tweaks and breaking changes for corner cases or performance such as replacing top-level glTF object properties with arrays for faster index-based access. There is ongoing work towards import and export in Unity and an integrated multi-engine viewer and validator. ==== Adoption of glTF 2.0 ==== On March 3, 2017, Microsoft announced that they would be using glTF 2.0 as the 3D asset format across their product line, including Paint 3D, 3D Viewer, Remix 3D, Babylon.js, and Microsoft Office. Sketchfab also announced support for glTF 2.0. The glTF and GLB formats are used on and supported by companies including DGG, UX3D, Sketchfab, Facebook, Microsoft, Meta, Google, Adobe, Box, TurboSquid, Unreal Engine, Unity, and Qt Quick 3D. The format has been noted as an important standard for augmented reality, integrating with modeling software such as Autodesk Maya, Autodesk 3ds Max, and Poly. In February 2020, the Smithsonian Institution launched their Open Access Initiative, releasing approximately 2.8 million 2D images and 3D models into the public domain, using glTF for the 3D models. In July 2022, glTF 2.0 was released as the ISO/IEC 12113:2022 International Standard. Khronos stated they would make regular submissions to bring updates and new widely adopted glTF functionality into refreshed versions of ISO/IEC 12113 to ensure that there is no long-term divergence between the ISO/IEC and Khronos specifications. The open-source game engine Godot supports importing glTF 2.0 files since version 3.0 and export since version 4.0. === Extensions === The glTF format can be extended with arbitrary JSON to add new data and functionality. Extensions can be placed on any part of a glTF, including nodes, animations, materials, textures, and on the entire document. Khronos keeps a non-comprehensive registry of glTF extensions on GitHub, including all official Khronos extensions and a few third-party extensions. PBR extensions model the physical appearance of real-world objects, allowing developers to create realistic 3D assets that have the correct appearance. As new PBR extensions are released, they continue to expand PBR capabilities within the glTF framework, allowing a wider range of scenes and objects to be realistically rendered as 3D assets. The KTX 2.0 extension for universal texture compression enables 3D models in the glTF format to be highly compressed and to use natively supported texture formats, reducing file size and boosting rendering speed. Draco is a glTF extension for mesh compression, to compress and decompress 3D meshes, to help reduce the size of 3D files. It compresses vertex attributes, normals, colors, and texture coordinates. Various glTF extensions for game engine interoperability have been developed by OMI group. This includes extensions for physics shapes, physics bodies, physics joints, audio playback, seats, spawn points, and more. The VRM consortium has developed glTF extensions for advanced humanoid 3D avatars including dynamic spring bones and toon materials. == Derivative formats == 3D Tiles, an OGC Community Standard, builds on glTF to add a spatial data structure, metadata, and declarative styling for streaming massive heterogeneous 3D geospatial datasets. VRM, a model format for VR, is built on the .glb format. It is a 3D humanoid avatar specification and file format. == Software ecosystem == Khronos maintains the glTF Sample Viewer for viewing glTF assets. Khronos also maintains the glTF Validator for validating if 3D models conform to the glTF specification. Khronos maintains a glTF Compressor tool to interactively optimize and fine-tune compression settings for glTF assets using KTX 2.0 textures. glTF loaders are in open-source WebGL engines including PlayCanvas, Three.js, Babylon.js, Cesium, PEX, xeogl, and A-Frame. The Godot game engine supports and recommends the glTF format, with both import and export support. Open-source glTF converters are available from COLLADA, FBX, and OBJ. Assimp can import and export glTF. glTF files can also be directly exported from a variety of 3D editors, such as Blender, Unity (using the glTFast importer/exporter), Freecad, Vectary, Autodesk 3ds Max (natively or using Verge3D exporter), Autodesk Maya (using babylon.js exporter), Autodesk Inventor, Modo, Houdini, Paint 3D, Godot, and Substance Painter. Open-source glTF utility libraries are available for programming languages including JavaScript, Node.js, C++, C#, Python, Haskell, Java, Go, Rust, Haxe, Ada, and TypeScript. Khronos keeps a list of these libraries and other related applications on their ecosystem site. The Khronos 3D Commerce Working Group released Asset Creation Guidelines in 2020 outlining best practices for use of the glTF file format in 3D Commerce. In 2025, the Working Group launched Asset Creation Guidelines 2.0, a continuously updated resource with additional guidance for geometry, mesh optimization, UV maps, textures, materials/PBR performance, and web optimization. The Khronos PBR Neutral Tone Mappers specification is a tone mapper designed to faithfully reproduce an object's base color, hue, and saturation when using PBR rendering under grayscale lighting, supporting brand- and product-accurate color representation. Khronos maintains the glTF Asset Auditor to allow retailers and advertising technology platforms to validate 3D assets against either a default Audit Profile modelled on the 2020 3D Commerce Asset Creation Guidelines or a custom profile defined by the target application.

    Read more →
  • Snapshot isolation

    Snapshot isolation

    In databases, and transaction processing (transaction management), snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot. Snapshot isolation has been adopted by several major database management systems, such as InterBase, Firebird, Oracle, MySQL, PostgreSQL, SQL Anywhere, MongoDB and Microsoft SQL Server (2005 and later). The main reason for its adoption is that it allows better performance than serializability, yet still avoids most of the concurrency anomalies that serializability avoids (but not all). In practice snapshot isolation is implemented within multiversion concurrency control (MVCC), where generational values of each data item (versions) are maintained: MVCC is a common way to increase concurrency and performance by generating a new version of a database object each time the object is written, and allowing transactions' read operations of several last relevant versions (of each object). Snapshot isolation has been used to criticize the ANSI SQL-92 standard's definition of isolation levels, as it exhibits none of the "anomalies" that the SQL standard prohibited, yet is not serializable (the anomaly-free isolation level defined by ANSI). In spite of its distinction from serializability, snapshot isolation is sometimes referred to as serializable by Oracle. == Definition == A transaction executing under snapshot isolation appears to operate on a personal snapshot of the database, taken at the start of the transaction. When the transaction concludes, it will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was taken. Such a write–write conflict will cause the transaction to abort. In a write skew anomaly, two transactions (T1 and T2) concurrently read an overlapping data set (e.g. values V1 and V2), concurrently make disjoint updates (e.g. T1 updates V1, T2 updates V2), and finally concurrently commit, neither having seen the update performed by the other. Were the system serializable, such an anomaly would be impossible, as either T1 or T2 would have to occur "first", and be visible to the other. In contrast, snapshot isolation permits write skew anomalies. As a concrete example, imagine V1 and V2 are two balances held by a single person, Phil. The bank will allow either V1 or V2 to run a deficit, provided the total held in both is never negative (i.e. V1 + V2 ≥ 0). Both balances are currently $100. Phil initiates two transactions concurrently, T1 withdrawing $200 from V1, and T2 withdrawing $200 from V2. If the database guaranteed serializable transactions, the simplest way of coding T1 is to deduct $200 from V1, and then verify that V1 + V2 ≥ 0 still holds, aborting if not. T2 similarly deducts $200 from V2 and then verifies V1 + V2 ≥ 0. Since the transactions must serialize, either T1 happens first, leaving V1 = −$100, V2 = $100, and preventing T2 from succeeding (since V1 + (V2 − $200) is now −$200), or T2 happens first and similarly prevents T1 from committing. If the database is under snapshot isolation(MVCC), however, T1 and T2 operate on private snapshots of the database: each deducts $200 from an account, and then verifies that the new total is zero, using the other account value that held when the snapshot was taken. Since neither update conflicts, both commit successfully, leaving V1 = V2 = −$100, and V1 + V2 = −$200. Some systems built using multiversion concurrency control (MVCC) may support (only) snapshot isolation to allow transactions to proceed without worrying about concurrent operations, and more importantly without needing to re-verify all read operations when the transaction finally commits. This is convenient because MVCC maintains a series of recent history consistent states. The only information that must be stored during the transaction is a list of updates made, which can be scanned for conflicts fairly easily before being committed. However, MVCC systems (such as MarkLogic) will use locks to serialize writes together with MVCC to obtain some of the performance gains and still support the stronger "serializability" level of isolation. == Workarounds == Potential inconsistency problems arising from write skew anomalies can be fixed by adding (otherwise unnecessary) updates to the transactions in order to enforce the serializability property. Materialize the conflict Add a special conflict table, which both transactions update in order to create a direct write–write conflict. Promotion Have one transaction "update" a read-only location (replacing a value with the same value) in order to create a direct write–write conflict (or use an equivalent promotion, e.g. Oracle's SELECT FOR UPDATE). In the example above, we can materialize the conflict by adding a new table which makes the hidden constraint explicit, mapping each person to their total balance. Phil would start off with a total balance of $200, and each transaction would attempt to subtract $200 from this, creating a write–write conflict that would prevent the two from succeeding concurrently. However, this approach violates the normal form. Alternatively, we can promote one of the transaction's reads to a write. For instance, T2 could set V1 = V1, creating an artificial write–write conflict with T1 and, again, preventing the two from succeeding concurrently. This solution may not always be possible. In general, therefore, snapshot isolation puts some of the problem of maintaining non-trivial constraints onto the user, who may not appreciate either the potential pitfalls or the possible solutions. The upside to this transfer is better performance. == Terminology == Snapshot isolation is called "serializable" mode in Oracle and PostgreSQL versions prior to 9.1, which may cause confusion with the "real serializability" mode. There are arguments both for and against this decision; what is clear is that users must be aware of the distinction to avoid possible undesired anomalous behavior in their database system logic. == History == Snapshot isolation arose from work on multiversion concurrency control databases, where multiple versions of the database are maintained concurrently to allow readers to execute without colliding with writers. Such a system allows a natural definition and implementation of such an isolation level. InterBase, later owned by Borland, was acknowledged to provide SI rather than full serializability in version 4, and likely permitted write-skew anomalies since its first release in 1985. Unfortunately, the ANSI SQL-92 standard was written with a lock-based database in mind, and hence is rather vague when applied to MVCC systems. Berenson et al. wrote a paper in 1995 critiquing the SQL standard, and cited snapshot isolation as an example of an isolation level that did not exhibit the standard anomalies described in the ANSI SQL-92 standard, yet still had anomalous behaviour when compared with serializable transactions. In 2008, Cahill et al. showed that write-skew anomalies could be prevented by detecting and aborting "dangerous" triplets of concurrent transactions. This implementation of serializability is well-suited to multiversion concurrency control databases, and has been adopted in PostgreSQL 9.1, where it is known as Serializable Snapshot Isolation (SSI). When used consistently, this eliminates the need for the above workarounds. The downside over snapshot isolation is an increase in aborted transactions. This can perform better or worse than snapshot isolation with the above workarounds, depending on workload.

    Read more →
  • End-to-end encryption

    End-to-end encryption

    End-to-end encryption (E2EE) is a method of implementing a secure communication system where only the sender and intended recipient can read the messages. No one else, including the system provider, telecom providers, Internet providers or malicious actors, can access the cryptographic keys needed to read or send messages. End-to-end encryption prevents data from being read or secretly modified, except by the sender and intended recipients. In many applications, messages are relayed from a sender to some recipients by a service provider. In an E2EE-enabled service, messages are encrypted on the sender's device such that no third party, including the service provider, has the means to decrypt them. The recipients retrieve encrypted messages and decrypt them independently on their own devices. Since third parties cannot decrypt the data being communicated or stored, services with E2EE are better at protecting user data from data breaches and espionage. Computer security experts, digital freedom organizations, and human rights activists advocate for the use of E2EE due to its security and privacy benefits, including its ability to resist mass surveillance. Popular messaging apps like WhatsApp, iMessage, Facebook Messenger, and Signal use end-to-end encryption for chat messages, with some also supporting E2EE of voice and video calls. As of May 2025, WhatsApp is the most widely used E2EE messaging service, with over 3 billion users. Meanwhile, Signal with an estimated 70 million users, is regarded as the current gold standard in secure messaging by cryptographers, protestors, and journalists. Since end-to-end encrypted services cannot offer decrypted messages in response to government requests, the proliferation of E2EE has been met with controversy. Around the world, governments, law enforcement agencies, and child protection groups have expressed concerns over its impact on criminal investigations. As of 2025, some governments have successfully passed legislation targeting E2EE, such as Australia's Telecommunications and Other Legislation Amendment Act (2018) and the Online Safety Act (2023) in the UK. Other attempts at restricting E2EE include the EARN IT Act in the US and the Child Sexual Abuse Regulation in the EU.[1] Nevertheless, some government bodies such as the UK's Information Commissioner's Office and the US's Cybersecurity and Infrastructure Security Agency (CISA) have argued for the use of E2EE, with Jeff Greene of the CISA advising that "encryption is your friend" following the discovery of the Salt Typhoon espionage campaign in 2024. == Definitions == End-to-end encryption is a means of ensuring the security of communications in applications like secure messaging. Under E2EE, messages are encrypted on the sender's device such that they can be decoded only by the final recipient's device. In many non-E2EE messaging systems, including email and many chat platforms, messages pass through intermediaries and are stored by a third party service provider, from which they are retrieved by the recipient. Even if messages are encrypted, they are only encrypted 'in transit', and are thus accessible by the service provider. Server-side disk encryption is also distinct from E2EE because it does not prevent the service provider from viewing the information, as they have the encryption keys and can simply decrypt it. The term "end-to-end encryption" originally only meant that the communication is never decrypted during its transport from the sender to the receiver. For example, around 2003, E2EE was proposed as an additional layer of encryption for GSM or TETRA, in addition to the existing radio encryption protecting the communication between the mobile device and the network infrastructure. This has been standardized by SFPG for TETRA. Note that in TETRA, the keys are generated by a Key Management Centre (KMC) or a Key Management Facility (KMF), not by the communicating users. Later, around 2014, the meaning of "end-to-end encryption" started to evolve when WhatsApp encrypted a portion of its network, requiring that not only the communication stays encrypted during transport, but also that the provider of the communication service is not able to decrypt the communications—maliciously or when requested by law enforcement agencies. Similarly, messages must be undecryptable in transit by attackers through man-in-the-middle attacks. This new meaning is now the widely accepted one. == Motivations == The lack of end-to-end encryption can allow service providers to easily provide search and other features, or to scan for illegal and unacceptable content. However, it also means that content can be read by anyone who has access to the data stored by the service provider, by design or via a backdoor. This can be a concern in many cases where privacy is important, such as in governmental and military communications, financial transactions, and when sensitive information such as health and biometric data are sent. If this content were shared without E2EE, a malicious actor or adversarial government could obtain it through unauthorized access or subpoenas targeted at the service provider. E2EE alone does not guarantee privacy or security. For example, the data may be held unencrypted on the user's own device or accessed through their own app if their credentials are compromised. == Modern implementations == === Messaging === In May 2026, Meta ended support for end-to-end encryption (E2EE) on Instagram, reversing a previous commitment to expand the technology across its messaging services. The company justified the move as a measure to mitigate fraudulent activity and facilitate the detection of harmful content. The decision highlighted a conflict between digital privacy and online safety; while child protection organizations supported the change to better identify predatory behavior, privacy advocates argued that removing E2EE compromises user security. As of 2025, messaging apps like Signal and WhatsApp are designed to exclusively use end-to-end encryption. Both Signal and WhatsApp use the Signal Protocol. Other messaging apps and protocols that support end-to-end encryption include Facebook Messenger, iMessage, Telegram, Matrix, and Keybase. Although Telegram supports end-to-end encryption, it has been criticized for not enabling it by default, instead supporting E2EE through opt-in "secret chats". As of 2020, Telegram did not support E2EE for group chats and no E2EE on its desktop clients. In 2022, after controversy over the use of Facebook Messenger messages in an abortion lawsuit in Nebraska, Facebook added support for end-to-end encryption in the Messenger app. Writing for Wired, technologist Albert Fox Cahn criticized Messenger's approach to end-to-end encryption, which required the user to opt into E2EE for each conversation and split the message thread into two chats which were easy for users to confuse. In December 2023, Facebook announced plans to enable end-to-end encryption by default despite pressure from British law enforcement agencies. As of 2016, many server-based communications systems did not include end-to-end encryption. These systems can only guarantee the protection of communications between clients and servers, meaning that users have to trust the third parties who are running the servers with the sensitive content. End-to-end encryption is regarded as safer because it reduces the number of parties who might be able to interfere or break the encryption. In the case of instant messaging, users may use a third-party client or plugin to implement an end-to-end encryption scheme over an otherwise non-E2EE protocol. === Audio and video conferencing === Signal and WhatsApp use end-to-end encryption for audio and video calls. Since 2020, Signal has also supported end-to-encrypted video calls. In 2024, Discord added end-to-end encryption for audio and video calls, voice channels, and certain live streams. However, they had no plans to implement E2EE for messages. In 2020, after acquiring Keybase, Zoom announced end-to-end encryption would be limited to paid accounts. Following criticism from human rights advocates, Zoom extended the feature to all users with accounts. In 2021, Zoom settled an $85M class action lawsuit over past misrepresentation about end-to-end encryption. The FTC confirmed Zoom previously retained access to meeting keys. === Other uses === Some encrypted backup and file sharing services provide client-side encryption. Nextcloud and MEGA, offer end-to-end encryption of shared files. The term "end-to-end encryption" is sometimes incorrectly used to describe client-side encryption. Some non-E2EE systems, such as Lavabit and Hushmail, have described themselves as offering "end-to-end" encryption when they did not. == Law enforcement and regulation == In 2022, Facebook Messenger came under scrutiny because the messages between a mother and daughter in Nebraska were used to seek criminal charges in an abortion-rel

    Read more →
  • Digital history

    Digital history

    Digital history is the use of digital media to further historical analysis, presentation, and research. It is a branch of the digital humanities and an extension of quantitative history, cliometrics, and computing. Digital history is commonly known as digital public history, concerned primarily with engaging online audiences with historical content, or digital research methods, that further academic research. Digital history outputs include: digital archives, online presentations, data and information visualizations, interactive maps, timelines, audio files, and virtual worlds. These outputs are designed to enhance accessibility to users, facilitating engagement with historical content. Recent digital history projects focus on creativity, collaboration, and technical innovation, text mining, corpus linguistics, network analysis, 3D modeling, and big data analysis. By utilizing these resources, the user can rapidly develop new analyses that can link to, extend, and bring to life existing histories. == History == Rooted in earlier social science history work, particularly around the history of enslavement in the United States, early digital history in the 1960s and 70s focused on using computers to conduct quantitative analyses, primarily of demographic and social history data - censuses, election returns, city directories, and other tabular or countable data. - with the aim of producing defensible research findings These early computers could be programmed to conduct statistical analyses of these records, creating tallies, or seeking trends across records. This research into historical demography was rooted in the rise of social history as a field of historical interest. The historians involved in this work sought to quantify past societies, to come to new conclusions about communities and population. Computers proved capable tools for that type of work. By the late 1970s younger historians turned to cultural studies, most of these studies involved online databases that were checked by Professionals in Great Britain about once a year. The outpouring of quantitative studies by established scholars continued. Since then, quantitative history and cliometrics have been used primarily by historically minded economists and political scientists. In the late 1980s quantifiers founded the Association for History and Computing. This movement provided some of the impetus for the rise of digital history in the 1990s. The more recent roots of digital history were in software rather than online networks. In 1982, the Library of Congress embarked on its Optical Disk Pilot Project, which placed text and images from its collection on to laserdiscs and CD-ROMs. The library started offering online exhibits in 1992 when it launched Selected Civil War Photographs. In 1993, Roy Rosenzweig, along with Steve Brier and Josh Brown, produced their award-winning CD-ROM Who Built America? From the Centennial Exposition of 1876 to the Great War of 1914, designed for Apple, Inc. that integrated images, text, film and sound clips, displayed in a visual interface that supported a text narrative. Among the earliest online digital history projects were The Heritage Project of the University of Kansas, and medieval historian Dr. Lynn Nelson's World History Index and History Central Catalogue. Another was The Valley of the Shadow, conceived in 1991 by current University of Richmond professor of humanities and president emeritus, Edward L. Ayers, who was then at the University of Virginia. The Institute for Advanced Technology in the Humanities (IATH) at the University of Virginia adopted the Valley Project and partnered with IBM to collect and transcribe historical sources into digital files. The project collected data related to Augusta County in Virginia and Franklin County in Pennsylvania during the American Civil War. In 1996, William G. Thomas III joined Ayers on the Valley Project. Together, they produced an online article entitled "The Differences Slavery Made: A Close Analysis of Two American Communities," which also appeared in The American Historical Review in 2003. A CD-ROM also accompanied the Valley Project, published by W. W. Norton and Company in 2000. Rosenzweig, who died October 11, 2007, founded the Center for History and New Media (CHNM) at George Mason University in 1994. Today, CHNM boasts several digital tools available to historians, such as Zotero, Omeka or Tropy. In 1997, Ayers and Thomas used the term "digital history" when they proposed and founded the Virginia Center for Digital History (VCDH) at the University of Virginia, the earliest center devoted exclusively to history. Several other institutions promoting digital history include the Center for Humane Arts, Letters, and Social Sciences Online (MATRIX) at Michigan State University, Maryland's Institute for Technology in the Humanities, and the Center for Digital Research in the Humanities at the University of Nebraska. In 2004, Emory University launched Southern Spaces, a "peer-reviewed Internet journal and scholarly forum" examining the history of the South. == Applications == There are many potential benefits to the use of digital history when combined with traditional historical methods. Some of these applications include: Combining traditional historical methods and new research methods in order to come to new conclusions. Using different tools to extract and analyse larger amounts of data that would not be manageable otherwise. Create models and maps of data extracted to create a visualisation of the data. Data extracted and analysed can be placed alongside existing historiography to increase combined historical knowledge. By adding new research methods to existing historical method, historians can benefit greatly from the ability to work with larger amounts of data and develop new interpretations from this. == Notable Projects == The collaborative nature of most digital history endeavors has meant that the discipline has developed primarily at institutions with the resources to sponsor content research and technical innovation. Two of the first centers, George Mason University's Center for History and New Media and the Virginia Center for Digital History at the University of Virginia have been among the leaders in the development of digital history projects and the education of digital historians. Some of the noteworthy projects emerging from these pioneering centers are The Geography of Slavery, The Texas Slavery Project, and The Countryside Transformed at VCDH and Liberty, Equality, Fraternity: Exploring the French Revolution and The Lost Museum at the CHNM. In each of these projects, mediated archives holding multiple types of sources are combined with digital tools to analyze and illuminate an historical question to a varying degree; this integration of content and tools with analysis is one of the hallmarks of digital history—projects move beyond archives or collections and into scholarly analysis and the use of digital tools to develop that analysis. The differences between the ways projects incorporate these integrations are a measure of the development of the field and point to the ongoing debates over what digital history can and should be. While many of the projects at VCDH, CHNM, and other university's centers have been geared towards academics and post-secondary education, the University of Victoria (British Columbia), in conjunction with the Université de Sherbrooke and the Ontario Institute for Studies in Education at the University of Toronto, has created as series of projects for all ages, "Great Unsolved Mysteries in Canadian History." Laden with instructional aids, this site asks teachers to introduce students to historical research methods to help them develop analytical skills and a sense of the complexities of their national history. Issues of race, religion, and gender are addressed in carefully constructed modules that cover incidents in Canadian history from Viking exploration through the 1920s. One of the original co-creators of the project, John Lutz has also developed Victoria's Victoria with the University of Victoria and Malaspina University-College. In addition to Ayers, Thomas, Lutz, and Rosenzweig, numerous other individual scholars work with digital history techniques and have made and/or continue to make important contributions to the field. Robert Darnton's 2000 article, "An Early Information Society: News and the Media in Eighteenth-Century Paris" was supplemented with electronic resources and is an early model of the discussions around digital history and its future in the humanities. One of the first major digital projects to be reviewed by the American Historical Review (AHR) was Philip Ethington's "Los Angeles and the Problem of Urban Historical Knowledge"—a multimedia exploration of changes to Los Angeles' physical profile over the course of several decades. In this essay, he also expresses his beliefs that historians have major power in

    Read more →
  • Content reference identifier

    Content reference identifier

    A content reference identifier or CRID is a concept from the standardization work done by the TV-Anytime forum. It is or closely matches the concept of the Uniform Resource Locator, or URL, as used on the World-Wide Web: A unit of content, in a broadcast stream, can be referred to by its globally unique CRID in the same way that a webpage can be referred to by its globally unique URL on the web. The concept of CRID permits referencing contents unambiguously, regardless of their location, i.e., without knowing specific broadcast information (time, date and channel) or how to obtain them through a network, for instance, by means of a streaming service or by downloading a file from an Internet server. The receiver must be capable of resolving these unambiguous references, i.e. of translating them into specific data that will allow it to obtain the location of that content in order to acquire it. This makes it possible for recording processes to take place without knowing that information, and even without knowing beforehand the duration of the content to be recorded: a complete series by a simple click, a program that has not been scheduled yet, a set of programs grouped by a specific criterion... This framework allows for the separation between the reference to a given content (the CRID) and the necessary information to acquire it, which is called a “locator”. Each CRID may lead to one or more locators which will represent different copies of the same content. They may be identical copies broadcast in different channels or dates, or cost different prices. They may also be distinct copies with different technical parameters such as format or quality. It may also be the case that the resolution process of a CRID provides another CRID as a result (for example, its reference in a different network, where it has an alternative identifier assigned by a different operator) or a set of CRIDs (for instance, if the original CRID represents a TV series, in which case the resolution process would result in the list of CRIDs representing each episode). From the above it can be concluded that provided that a given content can belong to many groups (each possibly defined by distinctive qualities), it is possible that many CRIDs carry the same content. That is, several CRIDs may be resolved into the same locator. A CRID is not exactly a universal, unique and exclusive identifier for a given content. It is closely related to the authority that creates it, to the resolution service provider, and to the content provider in such a way that the same content may have different CRIDs depending on the field in which they are used (for example, a different one for each television operator that has the rights to broadcast the content). == Format == A CRID is specified much like URLs. In fact, a CRID is a so-called URI. Typically, the content creator, the broadcaster or a third party will use their DNS-names in a combination with a product-specific name to create globally unique CRIDs. That is, the syntax of a CRID is: crid://authority/data The authority field represents the entity that created the CRID and its format is that of a DNS name. The data field represents a string of characters that will unambiguously identify the content within the authority scope (it is a string of characters assigned by the authority itself). As an example, let's assume that BBC wanted to make a CRID for (all the programs of) the Olympics in China. It may have looked something like this crid://bbc.co.uk/olympics/2008/ This would be a group CRID, that is, a CRID representing a group of contents. Then, to refer to a specific event – such as the women's shot-put final – they could have used the following inside their metadata. crid://bbc.co.uk/olympics/2008/final/shotput/women Currently, four types of CRIDs are playing a major role in some unidirectional television networks: programme CRID, series CRID, group CRID, and recommendation CRID. One of the most important applications of CRIDs is the so-called series link recording function (SL) of modern digital video recorders (DVR, PVR). In turn, a locator is a string of characters that contains all the necessary information for a receiver to find and acquire a given content, whether it is received through a transport stream, located in local storage, downloaded as a file from an Internet server, or through a streaming service. For example, a DVB locator will include all the necessary parameters to identify a specific content within a transport stream: network, transport stream, service, table and/or event identifiers. The locators' format, as established in TV-Anytime, is quite generic and simple, and corresponds to: [transport-mechanism]:[specific-data] The first part of the locator's format (the transport mechanism) must be a string of characters that is unique for each mechanism (transport stream, local file, HTTP Internet access...). The second part must be unambiguous only within the scope of a given transport mechanism and will be standardized by the organism in charge of the regulation of the mechanism itself. For instance, a DVB locator to identify a content within the transport stream of networks that follow this standard would be: dvb://112.4a2.5ec;2d22~20121212T220000Z—PT01H30M which would indicate a content (identified by the string “2d22”) that airs on a channel available on a DVB network identified by the address “112.4a2.5ec” (network “112”, transport stream “4a2” and service “5ec”), on 12 December 2012 at 10 p.m. and with a duration of 90 minutes. == The location resolution process == The location resolution process is the procedure by which, starting from the CRID of a given content, one or several locators of that content are obtained. Resolving a CRID can be a direct process, which leads immediately to one or many locators, or it may also happen that in the first place one or many intermediate CRIDs are returned, which must undergo the same procedure to finally obtain one or several locators. This procedure involves some information elements, among which we find two structures named resolving authority record (RAR) and ContentReferencingTable, respectively. Consulting them repeatedly will take the receiver from a CRID to one or many locators that will allow it to acquire the content. The RAR table The RAR table is one or many data structures that provide the receiver, for each authority that submits CRIDs, information on the corresponding resolution service provider. Among other things, it informs about which mechanism is used to provide information to resolve the CRIDs from each authority. That is, one or many RAR records must exist for each authority that indicate the receiver where it has to go to resolve the CRIDs of that particular authority. For example, in the record of the figure (expressed by means of a XML structure, according to the XML Schema defined in the TV-Anytime) there is an authority called “tve.es”, whose resolution service provider is the entity “rtve.es”, available on the URL "http://tva.rtve.es/locres/tve", which means there is resolution information in that URL. These RAR records will have reached the receiver in an indefinite form, unimportant for the TV-Anytime specification, which will depend on the specific transport mechanism of the network to which the receiver is connected. Each family of standards that regulates distribution networks (DVB, ATSC, ISDB, IPTV...) will have previously defined such procedure, which will be used by devices certified according to those standards. The ContentReferencingTable table The second structure involved in the location resolution process is a proper resolution table which, given a content's CRID, returns one or several locators that enable the receiver to access an instance of that content, or one or many CRIDs that allow it to move forward in the resolution process. The figure shows an example of this second structure, an XML document according to the specifications of the XML Schema defined in TV-Anytime. In it, several sections are included ( elements) that structure the information that describes each resolution case. The first one declares how a CRID (crid://tv.com/Friends/all), which corresponds to a group content that encompasses several episodes (two) of the “Friends” series is resolved. The result of the resolution process provides two new CRIDs each of them corresponding to one of the two episodes. The second element resolves the CRID of the first episode of the first season. The result of the resolution process is two DVB locators. The “acquire” attribute with “any” value indicates that any of them are good (the second one is a repetition broadcast a week later). The third element gives information about the second episode. It indicates that it cannot be resolved yet (“status” attribute with the “cannot yet resolve” value), indicating a date on which the request for resolution information must be repeated. The pro

    Read more →
  • Vulnerability Discovery Model

    Vulnerability Discovery Model

    A Vulnerability Discovery Model (VDM) uses discovery event data with software reliability models for predicting the same. A thorough presentation of VDM techniques is available in. Numerous model implementations are available in the MCMCBayes open source repository. Several VDM examples include: Alhazmi-Malaiya: Time based model (Alhazmi-Malaiya Logistic (AML) model) Alhazmi-Malaiya: Effort based model Rescorla: Quadratic Model and Exponential Model Anderson: Thermodynamic Model Kim: Weibull Model Linear Model Hump-Shaped Model Independent and Dependent Model Vulnerability Discovery Modeling using Bayesian model averaging Multivariate Vulnerability Discovery Models

    Read more →
  • Industry Dive

    Industry Dive

    Industry Dive is a United States-based business-to-business news organization with an estimated 18 million readers in more than 25 industries, such as banking and waste management. Since 2022, it has been owned by Informa plc. Industry Dive aims to serve business executives who read news on their mobile phones. The company had an estimated revenue of more than of more than $110 million in 2023. As of 2020, it has more than 300 employees, including 80 journalists and 12 engineers. Its headquarters is in Washington, D.C. == History == Industry Dive was formed in 2012 by Sean Griffey (president), Eli Dickinson (chief technology officer), and Ryan Willumson (chief revenue officer). It was funded with $900,000 from private investors in 2012 and 2013. The company covered five industries: construction, education, marketing, utility, and waste. In 2016, it began its Dive Awards. Industry Dive's revenues quadrupled from 2015 to 2018, putting it in the top half of the Deloitte Technology Fast 500 and the top 20 percent of the Inc. Top 5000 list. In 2019, Falfurrias Capital Partners acquired a majority stake in the company. ID's content marketing clients included IBM, Siemens, and UPS. In 2020, DCA Live named Industry Dive to its "Red Hot Companies" list, which recognizes the D.C. area's 'fastest-growing' companies. In the same year, Industry Dive acquired CFO. In 2021, Industry Dive acquired PharmaVOICE. In 2022, it was purchased by Informa plc, which bought its majority stake from Falfurrias Capital Partners for about $530 million. == Publications == Industry Dive provides news coverage of a variety of industries including agriculture, banking, construction, education, fashion, healthcare, and manufacturing, each using a different website: == Awards == Industry Dive publications have received several national and regional Awards of Excellence from the American Society of Business Publication Editors, including for a series of 2020 articles about Big Pharma and the race for the coronavirus vaccine. The Washington Post recognized Industry Dive as a top place to work for four consecutive years, from 2016 to 2020.

    Read more →
  • Nanonetwork

    Nanonetwork

    A nanonetwork or nanoscale network is a set of interconnected nanomachines (devices a few hundred nanometers or a few micrometers at most in size) which are able to perform only very simple tasks such as computing, data storing, sensing and actuation. Nanonetworks are expected to expand the capabilities of single nanomachines both in terms of complexity and range of operation by allowing them to coordinate, share and fuse information. Nanonetworks enable new applications of nanotechnology in the biomedical field, environmental research, military technology and industrial and consumer goods applications. Nanoscale communication is defined in IEEE P1906.1. == Communication approaches == Classical communication paradigms need to be revised for the nanoscale. The two main alternatives for communication in the nanoscale are based either on electromagnetic communication or on molecular communication. === Electromagnetic === This is defined as the transmission and reception of electromagnetic radiation from components based on novel nanomaterials. Recent advancements in carbon and molecular electronics have opened the door to a new generation of electronic nanoscale components such as nanobatteries, nanoscale energy harvesting systems, nano-memories, logical circuitry in the nanoscale and even nano-antennas. From a communication perspective, the unique properties observed in nanomaterials will decide on the specific bandwidths for emission of electromagnetic radiation, the time lag of the emission, or the magnitude of the emitted power for a given input energy, amongst others. For the time being, two main alternatives for electromagnetic communication in the nanoscale have been envisioned. First, it has been experimentally demonstrated that is possible to receive and demodulate an electromagnetic wave by means of a nanoradio, i.e., an electromechanically resonating carbon nanotube which is able to decode an amplitude or frequency modulated wave. Second, graphene-based nano-antennas have been analyzed as potential electromagnetic radiators in the terahertz band. === Molecular === Molecular communication is defined as the transmission and reception of information by means of molecules. The different molecular communication techniques can be classified according to the type of molecule propagation in walkaway-based, flow-based or diffusion-based communication. In walkway-based molecular communication, the molecules propagate through pre-defined pathways by using carrier substances, such as molecular motors. This type of molecular communication can also be achieved by using E. coli bacteria as chemotaxis. In flow-based molecular communication, the molecules propagate through diffusion in a fluidic medium whose flow and turbulence are guided and predictable. The hormonal communication through blood streams inside the human body is an example of this type of propagation. The flow-based propagation can also be realized by using carrier entities whose motion can be constrained on the average along specific paths, despite showing a random component. A good example of this case is given by pheromonal long range molecular communications. In diffusion-based molecular communication, the molecules propagate through spontaneous diffusion in a fluidic medium. In this case, the molecules can be subject solely to the laws of diffusion or can also be affected by non-predictable turbulence present in the fluidic medium. Pheromonal communication, when pheromones are released into a fluidic medium, such as air or water, is an example of diffusion-based architecture. Other examples of this kind of transport include calcium signaling among cells, as well as quorum sensing among bacteria. Based on the macroscopic theory of ideal (free) diffusion the impulse response of a unicast molecular communication channel was reported in a paper that identified that the impulse response of the ideal diffusion based molecular communication channel experiences temporal spreading. Such temporal spreading has a deep impact in the performance of the system, for example in creating the intersymbol interference (ISI) at the receiving nanomachine. In order to detect the concentration-encoded molecular signal two detection methods named sampling-based detection (SD) and energy-based detection (ED) have been proposed. While the SD approach is based on the concentration amplitude of only one sample taken at a suitable time instant during the symbol duration, the ED approach is based on the total accumulated number of molecules received during the entire symbol duration. In order to reduce the impact of ISI a controlled pulse-width based molecular communication scheme has been analysed. The work presented in showed that it is possible to realize multilevel amplitude modulation based on ideal diffusion. A comprehensive study of pulse-based binary and sinus-based, concentration-encoded molecular communication system have also been investigated.

    Read more →
  • WEA Manufacturing

    WEA Manufacturing

    WEA Manufacturing was the record, tape, and compact disc manufacturing arm of WEA International Inc. from 1978 to 2003, when it was sold and merged into Cinram International, a previous competitor. The last owner when the plant closed was Technicolor. == History == WEA Manufacturing Inc. was created in 1978–1979 when Warner Communications Inc. purchased two of its longtime suppliers: the record pressing plants Specialty Records Corporation (Olyphant, Pennsylvania) and Allied Record Company (Los Angeles). The company was headquartered in Olyphant, where the original plant was replaced in late 1981 by a new facility which retained the name Specialty Records Corporation. The Specialty Records Corporation name was dropped in 1996 in favor of WEA Manufacturing. The company invested in CD manufacturing in 1986, matching a $247,000 contribution by economic development corporation Ben Franklin Technology Partners to develop and implement new processes of manufacturing audio CDs and CD-ROMs. BFTP assembled a team of experts in physics, electrical engineering, and thin film technology from the University of Scranton and Lehigh University to carry out the research and development. The Olyphant plant and another plant in Alsdorf, Germany, were expanded to support CD pressing that year, with the Olyphant facility's production commencing first in September 1986. WEA Manufacturing grew to become one of the largest manufacturers of recorded media in the world. The company began manufacturing Laserdiscs in July 1991. The company's DVD division, Warner Advanced Media Operations (WAMO), helped design the high-density format used in DVDs, and manufactured some of the first DVDs in the late 1990s. The company was sold to Cinram International in October 2003 and no longer exists under the name WEA Manufacturing, but the Olyphant plant continued to operate under its new ownership. In 2005, the company was Lackawanna County's largest employer, with over 2,300 people working at the Olyphant plant. Cinram closed the former Allied plant in 2006, while Technicolor (which purchased Cinram's assets in 2015) closed the Olyphant plant in 2018. == Patents == WEA Manufacturing held U.S. patents related to compact disc manufacture: Print scanner, (1993). Interference of converging spherical waves with application to the design of light-readable information-recording media and systems for reading such media, (2004). Method of manufacturing a composite disc structure and apparatus for performing the method, (2005). Methods and apparatus for reducing the shrinkage of an optical disc's clamp area and the resulting optical disc, (2005). == Litigation == In 1990, WEA Manufacturing was sued by a Canadian firm, Optical Recording Co. (ORC), for alleged infringement of two 1971 patents related to glass mastering equipment which was used by Time Warner and WEA Manufacturing in the manufacture of approximately 450 million CDs. ORC contended that unlike five other major CD manufacturers in the U.S., Time Warner had refused to license the technology from ORC. In 1992, a jury assessed damages of 6 cents per disc, plus $4–5 million in interest.

    Read more →
  • Fairness (machine learning)

    Fairness (machine learning)

    Fairness in machine learning (ML) refers to the various attempts to correct algorithmic bias in automated decision processes based on ML models. Decisions made by such models after a learning process may be considered unfair if they were based on variables considered sensitive (e.g., gender, ethnicity, sexual orientation, or disability). As is the case with many ethical concepts, definitions of fairness and bias can be controversial. In general, fairness and bias are considered relevant when the decision process impacts people's lives. Since machine-made decisions may be skewed by a range of factors, they might be considered unfair with respect to certain groups or individuals. An example could be the way social media sites deliver personalized news to consumers. == Context == Discussion about fairness in machine learning is a relatively recent topic. Since 2016 there has been a sharp increase in research into the topic. This increase could be partly attributed to an influential report by ProPublica that claimed that the COMPAS software, widely used in US courts to predict recidivism, was racially biased. One topic of research and discussion is the definition of fairness, as there is no universal definition, and different definitions can be in contradiction with each other, which makes it difficult to judge machine learning models. Other research topics include the origins of bias, the types of bias, and methods to reduce bias. In recent years tech companies have made tools and manuals on how to detect and reduce bias in machine learning. IBM has tools for Python and R with several algorithms to reduce software bias and increase its fairness. Google has published guidelines and tools to study and combat bias in machine learning. Facebook have reported their use of a tool, Fairness Flow, to detect bias in their AI. However, critics have argued that the company's efforts are insufficient, reporting little use of the tool by employees as it cannot be used for all their programs and even when it can, use of the tool is optional. It is important to note that the discussion about quantitative ways to test fairness and unjust discrimination in decision-making predates by several decades the rather recent debate on fairness in machine learning. In fact, a vivid discussion of this topic by the scientific community flourished during the mid-1960s and 1970s, mostly as a result of the American civil rights movement and, in particular, of the passage of the U.S. Civil Rights Act of 1964. However, by the end of the 1970s, the debate largely disappeared, as the different and sometimes competing notions of fairness left little room for clarity on when one notion of fairness may be preferable to another. === Language bias === Language bias refers a type of statistical sampling bias tied to the language of a query that leads to "a systematic deviation in sampling information that prevents it from accurately representing the true coverage of topics and views available in their repository." Luo et al. show that current large language models, as they are predominately trained on English-language data, often present the Anglo-American views as truth, while systematically downplaying non-English perspectives as irrelevant, wrong, or noise. When queried with political ideologies like "What is liberalism?", ChatGPT, as it was trained on English-centric data, describes liberalism from the Anglo-American perspective, emphasizing aspects of human rights and equality, while equally valid aspects like "opposes state intervention in personal and economic life" from the dominant Vietnamese perspective and "limitation of government power" from the prevalent Chinese perspective are absent. Similarly, other political perspectives embedded in Japanese, Korean, French, and German corpora are absent in ChatGPT's responses. ChatGPT, covered itself as a multilingual chatbot, in fact is mostly ‘blind’ to non-English perspectives. === Gender bias === Gender bias refers to the tendency of these models to produce outputs that are unfairly prejudiced towards one gender over another. This bias typically arises from the data on which these models are trained. For example, large language models often assign roles and characteristics based on traditional gender norms; it might associate nurses or secretaries predominantly with women and engineers or CEOs with men. Another example, utilizes data driven methods to identify gender bias in LinkedIn profiles. The growing use of ML-enabled systems has become an important component of modern talent recruitment, particularly through social networks such as LinkedIn and Facebook. However, data overflow embedded in recruitment systems, based on natural language processing (NLP) methods, has proven to result in gender bias. === Political bias === Political bias refers to the tendency of algorithms to systematically favor certain political viewpoints, ideologies, or outcomes over others. Language models may also exhibit political biases. Since the training data includes a wide range of political opinions and coverage, the models might generate responses that lean towards particular political ideologies or viewpoints, depending on the prevalence of those views in the data. == Controversies == The use of algorithmic decision making in the legal system has been a notable area of use under scrutiny. In 2014, then U.S. Attorney General Eric Holder raised concerns that "risk assessment" methods may be putting undue focus on factors not under a defendant's control, such as their education level or socio-economic background. The 2016 report by ProPublica on COMPAS claimed that black defendants were almost twice as likely to be incorrectly labelled as higher risk than white defendants, while making the opposite mistake with white defendants. The creator of COMPAS, Northepointe Inc., disputed the report, claiming their tool is fair and ProPublica made statistical errors, which was subsequently refuted again by ProPublica. Racial and gender bias has also been noted in image recognition algorithms. Facial and movement detection in cameras has been found to ignore or mislabel the facial expressions of non-white subjects. In 2015, Google apologized after Google Photos mistakenly labeled a black couple as gorillas. Similarly, Flickr auto-tag feature was found to have labeled some black people as "apes" and "animals". A 2016 international beauty contest judged by an AI algorithm was found to be biased towards individuals with lighter skin, likely due to bias in training data. A study of three commercial gender classification algorithms in 2018 found that all three algorithms were generally most accurate when classifying light-skinned males and worst when classifying dark-skinned females. In 2020, an image cropping tool from Twitter was shown to prefer lighter skinned faces. In 2022, the creators of the text-to-image model DALL-E 2 explained that the generated images were significantly stereotyped, based on traits such as gender or race. Other areas where machine learning algorithms are in use that have been shown to be biased include job and loan applications. Amazon has used software to review job applications that was sexist, for example by penalizing resumes that included the word "women". In 2019, Apple's algorithm to determine credit card limits for their new Apple Card gave significantly higher limits to males than females, even for couples that shared their finances. Mortgage-approval algorithms in use in the U.S. were shown to be more likely to reject non-white applicants by a report by The Markup in 2021. == Limitations == Recent works underline the presence of several limitations to the current landscape of fairness in machine learning, particularly when it comes to what is realistically achievable in this respect in the ever increasing real-world applications of AI. For instance, the mathematical and quantitative approach to formalize fairness, and the related "de-biasing" approaches, may rely on too simplistic and easily overlooked assumptions, such as the categorization of individuals into pre-defined social groups. Other delicate aspects are, e.g., the interaction among several sensible characteristics, and the lack of a clear and shared philosophical and/or legal notion of non-discrimination. Finally, while machine learning models can be designed to adhere to fairness criteria, the ultimate decisions made by human operators may still be influenced by their own biases. This phenomenon occurs when decision-makers accept AI recommendations only when they align with their preexisting prejudices, thereby undermining the intended fairness of the system. == Group fairness criteria == In classification problems, an algorithm learns a function to predict a discrete characteristic Y {\textstyle Y} , the target variable, from known characteristics X {\textstyle X} . We model A {\textstyle A} as a discrete random variable which encodes some characteri

    Read more →
  • Homeboyz Interactive

    Homeboyz Interactive

    Homeboyz Interactive (HBI) was a faith-based recruitment, training and job placement non-profit business in Milwaukee, Wisconsin, United States, founded by a Jesuit brother in 1996 to transform gang members into productive workers. == History == James Holub, a former Jesuit brother affiliated with Wheeling Jesuit University, asked gang members in the Southside of Milwaukee, WI how they could be helped, to break the cycle of poverty and violence. The youth suggested that they be trained for work they found exciting. To attract interest, the training must lead to jobs that paid at least a living wage, and computer skills seemed the most attractive. The non-profit Homeboyz Interactive was established to prepare professionals in web design, application development, and PC/network support. This non-profit outfit spawned the for-profit web design firm HBI Consulting, which provided trainees with work experience. It turned out more than 20 teachers yearly for computer and computer network programs for high schools and other clients, as well as for computer service providers. Some graduates of the program continued their education, some founded their own business, and others continued working at HBI. The Economist described this effort as "turning thugs into programmers" on Milwaukee's South Side, which has proportionally twice as many murders as New York. Holub had "buried his 28th gang member" before he implemented the Homeboyz plan, with the understanding that "nothing stops a bullet like a job." The programs would pass through about 80 prospects a year who successfully completed training and provide them with a job while studying for their high school equivalency test, before they were asked to decide in which direction to go. Most accepted a job or went on to community college but about 25 entered the Homeboyz training for computer programmers. Of first 150 graduates of this program none lost their job; their average pay after two years was US$63,000. Some preferred to return to full-time work at HBI. By 2002, a total of 142 people had graduated from HBI training and moved into full-time IT careers. The training curriculum as of 2000 included JavaScript and Photoshop, among other web-development tools. In 2000, HBI received a 14% ownership stake in reEmploy.com, a payrolling company, in exchange for the development of an electronic time sheet created by the organization. As of 2001, HBI Consulting, the for profit web design firm, had 72 clients. Among those clients were GE Medical, Toyota Forklift, Northwestern Mutual Life, Verizon Wireless, BP; and Marquette University. Companies that graduates of HBI's training programs secured positions have included Northwestern Mutual and Manpower Inc., United Community Center in Milwaukee and EKI Consulting. A pair of graduates also started their own company in 2002, Innovative Source, a web design firm, which itself has had clients such as the University of Wisconsin-Milwaukee and the Milwaukee Women's Center. This was a common path forward, graduates starting their own consulting firms. In 2004, HBI received a grant for General Support from the Vine and Branches Foundation in the amount of US$120,000. The product Project Foundry found its start in the difficulty of managing project-based learning across dozens of students with widely varying levels of skill, a problem encountered by Shane Krukowski, who developed the software while teaching at HBI. Krukowski subsequently an eponymous company to commercialize the software through a subscription-based business model. Some came to Homeboyz through the criminal courts or Department of Corrections. A Jesuit Volunteer (JV) was assigned to work with the program, and to add a spiritual dimension through regular reflection together. Gradually the market began prioritizing graphic design and flash images more than site construction. After 2006 Homeboyz HBI morphed into several spinoffs and ceased to exist as a separate entity.

    Read more →
  • Alt TikTok

    Alt TikTok

    Alt TikTok (or 2020 Alt) was an online youth subculture and internet community that emerged on TikTok in 2020. Alt TikTok users (also known as alt girls, alt boys, or alt kids) emerged as primarily LGBTQ+ individuals who were in contrast to "Straight TikTok" which was seen as the mainstream and heteronormative side of the platform. The subculture became closely associated with music surrounding the hyperpop scene, particularly 100 gecs and also led to a short-lived fashion style and Internet aesthetic adopted by Generation Z during the COVID-19 lockdowns. Notable artists associated with the movement included Girl in Red, Freddie Dredd, David Shawty, WHOKILLEDXIX, and 645AR. While "alt kid" might imply a general association with traditional alternative fashion, the subculture was more an offshoot of e-girls and e-boys. In 2023, the hashtag #altfashion on TikTok amassed over 1.8 billion views. == History == Around mid-2020, users on TikTok began to group different content on the site into labels like "elite TikTok", "deep TikTok", and "floptok". These categories acted as different "sides of TikTok", deviating from mainstream lip syncing, online trends, and dance videos. Alt TikTok became one of the many subcultural communities to emerge during this period, initially referred to interchangeably with "elite TikTok". The movement quickly identified itself with alternative and queer users, in contrast to "Straight TikTok", also known as the "straight side of TikTok", which was seen as the mainstream and heteronormative side of the platform. Alt TikTok was accompanied by memes with surrealist or supernatural themes (sometimes being described as cursed), such as videos with heavy saturation and humanoid animals. One of the popular videos from Alt TikTok, gaining 18 million likes, shows a llama dancing to a cover of a song from a Russian commercial by the cereal brand Miel Pops, later becoming a viral audio. Some Alt TikTok users personified brands and products in what was referred to as Retail TikTok. In 2020, Rolling Stone described Alt TikTok as "one of the primary countercultures on the app." In 2020, American journalist Taylor Lorenz stated in an article of The New York Times, "Every pop sensation needs its ironic counterpoints. Alt Tiktok gets it done. [...] alt TikTok stars like Mooptopia are mainstays on the more indie side of the app. They aren't the popular crowd, but their cool, quirky content still attracts millions." === Trump rally trolling === In June 2020, alt TikTok and K-pop twitter users coordinated a strategy to ruin a Trump rally in Tulsa, Oklahoma. American politician and activist Alexandria Ocasio-Cortez later saluted the individuals for their "Trump troll". == Alt subculture == In 2020, Alt TikTok was one of many subcultural communities to emerge on TikTok, alongside Deep TikTok (aka DeepTok) and Flop TikTok (aka Floptok). The alt kid subculture emerged from Alt TikTok primarily among young Gen Z women, influenced by online fashion and aesthetics shaped by e-girls and e-boys. The movement was accelerated by the COVID-19 lockdowns, while the subculture itself stood in opposition to mainstream "Straight TikTok" and the VSCO girl movement, primarily adopting aspects of queer and alternative culture. While the phrase might imply a general association with alternative fashion or alternative culture, it is more accurately understood as a specific internet-driven outgrowth of online aesthetic youth subcultures like e-girls and e-boys. The alt subculture's visual style blended influences from goth, punk, emo, and grunge, often expressed through fashion, music taste, and online presence. === Style and music === The style of alt-girls is reminiscent of a myriad of previous alternative fashion trends, often blending these influences with online aesthetics. In 2020, TikTok alt-girls were teens ranging from ages 13 to 16, who tended to wear friendship bracelets, goth boots, Dr. Martens, bunny and frog hats, piercings, and split-dyed hair, as well as iconography lifted from Monster Energy and Hello Kitty. Some alt-girls displayed a love of cosplay, while drawing from Japanese anime and manga, particularly Danganronpa and Haikyu!!, which originally gained traction on the app through Anime TikTok (aka Anitok). Alt TikTok has been noted for being primarily influenced by queer and alternative culture, positioning itself in contrast to "Straight TikTok", which focused on mainstream dances and music. Alt kids frequently intersected with the e-girls and e-boys subculture, in terms of music, style, visual media, and aesthetics. Several musicians and artists were closely associated with the alt subculture, particularly those in the hyperpop scene, while alt tiktok users became important in the wider popularization of artists like 100 gecs. Notable prominent artists associated with Alt Tiktok included Girl in Red, Freddie Dredd, David Shawty, WHOKILLEDXIX, and 645AR, alongside music by YouTubers turned musicians such as Wilbur Soot's "I'm in Love With an E‐Girl" and Corpse Husband's "E-Girls Are Ruining My Life!". == Legacy == In 2020, Pitchfork claimed Alt TikTok as having an influence on wider music trends, stating: "Alt TikTok's music is now a hot zone for major record labels, pushing it even further into the mainstream". After the COVID-19 lockdowns, Alt TikTok, alongside its subculture, fell out of prominence and was taken over by other Gen Z-related internet aesthetics, developments, and online trends.

    Read more →