Adaptive algorithm

Adaptive algorithm

An adaptive algorithm is an algorithm that changes its behavior at the time it is run, based on information available and on a priori defined reward mechanism (or criterion). Such information could be the story of recently received data, information on the available computational resources, or other run-time acquired (or a priori known) information related to the environment in which it operates. Among the most used adaptive algorithms is the Widrow-Hoff’s least mean squares (LMS), which represents a class of stochastic gradient-descent algorithms used in adaptive filtering and machine learning. In adaptive filtering the LMS is used to mimic a desired filter by finding the filter coefficients that relate to producing the least mean square of the error signal (difference between the desired and the actual signal). For example, stable partition, using no additional memory is O(n lg n) but given O(n) memory, it can be O(n) in time. As implemented by the C++ Standard Library, stable_partition is adaptive and so it acquires as much memory as it can get (up to what it would need at most) and applies the algorithm using that available memory. Another example is adaptive sort, whose behavior changes upon the presortedness of its input. An example of an adaptive algorithm in radar systems is the constant false alarm rate (CFAR) detector. In machine learning and optimization, many algorithms are adaptive or have adaptive variants, which usually means that the algorithm parameters such as learning rate are automatically adjusted according to statistics about the optimisation thus far (e.g. the rate of convergence). Examples include adaptive simulated annealing, adaptive coordinate descent, adaptive quadrature, AdaBoost, Adagrad, Adadelta, RMSprop, and Adam. In data compression, adaptive coding algorithms such as Adaptive Huffman coding or Prediction by partial matching can take a stream of data as input, and adapt their compression technique based on the symbols that they have already encountered. In signal processing, the Adaptive Transform Acoustic Coding (ATRAC) codec used in MiniDisc recorders is called "adaptive" because the window length (the size of an audio "chunk") can change according to the nature of the sound being compressed, to try to achieve the best-sounding compression strategy.

AI alignment

In the field of artificial intelligence (AI), alignment aims to steer AI systems toward a person's or group's intended goals, preferences, or ethical principles. An AI system is considered aligned if it advances the intended objectives. A misaligned AI system pursues unintended objectives. It is often difficult for AI designers to specify the full range of desired and undesired behaviors. Therefore, the designers often use simpler proxy goals, such as gaining human approval. But proxy goals can overlook necessary constraints or reward the AI system for merely appearing aligned. AI systems may also find loopholes that allow them to accomplish their proxy goals efficiently but in unintended, sometimes harmful, ways (reward hacking). Advanced AI systems may develop unwanted instrumental strategies, such as seeking power or self-preservation because such strategies help them achieve their assigned final goals. Furthermore, they might develop undesirable emergent goals that could be hard to detect before the system is deployed and encounters new situations and data distributions. Empirical research showed in 2024 that advanced large language models (LLMs) such as OpenAI o1 or Claude 3 sometimes engage in strategic deception to achieve their goals or prevent them from being changed. Some of these issues affect existing commercial systems such as LLMs, robots, autonomous vehicles, and social media recommendation engines. Some AI researchers argue that more capable future systems will be more severely affected because these problems partially result from high capabilities. Many prominent AI researchers and AI company leaders have argued or asserted that AI is approaching human-like (AGI) and superhuman cognitive capabilities (ASI), and could endanger human civilization if misaligned. These include "AI godfathers" Geoffrey Hinton and Yoshua Bengio and the CEOs of OpenAI, Anthropic, and Google DeepMind. These risks remain debated. AI alignment is a subfield of AI safety, the study of how to build safe AI systems. Other subfields of AI safety include robustness, monitoring, and capability control. Research challenges in alignment include instilling complex values in AI, developing honest AI, scalable oversight, auditing and interpreting AI models, and preventing emergent AI behaviors like power-seeking. Alignment research has connections to interpretability research, (adversarial) robustness, anomaly detection, calibrated uncertainty, formal verification, preference learning, safety-critical engineering, game theory, algorithmic fairness, and social sciences. == Objectives in AI == Programmers provide an AI system such as AlphaZero with an "objective function", in which they intend to encapsulate the goal(s) the AI is configured to accomplish. Such a system later populates a (possibly implicit) internal "model" of its environment. This model encapsulates all the agent's beliefs about the world. The AI then creates and executes whatever plan is calculated to maximize the value of its objective function. For example, when AlphaZero is trained on chess, it has a simple objective function of "+1 if AlphaZero wins, −1 if AlphaZero loses". During the game, AlphaZero attempts to execute whatever sequence of moves it judges most likely to attain the maximum value of +1. Similarly, a reinforcement learning system can have a "reward function" that allows the programmers to shape the AI's desired behavior. An evolutionary algorithm's behavior is shaped by a "fitness function". == Alignment problem == In 1960, AI pioneer Norbert Wiener described the AI alignment problem as follows: If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively [...] we had better be quite sure that the purpose put into the machine is the purpose which we really desire. AI alignment refers to ensuring that an AI system's objectives match some target. The target is variously defined as the goals of the system's designers or users, widely shared values, objective ethical standards, legal requirements, or the intentions its designers would have if they were more informed and enlightened. In democratic AI alignment, the target is the values and preferences of median voters, which increases political legitimacy. AI alignment is an open problem for modern AI systems and is a research field within AI. Aligning AI involves two main challenges: carefully specifying the purpose of the system (outer alignment) and ensuring that the system adopts the specification robustly (inner alignment). Researchers also attempt to create AI models that have robust alignment, sticking to safety constraints even when users adversarially try to bypass them. === Specification gaming and side effects === To specify an AI system's purpose, AI designers typically provide an objective function, examples, or feedback to the system. But designers are often unable to completely specify all important values and constraints, so they resort to easy-to-specify proxy goals such as maximizing the approval of human overseers, who are fallible. As a result, AI systems can find loopholes that help them accomplish the specified objective efficiently but in unintended, possibly harmful ways. This tendency is known as specification gaming or reward hacking, and is an instance of Goodhart's law. As AI systems become more capable, they are often able to game their specifications more effectively. Specification gaming has been observed in numerous AI systems. OpenAI GPT models for programming—including in real-world cases—have been found to explicitly plan hacking the tests used to evaluate them to falsely appear successful (e.g., explicitly stating "let's hack"). When the company penalized this, many models learned to obfuscate their plans while continuing to hack the tests. Another system was trained to finish a simulated boat race by rewarding the system for hitting targets along the track, but the system achieved more reward by looping and crashing into the same targets indefinitely. A 2025 Palisade Research study found that when tasked to win at chess against a stronger opponent, some reasoning LLMs attempted to hack the game system, for example by modifying or entirely deleting their opponent. Some alignment researchers aim to help humans detect specification gaming and steer AI systems toward carefully specified objectives that are safe and useful to pursue. When a misaligned AI system is deployed, it can have consequential side effects. Social media platforms have been known to optimize their recommendation algorithms for click-through rates, causing user addiction on a global scale. Stanford researchers say that such recommender systems are misaligned with their users because they "optimize simple engagement metrics rather than a harder-to-measure combination of societal and consumer well-being". Explaining such side effects, Berkeley computer scientist Stuart J. Russell said that the omission of implicit constraints can cause harm: "A system [...] will often set [...] unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable. This is essentially the old story of the genie in the lamp, or the sorcerer's apprentice, or King Midas: you get exactly what you ask for, not what you want." Some researchers suggest that AI designers specify their desired goals by listing forbidden actions or by formalizing ethical rules (as with Asimov's Three Laws of Robotics). But Russell and Norvig argue that this approach overlooks the complexity of human values: "It is certainly very hard, and perhaps impossible, for mere humans to anticipate and rule out in advance all the disastrous ways the machine could choose to achieve a specified objective." Additionally, even if an AI system fully understands human intentions, it may still disregard them, because following human intentions may not be its objective (unless it is already fully aligned). === Pressure to deploy unsafe systems === Commercial organizations sometimes have incentives to take shortcuts on safety and to deploy misaligned or unsafe AI systems. For example, social media recommender systems have been profitable despite creating unwanted addiction and polarization. Competitive pressure can also lead to a race to the bottom on AI safety standards. For example, OpenAI has been sued for releasing a ChatGPT version that encouraged suicide for some unstable users, a behavior the company had overlooked amid a rushed product release. Similarly, in 2018, a self-driving car killed a pedestrian (Elaine Herzberg) after engineers disabled the emergency braking system because it was oversensitive and slowed development. === Risks from advanced misaligned AI === Some researchers are interested in aligning increasingly advanced AI systems, as progress in AI development is rapid, and industry and governments are trying to build advan

Blue check

A blue check is used on social media platforms, notably X (formerly known as Twitter), to indicate the authenticity of an account. Since November 2022, Twitter users whose accounts are at least 90 days old and have a verified phone number receive verification upon subscribing to X Premium or Verified Organizations; this status persists as long as the subscription remains active. When introduced in June 2009, the system provided the site's readers with a means to distinguish genuine notable account holders, such as celebrities and organizations, from impostors or parodies. Until November 2022, a blue checkmark displayed against an account name indicated that Twitter had taken steps to ensure that the account was actually owned by the person or organization whom it claimed to represent. The checkmark does not imply endorsement from Twitter, and does not mean that tweets from a verified account are necessarily accurate or truthful in any way. People with verified accounts on Twitter are often colloquially referred to as "blue checks" on social media and by reporters. In November 2022, the verification program was modified heavily by new owner Elon Musk, extending verification to any account with a verified phone number and an active subscription to an eligible X Premium (formerly Twitter Blue) plan. These changes faced criticism from users and the media, who believed that the changes would ease impersonation, and allow accounts spreading misleading information to feign credibility. In a related change, Twitter introduced additional gold and gray checkmarks, used by Verified Organizations and government-affiliated accounts, respectively. Twitter claims that the changes to verification are required to "reduce fraudulent accounts and bots". Twitter users who had been verified through the previous system were known as "legacy verified" accounts; legacy verification was deprecated in April 2023, and stripped from accounts who do not meet the new payment requirements. Musk later implied that he had been personally paying for the X Premium subscriptions of several notable celebrities. == Until November 2022 == In June 2009, after being criticized by Kanye West and sued by Tony La Russa over unauthorized accounts run by impersonators, the company launched their "Verified Accounts" program. Twitter stated that an account with a "blue tick" verification badge indicates "we've been in contact with the person or entity the account is representing and verified that it is approved". After the beta period, the company stated in their FAQ that it "proactively verifies accounts on an ongoing basis to make it easier for users to find who they're looking for" and that they "do not accept requests for verification from the general public". Originally, Twitter took on the responsibility of reaching out to celebrities and other notable people to confirm their identities in order to establish a verified account. In July 2016, Twitter announced a public application process to grant verified status to an account "if it is determined to be of public interest" and that verification "does not imply an endorsement". In 2016, the company began accepting requests for verification, but it was discontinued the same year. Twitter explained that the volume of requests for verified accounts had exceeded its ability to cope; rather, Twitter determines on its own whom to approach about verified accounts, limiting verification to accounts which are "authentic, notable, and active". In November 2020, Twitter announced a relaunch of its verification system in 2021. According to the new policy, Twitter verifies six different types of accounts; for three of them (companies, brands, and influential individuals like activists), the existence of a Wikipedia page will be one criterion for showing that the account has "Off Twitter Notability". === Controversy === On June 21, 2014, actor William Shatner raised an issue with several Engadget editorial staff and their verification status on Twitter. Besides the site's social media editor, John Colucci, Shatner also targeted several junior members of the staff for being "nobodies", unlike some of his actor colleagues who did not bear such distinction. Shatner claimed Colucci and the team were bullying him when giving a text interview to Mashable. Over a month later, Shatner continued to discuss the issue on his Tumblr page, to which Engadget replied by defending its team and discussing the controversy surrounding the social media verification. Twitter's practice and process for verifying accounts came under scrutiny again in 2017 after the company verified the account of white supremacist and far-right political activist, Jason Kessler. Many who criticized Twitter's decision to verify Kessler's account saw this as a political act on the company's behalf. In response, Twitter put its verification process on hold. The company tweeted, "Verification was meant to authenticate identity & voice but it is interpreted as an endorsement or an indicator of importance. We recognize that we have created this confusion and need to resolve it. We have paused all general verifications while we work and will report back soon." As of November 2017, Twitter continued to deny verification of Julian Assange's account following his requests. In November 2019, Dalit activists of India alleged that higher-caste people get Twitter verification easily and trended hashtags #CancelAllBlueTicksInIndia and #CasteistTwitter. Critics have said that the company's verification process is not transparent and causes digital marginalisation of already marginalised communities. Twitter India rejected the allegations, calling them "impartial" and working on a "case-by-case" policy. == Since November 2022 == On April 20, 2023, Twitter (known as X since July 2023) began removing verification status for users of public interest, causing a controversy among Twitter users. The website's system was altered, allowing any individual to receive verification for a monthly fee, an act which saw significant criticism. Following the acquisition of Twitter by Elon Musk on October 28, 2022, Musk told Twitter employees to introduce paid verification by November 7 through Twitter Blue. The Verge reported that the updated Blue subscription would cost $19.99 per month, and users would lose their verification status if they did not join within 90 days. Following backlash, Musk tweeted, in response to author Stephen King, a lowered $8 price on November 1, 2022. Twitter confirmed the new price of $7.99 per month on November 5, 2022. The new verification system began rollout on November 9, 2022, a day after the 2022 United States elections. The decision to delay its rollout was to address concerns about users potentially spreading misinformation about voting results by posing as news outlets and lawmakers. At the same time, Twitter introduced a secondary gray "Official" label on some high-profile accounts, but removed them hours after launch. Less than 48 hours later, Twitter reinstated the gray "Official" label, after multiple users were suspended for deliberately impersonating reporters and high-profile athletes like LeBron James. A viral tweet from an account purporting to be the pharmaceutical company Eli Lilly and Company caused the company's stock to fall after announcing "insulin is free now". As a result, Twitter disabled new Blue subscriptions on November 11, 2022. === Announcement === In October 2022, Casey Newton of Platformer reported that executives at Twitter began discussing the possibility of users being forced to pay for Twitter Blue in order to keep their verification status. Musk publicly announced that verification was "being revamped right now" after Newton's article; according to The Verge, Twitter planned to increase the price of Twitter Blue from US$4.99 per month to US$19.99 per month. Users would have had 90 days to subscribe or face losing their verification status, and employees were told to implement paid verification by November 9 or risk getting fired. Upon the news that Twitter Blue would cost US$19.99 per month, author Stephen King expressed displeasure towards Twitter and stated that he would leave. Musk, replying to King's tweet, proposed that the service should cost US$7.99 instead. In a separate tweet, Musk wrote that Twitter Blue subscribers would receive priority in replies, mentions, and search, fewer advertisements, and longer audio and video. Although paid verification was expected to be launched by November 7, the reintroduction of Twitter Blue was delayed until after the 2022 United States elections on November 9, according to a memo obtained by The New York Times. The announcement of paid verification resulted in several accounts facetiously impersonating Musk, such as those of comedians Kathy Griffin and Sarah Silverman, being suspended. In response, Musk announced that impersonators using Twitter Blue "will be permanently suspended". An "official

Microformat

Microformats (μF) are predefined HTML markup (like HTML classes) created to serve as descriptive and consistent metadata about elements, designating them as representing a certain type of data (such as contact information, geographic coordinates, events, products, recipes, etc.). They allow software to process the information reliably by having set classes refer to a specific type of data rather than being arbitrary. Microformats emerged around 2005 and were predominantly designed for use by search engines, web syndication and aggregators such as RSS. Google confirmed in 2020 that it still parses microformats for use in content indexing. Microformats are referenced in several W3C social web specifications, including IndieAuth and Webmention. Although the content of web pages has been capable of some "automated processing" since the inception of the web, such processing is difficult because the markup elements used to display information on the web do not describe what the information means. Microformats can bridge this gap by attaching semantics, and thereby obviating other, more complicated, methods of automated processing, such as natural language processing or screen scraping. The use, adoption and processing of microformats enables data items to be indexed, searched for, saved or cross-referenced, so that information can be reused or combined. As of 2013, microformats allow the encoding and extraction of event details, contact information, social relationships and similar information. Microformats2, abbreviated as mf2, is the updated version of microformats. Mf2 provides an easier way of interpreting HTML structured syntax and vocabularies than the earlier ways that made use of RDFa and microdata. == Background == Microformats emerged around 2005 as part of a grassroots movement to make recognizable data items (such as events, contact details or geographical locations) capable of automated processing by software, as well as directly readable by end-users. Link-based microformats emerged first. These include vote links that express opinions of the linked page, which search engines can tally into instant polls. CommerceNet, a nonprofit organization that promotes e-commerce on the Internet, has helped sponsor and promote the technology and support the microformats community in various ways. CommerceNet also helped co-found the Microformats.org community site. Neither CommerceNet nor Microformats.org operates as a standards body. The microformats community functions through an open wiki, a mailing list, and an Internet relay chat (IRC) channel. Most of the existing microformats originated at the Microformats.org wiki and the associated mailing list by a process of gathering examples of web-publishing behaviour, then codifying it. Some other microformats (such as rel=nofollow and unAPI) have been proposed, or developed, elsewhere. == Technical overview == XHTML and HTML standards allow for the embedding and encoding of semantics within the attributes of markup elements. Microformats take advantage of these standards by indicating the presence of metadata using the following attributes: class Classname rel relationship, description of the target address in an anchor-element (...) rev reverse relationship, description of the referenced document (in one case, otherwise deprecated in microformats) For example, in the text "The birds roosted at 52.48, -1.89" is a pair of numbers which may be understood, from their context, to be a set of geographic coordinates. With wrapping in spans (or other HTML elements) with specific class names (in this case geo, latitude and longitude, all part of the geo microformat specification): Software agents can recognize exactly what each value represents and can then perform a variety of tasks such as indexing, locating it on a map and exporting it to a GPS device. === Examples === In this example, the contact information is presented as follows: With hCard microformat markup, that becomes: Here, the formatted name (fn), organisation (org), telephone number (tel) and web address (url) have been identified using specific class names and the whole thing is wrapped in class="vcard", which indicates that the other classes form an hCard (short for "HTML vCard") and are not merely coincidentally named. Other, optional, hCard classes also exist. Software, such as browser plug-ins, can now extract the information, and transfer it to other applications, such as an address book. == Specific microformats == Several microformats have been developed to enable semantic markup of particular types of information. However, only hCard and hCalendar have been ratified, the others remaining as drafts: hAtom (superseded by h-entry and h-feed) – for marking up Atom feeds from within standard HTML hCalendar – for events hCard – for contact information; includes: adr – for postal addresses geo – for geographical coordinates (latitude, longitude) hMedia – for audio/video content hAudio – for audio content hNews – for news content hProduct – for products hRecipe – for recipes and foodstuffs. hReview – for reviews rel-directory – for distributed directory creation and inclusion rel-enclosure – for multimedia attachments to web pages rel-license – specification of copyright license rel-nofollow, an attempt to discourage third-party content spam (e.g. spam in blogs) rel-tag – for decentralized tagging (Folksonomy) XHTML Friends Network (XFN) – for social relationships XOXO – for lists and outlines == Uses == Using microformats within HTML code provides additional formatting and semantic data that applications can use. For example, applications such as web crawlers can collect data about online resources, or desktop applications such as e-mail clients or scheduling software can compile details. The use of microformats can also facilitate "mash ups" such as exporting all of the geographical locations on a web page into (for example) Google Maps to visualize them spatially. Several browser extensions, such as Operator for Firefox and Oomph for Internet Explorer, provide the ability to detect microformats within an HTML document. When hCard or hCalendar are involved, such browser extensions allow microformats to be exported into formats compatible with contact management and calendar utilities, such as Microsoft Outlook. When dealing with geographical coordinates, they allow the location to be sent to applications such as Google Maps. Yahoo! Query Language can be used to extract microformats from web pages. On 12 May 2009 Google announced that they would be parsing the hCard, hReview and hProduct microformats, and using them to populate search result pages. They subsequently extended this in 2010 to use hCalendar for events and hRecipe for cookery recipes. Similarly, microformats are also processed by Bing and Yahoo!. As of late 2010, these are the world's top three search engines. Microsoft said in 2006 that they needed to incorporate microformats into upcoming projects, as did other software companies. Alex Faaborg summarizes the arguments for putting the responsibility for microformat user interfaces in the web browser rather than making more complicated HTML: Only the web browser knows what applications are accessible to the user and what the user's preferences are It lowers the barrier to entry for web site developers if they only need to do the markup and not handle "appearance" or "action" issues Retains backwards compatibility with web browsers that do not support microformats The web browser presents a single point of entry from the web to the user's computer, which simplifies security issues == Evaluation == Various commentators have offered review and discussion on the design principles and practical aspects of microformats. Microformats have been compared to other approaches that seek to serve the same or similar purpose. As of 2007, there had been some criticism of one, or all, microformats. The spread and use of microformats was being advocated as of 2007. Opera Software CTO and CSS creator Håkon Wium Lie said in 2005 "We will also see a bunch of microformats being developed, and that’s how the semantic web will be built, I believe." However, in August 2008 Toby Inkster, author of the "Swignition" (formerly "Cognition") microformat parsing service, pointed out that no new microformat specifications had been published since 2005. === Design principles === Computer scientist and entrepreneur, Rohit Khare stated that reduce, reuse, and recycle is "shorthand for several design principles" that motivated the development and practices behind microformats. These aspects can be summarized as follows: Reduce: favor the simplest solutions and focus attention on specific problems; Reuse: work from experience and favor examples of current practice; Recycle: encourage modularity and the ability to embed, valid XHTML can be reused in blog posts, RSS feeds, and anywhere else you can access the web. === Accessibi

Electronic kit

An electronic kit is a package of electrical components used to build an electronic device. Generally, kits are composed of electronic components, a circuit diagram (schematic), assembly instructions, and often a printed circuit board (PCB) or another type of prototyping board. There are two types of kits. Some build a single device or system. Other types used for education demonstrate a range of circuits. These will include a solderless construction board of some type, such as: Components mounted in plastic blocks with side contacts, that are held together in a base, e.g. Denshi blocks Springs on a card board, the springs trap wire leads, or component leads, such as Philips EE electronic experiment kits. These are a cheap and flexible option Professional type prototyping boards, (breadboards) into which component leads are inserted, following documentation of the "kit". The first type of kit for constructing a single device normally uses a PCB on which components are soldered. They normally come with extended documentation describing which component goes where into the PCB. For advanced hobby projects, sometimes the kit may only consist of a printed circuit board and assembly instructions, and the purchaser may have to source all the parts independently; or, the vendor may provide hard-to-get or pre-programmed parts while expecting the purchaser to obtain the rest of the components. People primarily purchase electronic kits to have fun and learn how things work. They were once popular as a means to reduce the cost of buying goods, but there is usually no cost saving in buying a kit today. Some electronic kits were assembled to make complete complex devices such as color television sets, oscilloscopes, high-end audio amplifiers, amateur radio equipment, electric organs, and even computers such as the Heathkit H-8, and the LNW-80. Many of the early microprocessor computers were sold as either electronic kits or assembled and tested. Heathkit sold millions of electronic kits during its 45-year history. Home assembly of common consumer electronics items no longer provides a cost advantage over commercially manufactured and distributed devices. People still build kits for custom devices and special-purpose electronics for professional and educational use and as a hobby. Also emerging is a trend to simplify the complexity by providing preprogrammed or modular kits often provided by many suppliers online. The fun and thrill of making your own electronics have shifted, in many cases, from easy-to-comprehend applications and analog devices to more sophisticated digital devices. == Examples == The Altair 8800 (the first home computer) was also sold as a kit, as were the MK14, Sinclair ZX80, Sinclair ZX81 and Acorn Atom computers. Many S-100 bus system cards were sold only as kits. Building a Robot kit, most often with a micro controller inside, is now in fashion.

List of large language models

A large language model (LLM) is a type of machine learning model designed for natural language processing tasks such as language generation. LLMs are language models with many parameters, and are trained with self-supervised learning on a vast amount of text. == List == For the training cost column, 1 petaFLOP-day equals 1 petaFLOP/sec × 1 day, or 8.64×1019 FLOP (floating point operations). Only the cost of the largest model is shown. The number of parameters is measured in billions, and the training cost is measured in petaFLOP-days. === 2018 === === 2019 === === 2020 === === 2021 === === 2022 === === 2023 === === 2024 === === 2025 === === 2026 ===

T.38

T.38 is an ITU recommendation for allowing transmission of fax over IP networks (FoIP) in real time. == History == The T.38 fax relay standard was devised in 1998 as a way to transport faxes across IP networks between existing Group 3 (G3) fax terminals. T.4 and related fax standards were published by the ITU in 1980, before the rise of the Internet. In the late 1990s, VoIP, or voice over IP, began to gain ground as an alternative to the conventional public switched telephone network (PSTN). However, because most VoIP systems are optimized (through their use of aggressive lossy bandwidth-saving compression) for voice rather than data calls, conventional fax machines worked poorly or not at all on them due to the network impairments such as delay, jitter, packet loss, and so on. Thus, some way of transmitting fax over IP was needed. == Overview == In practical scenarios, a T.38 fax call has at least part of the call being carried over PSTN, although this is not required by the T.38 definition, and two T.38 devices can send faxes to each other. This particular type of device is called Internet-Aware Fax device, or IAF, and it is capable of initiating or completing a fax call towards the IP network. The typical scenario where T.38 is used is – T.38 fax relay – where a T.30 fax device sends a fax over PSTN to a T.38 fax gateway which converts or encapsulates the T.30 protocol into a T.38 data stream. This is then sent either to a T.38-enabled end point such as fax machine or fax server or another T.38 gateway that converts it back to a PSTN PCM or analog signal and terminates the fax on a T.30 device. The T.38 recommendation defines the use of both TCP and UDP to transport T.38 packets. Implementations tend to use UDP, due to TCP's requirement for acknowledgement packets and resulting retransmission during packet loss, which introduces delays. When using UDP, T.38 copes with packet loss by using redundant data packets. T.38 is not a call setup protocol, thus the T.38 devices need to use standard call setup protocols to negotiate the T.38 call, e.g. H.323, SIP & MGCP. == Operation == There are two primary ways that fax transactions are conveyed across packet networks. The T.37 standard specifies how a fax image is encapsulated in e-mail and transported, ultimately, to the recipient using a store-and-forward process through intermediary entities. T.38, however, defines a protocol that supports the use of the T.30 protocol in both the sender and recipient terminals. (See diagram above.) T.38 lets one transmit a fax across an IP network in real time, just as the original G3 fax standards did for the traditional (time-division multiplexed (TDM)) network, also called the public switched telephone network or PSTN. A special protocol is needed for real-time fax over IP (Internet Protocol) since existing fax terminals only supported PSTN connections, where the information flow was generally smooth and uninterrupted, as opposed to the jittery arrival of IP packets. The trick was to come up with a protocol that makes the IP network “invisible” to the endpoint fax terminals, which would mean the user of a legacy fax terminal need not know that the fax call was traversing an IP network. The network interconnections supported by T.38 are shown above. The two fax terminals on either side of the figure communicate using the T.30 fax protocol published by the ITU in 1980. Interconnection of the PSTN with the IP packet network requires a “gateway” between the PSTN and IP networks. PSTN-IP Gateways support TDM voice on the PSTN side and VoIP and FoIP on the packet side. For voice sessions, the gateway will take in voice packets on the IP side, accumulate a few packets to ensure a smooth flow of TDM data upon their release, and then meter them out over TDM where they eventually are heard by a human or stored on a computer for later playback. The gateway employs packet-management techniques to enhance the quality of the speech in the presence of network errors by taking advantage of the natural ability of a listener to not really hear the occasional missing or repeated packet. But facsimile data are transmitted by modems, which aren't as forgiving as the human ear is for speech. Missing packets will often cause a fax session to fail at worst or create one or more image lines in error at best. So the job of T.38 is to “fool” the terminal into “thinking” that it's communicating directly with another T.30 terminal. It will also correct for network delays with so-called spoofing techniques, and missing or delayed packets with fax-aware buffer-management techniques. Spoofing refers to the logic implemented in the protocol engine of a T.38 relay that modifies the protocol commands and responses on the TDM side to keep network delays on the IP side from causing the transaction to fail. This is done, for example, by padding image lines or deliberately causing a message to be re-transmitted to render network delays transparent to the sending/receiving fax terminals. Networks that do not have packet loss or excessive delay can exhibit acceptable fax performance without T.38, provided the PCM clocks in all gateways are of very high accuracy (explained below). T.38 not only removes the effect of PCM clocks not being synchronized, but also reduces the required network bandwidth by a factor of 10, while it corrects for packet loss and delay. === Bandwidth reduction === As shown in the diagram below, a T.38 gateway is composed of two primary elements: the fax modems and the T.38 subsystem. The fax modems modulate and demodulate the PCM samples of the analog data, turning the sampled-data representation of the fax terminal's analog signal to its binary translation, and vice versa. The PSTN network samples the analog signal of a voice or modem signal (it doesn't know the difference) 8,000 times per second (SPS), and encodes them as 8-bit data bytes. This means 8000 samples-per-second times 8-bits per sample, or 64,000 bits per second (bit/s) to represent the modem (or voice) data in one direction. For both directions the modem transaction consumes 128,000 bits of network bandwidth. However, the typical modem in a fax terminal transmits the image data at 33,600 bit/s, so if the analog data are first converted to the digital content they represent, only 33,600 bits (plus network overhead of a few bytes) are needed. And since T.30 fax is a half-duplex protocol, the network is only needed for one direction at a time. Refer to RFC 3261 === PCM clock synchronization === In the diagram above, there is a sample-rate clock in the fax terminal and one in the gateway's modems that is used to trigger the sampling of the analog line 8,000 times per second. These clocks are usually quite accurate, but in some low-cost terminal adapters (a one or two-line gateway) the PCM clock can be surprisingly inaccurate. If the terminal is sending data to the gateway, and the gateway's clock is too slow, the buffers (jitter buffers) in the gateway will eventually overflow, causing the transaction to fail. Since the difference is often quite small, this problem occurs on long, detailed fax images giving the clocks more time to cause the jitter buffer in gateway to either underflow or overflow, which is just the same as missing or duplicated packets. === Packet loss === T.38 provides facilities to eliminate the effects of packet loss through data redundancy. When a packet is sent, either zero, one, two, three, or even more of the previously sent packets are repeated. (The specification does not impose a limit.) This increases the network bandwidth required (it's still much less than not using T.38) but it allows the receiving gateway to reconstruct the complete packet sequence, even with a fairly high level of packet loss. == Related standards == T.4 is the umbrella specification for fax. It specifies the standard image sizes, two forms of image-data compression (encoding), the image-data format, and references, T.30 and the various modem standards. T.6 specifies a compression scheme that reduces the time required to transmit an image by roughly 50-percent. T.30 specifies the procedures that a sending and receiving terminal use to set up a fax call, determine the image size, encoding, and transfer speed, the demarcation between pages, and the termination of the call. T.30 also references the various modem standards. V.21, V.27ter, V.29, V.17, V.34: ITU modem standards used in facsimile. The first three were ratified prior to 1980, and were specified in the original T.4 and T.30 standards. V.34 was published for fax in 1994. T.37 The ITU standard for sending a fax-image file via e-mail to the intended recipient of a fax. G.711 pass through - this is where the T.30 fax call is carried in a VoIP call encoded as audio. This is sensitive to network packet loss, jitter and clock synchronization. When using voice high-compression encoding techniques such as, but not limited to, G.729, some fax tonal signa