AI Code Visual Studio

AI Code Visual Studio — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Artificial intelligence in fraud detection

    Artificial intelligence in fraud detection

    Artificial intelligence is used by many different businesses and organizations. It is widely used in the financial sector, especially by accounting firms, to help detect fraud. In 2022, PricewaterhouseCoopers reported that fraud has impacted 46% of all businesses in the world. The shift from working in person to working from home has brought increased access to data. According to an FTC (Federal Trade Commission) study from 2022, customers reported fraud of approximately $5.8 billion in 2021, an increase of 70% from the year before. The majority of these scams were imposter scams and online shopping frauds. Furthermore, artificial intelligence plays a crucial role in developing advanced algorithms and machine learning models that enhance fraud detection systems, enabling businesses to stay ahead of evolving fraudulent tactics in an increasingly digital landscape. == Tools == === Expert systems === Expert systems were first designed in the 1970s as an expansion into artificial intelligence technologies. Their design is based on the premise of decreasing potential user error in decision-making and emulating mental reasoning used by experts in a particular field. They differentiate themselves from traditional linear reasoning models by separating identified points in data and processing them individually at the same time. Though, these systems do not rely purely on machine-learned intelligence. Information regarding rules, practices, and procedures in the form of "if-then" statements are implemented into the programming of the system. Users interact with the system by feeding information into the system either through direct entry or import of external data. An inference system compares the information provided by the user with corresponding rules that are believed to specifically apply to the situation. Using this information and the corresponding rules will be used to create a solution to the user's query. Expert systems will generally not operate properly when the common procedures for a specified situation are ambiguous due to the need for well-defined rules. Implementation of expert systems in accounting procedures is feasible in areas where professional judgment is required. Situations where expert systems are applicable include investigations into transactions that involve potential fraudulent entries, instances of going concern, and the evaluation of risk in the planning stages of an audit. === Continuous auditing === Continuous auditing is a set of processes that assess various aspects of information gathered in an audit to classify areas of risk and potential weaknesses in financial Internal controls at a more frequent rate than traditional methods. Instead of analyzing recorded transactions and journal entries periodically, continuous auditing focuses on interpreting the character of these actions more frequently. The frequency of these processes being undertaken as well as highlighting areas of importance is up to the discretion of their implementer, who commonly makes such decisions based on the level of risk in the accounts being evaluated and the goals of implementing the system. Performance of these processes can occur as frequently as being nearly instantaneous with an entry being posted. The processes involved with analyzing financial data in continuous auditing can include the creation of spreadsheets to allow for interactive information gathering, calculation of financial ratios for comparison with previously created models, and detection of errors in entered figures. A primary goal of this practice is to allow for quicker and easier detection of instances of faulty controls, errors, and instances of fraud. === Machine learning and deep learning === The ability of machine learning and deep learning to swiftly and effectively sort through vast volumes of data in the forms of various documents relevant to companies and documents being audited makes them applicable to the domains of audit and fraud detection. Examples of this include recognizing key language in contracts, identifying levels of risk of fraud in transactions, and assessing journal entries for misstatement. == Applications == === 'Big 4' Accounting Firms === Deloitte created an Al-enabled document-reviewing system in 2014. The system automates the method of reviewing and extracting relevant information from different business documents. Deloitte claims that this innovation has made a difference by reducing time spent going through lawful contract documents, invoices, money-related articulations, and board minutes by up to 50%. Working with IBM's Watson, Deloitte is developing cognitive-technology-enhanced commerce arrangements for its clients. LeasePoint is fueled by IBM TRIRIGA (this product evolved into IBM Maximo Real Estate and Facilities) and uses Deloitte's industrial information to create an end-to-end leasing portfolio. Automated Cognitive Resource Assessment employs IBM's Maximo innovation to progress the proficiency of asset inspection. Ernst and Young (EY) connected Al to the investigation of lease contracts. EY (Australia) has also received Al-enabled auditing technology. Collaborating with H20.ai, PwC developed an Al-enabled framework (GL.ai) capable of analyzing reports and preparing reports. PwC claims to have made a significant investment in normal dialect processing (NLP), an Al-enabled innovation to process unstructured information efficiently. KPMG built a portfolio of Al instruments, called KPMG Ignite, to upgrade trade decisions and forms. Working with Microsoft and IBM Watson, KPMG is creating instruments to coordinate Al, data analytics, Cognitive Technologies, and RPA. == Advantages == === Efficiency === The process of auditing an entity in an attempt to detect fraudulent activity requires the repeating of investigatory processes until an error or misstatement may be identified. Under traditional methods, these processes would be carried out by a human being. Proponents of artificial intelligence in fraud detection have stated that these traditional methods are inefficient and can be more quickly accomplished with the aid of an intelligent computing system. A survey of 400 chief executive officers created by KPMG in 2016 found that approximately 58% believed that artificial intelligence would play a key role in making audits more efficient in the future. === Data interpretation === Higher levels of fraud detection entail the use of professional judgement to interpret data. Supporters of artificial intelligence being used in financial audits have claimed that increased risks from instances of higher data interpretation can be minimized through such technologies. One necessary element of an audit of financial statements that requires professional judgement is the implementation of thresholds for materiality. Materiality entails the distinction between errors and transactions in financial statements that would impact decisions made by users of those financial statements. The threshold for materiality in an audit is set by the auditor based on various factors. Artificial intelligence has been used to interpret data and suggest materiality thresholds to be implemented through the use of expert systems. === Decreased costs === Those in favor of using artificial intelligence to complete investigations of fraud have stated that such technologies decrease the amount of time required to complete tasks that are repetitive. The claim further states that such efficiencies allow for lowered resource requirements, which can then be further spent on tasks that have not been fully automated. The audit firm Ernst & Young has posited these claims by declaring that their deep learning systems have been used to reduce time spent on administrative tasks by analyzing relevant audit documents. According to the firm, this has allowed their employees to focus more on judgement and analysis. == Disadvantages == === Job Displacement === The inescapable reception of computer based intelligence and robotization advancements might prompt critical work relocation across different enterprises. As artificial intelligence frameworks become more equipped for performing undertakings customarily completed by people, there is a worry that specific work jobs could become out of date, prompting joblessness and financial imbalance. === Initial investment requirement === Along with a knowledge of coding and building systems through computer programs, we are seeing the advantages of these systems, but since they are so new, they require a large investment to start building such a system. Any firm that is planning on implementing an AI system to detect fraud must hire a team of data scientists, along with upgrading their cloud system and data storage. The system must be consistently monitored and updated to be the most efficient form of itself, otherwise the likelihood of fraud being involved in those transactions increases. If one does not initially invest in such a syst

    Read more →
  • Facebook

    Facebook

    Facebook is an American social networking service owned by the American technology conglomerate Meta Platforms. It was founded in 2004 by Mark Zuckerberg, along with his Harvard College roommates and fellow students Eduardo Saverin, Andrew McCollum, Dustin Moskovitz, and Chris Hughes. The name Facebook derives from the face book directories often given to American university students. The service was initially limited to Harvard students before gradually expanding to other universities in North America. Since 2006, Facebook has permitted registration by individuals aged 13 and older, with the exception of South Korea, Spain, and Quebec, where the minimum age is 14. As of December 2023, Facebook reported approximately 3.07 billion monthly active users worldwide. As of July 2025, it was ranked as the third-most-visited website in the world, with 23 percent of its traffic originating from the United States. It was the most downloaded mobile application of the 2010s. Facebook can be accessed from devices with Internet connectivity, such as personal computers, tablets and smartphones. After registering, users can create a profile revealing personal information about themselves. They can post text, photos and multimedia which are shared either publicly or exclusively with other users who have agreed to be their friend, depending on privacy settings. Users can also communicate directly with each other with Messenger, edit messages (within 15 minutes after sending), join common-interest groups, and receive notifications on the activities of their Facebook friends and the pages they follow. Facebook has often been criticized over issues such as user privacy (as with the Facebook–Cambridge Analytica data scandal), political manipulation (as with the 2016 U.S. elections) and mass surveillance. The company has also been subject to criticism over its psychological effects such as addiction and low self-esteem, and over content such as fake news, conspiracy theories, copyright infringement, and hate speech. Commentators have accused Facebook of willingly facilitating the spread of such content, as well as overemphasizing its number of users to appeal to advertisers. == History == The history of Facebook traces its growth from a college networking site to a global social networking service. While attending Phillips Exeter in the early 2000s, Zuckerberg met Kris Tillery. Tillery, a one-time project collaborator with Zuckerberg, would create a school-based social networking project called Photo Address Book. Photo Address Book was a digital face book, created through a linked database composed of student information derived from the official records of the Exeter Student Council. The database contained linkages such as name, dorm-specific landline numbers, and student headshots. Mark Zuckerberg built a website called "Facemash" in 2003 while attending Harvard University. The site was comparable to Hot or Not and used photos from online face books, asking users to choose the 'hotter' person". Zuckerberg was reported and faced expulsion, but the charges were dropped. A "face book" is a student directory featuring photos and personal information. In January 2004, Zuckerberg coded a new site known as "TheFacebook", stating, "It is clear that the technology needed to create a centralized Website is readily available ... the benefits are many." Zuckerberg met with Harvard student Eduardo Saverin, and each agreed to invest $1,000. On February 4, 2004, Zuckerberg launched "TheFacebook". Membership was initially restricted to students of Harvard College. Dustin Moskovitz, Andrew McCollum, and Chris Hughes joined Zuckerberg to help manage the growth of the site. It became available successively to most universities in the US and Canada. In 2004, Napster co-founder Sean Parker became company president and the company moved to Palo Alto, California. PayPal co-founder Peter Thiel gave Facebook its first investment. In 2005, the company dropped "the" from its name after purchasing the domain name Facebook.com. In 2006, Facebook opened to everyone at least or only 13 years old with a valid email address. Facebook introduced key features like the News Feed, which became central to user engagement. By late 2007, Facebook had 100,000 pages on which companies promoted themselves. Facebook had surpassed MySpace in global traffic and became the world's most popular social media platform. Microsoft announced that it had purchased a 1.6% share of Facebook for $240 million ($373 million in 2025 dollars), giving Facebook an implied value of around $15 billion ($23.3 billion in 2025 dollars). Facebook focused on generating revenue through targeted advertising based on user data, a model that drove its rapid financial growth. In 2012, Facebook went public with one of the largest IPOs in tech history. Acquisitions played a significant role in Facebook's dominance. In 2012, it purchased Instagram, followed by WhatsApp and Oculus VR in 2014, extending its influence beyond social networking into messaging and virtual reality. The Facebook–Cambridge Analytica data scandal in 2018 revealed misuse of user data to influence elections, sparking global outcry and leading to regulatory fines and hearings. Facebook's role in global events, including its use in organizing movements like the Arab Spring and its impact on events like the Rohingya genocide in Myanmar, highlighted its dual nature as a tool for both empowerment and harm. In 2021, Facebook rebranded as Meta, reflecting its shift toward building the "metaverse" and focusing on virtual reality and augmented reality technologies. == Features == Facebook does not officially publish a maximum character limit for posts; however, user posts can be lengthy, with unofficial sources suggesting a high character limit. Posts may also include images and videos. According to Facebook's official business documentation, videos can be up to 240 minutes long and 10 GB in file size, with supported resolutions up to 1080p. Users can "friend" users, both sides must agree to being friends. Posts can be changed to be seen by everyone (public), friends, people in a certain group (group) or by selected friends (private). Users can join groups. Groups are composed of persons with shared interests. For example, they might go to the same sporting club, live in the same suburb, have the same breed of pet or share a hobby. Posts posted in a group can be seen only by those in a group, unless set to public. Users are able to buy, sell, and swap things on Facebook Marketplace or in a Buy, Swap and Sell group. Facebook users may advertise events, which can be offline, on a website other than Facebook, or on Facebook. == Website == === Technical aspects === The site's primary color is blue as Zuckerberg is red–green colorblind, a realization that occurred after a test taken around 2007. Facebook was initially built using PHP, a popular scripting language designed for web development. PHP was used to create dynamic content and manage data on the server side of the Facebook application. Zuckerberg and co-founders chose PHP for its simplicity and ease of use, which allowed them to quickly develop and deploy the initial version of Facebook. As Facebook grew in user base and functionality, the company encountered scalability and performance challenges with PHP. In response, Facebook engineers developed tools and technologies to optimize PHP performance. One of the most significant was the creation of the HipHop Virtual Machine (HHVM). This significantly improved the performance and efficiency of PHP code execution on Facebook's servers. The site upgraded from HTTP to the more secure HTTPS in January 2011. ==== 2012 architecture ==== Facebook is developed as one monolithic application. According to an interview in 2012 with Facebook build engineer Chuck Rossi, Facebook compiles into a 1.5 GB binary blob which is then distributed to the servers using a custom BitTorrent-based release system. Rossi stated that it takes about 15 minutes to build and 15 minutes to release to the servers. The build and release process has zero downtime. Changes to Facebook are rolled out daily. Facebook used a combination platform based on HBase to store data across distributed machines. Using a tailing architecture, events are stored in log files, and the logs are tailed. The system rolls these events up and writes them to storage. The user interface then pulls the data out and displays it to users. Facebook handles requests as AJAX behavior. These requests are written to a log file using Scribe (developed by Facebook). Data is read from these log files using Ptail, an internally built tool to aggregate data from multiple Scribe stores. It tails the log files and pulls data out. Ptail data are separated into three streams and sent to clusters in different data centers (Plugin impression, News feed impressions, Actions (plugin + news feed)). Puma is used to manage periods of high data

    Read more →
  • Battleboarding

    Battleboarding

    Battleboarding, also known as versus debating and "who would win" debating, is an activity that involves discussing and debating around hypothetical fights between individuals; most popularly, fictional characters. These debates are often held in forums, blogs, sites and wikis, known as versus sites or battle boards. Netizens who engage in battleboarding online are often called "battleboarders". The earliest iterations of battleboarding first appeared in various online boards and forums, though its origins can be traced back to magazines, television shows, and comic book letter columns. Eventually, the online activity grew, becoming one of the most popular internet activities today, and spawning many online communities dedicated solely for battleboarding. It soon evolved into its own subculture, and even went on to inspire other media. == History == === Origins === Before the advent of the internet, articles about hypothetical fights were published in magazines. These articles range from topics like sports, comics and anime, such as Black Belt Magazine issue May 1997 which discussed about a hypothetical match between Muhammad Ali and Bruce Lee, and Wizard Magazine #133 which discussed about various hypothetical fights between American comic characters against Japanese anime characters. During that time, many comic book publishers also conceptualized and published "versus" storylines like Batman Versus Predator and Justice League/Avengers. Many films also capitalized on the concept of characters from different franchises fighting each other, such as Frankenstein Meets the Wolf Man (1934), King Kong vs. Godzilla (1962), Freddy vs Jason (2003), and Alien vs. Predator (2004). Another inspiration behind battleboarding were television shows and documentaries whose premise involved hypothetical fights concerning a variety of subjects like zoology, paleontology, and military history. These include shows such as Animal Face-Off (which pitted animals against each other), Deadliest Warrior (which pitted historical warriors, oftentimes from different time periods, against each other), and Jurassic Fight Club (which was about analyzing cases where different types of dinosaurs fought one another). Death Battle, a web series about pitting characters against each other that began in 2010, is a similar show that soon inspired many battleboarding communities and fandoms. Death Battle, as with many other battleboarding series and websites before it, utilised "calcs", which are mathematical equations that try to calculate how strong a character or weapon is. Other popular web series about the subject include Super Power Beat Down and Grudge Match. === Forums and sites === Many internet forums about movies, comics, anime, and video games often held discussions about hypothetical fights between characters from these media. These discussions would be the first iteration of online battleboarding. A notable early battleboarding website was stardestroyer.net (founded 1998), created by Michael Wong. The website focuses in large part on match-ups between the Star Wars and Star Trek franchises, and also includes a forum covering this as well as other more general battleboarding topics, usually related to science fiction and space opera. In addition to the forums, several webpages written by the administrators and contributors were embedded on the site. These attempted to mathematically quantify the capabilities of Star Wars technology and prove their superiority to their Star Trek equivalents, such as Wong's "Star Wars vs Star Trek: Technology Overview" and Brian Young's "Turbolaser Commentaries." stardestroyer.net had a notable impact on early battleboarding culture and also influenced official products. Curtis Saxton, author of several officially-licensed Star Wars technical reference books, thanked Wong, Young, and several other stardestroyer.net contributors by name in the acknowledgements section of Star Wars: Attack of the Clones Incredible Cross-Sections (2002), referring to them as "prominent among the hundreds of people contributing to constructive debates about Star Wars technicalities over the years, resulting in the consensus of conceptual and physical foundations applied in these pages." Saxton's books in the Incredible Cross-Sections series contain specific numbers about the capabilities of Star Wars ships original to these publications and not used in any other official sources. In an interview conducted by TheForce.Net, Saxton claimed to have been offered the job of writing reference books by a DK employee familiar with his "Star Wars Technical Commentaries" webpage (1995–2001), where Saxton attempted to calculate the firepower, speed, and durability of Star Wars spaceships using his background as an astrophysics student. One of the oldest and longest-running battleboarding forum is Comic Vine's "battle forum", whose first post was in 2007. Comic Vine also has one of the largest impacts on battleboarding, creating many common rules and terminologies such as "bloodlusted", "morals are off", "speed equalized", and many others. Another long-running battle forum is a subreddit called r/whowouldwin, where redditors can post and debate fights about real or fictional individuals. Verdicts of these match-ups are often chosen by using evidences of a character's power, weakness, or feat, such as movie clips, comic book panel scans, and excerpts from related literature; all of which are posted and categorized in a separate subreddit called r/respectthreads. Other influential battle forums include Fanverse, where users can post their own calcs about a character's power level. The popularity of battle forums inspired the creation of websites dedicated only for battleboarding. These include The Outskirts Battle Dome, a website that popularized the use of "power levels" in battleboarding; the aforementioned stardestroyer.net; and Space Battles, a website whose forums and threads are filled with posts about hypothetical fights between characters as well as other related topics. Another influential battleboarding site is the now defunct Fact Pile, and its sister site, FactPileTopia. Fact Pile is one of the first battleboarding site that actually listed down and documented winners of their match-ups. The site closed down in 2016 along with its forum, wikia, and YouTube channel. Besides these, blogs about battleboarding were also created, such as dreager1.com. === Wikis === Nowadays, the most popular battleboarding communities can be seen in Fandom, with two of the oldest and most popular being Deadliest Fiction and VS Battles Wiki. Deadliest Fiction is a Deadliest Warrior-inspired fanon created in July 2010 by a group of historians, academics, and pop culture enthusiasts. Being one of the most influential and accurate battleboarding sites around, Deadliest Fiction allows users to create hypothetical match-ups in the form of blogs, where other users can vote and debate around who will win in the comment section. Once a verdict is reached, the site allows the user to create a simulated fanfiction of how the fight would happen. The same year in October, a similar battleboarding site named VS Battles Wiki was created. In the VS Battles Wiki, users can create profiles and power levels of characters, post match-ups in its threads and forums, and list down the winners and losers of these threads in said character profiles. The wiki is considered the most active wiki battleboarding site today, with over 1 million visitors per month. However, throughout the years, the VS Battles Wiki has had its share of controversies, such as alleged inaccuracies in its profiles. There have also been websites and fanfiction wikis inspired by the battleboarding internet show Death Battle. These include the long-running G1 Death Battle Fan Blog, r/deathbattlematchups, and the popular Death Battle Fanon Wiki and DBX Fanon Wiki. Death Battle also released its own dice and card game, complete with rules and effects taken from battleboarding. == Subculture == In its rise in popularity, battleboarding has given birth to a unique online subculture with its own rules, activities, and terminologies. Several of these influences have become present in other online communities and popular media. Some of the common slang and terminologies used in battleboarding subculture includes: Battle Field Removal: Often abbreviated to "BFR", this is a rule that a fight can end if one character is taken out of a battlefield. This rule is used for characters who have the powers to teleport or transport enemies without actually killing them. Battle Royale: A term originating from Comic Vine in which multiple characters are pitted against each other. The name is probably derived from the film Battle Royale or the video game genre of the same name. Bloodlusted: A hypothetical situation wherein the characters are pitted against each other while in a furious, berserker-like state. Calc: These are calculations battl

    Read more →
  • Digital asset

    Digital asset

    A digital asset is anything that exists only in digital form and comes with a distinct usage right or distinct permission for use. Data that do not possess those rights are not considered assets. Digital assets include, but are not limited to: digital documents, audio content, motion pictures, and other relevant digital data currently in circulation or stored on digital appliances, such as personal computers, laptops, portable media players, tablets, data storage devices, and telecommunication devices. This encompasses any apparatus that currently exists or will exist as technology progresses to accommodate the conception of new modalities capable of carrying digital assets. This holds true regardless of the ownership of the physical device on which the digital asset is located. == Types == Types of digital assets include, but are not limited to: software, photography, logos, illustrations, animations, audiovisual media, presentations, spreadsheets, digital paintings, word documents, electronic mails, websites, and various other digital formats with their respective metadata. The number of different types of digital assets is exponentially increasing due to the rising number of devices that leverage these assets, such as smartphones, serving as conduits for digital media. In Intel's presentation at the 'Intel Developer Forum 2013,' they introduced several new types of digital assets related to medicine, education, voting, friendships, conversations, and reputation, among others. == Digital asset management system == A digital asset management (DAM) is an integrated structure that combines software, hardware, and/or other services to manage, store, ingest, organize, and retrieve digital assets. These systems enable users to find and use content when needed. == Digital asset metadata == Metadata is data about other data. Any structured information that defines a specification of any form of data is referred to as metadata. Metadata is also a claimed relationship between two entities, often used to establish connections or associations. Librarian Lorcan Dempsey says "Think of metadata as data which removes from a user (human or machine) the need to have full advance knowledge of the existence or characteristics of things of potential interest in the environment". At first, the term metadata was used for digital data exclusively, but nowadays metadata can apply to both physical and digital data. Catalogs, inventories, registers, and other similar standardized forms of organizing, managing, and retrieving resources contain metadata. Metadata can be stored and contained directly within the file it refers to or independently from it with the help of other forms of data management such as a DAM system. The more metadata is assigned to an asset the easier it gets to categorize it, especially as the amount of information grows. The asset's value rises the more metadata it has for it becomes more accessible, easier to manage, and more complex. Structured metadata can be shared with open protocols like OAI-PMH to allow further aggregation and processing. Open data sources like institutional repositories have thus been aggregated to form large datasets and academic search engines comprising tens of millions of open access works, like BASE, CORE, and Unpaywall. == Issues == Due to a lack of either legislation or legal precedent, there is limited existing governmental control and regulation surrounding digital assets in the United States and other large economies globally. Many of the control issues relating to access and transferability are maintained by individual companies. Some consequences of this include 'What is to become of the assets once their owner is deceased?' as well as can, and, if so, how, may they be inherited. This subject was broached in a bogus story about Bruce Willis allegedly looking to sue Apple as the end user agreement prevented him from bequeathing his iTunes collection to his children. Another case of this was when a soldier died on duty and the family requested access to the Yahoo! account. When Yahoo! refused to grant access, the probate judge ordered them to give the emails to the family but Yahoo! still was not required to give access. The Music Modernization Act was passed in September 2018 by the U.S. Congress to create a new music licensing system, with the aim to help songwriters get paid more.

    Read more →
  • XLNet

    XLNet

    The XLNet was an autoregressive Transformer designed as an improvement over BERT, with 340M parameters and trained on 33 billion words. It was released on 19 June 2019, under the Apache 2.0 license. It achieved state-of-the-art results on a variety of natural language processing tasks, including language modeling, question answering, and natural language inference. == Architecture == The main idea of XLNet is to model language autoregressively like the GPT models, but allow for all possible permutations of a sentence. Concretely, consider the following sentence:My dog is cute.In standard autoregressive language modeling, the model would be tasked with predicting the probability of each word, conditioned on the previous words as its context: We factorize the joint probability of a sequence of words x 1 , … , x T {\displaystyle x_{1},\ldots ,x_{T}} using the chain rule: Pr ( x 1 , … , x T ) = Pr ( x 1 ) Pr ( x 2 | x 1 ) Pr ( x 3 | x 1 , x 2 ) … Pr ( x T | x 1 , … , x T − 1 ) . {\displaystyle \Pr(x_{1},\ldots ,x_{T})=\Pr(x_{1})\Pr(x_{2}|x_{1})\Pr(x_{3}|x_{1},x_{2})\ldots \Pr(x_{T}|x_{1},\ldots ,x_{T-1}).} For example, the sentence "My dog is cute" is factorized as: Pr ( My , dog , is , cute ) = Pr ( My ) Pr ( dog | My ) Pr ( is | My , dog ) Pr ( cute | My , dog , is ) . {\displaystyle \Pr({\text{My}},{\text{dog}},{\text{is}},{\text{cute}})=\Pr({\text{My}})\Pr({\text{dog}}|{\text{My}})\Pr({\text{is}}|{\text{My}},{\text{dog}})\Pr({\text{cute}}|{\text{My}},{\text{dog}},{\text{is}}).} Schematically, we can write it as → My → My dog → My dog is → My dog is cute . {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My }}{\texttt {}}{\texttt {}}{\texttt {}}\to {\text{My dog }}{\texttt {}}{\texttt {}}\to {\text{My dog is }}{\texttt {}}\to {\text{My dog is cute}}.} However, for XLNet, the model is required to predict the words in a randomly generated order. Suppose we have sampled a randomly generated order 3241, then schematically, the model is required to perform the following prediction task: is dog is dog is cute → My dog is cute {\displaystyle {\texttt {}}{\texttt {}}{\texttt {}}{\texttt {}}\to {\texttt {}}{\texttt {}}{\text{is }}{\texttt {}}\to {\texttt {}}{\text{dog is }}{\texttt {}}\to {\texttt {}}{\text{dog is cute}}\to {\text{My dog is cute}}} By considering all permutations, XLNet is able to capture longer-range dependencies and better model the bidirectional context of words. === Two-Stream Self-Attention === To implement permutation language modeling, XLNet uses a two-stream self-attention mechanism. The two streams are: Content stream: This stream encodes the content of each word, as in standard causally masked self-attention. Query stream: This stream encodes the content of each word in the context of what has gone before. In more detail, it is a masked cross-attention mechanism, where the queries are from the query stream, and the key-value pairs are from the content stream. The content stream uses the causal mask M causal = [ 0 − ∞ − ∞ … − ∞ 0 0 − ∞ … − ∞ 0 0 0 … − ∞ ⋮ ⋮ ⋮ ⋱ ⋮ 0 0 0 … 0 ] {\displaystyle M_{\text{causal}}={\begin{bmatrix}0&-\infty &-\infty &\dots &-\infty \\0&0&-\infty &\dots &-\infty \\0&0&0&\dots &-\infty \\\vdots &\vdots &\vdots &\ddots &\vdots \\0&0&0&\dots &0\end{bmatrix}}} permuted by a random permutation matrix to P M causal P − 1 {\displaystyle PM_{\text{causal}}P^{-1}} . The query stream uses the cross-attention mask P ( M causal − ∞ I ) P − 1 {\displaystyle P(M_{\text{causal}}-\infty I)P^{-1}} , where the diagonal is subtracted away specifically to avoid the model "cheating" by looking at the content stream for what the current masked token is. Like the causal masking for GPT models, this two-stream masked architecture allows the model to train on all tokens in one forward pass. == Training == Two models were released: XLNet-Large, cased: 110M parameters, 24-layer, 1024-hidden, 16-heads XLNet-Base, cased: 340M parameters, 12-layer, 768-hidden, 12-heads. It was trained on a dataset that amounted to 32.89 billion tokens after tokenization with SentencePiece. The dataset was composed of BooksCorpus, and English Wikipedia, Giga5, ClueWeb 2012-B, and Common Crawl. It was trained on 512 TPU v3 chips, for 5.5 days. At the end of training, it still under-fitted the data, meaning it could have achieved lower loss with more training. It took 0.5 million steps with an Adam optimizer, linear learning rate decay, and a batch size of 8192.

    Read more →
  • IP Multimedia Subsystem

    IP Multimedia Subsystem

    The IP Multimedia Subsystem or IP Multimedia Core Network Subsystem (IMS) is a standardized architectural framework for delivering IP-based multimedia services. Historically, mobile phones have provided voice call services over a circuit-switched network, rather than over an IP-based packet-switched network. Various VoIP technologies are available on smartphones; IMS offers a standardized protocol across different vendors. IMS was originally designed by the wireless standards body 3rd Generation Partnership Project (3GPP), as a part of the vision for evolving mobile networks beyond GSM. Its original formulation (3GPP Rel-5) represented an approach for delivering Internet services over GPRS. This vision was later updated by 3GPP, 3GPP2 and ETSI TISPAN by requiring support of networks other than GPRS, such as Wireless LAN, CDMA2000 and fixed lines. IMS uses IETF protocols wherever possible, e.g., the Session Initiation Protocol (SIP). According to the 3GPP, IMS is not intended to standardize applications, but rather to aid the access of multimedia and voice applications from wireless and wireline terminals, i.e., to create a form of fixed-mobile convergence (FMC). This is done by having a horizontal control layer that isolates the access network from the service layer. From a logical architecture perspective, services need not have their own control functions, as the control layer is a common horizontal layer. However, in implementation this does not necessarily map into greater reduced cost and complexity. Alternative and overlapping technologies for access and provisioning of services across wired and wireless networks include combinations of Generic Access Network, softswitches and "naked" SIP. Since it is becoming increasingly easier to access content and contacts using mechanisms outside the control of traditional wireless/fixed operators, the interest of IMS is being challenged. Examples of global standards based on IMS are MMTel which is the basis for Voice over LTE (VoLTE), Wi-Fi Calling (VoWIFI), Video over LTE (ViLTE), SMS/MMS over WiFi and LTE, Unstructured Supplementary Service Data (USSD) over LTE, and Rich Communication Services (RCS), which is also known as joyn or Advanced Messaging, and now RCS is operator's implementation. RCS also further added Presence/EAB (enhanced address book) functionality. == History == IMS was defined by an industry forum called 3G.IP, formed in 1999. 3G.IP developed the initial IMS architecture, which was brought to the 3rd Generation Partnership Project (3GPP), as part of their standardization work for 3G mobile phone systems in UMTS networks. It first appeared in Release 5 (evolution from 2G to 3G networks), when SIP-based multimedia was added. Support for the older GSM and GPRS networks was also provided. 3GPP2 (a different organization from 3GPP) based their CDMA2000 Multimedia Domain (MMD) on 3GPP IMS, adding support for CDMA2000. 3GPP release 6 added interworking with WLAN, inter-operability between IMS using different IP-connectivity networks, routing group identities, multiple registration and forking, presence, speech recognition and speech-enabled services (Push to talk). 3GPP release 7 added support for fixed networks by working together with TISPAN release R1.1, the function of AGCF (access gateway control function) and PES (PSTN emulation service) are introduced to the wire-line network for the sake of inheritance of services which can be provided in PSTN network. AGCF works as a bridge interconnecting the IMS networks and the Megaco/H.248 networks. Megaco/H.248 networks offers the possibility to connect terminals of the old legacy networks to the new generation of networks based on IP networks. AGCF acts a SIP User agent towards the IMS and performs the role of P-CSCF. SIP User Agent functionality is included in the AGCF, and not on the customer device but in the network itself. Also added voice call continuity between circuit switching and packet switching domain (VCC), fixed broadband connection to the IMS, interworking with non-IMS networks, policy and charging control (PCC), emergency sessions. It also added SMS over IP. 3GPP release 8 added support for LTE / SAE, multimedia session continuity, enhanced emergency sessions, SMS over SGs and IMS centralized services. 3GPP release 9 added support for IMS emergency calls over GPRS and EPS, enhancements to multimedia telephony, IMS media plane security, enhancements to services centralization and continuity. 3GPP release 10 added support for inter device transfer, enhancements to the single radio voice call continuity (SRVCC), enhancements to IMS emergency sessions. 3GPP release 11 added USSD simulation service, network-provided location information for IMS, SMS submit and delivery without MSISDN in IMS, and overload control. Some operators opposed IMS because it was seen as complex and expensive. In response, a cut-down version of IMS—enough of IMS to support voice and SMS over the LTE network—was defined and standardized in 2010 as Voice over LTE (VoLTE). == Architecture == Each of the functions in the diagram is explained below. The IP multimedia core network subsystem is a collection of different functions, linked by standardized interfaces, which grouped form one IMS administrative network. A function is not a node (hardware box): An implementer is free to combine two functions in one node, or to split a single function into two or more nodes. Each node can also be present multiple times in a single network, for dimensioning, load balancing or organizational issues. === Access network === The user can connect to IMS in various ways, most of which use the standard IP. IMS terminals (such as mobile phones, personal digital assistants (PDAs) and computers) can register directly on IMS, even when they are roaming in another network or country (the visited network). The only requirement is that they can use IP and run SIP user agents. Fixed access (e.g., digital subscriber line (DSL), cable modems, Ethernet, FTTx), mobile access (e.g. 5G NR, LTE, W-CDMA, CDMA2000, GSM, GPRS) and wireless access (e.g., WLAN, WiMAX) are all supported. Other phone systems like plain old telephone service (POTS—the old analogue telephones), H.323 and non IMS-compatible systems, are supported through gateways. === Core network === HSS – Home subscriber server: The home subscriber server (HSS), or user profile server function (UPSF), is a master user database that supports the IMS network entities that actually handle calls. It contains the subscription-related information (subscriber profiles), performs authentication and authorization of the user, and can provide information about the subscriber's location and IP information. It is similar to the GSM home location register (HLR) and Authentication centre (AuC). A subscriber location function (SLF) is needed to map user addresses when multiple HSSs are used. User identities: Various identities may be associated with IMS: IP multimedia private identity (IMPI), IP multimedia public identity (IMPU), globally routable user agent URI (GRUU), wildcarded public user identity. Both IMPI and IMPU are not phone numbers or other series of digits, but uniform resource identifier (URIs), that can be digits (a Tel URI, such as tel:+1-555-123-4567) or alphanumeric identifiers (a SIP URI, such as sip:[email protected] ). IP Multimedia Private Identity: The IP Multimedia Private Identity (IMPI) is a unique permanently allocated global identity assigned by the home network operator. It has the form of a Network Access Identifier(NAI) i.e. user.name@domain, and is used, for example, for Registration, Authorization, Administration, and Accounting purposes. Every IMS user shall have one IMPI. IP Multimedia Public Identity: The IP Multimedia Public Identity (IMPU) is used by any user for requesting communications to other users (e.g. this might be included on a business card). Also known as Address of Record (AOR). There can be multiple IMPU per IMPI. The IMPU can also be shared with another phone, so that both can be reached with the same identity (for example, a single phone-number for an entire family). Globally Routable User Agent URI: Globally Routable User Agent URI (GRUU) is an identity that identifies a unique combination of IMPU and UE instance. There are two types of GRUU: Public-GRUU (P-GRUU) and Temporary GRUU (T-GRUU). P-GRUU reveal the IMPU and are very long lived. T-GRUU do not reveal the IMPU and are valid until the contact is explicitly de-registered or the current registration expires Wildcarded Public User Identity: A wildcarded Public User Identity expresses a set of IMPU grouped together. The HSS subscriber database contains the IMPU, IMPI, IMSI, MSISDN, subscriber service profiles, service triggers, and other information. ==== Call Session Control Function (CSCF) ==== Several roles of SIP servers or proxies, collectively called Call Session Control Function (CSCF), are used to process SIP sign

    Read more →
  • ISO/IEC JTC 1/SC 6

    ISO/IEC JTC 1/SC 6

    ISO/IEC JTC 1/SC 6 Telecommunications and information exchange between systems is a standardization subcommittee of the Joint Technical Committee ISO/IEC JTC 1. It is part of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), which develops and facilitates standards within the field of telecommunications and information exchange between systems. ISO/IEC JTC 1/SC 6 was established in 1964, following the creation of a Special Working Group under ISO/TC 97 on Data Link Control Procedures and Modem Interfaces. The international secretariat of ISO/IEC JTC 1/SC 6 is the Korean Agency for Technology and Standards (KATS), located in South Korea. == Scope == The scope of ISO/IEC JTC 1/SC 6 is “Standardization in the field of telecommunications dealing with the exchange of information between open systems including system functions, procedures, parameters as well as the conditions for their use. The standardization encompasses protocols and services of lower layers, including physical, data link, network, and transport as well as those of upper layers including but not limited to Directory and ASN.1.” Future Network has recently been added as an important work scope. A considerable part of the work is done in effective cooperation with ITU-T and other standardization bodies including IEEE 802 and Ecma International. == Structure == ISO/IEC JTC 1/SC 6 has three active working groups (WGs), each of which carries out specific tasks in standards development within the field of telecommunications and information exchange between systems. The focus of each working group is described in the group’s terms of reference. Working groups can be established if new working areas arise, or disbanded if the group’s working area is no longer relevant to standardization needs. Active working groups of ISO/IEC JTC 1/SC 6 are: == Collaborations == ISO/IEC JTC 1/SC 6 works in close collaboration with a number of other organizations or subcommittees, both internal and external to ISO or IEC. Organizations internal to ISO or IEC that collaborate with or are in liaison with ISO/IEC JTC 1/SC 6 include: ISO/IEC JTC 1/WG 7, Sensor networks ISO/IEC JTC 1/SC 17, Cards and personal identification ISO/IEC JTC 1/SC 25, Interconnection of information technology equipment ISO/IEC JTC 1/SC 27, IT security techniques ISO/IEC JTC 1/SC 29, Coding of audio, picture, multimedia and hypermedia information ISO/IEC JTC 1/SC 31, Automatic identification and data capture techniques ISO/IEC JTC 1/SC 38, Distributed application platforms & services (DAPS) ISO/TC 68, Financial services ISO/TC 122, Packaging ISO/TC 184/SC 5, Interoperability, integration, and architectures for enterprise systems and automation applications ISO/TC 215, Health Informatics IEC/SC 46A, Coaxial cables IEC/SC 46C, Wires and symmetric cables IEC/TC 48, Electrical connectors and mechanical structures for electrical and electronic equipment IEC/SC 48B, Electrical connectors IEC/TC 65, Industrial-process measurement, control and automation IEC/SC 65C, Industrial networks IEC/TC 86, Fibre optics IEC/SC 86C, Fibre optic systems and active devices IEC/TC 93, Design automation Some organizations external to ISO or IEC that collaborate with or are in liaison to ISO/IEC JTC 1/SC 6 include: European Conference of Postal and Telecommunications Administrations (CEPT) European Organization for Nuclear Research (CERN) European Commission (EC) European Telecommunications Standards Institute (ETSI) Ecma International International Civil Aviation Organization (ICAO) IEEE 802 LMSC (LAN/MAN Standards Committee) Internet Society (ISOC) International Telecommunications Satellite Organization (ITSO) ITU-T Organization for the Advancement of Structured Information Standards (OASIS) NFC Forum MFA Forum United Nations Conference on Trade and Development (UNCTAD) United Nations Economic Commission for Europe (UNECE) Universal Postal Union (UPU) World Meteorological Organization (WMO) CEN/TC 247/WG 4 == Member countries == Countries pay a fee to ISO to be members of subcommittees. The 19 "P" (participating) members of ISO/IEC JTC 1/SC 6 are: Austria, Belgium, Canada, China, Czech Republic, Finland, Germany, Greece, Jamaica, Japan, Kazakhstan, Republic of Korea, Netherlands, Russian Federation, Spain, Switzerland, Tunisia, United Kingdom, and United States. The 31 "O" (observing) members of ISO/IEC JTC 1/SC 6 are: Argentina, Bosnia and Herzegovina, Colombia, Cuba, Cyprus, France, Ghana, Hong Kong, Hungary, Iceland, India, Indonesia, Islamic Republic of Iran, Ireland, Italy, Kenya, Luxembourg, Malaysia, Malta, New Zealand, Norway, Philippines, Poland, Romania, Saudi Arabia, Serbia, Singapore, Slovenia, Thailand, Turkey, and Ukraine. == Published standards == There are 365 published standards under the direct responsibility of ISO/IEC JTC 1/SC 6. Published standards by ISO/IEC JTC 1/SC 6 include:

    Read more →
  • Content reference identifier

    Content reference identifier

    A content reference identifier or CRID is a concept from the standardization work done by the TV-Anytime forum. It is or closely matches the concept of the Uniform Resource Locator, or URL, as used on the World-Wide Web: A unit of content, in a broadcast stream, can be referred to by its globally unique CRID in the same way that a webpage can be referred to by its globally unique URL on the web. The concept of CRID permits referencing contents unambiguously, regardless of their location, i.e., without knowing specific broadcast information (time, date and channel) or how to obtain them through a network, for instance, by means of a streaming service or by downloading a file from an Internet server. The receiver must be capable of resolving these unambiguous references, i.e. of translating them into specific data that will allow it to obtain the location of that content in order to acquire it. This makes it possible for recording processes to take place without knowing that information, and even without knowing beforehand the duration of the content to be recorded: a complete series by a simple click, a program that has not been scheduled yet, a set of programs grouped by a specific criterion... This framework allows for the separation between the reference to a given content (the CRID) and the necessary information to acquire it, which is called a “locator”. Each CRID may lead to one or more locators which will represent different copies of the same content. They may be identical copies broadcast in different channels or dates, or cost different prices. They may also be distinct copies with different technical parameters such as format or quality. It may also be the case that the resolution process of a CRID provides another CRID as a result (for example, its reference in a different network, where it has an alternative identifier assigned by a different operator) or a set of CRIDs (for instance, if the original CRID represents a TV series, in which case the resolution process would result in the list of CRIDs representing each episode). From the above it can be concluded that provided that a given content can belong to many groups (each possibly defined by distinctive qualities), it is possible that many CRIDs carry the same content. That is, several CRIDs may be resolved into the same locator. A CRID is not exactly a universal, unique and exclusive identifier for a given content. It is closely related to the authority that creates it, to the resolution service provider, and to the content provider in such a way that the same content may have different CRIDs depending on the field in which they are used (for example, a different one for each television operator that has the rights to broadcast the content). == Format == A CRID is specified much like URLs. In fact, a CRID is a so-called URI. Typically, the content creator, the broadcaster or a third party will use their DNS-names in a combination with a product-specific name to create globally unique CRIDs. That is, the syntax of a CRID is: crid://authority/data The authority field represents the entity that created the CRID and its format is that of a DNS name. The data field represents a string of characters that will unambiguously identify the content within the authority scope (it is a string of characters assigned by the authority itself). As an example, let's assume that BBC wanted to make a CRID for (all the programs of) the Olympics in China. It may have looked something like this crid://bbc.co.uk/olympics/2008/ This would be a group CRID, that is, a CRID representing a group of contents. Then, to refer to a specific event – such as the women's shot-put final – they could have used the following inside their metadata. crid://bbc.co.uk/olympics/2008/final/shotput/women Currently, four types of CRIDs are playing a major role in some unidirectional television networks: programme CRID, series CRID, group CRID, and recommendation CRID. One of the most important applications of CRIDs is the so-called series link recording function (SL) of modern digital video recorders (DVR, PVR). In turn, a locator is a string of characters that contains all the necessary information for a receiver to find and acquire a given content, whether it is received through a transport stream, located in local storage, downloaded as a file from an Internet server, or through a streaming service. For example, a DVB locator will include all the necessary parameters to identify a specific content within a transport stream: network, transport stream, service, table and/or event identifiers. The locators' format, as established in TV-Anytime, is quite generic and simple, and corresponds to: [transport-mechanism]:[specific-data] The first part of the locator's format (the transport mechanism) must be a string of characters that is unique for each mechanism (transport stream, local file, HTTP Internet access...). The second part must be unambiguous only within the scope of a given transport mechanism and will be standardized by the organism in charge of the regulation of the mechanism itself. For instance, a DVB locator to identify a content within the transport stream of networks that follow this standard would be: dvb://112.4a2.5ec;2d22~20121212T220000Z—PT01H30M which would indicate a content (identified by the string “2d22”) that airs on a channel available on a DVB network identified by the address “112.4a2.5ec” (network “112”, transport stream “4a2” and service “5ec”), on 12 December 2012 at 10 p.m. and with a duration of 90 minutes. == The location resolution process == The location resolution process is the procedure by which, starting from the CRID of a given content, one or several locators of that content are obtained. Resolving a CRID can be a direct process, which leads immediately to one or many locators, or it may also happen that in the first place one or many intermediate CRIDs are returned, which must undergo the same procedure to finally obtain one or several locators. This procedure involves some information elements, among which we find two structures named resolving authority record (RAR) and ContentReferencingTable, respectively. Consulting them repeatedly will take the receiver from a CRID to one or many locators that will allow it to acquire the content. The RAR table The RAR table is one or many data structures that provide the receiver, for each authority that submits CRIDs, information on the corresponding resolution service provider. Among other things, it informs about which mechanism is used to provide information to resolve the CRIDs from each authority. That is, one or many RAR records must exist for each authority that indicate the receiver where it has to go to resolve the CRIDs of that particular authority. For example, in the record of the figure (expressed by means of a XML structure, according to the XML Schema defined in the TV-Anytime) there is an authority called “tve.es”, whose resolution service provider is the entity “rtve.es”, available on the URL "http://tva.rtve.es/locres/tve", which means there is resolution information in that URL. These RAR records will have reached the receiver in an indefinite form, unimportant for the TV-Anytime specification, which will depend on the specific transport mechanism of the network to which the receiver is connected. Each family of standards that regulates distribution networks (DVB, ATSC, ISDB, IPTV...) will have previously defined such procedure, which will be used by devices certified according to those standards. The ContentReferencingTable table The second structure involved in the location resolution process is a proper resolution table which, given a content's CRID, returns one or several locators that enable the receiver to access an instance of that content, or one or many CRIDs that allow it to move forward in the resolution process. The figure shows an example of this second structure, an XML document according to the specifications of the XML Schema defined in TV-Anytime. In it, several sections are included ( elements) that structure the information that describes each resolution case. The first one declares how a CRID (crid://tv.com/Friends/all), which corresponds to a group content that encompasses several episodes (two) of the “Friends” series is resolved. The result of the resolution process provides two new CRIDs each of them corresponding to one of the two episodes. The second element resolves the CRID of the first episode of the first season. The result of the resolution process is two DVB locators. The “acquire” attribute with “any” value indicates that any of them are good (the second one is a repetition broadcast a week later). The third element gives information about the second episode. It indicates that it cannot be resolved yet (“status” attribute with the “cannot yet resolve” value), indicating a date on which the request for resolution information must be repeated. The pro

    Read more →
  • Concurrency control

    Concurrency control

    In information technology and computer science, especially in the fields of computer programming, operating systems, multiprocessors, and databases, concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible. Computer systems, both software and hardware, consist of modules, or components. Each component is designed to operate correctly, i.e., to obey or to meet certain consistency rules. When components that operate concurrently interact by messaging or by sharing accessed data (in memory or storage), a certain component's consistency may be violated by another component. The general area of concurrency control provides rules, methods, design methodologies, and theories to maintain the consistency of components operating concurrently while interacting, and thus the consistency and correctness of the whole system. Introducing concurrency control into a system means applying operation constraints which typically result in some performance reduction. Operation consistency and correctness should be achieved with as good as possible efficiency, without reducing performance below reasonable levels. Concurrency control can require significant additional complexity and overhead in a concurrent algorithm compared to the simpler sequential algorithm. For example, a failure in concurrency control can result in data corruption from torn read or write operations. == Concurrency control in databases == Comments: This section is applicable to all transactional systems, i.e., to all systems that use database transactions (atomic transactions; e.g., transactional objects in Systems management and in networks of smartphones which typically implement private, dedicated database systems), not only general-purpose database management systems (DBMSs). DBMSs need to deal also with concurrency control issues not typical just to database transactions but rather to operating systems in general. These issues (e.g., see Concurrency control in operating systems below) are out of the scope of this section. Concurrency control in Database management systems (DBMS; e.g., Bernstein et al. 1987, Weikum and Vossen 2001), other transactional objects, and related distributed applications (e.g., Grid computing and Cloud computing) ensures that database transactions are performed concurrently without violating the data integrity of the respective databases. Thus concurrency control is an essential element for correctness in any system where two database transactions or more, executed with time overlap, can access the same data, e.g., virtually in any general-purpose database system. Consequently, a vast body of related research has been accumulated since database systems emerged in the early 1970s. A well established concurrency control theory for database systems is outlined in the references mentioned above: serializability theory, which allows to effectively design and analyze concurrency control methods and mechanisms. An alternative theory for concurrency control of atomic transactions over abstract data types is presented in (Lynch et al. 1993), and not utilized below. This theory is more refined, complex, with a wider scope, and has been less utilized in the Database literature than the classical theory above. Each theory has its pros and cons, emphasis and insight. To some extent they are complementary, and their merging may be useful. To ensure correctness, a DBMS usually guarantees that only serializable transaction schedules are generated, unless serializability is intentionally relaxed to increase performance, but only in cases where application correctness is not harmed. For maintaining correctness in cases of failed (aborted) transactions (which can always happen for many reasons) schedules also need to have the recoverability (from abort) property. A DBMS also guarantees that no effect of committed transactions is lost, and no effect of aborted (rolled back) transactions remains in the related database. Overall transaction characterization is usually summarized by the ACID rules below. As databases have become distributed, or needed to cooperate in distributed environments (e.g., Federated databases in the early 1990, and Cloud computing currently), the effective distribution of concurrency control mechanisms has received special attention. === Database transaction and the ACID rules === The concept of a database transaction (or atomic transaction) has evolved in order to enable both a well understood database system behavior in a faulty environment where crashes can happen any time, and recovery from a crash to a well understood database state. A database transaction is a unit of work, typically encapsulating a number of operations over a database (e.g., reading a database object, writing, acquiring lock, etc.), an abstraction supported in database and also other systems. Each transaction has well defined boundaries in terms of which program/code executions are included in that transaction (determined by the transaction's programmer via special transaction commands). Every database transaction obeys the following rules (by support in the database system; i.e., a database system is designed to guarantee them for the transactions it runs): Atomicity - Either the effects of all or none of its operations remain ("all or nothing" semantics) when a transaction is completed (committed or aborted respectively). In other words, to the outside world a committed transaction appears (by its effects on the database) to be indivisible (atomic), and an aborted transaction does not affect the database at all. Either all the operations are done or none of them are. Consistency - Every transaction must leave the database in a consistent (correct) state, i.e., maintain the predetermined integrity rules of the database (constraints upon and among the database's objects). A transaction must transform a database from one consistent state to another consistent state (however, it is the responsibility of the transaction's programmer to make sure that the transaction itself is correct, i.e., performs correctly what it intends to perform (from the application's point of view) while the predefined integrity rules are enforced by the DBMS). Thus since a database can be normally changed only by transactions, all the database's states are consistent. Isolation - Transactions cannot interfere with each other (as an end result of their executions). Moreover, usually (depending on concurrency control method) the effects of an incomplete transaction are not even visible to another transaction. Providing isolation is the main goal of concurrency control. Durability - Effects of successful (committed) transactions must persist through crashes (typically by recording the transaction's effects and its commit event in a non-volatile memory). The concept of atomic transaction has been extended during the years to what has become Business transactions which actually implement types of Workflow and are not atomic. However also such enhanced transactions typically utilize atomic transactions as components. === Why is concurrency control needed? === If transactions are executed serially, i.e., sequentially with no overlap in time, no transaction concurrency exists. However, if concurrent transactions with interleaving operations are allowed in an uncontrolled manner, some unexpected, undesirable results may occur, such as: The lost update problem: A second transaction writes a second value of a data-item (datum) on top of a first value written by a first concurrent transaction, and the first value is lost to other transactions running concurrently which need, by their precedence, to read the first value. The transactions that have read the wrong value end with incorrect results. The dirty read problem: Transactions read a value written by a transaction that has been later aborted. This value disappears from the database upon abort, and should not have been read by any transaction ("dirty read"). The reading transactions end with incorrect results. The incorrect summary problem: While one transaction takes a summary over the values of all the instances of a repeated data-item, a second transaction updates some instances of that data-item. The resulting summary does not reflect a correct result for any (usually needed for correctness) precedence order between the two transactions (if one is executed before the other), but rather some random result, depending on the timing of the updates, and whether certain update results have been included in the summary or not. Most high-performance transactional systems need to run transactions concurrently to meet their performance requirements. Thus, without concurrency control such systems can neither provide correct results nor maintain their databases consistently. === Concurrency control mechanisms === ==== Categories ==== The main categories of concurrency control mechanis

    Read more →
  • International World Wide Web Conference Committee

    International World Wide Web Conference Committee

    The International World Wide Web Conference Committee (abbreviated as IW3C2 also written as IW3C2) is a professional non-profit organization registered in Switzerland (Article 60ff of the Swiss Civil Code) that promotes World Wide Web research and development. The IW3C2 organizes and hosts the annual World Wide Web Conference in conjunction with the W3C. The IW3C2 was founded by Joseph Hardin and Robert Cailliau at a meeting held in Boston, United States, on 14 August 1994 to prepare for the upcoming Second International World Wide Web Conference in Chicago. The IW3C2 formally became an incorporated entity in May 1996 at the fifth conference in Paris, France. The organization is governed by laws of the Swiss Confederation and the By-laws. == Abbreviation == The abbreviation for the International World Wide Web Conference Committee as IW3C2 is as follow: I- The I is represents the leading I in International. W3- The W3 represents the three 3 leading W's in World Wide Web. C2- The C2 represents the three 2 leading C's in Conference Committee. == Mission == The mission of the IW3C2 is: To coordinate the organization and planning of the international WWW conference series and ensure that it remains the foremost conference addressing World Wide Web research and development; To promote a collaborative spirit among conference attendees that is essential to the success of the series; To ensure the global geographical diversity of conference sites and provide support to local organizers at those sites; To make sure that all content arising from these conferences and forums is permanently and openly available on the widest possible scale; To preserve the history of the conference series; To encourage the global development of the World Wide Web through collaboration with WWW standards organizations; To provide a permanent, broad-based international body to achieve these purposes. == Conferences == The conferences are organized by the IW3C2 in collaboration with local organizing committees and technical program committees. The series provides an open forum in which all opinions can be presented, subject to a strict process of peer review. The proceedings of the conference are published in the ACM Digital Library. === Endorsed conferences === The IW3C2 has endorsed regional conferences devoted to a special topic of the Web by working with endorsed conferences on cross-promotion, publicity and programs. == Membership == Members of the IW3C2 are ordinary members, ex officio members, non-voting members, and officers. === Ordinary members === Ordinary members are elected for a period of 3 years during a general meeting. Members are nominated due to their recognition in the WWW community and represent themselves. Members can be re-elected only after at least one year of absence. The following are the founding members at the time when IW3C2 was officially incorporated in May 1996: Jean-François Abramatic Tim Berners-Lee Robert Cailliau Dale Dougherty Ira Goldstein Joseph Hardin Tim Krauskopf Detlef Krömker Corinne Moore R. P. Channing Rodgers Albert Vezza Stuart Weibel Yuri Rubinsky (died prior to incorporation) The following are the current (April 2016) ordinary members: Robin Chen Chin-Wan Chung Allan Ellis Wendy Hall - IW3C2 Chair Ivan Herman Arun Iyengar - IW3C2 Vice Chair Irwin King Yoelle Maarek Luc Mariaux - IW3C2 Treasurer Daniel Schwabe - IW3C2 Vice-Chair === Ex officio members === Ex officio members are selected from the immediate past conference general co-chairs and from future conference co-chairs. Their term expires one year after the conference they organized. Ex officio members can be elected as ordinary members. The following are current (April 2016) ex officio members and the conference with which they are affiliated: Jacqueline Bourdeau - WWW2016 James Hendler - WWW2016 Rick Barrett - WWW2017 Rick Cummings - WWW2017 Laurent Flory - WWW2018 Fabien Gandon - WWW2018 === Officers === The IW3C2 officers consist of a chairperson, a vice-chair (chairperson-elect), a secretary, a treasurer, and other appointees. Officers are elected during a general meeting (usually at the annual WWW conference) and serve for one year. They can be re-elected an indefinite number of times. == The Seoul Test of Time Award == This annual award, presented at the WWW conference, is made possible by a generous contribution from the organizers of WWW2014 (Seoul Korea). Recipients are determined by the IW3C2 and honor the author, or authors, of a paper presented at a previous WWW conference that has "stood the test of time." The first award, announced at WWW2015 (Florence Italy), recognized Sergey Brin and Larry Page, the founders of Google. The recipients of the WWW2016 award are LinkIn scientist Dr. Badrul Sarwar and University of Minnesota professors George Karypis, Joseph Konstan, and John Riedl (posthumous) for their work in item-item collaborative filtering.

    Read more →
  • Digital redlining

    Digital redlining

    Digital redlining is the practice of creating and perpetuating inequities between already marginalized groups specifically through the use of digital technologies, digital content, and the internet. The concept of digital redlining is an extension of the practice of redlining in housing discrimination, a historical legal practice in the United States and Canada dating back to the 1930s where red lines were drawn on maps to indicate poor and primarily black neighborhoods that were deemed unsuitable for loans or further development, which created great economic disparities between neighborhoods. The term was popularized by Dr. Chris Gilliard, a privacy scholar, who defines digital redlining as "the creation and maintenance of tech practices, policies, pedagogies, and investment decisions that enforce class boundaries and discriminate against specific groups". Though digital redlining is related to the digital divide and techniques such as weblining and personalization, it is distinct from these concepts as part of larger complex systemic issues. It can refer to practices that create inequities of access to technology services in geographical areas, such as when internet service providers decide to not service specific geographic areas because they are perceived to be not as profitable and thus reduce access to crucial services and civic participation. It can also be used to refer to inequities caused by the policies and practices of digital technologies. For instance, with these methods inequities are accomplished through divisions that are created via algorithms which are hidden from the technology user; the use of big data and analytics allow for a much more nuanced form of discrimination that can target specific vulnerable populations. These algorithmic means are enabled through the use of unregulated data technologies that apply a score to individuals that statistically categorize personality traits or tendencies which are similar to a credit score but are proprietary to the technology companies and not under outside oversight. == Digital redlining and geography == While the roots of redlining lie in excluding populations based on geography, digital redlining occurs in both geographical and non-geographical contexts. An example of both contexts can be found in the charges brought against Facebook on March 28 of 2019, by the United States Department of Housing and Urban Development (HUD). HUD charged Facebook with violating the Fair Housing Act of 1968 by "encouraging, enabling, and causing housing discrimination through the company's advertising platform." HUD stated that Facebook allowed advertisers to “exclude people who live in a specified area from seeing an ad by drawing a red line around that area.” The discrimination called out by HUD included those that were racist, homophobic, ableist, and classist. Besides this example of geographically based digital redlining, HUD also charged that Facebook used profile information and designations to exclude classes of people. The charges stated: "Facebook enabled advertisers to exclude people whom Facebook classified as parents; non-American-born; non-Christian; interested in accessibility; interested in Hispanic culture; or a wide variety of other interests that closely align with the Fair Housing Act’s protected classes" Several media outlets pointed out HUDs own history of housing discrimination through redlining, the establishment of the Fair Housing Act to combat redlining, and how the digital platform was recreating this discriminatory practice. === Digital redlining within a geographical context === Although digital redlining refers to a complex and varied set of practices, it has been most commonly applied to practices with a geographical dimension. Common examples include when an internet service providers decide to not service specific geographic areas because those areas are seen to be not as profitable, resulting in discrimination against low-income communities, with resulting impacts on access to crucial services and civic participation. AT&T has faced specific scrutiny for this form of digital redlining, it has been reported that AT&T has been classist in its offerings of broadband internet service in areas that are more impoverished. Geographically based digital redlining can also apply to digital content or the distribution of goods sold online. Geographically based games such as Pokémon Go have been shown to offer more virtual stops and rewards in geographic areas that are less ethnically and racially diverse. In 2016, Amazon was rebuked for not offering their Prime same-day delivery service to many communities that were largely African American and had incomes that were beneath the national average. Even services such as email can be impacted, with many email administrators creating filters for flagging particular email messages as spam based on the geographical origin of the message. === Digital redlining based on personal identity === Although often aligned with discrimination that falls into a geographically based context digital redlining also refers to when vulnerable populations are targeted for or excluded from specific content or access to the internet in a way that harms them based on some aspect of their identity. Trade schools and community colleges, which typically have a more working class student body, have been found to block public internet content from their students where elite research institutions do not. The use of big data and analytics allow for a much more nuanced form of discrimination that can target specific vulnerable populations. For example, Facebook has been criticized for providing tools that allow advertisers to target ads by ethnic affinity and gender, effectively blocking minorities from seeing specific ads for housing and employment. In October 2019, a major class action lawsuit was filed against Facebook alleging gender and age discrimination in financial advertising. A broad array of consumers can be particularly vulnerable to digital redlining when it is used outside of a geographical context. Besides targeting vulnerable populations based on traditional and legally recognized classifications such as race, gender, age, etc., it has been shown that personal data mined and then resold by brokers can be used to target those who have been identified as suffering from Alzheimer's or dementia, or simply identified as impulse buyers or gullible. == Term distinctions == === Distinctions between weblining and digital redlining === Earlier distinctions have been made between weblining—the process of charging customers different prices based on profile information --- and internet or digital redlining, with digital redlining being focused not on pricing but access. As early as 2002 the Gale Encyclopedia of E-Commerce puts forth the distinction more in use today: weblining is the pervasive and generally accepted (or at least tolerated) practice of personalizing access to products and services in ways invisible to the user; digital redlining is when such personalized, data-driven schemes perpetuate traditional advantages of privileged demographics. As weblining has become more ubiquitous, the term has fallen out of use in favor of the more general term personalization. === Distinctions between the digital divide and digital redlining === Scholars have often drawn connections between the digital divide and digital redlining. In practice, the digital divide is seen as one of a number of impacts of digital redlining, and digital redlining is one of a number of ways in which the divide is maintained or extended. == Criticisms == A 2001 report looked to find if the reason for a gap in access to broadband internet by low-income and minority populations was due to a lack of availability or due to other factors. The report found that there was "little evidence of digital redlining based on income or black or Hispanic concentrations" but that there was mixed evidence of redlining based on areas in which Native American or Asian populations were larger.

    Read more →
  • Hardware security

    Hardware security

    Hardware security is a discipline originated from the cryptographic engineering and involves hardware design, access control, secure multi-party computation, secure key storage, ensuring code authenticity, measures to ensure that the supply chain that built the product is secure among other things. A hardware security module (HSM) is a physical computing device that safeguards and manages digital keys for strong authentication and provides cryptoprocessing. These modules traditionally come in the form of a plug-in card or an external device that attaches directly to a computer or network server. Some providers in this discipline consider that the key difference between hardware security and software security is that hardware security is implemented using "non-Turing-machine" logic (raw combinatorial logic or simple state machines). One approach, referred to as "hardsec", uses FPGAs to implement non-Turing-machine security controls as a way of combining the security of hardware with the flexibility of software. Hardware backdoors are backdoors in hardware. Conceptionally related, a hardware Trojan (HT) is a malicious modification of electronic system, particularly in the context of integrated circuit. A physical unclonable function (PUF) is a physical entity that is embodied in a physical structure and is easy to evaluate but hard to predict. Further, an individual PUF device must be easy to make but practically impossible to duplicate, even given the exact manufacturing process that produced it. In this respect it is the hardware analog of a one-way function. The name "physical unclonable function" might be a little misleading as some PUFs are clonable, and most PUFs are noisy and therefore do not achieve the requirements for a function. Today, PUFs are usually implemented in integrated circuits and are typically used in applications with high security requirements. Many attacks on sensitive data and resources reported by organizations occur from within the organization itself.

    Read more →
  • Human–AI interaction

    Human–AI interaction

    Human–AI interaction is a developing field of research and a sub-field of human–computer interaction (HCI). HCI is a field of research that explores the interactions between humans and computer-based technology, focusing on design implementation, user experience, and psychological factors. With the proliferation of artificial intelligence (AI), there has developed a sub-section of HCI research dedicated specifically to artificial intelligence and how people interact with and are impacted by it. This is human–AI interaction, abbreviated either as HAX or HAII. == Introduction == Artificial intelligence (AI), in general, has fluid definitions and varied research applications, but in brief can be applied to mechanizing tasks that would require human intelligence to complete. AI are tools designed to replicate the human abilities of navigating uncertainty, active learning, and processing information in different contexts. Within the context of HCI and HAX research, artificial intelligence can be broken into two sub-fields, natural language processing (NLP) and computer vision (CV). AI technologies notably include machine-learning, deep-learning and neural networks, and large-language models (LLMs). As a new and rapidly developing technology, AI is changing how computers work and therefore changing how humans interact with computers. Unlike the traditional human-computer interaction, where a human directs a machine, human-AI interaction is characterized by a more collaborative relationship between the computer program (the AI) and the human user, as AI is perceived as an active agent rather than a tool. This changing dynamic creates new questions and necessitates new research methods that are not present in traditional HCI research. According to a scoping review on the state of the discipline, the HAX field comprises research on the "design, development, and evaluation of AI systems" and encompasses the themes of human-AI collaboration, human-AI competition, human-AI conflict, and human-AI symbiosis. == Design == Machine learning and artificial intelligence have been used for decades in targeted advertising and to recommend content in social media. Ethical Guidelines (Framework for ethical AI development) == User Experience (UX) == This section should handle research on how users interact with tools. What techniques do they use, do they develop habits, what types of programs and devices are they using to access these tools, what do they use these tools to do exactly. === Cognitive Frameworks in AI Tool Users === AI has been viewed with various expectations, attributions, and often misconceptions. Many people exclusively understand AI as the LLM chatbots they interact with, like ChatGPT or Claude, or other generative AI programs. [Insert section: discuss how people interact with these specific AI tools as a connection to the following paragraphs] Most fundamentally, humans have a mental model of understanding AI's reasoning and motivation for its decision recommendations, and building a holistic and precise mental model of AI helps people create prompts to receive more valuable responses from AI. However, these mental models are not whole because people can only gain more information about AI through their limited interaction with it; more interaction with AI builds a better mental model that a person may build to produce better prompt outcomes. Research on human-AI interaction has emphasized that users develop mental models of AI systems and revise those models through repeated use, feedback, and explanation, while design research has stressed the importance of communicating capabilities and limitations early and supporting trust calibration through explanation and correction. In a 2025 SSRN working paper, John DeVadoss proposed "Hypothetico-Deductive Interaction" (HDI), a framework that describes human-AI interaction as a mutual process of conjecture and refutation in which users test assumptions about an AI system's capabilities while the system infers and updates assumptions about user goals through its responses and clarifying questions. DeVadoss argued that this framing helps explain prompt iteration, weak capability awareness, and trust miscalibration, and suggested design responses such as clearer communication of uncertainty, easier correction, actionable explanations, and safer failure modes. == Research themes == === Human-AI collaboration === Human-AI collaboration occurs when the human and AI supervise the task on the same level and extent to achieve the same goal. Some collaboration occurs in the form of augmenting human capability. AI may help human ability in analysis and decision-making through providing and weighing a volume of information, and learning to defer to the human decision when it recognizes its unreliability. It is especially beneficial when the human can detect a task that AI can be trusted to make few errors so that there is not a lot of excessive checking process required on the human's end. Some findings show signs of human-AI augmentation, or human–AI symbiosis, in which AI enhances human ability in a way that co-working on a task with AI produces better outcomes than a human working alone. For example: the quality and speed of customer service tasks increase when a human agent collaborates with AI, training on specific models allows AI to improve diagnoses in clinical settings, and AI with human-intervention can improve creativity of artwork while fully AI-generated haikus were rated negatively. Human-AI synergy, a concept in which human-AI collaboration would produce more optimal outcomes than either human or AI working alone could explain why AI does not always help with performance. Some AI features and development may accelerate human-AI synergy, while others may stagnate it. For example, when AI updates for better performance, it sometimes worsens the team performance with human and AI by reducing the compatibility with the new model and the mental model a user has developed on the previous version. Research has found that AI often supports human capabilities in the form of human-AI augmentation and not human-AI synergy, potentially because people rely too much on AI and stop thinking on their own. Prompting people to actively engage in analysis and think when to follow AI recommendations reduces their over-reliance, especially for individuals with higher need for cognition. === Human-AI competition === Robots and computers have substituted routine tasks historically completed by humans, but agentic AI has made it possible to also replace cognitive tasks including taking phone calls for appointments and driving a car. At the point of 2016, research has estimated that 45% of paid activities could be replaced by AI by 2030. Perceived autonomy of robots is known to increase people's negative attitude toward them, and worry about the technology taking over leads people to reject it. There has been a consistent tendency of algorithm aversion in which people prefer human advice over AI advice. However, people are not always able to tell apart tasks completed by AI or other humans. See AI takeover for more information. It is also notable that this sentiment is more prominent in the Western cultures as Westerners tend to show less positive views about AI compared to East Asians. == Research on the psychological impacts of AI == === Perception on others who use AI === As much as people perceive and make judgment about AI itself, they also form impressions of themselves and others who use AI. In the workplace, employees who disclose the use of AI in their tasks are more likely to receive feedback that they are not as hardworking as those who are in the same job who receive non-AI help to complete the same tasks. AI use disclosure diminishes the perceived legitimacy in the employee's task and decision making which ultimately leads observers to distrust people who use AI. Although these negative effects of AI use disclosure are weakened by the observers who use AI frequently themselves, the effect is still not attenuated by the observers' positive attitude towards AI. === Bias, AI, and human === Although AI provides a wide range of information and suggestions to its users, AI itself is not free of biases and stereotypes, and it does not always help people reduce their cognitive errors and biases. People are prone to such errors by failing to see other potential ideas and cases that are not listed by AI responses and committing to a decision suggested by AI that directly contradicts the correct information and directions that they are already aware of. Gender bias is also reflected as the female gendering of AI technologies which conceptualizes females as a helpful assistant. == Emotional connection with AI == Human-AI interaction has been theorized in the context of interpersonal relationships mainly in social psychology, communications and media studies, and as a technology interface through the lens of hu

    Read more →
  • Institute of Telecommunications Professionals

    Institute of Telecommunications Professionals

    The Institute of Telecommunications Professionals (ITP) is a membership organisation for professionals in the telecommunications industry, based in the United Kingdom. The Institute was originally founded in 1906. It is now a registered company with Companies House in the United Kingdom, incorporated in 2002. Brendan O' Mahony has been the chief executive of the ITP. Lucy Woods presided over ITP for fifteen years, until 2018, when the organization named Kevin Paige chairman for five years. In 2022 the ITP appointed its new CEO, Charlotte Goodwill. In 2021, the ITP assisted a UK fibre network Vorboss in establishing its training academy. In 2023, the ITP appointed Tim Creswick, the CEO of Vorboss, as the new chair of its board of directors. The institute has an associated journal, the Journal of the Institute of Telecommunications Professionals, established in 2007 and published quarterly.

    Read more →
  • Web performance

    Web performance

    Web performance refers to the speed in which web pages are downloaded and displayed on the user's web browser. Web performance optimization (WPO), or website optimization is the field of knowledge about increasing web performance. Faster website download speeds have been shown to increase visitor retention and loyalty and user satisfaction, especially for users with slow internet connections and those on mobile devices. Web performance also leads to less data travelling across the web, which in turn lowers a website's power consumption and environmental impact. Some aspects which can affect the speed of page load include browser/server cache, image optimization, and encryption (for example SSL), which can affect the time it takes for pages to render. The performance of the web page can be improved through techniques such as multi-layered cache, light weight design of presentation layer components and asynchronous communication with server side components. == History == In the first decade or so of the web's existence, web performance improvement was focused mainly on optimizing website code and pushing hardware limitations. According to the 2002 book Web Performance Tuning by Patrick Killelea, some of the early techniques used were to use simple servlets or CGI, increase server memory, and look for packet loss and retransmission. Although these principles now comprise much of the optimized foundation of internet applications, they differ from current optimization theory in that there was much less of an attempt to improve the browser display speed. Steve Souders coined the term "web performance optimization" in 2004. At that time Souders made several predictions regarding the impact that WPO as an "emerging industry" would bring to the web, such as websites being fast by default, consolidation, web standards for performance, environmental impacts of optimization, and speed as a differentiator. One major point that Souders made in 2007 is that at least 80% of the time that it takes to download and view a website is controlled by the front-end structure. This lag time can be decreased through awareness of typical browser behavior, as well as of how HTTP works. == Optimization techniques == Web performance optimization improves user experience (UX) when visiting a website and therefore is highly desired by web designers and web developers. They employ several techniques that streamline web optimization tasks to decrease web page load times. This process is known as front end optimization (FEO) or content optimization. FEO concentrates on reducing file sizes and "minimizing the number of requests needed for a given page to load." In addition to the techniques listed below, the use of a content delivery network—a group of proxy servers spread across various locations around the globe—is an efficient delivery system that chooses a server for a specific user based on network proximity. Typically the server with the quickest response time is selected. The following techniques are commonly used web optimization tasks and are widely used by web developers: Web browsers open separate Transmission Control Protocol (TCP) connections for each Hypertext Transfer Protocol (HTTP) request submitted when downloading a web page. These requests total the number of page elements required for download. However, a browser is limited to opening only a certain number of simultaneous connections to a single host. To prevent bottlenecks, the number of individual page elements are reduced using resource consolidation whereby smaller files (such as images) are bundled together into one file. This reduces HTTP requests and the number of "round trips" required to load a web page. Web pages are constructed from code files such JavaScript and Hypertext Markup Language (HTML). As web pages grow in complexity, so do their code files and subsequently their load times. File compression can reduce code files by about 40 percent, thereby improving site responsiveness. Web Caching Optimization reduces server load, bandwidth usage and latency. CDNs use dedicated web caching software to store copies of documents passing through their system. Many website platforms, such as SiteGround, IONOS, Wix, and Hostinger, rely on global CDNs and caching technologies to deliver faster page loads across different geographical regions. Subsequent requests from the cache may be fulfilled should certain conditions apply. Web caches are located on either the client side (forward position) or web-server side (reverse position) of a CDN. Web browsers are also able to store content for re-use through the HTTP cache or web cache. Requests web browsers make are typically routed to the HTTP cache to validate if a cached response may be used to fulfill a request. If such a match is made, the response is fulfilled from the cache. This can be helpful for reducing network latency and costs associated with data-transfer. The HTTP cache is configured using request and response headers. Code minification distinguishes discrepancies between codes written by web developers and how network elements interpret code. Minification removes comments and extra spaces as well as crunch variable names in order to minimize code, decreasing files sizes by as much as 60%. In addition to caching and compression, lossy compression techniques (similar to those used with audio files) remove non-essential header information and lower original image quality on many high resolution images. These changes, such as pixel complexity or color gradations, are transparent to the end-user and do not noticeably affect perception of the image. Another technique is the replacement of raster graphics with resolution-independent vector graphics. Vector substitution is best suited for simple geometric images. Lazy loading of images and video reduces initial page load time, initial page weight, and system resource usage, all of which have positive impacts on website performance. It is used to defer initialization of an object right until the point at which it is needed. The browser loads the images in a page or post when they are needed such as when the user scrolls down the page and not all images at once, which is the default behavior, and naturally, takes more time. == HTTP/1.x and HTTP/2 == Since web browsers use multiple TCP connections for parallel user requests, congestion and browser monopolization of network resources may occur. Because HTTP/1 requests come with associated overhead, web performance is impacted by limited bandwidth and increased usage. Compared to HTTP/1, HTTP/2 is binary instead of textual is fully multiplexed instead of ordered and blocked can therefore use one connection for parallelism uses header compression to reduce overhead allows servers to "push" responses proactively into client caches Instead of a website's hosting server, CDNs are used in tandem with HTTP/2 in order to better serve the end-user with web resources such as images, JavaScript files and Cascading Style Sheet (CSS) files since a CDN's location is usually in closer proximity to the end-user. == Metrics == In recent years, several metrics have been introduced that help developers measure various aspects of the performance of their websites. In 2019, Google introduced metrics such as Time to First Byte (TTFB), First Contentful Paint (FCP), First Paint (FP), First Input Delay (FID), Cumulative Layout Shift (CLS) and Largest Contentful Paint (LCP) allow for website owner to gain insights into issues that might hurt the performance of their websites making it seem sluggish or slow to the user. Other metrics including Request Count (number of requests required to load a page), DOMContentLoaded (time when HTML document is completely loaded and parsed excluding CSS style sheets, images, etc.), Above The Fold Time (content that is visible without scrolling), Round Trip Time, number of Render Blocking Resources (such as scripts, stylesheets), Onload Time, Connection Time, Total Page Size help provide an accurate picture of latencies and slowdowns occurring at the networking level which might slow down a site. Modules to measure metrics such as TTFB, FCP, LCP, FP etc are provided with major frontend JavaScript libraries such as React, NuxtJS and Vue. Google publishes a library, the core-web-vitals library that allows for easy measurement of these metrics in frontend applications. In addition to this, Google also provides the Lighthouse, a Chrome dev-tools component and PageSpeed Insight a site that allows developers to measure and compare the performance of their website with Google's recommended minimums and maximums. In addition to this, tools such as the Network Monitor by Mozilla Firefox help provide insight into network-level slowdowns that might occur during transmission of data.

    Read more →