Over-the-top media services in India

As per Govt of India, there are currently about 57 providers of over-the-top media services (OTT) in India, which distribute streaming media or video on demand over the Internet. == History and growth == The first dependent Indian OTT platform was BIGFlix, launched by Reliance Entertainment in 2008. In 2010 Digivive launched India's first OTT mobile app called nexGTv, which provides access to both live TV and on–demand content. nexGTV was the first app to live–stream Indian Premier League matches on smart phones and did so during 2013 and 2014. The livestream of the IPL since 2015, when rights were won, played an important role in the growth of another OTT platform, Hotstar (now JioHotstar) in India. OTT Platforms gained significant momentum in India when both DittoTV (Zee) and Sony Liv were launched in the Indian market around 2013. Following the initial push of Regional OTT platforms like Aha, Hoichoi, Sun NXT, Planet Marathi, Chaupal & MX Player. The Indian OTT industry saw rapid transformation with the entry of global OTT companies such as Netflix and Amazon Prime Video into the Indian market in 2016. Replacement of this competition with global enterprises caused local rivals to innovate in both region and hyper-regional content. === Hotstar === Hotstar (now JioHotstar) is the most subscribed–to OTT platform in India, owned by JioStar as of February 2025, with around 500 million active users and over 650 million downloads. According to Hotstar's India Watch Report 2018, 96% of watch time on Hotstar comes from videos longer than 20 minutes, while one–third of Hotstar subscribers watch television shows. In 2019, Hotstar began investing ₹120 crore in generating original content such as "Hotstar Specials." 80% of the viewership on Hotstar comes from drama, movies and sports programs. Hotstar has the exclusive streaming rights of IPL in India. === Netflix === American streaming service Netflix entered India in January 2016. In April 2017, it was registered as a limited liability partnership (LLP) and started commissioning content. It earned a net profit of ₹2020,000 (₹2.02 million) for fiscal year 2017. In fiscal year 2018, Netflix earned revenues of ₹580 million. According to Morgan Stanley Research, Netflix had the highest average watch time of more than 120 minutes but viewer counts of around 20 million in July 2018. As of 2018, Netflix has six million subscribers, of which 5–6% are paid members. India was not affected by Netflix's July 2018 increase in subscription rates for the US and Latin America. Netflix has stated its intent to invest ₹600 crore in the production of Indian original programming. In late 2018, Netflix bought 150,000 square feet (14,000 m2) of office space in Bandra–Kurla Complex (BKC) in Mumbai as their head office. As of December 2018, Netflix has more than 40 employees in India. === Other OTT providers === Sun NXT is an Indian video on demand service run by Sun TV Network. It was launched in June 2017, streaming in the Tamil language and six other languages. The platform has more than 4,000 Tamil movies and 200 Tamil shows, as well as regional movies and shows. Sun NXT also streams a large library of its own Sun TV shows and movies. Amazon Prime Video was launched in 2016. The platform has 2,300 titles available including 2,000 movies and about 400 shows. It has announced that it will invest ₹20 billion in creating original content in India. Besides English, Prime Video is available in six Indian languages as of December 2018. Amazon India launched Amazon Prime Music in February 2018. Eros Now, an OTT platform launched by Eros International, has the most content among the OTT providers in India, including over 12,000 films, 100,000 music tracks and albums, and 100 TV shows. Eros Now was named the Best OTT Platform of the Year 2019 at the British Asian Media Awards. It has 211.5 million registered users and 36.2 million paying subscribers as of September 2020. In February 2020, Aha OTT platform was launched, broadcasting exclusively Telugu content. In 2021, Planet Marathi became the first OTT platform dedicated to Marathi content in India, including web-series, films, music, theater, fiction and non-fiction reality shows. It is available for both Android and iOS mobile devices along with Android TV and Amazon Fire TV devices. Bollywood actress Madhuri Dixit helped launch the platform. With rising interest for Korean dramas, Rakuten Viki saw its biggest jump of web traffic from India in 2020 due to the COVID-19 lockdown, which led to ad localization on the platform. The OTT market in fiscal year 2020 was estimated to be worth $1.7 billion. === SonyLIV and ZEE5 === In December 2021, Sony and Zee announced their merger, and announced plans to merge their OTT platforms. The merger was called off. === OTT services launched as Amazon Prime video channels === The list is by alphabetical order, not by rank or popularity. == Content regulation == Due to the absence of any rules and regulation regarding OTT content, many OTT providers were accused of showing nudity, vulgarity and obscenity and hurting Hindu religious sentiments in their shows. Series which were the focus of controversy include Four More Shots Please!, Tandav, Paatal Lok, Sacred Games, Mirzapur Lust stories franchise, Rana Naidu. Thank You for Coming, and Annapoorani (2023). According to media reports, between 2018 and 2024, some OTT platforms emerged which started showing porn in the form of web series. Both the Supreme Court and Delhi High Court say that OTT regulation is necessary. === OTT regulation === On 25 Feb 2021, Indian govt introduced self-regulation rules for OTT platforms to stop obscene content and abusive language. On 19 March 2023, I&B minister Anurag Thakur said that self regulation does not mean that OTT should show obscenity and nudity. On 15 April 2023, I&B Secretary Apurva Chandra has said because of the government's soft-touch regulations on OTT industry have led to the creation of content that is undesirable and vulgar. On 26 April 2023, MIB India said that if nudity and obscenity is seen on any OTT platform, strict action will be taken against it. On 16 May 2023, Don't show obscene content, parliamentary panel told to Netflix and Amazon Prime Video. On 20 June 2023, the government told Netflix, Disney+ Hotstar and all other streaming services that their content should be independently reviewed for obscenity and violence before being shown online. On 27 June 2023, DPCGC took punitive action against Ullu for streaming obscene content and asked them to remove all their explicit shows or remove all adult scenes within 15 days. On 18 July 2023, Anarug Thakur said in a meeting with all OTT stakeholders that demeaning Indian culture will not be tolerated. OTT can't show vulgarity and nudity in the garb of 'creative expression'.The cited sources do not mention vulgarity - they say this was about demeaning Indian culture/society. On 22 August 2023, Indian government assured that it will bring rules and regulation to regulate vulgar and obscene content on social media and OTT platforms. On 10 November 2023, MIB India introduces the 'Broadcasting Service Regulation Bill', which included Programme code with Content Evaluation Committee(CEC) for every OTT platforms. Currently public consultation is ongoing till 15 January 2024. The draft bill mandates that all OTT streaming platforms can only broadcast those web series or content, which will be duly certified by Content Evaluation Committee(CEC). On 14 March 2024, the Ministry of Information and Broadcasting banned over 18 OTT apps from Google play store and suspended all of their 57 social media accounts, as well as closed nineteen streaming websites. The banned platforms were MoodX, Prime Play, Hunters, Besharams, Rabbit movies, Voovi, Fugi, Mojflix, Chikooflix, Nuefliks, Xtramood, NeonX VIP, X Prime, Tri Flicks, Uncut Adda, Dreams Films, Hot Shots VIP, and Yessma. On 25 July 2025, the Ministry of Information and Broadcasting banned from 25 OTT apps from Google play store and suspended all of their 40 social media accounts, as well as 26 closed streaming websites. The banned platforms were include ALTT, Ullu, Big Shots App, Desiflix, Boomex, NeonX VIP, Navarasa Lite, Gulab App, Kangan App, Bull App, ShowHit, Jalva App, Wow Entertainment, Look Entertainment, Hitprime, Fugi, Feneo, ShowX, Sol Talkies, Adda TV, HotX VIP, Hulchul App, MoodX, Triflicks, and Mojflix. On 24 February 2026, the Ministry of Information and Broadcasting banned from 5 OTT apps from Google play store and suspended all of their 5 social media accounts, as well as 5 closed streaming websites. The banned platforms were include Feel App, Digi Movieplex, Jugnu App, MoodX VIP, and Koyal Playpro. === Legal action === Currently OTT is regulated under the IT Rules 2021, which clearly stated that 'No content that is prohibited by law at the time being force can be Publishing or transmitted'. MIB has continuously taking action

Emergent algorithm

An emergent algorithm is an algorithm that exhibits emergent behavior. In essence an emergent algorithm implements a set of simple building block behaviors that when combined exhibit more complex behaviors. One example of this is the implementation of fuzzy motion controllers used to adapt robot movement in response to environmental obstacles. An emergent algorithm has the following characteristics: it achieves predictable global effects it does not require global visibility it does not assume any kind of centralized control it is self-stabilizing Other examples of emergent algorithms and models include cellular automata, artificial neural networks and swarm intelligence systems (ant colony optimization, bees algorithm, etc.).

Attention (machine learning)

In machine learning, attention is a method that determines the importance of each component in a sequence relative to the other components in that sequence. In natural language processing, importance is represented by "soft" weights assigned to each word in a sentence. More generally, attention encodes vectors called token embeddings across a fixed-width sequence that can range from tens to millions of tokens in size. Unlike "hard" weights, which are computed during the backwards training pass, "soft" weights exist only in the forward pass and therefore change with every step of the input. Earlier designs implemented the attention mechanism in a serial recurrent neural network (RNN) language translation system, but a more recent design, namely the transformer, removed the slower sequential RNN and relied more heavily on the faster parallel attention scheme. Inspired by ideas about attention in humans, the attention mechanism was developed to address the weaknesses of using information from the hidden layers of recurrent neural networks. Recurrent neural networks favor information contained in words at the end of a sentence and thus deemed more recent, thereby tending to attenuate the significance and associated predictive weight assigned to information earlier in the sentence. Attention allows a token equal access to any part of a sentence directly, rather than only through the previous state. == History == Additional surveys of the attention mechanism in deep learning are provided by Niu et al. and Soydaner. The major breakthrough came with self-attention, where each element in the input sequence attends to all others, enabling the model to capture global dependencies. This idea was central to the Transformer architecture, which replaced recurrence with attention mechanisms. As a result, Transformers became the foundation for models like BERT, T5 and generative pre-trained transformers (GPT). == Overview == The modern era of machine attention was revitalized by grafting an attention mechanism (Fig 1. orange) to an Encoder-Decoder. Figure 2 shows the internal step-by-step operation of the attention block (A) in Fig 1. === Interpreting attention weights === In translating between languages, alignment is the process of matching words from the source sentence to words of the translated sentence. Networks that perform verbatim translation without regard to word order would show the highest scores along the (dominant) diagonal of the matrix. The off-diagonal dominance shows that the attention mechanism is more nuanced. Consider an example of translating I love you to French. On the first pass through the decoder, 94% of the attention weight is on the first English word I, so the network offers the word je. On the second pass of the decoder, 88% of the attention weight is on the third English word you, so it offers t'. On the last pass, 95% of the attention weight is on the second English word love, so it offers aime. In the I love you example, the second word love is aligned with the third word aime. Stacking soft row vectors together for je, t', and aime yields an alignment matrix: Sometimes, alignment can be multiple-to-multiple. For example, the English phrase look it up corresponds to cherchez-le. Thus, "soft" attention weights work better than "hard" attention weights (setting one attention weight to 1, and the others to 0), as we would like the model to make a context vector consisting of a weighted sum of the hidden vectors, rather than "the best one", as there may not be a best hidden vector. == Variants == Many variants of attention implement soft weights, such as fast weight programmers, or fast weight controllers (1992). A "slow" neural network outputs the "fast" weights of another neural network through outer products. The slow network learns by gradient descent. It was later renamed as "linearized self-attention". Bahdanau-style attention, also referred to as additive attention, Luong-style attention, which is known as multiplicative attention, Early attention mechanisms similar to modern self-attention were proposed using recurrent neural networks. However, the highly parallelizable self-attention was introduced in 2017 and successfully used in the Transformer model, positional attention and factorized positional attention. For convolutional neural networks, attention mechanisms can be distinguished by the dimension on which they operate, namely: spatial attention, channel attention, or combinations. These variants recombine the encoder-side inputs to redistribute those effects to each target output. Often, a correlation-style matrix of dot products provides the re-weighting coefficients. In the figures below, W is the matrix of context attention weights, similar to the formula in Overview section above. == Optimizations == === Flash attention === The size of the attention matrix is proportional to the square of the number of input tokens. Therefore, when the input is long, calculating the attention matrix requires a lot of GPU memory. Flash attention is an implementation that reduces the memory needs and increases efficiency without sacrificing accuracy. It achieves this by partitioning the attention computation into smaller blocks that fit into the GPU's faster on-chip memory, reducing the need to store large intermediate matrices and thus lowering memory usage while increasing computational efficiency. === FlexAttention === FlexAttention is an attention kernel developed by Meta that allows users to modify attention scores prior to softmax and dynamically chooses the optimal attention algorithm. == Applications == Attention is widely used in natural language processing, computer vision, and speech recognition. In NLP, it improves context understanding in tasks like question answering and summarization. In vision, visual attention helps models focus on relevant image regions, enhancing object detection and image captioning. === Attention maps as explanations for vision transformers === From the original paper on vision transformers (ViT), visualizing attention scores as a heat map (called saliency maps or attention maps) has become an important and routine way to inspect the decision making process of ViT models. One can compute the attention maps with respect to any attention head at any layer, while the deeper layers tend to show more semantically meaningful visualization. Attention rollout is a recursive algorithm to combine attention scores across all layers, by computing the dot product of successive attention maps. Because vision transformers are typically trained in a self-supervised manner, attention maps are generally not class-sensitive. When a classification head is attached to the ViT backbone, class-discriminative attention maps (CDAM) combines attention maps and gradients with respect to the class [CLS] token. Some class-sensitive interpretability methods originally developed for convolutional neural networks can be also applied to ViT, such as GradCAM, which back-propagates the gradients to the outputs of the final attention layer. Using attention as basis of explanation for the transformers in language and vision is not without debate. While some pioneering papers analyzed and framed attention scores as explanations, higher attention scores do not always correlate with greater impact on model performances. == Mathematical representation == === Standard scaled dot-product attention === For matrices: Q ∈ R m × d k , K ∈ R n × d k {\displaystyle Q\in \mathbb {R} ^{m\times d_{k}},K\in \mathbb {R} ^{n\times d_{k}}} and V ∈ R n × d v {\displaystyle V\in \mathbb {R} ^{n\times d_{v}}} , the scaled dot-product, or QKV attention, is defined as: Attention ( Q , K , V ) = softmax ( Q K T d k ) V ∈ R m × d v {\displaystyle {\text{Attention}}(Q,K,V)={\text{softmax}}\left({\frac {QK^{T}}{\sqrt {d_{k}}}}\right)V\in \mathbb {R} ^{m\times d_{v}}} where T {\displaystyle {}^{T}} denotes transpose and the softmax function is applied independently to every row of its argument. The matrix Q {\displaystyle Q} contains m {\displaystyle m} queries, while matrices K , V {\displaystyle K,V} jointly contain an unordered set of n {\displaystyle n} key-value pairs. Value vectors in matrix V {\displaystyle V} are weighted using the weights resulting from the softmax operation, so that the rows of the m {\displaystyle m} -by- d v {\displaystyle d_{v}} output matrix are confined to the convex hull of the points in R d v {\displaystyle \mathbb {R} ^{d_{v}}} given by the rows of V {\displaystyle V} . To understand the permutation invariance and permutation equivariance properties of QKV attention, let A ∈ R m × m {\displaystyle A\in \mathbb {R} ^{m\times m}} and B ∈ R n × n {\displaystyle B\in \mathbb {R} ^{n\times n}} be permutation matrices; and D ∈ R m × n {\displaystyle D\in \mathbb {R} ^{m\times n}} an arbitrary matrix. The softmax function is permutation equivariant in the sense that: softmax ( A D B ) = A softmax ( D ) B {\displays

Hidden layer

In artificial neural networks, a hidden layer is a layer of artificial neurons that is neither an input layer nor an output layer. The simplest examples appear in multilayer perceptrons (MLP), as illustrated in the diagram. An MLP without any hidden layer is essentially just a linear model. With hidden layers and activation functions, however, nonlinearity is introduced into the model. In typical machine learning practice, the weights and biases are initialized, then iteratively updated during training via backpropagation.

Timeline of artificial intelligence risks in global finance

The following article is a broad timeline of the course of events related to artificial intelligence risks in global finance. The AI boom has led to concerns including the existential risk from artificial intelligence, as the uptake on applications of artificial intelligence increases. By late 2025, global finance and artificial intelligence were "deeply intertwined". A June 2025 Menlo Ventures report raised concerns about the sustainability of future revenue and long-term profitability of AI, given the relatively low rate of consumer monetization. == 2017 == 30 NovemberThe New York Times said that new AI reports by McKinsey & Company, the National Bureau of Economic Research, and an AI Index created by university researchers, indicated an early AI boom. The Index built on a project—"The One Hundred Year Study on Artificial Intelligence" launched in 2014. == 2018 == 2018 was a year of incremental AI growth in finance. == 2022 == The release of ChatGPT by OpenAI became the catalyst for an artificial intelligence boom that continues to remake the global economy. According to a European Central Bank report, public interest in AI increased rapidly as evidenced with rising Google searches, AI jobs, models, patents, and innovations since late 2022. At that time Europe led the US in the size of its AI workforce. == 2023 == The regulatory body, the International Monetary Fund (IMF), published their report, "Generative Artificial Intelligence in Finance: Risk Considerations", drawing attention to oversight gaps and the need for regulations. The report explores the risks posed by using generative artificial intelligence (GenAI) systems in the financial sector including "broader risks to financial stability." == 2024 == January 12 In January 2024 Bloomberg's published its list of the "Magnificent Seven" Big Tech companies on the stock market based on their strength, size and market capitalization:Apple, Microsoft, Alphabet (Google), Amazon, Meta Platforms (Facebook), Nvidia, and Tesla. 21 June During the AI boom, Nvidia became the world's most valuable company, surpassing Microsoft, as its value increased to over US$4 trillion. In 2023 and 2024, the "Magnificent Seven" stocks were the primary drivers behind the increase in equity indexes, according to Reuters. == 2025 == === January === 23 January President Donald Trump's AI policy was announced calling for United States global leadership in artificial intelligence. The Economist noted that this politic shift in which the United States seeks "global dominance" in AI includes trimming regulations and assisting in expansion of infrastructure and increase in number of AI workers. Governments of Gulf nations were also investing trillions of dollars in AI. 27 January Against the backdrop of a tech war between China and the United States over AI dominance, within days of the launch of China's free DeepSeek App, it was the most downloaded app in the United States, rising to the first place in the Apple app store. President Trump responded immediately, saying this "sudden rise" should be a "wake-up" call to the United States, and called on US companies to be more competitive. === June === 26 June In their June 2025 report, Menlo Ventures estimated that only about 3% of consumers paid for artificial intelligence-related services, representing about $USD12 billion in annual spending. This is relatively low in contrast to the massive capital expenditure by AI infrastructure companies, which raises concerns about revenue sustainability and long-term profitability. === July === 23 July The Trump administration launched the US AI Action Plan, positioning the United States in a high-stakes technological race with China for global dominance in artificial intelligence, emphasizing that neither nation can afford to fall behind due to the exponential nature of AI advancement. The plan, a new government website and policy speech called for accelerated AI adoption across federal agencies, and a number of initiatives to make is easier for AI infrastructure expansion, and other measures to ensure American leadership in AI standards. Some leading experts warned that the administration failed to provide sufficient regulations and safeguards for AI safety. Concerns were raised about the negative impacts of cuts to research funding and tightened visa policies for scientists, potentially undermining public trust and America's ability to compete internationally. === September === 7 September The Economist cautioned that AI revenues are relatively modest compared to the high cost and investments in the creation of new data centers. Even Sam Altman, OpenAI CEO and one of the leading figures of the AI boom,, raised concerns about investors' outsized hopes for financial returns. At the same time, history has shown that new technologies, like railways and electricity, endured and spread after the initial hype faded. 12 September Economists warn that U.S. households' direct and indirect investments—mutual funds or retirement plans—in the stock market reached an unprecedented historically high level, now representing 45% of all financial assets, or about $USD51.2 trillion. Compared to the Dot-com bubble this represents a sharp increase in exposure. This makes U.S. households vulnerable to market downturns which in turn would result in decreasing consumer spending. U.S. household net worth rose to a record $176.3 trillion in the second quarter, an increase of $7.3 trillion since early 2025 and about $46 trillion higher than before the pandemic. Federal Reserve data attribute the surge primarily to gains in stock markets and housing values. However, the rise in wealth on paper coincided with increased household borrowing and growing government debt. 18 September Questions were being raised about how quickly the data centers, chips, servers, and GPUs assets of major AI companies will depreciate in value. Comparisons have been made to the Railway Mania in the aftermath of the stock market bubble where a valuable physical infrastructure remained standing, and the telecoms crash after the dot-com bubble which left fiber networks. 28 September There were warnings that record-high American stock ownership during the AI-fueled market boom is a red flag for systemic risk, as the current concentration in equities exceeds levels seen before the dot-com bubble burst in 2000, and could amplify the impact of any future stock market correction. === October === 3 October In 2025 alone, venture capitalists invested almost $USD200 billion in the artificial intelligence sector. 29 October Nvidia was the first company in the world to be valued at US$5 trillion, largely due to AI demand and strategic partnerships with leading technology and AI firms. Nvidia's increase in value was "meteoric". === November === 2 November Forbes reported that, since April, the 'Magnificent Seven' tech giants together contributed over 40% of the S&P 500's return, highlighting their outsized influence and the growing impact of AI on market valuations. CNN warned that while there is a current benefit to investors, with such a high concentration in the S&P 500, they are highly exposed to the fate of the Mag Seven. 2 November Globally there are 11,000 datacentres—huge campuses for AI infrastructure, including thousands of chips, GPUS, and servers. This represents a 500% increase over the last two decades. It is anticipated that $3USDtn more will be spent on increasing that number over the next two or three years. 5 November Concerns about the potential for a market bubble were raised as six of the AI-related Big Tech "Magnificent Seven"—that contribute to the AI boom—reported losing ground in the stock market. Global markets and artificial intelligence have become "deeply intertwined", according to a Reuters report. As of November 2025, more than 50% of the 20 largest S&P firms were deeply exposed to AI. In contrast, in 2000, the 20 S&P 500 firms represented 39% of its total value only 11 of these companies were exposed to the internet. If AI fails to deliver strong returns on their investments, these top S&P firms would be significantly impacted, according to the Economist. Analysts suggest that the AI market in 2025 may not behave like a traditional one, as investors are simultaneously aware of the risks and driven by the potential for outsized rewards. Leading AI labs may believe that the first company to achieve artificial general intelligence (AGI), when an AI system surpasses all human cognitive abilities and becomes capable of self-improvement—could dominate the future of technology and finance. While some have estimated that the potential value of such a breakthrough could be as high as $1.46 quadrillion, this figure is speculative and widely debated. 5 November Bloomberg described Nvidia's H100 Hopper-Blackwell AI chips as the "King of AI chips". Nvidia dominates the AI chip market with over 78% of the market share because of both speed and cost. According to B

Certified social engineering prevention specialist

Certified Social Engineering Prevention Specialist (CSEPS) is a social engineering security-awareness training and professional certification program originally developed by Kevin Mitnick and Alexis Kasperavičius. == Course structure == The original CSEPS program was structured as a multi-module corporate security-awareness course designed to teach employees, managers, and IT personnel how social engineers manipulate human behavior to bypass technical security systems. The curriculum combined case studies, psychological analysis, attack demonstrations, pretexting exercises, and operational security scenarios. The course materials described social engineering as the exploitation of "the human factor" in information security and argued that traditional technical defenses alone were insufficient to protect organizations from deception-based attacks. The training program was divided into instructional modules covering topics such as: social engineering methodology and threat analysis intelligence gathering and reconnaissance dumpster diving pretexting elicitation technique telephone-system exploitation and caller-ID spoofing psychological influence techniques industrial espionage identity theft organizational vulnerabilities security policy development and employee awareness training The course also analyzed historical and contemporary case studies involving information theft, corporate espionage, fraudulent wire transfers, and telephone-based impersonation attacks. Training exercises required participants to analyze how attackers established credibility, manipulated trust, overcame objections, and exploited organizational procedures. According to The Wall Street Journal, CSEPS was delivered as a two-day "boot camp" course costing approximately US$1,500 per attendee. Clients reportedly included the United States Air Force and the United States Marine Corps. The certification examination included multiple-choice and written-response sections dealing with social-engineering defense scenarios and mitigation strategies. == History == In 2003, Mitnick and Kasperavičius partnered with the Florida-based IT training company Intense School Inc. to offer CSEPS classes throughout the United States. In 2020, Mitnick partnered with security-awareness training company KnowBe4, and elements of the original CSEPS material became incorporated into KnowBe4's social-engineering awareness training offerings.

Knowledge graph embedding

In representation learning, knowledge graph embedding (KGE), also called knowledge representation learning (KRL), or multi-relation learning, is a machine learning task of learning a low-dimensional representation of a knowledge graph's entities and relations while preserving their semantic meaning. Leveraging their embedded representation, knowledge graphs can be used for various applications such as link prediction, triple classification, entity recognition, clustering, and relation extraction. == Definition == A knowledge graph G = { E , R , F } {\displaystyle {\mathcal {G}}=\{E,R,F\}} is a collection of entities E {\displaystyle E} , relations R {\displaystyle R} , and facts F {\displaystyle F} . A fact is a triple ( h , r , t ) ∈ F {\displaystyle (h,r,t)\in F} that denotes a link r ∈ R {\displaystyle r\in R} between the head h ∈ E {\displaystyle h\in E} and the tail t ∈ E {\displaystyle t\in E} of the triple. Another notation that is often used in the literature to represent a triple (or fact) is ⟨ head , relation , tail ⟩ {\displaystyle \langle {\text{head}},{\text{relation}},{\text{tail}}\rangle } . This notation is called the Resource Description Framework (RDF). A knowledge graph represents the knowledge related to a specific domain; leveraging this structured representation, it is possible to infer a piece of new knowledge from it after some refinement steps. However, nowadays, people have to deal with the sparsity of data and the computational inefficiency to use them in a real-world application. The embedding of a knowledge graph is a function that translates each entity and each relation into a vector of a given dimension d {\displaystyle d} , called embedding dimension. It is even possible to embed the entities and relations with different dimensions. The embedding vectors can then be used for other tasks. A knowledge graph embedding is characterized by four aspects: Representation space: The low-dimensional space in which the entities and relations are represented. Scoring function: A measure of the goodness of a triple-embedded representation. Encoding models: The modality in which the embedded representation of the entities and relations interact with each other. Additional information: Any additional information coming from the knowledge graph that can enrich the embedded representation. Usually, an ad hoc scoring function is integrated into the general scoring function for each additional piece of information. == Embedding procedure == All algorithms for creating a knowledge graph embedding follow the same approach. First, the embedding vectors are initialized to random values. Then, they are iteratively optimized using a training set of triples. In each iteration, a batch of size b {\displaystyle b} triples is sampled from the training set, and a triple from it is sampled and corrupted—i.e., a triple that does not represent a true fact in the knowledge graph. The corruption of a triple involves substituting the head or the tail (or both) of the triple with another entity that makes the fact false. The original triple and the corrupted triple are added in the training batch, and then the embeddings are updated, optimizing a scoring function. Iteration stops when a stop condition is reached. Usually, the stop condition depends on the overfitting of the training set. At the end, the learned embeddings should have extracted semantic meaning from the training triples and should correctly predict unseen true facts in the knowledge graph. === Pseudocode === The following is the pseudocode for the general embedding procedure. algorithm Compute entity and relation embeddings input: The training set S = { ( h , r , t ) } {\displaystyle S=\{(h,r,t)\}} , entity set E {\displaystyle E} , relation set R {\displaystyle R} , embedding dimension k {\displaystyle k} output: Entity and relation embeddings initialization: the entities e {\displaystyle e} and relations r {\displaystyle r} embeddings (vectors) are randomly initialized while stop condition do S b a t c h ← s a m p l e ( S , b ) {\displaystyle S_{batch}\leftarrow sample(S,b)} // Sample a batch from the training set for each ( h , r , t ) {\displaystyle (h,r,t)} in S b a t c h {\displaystyle S_{batch}} do ( h ′ , r , t ′ ) ← s a m p l e ( S ′ ) {\displaystyle (h',r,t')\leftarrow sample(S')} // Sample a corrupted fact T b a t c h ← T b a t c h ∪ { ( ( h , r , t ) , ( h ′ , r , t ′ ) ) } {\displaystyle T_{batch}\leftarrow T_{batch}\cup \{((h,r,t),(h',r,t'))\}} end for Update embeddings by minimizing the loss function end while == Performance indicators == These indexes are often used to measure the embedding quality of a model. The simplicity of the indexes makes them very suitable for evaluating the performance of an embedding algorithm even on a large scale. Given Q {\displaystyle {\ce {Q}}} as the set of all ranked predictions of a model, it is possible to define three different performance indexes: Hits@K, MR, and MRR. === Hits@K === Hits@K or in short, H@K, is a performance index that measures the probability to find the correct prediction in the first top K model predictions. Usually, it is used k = 10 {\displaystyle k=10} . Hits@K reflects the accuracy of an embedding model to predict the relation between two given triples correctly. Hits@K = | { q ∈ Q : q < k } | | Q | ∈ [ 0 , 1 ] {\displaystyle ={\frac {|\{q\in Q:q