"Something Big Is Happening" is an essay by Matt Shumer, an AI entrepreneur, about the impact of artificial intelligence, published in February 2026, that has since been reportedly viewed more than 80 million times and widely discussed. Shumer noted that the technology has crossed an important threshold, where AI has become capable of creating self-improving systems. Referring to one the most recent AI models, he wrote: "It was making intelligent decisions. It had something that felt, for the first time, like judgment. Like taste." Speaking to CNBC's Power Lunch, Shumer said that his "core message" is "people in the workforce should start to use and experiment with AI tools so they can understand what’s coming". Even as the essay was widely shared and discussed, the essay also elicited criticism. Paulo Carvao, in an essay published by the Forbes Magazine stated that some of his advice is sound, but added: "It reads at times like a sales pitch. He urges readers to subscribe to the most advanced AI tools. He implies that those with access to premium models will outpace those without. He frames paid AI subscriptions as a form of insurance against obsolescence." Writing in The Guardian, Dan Milmo and Aisha Down mentioned Shumer as having a history of AI hype and stated, "He previously excited the internet by announcing the release of the world's "top open-source model", which it was not". Many workers in the technology sector criticized the article in blog posts shared on Hacker News; Edward Zitron commented that "while coding LLMs can test products, or scan/fix some bugs, this suggests they A) do this autonomously without human input, B) they do this correctly every time (or ever!)." In an article alluding to Shumer's original post, Ari Colaprete wrote "the LLM is fundamentally a writing machine, it does everything via text, and if you make it produce writing that exists purely to serve some sort of mechanical function, and you train it to succeed in that task, then it will tend to do so, even with vast intricacy."
Connectionist expert system
Connectionist expert systems are artificial neural network (ANN) based expert systems where the ANN generates inferencing rules e.g., fuzzy-multi layer perceptron where linguistic and natural form of inputs are used. Apart from that, rough set theory may be used for encoding knowledge in the weights better and also genetic algorithms may be used to optimize the search solutions better. Symbolic reasoning methods may also be incorporated (see hybrid intelligent system). (Also see expert system, neural network, clinical decision support system.)
Query rewriting
Query rewriting is a typically automatic transformation that takes a set of database tables, views, and/or queries, usually indices, often gathered data and query statistics, and other metadata, and yields a set of different queries, which produce the same results but execute with better performance (for example, faster, or with lower memory use). Query rewriting can be based on relational algebra or an extension thereof (e.g. multiset relational algebra with sorting, aggregation and three-valued predicates i.e. NULLs as in the case of SQL). The equivalence rules of relational algebra are exploited, in other words, different query structures and orderings can be mathematically proven to yield the same result. For example, filtering on fields A and B, or cross joining R and S can be done in any order, but there can be a performance difference. Multiple operations may be combined, and operation orders may be altered. The result of query rewriting may not be at the same abstraction level or application programming interface (API) as the original set of queries (though often is). For example, the input queries may be in relational algebra or SQL, and the rewritten queries may be closer to the physical representation of the data, e.g. array operations. Query rewriting can also involve materialization of views and other subqueries; operations that may or may not be available to the API user. The query rewriting transformation can be aided by creating indices from which the optimizer can choose (some database systems create their own indexes if deemed useful), mandating the use of specific indices, creating materialized and/or denormalized views, or helping a database system gather statistics on the data and query use, as the optimality depends on patterns in data and typical query usage. Query rewriting may be rule based or optimizer based. Some sources discuss query rewriting as a distinct step prior to optimization, operating at the level of the user accessible algebra API (e.g. SQL). There are other, largely unrelated concepts also named similarly, for example, query rewriting by search engines.
Hindley–Milner type system
A Hindley–Milner (HM) type system is a classical type system for the lambda calculus with parametric polymorphism. It is also known as Damas–Milner or Damas–Hindley–Milner. It was first described by J. Roger Hindley and later rediscovered by Robin Milner. Luis Damas contributed a close formal analysis and proof of the method in his PhD thesis. Among HM's more notable properties are its completeness and its ability to infer the most general type of a given program without programmer-supplied type annotations or other hints. Algorithm W is an efficient type inference method in practice and has been successfully applied on large code bases, although it has a high theoretical complexity. HM is preferably used for functional programming languages. It was first implemented as part of the type system of the programming language ML. Since then, HM has been extended in various ways, most notably with type class constraints like those in Haskell. == Introduction == As a type inference method, Hindley–Milner is able to deduce the types of variables, expressions and functions from programs written in an entirely untyped style. Being scope sensitive, it is not limited to deriving the types only from a small portion of source code, but rather from complete programs or modules. Being able to cope with parametric types, too, it is core to the type systems of many functional programming languages. It was first applied in this manner in the ML programming language. The origin is the type inference algorithm for the simply typed lambda calculus that was devised by Haskell Curry and Robert Feys in 1958. In 1969, J. Roger Hindley extended this work and proved that their algorithm always inferred the most general type. In 1978, Robin Milner, independently of Hindley's work, provided an equivalent algorithm, Algorithm W. In 1982, Luis Damas finally proved that Milner's algorithm is complete and extended it to support systems with polymorphic references. === Monomorphism vs. polymorphism === In the simply typed lambda calculus, types T are either atomic type constants or function types of form T → T {\displaystyle T\rightarrow T} . Such types are monomorphic. Typical examples are the types used in arithmetic values: 3 : N u m b e r a d d 3 4 : N u m b e r a d d : N u m b e r → N u m b e r → N u m b e r {\displaystyle {\begin{array}{ll}3&:{\mathtt {Number}}\\{\mathtt {add}}\ 3\ 4&:{\mathtt {Number}}\\{\mathtt {add}}&:{\mathtt {Number}}\rightarrow {\mathtt {Number}}\rightarrow {\mathtt {Number}}\end{array}}} Contrary to this, the untyped lambda calculus is neutral to typing at all, and many of its functions can be meaningfully applied to all type of arguments. The trivial example is the identity function i d ≡ λ x . x {\displaystyle {\mathtt {id}}\equiv \lambda x.x} which simply returns whatever value it is applied to. Less trivial examples include parametric types like lists. While polymorphism in general means that operations accept values of more than one type, the polymorphism used here is parametric. One finds the notation of type schemes in the literature, too, emphasizing the parametric nature of the polymorphism. Additionally, constants may be typed with (quantified) type variables. For example, the following type schemes quantify universally over α {\displaystyle \alpha } , meaning that they are true for all possible α {\displaystyle \alpha } : c o n s : ∀ α . α → L i s t α → L i s t α n i l : ∀ α . L i s t α i d : ∀ α . α → α {\displaystyle {\begin{array}{ll}{\mathtt {cons}}&:\forall \alpha .\alpha \rightarrow {\mathtt {List}}\ \alpha \rightarrow {\mathtt {List}}\ \alpha \\{\mathtt {nil}}&:\forall \alpha .{\mathtt {List}}\ \alpha \\{\mathtt {id}}&:\forall \alpha .\alpha \rightarrow \alpha \end{array}}} Polymorphic types can become monomorphic by consistent substitution of their variables. Examples of monomorphic instances are: i d ′ : S t r i n g → S t r i n g n i l ′ : L i s t N u m b e r {\displaystyle {\begin{array}{ll}{\mathtt {id}}'&:{\mathtt {String}}\rightarrow {\mathtt {String}}\\{\mathtt {nil}}'&:{\mathtt {List}}\ {\mathtt {Number}}\end{array}}} More generally, types are polymorphic when they contain type variables, while types without them are monomorphic. Contrary to the type systems used for example in Pascal (1970) or C (1972), which only support monomorphic types, HM is designed with emphasis on parametric polymorphism. The successors of the languages mentioned, like C++ (1985), focused on different types of polymorphism, namely subtyping in connection with object-oriented programming and overloading. While subtyping is incompatible with HM, a variant of systematic overloading is available in the HM-based type system of Haskell. === Let-polymorphism === When extending the type inference for the simply-typed lambda calculus towards polymorphism, one has to decide whether assigning a polymorphic type not only as type of an expression, but also as the type of a λ-bound variable is admissible. This would allow the generic identity type to be assigned to the variable 'id' in: (λ id . ... (id 3) ... (id "text") ... ) (λ x . x) Allowing this gives rise to the polymorphic lambda calculus; however, type inference in this system is not decidable. Instead, HM distinguishes variables that are immediately bound to an expression from more general λ-bound variables, calling the former let-bound variables, and allows polymorphic types to be assigned only to these. This leads to let-polymorphism where the above example takes the form let id = λ x . x in ... (id 3) ... (id "text") ... which can be typed with a polymorphic type for 'id'. As indicated, the expression syntax is extended to make the let-bound variables explicit, and by restricting the type system to allow only let-bound variable to have polymorphic types, while the parameters in lambda-abstractions must get a monomorphic type, type inference becomes decidable. == Overview == The remainder of this article proceeds as follows: The HM type system is defined. This is done by describing a deduction system that makes precise what expressions have what type, if any. From there, it works towards an implementation of the type inference method. After introducing a syntax-driven variant of the above deductive system, it sketches an efficient implementation (algorithm J), appealing mostly to the reader's metalogical intuition. Because it remains open whether algorithm J indeed realises the initial deduction system, a less efficient implementation (algorithm W), is introduced and its use in a proof is hinted. Finally, further topics related to the algorithm are discussed. The same description of the deduction system is used throughout, even for the two algorithms, to make the various forms in which the HM method is presented directly comparable. == The Hindley–Milner type system == The type system can be formally described by syntax rules that fix a language for the expressions, types, etc. The presentation here of such a syntax is not too formal, in that it is written down not to study the surface grammar, but rather the depth grammar, and leaves some syntactical details open. This form of presentation is usual. Building on this, typing rules are used to define how expressions and types are related. As before, the form used is a bit liberal. === Syntax === The expressions to be typed are exactly those of the lambda calculus extended with a let-expression as shown in the adjacent table. Parentheses can be used to disambiguate an expression. The application is left-binding and binds stronger than abstraction or the let-in construct. Types are syntactically split into two groups, monotypes and polytypes. ==== Monotypes ==== Monotypes always designate a particular type. Monotypes τ {\displaystyle \tau } are syntactically represented as terms. Examples of monotypes include type constants like i n t {\displaystyle {\mathtt {int}}} or s t r i n g {\displaystyle {\mathtt {string}}} , and parametric types like M a p ( S e t s t r i n g ) i n t {\displaystyle {\mathtt {Map\ (Set\ string)\ int}}} . The latter types are examples of applications of type functions, for example, from the set { M a p 2 , S e t 1 , s t r i n g 0 , i n t 0 , → 2 } {\displaystyle \{{\mathtt {Map^{2},\ Set^{1},\ string^{0},\ int^{0}}},\ \rightarrow ^{2}\}} , where the superscript indicates the number of type parameters. The complete set of type functions C {\displaystyle C} is arbitrary in HM, except that it must contain at least → 2 {\displaystyle \rightarrow ^{2}} , the type of functions. It is often written in infix notation for convenience. For example, a function mapping integers to strings has type i n t → s t r i n g {\displaystyle {\mathtt {int}}\rightarrow {\mathtt {string}}} . Again, parentheses can be used to disambiguate a type expression. The application binds stronger than the infix arrow, which is right-binding. Type variables are admitted as monotypes. Monotypes are not to be confused with monomorphic types, which exc
Broadcast (parallel pattern)
Broadcast is a collective communication primitive in parallel programming to distribute programming instructions or data to nodes in a cluster. It is the reverse operation of reduction. The broadcast operation is widely used in parallel algorithms, such as matrix-vector multiplication, Gaussian elimination and shortest paths. The Message Passing Interface implements broadcast in MPI_Bcast. == Definition == A message M [ 1.. m ] {\displaystyle M[1..m]} of length m {\displaystyle m} should be distributed from one node to all other p − 1 {\displaystyle p-1} nodes. T byte {\displaystyle T_{\text{byte}}} is the time it takes to send one byte. T start {\displaystyle T_{\text{start}}} is the time it takes for a message to travel to another node, independent of its length. Therefore, the time to send a package from one node to another is t = s i z e × T byte + T start {\displaystyle t=\mathrm {size} \times T_{\text{byte}}+T_{\text{start}}} . p {\displaystyle p} is the number of nodes and the number of processors. == Binomial Tree Broadcast == With Binomial Tree Broadcast the whole message is sent at once. Each node that has already received the message sends it on further. This grows exponentially as each time step the amount of sending nodes is doubled. The algorithm is ideal for short messages but falls short with longer ones as during the time when the first transfer happens only one node is busy. Sending a message to all nodes takes log 2 ( p ) t {\displaystyle \log _{2}(p)t} time which results in a runtime of log 2 ( p ) ( m T byte + T start ) {\displaystyle \log _{2}(p)(mT_{\text{byte}}+T_{\text{start}})} == Linear Pipeline Broadcast == The message is split up into k {\displaystyle k} packages and sent piecewise from node n {\displaystyle n} to node n + 1 {\displaystyle n+1} . The time needed to distribute the first message piece is p t = m k T byte + T start {\textstyle pt={\frac {m}{k}}T_{\text{byte}}+T_{\text{start}}} whereby t {\displaystyle t} is the time needed to send a package from one processor to another. Sending a whole message takes ( p + k ) ( m T byte k + T start ) = ( p + k ) t = p t + k t {\displaystyle (p+k)\left({\frac {mT_{\text{byte}}}{k}}+T_{\text{start}}\right)=(p+k)t=pt+kt} . Optimal is to choose k = m ( p − 2 ) T byte T start {\displaystyle k={\sqrt {\frac {m(p-2)T_{\text{byte}}}{T_{\text{start}}}}}} resulting in a runtime of approximately m T byte + p T start + m p T start T byte {\displaystyle mT_{\text{byte}}+pT_{\text{start}}+{\sqrt {mpT_{\text{start}}T_{\text{byte}}}}} The run time is dependent on not only message length but also the number of processors that play roles. This approach shines when the length of the message is much larger than the amount of processors. == Pipelined Binary Tree Broadcast == This algorithm combines Binomial Tree Broadcast and Linear Pipeline Broadcast, which makes the algorithm work well for both short and long messages. The aim is to have as many nodes work as possible while maintaining the ability to send short messages quickly. A good approach is to use Fibonacci trees for splitting up the tree, which are a good choice as a message cannot be sent to both children at the same time. This results in a binary tree structure. We will assume in the following that communication is full-duplex. The Fibonacci tree structure has a depth of about d ≈ log Φ ( p ) {\displaystyle d\approx \log _{\Phi }(p)} whereby Φ = 1 + 5 2 {\displaystyle \Phi ={\frac {1+{\sqrt {5}}}{2}}} the golden ratio. The resulting runtime is ( m k T byte + T start ) ( d + 2 k − 2 ) {\textstyle ({\frac {m}{k}}T_{\text{byte}}+T_{\text{start}})(d+2k-2)} . Optimal is k = n ( d − 2 ) T byte 3 T start {\displaystyle k={\sqrt {\frac {n(d-2)T_{\text{byte}}}{3T_{\text{start}}}}}} . This results in a runtime of 2 m T byte + T start log Φ ( p ) + 2 m log Φ ( p ) T start T byte {\displaystyle 2mT_{\text{byte}}+T_{\text{start}}\log _{\Phi }(p)+{\sqrt {2m\log _{\Phi }(p)T_{\text{start}}T_{\text{byte}}}}} . == Two Tree Broadcast (23-Broadcast) == === Definition === This algorithm aims to improve on some disadvantages of tree structure models with pipelines. Normally in tree structure models with pipelines (see above methods), leaves receive just their data and cannot contribute to send and spread data. The algorithm concurrently uses two binary trees to communicate over. Those trees will be called tree A and B. Structurally in binary trees there are relatively more leave nodes than inner nodes. Basic Idea of this algorithm is to make a leaf node of tree A be an inner node of tree B. It has also the same technical function in opposite side from B to A tree. This means, two packets are sent and received by inner nodes and leaves in different steps. === Tree construction === The number of steps needed to construct two parallel-working binary trees is dependent on the amount of processors. Like with other structures one processor can is the root node who sends messages to two trees. It is not necessary to set a root node, because it is not hard to recognize that the direction of sending messages in binary tree is normally top to bottom. There is no limitation on the number of processors to build two binary trees. Let the height of the combined tree be h = ⌈log(p + 2)⌉. Tree A and B can have a height of h − 1 {\displaystyle h-1} . Especially, if the number of processors correspond to p = 2 h − 1 {\displaystyle p=2^{h}-1} , we can make both sides trees and a root node. To construct this model efficiently and easily with a fully built tree, we can use two methods called "Shifting" and "Mirroring" to get second tree. Let assume tree A is already modeled and tree B is supposed to be constructed based on tree A. We assume that we have p {\displaystyle p} processors ordered from 0 to p − 1 {\displaystyle p-1} . ==== Shifting ==== The "Shifting" method, first copies tree A and moves every node one position to the left to get tree B. The node, which will be located on -1, becomes a child of processor p − 2 {\displaystyle p-2} . ==== Mirroring ==== "Mirroring" is ideal for an even number of processors. With this method tree B can be more easily constructed by tree A, because there are no structural transformations in order to create the new tree. In addition, a symmetric process makes this approach simple. This method can also handle an odd number of processors, in this case, we can set processor p − 1 {\displaystyle p-1} as root node for both trees. For the remaining processors "Mirroring" can be used. === Coloring === We need to find a schedule in order to make sure that no processor has to send or receive two messages from two trees in a step. The edge, is a communication connection to connect two nodes, and can be labelled as either 0 or 1 to make sure that every processor can alternate between 0 and 1-labelled edges. The edges of A and B can be colored with two colors (0 and 1) such that no processor is connected to its parent nodes in A and B using edges of the same color- no processor is connected to its children nodes in A or B using edges of the same color. In every even step the edges with 0 are activated and edges with 1 are activated in every odd step. === Time complexity === In this case the number of packet k is divided in half for each tree. Both trees are working together the total number of packets k = k / 2 + k / 2 {\displaystyle k=k/2+k/2} (upper tree + bottom tree) In each binary tree sending a message to another nodes takes 2 i {\displaystyle 2i} steps until a processor has at least a packet in step i {\displaystyle i} . Therefore, we can calculate all steps as d := log 2 ( p + 1 ) ⇒ log 2 ( p + 1 ) ≈ log 2 ( p ) {\displaystyle d:=\log _{2}(p+1)\Rightarrow \log _{2}(p+1)\approx \log _{2}(p)} . The resulting run time is T ( m , p , k ) ≈ ( m k T byte + T start ) ( 2 d + k − 1 ) {\textstyle T(m,p,k)\approx ({\frac {m}{k}}T_{\text{byte}}+T_{\text{start}})(2d+k-1)} . (Optimal k = m ( 2 d − 1 ) T byte / T start {\textstyle k={\sqrt {{m(2d-1)T_{\text{byte}}}/{T_{\text{start}}}}}} ) This results in a run time of T ( m , p ) ≈ m T byte + T start ⋅ 2 log 2 ( p ) + m ⋅ 2 log 2 ( p ) T start T byte {\displaystyle T(m,p)\approx mT_{\text{byte}}+T_{\text{start}}\cdot 2\log _{2}(p)+{\sqrt {m\cdot 2\log _{2}(p)T_{\text{start}}T_{\text{byte}}}}} . == ESBT-Broadcasting (Edge-disjoint Spanning Binomial Trees) == In this section, another broadcasting algorithm with an underlying telephone communication model will be introduced. A Hypercube creates network system with p = 2 d ( d = 0 , 1 , 2 , 3 , . . . ) {\displaystyle p=2^{d}(d=0,1,2,3,...)} . Every node is represented by binary 0 , 1 {\displaystyle {0,1}} depending on the number of dimensions. Fundamentally ESBT(Edge-disjoint Spanning Binomial Trees) is based on hypercube graphs, pipelining( m {\displaystyle m} messages are divided by k {\displaystyle k} packets) and binomial trees. The Processor 0 d {\displaystyle 0^{d}} cyclically spreads packets to roots of ESB
Truth discovery
Truth discovery (also known as truth finding) is the process of choosing the actual true value for a data item when different data sources provide conflicting information on it. Several algorithms have been proposed to tackle this problem, ranging from simple methods like majority voting to more complex ones able to estimate the trustworthiness of data sources. Truth discovery problems can be divided into two sub-classes: single-truth and multi-truth. In the first case only one true value is allowed for a data item (e.g birthday of a person, capital city of a country). While in the second case multiple true values are allowed (e.g. cast of a movie, authors of a book). Typically, truth discovery is the last step of a data integration pipeline, when the schemas of different data sources have been unified and the records referring to the same data item have been detected. == General principles == The abundance of data available on the web makes more and more probable to find that different sources provide (partially or completely) different values for the same data item. This, together with the fact that we are increasing our reliance on data to derive important decisions, motivates the need of developing good truth discovery algorithms. Many currently available methods rely on a voting strategy to define the true value of a data item. Nevertheless, recent studies, have shown that, if we rely only on majority voting, we could get wrong results even in 30% of the data items. The solution to this problem is to assess the trustworthiness of the sources and give more importance to votes coming from trusted sources. Ideally, supervised learning techniques could be exploited to assign a reliability score to sources after hand-crafted labeling of the provided values; unfortunately, this is not feasible since the number of needed labeled examples should be proportional to the number of sources, and in many applications the number of sources can be prohibitive. == Single-truth vs multi-truth discovery == Single-truth and multi-truth discovery are two very different problems. Single-truth discovery is characterized by the following properties: only one true value is allowed for each data item; different values provided for a given data item oppose to each other; values and sources can either be correct or erroneous. While in the multi-truth case the following properties hold: the truth is composed by a set of values; different values could provide a partial truth; claiming one value for a given data item does not imply opposing to all the other values; the number of true values for each data item is not known a priori. Multi-truth discovery has unique features that make the problem more complex and should be taken into consideration when developing truth-discovery solutions. The examples below point out the main differences of the two methods. Knowing that in both examples the truth is provided by source 1, in the single truth case (first table) we can say that sources 2 and 3 oppose to the truth and as a result provide wrong values. On the other hand, in the second case (second table), sources 2 and 3 are neither correct nor erroneous, they instead provide a subset of the true values and at the same time they do not oppose the truth. == Source trustworthiness == The vast majority of truth discovery methods are based on a voting approach: each source votes for a value of a certain data item and, at the end, the value with the highest vote is select as the true one. In the more sophisticated methods, votes do not have the same weight for all the data sources, more importance is indeed given to votes coming from trusted sources. Source trustworthiness usually is not known a priori but estimated with an iterative approach. At each step of the truth discovery algorithm the trustworthiness score of each data source is refined, improving the assessment of the true values that in turn leads to a better estimation of the trustworthiness of the sources. This process usually ends when all the values reach a convergence state. Source trustworthiness can be based on different metrics, such as accuracy of provided values, copying values from other sources and domain coverage. Detecting copying behaviors is very important, in fact, copy allows to spread false values easily making truth discovery very hard, since many sources would vote for the wrong values. Usually systems decrease the weight of votes associated to copied values or even don’t count them at all. == Single-truth methods == Most of the currently available truth discovery methods have been designed to work well only in the single-truth case. Below are reported some of the characteristics of the most relevant typologies of single-truth methods and how different systems model source trustworthiness. === Majority voting === Majority voting is the simplest method, the most popular value is selected as the true one. Majority voting is commonly used as a baseline when assessing the performances of more complex methods. === Web-link based === These methods estimate source trustworthiness exploiting a similar technique to the one used to measure authority of web pages based on web links. The vote assigned to a value is computed as the sum of the trustworthiness of the sources that provide that particular value, while the trustworthiness of a source is computed as the sum of the votes assigned to the values that the source provides. === Information-retrieval based === These methods estimate source trustworthiness using similarity measures typically used in information retrieval. Source trustworthiness is computed as the cosine similarity (or other similarity measures) between the set of values provided by the source and the set of values considered true (either selected in a probabilistic way or obtained from a ground truth). === Bayesian based === These methods use Bayesian inference to define the probability of a value being true conditioned on the values provided by all the sources. P ( v ∣ ψ ( o ) ) = P ( ψ ( o ) ∣ v ) ⋅ P ( v ) P ( ψ ( o ) ) {\displaystyle P(v\mid \psi (o))={\frac {P(\psi (o)\mid v)\cdot P(v)}{P(\psi (o))}}} where v {\displaystyle \textstyle v} is a value provided for a data item o {\displaystyle \textstyle o} and ψ ( o ) {\displaystyle \textstyle \psi (o)} is the set of the observed values provided by all the sources for that specific data item. The trustworthiness of a source is then computed based on the accuracy of the values that provides. Other more complex methods exploit Bayesian inference to detect copying behaviors and use these insights to better assess source trustworthiness. == Multi-truth methods == Due to its complexity, less attention has been devoted to the study of the multi-truth discovery Below are reported two typologies of multi-truth methods and their characteristics. === Bayesian based === These methods use Bayesian inference to define the probability of a group of values being true conditioned on the values provided by all the data sources. In this case, since there could be multiple true values for each data item, and sources can provide multiple values for a single data item, it is not possible to consider values individually. An alternative is to consider mappings and relations between set of provided values and sources providing them. The trustworthiness of a source is then computed based on the accuracy of the values that provides. More sophisticated methods also consider domain coverage and copying behaviors to better estimate source trustworthiness. === Probabilistic Graphical Models based === These methods use probabilistic graphical models to automatically define the set of true values of given data item and also to assess source quality without need of any supervision. == Applications == Many real-world applications can benefit from the use of truth discovery algorithms. Typical domains of application include: healthcare, crowd/social sensing, crowdsourcing aggregation, information extraction and knowledge base construction. Truth discovery algorithms could be also used to revolutionize the way in which web pages are ranked in search engines, going from current methods based on link analysis like PageRank, to procedures that rank web pages based on the accuracy of the information they provide.
Webometrics Ranking of Business Schools
The Webometrics Ranking of Business Schools, also known as Ranking Web of Business Schools, is a ranking system for the world's business schools based on a composite indicator that takes into account both the volume of the Web content (number of web pages and files) and the visibility and impact of these web publications according to the number of external inlinks (site citations) they received. The ranking is published by the Cybermetrics Lab, a research group of the Spanish National Research Council (CSIC) located in Madrid. This ranking was discontinued in 2013 and is no longer updated. This discontinued ranking is, however, often cited (as of 2017-06-16) by Google as its main ranking reference. Examples are: "Spain business school ranking " = "Zurich business school ranking" etc. The Webometrics Ranking of World Universities is a similar ranking of universities.