Relational data mining

Relational data mining

Relational data mining is the data mining technique for relational databases. Unlike traditional data mining algorithms, which look for patterns in a single table (propositional patterns), relational data mining algorithms look for patterns among multiple tables (relational patterns). For most types of propositional patterns, there are corresponding relational patterns. For example, there are relational classification rules (relational classification), relational regression tree, and relational association rules. There are several approaches to relational data mining: Inductive Logic Programming (ILP) Statistical Relational Learning (SRL) Graph Mining Propositionalization Multi-view learning == Algorithms == Multi-Relation Association Rules: Multi-Relation Association Rules (MRAR) is a new class of association rules which in contrast to primitive, simple and even multi-relational association rules (that are usually extracted from multi-relational databases), each rule item consists of one entity but several relations. These relations indicate indirect relationship between the entities. Consider the following MRAR where the first item consists of three relations live in, nearby and humid: “Those who live in a place which is near by a city with humid climate type and also are younger than 20 -> their health condition is good”. Such association rules are extractable from RDBMS data or semantic web data. == Software == Safarii: a Data Mining environment for analysing large relational databases based on a multi-relational data mining engine. Dataconda: a software, free for research and teaching purposes, that helps mining relational databases without the use of SQL. == Datasets == Relational dataset repository: a collection of publicly available relational datasets.

Stereo cameras

The stereo cameras approach is a method of distilling a noisy video signal into a coherent data set that a computer can begin to process into actionable symbolic objects, or abstractions. Stereo cameras is one of many approaches used in the broader fields of computer vision and machine vision. == Calculation == In this approach, two cameras with a known physical relationship (i.e. a common field of view the cameras can see, and how far apart their focal points sit in physical space) are correlated via software. By finding mappings of common pixel values, and calculating how far apart these common areas reside in pixel space, a rough depth map can be created. This is very similar to how the human brain uses stereoscopic information from the eyes to gain depth cue information, i.e. how far apart any given object in the scene is from the viewer. The camera attributes must be known, focal length and distance apart etc., and a calibration done. Once this is completed, the systems can be used to sense the distances of objects by triangulation. Finding the same singular physical point in the two left and right images is known as the correspondence problem. Correctly locating the point gives the computer the capability to calculate the distance that the robot or camera is from the object. On the BH2 Lunar Rover the cameras use five steps: a bayer array filter, photometric consistency dense matching algorithm, a Laplace of Gaussian (LoG) edge detection algorithm, a stereo matching algorithm and finally uniqueness constraint. == Uses == This type of stereoscopic image processing technique is used in applications such as 3D reconstruction, robotic control and sensing, crowd dynamics monitoring and off-planet terrestrial rovers; for example, in mobile robot navigation, tracking, gesture recognition, targeting, 3D surface visualization, immersive and interactive gaming. Although the Xbox Kinect sensor is also able to create a depth map of an image, it uses an infrared camera for this purpose, and does not use the dual-camera technique. Other approaches to stereoscopic sensing include time of flight sensors and ultrasound.

Reification (computer science)

In computer science, reification is the process by which an abstract idea about a program is turned into an explicit data model or other object created in a programming language. A computable/addressable object—a resource—is created in a system as a proxy for a non computable/addressable object. By means of reification, something that was previously implicit, unexpressed, and possibly inexpressible is explicitly formulated and made available to conceptual (logical or computational) manipulation. Informally, reification is often referred to as "making something a first-class citizen" within the scope of a particular system. Some aspect of a system can be reified at language design time, which is related to reflection in programming languages. It can be applied as a stepwise refinement at system design time. Reification is one of the most frequently used techniques of conceptual analysis and knowledge representation. == Reflective programming languages == In the context of programming languages, reification is the process by which a user program or any aspect of a programming language that was implicit in the translated program and the run-time system, are expressed in the language itself. This process makes it available to the program, which can inspect all these aspects as ordinary data. In reflective languages, reification data is causally connected to the related reified aspect such that a modification to one of them affects the other. Therefore, the reification data is always a faithful representation of the related reified aspect . Reification data is often said to be made a first class object. Reification, at least partially, has been experienced in many languages to date: in early Lisp dialects and in current Prolog dialects, programs have been treated as data, although the causal connection has often been left to the responsibility of the programmer. In Smalltalk-80, the compiler from the source text to bytecode has been part of the run-time system since the very first implementations of the language. The C programming language reifies the low-level detail of memory addresses.Many programming language designs encapsulate the details of memory allocation in the compiler and the run-time system. In the design of the C programming language, the memory address is reified and is available for direct manipulation by other language constructs. For example, the following code may be used when implementing a memory-mapped device driver. The buffer pointer is a proxy for the memory address 0xB8000000. Functional programming languages based on lambda-calculus reify the concept of a procedure abstraction and procedure application in the form of the Lambda expression. The Scheme programming language reifies continuations (approximately, the call stack). In C#, reification is used to make parametric polymorphism implemented in the form of generics as a first-class feature of the language. In the Java programming language, there exist "reifiable types" that are "completely available at run time" (i.e. their information is not erased during compilation). REBOL reifies code as data and vice versa. Many languages, such as Lisp, JavaScript, and Curl, provide an eval or evaluate procedure that effectively reifies the language interpreter. Smalltalk and Actor languages permit the reification of blocks and messages, which are equivalent of lambda expressions in Lisp, and thisContext in Smalltalk, which is a reification of the current executing block. Homoiconic languages reify the syntax of the language as data that is understood by the language itself. This allows the user to write programs whose inputs and outputs are code (see macros, eval). Common representations of code include S-expressions (e.g. Clojure, Lisp), and abstract syntax trees (e.g. Rust). == Data reification vs. data refinement == Data reification (stepwise refinement) involves finding a more concrete representation of the abstract data types used in a formal specification. Data reification is the terminology of the Vienna Development Method (VDM) that most other people would call data refinement. An example is taking a step towards an implementation by replacing a data representation without a counterpart in the intended implementation language, such as sets, by one that does have a counterpart (such as maps with fixed domains that can be implemented by arrays), or at least one that is closer to having a counterpart, such as sequences. The VDM community prefers the word "reification" over "refinement", as the process has more to do with concretising an idea than with refining it. For similar usages, see Reification (linguistics). == In conceptual modeling == Reification is widely used in conceptual modeling. Reifying a relationship means viewing it as an entity. The purpose of reifying a relationship is to make it explicit, when additional information needs to be added to it. Consider the relationship type IsMemberOf(member:Person, Committee). An instance of IsMemberOf is a relationship that represents the fact that a person is a member of a committee. The figure below shows an example population of IsMemberOf relationship in tabular form. Person P1 is a member of committees C1 and C2. Person P2 is a member of committee C1 only. The same fact, however, could also be viewed as an entity. Viewing a relationship as an entity, one can say that the entity reifies the relationship. This is called reification of a relationship. Like any other entity, it must be an instance of an entity type. In the present example, the entity type has been named Membership. For each instance of IsMemberOf, there is one and only one instance of Membership, and vice versa. Now, it becomes possible to add more information to the original relationship. As an example, we can express the fact that "person p1 was nominated to be the member of committee c1 by person p2". Reified relationship Membership can be used as the source of a new relationship IsNominatedBy(Membership, Person). For related usages see Reification (knowledge representation). == In Unified Modeling Language (UML) == UML provides an association class construct for defining reified relationship types. The association class is a single model element that is both a kind of association and a kind of class. The association and the entity type that reifies are both the same model element. Note that attributes cannot be reified. == On Semantic Web == === RDF and OWL === In Semantic Web languages, such as Resource Description Framework (RDF) and Web Ontology Language (OWL), a statement is a binary relation. It is used to link two individuals or an individual and a value. Applications sometimes need to describe other RDF statements, for instance, to record information like when statements were made, or who made them, which is sometimes called "provenance" information. As an example, we may want to represent properties of a relation, such as our certainty about it, severity or strength of a relation, relevance of a relation, and so on. The example from the conceptual modeling section describes a particular person with URIref person:p1, who is a member of the committee:c1. The RDF triple from that description is Consider to store two further facts: (i) to record who nominated this particular person to this committee (a statement about the membership itself), and (ii) to record who added the fact to the database (a statement about the statement). The first case is a case of classical reification like above in UML: reify the membership and store its attributes and roles etc.: Additionally, RDF provides a built-in vocabulary intended for describing RDF statements. A description of a statement using this vocabulary is called a reification of the statement. The RDF reification vocabulary consists of the type rdf:Statement, and the properties rdf:subject, rdf:predicate, and rdf:object. Using the reification vocabulary, a reification of the statement about the person's membership would be given by assigning the statement a URIref such as committee:membership12345 so that describing statements can be written as follows: These statements say that the resource identified by the URIref committee:membership12345Stat is an RDF statement, that the subject of the statement refers to the resource identified by person:p1, the predicate of the statement refers to the resource identified by committee:isMemberOf, and the object of the statement refers to the resource committee:c1. Assuming that the original statement is actually identified by committee:membership12345, it should be clear by comparing the original statement with the reification that the reification actually does describe it. The conventional use of the RDF reification vocabulary always involves describing a statement using four statements in this pattern. Therefore, they are sometimes referred to as the "reification quad". Using reification according to this convention, we could record the fact that pe

Xinhua–Sogou AI news anchor

Xinhua News Agency and Sogou of China developed an artificial intelligence (AI) for news reporting purposes. The AI was unveiled in 2018. It is touted to be the "world's first AI news anchor". == History == The AI was unveiled at the 2018 World Internet Conference in Wuzhen, Zhejiang, China. The AI devises avatars patterned after real life Xinhua anchors. The AI patterned after Qiu Hao spoke in Chinese, while the one derived from the likeness of Zhang Zhao speaks in English. The unveiling of the AI raised concerns of its impact on employment. Xinhua and Sogou unveiled Xin Xiaomeng, an AI with a female avatar in 2019. People's Daily followed suit by unveiling its own AI newscaster in 2023.

Meta Content Framework

Meta Content Framework (MCF) is a specification of a content format for structuring metadata about web sites and other data. == History == MCF was developed by Ramanathan V. Guha at Apple Computer's Advanced Technology Group between 1995 and 1997. Rooted in knowledge-representation systems such as CycL, KRL, and KIF, it sought to describe objects, their attributes, and the relationships between them. One application of MCF was HotSauce, also developed by Guha while at Apple. It generated a 3D visualization of a web site's table of contents, based on MCF descriptions. By late 1996, a few hundred sites were creating MCF files and Apple HotSauce allowed users to browse these MCF representations in 3D. When the research project was discontinued, Guha left Apple for Netscape, where, in collaboration with Tim Bray, he adapted MCF to use XML and created the first version of the Resource Description Framework (RDF). == MCF format == An MCF file consists of one or more blocks, each corresponding to an entity. A block looks like this:The identifier is a unique identifier for that entity (more on the scope of the identifier below) and is used to refer to that entity. The following lines each specify a property and one or more values, separated by commas. Each value can be a reference to another entity (via its identifier), a string (enclosed by double quotes) or a number. For example:NOTE: The identifier must not include a comma (,) and must not be enclosed within double quotes. A common parsing failure is due to odd number of unescaped double quotes in text. For instance, "foo bar" baz" needs to be "foo bar\" baz". Commas within double quotes are not considered as value separators. Every entity has at least one property: typeOf.

Anthem medical data breach

The Anthem medical data breach was a medical data breach of information held by Elevance Health, known at that time as Anthem Inc. On February 4, 2015, Anthem, Inc. disclosed that criminal hackers had broken into its servers and had potentially stolen over 37.5 million records that contain personally identifiable information from its servers. On February 24, 2015 Anthem raised the number to 78.8 million people whose personal information had been affected. According to Anthem, Inc., the data breach extended into multiple brands Anthem, Inc. uses to market its healthcare plans, including, Anthem Blue Cross, Anthem Blue Cross and Blue Shield, Blue Cross and Blue Shield of Georgia, Empire Blue Cross and Blue Shield, Amerigroup, Caremore, and UniCare. Healthlink says that it was also a victim. Anthem says users' medical information and financial data were not compromised. Anthem has offered free credit monitoring in the wake of the breach. Michael Daniel, chief adviser on cybersecurity for President Barack Obama, said he would be changing his own password. According to The New York Times, about 80 million company records were hacked, and there is a fear that the stolen data will be used for identity theft. The compromised information contained names, birthdays, medical IDs, social security numbers, street addresses, e-mail addresses and employment information, including income data. == Theft of the data == The data was stolen over a period of weeks the month before the data breach was discovered. Because no medical information was compromised, Anthem was not required by law to encrypt the data. However, Anthem faced several civil class-action lawsuits, which were settled in 2017 at a cost of $115 million. Anthem did not admit any wrongdoing in the settlement. Data from the attack is expected to be sold on the black market. == Impact == Persons whose data was stolen could have resulting problems about identity theft for the rest of their lives. Anthem had a US$100 million insurance policy for cyber problems from American International Group. One report suggested that all of this money could be consumed by the process of notifying customers of the breach. == Responses == Anthem hired Mandiant, a cybersecurity firm, to review their security systems and advised people whose data was stolen to monitor their accounts and remain vigilant. The theft of the data raised fears generally about the theft of medical information. A writer from Harvard Law School suggested that this data breach might spark reform of security practices and government data safety regulation. An investigation conducted by several state insurance commissioners blames the breach on an attacker whose identity was withheld, and claims that the breach was likely ordered by a foreign government whose name was withheld. It also concluded that Anthem had taken reasonable measures to protect its data before the breach and that its remediation plan was effective at shutting down the breach once it was discovered. It also marks the starting date of the breach as February 18, 2014. The lead investigator was the Indiana Department of Insurance (DOI) -- Anthem's principal regulator, because Anthem is headquartered in Indiana. The Indiana DOI hired independent auditors to conduct a security assessment at Anthem, which concluded, "While deficiencies within Anthem’s cybersecurity posture were noted by the Examination Team, these deficiencies were not, in our experience, uncommon to companies comparable to Anthem in size and scope. While the pre-breach deficiencies impacted Anthem’s ability to reduce the likelihood of and quickly detect the Data Breach, the controls implemented subsequent to the Data Breach should improve Anthem’s ability to detect future breaches and enable Anthem to respond more effectively to a future attack than was the case in this instance." Federal regulators also conducted an investigation of the Anthem data breach, resulting in a $16 million settlement between Anthem and the Department of Health and Human Services (HHS) -- by far the largest HHS data breach settlement. An HHS Director overseeing the investigation said, "The largest health data breach in U.S. history fully merits the largest HIPAA settlement in history. Unfortunately, Anthem failed to implement appropriate measures for detecting hackers who had gained access to their system to harvest passwords and steal people's private information." The HHS settlement also required Anthem to perform a risk assessment and correct any identified deficiencies in its cybersecurity, with HHS oversight of Anthem's progress. Approximately 100 private class action lawsuits were filed against Anthem over the data breach and consolidated in California federal court, in front of Judge Koh, a respected authority in data breach litigation. After contested briefing over who should lead the litigation efforts, Judge Koh appoints Eve Cervantez of Altshuler Berzon and Andy Friedman of Cohen Milstein as co-lead counsel, and appointed Eric Gibbs of Gibbs Law Group and Michael Sobel of Lieff Cabraser to head a Plaintiffs' Steering Committee. In 2017, Anthem agreed to settle the litigation for $115 million, the largest ever data breach settlement at the time. The attorneys requested $38 million in fees for their work on the case, but Judge Koh slashed the fee request, finding that only $31 million in fees were merited.

Cristóbal Valenzuela

Cristóbal Valenzuela (born 1989) is a Chilean-born technologist, software developer, and CEO of Runway. In 2018, Valenzuela co-founded the AI research company Runway in New York City with Anastasis Germanidis and Alejandro Matamala. == Education == Valenzuela graduated from Adolfo Ibáñez University (AIU), a research private university in Chile. From there, Valenzuela obtained a bachelor's degree in economics and business management, along with a master's degree in arts in design in 2012. In 2018, he graduated with a media arts degree from ITP NYU's Tisch School of the Arts. == Career and recognition == One of Valenzuela's first jobs was as a teaching and research assistant at the Adolfo Ibáñez University School of Design, and later an adjunct professor in the same department. In 2018, he became a researcher at NYU's Tisch School of the Arts ITP program, where he worked with Daniel Shiffman. He contributes to open-source software projects, including ml5.js, an open-source machine learning software. He co-founded Runway with two colleagues from ITP, Anastasis Germanidis, and Alejandro Matamala. The goal of Runway is to create new tools for human imagination using generative AI. In recent years, Valenzuela's work has been sponsored by Google and the Processing Foundation and his projects have been exhibited throughout Latin America and the US, including the Santiago Museum of Contemporary Art, Lollapalooza, NYC Media Lab, New Latin Wave, Inter-American Development Bank, Stanford University and New York University. In September 2023, Valenzuela was named as one of the TIME 100 Most Influential People in AI (TIME100 AI).