AI Face Free

  • Generatrix

    In geometry, a generatrix or describent is a point, curve or surface that, when moved along a given path, generates a new shape. The path directing the motion of the generatrix is called a directrix or dirigent.

    == Examples ==

    A cone can be generated by moving a line (the generatrix) fixed at the future apex of the cone along a closed curve (the directrix); if that directrix is a circle perpendicular to the line connecting its center to the apex, the motion is rotation around a fixed axis and the resulting shape is a circular cone. The generatrix of a cylinder, a limiting case of a cone, is a line that is kept parallel to some axis.
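
    As a worked illustration, the cone and cylinder examples can be written as parametrized surfaces. The symbols used here (apex A, directrix curve c(u), direction d, radius r, height h) are notation introduced for this sketch and are not taken from the text above; u moves along the directrix and v moves along the generatrix.

```latex
% Cone: the generatrix is the line through the fixed apex A and the point c(u)
% of the directrix; v is the position along that line.
S_{\text{cone}}(u, v) = A + v\,\bigl(c(u) - A\bigr)

% Circular cone: apex at the origin, directrix a circle of radius r in the plane z = h.
S(u, v) = v\,(r\cos u,\; r\sin u,\; h), \qquad 0 \le u < 2\pi

% Cylinder: the generatrix keeps a fixed direction d while its base point follows c(u).
S_{\text{cyl}}(u, v) = c(u) + v\,d
```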

  • Structural risk minimization

    Structural risk minimization (SRM) is an inductive principle used in machine learning. Commonly in machine learning, a generalized model must be selected from a finite data set, with the consequent problem of overfitting: the model becomes too strongly tailored to the particularities of the training set and generalizes poorly to new data. The SRM principle addresses this problem by balancing the model's complexity against its success at fitting the training data. This principle was first set out in a 1974 book by Vladimir Vapnik and Alexey Chervonenkis and uses the VC dimension.

    In practical terms, structural risk minimization is implemented by minimizing $E_{train} + \beta H(W)$, where $E_{train}$ is the training error, the function $H(W)$ is called a regularization function, and $\beta$ is a constant. $H(W)$ is chosen such that it takes large values on parameters $W$ that belong to high-capacity subsets of the parameter space. Minimizing $H(W)$ in effect limits the capacity of the accessible subsets of the parameter space, thereby controlling the trade-off between minimizing the training error and minimizing the expected gap between the training error and test error.

    The SRM problem can be formulated in terms of data. Given $n$ data points consisting of data $x$ and labels $y$, the objective $J(\theta)$ is often expressed in the following manner:

    $$J(\theta) = \frac{1}{2n} \sum_{i=1}^{n} (h_{\theta}(x^{i}) - y^{i})^{2} + \frac{\lambda}{2} \sum_{j=1}^{d} \theta_{j}^{2}$$

    The first term is the mean squared error (MSE) term between the value of the learned model, $h_{\theta}$, and the given labels $y$. This term is the training error, $E_{train}$, that was discussed earlier. The second term places a prior over the weights that penalizes larger weights and shrinks them toward zero. The trade-off coefficient, $\lambda$, is a hyperparameter that places more or less importance on the regularization term. Larger $\lambda$ encourages smaller weights at the expense of a higher MSE, and smaller $\lambda$ relaxes regularization, allowing the model to fit the data more closely. Note that as $\lambda \to \infty$ the weights approach zero, and as $\lambda \to 0$ the model typically suffers from overfitting.
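
    The effect of the trade-off coefficient can be sketched numerically. The example below is a minimal illustration, assuming a one-parameter linear model h(x) = theta * x with no intercept; the data values and the function names `ridge_objective` and `ridge_minimizer` are made up for this sketch. Setting the derivative of the objective to zero gives the closed form theta = sum(x*y) / (sum(x*x) + n*lambda).

```python
def ridge_objective(theta, xs, ys, lam):
    """J(theta) = 1/(2n) * sum((theta*x - y)^2) + lam/2 * theta^2 (1-D linear model)."""
    n = len(xs)
    mse_term = sum((theta * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)
    penalty = lam / 2 * theta ** 2
    return mse_term + penalty


def ridge_minimizer(xs, ys, lam):
    """Closed-form minimizer of ridge_objective, from dJ/dtheta = 0."""
    n = len(xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + n * lam)


# Hypothetical training data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

for lam in (0.0, 1.0, 100.0):
    theta = ridge_minimizer(xs, ys, lam)
    print(f"lambda={lam:6.1f}  theta={theta:.4f}  J={ridge_objective(theta, xs, ys, lam):.4f}")
```

    With these inputs the fitted weight shrinks toward zero as lambda grows, which is the behaviour described above for large values of the trade-off coefficient.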

  • Eager learning

    In artificial intelligence, eager learning is a learning method in which the system tries to construct a general, input-independent target function during training of the system, as opposed to lazy learning, where generalization beyond the training data is delayed until a query is made to the system. The main advantage gained in employing an eager learning method, such as an artificial neural network, is that the target function will be approximated globally during training, thus requiring much less space than using a lazy learning system. Eager learning systems also deal much better with noise in the training data. Eager learning is an example of offline learning, in which post-training queries to the system have no effect on the system itself, and thus the same query to the system will always produce the same result. The main disadvantage with eager learning is that it is generally unable to provide good local approximations in the target function.
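
    The contrast with lazy learning can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the eager learner is a least-squares line (chosen here instead of a neural network to keep the example short), the lazy learner is a 1-nearest-neighbour lookup, and the class names and data are hypothetical.

```python
class EagerLinearModel:
    """Eager: fit one global line y = a*x + b during training, then drop the data."""

    def __init__(self, xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        self.a = sxy / sxx
        self.b = mean_y - self.a * mean_x  # only two numbers are stored

    def predict(self, x):
        return self.a * x + self.b


class LazyNearestNeighbor:
    """Lazy: store every training pair and generalize only when queried."""

    def __init__(self, xs, ys):
        self.data = list(zip(xs, ys))

    def predict(self, x):
        nearest = min(self.data, key=lambda pair: abs(pair[0] - x))
        return nearest[1]


# Hypothetical data drawn from y = x^2, which a single line cannot fit locally.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.0, 1.0, 4.0, 9.0]

eager = EagerLinearModel(xs, ys)
lazy = LazyNearestNeighbor(xs, ys)
print("eager prediction at 2.9:", eager.predict(2.9))
print("lazy prediction at 2.9:", lazy.predict(2.9))
```

    The eager model keeps two parameters regardless of how much data it saw, while the lazy model keeps the whole training set but follows the local shape of the data more closely.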

  • Wadhwani Institute for Artificial Intelligence

    Wadhwani AI, based in Mumbai, Maharashtra, is an independent, non-profit institute. Founded in 2018, it is dedicated to developing artificial intelligence solutions for social good. Its mission is to build AI-based innovations and solutions for underserved communities in developing countries, for a wide range of domains including agriculture, education, financial inclusion, healthcare, and infrastructure.

    == History and funding ==

    The institute was founded with a $30 million philanthropic effort by the Wadhwani brothers, Romesh Wadhwani and Sunil Wadhwani. The institute was inaugurated and dedicated to the nation by Narendra Modi, the 14th Prime Minister of India. In 2019, the institute received a $2 million grant from Google.org to create technologies to help reduce crop losses in cotton farming through integrated pest management. The United States Agency for International Development awarded $2 million to the institute in 2020 to develop tools, using mathematical modeling techniques and digital technologies such as artificial intelligence and machine learning, to forecast COVID-19 disease patterns, estimate resources needed, and plan interventions.

    == Collaboration ==

    With assistance from Google, the Ministry of Agriculture and Farmers' Welfare and Wadhwani AI developed Krishi 24/7, the first AI-powered automated agricultural news monitoring and analysis tool. Krishi 24/7 is intended to support better decision-making by identifying valuable news, providing timely notifications, and enabling a quick response to safeguard farmers' interests and advance sustainable agricultural growth. The application scans news articles in several languages and translates them into English. It extracts key information from news items, including the headline, crop name, event type, date, location, severity, summary, and source link, so that the ministry is informed in a timely manner about pertinent events published online. The National Center for Disease Control has implemented a comparable automated surveillance and analysis tool for disease outbreaks.

  • GeneXus

    GeneXus is a low-code, cross-platform, knowledge representation-based development tool, mainly oriented towards enterprise-class applications for the web, smart devices, and the Microsoft Windows platform. GeneXus uses a mostly declarative language to generate native code for multiple environments. It includes a normalization module, which creates and maintains an optimal database structure based on user views. The languages for which code can be generated include COBOL, Java, Objective-C, RPG, Ruby, Visual Basic, and Visual FoxPro. Some of the DBMSs supported are Microsoft SQL Server, Oracle, IBM Db2, Informix, PostgreSQL, and MySQL. GeneXus was developed by the Uruguayan company ARTech Consultores SRL, which was later renamed Genexus SA. The latest version is GeneXus 18, which was released on November 10, 2022.

  • Hierarchical navigable small world

    Hierarchical navigable small world (HNSW) is an algorithm for approximate nearest neighbor search. It is used to find items that are similar to a query item in a large collection, without comparing the query with every item one by one.

    The algorithm is commonly used for searching vector data. In these systems, an item such as a document, image, song, or user profile is represented by a list of numbers called a vector. Items with similar vectors are treated as similar according to the model that produced the vectors. HNSW provides a way to search these vectors quickly, especially in large datasets.

    HNSW stores vectors in a graph. Each vector is a node, and links connect it to some nearby vectors. The graph has several layers: upper layers contain fewer nodes and act like a rough map, while the bottom layer contains all nodes and gives a more detailed view. A search starts in an upper layer, follows links toward nodes that are closer to the query, and then repeats the process in lower layers until it finds a set of likely nearest neighbors.

    == Background ==

    The nearest neighbor search problem asks which items in a dataset are closest to a query item. A direct search can compare the query with every item in the dataset, but this becomes slow when the dataset is large. Exact search methods based on spatial trees, such as the k-d tree and R-tree, can also become less effective for high-dimensional data, a problem often associated with the curse of dimensionality.

    Approximate nearest neighbor methods trade some exactness for speed or lower resource use. Instead of always guaranteeing the exact closest item, they try to return close items quickly. Other approximate methods include locality-sensitive hashing and product quantization.

    HNSW builds on research into small-world networks and navigable graphs. In a small-world graph, most nodes can be reached from other nodes through a short chain of links. In a navigable graph, a search procedure can use local information to move toward a target. Jon Kleinberg's work on navigation in small-world networks is an important example of this research area. Later work studied ways to add links that make graphs easier to navigate greedily.

    The HNSW algorithm extends earlier navigable small world methods for similarity search by adding a hierarchy of graph layers. This hierarchy helps the algorithm find a good region of the graph before doing a more detailed search in the bottom layer.

    == Algorithm ==

    HNSW is based on a proximity graph. In this graph, nearby vectors are connected by edges. The algorithm uses these edges to move through the dataset, rather than scanning every vector.

    The graph is hierarchical. Every vector appears in the bottom layer. Some vectors are also placed in higher layers, with fewer vectors appearing as the layers go upward. The upper layers allow long-range movement across the dataset, while the lower layers allow a more detailed search near promising candidates.

    A typical search proceeds as follows:

      • The search begins from an entry point in the highest layer.
      • At each step, the algorithm looks at neighboring nodes and moves to a neighbor that is closer to the query.
      • When it cannot find a closer neighbor in that layer, it moves down to the next layer.
      • In the bottom layer, it explores a wider set of candidate nodes and returns the nearest candidates found.

    This search strategy is often described as greedy navigation. The algorithm repeatedly chooses locally better nodes, using the graph structure to approach the query point.

    == Construction and parameters ==

    The HNSW graph is built incrementally. When a new vector is inserted, the algorithm assigns it a maximum layer, searches for nearby existing nodes, and connects the new node to selected neighbors in each layer where it appears.

    Implementations usually expose parameters that control the trade-off between speed, accuracy, memory use, and construction time. A higher number of graph connections can improve recall but requires more memory. A larger search candidate list can improve accuracy but makes queries slower. A larger construction candidate list can improve the quality of the graph but makes index building slower.

    Because HNSW is approximate, its results are not always identical to a full exact search. Its practical performance depends on the dataset, distance measure, implementation, and parameter settings. Benchmarking studies have found HNSW-based libraries to be strong performers among approximate nearest neighbor methods, although worst-case performance can differ from performance on common benchmark datasets.

    == Use in vector search systems ==

    HNSW is used as an index in systems that store and search high-dimensional vectors. These systems include vector databases, search engines, and database extensions. Typical uses include semantic search, recommender systems, image similarity search, and retrieval-augmented generation.

    Several software projects implement or support HNSW. Libraries include hnswlib, which is associated with the original HNSW authors, and FAISS. Database and search systems that document HNSW support include Apache Lucene, Chroma, ClickHouse, DuckDB, MariaDB, Milvus, pgvector, Qdrant, and Redis.
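
    The layered greedy search described in the Algorithm section can be sketched on a tiny hand-built graph. This is a simplified illustration, not the API of hnswlib or any other library: the layer dictionaries, the function names, and the parameter names `ef` (size of the search candidate list) and `k` are assumptions for this example, and graph construction is skipped entirely.

```python
import heapq
import math


def greedy_step(layer, points, query, start):
    """Upper layers: keep moving to a closer neighbor until none improves."""
    current = start
    improved = True
    while improved:
        improved = False
        for nb in layer.get(current, []):
            if math.dist(points[nb], query) < math.dist(points[current], query):
                current = nb
                improved = True
    return current


def search_bottom(layer, points, query, entry, ef, k):
    """Bottom layer: best-first search keeping at most ef candidates."""
    d0 = math.dist(points[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap of nodes still to expand
    best = [(-d0, entry)]        # max-heap (negated distances) of results
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -best[0][0]:      # closest open node is worse than worst result
            break
        for nb in layer.get(node, []):
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(points[nb], query)
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(candidates, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)
    ordered = sorted((-neg_d, n) for neg_d, n in best)
    return [n for _, n in ordered][:k]


def layered_search(layers, points, query, entry, ef=4, k=2):
    """layers[0] is the sparsest top layer, layers[-1] contains every node."""
    current = entry
    for layer in layers[:-1]:
        current = greedy_step(layer, points, query, current)
    return search_bottom(layers[-1], points, query, current, ef, k)


# Hypothetical data: eight points on a line, with three hand-made layers.
points = {i: (float(i), 0.0) for i in range(8)}
layers = [
    {0: [4], 4: [0]},
    {0: [2], 2: [0, 4], 4: [2, 6], 6: [4]},
    {i: [j for j in (i - 1, i + 1) if 0 <= j <= 7] for i in range(8)},
]
query = (5.2, 0.0)
result = layered_search(layers, points, query, entry=0)
print(result)
```

    Raising `ef` widens the bottom-layer exploration, which matches the accuracy versus query time trade-off mentioned under Construction and parameters.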

  • Astrostatistics

    Astrostatistics is a discipline which spans astrophysics, statistical analysis and data mining. It is used to process the vast amount of data produced by automated scanning of the cosmos, to characterize complex datasets, and to link astronomical data to astrophysical theory. Many branches of statistics are involved in astronomical analysis including nonparametrics, multivariate regression and multivariate classification, time series analysis, and especially Bayesian inference. The field is closely related to astroinformatics.

  • Zeuthen strategy

    The Zeuthen strategy in cognitive science is a negotiation strategy used by some artificial agents. Its purpose is to measure the willingness to risk conflict. An agent will be more willing to risk conflict if it does not have much to lose if the negotiation fails. In contrast, an agent is less willing to risk conflict when it has more to lose. The value of a deal is expressed in its utility. An agent has much to lose when the difference between the utility of its current proposal and the conflict deal is high.

    When both agents use the monotonic concession protocol, the Zeuthen strategy leads them to agree upon a deal in the negotiation set. This set consists of all conflict-free deals, which are individually rational and Pareto optimal, and the conflict deal, which maximizes the Nash product.

    The strategy was introduced in 1930 by the Danish economist Frederik Zeuthen.

    == Three key questions ==

    The Zeuthen strategy answers three open questions that arise when using the monotonic concession protocol, namely:

      • Which deal should be proposed at first?
      • On any given round, who should concede?
      • In case of a concession, how much should the agent concede?

    The answer to the first question is that any agent should start with its most preferred deal, because that deal has the highest utility for that agent. The second answer is that the agent with the smallest value of Risk(i,t) concedes, because the agent with the lowest utility for the conflict deal profits most from avoiding conflict. To the third question, the Zeuthen strategy suggests that the conceding agent should concede just enough to raise its value of Risk(i,t) just above that of the other agent. This prevents the conceding agent from having to concede again in the next round.

    == Risk ==

    $$\text{Risk}(i,t) = \begin{cases} 1 & U_{i}(\delta(i,t)) = 0 \\ \dfrac{U_{i}(\delta(i,t)) - U_{i}(\delta(j,t))}{U_{i}(\delta(i,t))} & \text{otherwise} \end{cases}$$

    Risk(i,t) is a measurement of agent i's willingness to risk conflict. The risk function formalizes the notion that an agent's willingness to risk conflict is the ratio of the utility that agent would lose by accepting the other agent's proposal to the utility that agent would lose by causing a conflict. Agent i is said to be using a rational negotiation strategy if at any step t + 1 that agent i sticks to his last proposal, Risk(i,t) > Risk(j,t).

    == Sufficient concession ==

    If agent i makes a sufficient concession in the next step, then, assuming that agent j is using a rational negotiation strategy, if agent j does not concede in the next step, he must do so in the step after that. The set of all sufficient concessions of agent i at step t is denoted SC(i, t).

    == Minimal sufficient concession ==

    $$\delta' = \arg\max_{\delta \in SC(A,t)} \{U_{A}(\delta)\}$$

    is the minimal sufficient concession of agent A in step t.

    Agent A begins the negotiation by proposing

    $$\delta(A,0) = \arg\max_{\delta \in NS} U_{A}(\delta)$$

    and will make the minimal sufficient concession in step t + 1 if and only if Risk(A,t) ≤ Risk(B,t).

    Theorem

    If both agents are using Zeuthen strategies, then they will agree on

    $$\delta = \arg\max_{\delta' \in NS} \{\pi(\delta')\},$$

    that is, the deal which maximizes the Nash product.

    Proof

    Let $\delta_{A} = \delta(A,t)$ and $\delta_{B} = \delta(B,t)$. According to the Zeuthen strategy, agent A will concede at step $t$ if and only if

    $$\text{Risk}(A,t) \leq \text{Risk}(B,t).$$

    That is, if and only if

    $$\begin{aligned}
    \frac{U_{A}(\delta_{A}) - U_{A}(\delta_{B})}{U_{A}(\delta_{A})} &\leq \frac{U_{B}(\delta_{B}) - U_{B}(\delta_{A})}{U_{B}(\delta_{B})} \\
    U_{B}(\delta_{B})\,(U_{A}(\delta_{A}) - U_{A}(\delta_{B})) &\leq U_{A}(\delta_{A})\,(U_{B}(\delta_{B}) - U_{B}(\delta_{A})) \\
    U_{A}(\delta_{A})U_{B}(\delta_{B}) - U_{A}(\delta_{B})U_{B}(\delta_{B}) &\leq U_{A}(\delta_{A})U_{B}(\delta_{B}) - U_{A}(\delta_{A})U_{B}(\delta_{A}) \\
    -U_{A}(\delta_{B})U_{B}(\delta_{B}) &\leq -U_{A}(\delta_{A})U_{B}(\delta_{A}) \\
    U_{A}(\delta_{A})U_{B}(\delta_{A}) &\leq U_{A}(\delta_{B})U_{B}(\delta_{B}) \\
    \pi(\delta_{A}) &\leq \pi(\delta_{B})
    \end{aligned}$$

    Thus, agent A will concede if and only if $\delta_{A}$ does not yield the larger product of utilities. Therefore, the Zeuthen strategy guarantees a final agreement that maximizes the Nash product.
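
    The equivalence used in the proof can be checked numerically. The sketch below assumes a hypothetical negotiation over how to split 10 units, where a deal is the number of units agent A receives and each agent's utility is simply its own share; the names `u_a`, `u_b`, `a_concedes` and `nash_product` are introduced here for illustration. Exact fractions are used so that ties in risk are compared without rounding error.

```python
from fractions import Fraction

TOTAL = 10  # hypothetical amount being divided between agents A and B


def u_a(deal):
    """Utility of agent A: the share it receives."""
    return Fraction(deal)


def u_b(deal):
    """Utility of agent B: whatever A does not receive."""
    return Fraction(TOTAL - deal)


def risk(utility, own_offer, other_offer):
    """Risk(i,t): relative loss from accepting the other agent's offer."""
    own = utility(own_offer)
    if own == 0:
        return Fraction(1)
    return (own - utility(other_offer)) / own


def a_concedes(offer_a, offer_b):
    """Agent A concedes when Risk(A,t) <= Risk(B,t)."""
    return risk(u_a, offer_a, offer_b) <= risk(u_b, offer_b, offer_a)


def nash_product(deal):
    return u_a(deal) * u_b(deal)


# A currently proposes 8 units for itself, B proposes that A gets 3.
print("Risk(A):", risk(u_a, 8, 3), "Risk(B):", risk(u_b, 3, 8))
print("A concedes:", a_concedes(8, 3))
print("Nash products:", nash_product(8), nash_product(3))
```

    In this setup agent A concedes exactly when its own proposal has the smaller (or equal) Nash product, which is the last line of the derivation above; the check assumes both agents assign positive utility to their own proposals.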

  • Operational image

    An operational image, also known as operative image, is an image that serves a functional, rather than aesthetic, purpose. Operational images are not intended to be viewed by people as representations of the real world; they are created to be used as instruments in performing some task or operation, often by machine automation. Operational images are used in a wide variety of applications, such as weapons targeting and guidance systems, and assisting surgeons performing robot-assisted surgery. The term "operational image" was first coined in 2000 by German filmmaker Harun Farocki in the first part of his three-part audiovisual installation, Eye/Machine. Farocki's installation included operational images used by militaries, such as weapons guidance and targeting systems. Eye/Machine featured images shown to the public by the United States military from the cameras used by laser-guided missiles in the Gulf War. Farocki defined operational images as "Images without a social goal, not for edification, not for reflection," and that they "do not represent an object, but rather are part of an operation." According to Volker Pantenburg, operational images are more accurately characterized as "visualizations of data". He describes operational images as a "working image" or an image that "performs work". Operational images are ubiquitous in modern society, used for a variety of military and non-military applications, such as inspecting sewer piping, and assisting surgeons performing robotic surgery.

  • The AI Con

    The AI Con: How to Fight Big Tech's Hype and Create the Future We Want is a 2025 non-fiction book by linguist Emily M. Bender and sociologist Alex Hanna. It argues that the label "artificial intelligence" is misleading for much of what it is applied to, obscuring ordinary automation while concentrating power in a small number of technology firms. The book was published in May 2025 by Harper in the United States and Bodley Head in the United Kingdom. It was developed alongside the authors' long-running podcast Mystery AI Hype Theater 3000, which critiques exaggerated claims about AI.

    == Synopsis ==

    The authors present AI as a marketing umbrella that encourages audiences to infer understanding and agency where none exist. They argue that readers should treat such language skeptically and separate specific automated tasks from broad claims of intelligence. The book describes a recurring hype cycle in which corporate narratives justify data and labor extraction, the replacement of human services with cheaper substitutes, and the diversion of attention from present harms to speculative futures. While acknowledging limited uses such as pattern recognition, the authors argue that contemporary systems are best understood as text and media generators shaped by training data and human labor, not as thinking or reasoning entities.

    A central theme is the social and environmental cost of scaling these systems, including increased energy and water use, the appropriation of creative work for training, and the outsourcing of ghost work to low-paid data workers worldwide. These costs are linked to workplace effects, with the authors arguing that automation rarely eliminates jobs outright and more often degrades them through surveillance, work intensification, and unpaid oversight. As alternatives to passive adoption, the authors propose concrete responses: asking precise questions about what is being automated and why, demanding transparency about data and evaluation, and practicing what they call strategic refusal when deployment conflicts with evidence or values.

    The book also develops a vocabulary for public debate, rejecting both boosterish and doomerish narratives as grounded in the same assumption that AI is a singular, autonomous force. The authors recommend reading strategies such as favoring trusted human sources over automated summaries and using humor to deflate inflated claims. They describe a link from language to policy and power, arguing that precise terminology can help policymakers and the public resist austerity-driven automation and demand accountability for errors and harms.

    == Reception ==

    The Guardian praised the book's myth-busting approach and its analysis of how hype erodes cultural and civic life by normalizing synthetic media as a substitute for human judgment. Kirkus Reviews described it as a contrarian account that catalogs concrete risks while cutting through speculative predictions. An interview in Business Insider highlighted the authors' accessible frameworks, including their proposal to describe chatbots as conversation simulators and to evaluate systems in terms of values, labor, and evidence. Coverage in GeekWire emphasized the book's call for resistance through collective bargaining, stronger data rights, and a norm of rejecting deployments that fail basic standards of necessity and evaluation.

    Some reviews were more critical. A review in LLRX argued that the book's tone could be overly polemical and that it gave limited attention to potential benefits claimed for generative systems. Coverage in the Financial Times, focused on Bender's broader public scholarship, situated the book within her long-standing critique of anthropomorphic narratives about large language models and her advocacy for more democratic oversight of automated systems.

  • Artificial intelligence of things

    Artificial Intelligence of Things (AIoT) is the combination of artificial intelligence (AI) technologies with the Internet of things (IoT) infrastructure to create systems capable of sensing, learning, and acting on data without continuous human intervention. While IoT focuses on connectivity and sensor data collection, AI enables IoT devices to analyse data in real time and produce actionable outputs, including automated decisions at the edge.

    == Applications ==

    === Manufacturing and predictive maintenance ===

    Manufacturing accounts for the largest share of AIoT adoption by industry vertical. A common application is predictive maintenance, where sensors measuring vibration, temperature, current draw, and acoustic emissions feed machine learning models trained to detect signatures that precede equipment failure. These systems can flag developing faults weeks or months in advance, and in more advanced deployments can autonomously adjust machine parameters such as motor speed or cooling cycles to delay or prevent failure.

    === Other industries ===

    In healthcare, AIoT enables remote patient monitoring through wearable devices that collect vital signs and apply AI models to detect anomalies or predict deterioration. In logistics, GPS and telematics sensors combined with AI models support real-time route optimisation, vehicle maintenance prediction, and fuel cost forecasting. Smart building systems use occupancy, temperature, and energy sensors with AI to dynamically adjust HVAC and lighting, reducing energy consumption.

    == Architecture ==

    AIoT systems typically operate across three layers: a device layer of sensors and actuators that collect data, a connectivity layer that transmits data via protocols such as MQTT or HTTP, and a compute layer where AI models process the data either in the cloud or at the edge.
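
    The compute layer at the edge can be sketched with a deliberately simple detector. The example below is an illustration only: it swaps the trained machine learning model mentioned above for a rolling z-score rule, the class name `VibrationMonitor`, the window size and the threshold are assumptions, and the readings are made up. The device and connectivity layers (sensor drivers, MQTT or HTTP transport) are left out.

```python
from collections import deque
from statistics import mean, stdev


class VibrationMonitor:
    """Flags readings that deviate strongly from the recent window (rolling z-score)."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.readings = deque(maxlen=window)  # recent normal readings only
        self.threshold = threshold
        self.min_samples = min_samples

    def update(self, value):
        """Return True if value looks anomalous compared with recent readings."""
        is_anomaly = False
        if len(self.readings) >= self.min_samples:
            mu = mean(self.readings)
            sigma = stdev(self.readings)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                is_anomaly = True
        if not is_anomaly:
            # Keep anomalies out of the baseline so they do not mask later faults.
            self.readings.append(value)
        return is_anomaly


# Hypothetical vibration amplitudes: a steady machine, then a sudden spike.
monitor = VibrationMonitor()
stream = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98, 5.0]
flags = [monitor.update(v) for v in stream]
print(flags)
```

    Because the rule needs only a short window of recent values, it can run on a low-cost processor next to the sensor and forward just the flagged events to the cloud.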
    The trend toward edge-based processing, where inference runs on low-cost processors near the data source rather than in a centralised cloud, has accelerated as hardware costs have fallen and applications increasingly require sub-second response times.

    == Market ==

    Market sizing estimates for AIoT vary significantly depending on scope and definition. Fortune Business Insights valued the AIoT market at USD 35.65 billion in 2023, projecting growth to USD 253.86 billion by 2030 at a compound annual growth rate (CAGR) of 32.4%. Grand View Research estimated the broader market at USD 171.4 billion in 2024 with a CAGR of 31.7% through 2030, reflecting a wider definition that includes AI-integrated hardware components. North America accounted for approximately 40% of global market share in 2024, with the Asia-Pacific region projected as the fastest-growing market.

  • Data preprocessing

    Data preprocessing can refer to manipulation, filtration or augmentation of data before it is analyzed, and is often an important step in the data mining process. Data collection methods are often loosely controlled, resulting in out-of-range values, impossible data combinations, and missing values, amongst other issues.

    Preprocessing is the process by which unstructured data is transformed into intelligible representations suitable for machine-learning models. This phase deals with noise and missing values in the original data set in order to arrive at better results.

    The preprocessing pipeline used can often have large effects on the conclusions drawn from the downstream analysis. Thus, the representation and quality of the data must be checked before running any analysis. If there is a high proportion of irrelevant and redundant information present, or noisy and unreliable data, then knowledge discovery during the training phase may be more difficult. Data preparation and filtering steps can take a considerable amount of processing time. Examples of methods used in data preprocessing include cleaning, instance selection, normalization, one-hot encoding, data transformation, feature extraction and feature selection.

    == Applications ==

    === Data mining ===

    Data cleaning removes unwanted data, so that the dataset contains more valuable information for data manipulation later in the data mining process. Editing such a dataset to correct data corruption or human error is a crucial step in getting accurate quantifiers such as the true positives, true negatives, false positives and false negatives found in a confusion matrix, which are commonly used for medical diagnosis.
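
    A few of the methods named above (cleaning out-of-range and missing values, normalization, one-hot encoding) can be sketched in plain Python. This is a minimal illustration with made-up data and hypothetical function names; mean imputation and min-max scaling are just one common choice for each step.

```python
def impute_mean(values, low, high):
    """Treat None and out-of-range entries as missing; fill them with the mean of valid ones."""
    def is_valid(v):
        return v is not None and low <= v <= high

    valid = [v for v in values if is_valid(v)]
    fill = sum(valid) / len(valid)
    return [v if is_valid(v) else fill for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted set of categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


# Hypothetical raw records: one age is missing and one is impossible.
ages = [25, None, 40, -3, 35]
colors = ["red", "blue", "red"]

cleaned = impute_mean(ages, low=0, high=120)
scaled = min_max_normalize(cleaned)
encoded = one_hot(colors)
print(cleaned)
print(scaled)
print(encoded)
```

    The order matters in this sketch: normalizing before cleaning would let the impossible value -3 define the lower end of the scale.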
Users are able to join data files together and use preprocessing to filter any unnecessary noise from the data which can allow for higher accuracy. Users use Python programming scripts accompanied by the pandas library which gives them the ability to import data from a comma-separated values as a data-frame. The data-frame is then used to manipulate data that can be challenging otherwise to do in Excel. Pandas (software) which is a powerful tool that allows for data analysis and manipulation; which makes data visualizations, statistical operations and much more, a lot easier. Many also use the R programming language to do such tasks as well. The reason why a user transforms existing files into a new one is because of many reasons. Aspects of data preprocessing may include imputing missing values, aggregating numerical quantities and transforming continuous data into categories (data binning). More advanced techniques like principal component analysis and feature selection are working with statistical formulas and are applied to complex datasets which are recorded by GPS trackers and motion capture devices. === Semantic data preprocessing === Semantic data mining is a subset of data mining that specifically seeks to incorporate domain knowledge, such as formal semantics, into the data mining process. Domain knowledge is the knowledge of the environment the data was processed in. Domain knowledge can have a positive influence on many aspects of data mining, such as filtering out redundant or inconsistent data during the preprocessing phase. Domain knowledge also works as constraint. It does this by using working as set of prior knowledge to reduce the space required for searching and acting as a guide to the data. Simply put, semantic preprocessing seeks to filter data using the original environment of said data more correctly and efficiently. 
There are increasingly complex problems which are asking to be solved by more elaborate techniques to better analyze existing information. Instead of creating a simple script for aggregating different numerical values into a single value, it make sense to focus on semantic based data preprocessing. The idea is to build a dedicated ontology, which explains on a higher level what the problem is about. In regards to semantic data mining and semantic pre-processing, ontologies are a way to conceptualize and formally define semantic knowledge and data. The Protégé (software) is the standard tool for constructing an ontology. In general, the use of ontologies bridges the gaps between data, applications, algorithms, and results that occur from semantic mismatches. As a result, semantic data mining combined with ontology has many applications where semantic ambiguity can impact the usefulness and efficiency of data systems. Applications include the medical field, language processing, banking, and even tutoring, among many more. There are various strengths to using a semantic data mining and ontological based approach. As previously mentioned, these tools can help during the per-processing phase by filtering out non-desirable data from the data set. Additionally, well-structured formal semantics integrated into well designed ontologies can return powerful data that can be easily read and processed by machines. A specifically useful example of this exists in the medical use of semantic data processing. As an example, a patient is having a medical emergency and is being rushed to hospital. The emergency responders are trying to figure out the best medicine to administer to help the patient. Under normal data processing, scouring all the patient’s medical data to ensure they are getting the best treatment could take too long and risk the patients’ health or even life. However, using semantically processed ontologies, the first responders could save the patient’s life. 
Tools like a semantic reasoner can use an ontology to infer the best medicine to administer to the patient based on their medical history, such as whether they have a certain cancer or other conditions, simply by examining the natural language used in the patient's medical records. This would allow the first responders to quickly and efficiently search for medicine without having to worry about the patient’s medical history themselves, as the semantic reasoner would already have analyzed this data and found solutions. In general, this illustrates the strength of using semantic data mining and ontologies. They allow for quicker and more efficient data extraction on the user side, as the user has fewer variables to account for, since the semantically preprocessed data and the ontology built for the data have already accounted for many of these variables. However, there are some drawbacks to this approach. Namely, it requires a large amount of computational power and complexity, even with relatively small data sets. This could result in higher costs and increased difficulties in building and maintaining semantic data processing systems. This can be mitigated somewhat if the data set is already well organized and formatted, but even then, the complexity is still higher than with standard data processing. A simple diagram combining some of these processes, in particular semantic data mining and its use in ontology, depicts a data set being broken up into two parts: the characteristics of its domain, or domain knowledge, and the actual acquired data. The domain characteristics are then processed to become user-understood domain knowledge that can be applied to the data. Meanwhile, the data set is processed and stored so that the domain knowledge can be applied to it, so that the process may continue. This application forms the ontology. From there, the ontology can be used to analyze data and process results.
Fuzzy preprocessing is another, more advanced technique for solving complex problems. Fuzzy preprocessing and fuzzy data mining make use of fuzzy sets. A fuzzy set is composed of two elements: a set and a membership function for the set that takes values between 0 and 1. Fuzzy preprocessing uses fuzzy sets to ground numerical values with linguistic information. Raw data is then transformed into natural language. Ultimately, the goal of fuzzy data mining is to help deal with inexact information, such as an incomplete database. Currently, fuzzy preprocessing and other fuzzy-based data mining techniques see frequent use with neural networks and artificial intelligence.
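To make the idea of grounding numerical values with linguistic information more concrete, here is a small sketch of fuzzification. It assumes triangular membership functions and three invented temperature terms (`cold`, `warm`, `hot`); neither the shape of the functions nor the term boundaries come from the text above, and the helper names are hypothetical.

```python
def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership function; the result always lies in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical linguistic terms for a temperature in degrees Celsius,
# each given as (left, peak, right) of its triangular membership function.
TERMS = {
    "cold": (-10.0, 0.0, 15.0),
    "warm": (10.0, 20.0, 30.0),
    "hot": (25.0, 35.0, 50.0),
}


def fuzzify(x: float) -> dict:
    """Return the degree of membership of x in every linguistic term."""
    return {name: triangular(x, *params) for name, params in TERMS.items()}


def label(x: float) -> str:
    """Pick the linguistic term with the highest membership degree."""
    memberships = fuzzify(x)
    return max(memberships, key=memberships.get)


if __name__ == "__main__":
    for reading in (5.0, 20.0, 28.0):
        print(reading, fuzzify(reading), label(reading))
```

A reading can belong to several terms at once with different degrees, which is what lets this kind of preprocessing tolerate inexact values.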

  • Auralization

    Auralization is a procedure designed to model and simulate the experience of acoustic phenomena rendered as a soundfield in a virtualized space. This is useful in configuring the soundscape of architectural structures, concert venues, and public spaces, as well as in making coherent sound environments within virtual immersion systems. == History == The English term auralization was used for the first time by Kleiner et al. in an article in the Journal of the AES in 1991. The increase of computational power allowed the development of the first acoustic simulation software towards the end of the 1960s. == Principles == Auralizations are experienced through systems that render virtual acoustic models. These are made by convolving or mixing acoustic events recorded 'dry' (or in an anechoic chamber) with a virtual model of an acoustic space, the characteristics of which are determined by sampling its impulse response (IR), $h(t)$. Once $h(t)$ has been determined, the simulation of the soundfield in the target environment is obtained by convolving the impulse response with the source signal $s(t)$: $r(t) = h(t) * s(t)$. The resulting sound $r(t)$ is heard as it would be if emitted in that acoustic space. == Binaurality == For auralizations to be perceived as realistic, it is critical to emulate human hearing in terms of the position and orientation of the listener's head with respect to the sources of sound. For IR data to be convolved convincingly, the acoustic events are captured using a dummy head with two microphones, one positioned on each side of the head, to record an emulation of sound arriving at the locations of human ears, or using an ambisonics microphone array and mixed down for binaurality.
Head-related transfer function (HRTF) datasets can be used to simplify the process insofar as a monaural IR can be measured or simulated, then audio content is convolved with its target acoustic space. In rendering the experience, the transfer function corresponding to the orientation of the head is applied to simulate the corresponding spatial emanation of sound.
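The convolution described under Principles can be illustrated for sampled signals, where the integral becomes the sum r[n] = Σ_k h[k] s[n − k]. The sketch below is a plain-Python illustration under that discrete-time assumption; the three-sample impulse response and dry signal are made-up values, and a real auralization system would use measured or simulated impulse responses and a faster (typically FFT-based) convolution.

```python
def convolve(h, s):
    """Discrete convolution: r[n] = sum over k of h[k] * s[n - k]."""
    r = [0.0] * (len(h) + len(s) - 1)
    for k, h_k in enumerate(h):
        for m, s_m in enumerate(s):
            # The product h[k] * s[m] contributes to output sample k + m.
            r[k + m] += h_k * s_m
    return r


# Hypothetical impulse response: the direct sound followed, two samples
# later, by a single echo at half amplitude.
impulse_response = [1.0, 0.0, 0.5]

# Hypothetical 'dry' recording.
dry_signal = [1.0, 2.0, 3.0]

if __name__ == "__main__":
    print(convolve(impulse_response, dry_signal))
```

Each output sample mixes the dry signal with its delayed, attenuated copy, which is how the room's response is imprinted on the recording.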

  • Evolvability (computer science)

    Evolvability is a framework of computational learning introduced by Leslie Valiant in his paper of the same name. The aim of this theory is to model biological evolution and categorize which types of mechanisms are evolvable. Evolvability is an extension of PAC learning and learning from statistical queries. == General framework == Let $F_n$ and $R_n$ be collections of functions on $n$ variables. Given an ideal function $f \in F_n$, the goal is to find by local search a representation $r \in R_n$ that closely approximates $f$. This closeness is measured by the performance $\operatorname{Perf}(f,r)$ of $r$ with respect to $f$. As is the case in the biological world, there is a difference between genotype and phenotype. In general, there can be multiple representations (genotypes) that correspond to the same function (phenotype). That is, for some $r, r' \in R_n$ with $r \neq r'$, still $r(x) = r'(x)$ for all $x \in X_n$. However, this need not be the case. The goal, then, is to find a representation that closely matches the phenotype of the ideal function, and the spirit of the local search is to allow only small changes in the genotype. Let the neighborhood $N(r)$ of a representation $r$ be the set of possible mutations of $r$. For simplicity, consider Boolean functions on $X_n = \{-1,1\}^n$, and let $D_n$ be a probability distribution on $X_n$. Define the performance in terms of this. Specifically, $$\operatorname{Perf}(f,r) = \sum_{x \in X_n} f(x) r(x) D_n(x).$$
Note that $$\operatorname{Perf}(f,r) = \operatorname{Prob}(f(x) = r(x)) - \operatorname{Prob}(f(x) \neq r(x)),$$ because $f(x) r(x)$ equals $1$ when the two functions agree on $x$ and $-1$ otherwise. In general, for non-Boolean functions, the performance will not correspond directly to the probability that the functions agree, although it will have some relationship. Throughout an organism's life, it will only experience a limited number of environments, so its performance cannot be determined exactly. The empirical performance is defined by $$\operatorname{Perf}_s(f,r) = \frac{1}{s} \sum_{x \in S} f(x) r(x),$$ where $S$ is a multiset of $s$ independent selections from $X_n$ according to $D_n$. If $s$ is large enough, evidently $\operatorname{Perf}_s(f,r)$ will be close to the actual performance $\operatorname{Perf}(f,r)$. Given an ideal function $f \in F_n$, initial representation $r \in R_n$, sample size $s$, and tolerance $t$, the mutator $\operatorname{Mut}(f,r,s,t)$ is a random variable defined as follows. Each $r' \in N(r)$ is classified as beneficial, neutral, or deleterious, depending on its empirical performance.
Specifically, $r'$ is a beneficial mutation if $\operatorname{Perf}_s(f,r') - \operatorname{Perf}_s(f,r) \geq t$; $r'$ is a neutral mutation if $-t < \operatorname{Perf}_s(f,r') - \operatorname{Perf}_s(f,r) < t$; and $r'$ is a deleterious mutation if $\operatorname{Perf}_s(f,r') - \operatorname{Perf}_s(f,r) \leq -t$. [...] $\epsilon > 0$, for all ideal functions $f \in F_n$ and representations $r_0 \in R_n$, with probability at least $1 - \epsilon$, $$\operatorname{Perf}(f, r_{g(n,1/\epsilon)}) \geq 1 - \epsilon,$$ where the sizes of neighborhoods $N(r)$ for $r \in R_n$ are at most $p(n,1/\epsilon)$, the sample size is $s(n,1/\epsilon)$, the tolerance is $t(1/n,\epsilon)$, and the generation size is $g(n,1/\epsilon)$. $F$ is evolvable over $D$ if it is evolvable by some $R$ over $D$. $F$ is evolvable if it is evolvable over all distributions $D$. == Results == The class of conjunctions and the class of disjunctions are evolvable over the uniform distribution for short conjunctions and disjunctions, respectively. The class of parity functions (which evaluate to the parity of the number of true literals in a given subset of literals) is not evolvable, even for the uniform distribution. Evolvability implies PAC learnability.
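The performance measure and the classification of mutations from the general framework can be tried out on a tiny example. The sketch below assumes the uniform distribution on {-1, 1}^n for the exact performance and treats every mutation that is neither beneficial nor neutral as deleterious; the functions `ideal` and `constant` and the tolerance values are hypothetical choices for illustration only. It does not implement the mutator or the full evolvability definition.

```python
import itertools
import random


def perf(f, r, n):
    """Exact Perf(f, r) under the uniform distribution on {-1, 1}^n."""
    points = list(itertools.product((-1, 1), repeat=n))
    return sum(f(x) * r(x) for x in points) / len(points)


def empirical_perf(f, r, sample):
    """Empirical Perf_s(f, r): the average of f(x) * r(x) over the sample."""
    return sum(f(x) * r(x) for x in sample) / len(sample)


def classify(f, r, r_new, sample, t):
    """Label the mutation r_new of r by its change in empirical performance."""
    diff = empirical_perf(f, r_new, sample) - empirical_perf(f, r, sample)
    if diff >= t:
        return "beneficial"
    if diff > -t:
        return "neutral"
    return "deleterious"  # assumed remaining case: diff <= -t


def ideal(x):
    """Hypothetical ideal function f: the first coordinate of x."""
    return x[0]


def constant(x):
    """Hypothetical current representation r: always +1."""
    return 1


if __name__ == "__main__":
    rng = random.Random(0)
    # Draw s = 200 independent points uniformly from {-1, 1}^2.
    sample = [tuple(rng.choice((-1, 1)) for _ in range(2)) for _ in range(200)]
    print(perf(ideal, constant, 2))
    print(empirical_perf(ideal, constant, sample))
    print(classify(ideal, constant, ideal, sample, 0.1))
```

Comparing the exact and empirical values for growing sample sizes shows the empirical performance approaching the true one, as the text notes.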

  • AIXI

    AIXI is a theoretical mathematical formalism for artificial general intelligence. It combines Solomonoff induction with sequential decision theory. AIXI was first proposed by Marcus Hutter in 2000 and several results regarding AIXI are proved in Hutter's 2005 book Universal Artificial Intelligence. AIXI is a reinforcement learning (RL) agent. It maximizes the expected total rewards received from the environment. Intuitively, it simultaneously considers every computable hypothesis (or environment). In each time step, it looks at every possible program and evaluates how many rewards that program generates depending on the next action taken. The promised rewards are then weighted by the subjective belief that this program constitutes the true environment. This belief is computed from the length of the program: longer programs are considered less likely, in line with Occam's razor. AIXI then selects the action that has the highest expected total reward in the weighted sum of all these programs. == Etymology == According to Hutter, the word "AIXI" can have several interpretations. AIXI can stand for AI based on Solomonoff's distribution, denoted by $\xi$ (the Greek letter xi), or it can stand for AI "crossed" (X) with induction (I). There are other interpretations. == Definition == AIXI is a reinforcement learning agent that interacts with some stochastic and unknown but computable environment $\mu$. The interaction proceeds in time steps, from $t=1$ to $t=m$, where $m \in \mathbb{N}$ is the lifespan of the AIXI agent. At time step $t$, the agent chooses an action $a_t \in \mathcal{A}$ (e.g.
a limb movement) and executes it in the environment, and the environment responds with a "percept" $e_t \in \mathcal{E} = \mathcal{O} \times \mathbb{R}$, which consists of an "observation" $o_t \in \mathcal{O}$ (e.g., a camera image) and a reward $r_t \in \mathbb{R}$, distributed according to the conditional probability $\mu(o_t r_t | a_1 o_1 r_1 ... a_{t-1} o_{t-1} r_{t-1} a_t)$, where $a_1 o_1 r_1 ... a_{t-1} o_{t-1} r_{t-1} a_t$ is the "history" of actions, observations and rewards. The environment $\mu$ is thus mathematically represented as a probability distribution over "percepts" (observations and rewards) which depend on the full history, so there is no Markov assumption (as opposed to other RL algorithms). Note again that this probability distribution is unknown to the AIXI agent. Furthermore, note again that $\mu$ is computable, that is, the observations and rewards received by the agent from the environment $\mu$ can be computed by some program (which runs on a Turing machine), given the past actions of the AIXI agent. The only goal of the AIXI agent is to maximize $\sum_{t=1}^{m} r_t$, that is, the sum of rewards from time step 1 to $m$. The AIXI agent is associated with a stochastic policy $\pi : (\mathcal{A} \times \mathcal{E})^{*} \rightarrow \mathcal{A}$, which is the function it uses to choose actions at every time step, where $\mathcal{A}$ is the space of all possible actions that AIXI can take and $\mathcal{E}$ is the space of all possible "percepts" that can be produced by the environment.
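The interaction protocol just described (a policy mapping histories to actions, an environment mapping a history and an action to a percept, and the sum of rewards as the objective) can be sketched as a simple loop. This is only an illustration of the interface, not of AIXI itself: the policies and the environment below are hypothetical, deterministic stand-ins chosen so the example is reproducible, whereas the text allows both to be stochastic and the real environment is unknown to the agent.

```python
from typing import Callable, List, Tuple

Action = str
Percept = Tuple[str, float]               # (observation o_t, reward r_t)
History = List[Tuple[Action, Percept]]    # a_1 e_1 ... a_{t-1} e_{t-1}


def interact(
    policy: Callable[[History], Action],
    environment: Callable[[History, Action], Percept],
    lifespan: int,
) -> Tuple[float, History]:
    """Run the agent for `lifespan` steps and return the total reward."""
    history: History = []
    total_reward = 0.0
    for _ in range(lifespan):
        action = policy(history)                # pi: history -> action
        percept = environment(history, action)  # mu: (history, action) -> percept
        history.append((action, percept))
        total_reward += percept[1]
    return total_reward, history


def toy_environment(history: History, action: Action) -> Percept:
    """Hypothetical environment: reward 1 when the action changes.

    The percept depends on the history, not only on the current action.
    """
    if not history or history[-1][0] != action:
        return ("changed", 1.0)
    return ("same", 0.0)


def alternating_policy(history: History) -> Action:
    """Hypothetical policy that alternates between two actions."""
    return "left" if len(history) % 2 == 0 else "right"


def constant_policy(history: History) -> Action:
    """Hypothetical policy that always repeats the same action."""
    return "left"


if __name__ == "__main__":
    print(interact(alternating_policy, toy_environment, 4)[0])
    print(interact(constant_policy, toy_environment, 4)[0])
```

Passing the whole history to both functions mirrors the absence of a Markov assumption: nothing restricts either side to looking at only the latest step.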
The environment (or probability distribution) $\mu$ can also be thought of as a stochastic policy (which is a function): $\mu : (\mathcal{A} \times \mathcal{E})^{*} \times \mathcal{A} \rightarrow \mathcal{E}$, where $*$ is the Kleene star operation. In general, at time step $t$ (which ranges from 1 to $m$), AIXI, having previously executed actions $a_1 \dots a_{t-1}$ (which is often abbreviated in the literature as $a_{<t}$) [...]