AI Art Is Real Art

AI Art Is Real Art — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Cooliris (plugin)

    Cooliris (plugin)

    Cooliris (for Desktop), formerly known as PicLens, was a web browser extension developed by Cooliris, Inc, and later acquired by Yahoo. The plugin provides an interactive 3D-like experience for viewing digital images and videos from the web and from desktop applications. The software places a small icon atop image thumbnails that appear on a webpage. Clicking on the icon loads the Cooliris 3D Wall, a browsing environment that gives the user the effect of flying through a three-dimensional space. Released to the public in January 2008, The New York Times described Cooliris as the "new immersive approach to Web navigation". Cooliris went out to win the 2008 Crunchies Award for Best Design. The plugin has received over 50 million downloads. As of May 2014 browser plugins are unavailable from the official website. There are only links to tablet apps - for iOS and Android.

    Read more →
  • AI Image Generators Reviews: What Actually Works in 2026

    AI Image Generators Reviews: What Actually Works in 2026

    Trying to pick the best AI image generator? An AI image generator is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI image generator slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Model inversion attack

    Model inversion attack

    Model inversion attack is a type of adversarial machine learning attack where an attacker tries to reconstruct or infer sensitive information about a model's training data by analyzing the outputs of a trained machine learning model. Instead of directly querying the underlying dataset, attackers query the model (usually via APIs or prediction interfaces), and leverage patterns in the model responses to infer properties of the original inputs. These attacks leverage the fact that machine learning models encode statistical information about their training data in their parameters and outputs, which can unintentionally leak private or proprietary information. Depending on the access level to the target model, model inversion attacks can be performed in both black-box and white-box settings. In a generic attack, an adversary makes several queries to a model and leverages the responses (e.g. confidence scores, predictions) to train a surrogate or inversion model that learns to approximate the inverse mapping from outputs to inputs. This process may enable the reconstruction of sensitive attributes, e.g., facial features, medical data, or user behavior patterns, from models trained on such data. The technique has been demonstrated against various models like deep neural networks, classification systems etc. The technique has significant privacy risks in areas like healthcare, finance, biometric identification etc. Mitigation strategies include restricting model access, reducing output granularity, using differential privacy and monitoring anomalous query patterns.

    Read more →
  • Gary B. Fogel

    Gary B. Fogel

    Gary Bryce Fogel (born 1968) is an American biologist and computer scientist. He is the Chief Executive Officer of Natural Selection, Inc. He is most known for his applications of computational intelligence and machine learning to bioinformatics, computational biology, and industrial optimization. == Education and Research == Fogel was born and raised in La Jolla, California, graduating from La Jolla High School. He received a B.A. in biology with a minor in earth sciences from the University of California, Santa Cruz in 1991 and a Ph.D. in biology from the University of California, Los Angeles in 1998. Fogel has published over 150 peer-reviewed publications in conferences and journals, 2 edited books, and 11 patents. As CEO of Natural Selection, Inc., his research focuses on the application of computational intelligence, machine learning, and predictive analytics in areas not limited to: Viral evolution, cellular differentiation, drug discovery, RNA structure, cis-regulatory elements, cancer, and evolutionary game theory as well as the development of evolutionary algorithms and other approaches. == Service == Between 2008–2018 Gary Fogel was editor-in-chief of the Elsevier journal BioSystems. He has served previously as an associate editor for IEEE Transactions on Artificial Intelligence, IEEE Computational Intelligence Magazine (2005–2010), IEEE Transactions on Evolutionary Computation (2001–2013), IEEE Transactions on Emerging Topics in Computational Intelligence (2016–2018), IEEE/ACM Transactions on Computational Biology and Bioinformatics (2004–2008), International Journal of Bioinformatics Research and Applications (2004–2007), International Journal of Data Mining and Bioinformatics (2005–2007), as a consulting editor for the Journal of Computational Intelligence in Bioinformatics (2006–2007), and as an editorial board member of Ecological Informatics (2005–2009) and BMC Big Data Analytics (2015–2020). Within the IEEE Computational Intelligence Society, Fogel founded the Bioinformatics and Bioengineering Technical Committee and established the IEEE Computational Intelligence in Bioinformatics and Computational Biology conference series, chairing the first two meetings in 2004 and 2005 in San Diego. He co-founded the IEEE Conference on Artificial Intelligence in 2023. Fogel served on the IEEE Computational Intelligence Society Administrative Committee (2004–2009, 2014–2022) and served as IEEE CIS Vice President of Conferences (2010–2013, 2019). == Teaching == Gary Fogel also serves as adjunct faculty at San Diego State University in the department of aerospace engineering as well as in the Computational Science Research Center. He has authored four books and numerous articles on the history of early aviation focusing on motorless flight. He is an associate fellow of the American Institute of Aeronautics and Astronautics and serves on the AIAA History Committee. == Awards == 2023 – Outstanding Contribution to Aerospace Education Award, AIAA San Diego Section 2022 – Elected Fellow of the Asia-Pacific Artificial Intelligence Association 2019 – Top 100 AI Leaders in Drug Discovery and Advanced Healthcare by Deep Knowledge Analytics 2019 – Outstanding Contribution to Aerospace Education Award, AIAA San Diego Section 2016 – Meritorious Service Award, IEEE Computational Intelligence Society 2016 – Outstanding Contribution to the Community Award, AIAA San Diego Section 2015 – Outstanding Enhancement of the Image of the Aerospace Profession Award, AIAA San Diego Section 2012 – Medal for Significant Achievement, San Diego Chapter of Sigma Xi 2012 – Fellow of the Institute of Electrical and Electronics Engineers for contributions to computational intelligence and its application to biology, chemistry, and medicine. == Aeromodeling == Gary Fogel has established national and world records for model aircraft. He helped establish the National Model Aviation Heritage program for the Academy of Model Aeronautics. He is a leader member, contest director, and fellow of the Academy of Model Aeronautics, and was inducted into the Academy of Model Aeronautics Hall of Fame in 2025.

    Read more →
  • Systems development life cycle

    Systems development life cycle

    The systems development life cycle (SDLC) describes the typical phases and progression between phases during the development of a computer-based system. These phases progress from inception to retirement. At base, there is just one life cycle, but the taxonomy used to describe it may vary; the cycle may be classified into different numbers of phases and various names may be used for those phases. The SDLC is analogous to the life cycle of a living organism from its birth to its death. In particular, the SDLC varies by system in much the same way that each living organism has a unique path through its life. The SDLC does not prescribe how engineers should go about their work to move the system through its life cycle. Prescriptive techniques are referred to using various terms such as methodology, model, framework, and formal process. Other terms are used for the same concept as SDLC, including software development life cycle (also SDLC), application development life cycle (ADLC), and system design life cycle (also SDLC). These other terms focus on a different scope of development and are associated with different prescriptive techniques, but are about the same essential life cycle. The term "life cycle" is often written without a space, as "lifecycle", with the former more popular in the past and in non-engineering contexts. The acronym SDLC was coined when the longer form was more popular and has remained associated with the expansion, even though the shorter form is popular in engineering. Also, SDLC is relatively unique as opposed to the TLA SDL, which is highly overloaded. == Phases == Depending on the source, the SDLC is described as having different phases and using different terms. Even so, there are common aspects. The following attempts to describe notable phases using notable terminology. The phases are somewhat ordered by the natural sequence of development, although they can be overlapping and iterative. === Conceptualization === During conceptualization (a.k.a. conceptual design, system investigation, feasibility), options and priorities are considered. A feasibility study can determine whether the development effort is worthwhile via activities such as understanding user needs, cost estimation, benefit analysis, and resource analysis. A study should address operational, financial, technical, human factors, and legal/political concerns. === Requirements analysis === Requirements analysis (a.k.a. preliminary design) involves understanding the problem and determining what is needed. Often this involves engaging users to define the requirements and recording them in a document known as a requirements specification. === Design === During the design phase (a.k.a. detail design), a solution is planned. The plan can include relatively high-level information such as describing the major components of the system. The plan can include relatively low-level information such as describing functions, screen layout, business rules, and process flow. The design phase is informed by the requirements of the system. The design must satisfy each requirement. The design may be recorded in textual documents as well as functional hierarchy diagrams, example screen images, business rules, process diagrams, pseudo-code, and data models. === Construction === During construction (a.k.a. implementation, production), the system is realized. Based on the design, hardware and software components are created and integrated. This phase includes testing sub-components, components and the integration of some components, but typically does not include testing at the complete system level. This phase may include the development of training materials, including user manuals and help files. === Acceptance === The acceptance phase (a.k.a. system testing) is about testing the complete system to ensure that it meets customer expectations (requirements). === Deployment === The deployment phase (a.k.a. implementation) involves the logistics of delivery to the customer. Some systems are deployed as a single instance (i.e. in the cloud), and deployment may be ad hoc and manual. Some systems are built in quantity and are associated with manufacturing process and commissioning. This phase may include training users to use the system. It may include transitioning future development to support staff. === Maintenance === During the maintenance phase (a.k.a. operation, utilization, support) development is largely inactive, although this phase does include customer support for resolving user issues and recording suggestions for improvement. Fixes and enhancements are handled by returning to the first phase, conceptualization. For minor changes, the cycle may be significantly abbreviated compared to initial development. === Decommission === Decommission (a.k.a. disposition, retirement, phase-out) is when the system is removed from use, i.e., when it reaches end-of-life. == Practices == === Management and control === SDLC phase objectives are described in this section with key deliverables, a description of recommended tasks, and a summary of related control objectives for effective management. It is critical for the project manager to establish and monitor control objectives while executing projects. Control objectives are clear statements of the desired result or purpose and should be defined and monitored throughout a project. Control objectives can be grouped into major categories (domains), and relate to the SDLC phases as shown in the figure. To manage and control a substantial SDLC initiative, a work breakdown structure (WBS) captures and schedules the work. The WBS and all programmatic material should be kept in the "project description" section of the project notebook. The project manager chooses a WBS format that best describes the project. The diagram shows that coverage spans numerous phases of the SDLC, but the associated MCD (Management Control Domains) shows mappings to SDLC phases. For example, Analysis and Design is primarily performed as part of the Acquisition and Implementation Domain, and System Build and Prototype is primarily performed as part of delivery and support. === Work breakdown structured organization === The upper section of the WBS provides an overview of the project scope and timeline. It should also summarize the major phases and milestones. The middle section is based on the SDLC phases. WBS elements consist of milestones and tasks to be completed rather than activities to be undertaken, and have a deadline. Each task has a measurable output (e.g., an analysis document). A WBS task may rely on one or more activities (e.g., coding). Parts of the project needing support from contractors should have a statement of work (SOW). The development of an SOW does not occur during a specific phase of SDLC but is developed to include the work from the SDLC process that may be conducted by contractors. === Baselines === Baselines are established after four of the five phases of the SDLC, and are critical to the iterative nature of the model. Baselines become milestones. functional baseline: established after the conceptual design phase. allocated baseline: established after the preliminary design phase. product baseline: established after the detailed design and development phase. updated product baseline: established after the production construction phase. In the following diagram, these stages are divided into ten steps, from definition to creation and modification of IT work products:

    Read more →
  • Yaron Singer

    Yaron Singer

    Yaron Singer is a computer scientist and entrepreneur whose work has focused on algorithms, machine learning, optimization, and artificial intelligence security. He was the Gordon McKay Professor of Computer Science and Applied Mathematics at Harvard University and co-founded Robust Intelligence, an artificial intelligence security company acquired by Cisco Systems in 2024. == Education == Singer received a PhD in computer science from the University of California, Berkeley under the supervision of Christos Papadimitriou. == Academic career == Singer was a postdoctoral research scientist at Google Research. Singer joined the computer science faculty at Harvard John A. Paulson School of Engineering and Applied Sciences in 2013 and became a full professor in 2019. == Research == Singer's research has focused on algorithms and machine learning, including optimization, algorithmic mechanism design, and adversarial machine learning. His doctoral work studied computational limits in algorithmic mechanism design, including truthful mechanisms and budget-feasible mechanisms. In optimization, Singer co-authored work on submodular optimization and parallel algorithms for large-scale data processing. Singer has also worked on adversarial machine learning, including attacks that use small perturbations or noise to affect the behavior of machine learning systems. == Entrepreneurship == In 2020, Singer co-founded Robust Intelligence Kojin Oshiba. Harvard SEAS reported that the company raised $14 million that year, and TechCrunch reported in 2021 that the company raised a $30 million Series B round led by Tiger Global. The company developed tools for testing AI models and detecting failures before or during deployment. TechCrunch described its RIME product as using an "AI firewall" to stress-test models. In 2024, Cisco Systems acquired Robust Intelligence. CTech reported that Cisco had not disclosed the purchase amount when the acquisition was announced, and later reported the deal value as $400 million. In 2025, Cisco launched Foundation AI, a Cisco team focused on AI for cybersecurity. Techzine reported that Singer led the team and was Cisco's VP of AI and Security. == Recognition == Singer has received a Sloan Research Fellowship, an NSF CAREER Award, a Google Faculty Research Award, and a Facebook Faculty Award. As a graduate student, he received Microsoft Research and Facebook fellowships. In 2012, he received the Best Student Paper Award at the ACM International Conference on Web Search and Data Mining for "How to Win Friends and Influence People, Truthfully: Influence Maximization Mechanisms for Social Networks."

    Read more →
  • Frank Hutter

    Frank Hutter

    Frank Hutter is a German computer scientist recognized for his contributions to machine learning, particularly in the areas of automated machine learning (AutoML), hyperparameter optimization, meta-learning and tabular machine learning. He is currently a Hector-Endowed Fellow and PI at the ELLIS Institute Tübingen and a Full Professor (W3) for Machine Learning at the Department of Computer Science, University of Freiburg. Hutter is known for his role in establishing AutoML as a key area in artificial intelligence research. == Education and academic career == Frank Hutter received his academic training in computer science at Darmstadt University of Technology, where he completed his Vordiplom (comparable to a BSc) and Hauptdiplom (equivalent to MSc) by 2004. He later pursued his PhD at the University of British Columbia, under the supervision of Profs. Holger Hoos, Kevin Leyton-Brown and Kevin Murphy, where his doctoral thesis, titled "Automated Configuration of Algorithms for Solving Hard Computational Problems," was awarded the CAIAC Doctoral Dissertation Award for the best thesis in Artificial Intelligence completed at a Canadian university in 2009. Hutter did his postdoctoral research at the University of British Columbia, where he worked from 2009 to 2013. In 2013, he moved to the University of Freiburg, initially leading an Emmy Noether Research Group, and in 2017, he was appointed as a Full Professor. His contributions to machine learning have been recognized globally, particularly his work in AutoML and hyperparameter optimization. Overall, Hutter has authored over 180 peer-reviewed publications, which have garnered more than 89,000 citations, reflecting the high impact of his work. == Contributions in AutoML == Hutter's early research laid the groundwork for the field of Automated Machine Learning (AutoML). He has been a key figure in establishing AutoML as a distinct research area. Along with various colleagues, he organized the AutoML workshops from 2014 to 2021, wrote the first book on AutoML and taught the first MOOC on AutoML. He also co-founded the AutoML conference in 2022 and served as its general chair the first two years. He also published prominent works in various subfields of AutoML, such as hyperparameter optimization, neural architecture search, meta-Learning and AutoML systems. He is currently the most highly cited researcher in AutoML. == Contributions in machine learning for tabular data == Hutter has also made many contributions to machine learning for tabular data. He led the development of the first widely adopted AutoML system for tabular data, AutoWEKA, which was published at KDD 2013 and received the test of time award at KDD (2023). Subsequently, he led the development of Auto-sklearn, the first highly used AutoML system for tabular data in Python, and with it, won the first international AutoML challenge and the subsequent second international AutoML challenge, both of which only included tabular data. More recently, he focused on tabular foundation models, including TabPFN, which was published in Nature magazine. In 2024, he also co-founded Prior Labs, the first company focusing on tabular foundation models. == Awards and honors == Hutter has received numerous awards throughout his career. In 2023, he won the KDD Test of Time Award for Research together with Chris Thornton, Holger H. Hoos, and Kevin Leyton-Brown. He has received three grants from the ERC, including the ERC Starting Grant (2016) and ERC Consolidator Grant (2022), as well as an ERC Proof of Concept Grant (2020). In 2021, he became an ELLIS Unit Director and was also recognized as a EurAI Fellow, in addition to receiving the AIJ Prominent Paper Award. Earlier, he was a recipient of the Google Faculty Research Award in 2018. His groundbreaking research was acknowledged early in his career with the IJCAI Distinguished Paper Award in 2013 and the IJCAI/JAIR Best Paper Prize in 2010. == Representative publications == Hutter, F. Kotthoff, L. and Vanschoren, J., editors. Automated machine learning: methods, systems, challenges, Springer Nature, 2019. www.automl.org/book. Feurer, M., Klein, A., Eggensperger, K., Springenberg, T., Blum, M., Hutter, F. Efficient and Robust Automated Machine Learning. In NeurIPS 2015. Loshchilov, I., and Hutter, F. Decoupled weight decay regularization. In ICLR 2018. Zela, A., Elsken, T. ,Saikia, T. ,Marrakschi, Y. ,Brox, T. and Hutter. ,F.Understanding and Robustifying Differentiable Architecture Search. In ICLR 2020. Hollmann, N., Müller, S., Eggensperger, K. and Hutter, F. TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second, In ICLR 2023.

    Read more →
  • Permutation automaton

    Permutation automaton

    In automata theory, a permutation automaton, or pure-group automaton, is a deterministic finite automaton such that each input symbol permutes the set of states. Formally, a deterministic finite automaton A may be defined by the tuple (Q, Σ, δ, q0, F), where Q is the set of states of the automaton, Σ is the set of input symbols, δ is the transition function that takes a state q and an input symbol x to a new state δ(q,x), q0 is the initial state of the automaton, and F is the set of accepting states (also: final states) of the automaton. A is a permutation automaton if and only if, for every two distinct states qi and qj in Q and every input symbol x in Σ, δ(qi,x) ≠ δ(qj,x). A formal language is p-regular (also: a pure-group language) if it is accepted by a permutation automaton. For example, the set of strings of even length forms a p-regular language: it may be accepted by a permutation automaton with two states in which every transition replaces one state by the other. == Applications == The pure-group languages were the first interesting family of regular languages for which the star height problem was proved to be computable. Another mathematical problem on regular languages is the separating words problem, which asks for the size of a smallest deterministic finite automaton that distinguishes between two given words of length at most n – by accepting one word and rejecting the other. The known upper bound in the general case is O ( n 2 / 5 ( log ⁡ n ) 3 / 5 ) {\displaystyle O(n^{2/5}(\log n)^{3/5})} . The problem was later studied for the restriction to permutation automata. In this case, the known upper bound changes to O ( n 1 / 2 ) {\displaystyle O(n^{1/2})} .

    Read more →
  • Admissible heuristic

    Admissible heuristic

    In computer science, specifically in algorithms related to pathfinding, a heuristic function is said to be admissible if it never overestimates the cost of reaching the goal, i.e. the cost it estimates to reach the goal is not higher than the lowest possible cost from the current point in the path. In other words, it should act as a lower bound. It is related to the concept of consistent heuristics. While all consistent heuristics are admissible, not all admissible heuristics are consistent. == Search algorithms == An admissible heuristic is used to estimate the cost of reaching the goal state in an informed search algorithm. In order for a heuristic to be admissible to the search problem, the estimated cost must always be lower than or equal to the actual cost of reaching the goal state. The search algorithm uses the admissible heuristic to find an estimated optimal path to the goal state from the current node. For example, in A search the evaluation function (where n {\displaystyle n} is the current node) is: f ( n ) = g ( n ) + h ( n ) {\displaystyle f(n)=g(n)+h(n)} where f ( n ) {\displaystyle f(n)} = the evaluation function. g ( n ) {\displaystyle g(n)} = the cost from the start node to the current node h ( n ) {\displaystyle h(n)} = estimated cost from current node to goal. h ( n ) {\displaystyle h(n)} is calculated using the heuristic function. With a non-admissible heuristic, the A algorithm could overlook the optimal solution to a search problem due to an overestimation in f ( n ) {\displaystyle f(n)} . == Formulation == n {\displaystyle n} is a node h {\displaystyle h} is a heuristic h ( n ) {\displaystyle h(n)} is cost indicated by h {\displaystyle h} to reach a goal from n {\displaystyle n} h ∗ ( n ) {\displaystyle h^{}(n)} is the optimal cost to reach a goal from n {\displaystyle n} h ( n ) {\displaystyle h(n)} is admissible if, ∀ n {\displaystyle \forall n} h ( n ) ≤ h ∗ ( n ) {\displaystyle h(n)\leq h^{}(n)} == Construction == An admissible heuristic can be derived from a relaxed version of the problem, or by information from pattern databases that store exact solutions to subproblems of the problem, or by using inductive learning methods. == Examples == Two different examples of admissible heuristics apply to the fifteen puzzle problem: Hamming distance Manhattan distance The Hamming distance is the total number of misplaced tiles. It is clear that this heuristic is admissible since the total number of moves to order the tiles correctly is at least the number of misplaced tiles (each tile not in place must be moved at least once). The cost (number of moves) to the goal (an ordered puzzle) is at least the Hamming distance of the puzzle. The Manhattan distance of a puzzle is defined as: h ( n ) = ∑ all tiles d i s t a n c e ( tile, correct position ) {\displaystyle h(n)=\sum _{\text{all tiles}}{\mathit {distance}}({\text{tile, correct position}})} Consider the puzzle below in which the player wishes to move each tile such that the numbers are ordered. The Manhattan distance is an admissible heuristic in this case because every tile will have to be moved at least the number of spots in between itself and its correct position. The subscripts show the Manhattan distance for each tile. The total Manhattan distance for the shown puzzle is: h ( n ) = 3 + 1 + 0 + 1 + 2 + 3 + 3 + 4 + 3 + 2 + 4 + 4 + 4 + 1 + 1 = 36 {\displaystyle h(n)=3+1+0+1+2+3+3+4+3+2+4+4+4+1+1=36} == Optimality proof == If an admissible heuristic is used in an algorithm that, per iteration, progresses only the path of lowest evaluation (current cost + heuristic) of several candidate paths, terminates the moment its exploration reaches the goal and, crucially, closes all optimal paths before terminating (something that's possible with A search algorithm if special care isn't taken), then this algorithm can only terminate on an optimal path. To see why, consider the following proof by contradiction: Assume such an algorithm managed to terminate on a path T with a true cost Ttrue greater than the optimal path S with true cost Strue. This means that before terminating, the evaluated cost of T was less than or equal to the evaluated cost of S (or else S would have been picked). Denote these evaluated costs Teval and Seval respectively. The above can be summarized as follows, Strue < Ttrue Teval ≤ Seval If our heuristic is admissible it follows that at this penultimate step Teval = Ttrue because any increase on the true cost by the heuristic on T would be inadmissible and the heuristic cannot be negative. On the other hand, an admissible heuristic would require that Seval ≤ Strue which combined with the above inequalities gives us Teval < Ttrue and more specifically Teval ≠ Ttrue. As Teval and Ttrue cannot be both equal and unequal our assumption must have been false and so it must be impossible to terminate on a more costly than optimal path. As an example, let us say we have costs as follows:(the cost above/below a node is the heuristic, the cost at an edge is the actual cost) 0 10 0 100 0 START ---- O ----- GOAL | | 0| |100 | | O ------- O ------ O 100 1 100 1 100 So clearly we would start off visiting the top middle node, since the expected total cost, i.e. f ( n ) {\displaystyle f(n)} , is 10 + 0 = 10 {\displaystyle 10+0=10} . Then the goal would be a candidate, with f ( n ) {\displaystyle f(n)} equal to 10 + 100 + 0 = 110 {\displaystyle 10+100+0=110} . Then we would clearly pick the bottom nodes one after the other, followed by the updated goal, since they all have f ( n ) {\displaystyle f(n)} lower than the f ( n ) {\displaystyle f(n)} of the current goal, i.e. their f ( n ) {\displaystyle f(n)} is 100 , 101 , 102 , 102 {\displaystyle 100,101,102,102} . So even though the goal was a candidate, we could not pick it because there were still better paths out there. This way, an admissible heuristic can ensure optimality. However, note that although an admissible heuristic can guarantee final optimality, it is not necessarily efficient.

    Read more →
  • FastText

    FastText

    fastText is a library for learning of word embeddings and text classification created by Facebook's AI Research (FAIR) lab. The model allows one to create an unsupervised learning or supervised learning algorithm for obtaining vector representations for words. Facebook makes available pretrained models for 294 languages. Several papers describe the techniques used by fastText. The GitHub repository was archived on March 19, 2024.

    Read more →
  • Wolfgang Ketter

    Wolfgang Ketter

    Wolfgang Ketter (born Traben-Trarbach, Germany, 1972) is Chaired Professor of Information Systems for a Sustainable Society at the University of Cologne. and a prominent scientist in the application of artificial intelligence, machine learning and intelligent agents in the design of smart markets, including demand response mechanisms and in particular automated auctions. He is a co-founder of the open energy system platform Power TAC, an automated retail electricity trading platform that simulates the performance of retail markets in an increasingly prosumer- and renewable-energy-influenced electricity landscape. == Career == === Advisory roles === Ketter is an advisor on the energy transition to the German government, in particular, the energy-intensive German state of North Rhine-Westphalia. He is also a fellow of the World Economic Forum and member of the WEF Global Council on Future Mobility and the Global New Mobility Coalition, contributing on the use of AI and machine learning to address issues arising from growth in electrification of energy such as the use of batteries as virtual power plants, the management of electric vehicle charging to prevent grid congestion, or the potential for peer-to-peer electricity trading. Ketter has also been an advisor for over a decade to the Port of Rotterdam on the design of energy cooperatives and energy trading platforms as well as one of the largest auction companies in the world, Royal FloraHolland, where his initial research led to a redesign of auction mechanisms and decision support systems. The cumulative research project team received the Association for Information Systems Impact Award in 2020 === Research === Ketter’s research is multidisciplinary, addressing the overlap of AI and ML in the economics of retail energy and mobility markets. The industry and policy applications of his research interconnect in large-scale projects such as the EU Smart city development project Ruggedised, for which the Erasmus University-based team's publication on the optimization of the City of Rotterdam's electric transit bus network was recognized with the Institute for Operations Research and the Management Sciences Daniel H. Wagner runner-up award. His research focuses on the use of competitive benchmarking and intelligent agents in virtual world simulations of retail energy markets as part of a smart grid. A small-scale version of the Power TAC project led to a publication on demand side management, 'A simulation of household behavior under variable prices' that has several hundred citations in publications representing a variety of scientific disciplines. Two of his publications in the Management Information Systems Quarterly journal and one in Energy Economics form the foundation for the current Power TAC platform. In 2016 and 2019 he was Chair of the Workshop on Information Technologies and Systems. Ketter is Coordinator of the Key Research Initiative Sustainable Smart Energy & Mobility at the University of Cologne, where he is a chaired Professor of Information Systems for a Sustainable Society. At the Rotterdam School of Management, Erasmus University, he is Professor of Next Generation Information Systems as well as Director of the Erasmus Centre for Future Energy Business and Academic Director of Smart Cities and Smart Energy at the Erasmus Centre of Data Analytics. He has been a visiting professor at the Haas School of Business and Berkeley Institute of Data Science, University of California at Berkeley in 2016 to 2017.

    Read more →
  • Markov switching multifractal

    Markov switching multifractal

    In financial econometrics (the application of statistical methods to economic data), the Markov-switching multifractal (MSM) is a model of asset returns developed by Laurent E. Calvet and Adlai J. Fisher that incorporates stochastic volatility components of heterogeneous durations. MSM captures the outliers, log-memory-like volatility persistence and power variation of financial returns. In currency and equity series, MSM compares favorably with standard volatility models such as GARCH(1,1) and FIGARCH both in- and out-of-sample. MSM is used by practitioners in the financial industry for different types of forecasts. == MSM specification == The MSM model can be specified in both discrete time and continuous time. === Discrete time === Let P t {\displaystyle P_{t}} denote the price of a financial asset, and let r t = ln ⁡ ( P t / P t − 1 ) {\displaystyle r_{t}=\ln(P_{t}/P_{t-1})} denote the return over two consecutive periods. In MSM, returns are specified as r t = μ + σ ¯ ( M 1 , t M 2 , t . . . M k ¯ , t ) 1 / 2 ϵ t , {\displaystyle r_{t}=\mu +{\bar {\sigma }}(M_{1,t}M_{2,t}...M_{{\bar {k}},t})^{1/2}\epsilon _{t},} where μ {\displaystyle \mu } and σ {\displaystyle \sigma } are constants and { ϵ t {\displaystyle \epsilon _{t}} } are independent standard Gaussians. Volatility is driven by the first-order latent Markov state vector: M t = ( M 1 , t M 2 , t … M k ¯ , t ) ∈ R + k ¯ . {\displaystyle M_{t}=(M_{1,t}M_{2,t}\dots M_{{\bar {k}},t})\in R_{+}^{\bar {k}}.} Given the volatility state M t {\displaystyle M_{t}} , the next-period multiplier M k , t + 1 {\displaystyle M_{k,t+1}} is drawn from a fixed distribution M with probability γ k {\displaystyle \gamma _{k}} , and is otherwise left unchanged. The transition probabilities are specified by γ k = 1 − ( 1 − γ 1 ) ( b k − 1 ) {\displaystyle \gamma _{k}=1-(1-\gamma _{1})^{(b^{k-1})}} . The sequence γ k {\displaystyle \gamma _{k}} is approximately geometric γ k ≈ γ 1 b k − 1 {\displaystyle \gamma _{k}\approx \gamma _{1}b^{k-1}} at low frequency. The marginal distribution M has a unit mean, has a positive support, and is independent of k. ==== Binomial MSM ==== In empirical applications, the distribution M is often a discrete distribution that can take the values m 0 {\displaystyle m_{0}} or 2 − m 0 {\displaystyle 2-m_{0}} with equal probability. The return process r t {\displaystyle r_{t}} is then specified by the parameters θ = ( m 0 , μ , σ ¯ , b , γ 1 ) {\displaystyle \theta =(m_{0},\mu ,{\bar {\sigma }},b,\gamma _{1})} . Note that the number of parameters is the same for all k ¯ > 1 {\displaystyle {\bar {k}}>1} . === Continuous time === MSM is similarly defined in continuous time. The price process follows the diffusion: d P t P t = μ d t + σ ( M t ) d W t , {\displaystyle {\frac {dP_{t}}{P_{t}}}=\mu dt+\sigma (M_{t})\,dW_{t},} where σ ( M t ) = σ ¯ ( M 1 , t … M k ¯ , t ) 1 / 2 {\displaystyle \sigma (M_{t})={\bar {\sigma }}(M_{1,t}\dots M_{{\bar {k}},t})^{1/2}} , W t {\displaystyle W_{t}} is a standard Brownian motion, and μ {\displaystyle \mu } and σ ¯ {\displaystyle {\bar {\sigma }}} are constants. Each component follows the dynamics: The intensities vary geometrically with k: γ k = γ 1 b k − 1 . {\displaystyle \gamma _{k}=\gamma _{1}b^{k-1}.} When the number of components k ¯ {\displaystyle {\bar {k}}} goes to infinity, continuous-time MSM converges to a multifractal diffusion, whose sample paths take a continuum of local Hölder exponents on any finite time interval. == Inference and closed-form likelihood == When M {\displaystyle M} has a discrete distribution, the Markov state vector M t {\displaystyle M_{t}} takes finitely many values m 1 , . . . , m d ∈ R + k ¯ {\displaystyle m^{1},...,m^{d}\in R_{+}^{\bar {k}}} . For instance, there are d = 2 k ¯ {\displaystyle d=2^{\bar {k}}} possible states in binomial MSM. The Markov dynamics are characterized by the transition matrix A = ( a i , j ) 1 ≤ i , j ≤ d {\displaystyle A=(a_{i,j})_{1\leq i,j\leq d}} with components a i , j = P ( M t + 1 = m j | M t = m i ) {\displaystyle a_{i,j}=P\left(M_{t+1}=m^{j}|M_{t}=m^{i}\right)} . Conditional on the volatility state, the return r t {\displaystyle r_{t}} has Gaussian density f ( r t | M t = m i ) = 1 2 π σ 2 ( m i ) exp ⁡ [ − ( r t − μ ) 2 2 σ 2 ( m i ) ] . {\displaystyle f(r_{t}|M_{t}=m^{i})={\frac {1}{\sqrt {2\pi \sigma ^{2}(m^{i})}}}\exp \left[-{\frac {(r_{t}-\mu )^{2}}{2\sigma ^{2}(m^{i})}}\right].} === Conditional distribution === === Closed-form Likelihood === The log likelihood function has the following analytical expression: ln ⁡ L ( r 1 , … , r T ; θ ) = ∑ t = 1 T ln ⁡ [ ω ( r t ) . ( Π t − 1 A ) ] . {\displaystyle \ln L(r_{1},\dots ,r_{T};\theta )=\sum _{t=1}^{T}\ln[\omega (r_{t}).(\Pi _{t-1}A)].} Maximum likelihood provides reasonably precise estimates in finite samples. === Other estimation methods === When M {\displaystyle M} has a continuous distribution, estimation can proceed by simulated method of moments, or simulated likelihood via a particle filter. == Forecasting == Given r 1 , … , r t {\displaystyle r_{1},\dots ,r_{t}} , the conditional distribution of the latent state vector at date t + n {\displaystyle t+n} is given by: Π ^ t , n = Π t A n . {\displaystyle {\hat {\Pi }}_{t,n}=\Pi _{t}A^{n}.\,} MSM often provides better volatility forecasts than some of the best traditional models both in and out of sample. Calvet and Fisher report considerable gains in exchange rate volatility forecasts at horizons of 10 to 50 days as compared with GARCH(1,1), Markov-Switching GARCH, and Fractionally Integrated GARCH. Lux obtains similar results using linear predictions. == Applications == === Multiple assets and value-at-risk === Extensions of MSM to multiple assets provide reliable estimates of the value-at-risk in a portfolio of securities. === Asset pricing === In financial economics, MSM has been used to analyze the pricing implications of multifrequency risk. The models have had some success in explaining the excess volatility of stock returns compared to fundamentals and the negative skewness of equity returns. They have also been used to generate multifractal jump-diffusions. == Related approaches == MSM is a stochastic volatility model with arbitrarily many frequencies. MSM builds on the convenience of regime-switching models, which were advanced in economics and finance by James D. Hamilton. MSM is closely related to the Multifractal Model of Asset Returns. MSM improves on the MMAR's combinatorial construction by randomizing arrival times, guaranteeing a strictly stationary process. MSM provides a pure regime-switching formulation of multifractal measures, which were pioneered by Benoit Mandelbrot.

    Read more →
  • Second-order co-occurrence pointwise mutual information

    Second-order co-occurrence pointwise mutual information

    In computational linguistics, second-order co-occurrence pointwise mutual information (SOC-PMI) is a method used to measure semantic similarity, or how close in meaning two words are. The method does not require the two words to appear together in a text. Instead, it works by analyzing the "neighbor" words that typically appear alongside each of the two target words in a large body of text (corpus). If the two target words frequently share the same neighbors, they are considered semantically similar. For example, the words "cemetery" and "graveyard" may not appear in the same sentence often, but they both frequently appear near words like "buried," "dead," and "funeral." SOC-PMI uses this shared context to determine that they have a similar meaning. The method is called "second-order" because it doesn't look at the direct co-occurrence of the target words (which would be first-order), but at the co-occurrence of their neighbors (a second level of association). The strength of these associations is quantified using pointwise mutual information (PMI). == History == The method builds on earlier work like the PMI-IR algorithm, which used the AltaVista search engine to calculate word association probabilities. The key advantage of a second-order approach like SOC-PMI is its ability to measure similarity between words that do not co-occur often, or at all. The British National Corpus (BNC) has been used as a source for word frequencies and contexts for this method. == Methodology == The SOC-PMI algorithm measures the similarity between two words, w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} , in several steps. === Step 1: Score neighboring words with PMI === First, for each target word ( w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} ), the algorithm identifies its "neighbor" words within a certain text window (e.g., within 5 words to the left or right) across a large corpus. The strength of the association between a target word t i {\displaystyle t_{i}} and its neighbor w {\displaystyle w} is calculated using pointwise mutual information (PMI). A higher PMI value means the two words appear together more often than would be expected by chance. The PMI between a target word t i {\displaystyle t_{i}} and a neighbor word w {\displaystyle w} is calculated as: f pmi ( t i , w ) = log 2 ⁡ f b ( t i , w ) × m f t ( t i ) f t ( w ) {\displaystyle f^{\text{pmi}}(t_{i},w)=\log _{2}{\frac {f^{b}(t_{i},w)\times m}{f^{t}(t_{i})f^{t}(w)}}} where: f b ( t i , w ) {\displaystyle f^{b}(t_{i},w)} is the number of times t i {\displaystyle t_{i}} and w {\displaystyle w} appear together in the context window. f t ( t i ) {\displaystyle f^{t}(t_{i})} is the total number of times t i {\displaystyle t_{i}} appears in the corpus. f t ( w ) {\displaystyle f^{t}(w)} is the total number of times w {\displaystyle w} appears in the corpus. m {\displaystyle m} is the total number of tokens (words) in the corpus. === Step 2: Create a semantic 'signature' for each word === For each target word ( w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} ), the algorithm creates a list of its most significant neighbors. This is done by taking the top β {\displaystyle \beta } neighbor words, sorted in descending order by their PMI score with the target word. This list of top neighbors, X w {\displaystyle X^{w}} , acts as a semantic "signature" for the word w {\displaystyle w} . X w = { X i w } {\displaystyle X^{w}=\{X_{i}^{w}\}} , for i = 1 , 2 , … , β {\displaystyle i=1,2,\ldots ,\beta } The size of this list, β {\displaystyle \beta } , is a parameter of the method. === Step 3: Compare the signatures === The algorithm then compares the signatures of w 1 {\displaystyle w_{1}} and w 2 {\displaystyle w_{2}} . It looks for words that are present in both signatures. The similarity of w 1 {\displaystyle w_{1}} to w 2 {\displaystyle w_{2}} is calculated by summing the PMI scores of w 2 {\displaystyle w_{2}} with every word in w 1 {\displaystyle w_{1}} 's signature list. The β {\displaystyle \beta } -PMI summation function defines this score. The score for w 1 {\displaystyle w_{1}} with respect to w 2 {\displaystyle w_{2}} is: f ( w 1 , w 2 , β ) = ∑ i = 1 β ( f pmi ( X i w 1 , w 2 ) ) γ {\displaystyle f(w_{1},w_{2},\beta )=\sum _{i=1}^{\beta }(f^{\text{pmi}}(X_{i}^{w_{1}},w_{2}))^{\gamma }} This sum only includes terms where the PMI value is positive. The exponent γ {\displaystyle \gamma } (with a value > 1) is used to give more weight to neighbors that are more strongly associated with w 2 {\displaystyle w_{2}} . This calculation is done in both directions: The similarity of w 1 {\displaystyle w_{1}} with respect to w 2 {\displaystyle w_{2}} : f ( w 1 , w 2 , β 1 ) = ∑ i = 1 β 1 ( f pmi ( X i w 1 , w 2 ) ) γ {\displaystyle f(w_{1},w_{2},\beta _{1})=\sum _{i=1}^{\beta _{1}}(f^{\text{pmi}}(X_{i}^{w_{1}},w_{2}))^{\gamma }} The similarity of w 2 {\displaystyle w_{2}} with respect to w 1 {\displaystyle w_{1}} : f ( w 2 , w 1 , β 2 ) = ∑ i = 1 β 2 ( f pmi ( X i w 2 , w 1 ) ) γ {\displaystyle f(w_{2},w_{1},\beta _{2})=\sum _{i=1}^{\beta _{2}}(f^{\text{pmi}}(X_{i}^{w_{2}},w_{1}))^{\gamma }} === Step 4: Calculate final similarity score === Finally, the total semantic similarity is the average of the two scores from the previous step. S i m ( w 1 , w 2 ) = f ( w 1 , w 2 , β 1 ) β 1 + f ( w 2 , w 1 , β 2 ) β 2 {\displaystyle \mathrm {Sim} (w_{1},w_{2})={\frac {f(w_{1},w_{2},\beta _{1})}{\beta _{1}}}+{\frac {f(w_{2},w_{1},\beta _{2})}{\beta _{2}}}} This score can be normalized to fall between 0 and 1. For example, using this method, the words cemetery and graveyard achieve a high similarity score of 0.986 (with specific parameter settings).

    Read more →
  • Pushpak Bhattacharyya

    Pushpak Bhattacharyya

    Pushpak Bhattacharyya (3 July 1962 – 5 October 2025) was an Indian computer scientist and professor in the Department of Computer Science and Engineering at the IIT Bombay. He served as the Director of the IIT Patna from 2015 to 2021. He was a past President of the Association for Computational Linguistics (2016–17), and held the Vijay and Sita Vashee Chair Professorship at IIT Bombay. Bhattacharyya led the Natural Language Processing (NLP) research group at the Centre for Indian Language Technology (CFILT) at IIT Bombay until his death. At the inauguration of the Nilekani Centre at AI4Bharat, IIT Madras, Nandan Nilekani, Co-founder and Non-Executive Chairman of Infosys, referred to Bhattacharyya as the "Godfather of Indian NLP". == Early life and education == Bhattacharyya was born in Shillong in 1962. He completed his schooling at Jail Road Boys' High School, Shillong. He obtained a B.Tech. in Computer Science from the IIT Kharagpur, followed by an M.Tech. from the IIT Kanpur, and a Ph.D. in Computer Science from IIT Bombay in 1994. == Research == Bhattacharyya’s research areas includes Natural language processing, Artificial intelligence, Machine learning, Psycholinguistics, Eye tracking, and Information retrieval. He made contributions to the development of multilingual lexical databases such as IndoWordNet and other projects related to machine translation and computational linguistics. He authored and co-authored multiple academic works, including Investigations in Computational Sarcasm (with Aditya Joshi), Cognitively Inspired Natural Language Processing: An Investigation Based on Eye Tracking (with Abhijit Mishra), and Machine Translation and Transliteration of Low Resource Related Languages (with Anoop Kunchukuttan). Over his career, Bhattacharyya published more than 350 research papers in journals and conference proceedings and supervised over 300 undergraduate, master’s, and doctoral students. His projects often addressed computational challenges for Indian languages, such as developing wordnets, building translation systems for low-resource languages, and studying cognitive aspects of language processing. He also led government- and industry-funded research initiatives supported by organizations including IBM, Microsoft, Yahoo, and the United Nations. == Death == Bhattacharyya died on 5 October 2025, at the age of 63. == Awards == Patwardhan Award, IIT Bombay, for Technology Development VNMM Award, IIT Roorkee, for Technology Development Fellow, Indian National Academy of Engineering Eminent Engineer Award, Institution of Engineers (India)

    Read more →
  • Generalized filtering

    Generalized filtering

    Generalized filtering is a generic Bayesian filtering scheme for nonlinear state-space models. It is based on a variational principle of least action, formulated in generalized coordinates of motion. Note that "generalized coordinates of motion" are related to—but distinct from—generalized coordinates as used in (multibody) dynamical systems analysis. Generalized filtering furnishes posterior densities over hidden states (and parameters) generating observed data using a generalized gradient descent on variational free energy, under the Laplace assumption. Unlike classical (e.g. Kalman-Bucy or particle) filtering, generalized filtering eschews Markovian assumptions about random fluctuations. Furthermore, it operates online, assimilating data to approximate the posterior density over unknown quantities, without the need for a backward pass. Special cases include variational filtering, dynamic expectation maximization and generalized predictive coding. == Definition == Definition: Generalized filtering rests on the tuple ( Ω , U , X , S , p , q ) {\displaystyle (\Omega ,U,X,S,p,q)} : A sample space Ω {\displaystyle \Omega } from which random fluctuations ω ∈ Ω {\displaystyle \omega \in \Omega } are drawn Control states U ∈ R {\displaystyle U\in \mathbb {R} } – that act as external causes, input or forcing terms Hidden states X : X × U × Ω → R {\displaystyle X:X\times U\times \Omega \to \mathbb {R} } – that cause sensory states and depend on control states Sensor states S : X × U × Ω → R {\displaystyle S:X\times U\times \Omega \to \mathbb {R} } – a probabilistic mapping from hidden and control states Generative density p ( s ~ , x ~ , u ~ ∣ m ) {\displaystyle p({\tilde {s}},{\tilde {x}},{\tilde {u}}\mid m)} – over sensory, hidden and control states under a generative model m {\displaystyle m} Variational density q ( x ~ , u ~ ∣ μ ~ ) {\displaystyle q({\tilde {x}},{\tilde {u}}\mid {\tilde {\mu }})} – over hidden and control states with mean μ ~ ∈ R {\displaystyle {\tilde {\mu }}\in \mathbb {R} } Here ~ denotes a variable in generalized coordinates of motion: u ~ = [ u , u ′ , u ″ , … ] T {\displaystyle {\tilde {u}}=[u,u',u'',\ldots ]^{T}} === Generalized filtering === The objective is to approximate the posterior density over hidden and control states, given sensor states and a generative model – and estimate the (path integral of) model evidence p ( s ~ ( t ) | m ) {\displaystyle p({\tilde {s}}(t)\vert m)} to compare different models. This generally involves an intractable marginalization over hidden states, so model evidence (or marginal likelihood) is replaced with a variational free energy bound. Given the following definitions: μ ~ ( t ) = a r g m i n μ ~ { F ( s ~ ( t ) , μ ~ ) } {\displaystyle {\tilde {\mu }}(t)={\underset {\tilde {\mu }}{\operatorname {arg\,min} }}\{F({\tilde {s}}(t),{\tilde {\mu }})\}} G ( s ~ , x ~ , u ~ ) = − ln ⁡ p ( s ~ , x ~ , u ~ | m ) {\displaystyle G({\tilde {s}},{\tilde {x}},{\tilde {u}})=-\ln p({\tilde {s}},{\tilde {x}},{\tilde {u}}\vert m)} Denote the Shannon entropy of the density q {\displaystyle q} by H [ q ] = E q [ − log ⁡ ( q ) ] {\displaystyle H[q]=E_{q}[-\log(q)]} . We can then write the variational free energy in two ways: F ( s ~ , μ ~ ) = E q [ G ( s ~ , x ~ , u ~ ) ] − H [ q ( x ~ , u ~ | μ ~ ) ] = − ln ⁡ p ( s ~ | m ) + D K L [ q ( x ~ , u ~ | μ ~ ) | | p ( x ~ , u ~ | s ~ , m ) ] {\displaystyle F({\tilde {s}},{\tilde {\mu }})=E_{q}[G({\tilde {s}},{\tilde {x}},{\tilde {u}})]-H[q({\tilde {x}},{\tilde {u}}\vert {\tilde {\mu }})]=-\ln p({\tilde {s}}\vert m)+D_{KL}[q({\tilde {x}},{\tilde {u}}\vert {\tilde {\mu }})\vert \vert p({\tilde {x}},{\tilde {u}}\vert {\tilde {s}},m)]} The second equality shows that minimizing variational free energy (i) minimizes the Kullback-Leibler divergence between the variational and true posterior density and (ii) renders the variational free energy (a bound approximation to) the negative log evidence (because the divergence can never be less than zero). Under the Laplace assumption q ( x ~ , u ~ ∣ μ ~ ) = N ( μ ~ , C ) {\displaystyle q({\tilde {x}},{\tilde {u}}\mid {\tilde {\mu }})={\mathcal {N}}({\tilde {\mu }},C)} the variational density is Gaussian and the precision that minimizes free energy is C − 1 = Π = ∂ μ ~ μ ~ G ( μ ~ ) {\displaystyle C^{-1}=\Pi =\partial _{{\tilde {\mu }}{\tilde {\mu }}}G({\tilde {\mu }})} . This means that free-energy can be expressed in terms of the variational mean (omitting constants): F = G ( μ ~ ) + 1 2 ln ⁡ | ∂ μ ~ μ ~ G ( μ ~ ) | {\displaystyle F=G({\tilde {\mu }})+\textstyle {1 \over 2}\ln \vert \partial _{{\tilde {\mu }}{\tilde {\mu }}}G({\tilde {\mu }})\vert } The variational means that minimize the (path integral) of free energy can now be recovered by solving the generalized filter: μ ~ ˙ = D μ ~ − ∂ μ ~ F ( s ~ , μ ~ ) {\displaystyle {\dot {\tilde {\mu }}}=D{\tilde {\mu }}-\partial _{\tilde {\mu }}F({\tilde {s}},{\tilde {\mu }})} where D {\displaystyle D} is a block matrix derivative operator of identify matrices such that D u ~ = [ u ′ , u ″ , … ] T {\displaystyle D{\tilde {u}}=[u',u'',\ldots ]^{T}} === Variational basis === Generalized filtering is based on the following lemma: The self-consistent solution to μ ~ ˙ = D μ ~ − ∂ μ ~ F ( s , μ ~ ) {\displaystyle {\dot {\tilde {\mu }}}=D{\tilde {\mu }}-\partial _{\tilde {\mu }}F(s,{\tilde {\mu }})} satisfies the variational principle of stationary action, where action is the path integral of variational free energy S = ∫ d t F ( s ~ ( t ) , μ ~ ( t ) ) {\displaystyle S=\int dt\,F({\tilde {s}}(t),{\tilde {\mu }}(t))} Proof: self-consistency requires the motion of the mean to be the mean of the motion and (by the fundamental lemma of variational calculus) μ ~ ˙ = D μ ~ ⇔ ∂ μ ~ F ( s ~ , μ ~ ) = 0 ⇔ δ μ ~ S = 0 {\displaystyle {\dot {\tilde {\mu }}}=D{\tilde {\mu }}\Leftrightarrow \partial _{\tilde {\mu }}F({\tilde {s}},{\tilde {\mu }})=0\Leftrightarrow \delta _{\tilde {\mu }}S=0} Put simply, small perturbations to the path of the mean do not change variational free energy and it has the least action of all possible (local) paths. Remarks: Heuristically, generalized filtering performs a gradient descent on variational free energy in a moving frame of reference: μ ~ ˙ − D μ ~ = − ∂ μ ~ F ( s , μ ~ ) {\displaystyle {\dot {\tilde {\mu }}}-D{\tilde {\mu }}=-\partial _{\tilde {\mu }}F(s,{\tilde {\mu }})} , where the frame itself minimizes variational free energy. For a related example in statistical physics, see Kerr and Graham who use ensemble dynamics in generalized coordinates to provide a generalized phase-space version of Langevin and associated Fokker-Planck equations. In practice, generalized filtering uses local linearization over intervals Δ t {\displaystyle \Delta t} to recover discrete updates Δ μ ~ = ( exp ⁡ ( Δ t ⋅ J ) − I ) J − 1 μ ~ ˙ J = ∂ μ ~ μ ~ ˙ = D − ∂ μ ~ μ ~ F ( s ~ , μ ~ ) {\displaystyle {\begin{aligned}\Delta {\tilde {\mu }}&=(\exp(\Delta t\cdot J)-I)J^{-1}{\dot {\tilde {\mu }}}\\J&=\partial _{\tilde {\mu }}{\dot {\tilde {\mu }}}=D-\partial _{{\tilde {\mu }}{\tilde {\mu }}}F({\tilde {s}},{\tilde {\mu }})\end{aligned}}} This updates the means of hidden variables at each interval (usually the interval between observations). == Generative (state-space) models in generalized coordinates == Usually, the generative density or model is specified in terms of a nonlinear input-state-output model with continuous nonlinear functions: s = g ( x , u ) + ω s x ˙ = f ( x , u ) + ω x {\displaystyle {\begin{aligned}s&=g(x,u)+\omega _{s}\\{\dot {x}}&=f(x,u)+\omega _{x}\end{aligned}}} The corresponding generalized model (under local linearity assumptions) obtains the from the chain rule s ~ = g ~ ( x ~ , u ~ ) + ω ~ s s = g ( x , u ) + ω s s ′ = ∂ x g ⋅ x ′ + ∂ u g ⋅ u ′ + ω s ′ s ″ = ∂ x g ⋅ x ″ + ∂ u g ⋅ u ″ + ω s ″ ⋮ x ~ ˙ = f ~ ( x ~ , u ~ ) + ω ~ x x ˙ = f ( x , u ) + ω x x ˙ ′ = ∂ x f ⋅ x ′ + ∂ u f ⋅ u ′ + ω x ′ x ˙ ″ = ∂ x f ⋅ x ″ + ∂ u f ⋅ u ″ + ω x ″ ⋮ {\displaystyle {\begin{aligned}{\tilde {s}}&={\tilde {g}}({\tilde {x}},{\tilde {u}})+{\tilde {\omega }}_{s}\\\\s&=g(x,u)+\omega _{s}\\s'&=\partial _{x}g\cdot x'+\partial _{u}g\cdot u'+\omega '_{s}\\s''&=\partial _{x}g\cdot x''+\partial _{u}g\cdot u''+\omega ''_{s}\\&\vdots \\\end{aligned}}\qquad {\begin{aligned}{\dot {\tilde {x}}}&={\tilde {f}}({\tilde {x}},{\tilde {u}})+{\tilde {\omega }}_{x}\\\\{\dot {x}}&=f(x,u)+\omega _{x}\\{\dot {x}}'&=\partial _{x}f\cdot x'+\partial _{u}f\cdot u'+\omega '_{x}\\{\dot {x}}''&=\partial _{x}f\cdot x''+\partial _{u}f\cdot u''+\omega ''_{x}\\&\vdots \end{aligned}}} Gaussian assumptions about the random fluctuations ω {\displaystyle \omega } then prescribe the likelihood and empirical priors on the motion of hidden states p ( s ~ , x ~ , u ~ | m ) = p ( s ~ | x ~ , u ~ , m ) p ( D x ~ | x , u ~ , m ) p ( x | m ) p ( u ~ | m ) p ( s ~ | x ~ , u ~ , m ) = N ( g ~ ( x ~ , u ~ ) , Σ ~ ( x ~ , u ~ ) s ) p ( D x ~ | x , u ~ , m ) = N ( f ~ ( x ~ , u ~ ) , Σ ~ ( x ~ , u ~ ) x ) {\displayst

    Read more →