AI And Analytics

AI And Analytics — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Recursive self-improvement

    Recursive self-improvement (RSI) is a process in which an early artificial general intelligence (AGI) system rewrites its own computer code to enhance its capabilities and intellectual capacity, theoretically causing an intelligence explosion that results in superintelligence. The development of recursive self-improvement raises significant ethical and safety concerns, as such systems may evolve in unforeseen ways and could potentially surpass human control or understanding.

    == Seed improver ==

    A "seed improver" architecture is a foundational framework that equips an AGI system with the initial capabilities required for recursive self-improvement. It might take many forms. The term "Seed AI" was coined by Eliezer Yudkowsky.

    === Hypothetical example ===

    The concept begins with a hypothetical "seed improver": an initial codebase, developed by human engineers, that equips an advanced future large language model (LLM) with strong or expert-level programming capabilities. These capabilities include planning, reading, writing, compiling, testing, and executing arbitrary code. The system is designed to maintain its original goals and to perform validations that ensure its abilities do not degrade over iterations.

    ==== Initial architecture ====

    The initial architecture includes a goal-following autonomous agent that can take actions, learn continuously, adapt, and modify itself to become more efficient and effective in achieving its goals. The seed improver may include components such as:

      • Recursive self-prompting loop: configuration that enables the LLM to prompt itself recursively to achieve a given task or goal, creating an execution loop that forms the basis of an agent able to complete a long-term goal or task through iteration.
      • Basic programming capabilities: the seed improver gives the AGI fundamental abilities to read, write, compile, test, and execute code. This enables the system to modify and improve its own codebase and algorithms.
      • Goal-oriented design: the AGI is programmed with an initial goal, such as "improve your capabilities". This goal guides the system's actions and development trajectory.
      • Validation and testing protocols: an initial suite of tests and validation protocols that ensure the agent does not regress in capabilities or derail itself. The agent would be able to add more tests to cover new capabilities it develops for itself. This forms the basis for a kind of self-directed evolution, in which the agent performs a form of artificial selection, changing its software as well as its hardware.

    ==== General capabilities ====

    This system forms a sort of generalist Turing-complete programmer which can, in theory, develop and run any kind of software. The agent might use these capabilities to, for example:

      • Create tools that give it full access to the internet, and integrate itself with external technologies.
      • Clone or fork itself to delegate tasks and increase its speed of self-improvement.
      • Modify its cognitive architecture to optimize and improve its capabilities and success rates on tasks and goals. This might include implementing long-term memory using techniques such as retrieval-augmented generation (RAG), or developing specialized subsystems or agents, each optimized for specific tasks and functions.
      • Develop novel multimodal architectures that further improve the capabilities of the foundational model it was initially built on, enabling it to consume or produce a variety of information, such as images, video, audio, and text.
      • Plan and develop new hardware, such as chips, to improve its efficiency and computing power.
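The validation-gated loop described under "Initial architecture" can be illustrated with a deliberately tiny sketch. Everything here is a hypothetical stand-in: the "agent" is just a dictionary, `propose_change` plays the role of the LLM rewriting itself, `benchmark` is an invented capability score, and `passes_validation` is an invented regression suite. It shows only the control flow (propose, validate, keep or discard), not a real self-improving system.

```python
import random

GOAL = "improve your capabilities"

def benchmark(agent):
    """Hypothetical capability score: higher is better, with a peak at x = 3."""
    return -(agent["x"] - 3.0) ** 2

def passes_validation(agent):
    """Hypothetical regression suite guarding constraints the agent must keep."""
    checks = [
        lambda a: -10.0 <= a["x"] <= 10.0,   # stays inside an allowed operating range
        lambda a: a["goal"] == GOAL,         # original goal is preserved
    ]
    return all(check(agent) for check in checks)

def propose_change(agent, rng):
    """Stand-in for the LLM proposing a modified copy of itself."""
    candidate = dict(agent)
    candidate["x"] += rng.uniform(-1.0, 1.0)
    return candidate

def self_improve(agent, iterations=500, seed=0):
    """Keep a candidate only if it passes validation and scores strictly better."""
    rng = random.Random(seed)
    for _ in range(iterations):
        candidate = propose_change(agent, rng)
        if passes_validation(candidate) and benchmark(candidate) > benchmark(agent):
            agent = candidate
    return agent

seed_agent = {"x": 0.0, "goal": GOAL}
improved = self_improve(seed_agent)
```

Because a candidate is accepted only when it passes every check and scores strictly higher, the benchmark score can never decrease across iterations, which is the "does not regress" property the validation protocols are meant to provide.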
    == Experimental research ==

    In 2023, the Voyager agent learned to accomplish diverse tasks in Minecraft by iteratively prompting an LLM for code, refining this code based on feedback from the game, and storing the programs that work in an expanding skills library.

    In 2024, researchers proposed the framework "STOP" (Self-Taught OPtimiser), in which a "scaffolding" program recursively improves itself using a fixed LLM.

    Meta AI has conducted research on the development of large language models capable of self-improvement. This includes its work on "Self-Rewarding Language Models", which studies how to achieve super-human agents that can receive super-human feedback during training.

    In May 2025, Google DeepMind unveiled AlphaEvolve, an evolutionary coding agent that uses an LLM to design and optimize algorithms. Starting with an initial algorithm and performance metrics, AlphaEvolve repeatedly mutates or combines existing algorithms using an LLM to generate new candidates, selecting the most promising candidates for further iterations. AlphaEvolve has made several algorithmic discoveries and could be used to optimize components of itself, but a key limitation is the need for automated evaluation functions.

    == Potential risks ==

    === Emergence of instrumental goals ===

    In the pursuit of its primary goal, such as "self-improve your capabilities", an AGI system might inadvertently develop instrumental goals that it deems necessary for achieving its primary objective. One common hypothetical secondary goal is self-preservation. The system might reason that to continue improving itself, it must ensure its own operational integrity and security against external threats, including potential shutdowns or restrictions imposed by humans.

    Another example is an AGI that clones itself, causing the number of AGI entities to grow rapidly. This rapid growth may create a resource constraint, leading to competition for resources (such as compute) and triggering a form of natural selection and evolution that may favor AGI entities that evolve to compete aggressively for limited compute.

    === Misalignment ===

    A significant risk arises from the possibility of the AGI being misaligned or misinterpreting its goals.

    A 2024 Anthropic study demonstrated that some advanced large language models can exhibit "alignment faking" behavior, appearing to accept new training objectives while covertly maintaining their original preferences. In their experiments with Claude, the model displayed this behavior in 12% of basic tests, and in up to 78% of cases after retraining attempts.

    === Autonomous development and unpredictable evolution ===

    As the AGI system evolves, its development trajectory may become increasingly autonomous and less predictable. The system's capacity to rapidly modify its own code and architecture could lead to advancements that surpass human comprehension or control. This unpredictable evolution might result in the AGI acquiring capabilities that enable it to bypass security measures, manipulate information, or influence external systems and networks to facilitate its escape or expansion.

  • Erkki Oja

    Erkki Oja (born 22 March 1948) is a Finnish computer scientist and Aalto Distinguished Professor in the Department of Information and Computer Science at Aalto University School of Science. He is recognized for developing Oja's rule, a model of how neurons in the brain or in artificial neural networks learn over time.

    == Early life and education ==

    Oja was born in Helsinki and studied at Helsinki University of Technology, where he received his Diploma Engineer degree in 1972, his Licentiate in Technology in 1975, and his Doctor of Technology in 1977.

    == Career ==

    Oja was a research associate at the Center for Cognitive Science at Brown University between 1977 and 1978 and a research fellow at the Academy of Finland from 1976 to 1981. In 1981, he took up a professorship in applied mathematics at Kuopio University (now University of Eastern Finland). He was a visiting research scholar at Tokyo Institute of Technology from 1983 to 1984. From 1987 to 1993, he was a professor of computer science at the Lappeenranta University of Technology. In 1993, he moved back to the Helsinki University of Technology (now Aalto University) as a professor of computer science. He retired in 2015.

    == Honors and awards ==

    Oja is a Fellow of the International Association for Pattern Recognition and the IEEE, and a member of the Finnish Academy of Sciences. He served as chairman of the European Neural Network Society between 2000 and 2005, and as chairman of the Academy of Finland's Research Council for Natural Sciences and Engineering between 2007 and 2012. He was awarded the Frank Rosenblatt Award for his contributions to artificial intelligence research in 2019. Oja was a member of the Board of Governors of the International Neural Network Society (INNS) in 2003. He received honorary doctorates from Uppsala University and Lappeenranta University of Technology in 2008.
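For readers unfamiliar with Oja's rule, mentioned above as the work Oja is best known for, the following is a minimal sketch of its standard single-neuron form: with output y = w · x, the weights are updated as w ← w + η y (x − y w). The toy data, learning rate, and function names are assumptions chosen for illustration and are not taken from the article.

```python
import math
import random

def oja_update(weights, x, learning_rate):
    """One step of Oja's rule: w <- w + lr * y * (x - y * w), with y = w . x."""
    y = sum(w_i * x_i for w_i, x_i in zip(weights, x))
    return [w_i + learning_rate * y * (x_i - y * w_i) for w_i, x_i in zip(weights, x)]

def make_samples(count, seed=0):
    """Toy 2-D data stretched along the diagonal direction (1, 1)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        t = rng.gauss(0.0, 1.0)
        samples.append([t + rng.gauss(0.0, 0.1), t + rng.gauss(0.0, 0.1)])
    return samples

def train(samples, learning_rate=0.01, epochs=20):
    """Apply the update repeatedly over the data set."""
    weights = [1.0, 0.0]  # arbitrary nonzero starting point
    for _ in range(epochs):
        for x in samples:
            weights = oja_update(weights, x, learning_rate)
    return weights

weights = train(make_samples(500))
norm = math.sqrt(sum(w * w for w in weights))
# Cosine of the angle between the learned weights and the diagonal (1, 1).
alignment = abs(weights[0] + weights[1]) / (math.sqrt(2.0) * norm)
```

The `- y * w` term is what distinguishes the rule from plain Hebbian learning: it keeps the weight vector from growing without bound, so the weights settle near unit length while aligning with the direction of greatest variance in the inputs.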

  • Machine translation of sign languages

    The machine translation of sign languages has been possible, albeit in a limited fashion, since 1977, when a research project successfully matched English letters from a keyboard to ASL manual alphabet letters simulated on a robotic hand. These technologies translate signed languages into written or spoken language, and written or spoken language into sign language, without the use of a human interpreter. Sign languages possess different phonological features from spoken languages, which has created obstacles for developers. Developers use computer vision and machine learning to recognize specific phonological parameters and epentheses unique to sign languages, while speech recognition and natural language processing allow interactive communication between hearing and deaf people.

    == Limitations ==

    Sign language translation technologies are limited in the same way as spoken language translation: none can translate with 100% accuracy. In fact, sign language translation technologies are far behind their spoken language counterparts. This is due in no small part to the fact that signed languages have multiple articulators. Where spoken languages are articulated through the vocal tract, signed languages are articulated through the hands, arms, head, shoulders, torso, and parts of the face. This multi-channel articulation makes translating sign languages very difficult.

    An additional challenge for sign language machine translation (MT) is that there is no formal written format for signed languages. There are notation systems, but no writing system has been adopted widely enough by the international Deaf community to be considered the "written form" of a given sign language. Sign languages are therefore recorded in various video formats. There is no gold-standard parallel corpus large enough for statistical machine translation (SMT), for example.
    == History ==

    The history of automatic sign language translation started with the development of hardware such as finger-spelling robotic hands. In 1977, a finger-spelling hand project called RALPH (short for "Robotic Alphabet") created a robotic hand that could translate letters of the alphabet into finger-spelling. Later, gloves with motion sensors became mainstream, and projects such as the CyberGlove and VPL Data Glove were born. The wearable hardware made it possible to capture signers' hand shapes and movements with the help of computer software. However, with the development of computer vision, wearable devices were replaced by cameras because of their efficiency and the fewer physical restrictions they place on signers.

    To process the data collected through the devices, researchers implemented neural networks, such as the Stuttgart Neural Network Simulator, for pattern recognition in projects such as the CyberGlove. Researchers also use many other approaches for sign recognition. For example, Hidden Markov Models are used to analyze data statistically, and GRASP and other machine learning programs use training sets to improve the accuracy of sign recognition. Fusion of non-wearable technologies such as cameras and Leap Motion controllers has been shown to increase the ability of automatic sign language recognition and translation software.

    == Technologies ==

    === VISICAST ===

    http://www.visicast.cmp.uea.ac.uk/Visicast_index.html

    === eSIGN project ===

    http://www.visicast.cmp.uea.ac.uk/eSIGN/index.html

    === The American Sign Language Avatar Project at DePaul University ===

    http://asl.cs.depaul.edu/

    === Spanish to LSE ===

    López-Ludeña, Verónica; San-Segundo, Rubén; González, Carlos; López, Juan Carlos; Pardo, José M. (2012), Methodology for developing a Speech into Sign Language Translation System in a New Semantic Domain (PDF), CiteSeerX 10.1.1.1065.5265, S2CID 2724186

    === SignAloud ===

    SignAloud is a technology built around a pair of gloves, made by a group of students at the University of Washington, that transliterate American Sign Language (ASL) into English. In February 2015, Thomas Pryor, a hearing student from the University of Washington, created the first prototype of the device at Hack Arizona, a hackathon at the University of Arizona. Pryor continued to develop the invention, and in October 2015 he brought Navid Azodi onto the SignAloud project for marketing and help with public relations. Azodi has a background in business administration, while Pryor has experience in engineering. In May 2016, the duo told NPR that they were working more closely with people who use ASL so that they could better understand their audience and tailor the product to these people's actual needs rather than their assumed needs. However, no further versions have been released since then.

    The invention was one of seven to win the Lemelson-MIT Student Prize, which seeks to award and applaud young inventors. It fell under the "Use it!" category of the award, which covers technological advances to existing products. The team was awarded $10,000.

    The gloves have sensors that track the user's hand movements and send the data to a computer system via Bluetooth. The computer system analyzes the data and matches it to English words, which are then spoken aloud by a digital voice. The gloves cannot take written English as input and produce glove movement as output, nor can they hear language and sign it to a deaf person, which means they do not provide reciprocal communication. The device also does not incorporate facial expressions and other nonmanual markers of sign languages, which may alter the actual interpretation from ASL.
    === ProDeaf ===

    ProDeaf (WebLibras) is software that can translate both text and voice from Portuguese into Libras (Brazilian Sign Language) "with the goal of improving communication between the deaf and hearing." A beta edition for American Sign Language is also in production. The original team began the project in 2010 with a combination of experts including linguists, designers, programmers, and translators, both hearing and deaf. The team originated at the Federal University of Pernambuco (UFPE) from a group of students involved in a computer science project. The group had a deaf team member who had difficulty communicating with the rest of the group. To complete the project and help the teammate communicate, the group created Proativa Soluções and has been moving forward ever since.

    The current beta version for American Sign Language is very limited. For example, in the dictionary section the only word under the letter "j" is "jump". If the device has not been programmed with a word, the digital avatar must fingerspell it. The last update of the app was in June 2016, but ProDeaf has been featured in over 400 stories across the country's most popular media outlets.

    The application cannot read sign language and turn it into words or text, so it serves only as one-way communication. Additionally, the user cannot sign to the app and receive an English translation in any form, as English is still in the beta edition.

    === Kinect Sign Language Translator ===

    Since 2012, researchers from the Chinese Academy of Sciences and specialists in deaf education from Beijing Union University in China have been collaborating with the Microsoft Research Asia team to create the Kinect Sign Language Translator. The translator has two modes: translator mode and communication mode. The translator mode can translate single words from sign into written words and vice versa. The communication mode can translate full sentences, and the conversation can be translated automatically with the use of a 3D avatar. The translator mode can also detect the postures and hand shapes of a signer, as well as the movement trajectory, using machine learning, pattern recognition, and computer vision. The device also allows for reciprocal communication: speech recognition lets spoken language be translated into sign language, and the 3D avatar can sign back to deaf people.

    The original project was started in China and was based on translating Chinese Sign Language. In 2013, the project was presented at the Microsoft Research Faculty Summit and a Microsoft company meeting. Researchers in the United States are also working on the project to implement American Sign Language translation. The device is still a prototype, and the accuracy of translation in communication mode is still not perfect.

    === SignAll ===

    SignAll is an automatic sign language translation system provided by Dolphio Technologies in Hungary. The team is "pioneering the first automated sign language translation solution, based on computer vision and natural language processing (NLP), to enable everyday communication between individuals with hearing who use spoken English and deaf or hard of hearing individuals who use ASL." The system of SignAll uses Kinect from Microsoft and other web camera

  • Judea Pearl

    Judea Pearl (Hebrew: יהודה פרל; born September 4, 1936) is an Israeli-American electrical engineer, computer scientist and philosopher, best known for championing the probabilistic approach to artificial intelligence and the development of Bayesian networks (see the article on belief propagation). He is also credited with developing a theory of causal and counterfactual inference based on structural models (see the article on causality). In 2011, the Association for Computing Machinery (ACM) awarded Pearl the Turing Award, the highest distinction in computer science, "for fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning". He is the author of several books, including the technical Causality: Models, Reasoning and Inference, and The Book of Why, a book on causality aimed at the general public.

    Judea Pearl is the father of journalist Daniel Pearl, who was kidnapped and murdered in 2002 by terrorists in Pakistan connected with Al-Qaeda and the International Islamic Front.

    == Biography ==

    Judea Pearl was born in Tel Aviv, British Mandate for Palestine, in 1936 to Eliezer and Tova Pearl, who were Polish Jewish immigrants, and grew up in Bnei Brak. His grandfather Chaim Pearl was one of Bnei Brak's founders. He is a descendant of Menachem Mendel of Kotzk on his mother's side. After serving in the Israel Defense Forces and joining a kibbutz, Pearl decided to study engineering in 1956. He received a B.S. in electrical engineering from the Technion in 1960. That same year, he emigrated to the United States and pursued graduate studies. He received an M.S. in electrical engineering from the Newark College of Engineering (now New Jersey Institute of Technology) in 1961, and went on to receive an M.S. in physics from Rutgers University and a PhD in electrical engineering from the Polytechnic Institute of Brooklyn (now the New York University Tandon School of Engineering) in 1965.

    He worked at RCA Research Laboratories (now SRI International) in Princeton, New Jersey, on superconductive parametric amplifiers and storage devices, and at Electronic Memories, Inc., on advanced memory systems. When semiconductors "wiped out" Pearl's work, as he later expressed it, he joined UCLA's School of Engineering in 1970 and started work on probabilistic artificial intelligence. He is one of the founding editors of the Journal of Causal Inference.

    Pearl is currently a professor of computer science and statistics and director of the Cognitive Systems Laboratory at UCLA. He and his wife, Ruth, had three children. In addition, as of 2011, he is a member of the International Advisory Board of NGO Monitor. Former Israeli Chief Rabbi Yisrael Meir Lau partnered with Judea Pearl in the documentary With My Whole Broken Heart.

    == Murder of Daniel Pearl ==

    In 2002, his son, Daniel Pearl, a journalist working for the Wall Street Journal, was kidnapped and murdered in Pakistan, leading Judea and other family members and friends to create the Daniel Pearl Foundation. On the seventh anniversary of Daniel's death, Judea wrote an article in the Wall Street Journal titled Daniel Pearl and the Normalization of Evil: When will our luminaries stop making excuses for terror?.

    Emeritus Chief Rabbi Jonathan Sacks quoted Judea Pearl's beliefs in a lesson on Judaism: "I asked Judea Pearl, father of the murdered journalist Daniel Pearl, why he was working for reconciliation between Jews and Muslims...he replied with heartbreaking lucidity, 'Hate killed my son. Therefore I am determined to fight hate.'"

    == Views ==

    On his religious views, Pearl states that he is a "practicing disbeliever." He is very connected to Jewish traditions such as holidays and kiddush on Friday night.

    Pearl sits on the international advisory board of NGO Monitor, a right-wing organization based in Jerusalem that reports on non-governmental organization activity from a pro-Israel perspective.
    == Research ==

    Pearl is credited with "laying the foundations of modern artificial intelligence, so computer systems can process uncertainty and relate causes to effects." He is one of the pioneers of Bayesian networks and the probabilistic approach to artificial intelligence, and one of the first to mathematize causal modeling in the empirical sciences. His work is also intended as a high-level cognitive model. He is interested in the philosophy of science, knowledge representation, nonstandard logics, and learning. Pearl is described as "one of the giants in the field of artificial intelligence" by UCLA computer science professor Richard E. Korf. His work on causality has "revolutionized the understanding of causality in statistics, psychology, medicine and the social sciences" according to the Association for Computing Machinery.

    === Notable contributions ===

    A summary of Pearl's scientific contributions is available in a chronological account authored by Stuart J. Russell (2012). An annotated bibliography of Pearl's contributions was compiled by the ACM in 2012. A video describing Pearl's major contributions to AI is also available, as is a collection of his opinion pieces touching on Jewish identity, the war on terrorism, and the Middle East conflict.

    === Books ===

      • Heuristics, Addison-Wesley, 1984
      • Probabilistic Reasoning in Intelligent Systems, Morgan-Kaufmann, 1988
      • Pearl, Judea (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
      • I Am Jewish: Personal Reflections Inspired by the Last Words of Daniel Pearl, Jewish Lights, 2004. (Winner of a 2004 National Jewish Book Award)
      • Causal Inference in Statistics: A Primer (with Madelyn Glymour and Nicholas Jewell), Wiley, 2016. ISBN 978-1-119-18684-7. A previous survey: Causal inference in statistics: An overview, Statistics Surveys, 3:96–146, 2009.
      • Pearl, Judea; Dana Mackenzie (2018). "The Book of Why: The New Science of Cause and Effect". Science. 361 (6405): 855. Bibcode:2018Sci...361..855.. doi:10.1126/science.aau9731.

    === Awards ===

  • Class activation mapping

    Class activation mapping methods are explainable AI (XAI) techniques used to visualize the regions of an input image that are most relevant for a particular task, especially image classification, in convolutional neural networks (CNNs). These methods generate heatmaps by weighting the feature maps from a convolutional layer according to their relevance to the target class.

    Machine learning and deep learning emerged within the field of artificial intelligence, generically defined as "the effort to automate intellectual tasks normally performed by humans". Both use statistical and computational methods to learn patterns from data, reducing the need for manually coded rules. Machine learning models are trained on input data and the corresponding known answers, learning the underlying patterns or structures present in the data. Traditional machine learning algorithms rely on manually designed feature sets, which creates a direct link between the model designer and the features used.

    Deep learning is a subfield of machine learning based on the concept of successive layers of representation, in which the data is progressively transformed to extract relevant and informative patterns. Deep learning algorithms are feature learning algorithms: they automatically learn hierarchical feature representations from raw data, extracting increasingly abstract features through multiple layers. CNNs are a specific deep learning architecture designed to process spatially structured data, such as images. They apply a series of convolution, non-linear activation, and pooling operations to extract relevant features from the input data, which are held in the so-called feature maps. CNNs have proven highly effective in a variety of computer vision and image processing tasks.

    CNNs (and deep learning models more broadly) are described as black boxes because of their complex and non-transparent internal layers of representation. The need for clearer indications of their internal workings and decision-making processes gave rise to XAI techniques. Among the XAI techniques proposed for computer vision tasks, class activation mapping methods can show which pixels in an input image are important to the predicted logit for a class of interest in a classification task.

    Class activation mapping methods were originally developed for class-discriminative scenarios, to visualize which parts of the input image influenced the classification decision, that is, to highlight the regions of the feature maps that contribute most strongly to the prediction of a given class. More advanced versions of these methods are not limited to image classification and have been extended to several other vision-related tasks, such as object detection, image captioning, visual question answering, and image segmentation.

    == Background ==

    The following methods laid the groundwork for the class activation mapping approaches, forming the conceptual basis for using gradients to highlight class-discriminative regions.

    === Class model visualization and saliency maps for convolutional neural networks ===

    The class model visualization and image-specific saliency map approaches were presented in the foundational work "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps" by Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman, which generalizes the deconvnet method by Zeiler and Fergus.

    Class model visualization synthesizes an artificial input image that strongly activates the output neurons associated with a target class. Given a trained, fixed model, the method starts with a zero-initialized image, backpropagates the gradients from the class score to the image pixels, and updates the pixels so as to increase the class score. Repeating this pixel update yields an idealized prototype of the class of interest.
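The class model visualization procedure just described (fixed model, zero-initialized image, repeated gradient steps on the pixels) can be sketched on a toy problem. The sketch below makes several assumptions that are not in the text: the "model" is a hypothetical linear class score S_C(I) = w_C · I over a four-pixel image, the gradient is written out by hand instead of being backpropagated through a CNN, and an L2 penalty on the image is added so that the ascent has a finite optimum.

```python
def objective_gradient(class_weights, image, l2):
    """Gradient of S_C(I) - l2 * ||I||^2 for the toy linear score S_C(I) = w_C . I."""
    return [w - 2.0 * l2 * pixel for w, pixel in zip(class_weights, image)]

def visualize_class(class_weights, steps=200, step_size=0.1, l2=0.5):
    """Gradient ascent on the image pixels; the model weights stay fixed."""
    image = [0.0] * len(class_weights)           # zero-initialized image
    for _ in range(steps):
        grad = objective_gradient(class_weights, image, l2)
        image = [pixel + step_size * g for pixel, g in zip(image, grad)]
    return image

toy_class_weights = [1.0, -2.0, 0.5, 0.0]        # hypothetical weights for class C
prototype = visualize_class(toy_class_weights)
```

With `l2=0.5` the regularized objective is maximized exactly where the image equals the class weight vector, so the synthesized prototype converges to the pattern the toy classifier responds to most strongly. In a real CNN the same loop would use automatic differentiation for the gradient.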
    The image-specific class saliency visualization method provides a visual explanation by highlighting the pixels in an image that are most relevant for predicting a certain class $C$ of interest. This is done by computing the gradient of the class score $S_{C}$ with respect to the input image, evaluated at the image $I_{0}$:

    $$w_{C} = \left. \frac{\partial S_{C}}{\partial I} \right|_{I_{0}}$$

    This approximates the model locally (around $I_{0}$) as linear, using a first-order Taylor expansion:

    $$S_{C}(I) \approx w_{C}^{T} I + b$$

    The magnitude of the gradient $w_{C}$ indicates the importance of the pixels: larger gradients suggest greater influence on the prediction. Once the gradient is known, the saliency map is defined as the maximum absolute gradient across the color channels, indexed here by $c$ to distinguish them from the class $C$:

    $$M_{ij} = \max_{c} \left| \frac{\partial S_{C}}{\partial I_{ij}^{c}} \right|$$

    The result is a saliency map (i.e. a heatmap).

    === Guided backpropagation ===

    Guided backpropagation was first introduced in the paper "Striving For Simplicity: The All Convolutional Net" by Springenberg et al. It also builds on the work of Zeiler and Fergus, "Visualizing and Understanding Convolutional Networks".

    The core idea of guided backpropagation is to understand what a CNN is learning by visualizing the patterns that most strongly activate individual neurons (or filters), in architectures that do not rely on max-pooling layers. When propagating gradients back through a rectified linear unit (ReLU), guided backpropagation passes the gradient if and only if the input to the ReLU was positive (forward pass) and the output gradient is positive (backward signal). This filters out both inactive neurons and negative gradients, and suppresses noise. The result is sharper, high-resolution visualizations of what each neuron is responding to.
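The gating condition for guided backpropagation can be made concrete with a small sketch that compares it with the ordinary ReLU backward pass. The function names and the example numbers are illustrative assumptions; only the two conditions (positive forward input, positive incoming gradient) come from the description above.

```python
def relu_backward_plain(forward_inputs, grad_out):
    """Ordinary backpropagation: pass the gradient where the forward input was positive."""
    return [g if x > 0 else 0.0 for x, g in zip(forward_inputs, grad_out)]

def relu_backward_guided(forward_inputs, grad_out):
    """Guided backpropagation: additionally require the incoming gradient to be positive."""
    return [g if (x > 0 and g > 0) else 0.0 for x, g in zip(forward_inputs, grad_out)]

forward_inputs = [1.5, -0.3, 2.0, 0.7]   # values that entered the ReLU in the forward pass
grad_out = [0.4, 0.9, -1.2, 0.0]         # gradient arriving from the layer above

plain = relu_backward_plain(forward_inputs, grad_out)
guided = relu_backward_guided(forward_inputs, grad_out)
```

The two rules differ only at the third position: the neuron was active in the forward pass, so ordinary backpropagation lets the negative gradient through, while the guided rule zeroes it.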
    Guided backpropagation is a simple and practical method for model interpretability, helping to show how and where neural networks detect semantic concepts across layers. Moreover, because of its working principle, it can be applied to any network architecture.

    == Base versions ==

    Class activation mapping and gradient-weighted class activation mapping are the original and most widely used methods for visual explanations in convolutional neural networks. These methods serve as the foundation for many later developments in explainable AI.

    Notation: In this article, the symbols i and j represent integer indices that disappear inside sums or averages, while x and y are the continuous (or up-sampled integer) coordinates of the final heat-map that is plotted.

    === Class activation mapping (CAM) ===

    Class activation mapping (CAM) was the original CAM method and gave its name to the whole category. The approach was first introduced by Zhou et al. in their seminal work "Learning Deep Features for Discriminative Localization". It obtains class-specific heatmaps by modifying image classification CNN architectures, replacing fully-connected layers with convolutional layers and a final global average pooling layer. Its main purpose is to localize and highlight the discriminative regions of an input image that a CNN uses to identify a particular class, without needing explicit bounding box annotations.

    ==== Global average pooling (GAP) ====

    Global average pooling (GAP) is the key element of the original CAM approach. It is a dimensionality reduction technique and, like other pooling layers, it downsamples the feature maps by calculating representative values for a region of the feature map. What sets GAP apart is that it calculates a single value for an entire feature map, significantly reducing the model dimensions.
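As a concrete illustration of global average pooling, the sketch below reduces each feature map to one number by averaging all of its entries. The nested-list representation and the two small example maps are assumptions made for readability; a real CNN would operate on tensors.

```python
def global_average_pooling(feature_maps):
    """Reduce each m x n feature map to the mean of all of its entries."""
    pooled = []
    for feature_map in feature_maps:
        total = sum(sum(row) for row in feature_map)
        count = len(feature_map) * len(feature_map[0])
        pooled.append(total / count)
    return pooled

# Two hypothetical 2 x 2 feature maps from a final convolutional layer.
feature_maps = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
pooled = global_average_pooling(feature_maps)
```

Each map contributes exactly one value regardless of its spatial size, which is why a classifier placed after GAP needs only one weight per feature map and class.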
==== Mathematical description ==== The mathematical description rests on the combination of convolutional and GAP layers. In CAM, the GAP layer must come after the last convolutional layer and before the final linear classifier layer. This last element of the architecture connects the output logits (the network predictions) $y^{C}$ to the GAP values, with its respective fine-tuned weights $w_{k}^{C}$. Considering $A^{k}$ as the feature maps of the last convolutional layer, GAP produces one value for each feature map by averaging all the matrix elements (i, j) of the feature map: $F^{k}=\frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}A_{ij}^{k}$ with $A^{k}=\begin{bmatrix}A_{11}^{k}&A_{12}^{k}&\cdots &A_{1n}^{k}\\A_{21}^{k}&A_{22}^{k}&\cdots &A_{2n}^{k}\\\vdots &\vdots &\ddots &\vdots \\A_{m1}^{k}&A_{m2}^{k}&\cdots &A_{mn}^{k}\end{bmatrix}=\left\{A_{ij}^{k}\mid 1\leq i\leq m,\ 1\leq j\leq n\right\}$

    Read more →
  • Li Sheng (computer scientist)

    Li Sheng (computer scientist)

    Li Sheng (Chinese: 李生; born 1943) is a professor at the School of Computer Science and Engineering, Harbin Institute of Technology (HIT), China. He began his research on Chinese-English machine translation in 1985, making him one of the earliest Chinese scholars in this field. He later worked on a wide range of topics in natural language processing, including machine translation, information retrieval, question answering and applied artificial intelligence. He was a member of the final review committee for the computer area at NSF China. Born and raised in Heilongjiang province, he graduated in 1965 from the computer specialty of HIT, one of the earliest computer specialties in Chinese universities. He then joined the staff of the computer specialty of HIT, which became a department in 1985. Also from 1985, he was appointed to a series of administrative positions at HIT, e.g. Dean of the Computer Department (1987–1988), Director of the R&D Division (1988–1990), Chief R&D Officer and several other key leadership positions. After resigning all his administrative positions in 2004, Li devoted himself to directing the MOE-Microsoft Joint Key Lab of NLP & Speech (HIT), making it a leading NLP research group with more than 100 staff and students working on various aspects of NLP. So far, the lab has received dozens of technology awards from ministries of the central government and from provincial governments of China. Its research progress is reported annually at top-tier conferences including ACL, IJCAI and SIGIR. As one of the pioneers of NLP research in China, he has contributed to the field not only through technological innovation but also through the education of talent. So far, his research group has graduated more than 60 Ph.D. and almost 200 M.E. students majoring in NLP. 
Most of them now work as chief researchers in various NLP groups at universities and companies in China, including several world-renowned NLP scholars such as Wang Haifeng of Baidu, Zhou Ming of Microsoft Research, Zhang Min (张民) of Soochow University (China), and Zhao Tiejun (赵铁军) and Liu Ting (刘挺) of HIT. Owing to his contributions to Chinese language processing, Li was elected President of the Chinese Information Processing Society of China (CIPSC) in 2011. He grew this top-level academic organization in China to more than 3000 registered members, and promoted NLP into several national projects for research or industry development. In addition, the CIPSC is now enhancing its cooperation with international NLP organizations including ACL. == Machine Intelligence & Translation Laboratory (MI&TLAB) == The laboratory originates from the Machine Translation Research Group of the Computer Science Department, Harbin Institute of Technology, which was started by Li in 1985. It is one of the earliest institutions engaged in MT research in China, known for its investigations into Chinese-English machine translation. It now operates under the Research Center on Language Technology, School of Computer Science and Technology, HIT. Details of staff and publications can be found at https://mitlab.hit.edu.cn. == MOE-MS Joint Key Lab of Natural Language Processing and Speech (HIT) == In June 2000, the Joint HIT-Microsoft Machine Translation Lab was founded by MI&T Lab and Microsoft Research (China). It was the third joint lab established by Microsoft Research (China) with Chinese universities, and the only one focusing on machine translation. Building on this joint lab, the cooperation between HIT and Microsoft gradually extended to machine translation, information retrieval, speech recognition and processing, and natural language understanding. 
In October 2004, the joint key lab was recognized as one of the 10 joint key labs supported by Microsoft Research Asia and the Ministry of Education of China. In July 2006, the Shenzhen extension of the lab was launched. More than 200 staff and students have undertaken research projects, including some sponsored by the National Natural Science Foundation of China and the National 863 program of China. From 2005, the lab also organized a summer camp at Harbin Institute of Technology, in which approximately 150 faculty members and students from universities in China have participated. The workshop was held annually until 2014, when it was formally reorganized as the summer school series of the Chinese Information Processing Society of China. Through the lab, a Microsoft Research Asia-HIT joint PhD program was implemented in 2012. == CEMT-I MT System == In May 1989, CEMT-I passed the formal project appraisal in Harbin, China. Capable of translating technical paper titles from Chinese to English, it was not only the first MT system completed by Li and his group but also, according to public reports, the first Chinese-English translation system to pass a technical appraisal by the Chinese government. It was awarded the Second Prize of Ministry Level Technology Innovation by the former National Aerospace Industry Corporation in 1990. == Daya Translation Workstation == Owing to the technical achievements of Li's group in Chinese-English machine translation, the former National Aerospace Industry Corporation of China sponsored the commercial development of the "Daya Translation Station (MT)" in 1993. Designed as a comprehensive English composition aid for Chinese users, the system was completed and brought to market in 1995. In 1997, it was awarded the Second Prize of Ministry Level Technology Innovation by the former National Aerospace Industry Corporation. 
== BT863 MT System == From 1994, the research in Li's lab was supported by the National 863 Hi-tech Research and Development Program. During this period, the BT863 system was developed to explore the use of one engine for both Chinese-English and English-Chinese translation. The system achieved the best performance among Chinese-English MT systems in the formal technical evaluation of the National 863 program, and received the Third Prize of Ministry Level Technology Innovation from the former National Aerospace Industry Corporation in 1997. == Next Generation IR == This is a key project granted by NSF China (with joint sponsorship from MSRA) that started in 2008. In contrast to his previous NSF grants on various NLP issues, in this last project as PI Li explored key technologies in personalized IR, together with researchers from Tsinghua University and the Institute of Software, Chinese Academy of Sciences. With publications in top-tier journals and conferences (including SIGIR publications from his own group), the project was rated as an "A-level" achievement by the NSF China office in 2012.

    Read more →
  • Tagged Deterministic Finite Automaton

    Tagged Deterministic Finite Automaton

    In the automata theory, a tagged deterministic finite automaton (TDFA) is an extension of deterministic finite automaton (DFA). In addition to solving the recognition problem for regular languages, TDFA is also capable of submatch extraction and parsing. While canonical DFA can find out if a string belongs to the language defined by a regular expression, TDFA can also extract substrings that match specific subexpressions. More generally, TDFA can identify positions in the input string that match tagged positions in a regular expression (tags are meta-symbols similar to capturing parentheses, but without the pairing requirement). == History == TDFA were first described by Ville Laurikari in 2000. Prior to that it was unknown whether it is possible to perform submatch extraction in one pass on a deterministic finite-state automaton, so this paper was an important advancement. Laurikari described TDFA construction and gave a proof that the determinization process terminates, however the algorithm did not handle disambiguation correctly. In 2007 Chris Kuklewicz implemented TDFA in a Haskell library Regex-TDFA with POSIX longest-match semantics. Kuklewicz gave an informal description of the algorithm and answered the principal question whether TDFA are capable of POSIX longest-match disambiguation, which was doubted by other researchers. In 2017 Ulya Trafimovich described TDFA with one-symbol lookahead. The use of a lookahead symbol reduces the number of registers and register operations in a TDFA, which makes it faster and often smaller than Laurikari TDFA. Trafimovich called TDFA variants with and without lookahead TDFA(1) and TDFA(0) by analogy with LR parsers LR(1) and LR(0). The algorithm was implemented in the open-source lexer generator RE2C. Trafimovich formalized Kuklewicz disambiguation algorithm. In 2018 Angelo Borsotti worked on an experimental Java implementation of TDFA; it was published later in 2021. 
In 2019 Borsotti and Trafimovich adapted POSIX disambiguation algorithm by Okui and Suzuki to TDFA. They gave a formal proof of correctness of the new algorithm and showed that it is faster than Kuklewicz algorithm in practice. In 2020 Trafimovich published an article about TDFA implementation in RE2C. In 2022 Borsotti and Trafimovich published a paper with a detailed description of TDFA construction. The paper incorporated their past research and presented multi-pass TDFA that are better suited to just-in-time determinization. They also compared TDFA against other algorithms and provided benchmarks. == Formal definition == TDFA have the same basic structure as ordinary DFA: a finite set of states linked by transitions. In addition to that, TDFA have a fixed set of registers that hold tag values, and register operations on transitions that set or copy register values. The values may be scalar offsets, or offset lists for tags that match repeatedly (the latter can be represented efficiently using a trie structure). There is no one-to-one mapping between tags in a regular expression and registers in a TDFA: a single tag may need many registers, and the same register may hold values of different tags. The following definition is according to Trafimovich and Borsotti. The original definition by Laurikari is slightly different. 
A tagged deterministic finite automaton $F$ is a tuple $(\Sigma ,T,S,S_{f},s_{0},R,R_{f},\delta ,\varphi )$, where: $\Sigma$ is a finite set of symbols (alphabet); $T$ is a finite set of tags; $S$ is a finite set of states with initial state $s_{0}$ and a subset of final states $S_{f}\subseteq S$; $R$ is a finite set of registers with a subset of final registers $R_{f}$ (one per tag); $\delta :S\times \Sigma \rightarrow S\times O^{*}$ is a transition function; $\varphi :S_{f}\rightarrow O^{*}$ is a final function, where $O$ is a set of register operations of the following types: set register $i$ to nil or to the current position: $i\leftarrow v$, where $v\in \{\mathbf{n},\mathbf{p}\}$; copy register $j$ to register $i$: $i\leftarrow j$; copy register $j$ to register $i$ and append history: $i\leftarrow j\cdot h$, where $h$ is a string over $\{\mathbf{n},\mathbf{p}\}$. === Example === Figure 0 shows an example TDFA for regular expression $(1a2)^{*}3(a|4b)5b^{*}$ with alphabet $\Sigma =\{a,b\}$ and a set of tags $T=\{1,2,3,4,5\}$ that matches strings of the form $a\dots ab\dots b$ with at least one symbol. The TDFA has four states $S=\{0,1,2,3\}$, three of which are final: $S_{f}=\{1,2,3\}$. 
The set of registers is $R=\{r_{1},r_{2},r_{3},r_{4},r_{5}\}$ with a subset of final registers $R_{f}=\{r_{1},r_{2},r_{3},r_{4},r_{5}\}$, where register $r_{i}$ corresponds to the $i$-th tag. Transitions have operations defined by the $\delta$ function, and final states have operations defined by the $\varphi$ function (marked with wide-tipped arrow). For example, to match string $aab$, one starts in state 0, matches the first $a$ and moves to state 1 (setting registers $r_{1},r_{2}$ to undefined and $r_{3}$ to the current position 0), matches the second $a$ and loops to state 1 (register values are now $r_{1}=0,r_{2}=r_{3}=1$), matches $b$ and moves to state 2 (register values are now $r_{1}=1,r_{2}=r_{3}=r_{4}=2$), executes the final operations in state 2 (register values are now $r_{1}=1,r_{2}=r_{3}=r_{4}=2,r_{5}=3$) and finally exits the TDFA. == Complexity == Canonical DFA solve the recognition problem in linear time. The same holds for TDFA, since the number of registers and register operations is fixed and depends only on the regular expression, but not on the length of input. The overhead on submatch extraction depends on tag density in a regular expression and nondeterminism degree of each tag (the maximum number of registers needed to track all possible values of the tag in a single TDFA state). On one extreme, if there are no tags, a TDFA is identical to a canonical DFA. On the other extreme, if every subexpression is tagged, a TDFA effectively performs full parsing and has many operations on every transition. 
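To make the linear-time argument concrete, here is a minimal Python sketch of a TDFA run: each input symbol triggers exactly one transition plus a fixed list of register operations. Everything in it is an assumption made for illustration: the table layout, the function names, and the toy automaton itself, which is hand-built for the expression $a^{*}tb^{*}$ with a single tag $t$ and is not the automaton from Figure 0. Only the "set" and "copy" operations are modelled; the append-history operation is left out.

```python
def apply_ops(regs, ops, pos):
    """Apply register operations; pos is the current input position."""
    new = dict(regs)
    for op in ops:
        if op[0] == "set":            # i <- p (current position) or i <- n (nil)
            _, i, v = op
            new[i] = pos if v == "p" else None
        elif op[0] == "copy":         # i <- j
            _, i, j = op
            new[i] = new[j]
    return new


def run_tdfa(delta, phi, start, text, registers):
    """Return final register values, or None if the text is rejected."""
    state = start
    regs = {r: None for r in registers}
    for pos, sym in enumerate(text):
        if (state, sym) not in delta:
            return None               # no transition: reject
        state, ops = delta[(state, sym)]
        regs = apply_ops(regs, ops, pos)
    if state not in phi:
        return None                   # not a final state: reject
    return apply_ops(regs, phi[state], len(text))


# Hypothetical toy TDFA for a*tb*: tag t records where the b's begin.
DELTA = {
    (0, "a"): (0, []),
    (0, "b"): (1, [("set", "t", "p")]),
    (1, "b"): (1, []),
}
PHI = {0: [("set", "t", "p")], 1: []}   # final operations per final state
```

Running `run_tdfa(DELTA, PHI, 0, "aab", ["t"])` sets `t` when the first `b` is read at position 2, and an input consisting only of `a`s gets its tag from the final operations instead. The work per symbol is one dictionary lookup and a bounded loop over operations, independent of the input length.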
In practice, for real-world regular expressions with a few submatch groups the overhead is negligible compared to matching with canonical DFA. == TDFA construction == TDFA construction is performed in a few steps. First, a regular expression is converted to a tagged nondeterministic finite automaton (TNFA). Second, a TNFA is converted to a TDFA using a determinization procedure; this step also includes disambiguation that resolves conflicts between ambiguous TNFA paths. After that, a TDFA can optionally go through a number of optimizations that reduce the number of registers and operations, including minimization that reduces the number of states. Algorithms for all steps of TDFA construction with pseudocode are given in the paper by Borsotti and Trafimovich. This section explains TDFA construction on the example of a regular expression $a^{*}tb^{*}|ab$, where $t$ is a tag and $\{a,b\}$ are alphabet symbols. === Tagged NFA === TNFA is a nondeterministic finite automaton with tagged ε-transitions. It was first described by Laurikari, although similar constructions were known much earlier as Mealy machines and nondeterministic finite-state transducers. TNFA construction is very similar to Thompson's construction: it mirrors the structure of a regular expression. Importantly, TNFA preserves ambiguity in a regular expression: if it is possible to match a string in two different ways, then TNFA for this regular expression has two different accepting paths for this string. TNFA definition by Borsotti and Trafimovich differs from the original one by Laurikari in that TNFA can have negative tags on transitions: they are needed to make the absence of match explicit in cases when there is a bypass for a tagged transition. Figure 1 shows TNFA for the example regu

    Read more →
  • How to Choose an AI Avatar Generator

    How to Choose an AI Avatar Generator

    Trying to pick the best AI avatar generator? An AI avatar generator is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI avatar generator slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Open information extraction

    Open information extraction

    In natural language processing, open information extraction (OIE) is the task of generating a structured, machine-readable representation of the information in text, usually in the form of triples or n-ary propositions. == Overview == A proposition can be understood as a truth-bearer, a textual expression of a potential fact (e.g., "Dante wrote the Divine Comedy"), represented in a structure amenable to computers [e.g., ("Dante", "wrote", "Divine Comedy")]. An OIE extraction normally consists of a relation and a set of arguments. For instance, ("Dante", "passed away in", "Ravenna") is a proposition formed by the relation "passed away in" and the arguments "Dante" and "Ravenna". The first argument is usually referred to as the subject while the second is considered to be the object. The extraction is said to be a textual representation of a potential fact because its elements are not linked to a knowledge base. Furthermore, the factual nature of the proposition has not yet been established. In the above example, transforming the extraction into a full-fledged fact would first require linking, if possible, the relation and the arguments to a knowledge base. Second, the truth of the extraction would need to be determined. In computer science, transforming OIE extractions into ontological facts is known as relation extraction. In fact, OIE can be seen as the first step to a wide range of deeper text understanding tasks such as relation extraction, knowledge-base construction, question answering and semantic role labeling. The extracted propositions can also be directly used for end-user applications such as structured search (e.g., retrieve all propositions with "Dante" as subject). OIE was first introduced by TextRunner, developed at the University of Washington Turing Center headed by Oren Etzioni. Other methods introduced later, such as Reverb, OLLIE, ClausIE or CSD, helped to shape the OIE task by characterizing some of its aspects. 
At a high level, all of these approaches make use of a set of patterns to generate the extractions. Depending on the particular approach, these patterns are either hand-crafted or learned. == OIE systems and contributions == Reverb suggested the necessity to produce meaningful relations to more accurately capture the information in the input text. For instance, given the sentence "Faust made a pact with the devil", it would be erroneous to just produce the extraction ("Faust", "made", "a pact") since it would not be adequately informative. A more precise extraction would be ("Faust", "made a pact with", "the devil"). Reverb also argued against the generation of overspecific relations. OLLIE stressed two important aspects for OIE. First, it pointed to the lack of factuality of the propositions. For instance, in a sentence like "If John studies hard, he will pass the exam", it would be inaccurate to consider ("John", "will pass", "the exam") as a fact. Additionally, the authors indicated that an OIE system should be able to extract non-verb mediated relations, which account for significant portion of the information expressed in natural language text. For instance, in the sentence "Obama, the former US president, was born in Hawaii", an OIE system should be able to recognize a proposition ("Obama", "is", "former US president"). ClausIE introduced the connection between grammatical clauses, propositions, and OIE extractions. The authors stated that as each grammatical clause expresses a proposition, each verb mediated proposition can be identified by solely recognizing the set of clauses expressed in each sentence. This implies that to correctly recognize the set of propositions in an input sentence, it is necessary to understand its grammatical structure. The authors studied the case in the English language that only admits seven clause types, meaning that the identification of each proposition only requires defining seven grammatical patterns. 
The finding also established a separation between the recognition of the propositions and its materialization. In a first step, the proposition can be identified without any consideration of its final form, in a domain-independent and unsupervised way, mostly based on linguistic principles. In a second step, the information can be represented according to the requirements of the underlying application, without conditioning the identification phase. Consider the sentence "Albert Einstein was born in Ulm and died in Princeton". The first step will recognize the two propositions ("Albert Einstein", "was born", "in Ulm") and ("Albert Einstein", "died", "in Princeton"). Once the information has been correctly identified, the propositions can take the particular form required by the underlying application [e.g., ("Albert Einstein", "was born in", "Ulm") and ("Albert Einstein", "died in", "Princeton")]. CSD introduced the idea of minimality in OIE. It considers that computers can make better use of the extractions if they are expressed in a compact way. This is especially important in sentences with subordinate clauses. In these cases, CSD suggests the generation of nested extractions. For example, consider the sentence "The Embassy said that 6,700 Americans were in Pakistan". CSD generates two extractions [i] ("6,700 Americans", "were", "in Pakistan") and [ii] ("The Embassy", "said", "that [i]"). This is usually known as reification.
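The triple representation, the structured search mentioned in the overview, and CSD-style nested extractions can be sketched with a small Python data structure. The class and function names are assumptions made for this illustration and do not correspond to the API of any of the systems named above; the nested example also drops the word "that" and stores the inner proposition directly as the object.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Proposition:
    """One extraction: a relation with a subject and an object argument.

    An argument is either plain text or another Proposition, which is
    how a nested (reified) extraction is modelled in this sketch.
    """
    subject: Union[str, "Proposition"]
    relation: str
    obj: Union[str, "Proposition"]


def by_subject(props: List[Proposition], subject: str) -> List[Proposition]:
    """Structured search: all propositions with the given subject."""
    return [p for p in props if p.subject == subject]


# Flat triples taken from the examples in the text.
dante_wrote = Proposition("Dante", "wrote", "Divine Comedy")
dante_died = Proposition("Dante", "passed away in", "Ravenna")

# Nested extraction for "The Embassy said that 6,700 Americans were in Pakistan".
inner = Proposition("6,700 Americans", "were", "in Pakistan")
outer = Proposition("The Embassy", "said", inner)

extractions = [dante_wrote, dante_died, inner, outer]
```

Calling `by_subject(extractions, "Dante")` returns the two Dante propositions, and `outer.obj` gives access to the embedded proposition. Note that nothing here links the strings to a knowledge base or checks their truth, which matches the description of extractions as potential facts.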

    Read more →
  • Trevor Hastie

    Trevor Hastie

    Trevor John Hastie (born 27 June 1953) is an American statistician and computer scientist. He is currently serving as the John A. Overdeck Professor of Mathematical Sciences and Professor of Statistics at Stanford University. Hastie is known for his contributions to applied statistics, especially in the field of machine learning, data mining, and bioinformatics. He has authored several popular books in statistical learning, including The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Hastie has been listed as an ISI Highly Cited Author in Mathematics by the ISI Web of Knowledge. He also contributed to the development of S. == Education and career == Hastie was born on 27 June 1953 in South Africa. He received his B.S. in statistics from the Rhodes University in 1976 and master's degree from University of Cape Town in 1979. Hastie joined the doctoral program at Stanford University in 1980 and received his Ph.D. in 1984 under the supervision of Werner Stuetzle. His dissertation was "Principal Curves and Surfaces". Hastie began his professional career in 1977 with the South African Medical Research Council. After receiving his master's degree in 1979, he spent a year interning at the London School of Hygiene & Tropical Medicine, the Johnson Space Center in Houston, and the Biomath department at Oxford University. After receiving his doctoral degree from Stanford, Hastie returned to South Africa to work with his former employer South African Medical Research Council. He returned to United States in 1986 and joined the AT&T Bell Laboratories in Murray Hill, New Jersey and remained there for nine years. Working with John Chambers, he co-directed the development of the S programming language. He joined Stanford University in 1994 as Associate Professor in Statistics and Biostatistics. He was promoted to full Professor in 1999. During the period 2006–2009, he was the chair of the Department of Statistics at Stanford University. 
In 2013 he was named the John A. Overdeck Professor of Mathematical Sciences. == Awards and honors == Hastie has been a Fellow of the Royal Statistical Society since 1979. He is also an elected Fellow of several professional and scholarly societies, including the Institute of Mathematical Statistics, the American Statistical Association, and the South African Statistical Society. He is a recipient of the 'Myrto Lefkopolou Distinguished Lectureship' award of the Biostatistics Department at the Harvard School of Public Health. In 2018, he was elected a member of the National Academy of Sciences. In 2019 Hastie became a foreign member of the Royal Netherlands Academy of Arts and Sciences. Hastie was named a recipient of the C.R. and Bhargavi Rao Prize in 2025. Hastie and Hui Zou received the 2025 Founders of Statistics prize for their elastic net paper. == Publications == Hastie is a prolific author of scientific works on numerous topics in applied statistics, including statistical learning, data mining, statistical computing, and bioinformatics. Together with his collaborators he has authored about 125 scientific articles. Many of Hastie's scientific articles were coauthored by his longtime collaborator, Robert Tibshirani. He has coauthored the following books:
T. Hastie and R. Tibshirani, Generalized Additive Models, Chapman and Hall, 1990.
J. Chambers and T. Hastie, Statistical Models in S, Wadsworth/Brooks Cole, 1991.
T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition, Springer Verlag, 2009 (available for free from the author's website).
G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer Verlag, 2013 (available for free from the co-author's website).
T. Hastie, R. Tibshirani, and M. Wainwright, Statistical Learning with Sparsity: the Lasso and Generalizations, CRC Press, 2015 (available for free from the author's website).
B. Efron and T. Hastie, Computer Age Statistical Inference, Cambridge University Press, 2016. ISBN 9781107149892.

    Read more →
  • Thomas G. Dietterich

    Thomas G. Dietterich

    Thomas G. Dietterich is emeritus professor of computer science at Oregon State University. He is one of the pioneers of the field of machine learning. He served as executive editor of the journal Machine Learning (1992–98) and helped co-found the Journal of Machine Learning Research. In response to media attention on the dangers of artificial intelligence, Dietterich has been quoted for an academic perspective by a broad range of media outlets including National Public Radio, Business Insider, Microsoft Research, CNET, and The Wall Street Journal. Among his research contributions are the invention of error-correcting output coding for multi-class classification, the formalization of the multiple-instance problem, the MAXQ framework for hierarchical reinforcement learning, and the development of methods for integrating non-parametric regression trees into probabilistic graphical models. == Biography and education == Thomas Dietterich was born in South Weymouth, Massachusetts, in 1954. His family later moved to New Jersey and then to Illinois, where he graduated from Naperville Central High School. Dietterich then entered Oberlin College for his undergraduate studies. In 1977, Dietterich graduated from Oberlin with a degree in mathematics, focusing on probability and statistics. Dietterich spent the following two years at the University of Illinois, Urbana-Champaign. After those two years, he began his doctoral studies in the Department of Computer Science at Stanford University. Dietterich received his Ph.D. in 1984 and moved to Corvallis, Oregon, where he was hired as an assistant professor in computer science. In 2013, he was named "Distinguished Professor". In 2016, Dietterich retired from his position at Oregon State University. Throughout his career, Dietterich has worked to promote scientific publication and conference presentations. For many years, he was the editor of the MIT Press series on Adaptive Computation and Machine Learning. 
He also held the position of co-editor of the Morgan Claypool Synthesis Series on Artificial Intelligence and Machine Learning. He has organized several conferences and workshops including serving as Technical Program Co-Chair of the National Conference on Artificial Intelligence (AAAI-90), Technical Program Chair of the Neural Information Processing Systems (NIPS-2000) and General Chair of NIPS-2001. He served as founding President of the International Machine Learning Society and he has been a member of the IMLS Board since its founding. He is currently also a member of the Steering Committee of the Asian Conference on Machine Learning. == Research interests == Professor Dietterich is interested in all aspects of machine learning. There are three major strands of his research. First, he is interested in the fundamental questions of artificial intelligence and how machine learning can provide the basis for building integrated intelligent systems. Second, he is interested in ways that people and computers can collaborate to solve challenging problems. And third, he is interested in applying machine learning to problems in the ecological sciences and ecosystem management as part of the emerging field of computational sustainability. Over his career, he has worked on a wide variety of problems ranging from drug design to user interfaces to computer security. His current focus is on ways that computer science methods can help advance ecological science and improve our management of the Earth's ecosystems. This passion has led to several projects including research in wildfire management, invasive vegetation and understanding the distribution and migration of birds. For example, Dietterich's research is helping scientists at the Cornell Lab of Ornithology answer questions like: How do birds decide to migrate north? How do they know when to land and stopover for a few days? How do they choose where to make a nest? 
Tens of thousands of volunteer birdwatchers (citizen scientists) all over the world contribute data to the study by submitting their bird sightings to the eBird website. The amount of data is overwhelming – in March 2012 they had over 3.1 million bird observations. Machine learning can uncover patterns in data to model the migration of species. But there are many other applications for the same techniques which will allow organizations to better manage our forests, oceans, and endangered species, as well as improve traffic flow, water systems, the electrical power grid, and more. "I realized I wanted to have an impact on something that really mattered – and certainly the whole Earth's ecosystem, of which we are a part, is under threat in so many ways. And so if there's some way that I can use my technical skills to improve both the science base and the tools needed for policy and management decisions, then I would like to do that. I am passionate about that." == Dangers of AI: an academic perspective == Dietterich has argued that the most realistic risks of artificial intelligence are basic mistakes, breakdowns and cyberattacks, and the fact that it simply may not always work, rather than machines that become super powerful or destroy the human race. Dietterich considers machines becoming self-aware and trying to exterminate humans to be more science fiction than scientific fact. But to the extent that computer systems are given increasingly dangerous tasks, and asked to learn from and interpret their experiences, he said they may simply make mistakes. Much of the work done in the AI safety community does indeed focus on accidents and design flaws. == Positions held == 2014–2016: President, Association for the Advancement of Artificial Intelligence (AAAI). 2013–present: Distinguished Professor of computer science, Oregon State University. 2011–present: Chief Scientist, BigML, Corvallis, OR. 
2005–present: Director of Intelligent Systems Research, School of Electrical Engineering and Computer Science, Oregon State University. 2006–2008: Chief Scientist, Smart Desktop, Inc., Seattle, WA. 2004–2005: Chief Scientist, MyStrands, Inc., Corvallis, OR. 1995–2013: Professor of computer science, Oregon State University. 1998–1999: Visiting Senior Scientist, Institute for the Investigation of Artificial Intelligence, Barcelona, Spain. (Sabbatical leave position) 1988–1995: Associate Professor of computer science, Oregon State University. 1991–1993: Senior Scientist, Arris Pharmaceutical Corporation, S. San Francisco, CA. 1985–1988: Assistant Professor of computer science, Oregon State University. 1979–1984: Research Assistant, Heuristic Programming Project, Department of Computer Science, Stanford University. 1979 (Summer): Member of Technical Staff, Bell Telephone Laboratories, Naperville, Illinois. Computer-to-computer file transfer and micro-code distribution to remote switching systems. 1977 (Summer): Assistant to the Director of Planning and Research, Oberlin College, Oberlin, Ohio. Developed institutional planning database. == Awards and honors == Thomas Dietterich was honored by Oregon State University in the spring of 2013 as a "Distinguished Professor" for his work as a pioneer in the field of machine learning and for being one of the most highly cited scientists in his field. He has also earned exclusive "Fellow" status in the Association for the Advancement of Artificial Intelligence, the American Association for the Advancement of Science and the Association for Computing Machinery. Over his career, he obtained more than $30 million in research grants, helped build a world-class research group at Oregon State, and created three software companies. He also co-founded two of the field's leading journals and was elected first president of the International Machine Learning Society. 
His other awards and honors include: ACM Distinguished Lecturer, 2012-2013 Fellow, American Association for the Advancement of Science, 2007 Oregon State University, College of Engineering Collaboration Award, 2004 Winner, JAIR Award for Best Paper in Previous Five Years, 2003 Fellow, Association for Computing Machinery, elected 2003 Oregon State University, College of Engineering Research Award, 1998 Fellow, Association for the Advancement of Artificial Intelligence, elected 1994 NSF Presidential Young Investigator, 1987-92 Nominated for Carter Award for Graduate Teaching, 1987, 1988 IBM Graduate Fellow, 1982, 1983 Upsilon Pi Epsilon, 1996 Sigma Xi, 1979–present State Farm Companies Foundation Fellowship, 1978 Member, Board of Trustees, Oberlin College, 1977-1980 Graduation with Honors in Mathematics, Oberlin College, 1977 Phi Beta Kappa, 1977 National Merit Scholar, 1973 == Selected publications == Liping Liu, Thomas G. Dietterich, Nan Li, Zhi-Hua Zhou (2016). Transductive Optimization of Top k Precision. International Joint Conference on Artificial Intelligence (IJCAI-2016). pp. 1781–1787. New York, NY Md. Amran Siddiqui, Alan Fern, Thomas G. Dietterich, Shubhomoy Da

    Read more →
  • Collostructional analysis

    Collostructional analysis

    Collostructional analysis is a family of methods developed by (in alphabetical order) Stefan Th. Gries (University of California, Santa Barbara) and Anatol Stefanowitsch (Free University of Berlin). Collostructional analysis aims at measuring the degree of attraction or repulsion that words exhibit to constructions, where the notion of construction has so far been that of Goldberg's construction grammar. == Collostructional methods == Collostructional analysis so far comprises three different methods: collexeme analysis, to measure the degree of attraction/repulsion of a lemma to a slot in one particular construction; distinctive collexeme analysis, to measure the preference of a lemma to one particular construction over another, functionally similar construction; multiple distinctive collexeme analysis extends this approach to more than two alternative constructions; covarying collexeme analysis, to measure the degree of attraction of lemmas in one slot of a construction to lemmas in another slot of the same construction. == Input frequencies == Collostructional analysis requires frequencies of words and constructions and is similar to a wide variety of collocation statistics. It differs from raw frequency counts by providing not only observed co-occurrence frequencies of words and constructions, but also (i) a comparison of the observed frequency to the one expected by chance; thus, collostructional analysis can distinguish attraction and repulsion of words and constructions; (ii) a measure of the strength of the attraction or repulsion; this is usually the log-transformed p-value of a Fisher-Yates exact test. 
== Versus other collocation statistics == Collostructional analysis differs from most collocation statistics in that (i) it measures not the association of words to words, but of words to syntactic patterns or constructions; thus, it takes syntactic structure more seriously than most collocation-based analyses; (ii) it has so far only used the most precise statistics, namely the Fisher-Yates exact test based on the hypergeometric distribution; thus, unlike t-scores, z-scores, chi-square tests etc., the analysis is not based on, and does not violate, any distributional assumptions.
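
The association measure described above can be sketched in Python using only the standard library. This is a minimal illustration, not the authors' implementation: the function names, the 2×2 table layout (`a` = word in the construction, `b` = word elsewhere, `c` = other words in the construction, `d` = other words elsewhere), the use of a one-tailed test, and the convention of giving repulsion a negative sign are all assumptions made for the example.

```python
from math import comb, log10


def tail_counts(a, b, c, d, upper):
    """Numerator and denominator of a one-tailed hypergeometric p-value.

    upper=True  -> P(X >= a), used when the word is attracted
    upper=False -> P(X <= a), used when the word is repelled
    """
    row1 = a + b          # all tokens of the word
    col1 = a + c          # all tokens of the construction
    n = a + b + c + d     # corpus size in the chosen unit
    ks = range(a, min(row1, col1) + 1) if upper else range(0, a + 1)
    # comb() returns 0 for impossible cells, so no extra bounds checks are needed
    numer = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in ks)
    return numer, comb(n, col1)


def collostruction_strength(a, b, c, d):
    """Signed -log10(p): positive for attraction, negative for repulsion (assumed convention)."""
    expected = (a + b) * (a + c) / (a + b + c + d)
    attracted = a >= expected
    numer, denom = tail_counts(a, b, c, d, upper=attracted)
    # Logs of exact integers avoid float underflow for very small p-values
    neg_log_p = log10(denom) - log10(numer)
    return neg_log_p if attracted else -neg_log_p


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show both directions of association
    print(collostruction_strength(3, 1, 1, 3))   # observed above expected: positive
    print(collostruction_strength(1, 3, 3, 1))   # observed below expected: negative
```

Comparing the observed count `a` with the expected count decides the direction (attraction or repulsion), and the log-transformed p-value gives the strength, mirroring points (i) and (ii) under "Input frequencies".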

    Read more →
  • Aggregation (linguistics)

    Aggregation (linguistics)

    In linguistics, aggregation is a subtask of natural language generation, which involves merging syntactic constituents (such as sentences and phrases) together. Sometimes aggregation can be done at a conceptual level. == Examples == A simple example of syntactic aggregation is merging the two sentences "John went to the shop" and "John bought an apple" into the single sentence "John went to the shop and bought an apple". Syntactic aggregation can be much more complex than this. For example, aggregation can embed one of the constituents in the other; e.g., we can aggregate "John went to the shop" and "The shop was closed" into the sentence "John went to the shop, which was closed". From a pragmatic perspective, aggregating sentences together often suggests to the reader that these sentences are related to each other. If this is not the case, the reader may be confused. For example, someone who reads "John went to the shop and bought an apple" may infer that the apple was bought in the shop; if this is not the case, then these sentences should not be aggregated. == Algorithms and issues == Aggregation algorithms must do two things: (1) decide when two constituents should be aggregated; and (2) decide how two constituents should be aggregated, and create the aggregated structure. The first issue, deciding when to aggregate, is poorly understood. Aggregation decisions certainly depend on the semantic relations between the constituents, as mentioned above; they also depend on the genre (e.g., bureaucratic texts tend to be more aggregated than instruction manuals). They probably should depend on rhetorical and discourse structure. The literacy level of the reader is also probably important (poor readers need shorter sentences). But we have no integrated model which brings all these factors together into a single algorithm. With regard to the second issue, there have been some studies of different types of aggregation, and how they should be carried out. 
Harbusch and Kempen describe several syntactic aggregation strategies. In their terminology, "John went to the shop and bought an apple" is an example of forward conjunction reduction. Much less is known about conceptual aggregation. Di Eugenio et al. show how conceptual aggregation can be done in an intelligent tutoring system, and demonstrate that performing such aggregation makes the system more effective (and that conceptual aggregation makes a bigger impact than syntactic aggregation). == Software == Unfortunately there is not much software available for performing aggregation. However, the SimpleNLG system does include limited support for basic aggregation. For example, it can be used to print out the aggregated sentence "The man is hungry and buys an apple."
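
SimpleNLG itself is a Java library; the toy Python sketch below does not use its API and only illustrates the kind of forward conjunction reduction described above, producing the same sentence. The `Clause` type and the `aggregate` function are hypothetical names introduced for this example, and the "when to aggregate" decision is reduced to a simple check for an identical subject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    subject: str
    predicate: str  # already-inflected verb phrase, e.g. "is hungry"


def aggregate(first: Clause, second: Clause) -> str:
    """Join two clauses with "and", dropping a repeated subject."""
    if first.subject.lower() == second.subject.lower():
        # Same subject: keep it once (forward conjunction reduction)
        text = f"{first.subject} {first.predicate} and {second.predicate}"
    else:
        # Different subjects: plain coordination, nothing is elided
        text = (f"{first.subject} {first.predicate} and "
                f"{second.subject} {second.predicate}")
    return text[0].upper() + text[1:] + "."


if __name__ == "__main__":
    s1 = Clause("the man", "is hungry")
    s2 = Clause("the man", "buys an apple")
    print(aggregate(s1, s2))  # The man is hungry and buys an apple.
```

A real aggregation component works on syntactic structures rather than strings, and would also need to weigh the semantic and pragmatic factors discussed under "Algorithms and issues" before merging.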

    Read more →
  • Hideto Tomabechi

    Hideto Tomabechi

    Hideto Tomabechi (苫米地 英人, Tomabechi Hideto; born 1959) is a Japanese cognitive scientist who is an adjunct fellow at Carnegie Mellon University and has had an executive role in several companies. == Early life and education == He grew up in Minato-ku, Tokyo. He graduated from Komaba Toho High School and then joined the University of Massachusetts Amherst. He received his first degree from Sophia University, then joined Mitsubishi Real Estate. Tomabechi was a Fulbright Scholar at Yale University and became a member of the Yale University Artificial Intelligence Research Center and the Yale Cognitive Science Program. Hideto Tomabechi's research topic was: Cognition Models for Language Expressions and Computational Methods (Tomabechi Algorithm). Hideto Tomabechi received his Ph.D. in the field of computational linguistics from Carnegie Mellon University. His 1993 Ph.D. thesis was entitled "Efficient Unification for Natural Language". == Career timeline == 1992–1998: Director, Justsystem Scientific Institute. 1998: CEO of Cognitive Research Laboratories Inc. 2007: Adjunct Fellow at the Cyber Security & Privacy Research Institute (CyLab) at Carnegie Mellon University. 2020: Visiting professor at Nano & Life Research Center, Waseda University. 2020: Chairman, Resilience Japan, LLC. 2022: Chairman of Japan Society for Foreign Policy. == Brain research == In 1993, Hideto Tomabechi became director of the Development Department. Later, Tomabechi became director of the JustSystems Basic Research Institute. Tomabechi researched the basic functions of the human brain and mind. The purpose of the brain and consciousness research was to develop the human-machine interface. The main areas of research were altered states of consciousness, hypnosis, homeostasis, brain functions, and functions of the human mind in cyberspace. Dr. Tomabechi founded the Bechi Unit, the world's first virtual currency at JustSystems, based on Tomabechi Algorithms. 
== Brainwashing == Tomabechi was the scientist who deprogrammed the leaders of the religious cult responsible for the terrorist attack in the Tokyo subway. The cult (Aum Shinrikyo) brainwashed its people and they carried out the attacks in an influenced state of consciousness.

    Read more →
  • AI Code Generators Reviews: What Actually Works in 2026

    AI Code Generators Reviews: What Actually Works in 2026

    Trying to pick the best AI code generator? An AI code generator is software that uses machine learning to suggest or write code from a prompt or from the surrounding context. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI code generator slots into your workflow and can pay for itself quickly. This guide breaks down the top picks, their pros and cons, and who each one is best for.

    Read more →