Detrended correspondence analysis

Detrended correspondence analysis (DCA) is a multivariate statistical technique widely used by ecologists to find the main factors or gradients in the large, species-rich but usually sparse data matrices that typify ecological community data. DCA is frequently used to suppress artifacts inherent in most other multivariate analyses when applied to gradient data.

== History ==

DCA, a correspondence analysis method, was created in 1979 by Mark Hill of the United Kingdom's Institute of Terrestrial Ecology (now merged into the Centre for Ecology and Hydrology) and implemented in a FORTRAN code package called DECORANA (Detrended Correspondence Analysis). DCA is sometimes erroneously referred to as DECORANA; however, DCA is the underlying algorithm, while DECORANA is a tool implementing it.

== Issues addressed ==

According to Hill and Gauch, DCA suppresses two artifacts inherent in most other multivariate analyses when applied to gradient data. An example is a time series of plant species colonising a new habitat: early successional species are replaced by mid-successional species, then by late successional ones (see the example below). When such data are analysed by a standard ordination such as correspondence analysis (CA): (1) the ordination scores of the samples exhibit the 'edge effect', i.e. the variance of the scores at the beginning and the end of a regular succession of species is considerably smaller than that in the middle; and (2) when presented as a graph, the points follow a horseshoe-shaped curve rather than a straight line (the 'arch effect'), even though the process under analysis is a steady and continuous change that human intuition would prefer to see as a linear trend.

Outside ecology, the same artifacts occur when gradient data are analysed (e.g. soil properties along a transect running between two different geologies, or behavioural data over the lifespan of an individual), because the curved projection is an accurate representation of the shape of the data in multivariate space. Ter Braak and Prentice (1987, p. 121) cite a simulation study of two-dimensional species packing models in which DCA performed better than CA.

== Method ==

DCA is an iterative algorithm that has shown itself to be a highly reliable and useful tool for data exploration and summary in community ecology (Shaw 2003). It starts by running a standard ordination (CA or reciprocal averaging) on the data, producing the initial horseshoe curve in which the first ordination axis distorts into the second axis. It then divides the first axis into segments (default = 26) and recentres each segment to have a mean value of zero on the second axis, which effectively squashes the curve flat. It also rescales the axis so that the ends are no longer compressed relative to the middle, so that 1 DCA unit approximates the same rate of turnover all the way through the data; the rule of thumb is that 4 DCA units correspond to a total turnover in the community. Ter Braak and Prentice (1987, p. 122) warn against the non-linear rescaling of the axes because of robustness issues and recommend using detrending by polynomials only.

== Drawbacks ==

No significance tests are available with DCA, although there is a constrained (canonical) version called DCCA in which the axes are forced by multiple linear regression to correlate optimally with a linear combination of other (usually environmental) variables; this allows a null model to be tested by Monte Carlo permutation analysis.

== Example ==

The example shows an ideal data set, with the species data in rows and the samples in columns. For each sample along the gradient, a new species is introduced while another species is no longer present. The result is a sparse matrix in which ones indicate the presence of a species in a sample. Except at the edges, each sample contains five species. A plot of the first two axes of the correspondence analysis result clearly shows the disadvantages of this procedure: the edge effect (the points are clustered at the edges of the first axis) and the arch effect.

== Software ==

An open-source implementation of DCA, based on the original FORTRAN code, is available in the vegan R package.
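The detrending step described under Method can be illustrated with a minimal Python sketch. It is not the DECORANA procedure, whose details may differ: it assumes simple non-overlapping segments of equal width, and the function names `ideal_gradient_matrix` and `detrend_by_segments` are hypothetical. The matrix builder is likewise only one assumed way to produce a data set with the properties listed under Example, since the original matrix is not reproduced here.

```python
def ideal_gradient_matrix(n=12, width=5):
    """Species-by-sample presence matrix (rows = species, columns = samples).

    Assumption: species k occurs in the `width` samples centred on sample k,
    so each step along the gradient gains one species and loses another.
    """
    half = width // 2
    return [[1 if abs(species - sample) <= half else 0
             for sample in range(n)]
            for species in range(n)]


def detrend_by_segments(axis1, axis2, n_segments=26):
    """Subtract from each axis-2 score the mean of its axis-1 segment."""
    lo, hi = min(axis1), max(axis1)
    span = (hi - lo) or 1.0  # guard against a degenerate first axis
    # Segment index of every sample; the maximum falls in the last segment.
    segment = [min(int((x - lo) / span * n_segments), n_segments - 1)
               for x in axis1]
    totals, counts = {}, {}
    for s, y in zip(segment, axis2):
        totals[s] = totals.get(s, 0.0) + y
        counts[s] = counts.get(s, 0) + 1
    return [y - totals[s] / counts[s] for s, y in zip(segment, axis2)]


# Synthetic arch: axis 2 is a quadratic function of axis 1.
axis1 = [i / 99 for i in range(100)]
axis2 = [(x - 0.5) ** 2 for x in axis1]
flattened = detrend_by_segments(axis1, axis2, n_segments=5)
```

After detrending, the axis-2 scores within each segment average zero, so the synthetic arch is flattened into a much narrower band. The non-linear rescaling of the first axis is deliberately left out of this sketch.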

Jordan Antiquities Database and Information System

The Jordan Antiquities Database and Information System (JADIS) was a computer database of antiquities in Jordan, the first of its kind in the Arab world. It was established by the Department of Antiquities in 1990, in cooperation with the American Center for Oriental Research in Amman and sponsored by the United States Agency for International Development. JADIS was in use until 2002, when it was superseded by a new system, MEGA-J. Over 10,841 antiquities were registered in the database. An introduction and printed summary of the database was published by the Department of Antiquities in 1994, edited by Gaetano Palumbo.

Stochastic grammar

A stochastic grammar (statistical grammar) is a grammar framework with a probabilistic notion of grammaticality. Related frameworks and topics include stochastic context-free grammars, statistical parsing, data-oriented parsing, hidden Markov models (or stochastic regular grammars), and estimation theory.

The grammar is realized as a language model. Allowed sentences are stored in a database together with a frequency indicating how common each sentence is. Statistical natural language processing uses stochastic, probabilistic and statistical methods, especially to resolve difficulties that arise because longer sentences are highly ambiguous when processed with realistic grammars, yielding thousands or millions of possible analyses. Methods for disambiguation often involve the use of corpora and Markov models. "A probabilistic model consists of a non-probabilistic model plus some numerical quantities; it is not true that probabilistic models are inherently simpler or less structural than non-probabilistic models."

== Examples ==

Hirjee & Brown (2013) implemented a probabilistic method for rhyme detection to find internal and imperfect rhyme pairs in rap lyrics. The concept is adapted from a sequence alignment technique using BLOSUM (BLOcks SUbstitution Matrix). The method detected rhymes that non-probabilistic models could not.
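The probabilistic notion of grammaticality, and its use for disambiguation, can be illustrated with a toy stochastic context-free grammar. In the Python sketch below, the rules, their probabilities and the example sentence are all invented for illustration and are not taken from any corpus; the probability of a parse is the product of the probabilities of the rules it uses.

```python
# Hypothetical toy grammar: (left-hand side, right-hand side) -> probability.
# For each left-hand side the probabilities sum to 1.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("she",)): 0.3,
    ("NP", ("stars",)): 0.25,
    ("NP", ("telescopes",)): 0.25,
    ("PP", ("P", "NP")): 1.0,
    ("V", ("saw",)): 1.0,
    ("P", ("with",)): 1.0,
}


def derivation_probability(tree):
    """Multiply the probabilities of all rules used in a parse tree.

    A tree is a tuple (label, child, ...); a child is a tree or a word.
    """
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULES[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= derivation_probability(child)
    return p


# Two analyses of the ambiguous sentence "she saw stars with telescopes".
pp = ("PP", ("P", "with"), ("NP", "telescopes"))
verb_attachment = ("S", ("NP", "she"),
                   ("VP", ("VP", ("V", "saw"), ("NP", "stars")), pp))
noun_attachment = ("S", ("NP", "she"),
                   ("VP", ("V", "saw"), ("NP", ("NP", "stars"), pp)))

p_verb = derivation_probability(verb_attachment)
p_noun = derivation_probability(noun_attachment)
best = "verb attachment" if p_verb > p_noun else "noun attachment"
```

With these made-up numbers the reading in which the telescopes modify the seeing scores higher. In practice the rule probabilities would be estimated from a corpus rather than set by hand.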

Karsten Borgwardt

Karsten Borgwardt (born 1980) is a German computer scientist and biologist specializing in machine learning and computational biology. Since February 2023, he has been a director at the Max Planck Institute of Biochemistry in Martinsried, Germany, where he leads the Department of Machine Learning and Systems Biology.

== Education and career ==

Borgwardt was born in Kaiserslautern. He obtained a Master of Science in biology from the University of Oxford in 2003 and a Diplom (equivalent to a master’s degree) in computer science from LMU Munich in 2004. In 2007, he obtained his PhD in computer science from LMU Munich. Following a postdoctoral position at the University of Cambridge, he became a research group leader for machine learning and computational biology at the Max Planck Institute for Biological Cybernetics and the former Max Planck Institute for Developmental Biology in Tübingen in 2008. In 2011, Borgwardt was appointed professor of data mining in the life sciences at the University of Tübingen. In 2014, he joined ETH Zurich as an associate professor in the Department of Biosystems Science and Engineering (D-BSSE) and was promoted to full professor in 2017. During his tenure at ETH Zurich, he coordinated research programs including two Marie Curie Innovative Training Networks and the Personalized Swiss Sepsis Study, which focused on predicting sepsis using machine learning. In 2023, he was appointed Scientific Member of the Max Planck Society and Director at the Max Planck Institute of Biochemistry in Martinsried.

== Research contributions ==

Borgwardt’s research integrates big data analysis with biomedical research. He develops novel machine learning algorithms to detect patterns and statistical dependencies in large biological and medical datasets.
His work aims to enable the automatic generation of new knowledge from big data and to understand the relationship between the function of biological systems and their molecular properties, which is fundamental for personalized medicine.

== Awards and honors ==

During his studies, he was a scholar of the Stiftung Maximilianeum and the Bavarian Foundation for the Promotion of the Gifted. Borgwardt received scholarships from the Studienstiftung des deutschen Volkes in 2002 and 2007. His PhD dissertation received the Heinz Schwärtzel Dissertation Award for Foundations of Computer Science in 2007. As a professor in Tübingen, he was awarded the Alfried-Krupp-Förderpreis for Young Professors in 2013. In 2015, he received an SNSF Starting Grant. In 2014, 2015 and 2016, he was listed in the “Top 40 under 40” rankings for Germany selected by Capital magazine. In 2018, Borgwardt was named among “25 individuals who have the potential to shape the next 25 years” by Focus magazine. In 2023, Borgwardt received an honorary professorship from the Faculty of Chemistry and Pharmacy at LMU Munich. Publications from Borgwardt's group have received the Outstanding Student Paper Award at NIPS in 2009, and the SIB Graduate Paper Award in 2020 and SIB Remarkable Output Awards in 2020 and 2021 from the Swiss Institute of Bioinformatics (SIB).

== Selected publications ==

“Weisfeiler-Lehman Graph Kernels” (Journal of Machine Learning Research, 2011): introduced an efficient graph kernel based on the Weisfeiler-Lehman algorithm.

“Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning” (Nature Medicine, 2022): showed that antimicrobial resistance can feasibly be predicted from mass spectrometry data that are readily collected in hospitals. The method is able to identify antibiotic resistance 24 hours earlier than previous methods.

Danqi Chen

Danqi Chen (Chinese: 陈丹琦; pinyin: Chén Dānqí, IPA: [ʈ͡ʂʰə̌n tan t͡ɕʰǐ]; born in Changsha, China) is a Chinese computer scientist and assistant professor at Princeton University specializing in natural language processing (NLP), a field of AI. In 2019, she joined the Princeton NLP group, alongside Sanjeev Arora, Christiane Fellbaum, and Karthik Narasimhan. She was previously a visiting scientist at Facebook AI Research (FAIR). She earned her Ph.D. at Stanford University and her BS at Tsinghua University. Chen is the author of "Neural Reading Comprehension and Beyond", a dissertation on using artificial intelligence to access knowledge in ordinary and structured documents. She is the author or co-author of a number of research papers, including "Reading Wikipedia to Answer Open-Domain Questions". Google's SyntaxNet is based on algorithms developed by Chen and Christopher Manning at Stanford. Her primary research interests are text understanding, knowledge representation, and reasoning. She won a gold medal at the 2008 International Olympiad in Informatics. She is known among friends as CDQ, and a well-known algorithm in competitive programming, CDQ Divide and Conquer, is named after these initials. She is married to Huacheng Yu, an assistant professor in theoretical computer science at Princeton University.

AI warfare

AI warfare refers to the use of artificial intelligence technologies to automate military operations and to enhance or bypass human decision-making in armed conflicts. AI is used to rapidly analyze large volumes of military intelligence data, including making recommendations or decisions on who and what to target. Abdul-Rahman al-Rawi, a 20-year-old student, was the first acknowledged civilian killed by an AI-assisted airstrike, in a U.S. strike in Iraq in 2024. In 2026, the U.S. declared it would become an 'AI-first' warfighting force. Husain et al. (2018) coined the term hyperwar to refer to warfare which is algorithmic or controlled by artificial intelligence, with little to no human decision-making.

== 2026 Iran war ==

The 2026 Iran war has been described as the "first AI war", although the United States and Israel had previously used AI to identify targets during the Gaza war. The U.S. has used AI tools for military intelligence, targeting, and damage assessment in the war in Iran. Using the Maven Smart System, the U.S. attacked 1,000 targets in the first 24 hours of the war and 5,000 targets over the course of 10 days. The U.S. had used Maven in 2022 to share targeting information with Ukraine, and to strike targets in Iraq and Syria and against the Houthis in 2024, but the attacks on Iran are its largest use of the system. Authorities are looking into whether artificial intelligence was involved in the airstrike on an Iranian girls' school that killed 170 civilians, the majority of whom were female students. The United States Central Command emphasized that humans were making final targeting decisions. Per a White House tally released on April 8, the U.S. military hit over 13,000 targets in Iran during the war's first 38 days, including more than 2,000 command-and-control sites, 1,500 air defense targets, and 1,450 industrial infrastructure targets.
== Gaza war ==

During the Gaza war, the Israel Defense Forces (IDF) have used artificial intelligence to rapidly and automatically perform much of the process of determining what to bomb. The IDF's Unit 8200 developed AI systems, dubbed the Gospel and Lavender, to find targets for the Israeli Air Force to bomb. The Gospel, which selects buildings or structures as targets, automatically provides targeting recommendations to human analysts, who decide whether to approve strikes. Lavender, which was used alongside the Gospel, identified 37,000 Hamas-linked individuals early in the war. According to a report by +972 Magazine and Local Call, strikes assisted by Lavender were routinely permitted to kill 5–20 civilians for each suspected Hamas militant; the suspects were often bombed at home with their families. The IDF denies these claims, maintaining that every strike is assessed to minimize collateral damage, and that there is no policy "to kill tens of thousands of people in their homes." Israel deployed AI technologies during the Gaza war for audio analysis, facial recognition, and airstrike targeting. One such system was used to help identify the location of Hamas commander Ibrahim Biari through phone call analysis, leading to strikes that killed him as well as more than 125 civilians.

== Russo-Ukrainian war ==

Kyiv launched a project with Palantir called Brave1 Dataroom to build AI systems using the extensive combat data Ukraine has gathered since Russia’s full-scale invasion in 2022. The country has also created tools for in-depth airstrike analysis, introduced AI to process large volumes of intelligence, and incorporated these technologies into the planning of long-range strike operations.

== Involved companies ==

The Maven Smart System is developed by Palantir. It integrates Anthropic's Claude as its large language model and uses Amazon's AWS servers as its cloud infrastructure.
Since Anthropic's refusal to support autonomous weapons development and domestic surveillance efforts, other AI firms, including OpenAI, have been brought in to take over that role.

== Involved state actors ==

In 2024, the United States Department of Defense had more than 800 active AI-related projects and requested $1.8 billion in AI funding, the main projects being Project Maven and Project Artemis (AI-resistant drones developed together with Ukraine). The technology has been used in Iran, Iraq, Syria and Yemen to identify targets. China is pursuing intelligentized warfare, integrating AI across all combat domains (land, sea, air, space, and cyber), with military AI spending exceeding $1.6 billion annually.

== International regulation ==

Since 2014, states meeting within the framework of the Convention on Certain Conventional Weapons have discussed lethal autonomous weapon systems. In 2016, the treaty's states parties established an open-ended Group of Governmental Experts on Lethal Autonomous Weapons Systems to continue those discussions. The discussions have addressed international humanitarian law, accountability, possible prohibitions and regulations, and the extent of human control required over AI-enabled weapons.

Paola Velardi

Paola Velardi (born in Rome, April 26, 1955) is a full professor of computer science at Sapienza University in Rome, Italy. Her research encompasses artificial intelligence and, specifically, natural language processing, machine learning, business intelligence and the semantic web. Velardi is one of the hundred female scientists included in the database "100esperte.it" (Italian for "100 female experts"). This online, open database champions the recognition of top-rated female scientists in Science, Technology, Engineering and Mathematics (STEM) areas. Among her appointments and honors, she is included, alongside 45 other international female scientists from the past, present, and future, in the Women in Science pavilion of UNESCO’s Virtual Science Museum.

== Research ==

Since the early 1980s, Velardi's research has focused on artificial intelligence, with a particular emphasis on natural language processing (NLP), machine learning, and data mining. Her scientific contributions have evolved over time, following the field's primary paradigms:

Semantic Web and Ontologies: She is known for her pioneering work on semantic disambiguation and automated ontology learning, collaborating on the development of systems such as OntoLearn.

Social Computing and Predictive Analysis: She has conducted research on extracting information from social media for epidemiological monitoring (syndromic surveillance) and for the identification of opinion leaders. In the educational field, she has developed machine learning models to predict the risk of student dropout.

AI for Health and Elder Monitoring: She has coordinated projects to support frail elderly people, developing systems based on ambient intelligence and wearables to detect clinical and behavioral anomalies. She has also contributed to models for analyzing behavioral changes through dynamic clustering.
Generative AI and Finance: More recently, her research has expanded into the use of generative AI and deep learning for finance, including benchmark studies on price trend prediction based on Limit Order Books (LOB) and the development of diffusion models for realistic market simulation (the TRADES project).

According to Google Scholar bibliometrics as of December 2025, Velardi's scientific publications have been cited more than 8,100 times and her h-index is 42. She has published more than 200 papers in international journals and conference proceedings. Some of her work has appeared in top-rated journals such as Artificial Intelligence, Computational Linguistics, Knowledge-Based Systems, IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Transactions on Computers, IEEE Transactions on Software Engineering, Data Mining and Knowledge Discovery, and Journal of Web Semantics.

== Education and previous employments ==

Velardi graduated in electronic engineering from Sapienza University in 1978. From 1978 to 1983, she worked for the Ugo Bordoni Foundation, a research institution focusing on ICT and working under the supervision of the Italian Ministry of Economic Development. In 1983, she was a visiting scholar at Stanford University. During this period she became passionate about artificial intelligence, which would remain her area of research throughout her career. From 1984 to 1986, she returned to her native city and worked as a researcher for IBM. From 1986 to 1996 she was an associate professor in the engineering faculty of the Polytechnic University of the Marches (Ancona, Italy). Starting in November 1996, she taught and conducted research in the Department of Computer Science at Sapienza University. Velardi was the head of the Bachelor and Master Programs in Computer Science at Sapienza University from 2010 to 2013 and from 2015 to 2016.
== Current employment ==

Since November 2001, Velardi has been a full professor in the Department of Computer Science ("Dipartimento di Informatica" in Italian) at Sapienza University in Rome, Italy. Since 2013, she has been the coordinator of the Distance Learning Degree in Computer Science at Sapienza University. Velardi is also a Senior Associate at the Institute of Cognitive Sciences and Technologies (ISTC) of the CNR.

== Recognition ==

Velardi is one of the hundred female scientists included in the "100esperte.it" database, which lists top Italian female STEM scientists. Six of the hundred scientists in the database are computer scientists like Velardi. Velardi also appears in the Top-Italian-Scientists database, which lists scientists whose h-index is greater than 30. In March 2017, she was given an IBM Faculty Award for her research on social recommender systems. In December 2018, Velardi was included in the list of the 50 most influential Italian women in science and technology by Inspiring Fifty, a non-profit that aims to increase diversity in STEM by making female role models in tech more visible. In September 2019 she was the local co-organizer and Program Chair of the 6th ACM Celebration of Women in Computing. In November 2019 Velardi received the Standout Woman Award International at the seat of the Italian Parliament in Montecitorio.

== Causes ==

Velardi aims to debunk the myth of computer science as a male-oriented and "inflexible" discipline. She is the founder of the project "NERD? Non e' roba per donne?" (Italian for "NERD? Is it not stuff for women?"). Velardi launched the project in 2012 in the Department of Computer Science at Sapienza University. Since 2013 the project has been carried out in partnership with IBM Italy, which later created a spin-off of the project.
The goal of the project is two-fold: (1) conveying computer science as a creative, interdisciplinary and problem-solving-oriented science, and (2) encouraging young female students to study computer science by, for instance, developing apps for smartphones. She has been the program chair of the 19th ACM Celebration of Women in Computing. She is the creator and coordinator of G4GRETA, an educational project that involves third- and fourth-grade students in Rome and Lazio. The project combines the development of IT skills with the themes of environmental sustainability and soft skills (teambuilding, pitching, social networking, etc.). Velardi is also involved in scientific dissemination. In 2020 and 2021 she cooperated with RaiCultura, the cultural division of RAI, the national broadcasting company.