Backdoor (computing)

Backdoor (computing)

A backdoor is a typically covert method of bypassing normal authentication or encryption in a computer, product, embedded device (e.g. a home router), or its embodiment (e.g. part of a cryptosystem, algorithm, chipset, or even a "homunculus computer"—a tiny computer-within-a-computer such as that found in Intel's AMT technology). Backdoors are most often used for securing remote access to a computer, or obtaining access to plaintext in cryptosystems. From there it may be used to gain access to privileged information like passwords, corrupt or delete data on hard drives, or transfer information within compromised networks. In the United States, the 1994 Communications Assistance for Law Enforcement Act forces internet providers to provide backdoors for government authorities. In 2024, the U.S. government realized that China had been tapping communications in the U.S. using that infrastructure for months, or perhaps longer; China recorded presidential candidate campaign office phone calls—including employees of the then-vice president of the nation, and of the candidates themselves. A backdoor may take the form of a hidden part of a program, a separate program (e.g. Back Orifice may subvert the system through a rootkit), code in the firmware of the hardware, or parts of an operating system such as Windows, for example, device drivers. Trojan horses can be used to create vulnerabilities in a device. A Trojan horse may appear to be an entirely legitimate program, but when executed, it triggers an activity that may install a backdoor. Although some are secretly installed, other backdoors are deliberate and widely known. These kinds of backdoors have "legitimate" uses such as providing the manufacturer with a way to restore user passwords. Many systems that store information within the cloud fail to create accurate security measures. If many systems are connected within the cloud, hackers can gain access to all other platforms through the most vulnerable system. Default passwords (or other default credentials) can function as backdoors if they are not changed by the user. Some debugging features can also act as backdoors if they are not removed in the release version. In 1993, the United States government attempted to deploy an encryption system, the Clipper chip, with an explicit backdoor for law enforcement and national security access. The chip was unsuccessful. Recent proposals to counter backdoors include creating a database of backdoors' triggers and then using neural networks to detect them. == Overview == The threat of backdoors surfaced when multiuser and networked operating systems became widely adopted. Petersen and Turn discussed computer subversion in a paper published in the proceedings of the 1967 AFIPS Conference. They noted a class of active infiltration attacks that use "trapdoor" entry points into the system to bypass security facilities and permit direct access to data. The use of the word trapdoor here clearly coincides with more recent definitions of a backdoor. However, since the advent of public key cryptography the term trapdoor has acquired a different meaning (see: Trapdoor function), and thus the term "backdoor" is now preferred, only after the term trapdoor went out of use. More generally, such security breaches were discussed at length in a RAND Corporation task force report published under DARPA sponsorship by J.P. Anderson and D.J. Edwards in 1970. While initially targeting the computer vision domain, backdoor attacks have expanded to encompass various other domains, including text, audio, ML-based computer-aided design, and ML-based wireless signal classification. Additionally, vulnerabilities in backdoors have been demonstrated in deep generative models, reinforcement learning (e.g., AI GO), and deep graph models. These broad-ranging potential risks have prompted concerns from national security agencies regarding their potentially disastrous consequences. A backdoor in a login system might take the form of a hard coded user and password combination which gives access to the system. An example of this sort of backdoor was used as a plot device in the 1983 film WarGames, in which the architect of the "WOPR" computer system had inserted a hardcoded password-less account which gave the user access to the system, and to undocumented parts of the system (in particular, a video game-like simulation mode and direct interaction with the artificial intelligence). Although the number of backdoors in systems using proprietary software (software whose source code is not publicly available) is not widely credited, they are nevertheless frequently exposed. Programmers have even succeeded in secretly installing large amounts of benign code as Easter eggs in programs, although such cases may involve official forbearance, if not actual permission. == Examples == === Worms === Many computer worms, such as Sobig and Mydoom, install a backdoor on the affected computer (generally a PC on broadband running Microsoft Windows and Microsoft Outlook). Such backdoors appear to be installed so that spammers can send junk e-mail from the infected machines. Others, such as the Sony/BMG rootkit, placed secretly on millions of music CDs through late 2005, are intended as DRM measures—and, in that case, as data-gathering agents, since both surreptitious programs they installed routinely contacted central servers. A sophisticated attempt to plant a backdoor in the Linux kernel, exposed in November 2003, added a small and subtle code change by subverting the revision control system. In this case, a two-line change appeared to check root access permissions of a caller to the sys_wait4 function, but because it used assignment = instead of equality checking ==, it actually granted permissions to the system. This difference is easily overlooked, and could even be interpreted as an accidental typographical error, rather than an intentional attack. In January 2014, a backdoor was discovered in certain Samsung Android products, like the Galaxy devices. The Samsung proprietary Android versions are fitted with a backdoor that provides remote access to the data stored on the device. In particular, the Samsung Android software that is in charge of handling the communications with the modem, using the Samsung IPC protocol, implements a class of requests known as remote file server (RFS) commands, that allows the backdoor operator to perform via modem remote I/O operations on the device hard disk or other storage. As the modem is running Samsung proprietary Android software, it is likely that it offers over-the-air remote control that could then be used to issue the RFS commands and thus to access the file system on the device. === Object code backdoors === Harder to detect backdoors involve modifying object code, rather than source code—object code is much harder to inspect, as it is designed to be machine-readable, not human-readable. These backdoors can be inserted either directly in the on-disk object code, or inserted at some point during compilation, assembly linking, or loading—in the latter case the backdoor never appears on disk, only in memory. Object code backdoors are difficult to detect by inspection of the object code, but are easily detected by simply checking for changes (differences), notably in length or in checksum, and in some cases can be detected or analyzed by disassembling the object code. Further, object code backdoors can be removed (assuming source code is available) by simply recompiling from source on a trusted system. Thus for such backdoors to avoid detection, all extant copies of a binary must be subverted, and any validation checksums must also be compromised, and source must be unavailable, to prevent recompilation. Alternatively, these other tools (length checks, diff, checksumming, disassemblers) can themselves be compromised to conceal the backdoor, for example detecting that the subverted binary is being checksummed and returning the expected value, not the actual value. To conceal these further subversions, the tools must also conceal the changes in themselves—for example, a subverted checksummer must also detect if it is checksumming itself (or other subverted tools) and return false values. This leads to extensive changes in the system and tools being needed to conceal a single change. As object code can be regenerated by recompiling (reassembling, relinking) the original source code, making a persistent object code backdoor (without modifying source code) requires subverting the compiler itself—so that when it detects that it is compiling the program under attack it inserts the backdoor—or alternatively the assembler, linker, or loader. As this requires subverting the compiler, this in turn can be fixed by recompiling the compiler, removing the backdoor insertion code. This defense can in turn be subverted by putting a source meta-backdoor in the compiler, so that when it detects that it is compiling itself

H (company)

H Company, also known simply as H, is a French artificial intelligence startup which develops "action-oriented" artificial intelligence agents for enterprise automation and productivity. In May 2024, H Company closed a record-setting $220 million seed round, at the time the largest AI raise in Europe. In 2026, H Company released Holo 3, the latest generation of its computer-use AI models. The update marked a major advance in agentic AI, enabling agents to navigate any user interface, interpret screens, and complete complex, multi-step tasks across enterprise systems—much like a human user. This breakthrough positioned H Company at the frontier of computer-use autonomy, accelerating the integration of AI in enterprise workflows. == History == H Company was founded in 2023 in Paris by Laurent Sifre, Charles Kantor, and three DeepMind veterans: Daan Wiestra, Karl Tuyls, Julien Perollat. In May 2024, the firm secured what was then the largest European AI seed round, totaling $220 million led by US investors including Eric Schmidt (former Google CEO), Amazon, and backed by Accel, Bpifrance, UiPath, Eurazeo, Xavier Niel, Yuri Milner, Bernard Arnault, Samsung and others. In August 2024, three cofounders (Wiestra, Tuyls, Perollat) left the company over operational disagreements. In November 2024, H launched Runner H, its first agentic-API platform, which combined a large language model (LLM) and a reduced, 2-billion parameter vision-language model (VLM). In May 2025, H Company acquired Mithril Security, and in June 2025 the company widened its offering for agentic models. In June 2025, Gautier Cloix (formerly CEO Palantir France) replaced Charles Kantor as CEO of H Company, aiming to pivot the company towards a "forward deployed engineers" model. In July 2025, H Company introduced Surfer-H-CLI, an open-source, web-native Chrome agent designed for browser-based automation—able to search, scroll, click, and type on behalf of users and controllable via any visual language model (VLM). When paired with its June 2025 open-sourced 3B-parameter Holo-1 model, Surfer-H-CLI achieved 92.2% WebVoyager benchmark accuracy. == Activity == H Company creates enterprise AI models and agents (agentic AI) to automate and optimize complex workflows. H Company specifically designs AI agents called computer use capable of autonomously interfacing with any software (local or cloud-based) to detect and automate repetitive operations. H Company is based in Paris, France, with international offices in London and New York. H Company raised $220 million since its inception. Gautier Cloix is president and CEO of the company. H Company client include the French national lottery FDJ United. In March 2026, H Company released Holo3, a family of artificial intelligence models designed to operate digital systems by interacting directly with user interfaces. Holo3 enables agents ("virtual humanoids") to understand what is displayed in front-end environments—such as web pages, desktop applications, and other graphical user interfaces—and perform actions such as clicking, typing, and navigating across them to complete multi-step tasks. On the OSWorld-Verified benchmark, Holo3 reportedly achieved about 78.9%, surpassing the scores of OpenAI’s GPT‑5.4 and Anthropic’s Claude Opus 4.6 on this specific test, at roughly one-tenth of the inference cost of these proprietary systems. The release has been presented as a significant step toward automating routine digital workflows, allowing organizations to offload repetitive on-screen work, such as data entry and reconciliation across multiple tools, to AI-based agents.

Bootstrap aggregating

Bootstrap aggregating, also called bagging (from bootstrap aggregating) or bootstrapping, is a machine learning (ML) ensemble meta-algorithm designed to improve the stability and accuracy of ML classification and regression algorithms. It also reduces variance and overfitting. Although it is usually applied to decision tree methods, it can be used with any type of method. Bagging is a special case of the ensemble averaging approach. == Description of the technique == Given a standard training set D {\displaystyle D} of size n {\displaystyle n} , bagging generates m {\displaystyle m} new training sets D i {\displaystyle D_{i}} , each of size n ′ {\displaystyle n'} , by sampling from D {\displaystyle D} uniformly and with replacement. By sampling with replacement, some observations may be repeated in each D i {\displaystyle D_{i}} . If n ′ = n {\displaystyle n'=n} , then for large n {\displaystyle n} the set D i {\displaystyle D_{i}} is expected to have the fraction (1 - 1/e) (~63.2%) of the unique samples of D {\displaystyle D} , the rest being duplicates. This kind of sample is known as a bootstrap sample. Sampling with replacement ensures each bootstrap is independent from its peers, as it does not depend on previous chosen samples when sampling. Then, m {\displaystyle m} models are fitted using the above bootstrap samples and combined by averaging the output (for regression) or voting (for classification). Bagging leads to "improvements for unstable procedures", which include, for example, artificial neural networks, classification and regression trees, and subset selection in linear regression. Bagging was shown to improve preimage learning. On the other hand, it can mildly degrade the performance of stable methods such as k-nearest neighbors. == Process of the algorithm == === Key Terms === There are three types of datasets in bootstrap aggregating. These are the original, bootstrap, and out-of-bag datasets. Each section below will explain how each dataset is made except for the original dataset. The original dataset is whatever information is given. === Creating the bootstrap dataset === The bootstrap dataset is made by randomly picking objects from the original dataset. Also, it must be the same size as the original dataset. However, the difference is that the bootstrap dataset can have duplicate objects. Here is a simple example to demonstrate how it works along with the illustration below: Suppose the original dataset is a group of 12 people. Their names are Emily, Jessie, George, Constantine, Lexi, Theodore, John, James, Rachel, Anthony, Ellie, and Jamal. By randomly picking a group of names, let us say our bootstrap dataset had James, Ellie, Constantine, Lexi, John, Constantine, Theodore, Constantine, Anthony, Lexi, Constantine, and Theodore. In this case, the bootstrap sample contained four duplicates for Constantine, and two duplicates for Lexi, and Theodore. === Creating the out-of-bag dataset === The out-of-bag dataset represents the remaining people who were not in the bootstrap dataset. It can be calculated by taking the difference between the original and the bootstrap datasets. In this case, the remaining samples who were not selected are Emily, Jessie, George, Rachel, and Jamal. Keep in mind that since both datasets are sets, when taking the difference the duplicate names are ignored in the bootstrap dataset. The illustration below shows how the math is done: === Application === Creating the bootstrap and out-of-bag datasets is crucial since it is used to test the accuracy of ensemble learning algorithms like random forest. For example, a model that produces 50 trees using the bootstrap/out-of-bag datasets will have a better accuracy than if it produced 10 trees. Since the algorithm generates multiple trees and therefore multiple datasets the chance that an object is left out of the bootstrap dataset is low. The next few sections talk about how the random forest algorithm works in more detail. === Creation of Decision Trees === The next step of the algorithm involves the generation of decision trees from the bootstrapped dataset. To achieve this, the process examines each gene/feature and determines for how many samples the feature's presence or absence yields a positive or negative result. This information is then used to compute a confusion matrix, which lists the true positives, false positives, true negatives, and false negatives of the feature when used as a classifier. These features are then ranked according to various classification metrics based on their confusion matrices. Some common metrics include estimate of positive correctness (calculated by subtracting false positives from true positives), measure of "goodness", and information gain. These features are then used to partition the samples into two sets: those that possess the top feature, and those that do not. The diagram below shows a decision tree of depth two being used to classify data. For example, a data point that exhibits Feature 1, but not Feature 2, will be given a "No". Another point that does not exhibit Feature 1, but does exhibit Feature 3, will be given a "Yes". This process is repeated recursively for successive levels of the tree until the desired depth is reached. At the very bottom of the tree, samples that test positive for the final feature are generally classified as positive, while those that lack the feature are classified as negative. These trees are then used as predictors to classify new data. === Random Forests === The next part of the algorithm involves introducing yet another element of variability amongst the bootstrapped trees. In addition to each tree only examining a bootstrapped set of samples, only a small but consistent number of unique features are considered when ranking them as classifiers. This means that each tree only knows about the data pertaining to a small constant number of features, and a variable number of samples that is less than or equal to that of the original dataset. Consequently, the trees are more likely to return a wider array of answers, derived from more diverse knowledge. This results in a random forest, which possesses numerous benefits over a single decision tree generated without randomness. In a random forest, each tree "votes" on whether or not to classify a sample as positive based on its features. The sample is then classified based on majority vote. An example of this is given in the diagram below, where the four trees in a random forest vote on whether or not a patient with mutations A, B, F, and G has cancer. Since three out of four trees vote yes, the patient is then classified as cancer positive. Because of their properties, random forests are considered one of the most accurate data mining algorithms, are less likely to overfit their data, and run quickly and efficiently even for large datasets. They are primarily useful for classification as opposed to regression, which attempts to draw observed connections between statistical variables in a dataset. This makes random forests particularly useful in such fields as banking, healthcare, the stock market, and e-commerce where it is important to be able to predict future results based on past data. One of their applications would be as a useful tool for predicting cancer based on genetic factors, as seen in the above example. There are several important factors to consider when designing a random forest. If the trees in the random forests are too deep, overfitting can still occur due to over-specificity. If the forest is too large, the algorithm may become less efficient due to an increased runtime. Random forests also do not generally perform well when given sparse data with little variability. However, they still have numerous advantages over similar data classification algorithms such as neural networks, as they are much easier to interpret and generally require less data for training. As an integral component of random forests, bootstrap aggregating is very important to classification algorithms, and provides a critical element of variability that allows for increased accuracy when analyzing new data, as discussed below. == Improving Random Forests and Bagging == While the techniques described above utilize random forests and bagging (otherwise known as bootstrapping), there are certain techniques that can be used in order to improve their execution and voting time, their prediction accuracy, and their overall performance. The following are key steps in creating an efficient random forest: Specify the maximum depth of trees: Instead of allowing the random forest to continue until all nodes are pure, it is better to cut it off at a certain point in order to further decrease chances of overfitting. Prune the dataset: Using an extremely large dataset may create results that are less indicative of the data provided than a smaller set that more accurately represents what is being focused on. Continue pruning the data at each

International Conference on Computer Vision

The International Conference on Computer Vision (ICCV) is a research conference sponsored by the Institute of Electrical and Electronics Engineers (IEEE) held every other year. It is considered to be one of the top conferences in computer vision, alongside CVPR and ECCV, and it is held on years in which ECCV is not. The conference is usually spread over four to five days. Typically, experts in the focus areas give tutorial talks on the first day, then the technical sessions (and poster sessions in parallel) follow. Recent conferences have also had an increasing number of focused workshops and a commercial exhibition. == Awards == === Azriel Rosenfeld Lifetime Achievement Award === The Azriel Rosenfeld Award, or Azriel Rosenfeld Lifetime Achievement Award, recognizes researchers who have made significant contributions to the field of computer vision over their careers. It is named in memory of computer scientist and mathematician Azriel Rosenfeld. The following people have received this award: === Helmholtz Prize === The ICCV Helmholtz Prize, known as the Test of Time Award before 2013, is awarded every other year at the ICCV, recognizing ICCV papers from ten or more years earlier that had a significant impact on computer vision research. Winners are selected by the IEEE Computer Society's Technical Committee on Pattern Analysis and Machine Intelligence. The award is named after the 19th century physician and physicist Hermann von Helmholtz, and the ICCV's award is not related to the various Helmholtz Prizes in physics, or the Hermann von Helmholtz Prize in neuroscience. === Marr Prize === The ICCV best-paper award is the Marr Prize, named after British neuroscientist David Marr. === Mark Everingham Prize === The Mark Everingham Prize is an award given yearly by the Technical Committee on Pattern Analysis and Machine Intelligence of the IEEE Computer Society at the IEEE International Conference on Computer Vision or the European Conference on Computer Vision to commemorate the late Mark Everingham, "one of the rising stars of computer vision", and to encourage others to follow in his footsteps by acting to further progress in the computer vision community as a whole. The prize is given to a researcher, or a team of researchers, who have made a selfless contribution of significant benefit to other members of the computer vision community. The Mark Everingham Prize for Rigorous Evaluation was an award given in 2012 at the British Machine Vision Conference. === PAMI Distinguished Researcher Award === The PAMI Distinguished Researcher Award (until 2013 called Significant Researcher Award) is awarded to candidates whose research projects have significantly contributed to the progress of computer vision. Awards are made based on major research contributions, as well as the role of those contributions in influencing and inspiring other research. Candidates are nominated by the community. The following people have received this award: == Conference list == The conference is usually held in the Spring in various international locations.

Vowpal Wabbit

Vowpal Wabbit (VW) is an open-source fast online interactive machine learning system library and program developed originally at Yahoo! Research, and currently at Microsoft Research. It was started and is led by John Langford. Vowpal Wabbit's interactive learning support is particularly notable including Contextual Bandits, Active Learning, and forms of guided Reinforcement Learning. Vowpal Wabbit provides an efficient scalable out-of-core implementation with support for a number of machine learning reductions, importance weighting, and a selection of different loss functions and optimization algorithms. == Notable features == The VW program supports: Multiple supervised (and semi-supervised) learning problems: Classification (both binary and multi-class) Regression Active learning (partially labeled data) for both regression and classification Multiple learning algorithms (model-types / representations) OLS regression Matrix factorization (sparse matrix SVD) Single layer neural net (with user specified hidden layer node count) Searn (Search and Learn) Latent Dirichlet Allocation (LDA) Stagewise polynomial approximation Recommend top-K out of N One-against-all (OAA) and cost-sensitive OAA reduction for multi-class Weighted all pairs Contextual-bandit (with multiple exploration/exploitation strategies) Multiple loss functions: squared error quantile hinge logistic poisson Multiple optimization algorithms Stochastic gradient descent (SGD) BFGS Conjugate gradient Regularization (L1 norm, L2 norm, & elastic net regularization) Flexible input - input features may be: Binary Numerical Categorical (via flexible feature-naming and the hash trick) Can deal with missing values/sparse-features Other features On the fly generation of feature interactions (quadratic and cubic) On the fly generation of N-grams with optional skips (useful for word/language data-sets) Automatic test-set holdout and early termination on multiple passes bootstrapping User settable online learning progress report + auditing of the model Hyperparameter optimization == Scalability == Vowpal wabbit has been used to learn a tera-feature (1012) data-set on 1000 nodes in one hour. Its scalability is aided by several factors: Out-of-core online learning: no need to load all data into memory The hashing trick: feature identities are converted to a weight index via a hash (uses 32-bit MurmurHash3) Exploiting multi-core CPUs: parsing of input and learning are done in separate threads. Compiled C++ code

Sanctuary (app)

Sanctuary is a mobile app focusing on astrology and mystical services. Users enter their birthday, time of birth, and place of birth information into the app and receive a birth chart as well as daily horoscope readings. Users can also sign up for a monthly membership and receive on-demand astrological readings via a text message format. The service has been described as being “Talkspace for astrology" and "Uber for astrological readings". The mobile app uses an A.I.-driven interface. On May 14, 2019, Apple featured Sanctuary as the App of the Day. == History == Sanctuary initially began as project within the incubator of Lorne Michaels’ Broadway Video Ventures. The app officially launched on March 21, 2019. Its backers include Broadway Video Ventures, Greycroft Partners, and Shari Redstone.

Causal Markov condition

The Causal Markov (CM) condition states that, conditional on the set of all its direct causes, a node is independent of all variables which are not effects or direct causes of that node. In the event that the structure of a Bayesian network accurately depicts causality, the two conditions are equivalent. This is related to the Markov condition, an assumption made in Bayesian probability theory, that every node in a Bayesian network is conditionally independent of its nondescendants, given its parents. Stated loosely, it is assumed that a node has no bearing on nodes which do not descend from it. In a DAG, this local Markov condition is equivalent to the global Markov condition, which states that d-separations in the graph also correspond to conditional independence relations. This also means that a node is conditionally independent of the entire network, given its Markov blanket. A network may accurately embody the Markov condition without depicting causality, in which case it should not be assumed to embody the causal Markov condition. == Motivation == Statisticians are enormously interested in the ways in which certain events and variables are connected. The precise notion of what constitutes a cause and effect is necessary to understand the connections between them. The central idea behind the philosophical study of probabilistic causation is that causes raise the probabilities of their effects, all else being equal. A deterministic interpretation of causation means that if A causes B, then A must always be followed by B. In this sense, smoking does not cause cancer because some smokers never develop cancer. On the other hand, a probabilistic interpretation simply means that causes raise the probability of their effects. In this sense, changes in meteorological readings associated with a storm do cause that storm, since they raise its probability. (However, simply looking at a barometer does not change the probability of the storm, for a more detailed analysis, see:). == Examples == In a simple view, releasing one's hand from a hammer causes the hammer to fall. However, doing so in outer space does not produce the same outcome, calling into question if releasing one's fingers from a hammer always causes it to fall. A causal graph could be created to acknowledge that both the presence of gravity and the release of the hammer contribute to its falling. However, it would be very surprising if the surface underneath the hammer affected its falling. This essentially states the Causal Markov Condition, that given the existence of gravity the release of the hammer, it will fall regardless of what is beneath it. == Implications == === Dependence and Causation === It follows from the definition that if X and Y are in V and are probabilistically dependent, then either X causes Y, Y causes X, or X and Y are both effects of some common cause Z in V. This definition was seminally introduced by Hans Reichenbach as the Common Cause Principle (CCP). === Screening === It once again follows from the definition that the parents of X screen X from other "indirect causes" of X (parents of Parents(X)) and other effects of Parents(X) which are not also effects of X.