In many Unix variants, "nobody" is the conventional name of a user identifier which owns no files, is in no privileged groups, and has no abilities except those which every other user has. It is normally not enabled as a user account, i.e. has no home directory or login credentials assigned. Some systems also define an equivalent group "nogroup". == Uses == The pseudo-user "nobody" and group "nogroup" are used, for example, in the NFSv4 implementation of Linux by idmapd, if a user or group name in an incoming packet does not match any known username on the system. It was once common to run daemons as nobody, especially on servers, in order to limit the damage that could be done by a malicious user who gained control of them. However, the usefulness of this technique is reduced if more than one daemon is run like this, because then gaining control of one daemon would provide control of them all. The reason is that processes owned by the same user have the ability to send signals to each other and use debugging facilities to read or even modify each other's memory. Modern practice, as recommended by the Linux Standard Base, is to create a separate user account for each daemon.
Machine unlearning
Machine unlearning is a branch of machine learning focused on removing specific undesired element, such as private data, wrong or manipulated training data, outdated information, copyrighted material, harmful content, dangerous abilities, or misinformation, without needing to rebuild models from the ground up. Large language models, like the ones powering ChatGPT, may be asked not just to remove specific elements but also to unlearn a "concept," "fact," or "knowledge," which aren't easily linked to specific examples. New terms such as "model editing," "concept editing," and "knowledge unlearning" have emerged to describe this process. == History == Early research efforts were largely motivated by Article 17 of the GDPR, the European Union's privacy regulation commonly known as the "right to be forgotten" (RTBF), introduced in 2014. The GDPR did not anticipate that the development of large language models would make data erasure a complex task. This issue has since led to research on "machine unlearning," with a growing focus on removing copyrighted material, harmful content, dangerous capabilities, and misinformation. Just as early experiences in humans shape later ones, some concepts are more fundamental and harder to unlearn. A piece of knowledge may be so deeply embedded in the model's knowledge graph that unlearning it could cause internal contradictions, requiring adjustments to other parts of the graph to resolve them. Researchers have now also started studying unlearning in the context of removing incorrect or adversarially manipulated training data such as systematically biased labels or poisoning attacks. == Motivations == At present, machine unlearning is motivated by a growing range of concerns that extend well beyond the field's original focus on data privacy. A widely used taxonomy in the literature distinguishes two high-level categories of motivation. Access revocation covers cases where a data subject or rights holder requests the removal of data they own or control. This is most commonly associated with RTBF established by the European Union's General Data Protection Regulation (GDPR) and analogous legislation such as the California Consumer Privacy Act (CCPA). These regulations grant individuals the legal right to request erasure of their personal data from any system that has processed it, including models that were trained on it. Access revocation also encompasses the removal of copyrighted or pay-walled content that was incorporated into training corpora without the necessary licenses, a concern that has become prominent with the widespread use of largely web-scraped pre-training datasets. Model correction covers cases where the model exhibits undesirable behavior arising from the training data, regardless of any individual's request. This includes: Removal of toxic, biased, or unsafe outputs introduced by harmful content in the training set Correction of stale or factually incorrect associations, such as outdated knowledge encoded in a deployed model Removal of dangerous capabilities, such as detailed knowledge of the synthesis of chemical or biological agents Correction of the influence of data poisoning or adversarial attacks that have corrupted model behavior This second category has been formalized as corrective machine unlearning, which frames unlearning as a post-training mechanism for repairing the effects of bad or harmful training data. It is closely related to the AI safety literature, where data filtering alone has been found insufficient to prevent hazardous knowledge from being encoded in model weights, motivating unlearning as a complementary risk mitigation strategy. A further distinction has been drawn in the literature between removal {eliminating the influence of specific training data on model parameters) and suppression (preventing the model from generating specific outputs regardless of how that knowledge is encoded). These two goals are not equivalent: removing training data does not guarantee meaningful output suppression, and suppressing outputs does not constitute removal of the underlying training data's influence. == SISA Training == SISA is a training strategy consisting of four mechanisms designed to make machine unlearning more efficient by structuring how models are trained and updated. Its goal is to allow a system to remove the influence of specific data points without retraining an entire model from scratch. By reorganizing training data and workflows, SISA reduces the computational burden of unlearning requests. Sharding divides the training dataset into multiple disjoint subsets, or shards. Each shard is used to train a separate model instance. This ensures that a single data point affects only one shard, so unlearning it requires updating only the corresponding shard rather than the full model. Isolation refers to training each shard independently, with nothing shared across shards during the training process. This separation prevents cross-contamination between shards, ensuring that forgetting data in one shard does not require adjustments to any others. Slicing breaks the data within each shard into sequential slices and stores model states after each slice is trained on. When an unlearning request targets a piece of data, the system can roll back to the checkpoint before the point was seen and retrain only from that slice forward. This reduces retraining time even within a shard. Aggregation occurs at inference, when the model is queried. It combines the outputs of each shard to determine the output of the overall model. This is often through majority voting or averaging. This allows SISA-trained systems to behave like a single model despite being composed of multiple shard-level models. Together, these mechanisms enable machine learning systems to forget specific data points with far lower computational cost than full retraining. The trade-off is that sharding and slicing can lead to reduced model accuracy, worse generalization, and increased storage requirements for the intermediate checkpoints. This can be tolerable based on the needs of the individual or organization to comply with "right to be forgotten" or efficiently recover from backdoor attacks. == Algorithms == Machine unlearning algorithms are broadly categorized into exact and approximate methods, reflecting a fundamental trade-off between formal guarantees and computational tractability. === Exact Unlearning === Exact unlearning methods produce a model that is statistically indistinguishable from one retrained from scratch on the dataset with the forget data removed. The canonical framework for exact unlearning is SISA Training (Sharded, Isolated, Sliced, and Aggregated), introduced by Bourtoule et al. (2021). SISA partitions the training dataset into disjoint shards and trains a separate sub-model on each. At inference time, predictions are aggregated across sub-models. When an unlearning request is received, only the sub-model corresponding to the shard containing the target data requires retraining, reducing computational overhead proportionally to the number of shards. Exact methods provide the strongest guarantees but become prohibitively expensive for large pre-trained neural networks and are generally limited to settings where training can be structured in advance. === Approximate Unlearning === Approximate unlearning methods seek to produce a model whose behavior is sufficiently close to an exactly unlearned model without the cost of full retraining. These methods dominate practical applications. Common approaches include: Gradient Ascent: The model is fine-tuned by maximizing the loss on the forget set, directly degrading its performance on targeted data. This is the most direct approach but risks destabilizing performance on retained data. Random Labelling: The model is fine-tuned on the forget set using randomly shuffled labels, confusing its associations with the targeted data while producing a less aggressive weight shift than pure gradient ascent. Gradient Difference: Combines gradient ascent on the forget set with simultaneous gradient descent on the retain set, using the retain objective as a regularizer to preserve general model utility. KL Divergence Regularization: Minimizes the KL divergence between the outputs of the unlearned model and the original model on the retain set, anchoring behavior on data the model should remember. Weight Pruning and Fine-tuning: Parameters with the smallest L1-norm are pruned — targeting weights most weakly associated with general knowledge and potentially most associated with the forget set — followed by fine-tuning on the retain set to restore utility. Layer Reset and Fine-tuning: The first or last k layers are re-initialized to random weights and the model is subsequently fine-tuned on the retain set. This is a coarse but computationally simple approach. Selective Synaptic Dampening: Uses influence functions to estimate the effect of individual trainin
Cancer Likelihood in Plasma
Cancer Likelihood in Plasma (CLiP) refers to a set of ensemble learning methods for integrating various genomic features useful for the noninvasive detection of early cancers from blood plasma. An application of this technique for early detection of lung cancer (Lung-CLiP) was originally described by Chabon et al. (2020) from the labs of Ash Alizadeh and Max Diehn at Stanford. This method relies on several improvements to cancer personalized profiling by deep sequencing (CAPP-Seq) for analysis of circulating tumor DNA (ctDNA). The CLiP technique integrates multiple distinctive genomic features of a cancer of interest findings within a machine-learning framework for cancer detection. For example, studies have shown that the majority of somatic mutations found in cell-free DNA (cfDNA) are not tumor derived, but instead reflect clonal hematopoeisis (also known as CHIP). Even though CHIP tends to target specific genes, it also involves many generally non-recurrent mutations that can be shed from leukocytes and detected in cfDNA, regardless of whether profiling patients with cancer and healthy adults. However, genuine tumor derived ctDNA mutations can be distinguished from CHIP-derived mutations. This is because unlike tumor-derived mutations, CHIP-derived mutations that are shed from leukocytes into plasma tend to occur on longer cfDNA fragments, and to lack specific mutational signatures such as those associated with tobacco smoking in lung cancer that are also found in tumor derived ctDNA molecules. CLiP integrates these features within hierarchical ensemble machine learning models that consider somatic mutations and copy number alternations, among other features. While the CLiP method is unique in relying exclusively on mutations and copy number alterations, it is related to a variety of other liquid biopsy methods being commercially developed for early cancer detection using ctDNA and proteins (e.g., CancerSEEK / DETECT-A ), cfDNA fragmentation patterns (e.g., DELFI), and DNA methylation (e.g., cfMeDIP-Seq, Grail). While the CLiP method has not yet been broadly applied for population-based cancer screening, it has been shown to distinguish discriminate early-stage lung cancers from risk-matched controls across multiple cohorts of patients enrolled across the US.
Kullback–Leibler Upper Confidence Bound
In multi-armed bandit problems, KL-UCB (for Kullback–Leibler Upper Confidence Bound) is a UCB-type algorithm that is asymptotically optimal, in the sense that its regret matches the problem-dependent Lai-Robbins lower bound. == Multi-armed bandit problem == The Multi-armed bandit problem is a sequential game where one player has to choose at each turn between K {\displaystyle K} actions (arms). Behind every arm a {\displaystyle a} there is an unknown distribution ν a {\displaystyle \nu _{a}} that lies in a set D {\displaystyle {\mathcal {D}}} known by the player (for example, D {\displaystyle {\mathcal {D}}} can be the set of Gaussian distributions or Bernoulli distributions). At each turn t {\displaystyle t} the player chooses (pulls) an arm a t {\displaystyle a_{t}} , he then gets an observation X t {\displaystyle X_{t}} of the distribution ν a t {\displaystyle \nu _{a_{t}}} . === Regret minimization === The goal is to minimize the regret at time T {\displaystyle T} that is defined as R T := ∑ a = 1 K Δ a E [ N a ( T ) ] {\displaystyle R_{T}:=\sum _{a=1}^{K}\Delta _{a}\mathbb {E} [N_{a}(T)]} where μ a := E [ ν a ] {\displaystyle \mu _{a}:=\mathbb {E} [\nu _{a}]} is the mean of arm a {\displaystyle a} μ ∗ := max a μ a {\displaystyle \mu ^{}:=\max _{a}\mu _{a}} is the highest mean Δ a := μ ∗ − μ a {\displaystyle \Delta _{a}:=\mu ^{}-\mu _{a}} N a ( t ) {\displaystyle N_{a}(t)} is the number of pulls of arm a {\displaystyle a} up to turn t {\displaystyle t} The player has to find an algorithm that chooses at each turn t {\displaystyle t} which arm to pull based on the previous actions and observations ( a s , X s ) s < t {\displaystyle (a_{s},X_{s})_{s
AI notetaker
An AI notetaker is a tool using artificial intelligence to take notes during meetings. They are created by tech companies such as Microsoft and Google; by AI transcription services such Otter.ai, and by smaller firms such as Cluely and Krisp. Some business executives send AI notetakers to attend meetings not only to take notes, but also to answer questions on their behalf. The use of AI notetakers raises ethical questions, including recording meetings without the consent of all participants and the possibility that the notetaker will hallucinate and misrepresent what was said during meetings. There are also concerns when it comes to the privacy and security of meeting data and the sensitive information that lives inside meetings. Further controversies have developed from the use of AI notetakers such as Cluely to cheat in technical job interviews. == Technology == Large technology companies have integrated transcription capabilities into broader productivity and accessibility tools, including real-time captioning, dictation, and meeting documentation features embedded in operating systems and office platforms. Standalone transcription platforms, such as Transkriptor, focus specifically on automated transcription workflows and apply AI-based speech recognition to convert audio and video recordings into text. The software supports transcription in multiple languages and processes recordings uploaded via a web interface as well as through mobile and browser extensions. Tools of this type typically provide editable, time-aligned transcripts and export options for text and subtitle formats, cloud-based processing, multilingual support, and automation in transcription technology.
Plum Voice
The Plum Group, Inc. (DBA Plum Voice) is a company. Plum is headquartered in New York City with offices in Boston and Denver. == History == Plum Voice, founded in 2000 as The Plum Group, Inc., was incorporated to create technologies for personalized audio communication. By 2001, Plum had commercialized the open-standard Plum VoiceXML IVR platform which facilitated the creation of dynamic telecom applications. 2001 - Commercial launch of Plum VoiceXML IVR platform for customer-premises deployment 2002 - Launch of Plum Voice Hosting Centers for 24x7x365 managed IVR hosting 2004 - Plum Voice application suite receives a "Product of the Year" award from Customer Interactions magazine 2008 - Plum Survey builder launched, a do-it-yourself IVR survey tool. 2010 - Plum launched QuickFuse, a web-based rapid development platform used to create voice applications. 2013 - Plum launched VoiceTrends, an analytics and reporting toolkit designed specifically for voice applications. Plum achieves PCI-DSS Level 1. 2015 - Plum launched Plum Insight, a multi-channel (voice, web, mobile) survey platform. Plum achieves HIPAA compliance. 2016 - Plum launched a new version of QuickFuse called Fuse+. 2020 - Plum sunsets QuickFuse, rebrands Fuse+ as Plum Fuse.
Kinodynamic planning
In robotics and motion planning, kinodynamic planning is a class of problems for which velocity, acceleration, and force/torque bounds must be satisfied, together with kinematic constraints such as avoiding obstacles. The term was coined by Bruce Donald, Pat Xavier, John Canny, and John Reif. Donald et al. developed the first polynomial-time approximation schemes (PTAS) for the problem. By providing a provably polynomial-time ε-approximation algorithm, they resolved a long-standing open problem in optimal control. Their first paper considered time-optimal control ("fastest path") of a point mass under Newtonian dynamics, amidst polygonal (2D) or polyhedral (3D) obstacles, subject to state bounds on position, velocity, and acceleration. Later they extended the technique to many other cases, for example, to 3D open-chain kinematic robots under full Lagrangian dynamics. == Modern approaches == Since the foundational theoretical work of the 1990s, the field has evolved significantly with new algorithmic approaches that address the computational and practical limitations of early methods. === Sampling-based methods === Many practical heuristic algorithms based on stochastic optimization and iterative sampling have been developed by a wide range of authors to address the kinodynamic planning problem. Popular approaches include extensions of RRT algorithms such as RRT for kinodynamic systems, and sampling-based methods like Model Predictive Path Integral (MPPI) control. These stochastic techniques have been shown to work well in practice and can handle complex, high-dimensional state spaces more efficiently than deterministic methods. However, all motion planning methods are subject to the PSPACE-hardnesss of classical motion planning even without dynamics, which means (assuming the usual structural complexity conjectures) they all can be worst-case exponential-time in the state-space dimension (the number of degrees of freedom). On the other hand, the deterministic methods have provable guarantees of completeness, accuracy, and complexity (for fixed dimension, they are polynomial-time not only in the geometric complexity, but also in ( 1 / ε ) {\displaystyle (1/\varepsilon )} , the closeness of the desired approximation), whereas most of the recent heuristic/stochastic methods sacrifice at least one of these criteria. === Mixed-integer optimization approaches === Recent advances in mixed-integer programming have enabled new deterministic approaches to kinodynamic planning. These methods formulate the planning problem as an optimization task that simultaneously determines the spatial path and control sequence while respecting all kinodynamic constraints. By using techniques such as McCormick envelopes to handle bilinear constraints, these approaches can provide globally optimal solutions with mathematical guarantees while achieving significant computational speedups over traditional methods. === Genetic algorithm approaches === Genetic algorithms have also been adapted for kinodynamic planning, particularly for gradient-free optimization in challenging terrain. These methods use evolutionary computation to optimize trajectories over receding horizons, with specialized mutation operators that ensure vehicle controls remain within operational limits. This approach is particularly useful when dealing with non-differentiable cost functions or when gradient information is unavailable or unreliable. === Three-dimensional terrain planning === The foundational theoretical work of the 1990s was extended to higher degrees of freedom, and even to n {\displaystyle n} -link, 3D open-chain kinematic robots under full Lagrangian dynamics. However, many of the subsequent heuristic techniques (typically employing stochastic optimization) were confined to planar environments. More recent kinodynamic planning has extended beyond these planar environments to handle complex 3D terrains represented as simplicial complexes or triangular meshes. This advancement is particularly important for applications such as autonomous vehicle navigation in off-road environments, where elevation changes and terrain geometry significantly impact vehicle dynamics. These methods must account for pitch angles, surface curvature, and the coupling between terrain geometry and vehicle kinodynamic constraints. == Performance and guarantees == The landscape of performance guarantees in kinodynamic planning has evolved considerably. While early heuristic methods could not guarantee optimality, recent mixed-integer approaches have demonstrated the ability to find globally optimal solutions with proven constraint satisfaction. Experimental comparisons have shown that modern optimization-based planners can achieve execution times several orders of magnitude faster than sampling-based methods while maintaining strict adherence to kinodynamic constraints. However, the choice of method often depends on the specific application requirements. Sampling-based methods remain valuable for their ability to quickly find feasible solutions in high-dimensional spaces and their robustness to modeling uncertainties. Optimization-based methods excel when optimality guarantees and constraint compliance are critical, particularly in safety-critical applications. == Applications == Kinodynamic planning finds applications across numerous domains including: Autonomous vehicles: Path planning for cars, trucks, and other ground vehicles that must respect acceleration, steering, and velocity limits Aerial robotics: Trajectory planning for quadrotors and other unmanned aerial vehicles with dynamic constraints Manipulation: Planning for robotic arms where joint velocities, accelerations, and torques are limited Legged locomotion: Footstep and trajectory planning for walking and running robots Space robotics: Planning under thrust and fuel constraints for spacecraft and rovers