A web presence is a location on the World Wide Web where a person, business, or other entity is represented (see also web property and point of presence). Examples of a web presence for a person include a personal website, a blog, a profile page, a wiki page, or a social media point of presence (e.g., a LinkedIn profile, a Facebook account, or a Twitter account). Examples of a web presence for a business or other entity include a corporate website, a microsite, a page on a review site, a wiki page, or a social media point of presence (e.g., a LinkedIn company page and/or group, a Facebook business/brand/product page, or a Twitter account). Every web presence is associated with a unique web address that distinguishes one point of presence from another.

== Owned vs. unowned ==
A web presence can be either owned or unowned. Owned media exists when a single person or group controls the content published on its web presence (e.g., a corporate website or a personal Twitter account). When a single person or group cannot solely control the content, the creator is different from the owner; this is considered unowned media (see earned media). A Wikipedia page or a Yelp page about a person, company, or product is an example of an unowned (or "earned") web presence. A third form of media, known as "paid media", is often included in the discussion of media types: "earned vs. owned vs. paid". Paid media is commonly found in the form of advertisements, but it is not considered a form of web presence.

== Management ==
Web presence management is the process of establishing and maintaining a digital footprint on the web. Three factors are considered: where a person or business has a web presence; how each web presence represents its enterprise; and what is published at a point of presence.
Web presence management is the discipline of determining and governing: the distribution of policy documents; which platforms are most appropriate (e.g., internal vs. external blog, YouTube vs. Vimeo); the single inventory of personal or corporate web presence (e.g., partners or advocates); where on the web a business and any related assets are represented; where on the web a business and any related assets are impersonated or pirated; the association of web properties with the particular entities they represent; who has control over which web properties; new web properties that are not in the personal or corporate inventory (e.g., someone creates a new presence); authorized and unauthorized changes to the creation (e.g., branding) of a web presence; and a workflow for creating a web property that follows corporate standards.

=== Management system ===
The purpose of a web presence management system is to manage the web presence of a person or business. This includes the collection of domain names, websites, social media, and other web pages where the person or business is represented. The tool generally offers the following key functions: new presence discovery, inventory management, change detection, access control, stakeholder coordination, and compliance workflow. A web presence management system is meant to have a broad reach: it emphasizes where a presence has been established, will be established, must be maintained, or must be remediated. An example of a web presence management system is the Brandle Presence Manager. To publish content to the various points of web presence, multiple content management systems, and sometimes social media management systems, are often used. The primary focus of most content and social media management systems is limited to their specific web platforms.

=== Domain names ===
Another aspect of web presence management is managing the collection of domain names registered to the person or business.
Any entity may register multiple domain names for the same property. As a result, it can link alternative spellings, different top-level domains, aliases, brands, or products to the same website. Similarly, negative or derogatory domain names may also be registered to prevent them from being used against the person or business. It is common for a larger business to have domain names registered by multiple employees at multiple domain name registrars, possibly as a result of organizational or geographical requirements. Consequently, a web presence management system can be used to monitor all domain names registered by the business, regardless of the registrars used.

== Discovery ==
Web presence discovery is the process of monitoring the web for a new point of presence about a person or business. Web presence discovery is often included in a web presence management system. Whether a new domain is registered, a new website is published, or a new social media account is established, the event occurs outside of the person's or business's control. The purpose of discovery is therefore to assess each new point of presence and appropriately handle any violations. Web presence discovery differs from content listening. The former involves looking for new properties on the web, whereas the latter refers to analyzing content that already exists to hear how a person or business is seen, often in near real time. Examples of content listening systems include Sysomos and Radian6, which is now a subsidiary of Salesforce.com.

=== Brand protection ===
A person or business may choose to watch for a new web presence that might misrepresent it or mislead an audience, such as one created by counterfeiters, spoofers, or malicious hackers. One of the early software products in the online brand protection marketplace was MarkMonitor, now part of Thomson Reuters. This software helped detect rogue domain names and websites.
However, the modern-day growth of social media has seen a rise in the number of fraudulent brand impersonations. It has become much easier to create a new web presence on those platforms, so impersonations occur more frequently today. As a preventive measure, online brand protection providers are now adding social media to their domain and website discovery options.

=== Security ===
The widespread growth of social media has also made it easier for unauthorized individuals to impersonate an employee. Consequently, social media has become a recognized threat vector in that it can be used to socially engineer an attack on a business. To counter this, companies can use web presence monitoring tools to detect new points of presence on the web and thereby defend against socially engineered attacks.

=== Distributed inventory management ===
A web presence monitoring system can be used by a business to associate a new web property with its corporate inventory. It is designed to address autonomous, distributed behaviors. This usually applies to larger businesses whose geographically diverse employees are more prone to creating new points of presence on the web. For example, a retail chain may allow each local store to create and manage its own web presence to market to and communicate with its local customer base. Similarly, a global business may have teams in each country or region who create and manage a web presence to adapt to local languages or cultures.

== Monitoring ==
Web presence monitoring is the process of monitoring a known inventory of web presence to detect any changes that are made. Web presence monitoring is often included in a web presence management system and can serve multiple purposes for both larger corporations and certain individuals, such as celebrities. Presence monitoring differs from content listening. The former involves monitoring the properties (e.g.
branding) of a web property in an established inventory, whereas the latter refers to analyzing content that already exists to hear how a person or business is seen, often in near real time. Additionally, presence monitoring focuses on owned media and content listening on earned media.

=== Corporate, brand, and regulatory compliance ===
Many companies ensure that certain standards are met for any property on the web that represents their business. Companies in regulated industries, such as finance and healthcare, may be required by law to ensure that all published content, regardless of platform or technology, follows specific requirements. The widespread growth of social media has seen a rise in the number of fraudulent corporate impersonations. It has become much easier to create a new web presence on these platforms, so impersonations are much more prevalent than they used to be. As a preventive measure, a web presence monitoring system alerts the company when a known property is changed, so that the property can be reviewed and, if necessary, brought back into compliance with the proper standards.
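The distinction between discovery (finding points of presence that are not yet in the inventory) and monitoring (detecting changes to known ones) can be illustrated with a small sketch. Everything here is hypothetical: the function names, the property fields, and the `example.com` addresses are assumptions made for illustration and do not describe how any particular web presence management product works.

```python
import hashlib


def fingerprint(properties: dict) -> str:
    """Hash a property record so that any change to its fields is detectable."""
    canonical = "\n".join(f"{key}={properties[key]}" for key in sorted(properties))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare(inventory: dict, observed: dict):
    """Both arguments map a web address to a dict of properties (hypothetical schema)."""
    # Discovery: addresses seen on the web that are missing from the inventory.
    new = sorted(set(observed) - set(inventory))
    # Monitoring: known addresses whose properties differ from the recorded ones.
    changed = sorted(
        address
        for address in set(inventory) & set(observed)
        if fingerprint(inventory[address]) != fingerprint(observed[address])
    )
    return new, changed


inventory = {
    "https://example.com": {"logo": "v2", "owner": "marketing"},
    "https://social.example/acme": {"logo": "v2", "owner": "support"},
}
observed = {
    "https://example.com": {"logo": "v2", "owner": "marketing"},
    "https://social.example/acme": {"logo": "v1", "owner": "support"},
    "https://social.example/acme-deals": {"logo": "v2", "owner": "unknown"},
}

new, changed = compare(inventory, observed)
print("new points of presence:", new)
print("changed properties:", changed)
```

A set difference handles discovery and a fingerprint comparison handles monitoring; a real system would additionally need crawling, access control, and a review workflow, which are out of scope for this sketch.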
Biorobotics
Biorobotics is an interdisciplinary science that combines the fields of biomedical engineering, cybernetics, and robotics to develop new technologies that integrate biology with mechanical systems in order to achieve more efficient communication, alter genetic information, and create machines that imitate biological systems.

== Cybernetics ==
Cybernetics focuses on communication and systems in living organisms and machines, and it can be applied to and combined with multiple fields of study such as biology, mathematics, computer science, and engineering. This discipline falls under the branch of biorobotics because it studies biological bodies and mechanical systems together. Studying these two kinds of system allows for advanced analysis of the functions and processes of each system as well as the interactions between them.

=== History ===
Cybernetic theory is a concept that has existed for centuries, dating back to the era of Plato, who applied the term to refer to the "governance of people". The term cybernetique was used in the mid-1800s by physicist André-Marie Ampère. The term cybernetics was popularized in the late 1940s to refer to a discipline that touched on, but was separate from, established disciplines such as electrical engineering, mathematics, and biology.

=== Science ===
Cybernetics is often misunderstood because of the breadth of disciplines it covers. In the early 20th century, it was coined as an interdisciplinary field of study that combines biology, science, network theory, and engineering. Today, it covers all scientific fields with system-related processes. The goal of cybernetics is to analyze the processes of any system or systems in an attempt to make them more efficient and effective.
=== Applications ===
Cybernetics is used as an umbrella term, so its applications extend to all systems-related scientific fields such as biology, mathematics, computer science, engineering, management, psychology, sociology, and art. Cybernetics is used across several fields to discover principles of systems, adaptation of organisms, information analysis, and more.

== Genetic engineering ==
Genetic engineering is a field that uses advances in technology to modify biological organisms. Through different methods, scientists are able to alter the genetic material of microorganisms, plants, and animals to provide them with desirable traits. For example, plants can be modified to grow bigger, better, and faster. Genetic engineering is included in biorobotics because it uses new technologies to alter biology and change an organism's DNA for the organism's and society's benefit.

=== History ===
Although humans have modified the genetic material of animals and plants through artificial selection for millennia (such as the genetic mutations that developed teosinte into corn and wolves into dogs), genetic engineering refers to the deliberate alteration or insertion of specific genes in an organism's DNA. The first successful case of genetic engineering occurred in 1973, when Herbert Boyer and Stanley Cohen were able to transfer a gene with antibiotic resistance to a bacterium.

=== Science ===
There are three main techniques used in genetic engineering: the plasmid method, the vector method, and the biolistic method.

==== Plasmid method ====
This technique is used mainly for microorganisms such as bacteria. Through this method, DNA molecules called plasmids are extracted from bacteria and placed in a lab where restriction enzymes break them down. As the enzymes do this, some fragments develop a rough edge resembling a staircase, which is considered 'sticky' and capable of reconnecting.
These 'sticky' molecules are inserted into another bacterium, where they connect to the DNA rings with the altered genetic material.

==== Vector method ====
The vector method is considered a more precise technique than the plasmid method, as it involves the transfer of a specific gene instead of a whole sequence. In the vector method, a specific gene from a DNA strand is isolated through restriction enzymes in a laboratory and is inserted into a vector. Once the vector accepts the genetic code, it is inserted into the host cell, where the DNA will be transferred.

==== Biolistic method ====
The biolistic method is typically used to alter the genetic material of plants. This method embeds the desired DNA with a metallic particle, such as gold or tungsten, in a high-speed gun. The particle is then bombarded into the plant. Due to the high velocities and the vacuum generated during bombardment, the particle is able to penetrate the cell wall and insert the new DNA into the cell.

=== Applications ===
Genetic engineering has many uses in the fields of medicine, research, and agriculture. In the medical field, genetically modified bacteria are used to produce drugs such as insulin, human growth hormones, and vaccines. In research, scientists genetically modify organisms to observe physical and behavioral changes and so understand the function of specific genes. In agriculture, genetic engineering is extremely important, as it is used by farmers to grow crops, such as Bt corn, that are resistant to herbicides and to insects.

== Bionics ==
Bionics is a medical engineering field and a branch of biorobotics consisting of electrical and mechanical systems that imitate biological systems, such as prosthetics and hearing aids. The term is a portmanteau of biology and electronics.

=== History ===
The history of bionics goes as far back in time as ancient Egypt. A prosthetic toe made out of wood and leather was found on the foot of a mummy.
The mummy was estimated to date from around the fifteenth century B.C. Bionics can also be seen in ancient Greece and Rome, where prosthetic legs and arms were made for amputee soldiers. In the mid-16th century, a French military surgeon by the name of Ambroise Paré became a pioneer in the field of bionics. He was known for making various types of upper and lower prosthetics. One of his most famous prosthetics, Le Petit Lorrain, was a mechanical hand operated by catches and springs. During the early 19th century, Alessandro Volta further advanced bionics. His experiments set the foundation for the creation of hearing aids. He found that electrical stimulation could restore hearing by inserting an electrical implant to the saccular nerve of a patient's ear. In 1945, the National Academy of Sciences created the Artificial Limb Program, which focused on improving prosthetics because of the large number of World War II amputee soldiers. Since then, prosthetic materials, computer design methods, and surgical procedures have improved, creating modern-day bionics.

=== Science ===
==== Prosthetics ====
The important components that make up modern-day prosthetics are the pylon, the socket, and the suspension system. The pylon is the internal frame of the prosthetic and is made up of metal rods or carbon-fiber composites. The socket is the part that connects the prosthetic to the person's residual limb. The socket contains a soft liner that makes the fit comfortable but also snug enough to stay on the limb. The suspension system keeps the prosthetic on the limb. It is usually a harness system made up of straps, belts, or sleeves that keep the prosthetic attached. A prosthetic can be operated in various ways: it can be body-powered, externally powered, or myoelectrically powered.
Body-powered prosthetics consist of cables attached to a strap or harness, which is placed on the person's functional shoulder, allowing the person to manipulate and control the prosthetic as he or she sees fit. Externally powered prosthetics use motors to power the prosthetic and buttons and switches to control it. Myoelectrically powered prosthetics are new, advanced forms of prosthetics in which electrodes are placed on the muscles above the limb. The electrodes detect the muscle contractions and send electrical signals to the prosthetic to move it. The downside of this type of prosthetic is that if the sensors are not placed correctly on the limb, the electrical impulses will fail to move the prosthetic. TrueLimb is a specific brand of prosthetics that uses myoelectric sensors, which enable a person to control their bionic limb.

==== Hearing aids ====
Four major components make up the hearing aid: the microphone, the amplifier, the receiver, and the battery. The microphone takes in outside sound, turns that sound into electrical signals, and sends those signals to the amplifier. The amplifier increases the power of the signals and sends them to the receiver. The receiver changes the electrical signal back into sound and sends the sound into the ear. Hair cells in the ear sense the vibrations from the sound, convert the vibrations into nerve signals, and send them to the brain so
Growth function
The growth function, also called the shatter coefficient or the shattering number, measures the richness of a set family or class of functions. It is used especially in statistical learning theory, to study properties of statistical learning methods. The term 'growth function' was coined by Vapnik and Chervonenkis in their 1968 paper, where they also proved many of its properties. It is a basic concept in machine learning.

== Definitions ==
=== Set-family definition ===
Let $H$ be a set family (a set of sets) and $C$ a set. Their intersection is defined as the following set-family:

$$H\cap C:=\{h\cap C\mid h\in H\}$$

The intersection-size (also called the index) of $H$ with respect to $C$ is $|H\cap C|$. If a set $C_m$ has $m$ elements then the index is at most $2^m$. If the index is exactly $2^m$ then the set $C_m$ is said to be shattered by $H$, because $H\cap C_m$ contains all the subsets of $C_m$, i.e.:

$$|H\cap C_m|=2^{|C_m|}$$

The growth function measures the size of $H\cap C$ as a function of $|C|$. Formally:

$$\operatorname{Growth}(H,m):=\max_{C:|C|=m}|H\cap C|$$

=== Hypothesis-class definition ===
Equivalently, let $H$ be a hypothesis-class (a set of binary functions) and $C$ a set with $m$ elements.
The restriction of $H$ to $C$ is the set of binary functions on $C$ that can be derived from $H$:

$$H_C:=\{(h(x_1),\ldots,h(x_m))\mid h\in H,\ x_i\in C\}$$

The growth function measures the size of $H_C$ as a function of $|C|$:

$$\operatorname{Growth}(H,m):=\max_{C:|C|=m}|H_C|$$

== Examples ==
1. The domain is the real line $\mathbb{R}$. The set-family $H$ contains all the half-lines (rays) from a given number to positive infinity, i.e., all sets of the form $\{x>x_0\mid x\in\mathbb{R}\}$ for some $x_0\in\mathbb{R}$. For any set $C$ of $m$ real numbers, the intersection $H\cap C$ contains $m+1$ sets: the empty set, the set containing the largest element of $C$, the set containing the two largest elements of $C$, and so on. Therefore $\operatorname{Growth}(H,m)=m+1$. The same is true whether $H$ contains open half-lines, closed half-lines, or both.
2. The domain is the segment $[0,1]$. The set-family $H$ contains all the open sets. For any finite set $C$ of $m$ real numbers, the intersection $H\cap C$ contains all possible subsets of $C$. There are $2^m$ such subsets, so $\operatorname{Growth}(H,m)=2^m$.
3. The domain is the Euclidean space $\mathbb{R}^n$.
The set-family $H$ contains all the half-spaces of the form $x\cdot\phi\geq 1$, where $\phi$ is a fixed vector. Then $\operatorname{Growth}(H,m)=\operatorname{Comp}(n,m)$, where Comp is the number of components in a partitioning of an $n$-dimensional space by $m$ hyperplanes.
4. The domain is the real line $\mathbb{R}$. The set-family $H$ contains all the real intervals, i.e., all sets of the form $\{x\in[x_0,x_1]\mid x\in\mathbb{R}\}$ for some $x_0,x_1\in\mathbb{R}$. For any set $C$ of $m$ real numbers, the intersection $H\cap C$ contains all runs of between $0$ and $m$ consecutive elements of $C$. The number of such runs is $\binom{m+1}{2}+1$, so $\operatorname{Growth}(H,m)=\binom{m+1}{2}+1$.

== Polynomial or exponential ==
The main property that makes the growth function interesting is that it can be either polynomial or exponential, with nothing in between. The following is a property of the intersection-size: if, for some set $C_m$ of size $m$ and for some number $n\leq m$, $|H\cap C_m|\geq\operatorname{Comp}(n,m)$, then there exists a subset $C_n\subseteq C_m$ of size $n$ such that $|H\cap C_n|=2^n$. This implies the following property of the growth function. For every family $H$ there are two cases:
The exponential case: $\operatorname{Growth}(H,m)=2^m$ identically.
The polynomial case: $\operatorname{Growth}(H,m)$ is majorized by $\operatorname{Comp}(n,m)\leq m^n+1$, where $n$ is the smallest integer for which $\operatorname{Growth}(H,n)<2^n$.

== Other properties ==
=== Trivial upper bound ===
For any finite $H$:

$$\operatorname{Growth}(H,m)\leq|H|$$

since for every $C$, the number of elements in $H\cap C$ is at most $|H|$. Therefore, the growth function is mainly interesting when $H$ is infinite.

=== Exponential upper bound ===
For any nonempty $H$:

$$\operatorname{Growth}(H,m)\leq 2^m$$

That is, the growth function has an exponential upper bound. We say that a set-family $H$ shatters a set $C$ if their intersection contains all possible subsets of $C$, i.e. $H\cap C=2^C$. If $H$ shatters some $C$ of size $m$, then $\operatorname{Growth}(H,m)=2^m$, which is the upper bound.

=== Cartesian intersection ===
Define the Cartesian intersection of two set-families as:

$$H_1\bigotimes H_2:=\{h_1\cap h_2\mid h_1\in H_1,\ h_2\in H_2\}$$
Then:

$$\operatorname{Growth}(H_1\bigotimes H_2,m)\leq\operatorname{Growth}(H_1,m)\cdot\operatorname{Growth}(H_2,m)$$

=== Union ===
For every two set-families:

$$\operatorname{Growth}(H_1\cup H_2,m)\leq\operatorname{Growth}(H_1,m)+\operatorname{Growth}(H_2,m)$$

=== VC dimension ===
The VC dimension of $H$ is defined according to these two cases: In the polynomial case, $\operatorname{VCDim}(H)=n-1$, which is the largest integer $d$ for which $\operatorname{Growth}(H,d)=2^d$. In the exponential case, $\operatorname{VCDim}(H)=\infty$. So $\operatorname{VCDim}(H)\geq d$ if and only if $\operatorname{Growth}(H,d)=2^d$.

The growth function can be regarded as a refinement of the concept of VC dimension. The VC dimension only tells us whether $\operatorname{Growth}(H,d)$ is equal to or smaller than $2^d$, while the growth function tells us exactly how $\operatorname{Growth}(H,m)$ changes as a function of $m$.

Another connection between the growth function and the VC dimension is given by the Sauer–Shelah lemma: if $\operatorname{VCDim}(H)=d$, then for all $m$:

$$\operatorname{Growth}(H,m)\leq\sum_{i=0}^{d}\binom{m}{i}$$

In particular, for all $m>d+1$:

$$\operatorname{Growth}(H,m)\leq(em/d)^d=O(m^d)$$
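The set-family definition can be checked by brute force on small cases. The sketch below is only an illustration under a simplifying assumption: a finite ground set of six integers stands in for the real line, and the rays and intervals of Examples 1 and 4 are restricted to it. The helper names `growth` and `sauer_bound` are introduced here and are not taken from the literature.

```python
from itertools import combinations
from math import comb


def growth(family, domain, m):
    """Maximum of |{h ∩ C : h in family}| over all m-element subsets C of domain."""
    best = 0
    for points in combinations(domain, m):
        C = frozenset(points)
        traces = {frozenset(h) & C for h in family}  # distinct intersections with C
        best = max(best, len(traces))
    return best


def sauer_bound(m, d):
    """Sauer–Shelah upper bound: sum of binomial(m, i) for i = 0..d."""
    return sum(comb(m, i) for i in range(d + 1))


domain = range(6)  # finite stand-in for the real line (assumption)

# Example 1: rays {x > t}, restricted to the ground set.
rays = [frozenset(x for x in domain if x > t) for t in range(-1, 6)]

# Example 4: closed intervals [a, b], restricted to the ground set, plus the empty set.
intervals = [frozenset(range(a, b + 1)) for a in domain for b in domain if a <= b]
intervals.append(frozenset())

for m in range(1, 5):
    print(m, growth(rays, domain, m), growth(intervals, domain, m))
```

On this ground set the rays give $m+1$ and the intervals give $\binom{m+1}{2}+1$, matching the examples. Both families also meet the Sauer–Shelah bound with equality (with $d=1$ for rays and $d=2$ for intervals), which shows that the bound cannot be improved in general.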
Online machine learning
In computer science, online machine learning is a method of machine learning in which data becomes available in sequential order and is used to update the best predictor for future data at each step, as opposed to batch learning techniques, which generate the best predictor by learning on the entire training data set at once. Online learning is a common technique in areas of machine learning where it is computationally infeasible to train over the entire dataset, requiring out-of-core algorithms. It is also used in situations where the algorithm must dynamically adapt to new patterns in the data, or when the data itself is generated as a function of time, e.g., prediction of prices in the international financial markets. Online learning algorithms may be prone to catastrophic interference, a problem that can be addressed by incremental learning approaches. Online machine learning algorithms find applications in a wide variety of fields, such as sponsored search to maximize ad revenue, portfolio optimization, shortest path prediction (with stochastic weights, e.g. traffic on roads for a maps application), spam filtering, real-time fraud detection, and dynamic pricing for e-commerce. There is also growing interest in using online learning paradigms for LLMs to enable continuous, real-time adaptation after the initial training.

== Introduction ==
In the setting of supervised learning, a function $f:X\to Y$ is to be learned, where $X$ is thought of as a space of inputs and $Y$ as a space of outputs. The function should predict well on instances that are drawn from a joint probability distribution $p(x,y)$ on $X\times Y$. In reality, the learner never knows the true distribution $p(x,y)$ over instances.
Instead, the learner usually has access to a training set of examples $(x_1,y_1),\ldots,(x_n,y_n)$. In this setting, the loss function is given as $V:Y\times Y\to\mathbb{R}$, such that $V(f(x),y)$ measures the difference between the predicted value $f(x)$ and the true value $y$. The ideal goal is to select a function $f\in\mathcal{H}$, where $\mathcal{H}$ is a space of functions called a hypothesis space, so that some notion of total loss is minimized. Depending on the type of model (statistical or adversarial), one can devise different notions of loss, which lead to different learning algorithms.

== Statistical view of online learning ==
In statistical learning models, the training samples $(x_i,y_i)$ are assumed to have been drawn from the true distribution $p(x,y)$ and the objective is to minimize the expected "risk"

$$I[f]=\mathbb{E}[V(f(x),y)]=\int V(f(x),y)\,dp(x,y).$$

A common paradigm in this situation is to estimate a function $\hat{f}$ through empirical risk minimization or regularized empirical risk minimization (usually Tikhonov regularization). The choice of loss function here gives rise to several well-known learning algorithms such as regularized least squares and support vector machines. A purely online model in this category would learn based on just the new input $(x_{t+1},y_{t+1})$, the current best predictor $f_t$ and some extra stored information (which is usually expected to have storage requirements independent of training data size).
For many formulations, for example nonlinear kernel methods, true online learning is not possible, though a form of hybrid online learning with recursive algorithms can be used, where $f_{t+1}$ is permitted to depend on $f_t$ and all previous data points $(x_1,y_1),\ldots,(x_t,y_t)$. In this case, the space requirements are no longer guaranteed to be constant, since all previous data points must be stored, but the solution may take less time to compute with the addition of a new data point than with batch learning techniques. A common strategy to overcome the above issues is to learn using mini-batches, which process a small batch of $b\geq 1$ data points at a time; this can be considered pseudo-online learning for $b$ much smaller than the total number of training points. Mini-batch techniques are used with repeated passes over the training data to obtain optimized out-of-core versions of machine learning algorithms, for example, stochastic gradient descent. When combined with backpropagation, this is currently the de facto method for training artificial neural networks.

=== Example: linear least squares ===
The simple example of linear least squares is used to explain a variety of ideas in online learning. The ideas are general enough to be applied to other settings, for example, with other convex loss functions.

=== Batch learning ===
Consider the setting of supervised learning with $f$ being a linear function to be learned:

$$f(x_j)=\langle w,x_j\rangle=w\cdot x_j$$

where $x_j\in\mathbb{R}^d$ is a vector of inputs (data points) and $w\in\mathbb{R}^d$ is a linear filter vector. The goal is to compute the filter vector $w$.
To this end, a square loss function
$$V(f(x_j), y_j) = (f(x_j) - y_j)^2 = (\langle w, x_j \rangle - y_j)^2$$
is used to compute the vector $w$ that minimizes the empirical loss
$$I_n[w] = \sum_{j=1}^{n} V(\langle w, x_j \rangle, y_j) = \sum_{j=1}^{n} (x_j^{\mathsf{T}} w - y_j)^2$$
where $y_j \in \mathbb{R}$. Let $X$ be the $i \times d$ data matrix and let $y \in \mathbb{R}^i$ be the column vector of target values after the arrival of the first $i$ data points. Assuming that the covariance matrix $\Sigma_i = X^{\mathsf{T}} X$ is invertible (otherwise it is preferable to proceed in a similar fashion with Tikhonov regularization), the best solution $f^*(x) = \langle w^*, x \rangle$ to the linear least squares problem is given by
$$w^* = (X^{\mathsf{T}} X)^{-1} X^{\mathsf{T}} y = \Sigma_i^{-1} \sum_{j=1}^{i} x_j y_j.$$
Now, calculating the covariance matrix $\Sigma_i = \sum_{j=1}^{i} x_j x_j^{\mathsf{T}}$ takes time $O(id^2)$, inverting the $d \times d$ matrix takes time $O(d^3)$, while the rest of the multiplication takes time $O(d^2)$, giving a total time of $O(id^2 + d^3)$. When there are $n$ total points in the dataset, recomputing the solution after the arrival of every data point $i = 1, \ldots, n$ with the naive approach has a total complexity of $O(n^2 d^2 + n d^3)$.
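As a rough illustration of the batch computation above, the following Python sketch accumulates $\Sigma_i$ and $\sum_j x_j y_j$ and then solves the normal equations. The function names `batch_least_squares` and `solve_linear_system` are hypothetical, and the sketch solves the linear system by Gaussian elimination instead of forming the inverse explicitly, which yields the same $w^*$ whenever $\Sigma_i$ is invertible.

```python
def solve_linear_system(a, b):
    """Solve a w = b by Gaussian elimination with partial pivoting."""
    d = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [b_k] for row, b_k in zip(a, b)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; consider Tikhonov regularization")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, d):
            factor = m[r][col] / m[col][col]
            for c in range(col, d + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    w = [0.0] * d
    for r in range(d - 1, -1, -1):
        s = sum(m[r][c] * w[c] for c in range(r + 1, d))
        w[r] = (m[r][d] - s) / m[r][r]
    return w


def batch_least_squares(xs, ys):
    """Return w* solving (sum_j x_j x_j^T) w = sum_j x_j y_j."""
    d = len(xs[0])
    sigma = [[0.0] * d for _ in range(d)]  # covariance matrix, O(i d^2) to build
    b = [0.0] * d                          # sum_j x_j y_j
    for x, y in zip(xs, ys):
        for r in range(d):
            b[r] += x[r] * y
            for c in range(d):
                sigma[r][c] += x[r] * x[c]
    return solve_linear_system(sigma, b)   # O(d^3)


w_star = batch_least_squares([(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)], [1.0, 2.0, 3.0])
```

Calling `batch_least_squares` again from scratch after each new data point corresponds to the naive recomputation whose cost is discussed above.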
Note that if the matrix $\Sigma_i$ is stored, updating it at each step requires only adding $x_{i+1} x_{i+1}^{\mathsf{T}}$, which takes $O(d^2)$ time, reducing the total time to $O(nd^2 + nd^3) = O(nd^3)$, but with an additional storage space of $O(d^2)$ to store $\Sigma_i$.

=== Online learning: recursive least squares ===
The recursive least squares (RLS) algorithm considers an online approach to the least squares problem. It can be shown that by initialising $w_0 = 0 \in \mathbb{R}^d$ and $\Gamma_0 = I \in \mathbb{R}^{d \times d}$, the solution of the linear least squares problem given in the previous section can be computed by the following iteration:
$$\Gamma_i = \Gamma_{i-1} - \frac{\Gamma_{i-1} x_i x_i^{\mathsf{T}} \Gamma_{i-1}}{1 + x_i^{\mathsf{T}} \Gamma_{i-1} x_i}$$
$$w_i = w_{i-1} - \Gamma_i x_i \left( x_i^{\mathsf{T}} w_{i-1} - y_i \right)$$
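The iteration above can be sketched in a few lines of plain Python. The function names `rls_update` and `recursive_least_squares` are hypothetical. One caveat is stated as an observation, not as part of the source: by the Sherman–Morrison identity, starting from $\Gamma_0 = I$ gives $\Gamma_i = (I + \Sigma_i)^{-1}$, so the iterate produced by this sketch coincides with the Tikhonov-regularized solution $(X^{\mathsf{T}} X + I)^{-1} X^{\mathsf{T}} y$.

```python
def rls_update(gamma, w, x, y):
    """One recursive least squares step; returns the new (gamma, w)."""
    d = len(x)
    # gx = Gamma_{i-1} x_i; Gamma stays symmetric, so x_i^T Gamma_{i-1} = gx^T.
    gx = [sum(gamma[r][c] * x[c] for c in range(d)) for r in range(d)]
    denom = 1.0 + sum(x[r] * gx[r] for r in range(d))
    new_gamma = [[gamma[r][c] - gx[r] * gx[c] / denom for c in range(d)]
                 for r in range(d)]
    # Prediction error of the previous weights on the new point.
    residual = sum(x[r] * w[r] for r in range(d)) - y
    ngx = [sum(new_gamma[r][c] * x[c] for c in range(d)) for r in range(d)]
    new_w = [w[r] - ngx[r] * residual for r in range(d)]
    return new_gamma, new_w


def recursive_least_squares(xs, ys):
    """Run the iteration from w_0 = 0 and Gamma_0 = I over all points."""
    d = len(xs[0])
    gamma = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]
    w = [0.0] * d
    for x, y in zip(xs, ys):
        gamma, w = rls_update(gamma, w, x, y)  # O(d^2) work per data point
    return w


w = recursive_least_squares([(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
                            [2.0, 4.0, 11.0])
```

Each update costs $O(d^2)$ time and the stored state is only $\Gamma_i$ and $w_i$, so no past data points need to be kept. For the three points in the usage line, the result agrees with $(X^{\mathsf{T}} X + I)^{-1} X^{\mathsf{T}} y = (3, 4)$.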
Targeted maximum likelihood estimation
Targeted Maximum Likelihood Estimation (TMLE), also more accurately referred to as Targeted Minimum Loss-Based Estimation, is a general statistical estimation framework for causal inference and semiparametric models. TMLE combines ideas from maximum likelihood estimation, semiparametric efficiency theory, and machine learning. It was introduced by Mark J. van der Laan and colleagues in the mid-2000s as a method that yields asymptotically efficient plug-in estimators while allowing the use of flexible, data-adaptive algorithms such as ensemble machine learning for nuisance parameter estimation. TMLE is used in epidemiology, biostatistics, and the social sciences to estimate causal effects in observational and experimental studies. Applications of TMLE include Longitudinal TMLE (LTMLE) for time-varying treatments and confounders. Variations in how the targeting step is carried out have resulted in versions of TMLE such as Collaborative TMLE (CTMLE) and Adaptive TMLE, which aim at improved finite-sample performance and automated variable selection.

== History ==
The TMLE framework was first described by van der Laan and Rubin (2006) as a general approach for the construction of efficient plug-in estimators of smooth features of the data density. It was demonstrated in the context of causal inference and missing data problems. It was developed to address limitations of traditional doubly robust methods, such as Augmented Inverse Probability Weighting (AIPW), by respecting the plug-in principle: the target parameter is treated as a function of the data density, which is itself an element of the statistical model. TMLE estimates the data density, or the relevant parts of it, with machine learning and targets these machine learning fits before they are plugged into the target parameter mapping. In this manner, a TMLE always respects global knowledge and satisfies known bounds, for example that the target parameter is a probability.
Since its introduction, TMLE has been developed in a series of theoretical and applied papers, culminating in book-length treatments of the method and its applications to survival analysis, adaptive designs, and longitudinal data.

== Methodology ==
At its core, TMLE is a two-step estimation procedure:
* Initial estimation: Machine learning methods (such as the Super Learner ensemble) are used to obtain flexible estimates of nuisance parameters, such as outcome regressions and propensity scores.
* Targeting step: The initial estimate is updated by solving a score equation (based on the efficient influence function) so that the final estimator is consistent, asymptotically normal, and efficient under mild regularity conditions.
The targeted machine learning fit is then mapped into the corresponding estimator of the target parameter by plugging it into the target parameter mapping. This approach balances the bias–variance trade-off by combining data-adaptive estimation with semiparametric efficiency theory. TMLE is doubly robust, meaning it remains consistent if either the outcome model or the treatment model is consistently estimated.

=== Formula ===
This section describes the TMLE of the average treatment effect of a binary treatment on an outcome, adjusting for baseline covariates. Consider i.i.d. observations $O_i = (W_i, A_i, Y_i)$ from a distribution $P_0$, where $W$ are baseline covariates, $A$ is a binary treatment, and $Y$ is an outcome. Let $\bar{Q}(a, w) = \mathbb{E}[Y \mid A = a, W = w]$ represent the outcome model and $g(a \mid w) = P(A = a \mid W = w)$ represent the propensity score. The average treatment effect (ATE) is given by
$$\psi_0 = \mathbb{E}\{\bar{Q}(1, W) - \bar{Q}(0, W)\}.$$
A basic TMLE for the ATE proceeds as follows:

Step 1: Estimate initial models. Obtain estimates $\hat{\bar{Q}}(a, w)$ and $\hat{g}(a \mid w)$, often using flexible methods such as Super Learner.

Step 2: Compute the clever covariate. Define:
$$H(A, W) = \frac{A}{\hat{g}(1 \mid W)} - \frac{1 - A}{\hat{g}(0 \mid W)}.$$

Step 3: Estimate the fluctuation parameter. Fit a logistic regression of $Y$ on $H(A, W)$ with $\operatorname{logit}(\hat{\bar{Q}}(A, W))$ as offset. This yields $\hat{\varepsilon}$, the MLE that solves the score equation:
$$\frac{1}{n} \sum_{i=1}^{n} H(A_i, W_i) \big\{ Y_i - \hat{\bar{Q}}^{\varepsilon}(A_i, W_i) \big\} = 0.$$

Step 4: Update the initial estimate. Apply the "blip" to obtain the targeted estimate:
$$\hat{\bar{Q}}^{*}(A, W) = \operatorname{expit}\Big( \operatorname{logit}\big( \hat{\bar{Q}}(A, W) \big) + \hat{\varepsilon}\, H(A, W) \Big).$$

Step 5: Compute the TMLE. The ATE estimate is:
$$\hat{\psi}_{\text{TMLE}} = \frac{1}{n} \sum_{i=1}^{n} \big[ \hat{\bar{Q}}^{*}(1, W_i) - \hat{\bar{Q}}^{*}(0, W_i) \big].$$

Inference. The efficient influence function (EIF) for the ATE is:
$$D^{*}(O) = H(A, W)\{Y - \bar{Q}^{*}(A, W)\} + \bar{Q}^{*}(1, W) - \bar{Q}^{*}(0, W) - \psi.$$
The variance is estimated by $\hat{\sigma}^2 = n^{-1} \sum_{i=1}^{n} \big( D^{*}(O_i) \big)^2$, yielding Wald-type confidence intervals $\hat{\psi}_{\text{TMLE}} \pm z_{1-\alpha/2}\, \hat{\sigma} / \sqrt{n}$.

Remark. For continuous outcomes, a linear fluctuation $\hat{\bar{Q}}^{*} = \hat{\bar{Q}} + \hat{\varepsilon}\, H$ may be used instead. For bounded continuous outcomes, the logistic fluctuation (after rescaling $Y$ to $[0, 1]$) is often preferred for improved finite-sample performance.

== Applications ==
TMLE has been applied in:
* Epidemiology: Estimating causal effects of exposures and interventions in observational cohort studies.
* Clinical trials and real-world evidence: The Targeted Learning roadmap provides a structured framework for generating and validating real-world evidence (RWE), bridging randomized trials and observational data using TMLE and related estimation techniques. This approach enables transparency, sensitivity analysis, and stronger causal inference for regulatory and clinical trial contexts.
* High-dimensional settings: Integration with ensemble methods for causal effect estimation. TMLE has been applied in pharmacoepidemiology, where a large number of covariates are automatically selected to adjust for confounding. In a study of post–myocardial infarction statin use and 1-year mortality, TMLE demonstrated robust performance relative to inverse probability weighting in scenarios with hundreds of potential confounders.

== Derivatives and extensions ==
* Longitudinal TMLE (LTMLE): A methodological extension of TMLE for longitudinal data with time-varying treatments, confounders, and censoring. It allows the estimation of dynamic treatment regimes and intervention-specific causal effects over time. This framework was originally introduced by van der Laan & Gruber (2012).
* Collaborative TMLE (CTMLE): Enhances finite-sample performance and variable selection by collaboratively fitting the treatment mechanism in conjunction with the target parameter.

== Software ==
Several R packages implement TMLE and related methods:
* tmle: Functions for binary, categorical, and continuous outcomes.
* ltmle: Implementation for longitudinal data with time-varying treatments and outcomes.
* ctmle: Algorithms for collaborative TMLE and adaptive variable selection.
* SuperLearner: A theoretically grounded, cross-validated ensemble learning method that combines predictions from multiple algorithms to minimize predictive risk. It is widely used in TMLE for estimating nuisance parameters. The original implementation is available as the R package SuperLearner. Recent machine learning platforms like H2O AutoML implement similar ensemble strategies, combining diverse learners in parallel and leveraging stacking and blending techniques, effectively functioning as a large-scale Super Learner.
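To make Steps 2 to 5 of the Formula section concrete, here is a minimal Python sketch for a binary outcome. It is not the implementation used by any of the R packages above. The function name `tmle_ate` and the callables `q_init` (an initial outcome estimate with values strictly between 0 and 1) and `g1` (an estimate of the probability of treatment given covariates, also strictly between 0 and 1) are hypothetical and stand in for the Step 1 fits. The one-parameter logistic regression with offset is solved here with a simple Newton iteration, which is an assumption of this sketch.

```python
import math


def expit(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(p):
    return math.log(p / (1.0 - p))


def tmle_ate(W, A, Y, q_init, g1):
    """Return (psi, eps, se) for the ATE; q_init and g1 are hypothetical inputs."""
    n = len(Y)

    # Step 2: clever covariate H(a, w).
    def H(a, w):
        return a / g1(w) - (1 - a) / (1.0 - g1(w))

    h = [H(a, w) for a, w in zip(A, W)]
    offset = [logit(q_init(a, w)) for a, w in zip(A, W)]

    # Step 3: Newton iteration for the fluctuation parameter eps, solving
    # sum_i H_i * (Y_i - expit(offset_i + eps * H_i)) = 0.
    eps = 0.0
    for _ in range(100):
        p = [expit(o + eps * hi) for o, hi in zip(offset, h)]
        score = sum(hi * (yi - pi) for hi, yi, pi in zip(h, Y, p))
        info = sum(hi * hi * pi * (1.0 - pi) for hi, pi in zip(h, p))
        if info == 0.0:
            break
        step = score / info
        eps += step
        if abs(step) < 1e-12:
            break

    # Step 4: targeted outcome estimate.
    def q_star(a, w):
        return expit(logit(q_init(a, w)) + eps * H(a, w))

    # Step 5: plug-in estimate of the ATE.
    psi = sum(q_star(1, w) - q_star(0, w) for w in W) / n

    # Inference: influence-function-based standard error.
    d = [H(a, w) * (y - q_star(a, w)) + q_star(1, w) - q_star(0, w) - psi
         for w, a, y in zip(W, A, Y)]
    se = math.sqrt(sum(di * di for di in d) / n / n)
    return psi, eps, se


W = [0, 0, 0, 0, 1, 1, 1, 1]
A = [0, 0, 1, 1, 0, 0, 1, 1]
Y = [0, 1, 1, 1, 0, 0, 1, 0]
psi, eps, se = tmle_ate(W, A, Y, lambda a, w: 0.5, lambda w: 0.5)
```

In the toy data, the deliberately uninformative initial fit `q_init = 0.5` is corrected by the targeting step, and the resulting estimate equals the difference in observed outcome means between treated and untreated units (0.75 − 0.25 = 0.5), as expected when the treatment probability is 0.5 for everyone. A Wald-type interval is then `psi ± z * se`.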
EXAPT
EXAPT (a portmanteau of "Extended Subset of APT") is a production-oriented programming language that allows users to generate NC programs with control information for machine tools and that supports decision-making on production-related issues that may arise during machining processes. EXAPT was first developed to address industrial requirements. Over the years, the company created additional software for the manufacturing industry. Today, EXAPT offers a suite of SaaS products and services for the manufacturing industry. The trade name EXAPT is most commonly associated with the CAD/CAM system, production data, and tool management software of the German company EXAPT Systemtechnik GmbH, based in Aachen, Germany.

== General ==
EXAPT is a modular programming system for all NC machining operations, such as:
* Drilling
* Turning
* Milling
* Turn-milling
* Nibbling
* Flame, laser, plasma, and water jet cutting
* Wire eroding
* Operations with industrial robots
Due to the modular structure, the main product groups, EXAPTcam and EXAPTpdo, can be expanded gradually, and the software can be used on its own or combined with an existing IT environment.

== Functionality ==
EXAPTcam meets the requirements for NC planning, especially for cutting operations such as turning, drilling, and milling, up to 5-axis simultaneous machining. New process technologies, tools, and machine concepts are continually incorporated. NC programming can draw on data from different sources, such as 3D CAD models, drawings, or tables. NC programming ranges from language-oriented to feature-oriented programming. The integrated EXAPT knowledge database and intelligent, scalable automation functions support the user. EXAPT NC planning also covers the generation of production information such as clamping and tool plans, presetting data, and time calculations.
Realistic simulation of NC planning and NC control data provides production reliability. EXAPTpdo (EXAPT ProductionsDataOrganization) provides a neutral technology platform for the flow of information from NC planning to the shop floor. This applies to all NC production data that are necessary for the set-up of NC machines and for the provision, presetting, and stocking of manufacturing resources; EXAPTpdo provides these data in a central database. Besides the classical functions of a tool management system (TMS), such as the management of cutting tools and of measuring, testing, and clamping devices, technology data management and tool lifecycle management (TLM) are also included. System-supported "where-used lists" help to handle the manufacturing resource cycle through reliable requirement determination and requirement fulfillment. Unnecessary transports and unplanned scheduling adjustments are eliminated, stocks and set-up times are reduced, and throughput is increased. EXAPTpdo synchronizes the systems involved in the value chain. Stock systems, MES systems, and ERP systems (e.g. from the purchasing or production areas) do not work in isolation but interact with each other. EXAPTpdo provides the basis for the Smart Factory, with more flexibility in production and faster communication.

== History ==
1967: With the foundation of the EXAPT-Verein as a spin-off of the universities of Aachen, Berlin, and Stuttgart, work focused on EXAPT (EXtended Subset of APT) as a further development of the programming language APT (Automatically Programmed Tool), setting the first milestone in EXAPT's history. In the same year, the system EXAPT 1 for drilling and simple milling tasks became available.
1969: The industrial application of EXAPT 2 for the programming of NC machines with 2-axis linear and path control begins. In the following year, the development of the EXAPT modular system starts.
1972: BASIC-EXAPT is provided for the universal, homogeneous programming of all NC tasks. Support is provided by the EXAPT applications consultancy.
1973: EXAPT 1.1 is provided for the programming of straight-cut and continuous-path controlled drilling and milling machines and machining centers. At the Hanover Fair (IHA 73), interactive access to a mainframe via a time-sharing terminal for part program entry and correction is presented, which starts the replacement of the punch card.
1974: The possibilities for the use of process computers for NC data transfer are leveled out. EXAPT offers result simulation using plotters, displaying tool paths and tools in relation to the workpiece.
1975: In April, EXAPT NC Systemtechnik GmbH was founded with the aim of enabling small and medium-sized companies to enter NC technology through a complete product and service program. In the following year, the system portfolio is extended with further system modules, service programs, and postprocessors.
1978: The development activities on the EXAPT modular system, started in 1970, are completed. Using modern software techniques, the different system parts BASIC-EXAPT, EXAPT 1, EXAPT 1.1, and EXAPT 2 are combined into one overall system. System support and applications consultancy become a new working focus.
Early to mid-1980s: Besides new portable software modules for CAD/CAM applications (e.g. CAPEX, NESTEX, CADEX, CADCPL), the first version of the EXAPT DNC system and extensions of the EXAPT NC programming system for the machining of sculptured surfaces are presented.
1988: EXAPT expands its software product range with systems for tool data management (BMO) and production data management (FDO). EXAPT trains more than 1,300 course participants, including those in company-specific courses.
1992: The first version of the completely new product generation EXAPTplus is presented, and the agency in Dresden is opened.
1993: The company name "EXAPT NC Systemtechnik GmbH" is changed to "EXAPT Systemtechnik GmbH". EXAPTplus is presented on PC under Windows NT at EMO '93. The decentralized use of EXAPT systems expands the range of applications. In the following year, EXAPT-DNC becomes executable under Windows on a standard PC. Special hardware is not needed, so it can be used in combination with the database-supported EXAPT production data management system (FDO).
1995: EXAPTplus is also ready for complex applications such as the machining of tubes at extrusion tools. EXAPT-CADI provides the transfer of 2D CAD data to EXAPTplus. Marketing is strengthened with the new office in Gießen. In the following year, the EXAPT NC editor is developed for the direct processing of NC control data, with tool path display and visualization of the tools.
1997: As more convenient 3D CAD systems for the solid modelling of components enter the market, a detailed evaluation of current systems is made. It is decided to use SolidWorks as a reference system for solid-oriented NC planning with EXAPT.
1998: The first solution for the transfer of geometry data between SolidWorks and EXAPTplus is created. The EXAPT organization systems are (beside SQL) now also executable under Oracle. The use of client-server solutions supports the data flow in production.
1999: AFR functions are provided in connection with EXAPTsolid to support workpiece modelling for NC. Millennium capability is ensured for all EXAPT systems. AFR is groundbreaking for the integration of third-party products.
2002: EXAPT-BMG is developed for the generation and visualization of tools, with additional functions for their assembly from components. The acquisition of tools with their geometric and technological representation offers extensive support for NC planning with EXAPT systems.
2003: EXAPTpdo becomes available to optimize the process chains in production planning and production execution in response to the increasing requirements of changing production conditions.
2004: Various system extensions are made in EXAPTplus, EXAPTsolid, the EXAPT NC editor, and EXAPTpdo: for complete machining on turning/milling centres, with reliable results thanks to more extensive simulation based on realNC (Tecnomatix); for the use of new complex tool systems; and for combined use between ERP systems such as SAP and intelligent CNC systems. In the following year, EXAPTpdo is extended for cross-order set-up optimization and the provision of manufacturing resources, especially for single-part and small-series production, with a connection to purchasing and physical portfolio management.
2006: The EXAPT systems are available for extended use as an information platform for production, time management, and similar requirements. EXAPTsolid is extended for feature-oriented milling and machine simulation. The NC programming of complex machine tools, e.g. three-turret turning/milling centers, is supported by EXAPT systems, as is the use of multi-functional tools.
2007: A module for 3- to 5-axis simultaneous milling is presented.
PVLV
The primary value learned value (PVLV) model is a possible explanation for the reward-predictive firing properties of dopamine (DA) neurons. It simulates behavioral and neural data on Pavlovian conditioning and the midbrain dopaminergic neurons that fire in proportion to unexpected rewards. It is an alternative to the temporal-differences (TD) algorithm. It is used as part of Leabra.