Client-side encryption

Client-side encryption

Client-side encryption is the cryptographic technique of encrypting data on the sender's side, before it is transmitted to a server such as a cloud storage service. Client-side encryption features an encryption key that is not available to the service provider, making it difficult or impossible for service providers to decrypt hosted data. Client-side encryption allows for the creation of applications whose providers cannot access the data its users have stored, thus offering a high level of privacy. Applications utilizing client-side encryption are sometimes marketed under the misleading or incorrect term "zero-knowledge", but this is a misnomer, as the term zero-knowledge describes something entirely different in the context of cryptography. == Details == Client-side encryption seeks to eliminate the potential for data to be viewed by service providers (or third parties that compel service providers to deliver access to data), client-side encryption ensures that data and files that are stored in the cloud can only be viewed on the client-side of the exchange. This prevents data loss and the unauthorized disclosure of private or personal files, providing increased peace of mind for its users. Current recommendations by industry professionals as well as academic scholars offer great vocal support for developers to include client-side encryption to protect the confidentiality and integrity of information. === Examples of services that use client-side encryption by default === Tresorit MEGA Cryptee Cryptomator === Examples of services that optionally support client-side encryption === Apple iCloud offers optional client-side encryption when "Advanced Data Protection for iCloud" is enabled. Google Drive, Google Docs, Google Meet, Google Calendar, and Gmail — However, as of Jul 2024, optional client-side encryption features are only available to paid users. === Examples of services that do not support client-side encryption === Dropbox === Examples of client-side encrypted services that no longer exist === SpiderOak Backup

Machine vision

Machine vision is the technology and methods used to provide imaging-based automatic inspection and analysis for such applications as automatic inspection, process control, and robot guidance, usually in industry. Machine vision refers to many technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of computer science. It attempts to integrate existing technologies in new ways and apply them to solve real world problems. The term is the prevalent one for these functions in industrial automation environments but is also used for these functions in other environment vehicle guidance. The overall machine vision process includes planning the details of the requirements and project, and then creating a solution. During run-time, the process starts with imaging, followed by automated analysis of the image and extraction of the required information. == Definition == Definitions of the term "Machine vision" vary, but all include the technology and methods used to extract information from an image on an automated basis, as opposed to image processing, where the output is another image. The information extracted can be a simple good-part/bad-part signal, or more a complex set of data such as the identity, position and orientation of each object in an image. The information can be used for such applications as automatic inspection and robot and process guidance in industry, for security monitoring and vehicle guidance. This field encompasses a large number of technologies, software and hardware products, integrated systems, actions, methods and expertise. Machine vision is practically the only term used for these functions in industrial automation applications; the term is less universal for these functions in other environments such as security and vehicle guidance. Machine vision as a systems engineering discipline can be considered distinct from computer vision, a form of basic computer science; machine vision attempts to integrate existing technologies in new ways and apply them to solve real world problems in a way that meets the requirements of industrial automation and similar application areas. The term is also used in a broader sense by trade shows and trade groups such as the Automated Imaging Association and the European Machine Vision Association. This broader definition also encompasses products and applications most often associated with image processing. The primary uses for machine vision are automatic inspection and industrial robot/process guidance. In more recent times the terms computer vision and machine vision have converged to a greater degree. See glossary of machine vision. == Imaging based automatic inspection and sorting == The primary uses for machine vision are imaging-based automatic inspection and sorting and robot guidance.; in this section the former is abbreviated as "automatic inspection". The overall process includes planning the details of the requirements and project, and then creating a solution. This section describes the technical process that occurs during the operation of the solution. === Methods and sequence of operation === The first step in the automatic inspection sequence of operation is acquisition of an image, typically using cameras, lenses, and lighting that has been designed to provide the differentiation required by subsequent processing. MV software packages and programs developed in them then employ various digital image processing techniques to extract the required information, and often make decisions (such as pass/fail) based on the extracted information. === Equipment === The components of an automatic inspection system usually include lighting, a camera or other imager, a processor, software, and output devices. === Imaging === The imaging device (e.g. camera) can either be separate from the main image processing unit or combined with it in which case the combination is generally called a smart camera or smart sensor. Inclusion of the full processing function into the same enclosure as the camera is often referred to as embedded processing. When separated, the connection may be made to specialized intermediate hardware, a custom processing appliance, or a frame grabber within a computer using either an analog or standardized digital interface (Camera Link, CoaXPress). MV implementations also use digital cameras capable of direct connections (without a framegrabber) to a computer via FireWire, USB or Gigabit Ethernet interfaces. While conventional (2D visible light) imaging is most commonly used in MV, alternatives include multispectral imaging, hyperspectral imaging, imaging various infrared bands, line scan imaging, 3D imaging of surfaces and X-ray imaging. Key differentiations within MV 2D visible light imaging are monochromatic vs. color, frame rate, resolution, and whether or not the imaging process is simultaneous over the entire image, making it suitable for moving processes. Though the vast majority of machine vision applications are solved using two-dimensional imaging, machine vision applications utilizing 3D imaging are a growing niche within the industry. The most commonly used method for 3D imaging is scanning based triangulation which utilizes motion of the product or image during the imaging process. A laser is projected onto the surfaces of an object. In machine vision this is accomplished with a scanning motion, either by moving the workpiece, or by moving the camera & laser imaging system. The line is viewed by a camera from a different angle; the deviation of the line represents shape variations. Lines from multiple scans are assembled into a depth map or point cloud. Stereoscopic vision is used in special cases involving unique features present in both views of a pair of cameras. Other 3D methods used for machine vision are time of flight and grid based. One method is grid array based systems using pseudorandom structured light system as employed by the Microsoft Kinect system circa 2012. === Image processing === After an image is acquired, it is processed. Central processing functions are generally done by a CPU, a GPU, a FPGA or a combination of these. Deep learning training and inference impose higher processing performance requirements. Multiple stages of processing are generally used in a sequence that ends up as a desired result. A typical sequence might start with tools such as filters which modify the image, followed by extraction of objects, then extraction (e.g. measurements, reading of codes) of data from those objects, followed by communicating that data, or comparing it against target values to create and communicate "pass/fail" results. Machine vision image processing methods include; Stitching/Registration: Combining of adjacent 2D or 3D images. Filtering (e.g. morphological filtering) Thresholding: Thresholding starts with setting or determining a gray value that will be useful for the following steps. The value is then used to separate portions of the image, and sometimes to transform each portion of the image to simply black and white based on whether it is below or above that grayscale value. Pixel counting: counts the number of light or dark pixels Segmentation: Partitioning a digital image into multiple segments to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Edge detection: finding object edges Color Analysis: Identify parts, products and items using color, assess quality from color, and isolate features using color. Blob detection and extraction: inspecting an image for discrete blobs of connected pixels (e.g. a black hole in a grey object) as image landmarks. Neural network / deep learning / machine learning processing: weighted and self-training multi-variable decision making Circa 2019 there is a large expansion of this, using deep learning and machine learning to significantly expand machine vision capabilities. The most common result of such processing is classification. Examples of classification are object identification,"pass fail" classification of identified objects and OCR. Pattern recognition including template matching. Finding, matching, and/or counting specific patterns. This may include location of an object that may be rotated, partially hidden by another object, or varying in size. Barcode, Data Matrix and "2D barcode" reading Optical character recognition: automated reading of text such as serial numbers Gauging/Metrology: measurement of object dimensions (e.g. in pixels, inches or millimeters) Comparison against target values to determine a "pass or fail" or "go/no go" result. For example, with code or bar code verification, the read value is compared to the stored target value. For gauging, a measurement is compared against the proper value and tolerances. For verification of alpha-numberic codes, the

Progress in artificial intelligence

Progress in artificial intelligence (AI) refers to the advances, milestones, and breakthroughs that have been achieved in the field of artificial intelligence over time. AI is a branch of computer science that aims to create machines and systems capable of performing tasks that typically require human intelligence. AI applications have been used in a wide range of fields including medical diagnosis, finance, robotics, law, video games, agriculture, and scientific discovery. The society as a whole is looking for artificial intelligence to be on a key factor in the upcming years because of its potential. However, many AI applications are not perceived as AI: "A lot of cutting-edge AI has filtered into general applications, often without being called AI because once something becomes useful enough and common enough it's not labeled AI anymore." "Many thousands of AI applications are deeply embedded in the infrastructure of every industry." In the late 1990s and early 2000s, AI technology became widely used as elements of larger systems, but the field was rarely credited for these successes at the time. Kaplan and Haenlein structure artificial intelligence along three evolutionary stages: Artificial narrow intelligence – AI capable only of specific tasks; Artificial general intelligence – AI with ability in several areas, and able to autonomously solve problems they were never even designed for; Artificial superintelligence – AI capable of general tasks, including scientific creativity, social skills, and general wisdom. To allow comparison with human performance, artificial intelligence can be evaluated on constrained and well-defined problems. Such tests have been termed subject-matter expert Turing tests. Also, smaller problems provide more achievable goals and there are an ever-increasing number of positive results. In 2023, humans still substantially outperformed both GPT-4 and other models tested on the ConceptARC benchmark. Those models scored 60% on most, and 77% on one category, while humans scored 91% on all and 97% on one category. However, later research in 2025 showed that human-generated output grids were only accurate 73% of the time, while AI models available that year managed to score above 77%. == History == Increasing, promoting or constraining AI progress has often be done via controlling or increasing the amount of compute. == Current performance in specific areas == There are many useful abilities that can be described as showing some form of intelligence. This gives better insight into the comparative success of artificial intelligence in different areas. AI, like electricity or the steam engine, is a general-purpose technology. There is no consensus on how to characterize which tasks AI tends to excel at. Some versions of Moravec's paradox observe that humans are more likely to outperform machines in areas such as physical dexterity that have been the direct target of natural selection. While projects such as AlphaZero have succeeded in generating their own knowledge from scratch, many other machine learning projects require large training datasets. Researcher Andrew Ng has suggested, as a "highly imperfect rule of thumb", that "almost anything a typical human can do with less than one second of mental thought, we can probably now or in the near future automate using AI." Games provide a high-profile benchmark for assessing rates of progress; many games have a large professional player base and a well-established competitive rating system. AlphaGo brought the era of classical board-game benchmarks to a close when Artificial Intelligence proved their competitive edge over humans in 2016. Deep Mind's AlphaGo AI software program defeated the world's best professional Go Player Lee Sedol. Games of imperfect knowledge provide new challenges to AI in the area of game theory; the most prominent milestone in this area was brought to a close by Libratus' poker victory in 2017. E-sports continue to provide additional benchmarks; Facebook AI, Deepmind, and others have engaged with the popular StarCraft franchise of videogames. Broad classes of outcome for an AI test may be given as: optimal: it is not possible to perform better (note: some of these entries were solved by humans) super-human: performs better than all humans high-human: performs better than most humans par-human: performs similarly to most humans sub-human: performs worse than most humans === Optimal === Tic-tac-toe Connect Four: 1988 Checkers (aka 8x8 draughts): Weakly solved (2007) Rubik's Cube: Mostly solved (2010) Heads-up limit hold'em poker: Statistically optimal in the sense that "a human lifetime of play is not sufficient to establish with statistical significance that the strategy is not an exact solution" (2015) === Super-human === Othello (aka reversi): c. 1997 Scrabble: 2006 Backgammon: c. 1995–2002 Chess: Supercomputer (c. 1997); Personal computer (c. 2006); Mobile phone (c. 2009); Computer defeats human + computer (c. 2017) Jeopardy!: Question answering, although the machine did not use speech recognition (2011) Arimaa: 2015 Shogi: c. 2017 Go: 2017 Heads-up no-limit hold'em poker: 2017 Six-player no-limit hold'em poker: 2019 Gran Turismo Sport: 2022 === High-human === Crosswords: c. 2012 Freeciv: 2016 Dota 2: 2018 Bridge card-playing: According to a 2009 review, "the best programs are attaining expert status as (bridge) card players", excluding bidding. StarCraft II: 2019 Mahjong: 2019 Stratego: 2022 No-Press Diplomacy: 2022 Hanabi: 2022 Natural language processing === Par-human === Optical character recognition for ISO 1073-1:1976 and similar special characters. Classification of images Handwriting recognition Facial recognition Visual question answering SQuAD 2.0 English reading-comprehension benchmark (2019) SuperGLUE English-language understanding benchmark (2020) Some school science exams (2019) Some tasks based on Raven's Progressive Matrices Many Atari 2600 games (2015) === Sub-human === Optical character recognition for printed text (nearing par-human for Latin-script typewritten text) Object recognition Various robotics tasks that may require advances in robot hardware as well as AI, including: Stable bipedal locomotion: Bipedal robots can walk, but are less stable than human walkers (as of 2017) Humanoid soccer Speech recognition: "nearly equal to human performance" (2017) Explainability. Current medical systems can diagnose certain medical conditions well, but cannot explain to users why they made the diagnosis. Many tests of fluid intelligence (2020) Bongard visual cognition problems, such as the Bongard-LOGO benchmark (2020) Visual Commonsense Reasoning (VCR) benchmark (as of 2020) Stock market prediction: Financial data collection and processing using Machine Learning algorithms Angry Birds video game, as of 2020 Various tasks that are difficult to solve without contextual knowledge, including: Translation Word-sense disambiguation == Proposed tests of artificial intelligence == In his famous Turing test, Alan Turing picked language, the defining feature of human beings, for its basis. The Turing test is now considered too exploitable to be a meaningful benchmark. The Feigenbaum test, proposed by the inventor of expert systems, tests a machine's knowledge and expertise about a specific subject. A paper by Jim Gray of Microsoft in 2003 suggested extending the Turing test to speech understanding, speaking and recognizing objects and behavior. Proposed "universal intelligence" tests aim to compare how well machines, humans, and even non-human animals perform on problem sets that are generic as possible. At an extreme, the test suite can contain every possible problem, weighted by Kolmogorov complexity; however, these problem sets tend to be dominated by impoverished pattern-matching exercises where a tuned AI can easily exceed human performance levels. == Exams == According to OpenAI, in 2023 GPT-4 achieved high scores on several standardized and professional examinations, including around the 90th percentile on the Uniform Bar Exam, the 89th percentile on the mathematics section of the SAT, the 93rd percentile on SAT Reading and Writing, the 54th percentile on the analytical writing section of the GRE, the 88th percentile on GRE quantitative reasoning, and the 99th percentile on GRE verbal reasoning. OpenAI also reported that GPT-4 scored in the 99th to 100th percentile on the 2020 USA Biology Olympiad semifinal exam and earned top scores on several AP exams. Independent researchers found in 2023 that ChatGPT based on GPT-3.5 performed "at or near the passing threshold" on all three parts of the United States Medical Licensing Examination (USMLE), suggesting that large language models could reach passing-level performance on some medical knowledge assessments even without domain-specific fine-tuning. GPT-3.5 was also reported to attain a low but passing grade on examinations for four law school courses at the University of Minnes

Cognitive computing

Cognitive computing refers to technology platforms that, broadly speaking, are based on the scientific disciplines of artificial intelligence and signal processing. These platforms encompass machine learning, reasoning, natural language processing, speech recognition and vision (object recognition), human–computer interaction, dialog and narrative generation, among other technologies. == Definition == At present, there is no widely agreed upon definition for cognitive computing in either academia or industry. In general, the term cognitive computing has been used to refer to new hardware and/or software that mimics the functioning of the human brain (2004). In this sense, cognitive computing is a new type of computing with the goal of more accurate models of how the human brain/mind senses, reasons, and responds to stimulus. Cognitive computing applications link data analysis and adaptive page displays (AUI) to adjust content for a particular type of audience. As such, cognitive computing hardware and applications strive to be more affective and more influential by design. The term "cognitive system" also applies to any artificial construct able to perform a cognitive process where a cognitive process is the transformation of data, information, knowledge, or wisdom to a new level in the DIKW Pyramid. While many cognitive systems employ techniques having their origination in artificial intelligence research, cognitive systems, themselves, may not be artificially intelligent. For example, a neural network trained to recognize cancer on an MRI scan may achieve a higher success rate than a human doctor. This system is certainly a cognitive system but is not artificially intelligent. Cognitive systems may be engineered to feed on dynamic data in real-time, or near real-time, and may draw on multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided). == Cognitive analytics == Cognitive computing-branded technology platforms typically specialize in the processing and analysis of large, unstructured datasets. == Applications == Education Even if cognitive computing can not take the place of teachers, it can still be a heavy driving force in the education of students. Cognitive computing being used in the classroom is applied by essentially having an assistant that is personalized for each individual student. This cognitive assistant can relieve the stress that teachers face while teaching students, while also enhancing the student's learning experience over all. Teachers may not be able to pay each and every student individual attention, this being the place that cognitive computers fill the gap. Some students may need a little more help with a particular subject. For many students, Human interaction between student and teacher can cause anxiety and can be uncomfortable. With the help of Cognitive Computer tutors, students will not have to face their uneasiness and can gain the confidence to learn and do well in the classroom. While a student is in class with their personalized assistant, this assistant can develop various techniques, like creating lesson plans, to tailor and aid the student and their needs. Healthcare Numerous tech companies are in the process of developing technology that involves cognitive computing that can be used in the medical field. The ability to classify and identify is one of the main goals of these cognitive devices. This trait can be very helpful in the study of identifying carcinogens. This cognitive system that can detect would be able to assist the examiner in interpreting countless numbers of documents in a lesser amount of time than if they did not use Cognitive Computer technology. This technology can also evaluate information about the patient, looking through every medical record in depth, searching for indications that can be the source of their problems. Commerce Together with Artificial Intelligence, it has been used in warehouse management systems to collect, store, organize and analyze all related supplier data. All these aims at improving efficiency, enabling faster decision-making, monitoring inventory and fraud detection Human Cognitive Augmentation In situations where humans are using or working collaboratively with cognitive systems, called a human/cog ensemble, results achieved by the ensemble are superior to results obtainable by the human working alone. Therefore, the human is cognitively augmented. In cases where the human/cog ensemble achieves results at, or superior to, the level of a human expert then the ensemble has achieved synthetic expertise. In a human/cog ensemble, the "cog" is a cognitive system employing virtually any kind of cognitive computing technology. Other use cases Speech recognition Sentiment analysis Face detection Risk assessment Fraud detection Behavioral recommendations == Industry work == Cognitive computing in conjunction with big data and algorithms that comprehend customer needs, can be a major advantage in economic decision making. The powers of cognitive computing and artificial intelligence hold the potential to affect almost every task that humans are capable of performing. This can negatively affect employment for humans, as there would be no such need for human labor anymore. It would also increase the inequality of wealth; the people at the head of the cognitive computing industry would grow significantly richer, while workers without ongoing, reliable employment would become less well off. The more industries start to use cognitive computing, the more difficult it will be for humans to compete. Increased use of the technology will also increase the amount of work that AI-driven robots and machines can perform. The influence of competitive individuals in conjunction with artificial intelligence/cognitive computing has the potential to change the course of humankind.

Empirical dynamic modeling

Empirical dynamic modeling (EDM) is a framework for analysis and prediction of nonlinear dynamical systems. Applications include population dynamics, ecosystem service, medicine, neuroscience, dynamical systems, geophysics, and human-computer interaction. EDM was originally developed by Robert May and George Sugihara. It can be considered a methodology for data modeling, predictive analytics, dynamical system analysis, machine learning and time series analysis. == Description == Mathematical models have tremendous power to describe observations of real-world systems. They are routinely used to test hypothesis, explain mechanisms and predict future outcomes. However, real-world systems are often nonlinear and multidimensional, in some instances rendering explicit equation-based modeling problematic. Empirical models, which infer patterns and associations from the data instead of using hypothesized equations, represent a natural and flexible framework for modeling complex dynamics. Donald DeAngelis and Simeon Yurek illustrated that canonical statistical models are ill-posed when applied to nonlinear dynamical systems. A hallmark of nonlinear dynamics is state-dependence: system states are related to previous states governing transition from one state to another. EDM operates in this space, the multidimensional state-space of system dynamics rather than on one-dimensional observational time series. EDM does not presume relationships among states, for example, a functional dependence, but projects future states from localised, neighboring states. EDM is thus a state-space, nearest-neighbors paradigm where system dynamics are inferred from states derived from observational time series. This provides a model-free representation of the system naturally encompassing nonlinear dynamics. A cornerstone of EDM is recognition that time series observed from a dynamical system can be transformed into higher-dimensional state-spaces by time-delay embedding with Takens's theorem. The state-space models are evaluated based on in-sample fidelity to observations, conventionally with Pearson correlation between predictions and observations. == Methods == Primary EDM algorithms include Simplex projection, Sequential locally weighted global linear maps (S-Map) projection, Multivariate embedding in Simplex or S-Map, Convergent cross mapping (CCM), and Multiview Embeding, described below. Nearest neighbors are found according to: NN ( y , X , k ) = ‖ X N i E − y ‖ ≤ ‖ X N j E − y ‖ if 1 ≤ i ≤ j ≤ k {\displaystyle {\text{NN}}(y,X,k)=\|X_{N_{i}}^{E}-y\|\leq \|X_{N_{j}}^{E}-y\|{\text{ if }}1\leq i\leq j\leq k} === Simplex === Simplex projection is a nearest neighbor projection. It locates the k {\displaystyle k} nearest neighbors to the location in the state-space from which a prediction is desired. To minimize the number of free parameters k {\displaystyle k} is typically set to E + 1 {\displaystyle E+1} defining an E + 1 {\displaystyle E+1} dimensional simplex in the state-space. The prediction is computed as the average of the weighted phase-space simplex projected T p {\displaystyle Tp} points ahead. Each neighbor is weighted proportional to their distance to the projection origin vector in the state-space. Find k {\displaystyle k} nearest neighbor: N k ← NN ( y , X , k ) {\displaystyle N_{k}\gets {\text{NN}}(y,X,k)} Define the distance scale: d ← ‖ X N 1 E − y ‖ {\displaystyle d\gets \|X_{N_{1}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − ‖ X N i E − y ‖ / d ) {\displaystyle w_{i}\gets \exp(-\|X_{N_{i}}^{E}-y\|/d)} Average of state-space simplex: y ^ ← ∑ i = 1 k ( w i X N i + T p ) / ∑ i = 1 k w i {\displaystyle {\hat {y}}\gets \sum _{i=1}^{k}\left(w_{i}X_{N_{i}+T_{p}}\right)/\sum _{i=1}^{k}w_{i}} === S-Map === S-Map extends the state-space prediction in Simplex from an average of the E + 1 {\displaystyle E+1} nearest neighbors to a linear regression fit to all neighbors, but localised with an exponential decay kernel. The exponential localisation function is F ( θ ) = exp ( − θ d / D ) {\displaystyle F(\theta )={\text{exp}}(-\theta d/D)} , where d {\displaystyle d} is the neighbor distance and D {\displaystyle D} the mean distance. In this way, depending on the value of θ {\displaystyle \theta } , neighbors close to the prediction origin point have a higher weight than those further from it, such that a local linear approximation to the nonlinear system is reasonable. This localisation ability allows one to identify an optimal local scale, in-effect quantifying the degree of state dependence, and hence nonlinearity of the system. Another feature of S-Map is that for a properly fit model, the regression coefficients between variables have been shown to approximate the gradient (directional derivative) of variables along the manifold. These Jacobians represent the time-varying interaction strengths between system variables. Find k {\displaystyle k} nearest neighbor: N ← NN ( y , X , k ) {\displaystyle N\gets {\text{NN}}(y,X,k)} Sum of distances: D ← 1 k ∑ i = 1 k ‖ X N i E − y ‖ {\displaystyle D\gets {\frac {1}{k}}\sum _{i=1}^{k}\|X_{N_{i}}^{E}-y\|} Compute weights: For{ i = 1 , … , k {\displaystyle i=1,\dots ,k} } : w i ← exp ⁡ ( − θ ‖ X N i E − y ‖ / D ) {\displaystyle w_{i}\gets \exp(-\theta \|X_{N_{i}}^{E}-y\|/D)} Reweighting matrix: W ← diag ( w i ) {\displaystyle W\gets {\text{diag}}(w_{i})} Design matrix: A ← [ 1 X N 1 X N 1 − 1 … X N 1 − E + 1 1 X N 2 X N 2 − 1 … X N 2 − E + 1 ⋮ ⋮ ⋮ ⋱ ⋮ 1 X N k X N k − 1 … X N k − E + 1 ] {\displaystyle A\gets {\begin{bmatrix}1&X_{N_{1}}&X_{N_{1}-1}&\dots &X_{N_{1}-E+1}\\1&X_{N_{2}}&X_{N_{2}-1}&\dots &X_{N_{2}-E+1}\\\vdots &\vdots &\vdots &\ddots &\vdots \\1&X_{N_{k}}&X_{N_{k}-1}&\dots &X_{N_{k}-E+1}\end{bmatrix}}} Weighted design matrix: A ← W A {\displaystyle A\gets WA} Response vector at T p {\displaystyle Tp} : b ← [ X N 1 + T p X N 2 + T p ⋮ X N k + T p ] {\displaystyle b\gets {\begin{bmatrix}X_{N_{1}+T_{p}}\\X_{N_{2}+T_{p}}\\\vdots \\X_{N_{k}+T_{p}}\end{bmatrix}}} Weighted response vector: b ← W b {\displaystyle b\gets Wb} Least squares solution (SVD): c ^ ← argmin c ‖ A c − b ‖ 2 2 {\displaystyle {\hat {c}}\gets {\text{argmin}}_{c}\|Ac-b\|_{2}^{2}} Local linear model c ^ {\displaystyle {\hat {c}}} is prediction: y ^ ← c ^ 0 + ∑ i = 1 E c ^ i y i {\displaystyle {\hat {y}}\gets {\hat {c}}_{0}+\sum _{i=1}^{E}{\hat {c}}_{i}y_{i}} === Multivariate Embedding === Multivariate Embedding recognizes that time-delay embeddings are not the only valid state-space construction. In Simplex and S-Map one can generate a state-space from observational vectors, or time-delay embeddings of a single observational time series, or both. === Convergent Cross Mapping === Convergent cross mapping (CCM) leverages a corollary to the Generalized Takens Theorem that it should be possible to cross predict or cross map between variables observed from the same system. Suppose that in some dynamical system involving variables X {\displaystyle X} and Y {\displaystyle Y} , X {\displaystyle X} causes Y {\displaystyle Y} . Since X {\displaystyle X} and Y {\displaystyle Y} belong to the same dynamical system, their reconstructions (via embeddings) M x {\displaystyle M_{x}} , and M y {\displaystyle M_{y}} , also map to the same system. The causal variable X {\displaystyle X} leaves a signature on the affected variable Y {\displaystyle Y} , and consequently, the reconstructed states based on Y {\displaystyle Y} can be used to cross predict values of X {\displaystyle X} . CCM leverages this property to infer causality by predicting X {\displaystyle X} using the M y {\displaystyle M_{y}} library of points (or vice versa for the other direction of causality), while assessing improvements in cross map predictability as larger and larger random samplings of M y {\displaystyle M_{y}} are used. If the prediction skill of X {\displaystyle X} increases and saturates as the entire M y {\displaystyle M_{y}} is used, this provides evidence that X {\displaystyle X} is casually influencing Y {\displaystyle Y} . === Multiview Embedding === Multiview Embedding is a Dimensionality reduction technique where a large number of state-space time series vectors are combitorially assessed towards maximal model predictability. == Extensions == Extensions to EDM techniques include: Generalized Theorems for Nonlinear State Space Reconstruction Extended Convergent Cross Mapping Dynamic stability S-Map regularization Visual analytics with EDM Convergent Cross Sorting Expert system with EDM hybrid Sliding windows based on the extended convergent cross-mapping Empirical Mode Modeling Accounting for missing data and variable step sizes Accounting for observation noise Hierarchical Bayesian EDM via Gaussian processes Intelligent and Adaptive Control Optimal control via Empirical dynamic programming Multiview distance regularised S-map

Vulnerability Discovery Model

A Vulnerability Discovery Model (VDM) uses discovery event data with software reliability models for predicting the same. A thorough presentation of VDM techniques is available in. Numerous model implementations are available in the MCMCBayes open source repository. Several VDM examples include: Alhazmi-Malaiya: Time based model (Alhazmi-Malaiya Logistic (AML) model) Alhazmi-Malaiya: Effort based model Rescorla: Quadratic Model and Exponential Model Anderson: Thermodynamic Model Kim: Weibull Model Linear Model Hump-Shaped Model Independent and Dependent Model Vulnerability Discovery Modeling using Bayesian model averaging Multivariate Vulnerability Discovery Models

Bayesian programming

Bayesian programming is a formalism and a methodology for having a technique to specify probabilistic models and solve problems when less than the necessary information is available. Edwin T. Jaynes proposed that probability could be considered as an alternative and an extension of logic for rational reasoning with incomplete and uncertain information. In his founding book Probability Theory: The Logic of Science he developed this theory and proposed what he called "the robot," which was not a physical device, but an inference engine to automate probabilistic reasoning—a kind of Prolog for probability instead of logic. Bayesian programming is a formal and concrete implementation of this "robot". Bayesian programming may also be seen as an algebraic formalism to specify graphical models such as, for instance, Bayesian networks, dynamic Bayesian networks, Kalman filters or hidden Markov models. Indeed, Bayesian programming is more general than Bayesian networks and has a power of expression equivalent to probabilistic factor graphs. == Formalism == A Bayesian program is a means of specifying a family of probability distributions. The constituent elements of a Bayesian program are presented below: Program { Description { Specification ( π ) { Variables Decomposition Forms Identification (based on δ ) Question {\displaystyle {\text{Program}}{\begin{cases}{\text{Description}}{\begin{cases}{\text{Specification}}(\pi ){\begin{cases}{\text{Variables}}\\{\text{Decomposition}}\\{\text{Forms}}\\\end{cases}}\\{\text{Identification (based on }}\delta )\end{cases}}\\{\text{Question}}\end{cases}}} A program is constructed from a description and a question. A description is constructed using some specification ( π {\displaystyle \pi } ) as given by the programmer and an identification or learning process for the parameters not completely specified by the specification, using a data set ( δ {\displaystyle \delta } ). A specification is constructed from a set of pertinent variables, a decomposition and a set of forms. Forms are either parametric forms or questions to other Bayesian programs. A question specifies which probability distribution has to be computed. === Description === The purpose of a description is to specify an effective method of computing a joint probability distribution on a set of variables { X 1 , X 2 , ⋯ , X N } {\displaystyle \left\{X_{1},X_{2},\cdots ,X_{N}\right\}} given a set of experimental data δ {\displaystyle \delta } and some specification π {\displaystyle \pi } . This joint distribution is denoted as: P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) {\displaystyle P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)} . To specify preliminary knowledge π {\displaystyle \pi } , the programmer must undertake the following: Define the set of relevant variables { X 1 , X 2 , ⋯ , X N } {\displaystyle \left\{X_{1},X_{2},\cdots ,X_{N}\right\}} on which the joint distribution is defined. Decompose the joint distribution (break it into relevant independent or conditional probabilities). Define the forms of each of the distributions (e.g., for each variable, one of the list of probability distributions). ==== Decomposition ==== Given a partition of { X 1 , X 2 , … , X N } {\displaystyle \left\{X_{1},X_{2},\ldots ,X_{N}\right\}} containing K {\displaystyle K} subsets, K {\displaystyle K} variables are defined L 1 , ⋯ , L K {\displaystyle L_{1},\cdots ,L_{K}} , each corresponding to one of these subsets. Each variable L k {\displaystyle L_{k}} is obtained as the conjunction of the variables { X k 1 , X k 2 , ⋯ } {\displaystyle \left\{X_{k_{1}},X_{k_{2}},\cdots \right\}} belonging to the k t h {\displaystyle k^{th}} subset. Recursive application of Bayes' theorem leads to: P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) = P ( L 1 ∧ ⋯ ∧ L K ∣ δ ∧ π ) = P ( L 1 ∣ δ ∧ π ) × P ( L 2 ∣ L 1 ∧ δ ∧ π ) × ⋯ × P ( L K ∣ L K − 1 ∧ ⋯ ∧ L 1 ∧ δ ∧ π ) {\displaystyle {\begin{aligned}&P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)\\={}&P\left(L_{1}\wedge \cdots \wedge L_{K}\mid \delta \wedge \pi \right)\\={}&P\left(L_{1}\mid \delta \wedge \pi \right)\times P\left(L_{2}\mid L_{1}\wedge \delta \wedge \pi \right)\times \cdots \times P\left(L_{K}\mid L_{K-1}\wedge \cdots \wedge L_{1}\wedge \delta \wedge \pi \right)\end{aligned}}} Conditional independence hypotheses then allow further simplifications. A conditional independence hypothesis for variable L k {\displaystyle L_{k}} is defined by choosing some variable X n {\displaystyle X_{n}} among the variables appearing in the conjunction L k − 1 ∧ ⋯ ∧ L 2 ∧ L 1 {\displaystyle L_{k-1}\wedge \cdots \wedge L_{2}\wedge L_{1}} , labelling R k {\displaystyle R_{k}} as the conjunction of these chosen variables and setting: P ( L k ∣ L k − 1 ∧ ⋯ ∧ L 1 ∧ δ ∧ π ) = P ( L k ∣ R k ∧ δ ∧ π ) {\displaystyle P\left(L_{k}\mid L_{k-1}\wedge \cdots \wedge L_{1}\wedge \delta \wedge \pi \right)=P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)} We then obtain: P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) = P ( L 1 ∣ δ ∧ π ) × P ( L 2 ∣ R 2 ∧ δ ∧ π ) × ⋯ × P ( L K ∣ R K ∧ δ ∧ π ) {\displaystyle {\begin{aligned}&P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)\\={}&P\left(L_{1}\mid \delta \wedge \pi \right)\times P\left(L_{2}\mid R_{2}\wedge \delta \wedge \pi \right)\times \cdots \times P\left(L_{K}\mid R_{K}\wedge \delta \wedge \pi \right)\end{aligned}}} Such a simplification of the joint distribution as a product of simpler distributions is called a decomposition, derived using the chain rule. This ensures that each variable appears at the most once on the left of a conditioning bar, which is the necessary and sufficient condition to write mathematically valid decompositions. ==== Forms ==== Each distribution P ( L k ∣ R k ∧ δ ∧ π ) {\displaystyle P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)} appearing in the product is then associated with either a parametric form (i.e., a function f μ ( L k ) {\displaystyle f_{\mu }\left(L_{k}\right)} ) or a question to another Bayesian program P ( L k ∣ R k ∧ δ ∧ π ) = P ( L ∣ R ∧ δ ^ ∧ π ^ ) {\displaystyle P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)=P\left(L\mid R\wedge {\widehat {\delta }}\wedge {\widehat {\pi }}\right)} . When it is a form f μ ( L k ) {\displaystyle f_{\mu }\left(L_{k}\right)} , in general, μ {\displaystyle \mu } is a vector of parameters that may depend on R k {\displaystyle R_{k}} or δ {\displaystyle \delta } or both. Learning takes place when some of these parameters are computed using the data set δ {\displaystyle \delta } . An important feature of Bayesian programming is this capacity to use questions to other Bayesian programs as components of the definition of a new Bayesian program. P ( L k ∣ R k ∧ δ ∧ π ) {\displaystyle P\left(L_{k}\mid R_{k}\wedge \delta \wedge \pi \right)} is obtained by some inferences done by another Bayesian program defined by the specifications π ^ {\displaystyle {\widehat {\pi }}} and the data δ ^ {\displaystyle {\widehat {\delta }}} . This is similar to calling a subroutine in classical programming and provides an easy way to build hierarchical models. === Question === Given a description (i.e., P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) {\displaystyle P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)} ), a question is obtained by partitioning { X 1 , X 2 , ⋯ , X N } {\displaystyle \left\{X_{1},X_{2},\cdots ,X_{N}\right\}} into three sets: the searched variables, the known variables and the free variables. The 3 variables S e a r c h e d {\displaystyle Searched} , K n o w n {\displaystyle Known} and F r e e {\displaystyle Free} are defined as the conjunction of the variables belonging to these sets. A question is defined as the set of distributions: P ( S e a r c h e d ∣ Known ∧ δ ∧ π ) {\displaystyle P\left(Searched\mid {\text{Known}}\wedge \delta \wedge \pi \right)} made of many "instantiated questions" as the cardinal of K n o w n {\displaystyle Known} , each instantiated question being the distribution: P ( Searched ∣ Known ∧ δ ∧ π ) {\displaystyle P\left({\text{Searched}}\mid {\text{Known}}\wedge \delta \wedge \pi \right)} === Inference === Given the joint distribution P ( X 1 ∧ X 2 ∧ ⋯ ∧ X N ∣ δ ∧ π ) {\displaystyle P\left(X_{1}\wedge X_{2}\wedge \cdots \wedge X_{N}\mid \delta \wedge \pi \right)} , it is always possible to compute any possible question using the following general inference: P ( Searched ∣ Known ∧ δ ∧ π ) = ∑ Free [ P ( Searched ∧ Free ∣ Known ∧ δ ∧ π ) ] = ∑ Free [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] P ( Known ∣ δ ∧ π ) = ∑ Free [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] ∑ Free ∧ Searched [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] = 1 Z × ∑ Free [ P ( Searched ∧ Free ∧ Known ∣ δ ∧ π ) ] {\displaystyle {\begin{aligned}&P\left({\text{Searched}}\mid {\text{Known}}\wedge \delta \wedge \pi \right)\\={}&\sum _{\text{Free}}\left[P\left({\text{Searched}}\wedge {\text{Free}}\mid {\text{Known}}\wedge \delta \wedge \