Neural processing unit

Neural processing unit

A neural processing unit (NPU), also known as an AI accelerator or deep learning processor, is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. == Use == Their purpose is either to efficiently execute already trained AI models (inference) or to train AI models. NPUs can be more efficient in terms of speed or power consumption. NPU applications include algorithms for robotics, Internet of things, and data-intensive or sensor-driven tasks. They are often manycore or spatial designs and focus on low-precision arithmetic, novel dataflow architectures, or in-memory computing capability. As of 2024, a widely used datacenter-grade AI integrated circuit chip, the Nvidia H100 GPU, contains tens of billions of MOSFETs. === Consumer devices === AI accelerators are used in Apple silicon, Qualcomm, Samsung, Huawei, and Google Tensor smartphone processors. Vision processing units are accelerators specialized for machine vision algorithms such as CNN (convolutional neural networks) and SIFT (scale-invariant feature transform). They are used in devices that need to keep track of objects visually such as AR headsets and drones. It is more recently (circa 2017) added to processors from Apple and (circa 2022) to processors from Intel and AMD. All models of Intel Meteor Lake processors have a built-in versatile processor unit (VPU) for accelerating inference for computer vision and deep learning. On consumer devices, the NPU is intended to be small, power-efficient, but reasonably fast when used to run small models. To do this they are designed to support low-bitwidth operations using data types such as INT4, INT8, FP8, and FP16. A common metric is trillions of operations per second (TOPS). Although TOPS does not explicitly specify the kind of operations, it is typically INT8 additions and multiplications. === Datacenters === Accelerators are used in cloud computing servers: e.g., tensor processing units (TPU) for Google Cloud Platform, and Trainium and Inferentia chips for Amazon Web Services. Many vendor-specific terms exist for devices in this category, and it is an emerging technology without a dominant design. Since the late 2010s, graphics processing units designed by companies such as Nvidia and AMD often include AI-specific hardware in the form of dedicated functional units for low-precision matrix-multiplication operations. These GPUs are commonly used as AI accelerators, both for training and inference. === Scientific computation === Although NPUs are tailored for low-precision (e.g., FP16, INT8) matrix multiplication operations, they can be used to emulate higher-precision matrix multiplications in scientific computing. As modern GPUs place much focus on making the NPU part fast, using emulated FP64 (Ozaki scheme) on NPUs can potentially outperform native FP64. This has been demonstrated using FP16-emulated FP64 on NVIDIA TITAN RTX and using INT8-emulated FP64 on NVIDIA consumer GPUs and the A100 GPU. Consumer GPUs especially benefited as they have limited FP64 hardware capacity, showing a 6× speedup. Since CUDA Toolkit 13.0 Update 2, cuBLAS automatically uses INT8-emulated FP64 matrix multiplication of the equivalent precision if it is faster than native. This is in addition to the FP16-emulated FP32 feature introduced in version 12.9. == Programming == An operating system or a higher-level library may provide application programming interfaces such as TensorFlow with LiteRT Next (Android), CoreML (iOS, macOS) or DirectML (Windows). Formats such as ONNX are used to represent trained neural networks. Consumer CPU-integrated NPUs are accessible through vendor-specific APIs. AMD (Ryzen AI), Intel (OpenVINO), Apple silicon (CoreML), and Qualcomm (SNPE) each have their own APIs, which can be built upon by a higher-level library. GPUs generally use existing GPGPU pipelines such as CUDA and OpenCL adapted for lower precisions and specialized matrix-multiplication operations. Vulkan is also being used. Custom-built systems such as the Google TPU use private interfaces. There are a large number of separate underlying acceleration APIs and compilers/runtimes in use in the AI field, causing a great increase in software development effort due to the many combinations involved. As of 2025, the open standard organization Khronos Group is pursuing standardization of AI-related interfaces to reduce the amount of work needed. Khronos is working on three separate fronts: expansion of data types and intrinsic operations in OpenCL and Vulkan, inclusion of compute graphs in SPIR-V, and a NNEF/SkriptND file format for describing a neural network.

Centurion Guard

Centurion Guard is a PC hardware and software-based security product, developed by Centurion Technologies. It was first released in 1996. There were several different releases and versions of this product, and many were distributed in computers donated to libraries by the Bill & Melinda Gates Foundation. == Operating system compatibility == Microsoft Windows 7 Microsoft Windows Vista Microsoft Windows XP

Causal Markov condition

The Causal Markov (CM) condition states that, conditional on the set of all its direct causes, a node is independent of all variables which are not effects or direct causes of that node. In the event that the structure of a Bayesian network accurately depicts causality, the two conditions are equivalent. This is related to the Markov condition, an assumption made in Bayesian probability theory, that every node in a Bayesian network is conditionally independent of its nondescendants, given its parents. Stated loosely, it is assumed that a node has no bearing on nodes which do not descend from it. In a DAG, this local Markov condition is equivalent to the global Markov condition, which states that d-separations in the graph also correspond to conditional independence relations. This also means that a node is conditionally independent of the entire network, given its Markov blanket. A network may accurately embody the Markov condition without depicting causality, in which case it should not be assumed to embody the causal Markov condition. == Motivation == Statisticians are enormously interested in the ways in which certain events and variables are connected. The precise notion of what constitutes a cause and effect is necessary to understand the connections between them. The central idea behind the philosophical study of probabilistic causation is that causes raise the probabilities of their effects, all else being equal. A deterministic interpretation of causation means that if A causes B, then A must always be followed by B. In this sense, smoking does not cause cancer because some smokers never develop cancer. On the other hand, a probabilistic interpretation simply means that causes raise the probability of their effects. In this sense, changes in meteorological readings associated with a storm do cause that storm, since they raise its probability. (However, simply looking at a barometer does not change the probability of the storm, for a more detailed analysis, see:). == Examples == In a simple view, releasing one's hand from a hammer causes the hammer to fall. However, doing so in outer space does not produce the same outcome, calling into question if releasing one's fingers from a hammer always causes it to fall. A causal graph could be created to acknowledge that both the presence of gravity and the release of the hammer contribute to its falling. However, it would be very surprising if the surface underneath the hammer affected its falling. This essentially states the Causal Markov Condition, that given the existence of gravity the release of the hammer, it will fall regardless of what is beneath it. == Implications == === Dependence and Causation === It follows from the definition that if X and Y are in V and are probabilistically dependent, then either X causes Y, Y causes X, or X and Y are both effects of some common cause Z in V. This definition was seminally introduced by Hans Reichenbach as the Common Cause Principle (CCP). === Screening === It once again follows from the definition that the parents of X screen X from other "indirect causes" of X (parents of Parents(X)) and other effects of Parents(X) which are not also effects of X.

Rprop

Rprop, short for resilient backpropagation, is a learning heuristic for supervised learning in feedforward artificial neural networks. This is a first-order optimization algorithm. This algorithm was created by Martin Riedmiller and Heinrich Braun in 1992. Similarly to the Manhattan update rule, Rprop takes into account only the sign of the partial derivative over all patterns (not the magnitude), and acts independently on each "weight". For each weight, if there was a sign change of the partial derivative of the total error function compared to the last iteration, the update value for that weight is multiplied by a factor η−, where η− < 1. If the last iteration produced the same sign, the update value is multiplied by a factor of η+, where η+ > 1. The update values are calculated for each weight in the above manner, and finally each weight is changed by its own update value, in the opposite direction of that weight's partial derivative, so as to minimise the total error function. η+ is empirically set to 1.2 and η− to 0.5. Rprop can result in very large weight increments or decrements if the gradients are large, which is a problem when using mini-batches as opposed to full batches. RMSprop addresses this problem by keeping the moving average of the squared gradients for each weight and dividing the gradient by the square root of the mean square. RPROP is a batch update algorithm. Next to the cascade correlation algorithm and the Levenberg–Marquardt algorithm, Rprop is one of the fastest weight update mechanisms. == Variations == Martin Riedmiller developed three algorithms, all named RPROP. Igel and Hüsken assigned names to them and added a new variant: RPROP+ is defined at A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm. RPROP− is defined at Advanced Supervised Learning in Multi-layer Perceptrons – From Backpropagation to Adaptive Learning Algorithms. Backtracking is removed from RPROP+. iRPROP− is defined in Rprop – Description and Implementation Details and was reinvented by Igel and Hüsken. This variant is very popular and most simple. iRPROP+ is defined at Improving the Rprop Learning Algorithm and is very robust and typically faster than the other three variants.

IBM Watsonx

Watsonx is a platform by IBM for building and managing artificial intelligence (AI) applications for business use. Released on May 9, 2023, the platform provides software tools and infrastructure for companies to work with both IBM's own AI models and models from third-party sources. The platform consists of three main components: watsonx.ai, a studio for training, validating, and deploying AI models; watsonx.data, a system for storing and managing data used by the models; and watsonx.governance, a toolkit to ensure AI applications are compliant with company policies and regulations. A key feature of the platform is that it can be trained on a company's private data to perform specialized tasks, a process known as fine-tuning. IBM states that this client-specific data is not used to train its own models. == History == Watsonx was introduced on May 9, 2023, at the annual IBM Think conference, as a platform that includes multiple services. Just like Watson AI computer with the similar name, Watsonx was named after Thomas J. Watson, IBM's founder and first CEO. On February 13, 2024, Anaconda partnered with IBM to embed its open-source Python packages into Watsonx. Watsonx is used at ESPN's Fantasy Football App for managing players' performance, and by Italian telecommunications company Wind Tre. It was employed to generate editorial content around nominees during the 66th Annual Grammy Awards. In 2025, Wimbledon integrated IBM watsonx generative AI into its app and website. Integrated with IBM Safer Payments, IBM watsonx has been used in banking sector fraud detection and anti-money laundering (AML) systems. == Services == === watsonx.ai === Watsonx.ai is a platform that allows AI developers to leverage a wide range of LLMs under IBM's own Granite series and others such as Facebook's LLaMA-2, free and open-source model Mistral, and many others present in the Hugging Face community. These models come pre-trained and optimized for various natural language processing (NLP) applications.The platform also allows fine-tuning with its Tuning Studio. === watsonx.data === Watsonx.data is a platform designed to assist clients in addressing issues related to data volume, complexity, cost, and governance.. The platform facilitates seamless data access, whether stored in the cloud or on-premises, through a single entry point. === watsonx.governance === Watsonx.governance is a platform that utilizes IBM's AI capabilities to implement AI lifecycle governance. This helps them manage risks and maintain compliance with evolving AI and industry regulations, while reducing AI bias through automated oversight.

Image texture

An image texture is the small-scale structure perceived on an image, based on the spatial arrangement of color or intensities. It can be quantified by a set of metrics calculated in image processing. Image texture metrics give us information about the whole image or selected regions. Image textures can be artificially created or found in natural scenes captured in an image. Image textures are one way that can be used to help in segmentation or classification of images. For more accurate segmentation the most useful features are spatial frequency and an average grey level. To analyze an image texture in computer graphics, there are two ways to approach the issue: structured approach and statistical approach. == Structured approach == A structured approach sees an image texture as a set of primitive texels in some regular or repeated pattern. This works well when analyzing artificial textures. To obtain a structured description a characterization of the spatial relationship of the texels is gathered by using Voronoi tessellation of the texels. == Statistical approach == A statistical approach sees an image texture as a quantitative measure of the arrangement of intensities in a region. In general this approach is easier to compute and is more widely used, since natural textures are made of patterns of irregular subelements. === Edge detection === The use of edge detection is to determine the number of edge pixels in a specified region, helps determine a characteristic of texture complexity. After edges have been found the direction of the edges can also be applied as a characteristic of texture and can be useful in determining patterns in the texture. These directions can be represented as an average or in a histogram. Consider a region with N pixels. the gradient-based edge detector is applied to this region by producing two outputs for each pixel p: the gradient magnitude Mag(p) and the gradient direction Dir(p). The edgeness per unit area can be defined by F e d g e n e s s = | { p | M a g ( p ) > T } | N {\displaystyle F_{edgeness}={\frac {|\{p|Mag(p)>T\}|}{N}}} for some threshold T. To include orientation with edgeness histograms for both gradient magnitude and gradient direction can be used. Hmag(R) denotes the normalized histogram of gradient magnitudes of region R, and Hdir(R) denotes the normalized histogram of gradient orientations of region R. Both are normalized according to the size NR Then F m a g , d i r = ( H m a g ( R ) , H d i r ( R ) ) {\displaystyle F_{mag,dir}=(H_{mag}(R),H_{dir}(R))} is a quantitative texture description of region R. === Co-occurrence matrices === The co-occurrence matrix captures numerical features of a texture using spatial relations of similar gray tones. Numerical features computed from the co-occurrence matrix can be used to represent, compare, and classify textures. The following are a subset of standard features derivable from a normalized co-occurrence matrix: A n g u l a r 2 n d M o m e n t = ∑ i ∑ j p [ i , j ] 2 C o n t r a s t = ∑ i = 1 N g ∑ j = 1 N g n 2 p [ i , j ] , where | i − j | = n C o r r e l a t i o n = ∑ i = 1 N g ∑ j = 1 N g ( i j ) p [ i , j ] − μ x μ y σ x σ y E n t r o p y = − ∑ i ∑ j p [ i , j ] l n ( p [ i , j ] ) {\displaystyle {\begin{aligned}Angular{\text{ }}2nd{\text{ }}Moment&=\sum _{i}\sum _{j}p[i,j]^{2}\\Contrast&=\sum _{i=1}^{Ng}\sum _{j=1}^{Ng}n^{2}p[i,j]{\text{, where }}|i-j|=n\\Correlation&={\frac {\sum _{i=1}^{Ng}\sum _{j=1}^{Ng}(ij)p[i,j]-\mu _{x}\mu _{y}}{\sigma _{x}\sigma _{y}}}\\Entropy&=-\sum _{i}\sum _{j}p[i,j]ln(p[i,j])\\\end{aligned}}} where p [ i , j ] {\displaystyle p[i,j]} is the [ i , j ] {\displaystyle [i,j]} th entry in a gray-tone spatial dependence matrix, and Ng is the number of distinct gray-levels in the quantized image. One negative aspect of the co-occurrence matrix is that the extracted features do not necessarily correspond to visual perception. It is used in dentistry for the objective evaluation of lesions [DOI: 10.1155/2020/8831161], treatment efficacy [DOI: 10.3390/ma13163614; DOI: 10.11607/jomi.5686; DOI: 10.3390/ma13173854; DOI: 10.3390/ma13132935] and bone reconstruction during healing [DOI: 10.5114/aoms.2013.33557; DOI: 10.1259/dmfr/22185098; EID: 2-s2.0-81455161223; DOI: 10.3390/ma13163649]. === Laws texture energy measures === Another approach is to use local masks to detect various types of texture features. Laws originally used four vectors representing texture features to create sixteen 2D masks from the outer products of the pairs of vectors. The four vectors and relevant features were as follows: L5 = [ +1 +4 6 +4 +1 ] (Level) E5 = [ -1 -2 0 +2 +1 ] (Edge) S5 = [ -1 0 2 0 -1 ] (Spot) R5 = [ +1 -4 6 -4 +1 ] (Ripple) To these 4, a fifth is sometimes added: W5 = [ -1 +2 0 -2 +1 ] (Wave) From Laws' 4 vectors, 16 5x5 "energy maps" are then filtered down to 9 in order to remove certain symmetric pairs. For instance, L5E5 measures vertical edge content and E5L5 measures horizontal edge content. The average of these two measures is the "edginess" of the content. The resulting 9 maps used by Laws are as follows: L5E5/E5L5 L5R5/R5L5 E5S5/S5E5 S5S5 R5R5 L5S5/S5L5 E5E5 E5R5/R5E5 S5R5/R5S5 Running each of these nine maps over an image to create a new image of the value of the origin ([2,2]) results in 9 "energy maps," or conceptually an image with each pixel associated with a vector of 9 texture attributes. === Autocorrelation and power spectrum === The autocorrelation function of an image can be used to detect repetitive patterns of textures. == Texture segmentation == The use of image texture can be used as a description for regions into segments. There are two main types of segmentation based on image texture, region based and boundary based. Though image texture is not a perfect measure for segmentation it is used along with other measures, such as color, that helps solve segmenting in image. === Region based === Attempts to group or cluster pixels based on texture properties. === Boundary based === Attempts to group or cluster pixels based on edges between pixels that come from different texture properties.

Tensor product network

A tensor product network, in artificial neural networks, is a network that exploits the properties of tensors to model associative concepts such as variable assignment. Orthonormal vectors are chosen to model the ideas (such as variable names and target assignments), and the tensor product of these vectors construct a network whose mathematical properties allow the user to easily extract the association from it.