Co-occurrence matrix

Co-occurrence matrix

A co-occurrence matrix or co-occurrence distribution (also referred to as : gray-level co-occurrence matrices GLCMs) is a matrix that is defined over an image to be the distribution of co-occurring pixel values (grayscale values, or colors) at a given offset. It is used as an approach to texture analysis with various applications especially in medical image analysis. == Method == Given a grey-level image I {\displaystyle I} , co-occurrence matrix computes how often pairs of pixels with a specific value and offset occur in the image. The offset, ( Δ x , Δ y ) {\displaystyle (\Delta x,\Delta y)} , is a position operator that can be applied to any pixel in the image (ignoring edge effects): for instance, ( 1 , 2 ) {\displaystyle (1,2)} could indicate "one down, two right". An image with p {\displaystyle p} different pixel values will produce a p × p {\displaystyle p\times p} co-occurrence matrix, for the given offset. The ( i , j ) th {\displaystyle (i,j)^{\text{th}}} value of the co-occurrence matrix gives the number of times in the image that the i th {\displaystyle i^{\text{th}}} and j th {\displaystyle j^{\text{th}}} pixel values occur in the relation given by the offset. For an image with p {\displaystyle p} different pixel values, the p × p {\displaystyle p\times p} co-occurrence matrix C is defined over an n × m {\displaystyle n\times m} image I {\displaystyle I} , parameterized by an offset ( Δ x , Δ y ) {\displaystyle (\Delta x,\Delta y)} , as: C Δ x , Δ y ( i , j ) = ∑ x = 1 n ∑ y = 1 m { 1 , if I ( x , y ) = i and I ( x + Δ x , y + Δ y ) = j 0 , otherwise {\displaystyle C_{\Delta x,\Delta y}(i,j)=\sum _{x=1}^{n}\sum _{y=1}^{m}{\begin{cases}1,&{\text{if }}I(x,y)=i{\text{ and }}I(x+\Delta x,y+\Delta y)=j\\0,&{\text{otherwise}}\end{cases}}} where: i {\displaystyle i} and j {\displaystyle j} are the pixel values; x {\displaystyle x} and y {\displaystyle y} are the spatial positions in the image I; the offsets ( Δ x , Δ y ) {\displaystyle (\Delta x,\Delta y)} define the spatial relation for which this matrix is calculated; and I ( x , y ) {\displaystyle I(x,y)} indicates the pixel value at pixel ( x , y ) {\displaystyle (x,y)} . The 'value' of the image originally referred to the grayscale value of the specified pixel, but could be anything, from a binary on/off value to 32-bit color and beyond. (Note that 32-bit color will yield a 232 × 232 co-occurrence matrix!) Co-occurrence matrices can also be parameterized in terms of a distance, d {\displaystyle d} , and an angle, θ {\displaystyle \theta } , instead of an offset ( Δ x , Δ y ) {\displaystyle (\Delta x,\Delta y)} . Any matrix or pair of matrices can be used to generate a co-occurrence matrix, though their most common application has been in measuring texture in images, so the typical definition, as above, assumes that the matrix is an image. It is also possible to define the matrix across two different images. Such a matrix can then be used for color mapping. == Aliases == Co-occurrence matrices are also referred to as: GLCMs (gray-level co-occurrence matrices) GLCHs (gray-level co-occurrence histograms) spatial dependence matrices == Application to image analysis == Whether considering the intensity or grayscale values of the image or various dimensions of color, the co-occurrence matrix can measure the texture of the image. Because co-occurrence matrices are typically large and sparse, various metrics of the matrix are often taken to get a more useful set of features. Features generated using this technique are usually called Haralick features, after Robert Haralick. Texture analysis is often concerned with detecting aspects of an image that are rotationally invariant. To approximate this, the co-occurrence matrices corresponding to the same relation, but rotated at various regular angles (e.g. 0, 45, 90, and 135 degrees), are often calculated and summed. Texture measures like the co-occurrence matrix, wavelet transforms, and model fitting have found application in medical image analysis in particular. == Other applications == Co-occurrence matrices are also used for words processing in natural language processing (NLP).

WaveMaker

WaveMaker is a Java-based low-code development platform designed for building software applications and platforms. The company, WaveMaker Inc., is based in Mountain View, California. The platform is intended to assist enterprises in speeding up their application development and IT modernization initiatives through low-code capabilities. Additionally, for independent software vendors (ISVs), WaveMaker serves as a customizable low-code component that integrates into their products. The WaveMaker Platform is a licensed software platform allowing organizations to establish their own end-to-application platform-as-a-service (PaaS) for the creation and operation of custom apps. It allows developers and business users to create apps that are customizable. These applications can seamlessly consume APIs, visualize data, and automatically adapt to multi-device responsive interfaces. WaveMaker's low-code platform allows organizations to deploy applications on either public or private cloud infrastructure. Containers can be deployed on top of virtual machines or directly on bare metal. The software features a graphical user interface (GUI) console for managing IT app infrastructure, leveraging the capabilities of Docker containerization. The solution offers functionalities for automating application deployment, managing the application lifecycle, overseeing release management, and controlling deployment workflows and access permissions: Apps for web, tablet, and smartphone interfaces Enterprise technologies like Java, Hibernate, Spring, AngularJS, JQuery Docker-provided APIs and CLI Software stack packaging, container provisioning, stack and app upgrading, replication, and fault tolerance == WaveMaker Studio == WaveMaker RAD Platform is built around WaveMaker Studio, a WYSIWYG rapid development tool that allows business users to compose an application using a drag-and-drop method. WaveMaker Studio supports rapid application development (RAD) for the web, similar to what products like PowerBuilder and Lotus Notes provided for client-server computing. WaveMaker Studio allows developers to produce an application once, then automatically adjust it for a particular target platform, whether a PC, mobile phone, or tablet. Applications created using the WaveMaker Studio follow a model–view–controller architecture. WaveMaker Studio has been downloaded more than two million times. The Studio community consists of 30,000 registered users. Applications generated by WaveMaker Studio are licensed under the Apache license. Studio 8 was released on September 25, 2015. The prior version, Studio 7, has some notable development milestones. It was based on AngularJS framework, previous Studio versions (6.7, 6.6, 6.5) use the Dojo Toolkit. Some of the features WaveMaker Studio 7 include: Automatic generation of Hibernate mapping, and Hibernate queries from database schema import. Automatic creation of Enterprise Data Widgets based on schema import. Each widget can display data from a database table as a grid or edit form. Edit form implements create, update, and delete functions automatically. WYSIWYG Ajax development studio runs in a browser. Deployment to Tomcat, IBM WebSphere, Weblogic, JBoss. Mashup tool to assemble web applications based on SOAP, REST and RSS web services, Java Services and databases. Supports existing CSS, HTML and Java code. The ability to deploy a standard Java .war file. == Technologies and frameworks == WaveMaker allows users to build applications that run on "Open Systems Stack" based on the following technologies and frameworks: AngularJS, Bootstrap, NVD3, HTML, CSS, Apache Cordova, Hibernate, Spring, Spring Security, Java. The various supported integrations include: Databases: Oracle, MySQL, Microsoft SQL Server, PostgreSQL, IBM DB2, HSQLDB Authentication: LDAP, Active Directory, CAS, Custom Java Service, Database Version Control: Bitbucket (or Stash), GitHub, Apache Subversion Deployment: Amazon AWS, Microsoft Azure, WaveMaker Private Cloud (Docker containerization), IBM Web Sphere, Apache Tomcat, SpringSource tcServer, Oracle WebLogic Server, JBoss(WildFly), GlassFish App Stores: Google Play, Apple App Store, Windows Store == History == In 2003, WaveMaker was founded as ActiveGrid. Then, in 2007, it was rebranded as Wavemaker. It was acquired by VMware in 2011. In March 2013, support for the WaveMaker project was discontinued. In May 2013, Pramati Technologies acquired the assets of WaveMaker. In February 2014, Wavemaker Studio 6.7 was released, which was the last open source version of Studio. In September 2014 WaveMaker Inc. launched the WaveMaker RAD Platform, which allowed organizations to run their own application platform for building and running apps. In March 2023, WaveMaker released version 11.5, which includes enhanced low-code development capabilities and new AI-driven tools to streamline the application development process.

Diffusion model

In machine learning, diffusion models, also known as diffusion-based generative models or score-based generative models, are a class of latent variable generative models. A diffusion model consists of two major components: the forward diffusion process, and the reverse sampling process. The goal of diffusion models is to learn a diffusion process for a given dataset, such that the process can generate new elements that are distributed similarly as the original dataset. A diffusion model models data as generated by a diffusion process, whereby a new datum performs a random walk with drift through the space of all possible data. A trained diffusion model can be sampled in many ways, with different efficiency and quality. There are various equivalent formalisms, including Markov chains, denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations. They are typically trained using variational inference. The model responsible for denoising is typically called its "backbone". The backbone may be of any kind, but they are typically U-nets or transformers. As of 2024, diffusion models are mainly used for computer vision tasks, including image denoising, inpainting, super-resolution, image generation, and video generation. These typically involve training a neural network to sequentially denoise images blurred with Gaussian noise. The model is trained to reverse the process of adding noise to an image. After training to convergence, it can be used for image generation by starting with an image composed of random noise, and applying the network iteratively to denoise the image. Diffusion-based image generators have seen widespread commercial interest, such as Stable Diffusion and DALL-E. These models typically combine diffusion models with other models, such as text-encoders and cross-attention modules to allow text-conditioned generation. Other than computer vision, diffusion models have also found applications in natural language processing such as text generation and summarization, sound generation, and reinforcement learning. == Denoising diffusion model == === Non-equilibrium thermodynamics === Diffusion models were introduced in 2015 as a method to train a model that can sample from a highly complex probability distribution. They used techniques from non-equilibrium thermodynamics, especially diffusion. Consider, for example, how one might model the distribution of all naturally occurring photos. Each image is a point in the space of all images, and the distribution of naturally occurring photos is a "cloud" in space, which, by repeatedly adding noise to the images, diffuses out to the rest of the image space, until the cloud becomes all but indistinguishable from a Gaussian distribution N ( 0 , I ) {\displaystyle {\mathcal {N}}(0,I)} . A model that can approximately undo the diffusion can then be used to sample from the original distribution. This is studied in "non-equilibrium" thermodynamics, as the starting distribution is not in equilibrium, unlike the final distribution. The equilibrium distribution is the Gaussian distribution N ( 0 , I ) {\displaystyle {\mathcal {N}}(0,I)} , with pdf ρ ( x ) ∝ e − 1 2 ‖ x ‖ 2 {\displaystyle \rho (x)\propto e^{-{\frac {1}{2}}\|x\|^{2}}} . This is just the Maxwell–Boltzmann distribution of particles in a potential well V ( x ) = 1 2 ‖ x ‖ 2 {\displaystyle V(x)={\frac {1}{2}}\|x\|^{2}} at temperature 1. The initial distribution, being very much out of equilibrium, would diffuse towards the equilibrium distribution, making biased random steps that are a sum of pure randomness (like a Brownian walker) and gradient descent down the potential well. The randomness is necessary: if the particles were to undergo only gradient descent, then they will all fall to the origin, collapsing the distribution. === Denoising Diffusion Probabilistic Model (DDPM) === The 2020 paper proposed the Denoising Diffusion Probabilistic Model (DDPM), which improves upon the previous method by variational inference. ==== Forward diffusion ==== To present the model, some notation is required. β 1 , . . . , β T ∈ ( 0 , 1 ) {\displaystyle \beta _{1},...,\beta _{T}\in (0,1)} are fixed constants. α t := 1 − β t {\displaystyle \alpha _{t}:=1-\beta _{t}} α ¯ t := α 1 ⋯ α t {\displaystyle {\bar {\alpha }}_{t}:=\alpha _{1}\cdots \alpha _{t}} σ t := 1 − α ¯ t {\displaystyle \sigma _{t}:={\sqrt {1-{\bar {\alpha }}_{t}}}} σ ~ t := σ t − 1 σ t β t {\displaystyle {\tilde {\sigma }}_{t}:={\frac {\sigma _{t-1}}{\sigma _{t}}}{\sqrt {\beta _{t}}}} μ ~ t ( x t , x 0 ) := α t ( 1 − α ¯ t − 1 ) x t + α ¯ t − 1 ( 1 − α t ) x 0 σ t 2 {\displaystyle {\tilde {\mu }}_{t}(x_{t},x_{0}):={\frac {{\sqrt {\alpha _{t}}}(1-{\bar {\alpha }}_{t-1})x_{t}+{\sqrt {{\bar {\alpha }}_{t-1}}}(1-\alpha _{t})x_{0}}{\sigma _{t}^{2}}}} N ( μ , Σ ) {\displaystyle {\mathcal {N}}(\mu ,\Sigma )} is the normal distribution with mean μ {\displaystyle \mu } and variance Σ {\displaystyle \Sigma } , and N ( x | μ , Σ ) {\displaystyle {\mathcal {N}}(x|\mu ,\Sigma )} is the probability density at x {\displaystyle x} . A vertical bar denotes conditioning. A forward diffusion process starts at some starting point x 0 ∼ q {\displaystyle x_{0}\sim q} , where q {\displaystyle q} is the probability distribution to be learned, then repeatedly adds noise to it by x t = 1 − β t x t − 1 + β t z t {\displaystyle x_{t}={\sqrt {1-\beta _{t}}}x_{t-1}+{\sqrt {\beta _{t}}}z_{t}} where z 1 , . . . , z T {\displaystyle z_{1},...,z_{T}} are IID (Independent and identically distributed random variables) samples from N ( 0 , I ) {\displaystyle {\mathcal {N}}(0,I)} . The coefficients 1 − β t {\displaystyle {\sqrt {1-\beta _{t}}}} and β t {\displaystyle {\sqrt {\beta _{t}}}} ensure that Var ( X t ) = I {\displaystyle {\mbox{Var}}(X_{t})=I} assuming that Var ( X 0 ) = I {\displaystyle {\mbox{Var}}(X_{0})=I} . The values of β t {\displaystyle \beta _{t}} are chosen such that for any starting distribution of x 0 {\displaystyle x_{0}} , if it has finite second moment, then lim t → ∞ x t | x 0 {\displaystyle \lim _{t\to \infty }x_{t}|x_{0}} converges to N ( 0 , I ) {\displaystyle {\mathcal {N}}(0,I)} . The entire diffusion process then satisfies q ( x 0 : T ) = q ( x 0 ) q ( x 1 | x 0 ) ⋯ q ( x T | x T − 1 ) = q ( x 0 ) N ( x 1 | α 1 x 0 , β 1 I ) ⋯ N ( x T | α T x T − 1 , β T I ) {\displaystyle q(x_{0:T})=q(x_{0})q(x_{1}|x_{0})\cdots q(x_{T}|x_{T-1})=q(x_{0}){\mathcal {N}}(x_{1}|{\sqrt {\alpha _{1}}}x_{0},\beta _{1}I)\cdots {\mathcal {N}}(x_{T}|{\sqrt {\alpha _{T}}}x_{T-1},\beta _{T}I)} or ln ⁡ q ( x 0 : T ) = ln ⁡ q ( x 0 ) − ∑ t = 1 T 1 2 β t ‖ x t − 1 − β t x t − 1 ‖ 2 + C {\displaystyle \ln q(x_{0:T})=\ln q(x_{0})-\sum _{t=1}^{T}{\frac {1}{2\beta _{t}}}\|x_{t}-{\sqrt {1-\beta _{t}}}x_{t-1}\|^{2}+C} where C {\displaystyle C} is a normalization constant and often omitted. In particular, we note that x 1 : T | x 0 {\displaystyle x_{1:T}|x_{0}} is a Gaussian process, which affords us considerable freedom in reparameterization. For example, by standard manipulation with Gaussian process, x t | x 0 ∼ N ( α ¯ t x 0 , σ t 2 I ) {\displaystyle x_{t}|x_{0}\sim N\left({\sqrt {{\bar {\alpha }}_{t}}}x_{0},\sigma _{t}^{2}I\right)} x t − 1 | x t , x 0 ∼ N ( μ ~ t ( x t , x 0 ) , σ ~ t 2 I ) {\displaystyle x_{t-1}|x_{t},x_{0}\sim {\mathcal {N}}({\tilde {\mu }}_{t}(x_{t},x_{0}),{\tilde {\sigma }}_{t}^{2}I)} In particular, notice that for large t {\displaystyle t} , the variable x t | x 0 ∼ N ( α ¯ t x 0 , σ t 2 I ) {\displaystyle x_{t}|x_{0}\sim N\left({\sqrt {{\bar {\alpha }}_{t}}}x_{0},\sigma _{t}^{2}I\right)} converges to N ( 0 , I ) {\displaystyle {\mathcal {N}}(0,I)} . That is, after a long enough diffusion process, we end up with some x T {\displaystyle x_{T}} that is very close to N ( 0 , I ) {\displaystyle {\mathcal {N}}(0,I)} , with all traces of the original x 0 ∼ q {\displaystyle x_{0}\sim q} gone. For example, since x t | x 0 ∼ N ( α ¯ t x 0 , σ t 2 I ) {\displaystyle x_{t}|x_{0}\sim N\left({\sqrt {{\bar {\alpha }}_{t}}}x_{0},\sigma _{t}^{2}I\right)} we can sample x t | x 0 {\displaystyle x_{t}|x_{0}} directly "in one step", instead of going through all the intermediate steps x 1 , x 2 , . . . , x t − 1 {\displaystyle x_{1},x_{2},...,x_{t-1}} . ==== Backward diffusion ==== The key idea of DDPM is to use a neural network parametrized by θ {\displaystyle \theta } . The network takes in two arguments x t , t {\displaystyle x_{t},t} , and outputs a vector μ θ ( x t , t ) {\displaystyle \mu _{\theta }(x_{t},t)} and a matrix Σ θ ( x t , t ) {\displaystyle \Sigma _{\theta }(x_{t},t)} , such that each step in the forward diffusion process can be approximately undone by x t − 1 ∼ N ( μ θ ( x t , t ) , Σ θ ( x t , t ) ) {\displaystyle x_{t-1}\sim {\mathcal {N}}(\mu _{\theta }(x_{t},t),\Sigma _{\theta }(x_{t},t))} . This then gives us a backward diffusion process p θ {\displaystyle p_{\theta }} defined by p θ ( x T ) = N ( x T | 0 , I ) {\displaystyle p_{\theta }(x

Tensor product network

A tensor product network, in artificial neural networks, is a network that exploits the properties of tensors to model associative concepts such as variable assignment. Orthonormal vectors are chosen to model the ideas (such as variable names and target assignments), and the tensor product of these vectors construct a network whose mathematical properties allow the user to easily extract the association from it.

Multi expression programming

Multi Expression Programming (MEP) is an evolutionary algorithm for generating mathematical functions describing a given set of data. MEP is a Genetic Programming variant encoding multiple solutions in the same chromosome. MEP representation is not specific (multiple representations have been tested). In the simplest variant, MEP chromosomes are linear strings of instructions. This representation was inspired by Three-address code. MEP strength consists in the ability to encode multiple solutions, of a problem, in the same chromosome. In this way, one can explore larger zones of the search space. For most of the problems this advantage comes with no running-time penalty compared with genetic programming variants encoding a single solution in a chromosome. == Representation == MEP chromosomes are arrays of instructions represented in Three-address code format. Each instruction contains a variable, a constant, or a function. If the instruction is a function, then the arguments (given as instruction's addresses) are also present. === Example of MEP program === Here is a simple MEP chromosome (labels on the left side are not a part of the chromosome): 1: a 2: b 3: + 1, 2 4: c 5: d 6: + 4, 5 7: 3, 5 == Fitness computation == When the chromosome is evaluated it is unclear which instruction will provide the output of the program. In many cases, a set of programs is obtained, some of them being completely unrelated (they do not have common instructions). For the above chromosome, here is the list of possible programs obtained during decoding: E1 = a, E2 = b, E4 = c, E5 = d, E3 = a + b. E6 = c + d. E7 = (a + b) d. Each instruction is evaluated as a possible output of the program. The fitness (or error) is computed in a standard manner. For instance, in the case of symbolic regression, the fitness is the sum of differences (in absolute value) between the expected output (called target) and the actual output. == Fitness assignment process == Which expression will represent the chromosome? Which one will give the fitness of the chromosome? In MEP, the best of them (which has the lowest error) will represent the chromosome. This is different from other GP techniques: In Linear genetic programming the last instruction will give the output. In Cartesian Genetic Programming the gene providing the output is evolved like all other genes. Note that, for many problems, this evaluation has the same complexity as in the case of encoding a single solution in each chromosome. Thus, there is no penalty in running time compared to other techniques. == Software == === MEPX === MEPX is a cross-platform (Windows, macOS, and Linux Ubuntu) free software for the automatic generation of computer programs. It can be used for data analysis, particularly for solving symbolic regression, statistical classification and time-series problems. === libmep === Libmep is a free and open source library implementing Multi Expression Programming technique. It is written in C++. === hmep === hmep is a new open source library implementing Multi Expression Programming technique in Haskell programming language.

Conversational user interface

A conversational user interface (CUI) is a user interface for computers that emulates a conversation with a human. Historically, computers have relied on text-based user interfaces and graphical user interfaces (GUIs) (such as the user pressing a "back" button) to translate the user's desired action into commands the computer understands. While an effective mechanism of completing computing actions, there is a learning curve for the user associated with GUI. Instead, CUIs provide opportunity for the user to communicate with the computer in their natural language rather than in a syntax specific commands.

Persian Speech Corpus

The Persian Speech Corpus is a Modern Persian speech corpus for speech synthesis. The corpus contains phonetic and orthographic transcriptions of about 2.5 hours of Persian speech aligned with recorded speech on the phoneme level, including annotations of word boundaries. Previous spoken corpora of Persian include FARSDAT, which consists of read aloud speech from newspaper texts from 100 Persian speakers and the Telephone FARsi Spoken language DATabase (TFARSDAT) which comprises seven hours of read and spontaneous speech produced by 60 native speakers of Persian from ten regions of Iran. The Persian Speech Corpus was built using the same methodologies laid out in the doctoral project on Modern Standard Arabic of Nawar Halabi at the University of Southampton. The work was funded by MicroLinkPC, who own an exclusive license to commercialise the corpus, though the corpus is available for non-commercial use through the corpus' website. It is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The corpus was built for speech synthesis purposes, but has been used for building HMM based voices in Persian. It can also be used to automatically align other speech corpora with their phonetic transcript and could be used as part of a larger corpus for training speech recognition systems. == Contents == The corpus is downloadable from its website, and contains the following: 396 .wav files containing spoken utterances 396 .lab files containing text utterances 396 .TextGrid files containing the phoneme labels with time stamps of the boundaries where these occur in the .wav files. phonetic-transcript.txt which has the form "[wav_filename]" "[Phoneme Sequence]" in every line orthographic-transcript.txt which has the form "[wav_filename]" "[Orthographic Transcript]" in every line