Scan line

A scan line (also scanline) is one line, or row, in a raster scanning pattern, such as a line of video on a cathode-ray tube (CRT) display of a television set or computer monitor. On CRT screens the horizontal scan lines are visually discernible, even when viewed from a distance, as alternating colored and black lines, especially when a progressive scan signal with less than the maximum vertical resolution is displayed. This appearance is sometimes used today as a visual effect in computer graphics. The term is also used, by analogy, for a single row of pixels in a raster graphics image. Scan lines are important in representations of image data because many image file formats have special rules for data at the end of a scan line. For example, a format may require each scan line to start on a particular boundary (such as a byte or word; see for example the BMP file format). This means that even otherwise compatible raster data may need to be analyzed at the level of scan lines in order to convert between formats.
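As a rough illustration of the boundary rule described above, the following sketch computes the padded length of one scan line and strips the padding from raw row data. It assumes the 4-byte row alignment used by uncompressed BMP images; the function names `row_stride` and `strip_padding` are hypothetical and not part of any file format library.

```python
def row_stride(width_px: int, bits_per_pixel: int, align_bytes: int = 4) -> int:
    """Bytes occupied by one scan line, padded up to a multiple of align_bytes."""
    row_bits = width_px * bits_per_pixel
    row_bytes = (row_bits + 7) // 8  # round bits up to whole bytes
    return (row_bytes + align_bytes - 1) // align_bytes * align_bytes


def strip_padding(data: bytes, width_px: int, height_px: int, bits_per_pixel: int) -> bytes:
    """Drop the per-row padding so that rows are tightly packed (hypothetical helper)."""
    stride = row_stride(width_px, bits_per_pixel)
    packed = (width_px * bits_per_pixel + 7) // 8
    rows = (data[y * stride : y * stride + packed] for y in range(height_px))
    return b"".join(rows)


# A 3-pixel-wide, 24-bit image has 9 bytes of pixel data per row,
# padded so that the next scan line starts on a 4-byte boundary.
print(row_stride(3, 24))  # 12
```

Converting between two formats with different alignment rules then amounts to stripping the padding of one and re-padding each row to the stride of the other.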

Neighborhood operation

In computer vision and image processing, a neighborhood operation is a commonly used class of computations on image data in which the data is processed according to the following pseudocode:

    Visit each point p in the image data and do {
        N = a neighborhood or region of the image data around the point p
        result(p) = f(N)
    }

This general procedure can be applied to image data of arbitrary dimensionality. Also, the image data to which the operation is applied does not have to be defined in terms of intensity or color; it can be any type of information that is organized as a function of spatial (and possibly temporal) variables in p. The result of applying a neighborhood operation to an image is again something that can be interpreted as an image, and it has the same dimension as the original data. The value at each image point, however, does not have to be directly related to intensity or color. Instead it is an element in the range of the function f, which can be of arbitrary type.

Normally the neighborhood N is of fixed size and is a square (or a cube, depending on the dimensionality of the image data) centered on the point p. The function f is also fixed, but may in some cases have parameters that vary with p, see below. In the simplest case, the neighborhood N may be only a single point. This type of operation is often referred to as a point-wise operation.

== Examples ==

The most common examples of a neighborhood operation use a fixed function f that is also linear, that is, the computation consists of a linear shift-invariant operation. In this case, the neighborhood operation corresponds to the convolution operation. A typical example is convolution with a low-pass filter, where the result can be interpreted in terms of local averages of the image data around each image point. Other examples are computations of local derivatives of the image data.

It is also rather common to use a fixed but non-linear function f. This includes median filtering and the computation of local variances. The Nagao-Matsuyama filter is an example of a complex local neighborhood operation that uses variance as an indicator of the uniformity within a pixel group. The result is similar to a convolution with a low-pass filter, with the added effect of preserving sharp edges.

There is also a class of neighborhood operations in which the function f has additional parameters that can vary with p:

    Visit each point p in the image data and do {
        N = a neighborhood or region of the image data around the point p
        result(p) = f(N, parameters(p))
    }

This implies that the result is not shift invariant. Examples are adaptive Wiener filters.

== Implementation aspects ==

The pseudocode given above suggests that a neighborhood operation is implemented in terms of an outer loop over all image points. However, since the results are independent, the image points can be visited in arbitrary order, or can even be processed in parallel. Furthermore, in the case of linear shift-invariant operations, the computation of f at each point implies a summation of products between the image data and the filter coefficients. The implementation of this neighborhood operation can then be arranged with the summation loop outside the loop over all image points.

An important issue related to neighborhood operations is how to deal with the fact that the neighborhood N becomes more or less undefined for points p close to the edge or border of the image data. Several strategies have been proposed:

Compute the result only for points p for which the corresponding neighborhood is well-defined. This implies that the output image will be somewhat smaller than the input image.

Zero padding: Extend the input image sufficiently by adding extra points outside the original image which are set to zero. The loops over the image points described above visit only the original image points.

Border extension: Extend the input image sufficiently by adding extra points outside the original image which are set to the image value at the closest image point. The loops over the image points described above visit only the original image points.

Mirror extension: Extend the image sufficiently by mirroring the image at the image boundaries. This method is less sensitive to local variations at the image boundary than border extension.

Wrapping: The image is tiled, so that going off one edge wraps around to the opposite side of the image. This method assumes that the image is largely homogeneous, for example a stochastic image texture without large textons.
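The general procedure and the border strategies above can be sketched in plain Python for a two-dimensional image stored as a list of rows. This is a minimal illustration rather than an efficient implementation: the names `fetch`, `reflect` and `neighborhood_op` are hypothetical, the neighborhood is assumed to be a square of side 2*radius + 1, and the mirror variant shown repeats the edge sample and assumes the radius does not exceed the image size.

```python
from statistics import median


def reflect(i, n):
    """Mirror an out-of-range index back into [0, n); valid for -n <= i < 2n."""
    if i < 0:
        return -i - 1
    if i >= n:
        return 2 * n - i - 1
    return i


def fetch(image, y, x, border):
    """Value at (y, x), resolving points outside the image by the chosen strategy."""
    h, w = len(image), len(image[0])
    if 0 <= y < h and 0 <= x < w:
        return image[y][x]
    if border == "zero":      # zero padding
        return 0
    if border == "extend":    # value at the closest image point
        return image[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]
    if border == "mirror":    # mirror the image at its boundaries
        return image[reflect(y, h)][reflect(x, w)]
    if border == "wrap":      # tile the image
        return image[y % h][x % w]
    raise ValueError(f"unknown border strategy: {border}")


def neighborhood_op(image, f, radius=1, border="extend"):
    """Apply f to the square neighborhood N around every point p of the image."""
    h, w = len(image), len(image[0])
    result = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            n = [fetch(image, y + dy, x + dx, border)
                 for dy in range(-radius, radius + 1)
                 for dx in range(-radius, radius + 1)]
            result[y][x] = f(n)
    return result


def mean(values):
    return sum(values) / len(values)


image = [[1, 2, 3],
         [4, 50, 6],
         [7, 8, 9]]
smoothed = neighborhood_op(image, mean, border="zero")         # linear: local average
despeckled = neighborhood_op(image, median, border="extend")   # non-linear: median filter
doubled = neighborhood_op(image, lambda n: 2 * n[0], radius=0)  # point-wise operation
```

Because each `result[y][x]` depends only on the input image, the two loops over `y` and `x` could be reordered or run in parallel without changing the output.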

Conference on Computer Vision and Pattern Recognition

The Conference on Computer Vision and Pattern Recognition is an annual conference on computer vision and pattern recognition.

== Affiliations ==

The conference was first held in 1983 in Washington, DC, organized by Takeo Kanade and Dana H. Ballard. From 1985 to 2010 it was sponsored by the IEEE Computer Society. In 2011 it was also co-sponsored by the University of Colorado Colorado Springs. Since 2012 it has been co-sponsored by the IEEE Computer Society and the Computer Vision Foundation, which provides open access to the conference papers.

== Scope ==

The conference considers a wide range of topics related to computer vision and pattern recognition: essentially any topic that involves extracting structure or answers from images or video, or applying mathematical methods to data to extract or recognize patterns. Common topics include object recognition, image segmentation, motion estimation, 3D reconstruction, and deep learning.

The acceptance rate is generally below 30% for all papers and below 5% for oral presentations. The conference is managed by a rotating group of volunteers who are chosen in a public election at the Pattern Analysis and Machine Intelligence-Technical Community (PAMI-TC) meeting four years before the conference in question. The conference uses a multi-tier double-blind peer review process. The program chairs, who cannot submit papers, select area chairs who manage the reviewers for their subset of submissions.

== Location and time ==

The conference is usually held in June in North America.

== Awards ==

=== Best Paper Award ===

These awards are picked by committees delegated by the program chairs of the conference.

=== Longuet-Higgins Prize ===

The Longuet-Higgins Prize recognizes papers from ten years ago that have made a significant impact on computer vision research.

=== PAMI Young Researcher Award ===

The Pattern Analysis and Machine Intelligence Young Researcher Award is given by the Technical Committee on Pattern Analysis and Machine Intelligence of the IEEE Computer Society to a researcher within 7 years of completing their Ph.D. for outstanding early career research contributions. Candidates are nominated by the computer vision community, with winners selected by a committee of senior researchers in the field. The award was originally instituted in 2012 by the journal Image and Vision Computing, which continues to sponsor it, and is presented at the conference.

=== PAMI Thomas S. Huang Memorial Prize ===

The Thomas Huang Memorial Prize was established at the 2020 conference and has been awarded annually since 2021 to honor researchers who are recognized as examples in research, teaching/mentoring, and service to the computer vision community.

Swish function

The swish function is a family of mathematical functions defined as follows:

$$\operatorname{swish}_\beta(x) = x \operatorname{sigmoid}(\beta x) = \frac{x}{1 + e^{-\beta x}},$$

where $\beta$ can be constant (usually set to 1) or trainable, and "sigmoid" refers to the logistic function. The swish family was designed to smoothly interpolate between a linear function and the rectified linear unit (ReLU) function. When considering positive values, swish is a particular case of a doubly parameterized sigmoid shrinkage function. Variants of the swish function include Mish.

== Special values ==

For $\beta = 0$, the function is linear: $f(x) = x/2$. For $\beta = 1$, the function is the Sigmoid Linear Unit (SiLU). For $\beta = 1.702$, the function approximates GELU. With $\beta \to \infty$, the function converges to ReLU. Thus, the swish family smoothly interpolates between a linear function and the ReLU function.

Since $\operatorname{swish}_\beta(x) = \operatorname{swish}_1(\beta x)/\beta$, all instances of swish have the same shape as the default $\operatorname{swish}_1$, uniformly scaled by a factor of $1/\beta$. One usually sets $\beta > 0$. When $\beta$ is trainable, this constraint can be enforced by $\beta = e^{b}$, where $b$ is trainable.

The default case has the series expansion

$$\operatorname{swish}_1(x) = \frac{x}{2} + \frac{x^2}{4} - \frac{x^4}{48} + \frac{x^6}{480} + O\left(x^8\right)$$

and satisfies the identities

$$\begin{aligned}
\operatorname{swish}_1(x) &= \frac{x}{2} \tanh\left(\frac{x}{2}\right) + \frac{x}{2} \\
\operatorname{swish}_1(x) + \operatorname{swish}_{-1}(x) &= x \\
\operatorname{swish}_1(x) - \operatorname{swish}_{-1}(x) &= x \tanh\left(\frac{x}{2}\right)
\end{aligned}$$

== Derivatives ==

Because $\operatorname{swish}_\beta(x) = \operatorname{swish}_1(\beta x)/\beta$, it suffices to calculate the derivatives for the default case.

$$\operatorname{swish}_1'(x) = \frac{x + \sinh(x)}{4 \cosh^2\left(\frac{x}{2}\right)} + \frac{1}{2}$$

so $\operatorname{swish}_1'(x) - \frac{1}{2}$ is odd.

$$\operatorname{swish}_1''(x) = \frac{1 - \frac{x}{2} \tanh\left(\frac{x}{2}\right)}{2 \cosh^2\left(\frac{x}{2}\right)}$$

so $\operatorname{swish}_1''(x)$ is even.

== History ==

SiLU was first proposed alongside the GELU in 2016, then proposed again in 2017 as the Sigmoid-weighted Linear Unit (SiL) in reinforcement learning. The SiLU/SiL was then proposed once more as Swish over a year after its initial discovery, originally without the learnable parameter $\beta$, so that $\beta$ implicitly equaled 1. The swish paper was later updated to propose the activation with the learnable parameter $\beta$.

In 2017, after performing analysis on ImageNet data, researchers from Google indicated that using this function as an activation function in artificial neural networks improves performance compared to the ReLU and sigmoid functions. It is believed that one reason for the improvement is that the swish function helps alleviate the vanishing gradient problem during backpropagation.
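The definition, the special values of β and the closed-form first derivative can be checked numerically with a short sketch. The function names `swish` and `swish1_prime` are hypothetical, and the branch on the sign of βx is an implementation choice made here to avoid overflow in the exponential; it is not part of the definition.

```python
import math


def swish(x: float, beta: float = 1.0) -> float:
    """swish_beta(x) = x * sigmoid(beta * x), evaluated without overflow in exp."""
    z = beta * x
    if z >= 0:
        return x / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return x * ez / (1.0 + ez)


def swish1_prime(x: float) -> float:
    """Closed-form derivative of swish_1 as given in the Derivatives section."""
    return (x + math.sinh(x)) / (4.0 * math.cosh(x / 2.0) ** 2) + 0.5


x = 1.5
print(swish(x, 0.0), x / 2)                 # beta = 0: the linear function x/2
print(swish(x, 50.0), max(x, 0.0))          # large beta: close to ReLU
print(swish(x, 2.0), swish(2.0 * x) / 2.0)  # swish_beta(x) = swish_1(beta x) / beta

# Compare the closed-form derivative with a central finite difference.
h = 1e-6
numeric = (swish(x + h) - swish(x - h)) / (2 * h)
print(numeric, swish1_prime(x))
```

Each printed pair should agree up to floating-point and finite-difference error. A trainable β constrained to be positive would be stored as b and applied as `swish(x, math.exp(b))`.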

Density-based clustering validation

Density-Based Clustering Validation (DBCV) is a metric designed to assess the quality of clustering solutions, particularly for density-based clustering algorithms such as DBSCAN, mean shift, and OPTICS. It is particularly suited for identifying concave and nested clusters, where traditional metrics such as the Silhouette coefficient, Davies–Bouldin index, or Calinski–Harabasz index often struggle to provide meaningful evaluations. Unlike traditional validation measures, which often rely on compact and well-separated clusters, the DBCV index evaluates how well clusters are defined in terms of local density variations and structural coherence. The metric was introduced in 2014 by Davoud Moulavi and colleagues. It uses density connectivity principles to quantify clustering structures, making it especially effective at detecting arbitrarily shaped clusters in concave datasets, where traditional metrics may be less reliable. The DBCV index has been employed for clustering analysis in bioinformatics, ecology, techno-economy, and health informatics, as well as in numerous other fields.

== Definition ==

The DBCV index evaluates clustering structures by analyzing the relationships between data points within and across clusters. Given a dataset $X = \{x_1, x_2, \ldots, x_n\}$, a density-based algorithm partitions it into $K$ clusters $\{C_1, C_2, \ldots, C_K\}$. Each point $x_i$ belongs to a specific cluster, denoted $C_{\text{cluster}(x_i)}$.

A key concept in the DBCV index is the notion of density-connected paths. Two points within the same cluster are considered density-connected if there exists a sequence of intermediate points linking them, where each consecutive pair meets a predefined density criterion. The density-based distance between two points is determined by identifying the optimal path that minimizes the maximum local reachability distance along its trajectory.

The DBCV index extends the Silhouette coefficient by redefining cluster cohesion and separation using density-based distances.

The within-cluster density distance measures how closely a point is related to the other members of its cluster:

$$a_i = \frac{1}{|C_{\text{cluster}(x_i)}| - 1} \sum_{x_j \in C_{\text{cluster}(x_i)},\; x_j \neq x_i} d_{\text{density}}(x_i, x_j)$$

The nearest-cluster density distance quantifies how far a point is from the closest external cluster:

$$b_i = \min_{C \in \{C_1, \dots, C_K\},\; C \neq C_{\text{cluster}(x_i)}} \left( \frac{1}{|C|} \sum_{x_j \in C} d_{\text{density}}(x_i, x_j) \right)$$

Using these measures, the DBCV index is computed as:

$$\mathrm{DBCV} = \frac{1}{n} \sum_{i=1}^{n} \frac{b_i - a_i}{\max(a_i, b_i)}$$

== Explanation ==

DBCV index values range between −1 and +1:

+1: Strongly cohesive and well-separated clusters.
0: Ambiguous clustering structure.
−1: Poorly formed clusters or incorrect assignments.

By using density-based distances instead of traditional Euclidean measures, the DBCV index provides a more robust evaluation of clustering performance in datasets with irregular or non-spherical distributions.
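The three formulas above can be sketched directly, assuming the density-based distances have already been computed and are supplied as a symmetric matrix. The function name `dbcv_index` is hypothetical, the computation of the density-based distance itself is not shown, and treating a single-point cluster as having a within-cluster distance of 0 is an assumption made here because the formula is undefined in that case.

```python
def dbcv_index(dist, labels):
    """Silhouette-style index computed from the a_i, b_i definitions above.

    dist   -- square matrix (list of lists) of density-based distances
    labels -- cluster label of each point, in the same order as dist
    """
    n = len(labels)
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are needed to compute b_i")

    total = 0.0
    for i in range(n):
        own = clusters[labels[i]]
        # a_i: mean distance to the other members of the point's own cluster.
        # Assumption: a singleton cluster gets a_i = 0.
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1) if len(own) > 1 else 0.0
        # b_i: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / n


# Two tight pairs of points: distance 1 within a pair, 10 across pairs.
D = [[0, 1, 10, 10],
     [1, 0, 10, 10],
     [10, 10, 0, 1],
     [10, 10, 1, 0]]
good = dbcv_index(D, [0, 0, 1, 1])  # labels follow the pairs
bad = dbcv_index(D, [0, 1, 0, 1])   # labels split each pair
```

With labels that follow the pairs, every point has a = 1 and b = 10, so the index is 0.9; splitting each pair gives a = 10 and b = 5.5, so the index is −0.45.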

Personality computing

Personality computing is a research field related to artificial intelligence and personality psychology that studies personality by means of computational techniques applied to different sources, including text, multimedia, and social networks.

== Overview ==

Personality computing addresses three main problems involving personality: automatic personality recognition, perception, and synthesis. Automatic personality recognition is the inference of the personality type of target individuals from their digital footprint. Automatic personality perception is the inference of the personality attributed by an observer to a target individual based on some observable behavior. Automatic personality synthesis is the generation of the style or behavior of artificial personalities in avatars and virtual agents. Self-assessed personality tests or observer ratings are typically used as the ground truth for testing and validating the performance of artificial intelligence algorithms for the automatic prediction of personality types. There is a wide variety of personality tests, such as the Myers-Briggs Type Indicator (MBTI) or the MMPI, but the most used are tests based on the Five Factor Model, such as the Revised NEO Personality Inventory.

Personality computing can be considered an extension or complement of affective computing, where the former focuses on personality traits and the latter on affective states. A further extension of the two fields is character computing, which combines various character states and traits including, but not limited to, personality and affect.

== History ==

Personality computing began around 2005 with the pioneering research in personality recognition by Shlomo Argamon and later by François Mairesse. These works showed that personality traits could be inferred with reasonable accuracy from text, such as blogs, self-presentations, and email addresses. In 2008, the concept of "portable personality" for the distributed management of personality profiles was developed. A few years later, research began in personality recognition and perception from multimodal and social signals, such as recorded meetings and voice calls.

In the 2010s, research focused mainly on personality recognition and perception from social media, helped by the first workshops organized by Fabio Celli. In particular, personality was extracted from Facebook, Twitter and Instagram. In the same years, automatic personality synthesis helped improve the coherence of simulated behavior in virtual agents.

Scientific works by Michal Kosinski demonstrated the validity of personality computing from different digital footprints, in particular from user preferences such as Facebook page likes. They also showed that machines can recognize personality better than humans, and raised a warning against Cambridge Analytica and the misuse of this kind of technology.

== Applications ==

Personality computing techniques, in particular personality recognition and perception, have applications in social media marketing, where they can help reduce the cost of advertising campaigns through psychological targeting.

Soft independent modelling of class analogies

Soft independent modelling by class analogy (SIMCA) is a statistical method for supervised classification of data. The method requires a training data set consisting of samples (or objects) with a set of attributes and their class membership. The term soft refers to the fact that the classifier can identify samples as belonging to multiple classes, rather than necessarily assigning each sample to one of a set of non-overlapping classes.

== Method ==

In order to build the classification models, the samples belonging to each class are analysed using principal component analysis (PCA); only the significant components are retained. For a given class, the resulting model then describes a line (for one principal component, or PC), a plane (for two PCs) or a hyper-plane (for more than two PCs). For each modelled class, the mean orthogonal distance of the training data samples from the line, plane, or hyper-plane (calculated as the residual standard deviation) is used to determine a critical distance for classification. This critical distance is based on the F-distribution and is usually calculated at the 95% or 99% confidence level.

New observations are projected into each PC model and the residual distances are calculated. An observation is assigned to the model class when its residual distance from the model is below the statistical limit for the class. The observation may be found to belong to multiple classes, and a measure of the goodness of the model can be found from the number of cases where observations are classified into multiple classes. The classification efficiency is usually indicated by receiver operating characteristics.

In the original SIMCA method, the ends of the hyper-plane of each class are closed off by setting statistical control limits along the retained principal component axes (i.e., score value between plus and minus 0.5 times the score standard deviation). More recent adaptations of the SIMCA method close off the hyper-plane by construction of ellipsoids (e.g. Hotelling's T² or Mahalanobis distance). With such modified SIMCA methods, classification of an object requires that both its orthogonal distance from the model and its projection within the model (i.e. score value within the region defined by the ellipsoid) are not significant.

== Application ==

SIMCA as a method of classification has gained widespread use, especially in applied statistical fields such as chemometrics and spectroscopic data analysis.
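The modelling and assignment steps of the method can be sketched in a simplified form. The sketch below makes several assumptions that depart from the method as described: each class model retains a single principal component (found by power iteration), the critical distance is a fixed multiple `factor` of the residual standard deviation standing in for the F-distribution limit, the residual standard deviation uses a plain n − 1 divisor, and the ends of the model line are not closed off. All function names are hypothetical.

```python
import math


def residual_distance(x, mean, direction):
    """Orthogonal distance of x from the line through mean along direction."""
    c = [xi - mi for xi, mi in zip(x, mean)]
    t = sum(ci * di for ci, di in zip(c, direction))  # score on the PC
    return math.sqrt(sum((ci - t * di) ** 2 for ci, di in zip(c, direction)))


def fit_class_model(samples, n_iter=200):
    """One-PC model of a class: (mean, unit direction, residual std deviation)."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    centered = [[s[k] - mean[k] for k in range(d)] for s in samples]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration for the leading eigenvector of the covariance matrix.
    # The uneven start vector makes an exactly orthogonal start unlikely.
    v = [1.0 / (k + 1) for k in range(d)]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    residuals = [residual_distance(s, mean, v) for s in samples]
    s0 = math.sqrt(sum(r * r for r in residuals) / (n - 1))  # simplified divisor
    return mean, v, s0


def classify(x, models, factor=3.0):
    """Names of all classes whose model x fits; may be empty or contain several."""
    return [name for name, (mean, v, s0) in models.items()
            if residual_distance(x, mean, v) <= factor * s0]


models = {
    "A": fit_class_model([(0, 0.1), (1, 0.9), (2, 2.1), (3, 2.9), (4, 4.1)]),
    "B": fit_class_model([(0, 5.1), (1, 4.9), (2, 5.1), (3, 4.9), (4, 5.1)]),
}
print(classify((2.5, 2.5), models))  # close to the class A line only
print(classify((5.0, 5.0), models))  # close to both lines: soft assignment
print(classify((2.5, 0.0), models))  # far from both lines: no class
```

The second query illustrates the soft behaviour: because the line models are left open-ended here, a point near the intersection of the two lines is accepted by both classes, which is the situation the control limits or ellipsoids described above are meant to restrict.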