Spike-and-slab regression is a type of Bayesian linear regression in which a particular hierarchical prior distribution for the regression coefficients is chosen such that only a subset of the possible regressors is retained. The technique is particularly useful when the number of possible predictors is larger than the number of observations. The idea of the spike-and-slab model was originally proposed by Mitchell & Beauchamp (1988). The approach was further significantly developed by Madigan & Raftery (1994) and George & McCulloch (1997). A recent and important contribution to this literature is Ishwaran & Rao (2005). == Model description == Suppose we have P possible predictors in some model. Vector γ has a length equal to P and consists of zeros and ones. This vector indicates whether a particular variable is included in the regression or not. If no specific prior information on initial inclusion probabilities of particular variables is available, a Bernoulli prior distribution is a common default choice. Conditional on a predictor being in the regression, we identify a prior distribution for the model coefficient, which corresponds to that variable (β). A common choice on that step is to use a normal prior with a mean equal to zero and a large variance calculated based on ( X T X ) − 1 {\displaystyle (X^{T}X)^{-1}} (where X {\displaystyle X} is a design matrix of explanatory variables of the model). A draw of γ from its prior distribution is a list of the variables included in the regression. Conditional on this set of selected variables, we take a draw from the prior distribution of the regression coefficients (if γi = 1 then βi ≠ 0 and if γi = 0 then βi = 0). βγ denotes the subset of β for which γi = 1. In the next step, we calculate a posterior probability for both inclusion and coefficients by applying a standard statistical procedure. All steps of the described algorithm are repeated thousands of times using the Markov chain Monte Carlo (MCMC) technique. As a result, we obtain a posterior distribution of γ (variable inclusion in the model), β (regression coefficient values) and the corresponding prediction of y. The model got its name (spike-and-slab) due to the shape of the two prior distributions. The "spike" is the probability of a particular coefficient in the model to be zero. The "slab" is the prior distribution for the regression coefficient values. An advantage of Bayesian variable selection techniques is that they are able to make use of prior knowledge about the model. In the absence of such knowledge, some reasonable default values can be used; to quote Scott and Varian (2013): "For the analyst who prefers simplicity at the cost of some reasonable assumptions, useful prior information can be reduced to an expected model size, an expected R2, and a sample size ν determining the weight given to the guess at R2." Some researchers suggest the following default values: R2 = 0.5, ν = 0.01, and π = 0.5 (parameter of a prior Bernoulli distribution).
Normal distributions transform
The normal distributions transform (NDT) is a point cloud registration algorithm introduced by Peter Biber and Wolfgang Straßer in 2003, while working at University of Tübingen. The algorithm registers two point clouds by first associating a piecewise normal distribution to the first point cloud, that gives the probability of sampling a point belonging to the cloud at a given spatial coordinate, and then finding a transform that maps the second point cloud to the first by maximising the likelihood of the second point cloud on such distribution as a function of the transform parameters. Originally introduced for 2D point cloud map matching in simultaneous localization and mapping (SLAM) and relative position tracking, the algorithm was extended to 3D point clouds and has wide applications in computer vision and robotics. NDT is very fast and accurate, making it suitable for application to large scale data, but it is also sensitive to initialisation, requiring a sufficiently accurate initial guess, and for this reason it is typically used in a coarse-to-fine alignment strategy. == Formulation == The NDT function associated to a point cloud is constructed by partitioning the space in regular cells. For each cell, it is possible to define the mean q = 1 n ∑ i x i {\displaystyle \textstyle \mathbf {q} ={\frac {1}{n}}\sum _{i}\mathbf {x_{i}} } and covariance S = 1 n ∑ i ( x i − q ) ( x i − q ) ⊤ {\displaystyle \textstyle \mathbf {S} ={\frac {1}{n}}\sum _{i}\left(\mathbf {x} _{i}-\mathbf {q} \right)\left(\mathbf {x} _{i}-\mathbf {q} \right)^{\top }} of the n {\displaystyle n} points of the cloud x 1 , … , x n {\displaystyle \mathbf {x} _{1},\dots ,\mathbf {x} _{n}} that fall within the cell. The probability density of sampling a point at a given spatial location x {\displaystyle \mathbf {x} } within the cell is then given by the normal distribution e − 1 2 ( x − q ) ⊤ S − 1 ( x − q ) {\displaystyle e^{-{\frac {1}{2}}\left(\mathbf {x} -\mathbf {q} \right)^{\top }\mathbf {S} ^{-1}\left(\mathbf {x} -\mathbf {q} \right)}} . Two point clouds can be mapped by a Euclidean transformation f {\displaystyle f} with rotation matrix R {\displaystyle \mathbf {R} } and translation vector t {\displaystyle \mathbf {t} } f R , t ( x ) = R x + t {\displaystyle f_{\mathbf {R} ,\mathbf {t} }(\mathbf {x} )=\mathbf {R} \mathbf {x} +\mathbf {t} } that maps from the second cloud to the first, parametrised by the rotation angles and translation components. The algorithm registers the two point clouds by optimising the parameters of the transformation that maps the second cloud to the first, with respect to a loss function based on the NDT of the first point cloud, solving the following problem arg min R , t { − ∑ i NDT ( f R , t ( x i ) ) } {\displaystyle \arg \min _{\mathbf {R} ,\mathbf {t} }\left\{-\sum _{i}\operatorname {NDT} \left(f_{\mathbf {R} ,\mathbf {t} }\left(\mathbf {x_{i}} \right)\right)\right\}} where the loss function represents the negated likelihood, obtained by applying the transformation to all points in the second cloud and summing the value of the NDT at each transformed point f R , t ( x ) {\displaystyle f_{\mathbf {R} ,\mathbf {t} }(\mathbf {x} )} . The loss is piecewise continuous and differentiable, and can be optimised with gradient-based methods (in the original formulation, the authors use Newton's method). In order to reduce the effect of cell discretisation, a technique consists of partitioning the space into multiple overlapping grids, shifted by half cell size along the spatial directions, and computing the likelihood at a given location as the sum of the NDTs induced by each grid.
Scene statistics
Scene statistics is a discipline within the field of perception. It is concerned with the statistical regularities related to scenes. It is based on the premise that a perceptual system is designed to interpret scenes. Biological perceptual systems have evolved in response to physical properties of natural environments. Therefore natural scenes receive a great deal of attention. Natural scene statistics are useful for defining the behavior of an ideal observer in a natural task, typically by incorporating signal detection theory, information theory or estimation theory. == Within-domain versus across-domain == Geisler (2008) distinguishes between four kinds of domains: (1) Physical environments (2) Images/Scenes (3) Neural responses and (4) Behavior. Within the domain of images/scenes one can study the characteristics of information related to redundancy and efficient coding. Across-domain statistics determine how an autonomous system should make inferences about its environment, process information and control its behavior. To study these statistics it is necessary to sample or register information in multiple domains simultaneously. == Applications == === Prediction of picture and video quality === One of the most successful applications of Natural Scenes Statistics Models has been perceptual picture and video quality prediction. For example, the Visual Information Fidelity (VIF) algorithm, which is used to measure the degree of distortion of pictures and videos, is used extensively by the image and video processing communities to assess perceptual quality. This is often after processing, such as compression, which can degrade the appearance of a visual signal. The premise is that the scene statistics are changed by distortion and that the visual system is sensitive to the changes in the scene statistics. VIF is heavily used in the streaming television industry. Other popular picture quality models that use natural scene statistics include BRISQUE and NIQE, both of which are no-reference since they do not require any reference picture to measure quality against.
Frameserver
A frameserver is any program that acts as a media source in the process called frameserving, which transfers digital video data from one computer program to another without intermediate files. The program that receives the data – the frameclient – could be any type of video application. The process is controlled by the frameclient: the frameclient requests audio/video frames and the frameserver serves them. The client can request frames in any order, allowing it to pause or jump to an arbitrary frame, just as a media player does with a file on disk. The client is most commonly a media encoder, a non-linear editing system, or a media player. == Frameservers == AviSynth VirtualDub VapourSynth Debugmode FrameServer
Sorenson Squeeze
Sorenson Squeeze was a software video encoding tool used to compress and convert video and audio files on Mac OS X or Windows operating systems. It was sold as a standalone tool and has also long been bundled with Avid Media Composer. == History == Sorenson Squeeze was first announced on July 17, 2001, as the first variable bit rate (VBR) compression application for Mac OS X, and was released on October 29 of that same year. By March 2002, Sorenson Squeeze became available for Windows OS. Sorenson Squeeze was originally released as a tool for encoding videos for the Web and QuickTime playback but began adding new codecs as more versions were released. The software was discontinued by Sorenson in January 2019, and correspondingly was no longer offered as part of Avid Media Composer. == Features == Squeeze included a number of features to improve video & audio quality. Features included: GPU accelerated H.264 encoding, adaptive bitrate encoding, HD encoding and Dolby certified AC3 Audio. Intelligent encoding presets available in Squeeze included: x265 (H.265) MainConcept H.264 and MainConcept H.264 CUDA. Adaptive bitrate encoding allows for optimal bitrate and error resilience based on network conditions, resulting in a dynamic adjustment of the video bitstream being delivered. It encoded to multiple formats including QuickTime, Windows Media, Flash Video, Silverlight, WebM & WMV. It uses multiple codecs, including the Sorenson codecs SV3 Pro and Spark, H.265, H.264, H.263, VP6, VC1, MPEG2, and many others. Squeeze operates on the Apple Macintosh and Microsoft Windows operating systems. Squeeze offers native plugins to Avid, Apple Final Cut Pro and Adobe Premiere (CS4, CS5) NLEs. Each copy of Squeeze included the Dolby Certified AC3 Consumer encoder. Squeeze also included a simplified review and approval process, which allows the user to automatically send secure, password protected videos for immediate review. Instant feedback is received via Web or mobile. == Versions == Sorenson Squeeze was released on October 29, 2001. Sorenson Squeeze for Macromedia Flash MX was released on March 14, 2002. Sorenson Squeeze 3 for MPEG-4 was released in January 2003. Sorenson Squeeze 3 Compression Suite was released in January 2003. Sorenson Squeeze 5 was released on March 31, 2008. Sorenson Squeeze was updated to version 5.1 on May 11, 2009. Sorenson Squeeze 6 was released on November 3, 2009. Sorenson Squeeze 7 was released January 25, 2011. Sorenson Squeeze 11 was released August 27, 2016. == Awards == Streaming Media magazine Readers’ Choice Award for Encoding Software for 2007, 2008, 2009 and 2010. 2008 Vanguard Award from Digital Content Producer magazine == Squeeze 7 system requirements == Windows Pentium IV-based computer or greater Windows XP, Vista or 7 32- and 64-bit compatible (including AVID 64-bit update); Faster performance on 64-bit systems 512 MB RAM 120 MB available hard drive space QuickTime 7.2 or later DirectX 9.0b or later Macintosh Intel-based processor Mac OS 10.4 or later 32- and 64-bit compatible; Faster performance on 64-bit systems 512 MB RAM 120 MB available hard drive space QuickTime 7.2 or later
Motor theory of speech perception
The motor theory of speech perception is the hypothesis that people perceive spoken words by identifying the vocal tract gestures with which they are pronounced rather than by identifying the sound patterns that speech generates. It originally claimed that speech perception is done through a specialized module that is innate and human-specific. Though the idea of a module has been qualified in more recent versions of the theory, the idea remains that the role of the speech motor system is not only to produce speech articulations but also to detect them. The hypothesis has gained more interest outside the field of speech perception than inside. This has increased particularly since the discovery of mirror neurons that link the production and perception of motor movements, including those made by the vocal tract. The theory was initially proposed in the Haskins Laboratories in the 1950s by Alvin Liberman and Franklin S. Cooper, and developed further by Donald Shankweiler, Michael Studdert-Kennedy, Ignatius Mattingly, Carol Fowler and Douglas Whalen. == Origins and development == The hypothesis has its origins in research using pattern playback to create reading machines for the blind that would substitute sounds for orthographic letters. This led to a close examination of how spoken sounds correspond to the acoustic spectrogram of them as a sequence of auditory sounds. This found that successive consonants and vowels overlap in time with one another (a phenomenon known as coarticulation). This suggested that speech is not heard like an acoustic "alphabet" or "cipher," but as a "code" of overlapping speech gestures. === Associationist approach === Initially, the theory was associationist: infants mimic the speech they hear and that this leads to behavioristic associations between articulation and its sensory consequences. Later, this overt mimicry would be short-circuited and become speech perception. This aspect of the theory was dropped, however, with the discovery that prelinguistic infants could already detect most of the phonetic contrasts used to separate different speech sounds. === Cognitivist approach === The behavioristic approach was replaced by a cognitivist one in which there was a speech module. The module detected speech in terms of hidden distal objects rather than at the proximal or immediate level of their input. The evidence for this was the research finding that speech processing was special such as duplex perception. === Changing distal objects === Initially, speech perception was assumed to link to speech objects that were both the invariant movements of speech articulators the invariant motor commands sent to muscles to move the vocal tract articulators This was later revised to include the phonetic gestures rather than motor commands, and then the gestures intended by the speaker at a prevocal, linguistic level, rather than actual movements. === Modern revision === The "speech is special" claim has been dropped, as it was found that speech perception could occur for nonspeech sounds (for example, slamming doors for duplex perception). === Mirror neurons === The discovery of mirror neurons has led to renewed interest in the motor theory of speech perception, and the theory still has its advocates, although there are also critics. == Support == === Nonauditory gesture information === If speech is identified in terms of how it is physically made, then nonauditory information should be incorporated into speech percepts even if it is still subjectively heard as "sounds". This is, in fact, the case. The McGurk effect shows that seeing the production of a spoken syllable that differs from an auditory cue synchronized with it affects the perception of the auditory one. In other words, if someone hears "ba" but sees a video of someone pronouncing "ga", what they hear is different—some people believe they hear "da". People find it easier to hear speech in noise if they can see the speaker. People can hear syllables better when their production can be felt haptically. === Categorical perception === Using a speech synthesizer, speech sounds can be varied in place of articulation along a continuum from /bɑ/ to /dɑ/ to /ɡɑ/, or in voice onset time on a continuum from /dɑ/ to /tɑ/ (for example). When listeners are asked to discriminate between two different sounds, they perceive sounds as belonging to discrete categories, even though the sounds vary continuously. In other words, 10 sounds (with the sound on one extreme being /dɑ/ and the sound on the other extreme being /tɑ/, and the ones in the middle varying on a scale) may all be acoustically different from one another, but the listener will hear all of them as either /dɑ/ or /tɑ/. Likewise, the English consonant /d/ may vary in its acoustic details across different phonetic contexts (the /d/ in /du/ does not technically sound the same as the one in /di/, for example), but all /d/'s as perceived by a listener fall within one category (voiced alveolar plosive) and that is because "linguistic representations are abstract, canonical, phonetic segments or the gestures that underlie these segments." This suggests that humans identify speech using categorical perception, and thus that a specialized module, such as that proposed by the motor theory of speech perception, may be on the right track. === Speech imitation === If people can hear the gestures in speech, then the imitation of speech should be very fast, as in when words are repeated that are heard in headphones as in speech shadowing. People can repeat heard syllables more quickly than they would be able to produce them normally. === Speech production === Hearing speech activates vocal tract muscles, and the motor cortex and premotor cortex. The integration of auditory and visual input in speech perception also involves such areas. Disrupting the premotor cortex disrupts the perception of speech units such as plosives. The activation of the motor areas occurs in terms of the phonemic features which link with the vocal track articulators that create speech gestures. The perception of a speech sound is aided by pre-emptively stimulating the motor representation of the articulators responsible for its pronunciation . Auditory and motor cortical coupling is restricted to a specific range of neuronal firing frequency. === Perception-action meshing === Evidence exists that perception and production are generally coupled in the motor system. This is supported by the existence of mirror neurons that are activated both by seeing (or hearing) an action and when that action is carried out. Another source of evidence is that for common coding theory between the representations used for perception and action. == Criticisms == The motor theory of speech perception is not widely held in the field of speech perception, though it is more popular in other fields, such as theoretical linguistics. As three of its advocates have noted, "it has few proponents within the field of speech perception, and many authors cite it primarily to offer critical commentary".p. 361 Several critiques of it exist. === Multiple sources === Speech perception is affected by nonproduction sources of information, such as context. Individual words are hard to understand in isolation but easy when heard in sentence context. It therefore seems that speech perception uses multiple sources that are integrated together in an optimal way. === Production === The motor theory of speech perception would predict that speech motor abilities in infants predict their speech perception abilities, but in actuality it is the other way around. It would also predict that defects in speech production would impair speech perception, but they do not. However, this only affects the first and already superseded behaviorist version of the theory, where infants were supposed to learn all production-perception patterns by imitation early in childhood. This is no longer the mainstream view of motor-speech theorists. === Speech module === Several sources of evidence for a specialized speech module have failed to be supported. Duplex perception can be observed with door slams. The McGurk effect can also be achieved with nonlinguistic stimuli, such as showing someone a video of a basketball bouncing but playing the sound of a ping-pong ball bouncing. As for categorical perception, listeners can be sensitive to acoustic differences within single phonetic categories. As a result, this part of the theory has been dropped by some researchers. === Sublexical tasks === The evidence provided for the motor theory of speech perception is limited to tasks such as syllable discrimination that use speech units not full spoken words or spoken sentences. As a result, "speech perception is sometimes interpreted as referring to the perception of speech at the sublexical level. However, the ultimate goal of these studies is presumably to understand the neural processes supporting the ability to process spee
Reference Software International
Reference Software International, Inc. (RSI), was an American software developer active from 1985 to 1993 and based in Albuquerque, New Mexico, and San Francisco, California. The company released several productivity and reference software packages, including the Grammatik grammar checker, for MS-DOS. The company was acquired by WordPerfect Corporation in 1993. == History == === Background (1980–1985) === Reference Software International, Inc., was founded by Donald "Don" Emery and Bruce Wampler in 1985 in San Francisco, California. Both Wampler and Emery were college professors when they founded RSI: Wampler at the University of New Mexico as a professor of computer science and Emery a professor of marketing at San Francisco State University. After graduating from the University of Utah in around 1978, Wampler founded his first software company, Aspen Software, in Tijeras, New Mexico, in 1979. Wampler founded Aspen to develop an early spell checker software package, called Proofreader, for the TRS-80, licensing Random House's Webster's Unabridged Dictionary for the package's lexicon. In 1980, he began development on a grammar checker inspired by Writer's Workbench, a pioneering grammar checker for Unix systems. Wampler used Writer's Workbench heavily during the writer of his doctoral dissertation but disliked having to jump between the Apple II on which he composed the dissertation and the mainframe on which Writer's Workbench ran, and so wanted to develop a version of the latter for microcomputers. Wampler's work came to fruition as Grammatik in 1981, eventually ported to several other microcomputer platforms in the early 1980s. In 1983, by which point the company had 12 employees and sold a combined 80,000 units of Grammatik and Proofreader, Wampler sold Aspen to Dictronics, a software company best known for developing the Electronic Thesaurus, an early thesaurus program for microcomputers. Dictronics was in turn purchased by Wang Laboratories; according to Wampler, "Wang bought [Aspen] and sat on it. They did nothing with it". Wampler moved on to teach for the University of New Mexico, but, frustrated by Wang's inaction, got the urge to resurrect his work. In 1985, he was able to license back Grammatik and Proofreader from a small California-based software firm that had grandfathered rights to a forked version of both. In the same year, he met Emery, who, impressed by Wampler's, founded Reference Software International to market his software. RSI's research and development headquarters were based in Albuquerque, while the company's sales and marketing department was based in Walnut Creek, California. === Success (1985–1992) === In August 1985, RSI released their first product: the Random House Reference Set, a new version of Proofreader for the IBM Personal Computer and compatibles, revised to be a terminate-and-stay-resident program that ran atop other word processors such as WordStar or WordPerfect. At the time, Reference Set was the only such program on the market that functioned like this. RSI netted $114,000 from sales of Reference Set by the end of 1985. In June 1986, they released version 2.0 of Grammatik as Grammatik II for the PC. The latter was a breakout hit for RSI, receiving praise in the press (including technology journals such as PC Magazine) and RSI selling 1,000 units a month. In spring 1987, they released Reference Set II, which allowed users to import their own words into the built-in dictionary and added a thesaurus of 300,000 words. In November 1987, they released version 3.0 of Reference Set, which comprised two new field-specific dictionaries for the medical and legal professions. As well as the general Random House dictionary and thesaurus, it included Stedman's Medical Dictionary and Black's Law Dictionary. Emery consulted Paul Brest and Bob Jackson—professors of law at Stanford Law School and San Francisco State respectively—for the curation of the law dictionary; and Burton Grebin—at the time the executive director of Mount Saint Mary's Hospital—for the curation of the medical dictionary. In fall 1988, the company released Grammatik III, a total rewrite that made use of artificial intelligence to more accurately judge the grammar of sentences by breaking them down into a syntactic hierarchy. Grammatik III received universal acclaim, with Gloria Morris of InfoWorld calling it the apparent leader in the grammar checking field and Sandra Anderson of Mac Home Journal calling it "hands down ... the best of the industry" six years after its release. By 1989, the product had competitors in Correct Grammar by Lifetree Software and RightWriter by Rightsoft, Inc. By 1990, RSI achieved annual sales of $9.7 million. In the same year they released Grammatik IV, which was the first to offer direct integration with WordPerfect on both MS-DOS and Windows. In March 1992—by which point RSI had sold 1.5 million copies of Grammatik across all versions—the company released version 5 of the program, another rewrite that updated the lexicon further and added new functions such as word redundancy detection. Around the same time, the company introduced Easy Proof, a pared-down version of Grammatik intended for novice writers, students, and family computers. In 1991, the company was engaged in a trademark dispute with Systems Compatibility Corporation (SCC) of Chicago, Illinois, over the rights to the Software Toolkit title. Both companies had published software bundles bearing the name in the turn of the 1990s; SCC had published theirs first in 1988 and registered the trademark with the USPTO. SCC was granted a restraining order against RSI in January 1991. The following month, RSI agreed to rename their product, preventing a protracted legal battle. === Decline and acquisition (1992–1993) === By early 1992, RSI achieved annual sales of more than $13 million, employed 120 people, and had opened international offices in London, Belgium, and Antwerp to sell foreign versions of Reference Set and Grammatik. The company reached peak employment in the middle of 1992, with 140 employees. However, RSI's launch of six disparate titles in the year proved problematic for the company when they failed to sell as well as they had projected, and the company laid off employees by the dozens. By December 1992, only 71 employees were left, 32 from their San Francisco office. On the last day of 1992, RSI received an acquisition offer from WordPerfect Corporation, makers of the namesake word processor based in Orem, Utah. The deal was inked in January 1993, RSI's stakeholders receiving $19 million. The company's remaining employees were absorbed into WordPerfect in Orem. WordPerfect continued selling Grammatik as a standalone product for several years.