AI Data Explorer Servicenow

AI Data Explorer Servicenow — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Convolution

    Convolution

    In mathematics (in particular, functional analysis), convolution is a mathematical operation on two functions $f$ and $g$ that produces a third function $f * g$, as the integral of the product of the two functions after one is reflected about the y-axis and shifted. The term convolution refers to both the resulting function and to the process of computing it. The integral is evaluated for all values of shift, producing the convolution function. The choice of which function is reflected and shifted before the integral does not change the integral result (see commutativity). Graphically, it expresses how the 'shape' of one function is modified by the other. Some features of convolution are similar to cross-correlation: for real-valued functions of a continuous or discrete variable, convolution $f * g$ differs from cross-correlation $f \star g$ only in that either $f(x)$ or $g(x)$ is reflected about the y-axis in convolution; thus it is a cross-correlation of $g(-x)$ and $f(x)$, or $f(-x)$ and $g(x)$. For complex-valued functions, the cross-correlation operator is the adjoint of the convolution operator. Convolution has applications that include probability, statistics, acoustics, spectroscopy, signal processing and image processing, computer vision and human vision, geophysics, engineering, physics, and differential equations. The convolution can be defined for functions on Euclidean space and other groups (as algebraic structures). For example, periodic functions, such as the discrete-time Fourier transform, can be defined on a circle and convolved by periodic convolution. (See row 18 at DTFT § Properties.) A discrete convolution can be defined for functions on the set of integers.
Generalizations of convolution have applications in the field of numerical analysis and numerical linear algebra, and in the design and implementation of finite impulse response filters in signal processing. Computing the inverse of the convolution operation is known as deconvolution.

== Definition ==

The convolution of $f$ and $g$ is written $f * g$, denoting the operator with the symbol $*$. It is defined as the integral of the product of the two functions after one is reflected about the y-axis and shifted. As such, it is a particular kind of integral transform:

$$(f * g)(t) := \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau.$$

An equivalent definition is (see commutativity):

$$(f * g)(t) := \int_{-\infty}^{\infty} f(t - \tau)\, g(\tau)\, d\tau.$$

While the symbol $t$ is used above, it need not represent the time domain. At each $t$, the convolution formula can be described as the area under the function $f(\tau)$ weighted by the function $g(-\tau)$ shifted by the amount $t$.
As $t$ changes, the weighting function $g(t - \tau)$ emphasizes different parts of the input function $f(\tau)$. If $t$ is a positive value, then $g(t - \tau)$ is equal to $g(-\tau)$ shifted along the $\tau$-axis toward the right (toward $+\infty$) by the amount $t$, while if $t$ is a negative value, then $g(t - \tau)$ is equal to $g(-\tau)$ shifted toward the left (toward $-\infty$) by the amount $|t|$.

For functions $f$, $g$ supported only on $[0, \infty)$ (i.e., zero for negative arguments), the integration limits can be truncated, resulting in:

$$(f * g)(t) = \int_{0}^{t} f(\tau)\, g(t - \tau)\, d\tau \quad \text{for } f, g : [0, \infty) \to \mathbb{R}.$$

For the multi-dimensional formulation of convolution, see domain of definition (below).

=== Notation ===

A common engineering notational convention is:

$$f(t) * g(t) := \underbrace{\int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau}_{(f * g)(t)},$$

which has to be interpreted carefully to avoid confusion. For instance, $f(t) * g(t - t_0)$ is equivalent to $(f * g)(t - t_0)$, but $f(t - t_0) * g(t - t_0)$ is in fact equivalent to $(f * g)(t - 2t_0)$.
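As a rough numerical illustration of the definition, the discrete analogue of the integral, $\sum_m f[m]\, g[n - m]$, can be evaluated directly on short sequences and compared with NumPy's `np.convolve`. This is only a sketch on finite sequences: the helper name `convolve_direct` and the sample arrays are made up for the example.

```python
import numpy as np

def convolve_direct(f, g):
    """Full discrete convolution: out[n] = sum over m of f[m] * g[n - m]."""
    out = np.zeros(len(f) + len(g) - 1)
    for n in range(len(out)):
        for m in range(len(f)):
            if 0 <= n - m < len(g):  # g is treated as zero outside its support
                out[n] += f[m] * g[n - m]
    return out

# Arbitrary sample sequences (hypothetical data).
f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])

fg = convolve_direct(f, g)
gf = convolve_direct(g, f)

# Commutativity: it does not matter which sequence is reflected and shifted.
print(np.allclose(fg, gf))
# Agreement with the library routine.
print(np.allclose(fg, np.convolve(f, g)))
# For real sequences, cross-correlating f with g equals convolving f with reversed g.
print(np.allclose(np.correlate(f, g, mode="full"), convolve_direct(f, g[::-1])))
```

The double loop is deliberately naive so that it mirrors the formula term by term; for anything beyond a toy input, the library routine is the better choice.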
=== Relations with other transforms ===

Given two functions $f(t)$ and $g(t)$ with bilateral (two-sided) Laplace transforms

$$F(s) = \int_{-\infty}^{\infty} e^{-su}\, f(u)\, \mathrm{d}u \quad \text{and} \quad G(s) = \int_{-\infty}^{\infty} e^{-sv}\, g(v)\, \mathrm{d}v$$

respectively, the convolution operation $(f * g)(t)$ can be defined as the inverse Laplace transform of the product of $F(s)$ and $G(s)$. More precisely,

$$\begin{aligned}
F(s) \cdot G(s) &= \int_{-\infty}^{\infty} e^{-su}\, f(u)\, \mathrm{d}u \cdot \int_{-\infty}^{\infty} e^{-sv}\, g(v)\, \mathrm{d}v \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-s(u+v)}\, f(u)\, g(v)\, \mathrm{d}u\, \mathrm{d}v
\end{aligned}$$

Let $t = u + v$, then

$$\begin{aligned}
F(s) \cdot G(s) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-st}\, f(u)\, g(t - u)\, \mathrm{d}u\, \mathrm{d}t \\
&= \int_{-\infty}^{\infty} e^{-st} \underbrace{\int_{-\infty}^{\infty} f(u)\, g(t - u)\, \mathrm{d}u}_{(f * g)(t)}\, \mathrm{d}t \\
&= \int_{-\infty}^{\infty} e^{-st}\, (f * g)(t)\, \mathrm{d}t.
\end{aligned}$$

Note that $F(s) \cdot G(s)$ is the bilateral Laplace transform of $(f * g)(t)$. A similar derivation can be done using the unilateral (one-sided) Laplace transform.
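A small worked example may make the relation concrete. The two functions below are chosen purely for illustration: both are causal exponentials (zero for negative arguments), so the bilateral and unilateral transforms coincide and the integration limits reduce to $0$ and $t$.

```latex
\begin{aligned}
f(t) &= e^{-t}, \quad g(t) = e^{-2t} \quad (t \ge 0), \qquad f(t) = g(t) = 0 \quad (t < 0) \\
F(s) &= \frac{1}{s+1}, \quad G(s) = \frac{1}{s+2} \qquad (\operatorname{Re} s > -1) \\
(f * g)(t) &= \int_{0}^{t} e^{-\tau}\, e^{-2(t-\tau)}\, d\tau
            = e^{-2t} \int_{0}^{t} e^{\tau}\, d\tau
            = e^{-t} - e^{-2t} \\
\int_{0}^{\infty} e^{-st}\, (f * g)(t)\, dt &= \frac{1}{s+1} - \frac{1}{s+2}
            = \frac{1}{(s+1)(s+2)} = F(s) \cdot G(s)
\end{aligned}
```

The last line confirms, for this particular pair, that the transform of the convolution equals the product of the transforms.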
The convolution operation also describes the output (in terms of the input) of an important class of operations known as linear time-invariant (LTI). See LTI system theory for a derivation of convolution as the result of LTI constraints. In terms of the Fourier transforms of the input and output of an LTI operation, no new frequency components are created. The existing ones are only modified (amplitude and/or phase). In other words, the output transform is the pointwise product of the input transform with a third transform (known as a transfer function). See Convolution theorem for a derivation of that property of convolution. Conversely, convolution can be derived as the inverse Fourier transform of the pointwise product of two Fourier transforms.

== Historical developments ==

One of the earliest uses of the convolution integral appeared in D'Alembert's derivation of Taylor's theorem in Recherches sur différents points importants du système du monde, published in 1754. Also, an expression of the type

$$\int f(u) \cdot g(x - u)\, du$$

is used by Sylvestre François Lacroix on page 505 of his book entitled Treatise on differences and series, which is the last of 3 volumes of the encyclopedic series Traité du calcul différentiel et du calcul intégral, Chez Courcier, Paris, 1797–1800. Soon thereafter, convolution operations appear in the works of Pierre Simon Laplace, Jean-Baptiste Joseph Fourier, Siméon Denis Poisson, and others. The term itself did not come into wide use until the 1950s or 1960s. Prior to that it was sometimes known as Faltung (which means folding in German), composition product, superposition integral, and Carson's integral. Yet it appears as early as 1903, though the definition is rather unfamiliar in older uses.
The operation

$$\int_{0}^{t} \varphi(s)\, \psi(t - s)\, ds, \quad 0 \leq t < \infty,$$

is a particular case of composition products considered by the Italian mathematician Vito Volterra in 1913.

== Circular c

  • Real-time transcription

    Real-time transcription

    Real-time transcription is the general term for transcription by court reporters using real-time text technologies to deliver computer text screens within a few seconds of the words being spoken. Specialist software allows participants in court hearings or depositions to make notes in the text and highlight portions for future reference. Real-time transcription is also used in the broadcasting environment, where it is more commonly termed "captioning."

    == Career opportunities ==

    Real-time reporting is used in a variety of industries, including entertainment, television, the Internet, and law. Specific careers include the following:

    • Judicial reporters use a stenotype to provide instant transcripts on computer screens as a trial or deposition occurs.
    • Communication access real-time translation (CART) reporters assist the hearing-impaired by transcribing spoken words, giving them personal access to the communications they need day to day.
    • Television broadcast captioners use real-time reporting technology to allow hard-of-hearing or deaf people to see what is being said on live television broadcasts such as news, emergency broadcasts, sporting events, awards shows, and other programs.
    • Internet information (or Webcast) reporters provide real-time reporting of sales meetings, press conferences, and other events, while simultaneously transmitting the transcripts to computers worldwide.
    • Other rapid data entry positions.

    == History ==

    Before the advent of the stenotype machine, court reporters wrote official trial transcripts by hand using a shorthand system of stenoforms that could later be translated into readable English. It often took eight years of training to learn this manual form of writing at the necessary speed. Walter Heironimus was among the first stenographers to make use of the stenotype machine during his work in the U.S. District Court system in New Jersey in 1935.
A "transcript crisis" arose during the latter half of the twentieth century due to the increasing volume of lawsuits. There were not enough court reporters to match the increasing number of trials. Not only were court reporters unavailable to attend many court proceedings, but court transcripts were constantly late and their quality varied. Some believed this was due to the non-interchangeability of court reporters, and others believed it was simply due to a labor shortage. In the meantime, magnetic audiotape recording, also known as electronic recording (ER), began to threaten reporters' jobs, since it could record long courtroom trials and take the court reporter's place in the courtroom. As a result, machine translation (MT) was intended to keep ER from replacing reporters. However, MT relied heavily on human labor operating behind the system, and many started to question whether it was the right way to end the "transcript crisis." In 1964, the Automatic Language Processing Advisory Committee (ALPAC), set up by the CIA, was asked to review whether MT was capable of solving this crisis. It concluded that MT had failed to do so. Patrick O'Neill, a skilled and experienced court reporter, then stayed on to work on the stenotype-translation project with the CIA and developed the prototype CAT system. After the court-reporting community adopted the CAT system, CAT was brought into television broadcasting, with the aim of providing captions for the deaf and hard-of-hearing communities. In 1983, Linda Miller developed a further use for the CAT system. She successfully translated a lecture live on a television screen and provided a transcript for students. This technique is known as Computer-Aided Real-time Translation, or CART.

== Court reporter ==

It is the court reporter's job to note down the exact words spoken by every participant during a court or deposition proceeding.
Court reporters then provide verbatim transcripts. One reason for having an official court transcript is that real-time transcription gives attorneys and judges immediate access to the transcript. It also helps when there is a need to look up information from the proceeding. Additionally, the deaf and hard-of-hearing communities can participate in the judicial process with the help of real-time transcriptions provided by court reporters.

=== Education and training ===

The degree required of a court reporter is an associate's degree or a postsecondary certificate. More than 150 court reporter training programs are offered at proprietary schools, community colleges, and four-year universities. After graduation, court reporters can choose to pursue further certifications to achieve a higher level of expertise and increase their marketability during a job search. In most states, court reporters now need a Certificate of Proficiency from the NCRA or from a state agency to qualify for appointments. The NCRA aims to set the national standard for the certification of court reporters, and since 1937 it has offered its certification program, which 22 states now accept in place of a state license.
Court reporter training programs include, but are not limited to:

• Training in rapid writing skill, or shorthand, which will enable students to record, with accuracy, at least 225 words per minute
• Training in typing, which will enable students to type at least 60 words per minute
• A general training in English, which covers aspects of grammar, word formation, punctuation, spelling, and capitalization
• Law-related courses covering the overall principles of civil and criminal law, legal terminology and common Latin phrases, rules of evidence, court procedures, the duties of court reporters, and the ethics of the profession
• Visits to actual trials
• Courses in elementary anatomy and physiology and medical word study, including medical prefixes, roots, and suffixes

Besides official court reporters, who are assigned to and work for a particular court, other types of court reporters include freelance reporters, who either work for a court reporting firm or are self-employed. They differ from official court reporters in that they can work on a wider range of assignments and are paid on an hourly basis. Hearing reporters work at governmental agency hearings. Legislative reporters work in law-making bodies. The demand for reporters is not limited to court settings. Reporters are also needed at conferences, meetings, conventions, and investigations, and in a variety of industries that need workers with real-time data entry skills.

== Non-English transcription ==

Transcription services are universally necessary, so they are not limited to the English language. A stenographer's ability to transcribe languages other than English is especially valuable as society as a whole becomes increasingly multilingual. Education in non-English transcription demands a comprehensive understanding of the given language.
Phonetic differences between English and other languages are a particular challenge in carrying English transcription skills over into other languages. Stenography represents the various sounds of a language in a formal system of shorthand, so differences in the sets of sounds that occur in other languages require an alternative system of shorthand transcription. For example, the presence of many diphthongs and triphthongs in Spanish requires certain sounds to be distinguished that would not be present in transcribing English into shorthand.

== Controversies ==

The use of transcription in the context of linguistic discussions has been controversial. Typically, two kinds of linguistic records are considered to be scientifically relevant: first, records of general acoustic features, and second, records that focus only on the distinctive phonemes of a language. While transcriptions are not entirely illegitimate, transcriptions without enough detailed commentary on linguistic features, or transcriptions of poor-quality sources, have a high chance of being misinterpreted. Besides misinterpretation, transcribers can also bring in cultural biases and ignorance that are reflected in their transcription. These problems may reduce the reliability of the final real-time transcription, which could influence how the written utterance is treated as evidence in a court case.

=== Quality issues ===

Problems in the final transcription can be caused by either the quality of the transcriber or the original source that is being transcribed. Transcribers can have different levels of skill and training. This makes the final transcription prone to poor quality or, if the transcription is being done by multiple people, to a lack of consistency in the content. If the source of the transcription is a recording, the problem may root back to the quality of the re

  • Progressive Graphics File

    Progressive Graphics File

    PGF (Progressive Graphics File) is a wavelet-based bitmapped image format that employs lossless and lossy data compression. PGF was created to improve upon and replace the JPEG format. It was developed at the same time as JPEG 2000 but with a focus on speed over compression ratio. PGF can operate at higher compression ratios without taking more encoding/decoding time and without generating the characteristic "blocky and blurry" artifacts of the original DCT-based JPEG standard. It also allows more sophisticated progressive downloads.

    == Color models ==

    PGF supports a wide variety of color models:

    • Grayscale with 1, 8, 16, or 31 bits per pixel
    • Indexed color with palette size of 256
    • RGB color image with 12, 16 (red: 5 bits, green: 6 bits, blue: 5 bits), 24, or 48 bits per pixel
    • ARGB color image with 32 bits per pixel
    • Lab color image with 24 or 48 bits per pixel
    • CMYK color image with 32 or 64 bits per pixel

    == Technical discussion ==

    PGF claims to achieve an improved compression quality over JPEG, adding or improving features such as scalability. Its compression performance is similar to the original JPEG standard. Very low and very high compression rates (including lossless compression) are also supported in PGF. The ability of the design to handle a very large range of effective bit rates is one of the strengths of PGF. For example, to reduce the number of bits for a picture below a certain amount, the advisable thing to do with the first JPEG standard is to reduce the resolution of the input image before encoding it, something that is ordinarily not necessary for that purpose when using PGF because of its wavelet scalability properties.
The PGF process chain contains the following four steps:

• Color space transform (in case of color images)
• Discrete Wavelet Transform
• Quantization (in case of lossy data compression)
• Hierarchical bit-plane run-length encoding

=== Color components transformation ===

Initially, images have to be transformed from the RGB color space to another color space, leading to three components that are handled separately. PGF uses a fully reversible modified YUV color transform. The transformation matrices are:

$$\begin{bmatrix} Y_r \\ U_r \\ V_r \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ 1 & -1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}; \qquad \begin{bmatrix} R \\ G \\ B \end{bmatrix} = \begin{bmatrix} 1 & \frac{3}{4} & -\frac{1}{4} \\ 1 & -\frac{1}{4} & -\frac{1}{4} \\ 1 & -\frac{1}{4} & \frac{3}{4} \end{bmatrix} \begin{bmatrix} Y_r \\ U_r \\ V_r \end{bmatrix}$$

The chrominance components can be, but do not necessarily have to be, down-scaled in resolution.

=== Wavelet transform ===

The color components are then wavelet transformed to an arbitrary depth. In contrast to JPEG 1992, which uses an 8x8 block-size discrete cosine transform, PGF uses one reversible wavelet transform: a rounded version of the biorthogonal CDF 5/3 wavelet transform. This wavelet filter bank is exactly the same as the reversible wavelet used in JPEG 2000. It uses only integer coefficients, so the output does not require rounding (quantization) and so it does not introduce any quantization noise.

=== Quantization ===

After the wavelet transform, the coefficients are scalar-quantized to reduce the number of bits needed to represent them, at the expense of a loss of quality. The output is a set of integer numbers which have to be encoded bit-by-bit.
The parameter that can be changed to set the final quality is the quantization step: the greater the step, the greater the compression and the loss of quality. With a quantization step equal to 1, no quantization is performed (this is used in lossless compression). In contrast to JPEG 2000, PGF uses only powers of two, so the parameter value $i$ represents a quantization step of $2^i$. Using only powers of two removes the need for integer multiplication and division operations.

=== Coding ===

The result of the previous process is a collection of sub-bands which represent several approximation scales. A sub-band is a set of coefficients: integer numbers that represent aspects of the image associated with a certain frequency range as well as a spatial area of the image. The quantized sub-bands are split further into blocks, rectangular regions in the wavelet domain. They are typically selected so that the coefficients within them across the sub-bands form approximately spatial blocks in the (reconstructed) image domain, and they are collected in a fixed-size macroblock. The encoder has to encode the bits of all quantized coefficients of a macroblock, starting with the most significant bits and progressing to less significant bits. In this encoding process, each bit-plane of the macroblock gets encoded in two so-called coding passes, first encoding bits of significant coefficients, then refinement bits of significant coefficients. Clearly, in lossless mode all bit-planes have to be encoded, and no bit-planes can be dropped. Only significant coefficients are compressed with an adaptive run-length/Rice (RLR) coder, because they contain long runs of zeros. The RLR coder with parameter $k$ (logarithmic length of a run of zeros) is also known as the elementary Golomb code of order $2^k$.

=== Comparison with other file formats ===

JPEG 2000 is slightly more space-efficient in handling natural images.
Its PSNR for the same compression ratio is on average 3% better than the PSNR of PGF. It has a small advantage in compression ratio but longer encoding and decoding times. PNG (Portable Network Graphics) is more space-efficient in handling images with many pixels of the same color.

There are several self-proclaimed advantages of PGF over the ordinary JPEG standard:

• Superior compression performance: The image quality (measured in PSNR) for the same compression ratio is on average 3% better than the PSNR of JPEG. At lower bit rates (e.g. less than 0.25 bits/pixel for gray-scale images), PGF has a much more significant advantage over certain modes of JPEG: artifacts are less visible and there is almost no blocking. The compression gains over JPEG are attributed to the use of DWT.
• Multiple resolution representation: PGF provides seamless compression of multiple image components, with each component carrying from 1 to 31 bits per component sample. With this feature there is no need for separately stored preview images (thumbnails).
• Progressive transmission by resolution accuracy, commonly referred to as progressive decoding: PGF provides efficient code-stream organizations which are progressive by resolution. This way, after a smaller part of the whole file has been received, it is possible to see a lower-quality version of the final picture; the quality then improves monotonically as more data arrives from the source.
• Lossless and lossy compression: PGF provides both lossless and lossy compression in a single compression architecture. Both lossy and lossless compression are provided by the use of a reversible (integer) wavelet transform.
• Side channel spatial information: Transparency and alpha planes are fully supported.
• ROI extraction: Since version 5, PGF supports extraction of regions of interest (ROI) without decoding the whole image.

== Available software ==

The author published libPGF via SourceForge, under the GNU Lesser General Public License version 2.0.
Xeraina offers a free Windows console encoder and decoder, and PGF viewers based on WIC for 32-bit and 64-bit Windows platforms. Other WIC applications, including File Explorer, are able to display PGF images after installing this viewer. digiKam is a popular open-source image editing and cataloging application that uses libPGF for its thumbnails. It makes use of the progressive decoding feature of PGF images to store a single version of each thumbnail, which can then be decoded to different resolutions without loss, thus allowing users to change the size of the thumbnails dynamically without having to recalculate them.
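The reversible color transform from the "Color components transformation" section and the power-of-two quantization from the "Quantization" section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not libPGF's actual code: flooring the luminance to keep every value an integer is an assumption modeled on the familiar reversible color transform idea, the sign-magnitude handling in the quantizer is a choice made for this sketch, and the function names `rgb_to_yuv_r`, `yuv_r_to_rgb`, `quantize`, and `dequantize` are hypothetical.

```python
def rgb_to_yuv_r(r, g, b):
    """Forward transform from the matrix in the text, with Y floored to an integer (assumption)."""
    y = (r + 2 * g + b) >> 2   # floor((R + 2G + B) / 4)
    u = r - g
    v = b - g
    return y, u, v

def yuv_r_to_rgb(y, u, v):
    """Exact integer inverse of rgb_to_yuv_r."""
    g = y - ((u + v) >> 2)     # Python's >> floors, so negative sums work too
    r = u + g
    b = v + g
    return r, g, b

def quantize(coeff, i):
    """Quantize with step 2**i using shifts only; the sign is handled separately."""
    magnitude = abs(coeff) >> i
    return -magnitude if coeff < 0 else magnitude

def dequantize(q, i):
    """Scale back by 2**i; the low i bits dropped by quantize are not recovered."""
    magnitude = abs(q) << i
    return -magnitude if q < 0 else magnitude

# Round trip on a few sample pixels (hypothetical values): nothing is lost.
for pixel in [(200, 17, 93), (0, 255, 0), (255, 0, 255)]:
    assert yuv_r_to_rgb(*rgb_to_yuv_r(*pixel)) == pixel

print(rgb_to_yuv_r(200, 17, 93))
print(quantize(-37, 3), dequantize(quantize(-37, 3), 3))
```

The round trip is exact because R + 2G + B equals 4G + U + V, so the floor taken in the forward direction is undone by the floor in the inverse. With `i = 0` the quantizer leaves coefficients unchanged, which matches the statement that a step of 1 is used for lossless compression.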

  • Plum Voice

    Plum Voice

    The Plum Group, Inc. (DBA Plum Voice) is a company that develops interactive voice response (IVR) and voice application products. Plum is headquartered in New York City with offices in Boston and Denver.

    == History ==

    Plum Voice, founded in 2000 as The Plum Group, Inc., was incorporated to create technologies for personalized audio communication. By 2001, Plum had commercialized the open-standard Plum VoiceXML IVR platform, which facilitated the creation of dynamic telecom applications.

    • 2001 - Commercial launch of Plum VoiceXML IVR platform for customer-premises deployment
    • 2002 - Launch of Plum Voice Hosting Centers for 24x7x365 managed IVR hosting
    • 2004 - Plum Voice application suite receives a "Product of the Year" award from Customer Interactions magazine
    • 2008 - Plum Survey builder launched, a do-it-yourself IVR survey tool.
    • 2010 - Plum launched QuickFuse, a web-based rapid development platform used to create voice applications.
    • 2013 - Plum launched VoiceTrends, an analytics and reporting toolkit designed specifically for voice applications. Plum achieves PCI-DSS Level 1.
    • 2015 - Plum launched Plum Insight, a multi-channel (voice, web, mobile) survey platform. Plum achieves HIPAA compliance.
    • 2016 - Plum launched a new version of QuickFuse called Fuse+.
    • 2020 - Plum sunsets QuickFuse, rebrands Fuse+ as Plum Fuse.

  • Oculus Medium

    Oculus Medium

    Oculus Medium is digital sculpting software that works with virtual reality headsets and 6DoF motion controllers. It is used to create and paint digital sculptures. Medium works only on Oculus Rift. It was released on December 5, 2016, followed by a major update in 2018 that introduced new features and a revamped UI. On December 9, 2019, Oculus Medium was acquired by Adobe and renamed "Medium by Adobe".

  • PressWise

    PressWise

    PressWise was digital imposition software for quickly imposing almost any variety of flat and folding layouts. In August 1991 it was acquired from Emulation Technologies Inc. of Cleveland, Ohio, by the Aldus Prepress Group, affectionately known in the print and publishing industry as the Aldus WiseGuys. It was further developed by the Aldus Press Group and launched as the first of many Aldus prepress products in 1993. It was subsequently owned by Adobe Systems, then Luminous Corporation (Seattle), then Imation, and finally ScenicSoft. PressWise was ultimately discontinued by ScenicSoft in 1999.

    == History ==

    In February 2009, the PressWise copyright was acquired by Aethos Technologies and a new print automation product was launched by its creator, Eric Wold of Santa Rosa, California. This new product has no relationship to the old imposition software of the same name. Notably, Larry Letteney, former President of Creo Americas, was a board member and shareholder of Aethos Technologies during its early phase. Datatech SmartSoft acquired exclusive distribution rights to the software in September 2009. In September 2010, Datatech SmartSoft completed the acquisition of the PressWise brand and product.

  • Network Abstraction Layer

    Network Abstraction Layer

    The Network Abstraction Layer (NAL) is a part of the H.264/AVC and HEVC video coding standards. The main goal of the NAL is the provision of a "network-friendly" video representation addressing "conversational" (video telephony) and "non conversational" (storage, broadcast, or streaming) applications. NAL has achieved a significant improvement in application flexibility relative to prior video coding standards.

    == Introduction ==

    An increasing number of services and growing popularity of high definition TV are creating greater needs for higher coding efficiency. Moreover, other transmission media such as cable modem, xDSL, or UMTS offer much lower data rates than broadcast channels, and enhanced coding efficiency can enable the transmission of more video channels or higher quality video representations within existing digital transmission capacities. Video coding for telecommunication applications has diversified from ISDN and T1/E1 service to embrace PSTN, mobile wireless networks, and LAN/Internet network delivery. Throughout this evolution, continued efforts have been made to maximize coding efficiency while dealing with the diversification of network types and their characteristic formatting and loss/error robustness requirements. The H.264/AVC and HEVC standards are designed for technical solutions including areas like broadcasting (over cable, satellite, cable modem, DSL, terrestrial, etc.), interactive or serial storage on optical and magnetic devices, conversational services, video-on-demand or multimedia streaming, multimedia messaging services, etc. Moreover, new applications may be deployed over existing and future networks. This raises the question of how to handle this variety of applications and networks.
To address this need for flexibility and customizability, the design covers a NAL that formats the Video Coding Layer (VCL) representation of the video and provides header information in a manner appropriate for conveyance by a variety of transport layers or storage media. The NAL is designed to provide "network friendliness", enabling simple and effective customization of the use of VCL for a broad variety of systems. The NAL facilitates the ability to map VCL data to transport layers such as:

• RTP/IP for any kind of real-time wire-line and wireless Internet services.
• File formats, e.g., ISO MP4 for storage and MMS.
• H.32X for wireline and wireless conversational services.
• MPEG-2 systems for broadcasting services, etc.

The full degree of customization of the video content to fit the needs of each particular application is outside the scope of the video coding standardization effort, but the design of the NAL anticipates a variety of such mappings. Some key concepts of the NAL are NAL units, byte-stream and packet-format uses of NAL units, parameter sets, and access units. A short description of these concepts is given below.

== NAL units ==

The coded video data is organized into NAL units, each of which is effectively a packet that contains an integer number of bytes. The first byte of each H.264/AVC NAL unit is a header byte that contains an indication of the type of data in the NAL unit. For HEVC the header was extended to two bytes. All the remaining bytes contain payload data of the type indicated by the header. The NAL unit structure definition specifies a generic format for use in both packet-oriented and bitstream-oriented transport systems, and a series of NAL units generated by an encoder is referred to as a NAL unit stream.
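As an illustration of the header layout, the sketch below unpacks the one-byte H.264/AVC header and the two-byte HEVC header. The field names and bit widths are not given in the text above; they follow the commonly documented header syntax of the two standards and should be checked against the specifications. The function names are made up for this example.

```python
def parse_h264_nal_header(first_byte):
    """Split the single H.264/AVC NAL header byte into its three fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x01,  # 1 bit
        "nal_ref_idc": (first_byte >> 5) & 0x03,         # 2 bits
        "nal_unit_type": first_byte & 0x1F,              # 5 bits
    }

def parse_hevc_nal_header(byte0, byte1):
    """Split the two HEVC NAL header bytes into their four fields."""
    header = (byte0 << 8) | byte1
    return {
        "forbidden_zero_bit": (header >> 15) & 0x01,     # 1 bit
        "nal_unit_type": (header >> 9) & 0x3F,           # 6 bits
        "nuh_layer_id": (header >> 3) & 0x3F,            # 6 bits
        "nuh_temporal_id_plus1": header & 0x07,          # 3 bits
    }

# Sample header bytes chosen by hand for the example.
print(parse_h264_nal_header(0x67))
print(parse_hevc_nal_header(0x40, 0x01))
```

In both cases the type field is all a receiver needs in order to decide how to interpret the remaining payload bytes.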
== NAL Units in Byte-Stream Format Use == Some systems require delivery of the entire or partial NAL unit stream as an ordered stream of bytes or bits within which the locations of NAL unit boundaries need to be identifiable from patterns within the coded data itself. For use in such systems, the H.264/AVC and HEVC specifications define a byte stream format. In the byte stream format, each NAL unit is prefixed by a specific pattern of three bytes called a start code prefix. The boundaries of the NAL unit can then be identified by searching the coded data for the unique start code prefix pattern. The use of emulation prevention bytes guarantees that start code prefixes are unique identifiers of the start of a new NAL unit. A small amount of additional data (one byte per video picture) is also added to allow decoders that operate in systems that provide streams of bits without alignment to byte boundaries to recover the necessary alignment from the data in the stream. Additional data can also be inserted in the byte stream format that allows expansion of the amount of data to be sent and can aid in achieving more rapid byte alignment recovery, if desired. == NAL Units in Packet-Transport System Use == In other systems (e.g., IP/RTP systems), the coded data is carried in packets that are framed by the system transport protocol, and identification of the boundaries of NAL units within the packets can be established without use of start code prefix patterns. In such systems, the inclusion of start code prefixes in the data would be a waste of data carrying capacity, so instead the NAL units can be carried in data packets without start code prefixes. == VCL and Non-VCL NAL Units == NAL units are classified into VCL and non-VCL NAL units. VCL NAL units contain the data that represents the values of the samples in the video pictures. 
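The start-code search used by the byte-stream format described above can be sketched in Python as follows. This is a minimal sketch that assumes the whole stream is held in memory; the concrete byte values (the three-byte prefix 0x000001 and the emulation prevention byte 0x03) come from the H.264/AVC and HEVC byte stream definitions, not from the text above, and the function names are hypothetical.

```python
from typing import List

START_CODE_PREFIX = b"\x00\x00\x01"


def split_byte_stream(data: bytes) -> List[bytes]:
    """Return the NAL units (header + payload) found between start code prefixes."""
    units = []
    pos = data.find(START_CODE_PREFIX)
    while pos != -1:
        begin = pos + len(START_CODE_PREFIX)
        nxt = data.find(START_CODE_PREFIX, begin)
        end = len(data) if nxt == -1 else nxt
        # Zero bytes directly before the next prefix are padding, not payload.
        units.append(data[begin:end].rstrip(b"\x00"))
        pos = nxt
    return units


def strip_emulation_prevention(nal: bytes) -> bytes:
    """Drop each 0x03 byte that follows two consecutive zero bytes."""
    out = bytearray()
    zeros = 0
    for b in nal:
        if zeros >= 2 and b == 0x03:
            zeros = 0  # skip the emulation prevention byte itself
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)
```

In a packet-transport system the first function would not be needed, since the transport framing already delimits each NAL unit.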
Non-VCL NAL units contain any associated additional information such as parameter sets (important header data that can apply to a large number of VCL NAL units) and supplemental enhancement information (timing information and other supplemental data that may enhance usability of the decoded video signal but are not necessary for decoding the values of the samples in the video pictures). == Parameter Sets == A parameter set contains shared configuration data that is carried in non-VCL NAL units. Parameter sets are typically reused when decoding many coded pictures within a video sequence. Each VCL NAL unit references a picture parameter set (PPS), which in turn references a sequence parameter set (SPS). There are two types of parameter sets: Sequence parameter set (SPS), which specifies mostly constant configuration such as resolution, bit depth, or chroma format. (For a concrete implementation, see FFmpeg's SPS struct.) Picture parameter set (PPS), which applies on top of an SPS, and specifies configuration such as QP offsets. (For a concrete implementation, see FFmpeg's PPS struct.) The sequence and picture parameter-set mechanism decouples the transmission of infrequently changing information from the transmission of coded representations of the values of the samples in the video pictures. Each VCL NAL unit contains an identifier that refers to the content of the relevant picture parameter set and each picture parameter set contains an identifier that refers to the content of the relevant sequence parameter set. In this manner, a small amount of data (the identifier) can be used to refer to a larger amount of information (the parameter set) without repeating that information within each VCL NAL unit. Sequence and picture parameter sets can be sent well ahead of the VCL NAL units that they apply to, and can be repeated to provide robustness against data loss. 
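The identifier-based indirection described above (slice to PPS to SPS) can be pictured with a small lookup table. The sketch below is purely illustrative: the class names, the fields, and the `resolve` helper are hypothetical and far simpler than the parameter sets defined by the standards.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SequenceParameterSet:
    sps_id: int
    width: int       # hypothetical subset of the mostly constant configuration
    height: int
    bit_depth: int


@dataclass(frozen=True)
class PictureParameterSet:
    pps_id: int
    sps_id: int      # identifier of the SPS this PPS builds on
    qp_offset: int


class ParameterSetStore:
    """Keeps the most recently received parameter sets, keyed by identifier."""

    def __init__(self) -> None:
        self._sps: Dict[int, SequenceParameterSet] = {}
        self._pps: Dict[int, PictureParameterSet] = {}

    def add_sps(self, sps: SequenceParameterSet) -> None:
        self._sps[sps.sps_id] = sps  # a repeated copy simply overwrites

    def add_pps(self, pps: PictureParameterSet) -> None:
        self._pps[pps.pps_id] = pps

    def resolve(self, pps_id: int) -> Tuple[SequenceParameterSet, PictureParameterSet]:
        """Follow the identifier carried by a VCL NAL unit to its PPS and SPS."""
        pps = self._pps[pps_id]
        return self._sps[pps.sps_id], pps
```

A VCL NAL unit then only needs to carry the small `pps_id` value, and parameter sets that arrive again for robustness replace the stored copy without affecting the lookup.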
In some applications, parameter sets may be sent within the channel that carries the VCL NAL units (termed "in-band" transmission). In other applications, it can be advantageous to convey the parameter sets "out-of-band" using a more reliable transport mechanism than the video channel itself. == Access Units == A set of NAL units in a specified form is referred to as an access unit. The decoding of each access unit results in one decoded picture. Each access unit contains a set of VCL NAL units that together compose a primary coded picture. It may also be prefixed with an access unit delimiter to aid in locating the start of the access unit. Some supplemental enhancement information containing data such as picture timing information may also precede the primary coded picture. The primary coded picture consists of a set of VCL NAL units consisting of slices or slice data partitions that represent the samples of the video picture. Following the primary coded picture may be some additional VCL NAL units that contain redundant representations of areas of the same video picture. These are referred to as redundant coded pictures, and are available for use by a decoder in recovering from loss or corruption of the data in the primary coded pictures. Decoders are not required to decode redundant coded pictures if they are present. Finally, if the coded picture is the last picture of a coded video sequence (a sequence of pictures that is independently decodable and uses only one sequence parameter set), an end of sequence NAL unit may be present to indicate the end of the sequence; and if the coded picture is the last coded picture in the entire NAL unit stream, an end of stream NAL unit may be present to

  • Quantum image processing

    Quantum image processing

    Quantum image processing (QIMP) is the use of quantum computing or quantum information processing to create and work with quantum images. Due to some of the properties inherent to quantum computation, notably entanglement and parallelism, it is hoped that QIMP technologies will offer capabilities and performance that surpass their traditional equivalents in terms of computing speed, security, and minimum storage requirements. == Background == A. Y. Vlasov's work in 1997 focused on using a quantum system to recognize orthogonal images. This was followed by efforts using quantum algorithms to search for specific patterns in binary images and detect the posture of certain targets. Notably, more optics-based interpretations of quantum imaging were first demonstrated experimentally and then formalized seven years later. In 2003, Salvador Venegas-Andraca and S. Bose presented Qubit Lattice, the first published general model for storing, processing and retrieving images using quantum systems. Later on, in 2005, Latorre proposed another kind of representation, called the Real Ket, whose purpose was to encode quantum images as a basis for further applications in QIMP. Furthermore, in 2010 Venegas-Andraca and Ball presented a method for storing and retrieving binary geometrical shapes in quantum mechanical systems, in which it is shown that maximally entangled qubits can be used to reconstruct images without using any additional information. Technically, these pioneering efforts and the subsequent studies related to them can be classified into three main groups: quantum-assisted digital image processing (QDIP), whose applications aim at improving digital or classical image processing tasks and applications; optics-based quantum imaging (OQI); and classically inspired quantum image processing (QIMP). A survey of quantum image representation has also been published. 
Furthermore, the recently published book Quantum Image Processing provides a comprehensive introduction to quantum image processing, which focuses on extending conventional image processing tasks to the quantum computing frameworks. It summarizes the available quantum image representations and their operations, reviews the possible quantum image applications and their implementation, and discusses the open questions and future development trends. == Quantum image representations == There are various approaches to quantum image representation, which are usually based on the encoding of color information. A common representation is FRQI (Flexible Representation for Quantum Images), which captures the color and position of every pixel of the image and is defined as $\vert I\rangle = \frac{1}{2^{n}} \sum_{i=0}^{2^{2n}-1} \vert c_{i}\rangle \otimes \vert i\rangle$, where $\vert i\rangle$ is the position and $\vert c_{i}\rangle = \cos\theta_{i}\vert 0\rangle + \sin\theta_{i}\vert 1\rangle$ is the color, with a vector of angles $\theta_{i} \in [0, \pi/2]$. As can be seen, $\vert c_{i}\rangle$ is a regular qubit state of the form $\vert\psi\rangle = \alpha\vert 0\rangle + \beta\vert 1\rangle$, with basis states $\vert 0\rangle = \begin{pmatrix}1\\0\end{pmatrix}$ and $\vert 1\rangle = \begin{pmatrix}0\\1\end{pmatrix}$, as well as amplitudes $\alpha$ and $\beta$ that satisfy $|\alpha|^{2} + |\beta|^{2} = 1$. 
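To make the FRQI definition concrete, the Python sketch below builds the amplitude vector of the state classically from a list of angles, one per pixel. It is a minimal illustration, not a quantum circuit: it assumes a square image with 2^(2n) pixels, it assumes the color qubit is the most significant qubit (so the amplitude of |c⟩|i⟩ sits at index c·2^(2n) + i), and the function name is hypothetical.

```python
import math
from typing import List, Sequence


def frqi_state(angles: Sequence[float]) -> List[float]:
    """Amplitudes of the FRQI state for 2^(2n) pixel angles in [0, pi/2]."""
    num_pixels = len(angles)
    if num_pixels == 0:
        raise ValueError("at least one pixel is required")
    two_n = num_pixels.bit_length() - 1
    if num_pixels != 1 << two_n or two_n % 2 != 0:
        raise ValueError("the number of pixels must be 2^(2n) for an integer n")
    norm = 1.0 / (1 << (two_n // 2))  # the 1 / 2^n prefactor
    state = [0.0] * (2 * num_pixels)
    for i, theta in enumerate(angles):
        if not 0.0 <= theta <= math.pi / 2:
            raise ValueError("each angle must lie in [0, pi/2]")
        state[i] = norm * math.cos(theta)               # amplitude of |0>|i>
        state[num_pixels + i] = norm * math.sin(theta)  # amplitude of |1>|i>
    return state
```

Because cos² θ + sin² θ = 1 for every pixel and there are 2^(2n) pixels, the squared amplitudes sum to 1, which is why the prefactor is 1/2^n.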
Another common representation is MCQI (Multi-Channel Representation for Quantum Images), which encodes the RGB channels in quantum states, following the FRQI definition: $\vert I\rangle = \frac{1}{2^{n+1}} \sum_{i=0}^{2^{2n}-1} \vert C_{RGB}^{i}\rangle \otimes \vert i\rangle$, where $\vert C_{RGB}^{i}\rangle = \cos\theta_{R}^{i}\vert 000\rangle + \cos\theta_{G}^{i}\vert 001\rangle + \cos\theta_{B}^{i}\vert 010\rangle + \sin\theta_{R}^{i}\vert 100\rangle + \sin\theta_{G}^{i}\vert 101\rangle + \sin\theta_{B}^{i}\vert 110\rangle + \cos\theta_{\alpha}\vert 011\rangle + \sin\theta_{\alpha}\vert 111\rangle$. Departing from the angle-based approach of FRQI and MCQI, NEQR (Novel Enhanced Representation for Quantum Images) uses a qubit sequence: a function $f(y,x) = C_{yx}^{q-1} C_{yx}^{q-2} \ldots C_{yx}^{1} C_{yx}^{0}$ encodes the color values of a $2^{n}\times 2^{n}$ image, giving $\vert I\rangle = \frac{1}{2^{n}} \sum_{y=0}^{2^{n}-1} \sum_{x=0}^{2^{n}-1} \vert f(y,x)\rangle \vert yx\rangle$. == Quantum image manipulations == Much of the effort in QIMP has been focused on designing algorithms to manipulate the position and color information encoded using the flexible representation of quantum images (FRQI) and its many variants. 
For instance, FRQI-based fast geometric transformations, including (two-point) swapping, flip, (orthogonal) rotations, and restricted geometric transformations that constrain these operations to a specified area of an image, were initially proposed. More recently, NEQR-based quantum image translation, which maps the position of each picture element in an input image to a new position in an output image, and quantum image scaling, which resizes a quantum image, were discussed. A general form of FRQI-based color transformations was first proposed by means of single-qubit gates such as the X, Z, and H gates. Later, the Multi-Channel Quantum Image-based channel of interest (CoI) operator, which shifts the grayscale value of a preselected color channel, and the channel swapping (CS) operator, which swaps the grayscale values between two channels, were discussed in full. To illustrate the feasibility and capability of QIMP algorithms and applications, researchers often simulate digital image processing tasks on the basis of the existing quantum image representations (QIRs). Using the basic quantum gates and the aforementioned operations, researchers have so far contributed to quantum image feature extraction, quantum image segmentation, quantum image morphology, quantum image comparison, quantum image filtering, quantum image classification, and quantum image stabilization, among others. In particular, QIMP-based security technologies have attracted extensive interest from researchers, as presented in the ensuing discussions. These advancements have led to many applications in the areas of watermarking, encryption, and steganography, which form the core security technologies highlighted in this area. 
In general, the work pursued by researchers in this area is focused on expanding the applicability of QIMP to realize more classical-like digital image processing algorithms, proposing technologies to physically realize QIMP hardware, or simply noting the likely challenges that could impede the realization of some QIMP protocols. == Quantum image transform == By encoding and processing the image information in quantum-mechanical systems, a framework of quantum image processing is presented in which a pure quantum state encodes the image information: the pixel values are encoded in the probability amplitudes and the pixel positions in the computational basis states. Given an image $F = (F_{i,j})_{M\times L}$, where $F_{i,j}$ represents the pixel value at position $(i,j)$ with $i = 1,\dots,M$ and $j = 1,\dots,L$, a vector $\vec{f}$ with $ML$ elements can be formed by letting the first $M$ elements of $\vec{f}$ be the first column of $F$, the next $M$ elements the second column, etc. A large class of image operations is linear, e.g., unitary transformations, convolutions, and linear filtering. In quantum computing, a linear transformation can be represented as $|g\rangle = \hat{U}|f\rangle$ with the input image state $|f\rangle$ and the output image state $|g\rangle$. A unitary transformation can be implemented as a unitary evolution. Some basic and commonly used image transforms (e.g., the Fourier, Hadamard, an

  • Amira (software)

    Amira (software)

    Amira (ah-MEER-ah) is a software platform for visualization, processing, and analysis of 3D and 4D data. It is being actively developed by Thermo Fisher Scientific in collaboration with the Zuse Institute Berlin (ZIB), and commercially distributed by Thermo Fisher Scientific — together with its sister software Avizo. == Overview == Amira is an extendable software system for scientific visualization, data analysis, and presentation of 3D and 4D data. It is used by researchers and engineers in academia and industry. It is a tool for processing, analysis and visualization of data from various modalities; e.g. micro-CT, PET, Ultrasound. It is used in many fields, such as microscopy in biology and materials science, molecular biology, quantum physics, astrophysics, computational fluid dynamics (CFD), finite element modeling (FEM), non-destructive testing (NDT), and many more. One of the key features, besides data visualization, is Amira's set of tools for image segmentation and geometry reconstruction. This allows the user to mark (or segment) structures and regions of interest in 3D image volumes using automatic, semi-automatic, and manual tools. The segmentation can then be used for a variety of subsequent tasks, such as volumetric analysis, density analysis, shape analysis, or the generation of 3D computer models for visualization, numerical simulations, or rapid prototyping or 3D printing. Other key Amira features are multi-planar and volume visualization, image registration, filament tracing, cell separation and analysis, tetrahedral mesh generation, fiber-tracking from diffusion tensor imaging (DTI) data, skeletonization, spatial graph analysis, and stereoscopic rendering of 3D data over multiple displays and immersive virtual reality environments, including CAVEs. As a commercial product Amira requires the purchase of a license or an academic subscription. A time-limited, but full-featured evaluation version is available for download free of charge. 
== History == === 1993–1998: Research software === Amira's roots go back to 1993 and the Department for Scientific Visualization, headed by Hans-Christian Hege at the Zuse Institute Berlin (ZIB). The ZIB is a research institute for mathematics and informatics. The Scientific Visualization department's mission is to help solve computationally and scientifically challenging tasks in medicine, biology, engineering and materials science. For this purpose, it develops algorithms and software for 2D, 3D, and 4D data visualization and visually supported exploration and analysis. At that time, the young visualization group at the ZIB had experience with the extendable, data flow-oriented visualization environments apE, IRIS Explorer, and Advanced Visualization Studio (AVS), but was not satisfied with these products' interactivity, flexibility, and ease of use for non-computer scientists. Therefore, the development of a new software system was started in a research project within a medically oriented, multi-disciplinary collaborative research center. Based on experience that Tobias Höllerer had gained in late 1993 with the new graphics library IRIS Inventor, it was decided to utilize that library. The development of the medical planning system was performed by Detlev Stalling, who later became the chief software architect of Amira. The new software was called "HyperPlan", highlighting its initial target application: a planning system for hyperthermia cancer treatment. The system was developed on Silicon Graphics (SGI) computers, which at the time were the standard workstations used for high-end graphics computing. The software was based on libraries such as OpenGL (originally IRIS GL), Open Inventor (originally IRIS Inventor), and the graphical user interface libraries X11, Motif, and ViewKit. In 1998, X11, Motif, and ViewKit were replaced by the Qt toolkit. 
The HyperPlan framework served as the base for more and more projects at the ZIB and was used by a growing number of researchers in collaborating institutions. The projects included applications in medical image computing, medical visualization, neurobiology, confocal microscopy, flow visualization, molecular analytics and computational astrophysics. === 1998–today: Commercially supported product === The growing number of users of the system started to exceed the capacities that ZIB could spare for software distribution and support, as ZIB's primary mission was algorithmic research. Therefore, the spin-off company Indeed – Visual Concepts GmbH was founded by Hans-Christian Hege, Detlev Stalling, and Malte Westerhoff. In February 1998 the HyperPlan software was given the new, application-neutral name "Amira". This name is not an acronym, but was chosen for being pronounceable in different languages and providing a suitable connotation, namely "to look at" or "to wonder at", from the Latin verb "admirare" (to admire), which reflects a basic situation in data visualization. A major redesign of the software was undertaken by Detlev Stalling and Malte Westerhoff in order to make it a commercially supportable product and to make it available on non-SGI computers as well. In March 1999, the first version of the commercial Amira was exhibited at the CeBIT tradeshow in Hannover, Germany on SGI IRIX and Hewlett-Packard UniX (HP-UX) booths. Versions for Linux and Microsoft Windows followed within twelve months, and Mac OS X support was added later. Indeed – Visual Concepts GmbH selected TGS, Inc., a company based in Bordeaux, France and San Diego, United States, as the worldwide distributor for Amira and completed five major releases (up to version 3.1) in the subsequent four years. In 2003 both Indeed – Visual Concepts GmbH and TGS, Inc. were acquired by Massachusetts-based Mercury Computer Systems, Inc. 
(NASDAQ:MRCY) and became part of Mercury's newly formed life sciences business unit, later branded Visage Imaging. In 2009, Mercury Computer Systems, Inc. spun off Visage Imaging again and sold it to Melbourne, Australia based Promedicus Ltd (ASX:PME), a provider of radiology information systems and medical IT solutions. During this time, Amira continued to be developed in Berlin, Germany, in close collaboration with the ZIB and still headed by the original creators of Amira. TGS, located in Bordeaux, France, was sold by Mercury Computer Systems to a French investor and renamed Visualization Sciences Group (VSG). VSG continued the work on a complementary product named Avizo, based on the same source code but customized for materials science. In August 2012, FEI, until then the largest OEM reseller of Amira, purchased VSG and the Amira business from Promedicus. This brought the two sister products Amira and Avizo back under one owner. In August 2013, Visualization Sciences Group (VSG) became a business unit of FEI. In 2016 FEI was bought by Thermo Fisher Scientific, and it became part of its Materials & Structural Analysis division in early 2017. Amira and Avizo are still marketed as two different products, Amira for life sciences and Avizo for materials science, but the development efforts are now joined once again. The number of scientific articles using the Amira/Avizo software is now on the order of tens of thousands. 
== Amira options == === Microscopy option === Specific readers for microscopy data; image deconvolution; exploration of 3D imagery obtained from virtually any microscope; extraction and editing of filament networks from microscopy images. === DICOM reader === Import of clinical and preclinical data in DICOM format. === Mesh option === Generation of 3D finite element (FE) meshes from segmented image data; support for many state-of-the-art FE solver formats; high-quality visualization of simulation mesh-based results, using scalar, vector, and tensor field display modules. === Skeletonization option === Reconstruction and analysis of neural and vascular networks; visualization of skeletonized networks; length and diameter quantification of network segments; ordering of segments in a tree graph; skeletonization of very large image stacks. === Molecular option === Advanced tools for the visualization of molecule models; hardware-accelerated volume rendering; powerful molecule editor; specific tools for complex molecular visualization. === Developer option === Creation of new custom components for visualizing or data processing; implementation of new file readers or writers; C++ programming language; development wizard for getting started quickly. === Neuro option === Medical image analysis for DTI and brain perfusion; fiber tracking supporting several stream-line based algorithms; fiber separation into fiber bundles based on user-defined source and destination regions; computation of tensor fields and diffusion-weighted maps; eigenvalue decomposition of tensor fields; computation of mean transit time, cerebral blood flow, and cerebral blood volume. === VR option === Visualization of data on large tiled displays

  • ACLU Mobile Justice

    ACLU Mobile Justice

    ACLU Mobile Justice was a video live streaming application developed for smartphones by various state chapters of the American Civil Liberties Union. It was intended to allow instant, secure video recording and transmission of interactions with, and perceived abuses by, law enforcement officers. Since its release by the ACLU of California for California residents, other versions of the app have been released for 16 other states and the District of Columbia by their ACLU chapters. It was discontinued in February 2025.

  • Edits (app)

    Edits (app)

    Edits is an American photo and short-form video editing app owned by Meta Platforms. It allows users to create and edit videos using features such as green screens and AI animation, and it provides real-time statistics that let Instagram creators track their accounts. Accounts can be imported directly from Instagram, and videos can be exported back to it. It is available solely on iOS and Android. On iOS, it supports more than 32 languages, including French, Spanish, and Chinese. Critics have described it as a direct competitor to apps like CapCut, which is owned by the Chinese company ByteDance. The head of Instagram, Adam Mosseri, has also acknowledged these similarities. The app launched on April 22 for both iOS and Android, and it gained over 5 million users across the two platforms in the first 4 days after launch. == History == On January 19, 2025, following the ban of all ByteDance apps from the Google Play Store and App Store, Instagram head Adam Mosseri announced on Threads that the app would launch in February for iOS, followed by an Android counterpart. He said the app was being tested with select people. In a separate post, he emphasized that the app is "more for creators than casual video makers". == Features == Edits shares many features with competing video editors such as KineMaster, InShot, and CapCut. When creating a video, users can export in HD, 2K, or 4K resolution, with both HDR and SDR support. Like much traditional video editing software, it includes a timeline and basic undo and redo buttons. The bottom bar holds the editing tabs, namely Split, Volume, Adjust, Speed, Delete, Filters, Green Screen, Voice FX, Extract Audio, Mirror, Slip, Replace, and Duplicate. Basic features, such as splitting clips and adjusting their speed and volume, are present, along with more advanced green screen and AI features. 
Being a mobile video editing app, Edits also has drag-and-drop features for ease of use. Users can record videos directly within the app, which allows them to create content without needing extra software or devices. They can choose from several focal lengths, which affect how close or wide the shot appears. The app also supports different frame rates. Once users are done filming their clips, they can transfer them into a project to start editing immediately. Upcoming features for the app include keyframes, AI-powered modification, collaboration, and enhanced creativity. == Reception == The app received over 5 million downloads in the 4 days following its release. Critically, the app was ranked highly by many. Users gave the app an average of 4.45 stars across the Google Play Store and App Store in the first few days, with the Google Play Store rating being the lower of the two. Public reviews were mixed. Many people praised the smoothness and intuitiveness of the app. "The app is more than just a basic editor, offering a full suite of creative tools, including a dedicated tab for inspiration and trending audio, as well as a tab for managing drafts," said a blogger. Some users were disappointed with the range of editing tools, noting that the app could benefit from more transition options between clips. Some even reported crashes between clips.

  • Sensory, Inc.

    Sensory, Inc.

    Sensory, Inc. is an American company that develops AI software technologies for speech, sound and vision. It is based in Santa Clara, California. Sensory’s technologies have shipped in over three billion products from hundreds of consumer electronics manufacturers, including AT&T, Hasbro, Huawei, Google, Amazon, Samsung, LG, Mattel, Motorola, Plantronics, GoPro, Sony, Tencent, Garmin, Microsoft, and Lenovo. Sensory has over 60 issued patents covering speech recognition in consumer electronics, biometric authentication, sensor/speech combinations, wake word technology, and more. == History == Sensory, Inc. was founded in 1994, originally as Sensory Circuits, by Forrest Mozer, Mike Mozer and Todd Mozer. The three had also co-founded ESS Technology years earlier. In 1999 Sensory acquired Fluent Speech Technologies, which was formed by a group of professors from the Oregon Graduate Institute (formerly OGI, now OHSU). Fluent Speech Technologies developed high-performance embedded speech engines; the technology from this acquisition is now the core technology used throughout Sensory's chip and software line. === Company timeline ===
1994 – Founded
1995 – Introduces the RSC-164, the first commercially successful speech recognition IC
1998 – Introduces first speaker verification IC
2000 – Acquires Oregon-based Fluent Speech Technologies
2002 – Acquires Texas Instruments line of speech output ICs (the SC series)
2007 – Introduces BlueGenie, the first Voice User Interface for Bluetooth silicon (CSR BC-5)
2008 – Sensory and BlueAnt partner on the V1, a new Bluetooth headset with a voice user interface. First wearable to use a voice user interface for control and best-reviewed speech recognition product in history
2009 – Introduced world's smallest text to speech system (TTS) and TrulyHandsfree triggers/wake words.
2010 – Introduced the NLP-5x, the first Natural Language Voice Processor, and TrulyHandsfree wake words in SDKs for Android, iOS, Linux, and Windows. The NLP-5x used the first generation of TrulyHandsfree wake words with low power and enhanced accuracy.
2011 – Sensory partners with Google and Microsoft to enable TrulyHandsfree as a front end to Goog411 and Bing411
2012 – Partnered with Tensilica to offer ultra-low power TrulyHandsfree wake words; introduced Speaker Verification and Speaker Identification for mobile phones and other consumer electronics.
2012 – TrulyHandsfree released in Samsung's Galaxy S2 for the "Hey Galaxy" wake word
2013 – TrulyHandsfree wake words migrated to many new platforms and began shipping as MotoVoice in the Google-owned MotoX. Sensory's TrulyHandsfree in mobile takes off with the Galaxy S3 and S4 and Galaxy Note and is licensed into wearables like Google Glass.
2014 – Announced new initiative in Vision; added LG and Motorola as customers; received the 2014 Global Mobile Award for Best Mobile Technology Breakthrough at the GSMA Mobile World Congress in Barcelona, Spain (judges commented, "A big advance for the wearables market, this offers many benefits for consumers, increasing uptake and usage of many mobile apps, driving revenue for operators and content providers.")
2015–2018 – Licensed TrulyHandsfree wake words to Google, Amazon, Microsoft, Baidu, Huawei, ZTE, and many others. Sensory develops first wake words for OK Google, Hey Siri, and Hey Cortana.
2019 – Sensory launched two new solutions: SoundID, sound identification, and TrulyNatural, embedded large vocabulary speech recognition. Sensory also acquired Vocalize.ai, an independent testing lab.
2020 – Sensory introduced VoiceHub, which allows the automated generation of wake words.
2021 – Sensory expands VoiceHub with speech recognition and NLU capabilities. The company initiated a new cloud platform, SensoryCloud.ai.
2022 – Sensory rolls out SensoryCloud.ai with speech to text, text to speech, and face and voice biometrics
2024 – Sensory Automotive and TrulyNatural Speech-to-text On-Device launched
== Technology and products == Sensory originally developed both hardware (integrated circuits, also called ICs or "chips") and software platforms, but migrated to software only around 2005 and added cloud and hybrid computing capabilities in 2021. Sensory's RSC-164 IC was used in the Mars Microphone on NASA's Mars Polar Lander. The SC-6x speech synthesis chips are based on speech synthesis technology acquired from Texas Instruments. Sensory’s embedded AI solutions include the following: TrulyHandsfree (THF): wake word detection and phrase spotting. TrulyNatural (TNL): large vocabulary continuous speech recognition with NLU. TrulySecure (TS): face and voice biometrics. TrulySecureSpeakerVerification (TSSV): speaker and sound identification. VoiceHub: online portal for creating custom wake words and speech recognition models with NLU. Sensory Automotive: a full voice and vision suite of AI technologies that operate efficiently in the car without connecting to a network. The cloud initiative, SensoryCloud.ai, targets Speech To Text (STT), Text To Speech (TTS), wake word verification, face and voice recognition, and sound identification.

    Read more →
  • Security and Privacy in Computer Systems

    Security and Privacy in Computer Systems

    Security and Privacy in Computer Systems is a paper by Willis Ware that was first presented to the public at the 1967 Spring Joint Computer Conference. == Significance == Ware's presentation was the first public conference session about information security and privacy in respect of computer systems, especially networked or remotely-accessed ones. The IEEE Annals of the History of Computing said that Ware's 1967 Spring Joint Computer Conference session, together with 1970's Ware report, marked the start of the field of computer security.

    Read more →
  • Spectral shape analysis

    Spectral shape analysis

    Spectral shape analysis relies on the spectrum (eigenvalues and/or eigenfunctions) of the Laplace–Beltrami operator to compare and analyze geometric shapes. Since the spectrum of the Laplace–Beltrami operator is invariant under isometries, it is well suited for the analysis or retrieval of non-rigid shapes, i.e. bendable objects such as humans, animals, plants, etc. == Laplace == The Laplace–Beltrami operator is involved in many important differential equations, such as the heat equation and the wave equation. It can be defined on a Riemannian manifold as the divergence of the gradient of a real-valued function $f$: $\Delta f := \operatorname{div} \operatorname{grad} f$. Its spectral components can be computed by solving the Helmholtz equation (or Laplacian eigenvalue problem): $\Delta \varphi_i + \lambda_i \varphi_i = 0$. The solutions are the eigenfunctions $\varphi_i$ (modes) and corresponding eigenvalues $\lambda_i$, which form a diverging sequence of non-negative real numbers. The first eigenvalue is zero for closed domains or when using the Neumann boundary condition. For some shapes, the spectrum can be computed analytically (e.g. rectangle, flat torus, cylinder, disk or sphere). For the sphere, for example, the eigenfunctions are the spherical harmonics. The most important property of the eigenvalues and eigenfunctions is that they are isometry invariants. In other words, if the shape is not stretched (e.g. a sheet of paper bent into the third dimension), the spectral values will not change. Bendable objects, like animals, plants and humans, can move into different body postures with only minimal stretching at the joints. The resulting shapes are called near-isometric and can be compared using spectral shape analysis.
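As a small illustration of an analytically known spectrum, the sketch below lists the first Neumann eigenvalues of a rectangle using the closed form $\lambda_{m,n} = \pi^2 (m^2/a^2 + n^2/b^2)$ with $m, n \ge 0$. The function names are hypothetical, and multiplying by the area is only one simple way to remove the effect of uniform scaling; it is an assumption made here for illustration, not a normalization prescribed by the text.

```python
import math

def rectangle_neumann_eigenvalues(a, b, count):
    """Return the `count` smallest Laplace eigenvalues of an a-by-b rectangle
    with Neumann boundary conditions, in ascending order.

    Closed form: lambda_{m,n} = pi^2 * (m^2 / a^2 + n^2 / b^2), m, n >= 0.
    """
    # Indices 0..count on each axis are enough to contain the `count` smallest values.
    values = [
        math.pi ** 2 * ((m / a) ** 2 + (n / b) ** 2)
        for m in range(count + 1)
        for n in range(count + 1)
    ]
    return sorted(values)[:count]

def area_normalized_spectrum(a, b, count):
    """Eigenvalues multiplied by the area (illustrative normalization, an assumption)."""
    area = a * b
    return [lam * area for lam in rectangle_neumann_eigenvalues(a, b, count)]

# The sequence starts at zero (Neumann boundary condition) and grows without bound.
spectrum = rectangle_neumann_eigenvalues(1.0, 2.0, 6)

# The same shape at three times the size: raw eigenvalues shrink by a factor of 9,
# while the area-normalized values agree up to rounding.
small = area_normalized_spectrum(1.0, 2.0, 6)
large = area_normalized_spectrum(3.0, 6.0, 6)
```

Comparing `small` and `large` element by element (for example with `math.isclose`) shows that the two rectangles share the same normalized sequence even though their raw eigenvalues differ.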
== Discretizations == Geometric shapes are often represented as 2D curved surfaces, 2D surface meshes (usually triangle meshes) or 3D solid objects (e.g. using voxels or tetrahedral meshes). The Helmholtz equation can be solved for all these cases. If a boundary exists, e.g. a square, or the volume of any 3D geometric shape, boundary conditions need to be specified. Several discretizations of the Laplace operator exist (see Discrete Laplace operator) for the different types of geometry representations. Many of these operators do not approximate the underlying continuous operator well. == Spectral shape descriptors == === ShapeDNA and its variants === The ShapeDNA is one of the first spectral shape descriptors. It is the normalized beginning sequence of the eigenvalues of the Laplace–Beltrami operator. Its main advantages are its simple representation (a vector of numbers) and comparison, its scale invariance, and, in spite of its simplicity, very good performance for shape retrieval of non-rigid shapes. Competitors of ShapeDNA include the singular values of the Geodesic Distance Matrix (SD-GDM) and the Reduced BiHarmonic Distance Matrix (R-BiHDM). However, the eigenvalues are global descriptors, so ShapeDNA and other global spectral descriptors cannot be used for local or partial shape analysis. === Global point signature (GPS) === The global point signature at a point $x$ is a vector of scaled eigenfunctions of the Laplace–Beltrami operator computed at $x$ (i.e. the spectral embedding of the shape). The GPS is a global feature in the sense that it cannot be used for partial shape matching. === Heat kernel signature (HKS) === The heat kernel signature makes use of the eigen-decomposition of the heat kernel: $h_t(x,y) = \sum_{i=0}^{\infty} \exp(-\lambda_i t)\, \varphi_i(x)\, \varphi_i(y)$.
For each point on the surface, the diagonal of the heat kernel $h_t(x,x)$ is sampled at specific time values $t_j$ and yields a local signature that can also be used for partial matching or symmetry detection. === Wave kernel signature (WKS) === The WKS follows a similar idea to the HKS, replacing the heat equation with the Schrödinger wave equation. === Improved wave kernel signature (IWKS) === The IWKS improves the WKS for non-rigid shape retrieval by introducing a new scaling function to the eigenvalues and aggregating a new curvature term. === Spectral graph wavelet signature (SGWS) === SGWS is a local descriptor that is not only isometry-invariant, but also compact and easy to compute, and it combines the advantages of both band-pass and low-pass filters. An important facet of SGWS is the ability to combine the advantages of WKS and HKS into a single signature, while allowing a multiresolution representation of shapes. == Spectral Matching == The spectral decomposition of the graph Laplacian associated with complex shapes (see Discrete Laplace operator) provides eigenfunctions (modes) which are invariant to isometries. Each vertex on the shape could be uniquely represented with a combination of the eigenmodal values at each point, sometimes called spectral coordinates: $s(x) = (\varphi_1(x), \varphi_2(x), \ldots, \varphi_N(x))$ for vertex $x$. Spectral matching consists of establishing the point correspondences by pairing vertices on different shapes that have the most similar spectral coordinates. Early work focused on sparse correspondences for stereoscopy. Computational efficiency now enables dense correspondences on full meshes, for instance between cortical surfaces.
Spectral matching could also be used for complex non-rigid image registration, which is notably difficult when images have very large deformations. Registration methods based on spectral eigenmodal values capture global shape characteristics, in contrast to conventional non-rigid image registration methods, which are often based on local shape characteristics (e.g., image gradients).
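The heat kernel signature and the spectral coordinates described above can be tried on a toy shape whose spectrum is known in closed form. The sketch below uses an interval with Neumann boundary conditions, where $\lambda_i = (i\pi/L)^2$ and $\varphi_i(x) = \sqrt{2/L}\cos(i\pi x/L)$ for $i \ge 1$. All names, the truncation to 20 modes, and the sample points are assumptions chosen for illustration. The matching step also assumes the eigenfunction signs agree on both shapes, which real meshes do not guarantee.

```python
import math

LENGTH = 1.0     # length of the interval (assumed toy shape)
NUM_MODES = 20   # truncation of the infinite eigen-expansion (assumption)

def eigenvalue(i):
    """i-th Neumann eigenvalue of the interval [0, LENGTH]."""
    return (i * math.pi / LENGTH) ** 2

def eigenfunction(i, x):
    """i-th L2-normalized Neumann eigenfunction evaluated at x."""
    if i == 0:
        return 1.0 / math.sqrt(LENGTH)
    return math.sqrt(2.0 / LENGTH) * math.cos(i * math.pi * x / LENGTH)

def heat_kernel_signature(x, times):
    """Truncated h_t(x, x) = sum_i exp(-lambda_i * t) * phi_i(x)^2, one value per t."""
    return [
        sum(math.exp(-eigenvalue(i) * t) * eigenfunction(i, x) ** 2
            for i in range(NUM_MODES))
        for t in times
    ]

def spectral_coordinates(x, n=3):
    """s(x) = (phi_1(x), ..., phi_n(x)); the constant mode phi_0 is skipped."""
    return [eigenfunction(i, x) for i in range(1, n + 1)]

def match(points_a, points_b):
    """For each point of shape A, return the index of the point of shape B
    whose spectral coordinates are closest (Euclidean distance)."""
    def distance(p, q):
        return math.dist(spectral_coordinates(p), spectral_coordinates(q))
    return [min(range(len(points_b)), key=lambda j: distance(p, points_b[j]))
            for p in points_a]

times = [0.01, 0.05, 0.1]
hks_left = heat_kernel_signature(0.2, times)
hks_right = heat_kernel_signature(0.8, times)   # mirror image of x = 0.2
hks_end = heat_kernel_signature(0.0, times)     # endpoint of the interval

points_a = [0.0, 0.25, 0.5, 0.75, 1.0]
points_b = [0.5, 1.0, 0.0, 0.75, 0.25]          # same samples, different vertex order
correspondence = match(points_a, points_b)
```

Here `hks_left` and `hks_right` agree up to rounding, reflecting the mirror symmetry of the interval, while `hks_end` differs from both. `correspondence` evaluates to `[2, 4, 0, 3, 1]`, recovering the reordering of the sample points.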

    Read more →
  • Mix automation

    Mix automation

    In music recording, mix automation allows the mixing console to remember the mixing engineer's dynamic adjustment of faders during a musical piece in the post-production editing process. A timecode is necessary for the synchronization of automation. Modern mixing consoles and digital audio workstations use comprehensive mix automation. The need for automated mixing originated from the late 1970s transition from 8-track to 16-track and then 24-track multitrack recording, as mixing could be laborious and require multiple people and hands, and the results could be almost impossible to reproduce. With 48-track recording, which used synchronized twin 24-track recorders (for a net 46 audio tracks, with one track on each machine reserved for SMPTE timecode), came larger recording and mixing consoles with even more channel faders to manage during mixdown. Manufacturers such as Neve Electronics (now AMS Neve) and Solid State Logic (SSL), both English companies, developed systems that enabled one engineer to oversee every detail of a complex mix, although the computers required to power these desks remained a rarity into the late 1970s. According to record producer Roy Thomas Baker, Queen's 1975 single "Bohemian Rhapsody" was one of the first mixes to be done with automation. == Types == Voltage Controlled Automation: fader levels are regulated by voltage-controlled amplifiers (VCAs). VCAs control the audio level and not the actual fader. Moving Fader Automation: a motor is attached to the fader, which can then be controlled by the console, digital audio workstation (DAW), or user. Software Controlled Automation: the software can be internal to the console, or external as part of a DAW. The virtual fader can be adjusted in the software by the user. MIDI Automation: the communications protocol MIDI can be used to send messages to the console to control automation.
== Modes == Auto Write: used the first time automation is created or when writing over existing automation. Auto Touch: writes automation data only while a fader is touched; faders return to any previously automated position after release. Auto Latch: starts writing automation data when a fader is touched; the fader stays in position after release. Auto Read: the digital audio workstation performs the written automation. Auto Off: automation is temporarily disabled. All of these include the mute button. If mute is pressed during writing of automation, the audio track will be muted during playback of that automation. Depending on software, other parameters such as panning, sends, and plug-in controls can be automated as well. In some cases, automation can be written using a digital potentiometer instead of a fader.
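The difference between the Touch, Latch, and Read modes described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not how any particular console or DAW implements automation. The function name, the per-step list representation, and the use of `None` for "fader not touched" are all hypothetical.

```python
def run_automation_pass(existing, touches, mode):
    """Return the fader level at each time step after one automation pass.

    existing: previously written fader level for each time step
    touches:  level set by the engineer at each step, or None while untouched
    mode:     "read", "touch", or "latch" (other modes are not modeled here)
    """
    if mode not in ("read", "touch", "latch"):
        raise ValueError(f"unsupported mode: {mode}")
    if mode == "read":
        # Read: play back the written automation unchanged.
        return list(existing)

    result = []
    latched = None  # last level written by the engineer (used by latch mode)
    for old_level, touched_level in zip(existing, touches):
        if touched_level is not None:
            # Both touch and latch write while the fader is being held.
            result.append(touched_level)
            latched = touched_level
        elif mode == "latch" and latched is not None:
            # Latch: after release, keep writing the last held position.
            result.append(latched)
        else:
            # Touch (or latch before the first touch): fall back to existing data.
            result.append(old_level)
    return result

existing = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
touches = [None, 0.8, 0.9, None, None, None]   # fader held during steps 1 and 2

touch_result = run_automation_pass(existing, touches, "touch")
latch_result = run_automation_pass(existing, touches, "latch")
read_result = run_automation_pass(existing, touches, "read")
```

With this input, `touch_result` is `[0.5, 0.8, 0.9, 0.5, 0.5, 0.5]` (the fader returns to the earlier automation after release), while `latch_result` is `[0.5, 0.8, 0.9, 0.9, 0.9, 0.9]` (the fader stays where it was released).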

    Read more →