AI Chat Prompt Generator

AI Chat Prompt Generator — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • WebGPU Shading Language

    WebGPU Shading Language

    WebGPU Shading Language (WGSL, internet media type: text/wgsl) is a high-level shading language and the normative shader language for the WebGPU API on the web. WGSL's syntax is influenced by Rust and is designed with strong static validation, explicit resource binding, and portability in mind for secure execution in browsers. In web contexts, WebGPU implementations accept WGSL source and perform compilation to platform-specific intermediate forms (for example, to SPIR‑V, DXIL, or MSL via the user agent), but such backends are not exposed to web content. == History and background == Graphics on the web historically used WebGL, with shaders written in GLSL ES. As applications demanded more modern GPU features and finer control over compute and graphics pipelines, the W3C's GPU for the Web Community Group and Working Group created WebGPU and its companion shading language, WGSL, to provide a secure, portable model suitable for the web platform. WGSL was developed to be human-readable, avoid undefined behavior common in legacy shading languages, and align closely with WebGPU's resource and validation model. == Design goals == WGSL's design emphasizes: Safety and determinism suitable for web security constraints (extensive static validation and well-defined semantics). Portability across diverse GPU backends via an abstract resource model shared with WebGPU. Readability and explicitness (no preprocessor, minimal implicit conversions, explicit address spaces and bindings). Alignment with modern GPU features (compute, storage buffers, textures, atomics) while retaining a familiar C/Rust-like syntax. == Language overview == === Types and values === Core scalar types include bool, i32, u32, and f32. Vectors (e.g., vec2, vec3, vec4) and matrices (up to 4×4) are available for floating-point element types. Optional f16 (half precision) may be enabled via a WebGPU feature; availability is implementation-dependent. Atomic types (atomic, atomic) support limited atomic operations in qualified address spaces. === Variables and address spaces === Variables are declared with let (immutable), var (mutable), or const (compile-time constant). Storage classes (address spaces) include function, private, workgroup, uniform, and storage with read or read_write access as applicable. WGSL defines explicit layout and alignment rules; attributes such as @align, @size, and @stride control data layout for buffer interoperability. === Functions and control flow === Functions use explicit parameter and return types. Control flow includes if, switch, for, while, and loop constructs, with break/continue. Recursion is disallowed; entry-point call graphs must be acyclic. === Entry points and attributes === Shaders define stage entry points with @vertex, @fragment, or @compute. Attributes annotate bindings and interfaces, including @group, @binding (resource binding), @location (user-defined I/O), @builtin (stage built-ins such as position or global_invocation_id), @interpolate, and @workgroup_size. === Resources === WGSL exposes buffers (uniform, storage), textures (sampled, storage, and multisampled variants), and samplers (filtering/non-filtering/comparison). The binding model is explicit via descriptor sets called groups and bindings, matching WebGPU's pipeline layout model. == Compilation and validation == Browsers compile WGSL to platform-appropriate representations and native driver formats; the specific compilation pipeline is not observable by web content. WGSL source undergoes strict parsing and static validation, and WebGPU enforces robust resource access rules to avoid out-of-bounds memory hazards, contributing to predictable behavior across implementations. == Shader stages == WGSL supports three pipeline stages: vertex, fragment, and compute. === Vertex shaders === Vertex shaders transform per-vertex inputs and produce values for rasterization, including a clip-space position written to the position builtin. ==== Example ==== === Fragment shaders === Fragment shaders run per-fragment and compute color (and optionally depth) outputs written to color attachments. ==== Example ==== If half-precision (vec4h, shorthand for vec4) is desired, the code must be prefaced with a enable f16; statement. === Compute shaders === Compute shaders run in workgroups and are used for general-purpose GPU computations. ==== Example ==== == Differences from GLSL and HLSL == Compared with legacy shading languages, WGSL: Omits a preprocessor and requires explicit types and conversions. Uses explicit address spaces and binding annotations aligned with WebGPU's model. Enforces strict validation to avoid undefined behavior common in other shading languages. Defines a portable, web-focused feature set; 16-bit types and other features are opt-in and may depend on device capabilities.

    Read more →
  • Boris FX

    Boris FX

    Boris FX is a visual effects, video editing, photography, and audio software plug-in developer based in Miami, Florida, USA. The developer is known for its flagship products, Continuum (formerly Boris Continuum Complete/BCC), Sapphire, Mocha, and Silhouette. Boris FX creates plug-in tools for feature film, broadcast television, and multimedia post-production workflows. The plug-ins are compatible with various NLEs, including Adobe After Effects and Premiere Pro, Avid Media Composer, Apple Final Cut Pro, and OFX hosts such as Autodesk Flame, Foundry Nuke, Blackmagic Design DaVinci Resolve and Fusion, and VEGAS Pro. Boris FX has incorporated artificial intelligence into its software, introducing features for noise reduction, rotoscoping, upscaling, and masking. The company has acquired technologies via mergers and acquisitions from Imagineer Systems, GenArts, Silhouette FX, Digital Film Tools, CrumplePop and Andersson Technologies to expand its visual effects, editing, photography, and audio tools. == History == Boris FX was founded in 1995 by Boris Yamnitsky. The former Media 100 engineer (a member of the original Media 100 launch team in 1993) released “Boris FX,” the first plug-in-based digital video effects (DVE) for Adobe Premiere and Media 100, in 1995. The plug-in won Best of Show at Apple Macworld in Boston, MA that same year. The Boris FX Suite includes a range of visual effects and post-production tools, such as Sapphire, Continuum, Mocha Pro, Silhouette, SynthEyes, CrumplePop, Optics, and Particle Illusion. == Media 100 == In October 2005, Yamnitsky acquired Media 100 the company that launched his plug-in career. Boris FX had a long relationship with Media 100 which bundled Boris RED software as its main titling and compositing solution. Media 100's video editing software is available as freeware for macOS. == Continuum == Continuum is a visual effect and compositing plugin suite that includes a library of over 300 effects and more than 40 transitions, including tools for image restoration, compositing, titling, particle generation, and stylized effects, along with features such as lens flares, lighting effects, and cinematic color grading presets. A key component of Continuum is its integration with the Mocha planar tracking and masking system, enabling advanced tracking and rotoscoping within the effects. The suite also includes Particle Illusion, a real-time particle generator used for creating visual effects such as explosions, smoke, and abstract motion graphics, as well as Primatte Studio, a chroma keying and compositing toolset for green screen and blue screen workflows. Continuum supports GPU acceleration and offers compatibility with HDR and 360/VR content. Regular updates introduce new effects, presets, and performance enhancements to expand its capabilities. In October 2018, Continuum relaunched Particle Illusion, a Mocha Essentials workflow with magnetic edge-snapping, and updates to Title Studio. In October 2019, Continuum introduced Corner Pin Studio with built-in Mocha tracking for quick screen replacement and inserts, 6 stylized transitions, and 4 creative effects. In October 2020, Continuum released an update that included over 80 GPU-accelerated effects such as film stocks, color grades, optical filter simulations, and a digital gobo library. The update also introduced a custom FX Editor interface, real-time particles, and more than 1,000 drag-and-drop presets. In November 2021, it added multi-frame rendering for After Effects, native Apple M1 support, fluid dynamics in Particle Illusion, and 60 color-grade presets. In October 2022, the software introduced 10 additional transitions, a revised Particle Illusion workflow, an atmospheric glow effect, and more than 250 curated presets. Continuum plugins have been used in television, streaming, and film projects, including A Black Lady Sketch Show (HBO/HBO Max), Star Trek: Discovery (CBS), Andor (Disney+), The Curse of Oak Island (History Channel), Keeping up with the Kardashians (E!), This Old House (PBS), Ms. Marvel (Disney+), MasterChef (Fox), WipeOut (TBS), The Boys (Prime Video), and The Today Show (NBC). == Mocha Pro == In December 2014, Boris FX merged with Imagineer Systems, the UK-based developer of the Academy Award-winning planar motion tracking software, Mocha Pro. Mocha Pro's features include planar tracking (motion tracking), rotoscoping, image stabilization, 3D camera tracking, and object removal. In June 2016, Mocha released (v5) which introduced Mocha Pro's tools as plug-ins for Adobe After Effects and Premiere Pro, Avid Media Composer, and OFX hosts Foundry's NUKE, Blackmagic Design Fusion, VEGAS Pro, and HitFilm. A simplified version, Mocha AE, is included with Adobe After Effects Creative Cloud and has been bundled with the software since CS4. A similar version is also available with HitFilm Pro from FXhome and VEGAS Pro. Mocha's tracking SDK is integrated into other visual effects tools, including SAM Quantel Pablo Rio, Silhouette FX, CoreMelt, and Motion VFX. Mocha Pro has been used in various film and television productions, including Birdman, Black Swan, the Harry Potter series, The Hobbit, Star Wars, The Mandalorian, Star Trek: Discovery, and The Umbrella Academy. It has also been employed in projects such as Gone Girl, The Hunger Games: Mockingjay – Part 1, Game of Thrones, and House of Cards. == Sapphire == GenArts, founded by Karl Sims in 1996, developed visual effects plug-ins that were used by studios and post-production facilities. In September 2016, Boris FX merged with former competitor, GenArts, Inc., developer of Sapphire high-end visual effects plug-ins, to expand its suite of motion graphics and VFX tools. The merger brought Sapphire alongside Boris Continuum Complete (BCC) and Mocha Pro, integrating these tools for film and television post-production. The Sapphire suite includes a library of over 270 effects and transitions, organized into categories such as lighting, stylization, distortions, textures, and transitions. Commonly used effects include glows, lens flares, film looks, and blurs. The plug-ins are designed to be GPU-accelerated, allowing for improved rendering performance and real-time previews in supported host applications. A central feature of Sapphire is the Builder tool, a node-based workspace that allows users to create custom effects and transitions by combining multiple Sapphire plug-ins. This enables a high level of creative flexibility and reusability, making it a popular tool for both editors and VFX artists. Sapphire also integrates with Mocha, Boris FX's planar tracking and masking system, allowing for advanced control of visual elements within an effect. In October 2017, Boris FX released its first new version of Sapphire since the GenArts acquisition. Sapphire (v11) now includes integrated Mocha tracking and masking tools. Sapphire is available for Adobe, Avid, the Autodesk Flame family, and OFX hosts including Blackmagic DaVinci Resolve and Fusion, and Foundry's NUKE. As part of the merger, Boris FX acquired the rights to Particle Illusion. In 2018, Boris FX reintroduced the product to the larger NLE/Compositing market. Sapphire's plug-ins transitioned from C to C++ to improve performance and support higher-resolution visual effects. This update enhanced floating-point calculations, compatibility with film editing APIs, and integration with NVIDIA's CUDA for faster rendering. The plug-ins have been used in various films, including Avatar, the Harry Potter and the Prisoner of Azkaban, Iron Man, The Lord of the Rings, The Matrix trilogy, Titanic, and X-Men. == Particle Illusion == As part of the merger with GenArts in 2016, Boris FX acquired the rights to the Particle Illusion (formerly particleIllusion) product, a storied particle system from the original developer Alan Lorence, the founder of Wondertouch. In 2018, Boris FX released a redesigned version of the product to a larger NLE/compositing market as part of Continuum (2019). The new Particle Illusion plug-in supports Adobe, Avid, and many OFX hosts. == Silhouette == In September 2019, Boris FX merged with SilhouetteFX, Academy Award-winning developer of Silhouette, a high-end digital paint, advanced rotoscoping, motion tracking, and node-based compositing application for visual effects in film post-production. The acquisition integrated Silhouette's advanced rotoscoping and paint technology, recognized by the Academy of Motion Pictures, into Boris FX's suite of products, alongside Sapphire, Continuum, and Mocha Pro. In May 2021, Boris FX released Silhouette 2021, the first version of Silhouette released by Boris FX to function both as a standalone application and as a plug-in for Adobe, Autodesk, Nuke, and other OFX hosts. Silhouette has been used in the visual effects of films such as Avatar, Avengers: Infinity War, Blade Runner 2049, Ex Machina, and Interstellar. == Optics == In June 2020, Boris FX launched Optics, its first plugin deve

    Read more →
  • DoorDash

    DoorDash

    DoorDash, Inc. is an American company operating online food ordering and food delivery. It trades under the symbol DASH. With a 56% market share, DoorDash is the largest food delivery platform in the United States. It also has a 60% market share in the convenience delivery category. As of December 31, 2020, the platform was used by 450,000 merchants, 20 million consumers, and had over one million delivery couriers. Founded by Tony Xu, Andy Fang, Stanley Tang and Evan Moore, DoorDash made its debut on the Fortune 500 list in 2024, ranking No. 443. DoorDash has been sued for or held legally liable for withholding tips, reducing tip transparency, antitrust price manipulation, listing restaurants without permission, misclassifying workers, withholding sick time, and illegally selling personal data. As of April 2026, DoorDash operates in the United States (including Puerto Rico), Canada, Australia, and New Zealand. Through its subsidiaries Deliveroo and Wolt, the company also operates across Europe, as well as in Azerbaijan, Georgia, Israel, Kazakhstan, Kuwait, and the United Arab Emirates. == History == In January 2013, Stanford University students Tony Xu, Stanley Tang, Andy Fang and Evan Moore launched PaloAltoDelivery.com in Palo Alto, California. In the summer of 2013, it received US$120,000 in seed money from Y Combinator in exchange for a 7% stake. It incorporated as DoorDash in June 2013. DoorDash's first partnership with a fast food burger restaurant chain was in April 2016, when it partnered with CKE Restaurants, parent company of Carl's Jr. and Hardee's, for food delivery. In December 2017, DoorDash announced its partnership with Wendy's for delivery from its restaurants. In December 2018, DoorDash overtook Uber Eats to hold the second position in total US food delivery sales, behind GrubHub. By March 2019, it had exceeded GrubHub in total sales, at 27.6% of the on-demand delivery market. By early 2019, DoorDash was the largest food delivery provider in the U.S., as measured by consumer spending. In October 2019, DoorDash opened its first ghost kitchen, DoorDash Kitchen, in Redwood City, California, with four restaurants operating at the location. By June 2020, DoorDash had raised more than $2.5 billion over several financing rounds from investors including Y Combinator, Charles River Ventures, SV Angel, Khosla Ventures, Sequoia Capital, SoftBank Group, GIC, and Kleiner Perkins. DoorDash announced a partnership with KFC in September 2020, followed by Taco Bell in October 2020. In November 2020, DoorDash announced the opening of its first physical restaurant location, partnering up with Bay Area restaurant Burma Bites to offer delivery and pick-up orders. In December 2020, it became a public company via an initial public offering, raising $3.37 billion. In November 2021, DoorDash acquired Finland's Wolt for €7bn. In August 2022, DoorDash announced it would end its partnership with Walmart in September, ending the companies' cooperation agreement from 2018. In November 2022, DoorDash announced plans to lay off 1,250 corporate employees, or about six percent of its workforce, to rein in expenses. In June 2023, DoorDash announced it would give its drivers the option of earning an hourly minimum wage instead of being paid per delivery. However, drivers are only paid hourly when on an active delivery. In September 2023, the company transferred its stock listing from the New York Stock Exchange to the Nasdaq. On December 18, 2023, DoorDash was added to the Nasdaq-100 index. In March 2025, DoorDash announced a partnership with Klarna, a Buy Now, Pay Later (BNPL) service, letting customers schedule small payments over a set period of time. DoorDash received widespread criticism from this decision, including internet mockery, given concerns about the increase of household debt in America. In 2025, DoorDash acquired the UK-based delivery service Deliveroo for $3.88 billion. The combined company operates in 40 countries and serves 50 million users monthly. In September 2025, DoorDash and Ace Hardware (the largest hardware cooperative) announced their partnership to offer delivery for home use products from over 4,000 Ace locations. == Lawsuits against DoorDash == === 2017 class-action lawsuit for misclassifying workers === In 2017, a class-action lawsuit was filed against DoorDash for allegedly misclassifying delivery drivers in California and Massachusetts as independent contractors. In 2022, a tentative settlement was reached in which DoorDash would pay $100 million total, with $61 million going to over 900,000 drivers, paying out just over $130 per driver, and $28 million for the lawyers. Gizmodo criticized the settlement, noting that the $413 million that DoorDash CEO Tony Xu received the previous year was one of the largest CEO compensation packages of all time. === 2019 data breach lawsuit === On May 4, 2019, DoorDash confirmed 4.9 million customers, delivery workers and merchants had sensitive information stolen via a data breach. Those who joined the platform after April 5, 2018, were unaffected by the breach. A class-action lawsuit for the breach was filed against DoorDash in October 2019. === Withholding of tips and subsequent class-action lawsuits === In July 2019, the company's tipping policy was criticized by The New York Times, and later The Verge and Vox and Gothamist. Drivers receive a guaranteed minimum per order that is paid by DoorDash by default. When a customer added a tip, instead of going directly to the driver, it first went to the company to cover the guaranteed minimum. Drivers then only directly received the part of the tip that exceeded the guaranteed minimum per order. In January 2020, it was reported that DoorDash had lied about skimming tips from its drivers, causing them to earn an average of $1.45 an hour after expenses, and that after the company had allegedly overhauled its tipping system, DoorDash was still manipulating per-delivery payouts at the expense of drivers. A DoorDash customer filed a class action lawsuit against the company for its "materially false and misleading" tipping policy. The case was referred to arbitration in August 2020. Under pressure, the company revised its policy. The company settled a lawsuit with District of Columbia Attorney General Karl Racine for $2.5 million, with funds going to deliverers, the government, and to charity. ==== 2021 driver strike for tip transparency ==== In July 2021, DoorDash drivers went on strike to protest lack of tip transparency and to ask for higher pay. At the time of the strike, and, as of June 2022, DoorDash did not allow drivers to see the full tip amounts prior to accepting a delivery in the app. If customers tip over a set amount for the order total, Doordash hides a portion of the tip until the delivery is complete. The strike occurred after DoorDash rewrote its code to cut off access to Para, a third-party app that drivers had been using to see the full tip amounts. ==== 2025 class-action lawsuit settlement ==== In 2025, DoorDash agreed to pay around $17 million for "misleading both consumers and delivery workers" with tips being docked from drivers' pay instead of directly going to drivers. === 2020 antitrust litigation === In April 2020, in the case of Davitashvili v. GrubHub Inc. DoorDash, Grubhub, Postmates, and Uber Eats were accused of monopolistic power by only listing restaurants on its apps if the restaurant owners signed contracts which include clauses that require prices be the same for dine-in customers as for customers receiving delivery. The plaintiffs stated that this arrangement increases the cost for dine-in customers, as they are required to subsidize the cost of delivery; and that the apps charge "exorbitant" fees, which range from 13% to 40% of revenue, while the average restaurant's profit ranges from 3% to 9% of revenue. The lawsuit seeks treble damages, including for overcharges, since April 14, 2016, for dine-in and delivery customers in the United States at restaurants using the defendants’ delivery apps. Although several preliminary documents in the case have now been filed, a trial date has not yet been set. === Litigation for illegal unauthorized restaurant listing === In May 2021, DoorDash was criticized for unauthorized listings of restaurants who had not given permission to appear on the app. The company was sued by Lona's Lil Eats in St. Louis, with the lawsuit claiming that DoorDash had listed them without permission, then prevented any orders to the restaurant from going through and redirecting customers to other restaurants instead, because Lona's was "too far away," when in reality it had not paid DoorDash a fee for listing. This aspect of DoorDash's business practice is illegal in California. === 2021 lawsuit by the city of Chicago === In August 2021, the city of Chicago sued DoorDash and GrubHub. According to Chicago mayor Lori Lightfoot, the companies broke the law by using "unfair and deceptive t

    Read more →
  • Perceptual robotics

    Perceptual robotics

    Perceptual robotics is an interdisciplinary science linking Robotics and Neuroscience. It investigates biologically motivated robot control strategies, concentrating on perceptual rather than cognitive processes and thereby sides with J. J. Gibson's view against the Poverty of the stimulus theory. As a working definition, the following quote from Chapter 64 by H. Bülthoff, C. Wallraven and M. Giese from The Springer Handbook of Robotics, edited by Bruno Siciliano and Oussama Khatib, published by Springer in 2007, could be used: In the following we will apply the term Perceptual Robotics to signify the design of robots based on principles that are derived from human perception on all three levels in the sense of Marr. This includes a realization in terms of specific neural circuits as well as the transfer of more abstract biologically-inspired strategies for the solution of relevant computational problems.

    Read more →
  • Circular convolution

    Circular convolution

    Circular convolution, also known as cyclic convolution, is a special case of periodic convolution, which is the convolution of two periodic functions that have the same period. Periodic convolution arises, for example, in the context of the discrete-time Fourier transform (DTFT). In particular, the DTFT of the product of two discrete sequences is the periodic convolution of the DTFTs of the individual sequences. And each DTFT is a periodic summation of a continuous Fourier transform function (see Discrete-time Fourier transform § Relation to Fourier Transform). Although DTFTs are usually continuous functions of frequency, the concepts of periodic and circular convolution are also directly applicable to discrete sequences of data. In that context, circular convolution plays an important role in maximizing the efficiency of a certain kind of common filtering operation. == Definitions == The periodic convolution of two T-periodic functions, h T ( t ) {\displaystyle h_{_{T}}(t)} and x T ( t ) {\displaystyle x_{_{T}}(t)} can be defined as: ∫ t o t o + T h T ( τ ) ⋅ x T ( t − τ ) d τ , {\displaystyle \int _{t_{o}}^{t_{o}+T}h_{_{T}}(\tau )\cdot x_{_{T}}(t-\tau )\,d\tau ,} where t o {\displaystyle t_{o}} is an arbitrary parameter. An alternative definition, in terms of the notation of normal linear or aperiodic convolution, follows from expressing h T ( t ) {\displaystyle h_{_{T}}(t)} and x T ( t ) {\displaystyle x_{_{T}}(t)} as periodic summations of aperiodic components h {\displaystyle h} and x {\displaystyle x} , i.e.: h T ( t ) ≜ ∑ k = − ∞ ∞ h ( t − k T ) = ∑ k = − ∞ ∞ h ( t + k T ) . {\displaystyle h_{_{T}}(t)\ \triangleq \ \sum _{k=-\infty }^{\infty }h(t-kT)=\sum _{k=-\infty }^{\infty }h(t+kT).} Then: Both forms can be called periodic convolution. The term circular convolution arises from the important special case of constraining the non-zero portions of both h {\displaystyle h} and x {\displaystyle x} to the interval [ 0 , T ] . {\displaystyle [0,T].} Then the periodic summation becomes a periodic extension, which can also be expressed as a circular function: x T ( t ) = x ( t m o d T ) , t ∈ R {\displaystyle x_{_{T}}(t)=x(t_{\mathrm {mod} \ T}),\quad t\in \mathbb {R} \,} (any real number) And the limits of integration reduce to the length of function h {\displaystyle h} : ( h ∗ x T ) ( t ) = ∫ 0 T h ( τ ) ⋅ x ( ( t − τ ) m o d T ) d τ . {\displaystyle (hx_{_{T}})(t)=\int _{0}^{T}h(\tau )\cdot x((t-\tau )_{\mathrm {mod} \ T})\ d\tau .} == Discrete sequences == Similarly, for discrete sequences, and a parameter N, we can write a circular convolution of aperiodic functions h {\displaystyle h} and x {\displaystyle x} as: ( h ∗ x N ) [ n ] ≜ ∑ m = − ∞ ∞ h [ m ] ⋅ x N [ n − m ] ⏟ ∑ k = − ∞ ∞ x [ n − m − k N ] {\displaystyle (hx_{_{N}})[n]\ \triangleq \ \sum _{m=-\infty }^{\infty }h[m]\cdot \underbrace {x_{_{N}}[n-m]} _{\sum _{k=-\infty }^{\infty }x[n-m-kN]}} This function is N-periodic. It has at most N unique values. For the special case that the non-zero extent of both x and h are ≤ N, it is reducible to matrix multiplication where the kernel of the integral transform is a circulant matrix. == Example == A case of great practical interest is illustrated in the figure. The duration of the x sequence is N (or less), and the duration of the h sequence is significantly less. Then many of the values of the circular convolution are identical to values of x∗h, which is actually the desired result when the h sequence is a finite impulse response (FIR) filter. Furthermore, the circular convolution is very efficient to compute, using a fast Fourier transform (FFT) algorithm and the circular convolution theorem. There are also methods for dealing with an x sequence that is longer than a practical value for N. The sequence is divided into segments (blocks) and processed piecewise. Then the filtered segments are carefully pieced back together. Edge effects are eliminated by overlapping either the input blocks or the output blocks. To help explain and compare the methods, we discuss them both in the context of an h sequence of length 201 and an FFT size of N = 1024. === Overlapping input blocks === This method uses a block size equal to the FFT size (1024). We describe it first in terms of normal or linear convolution. When a normal convolution is performed on each block, there are start-up and decay transients at the block edges, due to the filter latency (200-samples). Only 824 of the convolution outputs are unaffected by edge effects. The others are discarded, or simply not computed. That would cause gaps in the output if the input blocks are contiguous. The gaps are avoided by overlapping the input blocks by 200 samples. In a sense, 200 elements from each input block are "saved" and carried over to the next block. This method is referred to as overlap-save, although the method we describe next requires a similar "save" with the output samples. When an FFT is used to compute the 824 unaffected DFT samples, we don't have the option of not computing the affected samples, but the leading and trailing edge-effects are overlapped and added because of circular convolution. Consequently, the 1024-point inverse FFT (IFFT) output contains only 200 samples of edge effects (which are discarded) and the 824 unaffected samples (which are kept). To illustrate this, the fourth frame of the figure at right depicts a block that has been periodically (or "circularly") extended, and the fifth frame depicts the individual components of a linear convolution performed on the entire sequence. The edge effects are where the contributions from the extended blocks overlap the contributions from the original block. The last frame is the composite output, and the section colored green represents the unaffected portion. === Overlapping output blocks === This method is known as overlap-add. In our example, it uses contiguous input blocks of size 824 and pads each one with 200 zero-valued samples. Then it overlaps and adds the 1024-element output blocks. Nothing is discarded, but 200 values of each output block must be "saved" for the addition with the next block. Both methods advance only 824 samples per 1024-point IFFT, but overlap-save avoids the initial zero-padding and final addition.

    Read more →
  • Quantum image processing

    Quantum image processing

    Quantum image processing (QIMP) is using quantum computing or quantum information processing to create and work with quantum images. Due to some of the properties inherent to quantum computation, notably entanglement and parallelism, it is hoped that QIMP technologies will offer capabilities and performances that surpass their traditional equivalents, in terms of computing speed, security, and minimum storage requirements. == Background == A. Y. Vlasov's work in 1997 focused on using a quantum system to recognize orthogonal images. This was followed by efforts using quantum algorithms to search specific patterns in binary images and detect the posture of certain targets. Notably, more optics-based interpretations for quantum imaging were initially experimentally demonstrated in and formalized in after seven years. In 2003, Salvador Venegas-Andraca and S. Bose presented Qubit Lattice, the first published general model for storing, processing and retrieving images using quantum systems. Later on, in 2005, Latorre proposed another kind of representation, called the Real Ket, whose purpose was to encode quantum images as a basis for further applications in QIMP. Furthermore, in 2010 Venegas-Andraca and Ball presented a method for storing and retrieving binary geometrical shapes in quantum mechanical systems in which it is shown that maximally entangled qubits can be used to reconstruct images without using any additional information. Technically, these pioneering efforts with the subsequent studies related to them can be classified into three main groups: Quantum-assisted digital image processing (QDIP): These applications aim at improving digital or classical image processing tasks and applications. Optics-based quantum imaging (OQI) Classically inspired quantum image processing (QIMP) A survey of quantum image representation has been published in. Furthermore, the recently published book Quantum Image Processing provides a comprehensive introduction to quantum image processing, which focuses on extending conventional image processing tasks to the quantum computing frameworks. It summarizes the available quantum image representations and their operations, reviews the possible quantum image applications and their implementation, and discusses the open questions and future development trends. == Quantum image representations == There are various approaches for quantum image representation, that are usually based on the encoding of color information. A common representation is FRQI (Flexible Representation for Quantum Images), that captures the color and position at every pixel of the image, and defined as: | I ⟩ = 1 2 n ∑ i = 0 2 2 n − 1 | c i ⟩ ⊗ | i ⟩ {\displaystyle \vert I\rangle ={\frac {1}{2^{n}}}\sum _{i=0}^{2^{2n-1}}\vert c_{i}\rangle \otimes \vert i\rangle } where | i ⟩ {\textstyle |i\rangle } is the position and | c i ⟩ = c o s θ i | 0 ⟩ + s i n θ i | 1 ⟩ {\textstyle \vert c_{i}\rangle =cos\theta _{i}\vert 0\rangle +sin\theta _{i}\vert 1\rangle } the color with a vector of angles θ i ∈ [ 0 , π / 2 ] {\textstyle \theta _{i}\in \left[0,\pi /2\right]} . As it can be seen, | c i ⟩ {\textstyle \vert c_{i}\rangle } is a regular qubit state of the form | ψ ⟩ = α | 0 ⟩ + β | 1 ⟩ {\displaystyle \vert \psi \rangle =\alpha \vert 0\rangle +\beta \vert 1\rangle } , with basis states | 0 ⟩ = ( 1 0 ) {\textstyle \vert 0\rangle ={\begin{pmatrix}1\\0\end{pmatrix}}} and | 1 ⟩ = ( 0 1 ) {\textstyle \vert 1\rangle ={\begin{pmatrix}0\\1\end{pmatrix}}} , as well as amplitudes α {\textstyle \alpha } and β {\textstyle \beta } that satisfy | α | 2 + | β | 2 = 1 {\textstyle \left|\alpha \right|^{2}+\left|\beta \right|^{2}=1} . Another common representation is MCQI (Multi-Channel Representation for Quantum Images), that uses the RGB channels with quantum states and following FRQI definition: | I ⟩ = 1 2 n + 1 ∑ i = 0 2 2 n − 1 | C R G B i ⟩ ⊗ | i ⟩ {\displaystyle \vert I\rangle ={\frac {1}{2^{n+1}}}\sum _{i=0}^{2^{2n-1}}\vert C_{RGB}^{i}\rangle \otimes \vert i\rangle } | C R G B i ⟩ = cos ⁡ θ R i | 000 ⟩ + cos ⁡ θ G i | 001 ⟩ + cos ⁡ θ B i | 010 ⟩ + sin ⁡ θ R i | 100 ⟩ + sin ⁡ θ G i | 101 ⟩ + sin ⁡ θ B i | 110 ⟩ + cos ⁡ θ α | 011 ⟩ + sin ⁡ θ α | 111 ⟩ {\displaystyle {\begin{aligned}{\begin{aligned}\vert C_{RGB}^{i}\rangle &={\cos \theta _{R}^{i}\vert 000\rangle }+{\cos \theta _{G}^{i}\vert 001\rangle }+{\cos \theta _{B}^{i}\vert 010\rangle }\\&\quad +{\sin \theta _{R}^{i}\vert 100\rangle }+{\sin \theta _{G}^{i}\vert 101\rangle }+{\sin \theta _{B}^{i}\vert 110\rangle }\\&\quad +{\cos {\theta _{\alpha }}\vert 011\rangle }+{\sin \theta _{\alpha }\vert 111\rangle }\end{aligned}}\end{aligned}}} Departing from the angle-based approach of FRQI and MCQI, and using a qubit sequence, NEQR (Novel Enhanced Representation for Quantum Images) is another representation approach, that uses a function f ( y , x ) = C y x q − 1 C y x q − 2 … C y x 1 C y x 0 {\textstyle f\left(y,x\right)=C_{yx}^{q-1}C_{yx}^{q-2}\ldots C_{yx}^{1}C_{yx}^{0}} to encode color values for a 2 n × 2 n {\displaystyle 2^{n}\times 2^{n}} image: | I ⟩ = 1 2 n ∑ y = 0 2 n − 1 ∑ x = 0 2 n − 1 | f ( y , x ) ⟩ | y x ⟩ {\displaystyle \vert I\rangle ={\frac {1}{2^{n}}}\sum _{y=0}^{2^{n}-1}\sum _{x=0}^{2^{n}-1}\vert f\left(y,x\right)\rangle \vert yx\rangle } == Quantum image manipulations == A lot of the effort in QIMP has been focused on designing algorithms to manipulate the position and color information encoded using flexible representation of quantum images (FRQI) and its many variants. For instance, FRQI-based fast geometric transformations including (two-point) swapping, flip, (orthogonal) rotations and restricted geometric transformations to constrain these operations to a specified area of an image were initially proposed. Recently, NEQR-based quantum image translation to map the position of each picture element in an input image into a new position in an output image and quantum image scaling to resize a quantum image were discussed. While FRQI-based general form of color transformations were first proposed by means of the single qubit gates such as X, Z, and H gates. Later, Multi-Channel Quantum Image-based channel of interest (CoI) operator to entail shifting the grayscale value of the preselected color channel and the channel swapping (CS) operator to swap the grayscale values between two channels have been fully discussed. To illustrate the feasibility and capability of QIMP algorithms and application, researchers always prefer to simulate the digital image processing tasks on the basis of the QIRs that we already have. By using the basic quantum gates and the aforementioned operations, so far, researchers have contributed to quantum image feature extraction, quantum image segmentation, quantum image morphology, quantum image comparison, quantum image filtering, quantum image classification, quantum image stabilization, among others. In particular, QIMP-based security technologies have attracted extensive interest of researchers as presented in the ensuing discussions. Similarly, these advancements have led to many applications in the areas of watermarking, encryption, and steganography etc., which form the core security technologies highlighted in this area. In general, the work pursued by the researchers in this area are focused on expanding the applicability of QIMP to realize more classical-like digital image processing algorithms; propose technologies to physically realize the QIMP hardware; or simply to note the likely challenges that could impede the realization of some QIMP protocols. == Quantum image transform == By encoding and processing the image information in quantum-mechanical systems, a framework of quantum image processing is presented, where a pure quantum state encodes the image information: to encode the pixel values in the probability amplitudes and the pixel positions in the computational basis states. Given an image F = ( F i , j ) M × L {\displaystyle F=(F_{i,j})_{M\times L}} , where F i , j {\displaystyle F_{i,j}} represents the pixel value at position ( i , j ) {\displaystyle (i,j)} with i = 1 , … , M {\displaystyle i=1,\dots ,M} and j = 1 , … , L {\displaystyle j=1,\dots ,L} , a vector f → {\displaystyle {\vec {f}}} with M L {\displaystyle ML} elements can be formed by letting the first M {\displaystyle M} elements of f → {\displaystyle {\vec {f}}} be the first column of F {\displaystyle F} , the next M {\displaystyle M} elements the second column, etc. A large class of image operations is linear, e.g., unitary transformations, convolutions, and linear filtering. In the quantum computing, the linear transformation can be represented as | g ⟩ = U ^ | f ⟩ {\displaystyle |g\rangle ={\hat {U}}|f\rangle } with the input image state | f ⟩ {\displaystyle |f\rangle } and the output image state | g ⟩ {\displaystyle |g\rangle } . A unitary transformation can be implemented as a unitary evolution. Some basic and commonly used image transforms (e.g., the Fourier, Hadamard, an

    Read more →
  • Articulatory speech recognition

    Articulatory speech recognition

    Articulatory speech recognition means the recovery of speech (in forms of phonemes, syllables or words) from acoustic signals with the help of articulatory modeling or an extra input of articulatory movement data. Speech recognition (or automatic speech recognition, acoustic speech recognition) means the recovery of speech from acoustics (sound wave) only. Articulatory information is extremely helpful when the acoustic input is in low quality, perhaps because of noise or missing data. Measurable information from the articulatory system (e.g. tongue, jaw movements) can supplement acoustic signals to improve phone recognition accuracy by 2%. However, attempts to estimate articulatory data from acoustic signals alone have not significantly enhanced recognition performance.

    Read more →
  • Face Swap Live

    Face Swap Live

    Face Swap Live is a mobile app created by Laan Labs that enables users to swap faces with another person in real-time using the device's camera. It was released on December 14, 2015. In addition to swapping faces with another person, the app enables users to create videos using a set of bundled live filters. The app is available on iOS and Android devices. Face Swap Live was named Apple's #2 best-selling paid app in 2016.

    Read more →
  • XLeratorDB

    XLeratorDB

    XLeratorDB is a suite of database function libraries that enable Microsoft SQL Server to perform a wide range of additional (non-native) business intelligence and ad hoc analytics. The libraries, which are embedded and run centrally on the database, include more than 450 individual functions similar to those found in Microsoft Excel spreadsheets. The individual functions are grouped and sold as six separate libraries based on usage: finance, statistics, math, engineering, unit conversions and strings. WestClinTech, the company that developed XLeratorDB, claims it is "the first commercial function package add-in for Microsoft SQL Server." == Company history == WestClinTech (LLC), founded by software industry veterans Charles Flock and Joe Stampf in 2008, is located in Irvington, New York, United States. Flock was a co-founder of The Frustum Group, developer of the OPICS enterprise banking and trading platform, which was acquired by London-based Misys, PLC in 1996. Stampf joined Frustum in 1994 and with Flock remained active with the company after acquisition, helping to develop successive generations of OPICS now employed by over 150 leading financial institutions worldwide. Following a full year of research, development and testing, WestClinTech introduced and recorded its first commercial sale of XLeratorDB in April 2009. In September 2009, XLeratorDB became available to all Federal agencies through NASA's Strategic Enterprise-Wide Procurement (SEWP-IV) program, a government-wide acquisition contract. == Technology == XLeratorDB uses Microsoft SQL CLR(Common Language Runtime) technology. SQL CLR allows managed code to be hosted by, and run in, the Microsoft SQL Server environment. SQL CLR relies on the creation, deployment and registration of .NET Framework assemblies that are physically stored in managed code dynamic-link libraries (DLL). The assemblies may contain .NET namespaces, classes, functions, and properties. Because managed code compiles to native code prior to execution, functions using SQL CLR can achieve significant performance increases versus the equivalent functions written in T-SQL in some scenarios. XLeratorDB requires Microsoft SQL Server 2005 or SQL Server 2005 Express editions, or later (compatibility mode 90 or higher). The product installs with PERMISSION_SET=SAFE. SAFE mode, the most restrictive permission set, is accessible by all users. Code executed by an assembly with SAFE permissions cannot access external system resources such as files, the network, the internet, environment variables, or the registry. == Functions == In computer science, a function is a portion of code within a larger program which performs a specific task and is relatively independent of the remaining code. As used in database and spreadsheet applications these functions generally represent mathematical formulas widely used across a variety of fields. While this code may be user-generated, it is also embedded as a pre-written sub-routine in applications. These functions are typically identified by common nomenclature which corresponds to their underlying operations: e.g. IRR identifies the function which calculates Internal Rate of Return on a series of periodic cash flows. === Function uses === As subroutines, functions can be integrated and used in a variety of ways, and as part of larger, more complicated applications. Within large enterprise applications they may, for example, play an important role in defining business rules or risk management parameters, while remaining virtually invisible to end users. Within database management systems and spreadsheets, however, these kinds of functions also represent discrete sets of tools; they can be accessed directly and utilized on a stand-alone basis, or in more complex, user-defined configurations. In this context, functions can be used for business intelligence and ad hoc analysis of data in fields such as finance, statistics, engineering, math, etc. === Function types === XLeratorDB uses three kinds of functions to perform analytic operations: scalar, aggregate, and a hybrid form which WestClinTech calls Range Queries. Scalar functions take a single value, perform an operation and return a single value. An example of this type of function is LOG, which returns the logarithm of a number to a specified base. Aggregate functions operate on a series of values but return a single, summarizing value. An example of this type of function is AVG, which returns the average of values in a specified group. In XLeratorDB there are some functions which have characteristics of aggregate functions (operating on multiple series of values) but cannot be processed in SQL CLR using single column inputs, such as AVG does. For example, irregular internal rate of return (XIRR), a financial function, operates on a collection of cash flow values from one column, but must also apply variable period lengths from another column and an initial iterative assumption from a third, in order to return a single, summarizing value. WestClinTech documentation notes that Range Queries specify the data to be included in the result set of the function independently of the WHERE clause associated with the T-SQL statement, by incorporating a SELECT statement into the function as a string argument; the function then traps that SELECT statement, executes it internally and processes the result. Some XLeratorDB functions that employ Range Queries are: NPV, XNPV, IRR, XIRR, MIRR, MULTINOMIAL, and SERIESSUM. Within the application these functions are identified by a "_q" naming convention: e.g. NPV_q, IRR_q, etc. == Analytic functions == === SQL Server functions === Microsoft SQL Server is the #3 selling database management system (DBMS), behind Oracle and IBM. (While versions of SQL Server have been on the market since 1987, XLeratorDB is compatible with only the 2005 edition and later.) Like all major DBMS, SQL Server performs a variety of data mining operations by returning or arraying data in different views (also known as drill-down). In addition, SQL Server uses Transact-SQL (T-SQL) to execute four major classes of pre-defined functions in native mode. Functions operating on the DBMS offer several advantages over client layer applications like Excel: they utilize the most up-to-date data available; they can process far larger quantities of data; and, the data is not subject to exporting and transcription errors. SQL Server 2008 includes a total of 58 functions that perform relatively basic aggregation (12), math (23) and string manipulation (23) operations useful for analytics; it includes no native functions that perform more complex operations directly related to finance, statistics or engineering. === Excel functions === Microsoft Excel, a component of Microsoft Office suite, is one of the most widely used spreadsheet applications on the market today. In addition to its inherent utility as a stand-alone desktop application, Excel overlaps and complements the functionality of DBMS in several ways: storing and arraying data in rows and columns; performing certain basic tasks such as pivot table and aggregating values; and facilitating sharing, importing and exporting of database data. Excel's chief limitation relative to a true database is capacity; Excel 2003 is limited to some 65k rows and 256 columns; Excel 2007 extends this capacity to roughly 1million rows and 16k columns. By comparison, SQL Server is able to manage over 500k terabytes of memory. Excel offers, however, an extensive library of specialized pre-written functions which are useful for performing ad hoc analysis on database data. Excel 2007 includes over 300 of these pre-defined functions, although customized functions can also be created by users, or imported from third party developers as add-ons. Excel functions are grouped by type: === Excel business intelligence functions === Operating on the client computing layer Excel plays an important role as a business intelligence tool because it: performs a wide array of complex analytic functions not native to most DBMS software offers far greater ad hoc reporting and analytic flexibility than most enterprise software provides a medium for sharing and collaborating because of its ubiquity throughout the enterprise Microsoft reinforces this positioning with Business Intelligence documentation that positions Excel in a clearly pivotal role. === XLeratorDB vs. Excel functions === While operating within the database environment, XLeratorDB functions utilize the same naming conventions and input formats, and in most cases, return the same calculation results as Excel functions. XLeratorDB, coupled with SQL Server's native capabilities, compares to Excel's function sets as follows:

    Read more →
  • Adobe Presenter

    Adobe Presenter

    Adobe Presenter is eLearning software released by Adobe Systems available on the Microsoft Windows platform as a Microsoft PowerPoint plug-in, and on both Windows and OS X as the screencasting and video editing tool Adobe Presenter Video Express. It is mainly targeted towards learning professionals and trainers. In addition to recording one's computer desktop and speech, it also provides the option to add quizzes and track performance by integrating with learning management systems. Adobe Presenter was designed to replace the discontinued Adobe Ovation software, which had similar functions. == Predecessor == Adobe Ovation was originally released by Serious Magic. It converted PowerPoint slides into visual presentations with additional effects. Ovation included themes called PowerLooks that could add motion and polish the presentations. They were available in a variety of color variations complete with animated backgrounds and dynamic text effects. Ovation could make text with jagged edges more readable. TimeKeeper could be used to set the period of the presentation, and the PointPrompter scrolled down the notes. Ovation's development has been discontinued, nor does it support PowerPoint 2007. == Features == The main purpose of Adobe Presenter is to capture on-screen presentations and convert them into more interactive and engaging videos. Support is given to convert Microsoft PowerPoint 2010 and 2013 presentations into videos. It also allows for content authoring on PowerPoint and ActionScript 3, and offers integration with Adobe Captivate. Slide branching enables users to control slide navigation and titles and create complex slide branching to guide viewers through the content of the presentation. Video editing tools are also provided, and offer the ability to upload to video-sharing platforms such as YouTube, Vimeo and other sites. Multimedia features such as annotations, eLearning templates, actors, audio narration and drag-and-drop elements enrich users' presentations. Quizzes and surveys is another highlighted feature, which include generating question pools, importing questions from existing quizzes and in-course collaboration which allows presenters to receive feedback by allowing them to comment on specific content within a course or ask questions for more clarity. Presenters could opt to receive feedback from viewers through video analytics and create Experience API, SCORM and AICC-compliant content. Options to publish to Adobe Connect are provided. Other unique features include universal standards support, file size control, navigational restrictions among others.

    Read more →
  • Minimum resolvable contrast

    Minimum resolvable contrast

    Minimum resolvable contrast (MRC) is a subjective measure of a visible spectrum sensor’s or camera's sensitivity and ability to resolve data. A snapshot image of a series of three bar targets of selected spatial frequencies and various contrast coatings captured by the unit under test (UUT) is used to determine the MRC of the UUT, i.e., the visible spectrum camera or sensor. A trained observer selects the smallest target resolvable at each contrast level. Typically, specialized computer software collects the inputted data of the observer and provides a graph of contrast vs. spatial frequency at a given luminance level. A first order polynomial is fitted to the data and an MRC curve of spatial frequency versus contrast is generated.

    Read more →
  • ArcSoft ShowBiz

    ArcSoft ShowBiz

    ShowBiz is a video editor by ArcSoft for the Windows operating system. It can create VCD and DVDs and can also export to the formats AVI, MPEG, WMV, and MOV. ShowBiz also contains a DVD burning and menu building feature. As of 2003, it was one of the three most dominant bundled titles. == Reception == PC Magazine reviewer Jan Ozer states: "ArcSoft's ShowBiz has evolved into a competent editor that's generally more usable than Dazzle's MovieStar program, providing more configuration controls, better preview features, and a much greater range of fun effects." John Virata, senior editor of Digital Media Online, says in his three page review of ShowBiz DVD 2, "It is an easy editor to work with and has a logically laid out interface that takes you step by step through the video creation and DVD creation process"

    Read more →
  • Cross-entropy method

    Cross-entropy method

    The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective. The method approximates the optimal importance sampling estimator by repeating two phases: Draw a sample from a probability distribution. Minimize the cross-entropy between this distribution and a target distribution to produce a better sample in the next iteration. Reuven Rubinstein developed the method in the context of rare-event simulation, where tiny probabilities must be estimated, for example in network reliability analysis, queueing models, or performance analysis of telecommunication systems. The method has also been applied to the traveling salesman, quadratic assignment, DNA sequence alignment, max-cut and buffer allocation problems. == Estimation via importance sampling == Consider the general problem of estimating the quantity ℓ = E u [ H ( X ) ] = ∫ H ( x ) f ( x ; u ) d x {\displaystyle \ell =\mathbb {E} _{\mathbf {u} }[H(\mathbf {X} )]=\int H(\mathbf {x} )\,f(\mathbf {x} ;\mathbf {u} )\,{\textrm {d}}\mathbf {x} } , where H {\displaystyle H} is some performance function and f ( x ; u ) {\displaystyle f(\mathbf {x} ;\mathbf {u} )} is a member of some parametric family of distributions. Using importance sampling this quantity can be estimated as ℓ ^ = 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) g ( X i ) {\displaystyle {\hat {\ell }}={\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{g(\mathbf {X} _{i})}}} , where X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} is a random sample from g {\displaystyle g\,} . For positive H {\displaystyle H} , the theoretically optimal importance sampling density (PDF) is given by g ∗ ( x ) = H ( x ) f ( x ; u ) / ℓ {\displaystyle g^{}(\mathbf {x} )=H(\mathbf {x} )f(\mathbf {x} ;\mathbf {u} )/\ell } . This, however, depends on the unknown ℓ {\displaystyle \ell } . The CE method aims to approximate the optimal PDF by adaptively selecting members of the parametric family that are closest (in the Kullback–Leibler sense) to the optimal PDF g ∗ {\displaystyle g^{}} . == Generic CE algorithm == Choose initial parameter vector v ( 0 ) {\displaystyle \mathbf {v} ^{(0)}} ; set t = 1. Generate a random sample X 1 , … , X N {\displaystyle \mathbf {X} _{1},\dots ,\mathbf {X} _{N}} from f ( ⋅ ; v ( t − 1 ) ) {\displaystyle f(\cdot ;\mathbf {v} ^{(t-1)})} Solve for v ( t ) {\displaystyle \mathbf {v} ^{(t)}} , where v ( t ) = argmax v ⁡ 1 N ∑ i = 1 N H ( X i ) f ( X i ; u ) f ( X i ; v ( t − 1 ) ) log ⁡ f ( X i ; v ) {\displaystyle \mathbf {v} ^{(t)}=\mathop {\textrm {argmax}} _{\mathbf {v} }{\frac {1}{N}}\sum _{i=1}^{N}H(\mathbf {X} _{i}){\frac {f(\mathbf {X} _{i};\mathbf {u} )}{f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})}}\log f(\mathbf {X} _{i};\mathbf {v} )} If convergence is reached then stop; otherwise, increase t by 1 and reiterate from step 2. In several cases, the solution to step 3 can be found analytically. Situations in which this occurs are When f {\displaystyle f\,} belongs to the natural exponential family When f {\displaystyle f\,} is discrete with finite support When H ( X ) = I { x ∈ A } {\displaystyle H(\mathbf {X} )=\mathrm {I} _{\{\mathbf {x} \in A\}}} and f ( X i ; u ) = f ( X i ; v ( t − 1 ) ) {\displaystyle f(\mathbf {X} _{i};\mathbf {u} )=f(\mathbf {X} _{i};\mathbf {v} ^{(t-1)})} , then v ( t ) {\displaystyle \mathbf {v} ^{(t)}} corresponds to the maximum likelihood estimator based on those X k ∈ A {\displaystyle \mathbf {X} _{k}\in A} . == Continuous optimization—example == The same CE algorithm can be used for optimization, rather than estimation. Suppose the problem is to maximize some function S {\displaystyle S} , for example, S ( x ) = e − ( x − 2 ) 2 + 0.8 e − ( x + 2 ) 2 {\displaystyle S(x)={\textrm {e}}^{-(x-2)^{2}}+0.8\,{\textrm {e}}^{-(x+2)^{2}}} . To apply CE, one considers first the associated stochastic problem of estimating P θ ( S ( X ) ≥ γ ) {\displaystyle \mathbb {P} _{\boldsymbol {\theta }}(S(X)\geq \gamma )} for a given level γ {\displaystyle \gamma \,} , and parametric family { f ( ⋅ ; θ ) } {\displaystyle \left\{f(\cdot ;{\boldsymbol {\theta }})\right\}} , for example the 1-dimensional Gaussian distribution, parameterized by its mean μ t {\displaystyle \mu _{t}\,} and variance σ t 2 {\displaystyle \sigma _{t}^{2}} (so θ = ( μ , σ 2 ) {\displaystyle {\boldsymbol {\theta }}=(\mu ,\sigma ^{2})} here). Hence, for a given γ {\displaystyle \gamma \,} , the goal is to find θ {\displaystyle {\boldsymbol {\theta }}} so that D K L ( I { S ( x ) ≥ γ } ‖ f θ ) {\displaystyle D_{\mathrm {KL} }({\textrm {I}}_{\{S(x)\geq \gamma \}}\|f_{\boldsymbol {\theta }})} is minimized. This is done by solving the sample version (stochastic counterpart) of the KL divergence minimization problem, as in step 3 above. It turns out that parameters that minimize the stochastic counterpart for this choice of target distribution and parametric family are the sample mean and sample variance corresponding to the elite samples, which are those samples that have objective function value ≥ γ {\displaystyle \geq \gamma } . The worst of the elite samples is then used as the level parameter for the next iteration. This yields the following randomized algorithm that happens to coincide with the so-called Estimation of Multivariate Normal Algorithm (EMNA), an estimation of distribution algorithm. === Pseudocode === // Initialize parameters μ := −6 σ2 := 100 t := 0 maxits := 100 N := 100 Ne := 10 // While maxits not exceeded and not converged while t < maxits and σ2 > ε do // Obtain N samples from current sampling distribution X := SampleGaussian(μ, σ2, N) // Evaluate objective function at sampled points S := exp(−(X − 2) ^ 2) + 0.8 exp(−(X + 2) ^ 2) // Sort X by objective function values in descending order X := sort(X, S) // Update parameters of sampling distribution via elite samples μ := mean(X(1:Ne)) σ2 := var(X(1:Ne)) t := t + 1 // Return mean of final sampling distribution as solution return μ == Related methods == Simulated annealing Genetic algorithms Harmony search Estimation of distribution algorithm Tabu search Natural Evolution Strategy Ant colony optimization algorithms

    Read more →
  • EditDV

    EditDV

    EditDV was a video editing software released by Radius, Inc. in late 1997 as an evolution of their earlier Radius Edit product. EditDV was one of the first products providing professional-quality editing of the then new DV format at a relatively affordable cost ($999 including Radius FireWire capture card) and was named "The Best Video Tool of 1998". Originally EditDV was available for Macintosh only but in February 2000 EditDV 2.0 for Windows was released. With version 3.0 EditDV's name was changed to CineStream. == Features == Originally bundled with a FireWire card, EditDV 1.5 got updated into a less expensive software only package for use with the newer PowerMac G3 that came with a FireWire interface. Later, a scaled down version named EditDV 1.6.1 Unplugged was released as a freeware version next to EditDV 2.0. Unlike many other applications at the time which transcoded video to M-JPEG for editing, EditDV provided lossless native editing of the DV format. Only transitions (such as dissolves or wipes), effects (such as rotating or scaling the video, adjusting the audio level, or adding titles) and filters (such as changing the brightness or color balance) needed to be rendered. This also had the disadvantage to not work with analogue video capture. EditDV was built on top of QuickTime and supported QuickTime filters as well as its own built-in effects and transitions. Effects could be animated using keyframes. EditDV 2.0 worked natively with Quicktime MOV format. For Microsoft Windows users, where the standard was AVI, this required the use of a provided external conversion tool afterwards when AVI was wanted. The user interface had a Project window for organising clips into bins, a Sequence window with a multi-track timeline for arranging clips into a program using three-point editing, and Source and Program monitor windows. A finished program could either be exported as a QuickTime movie or written back to DV tape using the "print to video" command. Version 3.0, then renamed CineStream, shifted towards web designers who wanted to add video streaming interactivity to a website. The new feature called EventStream allowed setting clickable hot spots to link to another location, either to another page with a URL or to another video. This feature distinguished CineStream from the rest of the competition. == Products == The EditDV product family included a number of related products, all sharing a similar name: EditDV Video editing software (Mac and Windows) SoftDV A QuickTime software codec for playing DV media, included as part of EditDV (Mac and Windows) MotoDV PCI-based FireWire interface with DV capture software (Mac and Windows) PhotoDV Software to capture high-quality stills from a DV tape using MotoDV hardware (Mac and Windows) RotoDV Software for rotoscoping (painting over video), released in Sept 1999 (Macintosh only) == Name changes and eventual demise == In 1999, the company Radius Inc. changed its name to Digital Origin. In 2000, Digital Origin Inc (and EditDV) was bought by Media 100. In early 2001, Media 100 released an updated version of EditDV under the new name CineStream 3.0. Later that year (October 2001) Media 100 was bought by Autodesk's Discreet Division. CineStream for Macintosh required classic Mac OS. It was never ported to Mac OS X and faced increasing competition on that platform from Apple's own Final Cut Pro application. Development of EditDV/Cinestream was officially discontinued in 2002.

    Read more →
  • Subvocal recognition

    Subvocal recognition

    Subvocal recognition (SVR) is the process of taking subvocalization and converting the detected results to a digital output, aural or text-based. A silent speech interface is a device that allows speech communication without using the sound made when people vocalize their speech sounds. It works by the computer identifying the phonemes that an individual pronounces from nonauditory sources of information about their speech movements. These are then used to recreate the speech using speech synthesis. == Input methods == Silent speech interface systems have been created using ultrasound and optical camera input of tongue and lip movements. Electromagnetic devices are another technique for tracking tongue and lip movements. The detection of speech movements by electromyography of speech articulator muscles and the larynx is another technique. Another source of information is the vocal tract resonance signals that get transmitted through bone conduction called non-audible murmurs. They have also been created as a brain–computer interface using brain activity in the motor cortex obtained from intracortical microelectrodes. == Uses == Such devices are created as aids to those unable to create the sound phonation needed for audible speech such as after laryngectomies. Another use is for communication when speech is masked by background noise or distorted by self-contained breathing apparatus. A further practical use is where a need exists for silent communication, such as when privacy is required in a public place, or hands-free data silent transmission is needed during a military or security operation. In 2002, the Japanese company NTT DoCoMo announced it had created a silent mobile phone using electromyography and imaging of lip movement. The company stated that "the spur to developing such a phone was ridding public places of noise," adding that, "the technology is also expected to help people who have permanently lost their voice." The feasibility of using silent speech interfaces for practical communication has since then been shown. In 2019, Arnav Kapur, a researcher from the Massachusetts Institute of Technology, conducted a study known as AlterEgo. Its implementation of the silent speech interface enables direct communication between the human brain and external devices through stimulation of the speech muscles. By leveraging neural signals associated with speech and language, the AlterEgo system deciphers the user's intended words and translates them into text or commands without the need for audible speech. == Research and patents == With a grant from the U.S. Army, research into synthetic telepathy using subvocalization is taking place at the University of California, Irvine under lead scientist Mike D'Zmura. NASA's Ames Research Laboratory in Mountain View, California, under the supervision of Charles Jorgensen is conducting subvocalization research. The Brain Computer Interface R&D program at Wadsworth Center under the New York State Department of Health has confirmed the existing ability to decipher consonants and vowels from imagined speech, which allows for brain-based communication using imagined speech, however using EEGs instead of subvocalization techniques. US Patents on silent communication technologies include: US Patent 6587729 "Apparatus for audibly communicating speech using the radio frequency hearing effect", US Patent 5159703 "Silent subliminal presentation system", US Patent 6011991 "Communication system and method including brain wave analysis and/or use of brain activity", US Patent 3951134 "Apparatus and method for remotely monitoring and altering brain waves". Latter two rely on brain wave analysis. == In fiction == The decoding of silent speech using a computer played an important role in Arthur C. Clarke's story and Stanley Kubrick's associated film A Space Odyssey. In this, HAL 9000, a computer controlling spaceship Discovery One, bound for Jupiter, discovers a plot to deactivate it by the mission astronauts Dave Bowman and Frank Poole through lip reading their conversations. In Orson Scott Card's series (including Ender's Game), the artificial intelligence can be spoken to while the protagonist wears a movement sensor in his jaw, enabling him to converse with the AI without making noise. He also wears an ear implant. In Speaker for the Dead and subsequent novels, author Orson Scott Card described an ear implant, called a "jewel", that allows subvocal communication with computer systems. Author Robert J. Sawyer made use of subvocal recognition to allow silent commands to the cybernetic 'companion implants' used by the advanced Neanderthal characters in his Neanderthal Parallax trilogy of science fiction novels. In Earth, David Brin depicts this technology and its uses as a normal gear in the near future. In Down and Out in the Magic Kingdom, Cory Doctorow has cellphone technology become silent through a cochlear implant and miking the throat to pick up subvocalization. William Gibson's Sprawl Trilogy frequently uses sub-vocalization systems in various devices. In Kage Baker's Company novels, the immortal cyborgs communicate subvocally. In the Hugo Award-winning Hyperion Cantos by Dan Simmons, the characters often use subvocalization to communicate. In the Culture novels by Iain M. Banks, more highly advanced species often communicate subvocally through their technology. In Deus Ex: Human Revolution (2011), the protagonist is augmented with a subvocalization implant for sending covert communications (and a corresponding cochlear implant for receiving covert communications). In the tabletop RPG and video game series Shadowrun, player characters can communicate via subvocal microphones in some instances. In Paranoia, all citizens can speak to the computer via their "cerebral cortech" implants. Alistair Reynolds Revelation Space trilogy frequently uses sub-vocalization systems in various devices.

    Read more →