AI Code Ui

AI Code Ui — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Generative design

    Generative design

    Generative design is an iterative design process that uses software to generate outputs that fulfill a set of constraints iteratively adjusted by a designer. Whether a human, test program, or artificial intelligence, the designer algorithmically or manually refines the feasible region of the program's inputs and outputs with each iteration to fulfill evolving design requirements. By employing computing power to evaluate more design permutations than a human alone is capable of, the process is capable of producing an optimal design that mimics nature's evolutionary approach to design through genetic variation and selection. The output can be images, sounds, architectural models, animation, and much more. It is, therefore, a fast method of exploring design possibilities that is used in various design fields such as art, architecture, communication design, and product design. Generative design has become more important, largely due to new programming environments or scripting capabilities that have made it relatively easy, even for designers with little programming experience, to implement their ideas. Additionally, this process can create solutions to substantially complex problems that would otherwise be resource-exhaustive with an alternative approach, making it a more attractive option for problems with a large or unknown solution set. It is also facilitated with tools in commercially available CAD packages. Not only are implementation tools more accessible, but also tools leveraging generative design as a foundation. Recent advancements have led to the development of Deep Generative Design, a framework that integrates topology optimization with deep learning models, such as Generative Adversarial Networks (GANs). Unlike traditional evolutionary methods that primarily focus on engineering performance, this approach uses deep generative models to enhance aesthetic diversity and novelty while simultaneously satisfying engineering constraints. For instance, research by Oh et al. (2019) proposed a framework using Boundary Equilibrium GANs (BEGAN) to generate diverse design options which are then refined through density-based topology optimization, allowing for the exploration of complex design spaces that balance structural integrity with visual variation. In practice, generative design does not solely aim to produce a single optimal solution, but involves iteratively refining the design problem by modifying parameters, constraints, and evaluation criteria within a computational model, resulting in multiple design alternatives from which the designer selects. == Use in architecture == Generative design in architecture is an iterative design process that enables architects to explore a wider solution space with more possibility and creativity. Architectural design has long been regarded as a wicked problem. Compared with traditional top-down design approach, generative design can address design problems efficiently, by using a bottom-up paradigm that uses parametric-defined rules to generate complex solutions. The solution itself then evolves to a good, if not optimal, solution. The advantage of using generative design as a design tool is that it does not construct fixed geometries, but take a set of design rules that can generate an infinite set of possible design solutions. The generated design solutions can be more sensitive, responsive, and adaptive to the problem. Generative design involves rule definition and result analysis that are integrated with the design process. By defining parameters and rules, the generative approach is able to provide optimized solution for both structural stability and aesthetics. Possible design algorithms include cellular automata, shape grammar, genetic algorithm, space syntax, and most recently, artificial neural network. Due to the high complexity of the solution generated, rule-based computational tools, such as finite element method and topology optimisation, are preferred to evaluate and optimise the generated solution. The iterative process provided by computer software enables the trial-and-error approach in design, and involves architects interfering with the optimisation process. Historically precedent work includes Antoni Gaudí's Sagrada Família, which used rule based geometrical forms for structures, and Buckminster Fuller's Montreal Biosphere where the rules were designed to generate individual components, rather than the final product. More recent generative-design cases include Foster and Partners' Queen Elizabeth II Great Court, where the tessellated glass roof was designed using a geometric schema to define hierarchical relationships, and then the generated solution was optimized based on geometrical and structural requirements. == Use in sustainable design == Generative design in sustainable design is an effective approach addressing energy efficiency and climate change at the early design stage, recognizing buildings contribute to approximately one-third of global greenhouse gas emissions and 30%-40% of total building energy use. It integrates environmental principles with algorithms, enabling exploration of countless design alternatives to enhance energy performance, reduce carbon footprints, and minimize waste. A key feature of generative design in sustainable design is its ability to incorporate Building Performance Simulations (BPS) into the design process. Simulation programs such as EnergyPlus, Ladybug Tools,, and so on, combined with generative algorithms, can optimize design solutions for cost-effective energy use and zero-carbon building designs. For example, the GENE_ARCH system used a Pareto algorithm with building energy simulation for the whole building design optimization. Generative design has improved sustainable facade design, as illustrated by the algorithm of cellular automata and daylight simulations in adaptive facade design. In addition, genetic algorithms were used with radiation simulations for energy-efficient photo-voltaic (PV) modules on high-rise building facades. Generative design is also applied to life cycle analysis (LCA), as demonstrated by a framework using grid search algorithms to optimize exterior wall design for minimum environmental impact. Multi-objective optimization embraces multiple diverse sustainability goals, such as interactive kinetic louvers using biomimicry and daylight simulations to enhance daylight, visual comfort, and energy efficiency. The study of PV and shading systems can maximize on-site electricity, improve visual quality, and daylight performance. Artificial intelligence (AI) and machine learning (ML) further improve computation efficiency in complex climate-responsive sustainable design. One study employed reinforcement learning to identify the relationship between design parameters and energy use for a sustainable campus, while other studies tried hybrid algorithms, such as using the genetic algorithm and GANs to balance daylight illumination and thermal comfort under different roof conditions. Other popular AI tools were also integrated, including deep reinforcement learning (DRL) and computer vision (CV), to generate an urban block according to direct sunlight hours and solar heat gains. These AI-driven generative design methods enable faster simulations and design decision making, resulting in designs that are environmentally responsible. == Use in additive manufacturing == Additive manufacturing (AM) is a process that creates physical models directly from three-dimensional (3D) data by joining materials layer by layer. It is used in industries to produce a variety of end-use parts, which are final components designed for direct application in products or systems. AM provides design flexibility and enables material reduction in lightweight applications, such as aerospace, automotive, medical, and portable electronic devices, where minimizing weight is critical for performance. Generative design, one of the four key methods for lightweight design in AM, is commonly applied to optimize structures for specific performance requirements. Generative design can help create optimized solutions that balance multiple objectives, such as enhancing performance while minimizing cost. In design for additive manufacturing (DfAM), multi-objective topology optimization is used to generate a set of candidate solutions. Designers then assess these options using their expertise and key performance indicators (KPIs) to select the best option for implementation. However, integrating AM constraints (e.g., speed of build, materials, build envelope, and accuracy) into generative design remains challenging, as ensuring all solutions are valid is complex. Balancing multiple design objectives while limiting computational costs adds further challenges for designers. To overcome these difficulties, researchers proposed a generative design method with manufacturing validation to improve decision-making efficiency. This method starts with a cons

    Read more →
  • Iterative reconstruction

    Iterative reconstruction

    Iterative reconstruction refers to iterative algorithms used to reconstruct 2D and 3D images in certain imaging techniques. For example, in computed tomography an image must be reconstructed from projections of an object. Here, iterative reconstruction techniques are usually a better, but computationally more expensive alternative to the common filtered back projection (FBP) method, which directly calculates the image in a single reconstruction step. In recent research works, scientists have shown that extremely fast computations and massive parallelism is possible for iterative reconstruction, which makes iterative reconstruction practical for commercialization. == Basic concepts == The reconstruction of an image from the acquired data is an inverse problem. Often, it is not possible to exactly solve the inverse problem directly. In this case, a direct algorithm has to approximate the solution, which might cause visible reconstruction artifacts in the image. Iterative algorithms approach the correct solution using multiple iteration steps, which allows to obtain a better reconstruction at the cost of a higher computation time. There are a large variety of algorithms, but each starts with an assumed image, computes projections from the image, compares the original projection data and updates the image based upon the difference between the calculated and the actual projections. === Algebraic reconstruction === The Algebraic Reconstruction Technique (ART) was the first iterative reconstruction technique used for computed tomography by Hounsfield. === Iterative Sparse Asymptotic Minimum Variance === The iterative sparse asymptotic minimum variance algorithm is an iterative, parameter-free superresolution tomographic reconstruction method inspired by compressed sensing, with applications in synthetic-aperture radar, computed tomography scan, and magnetic resonance imaging (MRI). === Statistical reconstruction === There are typically five components to statistical iterative image reconstruction algorithms, e.g. An object model that expresses the unknown continuous-space function f ( r ) {\displaystyle f(r)} that is to be reconstructed in terms of a finite series with unknown coefficients that must be estimated from the data. A system model that relates the unknown object to the "ideal" measurements that would be recorded in the absence of measurement noise. Often this is a linear model of the form A x + ϵ {\displaystyle \mathbf {A} x+\epsilon } , where ϵ {\displaystyle \epsilon } represents the noise. A statistical model that describes how the noisy measurements vary around their ideal values. Often Gaussian noise or Poisson statistics are assumed. Because Poisson statistics are closer to reality, it is more widely used. A cost function that is to be minimized to estimate the image coefficient vector. Often this cost function includes some form of regularization. Sometimes the regularization is based on Markov random fields. An algorithm, usually iterative, for minimizing the cost function, including some initial estimate of the image and some stopping criterion for terminating the iterations. === Learned Iterative Reconstruction === In learned iterative reconstruction, the updating algorithm is learned from training data using techniques from machine learning such as convolutional neural networks, while still incorporating the image formation model. This typically gives faster and higher quality reconstructions and has been applied to CT and MRI reconstruction. == Advantages == The advantages of the iterative approach include improved insensitivity to noise and capability of reconstructing an optimal image in the case of incomplete data. The method has been applied in emission tomography modalities like SPECT and PET, where there is significant attenuation along ray paths and noise statistics are relatively poor. Statistical, likelihood-based approaches: Statistical, likelihood-based iterative expectation-maximization algorithms are now the preferred method of reconstruction. Such algorithms compute estimates of the likely distribution of annihilation events that led to the measured data, based on statistical principle, often providing better noise profiles and resistance to the streak artifacts common with FBP. Since the density of radioactive tracer is a function in a function space, therefore of extremely high-dimensions, methods which regularize the maximum-likelihood solution turning it towards penalized or maximum a-posteriori methods can have significant advantages for low counts. Examples such as Ulf Grenander's Sieve estimator or Bayes penalty methods, or via I.J. Good's roughness method may yield superior performance to expectation-maximization-based methods which involve a Poisson likelihood function only. As another example, it is considered superior when one does not have a large set of projections available, when the projections are not distributed uniformly in angle, or when the projections are sparse or missing at certain orientations. These scenarios may occur in intraoperative CT, in cardiac CT, or when metal artifacts require the exclusion of some portions of the projection data. In Magnetic Resonance Imaging it can be used to reconstruct images from data acquired with multiple receive coils and with sampling patterns different from the conventional Cartesian grid and allows the use of improved regularization techniques (e.g. total variation) or an extended modeling of physical processes to improve the reconstruction. For example, with iterative algorithms it is possible to reconstruct images from data acquired in a very short time as required for real-time MRI (rt-MRI). In Cryo Electron Tomography, where the limited number of projections are acquired due to the hardware limitations and to avoid the biological specimen damage, it can be used along with compressive sensing techniques or regularization functions (e.g. Huber function) to improve the reconstruction for better interpretation. Here is an example that illustrates the benefits of iterative image reconstruction for cardiac MRI.

    Read more →
  • BBC Own It

    BBC Own It

    The BBC Own It app was a British information site designed to protect and support children using the Internet. The app was launched in 2017 and retired in 2022, though the website retired in 2024 and has since moved to BBC Teach. As part of the BBC's partnership with Internet Matters, the not-for-profit contributed to content on the BBC Own It website. == History == In 2016, The Royal Foundation of The Duke and Duchess of Cambridge established The Royal Foundation Taskforce on the Prevention of Cyberbullying. Work began in 2017 by the BBC to create an app about cyberbullying and online safety (later titled Own It) in response to a call for action from the Taskforce. In December 2017, the BBC launched Own It. In November 2018, work on the BBC Own It App was announced by Prince William. In September 2019, the BBC Own It App was launched into the AppStore and Google Play. In 2022, the BBC discontinued the app, although the website was still active, however in 2024, the website was discontinued, and now any links to the website now redirect to a BBC Teach page. == Awards == UXUK award for Best Education or Learning Experience (2019) Banff World Media Festival Rockies Award for Children & Youth Interactive Content (2020) CogX Award for Best Innovation In Natural Language Processing (2020)

    Read more →
  • Irwin Sobel

    Irwin Sobel

    Irwin Sobel (born September 12, 1940) is a scientist and researcher in digital image processing. == Biography == Irwin Sobel was born in New York City. He graduated from MIT in 1961 and completed his Ph.D. research at the Stanford Artificial Intelligence Project (SAIL) with thesis Camera Models and Machine Perception. His Ph.D. advisor was Jerome A. Feldman. Starting in 1973, he spent nine years doing postdoctoral research at Columbia University. After 1982, he worked as a Senior Researcher at HP Labs. == Sobel operator == In 1968, Sobel gave a talk entitled "An Isotropic 3x3 Image Gradient Operator" at SAIL; this method became known as the Sobel operator. It was developed jointly with a colleague, Gary Feldman, also at SAIL.

    Read more →
  • Imo.im

    Imo.im

    imo.im is a proprietary audio/video calling and instant messaging software service. It allows sending music, video, PDFs and other files, along with various free stickers. It supports encrypted group video and voice calls with up to 20 participants. According to its developer, the service possesses over 200 million users and over 50 million messages per day are sent through it. == History == The product was created as a web-based application in 2005 for accessing multiple chat platforms, including Facebook Messenger, Google Talk, Yahoo! Messenger, and Skype chat. It was developed by Pagebites, which is a subsidiary of Singularity IM, Inc. and required a subscriber's phone number to verify the users' account. In March 2014, support for all third-party messaging networks ended. In January 2018, the app reached 500 million installs. imo.im has implemented end-to-end encryption for its chats and calls, ensuring that the conversations remain private between the sender and receiver.

    Read more →
  • LENA Foundation

    LENA Foundation

    The LENA Foundation is an American nonprofit organisation which provides tools for measuring children's language acquisition and exposure. Specifically, the LENA system consists of a digital language processor which is worn by a child and records and analyses their auditory environment, using propriety software. It then presents a summary of child-adult conversation, such as conversation turns and word counts. The purpose of the LENA system is to encourage interactive talk between children (between the age of two to forty-eight months) and their caretakers. The LENA system is also used for research; while useful for researchers who wish to save transcription costs or observe the child in its natural state, the accuracy of this system, while often quite high, varies between contexts, for example notably in the case of hard of hearing children. Because of this, several researchers recommend caution in using only the LENA system on its own for the purposes of scientific research. == History == The LENA Foundation was established in 2009 by Terrance and Judith Paul, founders of Renaissance Learning, Inc., with the purpose of aiding children with disabilities and assisting with early learning. They were inspired by the book "Meaningful Differences in the Everyday Experience of American Children" by Dr. Betty Hart and Dr. Todd Risley. A pilot version of the LENA system was launched in February 2006. The LENA Research Foundation was registered as a tax-exempt 501(c)(3) nonprofit in September 2010. The organisation was renamed simply LENA in 2018 and adopted the tagline "Building brains through early talk." LENA has been used for parental feedback, linguistics or paediatrics research, and for specific clinical cases. == Scientific background == In 2018, research using the LENA system showed that there was a link between children's conversational turns and activation of Broca's area (a part of the brain responsible, although not necessarily essential, for language processing). The LENA foundation cites research by its own employees as evidence for the scientific basis of its technology. Said research claims that verbal interaction with young children has an effect on language acquisition, including verbal comprehension skills during adolescence. == LENA System == The LENA software analyses a child's natural language environment, such as verbal exposure, and provides several metrics, such as adult and child speech time, television/recorded audio time, word count, or conversation turn count. The LENA hardware is a recorder that is usually placed into a child's specially-designed vest. The software was trained on over 65,000 hours of manually annotated American English audio recordings. It splits the audio into segments which are categorised as "key child", "other child", "male adult", "noise", etc. The advantages of LENA as opposed to manual transcription are its speed and ease of use; the disadvantages are its potential inaccuracies and lack of transcription capability (which LENA does not profess to attempt). The LENA system has also been criticised for prioritising quantity of speaking over quality (i.e., mastery of the language, as opposed to babble). == Product lines == === LENA Start === LENA Start is a program for parents that utilises feedback from the LENA System in conjunction with weekly group sessions in order to address the home language environment. It was introduced in 2015 and implemented across several U.S. states. In October 2020, during the restrictions of the COVID-19 pandemic, Read Aloud Delaware began a virtual LENA Start program with families statewide, where parents received feedback and participated in one-hour Zoom workshops each week during the 10-week program. === LENA Grow === LENA Grow is a professional development program for teachers in early childhood classrooms. Before launching at sites around the country, the program was first piloted in Escambia County, Florida. === LENA Home === LENA Home is a supplement to existing parent coaching curricula. Typically, home visitors facilitate the use of the LENA System to help parents track their progress towards increasing interactive talk in their homes. === Developmental Snapshot === The LENA Developmental Snapshot, based on a 52-question parent survey, assesses both expressive and receptive language skills and provides an estimate of a child's developmental age from 2 months to 36 months.

    Read more →
  • Pattern playback

    Pattern playback

    The pattern playback is an early talking device that was built by Dr. Franklin S. Cooper and his colleagues, including John M. Borst and Caryl Haskins, at Haskins Laboratories in the late 1940s and completed in 1950. There were several different versions of this hardware device. Only one currently survives. The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. Using this device, Alvin Liberman, Frank Cooper, and Pierre Delattre (later joined by Katherine Safford Harris, Leigh Lisker, and others) were able to discover acoustic cues for the perception of phonetic segments (consonants and vowels). This research was fundamental to the development of modern techniques of speech synthesis, reading machines for the blind, the study of speech perception and speech recognition, and the development of the motor theory of speech perception. To create sound, the pattern playback machine uses an arc light source which is directed against a rotating disk with 50 concentric tracks whose transparencies vary systematically in order to produce 50 harmonics of a fundamental frequency. The light is further projected against a spectrogram, whose reflectance corresponds to the sound pressure level of the partial of the signal, and is then directed towards a photovoltaic cell by which the light variation is converted into sound pressure variations. The pattern playback was last used in an experimental study by Robert Remez in 1976. The pattern playback now resides in the Museum at Haskins Laboratories in New Haven, Connecticut. The technique of pattern playback also now refers, more generally, to algorithms or techniques for converting spectrograms, cochleagrams, and correlograms from pictures back into sounds. A demonstration is in the TV show Adventure. Pioneering technology in psycholinguistics (CBS Television. 1953). == Digital pattern playback == In the 1970s, digital pattern playbacks began to supplant the earlier version. An early prototype was developed by Patrick Nye, Philip Rubin, and colleagues at Haskins Laboratories. It combined a "Ubiquitous Spectrum Analyzer"[1] for automatic spectral analysis, along with a VAX GT-40 display processor for graphic manipulation of the displayed spectrogram, a form of "synthesis by art", and subsequent re-synthesis using a 40 channel filter bank. This hybrid hardware/software digital pattern playback was eventually replaced at Haskins Laboratories by the HADES analysis and display system, designed by Philip Rubin, and implemented in Fortran on the VAX family of computers. A more modern version has been described by Arai and colleagues [2]. An on-line demonstration is available [3].

    Read more →
  • Quantum robotics

    Quantum robotics

    Quantum robotics is an interdisciplinary field that investigates the intersection of robotics and quantum mechanics. This field, in particular, explores the applications of quantum phenomena such as quantum entanglement within the realm of robotics. Examples of its applications include quantum communication in multi-agent cooperative robotic scenarios, the use of quantum algorithms in performing robotics tasks, and the integration of quantum devices (e.g., quantum detectors) in robotic systems. == Introduction == The free-space quantum communication between mobile platforms was proposed for reconfigurable quantum key distribution (QKD) applications using unmanned aerial vehicle (UAVs, a.k.a. drones) in 2017. This technology was later advanced in various aspects in mobile drone and vehicle platforms in several configurations such as drone-to-drone, drone-to-moving vehicle, and vehicle-to-vehicle systems. Some research has contributed to low-size, low-weight, and low-power quantum key distribution systems for small-form UAVs, the characterization of a polarization-based receiver for mobile free-space optical QKD, and optical-relayed entanglement distribution using drones as mobile nodes. The topic of free-space quantum communication between mobile platforms, initially developed to meet the need for free-space QKD and entanglement distribution using mobile nodes, was brought into the robotics domain as an emerging interdisciplinary mechatronics topic to investigate the interface between quantum technologies and the robotic systems domain. The main advantage of such integrated technology is the guaranteed security in communication between multi-agent and cooperative autonomous systems. Other advances are anticipated. == Quantum entanglement == According to quantum mechanics, entanglement occurs when more than one particle become connected. If the state of one particle changes then it will instantly change the state of other particles regardless of their distance. Entangled sensors do the same kind of work and achieve strong sensitivity. A group of quantum robots can measure magnetic fields, gravitational fields and other physical properties using entangled sensors with high rate of accuracy. Again the connection of one robot to other is increased (become strong) by quantum entanglement. == Quantum teleportation == Quantum teleportation is the transfer of quantum information (not physical objects). This is used in case of multi robot process. One robot is programmed with a complex quantum update. Then that robot can teleport that complex quantum information (the update) to other robots. This teleportation or communication is very secure because all the work is done in quantum state. == Kinematics == Quantum computing has been proposed as being optimal for calculating inverse kinematics values. == Alice and Bob robots == In the realm of quantum mechanics, the names Alice and Bob are frequently employed to illustrate various phenomena, protocols, and applications. These include their roles in QKD, quantum cryptography, entanglement, and teleportation. The terms "Alice Robot" and "Bob Robot" serve as analogous expressions that merge the concepts of Alice and Bob from quantum mechanics with mechatronic mobile platforms (such as robots, drones, and autonomous vehicles). For example, the Alice Robot functions as a transmitter platform that communicates with the Bob Robot, housing the receiving detectors.

    Read more →
  • You Only Look Once

    You Only Look Once

    You Only Look Once (YOLO) is a series of real-time object detection systems based on convolutional neural networks. First introduced by Joseph Redmon et al. in 2015, YOLO has undergone several iterations and improvements, becoming one of the most popular object detection frameworks. The name "You Only Look Once" refers to the fact that the algorithm requires only one forward propagation pass through the neural network to make predictions, unlike previous region proposal-based techniques like R-CNN that require thousands for a single image. == Overview == Compared to previous methods like R-CNN and OverFeat, instead of applying the model to an image at multiple locations and scales, YOLO applies a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities. === OverFeat === OverFeat was an early influential model for simultaneous object classification and localization. Its architecture is as follows: Train a neural network for image classification only ("classification-trained network"). This could be one like the AlexNet. The last layer of the trained network is removed, and for every possible object class, initialize a network module at the last layer ("regression network"). The base network has its parameters frozen. The regression network is trained to predict the ( x , y ) {\displaystyle (x,y)} coordinates of two corners of the object's bounding box. During inference time, the classification-trained network is run over the same image over many different zoom levels and croppings. For each, it outputs a class label and a probability for that class label. Each output is then processed by the regression network of the corresponding class. This results in thousands of bounding boxes with class labels and probability. These boxes are merged until only one single box with a single class label remains. == Versions == There are two parts to the YOLO series. The original part contained YOLOv1, v2, and v3, all released on a website maintained by Joseph Redmon. === YOLOv1 === The original YOLO algorithm, introduced in 2015, divides the image into an S × S {\displaystyle S\times S} grid of cells. If the center of an object's bounding box falls into a grid cell, that cell is said to "contain" that object. Each grid cell predicts B bounding boxes and confidence scores for those boxes. These confidence scores reflect how confident the model is that the box contains an object and how accurate it thinks the box is that it predicts. In more detail, the network performs the same convolutional operation over each of the S 2 {\displaystyle S^{2}} patches. The output of the network on each patch is a tuple as follows: ( p 1 , … , p C , c 1 , x 1 , y 1 , w 1 , h 1 , … , c B , x B , y B , w B , h B ) {\displaystyle (p_{1},\dots ,p_{C},c_{1},x_{1},y_{1},w_{1},h_{1},\dots ,c_{B},x_{B},y_{B},w_{B},h_{B})} where p i {\displaystyle p_{i}} is the conditional probability that the cell contains an object of class i {\displaystyle i} , conditional on the cell containing at least one object. x j , y j , w j , h j {\displaystyle x_{j},y_{j},w_{j},h_{j}} are the center coordinates, width, and height of the j {\displaystyle j} -th predicted bounding box that is centered in the cell. Multiple bounding boxes are predicted to allow each prediction to specialize in one kind of bounding box. For example, slender objects might be predicted by j = 2 {\displaystyle j=2} while stout objects might be predicted by j = 1 {\displaystyle j=1} . c j {\displaystyle c_{j}} is the predicted intersection over union (IoU) of each bounding box with its corresponding ground truth. The network architecture has 24 convolutional layers followed by 2 fully connected layers. During training, for each cell, if it contains a ground truth bounding box, then only the predicted bounding boxes with the highest IoU with the ground truth bounding boxes is used for gradient descent. Concretely, let j {\displaystyle j} be that predicted bounding box, and let i {\displaystyle i} be the ground truth class label, then x j , y j , w j , h j {\displaystyle x_{j},y_{j},w_{j},h_{j}} are trained by gradient descent to approach the ground truth, p i {\displaystyle p_{i}} is trained towards 1 {\displaystyle 1} , other p i ′ {\displaystyle p_{i'}} are trained towards zero. If a cell contains no ground truth, then only c 1 , c 2 , … , c B {\displaystyle c_{1},c_{2},\dots ,c_{B}} are trained by gradient descent to approach zero. === YOLOv2 === Released in 2016, YOLOv2 (also known as YOLO9000) improved upon the original model by incorporating batch normalization, a higher resolution classifier, and using anchor boxes to predict bounding boxes. It could detect over 9000 object categories. It was also released on GitHub under the Apache 2.0 license. === YOLOv3 === YOLOv3, introduced in 2018, contained only "incremental" improvements, including the use of a more complex backbone network, multiple scales for detection, and a more sophisticated loss function. === YOLOv4 and beyond === Subsequent versions of YOLO (v4, v5, etc.) have been developed by different researchers, further improving performance and introducing new features. These versions are not officially associated with the original YOLO authors but build upon their work. As of 2026, versions up to YOLO26 have been released..

    Read more →
  • Clapper (service)

    Clapper (service)

    Clapper is an American short-form video-hosting service headquartered in Dallas, Texas. It was founded in 2020 by Edison Chen as an alternative for TikTok for mature audiences. The app is functionally similar to TikTok and includes tipping and e-commerce features. Following an influx of far-right content in early 2021, Clapper strengthened its moderation practices. It achieved 2 million monthly active users by 2023, and the number of downloads increased after a U.S. bill that would potentially ban TikTok in the country was signed in 2024. == History == With its offices in Dallas, Texas, Clapper was founded in July 2020 by Chinese-American entrepreneur Edison Chen. Chen considered that most online platforms, such as TikTok, were being targeted to young generations, such as Generation Z. He then concepted Clapper as a service with short-form content for mature audiences among Generation X and millennials, while not intending to compete directly with TikTok. Clapper averaged fewer than ten thousand daily active users during 2020, reaching 500 thousand downloads in the next year. Initially without paying for external advertising, the company raised about $3 million during a 2021 seed funding round. In 2023, the app reportedly reached about 300 to 400 thousand daily active users and 2 million monthly active users. The average user was between the ages of 35 and 55. Following the April 2024 signing of the Protecting Americans from Foreign Adversary Controlled Applications Act, which would potentially enact a ban on TikTok in the U.S. in January 2025, Clapper averaged 200 thousand weekly downloads. In 2025, before the day scheduled for the ban (January 19), TikTok users migrated to other apps. As a result, Clapper received 1.4 million new downloads in a week preceding the date. It was listed as the third most-downloaded free app on Apple's App Store on January 14, behind Xiaohongshu and Lemon8, and the term "TikTok refugee" became a trending term. == Features == Clapper presents similarities with TikTok in its layout, including "Following" and "For You" tabs with videos up to three minutes long that can be liked, commented on or shared. A "Clapback" feature allows users to create responses to videos from others. Users can create livestreams and chat rooms in the app. Users can tip Clapper creators through its Clapper Fam monetization feature, in place of in-app advertisements. The Clapper Shop allows for e-commerce between users. The service had distributed $10 million to its users in total by 2023, according to Clapper CEO Chen. == Content == Clapper includes a policy requiring users to be at least 17 years of age, although Clapper CEO Chen described that "there is no adult content" on the platform. Lindsay Dodgson of Business Insider described the content as generally outdated and "reminiscent of 'getting owned' compilations of the earlier internet." The Washington Post's Tatum Hunter characterized Clapper as including sexual or engagement baiting content more prevalently than TikTok. === Moderation === Clapper's team, which had fifteen employees in early 2021, initially stated it would not moderate content as strictly as TikTok and would mostly rely on user reports. Following that year's January 6 United States Capitol attack, far-right conservative videos promoting QAnon and anti-vaccine conspiracy theories appeared on Clapper's "For You" page to a substantial degree for weeks. The videos were made in protest against decisions by platforms, particularly TikTok, to ban such content. Clapper's team stated in January 10 that its rules prohibiting incitements to violence would be strictly enforced. By February, videos and accounts promoting the conspiracy theories had been removed, and QAnon-related content was banned permanently. Clapper's team hired more content auditors and implemented moderation by artificial intelligence for further community guideline violations.

    Read more →
  • Comparison of user features of messaging platforms

    Comparison of user features of messaging platforms

    Comparison of user features of messaging platforms refers to a comparison of all the various user features of various electronic instant messaging platforms. This includes a wide variety of resources; it includes standalone apps, platforms within websites, computer software, and various internal functions available on specific devices, such as iMessage for iPhones. This entry includes only the features and functions that shape the user experience for such apps. A comparison of the underlying system components, programming aspects, and other internal technical information, is outside the scope of this entry. == Overview and background == Instant messaging technology is a type of online chat that offers real-time text transmission over the Internet. A LAN messenger operates in a similar way over a local area network. Short messages are typically transmitted between two parties when each user chooses to complete a thought and select "send". Some IM applications can use push technology to provide real-time text, which transmits messages character by character, as they are composed. More advanced instant messaging can add file transfer, clickable hyperlinks, Voice over IP, or video chat. Non-IM types of chat include multicast transmission, usually referred to as "chat rooms", where participants might be anonymous or might be previously known to each other (for example collaborators on a project that is using chat to facilitate communication). Instant messaging systems tend to facilitate connections between specified known users (often using a contact list also known as a "buddy list" or "friend list"). Depending on the IM protocol, the technical architecture can be peer-to-peer (direct point-to-point transmission) or client-server (an Instant message service center retransmits messages from the sender to the communication device). By 2010, instant messaging over the Web was in sharp decline, in favor of messaging features on social networks. The most popular IM platforms were terminated, such as AIM which closed down and Windows Live Messenger which merged into Skype. Instant messaging has since seen a revival in popularity in the form of "messaging apps" (usually on mobile devices) which by 2014 had more users than social networks. As of 2010, social networking providers often offer IM abilities. Facebook Chat is a form of instant messaging, and Twitter can be thought of as a Web 2.0 instant messaging system. Similar server-side chat features are part of most dating websites, such as OkCupid or PlentyofFish. The spread of smartphones and similar devices in the late 2000s also caused increased competition with conventional instant messaging, by making text messaging services still more ubiquitous. Many instant messaging services offer video calling features, voice over IP and web conferencing services. Web conferencing services can integrate both video calling and instant messaging abilities. Some instant messaging companies are also offering desktop sharing, IP radio, and IPTV to the voice and video features. The term "Instant Messenger" is a service mark of Time Warner and may not be used in software not affiliated with AOL in the United States. For this reason, in April 2007, the instant messaging client formerly named Gaim (or gaim) announced that they would be renamed "Pidgin". In the 2010s, more people started to use messaging apps on modern computers and devices like WhatsApp, WeChat, Viber, Facebook Messenger, Telegram, Signal and Line rather than instant messaging on computers like AIM and Windows Live Messenger. For example, WhatsApp was founded in 2009, and Facebook acquired in 2014, by which time it already had half a billion users. === Concepts === ==== Backchannel ==== Backchannel is the practice of using networked computers to maintain a real-time online conversation alongside the primary group activity or live spoken remarks. The term was coined in the field of linguistics to describe listeners' behaviours during verbal communication. (See Backchannel (linguistics).) The term "backchannel" generally refers to online conversation about the conference topic or speaker. Occasionally backchannel provides audience members a chance to fact-check the presentation. First growing in popularity at technology conferences, backchannel is increasingly a factor in education where WiFi connections and laptop computers allow participants to use ordinary chat like IRC or AIM to actively communicate during presentation. More recent research include works where the backchannel is brought publicly visible, such as the ClassCommons, backchan.nl and Fragmented Social Mirror. Twitter is also widely used today by audiences to create backchannels during broadcasting of content or at conferences. For example, television drama, other forms of entertainment and magazine programs. This practice is often also called live tweeting. Many conferences nowadays also have a hashtag that can be used by the participants to share notes and experiences; furthermore such hashtags can be user generated. == Features == Various platforms and apps are distinguished by their strengths and features in regards to specific functions. === Group messaging === === Official channels === Some apps include a feature known as "official channels" which allows companies, especially news media outlets, publications, and other mass media companies, to offer an official channel, which users can join, and thereby receive regular updates, published articles, or news updates from companies or news outlets. Two apps which have a large amount of such channels available are Line and Telegram. === Video group calls === == Basic default platforms == Basic platforms which are common across entire categories of mobile devices, computers, or operating systems. === SMS === SMS (short message service) is a text messaging service component of most telephone, Internet, and mobile device systems. It uses standardized communication protocols to enable mobile devices to exchange short text messages. An intermediary service can facilitate a text-to-voice conversion to be sent to landlines. SMS, as used on modern devices, originated from radio telegraphy in radio memo pagers that used standardized phone protocols. These were defined in 1985 as part of the Global System for Mobile Communications (GSM) series of standards. The first test SMS message was sent on December 3, 1992, when Neil Papwort, a test engineer for Sema Group, used a personal computer to send "Merry Christmas" to the phone of colleague Richard Jarvis. It commercially rolled out to many cellular networks that decade. SMS became hugely popular worldwide as a way of text communication. By the end of 2010, SMS was the most widely used data application, with an estimated 3.5 billion active users, or about 80% of all mobile phone subscribers. The protocols allowed users to send and receive messages of up to 160 characters (when entirely alpha-numeric) to and from GSM mobiles. Although most SMS messages are sent from one mobile phone to another, support for the service has expanded to include other mobile technologies, such as ANSI CDMA networks and Digital AMPS. Mobile marketing, a type of direct marketing, uses SMS. According to a 2018 market research report the global SMS messaging business was estimated to be worth over US$100 billion, accounting for almost 50 percent of all the revenue generated by mobile messaging. A Flash SMS is a type of SMS that appears directly on the main screen without user interaction and is not automatically stored in the inbox. It can be useful in emergencies, such as a fire alarm or cases of confidentiality, as in delivering one-time passwords. ==== Threaded SMS format ==== Threaded SMS is a visual styling orientation of SMS message history that arranges messages to and from a contact in chronological order on a single screen. It was first invented by a developer working to implement the SMS client for the BlackBerry, who was looking to make use of the blank screen left below the message on a device with a larger screen capable of displaying far more than the usual 160 characters, and was inspired by threaded Reply conversations in email. Visually, this style of representation provides a back-and-forth chat-like history for each individual contact. Hierarchical-threading at the conversation-level (as typical in blogs and on-line messaging boards) is not widely supported by SMS messaging clients. This limitation is due to the fact that there is no session identifier or subject-line passed back and forth between sent and received messages in the header data (as specified by SMS protocol) from which the client device can properly thread an incoming message to a specific dialogue, or even to a specific message within a dialogue. Most smart phone text-messaging-clients are able to create some contextual threading of "group messages" which narrows the context of the thread around the common interests shared by

    Read more →
  • Normalization (image processing)

    Normalization (image processing)

    In image processing, normalization is a process that changes the range of pixel intensity values, a kind of intensity mapping. Applications include photographs with poor contrast due to glare, for example. A typical case is contrast stretching. In more general fields of data processing, such as digital signal processing, it is referred to as dynamic range expansion. The purpose of dynamic range expansion in the various applications is usually to bring the image, or other type of signal, into a range that is more familiar or normal to the senses, hence the term normalization. Often, the motivation is to achieve consistency in dynamic range for a set of data, signals, or images to avoid mental distraction or fatigue. For example, a newspaper will strive to make all of the images in an issue share a similar range of grayscale. Auto-normalization in image processing software typically normalizes to the full dynamic range of the number system specified in the image file format. == Definition == Normalization transforms an n-dimensional grayscale image I : { X ⊆ R n } → { Min , . . , Max } {\displaystyle I:\{\mathbb {X} \subseteq \mathbb {R} ^{n}\}\rightarrow \{{\text{Min}},..,{\text{Max}}\}} with intensity values in the range ( Min , Max ) {\displaystyle ({\text{Min}},{\text{Max}})} , into a new image I N : { X ⊆ R n } → { newMin , . . , newMax } {\displaystyle I_{N}:\{\mathbb {X} \subseteq \mathbb {R} ^{n}\}\rightarrow \{{\text{newMin}},..,{\text{newMax}}\}} with intensity values in the range ( newMin , newMax ) {\displaystyle ({\text{newMin}},{\text{newMax}})} . The linear normalization of a grayscale digital image is performed according to the formula I N = ( I − Min ) newMax − newMin Max − Min + newMin {\displaystyle I_{N}=(I-{\text{Min}}){\frac {{\text{newMax}}-{\text{newMin}}}{{\text{Max}}-{\text{Min}}}}+{\text{newMin}}} For example, if the intensity range of the image is 50 to 180 and the desired range is 0 to 255 the process entails subtracting 50 from each of pixel intensity, making the range 0 to 130. Then each pixel intensity is multiplied by 255/130, making the range 0 to 255. Normalization might also be non-linear, as the relationship between I {\displaystyle I} and I N {\displaystyle I_{N}} may not be linear. An example of non-linear normalization is when the normalization follows a sigmoid function, in which case the normalized image is computed according to the formula I N = ( newMax − newMin ) 1 1 + e − I − β α + newMin {\displaystyle I_{N}=({\text{newMax}}-{\text{newMin}}){\frac {1}{1+e^{-{\frac {I-\beta }{\alpha }}}}}+{\text{newMin}}} Where α {\displaystyle \alpha } defines the width of the input intensity range, and β {\displaystyle \beta } defines the intensity around which the range is centered. Gamma correction (log/inverse log) is also a common transformation function. === Colorspace === Intensity operations generally operate on a colorspace that maps to the human perception of lightness without intentionally changing the other properties. This can be done, for example, by operating on the L component of the CIELAB color space, or approximately by operating on the Y component of YCbCr. It is also possible to operate on each of the RGB color channels, though the result will not always make sense. == Contrast stretching == This is the most significant and essential technique of spatial-based image enhancement. The basic intent of this contrast enhancement technique is to adjust the local contrast in the image so as to bring out the clear regions or objects in the image. Low-contrast images often result from poor or non-uniform lighting conditions, a limited dynamic range of the imaging sensor, or improper settings of the lens aperture. This operation tries to change the intensity of the pixel in the image, particularly in the input image, to obtain an enhanced image. It is based on the number of techniques, namely local, global, dark and bright levels of contrast. The contrast enhancement is considered as the amount of color or gray differentiation that lies among the different features in an image. The contrast enhancement improves the quality of image by increasing the luminance difference between the foreground and background. A contrast stretching transformation can be achieved by: Stretching the dark range of input values into a wider range of output values: This involves increasing the brightness of the darker areas in the image to enhance details and improve visibility. Shifting the mid-range of input values: This involves adjusting the brightness levels of the mid-tones in the image to improve overall contrast and clarity. Compressing the bright range of input values: This process involves reducing the brightness of the brighter areas in the image to prevent overexposure resulting in a more balanced and visually appealing image. It can be described as the following piecewise funciton: I N = { s 1 r 1 I if I < r 1 s 2 − s 1 r 1 − r 2 ( I − r 1 ) if r 1 ≤ I ≤ r 2 1 − s 2 1 − r 2 ( I − r 2 ) if I > r 2 {\displaystyle I_{N}={\begin{cases}{\frac {s_{1}}{r_{1}}}I&{\text{if }}Ir_{2}\end{cases}}} Where: ( r 1 , s 1 ) {\displaystyle (r_{1},s_{1})} defines the transition point between the "dark" range to the "main" range. ( r 2 , s 2 ) {\displaystyle (r_{2},s_{2})} defines the transition point between the "main" range to the "bright" range. A typical linear stretch is obtained when ( r 1 , s 1 ) = ( r min , 0 ) {\displaystyle (r_{1},s_{1})=(r_{\text{min}},0)} and ( r 2 , s 2 ) = ( r max , 1 ) {\displaystyle (r_{2},s_{2})=(r_{\text{max}},1)} , where r min {\displaystyle r_{\text{min}}} and r max {\displaystyle r_{\text{max}}} denote the minimum and maximum levels in the source image. === Global contrast stretching === Global Contrast Stretching considers all color palate ranges at once to determine the maximum and minimum values for the entire RGB color image. This approach utilizes the combination of RGB colors to derive a single maximum and minimum value for contrast stretching across the entire image. === Local contrast stretching === Local contrast stretching (LCS) is an image enhancement method that focuses on locally adjusting each pixel's value to improve the visualization of structures within an image, particularly in both the darkest and lightest portions. It operates by utilizing sliding windows, known as kernels, which traverse the image. The central pixel within each kernel is adjusted using the following formula: I p ( x , y ) = 255 × [ I 0 ( x , y ) − m i n ] ( m a x − m i n ) {\displaystyle I_{p}(x,y)=255\times {\frac {[I_{0}(x,y)-min]}{(max-min)}}} Where: Ip(x,y) is the color level for the output pixel (x,y) after the contrast stretching process. I0(x,y) is the color level input for data pixel (x, y). max is the maximum value for color level in the input image within the selected kernel. min is the minimum value for color level in the input image within the selected kernel. A piecewise form (see above) may also be used. LCS can be applied to the three color channels of an image separately.

    Read more →
  • Subpixel rendering

    Subpixel rendering

    Subpixel rendering is a method used to increase the effective resolution of a color display device. It utilizes the composition of each pixel, which consists of three subpixels of which are red, green, and blue that can each be individually addressable on the display matrix. Subpixel rendering is primarily used for text rendering on standard DPI displays. Despite the inherent color anomalies, it can also be used to render general graphics. == History == The origin of subpixel rendering as used today remains controversial. Apple Inc., IBM, and Microsoft patented various implementations that differed in technical details owing to the different purposes for which their technologies were intended. Microsoft held several patents in the United States for subpixel rendering technology used in text rendering on RGB Stripe layouts. The patents 6,219,025; 6,239,783; 6,307,566; 6,225,973; 6,243,070; 6,393,145; 6,421,054; 6,282,327; and 6,624,828 were filed between October 7, 1998, and October 7, 1999, and expired on July 30, 2019. Analysis of the patent by FreeType indicates that the patent does not cover the idea of subpixel rendering, but rather the actual filter used as a last step to balance the color. Microsoft's patent describes the smallest possible filter that distributes each subpixel value equally among the R, G, and B pixels. Any other filter will either be blurrier or will introduce color artifacts. Apple was able to use it in Mac OS X due to a patent cross-licensing agreement. == Characteristics == A single pixel on a color display is made of several subpixels, typically three arranged left-to-right as red, green, and blue (RGB). The components are readily visible with a small magnifying glass, such as a loupe. These pixel components appear as a single color to the human eye because of blurring by optics and spatial integration by nerve cells in the eye. However, the eye is much more sensitive to the location. Therefore, turning on the G and B of one pixel and the R of the next pixel to the right will produce a white dot, but it will appear to be 1/3 of a pixel to the right of the white dot that would be seen from the RGB of only the first pixel. Subpixel rendering leverages this to provide three times the horizontal resolution of the rendered image. However, it has to blur this image to produce the correct color by ensuring the same amount of red, green, and blue are turned on as when no subpixel rendering is being done. Subpixel rendering does not necessitate the use of antialiasing. It gives a smoother result regardless of whether antialiasing is used or not since it artificially increases the resolution. However, it introduces color aliasing since subpixels are colored. Subsequent filtering applied to remove the color artifacts is a form of antialiasing, although its purpose is not smoothing jagged shapes as in conventional antialiasing. Subpixel rendering requires the software to know the layout of the subpixels. The most common reason it is wrong is monitors that can be rotated 90 (or 180) degrees, though monitors are manufactured with other arrangements of the subpixels, such as BGR or in triangles, or with 4 colors like RGBW squares. On any such display the result of incorrect subpixel rendering will be worse than if no subpixel rendering was done at all (it will not produce color artifacts, but it will produce noisy edges). == Implementations == === Apple II === Steve Gibson has claimed that the Apple II, introduced in 1977, supports an early form of subpixel rendering in its high-resolution (280×192) graphics mode. The Wozniak patent only used 2 "sub-pixels". The bytes that comprise the Apple II high-resolution screen buffer contain seven visible bits (each corresponding directly to a pixel) and a flag bit used to select between purple/green or blue/orange color sets. Each pixel, since it is represented by a single bit, is either on or off; there are no bits within the pixel itself for specifying color or brightness. Color is instead created as an artifact of the NTSC color encoding scheme, determined by horizontal position: pixels with even horizontal coordinates are always purple (or blue, if the flag bit is set), and odd pixels are always green (or orange). Two lit pixels next to each other are always white, regardless of whether the pair is even/odd or odd/even, and irrespective of the value of the flag bit. This is an approximation, but it is what most programmers of the time would have in mind while working with the Apple's high-resolution mode. Gibson's example claims that because two adjacent bits form a white block, there are, in fact, two bits per pixel: one that activates the pixel's purple left half and the other that activates its green right half. If the programmer instead activates the green right half of a pixel and the purple left half of the next pixel, the result is a white block 1/2 pixel to the right, which is indeed an instance of subpixel rendering. However, it is not clear whether any programmers of the Apple II have considered the pairs of bits as pixels—instead calling each bit a pixel. The flag bit in each byte affects color by shifting pixels half a pixel-width to the right. This half-pixel shift was exploited by some graphics software, such as HRCG (High-Resolution Character Generator), an Apple utility that displayed text using the high-resolution graphics mode, to smooth diagonals. === ClearType === Microsoft announced its subpixel rendering technology, called ClearType, at COMDEX in 1998. Microsoft published a paper in May 2000, Displaced Filtering for Patterned Displays, describing the filtering behind ClearType. It was then made available in Windows XP. Still, it was not activated by default until Windows Vista, while Windows XP OEMs could and did change the default setting. === FreeType === FreeType, the library used by most current software on the X Window System, contains two open source implementations. The original implementation uses the ClearType antialiasing filters and carries the following notice: "The colour filtering algorithm of Microsoft's ClearType technology for subpixel rendering is covered by patents; for this reason, the corresponding code in FreeType is disabled by default. Note that subpixel rendering per se is prior art; using a different colour filter thus easily circumvents Microsoft's patent claims." FreeType offers a variety of color filters. Since version 2.6.2, the default filter is light, a filter that is both normalized (value sums up to 1) and color-balanced (eliminate color fringes at the cost of resolution). Since version 2.8.1, a second implementation exists, called Harmony, that "offers high quality LCD-optimized output without resorting to ClearType techniques of resolution tripling and filtering". This is the method enabled by default. When using this method, "each color channel is generated separately after shifting the glyph outline, capitalizing on the fact that the color grids on LCD panels are shifted by a third of a pixel. This output is indistinguishable from ClearType with a light 3-tap filter." Since the Harmony method does not require additional filtering, it is not covered by the ClearType patents. === CoolType === Adobe created their own subpixel renderer called CoolType, allowing them to display documents the same way across various operating systems: Windows, MacOS, Linux etc. When it was launched around the year 2001, CoolType supported a wider range of fonts than Microsoft's ClearType, which at the time was limited to TrueType fonts. In contrast, Adobe's CoolType also supported PostScript fonts (and their OpenType equivalents). === macOS === Mac OS X (later OS X, now macOS) also used subpixel rendering, as part of Quartz 2D. However, it was removed after the introduction of Retina displays. Unlike Microsoft's implementation, which favors a tight fit to the grid (font hinting) to maximize legibility, Apple's implementation prioritizes the shape of the glyphs as set out by their designer.

    Read more →
  • Feed forward (control)

    Feed forward (control)

    A feed forward (sometimes written feedforward) is an element or pathway within a control system that passes a controlling signal from a source in its external environment to a load elsewhere in its external environment. This is often a command signal from an external operator. In control engineering, a feedforward control system is a control system that uses sensors to detect disturbances affecting the system and then applies an additional input to minimize the effect of the disturbance. This requires a mathematical model of the system so that the effect of disturbances can be properly predicted. A control system which has only feed-forward behavior responds to its control signal in a pre-defined way without responding to the way the system reacts; it is in contrast with a system that also has feedback, which adjusts the input to take account of how it affects the system, and how the system itself may vary unpredictably. In a feed-forward system, the control variable adjustment is not error-based. Instead it is based on knowledge about the process in the form of a mathematical model of the process and knowledge about, or measurements of, the process disturbances. Some prerequisites are needed for control scheme to be reliable by pure feed-forward without feedback: the external command or controlling signal must be available, and the effect of the output of the system on the load should be known (that usually means that the load must be predictably unchanging with time). Sometimes pure feed-forward control without feedback is called 'ballistic', because once a control signal has been sent, it cannot be further adjusted; any corrective adjustment must be by way of a new control signal. In contrast, 'cruise control' adjusts the output in response to the load that it encounters, by a feedback mechanism. These systems could relate to control theory, physiology, or computing. == Overview == With feed-forward or feedforward control, the disturbances are measured and accounted for before they have time to affect the system. In the house example, a feed-forward system may measure the fact that the door is opened and automatically turn on the heater before the house can get too cold. The difficulty with feed-forward control is that the effects of the disturbances on the system must be accurately predicted, and there must not be any unmeasured disturbances. For instance, if a window was opened that was not being measured, the feed-forward-controlled thermostat might let the house cool down. The term has specific meaning within the field of CPU-based automatic control. The discipline of feedforward control as it relates to modern, CPU based automatic controls is widely discussed, but is seldom practiced due to the difficulty and expense of developing or providing for the mathematical model required to facilitate this type of control. Open-loop control and feedback control, often based on canned PID control algorithms, are much more widely used. There are three types of control systems: open-loop, feed-forward, and feedback. An example of a pure open-loop control system is manual non-power-assisted steering of a motor car; the steering system does not have access to an auxiliary power source and does not respond to varying resistance to turning of the direction wheels; the driver must make that response without help from the steering system. In comparison, power steering has access to a controlled auxiliary power source, which depends on the engine speed. When the steering wheel is turned, a valve is opened which allows fluid under pressure to turn the wheels. A sensor monitors that pressure so that the valve only opens enough to cause the correct pressure to reach the wheel turning mechanism. This is feed-forward control where the output of the system, the change in direction of travel of the vehicle, plays no part in the system. See Model predictive control. If the driver is included in the system, then they do provide a feedback path by observing the direction of travel and compensating for errors by turning the steering wheel. In that case you have a feedback system, and the block labeled System in Figure(c) is a feed-forward system. In other words, systems of different types can be nested, and the overall system regarded as a black-box. Feedforward control is distinctly different from open-loop control and teleoperator systems. Feedforward control requires a mathematical model of the plant (process and/or machine being controlled) and the plant's relationship to any inputs or feedback the system might receive. Neither open-loop control nor teleoperator systems require the sophistication of a mathematical model of the physical system or plant being controlled. Control based on operator input without integral processing and interpretation through a mathematical model of the system is a teleoperator system and is not considered feedforward control. == History == Historically, the use of the term feedforward is found in works by Harold S. Black in US patent 1686792 (invented 17 March 1923) and D. M. MacKay as early as 1956. While MacKay's work is in the field of biological control theory, he speaks only of feedforward systems. MacKay does not mention feedforward control or allude to the discipline of feedforward controls. MacKay and other early writers who use the term feedforward are generally writing about theories of how human or animal brains work. Black also has US patent 2102671 invented 2 August 1927 on the technique of feedback applied to electronic systems. The discipline of feedforward controls was largely developed by professors and graduate students at Georgia Tech, MIT, Stanford and Carnegie Mellon. Feedforward is not typically hyphenated in scholarly publications. Meckl and Seering of MIT and Book and Dickerson of Georgia Tech began the development of the concepts of Feedforward Control in the mid-1970s. The discipline of Feedforward Controls was well defined in many scholarly papers, articles and books by the late 1980s. == Benefits == The benefits of feedforward control are significant and can often justify the extra cost, time and effort required to implement the technology. Control accuracy can often be improved by as much as an order of magnitude if the mathematical model is of sufficient quality and implementation of the feedforward control law is well thought out. Energy consumption by the feedforward control system and its driver is typically substantially lower than with other controls. Stability is enhanced such that the controlled device can be built of lower cost, lighter weight, springier materials while still being highly accurate and able to operate at high speeds. Other benefits of feedforward control include reduced wear and tear on equipment, lower maintenance costs, higher reliability and a substantial reduction in hysteresis. Feedforward control is often combined with feedback control to optimize performance. == Model == The mathematical model of the plant (machine, process or organism) used by the feedforward control system may be created and input by a control engineer or it may be learned by the control system. Control systems capable of learning and/or adapting their mathematical model have become more practical as microprocessor speeds have increased. The discipline of modern feedforward control was itself made possible by the invention of microprocessors. Feedforward control requires integration of the mathematical model into the control algorithm such that it is used to determine the control actions based on what is known about the state of the system being controlled. In the case of control for a lightweight, flexible robotic arm, this could be as simple as compensating between when the robot arm is carrying a payload and when it is not. The target joint angles are adjusted to place the payload in the desired position based on knowing the deflections in the arm from the mathematical model's interpretation of the disturbance caused by the payload. Systems that plan actions and then pass the plan to a different system for execution do not satisfy the above definition of feedforward control. Unless the system includes a means to detect a disturbance or receive an input and process that input through the mathematical model to determine the required modification to the control action, it is not true feedforward control. === Open system === In control theory, an open system is a feed forward system that does not have any feedback loop to control its output. In contrast, a closed system uses on a feedback loop to control the operation of the system. In an open system, the output of the system is not fed back into the input to the system for control or operation. == Applications == === Physiological feed-forward system === In physiology, feed-forward control is exemplified by the normal anticipatory regulation of heartbeat in advance of actual physical exertion by the central autonomic network. Feed-forward

    Read more →
  • Non-native speech database

    Non-native speech database

    A non-native speech database is a speech database of non-native pronunciations of English. Such databases are used in the development of: multilingual automatic speech recognition systems, text to speech systems, pronunciation trainers, and second language learning systems. == List == The actual table with information about the different databases is shown in Table 2. === Legend === In the table of non-native databases some abbreviations for language names are used. They are listed in Table 1. Table 2 gives the following information about each corpus: The name of the corpus, the institution where the corpus can be obtained, or at least further information should be available, the language which was actually spoken by the speakers, the number of speakers, the native language of the speakers, the total amount of non-native utterances the corpus contains, the duration in hours of the non-native part, the date of the first public reference to this corpus, some free text highlighting special aspects of this database and a reference to another publication. The reference in the last field is in most cases to the paper which is especially devoted to describe this corpus by the original collectors. In some cases it was not possible to identify such a paper. In these cases a paper is referenced which is using this corpus is. Some entries are left blank and others are marked with unknown. The difference here is that blank entries refer to attributes where the value is just not known. Unknown entries, however, indicate that no information about this attribute is available in the database itself. As an example, in the Jupiter weather database no information about the origin of the speakers is given. Therefore this data would be less useful for verifying accent detection or similar issues. Where possible, the name is a standard name of the corpus, for some of the smaller corpora, however, there was no established name and hence an identifier had to be created. In such cases, a combination of the institution and the collector of the database is used. In the case where the databases contain native and non-native speech, only attributes of the non-native part of the corpus are listed. Most of the corpora are collections of read speech. If the corpus instead consists either partly or completely of spontaneous utterances, this is mentioned in the Specials column.

    Read more →