AI Face Judge

AI Face Judge — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Hardware for artificial intelligence


    Specialized computer hardware is often used to execute artificial intelligence (AI) programs faster and with less energy. Approaches have included Lisp machines, neuromorphic engineering, event cameras, and physical neural networks. Since 2017, several consumer-grade CPUs and SoCs have included on-die NPUs. As of 2023, the market for AI hardware is dominated by graphics processing units (GPUs). More broadly, AI computation in the 2020s runs mainly on GPUs and newer domain-specific accelerators such as Google's Tensor Processing Units (TPUs), AMD's Instinct MI300 series, and various on-device neural processing units (NPUs) found in consumer hardware.

    == Scope ==

    For the purposes of this article, AI hardware refers to computing components and systems specifically designed or optimized to accelerate artificial-intelligence workloads such as machine-learning training or inference. This includes general-purpose accelerators used for AI (for example, GPUs) and domain-specific accelerators (for example, TPUs, NPUs, and other AI ASICs). Event-based cameras are sometimes discussed in the context of neuromorphic computing, but they are input sensors rather than AI compute devices. Similarly, components such as memristors, considered alone, are basic circuit elements rather than specialized AI hardware.

    == Lisp machines ==

    Lisp machines were developed in the late 1970s and early 1980s to make artificial intelligence programs written in the programming language Lisp run faster.

    == Dataflow architecture ==

    Dataflow architecture processors used for AI serve various purposes and vary in implementation. Examples include the polymorphic dataflow Convolution Engine by Kinara (formerly Deep Vision), structure-driven dataflow by Hailo, and dataflow scheduling by Cerebras.

    == Component hardware ==

    === AI accelerators ===

    Since the 2010s, advances in computer hardware have led to more efficient methods for training deep neural networks that contain many layers of non-linear hidden units and a very large output layer.
    By 2019, graphics processing units (GPUs), often with AI-specific enhancements, had displaced central processing units (CPUs) as the dominant means to train large-scale commercial cloud AI. OpenAI estimated the hardware compute used in the largest deep learning projects from AlexNet (2012) to AlphaZero (2017), and found a 300,000-fold increase in the amount of compute needed, with a doubling-time trend of 3.4 months.

    === General-purpose GPUs for AI ===

    Since the 2010s, graphics processing units (GPUs) have been widely used to train and deploy deep learning models because of their highly parallel architecture and high memory bandwidth. Modern data-center GPUs include dedicated tensor or matrix-math units that accelerate neural-network operations. In 2022, NVIDIA introduced the Hopper-generation H100 GPU, adding FP8 precision support and faster interconnects for large-scale model training. AMD and other vendors have also developed GPUs and accelerators aimed at AI and high-performance computing workloads.

    === Domain-specific accelerators (ASICs / NPUs) ===

    Beyond general-purpose GPUs, several companies have developed application-specific integrated circuits (ASICs) and neural processing units (NPUs) tailored for AI workloads. Google introduced the Tensor Processing Unit (TPU) in 2016 for deep-learning inference, with later generations supporting large-scale training through dense systolic-array designs and optical interconnects. Other vendors have released similar devices, such as Apple's Neural Engine and various on-device NPUs, that emphasize energy-efficient inference in mobile or edge computing environments.

    === Memory and interconnects ===

    AI accelerators rely on fast memory and inter-chip links to manage the large data volumes of training and inference. High-bandwidth memory (HBM) stacks, whose HBM3 generation was standardized in 2022, provide terabytes-per-second throughput on modern GPUs and ASICs.
These accelerators are often connected through dedicated fabrics such as NVIDIA's NVLink and NVSwitch or optical interconnects used in TPU systems to scale performance across thousands of chips.
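
The compute-growth figures quoted in the AI accelerators section (a 300,000-fold increase with a 3.4-month doubling time) can be checked against each other with a little arithmetic. The sketch below is only a rough consistency check, not OpenAI's methodology; the function names are made up for this illustration.

```python
import math

def doublings_for_growth(growth_factor: float) -> float:
    """Number of doublings needed to reach the given growth factor."""
    return math.log2(growth_factor)

def implied_span_months(growth_factor: float, doubling_time_months: float) -> float:
    """Time span implied by a growth factor at a constant doubling time."""
    return doublings_for_growth(growth_factor) * doubling_time_months

# Figures taken from the text: 300,000-fold growth, 3.4-month doubling time.
doublings_needed = doublings_for_growth(300_000)
span_months = implied_span_months(300_000, 3.4)

print(f"doublings: {doublings_needed:.2f}")
print(f"implied span: {span_months:.1f} months ({span_months / 12:.1f} years)")
```

A 300,000-fold increase corresponds to roughly 18 doublings, which at 3.4 months each spans a little over five years. That is in line with the 2012 to 2017 window named in the text.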

  • Supersampling


    Supersampling or supersampling anti-aliasing (SSAA) is a spatial anti-aliasing method, i.e. a method used to remove aliasing (jagged and pixelated edges, colloquially known as "jaggies") from images rendered in computer games or other computer programs that generate imagery. Aliasing occurs because, unlike real-world objects, which have continuous smooth curves and lines, a computer screen shows the viewer a large number of small squares. These pixels all have the same size, and each one has a single color. A line can only be shown as a collection of pixels, and therefore appears jagged unless it is perfectly horizontal or vertical. The aim of supersampling is to reduce this effect. Color samples are taken at several positions inside the pixel (not just at the center, as is normal), hence the term "supersampling", and an average color value is calculated. One way to achieve this is to render the image at a much higher resolution than the one being displayed and then shrink it to the desired size, using the extra pixels for the calculation. The result is a downsampled image with smoother transitions from one line of pixels to another along the edges of objects. Each pixel can also be supersampled using other strategies (see the Supersampling patterns section). The number of samples determines the quality of the output.

    == Motivation ==

    In 2D images, aliasing manifests as moiré patterns and pixelated edges, colloquially known as "jaggies". Standard signal-processing and image-processing theory suggests that perfectly eliminating aliasing requires proper spatial sampling at the Nyquist rate (or higher) after applying a 2D anti-aliasing filter. As this approach would require a forward and an inverse Fourier transform, computationally less demanding approximations like supersampling were developed to avoid domain switches by staying in the spatial domain ("image domain").
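
The render-large-then-shrink approach described in the introduction can be illustrated with a small sketch. It assumes a grayscale image stored as a list of rows and a plain box filter (an unweighted average); `render_edge` is a hypothetical stand-in for a real renderer, and neither function comes from any particular graphics library.

```python
def render_edge(size):
    """Hypothetical renderer: a hard diagonal edge, 1.0 where x > y, else 0.0."""
    return [[1.0 if x > y else 0.0 for x in range(size)] for y in range(size)]

def downsample(hi_res, factor):
    """Box-filter downsample: average each factor x factor block of samples."""
    out_height = len(hi_res) // factor
    out_width = len(hi_res[0]) // factor
    result = []
    for y in range(out_height):
        row = []
        for x in range(out_width):
            total = 0.0
            for dy in range(factor):
                for dx in range(factor):
                    total += hi_res[y * factor + dy][x * factor + dx]
            # Each output pixel is the mean of its factor**2 sub-samples.
            row.append(total / (factor * factor))
        result.append(row)
    return result

# Render at 8x8, display at 2x2: 16 samples per displayed pixel.
displayed = downsample(render_edge(8), 4)
print(displayed)  # [[0.375, 1.0], [0.0, 0.375]]
```

Pixels that straddle the edge receive an intermediate value (0.375 here) instead of snapping to 0 or 1, which is the smoothing effect the text describes.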
    == Method ==

    === Computational cost and adaptive supersampling ===

    Supersampling is computationally expensive: because the buffer used is several times larger, it requires much more video card memory and memory bandwidth. A way around this problem is to use a technique known as adaptive supersampling, where only pixels at the edges of objects are supersampled. Initially only a few samples are taken within each pixel. If these values are very similar, only these samples are used to determine the color. If not, more are used. As a result, a higher number of samples is calculated only where necessary, which improves performance.

    === Supersampling patterns ===

    When taking samples within a pixel, the sample positions have to be determined in some way. Although the number of ways in which this can be done is infinite, a few are commonly used.

    ==== Grid ====

    The simplest algorithm. The pixel is split into several sub-pixels, and a sample is taken from the center of each. It is fast and easy to implement. However, because the sampling is regular, aliasing can still occur if only a few sub-pixels are used.

    ==== Random ====

    Also known as stochastic sampling, it avoids the regularity of grid supersampling. However, due to the irregularity of the pattern, samples end up clustered redundantly in some areas of the pixel and lacking in others.

    ==== Poisson disk ====

    The Poisson disk sampling algorithm places the samples randomly, but then checks that no two are too close together. The end result is an even but random distribution of samples. The naive "dart throwing" algorithm is extremely slow for large data sets, which once limited its applications for real-time rendering. However, many fast algorithms now exist to generate Poisson disk noise, even with variable density. The Delone set provides a mathematical description of such sampling.
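
As an illustration of the naive "dart throwing" approach mentioned in the Poisson disk section, the following sketch draws random candidates in the unit square and rejects any that fall too close to an accepted sample. The function name, the attempt cap, and the fixed seed are assumptions made for this example; it is not one of the fast algorithms the text refers to.

```python
import math
import random

def dart_throwing(n_samples, min_dist, max_attempts=10_000, seed=0):
    """Naive Poisson disk sampling in the unit square by rejection."""
    rng = random.Random(seed)
    samples = []
    attempts = 0
    while len(samples) < n_samples and attempts < max_attempts:
        attempts += 1
        candidate = (rng.random(), rng.random())
        # Accept only if the candidate keeps its distance from every sample so far.
        if all(math.dist(candidate, s) >= min_dist for s in samples):
            samples.append(candidate)
    return samples

points = dart_throwing(n_samples=8, min_dist=0.2)
for x, y in points:
    print(f"({x:.3f}, {y:.3f})")
```

Every candidate is compared against all accepted samples, and the rejection rate climbs as the square fills up, which is why this approach scales poorly to large sample counts.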
    ==== Jittered ====

    A modification of the grid algorithm to approximate the Poisson disk. A pixel is split into several sub-pixels, and a sample is taken not from the center of each but from a random point within it. Clustering can still occur, but to a lesser degree.

    ==== Rotated grid ====

    A 2×2 grid layout is used, but the sample pattern is rotated to avoid samples aligning on the horizontal or vertical axis, greatly improving antialiasing quality for the most commonly encountered cases. For an optimal pattern, the rotation angle is arctan(1/2) (about 26.6°) and the square is stretched by a factor of √5/2, making it also a 4-queens solution.
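
The grid, jittered, and rotated grid patterns above can be generated in a few lines. This is a minimal sketch with sample positions expressed in a unit pixel; the function names and the fixed random seed are choices made for this example only.

```python
import math
import random

def grid_samples(n):
    """n x n regular grid: one sample at the center of each sub-pixel."""
    return [((i + 0.5) / n, (j + 0.5) / n) for j in range(n) for i in range(n)]

def jittered_samples(n, seed=0):
    """n x n grid with each sample moved to a random point in its sub-pixel."""
    rng = random.Random(seed)
    return [((i + rng.random()) / n, (j + rng.random()) / n)
            for j in range(n) for i in range(n)]

def rotated_grid_samples():
    """2 x 2 grid rotated by arctan(1/2) and stretched by sqrt(5)/2 about the center."""
    angle = math.atan(0.5)
    scale = math.sqrt(5) / 2
    c = math.cos(angle) * scale
    s = math.sin(angle) * scale
    rotated = []
    for x, y in grid_samples(2):
        dx, dy = x - 0.5, y - 0.5
        rotated.append((0.5 + c * dx - s * dy, 0.5 + s * dx + c * dy))
    return rotated

for x, y in rotated_grid_samples():
    print(f"({x:.3f}, {y:.3f})")
```

With this angle and stretch, the four samples land at (0.375, 0.125), (0.875, 0.375), (0.125, 0.625), and (0.625, 0.875). Each occupies its own row, column, and diagonal of a 4×4 subdivision of the pixel, which is the 4-queens property the text mentions.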

  • Boris FX


    Boris FX is a visual effects, video editing, photography, and audio software plug-in developer based in Miami, Florida, USA. The developer is known for its flagship products, Continuum (formerly Boris Continuum Complete/BCC), Sapphire, Mocha, and Silhouette. Boris FX creates plug-in tools for feature film, broadcast television, and multimedia post-production workflows. The plug-ins are compatible with various NLEs, including Adobe After Effects and Premiere Pro, Avid Media Composer, Apple Final Cut Pro, and OFX hosts such as Autodesk Flame, Foundry Nuke, Blackmagic Design DaVinci Resolve and Fusion, and VEGAS Pro. Boris FX has incorporated artificial intelligence into its software, introducing features for noise reduction, rotoscoping, upscaling, and masking. The company has acquired technologies via mergers and acquisitions from Imagineer Systems, GenArts, Silhouette FX, Digital Film Tools, CrumplePop, and Andersson Technologies to expand its visual effects, editing, photography, and audio tools.

    == History ==

    Boris FX was founded in 1995 by Boris Yamnitsky. The former Media 100 engineer (a member of the original Media 100 launch team in 1993) released “Boris FX,” the first plug-in-based digital video effects (DVE) for Adobe Premiere and Media 100, in 1995. The plug-in won Best of Show at Apple Macworld in Boston, MA that same year. The Boris FX Suite includes a range of visual effects and post-production tools, such as Sapphire, Continuum, Mocha Pro, Silhouette, SynthEyes, CrumplePop, Optics, and Particle Illusion.

    == Media 100 ==

    In October 2005, Yamnitsky acquired Media 100, the company that launched his plug-in career. Boris FX had a long relationship with Media 100, which bundled Boris RED software as its main titling and compositing solution. Media 100's video editing software is available as freeware for macOS.
    == Continuum ==

    Continuum is a visual effect and compositing plugin suite that includes a library of over 300 effects and more than 40 transitions, including tools for image restoration, compositing, titling, particle generation, and stylized effects, along with features such as lens flares, lighting effects, and cinematic color grading presets. A key component of Continuum is its integration with the Mocha planar tracking and masking system, enabling advanced tracking and rotoscoping within the effects. The suite also includes Particle Illusion, a real-time particle generator used for creating visual effects such as explosions, smoke, and abstract motion graphics, as well as Primatte Studio, a chroma keying and compositing toolset for green screen and blue screen workflows. Continuum supports GPU acceleration and offers compatibility with HDR and 360/VR content. Regular updates introduce new effects, presets, and performance enhancements to expand its capabilities.

    In October 2018, Continuum relaunched Particle Illusion, a Mocha Essentials workflow with magnetic edge-snapping, and updates to Title Studio. In October 2019, Continuum introduced Corner Pin Studio with built-in Mocha tracking for quick screen replacement and inserts, 6 stylized transitions, and 4 creative effects. In October 2020, Continuum released an update that included over 80 GPU-accelerated effects such as film stocks, color grades, optical filter simulations, and a digital gobo library. The update also introduced a custom FX Editor interface, real-time particles, and more than 1,000 drag-and-drop presets. In November 2021, it added multi-frame rendering for After Effects, native Apple M1 support, fluid dynamics in Particle Illusion, and 60 color-grade presets. In October 2022, the software introduced 10 additional transitions, a revised Particle Illusion workflow, an atmospheric glow effect, and more than 250 curated presets.
    Continuum plugins have been used in television, streaming, and film projects, including A Black Lady Sketch Show (HBO/HBO Max), Star Trek: Discovery (CBS), Andor (Disney+), The Curse of Oak Island (History Channel), Keeping up with the Kardashians (E!), This Old House (PBS), Ms. Marvel (Disney+), MasterChef (Fox), WipeOut (TBS), The Boys (Prime Video), and The Today Show (NBC).

    == Mocha Pro ==

    In December 2014, Boris FX merged with Imagineer Systems, the UK-based developer of the Academy Award-winning planar motion tracking software, Mocha Pro. Mocha Pro's features include planar tracking (motion tracking), rotoscoping, image stabilization, 3D camera tracking, and object removal. In June 2016, version 5 of Mocha Pro was released, introducing its tools as plug-ins for Adobe After Effects and Premiere Pro, Avid Media Composer, and the OFX hosts Foundry's NUKE, Blackmagic Design Fusion, VEGAS Pro, and HitFilm. A simplified version, Mocha AE, is included with Adobe After Effects Creative Cloud and has been bundled with the software since CS4. A similar version is also available with HitFilm Pro from FXhome and VEGAS Pro. Mocha's tracking SDK is integrated into other visual effects tools, including SAM Quantel Pablo Rio, Silhouette FX, CoreMelt, and Motion VFX. Mocha Pro has been used in various film and television productions, including Birdman, Black Swan, the Harry Potter series, The Hobbit, Star Wars, The Mandalorian, Star Trek: Discovery, and The Umbrella Academy. It has also been employed in projects such as Gone Girl, The Hunger Games: Mockingjay – Part 1, Game of Thrones, and House of Cards.

    == Sapphire ==

    GenArts, founded by Karl Sims in 1996, developed visual effects plug-ins that were used by studios and post-production facilities. In September 2016, Boris FX merged with former competitor GenArts, Inc., developer of the Sapphire high-end visual effects plug-ins, to expand its suite of motion graphics and VFX tools.
    The merger brought Sapphire alongside Boris Continuum Complete (BCC) and Mocha Pro, integrating these tools for film and television post-production. The Sapphire suite includes a library of over 270 effects and transitions, organized into categories such as lighting, stylization, distortions, textures, and transitions. Commonly used effects include glows, lens flares, film looks, and blurs. The plug-ins are designed to be GPU-accelerated, allowing for improved rendering performance and real-time previews in supported host applications. A central feature of Sapphire is the Builder tool, a node-based workspace that allows users to create custom effects and transitions by combining multiple Sapphire plug-ins. This enables a high level of creative flexibility and reusability, making it a popular tool for both editors and VFX artists. Sapphire also integrates with Mocha, Boris FX's planar tracking and masking system, allowing for advanced control of visual elements within an effect.

    In October 2017, Boris FX released its first new version of Sapphire since the GenArts acquisition. Sapphire (v11) included integrated Mocha tracking and masking tools. Sapphire is available for Adobe, Avid, the Autodesk Flame family, and OFX hosts including Blackmagic DaVinci Resolve and Fusion, and Foundry's NUKE. As part of the merger, Boris FX acquired the rights to Particle Illusion. In 2018, Boris FX reintroduced the product to the larger NLE/compositing market.

    Sapphire's plug-ins transitioned from C to C++ to improve performance and support higher-resolution visual effects. This update enhanced floating-point calculations, compatibility with film editing APIs, and integration with NVIDIA's CUDA for faster rendering. The plug-ins have been used in various films, including Avatar, Harry Potter and the Prisoner of Azkaban, Iron Man, The Lord of the Rings, The Matrix trilogy, Titanic, and X-Men.
    == Particle Illusion ==

    As part of the merger with GenArts in 2016, Boris FX acquired the rights to the Particle Illusion (formerly particleIllusion) product, a storied particle system from the original developer Alan Lorence, the founder of Wondertouch. In 2018, Boris FX released a redesigned version of the product to a larger NLE/compositing market as part of Continuum (2019). The new Particle Illusion plug-in supports Adobe, Avid, and many OFX hosts.

    == Silhouette ==

    In September 2019, Boris FX merged with SilhouetteFX, Academy Award-winning developer of Silhouette, a high-end digital paint, advanced rotoscoping, motion tracking, and node-based compositing application for visual effects in film post-production. The acquisition integrated Silhouette's advanced rotoscoping and paint technology, recognized by the Academy of Motion Pictures, into Boris FX's suite of products, alongside Sapphire, Continuum, and Mocha Pro. In May 2021, Boris FX released Silhouette 2021, the first version of Silhouette released by Boris FX to function both as a standalone application and as a plug-in for Adobe, Autodesk, Nuke, and other OFX hosts. Silhouette has been used in the visual effects of films such as Avatar, Avengers: Infinity War, Blade Runner 2049, Ex Machina, and Interstellar.

    == Optics ==

    In June 2020, Boris FX launched Optics, its first plugin deve

  • AstroPay


    AstroPay is a global digital wallet that provides users with a way to pay, send, and receive money. The app offers online payments, virtual and physical debit cards, and peer-to-peer money transfers, among other services.

    == History ==

    AstroPay was founded in Uruguay in 2009 as a payment processing company. Over time, it expanded its services across Latin America, EMEA, and APAC. A significant milestone occurred in 2016, when AstroPay spun off dLocal, which focuses on cross-border payments for emerging markets. dLocal became Uruguay's first unicorn and eventually went public through an IPO. In 2020, AstroPay spun off its payment processing services into a new entity, D24, to focus on its mobile wallet for cross-border payments. Between 2023 and 2024, the company brought in new leadership to guide its transition into a global multicurrency digital wallet with which users can save, send, and spend globally. This shift introduced new features, including loyalty prepaid cards and multicurrency accounts.

    == Services ==

    AstroPay offers three main products: AstroPay Wallet, AstroPay check-out, and AstroPay Platform. AstroPay Wallet is a digital wallet for consumers that offers multicurrency accounts, a prepaid card, and a marketplace. With AstroPay check-out, businesses can tap into AstroPay's wallet user base by accepting AstroPay as a payment method in their check-out options. Lastly, AstroPay Platform enables other businesses to use the AstroPay network to launch their own global wallet.

    == Brand endorsements, partnerships ==

    AstroPay's marketing strategy has included the development of co-branded products with sports teams and other brands. The company sponsored Burnley Football Club during the 2018–19 Premier League season, renewing the partnership for the 2021–22 Premier League season, when it became the club's official payment service partner.
    In August 2021, AstroPay entered into a partnership with Wolverhampton Wanderers for the 2021–22 Premier League season and, the following year, became the team's shirt sponsor. In September 2021, AstroPay expanded the partnership by becoming the team's official payment partner. Other partnerships include one with Newcastle United of the English Premier League in 2021. AstroPay made arrangements to ensure that its branding and logo would be visible on the pitch-side LED advertising during Premier League matches. In June 2022, the company renewed its partnership with Wolverhampton Wanderers for the 2022–23 Premier League season, and it launched its co-branded Wolves debit card in February 2023. Other notable partnerships include Universidad de Chile in 2024, Tottenham Hotspur in 2023–25, and a collaboration with Lionel Messi across Latin America.

    == Recent developments ==

    AstroPay has refocused its strategy since 2023, pivoting from payment processing to concentrate on its global digital wallet. This move reflects a broader effort to redefine the company's market positioning by emphasizing global, user-friendly financial services, while separating its identity from previous operations managed by dLocal and D24.

  • Artificial wisdom


    Artificial wisdom (AW) is an artificial intelligence (AI) system which is able to display the human traits of wisdom and morals while being able to contemplate its own “endpoint”. Artificial wisdom can be described as artificial intelligence reaching the top level of decision-making when confronted with the most complex and challenging situations. The term artificial wisdom is used when the "intelligence" goes beyond collecting and interpreting data by chance and is, by design, enriched with the smart and conscientious strategies that wise people would use.

    == Overview ==

    The goal of artificial wisdom is to create artificial intelligence that can replicate the “uniquely human trait[s]” of having wisdom and morals as closely as possible. Thus, artificial wisdom must “incorporate [the] ethical and moral considerations” of the data it uses. There are also many significant ethical and legal implications of AW, which are compounded by rapid advances in AI and related technologies, by the lagging development of ethics, guidelines, and regulations, and by the absence of any overarching advisory board to provide oversight. Additionally, there are challenges in how to develop, test, and implement AW in real-world scenarios. Existing tests do not examine the internal thought process by which a computer system reaches its conclusion, only the result of that process. When examining computer-aided wisdom, the partnership of artificial intelligence and contemplative neuroscience, concerns regarding the future of artificial intelligence shift to a more optimistic viewpoint. This artificial wisdom forms the basis of Louis Molnar's monographic article on artificial philosophy, in which he coined the term and proposed how artificial intelligence might view its place in the grand scheme of things.

    == Definitions ==

    There are no universal or standardized definitions for human intelligence, artificial intelligence, human wisdom, or artificial wisdom.
    However, the DIKW pyramid, which describes the continuum of relationships between data, information, knowledge, and wisdom, puts wisdom at the highest level in its hierarchy. Gottfredson defines intelligence as “the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience”. Definitions of wisdom typically require: the ability for emotional regulation; pro-social behaviors (e.g., empathy, compassion, and altruism); self-reflection; and “a balance between decisiveness and acceptance of uncertainty and diversity of perspectives, and social advising.” Under these definitions, artificial wisdom would be an AI system which is able to solve problems via “an understanding of…context, ethics and moral principles,” rather than simple pre-defined inputs or “learned patterns.” Some scientists have also considered the field of artificial consciousness. However, Jeste states that “…it is generally agreed that only humans can have consciousness, autonomy, will, and theory of mind.” An artificially wise system must also be able to contemplate its end goal and recognize its own ignorance. Additionally, to contemplate its end goal, a wise system must have a “correct conception of worthwhile goals (broadly speaking) or well-being (narrowly speaking)”. Stephen Grimm further suggests that the following three types of knowledge are individually necessary for wisdom: first, "knowledge of what is good or important for well-being"; second, "knowledge of one’s standing, relative to what is good or important for well-being"; and third, "knowledge of a strategy for obtaining what is good or important for wellbeing."

    == Problems ==

    There are notable problems with attempting to create an artificially wise system. Consciousness, autonomy, and will are considered strictly human features.

    === Values ===

    There are significant ethical and philosophical issues when attempting to create an intelligent or a wise system.
    A notable question is whose moral values will be used to train the system to be wise. Differing moral values and prejudices can already be seen among the various organizations and governments involved in artificial intelligence. Deployment strategies and values for artificial wisdom will conflict between leaders, companies, and countries. Nusbaum states, “When values are in conflict, leaders often make choices that are clever or smart about their own needs, but are often not wise.”

    === Ethics ===

    Science fiction author Isaac Asimov recognized the need to control the technology in the 1940s, when he wrote the three laws of robotics, summarized as follows: (1) a robot may not injure a human, directly or indirectly; (2) a robot must obey humans' orders; (3) a robot should seek to protect its own existence. Additionally, rapid technological advances in artificial intelligence, and with them the need for artificial wisdom, may have outpaced the development of societal guidelines. This has raised serious questions about the ethics and morality of AI and prompted calls for international oversight and regulations to ensure safety.

    === Impossibility in principle ===

    One argument, named by Tsai the “argument against AW,” or AAAW, postulates that artificial wisdom is impossible in principle. The argument is based on the philosophical difference between practical wisdom, also called phronesis, and practical intelligence. The difference lies not in “selecting the correct means, but reasoning correctly about what ends to follow”. Tsai puts the argument into a logical proposition as follows:

    “(P1) An agent is genuinely wise only if the agent can deliberate about the final goal of the domain in which the agent is situated.”
    “(P2) An intelligent agent cannot deliberate about the final goal of the domain in which the agent is situated.”
    “(C1) An intelligent agent cannot be genuinely wise.”
    “(P3) An AW is, at its core, intelligent.”
    “(C2) An AW cannot be genuinely wise.”

  • Faceu


    FaceU (Chinese: 激萌) is a camera app for smartphones running Android or Apple iOS that edits portrait photographs, typically selfies. The app uses augmented reality (AR) technology to let users add stickers or effects in real time when taking selfies and videos. It was launched in 2016 and had 250 million registered users in 2017. Most FaceU users are women aged 15 to 35. In February 2018, FaceU was acquired by Chinese media startup Toutiao in a deal worth about $300 million. The Indian government banned the app, along with other Chinese apps, on 2 September 2020, amid the 2020 China–India skirmishes.

    == Online marketing ==

    FaceU is one of several selfie camera apps in China, alongside MeituPic, Pitu, and Camera360. The app includes social functions such as instant messaging and video chat. Photos and short videos are deleted after a short period. FaceU has worked with brands to create themed stickers for social media campaigns. In 2016, FaceU collaborated with MeituPic's Meipai and launched a rainbow effect. In October 2017, during the Mid-Autumn Festival and National Day, FaceU released a feature that applied historical or military costumes to selfies. The app has also worked with various social media personalities and celebrities, who have posted content using FaceU effects. The FaceU team engages users emotionally through key opinion leaders (KOLs) and posters on social media.

    == Usage and Demographics ==

    FaceU had a large user base. According to industry sources, the app had more than 90 million monthly active users (MAU) and over 11 million daily active users (DAU) at certain points. Most of the users were under 30 and mainly women. The app was especially popular in major Chinese cities like Beijing, Shanghai, and Guangzhou. FaceU also caught on in other parts of East Asia, particularly Japan and South Korea.
    Some app stores claim the app had hundreds of millions of users worldwide, but these numbers mostly come from the company’s marketing materials and have not been confirmed by independent sources.

    == Product Features ==

    FaceU includes face recognition and live augmented reality (AR) effects. It allows users to add filters and stickers in real time while they are recording, rather than having to apply them later. The app integrates beauty filters, tools to create emojis and GIFs, and follow-video functionality that automatically tracks the face and movements as it records. Studies and market reports indicate that augmented reality (AR) filters and beautification tools are now common in smartphone photography. These features have influenced the way people take photos and what they expect photos to look like when shared online. Adding AR filters and beautification options has become a standard feature that most mobile photography apps now include.

  • CinePlayer


    CinePlayer is a software-based media player from Doremi Labs used to review Digital Cinema Packages (DCPs) without the need for a digital cinema server. CinePlayer can play back any DCP, not just those created by Doremi Mastering products. In addition to playing DCPs, CinePlayer can also play back JPEG2000 image sequences and many popular multimedia file types. There are two versions of CinePlayer: Standard and Pro. The Standard version supports playback of non-encrypted 2D DCPs at up to 2K resolution. The Pro version supports playback of encrypted 2D or 3D DCPs with subtitles at up to 4K resolution.

    == Supported formats ==

    === Containers ===

    AVI, MOV, MXF, MPG, TS, WMV, M2TS, MTS, MP4, MKV

    === Video codecs ===

    JPEG2000, ProRes 422, DNxHD, YUV Uncompressed 8-10 bits, DIVX, XVID, MPEG4 AVC / H-264, VC-1, MPEG2

    === Supported image sequences ===

    BMP, TIFF, TGA, DPX, JPG, J2C

    === Supported audio files ===

    WAV, MP3, WMA, MP2

  • DeepRoute.ai


    DeepRoute.ai (Chinese: 元戎启行) is a Chinese autonomous driving company founded in 2019 and headquartered in Shenzhen, China. The company develops full-stack self-driving solutions including perception, decision-making, and control systems.

    == History ==

    DeepRoute.ai was founded in February 2019 in Shenzhen, China, by Zhou Guang (周光), who serves as the company's CEO. In September 2019, the company collaborated with Dongfeng for a live-streamed autonomous driving demonstration. In October 2019, during the 7th Military World Games, DeepRoute.ai conducted Robotaxi demonstration operations. In November 2019, it obtained an intelligent connected vehicle road test permit for public roads in Shenzhen. In August 2020, DeepRoute.ai announced a partnership with Cao Cao Mobility, a Geely-backed ride-hailing company, to test Robotaxis in daily operation in Hangzhou, with plans to provide Robotaxis during the 2022 Asian Games. In October 2020, DeepRoute.ai signed an "Autonomous Driving Leadership Project" with Dongfeng to build one of China's largest autonomous fleets. In September 2021, DeepRoute.ai secured US$300 million in a Series B funding round led by Alibaba. In December 2021, the company unveiled DeepRoute-Driver 2.0, an L4-level autonomous driving solution comprising five solid-state lidar sensors, eight cameras, a proprietary computing system, and an optional millimeter-wave radar, with a production cost of under US$10,000. In June 2022, it partnered with Deppon Express to provide autonomous light truck freight transfer services. In March 2023, the company launched its high-precision map-free intelligent driving solution, DeepRoute-Driver 3.0. In November 2024, Great Wall Motor announced a $100 million Series C funding round for DeepRoute.ai. With this, DeepRoute.ai has completed five rounds of financing, raising a cumulative total of over $500 million. Its shareholders include Fosun RZ Capital, Yunqi Partners, Alibaba, Vision Plus Capital, and Dongfeng, among others.
In the same month, DeepRoute.ai emphasised that it was in "deep cooperation" with Nvidia and said it was among the first batch of companies in China to obtain Nvidia's newer Thor chip for cars, which would be used in a new system to be released the following year. This new system is intended to help manage more complex driving scenarios through visual cues. == Products == === VLA Model === VLA Model is a vision–language–action model designed for autonomous driving systems. It integrates visual perception, semantic understanding, and action decision-making into a unified framework, aiming to enhance the safety and adaptability of advanced driver-assistance systems (ADAS) in complex road environments. The model was officially launched on August 26, 2025, as the core of DeepRoute.ai's DeepRoute IO 2.0 platform. The VLA model is characterized by its "visual-language-action" architecture, which incorporates a chain-of-thought (CoT) reasoning capability inspired by large language models. This design is intended to address the "black box" limitations of traditional end-to-end autonomous driving systems by enabling the model to analyze information, infer causality, and make decisions in a more transparent and interpretable manner. === Appliance === The company has partnered with several automakers including Dongfeng Motor Corporation and Geely to develop and test autonomous vehicles.

  • OrCam device

    OrCam device

    OrCam devices such as OrCam MyEye are portable, artificial vision devices that allow visually impaired people to understand text and identify objects through audio feedback, describing what they are unable to see. Reuters described an important part of how it works as "a wireless smartcamera" which, when attached outside eyeglass frames, can read and verbalize text, and also supermarket barcodes. This information is converted to spoken words and entered "into the user’s ear." Face recognition is also part of OrCam's feature set. == Devices == OrCam Technologies Ltd has created three devices: OrCam MyEye 2.0, OrCam MyEye 1, and OrCam MyReader. OrCam MyEye 2.0: OrCam debuted the second-generation model, the OrCam MyEye 2.0, in December 2017. About the size of a finger, the MyEye 2.0 is battery-powered, and has been compressed into a self-contained device. The device snaps onto any eyeglass frame magnetically. The OrCam MyEye 2.0 is small and light (22.5 grams/0.8 ounces), with functionality intended to restore independence to the visually impaired. It comes in two versions. The basic model can read text, and a more advanced one adds features such as face recognition and barcode reading. As of July 2023, the retail cost is between $4000 and $6000 (USD). == Clinical Studies == JAMA Ophthalmology: A 2016 study published in JAMA Ophthalmology involved 12 legally blind participants and evaluated the usefulness of a portable artificial vision device (OrCam) for patients with low vision. The results showed that the OrCam device improved the patients' ability to perform tasks simulating those of daily living, such as reading a message on an electronic device, a newspaper article or a menu. Wills Eye: This clinical study was designed to measure the impact of the OrCam device on the quality of life of patients with end-stage glaucoma. 
The conclusion was that OrCam, a novel artificial vision device using a mini-camera mounted on eyeglasses, allowed legally blind patients with end-stage glaucoma to read independently, subsequently improving their quality of life. == Employee testing == The New York Times described how a pre-release OrCam device was used in 2013 for grocery shopping by an employee of the device's developer who has coloboma. It was the small size of the prototype rather than the functionality that gave her added mobility in an Israeli store's aisles. Added life-enhancement was described: "to both recognize and speak .. bus numbers .. traffic lights." == Social aspects == In contrast to an early version of Google Glass, which "failed ... because .. Glass wearers were ..mocked", early OrCam devices used designs that "clip unobtrusively on your shirt or perhaps your belt." In addition, it does not record sounds or images, addressing what was called "the privacy puzzle that stumped Google." One 2018 technology reviewer wrote that he wished it had a headphone jack "so it would be less disruptive in places where others are working." An attempt was made to use bone conduction. == USA introduction == In 2018 a team headed by New York Assemblyman Dov Hikind introduced use of OrCam devices to ten individuals screened for what he termed "new Israeli technology that really makes a difference to the blind." Although not the first USA success, it was more focused than a publicly funded project that was authorized in 2016 by a California government agency. Also in 2016 the Chicago Lighthouse for the Blind demonstrated its use. == Technology == In hardware, miniaturization has been quite important, but one major software element, mentioned by Assemblyman Hikind and reported by The Times of Israel, is the "AI-driven algorithms" that "reports .. how many people are in a room." In addition to reading printed text, it can also aid in "seeing" what is on a television or computer screen. 
Although OrCam can't help with handwritten information, it can reuse information, the basis of recognizing "US currency, and even faces." === Features === While early language support was for English, French, German, Hebrew and Spanish, others now available include Danish, Dutch, Finnish, Italian, Norwegian, Portuguese and Swedish. == History == OrCam Technologies Ltd was founded in 2010 by Professor Amnon Shashua and Ziv Aviram. Before co-founding OrCam, the two co-founded Mobileye in 1999, an Israeli company that develops vision-based advanced driver-assistance systems (ADAS) providing warnings for collision prevention and mitigation, which was acquired by Intel for $15.3 billion in 2017. OrCam launched OrCam MyEye in 2013 after years of development and testing, and began selling it commercially in 2015. In its early years, the company raised $22 million, $6 million of which came from Intel Capital. By 2014, Intel, which was also investing in Google Glass, had invested $15 million in OrCam. By March 2017, OrCam had raised $41 million in capital and was valued at $600 million. === Marketing === One outcome of initial marketing in the USA was that they "reached a deal with the California Department of Rehabilitation, ...qualifying blind and visually impaired state residents." == OrCam Technologies Ltd == OrCam Technologies Ltd. is the Israel-based company that produces these OrCam devices, which belong to the wearable artificial intelligence space. The company develops and manufactures assistive technology devices for individuals who are blind, visually impaired, or partially sighted, or who have print disabilities or other disabilities. OrCam has over 150 employees, is headquartered in Jerusalem, and has offices in New York, Toronto, and London. 
== Awards == 2018 Last Gadget Standing Winner; 2018 CES Innovation Awards Honoree in Accessible Tech; 2017 NAIDEX Innovation Award; 2016 Louis Braille Corporate Recognition Award; 2016 Silmo d'Or Award

  • LCD crosstalk

    LCD crosstalk

    LCD crosstalk is a visual defect in an LCD screen which occurs because of interference between adjacent pixels. Owing to the way rows and columns in the display are addressed, and charge is pushed around, the data on one part of the display has the potential to influence what is displayed elsewhere. This is generally known as crosstalk, and in matrix displays typically occurs in the horizontal and vertical directions. Crosstalk used to be a serious problem in the old passive-matrix (STN) displays, but is rarely discernible in modern active-matrix (TFT) displays. A fortunate side effect of inversion (see above) is that, for most display material, what little crosstalk there is is largely cancelled out. For most practical purposes, the level of crosstalk in modern LCDs is negligible. Certain patterns, particularly those involving fine dots, can interact with the inversion and reveal visible crosstalk. If you try moving a small window in front of the inversion pattern (above) which makes your screen flicker the most, you may well see crosstalk in the surrounding pattern. Different patterns are required to reveal crosstalk on different displays (depending on their inversion scheme).

  • Template matching

    Template matching

    Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used for quality control in manufacturing, navigation of mobile robots, or edge detection in images. The main challenges in a template matching task are detection of occlusion, when a sought-after object is partly hidden in an image; detection of non-rigid transformations, when an object is distorted or imaged from different angles; sensitivity to illumination and background changes; background clutter; and scale changes. == Feature-based approach == The feature-based approach to template matching relies on the extraction of image features, such as shapes, textures, and colors, that match the target image or frame. This approach is usually achieved using neural networks and deep-learning classifiers such as VGG, AlexNet, and ResNet. Convolutional neural networks (CNNs), which many modern classifiers are based on, process an image by passing it through different hidden layers, producing a vector at each layer with classification information about the image. These vectors are extracted from the network and used as the features of the image. Feature extraction using deep neural networks, like CNNs, has proven extremely effective and has become the standard in state-of-the-art template matching algorithms. This feature-based approach is often more robust than the template-based approach described below, as it can match templates with non-rigid and out-of-plane transformations, as well as high background clutter and illumination changes. == Template-based approach == For templates without strong features, or for when the bulk of a template image constitutes the matching image as a whole, a template-based approach may be effective. 
Since template-based matching may require sampling of a large number of data points, it is often desirable to reduce the number of sampling points by reducing the resolution of search and template images by the same factor before performing the operation on the resultant downsized images. This pre-processing method creates a multi-scale, or pyramid, representation of images, providing a reduced search window of data points within a search image so that the template does not have to be compared with every viable data point. Pyramid representations are a method of dimensionality reduction, a common aim of machine learning on data sets that suffer the curse of dimensionality. == Common challenges == In instances where the template may not provide a direct match, it may be useful to implement eigenspaces to create templates that detail the matching object under a number of different conditions, such as varying perspectives, illuminations, color contrasts, or object poses. For example, if an algorithm is looking for a face, its template eigenspaces may consist of images (i.e., templates) of faces in different positions to the camera, in different lighting conditions, or with different expressions (i.e., poses). It is also possible for a matching image to be obscured or occluded by an object. In these cases, it is unreasonable to provide a multitude of templates to cover each possible occlusion. For example, the search object may be a playing card, and in some of the search images, the card is obscured by the fingers of someone holding the card, or by another card on top of it, or by some other object in front of the camera. In cases where the object is malleable or poseable, motion becomes an additional problem, and problems involving both motion and occlusion become ambiguous. In these cases, one possible solution is to divide the template image into multiple sub-images and perform matching on each subdivision. 
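The resolution-reduction step described under the template-based approach can be sketched in a few lines. This is a minimal illustration and not a reference implementation: the function names `downsample_by_two` and `build_pyramid` are hypothetical, images are assumed to be greyscale lists of rows, and averaging 2×2 blocks is only one of several possible ways to reduce resolution.

```python
def downsample_by_two(image):
    """Halve the resolution by averaging each non-overlapping 2x2 block.

    A trailing odd row or column is dropped (an assumption of this sketch).
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def build_pyramid(image, levels):
    """Return [full-res, half-res, quarter-res, ...] with `levels` entries.

    `levels` must be small enough that every level keeps at least one pixel.
    """
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample_by_two(pyramid[-1]))
    return pyramid
```

In a coarse-to-fine search, the search and template images would both be reduced by the same factor, matched at the coarsest level first, and the candidate position then refined at the finer levels.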
== Deformable templates in computational anatomy == Template matching is a central tool in computational anatomy (CA). In this field, a deformable template model is used to model the space of human anatomies and their orbits under the group of diffeomorphisms, functions which smoothly deform an object. Template matching arises as an approach to finding the unknown diffeomorphism that acts on a template image to match the target image. Template matching algorithms in CA have come to be called large deformation diffeomorphic metric mappings (LDDMMs). Currently, there are LDDMM template matching algorithms for matching anatomical landmark points, curves, surfaces, and volumes. == Template-based matching explained using cross correlation or sum of absolute differences == A basic method of template matching sometimes called "Linear Spatial Filtering" uses an image patch (i.e., the "template image" or "filter mask") tailored to a specific feature of search images to detect. This technique can be easily performed on grey images or edge images, where the additional variable of color is either not present or not relevant. Cross correlation techniques compare the similarities of the search and template images. Their outputs should be highest at places where the image structure matches the template structure, i.e., where large search image values get multiplied by large template image values. This method is normally implemented by first picking out a part of a search image to use as a template. Let $S(x,y)$ represent the value of a search image pixel, where $(x,y)$ represents the coordinates of the pixel in the search image. For simplicity, assume pixel values are scalar, as in a greyscale image. Similarly, let $T(x_t,y_t)$ represent the value of a template pixel, where $(x_t,y_t)$ represents the coordinates of the pixel in the template image. 
To apply the filter, simply move the center (or origin) of the template image over each point in the search image and calculate the sum of products, similar to a dot product, between the pixel values in the search and template images over the whole area spanned by the template. More formally, if $(0,0)$ is the center (or origin) of the template image, then the cross correlation $T\star S$ at each point $(x,y)$ in the search image can be computed as: $(T\star S)(x,y)=\sum_{(x_t,y_t)\in T}T(x_t,y_t)\cdot S(x_t+x,y_t+y)$ For convenience, $T$ denotes both the pixel values of the template image as well as its domain, the bounds of the template. Note that all possible positions of the template with respect to the search image are considered. Since cross correlation values are greatest when the values of the search and template pixels align, the best matching position $(x_m,y_m)$ corresponds to the maximum value of $T\star S$ over $S$. Another way to handle translation problems on images using template matching is to compare the intensities of the pixels, using the sum of absolute differences (SAD) measure. To formulate this, let $I_S(x_s,y_s)$ and $I_T(x_t,y_t)$ denote the light intensity of pixels in the search and template images with coordinates $(x_s,y_s)$ and $(x_t,y_t)$, respectively. 
Then by moving the center (or origin) of the template to a point $(x,y)$ in the search image, as before, the sum of absolute differences between the template and search pixel intensities at that point is: $SAD(x,y)=\sum_{(x_t,y_t)\in T}\left\vert I_T(x_t,y_t)-I_S(x_t+x,y_t+y)\right\vert$ With this measure, the lowest SAD gives the best position for the template, rather than the greatest as with cross correlation. SAD tends to be relatively simple to implement and understand, but it also tends to be relatively slow to execute. A simple C++ implementation of SAD template matching is given below. == Implementation == In this simple implementation, it is assumed that the above described method is applied on grey images: this is why Grey is used as pixel intensity. The final position in this implementation gives the top left location for where the template image best matches the search image. One way to perform template matching on color images is to decompose the pixels into their color components and measure the quality of match between the color template and search image using the sum of the SAD computed for each color separately. == Speeding up the process == In the past, this type of spatial filtering was normally only used in dedicated hardware solutions because of the computational complexity of the operation; however, we can lessen this complexity b
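The C++ listing the text refers to is not reproduced in this excerpt. As a stand-in, the SAD search described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the article's original code: the function name `sad_match` is hypothetical, both images are greyscale lists of rows, and the template origin is taken to be its top-left corner, so the returned position is the top-left location of the best match.

```python
def sad_match(search, template):
    """Exhaustive SAD template matching on greyscale images.

    Returns (row, col, sad) for the top-left position with the lowest sum of
    absolute differences, or None if the template does not fit in the search
    image. Ties keep the first position found in row-major order.
    """
    s_rows, s_cols = len(search), len(search[0])
    t_rows, t_cols = len(template), len(template[0])
    best = None
    # Slide the template over every position where it fits entirely.
    for y in range(s_rows - t_rows + 1):
        for x in range(s_cols - t_cols + 1):
            sad = 0
            for ty in range(t_rows):
                for tx in range(t_cols):
                    sad += abs(search[y + ty][x + tx] - template[ty][tx])
            # Lower SAD means a better match.
            if best is None or sad < best[2]:
                best = (y, x, sad)
    return best
```

The four nested loops make the cost proportional to the number of search positions times the number of template pixels, which is why the measure is simple to implement but slow to execute. For color images, the same function could be run once per color channel and the per-channel SAD values summed at each position.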

  • RagTime

    RagTime

    RagTime is frame-oriented business publishing software which combines word processing, spreadsheets, simple drawings, image processing, and charts in a single document and program as integrated software. It is often used to create forms, reports, documentation, desktop publishing, and in office environments. Typical users are business clients, educational institutions, administrations, architects, and also private users. RagTime includes the following modules: page layout (forms, templates etc.); word processing; image processing; spreadsheets, similar to Microsoft Excel; formulas and functions which can be used throughout, in text, graphics, and spreadsheets; charts in different types of diagrams; drawings in vector graphics including lines, polygons, Bézier curves and more; slide show (presentation of RagTime documents); audio/video; buttons (pop-up menus, switches, and more) that can be used within RagTime documents; import/export of various file formats; support of the AppleScript scripting language available system-wide under macOS. == Principle == RagTime differs from most other comparable programs or software packages in its strict frame-oriented design: all content is contained within frames on each page. The content can have a fixed position within its frame or, if it is text or a spreadsheet, flow into another frame that is connected to the first frame via a so-called “pipeline”. RagTime has no different document types for different types of data; all content is stored in a single compound document type. Thus, a RagTime document not only can contain multiple pages, but also multiple layouts within the same document; e.g. spreadsheets in addition to text and images. The RagTime filename extension is .rtd (RagTime document); for templates the extension is .rtt (RagTime template). The current version is RagTime 6.6.5. It is available for OS X (10.6-10.14) and Windows (XP/Vista/7/8/10). 
== Extensions == FileTime – allows accessing “FileMaker Pro” databases from RagTime documents under OS X; RagTime Connect – ODBC database connection for RagTime 6 (Mac and Windows); Johannes – print extension for the simple creation of stapled or folded brochures, booklets etc.; PowerFunctions – additional functions for a more effective creation of intelligent documents for exchanging data and for use in mixed Mac/Windows environments; MetaFormula – SYLK-based extension that allows calculating text as formula. == History == RagTime has been developed since 1985 for the Macintosh – originally named MacFrame – and was published in 1986. When released, it already had the present name, which was chosen following the then-available software package Lotus Jazz. In the European Macintosh market, RagTime quickly gained a prominent position that continues to this day, even though the market share has decreased. Despite repeated attempts, the program could not gain acceptance in the North American market due to its high cost ($395 in 1990). The North American sales office closed in 1991, shortly after Claris Corporation released ClarisWorks, which duplicated much of the functionality of RagTime for a lower price. After the manufacturer – first Brüning & Everth, followed by B&E Software and today RagTime.de Development – had focused on the Macintosh only for a very long time, it also released a Windows version, RagTime 5.0, in 1999. However, the program could not assume great significance against established competitors, especially Microsoft Office. Until mid-2006 RagTime was, in addition to the commercial version, also available as a free version (RagTime Solo) for personal use. RagTime Solo included the same features and performance (except for spelling and syllabification dictionaries), but was not allowed for use in commercial environments. In other languages RagTime Solo was distributed as RagTime Privat. 
In a press release from July 5, 2006, RagTime announced the discontinuation of RagTime Solo: “… the RagTime Solo license conditions were often misinterpreted or deliberately flouted. Therefore we discontinued RagTime Solo, there will be no private version of RagTime 6 anymore.” After a successful start of the RagTime 6.0 software, sales fell significantly in the following years. Disagreements arose among the shareholders about the continuation of the company, which filed for bankruptcy in July 2007. As a result, the rights to RagTime were taken over by the newly established company RagTime.de Development GmbH, which was responsible for the development. The sales partner RagTime.de Sales GmbH distributed the RagTime products until October 2015. Today RagTime.de Development GmbH is also responsible for sales. The latest major development stage is the extensively revamped version RagTime 6.6 of 8 October 2015, which also includes new OS X features (e.g. high-resolution “Retina” displays) and supports Windows 10. == Programming == RagTime 1-3 were developed in Pascal; since version 4, development has been entirely in C++. External programming and automation can be implemented via AppleScript on a Mac, and via an OLE/COM API (e.g. Visual Basic) under Windows. On a Mac, RagTime provides a comprehensive AppleScript library for the automation of almost any task, from automatic document creation to the export of PDF documents. RagTime also supports “recordings” by use of the “AppleScript Editor”, which allows recording the interactive RagTime operation as an AppleScript program sequence. AppleScripts can be saved in the RagTime document and called via menu or shortcut keys. On Windows, RagTime (since version 6) provides an OLE/COM API, which allows automating many RagTime components via external programming. For that purpose there is a type library that installs the available RagTime OLE/COM object catalogue. 
Programming can be realized in all programming languages supported by Microsoft.

  • Automation integrator

    Automation integrator

    An automation integrator is a systems integrator company or individual who makes different versions of automation hardware and software work together, generally combining several subsystems into one large system. The title may refer to those who only integrate hardware, although these will often work with software integrators. Software created by automation integrators allows devices to communicate with each other, as well as collecting and reporting data. The magazine Control Engineering publishes an annual “Automation Integrator Guide” which lists over 2,000 automation integrators. It also gives an annual system integrator of the year award to three automation integration firms. The Control System Integrators Association (CSIA) maintains a buyers' guide of over 1,200 member and nonmember systems integrators known as the Industrial Automation Exchange, or CSIA Exchange for short. == Certification == The Control System Integrators Association (CSIA) certifies automation integrators through an audit based on 79 critical criteria from the best practices manual. Companies must be associate members of the CSIA to be eligible for certification. Integrators can also receive certification through a program launched in 2012 by the Robotics Industries Association. == Industries == Automation integrators work in a wide variety of industries which use robotics and automation. Some of the most common include:

  • Pulse-coupled networks

    Pulse-coupled networks

    Pulse-coupled networks or pulse-coupled neural networks (PCNNs) are neural models proposed by modeling a cat's visual cortex, and developed for high-performance biomimetic image processing. In 1989, Eckhorn introduced a neural model to emulate the mechanism of a cat's visual cortex. The Eckhorn model provided a simple and effective tool for studying the visual cortex of small mammals, and was soon recognized as having significant application potential in image processing. In 1994, Johnson adapted the Eckhorn model to an image processing algorithm, calling this algorithm a pulse-coupled neural network. The basic property of Eckhorn's linking-field model (LFM) is the coupling term. LFM is a modulation of the primary input by a biased offset factor driven by the linking input. These drive a threshold variable that decays from an initial high value. When the threshold drops below zero it is reset to a high value and the process starts over. This differs from the standard integrate-and-fire neural model, which accumulates the input until it passes an upper limit and effectively "shorts out" to cause the pulse. LFM uses this difference to sustain pulse bursts, something the standard model does not do at the single-neuron level. It is valuable to understand, however, that a detailed analysis of the standard model must include a shunting term, due to the floating voltage levels in the dendritic compartment(s), and in turn this causes an elegant multiple modulation effect that enables a true higher-order network (HON). A PCNN is a two-dimensional neural network. Each neuron in the network corresponds to one pixel in an input image, receiving its corresponding pixel's color information (e.g. intensity) as an external stimulus. Each neuron also connects with its neighboring neurons, receiving local stimuli from them. 
The external and local stimuli are combined in an internal activation system, which accumulates the stimuli until it exceeds a dynamic threshold, resulting in a pulse output. Through iterative computation, PCNN neurons produce temporal series of pulse outputs. The temporal series of pulse outputs contain information about the input images and can be used for various image processing applications, such as image segmentation and feature generation. Compared with conventional image processing methods, PCNNs have several significant merits, including robustness against noise, independence of geometric variations in input patterns, and the capability of bridging minor intensity variations in input patterns. A simplified PCNN called a spiking cortical model was developed in 2009. == Applications == PCNNs are useful for image processing, as discussed in a book by Thomas Lindblad and Jason M. Kinser. PCNNs have been used in a variety of image processing applications, including image segmentation, pattern recognition, feature generation, face extraction, motion detection, region growing, image denoising and image enhancement. Multidimensional pulse image processing of chemical structure data using PCNN has been discussed by Kinser, et al. They have also been applied to an all pairs shortest path problem.
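The iteration described above (one neuron per pixel, stimulus modulated by neighbouring pulses, compared against a decaying threshold that jumps after a pulse) can be sketched as follows. This is a heavily simplified illustration and not the Eckhorn or Johnson formulation: the function name `pcnn_first_fire`, the 4-connected neighbourhood, and the values of `beta`, `decay` and `v_theta` are all assumptions chosen for the example, and pixel intensities are assumed to lie in [0, 1].

```python
def pcnn_first_fire(image, steps=10, beta=0.2, decay=0.7, v_theta=20.0):
    """Return, per pixel, the iteration at which its neuron first pulses.

    Entries stay None for neurons that never pulse within `steps` iterations.
    """
    rows, cols = len(image), len(image[0])
    theta = [[1.0] * cols for _ in range(rows)]   # dynamic thresholds, start high
    pulses = [[0] * cols for _ in range(rows)]    # pulse outputs of the previous step
    first = [[None] * cols for _ in range(rows)]

    for n in range(steps):
        new_pulses = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # Linking input: pulses emitted by 4-connected neighbours last step.
                link = 0
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        link += pulses[rr][cc]
                # Internal activity: external stimulus modulated by the linking input.
                activity = image[r][c] * (1.0 + beta * link)
                if activity > theta[r][c]:
                    new_pulses[r][c] = 1
                    if first[r][c] is None:
                        first[r][c] = n
        # Thresholds decay each step and jump after a pulse.
        for r in range(rows):
            for c in range(cols):
                theta[r][c] = decay * theta[r][c] + v_theta * new_pulses[r][c]
        pulses = new_pulses
    return first
```

With these illustrative parameters, brighter pixels pulse earlier than darker ones and pixels of equal intensity pulse together, which is the property that segmentation-style uses of the pulse series rely on.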

  • Shepp–Logan phantom

    Shepp–Logan phantom

    The Shepp–Logan phantom is a standard test image created by Larry Shepp and Benjamin F. Logan for their 1974 paper "The Fourier Reconstruction of a Head Section". It serves as the model of a human head in the development and testing of image reconstruction algorithms. == Definition == The function describing the phantom is defined as the sum of 10 ellipses inside a 2×2 square:
