Collision detection is the computational problem of detecting an intersection of two or more objects in virtual space. More precisely, it deals with the questions of if, when, and where two or more objects intersect. Collision detection is a classic problem of computational geometry with applications in computer graphics, physical simulation, video games, robotics (including autonomous driving), and computational physics. Collision detection algorithms can be divided into operating on 2D or 3D spatial objects. == Overview == Collision detection is closely linked to calculating the distance between objects, as objects collide when the distance between them is less than or equal to zero. Negative distances indicate that one object has penetrated another. Performing collision detection requires more context than just the distance between the objects. Accurately identifying the points of contact on both objects' surfaces is also essential for computing a physically accurate collision response. The complexity of this task increases with the level of detail in the objects' representations: the more intricate the model, the greater the computational cost. Collision detection frequently involves dynamic objects, adding a temporal dimension to distance calculations. Instead of simply measuring distance between static objects, collision detection algorithms often aim to determine whether the objects' motion will bring them to a point in time when their distance is zero—an operation that adds significant computational overhead. In collision detection involving multiple objects, a naive approach would require detecting collisions for all pairwise combinations of objects. As the number of objects increases, the number of required comparisons grows rapidly: for n {\displaystyle n} objects, n ( n − 1 ) / 2 {n(n-1)}/{2} intersection tests are needed with a naive approach. This quadratic growth makes such an approach computationally expensive as n {\displaystyle n} increases. Due to the complexity mentioned above, collision detection is a computationally intensive process. Nevertheless, it is essential for interactive applications like video games, robotics, and real-time physics engines. To manage these computational demands, extensive efforts have gone into optimizing collision detection algorithms. A commonly used approach towards accelerating the required computations is to divide the process into two phases: the broad phase and the narrow phase. The broad phase aims to answer the question of whether objects might collide, using a conservative but efficient approach to rule out pairs that clearly do not intersect, thus avoiding unnecessary calculations. Objects that cannot be definitively separated in the broad phase are passed to the narrow phase. Here, more precise algorithms determine whether these objects actually intersect. If they do, the narrow phase often calculates the exact time and location of the intersection. == Broad phase == This phase aims at quickly finding objects or parts of objects for which it can be quickly determined that no further collision test is needed. A useful property of such approach is that it is output sensitive. In the context of collision detection this means that the time complexity of the collision detection is proportional to the number of objects that are close to each other. An early example of that is the I-COLLIDE where the number of required narrow phase collision tests was O ( n + m ) {\displaystyle O(n+m)} where n {\displaystyle n} is the number of objects and m {\displaystyle m} is the number of objects at close proximity. This is a significant improvement over the quadratic complexity of the naive approach. === Spatial partitioning === Several approaches can be grouped under the spatial partitioning umbrella, which includes octrees (for 3D), quadtrees (for 2D), binary space partitioning (or BSP trees) and other, similar approaches. If one splits space into a number of simple cells, and if two objects can be shown not to be in the same cell, then they need not be checked for intersection. Dynamic scenes and deformable objects require updating the partitioning which can add overhead. === Bounding volume hierarchy === Bounding Volume Hierarchy (BVH) is a tree structure over a set of bounding volumes. Collision is determined by doing a tree traversal starting from the root. If the bounding volume of the root doesn't intersect with the object of interest, the traversal can be stopped. If, however there is an intersection, the traversal proceeds and checks the branches for each there is an intersection. Branches for which there is no intersection with the bounding volume can be culled from further intersection test. Therefore, multiple objects can be determined to not intersect at once. BVH can be used with deformable objects such as cloth or soft-bodies but the volume hierarchy has to be adjusted as the shape deforms. For deformable objects we need to be concerned about self-collisions or self intersections. BVH can be used for that end as well. Collision between two objects is computed by computing intersection between the bounding volumes of the root of the tree as there are collision we dive into the sub-trees that intersect. Exact collisions between the actual objects, or its parts (often triangles of a triangle mesh) need to be computed only between intersecting leaves. The same approach works for pair wise collision and self-collisions. === Exploiting temporal coherence === During the broad-phase, when the objects in the world move or deform, the data-structures used to cull collisions have to be updated. In cases where the changes between two frames or time-steps are small and the objects can be approximated well with axis-aligned bounding boxes, the sweep and prune algorithm can be a suitable approach. Several key observation make the implementation efficient: Two bounding-boxes intersect if, and only if, there is overlap along all three axes; overlap can be determined, for each axis separately, by sorting the intervals for all the boxes; and lastly, between two frames updates are typically small (making sorting algorithms optimized for almost-sorted lists suitable for this application). The algorithm keeps track of currently intersecting boxes, and as objects move, re-sorting the intervals helps keep track of the status. === Pairwise pruning === Once a pair of physical bodies has been selected for further investigation, collisions need to be checked more carefully. However, in many applications, individual objects (if they are not too deformable) are described by a set of smaller primitives, mainly triangles. So there are two sets of triangles, S = S 1 , S 2 , … , S n {\displaystyle S={S_{1},S_{2},\dots ,S_{n}}} and T = T 1 , T 2 , … , T n {\displaystyle T={T_{1},T_{2},\dots ,T_{n}}} (for simplicity, each set has the same number of triangles.) The obvious thing to do is to check all triangles S j {\displaystyle S_{j}} against all triangles T k {\displaystyle T_{k}} for collisions, but this involves n 2 {\displaystyle n^{2}} comparisons, which is highly inefficient. If possible, it is desirable to use a pruning algorithm to reduce the number of pairs of triangles that need to be checked. The most widely used family of algorithms is known as the hierarchical bounding volumes method. As a preprocessing step, for each object (e.g., S {\displaystyle S} and T {\displaystyle T} ) calculates a hierarchy of bounding volumes. Then, at each time step, when collisions need to be checked between S {\displaystyle S} and T {\displaystyle T} , the hierarchical bounding volumes are used to reduce the number of pairs of triangles under consideration. For simplicity, provide an example using bounding spheres, although it has been noted that spheres are undesirable in many cases. If E {\displaystyle E} is a set of triangles, a bounding sphere is pre-calculated. B ( E ) {\displaystyle B(E)} . There are many ways of choosing B ( E ) {\displaystyle B(E)} , B ( E ) {\displaystyle B(E)} is a sphere that completely contains E {\displaystyle E} and is as small as possible. Ahead of time, B ( S ) {\displaystyle B(S)} and B ( T ) {\displaystyle B(T)} can be computed. Clearly, if these two spheres do not intersect (and that is very easy to test), then neither do S {\displaystyle S} and T {\displaystyle T} . This is not much better than an n-body pruning algorithm, however. If E = E 1 , E 2 , … , E m {\displaystyle E={E_{1},E_{2},\dots ,E_{m}}} is a set of triangles, then split it into two halves L ( E ) := E 1 , E 2 , … , E m / 2 {\displaystyle L(E):={E_{1},E_{2},\dots ,E_{m/2}}} and R ( E ) := E m / 2 + 1 , … , E m − 1 , E m {\displaystyle R(E):={E_{m/2+1},\dots ,E_{m-1},E_{m}}} . Apply this to S {\displaystyle S} and T {\displaystyle T} , and calculate (ahead of time) the bounding spheres B ( L ( S ) ) , B ( R ( S ) ) {\displaystyle B(L(S)),B(R(S))} and B ( L ( T ) ) , B ( R ( T ) ) {\displaystyle B(L(T)),B(R(T))} . T
Diagnostically acceptable irreversible compression
Diagnostically acceptable irreversible compression (DAIC) is the amount of lossy compression which can be used on a medical image to produce a result that does not prevent the reader from using the image to make a medical diagnosis. The term was first introduced at a workshop on irreversible compression convened by the European Society of Radiology (ESR) in Palma de Mallorca October 13, 2010, the results of which were reported in a subsequent position paper. == Determination == The "amount of compression" in irreversible compression used to be determined by the compression ratio, where the acceptable minimum is determined by the algorithm (typically JPEG or J2K) and the data type (body part and imaging method). Such a definition is easy to follow, and has been used by medical bodies in 2010 around the world. However, its downside is obvious: the compression ratio tells nothing about the real quality of the image, as different compressors can produce vastly different qualities under the same file size. For example, the JPEG format of 1992 can perform as well as many modern formats given newer techniques exploited in mozjpeg and ISO libjpeg, yet they would be lumped together with the legacy encoders in such a scheme. The image compression community has long used objective quality metrics like SSIM to measure the effects of compression. In the absence of good data regarding SSIM, the ESR review of 2010 concluded that it is still difficult to establish a criterion for whether a particular irreversible compression scheme applied with particular parameters to a particular individual image, or category of images, avoids the introduction of some quantifiable risk of a diagnostic error for any particular diagnostic task. A 2017 study showed that a SSIM variant called 4-G-r (4-component, gradient, structural component of SSIM) best reflects changes in images that affect the decision of radiologists out of 16 SSIM variants. A 2020 study shows that visual information fidelity (VIF), feature similarity index (FSIM), and noise quality metric (NQM) best reflect radiologist preferences out of ten metrics. It also mentions that the original version of SSIM works as poorly as a basic root-mean-square distance (RMSD) for this purpose, a result echoed by the 2017 study. The 4-G-r modification is not tested in the study.
Text-to-video model
A text-to-video model is a form of generative artificial intelligence that uses a natural language description as input to produce a video relevant to the input text. Advancements during the 2020s in the generation of high-quality, text-conditioned videos have largely been driven by the development of video diffusion models. == Models == There are different models, including open source models. Chinese-language input CogVideo is the earliest text-to-video model "of 9.4 billion parameters" to be developed, with its demo version of open source codes first presented on GitHub in 2022. That year, Meta Platforms released a partial text-to-video model called "Make-A-Video", and Google's Brain (later Google DeepMind) introduced Imagen Video, a text-to-video model with 3D U-Net. === 2023 === In February 2023, Runway released Gen-1 and Gen-2, among the first commercially available text-to-video and video-to-video models accessible to the public through a web interface. Gen-1, initially released as a video-to-video model, allowed users to transform existing video footage using text or image prompts. Gen-2, introduced in March 2023 and made publicly available in June 2023, added text-to-video capabilities, enabling users to generate videos from text prompts alone. In March 2023, a research paper titled "VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation" was published, presenting a novel approach to video generation. The VideoFusion model decomposes the diffusion process into two components: base noise and residual noise, which are shared across frames to ensure temporal coherence. By utilizing a pre-trained image diffusion model as a base generator, the model efficiently generated high-quality and coherent videos. Fine-tuning the pre-trained model on video data addressed the domain gap between image and video data, enhancing the model's ability to produce realistic and consistent video sequences. In the same month, Adobe introduced Firefly AI as part of its features. === 2024 === In January 2024, Google announced development of a text-to-video model named Lumiere which is anticipated to integrate advanced video editing capabilities. Matthias Niessner and Lourdes Agapito at AI company Synthesia work on developing 3D neural rendering techniques that can synthesise realistic video by using 2D and 3D neural representations of shape, appearances, and motion for controllable video synthesis of avatars. In June 2024, Luma Labs launched its Dream Machine video tool. That same month, Kuaishou extended its Kling AI text-to-video model to international users. In July 2024, TikTok owner ByteDance released Jimeng AI in China, through its subsidiary, Faceu Technology. By September 2024, the Chinese AI company MiniMax debuted its video-01 model, joining other established AI model companies like Zhipu AI, Baichuan, and Moonshot AI, which contribute to China's involvement in AI technology. In December 2024 Lightricks launched LTX Video as an open source model. === 2025 === Alternative approaches to text-to-video models include Google's Phenaki, Hour One, Colossyan, Runway's Gen-3 Alpha, and OpenAI's Sora, Several additional text-to-video models, such as Plug-and-Play, Text2LIVE, and TuneAVideo, have emerged. FLUX.1 developer Black Forest Labs has announced its text-to-video model SOTA. Google was preparing to launch a video generation tool named Veo for YouTube Shorts in 2025. In May 2025, Google launched the Veo 3 iteration of the model. It was noted for its impressive audio generation capabilities, which were a previous limitation for text-to-video models. In July 2025 Lightricks released an update to LTX Video capable of generating clips reaching 60 seconds, and in October 2025 it released LTX-2, with audio capabilities built in. === 2026 === In February 2026, ByteDance released Seedance 2.0, it was noted for its impressive realistic generation, motion and camera control and 15 second generation, however the model faced huge critiscism from Motion Picture Association for copyright infringement. After viewing a viral clip of a fight between actors Brad Pitt and Tom Cruise, Rhett Reese, who is the co-writer of Deadpool & Wolverine and Zombieland announced that on social media "I hate to say it. It’s likely over for us," further stating that "In next to no time, one person is going to be able to sit at a computer and create a movie indistinguishable from what Hollywood now releases." == Architecture and training == There are several architectures that have been used to create text-to-video models. Similar to text-to-image models, these models can be trained using Recurrent Neural Networks (RNNs) such as long short-term memory (LSTM) networks, which has been used for Pixel Transformation Models and Stochastic Video Generation Models, which aid in consistency and realism respectively. An alternative for these include transformer models. Generative adversarial networks (GANs), Variational autoencoders (VAEs), — which can aid in the prediction of human motion — and diffusion models have also been used to develop the image generation aspects of the model. Text-video datasets used to train models include, but are not limited to, WebVid-10M, HDVILA-100M, CCV, ActivityNet, and Panda-70M. These datasets contain millions of original videos of interest, generated videos, captioned-videos, and textual information that help train models for accuracy. Text-video datasets used to train models include, but are not limited to PromptSource, DiffusionDB, and VidProM. These datasets provide the range of text inputs needed to teach models how to interpret a variety of textual prompts. The video generation process involves synchronizing the text inputs with video frames, ensuring alignment and consistency throughout the sequence. This predictive process is subject to decline in quality as the length of the video increases due to resource limitations. The Will Smith Eating Spaghetti test is a benchmark for models. == Limitations == Despite the rapid evolution of text-to-video models in their performance, a primary limitation is that they are very computationally heavy which limits its capacity to provide high quality and lengthy outputs. Additionally, these models require a large amount of specific training data to be able to generate high quality and coherent outputs, which brings about the issue of accessibility. Moreover, models may misinterpret textual prompts, resulting in video outputs that deviate from the intended meaning. This can occur due to limitations in capturing semantic context embedded in text, which affects the model's ability to align generated video with the user's intended message. Various models, including Make-A-Video, Imagen Video, Phenaki, CogVideo, GODIVA, and NUWA, are currently being tested and refined to enhance their alignment capabilities and overall performance in text-to-video generation. Another issue with the outputs is that text or fine details in AI-generated videos often appear garbled, a problem that stable diffusion models also struggle with. Examples include distorted hands and unreadable text. == Ethics == The deployment of text-to-video models raises ethical considerations related to content generation. These models have the potential to create inappropriate or unauthorized content, including explicit material, graphic violence, misinformation, and likenesses of real individuals without consent. Ensuring that AI-generated content complies with established standards for safe and ethical usage is essential, as content generated by these models may not always be easily identified as harmful or misleading. The ability of AI to recognize and filter out NSFW or copyrighted content remains an ongoing challenge, with implications for both creators and audiences. == Impacts and applications == Text-to-video models offer a broad range of applications that may benefit various fields, from educational and promotional to creative industries. These models can streamline content creation for training videos, movie previews, gaming assets, and visualizations, making it easier to generate content. During the Russo-Ukrainian war, fake videos made with artificial intelligence were created as part of a propaganda war against Ukraine and shared in social media. These included depictions of children in the Ukrainian Armed Forces, fake ads targeting children encouraging them to denounce critics of the Ukrainian government, or fictitious statements by Ukrainian President Volodymyr Zelenskyy about the country's surrender, among others. === Movies === Kaur vs Kore is the first Indian feature film made using generative AI which features dual role for the AI character of Sunny Leone, set to release in 2026. Chiranjeevi Hanuman – The Eternal is an Indian movie made entirely using Generative AI created by Vijay Subramaniam which is set for theatrical release in 2026. The movie was widely criticised by the Film makers in the Bollywood industr
Perplexity AI
Perplexity AI, Inc., or simply Perplexity, is an American privately held software company offering a web search engine that processes user queries and synthesizes responses. Perplexity products use large language models and incorporate real-time web search capabilities, providing responses based on current Internet content, citing sources used. Its real-time search engine is called Sonar and is based on Meta's Llama model. A free public version is available, while a paid Pro subscription offers access to more advanced language models and additional features. Perplexity AI, Inc., was founded in August 2022 by Aravind Srinivas, Denis Yarats, Johnny Ho, and Andy Konwinski. As of September 2025, the company was valued at US$20 billion. Perplexity AI has attracted legal scrutiny over allegations of copyright infringement, unauthorized content use, and trademark issues from several major media organizations, including the BBC, Dow Jones, and The New York Times. According to separate analyses by Wired and later Cloudflare, Perplexity uses undisclosed web crawlers with spoofed user-agent strings to scrape the content of websites which prohibit, or explicitly block, web scraping. == History == In August 2022, Perplexity AI, Inc., was founded by Aravind Srinivas, Denis Yarats, Johnny Ho, and Andy Konwinski, engineers with backgrounds in back-end systems, artificial intelligence (AI) and machine learning. It launched its main search engine on December 7, 2022, and has since released a Google Chrome extension and apps for iOS and Android. In February 2023, Perplexity reported two million unique visitors. By April 2024, Perplexity had raised $165 million in funding, valuing the company at over $1 billion. As of June 2025, Perplexity closed a $500 million round of funding that elevated its valuation to $14 billion. Investors in Perplexity AI have included Jeff Bezos, Tobias Lütke, Nat Friedman, Nvidia, and Databricks. Perplexity has also received funding from 1789 Capital, a venture capital firm notable for its association with Donald Trump Jr. During Bloomberg’s Tech Summit 2025, Srinivas shared that the company processed 780 million queries in May 2025, experiencing more than 20% month-over-month growth, processing around 30 million queries daily. In July 2024, Perplexity announced the launch of a new publishers' program to share advertising revenue with partners. On January 18, 2025, the day before the impending U.S. ban on the social media app TikTok, Perplexity submitted a proposal for a merger with TikTok US. On August 12, 2025, Perplexity made a bid to buy Chrome from Google for $34.5 billion. Perplexity stated that the sale could remedy anti-trust litigation against Google, in which a judge was considering compelling the sale of Chrome. In December 2025, Cristiano Ronaldo took an undisclosed stake in Perplexity AI and entered a global brand partnership with the company. === Business Strategy and Finance (2026) === As of early 2026, Perplexity AI reached a valuation of $21.21 billion following its Series E-6 funding round. The company's Annual Recurring Revenue (ARR) grew from $80 million in late 2024 to an estimated $200 million by February 2026. In January 2026, the company entered into a three-year, $750 million commitment with Microsoft Azure to secure the GPU capacity required for its advanced "Deep Research" and "Model Council" features. In February 2026, Perplexity transitioned to a subscription-first model by discontinuing its AI-integrated advertising strategy. Leadership stated the move was intended to preserve user trust in the "answer engine," prioritizing objective results over ad revenue. The company also introduced the "Model Council" feature on February 5, 2026, which allows users to compare outputs from multiple large language models, such as GPT-5.2 and Claude 4.6, simultaneously. To expand its user base, Perplexity began offering a free year of Pro access to students, U.S. Military Veterans, and government employees. == Products and services == === Search engine web portal === Perplexity’s primary offering is an online information retrieval system (search engine) that uses large language models to generate responses to user queries by searching and summarizing web-based content. Perplexity offers a feature known as Perplexity Pages that generates structured summaries and report-like content from user queries by aggregating cited sources. Perplexity is available without charge or registration to Web users, a freemium model. === Perplexity Pro === Perplexity Pro is a subscription tier, a more capable paid "enterprise" service, including stronger security and data protection and additional tools, including the ability to search uploaded documents alongside web content and access to a programmatic application programming interface (API). It allows the user to select between backend models such as GPT-5.4, Claude 4.6 and Gemini 3.1 Pro. The company has also developed its own models, Sonar (based on Llama 3.3) and R1 1776 (based on DeepSeek R1). === Internal Knowledge Search === Internal Knowledge Search enables Pro and Enterprise Pro users to simultaneously search across web content and internal documents. Users can upload and search through Excel, Word, PDF, and other common file formats. Enterprise Pro users can upload and index up to 500 files. === Search API === Perplexity's Search API provides AI developers with programmatic access to the company's search infrastructure. The September 2025 release includes a software development kit, an open-source evaluation framework called search_evals, and documentation detailing the API's design and optimization. === Shopping hub === Perplexity's Shopping Hub is an online shopping platform that provides AI-generated product recommendations, and enables users to purchase products directly through Perplexity's interface. It was launched in November 2024 with backing by Amazon and Nvidia. === Finance === In October 2024, Perplexity AI introduced new finance-related features, including looking up stock prices and company earnings data. The tool provides real-time stock quotes and price tracking, industry peer comparisons and basic financial analysis tools. The platform sources its financial data from Financial Modeling Prep. === Assistant === In January 2025, Perplexity launched the Perplexity Assistant, an AI-powered tool designed to enhance the functionality of its search engine. It can perform tasks across multiple apps, such as hailing a ride or searching for a song, and can maintain context across actions. The assistant is also multi-modal, meaning it can use a phone's camera to provide answers about the user's surroundings or on-screen content. Perplexity has acknowledged that the assistant is still in development and may not always function as expected. For instance, certain features, such as summarizing unread emails or upcoming calendar events, require users to enable a workaround based on notifications. === Comet === In July 2025, Perplexity launched Comet, an AI browser based on Chromium. Initially, access to the browser was limited to users subscribed to the most expensive subscription tier. The browser was later released for free download in October 2025. A key feature is integration of the Perplexity search engine, which can perform a variety of tasks such as generating article summaries, describing an image, conducting research about a topic and composing emails. === Truth Social chatbot === Perplexity has been contracted to produce a chatbot for Donald Trump's social media platform Truth Social. == Leadership == Aravind Srinivas is the CEO and co-founder of Perplexity AI. He previously held research positions at OpenAI, Google DeepMind, and other AI research institutions focusing on machine learning and artificial intelligence. In a March 2026 All-In episode, Srinivas said the incoming AI-related layoffs were "glorious future" to "look forward", as it freed people from jobs they didn't like and gave them opportunities to pursue entrepreneurship. == Controversies == === Copyright and trademark infringement allegations === In June 2024, Forbes publicly criticized Perplexity for using their content. According to Forbes, Perplexity published a story largely copied from a proprietary Forbes article without mentioning or prominently citing Forbes. In response, Srinivas said that the feature had some "rough edges" and accepted feedback but maintained that Perplexity only "aggregates" rather than plagiarizes information. In October 2024, The New York Times sent a cease-and-desist notice to Perplexity to stop accessing and using NYT content, claiming that Perplexity is violating its copyright by scraping data from its website. In June 2024, Dow Jones and New York Post filed a lawsuit against Perplexity, alleging copyright infringement. The lawsuit also alleged that Perplexity harmed their brand by attributing hallucinated quotes, for example on F-16 jets for Ukraine, to artic
Salience (neuroscience)
Salience (also called saliency, from Latin saliō meaning "leap, spring") is the property by which some thing stands out. Salient events are an attentional mechanism by which organisms learn and survive; those organisms can focus their limited perceptual and cognitive resources on the pertinent (that is, salient) subset of the sensory data available to them. Saliency typically arises from contrasts between items and their neighborhood. They might be represented, for example, by a red dot surrounded by white dots, or by a flickering message indicator of an answering machine, or a loud noise in an otherwise quiet environment. Saliency detection is often studied in the context of the visual system, but similar mechanisms operate in other sensory systems. Just what is salient can be influenced by training: for example, for human subjects particular letters can become salient by training. There can be a sequence of necessary events, each of which has to be salient, in turn, in order for successful training in the sequence; the alternative is a failure, as in an illustrated sequence when tying a bowline; in the list of illustrations, even the first illustration is a salient: the rope in the list must cross over, and not under the bitter end of the rope (which can remain fixed, and not free to move); failure to notice that the first salient has not been satisfied means the knot will fail to hold, even when the remaining salient events have been satisfied. When attention deployment is driven by salient stimuli, it is considered to be bottom-up, memory-free, and reactive. Conversely, attention can also be guided by top-down, memory-dependent, or anticipatory mechanisms, such as when looking ahead of moving objects or sideways before crossing streets. Humans and other animals have difficulty paying attention to more than one item simultaneously, so they are faced with the challenge of continuously integrating and prioritizing different bottom-up and top-down influences. == Neuroanatomy == The brain component named the hippocampus helps with the assessment of salience and context by using past memories to filter new incoming stimuli, and placing those that are most important into long term memory. The entorhinal cortex is the pathway into and out of the hippocampus, and is an important part of the brain's memory network; research shows that it is a brain region that suffers damage early on in Alzheimer's disease, one of the effects of which is altered (diminished) salience. The pulvinar nuclei (in the thalamus) modulate physical/perceptual salience in attentional selection. One group of neurons (i.e., D1-type medium spiny neurons) within the nucleus accumbens shell (NAcc shell) assigns appetitive motivational salience ("want" and "desire", which includes a motivational component), aka incentive salience, to rewarding stimuli, while another group of neurons (i.e., D2-type medium spiny neurons) within the NAcc shell assigns aversive motivational salience to aversive stimuli. The primary visual cortex (V1) generates a bottom-up saliency map from visual inputs to guide reflexive attentional shifts or gaze shifts. According to V1 Saliency Hypothesis, the saliency of a location is higher when V1 neurons give higher responses to that location relative to V1 neurons' responses to other visual locations. For example, a unique red item among green items, or a unique vertical bar among horizontal bars, is salient since it evokes higher V1 responses and attracts attention or gaze. The V1 neural responses are sent to the superior colliculus to guide gaze shifts to the salient locations. A fingerprint of the saliency map in V1 is that attention or gaze can be captured by the location of an eye-of-origin singleton in visual inputs, e.g., a bar uniquely shown to the left eye in a background of many other bars shown to the right eye, even when observers cannot tell the difference between the singleton and the background bars. == In psychology == The term is widely used in the study of perception and cognition to refer to any aspect of a stimulus that, for any of many reasons, stands out from the rest. Salience may be the result of emotional, motivational or cognitive factors and is not necessarily associated with physical factors such as intensity, clarity or size. Although salience is thought to determine attentional selection, salience associated with physical factors does not necessarily influence selection of a stimulus. === Salience bias === Salience bias (also referred to as perceptual salience) is a cognitive bias that predisposes individuals to focus on or attend to items, information, or stimuli that are more prominent, visible, or emotionally striking. This is as opposed to stimuli that are unremarkable, or less salient, even though this difference is often irrelevant by objective standards. The American Psychological Association (APA) defines the salience hypothesis as a theory regarding perception where "motivationally significant" information is more readily perceived than information with little or less significant motivational importance. Perceptual salience (salience bias) is linked to the vividness effect, whereby a more pronounced response is produced by a more vivid perception of a stimulus than the mere knowledge of the stimulus. Salience bias assumes that more dynamic, conspicuous, or distinctive stimuli engage attention more than less prominent stimuli, disproportionately impacting decision making, it is a bias which favors more salient information. ==== Application ==== ===== Cognitive Psychology ===== Salience bias, like all other cognitive biases, is an applicable concept to various disciplines. For example, cognitive psychology investigates cognitive functions and processes, such as perception, attention, memory, problem solving, and decision making, all of which could be influenced by salience bias. Salience bias acts to combat cognitive overload by focusing attention on prominent stimuli, which affects how individuals perceive the world as other, less vivid stimuli that could add to or change this perception, are ignored. Human attention gravitates towards novel and relevant stimuli and unconsciously filters out less prominent information, demonstrating salience bias, which influences behavior as human behavior is affected by what is attended to. Behavioral economists Tversky and Kahneman also suggest that the retrieval of instances is influenced by their salience, such as how witnessing or experiencing an event first-hand has a greater impact than when it is less salient, like if it were read about, implying that memory is affected by salience. ===== Language ===== It is also relevant in language understanding and acquisition. Focusing on more salient phenomena allows people to detect language patterns and dialect variations more easily, making dialect categorization more efficient. ===== Social Behavior ===== Furthermore, social behaviors and interactions can also be influenced by perceptual salience. Changes in the perceptual salience of an individual heavily influences their social behavior and subjective experience of their social interactions, confirming a "social salience effect". Social salience relates to how individuals perceive and respond to other people. ===== Behavioral Science ===== The connection between salience bias and other heuristics, like availability and representativeness, links it to the fields of behavioral science and behavioral economics. Salience bias is closely related to the availability heuristic in behavioral economics, based on the influence of information vividness and visibility, such as recency or frequency, on judgements, for example:Accessibility and salience are closely related to availability, and they are important as well. If you have personally experienced a serious earthquake, you're more likely to believe that an earthquake is likely than if you read about it in a weekly magazine. Thus, vivid and easily imagined causes of death (for example, tornadoes) often receive inflated estimates of probability, and less-vivid causes (for example, asthma attacks) receive low estimates, even if they occur with a far greater frequency (here, by a factor of twenty). Timing counts too: more recent events have a greater impact on our behavior, and on our fears, than earlier ones.Humans have bounded rationality, which refers to their limited ability to be rational in decision making, due to a limited capacity to process information and cognitive ability. Heuristics, such as availability, are employed to reduce the complexity of cognitive and social tasks or judgements, in order to decrease the cognitive load that result from bounded rationality. Despite the effectiveness of heuristics in doing so, they are limited by systematic errors that occur, often the result of influencing biases, such as salience. This can lead to misdirected or misinformed judgements, based on an overemphasis or overweighting of
New York Institute of Technology Computer Graphics Lab
The New York Institute of Technology Computer Graphics Lab is a computer lab located at the New York Institute of Technology (NYIT), founded by Alexander Schure. It was originally located at the "pink building" on the NYIT campus. It has played an important role in the history of computer graphics and animation, as founders of Pixar and Lucasfilm Limited, including Turing Award winners Edwin Catmull and Patrick Hanrahan, began their research there. It is the birthplace of entirely 3D CGI films. The lab was initially founded to produce a short high-quality feature film with the project name of The Works. The feature, which was never completed, was a 90-minute feature that was to be the first entirely computer-generated CGI movie. Production mainly focused around DEC PDP and VAX machines. Many of the original CGL team now form the elite of the CG and computer world with members going on to Silicon Graphics, Microsoft, Cisco, NVIDIA and others, including Pixar president, co-founder and Turing laureate Ed Catmull, Pixar co-founder and Microsoft graphics fellow Alvy Ray Smith, Pixar co-founder Ralph Guggenheim, Walt Disney Animation Studios chief scientist Lance Williams, Netscape and Silicon Graphics founder Jim Clark, Tableau co-founder and Turing laureate Pat Hanrahan, Microsoft graphics fellow Jim Blinn, Thad Beier, Oscar and Bafta nominee Jacques Stroweis, Andrew Glassner, and Tom Brigham. Systems programmer Bruce Perens went on to co-found the Open Source Initiative. Researchers at the New York Institute of Technology Computer Graphics Lab created the tools that made entirely 3D CGI films possible. Among NYIT CG Lab's many innovations was an eight-bit paint system to ease computer animation. NYIT CG Lab was regarded as the top computer animation research and development group in the world during the late 70s and early 80s. == The 21st century == The lab is presently located at NYIT's Long Island campus, and NYIT currently offers a Ph.D. program in Computer Science.
Spell checker
In software, a spell checker (or spelling checker or spell check) is a software feature that checks for misspellings in a text. Spell-checking features are often embedded in software or services, such as a word processor, email client, electronic dictionary, or search engine. == Design == A basic spell checker carries out the following processes: It scans the text and extracts the words contained in it. It then compares each word with a known list of correctly spelled words (i.e. a dictionary). This might contain just a list of words, or it might also contain additional information, such as hyphenation points or lexical and grammatical attributes. An additional step is a language-dependent algorithm for handling morphology. Even for a lightly inflected language like English, the spell checker will need to consider different forms of the same word, such as plurals, verbal forms, contractions, and possessives. For many other languages, such as those featuring agglutination and more complex declension and conjugation, this part of the process is more complicated. It is unclear whether morphological analysis—allowing for many forms of a word depending on its grammatical role—provides a significant benefit for English, though its benefits for highly synthetic languages such as German, Hungarian, or Turkish are clear. As an adjunct to these components, the program's user interface allows users to approve or reject replacements and modify the program's operation. Spell checkers can use approximate string matching algorithms such as Levenshtein distance to find correct spellings of misspelled words. An alternative type of spell checker uses solely statistical information, such as n-grams, to recognize errors instead of correctly-spelled words. This approach usually requires a lot of effort to obtain sufficient statistical information. Key advantages include needing less runtime storage and the ability to correct errors in words that are not included in a dictionary. In some cases, spell checkers use a fixed list of misspellings and suggestions for those misspellings; this less flexible approach is often used in paper-based correction methods, such as the see also entries of encyclopedias. Clustering algorithms have also been used for spell checking combined with phonetic information. == History == === Pre-PC === In 1961, Les Earnest, who headed the research on this budding technology, saw it necessary to include the first spell checker that accessed a list of 10,000 acceptable words. Ralph Gorin, a graduate student under Earnest at the time, created the first true spelling checker program written as an applications program (rather than research) for general English text: SPELL for the DEC PDP-10 at Stanford University's Artificial Intelligence Laboratory, in February 1971. Gorin wrote SPELL in assembly language, for faster action; he made the first spelling corrector by searching the word list for plausible correct spellings that differ by a single letter or adjacent letter transpositions and presenting them to the user. Gorin made SPELL publicly accessible, as was done with most SAIL (Stanford Artificial Intelligence Laboratory) programs, and it soon spread around the world via the new ARPAnet, about ten years before personal computers came into general use. SPELL, its algorithms and data structures inspired the Unix ispell program. The first spell checkers were widely available on mainframe computers in the late 1970s. A group of six linguists from Georgetown University developed the first spell-check system for the IBM corporation. Henry Kučera invented one for the VAX machines of Digital Equipment Corp in 1981. === Unix === The International Ispell program commonly used in Unix is based on R. E. Gorin's SPELL. It was converted to C by Pace Willisson at MIT. The GNU project has its spell checker GNU Aspell. Aspell's main improvement is that it can more accurately suggest correct alternatives for misspelled English words. Due to the inability of traditional spell checkers to check words in complex inflected languages, Hungarian László Németh developed Hunspell, a spell checker that supports agglutinative languages and complex compound words. Hunspell also uses Unicode in its dictionaries. Hunspell replaced the previous MySpell in OpenOffice.org in version 2.0.2. Enchant is another general spell checker, derived from AbiWord. Its goal is to combine programs supporting different languages such as Aspell, Hunspell, Nuspell, Hspell (Hebrew), Voikko (Finnish), Zemberek (Turkish) and AppleSpell under one interface. === PCs === The first spell checkers for personal computers appeared in 1980, such as "WordCheck" for Commodore systems which was released in late 1980 in time for advertisements to go to print in January 1981. Developers such as Maria Mariani and Random House rushed OEM packages or end-user products into the rapidly expanding software market. On the pre-Windows PCs, these spell checkers were standalone programs, many of which could be run in terminate-and-stay-resident mode from within word-processing packages on PCs with sufficient memory. However, the market for standalone packages was short-lived, as by the mid-1980s developers of popular word-processing packages like WordStar and WordPerfect had incorporated spell checkers in their packages, mostly licensed from the above companies, who quickly expanded support from just English to many European and eventually even Asian languages. However, this required increasing sophistication in the morphology routines of the software, particularly with regard to heavily-agglutinative languages like Hungarian and Finnish. Although the size of the word-processing market in a country like Iceland might not have justified the investment of implementing a spell checker, companies like WordPerfect nonetheless strove to localize their software for as many national markets as possible as part of their global marketing strategy. When Apple developed "a system-wide spelling checker" for Mac OS X so that "the operating system took over spelling fixes," it was a first: one "didn't have to maintain a separate spelling checker for each" program. Mac OS X's spellcheck coverage includes virtually all bundled and third party applications. Visual Tools' VT Speller, introduced in 1994, was "designed for developers of applications that support Windows." It came with a dictionary but had the ability to build and incorporate use of secondary dictionaries. === Browsers === Web browsers such as Firefox and Google Chrome offer spell checking support, using Hunspell. Prior to using Hunspell, Firefox and Chrome used MySpell and GNU Aspell, respectively. === Specialties === Some spell checkers have separate support for medical dictionaries to help prevent medical errors. == Functionality == The first spell checkers were "verifiers" instead of "correctors." They offered no suggestions for incorrectly spelled words. This was helpful for typos but it was not so helpful for logical or phonetic errors. The challenge the developers faced was the difficulty in offering useful suggestions for misspelled words. This requires reducing words to a skeletal form and applying pattern-matching algorithms. It might seem logical that where spell-checking dictionaries are concerned, "the bigger, the better," so that correct words are not marked as incorrect. In practice, however, an optimal size for English appears to be around 90,000 entries. If there are more than this, incorrectly spelled words may be skipped because they are mistaken for others. For example, a linguist might determine on the basis of corpus linguistics that the word baht is more frequently a misspelling of bath or bat than a reference to the Thai currency. Hence, it would typically be more useful if a few people who write about Thai currency were slightly inconvenienced than if the spelling errors of the many more people who discuss baths were overlooked. The first MS-DOS spell checkers were mostly used in proofing mode from within word processing packages. After preparing a document, a user scanned the text looking for misspellings. Later, however, batch processing was offered in such packages as Oracle's short-lived CoAuthor and allowed a user to view the results after a document was processed and correct only the words that were known to be wrong. When memory and processing power became abundant, spell checking was performed in the background in an interactive way, such as has been the case with the Sector Software produced Spellbound program released in 1987 and Microsoft Word since Word 95. Spell checkers became increasingly sophisticated; now capable of recognizing grammatical errors. However, even at their best, they rarely catch all the errors in a text (such as homophone errors) and will flag neologisms and foreign words as misspellings. Nonetheless, spell checkers can be considered as a type of foreign language writing aid that non-native language lea