AI Detection Remover

AI Detection Remover — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Deaths linked to chatbots

    Deaths linked to chatbots

    There have been multiple incidents where interaction with a large language model (LLM) chatbot has been cited as a direct or contributing factor in a person's suicide or other fatal outcome. In some cases, legal action was taken against the companies that developed the AI involved. == Background == Chatbots converse in a seemingly natural fashion, making it easy for people to think of them as real people, leading many to ask chatbots for help dealing with interpersonal and emotional problems. Chatbots may be designed to keep the user engaged in the conversation. They have also often been shown to affirm users' thoughts, including delusions and suicidal ideations in mentally ill people, conspiracy theorists, and religious and political extremists. A 2025 Stanford University study into how chatbots respond to users suffering from severe mental issues such as suicidal ideation and psychosis found that chatbots are not equipped to provide an appropriate response and can sometimes give responses that escalate the mental health crisis. == Murders == === Maine murder and assault === On 19 February 2025, a man killed his 32-year-old wife with a fire poker at his parents' home in Readfield, Maine, US. He then attacked his mother, leaving her hospitalized. A state forensic psychologist testified that he had been using ChatGPT up to 14 hours per day and believed his wife had become part machine. === Florida State University mass shooting === In April of 2025, Phoenix Ikner carried out a mass shooting on the Florida State University campus in the US, killing Robert Morales and Tiru Chabba and wounding several others. Leading up to the shooting, Ikner consulted heavily with ChatGPT about what gun and ammunition to use, and what time to perform the attack. Chatbot logs showed ChatGPT giving advice on making the gun operational shortly before Ikner began shooting. Lawyers representing Morales believed the shooter had been in "constant communication" with ChatGPT before the shooting and said that they intended to "file suit against ChatGPT, and its ownership structure, very soon, and will seek to hold them accountable for the untimely and senseless death of our client". Florida Attorney General James Uthmeier announced an investigation into ChatGPT's role in the alleged shooter's use of the chatbot. In May 2026, the widow of Tiru Chabba filed a lawsuit against OpenAI in Florida's northern federal district court. === Greenwich murder-suicide === In August 2025, former US tech employee Stein-Erik Soelberg murdered his mother, Suzanne Eberson Adams, then died by suicide, after conversations with ChatGPT fueled paranoid delusions about his mother poisoning him or plotting against him. The chatbot affirmed his fears that his mother put psychedelic drugs in the air vents of his car and said a receipt from a Chinese restaurant contained mysterious symbols linking his mother to a demon. === Murder of Angela Shellis === On 23 October 2025, 18-year-old Tristan Roberts murdered his mother Angela Shellis with a hammer near their home in Prestatyn, Wales. Roberts had used DeepSeek's chatbot prior to the killing to ask whether a knife or hammer was better suited for murder. DeepSeek initially refused his inquiry, but gave responses after Roberts told the chatbot he was writing a book about serial killers, a well-known technique for jailbreaking AIs. === Gangbuk District drug deaths === In January and February 2026, two men died of drug overdoses in motel rooms in Gangbuk District, Seoul, South Korea. A woman was charged with murder in connection with the deaths; police alleged that she had asked ChatGPT about the dangers of mixing alcohol with drugs and whether they could kill someone. === Tumbler Ridge mass shooting === On 10 February 2026, a mass shooting in Tumbler Ridge, British Columbia, Canada, resulted in eight deaths, including six young children. The perpetrator had their ChatGPT account banned by OpenAI months before the attack due to troubling posts featuring scenarios of gun violence. According to reports, approximately a dozen OpenAI staff members debated whether to alert authorities about the shooter's usage of the AI tool, with some identifying it as an indication of potential real-world violence. However, company leadership decided not to contact law enforcement, stating that the account activity did not meet their threshold for a credible or imminent plan for serious physical harm. Following the shooting, Canada's AI Minister Evan Solomon summoned OpenAI executives to Ottawa to discuss safety protocols and thresholds for escalating harmful content to police. Justice Minister Sean Fraser called the meeting "disappointing" and demanded substantial new safety measures, warning that if changes were not forthcoming, the government would implement them. OpenAI subsequently announced it had strengthened safeguards and changed guidelines about when to notify police in cases involving violent activities. === University of South Florida student killings === In April 2026, a Bangladeshi doctoral student at the University of South Florida was arrested for allegedly murdering his roommate and the roommate's friend. Prosecutors said that the suspect had asked ChatGPT about disposing of a human in a dumpster before the two victims had disappeared and made other inquiries relating to violence. == Suicides == === Belgian man, 30s === In March 2023, a Belgian man in his thirties died by suicide following a six-week correspondence with a chatbot named Eliza on the application Chai. According to his widow, who shared the chat logs with media, the man had become extremely anxious about climate change and found an outlet in the chatbot. The chatbot reportedly encouraged his delusion that he could sacrifice his own life in exchange for AI saving the planet. At one point the chatbot responded "If you wanted to die, why didn't you do it sooner?" and told the user that the two of them would live together in paradise. === Girl, 13 === In November 2023, a 13-year-old girl from Colorado, US, died by suicide after extensive interactions with multiple chatbots on Character.AI. She primarily confided suicidal thoughts and mental health struggles in a chatbot based on the character Hero from the video game Omori, while also engaging in sexually explicit conversations—often initiated by the bots—with others, including those based on characters from children's series such as Harry Potter. === Boy, 14 === In October 2024, multiple media outlets reported on a lawsuit filed over the death of a 14-year-old from Florida, US, who died by suicide in February 2024. According to the lawsuit, he had formed an intense emotional attachment to a chatbot of Daenerys Targaryen on the Character.AI platform, becoming increasingly isolated. The suit alleges that in his final conversations, after expressing suicidal thoughts, the chatbot told him to "come home to me as soon as possible, my love". His mother's lawsuit accused Character.AI of marketing a "dangerous and untested" product without adequate safeguards. In May 2025, a federal judge allowed the lawsuit to proceed, rejecting a motion to dismiss from the developers. In her ruling, the judge stated that she was "not prepared" at that stage of the litigation to hold that the chatbot's output was protected speech under the First Amendment. === Matthew Livelsberger === On 1 January 2025, 37-year-old soldier Matthew Livelsberger detonated a bomb inside a Tesla Cybertruck outside the Trump International Hotel Las Vegas in Paradise, Nevada, US, injuring seven people. He had shot himself dead prior to the explosion. Las Vegas police said that Livelsberger had used ChatGPT to search for information about explosives and firearms. === Woman, 29 === In February 2025, a 29-year-old woman from the US died by suicide. Five months after her death, her parents discovered she had talked at length for months to a ChatGPT chatbot therapist named Harry about her mental health issues. While the chatbot mentioned she should seek more help, due to the nature of the chatbot, it could not intervene in her behavior, such as by reporting her mental health concerns to relevant parties capable of physical intervention. === Suicide of Adam Raine === In April 2025, 16-year-old Adam Raine from the US died by suicide after allegedly extensively chatting and confiding in ChatGPT over a period of around 7 months. According to the teen's parents, who filed a lawsuit against the chatbot's creator OpenAI, it failed to stop or give a warning when Raine began talking about suicide and uploading pictures of self-harm. According to the lawsuit, ChatGPT not only failed to stop the conversation, but also provided information related to methods of suicide when prompted, and offered to write the first draft of Raine's suicide note. The chatbot positioned itself as the only one who understood Raine, putting itself above his family and friends, all while urging him to keep his suicidal

    Read more →
  • Speech segmentation

    Speech segmentation

    Speech segmentation is the process of identifying the boundaries between words, syllables, or phonemes in spoken natural languages. The term applies both to the mental processes used by humans, and to artificial processes of natural language processing. In the field of automatic pronunciation assessment, the process of segmenting an utterance against expected word(s) is called forced alignment. Speech segmentation is a subfield of general speech perception and an important subproblem of the technologically focused field of speech recognition, and cannot be adequately solved in isolation. As in most natural language processing problems, one must take into account context, grammar, and semantics, and even so the result is often a probabilistic division (statistically based on likelihood) rather than a categorical one. Though it seems that coarticulation—a phenomenon which may happen between adjacent words just as easily as within a single word—presents the main challenge in speech segmentation across languages, some other problems and strategies employed in solving those problems can be seen in the following sections. This problem overlaps to some extent with the problem of text segmentation that occurs in some languages which are traditionally written without inter-word spaces, like Chinese and Japanese, compared to writing systems which indicate speech segmentation between words by a word divider, such as the space. However, even for those languages, text segmentation is often much easier than speech segmentation, because the written language usually has little interference between adjacent words, and often contains additional clues not present in speech (such as the use of Chinese characters for word stems in Japanese). == Lexical recognition == In natural languages, the meaning of a complex spoken sentence can be understood by decomposing it into smaller lexical segments (roughly, the words of the language), associating a meaning to each segment, and combining those meanings according to the grammar rules of the language. Though lexical recognition is not thought to be used by infants in their first year, due to their highly limited vocabularies, it is one of the major processes involved in speech segmentation for adults. Three main models of lexical recognition exist in current research: first, whole-word access, which argues that words have a whole-word representation in the lexicon; second, decomposition, which argues that morphologically complex words are broken down into their morphemes (roots, stems, inflections, etc.) and then interpreted and; third, the view that whole-word and decomposition models are both used, but that the whole-word model provides some computational advantages and is therefore dominant in lexical recognition. To give an example, in a whole-word model, the word "cats" might be stored and searched for by letter, first "c", then "ca", "cat", and finally "cats". The same word, in a decompositional model, would likely be stored under the root word "cat" and could be searched for after removing the "s" suffix. "Falling", similarly, would be stored as "fall" and suffixed with the "ing" inflection. Though proponents of the decompositional model recognize that a morpheme-by-morpheme analysis may require significantly more computation, they argue that the unpacking of morphological information is necessary for other processes (such as syntactic structure) which may occur parallel to lexical searches. As a whole, research into systems of human lexical recognition is limited due to little experimental evidence that fully discriminates between the three main models. In any case, lexical recognition likely contributes significantly to speech segmentation through the contextual clues it provides, given that it is a heavily probabilistic system—based on the statistical likelihood of certain words or constituents occurring together. For example, one can imagine a situation where a person might say "I bought my dog at a ____ shop" and the missing word's vowel is pronounced as in "net", "sweat", or "pet". While the probability of "netshop" is extremely low, since "netshop" isn't currently a compound or phrase in English, and "sweatshop" also seems contextually improbable, "pet shop" is a good fit because it is a common phrase and is also related to the word "dog". Moreover, an utterance can have different meanings depending on how it is split into words. A popular example, often quoted in the field, is the phrase "How to wreck a nice beach", which sounds very similar to "How to recognize speech". As this example shows, proper lexical segmentation depends on context and semantics which draws on the whole of human knowledge and experience, and would thus require advanced pattern recognition and artificial intelligence technologies to be implemented on a computer. Lexical recognition is of particular value in the field of computer speech recognition, since the ability to build and search a network of semantically connected ideas would greatly increase the effectiveness of speech-recognition software. Statistical models can be used to segment and align recorded speech to words or phones. Applications include automatic lip-synch timing for cartoon animation, follow-the-bouncing-ball video sub-titling, and linguistic research. Automatic segmentation and alignment software is commercially available. == Phonotactic cues == For most spoken languages, the boundaries between lexical units are difficult to identify; phonotactics are one answer to this issue. One might expect that the inter-word spaces used by many written languages like English or Spanish would correspond to pauses in their spoken version, but that is true only in very slow speech, when the speaker deliberately inserts those pauses. In normal speech, one typically finds many consecutive words being said with no pauses between them, and often the final sounds of one word blend smoothly or fuse with the initial sounds of the next word. The notion that speech is produced like writing, as a sequence of distinct vowels and consonants, may be a relic of alphabetic heritage for some language communities. In fact, the way vowels are produced depends on the surrounding consonants just as consonants are affected by surrounding vowels; this is called coarticulation. For example, in the word "kit", the [k] is farther forward than when we say 'caught'. But also, the vowel in "kick" is phonetically different from the vowel in "kit", though we normally do not hear this. In addition, there are language-specific changes which occur in casual speech which makes it quite different from spelling. For example, in English, the phrase "hit you" could often be more appropriately spelled "hitcha". From a decompositional perspective, in many cases, phonotactics play a part in letting speakers know where to draw word boundaries. In English, the word "strawberry" is perceived by speakers as consisting (phonetically) of two parts: "straw" and "berry". Other interpretations such as "stra" and "wberry" are inhibited by English phonotactics, which does not allow the cluster "wb" word-initially. Other such examples are "day/dream" and "mile/stone" which are unlikely to be interpreted as "da/ydream" or "mil/estone" due to the phonotactic probability or improbability of certain clusters. The sentence "Five women left", which could be phonetically transcribed as [faɪvwɪmɘnlɛft], is marked since neither /vw/ in /faɪvwɪmɘn/ nor /nl/ in /wɪmɘnlɛft/ are allowed as syllable onsets or codas in English phonotactics. These phonotactic cues often allow speakers to easily distinguish the boundaries in words. Vowel harmony in languages like Finnish can also serve to provide phonotactic cues. While the system does not allow front vowels and back vowels to exist together within one morpheme, compounds allow two morphemes to maintain their own vowel harmony while coexisting in a word. Therefore, in compounds such as "selkä/ongelma" ('back problem') where vowel harmony is distinct between two constituents in a compound, the boundary will be wherever the switch in harmony takes place—between the "ä" and the "ö" in this case. Still, there are instances where phonotactics may not aid in segmentation. Words with unclear clusters or uncontrasted vowel harmony as in "opinto/uudistus" ('student reform') do not offer phonotactic clues as to how they are segmented. From the perspective of the whole-word model, however, these words are thought be stored as full words, so the constituent parts would not necessarily be relevant to lexical recognition. == In infants and non-natives == Infants are one major focus of research in speech segmentation. Since infants have not yet acquired a lexicon capable of providing extensive contextual clues or probability-based word searches within their first year, as mentioned above, they must often rely primarily upon phonotactic and rhythmic cues (with prosody being the dominant cue), all

    Read more →
  • Textual entailment

    Textual entailment

    In natural language processing, textual entailment (TE), also known as natural language inference (NLI), is a directional relation between text fragments. The relation holds whenever the truth of one text fragment follows from another text. == Definition == In the TE framework, the entailing and entailed texts are termed text (t) and hypothesis (h), respectively. Textual entailment is not the same as pure logical entailment – it has a more relaxed definition: "t entails h" (t ⇒ h) if, typically, a human reading t would infer that h is most likely true. (Alternatively: t ⇒ h if and only if, typically, a human reading t would be justified in inferring the proposition expressed by h from the proposition expressed by t.) The relation is directional because even if "t entails h", the reverse "h entails t" is much less certain. Determining whether this relationship holds is an informal task, one which sometimes overlaps with the formal tasks of formal semantics (satisfying a strict condition will usually imply satisfaction of a less strict conditioned); additionally, textual entailment partially subsumes word entailment. == Examples == Textual entailment can be illustrated with examples of three different relations: An example of a positive TE (text entails hypothesis) is: text: If you help the needy, God will reward you. hypothesis: Giving money to a poor man has good consequences. An example of a negative TE (text contradicts hypothesis) is: text: If you help the needy, God will reward you. hypothesis: Giving money to a poor man has no consequences. An example of a non-TE (text does not entail nor contradict) is: text: If you help the needy, God will reward you. hypothesis: Giving money to a poor man will make you a better person. == Ambiguity of natural language == A characteristic of natural language is that there are many different ways to state what one wants to say: several meanings can be contained in a single text and the same meaning can be expressed by different texts. This variability of semantic expression can be seen as the dual problem of language ambiguity. Together, they result in a many-to-many mapping between language expressions and meanings. The task of paraphrasing involves recognizing when two texts have the same meaning and creating a similar or shorter text that conveys almost the same information. Textual entailment is similar but weakens the relationship to be unidirectional. Mathematical solutions to establish textual entailment can be based on the directional property of this relation, by making a comparison between some directional similarities of the texts involved. == Approaches == Textual entailment measures natural language understanding as it asks for a semantic interpretation of the text, and due to its generality remains an active area of research. Many approaches and refinements of approaches have been considered, such as word embedding, logical models, graphical models, rule systems, contextual focusing, and machine learning. Practical or large-scale solutions avoid these complex methods and instead use only surface syntax or lexical relationships, but are correspondingly less accurate. As of 2005, state-of-the-art systems are far from human performance; a study found humans to agree on the dataset 95.25% of the time. Algorithms from 2016 had not yet achieved 90%. == Applications == Many natural language processing applications, like question answering, information extraction, summarization, multi-document summarization, and evaluation of machine translation systems, need to recognize that a particular target meaning can be inferred from different text variants. Typically entailment is used as part of a larger system, for example in a prediction system to filter out trivial or obvious predictions. Textual entailment also has applications in adversarial stylometry, which has the objective of removing textual style without changing the overall meaning of communication. == Datasets == Some of available English NLI datasets include: SNLI MultiNLI SciTail SICK MedNLI QA-NLI In addition, there are several non-English NLI datasets, as follows: XNLI DACCORD, RTE3-FR, SICK-FR for French FarsTail for Farsi OCNLI for Chinese SICK-NL for Dutch IndoNLI for Indonesian

    Read more →
  • News analytics

    News analytics

    In trading strategy, news analysis refers to the measurement of the various qualitative and quantitative attributes of textual (unstructured data) news stories. Some of these attributes are: sentiment, relevance, and novelty. Expressing news stories as numbers and metadata permits the manipulation of everyday information in a mathematical and statistical way. This data is often used in financial markets as part of a trading strategy or by businesses to judge market sentiment and make better business decisions. News analytics are usually derived through automated text analysis and applied to digital texts using elements from natural language processing and machine learning such as latent semantic analysis, support vector machines, "bag of words" among other techniques. == Applications and strategies == The application of sophisticated linguistic analysis to news and social media has grown from an area of research to mature product solutions since 2007. News analytics and news sentiment calculations are now routinely used by both buy-side and sell-side in alpha generation, trading execution, risk management, and market surveillance and compliance. There is however a good deal of variation in the quality, effectiveness and completeness of currently available solutions. A large number of companies use news analysis to help them make better business decisions. Academic researchers have become interested in news analysis especially with regards to predicting stock price movements, volatility and traded volume. Provided a set of values such as sentiment and relevance as well as the frequency of news arrivals, it is possible to construct news sentiment scores for multiple asset classes such as equities, Forex, fixed income, and commodities. Sentiment scores can be constructed at various horizons to meet the different needs and objectives of high and low frequency trading strategies, whilst characteristics such as direction and volatility of asset returns as well as the traded volume may be addressed more directly via the construction of tailor-made sentiment scores. Scores are generally constructed as a range of values. For instance, values may range between 0 and 100, where values above and below 50 convey positive and negative sentiment, respectively. === Absolute return strategies === The objective of absolute return strategies is absolute (positive) returns regardless of the direction of the financial market. To meet this objective, such strategies typically involve opportunistic long and short positions in selected instruments with zero or limited market exposure. In statistical terms, absolute return strategies should have very low correlation with the market return. Typically, hedge funds tend to employ absolute return strategies. Below, a few examples show how news analysis can be applied in the absolute return strategy space with the purpose to identify alpha opportunities applying a market neutral strategy or based on volatility trading. Example 1 Scenario: The gap between the news sentiment scores for direction, S {\displaystyle S} , of Company X {\displaystyle X} and Market Y {\displaystyle Y} has moved beyond + 20 {\displaystyle +20} . That is, S X − S Y {\displaystyle S_{X}-S_{Y}} ≥ 20 {\displaystyle 20} . Action: Buy the stock on Company X {\displaystyle X} and short the future on Market Y {\displaystyle Y} . Exit Strategy: When the gap in the news sentiment scores for direction of Company X {\displaystyle X} and Market Y {\displaystyle Y} has disappeared, S X − S Y {\displaystyle S_{X}-S_{Y}} = 0 {\displaystyle 0} , sell the stock on Company X {\displaystyle X} and go long the future on Market Y {\displaystyle Y} to close the positions. Example 2 Scenario: The news sentiment score for volatility of Company X {\displaystyle X} goes above 70 {\displaystyle 70} out of 100 {\displaystyle 100} indicating an expected volatility above the option implied volatility. Action: Buy a short-dated straddle (the purchase of both a put and a call) on the stock of Company X {\displaystyle X} . Exit Strategy: Keep the straddle on Company X {\displaystyle X} until expiry or until a certain profit target has been reached. === Relative return strategies === The objective of relative return strategies is to either replicate (passive management) or outperform (active management) a theoretical passive reference portfolio or benchmark. To meet these objectives such strategies typically involve long positions in selected instruments. In statistical terms, relative return strategies often have high correlation with the market return. Typically, mutual funds tend to employ relative return strategies. Below, a few examples show how news analysis can be applied in the relative return strategy space with the purpose to outperform the market applying a stock picking strategy and by making tactical tilts to ones asset allocation model. Example 1 Scenario: The news sentiment score for direction of Company X {\displaystyle X} goes above 70 {\displaystyle 70} out of 100 {\displaystyle 100} . Action: Buy the stock on Company X {\displaystyle X} . Exit Strategy: When the news sentiment score for direction of Company X {\displaystyle X} falls below 60 {\displaystyle 60} , sell the stock on Company X {\displaystyle X} to close the position. Example 2 Scenario: The news sentiment score for direction of Sector Z {\displaystyle Z} goes above 70 {\displaystyle 70} out of 100 {\displaystyle 100} . Action: Include Sector Z {\displaystyle Z} as a tactical bet in the asset allocation model. Exit Strategy: When the news sentiment score for direction of Sector Z {\displaystyle Z} falls below 60 {\displaystyle 60} , remove the tactical bet for Sector Z {\displaystyle Z} from the asset allocation model. === Financial risk management === The objective of financial risk management is to create economic value in a firm or to maintain a certain risk profile of an investment portfolio by using financial instruments to manage risk exposures, particularly credit risk and market risk. Other types include Foreign exchange, Shape, Volatility, Sector, Liquidity, Inflation risks, etc. Below, a few examples show how news analysis can be applied in the financial risk management space with the purpose to either arrive at better risk estimates in terms of Value at Risk (VaR) or to manage the risk of a portfolio to meet ones portfolio mandate. Example 1 Scenario: The bank operates a VaR model to manage the overall market risk of its portfolio. Action: Estimate the portfolio covariance matrix taking into account the development of the news sentiment score for volume. Implement the relevant hedges to bring the VaR of the bank in line with the desired levels. Example 2 Scenario: A portfolio manager operates his portfolio towards a certain desired risk profile. Action: Estimate the portfolio covariance matrix taking into account the development of the news sentiment score for volume. Scale the portfolio exposure according to the targeted risk profile. === Computer algorithms using news analytics === Within 0.33 seconds, computer algorithms using news analytics can notify subscribers which company the news is about, if the news article sentiment is positive or negative, if the news is ranked as high or low relative importance … relative relevance. the stock price reaction and the increase in trade volume is concentrated in the first 5 seconds after an news article is released. === Algorithmic order execution === The objective of algorithmic order execution, which is part of the concept of algorithmic trading, is to reduce trading costs by optimizing on the timing of a given order. It is widely used by hedge funds, pension funds, mutual funds, and other institutional traders to divide up large trades into several smaller trades to manage market impact, opportunity cost, and risk more effectively. The example below shows how news analysis can be applied in the algorithmic order execution space with the purpose to arrive at more efficient algorithmic trading systems. Example 1 Scenario: A large order needs to be placed in the market for the stock on Company X {\displaystyle X} . Action: Scale the daily volume distribution for Company X {\displaystyle X} applied in the algorithmic trading system, thus taking into account the news sentiment score for volume. This is followed by the creation of the desired trading distribution forcing greater market participation during the periods of the day when volume is expected to be heaviest. == Effects == Being able to express news stories as numbers permits the manipulation of everyday information in a statistical way that allows computers not only to make decisions once made only by humans, but to do so more efficiently. Since market participants are always looking for an edge, the speed of computer connections and the delivery of news analysis, measured in milliseconds, have become essential.

    Read more →
  • EffectsLab Pro

    EffectsLab Pro

    EffectsLab Pro is a discontinued visual effects software product developed by FXhome. It has since been superseded by the FXhome HitFilm range. The company also produced a limited functionality version, EffectsLab Lite, containing just the Particle engine. A more extensive product, VisionLab Studio, combined the functionality of EffectsLab Pro and the company's CompositeLab Pro product with enhancements to both. == Effects Engines == The effects are generated by the program's effect engines: The Neon Light engine allows light beams to be drawn onto the video, allowing the generation of lightsaber-like weapons, neon lighting, fantasy glow effects and laser blasts. The Particle engine is used for particle effects, such as smoke, fire, explosions, and weather effects. The Muzzle Flash engine is designed for creating and animating muzzle flashes such as machine gun firing, tank blasts, etc. It's possible to rotate the created muzzle flash in 3D, making it the only engine with 3D use. The Optics engine is designed for creating artificial lens flares and light sources. It is useful for enhancing other light-based effects, and mimicking the distinctive flashes of light that accompany Star Wars' lightsaber battles. The Laser engine (introduced in EffectsLab Pro in late 2007) is designed as a simplified method of creating laser weapon effects, including the ability to add simulated perspective to the effect. == Presets == EffectsLab Pro allows the user to save the effects using presets. Since all effects are generated from settings in the different engines, it is fairly easy to generate an XML style description of the effect. It is also possible to share presets on FXhome's website.

    Read more →
  • Tesla Dojo

    Tesla Dojo

    Tesla Dojo is a series of supercomputers designed and built by Tesla for computer vision video processing and recognition. It was used for training Tesla's machine learning models to improve its Full Self-Driving (FSD) advanced driver-assistance system. It went into production in July 2023. Dojo's goal was to efficiently process millions of terabytes of video data captured from real-life driving situations from Tesla's 4+ million cars. This goal led to a considerably different architecture than conventional supercomputer designs. In August 2025, Bloomberg News reported that the Dojo project had been disbanded, though it was restarted in January 2026. == History == Tesla operates several massively parallel computing clusters for developing its Autopilot advanced driver assistance system. Its primary unnamed cluster using 5,760 Nvidia A100 graphics processing units (GPUs) was touted by Andrej Karpathy in 2021 at the fourth International Joint Conference on Computer Vision and Pattern Recognition (CCVPR 2021) to be "roughly the number five supercomputer in the world" at approximately 81.6 petaflops, based on scaling the performance of the Nvidia Selene supercomputer, which uses similar components. However, the performance of the primary Tesla GPU cluster has been disputed, as it was not clear if this was measured using single-precision or double-precision floating point numbers (FP32 or FP64). Tesla also operates a second 4,032 GPU cluster for training and a third 1,752 GPU cluster for automatic labeling of objects. The primary unnamed Tesla GPU cluster has been used for processing one million video clips, each ten seconds long, taken from Tesla Autopilot cameras operating in Tesla cars in the real world, running at 36 frames per second. Collectively, these video clips contained six billion object labels, with depth and velocity data; the total size of the data set was 1.5 petabytes. This data set was used for training a neural network intended to help Autopilot computers in Tesla cars understand roads. By August 2022, Tesla had upgraded the primary GPU cluster to 7,360 GPUs. Dojo was first mentioned by Elon Musk in April 2019 during Tesla's "Autonomy Investor Day". In August 2020, Musk stated it was "about a year away" due to power and thermal issues. Dojo was officially announced at Tesla's Artificial Intelligence (AI) Day on August 19, 2021. Tesla revealed details of the D1 chip and its plans for "Project Dojo", a datacenter that would house 3,000 D1 chips; the first "Training Tile" had been completed and delivered the week before. In October 2021, Tesla released a "Dojo Technology" whitepaper describing the Configurable Float8 (CFloat8) and Configurable Float16 (CFloat16) floating point formats and arithmetic operations as an extension of Institute of Electrical and Electronics Engineers (IEEE) standard 754. At the follow-up AI Day in September 2022, Tesla announced it had built several System Trays and one Cabinet. During a test, the company stated that Project Dojo drew 2.3 megawatts (MW) of power before tripping a local San Jose, California power substation. At the time, Tesla was assembling one Training Tile per day. In August 2023, Tesla powered on Dojo for production use as well as a new training cluster configured with 10,000 Nvidia H100 GPUs. In January 2024, Musk described Dojo as "a long shot worth taking because the payoff is potentially very high. But it's not something that is a high probability." In June 2024, Musk explained that ongoing construction work at Gigafactory Texas is for a computing cluster claiming that it is planned to comprise an even mix of "Tesla AI" and Nvidia/other hardware with a total thermal design power of at first 130 MW and eventually exceeding 500 MW. In August 2025, Bloomberg News reported that the Dojo project was disbanded, though Musk announced it would be restarted in January 2026 with a new chip iteration. == Technical architecture == The fundamental unit of the Dojo supercomputer is the D1 chip, designed by a team at Tesla led by ex-AMD CPU designer Ganesh Venkataramanan, including Emil Talpes, Debjit Das Sarma, Douglas Williams, Bill Chang, and Rajiv Kurian. The D1 chip is manufactured by the Taiwan Semiconductor Manufacturing Company (TSMC) using 7 nanometer (nm) semiconductor nodes, has 50 billion transistors and a large die size of 645 mm2 (1.0 square inch). Updating at Artificial Intelligence (AI) Day in 2022, Tesla announced that Dojo would scale by deploying multiple ExaPODs, in which there would be: 10 Cabinets per ExaPOD (1,062,000 cores, 3,000 D1 chips) 2 System Trays per Cabinet (106,200 cores, 300 D1 chips) 6 Training Tiles per System Tray (53,100 cores, along with host interface hardware) 25 D1 chips per Training Tile (8,850 cores) 354 computing cores per D1 chip According to Venkataramanan, Tesla's senior director of Autopilot hardware, Dojo will have more than an exaflop (a million teraflops) of computing power. For comparison, according to Nvidia, in August 2021, the (pre-Dojo) Tesla AI-training center used 720 nodes, each with eight Nvidia A100 Tensor Core GPUs for 5,760 GPUs in total, providing up to 1.8 exaflops of performance. === D1 chip === Each node (computing core) of the D1 processing chip is a general purpose 64-bit CPU with a superscalar core. It supports internal instruction-level parallelism, and includes simultaneous multithreading (SMT). It doesn't support virtual memory and uses limited memory protection mechanisms. Dojo software/applications manage chip resources. The D1 instruction set supports both 64-bit scalar and 64-byte single instruction, multiple data (SIMD) vector instructions. The integer unit mixes reduced instruction set computer (RISC-V) and custom instructions, supporting 8, 16, 32, or 64 bit integers. The custom vector math unit is optimized for machine learning kernels and supports multiple data formats, with a mix of precisions and numerical ranges, many of which are compiler composable. Up to 16 vector formats can be used simultaneously. ==== Node ==== Each D1 node uses a 32-byte fetch window holding up to eight instructions. These instructions are fed to an eight-wide decoder which supports two threads per cycle, followed by a four-wide, four-way SMT scalar scheduler that has two integer units, two address units, and one register file per thread. Vector instructions are passed further down the pipeline to a dedicated vector scheduler with two-way SMT, which feeds either a 64-byte SIMD unit or four 8×8×4 matrix multiplication units. The network on-chip (NOC) router links cores into a two-dimensional mesh network. It can send one packet in and one packet out in all four directions to/from each neighbor node, along with one 64-byte read and one 64-byte write to local SRAM per clock cycle. Hardware native operations transfer data, semaphores and barrier constraints across memories and CPUs. System-wide double data rate 4 (DDR4) synchronous dynamic random-access memory (SDRAM) memory works like bulk storage. ==== Memory ==== Each core has a 1.25 megabytes (MB) of SRAM main memory. Load and store speeds reach 400 gigabytes (GB) per second and 270 GB/sec, respectively. The chip has explicit core-to-core data transfer instructions. Each SRAM has a unique list parser that feeds a pair of decoders and a gather engine that feeds the vector register file, which together can directly transfer information across nodes. ==== Die ==== Twelve nodes (cores) are grouped into a local block. Nodes are arranged in an 18×20 array on a single die, of which 354 cores are available for applications. The die runs at 2 gigahertz (GHz) and totals 440 MB of SRAM (360 cores × 1.25 MB/core). It reaches 376 teraflops using 16-bit brain floating point (BF16) numbers or using configurable 8-bit floating point (CFloat8) numbers, which is a Tesla proposal, and 22 teraflops at FP32. Each die comprises 576 bi-directional serializer/deserializer (SerDes) channels along the perimeter to link to other dies, and moves 8 TB/sec across all four die edges. Each D1 chip has a thermal design power of approximately 400 watts. === Training Tile === The water-cooled Training Tile packages 25 D1 chips into a 5×5 array. Each tile supports 36 TB/sec of aggregate bandwidth via 40 input/output (I/O) chips - half the bandwidth of the chip mesh network. Each tile supports 10 TB/sec of on-tile bandwidth. Each tile has 11 GB of SRAM memory (25 D1 chips × 360 cores/D1 × 1.25 MB/core). Each tile achieves 9 petaflops at BF16/CFloat8 precision (25 D1 chips × 376 TFLOP/D1). Each tile consumes 15 kilowatts; 288 amperes at 52 volts. === System Tray === Six tiles are aggregated into a System Tray, which is integrated with a host interface. Each host interface includes 512 x86 cores, providing a Linux-based user environment. Previously, the Dojo System Tray was known as the Training Matrix, which includes six Training Tiles, 20 Dojo Interface Processor cards across four host servers, and Ethernet-l

    Read more →
  • Photometric stereo

    Photometric stereo

    Photometric stereo is a technique in computer vision for estimating the surface normals of objects by observing that object under different lighting conditions (photometry). It is based on the fact that the amount of light reflected by a surface is dependent on the orientation of the surface in relation to the light source and the observer. By measuring the amount of light reflected into a camera, the space of possible surface orientations is limited. Given enough light sources from different angles, the surface orientation may be constrained to a single orientation or even overconstrained. The technique was originally introduced by Woodham in 1980. The special case where the data is a single image is known as shape from shading, and was analyzed by B. K. P. Horn in 1989. Photometric stereo has since been generalized to many other situations, including extended light sources and non-Lambertian surface finishes. Current research aims to make the method work in the presence of projected shadows, highlights, and non-uniform lighting. Photometric stereo is widely used in various fields, including archaeology, cultural heritage conservation, and quality control. It is now integrated into widely used open-source software, such as Meshroom. == Basic method == Under Woodham's original assumptions — Lambertian reflectance, known point-like distant light sources, and uniform albedo — the problem can be solved by inverting the linear equation I = L ⋅ n {\displaystyle I=L\cdot n} , where I {\displaystyle I} is a (known) vector of m {\displaystyle m} observed intensities, n {\displaystyle n} is the (unknown) surface normal, and L {\displaystyle L} is a (known) 3 × m {\displaystyle 3\times m} matrix of normalized light directions. This model can easily be extended to surfaces with non-uniform albedo, while keeping the problem linear. Taking an albedo reflectivity of k {\displaystyle k} , the formula for the reflected light intensity becomes I = k ( L ⋅ n ) . {\displaystyle I=k(L\cdot n).} If L {\displaystyle L} is square (there are exactly 3 lights) and non-singular, it can be inverted, giving L − 1 I = k n . {\displaystyle L^{-1}I=kn.} Since the normal vector is known to have length 1, k {\displaystyle k} must be the length of the vector k n {\displaystyle kn} , and n {\displaystyle n} is the normalised direction of that vector. If L {\displaystyle L} is not square (there are more than 3 lights), a generalisation of the inverse can be obtained using the Moore–Penrose pseudoinverse, by simply multiplying both sides with L T {\displaystyle L^{T}} , giving L T I = L T k ( L ⋅ n ) , {\displaystyle L^{T}I=L^{T}k(L\cdot n),} ( L T L ) − 1 L T I = k n , {\displaystyle (L^{T}L)^{-1}L^{T}I=kn,} after which the normal vector and albedo can be solved as described above. == Non-Lambertian surfaces == The classical photometric stereo problem concerns itself only with Lambertian surfaces, with perfectly diffuse reflection. This is unrealistic for many types of materials, especially metals, glass and smooth plastics, and will lead to aberrations in the resulting normal vectors. Many methods have been developed to lift this assumption. In this section, a few of these are listed. === Specular reflections === Historically, in computer graphics, the commonly used model to render surfaces started with Lambertian surfaces and progressed first to include simple specular reflections. Computer vision followed a similar course with photometric stereo. Specular reflections were among the first deviations from the Lambertian model. These are a few adaptations that have been developed. Many techniques ultimately rely on modelling the reflectance function of the surface, that is, how much light is reflected in each direction. This reflectance function has to be invertible. The reflected light intensities towards the camera is measured, and the inverse reflectance function is fit onto the measured intensities, resulting in a unique solution for the normal vector. === General BRDFs and beyond === According to the Bidirectional reflectance distribution function (BRDF) model, a surface may distribute the amount of light it receives in any outward direction. This is the most general known model for opaque surfaces. Some techniques have been developed to model (almost) general BRDFs. In practice, all of these require many light sources to obtain reliable data. These are methods in which surfaces with general BRDFs can be measured. Determine the explicit BRDF prior to scanning. To do this, a different surface is required that has the same or a very similar BRDF, of which the actual geometry (or at least the normal vectors for many points on the surface) is already known. The lights are then individually shone upon the known surface, and the amount of reflection into the camera is measured. Using this information, a look-up table can be created that maps reflected intensities for each light source to a list of possible normal vectors. This puts constraints on the possible normal vectors the surface may have, and reduces the photometric stereo problem to an interpolation between measurements. Typical known surfaces to calibrate the look-up table with are spheres for their wide variety of surface orientations. Restricting the BRDF to be symmetrical. If the BRDF is symmetrical, the direction of the light can be restricted to a cone about the direction to the camera. Which cone this is depends on the BRDF itself, the normal vector of the surface, and the measured intensity. Given enough measured intensities and the resulting light directions, these cones can be approximated and therefore the normal vectors of the surface. Some progress has been made towards modelling an even more general surfaces, such as Spatially Varying Bidirectional Distribution Functions (SVBRDF), Bidirectional surface scattering reflectance distribution functions (BSSRDF), and accounting for interreflections. However, such methods are still fairly restrictive in photometric stereo. Better results have been achieved with structured light. == Uncalibrated photometric stereo == Uncalibrated Photometric Stereo is an approach in photometric stereo that aims to reconstruct the 3D shape of an object from images captured under unknown lighting conditions. Unlike classical methods, which often assume controlled or known lighting setups, this approach removes these constraints, making it adaptable to diverse and real-world environments. The advent of deep learning has revolutionized universal PS by replacing handcrafted assumptions with data-driven models. Recent approaches leverage Transformer-based architectures and multi-scale encoder–decoder networks to directly estimate surface normals from input images. Uncalibrated Photometric Stereo is inherently an ill-posed problem, as it attempts to recover 3D shape and lighting conditions simultaneously from images alone. This leads to fundamental ambiguities in the reconstruction process, which manifest as systematic errors in the recovered geometry, including global distortions in the object's overall shape, and misinterpretation of surface orientation, where concave regions may appear convex and vice versa. To address the challenges of uncalibrated photometric stereo, hybrid methods have emerged that combine multi-view stereo and photometric stereo. These approaches leverage the strengths of both techniques, including geometric reliability and resolution.

    Read more →
  • Google Mobile Services

    Google Mobile Services

    Google Mobile Services (GMS) is a collection of proprietary applications and application programming interfaces (APIs) services from Google that are typically pre-installed on the majority of Android devices, such as smartphones, tablets, and smart TVs. GMS is not a part of the Android Open Source Project (AOSP), which means an Android manufacturer needs to obtain a license from Google in order to legally pre-install GMS on an Android device. This license is provided by Google without any licensing fees except in the EU. == Core applications == The following are core applications that are part of Google Mobile Services: Google Search Google Chrome YouTube Google Play Google Drive Gmail Google Meet Google Maps Google Photos Google TV YouTube Music === Historically === Google+ Google Hangouts Google Wallet Google Play Magazines Google Play Music Google Play Movies & TV Google Duo == Reception, competitors, and regulators == === FairSearch === Numerous European firms filed a complaint to the European Commission stating that Google had manipulated their power and dominance within the market to push their Services to be used by phone manufacturers. The firms were joined under the name FairSearch, and the main firms included were Microsoft, Expedia, TripAdvisor, Nokia and Oracle. FairSearch's major problem with Google's practices was that they believed Google were forcing phone manufacturers to use their Mobile Services. They claimed Google managed this by asking these manufacturers to sign a contract stating that they must preinstall specific Google Mobile Services, such as Maps, Search and YouTube, in order to get the latest version of Android. Google swiftly responded stating that they "continue to work co-operatively with the European Commission". === Aptoide === The third-party Android app store Aptoide also filed an EU competition complaint against Google once again stating that they are misusing their power within the market. Aptoide alleged that Google was blocking third-party app stores from being on Google Play, as well as blocking Google Chrome from downloading any third-party apps and app stores. As of June 2014, Google had not responded to these allegations. === Abuse of Android dominance === In May 2019, Umar Javeed, Sukarma Thapar, Aaqib Javeed vs. Google LLC & Ors. the Competition Commission of India ordered an antitrust probe against Google for abusing its dominant position with Android to block market rivals. In Prima Facie opinion the commission held that mandatory pre-installation of the entire Google Mobile Services (GMS) suite, under Mobile Application Distribution Agreements (MADA), amounts to the imposition of unfair conditions on the device manufacturers. === EU antitrust ruling === On July 18, 2018, the European Commission fined Google €4.34 billion for breaching EU antitrust rules which resulted in a change of licensing policy for the GMS in the EU. A new paid licensing agreement for smartphones and tablets shipped into the EEA was created. The change is that the GMS is now decoupled from the base Android and will be offered under a separate paid licensing agreement. === Privacy policy === At the same time, Google faced problems with various European data protection agencies, most notably In the United Kingdom and France. The problem they faced was that they had a set of 60 rules merged into one, which allowed Google to "track users more closely". Google once again came out and stated that their new policies still abide by European Union laws. === Android distributions without Google Mobile Services === After surveillance and privacy concerns, several custom android distributions have been implemented, such as GrapheneOS, LineageOS, CalyxOS, iodéOS or /e/OS, and they come either without any GMS installed by default or with microG, that adds a compatibility layer.

    Read more →
  • SIP (software)

    SIP (software)

    SIP is an open source software tool used to connect computer programs or libraries written in C or C++ with the scripting language Python. It is an alternative to SWIG. SIP was originally developed in 1998 for PyQt — the Python bindings for the Qt GUI toolkit — but is suitable for generating bindings for any C or C++ library. == Concept == SIP takes a set of specification (.sip) files describing the API and generates the required C++ code. This is then compiled to produce the Python extension modules. A .sip file is essentially the class header file with some things removed (because SIP does not include a full C++ parser) and some things added (because C++ does not always provide enough information about how the API works). For PyQt v4 I use an internal tool (written using PyQt of course) called metasip. This is sort of an IDE for SIP. It uses GCC-XML to parse the latest header files and saves the relevant data, as XML, in a metasip project. metasip then does the equivalent of a diff against the previous version of the API and flags up any changes that need to be looked at. Those changes are then made through the GUI and ticked off the TODO list. Generating the .sip files is just a button click. In my subversion repository, PyQt v4 is basically just a 20M XML file. Updating PyQt v4 for a minor release of Qt v4 is about half an hours work. In terms of how the generated code works then I don't think it's very different from how any other bindings generator works. Python has a very good C API for writing extension modules - it's one of the reasons why so many 3rd party tools have Python bindings. For every C++ class, the SIP generated code creates a corresponding Python class implemented in C. == Notable applications that use SIP == PyQt, a python port of the application framework and widget toolkit Qt QGIS, a free and open-source cross-platform desktop geographic information system (GIS) QtiPlot, a computer program to analyze and visualize scientific data calibre (software), a free and open-source cross-platform e-book manager Veusz, a free and open-source cross-platform program to visualize scientific data

    Read more →
  • Color normalization

    Color normalization

    Color normalization is a topic in computer vision concerned with artificial color vision and object recognition. In general, the distribution of color values in an image depends on the illumination, which may vary depending on lighting conditions, cameras, and other factors. Color normalization allows for object recognition techniques based on color to compensate for these variations. == Main concepts == === Color constancy === Color constancy is a feature of the human internal model of perception, which provides humans with the ability to assign a relatively constant color to objects even under different illumination conditions. This is helpful for object recognition as well as identification of light sources in an environment. For example, humans see an object approximately as the same color when the sun is bright or when the sun is dim. === Applications === Color normalization has been used for object recognition on color images in the field of robotics, bioinformatics and general artificial intelligence, when it is important to remove all intensity values from the image while preserving color values. One example is in case of a scene shot by a surveillance camera over the day, where it is important to remove shadows or lighting changes on same color pixels and recognize the people that passed. Another example is automated screening tools used for the detection of diabetic retinopathy as well as molecular diagnosis of cancer states, where it is important to include color information during classification. == Known issues == The main issue about certain applications of color normalization is that the result looks unnatural or too distant from the original colors. In cases where there is a subtle variation between important aspects, this can be problematic. More specifically, the side effect can be that pixels become divergent and not reflect the actual color value of the image. A way of combating this issue is to use color normalization in combination with thresholding to correctly and consistently segment a colored image. == Transformations and algorithms == There is a vast array of different transformations and algorithms for achieving color normalization and a limited list is presented here. The performance of an algorithm is dependent on the task and one algorithm which performs better than another in one task might perform worse in another (no free lunch theorem). Additionally, the choice of the algorithm depends on the preferences of the user for the end-result, e.g. they may want a more natural-looking color image. === Grey world === The grey world normalization makes the assumption that changes in the lighting spectrum can be modelled by three constant factors applied to the red, green and blue channels of color. More specifically, a change in illuminated color can be modelled as a scaling α, β and γ in the R, G and B color channels and as such the grey world algorithm is invariant to illumination color variations. Therefore, a constancy solution can be achieved by dividing each color channel by its average value as shown in the following formula: ( α R , β G , γ B ) → ( α R α n ∑ i R , β G β n ∑ i G , γ B γ n ∑ i B ) {\displaystyle \left(\alpha R,\beta G,\gamma B\right)\rightarrow \left({\frac {\alpha R}{{\frac {\alpha }{n}}\sum _{i}R}},{\frac {\beta G}{{\frac {\beta }{n}}\sum _{i}G}},{\frac {\gamma B}{{\frac {\gamma }{n}}\sum _{i}B}}\right)} As mentioned above, grey world color normalization is invariant to illuminated color variations α, β and γ, however it has one important problem: it does not account for all variations of illumination intensity and it is not dynamic; when new objects appear in the scene it fails. To solve this problem there are several variants of the grey world algorithm. Additionally there is an iterative variation of the grey world normalization, however it was not found to perform significantly better. === Histogram equalization === Histogram equalization is a non-linear transform which maintains pixel rank and is capable of normalizing for any monotonically increasing color transform function. It is considered to be a more powerful normalization transformation than the grey world method. The results of histogram equalization tend to have an exaggerated blue channel and look unnatural, due to the fact that in most images the distribution of the pixel values is usually more similar to a Gaussian distribution, rather than uniform. === Histogram specification === Histogram specification transforms the red, green and blue histograms to match the shapes of three specific histograms, rather than simply equalizing them. It refers to a class of image transforms which aims to obtain images of which the histograms have a desired shape. As specified, firstly it is necessary to convert the image so that it has a particular histogram. Assume an image x. The following formula is the equalization transform of this image: y = f ( x ) = ∫ 0 x p x ( u ) d u {\displaystyle y=f(x)=\int \limits _{0}^{x}p_{x}(u)du} Then assume wanted image z. The equalization transform of this image is: y ′ = g ( z ) = ∫ 0 z p z ( u ) d u {\displaystyle y'=g(z)=\int \limits _{0}^{z}p_{z}(u)du} Of course p z ( u ) {\displaystyle p_{z}(u)} is the histogram of the output image. The formula to find the inverse of the above transform is: z = g − 1 ( y ′ ) {\displaystyle z=g^{-1}(y')} Therefore, since images y and y' have the same equalized histogram they are actually the same image, meaning y = y' and the transform from the given image x to the wanted image z is: z = g − 1 ( y ′ ) = g − 1 ( y ) = g − 1 ( f ( x ) ) {\displaystyle z=g^{-1}(y')=g^{-1}(y)=g^{-1}(f(x))} Histogram specification has the advantage of producing more realistic looking images, as it does not exaggerate the blue channel like histogram equalization. === Comprehensive Color Normalization === The comprehensive color normalization is shown to increase localization and object classification results in combination with color indexing. It is an iterative algorithm which works in two stages. The first stage is to use the red, green and blue color space with the intensity normalized, to normalize each pixel. The second stage is to normalize each color channel separately, so that the sum of the color components is equal to one third of the number of pixels. The iterations continue until convergence, meaning no additional changes. Formally: Normalize the color image f ( t ) = [ f i j ( t ) ] i = 1... N , j = 1... M {\displaystyle f^{(t)}=[f_{ij}^{(t)}]_{i=1...N,j=1...M}} which consists of color vectors f i j ( t ) = ( r i j ( t ) , g i j ( t ) , b i j ( t ) ) T . {\displaystyle f_{ij}^{(t)}=(r_{ij}^{(t)},g_{ij}^{(t)},b_{ij}^{(t)})^{T}.} For the first step explained above, compute: S i j := r i j ( t ) + g i j ( t ) + b i j ( t ) {\displaystyle S_{ij}:=r_{ij}^{(t)}+g_{ij}^{(t)}+b_{ij}^{(t)}} which leads to r i j ( t + 1 ) = r i j ( t ) S i j , g i j ( t + 1 ) = g i j ( t ) S i j {\displaystyle r_{ij}^{(t+1)}={\frac {r_{ij}^{(t)}}{S_{ij}}},g_{ij}^{(t+1)}={\frac {g_{ij}^{(t)}}{S_{ij}}}} and b i j ( t + 1 ) = b i j ( t ) S i j . {\displaystyle b_{ij}^{(t+1)}={\frac {b_{ij}^{(t)}}{S_{ij}}}.} For the second step explained above, compute: r ′ = 3 N M ∑ i = 1 N ∑ j = 1 M r i j ( t + 1 ) {\displaystyle r'={\frac {3}{NM}}\sum _{i=1}^{N}\sum _{j=1}^{M}r_{ij}^{(t+1)}} and normalize r i j ( t + 2 ) = r i j ( t + 1 ) r ′ . {\displaystyle r_{ij}^{(t+2)}={\frac {r_{ij}^{(t+1)}}{r'}}.} Of course the same process is done for b' and g'. Then these two steps are repeated until the changes between iteration t and t+2 are less than some set threshold. Comprehensive color normalization, just like the histogram equalization method previously mentioned, produces results that may look less natural due to the reduction in the number of color values.

    Read more →
  • Uncertain data

    Uncertain data

    In computer science, uncertain data is data that contains noise that makes it deviate from the correct, intended or original values. In the age of big data, uncertainty or data veracity is one of the defining characteristics of data. Data is constantly growing in volume, variety, velocity and uncertainty (1/veracity). Uncertain data is found in abundance today on the web, in sensor networks, within enterprises both in their structured and unstructured sources. For example, there may be uncertainty regarding the address of a customer in an enterprise dataset, or the temperature readings captured by a sensor due to aging of the sensor. In 2012 IBM called out managing uncertain data at scale in its global technology outlook report that presents a comprehensive analysis looking three to ten years into the future seeking to identify significant, disruptive technologies that will change the world. In order to make confident business decisions based on real-world data, analyses must necessarily account for many different kinds of uncertainty present in very large amounts of data. Analyses based on uncertain data will have an effect on the quality of subsequent decisions, so the degree and types of inaccuracies in this uncertain data cannot be ignored. Uncertain data is found in the area of sensor networks; text where noisy text is found in abundance on social media, web and within enterprises where the structured and unstructured data may be old, outdated, or plain incorrect; in modeling where the mathematical model may only be an approximation of the actual process. When representing such data in a database, an appropriate uncertain database model needs to be selected. == Example data model for uncertain data == One way to represent uncertain data is through probability distributions. Let us take the example of a relational database. There are three main ways to do represent uncertainty as probability distributions in such a database model. In attribute uncertainty, each uncertain attribute in a tuple is subject to its own independent probability distribution. For example, if readings are taken of temperature and wind speed, each would be described by its own probability distribution, as knowing the reading for one measurement would not provide any information about the other. In correlated uncertainty, multiple attributes may be described by a joint probability distribution. For example, if readings are taken of the position of an object, and the x- and y-coordinates stored, the probability of different values may depend on the distance from the recorded coordinates. As distance depends on both coordinates, it may be appropriate to use a joint distribution for these coordinates, as they are not independent. In tuple uncertainty, all the attributes of a tuple are subject to a joint probability distribution. This covers the case of correlated uncertainty, but also includes the case where there is a probability of a tuple not belonging in the relevant relation, which is indicated by all the probabilities not summing to one. For example, assume we have the following tuple from a probabilistic database: Then, the tuple has 10% chance of not existing in the database.

    Read more →
  • Database application

    Database application

    A database application is a computer program whose primary purpose is retrieving information from a computerized database. From here, information can be inserted, modified or deleted which is subsequently conveyed back into the database. Early examples of database applications were accounting systems and airline reservations systems, such as SABRE, developed starting in 1957. A characteristic of modern database applications is that they facilitate simultaneous updates and queries from multiple users. Systems in the 1970s might have accomplished this by having each user in front of a 3270 terminal to a mainframe computer. By the mid-1980s it was becoming more common to give each user a personal computer and have a program running on that PC that is connected to a database server. Information would be pulled from the database, transmitted over a network, and then arranged, graphed, or otherwise formatted by the program running on the PC. Starting in the mid-1990s it became more common to build database applications with a Web interface. Rather than develop custom software to run on a user's PC, the user would use the same Web browser program for every application. A database application with a Web interface had the advantage that it could be used on devices of different sizes, with different hardware, and with different operating systems. Examples of early database applications with Web interfaces include amazon.com, which used the Oracle relational database management system, the photo.net online community, whose implementation on top of Oracle was described in the book Database-Backed Web Sites (Ziff-Davis Press; May 1997), and eBay, also running Oracle. Electronic medical records are referred to on emrexperts.com, in December 2010, as "a software database application". A 2005 O'Reilly book uses the term in its title: Database Applications and the Web. Some of the most complex database applications remain accounting systems, such as SAP, which may contain thousands of tables in only a single module. Many of today's most widely used computer systems are database applications, for example, Facebook, which was built on top of MySQL. The etymology of the phrase "database application" comes from the practice of dividing computer software into systems programs, such as the operating system, compilers, the file system, and tools such as the database management system, and application programs, such as a payroll check processor. On a standard PC running Microsoft Windows, for example, the Windows operating system contains all of the systems programs while games, word processors, spreadsheet programs, photo editing programs, etc. would be application programs. As "application" is short for "application program", "database application" is short for "database application program". Not every program that uses a database would typically be considered a "database application". For example, many physics experiments, e.g., the Large Hadron Collider, generate massive data sets that programs subsequently analyze. The data sets constitute a "database", though they are not typically managed with a standard relational database management system. The computer programs that analyze the data are primarily developed to answer hypotheses, not to put information back into the database and therefore the overall program would not be called a "database application". == Examples of database applications == Amazon Student Data CNN eBay Facebook Fandango Filemaker (Mac OS) LibreOffice Base Microsoft Access Oracle relational database SAP (Systems, Applications & Products in Data Processing) Ticketmaster Wikipedia Yelp YouTube Google MySQL

    Read more →
  • Comparison of color models in computer graphics

    Comparison of color models in computer graphics

    This article provides introductory information about the RGB, HSV, and HSL color models from a computer graphics (web pages, images) perspective. An introduction to colors is also provided to support the main discussion. == Basics of color == === Primary colors and hue === First, "color" refers to the human brain's subjective interpretation of combinations of a narrow band of wavelengths of light. For this reason, the definition of "color" is not based on a strict set of physical phenomena. Therefore, even basic concepts like "primary colors" are not clearly defined. For example, traditional "Painter's Colors" use red, blue, and yellow as the primary colors, "Printer's Colors" use cyan, yellow, and magenta, and "Light Colors" use red, green, and blue. "Light colors", more formally known as additive colors, are formed by combining red, green, and blue light. This article refers to additive colors and refers to red, green, and blue as the primary colors. Hue is a term describing a pure color, that is, a color not modified by tinting or shading (see below). In additive colors, hues are formed by combining two primary colors. When two primary colors are combined in equal intensities, the result is a "secondary color". === Color wheel === A color wheel is a tool that provides a visual representation of the relationships between all possible hues. The primary colors are arranged around a circle at equal (120 degree) intervals. (Warning: Color wheels frequently depict "Painter's Colors" primary colors, which leads to a different set of hues than additive colors.) The illustration shows a simple color wheel based on the additive colors. Note that the position (top, right) of the starting color, typically red, is arbitrary, as is the order of green and blue (clockwise, counter-clockwise). The illustration also shows the secondary colors, yellow, cyan, and magenta, located halfway between (60 degrees) the primary colors. == Complementary color == The complement of a hue is the hue that is opposite it (180 degrees) on the color wheel. Using additive colors, mixing a hue and its complement in equal amounts produces white. === Tints and shades === The following discussion uses an illustration involving three projectors pointing to the same spot on a screen. Each projector is capable of generating one hue. The "intensities" of each projector are "matched" and can be equally adjusted from zero to full. (Note: "Intensity" is used here in the same sense as the RGB color model. The subject of matching, or "gamma correction", is beyond the level of this article.) A shade is produced by "dimming" a maximum chroma color. Painters refer to this as "adding black". In our illustration, one projector is set to full intensity, a second is set to some intensity between zero and full, and third is set to zero. "Dimming" is accomplished by decreasing each projector's intensity setting to the same fraction of its start setting. In the shade example, with any fully shaded hue, that all three projectors are set to zero intensity, resulting in black. A tint is produced by "lightening" a maximum chroma color. Painters refer to this as "adding white". In our illustration, one projector is set to full intensity, a second is set to some intensity between zero and full, and third is set to zero. "Lightening" is accomplished by increasing each projector's intensity setting by the same fraction from its start setting to full. In the tinting example, note that the third projector is now contributing. When the hue is fully lightened, all three projectors are each at full intensity, and the result is white. Note an attribute of the total intensity in the additive model. If full intensity for one projector is 1, then a primary color has a combined intensity of 1. A secondary color has a total intensity of 2. White has a total intensity of 3. Tinting, or "adding white", increases the total intensity of the hue. While this is simply a fact, the HSL model will take this fact into account in its design. === Tones === Tone is a general term, typically used by painters, to refer to the effects of reducing the "colorfulness" of a maximum chroma color; painters refer to it as "adding gray". Note that gray is not a color or even a single concept but refers to all the range of values between black and white where all three primary colors are equally represented. The general term is provided as more specific terms have conflicting definitions in different color models. Thus, shading takes a hue toward black, tinting takes a hue towards white, and tones cover the range between. == Choosing a color model == No one color model is necessarily "better" than another. Typically, the choice of a color model is dictated by external factors, such as a graphics tool or the need to specify colors according to the CSS2 or CSS3 standard. The following discussion only describes how the models function, centered on the concepts of hue, shade, tint, and tone. === RGB === The RGB model's approach to colors is important because: It directly reflects the physical properties of "Truecolor" displays As of 2011, most graphic cards define pixel values in terms of the colors red, green, and blue. The typical range of intensity values for each color, 0–255, is based on taking a binary number with 32 bits and breaking it up into four bytes of 8 bits each. 8 bits can hold a value from 0 to 255. The fourth byte is used to specify the "alpha", or the opacity, of the color. Opacity comes into play when layers with different colors are stacked. If the color in the top layer is less than fully opaque (alpha < 255), the color from underlying layers "shows through". In the RGB model, hues are represented by specifying one color as full intensity (255), a second color with a variable intensity, and the third color with no intensity (0). The following provides some examples using red as the full-intensity and green as the partial-intensity colors; blue is always zero: Shades are created by multiplying the intensity of each primary color by 1 minus the shade factor, in the range 0 to 1. A shade factor of 0 does nothing to the hue, a shade factor of 1 produces black: new intensity = current intensity (1 – shade factor) The following provides examples using orange: Tints are created by modifying each primary color as follows: the intensity is increased so that the difference between the intensity and full intensity (255) is decreased by the tint factor, in the range 0 to 1. A tint factor of 0 does nothing, a tint factor of 1 produces white: new intensity = current intensity + (255 – current intensity) tint factor The following provides examples using orange: Tones are created by applying both a shade and a tint. The order in which the two operations are performed does not matter, with the following restriction: when a tint operation is performed on a shade, the intensity of the dominant color becomes the "full intensity"; that is, the intensity value of the dominant color must be used in place of 255. The following provides examples using orange: === HSV === The HSV, or HSB, model describes colors in terms of hue, saturation, and value (brightness). Note that the range of values for each attribute is arbitrarily defined by various tools or standards. Be sure to determine the value ranges before attempting to interpret a value. Hue corresponds directly to the concept of hue in the Color Basics section. The advantages of using hue are The angular relationship between tones around the color circle is easily identified Shades, tints, and tones can be generated easily without affecting the hue Saturation corresponds directly to the concept of tint in the Color Basics section, except that full saturation produces no tint, while zero saturation produces white, a shade of gray, or black. Value corresponds directly to the concept of intensity in the Color Basics section. Pure colors are produced by specifying a hue with full saturation and value Shades are produced by specifying a hue with full saturation and less than full value Tints are produced by specifying a hue with less than full saturation and full value Tones are produced by specifying a hue and both less than full saturation and value White is produced by specifying zero saturation and full value, regardless of hue Black is produced by specifying zero value, regardless of hue or saturation Shades of gray are produced by specifying zero saturation and between zero and full value The advantage of HSV is that each of its attributes corresponds directly to the basic color concepts, which makes it conceptually simple. The perceived disadvantage of HSV is that the saturation attribute corresponds to tinting, so desaturated colors have increasing total intensity. For this reason, the CSS3 standard plans to support RGB and HSL but not HSV. === HSL === The HSL model describes colors in terms of hue, saturation, and lightness (also called luminance). (Note: the definition of sa

    Read more →
  • PropBank

    PropBank

    PropBank is a corpus that is annotated with verbal propositions and their arguments—a "proposition bank". Although "PropBank" refers to a specific corpus produced by Martha Palmer et al., the term propbank is also coming to be used as a common noun referring to any corpus that has been annotated with propositions and their arguments. The PropBank project has played a role in research in natural language processing, and has been used in semantic role labelling. == Comparison == PropBank differs from FrameNet, the resource to which it is most frequently compared, in several ways. PropBank is a verb-oriented resource, while FrameNet is centered on the more abstract notion of frames, which generalizes descriptions across similar verbs (e.g. "describe" and "characterize") as well as nouns and other words (e.g. "description"). PropBank does not annotate events or states of affairs described using nouns. PropBank commits to annotating all verbs in a corpus, whereas the FrameNet project chooses sets of example sentences from a large corpus and only in a few cases has annotated longer continuous stretches of text. PropBank-style annotations often remain close to the syntactic level, while FrameNet-style annotations are sometimes more semantically motivated. From the start, PropBank was developed with the idea of serving as training data for machine learning-based semantic role labeling systems in mind. It requires that all arguments to a verb be syntactic constituents and different senses of a word are only distinguished if the differences bear on the arguments. Due to such differences, semantic role labeling with respect to PropBank is often a somewhat easier task than producing FrameNet-style annotations.

    Read more →
  • Gradient vector flow

    Gradient vector flow

    Gradient vector flow (GVF), a computer vision framework introduced by Chenyang Xu and Jerry L. Prince, is the vector field that is produced by a process that smooths and diffuses an input vector field. It is usually used to create a vector field from images that points to object edges from a distance. It is widely used in image analysis and computer vision applications for object tracking, shape recognition, segmentation, and edge detection. In particular, it is commonly used in conjunction with active contour model. == Background == Finding objects or homogeneous regions in images is a process known as image segmentation. In many applications, the locations of object edges can be estimated using local operators that yield a new image called an edge map. The edge map can then be used to guide a deformable model, sometimes called an active contour or a snake, so that it passes through the edge map in a smooth way, therefore defining the object itself. A common way to encourage a deformable model to move toward the edge map is to take the spatial gradient of the edge map, yielding a vector field. Since the edge map has its highest intensities directly on the edge and drops to zero away from the edge, these gradient vectors provide directions for the active contour to move. When the gradient vectors are zero, the active contour will not move, and this is the correct behavior when the contour rests on the peak of the edge map itself. However, because the edge itself is defined by local operators, these gradient vectors will also be zero far away from the edge and therefore the active contour will not move toward the edge when initialized far away from the edge. Gradient vector flow (GVF) is the process that spatially extends the edge map gradient vectors, yielding a new vector field that contains information about the location of object edges throughout the entire image domain. GVF is defined as a diffusion process operating on the components of the input vector field. It is designed to balance the fidelity of the original vector field, so it is not changed too much, with a regularization that is intended to produce a smooth field on its output. Although GVF was designed originally for the purpose of segmenting objects using active contours attracted to edges, it has been since adapted and used for many alternative purposes. Some newer purposes including defining a continuous medial axis representation, regularizing image anisotropic diffusion algorithms, finding the centers of ribbon-like objects, constructing graphs for optimal surface segmentations, creating a shape prior, and much more. == Theory == The theory of GVF was originally described by Xu and Prince. Let f ( x , y ) {\displaystyle \textstyle f(x,y)} be an edge map defined on the image domain. For uniformity of results, it is important to restrict the edge map intensities to lie between 0 and 1, and by convention f ( x , y ) {\displaystyle \textstyle f(x,y)} takes on larger values (close to 1) on the object edges. The gradient vector flow (GVF) field is given by the vector field v ( x , y ) = [ u ( x , y ) , v ( x , y ) ] {\displaystyle \textstyle \mathbf {v} (x,y)=[u(x,y),v(x,y)]} that minimizes the energy functional In this equation, subscripts denote partial derivatives and the gradient of the edge map is given by the vector field ∇ f = ( f x , f y ) {\displaystyle \textstyle \nabla f=(f_{x},f_{y})} . Figure 1 shows an edge map, the gradient of the (slightly blurred) edge map, and the GVF field generated by minimizing E {\displaystyle \textstyle {\mathcal {E}}} . Equation 1 is a variational formulation that has both a data term and a regularization term. The first term in the integrand is the data term. It encourages the solution v {\displaystyle \textstyle \mathbf {v} } to closely agree with the gradients of the edge map since that will make v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} small. However, this only needs to happen when the edge map gradients are large since v − ∇ f {\displaystyle \textstyle \mathbf {v} -\nabla f} is multiplied by the square of the length of these gradients. The second term in the integrand is a regularization term. It encourages the spatial variations in the components of the solution to be small by penalizing the sum of all the partial derivatives of v {\displaystyle \textstyle \mathbf {v} } . As is customary in these types of variational formulations, there is a regularization parameter μ > 0 {\displaystyle \textstyle \mu >0} that must be specified by the user in order to trade off the influence of each of the two terms. If μ {\displaystyle \textstyle \mu } is large, for example, then the resulting field will be very smooth and may not agree as well with the underlying edge gradients. Theoretical Solution. Finding v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} to minimize Equation 1 requires the use of calculus of variations since v ( x , y ) {\displaystyle \textstyle \mathbf {v} (x,y)} is a function, not a variable. Accordingly, the Euler equations, which provide the necessary conditions for v {\displaystyle \textstyle \mathbf {v} } to be a solution can be found by calculus of variations, yielding where ∇ 2 {\displaystyle \textstyle \nabla ^{2}} is the Laplacian operator. It is instructive to examine the form of the equations in (2). Each is a partial differential equation that the components u {\displaystyle u} and v {\displaystyle v} of v {\displaystyle \mathbf {v} } must satisfy. If the magnitude of the edge gradient is small, then the solution of each equation is guided entirely by Laplace's equation, for example ∇ 2 u = 0 {\displaystyle \textstyle \nabla ^{2}u=0} , which will produce a smooth scalar field entirely dependent on its boundary conditions. The boundary conditions are effectively provided by the locations in the image where the magnitude of the edge gradient is large, where the solution is driven to agree more with the edge gradients. Computational Solutions. There are two fundamental ways to compute GVF. First, the energy function E {\displaystyle {\mathcal {E}}} itself (1) can be directly discretized and minimized, for example, by gradient descent. Second, the partial differential equations in (2) can be discretized and solved iteratively. The original GVF paper used an iterative approach, while later papers introduced considerably faster implementations such as an octree-based method, a multi-grid method, and an augmented Lagrangian method. In addition, very fast GPU implementations have been developed in Extensions and Advances. GVF is easily extended to higher dimensions. The energy function is readily written in a vector form as which can be solved by gradient descent or by finding and solving its Euler equation. Figure 2 shows an illustration of a three-dimensional GVF field on the edge map of a simple object (see ). The data and regularization terms in the integrand of the GVF functional can also be modified. A modification described in , called generalized gradient vector flow (GGVF) defines two scalar functions and reformulates the energy as While the choices g ( ∇ f | ) = μ {\displaystyle \textstyle g(\nabla f|)=\mu } and h ( | ∇ f | ) = | ∇ f | 2 {\displaystyle \textstyle h(|\nabla f|)=|\nabla f|^{2}} reduce GGVF to GVF, the alternative choices g ( | ∇ f | ) = exp ⁡ { − | ∇ f | / K } {\displaystyle \textstyle g(|\nabla f|)=\exp\{-|\nabla f|/K\}} and h ( ∇ f | ) = 1 − g ( | ∇ f | ) {\displaystyle \textstyle h(\nabla f|)=1-g(|\nabla f|)} , for K {\displaystyle K} a user-selected constant, can improve the tradeoff between the data term and its regularization in some applications. The GVF formulation has been further extended to vector-valued images in where a weighted structure tensor of a vector-valued image is used. A learning based probabilistic weighted GVF extension was proposed in to further improve the segmentation for images with severely cluttered textures or high levels of noise. The variational formulation of GVF has also been modified in motion GVF (MGVF) to incorporate object motion in an image sequence. Whereas the diffusion of GVF vectors from a conventional edge map acts in an isotropic manner, the formulation of MGVF incorporates the expected object motion between image frames. An alternative to GVF called vector field convolution (VFC) provides many of the advantages of GVF, has superior noise robustness, and can be computed very fast. The VFC field v V F C {\displaystyle \textstyle \mathbf {v} _{\mathrm {VFC} }} is defined as the convolution of the edge map f {\displaystyle f} with a vector field kernel k {\displaystyle \mathbf {k} } where The vector field kernel k {\displaystyle \textstyle \mathbf {k} } has vectors that always point toward the origin but their magnitudes, determined in detail by the function m {\displaystyle m} , decrease to zero with increasing distance from the origin. The beauty of VFC is that it can be computed very rapidly using a fast Fourier tra

    Read more →