AI Assistant In Adobe Acrobat

AI Assistant In Adobe Acrobat — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Referring expression generation

    Referring expression generation (REG) is the subtask of natural language generation (NLG) that has received the most scholarly attention. While NLG is concerned with the conversion of non-linguistic information into natural language, REG focuses only on the creation of referring expressions (noun phrases) that identify specific entities called targets. This task can be split into two parts. The content selection part determines which set of properties distinguishes the intended target, and the linguistic realization part defines how these properties are translated into natural language. A variety of algorithms have been developed in the NLG community to generate different types of referring expressions. == Types of referring expressions == A referring expression (RE), in linguistics, is any noun phrase, or surrogate for a noun phrase, whose function in discourse is to identify some individual object (thing, being, event...). The technical terminology for identify differs a great deal from one school of linguistics to another. The most widespread term is probably refer, and a thing identified is a referent, as for example in the work of John Lyons. In linguistics, the study of reference relations belongs to pragmatics, the study of language use, though it is also a matter of great interest to philosophers, especially those wishing to understand the nature of knowledge, perception and cognition more generally. Various devices can be used for reference: determiners, pronouns, proper names... Reference relations can be of different kinds; referents can be in a "real" or imaginary world, in discourse itself, and they may be singular, plural, or collective. === Pronouns === The simplest type of referring expression is the pronoun, such as he and it. The linguistics and natural language processing communities have developed various models for predicting anaphor referents, such as centering theory, and ideally referring-expression generation would be based on such models. 
However, most NLG systems use much simpler algorithms, for example using a pronoun if the referent was mentioned in the previous sentence (or sentential clause), and no other entity of the same gender was mentioned in this sentence. === Definite noun phrases === There has been a considerable amount of research on generating definite noun phrases, such as the big red book. Much of this builds on the model proposed by Dale and Reiter. This has been extended in various ways; for example, Krahmer et al. present a graph-theoretic model of definite NP generation with many nice properties. In recent years a shared-task event has compared different algorithms for definite NP generation, using the TUNA corpus. === Spatial and temporal reference === Recently there has been more research on generating referring expressions for time and space. Such references tend to be imprecise (what is the exact meaning of tonight?), and also to be interpreted in different ways by different people. Hence it may be necessary to explicitly reason about false positive vs false negative tradeoffs, and even calculate the utility of different possible referring expressions in a particular task context. === Criteria for good expressions === Ideally, a good referring expression should satisfy a number of criteria: Referential success: It should unambiguously identify the referent to the reader. Ease of comprehension: The reader should be able to quickly read and understand it. Computational complexity: The generation algorithm should be fast. No false inferences: The expression should not confuse or mislead the reader by suggesting false implicatures or other pragmatic inferences. For example, a reader may be confused if told "Sit by the brown wooden table" in a context where there is only one table. == History == === Pre-2000 era === REG goes back to the early days of NLG. One of the first approaches was that of Winograd, who in 1972 developed an "incremental" REG algorithm for his SHRDLU program. 
Afterwards, in the 1980s, researchers started to model the human ability to create referring expressions. This new approach to the topic was influenced by the researchers Appelt and Kronfeld, who created the programs KAMP and BERTRAND and considered referring expressions as parts of bigger speech acts. Among their most interesting findings were that referring expressions can be used to add information beyond the identification of the referent, and that communicative context and the Gricean maxims influence referring expressions. Furthermore, their skepticism concerning the naturalness of minimal descriptions made Appelt and Kronfeld's research a foundation of later work on REG. The search for simple, well-defined problems changed the direction of research in the early 1990s. This new approach was led by Dale and Reiter, who stressed the identification of the referent as the central goal. Like Appelt, they discuss the connection between the Gricean maxims and referring expressions in their culminating paper, in which they also propose a formal problem definition. Furthermore, Reiter and Dale discuss the Full Brevity and Greedy Heuristics algorithms as well as their Incremental Algorithm (IA), which became one of the most important algorithms in REG. === Later developments === After 2000, research began to lift some of the simplifying assumptions that had been made in early REG research in order to create simpler algorithms. Different research groups concentrated on different limitations, creating several expanded algorithms. 
Often these extend the IA in a single perspective, for example in relation to: reference to sets, like "the t-shirt wearers" or "the green apples and the banana on the left"; relational descriptions, like "the cup on the table" or "the woman who has three children"; context dependency, vagueness and gradability, which include statements like "the older man" or "the car on the left" that are often unclear without a context; and salience and generation of pronouns, which are highly discourse dependent, making for example "she" a reference to "the (most salient) female person". Many simplifying assumptions are still in place or have just begun to be worked on. Also, a combination of the different extensions has yet to be done and is called a "non-trivial enterprise" by Krahmer and van Deemter. Another important change after 2000 was the increasing use of empirical studies in order to evaluate algorithms. This development took place due to the emergence of transparent corpora. Although there are still discussions about what the best evaluation metrics are, the use of experimental evaluation has already led to a better comparability of algorithms, a discussion about the goals of REG and more task-oriented research. Furthermore, research has extended its range to related topics such as the choice of Knowledge Representation (KR) frameworks. In this area the main question, namely which KR framework is most suitable for use in REG, remains open. The answer to this question depends on how well descriptions can be expressed or found. A lot of the potential of KR frameworks has been left unused so far. Some of the different approaches are the usage of: graph search, which treats relations between targets in the same way as properties; constraint satisfaction, which allows for a separation between problem specification and the implementation; and modern knowledge representation, which offers logical inference in, for example, Description Logic or Conceptual Graphs. 
== Problem definition == Dale and Reiter (1995) think about referring expressions as distinguishing descriptions. They define: the referent as the entity that should be described; the context set as the set of salient entities; the contrast set, or potential distractors, as all elements of the context set except the referent; and a property as a reference to a single attribute–value pair. Each entity in the domain can be characterised as a set of attribute–value pairs, for example $\langle$type, dog$\rangle$, $\langle$gender, female$\rangle$ or $\langle$age, 10 years$\rangle$. The problem then is defined as follows: Let $r$ be the intended referent, and $C$ be the contrast set. Then, a set $L$ of attribute–value pairs will represent a distinguishing description if the following two conditions hold: (1) Every attribute–value pair in $L$ applies to $r$: that is, every element of $L$ specifies an attribute–value that $r$ possesses. (2) For every member $c$ of $C$, there is at least one element $l$ of $L$ that does not apply to $c$: that is, there is an $l$ in $L$ that specifies an attribute–value that $c$ does not possess. $l$ is said
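
The two conditions of the problem definition can be checked mechanically. The following is a minimal Python sketch, not Dale and Reiter's own code: entities are modelled as dictionaries of attribute–value pairs, and the function names, the example entities and the attribute preference order are all assumptions made for illustration. The second function is a simplified reading of the Incremental Algorithm mentioned in the history section: it walks through the attributes in a fixed order and keeps a property only if it rules out at least one remaining distractor.

```python
from typing import Dict, List, Tuple

Entity = Dict[str, str]          # attribute -> value
Property = Tuple[str, str]       # one attribute-value pair


def applies(prop: Property, entity: Entity) -> bool:
    """True if the entity possesses this attribute-value pair."""
    attribute, value = prop
    return entity.get(attribute) == value


def is_distinguishing(description: List[Property],
                      referent: Entity,
                      contrast_set: List[Entity]) -> bool:
    """Check the two conditions for a distinguishing description."""
    # Condition 1: every pair in L applies to the referent r.
    if not all(applies(p, referent) for p in description):
        return False
    # Condition 2: every distractor c is ruled out by at least one pair in L.
    return all(any(not applies(p, c) for p in description)
               for c in contrast_set)


def incremental_algorithm(referent: Entity,
                          contrast_set: List[Entity],
                          preferred_attributes: List[str]) -> List[Property]:
    """Simplified Incremental Algorithm (illustrative sketch only)."""
    description: List[Property] = []
    distractors = list(contrast_set)
    for attribute in preferred_attributes:
        if attribute not in referent:
            continue
        prop = (attribute, referent[attribute])
        # Keep the property only if it removes at least one distractor.
        if any(not applies(prop, c) for c in distractors):
            description.append(prop)
            distractors = [c for c in distractors if applies(prop, c)]
        if not distractors:
            break
    return description


# Hypothetical domain: one target and two distractors.
target = {"type": "dog", "gender": "female", "age": "10 years"}
others = [
    {"type": "dog", "gender": "male", "age": "10 years"},
    {"type": "cat", "gender": "female", "age": "3 years"},
]
result = incremental_algorithm(target, others, ["type", "gender", "age"])
```

In this hypothetical domain the type alone does not single out the target, because the male dog shares it, so the sketch adds the gender as well and then stops. The result depends on the assumed preference order, which is a design choice of the caller and not something fixed by the definition.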

  • Midjourney

    Midjourney is a generative artificial intelligence program and service created and hosted by the San Francisco–based "independent research lab" Midjourney, Inc. Midjourney generates images from natural language descriptions, called prompts, similar to OpenAI's DALL-E and Stability AI's Stable Diffusion. It is one of the technologies of the AI boom. The tool was launched into open beta on July 12, 2022. The Midjourney team is led by David Holz, who co-founded Leap Motion. Holz told The Register in August 2022 that the company was already profitable. Users generate images with Midjourney using Discord bot commands or the official website. == History == Midjourney, Inc. was founded in San Francisco, California, by David Holz, previously a co-founder of Leap Motion. The Midjourney image generation platform entered open beta on July 12, 2022. On March 14, 2022, the Midjourney Discord server launched with a request to post high-quality photographs to Twitter and Reddit for systems training. === Model versions === The company has been working on improving its algorithms, releasing new model versions every few months. Version 2 of their algorithm was launched in April 2022, and version 3 on July 25. On November 5, 2022, the alpha iteration of version 4 was released to users. Starting from the 4th version, MJ models were trained on Google TPUs. On March 15, 2023, the alpha iteration of version 5 was released. The 5.1 model is more opinionated than version 5, applying more of its own stylization to images, while the 5.1 RAW model adds improvements while working better with more literal prompts. The version 5.2 included a new "aesthetics system", and the ability to "zoom out" by generating surroundings to an existing image. On December 21, 2023, the alpha iteration of version 6 was released. The model was trained from scratch over a nine month period. Support was added for better text rendition and a more literal interpretation of prompts. 
== Functionality == Midjourney is accessible through a Discord bot or by accessing their website. Users can use Midjourney through Discord either through their official Discord server, by directly messaging the bot, or by inviting the bot to a third-party server. To generate images, users use the /imagine command and type in a prompt; the bot then returns a set of four images, which users are given the option to upscale. To generate images on the website, users initially needed to have generated at least 1,000 images through the bot; this limitation has since been removed. === Vary (Region) + remix feature === Midjourney released a Vary (Region) feature on September 5, 2023, as part of MidJourney V5.2. This feature allows users to select a specific area of an image and apply variations only to that region while keeping the rest of the image unchanged. === Midjourney web interface === Midjourney introduced its web interface to make its tools more accessible, moving beyond its initial reliance on Discord. This web-based platform was launched in August 2024 alongside the release of Midjourney version 6.1. The web editor consolidates tools such as image editing, panning, zooming, region variation, and inpainting into a single interface. The introduction of the web interface also syncs conversations between Midjourney's Discord channels and web rooms, further enhancing collaboration across both platforms. This shift was in response to growing competition from other AI image generation platforms like Adobe Firefly and Google’s Imagen, which had already launched as native web apps with integration into popular design tools. === Image Weight === This feature lets users control how much influence an uploaded image has on the final output. By adjusting the "image weight" parameter, users can prioritize either the content of the prompt or the characteristics of the image. 
For instance, setting a higher weight will ensure that the generated result closely follows the image's structure and details, while a lower weight allows the text prompt to have more influence over the final output. === Style Reference === With Style Reference, users can upload an image to use as a stylistic guide for their creation. This tool enables MidJourney to extract the style—whether it is the color palette, texture, or overall atmosphere—from the reference image and apply it to a newly generated image. The feature allows users to fine-tune the aesthetics of their creations by integrating specific artistic styles or moods. === Character Reference === The Character Reference feature allows for a more targeted approach in defining characters. Users can upload an image of a character, and the system uses that image as a reference to generate similar characters in the output. This feature is particularly useful in maintaining consistency in appearance for characters across different images. == Uses == Midjourney's founder, David Holz, told The Register that artists use Midjourney for rapid prototyping of artistic concepts to show to clients before starting work themselves. The advertising industry quickly adopted AI tools such as Midjourney, DALL-E, and Stable Diffusion to create original content and brainstorm ideas. Architects have described using the software to generate mood boards for the early stages of projects, as an alternative to searching Google Images. === Notable usage and controversy === The program was used by the British magazine The Economist to create the front cover for an issue in June 2022. In Italy, the leading newspaper Corriere della Sera published a comic created with Midjourney by writer Vanni Santoni in August 2022. Charlie Warzel used Midjourney to generate two images of Alex Jones for Warzel's newsletter in The Atlantic. The use of an AI-generated cover was criticised by people who felt it was taking jobs from artists. 
Warzel called his action a mistake in an article about his decision to use generated images. Last Week Tonight with John Oliver included a 10-minute segment on Midjourney in an episode broadcast in August 2022. A Midjourney image called Théâtre D'opéra Spatial won first place in the digital art competition at the 2022 Colorado State Fair. Jason Allen, who wrote the prompt that led Midjourney to generate the image, printed the image onto a canvas and entered it into the competition using the name Jason M. Allen via Midjourney. Other digital artists were upset by the news. Allen was unapologetic, insisting that he followed the competition's rules. The two category judges were unaware that Midjourney used AI to generate images, although they later said that had they known this, they would have awarded Allen the top prize anyway. In December 2022, Midjourney was used to generate the images for an AI-generated children's book that was created over a weekend. Titled Alice and Sparkle, the book features a young girl who builds a robot that becomes self-aware. The creator, Ammaar Reeshi, used Midjourney to generate a large number of images, from which he chose 13 for the book. Both the product and process drew criticism. One artist wrote that "the main problem... is that it was trained off of artists' work. It's our creations, our distinct styles that we created, that we did not consent to being used." In 2023, the realism of AI-based text-to-image generators, such as Midjourney, DALL-E, or Stable Diffusion, reached such a high level that it led to a significant wave of viral AI-generated photos. Widespread attention was gained by a Midjourney-generated photo of Pope Francis wearing a white puffer coat, the fictional arrest of Donald Trump, and a hoax of an attack on the Pentagon, as well as the usage in professional creative arts. Research has suggested that the images Midjourney generates can be biased. 
For example, even neutral prompts in one study returned unequal results on the aspects of gender, skin color, and location. A study by researchers at the nonprofit group Center for Countering Digital Hate found the tool to be easy to use to generate racist and conspiratorial images. In October 2023, Rest of World reported that Midjourney tends to generate images based on national stereotypes. In 2024, a Frontiers journal published a paper which contained gibberish figures generated with Midjourney, one of which was a diagram of a rat with large testicles and a large penis towering over himself. The paper was retracted a day after the images went viral on Twitter. ==== Content moderation and censorship in Midjourney ==== Prior to May 2023, Midjourney implemented a moderation mechanism predicated on a banned word system. This method prohibited the use of language associated with explicit content, such as sexual or pornographic themes, as well as extreme violence. Moreover, the system also banned certain individual words, including those of religious and political figures, such as Allah or General Secretary of the Chinese Communist Party Xi Jinping. This practice occasionally stirred controversy due to perceiv

  • Degree of truth

    In classical logic, propositions are typically unambiguously considered as being true or false. For instance, the proposition "one is both equal and not equal to itself" is regarded as simply false, being contrary to the Law of Noncontradiction, while the proposition "one is equal to one" is regarded as simply true, by the Law of Identity. However, some mathematicians, computer scientists, and philosophers have been attracted to the idea that a proposition might be more or less true, rather than wholly true or wholly false. Consider the proposition "this pizza is hot". In mathematics, this idea can be developed in terms of fuzzy logic. In computer science, it has found application in artificial intelligence. In philosophy, the idea has proved particularly appealing in the case of vagueness. Degrees of truth is an important concept in law. The term is an older concept than conditional probability. Instead of determining the objective probability, only a subjective assessment is defined. In adjudicative processes, 'substantive truth' is distinct from 'formal legal truth', which comes in four degrees: hearsay, balance of probabilities, proven beyond reasonable doubt and absolute truth (knowledge reserved unto God).
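
As a rough illustration of how fuzzy logic assigns a degree of truth to "this pizza is hot", the Python sketch below maps a temperature to a value between 0 and 1 and combines degrees with the commonly used minimum, maximum and complement operators. The temperature thresholds and all function names are assumptions chosen for the example, not values from any standard.

```python
def hot(temperature_c: float) -> float:
    """Degree of truth of "this pizza is hot" (thresholds are assumed)."""
    lower, upper = 30.0, 70.0   # hypothetical: not hot at 30 C, fully hot at 70 C
    if temperature_c <= lower:
        return 0.0
    if temperature_c >= upper:
        return 1.0
    return (temperature_c - lower) / (upper - lower)


def fuzzy_not(a: float) -> float:
    return 1.0 - a


def fuzzy_and(a: float, b: float) -> float:
    return min(a, b)


def fuzzy_or(a: float, b: float) -> float:
    return max(a, b)


lukewarm = hot(50.0)
# With these operators, "hot and not hot" need not be wholly false.
both = fuzzy_and(lukewarm, fuzzy_not(lukewarm))
```

Under these assumed thresholds a pizza at 50 °C is hot to degree 0.5, and the conjunction of "hot" and "not hot" also receives degree 0.5, whereas at the extremes the classical values 0 and 1 are recovered.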

  • Evolutionary computation

    Evolutionary computation (EC) is, in computer science, a family of algorithms for global optimization inspired by biological evolution, and a subfield of computational intelligence and soft computing studying these algorithms. In technical terms, they are a family of population-based trial and error problem solvers with a metaheuristic or stochastic optimization character. In evolutionary computation, an initial set of candidate solutions is generated and iteratively updated. Each new generation is produced by stochastically removing less desired solutions, and introducing small random changes as well as, depending on the method, mixing parental information. In biological terminology, a population of solutions is subjected to natural selection (or artificial selection), mutation and possibly recombination. These biological functions serve as role models for the genetic operators (mutation, crossover, and selection) used in the EC procedures. As a result, the population will gradually evolve to increase in fitness, in this case the chosen fitness function of the algorithm. Evolutionary computation techniques can produce highly optimized solutions in a wide range of problem settings, making them popular in computer science. Many variants and extensions exist, suited to more specific families of problems and data structures. Evolutionary computation is also sometimes used in evolutionary biology as an in silico experimental procedure to study common aspects of general evolutionary processes. == History == The concept of mimicking evolutionary processes to solve problems originates before the advent of computers, such as when Alan Turing proposed a method of genetic search in 1948. Turing's B-type u-machines resemble primitive neural networks, and connections between neurons were learnt via a sort of genetic algorithm. 
His P-type u-machines resemble a method for reinforcement learning, where pleasure and pain signals direct the machine to learn certain behaviors. However, Turing's paper went unpublished until 1968, and he died in 1954, so this early work had little to no effect on the field of evolutionary computation that was to develop. Evolutionary computing as a field began in earnest in the 1950s and 1960s. There were several independent attempts to use the process of evolution in computing at this time, which developed separately for roughly 15 years. Three branches emerged in different places to attain this goal: evolution strategies, evolutionary programming, and genetic algorithms. A fourth branch, genetic programming, eventually emerged in the early 1990s. These approaches differ in the method of selection, the permitted mutations, and the representation of genetic data. By the 1990s, the distinctions between the historic branches had begun to blur, and the term 'evolutionary computing' was coined in 1991 to denote a field that spans all four paradigms. In 1962, Lawrence J. Fogel initiated the research of Evolutionary Programming in the United States, which was considered an artificial intelligence endeavor. In this system, finite state machines are used to solve a prediction problem: these machines would be mutated (adding or deleting states, or changing the state transition rules), and the best of these mutated machines would be evolved further in future generations. The final finite state machine may be used to generate predictions when needed. The evolutionary programming method was successfully applied to prediction problems, system identification, and automatic control. It was eventually extended to handle time series data and to model the evolution of gaming strategies. In 1964, Ingo Rechenberg and Hans-Paul Schwefel introduced the paradigm of evolution strategies in Germany. 
Since traditional gradient descent techniques produce results that may get stuck in local minima, Rechenberg and Schwefel proposed that random mutations (applied to all parameters of some solution vector) may be used to escape these minima. Child solutions were generated from parent solutions, and the more successful of the two was kept for future generations. This technique was first used by the two to successfully solve optimization problems in fluid dynamics. Initially, this optimization technique was performed without computers, instead relying on dice to determine random mutations. By 1965, the calculations were performed wholly by machine. John Henry Holland introduced genetic algorithms in the 1960s, and they were further developed at the University of Michigan in the 1970s. While the other approaches were focused on solving problems, Holland primarily aimed to use genetic algorithms to study adaptation and determine how it may be simulated. Populations of chromosomes, represented as bit strings, were transformed by an artificial selection process, selecting for specific 'allele' bits in the bit string. Among other mutation methods, interactions between chromosomes were used to simulate the recombination of DNA between different organisms. While previous methods only tracked a single optimal organism at a time (having children compete with parents), Holland's genetic algorithms tracked large populations (having many organisms compete each generation). By the 1990s, a new approach to evolutionary computation that came to be called genetic programming emerged, advocated for by John Koza among others. In this class of algorithms, the subject of evolution was itself a program written in a high-level programming language (there had been some previous attempts as early as 1958 to use machine code, but they met with little success). For Koza, the programs were Lisp S-expressions, which can be thought of as trees of sub-expressions. 
This representation permits programs to swap subtrees, representing a sort of genetic mixing. Programs are scored based on how well they complete a certain task, and the score is used for artificial selection. Sequence induction, pattern recognition, and planning were all successful applications of the genetic programming paradigm. Many other figures played a role in the history of evolutionary computing, although their work did not always fit into one of the major historical branches of the field. The earliest computational simulations of evolution using evolutionary algorithms and artificial life techniques were performed by Nils Aall Barricelli in 1953, with first results published in 1954. Another pioneer in the 1950s was Alex Fraser, who published a series of papers on simulation of artificial selection. As academic interest grew, dramatic increases in the power of computers allowed practical applications, including the automatic evolution of computer programs. Evolutionary algorithms are now used to solve multi-dimensional problems more efficiently than software produced by human designers, and also to optimize the design of systems. == Techniques == Evolutionary computing techniques mostly involve metaheuristic optimization algorithms. 
Broadly speaking, the field includes: agent-based modeling, ant colony optimization, particle swarm optimization, swarm intelligence, artificial immune systems, artificial life, digital organisms, cultural algorithms, differential evolution, dual-phase evolution, estimation of distribution algorithms, evolutionary algorithms, genetic algorithms, evolutionary programming, genetic programming, gene expression programming, grammatical evolution, evolution strategies, the learnable evolution model, learning classifier systems, memetic algorithms, neuroevolution, and self-organization (such as self-organizing maps and competitive learning). Over recent years many dubious algorithms have been proposed that are often just copies of existing algorithms (frequently Particle Swarm Optimization), in which only the metaphor has changed while the algorithm itself is not new at all. A thorough catalogue with many of these dubious algorithms has been published in the Evolutionary Computation Bestiary. It is also important to note that many of these dubiously 'novel' algorithms have poor experimental validation. == Evolutionary algorithms == Evolutionary algorithms form a subset of evolutionary computation in that they generally only involve techniques implementing mechanisms inspired by biological evolution such as reproduction, mutation, recombination and natural selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the cost function determines the environment within which the solutions "live" (see also fitness function). Evolution of the population then takes place after the repeated application of the above operators. In this process, there are two main forces that form the basis of evolutionary systems: Recombination (e.g. crossover) and mutation create the necessary diversity and thereby facilitate novelty, while selection acts as a force increasing quality. Many aspects of such an evolutionary process are stochastic. 
Changed pieces of information due to recombination and mutati
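
The loop described above (a population of candidate solutions, selection as the force increasing quality, recombination and mutation as the sources of diversity) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not a reference implementation: the bit-string representation, the toy fitness function `one_max`, tournament selection, one-point crossover, the parameter values and the elitism step are all choices made for the example.

```python
import random
from typing import Callable, List

Genome = List[int]  # a candidate solution encoded as a bit string


def one_max(genome: Genome) -> int:
    """Toy fitness function (assumed for the example): count of 1-bits."""
    return sum(genome)


def tournament(population: List[Genome], fitness: Callable[[Genome], float],
               rng: random.Random, k: int = 3) -> Genome:
    """Selection: the fittest of k randomly drawn individuals wins."""
    return max(rng.sample(population, k), key=fitness)


def crossover(a: Genome, b: Genome, rng: random.Random) -> Genome:
    """Recombination: one-point crossover of two parents."""
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:]


def mutate(genome: Genome, rate: float, rng: random.Random) -> Genome:
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in genome]


def evolve(fitness: Callable[[Genome], float], genome_length: int = 20,
           population_size: int = 30, generations: int = 50,
           mutation_rate: float = 0.02, seed: int = 0) -> Genome:
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Elitism: carry the current best over unchanged, so the best
        # fitness in the population never decreases.
        offspring = [max(population, key=fitness)]
        while len(offspring) < population_size:
            child = crossover(tournament(population, fitness, rng),
                              tournament(population, fitness, rng), rng)
            offspring.append(mutate(child, mutation_rate, rng))
        population = offspring
    return max(population, key=fitness)


best = evolve(one_max)
```

Because most steps draw on the random number generator, the outcome varies with the seed; fixing the seed only makes a particular run repeatable. Elitism is one of several possible design choices and is used here because it makes the improvement of the best individual easy to reason about.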

  • Cloud testing

    Cloud testing is a form of software testing in which web applications use cloud computing environments (a "cloud") to simulate real-world user traffic. == Steps == Companies simulate real-world web users by using cloud testing services that are provided by cloud service vendors such as Advaltis, Compuware, HP, Keynote Systems, Neotys, RadView and SOASTA. Once user scenarios are developed and the test is designed, these service providers leverage cloud servers (provided by cloud platform vendors such as Amazon.com, Google, Rackspace, Microsoft, etc.) to generate web traffic that originates from around the world. Once the test is complete, the cloud service providers deliver results and analytics back to corporate IT professionals through real-time dashboards for a complete analysis of how their applications and the internet will perform during peak volumes. == Applications == Cloud testing is often seen as only performance or load testing; however, it covers many other types of testing as well. Cloud computing itself is often referred to as the marriage of software as a service (SaaS) and utility computing. In regard to test execution, the software offered as a service may be a transaction generator and the cloud provider's infrastructure software, or may just be the latter. Distributed systems and parallel systems mainly use this approach for testing because of their inherently complex nature. D-Cloud is an example of such a software testing environment. == Tools == Leading cloud computing service providers include, among others, Amazon, Microsoft, Google, RadView, Skytap, HP and SOASTA. == Benefits == The ability and cost to simulate web traffic for software testing purposes has been an inhibitor to overall web reliability. 
The low cost and accessibility of the cloud's extremely large computing resources provide the ability to replicate real-world usage of these systems by geographically distributed users, executing wide varieties of user scenarios, at scales previously unattainable in traditional testing environments. Minimal start-up time along with quality assurance can be achieved by cloud testing. The following are some of the key benefits: reduction in capital expenditure; high scalability

  • Adaptive neuro fuzzy inference system

    An adaptive neuro-fuzzy inference system or adaptive network-based fuzzy inference system (ANFIS) is a kind of artificial neural network that is based on the Takagi–Sugeno fuzzy inference system, a class of fuzzy models introduced by Tomohiro Takagi and Michio Sugeno for system identification and control. The technique was developed in the early 1990s. Since it integrates both neural networks and fuzzy logic principles, it has the potential to capture the benefits of both in a single framework. Its inference system corresponds to a set of fuzzy IF–THEN rules that have learning capability to approximate nonlinear functions. Hence, ANFIS is considered to be a universal estimator. To use ANFIS in a more efficient and optimal way, one can use the best parameters obtained by a genetic algorithm. It has uses in intelligent, situationally aware energy management systems. == ANFIS architecture == It is possible to identify two parts in the network structure, namely the premise and consequence parts. In more detail, the architecture is composed of five layers. The first layer takes the input values and determines the membership functions belonging to them. It is commonly called the fuzzification layer. The membership degrees of each function are computed by using the premise parameter set, namely {a,b,c}. The second layer is responsible for generating the firing strengths for the rules. Due to its task, the second layer is denoted as the "rule layer". The role of the third layer is to normalize the computed firing strengths by dividing each value by the total firing strength. The fourth layer takes as input the normalized values and the consequence parameter set {p,q,r}. The values returned by this layer are the defuzzified ones, and those values are passed to the last layer to return the final output. === Fuzzification layer === The first layer of an ANFIS network is what distinguishes it from a vanilla neural network. 
Neural networks generally operate with a data pre-processing step, in which the features are converted into normalized values between 0 and 1. An ANFIS network does not need a sigmoid function; instead, it performs the preprocessing step by converting numeric values into fuzzy values. Here is an example: suppose the network receives as input the distance between two points in 2D space. The distance is measured in pixels and can take values from 0 up to 500 pixels. Converting the numerical values into fuzzy numbers is done with membership functions, which correspond to semantic descriptions like near, middle and far. Each possible linguistic value is given by an individual neuron. The neuron "near" fires with a value between 0 and 1 if the distance falls within the category "near", while the neuron "middle" fires if the distance falls within that category. The input value "distance in pixels" is thus split into three different neurons for near, middle and far.
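The distance example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a generalized bell membership function with premise parameters (a, b, c), and the parameter values for "near", "middle" and "far" are hypothetical, chosen to cover the 0 to 500 pixel range. The `normalize` helper mirrors the third layer, which divides each firing strength by the total.

```python
def bell(x, a, b, c):
    """Generalized bell membership function with premise parameters a, b, c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))


# Hypothetical premise parameters (a, b, c) for a distance in [0, 500] pixels.
MEMBERSHIP_PARAMS = {
    "near": (125.0, 2.0, 0.0),
    "middle": (125.0, 2.0, 250.0),
    "far": (125.0, 2.0, 500.0),
}


def fuzzify(distance):
    """Layer 1: one membership degree in [0, 1] per linguistic value."""
    return {label: bell(distance, *params)
            for label, params in MEMBERSHIP_PARAMS.items()}


def normalize(strengths):
    """Layer 3: divide each firing strength by the total firing strength."""
    total = sum(strengths.values())
    return {label: w / total for label, w in strengths.items()}


degrees = fuzzify(100.0)          # one degree each for near, middle, far
weights = normalize(degrees)      # the same degrees rescaled to sum to 1
print(degrees)
print(weights)
```

With a single input, the firing strength of each rule is just the membership degree, so the rule layer is omitted here; with several inputs it would combine the degrees, for example by multiplication.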

    Read more →
  • Revelation Space series

    Revelation Space series

    The Revelation Space series is a book series created by Alastair Reynolds. The fictional universe is used as the setting for a number of his novels and stories. Its fictional history follows the human species through various conflicts from the relatively near future (roughly 2200) to approximately 40,000 AD (all the novels to date are set between 2427 and 2858, although certain stories extend beyond this period). It takes its name from Revelation Space (2000), which was the first published novel set in the universe. == Universe == The Revelation Space universe is a fictional universe set in a future version of our world, with the addition of a number of extraterrestrial species and advanced technologies that are not necessarily grounded in current science. It is, nonetheless, somewhat "harder" than most examples of space opera, relying to a considerable extent on science Reynolds believes to be possible; in particular, faster-than-light travel is largely absent. Reynolds has said he prefers to keep the science in his fiction plausible, but he will adopt science he believes to be impossible when it is necessary for the story. The name "Revelation Space universe" has been used by Alastair Reynolds in both the introductory text in the collections Diamond Dogs, Turquoise Days and Galactic North, and also on several editions of the novels set in the universe. He considered calling it the "Exordium universe" after a key plot device, but found that the name was already in use. While a great deal of science fiction reflects either very optimistic or dystopian visions of the human future, the Revelation Space universe is notable in that human societies have not developed to either positive or negative extremes. Instead, despite their dramatically advanced technology, they are similar to those of today in terms of their moral ambiguity and mixture of cruelty and decency, corruption and opportunity. 
The Revelation Space universe contains elements of Lovecraftian horror, with one posthuman entity stating explicitly that some things in the universe are fundamentally beyond human or transhuman understanding. Nevertheless, the main storyline is essentially optimistic, with humans continuing to survive even in a universe that seems fundamentally hostile to intelligent life. The name "Revelation Space" appears in the novel of the same name during Philip Lascaille's account of his visit to Lascaille's Shroud, an anomalous region of the local universe. Lascaille says that "the key" to something momentous "was explained to me [. . .] while I was in Revelation Space." === Chronology === The chronology of the Revelation Space universe extends to roughly one billion years into the past, when the "Dawn War" — a galaxy-spanning conflict over the availability of various natural resources — resulted in almost all sentient life in the galaxy being wiped out. One race of survivors, later termed the Inhibitors, having converted itself to machine form, predicted that the impending Andromeda–Milky Way collision, roughly 3 billion years in our future, may severely damage the capacity of either galaxy to support life. Consequently, they planned to adjust the positions of stars in order to limit the damage the collision would cause. Also central to the Inhibitor project was the eradication of all species above a certain technological level until the crisis was over, as they believed no organic species would be capable of co-operating on such a large-scale project (an in-universe solution to the Fermi paradox). Whilst they were relatively successful, certain advanced species were able to hide from Inhibitor forces, or even fight back. In human history, during the 21st and 22nd centuries, numerous wars occurred, and a flotilla of generation ships was deployed to colonise a planet orbiting the star 61 Cygni (which becomes a major segment of the plot of Chasm City). 
The flotilla later reached a planet termed Sky's Edge, which was to be embroiled in war until human civilisation there was eradicated. Meanwhile, in the Solar System in 2190, a faction known as the Conjoiners emerged as a result of increased experimentation with neural implants. In response, the Coalition for Neural Purity was formed, opposed to the Conjoiners. Nevil Clavain, one of the series's primary protagonists, fought on the side of the Coalition in the ensuing war, but defected later on after being betrayed. Clavain, and the Conjoiners, succeeded in escaping the Solar System and left for surrounding stars. For the next few centuries, the so-called Belle Epoque, humanity enjoyed a period of relative peace and prosperity, with several planets being colonised. The most successful planet of all was Yellowstone, a planet orbiting the star Epsilon Eridani, site of the Glitter Band / Rust Belt and Chasm City. Technologies developed included the Conjoiner Drive, a gift from the Conjoiners (who resumed contact with humanity at an unknown time), advanced nanotechnology, and numerous other devices. With the exception of an attempted takeover of the Glitter Band, no major incidents affected humanity during this time. The Belle Epoque was terminated by the advent of the Melding Plague in 2510, a nanotechnological virus that destroyed all other nanotechnology it came into contact with. Only the Conjoiners were unaffected by this disaster, which devastated the civilisation around Yellowstone. War between the Conjoiners and the Demarchists, a rival faction, erupted as a result of the plague. Meanwhile, activities around a far-flung human colony on the planet Resurgam, orbiting the star Delta Pavonis, inadvertently attracted the attention of the Inhibitors. 
The Conjoiners, also made aware of this event, sent Clavain to recover the exceedingly powerful "Cache Weapons" from this system (said weapons having been stolen from the Conjoiners centuries before) so that they could be used to fend off the Inhibitors while the Conjoiners escaped. Clavain instead defected from the Conjoiners, intending to use the weapons to protect all of humanity. Skade, another Conjoiner, was sent to stop him and recover the weapons. They fought around the Resurgam system, with Clavain and his allies eventually obtaining the weapons. Clavain's ally Remontoire agreed to seek out alien assistance from the Hades Matrix, a nearby alien computer disguised as a neutron star, whilst Clavain sheltered refugees from Resurgam on another planet, later termed Ararat. Remontoire returned in 2675, only a few days after Clavain's death at the hands of Skade, who had arrived with him. Remontoire and his allies were now at war with the Inhibitors, assisted by alien technology obtained from Hades. Even so, it was realised that the humans would not last indefinitely, and Clavain's people, now led by Scorpio, decided to seek out the mysterious "Shadows": a race believed to be near a moon called Hela, site of a theocracy. Aura, daughter of Ana Khouri (an ally of Remontoire) infiltrated the theocracy under the pseudonym Rashmika Els. After considerable conflict, Scorpio and Aura realised that contacting the Shadows was inadvisable. With the later assistance of the Conjoiner known as Glass, and of Clavain's estranged brother Warren, Scorpio and Aura (now going by the name Lady Arek) instead succeeded in contacting the Nestbuilders, an alien race who provided them with weapons capable of defeating the Inhibitors. As such, the Inhibitors were effectively eradicated from human space, with buffer zones and frontiers established to keep them at bay. Humanity then enjoyed a second, 400-year-long golden age. 
After this, however, came the Greenfly outbreak, in which human civilisation was destroyed by a rogue terraforming system of human origin that destroyed planets and converted them to millions of orbiting, vegetation-filled habitats. The Greenfly began to subsume most of human space, with all efforts to stop them failing, due to the Greenfly having assimilated aspects of both the Melding Plague and Inhibitor technology. The storyline of the Revelation Space universe thus far concludes with humanity leaving the Milky Way galaxy in an attempt to set up a new civilisation elsewhere. == Books and stories set in the universe == All short stories and novellas in this universe to date are collected in Galactic North and Diamond Dogs, Turquoise Days, with the exception of "Monkey Suit", "The Last Log of the Lachrimosa", "Night Passage", "Open and Shut", and "Plague Music". === The Inhibitor Sequence === Revelation Space. London: Gollancz, 2000. ISBN 978-0-575-06875-9. Redemption Ark. London: Gollancz, 2002. ISBN 978-0-575-06879-7. Absolution Gap. London: Gollancz, 2003. ISBN 978-0-575-07434-7. Inhibitor Phase. London: Gollancz, 2021. ISBN 978-0-575-09075-0. === Prefect Dreyfus Emergencies === The Prefect. London: Gollancz, 2007, ISBN 978-0-575-07716-4. (Re-released as Aurora Rising in 2017, ISBN 978-1-473-22336-3) Elysium Fire. London: Gollancz, 2018, ISBN 978-0-575-09059-0.

    Read more →
  • Futuresport

    Futuresport

    Futuresport is a 1998 American made-for-television sports film directed by Ernest Dickerson, starring Dean Cain, Vanessa Williams, and Wesley Snipes. It originally aired on ABC in October 1998, was released on VHS and DVD in March 1999 and then distributed outside of the U.S. by Minerva Pictures. == Plot == The film is set in 2025, and centers on a sport called "Futuresport" (a combination of basketball, baseball and hockey that uses hoverboards and rollerblades) created as a non-lethal way to reduce gang warfare. Tre Ramzey (Dean Cain), along with his ex-girlfriend Alex Torres (Vanessa Williams) and his old coach Obike Fixx (Wesley Snipes), must prevent an all-out war between the North American Alliance and the Pan-Pacific Commonwealth (The Com). At stake is who rules over the Hawaiian Islands, which are being terrorized by Eric Sythe (JR Bourne) and his gang, the Hawaiian Liberation Organization (Hilo). It takes a revolutionary sport to stop a revolution.

    Read more →
  • Meesho

    Meesho

    Meesho Limited (short for Meri shop, transl. My shop) is an Indian e-commerce company headquartered in Bengaluru. Founded by Vidit Aatrey and Sanjeev Barnwal in December 2015, Meesho is an online marketplace in categories such as fashion, home and kitchen, beauty and personal care, electronics accessories, and daily use products. == History == Meesho Private Limited, formerly Fashnear Technologies Private Limited, was established by IIT Delhi graduates Vidit Aatrey and Sanjeev Barnwal in December 2015. In 2016, the founders came up with the idea of re-establishing the platform as Meesho, one that would enable country-wide shipping for resellers, using social media sites as marketing tools. In February 2019, the platform reported having around 209,000 users and about 1.2 million monthly orders, and in March 2020, it reported approximately 563,000 users and 3.1 million monthly orders. In 2021, the Meesho mobile application was ranked among the most downloaded shopping apps globally. In 2022, Meesho had about 120 million monthly users, and about 910 million orders were made through the platform, with a gross merchandise value (GMV) of about $5 billion. According to a report, as of August 2023 Meesho had delisted 42 lakh (4.2 million) counterfeit listings and 10 lakh (1 million) restricted products under its initiative Project Suraksha. During the same period, the platform blocked access for over 12,000 user accounts flagged for policy violations. In 2023, Meesho became the fastest shopping app to cross 500 million downloads. In 2024, Meesho introduced Valmo, a logistics marketplace, to provide shipment services to sellers by aggregating multiple logistics providers. 
Meesho works with over 3,000 small businesses and 10-12 large firms for warehousing and sorting operations within its logistics framework. According to media reports, Valmo operates in approximately 15,000 pincodes in India with around 6,000 partners. It is reported to handle over 50% of Meesho's daily orders. In November 2024, Meesho introduced a generative AI-powered voice bot for customer support, managing approximately 60,000 calls daily in English and Hindi. According to media reports, the system resolves the majority of queries without human assistance, with only a small fraction of calls requiring manual intervention. According to media reports, in 2024, Meesho prevented over 22 million suspicious or potentially fraudulent transactions on its platform. The company initiated legal proceedings, resulting in the filing of twelve cases, including nine specifically targeting over forty individuals in the cities of Kolkata and Ranchi. The company also filed a suit in the Delhi High Court for a permanent injunction against parties operating deceptive websites misappropriating its brand identity. The Court granted injunctive relief by directing domain registrars to suspend the infringing websites. Additionally, the Court ordered law enforcement authorities to initiate criminal investigations and freeze the financial accounts associated with the identified offenders. Meesho went public through an initial public offering in December 2025, raising $603 million. It is listed on both the BSE and NSE. == Recognition == In 2023, Meesho was named one of the most influential companies of the year by Time magazine.

    Read more →
  • GITEX AI Europe

    GITEX AI Europe

    GITEX AI Europe is an annual technology trade show and conference held in Berlin, Germany, as part of GITEX GLOBAL. The event focuses on the European technology market, specifically in the sectors of artificial intelligence (AI), cybersecurity, quantum computing, and digital infrastructure. The event is organized by Kaoun International GmbH, the international arm of the Dubai World Trade Centre (DWTC), in partnership with Messe Berlin. == History == The establishment of GITEX AI Europe was announced in 2023 as part of a strategic move to bring the GITEX brand to the European market. The inaugural edition took place from May 21 to 23, 2025, at the Messe Berlin exhibition grounds. The launch was supported by the Berlin Senate and the German Federal Ministry for Economic Affairs and Climate Action. The first edition of GITEX AI Europe in 2025 featured 21,650 attendees, 1,434 exhibiting companies, and 755 startups, with 513 speakers representing 125 countries. The next edition is scheduled for June 30 – July 1, 2026 in Berlin. == Program == The event consists of an exhibition floor for corporate displays, several conference stages for keynote speeches, and specialized sub-events. The conference program includes tracks such as "AI Stack Sovereignty," "Cyber Regulation & Trust Convergence," and "Institutional Growth Capital." GITEX AI Europe incorporates brands under its umbrella: AI Everything Europe: Focused on the development and application of generative AI and machine learning. North Star Europe: A dedicated program for startups and venture capital, featuring the "Supernova Challenge" pitch competition. GISEC Europe: A cybersecurity forum discussing regulation and infrastructure defense. GITEX Quantum Expo: Focused on the commercialization of quantum computing. 
Institutional partners for the event include the German Federal Ministry for Economic Affairs and Climate Action, the European Innovation Council (EIC), the International Telecommunication Union (ITU), Bitkom, and Digital Dubai.

    Read more →
  • Residuated lattice

    Residuated lattice

    In abstract algebra, a residuated lattice is an algebraic structure that is simultaneously a lattice x ≤ y and a monoid x•y that admits operations x\z and z/y, loosely analogous to division or implication, when x•y is viewed as multiplication or conjunction, respectively. Called respectively right and left residuals, these operations coincide when the monoid is commutative. The general concept was introduced by Morgan Ward and Robert P. Dilworth in 1939. Examples, some of which existed prior to the general concept, include Boolean algebras, Heyting algebras, residuated Boolean algebras, relation algebras, and MV-algebras. Residuated semilattices omit the meet operation ∧, for example Kleene algebras and action algebras. == Definition == In mathematics, a residuated lattice is an algebraic structure L = (L, ≤, •, I) such that (i) (L, ≤) is a lattice. (ii) (L, •, I) is a monoid. (iii) For all z there exists for every x a greatest y, and for every y a greatest x, such that x•y ≤ z (the residuation properties). In (iii), the "greatest y", being a function of z and x, is denoted x\z and called the right residual of z by x. Think of it as what remains of z on the right after "dividing" z on the left by x. Dually, the "greatest x" is denoted z/y and called the left residual of z by y. An equivalent, more formal statement of (iii) that uses these operations to name these greatest values is (iii)' for all x, y, z in L, y ≤ x\z ⇔ x•y ≤ z ⇔ x ≤ z/y. As suggested by the notation, the residuals are a form of quotient. More precisely, for a given x in L, the unary operations x• and x\ are respectively the lower and upper adjoints of a Galois connection on L, and dually for the two functions •y and /y. By the same reasoning that applies to any Galois connection, we have yet another definition of the residuals, namely, x•(x\y) ≤ y ≤ x\(x•y), and (y/x)•x ≤ y ≤ (y•x)/x, together with the requirement that x•y be monotone in x and y. 
(When axiomatized using (iii) or (iii)' monotonicity becomes a theorem and hence not required in the axiomatization.) These give a sense in which the functions x• and x\ are pseudoinverses or adjoints of each other, and likewise for •x and /x. This last definition is purely in terms of inequalities, noting that monotonicity can be axiomatized as x • y ≤ (x∨z) • y and similarly for the other operations and their arguments. Moreover, any inequality x ≤ y can be expressed equivalently as an equation, either x∧y = x or x∨y = y. This along with the equations axiomatizing lattices and monoids then yields a purely equational definition of residuated lattices, provided the requisite operations are adjoined to the signature (L, ≤, •, I) thereby expanding it to (L, ∧, ∨, •, I, /, \). When thus organized, residuated lattices form an equational class or variety, whose homomorphisms respect the residuals as well as the lattice and monoid operations. Note that distributivity x • (y ∨ z) = (x • y) ∨ (x • z) and x•0 = 0 are consequences of these axioms and so do not need to be made part of the definition. This necessary distributivity of • over ∨ does not in general entail distributivity of ∧ over ∨, that is, a residuated lattice need not be a distributive lattice. However distributivity of ∧ over ∨ is entailed when • and ∧ are the same operation, a special case of residuated lattices called a Heyting algebra. Alternative notations for x•y include x◦y, x;y (relation algebra), and x⊗y (linear logic). Alternatives for I include e and 1'. Alternative notations for the residuals are x → y for x\y and y ← x for y/x, suggested by the similarity between residuation and implication in logic, with the multiplication of the monoid understood as a form of conjunction that need not be commutative. When the monoid is commutative the two residuals coincide. 
When not commutative, the intuitive meaning of the monoid as conjunction and the residuals as implications can be understood as having a temporal quality: x•y means x and then y, x → y means had x (in the past) then y (now), and y ← x means if-ever x (in the future) then y (at that time), as illustrated by the natural language example at the end of the examples. == Examples == One of the original motivations for the study of residuated lattices was the lattice of (two-sided) ideals of a ring. Given a ring R, the ideals of R, denoted Id(R), form a complete lattice with set intersection acting as the meet operation and "ideal addition" acting as the join operation. The monoid operation • is given by "ideal multiplication", and the element R of Id(R) acts as the identity for this operation. Given two ideals A and B in Id(R), the residuals are given by A/B := { r ∈ R ∣ rB ⊆ A } and B\A := { r ∈ R ∣ Br ⊆ A }. It is worth noting that {0}/B and B\{0} are respectively the left and right annihilators of B. This residuation is related to the conductor (or transporter) in commutative algebra, written as (A:B) = A/B. One difference in usage is that B need not be an ideal of R: it may just be a subset. Boolean algebras and Heyting algebras are commutative residuated lattices in which x•y = x∧y (whence the unit I is the top element 1 of the algebra) and both residuals x\y and y/x are the same operation, namely implication x → y. The second example is quite general since Heyting algebras include all finite distributive lattices, as well as all chains or total orders, for example the unit interval [0,1] in the real line, or the integers and ±∞. 
The structure (Z, min, max, +, 0, −, −) (the integers with subtraction for both residuals) is a commutative residuated lattice such that the unit of the monoid is not the greatest element (indeed there is no least or greatest integer), and the multiplication of the monoid is not the meet operation of the lattice. In this example the inequalities are equalities because − (subtraction) is not merely the adjoint or pseudoinverse of + but the true inverse. Any totally ordered group under addition such as the rationals or the reals can be substituted for the integers in this example. The nonnegative portion of any of these examples is an example provided min and max are interchanged and − is replaced by monus, defined (in this case) so that x − y = 0 when x ≤ y and otherwise is ordinary subtraction. A more general class of examples is given by the Boolean algebra of all binary relations on a set X, namely the power set of X², made a residuated lattice by taking the monoid multiplication • to be composition of relations and the monoid unit to be the identity relation I on X consisting of all pairs (x,x) for x in X. Given two relations R and S on X, the right residual R\S of S by R is the binary relation such that x(R\S)y holds just when for all z in X, zRx implies zSy (notice the connection with implication). The left residual is the mirror image of this: y(S/R)x holds just when for all z in X, xRz implies ySz. This can be illustrated with the binary relations < and > on {0,1} in which 0 < 1 and 1 > 0 are the only relationships that hold. Then x(>\<)y holds just when x = 1, while x(</>)y holds just when y = 0, showing that residuation of < by > is different depending on whether we residuate on the right or the left. This difference is a consequence of the difference between <•> and >•<, where the only relationships that hold are 0(<•>)0 (since 0<1>0) and 1(>•<)1 (since 1>0<1). 
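The example with < and > on {0,1} is small enough to check by brute force. The sketch below represents a relation as a Python set of pairs and implements composition and the two residuals directly from the definitions given above; the function names are illustrative choices, not standard terminology from any library.

```python
from itertools import combinations, product

X = [0, 1]
PAIRS = list(product(X, X))
LT = {(0, 1)}  # the relation <
GT = {(1, 0)}  # the relation >


def compose(r, s):
    # x (r•s) y holds when x r z and z s y for some z.
    return {(x, y) for x, y in PAIRS
            if any((x, z) in r and (z, y) in s for z in X)}


def right_residual(r, s):
    # x (r\s) y holds when, for all z, z r x implies z s y.
    return {(x, y) for x, y in PAIRS
            if all((z, x) not in r or (z, y) in s for z in X)}


def left_residual(s, r):
    # y (s/r) x holds when, for all z, x r z implies y s z.
    return {(y, x) for y, x in PAIRS
            if all((x, z) not in r or (y, z) in s for z in X)}


def all_relations():
    # Every subset of X × X, i.e. every binary relation on X.
    return [set(c) for n in range(len(PAIRS) + 1)
            for c in combinations(PAIRS, n)]


print(right_residual(GT, LT))  # pairs whose first component is 1
print(left_residual(LT, GT))   # pairs whose second component is 0
```

Because there are only 16 relations on a two-element set, the residuation property (iii)' can also be verified exhaustively by looping over `all_relations()` and comparing `compose(r, s) <= t` with `s <= right_residual(r, t)` and `r <= left_residual(t, s)`.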
Had we chosen ≤ and ≥ instead of < and >, ≥\≤ and ≤/≥ would have been the same because ≤•≥ = ≥•≤, both of which always hold between all x and y (since x≤1≥y and x≥0≤y). The Boolean algebra 2^Σ of all formal languages over an alphabet (set) Σ forms a residuated lattice whose monoid multiplication is language concatenation LM and whose monoid unit I is the language {ε} consisting of just the empty string ε. The right residual M\L consists of all words w over Σ such that Mw ⊆ L. The left residual L/M is the same with wM in place of Mw. The residuated lattice of all binary relations on X is finite just when X is finite, and commutative just when X has at most one element. When X is empty the algebra is the degenerate Boolean algebra in which 0 = 1 = I. The residuated lattice of all languages on Σ is commutative just when Σ has at most one letter. It is finite just when Σ is empty, consisting of the two languages 0 (the empty language {}) and the monoid unit I = {ε} = 1. The examples forming a Boolean algebra have special properties treated in the article on residuated Boolean algebras. == Residuated semilattice == A residuated semilattice is defined almost identically for residuated lattices, omitting just the meet operation ∧. Thus it is an algebraic structure L = (L, ∨, •, 1, /, \) satisfying all the residuated lattice equations as specified above except those containing an occurrence of the symbol ∧. The option of defining x ≤ y as x∧y = x is then not available, leaving on

    Read more →
  • TuVox

    TuVox

    TuVox is a company that produces VXML-based telephone speech-recognition applications to replace DTMF touch-tone systems for its clients. == History == TuVox was founded in 2001 by Steven S. Pollock and Ashok Khosla, formerly of Apple Computer Corporation and Claris Corporation. Since then, TuVox has grown to over 150 employees and has US offices in Cupertino, California and Boca Raton, Florida as well as international offices in London, Vancouver and Sydney. In 2005, TuVox acquired the customers and hosting facilities of Net-By-Tel. In 2007, the company raised $20m for its speech recognition and phone menu software. On July 22, 2010, West Interactive, a subsidiary of West Corporation, announced its acquisition of TuVox. == Customers == TuVox clients include: 1-800-Flowers.com, AMC Entertainment, American Airlines, British Airways, M&T Bank, Canon Inc., Gateway, Inc., Motorola, Progress Energy Inc., Telecom New Zealand, Time, Inc., BECU, Virgin America and USAA.

    Read more →
  • Pooling layer

    Pooling layer

    In neural networks, a pooling layer is a kind of network layer that downsamples and aggregates information that is dispersed among many vectors into fewer vectors. It has several uses: it removes redundant information, reducing the amount of computation and memory required; it makes the model more robust to small variations in the input; and it increases the receptive field of neurons in later layers in the network. == Convolutional neural network pooling == Pooling is most commonly used in convolutional neural networks (CNN). Below is a description of pooling in 2-dimensional CNNs. The generalization to n dimensions is immediate. As notation, we consider a tensor $x \in \mathbb{R}^{H \times W \times C}$, where $H$ is height, $W$ is width, and $C$ is the number of channels. A pooling layer outputs a tensor $y \in \mathbb{R}^{H' \times W' \times C'}$. We define two variables $f, s$ called "filter size" (aka "kernel size") and "stride". Sometimes, it is necessary to use a different filter size and stride for horizontal and vertical directions. In such cases, we define 4 variables: $f_H, f_W, s_H, s_W$. The receptive field of an entry in the output tensor $y$ consists of all the entries in $x$ that can affect that entry. === Max pooling === Max Pooling (MaxPool) is commonly used in CNNs to reduce the spatial dimensions of feature maps. Define $\mathrm{MaxPool}(x|f,s)_{0,0,0} = \max(x_{0:f-1,\,0:f-1,\,0})$, where $0:f-1$ means the range $0, 1, \dots, f-1$. Note that we need to avoid the off-by-one error. 
The next entry is $\mathrm{MaxPool}(x|f,s)_{1,0,0} = \max(x_{s:s+f-1,\,0:f-1,\,0})$, and so on. The receptive field of $y_{i,j,c}$ is $x_{is:is+f-1,\,js:js+f-1,\,c}$, so in general, $\mathrm{MaxPool}(x|f,s)_{i,j,c} = \max(x_{is:is+f-1,\,js:js+f-1,\,c})$. If the horizontal and vertical filter sizes and strides differ, then in general, $\mathrm{MaxPool}(x|f,s)_{i,j,c} = \max(x_{is_H:is_H+f_H-1,\,js_W:js_W+f_W-1,\,c})$. More succinctly, we can write $y_k = \max(\{x_{k'} \mid k' \text{ in the receptive field of } k\})$. If $H$ is not expressible as $ks+f$ where $k$ is an integer, then for computing the entries of the output tensor on the boundaries, max pooling would attempt to take as inputs variables off the tensor. In this case, how those non-existent variables are handled depends on the padding conditions. Global Max Pooling (GMP) is a specific kind of max pooling where the output tensor has shape $\mathbb{R}^{C}$ and the receptive field of $y_c$ is all of $x_{0:H-1,\,0:W-1,\,c}$. That is, it takes the maximum over each entire channel. It is often used just before the final fully connected layers in a CNN classification head. 
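The indexing in the max pooling definition can be made concrete with a short pure-Python sketch. It handles a single channel given as a list of rows (channels are pooled independently) and assumes no padding, so windows that would run off the tensor are simply dropped; the function names and the sample feature map are illustrative.

```python
def max_pool_2d(x, f, s):
    """Max pooling over one channel (a list of rows), without padding.

    Output entry (i, j) is the maximum over rows i*s .. i*s+f-1 and
    columns j*s .. j*s+f-1 of the input.
    """
    height, width = len(x), len(x[0])
    out_h = (height - f) // s + 1
    out_w = (width - f) // s + 1
    return [
        [
            max(x[r][c]
                for r in range(i * s, i * s + f)
                for c in range(j * s, j * s + f))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def global_max_pool(x):
    """Global max pooling: the maximum over the entire channel."""
    return max(max(row) for row in x)


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
print(max_pool_2d(feature_map, f=2, s=2))  # [[6, 5], [7, 9]]
print(global_max_pool(feature_map))        # 9
```

Using `range(i * s, i * s + f)` covers exactly the indices $is, \dots, is+f-1$, which is the inclusive range the text writes as $is:is+f-1$.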
=== Average pooling === Average pooling (AvgPool) is defined similarly: $\mathrm{AvgPool}(x|f,s)_{i,j,c} = \mathrm{average}(x_{is:is+f-1,\,js:js+f-1,\,c}) = \frac{1}{f^2} \sum_{k \in is:is+f-1} \sum_{l \in js:js+f-1} x_{k,l,c}$. Global Average Pooling (GAP) is defined similarly to GMP. It was first proposed in Network-in-Network. Similarly to GMP, it is often used just before the final fully connected layers in a CNN classification head. === Interpolations === There are some interpolations of max pooling and average pooling. Mixed Pooling is a linear sum of max pooling and average pooling. That is, $\mathrm{MixedPool}(x|f,s,w) = w\,\mathrm{MaxPool}(x|f,s) + (1-w)\,\mathrm{AvgPool}(x|f,s)$, where $w \in [0,1]$ is either a hyperparameter, a learnable parameter, or randomly sampled anew every time. Lp Pooling is similar to average pooling, but uses the Lp norm average instead of the average: $y_k = \left(\frac{1}{N} \sum_{k' \text{ in the receptive field of } k} |x_{k'}|^p\right)^{1/p}$, where $N$ is the size of the receptive field and $p \geq 1$ is a hyperparameter. If all activations are non-negative, then average pooling is the case of $p = 1$, and max pooling is the case of $p \to \infty$. Square-root pooling is the case of $p = 2$. Stochastic pooling samples a random activation $x_{k'}$ from the receptive field with probability $\frac{x_{k'}}{\sum_{k''} x_{k''}}$. It is the same as average pooling in expectation. 
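The claim that Lp pooling interpolates between average and max pooling can be checked numerically on a single receptive field. The sketch below is a minimal illustration, assuming the window is already extracted as a flat list of non-negative activations; `lp_pool` and `mixed_pool` are illustrative names.

```python
def lp_pool(window, p):
    """Lp-norm average of one receptive field (a flat list of activations)."""
    n = len(window)
    return (sum(abs(v) ** p for v in window) / n) ** (1.0 / p)


def mixed_pool(window, w):
    """Weighted sum of the max and the average, with weight w in [0, 1]."""
    return w * max(window) + (1 - w) * sum(window) / len(window)


window = [1.0, 2.0, 3.0, 4.0]
for p in (1, 2, 50, 200):
    # p = 1 gives the plain average; larger p moves the result toward the max.
    print(p, lp_pool(window, p))
print(mixed_pool(window, 0.5))
```

Because of the $1/N$ factor inside the root, the value approaches the maximum from below and only slowly, which is why fairly large values of $p$ are used in the loop.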
Softmax pooling is like max pooling, but uses softmax, i.e. $\frac{\sum_{k'} e^{\beta x_{k'}} x_{k'}}{\sum_{k''} e^{\beta x_{k''}}}$ where $\beta > 0$. Average pooling is the case $\beta \downarrow 0$, and max pooling is the case $\beta \uparrow \infty$.

Local Importance-based Pooling generalizes softmax pooling to $\frac{\sum_{k'} e^{g(x_{k'})} x_{k'}}{\sum_{k''} e^{g(x_{k''})}}$ where $g$ is a learnable function.

=== Other poolings ===

Spatial pyramidal pooling applies max pooling (or any other form of pooling) in a pyramid structure. That is, it applies global max pooling, then applies max pooling to the image divided into 4 equal parts, then 16, etc. The results are then concatenated. It is a hierarchical form of global pooling, and similar to global pooling, it is often used just before a classification head.

Region of Interest Pooling (also known as RoI pooling) is a variant of max pooling used in R-CNNs for object detection. It is designed to take an arbitrarily-sized input matrix, and output a fixed-sized output matrix.

Covariance pooling computes the covariance matrix of the vectors $\{x_{k,l,0:C-1}\}_{k \in is:is+f-1,\; l \in js:js+f-1}$, which is then flattened to a $C^{2}$-dimensional vector $y_{i,j,0:C^{2}-1}$. Global covariance pooling is used similarly to global max pooling. As average pooling computes the average, which is a first-degree statistic, and covariance is a second-degree statistic, covariance pooling is also called "second-order pooling". It can be generalized to higher-order poolings.

Blur Pooling means applying a blurring method before downsampling.
For example, the Rect-2 blur pooling means taking an average pooling at $f=2, s=1$, then taking every second pixel (identity with $s=2$).

== Vision Transformer pooling ==

In Vision Transformers (ViT), the following kinds of pooling are common.

BERT-like pooling uses a dummy [CLS] ("classification") token. For classification, the output at [CLS] is the classification token, which is then processed by a LayerNorm-feedforward-softmax module into the network's predicted probability distribution over classes. This is the one used by the original ViT and Masked Autoencoder.

Global average pooling (GAP) does not use the dummy token, but simply takes the average of all output tokens as the classification token. It was mentioned in the original ViT as being equally good.

Multihead attention pooling (MAP) applies a multiheaded attention block to pooling. Specifically, it takes as input a list of vectors $x_1, x_2, \dots, x_n$, which might be thought of as the output vectors of a layer of a ViT. It then applies a feedforward layer $\mathrm{FFN}$ to each vector, resulting in a matrix $V = [\mathrm{FFN}(x_1), \dots, \mathrm{FFN}(x_n)]$. This is then sent to a multiheaded attention, resulting in MultiheadedA…

  • Someday (short story)

    "Someday" is a science fiction short story by American writer Isaac Asimov. It was first published in the August 1956 issue of Infinity Science Fiction and reprinted in the collections Earth Is Room Enough (1957), The Complete Robot (1982), Robot Visions (1990), and The Complete Stories, Volume 1 (1990). == Plot summary == The story is set in a future where computers play a central role in organizing society. Humans are employed as computer operators, but they leave most of the thinking to machines. Indeed, whilst binary programming is taught at school, reading and writing have become obsolete. The story concerns a pair of boys who dismantle and upgrade an old Bard, a child's computer whose sole function is to generate random fairy tales. The boys download a book about computers into the Bard's memory in an attempt to expand its vocabulary, but the Bard simply incorporates computers into its standard fairy tale repertoire. The story ends with the boys excitedly leaving the room after deciding to go to the library to learn "squiggles" (writing) as a means of passing secret messages to one another. As they leave, one of the boys accidentally kicks the Bard's on switch. The Bard begins reciting a new story about a poor mistreated and often ignored robot called the Bard, whose sole purpose is to tell stories, which ends with the words: "the little computer knew then that computers would always grow wiser and more powerful until someday—someday—someday—…"

  • Batch normalization

    In artificial neural networks, batch normalization (also known as batch norm) is a normalization technique that makes training faster and more stable by adjusting the inputs to each layer: re-centering them around zero and re-scaling them to a standard size. It was introduced by Sergey Ioffe and Christian Szegedy in 2015.

    Why batch normalization works so well is still debated. It was initially thought to mitigate internal covariate shift, a problem where parameter initialization and changes in the distribution of each layer's inputs affect the learning rate of the network. However, newer research suggests that it does not reduce this shift but instead smooths the objective function (the quantity the network minimizes during training), which improves performance. In very deep networks, batch normalization can initially cause severe gradient explosion, where updates to the network grow uncontrollably large, but this is managed with skip connections in residual networks. Another theory is that batch normalization achieves length-direction decoupling, treating magnitude and direction separately, which speeds up training.

    == Internal covariate shift ==

    Each layer in a neural network has inputs that follow a specific distribution, which shifts during training due to two main factors: the random starting values of the network’s parameters (parameter initialization) and the natural variation in the input data. This shift in the inputs to the network’s inner layers is called internal covariate shift. While a strict definition isn’t fully agreed upon, experiments show that it involves changes in the means and variances of these inputs during training. Batch normalization was first developed to address internal covariate shift. During training, as the parameters of preceding layers adjust, the distribution of inputs to the current layer changes accordingly, so the current layer must constantly readjust to new distributions.
This issue is particularly severe in deep networks, because small changes in shallower hidden layers are amplified as they propagate through the network, resulting in significant shift in deeper hidden layers. Batch normalization was proposed to reduce these unwanted shifts, speeding up training and producing more reliable models.

Beyond possibly tackling internal covariate shift, batch normalization offers several additional advantages. It allows the network to use a higher learning rate (the setting that controls how quickly the network learns) without causing problems such as vanishing or exploding gradients, where updates become too small or too large. It also appears to have a regularizing effect, improving the network’s ability to generalize to new data and reducing the need for dropout, a technique used to prevent overfitting (when a model learns the training data too well and fails on new data). Additionally, networks using batch normalization are less sensitive to the choice of initial parameter values or learning rates, making them more robust and adaptable.

== Procedures ==

=== Transformation ===

In a neural network, batch normalization is achieved through a normalization step that fixes the means and variances of each layer's inputs. Ideally, the normalization would be conducted over the entire training set, but using global statistics is impractical when this step is combined with stochastic optimization methods. Thus, normalization is restricted to each mini-batch in the training process.

Let $B$ denote a mini-batch of size $m$ drawn from the training set. The empirical mean and variance of $B$ are

$$\mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i \qquad \text{and} \qquad \sigma_B^{2} = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_B)^{2}.$$

For a layer of the network with $d$-dimensional input $x = (x^{(1)}, \dots, x^{(d)})$, each dimension of its input is then normalized (i.e. re-centered and re-scaled) separately:

$$\hat{x}_i^{(k)} = \frac{x_i^{(k)} - \mu_B^{(k)}}{\sqrt{\left(\sigma_B^{(k)}\right)^{2} + \epsilon}}$$

where $k \in [1,d]$ and $i \in [1,m]$; $\mu_B^{(k)}$ and $\sigma_B^{(k)}$ are the per-dimension mean and standard deviation, respectively. $\epsilon$ is an arbitrarily small positive constant added to the denominator for numerical stability. The resulting normalized activations $\hat{x}^{(k)}$ have zero mean and unit variance if $\epsilon$ is not taken into account.

To restore the representation power of the network, a transformation step then follows:

$$y_i^{(k)} = \gamma^{(k)} \hat{x}_i^{(k)} + \beta^{(k)}$$

where the parameters $\gamma^{(k)}$ and $\beta^{(k)}$ are subsequently learned in the optimization process.

Formally, the operation that implements batch normalization is a transform $BN_{\gamma^{(k)},\beta^{(k)}} : x_{1 \dots m}^{(k)} \rightarrow y_{1 \dots m}^{(k)}$ called the Batch Normalizing transform. The output of the BN transform, $y^{(k)} = BN_{\gamma^{(k)},\beta^{(k)}}(x^{(k)})$, is then passed to other network layers, while the normalized output $\hat{x}_i^{(k)}$ remains internal to the current layer.
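The training-time transform described above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function name `batch_norm_train`, the list-of-lists layout (m samples of d features each) and the default `eps=1e-5` are all assumptions made for the example.

```python
import math


def batch_norm_train(batch, gamma, beta, eps=1e-5):
    """Batch-normalize a mini-batch of m samples with d features each.

    Returns the outputs y, the normalized activations x_hat, and the
    per-dimension mini-batch mean and (biased) variance.
    """
    m, d = len(batch), len(batch[0])
    # Per-dimension empirical mean and variance over the mini-batch.
    mu = [sum(x[k] for x in batch) / m for k in range(d)]
    var = [sum((x[k] - mu[k]) ** 2 for x in batch) / m for k in range(d)]
    # Normalization step: re-center and re-scale each dimension separately.
    x_hat = [
        [(x[k] - mu[k]) / math.sqrt(var[k] + eps) for k in range(d)]
        for x in batch
    ]
    # Transformation step with the learned scale gamma and shift beta.
    y = [[gamma[k] * xh[k] + beta[k] for k in range(d)] for xh in x_hat]
    return y, x_hat, mu, var


batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
y, x_hat, mu, var = batch_norm_train(batch, gamma=[2.0, 0.5], beta=[1.0, -1.0])

print(mu)   # [2.0, 20.0]
# Each column of x_hat has (approximately) zero mean and unit variance,
# so each column of y has mean close to beta[k].
print([sum(row[k] for row in y) / len(y) for k in range(2)])
```

The two input features differ in scale by a factor of ten, yet their normalized columns are essentially identical, which is the point of normalizing each dimension separately.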
=== Backpropagation ===

The described BN transform is a differentiable operation, and the gradient of the loss $l$ with respect to the different parameters can be computed directly with the chain rule. Specifically, $\frac{\partial l}{\partial y_i^{(k)}}$ depends on the choice of activation function, and the gradients with respect to the other parameters can be expressed as functions of $\frac{\partial l}{\partial y_i^{(k)}}$:

$$\frac{\partial l}{\partial \hat{x}_i^{(k)}} = \frac{\partial l}{\partial y_i^{(k)}} \gamma^{(k)},$$

$$\frac{\partial l}{\partial \gamma^{(k)}} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}} \hat{x}_i^{(k)},$$

$$\frac{\partial l}{\partial \beta^{(k)}} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}},$$

$$\frac{\partial l}{\partial \left(\sigma_B^{(k)}\right)^{2}} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}} \left(x_i^{(k)} - \mu_B^{(k)}\right) \left( -\frac{\gamma^{(k)}}{2} \left( \left(\sigma_B^{(k)}\right)^{2} + \epsilon \right)^{-3/2} \right),$$

$$\frac{\partial l}{\partial \mu_B^{(k)}} = \sum_{i=1}^{m} \frac{\partial l}{\partial y_i^{(k)}} \frac{-\gamma^{(k)}}{\sqrt{\left(\sigma_B^{(k)}\right)^{2} + \epsilon}} + \frac{\partial l}{\partial \left(\sigma_B^{(k)}\right)^{2}} \frac{1}{m} \sum_{i=1}^{m} (-2) \cdot \left(x_i^{(k)} - \mu_B^{(k)}\right),$$

and

$$\frac{\partial l}{\partial x_i^{(k)}} = \frac{\partial l}{\partial \hat{x}_i^{(k)}} \frac{1}{\sqrt{\left(\sigma_B^{(k)}\right)^{2} + \epsilon}} + \frac{\partial l}{\partial \left(\sigma_B^{(k)}\right)^{2}} \frac{2 \left(x_i^{(k)} - \mu_B^{(k)}\right)}{m} + \frac{\partial l}{\partial \mu_B^{(k)}} \frac{1}{m}.$$

=== Inference ===

During the training stage, the normalization steps depend on the mini-batches to ensure efficient and reliable training. In the inference stage, however, this dependence is no longer useful. Instead, the normalization step is computed with the population statistics, so that the output depends on the input in a deterministic manner. The population mean $E[x^{(k)}]$ and variance $\operatorname{Var}[x^{(k)}]$ are computed as

$$E[x^{(k)}] = E_B\left[\mu_B^{(k)}\right] \qquad \text{and} \qquad \operatorname{Var}[x^{(k)}] = \frac{m}{m-1} E_B\left[\left(\sigma_B^{(k)}\right)^{2}\right].$$

The population statistics are thus a complete representation of the mini-batches.

The BN transform in the inference step thus becomes

$$y^{(k)} = BN^{\text{inf}}_{\gamma^{(k)},\beta^{(k)}}(x^{(k)}) = \gamma^{(k)} \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{\operatorname{Var}[x^{(k)}] + \epsilon}} + \beta^{(k)}$$

…
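The gradient expressions from the Backpropagation subsection can be checked numerically. The sketch below handles a single feature dimension, implements the formulas term by term, and compares the resulting input gradient with a central finite difference. The names `bn_forward`, `bn_backward`, `loss` and `numeric_dx`, the toy loss (a fixed weighted sum of the outputs) and the step size `h` are all assumptions chosen for the illustration.

```python
import math


def bn_forward(x, gamma, beta, eps=1e-5):
    """Training-mode BN for one feature dimension; x is a list of m values."""
    m = len(x)
    mu = sum(x) / m
    var = sum((xi - mu) ** 2 for xi in x) / m
    return [gamma * (xi - mu) / math.sqrt(var + eps) + beta for xi in x]


def bn_backward(x, gamma, dy, eps=1e-5):
    """Gradients of the loss given dy[i] = dl/dy_i, following the chain rule."""
    m = len(x)
    mu = sum(x) / m
    var = sum((xi - mu) ** 2 for xi in x) / m
    inv_std = 1.0 / math.sqrt(var + eps)
    x_hat = [(xi - mu) * inv_std for xi in x]

    dx_hat = [g * gamma for g in dy]                    # dl/dx_hat_i
    dgamma = sum(g * xh for g, xh in zip(dy, x_hat))    # dl/dgamma
    dbeta = sum(dy)                                     # dl/dbeta
    dvar = sum(d * (xi - mu) for d, xi in zip(dx_hat, x)) \
        * (-0.5) * (var + eps) ** -1.5                  # dl/dsigma^2
    dmu = -inv_std * sum(dx_hat) \
        + dvar * sum(-2.0 * (xi - mu) for xi in x) / m  # dl/dmu
    dx = [d * inv_std + dvar * 2.0 * (xi - mu) / m + dmu / m
          for d, xi in zip(dx_hat, x)]                  # dl/dx_i
    return dx, dgamma, dbeta


def loss(x, gamma, beta, c):
    """Toy loss: a fixed weighted sum of the BN outputs, so dl/dy_i = c[i]."""
    return sum(ci * yi for ci, yi in zip(c, bn_forward(x, gamma, beta)))


def numeric_dx(x, gamma, beta, c, h=1e-6):
    """Central finite-difference estimate of dl/dx_i."""
    grads = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grads.append((loss(xp, gamma, beta, c) - loss(xm, gamma, beta, c)) / (2 * h))
    return grads


x = [1.0, 2.0, 4.0]
gamma, beta = 1.5, 0.3
c = [0.2, -0.5, 1.0]

dx, dgamma, dbeta = bn_backward(x, gamma, dy=c)
max_err = max(abs(a - b) for a, b in zip(dx, numeric_dx(x, gamma, beta, c)))
print(max_err)   # should be very small if the formulas are implemented correctly
```

A finite-difference comparison of this kind is a common way to validate hand-derived gradients before relying on them in training code.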
