Image formation

Image formation

The study of image formation encompasses the radiometric and geometric processes by which 2D images of 3D objects are formed. In the case of digital images, the image formation process also includes analog to digital conversion and sampling. == Imaging == The imaging process is a mapping of an object to an image plane. Each point on the image corresponds to a point on the object. An illuminated object will scatter light toward a lens and the lens will collect and focus the light to create the image. The ratio of the height of the image to the height of the object is the magnification. The spatial extent of the image surface and the focal length of the lens determines the field of view of the lens. Image formation of mirror these have a center of curvature and its focal length of the mirror is half of the center of curvature. == Illumination == An object may be illuminated by the light from an emitting source such as the sun, a light bulb or a Light Emitting Diode. The light incident on the object is reflected in a manner dependent on the surface properties of the object. For rough surfaces, the reflected light is scattered in a manner described by the Bi-directional Reflectance Distribution Function (BRDF) of the surface. The BRDF of a surface is the ratio of the exiting power per square meter per steradian (radiance) to the incident power per square meter (irradiance). The BRDF typically varies with angle and may vary with wavelength, but a specific important case is a surface that has constant BRDF. This surface type is referred to as Lambertian and the magnitude of the BRDF is R/π, where R is the reflectivity of the surface. The portion of scattered light that propagates toward the lens is collected by the entrance pupil of the imaging lens over the field of view. == Field of view and imagery == The Field of view of a lens is limited by the size of the image plane and the focal length of the lens. The relationship between a location on the image and a location on the object is y = ftan(θ), where y is the max extent of the image plane, f is the focal length of the lens and θ is the field of view. If y is the max radial size of the image then θ is the field of view of the lens. While the image created by a lens is continuous, it can be modeled as a set of discrete field points, each representing a point on the object. The quality of the image is limited by the aberrations in the lens and the diffraction created by the finite aperture stop. == Pupils and stops == The aperture stop of a lens is a mechanical aperture which limits the light collection for each field point. The entrance pupil is the image of the aperture stop created by the optical elements on the object side of the lens. The light scattered by an object is collected by the entrance pupil and focused onto the image plane via a series of refractive elements. The cone of the focused light at the image plane is set by the size of the entrance pupil and the focal length of the lens. This is often referred to as the f-stop or f-number of the lens. f/# = f/D where D is the diameter of the entrance pupil. == Pixelation and color vs. monochrome == In typical digital imaging systems, a sensor is placed at the image plane. The light is focused on to the sensor and the continuous image is pixelated. The light incident on each pixel in the sensor will be integrated within the pixel and a proportional electronic signal will be generated. The angular geometric resolution of a pixel is given by atan(p/f), where p is the pitch of the pixel. This is also called the pixel field of view. The sensor may be monochrome or color. In the case of a monochrome sensor, the light incident on each pixel is integrated and the resulting image is a grayscale like picture. For color images, a mosaic color filter is typically placed over the pixels to create a color image. An example is a Bayer filter. The signal incident on each pixel is then digitized to a bit stream. == Image quality == The quality of an image is dependent upon both geometric and physical items. Geometrically, higher density of pixels across an image will give less blocky pixelation and thus a better geometric image quality. Lens aberrations also contribute to the quality of the image. Physically, diffraction due to the aperture stop will limit the resolvable spatial frequencies as a function of f-number. In the frequency domain, Modulation Transfer Function (MTF) is a measure of the quality of the imaging system. The MTF is a measure of the visibility of a sinusoidal variation in irradiance on the image plane as a function of the frequency of the sinusoid. It includes the effects of diffraction, aberrations and pixelation. For the lens, the MTF is the autocorrelation of the pupil function, so it accounts for the finite pupil extent and the lens aberrations. The sensor MTF is the Fourier Transform of the pixel geometry. For a square pixel, MTF(ξ) = sin(πξp)/πξp where p is the pixel width and ξ is the spatial frequency. The MTF of the combination of the lens and detector is the product of the two component MTFs. == Perception == Color images can be perceived via two means. In the case of computer vision the light incident on the sensor comprises the image. In the case of visual perception, the human eye has a color dependent response to light so this must be accounted for. This is important consideration when converting to grayscale. == Image formation in eye == The principal difference between the lens of the eye and an ordinary optical lens is that the former is flexible. The radius of the curvature of the anterior surface of the lens is greater than the radius of its posterior surface. The shape of the lens is controlled by tension in the fibers of the ciliary body. To focus on distant objects, the controlling muscles cause the lens to be relatively flattened. Similarly, these muscles allow the lens to become thicker in order to focus on objects near the eye. The distance between the center of the lens and the retina (focal length) varies from approximately 17 mm to about 14 mm, as the refractive power of the lens increases from its minimum to its maximum. When the eye focuses on an object farther away than about 3 m, the lens exhibits its lowest refractive power. When the eye focuses on a close object, the lens is most strongly refractive.

QANDA

QANDA (stands for 'Q and A') is an AI-based learning platform developed by Mathpresso Inc., a South Korea-based education technology company. Its best known feature is a solution search, which uses optical character recognition technology to scan problems and provide step-by-step solutions and learning content. As of March 2024, QANDA solved over 6.3 billion questions. QANDA has 90 million total registered users and has reached 8 million monthly active users (MAU) in 50 countries. 90% of the cumulative users are from overseas such as Vietnam and Indonesia. In January 2024, its MathGPT, a math-specific small large language model set a new world record, surpassed Microsoft's 'ToRA 13B', the previous record holder in benchmarks assessing mathematical performance such as 'MATH' (high school math) and 'GSM8K' (grade school math). 'MathGPT' was co-developed with Upstage and KT. In March 2024, Mathpresso launched 'Cramify' (formerly known as Prep.Pie), an AI-powered study material generator designed to create personalized exam prep materials for U.S. college students. It uses generative AI to create customized study materials uploaded by students. Its features include a range of tools including study summarizer and question solver. == History == Co-founder Jongheun ‘Ray’ Lee first came up with the idea of QANDA during his freshman year in college. While he was tutoring to earn money, Lee realized that the quality of education a student receives is greatly based on their location. Lee saw his K-12 students were regularly asking similar questions and realized that these questions were from a pre-selected number of textbooks currently being used in schools. He decided to team up with his high school friend, Yongjae ‘Jake’ Lee to build a platform whereby, one uses a mobile app to scan and submit questions, and students can ask and receive detailed responses. Lee's school friends, Wonguk Jung and Hojae Jeong, joined the team. In June 2015, Mathpresso, Inc. was founded in Seoul, South Korea. In January 2016, Mathpresso's first product QANDA was launched. It supported a Q&A feature between students and tutors. In October 2017, QANDA introduced an AI-based search capability that permitted users to search for answers in seconds. In April 2020, Jake Yongjae Lee(CEO & co-founder) and Ray Jongheun Lee (co-founder) were selected as Forbes 30 under 30 Asia. In June 2021, QANDA raised $50 million in series C funding. Jake Yongjae Lee was recognized as an Innovator Under 35 by MIT Technology Review. In November 2021, QANDA secured a strategic investment from Google. Since its inception, it has received backing in Series C funding from investors namely Google, Yellowdog, GGV Capital, Goodwater Capital, KDB, and SKS Private Equity with participation from SoftBank Ventures Asia, Legend Capital, Mirae Asset Venture Investment, and Smilegate Investment. In September 2023, Mathpresso has raised $8 million (10 billion KRW) from Korea's telecom giant, KT. The total cumulative investment is about 130 million US dollars. The partnership aims to accelerate the development of an education-specific Large Language Model. The company intends to incorporate the LLM model to fortify its AI tutor, which later will be integrated into the existing services: QANDA App, B2B & B2G Saas, and 1:1 online tutoring (QANDA Tutor). == Features == QANDA features OCR-based solution search, one-on-one Q&A tutoring, a study timer. In 2021, QANDA launched additional features, including the premium subscription model that offers unlimited “byte-sized” micro-video lectures and the community feature that enhances collaborative learning. In 2021, QANDA launched QANDA Tutor, a tablet-based 1:1 tutoring service and QANDA Study, a 1:N online school in Vietnam. In 2022, QANDA launched an exam prep feature that offers past exam materials from school via online. This feature is currently available in South Korea. In August 2023, QANDA launched a beta version of an LLM-powered AI Tutor. == Awards and recognition == Best Hidden Gems of 2017 by Google Playstore 2018 AWS AI Startup Challenge Award National representative for the Google AI for Social Good APAC, 2018 Best Self-Improvement Apps of 2018 by Google Playstore GSV Edtech 150 — the Most Transformational Growth Companies in Digital Learning Speaker at the Google App Summit, 2021 Selected as a prospect unicorn company by Korea Technology Finance Corporation in 2023 Winner of G20-DIA Global Pitching in 2023 2021, 2022, 2023 East Asia EdTech 150 by HolonIQ

AI Paragraph Rewriters: Free vs Paid (2026)

Curious about the best AI paragraph rewriter? An AI paragraph rewriter is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI paragraph rewriter slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

Evaluation of machine translation

Various methods for the evaluation for machine translation have been employed. This article focuses on the evaluation of the output of machine translation, rather than on performance or usability evaluation. == Round-trip translation == A typical way for lay people to assess machine translation quality is to translate from a source language to a target language and back to the source language with the same engine. Though intuitively this may seem like a good method of evaluation, it has been shown that round-trip translation is a "poor predictor of quality". The reason why it is such a poor predictor of quality is reasonably intuitive. A round-trip translation is not testing one system, but two systems: the language pair of the engine for translating into the target language, and the language pair translating back from the target language. Consider the following examples of round-trip translation performed from English to Italian and Portuguese from Somers (2005): In the first example, where the text is translated into Italian then back into English—the English text is significantly garbled, but the Italian is a serviceable translation. In the second example, the text translated back into English is perfect, but the Portuguese translation is meaningless; the program thought "tit" was a reference to a tit (bird), which was intended for a "tat", a word it did not understand. While round-trip translation may be useful to generate a "surplus of fun," the methodology is deficient for serious study of machine translation quality. == Human evaluation == This section covers two of the large scale evaluation studies that have had significant impact on the field—the ALPAC 1966 study and the ARPA study. === Automatic Language Processing Advisory Committee (ALPAC) === One of the constituent parts of the ALPAC report was a study comparing different levels of human translation with machine translation output, using human subjects as judges. The human judges were specially trained for the purpose. The evaluation study compared an MT system translating from Russian into English with human translators, on two variables. The variables studied were "intelligibility" and "fidelity". Intelligibility was a measure of how "understandable" the sentence was, and was measured on a scale of 1–9. Fidelity was a measure of how much information the translated sentence retained compared to the original, and was measured on a scale of 0–9. Each point on the scale was associated with a textual description. For example, 3 on the intelligibility scale was described as "Generally unintelligible; it tends to read like nonsense but, with a considerable amount of reflection and study, one can at least hypothesize the idea intended by the sentence". Intelligibility was measured without reference to the original, while fidelity was measured indirectly. The translated sentence was presented, and after reading it and absorbing the content, the original sentence was presented. The judges were asked to rate the original sentence on informativeness. So, the more informative the original sentence, the lower the quality of the translation. The study showed that the variables were highly correlated when the human judgment was averaged per sentence. The variation among raters was small, but the researchers recommended that at the very least, three or four raters should be used. The evaluation methodology managed to separate translations by humans from translations by machines with ease. The study concluded that, "highly reliable assessments can be made of the quality of human and machine translations". === Advanced Research Projects Agency (ARPA) === As part of the Human Language Technologies Program, the Advanced Research Projects Agency (ARPA) created a methodology to evaluate machine translation systems, and continues to perform evaluations based on this methodology. The evaluation programme was instigated in 1991, and continues to this day. Details of the programme can be found in White et al. (1994) and White (1995). The evaluation programme involved testing several systems based on different theoretical approaches; statistical, rule-based and human-assisted. A number of methods for the evaluation of the output from these systems were tested in 1992 and the most recent suitable methods were selected for inclusion in the programmes for subsequent years. The methods were; comprehension evaluation, quality panel evaluation, and evaluation based on adequacy and fluency. Comprehension evaluation aimed to directly compare systems based on the results from multiple choice comprehension tests, as in Church et al. (1993). The texts chosen were a set of articles in English on the subject of financial news. These articles were translated by professional translators into a series of language pairs, and then translated back into English using the machine translation systems. It was decided that this was not adequate for a standalone method of comparing systems and as such abandoned due to issues with the modification of meaning in the process of translating from English. The idea of quality panel evaluation was to submit translations to a panel of expert native English speakers who were professional translators and get them to evaluate them. The evaluations were done on the basis of a metric, modelled on a standard US government metric used to rate human translations. This was good from the point of view that the metric was "externally motivated", since it was not specifically developed for machine translation. However, the quality panel evaluation was very difficult to set up logistically, as it necessitated having a number of experts together in one place for a week or more, and furthermore for them to reach consensus. This method was also abandoned. Along with a modified form of the comprehension evaluation (re-styled as informativeness evaluation), the most popular method was to obtain ratings from monolingual judges for segments of a document. The judges were presented with a segment, and asked to rate it for two variables, adequacy and fluency. Adequacy is a rating of how much information is transferred between the original and the translation, and fluency is a rating of how good the English is. This technique was found to cover the relevant parts of the quality panel evaluation, while at the same time being easier to deploy, as it didn't require expert judgment. Measuring systems based on adequacy and fluency, along with informativeness is now the standard methodology for the ARPA evaluation program. == Automatic evaluation == In the context of this article, a metric is a measurement. A metric that evaluates machine translation output represents the quality of the output. The quality of a translation is inherently subjective, there is no objective or quantifiable "good." Therefore, any metric must assign quality scores so they correlate with the human judgment of quality. That is, a metric should score highly translations that humans score highly, and give low scores to those humans give low scores. Human judgment is the benchmark for assessing automatic metrics, as humans are the end-users of any translation output. The measure of evaluation for metrics is correlation with human judgment. This is generally done at two levels, at the sentence level, where scores are calculated by the metric for a set of translated sentences, and then correlated against human judgment for the same sentences. And at the corpus level, where scores over the sentences are aggregated for both human judgments and metric judgments, and these aggregate scores are then correlated. Figures for correlation at the sentence level are rarely reported, although Banerjee et al. (2005) do give correlation figures that show that, at least for their metric, sentence-level correlation is substantially worse than corpus level correlation. While not widely reported, it has been noted that the genre, or domain, of a text has an effect on the correlation obtained when using metrics. Coughlin (2003) reports that comparing the candidate text against a single reference translation does not adversely affect the correlation of metrics when working in a restricted domain text. Even if a metric correlates well with human judgment in one study on one corpus, this successful correlation may not carry over to another corpus. Good metric performance, across text types or domains, is important for the reusability of the metric. A metric that only works for text in a specific domain is useful, but less useful than one that works across many domains—because creating a new metric for every new evaluation or domain is undesirable. Another important factor in the usefulness of an evaluation metric is to have a good correlation, even when working with small amounts of data, that is candidate sentences and reference translations. Turian et al. (2003) point out that, "Any MT evaluation measure is less reliable on shorter translations", and

Aslı Çelikyılmaz

Aslı Çelikyılmaz is an engineer specializing in natural language processing, and particularly in natural language generation for software agents with advanced reasoning and real-world modeling capabilities. Educated in Turkey and Canada, she works in the US as senior research lead at Fundamentals AI Research, Meta. She also holds an affiliate faculty position in computer science at the University of Washington, and is co-editor-in-chief of the journal Transactions of the Association for Computational Linguistics. == Education and career == Çelikyılmaz is a 1997 graduate of Istanbul Technical University, where she studied industrial engineering. After a 2002 master's degree in computer and information science from Seneca Polytechnic in Toronto, and a second master's degree in information science from the University of Toronto in 2005, she completed a Ph.D. in information science at the University of Toronto in 2008. She worked as a postdoctoral researcher in California, at the University of California, Berkeley, from 2008 to 2010. In 2010 she joined Microsoft in Sunnyvale, California, where she became a senior scientist and later a senior principal researcher in Redmond, Washington. She added her affiliation with the University of Washington in 2018, and moved to Meta in Seattle in 2021. == Recognition == Çelikyılmaz was named to the 2026 class of IEEE Fellows, "for contributions to conversational systems and language generation".

Stixel

In computer vision, a stixel (portmanteau of "stick" and "pixel") is a superpixel representation of depth information in an image, in the form of a vertical stick that approximates the closest obstacles within a certain vertical slice of the scene. Introduced in 2009, stixels have applications in robotic navigation and advanced driver-assistance systems, where they can be used to define a representation of robotic environments and traffic scenes with a medium level of abstraction. == Definition == One of the problems of scene understanding in computer vision is to determine horizontal freespace around the camera, where the agent can move, and the vertical obstacles delimiting it. An image can be paired with depth information (produced e.g. from stereo disparity, lidar, or monocular depth estimation), allowing a dense tridimensional reconstruction of the observed scene. One drawback of dense reconstruction is the large amount of data involved, since each pixel in the image is mapped to an element of a point cloud. Vision problems characterised by planar freespace delimited by mostly vertical obstacles, such as traffic scenes or robotic navigation, can benefit from a condensed representation that allows to save memory and processing time. Stixels are thin vertical rectangles representing a slice of a vertical surface belonging to the closest obstacle in the observed scene. They allow to dramatically reduce the amount of information needed to represent a scene in such problems. A stixel is characterised by three parameters: vertical coordinate of the bottom, height of the stick, and depth. Stixels have fixed width, with each stixel spanning over a certain number of image columns, allowing downsampling of the horizontal image resolution. In the original formulation, each column of the image would contain at most one stixel, and later extensions were developed to allow multiple stixels on each column, allowing to represent multiple objects at different distances. == Stixel estimation == The input to stixel estimation is a dense depth map, that can be computed from stereo disparity or other means. The original approach computes an occupancy grid that can be segmented to estimate the freespace, with dynamic programming providing an efficient method to find an optimal segmentation. Alternative approaches can be used instead of occupancy grid mapping, such as manifold-based methods. The freespace boundary provides the base points of the obstacles at closest longitudinal distance, however multiple objects at different distances might appear in each column of the image. To fully define the obstacles, their height should be estimated, and this is accomplished by segmenting the depth of the object from the depth of the background. A membership function over the pixels can be defined based on the depth value, where the membership represents the confidence of a pixel belonging to the closest vertical obstacle or to the background, and a cut separating the obstacles from the background can again be computed effectively with dynamic programming. Once both the freespace and the obstacle height are known, the stixels can be estimated by fusing the information over the columns spanned by each stixel, and finally a refined depth of the stixel can be estimated via model fitting over the depth of the pixels covered by the stixel, possibly paired with confidence information (e.g. disparity confidence produced by methods such as semi-global matching).

Is an AI Essay Writer Worth It in 2026?

Comparing the best AI essay writer? An AI essay writer is software that uses machine learning to help you get more done — it lowers the barrier so anyone can produce professional output. Privacy matters too: check whether your data trains the model and whether a no-log or enterprise tier is available. Whether you are a beginner or a pro, the right AI essay writer slots into your workflow and pays for itself fast. Below we compare features, pricing, and real output so you can choose with confidence.