AI Coding Wiki

AI Coding Wiki — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Database

    Database

    In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database. Before digital storage and retrieval of data became widespread, index cards were used for data storage in a wide range of applications and environments: in the home to record and store recipes, shopping lists, contact information and other organizational data; in business to record presentation notes, project research and notes, and contact information; in schools as flash cards or other visual aids; and in academic research to hold data such as bibliographical citations or notes in a card file. Professional book indexers used index cards in the creation of book indexes until they were replaced by indexing software in the 1980s and 1990s. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spans formal techniques and practical considerations, including data modeling, efficient data representation and storage, query languages, security and privacy of sensitive data, and distributed computing issues, including supporting concurrent access and fault tolerance. Computer scientists may classify database management systems according to the database models that they support. Relational databases became dominant in the 1980s. These model data as rows and columns in a series of tables, and the vast majority use SQL for writing and querying data. In the 2000s, non-relational databases became popular, collectively referred to as NoSQL, because they use different query languages. == Terminology and overview == Formally, a "database" refers to a set of related data accessed through the use of a "database management system" (DBMS), which is an integrated set of computer software that allows users to interact with one or more databases and provides access to all of the data contained in the database (although restrictions may exist that limit access to particular data). The DBMS provides various functions that allow entry, storage and retrieval of large quantities of information and provides ways to manage how that information is organized. Because of the close relationship between them, the term "database" is often used casually to refer to both a database and the DBMS used to manipulate it. Outside the world of professional information technology, the term database is often used to refer to any collection of related data (such as a spreadsheet or a card index) as size and usage requirements typically necessitate use of a database management system. Existing DBMSs provide various functions that allow management of a database and its data which can be classified into four main functional groups: Data definition – Creation, modification and removal of definitions that detail how the data is to be organized. Update – Insertion, modification, and deletion of the data itself. Retrieval – Selecting data according to specified criteria (e.g., a query, a position in a hierarchy, or a position in relation to other data) and providing that data either directly to the user, or making it available for further processing by the database itself or by other applications. The retrieved data may be made available in a more or less direct form without modification, as it is stored in the database, or in a new form obtained by altering it or combining it with existing data from the database. Administration – Registering and monitoring users, enforcing data security, monitoring performance, maintaining data integrity, dealing with concurrency control, and recovering information that has been corrupted by some event such as an unexpected system failure. Both a database and its DBMS conform to the principles of a particular database model. "Database system" refers collectively to the database model, database management system, and database. Physically, database servers are dedicated computers that hold the actual databases and run only the DBMS and related software. Database servers are usually multiprocessor computers, with generous memory and RAID disk arrays used for stable storage. Hardware database accelerators, connected to one or more servers via a high-speed channel, are also used in large-volume transaction processing environments. DBMSs are found at the heart of most database applications. DBMSs may be built around a custom multitasking kernel with built-in networking support, but modern DBMSs typically rely on a standard operating system to provide these functions. Since DBMSs comprise a significant market, computer and storage vendors often take into account DBMS requirements in their own development plans. Databases and DBMSs can be categorized according to the database model(s) that they support (such as relational or XML), the type(s) of computer they run on (from a server cluster to a mobile phone), the query language(s) used to access the database (such as SQL or XQuery), and their internal engineering, which affects performance, scalability, resilience, and security. == History == The sizes, capabilities, and performance of databases and their respective DBMSs have grown in orders of magnitude. These performance increases were enabled by the technology progress in the areas of processors, computer memory, computer storage, and computer networks. The concept of a database was made possible by the emergence of direct access storage media such as magnetic disks, which became widely available in the mid-1960s; earlier systems relied on sequential storage of data on magnetic tape. The subsequent development of database technology can be divided into three eras based on data model or structure: navigational, SQL/relational, and post-relational. The two main early navigational data models were the hierarchical model and the CODASYL model (network model). These were characterized by the use of pointers (often physical disk addresses) to follow relationships from one record to another. The relational model, first proposed in 1970 by Edgar F. Codd, departed from this tradition by insisting that applications should search for data by content, rather than by following links. The relational model employs sets of ledger-style tables, each used for a different type of entity. Only in the mid-1980s did computing hardware become powerful enough to allow the wide deployment of relational systems (DBMSs plus applications). By the early 1990s, however, relational systems dominated in all large-scale data processing applications, and as of 2018 they remain dominant: IBM Db2, Oracle, MySQL, and Microsoft SQL Server are the most searched DBMS. The dominant database language, standardized SQL for the relational model, has influenced database languages for other data models. Object databases were developed in the 1980s to overcome the inconvenience of object–relational impedance mismatch, which led to the coining of the term "post-relational" and also the development of hybrid object–relational databases. The next generation of post-relational databases in the late 2000s became known as NoSQL databases, introducing fast key–value stores and document-oriented databases. A competing "next generation" known as NewSQL databases attempted new implementations that retained the relational/SQL model while aiming to match the high performance of NoSQL compared to commercially available relational DBMSs. === 1960s, navigational DBMS === The introduction of the term database coincided with the availability of direct-access storage (disks and drums) from the mid-1960s onwards. The term represented a contrast with the tape-based systems of the past, allowing shared interactive use rather than daily batch processing. The Oxford English Dictionary cites a 1962 report by the System Development Corporation of California as the first to use the term "data-base" in a specific technical sense. As computers grew in speed and capability, a number of general-purpose database systems emerged; by the mid-1960s a number of such systems had come into commercial use. Interest in a standard began to grow, and Charles Bachman, author of one such product, the Integrated Data Store (IDS), founded the Database Task Group within CODASYL, the group responsible for the creation and standardization of COBOL. In 1971, the Database Task Group delivered their standard, which generally became known as the CODASYL approach, and soon a number of commercial products based on this approach entered the market. The CODASYL approach of

    Read more →
  • Reverse correlation technique

    Reverse correlation technique

    The reverse correlation technique is a data driven study method used primarily in psychological and neurophysiological research. This method earned its name from its origins in neurophysiology, where cross-correlations between white noise stimuli and sparsely occurring neuronal spikes could be computed quicker when only computing it for segments preceding the spikes. The term has since been adopted in psychological experiments that usually do not analyze the temporal dimension, but also present noise to human participants. In contrast to the original meaning, the term is here thought to reflect that the standard psychological practice of presenting stimuli of defined categories to the participants is "reversed": Instead, the participant's mental representations of categories are estimated from interactions of the presented noise and the behavioral responses. It is used to create composite pictures of individual and/or group mental representations of various items (e.g. faces, bodies, and the self) that depict characteristics of said items (e.g. trustworthiness and self-body image). This technique is helpful when evaluating the mental representations of those with and without mental illnesses. == Terms == This technique utilizes spike-triggered average to explain what areas of signal and noise in an image are valuable for the given research question. Signal is information used to produce objects of value that help explain and connect the world around us. Noise is commonly referred to as unwanted signal that obscures the information that the signal is trying to present. Most importantly for reverse correlation studies, noise is randomly varying information. To determine the areas of importance using reverse correlation, noise is applied to a base image and then evaluated by observers. A base image is any image void of noise that relates to the research question. A base image that has noise superimposed on top is the stimuli that is presented to and evaluated by participants. Each time a new set of stimuli is presented to a participant, this is known as a trial. After a participant has responded to hundreds to thousands of trials, a researcher is ready to create a classification image. A classification image (abbreviated as "CI" in some studies) is a single image that represents the average noise patterns in the images selected by participants. A classification image can also be computed for groups by averaging the individuals’ classification images. These classification images are what researchers use to interpret the data and draw conclusions. As a whole, the reverse correlation method is a process that results in a composite image (from an individual or group) that can be used to estimate and interpret mental representations. == Basic study layout == The reverse correlation method is typically executed as an in-lab computer experiment. This method follows four broad steps. Each of the following steps are described in greater detail below. After creating a research question and determining that the reverse correlation method is the most suitable technique to answer the question, a researcher must (1) design randomly varying stimuli. After the stimuli have been prepared, a researcher should (2) collect data from participants who will see and respond to approximately 300 -1,000 trials. Each trial will either consist of one or two images (side by side) derived from the same base image with noise superimposed on top. Participant responses will depend on the chosen study design; if a researcher presents only one image at a time, participants rate the image on a 4pt scale, but when two images are shown, the participant is asked to choose which best aligns with the given category (e.g. choose the image that looks the most aggressive). Once all of the data is collected, the researcher will (3) compute classification images for each participant and using those images compute group classification images. Finally, with the classification images available, the researcher will (4) evaluate the images and draw conclusions about their results. === Step 1: making stimuli === When designing the stimuli for a reverse correlation study, the two primary factors that one should consider are (1) the base image and (2) the noise that will be used. While not all bases are images per se, the majority are and for this reason the base is typically referred to as a base image. The base image should represent whatever the research question is addressing. For example, if you are interested in peoples’ mental representations of Chinese people, it would not make sense to use a base image of a Spanish or Caucasian person. Again, if you are interested in the mental representations of male vocal patterns, it would make the most sense to use a base vocal pattern that has been produced by a male. Having a base is important because it provides a kind of anchor for participants to work from. When there is no base image, the number of trials that are required increases dramatically, thus making it harder to collect data. While there are studies that have excluded a base image, (e.g. the S study), for more elaborate and nuanced research questions, it is important to have a base image that is a fair representation of what participants are being asked to categorize. Photographs of faces are generally the most popular base image. Although the reverse correlation method is capable of investigating a wide variety of research questions, the most common application of the method is for evaluating faces on a single trait. Reverse correlation studies that address evaluations of the face are sometimes referred to as being a face space reverse correlation model (FSRCM). Thankfully, there are existing databases for face images of varying demographics and emotion that work well as base images. The reverse correlation method can also be used to help researchers identify what areas of an image (e.g. the areas on the face) have diagnostic value. In order to identify these areas of value, researchers start by minimizing the space a participant can pull information from. By imposing a “mask” on an image (e.g. blur an image while leaving random areas un-blurred), this reduces the information individuals might see, and forces them to focus on certain areas. Then, if/when participants are able to correctly identify an image with a trait repeatedly, we can draw conclusions about what areas have diagnostic value. While faces and visual stimuli are the most popular, this is not the only stimuli that can be used in a reverse correlation study. This method was originally designed for auditory stimuli which allows researchers to investigate how perceivers interpret auditory information and create trait based attributions to different sound patterns. For example, by segmenting a vocal recording of a single word (total sound time 426 ms) into six segments (71 ms each), and varying each segment's pitch using Gaussian distributions, researchers were able to uncover what vocal patterns people associated with certain traits. Specifically, this study investigated how listeners rated sound clips of the word “really” as sounding more interrogative (i.e. like the more common reverse correlation studies this study had participants listen to two sound clips per trial, choose which fit the category the best, and then created an average of the pitch contours). Beyond face and auditory perception, research utilizing the reverse correlation method has expanded to investigate how individuals see three-dimensional objects in images with noise (but no signal). After selecting your base image, regardless of what the image is, it is helpful to apply a Gaussian blur to smooth noise in the image. While noise will be applied later, it is helpful to reduce existing noise in the photo before applying your chosen noise. There are three primary choices when it comes to noise: white noise, sine-wave noise, and Gabor noise. The latter two of these constrain the configurations that the noise can have, and because of this white noise is usually the most commonly used. Regardless of the type of noise that is chosen, it is crucial that the noise randomly varies. === Step 2: data collection === Once the stimuli for the study has been developed, the researcher must make a few decisions before actually collecting the data. The researcher must come to a conclusion on how many stimuli will be presented at a time and how many trials the participants will see. In terms of stimuli presentation, a researcher can choose from either a 2-Image Forced Choice (2IFC) or a 4-Alternative Forced Choice (4AFC). The 2IFC presents two images at once (side by side) and requires participants to choose between the two on a specified category (e.g. which image looks the most like a male). Typically the noise from the left image is the mathematical inverse of the noise from the right image. This method was developed to better answer questions that could n

    Read more →
  • Firefox Lockwise

    Firefox Lockwise

    Firefox Lockwise (formerly Lockbox) is a deprecated password manager for the Firefox web browser, as well as the mobile operating systems iOS and Android. On desktop, Lockwise was simply part of Firefox, whereas on iOS and Android it was available as a standalone app. If Firefox Sync was activated (with a Firefox account), then Lockwise synced passwords between Firefox installations across devices. It also featured a built-in random password generator. The application and branding have since been "phased out." == History == Developed by Mozilla, it was originally named Firefox Lockbox in 2018. It was renamed "Lockwise" in May 2019. It was introduced for iOS on 10 July 2018 as part of the Test Pilot program. On 26 March 2019, it was released for Android. On desktop, Lockwise started out as a browser addon. Alphas were released between March and August 2019. Since Firefox version 70, Lockwise has been integrated into the browser (accessible at about:logins), having replaced a basic password manager presented in a popup window. Mozilla ended support for Firefox Lockwise on December 13, 2021. As of January 2026, Lockwise is still fully functional on Android to this day.

    Read more →
  • Automate This

    Automate This

    Automate This: How Algorithms Came to Rule Our World is a book written by Christopher Steiner and published by Penguin Group. == Book == Steiner begins his study of algorithms on Wall Street in the 1980s but also provides examples from other industries. For example, he explains the history of Pandora Radio and the use of algorithms in music identification. He expresses concern that such use of algorithms may lead to the homogenization of music over time. Steiner also discusses the algorithms that eLoyalty (now owned by Mattersight Corporation following divestiture of the technology) was created by dissecting 2 million speech patterns and can now identify a caller's personality style and direct the caller with a compatible customer support representative. Steiner's book shares both the warning and the opportunity that algorithms bring to just about every industry in the world, and the pros and cons of the societal impact of automation (e.g. impact on employment).

    Read more →
  • VueScan

    VueScan

    VueScan is a computer program for image scanning, especially of photographs, including negatives. It supports optical character recognition (OCR) of text documents. The software can be downloaded and used free of charge, but adds a watermark on scans until a license is purchased. == Purpose == VueScan is intended to work with a large number of image scanners, excluding specialised professional scanners such as drum scanners, on many computer operating systems (OS), even if drivers for the scanner are not available for the OS. These scanners are supplied with device drivers and software to operate them, included in their price. A 2014 review considered that the reasons to purchase VueScan are to allow older scanners not supported by drivers for newer operating systems to be used in more up-to-date systems and for better scanning and processing of photographs (prints; also slides and negatives when supported by scanners) than is afforded by manufacturers' software. The review did not report any advantages to VueScan's processing of documents over other software. The reviewer considered VueScan comparable to SilverFast, a similar program, with support for some specific scanners better in one or the other. Vuescan supports more scanners, with a single purchase giving access to the full range of both film and flatbed scanners, and costs less. The VueScan program can be used with its own drivers or with drivers supplied by the scanner manufacturer, if supported by the operating system. VueScan drivers can also be used without the VueScan program by application software that supports scanning directly, such as Adobe Photoshop, again enabling the use of scanners without current manufacturers' drivers. In 2019 when Apple released macOS Catalina, they removed support for running 32-bit programs, including 32-bit drivers for scanning equipment. In response, Hamrick released VueScan 9.7, effectively saving thousands of scanners from being rendered obsolete. == Overview == VueScan enables the user to modify and fine-tune the scanning parameters. The program uses its own independent method to interface with scanner hardware, and can support many older scanners under computer operating systems for which drivers are not available, allowing old scanners to be used with newer platforms that do not otherwise support them. VueScan supports an increasing number of scanners and digital cameras; 2,400 on Windows, 2,100 on Mac OS X and 1,900 on Linux in 2018. VueScan is supplied as one downloadable file for each operating system, which supports the full range of scanners. Without the purchase of a license, the program runs in fully functional demonstration mode, identical to Professional mode, except that watermarks are superimposed on saved and printed images. Purchase of a license removes the watermark. A standard license allows updates for one year; a professional license allows unlimited updates and provides some additional features. VueScan supports optical character recognition (OCR), with English included, and 32 additional language packages available on its website. In September 2011, VueScan co-developer Ed Hamrick said that he was selling US$3 million per year of VueScan licenses.

    Read more →
  • Kuwahara filter

    Kuwahara filter

    The Kuwahara filter is a non-linear smoothing filter used in image processing for adaptive noise reduction. Most filters that are used for image smoothing are linear low-pass filters that effectively reduce noise but also blur out the edges. However the Kuwahara filter is able to apply smoothing on the image while preserving the edges. It is named after Michiyoshi Kuwahara, Ph.D., who worked at Kyoto and Osaka Sangyo Universities in Japan, developing early medical imaging of dynamic heart muscle in the 1970s and 80s. == The Kuwahara operator == Suppose that I ( x , y ) {\displaystyle I(x,y)} is a grey scale image and that we take a square window of size 2 a + 1 {\displaystyle 2a+1} centered around a point ( x , y ) {\displaystyle (x,y)} in the image. This square can be divided into four smaller square regions Q i = 1 ⋯ 4 {\displaystyle Q_{i=1\cdots 4}} each of which will be Q i ( x , y ) = { [ x , x + a ] × [ y , y + a ] if i = 1 [ x − a , x ] × [ y , y + a ] if i = 2 [ x − a , x ] × [ y − a , y ] if i = 3 [ x , x + a ] × [ y − a , y ] if i = 4 {\displaystyle Q_{i}(x,y)={\begin{cases}\left[x,x+a\right]\times \left[y,y+a\right]&{\mbox{ if }}i=1\\\left[x-a,x\right]\times \left[y,y+a\right]&{\mbox{ if }}i=2\\\left[x-a,x\right]\times \left[y-a,y\right]&{\mbox{ if }}i=3\\\left[x,x+a\right]\times \left[y-a,y\right]&{\mbox{ if }}i=4\\\end{cases}}} where × {\displaystyle \times } is the cartesian product. Pixels located on the borders between two regions belong to both regions so there is a slight overlap between subregions. The arithmetic mean m i ( x , y ) {\displaystyle m_{i}(x,y)} and standard deviation σ i ( x , y ) {\displaystyle \sigma _{i}(x,y)} of the four regions centered around a pixel (x,y) are calculated and used to determine the value of the central pixel. The output of the Kuwahara filter Φ ( x , y ) {\displaystyle \Phi (x,y)} for any point ( x , y ) {\displaystyle (x,y)} is then given by Φ ( x , y ) = m i ( x , y ) {\textstyle \Phi (x,y)=m_{i}(x,y)} where i = a r g min j ⁡ σ j ( x , y ) {\displaystyle i=\operatorname {arg\min } _{j}\sigma _{j}(x,y)} . This means that the central pixel will take the mean value of the area that is most homogenous. The location of the pixel in relation to an edge plays a great role in determining which region will have the greater standard deviation. If for example the pixel is located on a dark side of an edge it will most probably take the mean value of the dark region. On the other hand, should the pixel be on the lighter side of an edge it will most probably take a light value. On the event that the pixel is located on the edge it will take the value of the more smooth, least textured region. The fact that the filter takes into account the homogeneity of the regions ensures that it will preserve the edges while using the mean creates the blurring effect. Similarly to the median filter, the Kuwahara filter uses a sliding window approach to access every pixel in the image. The size of the window is chosen in advance and may vary depending on the desired level of blur in the final image. Bigger windows typically result in the creation of more abstract images whereas small windows produce images that retain their detail. Typically windows are chosen to be square with sides that have an odd number of pixels for symmetry. However, there are variations of the Kuwahara filter that use rectangular windows. Additionally, the subregions do not need to overlap or have the same size as long as they cover all of the window. == Color images == For color images, the filter should not be performed by applying the filter to each RGB channel separately, and then recombining the three filtered color channels to form the filtered RGB image. The main problem with that is that the quadrants will have different standard deviations for each of the channels. For example, the upper left quadrant may have the lowest standard deviation in the red channel, but the lower right quadrant may have the lowest standard deviation in the green channel. This situation would result in the color of the central pixel to be determined by different regions, which might result in color artifacts or blurrier edges. To overcome this problem, for color images a slightly modified Kuwahara filter must be used. The image is first converted into another color space, the HSV color space. The modified filter then operates on only the "brightness" channel, the Value coordinate in the HSV model. The variance of the "brightness" of each quadrant is calculated to determine the quadrant from which the final filtered color should be taken from. The filter will produce an output for each channel which will correspond to the mean of that channel from the quadrant that had the lowest standard deviation in "brightness". This ensures that only one region will determine the RGB values of the central pixel. ImageMagick uses a similar approach, but using the Rec. 709 Luma as the brightness metric. === Julia Implementation === == Applications == Originally the Kuwahara filter was proposed for use in processing RI-angiocardiographic images of the cardiovascular system. The fact that any edges are preserved when smoothing makes it especially useful for feature extraction and segmentation and explains why it is used in medical imaging. The Kuwahara filter however also finds many applications in artistic imaging and fine-art photography due to its ability to remove textures and sharpen the edges of photographs. The level of abstraction helps create a desirable painting-like effect in artistic photographs especially in the case of the colored image version of the filter. These applications have known great success and have encouraged similar research in the field of image processing for the arts. Although the vast majority of applications have been in the field of image processing there have been cases that use modifications of the Kuwahara filter for machine learning tasks such as clustering. The Kuwahara filter has been implemented in CVIPtools. The Kuwahara filter is present as a shader node in Blender. == Drawbacks and restrictions == The Kuwahara filter despite its capabilities in edge preservation has certain drawbacks. At a first glance it is noticeable that the Kuwahara filter does not take into account the case where two regions have equal standard deviations. This is not often the case in real images since it is rather hard to find two regions with exactly the same standard deviation due to the noise that is always present. In cases where two regions have similar standard deviations the value of the center pixel could be decided at random by the noise in these regions. Again this would not be a problem if the regions had the same mean. However, it is not unusual for regions of very different means to have the same standard deviation. This makes the Kuwahara filter susceptible to noise. Different ways have been proposed for dealing with this issue, one of which is to set the value of the center pixel to ( m 1 + m 2 ) / 2 {\textstyle (m_{1}+m_{2})/2} in cases where the standard deviation of two regions do not differ more than a certain value D {\displaystyle D} . The Kuwahara filter is also known to create block artifacts in the images especially in regions of the image that are highly textured. These blocks disrupt the smoothness of the image and are considered to have a negative effect in the aesthetics of the image. This phenomenon occurs due to the division of the window into square regions. A way to overcome this effect is to take windows that are not rectangular(i.e. circular windows) and separate them into more non-rectangular regions. There have also been approaches where the filter adapts its window depending on the input image. == Extensions of the Kuwahara filter == The success of the Kuwahara filter has spurred an increase the development of edge-enhancing smoothing filters. Several variations have been proposed for similar use most of which attempt to deal with the drawbacks of the original Kuwahara filter. The "Generalized Kuwahara filter" proposed by P. Bakker considers several windows that contain a fixed pixel. Each window is then assigned an estimate and a confidence value. The value of the fixed pixel then takes the value of the estimate of the window with the highest confidence. This filter is not characterized by the same ambiguity in the presence of noise and manages to eliminate the block artifacts. The "Mean of Least Variance"(MLV) filter, proposed by M.A. Schulze also produces edge-enhancing smoothing results in images. Similarly to the Kuwahara filter it assumes a window of size 2 d − 1 × 2 d − 1 {\displaystyle 2d-1\times 2d-1} but instead of searching amongst four subregions of size d × d {\displaystyle d\times d} for the one with minimum variance it searches amongst all possible d × d {\displaystyle d\times d} subregions. This means the central pixel of the window will be assigned the mean of the one subregion out of a poss

    Read more →
  • Digital art

    Digital art

    Digital art, or the digital arts, is artistic work that uses digital technology as part of the creative or presentational process. It can also refer to computational art that uses and engages with digital media. Since the 1960s, various names have been used to describe digital art, including computer art, electronic art, multimedia art, and new media art. Digital art includes pieces stored on physical media, such as with digital painting, as well as digital galleries on websites. Digital art also extends to the field of visual computing. == History == In the early 1960s, John Whitney developed the first computer-generated art using mathematical operations. In 1963, Ivan Sutherland invented the first user interactive computer-graphics interface known as Sketchpad. Between 1974 and 1977, Salvador Dalí created two big canvases of Gala Contemplating the Mediterranean Sea which at a distance of 20 meters is transformed into the portrait of Abraham Lincoln (Homage to Rothko) and prints of Lincoln in Dalivision based on a portrait of Abraham Lincoln processed on a computer by Leon Harmon published in "The Recognition of Faces". The technique is similar to what later became known as photographic mosaics. Andy Warhol created digital art using an Amiga where the computer was publicly introduced at the Lincoln Center in July 1985. An image of Debbie Harry was captured in monochrome from a video camera and digitized into a graphics program called ProPaint. Warhol manipulated the image by adding color using flood fills. == Art made for digital media == Artwork that is highly computational, presented through digital media, and explicitly engages with digital technologies are categorized as "art made for digital media". This differs from art using digital tools, which incorporate digital technology in the creation process but may exist outside the digital world. Digital art historian Christiane Paul writes that it "is highly problematic to classify all art that makes use of digital technologies somewhere in its production and dissemination process as digital art since it makes it almost impossible to arrive at any unifying statement about the art form". == Art that uses digital tools == Digital art can be purely computer-generated (such as fractals and algorithmic art) or taken from other sources, such as a scanned photograph or an image drawn using vector graphics software using a mouse or graphics tablet. Artworks are considered digital paintings when created similarly to non-digital paintings but using software on a computer platform and digitally outputting the resulting image as painted on canvas. Despite differing viewpoints on digital technology's impact on the arts, a consensus exists within the digital art community about its significant contribution to expanding the creative domain, i.e., that it has greatly broadened the creative opportunities available to professional and non-professional artists alike. == Art theorists and art historians == Notable art theorists and historians in this field include: Oliver Grau, Jon Ippolito, Christiane Paul, Frank Popper, Jasia Reichardt, Mario Costa, Christine Buci-Glucksmann, Dominique Moulon, Roy Ascott, Catherine Perret, Margot Lovejoy, Edmond Couchot, Tina Rivers Ryan, Fred Forest and Edward A. Shanken. === Digital painting === Digital painting is either a physical painting made with the use of digital electronics and spray paint robotics within the digital art fine art context or pictorial art imagery made with pixels on a computer screen that mimics artworks from the traditional histories of painting and illustration. === Artificial intelligence art === Artists have used artificial intelligence to create artwork since at least the 1960s. Since their design in 2014, some artists have created artwork using a generative adversarial network (GAN), which is a machine learning framework that allows two "algorithms" to compete with each other and iterate. It can be used to generate pictures that have visual effects similar to traditional fine art. The essential idea of image generators is that people can use text descriptions to let AI convert their text into visual picture content. Anyone can turn their language into a painting through a picture generator. == Digital art education == Digital art education has become more common with the advancement of digital hardware and software. From hardware such as graphics tablets, styluses, tablets, 3D scanners, virtual reality headsets, and digital cameras; to software such as digital art software, 3D modeling software, 3D rendering, digital sculpting, 2D graphics software, digital painting, 3D terrain generation, 2D animation software, 3D animation software, raster graphics editors, vector graphics editors, mathematical art software, and video editing software. == Scholarship and archives == In addition to the creation of original art, research methods that utilize AI have been generated to quantitatively analyze digital art collections. This has been made possible due to the large-scale digitization of artwork in the past few decades. Although the main goal of digitization was to allow for accessibility and exploration of these collections, the use of AI in analyzing them has brought about new research perspectives. Two computational methods, close reading and distant viewing, are the typical approaches used to analyze digitized art. Close reading focuses on specific visual aspects of one piece. Some tasks performed by machines in close reading methods include computational artist authentication and analysis of brushstrokes or texture properties. In contrast, through distant viewing methods, the similarity across an entire collection for a specific feature can be statistically visualized. Common tasks relating to this method include automatic classification, object detection, multimodal tasks, knowledge discovery in art history, and computational aesthetics. Whereas distant viewing includes the analysis of large collections, close reading involves one piece of artwork. Whilst 2D and 3D digital art is beneficial as it allows the preservation of history that would otherwise have been destroyed by events like natural disasters and war, there is the issue of who should own these 3D scans – i.e., who should own the digital copyrights. === Computer demos === Computer demos are based on computer programs, usually non-interactive. It produces audiovisual presentations. They are a novel form of art, which emerged as a consequence of the home computer revolution in the early 1980s. In the classification of digital art, they can be best described as real-time procedurally generated animated audio-visuals. This form of art does not concentrate only on the aesthetics of the final presentation, but also on the complexities and skills involved in creating the presentation. As such, it can be fully enjoyed only by persons with a relatively high knowledge level of relevant computer technologies. An example is that, as said by Hua Jin and Jie Yang, Using computer-aided design software to present the class content in art design teaching," is not to advocate computer-aided design instead of hand-drawn performance, but to make it serve the profession earlier through a more reasonable course arrangement." On the other hand, many of the created pieces of art are primarily aesthetic or amusing, and those can be enjoyed by the general public. === Digital installation art === Digital installation art constitutes a broad field of artistic practices and a variety of forms. Some resemble video installations, especially large-scale works involving projections and live video capture. By using projection techniques that enhance an audience's impression of sensory envelopment, many digital installations attempt to create immersive environments. While others go even further and attempt to facilitate a complete immersion in virtual realms. This type of installation is generally site-specific, scalable, and without fixed dimensionality, meaning it can be reconfigured to accommodate different presentation spaces. Scott Snibbe's "Boundary Functions" is an example of augmented reality digital installation art, which responds to people who enter the installation by drawing lines between people, indicating their personal space.Noah Wardrip-Fruin's "Screen"(2003) utilizes a Cave Automatic Virtual Environment (CAVE) to create an interactive, text-based digital experience that engages the viewer in a multi-sensory interaction. === Internet art and net.art === Internet art is digital art that uses the specific characteristics of the Internet and is exhibited on the Internet. The term "internet art" is included by "net art" for which artists assume that network will be refreshed through history. So the term "post-internet art" is used to exclude artworks outside of the internet media. A representative example is Protocols for Achievements, which is a digital photo frame that confronts the aestheti

    Read more →
  • List of computer graphics journals

    List of computer graphics journals

    List of computer graphics journals includes notable peer-reviewed scientific and academic journals that focus on computer graphics, visualization, and related areas such as rendering, animation, image processing, and geometric modeling. == Journals == ACM Transactions on Graphics Computers & Graphics IEEE Computer Graphics and Applications IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems Graphical Models Journal of Computer Graphics Techniques Presence: Teleoperators and Virtual Environments Virtual Reality Simulation & Gaming

    Read more →
  • Final Cut Express

    Final Cut Express

    Final Cut Express was a video editing software suite created by Apple Inc. It was the consumer version of Final Cut Pro and was designed for advanced editing of digital video as well as high-definition video, which was used by many amateur and professional videographers. Final Cut Express was considered a step above iMovie in terms of capabilities, but a step underneath Final Cut Pro and its suite of applications. As of June 21, 2011, Final Cut Express was discontinued in favor of Final Cut Pro X. == History == Final Cut Express 1.0, based on Final Cut Pro 3, was released at Macworld Conference and Expo in San Francisco in 2003. The second version, based on Final Cut Pro 4, was released at Macworld San Francisco in 2004. The third version, capable of editing high definition video, was also announced at Macworld San Francisco a year later, and was released as Final Cut Express HD in February 2005. It was based on Final Cut Pro HD (version 4.5) and included LiveType 1.2 and Soundtrack 1.2. Final Cut Express version 3.5 was released with little fanfare in May 2006 as a Universal Binary. In addition to improving real-time rendering with Dynamic RT, version 3.5 upgraded LiveType to version 2.0 and Soundtrack to version 1.5. In November 2007, Apple released Final Cut Express 4, which although it did not support real-time editing in the AVCHD format (it only allowed for transcoding AVCHD to Apple Intermediate Codec (AIC) provided that the camera was actually attached to the computer - it did not convert AVCHD files stored elsewhere and is currently for Intel processors only), imported iMovie '08 projects and included 50 new filters. It did not include Soundtrack 1.5, but it still included LiveType which enables users to create advanced text for the movies they created in Final Cut. The price was dropped from $299 for version 3.5 to $199 for version 4.0. In June 2011, Final Cut Express was officially discontinued, in favor of Final Cut Pro X. == Features == Final Cut Express' interface was identical to that of Final Cut Pro, but lacks some film-specific features, including Cinema Tools, multi-cam editing, batch capture, and a time code view. The program performed 32 undo operations, while Final Cut Pro did 99 [2]. Features the program did include were: The ability to keyframe filters Dynamic RT, which changes real-time settings on-the-fly Motion path keyframing Opacity keyframing Ripple, roll, slip, slide and blade edits Picture-in-picture and split-screen effects Up to 99 video tracks and 12 compositing modes Up to 99 audio tracks Motion project import Two-way color correction. Chroma key One feature of Final Cut Express that was not available in Final Cut Pro is the ability to import iMovie '08 projects (though transitions are not preserved). === RT Extreme === Inherited from Final Cut Pro, Final Cut Express features RT Extreme, which allows previews of some video filters and transitions without rendering. Audio that is not in the native AIFF file format needs rendering before it can be played back. RT Extreme has three modes: 'Safe', for seeing multiple video layers at a quality that more or less guarantees a smooth playback; 'Unlimited', which allows the maximum number of composited video layers to be viewed at the same time; and 'Dynamic', which alternates between these settings depending on how many simultaneous video tracks are present. Frame dropping may result from using 'Unlimited' on low-resource machines. === Boris Calligraphy === Like Final Cut Pro, Express also comes with Boris Calligraphy, a plugin for advanced titling and scrolling/crawling titles more sophisticated than the ones that can be created with the built-in title overlays. Calligraphy has a WYSIWYG interface and features wrapping, alignment, leading, kerning and tracking features, as well as allowing up to five custom outlines and five custom drop shadows to be defined for a selected portion of the title. == Soundtrack == Prior to version 4, Final Cut Express included Soundtrack 1.5, a music program similar to the consumer-level GarageBand, but designed for videographers who wish to add music to their films. Soundtrack comes with around 4,000 professionally recorded instrument loops and sound effects that can be arranged in multiple tracks beneath the video track. To use Soundtrack, users export their Final Cut Express sequence, or a marked portion thereof, as a reference file, which can include scoring markers defined in the timeline. This reference file can be imported as the video track in Soundtrack. Soundtrack is functionally and visually identical to Soundtrack Pro's multitrack editing mode, but includes fewer Logic plugins and lacks the highly regarded noise removal tool. Soundtrack was removed from Final Cut Express 4, which lowered its price and may have encouraged people to buy Logic Express.

    Read more →
  • Fluency Voice Technology

    Fluency Voice Technology

    Fluency Voice Technology was a company that developed and sold packaged speech recognition solutions for use in call centers. Fluency's Speech Recognition solutions are used by call centers worldwide to improve customer service and significantly reduce costs and are available on-premises and hosted. == History == 1998 – Fluency was created as a spin-off from the Voice Research & Development team of a company called netdecisions. This R&D operation was established in Cambridge UK. The focus of the development was speech recognition systems based on the VXML standard. 2001 – Fluency became a separate entity in May 2001. Fluency began the creation of a software development platform specifically aimed at automating call center activities. This platform became Fluency's VoiceRunner. 2002 to 2004 – Fluency establishes accomplishes many successful deployments in customer sites such as National Express and Barclaycard. 2003 – Fluency expanded into the USA. Fluency also acquires Vocalis of Cambridge, UK in August 2003. 2004 – Fluency receives £6 million investment from leading European Venture Capitalists and establishes a global OEM partnership with Avaya, and the acquisition of SRC Telecom. 2008 – Fluency is acquired by Syntellect Ltd == Customers == Call Centers around the world use Fluency to improve service and reduce costs. They include Travelodge, Standard Life Bank, Sutton and East Surrey Water, Pizza Hut, CWT, Barclays, Powergen, First Choice, OutRight, J D Williams, Capital Blue Cross, Chelsea Building Society, EDF, bss, TV Licensing and Capita Software Services.

    Read more →
  • Clips (software)

    Clips (software)

    Clips is a discontinued mobile video editing software application created by Apple Inc. It was released onto the iOS App Store on April 6, 2017, for free. Initially, it was only available on 64-bit devices running iOS 10.3 or later; as of version 3.1.3, it requires iOS 16.0 or later. Apple describes it as an app for "making and sharing fun videos with text, effects, graphics, and more.". Its final release was on May 9, 2024 before was removed from the App Store on October 10, 2025. == Features == After launching of the app, the user sees the view of the front-facing camera. The app allows the user to create a new clip by tapping on a red record button, or use photos or videos from the device's photo library. Once a clip is recorded, it can be added to a project timeline shown at the bottom of the screen. The user can share their project on social media platforms. The user can also add filters and effects to the project. "Live Titles" (available in several styles) can also be created by dictating to the device.

    Read more →
  • Signal-to-noise ratio (imaging)

    Signal-to-noise ratio (imaging)

    Signal-to-noise ratio (SNR) is used in imaging to characterize image quality. The sensitivity of a (digital or film) imaging system is typically described in the terms of the signal level that yields a threshold level of SNR. Industry standards define sensitivity in terms of the ISO film speed equivalent, using SNR thresholds (at average scene luminance) of 40:1 for "excellent" image quality and 10:1 for "acceptable" image quality. SNR is sometimes quantified in decibels (dB) of signal power relative to noise power, though in the imaging field the concept of "power" is sometimes taken to be the power of a voltage signal proportional to optical power; so a 20 dB SNR may mean either 10:1 or 100:1 optical power, depending on which definition is in use. == Definition of SNR == Traditionally, SNR is defined to be the ratio of the average signal value μ s i g {\displaystyle \mu _{\mathrm {sig} }} to the standard deviation of the signal σ s i g {\displaystyle \sigma _{\mathrm {sig} }} : S N R = μ s i g σ s i g {\displaystyle \mathrm {SNR} ={\frac {\mu _{\mathrm {sig} }}{\sigma _{\mathrm {sig} }}}} when the signal is an optical intensity, or as the square of this value if the signal and noise are viewed as amplitudes (field quantities).

    Read more →
  • Bump (application)

    Bump (application)

    Bump was an iOS and Android mobile app that enabled smartphone users to transfer contact information, photos and files between devices. In 2011, it was #8 on Apple's list of all-time most popular free iPhone apps, and by February 2013 it had been downloaded 125 million times. Its developer, Bump Technologies, shut down the service and discontinued the app on January 31, 2014, after being acquired by Google for Google Photos and Android Camera. == Features == Bump sent contact information, photos and files to another device over the internet. Before activating the transfer, each user confirmed what they want to send to the other user. To initiate a transfer, two people physically bumped their phones together. A screen appeared on both users' smartphone displays, allowing them to confirm what they want to send to each other. When two users bumped their phones, software on the phones send a variety of sensor data to an algorithm running on Bump servers, which included the location of the phone, accelerometer readings, IP address, and other sensor readings. The algorithm figured out which two phones felt the same physical bump and then transfers the information between those phones. Bump did not use Near Field Communication. February 2012 release of Bump 3.0 for iOS, the company streamlined the app to focus on its most frequently used features: contact and photo sharing. Bump 3.0 for Android maintained the features eliminated from the iOS version but moved them behind swipeable layers. In May 2012, a Bump update enabled users to transfer photos from their phone to their computer via a web service. To initiate a transfer, the user goes to the Bump website on their computer and bumps the smartphone on the computer keyboard's space bar. By December 2012, various Bump updates for iOS and Android had added the abilities to share video, audio, and any files. Users swipe to access those features. In February 2013, an update to the Bump iOS and Android apps enabled users to transfer photos, videos, contacts and other files from a computer to a smartphone and vice versa via a web service. To perform the transfer, users went to the Bump website on their computer and bump the smartphone on the computer keyboard's space bar. == History == The underlying idea of a synchronous gesture like bumping two devices for content transfer or pairing them was first conceived by Ken Hinkley of Microsoft Research in 2003. This idea was presented at a user interface and technology conference that same year. The paper proposed the use of accelerometers and a bumping gesture of two devices to enable communication, screen sharing and content transfer between them. Similar to this original concept, the idea for Bump app was conceived by David Lieb, a former employee of Texas Instruments, while he was attending the University of Chicago Booth School of Business for his MBA. While going through the orientation and meeting process of business school, he became frustrated by constantly entering contact information into his iPhone and felt that the process could be improved. His fellow Texas Instruments employees Andy Huibers and Jake Mintz, who was a classmate of Lieb's at the University of Chicago's MBA program, joined Lieb to form Bump Technologies. Bump Technologies launched in 2008 and is located in Mountain View, CA. Early funding for the project was provided by startup incubator Y Combinator, Sequoia Capital and other angel investors. It gained attention at the CTIA international wireless conference, due to its accessibility and novelty factor. In October 2009, Bump received $3.4m in Series A funding followed in January 2011 with a $16m series B financing round led by Andreessen Horowitz. Silicon Valley venture capitalist Marc Andreessen sits on the company's board. The Bump app debuted in the Apple iOS App Store in March 2009 and was “one of the apps that helped to define the iPhone” (Harry McCracken, Technologizer). It soon became the billionth download on Apple's App Store. An Android version launched in November 2009. By the time Bump 3.0 for iOS was released in February 2012, the app had been installed 77 million times, with users sharing more than 2 million photos daily. As of February 2013, there had been 125 million Bump app downloads. == Other apps created by Bump Technologies == Bump Technologies worked with PayPal in March 2010 to create a PayPal iPhone application. The application, which allows two users to automatically activate an Internet transfer of money between their accounts, found widespread adoption. A similar version was released for Android in August 2010. The Bump capability in PayPal's apps was removed in March 2012. At that time, Bump Technologies released Bump Pay, an iOS app that lets users transfer money via PayPal by physically bumping two smartphones together. The tool was originally created for the Bump team to use when splitting up restaurant bills. The payment feature was not added to the Bump app because the company “wanted to make it as simple as possible so people understand how this works,” Lieb told ABC News. Bump Pay was the first app from the company's Bump Labs initiative. A goal of Bump Labs is to test new app ideas that may not fit within the main Bump app. ING Direct added a feature to its iPhone app in 2011 that lets users transfer money to each other using Bump's technology. The feature was later added to its Android app, now called Capital One 360. In July 2012, Bump Technologies released Flock, an iPhone photo sharing app. An Android version was released in December 2012. Using geolocation data embedded in photos and a user's Facebook connections, Flock finds pictures the user takes while out with friends and family and puts everyone's photos from that event into a single shared album. Users receive a push notification after the event, asking if they want to share their photos with friends who were there in the moment. The app will also scan previous photos in the iPhone camera roll and uncover photos that have yet to be shared. If location services were enabled at the time a photo was taken, Flock allows users to create an album of photos from the past with the friends who were there with them. == Acquisition by Google == On September 16, 2013, Bump Technologies announced that it had been acquired by Google. On December 31, 2013, they broke the news that both Bump and Flock would be discontinued so that the team could focus on new projects at Google. The apps were removed from the App Store and Google Play on January 31, 2014. The company subsequently deleted all user data and shut down their servers, thus rendering existing installations of the apps inoperable.

    Read more →
  • Scientific Working Group – Imaging Technology

    Scientific Working Group – Imaging Technology

    The Scientific Working Group on Imaging Technology was convened by the Federal Bureau of Investigation in 1997 to provide guidance to law enforcement agencies and others in the criminal justice system regarding the best practices for photography, videography, and video and image analysis. This group was terminated in 2015. == History == As technology has advanced through the years, law enforcement has needed to stay abreast of emerging technological advances and use these in the investigation of crime. A factor that is considered when new technology is used in these investigations is the determination of whether the use of that new technology will be admissible in court. The judicial system in the United States currently has two standards used in the determination of admissibility of testimony regarding scientific evidence; the Daubert Standard and the Frye Standard. These standards guide the courts in the admissibility of testimony derived from the use of new technologies and scientific techniques. The Federal Bureau of Investigation (FBI), seeking to address possible admissibility issues with such testimony, established Scientific Working Groups starting with the Scientific Working Group on DNA Analysis and Methods (SWGDAM) in 1988. The goal of these groups is to open lines of communication between law enforcement agencies and forensic laboratories around the world while providing guidance on the use of new and innovative technologies and techniques. This guidance can lead to admissibility of evidence and/or testimony, provided proper methods in the collection of evidence and its analysis are employed. In 2009, the National Academy of Sciences released a report entitled, "Strengthening Forensic Science in the United States: A Path Forward." This report addresses many topics including challenges and disparities facing the forensic science community, standardization, certification of practitioners and accreditation of their respective entities, problems related to the interpretation of forensic evidence, the need for research, and the admission of forensic science evidence in litigation. This report mentions the Scientific Working Groups and their role in forensic science. The history of imaging technology (photography) can be said to extend back to the times of Chinese philosopher Mo-Ti (470-390 B.C.) who described the principles behind the precursor to the camera obscura. Since that time, advances in imaging technology include the discovery of chemical photographic processes in the 19th century and the use of electronic imaging technology that includes analog video cameras and digital video and still cameras. By the mid 1990s, it was apparent that technologically advanced camera systems such as these were being adopted for use in the criminal justice system. This led the FBI to convene a meeting of individuals working in the field of forensic imaging from federal, state, local, and foreign law enforcement, and the U.S. military, during the summer of 1997. As a result of this meeting, the Technical Working Group on Imaging Technology was formed from a core group of the meeting’s participants. This group later became the Scientific Working Group on Imaging Technology (SWGIT). Prior to the inception of SWGIT, some law enforcement agencies began adopting digital imaging technology. Due to the lack of guidelines or standards, some of these agencies attempted to replace all their film cameras with substandard digital cameras, only to find that the equipment they had purchased was not capable of accomplishing the mission for which they were intended. At that time only low resolution digital cameras were deemed affordable by some law enforcement agencies. Some of these agencies were forced to rethink their photography procedures and reverted to the use of film cameras or replaced their low-resolution digital cameras with higher quality, more expensive equipment. Also lacking at this early stage was guidance on how to store and archive digital image files. When SWGIT was formed, it was tasked with providing guidance to law enforcement and others in the criminal justice system by releasing documents that describe the best practices and guidelines for the use of imaging technology, to include these concerns and many others. This group was terminated in 2015. == SWGIT Function == During its existence, SWGIT provided information on the appropriate use of various imaging technologies including both established and new. This was accomplished through the release of documents such as the SWGIT Best Practices documents. As changes in technology occurred, these documents were updated. Over the course of its existence, SWGIT collaborated with other Scientific Working Groups to address imaging concerns within their respective disciplines. SWGIT published over 20 documents that dealt specifically with imaging technology. SWGIT also co-published documents with the Scientific Working Group on Digital Evidence (SWGDE) that had a component or components dealing with imaging technology. SWGIT also provided imaging technology guidance and input for documents from the Scientific Working Group on Friction Ridge Analysis, Study and Technology (SWGFAST), the Scientific Working Group for Forensic Document Examination (SWGDOC), and the Scientific Working Group on Shoeprint and Tire Tread Evidence (SWGTREAD). SWGIT assisted the American Society of Crime Lab Directors/Laboratory Accreditation Board (ASCLD/LAB) in the writing of definitions and standards for the accreditation of Digital and Multimedia Evidence sections of crime laboratories. In addition to releasing documents, SWGIT members disseminated best practices for law enforcement professionals where imaging technology was concerned. This was carried out by attending and lecturing at meetings and conferences of various forensic organizations that included: The American Academy of Forensic Sciences (AAFS) The International Association for Identification (IAI) The Law Enforcement and Emergency Services Video Association (LEVA) The American Society of Crime Lab Directors (ASCLD) The SWGIT membership consisted of approximately fifty scientists, photographers, instructors, and managers from more than two dozen federal, state, and local law enforcement agencies, as well as from the academic and research communities. The membership elected its officers from within. SWGIT was composed of the Executive Committee, four standing subcommittees, and ad hoc subcommittees appointed on an as-needed basis. The standing subcommittees were: Image Analysis, Forensic Photography, Video, and Outreach. This group was terminated in 2015. == Legal Proceedings == The following court cases have conducted Daubert v. Merrell Dow Pharm., Inc., 509 U.S. 579 (1993) hearings in which SWGIT best practice documents have been cited as accepted protocol, methodology, and as generally accepted techniques in the forensic community: U. S. v. Rudy Frabizio, U.S. District Court, Boston, MA, 2008 (Image Authentication) U.S. v. Nobumochi Furukawa, U.S. District Court, Minnesota, 2007 (Video Authentication) U.S. v. John Stroman, U.S. District Court, South Carolina, 2007 (Facial Comparison Analysis) State of Texas v. Daniel Day, Tarrant County Texas, 2005 (Camera Identification to Images) U.S. v. Marc Watzman, U.S. District Court, Northern Illinois, 2004 (Video Authentication) U.S. v. McKreith, U.S. District Court, Fort Lauderdale, FL, 2002 (Photo comparison of shirt) == Termination == This group was unfunded by the FBI in 2015.

    Read more →
  • Multi-focus image fusion

    Multi-focus image fusion

    Multi-focus image fusion is a multiple image compression technique using input images with different focus depths to make one output image that preserves all information. == Overview == The main idea of image fusion is gathering important and the essential information from the input images into one single image which ideally has all of the information of the input images. The research history of image fusion spans over 30 years and many scientific papers. Image fusion generally has two aspects: image fusion methods and objective evaluation metrics. In visual sensor networks (VSN), sensors are cameras which record images and video sequences. In many applications of VSN, a camera can't give a perfect illustration including all details of the scene. This is because of the limited depth of focus of the optical lens of cameras. Therefore, just the object located in the focal length of camera is focused and clear, and other parts of the image are blurred. VSN captures images with different depths of focus using several cameras. Due to the large amount of data generated by cameras compared to other sensors such as pressure and temperature sensors and some limitations of bandwidth, energy consumption and processing time, it is essential to process the local input images to decrease the amount of transmitted data. == Multi-Focus image fusion in the spatial domain == Huang and Jing have reviewed and applied several focus measurements in the spatial domain for the multi-focus image fusion process, suitable for real-time applications. They mentioned some focus measurements including variance, energy of image gradient (EOG), Tenenbaum's algorithm (Tenengrad), energy of Laplacian (EOL), sum-modified-Laplacian (SML), and spatial frequency (SF). Their experiments showed that EOL gave better results than other methods like variance and spatial frequency. == Multi-Focus image fusion in multi-scale transform and DCT domain == Image fusion based on the multi-scale transform is the most commonly used and promising technique. Laplacian pyramid transform, gradient pyramid-based transform, morphological pyramid transform and the premier ones, discrete wavelet transform, shift-invariant wavelet transform (SIDWT), and discrete cosine harmonic wavelet transform (DCHWT) are some examples of image fusion methods based on multi-scale transform. These methods are complex and have some limitations e.g. processing time and energy consumption. For example, multi-focus image fusion methods based on DWT require a lot of convolution operations, so they take more time and energy to process. Therefore, most methods in multi-scale transform are not suitable for real-time applications. Moreover, these methods are not very successful along edges, due to the wavelet transform process missing the edges of the image. They create ringing artefacts in the output image and reduce its quality. Due to the aforementioned problems in the multi-scale transform methods, researchers are interested in multi-focus image fusion in the DCT domain. DCT-based methods are more efficient in terms of transmission and archiving images coded in Joint Photographic Experts Group (JPEG) standard to the upper node in the VSN agent. A JPEG system consists of a pair of an encoder and a decoder. In the encoder, images are divided into non-overlapping 8×8 blocks, and the DCT coefficients are calculated for each. Since the quantization of DCT coefficients is a lossy process, many of the small-valued DCT coefficients are quantized to zero, which corresponds to high frequencies. DCT-based image fusion algorithms work better when the multi-focus image fusion methods are applied in the compressed domain. In addition, in the spatial-based methods, the input images must be decoded and then transferred to the spatial domain. After implementation of the image fusion operations, the output fused images must again be encoded. DCT domain-based methods do not require complex and time-consuming consecutive decoding and encoding operations. Therefore, the image fusion methods based on DCT domain operate with much less energy and processing time. Recently, a lot of research has been carried out in the DCT domain. DCT+Variance, DCT+Corr_Eng, DCT+EOL, and DCT+VOL are some prominent examples of DCT based methods.

    Read more →