AI Content Quiz

AI Content Quiz — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Sub-pixel resolution

    Sub-pixel resolution

    In digital image processing, sub-pixel resolution can be obtained in images constructed from sources with information exceeding the nominal pixel resolution of said images. == Example == For example, if the image of a ship of length 50 metres (160 ft), viewed side-on, is 500 pixels long, the nominal resolution (pixel size) on the side of the ship facing the camera is 0.1 metres (3.9 in). Now sub-pixel resolution of well resolved features can measure ship movements which are an order of magnitude (10×) smaller. Movement is specifically mentioned here because measuring absolute positions requires an accurate lens model and known reference points within the image to achieve sub-pixel position accuracy. Small movements can however be measured (down to 1 cm) with simple calibration procedures. Specific fit functions often suffer specific bias with respect to image pixel boundaries. Users should therefore take care to avoid these "pixel locking" (or "peak locking") effects. == Determining feasibility == Whether features in a digital image are sharp enough to achieve sub-pixel resolution can be quantified by measuring the point spread function (PSF) of an isolated point in the image. If the image does not contain isolated points, similar methods can be applied to edges in the image. It is also important when attempting sub-pixel resolution to keep image noise to a minimum. This, in the case of a stationary scene, can be measured from a time series of images. Appropriate pixel averaging, through both time (for stationary images) and space (for uniform regions of the image) is often used to prepare the image for sub-pixel resolution measurements.

    Read more →
  • Glow (app)

    Glow (app)

    Glow is a fertility awareness and period-tracking app. It is part of a suite of mobile apps focused on women's reproductive health and childcare, which includes Eve by Glow (a dedicated period tracker), Glow Nurture (a pregnancy tracker), and Glow Baby (a baby development tracker). The Glow company also operates an online shop that sells several fertility-related products, including ovulation test strips, pregnancy tests, and wearable breast pumps. In 2024, Glow was reported to have approximately 25 million users across its various apps and community message boards. == History == Glow debuted in August 2013 as an iOS app. It was founded by Michael Huang and Max Levchin and launched with $6 million in Series A funding from venture capital firms Founders Fund and Andreesen Horowitz. In 2014, Glow raised an additional $17 million in Series B funding, with Formation 8 joining existing investors. In 2015, Glow launched Ruby, an app dedicated to sexual health. That year, Wired reported that the company had added features to their apps allowing men to monitor their fertility. Glow subsequently released an additional set of apps focused on pregnancy tracking and infant development. In 2016, Glow reported that it had a total of approximately 3 million users; by 2018, this had grown to 15 million. Vox described it as one of the “big two” period and fertility tracking apps and the one that had started the “boom” in the femtech space. == Application and features == Glow was initially described as a fertility application that applied data-driven methods to menstrual and ovulation tracking. Core features include cycle logging, ovulation prediction, and symptom tracking. The app also provides educational content related to reproductive health and childcare, as well as a set of online message boards that allow individuals to share experiences and seek peer support. == Privacy and legal issues == Glow has received significant media attention for its privacy and security practices. In 2016, Consumer Reports identified potential exploits in the Glow app that they claimed could have exposed private user data to hackers. Glow subsequently reported that it had fixed the vulnerabilities and told The Washington Post they had no evidence that user data had been compromised. In September 2020, the California Attorney General announced a settlement with Glow related to Consumer Reports’ findings, which included a $250,000 civil penalty. Following the US Supreme Court's 2022 Dobbs v. Jackson ruling, which legalized state-level bans on abortion, Glow (and other fertility trackers, such as Clue and Flo) came under additional scrutiny over concerns that user data on abortions could be reported to law enforcement. After this surge of media interest, a research team affiliated with the University of New South Wales conducted an investigation into the privacy practices of several popular fertility apps, including Glow. Their review of Glow was mixed, noting that they provided several privacy settings and de-identified sensitive data, but that user information could still be disclosed in the future if the app was sold. Glow rejected that claim, telling the Australian Associated Press that it "did not share" personal data. The company also cited several internal security measures it had implemented and its apps' offline data protection setting, which allows users to permanently delete their health-related data. == Reception == In 2014, Fast Company reported that 20,000 women had used Glow to conceive. Later that year, The Guardian included Glow Nurture on its list of the best iPhone apps of 2014. Media coverage often praised Glow's array of menstrual tracking options, although some reviews also noted that fertility apps are not birth control tools and cautioned against relying on them for that purpose. In 2019, Cosmopolitan singled Glow's community of users as one of its standout features.

    Read more →
  • Videotex

    Videotex

    Videotex (or interactive videotex) was one of the earliest implementations of an end-user information system. From the late 1970s to early 2010s, it was used to deliver information (usually pages of text) to a user in computer-like format, typically to be displayed on a television or a dumb terminal. In a strict definition, videotex is any system that provides interactive content and displays it on a video monitor such as a television, typically using modems to send data in both directions. A close relative is teletext, which sends data in one direction only, typically encoded in a television signal. All such systems are occasionally referred to as viewdata. Unlike the modern Internet, traditional videotex services were highly centralized. Videotex in its broader definition can be used to refer to any such service, including teletext, the Internet, bulletin board systems, online service providers, and even the arrival/departure displays at an airport. This usage is no longer common. With the exception of Minitel in France, videotex elsewhere never managed to attract any more than a very small percentage of the universal mass market once envisaged. By the end of the 1980s its use was essentially limited to a few niche applications. == Initial development and technologies == === United Kingdom === The first attempts at a general-purpose videotex service were created in the United Kingdom in the late 1960s. In about 1970 the BBC had a brainstorming session in which it was decided to start researching ways to send closed captioning information to the audience. As the Teledata research continued the BBC became interested in using the system for delivering any sort of information, not just closed captioning. In 1972, the concept was first made public under the new name Ceefax. Meanwhile, the General Post Office (soon to become British Telecom) had been researching a similar concept since the late 1960s, known as Viewdata. Unlike Ceefax which was a one-way service carried in the existing TV signal, Viewdata was a two-way system using telephones. Since the Post Office owned the telephones, this was considered to be an excellent way to drive more customers to use the phones. Not to be outdone by the BBC, they also announced their service, under the name Prestel. ITV soon joined the fray with a Ceefax-clone known as ORACLE. In 1974, all the services agreed on a standard for displaying the information. The display would be a simple 40×24 grid of text, with some "graphics characters" for constructing simple graphics, revised and finalized in 1976. The standard did not define the delivery system, so both Viewdata-like and Teledata-like services could at least share the TV-side hardware, which was expensive at the time. The standard also introduced a new term that covered all such services, teletext. Ceefax first started operation in 1974 with a limited 30 pages, followed quickly by ORACLE and then Prestel in 1979. By 1981, Prestel International was available in nine countries, and a number of countries, including Sweden, The Netherlands, Finland and West Germany were developing their own national systems closely based on Prestel. General Telephone and Electronics (GTE) acquired an exclusive agency for the system for North America. In the early 1980s, videotex became the base technology for the London Stock Exchange's pricing service called TOPIC. Later versions of TOPIC, notably TOPIC2 and TOPIC3, were developed by Thanos Vassilakis and introduced trading and historic price feeds. === France === Development of a French teletext-like system began in 1973. A very simple 2-way videotex system called Tictac was also demonstrated in the mid-1970s. As in the UK, this led on to work to develop a common display standard for videotex and teletext, called Antiope, which was finalised in 1977. Antiope had similar capabilities to the UK system for displaying alphanumeric text and chunky "mosaic" character-based block graphics. A difference however was that while in the UK standard control codes automatically also occupied one character position on screen, Antiope allowed for "non spacing" control codes. This gave Antiope slightly more flexibility in the use of colours in mosaic block graphics, and in presenting the accents and diacritics of the French language. Meanwhile, spurred on by the 1978 Nora/Minc report, the French government was determined to catch up on a perceived falling behind in its computer and communications facilities. In 1980 it began field trials issuing Antiope-based terminals for free to over 250,000 telephone subscribers in Ille-et-Vilaine region, where the French CCETT research centre was based, for use as telephone directories. The trial was a success, and in 1982 Minitel was rolled out nationwide. === Canada === Since 1970, researchers at the Communications Research Centre (CRC) in Ottawa had been working on a set of "picture description instructions", which encoded graphics commands as a text stream. Graphics were encoded as a series of instructions (graphics primitives) each represented by a single ASCII character. Graphic coordinates were encoded in multiple 6 bit strings of XY coordinate data, flagged to place them in the printable ASCII range so that they could be transmitted with conventional text transmission techniques. ASCII SI/SO characters were used to differentiate the text from graphic portions of a transmitted "page". In 1975, the CRC gave a contract to Norpak to develop an interactive graphics terminal that could decode the instructions and display them on a colour display, which was successfully up and running by 1977. Against the background of the developments in Europe, CRC was able to persuade the Canadian government to develop the system into a fully-fledged service. In August 1978, the Canadian Department of Communications publicly launched it as Telidon, a "second generation" videotex/teletext service, and committed to a four-year development plan to encourage rollout. Compared to the European systems, Telidon offered real graphics, as opposed to block-mosaic character graphics. The downside was that it required much more advanced decoders, typically featuring Zilog Z80 or Motorola 6809 processors. === Japan === Research in Japan was shaped by the demands of the large number of Kanji characters used in Japanese script. With 1970s technology, the ability to generate so many characters on demand in the end-user's terminal was seen as prohibitive. Instead, development focussed on methods to send pages to user terminals pre-rendered, using coding strategies similar to facsimile machines. This led to a videotex system called Captain ("Character and Pattern Telephone Access Information Network"), created by NTT in 1978, which went into full trials from 1979 to 1981. The system also lent itself naturally to photographic images, albeit at only moderate resolution. However, the pages typically took two or three times longer to load, compared to the European systems. NHK developed an experimental teletext system along similar lines, called CIBS ("Character Information Broadcasting Station"). Based on a 388×200 pixel resolution, it was first announced in 1976, and began trials in late 1978. (NHK's ultimate production teletext system launched in 1983). == Standards == Work to establish an international standard for videotex began in 1978 in CCITT. But the national delegations showed little interest in compromise, each hoping that their system would come to define what was perceived to be going to be an enormous new mass-market. In 1980 CCITT therefore issued recommendation S.100 (later T.100), noting the points of similarity but the essential incompatibility of the systems, and declaring all four to be recognised options. Trying to kick-start the market, AT&T Corporation entered the fray, and in May 1981 announced its own Presentation Layer Protocol (PLP). This was closely based on the Canadian Telidon system, but added to it some further graphics primitives and a syntax for defining macros, algorithms to define cleaner pixel spacing for the (arbitrarily sizeable) text, and also dynamically redefinable characters and a mosaic block graphic character set, so that it could reproduce content from the French Antiope. After some further revisions this was adopted in 1983 as ANSI standard X3.110, more commonly called NAPLPS, the North American Presentation Layer Protocol Syntax. It was also adopted in 1988 as the presentation-layer syntax for NABTS, the North American Broadcast Teletext Specification. Meanwhile, the European national Postal Telephone and Telegraph (PTT) agencies were also increasingly interested in videotex, and had convened discussions in European Conference of Postal and Telecommunications Administrations (CEPT) to co-ordinate developments, which had been diverging along national lines. As well as the British and French standards, the Swedes had proposed extending the British Prestel standard with a new se

    Read more →
  • List of color palettes

    List of color palettes

    The following is a list that contains color palettes for notable computer graphics, terminals and video game consoles. Only a simulated image using a palette and its name are given. Main articles are linked from the name of each palette, test charts, sample colours, simulated images, and further technical details (including references). During older eras of computing, manufacturers developed many different display systems often in a competitive, non-collaborative basis (with a few exceptions in the VESA consortium), creating many proprietary, non-standard different instances of display hardware. Often, as with early personal and home computers, a given machine employed its unique display subsystem, also with its unique color palette. Furthermore, software developers had made use of the color abilities of distinct display systems in many different ways. The result is that there is no single common standard nomenclature or classification taxonomy which can encompass every computer color palette. In order to organize the material, color palettes have been grouped following certain criteria. First, generic monochrome and full RGB repertories common to various computer display systems are listed. Then, usual color repertories used for display systems that employ indexed color techniques. And finally, specific manufacturers' color palettes implemented in many representative early personal computers and video game consoles of various brands. The list for personal computer palettes is split into two categories: 8-bit and 16-bit machines. This is not intended as a true strict categorization of such machines, because mixed architectures also exist (16-bit processors with an 8-bit data bus or 32-bit processors with a 16-bit data bus, among others). The distinction is based more on broad 8-bit and 16-bit computer ages or generations (around 1975–1985 and 1985–1995, respectively) and their associated state of the art in color display capabilities. The following is the common color test chart and sample image used to render each palette in this list: See further details in the summary paragraph of the corresponding article. == List of monochrome and RGB palettes == In this article, the term monochrome palette means a set of intensities for a monochrome display, and the term RGB palette is defined as the complete set of combinations a given RGB display can offer by mixing all the possible intensities of the red, green, and blue primaries available in its hardware. These are generic complete repertories of colors to produce black and white and RGB color pictures by the display hardware, not necessarily the total number of such colors that can be simultaneously displayed in a given text or graphic mode of any machine. RGB is the most common method to produce colors for displays; so these complete RGB color repertories have every possible combination of R-G-B triplets within any given maximum number of levels per component. For specific hardware and different methods to produce colors than RGB, see the List of computer hardware palettes and the List of video game consoles sections. For various software arrangements and sorts of colors, including other possible full RGB arrangements within 8-bit depth displays, see the List of software palettes section. === Monochrome palettes === These palettes only have shades of gray. === Dichrome palettes === Each permuted pair of red, green, and blue (16-bit color palette, with 65,536 colors). For example, "additive red green" has zero blue and "subtractive red green" has full blue. === Regular RGB palettes === These full RGB palettes employ the same number of bits to store the relative intensity for the red, green and blue components of every image's pixel color. Thus, they have the same number of levels per channel and the total number of possible colors is always the cube of a power of two. It should be understood that 'when developed' many of these formats were directly related to the size of some host computers 'natural word length' in bytes—the amount of memory in bits held by a single memory address such that the CPU can grab or put it in one operation. === Non-regular RGB palettes === These are also RGB palettes, in the sense defined above (except for 4-bit RGBI, which has an intensity bit that affects all channels at once), but either they do not have the same number of levels for each primary channel, or the numbers are not powers of two, so are not represented as separate bit fields. All of these have been used in popular personal computers. == List of software palettes == Systems that use a 4-bit or 8-bit pixel depth can display up to 16 or 256 colors simultaneously. Many personal computers in the later 1980s and early 1990s displayed at most 256 different colors, freely selected by software (either by the user or by a program) from their wider hardware's color palette. Usual selections of colors in limited subsets (generally 16 or 256) of the full palette includes some RGB level arrangements commonly used with the 8 bpp palettes as master palettes or universal palettes (i.e., palettes for multipurpose uses). These are some representative software palettes, but any selection can be made in such types of systems. === System specific palettes === These are selections of colors officially employed as system palettes in some popular operating systems for personal computers that feature 8-bit displays. === RGB arrangements === These are selections of colors based on evenly ordered RGB levels, mainly used as master palettes to display any kind of image within the limitations of the 8-bit pixel depth. === Other common uses of software palettes === == List of computer hardware palettes == In old personal computers and terminals that offered color displays, some color palettes were chosen algorithmically to provide the most diverse set of colors for a given palette size, and others were chosen to assure the availability of certain colors. In many early home computers, especially when the palette choices were determined at the hardware level by resistor combinations, the palette was determined by the manufacturer. Many early models output composite video colors. When seen on TV devices, the perception of the colors may not correspond with the value levels for the color values employed (most noticeable with NTSC TV color system). For current RGB display systems for PCs (Super VGA, etc.), see the 16-bit RGB and 24-bit RGB for High Color (thousands) and True Color (millions of colors) modes. For video game consoles, see the List of video game consoles section. For every model, their main different graphical color modes are listed based exclusively in the way they handle colors on screen, not all their different screen modes. The list is organized roughly historically by video hardware, not by branch. They are listed according to the original model of each system, which means that extended versions, clones, and compatibles also support the original palette. === Terminals and 8-bit machines === === 16-bit machines === === Video game console palettes === Color palettes of some of the most popular video game consoles. The criteria are the same as those of the List of computer hardware palettes section.

    Read more →
  • Random feature

    Random feature

    Random features (RF) are a technique used in machine learning to approximate kernel methods, introduced by Ali Rahimi and Ben Recht in their 2007 paper "Random Features for Large-Scale Kernel Machines", and extended by. RF uses a Monte Carlo approximation to kernel functions by randomly sampled feature maps. It is used for datasets that are too large for traditional kernel methods like support vector machine, kernel ridge regression, and gaussian process. == Mathematics == === Kernel method === Given a feature map ϕ : R d → V {\textstyle \phi :\mathbb {R} ^{d}\to V} , where V {\textstyle V} is a Hilbert space (more specifically, a reproducing kernel Hilbert space), the kernel trick replaces inner products in feature space ⟨ ϕ ( x i ) , ϕ ( x j ) ⟩ V {\displaystyle \langle \phi (x_{i}),\phi (x_{j})\rangle _{V}} by a kernel function k ( x i , x j ) : R d × R d → R {\displaystyle k(x_{i},x_{j}):\mathbb {R} ^{d}\times \mathbb {R} ^{d}\to \mathbb {R} } Kernel methods replaces linear operations in high-dimensional space by operations on the kernel matrix: K X := [ k ( x i , x j ) ] i , j ∈ 1 : N {\displaystyle K_{X}:=[k(x_{i},x_{j})]_{i,j\in 1:N}} where N {\textstyle N} is the number of data points. === Random kernel method === The problem with kernel methods is that the kernel matrix K X {\textstyle K_{X}} has size N × N {\textstyle N\times N} . This becomes computationally infeasible when N {\textstyle N} reaches the order of a million. The random kernel method replaces the kernel function k {\textstyle k} by an inner product in low-dimensional feature space R D {\textstyle \mathbb {R} ^{D}} : k ( x , y ) ≈ ⟨ z ( x ) , z ( y ) ⟩ {\displaystyle k(x,y)\approx \langle z(x),z(y)\rangle } where z {\textstyle z} is a randomly sampled feature map z : R d → R D {\textstyle z:\mathbb {R} ^{d}\to \mathbb {R} ^{D}} . This converts kernel linear regression into linear regression in feature space, kernel SVM into SVM in feature space, etc. Since we have K X ≈ Z X T Z X {\displaystyle K_{X}\approx Z_{X}^{T}Z_{X}} where Z X = [ z ( x 1 ) , … , z ( x N ) ] {\displaystyle Z_{X}=[z(x_{1}),\dots ,z(x_{N})]} , these methods no longer involve matrices of size O ( N 2 ) {\textstyle O(N^{2})} , but only random feature matrices of size O ( D N ) {\textstyle O(DN)} . == Random Fourier feature == === Radial basis function kernel === The radial basis function (RBF) kernel on two samples x i , x j ∈ R d {\displaystyle x_{i},x_{j}\in \mathbb {R} ^{d}} is defined as k ( x i , x j ) = exp ⁡ ( − ‖ x i − x j ‖ 2 2 σ 2 ) {\displaystyle k(x_{i},x_{j})=\exp \left(-{\frac {\|x_{i}-x_{j}\|^{2}}{2\sigma ^{2}}}\right)} where ‖ x i − x j ‖ 2 {\displaystyle \|x_{i}-x_{j}\|^{2}} is the squared Euclidean distance and σ {\displaystyle \sigma } is a free parameter defining the shape of the kernel. It can be approximated by a random Fourier feature map z : R d → R 2 D {\displaystyle z:\mathbb {R} ^{d}\to \mathbb {R} ^{2D}} : z ( x ) := 1 D [ cos ⁡ ⟨ ω 1 , x ⟩ , sin ⁡ ⟨ ω 1 , x ⟩ , … , cos ⁡ ⟨ ω D , x ⟩ , sin ⁡ ⟨ ω D , x ⟩ ] T {\displaystyle z(x):={\frac {1}{\sqrt {D}}}[\cos \langle \omega _{1},x\rangle ,\sin \langle \omega _{1},x\rangle ,\ldots ,\cos \langle \omega _{D},x\rangle ,\sin \langle \omega _{D},x\rangle ]^{T}} where ω 1 , . . . , ω D {\displaystyle \omega _{1},...,\omega _{D}} are IID samples from the multidimensional normal distribution N ( 0 , σ − 2 I ) {\displaystyle N(0,\sigma ^{-2}I)} . Since cos , sin {\displaystyle \cos ,\sin } are bounded, there is a stronger convergence guarantee by Hoeffding's inequality. === Random Fourier features === By Bochner's theorem, the above construction can be generalized to arbitrary positive definite shift-invariant kernel k ( x , y ) = k ( x − y ) {\displaystyle k(x,y)=k(x-y)} . Define its Fourier transform p ( ω ) = 1 2 π ∫ R d e − j ⟨ ω , Δ ⟩ k ( Δ ) d Δ {\displaystyle p(\omega )={\frac {1}{2\pi }}\int _{\mathbb {R} ^{d}}e^{-j\langle \omega ,\Delta \rangle }k(\Delta )d\Delta } then ω 1 , . . . , ω D {\displaystyle \omega _{1},...,\omega _{D}} are sampled IID from the probability distribution with probability density p {\displaystyle p} . This applies for other kernels like the Laplace kernel and the Cauchy kernel. === Neural network interpretation === Given a random Fourier feature map z {\displaystyle z} , training the feature on a dataset by featurized linear regression is equivalent to fitting complex parameters θ 1 , … , θ D ∈ C {\displaystyle \theta _{1},\dots ,\theta _{D}\in \mathbb {C} } such that f θ ( x ) = R e ( ∑ k θ k e i ⟨ ω k , x ⟩ ) {\displaystyle f_{\theta }(x)=\mathrm {Re} \left(\sum _{k}\theta _{k}e^{i\langle \omega _{k},x\rangle }\right)} which is a neural network with a single hidden layer, with activation function t ↦ e i t {\displaystyle t\mapsto e^{it}} , zero bias, and the parameters in the first layer frozen. In the overparameterized case, when 2 D ≥ N {\displaystyle 2D\geq N} , the network linearly interpolates the dataset { ( x i , y i ) } i ∈ 1 : N {\displaystyle \{(x_{i},y_{i})\}_{i\in 1:N}} , and the network parameters is the least-norm solution: θ ^ = arg ⁡ min θ ∈ C D , f θ ( x k ) = y k ∀ k ∈ 1 : N ‖ θ ‖ {\displaystyle {\hat {\theta }}=\arg \min _{\theta \in \mathbb {C} ^{D},f_{\theta }(x_{k})=y_{k}\forall k\in 1:N}\|\theta \|} At the limit of D → ∞ {\displaystyle D\to \infty } , the L2 norm ‖ θ ^ ‖ → ‖ f K ‖ H {\displaystyle \|{\hat {\theta }}\|\to \|f_{K}\|_{H}} where f K {\displaystyle f_{K}} is the interpolating function obtained by the kernel regression with the original kernel, and ‖ ⋅ ‖ H {\displaystyle \|\cdot \|_{H}} is the norm in the reproducing kernel Hilbert space for the kernel. == Other examples == === Random binning features === A random binning features map partitions the input space using randomly shifted grids at randomly chosen resolutions and assigns to an input point a binary bit string that corresponds to the bins in which it falls. The grids are constructed so that the probability that two points x i , x j ∈ R d {\displaystyle x_{i},x_{j}\in \mathbb {R} ^{d}} are assigned to the same bin is proportional to K ( x i , x j ) {\displaystyle K(x_{i},x_{j})} . The inner product between a pair of transformed points is proportional to the number of times the two points are binned together, and is therefore an unbiased estimate of K ( x i , x j ) {\displaystyle K(x_{i},x_{j})} . Since this mapping is not smooth and uses the proximity between input points, Random Binning Features works well for approximating kernels that depend only on the L 1 {\displaystyle L_{1}} distance between datapoints. === Orthogonal random features === Orthogonal random features uses a random orthogonal matrix instead of a random Fourier matrix. == Historical context == In NIPS 2006, deep learning had just become competitive with linear models like PCA and linear SVMs for large datasets, and people speculated about whether it could compete with kernel SVMs. However, there was no way to train kernel SVM on large datasets. The two authors developed the random feature method to train those. It was then found that the O ( 1 / D ) {\displaystyle O(1/D)} variance bound did not match practice: the variance bound predicts that approximation to within 0.01 {\displaystyle 0.01} requires D ∼ 10 4 {\displaystyle D\sim 10^{4}} , but in practice required only ∼ 10 2 {\displaystyle \sim 10^{2}} . Attempting to discover what caused this led to the subsequent two papers.

    Read more →
  • ImageNet

    ImageNet

    The ImageNet project is a large visual database designed for use in visual object recognition software research. More than 14 million images have been hand-annotated by the project to indicate what objects are pictured and in at least one million of the images, bounding boxes are also provided. ImageNet contains more than 20,000 categories, with a typical category, such as "balloon" or "strawberry", consisting of several hundred images. The database of annotations of third-party image URLs is freely available directly from ImageNet, though the actual images are not owned by ImageNet. Since 2010, the ImageNet project runs an annual software contest, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), where software programs compete to correctly classify and detect objects and scenes. The challenge uses a "trimmed" list of one thousand non-overlapping classes. == History == AI researcher Fei-Fei Li began working on the idea for ImageNet in 2006. At a time when most AI research focused on models and algorithms, Li wanted to expand and improve the data available to train AI algorithms. In 2007, Li met with Princeton professor Christiane Fellbaum, one of the creators of WordNet, to discuss the project. As a result of this meeting, Li went on to build ImageNet starting from the roughly 22,000 nouns of WordNet and using many of its features. She was also inspired by a 1987 estimate that the average person recognizes roughly 30,000 different kinds of objects. As an assistant professor at Princeton, Li assembled a team of researchers to work on the ImageNet project. They used Amazon Mechanical Turk to help with the classification of images. Labeling started in July 2008 and ended in April 2010. It took 49K workers from 167 countries filtering and labeling over 160M candidate images. They had enough budget to have each of the 14 million images labelled three times. The original plan called for 10,000 images per category, for 40,000 categories at 400 million images, each verified 3 times. They found that humans can classify at most 2 images/sec. At this rate, it was estimated to take 19 human-years of labor (without rest). They presented their database for the first time as a poster at the 2009 Conference on Computer Vision and Pattern Recognition (CVPR) in Florida, titled "ImageNet: A Preview of a Large-scale Hierarchical Dataset". The poster was reused at Vision Sciences Society 2009. In 2009, Alex Berg suggested adding object localization as a task. Li approached PASCAL Visual Object Classes contest in 2009 for a collaboration. It resulted in the subsequent ImageNet Large Scale Visual Recognition Challenge starting in 2010, which has 1000 classes and object localization, as compared to PASCAL VOC which had just 20 classes and 19,737 images (in 2010). === Significance for deep learning === On 30 September 2012, a convolutional neural network (CNN) called AlexNet achieved a top-5 error of 15.3% in the ImageNet 2012 Challenge, more than 10.8 percentage points lower than that of the runner-up. Using convolutional neural networks was feasible due to the use of graphics processing units (GPUs) during training, an essential ingredient of the deep learning revolution. According to The Economist, "Suddenly people started to pay attention, not just within the AI community but across the technology industry as a whole." In 2015, AlexNet was outperformed by Microsoft's very deep CNN with over 100 layers, which won the ImageNet 2015 contest, having 3.57% error on the test set. Andrej Karpathy estimated in 2014 that with concentrated effort, he could reach 5.1% error rate, and ~10 people from his lab reached ~12-13% with less effort. It was estimated that with maximal effort, a human could reach 2.4%. == Dataset == ImageNet crowdsources its annotation process. Image-level annotations indicate the presence or absence of an object class in an image, such as "there are tigers in this image" or "there are no tigers in this image". Object-level annotations provide a bounding box around the (visible part of the) indicated object. ImageNet uses a variant of the broad WordNet schema to categorize objects, augmented with 120 categories of dog breeds to showcase fine-grained classification. In 2012, ImageNet was the world's largest academic user of Mechanical Turk. The average worker identified 50 images per minute. The original plan of the full ImageNet would have roughly 50M clean, diverse and full resolution images spread over approximately 50K synsets. This was not achieved. The summary statistics given on April 30, 2010: Total number of non-empty synsets: 21841 Total number of images: 14,197,122 Number of images with bounding box annotations: 1,034,908 Number of synsets with SIFT features: 1000 Number of images with SIFT features: 1.2 million === Categories === The categories of ImageNet were filtered from the WordNet concepts. Each concept, since it can contain multiple synonyms (for example, "kitty" and "young cat"), so each concept is called a "synonym set" or "synset". There were more than 100,000 synsets in WordNet 3.0, majority of them are nouns (80,000+). The ImageNet dataset filtered these to 21,841 synsets that are countable nouns that can be visually illustrated. Each synset in WordNet 3.0 has a "WordNet ID" (wnid), which is a concatenation of part of speech and an "offset" (a unique identifying number). Every wnid starts with "n" because ImageNet only includes nouns. For example, the wnid of synset "dog, domestic dog, Canis familiaris" is "n02084071". The categories in ImageNet fall into 9 levels, from level 1 (such as "mammal") to level 9 (such as "German shepherd"). === Image format === The images were scraped from online image search (Google, Picsearch, MSN, Yahoo, Flickr, etc) using synonyms in multiple languages. For example: German shepherd, German police dog, German shepherd dog, Alsatian, ovejero alemán, pastore tedesco, 德国牧羊犬. ImageNet consists of images in RGB format with varying resolutions. For example, in ImageNet 2012, "fish" category, the resolution ranges from 4288 x 2848 to 75 x 56. In machine learning, these are typically preprocessed into a standard constant resolution, and whitened, before further processing by neural networks. For example, in PyTorch, ImageNet images are by default normalized by dividing the pixel values so that they fall between 0 and 1, then subtracting by [0.485, 0.456, 0.406], then dividing by [0.229, 0.224, 0.225]. These are the mean and standard deviations for ImageNet, so this whitens the input data. === Labels and annotations === Each image is labelled with exactly one wnid. Dense SIFT features (raw SIFT descriptors, quantized codewords, and coordinates of each descriptor/codeword) for ImageNet-1K were available for download, designed for bag of visual words. The bounding boxes of objects were available for about 3000 popular synsets with on average 150 images in each synset. Furthermore, some images have attributes. They released 25 attributes for ~400 popular synsets: Color: black, blue, brown, gray, green, orange, pink, red, violet, white, yellow Pattern: spotted, striped Shape: long, round, rectangular, square Texture: furry, smooth, rough, shiny, metallic, vegetation, wooden, wet === ImageNet-21K === The full original dataset is referred to as ImageNet-21K. ImageNet-21k contains 14,197,122 images divided into 21,841 classes. Some papers round this up and name it ImageNet-22k. The full ImageNet-21k was released in Fall of 2011, as fall11_whole.tar. There is no official train-validation-test split for ImageNet-21k. Some classes contain only 1-10 samples, while others contain thousands. === ImageNet-1K === There are various subsets of the ImageNet dataset used in various context, sometimes referred to as "versions". One of the most highly used subsets of ImageNet is the "ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012–2017 image classification and localization dataset". This is also referred to in the research literature as ImageNet-1K or ILSVRC2017, reflecting the original ILSVRC challenge that involved 1,000 classes. ImageNet-1K contains 1,281,167 training images, 50,000 validation images and 100,000 test images. Each category in ImageNet-1K is a leaf category, meaning that there are no child nodes below it, unlike ImageNet-21K. For example, in ImageNet-21K, there are some images categorized as simply "mammal", whereas in ImageNet-1K, there are only images categorized as things like "German shepherd", since there are no child-words below "German shepherd". === Later developments === In the WordNet they built ImageNet on, there were 2832 synsets in the "person" subtree. During 2018--2020 period, they removed the download of the ImageNet-21k as they went through extensive filtering in these person synsets. Out of these 2832 synsets, 1593 were deemed "potentially offensive". Out of the remaining 1239, 1081 were deemed not really "visual". The result was that only 158 syn

    Read more →
  • Biometric device

    Biometric device

    A biometric device is a security identification and authentication device. Such devices use automated methods of verifying or recognising the identity of a living person based on a physiological or behavioral characteristic. These characteristics include fingerprints, facial images, iris and voice recognition. == History == Biometric devices have been in use for thousands of years. Non-automated biometric devices have been in use since 500 BC, when ancient Babylonians would sign their business transactions by pressing their fingertips into clay tablets. Automation in biometric devices was first seen in the 1960s. The Federal Bureau of Investigation (FBI) in the 1960s, introduced the Indentimat, which started checking for fingerprints to maintain criminal records. The first systems measured the shape of the hand and the length of the fingers. Although discontinued in the 1980s, the system set a precedent for future Biometric Devices. == Subgroups == The characteristic of the human body is used to access information by the users. According to these characteristics, the sub-divided groups are Chemical biometric devices: Analyses the segments of the DNA to grant access to the users. Visual biometric devices: Analyses the visual features of the humans to grant access which includes iris recognition, face recognition, Finger recognition, and Retina Recognition. Behavioral biometric devices: Analyses the Walking Ability and Signatures (velocity of sign, width of sign, pressure of sign) distinct to every human. Olfactory biometric devices: Analyses the odor to distinguish between varied users. Auditory biometric devices: Analyses the voice to determine the identity of a speaker for accessing control. == Uses == === Workplace === Biometrics are being used to establish better and accessible records of the hour's employee's work. With the increase in "Buddy Punching" (a case where employees clocked out coworkers and fraudulently inflated their work hours) employers have looked towards new technology like fingerprint recognition to reduce such fraud. Additionally, employers are also faced with the task of proper collection of data such as entry and exit times. Biometric devices make for largely fool proof and reliable ways of enabling to collect data as employees have to be present to enter biometric details which are unique to them. === Immigration === As the demand for air travel grows and more people travel, modern-day airports have to implement technology in such a way that there are no long queues. Biometrics are being implemented in more and more airports as they enable quick recognition of passengers and hence lead to lower volume of people standing in queues. One such example is of the Dubai International Airport which plans to make immigration counters a relic of the past as they implement IRIS on the move technology (IOM) which should help the seamless departures and arrivals of passengers at the airport. === Handheld and personal devices === Fingerprint sensors can be found on mobile devices. The fingerprint sensor is used to unlock the device and authorize actions, like money and file transfers, for example. It can be used to prevent a device from being used by an unauthorized person. It is also used in attendance in number of colleges and universities. == Present day biometric devices == === Personal signature verification systems === This is one of the most highly recognised and acceptable biometrics in corporate surroundings. This verification has been taken one step further by capturing the signature while taking into account many parameters revolving around this like the pressure applied while signing, the speed of the hand movement and the angle made between the surface and the pen used to make the signature. This system also has the ability to learn from users as signature styles vary for the same user. Hence by taking a sample of data, this system is able to increase its own accuracy. === Iris recognition system === Iris recognition involves the device scanning the pupil of the subject and then cross referencing that to data stored on the database. It is one of the most secure forms of authentication, as while fingerprints can be left behind on surfaces, iris prints are extremely hard to be stolen. Iris recognition is widely applied by organisations dealing with the masses, one being the Aadhaar identification system issued by the Government of India to keep records of its population. The reason for this is that iris recognition makes use of iris prints of humans, which change little over the course of one's lifetime. == Problems with present day biometric devices == === Biometric spoofing === Biometric spoofing is a method of fooling a biometric identification management system, where a counterfeit mold is presented in front of the biometric scanner. This counterfeit mold emulates the unique biometric attributes of an individual so as to confuse the system between the artifact and the real biological target and gain access to sensitive data/materials. One such high-profile case of Biometric spoofing came to the limelight when it was found that German Defence Minister, Ursula von der Leyen's fingerprint had been successfully replicated by Chaos Computer Club. The group used high quality camera lenses and shot images from 6 feet away. They used a professional finger software and mapped the contours of the Ministers thumbprint. Although progress has been made to stop spoofing. Using the principle of pulse oximetry — the liveliness of the test subject is taken into account by measure of blood oxygenation and the heart rate. This reduces attacks like the ones mentioned above, although these methods aren't commercially applicable as costs of implementation are high. This reduces their real world application and hence makes biometrics insecure until these methods are commercially viable. === Accuracy === Accuracy is a major issue with biometric recognition. Passwords are still extremely popular, because a password is static in nature, while biometric data can be subject to change (such as one's voice becoming heavier due to puberty, or an accident to the face, which could lead to improper reading of facial scan data). When testing voice recognition as a substitute to PIN-based systems, Barclays reported that their voice recognition system is 95 percent accurate. This statistic means that many of its customers' voices might still not be recognised even when correct. This uncertainty revolving around the system could lead to slower adoption of biometric devices, continuing the reliance of traditional password-based methods. == Benefits of biometric devices over traditional methods of authentication == Biometric data cannot be lent and hacking of Biometric data is complicated hence it makes it safer to use than traditional methods of authentication like passwords which can be lent and shared. Passwords do not have the ability to judge the user but rely only on the data provided by the user, which can easily be stolen while Biometrics work on the uniqueness of each individual. Passwords can be forgotten and recovering them can take time, whereas Biometric devices rely on biometric data which tends to be unique to a person, hence there is no risk of forgetting the authentication data. A study conducted among Yahoo! users found that at least 1.5 percent of Yahoo users forgot their passwords every month, hence this makes accessing services more lengthy for consumers as the process of recovering passwords is lengthy. These shortcomings make Biometric devices more efficient and reduces effort for the end user. == Future == Researchers are targeting the drawbacks of present-day biometric devices and developing to reduce problems like biometric spoofing and inaccurate intake of data. Technologies which are being developed are- The United States Military Academy are developing an algorithm that allows identification through the ways each individual interacts with their own computers; this algorithm considers unique traits like typing speed, rhythm of writing and common spelling mistakes. This data allows the algorithm to create a unique profile for each user by combining their multiple behavioral and stylometric information. This can be very difficult to replicate collectively. A recent innovation by Kenneth Okereafor and, presented an optimized and secure design of applying biometric liveness detection technique using a trait randomization approach. This novel concept potentially opens up new ways of mitigating biometric spoofing more accurately, and making impostor predictions intractable or very difficult in future biometric devices. A simulation of Kenneth Okereafor's biometric liveness detection algorithm using a 3D multi-biometric framework consisting of 15 liveness parameters from facial print, finger print and iris pattern traits resulted in a system efficiency of the 99.2% over a cardinality of 125 distinct randomization combinat

    Read more →
  • PeduliLindungi

    PeduliLindungi

    SatuSehat (Indonesian for "one health"), formerly PeduliLindungi (roughly "care to protect"), is a national integrated health data exchange platform, jointly developed by the Indonesian Ministry of Communication and Information Technology (Kemenkominfo), in partnership with Committee for COVID-19 Response and National Economic Recovery (KPCPEN), Ministry of Health (Kemenkes), Ministry of State-Owned Enterprises (KemenBUMN), and Telkom Indonesia. The SatuSehat platform aims to facilitate data accessibility and service efficiency for health providers and the government, and assist the public as a tool to access their own electronic medical record data. This app was the official COVID-19 contact tracing app used for digital contact tracing in Indonesia, and originally known as TraceTogether but later changed because Singapore had its app using the same name. == Implementation == On 23 August 2021, Coordinating Minister for Maritime and Investments Affairs, Luhut Binsar Panjaitan, encouraged the government to make this app a mandatory requirement before using public transportations, such as train, bus, ferry, and plane. Furthermore, citizen must have installed the app before entering shopping malls, factories, and sport venues. Every person who have received at least a dose of vaccine will receive a vaccine card and vaccination certificate which can be downloaded from the app. In December 2022, with the revocation of PPKM (Community Activities Restrictions Enforcement) starting from 1 January 2023, Ministry of Health issued a statement that the usage of the app is not a governmental mandatory requirement as it used to be. === Transition into a citizen health app === On 7 September 2022, it was announced that the app would be modified to become a citizen health app, capitalising on the reach of the app and the existing work done around the app. On 28 February 2023, the authorities announced that the app was rebranded to SATUSEHAT Mobile (lit. 'OneHealth Mobile'), with existing users needing to update the PeduliLindungi app and re-synchronise their COVID-19 related health information. The re-branded app would eventually be an all-in-one health service and records retrieval app for Indonesians. == Controversy == It was reported that the app requires continuous access to the phone's files, media, and GPS, which quickly drains the battery. Allowing location access only during use or denying it altogether will render the app unusable. This stands in stark contrast to COVID-19 apps used in other countries that only utilize Bluetooth and do not require any additional permissions. In September 2021, stored personal data of at least 1.3 million Indonesian residents were leaked online, including the vaccine certificate of President Joko Widodo. The data leak was also reported on eHAC (electronic Health Alert Card), a mandatory app used for air passengers.

    Read more →
  • Scale space

    Scale space

    Scale-space theory is a framework for multi-scale signal representation developed by the computer vision, image processing and signal processing communities with complementary motivations from physics and biological vision. It is a formal theory for handling image structures at different scales, by representing an image as a one-parameter family of smoothed images, the scale-space representation, parametrized by the size of the smoothing kernel used for suppressing fine-scale structures. The parameter t {\displaystyle t} in this family is referred to as the scale parameter, with the interpretation that image structures of spatial size smaller than about t {\displaystyle {\sqrt {t}}} have largely been smoothed away in the scale-space level at scale t {\displaystyle t} . The main type of scale space is the linear (Gaussian) scale space, which has wide applicability as well as the attractive property of being possible to derive from a small set of scale-space axioms. The corresponding scale-space framework encompasses a theory for Gaussian derivative operators, which can be used as a basis for expressing a large class of visual operations for computerized systems that process visual information. This framework also allows visual operations to be made scale invariant, which is necessary for dealing with the size variations that may occur in image data, because real-world objects may be of different sizes and in addition the distance between the object and the camera may be unknown and may vary depending on the circumstances. == Definition == The notion of scale space applies to signals of arbitrary numbers of variables. The most common case in the literature applies to two-dimensional images, which is what is presented here. Consider a given image f {\displaystyle f} where f ( x , y ) {\displaystyle f(x,y)} is the greyscale value of the pixel at position ( x , y ) {\displaystyle (x,y)} . The linear (Gaussian) scale-space representation of f {\displaystyle f} is a family of derived signals L ( x , y ; t ) {\displaystyle L(x,y;t)} defined by the convolution of f ( x , y ) {\displaystyle f(x,y)} with the two-dimensional Gaussian kernel g ( x , y ; t ) = 1 2 π t e − ( x 2 + y 2 ) / 2 t {\displaystyle g(x,y;t)={\frac {1}{2\pi t}}e^{-(x^{2}+y^{2})/2t}\,} such that L ( ⋅ , ⋅ ; t ) = g ( ⋅ , ⋅ ; t ) ∗ f ( ⋅ , ⋅ ) , {\displaystyle L(\cdot ,\cdot ;t)\ =g(\cdot ,\cdot ;t)f(\cdot ,\cdot ),} where the semicolon in the argument of L {\displaystyle L} implies that the convolution is performed only over the variables x , y {\displaystyle x,y} , while the scale parameter t {\displaystyle t} after the semicolon just indicates which scale level is being defined. This definition of L {\displaystyle L} works for a continuum of scales t ≥ 0 {\displaystyle t\geq 0} , but typically only a finite discrete set of levels in the scale-space representation would be actually considered. The scale parameter t = σ 2 {\displaystyle t=\sigma ^{2}} is the variance of the Gaussian filter and as a limit for t = 0 {\displaystyle t=0} the filter g {\displaystyle g} becomes an impulse function such that L ( x , y ; 0 ) = f ( x , y ) , {\displaystyle L(x,y;0)=f(x,y),} that is, the scale-space representation at scale level t = 0 {\displaystyle t=0} is the image f {\displaystyle f} itself. As t {\displaystyle t} increases, L {\displaystyle L} is the result of smoothing f {\displaystyle f} with a larger and larger filter, thereby removing more and more of the details that the image contains. Since the standard deviation of the filter is σ = t {\displaystyle \sigma ={\sqrt {t}}} , details that are significantly smaller than this value are to a large extent removed from the image at scale parameter t {\displaystyle t} , see the following figures and for graphical illustrations. === Why a Gaussian filter? === When faced with the task of generating a multi-scale representation one may ask: could any filter g of low-pass type and with a parameter t which determines its width be used to generate a scale space? The answer is no, as it is of crucial importance that the smoothing filter does not introduce new spurious structures at coarse scales that do not correspond to simplifications of corresponding structures at finer scales. In the scale-space literature, a number of different ways have been expressed to formulate this criterion in precise mathematical terms. The conclusion from several different axiomatic derivations that have been presented is that the Gaussian scale space constitutes the canonical way to generate a linear scale space, based on the essential requirement that new structures must not be created when going from a fine scale to any coarser scale. Conditions, referred to as scale-space axioms, that have been used for deriving the uniqueness of the Gaussian kernel include linearity, shift invariance, semi-group structure, non-enhancement of local extrema, scale invariance and rotational invariance. In the works, the uniqueness claimed in the arguments based on scale invariance has been criticized, and alternative self-similar scale-space kernels have been proposed. The Gaussian kernel is, however, a unique choice according to the scale-space axiomatics based on causality or non-enhancement of local extrema. === Alternative definition === Equivalently, the scale-space family can be defined as the solution of the diffusion equation (for example in terms of the heat equation), ∂ t L = 1 2 ∇ 2 L , {\displaystyle \partial _{t}L={\frac {1}{2}}\nabla ^{2}L,} with initial condition L ( x , y ; 0 ) = f ( x , y ) {\displaystyle L(x,y;0)=f(x,y)} . This formulation of the scale-space representation L means that it is possible to interpret the intensity values of the image f as a "temperature distribution" in the image plane and that the process that generates the scale-space representation as a function of t corresponds to heat diffusion in the image plane over time t (assuming the thermal conductivity of the material equal to the arbitrarily chosen constant ⁠1/2⁠). Although this connection may appear superficial for a reader not familiar with differential equations, it is indeed the case that the main scale-space formulation in terms of non-enhancement of local extrema is expressed in terms of a sign condition on partial derivatives in the 2+1-D volume generated by the scale space, thus within the framework of partial differential equations. Furthermore, a detailed analysis of the discrete case shows that the diffusion equation provides a unifying link between continuous and discrete scale spaces, which also generalizes to nonlinear scale spaces, for example, using anisotropic diffusion. Hence, one may say that the primary way to generate a scale space is by the diffusion equation, and that the Gaussian kernel arises as the Green's function of this specific partial differential equation. == Motivations == The motivation for generating a scale-space representation of a given data set originates from the basic observation that real-world objects are composed of different structures at different scales. This implies that real-world objects, in contrast to idealized mathematical entities such as points or lines, may appear in different ways depending on the scale of observation. For example, the concept of a "tree" is appropriate at the scale of meters, while concepts such as leaves and molecules are more appropriate at finer scales. For a computer vision system analysing an unknown scene, there is no way to know a priori what scales are appropriate for describing the interesting structures in the image data. Hence, the only reasonable approach is to consider descriptions at multiple scales in order to be able to capture the unknown scale variations that may occur. Taken to the limit, a scale-space representation considers representations at all scales. Another motivation to the scale-space concept originates from the process of performing a physical measurement on real-world data. In order to extract any information from a measurement process, one has to apply operators of non-infinitesimal size to the data. In many branches of computer science and applied mathematics, the size of the measurement operator is disregarded in the theoretical modelling of a problem. The scale-space theory on the other hand explicitly incorporates the need for a non-infinitesimal size of the image operators as an integral part of any measurement as well as any other operation that depends on a real-world measurement. There is a close link between scale-space theory and biological vision. Many scale-space operations show a high degree of similarity with receptive field profiles recorded from the mammalian retina and the first stages in the visual cortex. In these respects, the scale-space framework can be seen as a theoretically well-founded paradigm for early vision, which in addition has been thoroughly tested by algorithms and experiments. == Gaussian derivatives == At any scale in scale space, we c

    Read more →
  • Image-based modeling and rendering

    Image-based modeling and rendering

    In computer graphics and computer vision, image-based modeling and rendering (IBMR) methods rely on a set of two-dimensional images of a scene to generate a three-dimensional model and then render some novel views of this scene. The traditional approach of computer graphics has been used to create a geometric model in 3D and try to reproject it onto a two-dimensional image. Computer vision, conversely, is mostly focused on detecting, grouping, and extracting features (edges, faces, etc.) present in a given picture and then trying to interpret them as three-dimensional clues. Image-based modeling and rendering allows the use of multiple two-dimensional images in order to generate directly novel two-dimensional images, skipping the manual modeling stage. == Light modeling == Instead of considering only the physical model of a solid, IBMR methods usually focus more on light modeling. The fundamental concept behind IBMR is the plenoptic illumination function which is a parametrisation of the light field. The plenoptic function describes the light rays contained in a given volume. It can be represented with seven dimensions: a ray is defined by its position ( x , y , z ) {\displaystyle (x,y,z)} , its orientation ( θ , ϕ ) {\displaystyle (\theta ,\phi )} , its wavelength ( λ ) {\displaystyle (\lambda )} and its time ( t ) {\displaystyle (t)} : P ( x , y , z , θ , ϕ , λ , t ) {\displaystyle P(x,y,z,\theta ,\phi ,\lambda ,t)} . IBMR methods try to approximate the plenoptic function to render a novel set of two-dimensional images from another. Given the high dimensionality of this function, practical methods place constraints on the parameters in order to reduce this number (typically to 2 to 4). == IBMR methods and algorithms == View morphing generates a transition between images Panoramic imaging renders panoramas using image mosaics of individual still images Lumigraph relies on a dense sampling of a scene Space carving generates a 3D model based on a photo-consistency check

    Read more →
  • Tokken

    Tokken

    Tokken is a payment system and mobile app most known for being a legal and secure option for businesses transactions within the cannabis industry, because of its compliance with bank requirements. The startup company was created by Lamine Zarrad, a former regulator at the Office of the Comptroller of the Currency. == Operability == In order for a person to start using the app, they need to provide evidence, in the form of bioidentification data and mobile carrier records, that they can legally purchase weed. After they have been verified, customers can pay directly through the app at any dispensary that is using Tokken. Tokken turns credit card transactions into a digital token, which can be exchanged back for money that can later be deposited into a bank account. All transactions are logged publicly through a blockchain leger, making the process both anonymous and verified. === Banking services === Tokken has a "pay taxes" function which enables dispensaries to pay their taxes directly to the department.

    Read more →
  • Tandem Money

    Tandem Money

    Tandem is one of the UK's original challenger banks. Tandem is a digital bank with a mobile app, and no branches. The acquisition of Harrods Bank in 2017 allowed the company to provide services using the former's banking licence. Tandem Bank Limited is authorised by the Prudential Regulation Authority and regulated by the Financial Conduct Authority. Tandem has offices across the UK in Blackpool, Cardiff, Durham and London, employing over 500 people. == History == The company was founded by Ricky Knox, Matt Cooper and Michael Kent in 2014. In December 2016, Tandem announced that it had secured a £35 million investment from The Sanpower Group, the Chinese company that also owned the department store House of Fraser; however, £29 million of this investment was later revoked by Sanpower over concerns that the Chinese Government would object to the investment following increased restrictions on outbound investment in China. This resulted in a delay in the launch of Tandem's savings products, which, at the time of the revocation, was expected imminently and, more importantly, meant that Tandem volunteered the return of their banking license but retained all other permissions. In April 2018, Tandem launched fixed-term savings accounts, offering one-, two- and three-year terms through its app. === Acquisitions === In August 2017, it was announced that Tandem would fully acquire Harrods Bank, founded in 1893, in a deal that would bring a near-£200m loan book, over £300m of deposits and nearly £80 million of capital. Prior to its sale to Tandem Money, Harrods Bank catered for high-net-worth (HNW) individuals and operated from the Harrods store in Knightsbridge, London. It offered a variety of personal and business current and savings accounts, mortgages, foreign currency and gold bullion trading services. On 7 August 2017, Tandem Money Limited announced a deal to acquire 100% of Harrods Bank Limited shares. The purchase deal closed successfully on 11 January 2018. In March 2018, Tandem agreed to acquire Pariti Technologies Limited, developers of the Pariti money management application. In August 2020 Tandem acquired green home improvement loan specialists Allium Lending Group. It was announced on 8 February 2021 that Tandem had agreed to purchase the mortgage book from private bank Bank and Clients, consisting of 300 B&C customers for an undisclosed amount. In January 2022 Tandem Bank acquired consumer lender Oplo, creating a combined business with £1.2 billion of total assets. In April 2023, it was announced that Tandem had acquired money-sharing app Loop Money. At the time of the purchase, one of Loop's founders – Paul Pester – was also chairman at Tandem. == Features == Tandem Bank offers customers savings, mortgages, personal and secured loans, green home improvement loans and motor finance. In November 2022, the bank launched its new Tandem Marketplace, providing information and resources to help promote greener living.

    Read more →
  • Leakage (machine learning)

    Leakage (machine learning)

    In statistics and machine learning, leakage (also known as data leakage or target leakage) refers to the use of information during model training that would not be available at prediction time. This results in overly optimistic performance estimates, as the model appears to perform better during evaluation than it actually would in a production environment. Leakage is often subtle and indirect, making it difficult to detect and eliminate. It can lead a statistician or modeler to select a suboptimal model, which may be outperformed by a leakage-free alternative. == Leakage modes == Leakage can occur at multiple stages of the machine learning workflow. Broadly, its sources can be divided into two categories: those arising from features and those arising from training examples. === Feature leakage === Feature or column-wise leakage is caused by the inclusion of columns which are one of the following: a duplicate label, a proxy for the label, or the label itself. These features, known as anachronisms, will not be available when the model is used for predictions, and result in leakage if included when the model is trained. For example, including a "MonthlySalary" column when predicting "YearlySalary"; or "MinutesLate" when predicting "IsLate". === Training example leakage === Row-wise leakage is caused by improper sharing of information between rows of data. Types of row-wise leakage include: Premature featurization; leaking from premature featurization before Cross-validation/Train/Test split (must fit MinMax/ngrams/etc on only the train split, then transform the test set) Duplicate rows between train/validation/test (for example, oversampling a dataset to pad its size before splitting; or, different rotations/augmentations of a single image; bootstrap sampling before splitting; or duplicating rows to up sample the minority class) Non-independent and identically distributed random (non-IID) data Time leakage (for example, splitting a time-series dataset randomly instead of newer data in test set using a train/test split or rolling-origin cross-validation) Group leakage—not including a grouping split column (for example, Andrew Ng's group had 100k x-rays of 30k patients, meaning ~3 images per patient. The paper used random splitting instead of ensuring that all images of a patient were in the same split. Hence the model partially memorized the patients instead of learning to recognize pneumonia in chest x-rays.) A 2023 review found data leakage to be "a widespread failure mode in machine-learning (ML)-based science", having affected at least 294 academic publications across 17 disciplines, and causing a potential reproducibility crisis. == Detection == Data leakage in machine learning can be detected through various methods, focusing on performance analysis, feature examination, data auditing, and model behavior analysis. Performance-wise, unusually high accuracy or significant discrepancies between training and test results often indicate leakage. Inconsistent cross-validation outcomes may also signal issues. Feature examination involves scrutinizing feature importance rankings and ensuring temporal integrity in time series data. A thorough audit of the data pipeline is crucial, reviewing pre-processing steps, feature engineering, and data splitting processes. Detecting duplicate entries across dataset splits is also important. For language models, the Min-K% method can detect the presence of data in a pretraining dataset. It presents a sentence suspected to be present in the pretraining dataset, and computes the log-likelihood of each token, then compute the average of the lowest K of these. If this exceeds a threshold, then the sentence is likely present. This method is improved by comparing against a baseline of the mean and variance. Analyzing model behavior can reveal leakage. Models relying heavily on counter-intuitive features or showing unexpected prediction patterns warrant investigation. Performance degradation over time when tested on new data may suggest earlier inflated metrics due to leakage. Advanced techniques include backward feature elimination, where suspicious features are temporarily removed to observe performance changes. Using a separate hold-out dataset for final validation before deployment is advisable.

    Read more →
  • Database virtualization

    Database virtualization

    Database virtualization is the decoupling of the database layer, which lies between the storage and application layers within the application stack. Virtualization of the database layer enables a shift away from the physical, toward the logical or virtual. Virtualization enables compute and storage resources to be pooled and allocated on demand. This enables both the sharing of single server resources for multi-tenancy, as well as the pooling of server resources into a single logical database or cluster. In both cases, database virtualization provides increased flexibility, more granular and efficient allocation of pooled resources, and more scalable computing. == Virtual data partitioning == The act of partitioning data stores as a database grows has been in use for several decades. There are two primary ways that data has been partitioned inside legacy data management systems: Shared-data databases: an architecture that assumes all database cluster nodes share a single partition. Inter-node communications are used to synchronize update activities performed by different nodes on the cluster. Shared-data data management systems are limited to single-digit node clusters. Shared-nothing databases: an architecture in which all data is segregated to internally managed partitions with clear, well-defined data location boundaries. Shared-nothing databases require manual partition management. In virtual partitioning, logical data is abstracted from physical data by autonomously creating and managing large numbers of data partitions (100s to 1000s). Because they are autonomously maintained, the resources required to manage the partitions are minimal. This kind of massive partitioning results in: Partitions that are small, efficiently managed, and load-balanced. Systems that do not require re-partitioning events to define additional partitions, even when the hardware is changed. “Shared-data” and “shared-nothing” architectures allow scalability through multiple data partitions and cross-partition querying and transaction processing without full partition scanning. == Horizontal data partitioning == Partitioning database sources from consumers is a fundamental concept. With greater numbers of database sources, inserting a horizontal data virtualization layer between the sources and consumers helps address this complexity. Rick van der Lans, the author of multiple books on SQL and relational databases, has defined data virtualization as "the process of offering data consumers a data access interface that hides the technical aspects of stored data, such as location, storage structure, API, access language, and storage technology." == Advantages == Added flexibility and agility for existing computing infrastructure. Enhanced database performance. Pooling and sharing computing resources, either splitting them (multi-tenancy) or combining them (clustering). Simplification of administration and management. Increased fault tolerance.

    Read more →
  • Image tracing

    Image tracing

    In computer graphics, image tracing, raster-to-vector conversion or raster vectorization is the conversion of raster graphics into vector graphics. == Background == An image does not have any structure: it is just a collection of marks on paper, grains in film, or pixels in a bitmap. While such an image is useful, it has some limits. If the image is magnified enough, its artifacts appear. The halftone dots, film grains, and pixels become apparent. Images of sharp edges become fuzzy or jagged. See, for example, pixelation. Ideally, a vector image does not have the same problem. Edges and filled areas are represented as mathematical curves or gradients, and they can be magnified arbitrarily (though of course the final image must also be rasterized in to be rendered, and its quality depends on the quality of the rasterization algorithm for the given inputs). The task in vectorization is to convert a two-dimensional image into a two-dimensional vector representation of the image. It is not examining the image and attempting to recognize or extract a three-dimensional model that may be depicted; i.e. it is not a vision system. For most applications, vectorization also does not involve optical character recognition; characters are treated as lines, curves, or filled objects without attaching any significance to them. In vectorization, the shape of the character is preserved, so artistic embellishments remain. Vectorization is the inverse operation corresponding to rasterization, as integration is to differentiation. And, just as with these other operations, while rasterization is fairly straightforward and algorithmic, vectorization involves the reconstruction of lost information and therefore requires heuristic methods. Synthetic images such as maps, cartoons, logos, clip art, and technical drawings are suitable for vectorization. Those images could have been originally made as vector images because they are based on geometric shapes or drawn with simple curves. Continuous tone photographs (such as live portraits) are not good candidates for vectorization. The input to vectorization is an image, but an image may come in many forms such as a photograph, a drawing on paper, or one of several raster file formats. Programs that do raster-to-vector conversion may accept bitmap formats such as TIFF, BMP and PNG. The output is a vector file format. Common vector formats are SVG, DXF, EPS, EMF and AI. Vectorization can be used to update images or recover work. Personal computers often come with a simple paint program that produces a bitmap output file. These programs allow users to make simple illustrations by adding text, drawing outlines, and filling outlines with a specific color. Only the results of these operations (the pixels) are saved in the resulting bitmap; the drawing and filling operations are discarded. Vectorization can be used to recapture some of the information that was lost. Vectorization is also used to recover information that was originally in a vector format but has been lost or has become unavailable. A company may have commissioned a logo from a graphic arts firm. Although the graphics firm used a vector format, the client company may not have received a copy of that format. The company may then acquire a vector format by scanning and vectorizing a paper copy of the logo. == Process == Vectorization starts with an image. === Manual === The image can be vectorized manually. A person could look at the image, make some measurements, and then write the output file by hand. That was the case for the vectorization of a technical illustration about neutrinos. The illustration has a few geometric shapes and a lot of text; it was relatively easy to convert the shapes, and the SVG vector format allows the text (even subscripts and superscripts) to be entered easily. The original image did not have any curves (except for the text), so the conversion is straightforward. Curves make the conversion more complicated. Manual vectorization of complicated shapes can be facilitated by the tracing function built into some vector graphics editing programs. If the image is not yet in machine readable form, then it has to be scanned into a usable file format. Once there is a machine-readable bitmap, the image can be imported into a graphics editing program (such as Adobe Illustrator, CorelDRAW, or Inkscape). Then a person can manually trace the elements of the image using the program's editing features. Curves in the original image can be approximated with lines, arcs, and Bézier curves. An illustration program allows spline knots to be adjusted for a close fit. Manual vectorization is possible, but it can be tedious. Although graphics drawing programs have been around for a long time, artists may find the freehand drawing facilities awkward even when a drawing tablet is used. Instead of using a program, Pepper recommends making an initial sketch on paper. Instead of scanning the sketch and tracing it freehand in the computer, Pepper states: "Those proficient with a graphic tablet and stylus could make the following changes directly in CorelDRAW by using a scan of the sketch as an underlay and drawing over it. I prefer to use pen and ink, and a light table"; most of the final image was traced by hand in ink. Later the line-drawing image was scanned at 600 dpi, cleaned up in a paint program, and then automatically traced with a program. Once the black and white image was in the graphics program, some other elements were added and the figure was colored. Similarly, Ploch recreated a design from a digital photograph. The JPEG was imported and some "basic shapes" were traced by hand and colored in the graphics drawing program; more complex shapes were handled differently. Ploch used a bitmap editor to remove the background and crop the more complex image components. He then printed the image and traced it by hand onto tracing paper to get a clean black and white line drawing. That drawing was scanned and then vectorized with a program. === Automatic === Some programs automate the vectorization process. Example programs are Adobe Illustrator, Inkscape, Corel's PowerTRACE, and Potrace. Some of these programs have a command line interface while others are interactive that allow the user to adjust the conversion settings and view the result. Adobe Streamline is not only an interactive program, but it also allows a user to manually edit the input bitmap and the output curves. Corel's PowerTRACE is accessed through CorelDRAW; CorelDRAW can be used to modify the input bitmap and edit the output curves. Adobe Illustrator has a facility to trace individual curves. Automated programs can have mixed results. A program (PowerTRACE) was used to convert a PNG map to SVG. The program did a good job on the map boundaries (the most tedious task in the tracing) and the settings dropped out all the text (small objects). The text was manually re-inserted. Other conversions may not go as well. The results depend on having high-quality scans, reasonable settings, and good algorithms. Scanned images often have a lot of noise, which can require additional work to clean up. == Options == There are many different image styles and possibilities, and no single vectorization method works well on all images. Consequently, vectorization programs have many options that influence the result. One issue is what the predominant shapes are. If the image is of a fill-in form, then it will probably have just vertical and horizontal lines of a constant width. The program's vectorization should take that into account. On the other hand, a CAD drawing may have lines at any angle, there may be curved lines, and there may be several line weights (thick for objects and thin for dimension lines). Instead of (or in addition to) curves, the image may contain outlines filled with the same color. Adobe Streamline allows users to select a combination of line recognition (horizontal and vertical lines), centerline recognition, or outline recognition. Streamline also allows small outline shapes to be thrown out; the notion is such small shapes are noise. The user may set the noise level between 0 and 1000; an outline that has fewer pixels than that setting is discarded. Another issue is the number of colors in the image. Even images that were created as black on white drawings may end up with many shades of gray. Some line-drawing routines employ anti-aliasing; a pixel completely covered by the line will be black, but a pixel that is only partially covered will be gray. If the original image is on paper and is scanned, there is a similar result: edge pixels will be gray. Sometimes images are compressed (e.g., JPEG images), and the compression will introduce gray levels. Many of the vectorization programs will group same-color pixels into lines, curves, or outlined shapes. If each possible color is grouped into its object, there can be an enormous number of objects. Instead, the user is asked to s

    Read more →