AI App Just Like Chatgpt

AI App Just Like Chatgpt — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Evaluation of binary classifiers

    Evaluation of binary classifiers

    Evaluation of a binary classifier typically assigns a numerical value, or values, to a classifier that represent its accuracy. An example is error rate, which measures how frequently the classifier makes a mistake. There are many metrics that can be used; different fields have different preferences. For example, in medicine sensitivity and specificity are often used, while in computer science precision and recall are preferred. An important distinction is between metrics that are independent of the prevalence or skew (how often each class occurs in the population), and metrics that depend on the prevalence – both types are useful, but they have very different properties. Often, evaluation is used to compare two methods of classification, so that one can be adopted and the other discarded. Such comparisons are more directly achieved by a form of evaluation that results in a single unitary metric rather than a pair of metrics. == Contingency table == Given a data set, a classification (the output of a classifier on that set) gives two numbers: the number of positives and the number of negatives, which add up to the total size of the set. To evaluate a classifier, one compares its output to another reference classification – ideally a perfect classification, but in practice the output of another gold standard test – and cross tabulates the data into a 2×2 contingency table, comparing the two classifications. One then evaluates the classifier relative to the gold standard by computing summary statistics of these 4 numbers. Generally these statistics will be scale invariant (scaling all the numbers by the same factor does not change the output), to make them independent of population size, which is achieved by using ratios of homogeneous functions, most simply homogeneous linear or homogeneous quadratic functions. Say we test some people for the presence of a disease. Some of these people have the disease, and our test correctly says they are positive. They are called true positives (TP). Some have the disease, but the test incorrectly claims they don't. They are called false negatives (FN). Some don't have the disease, and the test says they don't – true negatives (TN). Finally, there might be healthy people who have a positive test result – false positives (FP). These can be arranged into a 2×2 contingency table (confusion matrix), conventionally with the test result on the vertical axis and the actual condition on the horizontal axis. These numbers can then be totaled, yielding both a grand total and marginal totals. Totaling the entire table, the number of true positives, false negatives, true negatives, and false positives add up to 100% of the set. Totaling the columns (adding vertically) the number of true positives and false positives add up to 100% of the test positives, and likewise for negatives. Totaling the rows (adding horizontally), the number of true positives and false negatives add up to 100% of the condition positives (conversely for negatives). The basic marginal ratio statistics are obtained by dividing the 2×2=4 values in the table by the marginal totals (either rows or columns), yielding 2 auxiliary 2×2 tables, for a total of 8 ratios. These ratios come in 4 complementary pairs, each pair summing to 1, and so each of these derived 2×2 tables can be summarized as a pair of 2 numbers, together with their complements. Further statistics can be obtained by taking ratios of these ratios, ratios of ratios, or more complicated functions. The contingency table and the most common derived ratios are summarized below; see sequel for details. Note that the rows correspond to the condition actually being positive or negative (or classified as such by the gold standard), as indicated by the color-coding, and the associated statistics are prevalence-independent, while the columns correspond to the test being positive or negative, and the associated statistics are prevalence-dependent. There are analogous likelihood ratios for prediction values, but these are less commonly used, and not depicted above. == Pairs of metrics == Often accuracy is evaluated with a pair of metrics composed in a standard pattern. === Sensitivity and specificity === The fundamental prevalence-independent statistics are sensitivity and specificity. Sensitivity or True Positive Rate (TPR), also known as recall, is the proportion of people that tested positive and are positive (True Positive, TP) of all the people that actually are positive (Condition Positive, CP = TP + FN). It can be seen as the probability that the test is positive given that the patient is sick. With higher sensitivity, fewer actual cases of disease go undetected (or, in the case of the factory quality control, fewer faulty products go to the market). Specificity (SPC) or True Negative Rate (TNR) is the proportion of people that tested negative and are negative (True Negative, TN) of all the people that actually are negative (Condition Negative, CN = TN + FP). As with sensitivity, it can be looked at as the probability that the test result is negative given that the patient is not sick. With higher specificity, fewer healthy people are labeled as sick (or, in the factory case, fewer good products are discarded). The relationship between sensitivity and specificity, as well as the performance of the classifier, can be visualized and studied using the Receiver Operating Characteristic (ROC) curve. In theory, sensitivity and specificity are independent in the sense that it is possible to achieve 100% in both (such as in the red/blue ball example given above). In more practical, less contrived instances, however, there is usually a trade-off, such that they are inversely proportional to one another to some extent. This is because we rarely measure the actual thing we would like to classify; rather, we generally measure an indicator of the thing we would like to classify, referred to as a surrogate marker. The reason why 100% is achievable in the ball example is because redness and blueness is determined by directly detecting redness and blueness. However, indicators are sometimes compromised, such as when non-indicators mimic indicators or when indicators are time-dependent, only becoming evident after a certain lag time. The following example of a pregnancy test will make use of such an indicator. Modern pregnancy tests do not use the pregnancy itself to determine pregnancy status; rather, human chorionic gonadotropin is used, or hCG, present in the urine of gravid females, as a surrogate marker to indicate that a woman is pregnant. Because hCG can also be produced by a tumor, the specificity of modern pregnancy tests cannot be 100% (because false positives are possible). Also, because hCG is present in the urine in such small concentrations after fertilization and early embryogenesis, the sensitivity of modern pregnancy tests cannot be 100% (because false negatives are possible). === Positive and negative predictive values === In addition to sensitivity and specificity, the performance of a binary classification test can be measured with positive predictive value (PPV), also known as precision, and negative predictive value (NPV). The positive prediction value answers the question "If the test result is positive, how well does that predict an actual presence of disease?". It is calculated as TP/(TP + FP); that is, it is the proportion of true positives out of all positive results. The negative prediction value is the same, but for negatives, naturally. ==== Impact of prevalence on predictive values ==== Prevalence has a significant impact on prediction values. As an example, suppose there is a test for a disease with 99% sensitivity and 99% specificity. If 2000 people are tested and the prevalence (in the sample) is 50%, 1000 of them are sick and 1000 of them are healthy. Thus about 990 true positives and 990 true negatives are likely, with 10 false positives and 10 false negatives. The positive and negative prediction values would be 99%, so there can be high confidence in the result. However, if the prevalence is only 5%, so of the 2000 people only 100 are really sick, then the prediction values change significantly. The likely result is 99 true positives, 1 false negative, 1881 true negatives and 19 false positives. Of the 19+99 people tested positive, only 99 really have the disease – that means, intuitively, that given that a patient's test result is positive, there is only 84% chance that they really have the disease. On the other hand, given that the patient's test result is negative, there is only 1 chance in 1882, or 0.05% probability, that the patient has the disease despite the test result. === Precision and recall === Precision and recall can be interpreted as (estimated) conditional probabilities: Precision is given by P ( C = P | C ^ = P ) {\displaystyle P(C=P|{\hat {C}}=P)} while recall is given by P ( C ^ = P | C = P ) {\displaystyle P({\hat {C}}=P|C=P)} , where C ^ {\

    Read more →
  • Alerts.in.ua

    Alerts.in.ua

    alerts.in.ua is an online service that visualizes information about air alerts and other threats on the map of Ukraine. == History == The idea of the site appeared in the first weeks of the 2022 Russian invasion of Ukraine, during the development of other projects related to alerting the population about alarms. So, on March 2, 2022, the "Lviv Siren" bot was created, which reported on air alarms in Lviv on Twitter. Later, the idea arose to monitor alarms all over Ukraine and display them on a map. However, the lack of a single official source reporting alarms made this task much more difficult. On March 15, 2022, the Ajax Systems company announced the creation of the official Telegram channel "Air Alarm". This channel receives signals from the "Air Alarm" application and instantly publishes messages about the start and end of alarms in different regions of Ukraine. This immediately solved the problem with the source of information and gave impetus to the further implementation of the project. On March 22, 2022, the first version of the "Air Alarm Map" website was published, located on the war.ukrzen.in.ua domain. The map quickly gained popularity in social networks. It, like several other similar projects, began to be widely distributed by the mass media: Suspilne, Novyi Kanal, UNIAN, DW, Fakty ICTV, Vikna TV, Ukrainian Radio, STB, Espresso, dev.ua, itc.ua and state bodies: Center for Countering Disinformation at the National Security and Defense Council of Ukraine, Verkhovna Rada of Ukraine, Khmelnytska OVA, etc. On April 8, 2022, the site moved to the alerts.in.ua domain, where it is still available today. On August 25, 2022, the service began monitoring local official channels in addition to the main "Air Alarm". On September 11, 2022, the English version of the site was published. On March 22, 2023, its own Android application was published. The project is actively developing and has its own community. == Description == The main part of the site is a map of Ukraine, on which the regions where an air alert or other threats have been declared are highlighted in real time. As of October 16, 2022, 5 types of threats are supported: Air alarm. The threat of artillery fire. The threat of street fighting. Chemical threat. Nuclear threat. Additionally, based on media reports, information is published about other dangerous events, such as explosions, demining, etc. On the site, you can view the history of announced alarms with links to sources. Alarm statistics for different time periods are also available. For developers, there is an API that allows you to develop your own services based on information about declared alarms. The site is available in Ukrainian, English, Polish and Japanese. == Use == The map is used by: To monitor the situation in the country and the region. To illustrate the alarms announced in the mass media: TSN, Ukrainian truth, Channel 24, Suspilne, RBC Ukraine, Gromadske, Glavkom. As a map of alarms in mobile applications, there is Alarm and AirAlert. As an API for its services, including alternative alarm maps, Telegram, Viber channels, Discord bots, IoT projects, etc. == Statistics == 89.5% of users use the map from a mobile phone, 10% from a PC and 1% from a tablet. Top 6 countries by visit: Ukraine, United States, Poland, Germany, Great Britain and Japan . == Alternative projects == eMap was created by the developer Vadym Klymenko. AlarmMap is an online from the Ukrainian office of Agroprep. The official map of air alarms was developed by Ajax Systems together with the developer Artem Lemeshev, Stfalcon with the support of the Ministry of Statistics.

    Read more →
  • Subpixel rendering

    Subpixel rendering

    Subpixel rendering is a method used to increase the effective resolution of a color display device. It utilizes the composition of each pixel, which consists of three subpixels of which are red, green, and blue that can each be individually addressable on the display matrix. Subpixel rendering is primarily used for text rendering on standard DPI displays. Despite the inherent color anomalies, it can also be used to render general graphics. == History == The origin of subpixel rendering as used today remains controversial. Apple Inc., IBM, and Microsoft patented various implementations that differed in technical details owing to the different purposes for which their technologies were intended. Microsoft held several patents in the United States for subpixel rendering technology used in text rendering on RGB Stripe layouts. The patents 6,219,025; 6,239,783; 6,307,566; 6,225,973; 6,243,070; 6,393,145; 6,421,054; 6,282,327; and 6,624,828 were filed between October 7, 1998, and October 7, 1999, and expired on July 30, 2019. Analysis of the patent by FreeType indicates that the patent does not cover the idea of subpixel rendering, but rather the actual filter used as a last step to balance the color. Microsoft's patent describes the smallest possible filter that distributes each subpixel value equally among the R, G, and B pixels. Any other filter will either be blurrier or will introduce color artifacts. Apple was able to use it in Mac OS X due to a patent cross-licensing agreement. == Characteristics == A single pixel on a color display is made of several subpixels, typically three arranged left-to-right as red, green, and blue (RGB). The components are readily visible with a small magnifying glass, such as a loupe. These pixel components appear as a single color to the human eye because of blurring by optics and spatial integration by nerve cells in the eye. However, the eye is much more sensitive to the location. Therefore, turning on the G and B of one pixel and the R of the next pixel to the right will produce a white dot, but it will appear to be 1/3 of a pixel to the right of the white dot that would be seen from the RGB of only the first pixel. Subpixel rendering leverages this to provide three times the horizontal resolution of the rendered image. However, it has to blur this image to produce the correct color by ensuring the same amount of red, green, and blue are turned on as when no subpixel rendering is being done. Subpixel rendering does not necessitate the use of antialiasing. It gives a smoother result regardless of whether antialiasing is used or not since it artificially increases the resolution. However, it introduces color aliasing since subpixels are colored. Subsequent filtering applied to remove the color artifacts is a form of antialiasing, although its purpose is not smoothing jagged shapes as in conventional antialiasing. Subpixel rendering requires the software to know the layout of the subpixels. The most common reason it is wrong is monitors that can be rotated 90 (or 180) degrees, though monitors are manufactured with other arrangements of the subpixels, such as BGR or in triangles, or with 4 colors like RGBW squares. On any such display the result of incorrect subpixel rendering will be worse than if no subpixel rendering was done at all (it will not produce color artifacts, but it will produce noisy edges). == Implementations == === Apple II === Steve Gibson has claimed that the Apple II, introduced in 1977, supports an early form of subpixel rendering in its high-resolution (280×192) graphics mode. The Wozniak patent only used 2 "sub-pixels". The bytes that comprise the Apple II high-resolution screen buffer contain seven visible bits (each corresponding directly to a pixel) and a flag bit used to select between purple/green or blue/orange color sets. Each pixel, since it is represented by a single bit, is either on or off; there are no bits within the pixel itself for specifying color or brightness. Color is instead created as an artifact of the NTSC color encoding scheme, determined by horizontal position: pixels with even horizontal coordinates are always purple (or blue, if the flag bit is set), and odd pixels are always green (or orange). Two lit pixels next to each other are always white, regardless of whether the pair is even/odd or odd/even, and irrespective of the value of the flag bit. This is an approximation, but it is what most programmers of the time would have in mind while working with the Apple's high-resolution mode. Gibson's example claims that because two adjacent bits form a white block, there are, in fact, two bits per pixel: one that activates the pixel's purple left half and the other that activates its green right half. If the programmer instead activates the green right half of a pixel and the purple left half of the next pixel, the result is a white block 1/2 pixel to the right, which is indeed an instance of subpixel rendering. However, it is not clear whether any programmers of the Apple II have considered the pairs of bits as pixels—instead calling each bit a pixel. The flag bit in each byte affects color by shifting pixels half a pixel-width to the right. This half-pixel shift was exploited by some graphics software, such as HRCG (High-Resolution Character Generator), an Apple utility that displayed text using the high-resolution graphics mode, to smooth diagonals. === ClearType === Microsoft announced its subpixel rendering technology, called ClearType, at COMDEX in 1998. Microsoft published a paper in May 2000, Displaced Filtering for Patterned Displays, describing the filtering behind ClearType. It was then made available in Windows XP. Still, it was not activated by default until Windows Vista, while Windows XP OEMs could and did change the default setting. === FreeType === FreeType, the library used by most current software on the X Window System, contains two open source implementations. The original implementation uses the ClearType antialiasing filters and carries the following notice: "The colour filtering algorithm of Microsoft's ClearType technology for subpixel rendering is covered by patents; for this reason, the corresponding code in FreeType is disabled by default. Note that subpixel rendering per se is prior art; using a different colour filter thus easily circumvents Microsoft's patent claims." FreeType offers a variety of color filters. Since version 2.6.2, the default filter is light, a filter that is both normalized (value sums up to 1) and color-balanced (eliminate color fringes at the cost of resolution). Since version 2.8.1, a second implementation exists, called Harmony, that "offers high quality LCD-optimized output without resorting to ClearType techniques of resolution tripling and filtering". This is the method enabled by default. When using this method, "each color channel is generated separately after shifting the glyph outline, capitalizing on the fact that the color grids on LCD panels are shifted by a third of a pixel. This output is indistinguishable from ClearType with a light 3-tap filter." Since the Harmony method does not require additional filtering, it is not covered by the ClearType patents. === CoolType === Adobe created their own subpixel renderer called CoolType, allowing them to display documents the same way across various operating systems: Windows, MacOS, Linux etc. When it was launched around the year 2001, CoolType supported a wider range of fonts than Microsoft's ClearType, which at the time was limited to TrueType fonts. In contrast, Adobe's CoolType also supported PostScript fonts (and their OpenType equivalents). === macOS === Mac OS X (later OS X, now macOS) also used subpixel rendering, as part of Quartz 2D. However, it was removed after the introduction of Retina displays. Unlike Microsoft's implementation, which favors a tight fit to the grid (font hinting) to maximize legibility, Apple's implementation prioritizes the shape of the glyphs as set out by their designer.

    Read more →
  • Intel Management Engine

    Intel Management Engine

    The Intel Management Engine (ME), also known as the Intel Manageability Engine, is an autonomous subsystem that has been incorporated in virtually all of Intel's processor chipsets since 2008. It is located in the Platform Controller Hub of modern Intel motherboards. The Intel Management Engine always runs as long as the motherboard is receiving power, even when the computer is turned off. This issue can be mitigated with the deployment of a hardware device which is able to disconnect all connections to mains power as well as all internal forms of energy storage. The Electronic Frontier Foundation and some security researchers have voiced concern that the Management Engine is a backdoor. Intel's main competitor, AMD, has incorporated the equivalent AMD Secure Technology (formally called Platform Security Processor) in virtually all of its post-2013 CPUs. == Difference from Intel AMT == The Management Engine is often confused with Intel AMT (Intel Active Management Technology). AMT runs on the ME, but is only available on processors with vPro. AMT gives device owners remote administration of their computer, such as powering it on or off, and reinstalling the operating system. However, the ME itself has been built into all Intel chipsets since 2008, not only those with AMT. While AMT can be unprovisioned by the owner, there is no official, documented way to disable the ME. == Design == The subsystem primarily consists of proprietary firmware running on a separate microprocessor that performs tasks during boot-up, while the computer is running, and while it is asleep. As long as the chipset or SoC is supplied with power (via battery or power supply), it continues to run even when the system is turned off. Intel claims the ME is required to provide full performance. Its exact workings are largely undocumented and its code is obfuscated using confidential Huffman tables stored directly in hardware, so the firmware does not contain the information necessary to decode its contents. === Hardware === Starting with ME 11 (introduced in Skylake CPUs), it is based on the Intel Quark x86-based 32-bit CPU and runs the MINIX 3 operating system. The ME firmware is stored in a partition of the SPI BIOS Flash, using the Embedded Flash File System (EFFS). Previous versions were based on an ARC core, with the Management Engine running the ThreadX RTOS. Versions 1.x to 5.x of the ME used the ARCTangent-A4 (32-bit only instructions) whereas versions 6.x to 8.x used the newer ARCompact (mixed 32- and 16-bit instruction set architecture). Starting with ME 7.1, the ARC processor could also execute signed Java applets. The ME has its own MAC and IP address for the out-of-band management interface, with direct access to the Ethernet controller; one portion of the Ethernet traffic is diverted to the ME even before reaching the host's operating system, for what support exists in various Ethernet controllers, exported and made configurable via Management Component Transport Protocol (MCTP). The ME also communicates with the host via PCI interface. Under Linux, communication between the host and the ME is done via /dev/mei or /dev/mei0. Until the release of Nehalem processors, the ME was usually embedded into the motherboard's northbridge, following the Memory Controller Hub (MCH) layout. With the newer Intel architectures (Intel 5 Series onwards), the ME is integrated into the Platform Controller Hub (PCH). === Firmware === By Intel's current terminology as of 2017, ME is one of several firmware sets for the Converged Security and Manageability Engine (CSME). Prior to AMT version 11, CSME was called Intel Management Engine BIOS Extension (Intel MEBx). Management Engine (ME) – mainstream chipsets Server Platform Services (SPS) – server chipsets and SoCs Trusted Execution Engine (TXE) – tablet/embedded/low power It was also found that the ME firmware version 11 runs MINIX 3. Management of the ME modules for provisioning inside the UEFI is done via a tool called Intel Flash Image Tool (FITC). ==== Modules ==== Active Management Technology (AMT) Intel Boot Guard (IBG) and Secure Boot Quiet System Technology (QST), formerly known as Advanced Fan Speed Control (AFSC), which provides support for acoustically optimized fan speed control, and monitoring of temperature, voltage, current and fan speed sensors that are provided in the chipset, CPU and other devices present on the motherboard. Communication with the QST firmware subsystem is documented and available through the official software development kit (SDK). Protected Audio Video Path, enforces HDCP Intel Anti-Theft Technology (AT), discontinued in 2015 Serial over LAN (SOL) Intel Platform Trust Technology (PTT), a firmware-based Trusted Platform Module (TPM) Near Field Communication, a middleware for NFC readers and vendors to access NFC cards and provide secure element access, found in later MEI versions. == The intricacies of working with Intel ME == It should also be noted that the ME region requires special cleaning and subsequent initialisation, for example, after replacing the platform hub on the motherboard. Usually, this requires an SPI programmer. There are known successful cases of this operation being performed. == Security vulnerabilities == Several weaknesses have been found in the ME. On May 1, 2017, Intel confirmed a Remote Elevation of Privilege bug (SA-00075) in its Management Technology. Every Intel platform with provisioned Intel Standard Manageability, Active Management Technology, or Small Business Technology, from Nehalem in 2008 to Kaby Lake in 2017 has a remotely exploitable security hole in the ME. Several ways to disable the ME without authorization that could allow ME's functions to be sabotaged have been found. Additional major security flaws in the ME affecting a very large number of computers incorporating ME, Trusted Execution Engine (TXE), and Server Platform Services (SPS) firmware, from Skylake in 2015 to Coffee Lake in 2017, were confirmed by Intel on November 20, 2017 (SA-00086). Unlike SA-00075, this bug is even present if AMT is absent, not provisioned or if the ME was "disabled" by any of the known unofficial methods. In July 2018, another set of vulnerabilities was disclosed (SA-00112). In September 2018, yet another vulnerability was published (SA-00125). === Ring −3 rootkit === A ring −3 rootkit was demonstrated by Invisible Things Lab for the Q35 chipset; it does not work for the later Q45 chipset as Intel implemented additional protections. The exploit worked by remapping the normally protected memory region (top 16 MB of RAM) reserved for the ME. The ME rootkit could be installed regardless of whether the AMT is present or enabled on the system, as the chipset always contains the ARC ME coprocessor. (The "−3" designation was chosen because the ME coprocessor works even when the system is in the S3 state. Thus, it was considered a layer below the System Management Mode rootkits.) For the vulnerable Q35 chipset, a keystroke logger ME-based rootkit was demonstrated by Patrick Stewin. === Zero-touch provisioning === Another security evaluation by Vassilios Ververis showed serious weaknesses in the GM45 chipset implementation. In particular, it criticized AMT for transmitting unencrypted passwords in the SMB provisioning mode when the IDE redirection and Serial over LAN features are used. It also found that the "zero touch" provisioning mode (ZTC) is still enabled even when the AMT appears to be disabled in BIOS. For about 60 euros, Ververis purchased from GoDaddy a certificate that is accepted by the ME firmware and allows remote "zero touch" provisioning of (possibly unsuspecting) machines, which broadcast their HELLO packets to would-be configuration servers. === SA-00075 (a.k.a. Silent Bob is Silent) === In May 2017, Intel confirmed that many computers with AMT have had an unpatched critical privilege escalation vulnerability (CVE-2017-5689). The vulnerability was nicknamed "Silent Bob is Silent" by the researchers who had reported it to Intel. It affects numerous laptops, desktops and servers sold by Dell, Fujitsu, Hewlett-Packard (later Hewlett Packard Enterprise and HP Inc.), Intel, Lenovo, and possibly others. Those researchers claimed that the bug affects systems made in 2010 or later. Other reports claimed the bug also affects systems made as long ago as 2008. The vulnerability was described as giving remote attackers: "full control of affected machines, including the ability to read and modify everything. It can be used to install persistent malware (possibly in firmware), and read and modify any data." === PLATINUM === In June 2017, the PLATINUM cybercrime group became notable for exploiting the serial over LAN (SOL) capabilities of AMT to perform data exfiltration of stolen documents. SOL is disabled by default and must be enabled to exploit this vulnerability. === SA-00086 === Some months after the previous bugs, and subsequent warnings from the EFF, securi

    Read more →
  • Magiran

    Magiran

    Magiran (Persian: مگیران)—Iran's publications database—is a digital library that was founded in 2000 and includes digitized versions of scientific journals, which currently provides the possibility of searching among the full text of 1,500 journals. Registration is required for full access to the database, but access to some items such as newspapers is also possible without registration. A list of Iranian researchers is also maintained there.

    Read more →
  • Metadata repository

    Metadata repository

    A metadata repository is a database created to store metadata. Metadata is information about the structures that contain the actual data. Metadata is often said to be "data about data", but this is misleading. Data profiles are an example of actual "data about data". Metadata adds one layer of abstraction to this definition– it is data about the structures that contain data. Metadata may describe the structure of any data, of any subject, stored in any format. A well-designed metadata repository typically contains data far beyond simple definitions of the various data structures. Typical repositories store dozens to hundreds of separate pieces of information about each data structure. Comparing the metadata of a couple data items - one digital and one physical - clarify what metadata is: First, digital: For data stored in a database one may have a table called "Patient" with many columns, each containing data which describes a different attribute of each patient. One of these columns may be named "Patient_Last_Name". What is some of the metadata about the column that contains the actual surnames of patients in the database? We have already used two items: the name of the column that contains the data (Patient_Last_Name) and the name of the table that contains the column (Patient). Other metadata might include the maximum length of last name that may be entered, whether or not last name is required (can we have a patient without Patient_Last_Name?), and whether the database converts any surnames entered in lower case to upper case. Metadata of a security nature may show the restrictions which limit who may view these names. Second, physical: For data stored in a brick and mortar library, one have many volumes and may have various media, including books. Metadata about books would include ISBN, Binding_Type, Page_Count, Author, etc. Within Binding_Type, metadata would include possible bindings, material, etc. This contextual information of business data include meaning and content, policies that govern, technical attributes, specifications that transform, and programs that manipulate. == Definition == The metadata repository is responsible for physically storing and cataloging metadata. Data in a metadata repository should be generic, integrated, current, and historical: Generic Meta model should store the metadata by generic terms instead of storing it by an applications-specific defined way, so that if your data base standard changes from one product to another the physical meta model of the metadata repository would not need to change. Integration of the metadata repository allows all business areas' metadata to be in an integrated fashion: Covering all domains and subject areas of the organization. current and historical The metadata repository should have accessible current and historical metadata. Metadata repositories used to be referred to as a data dictionary. With the transition of needs for the metadata usage for business intelligence has increased so is the scope of the metadata repository increased. Earlier data dictionaries are the closest place to interact technology with business. Data dictionaries are the universe of metadata repository in the initial stages but as the scope increased Business glossary and their tags to variety of status flags emerged in the business side while consumption of the technology metadata, their lineage and linkages made the repository, the source for valuable reports to bring business and technology together and helped data management decisions easier as well as assess the cost of the changes. Metadata repository explores the enterprise wide data governance, data quality and master data management (includes master data and reference data) and integrates this wealth of information with integrated metadata across the organization to provide decision support system for data structures, even though it only reflects the structures consumed from various systems. == Repository vs. registry == Repository has additional functionalities compared with registry. Metadata repository not only stores metadata like Metadata registry but also adds relationships with related metadata types. Metadata when related in a flow from its point of entry into organization up to the deliverables is considered as the lineage of that data point. Metadata when related across other related metadata types is called linkages. By providing the relationships to all the metadata points across the organization and maintaining its integrity with an architecture to handle the changes, metadata repository provides the basic material for understanding the complete data flow and their definitions and their impact. Also the important feature is to maintain the version control though this statement for contrasting is open for discussion. These definitions are still evolving, so the accuracy of the definitions needs refinement. The purpose of registry is to define the metadata element and maintained across the organization. And data models and other data management teams refer to the registry for any changes to follow. While Metadata repository sources metadata from various metadata systems in the organizations and reflects what is in the upstream. Repository never acts as an upstream while registry is used as an upstream for metadata changes. == Reason for use == Metadata repository enables all the structure of the organizations data containers to one integrated place. This opens plethora of resourceful information for making calculated business decisions. This tool uses one generic form of data model to integrate all the models thus brings all the applications and programs of the organization into one format. And on top of it applying the business definitions and business processes brings the business and technology closer that will help organizations make reliable roadmaps with definite goals. With one stop information, business will have more control on the changes, and can do impact analysis of the tool. Usually business spends much time and money to make decisions based on discovery and research on impacts to make changes or to add new data structures or remove structures in data management of the organization. With a structured and well maintained repository, moving the product from ideation to delivery takes the least amount of time (considering other variables are constant). To sum it up: Integration of the metadata across the organization Build relationship between various metadata types Build relationship between various disparate systems Define business golden copy of definitions Version control of the changes at structure level Interaction with Reference data Link view to master data Automatic synchronization with various authorized metadata source systems More control to business decisions Validate the structures by overlapping the models Discovering discrepancies, gaps, lineage, metrics at data structure level Each database management system (DBMS) and database tools have their own language for the metadata components within. Database applications already have their own repositories or registries that are expected to provide all of the necessary functionality to access the data stored within. Vendors do not want other companies to be capable of easily migrating data away from their products and into competitors products, so they are proprietary with the way they handle metadata. CASE tools, DBMS dictionaries, ETL tools, data cleansing tools, OLAP tools, and data mining tools all handle and store metadata differently. Only a metadata repository can be designed to store the metadata components from all of these tools. == Design == Metadata repositories should store metadata in four classifications: ownership, descriptive characteristics, rules and policies, and physical characteristics. Ownership, showing the data owner and the application owner. The descriptive characteristics, define the names, types and lengths, and definitions describing business data or business processes. Rules and policies, will define security, data cleanliness, timelines for data, and relationships. Physical characteristics define the origin or source, and physical location. Like building a logical data model for creating a database, a logical meta model can help identify the metadata requirements for business data. The metadata repository will be centralized, decentralized, or distributed. A centralized design means that there is one database for the metadata repository that stores metadata for all applications business wide. A centralized metadata repository has the same advantages and disadvantages of a centralized database. Easier to manage because all the data is in one database, but the disadvantage is that bottlenecks may occur. A decentralized metadata repository stores metadata in multiple databases, either separated by location and or departments of the business. This makes management of the repository more involved than a centraliz

    Read more →
  • Ray tracing (graphics)

    Ray tracing (graphics)

    In 3D computer graphics, ray tracing is a technique for modeling light transport for use in a wide variety of rendering algorithms for generating digital images. On a spectrum of computational cost and visual fidelity, ray tracing-based rendering techniques, such as ray casting, recursive ray tracing, distribution ray tracing, photon mapping and path tracing, are generally slower and higher fidelity than scanline rendering methods. Thus, ray tracing was first deployed in applications where taking a relatively long time to render could be tolerated, such as still CGI images, and film and television visual effects (VFX), but was less suited to real-time applications such as video games, where speed is critical in rendering each frame. Since 2018, however, hardware acceleration for real-time ray tracing has become standard on new commercial graphics cards, and graphics APIs have followed suit, allowing developers to use hybrid ray tracing and rasterization-based rendering in games and other real-time applications with a lesser hit to frame render times. Ray tracing is capable of simulating a variety of optical effects, such as reflection, refraction, soft shadows, scattering, depth of field, motion blur, caustics, ambient occlusion and dispersion phenomena (such as chromatic aberration). It can also be used to trace the path of sound waves in a similar fashion to light waves, making it a viable option for more immersive sound design in video games by rendering realistic reverberation and echoes. In fact, any physical wave or particle phenomenon with approximately linear motion can be simulated with ray tracing. Ray tracing–based rendering techniques that sample light over a domain typically generate multiple rays and often rely on denoising to reduce the resulting noise. == History == The idea of ray tracing comes from as early as the 16th century, when it was described by Albrecht Dürer, who is credited for its invention. Dürer described multiple techniques for projecting 3-D scenes onto an image plane. Some of these project chosen geometry onto the image plane, as is done with rasterization today. Others determine what geometry is visible along a given ray, as is done with ray tracing. Using a computer for ray tracing to generate shaded pictures was first accomplished by Arthur Appel in 1968. Appel used ray tracing for primary visibility (determining the closest surface to the camera at each image point) by tracing a ray through each point to be shaded into the scene to identify the visible surface. The closest surface intersected by the ray was the visible one. This non-recursive ray tracing-based rendering algorithm is today called "ray casting". His algorithm then traced secondary rays to the light source from each point being shaded to determine whether the point was in shadow or not. Later, in 1971, Goldstein and Nagel of MAGI (Mathematical Applications Group, Inc.) published "3-D Visual Simulation", wherein ray tracing was used to make shaded pictures of solids. At the ray-surface intersection point found, they computed the surface normal and, knowing the position of the light source, computed the brightness of the pixel on the screen. Their publication describes a short (30-second) film "made using the University of Maryland's display hardware outfitted with a 16mm camera. The film showed the helicopter and a simple ground-level gun emplacement. The helicopter was programmed to undergo a series of maneuvers including turns, take-offs, and landings, etc., until it eventually is shot down and crashed." A CDC 6600 computer was used. MAGI produced an animation video called MAGI/SynthaVision Sampler in 1974. Another early instance of ray casting came in 1976, when Scott Roth created a flip book animation in Bob Sproull's computer graphics course at Caltech. The scanned pages are shown as a video in the accompanying image. Roth's computer program noted an edge point at a pixel location if the ray intersected a bounded plane different from that of its neighbors. Of course, a ray could intersect multiple planes in space, but only the surface point closest to the camera was noted as visible. The platform was a DEC PDP-10, a Tektronix storage-tube display, and a printer which would create an image of the display on rolling thermal paper. Roth extended the framework, introduced the term ray casting in the context of computer graphics and solid modeling, and in 1982 published his work while at GM Research Labs. Turner Whitted was the first to show recursive ray tracing for mirror reflection and for refraction through translucent objects, with an angle determined by the solid's index of refraction, and to use ray tracing for anti-aliasing. Whitted also showed ray traced shadows. He produced a recursive ray traced film called The Compleat Angler in 1979 while an engineer at Bell Labs. Whitted's deeply recursive ray tracing algorithm reframed rendering from being primarily a matter of surface visibility determination to being a matter of light transport. His paper inspired a series of subsequent work by others that included distribution ray tracing and finally unbiased path tracing, which provides the rendering equation framework that has allowed computer-generated imagery to be faithful to reality. For decades, global illumination in major films using computer-generated imagery was approximated with additional lights. Ray tracing-based rendering eventually changed that by enabling physically based light transport. Early feature films rendered entirely using path tracing include Monster House (2006), Cloudy with a Chance of Meatballs (2009), and Monsters University (2013). == Algorithm overview == Optical ray tracing describes a method for producing visual images constructed in 3D computer graphics environments, with more photorealism than either ray casting or scanline rendering techniques. It works by tracing a path from an imaginary eye through each pixel in a virtual screen, and calculating the color of the object visible through it. Scenes in ray tracing are described mathematically by a programmer or by a visual artist (normally using intermediary tools). Scenes may also incorporate data from images and models captured by means such as digital photography. Typically, each ray must be tested for intersection with some subset of all the objects in the scene. Once the nearest object has been identified, the algorithm will estimate the incoming light at the point of intersection, examine the material properties of the object, and combine this information to calculate the final color of the pixel. Certain illumination algorithms and reflective or translucent materials may require more rays to be re-cast into the scene. It may at first seem counterintuitive or "backward" to send rays away from the camera, rather than into it (as actual light does in reality), but doing so is many orders of magnitude more efficient. Since the overwhelming majority of light rays from a given light source do not make it directly into the viewer's eye, a "forward" simulation could potentially waste a tremendous amount of computation on light paths that are never recorded. Therefore, the shortcut taken in ray tracing is to presuppose that a given ray intersects the view frame. After either a maximum number of reflections or a ray traveling a certain distance without intersection, the ray ceases to travel and the pixel's value is updated. === Calculate rays for rectangular viewport === On input we have (in calculation we use vector normalization and cross product): E ∈ R 3 {\displaystyle E\in \mathbb {R^{3}} } eye position T ∈ R 3 {\displaystyle T\in \mathbb {R^{3}} } target position θ ∈ [ 0 , π ] {\displaystyle \theta \in [0,\pi ]} field of view - for humans, we can assume ≈ π / 2 rad = 90 ∘ {\displaystyle \approx \pi /2{\text{ rad}}=90^{\circ }} m , k ∈ N {\displaystyle m,k\in \mathbb {N} } numbers of square pixels on viewport vertical and horizontal direction i , j ∈ N , 1 ≤ i ≤ k ∧ 1 ≤ j ≤ m {\displaystyle i,j\in \mathbb {N} ,1\leq i\leq k\land 1\leq j\leq m} numbers of actual pixel v → ∈ R 3 {\displaystyle {\vec {v}}\in \mathbb {R^{3}} } vertical vector which indicates where is up and down, usually v → = [ 0 , 1 , 0 ] {\displaystyle {\vec {v}}=[0,1,0]} - roll component which determine viewport rotation around point C (where the axis of rotation is the ET section) The idea is to find the position of each viewport pixel center P i j {\displaystyle P_{ij}} which allows us to find the line going from eye E {\displaystyle E} through that pixel and finally get the ray described by point E {\displaystyle E} and vector R → i j = P i j − E {\displaystyle {\vec {R}}_{ij}=P_{ij}-E} (or its normalization r → i j {\displaystyle {\vec {r}}_{ij}} ). First we need to find the coordinates of the bottom left viewport pixel P 1 m {\displaystyle P_{1m}} and find the next pixel by making a shift along directions parallel to viewport (vectors b → n {\displaystyle {\vec {b}}_{n

    Read more →
  • Materialized view

    Materialized view

    In computing, a materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary using an aggregate function. The process of setting up a materialized view is sometimes called materialization. This is a form of caching the results of a query, similar to memoization of the value of a function in functional languages, and it is sometimes described as a form of precomputation. As with other forms of precomputation, database users typically use materialized views for performance reasons, i.e. as a form of optimization. Materialized views that store data based on remote tables were also known as snapshots (deprecated Oracle terminology). In any database management system following the relational model, a view is a virtual table representing the result of a database query. Whenever a query or an update addresses an ordinary view's virtual table, the DBMS converts these into queries or updates against the underlying base tables. A materialized view takes a different approach: the query result is cached as a concrete ("materialized") table (rather than a view as such) that may be updated from the original base tables from time to time. This enables much more efficient access, at the cost of extra storage and of some data being potentially out-of-date. Materialized views find use especially in data warehousing scenarios, where frequent queries of the actual base tables can be expensive. In a materialized view, indexes can be built on any column. In contrast, in a normal view, it's typically only possible to exploit indexes on columns that come directly from (or have a mapping to) indexed columns in the base tables; often this functionality is not offered at all. == Implementations == === Oracle === Materialized views were implemented first by the Oracle Database: the Query rewrite feature was added from version 8i. Example syntax to create a materialized view in Oracle: === PostgreSQL === In PostgreSQL, version 9.3 and newer natively support materialized views. In version 9.3, a materialized view is not auto-refreshed, and is populated only at time of creation (unless WITH NO DATA is used). It may be refreshed later manually using REFRESH MATERIALIZED VIEW. In version 9.4, the refresh may be concurrent with selects on the materialized view if CONCURRENTLY is used. Example syntax to create a materialized view in PostgreSQL: === SQL Server === Microsoft SQL Server differs from other RDBMS by the way of implementing materialized view via a concept known as "Indexed Views". The main difference is that such views do not require a refresh because they are in fact always synchronized to the original data of the tables that compound the view. To achieve this, it is necessary that the lines of origin and destination are "deterministic" in their mapping, which limits the types of possible queries to do this. This mechanism has been realised since the 2000 version of SQL Server. Example syntax to create a materialized view in SQL Server: === Stream processing frameworks === Apache Kafka (since v0.10.2), Apache Spark (since v2.0), Apache Flink, Kinetica DB, Materialize, RisingWave, and Epsio all support materialized views on streams of data. === Others === Materialized views are also supported in Sybase SQL Anywhere. In IBM Db2, they are called "materialized query tables". ClickHouse supports materialized views that automatically refresh on merges. MySQL doesn't support materialized views natively, but workarounds can be implemented by using triggers or stored procedures or by using the open-source application Flexviews. Materialized views can be implemented in Amazon DynamoDB using data modification events captured by DynamoDB Streams. Google announced in 8 April 2020 the availability of materialized views for BigQuery as a beta release.

    Read more →
  • G'MIC

    G'MIC

    G'MIC (GREYC's Magic for Image Computing) is a free and open-source framework for image processing. It defines a script language that allows the creation of complex macros. Originally usable only through a command line interface, it is currently mostly popular as a GIMP plugin, and is also included in Krita. G'MIC is dual-licensed under CECILL-2.1 or CECILL-C. == Features == G'MIC's graphical interface is notable for its noise removal filters, which came from an earlier project called GREYCstoration by the same authors. G'MIC offers many built-in commands for image processing, including basic mathematical manipulations, look up tables, and filtering operations. More complex macros and pipelines built out of those commands are defined in its library files. == Interpreters == === Command line === G'MIC is primarily a script language callable from a shell. For example, to display an image: This command displays the image contained in the file image.jpg and allows zooming in to examine values. Several filters can be applied in succession. For example, to crop and resize an image: === Graphical interface === G'MIC comes with a Qt-based graphical interface, which may be integrated as a Gimp or Krita plugin. It contains several hundred filters written in the G'MIC language, dynamically updated through an internet feed. The interface provides a preview and setting sliders for each filter. G'MIC is one of the most popular Gimp plugins. === G'MIC Online === Most of the filters available for the graphical interface are also available online. === ZArt === ZArt is a graphical interface for real-time manipulation of webcam images. === libgmic === Libgmic is a C++ library that can be linked to third-party applications. It sees integration in Flowblade and Veejay.

    Read more →
  • System integrity

    System integrity

    In telecommunications, the term system integrity has the following meanings: That condition of a system wherein its mandated operational and technical parameters are within the prescribed limits. The quality of an AIS when it performs its intended function in an unimpaired manner, free from deliberate or inadvertent unauthorized manipulation of the system. The state that exists when there is complete assurance that under all conditions an IT system is based on the logical correctness and reliability of the operating system, the logical completeness of the hardware and software that implement the protection mechanisms, and data integrity.

    Read more →
  • MovieRide FX

    MovieRide FX

    MovieRide FX is a patented automated special visual effects video compositing engine used in the MovieRide FX mobile application for Android (requires Android 2.3 or later) and iOS (compatible with iPhone 4 and up, iPad, and iPod Touch (new generation), requires iOS 7 or later). MovieRide FX allows the user to personalize a "Hollywood-style" movie clip by inserting themself into the clip as the "actor". == Features == The MovieRide FX app uses the relevant mobile device's camera to record a video of the user and insert it into a pre-packaged "Hollywood style" movie clip. The "actor" is extracted from their recorded video clip through various known effects such as masking, keying, and motion tracking. The "actor" is then inserted into one of the pre-packaged movie clips created by the MovieRide FX visual effects artists. This is done through an automated process requiring little or no artistic or technical skill from the user. The custom movie clips pre-packaged with MovieRide FX offer the user a variety of movie scenarios. Additional clips based on popular television and movie themes are continually being developed and are available on a freemium basis. == Sharing == Once the user's footage has automatically been composited into a movie clip and rendered as an .mp4 file, it can be shared via social media, such as Facebook, YouTube, and Twitter, and by e-mail. == History == === 2012 === MovieRide FX was created by Grant Waterston and Johann Mynhardt, who started development in 2012. === 2013 === The beta version was released on Google Play in July 2013. In August 2013 MovieRide FX was a New Media Award winner in the "New Media" category of the Accolade International Awards in Los Angeles. In October 2013 MovieRide FX was awarded exhibitor space in the ‘start-up village’ at the Apps-World Expo in London. === 2014 === MovieRide FX reached the 100 000 – 500 000 downloads category on the Google Play Store in June 2014. The official Android version was launched in July 2014. iOS version released in August 2014. MovieRide FX was selected as one of the "Top 150" startups at the Pioneer Festival in Vienna in September 2014. In November 2014 MovieRide FX was shortlisted for the Appster Awards in the "Best Entertainment App" and "Most Innovative App" categories and was awarded exhibitor space at the ‘start-up village’ at the Apps-World Expo in London. Patent applications were filed in South Africa, the EU and USA in April 2014. === 2015 === In September 2015 MovieRide FX was shortlisted for "Best Software innovation" at The Technology Expo Awards in London. === 2016 === In April 2016 MovieRide FX was nominated for a National Science and Technology Forum (NSTF) award for 'Research leading to Innovation by a corporate organization' In August 2016 Movie Ride FX won two Gold Awards at the 2016 Mobile Marketing Awards (MMA Smarties SA). These two Gold awards were for the 'Innovation' and 'Best in Show’ categories. In December 2016 FlicJam Inc. was formed in the US to access the larger global market. EU patent application was published in March 2016. === 2017 === South African patent was granted in February 2017. === 2018 === US patent was granted in March 2018.

    Read more →
  • System Service Descriptor Table

    System Service Descriptor Table

    The System Service Descriptor Table (SSDT) is an internal dispatch table within Microsoft Windows. == Function == The SSDT maps syscalls to kernel function addresses. When a syscall is issued by a user space application, it contains the service index as parameter to indicate which syscall is called. The SSDT is then used to resolve the address of the corresponding function within ntoskrnl.exe. In modern Windows kernels, two SSDTs are used: One for generic routines (KeServiceDescriptorTable) and a second (KeServiceDescriptorTableShadow) for graphical routines. A parameter passed by the calling userspace application determines which SSDT shall be used. == Hooking == Modification of the SSDT allows to redirect syscalls to routines outside the kernel. These routines can be either used to hide the presence of software or to act as a backdoor to allow attackers permanent code execution with kernel privileges. For both reasons, hooking SSDT calls is often used as a technique in both Windows kernel mode rootkits and antivirus software. In 2010, many computer security products which relied on hooking SSDT calls were shown to be vulnerable to exploits using race conditions to attack the products' security checks.

    Read more →
  • Vatican News App

    Vatican News App

    The Vatican News App is an official mobile application software issued by the Vatican's Dicastery for Communication. Formerly titled The Pope App, the app was launched on January 23, 2013, under the auspices of the Pontifical Council for Social Communications, a now-defunct dicastery that was merged into the Secretariat (now Dicastery) for Communication in March 2016. Initially, The Pope App was available only on iOS devices, but became available for Android phones at the end of February 2013. The app is available for download on iOS and Android in five languages: English, French, Italian, Portuguese and Spanish. It was originally promoted as an application with focus on the figure of the Pope which made it possible to follow the Pope's events while they are taking place. Alerts notified the followers by informing and offering access to "official papal-related content in a variety of formats". The app also enabled its users to see areas of the Vatican through webcams allocated throughout St. Peter's Square in Rome that broadcast images. In early 2018, The Pope App was relaunched as the Vatican News App, accompanied by a redesign that eliminated many of the previous version's features, reducing the app to a more conventional news service, with increased emphasis on news from the Vatican and the worldwide Catholic Church and less focus on the day-to-day activities of the Pope.

    Read more →
  • User-defined function

    User-defined function

    A user-defined function (UDF) is a function provided by the user of a program or environment, in a context where the usual assumption is that functions are built into the program or environment. UDFs are usually written for the requirement of its creator. == BASIC language == In some old implementations of the BASIC programming language, user-defined functions are defined using the "DEF FN" syntax. More modern dialects of BASIC are influenced by the structured programming paradigm, where most or all of the code is written as user-defined functions or procedures, and the concept becomes practically redundant. == COBOL language == In the COBOL programming language, a user-defined function is an entity that is defined by the user by specifying a FUNCTION-ID paragraph. A user-defined function must return a value by specifying the RETURNING phrase of the procedure division header and they are invoked using the function-identifier syntax. See the ISO/IEC 1989:2014 Programming Language COBOL standard for details. As of May 2022, the IBM Enterprise COBOL for z/OS 6.4 (IBM COBOL) compiler contains support for user-defined functions. == Databases == In relational database management systems, a user-defined function provides a mechanism for extending the functionality of the database server by adding a function, that can be evaluated in standard query language (usually SQL) statements. The SQL standard distinguishes between scalar and table functions. A scalar function returns only a single value (or NULL), whereas a table function returns a (relational) table comprising zero or more rows, each row with one or more columns. User-defined functions in SQL are declared using the CREATE FUNCTION statement. For example, a user-defined function that converts Celsius to Fahrenheit (a temperature scale used in USA) might be declared like this: Once created, a user-defined function may be used in expressions in SQL statements. For example, it can be invoked where most other intrinsic functions are allowed. This also includes SELECT statements, where the function can be used against data stored in tables in the database. Conceptually, the function is evaluated once per row in such usage. For example, assume a table named Elements, with a row for each known chemical element. The table has a column named BoilingPoint for the boiling point of that element, in Celsius. The query would retrieve the name and the boiling point from each row. It invokes the CtoF user-defined function as declared above in order to convert the value in the column to a value in Fahrenheit. Each user-defined function carries certain properties or characteristics. The SQL standard defines the following properties: Language - defines the programming language in which the user-defined function is implemented; examples include SQL, C, C# and Java. Parameter style - defines the conventions that are used to pass the function parameters and results between the implementation of the function and the database system (only applicable if language is not SQL). Specific name - a name for the function that is unique within the database. Note that the function name does not have to be unique, considering overloaded functions. Some SQL implementations require that function names are unique within a database, and overloaded functions are not allowed. Determinism - specifies whether the function is deterministic or not. The determinism characteristic has an influence on the query optimizer when compiling a SQL statement. SQL-data access - tells the database management system whether the function contains no SQL statements (NO SQL), contains SQL statements but does not access any tables or views (CONTAINS SQL), reads data from tables or views (READS SQL DATA), or actually modifies data in the database (MODIFIES SQL DATA). User-defined functions should not be confused with stored procedures. Stored procedures allow the user to group a set of SQL commands. A procedure can accept parameters and execute its SQL statements depending on those parameters. A procedure is not an expression and, thus, cannot be used like user-defined functions. Some database management systems allow the creation of user defined functions in languages other than SQL. Microsoft SQL Server, for example, allows the user to use .NET languages including C# for this purpose. DB2 and Oracle support user-defined functions written in C or Java programming languages. === SQL Server 2000 === There are three types of UDF in Microsoft SQL Server 2000: scalar functions, inline table-valued functions, and multistatement table-valued functions. Scalar functions return a single data value (not a table) with RETURNS clause. Scalar functions can use all scalar data types, with exception of timestamp and user-defined data types. Inline table-valued functions return the result set of a single SELECT statement. Multistatement table-valued functions return a table, which was built with many TRANSACT-SQL statements. User-defined functions can be invoked from a query like built‑in functions such as OBJECT_ID, LEN, DATEDIFF, or can be executed through an EXECUTE statement like stored procedures. Performance Notes: User-defined functions are subroutines made of one or more Transact-SQL statements that can be used to encapsulate code for reuse. It takes zero or more arguments and evaluates a return value. Has both control-flow and DML statements in its body similar to stored procedures. Does not allow changes to any Global Session State, like modifications to database or external resource, such as a file or network. Does not support output parameter. DEFAULT keyword must be specified to pass the default value of parameter. Errors in UDF cause UDF to abort which, in turn, aborts the statement that invoked the UDF. === Apache Hive === Apache Hive defines, in addition to the regular user-defined functions (UDF), also user-defined aggregate functions (UDAF) and table-generating functions (UDTF). Hive enables developers to create their own custom functions with Java. === Apache Doris === Apache Doris, an open-source real-time analytical database, allows external users to contribute their own UDFs written in C++ to it.

    Read more →
  • INaturalist

    INaturalist

    iNaturalist is an American 501(c)(3) nonprofit social network of naturalists, citizen scientists, and biologists built on the concept of mapping and sharing observations of biodiversity across the globe. iNaturalist may be accessed via its website or from its mobile applications. iNaturalist includes an automated species identification tool, and users further assist each other in identifying organisms from photographs and sound recordings. As of 5 August 2025, iNaturalist users had contributed nearly 300 million observations of plants, animals, fungi, and other organisms worldwide, and 400,000 users were active in the previous 30 days. iNaturalist serves as an important resource of open data for biodiversity research, conservation, and education, describing itself as "an online social network of people sharing biodiversity information to help each other learn about nature." It is the primary application for crowd-sourced biodiversity data in places such as Mexico, southern Africa, and Australia, and the project has been called "a standard-bearer for natural history mobile applications." Most of iNaturalist's software is open source. It has contributed to over 4,000 research papers and is widely used by scientists, land managers, and conservationists worldwide. The platform has also been active in the discovery of new species and rediscovery of species previously assumed to be extinct. == History == iNaturalist began in 2008 as a UC Berkeley School of Information Master's final project of Nate Agrin, Jessica Kline, and Ken-ichi Ueda. Agrin and Ueda continued work on the site with Sean McGregor, a web developer. In 2011, Ueda began collaboration with Scott Loarie, a research fellow at Stanford University and lecturer at UC Berkeley. Ueda and Loarie are the current co-directors of iNaturalist.org. The organization merged with the California Academy of Sciences on 24 April 2014. In 2017, iNaturalist became a joint initiative between the California Academy of Sciences and the National Geographic Society. With these collaborations and growing popularity of the site since 2012, the number of participants and observations has roughly doubled each year. In 2014, iNaturalist reached 1 million observations. Later, as of October 2023, there were 181 million observations (163 million verifiable). On 11 July 2023 iNaturalist announced its status as a newly independent 501(c)(3) nonprofit organization. === Google AI controversy === On 9 June 2025 Google announced that iNaturalist would be part of its "Generative AI Accelerator". This announcement, paired with the initial lack of information on the iNaturalist site, led to outcry from many iNaturalist users in the blog comments and forum, worrying about the consequences for the environment, volunteer engagement, reliability and raised questions about the decision making within iNaturalist, while some saw the backlash as a sign that people want to resist 'corrosive technologies'. PZ Myers, a biology professor who uses iNaturalist in his teaching, published an article on his website Pharyngula stating that "any decision that drives people away and replaces them with a hallucinating bot is a bad decision". == Platforms == Users can interact with iNaturalist in the following ways: through the iNaturalist.org website, through two mobile apps: iNaturalist (iOS/Android) and Seek by iNaturalist (iOS/Android), or through partner organizations such as the Global Biodiversity Information Facility (GBIF) website. On the iNaturalist.org website, visitors can search the public dataset and interact with other people adding observations and identifications. The website provides tools for registered users to add, identify, and discuss observations, write journal posts, explore information about species, create project pages to recruit participation, and coordinate work on their topics of interest. On the iNaturalist mobile app, users can create and share nature observations to the online dataset, explore observations both nearby and around the world, and learn about different species. Seek by iNaturalist, a separate app marketed to families, requires no online account registration and all observations may remain private. Seek incorporates features of gamification, such as providing a list of nearby organisms to find and encouraging the collection of badges and participation in challenges. Seek was initially released in the spring of 2018. == Observations == The iNaturalist platform is based on crowdsourcing of observations and identifications. An iNaturalist observation records a person's encounter with an individual organism at a particular time and place. An iNaturalist observation may also record evidence of an organism, such as animal tracks, nests, or scat. The scope of iNaturalist excludes natural but inert subjects such as geologic or hydrologic features. Users typically upload photos as evidence of their findings, though audio recordings are also accepted, and such evidence is not a strict requirement. Users may share observation locations publicly, "obscure" them to display a less precise location or make the locations completely private. iNaturalist users can add identifications to each other's observations in order to confirm or improve the identification of the observation. Observations are classified as "Casual", "Needs ID" (needs identification), or "Research Grade" based on the quality of the data provided and the community identification process. Any quality of data can be downloaded from iNaturalist and "Research Grade" observations are often incorporated into other online databases such as the Global Biodiversity Information Facility and the Atlas of Living Australia. === Automated species identification === In addition to observations being identified by others in the community, iNaturalist includes an automated species identification tool, first released in 2017. Images can be identified via a computer vision model which has been trained on the large database of the observations on iNaturalist. Multiple species suggestions are typically provided with the suggestion that the software guesses to be most likely is at the top of the list. A broader taxon such as a genus or family is commonly provided if the model is unsure of the species. It is trained once or twice a year, and the threshold for species included in the training set has changed over time. It can be difficult for the model to guess correctly if the species in question is infrequently observed or hard to identify from images alone, or if the image submitted has poor lighting, is blurry, or contains multiple subjects. In February 2023, iNaturalist released v2.1 of its computer vision model, which was trained on a new source model which performed significantly better than the previous models trained using a different source model. In April 2025 iNaturalist released an updated app for iOS, changing the original version to "iNaturalist Classic." == Projects == Users have created and contributed to tens of thousands of different projects on iNaturalist. The platform is commonly used to record observations during bioblitzes, which are biological surveying events that attempt to record all the species that occur within a designated area, and a specific project type on iNaturalist. Other project types include collections of observations by location or taxon or documenting specific types of observations such as animal tracks and signs, the spread of invasive species, roadkill, fishing catches, or discovering new species. In 2011, iNaturalist was used as a platform to power the Global Amphibian and Global Reptile BioBlitzes, in which observations were used to help monitor the occurrence and distribution of the world's reptiles and amphibian species. The US National Park Service partnered with iNaturalist to record observations from the 2016 National Parks BioBlitz. That project exceeded 100,000 observations in August 2016. In 2017, the United Nations Environment Programme teamed up with iNaturalist to celebrate World Environment Day.. In 2022, Reef Ecologic teamed up with iNaturalist to celebrate World Oceans Day. === City Nature Challenge === In 2016, Lila Higgins from the Natural History Museum of Los Angeles County and Alison Young from the California Academy of Sciences co-founded the City Nature Challenge (CNC). In the first City Nature Challenge, naturalists in Los Angeles and the San Francisco Bay Area documented over 20,000 observations with the iNaturalist platform. In 2017, the CNC expanded to 16 cities across the United States and collected over 125,000 observations of wildlife in 5 days. The CNC expanded to a global audience in 2018, with 68 cities participating from 19 countries, with some cities using community science platforms other than iNaturalist to participate. In 4 days, over 17,000 people cataloged over 440,000 nature observations in urban regions around the world. In 2019, the CNC once again expanded, with 35,000 parti

    Read more →