AI Generator Job Application

AI Generator Job Application — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Linux Trace Toolkit

    Linux Trace Toolkit

    The Linux Trace Toolkit (LTT) is a set of tools that is designed to log program execution details from a patched Linux kernel and then perform various analyses on them, using console-based and graphical tools. LTT has been mostly superseded by its successor LTTng (Linux Trace Toolkit Next Generation). LTT allows the user to see in-depth information about the processes that were running during the trace period, including when context switches occurred, how long the processes were blocked for, and how much time the processes spent executing vs. how much time the processes were blocked. The data is logged to a text file and various console-based and graphical (GTK+) tools are provided for interpreting that data. In order to do data collection, LTT requires a patched Linux kernel. The authors of LTT claim that the performance hit for a patched kernel compared to a regular kernel is minimal; Their testing has reportedly shown that this is less than 2.5% on a "normal use" system (measured using batches of kernel makes) and less than 5% on a file I/O intensive system (measured using batches of tar). == Usage == === Collecting trace data === Data collection is Started by: trace 15 foo This command will cause the LTT tracedaemon to do a trace that lasts for 15 seconds, writing trace data to foo.trace and process information from the /proc filesystem to foo.proc. The trace command is actually a script which runs the program tracedaemon with some common options. It is possible to run tracedaemon directly and in that case, the user can use a number of command-line options to control the data which is collected. For the complete list of options supported by tracedaemon, see the online manual page for tracedaemon. === Viewing the results === Viewing the results of a trace can be accomplished with: traceview foo This command will launch a graphical (GTK+) traceview tool that will read from foo.trace and foo.proc. This tool can show information in various interesting ways, including Event Graph, Process Analysis, and Raw Trace. The Event Graph is perhaps the most interesting view, showing the exact timing of events like page faults, interrupts, and context switches, in a simple graphical way. The traceview command is a wrapper for a program called tracevisualizer. For the complete list of options supported by tracevisualizer, see the online manual page for tracevisualizer.

    Read more →
  • Clef (app)

    Clef (app)

    Clef was a San Francisco-based technology company, known for developing a mobile app that created a two-factor authentication for websites. It allowed users to access sites with a single login password management service which stores encrypted passwords in private accounts. It had a standard verification method that requires access to data on the mobile phone to confirm the user's identity. The application required a Wi-Fi or mobile network, and the user could log in by scanning the computer screen with their phone. == History == Clef was founded in 2013 by Mark Hudnall, B. Byrne and Jesse Pollak. It raised $1.6 million in seed funding in November 2014. Clef integrated with many websites and applications, including WordPress. On March 17, 2017, Clef announced they would no longer support the plugin after June 6, 2017; Clef was acquired by Authy, another 2FA service, which later got acquired by Twilio.

    Read more →
  • N-World

    N-World

    N-World is a 3D graphics package developed by Nichimen Graphics in the 1990s, for Silicon Graphics and Windows NT workstations. Intended primarily for video game content creation, it has polygon modeling tools, 2D and 3D paint, scripting, color reduction, and exporters for several popular game consoles. After its initial release on Windows NT, N-World was renamed Mirai. The winged edge 3D modeler in N-World inspired the development at Nichimen Graphics of Nendo, a standalone 3D modeler, which in turn inspired the open source modeler Wings 3D. == History == N-World originated with Symbolics, a computer manufacturer notable for producing Lisp-based systems in the 1980s. Among the software packages that were produced for Symbolics computers are S-Graphics, a 3D animation suite that includes modules for polygon modeling, dynamics, paint, and rendering — titled S-Geometry, S-Dynamics, S-Paint, and S-Render, respectively. In 1992, Japanese trading company Nichimen Corporation purchased the rights to S-Graphics, ported it to Silicon Graphics IRIX, and marketed it as N-World. N-World retains the Lisp-based underpinnings of its predecessor, but was targeted at interactive content producers, with features useful for game developers. It was priced at US$16,995 (equivalent to $34,100 in 2025) for the full suite, later reduced to $9,995 when ported to Windows NT in 1997. N-World was used to create graphics for many console games in the 1990s, specifically most of the Nintendo 64 games, like Super Mario 64 and Final Fantasy VII. It was superseded by Mirai in 1999. == Features == The N-World package, like its predecessor S-Graphics, is divided into several components: N-Geometry: 3D polygon-based modeling tools, including smoothing, "magnet" geometry editing, and instancing. N-Dynamics: Animation tools including scripting, curve-based animation, and skeletal animation. N-Render: Surfacing and rendering tools with ray tracing and materials output to various game console formats. N-Paint: 2D and 3D paint with mattes, effects, color reduction, and a visual VRAM editor for PlayStation. Game Tools: Utilities for game developers, including exporters for PlayStation, Nintendo 64, and Saturn consoles. == Credits == The following games were created using N-World. Rap Stars Online

    Read more →
  • Exposure Notification

    Exposure Notification

    The (Google/Apple) Exposure Notification System (GAEN) is a framework and protocol specification developed by Apple Inc. and Google to facilitate digital contact tracing during the COVID-19 pandemic. When used by health authorities, it augments more traditional contact tracing techniques by automatically logging close approaches among notification system users using Android or iOS smartphones. Exposure Notification is a decentralized reporting protocol built on a combination of Bluetooth Low Energy technology and privacy-preserving cryptography. It is an opt-in feature within COVID-19 apps developed and published by authorized health authorities. Unveiled on April 10, 2020, it was made available on iOS on May 20, 2020, as part of the iOS 13.5 update and on December 14, 2020, as part of the iOS 12.5 update for older iPhones. On Android, it was added to devices via a Google Play Services update, supporting all versions since Android Marshmallow. The Apple/Google protocol is similar to the Decentralized Privacy-Preserving Proximity Tracing (DP-3T) protocol created by the European DP-3T consortium and the Temporary Contact Number (TCN) protocol by Covid Watch, but is implemented at the operating system level, which allows for more efficient operation as a background process. Since May 2020, a variant of the DP-3T protocol is supported by the Exposure Notification Interface. Other protocols are constrained in operation because they are not privileged over normal apps. This leads to issues, particularly on iOS devices where digital contact tracing apps running in the background experience significantly degraded performance. The joint approach is also designed to maintain interoperability between Android and iOS devices, which constitute nearly all of the market. The ACLU stated the approach "appears to mitigate the worst privacy and centralization risks, but there is still room for improvement". In late April, Google and Apple shifted the emphasis of the naming of the system, describing it as an "exposure notification service", rather than "contact tracing" system. == Technical specification == Digital contact tracing protocols typically have two major responsibilities: encounter logging and infection reporting. Exposure Notification only involves encounter logging which is a decentralized architecture. The majority of infection reporting is centralized in individual app implementations. To handle encounter logging, the system uses Bluetooth Low Energy to send tracking messages to nearby devices running the protocol to discover encounters with other people. The tracking messages contain unique identifiers that are encrypted with a secret daily key held by the sending device. These identifiers change every 15–20 minutes as well as Bluetooth MAC address in order to prevent tracking of clients by malicious third parties through observing static identifiers over time. The sender's daily encryption keys are generated using a random number generator. Devices record received messages, retaining them locally for 14 days. If a user tests positive for infection, the last 14 days of their daily encryption keys can be uploaded to a central server, where it is then broadcast to all devices on the network. The method through which daily encryption keys are transmitted to the central server and broadcast is defined by individual app developers. The Google-developed reference implementation calls for a health official to request a one-time verification code (VC) from a verification server, which the user enters into the encounter logging app. This causes the app to obtain a cryptographically signed certificate, which is used to authorize the submission of keys to the central reporting server. The received keys are then provided to the protocol, where each client individually searches for matches in their local encounter history. If a match meeting certain risk parameters is found, the app notifies the user of potential exposure to the infection. Google and Apple intend to use the received signal strength (RSSI) of the beacon messages as a source to infer proximity. RSSI and other signal metadata will also be encrypted to resist deanonymization attacks. === Version 1.0 === To generate encounter identifiers, first a persistent 32-byte private Tracing Key ( t k {\displaystyle tk} ) is generated by a client. From this a 16 byte Daily Tracing Key is derived using the algorithm d t k i = H K D F ( t k , N U L L , 'CT-DTK' | | D i , 16 ) {\displaystyle dtk_{i}=HKDF(tk,NULL,{\text{'CT-DTK'}}||D_{i},16)} , where H K D F ( Key, Salt, Data, OutputLength ) {\displaystyle HKDF({\text{Key, Salt, Data, OutputLength}})} is a HKDF function using SHA-256, and D i {\displaystyle D_{i}} is the day number for the 24-hour window the broadcast is in starting from Unix Epoch Time. These generated keys are later sent to the central reporting server should a user become infected. From the daily tracing key a 16-byte temporary Rolling Proximity Identifier is generated every 10 minutes with the algorithm R P I i , j = Truncate ( H M A C ( d t k i , 'CT-RPI' | | T I N j ) , 16 ) {\displaystyle RPI_{i,j}={\text{Truncate}}(HMAC(dtk_{i},{\text{'CT-RPI'}}||TIN_{j}),16)} , where H M A C ( Key, Data ) {\displaystyle HMAC({\text{Key, Data}})} is a HMAC function using SHA-256, and T I N j {\displaystyle TIN_{j}} is the time interval number, representing a unique index for every 10 minute period in a 24-hour day. The Truncate function returns the first 16 bytes of the HMAC value. When two clients come within proximity of each other they exchange and locally store the current R P I i , j {\displaystyle RPI_{i,j}} as the encounter identifier. Once a registered health authority has confirmed the infection of a user, the user's Daily Tracing Key for the past 14 days is uploaded to the central reporting server. Clients then download this report and individually recalculate every Rolling Proximity Identifier used in the report period, matching it against the user's local encounter log. If a matching entry is found, then contact has been established and the app presents a notification to the user warning them of potential infection. === Version 1.1 === Unlike version 1.0 of the protocol, version 1.1 does not use a persistent tracing key, rather every day a new random 16-byte Temporary Exposure Key ( t e k i {\displaystyle tek_{i}} ) is generated. This is analogous to the daily tracing key from version 1.0. Here i {\displaystyle i} denotes the time is discretized in 10 minute intervals starting from Unix Epoch Time. From this two 128-bit keys are calculated, the Rolling Proximity Identifier Key ( R P I K i {\displaystyle RPIK_{i}} ) and the Associated Encrypted Metadata Key ( A E M K i {\displaystyle AEMK_{i}} ). R P I K i {\displaystyle RPIK_{i}} is calculated with the algorithm R P I K i = H K D F ( t e k i , N U L L , 'EN-RPIK' , 16 ) {\displaystyle RPIK_{i}=HKDF(tek_{i},NULL,{\text{'EN-RPIK'}},16)} , and A E M K i {\displaystyle AEMK_{i}} using the algorithm A E M K i = H K D F ( t e k i , N U L L , 'EN-AEMK' , 16 ) {\displaystyle AEMK_{i}=HKDF(tek_{i},NULL,{\text{'EN-AEMK'}},16)} . From these values a temporary Rolling Proximity Identifier ( R P I i , j {\displaystyle RPI_{i,j}} ) is generated every time the BLE MAC address changes, roughly every 15–20 minutes. The following algorithm is used: R P I i , j = A E S 128 ( R P I K i , 'EN-RPI' | | 0 x 000000000000 | | E N I N j ) {\displaystyle RPI_{i,j}=AES128(RPIK_{i},{\text{'EN-RPI'}}||{\mathtt {0x000000000000}}||ENIN_{j})} , where A E S 128 ( Key, Data ) {\displaystyle AES128({\text{Key, Data}})} is an AES cryptography function with a 128-bit key, the data is one 16-byte block, j {\displaystyle j} denotes the Unix Epoch Time at the moment the roll occurs, and E N I N j {\displaystyle ENIN_{j}} is the corresponding 10-minute interval number. Next, additional Associated Encrypted Metadata is encrypted. What the metadata represents is not specified, likely to allow the later expansion of the protocol. The following algorithm is used: Associated Encrypted Metadata i , j = A E S 128 _ C T R ( A E M K i , R P I i , j , Metadata ) {\displaystyle {\text{Associated Encrypted Metadata}}_{i,j}=AES128\_CTR(AEMK_{i},RPI_{i,j},{\text{Metadata}})} , where A E S 128 _ C T R ( Key, IV, Data ) {\displaystyle AES128\_CTR({\text{Key, IV, Data}})} denotes AES encryption with a 128-bit key in CTR mode. The Rolling Proximity Identifier and the Associated Encrypted Metadata are then combined and broadcast using BLE. Clients exchange and log these payloads. Once a registered health authority has confirmed the infection of a user, the user's Temporary Exposure Keys t e k i {\displaystyle tek_{i}} and their respective interval numbers i {\displaystyle i} for the past 14 days are uploaded to the central reporting server. Clients then download this report and individually recalculate every Rolling Proximity Identifier starting from interval number i {\displaystyle i} ,

    Read more →
  • Multi-task learning

    Multi-task learning

    Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time, while exploiting commonalities and differences across tasks. This can result in improved learning efficiency and prediction accuracy for the task-specific models, when compared to training the models separately. Inherently, Multi-task learning is a multi-objective optimization problem having trade-offs between different tasks. Early versions of MTL were called "hints". In a widely cited 1997 paper, Rich Caruana gave the following characterization:Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better. In the classification context, MTL aims to improve the performance of multiple classification tasks by learning them jointly. One example is a spam-filter, which can be treated as distinct but related classification tasks across different users. To make this more concrete, consider that different people have different distributions of features which distinguish spam emails from legitimate ones, for example an English speaker may find that all emails in Russian are spam, not so for Russian speakers. Yet there is a definite commonality in this classification task across users, for example one common feature might be text related to money transfer. Solving each user's spam classification problem jointly via MTL can let the solutions inform each other and improve performance. Further examples of settings for MTL include multiclass classification and multi-label classification. Multi-task learning works because regularization induced by requiring an algorithm to perform well on a related task can be superior to regularization that prevents overfitting by penalizing all complexity uniformly. One situation where MTL may be particularly helpful is if the tasks share significant commonalities and are generally slightly under sampled. However, as discussed below, MTL has also been shown to be beneficial for learning unrelated tasks. == Methods == The key challenge in multi-task learning, is how to combine learning signals from multiple tasks into a single model. This may strongly depend on how well different task agree with each other, or contradict each other. There are several ways to address this challenge: === Task grouping and overlap === Within the MTL paradigm, information can be shared across some or all of the tasks. Depending on the structure of task relatedness, one may want to share information selectively across the tasks. For example, tasks may be grouped or exist in a hierarchy, or be related according to some general metric. Suppose, as developed more formally below, that the parameter vector modeling each task is a linear combination of some underlying basis. Similarity in terms of this basis can indicate the relatedness of the tasks. For example, with sparsity, overlap of nonzero coefficients across tasks indicates commonality. A task grouping then corresponds to those tasks lying in a subspace generated by some subset of basis elements, where tasks in different groups may be disjoint or overlap arbitrarily in terms of their bases. Task relatedness can be imposed a priori or learned from the data. Hierarchical task relatedness can also be exploited implicitly without assuming a priori knowledge or learning relations explicitly. For example, the explicit learning of sample relevance across tasks can be done to guarantee the effectiveness of joint learning across multiple domains. === Exploiting unrelated tasks: Auxiliary learning === In auxiliary learning, one attempts learning a group of principal tasks using a group of auxiliary tasks, unrelated to the principal ones. With the right unrelated tasks, joint learning of unrelated tasks which use the same input data have been shown to be beneficial, and provide significant improvement over standard MTL. The reason is that prior knowledge about task relatedness can lead to sparser and more informative representations for each task grouping, essentially by screening out idiosyncrasies of the data distribution. It has been proposed to build on a prior multitask methodology by favoring a shared low-dimensional representation within each task grouping, and imposing a penalty on tasks from different groups which encourages the two representations to be orthogonal. Learning with auxiliary unrelated tasks poses two major challenges: Finding useful auxiliary tasks and combining losses of all tasks in a useful way. Some methods can learn these from data together with the training process, and combine tasks efficiently. === Transfer of knowledge === Related to multi-task learning is the concept of knowledge transfer. Whereas traditional multi-task learning implies that a shared representation is developed concurrently across tasks, transfer of knowledge implies a sequentially shared representation. Large scale machine learning projects such as the deep convolutional neural network GoogLeNet, an image-based object classifier, can develop robust representations which may be useful to further algorithms learning related tasks. For example, the pre-trained model can be used as a feature extractor to perform pre-processing for another learning algorithm. Or the pre-trained model can be used to initialize a model with similar architecture which is then fine-tuned to learn a different classification task. === Multiple non-stationary tasks === Traditionally Multi-task learning and transfer of knowledge are applied to stationary learning settings. Their extension to non-stationary environments is termed Group online adaptive learning (GOAL). Sharing information could be particularly useful if learners operate in continuously changing environments, because a learner could benefit from previous experience of another learner to quickly adapt to their new environment. Such group-adaptive learning has numerous applications, from predicting financial time-series, through content recommendation systems, to visual understanding for adaptive autonomous agents. === Multi-task optimization === Multi-task optimization focuses on solving optimizing the whole process. The paradigm has been inspired by the well-established concepts of transfer learning and multi-task learning in predictive analytics. The key motivation behind multi-task optimization is that if optimization tasks are related to each other in terms of their optimal solutions or the general characteristics of their function landscapes, the search progress can be transferred to substantially accelerate the search on the other. The success of the paradigm is not necessarily limited to one-way knowledge transfers from simpler to more complex tasks. In practice an attempt is to intentionally solve a more difficult task that may unintentionally solve several smaller problems. There is a direct relationship between multitask optimization and multi-objective optimization. In some cases, the simultaneous training of seemingly related tasks may hinder performance compared to single-task models. Commonly, MTL models employ task-specific modules on top of a joint feature representation obtained using a shared module. Since this joint representation must capture useful features across all tasks, MTL may hinder individual task performance if the different tasks seek conflicting representation, i.e., the gradients of different tasks point to opposing directions or differ significantly in magnitude. This phenomenon is commonly referred to as negative transfer. To mitigate this issue, various MTL optimization methods have been proposed. It has been reported that meta-knowledge transfer could help avoid negative transfer.Besides, the per-task gradients are combined into a joint update direction through various aggregation algorithms or heuristics. There are several common approaches for multi-task optimization: Bayesian optimization, evolutionary computation, and approaches based on Game theory. ==== Multi-task Bayesian optimization ==== Multi-task Bayesian optimization is a modern model-based approach that leverages the concept of knowledge transfer to speed up the automatic hyperparameter optimization process of machine learning algorithms. The method builds a multi-task Gaussian process model on the data originating from different searches progressing in tandem. The captured inter-task dependencies are thereafter utilized to better inform the subsequent sampling of candidate solutions in respective search spaces. ==== Evolutionary multi-tasking ==== Evolutionary multi-tasking has been explored as a means of exploiting the implicit parallelism of population-based search algorithms to simultaneously progress multiple distinct optimization tasks. By mapping all task

    Read more →
  • Automatic meter reading

    Automatic meter reading

    Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from water meter or energy metering devices (gas, electric) and transferring that data to a central database for billing, troubleshooting, and analyzing. This technology mainly saves utility providers the expense of periodic trips to each physical location to read a meter. Another advantage is that billing can be based on near real-time consumption rather than on estimates based on past or predicted consumption. This timely information coupled with analysis can help both utility providers and customers better control the use and production of electric energy, gas usage, or water consumption. AMR technologies include handheld, mobile and network technologies based on telephony platforms (wired and wireless), radio frequency (RF), or powerline transmission. == Technologies == === Touch technology === With touch-based AMR, a meter reader carries a handheld computer or data collection device with a wand or probe. The device automatically collects the readings from a meter by touching or placing the read probe close to a reading coil enclosed in the touchpad. When a button is pressed, the probe sends an interrogate signal to the touch module to collect the meter reading. The software in the device matches the serial number to one in the route database, and saves the meter reading for later download to a billing or data collection computer. Since the meter reader still has to go to the site of the meter, this is sometimes referred to as "on-site" AMR. Another form of contact reader uses a standardized infrared port to transmit data. Protocols are standardized between manufacturers by such documents as ANSI C12.18 or IEC 61107. === AMR hosting === AMR hosting is a back-office solution which allows a user to track their electricity, water, or gas consumption over the Internet. All data is collected in near real-time, and is stored in a database by data acquisition software. The user can view the data via a web application, and can analyze the data using various online analysis tools such as charting load profiles, analyzing tariff components, and verify their utility bill. === Radio frequency network === Radio frequency based AMR can take many forms. The more common ones are handheld, mobile, satellite and fixed network solutions. There are both two-way RF systems and one-way RF systems in use that use both licensed and unlicensed RF bands. In a two-way or "wake up" system, a radio signal is normally sent to an AMR meter's unique serial number, instructing its transceiver to power-up and transmit its data. The meter transceiver and the reading transceiver both send and receive radio signals. In a one-way "bubble-up" or continuous broadcast type system, the meter transmits continuously and data is sent every few seconds. This means the reading device can be a receiver only, and the meter a transmitter only. Data travels only from the meter transmitter to the reading receiver. There are also hybrid systems that combine one-way and two-way techniques, using one-way communication for reading and two-way communication for programming functions. RF-based meter reading usually eliminates the need for the meter reader to enter the property or home, or to locate and open an underground meter pit. The utility saves money by increased speed of reading, has less liability from entering private property, and has fewer missed readings from being unable to access the meter. The technology based on RF is not readily accepted everywhere. In several Asian countries, the technology faces a barrier of regulations in place pertaining to use of the radio frequency of any radiated power. For example, in India the radio frequency which is generally in ISM band is not free to use even for low power radio of 10 mW. The majority of manufacturers of electricity meters have radio frequency devices in the frequency band of 433/868 MHz for large scale deployment in European countries. The frequency band of 2.4 GHz can be now used in India for outdoor as well as indoor applications, but few manufacturers have shown products within this frequency band. Initiatives in radio frequency AMR in such countries are being taken up with regulators wherever the cost of licensing outweighs the benefits of AMR. ==== Handheld ==== In handheld AMR, a meter reader carries a handheld computer with a built-in or attached receiver/transceiver (radio frequency or touch) to collect meter readings from an AMR capable meter. This is sometimes referred to as "walk-by" meter reading since the meter reader walks by the locations where meters are installed as they go through their meter reading route. Handheld computers may also be used to manually enter readings without the use of AMR technology as an alternate but this will not support exhaustive data which can be accurately read using the meter reading electronically. ==== Mobile ==== Mobile or "drive-by" meter reading is where a reading device is installed in a vehicle. The meter reader drives the vehicle while the reading device automatically collects the meter readings. Often, for mobile meter reading, the reading equipment includes navigational and mapping features provided by GPS and mapping software. With mobile meter reading, the reader does not normally have to read the meters in any particular route order, but just drives the service area until all meters are read. Components often consist of a laptop or proprietary computer, software, RF receiver/transceiver, and external vehicle antennas. ==== Satellite ==== Transmitters for data collection satellites can be installed in the field next to existing meters. The satellite AMR devices communicate with the meter for readings, and then sends those readings over a fixed or mobile satellite network. This network requires a clear view to the sky for the satellite transmitter/receiver, but eliminates the need to install fixed towers or send out field technicians, thereby being particularly suited for areas with low geographic meter density. ==== RF technologies commonly used for AMR ==== Narrow Band (single fixed radio frequency) Spread spectrum Direct-sequence spread spectrum (DSSS) Frequency-hopping spread spectrum (FHSS) There are also meters using AMR with RF technologies such as cellular phone data systems, Zigbee, Bluetooth, Wavenis and others. Some systems operate with U.S. Federal Communications Commission (FCC) licensed frequencies and others under FCC Part 15, which allows use of unlicensed radio frequencies. ==== Wi-Fi ==== WiSmart is a versatile platform which can be used by a variety of electrical home appliances in order to provide wireless TCP/IP communication using the 802.11 b/g protocol. Devices such as the Smart Thermostat permit a utility to lower a home's power consumption to help manage power demand. The city of Corpus Christi became one of the first cities in the United States to implement citywide Wi-Fi, which had been free until May 31, 2007, mainly to facilitate AMR after a meter reader was attacked by a dog. Today many meters are designed to transmit using Wi-Fi, even if a Wi-Fi network is not available, and they are read using a drive-by local Wi-Fi hand held receiver. The meters installed in Corpus Christi are not directly Wi-Fi enabled, but rather transmit narrow-band burst telemetry on the 460 MHz band. This narrow-band signal has much greater range than Wi-Fi, so the number of receivers required for the project are far fewer. Special receiver stations then decode the narrow-band signals and resend the data via Wi-Fi. Most of the automated utility meters installed in the Corpus Christi area are battery powered. Wi-Fi technology is unsuitable for long-term battery-powered operation. === Power line communication === PLC is a method where electronic data is transmitted over power lines back to the substation, then relayed to a central computer in the utility's main office. This would be considered a type of fixed network system—the network being the distribution network which the utility has built and maintains to deliver electric power. Such systems are primarily used for electric meter reading. Some providers have interfaced gas and water meters to feed into a PLC type system. == Brief history == In 1972, Theodore George "Ted" Paraskevakos, while working with Boeing in Huntsville, Alabama, developed a sensor monitoring system which used digital transmission for security, fire and medical alarm systems as well as meter reading capabilities for all utilities. This technology was a spin-off of the automatic telephone line identification system, now known as caller ID. In 1974, Paraskevakos was awarded a U.S. patent for this technology. In 1977, he launched Metretek, Inc., which developed and produced the first fully automated, commercially available remote meter reading and load management system. Since this system was developed pre-Internet, Metret

    Read more →
  • Transfer function matrix

    Transfer function matrix

    In control system theory, and various branches of engineering, a transfer function matrix, or just transfer matrix is a generalisation of the transfer functions of single-input single-output (SISO) systems to multiple-input and multiple-output (MIMO) systems. The matrix relates the outputs of the system to its inputs. It is a particularly useful construction for linear time-invariant (LTI) systems because it can be expressed in terms of the s-plane. In some systems, especially ones consisting entirely of passive components, it can be ambiguous which variables are inputs and which are outputs. In electrical engineering, a common scheme is to gather all the voltage variables on one side and all the current variables on the other regardless of which are inputs or outputs. This results in all the elements of the transfer matrix being in units of impedance. The concept of impedance (and hence impedance matrices) has been borrowed into other energy domains by analogy, especially mechanics and acoustics. Many control systems span several different energy domains. This requires transfer matrices with elements in mixed units. This is needed both to describe transducers that make connections between domains and to describe the system as a whole. If the matrix is to properly model energy flows in the system, compatible variables must be chosen to allow this. == General == A MIMO system with m outputs and n inputs is represented by a m × n matrix. Each entry in the matrix is in the form of a transfer function relating an output to an input. For example, for a three-input, two-output system, one might write, [ y 1 y 2 ] = [ g 11 g 12 g 13 g 21 g 22 g 23 ] [ u 1 u 2 u 3 ] {\displaystyle {\begin{bmatrix}y_{1}\\y_{2}\end{bmatrix}}={\begin{bmatrix}g_{11}&g_{12}&g_{13}\\g_{21}&g_{22}&g_{23}\end{bmatrix}}{\begin{bmatrix}u_{1}\\u_{2}\\u_{3}\end{bmatrix}}} where the un are the inputs, the ym are the outputs, and the gmn are the transfer functions. This may be written more succinctly in matrix operator notation as, Y = G U {\displaystyle \mathbf {Y} =\mathbf {G} \mathbf {U} } where Y is a column vector of the outputs, G is a matrix of the transfer functions, and U is a column vector of the inputs. In many cases, the system under consideration is a linear time-invariant (LTI) system. In such cases, it is convenient to express the transfer matrix in terms of the Laplace transform (in the case of continuous time variables) or the z-transform (in the case of discrete time variables) of the variables. This may be indicated by writing, for instance, Y ( s ) = G ( s ) U ( s ) {\displaystyle \mathbf {Y} (s)=\mathbf {G} (s)\mathbf {U} (s)} which indicates that the variables and matrix are in terms of s, the complex frequency variable of the s-plane arising from Laplace transforms, rather than time. The examples in this article are all assumed to be in this form, although that is not explicitly indicated for brevity. For discrete time systems s is replaced by z from the z-transform, but this makes no difference to subsequent analysis. The matrix is particularly useful when it is a proper rational matrix, that is, all its elements are proper rational functions. In this case, the state-space representation can be applied. In systems engineering, the overall system transfer matrix G (s) is decomposed into two parts: H (s) representing the system being controlled, and C(s) representing the control system. C (s) takes as its inputs the inputs of G (s) and the outputs of H (s). The outputs of C (s) form the inputs for H (s). == Electrical systems == In electrical systems, it is often the case that the distinction between input and output variables is ambiguous. They can be either, depending on circumstance and point of view. In such cases, the concept of port (a place where energy is transferred from one system to another) can be more useful than input and output. It is customary to define two variables for each port (p): the voltage across it (Vp) and the current entering it (Ip). For instance, the transfer matrix of a two-port network can be defined as follows, [ V 1 V 2 ] = [ z 11 z 12 z 21 z 22 ] [ I 1 I 2 ] {\displaystyle {\begin{bmatrix}V_{1}\\V_{2}\end{bmatrix}}={\begin{bmatrix}z_{11}&z_{12}\\z_{21}&z_{22}\\\end{bmatrix}}{\begin{bmatrix}I_{1}\\I_{2}\end{bmatrix}}} where the zmn are called the impedance parameters, or z-parameters. They are so-called because they are in units of impedance and relate port currents to a port voltage. The z-parameters are not the only way that transfer matrices are defined for two-port networks. Six basic matrices relate voltages and currents, each with advantages for particular system network topologies. However, only two of these can be extended beyond two ports to an arbitrary number of ports. These two are the z-parameters and their inverse, the admittance parameters or y-parameters. To understand the relationship between port voltages and currents and inputs and outputs, consider the simple voltage divider circuit. If we only wish to consider the output voltage (V2) resulting from applying the input voltage (V1) then the transfer function can be expressed as, [ V 2 ] = [ R 2 R 1 + R 2 ] [ V 1 ] {\displaystyle {\begin{bmatrix}V_{2}\end{bmatrix}}={\begin{bmatrix}{\dfrac {R_{2}}{R_{1}+R_{2}}}\end{bmatrix}}{\begin{bmatrix}V_{1}\end{bmatrix}}} which can be considered the trivial case of a 1×1 transfer matrix. The expression correctly predicts the output voltage if there is no current leaving port 2, but is increasingly inaccurate as the load increases. If, however, we attempt to use the circuit in reverse, driving it with a voltage at port 2 and calculate the resulting voltage at port 1 the expression gives completely the wrong result even with no load on port 1. It predicts a greater voltage at port 1 than was applied at port 2, an impossibility with a purely resistive circuit like this one. To correctly predict the behaviour of the circuit, the currents entering or leaving the ports must also be taken into account, which is what the transfer matrix does. The impedance matrix for the voltage divider circuit is, [ V 1 V 2 ] = [ R 1 + R 2 R 2 R 2 R 2 ] [ I 1 I 2 ] {\displaystyle {\begin{bmatrix}V_{1}\\V_{2}\end{bmatrix}}={\begin{bmatrix}R_{1}+R_{2}&R_{2}\\R_{2}&R_{2}\end{bmatrix}}{\begin{bmatrix}I_{1}\\I_{2}\end{bmatrix}}} which fully describes its behaviour under all input and output conditions. At microwave frequencies, none of the transfer matrices based on port voltages and currents are convenient to use in practice. Voltage is difficult to measure directly, current next to impossible, and the open circuits and short circuits required by the measurement technique cannot be achieved with any accuracy. For waveguide implementations, circuit voltage and current are entirely meaningless. Transfer matrices using different sorts of variables are used instead. These are the powers transmitted into, and reflected from a port, which are readily measured in the transmission line technology used in distributed-element circuits in the microwave band. The most well-known and widely used of these sorts of parameters is the scattering parameters, or s-parameters. == Mechanical and other systems == The concept of impedance can be extended into the mechanical and other domains through a mechanical-electrical analogy, hence the impedance parameters and other forms of 2-port network parameters can also be extended to the mechanical domain. To do this, an effort variable and a flow variable are made analogues of voltage and current, respectively. For mechanical systems under translation these variables are force and velocity respectively. Expressing the behaviour of a mechanical component as a two-port or multi-port with a transfer matrix is a useful thing to do because, like electrical circuits, the component can often be operated in reverse and its behaviour is dependent on the loads at the inputs and outputs. For instance, a gear train is often characterised simply by its gear ratio, a SISO transfer function. However, the gearbox output shaft can be driven around to turn the input shaft, requiring a MIMO analysis. In this example, the effort and flow variables are torque T and angular velocity ω, respectively. The transfer matrix in terms of z-parameters will look like, [ T 1 T 2 ] = [ z 11 z 12 z 21 z 22 ] [ ω 1 ω 2 ] {\displaystyle {\begin{bmatrix}T_{1}\\T_{2}\end{bmatrix}}={\begin{bmatrix}z_{11}&z_{12}\\z_{21}&z_{22}\end{bmatrix}}{\begin{bmatrix}\omega _{1}\\\omega _{2}\end{bmatrix}}} However, the z-parameters are not necessarily the most convenient for characterising gear trains. A gear train is the analogue of an electrical transformer and the h-parameters (hybrid parameters) better describe transformers because they directly include the turns ratios (the analogue of gear ratios). The gearbox transfer matrix in h-parameter format is, [ T 1 ω 2 ] = [ h 11 h 12 h 21 h 22 ] [ ω 1 T 2 ] {\displaystyle {\begin{bmatrix}T_{1}\\\omega _{2}\end{bm

    Read more →
  • Bitstrips

    Bitstrips

    Bitstrips, Inc. was a Canadian media and technology company based in Toronto, founded in 2007 by Jacob Blackstock, David Kennedy, Shahan Panth, Dorian Baldwin, and Jesse Brown. The company created and offered a web application, Bitstrips.com, which allowed users to create comic strips using personalized avatars, and preset templates and poses. Brown and Blackstock explained that the service was meant to enable self-expression without the need to have artistic skills. Bitstrips was first presented in 2008 at South by Southwest in Austin, Texas, and the service later piloted and launched a version designed for use as educational software. The service achieved increasing prominence following the launch of versions for Facebook and mobile platforms. In 2014, Bitstrips launched a spin-off app known as Bitmoji, which allows users to create personalized stickers for use in instant messaging. In July 2016, Snapchat Inc. announced that it had acquired the company; the Bitstrips comic service was shut down, but Bitmoji remains operational, and has subsequently been given greater prominence within Snapchat's overall platform. == History == Bitstrips was co-developed by Toronto-based comic artist Jacob Blackstock and his high school friend, journalist Jesse Brown. The service was originally envisioned as a means to allow anyone to create their own comic strip without needing artistic skills. Brown explained that "it's so difficult and time-consuming to tell a story in comic book form, drawing the same characters again and again in these tiny little panels, and just the amount of craftsmanship required. And even if you can do it well, which I never could, it takes years to make a story." Brown stated that the service would be "groundwork for a whole new way to communicate", and went as far as describing the service as being a "YouTube for comics". Blackstock explained that the concept of Bitstrips was influenced by his own use of comics as a form of socialization; a student, Blackstock and his friends drew comics featuring each other and shared them during classes. He felt that Bitstrips was a "medium for self-expression", stating that "It's not just about you making the comics, but since you and your friends star in these comics, it's like you're the medium. The visual nature of comics just speaks so much louder than text." The service was publicly unveiled at South by Southwest in 2008. In 2009, the service introduced a version oriented towards the educational market, Bitstrips for Schools, which was initially piloted at a number of schools in Ontario. The service was praised by educators for being engaging to students, especially within language classes. Brown noted that students were using the service to create comics outside of class as well, stating that it was "so gratifying and shocking what people do with your tool to make their own stories in ways that you never would have anticipated. Some of them are just brilliant." In December 2012, Bitstrips launched a version for Facebook; by July 2013, Bitstrips had 10 million unique users on Facebook, having created over 50 million comics. In October 2013, Bitstrips launched a mobile app; in two months, Bitstrips became a top-downloaded app in 40 countries, and over 30 million avatars had been created with it. In November 2013, Bitstrips secured a round of funding from Horizons Ventures and Li Ka-shing. In October 2014, Bitstrips launched Bitmoji, a spin-off app that allows users to create stickers featuring Bitstrips characters in various templates. In July 2016, following unconfirmed reports earlier in the year, Snapchat Inc. announced that it had acquired Bitstrips. The company's staff continue to operate out of Toronto, but the original Bitstrips comic service was shut down in favour of focusing exclusively on Bitmoji, leaving many Bitstrips users to call for a reboot of the comic service.

    Read more →
  • ReRites

    ReRites

    ReRites (also known as RERITES, ReadingRites, Big Data Poetry) is a literary work of "Human + A.I. poetry" by David Jhave Johnston that used neural network models trained to generate poetry which the author then edited. ReRites won the Robert Coover Award for a Work of Electronic Literature in 2022. == About the project == The ReRites project began as a daily rite of writing with a neural network, expanded into a series of performances from which video documentation has been published online, and concluded with a set of 12 books and an accompanying book of essays published by Anteism Books in 2019. In Electronic Literature, Scott Rettberg describes the early phases of the project in 2016, when it bore the preliminary name Big Data Poetry. Jhave (the artist name that David Jhave Johnston goes by) describes the process of writing ReRites as a rite: "Every morning for 2 hours (normally 6:30–8:30am) I get up and edit the poetic output of a neural net. Deleting, weaving, conjugating, lineating, cohering. Re-writing. Re-wiring authorship: hybrid augmented enhanced evolutionary". There is video documentation of the writing process. The human editing of the neural network's output is fundamental to this project, and Jhave gives examples of both unedited text extracts and his edited versions in publications about the project. Kyle Booten describes ReRites as "simultaneously dusty and outrageously verdant, monotonously sublime and speckled with beautiful and rare specimens". === Performances === ReRites was first shared with an audience through a series of performances where audience members and poets would participate in reading the automatically generated texts, which appeared on screen so fast that human readers could barely keep up. This has been described as allowing participants to "re-discover[..] the peculiar pleasures of being embodied", or, in Jhave's own words, as a space where human participants were "playing their wits and voices against an evocative infinite deep-learning muse". The first performance was at Brown University's Interrupt Festival in 2019. It has been performed many times since, including at the Barbican Centre in London and Anteism Books. === Print publications === For a single year Jhave published one book of poetry from the ReRites project each month. These twelve volumes are accompanied by a book of essays, all published by Anteism Books. The accompanying essays provide critical responses to the project from poets and scholars including Allison Parrish, Johanna Drucker, Kyle Booten, Stephanie Strickland, John Cayley, Lai-Tze Fan, Nick Montfort, Mairéad Byrne, and Chris Funkhouser. Allison Parrish notes elsewhere that these paratexts to ReRites serve a legitimising function for a genre of poetry that is not yet institutionally acknowledged. === Technical details === Starting in 2016 under the name Big Data Poetry, Jhave generated poems using, in his own words, "neural network code (..) adapted from three corporate github-hosted machine-learning libraries: TensorFlow (Google), PyTorch (Facebook), and AWD-LSTM (SalesForce)". He explains that the "models were trained on a customised corpus of 600,000 lines of poetry ranging from the romantic epoch to the 20th century avant garde". Jhave maintains a GitHub repository with some of the code supporting ReRites. == Reception == ReRites is described by John Cayley as "one of the most thorough and beautiful" poetic responses to machine learning. The work's influence on the field of electronic literature was acknowledged in 2022, when the work won the Electronic Literature Organization's Robert Coover Award for a Work of Electronic Literature. The jury described ReRites as particularly poignant in the time of the pandemic, as it was "a documentation of the performance of the private ritual of writing and the obsessive-compulsive need for writers to communicate — even when no one else is reading". The question of authorship and voice in ReRites has been raised by several critics. Although generated poetry is an established genre in electronic literature, Cayley notes that unlike the combinatory poems created by authors like Nick Montfort, where the author explicitly defines which words and phrases will be recombined, ReRites has "not been directed by literary preconceptions inscribed in the program itself, but only by patterns and rhythms pre-existing in the corpora". In an essay for the Australian journal TEXT, David Thomas Henry Wright asks how to understand authorship and authority in ReRites: "Who or what is the authority of the work? The original data fed into the machine, that is not currently retrievable or discernible from the final works? The code that was taken and adapted for his purposes? Or Jhave, the human editor?" Wright concludes that Jhave is the only actor with any intentionality and therefore the authority of the work. The centrality of the human editor is also emphasised by other scholars. In a chapter analysing ReRites Malthe Stavning Erslev argues that the machine learning misrepresents the dataset it is trained on. While ReRites uses 21st century neural networks, it has been compared to earlier literary traditions. Poet Victoria Stanton, who read at one of the ReRites performances, has compared ReRites to found poetry, while David Thomas Henry Wright compares it to the Oulipo movement and Mark Amerika to the cut-up technique. Scholars also position ReRites firmly within the long tradition of generative poetry both in electronic literature and print, stretching from the I Ching, Queneau's Cent Mille Milliards de Poemes and Nabokov's Pale Fire to computer-generated poems like Christopher Strachey's Love Letter Generator (1952) and more contemporary examples. Jhave describes the process of working with the output from the neural network as "carving". In his book My Life as an Artificial Creative Intelligence, Mark Amerika writes that the "method of carving the digital outputs provided by the language model as part of a collaborative remix jam session with GPT-2, where the language artist and the language model play off each other’s unexpected outputs as if caught in a live postproduction set, is one I share with electronic literature composer David Jhave Johnston, whose AI poetry experiments precede my own investigations."

    Read more →
  • Word error rate

    Word error rate

    Word error rate (WER) is a common metric of the performance of a speech recognition or machine translation system. The WER metric typically ranges from 0 to 1, where 0 indicates that the compared pieces of text are exactly identical, and 1 (or larger) indicates that they are completely different with no similarity. This way, a WER of 0.8 means that there is an 80% error rate for compared sentences. The general difficulty of measuring performance lies in the fact that the recognized word sequence can have a different length from the reference word sequence (supposedly the correct one). The WER is derived from the Levenshtein distance, working at the word level instead of the phoneme level. The WER is a valuable tool for comparing different systems as well as for evaluating improvements within one system. This kind of measurement, however, provides no details on the nature of translation errors and further work is therefore required to identify the main source(s) of error and to focus any research effort. This problem is solved by first aligning the recognized word sequence with the reference (spoken) word sequence using dynamic string alignment. Examination of this issue is seen through a theory called the power law that states the correlation between perplexity and word error rate. Word error rate can then be computed as: W E R = S + D + I N = S + D + I S + D + C {\displaystyle {\mathit {WER}}={\frac {S+D+I}{N}}={\frac {S+D+I}{S+D+C}}} where S is the number of substitutions, D is the number of deletions, I is the number of insertions, C is the number of correct words, N is the number of words in the reference (N=S+D+C) The intuition behind 'deletion' and 'insertion' is how to get from the reference to the hypothesis. So if we have the reference "This is wikipedia" and hypothesis "This _ wikipedia", we call it a deletion. Note that since N is the number of words in the reference, the word error rate can be larger than 1.0, namely if the number of insertions I is larger than the number of correct words C. When reporting the performance of a speech recognition system, sometimes word accuracy (WAcc) is used instead: W A c c = 1 − W E R = N − S − D − I N = C − I N {\displaystyle {\mathit {WAcc}}=1-{\mathit {WER}}={\frac {N-S-D-I}{N}}={\frac {C-I}{N}}} Since the WER can be larger than 1.0, the word accuracy can be smaller than 0.0. == Experiments == It is commonly believed that a lower word error rate shows superior accuracy in recognition of speech, compared with a higher word error rate. However, at least one study has shown that this may not be true. In a Microsoft Research experiment, it was shown that, if people were trained under "that matches the optimization objective for understanding", (Wang, Acero and Chelba, 2003) they would show a higher accuracy in understanding of language than other people who demonstrated a lower word error rate, showing that true understanding of spoken language relies on more than just high word recognition accuracy. == Other metrics == One problem with using a generic formula such as the one above, however, is that no account is taken of the effect that different types of error may have on the likelihood of successful outcome, e.g. some errors may be more disruptive than others and some may be corrected more easily than others. These factors are likely to be specific to the syntax being tested. A further problem is that, even with the best alignment, the formula cannot distinguish a substitution error from a combined deletion plus insertion error. Hunt (1990) has proposed the use of a weighted measure of performance accuracy where errors of substitution are weighted at unity but errors of deletion and insertion are both weighted only at 0.5, thus: W E R = S + 0.5 D + 0.5 I N {\displaystyle {\mathit {WER}}={\frac {S+0.5D+0.5I}{N}}} There is some debate, however, as to whether Hunt's formula may properly be used to assess the performance of a single system, as it was developed as a means of comparing more fairly competing candidate systems. A further complication is added by whether a given syntax allows for error correction and, if it does, how easy that process is for the user. There is thus some merit to the argument that performance metrics should be developed to suit the particular system being measured. Whichever metric is used, however, one major theoretical problem in assessing the performance of a system is deciding whether a word has been “mis-pronounced,” i.e. does the fault lie with the user or with the recogniser. This may be particularly relevant in a system which is designed to cope with non-native speakers of a given language or with strong regional accents. The pace at which words should be spoken during the measurement process is also a source of variability between subjects, as is the need for subjects to rest or take a breath. All such factors may need to be controlled in some way. For text dictation it is generally agreed that performance accuracy at a rate below 95% is not acceptable, but this again may be syntax and/or domain specific, e.g. whether there is time pressure on users to complete the task, whether there are alternative methods of completion, and so on. The term "Single Word Error Rate" is sometimes referred to as the percentage of incorrect recognitions for each different word in the system vocabulary. == Edit distance == The word error rate may also be referred to as the length normalized edit distance. The normalized edit distance between X and Y, d( X, Y ) is defined as the minimum of W( P ) / L ( P ), where P is an editing path between X and Y, W ( P ) is the sum of the weights of the elementary edit operations of P, and L(P) is the number of these operations (length of P).

    Read more →
  • NER model

    NER model

    NER is one of several formulas for accessing live subtitles in television broadcasts and events that are produced using speech recognition. The three letters stand for number, edit error and recognition error. It has been promoted as an alternative to Word error rate (Word Error Rate) which is a more objective measure. The overall score is calculated as follows: Firstly, the number of edit and recognition errors is deducted from the total number of words in the live subtitles. This number is then divided by the total number of words in the live subtitles and finally multiplied by one hundred. N E R v a l u e = N − E − R N ∗ 100 {\displaystyle NERvalue={\frac {N-E-R}{N}}100} . The acronyms stand for the following: N (number) = total number of words in the live subtitles E (Edit error) = edit error R (Recognition error) = recognition error This measurement process has been used for public television broadcasts in European countries like Italy and Switzerland. One major drawback with NER is that it requires a human assessor to rate errors as either: 1 Minor edition or recognition errors 2 Normal edition or recognition errors 3 Serious errors which are then weighted in the assessment process. This is both subjective, time consuming and costly. Also, NER fails to account for words left out subtitles which is something that does not take account of the D/deaf audience who want verbatim subtitles. As a result, NER cannot accurately reflect the audience's experience of subtitles. Another problem is the inconsistency of human evaluation of subtitles, particularly with live subtitles, where there are differing opinions of the importance of subtitle errors. By way of contrast, Word error rate is an objective measure of subtitle errors, since it measures the textual discrepancy between the subtitles and the speech.

    Read more →
  • List of video editing software

    List of video editing software

    The following is a list of video editing software. The criterion for inclusion in this list is the ability to perform non-linear video editing. Most modern transcoding software supports transcoding a portion of a video clip, which would count as cropping and trimming. However, items in this article have one of the following conditions: Can perform other non-linear video editing function such as montage or compositing Can do the trimming or cropping without transcoding == Free (libre) or open-source == The software listed in this section is either free software or open source, and may or may not be commercial. === Active and stable === === Inactive === == Proprietary (non-commercial) == The software listed in this section is proprietary, and freeware or freemium. === Active === === Discontinued === == Proprietary (commercial) == The software listed in this section is proprietary and commercial. === Active === === Discontinued ===

    Read more →
  • Anti-Grain Geometry

    Anti-Grain Geometry

    Anti-Grain Geometry (AGG) is a 2D rendering graphics library written in C++. It features anti-aliasing and sub-pixel resolution. It is not a graphics library, per se, but rather a framework to build a graphics library upon. The library is operating system independent and renders to an abstract memory object. It comes with examples interfaced to the X Window System, Microsoft Windows, Mac OS X, AmigaOS, BeOS, SDL. The examples also include an SVG viewer. The design of AGG uses C++ templates only at a very high level, rather than extensively, to achieve the flexibility to plug custom classes into the rendering pipeline, without requiring a rigid class hierarchy, and allows the compiler to inline many of the method calls for high performance. For a library of its complexity, it is remarkably lightweight: it has no dependencies above the standard C++ libraries and it avoids the C++ STL in the implementation of the basic algorithms. The implicit interfaces are not well documented, however, and this can make the learning process quite cumbersome. While AGG version 2.5 is licensed under the GNU General Public License, version 2 or greater, AGG version 2.4 is still available under the 3-clause BSD license and is virtually the same as version 2.5. == History == Active development of the AGG codebase stalled in 2006, around the time of the v2.5 release, due to shifting priorities of its main developer and maintainer Maxim Shemanarev. M. Shemanarev remained active in the community until his sudden death in 2013. Development has continued on a fork of the more liberally licensed v2.4 on SourceForge.net. == Usage == The Haiku operating system uses AGG in its windowing system. It is one of the renderers available for use in GNU's Gnash Flash player. Graphical version of Rebol language interpreter is using AGG for scalable vector graphics DRAW dialect. Hilti uses it in some of their rebar detection tools, like the PS 1000. Matplotlib uses AGG as its canonical renderer for interactive user interfaces. fpGUI Toolkit has an optional AggPas back-end rendering engine. Work is being done to make AggPas the default or sole rendering engine for fpGUI. Mapnik, the toolkit that renders the maps on the OpenStreetMap website, uses AGG for all its bitmap map rendering by default. HTTPhotos uses AGG to scale photos. Pdfium, the PDF rendering engine used by Google Chrome makes use of AGG, although work is progressing to replace this with Skia Graphics Engine. Graphics Mill, the .NET imaging SDK uses AGG as its drawing engine. Image-Line FL Studio, a digital audio workstation, since version 10.8 released on September 30, 2012, uses AGG for drawing. Native Instruments's Supercharger and Supercharger GT compressors use AGG for its user interface. == Author == The main author of the library was Maxim Shemanarev (Russian: Максим Шеманарёв). On November 26, 2013 Shemanarev (born June 15, 1966, Nizhny Novgorod, Russia) was reported dead at the age of 47 at his home in Columbia, Maryland (US). He died suddenly, allegedly from an epileptic seizure that he had suffered for a while. He was a graduate from Nizhny Novgorod State Technical University. Little is known about his personal life. It's known though that he was divorced and his mother was alive at the time of his death. He used to love skiing, snowboarding (in Colorado), and inline skating. He was praised by his friends for his intelligent programming skills.

    Read more →
  • Template matching

    Template matching

    Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used for quality control in manufacturing, navigation of mobile robots, or edge detection in images. The main challenges in a template matching task are detection of occlusion, when a sought-after object is partly hidden in an image; detection of non-rigid transformations, when an object is distorted or imaged from different angles; sensitivity to illumination and background changes; background clutter; and scale changes. == Feature-based approach == The feature-based approach to template matching relies on the extraction of image features, such as shapes, textures, and colors, that match the target image or frame. This approach is usually achieved using neural networks and deep-learning classifiers such as VGG, AlexNet, and ResNet.Convolutional neural networks (CNNs), which many modern classifiers are based on, process an image by passing it through different hidden layers, producing a vector at each layer with classification information about the image. These vectors are extracted from the network and used as the features of the image. Feature extraction using deep neural networks, like CNNs, has proven extremely effective has become the standard in state-of-the-art template matching algorithms. This feature-based approach is often more robust than the template-based approach described below. As such, it has become the state-of-the-art method for template matching, as it can match templates with non-rigid and out-of-plane transformations, as well as high background clutter and illumination changes. == Template-based approach == For templates without strong features, or for when the bulk of a template image constitutes the matching image as a whole, a template-based approach may be effective. Since template-based matching may require sampling of a large number of data points, it is often desirable to reduce the number of sampling points by reducing the resolution of search and template images by the same factor before performing the operation on the resultant downsized images. This pre-processing method creates a multi-scale, or pyramid, representation of images, providing a reduced search window of data points within a search image so that the template does not have to be compared with every viable data point. Pyramid representations are a method of dimensionality reduction, a common aim of machine learning on data sets that suffer the curse of dimensionality. == Common challenges == In instances where the template may not provide a direct match, it may be useful to implement eigenspaces to create templates that detail the matching object under a number of different conditions, such as varying perspectives, illuminations, color contrasts, or object poses. For example, if an algorithm is looking for a face, its template eigenspaces may consist of images (i.e., templates) of faces in different positions to the camera, in different lighting conditions, or with different expressions (i.e., poses). It is also possible for a matching image to be obscured or occluded by an object. In these cases, it is unreasonable to provide a multitude of templates to cover each possible occlusion. For example, the search object may be a playing card, and in some of the search images, the card is obscured by the fingers of someone holding the card, or by another card on top of it, or by some other object in front of the camera. In cases where the object is malleable or poseable, motion becomes an additional problem, and problems involving both motion and occlusion become ambiguous. In these cases, one possible solution is to divide the template image into multiple sub-images and perform matching on each subdivision. == Deformable templates in computational anatomy == Template matching is a central tool in computational anatomy (CA). In this field, a deformable template model is used to model the space of human anatomies and their orbits under the group of diffeomorphisms, functions which smoothly deform an object. Template matching arises as an approach to finding the unknown diffeomorphism that acts on a template image to match the target image. Template matching algorithms in CA have come to be called large deformation diffeomorphic metric mappings (LDDMMs). Currently, there are LDDMM template matching algorithms for matching anatomical landmark points, curves, surfaces, volumes. == Template-based matching explained using cross correlation or sum of absolute differences == A basic method of template matching sometimes called "Linear Spatial Filtering" uses an image patch (i.e., the "template image" or "filter mask") tailored to a specific feature of search images to detect. This technique can be easily performed on grey images or edge images, where the additional variable of color is either not present or not relevant. Cross correlation techniques compare the similarities of the search and template images. Their outputs should be highest at places where the image structure matches the template structure, i.e., where large search image values get multiplied by large template image values. This method is normally implemented by first picking out a part of a search image to use as a template. Let S ( x , y ) {\displaystyle S(x,y)} represent the value of a search image pixel, where ( x , y ) {\displaystyle (x,y)} represents the coordinates of the pixel in the search image. For simplicity, assume pixel values are scalar, as in a greyscale image. Similarly, let T ( x t , y t ) {\textstyle T(x_{t},y_{t})} represent the value of a template pixel, where ( x t , y t ) {\textstyle (x_{t},y_{t})} represents the coordinates of the pixel in the template image. To apply the filter, simply move the center (or origin) of the template image over each point in the search image and calculate the sum of products, similar to a dot product, between the pixel values in the search and template images over the whole area spanned by the template. More formally, if ( 0 , 0 ) {\displaystyle (0,0)} is the center (or origin) of the template image, then the cross correlation T ⋆ S {\displaystyle T\star S} at each point ( x , y ) {\displaystyle (x,y)} in the search image can be computed as: ( T ⋆ S ) ( x , y ) = ∑ ( x t , y t ) ∈ T T ( x t , y t ) ⋅ S ( x t + x , y t + y ) {\displaystyle (T\star S)(x,y)=\sum _{(x_{t},y_{t})\in T}T(x_{t},y_{t})\cdot S(x_{t}+x,y_{t}+y)} For convenience, T {\displaystyle T} denotes both the pixel values of the template image as well as its domain, the bounds of the template. Note that all possible positions of the template with respect to the search image are considered. Since cross correlation values are greatest when the values of the search and template pixels align, the best matching position ( x m , y m ) {\displaystyle (x_{m},y_{m})} corresponds to the maximum value of T ⋆ S {\displaystyle T\star S} over S {\displaystyle S} . Another way to handle translation problems on images using template matching is to compare the intensities of the pixels, using the sum of absolute differences (SAD) measure. To formulate this, let I S ( x s , y s ) {\displaystyle I_{S}(x_{s},y_{s})} and I T ( x t , y t ) {\displaystyle I_{T}(x_{t},y_{t})} denote the light intensity of pixels in the search and template images with coordinates ( x s , y s ) {\displaystyle (x_{s},y_{s})} and ( x t , y t ) {\displaystyle (x_{t},y_{t})} , respectively. Then by moving the center (or origin) of the template to a point ( x , y ) {\displaystyle (x,y)} in the search image, as before, the sum of absolute differences between the template and search pixel intensities at that point is: S A D ( x , y ) = ∑ ( x t , y t ) ∈ T | I T ( x t , y t ) − I S ( x t + x , y t + y ) | {\displaystyle SAD(x,y)=\sum _{(x_{t},y_{t})\in T}\left\vert I_{T}(x_{t},y_{t})-I_{S}(x_{t}+x,y_{t}+y)\right\vert } With this measure, the lowest SAD gives the best position for the template, rather than the greatest as with cross correlation. SAD tends to be relatively simple to implement and understand, but it also tends to be relatively slow to execute. A simple C++ implementation of SAD template matching is given below. == Implementation == In this simple implementation, it is assumed that the above described method is applied on grey images: This is why Grey is used as pixel intensity. The final position in this implementation gives the top left location for where the template image best matches the search image. One way to perform template matching on color images is to decompose the pixels into their color components and measure the quality of match between the color template and search image using the sum of the SAD computed for each color separately. == Speeding up the process == In the past, this type of spatial filtering was normally only used in dedicated hardware solutions because of the computational complexity of the operation, however we can lessen this complexity b

    Read more →
  • Deep Zoom

    Deep Zoom

    Deep Zoom is a technology developed by Microsoft for efficiently transmitting and viewing images. It allows users to pan around and zoom in on a large, high resolution image or a large collection of images. It reduces the time required for initial load by downloading only the region being viewed or only at the resolution it is displayed at. Subsequent regions are downloaded as the user pans to (or zooms into) them; animations are used to hide any jerkiness in the transition. The libraries are also available in other platforms including Java and Flash. == History == The Deep Zoom file format is very similar to the Google Maps image format where images are broken into tiles and then displayed as required. The tiling typically follows a quadtree pattern of increasing resolution of image (in other words twice the zoom and twice the resolution). The main difference is that with Google Maps the actual details on the image change from one zoom level to another, while with Deep Zoom the same image is displayed at each zoom level. Seadragon Software, formerly Sand Codex, first created the Seadragon technology and its implementation of what is now called Deep Zoom. This technology was then absorbed into the Microsoft Live Labs when Seadragon Software was acquired. Engineers from Seadragon now work with Microsoft to integrate their work into technology such as Silverlight and Photosynth. == Deep Zoom examples == The most famous implementation of Deep Zoom was probably the first: the memorabilia collection at the Hard Rock website. Conceived and designed by Duncan/Channon and built by Vertigo, it was demonstrated for the first time in March 2008 at the Microsoft MIX convention in Las Vegas. In 2010, Microsoft Live Labs partnered with the University of California, Berkeley to create ChronoZoom, a DeepZoom-powered time visualization tool that pushed the limits of DeepZoom, since it required zooming from the scale of 13 billion years down to a single day. The project has since graduated to development under Microsoft Research. Another example is the Deep Earth project. It is described by its creators as "a community project focused on creating a rich interactive mapping control using Silverlight2 Deep Zoom. Concentrating on Microsoft Virtual Earth imagery and data the project offers team members the opportunity to learn and share while creating something cool and useful." A paintings collection project http://galleryzoom.co.uk/ shows 1000 high resolution/sensor images individually indexed. (Using Deep Zoom Composer). Blaise Aguera y Arcas gave a demonstration of Seadragon and Photosynth at the 2007 TED conference. In November 2009, 352 Media Group, a Silverlight developer in the Microsoft Silverlight Partner Program, created an example of Deep Zoom using Microsoft Silverlight version 3. It is online at 352 Media Group's Web site. The Winston Churchill Deep Zoom Archived 2010-07-04 at the Wayback Machine mosaic, created by Silverlight developers Shoothill, features as both an online interactive deep zoom and a standalone deep zoom which forms part of the Churchill exhibit in the Churchill War Rooms in Whitehall. In 2010, Shoothill built the Sumatran Tiger Deep Zoom - the largest seen to date - for worldwide conservation charity Fauna and Flora International, featuring thousands of images of endangered species. An early example of Deep Zoom-like technology was implemented at The Department of Maori Affairs in New Zealand in 1997. The technology was used to display Maori land ownership. == Deep Zoom images == The file format used by Deep Zoom (as well as Photosynth and Seadragon Ajax) is XML based. Users can specify a single large image (dzi) or a collection of images (dzc). It also allows for "Sparse Images"; where some parts of the image have greater resolution than others, an example of which can be found on the Seadragon Ajax home page; The bike image displayed is a sparse image. Though used in the proprietary Deep Zoom, the dzi format is open and able to be used by anyone. === Deep Zoom image (dzi) === A DZI has two parts: a DZI file (with either a .dzi or .xml extension) and a subdirectory of image folders. Each folder in the image subdirectory is labeled with its level of resolution. Higher numbers correspond to a higher resolution level; inside each folder are the image tiles corresponding to that level of resolution, numbered consecutively in columns from top left to bottom right. === Deep Zoom collection (dzc) === A DZC is a collection of some number of DZIs linked and referenced by a DZC file (with either a .dzc or .xml extension). At a high level, a collection is a number of image thumbnails whose location is kept track of by the .dzc/.xml file, when zooming into an image, it accesses greater resolutions tiles. A DZC's structure is similar to that of a DZI; the .dzc/.xml file defines the collection and the subdirectory of folders maps to the DZI file structure, each with their set of .dzi/.xml and image tiles. The DZC is used in Microsoft's Pivot, but not in SeaDragon per se. === Sparse Images === Sparse images are a sub-classification of the DZI file type. A sparse image is normally a number of separate photographs with varying resolution levels that have been placed in a single DZI instead of a DZC. Sparse images have no different file structure than that of a DZI and differ only in that there is not a single "highest resolution" level for the entire DZI. == Software that uses Deep Zoom == Image Composite Editor - image stitching tool created by Microsoft Research Deep Zoom Composer - collage maker and simple panorama tool created by Microsoft. Images' resolution is maintained when exporting for web use (via Silverlight Deep Zoom or JavaScript using a third-party template). No longer available for download from Microsoft though it can be found on various other sources such as Internet Archive. == iPhone OS development == Microsoft Live Labs has created an application for the App Store called Seadragon Mobile. It is run over the internet and includes Deep Zoom on the following categories; art, history, maps, photos, Photosynth which anybody can upload to, space and technology & web.

    Read more →