AI Detector Checker

AI Detector Checker — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Web application firewall

    Web application firewall

    A Web application firewall (WAF) is a specific form of application firewall that filters, monitors, and blocks HTTP traffic to and from a web service. By inspecting HTTP traffic, it can prevent attacks exploiting a Web application's known vulnerabilities, such as SQL injection, cross-site scripting (XSS), file inclusion, and improper system configuration. Financial institutions often utilize WAFs to help in the mitigation of Web application zero-day vulnerabilities, as well as hard-to-patch bugs or weaknesses through custom attack signature strings. == History == Dedicated Web application firewalls entered the market in the late 1990s during a time when web server attacks were becoming more prevalent. Early WAF products, from Kavado and Gilian technologies, tried to solve the increasing amount of attacks on Web applications in the late 1990s. In 2002, the open-source project ModSecurity was formed in order to make WAF technology more accessible. They finalized a core rule set for protecting Web applications, based on OASIS Web Application Security Technical Committee’s (WAS TC) vulnerability work. In 2003, they expanded and standardized rules through the Open Web Application Security Project’s (OWASP) Top 10 List, an annual ranking for Web security vulnerabilities. This list would become the industry standard for Web application security compliance. Since then, the market has continued to grow and evolve, especially focusing on credit card fraud prevention. With the development of the Payment Card Industry Data Security Standard (PCI DSS), a standardization of control over cardholder data, security has become more regulated in this sector. == Description == A Web application firewall is a special type of application firewall that applies specifically to Web applications. It is deployed in front of Web applications and analyzes bi-directional web-based (HTTP) traffic – detecting and blocking anything malicious. The OWASP provides a broad technical definition for a WAF as “a security solution on the Web application level which – from a technical point of view – does not depend on the application itself”. According to the PCI DSS Information Supplement for requirement 6.6, a WAF is defined as “a security policy enforcement point positioned between a Web application and the client endpoint. This functionality can be implemented in software or hardware, running in an appliance device, or in a typical server running a common operating system. It may be a stand-alone device or integrated into other network components.” In other words, a WAF can be a virtual or physical appliance that prevents vulnerabilities in Web applications from being exploited by outside threats. These vulnerabilities may be because the application itself is a legacy type or was insufficiently coded by design. The WAF addresses these code shortcomings by special configurations of rule-sets, also known as policies. Previously unknown vulnerabilities can be discovered through penetration testing or via a vulnerability scanner. A Web application vulnerability scanner, also known as a web application security scanner, is defined in the SAMATE NIST 500-269 as “an automated program that examines Web applications for potential security vulnerabilities. In addition to searching for Web application-specific vulnerabilities, the tools also look for software coding errors.” Resolving vulnerabilities is commonly referred to as remediation. Corrections to the code can be made in the application, but typically a more prompt response is necessary. In these situations, the application of a custom policy for a unique Web application vulnerability to provide a temporary but immediate fix (known as a virtual patch) may be necessary. WAFs are not an ultimate security solution, rather they are meant to be used in conjunction with other network perimeter security solutions such as network firewalls and intrusion prevention systems to provide a holistic defense strategy. WAFs typically follow a positive security model, a negative security, or a combination of both as mentioned by the SANS Institute. WAFs use a combination of rule-based logic, parsing, and signatures to detect and prevent attacks such as cross-site scripting and SQL injection. In general, features like browser emulation, obfuscation and virtualization, and IP obfuscation are used to attempt to bypass WAFs. The OWASP produces a list of the top ten Web application security flaws. All commercial WAF offerings cover these ten flaws at a minimum. There are non-commercial options as well. As mentioned earlier, the well-known open-source WAF engine called ModSecurity is one of these options. A WAF engine alone is insufficient to provide adequate protection, therefore OWASP along with Trustwave's Spiderlabs help organize and maintain a Core-Rule Set via GitHub to use with the ModSecurity WAF engine. == Deployment options == Although the names for operating mode may differ, WAFs are basically deployed inline in three different ways. According to NSS Labs, deployment options are transparent bridge, transparent reverse proxy, and reverse proxy. "Transparent" refers to the fact that the HTTP traffic is sent straight to the Web application, therefore the WAF is transparent between the client and server. This is in contrast to reverse proxy, where the WAF acts as a proxy, and the client’s traffic is sent directly to the WAF. The WAF then separately sends filtered traffic to Web applications. This can provide additional benefits such as IP masking but may introduce disadvantages such as performance latencies. == JA3 fingerprint == JA3, developed by Salesforce in 2017, is a technique for generating a unique fingerprint for SSL/TLS traffic based on specific fields in the handshake, such as the version, cipher suites, and extensions used by the client. This fingerprint enables the identification and tracking of clients based on the characteristics of their encrypted traffic. In the context of distributed denial of service (DDoS) protection, JA3 fingerprints are used to detect and differentiate malicious traffic, often associated with attack bots, from legitimate traffic, allowing for more precise filtering of potential threats. In September 2023, AWS WAF announced built-in support for JA3, enabling customers to inspect the JA3 fingerprints of incoming requests. JA3 was deprecated in May 2025 in favor of JA4. JA4 is currently patent pending.

    Read more →
  • Pol.is

    Pol.is

    Polis (or Pol.is) is wiki survey software designed for large group collaborations. As a civic technology, Polis allows people to share their opinions and ideas, and its algorithm is intended to elevate ideas that can facilitate better decision-making, especially when there are lots of participants. Polis has been credited for assisting the passage of legislation in Taiwan. Pol.is has been used by governments in the United States, Canada, Singapore, Philippines, Finland, Spain and elsewhere. == History == Pol.is was founded by Colin Megill, Christopher Small, and Michael Bjorkegren after the Occupy Wall Street and Arab Spring movements. In Taiwan, pol.is has been "one of the key parts" of vTaiwan's suite of open-source tools for its citizen engagement efforts arising out of the Sunflower Student Movement. vTaiwan claims that of the 26 national issues related to technology discussed on the platform, 80% led to government action. Pol.is is also utilized by "Join," a national platform for online deliberation run by the Taiwanese government. In 2022, Wired reported that Polis was an influence on the Community Notes project at Twitter. In 2023, Megill advised OpenAI on how to facilitate deliberation at scale in a way that was more efficient than Polis, which still required significant human labor and analysis at the time. He helped to award $1 million in grants to teams working on solving the problem of deliberation at scale. In 2023, Anthropic was also exploring steering model behavior using Polis. In 2025, it helped the county that includes Bowling Green, Kentucky make a 25 year plan by facilitating the collection and review of ideas from thousands of residents, representing 10% of the county. 2,370 of 3,940 unique ideas were agreed-upon by over 80% of survey respondents. Ideas were screened by volunteers if they were redundant to an existing idea, off-topic or obscene. == How it works == Pol.is participants are anonymous and cannot reply directly to others posts, in an effort to avoid personal attacks for users. Its algorithms are designed not for engagement and scrolling, but to find areas of agreement to better understand the nuances of a wide range of opinions. Participants are prompted for ideas and vote on other participants' ideas. == Reception == Andrew Leonard, The Financial Times, and VentureBeat describe Pol.is as a possible antidote to the divisiveness of traditional internet discourse by gamifying consensus. Audrey Tang agreed saying, "Polis is quite well known in that it's a kind of social media that instead of polarizing people to drive so called engagement or addiction or attention, it automatically drives bridge making narratives and statements. So only the ideas that speak to both sides or to multiple sides will gain prominence in Polis." Niall Ferguson argues that the approach to utilize tools like Pol.is and Join in Taiwan empowers ordinary people instead of the elite and protects individual freedoms, providing a contrast to the AI-enhanced panopticon model seen in China. Carl Miller praised the technology as having "gamified finding consensus." Darshana Narayanan, in an op-ed in the Economist, argues that open-source machine-learning-based tools like Polis can help to bypass the influence of special interests or experts. Jamie Susskind cited polis and vTaiwan as a model for democracies, particularly around digital policy issues.

    Read more →
  • Environmental informatics

    Environmental informatics

    Environmental informatics is the science of information applied to environmental science. As such, it provides the information processing and communication infrastructure to the interdisciplinary field of environmental sciences aiming at data, information and knowledge integration, the application of computational intelligence to environmental data as well as the identification of environmental impacts of information technology. Environmental informatics thus acts as a bridge, providing an interdisciplinary means of analysing, describing and understanding the complex interactions between humans, nature and technology. Since each field of applied computer science has its own subject matter, terminology and methods, specialised disciplines, such as environmental, bio- and geoinformatics have emerged, each of which combines computer science with a specific field of application such as environmental, bio- or geosciences. Environmental informatics, bioinformatics and geoinformatics all deal with computer-based processing of environmental phenomena. However, environmental informatics is the only field that pursues normative goals (e.g., political goals of environmental protection, environmental planning, and sustainability). This also influences the choice of methods. This also distinguishes it from application areas such as numerical weather prediction, which is considered an early and important example of computer simulation of environmental phenomena. The UK Natural Environment Research Council defines environmental informatics as the "research and system development focusing on the environmental sciences relating to the creation, collection, storage, processing, modelling, interpretation, display and dissemination of data and information." Kostas Karatzas defined environmental informatics as the "creation of a new 'knowledge-paradigm' towards serving environmental management needs." Karatzas argued further that environmental informatics "is an integrator of science, methods and techniques and not just the result of using information and software technology methods and tools for serving environmental engineering needs." Environmental informatics emerged in early 1990 in Central Europe. Current initiatives to effectively manage, share, and reuse environmental and ecological data are indicative of the increasing importance of fields like environmental informatics and ecoinformatics to develop the foundations for effectively managing ecological information. Examples of these initiatives are National Science Foundation Datanet projects, DataONE and Data Conservancy. == Subject matter and objectives == The subject of environmental informatics are environmental information systems (EIS). An EIS 'is a computer-based system that integrates and stores data collected about the natural environment and provides powerful methods for accessing and evaluating it.' This allows environmental data to be processed by computers for environmental protection, planning, research and technology. According to Jaeschke and Bossel, environmental informatics has three interrelated objectives: Environmental informatics serves to procure data and information for describing the state and development of the environment. Of particular importance is information that is needed to prevent or limit undesirable changes and to support desirable changes. Based on the evaluation and analysis of data, environmental informatics improves our understanding of the environment and the interactions between nature, technology and society. It thus supports environmentally relevant decisions. This enables the influence of development (system correction), the assessment of the effects and side effects of potential measures, and the creation of tools for the routine planning, implementation and monitoring of measures. == History == The simulation model World3, which formed the basis of the highly acclaimed study The Limits to Growth, is considered the starting point of environmental informatics. It incorporated environmental information, among other things, to calculate scenarios for global development. In the mid-1980s, interest grew in structuring environmental protection as an area of application for computer science. One of the first publications in German was the book Informatik im Umweltschutz. Anwendungen und Perspektiven (Computer science in environmental protection. Applications and perspectives) from 1986. The term 'environmental informatics' did not appear until around 1993, which is why the development of environmental informatics is usually referred to as having taken place in the 1990s. In 1993, the first university chair for environmental informatics was established in Cottbus. In 1994, the anthology Umweltinformatik. Informatikmethoden für Umweltschutz und Umweltforschung (Environmental Informatics: Informatics Methods for Environmental Protection and Environmental Research) was published. The development of environmental informatics was 'primarily initiated by German computer science.' In the English-speaking world, the volume Environmental Informatics was published in 1995, mainly based on the German anthology of 1994. An article in the conference proceedings of the World Computer Congress of the International Federation for Information Processing (IFIP) in Hamburg in 1994 describes the initial situation of environmental informatics as follows: 'On the one hand, we suffer from the huge amount of available data – people sometimes speak of data graveyards – on the other hand, the really relevant data may still be missing.' This statement indicates the need that led to the emergence of environmental informatics as a specialised discipline of applied computer science. Furthermore, the specific characteristics and processing requirements of environmental data necessitated the emergence of environmental informatics. The special features of environmental data include: The data structures required are highly heterogeneous due to specific processes and differing perspectives on environmental aspects (e.g., water protection, emission control, hazardous substances). In addition to the heterogeneity of the data, heterogeneous databases also play a role, as environmental data is often obtained and presented in an interdisciplinary manner. Obligations change frequently as a result of new legislation, whether regional (e.g. state regulations on water protection), national (e.g. federal emission control regulations) or international (e.g. Registration, Evaluation, Authorisation and Restriction of Chemicals|REACH). The objects represented are often multidimensional and, therefore, require complex geometric representation using curves or polygons. It is often necessary to process uncertain, imprecise or incomplete data, which is, for example, the result of extrapolations or forecasts. A new "knowledge paradigm" has emerged to meet the requirements of environmental management. Environmental informatics produces its own concepts, methods and techniques and is not merely the result of using information and communication technology methods and tools to meet environmental requirements. The development of environmental informatics since the 1990s has been significantly influenced by the newly established conferences EnviroInfo, ISESS and ITEE and is documented in the respective proceedings. Aspects of sustainability and sustainable development were increasingly integrated into environmental informatics after 2000, thereby expanding the field. In 2004, the Working Group on Sustainable Information Society of the Gesellschaft für Informatik e. V. (German Informatics Society, GI) published the Memorandum on a Sustainable Information Society, which formulates recommendations for an information society that is compatible with human, social and natural needs. Since 2007, environmental informatics has often been described in more detail as informatics for environmental protection, sustainable development and risk management. The increased focus on sustainability has also contributed to the formation of the research focus Information and Communications Technology for Sustainability (ICT4S) and to the emergence of the international conference ICT4S in 2013. ICT-ENSURE, the European Commission's funding measure for the establishment of a European research area on "ICT for Environmental Sustainability Research" (2008–2010), has also contributed to the structuring of environmental informatics. == Environmental informatics and sustainable development == Efforts to place environmental informatics within the context of sustainable development have been growing since 2000 and were significantly influenced by the Memorandum on a Sustainable Information Society. According to this Memorandum, the information society offers great but unevenly distributed opportunities for education, participation and intercultural understanding. In addition, the Memorandum highlighted the material and energy consumption of inf

    Read more →
  • NoSQL

    NoSQL

    NoSQL (originally meaning "not only SQL" or "non-relational") refers to a type of database design that stores and retrieves data differently from the traditional table-based structure of relational databases. Unlike relational databases, which organize data into rows and columns like a spreadsheet, NoSQL databases use a single data structure—such as key–value pairs, wide columns, graphs, or documents—to hold information. Since this non-relational design does not require a fixed schema, it scales easily to manage large, often unstructured datasets. NoSQL systems are sometimes called "Not only SQL" because they can support SQL-like query languages or work alongside SQL databases in polyglot-persistent setups, where multiple database types are combined. Non-relational databases date back to the late 1960s, but the term "NoSQL" emerged in the early 2000s, spurred by the needs of Web 2.0 companies like social media platforms. NoSQL databases are popular in big data and real-time web applications due to their simple design, ability to scale across clusters of machines (called horizontal scaling), and precise control over data availability. These structures can speed up certain tasks and are often considered more adaptable than fixed database tables. However, many NoSQL systems prioritize speed and availability over strict consistency (per the CAP theorem), using eventual consistency—where updates reach all nodes eventually, typically within milliseconds, but may cause brief delays in accessing the latest data, known as stale reads. While most lack full ACID transaction support, some, like MongoDB, include it as a key feature. == Barriers to adoption == Barriers to wider NoSQL adoption include their use of low-level query languages instead of SQL, inability to perform ad hoc joins across tables, lack of standardized interfaces, and significant investments already made in relational databases. Some NoSQL systems risk losing data through lost writes or other forms, though features like write-ahead logging—a method to record changes before they’re applied—can help prevent this. For distributed transaction processing across multiple databases, keeping data consistent is a challenge for both NoSQL and relational systems, as relational databases cannot enforce rules linking separate databases, and few systems support both ACID transactions and X/Open XA standards for managing distributed updates. Limitations within the interface environment are overcome using semantic virtualization protocols, such that NoSQL services are accessible to most operating systems. == History == The term NoSQL was used by Carlo Strozzi in 1998 to name his lightweight Strozzi NoSQL open-source relational database that did not expose the standard Structured Query Language (SQL) interface, but was still relational. His NoSQL RDBMS is distinct from the around-2009 general concept of NoSQL databases. Strozzi suggests that, because the current NoSQL movement "departs from the relational model altogether, it should therefore have been called more appropriately 'NoREL'", referring to "not relational". Johan Oskarsson, then a developer at Last.fm, reintroduced the term NoSQL in early 2009 when he organized an event to discuss "open-source distributed, non-relational databases". The name attempted to label the emergence of an increasing number of non-relational, distributed data stores, including open source clones of Google's Bigtable/MapReduce and Amazon's DynamoDB. == Types and examples == There are various ways to classify NoSQL databases, with different categories and subcategories, some of which overlap. What follows is a non-exhaustive classification by data model, with examples: === Key–value store === Key–value (KV) stores use the associative array (also called a map or dictionary) as their fundamental data model. In this model, data is represented as a collection of key–value pairs, such that each possible key appears at most once in the collection. The key–value model is one of the simplest non-trivial data models, and richer data models are often implemented as an extension of it. The key–value model can be extended to a discretely ordered model that maintains keys in lexicographic order. This extension is computationally powerful, in that it can efficiently retrieve selective key ranges. Key–value stores can use consistency models ranging from eventual consistency to serializability. Some databases support ordering of keys. There are various hardware implementations, and some users store data in memory (RAM), while others on solid-state drives (SSD) or rotating disks (aka hard disk drive (HDD)). === Document store === The central concept of a document store is that of a "document". While the details of this definition differ among document-oriented databases, they all assume that documents encapsulate and encode data (or information) in some standard formats or encodings. Encodings in use include XML, YAML, and JSON and binary forms like BSON. Documents are addressed in the database via a unique key that represents that document. Another defining characteristic of a document-oriented database is an API or query language to retrieve documents based on their contents. Different implementations offer different ways of organizing and/or grouping documents: Collections Tags Non-visible metadata Directory hierarchies Compared to relational databases, collections could be considered analogous to tables and documents analogous to records. But they are different – every record in a table has the same sequence of fields, while documents in a collection may have fields that are completely different. === Graph === Graph databases are designed for data whose relations are well represented as a graph consisting of elements connected by a finite number of relations. Examples of data include social relations, public transport links, road maps, network topologies, etc. Graph databases and their query language == Performance == The performance of NoSQL databases is usually evaluated using the metric of throughput, which is measured as operations per second. Performance evaluation must pay attention to the right benchmarks such as production configurations, parameters of the databases, anticipated data volume, and concurrent user workloads. Ben Scofield rated different categories of NoSQL databases as follows: Performance and scalability comparisons are most commonly done using the YCSB benchmark. == Handling relational data == Since most NoSQL databases lack ability for joins in queries, the database schema generally needs to be designed differently. There are three main techniques for handling relational data in a NoSQL database. (See table join and ACID support for NoSQL databases that support joins.) === Multiple queries === Instead of retrieving all the data with one query, it is common to do several queries to get the desired data. NoSQL queries are often faster than traditional SQL queries, so the cost of additional queries may be acceptable. If an excessive number of queries would be necessary, one of the other two approaches is more appropriate. === Caching, replication and non-normalized data === Instead of only storing foreign keys, it is common to store actual foreign values along with the model's data. For example, each blog comment might include the username in addition to a user id, thus providing easy access to the username without requiring another lookup. When a username changes, however, this will now need to be changed in many places in the database. Thus this approach works better when reads are much more common than writes. === Nesting data === With document databases like MongoDB it is common to put more data in a smaller number of collections. For example, in a blogging application, one might choose to store comments within the blog post document, so that with a single retrieval one gets all the comments. Thus in this approach a single document contains all the data needed for a specific task. == ACID and join support == A database is marked as supporting ACID properties (atomicity, consistency, isolation, durability) or join operations if the documentation for the database makes that claim. However, this doesn't necessarily mean that the capability is fully supported in a manner similar to most SQL databases. == Query optimization and indexing in NoSQL databases == Different NoSQL databases, such as DynamoDB, MongoDB, Cassandra, Couchbase, HBase, and Redis, exhibit varying behaviors when querying non-indexed fields. Many perform full-table or collection scans for such queries, applying filtering operations after retrieving data. However, modern NoSQL databases often incorporate advanced features to optimize query performance. For example, MongoDB supports compound indexes and query-optimization strategies, Cassandra offers secondary indexes and materialized views, and Redis employs custom indexing mechanisms tailored to specific use cases. Systems like El

    Read more →
  • DaVinci (software)

    DaVinci (software)

    DaVinci was a development tool produced by Incross, which aimed at creating HTML5 mobile applications and media content. It included a jQuery framework and a JavaScript library that enabled developers and designers to craft web applications designed for mobile devices with a user experience similar to native applications. Business applications, games, rich media content, such as HTML5 multi-media magazines, advertisements, and animation, may be produced with the tool. DaVinci was based on standard web technology – including HTML5, CSS3, and JavaScript. == Features == DaVinci comprised DaVinci Studio and DaVinci Animator, which handled application programming and UI design. The tool had a WYSIWYG authoring environment. Open-source libraries, such as KnockOut, JsRender/JsViews, Impress.js, and turn.js, were included in the tool. Other open-source frameworks could also be integrated. The Model View Controller (MVC) and Data Binding in JavaScript could be handled through DaVinci's Data-Set Editor. In this mode, view components and model data could be visually bound, which allowed users to create web applications with server-integrated UI components without coding. Additionally, DaVinci included an N-Screen editor, which automatically adjusted designs and functionalities to fit the screen sizes of various devices, including smartphones, tablet PCs, and TVs. == DaVinci and jQuery == In collaboration with the jQuery Foundation, DaVinci played a significant role in hosting the first jQuery conference in an Asian district, which took place on November 12, 2012, in Seoul, South Korea. The conference showcased how DaVinci could be utilized in application development demonstrations.

    Read more →
  • Emotion recognition

    Emotion recognition

    Emotion recognition is the process of identifying human emotion. People vary widely in their accuracy at recognizing the emotions of others. Use of technology to help people with emotion recognition is a relatively nascent research area. Generally, the technology works best if it uses multiple modalities in context. To date, the most work has been conducted on automating the recognition of facial expressions from video, spoken expressions from audio, written expressions from text, and physiology as measured by wearables. == Human == Humans show a great deal of variability in their abilities to recognize emotion. A key point to keep in mind when learning about automated emotion recognition is that there are several sources of "ground truth", or truth about what the real emotion is. Suppose we are trying to recognize the emotions of Alex. One source is "what would most people say that Alex is feeling?" In this case, the 'truth' may not correspond to what Alex feels, but may correspond to what most people would say it looks like Alex feels. For example, Alex may actually feel sad, but he puts on a big smile and then most people say he looks happy. If an automated method achieves the same results as a group of observers it may be considered accurate, even if it does not actually measure what Alex truly feels. Another source of 'truth' is to ask Alex what he truly feels. This works if Alex has a good sense of his internal state, and wants to tell you what it is, and is capable of putting it accurately into words or a number. However, some people are alexithymic and do not have a good sense of their internal feelings, or they are not able to communicate them accurately with words and numbers. In general, getting to the truth of what emotion is actually present can take some work, can vary depending on the criteria that are selected, and will usually involve maintaining some level of uncertainty. == Automatic == Decades of scientific research have been conducted developing and evaluating methods for automated emotion recognition. There is now an extensive literature proposing and evaluating hundreds of different kinds of methods, leveraging techniques from multiple areas, such as signal processing, machine learning, computer vision, and speech processing. Different methodologies and techniques may be employed to interpret emotion such as Bayesian networks. , Gaussian Mixture models and Hidden Markov Models and deep neural networks. === Approaches === The accuracy of emotion recognition is usually improved when it combines the analysis of human expressions from multimodal forms such as texts, physiology, audio, or video. Different emotion types are detected through the integration of information from facial expressions, body movement and gestures, and speech. The technology is said to contribute in the emergence of the so-called emotional or emotive Internet. The existing approaches in emotion recognition to classify certain emotion types can be generally classified into three main categories: knowledge-based techniques, statistical methods, and hybrid approaches. ==== Knowledge-based techniques ==== Knowledge-based techniques (sometimes referred to as lexicon-based techniques), utilize domain knowledge and the semantic and syntactic characteristics of text and potentially spoken language in order to detect certain emotion types. In this approach, it is common to use knowledge-based resources during the emotion classification process such as WordNet, SenticNet, ConceptNet, and EmotiNet, to name a few. One of the advantages of this approach is the accessibility and economy brought about by the large availability of such knowledge-based resources. A limitation of this technique on the other hand, is its inability to handle concept nuances and complex linguistic rules. Knowledge-based techniques can be mainly classified into two categories: dictionary-based and corpus-based approaches. Dictionary-based approaches find opinion or emotion seed words in a dictionary and search for their synonyms and antonyms to expand the initial list of opinions or emotions. Corpus-based approaches on the other hand, start with a seed list of opinion or emotion words, and expand the database by finding other words with context-specific characteristics in a large corpus. While corpus-based approaches take into account context, their performance still vary in different domains since a word in one domain can have a different orientation in another domain. ==== Statistical methods ==== Statistical methods commonly involve the use of different supervised machine learning algorithms in which a large set of annotated data is fed into the algorithms for the system to learn and predict the appropriate emotion types. Machine learning algorithms generally provide more reasonable classification accuracy compared to other approaches, but one of the challenges in achieving good results in the classification process, is the need to have a sufficiently large training set. Some of the most commonly used machine learning algorithms include Support Vector Machines (SVM), Naive Bayes, and Maximum Entropy. Deep learning, which is under the unsupervised family of machine learning, is also widely employed in emotion recognition. Well-known deep learning algorithms include different architectures of Artificial Neural Network (ANN) such as Convolutional Neural Network (CNN), Long Short-term Memory (LSTM), and Extreme Learning Machine (ELM). The popularity of deep learning approaches in the domain of emotion recognition may be mainly attributed to its success in related applications such as in computer vision, speech recognition, and Natural Language Processing (NLP). ==== Hybrid approaches ==== Hybrid approaches in emotion recognition are essentially a combination of knowledge-based techniques and statistical methods, which exploit complementary characteristics from both techniques. Some of the works that have applied an ensemble of knowledge-driven linguistic elements and statistical methods include sentic computing and iFeel, both of which have adopted the concept-level knowledge-based resource SenticNet. The role of such knowledge-based resources in the implementation of hybrid approaches is highly important in the emotion classification process. Since hybrid techniques gain from the benefits offered by both knowledge-based and statistical approaches, they tend to have better classification performance as opposed to employing knowledge-based or statistical methods independently. A downside of using hybrid techniques however, is the computational complexity during the classification process. === Datasets === Data is an integral part of the existing approaches in emotion recognition and in most cases it is a challenge to obtain annotated data that is necessary to train machine learning algorithms. For the task of classifying different emotion types from multimodal sources in the form of texts, audio, videos or physiological signals, the following datasets are available: HUMAINE: provides natural clips with emotion words and context labels in multiple modalities Belfast database: provides clips with a wide range of emotions from TV programs and interview recordings SEMAINE: provides audiovisual recordings between a person and a virtual agent and contains emotion annotations such as angry, happy, fear, disgust, sadness, contempt, and amusement IEMOCAP: provides recordings of dyadic sessions between actors and contains emotion annotations such as happiness, anger, sadness, frustration, and neutral state eNTERFACE: provides audiovisual recordings of subjects from seven nationalities and contains emotion annotations such as happiness, anger, sadness, surprise, disgust, and fear DEAP: provides electroencephalography (EEG), electrocardiography (ECG), and face video recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips DREAMER: provides electroencephalography (EEG) and electrocardiography (ECG) recordings, as well as emotion annotations in terms of valence, dominance of people watching film clips MELD: is a multiparty conversational dataset where each utterance is labeled with emotion and sentiment. MELD provides conversations in video format and hence suitable for multimodal emotion recognition and sentiment analysis. MELD is useful for multimodal sentiment analysis and emotion recognition, dialogue systems and emotion recognition in conversations. MuSe: provides audiovisual recordings of natural interactions between a person and an object. It has discrete and continuous emotion annotations in terms of valence, arousal and trustworthiness as well as speech topics useful for multimodal sentiment analysis and emotion recognition. UIT-VSMEC: is a standard Vietnamese Social Media Emotion Corpus (UIT-VSMEC) with about 6,927 human-annotated sentences with six emotion labels, contributing to emotion recognition research in Vietnamese

    Read more →
  • Xulvi-Brunet–Sokolov algorithm

    Xulvi-Brunet–Sokolov algorithm

    Xulvi-Brunet and Sokolov's algorithm generates networks with chosen degree correlations. This method is based on link rewiring, in which the desired degree is governed by parameter ρ. By varying this single parameter it is possible to generate networks from random (when ρ = 0) to perfectly assortative or disassortative (when ρ = 1). This algorithm allows to keep network's degree distribution unchanged when changing the value of ρ. == Assortative model == In assortative networks, well-connected nodes are likely to be connected to other highly connected nodes. Social networks are examples of assortative networks. This means that an assortative network has the property that almost all nodes with the same degree are linked only between themselves. The Xulvi-Brunet–Sokolov algorithm for this type of networks is the following. In a given network, two links connecting four different nodes are chosen randomly. These nodes are ordered by their degrees. Then, with probability ρ, the links are randomly rewired in such a way that one link connects the two nodes with the smaller degrees and the other connects the two nodes with the larger degrees. If one or both of these links already existed in the network, the step is discarded and is repeated again. Thus, there will be no self-connected nodes or multiple links connecting the same two nodes. Different degrees of assortativity of a network can be achieved by changing the parameter ρ. Assortative networks are characterized by highly connected groups of nodes with similar degree. As assortativity grows, the average path length and clustering coefficient increase. == Disassortative model == In disassortative networks, highly connected nodes tend to connect to less-well-connected nodes with larger probability than in uncorrelated networks. Examples of such networks include biological networks. The Xulvi-Brunet and Sokolov's algorithm for this type of networks is similar to the one for assortative networks with one minor change. As before, two links of four nodes are randomly chosen and the nodes are ordered with respect to their degrees. However, in this case, the links are rewired (with probability p) such that one link connects the highest connected node with the node with the lowest degree and the other link connects the two remaining nodes randomly with probability 1 − ρ. Similarly, if the new links already existed, the previous step is repeated. This algorithm does not change the degree of nodes and thus the degree distribution of the network.

    Read more →
  • Known-item search

    Known-item search

    Known-item search is a specialization of information exploration which represents the activities carried out by searchers who have a particular item in mind. In the context of library catalogs, known‐item search means a search for an item for which the author or title is known. Although the concept of known-item search originated in library science, it is now applied in the context of web search and other online search activities. Known-item search is distinguished from exploratory search, in which a searcher is unfamiliar with the domain of their search goal, unsure about the ways to achieve their goal, and/or unsure about what their goal is.

    Read more →
  • Color balance

    Color balance

    In photography and image processing, color balance is the global adjustment of the intensities of the colors (typically red, green, and blue primary colors). An important goal of this adjustment is to render specific colors – particularly neutral colors like white or grey – correctly. Hence, the general method is sometimes called gray balance, neutral balance, or white balance. Color balance changes the overall mixture of colors in an image and is used for color correction. Generalized versions of color balance are used to correct colors other than neutrals or to deliberately change them for effect. White balance is one of the most common kinds of balancing, and is when colors are adjusted to make a white object (such as a piece of paper or a wall) appear white and not a shade of any other colour. Image data acquired by sensors – either film or electronic image sensors – must be transformed from the acquired values to new values that are appropriate for color reproduction or display. Several aspects of the acquisition and display process make such color correction essential – including that the acquisition sensors do not match the sensors in the human eye, that the properties of the display medium must be accounted for, and that the ambient viewing conditions of the acquisition differ from the display viewing conditions. The color balance operations in popular image editing applications usually operate directly on the red, green, and blue channel pixel values, without respect to any color sensing or reproduction model. In film photography, color balance is typically achieved by using color correction filters over the lights or on the camera lens. == Generalized color balance == Sometimes the adjustment to keep neutrals neutral is called white balance, and the phrase color balance refers to the adjustment that in addition makes other colors in a displayed image appear to have the same general appearance as the colors in an original scene. It is particularly important that neutral (gray, neutral, white) colors in a scene appear neutral in the reproduction. === Psychological color balance === Humans relate to flesh tones more critically than other colors. Trees, grass and sky can all be off without concern, but if human flesh tones are 'off' then the human subject can look sick or dead. To address this critical color balance issue, the tri-color primaries themselves are formulated to not balance as a true neutral color. The purpose of this color primary imbalance is to more faithfully reproduce the flesh tones through the entire brightness range. == Illuminant estimation and adaptation == Most digital cameras have means to select color correction based on the type of scene lighting, using either manual lighting selection, automatic white balance, or custom white balance. The algorithms for these processes perform generalized chromatic adaptation. Many methods exist for color balancing. Setting a button on a camera is a way for the user to indicate to the processor the nature of the scene lighting. Another option on some cameras is a button which one may press when the camera is pointed at a gray card or other neutral colored object. This captures an image of the ambient light, which enables a digital camera to set the correct color balance for that light. There is a large literature on how one might estimate the ambient lighting from the camera data and then use this information to transform the image data. A variety of algorithms have been proposed, and the quality of these has been debated. A few examples and examination of the references therein will lead the reader to many others. Examples are Retinex, an artificial neural network or a Bayesian method. == Chromatic colors == Color balancing an image affects not only the neutrals, but other colors as well. An image that is not color balanced is said to have a color cast, as everything in the image appears to have been shifted towards one color. Color balancing may be thought in terms of removing this color cast. Color balance is also related to color constancy. Algorithms and techniques used to attain color constancy are frequently used for color balancing, as well. Color constancy is, in turn, related to chromatic adaptation. Conceptually, color balancing consists of two steps: first, determining the illuminant under which an image was captured; and second, scaling the components (e.g., R, G, and B) of the image or otherwise transforming the components so they conform to the viewing illuminant. Viggiano found that white balancing in the camera's native RGB color model tended to produce less color inconstancy (i.e., less distortion of the colors) than in monitor RGB for over 4000 hypothetical sets of camera sensitivities. This difference typically amounted to a factor of more than two in favor of camera RGB. This means that it is advantageous to get color balance right at the time an image is captured, rather than edit later on a monitor. If one must color balance later, balancing the raw image data will tend to produce less distortion of chromatic colors than balancing in monitor RGB. == Mathematics of color balance == Color balancing is sometimes performed on a three-component image (e.g., RGB) using a 3x3 matrix. This type of transformation is appropriate if the image was captured using the wrong white balance setting on a digital camera, or through a color filter. Changing the color balance of an image can improve classifier results on a trained ML model. === Scaling monitor R, G, and B === In principle, one wants to scale all relative luminances in an image so that objects which are believed to be neutral appear so. If, say, a surface with R = 240 {\displaystyle R=240} was believed to be a white object, and if 255 is the count which corresponds to white, one could multiply all red values by 255/240. Doing analogously for green and blue would result, at least in theory, in a color balanced image. In this type of transformation the 3x3 matrix is a diagonal matrix. [ R G B ] = [ 255 / R w ′ 0 0 0 255 / G w ′ 0 0 0 255 / B w ′ ] [ R ′ G ′ B ′ ] {\displaystyle \left[{\begin{array}{c}R\\G\\B\end{array}}\right]=\left[{\begin{array}{ccc}255/R'_{w}&0&0\\0&255/G'_{w}&0\\0&0&255/B'_{w}\end{array}}\right]\left[{\begin{array}{c}R'\\G'\\B'\end{array}}\right]} where R {\displaystyle R} , G {\displaystyle G} , and B {\displaystyle B} are the color balanced red, green, and blue components of a pixel in the image; R ′ {\displaystyle R'} , G ′ {\displaystyle G'} , and B ′ {\displaystyle B'} are the red, green, and blue components of the image before color balancing, and R w ′ {\displaystyle R'_{w}} , G w ′ {\displaystyle G'_{w}} , and B w ′ {\displaystyle B'_{w}} are the red, green, and blue components of a pixel which is believed to be a white surface in the image before color balancing. This is a simple scaling of the red, green, and blue channels, and is why color balance tools in Photoshop have a white eyedropper tool. It has been demonstrated that performing the white balancing in the phosphor set assumed by sRGB tends to produce large errors in chromatic colors, even though it can render the neutral surfaces perfectly neutral. === Scaling X, Y, Z === If the image may be transformed into CIE XYZ tristimulus values, the color balancing may be performed there. This has been termed a "wrong von Kries" transformation. Although it has been demonstrated to offer usually poorer results than balancing in monitor RGB, it is mentioned here as a bridge to other things. Mathematically, one computes: [ X Y Z ] = [ X w / X w ′ 0 0 0 Y w / Y w ′ 0 0 0 Z w / Z w ′ ] [ X ′ Y ′ Z ′ ] {\displaystyle \left[{\begin{array}{c}X\\Y\\Z\end{array}}\right]=\left[{\begin{array}{ccc}X_{w}/X'_{w}&0&0\\0&Y_{w}/Y'_{w}&0\\0&0&Z_{w}/Z'_{w}\end{array}}\right]\left[{\begin{array}{c}X'\\Y'\\Z'\end{array}}\right]} where X {\displaystyle X} , Y {\displaystyle Y} , and Z {\displaystyle Z} are the color-balanced tristimulus values; X w {\displaystyle X_{w}} , Y w {\displaystyle Y_{w}} , and Z w {\displaystyle Z_{w}} are the tristimulus values of the viewing illuminant (the white point to which the image is being transformed to conform to); X w ′ {\displaystyle X'_{w}} , Y w ′ {\displaystyle Y'_{w}} , and Z w ′ {\displaystyle Z'_{w}} are the tristimulus values of an object believed to be white in the un-color-balanced image, and X ′ {\displaystyle X'} , Y ′ {\displaystyle Y'} , and Z ′ {\displaystyle Z'} are the tristimulus values of a pixel in the un-color-balanced image. If the tristimulus values of the monitor primaries are in a matrix P {\displaystyle \mathbf {P} } so that: [ X Y Z ] = P [ L R L G L B ] {\displaystyle \left[{\begin{array}{c}X\\Y\\Z\end{array}}\right]=\mathbf {P} \left[{\begin{array}{c}L_{R}\\L_{G}\\L_{B}\end{array}}\right]} where L R {\displaystyle L_{R}} , L G {\displaystyle L_{G}} , and L B {\displaystyle L_{B}} are the un-gamma corrected monitor RGB, one may use: [ L R L G L B ] = P − 1 [ X w / X w ′ 0 0

    Read more →
  • Manhattan address algorithm

    Manhattan address algorithm

    The Manhattan address algorithm is a series of formulas used to estimate the closest east–west cross street for building numbers on north–south avenues in the New York City borough of Manhattan. == Algorithm == To find the approximate number of the closest cross street, divide the building number by a divisor (generally 20) and add (or subtract) the "tricky number" from the table below: For the north–south avenues, there are typically 20 address numbers between consecutive east–west streets (10 on either side of the avenue). A standard land lot on each avenue was originally 20 feet (6.1 m) wide, and there is about 200 feet (61 m) between each pair of east–west streets, for ten land lots between each pair of streets. The exceptions are Riverside Drive, as well as Fifth Avenue and Central Park West between 59th and 110th streets, which use a divisor of 10. These avenues all have buildings only on one side of the street, with a park on the other side. The "tricky number" often corresponds to a street near the southern end of the avenue. There are some notable exceptions: York Avenue address numbers are continuations of Avenue A address numbers, since the avenue was originally called Avenue A. East End Avenue address numbers are continuations of Avenue B address numbers, since the avenue was originally called Avenue B. Sixth Avenue and Broadway start south of Houston Street, the southern boundary of the Manhattan street numbering system. Although Park Avenue's southern terminus is at 32nd Street, a homeowner at 34th Street wanted the address "1 Park Avenue" (this was later changed). === Examples === For example, if you are at 62 Avenue B, 62 ÷ 20 ≈ 3 {\displaystyle 62\div 20\approx 3} , then add the "tricky number" 3 {\displaystyle 3} to give 6 {\displaystyle 6} . The nearest cross street to 62 Avenue B is East 6th Street. If you are at 78 Riverside Drive, 78 ÷ 10 ≈ 8 {\displaystyle 78\div 10\approx 8} , then add the "tricky number" 72 {\displaystyle 72} to give 80 {\displaystyle 80} . The nearest cross street to 78 Riverside Drive is West 80th Street. If you are at 501 5th Avenue, 501 ÷ 20 ≈ 25 {\displaystyle 501\div 20\approx 25} , then add the "tricky number" 18 {\displaystyle 18} to give 43 {\displaystyle 43} . The nearest cross street to 501 5th Avenue is actually 42nd Street, not 43rd Street, as the Manhattan address algorithm only gives approximate answers.

    Read more →
  • Virtual data room

    Virtual data room

    A virtual data room (sometimes called a VDR or Deal Room) is an online repository of information that is used for the storing and distribution of documents. In many cases, a virtual data room is used to facilitate the due diligence process during an M&A transaction, loan syndication, or private equity and venture capital transactions. This due diligence process has traditionally used a physical data room to accomplish the disclosure of documents. For reasons of cost, efficiency and security, virtual data rooms have widely replaced the more traditional physical data room. A virtual data room is an extranet to which the bidders and their advisers are given access via the internet. An extranet is essentially a website with limited controlled access, using a secure log-on supplied by the vendor, which can be disabled at any time, by the vendor, if a bidder withdraws. Much of the information released is confidential and restrictions are applied to the viewer's ability to release this to third parties (by means of forwarding, copying or printing). This can be effectively applied to protect the data using digital rights management. The virtual data room provides access to secure documents for authorized users through a dedicated web site, or through secure agent applications. In the process of mergers and acquisitions the data room is set up as part of the central repository of data relating to companies or divisions being acquired or sold. The data room enables the interested parties to view information relating to the business in a controlled environment where confidentiality can be preserved. Conventionally this was achieved by establishing a supervised, physical data room in secure premises with controlled access. In most cases, with a physical data room, only one bidder team can access the room at a time. A virtual data room is designed to have the same advantages as a conventional data room (controlling access, viewing, copying and printing, etc.) with fewer disadvantages. Due to their increased efficiency, many businesses and industries have moved to using virtual data rooms instead of physical data rooms. In 2006, a spokesperson for a company which sets up virtual deal rooms was reported claiming that the process reduced the bidding process by about thirty days compared to physical data rooms. In the process of startup fundraising, a virtual data room is set up to be a central location for key data, documents, and financials. These are shared with venture capital and angel investors and allows them to streamline due diligence. == Application == Any business dealing with private data can apply VDRs when secure transaction processing is required. This includes financial institutions that need to negotiate confidential customer information without involving third parties. VDRs have traditionally been used for IPOs and real estate asset management. Technology companies may use them to exchange and review code or confidential data needed for operations. The same is true for clients, who entrust their valuable code only to the most qualified people in the organisation. The code is not something that can be printed out and brought in a folder. It resides on a computer and must be used together. VDR can find application in any business that manages data in the form of documents, especially law firms, financial advisers or the B2B sector. The latter work with documents that must always be handled and controlled confidentially, and it is difficult to store them securely when they are on a server that other people can access. In addition, in B2B, it is important to close the deal as quickly as possible: the average sales cycle is one to three months. VDR can be compared to a locked filing cabinet where all those folders and documents are kept. It automates the mathematics of pricing to prevent revenue leakage, and initially integrates CRM to ensure accurate synchronisation of all account data, which is important for B2B in particular and sales in general. While virtual data rooms offer many advantages, they are not suitable for every industry. For example, some governments may decide to continue using physical data rooms for highly confidential information sharing. The damage from potential cyberattacks and data breaches exceeds the benefits offered by virtual data rooms. In such cases, the use of VDRs is not considered. Data breaches have particularly affected the US healthcare system from March 2021 to March 2022 - according to IBM Security the cost of the breach was a record high of $10.1 million.

    Read more →
  • Algorithm engineering

    Algorithm engineering

    Algorithm engineering focuses on the design, analysis, implementation, optimization, profiling and experimental evaluation of computer algorithms, bridging the gap between algorithmics theory and practical applications of algorithms in software engineering. It is a general methodology for algorithmic research. == Origins == In 1995, a report from an NSF-sponsored workshop "with the purpose of assessing the current goals and directions of the Theory of Computing (TOC) community" identified the slow speed of adoption of theoretical insights by practitioners as an important issue and suggested measures to reduce the uncertainty by practitioners whether a certain theoretical breakthrough will translate into practical gains in their field of work, and tackle the lack of ready-to-use algorithm libraries, which provide stable, bug-free and well-tested implementations for algorithmic problems and expose an easy-to-use interface for library consumers. But also, promising algorithmic approaches have been neglected due to difficulties in mathematical analysis. The term "algorithm engineering" was first used with specificity in 1997, with the first Workshop on Algorithm Engineering (WAE97), organized by Giuseppe F. Italiano. == Difference from algorithm theory == Algorithm engineering does not intend to replace or compete with algorithm theory, but tries to enrich, refine and reinforce its formal approaches with experimental algorithmics (also called empirical algorithmics). This way it can provide new insights into the efficiency and performance of algorithms in cases where the algorithm at hand is less amenable to algorithm theoretic analysis, formal analysis pessimistically suggests bounds which are unlikely to appear on inputs of practical interest, the algorithm relies on the intricacies of modern hardware architectures like data locality, branch prediction, instruction stalls, instruction latencies which the machine model used in Algorithm Theory is unable to capture in the required detail, the crossover between competing algorithms with different constant costs and asymptotic behaviors needs to be determined. == Methodology == Some researchers describe algorithm engineering's methodology as a cycle consisting of algorithm design, analysis, implementation and experimental evaluation, joined by further aspects like machine models or realistic inputs. They argue that equating algorithm engineering with experimental algorithmics is too limited, because viewing design and analysis, implementation and experimentation as separate activities ignores the crucial feedback loop between those elements of algorithm engineering. === Realistic models and real inputs === While specific applications are outside the methodology of algorithm engineering, they play an important role in shaping realistic models of the problem and the underlying machine, and supply real inputs and other design parameters for experiments. === Design === Compared to algorithm theory, which usually focuses on the asymptotic behavior of algorithms, algorithm engineers need to keep further requirements in mind: Simplicity of the algorithm, implementability in programming languages on real hardware, and allowing code reuse. Additionally, constant factors of algorithms have such a considerable impact on real-world inputs that sometimes an algorithm with worse asymptotic behavior performs better in practice due to lower constant factors. === Analysis === Some problems can be solved with heuristics and randomized algorithms in a simpler and more efficient fashion than with deterministic algorithms. Unfortunately, this makes even simple randomized algorithms difficult to analyze because there are subtle dependencies to be taken into account. === Implementation === Huge semantic gaps between theoretical insights, formulated algorithms, programming languages and hardware pose a challenge to efficient implementations of even simple algorithms, because small implementation details can have rippling effects on execution behavior. The only reliable way to compare several implementations of an algorithm is to spend an considerable amount of time on tuning and profiling, running those algorithms on multiple architectures, and looking at the generated machine code. === Experiments === See: Experimental algorithmics === Application engineering === Implementations of algorithms used for experiments differ in significant ways from code usable in applications. While the former prioritizes fast prototyping, performance and instrumentation for measurements during experiments, the latter requires thorough testing, maintainability, simplicity, and tuning for particular classes of inputs. === Algorithm libraries === Stable, well-tested algorithm libraries like LEDA play an important role in technology transfer by speeding up the adoption of new algorithms in applications. Such libraries reduce the required investment and risk for practitioners, because it removes the burden of understanding and implementing the results of academic research. == Conferences == Two main conferences on Algorithm Engineering are organized annually, namely: Symposium on Experimental Algorithms (SEA), established in 1997 (formerly known as WEA). SIAM Meeting on Algorithm Engineering and Experiments (ALENEX), established in 1999. The 1997 Workshop on Algorithm Engineering (WAE'97) was held in Venice (Italy) on September 11–13, 1997. The Third International Workshop on Algorithm Engineering (WAE'99) was held in London, UK in July 1999. The first Workshop on Algorithm Engineering and Experimentation (ALENEX99) was held in Baltimore, Maryland on January 15–16, 1999. It was sponsored by DIMACS, the Center for Discrete Mathematics and Theoretical Computer Science (at Rutgers University), with additional support from SIGACT, the ACM Special Interest Group on Algorithms and Computation Theory, and SIAM, the Society for Industrial and Applied Mathematics.

    Read more →
  • Adobe Presenter

    Adobe Presenter

    Adobe Presenter is eLearning software released by Adobe Systems available on the Microsoft Windows platform as a Microsoft PowerPoint plug-in, and on both Windows and OS X as the screencasting and video editing tool Adobe Presenter Video Express. It is mainly targeted towards learning professionals and trainers. In addition to recording one's computer desktop and speech, it also provides the option to add quizzes and track performance by integrating with learning management systems. Adobe Presenter was designed to replace the discontinued Adobe Ovation software, which had similar functions. == Predecessor == Adobe Ovation was originally released by Serious Magic. It converted PowerPoint slides into visual presentations with additional effects. Ovation included themes called PowerLooks that could add motion and polish the presentations. They were available in a variety of color variations complete with animated backgrounds and dynamic text effects. Ovation could make text with jagged edges more readable. TimeKeeper could be used to set the period of the presentation, and the PointPrompter scrolled down the notes. Ovation's development has been discontinued, nor does it support PowerPoint 2007. == Features == The main purpose of Adobe Presenter is to capture on-screen presentations and convert them into more interactive and engaging videos. Support is given to convert Microsoft PowerPoint 2010 and 2013 presentations into videos. It also allows for content authoring on PowerPoint and ActionScript 3, and offers integration with Adobe Captivate. Slide branching enables users to control slide navigation and titles and create complex slide branching to guide viewers through the content of the presentation. Video editing tools are also provided, and offer the ability to upload to video-sharing platforms such as YouTube, Vimeo and other sites. Multimedia features such as annotations, eLearning templates, actors, audio narration and drag-and-drop elements enrich users' presentations. Quizzes and surveys is another highlighted feature, which include generating question pools, importing questions from existing quizzes and in-course collaboration which allows presenters to receive feedback by allowing them to comment on specific content within a course or ask questions for more clarity. Presenters could opt to receive feedback from viewers through video analytics and create Experience API, SCORM and AICC-compliant content. Options to publish to Adobe Connect are provided. Other unique features include universal standards support, file size control, navigational restrictions among others.

    Read more →
  • Virtual directory

    Virtual directory

    In computing, the term virtual directory has a couple of meanings. It may simply designate (for example in IIS) a folder which appears in a path but which is not actually a subfolder of the preceding folder in the path. However, this article will discuss the term in the context of directory services and identity management. A virtual directory or virtual directory server (VDS) in this context is a software layer that delivers a single access point for identity management applications and service platforms. A virtual directory operates as a high-performance, lightweight abstraction layer that resides between client applications and disparate types of identity-data repositories, such as proprietary and standard directories, databases, web services, and applications. A virtual directory receives queries and directs them to the appropriate data sources by abstracting and virtualizing data. The virtual directory integrates identity data from multiple heterogeneous data stores and presents it as though it were coming from one source. This ability to reach into disparate repositories makes virtual directory technology ideal for consolidating data stored in a distributed environment. As of 2011, virtual directory servers most commonly use the LDAP protocol, but more sophisticated virtual directories can also support SQL as well as DSML and SPML. Industry experts have heralded the importance of the virtual directory in modernizing the identity infrastructure. According to Dave Kearns of Network World, "Virtualization is hot and a virtual directory is the building block, or foundation, you should be looking at for your next identity management project." In addition, Gartner analyst, Bob Blakley said that virtual directories are playing an increasingly vital role. In his report, “The Emerging Architecture of Identity Management,” Blakley wrote: “In the first phase, production of identities will be separated from consumption of identities through the introduction of a virtual directory interface.” == Capabilities == Virtual directories can have some or all of the following capabilities: Aggregate identity data across sources to create a single point of access. Create high-availability for authoritative data stores. Act as identity firewall by preventing denial-of-service attacks on the primary data stores through an additional virtual layer. Support a common searchable namespace for centralized authentication. Present a unified virtual view of user information stored across multiple systems. Delegate authentication to backend sources through source-specific security means. Virtualize data sources to support migration from legacy data stores without modifying the applications that rely on them. Enrich identities with attributes pulled from multiple data stores, based on a link between user entries. Some advanced identity virtualization platforms can also: Enable application-specific, customized views of identity data without violating internal or external regulations governing identity data. Reveal contextual relationships between objects through hierarchical directory structures. Develop advanced correlation across diverse sources using correlation rules. Build a global user identity by correlating unique user accounts across various data stores, and enrich identities with attributes pulled from multiple data stores, based on a link between user entries. Enable constant data refresh for real-time updates through a persistent cache. == Advantages == Virtual directories: Enable faster deployment because users do not need to add and sync additional application-specific data sources Leverage existing identity infrastructure and security investments to deploy new services Deliver high availability of data sources Provide application-specific views of identity data which can help avoid the need to develop a master enterprise schema Allow a single view of identity data without violating internal or external regulations governing identity data Act as identity firewalls by preventing denial-of-service attacks on the primary data-stores and providing further security on access to sensitive data Can reflect changes made to authoritative sources in real-time Leverages existing update processes of authoritative sources, so no separate (sometimes manual) process to update a central directory is needed Present a unified virtual view of user information from multiple systems so that it appears to reside in a single system Can secure all backend storage locations with a single security policy == Disadvantages == An original disadvantage is public perception of "push & pull technologies" which is the general classification of "virtual directories" depending on the nature of their deployment. Virtual directories were initially designed and later deployed with "push technologies" in mind, which also contravened with privacy laws of the United States. This is no longer the case. There are, however, other disadvantages in the current technologies. The classical virtual directory based on proxy cannot modify underlying data structures or create new views based on the relationships of data from across multiple systems. So if an application requires a different structure, such as a flattened list of identities, or a deeper hierarchy for delegated administration, a virtual directory is limited. Many virtual directories cannot correlate same-users across multiple diverse sources in the case of duplicate users Virtual directories without advanced caching technologies cannot scale to heterogeneous, high-volume environments. == Sample terminology == Unify metadata: Extract schemas from the local data source, map them to a common format, and link the same identities from different data silos based on a unique identifier. Namespace joining: Create a single large directory by bringing multiple directories together at the namespace level. For instance, if one directory has the namespace "ou=internal,dc=domain,dc=com" and a second directory has the namespace "ou=external,dc=domain,dc=com," then creating a virtual directory with both namespaces is an example of namespace joining. Identity joining: Enrich identities with attributes pulled from multiple data stores, based on a link between user entries. For instance if the user joeuser exists in a directory as "cn=joeuser,ou=users" and in a database with a username of "joeuser" then the "joeuser" identity can be constructed from both the directory and the database. Data remapping: The translation of data inside of the virtual directory. For instance, mapping “uid” to “samaccountname,” so a client application that only supports a standard LDAP-compliant data source is able to search an Active Directory namespace, as well. Query routing: Route requests based on certain criteria, such as “write operations going to a master, while read operations are forwarded to replicas.” Identity routing: Virtual directories may support the routing of requests based on certain criteria (such as write operations going to a master while read operations being forwarded to replicas). Authoritative source: A "virtualized" data repository, such as a directory or database, that the virtual directory can trust for user data. Server groups: Group one or more servers containing the same data and functionality. A typical implementation is the multi-master, multi-replica environment in which replicas process "read" requests and are in one server group, while masters process "write" requests and are in another, so that servers are grouped by their response to external stimuli, even though all share the same data. == Use cases == The following are sample use cases of virtual directories: Integrating multiple directory namespaces to create a central enterprise directory. Supporting infrastructure integrations after mergers and acquisitions. Centralizing identity storage across the infrastructure, making identity information available to applications through various protocols (including LDAP, JDBC, and web services). Creating a single access point for web access management (WAM) tools. Enabling web single sign-on (SSO) across varied sources or domains. Supporting role-based, fine-grained authorization policies Enabling authentication across different security domains using each domain’s specific credential checking method. Improving secure access to information both inside and outside of the firewall.

    Read more →
  • Bibliographic database

    Bibliographic database

    A bibliographic database is a database of bibliographic records. This is an organised online collection of references to published written works like journal and newspaper articles, conference proceedings, reports, government and legal publications, patents and books. In contrast to library catalogue entries, a majority of the records in bibliographic databases describe articles and conference papers rather than complete monographs, and they generally contain very rich subject descriptions in the form of keywords, subject classification terms, or abstracts. A bibliographic database may cover a wide range of topics or one academic field like computer science. A significant number of bibliographic databases are marketed under a trade name by licensing agreement from vendors, or directly from their makers: the indexing and abstracting services. Many bibliographic databases have evolved into digital libraries, providing the full text of the organised contents:for instance CORE also organises and mirrors scholarly articles and OurResearch develops a search engine for open access content in Unpaywall. Others merge with non-bibliographic and scholarly databases to create more complete disciplinary search engine systems, such as Chemical Abstracts or Entrez. == History == Prior to the mid-20th century, individuals searching for published literature had to rely on printed bibliographic indexes, generated manually from index cards. During the early 1960s computers were used to digitize text for the first time; the purpose was to reduce the cost and time required to publish two American abstracting journals, the Index Medicus of the National Library of Medicine and the Scientific and Technical Aerospace Reports of the National Aeronautics and Space Administration (NASA). By the late 1960s, such bodies of digitized alphanumeric information, known as bibliographic and numeric databases, constituted a new type of information resource. Online interactive retrieval became commercially viable in the early 1970s over private telecommunications networks. The first services offered a few databases of indexes and abstracts of scholarly literature. These databases contained bibliographic descriptions of journal articles that were searchable by keywords in author and title, and sometimes by journal name or subject heading. The user interfaces were crude, the access was expensive, and searching was done by librarians on behalf of "end users".

    Read more →