AI For Work Kore

AI For Work Kore — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Dataset shift

    Dataset shift

    Dataset shift is a phenomenon in machine learning and statistics in which the joint distribution of input variables and target labels is different in the training phase and the deployment or test phase (i.e., P t r a i n ( X , Y ) ≠ P t e s t ( X , Y ) {\displaystyle P_{train}(X,Y)\neq P_{test}(X,Y)} ). This happens when the statistical properties of data used to train a model are no longer representative of the data encountered in real-world use, often resulting in degraded predictive performance and diminished generalization ability. Dataset shift is a generic term for a number of particular types of distributional change. Covariate shift is when the distribution of the input features changes, but the conditional relationship between inputs and outputs remains constant . Prior probability shift (or label shift) happens when the distribution of target labels changes, but the conditional distribution of inputs given labels stays the same. Concept shift (also known as concept drift) is the change of the conditional relationship between inputs and outputs that renders previously learned patterns invalid over time. A key challenge for deploying machine learning systems is dataset shift, in particular in dynamic environments where the data distributions change over time. Detecting and mitigating such shifts is an active area of research, e.g., drift detection, domain adaptation, continual learning.

    Read more →
  • Philco computers

    Philco computers

    Philco was one of the pioneers of transistorized computers, also known as second-generation computers. After the company developed the surface-barrier transistor, which was much faster than previous point-contact types, it was awarded contracts for military and government computers. Commercialized derivatives of some of these designs became successful business and scientific computers. The TRANSAC (Transistor Automatic Computer) Model S-1000 was released as a scientific computer. The TRANSAC S-2000 mainframe computer system was first produced in 1958, and a family of compatible machines, with increasing performance, was released over the next several years. However, the mainframe computer market was dominated by IBM. Other companies could not deploy resources for development, customer support and marketing on the scale that IBM could afford, making competition in this segment difficult after the introduction of the IBM 360 family. Philco went bankrupt and was purchased in 1961 by Ford Motor Company, but the computer division carried on until the Philco division of Ford exited the computer business in 1963. The Ford company maintained one Philco mainframe in use until 1981. == The surface-barrier transistor == The surface-barrier transistor developed by Philco in 1953 had a much higher frequency response than the original point-contact transistors. The transistor was made of a thin crystal of germanium, which was electrolytically etched with pits on either side forming a very thin base region, on the order of 5 micrometers. Philco's process for etching was United States patent number 2,885,571. Philco surface-barrier transistors were used in TX-0, and in early models of what would become the DEC PDP product line. Although relatively fast, the small size of the devices limited their power to circuits operating at a few tens of milliwatts. == Military and government == Between 1955 and 1957, Philco built transistor computers for use in aircraft, models C-1000, C-1100, and C-1102, intended for airborne real-time applications. By 1957, the C-1102 had been used by a civilian sector customer. The BASICPAC AN/TYK 6V (first delivery in 1961), COMPAC AN/TYK 4V (not completed), and LOGICPAC systems were built for the US Army as transportable computer systems for use with their Fieldata concept of integrated information management. BASICPAC was a transistorized computer with up to 28,672 words of 38-bit core memory (including sign and parity), available in several configurations from a minimum system, to a truck-borne mobile version, to a fully expanded system. Basic clock periods was 1 microsecond (which gives a clock rate of 1 MHz), with 12 microsecond memory access and a fixed-point multiplication taking 242 microseconds. Input/output was by paper tape reader and punch, or through a teletypewriter. With additional hardware, magnetic tape storage was also available, with up to seven I/O devices. The instruction set had 31 basic operation codes and nine opcodes for I/O === CXPQ === Philco was contracted by the US Navy to build the CXPQ computer. One model was completed and installed at the David Taylor Model Basin. This design was later adapted to become the commercial TRANSAC S-2000. Only one CXPQ was built. The CXPQ is a 48-bit transistorized computer. === SOLO === In 1955, the National Security Agency through the US Navy contracted with Philco to produce a computer suitable for use as a workstation, with an architecture based on the vacuum-tube computer system called Atlas II already in use at the NSA, and similar to the commercial UNIVAC 1103. At the time, Philco was the largest producer of surface barrier transistors, which were the only type available with the speed and quantities required for a computer. The SOLO prototype was delivered in 1958, but required extensive debugging at NSA. Difficulties were encountered with core memory and power supplies. SOLO used paper tape and teleprinter machines for input and output. SOLO cost about $1 million US, and contained 8,000 transistors. While the system was extensively used for training, testing, research and development, no additional units were ordered. SOLO was removed from active service in 1963. The design of the SOLO became commercialized as Philco's TRANSAC Model S-1000. == Commercial == === S-1000 === The TRANSAC S-1000 was a scientific computer with a 36-bit word length and 4096 words of core memory. It was packaged in a container about the size of a large office desk, and used only 1.2 kilowatts, much less than vacuum-tube-based computers of similar capacity. In a 1961 survey, about 15 S-1000 computer installations had been identified. It weighed about 1,650 pounds (750 kg). === S-2000 === The TRANSAC S-2000 was a large mainframe system intended for both business and scientific work. It had a 48-bit word length and supported calculations in fixed point, floating point and binary-coded decimal formats. The original S-2000 "TRANSAC" (Transistor Automatic Computer) released in 1958 was later designated Model 210; it was used internally at Philco. Similar to the Control Data Corporation Model 1604, it was a 48-bit fully transistorized computer. Three succeeding models were released in the series, all compatible with the software of the original model. The Model 211 was introduced in 1960, using micro-alloy diffused field-effect transistors, requiring significant redesign of circuits compared to the original. The TRANSAC S-2000/Philco 210/211 weighed about 2,000 pounds (910 kg). By 1964, eighteen Model 210, eighteen Model 211 and seven Model 212 systems had been sold. After Philco was purchased by Ford Motor Company, the Model 212 was introduced in 1962 and released in 1963. It had 65,535 words of 48-bit memory. Initially made with 6-microsecond core memory, it had better performance than the IBM 7094 transistor computer. It was later upgraded in 1964 to 2-microsecond core memory, which gave the machine floating-point performance greater than the IBM 7030 Stretch computer. A Model 213 was announced in 1964 but never built. By that time competition from IBM had made the Philco computer operations no longer profitable for Ford, and the division was closed down. The Model 212 could carry out a floating-point multiplication in 22 microseconds. Each word contained two 24-bit instructions with 16 bits of address information and eight bits for the opcode. There were 225 different valid opcodes in the Model 212; invalid opcodes were detected and halted the machine. The CPU had an accumulator register of 48 bits, three general-purpose registers of 24 bits, and 32 index registers of 15 bits. Main memory size ranged from 4K words to 64K words. Only the first model had a magnetic drum memory; later editions used tape drives. The Model 212 weighed about 6,500 pounds (3.3 short tons; 2.9 t). Software for the S-2000 initially consisted of TAC (Translator-Assembler-Compiler), and ALTAC, a FORTRAN II-like language with some differences from the IBM 704 FORTRAN implementation. A COBOL compiler was also available, targeted at business applications. The Philco 2400 was the input/output system for the S-2000. Operations such as reading cards or printing were carried out through magnetic tapes, thereby offloading the S-2000 from relatively slow input/output processing. The 2400 had a 24-bit word length and could be supplied with 4K to 32K characters (1K to 8K words) of core memory, rated at 3-microsecond cycle time. The instruction set was aimed at character I/O use. The idea of base registers, implemented in Philco computers, influenced the design of IBM/360. The last Philco TRANSAC S-2000 Model 212 was taken out of service in December 1981, after 19 years of service at Ford.

    Read more →
  • Data Transformation Services

    Data Transformation Services

    Data Transformation Services (DTS) is a Microsoft database tool with a set of objects and utilities to allow the automation of extract, transform and load operations to or from a database. The objects are DTS packages and their components, and the utilities are called DTS tools. DTS was included with earlier versions of Microsoft SQL Server, and was almost always used with SQL Server databases, although it could be used independently with other databases. DTS allows data to be transformed and loaded from heterogeneous sources using OLE DB, ODBC, or text-only files, into any supported database. DTS can also allow automation of data import or transformation on a scheduled basis, and can perform additional functions such as FTPing files and executing external programs. In addition, DTS provides an alternative method of version control and backup for packages when used in conjunction with a version control system, such as Microsoft Visual SourceSafe. DTS has been superseded by SQL Server Integration Services in later releases of Microsoft SQL Server though there was some backwards compatibility and ability to run DTS packages in the new SSIS for a time. == History == In SQL Server versions 6.5 and earlier, database administrators (DBAs) used SQL Server Transfer Manager and Bulk Copy Program, included with SQL Server, to transfer data. These tools had significant shortcomings, and many DBAs used third-party tools such as Pervasive Data Integrator to transfer data more flexibly and easily. With the release of SQL Server 7 in 1998, "Data Transformation Services" was packaged with it to replace all these tools. The concept, design, and implementation of the Data Transformation Services was led by Stewart P. MacLeod (SQL Server Development Group Program Manager), Vij Rajarajan (SQL Server Lead Developer), and Ted Hart (SQL Server Lead Developer). The goal was to make it easier to import, export, and transform heterogeneous data and simplify the creation of data warehouses from operational data sources. SQL Server 2000 expanded DTS functionality in several ways. It introduced new types of tasks, including the ability to FTP files, move databases or database components, and add messages into Microsoft Message Queue. DTS packages can be saved as a Visual Basic file in SQL Server 2000, and this can be expanded to save into any COM-compliant language. Microsoft also integrated packages into Windows 2000 security and made DTS tools more user-friendly; tasks can accept input and output parameters. DTS comes with all editions of SQL Server 7 and 2000, but was superseded by SQL Server Integration Services in the Microsoft SQL Server 2005 release in 2005. == DTS packages == The DTS package is the fundamental logical component of DTS; every DTS object is a child component of the package. Packages are used whenever one modifies data using DTS. All the metadata about the data transformation is contained within the package. Packages can be saved directly in a SQL Server, or can be saved in the Microsoft Repository or in COM files. SQL Server 2000 also allows a programmer to save packages in a Visual Basic or other language file (when stored to a VB file, the package is actually scripted—that is, a VB script is executed to dynamically create the package objects and its component objects). A package can contain any number of connection objects, but does not have to contain any. These allow the package to read data from any OLE DB-compliant data source, and can be expanded to handle other sorts of data. The functionality of a package is organized into tasks and steps. A DTS Task is a discrete set of functionalities executed as a single step in a DTS package. Each task defines a work item to be performed as part of the data movement and data transformation process or as a job to be executed. Data Transformation Services supplies a number of tasks that are part of the DTS object model and that can be accessed graphically through the DTS Designer or accessed programmatically. These tasks, which can be configured individually, cover a wide variety of data copying, data transformation and notification situations. For example, the following types of tasks represent some actions that you can perform by using DTS: executing a single SQL statement, sending an email, and transferring a file with FTP. A step within a DTS package describes the order in which tasks are run and the precedence constraints that describe what to do in the case damage or of failure. These steps can be executed sequentially or in parallel. Packages can also contain global variables which can be used throughout the package. SQL Server 2000 allows input and output parameters for tasks, greatly expanding the usefulness of global variables. DTS packages can be edited, password protected, scheduled for execution, and retrieved by version. == DTS tools == DTS tools packaged with SQL Server include the DTS wizards, DTS Designer, and DTS Programming Interfaces. === DTS wizards === The DTS wizards can be used to perform simple or common DTS tasks. These include the Import/Export Wizard and the Copy of Database Wizard. They provide the simplest method of copying data between OLE DB data sources. There is a great deal of functionality that is not available by merely using a wizard. However, a package created with a wizard can be saved and later altered with one of the other DTS tools. A Create Publishing Wizard is also available to schedule packages to run at certain times. This only works if SQL Server Agent is running; otherwise the package will be scheduled, but will not be executed. === DTS Designer === The DTS Designer is a graphical tool used to build complex DTS Packages with workflows and event-driven logic. DTS Designer can also be used to edit and customize DTS Packages created with the DTS wizard. Each connection and task in DTS Designer is shown with a specific icon. These icons are joined with precedence constraints, which specify the order and requirements for tasks to be run. One task may run, for instance, only if another task succeeds (or fails). Other tasks may run concurrently. The DTS Designer has been criticized for having unusual quirks and limitations, such as the inability to visually copy and paste multiple tasks at one time. Many of these shortcomings have been overcome in SQL Server Integration Services, DTS's successor. === DTS Query Designer === A graphical tool used to build queries in DTS. === DTS Run Utility === DTS Packages can be run from the command line using the DTSRUN Utility. The utility is invoked using the following syntax: dtsrun /S server_name[\instance_name] { {/[~]U user_name [/[~]P password]} | /E } ] { {/[~]N package_name } | {/[~]G package_guid_string} | {/[~]V package_version_guid_string} } [/[~]M package_password] [/[~]F filename] [/[~]R repository_database_name] [/A global_variable_name:typeid=value] [/L log_file_name] [/W NT_event_log_completion_status] [/Z] [/!X] [/!D] [/!Y] [/!C] ] When passing in parameters which are mapped to Global Variables, you are required to include the typeid. This is rather difficult to find on the Microsoft site. Below are the TypeIds used in passing in these values.

    Read more →
  • Pinoy baiting

    Pinoy baiting

    Pinoy baiting is a phrase that has been used to refer to acts by non-Filipino individuals, usually celebrities or YouTubers, of posting content online purportedly with the intention of getting the attention of Filipinos, by being surprised about the Philippines or its people. Pinoy baiters are defined as giving superficial and allegedly insincere praises and similar reactions that give recognition to the Philippines or its people. Subsequent responses by Filipinos to what have been referred to as acts of Pinoy baiting have been criticized as a form of cultural cringe. This criticism would subsequently give the advice that Filipinos should not constantly require validation from non-Filipinos about themselves or their country. == Pinoy baiting mediums == === Reaction videos === On social media such as YouTube, channels with specific focus on showing their reaction towards and opinions about certain videos or topics are called reaction channels. Reaction videos are very popular and require minimal effort to create, and thus made it easy for alleged Pinoy baiting to thrive within this video-making genre. === Travel vlogs === Vlogging, short for video blogging, grew in popularity in the 2020s. Most of the popular alleged Pinoy-baiting channels tend to be vlog channels, normally following the same script under such titles as "The Philippines changed us/me", "First impression of the Philippines", "Is this really Manila?" and "Filipinos are such Kind/Good People!", and made while travelling to touristy areas such as Boracay or Bonifacio Global City and taste-testing the fast food chain Jollibee, among others. == Criticism of the phrase == Philippines-based Korean vlogger Jessica Lee had been accused by some YouTube viewers of engaging in Pinoy baiting. In a response vlog, Lee acknowledged that there may be individuals engaging in this "business strategy" of gaining views and subscribers from one of the largest communities online. However, she questioned the objectivity of some use of the phrase, citing any vlogging subject as fair game for a negative impression of being a "baiting" tool for the vlogger treating of that subject. She also invoked vloggers' freedom to choose whatever subject they want to talk about in a deep or shallow manner, while enjoining citizens to exercise their free-market right to unfollow vloggers they hate and follow those vloggers that "make them happy". She also gave her critics an explanation why she ended up vlogging about Philippine and Filipino subjects.

    Read more →
  • Class activation mapping

    Class activation mapping

    Class activation mapping methods are explainable AI (XAI) techniques used to visualize the regions of an input image that are the most relevant for a particular task, especially image classification, in convolutional neural networks (CNNs). These methods generate heatmaps by weighting the feature maps from a convolutional layer according to their relevance to the target class. In the field of artificial intelligence, generically defined as "the effort to automate intellectual tasks normally performed by humans", machine learning and deep learning were created. They both use statistical and computational methods to learn patterns from data, reducing the need for manually coded rules. Machine learning models are trained on input data and the known respective answers, learning the underlying patterns or structures present in the data. Traditional Machine learning algorithms employ manually designed feature sets, posing a direct link between machine learning designers and employed features. Deep learning is a subfield of machine learning, based on the concept of successive layers of representation, in which the data is progressively unfolded in different ways, to extract relevant and informative patterns in data analysis. Deep learning algorithms are defined as feature learning algorithms automatically learning hierarchical feature representations from raw data, extracting increasingly abstract features through multiple layers. CNNs are a specific architecture of deep learning models, designed to process spatially structured data, such as images, exploiting a series of convolution, non-linear activation and pooling operations to extract relevant features, contained in the so-called feature maps from input data. CNNs have demonstrated to be highly effective in a variety of computer vision and image processing tasks. CNNs (and deep learning models more broadly) are described as black boxes due to their complex and non-transparent internal layers of representation. The need for clearer indications on its internal working and decision-making process gave birth to XAI techniques. Among the proposed XAI techniques for computer vision tasks, Class activation mapping methods can show which pixels in an input image are important to the predicted logit for a class of interest, in a classification task. Class activation mapping methods were originally developed for class-discriminative scenarios to visualize which parts of the input image influenced the classification decision, namely to visually highlight the regions of those feature maps that contribute most strongly to the prediction of a given class. More advanced versions of these methods are not limited to image classification tasks, but have been extended also to several vision-related tasks, such as object detection, image captioning, visual question answering and image segmentation. == Background == The following methods laid the groundwork for the class activation maps approaches, forming the conceptual basis of using gradients to highlight class-discriminative regions. === Class model visualization and saliency maps for convolutional neural networks === The class model visualization and image-specific saliency maps approaches have been presented in the foundational work "Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps" by Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman and it generalizes the deconvnet method by Zeiler and Fergus. Class model visualization synthesizes an artificial input image that strongly activates the output neurons associated with a target class. Given a trained, fixed model, this method starts with a zero-initialized image, backpropagates the gradients from the class score to the image pixels, updates the image pixels increasing the specific class scores and it repeats the pixel updating process, showing an encoded (idealized version) prototype of the class of interest. Image-specific class saliency visualization method provides a visual explanation by highlighting the most relevant pixels in an image for predicting a certain class C of interest. This is done by computing the gradient of the class score with respect to the input image, I 0 , {\displaystyle I_{0},} w = ∂ S C ∂ I | I 0 {\displaystyle w=\left.{\frac {\partial S_{C}}{\partial I}}\right|_{I_{0}}} approximating the model locally (around I 0 {\displaystyle I_{0}} ) as linear, using a first-order Taylor expansion: S C ( I ) ≈ w C T I + b {\displaystyle S_{C}(I)\approx w_{C}^{T}I+b} . The magnitude of w C {\displaystyle w_{C}} , the gradient, indicates the importancy of the pixels: larger gradients suggest greater influence on the prediction. Once the gradient is known, the saliency map is defined as the maximum absolute gradient across the color channels: M i j = m a x C | ∂ S C ∂ I i j C | {\displaystyle M_{ij}=max_{C}\left|{\frac {\partial S_{C}}{\partial I_{ij}^{C}}}\right|} resulting in an saliency map (i.e. heatmap). === Guided backpropagation === The concept of guided backpropagation can be traced for the first time in the paper by Springenberg et al. "Striving For Simplicity: The All Convolutional Net" and also this method builds upon the work by Zeiler and Fergus "Visualizing and Understanding Convolutional Networks". Guided backpropagation core is to understand what a CNN is learning, by visualizing the patterns that activate more strongly individual neurons (or filters), in architectures which do not rely on max-pooling layer. When propagating gradients back through a rectified linear unit (ReLU), guided backpropagation passes the gradient if and only if the input to the ReLU was positive (forward pass) and the output gradient is positive (backward signal), tackling both inactive neurons, negative gradients and suppressing the noise. The result displays sharper, high-resolution visualizations of what each neuron is responding to. Guided backpropagation represents a simple and practical method for model interpretability, helping understand how and where neural networks detect semantic concepts across layers. Moreover, it can be applied to any network architecture, due to its working principle. == Base versions == Class activation mapping and gradient-weighted class activation mapping are the original and most widely used methods for visual explanations in convolutional neural networks. These methods serve as the foundation for many later developments in explainable AI. Notation: In this article, the symbols i and j represent integer indices that disappear inside sums or averages, while x and y are the continuous (or up-sampled integer) coordinates of the final heat-map that is plotted. === Class activation mapping (CAM) === Class activation mapping (CAM) was the first, and the original, version of CAM methods, and it gave the name to the whole category. The approach was firstly introduced by Zhou et al. in their seminal work "Learning Deep Features for Discriminative Localization". This approach achieves class-specific heatmaps by modifying image classification CNN architectures, replacing fully-connected layers with convolutional layers and a final global average pooling layer. Its main scope is to localize and highlight discriminative regions of an input image that a CNN uses to identify a particular class, without needing explicit bounding box annotations. ==== Global average pooling (GAP) ==== Global average pooling (GAP) represents the key element in the original CAM approach. It is a dimensionality reduction technique and, similarly to other pooling layers, it allows the downsampling of the feature maps, calculating representative values for a specific region of the feature map. The particularity of GAP is that it calculates a single value for an entire feature map, significantly reducing the model dimensions. ==== Mathematical description ==== The mathematical description considers as its key the combination of convolutional and GAP layers. In CAM, it is mandatory to have the GAP layer after the last convolutional layer and before the final linear classifier layer. This last element of the architecture connects the output logits (the network predictions) y C {\displaystyle y^{C}} , to the GAP values, with its respective fine-tuned weights, w k C {\displaystyle w_{k}^{C}} . Considering A k {\displaystyle A^{k}} as the last feature maps of the last convolutional layer, GAP produces one value for each feature map, by averaging all the matrix elements (i, j) of the feature map: F k = 1 m n ∑ i = 1 m ∑ j = 1 n A i j k {\displaystyle F^{k}={\frac {1}{mn}}\sum _{i=1}^{m}\sum _{j=1}^{n}A_{ij}^{k}} with A k = [ A 11 k A 12 k ⋯ A 1 n k A 21 k A 22 k ⋯ A 2 n k ⋮ ⋮ ⋱ ⋮ A m 1 k A m 2 k ⋯ A m n k ] = { A i j k ∣ 1 ≤ i ≤ m , 1 ≤ j ≤ n } {\displaystyle A^{k}={\begin{bmatrix}A_{11}^{k}&A_{12}^{k}&\cdots &A_{1n}^{k}\\A_{21}^{k}&A_{22}^{k}&\cdots &A_{2n}^{k}\\\vdots &\vdots &\ddots &\vdots \\A_{m1}^{k}&A_{m2}^{k}&\cdots &A_{mn}^{k}\end{bmatrix}}=\left\{A_{

    Read more →
  • Data lineage

    Data lineage

    Data lineage refers to the process of tracking how data is generated, transformed, transmitted and used across systems over time. It documents data's origins, transformations and movements, providing detailed visibility into its life cycle. This process simplifies the identification of errors in data analytics workflows, by enabling users to trace issues back to their root causes. Data lineage facilitates the ability to replay specific segments or inputs of the dataflow. This can be used in debugging or regenerating lost outputs. In database systems, this concept is closely related to data provenance, which involves maintaining records of inputs, entities, systems and processes that influence data. Data provenance provides a historical record of data origins and transformations. It supports forensic activities such as data-dependency analysis, error/compromise detection, recovery, auditing and compliance analysis: "Lineage is a simple type of why provenance." Data governance plays a critical role in managing metadata by establishing guidelines, strategies and policies. Enhancing data lineage with data quality measures and master data management adds business value. Although data lineage is typically represented through a graphical user interface (GUI), the methods for gathering and exposing metadata to this interface can vary. Based on the metadata collection approach, data lineage can be categorized into three types: Those involving software packages for structured data, programming languages and Big data systems. Data lineage information includes technical metadata about data transformations. Enriched data lineage may include additional elements such as data quality test results, reference data, data models, business terminology, data stewardship information, program management details and enterprise systems associated with data points and transformations. Data lineage visualization tools often include masking features that allow users to focus on information relevant to specific use cases. To unify representations across disparate systems, metadata normalization or standardization may be required. == Representation of data lineage == Representation broadly depends on the scope of the metadata management and reference point of interest. Data lineage provides sources of the data and intermediate data flow hops from the reference point with backward data lineage, leading to the final destination's data points and its intermediate data flows with forward data lineage. These views can be combined with end-to-end lineage for a reference point that provides a complete audit trail of that data point of interest from sources to their final destinations. As the data points or hops increase, the complexity of such representation becomes incomprehensible. Thus, the best feature of the data lineage view is the ability to simplify the view by temporarily masking unwanted peripheral data points. Tools with the masking feature enable scalability of the view and enhance analysis with the best user experience for both technical and business users. Data lineage also enables companies to trace sources of specific business data to track errors, implement changes in processes and implementing system migrations to save significant amounts of time and resources. Data lineage can improve efficiency in business intelligence BI processes. Data lineage can be represented visually to discover the data flow and movement from its source to destination via various changes and hops on its way in the enterprise environment. This includes how the data is transformed along the way, how the representation and parameters change and how the data splits or converges after each hop. A simple representation of the Data Lineage can be shown with dots and lines, where dots represent data containers for data points, and lines connecting them represent transformations the data undergoes between the data containers. Data lineage can be visualized at various levels based on the granularity of the view. At a very high-level, data lineage is visualized as systems that the data interacts with before it reaches its destination. At its most granular, visualizations at the data point level can provide the details of the data point and its historical behavior, attribute properties and trends and data quality of the data passed through that specific data point in the data lineage. The scope of the data lineage determines the volume of metadata required to represent its data lineage. Usually, data governance and data management of an organization determine the scope of the data lineage based on their regulations, enterprise data management strategy, data impact, reporting attributes and critical data elements of the organization. == Rationale == Distributed systems like Google Map Reduce, Microsoft Dryad, Apache Hadoop (an open-source project) and Google Pregel provide such platforms for businesses and users. However, even with these systems, Big Data analytics can take several hours, days or weeks to run, simply due to the data volumes involved. For example, a ratings prediction algorithm for the Netflix Prize challenge took nearly 20 hours to execute on 50 cores, and a large-scale image processing task to estimate geographic information took 3 days to complete using 400 cores. "The Large Synoptic Survey Telescope is expected to generate terabytes of data every night and eventually store more than 50 petabytes, while in the bioinformatics sector, the 12 largest genome sequencing houses in the world now store petabytes of data apiece. It is very difficult for a data scientist to trace an unknown or an unanticipated result. === Big data debugging === Big data analytics is the process of examining large data sets to uncover hidden patterns, unknown correlations, market trends, customer preferences and other useful business information. Machine learning, among other algorithms, is used to transform and analyze the data. Due to the large size of the data, there could be unknown features in the data. The massive scale and unstructured nature of data, the complexity of these analytics pipelines, and long runtimes pose significant manageability and debugging challenges. Even a single error in these analytics can be extremely difficult to identify and remove. While one may debug them by re-running the entire analytics through a debugger for stepwise debugging, this can be expensive due to the amount of time and resources needed. Auditing and data validation are other major problems due to the growing ease of access to relevant data sources for use in experiments, the sharing of data between scientific communities and use of third-party data in business enterprises. As such, more cost-efficient ways of analyzing data intensive scale-able computing (DISC) are crucial to their continued effective use. === Challenges in Big Data debugging === ==== Massive scale ==== According to an EMC/IDC study, 2.8 ZB of data were created and replicated in 2012. Furthermore, the same study states that the digital universe will double every two years between now and 2020, and that there will be approximately 5.2 TB of data for every person in 2020. Based on current technology, the storage of this much data will mean greater energy usage by data centers. ==== Unstructured data ==== Unstructured data usually refers to information that doesn't reside in a traditional row-column database. Unstructured data files often include text and multimedia content, such as e-mail messages, word processing documents, videos, photos, audio files, presentations, web pages and many other kinds of business documents. While these types of files may have an internal structure, they are still considered "unstructured" because the data they contain doesn't fit neatly into a database. The amount of unstructured data in enterprises is growing many times faster than structured databases are growing. Big data can include both structured and unstructured data, but IDC estimates that 90 percent of Big Data is unstructured data. The fundamental challenge of unstructured data sources is that they are difficult for non-technical business users and data analysts alike to unbox, understand and prepare for analytic use. Beyond issues of structure, the sheer volume of this type of data contributes to such difficulty. Because of this, current data mining techniques often leave out valuable information and make analyzing unstructured data laborious and expensive. In today's competitive business environment, companies have to find and analyze the relevant data they need quickly. The challenge is going through the volumes of data and accessing the level of detail needed, all at a high speed. The challenge only grows as the degree of granularity increases. One possible solution is hardware. Some vendors are using increased memory and parallel processing to crunch large volumes of data quickly. Another method is putting data in-memory but using a grid

    Read more →
  • Format-transforming encryption

    Format-transforming encryption

    In cryptography, format-transforming encryption (FTE) refers to encryption where the format of the input plaintext and output ciphertext are configurable. Descriptions of formats can vary, but are typically compact set descriptors, such as a regular expression. Format-transforming encryption is closely related to, and a generalization of, format-preserving encryption. == Applications of FTE == === Restricted fields or formats === Similar to format-preserving encryption, FTE can be used to control the format of ciphertexts. The canonical example is a credit card number, such as 1234567812345670 (16 bytes long, digits only). However, FTE does not enforce that the input format must be the same as the output format. === Censorship circumvention === FTE is used by the Tor Project to circumvent deep packet inspection by pretending to be some other protocols. The implementation is fteproxy; it was written by the authors who came up with the FTE concept.

    Read more →
  • Media engagement framework

    Media engagement framework

    The media engagement framework is a planning framework used by marketing professionals to understand the behavior of social media marketing-based audiences. The construct was introduced in the book, ROI of Social Media. Powell’s background in marketing ROI and Groves' experience and understanding of the applications of social media in business led to a collaboration. Dimos joined as a brand strategist for Litmus Group, a global management consulting firm. The media engagement framework consists of the definitions of personas (Individuals, Consumers and Influencers), referenced by the competitive set or constraint that applies to that persona and the measurement framework that might be applied to those personas. It is referenced at the center of the marketing process diagram, surrounded by the marketing functions of strategy, tactics, metrics and ROI. The marketing process diagram describes how the media engagement framework can apply to any strategic marketing activity but was developed to establish a completely integrated framework describing how both traditional and social media marketing activities can be planned, executed, measured and improved. == Application == The media engagement framework provides a strategic planning construct in which measurements and metrics play a crucial role. Applying the media engagement framework aids in the development and management of an effective online marketing presence leveraging social media to engage a market or audience. By first personifying the audience, the marketer is able to identify the limiting aspect of the engagements possible with that audience segment and then, understand the type of engagement metrics to apply. Each persona makes decisions differently about how he/she acts in the social media universe. A framework metric can be applied for each of these personas: Endorsement funnel for influencers Community engagement funnel for individuals Purchase funnel for consumers Individuals, influencers and consumers make decisions based on alternatives available to them and constraints put on them. To engage with an individual brands must realize they are competing against the time an individual spends on line. If they find something else more engaging, they will engage with that activity. Brands compete against other brands for the purchases of consumers acting in the category. Lastly, influencers have only so many endorsements they can make and therefore brands compete with other endorsers for the endorsement of an influencer. Creating engaging content by keeping target audience in mind like create content that audience find it funny, interesting, and relatable will encourage audience to share it on social networks. Which will be beneficial for you brand, getting more people to know about your business and brand. Contact Digilord to create engaging content for your brand. Use of listening tools (Google Alerts, Twitter Search, SocialMention.com, Veooz.com, Alterian SM2, Radian6, Sysomos, Buzzient etc.) can be employed within the model to help identify the members of the audience segment and to support the formation of other social engagement planning and management tools.

    Read more →
  • CLAWS (linguistics)

    CLAWS (linguistics)

    The Constituent Likelihood Automatic Word-tagging System (CLAWS) is a program that performs part-of-speech tagging. It was developed in the 1980s at Lancaster University by the University Centre for Computer Corpus Research on Language. It has an overall accuracy rate of 96–97% with the latest version (CLAWS4) tagging around 100 million words of the British National Corpus. == History == A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns parts of speech to each word (and other token), such as noun, verb, adjective, etc., although generally computational applications use more fine-grained POS tags like 'noun-plural'. Developed in the early 1980s, CLAWS was built to fill the ever-growing gap created by always-changing POS necessities. Originally created to add part-of-speech tags to the LOB corpus of British English, the CLAWS tagset has since been adapted to other languages as well, including Urdu and Arabic. Since its inception, CLAWS has been hailed for its functionality and adaptability. Still, it is not without flaws, and though it boasts an error-rate of only 1.5% when judged in major categories, CLAWS still remains with c.3.3% ambiguities unresolved. Ambiguity arises in cases such as with the word flies, and whether it should be classified as a noun or a verb. It's these ambiguities that will require the various upgrades and tagsets that CLAWS will endure. == Rules and processing == CLAWS uses a Hidden Markov model to determine the likelihood of sequences of words in anticipating each part-of-speech label. === Sample output === This excerpt from Bram Stoker's Dracula (1897) has been tagged using both the CLAWS C5 and C7 tagsets. This is what a CLAWS output will generally look like, with the most likely part-of-speech tag following each word. == Tagsets == === CLAWS1 tagset === The first tagset developed in CLAWS, CLAWS1 tagset, has 132 word tags. In terms of form and application, C1 tagset is similar to Brown Corpus tags. See Table of tags in C1 tagset here. === CLAWS2 tagset === From 1983 to 1986, updated versions leading to CLAWS2 were part of a larger attempt to deal with aspects such as recognizing sentence breaks, in order to avoid the need for manual pre-processing of a text before the tags were applied, moving instead to optional manual post-editing to adjust the output of the automatic annotation, if needed. The CLAWS2 tagset has 166 word tags. See Table of tags in C2 tagset here. === CLAWS4 tagset === The CLAWS4 was used for the 100-million-word British National Corpus (BNC). A general-purpose grammatical tagger, it is a successor of the CLAWS1 tagger. In tagging the BNC, the many rounds of work that went into CLAWS4 focused on making the CLAWS program independent from the tagsets. For example, the BNC project used two tagset versions: "a main tagset (C5) with 62 tags with which the whole of the corpus has been tagged, and a larger (C7) tagset with 152 tags, which has been used to make a selected 'core' sample corpus of two million words." The latest version of CLAWS4 is offered by UCREL, a research center of Lancaster University. === CLAWS5 tagset === The CLAWS5 tagset, which was used for BNC, has over 60 tags. See Table of tags in C5 tagset here. === CLAWS6 tagset === The CLAWS6 tagset was used for the BNC sampler corpus and the COLT corpus. It has over 160 tags, including 13 determiner subtypes. See Table of tags in C6 tagset here. === CLAWS7 tagset === The standard CLAWS7 tagset is used currently. It is only different in the punctuation tags when compared to the CLAWS6 tagset. See Table of tags in C7 tagset here. === CLAWS8 tagset === CLAWS8 tagset was extended from C7 tagset with further distinctions in the determiner and pronoun categories, as well as 37 new auxiliary tags for forms of be, do, and have. See Table of tags in C8 tagset here

    Read more →
  • Death of Molly Russell

    Death of Molly Russell

    In November 2017, Molly Russell, a fourteen-year-old British schoolgirl from Harrow, London, was found dead in her bedroom by her parents. In an inquest, the coroner stated that she had died from an act of self-harm following depression and the results of social media consumption, including material on Instagram and Pinterest. She also had a Twitter account in which she documented her growing depression. == Life == Russell had been a pupil at Hatch End High School. At the inquest, the school's head teacher expressed shock that she was able to access distressing online content. Her parents stated that she had never shown any previous signs of struggle and was doing very well in school. It was revealed at the inquest that in the six months prior to her death, 2,100 of 16,300 pieces of content she had interacted with on Instagram were on topics such as self-harm, depression, and suicide. It was also noted that throughout her experience on social media, there were never any warning signs about the information she viewed on these platforms. == Subsequent events == Dr. Navin Venugopal, the child psychiatrist assigned to the case investigating her death, called the material she viewed "disturbing and distressing" and said he was unable to sleep well for weeks after viewing it. The coroner Andrew Walker concluded that Molly's death was "an act of self harm suffering from depression and the negative effects of online content". He issued a prevention of future deaths report regarding her death, in which he made a number of recommendations for operators of online platforms, including: separating platforms for adults and children age verification changes in policy on filtering of age-specific content adding features for parental supervision and control data retention of material viewed by children He suggested that this could be accomplished by either legislation or self-regulation. The lawyer representing her family at the inquest stated that the findings "captured all of the elements of why this material is so harmful." The case has been cited as a motivator for the passage of the Online Safety Act. A charity, the Molly Rose Foundation, was set up in her memory, with the goal of suicide prevention for young people. Meta and Pinterest are believed to have made substantial donations to the charity.

    Read more →
  • Data refuge

    Data refuge

    Data Refuge is a public and collaborative project designed to address concerns about federal climate and environmental data that is in danger of being lost. In particular, the initiative addresses five main concerns: What are the best ways to safeguard data? How do federal agencies play a crucial role in collecting, managing, and distributing data? How do government priorities impact data's accessibility? Which projects and research fields depend on federal data? Which data sets are of value to research and local communities, and why? Data Refuge began as a grassroots organization in opposition to government data on climate change and the environment not being archived systemically. Data Refuge's main goal is to collect and allocate data in multiple safe locations to create a sustainable way of archiving old and new data. Data Refuge was initiated in 2016 to protect federal climate and environmental data that is vulnerable under an administration that denies climate change. The system aims to make public research-quality copies of federal climate and environmental data. Data Refuge is supported by the National Geographic Foundation, private donors, Libraries+ Network, Preserving Electronic Governance Initiative (PEGI), the Union of Concerned Scientists (USC), and the Penn Program in Environmental Humanities (PPEH). == Types of data == Data Refuge collects public federal data on the climate and environment in the form of satellite imagery, PDFs, and stories. The data are stored in multiple trusted locations as they are less vulnerable if in only one location, and to ensure accessibility for researchers. Through the Data Rescue events, Data Refuge has accumulated 4 terabytes of data, 30,000 URLs, and 800 participants. === Storytelling === Data Refuge collects stories on vulnerable federal climate and environmental data through: surveys, oral history, photo essays, maps, video shorts, and animations. The stories are archived in a public bank that showcase how federal environmental data support health and safety in communities. Data Stories are collected at Data Rescue events, which are partnered with universities, city and town halls, and advocacy groups. Data stories are collected and used to emphasize the importance of Data Refuge, in how the data on climate change and the environment are being used by people in the United States and across the world for meaningful practices.

    Read more →
  • Sumazi

    Sumazi

    Sumazi is a social media and social intelligence platform for enterprises, brands, and celebrities. Its technology performs social data analysis across social networking services including Facebook, Twitter and LinkedIn, to identify key people in his/her network who are experts, influencers or are located in a specific area for marketing, advertising or sales campaigns. The technology company was founded in 2011 by former Sun Microsystems employee Sumaya Kazi. The company was headquartered in San Francisco, California. The company was out of business by 2017. == Reception == Sumazi was one of 25 startups selected out of more than 1,200 to compete at TechCrunch Disrupt Startup Battlefield, where it won the Omidyar Network award for the startup "Most Likely to Change the World." Sumazi, which was based out of San Francisco, California, had been profiled in The New York Times as well as USA Today, which commented the advantages of the startup's location in the Silicon Valley. American Express OPEN Forum also featured Sumazi as a "Startup of the Week". Sumazi has additionally been mentioned in articles by Mashable, The Wall Street Journal, Current Editorials, Harvard Business Review, Smashing Magazine, and TechCrunch.

    Read more →
  • Google Research

    Google Research

    Google Research (also known as Research at Google) is the research division of Google, a subsidiary of Alphabet Inc.. According to its official website, Google Research publishes findings, releases open-source software, and applies research results within Google products and services as well as within the wider scientific community. == Notable contributions == The 2017 landmark paper Attention Is All You Need, which introduced the Transformer architecture, which has subsequently been used to build modern large language models. Advances in neural machine translation powering Google Translate. Time series forecasting. Development of scalable learning systems and infrastructure for large-model training. Flood forecasting. Research into computational discovery via Google Accelerated Science including demonstrating the first below-threshold quantum calculations.

    Read more →
  • Professional network service

    Professional network service

    A professional network service (or, in an Internet context, simply a professional network) is a type of social network service that focuses on interactions and relationships for business opportunities and career growth, with less emphasis on activities in personal life. A professional network service is used by working individuals, job-seekers, and businesses to establish and maintain professional contacts, to find work or hire employees, share professional achievements, sell or promote services, and stay up-to-date with industry news and trends. According to LinkedIn managing director Clifford Rosenberg in an interview with AAP in 2010, "[t]his is a call to action for professionals to re-address their use of social networks and begin to reap as many rewards from networking professionally as they do personally." Businesses mostly depend on resources and information outside the company and to get what they need, they need to reach out and professionally network with others, such as employees or clients as well as potential opportunities. "Nardi, Whittaker, and Schwarz (2002) point out three main tasks that they believe networkers need to attend to keep a successful professional (intentional) network: building a network, maintaining the network, and activating selected contacts. They stress that networkers need to continue to add new contacts to their network to access as many resources as possible and to maintain their network by staying in touch with their contacts. This is so that the contacts are easy to activate when the networker has work that needs to be done." By using a professional network service, businesses can keep all of their networks up-to-date, and in order, and helps figure out the best way to efficiently get in touch with each of them. A service that can do all that helps relieve some of the stress when trying to get things done. Not all professional network services are online sites that help promote a business. Some services connect the user to other services that help promote the business other than online sites, such as phone/Internet companies that provide services and companies that specifically are designed to do all of the promoting, online and in person, for a business. == History == In 1997, professional network services started up throughout the world and continue to grow. The first recognizable site to combine all features, such as creating profiles, adding friends, and searching for friends, was SixDegrees.com. According to Boyd and Ellison's article, "Social Network Sites: Definition, History, and Scholarship", from 1997 to 2001, several community tools began supporting various combinations of profiles and publicly articulated Friends. Boyd and Ellison go on to say that the next wave began with Ryze.com in 2001. It was introduced as a new way "to help people leverage their business networks". == Inside the works == Quite a lot of work is put into a professional network service, such as the number of hours that go into them and the type of people they work for, as well as the business model of it all, such as the professional interaction and the multiple services they deal with. === Types of services === Some professional network services not only help promote the business but can also help in connecting to other people. Those services may include a specific phone and/or Internet company or a company that helps to connect with other businesses. According to the Society for New Communications Research (SNCR), there are at least nine online professional networks that are being used. === Professional interaction === Kaplan and Haenlein elaborate on five key considerations for companies when utilizing media. These include the importance of careful selection, the option to choose existing applications or develop custom ones, ensuring alignment with organizational activities, integrating a comprehensive media plan, and providing accessibility to all stakeholders. ==== Choose carefully ==== "Choosing the right medium for any given purpose depends on the target group to be reached and the message to be communicated. On one hand, each Social Media application usually attracts a certain group of people, and firms should be active wherever their customers are present. On the other hand, there may be situations whereby certain features are necessary to ensure effective communication, and these features are only offered by one specific application." ==== Ensure activity alignment ==== "Sometimes you may decide to rely on various Social Media, or a set of different applications within the same group, to have the largest possible reach." "Using different contact channels can be a worthwhile and profitable strategy." According to the Society for New Communications Research at Harvard University, "the average professional belongs to 3–5 online networks for business use, and LinkedIn, Facebook, and Twitter are among the top used." ==== Integrate a media plan ==== Social media and traditional media are "both part of the same: your corporate image" in the customers' eyes. ==== Allow access to all ==== "...once the firm has decided to utilize Social Media applications, it is worth checking that all employees may access them." According to the SNCR, "the convergence of Internet, mobile, and social media has taken significant shape as professionals rely on anywhere access to information, relationships, and networks." ==== Online usage ==== "Half of the respondents report participating in 3 to 5 online professional networks. Another three in ten participate in 6 or more professional networks." "Popular social networks are now being used frequently as Professional Communities. More than nine in ten respondents indicated that they use LinkedIn and half reported using Facebook. Twitter and blogs were frequently listed as 'professional networks'." === Business model === According to Michael Rappa's article, Business models on the Web", "a business model is the method of doing business by which a company can sustain itself – that is, generate revenue. The business model spells out how a company makes money by specifying where it is positioned in the value chain." Rappa mentions that there are at least nine basic categories from which a business model can be separated. Those categories are a brokerage, advertising, infomediary, merchant, manufacturer, affiliate, community, subscription, and utility. "...a firm may combine several different models as part of its overall Internet business strategy." At first, Flickr started as a way to mainstream public relations. == Social impact == When it comes to the social impact that professional network services have on today's society, it has proved to increase activity. According to the SNCR, "[t]hree quarters of respondents rely on professional networks to support business decisions. Reliance has increased for essentially all respondents over the past three years. Younger (20–35) and older professionals (55+) are more active users of social tools than middle-aged professionals. More people are collaborating outside their company wall than within their organizational intranet." == Limitations == Since the internet and social media are a part of this "world where consumers can speak so freely with each other and businesses have increasingly less control over the information available about them in cyberspace", most firms and businesses are uncomfortable with all the freedom. According to Kaplan and Haenlein's article, "Users of the world, unite! The challenges and opportunities of Social Media", businesses are pushed aside and are only able to sit back and watch as their customers publicly post comments, which may or may not be well-written.

    Read more →
  • IWARP

    IWARP

    iWARP is a computer networking protocol that implements remote direct memory access (RDMA) for efficient data transfer over Internet Protocol networks. Contrary to some accounts, iWARP is not an acronym. Because iWARP is layered on Internet Engineering Task Force (IETF)-standard congestion-aware protocols such as Transmission Control Protocol (TCP) and Stream Control Transmission Protocol (SCTP), it makes few requirements on the network, and can be successfully deployed in a broad range of environments. == History == In 2007, the IETF published five Request for Comments (RFCs) that define iWARP: RFC 5040 A Remote Direct Memory Access Protocol Specification is layered over Direct Data Placement Protocol (DDP). It defines how RDMA Send, Read, and Write operations are encoded using DDP into headers on the network. RFC 5041 Direct Data Placement over Reliable Transports is layered over MPA/TCP or SCTP. It defines how received data can be directly placed into an upper layer protocols receive buffer without intermediate buffers. RFC 5042 Direct Data Placement Protocol (DDP) / Remote Direct Memory Access Protocol (RDMAP) Security analyzes security issues related to iWARP DDP and RDMAP protocol layers. RFC 5043 Stream Control Transmission Protocol (SCTP) Direct Data Placement (DDP) Adaptation defines an adaptation layer that enables DDP over SCTP. RFC 5044 Marker PDU Aligned Framing for TCP Specification defines an adaptation layer that enables preservation of DDP-level protocol record boundaries layered over the TCP reliable connected byte stream. These RFCs are based on the RDMA Consortium's specifications for RDMA over TCP. The RDMA Consortium's specifications are influenced by earlier RDMA standards, including Virtual Interface Architecture (VIA) and InfiniBand (IB). Since 2007, the IETF has published three additional RFCs that maintain and extend iWARP: RFC 6580 IANA Registries for the Remote Direct Data Placement (RDDP) Protocols published in 2012 defines IANA registries for Remote Direct Data Placement (RDDP) error codes, operation codes, and function codes. RFC 6581 Enhanced Remote Direct Memory Access (RDMA) Connection Establishment published in 2011 fixes shortcomings with iWARP connection setup. RFC 7306 Remote Direct Memory Access (RDMA) Protocol Extensions published in 2014 extends RFC 5040 with atomic operations and RDMA Write with Immediate Data. == Protocol == The main component in the iWARP protocol is the Direct Data Placement Protocol (DDP), which permits the actual zero-copy transmission. DDP itself does not perform the transmission; the underlying protocol (TCP or SCTP) does. However, TCP does not respect message boundaries; it sends data as a sequence of bytes without regard to protocol data units (PDU). In this regard, DDP itself may be better suited for SCTP, and indeed the IETF proposed a standard RDMA over SCTP. To run DDP over TCP requires a tweak known as marker PDU aligned (MPA) framing to guarantee boundaries of messages. Furthermore, DDP is not intended to be accessed directly. Instead, a separate RDMA protocol (RDMAP) provides the services to read and write data. Therefore, the entire RDMA over TCP specification is really RDMAP over DDP over either MPA/TCP or SCTP. All of these protocols can be implemented in hardware. Unlike IB, iWARP only has reliable connected communication, as this is the only service that TCP and SCTP provide. The iWARP specification omits other features of IB, such as Send with Immediate Data operations. With RFC 7306, the IETF is working to reduce these omissions. == Implementation == Because a kernel implementation of the TCP stack can be seen as a bottleneck, the protocol is typically implemented in hardware RDMA network interface controllers (rNICs). As simple data losses are rare in tightly coupled network environments, the error-correction mechanisms of TCP may be performed by software while the more frequently performed communications are handled strictly by logic embedded on the rNIC. Similarly, connections are often established entirely by software and then handed off to the hardware. Furthermore, the handling of iWARP specific protocol details is typically isolated from the TCP implementation, allowing rNICs to be used for both as RDMA offload and TCP offload (in support of traditional sockets based TCP/IP applications). The portion of the hardware implementation used for implementing the TCP protocol is known as the TCP Offload Engine (TOE). TOE itself does not prevent copying on the reception side, and must be combined with RDMA hardware for zero-copy results. The RDMA / TCP specification is a set of different wire protocols intended to be implemented in hardware (though it seems feasible to emulate it in software for compatibility but without the performance benefits). == Interfaces == iWARP is a protocol, not an implementation, but defines protocol behavior in terms of the operations that are legal for the protocol, known as Verbs. As such, iWARP does not have any single standard programming interface. However, programming interfaces tend to very closely correspond to the Verbs. Several programmatic interfaces have been proposed, including OpenFabrics Verbs, Network Direct, uDAPL, kDAPL, IT-API, and RNICPI. Implementations of some of these interfaces are available for different platforms, including Windows and Linux. == Services available == Networking services implemented over iWARP include those offered in the OpenFabrics Enterprise Distribution (OFED) by the OpenFabrics Alliance for Linux operating systems, and by Microsoft Windows via Network Direct. NVMe over Fabrics (NVMEoF) iSCSI Extensions for RDMA (iSER) Server Message Block Direct (SMB Direct) Sockets Direct Protocol (SDP) SCSI RDMA Protocol (SRP) Network File System over RDMA (NFS over RDMA) GPUDirect

    Read more →