Inductive logic programming

Inductive logic programming

Inductive logic programming (ILP) is a subfield of symbolic artificial intelligence which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. The term "inductive" here refers to philosophical (i.e. suggesting a theory to explain observed facts) rather than mathematical (i.e. proving a property for all members of a well-ordered set) induction. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesised logic program which entails all the positive and none of the negative examples. Schema: positive examples + negative examples + background knowledge ⇒ hypothesis. Bioinformatics and drug design have been highlighted as a principal application area of inductive logic programming techniques. == History == Building on earlier work on Inductive inference, Gordon Plotkin was the first to formalise induction in a clausal setting around 1970, adopting an approach of generalising from examples. In 1981, Ehud Shapiro introduced several ideas that would shape the field in his new approach of model inference, an algorithm employing refinement and backtracing to search for a complete axiomatisation of given examples. His first implementation was the Model Inference System in 1981: a Prolog program that inductively inferred Horn clause logic programs from positive and negative examples. The term Inductive Logic Programming was first introduced in a paper by Stephen Muggleton in 1990, defined as the intersection of machine learning and logic programming. Muggleton and Wray Buntine introduced predicate invention and inverse resolution in 1988. Several inductive logic programming systems that proved influential appeared in the early 1990s. FOIL, introduced by Ross Quinlan in 1990 was based on upgrading propositional learning algorithms AQ and ID3. Golem, introduced by Muggleton and Feng in 1990, went back to a restricted form of Plotkin's least generalisation algorithm. The Progol system, introduced by Muggleton in 1995, first implemented inverse entailment, and inspired many later systems. Aleph, a descendant of Progol introduced by Ashwin Srinivasan in 2001, is still one of the most widely used systems as of 2022. At around the same time, the first practical applications emerged, particularly in bioinformatics, where by 2000 inductive logic programming had been successfully applied to drug design, carcinogenicity and mutagenicity prediction, and elucidation of the structure and function of proteins. Unlike the focus on automatic programming inherent in the early work, these fields used inductive logic programming techniques from a viewpoint of relational data mining. The success of those initial applications and the lack of progress in recovering larger traditional logic programs shaped the focus of the field. Recently, classical tasks from automated programming have moved back into focus, as the introduction of meta-interpretative learning makes predicate invention and learning recursive programs more feasible. This technique was pioneered with the Metagol system introduced by Muggleton, Dianhuan Lin, Niels Pahlavi and Alireza Tamaddoni-Nezhad in 2014. This allows ILP systems to work with fewer examples, and brought successes in learning string transformation programs, answer set grammars and general algorithms. == Setting == Inductive logic programming has adopted several different learning settings, the most common of which are learning from entailment and learning from interpretations. In both cases, the input is provided in the form of background knowledge B, a logical theory (commonly in the form of clauses used in logic programming), as well as positive and negative examples, denoted E + {\textstyle E^{+}} and E − {\textstyle E^{-}} respectively. The output is given as a hypothesis H, itself a logical theory that typically consists of one or more clauses. The two settings differ in the format of examples presented. === Learning from entailment === As of 2022, learning from entailment is by far the most popular setting for inductive logic programming. In this setting, the positive and negative examples are given as finite sets E + {\textstyle E^{+}} and E − {\textstyle E^{-}} of positive and negated ground literals, respectively. A correct hypothesis H is a set of clauses satisfying the following requirements, where the turnstile symbol ⊨ {\displaystyle \models } stands for logical entailment: Completeness: B ∪ H ⊨ E + Consistency: B ∪ H ∪ E − ⊭ false {\displaystyle {\begin{array}{llll}{\text{Completeness:}}&B\cup H&\models &E^{+}\\{\text{Consistency: }}&B\cup H\cup E^{-}&\not \models &{\textit {false}}\end{array}}} Completeness requires any generated hypothesis H to explain all positive examples E + {\textstyle E^{+}} , and consistency forbids generation of any hypothesis H that is inconsistent with the negative examples E − {\textstyle E^{-}} , both given the background knowledge B. In Muggleton's setting of concept learning, "completeness" is referred to as "sufficiency", and "consistency" as "strong consistency". Two further conditions are added: "Necessity", which postulates that B does not entail E + {\textstyle E^{+}} , does not impose a restriction on H, but forbids any generation of a hypothesis as long as the positive facts are explainable without it. "Weak consistency", which states that no contradiction can be derived from B ∧ H {\textstyle B\land H} , forbids generation of any hypothesis H that contradicts the background knowledge B. Weak consistency is implied by strong consistency; if no negative examples are given, both requirements coincide. Weak consistency is particularly important in the case of noisy data, where completeness and strong consistency cannot be guaranteed. === Learning from interpretations === In learning from interpretations, the positive and negative examples are given as a set of complete or partial Herbrand structures, each of which are themselves a finite set of ground literals. Such a structure e is said to be a model of the set of clauses B ∪ H {\textstyle B\cup H} if for any substitution θ {\textstyle \theta } and any clause h e a d ← b o d y {\textstyle \mathrm {head} \leftarrow \mathrm {body} } in B ∪ H {\textstyle B\cup H} such that b o d y θ ⊆ e {\textstyle \mathrm {body} \theta \subseteq e} , h e a d θ ⊆ e {\displaystyle \mathrm {head} \theta \subseteq e} also holds. The goal is then to output a hypothesis that is complete, meaning every positive example is a model of B ∪ H {\textstyle B\cup H} , and consistent, meaning that no negative example is a model of B ∪ H {\textstyle B\cup H} . == Approaches to ILP == An inductive logic programming system is a program that takes as an input logic theories B , E + , E − {\displaystyle B,E^{+},E^{-}} and outputs a correct hypothesis H with respect to theories B , E + , E − {\displaystyle B,E^{+},E^{-}} . A system is complete if and only if for any input logic theories B , E + , E − {\displaystyle B,E^{+},E^{-}} any correct hypothesis H with respect to these input theories can be found with its hypothesis search procedure. Inductive logic programming systems can be roughly divided into two classes, search-based and meta-interpretative systems. Search-based systems exploit that the space of possible clauses forms a complete lattice under the subsumption relation, where one clause C 1 {\textstyle C_{1}} subsumes another clause C 2 {\textstyle C_{2}} if there is a substitution θ {\textstyle \theta } such that C 1 θ {\textstyle C_{1}\theta } , the result of applying θ {\textstyle \theta } to C 1 {\textstyle C_{1}} , is a subset of C 2 {\textstyle C_{2}} . This lattice can be traversed either bottom-up or top-down. === Bottom-up search === Bottom-up methods to search the subsumption lattice have been investigated since Plotkin's first work on formalising induction in clausal logic in 1970. Techniques used include least general generalisation, based on anti-unification, and inverse resolution, based on inverting the resolution inference rule. ==== Least general generalisation ==== A least general generalisation algorithm takes as input two clauses C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} and outputs the least general generalisation of C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , that is, a clause C {\textstyle C} that subsumes C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , and that is subsumed by every other clause that subsumes C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} . The least general generalisation can be computed by first computing all selections from C 1 {\textstyle C_{1}} and C 2 {\textstyle C_{2}} , which are pairs of literals ( L , M ) ∈ ( C 1 × C 2 ) {\displaystyle (L,M)\in (C_{1}\times C_{2})} sharing the same predicate symbol and negated/unnegated status. Then, the least general generalisation is obtained as the disjunction of the least general generalisations of the indi

Lazy learning

(Not to be confused with the lazy learning regime, see Neural tangent kernel). In machine learning, lazy learning is a learning method in which generalization of the training data is, in theory, delayed until a query is made to the system, as opposed to eager learning, where the system tries to generalize the training data before receiving queries. The primary motivation for employing lazy learning, as in the K-nearest neighbors algorithm, used by online recommendation systems ("people who viewed/purchased/listened to this movie/item/tune also ...") is that the data set is continuously updated with new entries (e.g., new items for sale at Amazon, new movies to view at Netflix, new clips at YouTube, new music at Spotify or Pandora). Because of the continuous update, the "training data" would be rendered obsolete in a relatively short time especially in areas like books and movies, where new best-sellers or hit movies/music are published/released continuously. Therefore, one cannot really talk of a "training phase". Lazy classifiers are most useful for large, continuously changing datasets with few attributes that are commonly queried. Specifically, even if a large set of attributes exist - for example, books have a year of publication, author/s, publisher, title, edition, ISBN, selling price, etc. - recommendation queries rely on far fewer attributes - e.g., purchase or viewing co-occurrence data, and user ratings of items purchased/viewed. == Advantages == The main advantage gained in employing a lazy learning method is that the target function will be approximated locally, such as in the k-nearest neighbor algorithm. Because the target function is approximated locally for each query to the system, lazy learning systems can simultaneously solve multiple problems and deal successfully with changes in the problem domain. At the same time they can reuse a lot of theoretical and applied results from linear regression modelling (notably PRESS statistic) and control. It is said that the advantage of this system is achieved if the predictions using a single training set are only developed for few objects. This can be demonstrated in the case of the k-NN technique, which is instance-based and function is only estimated locally. == Disadvantages == Theoretical disadvantages with lazy learning include: The large space requirement to store the entire training dataset. In practice, this is not an issue because of advances in hardware and the relatively small number of attributes (e.g., as co-occurrence frequency) that need to be stored. Particularly noisy training data increases the case base unnecessarily, because no abstraction is made during the training phase. In practice, as stated earlier, lazy learning is applied to situations where any learning performed in advance soon becomes obsolete because of changes in the data. Also, for the problems for which lazy learning is optimal, "noisy" data does not really occur - the purchaser of a book has either bought another book or hasn't. Lazy learning methods are usually slower to evaluate. In practice, for very large databases with high concurrency loads, the queries are not postponed until actual query time, but recomputed in advance on a periodic basis - e.g., nightly, in anticipation of future queries, and the answers stored. This way, the next time new queries are asked about existing entries in the database, the answers are merely looked up rapidly instead of having to be computed on the fly, which would almost certainly bring a high-concurrency multi-user system to its knees. Larger training data also entail increased cost. Particularly, there is the fixed amount of computational cost, where a processor can only process a limited amount of training data points. There are standard techniques to improve re-computation efficiency so that a particular answer is not recomputed unless the data that impact this answer has changed (e.g., new items, new purchases, new views). In other words, the stored answers are updated incrementally. This approach, used by large e-commerce or media sites, has long been used in the Entrez portal of the National Center for Biotechnology Information (NCBI) to precompute similarities between the different items in its large datasets: biological sequences, 3-D protein structures, published-article abstracts, etc. Because "find similar" queries are asked so frequently, the NCBI uses highly parallel hardware to perform nightly recomputation. The recomputation is performed only for new entries in the datasets against each other and against existing entries: the similarity between two existing entries need not be recomputed. == Examples of Lazy Learning Methods == K-nearest neighbors, which is a special case of instance-based learning. Local regression. Lazy naive Bayes rules, which are extensively used in commercial spam detection software. Here, the spammers keep getting smarter and revising their spamming strategies, and therefore the learning rules must also be continually updated.

Database

In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database. Before digital storage and retrieval of data became widespread, index cards were used for data storage in a wide range of applications and environments: in the home to record and store recipes, shopping lists, contact information and other organizational data; in business to record presentation notes, project research and notes, and contact information; in schools as flash cards or other visual aids; and in academic research to hold data such as bibliographical citations or notes in a card file. Professional book indexers used index cards in the creation of book indexes until they were replaced by indexing software in the 1980s and 1990s. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spans formal techniques and practical considerations, including data modeling, efficient data representation and storage, query languages, security and privacy of sensitive data, and distributed computing issues, including supporting concurrent access and fault tolerance. Computer scientists may classify database management systems according to the database models that they support. Relational databases became dominant in the 1980s. These model data as rows and columns in a series of tables, and the vast majority use SQL for writing and querying data. In the 2000s, non-relational databases became popular, collectively referred to as NoSQL, because they use different query languages. == Terminology and overview == Formally, a "database" refers to a set of related data accessed through the use of a "database management system" (DBMS), which is an integrated set of computer software that allows users to interact with one or more databases and provides access to all of the data contained in the database (although restrictions may exist that limit access to particular data). The DBMS provides various functions that allow entry, storage and retrieval of large quantities of information and provides ways to manage how that information is organized. Because of the close relationship between them, the term "database" is often used casually to refer to both a database and the DBMS used to manipulate it. Outside the world of professional information technology, the term database is often used to refer to any collection of related data (such as a spreadsheet or a card index) as size and usage requirements typically necessitate use of a database management system. Existing DBMSs provide various functions that allow management of a database and its data which can be classified into four main functional groups: Data definition – Creation, modification and removal of definitions that detail how the data is to be organized. Update – Insertion, modification, and deletion of the data itself. Retrieval – Selecting data according to specified criteria (e.g., a query, a position in a hierarchy, or a position in relation to other data) and providing that data either directly to the user, or making it available for further processing by the database itself or by other applications. The retrieved data may be made available in a more or less direct form without modification, as it is stored in the database, or in a new form obtained by altering it or combining it with existing data from the database. Administration – Registering and monitoring users, enforcing data security, monitoring performance, maintaining data integrity, dealing with concurrency control, and recovering information that has been corrupted by some event such as an unexpected system failure. Both a database and its DBMS conform to the principles of a particular database model. "Database system" refers collectively to the database model, database management system, and database. Physically, database servers are dedicated computers that hold the actual databases and run only the DBMS and related software. Database servers are usually multiprocessor computers, with generous memory and RAID disk arrays used for stable storage. Hardware database accelerators, connected to one or more servers via a high-speed channel, are also used in large-volume transaction processing environments. DBMSs are found at the heart of most database applications. DBMSs may be built around a custom multitasking kernel with built-in networking support, but modern DBMSs typically rely on a standard operating system to provide these functions. Since DBMSs comprise a significant market, computer and storage vendors often take into account DBMS requirements in their own development plans. Databases and DBMSs can be categorized according to the database model(s) that they support (such as relational or XML), the type(s) of computer they run on (from a server cluster to a mobile phone), the query language(s) used to access the database (such as SQL or XQuery), and their internal engineering, which affects performance, scalability, resilience, and security. == History == The sizes, capabilities, and performance of databases and their respective DBMSs have grown in orders of magnitude. These performance increases were enabled by the technology progress in the areas of processors, computer memory, computer storage, and computer networks. The concept of a database was made possible by the emergence of direct access storage media such as magnetic disks, which became widely available in the mid-1960s; earlier systems relied on sequential storage of data on magnetic tape. The subsequent development of database technology can be divided into three eras based on data model or structure: navigational, SQL/relational, and post-relational. The two main early navigational data models were the hierarchical model and the CODASYL model (network model). These were characterized by the use of pointers (often physical disk addresses) to follow relationships from one record to another. The relational model, first proposed in 1970 by Edgar F. Codd, departed from this tradition by insisting that applications should search for data by content, rather than by following links. The relational model employs sets of ledger-style tables, each used for a different type of entity. Only in the mid-1980s did computing hardware become powerful enough to allow the wide deployment of relational systems (DBMSs plus applications). By the early 1990s, however, relational systems dominated in all large-scale data processing applications, and as of 2018 they remain dominant: IBM Db2, Oracle, MySQL, and Microsoft SQL Server are the most searched DBMS. The dominant database language, standardized SQL for the relational model, has influenced database languages for other data models. Object databases were developed in the 1980s to overcome the inconvenience of object–relational impedance mismatch, which led to the coining of the term "post-relational" and also the development of hybrid object–relational databases. The next generation of post-relational databases in the late 2000s became known as NoSQL databases, introducing fast key–value stores and document-oriented databases. A competing "next generation" known as NewSQL databases attempted new implementations that retained the relational/SQL model while aiming to match the high performance of NoSQL compared to commercially available relational DBMSs. === 1960s, navigational DBMS === The introduction of the term database coincided with the availability of direct-access storage (disks and drums) from the mid-1960s onwards. The term represented a contrast with the tape-based systems of the past, allowing shared interactive use rather than daily batch processing. The Oxford English Dictionary cites a 1962 report by the System Development Corporation of California as the first to use the term "data-base" in a specific technical sense. As computers grew in speed and capability, a number of general-purpose database systems emerged; by the mid-1960s a number of such systems had come into commercial use. Interest in a standard began to grow, and Charles Bachman, author of one such product, the Integrated Data Store (IDS), founded the Database Task Group within CODASYL, the group responsible for the creation and standardization of COBOL. In 1971, the Database Task Group delivered their standard, which generally became known as the CODASYL approach, and soon a number of commercial products based on this approach entered the market. The CODASYL approach of

Waveform graphics

Waveform graphics is a simple vector graphics system introduced by Digital Equipment Corporation (DEC) on the VT55 and VT105 terminals in the mid-1970s. It was used to produce graphics output from mainframes and minicomputers. DEC used the term "waveform graphics" to refer specifically to the hardware, but it was used more generally to describe the whole system. The system was designed to use as little computer memory as possible. At any given X location it could draw two dots at given Y locations, making it suitable for producing two superimposed waveforms, line charts or histograms. Text and graphics could be mixed, and there were additional tools for drawing axes and markers. The waveform graphics system was used only for a short period of time before it was replaced by the more sophisticated ReGIS system, first introduced on the VT125 in 1981. ReGIS allowed the construction of arbitrary vectors and other shapes. Whereas DEC normally provided a backward compatible solution in newer terminal models, they did not choose to do this when ReGIS was introduced, and waveform graphics disappeared from later terminals. == Description == Waveform graphics was introduced on the VT55 terminal in October 1975, an era when memory was extremely expensive. Although it was technically possible to produce a bitmap display using a framebuffer using technology of the era, the memory needed to do so at a reasonable resolution was typically beyond the price point that made it practical. All sorts of systems were used to replace computer memory with other concepts, like the storage tubes used in the Tektronix 4010 terminals, or the zero memory racing-the-beam system used in the Atari 2600. DEC chose to attack this problem through a clever use of a small buffer representing only the vertical positions on the screen. Such a system could not draw arbitrary shapes, but would allow the display of graph data. The system was based on a 512 by 236 pixel display, producing 512 vertical columns along the X-axis, and 236 horizontal rows on the Y-axis. Y locations were counted up from the bottom, so the coordinate 0,0 was in the lower left, and 511, 235 in the upper right. Had this been implemented using a framebuffer with each location represented by a single bit, 512 × 236 × 1 = 120,832 bits, or 15,104 bytes, would have been required. At the time, memory cost about $50 per kilobyte, so the buffer alone would cost over $700 (equivalent to $4,570 in 2025). Instead, the waveform graphic system used one byte of memory for each X axis location, with the byte's value representing the Y location. This required only 512 bytes for each graph, a total of 1024 bytes for the two graphs. Drawing a line required the programmer to construct a series of Y locations and send them as individual points, the terminal could not connect the dots itself. To make this easier, the terminal automatically incremented the X location every time an Y coordinate was received, so a graph line could be sent as a long string of numbers for subsequent Y locations instead of having to repeatedly send the X location every time. Drawing normally started by sending a single instruction to set the initial X location, often 0 on the left, and then sending in data for the entire curve. The system also included storage for up to 512 markers on both lines. These were always drawn centered on the Y value of the line they were associated with, meaning that a simple on/off indication for X locations was all that was needed, requiring only 1024 bits, or 128 bytes, in total. The markers extended 16 pixels vertically, and could only be aligned on 16-pixel boundaries, so they were not necessarily centered across the underlying graph. Markers were used to indicate important points on the graph, where a symbol of some sort would normally be used. The system also allowed a vertical line to be drawn for every horizontal location and a horizontal one at every vertical location. These were also stored as simple on/off bits, requiring another 128 bytes of memory. These lines were used to draw axes and scale lines, or could be used for a screen-spanning crosshair cursor. A separate set of two 7-bit registers held additional information about the drawing style and other settings. Although complex from the user's perspective, this system was easy to implement in hardware. A cathode ray tube produces a display by scanning the screen in a series of horizontal motions, moving down one vertical line after each horizontal scan. At any given instant during this process, the display hardware examines a few memory locations to see if anything needs to be displayed. For instance, it can determine whether to draw a marker on graph 0 by examining register 1 to see if markers are turned on, looking in the marker buffer to see if there is a 1 at the current X location, and then examining the Y location of graph 0 to see if it is within 16 pixels of the current scan line. If all of these are true, a spot is drawn to present that portion of the marker. As this will be true for 16 vertical locations during the scanning process, a 16-pixel high marker will be drawn. Sold alone, the VT55 was priced at $2,496 (equivalent to $16,295 in 2025),. Like other models of the VT50 series, the terminal could be equipped with an optional wet-paper printer in a panel on the right of the screen. This added $800 (equivalent to $5,223 in 2025) to the price. DEC also offered VT55 in a package with a small model of the PDP-11 to create one model of the DEClab 11/03 system. The DEClab normally sold for $14,000 (equivalent to $91,397 in 2025) with a DECwriter II (LA36) hard-copy terminal for $15,000 (equivalent to $97,925 in 2025), with the VT55. The system had I/O channels for up to 15 lab devices, and included libraries for FORTRAN and BASIC for reading the data and creating graphs. The fairly extensive VT55 Programmers Manual covered the latter in depth. == Commands and data == Data was sent to the terminal using an extended set of codes similar to those introduced on the VT52. VT52 codes generally started with the ESC character (octal 33, decimal 27) and was then followed by a single letter instruction. For instance, the string of four characters ESC H ESC J would reposition the cursor in the upper left (home) and then clear the screen from that point down. These codes were basically modeless; triggered by the ESC the resulting escape mode automatically exited again when the command was complete. Escape codes could be interspersed with display text anywhere in the stream of data. In contrast, the graphics system was entirely modal, with escape sequences being sent to cause the terminal to enter or exit graph drawing mode. Data sent between these two codes were interpreted by the graphics hardware, so text and graphics could not be mixed in a single stream of instructions. Graphics mode was entered by sending the string ESC 1, and exited again with the string ESC 2. Even the commands within the graphics mode were modal; characters were interpreted as being additional data for the previous load character (command) until another load character is seen. Ten load characters were available: @ - no operation, used to tell the terminal the last command is no longer active A - load data into register 0, selecting the drawing mode for the two graphs I - load data into register 1, selecting other drawing options H - load the starting X position (Horizontal) for the following commands B - load data for Y locations for graph 0 starting at the H position selected earlier J - load data for Y locations for graph 1 starting at the H position selected earlier C - store a marker on graph 0 at the following X location K - store a marker on graph 1 at the following X location D - draw a horizontal line at the given Y location L - draw a vertical line at the given X location X and Y locations were sent as 10-bit decimal numbers, encoded as ASCII characters, with 5 bits per character. This means that any number within the 1024 number space (210) can be stored as a string of two characters. To ensure the characters can be transmitted over 7-bit links, the pattern 01 is placed in front of both 5-bit numbers, producing 7-bit ASCII values that are always within the printable range. This results in a somewhat complex encoding algorithm. For instance, if one wanted to encode the decimal value 102, first you convert that to the 10-bit decimal pattern 0010010010. That is then split that into upper and lower 5-bit parts, 00100 and 10010. Then append 01 binary to produce 7-bit numbers 0100100 and 0110010. Individually convert back to decimal 40 and 50, and then look up those characters in an ASCII chart, finding ( and 2. These have to be sent to the terminal least significant character first. If these were being used to set the X coordinate, the complete string would be H2(. When used as X and Y locations for the graphs, extra digits were ignored. For instance, the 512 pixel X axis r

Spanish Network of Excellence on Cybersecurity Research

The Spanish Network of Excellence on Cybersecurity Research (RENIC), is a research initiative to promote cybersecurity interests in Spain. == Members == === Board of Directors (2018) === President: Universidad de Málaga Vice president: CSIC Treasurer: Universidad Politécnica de Madrid Secretary: Universidad de Granada Vocals: Tecnalia, Universidad de La Laguna and Universidad de Modragón === Board of Directors (2016) === President: Universidad Carlos III de Madrid Vice president: Universidad Politécnica de Madrid Treasurer: Universidad de Granada Secretary: Universidad de León Vocals: Gradiant, Tecnalia, Universidad de Málaga === Founding Members === Centro Andaluz de Innovación y Tecnologías de la Información y las Comunicaciones (CITIC). Consejo Superior de Investigaciones Científicas (CSIC). Centro Tecnolóxico de Telecomunicaciones de Galicia (Gradiant). Instituto Imdea Software. Instituto Nacional de Ciberseguridad (INCIBE). Mondragón Unibertsitatea. Tecnalia. Universidad Carlos III de Madrid. Universidad Castilla la Mancha. Universidad de Granada. Universidad de la Laguna. Universidad de León. Universidad de Málaga. Universidad de Murcia. Universidad de Vigo. Universidad Internacional de la Rioja. Universidad Politécnica de Madrid. Universidad Rey Juan Carlos. === Members === Consejo Superior de Investigaciones Científicas (CSIC). Centro Tecnolóxico de Telecomunicaciones de Galicia (Gradiant). Instituto Imdea Software. Instituto Nacional de Ciberseguridad (INCIBE). Mondragón Unibertsitatea. Tecnalia. Universidad Carlos III de Madrid. Universidad de Castilla-La Mancha. Universidad de Granada. Universidad de la Laguna. Universidad de León. Universidad de Málaga. Universidad de Murcia. Universidad de Vigo. Universidad Politécnica de Madrid. Universidad Rey Juan Carlos. Universitat Oberta de Catalunya. IKERLAN. === Honorary Members === Centre for the Development of Industrial Technology (CDTI). (2017) Instituto Nacional de Ciberseguridad (INCIBE). (2016) == Initiatives and Participations == RENIC is ECSO member, and is also a member of its board of directors. A collaboration agreement between RENIC and the Innovative Business Cluster on Cybersecurity (AEI Cybersecurity) has been signed. RENIC is pleased to sponsor the Cybersecurity Research National Conferences (JNIC) JNIC2017 edition, organized by Universidad Rey Juan Carlos. RENIC is pleased to announce the publication of the online version of the Catalog and knowledge map of cybersecurity research

Law practice management software

Law practice management software is software designed to manage the business operations of a law firm. This can include software that manages cases, client intake, court communications, electronic discovery, time tracking, trust accounting, and billing. == Features of law practice management software == Common features of practice management software include: Case management Time tracking Document assembly Contact management Calendaring Docket management Client portal Contract Management Court Case Status Tracker Trust accounting == Examples of law practice management software == Smokeball LEAP Legal Software PracticeEvolve Dye & Durham

Confused deputy problem

In information security, a confused deputy is a computer program that is tricked by another program (with fewer privileges or less rights) into misusing its authority on the system. It is a specific type of privilege escalation. The confused deputy problem is often cited as an example of why capability-based security is important. Capability systems protect against the confused deputy problem, whereas access-control list–based systems do not. Such systems can mitigate the confused deputy problem by eliminating ambient authority, allowing programs to act only on resources for which they hold explicit capabilities, whereas access-control list–based systems are more susceptible to it. However, this protection depends on correct implementation; in formally verified capability systems such as seL4, it can be shown that the kernel enforces capability constraints correctly, preventing such behavior at the system level. == Example == In the original example of a confused deputy, there was a compiler program provided on a commercial timesharing service. Users could run the compiler and optionally specify a filename where it would write debugging output, and the compiler would be able to write to that file if the user had permission to write there. The compiler also collected statistics about language feature usage. Those statistics were stored in a file called "(SYSX)STAT", in the directory "SYSX". To make this possible, the compiler program was given permission to write to files in SYSX. But there were other files in SYSX: in particular, the system's billing information was stored in a file "(SYSX)BILL". A user ran the compiler and named "(SYSX)BILL" as the desired debugging output file. This produced a confused deputy problem. The compiler made a request to the operating system to open (SYSX)BILL. Even though the user did not have access to that file, the compiler did, so the open succeeded. The compiler wrote the compilation output to the file (here "(SYSX)BILL") as normal, overwriting it, and the billing information was destroyed. === The confused deputy === In this example, the compiler program is the deputy because it is acting at the request of the user. The program is seen as 'confused' because it was tricked into overwriting the system's billing file. Whenever a program tries to access a file, the operating system needs to know two things: which file the program is asking for, and whether the program has permission to access the file. In the example, the file is designated by its name, “(SYSX)BILL”. The program receives the file name from the user, but does not know whether the user had permission to write the file. When the program opens the file, the system uses the program's permission, not the user's. When the file name was passed from the user to the program, the permission did not go along with it; the permission was increased by the system silently and automatically. It is not essential to the attack that the billing file be designated by a name represented as a string. The essential points are that: the designator for the file does not carry the full authority needed to access the file; the program's own permission to access the file is used implicitly. == Other examples == A cross-site request forgery (CSRF) is an example of a confused deputy attack that uses the web browser to perform sensitive actions against a web application. A common form of this attack occurs when a web application uses a cookie to authenticate all requests transmitted by a browser. Using JavaScript, an attacker can force a browser into transmitting authenticated HTTP requests. The Samy computer worm used cross-site scripting (XSS) to turn the browser's authenticated MySpace session into a confused deputy. Using XSS the worm forced the browser into posting an executable copy of the worm as a MySpace message which was then viewed and executed by friends of the infected user. Clickjacking is an attack where the user acts as the confused deputy. In this attack a user thinks they are harmlessly browsing a website (an attacker-controlled website) but they are in fact tricked into performing sensitive actions on another website. An FTP bounce attack can allow an attacker to connect indirectly to TCP ports to which the attacker's machine has no access, using a remote FTP server as the confused deputy. Another example relates to personal firewall software. It can restrict Internet access for specific applications. Some applications circumvent this by starting a browser with instructions to access a specific URL. The browser has authority to open a network connection, even though the application does not. Firewall software can attempt to address this by prompting the user in cases where one program starts another which then accesses the network. However, the user frequently does not have sufficient information to determine whether such an access is legitimate—false positives are common, and there is a substantial risk that even sophisticated users will become habituated to clicking "OK" to these prompts. Not every program that misuses authority is a confused deputy. Sometimes misuse of authority is simply a result of a program error. The confused deputy problem occurs when the designation of an object is passed from one program to another, and the associated permission changes unintentionally, without any explicit action by either party. It is insidious because neither party did anything explicit to change the authority. Another example is when an administrator authorizes an AI agent to act on their behalf, and that AI subsequently delegates authority to another AI agent neither vetted nor authorized by the original administrator. The unvetted AI can then act without permissions or oversight from the original developer. == Solutions == In some systems it is possible to ask the operating system to open a file using the permissions of another client. This solution has some drawbacks: It requires explicit attention to security by the server. A naive or careless server might not take this extra step. It becomes more difficult to identify the correct permission if the server is in turn the client of another service and wants to pass along access to the file. It requires the client to trust the server to not abuse the borrowed permissions. Note that intersecting the server and client's permissions does not solve the problem either, because the server may then have to be given very wide permissions (all of the time, rather than those needed for a given request) in order to act for arbitrary clients. The simplest way to solve the confused deputy problem is to bundle together the designation of an object and the permission to access that object. This is exactly what a capability is. Using capability security in the compiler example, the client would pass to the server a capability to the output file, such as a file descriptor, rather than the name of the file. Since it lacks a capability to the billing file, it cannot designate that file for output. In the cross-site request forgery example, a URL supplied "cross"-site would include its own authority independent of that of the client of the web browser.