Logic learning machine

Logic learning machine

Logic learning machine (LLM) is a machine learning method based on the generation of intelligible rules. LLM is an efficient implementation of the Switching Neural Network (SNN) paradigm, developed by Marco Muselli, Senior Researcher at the Italian National Research Council CNR-IEIIT in Genoa. LLM has been employed in many different sectors, including the field of medicine (orthopedic patient classification, DNA micro-array analysis and Clinical Decision Support Systems), financial services and supply chain management. == History == The Switching Neural Network approach was developed in the 1990s to overcome the drawbacks of the most commonly used machine learning methods. In particular, black box methods, such as multilayer perceptron and support vector machine, had good accuracy but could not provide deep insight into the studied phenomenon. On the other hand, decision trees were able to describe the phenomenon but often lacked accuracy. Switching Neural Networks made use of Boolean algebra to build sets of intelligible rules able to obtain very good performance. In 2014, an efficient version of Switching Neural Network was developed and implemented in the Rulex suite with the name Logic Learning Machine. Also, an LLM version devoted to regression problems was developed. == General == Like other machine learning methods, LLM uses data to build a model able to perform a good forecast about future behaviors. LLM starts from a table including a target variable (output) and some inputs and generates a set of rules that return the output value y {\displaystyle y} corresponding to a given configuration of inputs. A rule is written in the form: if premise then consequence where consequence contains the output value whereas premise includes one or more conditions on the inputs. According to the input type, conditions can have different forms: for categorical variables the input value must be in a given subset: x 1 ∈ { A , B , C , . . . } {\displaystyle x_{1}\in \{A,B,C,...\}} . for ordered variables the condition is written as an inequality or an interval: x 2 ≤ α {\displaystyle x_{2}\leq \alpha } or β ≤ x 3 ≤ γ {\displaystyle \beta \leq x_{3}\leq \gamma } A possible rule is therefore in the form if x 1 ∈ { A , B , C , . . . } {\displaystyle x_{1}\in \{A,B,C,...\}} AND x 2 ≤ α {\displaystyle x_{2}\leq \alpha } AND β ≤ x 3 ≤ γ {\displaystyle \beta \leq x_{3}\leq \gamma } then y = y ¯ {\displaystyle y={\bar {y}}} == Types == According to the output type, different versions of the Logic Learning Machine have been developed: Logic Learning Machine for classification, when the output is a categorical variable, which can assume values in a finite set Logic Learning Machine for regression, when the output is an integer or real number.

Boundary vector field

The boundary vector field (BVF) is an external force for parametric active contours (i.e. Snakes). In the fields of computer vision and image processing, parametric active contours are widely used for segmentation and object extraction. The active contours move progressively towards its target based on the external forces. There are a number of shortcomings in using the traditional external forces, including the capture range problem, the concave object extraction problem, and high computational requirements. The BVF is generated by an interpolation scheme which reduces the computational requirement significantly, and at the same time, improves the capture range and concave object extraction capability. The BVF is also tested in moving object tracking and is proven to provide fast detection method for real time video applications.

C4.5 algorithm

C4.5 is an algorithm used to generate a decision tree developed by Ross Quinlan. C4.5 is an extension of Quinlan's earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason, C4.5 is often referred to as a statistical classifier. In 2011, authors of the Weka machine learning software described the C4.5 algorithm as "a landmark decision tree program that is probably the machine learning workhorse most widely used in practice to date". It became quite popular after ranking #1 in the Top 10 Algorithms in Data Mining pre-eminent paper published by Springer LNCS in 2008. == Algorithm == C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy. The training data is a set S = s 1 , s 2 , . . . {\displaystyle S={s_{1},s_{2},...}} of already classified samples. Each sample s i {\displaystyle s_{i}} consists of a p-dimensional vector ( x 1 , i , x 2 , i , . . . , x p , i ) {\displaystyle (x_{1,i},x_{2,i},...,x_{p,i})} , where the x j {\displaystyle x_{j}} represent attribute values or features of the sample, as well as the class in which s i {\displaystyle s_{i}} falls. At each node of the tree, C4.5 chooses the attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. The splitting criterion is the normalized information gain (difference in entropy). The attribute with the highest normalized information gain is chosen to make the decision. The C4.5 algorithm then recurses on the partitioned sublists. This algorithm has a few base cases. All the samples in the list belong to the same class. When this happens, it simply creates a leaf node for the decision tree saying to choose that class. None of the features provide any information gain. In this case, C4.5 creates a decision node higher up the tree using the expected value of the class. Instance of previously unseen class encountered. Again, C4.5 creates a decision node higher up the tree using the expected value. === Pseudocode === In pseudocode, the general algorithm for building decision trees is: Check for the above base cases. For each attribute a, find the normalized information gain ratio from splitting on a. Let a_best be the attribute with the highest normalized information gain. Create a decision node that splits on a_best. Recurse on the sublists obtained by splitting on a_best, and add those nodes as children of node. == Improvements from ID3 algorithm == C4.5 made a number of improvements to ID3. Some of these are: Handling both continuous and discrete attributes: In order to handle continuous attributes, C4.5 creates a threshold and then splits the list into those whose attribute value is above the threshold and those that are less than or equal to it. Handling training data with missing attribute values: C4.5 allows attribute values to be marked as missing. Missing attribute values are simply not used in gain and entropy calculations. Handling attributes with differing costs. Pruning trees after creation: C4.5 goes back through the tree once it's been created and attempts to remove branches that do not help by replacing them with leaf nodes. == Improvements in C5.0/See5 algorithm == Quinlan went on to create C5.0 and See5 (C5.0 for Unix/Linux, See5 for Windows) which he markets commercially. C5.0 offers a number of improvements on C4.5. Some of these are: Speed - C5.0 is significantly faster than C4.5 (several orders of magnitude) Memory usage - C5.0 is more memory efficient than C4.5 Smaller decision trees - C5.0 gets similar results to C4.5 with considerably smaller decision trees. Support for boosting - Boosting improves the trees and gives them more accuracy. Weighting - C5.0 allows you to weight different cases and misclassification types. Winnowing - a C5.0 option automatically winnows the attributes to remove those that may be unhelpful. Source for a single-threaded Linux version of C5.0 is available under the GNU General Public License (GPL).

C4.5 algorithm

C4.5 is an algorithm used to generate a decision tree developed by Ross Quinlan. C4.5 is an extension of Quinlan's earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason, C4.5 is often referred to as a statistical classifier. In 2011, authors of the Weka machine learning software described the C4.5 algorithm as "a landmark decision tree program that is probably the machine learning workhorse most widely used in practice to date". It became quite popular after ranking #1 in the Top 10 Algorithms in Data Mining pre-eminent paper published by Springer LNCS in 2008. == Algorithm == C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept of information entropy. The training data is a set S = s 1 , s 2 , . . . {\displaystyle S={s_{1},s_{2},...}} of already classified samples. Each sample s i {\displaystyle s_{i}} consists of a p-dimensional vector ( x 1 , i , x 2 , i , . . . , x p , i ) {\displaystyle (x_{1,i},x_{2,i},...,x_{p,i})} , where the x j {\displaystyle x_{j}} represent attribute values or features of the sample, as well as the class in which s i {\displaystyle s_{i}} falls. At each node of the tree, C4.5 chooses the attribute of the data that most effectively splits its set of samples into subsets enriched in one class or the other. The splitting criterion is the normalized information gain (difference in entropy). The attribute with the highest normalized information gain is chosen to make the decision. The C4.5 algorithm then recurses on the partitioned sublists. This algorithm has a few base cases. All the samples in the list belong to the same class. When this happens, it simply creates a leaf node for the decision tree saying to choose that class. None of the features provide any information gain. In this case, C4.5 creates a decision node higher up the tree using the expected value of the class. Instance of previously unseen class encountered. Again, C4.5 creates a decision node higher up the tree using the expected value. === Pseudocode === In pseudocode, the general algorithm for building decision trees is: Check for the above base cases. For each attribute a, find the normalized information gain ratio from splitting on a. Let a_best be the attribute with the highest normalized information gain. Create a decision node that splits on a_best. Recurse on the sublists obtained by splitting on a_best, and add those nodes as children of node. == Improvements from ID3 algorithm == C4.5 made a number of improvements to ID3. Some of these are: Handling both continuous and discrete attributes: In order to handle continuous attributes, C4.5 creates a threshold and then splits the list into those whose attribute value is above the threshold and those that are less than or equal to it. Handling training data with missing attribute values: C4.5 allows attribute values to be marked as missing. Missing attribute values are simply not used in gain and entropy calculations. Handling attributes with differing costs. Pruning trees after creation: C4.5 goes back through the tree once it's been created and attempts to remove branches that do not help by replacing them with leaf nodes. == Improvements in C5.0/See5 algorithm == Quinlan went on to create C5.0 and See5 (C5.0 for Unix/Linux, See5 for Windows) which he markets commercially. C5.0 offers a number of improvements on C4.5. Some of these are: Speed - C5.0 is significantly faster than C4.5 (several orders of magnitude) Memory usage - C5.0 is more memory efficient than C4.5 Smaller decision trees - C5.0 gets similar results to C4.5 with considerably smaller decision trees. Support for boosting - Boosting improves the trees and gives them more accuracy. Weighting - C5.0 allows you to weight different cases and misclassification types. Winnowing - a C5.0 option automatically winnows the attributes to remove those that may be unhelpful. Source for a single-threaded Linux version of C5.0 is available under the GNU General Public License (GPL).

Neural Networks (journal)

Neural Networks is a monthly peer-reviewed scientific journal and an official journal of the International Neural Network Society, European Neural Network Society, and Japanese Neural Network Society. == History == The journal was established in 1988 and is published by Elsevier. It covers all aspects of research on artificial neural networks. The founding editor-in-chief was Stephen Grossberg (Boston University). The current editors-in-chief are DeLiang Wang (Ohio State University) and Taro Toyoizumi (RIKEN Center for Brain Science). == Abstracting and indexing == The journal is abstracted and indexed in Scopus and the Science Citation Index Expanded. According to the Journal Citation Reports, the journal has a 2022 impact factor of 7.8.

Autognostics

Autognostics is a new paradigm that describes the capacity for computer networks to be self-aware. It is considered one of the major components of Autonomic Networking. == Introduction == One of the most important characteristics of today's Internet that has contributed to its success is its basic design principle: a simple and transparent core with intelligence at the edges (the so-called "end-to-end principle"). Based on this principle, the network carries data without knowing the characteristics of that data (e.g., voice, video, etc.) - only the end-points have application-specific knowledge. If something goes wrong with the data, only the edge may be able to recognize that since it knows about the application and what the expected behavior is. The core has no information about what should happen with that data - it only forwards packets. Although an effective and beneficial attribute, this design principle has also led to many of today's problems, limitations, and frustrations. Currently, it is almost impossible for most end-users to know why certain network-based applications do not work well and what they need to do to make it better. Also, network operators who interact with the core in low-level terms such as router configuration have problems expressing their high-level goals into low-level actions. In high-level terms, this may be summarized as a weak coupling between the network and application layers of the overall system. As a consequence of the Internet end-to-end principle, the network performance experienced by a particular application is difficult to attribute based on the behavior of the individual elements. At any given moment, the measure of performance between any two points is typically unknown and applications must operate blindly. As a further consequence, changes to the configuration of given element, or changes in the end-to-end path, cannot easily be validated. Optimization and provisioning cannot then be automated except against only the simplest design specifications. There is an increasing interest in Autonomic Networking research, and a strong conviction that an evolution from the current networking status quo is necessary. Although to date there have not been any practical implementations demonstrating the benefits of an effective autonomic networking paradigm, there seems to be a consensus as to the characteristics which such implementations would need to demonstrate. These specifically include continuous monitoring, identifying, diagnosing and fixing problems based on high-level policies and objectives. Autognostics, as a major part of the autonomic networking concept, intends to bring networks to a new level of awareness and eliminate the lack of visibility which currently exists in today's networks. == Definition == Autognostics is a new paradigm that describes the capacity for computer networks to be self-aware, in part and as a whole, and dynamically adapt to the applications running on them by autonomously monitoring, identifying, diagnosing, resolving issues, subsequently verifying that any remediation was successful, and reporting the impact with respect to the application's use (i.e., providing visibility into the changes to networks and their effects). Although similar to the concept of network awareness, i.e., the capability of network devices and applications to be aware of network characteristics (see References section below), it is noteworthy that autognostics takes that concept one step further. The main difference is the auto part of autognostics, which entails that network devices are self-aware of network characteristics, and have the capability to adapt themselves as a result of continuous monitoring and diagnostics. == Path to autognostics == Autognostics, or in other words deep self-knowledge, can be best described as the ability of a network to know itself and the applications that run on it. This knowledge is used to autonomously adapt to dynamic network and application conditions such as utilization, capacity, quality of service/application/user experience, etc. In order to achieve autognosis, networks need a means to: Continuously monitor/test the network for application-specific performance Analyze the monitoring/test data to detect problems (e.g., performance degradation) Diagnose, identify and localize sources of degradation Automatically take actions to resolve problems via remediation/provisioning Verify the problems have been resolved (potentially rolling back changes if ineffective) Subsequently, continue to monitor/test for performance

Random indexing

Random indexing is a dimensionality reduction method and computational framework for distributional semantics, based on the insight that very-high-dimensional vector space model implementations are impractical, that models need not grow in dimensionality when new items (e.g. new terminology) are encountered, and that a high-dimensional model can be projected into a space of lower dimensionality without compromising L2 distance metrics if the resulting dimensions are chosen appropriately. This is the original point of the random projection approach to dimension reduction first formulated as the Johnson–Lindenstrauss lemma, and locality-sensitive hashing has some of the same starting points. Random indexing, as used in representation of language, originates from the work of Pentti Kanerva on sparse distributed memory, and can be described as an incremental formulation of a random projection. It can be also verified that random indexing is a random projection technique for the construction of Euclidean spaces—i.e. L2 normed vector spaces. In Euclidean spaces, random projections are elucidated using the Johnson–Lindenstrauss lemma. The TopSig technique extends the random indexing model to produce bit vectors for comparison with the Hamming distance similarity function. It is used for improving the performance of information retrieval and document clustering. In a similar line of research, Random Manhattan Integer Indexing (RMII) is proposed for improving the performance of the methods that employ the Manhattan distance between text units. Many random indexing methods primarily generate similarity from co-occurrence of items in a corpus. Reflexive Random Indexing (RRI) generates similarity from co-occurrence and from shared occurrence with other items.