Stevens Award

The Stevens Award is a software engineering lecture award given by the Reengineering Forum, an industry association. The international award was created to recognize outstanding contributions to the literature or practice of methods for software and systems development. The first award was given in 1995. The award lectures focus on the current state of software methods and their direction for the future. The award is named in memory of Wayne Stevens (1944-1993), a consultant, author, pioneer, and advocate of the practical application of software methods and tools. The award and lecture are managed by the Reengineering Forum. The award was founded by the International Workshop on Computer Aided Software Engineering (IWCASE), an international workshop association of users and developers of computer-aided software engineering (CASE) technology, which merged into the Reengineering Forum. Wayne Stevens was a charter member of the IWCASE executive board.

== Recipients ==
1995: Tony Wasserman
1996: David Harel
1997: Michael Jackson
1998: Thomas McCabe
1999: Tom DeMarco
2000: Gerald Weinberg
2001: Peter Chen
2002: Cordell Green
2003: Manny Lehman
2004: François Bodart
2005: Mary Shaw, Jim Highsmith
2006: Grady Booch
2007: Nicholas Zvegintzov
2008: Harry Sneed
2009: Larry Constantine
2010: Peter Aiken
2011: Jared Spool, Barry Boehm
2012: Philip Newcomb
2013: Jean-Luc Hainaut
2014: François Coallier
2015: Pierre Bourque

RagTime

RagTime is frame-oriented business publishing software that combines word processing, spreadsheets, simple drawings, image processing, and charts in a single integrated program and document format. It is often used in office environments to create forms, reports, and documentation, and for desktop publishing. Typical users are business clients, educational institutions, administrations, architects, and private users.

RagTime includes the following modules:
Page layout (forms, templates etc.)
Word processing
Image processing
Spreadsheets, similar to Microsoft Excel
Formulas and functions which can be used throughout, in text, graphics, and spreadsheets
Charts in different types of diagrams
Drawings in vector graphics including lines, polygons, Bézier curves and more
Slide show (presentation of RagTime documents)
Audio/video
Buttons (pop-up menus, switches, and more) that can be used within RagTime documents
Import/export of various file formats
Support of the AppleScript scripting language available system-wide under macOS

== Principle ==
RagTime differs from most comparable programs or software packages in its strict frame-oriented design: all content is contained within frames on each page. The content can have a fixed position within its frame or, if it is text or a spreadsheet, flow into another frame that is connected to the first frame via a so-called “pipeline”. RagTime does not use different document types for different types of data; all content is stored in a single compound document type. Thus, a RagTime document can contain not only multiple pages but also multiple layouts within the same document, e.g. spreadsheets in addition to text and images. The RagTime filename extension is .rtd (RagTime document); for templates the extension is .rtt (RagTime template). The current version is RagTime 6.6.5. It is available for OS X (10.6-10.14) and Windows (XP/Vista/7/8/10).
== Extensions ==
FileTime – allows accessing “FileMaker Pro” databases from RagTime documents under OS X
RagTime Connect – ODBC database connection for RagTime 6 (Mac and Windows)
Johannes – print extension for the simple creation of stapled or folded brochures, booklets etc.
PowerFunctions – additional functions for a more effective creation of intelligent documents for exchanging data and for use in mixed Mac/Windows environments
MetaFormula – SYLK-based extension that allows calculating text as formula

== History ==
RagTime has been developed since 1985 for the Macintosh, originally under the name MacFrame, and was published in 1986. By the time of its release it already carried its present name, which was chosen with reference to the then-available software package Lotus Jazz. In the European Macintosh market, RagTime quickly gained a prominent position that continues to this day, even though its market share has decreased. Despite repeated attempts, the program could not gain acceptance in the North American market due to its high cost ($395 in 1990). The North American sales office closed in 1991, shortly after Claris Corporation released ClarisWorks, which duplicated much of the functionality of RagTime for a lower price. The manufacturer (first Brüning & Everth, then B&E Software, and today RagTime.de Development) focused solely on the Macintosh for a very long time, but released a Windows version, RagTime 5.0, in 1999. However, the program could not gain much significance against established competitors, especially Microsoft Office. Until mid-2006 RagTime was, in addition to the commercial version, also available as a free version (RagTime Solo) for personal use. RagTime Solo offered the same features and performance (except for the spelling and syllabification dictionaries), but was not licensed for use in commercial environments. In other languages RagTime Solo was distributed as RagTime Privat.
In a press release from July 5, 2006, RagTime announced the discontinuation of RagTime Solo: “… the RagTime Solo license conditions were often misinterpreted or deliberately flouted. Therefore we discontinued RagTime Solo, there will be no private version of RagTime 6 anymore.” After a successful start of the RagTime 6.0 software, sales declined significantly in the following years. Disagreements arose among the shareholders about the continuation of the company, which filed for bankruptcy in July 2007. As a result, the rights to RagTime were taken over by the newly established company RagTime.de Development GmbH, which was responsible for the development. The sales partner RagTime.de Sales GmbH distributed the RagTime products until October 2015. Today RagTime.de Development GmbH is also responsible for sales. The latest stage of development is the extensively revamped version RagTime 6.6 of 8 October 2015, which also supports newer OS X features (e.g. high-resolution “Retina” displays) and Windows 10.

== Programming ==
RagTime 1-3 were developed in Pascal; since version 4, development has been entirely in C++. External programming and automation can be implemented via AppleScript on a Mac, and via an OLE/COM API (e.g. from Visual Basic) under Windows. On a Mac, RagTime provides a comprehensive AppleScript library for the automation of almost any task, from automatic document creation to the export of PDF documents. RagTime also supports “recordings” through the “AppleScript Editor”, which allows recording interactive RagTime operations as an AppleScript program sequence. AppleScripts can be saved in the RagTime document and called via menu or shortcut keys. On Windows, RagTime (since version 6) provides an OLE/COM API, which allows automating many RagTime components via external programming. For that purpose, a type library installs the available RagTime OLE/COM object catalogue. Programming is possible in any of the programming languages supported by Microsoft.

Variable kernel density estimation

In statistics, adaptive or "variable-bandwidth" kernel density estimation is a form of kernel density estimation in which the size of the kernels used in the estimate is varied depending upon either the location of the samples or the location of the test point. It is a particularly effective technique when the sample space is multi-dimensional.

== Rationale ==
Given a set of samples, $\lbrace \vec{x}_i \rbrace$, we wish to estimate the density, $P(\vec{x})$, at a test point, $\vec{x}$:

$$P(\vec{x}) \approx \frac{W}{n h^{D}}$$

$$W = \sum_{i=1}^{n} w_i$$

$$w_i = K\left(\frac{\vec{x} - \vec{x}_i}{h}\right)$$

where n is the number of samples, K is the "kernel", h is its width and D is the number of dimensions in $\vec{x}$. The kernel can be thought of as a simple, linear filter.

Using a fixed filter width may mean that in regions of low density, all samples will fall in the tails of the filter with very low weighting, while regions of high density will find an excessive number of samples in the central region with weighting close to unity. To fix this problem, we vary the width of the kernel in different regions of the sample space. There are two methods of doing this: balloon and pointwise estimation. In a balloon estimator, the kernel width is varied depending on the location of the test point. In a pointwise estimator, the kernel width is varied depending on the location of the sample. For multivariate estimators, the parameter, h, can be generalized to vary not just the size, but also the shape of the kernel. This more complicated approach will not be covered here.
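As a point of reference for the fixed-width case described in the rationale, the estimate W / (n h^D) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `gaussian_kernel` and `kde` are hypothetical, and the choice of a normalised Gaussian kernel is an assumption, since the text leaves K unspecified at this point.

```python
import math

def gaussian_kernel(u):
    """Normalised D-dimensional Gaussian kernel evaluated at the vector u (assumed kernel)."""
    d = len(u)
    sq_norm = sum(c * c for c in u)
    return math.exp(-0.5 * sq_norm) / (2.0 * math.pi) ** (d / 2.0)

def kde(x, samples, h):
    """Fixed-bandwidth estimate P(x) ~ W / (n h^D), with W the sum of kernel weights."""
    n = len(samples)
    d = len(x)
    # w_i = K((x - x_i) / h) for every sample x_i
    w_total = sum(
        gaussian_kernel([(a - b) / h for a, b in zip(x, s)]) for s in samples
    )
    return w_total / (n * h ** d)

# Usage: density at the origin from three one-dimensional samples, width h = 0.5
estimate = kde([0.0], [[-0.2], [0.1], [0.4]], 0.5)
```

With a single width h for all test points, sparse regions contribute only tail weights and dense regions contribute many near-unit weights, which is the problem the variable-width estimators are meant to address.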
== Balloon estimators ==
A common method of varying the kernel width is to make it inversely proportional to the density at the test point:

$$h = \frac{k}{\left[n P(\vec{x})\right]^{1/D}}$$

where k is a constant. If we back-substitute the estimated PDF, and assuming a Gaussian kernel function, we can show that W is a constant:

$$W = k^{D} (2\pi)^{D/2}$$

A similar derivation holds for any kernel whose normalising function is of the order $h^{D}$, although with a different constant factor in place of the $(2\pi)^{D/2}$ term. This produces a generalization of the k-nearest neighbour algorithm. That is, a uniform kernel function will return the KNN technique. There are two components to the error: a variance term and a bias term. The variance term is given as:

$$e_1 = \frac{P \int K^{2}}{n h^{D}}$$

The bias term is found by evaluating the approximated function in the limit as the kernel width becomes much larger than the sample spacing. By using a Taylor expansion for the real function, the bias term drops out:

$$e_2 = \frac{h^{2}}{n} \nabla^{2} P$$

An optimal kernel width that minimizes the error of each estimate can thus be derived.

== Use for statistical classification ==
The method is particularly effective when applied to statistical classification. There are two ways we can proceed: the first is to compute the PDFs of each class separately, using different bandwidth parameters, and then compare them as in Taylor. Alternatively, we can divide up the sum based on the class of each sample:

$$P(j, \vec{x}) \approx \frac{1}{n} \sum_{i=1,\, c_i = j}^{n} w_i$$

where $c_i$ is the class of the ith sample. The class of the test point may be estimated through maximum likelihood.
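The balloon idea and the class-wise sum can be combined in a short Python sketch. It rests on several assumptions that are not in the text: an unnormalised Gaussian kernel, a bisection search (on a logarithmic scale) for the width h at which the total weight W reaches a chosen constant `w_target`, and the hypothetical function names `weights`, `balloon_bandwidth` and `classify`. It is meant to illustrate the mechanics, not to reproduce any particular published implementation.

```python
import math

def weights(x, samples, h):
    """Unnormalised Gaussian weights w_i = exp(-|x - x_i|^2 / (2 h^2)) (assumed kernel)."""
    return [
        math.exp(-0.5 * sum(((a - b) / h) ** 2 for a, b in zip(x, s)))
        for s in samples
    ]

def balloon_bandwidth(x, samples, w_target, lo=1e-6, hi=1e6, iters=200):
    """Bisect on a log scale for the h at which W = sum(w_i) equals w_target.

    W grows monotonically with h for this kernel; w_target should lie
    strictly between 0 and the number of samples.
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if sum(weights(x, samples, mid)) < w_target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def classify(x, samples, labels, w_target):
    """Split the weight sum by class and return (most likely class, bandwidth used)."""
    h = balloon_bandwidth(x, samples, w_target)
    n = len(samples)
    joint = {}
    for w_i, c_i in zip(weights(x, samples, h), labels):
        joint[c_i] = joint.get(c_i, 0.0) + w_i / n  # P(j, x) ~ (1/n) * sum of w_i with c_i = j
    return max(joint, key=joint.get), h

# Usage: two well-separated one-dimensional classes
samples = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
labels = [0, 0, 0, 1, 1, 1]
label, h = classify([0.05], samples, labels, w_target=2.0)
```

Holding W fixed makes the width shrink where samples are dense and grow where they are sparse, which is the sense in which the balloon estimator generalizes the k-nearest-neighbour approach.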

LogitBoost

In machine learning and computational learning theory, LogitBoost is a boosting algorithm formulated by Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The original paper casts the AdaBoost algorithm into a statistical framework. Specifically, if one considers AdaBoost as a generalized additive model and then applies the cost function of logistic regression, one can derive the LogitBoost algorithm.

== Minimizing the LogitBoost cost function ==
LogitBoost can be seen as a convex optimization. Specifically, given that we seek an additive model of the form

$$f = \sum_{t} \alpha_t h_t$$

the LogitBoost algorithm minimizes the logistic loss:

$$\sum_{i} \log\left(1 + e^{-y_i f(x_i)}\right)$$
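To make the objective concrete, the following Python sketch evaluates the logistic loss of a given additive model. It only computes the cost function; it does not implement the LogitBoost fitting procedure itself. The names `additive_model` and `logistic_loss`, the threshold "stump" learners, and the convention that labels are -1 or +1 are assumptions made for illustration.

```python
import math

def additive_model(x, alphas, learners):
    """f(x) = sum_t alpha_t * h_t(x) for weak learners h_t."""
    return sum(a * h(x) for a, h in zip(alphas, learners))

def logistic_loss(xs, ys, alphas, learners):
    """Sum over samples of log(1 + exp(-y_i * f(x_i))), with y_i in {-1, +1} (assumed)."""
    return sum(
        math.log1p(math.exp(-y * additive_model(x, alphas, learners)))
        for x, y in zip(xs, ys)
    )

# Usage: two hypothetical threshold stumps on one-dimensional inputs
learners = [
    lambda x: 1.0 if x > 0.0 else -1.0,
    lambda x: 1.0 if x > 2.0 else -1.0,
]
xs = [-1.0, 1.0, 3.0]
ys = [-1, 1, 1]
loss = logistic_loss(xs, ys, [0.8, 0.3], learners)
```

Because each term is a convex function of f(x_i) and f is linear in the coefficients, the loss is convex in the coefficients for a fixed set of weak learners.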

Medoid

Medoids are representative objects of a data set or a cluster within a data set whose sum of dissimilarities to all the objects in the cluster is minimal. Medoids are similar in concept to means or centroids, but medoids are always restricted to be members of the data set. Medoids are most commonly used on data when a mean or centroid cannot be defined, such as graphs. They are also used in contexts where the centroid is not representative of the dataset, as with images, 3-D trajectories and gene expression (where the data is sparse but the medoid need not be). Medoids are also of interest when a representative is sought under some distance other than squared Euclidean distance (for instance, in movie ratings).

For some data sets there may be more than one medoid, as with medians. A common application of the medoid is the k-medoids clustering algorithm, which is similar to the k-means algorithm but works when a mean or centroid is not definable. The algorithm works roughly as follows. First, a set of medoids is chosen at random. Second, the distances to the other points are computed. Third, data are clustered according to the medoid they are most similar to. Fourth, the medoid set is optimized via an iterative process.

Note that a medoid is not equivalent to a median, a geometric median, or centroid. A median is only defined on 1-dimensional data, and it only minimizes dissimilarity to other points for metrics induced by a norm (such as the Manhattan distance or Euclidean distance). A geometric median is defined in any dimension, but unlike a medoid, it is not necessarily a point from within the original dataset.

== Definition ==
Let $\mathcal{X} := \{x_1, x_2, \dots, x_n\}$ be a set of $n$ points in a space with a distance function d. The medoid is defined as

$$x_{\text{medoid}} = \arg\min_{y \in \mathcal{X}} \sum_{i=1}^{n} d(y, x_i).$$

== Clustering with medoids ==
Medoids are a popular replacement for the cluster mean when the distance function is not (squared) Euclidean distance, or not even a metric (as the medoid does not require the triangle inequality). When partitioning the data set into clusters, the medoid of each cluster can be used as a representative of each cluster. Clustering algorithms based on the idea of medoids include:
Partitioning Around Medoids (PAM), the standard k-medoids algorithm
Hierarchical Clustering Around Medoids (HACAM), which uses medoids in hierarchical clustering

== Algorithms to compute the medoid of a set ==
From the definition above, it is clear that the medoid of a set $\mathcal{X}$ can be computed after computing all pairwise distances between points in the ensemble. This would take $O(n^{2})$ distance evaluations (with $n = |\mathcal{X}|$). In the worst case, one cannot compute the medoid with fewer distance evaluations. However, there are many approaches that allow us to compute medoids either exactly or approximately in sub-quadratic time under different statistical models.

If the points lie on the real line, computing the medoid reduces to computing the median, which can be done in $O(n)$ by the quickselect algorithm of Hoare. However, in higher dimensional real spaces, no linear-time algorithm is known. RAND is an algorithm that estimates the average distance of each point to all the other points by sampling a random subset of other points. It takes a total of $O\left(\frac{n \log n}{\epsilon^{2}}\right)$ distance computations to approximate the medoid within a factor of $(1 + \epsilon \Delta)$ with high probability, where $\Delta$ is the maximum distance between two points in the ensemble.
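The exact quadratic computation and the sampling idea behind RAND can be sketched in Python as follows. This is an illustrative sketch only: the function names `medoid` and `rand_medoid` are hypothetical, the sample size `m` is left to the caller rather than derived from the bound quoted above, and the sampling step only loosely follows the published algorithm.

```python
import random

def medoid(points, dist):
    """Exact medoid: the member minimising the summed distance to all points (O(n^2) evaluations)."""
    return min(points, key=lambda y: sum(dist(y, x) for x in points))

def rand_medoid(points, dist, m, rng):
    """RAND-style estimate: score every point against m randomly sampled reference points.

    The same reference sample is shared by all candidates, so the cost is
    n * m distance evaluations instead of n^2.
    """
    refs = [rng.choice(points) for _ in range(m)]
    return min(points, key=lambda y: sum(dist(y, r) for r in refs))

# Usage on one-dimensional data with absolute difference as the distance
points = [0, 1, 2, 3, 100]
absolute = lambda a, b: abs(a - b)
exact = medoid(points, absolute)
approx = rand_medoid(points, absolute, m=3, rng=random.Random(0))
```

The exact routine always returns a member of the input, in line with the definition; the sampled routine trades accuracy for fewer distance evaluations and may return a different point.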
Note that RAND is an approximation algorithm, and moreover $\Delta$ may not be known a priori. RAND was leveraged by TOPRANK, which uses the estimates obtained by RAND to focus on a small subset of candidate points, evaluates the average distance of these points exactly, and picks the minimum of those. TOPRANK needs $O(n^{\frac{5}{3}} \log^{\frac{4}{3}} n)$ distance computations to find the exact medoid with high probability under a distributional assumption on the average distances. trimed presents an algorithm to find the medoid with $O(n^{\frac{3}{2}} 2^{\Theta(d)})$ distance evaluations under a distributional assumption on the points. The algorithm uses the triangle inequality to cut down the search space. Meddit leverages a connection of the medoid computation with multi-armed bandits and uses an upper-confidence-bound type of algorithm that takes $O(n \log n)$ distance evaluations under statistical assumptions on the points. Correlated Sequential Halving also leverages multi-armed bandit techniques, improving upon Meddit. By exploiting the correlation structure in the problem, the algorithm provably yields a drastic improvement (usually around 1-2 orders of magnitude) in both the number of distance computations needed and wall clock time.

== Implementations ==
Implementations of RAND, TOPRANK, trimed, Meddit, and Correlated Sequential Halving are available.

== Medoids in text and natural language processing (NLP) ==
Medoids can be applied to various text and NLP tasks to improve the efficiency and accuracy of analyses. By clustering text data based on similarity, medoids can help identify representative examples within the dataset, leading to better understanding and interpretation of the data.
=== Text clustering ===
Text clustering is the process of grouping similar text or documents together based on their content. Medoid-based clustering algorithms can be employed to partition large amounts of text into clusters, with each cluster represented by a medoid document. This technique helps in organizing, summarizing, and retrieving information from large collections of documents, such as in search engines, social media analytics and recommendation systems.

=== Text summarization ===
Text summarization aims to produce a concise and coherent summary of a larger text by extracting the most important and relevant information. Medoid-based clustering can be used to identify the most representative sentences in a document or a group of documents, which can then be combined to create a summary. This approach is especially useful for extractive summarization tasks, where the goal is to generate a summary by selecting the most relevant sentences from the original text.

=== Sentiment analysis ===
Sentiment analysis involves determining the sentiment or emotion expressed in a piece of text, such as positive, negative, or neutral. Medoid-based clustering can be applied to group text data based on similar sentiment patterns. By analyzing the medoid of each cluster, researchers can gain insights into the predominant sentiment of the cluster, helping in tasks such as opinion mining, customer feedback analysis, and social media monitoring.

=== Topic modeling ===
Topic modeling is a technique used to discover abstract topics that occur in a collection of documents. Medoid-based clustering can be applied to group documents with similar themes or topics. By analyzing the medoids of these clusters, researchers can gain an understanding of the underlying topics in the text corpus, facilitating tasks such as document categorization, trend analysis, and content recommendation.
=== Techniques for measuring text similarity in medoid-based clustering ===
When applying medoid-based clustering to text data, it is essential to choose an appropriate similarity measure to compare documents effectively. Each technique has its advantages and limitations, and the choice of the similarity measure should be based on the specific requirements and characteristics of the text data being analyzed. The following are common techniques for measuring text similarity in medoid-based clustering:

==== Cosine similarity ====
Cosine similarity is a widely used measure to compare the similarity between two pieces of text. It calculates the cosine of the angle between two document vectors in a high-dimensional space. Cosine similarity ranges between -1 and 1, where a value closer to 1 indicates higher similarity, and a value closer to -1 indicates lower similarity. By visualizing two lines originating from the origin and extending to the respective points of interest, and then measuring the angle between these lines, one can determine the similarity between the associated points. Cosine similarity is less affected by document length, so it may be better at producing medoids that are representative of the content of a cluster instead of the lengt

Agentive logic

Agentive logic (also called the logic of action or logic of agency) is the field of philosophical logic and logic in computer science that studies formal representations of agents, their actions, and their abilities. An agentive logic in the narrower sense is a formal system whose primitive operators express that an agent does something, can do something, or sees to it that something is the case. Agentive logics generalise modal logic by adding modalities indexed to agents and to actions. Typical examples include:
STIT logics (from sees to it that) with operators of the form $[i\ \mathsf{stit}: \varphi]$ meaning that agent $i$ sees to it that $\varphi$ holds;
dynamic logics of action with program-like modalities $[\alpha]\varphi$ and $\langle \alpha \rangle \varphi$ meaning, roughly, that after every (respectively, some) execution(s) of action $\alpha$, $\varphi$ holds;
logics with explicit agentive operators such as "can do", "brings about", or "is able to ensure".

Agentive logics are used in action theory in philosophy, in the semantics of natural language, in the theory of program verification, and in artificial intelligence, where they underpin formalisms for reasoning about actions, planning, and intelligent agents.

== Terminology and scope ==
The adjective agentive derives from the Latin agens ("one who acts") and originally referred to the grammatical agent of a verb. In logical contexts it designates operators or predicates whose primary argument position is an agent rather than a proposition alone, for example $A_i \varphi$ ("agent $i$ does $\varphi$") or $C_i \varphi$ ("agent $i$ can bring about $\varphi$").
In contemporary literature, agentive logic is sometimes used narrowly for formal reconstructions of St. Anselm's modal account of facere ("to do"). More broadly, the term is used interchangeably with logic of action or logic of agency to cover a family of modal and dynamic logics designed to capture the structure of action and choice.

== Historical background ==

=== Medieval and early modern roots ===
Medieval logicians already explored analogies between modalities of action and alethic modalities such as possibility and necessity, for instance, in discussions of obligation and power. An influential early agentive analysis is due to St. Anselm (11th century), who treated "doing $\varphi$" as a kind of modal operator on propositions, anticipating later modal logics of agency. Modern reconstructions of Anselm's theory show that the resulting "agentive logic" can be modelled with neighbourhood semantics and satisfies a recognisable square of opposition.

=== Modern logic of action ===
Modern study of the logic of action began in the mid-20th century, parallel to developments in deontic logic and tense logic. Early systems were proposed by Georg Henrik von Wright, Stig Kanger, and others, often motivated by questions about norms and responsibility. From the 1960s onward, two largely independent but eventually converging traditions emerged:
a branching-time tradition, culminating in STIT logics, emphasising agents' choices among possible futures; and
dynamic logics of programs and actions, developed within computer science to reason about program execution.

In the 1990s and 2000s, action logics were further developed in connection with knowledge representation, planning, and multi-agent systems in AI, and with dynamic and update semantics in linguistics.
== Core ideas ==
Despite their diversity, most agentive logics share some general themes:
Agents are treated as explicit indices of modal operators, as in $[i\ \mathsf{does}]\varphi$ or $C_i \varphi$.
Actions are represented either implicitly, via changes between possible worlds along an accessibility relation, or explicitly, as terms denoting primitive and composite actions.
Choice and ability are captured by modalities describing what an agent can ensure, usually relative to assumptions about the environment and other agents.
Formal properties such as closure under composition, interaction between different agents, and connections to obligation (what an agent ought to do) and knowledge (what an agent knows how to do) are investigated.

== STIT logics ==
STIT ("sees to it that") logics, originating in work by Nuel Belnap and collaborators, treat agency in a branching-time framework. A STIT model consists of a partially ordered set of moments with a tree-like structure, sets of histories (maximal branches through the tree), and for each agent at each moment, a partition of the histories through that moment representing the choices available to the agent. Intuitively, an agent's action at a moment determines which equivalence class (choice cell) of histories becomes actual; a formula $[i\ \mathsf{stit}: \varphi]$ is true at a history–moment pair if $\varphi$ holds on all histories in the choice cell corresponding to the agent's current action.
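The truth condition just stated can be illustrated with a small Python sketch for a single agent at a single moment. Everything here is a simplifying assumption made for illustration: histories are plain labels, the agent's choices are given as a list of sets (`partition`), and a formula is a predicate on histories at the fixed moment. The names `choice_cell` and `stit` are hypothetical.

```python
def choice_cell(partition, history):
    """Return the cell of the agent's choice partition that contains the given history."""
    for cell in partition:
        if history in cell:
            return cell
    raise ValueError("history does not pass through this moment")

def stit(partition, history, phi):
    """[i stit: phi] at (moment, history): phi holds on every history in the chosen cell."""
    return all(phi(h) for h in choice_cell(partition, history))

# Usage: three histories through one moment; the agent can choose {h1, h2} or {h3}
partition = [{"h1", "h2"}, {"h3"}]
door_open = lambda h: h in {"h1", "h2"}   # true whichever history in the first cell is actual
window_open = lambda h: h == "h1"         # depends on something the agent does not control

sees_to_door = stit(partition, "h1", door_open)
sees_to_window = stit(partition, "h1", window_open)
```

In this toy model the agent sees to it that the door is open, since the choice guarantees it, but does not see to it that the window is open, because the chosen cell also contains a history where that fails.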
Different STIT operators have been distinguished, notably:
the Chellas STIT operator, often written $[i\ \mathsf{cstit}: \varphi]$, which requires only that the agent's choice guarantees $\varphi$; and
the deliberative STIT operator, $[i\ \mathsf{dstit}: \varphi]$, which additionally requires that $\varphi$ is not already historically necessary.

STIT frameworks have been extended with group agency operators, temporal modalities, epistemic operators, and deontic operators to study responsibility, collective action, and obligations under indeterminism.

== Dynamic logics of action ==
Dynamic logic was originally developed to reason about the behaviour of computer programs, treating program execution as a kind of action. In propositional dynamic logic (PDL), action terms $\alpha, \beta, \dots$ denote abstract programs or actions, and formulas of the form $[\alpha]\varphi$ and $\langle \alpha \rangle \varphi$ express that all, respectively some, terminating executions of $\alpha$ lead to states where $\varphi$ holds. From the standpoint of agentive logic, dynamic logic provides:
a language for building complex actions from primitives via sequencing, choice, and iteration (e.g., $\alpha ; \beta$, $\alpha \cup \beta$, $\alpha^{*}$);
a Kripke semantics in which actions correspond to labelled accessibility relations; and
proof systems (such as Hoare logic and weakest precondition calculi) for reasoning about the correctness of action sequences.

Extensions such as concurrent dynamic logic add operators for parallel composition, allowing reasoning about interacting processes and concurrent actions. John-Jules Ch.
Meyer and others have argued that dynamic logic is a natural base for logics of agents, by adding modalities for knowledge, belief, and ability on top of the action modalities. Dynamic logics have also been applied to normative reasoning, yielding dynamic deontic logics where actions are related to obligations and permissions, and to dynamic epistemic logics in which information-changing actions such as announcements are modelled as programs.

== Situation calculus and other action formalisms ==
In artificial intelligence, reasoning about action and change is often based on first-order languages that explicitly represent situations, events, and fluents (time-varying properties). The best known is situation calculus, introduced by John McCarthy and developed extensively by Raymond Reiter. In such formalisms:
action terms name primitive actions;
a function symbol (often $\mathsf{do}$) maps an action and a situation to a successor situation; and
axioms describe which fluents hold in which situations and how actions change them.

Reiter's successor state axioms give compact specifications of how each fluent changes under all actions, and precondition axioms specify when actions are possible. Related formalisms include the event calculus and fluent calculus, which provide alternative ways of representing events and their effects. While these systems are often first-order rather than modal, they are closely related to agentive logics: their action terms and transition structures can be seen as providing models for dynamic or STIT-style modalities, and conversely, dynamic logics can be used as abstract specification languages for such AI formalisms.

== Ability, agency, and related modalities ==
Many agentive logics introduce explicit operators for ability or "can-do"

Grammatical evolution

Grammatical evolution (GE) is a genetic programming (GP) technique (or approach) from evolutionary computation pioneered by Conor Ryan, JJ Collins and Michael O'Neill in 1998 at the BDS Group in the University of Limerick. As in any other GP approach, the objective is to find an executable program, program fragment, or function, which will achieve a good fitness value for a given objective function. In most published work on GP, a LISP-style tree-structured expression is directly manipulated, whereas GE applies genetic operators to an integer string, subsequently mapped to a program (or similar) through the use of a grammar, which is typically expressed in Backus–Naur form. One of the benefits of GE is that this mapping simplifies the application of search to different programming languages and other structures.

== Problem addressed ==
In type-free, conventional Koza-style GP, the function set must meet the requirement of closure: all functions must be capable of accepting as their arguments the output of all other functions in the function set. Usually, this is implemented by dealing with a single data-type such as double-precision floating point. While modern Genetic Programming frameworks support typing, such type-systems have limitations that Grammatical Evolution does not suffer from.

== GE's solution ==
GE offers a solution to the single-type limitation by evolving solutions according to a user-specified grammar (usually a grammar in Backus–Naur form). Therefore, the search space can be restricted, and domain knowledge of the problem can be incorporated. The inspiration for this approach comes from a desire to separate the "genotype" from the "phenotype": in GP, the objects the search algorithm operates on and what the fitness evaluation function interprets are one and the same. In contrast, GE's "genotypes" are ordered lists of integers which code for selecting rules from the provided context-free grammar.
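The mapping from an integer genotype to a program can be sketched in Python as follows. The toy grammar, the function name `map_genotype`, the rule of taking each codon modulo the number of available productions, and the "wrapping" of the genome when it runs out of codons are conventions commonly used in GE implementations rather than details stated in the text above; for simplicity this sketch also consumes a codon at every non-terminal.

```python
# Hypothetical toy grammar: each non-terminal maps to a list of productions.
GRAMMAR = {
    "<expr>": [["<expr>", "<op>", "<expr>"], ["<var>"]],
    "<op>": [["+"], ["*"]],
    "<var>": [["x"], ["1"]],
}

def map_genotype(genome, grammar, start="<expr>", max_wraps=2):
    """Expand the leftmost non-terminal repeatedly, choosing productions from the genome.

    Returns the derived program as a string, or None if the codon budget
    (the genome read max_wraps + 1 times) is exhausted before the
    derivation finishes.
    """
    symbols = [start]
    output = []
    used = 0
    budget = len(genome) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in grammar:      # terminal: emit it
            output.append(symbol)
            continue
        if used >= budget:             # mapping failed: invalid individual
            return None
        rules = grammar[symbol]
        codon = genome[used % len(genome)]           # wrap around the genome
        symbols = list(rules[codon % len(rules)]) + symbols
        used += 1
    return " ".join(output)

# Usage: an integer list chosen by the search algorithm is decoded into an expression
program = map_genotype([0, 1, 0, 1, 1, 1], GRAMMAR)
```

Because the search only ever sees lists of integers, the grammar can be swapped to target a different language or to restrict the search space without changing the search algorithm.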
The phenotype, however, is the same as in Koza-style GP: a tree-like structure that is evaluated recursively. This model is more in line with how genetics works in nature, where there is a separation between an organism's genotype and the final expression of phenotype in proteins, etc. Separating genotype and phenotype allows a modular approach. In particular, the search portion of the GE paradigm need not be carried out by any one particular algorithm or method. Observe that the objects GE performs search on are the same as those used in genetic algorithms. This means, in principle, that any existing genetic algorithm package, such as the popular GAlib, can be used to carry out the search, and a developer implementing a GE system need only worry about carrying out the mapping from list of integers to program tree. It is also in principle possible to perform the search using some other method, such as particle swarm optimization (see the remark below); the modular nature of GE creates many opportunities for hybrids as the problem of interest to be solved dictates. Brabazon and O'Neill have successfully applied GE to predicting corporate bankruptcy, forecasting stock indices, bond credit ratings, and other financial applications. GE has also been used with a classic predator-prey model to explore the impact of parameters such as predator efficiency, niche number, and random mutations on ecological stability. It is possible to structure a GE grammar that for a given function/terminal set is equivalent to genetic programming.

== Criticism ==
Despite its successes, GE has been the subject of some criticism. One issue is that, as a result of its mapping operation, GE's genetic operators do not achieve high locality, which is a highly regarded property of genetic operators in evolutionary algorithms.

== Variants ==
Although GE was originally described in terms of using an Evolutionary Algorithm, specifically, a Genetic Algorithm, other variants exist.
For example, GE researchers have experimented with using particle swarm optimization to carry out the searching instead of genetic algorithms, with results comparable to those of normal GE; this is referred to as a "grammatical swarm". Using only the basic PSO model, it has been found that PSO is probably equally capable of carrying out the search process in GE as simple genetic algorithms are. (Although PSO is normally a floating-point search paradigm, it can be discretized, e.g., by simply rounding each vector to the nearest integer, for use with GE.) Yet another variation that has been experimented with in the literature is attempting to encode semantic information in the grammar in order to further bias the search process. Other work showed that, with biased grammars that leverage domain knowledge, even random search can be used to drive GE.

== Related work ==
GE was originally a combination of the linear representation as used by the Genetic Algorithm for Developing Software (GADS) and Backus–Naur form grammars, which were originally used in tree-based GP by Wong and Leung in 1995 and Whigham in 1996. Other related work noted in the original GE paper was that of Frederic Gruau, who used a conceptually similar "embryonic" approach, as well as that of Keller and Banzhaf, which similarly used linear genomes.

== Implementations ==
There are several implementations of GE. These include the following.