Chromosome (evolutionary algorithm)

Chromosome (evolutionary algorithm)

A chromosome or genotype in evolutionary algorithms (EA) is a set of parameters which define a proposed solution of the problem that the evolutionary algorithm is trying to solve. The set of all solutions, also called individuals according to the biological model, is known as the population. The genome of an individual consists of one, more rarely of several, chromosomes and corresponds to the genetic representation of the task to be solved. A chromosome is composed of a set of genes, where a gene consists of one or more semantically connected parameters, which are often also called decision variables. They determine one or more phenotypic characteristics of the individual or at least have an influence on them. In the basic form of genetic algorithms, the chromosome is represented as a binary string, while in later variants and in EAs in general, a wide variety of other data structures are used. == Chromosome design == When creating the genetic representation of a task, it is determined which decision variables and other degrees of freedom of the task should be improved by the EA and possible additional heuristics and how the genotype-phenotype mapping should look like. The design of a chromosome translates these considerations into concrete data structures for which an EA then has to be selected, configured, extended, or, in the worst case, created. Finding a suitable representation of the problem domain for a chromosome is an important consideration, as a good representation will make the search easier by limiting the search space; similarly, a poorer representation will allow a larger search space. In this context, suitable mutation and crossover operators must also be found or newly defined to fit the chosen chromosome design. An important requirement for these operators is that they not only allow all points in the search space to be reached in principle, but also make this as easy as possible. The following requirements must be met by a well-suited chromosome: It must allow the accessibility of all admissible points in the search space. Design of the chromosome in such a way that it covers only the search space and no additional areas. so that there is no redundancy or only as little redundancy as possible. Observance of strong causality: small changes in the chromosome should only lead to small changes in the phenotype. This is also called locality of the relationship between search and problem space. Designing the chromosome in such a way that it excludes prohibited regions in the search space completely or as much as possible. While the first requirement is indispensable, depending on the application and the EA used, one usually only has to be satisfied with fulfilling the remaining requirements as far as possible. The evolutionary search is supported and possibly considerably accelerated by a fulfillment as complete as possible. == Examples of chromosomes == === Chromosomes for binary codings === In their classical form, GAs use bit strings and map the decision variables to be optimized onto them. An example for one Boolean and three integer decision variables with the value ranges 0 ≤ D 1 ≤ 60 {\displaystyle 0\leq D_{1}\leq 60} , 28 ≤ D 2 ≤ 30 {\displaystyle 28\leq D_{2}\leq 30} and − 12 ≤ D 3 ≤ 14 {\displaystyle -12\leq D_{3}\leq 14} may illustrate this: Note that the negative number here is given in two's complement. This straight forward representation uses five bits to represent the three values of D 2 {\displaystyle D_{2}} , although two bits would suffice. This is a significant redundancy. An improved alternative, where 28 is to be added for the genotype-phenotype mapping, could look like this: with D 2 = 28 + D 2 ′ = 29 {\displaystyle D_{2}=28+D'_{2}=29} . === Chromosomes with real-valued or integer genes === For the processing of tasks with real-valued or mixed-integer decision variables, EAs such as the evolution strategy or the real-coded GAs are suited. In the case of mixed-integer values, rounding is often used, but this represents some violation of the redundancy requirement. If the necessary precisions of the real values can be reasonably narrowed down, this violation can be remedied by using integer-coded GAs. For this purpose, the valid digits of real values are mapped to integers by multiplication with a suitable factor. For example, 12.380 becomes the integer 12380 by multiplying by 1000. This must of course be taken into account in genotype-phenotype mapping for evaluation and result presentation. A common form is a chromosome consisting of a list or an array of integer or real values. === Chromosomes for permutations === Combinatorial problems are mainly concerned with finding an optimal sequence of a set of elementary items. As an example, consider the problem of the traveling salesman who wants to visit a given number of cities exactly once on the shortest possible tour. The simplest and most obvious mapping onto a chromosome is to number the cities consecutively, to interpret a resulting sequence as permutation and to store it directly in a chromosome, where one gene corresponds to the ordinal number of a city. Then, however, the variation operators may only change the gene order and not remove or duplicate any genes. The chromosome thus contains the path of a possible tour to the cities. As an example the sequence 3 , 5 , 7 , 1 , 4 , 2 , 9 , 6 , 8 {\displaystyle 3,5,7,1,4,2,9,6,8} of nine cities may serve, to which the following chromosome corresponds: In addition to this encoding frequently called path representation, there are several other ways of representing a permutation, for example the ordinal representation or the matrix representation. === Chromosomes for co-evolution === When a genetic representation contains, in addition to the decision variables, additional information that influences evolution and/or the mapping of the genotype to the phenotype and is itself subject to evolution, this is referred to as co-evolution. A typical example is the evolution strategy (ES), which includes one or more mutation step sizes as strategy parameters in each chromosome. Another example is an additional gene to control a selection heuristic for resource allocation in a scheduling tasks. This approach is based on the assumption that good solutions are based on an appropriate selection of strategy parameters or on control gene(s) that influences genotype-phenotype mapping. The success of the ES gives evidence to this assumption. === Chromosomes for complex representations === The chromosomes presented above are well suited for processing tasks of continuous, mixed-integer, pure-integer or combinatorial optimization. For a combination of these optimization areas, on the other hand, it becomes increasingly difficult to map them to simple strings of values, depending on the task. The following extension of the gene concept is proposed by the EA GLEAM (General Learning Evolutionary Algorithm and Method) for this purpose: A gene is considered to be the description of an element or elementary trait of the phenotype, which may have multiple parameters. For this purpose, gene types are defined that contain as many parameters of the appropriate data type as are required to describe the particular element of the phenotype. A chromosome now consists of genes as data objects of the gene types, whereby, depending on the application, each gene type occurs exactly once as a gene or can be contained in the chromosome any number of times. The latter leads to chromosomes of dynamic length, as they are required for some problems. The gene type definitions also contain information on the permissible value ranges of the gene parameters, which are observed during chromosome generation and by corresponding mutations, so they cannot lead to lethal mutations. For tasks with a combinatorial part, there are suitable genetic operators that can move or reposition genes as a whole, i.e. with their parameters. A scheduling task is used as an illustration, in which workflows are to be scheduled that require different numbers of heterogeneous resources. A workflow specifies which work steps can be processed in parallel and which have to be executed one after the other. In this context, heterogeneous resources mean different processing times at different costs in addition to different processing capabilities. Each scheduling operation therefore requires one or more parameters that determine the resource selection, where the value ranges of the parameters depend on the number of alternative resources available for each work step. A suitable chromosome provides one gene type per work step and in this case one corresponding gene, which has one parameter for each required resource. The order of genes determines the order of scheduling operations and, therefore, the precedence in case of allocation conflicts. The exemplary gene type definition of work step 15 with two resources, for which there are four and seven alternatives respectively

Coupled pattern learner

Coupled Pattern Learner (CPL) is a machine learning algorithm which couples the semi-supervised learning of categories and relations to forestall the problem of semantic drift associated with boot-strap learning methods. == Coupled Pattern Learner == Semi-supervised learning approaches using a small number of labeled examples with many unlabeled examples are usually unreliable as they produce an internally consistent, but incorrect set of extractions. CPL solves this problem by simultaneously learning classifiers for many different categories and relations in the presence of an ontology defining constraints that couple the training of these classifiers. It was introduced by Andrew Carlson, Justin Betteridge, Estevam R. Hruschka Jr. and Tom M. Mitchell in 2009. == CPL overview == CPL is an approach to semi-supervised learning that yields more accurate results by coupling the training of many information extractors. Basic idea behind CPL is that semi-supervised training of a single type of extractor such as ‘coach’ is much more difficult than simultaneously training many extractors that cover a variety of inter-related entity and relation types. Using prior knowledge about the relationships between these different entities and relations CPL makes unlabeled data as a useful constraint during training. For e.g., ‘coach(x)’ implies ‘person(x)’ and ‘not sport(x)’. == CPL description == === Coupling of predicates === CPL primarily relies on the notion of coupling the learning of multiple functions so as to constrain the semi-supervised learning problem. CPL constrains the learned function in two ways. Sharing among same-arity predicates according to logical relations Relation argument type-checking === Sharing among same-arity predicates === Each predicate P in the ontology has a list of other same-arity predicates with which P is mutually exclusive. If A is mutually exclusive with predicate B, A’s positive instances and patterns become negative instances and negative patterns for B. For example, if ‘city’, having an instance ‘Boston’ and a pattern ‘mayor of arg1’, is mutually exclusive with ‘scientist’, then ‘Boston’ and ‘mayor of arg1’ will become a negative instance and a negative pattern respectively for ‘scientist.’ Further, Some categories are declared to be a subset of another category. For e.g., ‘athlete’ is a subset of ‘person’. === Relation argument type-checking === This is a type checking information used to couple the learning of relations and categories. For example, the arguments of the ‘ceoOf’ relation are declared to be of the categories ‘person’ and ‘company’. CPL does not promote a pair of noun phrases as an instance of a relation unless the two noun phrases are classified as belonging to the correct argument types. === Algorithm description === Following is a quick summary of the CPL algorithm. Input: An ontology O, and a text corpus C Output: Trusted instances/patterns for each predicate for i=1,2,...,∞ do foreach predicate p in O do EXTRACT candidate instances/contextual patterns using recently promoted patterns/instances; FILTER candidates that violate coupling; RANK candidate instances/patterns; PROMOTE top candidates; end end ==== Inputs ==== A large corpus of Part-Of-Speech tagged sentences and an initial ontology with predefined categories, relations, mutually exclusive relationships between same-arity predicates, subset relationships between some categories, seed instances for all predicates, and seed patterns for the categories. ==== Candidate extraction ==== CPL finds new candidate instances by using newly promoted patterns to extract the noun phrases that co-occur with those patterns in the text corpus. CPL extracts, Category Instances Category Patterns Relation Instances Relation Patterns ==== Candidate filtering ==== Candidate instances and patterns are filtered to maintain high precision, and to avoid extremely specific patterns. An instance is only considered for assessment if it co-occurs with at least two promoted patterns in the text corpus, and if its co-occurrence count with all promoted patterns is at least three times greater than its co-occurrence count with negative patterns. ==== Candidate ranking ==== CPL ranks candidate instances using the number of promoted patterns that they co-occur with so that candidates that occur with more patterns are ranked higher. Patterns are ranked using an estimate of the precision of each pattern. ==== Candidate promotion ==== CPL ranks the candidates according to their assessment scores and promotes at most 100 instances and 5 patterns for each predicate. Instances and patterns are only promoted if they co-occur with at least two promoted patterns or instances, respectively. == Meta-Bootstrap Learner == Meta-Bootstrap Learner (MBL) was also proposed by the authors of CPL. Meta-Bootstrap learner couples the training of multiple extraction techniques with a multi-view constraint, which requires the extractors to agree. It makes addition of coupling constraints on top of existing extraction algorithms, while treating them as black boxes, feasible. MBL assumes that the errors made by different extraction techniques are independent. Following is a quick summary of MBL. Input: An ontology O, a set of extractors ε Output: Trusted instances for each predicate for i=1,2,...,∞ do foreach predicate p in O do foreach extractor e in ε do Extract new candidates for p using e with recently promoted instances; end FILTER candidates that violate mutual-exclusion or type-checking constraints; PROMOTE candidates that were extracted by all extractors; end end Subordinate algorithms used with MBL do not promote any instance on their own, they report the evidence about each candidate to MBL and MBL is responsible for promoting instances. == Applications == In their paper authors have presented results showing the potential of CPL to contribute new facts to existing repository of semantic knowledge, Freebase

Harvest now, decrypt later

Harvest now, decrypt later (HNDL) is a surveillance strategy that relies on the acquisition and long-term storage of currently unreadable encrypted data awaiting possible breakthroughs in decryption technology that would render it readable in the future—a hypothetical date referred to as Y2Q (a reference to Y2K), or Q-Day. The most common concern is the prospect of developments in quantum computing which would allow current strong encryption algorithms to be broken at some time in the future, making it possible to decrypt any stored material that had been encrypted using those algorithms. However, the improvement in decryption technology need not be due to a quantum-cryptographic advance; any other form of attack capable of enabling decryption would be sufficient. The existence of this strategy has led to concerns about the need to urgently deploy post-quantum cryptography; even though no practical quantum attacks yet exist, some data stored now may still remain sensitive even decades into the future. As of 2022, the U.S. federal government has proposed a roadmap for organizations to start migrating toward quantum-cryptography-resistant algorithms to mitigate these threats. This new version of Commercial National Security Algorithm Suite uses publicly-available algorithms and is allowed for government use up to the TOP SECRET level. == Terminology and scope == The term “harvest now, decrypt later” encompasses various surveillance or espionage operations in which ciphertext or encrypted communications are collected today with the view that they may one day be decrypted, given sufficient advances in computing power or cryptanalysis. The abbreviation HNDL is sometimes used in technical and policy documents. The “Y2Q” (or “Q-Day”) label draws an analogy to the Y2K date-change issue, emphasising a potential future point at which current cryptography may collapse. The strategy is particularly relevant for data with long confidentiality lifetimes, such as diplomatic communications, personal health records, critical infrastructure logs, or intellectual property. == Mitigation strategies == The primary defense against HNDL attacks is the transition to post-quantum cryptography (PQC), which utilizes algorithms believed to be secure against quantum computer attacks. However, because PQC protects the data payload digitally, rather than the transmission itself, the encrypted data can still be harvested and stored. A complementary approach involves physical layer security (also known as optical layer encryption or photonic shielding). Unlike algorithmic encryption, this method modifies the optical waveform itself—often by burying the signal within optical noise or using spectral phase encoding—to render the transmission unrecordable by standard receivers. By preventing the attacker from capturing a valid signal in the first place, this approach aims to eliminate the "harvest" phase of the threat. Commercial implementations of harvest-proof optical encryption have been developed by firms such as CyberRidge to secure long-haul fiber networks. Field trials have demonstrated 100 Gbps throughput over legacy DWDM networks using this method.

Yao's test

In cryptography and the theory of computation, Yao's test is a test defined by Andrew Chi-Chih Yao in 1982, against pseudo-random sequences. A sequence of words passes Yao's test if an attacker with reasonable computational power cannot distinguish it from a sequence generated uniformly at random. == Formal statement == === Boolean circuits === Let P {\displaystyle P} be a polynomial, and S = { S k } k {\displaystyle S=\{S_{k}\}_{k}} be a collection of sets S k {\displaystyle S_{k}} of P ( k ) {\displaystyle P(k)} -bit long sequences, and for each k {\displaystyle k} , let μ k {\displaystyle \mu _{k}} be a probability distribution on S k {\displaystyle S_{k}} , and P C {\displaystyle P_{C}} be a polynomial. A predicting collection C = { C k } {\displaystyle C=\{C_{k}\}} is a collection of boolean circuits of size less than P C ( k ) {\displaystyle P_{C}(k)} . Let p k , S C {\displaystyle p_{k,S}^{C}} be the probability that on input s {\displaystyle s} , a string randomly selected in S k {\displaystyle S_{k}} with probability μ ( s ) {\displaystyle \mu (s)} , C k ( s ) = 1 {\displaystyle C_{k}(s)=1} , i.e. Moreover, let p k , U C {\displaystyle p_{k,U}^{C}} be the probability that C k ( s ) = 1 {\displaystyle C_{k}(s)=1} on input s {\displaystyle s} a P ( k ) {\displaystyle P(k)} -bit long sequence selected uniformly at random in { 0 , 1 } P ( k ) {\displaystyle \{0,1\}^{P(k)}} . We say that S {\displaystyle S} passes Yao's test if for all predicting collection C {\displaystyle C} , for all but finitely many k {\displaystyle k} , for all polynomial Q {\displaystyle Q} : === Probabilistic formulation === As in the case of the next-bit test, the predicting collection used in the above definition can be replaced by a probabilistic Turing machine, working in polynomial time. This also yields a strictly stronger definition of Yao's test (see Adleman's theorem). Indeed, one could decide undecidable properties of the pseudo-random sequence with the non-uniform circuits described above, whereas BPP machines can always be simulated by exponential-time deterministic Turing machines.

Consistency (database systems)

In database systems, consistency (or correctness) refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This does not guarantee correctness of the transaction in all ways the application programmer might have wanted (that is the responsibility of application-level code) but merely that any programming errors cannot result in the violation of any defined database constraints. In a distributed system, referencing CAP theorem, consistency can also be understood as after a successful write, update or delete of a Record, any read request immediately receives the latest value of the Record. == As an ACID guarantee == Consistency is one of the four guarantees that define ACID transactions; however, significant ambiguity exists about the nature of this guarantee. It is defined variously as: The guarantee that database constraints are not violated, particularly once a transaction commits. The guarantee that any transactions started in the future necessarily see the effects of other transactions committed in the past. As these various definitions are not mutually exclusive, it is possible to design a system that guarantees "consistency" in every sense of the word, as most relational database management systems in common use today arguably do. == As a CAP trade-off == The CAP theorem is based on three trade-offs, one of which is "atomic consistency" (shortened to "consistency" for the acronym), about which the authors note, "Discussing atomic consistency is somewhat different than talking about an ACID database, as database consistency refers to transactions, while atomic consistency refers only to a property of a single request/response operation sequence. And it has a different meaning than the Atomic in ACID, as it subsumes the database notions of both Atomic and Consistent." In the CAP theorem, you can only have two of the following three properties: consistency, availability, or partition tolerance. Therefore, consistency may have to be traded off in some database systems.

SciPy

SciPy (pronounced "sigh pie") is a free and open-source Python library used for scientific computing and technical computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, fast Fourier transform, signal and image processing, ordinary differential equation solvers and other tasks common in science and engineering. SciPy is also a family of conferences for users and developers of these tools: SciPy (in the United States), EuroSciPy (in Europe) and SciPy.in (in India). Enthought originated the SciPy conference in the United States and continues to sponsor many of the international conferences as well as host the SciPy website. The SciPy library is currently distributed under the BSD license, and its development is sponsored and supported by an open community of developers. It is also supported by NumFOCUS, a community foundation for supporting reproducible and accessible science. == Components == The SciPy package is at the core of Python's scientific computing capabilities. Available sub-packages include: cluster: hierarchical clustering, vector quantization, K-means constants: physical constants and conversion factors datasets: various example datasets for demonstrating image and data processing differentiate: numerical differentiation for first and second derivatives fft: Discrete Fourier Transform algorithms fftpack: Legacy interface for Discrete Fourier Transforms integrate: numerical integration routines interpolate: interpolation tools io: data input and output, including support for MATLAB and Matrix Market files linalg: linear algebra routines ndimage: various functions for multi-dimensional image processing odr: orthogonal distance regression classes and algorithms optimize: optimization algorithms including linear programming and a variety of numerical nonlinear programming optimizers signal: signal processing tools sparse: sparse matrices and related algorithms spatial: algorithms for spatial structures such as k-d trees, nearest neighbors, convex hulls, etc. special: special functions stats: statistical functions == Data structures == The basic data structure used by SciPy is a multidimensional array provided by the NumPy module. NumPy provides some functions for linear algebra, Fourier transforms, and random number generation, but not with the generality of the equivalent functions in SciPy. NumPy can also be used as an efficient multidimensional container of data with arbitrary datatypes. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases. Older versions of SciPy used Numeric as an array type, which is now deprecated in favor of the newer NumPy array code. == History == In the 1990s, Python was extended to include an array type for numerical computing called Numeric. (This package was eventually replaced by NumPy, which was written by Travis Oliphant in 2006 as a blending of Numeric and Numarray, with Numarray itself being started in 2001.) As of 2000, there was a growing number of extension modules and increasing interest in creating a complete environment for scientific and technical computing. In 2001, Travis Oliphant, Eric Jones, and Pearu Peterson merged code they had written and called the resulting package SciPy. The newly created package provided a standard collection of common numerical operations on top of the Numeric array data structure. Shortly thereafter, Fernando Pérez released IPython, an enhanced interactive shell widely used in the technical computing community, and John Hunter released the first version of Matplotlib, the 2D plotting library for technical computing. Since then the SciPy environment has continued to grow with more packages and tools for technical computing. == Scientific Python versus ScientificPython == In the scientific literature, SciPy is occasionally referred to as "Scientific Python (SciPy)". This is incorrect: the official name of the project is just "SciPy". Furthermore, expanding "SciPy" as "Scientific Python" may cause confusion with "ScientificPython", a project led by Konrad Hinsen of Orléans University that was active between 1995 and 2014. "Scientific Python" is also used for the related ecosystem of tools.

Computer-aided software engineering

Computer-aided software engineering (CASE) is a domain of software tools used to design and implement applications. CASE tools are similar to and are partly inspired by computer-aided design (CAD) tools used for designing hardware products. CASE tools are intended to help develop high-quality, defect-free, and maintainable software. CASE software was often associated with methods for the development of information systems together with automated tools that could be used in the software development process. == History == The Information System Design and Optimization System (ISDOS) project, started in 1968 at the University of Michigan, initiated a great deal of interest in the whole concept of using computer systems to help analysts in the very difficult process of analysing requirements and developing systems. Several papers by Daniel Teichroew fired a whole generation of enthusiasts with the potential of automated systems development. His Problem Statement Language / Problem Statement Analyzer (PSL/PSA) tool was a CASE tool although it predated the term. Another major thread emerged as a logical extension to the data dictionary of a database. By extending the range of metadata held, the attributes of an application could be held within a dictionary and used at runtime. This "active dictionary" became the precursor to the more modern model-driven engineering capability. However, the active dictionary did not provide a graphical representation of any of the metadata. It was the linking of the concept of a dictionary holding analysts' metadata, as derived from the use of an integrated set of techniques, together with the graphical representation of such data that gave rise to the earlier versions of CASE. The next entrant into the market was Excelerator from Index Technology in Cambridge, Mass. While DesignAid ran on Convergent Technologies and later Burroughs Ngen networked microcomputers, Index launched Excelerator on the IBM PC/AT platform. While, at the time of launch, and for several years, the IBM platform did not support networking or a centralized database as did the Convergent Technologies or Burroughs machines, the allure of IBM was strong, and Excelerator came to prominence. Hot on the heels of Excelerator were a rash of offerings from companies such as Knowledgeware (James Martin, Fran Tarkenton and Don Addington), Texas Instrument's CA Gen and Andersen Consulting's FOUNDATION toolset (DESIGN/1, INSTALL/1, FCP). CASE tools were at their peak in the early 1990s. According to the PC Magazine of January 1990, over 100 companies were offering nearly 200 different CASE tools. At the time IBM had proposed AD/Cycle, which was an alliance of software vendors centered on IBM's Software repository using IBM DB2 in mainframe and OS/2: The application development tools can be from several sources: from IBM, from vendors, and from the customers themselves. IBM has entered into relationships with Bachman Information Systems, Index Technology Corporation, and Knowledgeware wherein selected products from these vendors will be marketed through an IBM complementary marketing program to provide offerings that will help to achieve complete life-cycle coverage. With the decline of the mainframe, AD/Cycle and the Big CASE tools died off, opening the market for the mainstream CASE tools of today. Many of the leaders of the CASE market of the early 1990s ended up being purchased by Computer Associates, including IEW, IEF, ADW, Cayenne, and Learmonth & Burchett Management Systems (LBMS). The other trend that led to the evolution of CASE tools was the rise of object-oriented methods and tools. Most of the various tool vendors added some support for object-oriented methods and tools. In addition new products arose that were designed from the bottom up to support the object-oriented approach. Andersen developed its project Eagle as an alternative to Foundation. Several of the thought leaders in object-oriented development each developed their own methodology and CASE tool set: Jacobson, Rumbaugh, Booch, etc. Eventually, these diverse tool sets and methods were consolidated via standards led by the Object Management Group (OMG). The OMG's Unified Modelling Language (UML) is currently widely accepted as the industry standard for object-oriented modeling. == CASE software == === Tools === CASE tools support specific tasks in the software development life-cycle. They can be divided into the following categories: Business and analysis modeling: Graphical modeling tools. E.g., E/R modeling, object modeling, etc. Development: Design and construction phases of the life-cycle. Debugging environments. E.g., IISE LKO. Verification and validation: Analyze code and specifications for correctness, performance, etc. Configuration management: Control the check-in and check-out of repository objects and files. E.g., SCCS, IISE. Metrics and measurement: Analyze code for complexity, modularity (e.g., no "go to's"), performance, etc. Project management: Manage project plans, task assignments, scheduling. Another common way to distinguish CASE tools is the distinction between Upper CASE and Lower CASE. Upper CASE Tools support business and analysis modeling. They support traditional diagrammatic languages such as ER diagrams, Data flow diagram, Structure charts, Decision Trees, Decision tables, etc. Lower CASE Tools support development activities, such as physical design, debugging, construction, testing, component integration, maintenance, and reverse engineering. All other activities span the entire life-cycle and apply equally to upper and lower CASE. === Workbenches === Workbenches integrate two or more CASE tools and support specific software-process activities. Hence they achieve: A homogeneous and consistent interface (presentation integration) Seamless integration of tools and toolchains (control and data integration) An example workbench is Microsoft's Visual Basic programming environment. It incorporates several development tools: a GUI builder, a smart code editor, debugger, etc. Most commercial CASE products tended to be such workbenches that seamlessly integrated two or more tools. Workbenches also can be classified in the same manner as tools; as focusing on Analysis, Development, Verification, etc. as well as being focused on the upper case, lower case, or processes such as configuration management that span the complete life-cycle. === Environments === An environment is a collection of CASE tools or workbenches that attempts to support the complete software process. This contrasts with tools that focus on one specific task or a specific part of the life-cycle. CASE environments are classified by Fuggetta as follows: Toolkits: Loosely coupled collections of tools. These typically build on operating system workbenches such as the Unix Programmer's Workbench or the VMS VAX set. They typically perform integration via piping or some other basic mechanism to share data and pass control. The strength of easy integration is also one of the drawbacks. Simple passing of parameters via technologies such as shell scripting can't provide the kind of sophisticated integration that a common repository database can. Fourth generation: These environments are also known as 4GL standing for fourth generation language environments due to the fact that the early environments were designed around specific languages such as Visual Basic. They were the first environments to provide deep integration of multiple tools. Typically these environments were focused on specific types of applications. For example, user-interface driven applications that did standard atomic transactions to a relational database. Examples are Informix 4GL, and Focus. Language-centered: Environments based on a single often object-oriented language such as the Symbolics Lisp Genera environment or VisualWorks Smalltalk from Parcplace. In these environments all the operating system resources were objects in the object-oriented language. This provides powerful debugging and graphical opportunities but the code developed is mostly limited to the specific language. For this reason, these environments were mostly a niche within CASE. Their use was mostly for prototyping and R&D projects. A common core idea for these environments was the model–view–controller user interface that facilitated keeping multiple presentations of the same design consistent with the underlying model. The MVC architecture was adopted by the other types of CASE environments as well as many of the applications that were built with them. Integrated: These environments are an example of what most IT people tend to think of first when they think of CASE. Environments such as IBM's AD/Cycle, Andersen Consulting's FOUNDATION, the ICL CADES system, and DEC Cohesion. These environments attempt to cover the complete life-cycle from analysis to maintenance and provide an integrated database repository for storing all artifacts of the software pr