Amália (LLM)

Amália is a Portuguese large language model (LLM) announced in November 2024 by the Portuguese Prime-Minister Luís Montenegro. Its final version is expected to be launched in 2026. It is being developed by Center for Responsible AI (Centro para a AI Responsável) and by the research centers of NOVA School of Science and Technology and Instituto Superior Técnico. == History == In 2024 it was announced that the Portuguese Agency for Administrative Modernization (Agência para a Modernização Administrativa) transpose this LLM to Portuguese Public Administration. According to Paulo Dimas (CEO of the Center for Responsible AI) the three fundamental points of this LLM project are the linguistic variant (European Portuguese), cultural representation and data protection. In April 2025 it was announced that Amália had entered beta phase with an improved version being expected to be launched in September 2025. The beta version released in September is available only to the Public Administration, but the website launched in October reiterates the final version will be an open model.

Schema-agnostic databases

Schema-agnostic databases or vocabulary-independent databases aim at supporting users to be abstracted from the representation of the data, supporting the automatic semantic matching between queries and databases. Schema-agnosticism is the property of a database of mapping a query issued with the user terminology and structure, automatically mapping it to the dataset vocabulary. The increase in the size and in the semantic heterogeneity of database schemas bring new requirements for users querying and searching structured data. At this scale it can become unfeasible for data consumers to be familiar with the representation of the data in order to query it. At the center of this discussion is the semantic gap between users and databases, which becomes more central as the scale and complexity of the data grows. == Description == The evolution of data environments towards the consumption of data from multiple data sources and the growth in the schema size, complexity, dynamicity and decentralisation (SCoDD) of schemas increases the complexity of contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications have a demand for more complete data, produced by independent data sources, under different semantic assumptions and contexts of use, which is the typical scenario for Semantic Web Data applications. The evolution of databases in the direction of heterogeneous data environments strongly impacts the usability, semiotics and semantic assumptions behind existing data accessibility methods such as structured queries, keyword-based search and visual query systems. With schema-less databases containing potentially millions of dynamically changing attributes, it becomes unfeasible for some users to become aware of the 'schema' or vocabulary in order to query the database. At this scale, the effort in understanding the schema in order to build a structured query can become prohibitive. == Schema-agnostic queries == Schema-agnostic queries can be defined as query approaches over structured databases which allow users satisfying complex information needs without the understanding of the representation (schema) of the database. Similarly, Tran et al. defines it as "search approaches, which do not require users to know the schema underlying the data". Approaches such as keyword-based search over databases allow users to query databases without employing structured queries. However, as discussed by Tran et al.: "From these points, users however have to do further navigation and exploration to address complex information needs. Unlike keyword search used on the Web, which focuses on simple needs, the keyword search elaborated here is used to obtain more complex results. Instead of a single set of resources, the goal is to compute complex sets of resources and their relations." The development of approaches to support natural language interfaces (NLI) over databases have aimed towards the goal of schema-agnostic queries. Complementarily, some approaches based on keyword search have targeted keyword-based queries which express more complex information needs. Other approaches have explored the construction of structured queries over databases where schema constraints can be relaxed. All these approaches (natural language, keyword-based search and structured queries) have targeted different degrees of sophistication in addressing the problem of supporting a flexible semantic matching between queries and data, which vary from the completely absence of the semantic concern to more principled semantic models. While the demand for schema-agnosticism has been an implicit requirement across semantic search and natural language query systems over structured data, it is not sufficiently individuated as a concept and as a necessary requirement for contemporary database management systems. Recent works have started to define and model the semantic aspects involved on schema-agnostic queries. === Schema-agnostic structured queries === Consist of schema-agnostic queries following the syntax of a structured standard (for example SQL, SPARQL). The syntax and semantics of operators are maintained, while different terminologies are used. ==== Example 1 ==== SELECT ?y { BillClinton hasDaughter ?x . ?x marriedTo ?y . } which maps to the following SPARQL query in the dataset vocabulary: ==== Example 2 ==== which maps to the following SPARQL query in the dataset vocabulary: === Schema-agnostic keyword queries === Consist of schema-agnostic queries using keyword queries. In this case the syntax and semantics of operators are different from the structured query syntax. ==== Example ==== "Bill Clinton daughter married to" "Books by William Goldman with more than 300 pages" == Semantic complexity == As of 2016 the concept of schema-agnostic queries has been developed primarily in academia. Most of schema-agnostic query systems have been investigated in the context of Natural Language Interfaces over databases or over the Semantic Web. These works explore the application of semantic parsing techniques over large, heterogeneous and schema-less databases. More recently, the individuation of the concept of schema-agnostic query systems and databases have appeared more explicitly within the literature. Freitas et al. provide a probabilistic model on the semantic complexity of mapping schema-agnostic queries.

Morphobank

MorphoBank is a web application for collaborative evolutionary research, specifically phylogenetic systematics or cladistics, on the phenotype. Historically, scientists conducting research on phylogenetic systematics have worked individually or in small groups employing traditional single-user software applications such as MacClade, Mesquite and Nexus Data Editor. As the hypotheses under study have grown more complex, large research teams have assembled to tackle the problem of discovering the Tree of Life for the estimated 4-100 million living species(Wilson 2003, pp. 77–80) and the many thousands more extinct species known from fossils. Because the phenotype is fundamentally visual, and as phenotype-based phylogenetic studies have continued to increase in size, it becomes important that observations be backed up by labeled images. Traditional desktop software applications currently in wide use do not provide robust support for team-based research or for image manipulation and storage. MorphoBank is a particularly important tool for the growing scientific field of phenomics. The development of MorphoBank, which began in 2001, has been funded by the National Science Foundation's Directorates for Geosciences, Biological Sciences and Computer and Information Science and Engineering. The significance of the scientific work on MorphoBank has been featured in the New York Times(here and here), among other publications. == Advantages == Teams of scientists studying phylogenetics to build the Tree of Life assemble large spreadsheets of observations about species (referred to as "matrices"). These teams require simultaneous access by each team member to a single and secure copy of the team's data during a scientific research project. This single copy of the data also changes with great frequency during the data collection phase. Images that can be very helpful for documenting homology statements must be displayed, labeled and shared as homology statements develop. This cannot be accomplished elegantly with a desktop software package alone because in a desktop environment each collaborator is working on his own private copy of project data. Changes made by one participant cannot automatically propagate to others, preventing collaborators from seeing each other's data edits until they are manually (and due to the effort involved, often only periodically) merged into a single "true" dataset. In all but the smallest and most disciplined of teams, file version control and the reconciliation of changes made on multiple copies of the data emerge quickly as significant drags on productivity. MorphoBank is an attempt to address these issues by leveraging the ubiquity of the web and modern web-based application techniques, including Ajax, web service layers, and rich web applications to provide a full-featured, net-accessible collaborative workspace for phylogenetic research. In particular, MorphoBank makes it easy to: Share all kinds of data with geographically separated team members, including taxonomy, character and specimen data, media (including images, video and audio), phylogenetic matrices (including data in the widely used NEXUS and TNT format) and other data such as documents and genetic sequences. Label high-resolution images using a web-based image annotation application. Collaboratively edit project data such as phylogenetic matrices using a built-in web-based matrix editor. The editor allows the linking of labeled images to individual cells of a matrix. Manage access to project data. Access ranges from full-access for team members to anonymous read-only access for potential reviewers. Publish completed project data on the web in support of a published paper with a persistent URL. Search The Encyclopedia of Life for taxon exemplar images. Store high resolution CT data Create ontologies for updating and populating matrix cells. These tasks are difficult or impossible in most existing software applications. == History == In 2001 the National Science Foundation (NSF) sponsored a workshop, at the American Museum of Natural History in New York to develop the outlines of a web-based system for a collaborative, media-rich research tool for morphological phylogenetics. An application prototype presented at the workshop was later refined with feedback from the workshop and became MorphoBank version 1.0. A grant from the US National Oceanic and Atmospheric Administration funded further revisions resulting in version 2.0, released in 2005. Current support from the NSF is funding current feature enhancements to MorphoBank. MorphoBank was hosted by Stony Brook University until late October 2021 and received back up support from the American Museum of Natural History. The current version is 3.0. Rationale for the software was described in the journal Cladistics. MorphoBank has also received support from NESCENT and the San Diego Supercomputer Center. Since 2018, MorphoBank has been supported in part by Phoenix Bioinformatics, a non-profit company founded to sustain databases for the basic sciences. A permanent move of MorphoBank from Stony Brook University to Phoenix Bioinformatics was complete in late October 2021. The San Diego Supercomputer Center has previously provided technical and hosting resources to the MorphoBank project. == Usage == MorphoBank hosts the products of peer-reviewed scientific research on phenotypes. An increasing volume of systematics data is "born digital" and MorphoBank is well suited to handle this type of material. On August 24, 2007, 62 active research projects were hosted by MorphoBank, as well as 6 completed (and published) projects. By 2017 over 2000 scientists and their students were registered content builders (users are not required to register and are even more numerous) and has more than 500 publicly available projects with approximately 80,000 images that are the products of scientific research. Over 1,500 active research projects are hosted by MorphoBank. The software has been used to assemble phylogenetic research on such groups as mammals, from bats to whales, bivalve molluscs, arachnids, fossil plants and living and extinct amniotes. It has also been used more broadly in evolutionary and paleontological research to host curated images associated with published research on lacewing insects geckos, raptor birds, dinosaurs, frogs and nematodes. MorphoBank is increasingly used in conjunction with the Paleobiology Database. Example published projects: Project 1097: Blank CE, 2013 Origin and early evolution of photosynthetic eukaryotes in freshwater environments – reinterpreting proterozoic paleobiology and biogeochemical processes in light of trait evolution Project 2520: Carvalho, T. P., R. E. Reis, and J. P. Friel, 2017 A new species of Hoplomyzon (Siluriformes: Aspredinidae) from Maracaibo Basin, Venezuela: osteological description using high-resolution Project 2651: Baron, M. G., Norman, D. B., Barrett, P. M., 2017 A new hypothesis of dinosaur relationships and early dinosaur evolution MorphoBank has been particularly important to the Assembling the Tree of Life initiative sponsored by the National Science Foundation. MorphoBank is well-suited to such projects because of its tools for merging taxonomic, character and matrix-based data, as well as its collaborative features. Highlights of this research include a collaborative matrix on mammal evolution published in Science that included over 4,000 phenomic characters scored for over 80 species, a matrix on extant baleen whales featuring nearly 600 images, and more.

Brownout (software engineering)

Brownout in software engineering is a technique that involves disabling certain features of an application. == Description == Brownout is used to increase the robustness of an application to computing capacity shortage. If too many users are simultaneously accessing an application hosted online, the underlying computing infrastructure may become overloaded, rendering the application unresponsive. Users are likely to abandon the application and switch to competing alternatives, hence incurring long-term revenue loss. To better deal with such a situation, the application can be given brownout capabilities: The application will disable certain features – e.g., an online shop will no longer display recommendations of related products – to avoid overload. Although reducing features generally has a negative impact on the short-term revenue of the application owner, long-term revenue loss can be avoided. The technique is inspired by brownouts in power grids, which consists in reducing the power grid's voltage in case electricity demand exceeds production. Some consumers, such as incandescent light bulbs, will dim – hence originating the term – and draw less power, thus helping match demand with production. Similarly, a brownout application helps match its computing capacity requirements to what is available on the target infrastructure. Brownout complements elasticity. The former can help the application withstand short-term capacity shortage, but does so without changing the capacity available to the application. In contrast, elasticity consists of adding (or removing) capacity to the application, preferably in advance, so as to avoid capacity shortage altogether. The two techniques can be combined; e.g., brownout is triggered when the number of users increases unexpectedly until elasticity can be triggered, the latter usually requiring minutes to show an effect. Brownout is relatively non-intrusive for the developer, for example, it can be implemented as an advice in aspect-oriented programming. However, surrounding components, such as load-balancers, need to be made brownout-aware to distinguish between cases where an application is running normally and cases where the application maintains a low response time by triggering brownout. == Usage in phased deprecation == A related use of the brownout concept in software engineering is the deliberate introduction of temporary outages to a system, API or feature that is being phased out. This is sometimes also called a "scream test" when it is used to discover unknown dependents of a system or API. The intention is to allow detection of downstream consumers of an API or service who may otherwise have missed deprecation announcements or to uncover hidden side-effects of the deprecation that may have been overlooked. The intention is that developers of dependent systems will notice their own system failures caused by the upstream brownout. Such brownouts are typically pre-announced scheduled outages or probabilistic in nature (such as artificially failing a percentage of requests). As a brownout is only a temporary or partial outage, it provides downstream consumers of an API or service time to remove any discovered dependencies on the deprecated API before it is fully retired. For consumers that have already prepared for the deprecation, a brownout provides valuable testing that the final removal of the service won't cause any unexpected problems.

Software development process

A software development process prescribes a process for developing software. It typically divides an overall effort into smaller steps or sub-processes that are intended to ensure high-quality results. The process may describe specific deliverables – artifacts to be created and completed. Although not strictly limited to it, software development process often refers to the high-level process that governs the development of a software system from its beginning to its end of life – known as a methodology, model or framework. The system development life cycle (SDLC) describes the typical phases that a development effort goes through from the beginning to the end of life for a system – including a software system. A methodology prescribes how engineers go about their work in order to move the system through its life cycle. A methodology is a classification of processes or a blueprint for a process that is devised for the SDLC. For example, many processes can be classified as a spiral model. Software process and software quality are closely interrelated; some unexpected facets and effects have been observed in practice. == Methodology == The SDLC drives the definition of a methodology in that a methodology must address the phases of the SDLC. Generally, a methodology is designed to result in a high-quality system that meets or exceeds expectations (requirements) and is delivered on time and within budget even though computer systems can be complex and integrate disparate components. Various methodologies have been devised, including waterfall, spiral, agile, rapid prototyping, incremental, and synchronize and stabilize. A major difference between methodologies is the degree to which the phases are sequential vs. iterative. Agile methodologies, such as XP and scrum, focus on lightweight processes that allow for rapid changes. Iterative methodologies, such as Rational Unified Process and dynamic systems development method, focus on stabilizing project scope and iteratively expanding or improving products. Sequential or big-design-up-front (BDUF) models, such as waterfall, focus on complete and correct planning to guide larger projects and limit risks to successful and predictable results. Anamorphic development is guided by project scope and adaptive iterations. In scrum, for example, one could say a single user story goes through all the phases of the SDLC within a two-week sprint. By contrast the waterfall methodology, where every business requirement is translated into feature/functional descriptions which are then all implemented typically over a period of months or longer. A project can include both a project life cycle (PLC) and an SDLC, which describe different activities. According to Taylor (2004), "the project life cycle encompasses all the activities of the project, while the systems development life cycle focuses on realizing the product requirements". === History === The term SDLC is often used as an abbreviated version of SDLC methodology. Further, some use SDLC and traditional SDLC to mean the waterfall methodology. According to Elliott (2004), SDLC "originated in the 1960s, to develop large scale functional business systems in an age of large scale business conglomerates. Information systems activities revolved around heavy data processing and number crunching routines". The structured systems analysis and design method (SSADM) was produced for the UK government Office of Government Commerce in the 1980s. Ever since, according to Elliott (2004), "the traditional life cycle approaches to systems development have been increasingly replaced with alternative approaches and frameworks, which attempted to overcome some of the inherent deficiencies of the traditional SDLC". The main idea of the SDLC has been "to pursue the development of information systems in a very deliberate, structured and methodical way, requiring each stage of the life cycle––from the inception of the idea to delivery of the final system––to be carried out rigidly and sequentially" within the context of the framework being applied. Other methodologies were devised later: 1970s Structured programming since 1969 Cap Gemini SDM, originally from PANDATA, the first English translation was published in 1974. SDM stands for System Development Methodology 1980s Structured systems analysis and design method (SSADM) from 1980 onwards Information Requirement Analysis/Soft systems methodology 1990s Object-oriented programming (OOP) developed in the early 1960s and became a dominant programming approach during the mid-1990s Rapid application development (RAD), since 1991 Dynamic systems development method (DSDM), since 1994 Scrum, since 1995 Team software process, since 1998 Rational Unified Process (RUP), maintained by IBM since 1998 Extreme programming, since 1999 2000s Agile Unified Process (AUP) maintained since 2005 by Scott Ambler Disciplined agile delivery (DAD) Supersedes AUP 2010s Scaled Agile Framework (SAFe) Large-Scale Scrum (LeSS) DevOps Since DSDM in 1994, all of the methodologies on the above list except RUP have been agile methodologies - yet many organizations, especially governments, still use pre-agile processes (often waterfall or similar). === Examples === The following are notable methodologies somewhat ordered by popularity. Agile Agile software development refers to a group of frameworks based on iterative development, where requirements and solutions evolve via collaboration between self-organizing cross-functional teams. The term was coined in the year 2001 when the Agile Manifesto was formulated. Waterfall The waterfall model is a sequential development approach, in which development flows one-way (like a waterfall) through the SDLC phases. Spiral In 1988, Barry Boehm published a software system development spiral model, which combines key aspects of the waterfall model and rapid prototyping, in an effort to combine advantages of top-down and bottom-up concepts. It emphases a key area many felt had been neglected by other methodologies: deliberate iterative risk analysis, particularly suited to large-scale complex systems. Incremental Various methods combine linear and iterative methodologies, with the primary objective of reducing inherent project risk by breaking a project into smaller segments and providing more ease-of-change during the development process. Prototyping Software prototyping is about creating prototypes, i.e. incomplete versions of the software program being developed. Rapid Rapid application development (RAD) is a methodology which favors iterative development and the rapid construction of prototypes instead of large amounts of up-front planning. The "planning" of software developed using RAD is interleaved with writing the software itself. The lack of extensive pre-planning generally allows software to be written much faster and makes it easier to change requirements. Shape Up Shape Up is a software development approach introduced by Basecamp in 2018. It is a set of principles and techniques that Basecamp developed internally to overcome the problem of projects dragging on with no clear end. Its primary target audience is remote teams. Shape Up has no estimation and velocity tracking, backlogs, or sprints, unlike waterfall, agile, or scrum. Instead, those concepts are replaced with appetite, betting, and cycles. As of 2022, besides Basecamp, notable organizations that have adopted Shape Up include UserVoice and Block. Chaos Chaos model has one main rule: always resolve the most important issue first. Incremental funding Incremental funding methodology - an iterative approach. Lightweight Lightweight methodology - a general term for methods that only have a few rules and practices. Structured systems analysis and design Structured systems analysis and design method - a specific version of waterfall. Slow programming As part of the larger slow movement, emphasizes careful and gradual work without (or minimal) time pressures. Slow programming aims to avoid bugs and overly quick release schedules. V-Model V-Model (software development) - an extension of the waterfall model. Unified Process Unified Process (UP) is an iterative software development methodology framework, based on Unified Modeling Language (UML). UP organizes the development of software into four phases, each consisting of one or more executable iterations of the software at that stage of development: inception, elaboration, construction, and guidelines. === Comparison === The waterfall model describes the SDLC phases such that each builds on the result of the previous one. Not every project requires that the phases be sequential. For relatively simple projects, phases may be combined or overlapping. Alternative methodologies to waterfall are described and compared below. == Process meta-models == Some process models are abstract descriptions for evaluating, comparing, and improving the specific process adopted by an organization. ISO/IEC 12207 ISO/IEC 12207 i

Semantic interpretation

Semantic interpretation is an important component in dialog systems. It is related to natural language understanding, but mostly it refers to the last stage of understanding. The goal of interpretation is binding the user utterance to concept, or something the system can understand. Typically it is creating a database query based on user utterance.

List of Haskell software and tools

This is a list of Haskell software and tools, including compilers, interpreters, build tools, package managers, integrated development environments, libraries, and other development utilities. == Compilers, interpreters and editors == Emacs — text editor Glasgow Haskell Compiler (GHC) Hugs — bytecode interpreter (discontinued) IntelliJ IDEA — IDE with Haskell support via plugins Vim — text editor Visual Studio Code — editor/IDE with Haskell support via extensions == Libraries and frameworks == Parsec — parser combinator library Servant — web framework Yesod — web framework == Build tools and package management == Cabal — build system and packaging infrastructure Haskell Platform — bundled distribution of Haskell tools and libraries (deprecated) Stack — build tool and dependency manager == Language tools and static analysis == Fourmolu — code formatter based on Ormolu Haskell Language Server — implementation of the Language Server Protocol for Haskell HLint — source code suggestion and linting tool Hoogle — Haskell API search engine Ormolu — code formatter Stan — static analysis tool Stylish Haskell — source code formatter == Interactive environments == GHCi — interactive REPL for the Glasgow Haskell Compiler IHaskell — Jupyter kernel for Haskell == Debugging and profiling tools == hp2ps — heap profiling visualization tool ThreadScope — parallel execution visualizer for Haskell programs == Documentation generators == Haddock — API documentation generator for Haskell == Parser and lexer generators == Alex — lexer generator for Haskell Happy — parser generator for Haskell == Testing frameworks == HUnit — unit testing framework QuickCheck — property-based testing library == Version control == Darcs — distributed version control system written in Haskell