A web API is an application programming interface (API) for either a web server or a web browser. As a web development concept, it can be related to a web application's client side (including any web frameworks being used). A server-side web API consists of one or more publicly exposed endpoints to a defined request–response message system, typically expressed in JSON or XML by means of an HTTP-based web server. A server API (SAPI) is not considered a server-side web API, unless it is publicly accessible by a remote web application. == Client side == A client-side web API is a programmatic interface to extend functionality within a web browser or other HTTP client. Originally these were most commonly in the form of native plug-in browser extensions however most newer ones target standardized JavaScript bindings. The Mozilla Foundation created their WebAPI specification which is designed to help replace native mobile applications with HTML5 applications. Google created their Native Client architecture which is designed to help replace insecure native plug-ins with secure native sandboxed extensions and applications. They have also made this portable by employing a modified LLVM AOT compiler. == Server side == A server-side web API consists of one or more publicly exposed endpoints to a defined request–response message system, typically expressed in JSON or XML. The web API is exposed most commonly by means of an HTTP-based web server. Mashups are web applications which combine the use of multiple server-side web APIs. Webhooks are server-side web APIs that take input as a Uniform Resource Identifier (URI) that is designed to be used like a remote named pipe or a type of callback such that the server acts as a client to dereference the provided URI and trigger an event on another server which handles this event thus providing a type of peer-to-peer IPC. === Endpoints === Endpoints are important aspects of interacting with server-side web APIs, as they specify where resources can be accessed by third-party software. Usually the access is via a URI to which HTTP requests are posted, and from which the response is thus expected. Web APIs may be public or private, the latter of which requires an access token. Endpoints need to be static, otherwise the correct functioning of software that interacts with them cannot be guaranteed. If the location of a resource changes (and with it the endpoint) then previously written software will break, as the required resource can no longer be found at the same place. As API providers still want to update their web APIs, many have introduced a versioning system in the URI that points to an endpoint. === Resources versus services === Web 2.0 Web APIs often use machine-based interactions such as REST and SOAP. RESTful web APIs use HTTP methods to access resources via URL-encoded parameters, and use JSON or XML to transmit data. By contrast, SOAP protocols are standardized by the W3C and mandate the use of XML as the payload format, typically over HTTP. Furthermore, SOAP-based Web APIs use XML validation to ensure structural message integrity, by leveraging the XML schemas provisioned with WSDL documents. A WSDL document accurately defines the XML messages and transport bindings of a Web service. === Documentation === Server-side web APIs are interfaces for the outside world to interact with the business logic. For many companies this internal business logic and the intellectual property associated with it are what distinguishes them from other companies, and potentially what gives them a competitive edge. They do not want this information to be exposed. However, in order to provide a web API of high quality, there needs to be a sufficient level of documentation. One API provider that not only provides documentation, but also links to it in its error messages is Twilio. However, there are now directories of popular documented server-side web APIs. === Growth and impact === The number of available web APIs has grown consistently over the past years, as businesses realize the growth opportunities associated with running an open platform, that any developer can interact with. ProgrammableWeb tracks over 24000 Web APIs that were available in 2022, up from 105 in 2005. Web APIs have become ubiquitous. There are few major software applications/services that do not offer some form of web API. One of the most common forms of interacting with these web APIs is via embedding external resources, such as tweets, Facebook comments, YouTube videos, etc. In fact there are very successful companies, such as Disqus, whose main service is to provide embeddable tools, such as a feature-rich comment system. Any website of the TOP 100 Alexa Internet ranked websites uses APIs and/or provides its own APIs, which is a very distinct indicator for the prodigious scale and impact of web APIs as a whole. As the number of available web APIs has grown, open source tools have been developed to provide more sophisticated search and discovery. APIs.json provides a machine-readable description of an API and its operations, and the related project APIs.io offers a searchable public listing of APIs based on the APIs.json metadata format. === Business === ==== Commercial ==== Many companies and organizations rely heavily on their Web API infrastructure to serve their core business clients. In 2014 Netflix received around 5 billion API requests, most of them within their private API. ==== Governmental ==== Many governments collect a lot of data, and some governments are now opening up access to this data. The interfaces through which this data is typically made accessible are web APIs. Web APIs allow for data, such as "budget, public works, crime, legal, and other agency data" to be accessed by any developer in a convenient manner. == Example == An example of a popular web API is the Astronomy Picture of the Day API operated by the American space agency NASA. It is a server-side API used to retrieve photographs of space or other images of interest to astronomers, and metadata about the images. According to the API documentation, the API has one endpoint: https://api.nasa.gov/planetary/apod The documentation states that this endpoint accepts GET requests. It requires one piece of information from the user, an API key, and accepts several other optional pieces of information. Such pieces of information are known as parameters. The parameters for this API are written in a format known as a query string, which is separated by a question mark character (?) from the endpoint. An ampersand (&) separates the parameters in the query string from each other. Together, the endpoint and the query string form a URL that determines how the API will respond. This URL is also known as a query or an API call. In the below example, two parameters are transmitted (or passed) to the API via the query string. The first is the required API key and the second is an optional parameter — the date of the photograph requested. https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY&date=1996-12-03 Visiting the above URL in a web browser will initiate a GET request, calling the API and showing the user a result, known as a return value or as a return. This API returns JSON, a type of data format intended to be understood by computers, but which is somewhat easy for a human to read as well. In this case, the JSON contains information about a photograph of a white dwarf star: The above API return has been reformatted so that names of JSON data items, known as keys, appear at the start of each line. The last of these keys, named url, indicates a URL which points to a photograph: https://apod.nasa.gov/apod/image/9612/ngc2440_hst2.jpg Following the above URL, a web browser user would see this photo: Although this API can be called by an end user with a web browser (as in this example) it is intended to be called automatically by software or by computer programmers while writing software. JSON is intended to be parsed by a computer program, which would extract the URL of the photograph and the other metadata. The resulting photo could be embedded in a website, automatically sent via text message, or used for any other purpose envisioned by a software developer.
Noisy text analytics
Noisy text analytics is a process of information extraction whose goal is to automatically extract structured or semistructured information from noisy unstructured text data. While Text analytics is a growing and mature field that has great value because of the huge amounts of data being produced, processing of noisy text is gaining in importance because a lot of common applications produce noisy text data. Noisy unstructured text data is found in informal settings such as online chat, text messages, e-mails, message boards, newsgroups, blogs, wikis and web pages. Also, text produced by processing spontaneous speech using automatic speech recognition and printed or handwritten text using optical character recognition contains processing noise. Text produced under such circumstances is typically highly noisy containing spelling errors, abbreviations, non-standard words, false starts, repetitions, missing punctuations, missing letter case information, pause filling words such as “um” and “uh” and other texting and speech disfluencies. Such text can be seen in large amounts in contact centers, chat rooms, optical character recognition (OCR) of text documents, short message service (SMS) text, etc. Documents with historical language can also be considered noisy with respect to today's knowledge about the language. Such text contains important historical, religious, ancient medical knowledge that is useful. The nature of the noisy text produced in all these contexts warrants moving beyond traditional text analysis techniques. == Techniques for noisy text analysis == Missing punctuation and the use of non-standard words can often hinder standard natural language processing tools such as part-of-speech tagging and parsing. Techniques to both learn from the noisy data and then to be able to process the noisy data are only now being developed. == Possible source of noisy text == World Wide Web: Poorly written text is found in web pages, online chat, blogs, wikis, discussion forums, newsgroups. Most of these data are unstructured and the style of writing is very different from, say, well-written news articles. Analysis for the web data is important because they are sources for market buzz analysis, market review, trend estimation, etc. Also, because of the large amount of data, it is necessary to find efficient methods of information extraction, classification, automatic summarization and analysis of these data. Contact centers: This is a general term for help desks, information lines and customer service centers operating in domains ranging from computer sales and support to mobile phones to apparels. On an average a person in the developed world interacts at least once a week with a contact center agent. A typical contact center agent handles over a hundred calls per day. They operate in various modes such as voice, online chat and E-mail. The contact center industry produces gigabytes of data in the form of E-mails, chat logs, voice conversation transcriptions, customer feedback, etc. A bulk of the contact center data is voice conversations. Transcription of these using state of the art automatic speech recognition results in text with 30-40% word error rate. Further, even written modes of communication like online chat between customers and agents and even the interactions over email tend to be noisy. Analysis of contact center data is essential for customer relationship management, customer satisfaction analysis, call modeling, customer profiling, agent profiling, etc., and it requires sophisticated techniques to handle poorly written text. Printed Documents: Many libraries, government organizations and national defence organizations have vast repositories of hard copy documents. To retrieve and process the content from such documents, they need to be processed using Optical Character Recognition. In addition to printed text, these documents may also contain handwritten annotations. OCRed text can be highly noisy depending on the font size, quality of the print etc. It can range from 2-3% word error rates to as high as 50-60% word error rates. Handwritten annotations can be particularly hard to decipher, and error rates can be quite high in their presence. Short Messaging Service (SMS): Language usage over computer mediated discourses, like chats, emails and SMS texts, significantly differs from the standard form of the language. An urge towards shorter message length facilitating faster typing and the need for semantic clarity, shape the structure of this non-standard form known as the texting language.
Kinect
Kinect is a discontinued line of motion sensing input devices produced by Microsoft and first released in 2010. The devices generally contain RGB cameras, and infrared projectors and detectors that map depth through either structured light or time of flight calculations, which can in turn be used to perform real-time gesture recognition and body skeletal detection, among other capabilities. They also contain microphones that can be used for speech recognition and voice control. Kinect was originally developed as a motion controller peripheral for Xbox video game consoles, distinguished from competitors (such as Nintendo's Wii Remote and Sony's PlayStation Move) by not requiring physical controllers. The first-generation Kinect was based on technology from Israeli company PrimeSense, and unveiled at E3 2009 as a peripheral for Xbox 360 codenamed "Project Natal". It was first released on November 4, 2010, and would go on to sell eight million units in its first 60 days of availability. The majority of the games developed for Kinect were casual, family-oriented titles, which helped to attract new audiences to Xbox 360, but did not result in wide adoption by the console's existing, overall userbase. As part of the 2013 unveiling of Xbox 360's successor, Xbox One, Microsoft unveiled a second-generation version of Kinect with improved tracking capabilities. Microsoft also announced that Kinect would be a required component of the console, and that it would not function unless the peripheral is connected. The requirement proved controversial among users and critics due to privacy concerns, prompting Microsoft to backtrack on the decision. However, Microsoft still bundled the new Kinect with Xbox One consoles upon their launch in November 2013. A market for Kinect-based games still did not emerge after the Xbox One's launch; Microsoft would later offer Xbox One hardware bundles without Kinect included, and later revisions of the console removed the dedicated ports used to connect it (requiring a powered USB adapter instead). Microsoft ended production of Kinect for Xbox One in October 2017. Kinect has also been used as part of non-game applications in academic and commercial environments, as it was cheaper and more robust than other depth-sensing technologies at the time. While Microsoft initially objected to such applications, it later released software development kits (SDKs) for the development of Microsoft Windows applications that use Kinect. In 2020, Microsoft released Azure Kinect as a continuation of the technology integrated with the Microsoft Azure cloud computing platform. Part of the Kinect technology was also used within Microsoft's HoloLens project. Microsoft discontinued the Azure Kinect developer kits in October 2023. == History == === Development === The origins of the Kinect started around 2005, at a point where technology vendors were starting to develop depth-sensing cameras. Microsoft had been interested in a 3D camera for the Xbox line earlier but because the technology had not been refined, had placed it in the "Boneyard", a collection of possible technology they could not immediately work on. In 2005, Israeli company PrimeSense was founded by mathematicians and engineers to develop the "next big thing" for video games, incorporating cameras that were capable of mapping a human body in front of them and sensing hand motions. They showed off their system at the 2006 Game Developers Conference, where Microsoft's Alex Kipman, the general manager of hardware incubation, saw the potential in PrimeSense's technology for the Xbox system. Microsoft began discussions with PrimeSense about what would need to be done to make their product more consumer-friendly: not only improvements in the capabilities of depth-sensing cameras, but a reduction in size and cost, and a means to manufacture the units at scale was required. PrimeSense spent the next few years working at these improvements. Nintendo released the Wii in November 2006. The Wii's central feature was the Wii Remote, a handheld device that was detected by the Wii through a motion sensor bar mounted onto a television screen to enable motion controlled games. Microsoft felt pressure from the Wii, and began looking into depth-sensing in more detail with PrimeSense's hardware, but could not get to the level of motion tracking they desired. While they could determine hand gestures, and sense the general shape of a body, they could not do skeletal tracking. A separate path within Microsoft looked to create an equivalent of the Wii Remote, considering that this type of unit may become standardized similar to how two-thumbstick controllers became a standard feature. However, it was still ultimately Microsoft's goal to remove any device between the player and the Xbox. Kudo Tsunoda and Darren Bennett joined Microsoft in 2008, and began working with Kipman on a new approach to depth-sensing aided by machine learning to improve skeletal tracking. They internally demonstrated this and established where they believed the technology could be in a few years, which led to the strong interest to fund further development of the technology; this has also occurred at a time that Microsoft executives wanted to abandon the Wii-like motion tracking approach, and favored the depth-sensing solution to present a product that went beyond the Wii's capabilities. The project was greenlit by late 2008 with work started in 2009. The project was codenamed "Project Natal" after the Brazilian city Natal, Kipman's birthplace. Additionally, Kipman recognized the Latin origins of the word "natal" to mean "to be born", reflecting the new types of audiences they hoped to draw with the technology. Much of the initial work was related to ethnographic research to see how video game players' home environments were laid out, lit, and how those with Wiis used the system to plan how Kinect units would be used. The Microsoft team discovered from this research that the up-and-down angle of the depth-sensing camera would either need to be adjusted manually, or would require an expensive motor to move automatically. Upper management at Microsoft opted to include the motor despite the increased cost to avoid breaking game immersion. Kinect project work also involved packaging the system for mass production and optimizing its performance. Hardware development took around 22 months. During hardware development, Microsoft engaged with software developers to use Kinect. Microsoft wanted to make games that would be playable by families since Kinect could sense multiple bodies in front of it. One of the first internal titles developed for the device was the pack-in game Kinect Adventures developed by Good Science Studio that was part of Microsoft Studios. One of the game modes of Kinect Adventures was "Reflex Ridge", based on the Japanese Brain Wall game where players attempt to contort their bodies in a short time to match cutouts of a wall moving at them. This type of game was a key example of the type of interactivity they wanted with Kinect, and its development helped feed into the hardware improvements. Another development was Project Milo, a prototype game developed by Lionhead Studios led by Peter Molyneux where the player could interact with a virtual avatar through motion controls and voice recognition. Lionhead had developed the project based on original capabilities of the Kinect, but according to Molyneux, Microsoft had found that a consumer-grade version of the Kinect would cost thousands of dollars, so they scaled back the device and refocused the role of games for the Kinect to be more casual games as seen on the Wii. As a result, Project Milo no longer fit Microsoft's portfolio and was cancelled. Nearing the planned release, there was a problem of widespread testing of Kinect in various room types and different bodies accounting for age, gender, and race among other factors, while keeping the details of the unit confidential. Microsoft engaged in a company-wide program offering employees to take home Kinect units to test them. Microsoft also brought other non-gaming divisions, including its Microsoft Research, Microsoft Windows, and Bing teams to help complete the system. Microsoft established its own large-scale manufacturing facility to bulk product Kinect units and test them. === Introduction === Kinect was first announced to the public as "Project Natal" on June 1, 2009, during Microsoft's press conference at E3 2009; film director Steven Spielberg joined Microsoft's Don Mattrick to introduce the technology and its potential. Three demos were presented during the conference—Microsoft's Ricochet and Paint Party, and Lionhead Studios' Milo & Kate created by Peter Molyneux—while a Project Natal-enabled version of Criterion Games' Burnout Paradise was shown during the E3 exhibition. By E3 2009, the skeletal mapping technology was capable of simultaneously tracking four people, with a feature extraction of 4
Shane Legg
Shane Legg (born 1973 or 1974) is a machine learning researcher and entrepreneur. With Demis Hassabis and Mustafa Suleyman, he cofounded DeepMind Technologies (later bought by Google and now called Google DeepMind), and works there as the chief AGI scientist. He is also known for his academic work on artificial general intelligence, including his thesis supervised by Marcus Hutter. == Early life and education == Legg attended Rotorua Lakes High School in Rotorua, on New Zealand's North Island. He completed his undergraduate studies at Waikato University in 1996. Also in 1996, he obtained his MSc degree with a thesis entitled "Solomonoff Induction", with Cristian S. Calude at the University of Auckland. == Research interests == In the early 2000s, Legg re-introduced and popularized with Ben Goertzel the term "artificial general intelligence" (AGI), to describe an AI that can do practically any cognitive task a human can do. At that time, talking about AGI "would put you on the lunatic fringe". Legg is known for his concern of existential risk from AI, highlighted in 2011 in an interview on LessWrong and in 2023 he signed the statement on AI risk of extinction. == Career == Before his PhD and before cofounding DeepMind, Shane Legg worked at "a number of software development positions at private companies", including the "big data firm Adaptive Intelligence" and the startup WebMind founded by Ben Goertzel. === Research === Legg later obtained a PhD at the Dalle Molle Institute for Artificial Intelligence Research (IDSIA), a joint research institute of USI Università della Svizzera italiana and SUPSI. He worked on theoretical models of super intelligent machines (AIXI) with Marcus Hutter, and completed in 2008 his doctoral thesis entitled "Machine Super Intelligence". He then went on to complete a postdoctoral fellowship in finance at USI, and began a further fellowship at University College London's Gatsby Computational Neuroscience Unit. === DeepMind === Demis Hassabis and Shane Legg first met in 2009 at University College London, where Legg was a postdoctoral researcher. In 2010, Legg cofounded the start-up DeepMind Technologies along with Demis Hassabis and Mustafa Suleyman. DeepMind Technologies was bought in 2014 by Google. After the merge with Google Brain in 2023, the company is now known as Google DeepMind. According to a 2017 article, a significant part of his job as the chief scientist was to supervise recruitment, to decide where DeepMind should focus its efforts, and to lead DeepMind's AI safety work. As of July 2023, Legg works at Google DeepMind as the Chief AGI Scientist. == Awards and honors == Legg was awarded the $10,000 prize of the Singularity Institute for Artificial Intelligence for his PhD done in 2008. Legg was appointed Commander of the Order of the British Empire (CBE) in the 2019 Birthday Honours for services to the science and technology sector and to investment.
Mycin
MYCIN was an early backward chaining expert system that used black box to identify bacteria causing severe infections, such as bacteremia and meningitis, and to recommend antibiotics, with the dosage adjusted for patient's body weight — the name derived from the antibiotics themselves, as many antibiotics have the suffix "-mycin". The Mycin system was also used for the diagnosis of blood clotting diseases. MYCIN was developed over five or six years in the early 1970s at Stanford University. It was written in Lisp as the doctoral dissertation of Edward Shortliffe under the direction of Bruce G. Buchanan, Stanley N. Cohen and others. MYCIN emerged from the Stanford Heuristic Programming Project. MYCIN demonstrated the potential for expert systems in building high-performance medical reasoning programs. MYCIN is often viewed as a pioneer in the field of expert systems, even being referred to as the "grandaddy of them all-the one that launched the field" by Dr. Allen Newell. MYCIN led to the EMYCIN expert system shell ("essential MYCIN") for acquiring knowledge, reasoning with it, and explaining the results, without the specific medical knowledge. It can be described as "EMYCIN = Prolog + uncertainty + caching + questions + explanations + contexts - variables". An introduction is in Chapter 16 of Paradigms of Artificial Intelligence Programming (PAIP). == Method == MYCIN operated using a fairly simple inference engine and a knowledge base of ~600 rules by obtaining individual inferential facts identified by experts and encoding such facts as individual production rules. No other AI program at the time contained as much domain-specific knowledge clearly separated from its inference procedures as MYCIN. It would query the physician running the program via a long series of simple yes/no or textual questions. At the end, it provided a list of possible culprit bacteria ranked from high to low based on the probability of each diagnosis, its confidence in each diagnosis' probability, the reasoning behind each diagnosis (that is, MYCIN would also list the questions and rules which led it to rank a diagnosis a particular way), and its recommended course of drug treatment. MYCIN could additionally respond to queries by physicians related to why it asked the user a certain question, how it arrived at a conclusion, and why it did not consider certain factors. The developers performed studies showing that MYCIN's performance was minimally affected by perturbations in the uncertainty metrics associated with individual rules, suggesting that the power in the system was related more to its knowledge representation and reasoning scheme than to the details of its numerical uncertainty model. Some observers felt that it should have been possible to use classical Bayesian statistics. MYCIN's developers argued that this would require either unrealistic assumptions of probabilistic independence, or require the experts to provide estimates for an unfeasibly large number of conditional probabilities. Subsequent studies later showed that the certainty factor model could indeed be interpreted in a probabilistic sense, and highlighted problems with the implied assumptions of such a model. However the modular structure of the system would prove very successful, leading to the development of graphical models such as Bayesian networks. === Context === A context in MYCIN determines what types of objects can be reasoned about. They are similar to variables in Prolog, or environment variables in operating systems. === Evidence combination === In MYCIN it was possible that two or more rules might draw conclusions about a parameter with different weights of evidence. For example, one rule may conclude that the organism in question is E. Coli with a certainty of 0.8 whilst another concludes that it is E. Coli with a certainty of 0.5 or even −0.8. In the event the certainty is less than zero the evidence is actually against the hypothesis. In order to calculate the certainty factor MYCIN combined these weights using the formula below to yield a single certainty factor: C F ( x , y ) = { X + Y − X Y if X , Y > 0 X + Y + X Y if X , Y < 0 X + Y 1 − min ( | X | , | Y | ) otherwise {\displaystyle CF(x,y)={\begin{cases}X+Y-XY&{\text{if }}X,Y>0\\X+Y+XY&{\text{if }}X,Y<0\\{\frac {X+Y}{1-\min(|X|,|Y|)}}&{\text{otherwise}}\end{cases}}} Where X and Y are the certainty factors. This formula can be applied more than once if more than two rules draw conclusions about the same parameter. It is commutative, so it does not matter in which order the weights were combined. The combination formula was designed to have the following desirable properties: −1 can be interpreted as "false", +1 as "true", and 0 as "uncertain". Combining unknown with anything leaves it unchanged. Combining true with anything (except false) gives true. Similarly for false. Combining true and false is a division-by-zero error. Combining +x and -x gives unknown. Combining two positives (except true) gives a larger positive. Similarly for negatives. Combining a positive and a negative gives something in between. === Examples === The following examples come from Chapter 16 of PAIP, which contains an implementation in Common Lisp of a modified and simplified version of MYCIN for pedagogical purposes. A rule, and an English paraphrase generated by the system: == Results == An evaluation of MYCIN was conducted at the Stanford Medical School. The first phase of the evaluation consisted of 10 test cases of diverse origin, chosen by a physician who was not acquainted with MYCIN's methods or knowledge base. These cases were presented to 7 physicians and 1 senior medical student. 10 prescriptions were compiled for each of the cases, 1 recommended by MYCIN, 1 prescribed by the treating physician at the county hospital, and 8 by the aforementioned individuals. The second phase of the evaluation consisted of eight infectious disease specialists being provided the clinical summary and set of 10 prescriptions for each of the 10 cases and tasked to provide their own recommendations for each case and assess the 10 prescriptions. MYCIN received an acceptability rating of 65%, which was comparable to the 42.5% to 62.5% rating of five faculty members. This study is often cited as showing the potential for disagreement about therapeutic decisions, even among experts, when there is no "gold standard" for correct treatment. == Practical use == MYCIN was never actually used in practice. This wasn't because of any weakness in its performance. Some observers raised ethical and legal issues related to the use of computers in medicine, regarding the responsibility of the physicians in case the system gave wrong diagnosis. However, the greatest problem, and the reason that MYCIN was not used in routine practice, was the state of technologies for system integration, especially at the time it was developed. MYCIN was a stand-alone system that required a user to enter all relevant information about a patient by typing in responses to questions MYCIN posed. MYCIN ran on the DEC KI10 PDP-10, supporting a large time-shared system available over the early Internet (ARPANet), before personal computers were developed. MYCIN's greatest influence was accordingly its demonstration of the power of its representation and reasoning approach. Rule-based systems in many non-medical domains were developed in the years that followed MYCIN's introduction of the approach. In the 1980s, expert system "shells" were introduced (including one based on MYCIN, known as E-MYCIN (followed by Knowledge Engineering Environment - KEE)) and supported the development of expert systems in a wide variety of application areas. A difficulty that rose to prominence during the development of MYCIN and subsequent complex expert systems has been the extraction of the necessary knowledge for the inference engine to use from the human expert in the relevant fields into the rule base (the so-called "knowledge acquisition bottleneck").
Software development process
A software development process prescribes a process for developing software. It typically divides an overall effort into smaller steps or sub-processes that are intended to ensure high-quality results. The process may describe specific deliverables – artifacts to be created and completed. Although not strictly limited to it, software development process often refers to the high-level process that governs the development of a software system from its beginning to its end of life – known as a methodology, model or framework. The system development life cycle (SDLC) describes the typical phases that a development effort goes through from the beginning to the end of life for a system – including a software system. A methodology prescribes how engineers go about their work in order to move the system through its life cycle. A methodology is a classification of processes or a blueprint for a process that is devised for the SDLC. For example, many processes can be classified as a spiral model. Software process and software quality are closely interrelated; some unexpected facets and effects have been observed in practice. == Methodology == The SDLC drives the definition of a methodology in that a methodology must address the phases of the SDLC. Generally, a methodology is designed to result in a high-quality system that meets or exceeds expectations (requirements) and is delivered on time and within budget even though computer systems can be complex and integrate disparate components. Various methodologies have been devised, including waterfall, spiral, agile, rapid prototyping, incremental, and synchronize and stabilize. A major difference between methodologies is the degree to which the phases are sequential vs. iterative. Agile methodologies, such as XP and scrum, focus on lightweight processes that allow for rapid changes. Iterative methodologies, such as Rational Unified Process and dynamic systems development method, focus on stabilizing project scope and iteratively expanding or improving products. Sequential or big-design-up-front (BDUF) models, such as waterfall, focus on complete and correct planning to guide larger projects and limit risks to successful and predictable results. Anamorphic development is guided by project scope and adaptive iterations. In scrum, for example, one could say a single user story goes through all the phases of the SDLC within a two-week sprint. By contrast the waterfall methodology, where every business requirement is translated into feature/functional descriptions which are then all implemented typically over a period of months or longer. A project can include both a project life cycle (PLC) and an SDLC, which describe different activities. According to Taylor (2004), "the project life cycle encompasses all the activities of the project, while the systems development life cycle focuses on realizing the product requirements". === History === The term SDLC is often used as an abbreviated version of SDLC methodology. Further, some use SDLC and traditional SDLC to mean the waterfall methodology. According to Elliott (2004), SDLC "originated in the 1960s, to develop large scale functional business systems in an age of large scale business conglomerates. Information systems activities revolved around heavy data processing and number crunching routines". The structured systems analysis and design method (SSADM) was produced for the UK government Office of Government Commerce in the 1980s. Ever since, according to Elliott (2004), "the traditional life cycle approaches to systems development have been increasingly replaced with alternative approaches and frameworks, which attempted to overcome some of the inherent deficiencies of the traditional SDLC". The main idea of the SDLC has been "to pursue the development of information systems in a very deliberate, structured and methodical way, requiring each stage of the life cycle––from the inception of the idea to delivery of the final system––to be carried out rigidly and sequentially" within the context of the framework being applied. Other methodologies were devised later: 1970s Structured programming since 1969 Cap Gemini SDM, originally from PANDATA, the first English translation was published in 1974. SDM stands for System Development Methodology 1980s Structured systems analysis and design method (SSADM) from 1980 onwards Information Requirement Analysis/Soft systems methodology 1990s Object-oriented programming (OOP) developed in the early 1960s and became a dominant programming approach during the mid-1990s Rapid application development (RAD), since 1991 Dynamic systems development method (DSDM), since 1994 Scrum, since 1995 Team software process, since 1998 Rational Unified Process (RUP), maintained by IBM since 1998 Extreme programming, since 1999 2000s Agile Unified Process (AUP) maintained since 2005 by Scott Ambler Disciplined agile delivery (DAD) Supersedes AUP 2010s Scaled Agile Framework (SAFe) Large-Scale Scrum (LeSS) DevOps Since DSDM in 1994, all of the methodologies on the above list except RUP have been agile methodologies - yet many organizations, especially governments, still use pre-agile processes (often waterfall or similar). === Examples === The following are notable methodologies somewhat ordered by popularity. Agile Agile software development refers to a group of frameworks based on iterative development, where requirements and solutions evolve via collaboration between self-organizing cross-functional teams. The term was coined in the year 2001 when the Agile Manifesto was formulated. Waterfall The waterfall model is a sequential development approach, in which development flows one-way (like a waterfall) through the SDLC phases. Spiral In 1988, Barry Boehm published a software system development spiral model, which combines key aspects of the waterfall model and rapid prototyping, in an effort to combine advantages of top-down and bottom-up concepts. It emphases a key area many felt had been neglected by other methodologies: deliberate iterative risk analysis, particularly suited to large-scale complex systems. Incremental Various methods combine linear and iterative methodologies, with the primary objective of reducing inherent project risk by breaking a project into smaller segments and providing more ease-of-change during the development process. Prototyping Software prototyping is about creating prototypes, i.e. incomplete versions of the software program being developed. Rapid Rapid application development (RAD) is a methodology which favors iterative development and the rapid construction of prototypes instead of large amounts of up-front planning. The "planning" of software developed using RAD is interleaved with writing the software itself. The lack of extensive pre-planning generally allows software to be written much faster and makes it easier to change requirements. Shape Up Shape Up is a software development approach introduced by Basecamp in 2018. It is a set of principles and techniques that Basecamp developed internally to overcome the problem of projects dragging on with no clear end. Its primary target audience is remote teams. Shape Up has no estimation and velocity tracking, backlogs, or sprints, unlike waterfall, agile, or scrum. Instead, those concepts are replaced with appetite, betting, and cycles. As of 2022, besides Basecamp, notable organizations that have adopted Shape Up include UserVoice and Block. Chaos Chaos model has one main rule: always resolve the most important issue first. Incremental funding Incremental funding methodology - an iterative approach. Lightweight Lightweight methodology - a general term for methods that only have a few rules and practices. Structured systems analysis and design Structured systems analysis and design method - a specific version of waterfall. Slow programming As part of the larger slow movement, emphasizes careful and gradual work without (or minimal) time pressures. Slow programming aims to avoid bugs and overly quick release schedules. V-Model V-Model (software development) - an extension of the waterfall model. Unified Process Unified Process (UP) is an iterative software development methodology framework, based on Unified Modeling Language (UML). UP organizes the development of software into four phases, each consisting of one or more executable iterations of the software at that stage of development: inception, elaboration, construction, and guidelines. === Comparison === The waterfall model describes the SDLC phases such that each builds on the result of the previous one. Not every project requires that the phases be sequential. For relatively simple projects, phases may be combined or overlapping. Alternative methodologies to waterfall are described and compared below. == Process meta-models == Some process models are abstract descriptions for evaluating, comparing, and improving the specific process adopted by an organization. ISO/IEC 12207 ISO/IEC 12207 i
OpenCog
OpenCog is a project that aims to build an open source artificial intelligence framework. OpenCog Prime is an architecture for robot and virtual embodied cognition that defines a set of interacting components designed to give rise to human-equivalent artificial general intelligence (AGI) as an emergent phenomenon of the whole system. OpenCog Prime's design is primarily the work of Ben Goertzel while the OpenCog framework is intended as a generic framework for broad-based AGI research. Research utilizing OpenCog has been published in journals and presented at conferences and workshops including the annual Conference on Artificial General Intelligence. OpenCog is released under the terms of the GNU Affero General Public License. OpenCog is in use by more than 50 companies, including Huawei and Cisco. == Origin == OpenCog was originally based on the release in 2008 of the source code of the proprietary "Novamente Cognition Engine" (NCE) of Novamente LLC. The original NCE code is discussed in the PLN book (ref below). Ongoing development of OpenCog is supported by Artificial General Intelligence Research Institute (AGIRI), the Google Summer of Code project, Hanson Robotics, SingularityNET and others. == Components == OpenCog consists of: A graph database, dubbed the AtomSpace, that holds "atoms" (that is, terms, atomic formulas, sentences and relationships) together with their "values" (valuations or interpretations, which can be thought of as per-atom key-value databases). An example of a value would be a truth value. Atoms are globally unique, immutable and are indexed (searchable); values are fleeting and changeable. A collection of pre-defined atoms, termed Atomese, used for generic knowledge representation, such as conceptual graphs and semantic networks, as well as to represent and store the rules (in the sense of term rewriting) needed to manipulate such graphs. A collection of pre-defined atoms that encode a type subsystem, including type constructors and function types. These are used to specify the types of variables, terms and expressions, and are used to specify the structure of generic graphs containing variables. A collection of pre-defined atoms that encode both functional and imperative programming styles. These include the lambda abstraction for binding free variables into bound variables, as well as for performing beta reduction. A collection of pre-defined atoms that encode a satisfiability modulo theories solver, built in as a part of a generic graph query engine, for performing graph and hypergraph pattern matching (isomorphic subgraph discovery). This generalizes the idea of a structured query language (SQL) to the domain of generic graphical queries; it is an extended form of a graph query language. A generic rule engine, including a forward chainer and a backward chainer, that is able to chain together rules. The rules are exactly the graph queries of the graph query subsystem, and so the rule engine vaguely resembles a query planner. It is designed so as to allow different kinds of inference engines and reasoning systems to be implemented, such as Bayesian inference or fuzzy logic, or practical tasks, such as constraint solvers or motion planners. An attention allocation subsystem based on economic theory, termed ECAN. This subsystem is used to control the combinatorial explosion of search possibilities that are met during inference and chaining. An implementation of a probabilistic reasoning engine based on probabilistic logic networks. The current implementation uses the rule engine to chain together specific rules of logical inference (such as modus ponens), together with some very specific mathematical formulas assigning a probability and a confidence to each deduction. This subsystem can be thought of as a certain kind of proof assistant that works with a modified form of Bayesian inference. A probabilistic genetic program evolver called Meta-Optimizing Semantic Evolutionary Search, or MOSES. This is used to discover collections of short Atomese programs that accomplish tasks; these can be thought of as performing a kind of decision tree learning, resulting in a kind of decision forest, or rather, a generalization thereof. A natural language input system consisting of Link Grammar, and partly inspired by both Meaning-Text Theory as well as Dick Hudson's Word Grammar, which encodes semantic and syntactic relations in Atomese. A natural language generation system. An implementation of Psi-Theory for handling emotional states, drives and urges, dubbed OpenPsi. Interfaces to Hanson Robotics robots, including emotion modelling via OpenPsi. This includes the Loving AI project, used to demonstrate meditation techniques. == Organization and funding == In 2008, the Machine Intelligence Research Institute (MIRI), formerly called Singularity Institute for Artificial Intelligence (SIAI), sponsored several researchers and engineers. Many contributions from the open source community have been made since OpenCog's involvement in the Google Summer of Code in 2008 and 2009. Currently MIRI no longer supports OpenCog. OpenCog has received funding and support from several sources, including the Hong Kong government, Hong Kong Polytechnic University, the Jeffrey Epstein VI Foundation and Hanson Robotics. In 2013, OpenCog began providing AI solutions to Hanson Robotics, and in 2017, OpenCog became a founding member of SingularityNET. == Applications == Similar to other cognitive architectures, the main purpose is to create virtual humans, which are three dimensional avatar characters. The goal is to mimic behaviors like emotions, gestures and learning. For example, the emotion module in the software was only programmed because humans have emotions. Artificial General Intelligence can be realized if it simulates intelligence of humans. The self-description of the OpenCog project provides additional possible applications which are going into the direction of natural language processing and the simulation of a dog.