AI Content Strategist Jobs

AI Content Strategist Jobs — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Cognitive computing

    Cognitive computing

    Cognitive computing refers to technology platforms that, broadly speaking, are based on the scientific disciplines of artificial intelligence and signal processing. These platforms encompass machine learning, reasoning, natural language processing, speech recognition and vision (object recognition), human–computer interaction, dialog and narrative generation, among other technologies. == Definition == At present, there is no widely agreed upon definition for cognitive computing in either academia or industry. In general, the term cognitive computing has been used to refer to new hardware and/or software that mimics the functioning of the human brain (2004). In this sense, cognitive computing is a new type of computing with the goal of more accurate models of how the human brain/mind senses, reasons, and responds to stimulus. Cognitive computing applications link data analysis and adaptive page displays (AUI) to adjust content for a particular type of audience. As such, cognitive computing hardware and applications strive to be more affective and more influential by design. The term "cognitive system" also applies to any artificial construct able to perform a cognitive process where a cognitive process is the transformation of data, information, knowledge, or wisdom to a new level in the DIKW Pyramid. While many cognitive systems employ techniques having their origination in artificial intelligence research, cognitive systems, themselves, may not be artificially intelligent. For example, a neural network trained to recognize cancer on an MRI scan may achieve a higher success rate than a human doctor. This system is certainly a cognitive system but is not artificially intelligent. Cognitive systems may be engineered to feed on dynamic data in real-time, or near real-time, and may draw on multiple sources of information, including both structured and unstructured digital information, as well as sensory inputs (visual, gestural, auditory, or sensor-provided). == Cognitive analytics == Cognitive computing-branded technology platforms typically specialize in the processing and analysis of large, unstructured datasets. == Applications == Education Even if cognitive computing can not take the place of teachers, it can still be a heavy driving force in the education of students. Cognitive computing being used in the classroom is applied by essentially having an assistant that is personalized for each individual student. This cognitive assistant can relieve the stress that teachers face while teaching students, while also enhancing the student's learning experience over all. Teachers may not be able to pay each and every student individual attention, this being the place that cognitive computers fill the gap. Some students may need a little more help with a particular subject. For many students, Human interaction between student and teacher can cause anxiety and can be uncomfortable. With the help of Cognitive Computer tutors, students will not have to face their uneasiness and can gain the confidence to learn and do well in the classroom. While a student is in class with their personalized assistant, this assistant can develop various techniques, like creating lesson plans, to tailor and aid the student and their needs. Healthcare Numerous tech companies are in the process of developing technology that involves cognitive computing that can be used in the medical field. The ability to classify and identify is one of the main goals of these cognitive devices. This trait can be very helpful in the study of identifying carcinogens. This cognitive system that can detect would be able to assist the examiner in interpreting countless numbers of documents in a lesser amount of time than if they did not use Cognitive Computer technology. This technology can also evaluate information about the patient, looking through every medical record in depth, searching for indications that can be the source of their problems. Commerce Together with Artificial Intelligence, it has been used in warehouse management systems to collect, store, organize and analyze all related supplier data. All these aims at improving efficiency, enabling faster decision-making, monitoring inventory and fraud detection Human Cognitive Augmentation In situations where humans are using or working collaboratively with cognitive systems, called a human/cog ensemble, results achieved by the ensemble are superior to results obtainable by the human working alone. Therefore, the human is cognitively augmented. In cases where the human/cog ensemble achieves results at, or superior to, the level of a human expert then the ensemble has achieved synthetic expertise. In a human/cog ensemble, the "cog" is a cognitive system employing virtually any kind of cognitive computing technology. Other use cases Speech recognition Sentiment analysis Face detection Risk assessment Fraud detection Behavioral recommendations == Industry work == Cognitive computing in conjunction with big data and algorithms that comprehend customer needs, can be a major advantage in economic decision making. The powers of cognitive computing and artificial intelligence hold the potential to affect almost every task that humans are capable of performing. This can negatively affect employment for humans, as there would be no such need for human labor anymore. It would also increase the inequality of wealth; the people at the head of the cognitive computing industry would grow significantly richer, while workers without ongoing, reliable employment would become less well off. The more industries start to use cognitive computing, the more difficult it will be for humans to compete. Increased use of the technology will also increase the amount of work that AI-driven robots and machines can perform. The influence of competitive individuals in conjunction with artificial intelligence/cognitive computing has the potential to change the course of humankind.

    Read more →
  • MacSpeech Scribe

    MacSpeech Scribe

    MacSpeech Scribe is speech recognition software for Mac OS X designed specifically for transcription of recorded voice dictation. It runs on Mac OS X 10.6 Snow Leopard. The software transcribes dictation recorded by an individual speaker. Typically, the speaker will record their dictation using a digital recording device such as a handheld digital recorder, mobile smartphone (e.g. iPhone), or desktop or laptop computer with a suitable microphone. MacSpeech Scribe supports specific audio file formats for recorded dictation: .aif, .aiff, .wav, .mp4, .m4a, and .m4v. MacSpeech Scribe was originally developed by MacSpeech, Inc. and released February 11, 2010, at Macworld Expo in San Francisco. The product is now owned by Nuance Communications which acquired MacSpeech on February 16, 2010. Nuance is the developer of other speech recognition products including Dragon NaturallySpeaking for Windows, Dragon Dictate for Mac (formerly "MacSpeech Dictate"), and Dragon Dictation apps for iOS. Jeffery Battersby of Macworld noted in his September 2010 review of MacSpeech Scribe, v1.1: Small foibles aside, MacSpeech Scribe is a powerful and intelligent tool for transcribing your recorded speech. A simple training process and access to a wide variety of standard audio formats mean that you’ll be moving your spoken text to the printed page in a matter of minutes and with a minimum of hassle. Scribe is the best, simplest way for you to get your spoken word to the printed page. == Release history ==

    Read more →
  • Computer Science Ontology

    Computer Science Ontology

    The Computer Science Ontology (CSO) is an automatically generated taxonomy of research topics in the field of Computer Science. It was produced by the Open University in collaboration with Springer Nature by running an information extraction system over a large corpus of scientific articles. Several branches were manually improved by domain experts. The current version (CSO 3.2) includes about 14K research topics and 160K semantic relationships. CSO is available in OWL, Turtle, and N-Triples. It is aligned with several other knowledge graphs, including DBpedia, Wikidata, YAGO, Freebase, and Cyc. New versions of CSO are regularly released on the CSO Portal. CSO is mostly used to characterise scientific papers and other documents according to their research areas, in order to enable different kinds of analytics. The CSO Classifier is an open-source python tool for automatically annotating documents with CSO. == Applications == Recommender Systems. Computing the semantic similarity of documents. Extracting metadata from video lecture subtitles. Performing bibliometrics analysis.

    Read more →
  • Thomas Bolander

    Thomas Bolander

    Thomas Bolander is a Danish professor at DTU Compute, Technical University of Denmark, where he studies logic and artificial intelligence. Most of his studies focus on the social aspect of artificial intelligence, and how we can make future AI able to navigate in social interactions. Thomas Bolander also sits in different commissions, expert panels and boards, among these he is a member of the Siri Commission, the TeckDK Commission, a member of the editorial board of the journal Studia Logica and co-organizer of Science and Cocktails. Bolander is known for his dissemination of science. In 2019 he was awarded the H. C. Ørsted Medal. Which he was the first to achieve after a break of three years.

    Read more →
  • Description logic

    Description logic

    Description logics (DL) are a family of formal knowledge representation languages. Many DLs are more expressive than propositional logic but less expressive than first-order logic. In contrast to the latter, the core reasoning problems for DLs are (usually) decidable, and efficient decision procedures have been designed and implemented for these problems. There are general, spatial, temporal, spatiotemporal, and fuzzy description logics, and each description logic features a different balance between expressive power and reasoning complexity by supporting different sets of mathematical constructors. DLs are used in artificial intelligence to describe and reason about the relevant concepts of an application domain (known as terminological knowledge). It is of particular importance in providing a logical formalism for ontologies and the Semantic Web: the Web Ontology Language (OWL) and its profiles are based on DLs. A major area of application of DLs and OWL is in biomedical informatics, where they assist in the codification of biomedical knowledge. DLs and OWL are also applied in other domains, including defense, climate modeling, and large-scale industrial knowledge graphs. == Introduction == A DL models concepts, roles and individuals, and their relationships. The fundamental modeling concept of a DL is the axiom—a logical statement relating roles and/or concepts. This is a key difference from the frames paradigm where a frame specification declares and completely defines a class. == Nomenclature == === Terminology compared to FOL and OWL === The description logic community uses different terminology than the first-order logic (FOL) community for operationally equivalent notions; some examples are given below. The Web Ontology Language (OWL) uses again a different terminology, also given in the table below. === Naming convention === There are many varieties of description logics and there is an informal naming convention, roughly describing the operators allowed. The expressivity is encoded in the label for a logic starting with one of the following basic logics: Followed by any of the following extensions: ==== Exceptions ==== Some canonical DLs that do not exactly fit this convention are: ==== Examples ==== As an example, A L C {\displaystyle {\mathcal {ALC}}} is a centrally important description logic from which comparisons with other varieties can be made. A L C {\displaystyle {\mathcal {ALC}}} is simply A L {\displaystyle {\mathcal {AL}}} with complement of any concept allowed, not just atomic concepts. A L C {\displaystyle {\mathcal {ALC}}} is used instead of the equivalent A L U E {\displaystyle {\mathcal {ALUE}}} . A further example, the description logic S H I Q {\displaystyle {\mathcal {SHIQ}}} is the logic A L C {\displaystyle {\mathcal {ALC}}} plus extended cardinality restrictions, and transitive and inverse roles. The naming conventions aren't purely systematic so that the logic A L C O I N {\displaystyle {\mathcal {ALCOIN}}} might be referred to as A L C N I O {\displaystyle {\mathcal {ALCNIO}}} and other abbreviations are also made where possible. The Protégé ontology editor supports S H O I N ( D ) {\displaystyle {\mathcal {SHOIN}}^{\mathcal {(D)}}} . Three major biomedical informatics terminology bases, SNOMED CT, GALEN, and GO, are expressible in E L {\displaystyle {\mathcal {EL}}} (with additional role properties). OWL 2 provides the expressiveness of S R O I Q ( D ) {\displaystyle {\mathcal {SROIQ}}^{\mathcal {(D)}}} , OWL-DL is based on S H O I N ( D ) {\displaystyle {\mathcal {SHOIN}}^{\mathcal {(D)}}} , and for OWL-Lite it is S H I F ( D ) {\displaystyle {\mathcal {SHIF}}^{\mathcal {(D)}}} . == History == Description logic was given its current name in the 1980s. Previous to this it was called (chronologically): terminological systems, and concept languages. === Knowledge representation === Frames and semantic networks lack formal (logic-based) semantics. DL was first introduced into knowledge representation (KR) systems to overcome this deficiency. The first DL-based KR system was KL-ONE (by Ronald J. Brachman and Schmolze, 1985). During the '80s other DL-based systems using structural subsumption algorithms were developed including KRYPTON (1983), LOOM (1987), BACK (1988), K-REP (1991) and CLASSIC (1991). This approach featured DL with limited expressiveness but relatively efficient (polynomial time) reasoning. In the early '90s, the introduction of a new tableau based algorithm paradigm allowed efficient reasoning on more expressive DL. DL-based systems using these algorithms — such as KRIS (1991) — show acceptable reasoning performance on typical inference problems even though the worst case complexity is no longer polynomial. From the mid '90s, reasoners were created with good practical performance on very expressive DL with high worst case complexity. Examples from this period include FaCT, RACER (2001), CEL (2005), and KAON 2 (2005). DL reasoners, such as FaCT, FaCT++, RACER, DLP and Pellet, implement the method of analytic tableaux. KAON2 is implemented by algorithms which reduce a SHIQ(D) knowledge base to a disjunctive datalog program. === Semantic web === The DARPA Agent Markup Language (DAML) and Ontology Inference Layer (OIL) ontology languages for the Semantic Web can be viewed as syntactic variants of DL. In particular, the formal semantics and reasoning in OIL use the S H I Q {\displaystyle {\mathcal {SHIQ}}} DL. The DAML+OIL DL was developed as a submission to—and formed the starting point of—the World Wide Web Consortium (W3C) Web Ontology Working Group. In 2004, the Web Ontology Working Group completed its work by issuing the OWL recommendation. The design of OWL is based on the S H {\displaystyle {\mathcal {SH}}} family of DL with OWL DL and OWL Lite based on S H O I N ( D ) {\displaystyle {\mathcal {SHOIN}}^{\mathcal {(D)}}} and S H I F ( D ) {\displaystyle {\mathcal {SHIF}}^{\mathcal {(D)}}} respectively. The W3C OWL Working Group began work in 2007 on a refinement of - and extension to - OWL. In 2009, this was completed by the issuance of the OWL2 recommendation. OWL2 is based on the description logic S R O I Q ( D ) {\displaystyle {\mathcal {SROIQ}}^{\mathcal {(D)}}} . Practical experience demonstrated that OWL DL lacked several key features necessary to model complex domains. == Modeling == === TBox vs Abox === In DL, a distinction is drawn between the so-called TBox (terminological box) and the ABox (assertional box). In general, the TBox contains sentences describing concept hierarchies (i.e., relations between concepts) while the ABox contains ground sentences stating where in the hierarchy, individuals belong (i.e., relations between individuals and concepts). For example, the statement: belongs in the TBox, while the statement: belongs in the ABox. Note that the TBox/ABox distinction is not significant, in the same sense that the two "kinds" of sentences are not treated differently in first-order logic (which subsumes most DL). When translated into first-order logic, a subsumption axiom like (1) is simply a conditional restriction to unary predicates (concepts) with only variables appearing in it. Clearly, a sentence of this form is not privileged or special over sentences in which only constants ("grounded" values) appear like (2). === Motivation for having Tbox and Abox === So why was the distinction introduced? The primary reason is that the separation can be useful when describing and formulating decision-procedures for various DL. For example, a reasoner might process the TBox and ABox separately, in part because certain key inference problems are tied to one but not the other one ('classification' is related to the TBox, 'instance checking' to the ABox). Another example is that the complexity of the TBox can greatly affect the performance of a given decision-procedure for a certain DL, independently of the ABox. Thus, it is useful to have a way to talk about that specific part of the knowledge base. The secondary reason is that the distinction can make sense from the knowledge base modeler's perspective. It is plausible to distinguish between our conception of terms/concepts in the world (class axioms in the TBox) and particular manifestations of those terms/concepts (instance assertions in the ABox). In the above example: when the hierarchy within a company is the same in every branch but the assignment to employees is different in every department (because there are other people working there), it makes sense to reuse the TBox for different branches that do not use the same ABox. There are two features of description logic that are not shared by most other data description formalisms: DL does not make the unique name assumption (UNA) or the closed-world assumption (CWA). Not having UNA means that two concepts with different names may be allowed by some inference to be shown to be equivalent. Not having CWA, or rather having the open world assumption (OWA) means that

    Read more →
  • TensorFlow

    TensorFlow

    TensorFlow is a software library for machine learning and artificial intelligence. It can be used across a range of tasks, but is used mainly for training and inference of neural networks. It is one of the most popular deep learning frameworks, alongside others such as PyTorch. It is free and open-source software released under the Apache License 2.0. It was developed by the Google Brain team for Google's internal use in research and production. The initial version was released under the Apache License 2.0 in 2015. Google released an updated version, TensorFlow 2.0, in September 2019. TensorFlow can be used in a wide variety of programming languages, including Python, JavaScript, C++, and Java, facilitating its use in a range of applications in many sectors. == History == === DistBelief === Starting in 2011, Google Brain built DistBelief as a proprietary machine learning system based on deep learning neural networks. Its use grew rapidly across diverse Alphabet companies in both research and commercial applications. Google assigned multiple computer scientists, including Jeff Dean, to simplify and refactor the codebase of DistBelief into a faster, more robust application-grade library, which became TensorFlow. In 2009, the team, led by Geoffrey Hinton, had implemented generalized backpropagation and other improvements, which allowed generation of neural networks with substantially higher accuracy, for instance a 25% reduction in errors in speech recognition. === TensorFlow === TensorFlow is Google Brain's second-generation system. Version 1.0.0 was released on February 11, 2017. While the reference implementation runs on single devices, TensorFlow can run on multiple CPUs and GPUs (with optional CUDA and SYCL extensions for general-purpose computing on graphics processing units). TensorFlow is available on 64-bit Linux, macOS, Windows, and mobile computing platforms including Android and iOS. Its flexible architecture allows for easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. TensorFlow computations are expressed as stateful dataflow graphs. The name TensorFlow derives from the operations that such neural networks perform on multidimensional data arrays, which are referred to as tensors. During the Google I/O Conference in June 2016, Jeff Dean stated that 1,500 repositories on GitHub mentioned TensorFlow, of which only 5 were from Google. In March 2018, Google announced TensorFlow.js version 1.0 for machine learning in JavaScript. In Jan 2019, Google announced TensorFlow 2.0. It became officially available in September 2019. In May 2019, Google announced TensorFlow Graphics for deep learning in computer graphics. === Tensor processing unit (TPU) === In May 2016, Google announced its Tensor processing unit (TPU), an application-specific integrated circuit (ASIC, a hardware chip) built specifically for machine learning and tailored for TensorFlow. A TPU is a programmable AI accelerator designed to provide high throughput of low-precision arithmetic (e.g., 8-bit), and oriented toward using or running models rather than training them. Google announced they had been running TPUs inside their data centers for more than a year, and had found them to deliver an order of magnitude better-optimized performance per watt for machine learning. In May 2017, Google announced the second-generation, as well as the availability of the TPUs in Google Compute Engine. The second-generation TPUs deliver up to 180 teraflops of performance, and when organized into clusters of 64 TPUs, provide up to 11.5 petaflops. In May 2018, Google announced the third-generation TPUs delivering up to 420 teraflops of performance and 128 GB high bandwidth memory (HBM). Cloud TPU v3 Pods offer 100+ petaflops of performance and 32 TB HBM. In February 2018, Google announced that they were making TPUs available in beta on the Google Cloud Platform. === Edge TPU === In July 2018, the Edge TPU was announced. Edge TPU is Google's purpose-built ASIC chip designed to run TensorFlow Lite machine learning (ML) models on small client computing devices such as smartphones known as edge computing. === TensorFlow Lite === In May 2017, Google announced TensorFlow Lite as a software stack to support machine learning models for mobile and embedded devices, and in November 2017, provided the developer preview. In January 2019, the TensorFlow team released a developer preview of the mobile GPU inference engine with OpenGL ES 3.1 Compute Shaders on Android devices and Metal Compute Shaders on iOS devices. In May 2019, Google announced that their TensorFlow Lite Micro (also known as TensorFlow Lite for Microcontrollers) and ARM's uTensor would be merging. It was renamed as LiteRT in 2024. === TensorFlow 2.0 === As TensorFlow's market share among research papers was declining to the advantage of PyTorch, the TensorFlow Team announced a release of a new major version of the library in September 2019. TensorFlow 2.0 introduced many changes, the most significant being TensorFlow eager, which changed the automatic differentiation scheme from the static computational graph to the "Define-by-Run" scheme originally made popular by Chainer and later PyTorch. Other major changes included removal of old libraries, cross-compatibility between trained models on different versions of TensorFlow, and significant improvements to the performance on GPU. == Features == === AutoDifferentiation === AutoDifferentiation is the process of automatically calculating the gradient vector of a model with respect to each of its parameters. With this feature, TensorFlow can automatically compute the gradients for the parameters in a model, which is useful to algorithms such as backpropagation which require gradients to optimize performance. To do so, the framework must keep track of the order of operations done to the input Tensors in a model, and then compute the gradients with respect to the appropriate parameters. === Eager execution === TensorFlow includes an "eager execution" mode, which means that operations are evaluated immediately as opposed to being added to a computational graph which is executed later. Code executed eagerly can be examined step-by step-through a debugger, since data is augmented at each line of code rather than later in a computational graph. This execution paradigm is considered to be easier to debug because of its step by step transparency. === Distribute === In both eager and graph executions, TensorFlow provides an API for distributing computation across multiple devices with various distribution strategies. This distributed computing can often speed up the execution of training and evaluating of TensorFlow models and is a common practice in the field of AI. === Losses === To train and assess models, TensorFlow provides a set of loss functions (also known as cost functions). Some popular examples include mean squared error (MSE) and binary cross entropy (BCE). === Metrics === In order to assess the performance of machine learning models, TensorFlow gives API access to commonly used metrics. Examples include various accuracy metrics (binary, categorical, sparse categorical) along with other metrics such as Precision, Recall, and Intersection-over-Union (IoU). === TF.nn === TensorFlow.nn is a module for executing primitive neural network operations on models. Some of these operations include variations of convolutions (1/2/3D, Atrous, depthwise), activation functions (Softmax, RELU, GELU, Sigmoid, etc.) and their variations, and other operations (max-pooling, bias-add, etc.). === Optimizers === TensorFlow offers a set of optimizers for training neural networks, including ADAM, ADAGRAD, and Stochastic Gradient Descent (SGD). When training a model, different optimizers offer different modes of parameter tuning, often affecting a model's convergence and performance. == Usage and extensions == === TensorFlow === TensorFlow serves as a core platform and library for machine learning. TensorFlow's APIs use Keras to allow users to make their own machine-learning models. In addition to building and training their model, TensorFlow can also help load the data to train the model, and deploy it using TensorFlow Serving. TensorFlow provides a stable Python Application Program Interface (API), as well as APIs without backwards compatibility guarantee for JavaScript, C++, and Java. Third-party language binding packages are also available for C#, Haskell, Julia, MATLAB, Object Pascal, R, Scala, Rust, OCaml, and Crystal. Bindings that are now archived and unsupported include Go and Swift. === TensorFlow.js === TensorFlow also has a library for machine learning in JavaScript. Using the provided JavaScript APIs, TensorFlow.js allows users to use either Tensorflow.js models or converted models from TensorFlow or TFLite, retrain the given models, and run on the web. === LiteRT === LiteRT, formerly known as Te

    Read more →
  • Feigenbaum test

    Feigenbaum test

    A Feigenbaum test is a variation of the Turing test where a computer system attempts to replicate an expert in a given field such as chemistry or marketing. It is also known, as a subject matter expert Turing test and was proposed by Edward Feigenbaum in a 2003 paper. The concept is also described by Ray Kurzweil in his 2005 book The Singularity is Near. Kurzweil argues that machines who pass this test are an inevitable consequence of Moore's Law.

    Read more →
  • Cerebellar model articulation controller

    Cerebellar model articulation controller

    The cerebellar model arithmetic computer (CMAC) is a type of neural network based on a model of the mammalian cerebellum. It is also known as the cerebellar model articulation controller. It is a type of associative memory. The CMAC was first proposed as a function modeler for robotic controllers by James Albus in 1975 (hence the name), but has been extensively used in reinforcement learning and also as for automated classification in the machine learning community. The CMAC is an extension of the perceptron model. It computes a function for n {\displaystyle n} input dimensions. The input space is divided up into hyper-rectangles, each of which is associated with a memory cell. The contents of the memory cells are the weights, which are adjusted during training. Usually, more than one quantisation of input space is used, so that any point in input space is associated with a number of hyper-rectangles, and therefore with a number of memory cells. The output of a CMAC is the algebraic sum of the weights in all the memory cells activated by the input point. A change of value of the input point results in a change in the set of activated hyper-rectangles, and therefore a change in the set of memory cells participating in the CMAC output. The CMAC output is therefore stored in a distributed fashion, such that the output corresponding to any point in input space is derived from the value stored in a number of memory cells (hence the name associative memory). This provides generalisation. == Building blocks == In the adjacent image, there are two inputs to the CMAC, represented as a 2D space. Two quantising functions have been used to divide this space with two overlapping grids (one shown in heavier lines). A single input is shown near the middle, and this has activated two memory cells, corresponding to the shaded area. If another point occurs close to the one shown, it will share some of the same memory cells, providing generalisation. The CMAC is trained by presenting pairs of input points and output values, and adjusting the weights in the activated cells by a proportion of the error observed at the output. This simple training algorithm has a proof of convergence. It is normal to add a kernel function to the hyper-rectangle, so that points falling towards the edge of a hyper-rectangle have a smaller activation than those falling near the centre. One of the major problems cited in practical use of CMAC is the memory size required, which is directly related to the number of cells used. This is usually ameliorated by using a hash function, and only providing memory storage for the actual cells that are activated by inputs. == One-step convergent algorithm == Initially least mean square (LMS) method is employed to update the weights of CMAC. The convergence of using LMS for training CMAC is sensitive to the learning rate and could lead to divergence. In 2004, a recursive least squares (RLS) algorithm was introduced to train CMAC online. It does not need to tune a learning rate. Its convergence has been proved theoretically and can be guaranteed to converge in one step. The computational complexity of this RLS algorithm is O(N3). == Hardware implementation infrastructure == Based on QR decomposition, an algorithm (QRLS) has been further simplified to have an O(N) complexity. Consequently, this reduces memory usage and time cost significantly. A parallel pipeline array structure on implementing this algorithm has been introduced. Overall by utilizing QRLS algorithm, the CMAC neural network convergence can be guaranteed, and the weights of the nodes can be updated using one step of training. Its parallel pipeline array structure offers its great potential to be implemented in hardware for large-scale industry usage. == Continuous CMAC == Since the rectangular shape of CMAC receptive field functions produce discontinuous staircase function approximation, by integrating CMAC with B-splines functions, continuous CMAC offers the capability of obtaining any order of derivatives of the approximate functions. == Deep CMAC == In recent years, numerous studies have confirmed that by stacking several shallow structures into a single deep structure, the overall system could achieve better data representation, and, thus, more effectively deal with nonlinear and high complexity tasks. In 2018, a deep CMAC (DCMAC) framework was proposed and a backpropagation algorithm was derived to estimate the DCMAC parameters. Experimental results of an adaptive noise cancellation task showed that the proposed DCMAC can achieve better noise cancellation performance when compared with that from the conventional single-layer CMAC. == Summary ==

    Read more →
  • Explanation-based learning

    Explanation-based learning

    Explanation-based learning (EBL) is a form of machine learning that exploits a very strong, or even perfect, domain theory (i.e. a formal theory of an application domain akin to a domain model in ontology engineering, not to be confused with Scott's domain theory) in order to make generalizations or form concepts from training examples. It is also linked with Encoding (memory) to help with Learning. == Details == An example of EBL using a perfect domain theory is a program that learns to play chess through example. A specific chess position that contains an important feature such as "Forced loss of black queen in two moves" includes many irrelevant features, such as the specific scattering of pawns on the board. EBL can take a single training example and determine what are the relevant features in order to form a generalization. A domain theory is perfect or complete if it contains, in principle, all information needed to decide any question about the domain. For example, the domain theory for chess is simply the rules of chess. Knowing the rules, in principle, it is possible to deduce the best move in any situation. However, actually making such a deduction is impossible in practice due to combinatoric explosion. EBL uses training examples to make searching for deductive consequences of a domain theory efficient in practice. In essence, an EBL system works by finding a way to deduce each training example from the system's existing database of domain theory. Having a short proof of the training example extends the domain-theory database, enabling the EBL system to find and classify future examples that are similar to the training example very quickly. The main drawback of the method—the cost of applying the learned proof macros, as these become numerous—was analyzed by Minton. === Basic formulation === EBL software takes four inputs: a hypothesis space (the set of all possible conclusions) a domain theory (axioms about a domain of interest) training examples (specific facts that rule out some possible hypothesis) operationality criteria (criteria for determining which features in the domain are efficiently recognizable, e.g. which features are directly detectable using sensors) == Application == An especially good application domain for an EBL is natural language processing (NLP). Here a rich domain theory, i.e., a natural language grammar—although neither perfect nor complete, is tuned to a particular application or particular language usage, using a treebank (training examples). Rayner pioneered this work. The first successful industrial application was to a commercial NL interface to relational databases. The method has been successfully applied to several large-scale natural language parsing systems, where the utility problem was solved by omitting the original grammar (domain theory) and using specialized LR-parsing techniques, resulting in huge speed-ups, at a cost in coverage, but with a gain in disambiguation. EBL-like techniques have also been applied to surface generation, the converse of parsing. When applying EBL to NLP, the operationality criteria can be hand-crafted, or can be inferred from the treebank using either the entropy of its or-nodes or a target coverage/disambiguation trade-off (= recall/precision trade-off = f-score). EBL can also be used to compile grammar-based language models for speech recognition, from general unification grammars. Note how the utility problem, first exposed by Minton, was solved by discarding the original grammar/domain theory, and that the quoted articles tend to contain the phrase grammar specialization—quite the opposite of the original term explanation-based generalization. Perhaps the best name for this technique would be data-driven search space reduction. Other people who worked on EBL for NLP include Guenther Neumann, Aravind Joshi, Srinivas Bangalore, and Khalil Sima'an.

    Read more →
  • Portable Format for Analytics

    Portable Format for Analytics

    The Portable Format for Analytics (PFA) is a JSON-based predictive model interchange format conceived and developed by Jim Pivarski. PFA provides a way for analytic applications to describe and exchange predictive models produced by analytics and machine learning algorithms. It supports common models such as logistic regression and decision trees. Version 0.8 was published in 2015. Subsequent versions have been developed by the Data Mining Group. As a predictive model interchange format developed by the Data Mining Group, PFA is complementary to the DMG's XML-based standard called the Predictive Model Markup Language or PMML. == Release history == == Data Mining Group == The Data Mining Group is a consortium managed by the Center for Computational Science Research, Inc., a nonprofit founded in 2008. == Examples == reverse array: # reverse input array of doubles input: {"type": "array", "items": "double"} output: {"type": "array", "items": "double"} action: - let: { x : input} - let: { z : input} - let: { l : {a.len: [x]}} - let: { i : l} - while : { ">=" : [i,0]} do: - set : {z : {attr: z, path : [i] , to: {attr : x ,path : [ {"-":[{"-" : [l ,i]},1]}] } } } - set : {i : {-:[i,1]}} - z Bubblesort input: {"type": "array", "items": "double"} output: {"type": "array", "items": "double"} action: - let: { A : input} - let: { N : {a.len: [A]}} - let: { n : {-:[N,1]}} - let: { i : 0} - let: { s : 0.0} - while : { ">=" : [n,0]} do : - set : { i : 0 } - while : { "<=" : [i,{-:[n,1]}]} do : - if: {">": [ {attr: A, path : [i]} , {attr: A, path:[{+:[i,1]}]} ]} then : - set : {s : {attr: A, path: [i]}} - set : {A : {attr: A, path: [i], to: {attr: A, path:[{+:[i,1]}]} } } - set : {A : {attr: A, path: [{+:[i,1]}], to: s }} - set : {i : {+:[i,1]}} - set : {n : {-:[n,1]}} - A == Implementations == Hadrian (Java/Scala/JVM) - Hadrian is a complete implementation of PFA in Scala, which can be accessed through any JVM language, principally Java. It focuses on model deployment, so it is flexible (can run in restricted environments) and fast. Titus (Python 2.x) - Titus is a complete, independent implementation of PFA in pure Python. It focuses on model development, so it includes model producers and PFA manipulation tools in addition to runtime execution. Currently, it works for Python 2. Titus 2 (Python 3.x) - Titus 2 is a fork of Titus which supports PFA implementation for Python 3. Aurelius (R) - Aurelius is a toolkit for generating PFA in the R programming language. It focuses on porting models to PFA from their R equivalents. To validate or execute scoring engines, Aurelius sends them to Titus through rPython (so both must be installed). Antinous (Model development in Jython) - Antinous is a model-producer plugin for Hadrian that allows Jython code to be executed anywhere a PFA scoring engine would go. It also has a library of model producing algorithms.

    Read more →
  • Greg Brockman

    Greg Brockman

    Gregory Brockman (born November 29, 1987) is an American entrepreneur and software engineer. He is co-founder and president of OpenAI. He began his career at Stripe in 2010, upon leaving MIT, and became CTO in 2013. He left Stripe in 2015 to co-found OpenAI, where he also served as CTO. == Early life == Brockman was born in Thompson, North Dakota, and attended Red River High School, where he excelled in mathematics, chemistry, and computer science. He won a silver medal in the 2006 International Chemistry Olympiad and became the first finalist from North Dakota to participate in the Intel science talent search since 1973. In 2003, 2005, and 2007, he attended Canada/USA Mathcamp, a summer program for mathematically talented high-school students. In 2008, Brockman enrolled at Harvard University but left a year later, briefly enrolling at the Massachusetts Institute of Technology. == Career == In 2010, he dropped out of MIT to join Stripe, a company founded by Patrick Collison, his MIT classmate, and John Collison. In 2013, he became Stripe's first CTO, while the company grew from 5 to 205 employees. Brockman left Stripe in May 2015. === OpenAI === Brockman met with Sam Altman and Elon Musk, and led the recruiting of the OpenAI founding team. Many of its members, including Ilya Sutskever, were top AI research talent that left high paying jobs for the opportunity at OpenAI. He co-founded OpenAI in December 2015 alongside Altman, Sutskever and others. The company initially operated from Brockman's living room. He led various projects at OpenAI, including OpenAI Gym and OpenAI Five, a Dota 2 bot. On February 14, 2019, OpenAI announced that they had developed a new large language model called GPT-2, but kept it private due to their concern for its potential misuse. They released the model to a limited group of beta testers in May 2019. On March 14, 2023, in a live video demo, Brockman unveiled GPT-4, the fourth iteration in the GPT series. On November 17, 2023, alongside the firing of Sam Altman, Brockman was told he had been removed from the board. Sutskever supplied the board with a document of alleged bullying by Brockman. Mira Murati said Brockman's relationship with Altman made it impossible for her to do her job, and Altman had already "fielded many requests from OpenAI employees to rein in Brockman". Brockman was to report to Murati, but on November 17, he announced that he had quit the company. On November 20, 2023, Microsoft CEO Satya Nadella announced that Brockman and Altman would join Microsoft to lead a new advanced AI research team. The following day, after a deal was reached to reinstate Altman as CEO, Brockman returned to OpenAI. Brockman took a sabbatical from August to November 2024. === Elon Musk lawsuit === Jury selection for OpenAI cofounder Elon Musk's lawsuit against OpenAI and its current executives, including Brockman, began on April 27, 2026. On April 28, 2026, trial testimony was by now underway, with Elon Musk beginning his testimony against Altman and OpenAI. On April 30, 2026 Musk would enter his third day of testimony. == Personal life == In November 2019 after a year of dating, Brockman married Anna at OpenAI's offices on a workday. Ilya Sutskever officiated. == Political activities == Brockman and his wife were the biggest donors to Donald Trump's Super PAC, MAGA Inc., in 2025 with each of them donating US$12.5 million. Brockman and his wife also donated $50 million to Leading the Future, a super PAC dedicated to AI deregulation that he helped found with Andreessen Horowitz co-founders Marc Andreessen and Ben Horowitz. OpenAI publicly expressed openness to increased regulatory oversight and has a policy against donating to such Super PACs.

    Read more →
  • Embodied cognition

    Embodied cognition

    Embodied cognition represents a diverse group of theories which investigate how cognition is shaped by the bodily state and capacities of the organism. These embodied factors include the motor system, the perceptual system, bodily interactions with the environment (situatedness), and the assumptions about the world that shape the functional structure of the brain and body of the organism. Embodied cognition suggests that these elements are essential to a wide spectrum of cognitive functions, such as perception biases, memory recall, comprehension and high-level mental constructs (such as meaning attribution and categories) and performance on various cognitive tasks (reasoning or judgment). The embodied mind thesis challenges other theories, such as cognitivism, computationalism, and Cartesian dualism. It is closely related to the extended mind thesis, situated cognition, and enactivism. The modern version depends on understandings drawn from up-to-date research in psychology, linguistics, cognitive science, dynamical systems, artificial intelligence, robotics, animal cognition, plant cognition, and neurobiology. == Theory == Proponents of the embodied cognition thesis emphasize the active and significant role the body plays in the shaping of cognition and in the understanding of an agent's mind and cognitive capacities. In philosophy, embodied cognition holds that an agent's cognition, rather than being the product of mere (innate) abstract representations of the world, is strongly influenced by aspects of an agent's body beyond the brain itself. An embodied model of cognition opposes the disembodied Cartesian model, according to which all mental phenomena are non-physical and, therefore, not influenced by the body. With this opposition the embodiment thesis intends to reintroduce an agent's bodily experiences into any account of cognition. It is a rather broad thesis and encompasses both weak and strong variants of embodiment. In an attempt to reconcile cognitive science with human experience, the enactive approach to cognition defines "embodiment" as follows: By using the term embodied we mean to highlight two points: first that cognition depends upon the kinds of experience that come from having a body with various sensorimotor capacities, and second, that these individual sensorimotor capacities are themselves embedded in a more encompassing biological, psychological and cultural context. This double sense attributed to the embodiment thesis emphasizes the many aspects of cognition that researchers in different fields—such as philosophy, cognitive science, artificial intelligence, psychology, and neuroscience—are involved with. This general characterization of embodiment faces some difficulties: a consequence of this emphasis on the body, experience, culture, context, and the cognitive mechanisms of an agent in the world is that often distinct views and approaches to embodied cognition overlap. The theses of extended cognition and situated cognition, for example, are usually intertwined and not always carefully separated. And since each of the aspects of the embodiment thesis is endorsed to different degrees, embodied cognition should be better seen "as a research program rather than a well-defined unified theory". Some authors explain the embodiment thesis by arguing that cognition depends on an agent's body and its interactions with a determined environment. From this perspective, cognition in real biological systems is not an end in itself; it is constrained by the system's goals and capacities. Such constraints do not mean cognition is set by adaptive behavior (or autopoiesis) alone, but instead that cognition requires "some kind of information processing... the transformation or communication of incoming information". The acquiring of such information involves the agent's "exploration and modification of the environment". It would be a mistake, however, to suppose that cognition consists simply of building maximally accurate representations of input information...the gaining of knowledge is a stepping stone to achieving the more immediate goal of guiding behavior in response to the system's changing surroundings. Another approach to understanding embodied cognition comes from a narrower characterization of the embodiment thesis. The following narrower view of embodiment avoids any compromises to external sources other than the body and allows differentiating between embodied cognition, extended cognition, and situated cognition. Thus, the embodiment thesis can be specified as follows: Many features of cognition are embodied in that they are deeply dependent upon characteristics of the physical body of an agent, such that the agent's beyond-the-brain body plays a significant causal role, or a physically constitutive role, in that agent's cognitive processing. This thesis points out the core idea that an agent's body plays a significant role in shaping different features of cognition, such as perception, attention, memory, reasoning—among others. Likewise, these features of cognition depend on the kind of body an agent has. The thesis omits direct mention of some aspects of the "more encompassing biological, psychological and cultural context" included in the enactive definition, making it possible to separate embodied cognition, extended cognition, and situated cognition. In contrast to the embodiment thesis, the extended mind thesis limits cognitive processing neither to the brain nor even to the body, it extends it outward into the agent's world. Situated cognition emphasizes that this extension is not just a matter of including resources outside the head but stressing the role of probing and changing interactions with the agent's world. Cognition is situated in that it is inherently dependent upon the cultural and social contexts within which it takes place. This conceptual reframing of cognition as an activity influenced by the body has had significant implications. For instance, the view of cognition inherited by most contemporary cognitive neuroscience is internalist in nature. An agent's behavior along with its capacity to maintain (accurate) representations of the surrounding environment were considered as the product of "powerful brains that can maintain the world models and devise plans". From this perspective, cognizing was conceived as something that an isolated brain did. In contrast, accepting the role the body plays during cognitive processes allows us to account for a more encompassing view of cognition. This shift in perspective within neuroscience suggests that successful behavior in real-world scenarios demands the integration of several sensorimotor and cognitive (as well as affective) capacities of an agent. Thus, cognition emerges in the relationship between an agent and the affordances provided by the environment rather than in the brain alone. In 2002, a collection of positive characterizations summarizing what the embodiment thesis entails for cognition were offered. Professor of Cognitive Psychology Margaret Wilson argues that the general outlook of embodied cognition "displays an interesting co-variation of multiple observations and houses a number of different claims: (1) cognition is situated; (2) cognition is time-pressured; (3) we off-load cognitive work onto the environment; (4) the environment is part of the cognitive system; (5) cognition is for action; (6) offline cognition is bodily-based". According to Wilson, the first three and the fifth claim appear to be at least partially true, while the fourth claim is deeply problematic in that all things that have an impact on the elements of a system are not necessarily considered part of the system. The sixth claim has received the least attention in the literature on embodied cognition, yet it might be the most significant of the six claims as it shows how certain human cognitive capabilities, that previously were thought to be highly abstract, now appear to be leaning towards an embodied approach for their explanation. Wilson also describes at least five main (abstract) categories that combine both sensory and motor skills (or sensorimotor functions). The first three are working memory, episodic memory, and implicit memory; the fourth is mental imagery, and finally, the fifth concerns reasoning and problem solving. == History == The theory of embodied cognition, along with the multiple aspects it comprises, can be regarded as the imminent result of an intellectual skepticism towards the flourishment of the disembodied theory of mind put forth by René Descartes in the 17th century. According to Cartesian dualism, the mind is entirely distinct from the body and can be successfully explained and understood without reference to the body or to its processes. Research has been done to identify the set of ideas that would establish what could be considered as the early stages of embodied cognition around inquiries regarding the mind-body-soul rel

    Read more →
  • Toolchain

    Toolchain

    A toolchain is a set of software development tools used to build and otherwise develop software. Often, the tools are executed sequentially and form a pipeline such that the output of one tool is the input for the next. Sometimes the term is used for a set of related tools that are not necessarily executed sequentially. A relatively common and simple toolchain consists of the tools to build for a particular operating system (OS) and CPU architecture: a compiler, a linker, and a debugger. With a cross-compiler, a toolchain can support cross-platform development. For building more complex software systems, many other tools may be in the toolchain. For example, for a video game, the toolchain may include tools for preparing sound effects, music, textures, 3-dimensional models and animations, and for combining these resources into the finished product.

    Read more →
  • Learning vector quantization

    Learning vector quantization

    In computer science, learning vector quantization (LVQ) is a prototype-based supervised classification algorithm. LVQ is the supervised counterpart of vector quantization systems. LVQ can be understood as a special case of an artificial neural network, more precisely, it applies a winner-take-all Hebbian learning-based approach. It is a precursor to self-organizing maps (SOM) and related to neural gas and the k-nearest neighbor algorithm (k-NN). LVQ was invented by Teuvo Kohonen. == Definition == An LVQ system is represented by prototypes W = ( w ( i ) , . . . , w ( n ) ) {\displaystyle W=(w(i),...,w(n))} which are defined in the feature space of observed data. In winner-take-all training algorithms one determines, for each data point, the prototype which is closest to the input according to a given distance measure. The position of this so-called winner prototype is then adapted, i.e. the winner is moved closer if it correctly classifies the data point or moved away if it classifies the data point incorrectly. An advantage of LVQ is that it creates prototypes that are easy to interpret for experts in the respective application domain. LVQ systems can be applied to multi-class classification problems in a natural way. A key issue in LVQ is the choice of an appropriate measure of distance or similarity for training and classification. Recently, techniques have been developed which adapt a parameterized distance measure in the course of training the system, see e.g. (Schneider, Biehl, and Hammer, 2009) and references therein. LVQ can be a valuable aid in classifying text documents. == Algorithm == The algorithms are presented as in. Set up: Let the data be denoted by x i ∈ R D {\displaystyle x_{i}\in \mathbb {R} ^{D}} , and their corresponding labels by y i ∈ { 1 , 2 , … , C } {\displaystyle y_{i}\in \{1,2,\dots ,C\}} . The complete dataset is { ( x i , y i ) } i = 1 N {\displaystyle \{(x_{i},y_{i})\}_{i=1}^{N}} . The set of code vectors is w j ∈ R D {\displaystyle w_{j}\in \mathbb {R} ^{D}} . The learning rate at iteration step t {\displaystyle t} is denoted by α t {\displaystyle \alpha _{t}} . The hyperparameters w {\displaystyle w} and ϵ {\displaystyle \epsilon } are used by LVQ2 and LVQ3. The original paper suggests ϵ ∈ [ 0.1 , 0.5 ] {\displaystyle \epsilon \in [0.1,0.5]} and w ∈ [ 0.2 , 0.3 ] {\displaystyle w\in [0.2,0.3]} . === LVQ1 === Initialize several code vectors per label. Iterate until convergence criteria is reached. Sample a datum x i {\displaystyle x_{i}} , and find out the code vector w j {\displaystyle w_{j}} , such that x i {\displaystyle x_{i}} falls within the Voronoi cell of w j {\displaystyle w_{j}} . If its label y i {\displaystyle y_{i}} is the same as that of w j {\displaystyle w_{j}} , then w j ← w j + α t ( x i − w j ) {\displaystyle w_{j}\leftarrow w_{j}+\alpha _{t}(x_{i}-w_{j})} , otherwise, w j ← w j − α t ( x i − w j ) {\displaystyle w_{j}\leftarrow w_{j}-\alpha _{t}(x_{i}-w_{j})} . === LVQ2 === LVQ2 is the same as LVQ3, but with this sentence removed: "If w j {\displaystyle w_{j}} and w k {\displaystyle w_{k}} and x i {\displaystyle x_{i}} have the same class, then w j ← w j − α t ( x i − w j ) {\displaystyle w_{j}\leftarrow w_{j}-\alpha _{t}(x_{i}-w_{j})} and w k ← w k + α t ( x i − w k ) {\displaystyle w_{k}\leftarrow w_{k}+\alpha _{t}(x_{i}-w_{k})} .". If w j {\displaystyle w_{j}} and w k {\displaystyle w_{k}} and x i {\displaystyle x_{i}} have the same class, then nothing happens. === LVQ3 === Initialize several code vectors per label. Iterate until convergence criteria is reached. Sample a datum x i {\displaystyle x_{i}} , and find out two code vectors w j , w k {\displaystyle w_{j},w_{k}} closest to it. Let d j := ‖ x i − w j ‖ , d k := ‖ x i − w k ‖ {\displaystyle d_{j}:=\|x_{i}-w_{j}\|,d_{k}:=\|x_{i}-w_{k}\|} . If min ( d j d k , d k d j ) > s {\displaystyle \min \left({\frac {d_{j}}{d_{k}}},{\frac {d_{k}}{d_{j}}}\right)>s} , where s = 1 − w 1 + w {\displaystyle s={\frac {1-w}{1+w}}} , then If w j {\displaystyle w_{j}} and x i {\displaystyle x_{i}} have the same class, and w k {\displaystyle w_{k}} and x i {\displaystyle x_{i}} have different classes, then w j ← w j + α t ( x i − w j ) {\displaystyle w_{j}\leftarrow w_{j}+\alpha _{t}(x_{i}-w_{j})} and w k ← w k − α t ( x i − w k ) {\displaystyle w_{k}\leftarrow w_{k}-\alpha _{t}(x_{i}-w_{k})} . If w k {\displaystyle w_{k}} and x i {\displaystyle x_{i}} have the same class, and w j {\displaystyle w_{j}} and x i {\displaystyle x_{i}} have different classes, then w j ← w j − α t ( x i − w j ) {\displaystyle w_{j}\leftarrow w_{j}-\alpha _{t}(x_{i}-w_{j})} and w k ← w k + α t ( x i − w k ) {\displaystyle w_{k}\leftarrow w_{k}+\alpha _{t}(x_{i}-w_{k})} . If w j {\displaystyle w_{j}} and w k {\displaystyle w_{k}} and x i {\displaystyle x_{i}} have the same class, then w j ← w j − ϵ α t ( x i − w j ) {\displaystyle w_{j}\leftarrow w_{j}-\epsilon \alpha _{t}(x_{i}-w_{j})} and w k ← w k + ϵ α t ( x i − w k ) {\displaystyle w_{k}\leftarrow w_{k}+\epsilon \alpha _{t}(x_{i}-w_{k})} . If w k {\displaystyle w_{k}} and x i {\displaystyle x_{i}} have different classes, and w j {\displaystyle w_{j}} and x i {\displaystyle x_{i}} have different classes, then the original paper simply does not explain what happens in this case, but presumably nothing happens in this case. Otherwise, skip. Note that condition min ( d j d k , d k d j ) > s {\displaystyle \min \left({\frac {d_{j}}{d_{k}}},{\frac {d_{k}}{d_{j}}}\right)>s} , where s = 1 − w 1 + w {\displaystyle s={\frac {1-w}{1+w}}} , precisely means that the point x i {\displaystyle x_{i}} falls between two Apollonian spheres.

    Read more →
  • Stochastic Neural Analog Reinforcement Calculator

    Stochastic Neural Analog Reinforcement Calculator

    The Stochastic Neural Analog Reinforcement Calculator (SNARC) is a neural network machine designed by Marvin Minsky. Prompted by a letter from Minsky, George Armitage Miller gathered the funding (a few thousand dollars) for the project from the Office of Naval Research of the U.S. Department of Defense in the summer of 1951 with the work to be carried out by Minsky, who was then a graduate student in mathematics at Princeton University. At the time, a physics graduate student at Princeton, Dean S. Edmonds, volunteered that he was good with electronics and therefore Minsky brought him onto the project. During undergraduate years, Minsky was inspired by the 1943 Warren McCulloch and Walter Pitts paper on artificial neurons, and decided to build such a machine. The learning was Skinnerian reinforcement learning, and Minsky talked with Skinner extensively during the development of the machine. They tested the machine on a copy of Shannon's maze, and found that it could learn to solve the maze. Unlike Shannon's maze, this machine did not control a physical robot, but simulated rats running in a maze. The simulation is displayed as an "arrangement of lights", and the circuit was reinforced each time the simulated rat reached the goal. The machine surprised its creators. "The rats actually interacted with one another. If one of them found a good path, the others would tend to follow it." The machine itself is a randomly connected network of approximately 40 Hebb synapses. These synapses each have a memory that holds the probability that signal comes in one input and another signal will come out of the output. There is a probability knob that goes from 0 to 1 that shows this probability of the signals propagating. If the probability signal gets through, a capacitor remembers this function and engages an electromagnetic clutch. At this point, the operator will press a button to give a reward to the machine. This activates a motor on a surplus Minneapolis-Honeywell C-1 gyroscopic autopilot from a B-24 bomber. The motor turns a chain that goes to all 40 synapse machines, checking if the clutch is engaged or not. As the capacitor can only "remember" for a certain amount of time, the chain only catches the most recent updates of the probabilities. Each neuron contained 6 vacuum tubes and a motor. The entire machine is "the size of a grand piano" and contained 300 vacuum tubes. The tubes failed regularly, but the machine would still work despite failures. This machine is considered one of the first pioneering attempts at the field of artificial intelligence. Minsky went on to be a founding member of MIT's Project MAC, which split to become the MIT Laboratory for Computer Science and the MIT Artificial Intelligence Lab, and is now the MIT Computer Science and Artificial Intelligence Laboratory. In 1985 Minsky became a founding member of the MIT Media Laboratory. According to Minsky, he loaned the machine to students in Dartmouth, and subsequently lost, except for a single neuron. A photo of Minsky's last neuron can be seen here. The photo shows 6 vacuum tubes, one of which is a Sylvania JAN-CHS-6H6GT/G/VT-90A.

    Read more →