AI Content Strategist Jobs

AI Content Strategist Jobs — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Tiki Wiki CMS Groupware

    Tiki Wiki CMS Groupware

    Tiki Wiki CMS Groupware or simply Tiki, originally known as TikiWiki, is a free and open source Wiki-based content management system and online office suite written primarily in PHP and distributed under the GNU Lesser General Public License (LGPL-2.1-only) license. In addition to enabling websites and portals on the internet and on intranets and extranets, Tiki contains a number of collaboration features allowing it to operate as a Geospatial Content Management System (GeoCMS) and Groupware web application. Tiki includes all the basic features common to most CMSs such as the ability to register and maintain individual user accounts within a flexible and rich permission / privilege system, create and manage menus, RSS-feeds, customize page layout, perform logging, and administer the system. All administration tasks are accomplished through a browser-based user interface. Tiki features an all-in-one design, as opposed to a core+extensions model followed by other CMSs. This allows for future-proof upgrades (since all features are released together), but has the drawback of an extremely large codebase (more than 1,000,000 lines). Tiki can run on any computing platform that supports both a web server capable of running PHP 5 (including Apache HTTP Server, IIS, Lighttpd, Hiawatha, Cherokee, and nginx) and a MySQL/MariaDB database to store content and settings. == Major components == Tiki has four major categories of components: content creation and management tools, content organization tools and navigation aids, communication tools, and configuration and administration tools. These components enable administrators and users to create and manage content, as well as letting them communicate to others and configure sites. In addition, Tiki allows each user to choose from various visual themes. These themes are implemented using CSS and the open source Smarty template engine. Additional themes can be created by a Tiki administrator for branding or customization as well. == Internationalization == Tiki is an international project, supporting many languages. The default interface language in Tiki is English, but any language that can be encoded and displayed using the UTF-8 encoding can be supported. Translated strings can be included via an external language file, or by translating interface strings directly, through the database. As of 29 September 2005, Tiki had been fully translated into eight languages and reportedly 90% or more translated into another five languages, as well as partial translations for nine additional languages. Tiki also supports interactive translation of actual wiki pages and was the initial wiki engine used in the Cross Lingual Wiki Engine Project. This allows Tiki-based web sites to have translated content — not just the user interface. == Implementation == Tiki is developed primarily in PHP with some JavaScript code. It uses MySQL/MariaDB as a database. It will run on any server that provides PHP 5, including Apache and Microsoft's IIS. Tiki components make extensive use of other open source projects, including Zend Framework, Smarty, jQuery, HTML Purifier, FCKeditor, Raphaël, phpCAS, and Morcego. When used with Mapserver Tiki can become a Geospatial Content Management System. == Project team == Tiki is under active development by a large international community of over 300 developers and translators, and is one of the largest open-source teams in the world. Project members have donated the resources and bandwidth required to host the tiki.org website and various subdomains. The project members refer to this dependence on their own product as "eating their own dogfood", which they have been doing since the early days of the project. Tiki community members also participate in various related events such as WikiSym and the Libre Software Meeting. == History == Tiki has been hosted on SourceForge.net since its initial release (Release 0.9, named Spica) in October 2002. It was primarily the development of Luis Argerich (Buenos Aires, Argentina), Eduardo Polidor (São Paulo, Brazil), and Garland Foster (Green Bay, WI, United States). In July 2003, Tiki was named the SourceForge.net July 2003 Project of the Month. In late 2003, a fork of Tiki was used to create Bitweaver. In 2006, Tiki was named to CMS Report's Top 30 Web Applications. In 2008, Tiki was named to EContent magazine's Top 100 In 2009, Tiki adopted a six-month release cycle and announced the selection of a Long Term Support (LTS) version and the Tiki Software Community Association was formed as the legal steward for Tiki. The Tiki Software Association is a not-for-profit entity established in Canada. Previously, the entire project was run entirely by volunteers. In 2010, Tiki received Best of Open Source Software Applications Award (BOSSIE) from InfoWorld, in the Applications category. In 2011, Tiki was named to CMS Report's Top 30 Web Applications. In 2012, Tiki was named "Best Web Tool" by WebHostingSearch.com, and "People's Choice: Best Free CMS" by CMS Critic. In 2016, Tiki was named as one of the "10 Best Open Source Collaboration Software Tools" by Small Business Computing. == Name == The name TikiWiki is written in CamelCase, a common Wiki syntax indicating a hyperlink within the Wiki. It is most likely a compound word combining two Polynesian terms, Tiki and Wiki, to create a self-rhyming name similar to wikiwiki, a common variant of wiki. A backronym has also been formed for Tiki: Tightly Integrated Knowledge Infrastructure. == Release Information and History == In general, the Tiki Software Community Association releases a new major version of Tiki Wiki every 8 months where prior, non-LTS, major versions are supported until the first minor version release of the next major version (i.e., 16.0 ⇒ 17.1). Starting with version 12.x, Tiki Wiki LTS is supported for 5 years where it enters a security/maintenance release cycle upon the release of the next LTS version. Tiki Wiki's release history is outlined below.

    Read more →
  • Shape context

    Shape context

    Shape context is a feature descriptor used in object recognition. Serge Belongie and Jitendra Malik proposed the term in their paper "Matching with Shape Contexts" in 2000. == Theory == The shape context is intended to be a way of describing shapes that allows for measuring shape similarity and the recovering of point correspondences. The basic idea is to pick n points on the contours of a shape. For each point pi on the shape, consider the n − 1 vectors obtained by connecting pi to all other points. The set of all these vectors is a rich description of the shape localized at that point but is far too detailed. The key idea is that the distribution over relative positions is a robust, compact, and highly discriminative descriptor. So, for the point pi, the coarse histogram of the relative coordinates of the remaining n − 1 points, h i ( k ) = # { q ≠ p i : ( q − p i ) ∈ bin ( k ) } {\displaystyle h_{i}(k)=\#\{q\neq p_{i}:(q-p_{i})\in {\mbox{bin}}(k)\}} is defined to be the shape context of p i {\displaystyle p_{i}} . The bins are normally taken to be uniform in log-polar space. The fact that the shape context is a rich and discriminative descriptor can be seen in the figure below, in which the shape contexts of two different versions of the letter "A" are shown. (a) and (b) are the sampled edge points of the two shapes. (c) is the diagram of the log-polar bins used to compute the shape context. (d) is the shape context for the point marked with a circle in (a), (e) is that for the point marked as a diamond in (b), and (f) is that for the triangle. As can be seen, since (d) and (e) are the shape contexts for two closely related points, they are quite similar, while the shape context in (f) is very different. For a feature descriptor to be useful, it needs to have certain invariances. In particular it needs to be invariant to translation, scaling, small perturbations, and, depending on the application, rotation. Translational invariance comes naturally to shape context. Scale invariance is obtained by normalizing all radial distances by the mean distance α {\displaystyle \alpha } between all the point pairs in the shape although the median distance can also be used. Shape contexts are empirically demonstrated to be robust to deformations, noise, and outliers using synthetic point set matching experiments. One can provide complete rotational invariance in shape contexts. One way is to measure angles at each point relative to the direction of the tangent at that point (since the points are chosen on edges). This results in a completely rotationally invariant descriptor. But of course this is not always desired since some local features lose their discriminative power if not measured relative to the same frame. Many applications in fact forbid rotational invariance e.g. distinguishing a "6" from a "9". == Use in shape matching == A complete system that uses shape contexts for shape matching consists of the following steps (which will be covered in more detail in the Details of Implementation section): Randomly select a set of points that lie on the edges of a known shape and another set of points on an unknown shape. Compute the shape context of each point found in step 1. Match each point from the known shape to a point on an unknown shape. To minimize the cost of matching, first choose a transformation (e.g. affine, thin plate spline, etc.) that warps the edges of the known shape to the unknown (essentially aligning the two shapes). Then select the point on the unknown shape that most closely corresponds to each warped point on the known shape. Calculate the "shape distance" between each pair of points on the two shapes. Use a weighted sum of the shape context distance, the image appearance distance, and the bending energy (a measure of how much transformation is required to bring the two shapes into alignment). To identify the unknown shape, use a nearest-neighbor classifier to compare its shape distance to shape distances of known objects. == Details of implementation == === Step 1: Finding a list of points on shape edges === The approach assumes that the shape of an object is essentially captured by a finite subset of the points on the internal or external contours on the object. These can be simply obtained using the Canny edge detector and picking a random set of points from the edges. Note that these points need not and in general do not correspond to key-points such as maxima of curvature or inflection points. It is preferable to sample the shape with roughly uniform spacing, though it is not critical. === Step 2: Computing the shape context === This step is described in detail in the Theory section. === Step 3: Computing the cost matrix === Consider two points p and q that have normalized K-bin histograms (i.e. shape contexts) g(k) and h(k). As shape contexts are distributions represented as histograms, it is natural to use the χ2 test statistic as the "shape context cost" of matching the two points: C S = 1 2 ∑ k = 1 K [ g ( k ) − h ( k ) ] 2 g ( k ) + h ( k ) {\displaystyle C_{S}={\frac {1}{2}}\sum _{k=1}^{K}{\frac {[g(k)-h(k)]^{2}}{g(k)+h(k)}}} The values of this range from 0 to 1. In addition to the shape context cost, an extra cost based on the appearance can be added. For instance, it could be a measure of tangent angle dissimilarity (particularly useful in digit recognition): C A = 1 2 ‖ ( cos ⁡ ( θ 1 ) sin ⁡ ( θ 1 ) ) − ( cos ⁡ ( θ 2 ) sin ⁡ ( θ 2 ) ) ‖ {\displaystyle C_{A}={\frac {1}{2}}{\begin{Vmatrix}{\dbinom {\cos(\theta _{1})}{\sin(\theta _{1})}}-{\dbinom {\cos(\theta _{2})}{\sin(\theta _{2})}}\end{Vmatrix}}} This is half the length of the chord in unit circle between the unit vectors with angles θ 1 {\displaystyle \theta _{1}} and θ 2 {\displaystyle \theta _{2}} . Its values also range from 0 to 1. Now the total cost of matching the two points could be a weighted-sum of the two costs: C = ( 1 − β ) C S + β C A {\displaystyle C=(1-\beta )C_{S}+\beta C_{A}\!\,} Now for each point pi on the first shape and a point qj on the second shape, calculate the cost as described and call it Ci,j. This is the cost matrix. === Step 4: Finding the matching that minimizes total cost === Now, a one-to-one matching π ( i ) {\displaystyle \pi (i)} that matches each point pi on shape 1 and qj on shape 2 that minimizes the total cost of matching, H ( π ) = ∑ i C ( p i , q π ( i ) ) {\displaystyle H(\pi )=\sum _{i}C\left(p_{i},q_{\pi (i)}\right)} is needed. This can be done in O ( N 3 ) {\displaystyle O(N^{3})} time using the Hungarian method, although there are more efficient algorithms. To have robust handling of outliers, one can add "dummy" nodes that have a constant but reasonably large cost of matching to the cost matrix. This would cause the matching algorithm to match outliers to a "dummy" if there is no real match. === Step 5: Modeling transformation === Given the set of correspondences between a finite set of points on the two shapes, a transformation T : R 2 → R 2 {\displaystyle T:\mathbb {R} ^{2}\to \mathbb {R} ^{2}} can be estimated to map any point from one shape to the other. There are several choices for this transformation, described below. ==== Affine ==== The affine model is a standard choice: T ( p ) = A p + o {\displaystyle T(p)=Ap+o\!} . The least squares solution for the matrix A {\displaystyle A} and the translational offset vector o is obtained by: o = 1 n ∑ i = 1 n ( p i − q π ( i ) ) , A = ( Q + P ) t {\displaystyle o={\frac {1}{n}}\sum _{i=1}^{n}\left(p_{i}-q_{\pi (i)}\right),A=(Q^{+}P)^{t}} Where P = ( 1 p 11 p 12 ⋮ ⋮ ⋮ 1 p n 1 p n 2 ) {\displaystyle P={\begin{pmatrix}1&p_{11}&p_{12}\\\vdots &\vdots &\vdots \\1&p_{n1}&p_{n2}\end{pmatrix}}} with a similar expression for Q {\displaystyle Q\!} . Q + {\displaystyle Q^{+}\!} is the pseudoinverse of Q {\displaystyle Q\!} . ==== Thin plate spline ==== The thin plate spline (TPS) model is the most widely used model for transformations when working with shape contexts. A 2D transformation can be separated into two TPS function to model a coordinate transform: T ( x , y ) = ( f x ( x , y ) , f y ( x , y ) ) {\displaystyle T(x,y)=\left(f_{x}(x,y),f_{y}(x,y)\right)} where each of the ƒx and ƒy have the form: f ( x , y ) = a 1 + a x x + a y y + ∑ i = 1 n ω i U ( ‖ ( x i , y i ) − ( x , y ) ‖ ) , {\displaystyle f(x,y)=a_{1}+a_{x}x+a_{y}y+\sum _{i=1}^{n}\omega _{i}U\left({\begin{Vmatrix}(x_{i},y_{i})-(x,y)\end{Vmatrix}}\right),} and the kernel function U ( r ) {\displaystyle U(r)\!} is defined by U ( r ) = r 2 log ⁡ r 2 {\displaystyle U(r)=r^{2}\log r^{2}\!} . The exact details of how to solve for the parameters can be found elsewhere but it essentially involves solving a linear system of equations. The bending energy (a measure of how much transformation is needed to align the points) will also be easily obtained. ==== Regularized TPS ==== The TPS formulation above has exact matching requirement for the pairs of points on the two shapes. For noisy data, it is best to

    Read more →
  • Attensity

    Attensity

    Attensity was an American company that provided social analytics and engagement applications for social customer relationship management (social CRM). Attensity's text analytics software applications extracted facts, relationships and sentiment from unstructured data. == History == Attensity was founded in 2000. An early investor in Attensity was In-Q-Tel, which funds technology to support the missions of the US Government and the broader DOD. InTTENSITY, an independent company that has combined Inxight with Attensity Software (the only joint development project that combines two InQTel funded software packages), was the exclusive distributor and outlet for Attensity in the Federal Market. In 2009, Attensity Corp., then based in Palo Alto, merged with Germany's Empolis and Living-e AG to form Attensity Group. In 2010, Attensity Group acquired Biz360, a provider of social media monitoring and market intelligence solutions. In early 2012, Attensity Group divested itself of the Empolis business unit via a management buyout; that unit currently conducts business under its pre-merger name. Attensity Group was a closely held private company. Its majority shareholder was Aeris Capital, a private Swiss investment office advising a high-net-worth individual and his charitable foundation. Foundation Capital, Granite Ventures, and Scale Venture Partners were among Biz360's investors and thus became shareholders in Attensity Group. In February 2016, Attensity's IP assets were acquired by InContact, and Attensity closed.

    Read more →
  • ACL Data Collection Initiative

    ACL Data Collection Initiative

    The ACL Data Collection Initiative (ACL/DCI) was a project established in 1989 by the Association for Computational Linguistics (ACL) to create and distribute large text and speech corpora for computational linguistics research. The initiative aimed to address the growing need for substantial text databases that could support research in areas such as natural language processing, speech recognition, and computational linguistics. By 1993, the initiative’s activities had effectively ceased, with its functions and datasets absorbed by the Linguistic Data Consortium (LDC), which was founded in 1992. == Objectives == The ACL/DCI had several key objectives: To acquire a large and diverse text corpus from various sources To transform the collected texts into a common format based on the Standard Generalized Markup Language (SGML) To make the corpus available for scientific research at low cost with minimal restrictions To provide a common database that would allow researchers to replicate or extend published results To reduce duplication of effort among researchers in obtaining and preparing text data These objectives were designed to address the growing demand for very large amounts of text arising from applications in recognition and analysis of text and speech. Its core objective was to "oversee the acquisition and preparation of a large text corpus to be made available for scientific research at cost and without royalties". == History == By the late 1980s, researchers in computational linguistics and speech recognition faced a significant problem: the lack of large-scale, accessible text corpora for developing statistical models and testing algorithms. Existing generally available text databases were too small to meet the needs of developing applications in text and speech recognition. The initiative was formed to meet this need by collecting, standardizing, and distributing large quantities of text data with minimal restrictions for scientific research. As stated by Liberman (1990), "research workers have been severely hampered by the lack of appropriate materials, and specially by the lack of a large enough body of text on which published results can be replicated or extended by others." The ACL/DCI committee was established in February 1989. The committee included members from academic and industrial research laboratories in the United States and Europe. The initiative was chaired by Mark Liberman from the University of Pennsylvania (formerly of AT&T Bell Laboratories). Other committee members included representatives from organizations such as Bellcore, IBM T.J. Watson Research Center, Cambridge University, Virginia Polytechnic Institute & State University, Northeastern University, University of Pennsylvania, SRI International, MCC, Xerox PARC, ISSCO, and University of Pisa. The project operated initially without dedicated funding, relying on volunteer efforts from committee members and their affiliated institutions. Key supporters included AT&T Bell Labs, Bellcore, IBM, Xerox, and the University of Pennsylvania, which allowed the use of their computing facilities for ACL/DCI-related work. Previously running on volunteer effort pro bono, in 1991, it obtained funding from General Electric and the National Science Foundation (IRI-9113530). == Data == As of 1990, the ACL/DCI had collected hundreds of millions of words of diverse text. The collection included: Wall Street Journal articles (25 to 50 million words); Canadian Hansard (parliamentary records) in parallel English and French versions: cleaned-up English Hansard donated by the IBM alignment models group (100 million words), and original Bilingual Hansard (from a different time period) obtained directly (200 million words). Collins English Dictionary (1979 edition), both as fulltext (3 million words) and as various "database" versions, constructed using "typographers' tape" donated by Collins, which were computer tapes containing the structured digital data used to typeset and print the 1979 edition of the dictionary; Emails from ARPANET newsletters for the ACM Special Interest Group on Information Retrieval Forum (IRLIST) and AIList Digest issues distributed over the ARPANET (AILIST) (5 million words), both collected by Edward A. Fox at VIPSU; Articles on networking (2 million words); U.S. Department of Agriculture Extension Service Fact Sheets (>1 million words); 200,000 scientific abstracts of about 1,500 words each from the Department of Energy (25 million words); Archives of the Challenger Investigation Commission, including transcripts of depositions and hearings (2.5 million words); Books from the Library of America, including works by Mark Twain, Eugene O'Neill, Ralph Waldo Emerson, Herman Melville, W.E.B. DuBois, Willa Cather, and Benjamin Franklin (130 books, 20 million words); Public domain books like the King James Bible, Tristram Shandy, The Federalist Papers; Several million words of transcribed radiologists' reports, donated by Francis Ganong at Kurzweil Applied Intelligence Inc (about 5 million words); The Child Language Data Exchange corpus of child language acquisition transcripts; U.S. Department of Justice Justice Retrieval and Inquiry System (JURIS) materials; The Swiss Civil Code in parallel German, French and Italian; Economic reports from the Union Bank of Switzerland, in parallel English, German, French and Italian; About 12K words of administrative policy manuals and 14K words of administrative memos, contributed by Geoff Pullum of U.C.S.C.; Material from various ACM journals and the ACL journal Computational Linguistics; The CSLI publications series: 50-100 reports (8K words each) and 5-10 books (80K words each). The initiative started with North American English text but expanded to include Canadian French and planned to include Japanese, Chinese, and other Asian languages. At least 5 million words from the collection were tagged under the Penn Treebank project, and those tags were distributed by DCI as well. After DCI was absorbed by the LDC, the datasets were curated under LDC. == Format == The ACL/DCI corpus was coded in a standard form based on SGML (Standard Generalized Markup Language, ISO 8879), consistent with the recommendations of the Text Encoding Initiative (TEI), of which the DCI was an affiliated project. The TEI was a joint project of the ACL, the Association for Computers and the Humanities, and the Association for Literary and Linguistic Computing, aiming to provide a common interchange format for literary and linguistic data. The initiative planned to add annotations reflecting consensually approved linguistic features like part of speech and various aspects of syntactic and semantic structure over time. == Examples == As an example of the use of ACL/DCI, consider the Wall Street Journal (WSJ) corpus for speech recognition research. The WSJ corpus was used as the basis for the DARPA Spoken Language System (SLS) community's Continuous Speech Recognition (CSR) Corpus. The WSJ corpus became a standard benchmark for evaluating speech recognition systems and has been used in numerous research papers. The WSJ CSR Corpus provided DARPA with its first general-purpose English, large vocabulary, natural language, high perplexity corpus containing speech (400 hours) and text (47 million words) during 1987–89. The text corpus was 313 MB in size. The text was preprocessed to remove ambiguity in the word sequence that a reader might choose, ensuring that the unread text used to train language models was representative of the spoken test material. The preprocessing included converting numbers into orthographics, expanding abbreviations, resolving apostrophes and quotation marks, and marking punctuation. As another example, the Yarowsky algorithm used bitext data from DCI to train a simple word-sense disambiguation model that was competitive with advanced models trained on smaller datasets. == Distribution == Materials from the ACL/DCI collection were distributed to research groups on a non-commercial basis. By 1990, about 25 research groups and individual researchers had received tapes containing various portions of the collected material. To obtain the data, researchers had to sign an agreement not to redistribute the data or make direct commercial use of it. However, commercial application of "analytical materials" derived from the text, such as statistical tables or grammar rules, was explicitly permitted. The initiative first distributed data via 12-inch reels of 9-track tape, then via CD-ROMs. Each such tape could contain 30 million words compressed via the Lempel-Ziv algorithms. The first CD-ROM distribution was in 1991, funded by Dragon Systems Inc. It contained Collins English Dictionary, WSJ, scientific abstracts provided by the U.S. Department of Energy, and the Penn Treebank.

    Read more →
  • Integrated writing environment

    Integrated writing environment

    An integrated writing environment (IWE) is software that provides comprehensive writing and knowledge management functionality for writers and information workers. IWEs enable writers and information workers to perform a variety of tasks related to the document in the IWE in a single environment. This provides a distraction-free workspace and streamlined writing experience. IWEs provide similar efficiency and functionality benefits to writers and information professionals that integrated development environments (IDEs) provide to software developers. == Overview == IWEs are designed to maximize productivity and help improve the quality of written work by integrating together tools that allow users to work effectively in a single application. The IWE features may include integrated content search, reversion management, outlining, note management, and reference management, as may be suitable for the target field of use. == List of IWEs == Celtx This IWE is intended for screenplay writers and has screenplay writing and management tools. Celtex provides tools for the pre-production work phase, story development, storyboarding, script breakdowns, production scheduling, and reports. Scrivener This IWE targets novel, research paper, and script writing. Scrivener provides tools to organize notes and research documents for easy access and referencing. After completing the writing, Scrivener allows the user to export the document to formats supported by common word processors, such as Microsoft Word. TeXstudio This IWE targets LaTeX documents and provides interactive spelling checker, code folding, and syntax highlighting.

    Read more →
  • Contract management software

    Contract management software

    Contract management software constitutes software and associated data management used to support contract management, contract lifecycle management, and contractor management on projects in the procurement of goods and services. It may be used together with project management software. == History == Historically, contract management was seen as a "paper-intensive" process. Early steps from the early 2000's reported by the Aberdeen Group required extensive data conversion work to enable documents to be handled electronically. With the adoption of the European Union's General Data Protection Regulation (GDPR) in 2016, companies needed to take additional steps in regards to contract management. Each data responsible entity was obliged to sign data processing agreements (DPAs) with the various vendors, who treat personal data on behalf of the data responsible. DPAs need to be regularly controlled, adjusted and renewed, which adds an extra agreement to such vendors or at least an extra DPA addendum to each agreement. By 2018, Ardent Partner's research had found that software used for automating contract management activities was being more extensively used among major companies or businesses with "Best-in-Class" procurement teams. Contract management process automation was found to be closely linked with more effective internal business collaboration, standardization and risk management. == Advantages and key functions == Using contract management software can have multiple benefits compared to manually managing paper contracts. This software can help keep track of multiple activities and can have features for automating administration, ensuring compliance, monitoring risk, running reports and triggering alerts. In addition to these types of features, contract management software systems provide a centralized repository for employees to quickly access all contracts worldwide in one place. Contract management software is produced by many companies, working on a range of scales and offering varying degrees of customizability. Basic functions should include the ability to store contract documents, track changes to contract documents, search documents for a particular criterion, send key date alerts and to report required aspects of the contract. Other functions include managing a new contract request, capturing related data, following a document through a review and approval process, and collecting digital signatures. Contract management software may also be an aid to project portfolio management and spend analysis, and may also monitor KPIs. Leading contract management software provides contract visibility, monitoring, and compliance to automate and streamline the contract lifecycle process. Contract management software which uses artificial intelligence (AI) can identify contract types based on pattern recognition. AI contracting software trains its algorithms on a set of contract data to recognize patterns and extract variables such as clauses, dates, and parties. It also offers simple prediction capabilities, by sorting through a large volume of contracts and flagging individual contracts based on specified criteria. AI software can also read contracts in multiple formats and languages, extract contract data, and provide analytics. It can reduce the risk of human error in contract drafting and review. A centralized repository provides a critical advantage allowing for all contract documents to be stored within one location. Having contracts stored in multiple locations can delay and interrupt the contracting process. == Contract risk management software (CRMS) for capital projects == Very large enterprises, such as capital expenditure (capex) projects, involve multiple parties and high risk and uncertainty. They are unlike traditional operating contracts in that they are subject to shared deadlines in unique situations. As the complexity of these unique projects increases, the relationships between parties become more important. This requires contract management software, or contract risk management software (CRMS), to become more dynamic and responsive. The terms of these capex contracts necessarily involve assumptions at the start of the process and are likely to change over the lifetime of the project lifecycle. For this reason, CRMS must be capable of recording one single instance of agreed changes to contract terms and incorporating these changes in an auditable and legally robust way. With multiple decision makers involved, CRMS should also make accountability more transparent and enable faster decisions about variation proposals.

    Read more →
  • Ed (chatbot)

    Ed (chatbot)

    Ed was a chatbot co-developed by the Los Angeles Unified School District and AllHere Education. Described as a learning acceleration platform, it was the first personal assistant for students in the United States. Part of the district's Individual Acceleration Plan, it was able to interact with students both verbally and visually, offering support in 100 languages. The chatbot was launched on March 20, 2024, as part of the district's plan for academic recovery from the COVID-19 pandemic and to improve overall academic performance. Utilizing artificial intelligence, Ed organizes data and reports on grades, test scores, and attendance, creating individualized plans for each student. After the company behind it, AllHere, collapsed, the district shuttered operations of the chatbot on June 14, 2024. The firm is under investigation by the US Federal Bureau of Investigation. == History == On February 14, 2022, Alberto M. Carvalho became the Superintendent of the Los Angeles Unified School District, pledging to give the district a full academic recovery from the COVID-19 pandemic. In December 2022, he announced the Individual Acceleration Plan for the district, which aimed to provide each student with a unique progress report and help them determine if they were on track to graduate. The district faced criticism from disability advocates for its management of Individualized Education Programs, and in April 2022, the United States Department of Education announced that the district had failed to provide appropriate educational services to students with disabilities during the pandemic. The district had been grappling with significant absenteeism issues since the pandemic, which led to declining academic performance and disengagement among students. On February 17, 2023, the district issued a request for proposals to develop a fully integrated portal system. Later that year, they signed a $6 million, five-year contract with AllHere Education, a Boston-based company founded in 2016. The introduction of Ed follows the public launch of ChatGPT, which has been utilized by both teachers and students in educational settings. On August 4, 2023, during an annual address at the Walt Disney Concert Hall, Carvalho and the Los Angeles Unified School District announced the launch of Ed. The district invested $4 million into the chatbot, with Carvalho noting that this cost would be halved thanks to donor and grant funding. The chatbot was launched on March 20, 2024. Following its launch, a press conference was held to address security and technology concerns. Carvalho stated that the district had collaborated with security companies and incorporated filters to screen for threatening language. Months after its launch, AllHere Education furloughed most of its staff on June 14, citing their “current financial position” on its website as the reason. After learning about the furlough, the district terminated its dealings with AllHere Education. However, it stated its intention to bring the chatbot back in the future once officials determine the best course of action. Carvalho announced that he would appoint an independent task force to review what went wrong with AllHere Education and the chatbot. On February 25, 2026, the FBI served a search warrant on Carvalho’s home and office in connection with AllHere. The FBI also raided the LAUSD's headquarters. == Service == The chatbot was described as a personal assistant and a "one-stop shop for parents and students" who want to see information about a student's attendance and grades, as well as other resources from the district. Additionally, the application can function as an alarm clock, provide daily lunch menus from the school cafeteria, and offer updates on the location of school buses. The chatbot also helps students and parents who do not speak English as their first language by translating displayed information into approximately 100 different languages. The application can also help with submitting applications and give updates on progress and upcoming assignments. The district stated that the primary goal of Ed was to actively motivate students to complete homework and other tasks. == Reception == The chatbot received a mostly positive reception among parents and observers upon its launch. Some parents and teachers expressed caution about the technology, voicing concerns that the district's push for its implementation lacked public accountability. Rob Nelson from the University of Pennsylvania described the district's strategy as risky, saying that the release felt "like the beginning of a Clippy-level disaster". After the chatbot's shutdown, The 74 criticized it for misusing student data. Chris Whiteley, a former software engineer at AllHere Education, alleged that the data collected by the chatbot likely violated the district's data privacy rules.

    Read more →
  • Mean shift

    Mean shift

    Mean shift is a non-parametric feature-space mathematical analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm. Application domains include cluster analysis in computer vision and image processing. == History == The mean shift procedure is usually credited to work by Fukunaga and Hostetler in 1975. It is, however, reminiscent of earlier work by Schnell in 1964. == Overview == Mean shift is a procedure for locating the maxima—the modes—of a density function given discrete data sampled from that function. This is an iterative method, and we start with an initial estimate x {\displaystyle x} . Let a kernel function K ( x i − x ) {\displaystyle K(x_{i}-x)} be given. This function determines the weight of nearby points for re-estimation of the mean. Typically a Gaussian kernel on the distance to the current estimate is used, K ( x i − x ) = e − c | | x i − x | | 2 {\displaystyle K(x_{i}-x)=e^{-c||x_{i}-x||^{2}}} . The weighted mean of the density in the window determined by K {\displaystyle K} is m ( x ) = ∑ x i ∈ N ( x ) K ( x i − x ) x i ∑ x i ∈ N ( x ) K ( x i − x ) {\displaystyle m(x)={\frac {\sum _{x_{i}\in N(x)}K(x_{i}-x)x_{i}}{\sum _{x_{i}\in N(x)}K(x_{i}-x)}}} where N ( x ) {\displaystyle N(x)} is the neighborhood of x {\displaystyle x} , a set of points for which K ( x i − x ) ≠ 0 {\displaystyle K(x_{i}-x)\neq 0} . The difference m ( x ) − x {\displaystyle m(x)-x} is called mean shift in Fukunaga and Hostetler. The mean-shift algorithm now sets x ← m ( x ) {\displaystyle x\leftarrow m(x)} , and repeats the estimation until m ( x ) {\displaystyle m(x)} converges. Although the mean shift algorithm has been widely used in many applications, a rigid proof for the convergence of the algorithm using a general kernel in a high dimensional space is still not known. Aliyari Ghassabeh showed the convergence of the mean shift algorithm in one dimension with a differentiable, convex, and strictly decreasing profile function. However, the one-dimensional case has limited real world applications. Also, the convergence of the algorithm in higher dimensions with a finite number of the stationary (or isolated) points has been proved. However, sufficient conditions for a general kernel function to have finite stationary (or isolated) points have not been provided. Gaussian Mean-Shift is an Expectation–maximization algorithm. == Details == Let data be a finite set S {\displaystyle S} embedded in the n {\displaystyle n} -dimensional Euclidean space, X {\displaystyle X} . Let K {\displaystyle K} be a flat kernel that is the characteristic function of the λ {\displaystyle \lambda } -ball in X {\displaystyle X} , In each iteration of the algorithm, s ← m ( s ) {\displaystyle s\leftarrow m(s)} is performed for all s ∈ S {\displaystyle s\in S} simultaneously. The first question, then, is how to estimate the density function given a sparse set of samples. One of the simplest approaches is to just smooth the data, e.g., by convolving it with a fixed kernel of width h {\displaystyle h} , where x i {\displaystyle x_{i}} are the input samples and k ( r ) {\displaystyle k(r)} is the kernel function (or Parzen window). h {\displaystyle h} is the only parameter in the algorithm and is called the bandwidth. This approach is known as kernel density estimation or the Parzen window technique. Once we have computed f ( x ) {\displaystyle f(x)} from the equation above, we can find its local maxima using gradient ascent or some other optimization technique. The problem with this "brute force" approach is that, for higher dimensions, it becomes computationally prohibitive to evaluate f ( x ) {\displaystyle f(x)} over the complete search space. Instead, mean shift uses a variant of what is known in the optimization literature as multiple restart gradient descent. Starting at some guess for a local maximum, y k {\displaystyle y_{k}} , which can be a random input data point x 1 {\displaystyle x_{1}} , mean shift computes the gradient of the density estimate f ( x ) {\displaystyle f(x)} at y k {\displaystyle y_{k}} and takes an uphill step in that direction. == Types of kernels == Kernel definition: Let X {\displaystyle X} be the n {\displaystyle n} -dimensional Euclidean space, R n {\displaystyle \mathbb {R} ^{n}} . The norm of x {\displaystyle x} is a non-negative number, ‖ x ‖ 2 = x ⊤ x ≥ 0 {\displaystyle \|x\|^{2}=x^{\top }x\geq 0} . A function K : X → R {\displaystyle K:X\rightarrow \mathbb {R} } is said to be a kernel if there exists a profile, k : [ 0 , ∞ ] → R {\displaystyle k:[0,\infty ]\rightarrow \mathbb {R} } , such that K ( x ) = k ( ‖ x ‖ 2 ) {\displaystyle K(x)=k(\|x\|^{2})} and k is non-negative. k is non-increasing: k ( a ) ≥ k ( b ) {\displaystyle k(a)\geq k(b)} if a < b {\displaystyle a Read more →

  • Spreading activation

    Spreading activation

    Spreading activation is a method for searching associative networks, biological and artificial neural networks, or semantic networks. The search process is initiated by labeling a set of source nodes (e.g. concepts in a semantic network) with weights or "activation" and then iteratively propagating or "spreading" that activation out to other nodes linked to the source nodes. Most often these "weights" are real values that decay as activation propagates through the network. When the weights are discrete this process is often referred to as marker passing. Activation may originate from alternate paths, identified by distinct markers, and terminate when two alternate paths reach the same node. However brain studies show that several different brain areas play an important role in semantic processing. Spreading activation in semantic networks as a model were invented in cognitive psychology to model the fan out effect. Spreading activation can also be applied in information retrieval, by means of a network of nodes representing documents and terms contained in those documents. == Cognitive psychology == As it relates to cognitive psychology, spreading activation is the theory of how the brain iterates through a network of associated ideas to retrieve specific information. The spreading activation theory presents the array of concepts within our memory as cognitive units, each consisting of a node and its associated elements or characteristics, all connected together by edges. A spreading activation network can be represented schematically, in a sort of web diagram with shorter lines between two nodes meaning the ideas are more closely related and will typically be associated more quickly to the original concept. In memory psychology, the spreading activation model holds that people organize their knowledge of the world based on their personal experiences, which in turn form the network of ideas that is the person's knowledge of the world. When a word (the target) is preceded by an associated word (the prime) in word recognition tasks, participants seem to perform better in the amount of time that it takes them to respond. For instance, subjects respond faster to the word "doctor" when it is preceded by "nurse" than when it is preceded by an unrelated word like "carrot". This semantic priming effect with words that are close in meaning within the cognitive network has been seen in a wide range of tasks given by experimenters, ranging from sentence verification to lexical decision and naming. As another example, if the original concept is "red" and the concept "vehicles" is primed, they are much more likely to say "fire engine" instead of something unrelated to vehicles, such as "cherries". If instead "fruits" was primed, they would likely name "cherries" and continue on from there. The activation of pathways in the network has everything to do with how closely linked two concepts are by meaning, as well as how a subject is primed. == Algorithm == A directed graph is populated by Nodes[ 1...N ] each having an associated activation value A [ i ] which is a real number in the range [0.0 ... 1.0]. A Link[ i, j ] connects source node[ i ] with target node[ j ]. Each edge has an associated weight W [ i, j ] usually a real number in the range [0.0 ... 1.0]. Parameters: Firing threshold F, a real number in the range [0.0 ... 1.0] Decay factor D, a real number in the range [0.0 ... 1.0] Steps: Initialize the graph setting all activation values A [ i ] to zero. Set one or more origin nodes to an initial activation value greater than the firing threshold F. A typical initial value is 1.0. For each unfired node [ i ] in the graph having an activation value A [ i ] greater than the node firing threshold F: For each Link [ i, j ] connecting the source node [ i ] with target node [ j ], adjust A [ j ] = A [ j ] + (A [ i ] W [ i, j ] D) where D is the decay factor. If a target node receives an adjustment to its activation value so that it would exceed 1.0, then set its new activation value to 1.0. Likewise maintain 0.0 as a lower bound on the target node's activation value should it receive an adjustment to below 0.0. Once a node has fired it may not fire again, although variations of the basic algorithm permit repeated firings and loops through the graph. Nodes receiving a new activation value that exceeds the firing threshold F are marked for firing on the next spreading activation cycle. If activation originates from more than one node, a variation of the algorithm permits marker passing to distinguish the paths by which activation is spread over the graph The procedure terminates when either there are no more nodes to fire or in the case of marker passing from multiple origins, when a node is reached from more than one path. Variations of the algorithm that permit repeated node firings and activation loops in the graph, terminate after a steady activation state, with respect to some delta, is reached, or when a maximum number of iterations is exceeded. == Examples ==

    Read more →
  • Latent semantic analysis

    Latent semantic analysis

    Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix containing word counts per document (rows represent unique words and columns represent each document) is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by cosine similarity between any two columns. Values close to 1 represent very similar documents while values close to 0 represent very dissimilar documents. An information retrieval technique using latent semantic structure was patented in 1988 by Scott Deerwester, Susan Dumais, George Furnas, Richard Harshman, Thomas Landauer, Karen Lochbaum and Lynn Streeter. In the context of its application to information retrieval, it is sometimes called latent semantic indexing (LSI). == Overview == === Occurrence matrix === LSA can use a document-term matrix which describes the occurrences of terms in documents; it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents. A typical example of the weighting of the elements of the matrix is tf-idf (term frequency–inverse document frequency): the weight of an element of the matrix is proportional to the number of times the terms appear in each document, where rare terms are upweighted to reflect their relative importance. This matrix is also common to standard semantic models, though it is not necessarily explicitly expressed as a matrix, since the mathematical properties of matrices are not always used. === Rank lowering === After the construction of the occurrence matrix, LSA finds a low-rank approximation to the term-document matrix. There could be various reasons for these approximations: The original term-document matrix is presumed too large for the computing resources; in this case, the approximated low rank matrix is interpreted as an approximation (a "least and necessary evil"). The original term-document matrix is presumed noisy: for example, anecdotal instances of terms are to be eliminated. From this point of view, the approximated matrix is interpreted as a de-noisified matrix (a better matrix than the original). The original term-document matrix is presumed overly sparse relative to the "true" term-document matrix. That is, the original matrix lists only the words actually in each document, whereas we might be interested in all words related to each document—generally a much larger set due to synonymy. The consequence of the rank lowering is that some dimensions are combined and depend on more than one term: {(car), (truck), (flower)} → {(1.3452 car + 0.2828 truck), (flower)} This mitigates the problem of identifying synonymy, as the rank lowering is expected to merge the dimensions associated with terms that have similar meanings. It also partially mitigates the problem with polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. Conversely, components that point in other directions tend to either simply cancel out, or, at worst, to be smaller than components in the directions corresponding to the intended sense. === Derivation === Let X {\displaystyle X} be a matrix where element ( i , j ) {\displaystyle (i,j)} describes the occurrence of term i {\displaystyle i} in document j {\displaystyle j} (this can be, for example, the frequency). X {\displaystyle X} will look like this: d j ↓ t i T → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] {\displaystyle {\begin{matrix}&{\textbf {d}}_{j}\\&\downarrow \\{\textbf {t}}_{i}^{T}\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}\end{matrix}}} Now a row in this matrix will be a vector corresponding to a term, giving its relation to each document: t i T = [ x i , 1 … x i , j … x i , n ] {\displaystyle {\textbf {t}}_{i}^{T}={\begin{bmatrix}x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\end{bmatrix}}} Likewise, a column in this matrix will be a vector corresponding to a document, giving its relation to each term: d j = [ x 1 , j ⋮ x i , j ⋮ x m , j ] {\displaystyle {\textbf {d}}_{j}={\begin{bmatrix}x_{1,j}\\\vdots \\x_{i,j}\\\vdots \\x_{m,j}\\\end{bmatrix}}} Now the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} between two term vectors gives the correlation between the terms over the set of documents. The matrix product X X T {\displaystyle XX^{T}} contains all these dot products. Element ( i , p ) {\displaystyle (i,p)} (which is equal to element ( p , i ) {\displaystyle (p,i)} ) contains the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} ( = t p T t i {\displaystyle ={\textbf {t}}_{p}^{T}{\textbf {t}}_{i}} ). Likewise, the matrix X T X {\displaystyle X^{T}X} contains the dot products between all the document vectors, giving their correlation over the terms: d j T d q = d q T d j {\displaystyle {\textbf {d}}_{j}^{T}{\textbf {d}}_{q}={\textbf {d}}_{q}^{T}{\textbf {d}}_{j}} . Now, from the theory of linear algebra, there exists a decomposition of X {\displaystyle X} such that U {\displaystyle U} and V {\displaystyle V} are orthogonal matrices and Σ {\displaystyle \Sigma } is a diagonal matrix. This is called a singular value decomposition (SVD): X = U Σ V T {\displaystyle {\begin{matrix}X=U\Sigma V^{T}\end{matrix}}} The matrix products giving us the term and document correlations then become X X T = ( U Σ V T ) ( U Σ V T ) T = ( U Σ V T ) ( V T T Σ T U T ) = U Σ V T V Σ T U T = U Σ Σ T U T X T X = ( U Σ V T ) T ( U Σ V T ) = ( V T T Σ T U T ) ( U Σ V T ) = V Σ T U T U Σ V T = V Σ T Σ V T {\displaystyle {\begin{matrix}XX^{T}&=&(U\Sigma V^{T})(U\Sigma V^{T})^{T}=(U\Sigma V^{T})(V^{T^{T}}\Sigma ^{T}U^{T})=U\Sigma V^{T}V\Sigma ^{T}U^{T}=U\Sigma \Sigma ^{T}U^{T}\\X^{T}X&=&(U\Sigma V^{T})^{T}(U\Sigma V^{T})=(V^{T^{T}}\Sigma ^{T}U^{T})(U\Sigma V^{T})=V\Sigma ^{T}U^{T}U\Sigma V^{T}=V\Sigma ^{T}\Sigma V^{T}\end{matrix}}} Since Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} and Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } are diagonal we see that U {\displaystyle U} must contain the eigenvectors of X X T {\displaystyle XX^{T}} , while V {\displaystyle V} must be the eigenvectors of X T X {\displaystyle X^{T}X} . Both products have the same non-zero eigenvalues, given by the non-zero entries of Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} , or equally, by the non-zero entries of Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } . Now the decomposition looks like this: X U Σ V T ( d j ) ( d ^ j ) ↓ ↓ ( t i T ) → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] = ( t ^ i T ) → [ [ u 1 ] … [ u l ] ] ⋅ [ σ 1 … 0 ⋮ ⋱ ⋮ 0 … σ l ] ⋅ [ [ v 1 ] ⋮ [ v l ] ] {\displaystyle {\begin{matrix}&X&&&U&&\Sigma &&V^{T}\\&({\textbf {d}}_{j})&&&&&&&({\hat {\textbf {d}}}_{j})\\&\downarrow &&&&&&&\downarrow \\({\textbf {t}}_{i}^{T})\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}&=&({\hat {\textbf {t}}}_{i}^{T})\rightarrow &{\begin{bmatrix}{\begin{bmatrix}\,\\\,\\{\textbf {u}}_{1}\\\,\\\,\end{bmatrix}}\dots {\begin{bmatrix}\,\\\,\\{\textbf {u}}_{l}\\\,\\\,\end{bmatrix}}\end{bmatrix}}&\cdot &{\begin{bmatrix}\sigma _{1}&\dots &0\\\vdots &\ddots &\vdots \\0&\dots &\sigma _{l}\\\end{bmatrix}}&\cdot &{\begin{bmatrix}{\begin{bmatrix}&&{\textbf {v}}_{1}&&\end{bmatrix}}\\\vdots \\{\begin{bmatrix}&&{\textbf {v}}_{l}&&\end{bmatrix}}\end{bmatrix}}\end{matrix}}} The values σ 1 , … , σ l {\displaystyle \sigma _{1},\dots ,\sigma _{l}} are called the singular values, and u 1 , … , u l {\displaystyle u_{1},\dots ,u_{l}} and v 1 , … , v l {\displaystyle v_{1},\dots ,v_{l}} the left and right singular vectors. Notice the only part of U {\displaystyle U} that contributes to t i {\displaystyle {\textbf {t}}_{i}} is the i 'th {\displaystyle i{\textrm {'th}}} row. Let this row vector be called t ^ i T {\displaystyle {\hat {\textrm {t}}}_{i}^{T}} . Likewise, the only part of V T {\displaystyle V^{T}} that contributes to d j {\displaystyle {\textbf {d}}_{j}} is the j 'th {\displaystyle j{\textrm {'th}}} column, d ^ j {\displaystyle {\hat {\textrm {d}}}_{j}} . These are not the eigenvectors, but depend on all the eigenvectors. I

    Read more →
  • Logical Machine Corporation

    Logical Machine Corporation

    Logical Machine Corporation (LOMAC) was an American computer company active from the mid-1970s to the 1980s and based in the San Francisco Bay Area. It was founded as John Peers and Company by the British entrepreneur John Peers in 1974. LOMAC developed the ADAM, a minicomputer which ran a specialized compiler for the company's natural English programming language. Throughout the late 1970s, the company acquired several technology firms, including Byte, Inc., the owner of the Byte Shop retail chain. Despite its unique approach to computing and earning $5 million in revenue in 1977, LOMAC struggled as the industry began to standardize around the IBM Personal Computer (IBM PC). Following Peers's departure in 1980, the company rebranded as Logical Business Machines, Inc. (LBM, or simply Logical), and attempted to pivot toward IBM PC–compatible hardware. However, financial difficulties led to the company filing for Chapter 11 bankruptcy in 1984. After emerging from bankruptcy in 1985 with new investment, Logical ceased hardware manufacturing to focus exclusively on software development and value-added reselling. == History == John Peers (born 1942) founded Logical Machine Corporation as John Peers and Company in September 1974. The company originally occupied a 4,500-square-foot office in Burlingame, California. The company was Peers' fourth; he had recently sold off Allied Business Systems of London to Trafalgar House in 1974. Peers sought to set up manufacturing in an agricultural zone in Ukiah, California. Following a delay, caused in part by concerned residents, a 30,000-square-foot plant was raised in Burke Hill, three miles south of Ukiah. The Ukiah plant was built to mass manufacture the company's ADAM minicomputer. The ADAM computer ran a specialized compiler for the company's natural English programming language; that is to say, the programming language attempted to closely emulate English syntax. Prototypes of the ADAM were built in May 1974, based on specifications devised in October 1973. Peers had yet to patent the technology as of June 1975. The ADAM's central processing unit was bolted onto an 7-by-6-foot L-shaped desk, on which rested its terminal. Twenty units of the ADAM were installed between April 1975 and February 1976, out of a backlog of orders for 3,500 from 500 clients, manufactured out of the company's Burlingame headquarters. It cost US$40,000. A controversial print advertisement featuring a naked woman seated at an ADAM terminal—as a pastiche of Adam and Eve—was recalled in early 1976 as a result of outcry from the National Organization for Women. The company changed its name to Logical Machine Corporation (LOMAC) in October 1976 and moved its headquarters to a 26,000-square-foot building in Sunnyvale, California, in anticipation of a ramping up of orders for the ADAM. The company originally occupied half of the building; they later purchased the other half from the tenant in July 1977 to double its manufacturing output. For fiscal year 1977, the company earned $5 million in revenue. In December 1977, LOMAC acquired Byte, Inc.—the proprietor of The Byte Shop, the first computer retail chain—from Paul Terrell and Boyd Wilson for an unspecified amount. The Byte Shop had 65 locations in the San Francisco Bay Area in 1978; it catered mainly to hobbyists with low cost microcomputer kits, in contrast to the high cost of LOMAC's ADAM. By July 1978, however, LOMAC were able to reduce the price of the ADAM down to $15,000. The company by that point had shipped their 50th ADAM and expanded to 14 countries. Also in 1978, LOMAC acquired Mass Memory—a high-tech optical storage company based in Phoenix, Arizona, whose products had storage capacities on the order gigabytes and terabytes—and Centigram, makers of the Mike—a computer with speech recognition. Later that year, the company introduced Tina, a low-cost version of the ADAM. LOMAC suffered losses that year and appointed Jerry Brandt to the board of directions, naming him chief operating officer, in August 1978. Brandt had Logical absorb Mass Memory and Centigram into the parent operations, shutting down their respective plants in the process, converted 10 Byte Shops to franchises and opened 25 more franchised Byte locations, and stopped direct sales of LOMAC's business computer products. By the beginning of 1979, LOMAC was profitable once more, and Brandt was let go from LOMAC. Peers left LOMAC in 1980, following a slump in the company's sales. He became an executive director of the United States Robotics Society, a consortium for industrial automation companies, that year. Following Peers' departure, LOMAC changed its name to Logical Business Machines, adopting the name of its European subsidiary. In 1983, the company announced a 16-bit clone of the IBM PC, called the Logical L-XT, which featured a 10-MB hard drive, 320-KB floppy drive and 192 KB of RAM, and a real-time clock, and came shipped with various software (including MS-DOS, a word processor, and a spreadsheet application) and an amber CRT monitor. The following year, the company introduced L-NET, a local area network system based on the L-XT that could link up to 64 computers. L-NET came shipped with a natural programming language, Diplomat—a descendant of the programming language used on the ADAM. In June 1983, Logical sued Coleco Industries over trademark infringement with the latter's to-be-released Adam microcomputer. Logical cited confusion from their existing ADAM customer base caused by the announcement of the Coleco Adam as the basis for the suit. Coleco challenged Logical in the press, writing that Logical's rights to the Adam trademark for use in computers had lapsed earlier in the year. The two settled out of court, with Coleco agreeing to license the Adam name from Logical in exchange for unlimited rights to the Adam trademark. Logical halted development of the L-XT when they filed for Chapter 11 bankruptcy in July 1984. The company had been $4 million in debt. They emerged from bankruptcy in September 1985, after being infused with $2 million from Carat Ltd. The latter immediately received a little less than 50 percent ownership in Logical—this stake set to grow to over 50 percent over the next six months. As part of the terms of exiting bankruptcy, Logical stopped manufacturing hardware and strictly became a software development company and value-added reseller of computer systems.

    Read more →
  • Controlled natural language

    Controlled natural language

    Controlled natural languages (CNLs) are subsets of natural languages that are obtained by restricting the grammar and vocabulary in order to reduce or eliminate ambiguity and complexity. Traditionally, controlled languages fall into two major types: those that improve readability for human readers (e.g. non-native speakers), and those that enable reliable automatic semantic analysis of the language. The first type of languages (often called "simplified" or "technical" languages), for example ASD Simplified Technical English, Caterpillar Technical English, IBM's Easy English, are used in the industry to increase the quality of technical documentation, and possibly simplify the semi-automatic translation of the documentation. These languages restrict the writer by general rules such as "Keep sentences short", "Avoid the use of pronouns", "Only use dictionary-approved words", and "Use only the active voice". The second type of languages have a formal syntax and formal semantics, and can be mapped to an existing formal language, such as first-order logic. Thus, those languages can be used as knowledge representation languages, and writing of those languages is supported by fully automatic consistency and redundancy checks, query answering, etc. == Languages == Existing controlled natural languages include: == Encoding == IETF has reserved simple as a BCP 47 variant subtag for simplified versions of languages.

    Read more →
  • Prosthesis

    Prosthesis

    In medicine, a prosthesis (pl.: prostheses; from Ancient Greek: πρόσθεσις, romanized: prósthesis, lit. 'addition, application, attachment'), or a prosthetic implant, is an artificial device that replaces a missing body part, which may be lost through physical trauma, disease, or a condition present at birth (congenital disorder). Prostheses may restore the normal functions of the missing body part, or may perform a cosmetic function. A person who has undergone an amputation is sometimes referred to as an amputee, Rehabilitation for someone with an amputation is primarily coordinated by a physiatrist as part of an inter-disciplinary team consisting of physiatrists, prosthetists, nurses, physical therapists, and occupational therapists. Prostheses can be created by hand or with computer-aided design (CAD), a software interface that helps creators design and analyze the creation with computer-generated 2-D and 3-D graphics as well as analysis and optimization tools. == Types == A person's prosthetic device should be designed and assembled to meet their individual appearance and functional needs. Depending on personal circumstances, co-morbidities, budget or health insurance coverage, and access to medical care, decisions may need to balance aesthetics and function. In addition, for some individuals, a myoelectric device, a body-powered device, or an activity-specific device may be appropriate options. The person's future goals and vocational aspirations and potential capabilities may help them choose between one or more devices. Craniofacial prostheses include intra-oral and extra-oral prostheses. Extra-oral prostheses are further divided into hemifacial, auricular (ear), nasal, orbital and ocular. Intra-oral prostheses include dental prostheses, such as dentures, obturators, and dental implants. Prostheses of the neck include larynx substitutes, trachea and upper esophageal replacements, Some prostheses of the torso include breast prostheses which may be either single or bilateral, full breast devices or nipple prostheses. Penile prostheses are used to treat erectile dysfunction, perform phalloplasty procedures in men, and to build a new penis in female-to-male gender reassignment surgeries. === Limb prostheses === Limb prostheses include both upper- and lower-extremity prostheses. Upper-extremity prostheses are used at varying levels of amputation: forequarter, shoulder disarticulation, transhumeral prosthesis, elbow disarticulation, transradial prosthesis, wrist disarticulation, full hand, partial hand, finger, partial finger. A transradial prosthesis is an artificial limb that replaces an arm missing below the elbow. Upper limb prostheses can be categorized in three main categories: Passive devices, Body Powered devices, and Externally Powered (myoelectric) devices. Passive devices can either be passive hands, mainly used for cosmetic purposes, or passive tools, mainly used for specific activities (e.g. leisure or vocational). An extensive overview and classification of passive devices can be found in a literature review by Maat et.al. A passive device can be static, meaning the device has no movable parts, or it can be adjustable, meaning its configuration can be adjusted (e.g. adjustable hand opening). Despite the absence of active grasping, passive devices are very useful in bimanual tasks that require fixation or support of an object, or for gesticulation in social interaction. According to scientific data a third of the upper limb amputees worldwide use a passive prosthetic hand. Body Powered or cable-operated limbs work by attaching a harness and cable around the opposite shoulder of the damaged arm. A recent body-powered approach has explored the utilization of the user's breathing to power and control the prosthetic hand to help eliminate actuation cable and harness. The third category of available prosthetic devices comprises myoelectric arms. This particular class of devices distinguishes itself from the previous ones due to the inclusion of a battery system. This battery serves the dual purpose of providing energy for both actuation and sensing components. While actuation predominantly relies on motor or pneumatic systems, a variety of solutions have been explored for capturing muscle activity, including techniques such as Electromyography, Sonomyography, Myokinetic, and others. These methods function by detecting the minute electrical currents generated by contracted muscles during upper arm movement, typically employing electrodes or other suitable tools. Subsequently, these acquired signals are converted into gripping patterns or postures that the artificial hand will then execute. In the prosthetics industry, a trans-radial prosthetic arm is often referred to as a "BE" or below elbow prosthesis. Lower-extremity prostheses provide replacements at varying levels of amputation. These include hip disarticulation, transfemoral prosthesis, knee disarticulation, transtibial prosthesis, Syme's amputation, foot, partial foot, and toe. The two main subcategories of lower extremity prosthetic devices are trans-tibial (any amputation transecting the tibia bone or a congenital anomaly resulting in a tibial deficiency) and trans-femoral (any amputation transecting the femur bone or a congenital anomaly resulting in a femoral deficiency). A transfemoral prosthesis is an artificial limb that replaces a leg missing above the knee. Transfemoral amputees can have a very difficult time regaining normal movement. In general, a transfemoral amputee must use approximately 80% more energy to walk than a person with two whole legs. This is due to the complexities in movement associated with the knee. In newer and more improved designs, hydraulics, carbon fiber, mechanical linkages, motors, computer microprocessors, and innovative combinations of these technologies are employed to give more control to the user. In the prosthetics industry, a trans-femoral prosthetic leg is often referred to as an "AK" or above the knee prosthesis. A transtibial prosthesis is an artificial limb that replaces a leg missing below the knee. A transtibial amputee is usually able to regain normal movement more readily than someone with a transfemoral amputation, due in large part to retaining the knee, which allows for easier movement. Lower extremity prosthetics describe artificially replaced limbs located at the hip level or lower. In the prosthetics industry, a transtibial prosthetic leg is often referred to as a "BK" or below the knee prosthesis. Prostheses are manufactured and fit by clinical prosthetists. Prosthetists are healthcare professionals responsible for making, fitting, and adjusting prostheses and for lower limb prostheses will assess both gait and prosthetic alignment. Once a prosthesis has been fit and adjusted by a prosthetist, a rehabilitation physiotherapist (called physical therapist in America) will help teach a new prosthetic user to walk with a leg prosthesis. To do so, the physical therapist may provide verbal instructions and may also help guide the person using touch or tactile cues. This may be done in a clinic or home. There is some research suggesting that such training in the home may be more successful if the treatment includes the use of a treadmill. Using a treadmill, along with the physical therapy treatment, helps the person to experience many of the challenges of walking with a prosthesis. In the United Kingdom, 75% of lower limb amputations are performed due to inadequate circulation (dysvascularity). This condition is often associated with many other medical conditions (co-morbidities) including diabetes and heart disease that may make it a challenge to recover and use a prosthetic limb to regain mobility and independence. For people who have inadequate circulation and have lost a lower limb, there is insufficient evidence due to a lack of research, to inform them regarding their choice of prosthetic rehabilitation approaches. Lower extremity prostheses are often categorized by the level of amputation or after the name of a surgeon: Transfemoral (Above-knee) Transtibial (Below-knee) Ankle disarticulation (more commonly known as Syme's amputation) Knee disarticulation (also see knee replacement) Hip disarticulation, (also see hip replacement) Hemi-pelvictomy Partial foot amputations (Pirogoff, Talo-Navicular and Calcaneo-cuboid (Chopart), Tarso-metatarsal (Lisfranc), Trans-metatarsal, Metatarsal-phalangeal, Ray amputations, toe amputations). Van Nes rotationplasty ==== Prosthetic raw materials ==== Prosthetic are made lightweight for better convenience for the amputee. Some of these materials include: Plastics: Polyethylene Polypropylene Acrylics Polyurethane Wood (early prosthetics) Rubber (early prosthetics) Lightweight metals: Aluminum Composites: Carbon fiber reinforced polymers Wheeled prostheses have also been used extensively in the rehabilitation of injured domestic animals, including dogs, cats, pigs, rabbits, and

    Read more →
  • Video imprint (computer vision)

    Video imprint (computer vision)

    Proposed as an extension of image epitomes in the field of video content analysis, video imprint is obtained by recasting video contents into a fixed-sized tensor representation regardless of video resolution or duration. Specifically, statistical characteristics are retained to some degrees so that common video recognition tasks can be carried out directly on such imprints, e.g., event retrieval, temporal action localization. It is claimed that both spatio-temporal interdependences are accounted for and redundancies are mitigated during the computation of video imprints. The option of computing video imprints exploiting the epitome model has the advantage of more flexible input feature formats and more efficient training stage for video content analysis.

    Read more →
  • BERT (language model)

    BERT (language model)

    Bidirectional encoder representations from transformers (BERT) is a language model introduced in October 2018 by researchers at Google. It learns to represent text as a sequence of vectors using self-supervised learning. It uses the encoder-only transformer architecture. BERT dramatically improved the state of the art for large language models. As of 2020, BERT is a ubiquitous baseline in natural language processing (NLP) experiments. BERT is trained by masked token prediction and next sentence prediction. With this training, BERT learns contextual, latent representations of tokens in their context, similar to ELMo and GPT-2. It found applications for many natural language processing tasks, such as coreference resolution and polysemy resolution. It improved on ELMo and spawned the study of "BERTology", which attempts to interpret what is learned by BERT. BERT was originally implemented in the English language at two model sizes, BERTBASE (110 million parameters) and BERTLARGE (340 million parameters). Both were trained on the Toronto BookCorpus (800M words) and English Wikipedia (2,500M words). The weights were released on GitHub. On March 11, 2020, 24 smaller models were released, the smallest being BERTTINY with just 4 million parameters. == Architecture == BERT is an "encoder-only" transformer architecture. At a high level, BERT consists of 4 modules: Tokenizer: This module converts a piece of English text into a sequence of integers ("tokens"). Embedding: This module converts the sequence of tokens into an array of real-valued vectors representing the tokens. It represents the conversion of discrete token types into a lower-dimensional Euclidean space. Encoder: a stack of Transformer blocks with self-attention, but without causal masking. Task head: This module converts the final representation vectors into one-shot encoded tokens again by producing a predicted probability distribution over the token types. It can be viewed as a simple decoder, decoding the latent representation into token types, or as an "un-embedding layer". The task head is necessary for pre-training, but it is often unnecessary for so-called "downstream tasks," such as question answering or sentiment classification. Instead, one removes the task head and replaces it with a newly initialized module suited for the task, and finetune the new module. The latent vector representation of the model is directly fed into this new module, allowing for sample-efficient transfer learning. === Embedding === This section describes the embedding used by BERTBASE. The other one, BERTLARGE, is similar, just larger. The tokenizer of BERT is WordPiece, which is a sub-word strategy like byte-pair encoding. Its vocabulary size is 30,000, and any token not appearing in its vocabulary is replaced by [UNK] ("unknown"). The first layer is the embedding layer, which contains three components: token type embeddings, position embeddings, and segment type embeddings. Token type: The token type is a standard embedding layer, translating a one-hot vector into a dense vector based on its token type. Position: The position embeddings are based on a token's position in the sequence. BERT uses absolute position embeddings, where each position in a sequence is mapped to a real-valued vector. Each dimension of the vector consists of a sinusoidal function that takes the position in the sequence as input. Segment type: Using a vocabulary of just 0 or 1, this embedding layer produces a dense vector based on whether the token belongs to the first or second text segment in that input. In other words, type-1 tokens are all tokens that appear after the [SEP] special token. All prior tokens are type-0. The three embedding vectors are added together representing the initial token representation as a function of these three pieces of information. After embedding, the vector representation is normalized using a LayerNorm operation, outputting a 768-dimensional vector for each input token. After this, the representation vectors are passed forward through 12 Transformer encoder blocks, and are decoded back to 30,000-dimensional vocabulary space using a basic affine transformation layer. === Architectural family === The encoder stack of BERT has 2 free parameters: L {\displaystyle L} , the number of layers, and H {\displaystyle H} , the hidden size. There are always H / 64 {\displaystyle H/64} self-attention heads, and the feed-forward/filter size is always 4 H {\displaystyle 4H} . By varying these two numbers, one obtains an entire family of BERT models. For BERT: the feed-forward size and filter size are synonymous. Both of them denote the number of dimensions in the middle layer of the feed-forward network. the hidden size and embedding size are synonymous. Both of them denote the number of real numbers used to represent a token. The notation for encoder stack is written as L/H. For example, BERTBASE is written as 12L/768H, BERTLARGE as 24L/1024H, and BERTTINY as 2L/128H. == Training == === Pre-training === BERT was pre-trained simultaneously on two tasks: Masked language modeling (MLM): In this task, BERT ingests a sequence of words, where one word may be randomly changed ("masked"), and BERT tries to predict the original words that had been changed. For example, in the sentence "The cat sat on the [MASK]," BERT would need to predict "mat." This helps BERT learn bidirectional context, meaning it understands the relationships between words not just from left to right or right to left but from both directions at the same time. Next sentence prediction (NSP): In this task, BERT is trained to predict whether one sentence logically follows another. For example, given two sentences, "The cat sat on the mat" and "It was a sunny day", BERT has to decide if the second sentence is a valid continuation of the first one. This helps BERT understand relationships between sentences, which is important for tasks like question answering or document classification. ==== Masked language modeling ==== In masked language modeling, 15% of tokens would be randomly selected for masked-prediction task, and the training objective was to predict the masked token given its context. In more detail, the selected token is: replaced with a [MASK] token with probability 80%, replaced with a random word token with probability 10%, not replaced with probability 10%. The reason not all selected tokens are masked is to avoid the dataset shift problem. The dataset shift problem arises when the distribution of inputs seen during training differs significantly from the distribution encountered during inference. A trained BERT model might be applied to word representation (like Word2Vec), where it would be run over sentences not containing any [MASK] tokens. It is later found that more diverse training objectives are generally better. As an illustrative example, consider the sentence "my dog is cute". It would first be divided into tokens like "my1 dog2 is3 cute4". Then a random token in the sentence would be picked. Let it be the 4th one "cute4". Next, there would be three possibilities: with probability 80%, the chosen token is masked, resulting in "my1 dog2 is3 [MASK]4"; with probability 10%, the chosen token is replaced by a uniformly sampled random token, such as "happy", resulting in "my1 dog2 is3 happy4"; with probability 10%, nothing is done, resulting in "my1 dog2 is3 cute4". After processing the input text, the model's 4th output vector is passed to its decoder layer, which outputs a probability distribution over its 30,000-dimensional vocabulary space. ==== Next sentence prediction ==== Given two sentences, the model predicts if they appear sequentially in the training corpus, outputting either [IsNext] or [NotNext]. During training, the algorithm sometimes samples two sentences from a single continuous span in the training corpus, while at other times, it samples two sentences from two discontinuous spans. The first sentence starts with a special token, [CLS] (for "classify"). The two sentences are separated by another special token, [SEP] (for "separate"). After processing the two sentences, the final vector for the [CLS] token is passed to a linear layer for binary classification into [IsNext] and [NotNext]. For example: Given "[CLS] my dog is cute [SEP] he likes playing [SEP]", the model should predict [IsNext]. Given "[CLS] my dog is cute [SEP] how do magnets work [SEP]", the model should predict [NotNext]. === Fine-tuning === BERT is meant as a general pretrained model for various applications in natural language processing. That is, after pre-training, BERT can be fine-tuned with fewer resources on smaller datasets to optimize its performance on specific tasks such as natural language inference and text classification, and sequence-to-sequence-based language generation tasks such as question answering and conversational response generation. The original BERT paper published results demonstrating that a small amount of fine

    Read more →