BRS/Search is a full-text database and information retrieval system. BRS/Search uses a fully inverted indexing system to store, locate, and retrieve unstructured data. It was the search engine that in 1977 powered Bibliographic Retrieval Services (BRS) commercial operations with 20 databases (including the first national commercial availability of MEDLINE); it has changed ownership several times during its development and is currently sold as Livelink ECM Discovery Server by Open Text Corporation. == Early development == Development on what was to become BRS began as Biomedical Communications Network (BCN) at the State University of New York at Albany (SUNY). BCN, which went online in 1968, provided on-line access to nine databases, including MEDLINE and BIOSIS Previews, to large universities and medical schools primarily in the Northeast of the USA. State funding for the project was withdrawn in 1975, and Bibliographic Retrieval Services (BRS) was formed as a non-profit concern the following year. It was incorporated in May 1976 as a for-profit corporation with Ron Quake as president, Jan Egeland as vice president in charge of marketing and training, and Lloyd Palmer as vice president of systems. == BRS commercial operations == In December 1976, the First BRS User Meeting was held in Syracuse, New York, and by January 1977 BRS started commercial operations with 20 databases (including the first national commercial availability of MEDLINE) and 9 million records, using modified IBM STAIRS (STorage And Information Retrieval System) software, Telenet for telecommunications, and timesharing mainframe computers of Carrier Corporation. In October 1980 BRS was sold by Egeland and Quake to Indian Head, Inc., a subsidiary of the Dutch company Thyssen-Bornemisza Group. == 1989–1993 == In 1989 Robert Maxwell acquired BRS and the BRS/Search software; he announced the planned incorporation of the ORBIT Search Service and BRS Information Technologies and renamed the whole group Maxwell Online, Inc. At that time BRS Information Technologies was serving the medical and academic library marketplace with over 150 databases. Maxwell later bought the publishing company Macmillan and put Maxwell Online under Macmillan. In the same year BRS/LINK (hypertext connection of databases; first application delivering full text) was announced. The initial BRS/LINK application "relates the citation in a bibliographic database to its full-text article in a second database," and "eliminates the need to re-execute a search strategy in the second database in order to find the corresponding full-text article." Initially BRS/LINK supported linking only selected bibliographic databases: MEDLINE, Health Planning and Administration, and MEDLINE References on AIDS to the full-text Comprehensive Core Medical Library. At the time of Robert Maxwell’s death in 1991, Macmillan brought in Andrew Gregory to represent the company during the 2 years that Maxwell’s affairs were being settled and to prepare Maxwell Online to be able to sell the components. Maxwell Online shortly thereafter underwent yet another name change, this time to InfoPro Technologies. == Dataware Technologies ownership of BRS/SEARCH == Early in 1994, InfoPro Technologies, a subsidiary of MHC Inc. (holding company for Macmillan Inc.), the former Maxwell Online service, sold off all its subsidiaries. ORBIT Search Services went to the French-owned Questel, the dial-up BRS Search Services to CD Plus Technologies (later to become OVID), and BRS Software Products (including BRS/SEARCH) to Dataware Technologies. Almost up to the end of InfoPro Technologies, BRS Software had been the fastest growing segment of the company. At the 14th BRS North American Users Group Conference in 1999, Dave Schubmehl of Dataware Technologies presented a paper in which he stated "The purpose of this presentation is to update BRS users on upcoming releases of BRS/Search, NetAnswer, and other Dataware products. BRS/Search 7.0 will include features specifically requested by customers, as well as other enhancements. Earlier this year, Dataware acquired Sovereign Hill Software, makers of InQuery. In light of that acquisition, and Dataware's other development projects, we'll look at Dataware's plans for all products, including BRS/Search and NetAnswer." == Open Text acquisition of BRS/Search == In 2001 BRS/Search was acquired by Open Text and became LiveLink ECM Discovery Server. It is now referred to as Open Text Discovery Server. Open Text still supports both BRS/Search and NetAnswer. The core BRS/Search technology in the Open Text portfolio was augmented with other capabilities through various acquisitions. For example, Dataware's acquisition of Sovereign-Hill brought InQuery, “a probabilistic information retrieval system using an inference network”, which was developed by the University of Massachusetts Amherst Center for Intelligent Information Retrieval] out of the UMass CIIR and into the marketplace. A product re-branding table shows the range of products, their old names and their new names. InQuery is a concept search engine that uses noun phrases, parts of speech and other co-occurrence relationships in overlapping passages of text rather than single term inverted indexes of single words in documents. Open Text's portfolio has grown to include Hummingbird Content Management, and has always included BASIS. == 2003 == BRS/Search North America User's Group (BRSNAUG) website with a June 8, 2003 date listed the following features for BRS/Search. The BRSNAUG also disincorporated in 2003. Cross-references to BRS/Search on the World Wide Web point to Open Text Livelink. Engine features include: Rapid query response time. Numerical data handling and elementary statistical processing (sum, avg, min, max) Search results weighting and relevancy ranking Left- and right-truncation and expansion of search terms Superior data compression – loaded databases typically use only about 1.5 times the input stream size in disk space Large capacity databases – up to 100 million documents, each with up to 65,000 paragraphs Fine control of indexing and searching – right down to the word, sentence, and paragraph level Fine control over data security. Document access can be controlled at the database, document, and paragraph level International language support for all 7/8 bit characters sets and customizable language tables Flexible and customizable stop word lists ANSI-compatible thesauri Hypertext links within and between documents and databases (R6.x) Support for natural language parsing of queries Automatic document summarization tools Client/Server development Programming interfaces for World-Wide Web (HTTP, HTML) access to databases
Glossary of robotics
Robotics is the branch of technology that deals with the design, construction, operation, structural disposition, manufacture and application of robots. Robotics is related to the sciences of electronics, engineering, mechanics, and software. The following is a list of common definitions related to the Robotics field. == A == Actuator: a motor that translates control signals into mechanical movement. The control signals are usually electrical but may, more rarely, be pneumatic or hydraulic. The power supply may likewise be any of these. It is common for electrical control to be used to modulate a high-power pneumatic or hydraulic motor. Aerobot: a robot capable of independent flight on other planets. A type of aerial robot. Arduino: The current platform of choice for small-scale robotic experimentation and physical computing. Artificial intelligence: is the intelligence of machines and the branch of computer science that aims to create it. Aura (satellite): a robotic spacecraft launched by NASA in 2004 which collects atmospheric data from Earth. Automaton: an early self-operating robot, performing exactly the same actions, over and over. Autonomous vehicle: a vehicle equipped with an autopilot system, which is capable of driving from one point to another without input from a human operator. == B == Biomimetic: See Bionics. Bionics: also known as biomimetics, biognosis, biomimicry, or bionical creativity engineering is the application of biological methods and systems found in nature to the study and design of engineering systems and modern technology. == C == CAD/CAM (computer-aided design and computer-aided manufacturing): These systems and their data may be integrated into robotic operations. Čapek, Karel: Czech author who coined the term 'robot' in his 1921 play, Rossum's Universal Robots. Chandra X-ray Observatory: a robotic spacecraft launched by NASA in 1999 to collect astronomical data. Cloud robotics: robots empowered with more capacity and intelligence from cloud. Combat, robot: a hobby or sport event where two or more robots fight in an arena to disable each other. This has developed from a hobby in the 1990s to several TV series worldwide. Cruise missile: a robot-controlled guided missile that carries an explosive payload. Cyborg: also known as a cybernetic organism, a being with both biological and artificial (e.g. electronic, mechanical or robotic) parts. == D == Degrees of freedom: the extent to which a robot can move itself; expressed in terms of Cartesian coordinates (x, y, and z) and angular movements (yaw, pitch, and roll). Delta robot: a tripod linkage, used to construct fast-acting manipulators with a wide range of movement. Drive Power: The energy source or sources for the robot actuators. == E == Emergent behaviour, a complicated resultant behaviour that emerges from the repeated operation of simple underlying behaviours. Envelope (Space), Maximum The volume of space encompassing the maximum designed movements of all robot parts including the end-effector, workpiece, and attachments. Explosive ordnance disposal robot A mobile robot designed to assess whether an object contains explosives; some carry detonators that can be deposited at the object and activated after the robot withdraws. == F == FIRST(For Inspiration and Recognition of Science and Technology): an organization founded by inventor Dean Kamen in 1989 in order to develop ways to inspire students in engineering and technology fields. Forward chaining: a process in which events or received data are considered by an entity to intelligently adapt its behavior. == G == Gynoid: A humanoid robot designed to look like a human female. == H == Haptic: tactile feedback technology using the operator's sense of touch. Also sometimes applied to robot manipulators with their own touch sensitivity. Hexapod (platform): A movable platform using six linear actuators. Often used in flight simulators and fairground rides, they also have applications as a robotic manipulator. Hexapod (walker): A six-legged walking robot, using a simple insect-like locomotion. Human–computer interaction. Humanoid: A robotic entity designed to resemble a human being in form, function, or both. Hydraulics: the control of mechanical force and movement, generated by the application of liquid under pressure. cf. pneumatics. == I == Industrial robot: A reprogrammable, multifunctional manipulator designed to move material, parts, tools, or specialized devices through variable programmed motions for the performance of a variety of tasks. Insect robot: A small robot designed to imitate insect behaviors rather than complex human behaviors. == K == Kalman filter: a mathematical technique to estimate the value of a sensor measurement, from a series of intermittent and noisy values. Kinematics: the study of motion, as applied to robots. This includes both the design of linkages to perform motion, their power, control and stability; also their planning, such as choosing a sequence of movements to achieve a broader task. Inverse Kinematics: the process of determining joint angles required for a robot's end-effector to reach a desired position and orientation in space. Used in motion planning to calculate motor commands from target positions. == L == Linear actuator A form of motor that generates a linear movement directly. == M == Manipulator or gripper: A robotic 'hand'. Mobile robot: A self-propelled and self-contained robot that is capable of moving over a mechanically unconstrained course. Muting: The deactivation of a presence-sensing safeguarding device during a portion of the robot cycle. Mecanum wheel: A wheel fitted with angled rollers that enables a robot vehicle to move in multiple directions, including sideways. == O == Ornithopter – An aerial robot or drone that achieves flight through a flapping-wing mechanism rather than rotating blades or fixed wings, often utilized for highly maneuverable flight. == P == Parallel manipulator: an articulated robot or manipulator based on a number of kinematic chains, actuators and joints, in parallel. cf. serial manipulator. Pendant: Any portable control device that permits an operator to control the robot from within the restricted envelope (space) of the robot. Pneumatics: the control of mechanical force and movement, generated by the application of compressed gas. cf. hydraulics. Powered exoskeleton: is a wearable mobile machine that allow for limb movement with increased strength and endurance. Prosthetic robots: programmable manipulators or devices for missing human limbs. == R == Remote manipulator: A manipulator under direct human control, often used for work with hazardous materials. Robonaut: a development project conducted by NASA to create humanoid robots capable of using space tools and working in similar environments to suited astronauts. == S == Sensor fusion:The process of combining data from multiple sensors, such as LiDAR, cameras, global positioning systems (GPS), and inertial measurement units (IMUs), to produce a more accurate and reliable understanding of an environment than using a single sensor alone. It is widely used in robotics and autonomous systems to improve perception, localization, and decision-making. Serial manipulator: an articulated robot or manipulator with a single series kinematic chain of actuators. cf. parallel manipulator. Service robots are machines that extend human capabilities. Servo, a motor that moves to and maintains a set position under command, rather than continuously moving. Servomechanism An automatic device that uses error-sensing negative feedback to correct the performance of a mechanism. Single Point of Control The ability to operate the robot such that initiation or robot motion from one source of control is possible only from that source and cannot be overridden from another source. Slow Speed Control A mode of robot motion control where the velocity of the robot is limited to allow persons sufficient time either to withdraw the hazardous motion or stop the robot. Snake robot A robot component resembling a tentacle or elephant's trunk, where many small actuators are used to allow continuous curved motion of a robot component, with many degrees of freedom. This is usually applied to snake-arm robots, which use this as a flexible manipulator. A rarer application is the snakebot, where the entire robot is mobile and snake-like, so as to gain access through narrow spaces. Stepper motor Stewart platform A movable platform using six linear actuators, hence also known as a Hexapod. Subsumption architecture A robot architecture that uses a modular, bottom-up design beginning with the least complex behavioral tasks. Surgical robot, a remote manipulator used for keyhole surgery Swarm robotics involve large numbers of mostly simple physical robots. Their actions may seek to incorporate emergent behavior observed in social insects (swarm intelligence). Synchro == T == Teach Mode: The control state that al
Mobile Fortify
Mobile Fortify is a mobile app used by United States Immigration and Customs Enforcement (ICE) on their government-issued phones. The app allows agents to take a photo in order to gather biometrics, including contactless fingerprints and faceprints, for the purpose of identifying an individual and their potential immigration status. The app was created by NEC. == History == In June 2025, use of Mobile Fortify by ICE was uncovered through leaked emails and the user manual, reported by 404 Media. The app is internally developed, and details of the parent company and developer were initially unknown. In January 2026, the DHS's 2025 AI Use Case Inventory revealed the vendor as NEC Corporation, an international conglomerate with subsidiaries in Argentina, Australia, China, India and Malaysia. Later that month, several senators demanded transparency around the app and its origins, and that ICE stop using it. A second letter was sent again in November, after hearing no response to the previous letter from ICE. == Technology == Unlike other facial recognition software, Fortify uses federally linked databases. By contrast, Clearview AI uses public social media databases for biometric scanning. Federal databases include DHS's automated biometric identification system (IDENT), containing more than 270 million biometric records, and Customs and Border Protection's Traveler Verification Service. The State Department's visa and passport photo database, the FBI's National Crime Information Center, National Law Enforcement Telecommunications Systems, and CBP's TECS and Seized Assets and Case Tracing System (SEACATS). == Oversight == Several senators urged ICE to stop using the app for fear of infringing on fourth amendment and first amendment rights, and requested details on who developed the app, when it was deployed, whether the app was tested for accuracy, and policies and practices governing its use. In June 2025, they sent an open letter to Todd Lyons, ICE acting director, signed by senators Cory Booker, Chris Van Hollen, Ed Markey, Bernie Sanders, Adam Schiff, Tina Smith, Elizabeth Warren, and Ron Wyden. On November 3, a second letter was sent to the ICE by senators, after not receiving answers to questions from the previous letter deadlined for October 2. == Criticism == Mobile Fortify, and ICE's use of similar biometric identification technologies (such as Mobile Identify, an app similar to Mobile Fortify to be used by local or regional law enforcement to assist in immigration enforcement ) has faced scrutiny from a variety of digital rights organizations, politicians, and news outlets. The criticism is already considered to potentially be a reason why the similar Mobile Identify app was pulled from the Google Play Store. Facial recognition technologies are known to produce false-positives and generally unreliable results, especially on those with darker skin tones. ICE has already previously mistakenly arrested a U.S. citizen under the belief he was illegally in the country, and later stated that he "could be deported based on biometric confirmation of his identity" prior to his release. U.S. representative Bennie Thompson, ranking member of the House Homeland Security Committee has previously commented that "ICE officials have told us that an apparent biometric match by Mobile Fortify is a ‘definitive’ determination of a person's status and that an ICE officer may ignore evidence of American citizenship—including a birth certificate—if the app says the person is an alien," and that "Mobile Fortify is a dangerous tool in the hands of ICE, and it puts American citizens at risk of detention and even deportation," On January 19, 2026, 404 Media reported on a case where a woman, identified in court documents as "MJMA", was scanned by Mobile Fortify twice in the same interaction, and two entirely different names were provided by the app. According to the Innovation Law Lab, whose attorneys are representing MJMA, both of the names were incorrect. ICE has stated that they will not allow people to decline to be scanned by Mobile Fortify, and that photos taken, even those of U.S. citizens, will be stored for 15 years, something that has been criticized primarily because ICE has not performed a Privacy Impact Assessment (PIA) for Mobile Fortify, the right to decline other forms of biometric verification to the U.S. government is often available under other circumstances, and the 15 year window is viewed as unnecessarily large.
Artificial Intelligence Cold War
The Artificial Intelligence Cold War (AI Cold War) is a narrative in which geopolitical tensions between the United States of America (USA) and the People's Republic of China (PRC) could lead to a Second Cold War waged in the area of artificial intelligence technology rather than in the areas of nuclear capabilities or ideology. The context of the AI Cold War narrative is the AI arms race, which involves a build-up of military capabilities using AI technology by the US and China and the usage of increasingly advanced semiconductors which power those capabilities. According to a February 2019 publication by the Center for a New American Security, General Secretary of the Chinese Communist Party Xi Jinping – believes that being at the forefront of AI technology will be critical to the future of China's global military and economic power competition. == Origins of the term == The term AI Cold War first appeared in 2018 in an article in Wired magazine by Nicholas Thompson and Ian Bremmer. The two authors trace the emergence of the AI Cold War narrative to 2017, when China published its AI Development Plan, which included a strategy aimed at becoming the global leader in AI by 2030. While the authors acknowledge the use of AI by China to strengthen its authoritarian (totalitarian) rule, they warn against the perils for the US of engaging in an AI Cold War strategy. Thompson and Bremmer rather advocate for a technological cooperation between the US and China to encourage global standards in privacy and ethical use of AI. Shortly after the publication of the article in Wired magazine, the former U.S. Treasury Secretary Hank Paulson referred to the emergence of an ‘Economic Iron Curtain’ between the US and China, reinforcing the new AI Cold War narrative. == Proponents of the AI Cold War narrative == Politico contributed to reinforcing the AI Cold War narrative. In 2020, the paper argued that because of the increasing AI capabilities of China, the US and other democratic countries have to create an alliance to stay ahead of China. Former Google chief executive Eric Schmidt, together with Graham T. Allison alleged in an article in Project Syndicate that, in the context of the COVID-19 pandemic, the AI capabilities of China are ahead of the US in most critical areas. Scientists who have immigrated to the U.S. play an outsize role in the country's development of AI technology. Many of them were educated in China, prompting debates about national security concerns amid worsening relations between the two countries. Policy and technology experts have pointed to concerns about unethical use of AI which would be primarily associated with China. Ethics would therefore constitute a major ideological divide in the upcoming AI Cold War. Fears around disrupting supply chains and a global semiconductor shortage are linked to Taiwan's critical role in the production of semiconductors. 70% of semiconductors are either produced in Taiwan or transfer through Taiwan, where TSMC, world's largest chipmaker is headquartered. The PRC does not recognize the sovereignty of Taiwan and trade restrictions by the US on companies selling semiconductors to the PRC have disrupted in the past the commercial relationships between TSMC and Huawei. == Reactions to the AI Cold War == === Review of the validity of the AI Cold War narrative === Academics and observers expressed concerns about the validity and soundness of the AI Cold War narrative. Denise Garzia expressed concern in Nature that the AI Cold War narrative will undermine the efforts by the US to establish global rules for AI ethics. Researchers have warned in MIT Technology Review that the breakdown in international collaboration in the area of science because of the threat of the alleged AI Cold War would be detrimental to progress. Additionally, the AI Cold War narrative impacts on many more areas including the planning of supply chains and the proliferation of AI. The dissemination of the AI Cold War narrative could therefore be costly and destructive and exacerbate existing tensions. Joanna Bryson and Helena Malikova have pointed to Big Tech's potential interest in promoting the AI Cold War narrative, as technology companies lobby for less onerous regulation of AI in the US and the EU. A factual assessment of the existing AI capabilities of different countries shows a less binary reality than portrayed by the AI Cold War narrative. The AI Cold War started as a narrative but it could turn into a self-fulfilling prophecy and fuel an arms race, not only because of corporate interests but also because of the existing interests at different national security departments. Regarding cyber power, the International Institute for Strategic Studies published a study in June 2021, which argued that the online capabilities of China have been exaggerated and that Chinese cyber power is at least a decade behind the US, largely due to lingering security issues. === Restrictions to trading with China === US politicians and European industry players have invoked the looming AI Cold War as a reason to ban procurement by public authorities in Europe of Huawei 5G technology due to concerns over the Chinese state-sponsored surveillance industry. In 2019, the Trump administration successfully lobbied the Dutch government into stopping the Netherlands-based company ASML from exporting equipment to China. ASML manufactures a machine called an extreme ultraviolet lithography system used by semiconductor producers, including TSMC and Intel to produce state-of the-art microchips. The Biden administration adopted the same course of action as the Trump administration and requested the Netherlands to restrict sales by ASML to China, invoking national-security concerns. The trade restrictions imposed by the Trump administration affected semiconductors imports from China to the US and raised concerns by the US industry that supply chains will be disrupted in case of an AI Cold War. This prompted US technology companies to develop mitigation strategies including hoarding semiconductors and trying to set up local semiconductor production facilities, with the support of government subsidies. === Industrial policy initiatives === ==== United States ==== In June 2021, the US Senate approved the U.S. Innovation and Competition Act providing around 250 billion US dollars public money support to the US technological and manufacturing industry. The alleged Chinese threat in the area of technology helped secure a strong bipartisan support for the new legislation, amounting to the largest industrial policy move by the US in decades. Chinese authorities reproached to the US that the bill was “full of cold war zero-sum thinking”. The legislative bill is aimed at strengthening capabilities in the area of technology, such as quantum computing and AI specifically to face the competitive threat from China perceived as urgent. Senator Chuck Schumer, the leader of the Senate majority and one of the sponsors of the industrial policy bill invoked the threat of authoritarian regimes that want “grab the mantle of global economic leadership and own the innovations”. In 2022, U.S. Innovation and Competition Act was amended and turned into the Chips and Science Act with planned spending of 280 billion US dollars, 53 billion thereof are allocated directly to subsidies for semiconductors manufacturing. Commentators identified possible positive effects on innovation from the US attempts to compete with China in a perceived rivalry. Among the main beneficiaries of the US CHIPS Act are the semiconductor producers Intel, TSMC and Micron Technology. ==== European Chips Act ==== In February 2022, the European Union introduced its own European Chips Act initiative. The background of the initiative would be the objective of European strategic autonomy. The EU's initiative puts forward subsidies of 30 billion euros to encourage manufacturing of semiconductors in the EU. The US company Intel is one beneficiary of the initiative. The US and European chips acts raise concerns of protectionism and a risk of a subsidies "race to the bottom." === New world order === The AI Cold War heralds a new world order in geopolitics, according to Hemant Taneja and Fareed Zakaria. This new world order is a departure from the unipolar system dominated by the US. It is characterized by existence of two parallel digital ecosystems, ran by China and the US. In order to succeed countries that consider themselves as democracies are to align their technological ecosystems to that of the US, in a process labelled re-globalization.
T-norm fuzzy logics
T-norm fuzzy logics are a family of non-classical logics, informally delimited by having a semantics that takes the real unit interval [0, 1] for the system of truth values and functions called t-norms for permissible interpretations of conjunction. They are mainly used in applied fuzzy logic and fuzzy set theory as a theoretical basis for approximate reasoning. T-norm fuzzy logics belong in broader classes of fuzzy logics and many-valued logics. In order to generate a well-behaved implication, the t-norms are usually required to be left-continuous; logics of left-continuous t-norms further belong in the class of substructural logics, among which they are marked with the validity of the law of prelinearity, (A → B) ∨ (B → A). Both propositional and first-order (or higher-order) t-norm fuzzy logics, as well as their expansions by modal and other operators, are studied. Logics that restrict the t-norm semantics to a subset of the real unit interval (for example, finitely valued Łukasiewicz logics) are usually included in the class as well. Important examples of t-norm fuzzy logics are monoidal t-norm logic (MTL) of all left-continuous t-norms, basic logic (BL) of all continuous t-norms, product fuzzy logic of the product t-norm, or the nilpotent minimum logic of the nilpotent minimum t-norm. Some independently motivated logics belong among t-norm fuzzy logics, too, for example Łukasiewicz logic (which is the logic of the Łukasiewicz t-norm) or Gödel–Dummett logic (which is the logic of the minimum t-norm). == Motivation == As members of the family of fuzzy logics, t-norm fuzzy logics primarily aim at generalizing classical two-valued logic by admitting intermediary truth values between 1 (truth) and 0 (falsity) representing degrees of truth of propositions. The degrees are assumed to be real numbers from the unit interval [0, 1]. In propositional t-norm fuzzy logics, propositional connectives are stipulated to be truth-functional, that is, the truth value of a complex proposition formed by a propositional connective from some constituent propositions is a function (called the truth function of the connective) of the truth values of the constituent propositions. The truth functions operate on the set of truth degrees (in the standard semantics, on the [0, 1] interval); thus the truth function of an n-ary propositional connective c is a function Fc: [0, 1]n → [0, 1]. Truth functions generalize truth tables of propositional connectives known from classical logic to operate on the larger system of truth values. T-norm fuzzy logics impose certain natural constraints on the truth function of conjunction. The truth function ∗ : [ 0 , 1 ] 2 → [ 0 , 1 ] {\displaystyle \colon [0,1]^{2}\to [0,1]} of conjunction is assumed to satisfy the following conditions: Commutativity, that is, x ∗ y = y ∗ x {\displaystyle xy=yx} for all x and y in [0, 1]. This expresses the assumption that the order of fuzzy propositions is immaterial in conjunction, even if intermediary truth degrees are admitted. Associativity, that is, ( x ∗ y ) ∗ z = x ∗ ( y ∗ z ) {\displaystyle (xy)z=x(yz)} for all x, y, and z in [0, 1]. This expresses the assumption that the order of performing conjunction is immaterial, even if intermediary truth degrees are admitted. Monotony, that is, if x ≤ y {\displaystyle x\leq y} then x ∗ z ≤ y ∗ z {\displaystyle xz\leq yz} for all x, y, and z in [0, 1]. This expresses the assumption that increasing the truth degree of a conjunct should not decrease the truth degree of the conjunction. Neutrality of 1, that is, 1 ∗ x = x {\displaystyle 1x=x} for all x in [0, 1]. This assumption corresponds to regarding the truth degree 1 as full truth, conjunction with which does not decrease the truth value of the other conjunct. Together with the previous conditions this condition ensures that also 0 ∗ x = 0 {\displaystyle 0x=0} for all x in [0, 1], which corresponds to regarding the truth degree 0 as full falsity, conjunction with which is always fully false. Continuity of the function ∗ {\displaystyle } (the previous conditions reduce this requirement to the continuity in either argument). Informally this expresses the assumption that microscopic changes of the truth degrees of conjuncts should not result in a macroscopic change of the truth degree of their conjunction. This condition, among other things, ensures a good behavior of (residual) implication derived from conjunction; to ensure the good behavior, however, left-continuity (in either argument) of the function ∗ {\displaystyle } is sufficient. In general t-norm fuzzy logics, therefore, only left-continuity of ∗ {\displaystyle } is required, which expresses the assumption that a microscopic decrease of the truth degree of a conjunct should not macroscopically decrease the truth degree of conjunction. These assumptions make the truth function of conjunction a left-continuous t-norm, which explains the name of the family of fuzzy logics (t-norm based). Particular logics of the family can make further assumptions about the behavior of conjunction (for example, Gödel–Dummett logic requires its idempotence) or other connectives (for example, the logic IMTL (involutive monoidal t-norm logic) requires the involutiveness of negation). All left-continuous t-norms ∗ {\displaystyle } have a unique residuum, that is, a binary function ⇒ {\displaystyle \Rightarrow } such that for all x, y, and z in [0, 1], x ∗ y ≤ z {\displaystyle xy\leq z} if and only if x ≤ y ⇒ z . {\displaystyle x\leq y\Rightarrow z.} The residuum of a left-continuous t-norm can explicitly be defined as ( x ⇒ y ) = sup { z ∣ z ∗ x ≤ y } . {\displaystyle (x\Rightarrow y)=\sup\{z\mid zx\leq y\}.} This ensures that the residuum is the pointwise largest function such that for all x and y, x ∗ ( x ⇒ y ) ≤ y . {\displaystyle x(x\Rightarrow y)\leq y.} The latter can be interpreted as a fuzzy version of the modus ponens rule of inference. The residuum of a left-continuous t-norm thus can be characterized as the weakest function that makes the fuzzy modus ponens valid, which makes it a suitable truth function for implication in fuzzy logic. Left-continuity of the t-norm is the necessary and sufficient condition for this relationship between a t-norm conjunction and its residual implication to hold. Truth functions of further propositional connectives can be defined by means of the t-norm and its residuum, for instance the residual negation ¬ x = ( x ⇒ 0 ) {\displaystyle \neg x=(x\Rightarrow 0)} or bi-residual equivalence x ⇔ y = ( x ⇒ y ) ∗ ( y ⇒ x ) . {\displaystyle x\Leftrightarrow y=(x\Rightarrow y)(y\Rightarrow x).} Truth functions of propositional connectives may also be introduced by additional definitions: the most usual ones are the minimum (which plays a role of another conjunctive connective), the maximum (which plays a role of a disjunctive connective), or the Baaz Delta operator, defined in [0, 1] as Δ x = 1 {\displaystyle \Delta x=1} if x = 1 {\displaystyle x=1} and Δ x = 0 {\displaystyle \Delta x=0} otherwise. In this way, a left-continuous t-norm, its residuum, and the truth functions of additional propositional connectives determine the truth values of complex propositional formulae in [0, 1]. Formulae that always evaluate to 1 are called tautologies with respect to the given left-continuous t-norm ∗ , {\displaystyle ,} or ∗ - {\displaystyle {\mbox{-}}} tautologies. The set of all ∗ - {\displaystyle {\mbox{-}}} tautologies is called the logic of the t-norm ∗ , {\displaystyle ,} as these formulae represent the laws of fuzzy logic (determined by the t-norm) that hold (to degree 1) regardless of the truth degrees of atomic formulae. Some formulae are tautologies with respect to a larger class of left-continuous t-norms; the set of such formulae is called the logic of the class. Important t-norm logics are the logics of particular t-norms or classes of t-norms, for example: Łukasiewicz logic is the logic of the Łukasiewicz t-norm x ∗ y = max ( x + y − 1 , 0 ) {\displaystyle xy=\max(x+y-1,0)} Gödel–Dummett logic is the logic of the minimum t-norm x ∗ y = min ( x , y ) {\displaystyle xy=\min(x,y)} Product fuzzy logic is the logic of the product t-norm x ∗ y = x ⋅ y {\displaystyle xy=x\cdot y} Monoidal t-norm logic MTL is the logic of (the class of) all left-continuous t-norms Basic fuzzy logic BL is the logic of (the class of) all continuous t-norms It turns out that many logics of particular t-norms and classes of t-norms are axiomatizable. The completeness theorem of the axiomatic system with respect to the corresponding t-norm semantics on [0, 1] is then called the standard completeness of the logic. Besides the standard real-valued semantics on [0, 1], the logics are sound and complete with respect to general algebraic semantics, formed by suitable classes of prelinear commutative bounded integral residuated lattices. == History == Some particular t-norm fuzzy logics have been introduced and investigated long before the family was re
Latent semantic analysis
Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix containing word counts per document (rows represent unique words and columns represent each document) is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by cosine similarity between any two columns. Values close to 1 represent very similar documents while values close to 0 represent very dissimilar documents. An information retrieval technique using latent semantic structure was patented in 1988 by Scott Deerwester, Susan Dumais, George Furnas, Richard Harshman, Thomas Landauer, Karen Lochbaum and Lynn Streeter. In the context of its application to information retrieval, it is sometimes called latent semantic indexing (LSI). == Overview == === Occurrence matrix === LSA can use a document-term matrix which describes the occurrences of terms in documents; it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents. A typical example of the weighting of the elements of the matrix is tf-idf (term frequency–inverse document frequency): the weight of an element of the matrix is proportional to the number of times the terms appear in each document, where rare terms are upweighted to reflect their relative importance. This matrix is also common to standard semantic models, though it is not necessarily explicitly expressed as a matrix, since the mathematical properties of matrices are not always used. === Rank lowering === After the construction of the occurrence matrix, LSA finds a low-rank approximation to the term-document matrix. There could be various reasons for these approximations: The original term-document matrix is presumed too large for the computing resources; in this case, the approximated low rank matrix is interpreted as an approximation (a "least and necessary evil"). The original term-document matrix is presumed noisy: for example, anecdotal instances of terms are to be eliminated. From this point of view, the approximated matrix is interpreted as a de-noisified matrix (a better matrix than the original). The original term-document matrix is presumed overly sparse relative to the "true" term-document matrix. That is, the original matrix lists only the words actually in each document, whereas we might be interested in all words related to each document—generally a much larger set due to synonymy. The consequence of the rank lowering is that some dimensions are combined and depend on more than one term: {(car), (truck), (flower)} → {(1.3452 car + 0.2828 truck), (flower)} This mitigates the problem of identifying synonymy, as the rank lowering is expected to merge the dimensions associated with terms that have similar meanings. It also partially mitigates the problem with polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. Conversely, components that point in other directions tend to either simply cancel out, or, at worst, to be smaller than components in the directions corresponding to the intended sense. === Derivation === Let X {\displaystyle X} be a matrix where element ( i , j ) {\displaystyle (i,j)} describes the occurrence of term i {\displaystyle i} in document j {\displaystyle j} (this can be, for example, the frequency). X {\displaystyle X} will look like this: d j ↓ t i T → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] {\displaystyle {\begin{matrix}&{\textbf {d}}_{j}\\&\downarrow \\{\textbf {t}}_{i}^{T}\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}\end{matrix}}} Now a row in this matrix will be a vector corresponding to a term, giving its relation to each document: t i T = [ x i , 1 … x i , j … x i , n ] {\displaystyle {\textbf {t}}_{i}^{T}={\begin{bmatrix}x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\end{bmatrix}}} Likewise, a column in this matrix will be a vector corresponding to a document, giving its relation to each term: d j = [ x 1 , j ⋮ x i , j ⋮ x m , j ] {\displaystyle {\textbf {d}}_{j}={\begin{bmatrix}x_{1,j}\\\vdots \\x_{i,j}\\\vdots \\x_{m,j}\\\end{bmatrix}}} Now the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} between two term vectors gives the correlation between the terms over the set of documents. The matrix product X X T {\displaystyle XX^{T}} contains all these dot products. Element ( i , p ) {\displaystyle (i,p)} (which is equal to element ( p , i ) {\displaystyle (p,i)} ) contains the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} ( = t p T t i {\displaystyle ={\textbf {t}}_{p}^{T}{\textbf {t}}_{i}} ). Likewise, the matrix X T X {\displaystyle X^{T}X} contains the dot products between all the document vectors, giving their correlation over the terms: d j T d q = d q T d j {\displaystyle {\textbf {d}}_{j}^{T}{\textbf {d}}_{q}={\textbf {d}}_{q}^{T}{\textbf {d}}_{j}} . Now, from the theory of linear algebra, there exists a decomposition of X {\displaystyle X} such that U {\displaystyle U} and V {\displaystyle V} are orthogonal matrices and Σ {\displaystyle \Sigma } is a diagonal matrix. This is called a singular value decomposition (SVD): X = U Σ V T {\displaystyle {\begin{matrix}X=U\Sigma V^{T}\end{matrix}}} The matrix products giving us the term and document correlations then become X X T = ( U Σ V T ) ( U Σ V T ) T = ( U Σ V T ) ( V T T Σ T U T ) = U Σ V T V Σ T U T = U Σ Σ T U T X T X = ( U Σ V T ) T ( U Σ V T ) = ( V T T Σ T U T ) ( U Σ V T ) = V Σ T U T U Σ V T = V Σ T Σ V T {\displaystyle {\begin{matrix}XX^{T}&=&(U\Sigma V^{T})(U\Sigma V^{T})^{T}=(U\Sigma V^{T})(V^{T^{T}}\Sigma ^{T}U^{T})=U\Sigma V^{T}V\Sigma ^{T}U^{T}=U\Sigma \Sigma ^{T}U^{T}\\X^{T}X&=&(U\Sigma V^{T})^{T}(U\Sigma V^{T})=(V^{T^{T}}\Sigma ^{T}U^{T})(U\Sigma V^{T})=V\Sigma ^{T}U^{T}U\Sigma V^{T}=V\Sigma ^{T}\Sigma V^{T}\end{matrix}}} Since Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} and Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } are diagonal we see that U {\displaystyle U} must contain the eigenvectors of X X T {\displaystyle XX^{T}} , while V {\displaystyle V} must be the eigenvectors of X T X {\displaystyle X^{T}X} . Both products have the same non-zero eigenvalues, given by the non-zero entries of Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} , or equally, by the non-zero entries of Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } . Now the decomposition looks like this: X U Σ V T ( d j ) ( d ^ j ) ↓ ↓ ( t i T ) → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] = ( t ^ i T ) → [ [ u 1 ] … [ u l ] ] ⋅ [ σ 1 … 0 ⋮ ⋱ ⋮ 0 … σ l ] ⋅ [ [ v 1 ] ⋮ [ v l ] ] {\displaystyle {\begin{matrix}&X&&&U&&\Sigma &&V^{T}\\&({\textbf {d}}_{j})&&&&&&&({\hat {\textbf {d}}}_{j})\\&\downarrow &&&&&&&\downarrow \\({\textbf {t}}_{i}^{T})\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}&=&({\hat {\textbf {t}}}_{i}^{T})\rightarrow &{\begin{bmatrix}{\begin{bmatrix}\,\\\,\\{\textbf {u}}_{1}\\\,\\\,\end{bmatrix}}\dots {\begin{bmatrix}\,\\\,\\{\textbf {u}}_{l}\\\,\\\,\end{bmatrix}}\end{bmatrix}}&\cdot &{\begin{bmatrix}\sigma _{1}&\dots &0\\\vdots &\ddots &\vdots \\0&\dots &\sigma _{l}\\\end{bmatrix}}&\cdot &{\begin{bmatrix}{\begin{bmatrix}&&{\textbf {v}}_{1}&&\end{bmatrix}}\\\vdots \\{\begin{bmatrix}&&{\textbf {v}}_{l}&&\end{bmatrix}}\end{bmatrix}}\end{matrix}}} The values σ 1 , … , σ l {\displaystyle \sigma _{1},\dots ,\sigma _{l}} are called the singular values, and u 1 , … , u l {\displaystyle u_{1},\dots ,u_{l}} and v 1 , … , v l {\displaystyle v_{1},\dots ,v_{l}} the left and right singular vectors. Notice the only part of U {\displaystyle U} that contributes to t i {\displaystyle {\textbf {t}}_{i}} is the i 'th {\displaystyle i{\textrm {'th}}} row. Let this row vector be called t ^ i T {\displaystyle {\hat {\textrm {t}}}_{i}^{T}} . Likewise, the only part of V T {\displaystyle V^{T}} that contributes to d j {\displaystyle {\textbf {d}}_{j}} is the j 'th {\displaystyle j{\textrm {'th}}} column, d ^ j {\displaystyle {\hat {\textrm {d}}}_{j}} . These are not the eigenvectors, but depend on all the eigenvectors. I
Data analysis for fraud detection
Fraud represents a significant problem for governments and businesses and specialized analysis techniques for discovering fraud using them are required. Some of these methods include knowledge discovery in databases (KDD), data mining, machine learning and statistics. They offer applicable and successful solutions in different areas of electronic fraud crimes. In general, the primary reason to use data analytics techniques is to tackle fraud since many internal control systems have serious weaknesses. For example, the currently prevailing approach employed by many law enforcement agencies to detect companies involved in potential cases of fraud consists in receiving circumstantial evidence or complaints from whistleblowers. As a result, a large number of fraud cases remain undetected and unprosecuted. In order to effectively test, detect, validate, correct error and monitor control systems against fraudulent activities, businesses entities and organizations rely on specialized data analytics techniques such as data mining, data matching, the sounds like function, regression analysis, clustering analysis, and gap analysis. Techniques used for fraud detection fall into two primary classes: statistical techniques and artificial intelligence. == Statistical techniques == Examples of statistical data analysis techniques are: Data preprocessing techniques for detection, validation, error correction, and filling up of missing or incorrect data. Calculation of various statistical parameters such as averages, quantiles, performance metrics, probability distributions, and so on. For example, the averages may include average length of call, average number of calls per month and average delays in bill payment. Models and probability distributions of various business activities either in terms of various parameters or probability distributions. Computing user profiles. Time-series analysis of time-dependent data. Clustering and classification to find patterns and associations among groups of data. Data matching Data matching is used to compare two sets of collected data. The process can be performed based on algorithms or programmed loops. Trying to match sets of data against each other or comparing complex data types. Data matching is used to remove duplicate records and identify links between two data sets for marketing, security or other uses. Sounds like Function is used to find values that sound similar. The Phonetic similarity is one way to locate possible duplicate values, or inconsistent spelling in manually entered data. The ‘sounds like’ function converts the comparison strings to four-character American Soundex codes, which are based on the first letter, and the first three consonants after the first letter, in each string. Regression analysis allows you to examine the relationship between two or more variables of interest. Regression analysis estimates relationships between independent variables and a dependent variable. This method can be used to help understand and identify relationships among variables and predict actual results. Gap analysis is used to determine whether business requirements are being met, if not, what are the steps that should be taken to meet successfully. Matching algorithms to detect anomalies in the behavior of transactions or users as compared to previously known models and profiles. Techniques are also needed to eliminate false alarms, estimate risks, and predict future of current transactions or users. Some forensic accountants specialize in forensic analytics which is the procurement and analysis of electronic data to reconstruct, detect, or otherwise support a claim of financial fraud. The main steps in forensic analytics are data collection, data preparation, data analysis, and reporting. For example, forensic analytics may be used to review an employee's purchasing card activity to assess whether any of the purchases were diverted or divertible for personal use. == Artificial intelligence == Fraud detection is a knowledge-intensive activity. The main AI techniques used for fraud detection include: Data mining to classify, cluster, and segment the data and automatically find associations and rules in the data that may signify interesting patterns, including those related to fraud. Expert systems to encode expertise for detecting fraud in the form of rules. Pattern recognition to detect approximate classes, clusters, or patterns of suspicious behavior either automatically (unsupervised) or to match given inputs. Machine learning techniques to automatically identify characteristics of fraud. Neural nets to independently generate classification, clustering, generalization, and forecasting that can then be compared against conclusions raised in internal audits or formal financial documents such as 10-Q. Other techniques such as link analysis, Bayesian networks, decision theory, and sequence matching are also used for fraud detection. A new and novel technique called System properties approach has also been employed where ever rank data is available. Statistical analysis of research data is the most comprehensive method for determining if data fraud exists. Data fraud as defined by the Office of Research Integrity (ORI) includes fabrication, falsification and plagiarism. == Machine learning and data mining == Early data analysis techniques were oriented toward extracting quantitative and statistical data characteristics. These techniques facilitate useful data interpretations and can help to get better insights into the processes behind the data. Although the traditional data analysis techniques can indirectly lead us to knowledge, it is still created by human analysts. To go beyond, a data analysis system has to be equipped with a substantial amount of background knowledge, and be able to perform reasoning tasks involving that knowledge and the data provided. In effort to meet this goal, researchers have turned to ideas from the machine learning field. This is a natural source of ideas, since the machine learning task can be described as turning background knowledge and examples (input) into knowledge (output). If data mining results in discovering meaningful patterns, data turns into information. Information or patterns that are novel, valid and potentially useful are not merely information, but knowledge. One speaks of discovering knowledge, before hidden in the huge amount of data, but now revealed. The machine learning and artificial intelligence solutions may be classified into two categories: 'supervised' and 'unsupervised' learning. These methods seek for accounts, customers, suppliers, etc. that behave 'unusually' in order to output suspicion scores, rules or visual anomalies, depending on the method. Whether supervised or unsupervised methods are used, note that the output gives us only an indication of fraud likelihood. No stand alone statistical analysis can assure that a particular object is a fraudulent one, but they can identify them with very high degrees of accuracy. As a result, effective collaboration between machine learning model and human analysts is vital to the success of fraud detection applications. === Supervised learning === In supervised learning, a random sub-sample of all records is taken and manually classified as either 'fraudulent' or 'non-fraudulent' (task can be decomposed on more classes to meet algorithm requirements). Relatively rare events such as fraud may need to be over sampled to get a big enough sample size. These manually classified records are then used to train a supervised machine learning algorithm. After building a model using this training data, the algorithm should be able to classify new records as either fraudulent or non-fraudulent. Supervised neural networks, fuzzy neural nets, and combinations of neural nets and rules, have been extensively explored and used for detecting fraud in mobile phone networks and financial statement fraud. Bayesian learning neural network is implemented for credit card fraud detection, telecommunications fraud, auto claim fraud detection, and medical insurance fraud. Hybrid knowledge/statistical-based systems, where expert knowledge is integrated with statistical power, use a series of data mining techniques for the purpose of detecting cellular clone fraud. Specifically, a rule-learning program to uncover indicators of fraudulent behaviour from a large database of customer transactions is implemented. Cahill et al. (2000) design a fraud signature, based on data of fraudulent calls, to detect telecommunications fraud. For scoring a call for fraud its probability under the account signature is compared to its probability under a fraud signature. The fraud signature is updated sequentially, enabling event-driven fraud detection. Link analysis comprehends a different approach. It relates known fraudsters to other individuals, using record linkage and social network methods. This type of detection is only able to detect fra