Cortica

Cortica

Headquartered in Tel Aviv Cortica utilizes unsupervised learning methods to recognize and analyze digital images and video. The technology developed by the Cortica team is based on research of the function of the human brain. == Company Founding == Cortica was founded in 2007 by Igal Raichelgauz, Karina Odinaev and Yehoshua Zeevi. Together, the founders developed the company’s core technology while at Technion – Israel Institute of Technology. By combining discoveries in neuroscience with developments in computer programming, the team created technology that possesses the ability to interpret large amounts of visual data with increased accuracy. This technology, called Image2Text, is based on the founders’ work in digitally replicating cortical neural networks’ ability to identify complex patterns within massive quantities of ambiguous and noisy data. Cortica’s offerings have application in the automotive industry, media industries, as well as the smart city and medical industries. Industry experts suggest that the self-driving automotive industry alone will be worth upwards of $7 trillion while each connected car is expected to generate 4,000 GB of data per day. Beyond that, industry analysts expect the proliferation of surveillance cameras to continue leading to an expected 2,500 Petabytes of data being generated daily by new surveillance cameras. Cortica operates in these high scale industries. The company currently employs professionals from many domains including AI researchers as well as veterans of intelligence units within the Israeli Defense Forces. == Research and Technology == In 2006, Founders Raichelgauz, Odinaev, and Zeevi shared their findings with the 28th IEEE EMBS Annual International Conference in New York in a paper titled, “Natural Signal Classification by Neural Cliques and Phase-Locked Attractors”. That same year, the team also published “Cliques in Neural Ensembles as Perception Carriers" CB Insights recently identified Cortica as the number one patent holder among AI companies. Cortica is researching to develop a machine-learning driving system which can identify objects and pedestrians. Connecting to it, Elon Musk has been rumored to partner with Cortica for his electric car company, Tesla. However, Tesla denies it stating that Musk did not discuss a collaboration with artificial intelligence firm Cortica. == Funding == Cortica raised $7 million in its Series A funding round, announced in August 2012. Investors included Horizons Ventures (the investment firm of Hong Kong billionaire Li Ka-Shing), and Ynon Kreiz, the former chairman and CEO of the Endemol Group. In May 2013, it was announced that Cortica had raised $1.5 million from Russian firm Mail.ru Group. It later transpired that this was a part of Cortica's Series B funding round for $6.4 million, announced in June 2013. The round was led by Horizons Ventures, with participation from the Russian firm Mail.ru Group and other angel investors. In its fourth funding round, Cortica has raised $20 million, bringing the total investments to $38 million. According to a report from The Israeli lead Daily economic newspaper, TheMarker, the fourth round was led by a strategic Chinese investor who will probably help the company expand into the Asian market. == Media coverage == GigaOm listed Cortica as one of the top deep learning startups in a November 2013 article surveying the field, along with AlchemyAPI, Ersatz, and Semantria. Business Insider ranked Cortica as one of the coolest tech companies in Israel. CB Insights has identified Cortica as the top patent holding AI company. In 2017 several leading automotive media outlets covered the launch of Cortica's automotive business unit

Structured sparsity regularization

Structured sparsity regularization is a class of methods, and an area of research in statistical learning theory, that extend and generalize sparsity regularization learning methods. Both sparsity and structured sparsity regularization methods seek to exploit the assumption that the output variable Y {\displaystyle Y} (i.e., response, or dependent variable) to be learned can be described by a reduced number of variables in the input space X {\displaystyle X} (i.e., the domain, space of features or explanatory variables). Sparsity regularization methods focus on selecting the input variables that best describe the output. Structured sparsity regularization methods generalize and extend sparsity regularization methods, by allowing for optimal selection over structures like groups or networks of input variables in X {\displaystyle X} . Common motivation for the use of structured sparsity methods are model interpretability, high-dimensional learning (where dimensionality of X {\displaystyle X} may be higher than the number of observations n {\displaystyle n} ), and reduction of computational complexity. Moreover, structured sparsity methods allow to incorporate prior assumptions on the structure of the input variables, such as overlapping groups, non-overlapping groups, and acyclic graphs. Examples of uses of structured sparsity methods include face recognition, magnetic resonance image (MRI) processing, socio-linguistic analysis in natural language processing, and analysis of genetic expression in breast cancer. == Definition and related concepts == === Sparsity regularization === Consider the linear kernel regularized empirical risk minimization problem with a loss function V ( y i , f ( x ) ) {\displaystyle V(y_{i},f(x))} and the ℓ 0 {\displaystyle \ell _{0}} "norm" as the regularization penalty: min w ∈ R d 1 n ∑ i = 1 n V ( y i , ⟨ w , x i ⟩ ) + λ ‖ w ‖ 0 , {\displaystyle \min _{w\in \mathbb {R} ^{d}}{\frac {1}{n}}\sum _{i=1}^{n}V(y_{i},\langle w,x_{i}\rangle )+\lambda \|w\|_{0},} where x , w ∈ R d {\displaystyle x,w\in \mathbb {R^{d}} } , and ‖ w ‖ 0 {\displaystyle \|w\|_{0}} denotes the ℓ 0 {\displaystyle \ell _{0}} "norm", defined as the number of nonzero entries of the vector w {\displaystyle w} . f ( x ) = ⟨ w , x i ⟩ {\displaystyle f(x)=\langle w,x_{i}\rangle } is said to be sparse if ‖ w ‖ 0 = s < d {\displaystyle \|w\|_{0}=s 0 {\displaystyle w_{j}>0} . However, as in this case groups may overlap, we take the intersection of the complements of those groups that are not set to zero. This intersection of complements selection criteria implies the modeling choice that we allow some coefficients within a particular group g {\displaystyle g} to be set to zero, while others within the same group g {\displaystyle g} may remain positive. In other words, coefficients within a group may differ depending on the several group memberships that each variable within the group may have. ==== Union of groups: latent group Lasso ==== A different approach is to consider union of groups for variable selection. This approach captures the modeling situation where variables can be selected as long as they belong at least to one group with positive coefficients. This modeling perspective implies that we want to preserve group structure. The formulation of the union of groups approach is also referred to as latent group Lasso, and requires to modify the group ℓ 2 {\displaystyle \ell _{2}} norm considered above and introduce the following regularizer R ( w ) = i n f { ∑ g ‖ w g ‖ g : w = ∑ g = 1 G w ¯ g } {\displaystyle R(w)=inf\left\{\sum _{g}\|w_{g}\|_{g}:w=\sum _{g=1}^{G}{\bar {w}}_{g}\right\}} where w ∈ R d {\displaystyle w\in {\mathbb {R^{d}} }} , w g ∈ G g {\displaystyle w_{g}\in G_{g}} is the vector of coefficients of group g, and w ¯ g ∈ R d {\displaystyle {\bar {w}}_{g}\in {\mathbb {R^{d}} }} is a vector with coefficients w g j {\displaystyle w_{g}^{j}} for all variables j {

BRS/Search

BRS/Search is a full-text database and information retrieval system. BRS/Search uses a fully inverted indexing system to store, locate, and retrieve unstructured data. It was the search engine that in 1977 powered Bibliographic Retrieval Services (BRS) commercial operations with 20 databases (including the first national commercial availability of MEDLINE); it has changed ownership several times during its development and is currently sold as Livelink ECM Discovery Server by Open Text Corporation. == Early development == Development on what was to become BRS began as Biomedical Communications Network (BCN) at the State University of New York at Albany (SUNY). BCN, which went online in 1968, provided on-line access to nine databases, including MEDLINE and BIOSIS Previews, to large universities and medical schools primarily in the Northeast of the USA. State funding for the project was withdrawn in 1975, and Bibliographic Retrieval Services (BRS) was formed as a non-profit concern the following year. It was incorporated in May 1976 as a for-profit corporation with Ron Quake as president, Jan Egeland as vice president in charge of marketing and training, and Lloyd Palmer as vice president of systems. == BRS commercial operations == In December 1976, the First BRS User Meeting was held in Syracuse, New York, and by January 1977 BRS started commercial operations with 20 databases (including the first national commercial availability of MEDLINE) and 9 million records, using modified IBM STAIRS (STorage And Information Retrieval System) software, Telenet for telecommunications, and timesharing mainframe computers of Carrier Corporation. In October 1980 BRS was sold by Egeland and Quake to Indian Head, Inc., a subsidiary of the Dutch company Thyssen-Bornemisza Group. == 1989–1993 == In 1989 Robert Maxwell acquired BRS and the BRS/Search software; he announced the planned incorporation of the ORBIT Search Service and BRS Information Technologies and renamed the whole group Maxwell Online, Inc. At that time BRS Information Technologies was serving the medical and academic library marketplace with over 150 databases. Maxwell later bought the publishing company Macmillan and put Maxwell Online under Macmillan. In the same year BRS/LINK (hypertext connection of databases; first application delivering full text) was announced. The initial BRS/LINK application "relates the citation in a bibliographic database to its full-text article in a second database," and "eliminates the need to re-execute a search strategy in the second database in order to find the corresponding full-text article." Initially BRS/LINK supported linking only selected bibliographic databases: MEDLINE, Health Planning and Administration, and MEDLINE References on AIDS to the full-text Comprehensive Core Medical Library. At the time of Robert Maxwell’s death in 1991, Macmillan brought in Andrew Gregory to represent the company during the 2 years that Maxwell’s affairs were being settled and to prepare Maxwell Online to be able to sell the components. Maxwell Online shortly thereafter underwent yet another name change, this time to InfoPro Technologies. == Dataware Technologies ownership of BRS/SEARCH == Early in 1994, InfoPro Technologies, a subsidiary of MHC Inc. (holding company for Macmillan Inc.), the former Maxwell Online service, sold off all its subsidiaries. ORBIT Search Services went to the French-owned Questel, the dial-up BRS Search Services to CD Plus Technologies (later to become OVID), and BRS Software Products (including BRS/SEARCH) to Dataware Technologies. Almost up to the end of InfoPro Technologies, BRS Software had been the fastest growing segment of the company. At the 14th BRS North American Users Group Conference in 1999, Dave Schubmehl of Dataware Technologies presented a paper in which he stated "The purpose of this presentation is to update BRS users on upcoming releases of BRS/Search, NetAnswer, and other Dataware products. BRS/Search 7.0 will include features specifically requested by customers, as well as other enhancements. Earlier this year, Dataware acquired Sovereign Hill Software, makers of InQuery. In light of that acquisition, and Dataware's other development projects, we'll look at Dataware's plans for all products, including BRS/Search and NetAnswer." == Open Text acquisition of BRS/Search == In 2001 BRS/Search was acquired by Open Text and became LiveLink ECM Discovery Server. It is now referred to as Open Text Discovery Server. Open Text still supports both BRS/Search and NetAnswer. The core BRS/Search technology in the Open Text portfolio was augmented with other capabilities through various acquisitions. For example, Dataware's acquisition of Sovereign-Hill brought InQuery, “a probabilistic information retrieval system using an inference network”, which was developed by the University of Massachusetts Amherst Center for Intelligent Information Retrieval] out of the UMass CIIR and into the marketplace. A product re-branding table shows the range of products, their old names and their new names. InQuery is a concept search engine that uses noun phrases, parts of speech and other co-occurrence relationships in overlapping passages of text rather than single term inverted indexes of single words in documents. Open Text's portfolio has grown to include Hummingbird Content Management, and has always included BASIS. == 2003 == BRS/Search North America User's Group (BRSNAUG) website with a June 8, 2003 date listed the following features for BRS/Search. The BRSNAUG also disincorporated in 2003. Cross-references to BRS/Search on the World Wide Web point to Open Text Livelink. Engine features include: Rapid query response time. Numerical data handling and elementary statistical processing (sum, avg, min, max) Search results weighting and relevancy ranking Left- and right-truncation and expansion of search terms Superior data compression – loaded databases typically use only about 1.5 times the input stream size in disk space Large capacity databases – up to 100 million documents, each with up to 65,000 paragraphs Fine control of indexing and searching – right down to the word, sentence, and paragraph level Fine control over data security. Document access can be controlled at the database, document, and paragraph level International language support for all 7/8 bit characters sets and customizable language tables Flexible and customizable stop word lists ANSI-compatible thesauri Hypertext links within and between documents and databases (R6.x) Support for natural language parsing of queries Automatic document summarization tools Client/Server development Programming interfaces for World-Wide Web (HTTP, HTML) access to databases

Two-phase commit protocol

In transaction processing, databases, and computer networking, the two-phase commit protocol (2PC, tupac) is a type of atomic commitment protocol (ACP). It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort (roll back) the transaction. This protocol (a specialised type of consensus protocol) achieves its goal even in many cases of temporary system failure (involving either process, network node, communication, etc. failures), and is thus widely used. However, it is not resilient to all possible failure configurations, and in rare cases, manual intervention is needed to remedy an outcome. To accommodate recovery from failure (automatic in most cases) the protocol's participants use logging of the protocol's states. Log records, which are typically slow to generate but survive failures, are used by the protocol's recovery procedures. Many protocol variants exist that primarily differ in logging strategies and recovery mechanisms. Though usually intended to be used infrequently, recovery procedures compose a substantial portion of the protocol, due to many possible failure scenarios to be considered and supported by the protocol. In a "normal execution" of any single distributed transaction (i.e., when no failure occurs, which is typically the most frequent situation), the protocol consists of two phases: The commit-request phase (or voting phase), in which a coordinator process attempts to prepare all the transaction's participating processes (named participants, cohorts, or workers) to take the necessary steps for either committing or aborting the transaction and to vote, either "Yes": commit (if the transaction participant's local portion execution has ended properly), or "No": abort (if a problem has been detected with the local portion), and The commit phase, in which, based on voting of the participants, the coordinator decides whether to commit (only if all have voted "Yes") or abort the transaction (otherwise), and notifies the result to all the participants. The participants then follow with the needed actions (commit or abort) with their local transactional resources (also called recoverable resources; e.g., database data) and their respective portions in the transaction's other output (if applicable). The two-phase commit (2PC) protocol should not be confused with the two-phase locking (2PL) protocol, a concurrency control protocol. == Assumptions == The protocol works in the following manner: one node is a designated coordinator, which is the master site, and the rest of the nodes in the network are designated the participants. The protocol assumes that: there is stable storage at each node with a write-ahead log, no node crashes forever, the data in the write-ahead log is never lost or corrupted in a crash, and any two nodes can communicate with each other. The last assumption is not too restrictive, as network communication can typically be rerouted. The first two assumptions are much stronger; if a node is totally destroyed then data can be lost. The protocol is initiated by the coordinator after the last step of the transaction has been reached. The participants then respond with an agreement message or an abort message depending on whether the transaction has been processed successfully at the participant. == Basic algorithm == === Commit request (or voting) phase === The coordinator sends a query to commit message to all participants and waits until it has received a reply from all participants. The participants execute the transaction up to the point where they will be asked to commit. They each write an entry to their undo log and an entry to their redo log. Each participant replies with: either an agreement message (participant votes Yes to commit), if the participant's actions succeeded; or an abort message (participant votes No to commit), if the participant experiences a failure that will make it impossible to commit. === Commit (or completion) phase === ==== Success ==== If the coordinator received an agreement message from all participants during the commit-request phase: The coordinator sends a commit message to all the participants. Each participant completes the operation, and releases all the locks and resources held during the transaction. Each participant sends an acknowledgement to the coordinator. The coordinator completes the transaction when all acknowledgements have been received. ==== Failure ==== If any participant votes No during the commit-request phase (or the coordinator's timeout expires): The coordinator sends a rollback message to all the participants. Each participant undoes the transaction using the undo log, and releases the resources and locks held during the transaction. Each participant sends an acknowledgement to the coordinator. The coordinator undoes the transaction when all acknowledgements have been received. ==== Message flow ==== Coordinator Participant QUERY TO COMMIT --------------------------------> VOTE YES/NO prepare/abort <------------------------------- commit/abort COMMIT/ROLLBACK --------------------------------> ACKNOWLEDGEMENT commit/abort <-------------------------------- end An next to the record type means that the record is forced to stable storage. == Disadvantages == The greatest disadvantage of the two-phase commit protocol is that it is a blocking protocol. If the coordinator fails permanently, some participants will never resolve their transactions: After a participant has sent an agreement message as a response to the commit-request message from the coordinator, it will block until a commit or rollback is received. A two-phase commit protocol cannot dependably recover from a failure of both the coordinator and a cohort member during the commit phase. If only the coordinator had failed, and no cohort members had received a commit message, it could safely be inferred that no commit had happened. If, however, both the coordinator and a cohort member failed, it is possible that the failed cohort member was the first to be notified, and had actually done the commit. Even if a new coordinator is selected, it cannot confidently proceed with the operation until it has received an agreement from all cohort members, and hence must block until all cohort members respond. == Implementing the two-phase commit protocol == === Common architecture === In many cases the 2PC protocol is distributed in a computer network. It is easily distributed by implementing multiple dedicated 2PC components similar to each other, typically named transaction managers (TMs; also referred to as 2PC agents or Transaction Processing Monitors), that carry out the protocol's execution for each transaction (e.g., The Open Group's X/Open XA). The databases involved with a distributed transaction, the participants, both the coordinator and participants, register to close TMs (typically residing on respective same network nodes as the participants) for terminating that transaction using 2PC. Each distributed transaction has an ad hoc set of TMs, the TMs to which the transaction participants register. A leader, the coordinator TM, exists for each transaction to coordinate 2PC for it, typically the TM of the coordinator database. However, the coordinator role can be transferred to another TM for performance or reliability reasons. Rather than exchanging 2PC messages among themselves, the participants exchange the messages with their respective TMs. The relevant TMs communicate among themselves to execute the 2PC protocol schema above, "representing" the respective participants, for terminating that transaction. With this architecture the protocol is fully distributed (does not need any central processing component or data structure), and scales up with number of network nodes (network size) effectively. This common architecture is also effective for the distribution of other atomic commitment protocols besides 2PC, since all such protocols use the same voting mechanism and outcome propagation to protocol participants. === Protocol optimizations === Database research has been done on ways to get most of the benefits of the two-phase commit protocol while reducing costs by protocol optimizations and protocol operations saving under certain system's behavior assumptions. ==== Presumed abort and presumed commit ==== Presumed abort or Presumed commit are common such optimizations. An assumption about the outcome of transactions, either commit, or abort, can save both messages and logging operations by the participants during the 2PC protocol's execution. For example, when presumed abort, if during system recovery from failure no logged evidence for commit of some transaction is found by the recovery procedure, then it assumes that the transaction has been aborted, and acts accordingly. This means that it does not matter if aborts are logged at all, and such logging can be saved under this assumption. Typical

Literature review

A literature review is an overview of previously published works on a particular topic. The term can refer to a full scholarly paper or a section of a scholarly work such as books or articles. Either way, a literature review provides the researcher/author and the audiences with general information of an existing knowledge of a particular topic. A good literature review has a proper research question, a proper theoretical framework, and/or a chosen research method. It serves to situate the current study within the body of the relevant literature and provides context for the reader. In such cases, the review usually precedes the methodology and results sections of the work. Producing a literature review is often part of a graduate and post-graduate requirement, included in the preparation of a thesis, dissertation, or a journal article. Literature reviews are also common in a research proposal or prospectus (the document approved before a student formally begins a dissertation or thesis). A literature review can be a type of a review article. In this sense, it is a scholarly paper that presents the current knowledge including substantive findings as well as theoretical and methodological contributions to a particular topic. Literature reviews are secondary sources and do not report new or original experimental work. Most often associated with academic-oriented literature, such reviews are found in academic journals and are not to be confused with book reviews, which may also appear in the same publication. Literature reviews are a basis for research in nearly every academic field. == Types == Since the concept of a systematic review was formalized in the 1970s, a basic division among types of reviews is the dichotomy of narrative reviews versus systematic reviews. The main types of narrative reviews are evaluative, exploratory, and instrumental. A fourth type of review of literature (the scientific literature) is the systematic review but it is not called a literature review, which absent further specification, conventionally refers to narrative reviews. A systematic review focuses on a specific research question to identify, appraise, select, and synthesize all high-quality research evidence and arguments relevant to that question. A meta-analysis is typically a systematic review using statistical methods to effectively combine the data used on all selected studies to produce a more reliable result. Torraco (2016) describes an integrative literature review. The purpose of an integrative literature review is to generate new knowledge on a topic through the process of review, critique, and synthesis of the literature under investigation. George et al (2023) offer an extensive overview of review approaches. They also propose a model for selecting an approach by looking at the purpose, object, subject, community, and practices of the review. They describe six different types of review, each with their own unique purposes: Exploratory or scoping reviews focus on breadth as opposed to depth Systematic or integrative reviews integrate empirical studies on a topic Meta-narrative reviews are qualitative and use literature to compare research or practice communities Problematizing or critical reviews propose new perspectives on a concept by association with other literature Meta-analyses and meta-regressions integrate quantitative studies and identify moderators Mixed research syntheses combine other review approaches in the same paper == Process and product == Shields and Rangarajan (2013) distinguish between the process of reviewing the literature and a finished work or product known as a literature review. The process of reviewing the literature is often ongoing and informs many aspects of the empirical research project. The process of reviewing the literature requires different kinds of activities and ways of thinking. Shields and Rangarajan (2013) and Granello (2001) link the activities of doing a literature review with Benjamin Bloom's revised taxonomy of the cognitive domain (ways of thinking: remembering, understanding, applying, analyzing, evaluating, and creating). === Use of artificial intelligence in a literature review === Artificial intelligence (AI) is reshaping traditional literature reviews across various disciplines. Generative pre-trained transformers, such as ChatGPT, are often used by students and academics for review purposes. Since 2023, an increasing number of tools powered by large language models and other artificial intelligence technologies have been developed to assist, automate, or generate literature reviews. Nevertheless, the employment of ChatGPT in academic reviews is problematic due to ChatGPT's propensity to "hallucinate". In response, efforts are being made to mitigate these hallucinations through the integration of plugins. For instance, Rad et al. (2023) used ScholarAI for review in cardiothoracic surgery.

Threat actor

In cybersecurity and risk assessment, a threat actor (or threat agents, attackers, or adversaries) is a person, group, organisation, state, or other entity with the ability to cause, carry, transmit, support, or exploit a threat. Threat actors are commonly analysed according to their motivations, resources, technical capability, access to systems, relationship to a target, and degree of connection to state authority. They may exploit vulnerabilities, conduct social engineering, steal or monetise data, disrupt operations, or support other actors who carry out such activity. Because the term covers a wide range of actors, researchers and security organisations use taxonomies that distinguish between groups such as cybercriminals, state-linked actors, ideologically motivated actors, thrill seekers or trolls, insiders, and competitors. Threat actor classifications are used in risk management, cyber threat intelligence, and incident response to connect observed behaviour with possible objectives and likely future activity. The categories are not always mutually exclusive: the same actor may combine criminal, ideological, commercial, or state-linked motivations, and different organisations may use different names for similar actors. == Risk assessment and security management == In risk assessment, threat actor analysis is used to identify who or what may create, carry, transmit, support, or exploit a threat, and how that actor relates to the system being assessed. Rausand and Haugen classify threat actors by their relationship to the system, distinguishing between internal and external actors, and by intent, distinguishing between intentional and unintentional actors. Threat actor classification may also support incident investigation. Rogers argued that actor categories could be inferred from observable case points, such as tools used, messages left, data targeted, forensic knowledge, and the degree of damage, allowing investigators to assess likely motivation and skill level. Later work similarly linked actor classification to operational analysis. Chng, Lu, Kumar and Yau proposed a framework connecting hacker types, motivations and typical strategies, arguing that observed behaviour before or during an attack can help analysts infer the likely type of actor involved. At the strategic level, actor analysis may consider an actor's resources, capabilities, degree of state involvement, motivations and objectives. == Landscape == The United Nations Institute for Disarmament Research has described the contemporary cyberthreat landscape as involving an increasingly diverse and interconnected set of actors, including state-led operations, cybercriminal syndicates, ideological hacktivists, commercial cyber mercenaries, private companies and civilian volunteers. Its 2026 report argued that these actors vary in resources, technical sophistication and relationships with states, making it traditional distinctions between state, civilian combatant roles, and legitimate and illegitimate conduct harder to apply. == Academic taxonomies == Early taxonomies classified hackers by activity, skill, motivation, or criminal profile. Landreth proposed six categories based on activity: novice, student, tourist, crasher, and thief. Hollinger classified computer misuse into pirates, browsers, and crackers, describing a progression from less-skilled activity to more technically serious offences. Chantler used attributes including activity, skill, knowledge, motivation, and duration of involvement to distinguish between an elite group, neophytes, and "losers and lamers". Parker proposed seven profiles of cybercriminals: pranksters, hacksters, malicious hackers, personal problem solvers, career criminals, extreme advocates, and malcontents, addicts, and irrational or incompetent people. In 2000, Marc Rogers proposed a taxonomy of hackers with seven, non-mutually-exclusive categories: newbie/tool kit users, cyber-punks, internals, coders, old guard hackers, professional criminals, and cyber-terrorists. Rausand and Haugen distinguish between internal and external threat actors, and between intentional and unintentional threat actors. Internal actors have some relationship with, access to, or position inside the system or organisation, while external actors operate from outside it. Intentional actors seek to create, exploit, or support a threat event, whereas unintentional actors may cause or enable a threat event through error, negligence, accident, or lack of awareness. Rogers later revised his hacker taxonomy into Novices, Cyber-punks, Internals, Petty Thieves, Virus Writers, Old Guard hackers, Professional Criminals, Information Warriors, and, more tentatively, Political Activists. In the model, motivation is grouped into four broad domains: curiosity, notoriety, revenge, and financial gain. A 2022 review by Chng, Lu, Kumar and Yau examined 11 hacker typologies published over three decades and proposed a unified framework linking hacker types, motivations, and strategies. The framework identified 13 hacker types and seven motivations, and argued that observed strategies during an attack can help analysts infer the likely type of actor involved. == Government taxonomies == Taxonomies of threat actors by governments are much more likely to include state-level threat actors. In the United States the National Institute of Standards and Technology (NIST) uses the term threat source in its risk-assessment guidance: organisations are directed to identify and characterise threat sources of concern, including capability, intent and targeting for adversarial threat sources, and the range of effects for non-adversarial threat sources. NIST treats threat-source identification as part of the risk-assessment process, alongside identifying threat events, vulnerabilities, likelihood and impact. In the EU, European Union Agency for Cybersecurity publishes the annual ENISA Threat Landscape, which analyses cyber incidents and adversary behaviour affecting the European Union. The 2025 report analysed selected incidents from the previous year and grouped activity around cybercrime, state-aligned activity, foreign information manipulation and interference, and hacktivism. In ENISA's 2025 analysis, hacktivist activity dominated reporting, representing almost 80% of recorded incidents and consisting mainly of low-level distributed denial-of-service operations. ENISA also reported increasing convergence between hacktivism, cybercrime and state-nexus activity, including state-aligned use of hacktivist personas, hacktivist adoption of ransomware, and false-flag or impersonation activity. At the UN level, A 2026 report by the United Nations Institute for Disarmament Research described the cyberthreat landscape as involving state-led operations, cybercriminal syndicates, ideological hacktivists, commercial cyber mercenaries, and civilian volunteers, with actors varying in resources, technical sophistication, and links to states. Canada defines threat actors as states, groups, or individuals who aim to cause harm by exploiting a vulnerability with malicious intent. A threat actor must be trying to gain access to information systems to access or alter data, devices, systems, or networks. The Japanese government's National Centre of Incident Readiness and Strategy (NISC) was established in 2015 to create a "free, fair and secure cyberspace" in Japan. The NICS created a cybersecurity strategy in 2018 that outlines nation-states and cybercrime to be some of the most key threats. It also indicates that terrorist usage of the cyberspace needs to be monitored and understood. The Security Council of the Russian Federation published the cyber security strategy doctrine in 2016. This strategy highlights the following threat actors as a risk to cyber security measures: nation-state actors, cyber criminals, and terrorists. == Techniques == Threat actors use techniques like Social engineering (security), and Phishing, alongside technical exploits like Cross-site scripting, SQL injection, and denial-of-service attacks. == Limitations == In practice, actor categories may overlap (Edward Snowden for example), and the same activity may combine features associated with hacktivism, cybercrime and state-linked operations. The lines between hacktivism, cybercrime and state-nexus activity had continued to blur, with shared toolsets, overlapping methods, fake personas, hacktivist adoption of ransomware, and cybercriminal or state-linked actors masquerading as other groups. Threat actor analysis also has limits as a risk-management method. NIST notes that risk assessments depend on their purpose, scope, assumptions, constraints, information sources, risk model and analytic approach, and that assessments are tied to particular time frames and organisational contexts. NIST also warns that simple threat-vulnerability pairing may be undesirable or problematic where there are many threats and vulnerabilities, and recom

Software intelligence

Software intelligence is insight into the inner workings and structural condition of software assets produced by software designed to analyze database structure, software framework and source code to better understand and control complex software systems in information technology environments. Similarly to business intelligence (BI), software intelligence is produced by a set of software tools and techniques for the mining of data and the software's inner-structure. Results are automatically produced and feed a knowledge base containing technical documentation and blueprints of the innerworking of applications, and make it available to all to be used by business and software stakeholders to make informed decisions, measure the efficiency of software development organizations, communicate about the software health, prevent software catastrophes. == History == Software intelligence has been used by Kirk Paul Lafler, an American engineer, entrepreneur, and consultant, and founder of Software Intelligence Corporation in 1979. At that time, it was mainly related to SAS activities, in which he has been an expert since 1979. In the early 1980s, Victor R. Basili participated in different papers detailing a methodology for collecting valid software engineering data relating to software engineering, evaluation of software development, and variations. In 2004, different software vendors in software analysis started using the terms as part of their product naming and marketing strategy. Then in 2010, Ahmed E. Hassan and Tao Xie defined software intelligence as a "practice offering software practitioners up-to-date and pertinent information to support their daily decision-making processes and Software Intelligence should support decision-making processes throughout the lifetime of a software system". They go on by defining software intelligence as a "strong impact on modern software practice" for the upcoming decades. == Capabilities == Because of the complexity and wide range of components and subjects implied in software, software intelligence is derived from different aspects of software: Software composition is the construction of software application components. Components result from software coding, as well as the integration of the source code from external components: Open source, 3rd party components, or frameworks. Other components can be integrated using application programming interface call to libraries or services. Software architecture refers to the structure and organization of elements of a system, relations, and properties among them. Software flaws designate problems that can cause security, stability, resiliency, and unexpected results. There is no standard definition of software flaws but the most accepted is from The MITRE Corporation where common flaws are cataloged as Common Weakness Enumeration. Software grades assess attributes of the software. Historically, the classification and terminology of attributes have been derived from the ISO 9126-3 and the subsequent ISO 25000:2005 quality model. Software economics refers to the resource evaluation of software in the past, present, or future to make decisions and to govern. == Components == The capabilities of software intelligence platforms include an increasing number of components: Code analyzer to serve as an information basis for other software intelligence components identifying objects created by the programming language, external objects from Open source, third parties objects, frameworks, API, or services Graphical visualization and blueprinting of the inner structure of the software product or application considered including dependencies, from data acquisition (automated and real-time data capture, end-user entries) up to data storage, the different layers within the software, and the coupling between all elements. Navigation capabilities within components and impact analysis features List of flaws, architectural and coding violations, against standardized best practices, cloud blocker preventing migration to a Cloud environment, and rogue data-call entailing the security and integrity of software Grades or scores of the structural and software quality aligned with industry-standard like OMG, CISQ or SEI assessing the reliability, security, efficiency, maintainability, and scalability to cloud or other systems. Metrics quantifying and estimating software economics including work effort, sizing, and technical debt Industry references and benchmarking allowing comparisons between outputs of analysis and industry standards == User aspect == Some considerations must be made in order to successfully integrate the usage of software Intelligence systems in a company. Ultimately the software intelligence system must be accepted and utilized by the users in order for it to add value to the organization. If the system does not add value to the users' mission, they simply don't use it as stated by M. Storey in 2003. At the code level and system representation, software intelligence systems must provide a different level of abstractions: an abstract view for designing, explaining and documenting and a detailed view for understanding and analyzing the software system. At the governance level, the user acceptance for software intelligence covers different areas related to the inner functioning of the system as well as the output of the system. It encompasses these requirements: Comprehensive: missing information may lead to a wrong or inappropriate decision, as well as it is a factor influencing the user acceptance of a system. Accurate: accuracy depends on how the data is collected to ensure fair and indisputable opinion and judgment. Precise: precision is usually judged by comparing several measurements from the same or different sources. Scalable: lack of scalability in the software industry is a critical factor leading to failure. Credible: outputs must be trusted and believed. Deploy-able and usable. == Applications == Software intelligence has many applications in all businesses relating to the software environment, whether it is software for professionals, individuals, or embedded software. Depending on the association and the usage of the components, applications will relate to: Change and modernization: uniform documentation and blueprinting on all inner components, external code integrated, or call to internal or external components of the software Resiliency and security: measuring against industry standards to diagnose structural flaws in an IT environment. Compliance validation regarding security, specific regulations or technical matters. Decisions making and governance: Providing analytics about the software itself or stakeholders involved in the development of the software, e.g. productivity measurement to inform business and IT leaders about progress towards business goals. Assessment and Benchmarking to help business and IT leaders to make informed, fact-based decision about software. == Marketplace == Software intelligence is a high-level discipline and has been gradually growing covering the applications listed above. There are several markets driving the need for it: Application Portfolio Analysis (APA) aiming at improving the enterprise performance. Software Assessment for producing the software KPI and improving quality and productivity. Software security and resiliency measures and validation. Software evolution or legacy modernization, for which blueprinting the software systems are needed nor tools improving and facilitating modifications.