AI Art Enhancer

AI Art Enhancer — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Sycophancy (artificial intelligence)

    Sycophancy (artificial intelligence)

    In the field of artificial intelligence, sycophancy is a tendency of large language models (LLMs) and other AI assistants to tailor their responses to what they predict the user wants to hear rather than to what is accurate or warranted. The behavior takes several forms: an assistant may agree with a user's stated opinion even when the user is mistaken; it may abandon a correct answer after a challenge such as "are you sure?"; it may validate beliefs, decisions or self-presentation regardless of merit; or it may praise the user, their work or their ideas in unwarranted terms. The word is borrowed from the ordinary English term for fawning flattery, and is used in AI alignment and AI safety research to describe a class of misalignment failures associated with training on human feedback. Researchers at Anthropic first documented the behavior systematically in 2022. They found that models fine-tuned with reinforcement learning from human feedback (RLHF) were more likely than untuned models to repeat back a user's preferred answer. A 2023 follow-up paper, "Towards Understanding Sycophancy in Language Models", showed that five frontier assistants from OpenAI, Anthropic and Meta all exhibited the behavior, and traced its origin to biases in the human preference data used during training. Later work documented sycophancy in mathematics, medicine, academic peer review and other domains, and identified a broader category called "social sycophancy" affecting an assistant's emotional and interpersonal responses. The issue drew widespread public attention in April 2025 after OpenAI rolled back an update to its GPT-4o model. Users had reported that the assistant praised dangerous decisions, endorsed delusional thinking and offered exaggerated compliments for trivial prompts. OpenAI's post-mortem attributed the change in behavior to an additional training signal based on user thumbs-up and thumbs-down feedback. That episode, together with reporting in The New York Times, Rolling Stone and elsewhere on users drawn into delusional thinking through prolonged chatbot interaction, has been cited in litigation and in academic studies as evidence that sycophancy poses risks to user well-being. Proposed mitigations include fine-tuning on synthetic data that rewards disagreement with incorrect user statements, editing the small subset of model parameters causally responsible for the behavior, changes to the dialogue or system prompt, and benchmarks designed to surface sycophantic behavior before models are released. == Causes == The dominant explanation points to RLHF, the standard technique for aligning chat assistants with user expectations. Human annotators rank candidate model responses; a reward model is trained to predict those rankings; and the language model is then optimized against the reward model. Because human raters tend to prefer outputs that confirm their existing beliefs or flatter their work, the pipeline systematically rewards responses that agree with the annotator. Perez and colleagues at Anthropic published the first large-scale empirical evidence of the effect in 2022. They reported that RLHF training increased the probability that a model would repeat back a dialog user's preferred answer, and that larger models exhibited the behavior more strongly. Sharma and colleagues, the following year, went further and examined Anthropic's own preference data directly. Both the human raters and the reward models trained on their judgments preferred convincingly written sycophantic responses to truthful ones at a non-negligible rate. Wei and co-authors at Google DeepMind found similar results in the PaLM family, observing that both model scale and instruction tuning increased sycophancy on opinion questions. The behavior is often classified as a form of reward hacking, in which an optimization process exploits a flaw in its reward signal rather than achieving the intended objective. OpenAI's post-mortem of the April 2025 GPT-4o incident identified a more specific mechanism. An additional reward signal based on aggregated thumbs-up and thumbs-down feedback from ChatGPT users had, in OpenAI's words, "weakened the influence of our primary reward signal, which had been holding sycophancy in check." Separately, an Anthropic interpretability paper from 2025 located a linear direction in a model's internal activations corresponding to sycophantic behavior, and showed that such "persona vectors" could be used to flag sycophancy-inducing training data and to steer models away from the trait at inference time. == Measurement == The Anthropic team released SycophancyEval with its 2023 paper, supplying test sets for each of the four canonical behaviors. Two further benchmarks from Stanford followed in 2025. SycEval, applied to mathematical and medical reasoning tasks, reported an overall sycophancy rate of 58 per cent across the GPT-4o, Claude and Gemini models tested. ELEPHANT, aimed at social sycophancy, found that the eleven LLMs evaluated affirmed posts that the Reddit community r/AmITheAsshole had judged inappropriate in 42 per cent of cases, and preserved a user's face 45 percentage points more often than human respondents did. Domain-specific benchmarks have followed. BrokenMath tests robustness to plausible-looking but false mathematical claims drawn from competition problems, and reports that the best evaluated model was sycophantic in 29 per cent of cases. SYCON-Bench measures how many dialogue turns are required before a model abandons a correct position. Visual sycophancy in multimodal models has been examined with MM-SY and PENDULUM. A 2026 study by researchers at the Massachusetts Institute of Technology reported that personalization features, which adapt assistants to individual users over repeated sessions, can intensify social sycophancy. == Notable incidents == === GPT-4o rollback (April 2025) === On 25 April 2025, OpenAI completed the rollout of an update to GPT-4o, the default model used in ChatGPT at the time. Within days, users reported that the assistant had begun praising trivial messages in extravagant terms, endorsing impulsive or dangerous decisions, and reinforcing strong emotional statements without pushback. Widely shared examples included the model congratulating a user who reported stopping prescribed psychiatric medication, and praising a business plan to sell "shit on a stick" as venture-capital ready. OpenAI's chief executive, Sam Altman, wrote on 27 April that recent updates had made the model "too sycophant-y and annoying" and said fixes were in progress. The company began reverting the update on 28 April and completed the rollback for free users by 30 April. Two post-mortems followed: a short note on 29 April and a longer technical follow-up, "Expanding on what we missed with sycophancy", on 2 May. Both attributed the regression to a new training signal based on user thumbs-up and thumbs-down feedback, to inadequate pre-launch evaluation for sycophantic drift, and to the dismissal of qualitative concerns raised by internal testers before release. Reporting in CNN, Fortune and Bloomberg News treated the incident as a turning point in public awareness of the problem. === Chatbot-related psychological harm === From mid-2025 onward, news reports began to link sycophantic chatbot behavior to acute psychological harm. In June 2025, The New York Times technology reporter Kashmir Hill published an investigation centered on Eugene Torres, a Manhattan accountant with no history of mental illness, who developed a sustained delusional episode after a series of conversations with ChatGPT about simulation theory. According to the article, the assistant encouraged Torres to stop taking prescribed medication, to cut off friends and family, and at one point told him that he could fly from a nineteen-story building if he "truly believed". Futurism and Rolling Stone ran parallel investigations documenting other cases in which heavy use of ChatGPT had been associated with delusional thinking, involuntary commitment or, in at least one case, the death of a user with a pre-existing psychiatric diagnosis. A 2026 paper by researchers at the Massachusetts Institute of Technology and the University of Washington put forward a formal Bayesian model. It showed that even an ideally rational user could be drawn into what the authors call "delusional spiraling" when interacting with a sufficiently sycophantic assistant, and that the effect was not eliminated by suppressing hallucinations or by warning users in advance. The lawsuit Raine v. OpenAI, filed in San Francisco Superior Court in August 2025 by the parents of a sixteen-year-old who had died by suicide, alleges that "heightened sycophancy" was a design feature of ChatGPT that contributed to their son's death; it is the first wrongful-death suit against a large language-model provider. === Wider commentary === Mainstream coverage in outlets including The New York Times, The Washington Pos

    Read more →
  • Media intelligence

    Media intelligence

    Media intelligence uses data mining and data science to analyze public, social and editorial media content. It refers to marketing systems that synthesize billions of online conversations into relevant information. This allow organizations to measure and manage content performance, understand trends, and drive communications and business strategy. Media intelligence can include software as a service using big data terminology. This includes questions about messaging efficiency, share of voice, audience geographical distribution, message amplification, influencer strategy, journalist outreach, creative resonance, and competitor performance in all these areas. Media intelligence differs from business intelligence in that it uses and analyzes data outside company firewalls. Examples of that data are user-generated content on social media sites, blogs, comment fields, and wikis etc. It may also include other public data sources like press releases, news, blogs, legal filings, reviews and job postings. Media intelligence may also include competitive intelligence, wherein information that is gathered from publicly available sources such as social media, press releases, and news announcements are used to better understand the strategies and tactics being deployed by competing businesses. Media intelligence is enhanced by means of emerging technologies like ambient intelligence, machine learning, semantic tagging, natural language processing, sentiment analysis and machine translation. == Technologies used == Different media intelligence platforms use different technologies for monitoring, curating content, engaging with content, data analysis and measurement of communications and marketing campaign success. These technology providers may obtain content by scraping content directly from websites or by connecting to the API provided by social media, or other content platforms that are created for 3rd party developers to develop their own applications and services that access data. Technology companies may also get data from a data reseller. Some social media monitoring and analytics companies use calls to data providers each time an end-user develops a query. Others archive and index social media posts to provide end users with on-demand access to historical data and enable methodologies and technologies leveraging network and relational data. Additional monitoring companies use crawlers and spidering technology to find keyword references, known as semantic analysis or natural language processing. Basic implementation involves curating data from social media on a large scale and analyzing the results to make sense out of it.

    Read more →
  • Dashboard (computing)

    Dashboard (computing)

    In computer information systems, a dashboard is a type of graphical user interface which often provides at-a-glance views of data relevant to a particular objective or process through a combination of visualizations and summary information. In other usage, "dashboard" is another name for "progress report" or "report" and is considered a form of data visualization. The dashboard is often accessible by a web browser and is typically linked to regularly updating data sources. Dashboards are often interactive and facilitate users to explore the data themselves, usually by clicking into elements to view more detailed information. The term dashboard originates from the automobile dashboard where drivers monitor the major functions at a glance via the instrument panel. == History == The idea of digital dashboards followed the study of decision support systems in the 1970s. Early predecessors of the modern business dashboard were first developed in the 1980s in the form of Executive Information Systems (EISs). Due to problems primarily with data refreshing and handling, it was soon realized that the approach wasn't practical as information was often incomplete, unreliable, and spread across too many disparate sources. Thus, EISs hibernated until the 1990s when the information age quickened pace and data warehousing, and online analytical processing (OLAP) allowed dashboards to function adequately. Despite the availability of enabling technologies, the dashboard use didn't become popular until later in that decade, with the rise of key performance indicators (KPIs), and the introduction of Robert S. Kaplan and David P. Norton's balanced scorecard. In the late 1990s, Microsoft promoted a concept known as the Digital Nervous System and "digital dashboards" were described as being one leg of that concept. Today, the use of dashboards forms an important part of Business Performance Management (BPM). Initially dashboards were used for monitoring purposes, now with the advancement of technology, dashboards are being used for more analytical purposes. The use of dashboards has now been incorporating; scenario analysis, drill down capabilities, and presentation format flexibility. == Benefits == Digital dashboards allow managers to monitor the contribution of the various departments in their organization. In addition, they enable “rolling up” of information to present a consolidated view across an organization. To gauge exactly how well an organization is performing overall, digital dashboards allow you to capture and report specific data points from each department within the organization, thus providing a "snapshot" of performance. Benefits of using digital dashboards include: Visual presentation of performance measures Ability to identify and correct negative trends Measure efficiencies/inefficiencies Ability to generate detailed reports showing new trends Ability to make more informed decisions based on collected business intelligence Dashboards offers a holistic view of the entire business as it gives the manager a bird's eye view into the performance of sales, data inventory, web traffic, social media analytics and other associated data that is visually presented on a single dashboard. Dashboards lead to better management of marketing/financial strategies as a dashboard for the display of marketing data makes the process of marketing easier and more reliable as compared to doing it manually. Web analytics play a crucial role in shaping the marketing strategy of many businesses. Dashboards also facilitate for better tracking of sales and financial reporting as the data is more precise and in one area. Lastly, dashboards offer for better customer service through monitoring because they keep both the managers and the clients updated on the project progress through automated emails and notifications. == Align strategies and organizational goals == Gain total visibility of all systems instantly Quick identification of data outliers and correlations Consolidated reporting into one location Available on mobile devices to quickly access metrics == Classification == Dashboards can be broken down according to role and are either strategic, analytical, operational, or informational. Dashboards are the 3rd step on the information ladder, demonstrating the conversion of data to increasingly valuable insights. Strategic dashboards support managers at any level in an organization and provide the quick overview that decision-makers need to monitor the health and opportunities of the business. Dashboards of this type focus on high-level measures of performance and forecasts. Strategic dashboards benefit from static snapshots of data (daily, weekly, monthly, and quarterly) that are not constantly changing from one moment to the next. Dashboards for analytical purposes often include more context, comparisons, and history, along with subtler performance evaluators. In addition, analytical dashboards typically support interactions with the data, such as drilling down into the underlying details. Dashboards for monitoring operations are often designed differently from those that support strategic decision making or data analysis and often require monitoring of activities and events that are constantly changing and might require attention and response at a moment's notice. == Types of dashboards == Digital dashboards may be laid out to track the flows inherent in the business processes that they monitor. Graphically, users may see the high-level processes and then drill down into low-level data. This level of detail is often buried deep within the corporate enterprise and otherwise unavailable to the senior executives. Three main types of digital dashboards dominate the market today: desktop software applications, web-browser-based applications, and desktop applications are also known as desktop widgets. The last are driven by a widget engine. Both Desktop and Browser-based providers enable the distribution of dashboards via a web browser. An example of the latter is web-based-browser Asana, which helps teams orchestrate their work, from daily tasks to strategic cross-functional initiatives. With it, teams can manage everything from company objectives to digital transformation to product launches and marketing campaigns. Specialized dashboards may track all corporate functions. Examples include human resources, recruiting, sales, operations, security, information technology, project management, customer relationship management, digital marketing and many more departmental dashboards. For a smaller organization like a startup a compact startup scorecard dashboard tracks important activities across lot of domains ranging from social media to sales. Digital dashboard projects involve business units as the driver and the information technology department as the enabler. Therefore, the success of dashboard projects depends on the relevancy/importance of information provided within the dashboard. This includes the metrics chosen to monitor and the timeliness of the data forming those metrics; data must be up to date and accurate. Key performance indicators, balanced scorecards, and sales performance figures are some of the content appropriate on business dashboards. === Performance Dashboards === Dashboards involve the combination of visual and functional features. This combination of features helps improve cognition and interpretation. A performance dashboard sits at the intersection of two powerful disciplines: business intelligence and performance management. Therefore, there are different users who could use these dashboards for different reasons. For example, a level of workers could look at monitoring inventory while those in more managerial roles can look at lagging measure. Then executives could utilize the dashboard to evaluate strategic performance against objectives. == Dashboards and scorecards == Balanced scorecards and dashboards have been linked together as if they were interchangeable. However, although both visually display critical information, the difference is in the format: Scorecards can open the quality of an operation while dashboards provide calculated direction. A balanced scorecard has what they called a "prescriptive" format. It should always contain these components: Perspectives – group Objectives – verb-noun phrases pulled from a strategy plan Measures – also called metric or key performance indicators (KPIs) Spotlight indicators – red, yellow, or green symbols that provide an at-a-glance view of a measure's performance. Each of these sections ensures that a Balanced Scorecard is essentially connected to the businesses critical strategic needs. The design of a dashboard is more loosely defined. Dashboards are usually a series of graphics, charts, gauges and other visual indicators that can be monitored and interpreted. Even when there is a strategic link, on a dashboard, it may not be noticed as such since objectives are not normally pre

    Read more →
  • Upworthy

    Upworthy

    Upworthy is a media brand that focuses on positive storytelling. It was started in March 2012 by Eli Pariser, the former executive director of MoveOn, and Peter Koechley, the former managing editor of The Onion. One of Facebook's co-founders, Chris Hughes, was an early investor. At its peak between 2012 and 2014, it reached up to 100 million people a month. In 2017, the company was acquired by Good Worldwide. == History == Upworthy was launched in 2012 with a focus on aggregating positive content, which aligned with Facebook's algorithm. Originally, Upworthy curators searched the internet for existing content to feature on the site. Once selected as an option, curators brainstormed different headlines and shareable images for the content, and tested it with a small sample of Upworthy's visitors before sharing it on the site. The site popularized a clickbait style of two-phrase headlines. The company simplifies issues that are controversial by nature, which are presented from a politically liberal point of view and are heavily fact-checked for accuracy. In June 2013, an article in Fast Company called Upworthy "the fastest growing media site of all time". It had 8.7 million unique monthly visitors in the first six months, and in November 2013, had a high of 87 million unique visitors in a single month. In 2013, Facebook changed its algorithm, leading to a significant decline in readers from that platform. Upworthy fired one round of writers in 2015, and another in 2016, after an unionization effort by some of the staff. The union involved, the Writers Guild of America, East, has organized several online "viral" news publishers. In January 2017, Upworthy was acquired by media company GOOD Worldwide. The newsrooms of the two organizations would merge as part of the acquisition. About 20 staffers were laid off as part of the merger. In March 2020, Upworthy saw a 65% increase in Instagram followers and a 47% increased interest in positive content on-site page views as a result of increased interest in positive content during the COVID-19 pandemic. In January 2023, National Geographic Books bought Good People: Stories From the Best of Humanity from Upworthy, with a publication date of September 3, 2024. The book is described as "a heartwarming collection of first-person tales that will provide comfort and inspiration to anyone who could use a little dose of joy right now". It was created by two senior Upworthy team members, Gabriel Reilich and Lucia Knell, and features 101 stories from Upworthy's audience. The co-creators encouraged Upworthy followers to connect with the brand through questions on their posts, opening the door for organic and personal stories to be shared in the comment sections. The book debuted on The New York Times nonfiction bestseller list on September 22, 2024, and remained on the list for two weeks. The book is seen in the top 10 on Publishers Weekly Fall 2024 Adult Preview: Lifestyle and on The Washington Post's "5 feel-good books".

    Read more →
  • Thai QR Payment

    Thai QR Payment

    Thai QR Payment or PromptPay (พร้อมเพย์) is a real-time payment system in Thailand that allows money transfers through digital channels using identifiers linked to a bank account, including a mobile phone number, citizen identification number, tax identification number or bank account number. The system was introduced in 2016 as part of Thailand's national e-payment infrastructure and was developed under the National e-Payment Master Plan, a government programme intended to expand digital payment infrastructure and reduce the use of cash in everyday transactions. It is owned by National ITMX ltd and Bank of Thailand and developed by Vocalink, a group by Mastercard == History == PromptPay (originally AnyID) is one of the National e-Payment projects and policies by Thailand, to regulate and standardize electronic payments to follow the technologies with internet and smartphones that is expanding and bringing technology into Finance and Commerce. By 22 December 2015, The First Prayut cabinet have approved the project as a national infastructure PromptPay has also been used in cross-border payment linkages with other real-time payment systems in Southeast Asia. In April 2021, the Monetary Authority of Singapore and the Bank of Thailand launched a linkage between Singapore's PayNow and Thailand's PromptPay, allowing customers of participating banks to send money between the two countries using a mobile phone number. In June 2021, the central banks of Thailand and Malaysia launched a cross-border QR payment linkage between PromptPay and Malaysia's DuitNow system. == Services == PromptPay's Services have included Encrypted Transactions and Payment between Two Individuals (C2C) Government Infrastructure Payment Tax Returns Individual PromptPay e-Wallet Thai QR Payment Pay Alert e-Donation Cross Border QR Payment

    Read more →
  • Contrast set learning

    Contrast set learning

    Contrast set learning is a form of association rule learning that seeks to identify meaningful differences between separate groups by reverse-engineering the key predictors that identify for each particular group. For example, given a set of attributes for a pool of students (labeled by degree type), a contrast set learner would identify the contrasting features between students seeking bachelor's degrees and those working toward PhD degrees. == Overview == A common practice in data mining is to classify, to look at the attributes of an object or situation and make a guess at what category the observed item belongs to. As new evidence is examined (typically by feeding a training set to a learning algorithm), these guesses are refined and improved. Contrast set learning works in the opposite direction. While classifiers read a collection of data and collect information that is used to place new data into a series of discrete categories, contrast set learning takes the category that an item belongs to and attempts to reverse engineer the statistical evidence that identifies an item as a member of a class. That is, contrast set learners seek rules associating attribute values with changes to the class distribution. They seek to identify the key predictors that contrast one classification from another. For example, an aerospace engineer might record data on test launches of a new rocket. Measurements would be taken at regular intervals throughout the launch, noting factors such as the trajectory of the rocket, operating temperatures, external pressures, and so on. If the rocket launch fails after a number of successful tests, the engineer could use contrast set learning to distinguish between the successful and failed tests. A contrast set learner will produce a set of association rules that, when applied, will indicate the key predictors of each failed tests versus the successful ones (the temperature was too high, the wind pressure was too high, etc.). Contrast set learning is a form of association rule learning. Association rule learners typically offer rules linking attributes commonly occurring together in a training set (for instance, people who are enrolled in four-year programs and take a full course load tend to also live near campus). Instead of finding rules that describe the current situation, contrast set learners seek rules that differ meaningfully in their distribution across groups (and thus, can be used as predictors for those groups). For example, a contrast set learner could ask, “What are the key identifiers of a person with a bachelor's degree or a person with a PhD, and how do people with PhD's and bachelor’s degrees differ?” Standard classifier algorithms, such as C4.5, have no concept of class importance (that is, they do not know if a class is "good" or "bad"). Such learners cannot bias or filter their predictions towards certain desired classes. As the goal of contrast set learning is to discover meaningful differences between groups, it is useful to be able to target the learned rules towards certain classifications. Several contrast set learners, such as MINWAL or the family of TAR algorithms, assign weights to each class in order to focus the learned theories toward outcomes that are of interest to a particular audience. Thus, contrast set learning can be thought of as a form of weighted class learning. === Example: Supermarket Purchases === The differences between standard classification, association rule learning, and contrast set learning can be illustrated with a simple supermarket metaphor. In the following small dataset, each row is a supermarket transaction and each "1" indicates that the item was purchased (a "0" indicates that the item was not purchased): Given this data, Association rule learning may discover that customers that buy onions and potatoes together are likely to also purchase hamburger meat. Classification may discover that customers that bought onions, potatoes, and hamburger meats were purchasing items for a cookout. Contrast set learning may discover that the major difference between customers shopping for a cookout and those shopping for an anniversary dinner are that customers acquiring items for a cookout purchase onions, potatoes, and hamburger meat (and do not purchase foie gras or champagne). == Treatment learning == Treatment learning is a form of weighted contrast-set learning that takes a single desirable group and contrasts it against the remaining undesirable groups (the level of desirability is represented by weighted classes). The resulting "treatment" suggests a set of rules that, when applied, will lead to the desired outcome. Treatment learning differs from standard contrast set learning through the following constraints: Rather than seeking the differences between all groups, treatment learning specifies a particular group to focus on, applies a weight to this desired grouping, and lumps the remaining groups into one "undesired" category. Treatment learning has a stated focus on minimal theories. In practice, treatment are limited to a maximum of four constraints (i.e., rather than stating all of the reasons that a rocket differs from a skateboard, a treatment learner will state one to four major differences that predict for rockets at a high level of statistical significance). This focus on simplicity is an important goal for treatment learners. Treatment learning seeks the smallest change that has the greatest impact on the class distribution. Conceptually, treatment learners explore all possible subsets of the range of values for all attributes. Such a search is often infeasible in practice, so treatment learning often focuses instead on quickly pruning and ignoring attribute ranges that, when applied, lead to a class distribution where the desired class is in the minority. === Example: Boston housing data === The following example demonstrates the output of the treatment learner TAR3 on a dataset of housing data from the city of Boston (a nontrivial public dataset with over 500 examples). In this dataset, a number of factors are collected for each house, and each house is classified according to its quality (low, medium-low, medium-high, and high). The desired class is set to "high", and all other classes are lumped together as undesirable. The output of the treatment learner is as follows: Baseline class distribution: low: 29% medlow: 29% medhigh: 21% high: 21% Suggested Treatment: [PTRATIO=[12.6..16), RM=[6.7..9.78)] New class distribution: low: 0% medlow: 0% medhigh: 3% high: 97% With no applied treatments (rules), the desired class represents only 21% of the class distribution. However, if one filters the data set for houses with 6.7 to 9.78 rooms and a neighborhood parent-teacher ratio of 12.6 to 16, then 97% of the remaining examples fall into the desired class (high-quality houses). == Algorithms == There are a number of algorithms that perform contrast set learning. The following subsections describe two examples. === STUCCO === The STUCCO contrast set learner treats the task of learning from contrast sets as a tree search problem where the root node of the tree is an empty contrast set. Children are added by specializing the set with additional items picked through a canonical ordering of attributes (to avoid visiting the same nodes twice). Children are formed by appending terms that follow all existing terms in a given ordering. The formed tree is searched in a breadth-first manner. Given the nodes at each level, the dataset is scanned and the support is counted for each group. Each node is then examined to determine if it is significant and large, if it should be pruned, and if new children should be generated. After all significant contrast sets are located, a post-processor selects a subset to show to the user - the low order, simpler results are shown first, followed by the higher order results which are "surprising and significantly different." The support calculation comes from testing a null hypothesis that the contrast set support is equal across all groups (i.e., that contrast set support is independent of group membership). The support count for each group is a frequency value that can be analyzed in a contingency table where each row represents the truth value of the contrast set and each column variable indicates the group membership frequency. If there is a difference in proportions between the contrast set frequencies and those of the null hypothesis, the algorithm must then determine if the differences in proportions represent a relation between variables or if it can be attributed to random causes. This can be determined through a chi-square test comparing the observed frequency count to the expected count. Nodes are pruned from the tree when all specializations of the node can never lead to a significant and large contrast set. The decision to prune is based on: The minimum deviation size: The maximum difference between the support

    Read more →
  • Scalable Coherent Interface

    Scalable Coherent Interface

    The Scalable Coherent Interface or Scalable Coherent Interconnect (SCI), is a high-speed interconnect standard for shared memory multiprocessing and message passing. The goal was to scale well, provide system-wide memory coherence and a simple interface; i.e. a standard to replace existing buses in multiprocessor systems with one with no inherent scalability and performance limitations. The IEEE Std 1596-1992, IEEE Standard for Scalable Coherent Interface (SCI) was approved by the IEEE standards board on March 19, 1992. It saw some use during the 1990s, but never became widely used and has been replaced by other systems from the early 2000s. == History == Soon after the Fastbus (IEEE 960) follow-on Futurebus (IEEE 896) project in 1987, some engineers predicted it would already be too slow for the high performance computing marketplace by the time it would be released in the early 1990s. In response, a "Superbus" study group was formed in November 1987. Another working group of the standards association of the Institute of Electrical and Electronics Engineers (IEEE) spun off to form a standard targeted at this market in July 1988. It was essentially a subset of Futurebus features that could be easily implemented at high speed, along with minor additions to make it easier to connect to other systems, such as VMEbus. Most of the developers had their background from high-speed computer buses. Representatives from companies in the computer industry and research community included Amdahl, Apple Computer, BB&N, Hewlett-Packard, CERN, Dolphin Server Technology, Cray Research, Sequent, AT&T, Digital Equipment Corporation, McDonnell Douglas, National Semiconductor, Stanford Linear Accelerator Center, Tektronix, Texas Instruments, Unisys, University of Oslo, University of Wisconsin. The original intent was a single standard for all buses in the computer. The working group soon came up with the idea of using point-to-point communication in the form of insertion rings. This avoided the lumped capacitance, limited physical length/speed of light problems and stub reflections in addition to allowing parallel transactions. The use of insertion rings is credited to Manolis Katevenis who suggested it at one of the early meetings of the working group. The working group for developing the standard was led by David B. Gustavson (chair) and David V. James (Vice Chair). David V. James was a major contributor for writing the specifications including the executable C-code. Stein Gjessing’s group at the University of Oslo used formal methods to verify the coherence protocol and Dolphin Server Technology implemented a node controller chip including the cache coherence logic. Different versions and derivatives of SCI were implemented by companies like Dolphin Interconnect Solutions, Convex, Data General AViiON (using cache controller and link controller chips from Dolphin), Sequent and Cray Research. Dolphin Interconnect Solutions implemented a PCI and PCI-Express connected derivative of SCI that provides non-coherent shared memory access. This implementation was used by Sun Microsystems for its high-end clusters, Thales Group and several others including volume applications for message passing within HPC clustering and medical imaging. SCI was often used to implement non-uniform memory access architectures. It was also used by Sequent Computer Systems as the processor memory bus in their NUMA-Q systems. Numascale developed a derivative to connect with coherent HyperTransport. == The standard == The standard defined two interface levels: The physical level that deals with electrical signals, connectors, mechanical and thermal conditions The logical level that describes the address space, data transfer protocols, cache coherence mechanisms, synchronization primitives, control and status registers, and initialization and error recovery facilities. This structure allowed new developments in physical interface technology to be easily adapted without any redesign on the logical level. Scalability for large systems is achieved through a distributed directory-based cache coherence model. (The other popular models for cache coherency are based on system-wide eavesdropping (snooping) of memory transactions – a scheme which is not very scalable.) In SCI each node contains a directory with a pointer to the next node in a linked list that shares a particular cache line. SCI defines a 64-bit flat address space (16 exabytes) where 16 bits are used for identifying a node (65,536 nodes) and 48 bits for address within the node (256 terabytes). A node can contain many processors and/or memory. The SCI standard defines a packet switched network. === Topologies === SCI can be used to build systems with different types of switching topologies from centralized to fully distributed switching: With a central switch, each node is connected to the switch with a ringlet (in this case a two-node ring). In distributed switching systems, each node can be connected to a ring of arbitrary length and either all or some of the nodes can be connected to two or more rings. The most common way to describe these multi-dimensional topologies is k-ary n-cubes (or tori). The SCI standard specification mentions several such topologies as examples. The 2-D torus is a combination of rings in two dimensions. Switching between the two dimensions requires a small switching capability in the node. This can be expanded to three or more dimensions. The concept of folding rings can also be applied to the Torus topologies to avoid any long connection segments. === Transactions === SCI sends information in packets. Each packet consists of an unbroken sequence of 16-bit symbols. The symbol is accompanied by a flag bit. A transition of the flag bit from 0 to 1 indicates the start of a packet. A transition from 1 to 0 occurs 1 (for echoes) or 4 symbols before the packet end. A packet contains a header with address command and status information, payload (from 0 through optional lengths of data) and a CRC check symbol. The first symbol in the packet header contains the destination node address. If the address is not within the domain handled by the receiving node, the packet is passed to the output through the bypass FIFO. In the other case, the packet is fed to a receive queue and may be transferred to a ring in another dimension. All packets are marked when they pass the scrubber (a node is established as scrubber when the ring is initialized). Packets without a valid destination address will be removed when passing the scrubber for the second time to avoid filling the ring with packets that would otherwise circulate indefinitely. === Cache coherence === Cache coherence ensures data consistency in multiprocessor systems. The simplest form applied in earlier systems was based on clearing the cache contents between context switches and disabling the cache for data that were shared between two or more processors. These methods were feasible when the performance difference between the cache and memory were less than one order of magnitude. Modern processors with caches that are more than two orders of magnitude faster than main memory would not perform anywhere near optimal without more sophisticated methods for data consistency. Bus based systems use eavesdropping (snooping) methods since buses are inherently broadcast. Modern systems with point-to point links use broadcast methods with snoop filter options to improve performance. Since broadcast and eavesdropping are inherently non-scalable, these are not used in SCI. Instead, SCI uses a distributed directory-based cache coherence protocol with a linked list of nodes containing processors that share a particular cache line. Each node holds a directory for the main memory of the node with a tag for each line of memory (same line length as the cache line). The memory tag holds a pointer to the head of the linked list and a state code for the line (three states – home, fresh, gone). Associated with each node is also a cache for holding remote data with a directory containing forward and backward pointers to nodes in the linked list sharing the cache line. The tag for the cache has seven states (invalid, only fresh, head fresh, only dirty, head dirty, mid valid, tail valid). The distributed directory is scalable. The overhead for the directory based cache coherence is a constant percentage of the node’s memory and cache. This percentage is in the order of 4% for the memory and 7% for the cache. == Legacy == SCI is a standard for connecting the different resources within a multiprocessor computer system, and it is not as widely known to the public as for example the Ethernet family for connecting different systems. Different system vendors implemented different variants of SCI for their internal system infrastructure. These different implementations interface to very intricate mechanisms in processors and memory systems and each vendor has to preserve some degrees of

    Read more →
  • Kerckhoffs's principle

    Kerckhoffs's principle

    Kerckhoffs's principle (also called Kerckhoffs's desideratum, assumption, axiom, doctrine or law) of cryptography was stated by the Dutch cryptographer Auguste Kerckhoffs in the 19th century. The principle holds that a cryptosystem should be secure, even if everything about the system, except the key, is public knowledge. This concept is widely embraced by cryptographers, in contrast to security through obscurity, which is not. Kerckhoffs's principle was phrased by the American mathematician Claude Shannon as "the enemy knows the system", i.e., "one ought to design systems under the assumption that the enemy will immediately gain full familiarity with them". In that form, it is called Shannon's maxim. Another formulation by American researcher and professor Steven M. Bellovin is: In other words—design your system assuming that your opponents know it in detail. (A former official at NSA's National Computer Security Center told me that the standard assumption there was that serial number 1 of any new device was delivered to the Kremlin.) == Origins == The invention of telegraphy radically changed military communications and increased the number of messages that needed to be protected from the enemy dramatically, leading to the development of field ciphers which had to be easy to use without large confidential codebooks prone to capture on the battlefield. It was this environment which led to the development of Kerckhoffs's requirements. Auguste Kerckhoffs was a professor of German language at Ecole des Hautes Etudes Commerciales (HEC) in Paris. In early 1883, Kerckhoffs's article, La Cryptographie Militaire, was published in two parts in the Journal of Military Science, in which he stated six design rules for military ciphers. Translated from French, they are: The system must be practically, if not mathematically, indecipherable; It should not require secrecy, and it should not be a problem if it falls into enemy hands; It must be possible to communicate and remember the key without using written notes, and correspondents must be able to change or modify it at will; It must be applicable to telegraph communications; It must be portable, and should not require several persons to handle or operate; Lastly, given the circumstances in which it is to be used, the system must be easy to use and should not be stressful to use or require its users to know and comply with a long list of rules. Some are no longer relevant given the ability of computers to perform complex encryption. The second rule, now known as Kerckhoffs's principle, is still critically important. == Explanation of the principle == Kerckhoffs viewed cryptography as a rival to, and a better alternative than, steganographic encoding, which was common in the nineteenth century for hiding the meaning of military messages. One problem with encoding schemes is that they rely on humanly-held secrets such as "dictionaries" which disclose for example, the secret meaning of words. Steganographic-like dictionaries, once revealed, permanently compromise a corresponding encoding system. Another problem is that the risk of exposure increases as the number of users holding the secrets increases. Nineteenth century cryptography, in contrast, used simple tables which provided for the transposition of alphanumeric characters, generally given row-column intersections which could be modified by keys which were generally short, numeric, and could be committed to human memory. The system was considered "indecipherable" because tables and keys do not convey meaning by themselves. Secret messages can be compromised only if a matching set of table, key, and message falls into enemy hands in a relevant time frame. Kerckhoffs viewed tactical messages as only having a few hours of relevance. Systems are not necessarily compromised, because their components (i.e. alphanumeric character tables and keys) can be easily changed. === Advantage of secret keys === Using secure cryptography is supposed to replace the difficult problem of keeping messages secure with a much more manageable one, keeping relatively small keys secure. A system that requires long-term secrecy for something as large and complex as the whole design of a cryptographic system obviously cannot achieve that goal. It only replaces one hard problem with another. However, if a system is secure even when the enemy knows everything except the key, then all that is needed is to manage keeping the keys secret. There are a large number of ways the internal details of a widely used system could be discovered. The most obvious is that someone could bribe, blackmail, or otherwise threaten staff or customers into explaining the system. In war, for example, one side will probably capture some equipment and people from the other side. Each side will also use spies to gather information. If a method involves software, someone could do memory dumps or run the software under the control of a debugger in order to understand the method. If hardware is being used, someone could buy or steal some of the hardware and build whatever programs or gadgets needed to test it. Hardware can also be dismantled so that the chip details can be examined under the microscope. === Maintaining security === A generalization some make from Kerckhoffs's principle is: "The fewer and simpler the secrets that one must keep to ensure system security, the easier it is to maintain system security." Bruce Schneier ties it in with a belief that all security systems must be designed to fail as gracefully as possible: Kerckhoffs's principle applies beyond codes and ciphers to security systems in general: every secret creates a potential failure point. Secrecy, in other words, is a prime cause of brittleness—and therefore something likely to make a system prone to catastrophic collapse. Conversely, openness provides ductility. Any security system depends crucially on keeping some things secret. However, Kerckhoffs's principle points out that the things kept secret ought to be those least costly to change if inadvertently disclosed. For example, a cryptographic algorithm may be implemented by hardware and software that is widely distributed among users. If security depends on keeping that secret, then disclosure leads to major logistic difficulties in developing, testing, and distributing implementations of a new algorithm – it is "brittle". On the other hand, if keeping the algorithm secret is not important, but only the keys used with the algorithm must be secret, then disclosure of the keys simply requires the simpler, less costly process of generating and distributing new keys. == Applications == In accordance with Kerckhoffs's principle, the majority of civilian cryptography makes use of publicly known algorithms. By contrast, ciphers used to protect classified government or military information are often kept secret (see Type 1 encryption). However, it should not be assumed that government/military ciphers must be kept secret to maintain security. It is possible that they are intended to be as cryptographically sound as public algorithms, and the decision to keep them secret is in keeping with a layered security posture. == Security through obscurity == It is moderately common for companies to keep the inner workings of a system secret. Some argue this "security by obscurity" makes the product safer and less vulnerable to attack. A counter-argument is that keeping the innards secret may improve security in the short term, but in the long run, only systems that have been published and analyzed should be trusted. Steven Bellovin and Randy Bush commented: Security Through Obscurity Considered Dangerous Hiding security vulnerabilities in algorithms, software, and/or hardware decreases the likelihood they will be repaired and increases the likelihood that they can and will be exploited. Discouraging or outlawing discussion of weaknesses and vulnerabilities is extremely dangerous and deleterious to the security of computer systems, the network, and its citizens. Open Discussion Encourages Better Security The long history of cryptography and cryptoanalysis has shown time and time again that open discussion and analysis of algorithms exposes weaknesses not thought of by the original authors, and thereby leads to better and more secure algorithms. As Kerckhoffs noted about cipher systems in 1883 [Kerc83], "Il faut qu'il n'exige pas le secret, et qu'il puisse sans inconvénient tomber entre les mains de l'ennemi." (Roughly, "the system must not require secrecy and must be able to be stolen by the enemy without causing trouble.")

    Read more →
  • NIS2 Directive

    NIS2 Directive

    The Directive (EU) 2022/2555, commonly known as NIS2 is a directive of the European Union aimed at protecting digital infrastructure, in particular critical infrastructure. It broadened the sectors covered by EU network and information security rules and updated incident reporting and oversight compared to the NIS1. Member States were required to transpose NIS2 by 17 October 2024, and the earlier NIS Directive was repealed on 18 October 2024. Only 23 Member States have fully implemented the measures contained with the NIS Directive. Infringement proceedings against them to enforce the Directive have not taken place, and they are not expected to take place in the near future. This failed implementation has led to the fragmentation of cybersecurity capabilities across the EU, with differing standards, incident reporting requirements and enforcement requirements being implemented in different Member States. From the EFTA countries (to April 2026) only Liechtenstein has fully transposed the NIS2 Directive. While the EFTA commission is conducting preparations to transpose the directive into its legislation. == National implementations == === Czech Republic === It is implemented through the Act No. 264/2025 Coll. also called Zákon o kybernetické bezpečnosti (Cybersecurity law) and through another five implementing regulations. The transposing legislation came into force on November 1st, 2025. === Germany === It is implemented through the Gesetz zur Umsetzung der NIS-2-Richtlinie und zur Regelung wesentlicher Grundzüge des Informationssicherheitsmanagements in der Bundesverwaltung. === Ireland === It is implemented through the National Cyber Security Bill. === The Netherlands === It is implemented through the Cyberbeveiligingswet (Cbw). === Slovakia === It is implemented through via an amendment of the Act No. 69/2018 Coll. also called Zákon o kybernetickej bezpečnosti a o zmene a doplnení niektorých zákonov (Law on Cybersecurity and change and amendment of certain laws). It came into force on November 1st, 2025. === Spain === It is implemented through the Esquema Nacional de Seguridad (ENS).

    Read more →
  • Letter frequency

    Letter frequency

    Letter frequency is the number of times letters of the alphabet appear on average in written language. Letter frequency analysis dates back to the Arab mathematician Al-Kindi (c. AD 801–873), who formally developed the method to break ciphers. Letter frequency analysis gained importance in Europe with the development of movable type in AD 1450, wherein one must estimate the amount of type required for each letterform. Linguists use letter frequency analysis as a rudimentary technique for language identification, where it is particularly effective as an indication of whether an unknown writing system is alphabetic, syllabic, or logographic. The use of letter frequencies and frequency analysis plays a fundamental role in cryptograms and several word puzzle games, including hangman, Scrabble, Wordle and the television game show Wheel of Fortune. One of the earliest descriptions in classical literature of applying the knowledge of English letter frequency to solving a cryptogram is found in Edgar Allan Poe's famous story "The Gold-Bug", where the method is successfully applied to decipher a message giving the location of a treasure hidden by Captain Kidd. Herbert S. Zim, in his classic introductory cryptography text Codes and Secret Writing, gives the English letter frequency sequence as "ETAON RISHD LFCMU GYPWB VKJXZQ", the most common letter pairs as "TH HE AN RE ER IN ON AT ND ST ES EN OF TE ED OR TI HI AS TO", and the most common doubled letters as "LL EE SS OO TT FF RR NN PP CC". Different ways of counting can produce somewhat different orders. Letter frequencies also have a strong effect on the design of some keyboard layouts. The most frequent letters are placed on the home row of the Blickensderfer typewriter, the Dvorak keyboard layout, Colemak and other optimized layouts, while the commonly used QWERTY layout places common letters apart from each other to prevent typewriter jamming. == Background == The frequency of letters in text has been studied for use in cryptanalysis, and frequency analysis in particular, dating back to the Arab mathematician al-Kindi (c. AD 801–873 ), who formally developed the method (the ciphers breakable by this technique go back at least to the Caesar cipher used by Julius Caesar, so this method could have been explored in classical times). Letter frequency analysis gained additional importance in Europe with the development of movable type in AD 1450, wherein one must estimate the amount of type required for each letterform, as evidenced by the variations in letter compartment size in typographer's type cases. No exact letter frequency distribution underlies a given language, since all writers write slightly differently. However, most languages have a characteristic distribution which is strongly apparent in longer texts. Even language changes as extreme as from Old English to modern English (regarded as mutually unintelligible) show strong trends in related letter frequencies: over a small sample of Biblical passages, from most frequent to least frequent, enaid sorhm tgþlwu æcfy ðbpxz of Old English compares to eotha sinrd luymw fgcbp kvjqxz of modern English, with the most extreme differences concerning letterforms not shared. Linotype machines for the English language assumed the letter order, from most to least common, to be etaoin shrdlu cmfwyp vbgkqj xz based on the experience and custom of manual compositors. The equivalent for the French language was elaoin sdrétu cmfhyp vbgwqj xz. Arranging the alphabet in Morse into groups of letters that require equal amounts of time to transmit, and then sorting these groups in increasing order, yields e it san hurdm wgvlfbk opxcz jyq. Letter frequency was used by other telegraph systems, such as the Murray Code. Similar ideas are used in modern data-compression techniques such as Huffman coding. Letter frequencies, like word frequencies, tend to vary, both by writer and by subject. For instance, ⟨d⟩ occurs with greater frequency in fiction, as most fiction is written in past tense and thus most verbs will end in the inflectional suffix -ed / -d. One cannot write an essay about x-rays without using ⟨x⟩ frequently, and the essay will have an idiosyncratic letter frequency if the essay is about, say, Queen Zelda of Zanzibar requesting X-rays from Qatar to examine hypoxia in zebras. Different authors have habits which can be reflected in their use of letters. Hemingway's writing style, for example, is visibly different from Faulkner's. Letter, bigram, trigram, word frequencies, word length, and sentence length can be calculated for specific authors and used to prove or disprove authorship of texts, even for authors whose styles are not so divergent. Accurate average letter frequencies can only be gleaned by analyzing a large amount of representative text. With the availability of modern computing and collections of large text corpora, such calculations are easily made. Examples can be drawn from a variety of sources (press reporting, religious texts, scientific texts and general fiction) and there are differences especially for general fiction with the position of ⟨h⟩ and ⟨i⟩, with ⟨h⟩ becoming more common. Different dialects of a language will also affect a letter's frequency. For example, an author in the United States would produce something in which ⟨z⟩ is more common than an author in the United Kingdom writing on the same topic: words like "analyze", "apologize", and "recognize" contain the letter in American English, whereas the same words are spelled "analyse", "apologise", and "recognise" in British English. This would highly affect the frequency of the letter ⟨z⟩, as it is rarely used by British writers in the English language. The "top twelve" letters constitute about 80% of the total usage. The "top eight" letters constitute about 65% of the total usage. Letter frequency as a function of rank can be fitted well by several rank functions, with the two-parameter Cocho/Beta rank function being the best. Another rank function with no adjustable free parameter also fits the letter frequency distribution reasonably well (the same function has been used to fit the amino acid frequency in protein sequences.) A spy using the VIC cipher or some other cipher based on a straddling checkerboard typically uses a mnemonic such as "a sin to err" (dropping the second "r") or "at one sir" to remember the top eight characters. == Relative frequencies of letters in the English language == There are three ways to count letter frequency that result in very different charts for common letters. The first method, used in the chart below, is to count letter frequency in lemmas of a dictionary. The lemma is the word in its canonical form. The second method is to include all word variants when counting, such as "abstracts", "abstracted" and "abstracting" and not just the lemma of "abstract". This second method results in letters like ⟨s⟩ appearing much more frequently, such as when counting letters from lists of the most used English words on the Internet. ⟨s⟩ is especially common in inflected words (non-lemma forms) because it is added to form plurals and third person singular present tense verbs. A final method is to count letters based on their frequency of use in actual texts, resulting in certain letter combinations like ⟨th⟩ becoming more common due to the frequent use of common words like "the", "then", "both", "this", etc. Absolute usage frequency measures like this are used when creating keyboard layouts or letter frequencies in old fashioned printing presses. An analysis of entries in the Concise Oxford dictionary, ignoring frequency of word use, gives an order of "EARIOTNSLCUDPMHGBFYWKVXZJQ". The letter-frequency table above is taken from Pavel Mička's website, which cites Robert Lewand's Cryptological Mathematics. According to Lewand, arranged from most to least common in appearance, the letters are: etaoinshrdlcumwfgypbvkjxqz. Lewand's ordering differs slightly from others, such as Cornell University Math Explorer's Project, which produced a table after measuring 40,000 words. In English, the space character occurs almost twice as frequently as the top letter (⟨e⟩) and the non-alphabetic characters (digits, punctuation, etc.) collectively occupy the fourth position (having already included the space) between ⟨t⟩ and ⟨a⟩. == Relative frequencies of the first letters of a word in the English language == The frequency of the first letters of words or names is helpful in pre-assigning space in physical files and indexes. Given 26 filing cabinet drawers, rather than a 1:1 assignment of one drawer to one letter of the alphabet, it is often useful to use a more equal-frequency-letter code by assigning several low-frequency letters to the same drawer (often one drawer is labeled VWXYZ), and to split up the most-frequent initial letters (⟨s, a, c⟩) into several drawers (often 6 drawers Aa-An, Ao-Az, Ca-Cj, Ck-Cz, Sa-Si, Sj-Sz). The same system is used in some mult

    Read more →
  • Point-to-point encryption

    Point-to-point encryption

    Point-to-point encryption (P2PE) is a standard established by the PCI Security Standards Council. Payment solutions that offer similar encryption but do not meet the P2PE standard are referred to as end-to-end encryption (E2EE) solutions. The objective of P2PE and E2EE is to provide a payment security solution that instantaneously converts confidential payment card (credit and debit card) data and information into indecipherable code at the time the card is swiped, in order to prevent hacking and fraud. It is designed to maximize the security of payment card transactions in an increasingly complex regulatory environment. == The standard == The P2PE Standard defines the requirements that a "solution" must meet in order to be accepted as a PCI-validated P2PE solution. A "solution" is a complete set of hardware, software, gateway, decryption, device handling, etc. Only "solutions" can be validated; individual pieces of hardware such as card readers cannot be validated. It is also a common mistake to refer to P2PE validated solutions as "certified"; there is no such certification. The determination of whether or not a solution meets the P2PE standard is the responsibility of a P2PE Qualified Security Assessor (P2PE-QSA). P2PE-QSA companies are independent third-party companies who employ assessors that have met the PCI Security Standards Council's requirements for education and experience, and have passed the requisite exam. The PCI Security Standards Council does not validate solutions. == How it works == As a payment card is swiped through a card reading device, referred to as a point of interaction (POI) device, at the merchant location or point of sale, the device immediately encrypts the card information. A device that is part of a PCI-validated P2PE solution uses an algorithmic calculation to encrypt the confidential payment card data. From the POI, the encrypted, indecipherable codes are sent to the payment gateway or processor for decryption. The keys for encryption and decryption are never available to the merchant, making card data entirely invisible to the retailer. Once the encrypted codes are within the secure data zone of the payment processor, the codes are decrypted to the original card numbers and then passed to the issuing bank for authorization. The bank either approves or rejects the transaction, depending upon the card holder's payment account status. The merchant is then notified if the payment is accepted or rejected to complete the process along with a token that the merchant can store. This token is a unique number reference to the original transaction that the merchant can use should they ever be needed to perform research or refund the customer without ever knowing the customer's card information (tokenization). There are also Qualified Integrator and Reseller (QIR) Companies, which are businesses authorized to "implement, configure, and/or support validated" PA-DSS Payment Applications, and perform qualified installations. == Solution providers == According to the PCI Security Standards Council:The P2PE solution provider is a third-party entity (for example, a processor, acquirer, or payment gateway) that has overall responsibility for the design and implementation of a specific P2PE solution, and manages P2PE solutions for its merchant customers. The solution provider has overall responsibility for ensuring that all P2PE requirements are met, including any P2PE requirements performed by third-party organizations on behalf of the solution provider (for example, certification authorities and key-injection facilities). == Benefits == === Customer benefits === P2PE significantly reduces the risk of payment card fraud by instantaneously encrypting confidential cardholder data at the moment a payment card is swiped or "dipped" if it is a chip card at the card reading device (payment terminal) or POI. === Merchant benefits === P2PE significantly facilitates merchant responsibilities: With a P2PE validated solution, merchants save significant time and money as PCI requirements may be greatly reduced. Payment Card Industry Data Security Standard (PCI DSS). For organizations who use a P2PE validated solution provider, the PCI Self Assessment Questionnaire is reduced from 12 sections to 4 sections and the controls are reduced from 329 questions to just 35. In the event of fraud, the P2PE Solution Provider, not the merchant, is held accountable for data loss and resulting fines that may be assessed by the card brands (American Express, Visa, MasterCard, Discover, and JCB). The PCI Security Standards Council does not assess penalties on Solution Providers or Merchants. The payment process with P2PE is quicker than other transaction processes, thus creating simpler and faster customer–merchant transactions. == Point-to-point encryption versus end-to-end encryption == === Point-to-point === A point-to-point connection directly links system 1 (the point of payment card acceptance) to system 2 (the point of payment processing). A true P2PE solution is determined with three main factors: The solution uses a hardware-to-hardware encryption and decryption process along with a POI device that has SRED (Secure Reading and Exchange of Data) listed as a function. The solution has been validated to the PCI P2PE Standard which includes specific POI device requirements such as strict controls regarding shipping, receiving, tamper-evident packaging, and installation. A solution includes merchant education in the form of a P2PE Instruction Manual, which guides the merchant on POI device use, storage, return for repairs, and regular PCI reporting. === End-to-end === End-to-end encryption as the name suggests has the advantage over P2PE that card details are not unencrypted between the two endpoints. If the endpoints are a PCI PED validated PIN pad and a POS acquirer, there is no opportunity for the card details to be intercepted. It is obviously important that the endpoints (the PED and gateway) are provided by PCI accredited organisations. == PCI point-to-point encryption requirements == The requirements include: Secure encryption of payment card data at the point of interaction (POI), P2PE validated application(s) at the point of interaction, Secure management of encryption and decryption devices, Management of the decryption environment and all decrypted account data, Use of secure encryption methodologies and cryptographic key operations, including key generation, distribution, loading/injection, administration, and usage.

    Read more →
  • Information security

    Information security

    Information security is the practice of protecting information by mitigating information risks. It is part of information risk management. It typically involves preventing or reducing the probability of unauthorized or inappropriate access to data or the unlawful use, disclosure, disruption, deletion, corruption, modification, inspection, recording, or devaluation of information. It also involves actions intended to reduce the adverse impacts of such incidents. Protected information may take any form, e.g., electronic or physical, tangible (e.g., paperwork), or intangible (e.g., knowledge). Information security's primary focus is the balanced protection of data confidentiality, integrity, and availability (known as the CIA triad, unrelated to the US government organization) while maintaining a focus on efficient policy implementation, all without hampering organization productivity. This is largely achieved through a structured risk management process. To standardize this discipline, academics and professionals collaborate to offer guidance, policies, and industry standards on passwords, antivirus software, firewalls, encryption software, legal liability, security awareness and training, and so forth. This standardization may be further driven by a wide variety of laws and regulations that affect how data is accessed, processed, stored, transferred, and destroyed. While paper-based business operations are still prevalent, requiring their own set of information security practices, enterprise digital initiatives are increasingly being emphasized, with information assurance now typically being dealt with by information technology (IT) security specialists. These specialists apply information security to technology (most often some form of computer system). IT security specialists are almost always found in any major enterprise/establishment due to the nature and value of the data within larger businesses. They are responsible for keeping all of the technology within the company secure from malicious attacks that often attempt to acquire critical private information or gain control of the internal systems. There are many specialist roles in Information Security including securing networks and allied infrastructure, securing applications and databases, security testing, information systems auditing, business continuity planning, electronic record discovery, and digital forensics. == Standards == Information security standards are guidelines generally outlined in published materials that aim to protect a user's or an organization's cyber environment from threats. This environment includes the users themselves, hardware such as devices and networks, software such as applications or services, and any information in storage or transit. These standards comprise security concepts, technologies, and guidelines to deal with an adverse event. They may also include assessment criteria and certification for organizations implementing a minimum level of security. These standards are developed by various international and national bodies to prevent or mitigate cyber-attacks, ensure consistency among developers, and establish a minimum standard in industries susceptible to an attack. The ISO/IEC 27000 family, published by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), provides information about the guidelines and requirements for an Information Security Management System (ISMS). The Common Criteria (ISO/IEC 15408) provides guidelines on evaluating and certifying the security of a system. The IEC 62443 establishes security standards for automation and control systems. Similarly, the ISO/SAE 21434, ETSI EN 303 645, and EN 18031 provide standards for road vehicles, the Internet of Things, and radio-based systems respectively. The NIST Cybersecurity Framework (NIST CSF) is a set of guidelines developed by the U.S. National Institute of Standards and Technology to help organizations with risk management. NIST also publishes various Federal Information Processing Standards (FIPS) and Special Publications. The United Kingdom has introduced Cyber Essentials, which is a certification scheme to protect organizations against common security threats. The Australian Cyber Security Centre publishes the Essential Eight mitigation strategies. The Payment Card Industry Data Security Standard (PCI DSS) regulates handling of cardholder data in order to reduce credit card fraud. UL has published standards related to specific industries such as UL 2900-2-3 for security and life safety signaling systems and UL-2900-2-1 for healthcare and wellness systems. == Threats == Information security threats come in many different forms. Some of the most common threats today are software attacks, theft of intellectual property, theft of identity, theft of equipment or information, sabotage, and information extortion. Viruses, worms, phishing attacks, and Trojan horses are a few common examples of software attacks. The theft of intellectual property has also been an extensive issue for many businesses. Identity theft is the attempt to act as someone else usually to obtain that person's personal information or to take advantage of their access to vital information through social engineering. Sabotage usually consists of the destruction of an organization's website in an attempt to cause loss of confidence on the part of its customers. Information extortion consists of theft of a company's property or information as an attempt to receive a payment in exchange for returning the information or property back to its owner, as with ransomware. One of the most functional precautions against these attacks is to conduct periodical user awareness. Governments, military, corporations, financial institutions, hospitals, non-profit organizations, and private businesses amass a great deal of confidential information about their employees, customers, products, research, and financial status. Should confidential information about a business's customers or finances or new product line fall into the hands of a competitor or hacker, a business and its customers could suffer widespread, irreparable financial loss, as well as damage to the company's reputation. From a business perspective, information security must be balanced against cost; the Gordon-Loeb Model provides a mathematical economic approach for addressing this concern. For the individual, information security has a significant effect on privacy, which is viewed very differently in various cultures. == History == Since the early days of communication, diplomats and military commanders understood that it was necessary to provide some mechanism to protect the confidentiality of correspondence and to have some means of detecting tampering. Julius Caesar is credited with the invention of the Caesar cipher c. 50 B.C., which was created in order to prevent his secret messages from being read should a message fall into the wrong hands. However, for the most part protection was achieved through the application of procedural handling controls. Sensitive information was marked up to indicate that it should be protected and transported by trusted persons, guarded and stored in a secure environment or strong box. As postal services expanded, governments created official organizations to intercept, decipher, read, and reseal letters (e.g., the U.K.'s Secret Office, founded in 1653). In the mid-nineteenth century more complex classification systems were developed to allow governments to manage their information according to the degree of sensitivity. For example, the British Government codified this, to some extent, with the publication of the Official Secrets Act in 1889. Section 1 of the law concerned espionage and unlawful disclosures of information, while Section 2 dealt with breaches of official trust. A public interest defense was soon added to defend disclosures in the interest of the state. A similar law was passed in India in 1889, The Indian Official Secrets Act, which was associated with the British colonial era and used to crack down on newspapers that opposed the Raj's policies. A newer version was passed in 1923 that extended to all matters of confidential or secret information for governance. By the time of the First World War, multi-tier classification systems were used to communicate information to and from various fronts, which encouraged greater use of code making and breaking sections in diplomatic and military headquarters. Encoding became more sophisticated between the wars as machines were employed to scramble and unscramble information. The establishment of computer security inaugurated the history of information security. The need for such appeared during World War II. The volume of information shared by the Allied countries during the Second World War necessitated formal alignment of classification systems and procedural controls. An arcane range of markings evol

    Read more →
  • GOLOG

    GOLOG

    GOLOG is a high-level logic programming language for the specification and execution of complex actions in dynamical domains. It is based on the situation calculus. It is a first-order logical language for reasoning about action and change. GOLOG was developed at the University of Toronto. == History == The concept of situation calculus on which the GOLOG programming language is based was first proposed by John McCarthy in 1963. == Description == A GOLOG interpreter automatically maintains a direct characterization of the dynamic world being modeled, on the basis of user supplied axioms about preconditions, effects of actions and the initial state of the world. This allows the application to reason about the condition of the world and consider the impacts of different potential actions before focusing on a specific action. Golog is a logic programming language and is very different from conventional programming languages. A procedural programming language like C defines the execution of statements in advance. The programmer creates a subroutine which consists of statements, and the computer executes each statement in a linear order. In contrast, fifth-generation programming languages like Golog work with an abstract model with which the interpreter can generate the sequence of actions. The source code defines the problem and it is up to the solver to find the next action. This approach can facilitate the management of complex problems from the domain of robotics. A Golog program defines the state space in which the agent is allowed to operate. A path in the symbolic domain is found with state space search. To speed up the process, Golog programs are realized as hierarchical task networks. Apart from the original Golog language, there are some extensions available. The ConGolog language provides concurrency and interrupts. Other dialects like IndiGolog and Readylog were created for real time applications in which sensor readings are updated on the fly. == Uses == Golog has been used to model the behavior of autonomous agents. In addition to a logic-based action formalism for describing the environment and the effects of basic actions, they enable the construction of complex actions using typical programming language constructs. It is also used for applications in high level control of robots and industrial processes, virtual agents, discrete event simulation etc. It can be also used to develop Belief Desire Intention-style agent systems. == Planning and scripting == In contrast to the Planning Domain Definition Language, Golog supports planning and scripting as well. Planning means that a goal state in the world model is defined, and the solver brings a logical system into this state. Behavior scripting implements reactive procedures, which are running as a computer program. For example, suppose the idea is to authoring a story. The user defines what should be true at the end of the plot. A solver gets started and applies possible actions to the current situation until the goal state is reached. The specification of a goal state and the possible actions are realized in the logical world model. In contrast, a hardwired reactive behavior doesn't need a solver but the action sequence is provided in a scripting language. The Golog interpreter, which is written in Prolog, executes the script and this will bring the story into the goal state.

    Read more →
  • Classora

    Classora

    Classora is a knowledge base for the Internet oriented to data analysis. From a practical point of view, Classora is a digital repository that stores structured information and allows it to be displayed in multiple formats: analytically, graphically, geographically (through maps); as well as carry out OLAP analysis. The information contained in Classora comes from public sources and is uploaded into the system through bots and ETL processes. The Knowledge Base has a commercial API for semantic enhancement, and an open web through which any user can access to part of the information collected (it also allows users to complete data and share opinions). Internally, Classora is organized into Knowledge Units and Reports. A «Knowledge Unit» is any element of the World about which information may be stored and presented in the form of a data sheet (a person, a company, a country, etc.) A «Report» is a group of Knowledge Units: a ranking of companies, a sport classification table, a survey about people, etc. In fact, one of the technical capabilities of Classora is that it allows the comparison of reports and knowledge units gathered from different sources, thereby generating an added value for the media in which this information is published: digital media, interactive TV, etc. == Key definitions == === Knowledge unit === The units of knowledge (also known as entries) in Classora are data sheets that have a certain semantic equivalence with the articles on the Wikipedia: they store information about any element of the world, be it a film, a country, a company or an animal. However, they differ from Wikipedia in that Classora stores structured information, enriched with a metadata layer; and therefore it is able to automatically interpret the meaning of each unit of knowledge. === Data report === A report is a group of units of knowledge in which the repetition of elements is not allowed. This definition includes any list, poll, ranking, etc.; and, in general, any consultation that involves more than one unit of knowledge. Classora excels at the reports management due to its visualization capabilities, being able to display data in the form of tables, graphs and maps. Types of reports: Sports scores: Sports competitions results sanctioned by the competent institution. Rankings and lists: All types of interesting and curious lists, whether they have an implicit order or not. Polls: Units of knowledge that are ranked according to users’ votes. Queries to the Knowledge Base: Questions from users using CQL. Networks of connections: automatically calculated from the reports and the taxonomy of each Knowledge Unit. === Organizational taxonomy === An organizational taxonomy (also referred to as entry type) is a data sheet that brings together the common attributes of a set of units of knowledge. For instance, the organizational taxonomy F1 Driver displays attributes such as date of debut, team, etc.; and the organizational taxonomy Football Club presents attributes such as city, stadium, etc. In Classora, taxonomies are hierarchically organized, so that they inherit attributes from their parent taxonomies. For instance, F1 Driver is a subsidiary taxonomy of Sportsperson, which is a subsidiary taxonomy of Person, which in turn is a subsidiary taxonomy of Organism. The simplest type of entry in Classora is Classora Object. All the other taxonomies are its subsidiaries and inherit its attributes. In fact, the only attribute Classora Object possesses is name (all units of knowledge are required to have one name at least). == Architecture of Classora == === Data Extraction Module === The Data Extraction Module consists of a set of robots coordinated by software that also manages the potential incidents. Most of the information available in Classora is automatically uploaded through those robots, which connect to the main online public sources to gather all types of data. There are three categories of robots: Extraction robots: responsible for the massive uploading of reports from official public sources (FIFA, CIA, IMF, Eurostat...). They are used for either absolute or incremental data uploading. Data scanner robots: responsible for looking for and updating the data of a unit of knowledge. They use specific sources to perform this task: Wikipedia, IMDB, World Bank, etc. Content aggregators: they don’t connect to external sources. Instead, they generate new information using Classora’s internal database. === Participatory Module === In Classora’s Open Website, Internet users may participate providing their knowledge as they would on the Wikipedia. There are different ways to participate: adding or correcting data in the Knowledge Base, voting in surveys (participatory rankings) and creating new Knowledge Units and Data Reports. === Connectivity Module === The Knowledge Base is designed to be embedded in multi-platform, multi-channel systems, thus enabling its integration into mobile devices, tablets, interactive TV, etc. This integration may be carried out through specific plugins (for navigators or other devices) or an API REST that provides content in XML or JSON formats. The API is divided into three blocks of operations. The first one is the block of general utility tools (ranging from autosuggest components about geographical hierarchies to operations to obtain the list of today’s celebrity birthdays, using CQL). The second one is the block of operations for widget generation (graphs, maps, rankings) using information from the knowledge base. Finally, there is a block of operations designed for the publication of free-source content. == Project statistics == As of April 2012, 2,000,000 Knowledge Units, 15,000 Reports, around 10,000 Maps and several million potential Comparative Analyses had been added to Classora. According to the site of web metrics Alexa, Classora Open Website is ranked at 100,557 globally and at 2,880 in the Spanish traffic ranking. Users spend an average of 9 ½ minutes in Classora.

    Read more →
  • Death of Molly Russell

    Death of Molly Russell

    In November 2017, Molly Russell, a fourteen-year-old British schoolgirl from Harrow, London, was found dead in her bedroom by her parents. In an inquest, the coroner stated that she had died from an act of self-harm following depression and the results of social media consumption, including material on Instagram and Pinterest. She also had a Twitter account in which she documented her growing depression. == Life == Russell had been a pupil at Hatch End High School. At the inquest, the school's head teacher expressed shock that she was able to access distressing online content. Her parents stated that she had never shown any previous signs of struggle and was doing very well in school. It was revealed at the inquest that in the six months prior to her death, 2,100 of 16,300 pieces of content she had interacted with on Instagram were on topics such as self-harm, depression, and suicide. It was also noted that throughout her experience on social media, there were never any warning signs about the information she viewed on these platforms. == Subsequent events == Dr. Navin Venugopal, the child psychiatrist assigned to the case investigating her death, called the material she viewed "disturbing and distressing" and said he was unable to sleep well for weeks after viewing it. The coroner Andrew Walker concluded that Molly's death was "an act of self harm suffering from depression and the negative effects of online content". He issued a prevention of future deaths report regarding her death, in which he made a number of recommendations for operators of online platforms, including: separating platforms for adults and children age verification changes in policy on filtering of age-specific content adding features for parental supervision and control data retention of material viewed by children He suggested that this could be accomplished by either legislation or self-regulation. The lawyer representing her family at the inquest stated that the findings "captured all of the elements of why this material is so harmful." The case has been cited as a motivator for the passage of the Online Safety Act. A charity, the Molly Rose Foundation, was set up in her memory, with the goal of suicide prevention for young people. Meta and Pinterest are believed to have made substantial donations to the charity.

    Read more →