Tensor network

Tensor networks or tensor network states are a class of variational wave functions used in the study of many-body quantum systems and fluids. Tensor networks extend one-dimensional matrix product states to higher dimensions while preserving some of their useful mathematical properties. The wave function is encoded as a tensor contraction of a network of individual tensors. The structure of the individual tensors can impose global symmetries on the wave function (such as antisymmetry under exchange of fermions) or restrict the wave function to specific quantum numbers, like total charge, angular momentum, or spin. It is also possible to derive strict bounds on quantities like entanglement and correlation length using the mathematical structure of the tensor network. This has made tensor networks useful in theoretical studies of quantum information in many-body systems. They have also proved useful in variational studies of ground states, excited states, and dynamics of strongly correlated many-body systems. == Diagrammatic notation == In general, a tensor network diagram (Penrose diagram) can be viewed as a graph where nodes (or vertices) represent individual tensors, while edges represent summation over an index. Free indices are depicted as edges (or legs) attached to a single vertex only. Sometimes, there is also additional meaning to a node's shape. For instance, one can use trapezoids for unitary matrices or tensors with similar behaviour. This way, flipped trapezoids would be interpreted as complex conjugates to them. == History == Foundational research on tensor networks began in 1971 with a paper by Roger Penrose. In "Applications of negative dimensional tensors" Penrose developed tensor diagram notation, describing how the diagrammatic language of tensor networks could be used in applications in physics. In 1992, Steven R. White developed the density matrix renormalization group (DMRG) for quantum lattice systems. The DMRG was the first successful tensor network and associated algorithm. In 2002, Guifré Vidal and Reinhard Werner attempted to quantify entanglement, laying the groundwork for quantum resource theories. This was also the first description of the use of tensor networks as mathematical tools for describing quantum systems. In 2004, Frank Verstraete and Ignacio Cirac developed the theory of matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems. In 2006, Vidal developed the multi-scale entanglement renormalization ansatz (MERA). In 2007 he developed entanglement renormalization for quantum lattice systems. In 2010, Ulrich Schollwock developed the density-matrix renormalization group for the simulation of one-dimensional strongly correlated quantum lattice systems. In 2014, Román Orús introduced tensor networks for complex quantum systems and machine learning, as well as tensor network theories of symmetries, fermions, entanglement and holography. == Connection to machine learning == Tensor networks have been adapted for supervised learning, taking advantage of similar mathematical structure in variational studies in quantum mechanics and large-scale machine learning. This crossover has spurred collaboration between researchers in artificial intelligence and quantum information science. In June 2019, Google, the Perimeter Institute for Theoretical Physics, and X (company), released TensorNetwork, an open-source library for efficient tensor calculations. The main interest in tensor networks and their study from the perspective of machine learning is to reduce the number of trainable parameters (in a layer) by approximating a high-order tensor with a network of lower-order ones. Using the so-called tensor train technique (TT), one can reduce an N-order tensor (containing exponentially many trainable parameters) to a chain of N tensors of order 2 or 3, which gives us a polynomial number of parameters.

Quantum machine learning

Quantum machine learning (QML) is the study of quantum algorithms for machine learning. It often refers to quantum algorithms for machine learning tasks which analyze classical data, sometimes called quantum-enhanced machine learning. QML algorithms use qubits and quantum operations to try to improve the space and time complexity of classical machine learning algorithms. Hybrid QML methods involve both classical and quantum processing, where computationally difficult subroutines are outsourced to a quantum device. These routines can be more complex in nature and executed faster on a quantum computer. Furthermore, quantum algorithms can be used to analyze quantum states instead of classical data. The term "quantum machine learning" is sometimes used to refer classical machine learning methods applied to data generated from quantum experiments (i.e. machine learning of quantum systems), such as learning the phase transitions of a quantum system or creating new quantum experiments. QML also extends to a branch of research that explores methodological and structural similarities between certain physical systems and learning systems, in particular neural networks. For example, some mathematical and numerical techniques from quantum physics are applicable to classical deep learning and vice versa. Furthermore, researchers investigate more abstract notions of learning theory with respect to quantum information, sometimes referred to as "quantum learning theory". == Machine learning with quantum computers == Quantum-enhanced machine learning refers to quantum algorithms that solve tasks in machine learning, thereby improving and often expediting classical machine learning techniques. Such algorithms typically require one to encode the given classical data set into a quantum computer to make it accessible for quantum information processing. Subsequently, quantum information processing routines are applied and the result of the quantum computation is read out by measuring the quantum system. For example, the outcome of the measurement of a qubit reveals the result of a binary classification task. While many proposals of QML algorithms are still purely theoretical and require a full-scale universal quantum computer to be tested, others have been implemented on small-scale or special purpose quantum devices. === Quantum associative memories and quantum pattern recognition === Early work on quantum associative memories has been done by Dan Ventura and Tony Martinez and by Carlo A. Trugenberger in the late 1990s and early 2000s. Associative (or content-addressable) memories are able to recognize stored content on the basis of a similarity measure, while random access memories are accessed by the address of stored information and not its content. As such they must be able to retrieve both incomplete and corrupted patterns, the essential machine learning task of pattern recognition. Typical classical associative memories store p patterns in the O ( n 2 ) {\displaystyle O(n^{2})} interactions (synapses) of a real, symmetric energy matrix over a network of n artificial neurons. The encoding is such that the desired patterns are local minima of the energy functional and retrieval is done by minimizing the total energy, starting from an initial configuration. Unfortunately, classical associative memories are severely limited by the phenomenon of cross-talk. When too many patterns are stored, spurious memories appear which quickly proliferate, so that the energy landscape becomes disordered and no retrieval is anymore possible. The number of storable patterns is typically limited by a linear function of the number of neurons, p ≤ O ( n ) {\displaystyle p\leq O(n)} . Quantum associative memories (in their simplest realization) store patterns in a unitary matrix U acting on the Hilbert space of n qubits. Retrieval is realized by the unitary evolution of a fixed initial state to a quantum superposition of the desired patterns with probability distribution peaked on the most similar pattern to an input. By its very quantum nature, the retrieval process is thus probabilistic. Because quantum associative memories are free from cross-talk, however, spurious memories are never generated. Correspondingly, they have a superior capacity than classical ones. The number of parameters in the unitary matrix U is O ( p n ) {\displaystyle O(pn)} . One can thus have efficient, spurious-memory-free quantum associative memories for any polynomial number of patterns. If the matrix U is encoded as a unique operator (as opposed as to a sequence of gates as in the circuit model), e.g. by an optical interferometer, the retrieval becomes efficient even for an exponential number of patterns. === Linear algebra simulation with quantum amplitudes === A number of quantum algorithms for machine learning are based on the idea of amplitude encoding, that is, to associate the amplitudes of a quantum state with the inputs and outputs of computations. Since a state of n {\displaystyle n} qubits is described by 2 n {\displaystyle 2^{n}} complex amplitudes, this information encoding can allow for an exponentially compact representation. Intuitively, this corresponds to associating a discrete probability distribution over binary random variables with a classical vector. The goal of algorithms based on amplitude encoding is to formulate quantum algorithms whose resources grow polynomially in the number of qubits n {\displaystyle n} , which amounts to a logarithmic time complexity in the number of amplitudes and thereby the dimension of the input. Many QML algorithms in this category are based on variations of the quantum algorithm for linear systems of equations (colloquially called HHL, after the paper's authors) which, under specific conditions, performs a matrix inversion using an amount of physical resources growing only logarithmically in the dimensions of the matrix. One of these conditions is that a Hamiltonian which entry-wise corresponds to the matrix can be simulated efficiently, which is known to be possible if the matrix is sparse or low rank. For reference, any known classical algorithm for matrix inversion requires a number of operations that grows more than quadratically in the dimension of the matrix (e.g. O ( n 2.373 ) {\displaystyle O{\mathord {\left(n^{2.373}\right)}}} ), but they are not restricted to sparse matrices. Quantum matrix inversion can be applied to machine learning methods in which the training reduces to solving a linear system of equations, for example in least-squares linear regression, the least-squares version of support vector machines, and Gaussian processes. A crucial bottleneck of methods that simulate linear algebra computations with the amplitudes of quantum states is state preparation, which often requires one to initialise a quantum system in a state whose amplitudes reflect the features of the entire dataset. Although efficient methods for state preparation are known for specific cases, this step easily hides the complexity of the task. === Variational quantum algorithms (VQAs) === In a variational quantum algorithm, a classical computer optimizes the parameters used to prepare a quantum state, while a quantum computer is used to do the actual state preparation and measurement. VQAs are considered promising candidates for noisy intermediate-scale quantum computers. Variational quantum circuits (or parameterized quantum circuits) are a popular class of VQAs where the parameters are those used in a fixed quantum circuit. Researchers have studied VQCs to solve optimization problems and find the ground state energy of complex quantum systems, which were difficult to solve using a classical computer. === Quantum binary classifier === Pattern reorganization is one of the important tasks of machine learning, binary classification is one of the tools or algorithms to find patterns. Binary classification is used in supervised learning and in unsupervised learning. In QML, classical bits are converted to qubits and they are mapped to Hilbert space; complex value data are used in a quantum binary classifier to use the advantage of Hilbert space. By exploiting the quantum mechanic properties such as superposition, entanglement, interference the quantum binary classifier produces the accurate result in short period of time. === Quantum machine learning algorithms based on Grover search === Another approach to improving classical machine learning with quantum information processing uses amplitude amplification methods based on Grover's search algorithm, which has been shown to solve unstructured search problems with a quadratic speedup compared to classical algorithms. These quantum routines can be employed for learning algorithms that translate into an unstructured search task, as can be done, for instance, in the case of the k-medians and the k-nearest neighbors algorithms. Other applications include quadratic speedups in the training of perceptrons. An e

Time-lock puzzle

A time-lock puzzle, or time-released cryptography, encrypts a message that cannot be decrypted until a specified amount of time has passed. The concept was first described by Timothy C. May, and a solution first introduced by Ron Rivest, Adi Shamir, and David A. Wagner in 1996. Time-lock puzzle are useful in cases where confidentiality of information is determined by time, such as a diarist who does not want their views released until 50 years after their death, an auction where bids are sealed until the bidding period is closed, electronic voting, and contract signing. They can additionally be used in creating further cryptographic primitives, such as verifiable delay functions and zero knowledge proofs. Time-released cryptography can be achieved through several different mechanisms. Use mathematical problems requiring sequential calculations to solve, and cannot be solved with parallelization. Thus, adding more computers to a problem will not help solve the problem faster. Use of a trusted agent, or multiple agents who each hold a part of the message and cryptographic keys, who release the message after a specified time period has passed. Distribute public encryption keys to users, and place private cryptographic keys with a trusted agent in an offline location, to be released at a later date.

Control-flow diagram

A control-flow diagram (CFD) is a diagram to describe the control flow of a business process, process or review. Control-flow diagrams were developed in the 1950s, and are widely used in multiple engineering disciplines. They are one of the classic business process modeling methodologies, along with flow charts, drakon-charts, data flow diagrams, functional flow block diagram, Gantt charts, PERT diagrams, and IDEF. == Overview == A control-flow diagram can consist of a subdivision to show sequential steps, with if-then-else conditions, repetition, and/or case conditions. Suitably annotated geometrical figures are used to represent operations, data, or equipment, and arrows are used to indicate the sequential flow from one to another. There are several types of control-flow diagrams, for example: Change-control-flow diagram, used in project management Configuration-decision control-flow diagram, used in configuration management Process-control-flow diagram, used in process management Quality-control-flow diagram, used in quality control. In software and systems development, control-flow diagrams can be used in control-flow analysis, data-flow analysis, algorithm analysis, and simulation. Control and data are most applicable for real time and data-driven systems. These flow analyses transform logic and data requirements text into graphic flows which are easier to analyze than the text. PERT, state transition, and transaction diagrams are examples of control-flow diagrams. == Types of control-flow diagrams == === Process-control-flow diagram === A flow diagram can be developed for the process [control system] for each critical activity. Process control is normally a closed cycle in which a sensor. The application determines if the sensor information is within the predetermined (or calculated) data parameters and constraints. The results of this comparison, which controls the critical component. This [feedback] may control the component electronically or may indicate the need for a manual action. This closed-cycle process has many checks and balances to ensure that it stays safe. It may be fully computer controlled and automated, or it may be a hybrid in which only the sensor is automated and the action requires manual intervention. Further, some process control systems may use prior generations of hardware and software, while others are state of the art. === Performance-seeking control-flow diagram === The figure presents an example of a performance-seeking control-flow diagram of the algorithm. The control law consists of estimation, modeling, and optimization processes. In the Kalman filter estimator, the inputs, outputs, and residuals were recorded. At the compact propulsion-system-modeling stage, all the estimated inlet and engine parameters were recorded. In addition to temperatures, pressures, and control positions, such estimated parameters as stall margins, thrust, and drag components were recorded. In the optimization phase, the operating-condition constraints, optimal solution, and linear-programming health-status condition codes were recorded. Finally, the actual commands that were sent to the engine through the DEEC were recorded.

Visual cryptography

Visual cryptography is a cryptographic technique which allows visual information (pictures, text, etc.) to be encrypted in such a way that the decrypted information appears as a visual image. One of the best-known techniques has been credited to Moni Naor and Adi Shamir, who developed it in 1994. They demonstrated a visual secret sharing scheme, where a binary image was broken up into n shares so that only someone with all n shares could decrypt the image, while any n − 1 shares revealed no information about the original image. Each share was printed on a separate transparency, and decryption was performed by overlaying the shares. When all n shares were overlaid, the original image would appear. There are several generalizations of the basic scheme including k-out-of-n visual cryptography, and using opaque sheets but illuminating them by multiple sets of identical illumination patterns under the recording of only one single-pixel detector, which exposed the image. Using a similar idea, transparencies can be used to implement a one-time pad encryption, where one transparency is a shared random pad, and another transparency acts as the ciphertext. Normally, there is an expansion of space requirement in visual cryptography. But if one of the two shares is structured recursively, the efficiency of visual cryptography can be increased to 100%. Some antecedents of visual cryptography are in patents from the 1960s. Other antecedents are in the work on perception and secure communication. Visual cryptography can be used to protect biometric templates in which decryption does not require any complex computations. == Example == In this example, the binary image has been split into two component images. Each component image has a pair of pixels for every pixel in the original image. These pixel pairs are shaded black or white according to the following rule: if the original image pixel was black, the pixel pairs in the component images must be complementary; randomly shade one ■□, and the other □■. When these complementary pairs are overlapped, they will appear dark gray. On the other hand, if the original image pixel was white, the pixel pairs in the component images must match: both ■□ or both □■. When these matching pairs are overlapped, they will appear light gray. So, when the two component images are superimposed, the original image appears. However, without the other component, a component image reveals no information about the original image; it is indistinguishable from a random pattern of ■□ / □■ pairs. Moreover, if you have one component image, you can use the shading rules above to produce a counterfeit component image that combines with it to produce any image at all. == (2, n) visual cryptography sharing case == Sharing a secret with an arbitrary number of people, n, such that at least 2 of them are required to decode the secret is one form of the visual secret sharing scheme presented by Moni Naor and Adi Shamir in 1994. In this scheme we have a secret image which is encoded into n shares printed on transparencies. The shares appear random and contain no decipherable information about the underlying secret image, however if any 2 of the shares are stacked on top of one another the secret image becomes decipherable by the human eye. Every pixel from the secret image is encoded into multiple subpixels in each share image using a matrix to determine the color of the pixels. In the (2, n) case, a white pixel in the secret image is encoded using a matrix from the following set, where each row gives the subpixel pattern for one of the components: {all permutations of the columns of} : C 0 = [ 1 0 . . . 0 1 0 . . . 0 . . . 1 0 . . . 0 ] . {\displaystyle \mathbf {C_{0}=} {\begin{bmatrix}1&0&...&0\\1&0&...&0\\...\\1&0&...&0\end{bmatrix}}.} While a black pixel in the secret image is encoded using a matrix from the following set: {all permutations of the columns of} : C 1 = [ 1 0 . . . 0 0 1 . . . 0 . . . 0 0 . . . 1 ] . {\displaystyle \mathbf {C_{1}=} {\begin{bmatrix}1&0&...&0\\0&1&...&0\\...\\0&0&...&1\end{bmatrix}}.} For instance in the (2,2) sharing case (the secret is split into 2 shares and both shares are required to decode the secret) we use complementary matrices to share a black pixel and identical matrices to share a white pixel. Stacking the shares we have all the subpixels associated with the black pixel now black while 50% of the subpixels associated with the white pixel remain white. == Cheating the (2, n) visual secret sharing scheme == Horng et al. proposed a method that allows n − 1 colluding parties to cheat an honest party in visual cryptography. They take advantage of knowing the underlying distribution of the pixels in the shares to create new shares that combine with existing shares to form a new secret message of the cheaters choosing. We know that 2 shares are enough to decode the secret image using the human visual system. But examining two shares also gives some information about the 3rd share. For instance, colluding participants may examine their shares to determine when they both have black pixels and use that information to determine that another participant will also have a black pixel in that location. Knowing where black pixels exist in another party's share allows them to create a new share that will combine with the predicted share to form a new secret message. In this way a set of colluding parties that have enough shares to access the secret code can cheat other honest parties. == Visual steganography == 2×2 subpixels can also encode a binary image in each component image. For example, each white pixel of each component image could be represented by two black subpixels, while each black pixel represented by three black subpixels. When overlaid, each white pixel of the secret image is represented by three black subpixels, while each black pixel is represented by all four subpixels black. Each corresponding pixel in the component images is randomly rotated to avoid orientation leaking information about the secret image. == In popular culture == In "Do Not Forsake Me Oh My Darling", a 1967 episode of TV series The Prisoner, the protagonist uses a visual cryptography overlay of multiple transparencies to reveal a secret message – the location of a scientist friend who had gone into hiding.

Prompt engineering

Prompt engineering is the process of structuring natural language inputs (known as prompts) to produce specified outputs from a generative artificial intelligence (GenAI) model. Context engineering is the related area of software engineering that focuses on the management of non-prompt contexts supplied to the GenAI model, such as metadata, API tools, and tokens. It can also be defined as the practice of designing and refining input instructions given to a generative AI model to produce more accurate, relevant, or useful outputs. Effective prompt engineering involves understanding how a model interprets language, and may include techniques such as few-shot prompting, chain-of-thought prompting, and role assignment. It is increasingly considered a skill for working with large language models (LLMs) in both research and professional contexts. During the 2020s AI boom, prompt engineering became regarded as a business capability across corporations and industries. Employees with the title prompt engineer were hired to create prompts that would increase productivity and efficacy, although the individual title has since lost traction amid AI models that produce better prompts than humans and corporate training in prompting for general employees. Common prompting techniques include multi-shot, chain-of-thought, and tree-of-thought prompting, as well as the use of assigning roles to the model. Automated prompt generation methods, such as retrieval-augmented generation (RAG), provide for greater accuracy and a wider scope of functions for prompt engineers. Prompt injection is a type of cybersecurity attack that targets machine learning models through malicious prompts. == Terminology == The Oxford English Dictionary defines prompt engineering as "The action or process of formulating and refining prompts for an artificial intelligence program, algorithm, etc., in order to optimize its output or to achieve a desired outcome; the discipline or profession concerned with this." In 2023, prompt ("an instruction given to an artificial intelligence program, algorithm, etc., which determines or influences the content it generates") was the runner-up to Oxford's word of the year. === Prompt === A prompt is some natural language text that describes and prescribes the task that an artificial intelligence (AI) should perform. A prompt for a text-to-text language model can be a query, a command, or a longer statement referencing context, instructions, and conversation history. The process of prompt engineering may involve designing clear queries, refining wording, providing relevant context, specifying the style of output, and assigning a character for the AI to mimic in order to guide the model toward more accurate, useful, and consistent responses. When communicating with a text-to-image or a text-to-audio model, a typical prompt contains a description of a desired output such as "a high-quality photo of an astronaut riding a horse" or "Lo-fi slow BPM electro chill with organic samples". Prompt engineering may be applied to text-to-image models to achieve a desired subject, style, layout, lighting, and aesthetic. === Techniques === Common terms used to describe various specific prompt engineering techniques include chain-of-thought, tree-of-thought, and retrieval-augmented generation (RAG). A 2024 survey of the field identified over 50 distinct text-based prompting techniques, 40 multimodal variants, and a vocabulary of 33 terms used across prompting research, highlighting a present lack of standardised terminology for prompt engineering. Vibe coding is an AI-assisted software development method where a user prompts an LLM with a description of what they want and lets it generate or edit the code. In 2025, "vibe coding" was the Collins Dictionary word of the year. === Context engineering === Context engineering is a related process that focuses on the context elements that accompany user prompts, which include system instructions, retrieved knowledge, tool definitions, conversation summaries, and task metadata. Context engineering is performed to improve reliability, provenance and token efficiency in production LLM systems. The concept emphasises operational practices such as token budgeting, provenance tags, versioning of context artifacts, observability (logging which context was supplied), and context regression tests to ensure that changes to supplied context do not silently alter system behaviour. == Rationale == Research has found that the performance of large language models (LLMs) is highly sensitive to choices such as the ordering of examples, the quality of demonstration labels, and even small variations in phrasing. In some cases, reordering examples in a prompt produced accuracy shifts of more than 40 percent. === In-context learning === A model's ability to temporarily learn from prompts is known as in-context learning. In-context learning is an emergent ability of large language models. It is an emergent property of model scale, meaning that breaks in scaling laws occur, leading to its efficacy increasing at a different rate in larger models than in smaller models. Unlike training and fine-tuning, which produce lasting changes, in-context learning is temporary. Training models to perform in-context learning can be viewed as a form of meta-learning, or "learning to learn". === Prompting to estimate model sensitivity === Research consistently demonstrates that LLMs are highly sensitive to subtle variations in prompt formatting, structure, and linguistic properties. Some studies have shown up to 76 accuracy points across formatting changes in few-shot settings. Linguistic features significantly influence prompt effectiveness—such as morphology, syntax, and lexico-semantic changes—which meaningfully enhance task performance across a variety of tasks. Clausal syntax, for example, improves consistency and reduces uncertainty in knowledge retrieval. This sensitivity persists even with larger model sizes, additional few-shot examples, or instruction tuning. To address sensitivity of models and make them more robust, several evaluative methods have been proposed. FormatSpread facilitates systematic analysis by evaluating a range of plausible prompt formats, offering a more comprehensive performance interval. Similarly, PromptEval estimates performance distributions across diverse prompts, enabling robust metrics such as performance quantiles and accurate evaluations under constrained budgets. == Prompting techniques == === Multi-shot === A prompt may include a few examples for a model to learn from in context, an approach called few-shot learning. For example, the prompt may ask the model to complete "maison → house, chat → cat, chien →", with the expected response being dog. === Chain-of-thought === Chain-of-thought (CoT) prompting is a technique that allows large language models (LLMs) to solve a problem as a series of intermediate steps before giving a final answer. In 2022, Google Brain reported that chain-of-thought prompting improves reasoning ability by inducing the model to answer a multi-step problem with steps of reasoning that mimic a train of thought. Chain-of-thought techniques were developed to help LLMs handle multi-step reasoning tasks, such as arithmetic or commonsense reasoning questions. When applied to PaLM, a 540 billion parameter language model, according to Google, CoT prompting significantly aided the model, allowing it to perform comparably with task-specific fine-tuned models on several tasks, achieving state-of-the-art results at the time on the GSM8K mathematical reasoning benchmark. It is possible to fine-tune models on CoT reasoning datasets to enhance this capability further and stimulate better interpretability. As originally proposed by Google, each CoT prompt is accompanied by a set of input/output examples—called exemplars—to demonstrate the desired model output, making it a few-shot prompting technique. However, according to a later paper from researchers at Google and the University of Tokyo, simply appending the words "Let's think step-by-step" was also effective, which allowed for CoT to be employed as a zero-shot technique. ==== Self-consistency ==== Self-consistency performs several chain-of-thought rollouts, then selects the most commonly reached conclusion out of all the rollouts. === Tree-of-thought === Tree-of-thought prompting generalizes chain-of-thought by generating multiple lines of reasoning in parallel, with the ability to backtrack or explore other paths. It can use tree search algorithms like breadth-first, depth-first, or beam. === Text-to-image prompting === In 2022, text-to-image models like DALL-E 2, Stable Diffusion, and Midjourney were released to the public. These models take text prompts as input and use them to generate images. Early text-to-image models typically do not understand negation, grammar and sentence structure in the same way as large language models, and may thus requi

Honey encryption

Honey encryption is a type of data encryption that "produces a ciphertext, which, when decrypted with an incorrect key as guessed by the attacker, presents a plausible-looking yet incorrect plaintext." == Creators == Ari Juels and Thomas Ristenpart of the University of Wisconsin, the developers of the encryption system, presented a paper on honey encryption at the 2014 Eurocrypt cryptography conference. == Method of protection == A brute-force attack involves repeated decryption with random keys; this is equivalent to picking random plaintexts from the space of all possible plaintexts with a uniform distribution. This is effective because even though the attacker is equally likely to see any given plaintext, most plaintexts are extremely unlikely to be legitimate i.e. the distribution of legitimate plaintexts is non-uniform. Honey encryption defeats such attacks by first transforming the plaintext into a space such that the distribution of legitimate plaintexts is uniform. Thus an attacker guessing keys will see legitimate-looking plaintexts frequently and random-looking plaintexts infrequently. This makes it difficult to determine when the correct key has been guessed. In effect, honey encryption "[serves] up fake data in response to every incorrect guess of the password or encryption key." The security of honey encryption relies on the fact that the probability of an attacker judging a plaintext to be legitimate can be calculated (by the encrypting party) at the time of encryption. This makes honey encryption difficult to apply in certain applications e.g. where the space of plaintexts is very large or the distribution of plaintexts is unknown. It also means that honey encryption can be vulnerable to brute-force attacks if this probability is miscalculated. For example, it is vulnerable to known-plaintext attacks: if the attacker has a crib that a plaintext must match to be legitimate, they will be able to brute-force even Honey Encrypted data if the encryption did not take the crib into account. == Example == An encrypted credit card number is susceptible to brute-force attacks because not every string of digits is equally likely. The number of digits can range from 13 to 19, though 16 is the most common. Additionally, it must have a valid IIN and the last digit must match the checksum. An attacker can also take into account the popularity of various services: an IIN from MasterCard is probably more likely than an IIN from Diners Club Carte Blanche. Honey encryption can protect against these attacks by first mapping credit card numbers to a larger space where they match their likelihood of legitimacy. Numbers with invalid IINs and checksums are not mapped at all (i.e. have probability 0 of legitimacy). Numbers from large brands like MasterCard and Visa map to large regions of this space, while less popular brands map to smaller regions, etc. An attacker brute-forcing such an encryption scheme would only see legitimate-looking credit card numbers when they brute-force, and the numbers would appear with the frequency the attacker would expect from the real world. == Application == Juels and Ristenpart aim to use honey encryption to protect data stored on password manager services. Juels stated that "password managers are a tasty target for criminals," and worries that "if criminals get a hold of a large collection of encrypted password vaults they could probably unlock many of them without too much trouble." Hristo Bojinov, CEO and founder of Anfacto, noted that "Honey Encryption could help reduce their vulnerability. But he notes that not every type of data will be easy to protect this way. … Not all authentication or encryption system yield themselves to being honeyed."