AI Avatar Kiosk


  • Weak supervision

    Weak supervision

    Weak supervision (also known as semi-supervised learning) is a paradigm in machine learning whose relevance and notability increased with the advent of large language models, owing to the large amount of data required to train them. It is characterized by combining a small amount of human-labeled data (the only kind used in the more expensive and time-consuming supervised learning paradigm) with a large amount of unlabeled data (the only kind used in the unsupervised learning paradigm). In other words, the desired output values are provided only for a subset of the training data. The remaining data is unlabeled or imprecisely labeled. Intuitively, the learning problem can be seen as an exam, and the labeled data as sample problems that the teacher solves for the class as an aid in solving another set of problems. In the transductive setting, these unsolved problems act as exam questions. In the inductive setting, they become practice problems of the sort that will make up the exam. == Problem == The acquisition of labeled data for a learning problem often requires a skilled human agent (e.g. to transcribe an audio segment) or a physical experiment (e.g. determining the 3D structure of a protein or determining whether there is oil at a particular location). The cost associated with the labeling process may thus render large, fully labeled training sets infeasible, whereas acquisition of unlabeled data is relatively inexpensive. In such situations, semi-supervised learning can be of great practical value. Semi-supervised learning is also of theoretical interest in machine learning and as a model for human learning.
== Technique == More formally, semi-supervised learning assumes that a set of $l$ independently identically distributed examples $x_{1},\dots,x_{l}\in X$ with corresponding labels $y_{1},\dots,y_{l}\in Y$ and $u$ unlabeled examples $x_{l+1},\dots,x_{l+u}\in X$ are processed. Semi-supervised learning combines this information to surpass the classification performance that can be obtained either by discarding the unlabeled data and doing supervised learning or by discarding the labels and doing unsupervised learning. Semi-supervised learning may refer to either transductive learning or inductive learning. The goal of transductive learning is to infer the correct labels for the given unlabeled data $x_{l+1},\dots,x_{l+u}$ only. The goal of inductive learning is to infer the correct mapping from $X$ to $Y$. It is unnecessary (and, according to Vapnik's principle, imprudent) to perform transductive learning by way of inferring a classification rule over the entire input space; however, in practice, algorithms formally designed for transduction or induction are often used interchangeably. == Assumptions == In order to make any use of unlabeled data, some relationship to the underlying distribution of data must exist. Semi-supervised learning algorithms make use of at least one of the following assumptions: === Continuity / smoothness assumption === Points that are close to each other are more likely to share a label. This is also generally assumed in supervised learning and yields a preference for geometrically simple decision boundaries. In the case of semi-supervised learning, the smoothness assumption additionally yields a preference for decision boundaries in low-density regions, so that few points are close to each other but in different classes.
=== Cluster assumption === The data tend to form discrete clusters, and points in the same cluster are more likely to share a label (although data that shares a label may spread across multiple clusters). This is a special case of the smoothness assumption and gives rise to feature learning with clustering algorithms. === Manifold assumption === The data lie approximately on a manifold of much lower dimension than the input space. In this case learning the manifold using both the labeled and unlabeled data can avoid the curse of dimensionality. Then learning can proceed using distances and densities defined on the manifold. The manifold assumption is practical when high-dimensional data are generated by some process that may be hard to model directly, but which has only a few degrees of freedom. For instance, human voice is controlled by a few vocal folds, and images of various facial expressions are controlled by a few muscles. In these cases, it is better to consider distances and smoothness in the natural space of the generating problem, rather than in the space of all possible acoustic waves or images, respectively. == History == The heuristic approach of self-training (also known as self-learning or self-labeling) is historically the oldest approach to semi-supervised learning, with examples of applications starting in the 1960s. The transductive learning framework was formally introduced by Vladimir Vapnik in the 1970s. Interest in inductive learning using generative models also began in the 1970s. A probably approximately correct learning bound for semi-supervised learning of a Gaussian mixture was demonstrated by Ratsaby and Venkatesh in 1995. == Methods == === Generative models === Generative approaches to statistical learning first seek to estimate p ( x | y ) {\displaystyle p(x|y)} , the distribution of data points belonging to each class. 
The probability $p(y|x)$ that a given point $x$ has label $y$ is then proportional to $p(x|y)p(y)$ by Bayes' rule. Semi-supervised learning with generative models can be viewed either as an extension of supervised learning (classification plus information about $p(x)$) or as an extension of unsupervised learning (clustering plus some labels). Generative models assume that the distributions take some particular form $p(x|y,\theta)$ parameterized by the vector $\theta$. If these assumptions are incorrect, the unlabeled data may actually decrease the accuracy of the solution relative to what would have been obtained from labeled data alone. However, if the assumptions are correct, then the unlabeled data necessarily improves performance. The unlabeled data are distributed according to a mixture of individual-class distributions. In order to learn the mixture distribution from the unlabeled data, it must be identifiable, that is, different parameters must yield different summed distributions. Gaussian mixture distributions are identifiable and commonly used for generative models. The parameterized joint distribution can be written as $p(x,y|\theta)=p(y|\theta)\,p(x|y,\theta)$ by using the chain rule. Each parameter vector $\theta$ is associated with a decision function $f_{\theta}(x)=\underset{y}{\operatorname{argmax}}\ p(y|x,\theta)$.
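As an illustration of the decision function above, the following is a minimal sketch that assumes one-dimensional Gaussian class-conditional distributions. The names `gaussian_pdf`, `posterior`, `decide` and the example parameter values in `theta` are hypothetical and chosen only for illustration; they do not come from the text.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used here as the assumed form of p(x | y, theta)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x, theta):
    """Return p(y | x, theta) for every label y via Bayes' rule.

    theta maps each label to a tuple (prior, mean, std); this layout is an
    assumption of the sketch.
    """
    joint = {y: prior * gaussian_pdf(x, mean, std)
             for y, (prior, mean, std) in theta.items()}
    evidence = sum(joint.values())  # p(x | theta), the mixture density
    return {y: p / evidence for y, p in joint.items()}


def decide(x, theta):
    """Decision function f_theta(x) = argmax_y p(y | x, theta)."""
    post = posterior(x, theta)
    return max(post, key=post.get)


# Hypothetical two-class parameters: equal priors, unit variance.
theta = {0: (0.5, -1.0, 1.0), 1: (0.5, 2.0, 1.0)}
print(decide(-1.0, theta), decide(2.0, theta))
```

With equal priors and equal variances the decision boundary sits midway between the two means, so points near each mean are assigned to that mean's class.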
The parameter is then chosen based on fit to both the labeled and unlabeled data, weighted by $\lambda$: $\underset{\Theta}{\operatorname{argmax}}\left(\log p(\{x_{i},y_{i}\}_{i=1}^{l}|\theta)+\lambda\log p(\{x_{i}\}_{i=l+1}^{l+u}|\theta)\right)$ === Low-density separation === Another major class of methods attempts to place boundaries in regions with few data points (labeled or unlabeled). One of the most commonly used algorithms is the transductive support vector machine, or TSVM (which, despite its name, may be used for inductive learning as well). Whereas support vector machines for supervised learning seek a decision boundary with maximal margin over the labeled data, the goal of TSVM is a labeling of the unlabeled data such that the decision boundary has maximal margin over all of the data. In addition to the standard hinge loss $(1-yf(x))_{+}$ for labeled data, a loss function $(1-|f(x)|)_{+}$ is introduced over the unlabeled data by letting $y=\operatorname{sign}f(x)$. TSVM then selects $f^{*}(x)=h^{*}(x)+b$ from a reproducing kernel Hilbert space $\mathcal{H}$ by minimizing the regularized empirical risk: $f^{*}=\underset{f}{\operatorname{argmin}}\left(\sum_{i=1}^{l}(1-y_{i}f(x_{i}))_{+}+\lambda_{1}\|h\|_{\mathcal{H}}^{2}+\lambda_{2}\sum_{i=l+1}^{l+u}(1-|f(x_{i})|)_{+}\right)$ An exact solution is intractable due to the non-convex term $(1-|f(x)|)_{+}$ …
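The TSVM objective described above can be evaluated directly for a candidate function. The sketch below assumes a linear $f(x)=w\cdot x+b$, so that $\|h\|^{2}$ reduces to $\|w\|^{2}$; it only computes the regularized empirical risk and does not attempt the (non-convex) minimization. The function names and the example data are hypothetical.

```python
def hinge(z):
    """The (1 - z)_+ term used by both parts of the objective."""
    return max(0.0, 1.0 - z)


def f(w, b, x):
    """Linear decision function f(x) = w . x + b (an assumption of this sketch)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def tsvm_objective(w, b, labeled, unlabeled, lam1, lam2):
    """Regularized empirical risk of a linear TSVM candidate.

    labeled   : list of (x, y) pairs with y in {-1, +1}
    unlabeled : list of x
    """
    labeled_loss = sum(hinge(y * f(w, b, x)) for x, y in labeled)
    regularizer = lam1 * sum(wi * wi for wi in w)
    # Unlabeled points are penalized when they fall inside the margin,
    # whichever side of the boundary they are on.
    unlabeled_loss = lam2 * sum(hinge(abs(f(w, b, x))) for x in unlabeled)
    return labeled_loss + regularizer + unlabeled_loss


# Hypothetical toy data in one dimension.
labeled = [((2.0,), 1), ((-2.0,), -1)]
unlabeled = [(0.1,), (3.0,)]
value = tsvm_objective((1.0,), 0.0, labeled, unlabeled, lam1=0.1, lam2=0.5)
print(value)
```

In this toy case both labeled points lie outside the margin, so the objective consists of the regularizer plus the penalty for the single unlabeled point at 0.1, which sits close to the boundary.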

  • Forking lemma

    Forking lemma

    The forking lemma is any of a number of related lemmas in cryptography research. The lemma states that if an adversary (typically a probabilistic Turing machine), on inputs drawn from some distribution, produces an output that has some property with non-negligible probability, then with non-negligible probability, if the adversary is re-run on new inputs but with the same random tape, its second output will also have the property. This concept was first used by David Pointcheval and Jacques Stern in "Security proofs for signature schemes," published in the proceedings of Eurocrypt 1996. In their paper, the forking lemma is specified in terms of an adversary that attacks a digital signature scheme instantiated in the random oracle model. They show that if an adversary can forge a signature with non-negligible probability, then there is a non-negligible probability that the same adversary with the same random tape can create a second forgery in an attack with a different random oracle. The forking lemma was later generalized by Mihir Bellare and Gregory Neven. The forking lemma has been used and further generalized to prove the security of a variety of digital signature schemes and other random-oracle based cryptographic constructions. == Statement of the lemma == The generalized version of the lemma is stated as follows. Let A be a probabilistic algorithm, with inputs (x, h1, ..., hq; r) that outputs a pair (J, y), where r refers to the random tape of A (that is, the random choices A will make). Suppose further that IG is a probability distribution from which x is drawn, and that H is a set of size h from which each of the hi values are drawn according to the uniform distribution. Let acc be the probability that on inputs distributed as described, the J output by A is greater than or equal to 1. We can then define a "forking algorithm" FA that proceeds as follows, on input x: Pick a random tape r for A. Pick h1, ..., hq uniformly from H. 
Run A on input (x, h1, ..., hq; r) to produce (J, y). If J = 0, then return (0, 0, 0). Pick h'J, ..., h'q uniformly from H. Run A on input (x, h1, ..., hJ−1, h'J, ..., h'q; r) to produce (J', y'). If J' = J and hJ ≠ h'J then return (1, y, y'), otherwise, return (0, 0, 0). Let frk be the probability that FA outputs a triple starting with 1, given an input x chosen randomly from IG. Then $\text{frk}\geq\text{acc}\cdot\left(\frac{\text{acc}}{q}-\frac{1}{h}\right)$. === Intuition === The idea here is to think of A as running two times in related executions, where the process "forks" at a certain point, when some but not all of the input has been examined. In the alternate version, the remaining inputs are re-generated but are generated in the normal way. The point at which the process forks may be something we only want to decide later, possibly based on the behavior of A the first time around: this is why the lemma statement chooses the branching point (J) based on the output of A. The requirement that hJ ≠ h'J is a technical one required by many uses of the lemma. (Note that since both hJ and h'J are chosen randomly from H, then if h is large, as is usually the case, the probability of the two values not being distinct is extremely small.) === Example === For example, let A be an algorithm for breaking a digital signature scheme in the random oracle model. Then x would be the public parameters (including the public key) A is attacking, and hi would be the output of the random oracle on its ith distinct input. The forking lemma is of use when it would be possible, given two different random signatures of the same message, to solve some underlying hard problem. An adversary that forges once, however, gives rise to one that forges twice on the same message with non-negligible probability through the forking lemma.
When A attempts to forge on a message m, we consider the output of A to be (J, y) where y is the forgery, and J is such that m was the Jth unique query to the random oracle (it may be assumed that A will query m at some point, if A is to be successful with non-negligible probability). (If A outputs an incorrect forgery, we consider the output to be (0, y).) By the forking lemma, the probability (frk) of obtaining two good forgeries y and y' on the same message but with different random oracle outputs (that is, with hJ ≠ h'J) is non-negligible when acc is also non-negligible. This allows us to prove that if the underlying hard problem is indeed hard, then no adversary can forge signatures. This is the essence of the proof given by Pointcheval and Stern for a modified ElGamal signature scheme against an adaptive adversary. == Known issues with application of forking lemma == The reduction provided by the forking lemma is not tight. Pointcheval and Stern proposed security arguments for digital signatures and blind signatures using the forking lemma. Claus P. Schnorr provided an attack on blind Schnorr signature schemes, with more than $\operatorname{polylog}(n)$ concurrent executions (the case studied and proven secure by Pointcheval and Stern). A polynomial-time attack, for $\Omega(n)$ concurrent executions, was shown in 2020 by Benhamouda, Lepoint, Raykova, and Orrù. Schnorr also suggested enhancements for securing blind signature schemes based on the discrete logarithm problem.
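The forking algorithm FA from the statement of the lemma can be sketched in a few lines. This is an illustration only: the random tape is modeled as a seed that is replayed in both runs, and `toy_adversary` is a hypothetical stand-in for A that picks its fork point from the tape and succeeds with probability 1/2. It is not a real forger.

```python
import random


def forking_algorithm(A, x, q, H, rng):
    """Run A twice on the same random tape, resampling h_J..h_q for the second run."""
    tape = rng.getrandbits(64)                 # random tape r, reused in both runs
    hs = [rng.choice(H) for _ in range(q)]     # h_1, ..., h_q uniform from H
    J, y = A(x, hs, tape)
    if J == 0:
        return (0, 0, 0)
    # Keep h_1..h_{J-1} and draw fresh h'_J..h'_q.
    fresh = [rng.choice(H) for _ in range(q - J + 1)]
    hs_forked = hs[:J - 1] + fresh
    J2, y2 = A(x, hs_forked, tape)
    if J2 == J and hs[J - 1] != hs_forked[J - 1]:
        return (1, y, y2)
    return (0, 0, 0)


def forking_bound(acc, q, h):
    """Lower bound on frk from the lemma: acc * (acc / q - 1 / h)."""
    return acc * (acc / q - 1.0 / h)


def toy_adversary(x, hs, tape):
    """Hypothetical A: all of its randomness comes from the tape."""
    coins = random.Random(tape)
    J = coins.randint(1, len(hs))
    if coins.random() < 0.5:
        return (0, 0)                          # failed run
    return (J, (x, hs[J - 1]))                 # "output" tied to the J-th value


rng = random.Random(1)
H = list(range(1000))
runs = [forking_algorithm(toy_adversary, "pk", 4, H, rng) for _ in range(2000)]
wins = [r for r in runs if r[0] == 1]
print(len(wins) / len(runs), forking_bound(0.5, 4, 1000))
```

Because the toy adversary's behavior is fixed by the tape, the second run forks at the same index whenever the first run succeeds, so the measured success rate is far above the lemma's bound. The bound is a worst-case guarantee, and real adversaries may fork at a different index in the second run.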

  • Trusted Computing

    Trusted Computing

    Trusted Computing (TC) is a technology developed and promoted by the Trusted Computing Group. The term is taken from the field of trusted systems and has a specialized meaning that is distinct from the field of confidential computing. With Trusted Computing, the computer will consistently behave in expected ways, and those behaviors will be enforced by computer hardware and software. Enforcing this behavior is achieved by loading the hardware with a unique encryption key that is inaccessible to the rest of the system and the owner. TC is controversial as the hardware is not only secured for its owner, but also against its owner, leading opponents of the technology like free software activist Richard Stallman to deride it as "treacherous computing", and certain scholarly articles to use scare quotes when referring to the technology. Trusted Computing proponents such as International Data Corporation, the Enterprise Strategy Group and Endpoint Technologies Associates state that the technology will make computers safer, less prone to viruses and malware, and thus more reliable from an end-user perspective. They also state that Trusted Computing will allow computers and servers to offer improved computer security over that which is currently available. Opponents often state that this technology will be used primarily to enforce digital rights management policies (imposed restrictions to the owner) and not to increase computer security. Chip manufacturers Intel and AMD, hardware manufacturers such as HP and Dell, and operating system providers such as Microsoft include Trusted Computing in their products if enabled. The U.S. Army requires that every new PC it purchases comes with a Trusted Platform Module (TPM). As of July 3, 2007, so does virtually the entire United States Department of Defense. 
== Key concepts == Trusted Computing encompasses six key technology concepts, all of which are required for a fully Trusted system, that is, a system compliant with the TCG specifications: endorsement key; secure input and output; memory curtaining / protected execution; sealed storage; remote attestation; and Trusted Third Party (TTP). === Endorsement key === The endorsement key is a 2048-bit RSA public and private key pair that is created randomly on the chip at manufacture time and cannot be changed. The private key never leaves the chip, while the public key is used for attestation and for encryption of sensitive data sent to the chip, as occurs during the TPM_TakeOwnership command. This key is used to allow the execution of secure transactions: every Trusted Platform Module (TPM) is required to be able to sign a random number (in order to allow the owner to show that they have a genuine trusted computer), using a particular protocol created by the Trusted Computing Group (the direct anonymous attestation protocol) in order to ensure its compliance with the TCG standard and to prove its identity; this makes it impossible for a software TPM emulator with an untrusted endorsement key (for example, a self-generated one) to start a secure transaction with a trusted entity. The TPM should be designed to make the extraction of this key by hardware analysis hard, but tamper resistance is not a strong requirement. === Memory curtaining === Memory curtaining extends common memory protection techniques to provide full isolation of sensitive areas of memory, for example locations containing cryptographic keys. Even the operating system does not have full access to curtained memory. The exact implementation details are vendor specific. === Sealed storage === Sealed storage protects private information by binding it to platform configuration information including the software and hardware being used.
This means the data can be released only to a particular combination of software and hardware. Sealed storage can be used for DRM enforcement. For example, users who keep a song on their computer that has not been licensed for listening will not be able to play it. Currently, a user can locate the song, listen to it, send it to someone else, play it in the software of their choice, or back it up (and in some cases, use circumvention software to decrypt it). Alternatively, the user may use software to modify the operating system's DRM routines so that they leak the song data once, say, a temporary license has been acquired. Using sealed storage, the song is securely encrypted using a key bound to the trusted platform module so that only the unmodified and untampered music player on the user's computer can play it. In this DRM architecture, this might also prevent people from listening to the song after buying a new computer, or upgrading parts of their current one, except after explicit permission of the vendor of the song. === Remote attestation === Remote attestation allows changes to the user's computer to be detected by authorized parties. For example, software companies can identify unauthorized changes to software, including users modifying their software to circumvent commercial digital rights restrictions. It works by having the hardware generate a certificate stating what software is currently running. The computer can then present this certificate to a remote party to show that unaltered software is currently executing. Numerous remote attestation schemes have been proposed for various computer architectures, including Intel, RISC-V, and ARM. Remote attestation is usually combined with public-key encryption so that the information sent can only be read by the programs that requested the attestation, and not by an eavesdropper.
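The sealed storage idea described above, in which data is released only to a particular software and hardware configuration, can be illustrated with a toy model. This sketch is not how a real TPM works and does not use any TPM API: `ToyTrustedModule`, `measure`, and the XOR keystream are hypothetical stand-ins built from the Python standard library, and the cipher is for illustration only (sealing two items under the same configuration would reuse the keystream).

```python
import hashlib
import hmac
import os


def measure(components):
    """Fold the hashes of all loaded components into one configuration digest."""
    digest = b"\x00" * 32
    for blob in components:
        digest = hashlib.sha256(digest + hashlib.sha256(blob).digest()).digest()
    return digest


class ToyTrustedModule:
    """Stand-in for a chip holding a secret that never leaves it."""

    def __init__(self):
        self._secret = os.urandom(32)

    def _key_for(self, config):
        # The sealing key depends on both the chip secret and the configuration.
        return hmac.new(self._secret, b"seal" + config, hashlib.sha256).digest()

    def seal(self, data, config):
        key = self._key_for(config)
        stream = hashlib.shake_256(key).digest(len(data))
        ciphertext = bytes(a ^ b for a, b in zip(data, stream))
        tag = hmac.new(key, ciphertext, hashlib.sha256).digest()
        return ciphertext, tag

    def unseal(self, sealed, config):
        ciphertext, tag = sealed
        key = self._key_for(config)
        expected = hmac.new(key, ciphertext, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise PermissionError("platform configuration does not match")
        stream = hashlib.shake_256(key).digest(len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, stream))


tm = ToyTrustedModule()
good = measure([b"bootloader v1", b"music player v1"])
patched = measure([b"bootloader v1", b"music player v1 (patched)"])
sealed = tm.seal(b"song data", good)
print(tm.unseal(sealed, good))
```

Unsealing under the `patched` measurement raises `PermissionError`, which mirrors the point made in the text: changing the software, or moving to a different machine with a different chip secret, changes the key and the data can no longer be released.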
To take the song example again, the user's music player software could send the song to other machines, but only if they could attest that they were running an authorized copy of the music player software. Combined with the other technologies, this provides a more restricted path for the music: encrypted I/O prevents the user from recording it as it is transmitted to the audio subsystem, memory locking prevents it from being dumped to regular disk files as it is being worked on, sealed storage curtails unauthorized access to it when saved to the hard drive, and remote attestation prevents unauthorized software from accessing the song even when it is used on other computers. To preserve the privacy of attestation responders, Direct Anonymous Attestation has been proposed as a solution, which uses a group signature scheme to prevent revealing the identity of individual signers. Proof of space (PoSpace) has been proposed for malware detection, by determining whether the L1 cache of a processor is empty (e.g., has enough space to evaluate the PoSpace routine without cache misses) or contains a routine that resisted being evicted. === Trusted third party === == Known applications == The Microsoft products Windows Vista, Windows 7, Windows 8 and Windows RT make use of a Trusted Platform Module to facilitate BitLocker Drive Encryption. Other known applications with runtime encryption and the use of secure enclaves include the Signal messenger and the e-prescription service ("E-Rezept") by the German government. == Possible applications == === Digital rights management === Trusted Computing would allow companies to create a digital rights management (DRM) system which would be very hard to circumvent, though not impossible. An example is downloading a music file. Sealed storage could be used to prevent the user from opening the file with an unauthorized player or computer.
Remote attestation could be used to authorize play only by music players that enforce the record company's rules. The music would be played from curtained memory, which would prevent the user from making an unrestricted copy of the file while it is playing, and secure I/O would prevent capturing what is being sent to the sound system. Circumventing such a system would require either manipulation of the computer's hardware, capturing the analogue (and thus degraded) signal using a recording device or a microphone, or breaking the security of the system. New business models for use of software (services) over the Internet may be boosted by the technology. By strengthening the DRM system, one could base a business model on renting programs for specific time periods or on "pay as you go" models. For instance, one could download a music file which could only be played a certain number of times before it becomes unusable, or the music file could be used only within a certain time period. === Preventing cheating in online games === Trusted Computing could be used to combat cheating in online games. Some players modify their game copy in order to gain unfair advantages in the game; remote attestation, secure I/O and memory curtaining could be used to determine that all players connected to a server were running an unmodified copy of the software. === Verification of remote computation for grid computing === Trusted Computing could be used to guarantee participants in a grid computing sys

  • Brooklyn Bridge (software)

    Brooklyn Bridge (software)

    The Brooklyn Bridge from White Crane Systems was a data transfer enabler. Although it came with some hardware, the software was the basis of the product. It could also transform the data's format. == Overview == The New York Times described its category as being among "communications packages used to transfer files." In an era of 300 baud, Brooklyn Bridge operated at "115,200 baud" so that a transfer which "at 300 baud took 4 minutes and 36 seconds" only needed 5 seconds. Unlike some communications packages, it retained each file's original version date, so that a transferred copy did not look like an update when it was not. == Description == Once the software was installed, users comfortable with typing the word "COPY" could transfer files as readily as they would by sneakernet. An earlier review described it as "less cumbersome than conventional communications software". Neither specialized hardware nor specialized software is ideal in an era when such transfers can be done using online or other "outside" services.
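As a rough check of the figures quoted above, the sketch below compares the reported times with what pure line-rate scaling would give. It assumes that transfer time is inversely proportional to the baud rate, an idealization that ignores protocol, disk, and start-up overhead; the function name is hypothetical.

```python
def ideal_transfer_seconds(seconds_at_old_rate, old_baud, new_baud):
    """Time at the new rate if duration scaled exactly with line rate."""
    return seconds_at_old_rate * old_baud / new_baud


slow_seconds = 4 * 60 + 36                      # 4 minutes 36 seconds at 300 baud
rate_ratio = 115_200 / 300                      # nominal line-rate ratio
ideal_seconds = ideal_transfer_seconds(slow_seconds, 300, 115_200)
reported_speedup = slow_seconds / 5             # using the quoted 5 seconds

print(rate_ratio, ideal_seconds, reported_speedup)
```

Under this assumption the nominal rate ratio is 384 and the ideal time would be under one second, while the quoted 5 seconds corresponds to a speedup of about 55. The gap is consistent with fixed overheads dominating short transfers, although the source does not say so.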

  • Perceptual robotics

    Perceptual robotics

    Perceptual robotics is an interdisciplinary science linking robotics and neuroscience. It investigates biologically motivated robot control strategies, concentrating on perceptual rather than cognitive processes, and thereby sides with J. J. Gibson's view against the poverty of the stimulus theory. As a working definition, the following quote from Chapter 64 by H. Bülthoff, C. Wallraven and M. Giese from The Springer Handbook of Robotics, edited by Bruno Siciliano and Oussama Khatib, published by Springer in 2007, could be used: "In the following we will apply the term Perceptual Robotics to signify the design of robots based on principles that are derived from human perception on all three levels in the sense of Marr. This includes a realization in terms of specific neural circuits as well as the transfer of more abstract biologically-inspired strategies for the solution of relevant computational problems."

  • Social network hosting service

    Social network hosting service

    A social network hosting service is a web hosting service that specifically hosts the user creation of web-based social networking services, alongside related applications. Such services are also known as vertical social networks due to the creation of SNSes which cater to specific user interests and niches; like larger, interest-agnostic SNSes, such niche networking services may also possess the ability to create increasingly niche groups of users. == List of social network hosting services == Federated Media Publishing's BigTent; BroadVision Clearvale; Ning; Wall.fm

  • Chaffing and winnowing

    Chaffing and winnowing

    Chaffing and winnowing is a cryptographic technique to achieve confidentiality without using encryption when sending data over an insecure channel. The name is derived from agriculture: after grain has been harvested and threshed, it remains mixed together with inedible fibrous chaff. The chaff and grain are then separated by winnowing, and the chaff is discarded. The cryptographic technique was conceived by Ron Rivest and published in an on-line article on 18 March 1998. Although it bears similarities to both traditional encryption and steganography, it cannot be classified under either category. This technique allows the sender to deny responsibility for encrypting their message. When using chaffing and winnowing, the sender transmits the message unencrypted, in clear text. Although the sender and the receiver share a secret key, they use it only for authentication. However, a third party can make their communication confidential by simultaneously sending specially crafted messages through the same channel. == How it works == The sender (Alice) wants to send a message to the receiver (Bob). In the simplest setup, Alice enumerates the symbols in her message and sends out each in a separate packet. If the symbols are complex enough, such as natural-language text, an attacker may be able to distinguish the real symbols from poorly faked chaff symbols, posing a similar problem as steganography in needing to generate highly realistic fakes; to avoid this, the symbols can be reduced to just single 0/1 bits, and realistic fakes can then be simply randomly generated 50:50 and are indistinguishable from real symbols. In general, the method requires each symbol to arrive in-order and to be authenticated by the receiver. When implemented over networks that may change the order of packets, the sender places the symbol's serial number in the packet, the symbol itself (both unencrypted), and a message authentication code (MAC). 
Many MACs use a secret key Alice shares with Bob, but it is sufficient that the receiver has a method to authenticate the packets. Rivest notes an interesting property of chaffing-and-winnowing is that third parties (such as an ISP) can opportunistically add it to communications without needing permission or coordination with the sender/recipient. A third-party (Charles) who transmits Alice's packets to Bob, interleaves the packets with corresponding bogus packets (called "chaff") with corresponding serial numbers, arbitrary symbols, and a random number in place of the MAC. Charles does not need to know the key to do that (real MACs are large enough that it is extremely unlikely to generate a valid one by chance, unlike in the example). Bob uses the MAC to find the authentic messages and drops the "chaff" messages. This process is called "winnowing". An eavesdropper located between Alice and Charles can easily read Alice's message. But an eavesdropper between Charles and Bob would have to tell which packets are bogus and which are real (i.e. to winnow, or "separate the wheat from the chaff"). That is infeasible if the MAC used is secure and Charles does not leak any information on packet authenticity (e.g. via timing). If a fourth party joins the example (named Darth) who wants to send counterfeit messages to impersonate Alice, it would require Alice to disclose her secret key. If Darth cannot force Alice to disclose an authentication key (the knowledge of which would enable him to forge messages from Alice), then her messages will remain confidential. Charles, on the other hand, is no target of Darth's at all, since Charles does not even possess any secret keys that could be disclosed. == Variations == The simple variant of the chaffing and winnowing technique described above adds many bits of overhead per bit of original message. 
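The bit-by-bit scheme described above, including its per-bit overhead, can be sketched with the Python standard library. This is a minimal illustration under stated assumptions: HMAC-SHA256 is chosen here as the MAC, each chaff packet carries the complementary bit for the same serial number, and the function names `alice_send`, `charles_add_chaff`, and `bob_winnow` are hypothetical.

```python
import hashlib
import hmac
import os
import random

MAC_LEN = 32  # bytes in an HMAC-SHA256 tag


def mac(key, serial, bit):
    """Authenticate one (serial number, bit) pair with the shared key."""
    return hmac.new(key, f"{serial}:{bit}".encode(), hashlib.sha256).digest()


def alice_send(key, bits):
    """One packet per bit: serial number and bit in the clear, plus a MAC."""
    return [(serial, bit, mac(key, serial, bit)) for serial, bit in enumerate(bits)]


def charles_add_chaff(packets, rng):
    """Interleave each real packet with a bogus one; no key is needed."""
    wire = []
    for serial, bit, tag in packets:
        chaff = (serial, 1 - bit, os.urandom(MAC_LEN))  # random bytes instead of a MAC
        pair = [(serial, bit, tag), chaff]
        rng.shuffle(pair)                               # hide which packet came first
        wire.extend(pair)
    return wire


def bob_winnow(key, wire):
    """Keep only packets whose MAC verifies, then read the bits in serial order."""
    wheat = {}
    for serial, bit, tag in wire:
        if hmac.compare_digest(tag, mac(key, serial, bit)):
            wheat[serial] = bit
    return [wheat[serial] for serial in sorted(wheat)]


key = os.urandom(32)
message = [1, 0, 0, 1, 1]
wire = charles_add_chaff(alice_send(key, message), random.Random(0))
print(bob_winnow(key, wire) == message)
```

Every message bit travels as two packets, each carrying a 32-byte tag, which makes the overhead mentioned in the text concrete. A receiver without the key cannot verify any tag and so recovers nothing from the same traffic.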
To make the transmission more efficient, Alice can process her message with an all-or-nothing transform and then send it out in much larger chunks. The chaff packets will have to be modified accordingly. Because the original message can be reconstructed only by knowing all of its chunks, Charles needs to send only enough chaff packets to make finding the correct combination of packets computationally infeasible. Chaffing and winnowing lends itself especially well to use in packet-switched network environments such as the Internet, where each message (whose payload is typically small) is sent in a separate network packet. In another variant of the technique, Charles carefully interleaves packets coming from multiple senders. That eliminates the need for Charles to generate and inject bogus packets in the communication. However, the text of Alice's message cannot be well protected from other parties who are communicating via Charles at the same time. This variant also helps protect against information leakage and traffic analysis. == Implications for law enforcement == Ron Rivest suggests that laws related to cryptography, including export controls, would not apply to chaffing and winnowing because it does not employ any encryption at all. In his words: "The power to authenticate is in many cases the power to control, and handing all authentication power to the government is beyond all reason." The author of the paper proposes that the security implications of handing everyone's authentication keys to the government for law-enforcement purposes would be far too risky, since possession of the key would enable someone to masquerade and communicate as another entity, such as an airline controller. Furthermore, Rivest contemplates the possibility of rogue law enforcement officials framing innocent parties by introducing the chaff into their communications, concluding that drafting a law restricting chaffing and winnowing would be far too difficult.
== Trivia == The term winnowing was suggested by Ronald Rivest's father. Before the publication of Rivest's paper in 1998, other people brought to his attention a 1965 novel, Rex Stout's The Doorbell Rang, which describes the same concept and was therefore included in the paper's references.
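The basic chaffing and winnowing exchange described in this article can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the message is sent one bit per packet, HMAC-SHA256 stands in for the MAC, each chaff packet carries the complementary bit, and the function names (`alice_send`, `charles_chaff`, `bob_winnow`) are hypothetical. Note that `charles_chaff` never receives the key.

```python
import hashlib
import hmac
import secrets


def mac(key: bytes, serial: int, bit: int) -> bytes:
    """MAC over (serial number, bit); HMAC-SHA256 is an assumed choice."""
    return hmac.new(key, f"{serial}:{bit}".encode(), hashlib.sha256).digest()


def alice_send(key: bytes, bits: list[int]) -> list[tuple[int, int, bytes]]:
    """Alice authenticates each bit but does not encrypt anything."""
    return [(serial, bit, mac(key, serial, bit)) for serial, bit in enumerate(bits)]


def charles_chaff(packets: list[tuple[int, int, bytes]]) -> list[tuple[int, int, bytes]]:
    """Charles adds one chaff packet per real packet without knowing the key."""
    rng = secrets.SystemRandom()
    out = []
    for serial, bit, tag in packets:
        # Same serial number, complementary bit, random bytes in place of the MAC.
        chaff = (serial, 1 - bit, secrets.token_bytes(len(tag)))
        pair = [(serial, bit, tag), chaff]
        rng.shuffle(pair)  # hide which of the two packets came first
        out.extend(pair)
    return out


def bob_winnow(key: bytes, packets: list[tuple[int, int, bytes]]) -> list[int]:
    """Bob keeps only the packets whose MAC verifies and drops the chaff."""
    wheat = {}
    for serial, bit, tag in packets:
        if hmac.compare_digest(tag, mac(key, serial, bit)):
            wheat[serial] = bit
    return [wheat[s] for s in sorted(wheat)]


shared_key = secrets.token_bytes(32)
message_bits = [1, 0, 0, 1, 1]
on_the_wire = charles_chaff(alice_send(shared_key, message_bits))
recovered = bob_winnow(shared_key, on_the_wire)
```

Sending one bit per packet is what makes this simple variant so expensive: every message bit costs two packets, each carrying a full MAC.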

  • What I eat in a day video

    What I eat in a day video

    "What I eat in a day" videos are a trend on several social media platforms where a person describes all the meals and snacks that they eat during a given day, often as part of a given diet. The videos, shared on platforms including Twitter, TikTok and YouTube, became increasingly popular in 2020, with some of them accumulating millions of views, and they are considered a profitable industry for the people making them. Some have raised concerns that the videos may promote an unrealistic standard for healthy eating and contribute to the development of eating disorders. == Format == These videos often feature a montage of the food that the creator eats over the course of the day, sometimes with the associated calorie count of the foods that they describe. Unlike related mukbang videos, in which participants eat large amounts of food, the diets described are often restrictive. However, other videos are labeled as "unhealthy" and depict large portion sizes and higher amounts of processed food. == Popularity == "What I eat in a day" videos have existed for a long time, especially on YouTube, but they have become much more widespread in recent years. This phenomenon is self-reinforcing because when social media users watch or like these videos they are likely to see more of them in the future. Indeed, some of the most successful videos have tens of millions of views each. == Criticism and controversy == Several dieticians and mental health professionals have raised concerns over the impacts that these videos can have, as they can advocate a restrictive style of eating and not "promote body diversity." They have also raised concerns that this trend could contribute to a rise in disordered eating, especially since use of social media is known to increase feelings of negative body image. This trend is particularly prevalent among young adults, who are also the group with the highest vulnerability to eating disorders. 
More recently, a portion of these videos have begun to challenge diets and depict more realistic ways of eating in order to reduce the potential consequences of the trend.

  • Social software engineering

    Social software engineering

    Social software engineering (SSE) is a branch of software engineering that is concerned with the social aspects of software development and the developed software. SSE focuses on the socialness of both software engineering and developed software. On the one hand, the consideration of social factors in software engineering activities, processes and CASE tools is deemed to be useful to improve the quality of both the development process and the produced software. Examples include the role of situational awareness and multi-cultural factors in collaborative software development. On the other hand, the dynamicity of the social contexts in which software could operate (e.g., in a cloud environment) calls for engineering social adaptability as a runtime iterative activity. Examples include approaches which enable software to gather users' quality feedback and use it to adapt autonomously or semi-autonomously. SSE studies and builds socially-oriented tools to support collaboration and knowledge sharing in software engineering. SSE also investigates the adaptability of software to the dynamic social contexts in which it could operate and the involvement of clients and end-users in shaping software adaptation decisions at runtime. Social context includes norms, culture, roles and responsibilities, stakeholders' goals and interdependencies, end-users' perception of the quality and appropriateness of each software behaviour, etc. 
The participants of the 1st International Workshop on Social Software Engineering and Applications (SoSEA 2008) proposed the following characterization: community-centered (software is produced and consumed by and/or for a community rather than focusing on individuals); collaboration/collectiveness (exploiting the collaborative and collective capacity of human beings); companionship/relationship (making explicit the various associations among people); human/social activities (software is designed consciously to support human activities and to address social problems); and social inclusion (software should enable social inclusion, enforcing links and trust in communities). Thus, SSE can be defined as "the application of processes, methods, and tools to enable community-driven creation, management, deployment, and use of software in online environments". One of the main observations in the field of SSE is that the concepts, principles, and technologies made for social software applications are applicable to software development itself, as software engineering is inherently a social activity. SSE is not limited to specific activities of software development. Accordingly, tools have been proposed supporting different parts of SSE, for instance, social system design or social requirements engineering. Consequently, vertical market software, such as software development tools, engineering tools, marketing tools or software that helps users in a decision-making process, can profit from social components. Such vertical social software differs strongly in its user base from traditional social software such as Yammer.

  • AS1 (networking)

    AS1 (networking)

    AS1 (Applicability Statement 1) is a specification about how to transport structured business-to-business data securely and reliably over the Internet. Security is achieved by using digital certificates and encryption. == AS1 technical overview == The AS1 protocol is based on SMTP and S/MIME. It was the first AS protocol developed and uses signing, encryption and MDN conventions. In other words: files are sent as "attachments" in a specially coded S/MIME email message; messages can be signed, but do not have to be; messages can be encrypted, but do not have to be; and messages may request an MDN back if all went well, but do not have to request such a message. If the original AS1 message requested an MDN, the exchange continues as follows. Upon the receipt of the message and its successful decryption or signature validation (as necessary), a "success" MDN will be sent back to the original sender. This MDN is typically signed but not encrypted. Upon the receipt and successful verification of the signature on the MDN, the original sender will "know" that the recipient got their message (this provides the "non-repudiation" element of AS1). If there are any problems receiving or interpreting the original AS1 message, a "failed" MDN may be sent back. Like any other AS file transfer, AS1 file transfers typically require both sides of the exchange to trade X.509 certificates and specific "trading partner" names before any transfers can take place.

  • Sex differences in social media use

    Sex differences in social media use

    Men and women use social media in different ways and with different frequencies. In general, several researchers have found that women tend to use social network services (SNSs) more than men and primarily to socialize. == Differences == === Predilection for usage === Many studies have found that women are more likely to use either specific SNSs such as Facebook or MySpace or SNSs in general. In 2015, 73% of online men and 80% of online women used social networking sites. The gender gap is less apparent on LinkedIn: in 2015, about 26% of online men and 25% of online women used the business- and employee-oriented networking site. Researchers who have examined the gender of users of multiple SNSs have found contradictory results. Hargittai's groundbreaking 2007 study examining race, gender, and other differences between undergraduate college student users of SNSs found that women were not only more likely to have used SNSs than men but also more likely to have used many different services, including Facebook, MySpace, and Friendster; these differences persisted in several models and analyses. Although she only surveyed students at one institution – the University of Illinois at Chicago – Hargittai selected that institution intentionally as "an ideal location for studies of how different kinds of people use online sites and services." In contrast, data collected by the Pew Internet & American Life Project found that men were more likely to have multiple SNS profiles. Although the sample sizes of the two surveys are comparable – 1,650 Internet users in the Pew survey compared with 1,060 in Hargittai's survey – the data from the Pew survey are newer and arguably more representative of the entire adult United States population. Pinterest, Facebook, and Instagram attract more women than men. Picture-sharing sites overall are very popular among women. Pinterest alone attracts three times as many female users as male users. 
However, use of Pinterest by men has increased from 5% in 2012. Facebook attracts about 77% of women online. Instagram is also more likely to attract women. Men are more likely to participate in online forums like Reddit, Digg or Slashdot. One in five men claim to be a part of an online forum. === Uses === In general, women seem to use SNSs more to explicitly foster social connections. A study conducted by Pew research centers found that women were more avid users of social media. In November 2010, the gap between men and women was as high as 15%. Female participants in a multi-stage study conducted in 2007 to discover the motivations of Facebook users scored higher on scales for social connection and posting of photographs. Studies have also been conducted on the differences between females and males with regards to blogging. The Pew Research Center found that younger females are more likely to blog than males their own age, even males that are older than them. Similarly, in a study of blogs maintained in MySpace, women were found to be more likely to not only write blogs but also write about family, romantic relationships, friendships, and health in those blogs. A study of Swedish SNS users found that women were more likely to have expressions of friendship, specifically in the areas of (a) publishing photos of their friends, (b) specifically naming their best friends, and (c) writing poems to and about their friends. Women were also more likely to have expressions related to family relationships and romantic relationships. One of the key findings of this research is that those men who do have expressions of romantic relationships in their profile had expressions just as strong as the women. However, the researcher speculated that this may be in part due to a desire to publicly express heterosexual behaviors and mannerisms instead of merely expressing romantic feelings. 
A large-scale study of gender differences in MySpace found that both men and women tended to have a majority of female Friends, and both men and women tended to have a majority of female "Top" Friends in the site. A later study found women to author disproportionately many (public) comments in MySpace, but an investigation into the role of emotion in public MySpace comments found that women both give and receive stronger positive emotion. It was hypothesised that women are simply more effective at using social networking sites because they are better able to harness positive emotion. A study focused on the influence of gender and personality on individuals' use of online social networking websites such as Facebook reported that men use social networking sites with the intention of forming new relationships, whereas women use them more for relationship maintenance. In addition to this, women are more likely to use Facebook or MySpace to compare themselves to others and also to search for information. Men, however, are more likely to look at other people's profiles with the intention of finding friends. Women were less successful at actually finding new friends, but more successful at "maintaining existing relationships, making new relationships, using for academic purposes and following specific agenda". Similarly, men also self-reported this motivation "while women reported using them more for relationship maintenance". === Personality === OCEAN personality traits are known to systematically vary between human males and females. In one study, the same women were more extraverted and agreeable, as well as less neurotic, on social media than offline. Other studies associated neuroticism with female use of social media. 
=== Privacy === Privacy has been the primary topic of many studies of SNS users, and many of these studies have found differences between male and female SNS users, although some studies have found results contradictory to those found in other studies. Some researchers have found that women are more protective of their personal information and more likely to have private profiles. Other researchers have found that women are less likely to post some types of information. Acquisti and Gross found that women in their sample were less likely to reveal their sexual orientation, personal address, or cell phone number. This is similar to Pew Internet & American Life research of children users of SNSs that found that boys and girls presented different views of privacy and behaviors, with girls being more concerned about and restrictive of information such as city, town, last name, and cell phone number that could be used to locate them. At least one group of researchers has found that women are less likely to share information that "identifies them directly – last name, cell phone number, and address or home phone number," linking that resistance to women's greater concerns about "cyberstalking", "cyberbullying", and security problems. Despite these concerns about privacy, researchers have found that women are more likely to maintain up-to-date photos of themselves. Further, Kolek and Saunders found in their sample of college student Facebook users that women were more likely to not only post a photograph of themselves in their profile but that they were more likely to have a publicly viewable Facebook account (a contradictory finding compared to many other studies), post photos, and post photo albums. 
Women were more likely to have: (a) a publicly viewable Facebook account, (b) more photo albums, (c) more photos, (d) a photo of themselves as their profile picture, (e) positive references to alcohol, partying, or drugs, and (f) more positive references to or about the institution or institution-related activities. In general, women were more likely to disclose information about themselves in their Facebook profile, with the primary exception of sharing their telephone number. Similarly, female respondents to Strano's study were more likely to keep their profile photo recent and choose a photo that made them appear attractive, happy, and fun-loving. Citing several examples, Strano opined that there may also be a difference in how men and women Facebook users display and interpret profile photos depicting relationships. Privacy has also been a concern for the Snapchat app, which allows users to send text, photo or video messages that then disappear. One study has shown that security is not a major concern for the majority of users and that most do not use Snapchat to send sensitive content (although up to 25% may do so experimentally). In that research, almost no statistically significant gender differences were found. === Cyberbullying === Past research investigating gender differences in cyberbullying has found that boys commit more cyber verbal bullying, cyber forgery and more violence based on hidden identity or presenting themselves as another person. === Mansplaining === A 2021 article found that mansplaining could be seen more prominent online rather than offl

  • NRENum.net

    NRENum.net

    The NRENum.net service is an end-user ENUM service run by TERENA and the participating national research and education networking organisations (NRENs), primarily for academia. NRENum.net is considered a complementary service and a valid alternative to the Golden ENUM tree. The domain nrenum.net is being populated in order to provide the infrastructure in DNS for storage of E.164 numbers. The NRENum.net service includes the operation of the Tier-0 root Domain Name Server(s) and the delegation of country codes to NRENum.net Registries. NRENum.net is a registered community trademark of TERENA. == Service description == E.164 Telephone Number Mapping (ENUM) is a standard protocol that is the result of work of the Internet Engineering Task Force's Telephone Number Mapping working group. ENUM translates a telephone number into a domain name. This allows users to continue to use the existing phone number formats they are familiar with, while allowing the call to be routed using DNS. This makes ENUM a quick, stable and cheap link between telecommunications systems and the Internet. RFC 3761 discusses the use of the Domain Name System for storage of E.164 numbers. More specifically, it describes how DNS can be used for identifying available services connected to one E.164 number. The RIPE NCC provides DNS operations for e164.arpa (known as the Golden ENUM tree) in accordance with the instructions from the Internet Architecture Board. 
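The translation of a telephone number into a domain name can be illustrated with a short Python sketch. It follows the usual ENUM convention of keeping only the digits, reversing them, separating them with dots and appending the tree's suffix. The function name `e164_to_enum_domain` and the sample number are made up for illustration; the default suffix `nrenum.net` is the domain named above, and `e164.arpa` would be used for the Golden ENUM tree.

```python
def e164_to_enum_domain(number: str, suffix: str = "nrenum.net") -> str:
    """Map an E.164 number such as '+1 555 0100' to its ENUM domain name."""
    # Keep only the ASCII digits; '+', spaces and dashes are dropped.
    digits = [c for c in number if c in "0123456789"]
    # E.164 numbers have at most 15 digits.
    if not digits or len(digits) > 15:
        raise ValueError("expected between 1 and 15 digits")
    # Reverse the digits so the country code ends up next to the suffix,
    # which lets DNS delegation follow the numbering hierarchy.
    return ".".join(reversed(digits)) + "." + suffix


domain = e164_to_enum_domain("+1 555 0100")
# domain == "0.0.1.0.5.5.5.1.nrenum.net"
```

A client would then query DNS for NAPTR records at the resulting name to discover the services registered for that number.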
NRENum.net facilitates services such as Voice over IP and videoconferencing. NRENum.net tree refers to the tree structure where: Tier-0 root Domain Name Servers (technically one master and several secondary servers ensuring resilience) are run by the hosting organisations and coordinated by the NRENum.net Operations Team. Tier-1 Domain Name Servers are run by the NRENum.net (national or regional) Registries responsible for the country code(s) delegated. Tier-2 and lower DNS sub-delegations may be implemented, regulated by the national service policies. An NRENum.net Registry is an entity that is authorised by the NRENum.net Operations Team to operate the national or regional Tier-1 Domain Name Server and be responsible for the country code(s) delegated. In many countries there is a National Research and Education Networking organisation (NREN) that acts as the Registry of the country. An NRENum.net Registrar is responsible for the number/block registration in the Tier-1 DNS and a Number Validation Entity is responsible for the validation of the E.164 telephone numbers to be registered. The NREN may at the same time have the role of the NRENum.net Registry, Registrar and Validation Entity for the country code(s) delegated. A Registrant (end user) is an E.164 telephone number holder. Holders of E.164 numbers who want to be listed in the service must contact the appropriate NRENum.net Registrar. Number (block) delegation is the technical process of assigning country codes to national registries, or number blocks under country codes to end users. Number (block) registration is the technical process of configuring DNS and populating it with the appropriate ENUM records (i.e., adding NAPTR records to DNS) via registrars. The ITU-T strictly regulates the number structure of valid E.164 telephone numbers and assigns number blocks to national authorities (telecom regulators) or, more recently, to global entities directly. 
The national authorities can further delegate the number ranges to local operators within the country or region. A virtual number has either a non-valid E.164 number structure (e.g., longer than 15 digits) or has a valid structure but is not assigned to any national authorities or operators. The number Validation Entity is responsible for checking the numbers to be registered to NRENum.net. == History == The idea for the NRENum.net service was conceived in 2006. NRENum.net became operational in August 2006, and was run by Bernie Höneisen, a staff member of SWITCH, and Kewin Stöckigt, a staff member of AARNet, as a private service, with technical support from SWITCH and the participants in the TERENA Task Force on Enhanced Communication Services (TF-ECS). When that task force completed its activities in 2008, TERENA agreed to take over the coordination of the NRENum.net service. By that time, nine NRENs had joined NRENum.net. The service continued to grow during the next years, and in March 2012 NRENum.net went global when RNP from Brazil joined the service as its 14th participant and the first outside Europe. In 2011, the participants decided to migrate the operation of the service's master Domain Name Server to NIIF and the operation of the two secondary DNSs to CARNET and SWITCH. In 2013, Internet2, AARNet and NORDUnet set up additional secondary Domain Name Servers for their regions, thereby completing the global distribution of DNS slaves and bringing the resilience of the NRENum.net infrastructure to a high level. == Governance == TERENA has established a lightweight global governance structure. The Global NRENum.net Governance Committee (GNGC) is the highest-level strategic body responsible for overall NRENum.net service definition, sustainability and long-term strategy. This includes formulating and recommending service governance principles and policies. 
Its members are nominated by the NRENum.net Registries in the various world regions, and are appointed by TERENA. The GNGC is composed of two members representing Europe, two representing the Asia-Pacific region, and two representing the Americas. The NRENum.net Operations Team is responsible for the day-to-day operations of the Tier-0 root DNSs and the handling of country code delegation requests. It may escalate technical or policy issues to the GNGC for discussion. TERENA is responsible for ensuring the correct and secure operations of the NRENum.net service performed by the NRENum.net Operations Team and governance by the GNGC. TERENA also supports the development of technical improvements to the NRENum.net service and promotes the deployment of NRENum.net worldwide. == Geographical deployment == Thirty-two country codes are delegated in the NRENum.net service. Below these are listed per world region. === Europe === === Asia-Pacific === === North America === +1 United States (Internet2) === Latin America === === Caribbean === === Africa === +262 Réunion, Mayotte (RENATER)

  • Conditional random field

    Conditional random field

    Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without considering "neighbouring" samples, a CRF can take context into account. To do so, the predictions are modelled as a graphical model, which represents the presence of dependencies between the predictions. The kind of graph used depends on the application. For example, in natural language processing, "linear chain" CRFs are popular, for which each prediction is dependent only on its immediate neighbours. In image processing, the graph typically connects locations to nearby and/or similar locations to enforce that they receive similar predictions. Other examples where CRFs are used are: labeling or parsing of sequential data for natural language processing or biological sequences, part-of-speech tagging, shallow parsing, named entity recognition, gene finding, peptide critical functional region finding, and object recognition and image segmentation in computer vision. == Description == CRFs are a type of discriminative undirected probabilistic graphical model. Lafferty, McCallum and Pereira define a CRF on observations $\boldsymbol{X}$ and random variables $\boldsymbol{Y}$ as follows: Let $G=(V,E)$ be a graph such that $\boldsymbol{Y}=(\boldsymbol{Y}_{v})_{v\in V}$, so that $\boldsymbol{Y}$ is indexed by the vertices of $G$. 
Then $(\boldsymbol{X},\boldsymbol{Y})$ is a conditional random field when each random variable $\boldsymbol{Y}_{v}$, conditioned on $\boldsymbol{X}$, obeys the Markov property with respect to the graph; that is, its probability is dependent only on its neighbours in $G$ and not its past states: $P(\boldsymbol{Y}_{v}\mid\boldsymbol{X},\{\boldsymbol{Y}_{w}:w\neq v\})=P(\boldsymbol{Y}_{v}\mid\boldsymbol{X},\{\boldsymbol{Y}_{w}:w\sim v\})$, where $w\sim v$ means that $w$ and $v$ are neighbors in $G$. What this means is that a CRF is an undirected graphical model whose nodes can be divided into exactly two disjoint sets $\boldsymbol{X}$ and $\boldsymbol{Y}$, the observed and output variables, respectively; the conditional distribution $p(\boldsymbol{Y}\mid\boldsymbol{X})$ is then modeled. === Inference === For general graphs, the problem of exact inference in CRFs is intractable. The inference problem for a CRF is basically the same as for an MRF and the same arguments hold. However, there exist special cases for which exact inference is feasible: If the graph is a chain or a tree, message passing algorithms yield exact solutions. The algorithms used in these cases are analogous to the forward-backward and Viterbi algorithm for the case of HMMs. If the CRF only contains pair-wise potentials and the energy is submodular, combinatorial min cut/max flow algorithms yield exact solutions. If exact inference is impossible, several algorithms can be used to obtain approximate solutions. 
These include loopy belief propagation, alpha expansion, mean field inference, and linear programming relaxations. === Parameter learning === Learning the parameters $\theta$ is usually done by maximum likelihood learning for $p(Y_{i}\mid X_{i};\theta)$. If all nodes have exponential family distributions and all nodes are observed during training, this optimization is convex. It can be solved for example using gradient descent algorithms, or Quasi-Newton methods such as the L-BFGS algorithm. On the other hand, if some variables are unobserved, the inference problem has to be solved for these variables. Exact inference is intractable in general graphs, so approximations have to be used. === Examples === In sequence modeling, the graph of interest is usually a chain graph. An input sequence of observed variables $X$ represents a sequence of observations and $Y$ represents a hidden (or unknown) state variable that needs to be inferred given the observations. The $Y_{i}$ are structured to form a chain, with an edge between each $Y_{i-1}$ and $Y_{i}$. As well as having a simple interpretation of the $Y_{i}$ as "labels" for each element in the input sequence, this layout admits efficient algorithms for: model training, learning the conditional distributions between the $Y_{i}$ and feature functions from some corpus of training data; decoding, determining the probability of a given label sequence $Y$ given $X$; and inference, determining the most likely label sequence $Y$ given $X$. 
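For a chain-structured model, the inference task listed above (finding the most likely label sequence) can be solved exactly with a Viterbi-style dynamic program. The following is a minimal sketch, assuming the model exposes a local score for every position and pair of adjacent labels; the names `viterbi` and `toy_score`, the label set and the hand-picked weights are all hypothetical and are not taken from any particular CRF library.

```python
from typing import Callable, Optional, Sequence

# score(i, prev_label, label, x): local (log-space) score of assigning `label`
# at position i after `prev_label` (None at i == 0), given the whole input x.
ScoreFn = Callable[[int, Optional[str], str, Sequence[str]], float]


def viterbi(x: Sequence[str], labels: Sequence[str], score: ScoreFn) -> list[str]:
    """Return the highest-scoring label sequence for a linear chain."""
    if not x:
        return []
    # best[y] = score of the best partial labelling ending in label y.
    best = {y: score(0, None, y, x) for y in labels}
    backpointers = []
    for i in range(1, len(x)):
        new_best, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[p] + score(i, p, y, x))
            new_best[y] = best[prev] + score(i, prev, y, x)
            pointers[y] = prev
        best = new_best
        backpointers.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[y])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def toy_score(i: int, prev: Optional[str], y: str, x: Sequence[str]) -> float:
    """Hand-weighted toy features; a trained CRF would learn these weights."""
    s = 0.0
    if x[i][0].isupper() and y == "NAME":
        s += 2.0  # capitalised tokens look like names
    if not x[i][0].isupper() and y == "O":
        s += 1.0  # lower-case tokens look like non-names
    if prev == "NAME" and y == "NAME":
        s += 0.5  # names tend to continue
    return s


tags = viterbi(["we", "met", "Bob", "Smith"], ["O", "NAME"], toy_score)
# tags == ["O", "O", "NAME", "NAME"]
```

The cost is proportional to the sequence length times the square of the number of labels, which is why the chain layout is considered efficient.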
The conditional dependency of each $Y_{i}$ on $X$ is defined through a fixed set of feature functions of the form $f(i,Y_{i-1},Y_{i},X)$, which can be thought of as measurements on the input sequence that partially determine the likelihood of each possible value for $Y_{i}$. The model assigns each feature a numerical weight and combines them to determine the probability of a certain value for $Y_{i}$. Linear-chain CRFs have many of the same applications as conceptually simpler hidden Markov models (HMMs), but relax certain assumptions about the input and output sequence distributions. An HMM can loosely be understood as a CRF with very specific feature functions that use constant probabilities to model state transitions and emissions. Conversely, a CRF can loosely be understood as a generalization of an HMM that makes the constant transition probabilities into arbitrary functions that vary across the positions in the sequence of hidden states, depending on the input sequence. Notably, in contrast to HMMs, CRFs can contain any number of feature functions, the feature functions can inspect the entire input sequence $X$ at any point during inference, and the range of the feature functions need not have a probabilistic interpretation. == Variants == === Higher-order CRFs and semi-Markov CRFs === CRFs can be extended into higher order models by making each $Y_{i}$ dependent on a fixed number $k$ of previous variables $Y_{i-k},\dots,Y_{i-1}$. In conventional formulations of higher order CRFs, training and inference are only practical for small values of $k$ (such as $k\leq 5$), since their computational cost increases exponentially with $k$. 
However, another recent advance has managed to ameliorate these issues by leveraging concepts and tools from the field of Bayesian nonparametrics. Specifically, the CRF-infinity approach constitutes a CRF-type model that is capable of learning infinitely-long temporal dynamics in a scalable fashion. This is effected by introducing a novel potential function for CRFs that is based on the Sequence Memoizer (SM), a nonparametric Bayesian model for learning infinitely-long dynamics in sequential observations. To render such a model computationally tractable, CRF-infinity employs a mean-field approximation of the postulated novel potential functions (which are driven by an SM). This allows for devising efficient approximate training and inference algorithms for the model, without undermining its capability to capture and model temporal dependencies of arbitrary length. There exists another generalization of CRFs, the semi-Markov conditional random field (semi-CRF), which models variable-length segmentations of the label sequence $Y$. This provides much of the power of higher-order CRFs to model long-range dependencies of the $Y_{i}$, at a reasonable computational cost. Finally, large-margin models for structured prediction, such as the structured Support Vector Machine, can be seen as an alternative training procedure to CRFs. === Latent-dynamic conditional random field === Latent-dynamic conditional random fields (LDCRF) or discriminative probabilistic latent variable models (DPLVM) are a type of CRF for sequence tagging tasks. They are latent variable models that are trained discriminatively. In an LDCRF, like in any sequence tagging task, given a sequence of observations $x=x_{1},\dots,x_{n}$, the main problem the model must solve is how to assign a sequence of labels $y=y_{1},\dots,y_{n}$ from one finite set

  • Feistel cipher

    In cryptography, a Feistel cipher (also known as Luby–Rackoff block cipher) is a symmetric structure used in the construction of block ciphers, named after the German-born physicist and cryptographer Horst Feistel, who did pioneering research while working for IBM; it is also commonly known as a Feistel network. A large number of block ciphers use the scheme, including the US Data Encryption Standard, the Soviet/Russian GOST (aka Magma) and the more recent Blowfish and Twofish ciphers. In a Feistel cipher, encryption and decryption are very similar operations, and both consist of iteratively running a function called a "round function" a fixed number of times. == History == Many modern symmetric block ciphers are based on Feistel networks. Feistel networks were first seen commercially in IBM's Lucifer cipher, designed by Horst Feistel and Don Coppersmith in 1973. Feistel networks gained respectability when the U.S. Federal Government adopted the DES (a cipher based on Lucifer, with changes made by the NSA) in 1976. Like other components of the DES, the iterative nature of the Feistel construction makes implementing the cryptosystem in hardware easier (particularly on the hardware available at the time of DES's design). == Design == A Feistel network uses a round function, a function which takes two inputs – a data block and a subkey – and returns one output of the same size as the data block. In each round, the round function is run on half of the data to be encrypted, and its output is XORed with the other half of the data. This is repeated a fixed number of times, and the final output is the encrypted data. An important advantage of Feistel networks compared to other cipher designs such as substitution–permutation networks (SP-networks) is that the entire operation is guaranteed to be invertible (that is, encrypted data can be decrypted), even if the round function is not itself invertible. 
The round function can be made arbitrarily complicated, since it does not need to be designed to be invertible. Furthermore, the encryption and decryption operations are very similar, even identical in some cases, requiring only a reversal of the key schedule. Therefore, the size of the code or circuitry required to implement such a cipher is nearly halved. Unlike SP-networks, Feistel networks also do not depend on a substitution box that could cause timing side-channels in software implementations. == Theoretical work == The structure and properties of Feistel ciphers have been extensively analyzed by cryptographers. Michael Luby and Charles Rackoff analyzed the Feistel cipher construction and proved that if the round function is a cryptographically secure pseudorandom function, with $K_i$ used as the seed, then 3 rounds are sufficient to make the block cipher a pseudorandom permutation, while 4 rounds are sufficient to make it a "strong" pseudorandom permutation (which means that it remains pseudorandom even to an adversary who gets oracle access to its inverse permutation). Because of this very important result of Luby and Rackoff, Feistel ciphers are sometimes called Luby–Rackoff block ciphers. Further theoretical work has generalized the construction somewhat and given more precise bounds for security. == Construction details == Let $\mathrm{F}$ be the round function and let $K_0, K_1, \ldots, K_n$ be the sub-keys for the rounds $0, 1, \ldots, n$ respectively. Then the basic operation is as follows: Split the plaintext block into two equal pieces: $(L_0, R_0)$. For each round $i = 0, 1, \dots, n$, compute $L_{i+1} = R_i$, $R_{i+1} = L_i \oplus \mathrm{F}(R_i, K_i)$, where $\oplus$ means XOR.
Then the ciphertext is $(R_{n+1}, L_{n+1})$. Decryption of a ciphertext $(R_{n+1}, L_{n+1})$ is accomplished by computing, for $i = n, n-1, \ldots, 0$, $R_i = L_{i+1}$, $L_i = R_{i+1} \oplus \mathrm{F}(L_{i+1}, K_i)$. Then $(L_0, R_0)$ is the plaintext again. The diagram illustrates both encryption and decryption. Note the reversal of the subkey order for decryption; this is the only difference between encryption and decryption. === Unbalanced Feistel cipher === Unbalanced Feistel ciphers use a modified structure where $L_0$ and $R_0$ are not of equal lengths. The Skipjack cipher is an example of such a cipher. The Texas Instruments digital signature transponder uses a proprietary unbalanced Feistel cipher to perform challenge–response authentication. The Thorp shuffle is an extreme case of an unbalanced Feistel cipher in which one side is a single bit. This has better provable security than a balanced Feistel cipher but requires more rounds. There exist Type-1, Type-2, and Type-3 Feistel networks, where the Feistel function is one fourth the size of the block but operates a varying number of times within one round. === Other uses === The Feistel construction is also used in cryptographic algorithms other than block ciphers. For example, the optimal asymmetric encryption padding (OAEP) scheme uses a simple Feistel network to randomize ciphertexts in certain asymmetric-key encryption schemes. A generalized Feistel algorithm can be used to create strong permutations on small domains of size not a power of two (see format-preserving encryption).
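As an illustration of the construction details above, the following is a minimal Python sketch of a balanced Feistel network. The round function (HMAC-SHA256 truncated to the half-block size), the function names and the example subkeys are assumptions made for illustration only; they are not taken from any real cipher, and the sketch is not meant to be secure.

```python
import hashlib
import hmac
from typing import Sequence


def round_function(half: bytes, subkey: bytes) -> bytes:
    """Toy round function F (an assumption for illustration).

    HMAC-SHA256 truncated to the half-block size. It is not invertible,
    which the Feistel structure does not require.
    """
    if len(half) > hashlib.sha256().digest_size:
        raise ValueError("half-block must be at most 32 bytes for this toy F")
    return hmac.new(subkey, half, hashlib.sha256).digest()[: len(half)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def feistel_encrypt(block: bytes, subkeys: Sequence[bytes]) -> bytes:
    """Apply one Feistel round per subkey, then output (R_{n+1}, L_{n+1})."""
    if len(block) % 2 != 0:
        raise ValueError("block length must be even")
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in subkeys:
        # L_{i+1} = R_i ;  R_{i+1} = L_i XOR F(R_i, K_i)
        left, right = right, xor_bytes(left, round_function(right, key))
    return right + left


def feistel_decrypt(block: bytes, subkeys: Sequence[bytes]) -> bytes:
    """Undo the rounds in reverse subkey order and return (L_0, R_0)."""
    if len(block) % 2 != 0:
        raise ValueError("block length must be even")
    half = len(block) // 2
    # The ciphertext is stored as (R_{n+1}, L_{n+1}).
    right, left = block[:half], block[half:]
    for key in reversed(subkeys):
        # R_i = L_{i+1} ;  L_i = R_{i+1} XOR F(L_{i+1}, K_i)
        left, right = xor_bytes(right, round_function(left, key)), left
    return left + right


if __name__ == "__main__":
    subkeys = [b"k0", b"k1", b"k2", b"k3"]  # hypothetical subkeys
    plaintext = b"ABCDEFGH"
    ciphertext = feistel_encrypt(plaintext, subkeys)
    assert feistel_decrypt(ciphertext, subkeys) == plaintext
    print(ciphertext.hex())
```

Because the only difference between the two directions is the subkey order, calling `feistel_encrypt` on the ciphertext with the subkeys reversed also recovers the plaintext in this sketch; the separate `feistel_decrypt` function is kept only to mirror the decryption equations step by step.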
=== Feistel networks as a design component === Whether the entire cipher is a Feistel cipher or not, Feistel-like networks can be used as a component of a cipher's design. For example, MISTY1 is a Feistel cipher using a three-round Feistel network in its round function, Skipjack is a modified Feistel cipher using a Feistel network in its G permutation, and Threefish (part of Skein) is a non-Feistel block cipher that uses a Feistel-like MIX function. == List of Feistel ciphers == Feistel or modified Feistel: Generalised Feistel: CAST-256, CLEFIA, MacGuffin, RC2, RC6, Skipjack, SMS4

  • Hike Messenger

    Hike Messenger, aka Hike Sticker Chat, is a multifunctional Indian social media and social networking service offering instant messaging (IM) and Voice over IP (VoIP) services that was launched on December 11, 2012, by Kavin Bharti Mittal. Hike functioned through SMS. The app registration used a standard, one-time password (OTP) based authentication process. It was estimated to be worth $1.4 billion and had more than 100 million registered users. It went defunct on January 6, 2021, as it was unable to compete with global messaging platforms. The app reappeared on the Google Play Store and Apple App Store on 19 September 2025. == History == Hike Messenger was launched on December 12, 2012, by its founder, Kavin Bharti Mittal. The majority of users were from India, with 80% under the age of 25. The company purchased startups like TinyMogul and Hoppr in 2015. After buying US-based free voice calling company Zip Phones, Hike provided VoIP calling services. On March 5, 2015, Hike launched the 'Great Indian Sticker Challenge' to create more stickers. In February 2017, Hike acquired the social networking app Pulse. From version 5.0, it became the first social messaging app to start a mobile payment service in India. The timeline feature came back after multiple user requests, and a personalized digital envelope called Blue Packets was introduced for sending monetary gifts through a built-in wallet. In 2017, the acquisition of Bengaluru-based startup Creo was announced to enable third-party developers to build services on top of the Hike platform. In 2018, Hike provided 1 billion users with internet access by targeting smaller cities. In January 2019, the company discarded the previous super-app approach and began launching specialized apps for specific use-cases. In May 2019, Hike announced a collaboration with Indraprastha Institute of Information Technology, Delhi (IIIT-D) to develop a variety of machine learning models.
In April 2019, the company launched its first standalone app, Hike Sticker Chat. A separate content app, Hike News & Content, was also launched. In 2021, Hike shut down its messaging service and shifted focus to gaming and community platforms. It launched Rush, a real-money gaming app featuring casual titles like ludo and carrom, which scaled to over 10 million users and generated more than US$500 million in gross revenue over four years. The company also introduced Vibe, an approval-only community app, as part of its pivot away from the super-app and messaging model. In September 2025, following the passage of the Promotion and Regulation of Online Gaming Act, which banned real-money gaming in India, Hike announced its complete closure. Founder Kavin Bharti Mittal stated that while the company had begun international expansion, scaling globally under the new regulatory regime would require a full reset that was not a viable use of capital or resources. On 19 September 2025, Hike was relaunched on the Play Store and App Store under the name Hike Messenger. == Application == === Timeline of Features === On 15 April 2014, Hike introduced unlimited free SMS via a service called Hike Offline, through credits earned by users from regular chatting, as connectivity is still a major issue in many parts of India. In an attempt to appeal to its younger users, Hike introduced features that find resonance with the local market, such as Last Seen Privacy and localized sticker packs. It also introduced a two-way chat theme, allowing users to change the chat background for themselves and for their friends simultaneously. The app also started showing live cricket scores in collaboration with Cricbuzz, as well as news, casual games, and social media feeds. Hike also added a file transfer service, allowing files of any format smaller than 100 MB, with a view to increasing the size limit to 1 GB.
With the launch of version 2.9.2.0 in January 2015, Hike implemented support for sending uncompressed images and a "quick upload" feature optimized for 2G speed. Later that month, Hike introduced a voice calling feature for its users. In September 2015, Hike launched free group call support with up to 100 people in a simultaneous conference call environment. In November 2016, Hike announced the launch of a feature called Stories that lets people share real-life moments using fun live filters, with posts automatically deleted after 48 hours, and a new camera design with localized filters. Hike 4.0 launched on 26 August 2015 with the tagline 'Got a Gang? Get on Hike'. Hike 4.0 was an optimization-focused update, increasing the performance of the app on poor networks. It supported photo filters, doodles, and bite-sized news updates in under 100 characters. Hike launched News Feed with Hindi language support on 29 September 2015 to cater for the needs of the non-English population. In December 2015, Hike launched version 3.5 as its biggest update for Windows Phone 8.1, which changed the user interface for simpler navigation, supported sending unlimited non-media files and documents of any format, and added better group admin settings. It also included ten brand new chat themes. On 8 May 2016, Hike launched a microapp feature that was live for two days as a Mother's Day special, in which users could add images, quotes or messages as a token of love with customized e-cards and stickers on their timeline, not only on Hike but also on other platforms. On 26 October 2016, Hike Messenger rolled out the beta version of a video calling feature ahead of WhatsApp, starting with Android users. The feature also lets recipients preview a video call before deciding to take it and is optimized to work even under 2G conditions.
On 24 December 2016, Hike rolled out a short 20-second Video Stories feature, developed in collaboration with content creators, whose videos can be shared directly with friends or posted on a public timeline with different filters, with the same 48-hour time limit before being automatically deleted. The Stories feature has continued to receive updates to include content, a public story option, private user messaging and geo-tagging. In September 2017, Hike launched personalized sticker packs with 20,000+ graphical stickers for over 500 colleges, expanding to around 1,000 colleges across India by December 2018. The stickers can be used across different geographies and are highly customized for users, with availability in 40+ local languages. The app supports automatic sticker suggestions, where the application suggests the best reply for any sticker message, and also allows users to "nudge", a feature used to ping the receiver. Hike started supporting user comments on friends' posts and added a specific message reply function, a redesigned camera interface with front flash support, and user mentions using the @ symbol. In December 2017, Hike launched group voting, bill splitting, checklists and event reminders for group chats that support up to 1,000 users on both iOS and Android. Hike launched another feature called Hike Land, a virtual world built inside the Hike Sticker Chat application, with a beta trial to start from March 2020. It uses Hike Moji, so that online users can hang out with other users through their digital avatars. It is mainly targeted at, but not restricted to, people aged 16 to 21. Without revealing much about Hike Land, the company created a separate website with an option to reserve spots by giving details such as name, gender and phone number, which link the user profile from the Hike Sticker Chat account, though this is not a necessity.
==== Hike Direct ==== The Hike Direct feature, introduced to users by October 2015, is based on the technology known as WiFi Direct, which was initially also called WiFi P2P. It enables sharing of files such as music, apps and videos without a live internet connection within a 100-meter radius by creating a wireless network between two or more devices, with a transfer speed of 100 MB per minute. For privacy and security reasons, Hike did not show the recipient's location or proximity, and the feature worked only when two users in the same room connected by adding one another to their contact lists. ==== Hike Wallet ==== In June 2017, Hike announced the launch of version 5.0 with multiple new features like User Chat Themes, Night Mode and Magic Selfie, along with a built-in Wallet in partnership with Yes Bank. This feature was first rolled out to Android users, followed by iOS users at a later stage. By November 2017, Hike had collaborated with Airtel Payments Bank to power its digital payment wallet, giving Hike users access to Airtel Payments Bank's merchant and utility payment services and know your customer (KYC) infrastructure, with 5 million transactions happening from services like recharge and P2P. Hike formed a partnership with Ola Cabs to bring a taxi and auto-rickshaw booking facility from 14 February 2018. With the Hike Wallet facility, users could now book bus tickets with 3
