AI Analysis Youtube Video

AI Analysis Youtube Video — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Natural Language Toolkit

    Natural Language Toolkit

    The Natural Language Toolkit, or more commonly NLTK, is a suite of libraries and programs for symbolic and statistical natural language processing (NLP) for English written in the Python programming language. It supports classification, tokenization, stemming, tagging, parsing, and semantic reasoning functionalities. It was developed by Steven Bird and Edward Loper in the Department of Computer and Information Science at the University of Pennsylvania. NLTK includes graphical demonstrations and sample data. It is accompanied by a book that explains the underlying concepts behind the language processing tasks supported by the toolkit, plus a cookbook. NLTK is intended to support research and teaching in NLP or closely related areas, including empirical linguistics, cognitive science, artificial intelligence, information retrieval, and machine learning. NLTK has been used successfully as a teaching tool, as an individual study tool, and as a platform for prototyping and building research systems. == Library highlights == Discourse representation Lexical analysis: Word and text tokenizer n-gram and collocations Part-of-speech tagger Tree model and Text chunker for capturing Named-entity recognition

    Read more →
  • Microsoft Security Development Lifecycle

    Microsoft Security Development Lifecycle

    The Microsoft Security Development Lifecycle (SDL) is the approach Microsoft uses to integrate security into DevOps processes (sometimes called a DevSecOps approach). You can use this SDL guidance and documentation to adapt this approach and practices to your organization. == Overview == The practices outlined in the SDL approach are applicable to all types of software development and across all platforms, ranging from traditional waterfall methodologies to modern DevOps approaches. They can generally be applied to the following: Software – whether you are developing software code for firmware, AI applications, operating systems, drivers, IoT Devices, mobile device apps, web services, plug-ins or applets, hardware microcode, low-code/no-code apps, or other software formats. Note that most practices in the SDL are applicable to secure computer hardware development as well. Platforms – whether the software is running on a ‘serverless’ platform approach, on an on-premises server, a mobile device, a cloud hosted VM, a user endpoint, as part of a Software as a Service (SaaS) application, a cloud edge device, an IoT device, or anywhere else. == Practices == The SDL recommends 10 security practices to incorporate into your development workflows. Applying the 10 security practices of SDL is an ongoing process of improvement so a key recommendation is to begin from some point and keep enhancing as you proceed. This continuous process involves changes to culture, strategy, processes, and technical controls as you embed security skills and practices into DevOps workflows. The 10 SDL practices are: Establish security standards, metrics, and governance Require use of proven security features, languages, and frameworks Perform security design review and threat modeling Define and use cryptography standards Secure the software supply chain Secure the engineering environment Perform security testing Ensure operational platform security Implement security monitoring and response Provide security training == Versions ==

    Read more →
  • Cambridge Analytica

    Cambridge Analytica

    Cambridge Analytica Ltd. (CA), previously known as SCL USA, was a British political consulting firm that came to prominence through the Facebook–Cambridge Analytica data scandal. It was founded in 2013, as a subsidiary of the private intelligence company and self-described "global election management agency" SCL Group by long-time SCL executives Nigel Oakes, Alexander Nix and Alexander Oakes, with Nix as CEO. Cambridge Analytica was hired by a variety of political actors, including the Trinidadian government in 2010 and the 2016 presidential campaigns of Ted Cruz and Donald Trump. The firm maintained offices in London, New York City, and Washington, D.C. The company closed operations in 2018 due to backlash from the scandal, although firms related to both Cambridge Analytica and its parent firm SCL still exist. == History == Cambridge Analytica was founded in 2013 as a subsidiary of the private intelligence company SCL Group, which describes itself as providing "data, analytics and strategy to governments and military organisations worldwide". The company was part of "an international web of companies" headed by the London-based SCL Group. Cambridge Analytica (SCL USA) was incorporated in January 2013 with its registered office being in Westferry Circus, London and consisting of just one staff member, director and CEO Alexander Nix (also appointed in January 2015). Nix was also the director of nine similar companies sharing the same registered offices in London, including Firecrest technologies, Emerdata and six SCL Group companies including "SCL elections limited". Nigel Oakes, known as the former boyfriend of Lady Helen Windsor, had founded the predecessor SCL Group in the 1990s, and in 2005 Oakes established SCL Group together with his brother Alexander Oakes and Alexander Nix; SCL Group was the parent company of Cambridge Analytica. Former Conservative minister and MP Sir Geoffrey Pattie was the founding chairman of SCL; Lord Ivar Mountbatten also joined Oakes as a director of the company. As a result of the Facebook–Cambridge Analytica data scandal, Nix was removed as CEO and replaced by Julian Wheatland before the company closed. Several of the company's executives were Old Etonians. The company's owners included several of the Conservative Party's largest donors such as billionaire Vincent Tchenguiz, former British Conservative minister Jonathan Marland, Baron Marland and the family of American hedge fund manager Robert Mercer. The company combined misappropriation of digital assets, data mining, data brokerage, and data analysis with strategic communication during electoral processes. While its parent SCL had focused on influencing elections in developing countries since the 1990s, Cambridge Analytica focused more on the western world, including the United Kingdom and the United States; CEO Alexander Nix has said CA was involved in 44 U.S. political races in 2014. In 2015, CA performed data analysis services for Ted Cruz's presidential campaign. In 2016, CA worked for Donald Trump's presidential campaign as well as for Leave.EU (one of the organisations campaigning in the United Kingdom's referendum on European Union membership). CA's role in those campaigns has been controversial and is the subject of ongoing inquiries in both countries. Political scientists question CA's claims about the effectiveness of its methods of targeting voters. == Data scandal == In March 2018, media outlets broke news of Cambridge Analytica's business practices. The New York Times and The Observer reported that the company had acquired and used personal data about Facebook users from an external researcher who had told Facebook he was collecting it for academic purposes. Shortly afterwards, Channel 4 News aired undercover investigative videos showing Nix boasting about using prostitutes, bribery sting operations, and honey traps to discredit politicians on whom it had conducted opposition research, and saying that the company "ran all of (Donald Trump's) digital campaign". In response to the media reports, the Information Commissioner's Office (ICO) of the UK pursued a warrant to search the company's servers. Facebook banned Cambridge Analytica from advertising on its platform, saying that it had been deceived. On 23 March 2018, the British High Court granted the ICO a warrant to search Cambridge Analytica's London offices. As a result, Nix was suspended as CEO, and replaced by Julian Wheatland. The personal data of up to 87 million Facebook users were acquired via the 270,000 Facebook users who used a Facebook app created by Aleksandr Kogan called "This Is Your Digital Life". This was a personality profiling app and asked simple personality questions similar to other Facebook quizzes. Kogan was a scientist and psychologist, also being an employed lecturer for the University of Cambridge from 2012 to 2018. Alexander Nix claimed they had close to five thousand data points on each person who participated. They also gathered information through other data brokers ending with them acquiring millions of data points from American citizens. Kogan's app exploited a feature of Facebook's Graph API (version 1.0), which permitted any third-party app to access not only the app user's data, but also the full profile data of all of that user's Facebook friends, without those friends' knowledge or consent. This platform-wide design was available to all developers and was used by tens of thousands of apps; Facebook CEO Mark Zuckerberg later told the House Energy and Commerce Committee that the company was auditing "tens of thousands" of apps that had had access to large amounts of user data. Because the average Facebook user at the time had approximately 300 friends, the 270,000 users who installed Kogan's app yielded data on up to 87 million people. Facebook deprecated the friends-data API in April 2014 and shut it down entirely in April 2015, but data already collected by apps remained in developers' possession. Kogan passed this data to Cambridge Analytica, breaching Facebook's terms of service. On 1 May 2018, Cambridge Analytica and its parent company SCL filed for insolvency proceedings and closed operations. Alexander Tayler, a former director for Cambridge Analytica, was appointed director of Emerdata on 28 March 2018. Rebekah Mercer, Jennifer Mercer, Alexander Nix and Johnson Chun Shun Ko, who has links to American businessman Erik Prince, are in leadership positions at Emerdata. The Russo brothers are producing an upcoming film on Cambridge Analytica. In 2019 the Federal Trade Commission filed an administrative complaint against Cambridge Analytica for misuse of data. In 2020, the British Information Commissioner's Office closed a three-year inquiry into the company, concluded that Cambridge Analytica was "not involved" in the 2016 Brexit referendum and found no additional evidence for Russia's alleged interference during the campaign. US sensitive polling and election data, however, were passed to Russian Intelligence via a Cambridge Analytica contractor Sam Patten, Trump campaign manager Paul Manafort, and Russian agent Konstantin Kilimnik, who was indicted during the affair. Publicly, parent company SCL Group called itself a "global election management agency", Politico reported it was known for involvement "in military disinformation campaigns to social media branding and voter targeting". SCL gained work on a large number of campaigns for the US and UK governments' war on terror advancing their model of behavioral conflict during the 2000s. SCL's involvement in the political world has been primarily in the developing world where it has been used by the military and politicians to study and manipulate public opinion and political will. Slate writer Sharon Weinberger compared one of SCL's hypothetical test scenarios to fomenting a coup. Among the investors in Cambridge Analytica were some of the Conservative Party's largest donors such as billionaire Vincent Tchenguiz, former Conservative minister Jonathan Marland, Baron Marland, Roger Gabb, the family of American hedge fund manager Robert Mercer, and Steve Bannon. A minimum of 15 million dollars has been invested into the company by Mercer, according to The New York Times. Bannon's stake in the company was estimated at 1 to 5 million dollars, but he divested his holdings in April 2017 as required by his role as White House Chief Strategist. In March 2018, Jennifer Mercer and Rebekah Mercer became directors of Emerdata limited. In March 2018 it became public by Christopher Wylie, that Cambridge Analytica's first activities were founded on a data set, which its parent company SCL bought 2014 from a company named Global Science Research founded by Aleksandr Kogan and his team present across the world who worked as a psychologist at Cambridge. During Boris Johnson's tenure as foreign secretary, the Foreign Office sought advice from Cambridge Analytica and Boris Johnson had a meeting with Alexander N

    Read more →
  • G.9970

    G.9970

    G.9970 (also known as G.hnta) is a Recommendation developed by ITU-T that describes the generic transport architecture for home networks and their interfaces to a provider's access network. G.9970 was developed by Study Group 15, Question 1. G.9970 received Consent on December 12, 2008 and was Approved on January 13, 2009. == Relationship with G.hn == G.9970 (G.hnta) and G.9960 (G.hn) are two ITU-T Recommendations that address home networking in a complementary manner. While G.9970 addresses layer 3 (network layer) of the home network architecture, G.9960 addresses layers 1 (physical layer) and 2 (data link layer).

    Read more →
  • Evaluation of binary classifiers

    Evaluation of binary classifiers

    Evaluation of a binary classifier typically assigns a numerical value, or values, to a classifier that represent its accuracy. An example is error rate, which measures how frequently the classifier makes a mistake. There are many metrics that can be used; different fields have different preferences. For example, in medicine sensitivity and specificity are often used, while in computer science precision and recall are preferred. An important distinction is between metrics that are independent of the prevalence or skew (how often each class occurs in the population), and metrics that depend on the prevalence – both types are useful, but they have very different properties. Often, evaluation is used to compare two methods of classification, so that one can be adopted and the other discarded. Such comparisons are more directly achieved by a form of evaluation that results in a single unitary metric rather than a pair of metrics. == Contingency table == Given a data set, a classification (the output of a classifier on that set) gives two numbers: the number of positives and the number of negatives, which add up to the total size of the set. To evaluate a classifier, one compares its output to another reference classification – ideally a perfect classification, but in practice the output of another gold standard test – and cross tabulates the data into a 2×2 contingency table, comparing the two classifications. One then evaluates the classifier relative to the gold standard by computing summary statistics of these 4 numbers. Generally these statistics will be scale invariant (scaling all the numbers by the same factor does not change the output), to make them independent of population size, which is achieved by using ratios of homogeneous functions, most simply homogeneous linear or homogeneous quadratic functions. Say we test some people for the presence of a disease. Some of these people have the disease, and our test correctly says they are positive. They are called true positives (TP). Some have the disease, but the test incorrectly claims they don't. They are called false negatives (FN). Some don't have the disease, and the test says they don't – true negatives (TN). Finally, there might be healthy people who have a positive test result – false positives (FP). These can be arranged into a 2×2 contingency table (confusion matrix), conventionally with the test result on the vertical axis and the actual condition on the horizontal axis. These numbers can then be totaled, yielding both a grand total and marginal totals. Totaling the entire table, the number of true positives, false negatives, true negatives, and false positives add up to 100% of the set. Totaling the columns (adding vertically) the number of true positives and false positives add up to 100% of the test positives, and likewise for negatives. Totaling the rows (adding horizontally), the number of true positives and false negatives add up to 100% of the condition positives (conversely for negatives). The basic marginal ratio statistics are obtained by dividing the 2×2=4 values in the table by the marginal totals (either rows or columns), yielding 2 auxiliary 2×2 tables, for a total of 8 ratios. These ratios come in 4 complementary pairs, each pair summing to 1, and so each of these derived 2×2 tables can be summarized as a pair of 2 numbers, together with their complements. Further statistics can be obtained by taking ratios of these ratios, ratios of ratios, or more complicated functions. The contingency table and the most common derived ratios are summarized below; see sequel for details. Note that the rows correspond to the condition actually being positive or negative (or classified as such by the gold standard), as indicated by the color-coding, and the associated statistics are prevalence-independent, while the columns correspond to the test being positive or negative, and the associated statistics are prevalence-dependent. There are analogous likelihood ratios for prediction values, but these are less commonly used, and not depicted above. == Pairs of metrics == Often accuracy is evaluated with a pair of metrics composed in a standard pattern. === Sensitivity and specificity === The fundamental prevalence-independent statistics are sensitivity and specificity. Sensitivity or True Positive Rate (TPR), also known as recall, is the proportion of people that tested positive and are positive (True Positive, TP) of all the people that actually are positive (Condition Positive, CP = TP + FN). It can be seen as the probability that the test is positive given that the patient is sick. With higher sensitivity, fewer actual cases of disease go undetected (or, in the case of the factory quality control, fewer faulty products go to the market). Specificity (SPC) or True Negative Rate (TNR) is the proportion of people that tested negative and are negative (True Negative, TN) of all the people that actually are negative (Condition Negative, CN = TN + FP). As with sensitivity, it can be looked at as the probability that the test result is negative given that the patient is not sick. With higher specificity, fewer healthy people are labeled as sick (or, in the factory case, fewer good products are discarded). The relationship between sensitivity and specificity, as well as the performance of the classifier, can be visualized and studied using the Receiver Operating Characteristic (ROC) curve. In theory, sensitivity and specificity are independent in the sense that it is possible to achieve 100% in both (such as in the red/blue ball example given above). In more practical, less contrived instances, however, there is usually a trade-off, such that they are inversely proportional to one another to some extent. This is because we rarely measure the actual thing we would like to classify; rather, we generally measure an indicator of the thing we would like to classify, referred to as a surrogate marker. The reason why 100% is achievable in the ball example is because redness and blueness is determined by directly detecting redness and blueness. However, indicators are sometimes compromised, such as when non-indicators mimic indicators or when indicators are time-dependent, only becoming evident after a certain lag time. The following example of a pregnancy test will make use of such an indicator. Modern pregnancy tests do not use the pregnancy itself to determine pregnancy status; rather, human chorionic gonadotropin is used, or hCG, present in the urine of gravid females, as a surrogate marker to indicate that a woman is pregnant. Because hCG can also be produced by a tumor, the specificity of modern pregnancy tests cannot be 100% (because false positives are possible). Also, because hCG is present in the urine in such small concentrations after fertilization and early embryogenesis, the sensitivity of modern pregnancy tests cannot be 100% (because false negatives are possible). === Positive and negative predictive values === In addition to sensitivity and specificity, the performance of a binary classification test can be measured with positive predictive value (PPV), also known as precision, and negative predictive value (NPV). The positive prediction value answers the question "If the test result is positive, how well does that predict an actual presence of disease?". It is calculated as TP/(TP + FP); that is, it is the proportion of true positives out of all positive results. The negative prediction value is the same, but for negatives, naturally. ==== Impact of prevalence on predictive values ==== Prevalence has a significant impact on prediction values. As an example, suppose there is a test for a disease with 99% sensitivity and 99% specificity. If 2000 people are tested and the prevalence (in the sample) is 50%, 1000 of them are sick and 1000 of them are healthy. Thus about 990 true positives and 990 true negatives are likely, with 10 false positives and 10 false negatives. The positive and negative prediction values would be 99%, so there can be high confidence in the result. However, if the prevalence is only 5%, so of the 2000 people only 100 are really sick, then the prediction values change significantly. The likely result is 99 true positives, 1 false negative, 1881 true negatives and 19 false positives. Of the 19+99 people tested positive, only 99 really have the disease – that means, intuitively, that given that a patient's test result is positive, there is only 84% chance that they really have the disease. On the other hand, given that the patient's test result is negative, there is only 1 chance in 1882, or 0.05% probability, that the patient has the disease despite the test result. === Precision and recall === Precision and recall can be interpreted as (estimated) conditional probabilities: Precision is given by P ( C = P | C ^ = P ) {\displaystyle P(C=P|{\hat {C}}=P)} while recall is given by P ( C ^ = P | C = P ) {\displaystyle P({\hat {C}}=P|C=P)} , where C ^ {\

    Read more →
  • Hardware random number generator

    Hardware random number generator

    In computing, a hardware random number generator (HRNG), true random number generator (TRNG), non-deterministic random bit generator (NRBG), or physical random number generator is a device that generates random numbers from a physical process capable of producing entropy, unlike a pseudorandom number generator (PRNG) that utilizes a deterministic algorithm and non-physical nondeterministic random bit generators that do not include hardware dedicated to generation of entropy. Many natural phenomena generate low-level, statistically random "noise" signals, including thermal and shot noise, jitter and metastability of electronic circuits, Brownian motion, and atmospheric noise. Researchers also used the photoelectric effect, involving a beam splitter, other quantum phenomena, and even nuclear decay (due to practical considerations the latter, as well as the atmospheric noise, is not viable except for fairly restricted applications or online distribution services). While "classical" (non-quantum) phenomena are not truly random, an unpredictable physical system is usually acceptable as a source of randomness, so the qualifiers "true" and "physical" are used interchangeably. A hardware random number generator is expected to output near-perfect random numbers ("full entropy"). A physical process usually does not have this property, and a practical TRNG typically includes a few blocks: a noise source that implements the physical process producing the entropy. Usually this process is analog, so a digitizer is used to convert the output of the analog source into a binary representation; a conditioner (randomness extractor) that improves the quality of the random bits; health tests. TRNGs are mostly used in cryptographical algorithms that get completely broken if the random numbers have low entropy, so the testing functionality is usually included. Hardware random number generators generally produce only a limited number of random bits per second. In order to increase the available output data rate, they are often used to generate the "seed" for a faster PRNG. PRNG also helps with the noise source "anonymization" (whitening out the noise source identifying characteristics) and entropy extraction. With a proper PRNG algorithm selected (cryptographically secure pseudorandom number generator, CSPRNG), the combination can satisfy the requirements of Federal Information Processing Standards and Common Criteria standards. == Uses == Hardware random number generators can be used in any application that needs randomness. However, in many scientific applications additional cost and complexity of a TRNG (when compared with pseudo random number generators) provide no meaningful benefits. TRNGs have additional drawbacks for data science and statistical applications: impossibility to re-run a series of numbers unless they are stored, reliance on an analog physical entity can obscure the failure of the source. The TRNGs therefore are primarily used in the applications where their unpredictability and the impossibility to re-run the sequence of numbers are crucial to the success of the implementation: in cryptography and gambling machines. === Cryptography === The major use for hardware random number generators is in the field of data encryption, for example to create random cryptographic keys and nonces needed to encrypt and sign data. In addition to randomness, there are at least two additional requirements imposed by the cryptographic applications: forward secrecy guarantees that the knowledge of the past output and internal state of the device should not enable the attacker to predict future data; backward secrecy protects the "opposite direction": knowledge of the output and internal state in the future should not divulge the preceding data. A typical way to fulfill these requirements is to use a TRNG to seed a cryptographically secure pseudorandom number generator. == History == Physical devices were used to generate random numbers for thousands of years, primarily for gambling. Dice in particular have been known for more than 5000 years (found on locations in modern Iraq and Iran), and flipping a coin (thus producing a random bit) dates at least to the times of ancient Rome. The first documented use of a physical random number generator for scientific purposes was by Francis Galton (1890). He devised a way to sample a probability distribution using a common gambling die. In addition to the top digit, Galton also looked at the face of a die closest to him, thus creating 64 = 24 outcomes (about 4.6 bits of randomness). Kendall and Babington-Smith (1938) used a fast-rotating 10-sector disk that was illuminated by periodic bursts of light. The sampling was done by a human who wrote the number under the light beam onto a pad. The device was utilized to produce a 100,000-digit random number table (at the time such tables were used for statistical experiments, like PRNG nowadays). On 29 April 1947, the RAND Corporation began generating random digits with an "electronic roulette wheel", consisting of a random frequency pulse source of about 100,000 pulses per second gated once per second with a constant frequency pulse and fed into a five-bit binary counter. Douglas Aircraft built the equipment, implementing Cecil Hasting's suggestion (RAND P-113) for a noise source (most likely the well known behavior of the 6D4 miniature gas thyratron tube, when placed in a magnetic field). Twenty of the 32 possible counter values were mapped onto the 10 decimal digits and the other 12 counter values were discarded. The results of a long run from the RAND machine, filtered and tested, were converted into a table, which originally existed only as a deck of punched cards, but was later published in 1955 as a book, 50 rows of 50 digits on each page (A Million Random Digits with 100,000 Normal Deviates). The RAND table was a significant breakthrough in delivering random numbers because such a large and carefully prepared table had never before been available. It has been a useful source for simulations, modeling, and for deriving the arbitrary constants in cryptographic algorithms to demonstrate that the constants had not been selected maliciously ("nothing up my sleeve numbers"). Since the early 1950s, research into TRNGs has been highly active, with thousands of research works published and about 2000 patents granted by 2017. == Physical phenomena with random properties == Multiple different TRNG designs were proposed over time with a large variety of noise sources and digitization techniques ("harvesting"). However, practical considerations (size, power, cost, performance, robustness) dictate the following desirable traits: use of a commonly available inexpensive silicon process; exclusive use of digital design techniques. This allows an easier system-on-chip integration and enables the use of FPGAs; compact and low-power design. This discourages use of analog components (e.g., amplifiers); mathematical justification of the entropy collection mechanisms. Stipčević & Koç in 2014 classified the physical phenomena used to implement TRNG into four groups: electrical noise; free-running oscillators; chaos; quantum effects. === Electrical noise-based RNG === Noise-based RNGs generally follow the same outline: the source of a noise generator is fed into a comparator. If the voltage is above threshold, the comparator output is 1, otherwise 0. The random bit value is latched using a flip-flop. Sources of noise vary and include: Johnson–Nyquist noise ("thermal noise"); Zener noise; avalanche breakdown. The drawbacks of using noise sources for an RNG design are: noise levels are hard to control, they vary with environmental changes and device-to-device; calibration processes needed to ensure a guaranteed amount of entropy are time-consuming; noise levels are typically low, thus the design requires power-hungry amplifiers. The sensitivity of amplifier inputs enables manipulation by an attacker; circuitry located nearby generates a lot of non-random noise thus lowering the entropy; a proof of randomness is near-impossible as multiple interacting physical processes are involved. === Chaos-based RNG === The idea of chaos-based noise stems from the use of a complex system that is hard to characterize by observing its behavior over time. For example, lasers can be put into (undesirable in other applications) chaos mode with chaotically fluctuating power, with power detected using a photodiode and sampled by a comparator. The design can be quite small, as all photonics elements can be integrated on-chip. Stipčević & Koç characterize this technique as "most objectionable", mostly due to the fact that chaotic behavior is usually controlled by a differential equation and no new randomness is introduced, thus there is a possibility of the chaos-based TRNG producing a limited subset of possible output strings. === Free-running oscillators-based RNG === The TRNGs based on a free-running oscilla

    Read more →
  • G.9963

    G.9963

    Recommendation G.9963 is a home networking standard under development at the International Telecommunication Union standards sector, the ITU-T. It was begun in 2010 by ITU-T to add multiple-input and multiple-output (known as MIMO) capabilities to the G.hn standard originally defined in Recommendation G.9960. The standard is also known as "G.hn-mimo". As part of the family of G.hn standards, G.9963 was endorsed by the HomeGrid Forum.

    Read more →
  • Control break

    Control break

    In computer programming, a control break is a change in the value of one of the keys on which a file is sorted, which requires some extra processing. For example, with an input file sorted by post code, the number of items found in each postal district might need to be printed on a report, and a heading shown for the next district. Quite often there is a hierarchy of nested control breaks in a program, such as streets within districts within areas, with the need for a grand total at the end. Structured programming techniques have been developed to ensure correct processing of control breaks in languages such as COBOL and to ensure that conditions such as empty input files and sequence errors are handled properly. With fourth-generation languages such as SQL, the programming language should handle most of the details of control breaks automatically.

    Read more →
  • Capture the flag (cybersecurity)

    Capture the flag (cybersecurity)

    In computer security, Capture the Flag (CTF) is an exercise in which participants attempt to find text strings, called "flags", which are secretly hidden in purposefully vulnerable programs or websites. They can be used for both competitive or educational purposes. In two main variations of CTFs, participants either steal flags from other participants (attack/defense-style CTFs) or from organizers (jeopardy-style challenges). A mixed competition combines these two styles. Competitions can include hiding flags in hardware devices, they can be both online or in-person, and can be advanced or entry-level. The game is inspired by the traditional outdoor sport with the same name. CTFs are used as a tool for developing and refining cybersecurity skills, making them popular in both professional and academic settings. == Overview == Capture the Flag (CTF) is a cybersecurity competition that is used to test and develop computer security skills. It was first developed in 1996 at DEF CON, the largest cybersecurity conference in the United States which is hosted annually in Las Vegas, Nevada. The conference hosts a weekend of cybersecurity competitions, including their flagship CTF. Two popular CTF formats are jeopardy and attack-defense. Both formats test participant’s knowledge in cybersecurity, but differ in objective. In the Jeopardy format, participating teams must complete as many challenges of varying point values from a various categories such as cryptography, web exploitation, and reverse engineering. In the attack-defense format, competing teams must defend their vulnerable computer systems while attacking their opponent's systems. The exercise involves a diverse array of tasks, including exploitation and cracking passwords, but there is little evidence showing how these tasks translate into cybersecurity knowledge held by security experts. Recent research has shown that the Capture the Flag tasks mainly covered technical knowledge but lacked social topics like social engineering and awareness on cybersecurity. == Educational applications == CTFs have been shown to be an effective way to improve cybersecurity education through gamification. There are many examples of CTFs designed to teach cybersecurity skills to a wide variety of audiences, including PicoCTF, organized by the Carnegie Mellon CyLab, which is oriented towards high school students, and Arizona State University supported pwn.college. Beyond educational CTF events and resources, CTFs has been shown to be a highly effective way to instill cybersecurity concepts in the classroom. CTFs have been included in undergraduate computer science classes such as Introduction to Information Security at the National University of Singapore. CTFs are also popular in military academies. They are often included as part of the curriculum for cybersecurity courses, with the NSA organized Cyber Exercise culminating in a CTF competition between the US service academies and military colleges. == Competitions == Many CTF organizers register their competition with the CTFtime platform. This allows the tracking of the position of teams over time and across competitions. These include "Plaid Parliament of Pwning", "More Smoked Leet Chicken", "Dragon Sector", "dcua", "Eat, Sleep, Pwn, Repeat", "perfect blue", "organizers" and "Blue Water". Overall the "Plaid Parliament of Pwning" and "Dragon Sector" have both placed first worldwide the most with three times each. === Community competitions === Every year there are dozens of CTFs organized in a variety of formats. Many CTFs are associated with cybersecurity conferences such as DEF CON, various editions of SANS Institute's NetWars, HITCON, and BSides. The DEF CON CTF, an attack-defence CTF, is notable for being one of the oldest CTF competitions to exist, and has been variously referred to as the "World Series", "Superbowl", and "Olympics", of hacking by media outlets. The NYU Tandon hosted Cybersecurity Awareness Worldwide (CSAW) CTF is one of the largest open-entry competitions for students learning cybersecurity from around the world. In 2021, it hosted over 1200 teams during the qualification round. In addition to conference organized CTFs, many CTF clubs and teams organize CTF competitions. Many CTF clubs and teams are associated with universities, such as the CMU associated Plaid Parliament of Pwning, which hosts PlaidCTF, and the ASU associated Shellphish. Some community CTFs are online and open to all participants. The SANS Institute Holiday Hack Challenge and TryHackMe Advent of Cyber. === Government-supported competitions === Governmentally supported CTF competitions include the DARPA Cyber Grand Challenge and ENISA European Cybersecurity Challenge. In 2023, the US Space Force-sponsored Hack-a-Sat CTF competition included, for the first time, a live orbital satellite for participants to exploit. === Corporate-supported competitions === Corporations and other organizations sometimes use CTFs as a training or evaluation exercise, with benefits similar to those in educational settings. In addition to internal CTF exercises, some corporations such as Google and Tencent host publicly accessible CTF competitions. == In popular culture == In Mr. Robot, a qualification round for the DEF CON CTF competition is depicted in the season 3 opener "eps3.0_power-saver-mode.h". The logo for DEF CON can be seen in the background. In The Undeclared War, a CTF is depicted in the opening scene of the series as a recruitment exercise used by GCHQ. Go Go Squid!, a Chinese television series, is based around training for and competing in highly stylized CTF competitions .

    Read more →
  • Customer data management

    Customer data management

    Customer data management (CDM) is the ways in which businesses keep track of their customer information and survey their customer base in order to obtain feedback. CDM includes a range of software or cloud computing applications designed to give large organizations rapid and efficient access to customer data. Surveys and data can be centrally located and widely accessible within a company, as opposed to being warehoused in separate departments. CDM encompasses the collection, analysis, organizing, reporting and sharing of customer information throughout an organization. Businesses need a thorough understanding of their customers’ needs if they are to retain and increase their customer base. Efficient CDM solutions provide companies with the ability to deal instantly with customer issues and obtain immediate feedback. As a result, customer retention and customer satisfaction can show marked improvement. According to a study by Aberdeen Group, "above-average and best-in-class companies... attain greater than 20% annual improvement in retention rates, revenues, data accuracy and partner/customer satisfaction rates." == Customer data management and cloud computing == Cloud computing offers an attractive choice for CDM in many companies due to its accessibility and cost-effectiveness. Businesses can decide who, within their company, should have the ability to create, adjust, analyze or share customer information. In December 2010, 52% of Information Technology (IT) professionals worldwide were deploying, or planning to deploy, cloud computing; this percentage is far higher in many countries. == Background == Customer data management, as a term, was coined in the 1990s, pre-dating the alternative term enterprise feedback management (EFM). CDM was introduced as a software solution that would replace earlier disc-based or paper-based surveys and spreadsheet data. Initially, CDM solutions were marketed to businesses as software, which were specific to one company, and often to one department within that company. This was superseded by application service providers (ASPs) where software was hosted for end user organizations, thus avoiding the necessity for IT professionals to deploy and support software. However, ASPs with their single-tenancy architecture were, in turn, superseded by software as a service (SaaS), engineered for multi-tenancy. By 2007 SaaS applications, giving businesses on-demand access to their customer information, were rapidly gaining popularity compared with ASPs. Cloud computing now includes SaaS and many prominent CDM providers offer cloud-based applications to their clients. In recent years, there has been a push away from the term EFM, with many of those working in this area advocating the slightly updated use of CDM. The return to the term CDM is largely based on the greater need for clarity around the solutions offered by companies, and on the desire to retire terminology veering on techno-jargon that customers may have a hard time understanding.

    Read more →
  • Transmission security

    Transmission security

    Transmission security (TRANSEC) is the component of communications security (COMSEC) that results from the application of measures designed to protect transmissions from interception and exploitation by means other than cryptanalysis. Goals of transmission security include: Low probability of interception (LPI) Low probability of detection (LPD) Antijam — resistance to jamming (EPM or ECCM) This involves securing communication links from being compromised by techniques like jamming, eavesdropping, and signal interception. TRANSEC includes the use of frequency hopping, spread spectrum and the physical protection of communication links to obscure the patterns of transmission. It is particularly vital in military and government communication systems, where the security of transmitted data is critical to prevent adversaries from gathering intelligence or disrupting operations. TRANSEC is often implemented alongside COMSEC (Communications Security) to form a comprehensive approach to communication security. Methods used to achieve transmission security include frequency hopping and spread spectrum where the required pseudorandom sequence generation is controlled by a cryptographic algorithm and key. Such keys are known as transmission security keys (TSK). Modern U.S. and NATO TRANSEC-equipped radios include SINCGARS and HAVE QUICK.

    Read more →
  • Hike Messenger

    Hike Messenger

    Hike Messenger, aka Hike Sticker Chat, is a multifunctional Indian social media and social networking service offering instant messaging (IM) and Voice over IP (VoIP) services that was launched on December 11, 2012, by Kavin Bharti Mittal. Hike functioned through SMS. The app registration used a s‍tandard, one-time password (OTP) based authentication process. It was estimated to be worth $1.4 billion and had more than 100 million registered users. It went defunct on January 6, 2021, as they were unable to compete with global messaging platforms. The app re-appeared on google play store and apple app store on 19 September 2025. == History == Hike Messenger was launched on December 12, 2012, by its founder, Kavin Bharti Mittal. The majority of users were from India, with 80% under the age of 25. The company purchased startups like TinyMogul and Hoppr in 2015. After buying US-based free voice calling company Zip Phones, Hike provided VoIP calling services. On March 5, 2015, Hike launched the 'Great Indian Sticker Challenge' to create more stickers. In February 2017, Hike acquired the social networking app Pulse. From version 5.0, it became the first social messaging app to start a mobile payment service in India. The timeline feature came back after multiple user requests and the introduction of a personalized digital envelope called Blue Packets for sending monetary gifts through a built-in wallet. In 2017, the acquisition of Bengaluru-based startup Creo was announced to enable third-party developers to build services on top of the Hike platform. In 2018, Hike provided 1 billion users with internet access by targeting smaller cities. In January 2019, the company discarded the previous super-app approach, and began launching specialized apps for specific use-cases. In May 2019, Hike announced a collaboration with Indraprastha Institute of Information Technology, Delhi (IIIT-D) to develop a variety of machine learning models. In April 2019, the company launched its first standalone app, Hike Sticker Chat. A separate content app Hike News & Content was also launched. In 2021, Hike shut down its messaging service and shifted focus to gaming and community platforms. It launched Rush, a real-money gaming app featuring casual titles like ludo and carrom, which scaled to over 10 million users and generated more than US$500 million in gross revenue over four years. The company also introduced Vibe, an approval-only community app, as part of its pivot away from the super-app and messaging model. In September 2025, following the passage of the Promotion and Regulation of Online Gaming Act, which banned real-money gaming in India, Hike announced its complete closure. Founder Kavin Bharti Mittal stated that while the company had begun international expansion, scaling globally under the new regulatory regime would require a full reset that was not a viable use of capital or resources. On 19 September 2025, hike was relaunched on play store and app store by the name hike messenger. == Application == === Timeline of Features === On 15 April 2014, Hike introduced unlimited free SMS via a service called Hike Offline, through credits earned by users from regular chatting, as connectivity is still a major issue in many parts of India. In an attempt to appeal to its younger users, Hike introduced features that find resonance with the local market, such as Last Seen Privacy and localized sticker packs. It also introduced a two-way chat theme, allowing users to change the chat background for themselves and for their friends simultaneously. The app also started showing live Cricket scores in collaboration with Cricbuzz, as well as news, casual games, and social media feeds. Hike also added a file transfer service, allowing files less than 100MB of all formats, with a view on further increasing the size limit to 1 GB. With the launch of version 2.9.2.0 in January 2015, Hike implemented support for sending uncompressed images and a "quick upload" feature optimized for 2G speed. Later that month, Hike introduced a voice calling feature for its users. In September 2015, Hike launched free group call support with up to 100 people in a simultaneous conference call environment. In November 2016, Hike announced the launch of a feature called Stories that allows people to share real-life moments using fun live filters which automatically get deleted after 48 hours, and a new camera design with localized filters. Hike 4.0 launched on 26 August 2015 with the tagline 'Got a Gang? Get on Hike'. Hike 4.0 was an optimization-focused update, increasing the performance of the app on poor networks. It supported photo filters, doodles, and bite-sized news updates in under 100 characters. Hike launched News Feed with Hindi language support on 29 September 2015 to cater for the needs of the non-English population. Hike launched version 3.5 as the biggest update for Windows Phone 8.1 during December 2015 which changed the user interface for more simpler navigation, supported sending unlimited non-media files and documents of any format and better group admin settings. It also included ten brand new chat themes. Hike launched a microapp feature which was live for two days on 8 May 2016, as a Mother's Day special in which users could add images, quotes or messages as a token of love with customized e-cards and stickers on their timeline not only on Hike, but also on other platforms. On 26 October 2016, Hike Messenger rolled out the beta version of a video calling feature ahead of WhatsApp starting with the Android users which also lets recipients preview a video call before deciding to take it and is optimized to even work under 2G conditions. On 24 December 2016, Hike rolled out a short 20-second Video Stories feature that can be directly shared with friends or posted on a public timeline with different filters in collaboration with content creators with the same 48-hour time limit before being automatically deleted. The Stories feature continues to receive constant future updates to include and enable content, public story option, private user messaging and geo-tagging. In September 2017, Hike launched personalized sticker packs with 20,000+ graphical stickers for over 500 colleges that covered around 1,000 colleges by December 2018 across India which can be used across different geographies, and are highly customized for users with availability in 40+ local languages that support automatic sticker suggestions where the application suggests the best reply for any sticker message and also allows users to "nudge", a feature used to ping the receiver. Hike started supporting user comments on friend's posts, added a specific message reply function, a redesigned camera interface to support front flash and user mentions with the help of the @ symbol. In December, 2017, Hike launched group voting, bill splitting, checklists and event reminders for group chat that supports up to 1,000 users both on iOS and Android platform. Hike launched another feature called Hike Land, which is a virtual world with beta trial to start from March 2020, that will use Hike Moji where online users with their digital avatar can hang out with other users and will be built inside the Hike Sticker Chat application. It is mainly targeted but not restricted towards 16 to 21 years age group of people. Without unveiling much about Hike Land, a separate website has been created with option to reserve spots by giving details like name, gender and phone number that will link the user profile from the Hike Sticker Chat account though it is not a necessity. ==== Hike Direct ==== The Hike Direct feature is based on the technology known as WiFi Direct, which initially was also called WiFi P2P and got introduced to users by October 2015, which enables sharing of files such as music, apps, videos without a live internet connection within a 100-meter radius by creating a wireless network between two or more devices with a transfer speed of 100MB per minute. For privacy and security reasons, Hike didn't show the recipient's location or proximity and works only when two users are connected in the same room by adding one another into the contact list. ==== Hike Wallet ==== In June 2017, Hike announced the launch of version 5.0 with multiple new features like User Chat Themes, Night Mode and Magic Selfie. along with a built-in Wallet partnered with Yes Bank. This feature was first rolled out to Android users followed by iOS users at a later stage. Hike collaborated with Airtel Payment Bank to power its digital payment wallet by November 2017 where Hike users have access to Airtel Payments Bank's merchant & utility payment services and know your customer (KYC) infrastructure with 5 million transactions happening from services like recharge and P2P. Hike formed a partnership with Ola Cabs to bring a taxi and auto-rickshaw booking facility from 14 February 2018. With Hike Wallet facility users could now book bus tickets with 3

    Read more →
  • Public computer

    Public computer

    A public computer (or public access computer) is any of various computers available in public areas. Some places where public computers may be available are libraries, schools, or dedicated facilities run by government. Public computers share similar hardware and software components to personal computers, however, the role and function of a public access computer is entirely different. A public access computer is used by many different untrusted individuals throughout the course of the day. The computer must be locked down and secure against both intentional and unintentional abuse. Users typically do not have authority to install software or change settings. A personal computer, in contrast, is typically used by a single responsible user, who can customize the machine's behavior to their preferences. Public access computers are often provided with tools such as a PC reservation system to regulate access. The world's first public access computer center was the Marin Computer Center in California, co-founded by David and Annie Fox in 1977. == Kiosks == A kiosk is a special type of public computer using software and hardware modifications to provide services only about the place the kiosk is in. For example, a movie ticket kiosk can be found at a movie theater. These kiosks are usually in a secure browser with zero access to the desktop. Many of these kiosks may run Linux, however, ATMs, a kiosk designed for depositing money, often run Windows XP. == Public computers in the United States == === Library computers === In the United States and Canada, almost all public libraries have computers available for the use of patrons, though some libraries will impose a time limit on users to ensure others will get a turn and keep the library less busy. Users are often allowed to print documents that they have created using these computers, though sometimes for a small fee. ==== Privacy ==== Privacy is an important part of the public library institution, since the libraries entitle the public to intellectual freedom. Use of any computer or network may create records of users' activities that can jeopardize their privacy. It is possible for a patron to jeopardize their privacy if they do not delete cache, clear cookies, or documents from the public computer. In order for a member of the public to remain private on a computer, the American Library Association (ALA) has guidelines. These give patrons an idea of the right way to keep using public library computers. In their provision of services to library users, librarians have an ethical responsibility, expressed in the ALA Code of Ethics, to preserve users' right to privacy. A librarian is also responsible for giving users an understanding of private patron use and access. Libraries must ensure that users have the following rights when browsing on public computers: the computer automatically will clear a users history; libraries should display privacy screens so users do not see another patron's screen; updating software for effective safety measures; restoration data software to clear documents that users may have left on their computers and to combat possible malware; security practices; and making users aware of any possible monitoring of their browsing activities. Users can also view the Library Privacy Checklist for Public Access Computers and Networks to better understand what libraries strive for when protecting privacy. === School computers === The U.S. government has given money to many school boards to purchase computers for educational applications. Schools may have multiple computer labs, which contain these computers for students to use. There is usually Internet access on these machines, but some schools will put up a blocking service to limit the websites that students are able to access to only include educational resources, such as Google. In addition to controlling the content students are viewing, putting up these blocks can also help to keep the computers safe by preventing students from downloading malware and other threats. However, the effectiveness of such content filtering systems is questionable since it can easily be circumvented by using proxy websites, Virtual Private Networks, and for some weak security systems, merely knowing the IP address of the intended website is enough to bypass the filter. School computers often have advanced operating system security to prevent tech-savvy students from inflicting damage (i.e. the Windows Registry Editor and Task Manager, etc.) are disabled on Microsoft Windows machines. Schools with very advanced tech services may also install a locked down BIOS/firmware or make kernel-level changes to the operating system, precluding the possibility of unauthorized activity.

    Read more →
  • Data marketplace

    Data marketplace

    Data marketplace is an online platform for sharing and consuming data in the form of data assets or data products. Part of the data management stack, it aims to bring together data producers and data consumers (including business users and AI) in a single space, with the objective of increasing access to understandable, high-quality data. Included within its Data Marketplaces and Exchange (DME) category by Gartner, data marketplaces can provide data internally within an organization, externally with partners, or as open data. == Concept == Digitization has dramatically increased data volumes within organizations, with IDC predicting that by 2025 the world will contain 175 zettabytes of data. This has created a need to both manage this data and provide access to it to enable business intelligence and data analysis. However, data is often scattered within multiple systems (such as data warehouses and data lakes), and is in formats that are only understandable by technical experts, such as data scientists. According to IDC, 81% of IT leaders cite data silos as a major barrier to digital transformation. This means that data is not freely available to business users or external audiences such as partners or citizens, limiting its value, and holding back AI deployments. Data marketplaces solve this issue, providing seamless, self-service access to high-quality data in an understandable, secure and auditable manner. They break down data silos, reduce friction in data access, and enable a broader range of users, including non-technical profiles, to find, understand, and consume data autonomously. Data assets on the marketplace can be raw data, data visualizations or data products. Data marketplaces combine data management functions such as data governance with the user-friendly experience offered by e-commerce marketplaces in order to increase the usage of data. These include features such as powerful search engines, feedback, ratings, subscriptions and product description sheets. According to Gartner, data marketplaces provide infrastructure, transactional capabilities, and services for both consumers and providers of data assets. == History and timeline == Data marketplaces have evolved since they first emerged in terms of both their scope and usage. === 2000s === With the rise of the internet, data brokers began collecting, aggregating, distributing and selling personal, financial and marketing data to third parties online. Data marketplaces were deployed to monetize this data, making it discoverable and accessible to users, either through subscriptions or one-off purchases. At the same time, regulations, such as the US Open Government Initiative of 2009 and others around the world mandated greater transparency and data sharing with the public. Data sharing portals were created by public and government bodies to make this information available through self-service to all users. === 2010s === Due to the growth of big data and cloud platforms, cloud-based data exchange platforms emerged. These were offered by major infrastructure providers, and included Amazon Web Services (AWS) Data Exchange, Snowflake Data Marketplace, and the Google Cloud Platform. These platforms moved beyond simple data brokerage or open data by providing structured, catalogued data sharing between organizations. === 2020s === Driven by a need to increase internal data sharing with both business users and AI, organizations are now looking to adopt internal data marketplaces. These aim to democratize data consumption by providing seamless access for all employees and AI to trusted data, including data products, through an intuitive, e-commerce style experience. According to Gartner analyst Richa Jha, "by providing a single, governed platform for discovering, sharing, and scaling data products, data marketplaces drive productivity, collaboration, and ROI across the enterprise." == Data marketplaces within the overall data architecture == Data marketplaces provide a consumption and collaboration layer for data. That means they complement and integrate with other parts of the overall data architecture, including: === Data warehouses and data lakes === Data marketplaces connect to data sources, such as data warehouses or data lakes, to provide intuitive access to the data stored within them, enabling data to be shared and distributed to non-technical audiences. Access can be direct, with data and data products stored within the data marketplace or virtualized. === Data catalog === A data catalog provides a technical inventory of an organization's data estate. It collects technical information on all available data assets within an organization, based on metadata descriptions. This ensures traceability, and supports compliance and governance requirements. Unlike a data marketplace, a data catalog does not provide access to data, and is designed to be used by data professionals, rather than the business. This means it lacks an intuitive, understandable interface and is consequently not easily accessible by business users. === Data mesh === Data mesh is an architecture and framework for data management, first defined by Zhamak Dehghani in 2019. It aims to decentralize data ownership to delegate responsibility, empowering teams and focusing on delivering data to users in the form of self-service data products. The data marketplace is a central pillar of data mesh, providing intuitive access to these data products, and creating a collaboration space for data owners and data consumers. === Data product === Data products are high-value, consumable data assets that package high-quality data and associated tools to enable seamless usage by business users at scale. First defined by McKinsey in 2022, they have an identified owner, a service level agreement (SLA), and a reusability logic. == Core components of a data marketplace == A data marketplace typically includes specific core components: === E-commerce style interface === An e-commerce style experience that engages non-technical users, minimizes the need for training and builds confidence and trust in data. Look and feel should be customizable to incorporate corporate design guidelines to ensure consistency with other organizational applications. === Built-in data catalog === As in a standalone data catalog, this indexes all available data, based on metadata that includes type, source, owner, freshness, and quality level. === Discovery and search engine === This enables users to search, filter, explore and discover available data intuitively. As in an e-commerce marketplace, it should be intelligent, and provide relevant results based on natural language queries. === Access control and security management === Data marketplaces will contain data that needs to be protected under regulations such as the General Data Protection Regulation (GDPR) in Europe, the California Consumer Privacy Act (CCPA) in the United States, and sector-specific frameworks in industries such as finance and healthcare. To ensure both security and compliance while maximizing data consumption, the data marketplace should include granular access management and a full audit trail. === Semantic layer and business glossary === Different parts of the business are likely to use different terms to describe data. This leads to inconsistencies and an inability to share data across systems and teams. The semantic layer and business glossary standardize a shared vocabulary and common definitions of business indicators and concepts, providing a single language for data across the business and for AI agents. === Data governance mechanisms === These enforce corporate data governance policies, ensuring data traceability through data lineage, quality certification, usage monitoring, and continuous improvement through user feedback loops. === Collaboration features === As on an e-commerce website, a data marketplace should provide collaboration features that bring together data users and data owners. This includes the ability to rate data products, share use cases, and provide feedback to data owners, creating a community around data and supporting a data-driven culture. == Types of data marketplace == While they share the same underlying technology, data marketplaces can be deployed in three broad ways: === Internal data marketplaces === These bring together data from across an organization and make it available via self-service to employees from across the business. They aim to widen access to data and consequently to improve decision-making and reporting, increase performance and maximize efficiency. === Ecosystem data marketplaces === These extend sharing beyond a single organization, enabling multiple partners (public institutions, industry players, research bodies) to share and consume data within a governed framework. Data can be provided by all parties or simply by one organization and consumed by others. Ecosystem data marketplaces are particularly relevant in

    Read more →
  • Hybrid argument (cryptography)

    Hybrid argument (cryptography)

    In cryptography, the hybrid argument is a proof technique used to show that two distributions are computationally indistinguishable. == History == Hybrid arguments had their origin in a papers by Andrew Yao in 1982 and Shafi Goldwasser and Silvio Micali in 1983. == Formal description == Formally, to show two distributions D1 and D2 are computationally indistinguishable, we can define a sequence of hybrid distributions D1 := H0, H1, ..., Ht =: D2 where t is polynomial in the security parameter n. Define the advantage of any probabilistic efficient (polynomial-bounded time) algorithm A as A d v H i , H i + 1 d i s t ( A ) := | Pr [ x ← $ H i : A ( x ) = 1 ] − Pr [ x ← $ H i + 1 : A ( x ) = 1 ] | , {\displaystyle {\mathsf {Adv}}_{H_{i},H_{i+1}}^{\mathsf {dist}}(\mathbf {A} ):=\left|\Pr[x{\stackrel {\$}{\gets }}H_{i}:\mathbf {A} (x)=1]-\Pr[x{\stackrel {\$}{\gets }}H_{i+1}:\mathbf {A} (x)=1]\right|,} where the dollar symbol ($) denotes that we sample an element from the distribution at random. By triangle inequality, it is clear that for any probabilistic polynomial time algorithm A, A d v D 1 , D 2 d i s t ( A ) ≤ ∑ i = 0 t − 1 A d v H i , H i + 1 d i s t ( A ) . {\displaystyle {\mathsf {Adv}}_{D_{1},D_{2}}^{\mathsf {dist}}(\mathbf {A} )\leq \sum _{i=0}^{t-1}{\mathsf {Adv}}_{H_{i},H_{i+1}}^{\mathsf {dist}}(\mathbf {A} ).} Thus there must exist some k s.t. 0 ≤ k < t(n) and A d v H k , H k + 1 d i s t ( A ) ≥ A d v D 1 , D 2 d i s t ( A ) / t ( n ) . {\displaystyle {\mathsf {Adv}}_{H_{k},H_{k+1}}^{\mathsf {dist}}(\mathbf {A} )\geq {\mathsf {Adv}}_{D_{1},D_{2}}^{\mathsf {dist}}(\mathbf {A} )/t(n).} Since t is polynomial-bounded, for any such algorithm A, if we can show that it has a fixed negligible advantage function ε(n) between distributions Hi and Hi+1 for every i, so in particular, ϵ ( n ) ≥ A d v H k , H k + 1 d i s t ( A ) ≥ A d v D 1 , D 2 d i s t ( A ) / t ( n ) , {\displaystyle \epsilon (n)\geq {\mathsf {Adv}}_{H_{k},H_{k+1}}^{\mathsf {dist}}(\mathbf {A} )\geq {\mathsf {Adv}}_{D_{1},D_{2}}^{\mathsf {dist}}(\mathbf {A} )/t(n),} then it immediately follows that its advantage to distinguish the distributions D1 = H0 and D2 = Ht must also be negligible. == Applications == The hybrid argument is extensively used in cryptography. Some simple proofs using hybrid arguments are: If one cannot efficiently predict the next bit of the output of some number generator, then this generator is a pseudorandom number generator (PRG). We can securely expand a PRG with 1-bit output into a PRG with n-bit output.

    Read more →