AI Generator Book Cover

AI Generator Book Cover — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Hugging Face

    Hugging Face

    Hugging Face, Inc., is an American company based in New York City that develops computation tools for building applications using machine learning. Its transformers library built for natural language processing applications and its platform allow users to share machine learning models and datasets and showcase their work. == History == === Founding === The company was founded in 2016 by French entrepreneurs Clément Delangue, Julien Chaumond, and Thomas Wolf in New York City, originally as a company that developed a chatbot app targeted at teenagers. The company was named after the U+1F917 🤗 HUGGING FACE emoji. After open sourcing the model behind the chatbot, the company pivoted to focus on being a platform for machine learning. === AI boom === On April 28, 2021, the company launched the BigScience Research Workshop in collaboration with several other research groups to release an open large language model. In 2022, the workshop concluded with the announcement of BLOOM, a multilingual large language model with 176 billion parameters. In February 2023, the company announced partnership with Amazon Web Services (AWS) which would allow Hugging Face's products to be available to AWS customers to use them as the building blocks for their custom applications. The company also said the next generation of BLOOM will be run on Trainium, a proprietary machine learning chip created by AWS. In June 2024, the company announced, along with Meta and Scaleway, their launch of a new AI accelerator program for European startups. The initiative aimed to help startups integrate open foundation models into their products, accelerating the EU AI ecosystem. The program, based at STATION F in Paris, ran from September 2024 to February 2025. Selected startups received mentoring, and access to AI models and tools and Scaleway's computing power. On September 23, 2024, to further the International Decade of Indigenous Languages, Hugging Face teamed up with Meta and UNESCO to launch a new online language translator. It was built on Meta's No Language Left Behind open-source AI model, enabling free text translation across 200 languages, including many low-resource languages. In April 2025, Hugging Face announced that they acquired a humanoid robotics startup, Pollen Robotics, based in France and founded by Matthieu Lapeyre and Pierre Rouanet in 2016. In an X tweet, Delangue shared his vision to "make Artificial Intelligence robotics Open Source". === Cyberattacks === In early 2026, hackers hijacked the Hugging Face platform to launch Android-targeted attacks involving "powerful malware" which could completely take over a compromised target.

    Read more →
  • Menu hack

    Menu hack

    A menu hack is a non-standard method of ordering food, usually at fast-food or fast casual restaurants, that offers a different result than what is explicitly stated on a menu. Menu hacks may range from a simple alternate flavor to "gaming the system" in order to obtain more food than normal. They are often spread on social media platforms such as TikTok, and are more popular with Generation Z, which has been known to customize their orders more than previous generations. Hacks are sometimes officially added to the menu after their popularity grows. However, in some cases, they have been criticized for overburdening fast food employees with outlandish requests, sparking debate as to whether certain menu hacks are unethical. The list of all possible menu hacks is called a secret menu. == History == The term "menu hack" stems from hacker culture and its tradition of overcoming previously imposed limitations. However, the tradition of ordering from a secret menu dates back to the early days of fast food. "Animal style" fries, a word of mouth menu item ordered from In-N-Out since the 1960s, was rumored to have been created by local surfers. In the Information Age, the rise of social media gave influencers the ability to communicate unique food combinations to their followers, which proved to go viral easily. Design mistakes in food ordering apps also proved to be easily exploitable. In some cases, these hacks boosted the profile of brands on social media, while in others, they caused financial harm when the company was unprepared to handle the sudden influx of unusual orders. One restaurant chain notable for the phenomenon is Chipotle Mexican Grill. A viral hack from Alexis Frost, suggesting a quesadilla with fajita vegetables inside, dipped in Chipotle vinaigrette mixed with sour cream, obtained 1.9 million views on TikTok, overloading the chain's workers, who had to work harder to prepare more vegetables and vinaigrette. Some restaurants began to deny the dish to customers, forcing them to only order meat and cheese on quesadillas. The company ultimately left the dish on the menu, but urged customers to stop ordering it via social media. When it later officially added the Fajita Quesadilla to the menu, digital sales nearly doubled. A method to order nachos, which are not officially on the menu, was also noted by customers. Starbucks is also famous for menu hacks, including the Pink Drink, a "Barbiecore" beverage in which coconut milk replaced the water in the strawberry açaí refresher. After it went viral, the company made it a permanent menu item and distributed it bottled in grocery stores. == Controversy == Menu hacks have been subject to a growing backlash, with employees stating that they "dread" younger customers due to the proliferation of unusual orders. Service industry workers, already overworked and underpaid, have called the rise of menu hacks and their difficulty to make an additional reason to unionize and demand higher wages.

    Read more →
  • AS1 (networking)

    AS1 (networking)

    AS1 (Applicability Statement 1) is a specification about how to transport structured business-to-business data securely and reliably over the Internet. Security is achieved by using digital certificates and encryption. == AS1 technical overview == The AS1 protocol is based on SMTP and S/MIME. It was the first AS protocol developed and uses signing, encryption and MDN conventions. In other words: Files are sent as "attachments" in a specially coded SMIME email message Messages can be signed, but do not have to be Messages can be encrypted, but do not have to be Messages may request an MDN back if all went well, but do not have to request such a message If the original AS1 message requested an MDN... Upon the receipt of the message and its successful decryption or signature validation (as necessary) a "success" MDN will be sent back to the original sender. This MDN is typically signed but not encrypted. Upon the receipt and successful verification of the signature on the MDN, the original sender will "know" that the recipient got their message (this provides the "Non-repudiation" element of AS1) If there are any problems receiving or interpreting the original AS1 message, a "failed" MDN may be sent back. Like any other AS file transfer, AS1 file transfers typically require both sides of the exchange to trade X.509 certificates and specific "trading partner" names before any transfers can take place.

    Read more →
  • Social media use in hiring

    Social media use in hiring

    Social media use in hiring refers to the examination by employers of job applicants' (public) social media profiles as part of the hiring assessment. For example, the vast majority of Fortune 500 companies use social media as a tool to screen prospective employees and as a tool for talent acquisition. This practice raises ethical questions. Employers and recruiters note that they have access only to information that applicants choose to make public. Many Western-European countries restrict employer's use of social media in the workplace. States including Arkansas, California, Colorado, Illinois, Maryland, Michigan, Nevada, New Jersey, New Mexico, Utah, Washington, and Wisconsin protect applicants and employees from surrendering usernames and passwords for social media accounts. Use of social media has caused significant problems for some applicants who are active on social media. A 2013 survey of 17,000 young people in six countries found that one in ten people aged 16 to 34 claimed to have been rejected for a job because of social media activity. Social media services have been reported to affect deception in resumes. While these services do not affect deception frequency, it does increase deception about interests and hobbies. == Ethical implications == This issue raises many ethical questions that some consider an employer's right and others consider discrimination. As of 2016, except in the states of California, Maryland, and Illinois, there are no laws that prohibit employers from using social media profiles as a basis of whether or not someone should be hired. Title VII also prohibits discrimination during any aspect of employment including hiring or firing, recruitment, or testing. Social media has been integrating into the workplace, and this has led to conflicts within employees and employers.[107] Particularly, Facebook has been seen as a popular platform for employers to investigate in order to learn more about potential employees. This conflict first started in Maryland when an employer requested and received an employee's Facebook username and password. State lawmakers first introduced legislation in 2012 to prohibit employers from requesting passwords to personal social accounts in order to get a job or to keep a job. This led to Canada, Germany, the U.S. Congress and 11 U.S. states to pass or propose legislation that prevents employers' access to private social accounts of employees.[108] Many Western European countries have already implemented laws that restrict the regulation of social media in the workplace. States including Arkansas, California, Colorado, Illinois, Maryland, Michigan, Nevada, New Jersey, New Mexico, Utah, Washington, and Wisconsin have passed legislation that protects potential employees and current employees from employers that demand them to give forth their username or password for a social media account. Laws that forbid employers from disciplining an employee based on activity off the job on social media sites have also been put into act in states including California, Colorado, Connecticut, North Dakota, and New York. Several states have similar laws that protect students in colleges and universities from having to grant access to their social media accounts. Eight states have passed the law that prohibits post secondary institutions from demanding social media login information from any prospective or current students and privacy legislation has been introduced or is pending in at least 36 states as of July 2013. As of May 2014, legislation has been introduced and is in the process of pending in at least 28 states and has been enacted in Maine and Wisconsin. In addition, the National Labor Relations Board has been devoting a lot of their attention to attacking employer policies regarding social media that can discipline employees who seek to speak and post freely on social media sites. Use of social media by young people has caused significant problems for some applicants who are active on social media when they try to enter the job market. A survey of 17,000 young people in six countries in 2013 found that 1 in 10 people aged 16 to 34 have been rejected for a job because of online comments they made on social media websites. A 2014 survey of recruiters found that 93% of them check candidates' social media postings. Moreover, professor Stijn Baert of Ghent University conducted a field experiment in which fictitious job candidates applied for real job vacancies in Belgium. They were identical except in one respect: their Facebook profile photos. It was found that candidates with the most wholesome photos were a lot more likely to receive invitations for job interviews than those with the more controversial photos. In addition, Facebook profile photos had a greater impact on hiring decisions when candidates were highly educated. These cases have created some privacy implications as to whether or not companies should have the right to look at employee's Facebook profiles. In March 2012, Facebook decided they might take legal action against employers for gaining access to employee's profiles through their passwords. According to Facebook Chief Privacy Officer for policy, Erin Egan, the company has worked hard to give its users the tools to control who sees their information. He also said users shouldn't be forced to share private information and communications just to get a job. According to the network's Statement of Rights and Responsibilities, sharing or soliciting a password is a violation of Facebook policy. Employees may still give their password information out to get a job, but according to Erin Egan, Facebook will continue to do their part to protect the privacy and security of their users. == Impacts == Use of social media by young people has caused significant problems for some applicants who are active on social media when they try to enter the job market. A survey of 17,000 young people in six countries in 2013 found that 1 in 10 people aged 16 to 34 have been rejected for a job because of online comments they made on social media websites. A 2014 survey of recruiters found that 93% of them check candidates' social media postings. Moreover, in 2015 professor Stijn Baert of Ghent University conducted a field experiment in which fictitious job candidates applied for real job vacancies in Belgium. They were identical except in one respect: their Facebook profile photos. It was found that candidates with the most wholesome photos were a lot more likely to receive invitations for job interviews than those with the more controversial photos. In addition, Facebook profile photos had a greater impact on hiring decisions when candidates were highly educated. These cases have created some privacy implications as to whether or not companies should have the right to look at employee's Facebook profiles. In March 2012, Facebook decided they might take legal action against employers for gaining access to employee's profiles through their passwords. According to Facebook Chief Privacy Officer for policy, Erin Egan, the company has worked hard to give its users the tools to control who sees their information. He also said users shouldn't be forced to share private information and communications just to get a job. According to the network's Statement of Rights and Responsibilities, sharing or soliciting a password is a violation of Facebook policy. Employees may still give their password information out to get a job, but according to Erin Egan, Facebook will continue to do their part to protect the privacy and security of their users. == Policy Responses == 26 US states now have laws against an employer requiring a current or potential employee to give the employer their username and password.

    Read more →
  • Security awareness

    Security awareness

    Security awareness is the knowledge and attitude members of an organization possess regarding the protection of the physical, and especially informational, assets of that organization. However, it is very tricky to implement because organizations are not able to impose such awareness directly on employees as there are no ways to explicitly monitor people's behavior. That being said, the literature does suggest several ways that such security awareness could be improved. Many organizations require formal security awareness training for all workers when they join the organization and periodically thereafter, usually annually. Another main force that is found to have a strong correlation with employees' security awareness is managerial security participation. It also bridges security awareness with other organizational aspects. == Relationship between Security Awareness and Human Factors == Employees' behavior, cognitive biases, and decision-making processes influence the effectiveness of security measures. Research indicates that psychological factors, such as optimism bias, overconfidence, and habitual behaviors, can undermine security awareness initiatives. To address these challenges, organizations are increasingly using behavioral analytics and security nudges—subtle prompts like password reminders and phishing warnings—to encourage secure behavior. Human error remains the leading cause of cybersecurity incidents. A 2023 IBM Security report found that 95% of breaches are due to human mistakes, including falling for phishing emails, using weak passwords, and mishandling sensitive data. Organizations emphasize security awareness training as a key strategy to mitigate this risk. It is particularly important for leadership to foster a culture of cybersecurity and to provide targeted training to increase security awareness among all employees across the organization. == Coverage == Topics covered in security awareness training include: The nature of sensitive material and physical assets they may come in contact with, such as trade secrets, privacy concerns and government classified information Employee and contractor responsibilities in handling sensitive information, including review of employee nondisclosure agreements Requirements for proper handling of sensitive material in physical form, including marking, transmission, storage and destruction Proper methods for protecting sensitive information on computer systems, including password policy and use of two-factor authentication Other computer security concerns, including malware, phishing, social engineering, etc. Workplace security, including building access, wearing of security badges, reporting of Incidents, forbidden articles, etc. Consequences of failure to properly protect information, including potential loss of employment, economic consequences to the firm, damage to individuals whose private records are divulged, and possible civil and criminal penalties Security awareness means understanding that there is the potential for some people to deliberately or accidentally steal, damage, or misuse the data that is stored within a company's computer systems and throughout its organization. Therefore, it would be prudent to support the assets of the institution (information, physical, and personal) by trying to stop that from happening. According to the European Network and Information Security Agency, "Awareness of the risks and available safeguards is the first line of defence for the security of information systems and networks." "The focus of Security Awareness consultancy should be to achieve a long term shift in the attitude of employees towards security, whilst promoting a cultural and behavioural change within an organisation. Security policies should be viewed as key enablers for the organisation, not as a series of rules restricting the efficient working of your business." == Role of Gamification and Interactive Training == Modern security awareness programs increasingly utilize gamification, phishing simulations, and interactive learning modules. Studies have shown that engaging employees through serious games, reward systems, and real-world attack simulations improves retention and application of security practices. One example is phishing simulation training, where employees receive simulated phishing emails to test their ability to recognize threats. Research indicates that repeated exposure to such exercises leads to long-term improvements in security awareness. == Legislation and Compliance Requirements == Many industries mandate security awareness training to comply with regulations such as: General Data Protection Regulation (GDPR) – requires organizations to ensure data protection awareness among employees. Health Insurance Portability and Accountability Act (HIPAA) – mandates security awareness programs for healthcare providers. Payment Card Industry Data Security Standard (PCI-DSS) – enforces security training for businesses handling payment card information. == Measuring security awareness == In a 2016 study, researchers developed a method of measuring security awareness. Specifically they measured "understanding about circumventing security protocols, disrupting the intended functions of systems or collecting valuable information, and not getting caught" (p. 38). The researchers created a method that could distinguish between experts and novices by having people organize different security scenarios into groups. Experts will organize these scenarios based on centralized security themes where novices will organize the scenarios based on superficial themes. Security awareness is also assessed through real-time security metrics, such as tracking phishing click rates, password reuse tendencies, and policy adherence rates. Organizations are adopting continuous monitoring strategies to provide immediate feedback to employees about risky behavior and suggest corrective actions. == Evolving cyber threats and security awareness strategies == As cyber threats continue to evolve, security awareness programs must adapt to new attack vectors, such as AI-driven cyberattacks, deepfakes, and insider threats. ENISA's Threat Landscape report highlights the increasing prominence of these emerging threats, stressing the need for security measures that address both traditional attacks like ransomware and malware, as well as more sophisticated techniques such as Living Off Trusted Sites (LOTS) and advanced evasion methods used by cybercriminals.

    Read more →
  • Air Force Network

    Air Force Network

    Air Force Network (AFNet) is an Indian Air Force (IAF) owned, operated and managed digital information grid. The AFNet replaces the Indian Air Force's (IAF) old communication network set-up using the tropo-scatter technology of the 1950s making it a true net-centric combat force. The IAF project is part of the overall mission to network all three services; The Indian Army, The Indian Navy and The Indian Air Force. The former Defence Minister AK Antony inaugurated the IAF's the AFNET on 14 September 2010 dedicating it to the people of India, for their direct or indirect participation in the communication revolution. == Background == Armed Forces in India has been using troposcatters as primary means of military communications since the 1950s, thereby occupying huge and expensive 2G and 3G spectrums which otherwise could have been used for expanding and de-clogging the civilian wireless communication network. The rapid expansion of civilian mobile telephony leading to need for larger bandwidth for wireless communication and commercial need to operate the 3G network necessitated the Government of India to have the Indian Armed Forces vacate the spectrum occupied by them. Thus the government of India through Department of Telecommunication (DoT) started a project called "Network for Spectrum" to set up a fiber optics network for the exclusive use of Indian Armed Forces in exchange for spectrum being released by the Defence Forces. The aim of 'Network for Spectrum' being twofold - to facilitate the growth of national tele-density on the one hand, and ensuring modernization of defence communications with the state-of-the-art communication infrastructure, and to support net-centric military operations. The Department of Telecom and the Ministry of Defence signed the memorandum of understanding for vacating the spectrum and setting up dedicated network for the use of defence forces. In this MoU, DoT agreed to laying of 40,000 route kilometres of optical fibre cable connecting 219 Army stations, 33 Navy stations and 162 points for the Air Force. It further agreed to setting up an exclusive defence band and Defence Interest Zone along 100 km of the international border, where spectrum will be reserved only for use by the Armed Forces. The total cost of implementing "Network for Spectrum" project is estimated to be ₹ 10,000 crores. AFNet is Indian Air Force component of Digital Information Grid under "Network for Spectrum" project and the AFNet has been extended and connected to the Digital Information Grid Project under implementation for the Indian Navy and the Indian Army on 2015. == Project Origin == The Air Force Network (AFNet) had been developed by the Indian Air Force at a cost of ₹1,077 crore (US$235.53 million) in collaboration with HCL Technologies and Bharat Sanchar Nigam Limited. It will replace the Air Force's more than half-a-century-old telecom network. This project is part of the defence ministry's initiative to digitize the communication systems of the three armed forces under "Network for Spectrum" initiative to improve coordination among themselves and other Military and Strategic Institution. IAF was the first to complete this gigabyte digital information grid implemented under the AFNet project. AFNet will be connected and extended to a Unified Digital Grid encompassing all the legs of Indian Armed Forces. The then defence minister, A. K. Antony, inaugurated the AFNet, IAF's gigabyte digital information grid. The grid is aimed at improving the network-centric warfare capability of the Air Force. The event also saw the presence of other personalities including the then Minister of Communication & IT, A. Raja; the Marshal of the Air Force, Arjan Singh; the Chief of the Air Staff, the Chief of the Army Staff and other officials from the three services and members of the Industry. The event also featured a practice interception of a simulated aerial target by a MiG-29 which took off from an airbase in the Punjab sector using the AFNet capabilities. Further capabilities in line with network centric warfare were also demonstrated. This included sharing information, videos and pictures by operational assets and platforms like UAVs and AWACS to decision-makers who are several hundred kilometres apart. == Technology, Design & Structure == AFNet incorporates the latest traffic transportation technology in form of Internet Protocol (IP) packets over the network using Multiprotocol Label Switching (MPLS). A large Voice over Internet Protocol (VoIP) layer with stringent quality of service enforcement will facilitate robust, high quality voice, video and conferencing solutions. AFNet will prove to be an effective force multiplier for intelligence analysis, mission planning and control, post-mission feedback and related activities like maintenance, logistics and administration. A comprehensive design with multi-layer security precautions for “Defence in Depth” have been planned by incorporating encryption technologies, Intrusion Prevention Systems to ensure the resistance of the IT system against information manipulation and eavesdropping. The network is secured with a host of advanced state-of-the-art encryption technologies. It is designed for high reliability with redundancy built into the network design itself. The AFNet is also capable of transmitting video from unmanned surveillance aircraft (UAV), pictures from airborne warning and control systems (AWACS) to decision makers on the ground and providing intelligence inputs from remote areas. The AFNet is also expected to facilitate accelerated economic growth by providing radio frequency spectrum for telecommunication purposes. AFNET will be the largest Multi-protocol Label Switching (MPLS) network in the defence segment. == Demonstration == At the AFNet launch, the IAF showcased a practice interception of simulated enemy targets by a pair of Mig-29 fighter aircraft airborne from an advanced airbase in the Punjab sector using the gigabyte digital information grid. During the AFNet-assisted operations, the Indian fighter jets neutralised intruding targets in the western sector, which was played out live on the giant screens at the Air Force auditorium offering a glimpse of the harnessed potential of the system. The final orders for engaging the enemy targets were issued live by Antony, whose queries about how the operation went were responded to by the pilot as "excellent". Various other functionalities contributing towards Network Centric Warfare were also showcased. These consisted of facilitating video from Unmanned Aerial Vehicle (UAV), pictures from an AWACS aircraft to the decision-makers on ground sitting hundreds of kilometres away, providing intelligence inputs from far-flung areas at central locations seamlessly. This was possible mainly because of the robust networking platform provided by AFNet. == Integrated Air Command and Control System == Integrated Air Command and Control System (IACCS) is an automated command and control system for air defence operated by the Indian Air Force. IACCS operations rides the AFNET backbone integrating all ground-based and airborne sensors, air defense weapon systems and command and control (C2) nodes. Subsequent integration with other services networks and civil radars will provide an integrated Air Situation Picture to operators to carry out AD role. The project was envisaged in 1995 following the Purulia arms drop case and was a part of IAF’s first Air Power Doctrinal manual issued in the 2000s, later revised in 2022. The first node in the western sectors had been operationalised by September 2010. The first five nodes located in the western and south western sectors were commissioned in 2011. The Air Force was preparing to seek clearance for five further nodes which would cover the rest of the nation including the island territories. Through the IACCS, IAF will connect all of its space, air and ground assets quickly, for total awareness of a region. This will offer connectivity for all the ground platforms and airborne platforms (including AEW&C), as a part of the network centricity of IAF. The IACCS also facilitates real-time transport of images, data and voice, amongst satellites, aircraft and ground stations. By 2018, five IACCS nodes had been established including Barnala (Punjab), Wadsar (Gujarat), Aya Nagar (Delhi), Jodhpur (Rajasthan) and Ambala (Haryana). Following this, under Phase-II, 4 additional nodes and 10 sub-nodes are to be set up. The major nodes will be established in the Eastern, Central, Southern and Andaman and Nicobar sectors. The second phase will cost ₹8,000 crore (equivalent to ₹110 billion or US$1.1 billion in 2023). IACCS successfully integrated all operating radars, including its own, the Army's, and civilian ones, in 2023. This enabled the autonomous firing response capability to take down incoming missiles, aircraft, and UAVs. The Akashteer system of the Indian Army is being integrated with the IACCS

    Read more →
  • G.9970

    G.9970

    G.9970 (also known as G.hnta) is a Recommendation developed by ITU-T that describes the generic transport architecture for home networks and their interfaces to a provider's access network. G.9970 was developed by Study Group 15, Question 1. G.9970 received Consent on December 12, 2008 and was Approved on January 13, 2009. == Relationship with G.hn == G.9970 (G.hnta) and G.9960 (G.hn) are two ITU-T Recommendations that address home networking in a complementary manner. While G.9970 addresses layer 3 (network layer) of the home network architecture, G.9960 addresses layers 1 (physical layer) and 2 (data link layer).

    Read more →
  • ESign (India)

    ESign (India)

    Aadhaar eSign is an online electronic signature service in India to facilitate an Aadhaar holder to digitally sign a document. The signature service is facilitated by authenticating the Aadhaar holder via the Aadhaar-based e-KYC (electronic Know Your Customer) service. To eSign a document, one has to have an Aadhaar card and a mobile number registered with Aadhaar. With these two things, an Indian citizen can sign a document remotely without being physically present. == Procedure == The notification issued by Government of India in this regard stipulates the following procedure for the e-authentication using Aadhaar e-KYC services. Authentication of an electronic record by e-authentication technique, which shall be done by the applicable use of e-authentication, hash function, and asymmetric cryptosystem techniques, leading to issuance of digital signature certificate by Certifying Authority, a trusted third party service by subscriber's key pair generation, storing of the key pairs on hardware security module and creation of digital signature provided that the trusted third party shall be offered by the certifying authority (the trusted third party shall send application form and certificate signing request to the Certifying Authority for issuing a digital signature certificate to the subscriber), issuance of digital signature certificate by Certifying Authority shall be based on e-authentication, particulars given in the prescribed format, digitally signed verified information from Aadhaar e-KYC services and electronic consent of digital signature certificate applicant, the manner and requirements for e-authentication shall be as issued by the Controller from time to time, the security procedure for creating the subscriber's key pair shall be in accordance with the e-authentication guidelines issued by the Controller, the standards referred to in rule 6 of the Information Technology (Certifying Authorities) Rules, 2000 shall be complied with, in so far as they relate to the certification function of public key of Digital Signature Certificate applicant, and the manner in which information is authenticated by means of digital signature shall comply with the standards specified in rule 6 of the Information Technology (Certifying Authorities) Rules, 2000 in so far as they relate to the creation, storage and transmission of Digital Signature. == eSign Service Providers == Organisations and individuals seeking to obtain the eSigning Service can utilize the services of various service providers. There are empanelled service providers with whom organisations can register as an Application Service Prover after submitting the requisite documents, getting UAT access, building the application around the service and going through an IT Audit by an CERT-IN empanelled auditor. However, the process of registering as an Application Service Provider is cumbersome, and requires huge investments of time, money and resources in complying with the regulations and building a suitable application. Most organisations prefer using services of plug-n-play gateway providers who take the responsibility of complying with the regulations, hence simplifying the process for the market.

    Read more →
  • Evolvability (computer science)

    Evolvability (computer science)

    The term evolvability is a framework of computational learning introduced by Leslie Valiant in his paper of the same name. The aim of this theory is to model biological evolution and categorize which types of mechanisms are evolvable. Evolution is an extension of PAC learning and learning from statistical queries. == General framework == Let F n {\displaystyle F_{n}\,} and R n {\displaystyle R_{n}\,} be collections of functions on n {\displaystyle n\,} variables. Given an ideal function f ∈ F n {\displaystyle f\in F_{n}} , the goal is to find by local search a representation r ∈ R n {\displaystyle r\in R_{n}} that closely approximates f {\displaystyle f\,} . This closeness is measured by the performance Perf ⁡ ( f , r ) {\displaystyle \operatorname {Perf} (f,r)} of r {\displaystyle r\,} with respect to f {\displaystyle f\,} . As is the case in the biological world, there is a difference between genotype and phenotype. In general, there can be multiple representations (genotypes) that correspond to the same function (phenotype). That is, for some r , r ′ ∈ R n {\displaystyle r,r'\in R_{n}} , with r ≠ r ′ {\displaystyle r\neq r'\,} , still r ( x ) = r ′ ( x ) {\displaystyle r(x)=r'(x)\,} for all x ∈ X n {\displaystyle x\in X_{n}} . However, this need not be the case. The goal then, is to find a representation that closely matches the phenotype of the ideal function, and the spirit of the local search is to allow only small changes in the genotype. Let the neighborhood N ( r ) {\displaystyle N(r)\,} of a representation r {\displaystyle r\,} be the set of possible mutations of r {\displaystyle r\,} . For simplicity, consider Boolean functions on X n = { − 1 , 1 } n {\displaystyle X_{n}=\{-1,1\}^{n}\,} , and let D n {\displaystyle D_{n}\,} be a probability distribution on X n {\displaystyle X_{n}\,} . Define the performance in terms of this. Specifically, Perf ⁡ ( f , r ) = ∑ x ∈ X n f ( x ) r ( x ) D n ( x ) . {\displaystyle \operatorname {Perf} (f,r)=\sum _{x\in X_{n}}f(x)r(x)D_{n}(x).} Note that Perf ⁡ ( f , r ) = Prob ⁡ ( f ( x ) = r ( x ) ) − Prob ⁡ ( f ( x ) ≠ r ( x ) ) . {\displaystyle \operatorname {Perf} (f,r)=\operatorname {Prob} (f(x)=r(x))-\operatorname {Prob} (f(x)\neq r(x)).} In general, for non-Boolean functions, the performance will not correspond directly to the probability that the functions agree, although it will have some relationship. Throughout an organism's life, it will only experience a limited number of environments, so its performance cannot be determined exactly. The empirical performance is defined by Perf s ⁡ ( f , r ) = 1 s ∑ x ∈ S f ( x ) r ( x ) , {\displaystyle \operatorname {Perf} _{s}(f,r)={\frac {1}{s}}\sum _{x\in S}f(x)r(x),} where S {\displaystyle S\,} is a multiset of s {\displaystyle s\,} independent selections from X n {\displaystyle X_{n}\,} according to D n {\displaystyle D_{n}\,} . If s {\displaystyle s\,} is large enough, evidently Perf s ⁡ ( f , r ) {\displaystyle \operatorname {Perf} _{s}(f,r)} will be close to the actual performance Perf ⁡ ( f , r ) {\displaystyle \operatorname {Perf} (f,r)} . Given an ideal function f ∈ F n {\displaystyle f\in F_{n}} , initial representation r ∈ R n {\displaystyle r\in R_{n}} , sample size s {\displaystyle s\,} , and tolerance t {\displaystyle t\,} , the mutator Mut ⁡ ( f , r , s , t ) {\displaystyle \operatorname {Mut} (f,r,s,t)} is a random variable defined as follows. Each r ′ ∈ N ( r ) {\displaystyle r'\in N(r)} is classified as beneficial, neutral, or deleterious, depending on its empirical performance. Specifically, r ′ {\displaystyle r'\,} is a beneficial mutation if Perf s ⁡ ( f , r ′ ) − Perf s ⁡ ( f , r ) ≥ t {\displaystyle \operatorname {Perf} _{s}(f,r')-\operatorname {Perf} _{s}(f,r)\geq t} ; r ′ {\displaystyle r'\,} is a neutral mutation if − t < Perf s ⁡ ( f , r ′ ) − Perf s ⁡ ( f , r ) < t {\displaystyle -t<\operatorname {Perf} _{s}(f,r')-\operatorname {Perf} _{s}(f,r) 0 {\displaystyle \epsilon >0\,} , for all ideal functions f ∈ F n {\displaystyle f\in F_{n}} and representations r 0 ∈ R n {\displaystyle r_{0}\in R_{n}} , with probability at least 1 − ϵ {\displaystyle 1-\epsilon \,} , Perf ⁡ ( f , r g ( n , 1 / ϵ ) ) ≥ 1 − ϵ , {\displaystyle \operatorname {Perf} (f,r_{g(n,1/\epsilon )})\geq 1-\epsilon ,} where the sizes of neighborhoods N ( r ) {\displaystyle N(r)\,} for r ∈ R n {\displaystyle r\in R_{n}\,} are at most p ( n , 1 / ϵ ) {\displaystyle p(n,1/\epsilon )\,} , the sample size is s ( n , 1 / ϵ ) {\displaystyle s(n,1/\epsilon )\,} , the tolerance is t ( 1 / n , ϵ ) {\displaystyle t(1/n,\epsilon )\,} , and the generation size is g ( n , 1 / ϵ ) {\displaystyle g(n,1/\epsilon )\,} . F {\displaystyle F\,} is evolvable over D {\displaystyle D\,} if it is evolvable by some R {\displaystyle R\,} over D {\displaystyle D\,} . F {\displaystyle F\,} is evolvable if it is evolvable over all distributions D {\displaystyle D\,} . == Results == The class of conjunctions and the class of disjunctions are evolvable over the uniform distribution for short conjunctions and disjunctions, respectively. The class of parity functions (which evaluate to the parity of the number of true literals in a given subset of literals) are not evolvable, even for the uniform distribution. Evolvability implies PAC learnability.

    Read more →
  • Branch number

    Branch number

    In cryptography, the branch number is a numerical value that characterizes the amount of diffusion introduced by a vectorial Boolean function F that maps an input vector a to output vector F ( a ) {\displaystyle F(a)} . For the (usual) case of a linear F the value of the differential branch number is produced by: applying nonzero values of a (i.e., values that have at least one non-zero component of the vector) to the input of F; calculating for each input value a the Hamming weight W {\displaystyle W} (number of nonzero components), and adding weights W ( a ) {\displaystyle W(a)} and W ( F ( a ) ) {\displaystyle W(F(a))} together; selecting the smallest combined weight across for all nonzero input values: B d ( F ) = min a ≠ 0 ( W ( a ) + W ( F ( a ) ) ) {\displaystyle B_{d}(F)={\underset {a\neq 0}{\min }}(W(a)+W(F(a)))} . If both a and F ( a ) {\displaystyle F(a)} have s components, the result is obviously limited on the high side by the value s + 1 {\displaystyle s+1} (this "perfect" result is achieved when any single nonzero component in a makes all components of F ( a ) {\displaystyle F(a)} to be non-zero). A high branch number suggests higher resistance to the differential cryptanalysis: the small variations of input will produce large changes on the output and in order to obtain small variations of the output, large changes of the input value will be required. The term was introduced by Daemen and Rijmen in early 2000s and quickly became a typical tool to assess the diffusion properties of the transformations. == Mathematics == The branch number concept is not limited to the linear transformations, Daemen and Rijmen provided two general metrics: differential branch number, where the minimum is obtained over inputs of F that are constructed by independently sweeping all the values of two nonzero and unequal vectors a, b ( ⊕ {\displaystyle \oplus } is a component-by-component exclusive-or): B d ( F ) = min a ≠ b ( W ( a ⊕ b ) + W ( F ( a ) ⊕ F ( b ) ) {\displaystyle B_{d}(F)={\underset {a\neq b}{\min }}(W(a\oplus b)+W(F(a)\oplus F(b))} ; for linear branch number, the independent candidates α {\displaystyle \alpha } and β {\displaystyle \beta } are independently swept; they should be nonzero and correlated with respect to F (the L A T ( α , β ) {\displaystyle LAT(\alpha ,\beta )} coefficient of the linear approximation table of F should be nonzero): B l ( F ) = min α ≠ 0 , β , L A T ( α , β ) ≠ 0 ( W ( α ) + W ( β ) ) {\displaystyle B_{l}(F)={\underset {\alpha \neq 0,\beta ,LAT(\alpha ,\beta )\neq 0}{\min }}(W(\alpha )+W(\beta ))} .

    Read more →
  • Data definition specification

    Data definition specification

    In computing, a data definition specification (DDS) is a guideline to ensure comprehensive and consistent data definition. It represents the attributes required to quantify data definition. A comprehensive data definition specification encompasses enterprise data, the hierarchy of data management, prescribed guidance enforcement and criteria to determine compliance. == Overview == A data definition specification may be developed for any organization or specialized field, improving the quality of its products through consistency and transparency. It eliminates redundancy (since all contributing areas are referencing the same specification) and provides standardization and degrees of compliance, making it easier and more efficient to create, modify, verify, analyze and share information across the enterprise. To understand how a data definition specification works in an enterprise, we must look at the elements of a DDS. Writing data definitions, defining business terms (or rules) in the context of a particular environment, provides structure for an organization's data architecture. In developing these definitions, the words used must be traceable to clearly defined data. A data definition specification may be used in the following activities: Business intelligence Business process modeling Business rules management Data analysis and modeling Information architecture Metadata modeling Data mastering Report generation == Criteria == A data definition specification requires data definitions to be: Atomic – singular, describing only one concept. Commonly used and ambiguous terms should be defined. While a term refers to one concept, several words may be used in a term: File – A concept identifiable with one word File extension – A concept identifiable with more than one word Traceable – Mapped to a specific data element. In business, a term may be traced to an entity (for example, a customer) or an attribute (such as a customer's name). A term may be a value in a data set (such as gender), or designate the data set itself. Traceability indicates relationships in the data hierarchy. Consistent - Used in a standard syntax; if used in a specific context, the context is noted Accurate - Precise, correct and unambiguous, stating what the term is and is not Clear - Readily understood by the reader Complete - With the term, its description and contextual references Concise - To avoid circular references == Applications == === Enterprise data === A data definition specification was produced by the Open Mobile Alliance to document charging data. The document, the centralized catalog of data elements defined for interfaces, specifies the mapping of these data elements to protocol fields in the interfaces. Created for the exchange of financial data, Market Data Definition Language (MDDL) is an XML specification designed to enable the interchange of information necessary to account, to analyze, and to trade financial instruments of the world's markets. It defines an XML-based interchange format and common data dictionary on the fields needed to describe: (1) financial instruments, (2) corporate events affecting value and tradability, and (3) market-related, economic and industrial indicators. The principal function of MDDL is to allow entities to exchange market data by standardizing formats and definitions. MDDL provides a common format for market data so that it can be efficiently passed from one processing system to another and provides a common understanding of market data content by standardizing terminology and by normalizing the relationships of various data elements to one another ... From the user perspective, the goal of MDDL is to enable users to integrate data from multiple sources by standardizing both the input feeds used for data warehousing (i.e., define what's being provided by vendors) and the output methods by which client applications request the data (i.e., ensure compatibility on how to get data in and out of applications)." === Clinical submissions === The Clinical Data Interchange Standards Consortium, a global, multidisciplinary, non-profit organization, has established standards to support the acquisition, exchange, submission and archiving of clinical research data and metadata. CDISC standards are vendor-neutral, platform-independent and freely available from the CDISC website. The Case Report Tabulation Data Definition Specification (define.xml) draft version 2.0, the oldest data definition specification, is part of the evolution from the 1999 FDA electronic submission (eSub) guidance and electronic Common Technical Document (eCTD) documents specifying that a document describing the content and structure of included data be included in a submission. Define.xml was developed to automate the review process by generating a machine-readable data-definition document. Define.xml has standardized submissions to the Food and Drug Administration, reducing review times from over two years to several months. === Archival data === A data definition specification is the foundation of metadata for scientific data archiving. The Metadata Encoding and Transmission Standard (METS) uses one principle of a DDS: consistent use of key terms to catalog digital objects for global use. The METS schema is a flexible mechanism for encoding descriptive, administrative and structural metadata for a digital library object and expressing complex links between metadata, and can provide a useful standard for the exchange of digital-library objects between repositories. A similar effort is underway to preserve complex data associated with video-game archiving. Preserving Virtual Worlds attempted to address archival-format deficiencies, citing the lack of suitable documentation for interactive fiction and games at the bit level: specifically, the absence of "representation information" needed to map raw bits into higher-level data constructs. Preserving Virtual Worlds 2 is a research project expanding on initial efforts in this field.

    Read more →
  • SocialIQ

    SocialIQ

    Social IQ (formerly Soovox Inc.) was a San Diego-based influencer marketing platform that measured users' online social influence and connected them with brands for word-of-mouth marketing campaigns. The company was founded in 2009 by Akram Benmbarek and was headquartered in San Diego, California. == History == Akram Benmbarek, who had previously worked in technology finance at Advanced Equities Financial Corp and in wealth management at Morgan Stanley, Merrill Lynch, and UBS, founded the company in mid-2009 under the name Soovox. In October 2011, Benmbarek rebranded the company as SocialIQ. At that time, the company was seeking a Series A round of venture capital, having raised under $1 million in angel seed funding. == Similar metrics == Klout PeerIndex

    Read more →
  • Stochastic parrot

    Stochastic parrot

    In machine learning, the term stochastic parrot is a metaphor that frames large language models as systems that statistically mimic text without real understanding. The word "stochastic" – from the ancient Greek "στοχαστικός" (stokhastikos, 'based on guesswork') – is a term from probability theory meaning "randomly determined". The word "parrot" refers to parrots' ability to mimic human speech. The term was introduced in a 2021 paper on AI ethics titled "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" and authored by Timnit Gebru, Emily M. Bender, Angelina McMillan-Major, and Margaret Mitchell. The paper outlined possible risks associated with large language models (LLMs). In December 2020, it was the subject of a workplace dispute between Gebru (then co-leader of Google's Ethical Artificial Intelligence Team) and Google, which had requested the retraction of the paper. The incident culminated in Gebru's controversial departure from the company. The paper was later presented at the 2021 ACM Conference, and the term "stochastic parrot" has seen widespread use in academic research concerning generative AI and LLMs. The term has been interpreted negatively as an insult towards AI. == Background == Timnit Gebru is an AI ethics researcher, Emily M. Bender is a linguist specializing in computational linguistics, and Margaret Mitchell is a computer scientist specializing in algorithmic bias. Gebru had joined Google in 2018, where she co-led a team on the ethics of artificial intelligence with Mitchell. In late 2020, the paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜" was co-written by Gebru and five other researchers, four of whom were Google employees. The paper argues that large language models (LLMs) present significant risks such as environmental and financial costs, inscrutability leading to unknown dangerous biases, and potential for deception as LLMs do not understand the concepts underlying what they learn. The paper states that LLMs are "stitching together sequences of linguistic forms ... observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning." Therefore, they are labeled "stochastic parrots". === Dismissal of Gebru by Google === After the paper was submitted for consideration to the 2021 ACM Conference, Google requested that Gebru either retract the paper from the conference or remove the names of Google employees from it. Gebru refused to do so without further discussion, and emailed Google Research vice president Megan Kacholia that if the company could not explain the request for retraction and address other concerns regarding similar projects, she would plan to resign after a transition period, stating that they could "work on a last date". The following day, on December 2, 2020, Gebru received an email saying that Google was "accepting her resignation". Her abrupt firing sparked protests by Google employees and negative publicity for the company. == Usage == The phrase has been used by AI skeptics to signify that LLMs lack understanding of the meaning of their outputs. Sam Altman, CEO of OpenAI, used the term shortly after the release of ChatGPT in December 2022, tweeting "i am a stochastic parrot, and so r u". The term was nominated as the 2023 AI-related Word of the Year by the American Dialect Society. == Debate == Some LLMs, such as ChatGPT, have become capable of interacting with users in convincingly human-like conversations. The development of these new systems has deepened the discussion of the extent to which LLMs understand or are simply "parroting". According to machine learning researchers Lindholm, Wahlström, Lindsten, and Schön, the term "stochastic parrot" highlights two vital limitations of LLMs: LLMs are limited by the data they are trained on and are simply stochastically repeating contents of datasets. Because they are just making up outputs based on training data, LLMs do not understand if they are saying something incorrect or inappropriate. Lindholm et al. noted that, with poor quality datasets and other limitations, a learning machine might produce results that are "dangerously wrong". === Subjective experience === In the mind of a human being, words and language correspond to things one has experienced. For LLMs, according to proponents of the theory, words correspond only to other words and patterns of usage fed into their training data. Proponents of the idea of stochastic parrots thus conclude that statements about LLMs are due to "the human tendency to attribute meaning to text", and claim this occurs despite the LLMs not actually understanding language. === Fine-tuning === Kelsey Piper argued that the claim that LLMs are stochastic parrots or mere "next-token predictors" focuses on pre-training, ignoring that modern LLMs are also fine-tuned to follow instructions and to prefer accurate answers. === Hallucinations and mistakes === The tendency of LLMs to pass off false information as fact is held as support. Called hallucinations or confabulations, LLMs will occasionally synthesize information that matches some pattern. LLMs may fail to distinguish fact and fiction, which leads to the claim that they can't connect words to a comprehension of the world, as humans do. Furthermore, LLMs may fail to decipher complex or ambiguous grammar cases that rely on understanding the meaning of language. For example: The wet newspaper that fell down off the table is my favorite newspaper. But now that my favorite newspaper fired the editor I might not like reading it anymore. Can I replace 'my favorite newspaper' by 'the wet newspaper that fell down off the table' in the second sentence? GPT-4, an LLM released in March 2023, responded yes, not understanding that the meaning of "newspaper" is different in these two contexts; it is first an object and second an institution. === Benchmarks and experiments === One argument against the hypothesis that LLMs are stochastic parrot is their results on benchmarks for reasoning, common sense and language understanding. In 2023, some LLMs have shown good results on many language understanding tests, such as the Super General Language Understanding Evaluation (SuperGLUE). GPT-4 scored in the >90th-percentile on the Uniform Bar Examination and achieved 93% accuracy on the MATH benchmark of high-school Olympiad problems, results that exceed rote pattern-matching expectations. Such tests, and the smoothness of many LLM responses, help as many as 51% of AI professionals believe they can truly understand language with enough data, according to a 2022 survey. === Expert rebuttals === Some AI researchers dispute the notion that LLMs merely "parrot" their training data. Geoffrey Hinton, a pioneering figure in neural networks, counters that the metaphor misunderstands the prerequisite for accurate language prediction. He argues that "to predict the next word accurately, you have to understand the sentence", a view he presented on 60 Minutes in 2023. From this perspective, understanding is not an alternative to statistical prediction, but rather an emergent property required to perform it effectively at scale. Hinton also uses logical puzzles to demonstrate that LLMs actually understand language. A 2024 Scientific American investigation described a closed Berkeley workshop where state-of-the-art models solved novel tier-4 mathematics problems and produced coherent proofs, indicating reasoning abilities beyond memorization. The GPT-4 Technical Report showed human-level results on professional and academic exams (e.g., the Uniform Bar Exam and USMLE), challenging the "parrot" characterization. Anthropic conducted mechanistic interpretability research on Claude, using attribution graphs to identify circuits. The research showed how the LLM processes information via chains of fuzzy logical inference, and indicated an ability to plan ahead. They found that Claude 3.5 Haiku "employs remarkably general abstractions", forms "internally generated plans for its future outputs" and "works backwards from its longer-term goals". They noted that "The mechanisms of the model can apparently only be faithfully described using an overwhelmingly large causal graph." They also found that the model includes "mechanisms that could underlie a simple form of metacognition", in that it "thinks about" the level of its own knowledge before reaching its answer. === Interpretability === Another line of evidence against the 'stochastic parrot' claim comes from mechanistic interpretability, a research field dedicated to reverse-engineering LLMs to understand their internal workings. Rather than only observing the model's input-output behavior, these techniques probe the model's internal activations, which can be used to determine if they contain structured representations of the world. The goal is to investigate whether LLMs are merely manipulating surface statistics or if t

    Read more →
  • Strong cryptography

    Strong cryptography

    Strong cryptography or cryptographically strong are general terms used to designate the cryptographic algorithms that, when used correctly, provide a very high (usually insurmountable) level of protection against any eavesdropper, including the government agencies. There is no precise definition of the boundary line between the strong cryptography and (breakable) weak cryptography, as this border constantly shifts due to improvements in hardware and cryptanalysis techniques. These improvements eventually place the capabilities once available only to the NSA within the reach of a skilled individual, so in practice there are only two levels of cryptographic security, "cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files" (Bruce Schneier). The strong cryptography algorithms have high security strength, for practical purposes usually defined as a number of bits in the key. For example, the United States government, when dealing with export control of encryption, considered as of 1999 any implementation of the symmetric encryption algorithm with the key length above 56 bits or its public key equivalent to be strong and thus potentially a subject to the export licensing. To be strong, an algorithm needs to have a sufficiently long key and be free of known mathematical weaknesses, as exploitation of these effectively reduces the key size. At the beginning of the 21st century, the typical security strength of the strong symmetrical encryption algorithms is 128 bits (slightly lower values still can be strong, but usually there is little technical gain in using smaller key sizes). Demonstrating the resistance of any cryptographic scheme to attack is a complex matter, requiring extensive testing and reviews, preferably in a public forum. Good algorithms and protocols are required (similarly, good materials are required to construct a strong building), but good system design and implementation is needed as well: "it is possible to build a cryptographically weak system using strong algorithms and protocols" (just like the use of good materials in construction does not guarantee a solid structure). Many real-life systems turn out to be weak when the strong cryptography is not used properly, for example, random nonces are reused A successful attack might not even involve algorithm at all, for example, if the key is generated from a password, guessing a weak password is easy and does not depend on the strength of the cryptographic primitives. A user can become the weakest link in the overall picture, for example, by sharing passwords and hardware tokens with the colleagues. == Background == The level of expense required for strong cryptography originally restricted its use to the government and military agencies, until the middle of the 20th century the process of encryption required a lot of human labor and errors (preventing the decryption) were very common, so only a small share of written information could have been encrypted. US government, in particular, was able to keep a monopoly on the development and use of cryptography in the US into the 1960s. In the 1970, the increased availability of powerful computers and unclassified research breakthroughs (Data Encryption Standard, the Diffie-Hellman and RSA algorithms) made strong cryptography available for civilian use. Mid-1990s saw the worldwide proliferation of knowledge and tools for strong cryptography. By the 21st century the technical limitations were gone, although the majority of the communication were still unencrypted. At the same the cost of building and running systems with strong cryptography became roughly the same as the one for the weak cryptography. The use of computers changed the process of cryptanalysis, famously with Bletchley Park's Colossus. But just as the development of digital computers and electronics helped in cryptanalysis, it also made possible much more complex ciphers. It is typically the case that use of a quality cipher is very efficient, while breaking it requires an effort many orders of magnitude larger - making cryptanalysis so inefficient and impractical as to be effectively impossible. == Cryptographically strong algorithms == This term "cryptographically strong" is often used to describe an encryption algorithm, and implies, in comparison to some other algorithm (which is thus cryptographically weak), greater resistance to attack. But it can also be used to describe hashing and unique identifier and filename creation algorithms. See for example the description of the Microsoft .NET runtime library function Path.GetRandomFileName. In this usage, the term means "difficult to guess". An encryption algorithm is intended to be unbreakable (in which case it is as strong as it can ever be), but might be breakable (in which case it is as weak as it can ever be) so there is not, in principle, a continuum of strength as the idiom would seem to imply: Algorithm A is stronger than Algorithm B which is stronger than Algorithm C, and so on. The situation is made more complex, and less subsumable into a single strength metric, by the fact that there are many types of cryptanalytic attack and that any given algorithm is likely to force the attacker to do more work to break it when using one attack than another. There is only one known unbreakable cryptographic system, the one-time pad, which is not generally possible to use because of the difficulties involved in exchanging one-time pads without them being compromised. So any encryption algorithm can be compared to the perfect algorithm, the one-time pad. The usual sense in which this term is (loosely) used, is in reference to a particular attack, brute force key search — especially in explanations for newcomers to the field. Indeed, with this attack (always assuming keys to have been randomly chosen), there is a continuum of resistance depending on the length of the key used. But even so there are two major problems: many algorithms allow use of different length keys at different times, and any algorithm can forgo use of the full key length possible. Thus, Blowfish and RC5 are block cipher algorithms whose design specifically allowed for several key lengths, and who cannot therefore be said to have any particular strength with respect to brute force key search. Furthermore, US export regulations restrict key length for exportable cryptographic products and in several cases in the 1980s and 1990s (e.g., famously in the case of Lotus Notes' export approval) only partial keys were used, decreasing 'strength' against brute force attack for those (export) versions. More or less the same thing happened outside the US as well, as for example in the case of more than one of the cryptographic algorithms in the GSM cellular telephone standard. The term is commonly used to convey that some algorithm is suitable for some task in cryptography or information security, but also resists cryptanalysis and has no, or fewer, security weaknesses. Tasks are varied, and might include: generating randomness encrypting data providing a method to ensure data integrity Cryptographically strong would seem to mean that the described method has some kind of maturity, perhaps even approved for use against different kinds of systematic attacks in theory and/or practice. Indeed, that the method may resist those attacks long enough to protect the information carried (and what stands behind the information) for a useful length of time. But due to the complexity and subtlety of the field, neither is almost ever the case. Since such assurances are not actually available in real practice, sleight of hand in language which implies that they are will generally be misleading. There will always be uncertainty as advances (e.g., in cryptanalytic theory or merely affordable computer capacity) may reduce the effort needed to successfully use some attack method against an algorithm. In addition, actual use of cryptographic algorithms requires their encapsulation in a cryptosystem, and doing so often introduces vulnerabilities which are not due to faults in an algorithm. For example, essentially all algorithms require random choice of keys, and any cryptosystem which does not provide such keys will be subject to attack regardless of any attack resistant qualities of the encryption algorithm(s) used. == Legal issues == Widespread use of encryption increases the costs of surveillance, so the government policies aim to regulate the use of the strong cryptography. In the 2000s, the effect of encryption on the surveillance capabilities was limited by the ever-increasing share of communications going through the global social media platforms, that did not use the strong encryption and provided governments with the requested data. Murphy talks about a legislative balance that needs to be struck between the power of the government that are broad enough to be able to follow the qui

    Read more →
  • Locally recoverable code

    Locally recoverable code

    Locally recoverable codes are a family of error correction codes that were introduced first by D. S. Papailiopoulos and A. G. Dimakis and have been widely studied in information theory due to their applications related to distributive and cloud storage systems. An [ n , k , d , r ] q {\displaystyle [n,k,d,r]_{q}} LRC is an [ n , k , d ] q {\displaystyle [n,k,d]_{q}} linear code such that there is a function f i {\displaystyle f_{i}} that takes as input i {\displaystyle i} and a set of r {\displaystyle r} other coordinates of a codeword c = ( c 1 , … , c n ) ∈ C {\displaystyle c=(c_{1},\ldots ,c_{n})\in C} different from c i {\displaystyle c_{i}} , and outputs c i {\displaystyle c_{i}} . == Overview == Erasure-correcting codes, or simply erasure codes, for distributed and cloud storage systems, are becoming more and more popular as a result of the present spike in demand for cloud computing and storage services. This has inspired researchers in the fields of information and coding theory to investigate new facets of codes that are specifically suited for use with storage systems. It is well-known that LRC is a code that needs only a limited set of other symbols to be accessed in order to restore every symbol in a codeword. This idea is very important for distributed and cloud storage systems since the most common error case is when one storage node fails (erasure). The main objective is to recover as much data as possible from the fewest additional storage nodes in order to restore the node. Hence, Locally Recoverable Codes are crucial for such systems. The following definition of the LRC follows from the description above: an [ n , k , r ] {\displaystyle [n,k,r]} -Locally Recoverable Code (LRC) of length n {\displaystyle n} is a code that produces an n {\displaystyle n} -symbol codeword from k {\displaystyle k} information symbols, and for any symbol of the codeword, there exist at most r {\displaystyle r} other symbols such that the value of the symbol can be recovered from them. The locality parameter satisfies 1 ≤ r ≤ k {\displaystyle 1\leq r\leq k} because the entire codeword can be found by accessing k {\displaystyle k} symbols other than the erased symbol. Furthermore, Locally Recoverable Codes, having the minimum distance d {\displaystyle d} , can recover d − 1 {\displaystyle d-1} erasures. == Definition == Let C {\displaystyle C} be a [ n , k , d ] q {\displaystyle [n,k,d]_{q}} linear code. For i ∈ { 1 , … , n } {\displaystyle i\in \{1,\ldots ,n\}} , let us denote by r i {\displaystyle r_{i}} the minimum number of other coordinates we have to look at to recover an erasure in coordinate i {\displaystyle i} . The number r i {\displaystyle r_{i}} is said to be the locality of the i {\displaystyle i} -th coordinate of the code. The locality of the code is defined as An [ n , k , d , r ] q {\displaystyle [n,k,d,r]_{q}} locally recoverable code (LRC) is an [ n , k , d ] q {\displaystyle [n,k,d]_{q}} linear code C ∈ F q n {\displaystyle C\in \mathbb {F} _{q}^{n}} with locality r {\displaystyle r} . Let C {\displaystyle C} be an [ n , k , d ] q {\displaystyle [n,k,d]_{q}} -locally recoverable code. Then an erased component can be recovered linearly, i.e. for every i ∈ { 1 , … , n } {\displaystyle i\in \{1,\ldots ,n\}} , the space of linear equations of the code contains elements of the form x i = f ( x i 1 , … , x i r ) {\displaystyle x_{i}=f(x_{i_{1}},\ldots ,x_{i_{r}})} , where i j ≠ i {\displaystyle i_{j}\neq i} . == Optimal locally recoverable codes == Theorem Let n = ( r + 1 ) s {\displaystyle n=(r+1)s} and let C {\displaystyle C} be an [ n , k , d ] q {\displaystyle [n,k,d]_{q}} -locally recoverable code having s {\displaystyle s} disjoint locality sets of size r + 1 {\displaystyle r+1} . Then An [ n , k , d , r ] q {\displaystyle [n,k,d,r]_{q}} -LRC C {\displaystyle C} is said to be optimal if the minimum distance of C {\displaystyle C} satisfies == Tamo–Barg codes == Let f ∈ F q [ x ] {\displaystyle f\in \mathbb {F} _{q}[x]} be a polynomial and let ℓ {\displaystyle \ell } be a positive integer. Then f {\displaystyle f} is said to be ( r {\displaystyle r} , ℓ {\displaystyle \ell } )-good if • f {\displaystyle f} has degree r + 1 {\displaystyle r+1} , • there exist distinct subsets A 1 , … , A ℓ {\displaystyle A_{1},\ldots ,A_{\ell }} of F q {\displaystyle \mathbb {F} _{q}} such that – for any i ∈ { 1 , … , ℓ } {\displaystyle i\in \{1,\ldots ,\ell \}} , f ( A i ) = { t i } {\displaystyle f(A_{i})=\{t_{i}\}} for some t i ∈ F q {\displaystyle t_{i}\in \mathbb {F} _{q}} , i.e., f {\displaystyle f} is constant on A i {\displaystyle A_{i}} , – # A i = r + 1 {\displaystyle \#A_{i}=r+1} , – A i ∩ A j = ∅ {\displaystyle A_{i}\cap A_{j}=\varnothing } for any i ≠ j {\displaystyle i\neq j} . We say that { A 1 , … , A ℓ {\displaystyle A_{1},\ldots ,A_{\ell }} } is a splitting covering for f {\displaystyle f} . === Tamo–Barg construction === The Tamo–Barg construction utilizes good polynomials. • Suppose that a ( r , ℓ ) {\displaystyle (r,\ell )} -good polynomial f ( x ) {\displaystyle f(x)} over F q {\displaystyle \mathbb {F} _{q}} is given with splitting covering i ∈ { 1 , … , ℓ } {\displaystyle i\in \{1,\ldots ,\ell \}} . • Let s ≤ ℓ − 1 {\displaystyle s\leq \ell -1} be a positive integer. • Consider the following F q {\displaystyle \mathbb {F} _{q}} -vector space of polynomials V = { ∑ i = 0 s g i ( x ) f ( x ) i : deg ⁡ ( g i ( x ) ) ≤ deg ⁡ ( f ( x ) ) − 2 } . {\displaystyle V=\left\{\sum _{i=0}^{s}g_{i}(x)f(x)^{i}:\deg(g_{i}(x))\leq \deg(f(x))-2\right\}.} • Let T = ⋃ i = 1 ℓ A i {\textstyle T=\bigcup _{i=1}^{\ell }A_{i}} . • The code { ev T ⁡ ( g ) : g ∈ V } {\displaystyle \{\operatorname {ev} _{T}(g):g\in V\}} is an ( ( r + 1 ) ℓ , ( s + 1 ) r , d , r ) {\displaystyle ((r+1)\ell ,(s+1)r,d,r)} -optimal locally coverable code, where ev T {\displaystyle \operatorname {ev} _{T}} denotes evaluation of g {\displaystyle g} at all points in the set T {\displaystyle T} . === Parameters of Tamo–Barg codes === • Length. The length is the number of evaluation points. Because the sets A i {\displaystyle A_{i}} are disjoint for i ∈ { 1 , … , ℓ } {\displaystyle i\in \{1,\ldots ,\ell \}} , the length of the code is | T | = ( r + 1 ) ℓ {\displaystyle |T|=(r+1)\ell } . • Dimension. The dimension of the code is ( s + 1 ) r {\displaystyle (s+1)r} , for s {\displaystyle s} ≤ ℓ − 1 {\displaystyle \ell -1} , as each g i {\displaystyle g_{i}} has degree at most deg ⁡ ( f ( x ) ) − 2 {\displaystyle \deg(f(x))-2} , covering a vector space of dimension deg ⁡ ( f ( x ) ) − 1 = r {\displaystyle \deg(f(x))-1=r} , and by the construction of V {\displaystyle V} , there are s + 1 {\displaystyle s+1} distinct g i {\displaystyle g_{i}} . • Distance. The distance is given by the fact that V ⊆ F q [ x ] ≤ k {\displaystyle V\subseteq \mathbb {F} _{q}[x]_{\leq k}} , where k = r + 1 − 2 + s ( r + 1 ) {\displaystyle k=r+1-2+s(r+1)} , and the obtained code is the Reed-Solomon code of degree at most k {\displaystyle k} , so the minimum distance equals ( r + 1 ) ℓ − ( ( r + 1 ) − 2 + s ( r + 1 ) ) {\displaystyle (r+1)\ell -((r+1)-2+s(r+1))} . • Locality. After the erasure of the single component, the evaluation at a i ∈ A i {\displaystyle a_{i}\in A_{i}} , where | A i | = r + 1 {\displaystyle |A_{i}|=r+1} , is unknown, but the evaluations for all other a ∈ A i {\displaystyle a\in A_{i}} are known, so at most r {\displaystyle r} evaluations are needed to uniquely determine the erased component, which gives us the locality of r {\displaystyle r} . To see this, g {\displaystyle g} restricted to A j {\displaystyle A_{j}} can be described by a polynomial h {\displaystyle h} of degree at most deg ⁡ ( f ( x ) ) − 2 = r + 1 − 2 = r − 1 {\displaystyle \deg(f(x))-2=r+1-2=r-1} thanks to the form of the elements in V {\displaystyle V} (i.e., thanks to the fact that f {\displaystyle f} is constant on A j {\displaystyle A_{j}} , and the g i {\displaystyle g_{i}} 's have degree at most deg ⁡ ( f ( x ) ) − 2 {\displaystyle \deg(f(x))-2} ). On the other hand | A j ∖ { a j } | = r {\displaystyle |A_{j}\backslash \{a_{j}\}|=r} , and r {\displaystyle r} evaluations uniquely determine a polynomial of degree r − 1 {\displaystyle r-1} . Therefore h {\displaystyle h} can be constructed and evaluated at a j {\displaystyle a_{j}} to recover g ( a j ) {\displaystyle g(a_{j})} . === Example of Tamo–Barg construction === We will use x 5 ∈ F 41 [ x ] {\displaystyle x^{5}\in \mathbb {F} _{41}[x]} to construct [ 15 , 8 , 6 , 4 ] {\displaystyle [15,8,6,4]} -LRC. Notice that the degree of this polynomial is 5, and it is constant on A i {\displaystyle A_{i}} for i ∈ { 1 , … , 8 } {\displaystyle i\in \{1,\ldots ,8\}} , where A 1 = { 1 , 10 , 16 , 18 , 37 } {\displaystyle A_{1}=\{1,10,16,18,37\}} , A 2 = 2 A 1 {\displaystyle A_{2}=2A_{1}} , A 3 = 3 A 1 {\displaystyle A_{3}=3A_{1}} , A 4 = 4 A 1 {\displaystyle A_{4}=4A_{1}} , A 5 = 5 A 1 {\displaystyle A_{5}=5A_{1}} , A 6 = 6 A 1 {\displaystyle A_{6}=6A_{1}}

    Read more →