AI Code Checker Python

AI Code Checker Python — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • SPL notation

    SPL notation

    SPL (Sentence Plan Language) is an abstract notation representing the semantics of a sentence in natural language. In a classical Natural Language Generation (NLG) workflow, an initial text plan (hierarchically or sequentially organized factoids, often modelled in accordance with Rhetorical Structure Theory) is transformed by a sentence planner (generator) component to a sequence of sentence plans modelled in a Sentence Plan Language. A surface generator can be used to transform the SPL notation into natural language sentences. Probably the most widely used SPL language used today (2022) is AMR (Abstract Meaning Representation, see there for further references), but is owes parts of its popularity to its application to NLP problems other than NLG, e.g., machine translation and semantic parsing.

    Read more →
  • Chaotic cryptology

    Chaotic cryptology

    Chaotic cryptology is the application of mathematical chaos theory to the practice of cryptography, the study or techniques used to privately and securely transmit information with the presence of a third-party or adversary. Since first being investigated by Robert Matthews in 1989, the use of chaos in cryptography has attracted much interest. However, long-standing concerns about its security and implementation speed continue to limit its implementation. Chaotic cryptology consists of two opposite processes: Chaotic cryptography and Chaotic cryptanalysis. Cryptography refers to encrypting information for secure transmission, whereas cryptanalysis refers to decrypting and deciphering encoded encrypted messages. In order to use chaos theory efficiently in cryptography, the chaotic maps are implemented such that the entropy generated by the map can produce required Confusion and diffusion. Properties in chaotic systems and cryptographic primitives share unique characteristics that allow for the chaotic systems to be applied to cryptography. If chaotic parameters, as well as cryptographic keys, can be mapped symmetrically or mapped to produce acceptable and functional outputs, it will make it next to impossible for an adversary to find the outputs without any knowledge of the initial values. Since chaotic maps in a real life scenario require a set of numbers that are limited, they may, in fact, have no real purpose in a cryptosystem if the chaotic behavior can be predicted. One of the most important issues for any cryptographic primitive is the security of the system. However, in numerous cases, chaos-based cryptography algorithms are proved insecure. The main issue in many of the cryptanalyzed algorithms is the inadequacy of the chaotic maps implemented in the system. == Types == Chaos-based cryptography has been divided into two major groups: Symmetric chaos cryptography, where the same secret key is used by sender and receiver. Asymmetric chaos cryptography, where one key of the cryptosystem is public. Some of the few proposed systems have been broken. The majority of chaos-based cryptographic algorithms are symmetric. Many use discrete chaotic maps in their process. == Applications == === Image encryption === Bourbakis and Alexopoulos in 1991 proposed supposedly the earliest fully intended digital image encryption scheme which was based on SCAN language. Later on, with the emergence of chaos-based cryptography hundreds of new image encryption algorithms, all with the aim of improving the security of digital images were proposed. However, there were three main aspects of the design of an image encryption that was usually modified in different algorithms (chaotic map, application of the map and structure of algorithm). The initial and perhaps most crucial point was the chaotic map applied in the design of the algorithms. The speed of the cryptosystem is always an important parameter in the evaluation of the efficiency of a cryptography algorithm, therefore, the designers were initially interested in using simple chaotic maps such as tent map, and the logistic map. However, in 2006 and 2007, the new image encryption algorithms based on more sophisticated chaotic maps proved that application of chaotic map with higher dimension could improve the quality and security of the cryptosystems. === Hash function === Chaotic behavior can generate hash functions, such as applying the Chirikov/Julia 3D trajectory translation into a SHA-512 hash. === Random number generation === The unpredictable behavior of the chaotic maps can be used in the generation of random numbers. Some of the earliest chaos-based random number generators tried to directly generate random numbers from the logistic map. Many more recent works did so using the numerical solutions of hyperchaotic systems of differential equations, either at the integer-order, or the fractional-order.

    Read more →
  • Convergent encryption

    Convergent encryption

    Convergent encryption, also known as content hash keying, is a cryptosystem that produces identical ciphertext from identical plaintext files. This has applications in cloud computing to remove duplicate files from storage without the provider having access to the encryption keys. The combination of deduplication and convergent encryption was described in a backup system patent filed by Stac Electronics in 1995. This combination has been used by Farsite, Permabit, Freenet, MojoNation, GNUnet, flud, and the Tahoe Least-Authority File Store. The system gained additional visibility in 2011 when cloud storage provider Bitcasa announced they were using convergent encryption to enable de-duplication of data in their cloud storage service. == Overview == The system computes a cryptographic hash of the plaintext in question. The system then encrypts the plaintext by using the hash as a key. Finally, the hash itself is stored, encrypted with a key chosen by the user. == Known Attacks == Convergent encryption is open to a "confirmation of a file attack" in which an attacker can effectively confirm whether a target possesses a certain file by encrypting an unencrypted, or plain-text, version and then simply comparing the output with files possessed by the target. This attack poses a problem for a user storing information that is non-unique, i.e. also either publicly available or already held by the adversary - for example: banned books or files that cause copyright infringement. An argument could be made that a confirmation of a file attack is rendered less effective by adding a unique piece of data such as a few random characters to the plain text before encryption; this causes the uploaded file to be unique and therefore results in a unique encrypted file. However, some implementations of convergent encryption where the plain-text is broken down into blocks based on file content, and each block then independently convergently encrypted may inadvertently defeat attempts at making the file unique by adding bytes at the beginning or end. Even more alarming than the confirmation attack is the "learn the remaining information attack" described by Drew Perttula in 2008. This type of attack applies to the encryption of files that are only slight variations of a public document. For example, if the defender encrypts a bank form including a ten digit bank account number, an attacker that is aware of generic bank form format may extract defender's bank account number by producing bank forms for all possible bank account numbers, encrypt them and then by comparing those encryptions with defender's encrypted file deduce the bank account number. Note that this attack can be extended to attack a large number of targets at once (all spelling variations of a target bank customer in the example above, or even all potential bank customers), and the presence of this problem extends to any type of form document: tax returns, financial documents, healthcare forms, employment forms, etc. Also note that there is no known method for decreasing the severity of this attack -- adding a few random bytes to files as they are stored does not help, since those bytes can likewise be attacked with the "learn the remaining information" approach. The only effective approach to mitigating this attack is to encrypt the contents of files with a non-convergent secret before storing (negating any benefit from convergent encryption), or to simply not use convergent encryption in the first place.

    Read more →
  • Short Weather Cipher

    Short Weather Cipher

    The Short Weather Cipher (German: Wetterkurzschlüssel, abbreviated WKS), also known as the weather short signal book, was a cipher, presented as a codebook, that was used by the radio telegraphists aboard U-boats of the German Navy (Kriegsmarine) during World War II. It was used to condense weather reports into a short 7-letter message, which was enciphered by using the naval Enigma and transmitted by radiomen to intercept stations on shore, where it was deciphered by Enigma and the 7-letter weather report was reconstructed. == History == During World War II, during various times, different versions of the cipher were in operation. The first issue carried the codename Weimar. It was replaced by the edition Eisenach on 20 January 1942. On 10 March 1943, the third edition of the weather key, bearing the codename Naumburg, entered into force. On May 9, 1941, during Operation Primrose, the operation to occupy Åndalsnes and create a diversion south of Trondheim in Norway as part of the Norwegian Campaign, an intact Naval Enigma (M3) cipher machine, a copy of the "Weimar" version of the short weather cipher and a copy of the short signal book (German: Kurzsignalbuch or Kurzsignale for short) was recovered from the submarine U-110, that was captured in the North Atlantic east of Cape Farewell, Greenland. This enabled the cryptanalysts in Bletchley Park to break the encryption of the M3 and to decipher the German submarine radio messages. The Short Weather Cipher was critical in the cryptanalysis of the Naval Enigma M4 and yielded excellent cribs. On 30 October 1942, a copy of the Wetterkurzschlüssel, the short weather cipher, and of the short signal book, the Kurzsignale, were recovered as part of a daring raid on the U-boat U-559, when three Royal Navy sailors, Lieutenant Anthony Fasson, Able Seaman Colin Grazier and NAAFI canteen assistant Tommy Brown, then boarded the abandoned submarine, and recovered the documents after a 90-minute search. They reached the Government Code and Cypher at Bletchley Park after a three-week delay, on 24 November 1942. The documents which cost the lives of Fasson and Grazier proved to be particularly important in breaking the Naval Enigma M4. The version of the short weather cipher recovered was the Eisenach version. Unlike the first version Weimar, the Eisenach did not list the 26 rotor positions that were indicated by a letter, to be used in enciphering weather reports. Thus, Hut 8 cryptanalysts thought that all four rotors were used to encipher weather reports. Testing on the Bombes began to surface weather kisses (identical messages in two cryptosystems). On 13 December 1942, a crib obtained using the Short Weather Cipher gave a key with the Naval Enigma M4 rotatable Umkehrwalze (reversing roller or reflector) in the neutral position, making it equivalent to a standard Enigma and thus making B-Dienst messages potentially breakable on existing bombes. Hut 8 learned that the 4-letter indicators for regular U-boat messages were the same as 3-letter indicators for weather messages the same day, except for one extra letter. This meant that once the key was found for a weather message on any day, the fourth rotor had to be only tested in 26 positions to find the full 4-letter key. By the end of the day on Sunday 13 December, Rodger Winn of the Submarine Tracking Room at Bletchley Park knew that Shark Enigma Cipher was broken. When the third edition of the short signal book was introduced on 10 March 1943, Hut 8 was immediately deprived of cribs. However, by the 19 March, cribs were again being used by Hut 8 personnel, using the method of employing short signal sighting reports. These were reports made by U-boats when contact was made with Kurzsignalheft code book. Hut 8 managed to solve Shark for 90 out of 112 days before the end of June. Kurzsignalheft short sighting reports also used M4 in M3 mode. By the end of June, four-rotor bombes had entered service at Bletchley Park, and by August had been introduced by the US Navy. From September onwards, Shark was generally solved within 24 hours. == Operation == The U-boat encoded weather reports using the Short Weather Cipher, before being enciphered on the Naval Enigma. The shore patrol of the Kriegsmarine, deciphered the message and decoded it, then forwarding it to a central meteorological station, which rebroadcast the data as ship synoptics, after enciphering it with additive tables using a cipher, which was called Germet 3 by Hut 8 personnel. The short weather cipher coded weather reports using a polyphonic single-letter code with X missing. A = +28° ◦ B = +27° ◦ C = +26° ◦ D = +25° ◦ . . . ◦ W = +6° ◦ Y= +5° ◦ Z = +4° ◦ A = +3° ◦ B = +2° ◦ C = +1° ◦ D = 0° ◦ E =−1° ◦ F =−2° ◦ . . . ◦ Z = −21° ◦ In a similar way, water temperature, atmospheric pressure, humidity, wind direction, wind velocity, visibility, degree of cloudiness, geographic latitude, and geographic longitude had to be coded in a prescribed order with the weather report consisted of a single short word. Based on the approximate knowledge of the position of the submarine, the Kriegsmarine telegraphist who received the message could translate the letter "S", according to the above table, which could mean 10 °C or −15 °C, back to the correct temperature. Similarly, the direction and the type of swell was also coded with only a single letter: ----------------------------------------------------- Direction from which | Type of swell the swell comes | low | middle high | high | ----------------------------------------------------- N | a | i | q | NE | b | j | r | E | c | k | s | SE | d | l | t | S | e | m | u | SW | f | n | v | W | g | o | w | NW | h | p | x | No swelling | | | | y Intermittent | | | | z As an example of the cipher, a weather report for 68° North latitude, 20° West longitude (north of Iceland) with atmospheric pressure 972 millibars, temperature minus 5 °C, wind northwest Force 6 (on the Beaufort scale), 3/10 cirrus cloud cover, visibility 5 nautical miles, would be coded as MZNFPED. == Publications == Bauer, Arthur O. (1997), Funkpeilung als alliierte Waffe gegen deutsche U-Boote 1939–1945 [Direction finding as Allied weapon against German submarines from 1939 to 1945] (in German), Diemen, NL: Selbstverlag, ISBN 978-3-00-002142-8 Bauer, Friedrich L. (2007), Decrypted Secrets. Methods and Maxims of Cryptology (4., rev. and extended ed.), Berlin Heidelberg New York: Springer, ISBN 978-3-540-24502-5 Pfeiffer, Paul N. (October 1998), "Breaking the German Weather Ciphers in the Mediterranean Detachment, 849th Signal Intelligence Service", Cryptologia, 22 (4): 354–369, doi:10.1080/0161-119891886975, ISSN 0161-1194 Ulbricht, Heinz (2005), Die Chiffriermaschine Enigma – Trügerische Sicherheit. Ein Beitrag zur Geschichte der Nachrichtendienste [The Enigma cipher machine – Deceptive security. A contribution to the history of the intelligence services], Dissertation, Fachbereich Mathematik und Informatik, Technische Universität Braunschweig (in German)

    Read more →
  • Canva

    Canva

    Canva Pty Ltd. is an Australian multinational proprietary software company launched in 2013 based in Sydney, Australia. The platform provides a graphic design platform to create visual content for presentations, websites, and other digital products. Its uses include templates for presentations, posters, and social media content, as well as photo and video editing functionality. The platform uses a drag-and-drop interface designed for users without professional design training or experience. Canva operates on a freemium model and has added features such as print services and video editing tools over time. == History == === 2013–2020 === Canva was founded in Perth, Australia, by Melanie Perkins, Cliff Obrecht and Cameron Adams on 1 January 2013. One of the company's early investors was Susan Wu, an American entrepreneur. In its first year, Canva had more than 750,000 users. In 2017, the company reached profitability and had 294,000 paying customers. In January 2018, Perkins announced that the company had raised A$40 million from Sequoia Capital, Blackbird Ventures, and Felicis Ventures, and the company was valued at A$1 billion. It raised A$70 million in May 2019, followed by A$85 million in October 2019 and the launch of Canva for Enterprise. In December 2019, Canva announced Canva for Education, a free product for schools and other educational institutions intended to facilitate collaboration between students and teachers. === 2021–2025 === In June 2020, Canva announced a partnership with FedEx Office and with Office Depot the following month. As of June 2020, Canva's valuation had risen to A$6 billion, rising to A$40 billion by September 2021. In September 2021, Canva raised US$200 million, with its value peaking that year at US$40 billion. By September 2022, the valuation of the company had leveled at US$26 billion. While Canva's value declined from its 2021 peak by mid-2022, it remained one of Australia's most prominent technology companies, alongside Atlassian. In March 2022, Canva had over 75 million monthly active users. In 2023, the pair were named in the Australian Financial Review's AFR Rich List as among the 10 most wealthy people in Australia. On 7 December 2022, Canva launched Magic Write, which is the platform's AI-powered copywriting assistant. On 22 March 2023, Canva announced its new Assistant tool, which makes recommendations on graphics and styles that match the user's existing design. On 11 January 2024, Canva launched its own GPT in OpenAI's GPT Store. The company has announced it intends to compete with Google and Microsoft in the office software category with website and whiteboard products. In May 2024, the company announced the launch of Canva Enterprise, a plan designed for large organisations, alongside new tools including Work Kits, Courses and AI capabilities. In 2024, it announced a co-funded solar energy project to enhance its sustainability efforts. On 10 April 2025, Canva released Visual Suite 2. The new interface combines Canva's design and productivity tools. New features include a spreadsheets application (Canva Sheets), a generative AI coding assistant (Canva Code), a chatbot, and an updated photo editor that can modify or remove background objects. In August 2025, Canva launched a stock sale to employees, valuing the company at US$42 billion. == Acquisitions == In 2018, the company acquired presentations startup Zeetings for an undisclosed amount, as part of its expansion into the presentations space. In May 2019, the company announced the acquisitions of Pixabay and Pexels, two free stock photography sites based in Germany, which enabled Canva users to access their photos for designs. In February 2021, Canva acquired Austrian startup Kaleido.ai and the Czech-based Smartmockups. In 2022, Canva acquired Flourish, a London-based data visualization startup. In March 2024, Canva acquired UK-based Serif, the developers of the Affinity suite of graphic design software, for approximately $380 million. In August 2024, Canva acquired the AI image generation platform and startup, Leonardo AI, for an undisclosed amount. In June 2025, it was announced that Canva had acquired Australian AI marketing startup MagicBrief for an undisclosed amount. In February 2026, Canva acquired two startups: Cavalry, which specializes in animation software, and MangoAI, which focuses on improving advertising performance. In April 2026, Canva acquired Simtheory, an AI Workflow Tool, and Ortto, a marketing automation tool. == Philanthropy == Canva's co-founders, Melanie Perkins and Cliff Obrecht, have publicly stated their intention to donate a significant portion of their personal wealth to charity. In 2021, Canva started a partnership with GiveDirectly, a nonprofit organization operating in low income areas that makes unconditional cash transfers to families living in extreme poverty. Since then, the company has donated $50 million to support GiveDirectly's work across Malawi. In 2025, Canva announced an additional $100 million commitment to expand its GiveDirectly partnership. == Controversies == === Data breach === In May 2019, Canva experienced a data breach in which the data of roughly 139 million users was exposed. The exposed data included real names of users, usernames, email addresses, geographical information, and password hashes for some users. In January 2020, approximately 4 million user passwords were decrypted and shared online. Canva responded by resetting the passwords of every user who had not changed their password since the initial breach. === Russian operations === In May 2022 Canva was criticized for continuing to provide free access to its services in Russia, even after suspending payment processing in the country. Activists from the Ukrainian diaspora in Australia and others said this could be viewed as indirectly supporting Russia’s war effort. They noted the company was the only one of several major Australian firms to receive the lowest “digging in” rating on a tracker run by the Yale School of Management for failing to pull out of Russia. Canva responded that it had suspended financial transactions in Russia from March 2022 and maintained the free version to allow the continued creation and sharing of “pro-peace and anti-war” content for its 1.4 million Russian users.

    Read more →
  • Protecting Kids From Social Media Act

    Protecting Kids From Social Media Act

    Protecting Kids on Social Media Act or HB 1891 is an American law that was introduced by William Lamberth of Sumner County, Tennessee and was signed into law by Tennessee's governor on May 2, 2024. The bill requires social media websites such as X, YouTube, TikTok, Facebook and others to verify the age of users and if those users are under 18, they must have parental consent. == Progress == The law passed the Tennessee State Legislature with little opposition: the bill had only two no votes in the House from Aftyn Behn and Vincent B. Dixie, and it had zero no votes in the Senate. == Bill summary == Every social media company must verify the age of new users after the law takes effect, and if the user had created an account before the law took effect, they must verify the age of the person attempting to access the account within 14 days. If the new user or the user who originally owned an account is under 18 years of age, they must get parental consent and the third party or social media company must not retain the data from the age verification process or obtaining parental consent. Parents who are account holders of those under 18 can view the privacy settings, set daily time restrictions, and implement breaks during which the minor cannot access the account. The law is enforced by the Attorney General of Tennessee and went into effect on January 1, 2025. == Lawsuit == On October 3, 2024, the trade association NetChoice filed a lawsuit against Tennessee Attorney General Jonathan Skrmetti in the Middle District Court of Tennessee, claiming that the law violates the First Amendment. The Judge for the case is William L. Campbell Jr. An initial case management conference was originally scheduled for December 4, 2024, however it was delayed because of the Supreme Court case United States v. Skrmetti, recommending that the conference be delayed after January 20, 2025. On February 14, 2025, Judge Eli Richardson denied NetChoice's motion for a temporary restraining order because it would disrupt the status quo of the case.

    Read more →
  • Cryptographic Module Testing Laboratory

    Cryptographic Module Testing Laboratory

    Cryptographic Module Testing Laboratory (CMTL) is an information technology (IT) computer security testing laboratory that is accredited to conduct cryptographic module evaluations for conformance to the FIPS 140-2 U.S. Government standard. The National Institute of Standards and Technology (NIST) National Voluntary Laboratory Accreditation Program (NVLAP) accredits CMTLs to meet Cryptographic Module Validation Program (CMVP) standards and procedures. This has been replaced by FIPS 140-2 and the Cryptographic Module Validation Program (CMVP). == CMTL requirements == These laboratories must meet the following requirements: NIST Handbook 150, NVLAP Procedures and General Requirements NIST Handbook 150-17 Information Technology Security Testing - Cryptographic Module Testing NVLAP Specific Operations Checklist for Cryptographic Module Testing == FIPS 140-2 in relation to the Common Criteria == A CMTL can also be a Common Criteria (CC) Testing Laboratory (CCTL). The CC and FIPS 140-2 are different in the abstractness and focus of evaluation. FIPS 140-2 testing is against a defined cryptographic module and provides a suite of conformance tests to four FIPS 140 security levels. FIPS 140-2 describes the requirements for cryptographic modules and includes such areas as physical security, key management, self tests, roles and services, etc. The standard was initially developed in 1994 - prior to the development of the CC. The CC is an evaluation against a Protection Profile (PP), or security target (ST). Typically, a PP covers a broad range of products. A CC evaluation does not supersede or replace a validation to either FIPS 140-1, FIPS140-2 or FIPS 140-3. The four security levels in FIPS 140-1 and FIPS 140-2 do not map directly to specific CC EALs or to CC functional requirements. A CC certificate cannot be a substitute for a FIPS 140-1 or FIPS 140-2 certificate. If the operational environment is a modifiable operational environment, the operating system requirements of the Common Criteria are applicable at FIPS Security Levels 2 and above. FIPS 140-1 required evaluated operating systems that referenced the Trusted Computer System Evaluation Criteria (TCSEC) classes C2, B1 and B2. However, TCSEC is no longer in use and has been replaced by the Common Criteria. Consequently, FIPS 140-2 now references the Common Criteria. FIPS 140-2 or FIPS 140-3 validation efforts can be in some parts reused in Common Criteria evaluations, specifically in areas related to entropy source and cryptographic algorithms.

    Read more →
  • Public Services Network

    Public Services Network

    The Public Services Network (PSN) is a UK government's high-performance network, which helps public sector organisations work together, reduce duplication and share resources. It unified the provision of network infrastructure across the United Kingdom public sector into an interconnected "network of networks" to increase efficiency and reduce overall public expenditure. It is now a legacy network and public sector organisations are being migrated to using services on the public internet. == Origins == The Public Services Network (PSN) was launched officially as part of the Transformational Government Strategy commencing in 2005, under the original name of the Public Sector Network. Prior to this, some parts of local government had already successfully implemented the concept. The Hampshire Public Services Network (HPSN) was the first PSN, launched in 1999, followed closely by Kent County Councils partnerships with the KPSN. The HPSN, encompassing all of the borough, district and unitary councils, with the County Council, as well as the Fire Services, the Isle of Wight Council and 540 schools. National PSN technical and architecture compliance criteria were established from 2007, by GDS working with local government leaders from Socitm (the Society of Information Technology Management) on the National CIO Council and the Local CIO Council. The PSN's aim was to bring public services organisations with a common interest onto a single, coherent and standards-based ‘network of networks’. This would create influence, economies of scale and a commonality of standards for secure and easy inter-connection between public service organisations. The original concept of a network of networks strategy was based upon the work already undertaken in local government and recognition of Communities of Interest (COI) within the Criminal Justice Sector during work by the Office for Criminal Justice Reform (OCJR) between 2005 and 2007 to enable data sharing across business units. In this context a COI was defined as groups of Government departments and external partners who in combination provided services within a specific area of operation and used the same data, with a similar risk profile, shared risk appetite and common governance framework. Historically each group member had implemented their own networks and standards of operation in isolation with little or no consideration as to how services and data may be shared and resulting in increased costs of operation. The Network of Networks strategy proposed within OCJR recommended the creation of specific networks based upon these Communities of Interest which were joined together through data interchange gateways supporting common standards. Under this approach networks would be arranged by data type and business functions such as Criminal Justice, Health and Social Care, Defence and Intelligence or Public Finance rather than solely on established departmental boundaries. Within a COI, trust relationships and data interchange are readily supported, enabling data sharing without a need to cross network boundaries and providing benefits of scale without the challenges and compromises intrinsic to homogeneous cross sector networks. Data is made available without a need to transport it between organisations and control is retained by the data originator. In early 2007 a group of UK Government department CTOs in conjunction with the Office for Government Commerce Buying Solutions (OGC BS) established the vision for a single commonly provided, procured and managed public sector voice and data network infrastructure to replace the multitude of separately procured and managed networks serving various segments of the UK public sector; Education, Health, Central Government, Local Government etc. In 2008 an Industry Working Group was established to document the objectives and requirements more clearly. Their report set out the architectural and commercial principles as well as anticipated security, service management, governance and transition arrangements. == Architecture == The PSN comprises a core network, the Government Conveyancing Network or GCN provided by GCN Service Providers or GCNSPs. The GCN interconnects multiple operator networks, termed Direct Network Service Providers or DNSPs. Subscriber organisations contract to a connection from a local participating DNSP, connect via that to GCN and hence onwards to other interconnected networks and services. The GCN network is entirely based on IPv4 and MPLS and the GCNSPs are not currently mandated to provide IPv6, though they should have a roadmap to implementing it if and when required. == Commercial framework == In 2010 Virgin Media Business, BT, Cable & Wireless and Global Crossing signed Deeds of Undertaking (DoU) and subsequently achieved accreditation for providing GCN and IP VPN services. In March 2012, BT, Cable & Wireless, Capita Business Services, Eircom, Fujitsu, Kcom, Level 3, Logicalis, MDNX, Thales, Updata and Virgin Media Business were successful bidders for the initial two-year PSN Connectivity framework. In June 2012, 29 companies were confirmed as suppliers of ICT services to the UK public sector under the Government's PSN Services framework contract. Apart from most of the previous suppliers, additional companies also included 2e2, Airwave Solutions, Azzurri Communications, Cassidian, CSC Computer Sciences, Computacenter, Daisy Communications, Easynet Global Services, EE, Freedom Communications, Icom Holdings, NextiraOne, PageOne Communications, Phoenix IT Group, Siemens Communications, Specialist Computer Centres, Telefónica, telent Technology Services, Uniworld Communications and Vodafone. == Governance == The PSN is managed within the Cabinet Office where it is part of the Government Digital Service. == Early implementations == There were already notable initiatives in progress in county council areas, demonstrating public sector network integration in both the Hampshire HPSN2 network and in Kent's community network. Project Pathway was established as a pilot linking these two county-wide networks, with Virgin Media Business and Global Crossing the subscriber and GCN network elements. Staffordshire County Council was the first council in England to establish a PSN that included the county's NHS Health partners. Other county councils have since followed the leads of these councils. == Transition == Centrally procured public sector networks are expected to migrate across to the PSN framework as they reach the end of their contract terms, either through an interim framework or directly. The Government Secure Intranet (GSi) contracts expired in September 2011, running on to 12 February 2012 and were replaced by the transitional Government Secure Intranet Convergence Framework (GCF). The Managed Telephony Service (MTS) contract expired on 31 December 2011 and was replaced by the Managed Telephony Convergence Framework (MTCF). == Future plan == In a blog post published on 20 January 2017, Government Digital Service announced that the Technology Leaders Network (TLN) had agreed that government was starting a journey away from the PSN. This was because using the Internet was considered suitable for the vast majority of the work that the public sector does. The blog post confirmed that the 'move was not going to happen immediately' and stated that 'there's quite a bit of work to do across the public sector to prepare for the changes'. It also stated that it was too early for a full timeline to be provided, although all PSN-connected organisations would be updated as the process evolved. The blog post confirmed that organisations that need to access services that are only available on the PSN would still need to connect to it for the time being and continue to meet its assurance requirements. In a blog post published on 16 March 2017, Government Digital Service (GDS) set out its plans for PSN assurance. The blog post confirmed that the PSN compliance process wasn't 'going anywhere, certainly for a while yet'. It explained that the TLN agreed that – as one of the only recognised, externally accredited, cross-government common assurance standards – it 'needs to live on far beyond the end of the physical PSN network'. Government Digital Service, along with the National Cyber Security Centre (NCSC) and the Cyber and Government Security Directorate, are now looking at ways to expand and reframe PSN compliance in a new context that, while retaining the assurance principles that are the basis of the existing process, will aim to improve the process. A GDS blog post titled 'The road to closing down the PSN' published on 8 September 2020 describes how the public sector will migrate away from the PSN. The Cabinet Office has set up a programme called Future Networks for Government (FN4G) to help organisations move away from the PSN.

    Read more →
  • Surrogate model

    Surrogate model

    A surrogate model is an engineering method used when an outcome of interest cannot be easily measured or computed, so an approximate mathematical model of the outcome is used instead. Most engineering design problems require experiments and/or simulations to evaluate design objective and constraint functions as a function of design variables. For example, in order to find the optimal airfoil shape for an aircraft wing, an engineer simulates the airflow around the wing for different shape variables (e.g., length, curvature, material, etc.). For many real-world problems, however, a single simulation can take many minutes, hours, or even days to complete. As a result, routine tasks such as design optimization, design space exploration, sensitivity analysis and "what-if" analysis become impossible since they require thousands or even millions of simulation evaluations. One way of alleviating this burden is by constructing approximation models, known as surrogate models, metamodels or emulators, that mimic the behavior of the simulation model as closely as possible while being computationally cheaper to evaluate. Surrogate models are constructed using a data-driven, bottom-up approach. The exact, inner working of the simulation code is not assumed to be known (or even understood), relying solely on the input-output behavior. A model is constructed based on modeling the response of the simulator to a limited number of intelligently chosen data points. This approach is also known as behavioral modeling or black-box modeling, though the terminology is not always consistent. When only a single design variable is involved, the process is known as curve fitting. Though using surrogate models in lieu of experiments and simulations in engineering design is more common, surrogate modeling may be used in many other areas of science where there are expensive experiments and/or function evaluations. == Goals == The scientific challenge of surrogate modeling is the generation of a surrogate that is as accurate as possible, using as few simulation evaluations as possible. The process comprises three major steps which may be interleaved iteratively: Sample selection (also known as sequential design, optimal experimental design (OED) or active learning) Construction of the surrogate model and optimizing the model parameters (i.e., bias-variance tradeoff) Appraisal of the accuracy of the surrogate. The accuracy of the surrogate depends on the number and location of samples (expensive experiments or simulations) in the design space. A systematic data representation during training can improve model scalability, thereby reducing the need for expensive simulations. Various design of experiments (DOE) techniques cater to different sources of errors, in particular, errors due to noise in the data or errors due to an improper surrogate model. == Types of surrogate models == Popular surrogate modeling approaches are: polynomial response surfaces; kriging; more generalized Bayesian approaches; gradient-enhanced kriging (GEK); radial basis function; support vector machines; space mapping; artificial neural networks and Bayesian networks. Other methods recently explored include Fourier surrogate modeling , random forests, convolutional neural networks, and generative adversarial networks. For some problems, the nature of the true function is not known a priori, and therefore it is not clear which surrogate model will be the most accurate one. In addition, there is no consensus on how to obtain the most reliable estimates of the accuracy of a given surrogate. Many other problems have known physics properties. In these cases, physics-based surrogates such as space-mapping based models are commonly used. == Invariance properties == Recently proposed comparison-based surrogate models (e.g., ranking support vector machines) for evolutionary algorithms, such as CMA-ES, allow preservation of some invariance properties of surrogate-assisted optimizers: Invariance with respect to monotonic transformations of the function (scaling) Invariance with respect to orthogonal transformations of the search space (rotation) == Applications == An important distinction can be made between two different applications of surrogate models: design optimization and design space approximation (also known as emulation). In surrogate model-based optimization, an initial surrogate is constructed using some of the available budgets of expensive experiments and/or simulations. The remaining experiments/simulations are run for designs which the surrogate model predicts may have promising performance. The process usually takes the form of the following search/update procedure. Initial sample selection (the experiments and/or simulations to be run) Construct surrogate model Search surrogate model (the model can be searched extensively, e.g., using a genetic algorithm, as it is cheap to evaluate) Run and update experiment/simulation at new location(s) found by search and add to sample Iterate steps 2 to 4 until out of time or design is "good enough" Depending on the type of surrogate used and the complexity of the problem, the process may converge on a local or global optimum, or perhaps none at all. In design space approximation, one is not interested in finding the optimal parameter vector, but rather in the global behavior of the system. Here the surrogate is tuned to mimic the underlying model as closely as needed over the complete design space. Such surrogates are a useful, cheap way to gain insight into the global behavior of the system. Optimization can still occur as a post-processing step, although with no update procedure (see above), the optimum found cannot be validated. == Surrogate modeling software == Surrogate Modeling Toolbox (SMT: https://github.com/SMTorg/smt) is a Python package that contains a collection of surrogate modeling methods, sampling techniques, and benchmarking functions. This package provides a library of surrogate models that is simple to use and facilitates the implementation of additional methods. SMT is different from existing surrogate modeling libraries because of its emphasis on derivatives, including training derivatives used for gradient-enhanced modeling, prediction derivatives, and derivatives with respect to the training data. It also includes new surrogate models that are not available elsewhere: kriging by partial-least squares reduction and energy-minimizing spline interpolation. Python library SAMBO Optimization supports sequential optimization with arbitrary models, with tree-based models and Gaussian process models built in. Surrogates.jl is a Julia packages which offers tools like random forests, radial basis methods and kriging. == Surrogate-Assisted Evolutionary Algorithms (SAEAs) == SAEAs are an advanced class of optimization techniques that integrate evolutionary algorithms (EAs) with surrogate models. In traditional EAs, evaluating the fitness of candidate solutions often requires computationally expensive simulations or experiments. SAEAs address this challenge by building a surrogate model, which is a computationally inexpensive approximation of the objective function or constraint functions. The surrogate model serves as a substitute for the actual evaluation process during the evolutionary search. It allows the algorithm to quickly estimate the fitness of new candidate solutions, thereby reducing the number of expensive evaluations needed. This significantly speeds up the optimization process, especially in cases where the objective function evaluations are time-consuming or resource-intensive. SAEAs typically involve three main steps: (1) building the surrogate model using a set of initial sampled data points, (2) performing the evolutionary search using the surrogate model to guide the selection, crossover, and mutation operations, and (3) periodically updating the surrogate model with new data points generated during the evolutionary process to improve its accuracy. By balancing exploration (searching new areas in the solution space) and exploitation (refining known promising areas), SAEAs can efficiently find high-quality solutions to complex optimization problems. They have been successfully applied in various fields, including engineering design, machine learning, and computational finance, where traditional optimization methods may struggle due to the high computational cost of fitness evaluations.

    Read more →
  • Intranet

    Intranet

    An intranet is a computer network for sharing information, easier communication, collaboration tools, operational systems, and other computing services within an organization, usually to the exclusion of access by outsiders. The term is used in contrast to public networks, such as the Internet, but uses the same technology based on the Internet protocol suite. An organization-wide intranet can constitute a focal point of internal communication and collaboration, and provide a single starting point to access internal and external resources. In its simplest form, an intranet is established with the technologies for local area networks (LANs) and wide area networks (WANs). Many modern intranets have search engines, user profiles, blogs, mobile apps with notifications, and events planning within their infrastructure. An intranet is sometimes contrasted to an extranet. While an intranet is generally restricted to employees of the organization, extranets may also be accessed by customers, suppliers, or other approved parties. Extranets extend a private network onto the Internet with special provisions for authentication, authorization and accounting (AAA protocol). == Uses == Intranets are increasingly being used to deliver tools, such as for collaboration (to facilitate working in groups and teleconferencing) or corporate directories, sales and customer relationship management, or project management. Intranets are also used as corporate culture-change platforms. For example, a large number of employees using an intranet forum application to host a discussion about key issues could come up with new ideas related to management, productivity, quality, and other corporate issues. In large intranets, website traffic is often similar to public website traffic and can be better understood by using web metrics software to track overall activity. User surveys also improve intranet website effectiveness. Larger businesses allow users within their intranet to access public internet through firewall servers. They have the ability to screen incoming and outgoing messages, keeping security intact. When part of an intranet is made accessible to customers and others outside the business, it becomes part of an extranet. Businesses can send private messages through the public network using special encryption/decryption and other security safeguards to connect one part of their intranet to another. Intranet user-experience, editorial, and technology teams work together to produce in-house sites. Most commonly, intranets are managed by the communications, HR or CIO departments of large organizations, or some combination of these. Because of the scope and variety of content and the number of system interfaces, the intranets of many organizations are much more complex than their respective public websites. Intranets and the use of intranets are growing rapidly. According to the Intranet Design Annual 2007 from Nielsen Norman Group, the number of pages on participants' intranets averaged 200,000 over the years 2001 to 2003 and has grown to an average of 6 million pages over 2005–2007. == Benefits == Intranets can help users locate and view information faster and use applications relevant to their roles and responsibilities. With a web browser interface, users can access data held in any database the organization wants to make available at any time and — subject to security provisions — from anywhere within company workstations, increasing employees' ability to perform their jobs faster, more accurately, and with confidence that they have the right information. It also helps improve services provided to users. Using hypermedia and Web technology, Web publishing allows for the maintenance of and easy access to cumbersome corporate knowledge, such as employee manuals, benefits documents, company policies, business standards, news feeds, and even training, all of which can be accessed throughout a company using common Internet standards (Acrobat files, Flash files, CGI applications). Because each business unit can update the online copy of a document, the most recent version is usually available to employees using the intranet. Intranets are also used as a platform for developing and deploying applications to support business operations and decisions across the internetworked enterprise. Information is easily accessible to all authorised users, enabling collaboration. Being able to communicate in real-time through integrated third-party tools, such as an instant messenger, promotes the sharing of ideas and removes blockages to communication to help boost a business's productivity. Intranets can serve as powerful tools for communicating (such as through chat, email and/or blogs) within a given organization about vertically strategic initiatives that have a global reach throughout said organization. The type of information that can easily be conveyed is the purpose of the initiative and what it is aiming to achieve, who is driving it, results achieved to date, and whom to speak to for more information. By providing this information on the intranet, staff can keep up-to-date with the strategic focus of their organization. For example, when Nestlé had a number of food processing plants in Scandinavia, their central support system had to deal with a number of queries every day. When Nestlé decided to invest in an intranet, they quickly realized the savings. Gerry McGovern says that the savings from the reduction in query calls was substantially greater than the investment in the intranet. Users can view information and data via a web browser rather than maintaining physical documents such as procedure manuals, internal phone list and requisition forms. This can potentially save the business money on printing, duplicating documents, and the environment, as well as document maintenance overhead. For example, the HRM company PeopleSoft "derived significant cost savings by shifting HR processes to the intranet". McGovern goes on to say the manual cost of enrolling in benefits was found to be US$109.48 per enrollment. "Shifting this process to the intranet reduced the cost per enrollment to $21.79; a saving of 80 percent". Another company that saved money on expense reports was Cisco. "In 1996, Cisco processed 54,000 reports and the amount of dollars processed was USD19 million". Many companies dictate computer specifications which, in turn, may allow Intranet developers to write applications that only have to work on one browser such that there are no cross-browser compatibility issues. Being able to specifically address one's "viewer" is a great advantage. Since intranets are user-specific (requiring database/network authentication prior to access), users know exactly who they are interfacing with and can personalize their intranet based on role (job title, department) or individual ("Congratulations Jane, on your 3rd year with our company!"). Since "involvement in decision making" is one of the main drivers of employee engagement, offering tools (like forums or surveys) that foster peer-to-peer collaboration and employee participation can make employees feel more valued and involved. == Planning and creation == Most organizations devote considerable resources into the planning and implementation of their intranet as it is of strategic importance to the organization's success. Some of the planning would include topics such as determining the purpose and goals of the intranet, identifying persons or departments responsible for implementation and management and devising functional plans, page layouts and designs. The appropriate staff would also ensure that implementation schedules and phase-out of existing systems were organized, while defining and implementing security of the intranet and ensuring it lies within legal boundaries and other constraints. In order to produce a high-value end product, systems planners should determine the level of interactivity (e.g. wikis, on-line forms) desired. Planners may also consider whether the input of new data and updating of existing data is to be centrally controlled or devolve. These decisions sit alongside to the hardware and software considerations (like content management systems), participation issues (like good taste, harassment, confidentiality), and features to be supported. Intranets are often static sites; they are a shared drive, serving up centrally stored documents alongside internal articles or communications (often one-way communication). By leveraging firms which specialise in 'social' intranets, organisations are beginning to think of how their intranets can become a 'communication hub' for their entire team. The actual implementation would include steps such as securing senior management support and funding, conducting a business requirement analysis and identifying users' information needs. From the technical perspective, there would need to be a coordinated installation of the web server and user access netw

    Read more →
  • Data security

    Data security

    Data security or data protection is the process of securing digital information to protect it from online threats. Data security or protection means protecting digital data, such as those in a database, from destructive forces and from the unwanted actions of unauthorized users, such as a cyberattack or a data breach. Data security protects computer hardware, software, storage devices, and the data of user devices. Data security also protects the data of organizations, companies and administrative controls. Data security guarantees the protection of individual data, such as identity documents and bank data, and protects against unauthorized access, theft and loss of individual data. Data security also protects data breaches that occurs in companies and industries. Good security measures in industries reduce the probability of data breaches, and employees can rely on the company with their data and private information to be kept secured while companies can continue to maintain a stable reputation. The CIA Triad (Confidentiality, Integrity, and Availability) is what is used to practice what an information security is required to follow. Confidentiality, protects information from being accessed by unauthorized persons. Integrity, makes sure data is trustworthy; and Availability, meaning that data can be accessed by approved users when it is needed; are three goals for data security. Non-repudiation in data security definition, is a device/service that shows where the data originated from and the proof of integrity. == Technologies == === Disk encryption === Disk encryption refers to encryption technology that encrypts data on a hard disk drive. It takes data from a storage device and coverts it into an unreadable format. Disk encryption typically takes form in either software (see disk encryption software) or hardware (see disk encryption hardware) which can be used together. Disk encryption is often referred to as on-the-fly encryption (OTFE) or transparent encryption. Full disk encryption encrypts each individual sector of a disk volume. Files and user data are encrypted to hinder unauthorized users from accessing without a decryption key. A diversifier permits a plaintext of a specific disk sector to be encrypted into different ciphertexts, which does not require additional storage, such as an initialization vector (IV) or message authentication code (MAC). === Software versus hardware-based mechanisms for protecting data === Software-based security solutions encrypt the data to protect it from theft. However, a malicious program or a hacker could corrupt the data to make it unrecoverable, making the system unusable. Hardware-based security solutions prevent read and write access to data, which provides very strong protection against tampering and unauthorized access. Hardware-based security or assisted computer security offers an alternative to software-only computer security. Security tokens such as those using PKCS#11 or a mobile phone may be more secure due to the physical access required in order to be compromised. Access is enabled only when the token is connected and the correct PIN is entered (see two-factor authentication). However, dongles can be used by anyone who can gain physical access to it. Newer technologies in hardware-based security solve this problem by offering full proof of security for data. Working off hardware-based security: A hardware device allows a user to log in, log out and set different levels through manual actions. Many devices use biometric technology to prevent malicious users from logging in, logging out, and changing privilege levels. The current state of a user of the device is read by controllers in peripheral devices such as hard disks. Illegal access by a malicious user or a malicious program is interrupted based on the current state of a user by hard disk and DVD controllers making illegal access to data impossible. Hardware-based access control is more secure than the protection provided by the operating systems as operating systems are vulnerable to malicious attacks by viruses and hackers. The data on hard disks can be corrupted after malicious access is obtained. With hardware-based protection, the software cannot manipulate the user privilege levels. A hacker or a malicious program cannot gain access to secure data protected by hardware or perform unauthorized privileged operations. This assumption is broken only if the hardware itself is malicious or contains a backdoor. The hardware protects the operating system image and file system privileges from being tampered with. Therefore, a completely secure system can be created using a combination of hardware-based security and secure system administration policies. === Backups === Backup is the process of reproducing copies of essential data and storing in a separate, secured place. It is used to ensure data that is lost can be recovered from another source. Backups contains a minimum of one copy of the data that requires preservation. It is considered essential to keep a backup of any data in most industries and the process is recommended for any files of importance to a user. There are 3 types of backups; full backups, incremental backups, and differential backups. Full backups secure all data from a production system, such as a server, database, or other connected data source. It is impossible to lose all data in a full backup if a breach or corruption were to occur. Full backups require a significantly large amount of time to back up and may be time-consuming taking hours to days to complete. Incremental backups only secures changed data since last backup. While all backups are done in full backups, incremental backups only save data that is recently or frequently changed. Incremental backups require lower storage costs making it a prominent solution for growing datasets. === Data Privacy === Data privacy (or information privacy) is the right for individual's data to be secured to obstruct the use of unauthorized access. It gives individuals control over their data and how it can be shared to third parties. The U.S Privacy Protection Law (see Privacy laws of the United States) requires organizations to inform individuals of how their data is collected and when a data breach occurs. By implementing an encryption, it ensures that private data is unreadable to cybercriminals. === Data masking === Data masking of structured data is the process of obscuring (masking) specific data within a database table or cell to ensure that data security is maintained and sensitive information is not exposed to unauthorized personnel. This may include masking the data from users (for example so banking customer representatives can only see the last four digits of a customer's national identity number), developers (who need real production data to test new software releases but should not be able to see sensitive financial data), outsourcing vendors, etc. Data masking is a form of encryption, as it obscures data by modifying particular letters and numbers to keep data concealed and protected from potential hackers. The individual that has access to the code that decrypts the replaced characters are the only ones that can uncover the data. === Data erasure === Data erasure (or data deletion, data destruction) is a method of software-based overwriting that permanently clears all electronic data residing on a hard drive or other digital media to ensure that no sensitive data is lost when an asset is retired or reused. Article 17: Right to be Forgotten states that users have the right to permanently remove all of their private information from their old devices/services to give people more control over their data. Users are able to switch between devices efficiently. == Threats == === Malware === Malware (or malicious software) is designed to destroy, corrupt or gain unauthorized access to a computer for the purpose of stealing, or destroying data. Hackers who use malware typically utilize many types of malware, which includes computer virus, computer worms, ransomware, spyware and Trojan horse to create a vast system of disruption and cause easy data theft. One of the victims of the vast system of disruption includes healthcare workers, who are targeted by compromised systems by infections and then having their data attacked. === Phishing === Phishing is a type of scam that allows hackers to hoax people using psychological and social engineering (using human emotions such as their trust and fear) tactics into giving personal data through emails and messages, and install computer viruses if the individual were to click on a malicious link unknowingly. Attackers are able to create websites that are very similar to original websites, which makes it difficult to detect a fake website, causing individuals to fall for giving in information. Phishing attackers use human emotion to exploit them, such as making them feel fear, urgency, sympathy with the message

    Read more →
  • Caste census

    Caste census

    Caste census is a proposed census to be conducted in India by the Central Government of India. The proposed census was decided under the leadership of Prime Minister Narendra Modi by the cabinet committee of political affairs (CCPA) on 30 April 2025. It has been decided that a caste enumeration should be included with the forthcoming census. The exact time has not been declared yet. It is unclear that when the next census will be held. The decision of the cabinet was announced by the Central Railway Minister Ashwini Vaishnaw. It has been seen as a step that would help in drafting "equitable and targeted" policies by the present Central Government of India led by the Bhartiya Janta Party in India. The Central Home Minister Amit Shah has described the decision as a "historic decision". He has also described that the historic decision as “committed to social justice”. The leader of opposition Rahul Gandhi has welcomed the decision. He said "We have shown we can pressure govt" He has demanded a clear timeline for its completion. He has called it "The first step towards deep social reform". == Description == The caste census is a systematic recording of individuals’ caste identities during the nationwide census in the country. The Central minister Ashwini Vaishnaw expressed his view on the proposed census and said that it would "strengthen the social and economic structure of our society while the nation continues to progress”. The Caste census will happen for the first time in 100 years by the Central Government of India. It will be the part of the upcoming census in India. == History == According to Peabody, the first systematic caste-wise enumeration of households in the Indian subcontinent was conducted between 1658 and 1664 across seven districts of the then Marwar Kingdom, including Jodhpur city which was its capital. It was conducted by the then home minister Munhata Nainsi of the kingdom for the purpose of tax documentation. It was not to for classification of society or creation of social hierarchies but solving a tax related problem. During the period of the British rule in India, caste census was included in the decadal censuses to categorise the population by caste, religion and occupation. In 1871–72, the first detailed caste census was conducted by the government of British Raj in India. It was practiced between the period 1881 to 1931. The last caste census was conducted in the year 1931 in which 4,147 castes were recorded. The largest population in the whole of British India (including Pakistan and Bangladesh) was of Brahmins. The population of Brahmins was recorded more than 1.5 crores. After Brahmin community, the second place was of Jatav (Chamar)community. The population of Jatav was a little more than 1.23 crores. On the third place were Rajputs. The population of Rajputs was 81 lakhs. The Rajput caste was followed by the Kunbi caste of Maharashtra. The population of Kunbi caste was 64 lakhs and 34 thousands. The Kunbi caste was followed by Yadav (Ahir) caste. The population of Yadav (Ahir) community was 56 lakhs and 82 thousands. The Yadav (Ahir) caste was followed by Teli community. The population of Teli community was 42 lakhs and 58 thousands. The Teli community was followed by Gwala community. The population of the Gwala community was 40 lakhs. After the independence of India, the caste enumeration was stopped by the newly independent Government of India led by the prime minister Pandit Jawahar Lal Nehru in 1951. The caste enumeration was stopped to avoid reinforcing social divisions in the Indian society. But, there was an exception made for the enumeration of the Scheduled Castes (SCs) and Scheduled Tribes (STs) in the decadal censuses. Therefore, the enumeration of the Scheduled Castes and the Scheduled Tribes is being conducted in every census since 1951. In 1961, the Government of India permitted states for conducting their own surveys to compile OBC lists, but national caste census was not conducted.

    Read more →
  • SAP StreamWork

    SAP StreamWork

    SAP StreamWork is an enterprise collaboration tool from SAP SE released in March 2010, and discontinued in December 2015. StreamWork allowed real-time collaboration like Google Wave, but focused on business activities such as analyzing data, planning meetings, and making decisions. It incorporated technology from Box.net and Evernote to allow users to connect to online files and documents, and document-reader technology from Scribd allowed users to view documents directly within its environment. StreamWork supported the OpenSocial set of application programming interfaces (APIs), allowing it to connect to tools built by third-party developers, such as Google Docs. A version of StreamWork intended for large enterprises used a virtual appliance based on Novell's SUSE Linux Enterprise to connect it to business systems, including those from SAP.

    Read more →
  • Sentiment analysis

    Sentiment analysis

    Sentiment analysis (also known as opinion mining) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models, such as RoBERTa, more difficult data domains can be analyzed, e.g., news texts where authors typically express their opinion/sentiment less explicitly. == Types == A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level—whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as enjoyment, anger, disgust, sadness, fear, and surprise. Precursors to sentimental analysis include the General Inquirer, which provided hints toward quantifying patterns in text and, separately, psychological research that examined a person's psychological state based on analysis of their verbal behavior. Subsequently, the method described in a patent by Volcani and Fogel, looked specifically at sentiment and identified individual words and phrases in text with respect to different emotional scales. A current system based on their work, called EffectCheck, presents synonyms that can be used to increase or decrease the level of evoked emotion in each scale. Many other subsequent efforts were less sophisticated, using a mere polar view of sentiment, from positive to negative, such as work by Turney, and Pang who applied different methods for detecting the polarity of product reviews and movie reviews respectively. This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder among others: Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predict star ratings on either a 3- or a 4-star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). First steps to bringing together various approaches—learning, lexical, knowledge-based, etc.—were taken in the 2004 AAAI Spring Symposium where linguists, computer scientists, and other interested researchers first aligned interests and proposed shared tasks and benchmark data sets for the systematic computational research on affect, appeal, subjectivity, and sentiment in text. Even though in most statistical classification methods, the neutral class is ignored under the assumption that neutral texts lie near the boundary of the binary classifier, several researchers suggest that, as in every polarity problem, three categories must be identified. Moreover, it can be proven that specific classifiers such as the Max Entropy and SVMs can benefit from the introduction of a neutral class and improve the overall accuracy of the classification. There are in principle two ways for operating with a neutral class. Either, the algorithm proceeds by first identifying the neutral language, filtering it out and then assessing the rest in terms of positive and negative sentiments, or it builds a three-way classification in one step. This second approach often involves estimating a probability distribution over all categories (e.g. naive Bayes classifiers as implemented by the NLTK). Whether and how to use a neutral class depends on the nature of the data: if the data is clearly clustered into neutral, negative and positive language, it makes sense to filter the neutral language out and focus on the polarity between positive and negative sentiments. If, in contrast, the data are mostly neutral with small deviations towards positive and negative affect, this strategy would make it harder to clearly distinguish between the two poles. A different method for determining sentiment is the use of a scaling system whereby words commonly associated with having a negative, neutral, or positive sentiment are given an associated number on a −10 to +10 scale (most negative up to most positive) or simply from 0 to a positive upper limit such as +4. This makes it possible to adjust the sentiment of a given term relative to its environment (usually on the level of the sentence). When a piece of unstructured text is analyzed using natural language processing, each concept in the specified environment is given a score based on the way sentiment words relate to the concept and its associated score. This allows movement to a more sophisticated understanding of sentiment, because it is now possible to adjust the sentiment value of a concept relative to modifications that may surround it. Words, for example, that intensify, relax or negate the sentiment expressed by the concept can affect its score. Alternatively, texts can be given a positive and negative sentiment strength score if the goal is to determine the sentiment in a text rather than the overall polarity and strength of the text. There are various other types of sentiment analysis, such as aspect-based sentiment analysis, grading sentiment analysis (positive, negative, neutral), multilingual sentiment analysis and detection of emotions. === Subjectivity/objectivity identification === This task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification. The subjectivity of words and phrases may depend on their context and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Moreover, as mentioned by Su, results are largely dependent on the definition of subjectivity used when annotating texts. However, Pang showed that removing objective sentences from a document before classifying its polarity helped improve performance. Subjective and objective identification, emerging subtasks of sentiment analysis to use syntactic, semantic features, and machine learning knowledge to identify if a sentence or document contains facts or opinions. Awareness of recognizing factual and opinions is not recent, having possibly first presented by Carbonell at Yale University in 1979. The term objective refers to the incident carrying factual information. Example of an objective sentence: 'To be elected president of the United States, a candidate must be at least thirty-five years of age.' The term subjective describes the incident contains non-factual information in various forms, such as personal opinions, judgment, and predictions, also known as 'private states'. In the example down below, it reflects a private states 'We Americans'. Moreover, the target entity commented by the opinions can take several forms from tangible product to intangible topic matters stated in Liu (2010). Furthermore, three types of attitudes were observed by Liu (2010), 1) positive opinions, 2) neutral opinions, and 3) negative opinions. Example of a subjective sentence: 'We Americans need to elect a president who is mature and who is able to make wise decisions.' This analysis is a classification problem. Each class's collections of words or phrase indicators are defined for to locate desirable patterns on unannotated text. For subjective expression, a different word list has been created. Lists of subjective indicators in words or phrases have been developed by multiple researchers in the linguist and natural language processing field states in Riloff et al. (2003). A dictionary of extraction rules has to be created for measuring given expressions. Over the years, in subjective detection, the features extraction progression from curating features by hand to automated features learning. At the moment, automated learning methods can further separate into supervised and unsupervised machine learning. Patterns extraction with machine learning process annotated and unannotated text have been explored extensively by academic researchers. However, researchers recognized several challenges in developing fixed sets of rules for expressions respectably. Much of the challenges in rule development stems from the nature of textual information. Six challenges have been recognized by several researchers: 1) metaphorical expressions, 2) discrepancies in writings, 3) context-sensitive, 4) represented words with fewer usages, 5) time-sensitive, and 6) ever-growing volume. Metaphorical expressions. The text contains metaphoric expression may impact on the performance on the extraction. Besides, metaphors take in different forms, which may have been contribu

    Read more →
  • Knapsack cryptosystems

    Knapsack cryptosystems

    Knapsack cryptosystems are cryptosystems whose security is based on the hardness of solving the knapsack problem. They remain quite unpopular because simple versions of these algorithms have been broken for several decades. However, that type of cryptosystem is a good candidate for post-quantum cryptography. The most famous knapsack cryptosystem is the Merkle-Hellman Public Key Cryptosystem, one of the first public key cryptosystems, published the same year as the RSA cryptosystem. However, this system has been broken by several attacks: one from Shamir, one by Adleman, and the low density attack. However, there exist modern knapsack cryptosystems that are considered secure so far: among them is Nasako-Murakami 2006. Knapsack cryptosystems, when not subject to classical cryptoanalysis, are believed to be difficult even for quantum computers. That is not the case for systems that rely on factoring large integers, like RSA, or computing discrete logarithms, like ECDSA, problems solved in polynomial time with Shor's algorithm.

    Read more →