AI Chatbot Companion

AI Chatbot Companion — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Automatic scorer

    Automatic scorer

    An automatic scorer is the computerized scoring system to keep track of scoring in ten-pin bowling. It was introduced en masse in bowling alleys in the 1970s and combined with mechanical pinsetters to detect overturned pins. By eliminating the need for manual score-keeping, these systems have introduced new bowlers into the game who otherwise would not participate because they had to count the score themselves, as many do not understand the mathematical formula involved in bowler scoring. At first, people were skeptical about whether a computer could keep an accurate score. In the twenty-first century, automatic scorers are used in most bowling centers around the world. The three manufacturers of these specialty computers have been Brunswick Bowling, AMF Bowling (later QubicaAMF), and RCA. == History == Automatic equipment is considered a cornerstone of the modern bowling center. The traditional bowling center of the early 20th century was advanced in automation when the pinsetter person ("pin boy"), who set back up by hand the bowled down pins, was replaced by a machine that automatically replaced the pins in their proper play positions. This machine came out in the 1950s. A detection system was developed from the pinsetter mechanism in the 1960s that could tell which pins had been knocked down, and that information could be transferred to a digital computer. Automatic electronic scoring was first conceived by Robert Reynolds, who was described by a newspaper story at the time as "a West Coast electronics calculator expert." He worked with the technical staff of Brunswick Bowling to develop it. The goal was realized in the late 1960s when a specialized computer was designed for the purpose of automatic scorekeeping for bowling. The field test for the automatic scorer took place at Village Lanes bowling center, Chicago in 1967. The scoring machine received approval for official use by the American Bowling Congress in August of that year. They were first used in national official league gaming on October 10, 1967. In November, Brunswick announced that they were accepting orders for the new digital computer, which cost around $3,000 per bowling lane. Bowling centers that installed these new automatic scoring devices in the 1970s charged a ten cents extra per line of scoring for the convenience. == Description == Each Automatic Scorer computer unit kept score for four lanes. It had two bowler identification panels serving two lanes each. The bowler pushed it into his named position when his turn came up so the computer knew who was bowling and score accordingly. After the bowler rolled the bowling ball down the lane and knocked down pins, the pinsetter detected which pins were down and relayed this information back to the computer for scoring. The result was then printed on a scoresheet and projected overhead onto a large screen for all to see. The Automatic Scorer digital computer was mathematically accurate, however the detection system at the pinsetter mechanism sometimes reported the wrong number of pins knocked down. The computer could be corrected manually for any errors in the system; similarly, human errors, such as neglecting to move the bowler identification mechanism, could be corrected for by manual action. The scorer could take into account bowlers' handicaps and could adjust for late-arriving bowlers. The automatic scorer is directly connected to the foul detection unit. As a result, foul line violations are automatically scored. Brunswick had put ten years of research and development into the Automatic Scorer, and by 1972 there were over 500 of these computers installed in bowling centers around the world. AMF Bowling, competitor to Brunswick, entered into the automatic scorer computer field during the 1970s and their systems were installed into their brand of bowling centers. By 1974, RCA was also making these computers for automatic scoring. == Reception and further developments == The purposes of the computerized scoring were to avoid errors by human scorers and to prevent cheating. It had the side benefit of speeding up the progress of the game and introducing new bowlers to the game. Score-keeping for bowling is based on a formula that many new to bowling were not familiar with and thought difficult to learn. These casual bowlers unfamiliar with the formula thought the scores given by the computers were confusing. Some bowlers were not comfortable with automatic scorers when they were introduced in the 1970s, so kept score using the traditional method on paper score sheets. The introduction of this device increased the popularity of the sport. Automatic scorers came to be considered a normal part of modern bowling installations worldwide, with owners and managers saying that bowlers expect such equipment to be present in bowling establishments and that business increased following their introduction. Brunswick introduced a color television style automatic scorer in 1983. Bowling center owners could use these style automatic scorers for advertising, management, videos, and live television. By the 2010s, these types of electronic visual displays could show bowler avatars and social media connections to publicize the bowlers' scores. Some are capable of being extended entertainment systems of games for children and adults. Some scoring systems support variations on traditional bowling, such as different kinds of bingo games where certain pins have to be knocked down at certain times or practice regimes where certain spares have to be accomplished. By this point, QubicaAMF Worldwide, an outgrowth of AMF, was one of the leading providers of bowling scoring equipment.

    Read more →
  • Pivot to video

    Pivot to video

    "Pivot to video" is a phrase referring to the trend, starting in 2015, of media publishing companies cutting staff resources for written content (generally published on their own web sites) in favor of short-form video content (often published on third-party platforms such as Facebook, Instagram, Twitter, YouTube, Snapchat, and TikTok). These moves were generally presented by publishers as a response to changes in social media traffic or to changes in the media consumption habits of younger audiences. However, many media commentators have argued that this shift was primarily motivated by advertising revenue, and that only advertisers, not consumers, prefer video over text. The pivot's contribution to job loss in the media industry has given the phrase "pivot to video" an association with decline, especially in a business context. Commentators have also noted a lack of transparency and accuracy in the viewership metrics reported by platforms such as Facebook, pointing out that abrupt shifts in platforms' proprietary algorithms can have devastating effects on publishers' viewership, traffic, and revenue. Following a scandal in which Facebook revealed it had artificially inflated numbers to its advertisers about how long viewers watched ads, many journalists and industry analysts concluded that the shift to video was based on such misleading or inaccurate metrics, which created a false impression that there was customer demand for additional video content. == History == Streaming media technology has been available since the early 1990s, though it was relatively low-fidelity and not widely available until the mid-2000s. In 2007, traditional media publishers including the New York Times, Washington Post and Time Inc. created new divisions to develop web videos, and Facebook launched its video platform. Twitter purchased micro-video service Vine in October 2012, began adding native video streaming in late 2014, and acquired video-streaming service Periscope in January 2015. An August 2014 profile on BuzzFeed noted the publisher's large investment into video production, and observed that "the future of BuzzFeed may not even be on BuzzFeed.com. One of the company’s nascent ideas, BuzzFeed Distributed, will be a team of 20 people producing content that lives entirely on other popular platforms, like Tumblr, Instagram or Snapchat." On 7 January 2015, Facebook issued a statement about "the shift to video," reporting that "since June 2014, Facebook has averaged more than 1 billion video views every day." Media critic John Herrman argued that "What the shift to Facebook video means is that Facebook is more interested in hosting the things media companies make than just spreading them, that it views links to outside pages as a problem to be solved, and that it sees Facebook-hosted video as an example of the solution." In February 2015, the digital video-journalism publisher NowThis announced that it would operate without a home page, producing content to be published directly on social media platforms. In April 2016, Mashable fired much of its editorial staff, attempting to pivot away from hard news coverage while "growing Mashable across every platform" and doubling down on branded content and video. By December 2017, following a sale to Ziff Davis, Mashable retreated from this focus on video; Bernard Gershon, president of GershonMedia, said that the announcement of many such "pivots" were actually aimed primarily at investors. By 2017, "advertiser interest in video [was] insatiable... Any CFO is going to say 'How can we get more video?'" according to an executive of the publishers' trade association Digital Content Next. Publishers such as Vanity Fair, the Washington Post, and Sports Illustrated began adapting their own articles into cheap video content, either dictated by a newsreader or animated as a slideshow with captions, which could be shared on social platforms or even played alongside the articles themselves. June 2017 saw numerous high-profile pivots to video. Vocativ laid off at least 20 staff, including its entire newsroom, explaining that "as the industry evolves, we are undertaking a strategic shift to focus exclusively on video content that will be distributed via social media and other platforms." Fox Sports eliminated its entire writing staff to focus on creating "premium video across all platforms." And MTV News announced a restructuring that would cut its writing team. Less than two years earlier, MTV News had hired Grantland co-founder Dan Fierman to lead a significant investment in "longform" political and cultural reporting, but Fierman left in April 2017, and in June MTV announced it was "shifting resources into short-form video content more in line with young people's media consumption habits." In July, Vice Media laid off at least 60 employees, including the editor-in-chief of Vice Sports, while expanding video production. August 2017 saw Mic cut ten writers and directed the remainder of the newsroom to generate videos for social platforms. CEO Chris Altchek said "When you think about how many hours people spend watching video versus reading, the audience has already spoken." The move was ultimately unsuccessful, and Mic laid off the majority of its staff a year later before being sold to Bustle Media Group for a fraction of its former value. In September 2017, the for-profit wiki-hosting company Fandom began adding commercially produced videos to its otherwise user-generated wiki subdomains, explicitly citing the need to "keep up with user and advertiser expectations" by "diversifying our content," claiming without substantiation that "consumer patterns are changing," necessitating the addition of "complementary video" to accommodate that supposed need. Objection to the content in these videos and its sharp contrast against the content of the wiki sites to which they were applied led to vocal user backlash, leading Fandom CCO Dorth Raphaely to offer the following non-committal response: "I agree that with these videos in particular we did not deliver the right type of content experience." Movie Pilot CEO Tobi Bauckhage explained his company's fall 2017 layoffs as part of moving "from a text-based publishing model to video... a reaction to the fact that Facebook has changed their algorithms in favor of video instead of referral traffic over the last 12 months and we were losing money in the publishing bit of our business." As part of the company's change in direction, the majority of its staff was laid off and its parent company was sold to Webedia. In November 2017, magazine publisher Condé Nast cut jobs, reduced the frequency of several magazines, and shut down the print edition of Teen Vogue, then invested significant new resources in video production, with a senior executive saying "In the next 24 months, I hope that video is half our business... It’s critical. It’s the macro trend of content consumption." In February 2018, Vox Media cut approximately 50 employees, primarily those assigned to "social video," as Vox CEO Jim Bankoff admitted that those efforts were not "viable audience or revenue growth drivers." In August 2020, Facebook Inc. (now Meta Platforms) pivoted Instagram to video in an effort to replicate the success of TikTok and appeal to a younger audience, introducing "reels" as a form of video and promoting them aggressively. Reels accounted more than half the 20 most-viewed posts on Facebook; however, most of these reels were anonymous aggregations of content from TikTok. Elon Musk declared in early 2024 that X (formerly Twitter) was now a "video-first platform", which has been described by critics as a "pivot to video". == As euphemism == In 2017, Journalist Brian Feldman said that "'Pivoting to video' has become a business strategy for digital publishers common enough in recent months to be a kind of cliché — a slick way to describe something else: layoffs." In response, writers use the phrase as gallows humor shorthand for death or cancellation, as in "how do i tell my bf i want our relationship to pivot to video" (SkyNews' Mollie Goodfellow) or "Horse broke its leg, so we had to take it out back and help it 'pivot to video'" (blogger Anil Dash). == Facebook metrics controversy == In September 2016, Facebook admitted that it had reported artificially inflated numbers to its advertisers about how long viewers watched ads leading to an overestimation of 60-80%. Plaintiffs in a later court case allege the discrepancy was as high as 150-900%. Facebook apologized in an official statement and in multiple staff appearances at New York Advertising Week. Two months later, Facebook disclosed additional discrepancies in audience metrics. In October 2018, a California federal court unsealed the text of a class action lawsuit filed by advertisers against Facebook, alleging that Facebook had known since 2015 that its viewership numbers were highly inflated, that internal records showed it "was far from an hon

    Read more →
  • Data grid

    Data grid

    A data grid is an architecture or set of services that allows users to access, modify and transfer extremely large amounts of geographically distributed data for research purposes. Data grids make this possible through a host of middleware applications and services that pull together data and resources from multiple administrative domains and then present it to users upon request. The data in a data grid can be located at a single site or multiple sites where each site can be its own administrative domain governed by a set of security restrictions as to who may access the data. Likewise, multiple replicas of the data may be distributed throughout the grid outside their original administrative domain and the security restrictions placed on the original data for who may access it must be equally applied to the replicas. Specifically developed data grid middleware is what handles the integration between users and the data they request by controlling access while making it available as efficiently as possible. == Middleware == Middleware provides all the services and applications necessary for efficient management of datasets and files within the data grid while providing users quick access to the datasets and files. There is a number of concepts and tools that must be available to make a data grid operationally viable. However, at the same time not all data grids require the same capabilities and services because of differences in access requirements, security and location of resources in comparison to users. In any case, most data grids will have similar middleware services that provide for a universal name space, data transport service, data access service, data replication and resource management service. When taken together, they are key to the data grids functional capabilities. === Universal namespace === Since sources of data within the data grid will consist of data from multiple separate systems and networks using different file naming conventions, it would be difficult for a user to locate data within the data grid and know they retrieved what they needed based solely on existing physical file names (PFNs). A universal or unified name space makes it possible to create logical file names (LFNs) that can be referenced within the data grid that map to PFNs. When an LFN is requested or queried, all matching PFNs are returned to include possible replicas of the requested data. The end user can then choose from the returned results the most appropriate replica to use. This service is usually provided as part of a management system known as a Storage Resource Broker (SRB). Information about the locations of files and mappings between the LFNs and PFNs may be stored in a metadata or replica catalogue. The replica catalogue would contain information about LFNs that map to multiple replica PFNs. === Data transport service === Another middleware service is that of providing for data transport or data transfer. Data transport will encompass multiple functions that are not just limited to the transfer of bits, to include such items as fault tolerance and data access. Fault tolerance can be achieved in a data grid by providing mechanisms that ensures data transfer will resume after each interruption until all requested data is received. There are multiple possible methods that might be used to include starting the entire transmission over from the beginning of the data to resuming from where the transfer was interrupted. As an example, GridFTP provides for fault tolerance by sending data from the last acknowledged byte without starting the entire transfer from the beginning. The data transport service also provides for the low-level access and connections between hosts for file transfer. The data transport service may use any number of modes to implement the transfer to include parallel data transfer where two or more data streams are used over the same channel or striped data transfer where two or more steams access different blocks of the file for simultaneous transfer to also using the underlying built-in capabilities of the network hardware or specifically developed protocols to support faster transfer speeds. The data transport service might optionally include a network overlay function to facilitate the routing and transfer of data as well as file I/O functions that allow users to see remote files as if they were local to their system. The data transport service hides the complexity of access and transfer between the different systems to the user so it appears as one unified data source. === Data access service === Data access services work hand in hand with the data transfer service to provide security, access controls and management of any data transfers within the data grid. Security services provide mechanisms for authentication of users to ensure they are properly identified. Common forms of security for authentication can include the use of passwords or Kerberos (protocol). Authorization services are the mechanisms that control what the user is able to access after being identified through authentication. Common forms of authorization mechanisms can be as simple as file permissions. However, need for more stringent controlled access to data is done using Access Control Lists (ACLs), Role-Based Access Control (RBAC) and Tasked-Based Authorization Controls (TBAC). These types of controls can be used to provide granular access to files to include limits on access times, duration of access to granular controls that determine which files can be read or written to. The final data access service that might be present to protect the confidentiality of the data transport is encryption. The most common form of encryption for this task has been the use of SSL while in transport. While all of these access services operate within the data grid, access services within the various administrative domains that host the datasets will still stay in place to enforce access rules. The data grid access services must be in step with the administrative domains access services for this to work. === Data replication service === To meet the needs for scalability, fast access and user collaboration, most data grids support replication of datasets to points within the distributed storage architecture. The use of replicas allows multiple users faster access to datasets and the preservation of bandwidth since replicas can often be placed strategically close to or within sites where users need them. However, replication of datasets and creation of replicas is bound by the availability of storage within sites and bandwidth between sites. The replication and creation of replica datasets is controlled by a replica management system. The replica management system determines user needs for replicas based on input requests and creates them based on availability of storage and bandwidth. All replicas are then cataloged or added to a directory based on the data grid as to their location for query by users. In order to perform the tasks undertaken by the replica management system, it needs to be able to manage the underlying storage infrastructure. The data management system will also ensure the timely updates of changes to replicas are propagated to all nodes. ==== Replication update strategy ==== There are a number of ways the replication management system can handle the updates of replicas. The updates may be designed around a centralized model where a single master replica updates all others, or a decentralized model, where all peers update each other. The topology of node placement may also influence the updates of replicas. If a hierarchy topology is used then updates would flow in a tree like structure through specific paths. In a flat topology it is entirely a matter of the peer relationships between nodes as to how updates take place. In a hybrid topology consisting of both flat and hierarchy topologies updates may take place through specific paths and between peers. ==== Replication placement strategy ==== There are a number of ways the replication management system can handle the creation and placement of replicas to best serve the user community. If the storage architecture supports replica placement with sufficient site storage, then it becomes a matter of the needs of the users who access the datasets and a strategy for placement of replicas. There have been numerous strategies proposed and tested on how to best manage replica placement of datasets within the data grid to meet user requirements. There is not one universal strategy that fits every requirement the best. It is a matter of the type of data grid and user community requirements for access that will determine the best strategy to use. Replicas can even be created where the files are encrypted for confidentiality that would be useful in a research project dealing with medical files. The following section contains several strategies for replica placement. ===== Dynamic replication ===== Dynam

    Read more →
  • Internet

    Internet

    The Internet (or internet) is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a network of networks that comprises private, public, academic, business, and government networks of local to global scope, linked by electronic, wireless, and optical networking technologies. The Internet carries a vast range of information services and resources, such as the interlinked hypertext documents and applications of the World Wide Web (WWW), electronic mail, discussion groups, internet telephony, streaming media and file sharing. Most traditional communication media, including telephone, radio, television, paper mail, newspapers, and print publishing, have been transformed by the Internet, giving rise to new media such as email, online music, digital newspapers, news aggregators, and audio and video streaming websites. The Internet has enabled and accelerated new forms of personal interaction through instant messaging, Internet forums, and social networking services. Online shopping has also grown to occupy a significant market across industries, enabling firms to extend brick and mortar presences to serve larger markets. Business-to-business and financial services on the Internet affect supply chains across entire industries. The origins of the Internet date back to research that enabled the time-sharing of computer resources, the development of packet switching, and the design of computer networks for data communication. The set of communication protocols to enable internetworking on the Internet arose from research and development commissioned in the 1970s by the Defense Advanced Research Projects Agency (DARPA) of the United States Department of Defense in collaboration with universities and researchers across the United States, United Kingdom and France. The Internet has no single centralized governance in either technological implementation or policies for access and usage. Each constituent network sets its own policies. The overarching definitions of the two principal name spaces on the Internet, the Internet Protocol address (IP address) space and the Domain Name System (DNS), are directed by a maintainer organization, the Internet Corporation for Assigned Names and Numbers (ICANN). The technical underpinning and standardization of the core protocols is an activity of the non-profit Internet Engineering Task Force (IETF). == Terminology == The word internetted was used as early as 1849, meaning interconnected or interwoven. The word Internet was used in 1945 by the United States War Department in a radio operator's manual, and in 1974 as the shorthand form of Internetwork. Today, the term Internet most commonly refers to the global system of interconnected computer networks, though it may also refer to any group of smaller networks. The word Internet may be capitalized as a proper noun, although this is becoming less common. This reflects the tendency in English to capitalize new terms and move them to lowercase as they become familiar. The word is sometimes still capitalized to distinguish the global internet from smaller networks, though many publications, including the AP Stylebook since 2016, recommend the lowercase form in every case. In 2016, the Oxford English Dictionary found that, based on a study of around 2.5 billion printed and online sources, "Internet" was capitalized in 54% of cases. The terms Internet and World Wide Web are often used interchangeably; it is common to speak of "going on the Internet" when using a web browser to view web pages. However, the World Wide Web, or the Web, is only one of a large number of Internet services. It is the global collection of web pages, documents and other web resources linked by hyperlinks and URLs. == History == === 1960s === In the 1960s, computer scientists began developing systems for time-sharing of computer resources. J. C. R. Licklider proposed the idea of a universal network while working at Bolt Beranek & Newman and, later, leading the Information Processing Techniques Office at the Advanced Research Projects Agency (ARPA) of the United States Department of Defense. Research into packet switching, one of the fundamental Internet technologies, started in the work of Paul Baran at RAND in the early 1960s and, independently, Donald Davies at the United Kingdom's National Physical Laboratory in 1965. After the Symposium on Operating Systems Principles in 1967, packet switching from the proposed NPL network was incorporated into the design of the ARPANET, an experimental resource sharing network proposed by ARPA. ARPANET development began with two network nodes which were interconnected between the University of California, Los Angeles and the Stanford Research Institute on 29 October 1969. The third site was at the University of California, Santa Barbara, followed by the University of Utah. === 1970s === By the end of 1971, 15 sites were connected to the young ARPANET. Thereafter, the ARPANET gradually developed into a decentralized communications network, connecting remote centers and military bases in the United States. Other user networks and research networks, such as the Merit Network and CYCLADES, were developed in the late 1960s and early 1970s. Early international collaborations for the ARPANET were rare. Connections were made in 1973 to Norway (NORSAR and, later, NDRE) and to Peter Kirstein's research group at University College London, which provided a gateway to British academic networks, the first internetwork for resource sharing. ARPA projects, the International Network Working Group and commercial initiatives led to the development of various protocols and standards by which multiple separate networks could become a single network, or a network of networks. In 1974, Vint Cerf at Stanford University and Bob Kahn at DARPA published a proposal for "A Protocol for Packet Network Intercommunication". Cerf and his graduate students used the term internet as a shorthand for internetwork in RFC 675. The Internet Experiment Notes and later RFCs repeated this use. The work of Louis Pouzin and Robert Metcalfe had important influences on the resulting TCP/IP design. National PTTs and commercial providers developed the X.25 standard and deployed it on public data networks. === 1980s === The ARPANET initially served as a backbone for the interconnection of regional academic and military networks in the United States to enable resource sharing. Access to the ARPANET was expanded in 1981 when the National Science Foundation (NSF) funded the Computer Science Network (CSNET). In 1982, the Internet Protocol Suite (TCP/IP) was standardized, which facilitated worldwide proliferation of interconnected networks. TCP/IP network access expanded again in 1986 when the National Science Foundation Network (NSFNet) provided access to supercomputer sites in the United States for researchers, first at speeds of 56 kbit/s and later at 1.5 Mbit/s and 45 Mbit/s. The NSFNet expanded into academic and research organizations in Europe, Australia, New Zealand and Japan in 1988–89. Although other network protocols such as UUCP and PTT public data networks had global reach well before this time, this marked the beginning of the Internet as an intercontinental network. Commercial Internet service providers emerged in 1989 in the United States and Australia. The ARPANET was decommissioned in 1990. === 1990s === The linking of commercial networks and enterprises by the early 1990s, as well as the advent of the World Wide Web, marked the beginning of the transition to the modern Internet. Steady advances in semiconductor technology and optical networking created new economic opportunities for commercial involvement in the expansion of the network in its core and for delivering services to the public. In mid-1989, MCI Mail and Compuserve established connections to the Internet, delivering email and public access products to the half million users of the Internet. Just months later, on 1 January 1990, PSInet launched an alternate Internet backbone for commercial use; one of the networks that added to the core of the commercial Internet of later years. In March 1990, the first high-speed T1 (1.5 Mbit/s) link between the NSFNET and Europe was installed between Cornell University and CERN, allowing much more robust communications than were capable with satellites. Later in 1990, Tim Berners-Lee began writing WorldWideWeb, the first web browser, after two years of lobbying CERN management. By Christmas 1990, Berners-Lee had built all the tools necessary for a working Web: the HyperText Transfer Protocol (HTTP) 0.9, the HyperText Markup Language (HTML), the first Web browser (which was also an HTML editor and could access Usenet newsgroups and FTP files), the first HTTP server software (later known as CERN httpd), the first web server, and the first Web pages that described the project itself. In 1991 the

    Read more →
  • SimSimi

    SimSimi

    SimSimi is an artificial intelligence conversation program created in 2002 by ISMaker. It grows its artificial intelligence day by day assisted by a feature that allows users to teach it to respond correctly. SimSimi, pronounced as "shim-shimi", is from a Korean word simsim (심심) which means "bored". It has an application designed for Android, Windows Phone and iOS. The application was banned in Thailand in 2012 after users taught it to make responses containing profanity, and to criticise leading politicians. In April 2018, SimSimi was suspended in Brazil due to accusations of sending inappropriate messages, such as sexual language, bullying and even death threats, being labeled as "dangerous" mainly due to its popularity among children, and according to its developer, the suspension of the app in the country "was inevitable because the SimSimi app, at least in the last few days, had a significant negative social impact in Brazil.”

    Read more →
  • Forking lemma

    Forking lemma

    The forking lemma is any of a number of related lemmas in cryptography research. The lemma states that if an adversary (typically a probabilistic Turing machine), on inputs drawn from some distribution, produces an output that has some property with non-negligible probability, then with non-negligible probability, if the adversary is re-run on new inputs but with the same random tape, its second output will also have the property. This concept was first used by David Pointcheval and Jacques Stern in "Security proofs for signature schemes," published in the proceedings of Eurocrypt 1996. In their paper, the forking lemma is specified in terms of an adversary that attacks a digital signature scheme instantiated in the random oracle model. They show that if an adversary can forge a signature with non-negligible probability, then there is a non-negligible probability that the same adversary with the same random tape can create a second forgery in an attack with a different random oracle. The forking lemma was later generalized by Mihir Bellare and Gregory Neven. The forking lemma has been used and further generalized to prove the security of a variety of digital signature schemes and other random-oracle based cryptographic constructions. == Statement of the lemma == The generalized version of the lemma is stated as follows. Let A be a probabilistic algorithm, with inputs (x, h1, ..., hq; r) that outputs a pair (J, y), where r refers to the random tape of A (that is, the random choices A will make). Suppose further that IG is a probability distribution from which x is drawn, and that H is a set of size h from which each of the hi values are drawn according to the uniform distribution. Let acc be the probability that on inputs distributed as described, the J output by A is greater than or equal to 1. We can then define a "forking algorithm" FA that proceeds as follows, on input x: Pick a random tape r for A. Pick h1, ..., hq uniformly from H. Run A on input (x, h1, ..., hq; r) to produce (J, y). If J = 0, then return (0, 0, 0). Pick h'J, ..., h'q uniformly from H. Run A on input (x, h1, ..., hJ−1, h'J, ..., h'q; r) to produce (J', y'). If J' = J and hJ ≠ h'J then return (1, y, y'), otherwise, return (0, 0, 0). Let frk be the probability that FA outputs a triple starting with 1, given an input x chosen randomly from IG. Then frk ≥ acc ⋅ ( acc q − 1 h ) . {\displaystyle {\text{frk}}\geq {\text{acc}}\cdot \left({\frac {\text{acc}}{q}}-{\frac {1}{h}}\right).} === Intuition === The idea here is to think of A as running two times in related executions, where the process "forks" at a certain point, when some but not all of the input has been examined. In the alternate version, the remaining inputs are re-generated but are generated in the normal way. The point at which the process forks may be something we only want to decide later, possibly based on the behavior of A the first time around: this is why the lemma statement chooses the branching point (J) based on the output of A. The requirement that hJ ≠ h'J is a technical one required by many uses of the lemma. (Note that since both hJ and h'J are chosen randomly from H, then if h is large, as is usually the case, the probability of the two values not being distinct is extremely small.) === Example === For example, let A be an algorithm for breaking a digital signature scheme in the random oracle model. Then x would be the public parameters (including the public key) A is attacking, and hi would be the output of the random oracle on its ith distinct input. The forking lemma is of use when it would be possible, given two different random signatures of the same message, to solve some underlying hard problem. An adversary that forges once, however, gives rise to one that forges twice on the same message with non-negligible probability through the forking lemma. When A attempts to forge on a message m, we consider the output of A to be (J, y) where y is the forgery, and J is such that m was the Jth unique query to the random oracle (it may be assumed that A will query m at some point, if A is to be successful with non-negligible probability). (If A outputs an incorrect forgery, we consider the output to be (0, y).) By the forking lemma, the probability (frk) of obtaining two good forgeries y and y' on the same message but with different random oracle outputs (that is, with hJ ≠ h'J) is non-negligible when acc is also non-negligible. This allows us to prove that if the underlying hard problem is indeed hard, then no adversary can forge signatures. This is the essence of the proof given by Pointcheval and Stern for a modified ElGamal signature scheme against an adaptive adversary. == Known issues with application of forking lemma == The reduction provided by the forking lemma is not tight. Pointcheval and Stern proposed security arguments for Digital Signatures and Blind Signature using Forking Lemma. Claus P. Schnorr provided an attack on blind Schnorr signatures schemes, with more than p o l y l o g ( n ) {\displaystyle polylog(n)} concurrent executions (the case studied and proven secure by Pointcheval and Stern). A polynomial-time attack, for Ω ( n ) {\displaystyle \Omega (n)} concurrent executions, was shown in 2020 by Benhamouda, Lepoint, Raykova, and Orrù. Schnorr also suggested enhancements for securing blind signatures schemes based on discrete logarithm problem.

    Read more →
  • Bitmap index

    Bitmap index

    A bitmap index is a special kind of database index that uses bitmaps. Bitmap indexes have traditionally been considered to work well for low-cardinality columns, which have a modest number of distinct values, either absolutely, or relative to the number of records that contain the data. The extreme case of low cardinality is Boolean data (e.g., does a resident in a city have internet access?), which has two values, True and False. Bitmap indexes use bit arrays (commonly called bitmaps) and answer queries by performing bitwise logical operations on these bitmaps. Bitmap indexes have a significant space and performance advantage over other structures for query of such data. Their drawback is they are less efficient than the traditional B-tree indexes for columns whose data is frequently updated: consequently, they are more often employed in read-only systems that are specialized for fast query - e.g., data warehouses, and generally unsuitable for online transaction processing applications. Some researchers argue that bitmap indexes are also useful for moderate or even high-cardinality data (e.g., unique-valued data) which is accessed in a read-only manner, and queries access multiple bitmap-indexed columns using the AND, OR or XOR operators extensively. Bitmap indexes are also useful in data warehousing applications for joining a large fact table to smaller dimension tables such as those arranged in a star schema. == Example == Continuing the internet access example, a bitmap index may be logically viewed as follows: On the left, Identifier refers to the unique number assigned to each resident, HasInternet is the data to be indexed, the content of the bitmap index is shown as two columns under the heading bitmaps. Each column in the left illustration under the Bitmaps header is a bitmap in the bitmap index. In this case, there are two such bitmaps, one for "has internet" Yes and one for "has internet" No. It is easy to see that each bit in bitmap Y shows whether a particular row refers to a person who has internet access. This is the simplest form of bitmap index. Most columns will have more distinct values. For example, the sales amount is likely to have a much larger number of distinct values. Variations on the bitmap index can effectively index this data as well. We briefly review three such variations. Note: Many of the references cited here are reviewed at (John Wu (2007)). For those who might be interested in experimenting with some of the ideas mentioned here, many of them are implemented in open source software such as FastBit, the Lemur Bitmap Index C++ Library, the Roaring Bitmap Java library and the Apache Hive Data Warehouse system. == Compression == For historical reasons, bitmap compression and inverted list compression were developed as separate lines of research, and only later were recognized as solving essentially the same problem. Software can compress each bitmap in a bitmap index to save space. There has been considerable amount of work on this subject. Though there are exceptions such as Roaring bitmaps, Bitmap compression algorithms typically employ run-length encoding, such as the Byte-aligned Bitmap Code, the Word-Aligned Hybrid code, the Partitioned Word-Aligned Hybrid (PWAH) compression, the Position List Word Aligned Hybrid, the Compressed Adaptive Index (COMPAX), Enhanced Word-Aligned Hybrid (EWAH) and the COmpressed 'N' Composable Integer SEt (CONCISE). These compression methods require very little effort to compress and decompress. More importantly, bitmaps compressed with BBC, WAH, COMPAX, PLWAH, EWAH and CONCISE can directly participate in bitwise operations without decompression. This gives them considerable advantages over generic compression techniques such as LZ77. BBC compression and its derivatives are used in a commercial database management system. BBC is effective in both reducing index sizes and maintaining query performance. BBC encodes the bitmaps in bytes, while WAH encodes in words, better matching current CPUs. "On both synthetic data and real application data, the new word aligned schemes use only 50% more space, but perform logical operations on compressed data 12 times faster than BBC." PLWAH bitmaps were reported to take 50% of the storage space consumed by WAH bitmaps and offer up to 20% faster performance on logical operations. Similar considerations can be done for CONCISE and Enhanced Word-Aligned Hybrid. The performance of schemes such as BBC, WAH, PLWAH, EWAH, COMPAX and CONCISE is dependent on the order of the rows. A simple lexicographical sort can divide the index size by 9 and make indexes several times faster. The larger the table, the more important it is to sort the rows. Reshuffling techniques have also been proposed to achieve the same results of sorting when indexing streaming data. == Encoding == Basic bitmap indexes use one bitmap for each distinct value. It is possible to reduce the number of bitmaps used by using a different encoding method. For example, it is possible to encode C distinct values using log(C) bitmaps with binary encoding. This reduces the number of bitmaps, further saving space, but to answer any query, most of the bitmaps have to be accessed. This makes it potentially not as effective as scanning a vertical projection of the base data, also known as a materialized view or projection index. Finding the optimal encoding method that balances (arbitrary) query performance, index size and index maintenance remains a challenge. Without considering compression, Chan and Ioannidis analyzed a class of multi-component encoding methods and came to the conclusion that two-component encoding sits at the kink of the performance vs. index size curve and therefore represents the best trade-off between index size and query performance. == Binning == For high-cardinality columns, it is useful to bin the values, where each bin covers multiple values and build the bitmaps to represent the values in each bin. This approach reduces the number of bitmaps used regardless of encoding method. However, binned indexes can only answer some queries without examining the base data. For example, if a bin covers the range from 0.1 to 0.2, then when the user asks for all values less than 0.15, all rows that fall in the bin are possible hits and have to be checked to verify whether they are actually less than 0.15. The process of checking the base data is known as the candidate check. In most cases, the time used by the candidate check is significantly longer than the time needed to work with the bitmap index. Therefore, binned indexes exhibit irregular performance. They can be very fast for some queries, but much slower if the query does not exactly match a bin. == History == The concept of bitmap index was first introduced by Professor Israel Spiegler and Rafi Maayan in their research "Storage and Retrieval Considerations of Binary Data Bases", published in 1985. The first commercial database product to implement a bitmap index was Computer Corporation of America's Model 204. Patrick O'Neil published a paper about this implementation in 1987. This implementation is a hybrid between the basic bitmap index (without compression) and the list of Row Identifiers (RID-list). Overall, the index is organized as a B+tree. When the column cardinality is low, each leaf node of the B-tree would contain long list of RIDs. In this case, it requires less space to represent the RID-lists as bitmaps. Since each bitmap represents one distinct value, this is the basic bitmap index. As the column cardinality increases, each bitmap becomes sparse and it may take more disk space to store the bitmaps than to store the same content as RID-lists. In this case, it switches to use the RID-lists, which makes it a B+tree index. == In-memory bitmaps == One of the strongest reasons for using bitmap indexes is that the intermediate results produced from them are also bitmaps and can be efficiently reused in further operations to answer more complex queries. Many programming languages support this as a bit array data structure. For example, Java has the BitSet class and .NET have the BitArray class. Some database systems that do not offer persistent bitmap indexes use bitmaps internally to speed up query processing. For example, PostgreSQL versions 8.1 and later implement a "bitmap index scan" optimization to speed up arbitrarily complex logical operations between available indexes on a single table. For tables with many columns, the total number of distinct indexes to satisfy all possible queries (with equality filtering conditions on either of the fields) grows very fast, being defined by this formula: C n [ n 2 ] ≡ n ! ( n − [ n 2 ] ) ! [ n 2 ] ! {\displaystyle \mathbf {C} _{n}^{\left[{\frac {n}{2}}\right]}\equiv {\frac {n!}{\left(n-\left[{\frac {n}{2}}\right]\right)!\left[{\frac {n}{2}}\right]!}}} . A bitmap index scan combines expressions on different indexes, thus requiring only one index per column t

    Read more →
  • Application delivery network

    Application delivery network

    An application delivery network (ADN) is a suite of technologies that, when deployed together, provide availability, security, visibility, and acceleration for Internet applications such as websites. ADN components provide supporting functionality that enables website content to be delivered to visitors and other users of that website, in a fast, secure, and reliable way. Gartner defines application delivery networking as the combination of WAN optimization controllers (WOCs) and application delivery controllers (ADCs). At the data center end of an ADN is the ADC, an advanced traffic management device that is often also referred to as a web switch, content switch, or multilayer switch, the purpose of which is to distribute traffic among a number of servers or geographically dislocated sites based on application specific criteria. In the branch office portion of an ADN is the WAN optimization controller, which works to reduce the number of bits that flow over the network using caching and compression, and shapes TCP traffic using prioritization and other optimization techniques. Some WOC components are installed on PCs or mobile clients, and there is typically a portion of the WOC installed in the data center. Application delivery networks are also offered by some CDN vendors. The ADC, one component of an ADN, evolved from layer 4-7 switches in the late 1990s when it became apparent that traditional load balancing techniques were not robust enough to handle the increasingly complex mix of application traffic being delivered over a wider variety of network connectivity options. == Application delivery techniques == The Internet was designed according to the end-to-end principle. This principle keeps the core network relatively simple and moves the intelligence as much as possible to the network end-points: the hosts and clients. An Application Delivery Network (ADN) enhances the delivery of applications across the Internet by employing a number of optimization techniques. Many of these techniques are based on established best-practices employed to efficiently route traffic at the network layer including redundancy and load balancing In theory, an Application Delivery Network (ADN) is closely related to a content delivery network. The difference between the two delivery networks lies in the intelligence of the ADN to understand and optimize applications, usually referred to as application fluency. Application Fluent Network (AFN) is based on the concept of Application Fluency to refer to WAN optimization techniques applied at Layer Four to Layer Seven of the OSI model for networks. Application Fluency implies that the network is fluent or intelligent in understanding and being able to optimize delivery of each application. Application Fluent Network is an addition of SDN capabilities. The acronym 'AFN' is used by Alcatel-Lucent Enterprise to refer to an Application Fluent Network. Application delivery uses one or more layer 4–7 switches, also known as a web switch, content switch, or multilayer switch to intelligently distribute traffic to a pool, also known as a cluster or farm, of servers. The application delivery controller (ADC) is assigned a single virtual IP address (VIP) that represents the pool of servers. Traffic arriving at the ADC is then directed to one of the servers in the pool (cluster, farm) based on a number of factors including application specific data values, application transport protocol, availability of servers, current performance metrics, and client-specific parameters. An ADN provides the advantages of load distribution, increase in capacity of servers, improved scalability, security, and increased reliability through application specific health checks. Increasingly the ADN comprises a redundant pair of ADC on which is integrated a number of different feature sets designed to provide security, availability, reliability, and acceleration functions. In some cases these devices are still separate entities, deployed together as a network of devices through which application traffic is delivered, each providing specific functionality that enhances the delivery of the application. == ADN optimization techniques == === TCP multiplexing === TCP Multiplexing is loosely based on established connection pooling techniques utilized by application server platforms to optimize the execution of database queries from within applications. An ADC establishes a number of connections to the servers in its pool and keeps the connections open. When a request is received by the ADC from the client, the request is evaluated and then directed to a server over an existing connection. This has the effect of reducing the overhead imposed by establishing and tearing down the TCP connection with the server, improving the responsiveness of the application. Some ADN implementations take this technique one step further and also multiplex HTTP and application requests. This has the benefit of executing requests in parallel, which enhances the performance of the application. === TCP optimization === There are a number of Request for Comments (RFCs) which describe mechanisms for improving the performance of TCP. Many ADN implement these RFCs in order to provide enhanced delivery of applications through more efficient use of TCP. The RFCs most commonly implemented are: Delayed Acknowledgements Nagle Algorithm Selective Acknowledgements Explicit Congestion Notification ECN Limited and Fast Retransmits Adaptive Initial Congestion Windows === Data compression and caching === ADNs also provide optimization of application data through caching and compression techniques. There are two types of compression used by ADNs today: industry standard HTTP compression and proprietary data reduction algorithms. It is important to note that the cost in CPU cycles to compress data when traversing a LAN can result in a negative performance impact and therefore best practices are to only utilize compression when delivering applications via a WAN or particularly congested high-speed data link. HTTP compression is asymmetric and transparent to the client. Support for HTTP compression is built into web servers and web browsers. All commercial ADN products currently support HTTP compression. A second compression technique is achieved through data reduction algorithms. Because these algorithms are proprietary and modify the application traffic, they are symmetric and require a device to reassemble the application traffic before the client can receive it. A separate class of devices known as WAN Optimization Controllers (WOC) provide this functionality, but the technology has been slowly added to the ADN portfolio over the past few years as this class of device continues to become more application aware, providing additional features for specific applications such as CIFS and SMB. == ADN reliability and availability techniques == === Advanced health checking === Advanced health checking is the ability of an ADN to determine not only the state of the server on which an application is hosted, but the status of the application it is delivering. Advanced health checking techniques allow the ADC to intelligently determine whether or not the content being returned by the server is correct and should be delivered to the client. This feature enables other reliability features in the ADN, such as resending a request to a different server if the content returned by the original server is found to be erroneous. === Load balancing algorithms === The load balancing algorithms found in today's ADN are far more advanced than the simplistic round-robin and least connections algorithms used in the early 1990s. These algorithms were originally loosely based on operating systems' scheduling algorithms, but have since evolved to factor in conditions peculiar to networking and application environments. It is more accurate to describe today's "load balancing" algorithms as application routing algorithms, as most ADN employ application awareness to determine whether an application is available to respond to a request. This includes the ability of the ADN to determine not only whether the application is available, but whether or not the application can respond to the request within specified parameters, often referred to as a service level agreement. Typical industry standard load balancing algorithms available today include: Round Robin Least Connections Fastest Response Time Weighted Round Robin Weighted Least Connections Custom values assigned to individual servers in a pool based on SNMP or other communication mechanism === Fault tolerance === The ADN provides fault tolerance at the server level, within pools or farms. This is accomplished by designating specific servers as a 'backup' that is activated automatically by the ADN in the event that the primary server(s) in the pool fail. The ADN also ensures application availability and reliability through its ability to seamlessly "failover"

    Read more →
  • Toad Data Modeler

    Toad Data Modeler

    Toad Data Modeler is a database design tool allowing users to visually create, maintain, and document new or existing database systems, and to deploy changes to data structures across different platforms. It is used to construct logical and physical data models, compare and synchronize models, generate complex SQL/DDL, create and modify scripts, and reverse and forward engineer databases and data warehouse systems. Toad's data modelling software is used for database design, maintenance and documentation. == Product History == Toad Data Modeler was previously called "CASE Studio 2" before it was acquired from Charonware by Quest Software in 2006. Quest Software was acquired by Dell on September 28, 2012. On October 31, 2016, Dell finalized the sale of Dell Software to Francisco Partners and Elliott Management, which relaunched on November 1, 2016 as Quest Software. == Features/Usages == Multiple database support - Connect multiple databases natively and simultaneously, including Oracle, SAP, MySQL, SQL Server, PostgreSQL, Db2, Ingres, and Microsoft Access. Data modelling tool - Create database structures or make changes to existing models automatically and provide documentation on multiple platforms. Logical and physical modelling - Build complex logical and physical entity relationship models and reverse, forward, and engineer databases. Reporting - Generate detailed reports on existing database structures. Model customization - Add logical data to user diagrams to customize user models. All Toad products typically have 2 releases per year. == Other features == Model Actions (Compare Models, Convert Model, Merge Models, Generate Change Script) Version Control System (Apache Subversion) Naming Conventions Auto Layout Multiple Workspaces Scripting and Customization Automation Object Gallery Full Unicode Support Integration with Toad for Oracle == Related Software == Erwin Data Modeler Oracle SAP MySQL SQL Server PostgreSQL IBM Db2 Ingres Microsoft Access

    Read more →
  • Instagram egg

    Instagram egg

    The Instagram egg is a photo of an egg posted by the account @world_record_egg on the social media platform Instagram. It became a global phenomenon and an internet meme within days of its publication on 4 January 2019. It is the second most-liked Instagram post and was the most-liked Instagram post from 14 January 2019 until 20 December 2022, when it was overtaken by Lionel Messi's post showing him and his teammates celebrating after Argentina won the 2022 FIFA World Cup. The owner of the account was revealed to be Chris Godfrey, a British advertising creative, who later worked with his two friends Alissa Khan-Whelan and CJ Brown on a Hulu commercial featuring the egg, intended to raise mental health awareness. == Background == The photo was originally taken by Serghei Platanov, who then posted it to Shutterstock on 23 June 2015 with the title "eggs isolated on white background". == History == On 4 January 2019, the @world_record_egg account was created, and posted an image of a bird egg with the caption, "Let's set a world record together and get the most liked post on Instagram. Beating the current world record held by Kylie Jenner (18 million)! We got this." Jenner's previous record, the first photo of her daughter Stormi, had garnered a total of 18.4 million likes. The post quickly reached 18.4 million likes in just under 10 days, becoming the most-liked Instagram post at the time. It then continued to rise over 45 million likes in the next 48 hours, surpassing the "Despacito" music video and taking the world record for the most-liked online post (on any media platform) in history. After the account became verified on 14 January 2019, the post rose in popularity and likes, which snowballed into coverage in various media outlets. By 18 March 2019, the post had accumulated over 53.3 million likes, nearly three times the previous record of 18.4 million. It posted frequent updates for a few days in the form of Instagram Stories. Alongside the like tally, as of January 2023 the post has 3.8 million comments. Several individuals tried to claim that they were the account's creator, the claims being dismissed by "the egg" on Instagram direct messages. On 3 February 2019, the creator of the Instagram egg was revealed by Hulu and The New York Times to be Chris Godfrey, a British advertising creative. Alissa Khan-Whelan, his colleague, was also outed. On 18 January 2019, the account posted a second picture of an egg, almost identical to the first one apart from a small crack at the top left. As of 25 February 2019, the post accumulated 11.8 million likes. On 22 January 2019, the account posted a third picture of an egg, this time having two larger cracks. In less than 25 minutes, the post accumulated 1 million likes, and by 25 February 2019, it had accumulated 9.5 million likes. On 29 January 2019, a fourth picture of an egg was posted to the account which has another large crack on the right hand side, attracting 7.6 million likes by 25 February 2019. On 1 February 2019, a fifth picture of an egg was posted with stitching like that of a football, referencing the upcoming Super Bowl. That post had accumulated 6.5 million likes by 25 February 2019. The account promised that it would reveal what was inside the egg on 3 February, on the subscription video on demand service Hulu. The Hulu Instagram egg reveal was used to promote an animation about a mental health campaign. A caption from the clip read, "Recently I've started to crack, the pressure of social media is getting to me. If you're struggling too, talk to someone." The video was later posted on the @world_record_egg Instagram account, and this post received over 33 million views by May 2019. As of May 2020, it had received over 41 million views. On 16 July 2019, Chris Godfrey (the creator of the account) was listed as one of the top 25 most influential people on the internet. On 20 December 2022, the record for the most-liked Instagram post was surpassed by a post from Argentine footballer Lionel Messi, showing him and his teammates celebrating after winning the 2022 FIFA World Cup with their national team. The world record egg responded to being overtaken in likes by Messi with "Today [Lionel Messi] has taken the crown, for now. But I'm still left with one question… Who is the greatest of all time – Cristiano Ronaldo or Leo Messi?" The account sold to Dubai-based investor Mustafa El Fishawy in April 2024 for an undisclosed seven-figure sum. Reed Smith, who advised Godfrey, Brown, and Khan-Whelan in the transaction, stated they opted to sell it to "focus on new ventures." On 3 June, @world_record_egg posted an egg with the flag of Palestine in support of the country during the Gaza war; the post's caption described it as an "Egg for Peace" and hoped to "set a new world record together and get the most liked post on Instagram for a good cause." == Reception == In response to breaking the world record for the most-liked Instagram post, the account's owner wrote "This is madness. What a time to be alive." Hours later, Jenner posted a video on Instagram of her cracking open an egg and pouring its yolk onto the ground, with the caption: "Take that little egg." Pundits pontificated on the meaning of the egg picture's dominance over social media's "first family". As Vogue observed, tapping a heart pictogram is easy, and eggs are "lovable". More pointedly: [T]he attention economy is a scam based on requiring little to no labor from both producer and consumer despite commanding the most space, and therefore value, in our digital lives... but it very well could be: As a metaphor for the fragility of the influencer ecosystem, the egg has broken the Internet. The significance of the event and its massive republishing are a topic of discussion. A University of Westminster researcher of internet memes compared it to the movement to name a scientific research vessel in the United Kingdom as Boaty McBoatface. The Instagrammer's success is a rare victory for the unpaid viral campaign on social media. "There is a bit of an anti-celebrity revolt here – 'look what we can do with a simple egg'" The researcher suggests that the accomplishment of becoming such a widely heralded unpaid viral post may become increasingly rare, as social networks rely more on paid and business promotion. The post's spread has been characterized as a populist backlash against "consumerism" and is seen by some as a triumph of community over celebrity. However, propelled by their popular success, the creators promised to release 'egg-centric' memorabilia. Hundreds of games based on the Instagram egg have appeared on Apple's App Store. The creators of the Instagram egg also reached a deal to promote Hulu.

    Read more →
  • Data recovery

    Data recovery

    In computing, data recovery is a process of retrieving deleted, inaccessible, lost, corrupted, damaged, or overwritten data from secondary storage, removable media or files, when the data stored in them cannot be accessed in a usual way. The data is most often salvaged from storage media such as internal or external hard disk drives (HDDs), solid-state drives (SSDs), USB flash drives, magnetic tapes, CDs, DVDs, RAID subsystems, and other electronic devices. Recovery may be required due to physical damage to the storage devices or logical damage to the file system that prevents it from being mounted by the host operating system (OS). Logical failures occur when the hard drive devices are functional but the user or automated-OS cannot retrieve or access data stored on them. Logical failures can occur due to corruption of the engineering chip, lost partitions, firmware failure, or failures during formatting/re-installation. Data recovery can be a very simple or technical challenge. This is why there are specific software companies specialized in this field that help to get back data on your system. == About == The most common data recovery scenarios involve an operating system failure, malfunction of a storage device, logical failure of storage devices, accidental damage or deletion, etc. (typically, on a single-drive, single-partition, single-OS system), in which case the ultimate goal is simply to copy all important files from the damaged media to another new drive. This can be accomplished using a Live CD, or DVD by booting directly from a ROM or a USB drive instead of the corrupted drive in question. Many Live CDs or DVDs provide a means to mount the system drive and backup drives or removable media, and to move the files from the system drive to the backup media with a file manager or optical disc authoring software. Such cases can often be mitigated by disk partitioning and consistently storing valuable data files (or copies of them) on a different partition from the replaceable OS system files. Another scenario involves a drive-level failure, such as a compromised file system or drive partition, or a hard disk drive failure. In any of these cases, the data is not easily read from the media devices. Depending on the situation, solutions involve repairing the logical file system, partition table, or master boot record, or updating the firmware or drive recovery techniques ranging from software-based recovery of corrupted data, to hardware- and software-based recovery of damaged service areas (also known as the hard disk drive's "firmware"), to hardware replacement on a physically damaged drive which allows for the extraction of data to a new drive. If a drive recovery is necessary, the drive itself has typically failed permanently, and the focus is rather on a one-time recovery, salvaging whatever data can be read. In a third scenario, files have been accidentally "deleted" from a storage medium by the users. Typically, the contents of deleted files are not removed immediately from the physical drive; instead, references to them in the directory structure are removed, and thereafter space the deleted data occupy is made available for later data overwriting. In the mind of end users, deleted files cannot be discoverable through a standard file manager, but the deleted data still technically exists on the physical drive. In the meantime, the original file contents remain, often several disconnected fragments, and may be recoverable if not overwritten by other data files. The term "data recovery" is also used in the context of forensic applications or espionage, where data which have been encrypted, hidden, or deleted, rather than damaged, are recovered. Sometimes data present in the computer gets encrypted or hidden due to reasons like virus attacks which can only be recovered by some computer forensic experts. == Physical damage == A wide variety of failures can cause physical damage to storage media, which may result from human errors and natural disasters. CD-ROMs can have their metallic substrate or dye layer scratched off; hard disks can suffer from a multitude of mechanical failures, such as head crashes, PCB failure, and failed motors; tapes can simply break. Physical damage to a hard drive, even in cases where a head crash has occurred, does not necessarily mean permanent data loss. However, in extreme cases, such as prolonged exposure to moisture and corrosion —like the lost Bitcoin hard drive of James Howells, buried in the Newport landfill for over a decade — recovery is usually impossible. In rare cases, forensic techniques such as magnetic force microscopy (MFM) have been explored to detect residual magnetic traces when data holds exceptional value. Other techniques employed by many professional data recovery companies can typically salvage most, if not all, of the data that had been lost when the failure occurred. Of course, there are exceptions to this, such as cases where severe damage to the hard drive platters may have occurred. However, if the hard drive can be repaired and a full image or clone created, then the logical file structure can be rebuilt in most instances. Most physical damage cannot be repaired by end users. For example, opening a hard disk drive in a normal environment can allow airborne dust to settle on the platter and become caught between the platter and the read/write head. During normal operation, read/write heads float 3 to 6 nanometers above the platter surface, and the average dust particles found in a normal environment are typically around 30,000 nanometers in diameter. When these dust particles get caught between the read/write heads and the platter, they can cause new head crashes that further damage the platter and thus compromise the recovery process. Furthermore, end users generally do not have the hardware or technical expertise required to make these repairs. Consequently, data recovery companies are often employed to salvage important data with the more reputable ones using class 100 dust- and static-free cleanrooms. === Recovery techniques === Recovering data from physically damaged hardware can involve multiple techniques. Some damage can be repaired by replacing parts in the hard disk. This alone may make the disk usable, but there may still be logical damage. A specialized disk-imaging procedure is used to recover every readable bit from the surface. Once this image is acquired and saved on a reliable medium, the image can be safely analyzed for logical damage and will possibly allow much of the original file system to be reconstructed. ==== Hardware repair ==== A common misconception is that a damaged printed circuit board (PCB) may be simply replaced during recovery procedures by an identical PCB from a healthy drive. While this may work in rare circumstances on hard disk drives manufactured before 2003, it will not work on newer drives. Electronics boards of modern drives usually contain drive-specific adaptation data (generally a map of bad sectors and tuning parameters) and other information required to properly access data on the drive. Replacement boards often need this information to effectively recover all of the data. The replacement board may need to be reprogrammed. Some manufacturers (Seagate, for example) store this information on a serial EEPROM chip, which can be removed and transferred to the replacement board. Each hard disk drive has what is called a system area or service area; this portion of the drive, which is not directly accessible to the end user, usually contains drive's firmware and adaptive data that helps the drive operate within normal parameters. One function of the system area is to log defective sectors within the drive; essentially telling the drive where it can and cannot write data. The sector lists are also stored on various chips attached to the PCB, and they are unique to each hard disk drive. If the data on the PCB do not match what is stored on the platter, then the drive will not calibrate properly. In most cases the drive heads will click because they are unable to find the data matching what is stored on the PCB. == Logical damage == The term "logical damage" refers to situations in which the error is not a problem in the hardware and requires software-level solutions. === Corrupt partitions and file systems, media errors === In some cases, data on a hard disk drive can be unreadable due to damage to the partition table or file system, or to (intermittent) media errors. In the majority of these cases, at least a portion of the original data can be recovered by repairing the damaged partition table or file system using specialized data recovery software such as TestDisk; software like ddrescue can image media despite intermittent errors, and image raw data when there is partition table or file system damage. This type of data recovery can be performed by people without expertise in drive hardware as it requires no special physica

    Read more →
  • Social business model

    Social business model

    The social business model is use of social media tools and social networking behavioral standards by businesses for communication with customers, suppliers, and others. Combining social networking etiquette (being helpful, transparent and authentic) with business engagement on LinkedIn (for one-to-one interaction), Twitter (for immediacy) and Facebook (for content sharing) more fully involves employees in the organization and increases customer intimacy and trust. == Overview == Traditional business models, particularly in large organizations, have had as one common characteristic careful limitation of direct contact between those within the organization and those outside of it. Only certain specific individuals (most frequently in roles such as sales, customer service and field consulting) were designated as "customer-facing" personnel. Organizations further limited outside access to internal employees through filtering mechanisms such as publishing only a main switchboard number (whether routed through a live receptionist or an interactive voice response system) and generic "sales@" or "info@" email addresses. The Cluetrain Manifesto (written by Rick Levine, Christopher Locke, Doc Searls, and David Weinberger and published in 1999) was among the first books to predict the demise of this old order and the emergence of more open business models, though most of the business world was slow to adopt the book's recommended cultural changes. Thirteen years later, authors Dion Hinchcliffe and Peter Kim added structural underpinnings to the cultural shifts outlined in The Cluetrain Manifesto in their book, Social Business by Design. The book details many of the ways social media tools and practices are being adopted within organizations, to support both internal employee collaboration and external customer engagement (which the authors describe as the "bigger problem"). == Elements == In implementing the social business model, organizations apply social networking protocols and tools in a range of areas, potentially including: Marketing Customer Support Recruiting Crowdsourcing Internal employee collaboration Sales Product Development Supply Chain Operations Investor Relations == Characteristics of organizations adopting the social business model == Organizations that fully adopt the social business model will exhibit four key characteristics: Connected – employees will be able to seamlessly engage one-on-one in real-time with other employees and individuals outside the organization (customers, prospects, partners, media, etc.) using a variety of communications methods including text chat, voice, file sharing, email, and video chat. Social – employees will follow social networking etiquette (being authentic, helpful and transparent) in external interactions. The focus will be on answering questions and providing information rather than overt sales or promotion. Presence – these conversations may originate on the company's website or elsewhere online (e.g., publication websites, industry portals, or social networking sites such as LinkedIn or Facebook). Intelligent – organizations will use in-depth analytics to monitor connections, social interactions and presence; measure corresponding business results; and continually adjust and improve practices for increased effectiveness. == Technical and functional requirements == While much of the change inherent in adopting the social business model is cultural, it also requires process changes enabled by social business technology. Functional requirements for a social business technology platform include: Analytics (including the cost of engagement as well as various measures of return on investment such as leads, sales, referrals, recommendations, and retained customers). Integration with other social media and business tools such as CRM systems, partner relationship management (PRM) software, product development, website analytics, and employee-recruiting applications. Rules-based workflow (e.g. routing a comment to the appropriate individual for a response, based on content). Geolocation (so customers or prospects can be automatically routed to local sales or customer service representatives). Content sharing. Collaboration tools. Transparency (i.e., people should know who they are engaging with) Unified communications (the ability to engage via voice, text, video, email, and share a wide variety of file types) Storage (the ability to store interactions for legal, training, compliance or compensation purposes, and purge stored data when no longer needed based on company policy or regulatory requirements). Immediacy (real-time monitoring and response).

    Read more →
  • Elastic cloud storage

    Elastic cloud storage

    An elastic cloud is a cloud computing offering that provides variable service levels based on changing needs. Elasticity is an attribute that can be applied to most cloud services. It states that the capacity and performance of any given cloud service can expand or contract according to a customer's requirements and that this can potentially be changed automatically as a consequence of some software-driven event or, at worst, can be reconfigured quickly by the customer's infrastructure management team. Elasticity has been described as one of the five main principles of cloud computing by Rosenburg and Mateos in The Cloud at Your Service - Manning 2011. == History == Cloud computing was first described by Gillet and Kapor in 1996; however, the first practical implementation was a consequence of a strategy to leverage Amazon's excess data center capacity. Amazon and other pioneers of the commercial use of this technology were primarily interested in providing a “public” cloud service, whereby they could offer customers the benefits of using the cloud, particularly the utility-based pricing model benefit. Other suppliers followed suit with a range of cloud-based models all offering elasticity as a core component, but these suppliers were only offering this service as an element of their public cloud service. Due to perceived weaknesses in security, or at least a lack of proven compliance, many organizations, particularly in the financial and public sectors, have been slow adopters of cloud technologies. These wary organizations can achieve some of the benefits of cloud computing by adopting private cloud technologies. An alternative form of the elastic cloud has been offered by vendors such as EMC and IBM, whereby the service is based around an enterprise's own infrastructure but still retains elements of elasticity and the potential to bill by consumption. == Description == Elasticity in cloud computing is the ability for the organization to adjust its storage requirements in terms of capacity and processing with respect to operational requirements. This has the following benefits: Operational Benefits - Services can be acquired quickly, meaning that the evolving requirements of the business can be addressed almost immediately, giving an organization a potential agility advantage. A properly implemented elastic system will provision/de-provision according to application demands, so if a particular business has activity spikes then the provision can be enabled to match the demand and the capacity can be re-allocated. Research and Development (R&D) Projects - R&D activities are no longer hindered by a requirement to secure a capex budget prior to a project starting. Capability can simply be provisioned from the cloud and released at the end of the exercise. Testing and Deployment - With most large-scale projects a size test needs to be performed prior to final rollout. By taking advantage of the elasticity of the cloud and creating a full-scale avatar of the proposed production system, realistic data and traffic volumes can be provisioned and released as needed. Expensive Resources Allocated - This will normally apply only in the context where a customer is applying at least some of their own servers as part of a cloud infrastructure, specifically where a business (for performance reasons) has decided to invest in solid-state storage as opposed to spinning platters. There are instances when, due to activity spikes, a less critical process may need to be moved from the high-performance resources to more traditional storage. Server Specification - When a customer has elected to own/lease hardware, they can select and specify servers that are specifically tuned to meet the likely needs of their operation (i.e., directly controlling the cost/benefit equation). Utility Based Payments - There is, of course, a key cost driver in this process, and the notion that you should pay for what you consume is acceptable for many organizations. When hardware capacity is sourced internally, organizations need to over-provision. This applies just as much to traditional outsourcing as it does to capex-related expenditure on in-house servers. Cloud Platform – At the heart of any cloud storage system is the ability to manage hyperscale object storage and a Hadoop Distributed Files System (HDFS). Elastic storage capability is particularly well suited to hyperscale and Hadoop environments, where its capability to rapidly respond to changing circumstances and priorities is essential

    Read more →
  • Secret London

    Secret London

    Secret London is a Facebook group started by 21-year-old Bristol University graduate, Tiffany Philippou, on 19 January 2010 in response to a Saatchi & Saatchi competition. The group grew rapidly (180,000 members as of 8 February 2010) and is composed mostly of Londoners who use the site to share suggestions and photos of London. After the group's early success, the founder announced her intention to launch a website of the same name by crowdsourcing the design and development. The website was launched on 16 February 2010. == Other secret cities == Following the initial success of Secret London, a number of other secret groups were independently started around the world, some of which already have over 100,000 users. As of 19 February 2010, the list of other groups includes: Secret Frankfurt, Secret Tel Aviv, Secret Paris, Secret New York, Secret Tokyo, Secret Toronto, Secret Los Angeles, Secret Exeter, Secret Boston, Secret Norwich, Secret Singapore, Secret Brighton, Secret Minneapolis, Secret Sydney, Secret Canberra, Secret Brisbane, Secret Wellington, Secret Christchurch, Secret Madeira, Secret Funchal, Secret Bristol and Secret Cardiff. == Controversy == Some commentators have questioned whether it possible to share secrets without compromising them, and whether sharing tips publicly will lead to over-exposure of the businesses who are recommended.

    Read more →
  • Data monetization

    Data monetization

    Data monetization, a form of monetization, may refer to the act of generating measurable economic benefits from available data sources (analytics). Less commonly, it may also refer to the act of monetizing data services. In the case of analytics, typically, these benefits accrue as revenue or expense savings, but may also include market share or corporate market value gains. Data monetization leverages data generated through business operations, available exogenous data or content, as well as data associated with individual actors such as that collected via electronic devices and sensors participating in the internet of things. For example, the ubiquity of the internet of things is generating location data and other data from sensors and mobile devices at an ever-increasing rate. When this data is collated against traditional databases, the value and utility of both sources of data increases, leading to tremendous potential to mine data for social good, research and discovery, and achievement of business objectives. Closely associated with data monetization are the emerging data as a service models for transactions involving data by the data item. There are three ethical and regulatory vectors involved in data monetization due to the sometimes conflicting interests of actors involved in the digital supply chain. The individual data creator who generates files and records through his own efforts or owns a device such as a sensor or a mobile phone that generates data has a claim to ownership of data. The business entity that generates data in the course of its operations, such as its transactions with financial institutions or risk factors discovered through feedback from customers also has a claim on data captured through their systems and platforms. However, the person that contributed the data may also have a legitimate claim on the data. Internet platforms and service providers, such as Google or Facebook that require a user to forgo some ownership interest in their data in exchange for use of the platform also have a legitimate claim on the data. Thus the practice of data monetization, although common since 2000, is now getting increasing attention from regulators. The European Union and the United States Congress have begun to address these issues. For instance, in the financial services industry, regulations involving data are included in the Gramm–Leach–Bliley Act and Dodd-Frank. Some individual creators of data are shifting to using personal data vaults and implementing vendor relationship management concepts as a reflection of an increasing resistance to their data being federated or aggregated and resold without compensation. Groups such as the Personal Data Ecosystem Consortium, Patient privacy rights, and others are also challenging corporate cooptation of data without compensation. Financial services companies are a relatively good example of an industry focused on generating revenue by leveraging data. Credit card issuers and retail banks use customer transaction data to improve targeting of cross-sell offers. Partners are increasingly promoting merchant based reward programs which leverage a bank’s data and provide discounts to customers at the same time. == Types of data monetization == Internal data monetization - An organization's data is used internally, resulting in economic benefit. This is commonly the case in organizations using analytics to uncover insights, resulting in improved profit, cost savings or the avoidance of risk. Internal data monetization is currently the most common form of monetization, requiring far fewer security, intellectual property, and legal precautions when compared to other types. The potential economic gains from this type of data monetization are limited by the organization's internal structure and situation. External data monetization - A person or organization makes data they possess available on a for-fee basis to external parties, or as a broker for same. This type of monetization is less common and requires various methods to distribute the data to potential buyers and consumers. However, the economic gain that results from collecting data, packaging and distributing it, can be quite large. == Steps == Identification of available data sources – this includes data currently available for monetization as well as other external data sources that may enhance the value of what’s currently available. Connect, aggregate, attribute, validate, authenticate, and exchange data - this allows data to be converted directly into actionable or revenue generating insight or services. Set terms and prices and facilitate data trading - methods for data vetting, storage, and access. For example, many global corporations have locked and siloed data storage infrastructures, which hinders efficient access to data and cooperative and real-time exchange. Perform Research and analytics – draw predictive insights from existing data as a basis for using data for to reduce risk, enhance product development or performance, or improve customer experience or business outcomes. Action and leveraging – the last phase of monetizing data includes determining alternative or improved data centric products, ideas, or services. Examples may include real-time actionable triggered notifications or enhanced channels such as web or mobile response mechanisms. == Pricing variables and factors == A fee for use of a platform to connect buyers and sellers use of a platform to configure, organize, and otherwise process data included in a data trade connecting or including a device or sensor into a data supply chain connecting and credentialing a creator of a data source and a data buyer – often through a federated identity connecting a data source to other data sources to be included in a data supply chain use of an internet service or other transmission services for uploading and downloading data – sometimes, for an individual, through a personal cloud use of encrypted keys to achieve secure data transfer use of a search algorithm specifically designed to tag data sources that contain data points of value to the data buyer linking a data creator or generator to a data collection protocol or form server actions – such as a notification – triggered by an update to a data item or data source included in a data supply chain A price or exchange or other trade value assigned by a data creator or generator to a data item or a data source offered by a data buyer to a data creator assigned by a data buyer for a data item or a data source formatted according to criteria set by a data buyer An incremental fee assigned by a data buyer for a data item or a data set scaled to the reputation of the data creator == Benefits == Improved decision-making that leads to real time crowd sourced research, improved profits, decreased costs, reduced risk and improved compliance More impactful decisions (e.g., make real-time decisions) More timely (lower latency) decisions (e.g., a vendor making purchase recommendations while the customer is still on the phone or in the store, a customer connecting with multiple vendors to discover the best price, triggered notifications when thresholds are reached for data values) More granular decisions (e.g., localized pricing decisions at an individual or device or sensor level versus larger aggregates). Targeted Marketing (e.g., Vendors with access to big data can make targeted advertisements to specific customers within a set data pool decreasing costs for the advertiser and reaching most interested customers) == Frameworks == There are a wide variety of industries, firms and business models related to data monetization. The following frameworks have been offered to help understand the types of business models that are used: Roger Ehrenberg of IA Ventures, a venture capital firm that invests in this sector, has defined three basic types of data product firms: Contributory databases. The magic of these businesses is that a customer provides their own data in exchange for receiving a more robust set of aggregated data back that provides insight into the broader marketplace, or provides a vehicle for expressing a view. Give a little, get a lot back in return – a pretty compelling value proposition, and one that frequently results in a payment from the data contributor in exchange for receiving enriched, aggregated data. Once these contributory databases are developed and customers become reliant on their insights, they become extremely valuable and persistent data assets. Data processing platforms. These businesses create barriers through a combination of complex data architectures, proprietary algorithms, and rich analytics to help customers consume data in whatever form they please. Often these businesses have special relationships with key data providers, that when combined with other data and processed as a whole create valuable differentiation and competitive barriers. Bloomberg is an example of a powerful

    Read more →