AI Chatbot Interface

AI Chatbot Interface — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Latent semantic analysis

    Latent semantic analysis

    Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix containing word counts per document (rows represent unique words and columns represent each document) is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by cosine similarity between any two columns. Values close to 1 represent very similar documents while values close to 0 represent very dissimilar documents. An information retrieval technique using latent semantic structure was patented in 1988 by Scott Deerwester, Susan Dumais, George Furnas, Richard Harshman, Thomas Landauer, Karen Lochbaum and Lynn Streeter. In the context of its application to information retrieval, it is sometimes called latent semantic indexing (LSI). == Overview == === Occurrence matrix === LSA can use a document-term matrix which describes the occurrences of terms in documents; it is a sparse matrix whose rows correspond to terms and whose columns correspond to documents. A typical example of the weighting of the elements of the matrix is tf-idf (term frequency–inverse document frequency): the weight of an element of the matrix is proportional to the number of times the terms appear in each document, where rare terms are upweighted to reflect their relative importance. This matrix is also common to standard semantic models, though it is not necessarily explicitly expressed as a matrix, since the mathematical properties of matrices are not always used. === Rank lowering === After the construction of the occurrence matrix, LSA finds a low-rank approximation to the term-document matrix. There could be various reasons for these approximations: The original term-document matrix is presumed too large for the computing resources; in this case, the approximated low rank matrix is interpreted as an approximation (a "least and necessary evil"). The original term-document matrix is presumed noisy: for example, anecdotal instances of terms are to be eliminated. From this point of view, the approximated matrix is interpreted as a de-noisified matrix (a better matrix than the original). The original term-document matrix is presumed overly sparse relative to the "true" term-document matrix. That is, the original matrix lists only the words actually in each document, whereas we might be interested in all words related to each document—generally a much larger set due to synonymy. The consequence of the rank lowering is that some dimensions are combined and depend on more than one term: {(car), (truck), (flower)} → {(1.3452 car + 0.2828 truck), (flower)} This mitigates the problem of identifying synonymy, as the rank lowering is expected to merge the dimensions associated with terms that have similar meanings. It also partially mitigates the problem with polysemy, since components of polysemous words that point in the "right" direction are added to the components of words that share a similar meaning. Conversely, components that point in other directions tend to either simply cancel out, or, at worst, to be smaller than components in the directions corresponding to the intended sense. === Derivation === Let X {\displaystyle X} be a matrix where element ( i , j ) {\displaystyle (i,j)} describes the occurrence of term i {\displaystyle i} in document j {\displaystyle j} (this can be, for example, the frequency). X {\displaystyle X} will look like this: d j ↓ t i T → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] {\displaystyle {\begin{matrix}&{\textbf {d}}_{j}\\&\downarrow \\{\textbf {t}}_{i}^{T}\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}\end{matrix}}} Now a row in this matrix will be a vector corresponding to a term, giving its relation to each document: t i T = [ x i , 1 … x i , j … x i , n ] {\displaystyle {\textbf {t}}_{i}^{T}={\begin{bmatrix}x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\end{bmatrix}}} Likewise, a column in this matrix will be a vector corresponding to a document, giving its relation to each term: d j = [ x 1 , j ⋮ x i , j ⋮ x m , j ] {\displaystyle {\textbf {d}}_{j}={\begin{bmatrix}x_{1,j}\\\vdots \\x_{i,j}\\\vdots \\x_{m,j}\\\end{bmatrix}}} Now the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} between two term vectors gives the correlation between the terms over the set of documents. The matrix product X X T {\displaystyle XX^{T}} contains all these dot products. Element ( i , p ) {\displaystyle (i,p)} (which is equal to element ( p , i ) {\displaystyle (p,i)} ) contains the dot product t i T t p {\displaystyle {\textbf {t}}_{i}^{T}{\textbf {t}}_{p}} ( = t p T t i {\displaystyle ={\textbf {t}}_{p}^{T}{\textbf {t}}_{i}} ). Likewise, the matrix X T X {\displaystyle X^{T}X} contains the dot products between all the document vectors, giving their correlation over the terms: d j T d q = d q T d j {\displaystyle {\textbf {d}}_{j}^{T}{\textbf {d}}_{q}={\textbf {d}}_{q}^{T}{\textbf {d}}_{j}} . Now, from the theory of linear algebra, there exists a decomposition of X {\displaystyle X} such that U {\displaystyle U} and V {\displaystyle V} are orthogonal matrices and Σ {\displaystyle \Sigma } is a diagonal matrix. This is called a singular value decomposition (SVD): X = U Σ V T {\displaystyle {\begin{matrix}X=U\Sigma V^{T}\end{matrix}}} The matrix products giving us the term and document correlations then become X X T = ( U Σ V T ) ( U Σ V T ) T = ( U Σ V T ) ( V T T Σ T U T ) = U Σ V T V Σ T U T = U Σ Σ T U T X T X = ( U Σ V T ) T ( U Σ V T ) = ( V T T Σ T U T ) ( U Σ V T ) = V Σ T U T U Σ V T = V Σ T Σ V T {\displaystyle {\begin{matrix}XX^{T}&=&(U\Sigma V^{T})(U\Sigma V^{T})^{T}=(U\Sigma V^{T})(V^{T^{T}}\Sigma ^{T}U^{T})=U\Sigma V^{T}V\Sigma ^{T}U^{T}=U\Sigma \Sigma ^{T}U^{T}\\X^{T}X&=&(U\Sigma V^{T})^{T}(U\Sigma V^{T})=(V^{T^{T}}\Sigma ^{T}U^{T})(U\Sigma V^{T})=V\Sigma ^{T}U^{T}U\Sigma V^{T}=V\Sigma ^{T}\Sigma V^{T}\end{matrix}}} Since Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} and Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } are diagonal we see that U {\displaystyle U} must contain the eigenvectors of X X T {\displaystyle XX^{T}} , while V {\displaystyle V} must be the eigenvectors of X T X {\displaystyle X^{T}X} . Both products have the same non-zero eigenvalues, given by the non-zero entries of Σ Σ T {\displaystyle \Sigma \Sigma ^{T}} , or equally, by the non-zero entries of Σ T Σ {\displaystyle \Sigma ^{T}\Sigma } . Now the decomposition looks like this: X U Σ V T ( d j ) ( d ^ j ) ↓ ↓ ( t i T ) → [ x 1 , 1 … x 1 , j … x 1 , n ⋮ ⋱ ⋮ ⋱ ⋮ x i , 1 … x i , j … x i , n ⋮ ⋱ ⋮ ⋱ ⋮ x m , 1 … x m , j … x m , n ] = ( t ^ i T ) → [ [ u 1 ] … [ u l ] ] ⋅ [ σ 1 … 0 ⋮ ⋱ ⋮ 0 … σ l ] ⋅ [ [ v 1 ] ⋮ [ v l ] ] {\displaystyle {\begin{matrix}&X&&&U&&\Sigma &&V^{T}\\&({\textbf {d}}_{j})&&&&&&&({\hat {\textbf {d}}}_{j})\\&\downarrow &&&&&&&\downarrow \\({\textbf {t}}_{i}^{T})\rightarrow &{\begin{bmatrix}x_{1,1}&\dots &x_{1,j}&\dots &x_{1,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{i,1}&\dots &x_{i,j}&\dots &x_{i,n}\\\vdots &\ddots &\vdots &\ddots &\vdots \\x_{m,1}&\dots &x_{m,j}&\dots &x_{m,n}\\\end{bmatrix}}&=&({\hat {\textbf {t}}}_{i}^{T})\rightarrow &{\begin{bmatrix}{\begin{bmatrix}\,\\\,\\{\textbf {u}}_{1}\\\,\\\,\end{bmatrix}}\dots {\begin{bmatrix}\,\\\,\\{\textbf {u}}_{l}\\\,\\\,\end{bmatrix}}\end{bmatrix}}&\cdot &{\begin{bmatrix}\sigma _{1}&\dots &0\\\vdots &\ddots &\vdots \\0&\dots &\sigma _{l}\\\end{bmatrix}}&\cdot &{\begin{bmatrix}{\begin{bmatrix}&&{\textbf {v}}_{1}&&\end{bmatrix}}\\\vdots \\{\begin{bmatrix}&&{\textbf {v}}_{l}&&\end{bmatrix}}\end{bmatrix}}\end{matrix}}} The values σ 1 , … , σ l {\displaystyle \sigma _{1},\dots ,\sigma _{l}} are called the singular values, and u 1 , … , u l {\displaystyle u_{1},\dots ,u_{l}} and v 1 , … , v l {\displaystyle v_{1},\dots ,v_{l}} the left and right singular vectors. Notice the only part of U {\displaystyle U} that contributes to t i {\displaystyle {\textbf {t}}_{i}} is the i 'th {\displaystyle i{\textrm {'th}}} row. Let this row vector be called t ^ i T {\displaystyle {\hat {\textrm {t}}}_{i}^{T}} . Likewise, the only part of V T {\displaystyle V^{T}} that contributes to d j {\displaystyle {\textbf {d}}_{j}} is the j 'th {\displaystyle j{\textrm {'th}}} column, d ^ j {\displaystyle {\hat {\textrm {d}}}_{j}} . These are not the eigenvectors, but depend on all the eigenvectors. I

    Read more →
  • Social media newsroom

    Social media newsroom

    A social media newsroom is a company resource, set up to increase the functionality and usability of the traditional online newsroom. Social media newsrooms (SMNs) are intended to encourage dialogue and information sharing. Unlike online newsrooms, content is accessible to more than just journalists, but to all those with whom the company engages such as bloggers, their prospects, customers, business partners and investors. It gives these stakeholders access to news, public relations announcements, images, audio, video and other multimedia files. In addition to posting press releases and corporate news, companies can integrate other social content from sites such as YouTube, Flickr and Slideshow as well as streams from corporate Twitter accounts. Traditional tools for journalists such as corporate fast facts, leadership information, a multimedia library, financial information, awards and other recent media coverage are also included in an SMN. Examples of companies effectively using social media newsrooms include Opel Group, Pressat, First Direct, MyNewsdesk, Scania and Newport Beach.

    Read more →
  • WhoSay

    WhoSay

    WhoSay was an American social media service and branding platform for celebrities and their fans. Founded in Los Angeles in 2010, with financing by Creative Artists Agency (CAA), Amazon.com and other investors, it is notable for allowing its users to retain ownership rights over the content that they post to their accounts, through copyright branding, and for enabling users to post content to other social media sites like Twitter, Facebook, Instagram and Tumblr simultaneously. WhoSay describes itself as a "social celebrity magazine" whose editorial team keeps its users informed about the latest celebrity and entertainment news. Clients such as Dylan McDermott and Chris Rock lauded the service for its ability to add content to multiple social network sites easily. Rock in particular has commented on its ease of use for those who are not part of a tech-savvy demographic, commenting, "It's perfect for someone that's not 25." WhoSay's competitors included theAudience, which is operated by the William Morris Endeavor. == History == WhoSay was founded in March 2010, by Steve Ellis and the Los Angeles-based talent agency Creative Artists Agency (CAA). It was financed through investments Amazon.com (who along with CAA, holds a minority stake in the company), Comcast, Greylock Partners, and High Peak Ventures. The company's main headquarters are in The New York Times Building in Manhattan, with additional headquarters in CAA's office building in the Silicon Beach area of Los Angeles, and in London. The company was founded to protect celebrities' intellectual property and enable the celebrities themselves to profit themselves from their own content through copyright branding. Its chief executive is co-founder Steve Ellis, who, after leaving Getty Images, was contacted by CAA, who were looking to resolve the issue of celebrities losing the rights to their own photos and videos when uploading them to social network sites. Ellis explained WhoSay's mission thus: "We work with people who are constantly being utilized by third parties for the wrong reasons. [The company was formed] to give celebrities and other influential people a set of tools to allow them to manage and control their presence in the digital world." In this way, WhoSay is likened by Ellis to "a People magazine by the people themselves who are in it." The company started slowly, until CAA client Tom Hanks signed onto WhoSay three months after the service's launch. The company continued to maintain a low profile for the first three years of operation, during which it accumulated a client list of 1,500 actors, musicians and artists. Clients are accepted by the service on an invitation-only basis, although they are not restricted to Creative Artists clients. Among them are Kelly Clarkson, Julia Louis-Dreyfus, Paula Patton, Kevin Spacey, Jim Carrey, John Cusack, Bill Maher, Johnny Knoxville, Chelsea Handler, Eva Longoria, Spike Lee, Enrique Iglesias and Katie Couric. Clients are not charged for the service, and are given a share of any revenue that is generated by advertisements. They are also given the ability share in the database of e-mail addresses that come with registration, in order to communicate directly with fans. Actor Dylan McDermott was introduced to WhoSay by his agent, as a way of easily posting content to Facebook, Twitter, Tumblr and even China's Tencent social network with relative ease. McDermott comments, "When you put something out there, you can hit everything at one time. It makes it easy for me." Comedian Chris Rock has commented that WhoSay is ideal for people like him have developed difficulty in keeping track of different websites as they get older, saying, "It's perfect for someone that's not 25." In September 2013 WhoSay introduced a mobile application for consumers. By October 2013, the company's website attracted 12 million monthly visitors. In July 2014 Rob Gregory left his role as president of Newsweek's The Daily Beast to become WhoSay's chief revenue officer. Among his responsibilities are developing ways to monetize WhoSay's web and mobile products, such as premium advertising strategies and brand partnerships. WhoSay does not allow consumers to create accounts, nor does it include search features, making it difficult to access a celebrity's account unless a user is directed there from one of their other social pages. According to Ellis, consumers have enough social media choices, saying, "Frankly they don't really need the services that we provide, and there are a lot of very specific features built into our service that really only benefit someone who is of a high profile." By February 2015, WhoSay had amassed 4.8 million unique users, and expanded its accounts to companies that employ celebrities for branded content. Such companies include Lexus, which partnered with the company to promote a campaign in which actress Rosario Dawson, during the lead up to the 87th Academy Awards, released five short videos on her social media accounts. The videos feature her driving through Los Angeles in preparation for the grand opening of her pop-up store, which sells Studio One Eighty Nine, a clothing line tied to her foundation promoting African culture and content. That April, WhoSay partnered with Chevrolet's #BestDayEver social media campaign for April Fool's Day, enlisting Olivia Wilde, Norman Reedus, Alec Baldwin, Ian Somerhalder, and Nikki Reed to surprise students in four U.S. classrooms as their substitute teachers. For example, Baldwin, dressed as Abraham Lincoln, surprised students in an Occidental College class on U.S. Culture and Society. Other companies that WhoSay has partnered with include KFC, JCPenney, Dunkin' Donuts and Crest. In January 2018, the website was acquired by Viacom (now Paramount Global).

    Read more →
  • Consistency (database systems)

    Consistency (database systems)

    In database systems, consistency (or correctness) refers to the requirement that any given database transaction must change affected data only in allowed ways. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This does not guarantee correctness of the transaction in all ways the application programmer might have wanted (that is the responsibility of application-level code) but merely that any programming errors cannot result in the violation of any defined database constraints. In a distributed system, referencing CAP theorem, consistency can also be understood as after a successful write, update or delete of a Record, any read request immediately receives the latest value of the Record. == As an ACID guarantee == Consistency is one of the four guarantees that define ACID transactions; however, significant ambiguity exists about the nature of this guarantee. It is defined variously as: The guarantee that database constraints are not violated, particularly once a transaction commits. The guarantee that any transactions started in the future necessarily see the effects of other transactions committed in the past. As these various definitions are not mutually exclusive, it is possible to design a system that guarantees "consistency" in every sense of the word, as most relational database management systems in common use today arguably do. == As a CAP trade-off == The CAP theorem is based on three trade-offs, one of which is "atomic consistency" (shortened to "consistency" for the acronym), about which the authors note, "Discussing atomic consistency is somewhat different than talking about an ACID database, as database consistency refers to transactions, while atomic consistency refers only to a property of a single request/response operation sequence. And it has a different meaning than the Atomic in ACID, as it subsumes the database notions of both Atomic and Consistent." In the CAP theorem, you can only have two of the following three properties: consistency, availability, or partition tolerance. Therefore, consistency may have to be traded off in some database systems.

    Read more →
  • TIMIT

    TIMIT

    TIMIT is a corpus of phonemically and lexically transcribed speech of American English speakers of different sexes and dialects. Each transcribed element has been delineated in time. TIMIT was designed to further acoustic-phonetic knowledge and automatic speech recognition systems. It was commissioned by DARPA and corpus design was a joint effort between the Massachusetts Institute of Technology, SRI International, and Texas Instruments (TI). The speech was recorded at TI, transcribed at MIT, and verified and prepared for publishing by the National Institute of Standards and Technology (NIST). There is also a telephone bandwidth version called NTIMIT (Network TIMIT). TIMIT and NTIMIT are not freely available — either membership of the Linguistic Data Consortium, or a monetary payment, is required for access to the dataset. == Data == TIMIT contains ~5 hours of speech, of 10 sentences spoken by each of 630 speakers. The sentences were randomly sampled from a corpus of 2342 sentences. The speakers were native speakers of American English, classified under 8 major dialect regions: New England, Northern, North Midland, South Midland, Southern, New York City, Western, Army Brat (moved around). The speakers were 70% male and 30% female. Recordings were made in a noise-isolated recording booth at Texas Instrument, using a semi-automatic computer system (STEROIDS) to control the presentation of prompts to the speaker and the recording. Two-channel recordings were made using a Sennheiser HMD 414 headset-mounted microphone and a Brüel & Kjær 1/2" far-field pressure microphone (#4165). The speech was digitized at a sample rate of 20 kHz then and downsampled to 16 kHz. == History == The TIMIT telephone corpus was an early attempt to create a database with speech samples. It was published in the year 1988 on CD-ROM and consists of only 10 sentences per speaker. Two 'dialect' sentences were read by each speaker, as well as another 8 sentences selected from a larger set Each sentence averages 3 seconds long and is spoken by 630 different speakers. It was the first notable attempt in creating and distributing a speech corpus and the overall project has produced costs of 1.5 million US$. An update was released in October 1990. It included full 630-speaker corpus; checked and corrected transcriptions; word-alignment transcriptions; NIST SPHERE-headered waveform files and header manipulation software; phonemic dictionary; new test and training subsets balanced for dialectal and phonetic coverage; more extensive documentation. The full name of the project is DARPA-TIMIT Acoustic-Phonetic Continuous Speech Corpus and the acronym TIMIT stands for Texas Instruments/Massachusetts Institute of Technology. The main reason why a corpus of telephone speech was created was to train speech recognition software. In the Blizzard challenge, different software has the obligation to convert audio recordings into textual data and the TIMIT corpus was used as a standardized baseline.

    Read more →
  • Weird SoundCloud

    Weird SoundCloud

    Weird SoundCloud, or SoundClown, is a mashup parody music scene taking place on the online distribution platform SoundCloud. The scene has been described by its producers and music journalists to be a satirical take on electronic dance music, and useless, throwaway internet content. One critic, Audra Schroeder, categorized it as an in-joke that is "deconstructing and reshaping memes and popular music, recontextualizing the sacred texts of millennial chat rooms." == Origins == In a January 2014 interview, DJ Kevin Wang suggested that the Weird SoundCloud has "been around in the last one to two years", but started to gain much more popularity the previous year through electronic dance music internet blogs. Weird SoundCloud producer Ideaot suggested that some in the phenomenon came from the YouTube poop scene. Another producer in the community, DJ @@ (AT-AT), reasoned that producers joining the scene "want to express their musicality, see it as a more mature form of YouTube Poop," or are "just looking for recognition on social media sites." AT-AT said that it was "a fun thing to do, and after I stopped making proper music I felt I needed a bit of an outlet for my creativity. The fact that people enjoyed it and/or treated it as a travesty (Direct quote from one of my tracks) spurs me on." == Characteristics == Weird SoundCloud is a mash-up and parody music genre labeled by journalist Audra Schroeder as an in-joke that is "deconstructing and reshaping memes and popular music, recontextualizing the sacred texts of millennial chat rooms." Most tracks range from around 30 seconds to one minute in length. The people who make weird SoundCloud are known as SoundClowns, a term coined by producer Dicksoak. Ideaot described the weird SoundCloud community as "largely just people who are friends with each other." Noisey critic Ryan Bassil spotlight the variety of music coming out of the weird SoundCloud landscape: "One minute you could be listening to the Seinfeld theme reimagined as an aneurysm inducing dubstep corker, the next, you're recovering from hearing a version of Tenacious D's "Tribute" that's akin to having a stroke." Bassil analyzes that the tracks "often take the past and repurpose it into something that, although not altogether useful, sounds fresh and reflective of the abstract, confusing panoramic that encapsulates the modern internet." Bassil compared the lexicon of SoundClown's track titles to that of Reddit and Twitter users. According to Dicksoak, most works of the style are critiques of EDM or "are just uploaded because they sound funny." However, Bassil disagreed, writing that there are also many tracks that keep repurposing a certain meme, such as "mom's spaghetti" or the re-use of vocals from recordings by hip hop group Death Grips. He describe the scene's re-use of memes as a satirical take on pointless online content that is only on the internet to "do nothing other than fill the void": They're changing the format of the original work's intended message or audience - a technique often employed by top-tier digital media companies - and in doing so they're sarcastically, ironically, taking the piss out of what Web 2.0's turned into - an open arena where the most ridiculous, unashamed, often pointless piggy-back content can rack up thousands and thousands of clicks. == Notable examples == There are mash-ups that "disrupt the flow of popular music", in the words of writer Schroeder, such as a "flutedrop" remix of the Miley Cyrus song "Wrecking Ball" and Shaliek's mashup of music by Bruno Mars and Korn. In November 2013, Wang released a set of mp3 files on SoundCloud named Best Drops Ever, which included tracks like "A Drop So Epic a Bunch of NYU Bros Already Bought a 3-Day Weekend Pass for It" and "A Drop So Crazy You'll Kill Your Family". All of the tracks start as normal electronic dance music build-ups, before they drop into a "bait and switch" audio or film clip such as Filet-O-Fish commercials, the Whitney Houston song "I Will Always Love You" and the film Bambi (1942) that ruins the anticipation. The collection is a parody of the over-importance and over-focus of the drop and lack of care of the overall quality of a song common in the modern electronic dance music scene. Wang has released more than 45 tracks in the weird SoundCloud, some of them receiving around a million plays. Subgenres of Weird SoundCloud include Macklecore, mash-ups and remixes that include the works of American hip-hop recording artist Macklemore, and Biggiewave, which include samples of songs from the album Ready to Die (1994) by The Notorious B.I.G. Common audio and meme sources used include Skrillex, the Martin Garrix track "Animals", Thomas the Tank Engine, Shrek, Macklemore, "Gangnam Style", the Bruno Mars track "Uptown Funk", the Disturbed track "Down with the Sickness", Space Jam, the Childish Gambino track "Bonfire", the Death Grips track "Takyon" and air horn sound effects. == Reception == Bassil praised the SoundClown scene as "loveable and strangely honest", reasoning that it "just reminds me that we're all humans on the internet, all searching for #content that means something, something to connect with, but usually only dredging up bastardised versions of things we've already read, seen, or watched before." Bassil also described the weird SoundCloud as a more successful version of a similar scene known as weird YouTube; the reason for the success of SoundClowns is due to SoundCloud's discovery algorithm: "Small collectives and trends are able to form, and there's an abundance of tracks from artists who are almost forging careers out of it, as opposed to uploading one viral hit." Publications have made lists of weird SoundCloud works, such as BuzzFeed's "23 Of The Weirdest Songs On Soundcloud", Obsev's "Weird SoundCloud Mashups That Must've Been Made While Drunk", and Thump's "9 of the Best and Most Upsetting Soundclowns we Could Find", where writer Isabelle Hellyer called it the "most influential genre of music in human history." A Your EDM writer called it "oddly addicting."

    Read more →
  • Social news website

    Social news website

    A social news website is a website that features user-posted stories. Such stories are ranked based on popularity, as voted on by other users of the site or by website administrators. Users typically comment online on the news posts and these comments may also be ranked in popularity. Since their emergence with the birth of Web 2.0, social news sites have been used to link many types of information, including news, humor, support, and discussion. All such websites allow the users to submit content and each site differs in how the content is moderated. On the Slashdot and Fark websites, administrators decide which articles are selected for the front page. On Reddit and Digg, the articles that get the most votes from the community of users will make it to the front page. Many social news websites also feature an online comment system, where users discuss the issues raised in an article. Some of these sites have also applied their voting system to the comments, so that the most popular comments are displayed first. Some social news websites also have a social networking service, in that users can set up a user profile and follow other users' online activity on the website. Like many other Web 2.0 tools, social news websites use the collective intelligence of all of the users to operate. Social news websites also "impl[y] the technical, economic, legal, and human enhancement of a universally distributed intelligence that will unleash a positive dynamic of recognition and skills mobilization". Social news websites help participants to share a collective vision and awareness of how their actions are integrated with those of other individuals. Social news websites provide a new and innovative way to participate in a community that is constantly being flooded with new information. These social news websites "include opportunities for peer-to-peer learning, a changed attitude toward intellectual property, the diversification of cultural expression, the development of skills valued in the modern workplace, and a more empowered conception of citizenship". These websites can help to shape and reshape democratic opinions and perspectives. Social news sites may mitigate the gatekeeping of mainstream news sources and allow the public to decide what counts as "news", which may facilitate a more participatory culture. Social news sites may also support democratic participation by allowing users from across geographic and national boundaries to access the same information, respond to fellow users' views and beliefs, and create a virtual sphere for users to contribute within. == Websites == === Active === ==== Fark ==== Fark, which started in 1997, features news on any topic. On Fark, users can submit articles to the administrators of the site. Each day, these administrators pick out 50 articles to display on the front page. ==== Slashdot ==== Slashdot, started in 1997, was one of the first social news websites. It focuses mainly on science and technology-related news. Users can submit stories and the editors pick out the best stories each day for the front page. Users can then post comments on the stories. The influx of web traffic that resulted from Slashdot linking to external websites led to the effect being called the Slashdot effect ==== Digg ==== Digg, started in December 2004, introduced the voting system. This system allows users to "digg" or "bury" articles. "Digging" is the equivalent of voting positively, so that popular articles are displayed first. "Burying" does not lower an article's score. However, if an article is buried enough times, it will be automatically deleted from the site. Digg offers a social networking service, as members can follow other members and build personal profiles with information about their interests. ==== Reddit ==== Reddit, started in June 2005, is a social news website where users can submit articles and comments and vote on these submissions. The submissions are organized into categories called "subreddits". Unlike Digg, with Reddit, users can directly affect an article's score. An "upvote" will increase the score and a "downvote" will decrease it. Articles with the highest scores are displayed on the front page. There is also a page for "controversial" articles, that have an almost equal number of upvotes and downvotes. Free speech debates have arisen due to the shutting down of obscene or potentially illegal "subreddits" (including /r/jailbait, a collection of sexually suggestive underage pictures.) Reddit introduced a system of user-created communities called "subreddits", which are essentially categories for a specific type of news. Comments on the featured posts are shown in a hierarchical fashion also based on votes. Users have the ability to earn "karma" for their participation and time on the website. ==== Hacker News ==== Hacker News, started in February 2007, is a social news site focusing on computer science and entrepreneurship, created by Paul Graham and run by his startup incubator, Y Combinator. === Defunct === ==== Newsvine ==== Newsvine, started in March 2006, was a social news website mostly focused on politics, both international and domestic. The Newsvine home page allowed users to customize "seeds" and story feeds. Users received articles via "The Wire" from sources including The Associated Press or The Huffington Post, and from "The Vine" a stream of content from other Newsvine users. The "Top of the Vine" displayed the most voted and commented on articles of the day, week, month, or year. Additionally, Newsvine allowed members to create their own "Customizable Column", which could highlight a user's content posted, recent comments, and information about the specific Newsvine member. ==== feedalizr ==== feedalizr was a cross-platform, desktop social media aggregator built using Adobe Integrated Runtime that consolidates the updates from social media and social networking websites. Users can then use this application to update those sites from their desktop and view a consolidated stream of information. ==== Voat ==== Voat, launched in April 2014 and discontinued in December of 2020, was also a social news website and is very similar to Reddit visually and functionally. The site's userbase included a large number of alt right users, many of whom migrated to Voat after being banned on Reddit. ==== Prismatic ==== Prismatic combined machine learning, user experience design, and interaction design to create a new way to discover, consume, and share media. Prismatic software used social network aggregation and machine learning algorithms to filter the content that aligns with the interests of a specific user. Prismatic integrated with Facebook, Twitter, and Pocket to gather information about user's interests and suggest the most relevant stories to read. ==== Artifact ==== Artifact was an iOS and Android app that used machine learning to personalize news recommendations to readers, and also had social features such as liking articles, commenting, and reputation scores for users.

    Read more →
  • Storyful

    Storyful

    Storyful (stylized as storyful.) is a social media intelligence company headquartered in Dublin, Ireland that is a subsidiary of News Corp, offering services such as social news monitoring, video licensing, and reputation risk management tools for corporate clients. The startup was launched as the first social media newswire, a content aggregator, verifying news sources and online content in Dublin in 2010 by Mark Little, a former journalist with RTÉ News. Storyful was acquired by News Corp in 2013 for USD$25 million. == Background == Mark Little, who had worked as a television journalist for RTÉ One, founded startup Storyful in Dublin, Ireland, in 2010, as a service that "verified news sources and online content". According to Nieman Lab, Storyful had a reputation for content aggregation as a social news agency—finding, verifying, distributing, licensing, and commercializing user-generated content, social media and online content from social networking services, including videos about stories in the news, such as the Syrian Civil War, Arab Spring protests, as well as "smaller viral moments". Storyful aimed to provide authority through its verification and monitoring tools while providing authenticity through user-generated content. On 20 December 2013 News Corp purchased Storyful for US$25 million and opened a New York office in the same building as Fox News' main studios. Little left Storyful in 2015 and Gavin Sheridan, Storyful's director of innovation left in 2014. News Corp CEO Robert Thomson said that through Storyful, News Corp would "define the opportunities that the digital landscape presents, rather than simply adapt to them." After the acquisition, the company expanded its service to include "commercial and creative work". After Murdoch acquired the company, from 2014 through to February 2018, losses "swelled", requiring a series of cash injections from News Corp. During that time the company expanded aggressively globally with a staff of about 200 worldwide up from about 30 in 2014. According to The Guardian, in 2016, journalists were encouraged by Storyful to use the social media monitoring software called Verify developed by Storyful. By installing Verify's web browser extension on their computers, Verify would inform the journalists when social media content had been "verified and cleared". The Guardian revealed that through the Verify plugin, dozens of staff in four offices had access to the journalists browsing activity without them knowing. This data allowed Storyful to actively monitor its own clients' activities on social media and to "turn it into an internal feed" at Storyful that "updates in real time". In November 2018, when a video circulated by Infowars' Paul Joseph Watson appeared to prove that CNN's Jim Acosta's contact with a White House intern was a physical blow, Storyful was able to prove that the 15-second-long clip had been doctored. According to a 21 January 2019 article in CNN Business, Rob McDonagh, the editor of Storyful's U.S. news team, had proven that one of the viral videos that served as catalysts in the January 2019 Lincoln Memorial confrontation at 18 January 2019 Indigenous Peoples March, was posted by a suspicious account, under the handle @2020fight. McDonagh's team validates videos and posts before adding them to their "digest", distinguishing true stories from those that are not. Storyful attempts to validate each post or video before including it in its digest. McDonagh reviewed previous content from @2020fight's account, and found it suspicious because it had a high follower count, a "highly polarized and yet inconsistent political messaging", an "unusually high rate of tweets", and "the use of someone else's image in the profile photo." reporter Donie O'Sullivan said that the @2020fight video that had been posted on 18 January, which had 2.5 million views by 22 January, was the one that "helped frame the news cycle". Currently the website offers a service by which video can be commercially brokered. == Services == Services include a newswire service—one of their "core pillars"—and social news monitoring. By February 2018, Storyful was developing "risk and reputation monitoring" services through which they would source and verify social news, fact-checking it and contextualising it for corporate clients. They were "developing tech tools" to "explore obscure or closed networks" for their intelligence team. can use to explore obscure or closed networks. They "track deviations in social conversations around brands and organisations and catch potential risks before they blow up. Like an alerts system." The company "released a re-booted version of its Newswire platform in 2018. According to FORA, Storyful was developing new tools to combat fake news online. == Clients == When Storyful was acquired by News Corp in 2013, the company already had the Wall Street Journal, the BBC, New York Times, YouTube, ITN and Channel 4 News as clients. By 2018 their clients included CNN, ABC News and Fox News, The New York Times, the Washington Post, in the United States, the Australian Broadcasting Corporation and all of News Corp’s own publications. Most of their "reputation-conscious corporate customers" clients prefer to not be named.

    Read more →
  • Multi-focus image fusion

    Multi-focus image fusion

    Multi-focus image fusion is a multiple image compression technique using input images with different focus depths to make one output image that preserves all information. == Overview == The main idea of image fusion is gathering important and the essential information from the input images into one single image which ideally has all of the information of the input images. The research history of image fusion spans over 30 years and many scientific papers. Image fusion generally has two aspects: image fusion methods and objective evaluation metrics. In visual sensor networks (VSN), sensors are cameras which record images and video sequences. In many applications of VSN, a camera can't give a perfect illustration including all details of the scene. This is because of the limited depth of focus of the optical lens of cameras. Therefore, just the object located in the focal length of camera is focused and clear, and other parts of the image are blurred. VSN captures images with different depths of focus using several cameras. Due to the large amount of data generated by cameras compared to other sensors such as pressure and temperature sensors and some limitations of bandwidth, energy consumption and processing time, it is essential to process the local input images to decrease the amount of transmitted data. == Multi-Focus image fusion in the spatial domain == Huang and Jing have reviewed and applied several focus measurements in the spatial domain for the multi-focus image fusion process, suitable for real-time applications. They mentioned some focus measurements including variance, energy of image gradient (EOG), Tenenbaum's algorithm (Tenengrad), energy of Laplacian (EOL), sum-modified-Laplacian (SML), and spatial frequency (SF). Their experiments showed that EOL gave better results than other methods like variance and spatial frequency. == Multi-Focus image fusion in multi-scale transform and DCT domain == Image fusion based on the multi-scale transform is the most commonly used and promising technique. Laplacian pyramid transform, gradient pyramid-based transform, morphological pyramid transform and the premier ones, discrete wavelet transform, shift-invariant wavelet transform (SIDWT), and discrete cosine harmonic wavelet transform (DCHWT) are some examples of image fusion methods based on multi-scale transform. These methods are complex and have some limitations e.g. processing time and energy consumption. For example, multi-focus image fusion methods based on DWT require a lot of convolution operations, so they take more time and energy to process. Therefore, most methods in multi-scale transform are not suitable for real-time applications. Moreover, these methods are not very successful along edges, due to the wavelet transform process missing the edges of the image. They create ringing artefacts in the output image and reduce its quality. Due to the aforementioned problems in the multi-scale transform methods, researchers are interested in multi-focus image fusion in the DCT domain. DCT-based methods are more efficient in terms of transmission and archiving images coded in Joint Photographic Experts Group (JPEG) standard to the upper node in the VSN agent. A JPEG system consists of a pair of an encoder and a decoder. In the encoder, images are divided into non-overlapping 8×8 blocks, and the DCT coefficients are calculated for each. Since the quantization of DCT coefficients is a lossy process, many of the small-valued DCT coefficients are quantized to zero, which corresponds to high frequencies. DCT-based image fusion algorithms work better when the multi-focus image fusion methods are applied in the compressed domain. In addition, in the spatial-based methods, the input images must be decoded and then transferred to the spatial domain. After implementation of the image fusion operations, the output fused images must again be encoded. DCT domain-based methods do not require complex and time-consuming consecutive decoding and encoding operations. Therefore, the image fusion methods based on DCT domain operate with much less energy and processing time. Recently, a lot of research has been carried out in the DCT domain. DCT+Variance, DCT+Corr_Eng, DCT+EOL, and DCT+VOL are some prominent examples of DCT based methods.

    Read more →
  • IBM remote batch terminals

    IBM remote batch terminals

    The IBM 2780 and the IBM 3780 are devices developed by IBM for performing remote job entry (RJE) and other batch functions over telephone lines; they communicate with the mainframe via Binary Synchronous Communications (BSC or Bisync) and replaced older terminals using synchronous transmit-receive (STR). In addition, IBM has developed workstation programs for the 1130, 360/20, 2922, System/360 other than 360/20, System/370 and System/3. == 2780 Data Transmission Terminal == The 2780 Data Transmission Terminal first shipped in 1967. It consists of: A line printer similar to the IBM 1443 that can print up to 240 lines per minute (lpm), or 300 lpm using an extremely restricted character set. A card reader/punch unit, similar to an IBM 1442, that can read up to 400 cards per minute (cpm) and can punch up to 355 cpm. A line buffer that stores data received or to be transmitted over the communications line. A binary synchronous adapter which controls the flow of data over the communications line. The 2780 is capable of local (offline) card to print operation. It comes in four models: Model 1: Can read punched cards and transmit the data to a remote host computer, and can receive and print data sent by the host. Model 2: Same as Model 1 but adds the ability to punch card data received from the host. Model 3: Can only print data received from the host, but not send data to it. Model 4: Can read and punch card data, but has no printing capabilities. The 2780 uses a dedicated communication line at speeds of 1200, 2000, 2400 or 4800 bits per second. It is a half duplex device, although full duplex lines can be used with some increase in throughput. It can communicate in Transcode (a 6-bit code), 8-bit EBCDIC, or 7-bit ASCII. == 2770 Data Communication System == The 2770, announced in 1969, "was said to surpass all other IBM terminals in the variety of available input-output devices." The 2770 was developed by the IBM General Products Division (GPD) in Rochester, MN. It comes standard with a desktop terminal with keyboard. The printer and other devices (any two in any combination) can be attached to the 2772 Multi-Purpose Control unit. Possible devices include: 50 Magnetic Data Inscriber 545 Card Punch Model 3 (non-printing) or Model 4 (printing) 1017 Paper Tape Reader 1018 Paper Tape Punch 1053 Printer Model 1 1255 Magnetic Character Reader Models 1, 2 or 3 2203 Printer Model A1 or A2 2213 Printer Model 1 or 2 2265 Display Station Model 2 2502 Card Reader Model A1 or A2 5496 Data Recorder == 3780 Data Communications Terminal == In May 1972, IBM announced the IBM 3780, an enhanced version of the 2780. The 3780 was developed by IBM's Data Processing Division (DPD). There is one model, with an optional card punch. The 3780 drops Transcode support and incorporates several performance enhancements. It supports compression of blank fields in data using run-length encoding. It provides the ability to interleave data between devices, introduces double buffering, and adds support for the Wait-before-transmit ACKnowledgement (WACK) and Temporary Text Delay (TTD) Binary Synchronous control characters. The integrated punched card unit can read cards at 600 cards per minute. The integrated printer is rated at 300, 350 or 425 lines per minute based on characters set (63, 52 or 39 characters). The 3781 Card Punch is an optional feature. It punches 160 columns per second, or 91 cards per minute if all 80 columns are punched. The IBM 2780 and 3780 were later emulated on various types of equipment, including eventually the personal computer. A notable early emulation was the DN60, by Digital Equipment Corporation in the late 1970s. == 3770 Data Communications System == In 1974 IBM Data Processing Division (DPD) offered a successor to the 3780, called the 3770 Data Communications System, supporting SDLC, BSC, BSC Multi-leaving and SNA, depending on the configuration. The 3770 is a family of desk console style terminals that offers a variety of keyboard and printer combinations as well as I/O equipment attachment and communications features. The terminals come built into a desk and include the following models: 3771 Communication Terminal (optional card reader, optional card punch, wire matrix printer) Models 1 (40 cps printer), 2 (80 cps printer), and 3 (120 cps printer). 3773 Communication Terminal (diskette, wire matrix printer) Models 1 (40 cps printer), 2 (80 cps printer), and 3 (120 cps printer). Each model has a P version which adds some programming features. 3774 Communication Terminal (optional card reader, optional card punch, optional belt printer, wire matrix printer) Models 1 (80 cps printer), and 2 (120 cps printer). Each model has a P version which adds some programming features, a 480-character display and a non-removable diskette. 3775 Communication Terminal (optional card reader, optional card punch, optional diskette, belt printer) Model 1 (120 lpm printer). The model P1 adds some programming features, a 480-character display and a non-removable diskette. 3776 Communication Terminal (optional card reader, optional card punch, optional diskette, belt printer) Models 1 (300 lpm printer) and 2 (400 lpm printer). Models 3 and 4 are similar to models 1 and 2. 3777 Communication Terminal (optional card reader, optional diskette, train printer) Model 1 (up to 1000 lpm printer depending on character set). Model 2 adds an optional card punch, model 3 adds an optional magnetic tape drive and model 4 replaces the train printer with a slower model called the IBM 3262. The model 4 also allows a second, optional, 3262. The following I/O devices can be attached to a 3770 terminal: IBM 2502 Card Reader: Models A1 (up to 150 card per minute), A2 (up to 300 cards per minute) or A3 (up to 400 cards per minute) IBM 3203 Printer Model 3: 1000 LPM using 48 character set IBM 3501 Card Reader: Up to 50 cards per minute desktop unit IBM 3521 Card Punch: Up to 50 cards per minute IBM 3782 Card Attachment unit, which allows the 2502 or 3521 to be attached to any terminal except the 3777 IBM 3784 Line Printer, can be attached to a 3774 as a second printer. Up to 155 LPM with 48 characters set print belt. == Workstation programs == IBM distributes workstation programs with systems software including OS/360 Attached Support Processor (ASP) Houston Automatic Spooling Priority (HASP and HASP II) Operating System/Virtual Storage 1 (OS/VS1) Operating System/Virtual Storage 2 (OS/VS2 MVS) Release 2 through 3.8 MVS versions from MVS/SP Version 1 through z/OS Priority Output Writers, Execution processors and input Readers (POWER) Remote Spooling Communications Subsystem (RSCS) Except for the RJE workstation programs in OS/360, these programs use a variation of BSC known as Multi-leaving. In addition, IBM provides separately ordered workstation programs using BSC. Systems Network Architecture (SNA) and TCP/IP. Workstation programs are available from IBM and third-party vendors to support all of these protocols: 2770/3770 2780/3780 Multileaving Network Job Entry (NJE) OS/360 RJE SNA TCP/IP

    Read more →
  • Data preservation

    Data preservation

    Data preservation is the act of conserving and maintaining both the safety and integrity of data. Preservation is done through formal activities that are governed by policies, regulations and strategies directed towards protecting and prolonging the existence and authenticity of data and its metadata. Data can be described as the elements or units in which knowledge and information is created, and metadata are the summarizing subsets of the elements of data; or the data about the data. The main goal of data preservation is to protect data from being lost or destroyed and to contribute to the reuse and progression of the data. == History == Most historical data collected over time has been lost or destroyed. War and natural disasters combined with the lack of materials and necessary practices to preserve and protect data has caused this. Usually, only the most important data sets were saved, such as government records and statistics, legal contracts and economic transactions. Scientific research and doctoral theses data have mostly been destroyed from improper storage and lack of data preservation awareness and execution. Over time, data preservation has evolved and has generated importance and awareness. We now have many different ways to preserve data and many different important organizations involved in doing so. The first digital data preservation storage solutions appeared in the 1950s, which were usually flat or hierarchically structured. While there were still issues with these solutions, it made storing data much cheaper, and more easily accessible. In the 1970s relational databases as well as spreadsheets appeared. Relational data bases structure data into tables using structured query languages which made them more efficient than the preceding storage solutions, and spreadsheets hold high volumes of numeric data which can be applied to these relational databases to produce derivative data. More recently, non-relational (non-structured query language) databases have appeared as complements to relational databases which hold high volumes of unstructured or semi-structured data. == Importance == The scope of data preservation is vast. Everything from governmental to business records to art essentially can be represented as data, and is amenable to be lost. This then leads to loss of human history, for perpetuity. Data can be lost on a small or independent scale whether it's personal data loss, or data loss within businesses and organizations, as well as on a larger or national or global scale which can negatively and potentially permanently affect things such as environmental protection, medical research, homeland security, public health and safety, economic development and culture. The mechanisms of data loss are also as many as they are varied, spanning from disaster, wars, data breaches, negligence, all the way through simple forgetting to natural decay. Ways in which data collections can be used when preserved and stored properly can be seen through the U.S. Geological Survey, which stores data collections on natural hazards, natural resources, and landscapes. The data collected by the Survey is used by federal and state land management agencies towards land use planning and management, and continually needs access to historical reference data. == Related Concepts == In contrast, data holdings are collections of gathered data that are informally kept, and not necessarily prepared for long-term preservation. For example, a collection or back-up of personal files. Data holdings are generally the storage methods used in the past when data has been lost due to environmental and other historical disasters. Furthermore, data retention differs from data preservation in the sense that by definition, to retain an object (data) is to hold or keep possession or use of the object. To preserve an object is to protect, maintain and keep up for future use. Retention policies often circle around when data should be deleted on purpose as well, and held from public access, while preservation prioritizes permanence and more widely shared access. Thus, data preservation exceeds the concept of having or possessing data or back up copies of data. Data preservation ensures reliable access to data by including back-up and recovery mechanisms that precede the event of a disaster or technological change. == Methods == === Digital === Digital preservation, is similar to data preservation, but is mainly concerned with technological threats, and solely digital data. Essentially digital data is a set of formal activities to enable ongoing or persistent use and access of digital data exceeding the occurrence of technological malfunction or change. Digital preservation is aware of the inevitable change in technology and protocols, and prepares for data that will need to be accessible across new types of technologies and platforms while the integrity of the data and metadata are being conserved. Technology, while providing great process in conserving data that may not have been possible in the past, is also changing at such a quick rate that digital data may not be accessible anymore due to the format being incompatible with new software. Without the use of data preservation much of our existing digital data is at risk. The majority of methods used towards data preservation today are digital methods, which are so far the most effective methods that exist. === Archives === Archives are a collection of historical documents and records. Archives contribute and work towards the preservation of data by collecting data that is well organized, while providing the appropriate metadata to confirm it. An example of an important data archive is The LONI Image Data Archive, which is an archive that collects data regarding clinical trials and clinical research studies. === Catalogues, directories and portals === Catalogues, directories and portals are consolidated resources which are kept by individual institutions, and are associated with data archives and holdings. In other words, the data is not presented on the site, but instead might act as metadata and aggregators, and may administer thorough inventories. === Repositories === Repositories are places where data archives and holdings can be accessed and stored. The goal of repositories is to make sure that all requirements and protocols of archives and holdings are being met, and data is being certified to ensure data integrity and user trust. Single-site Repositories A repository that holds all data sets on a single site. An example of a major single-site repository the Data Archiving and Networking Services which is a repository which provides ongoing access to digital research resources for the Netherlands. Multi-Site Repositories A repository that hosts data set on multiple institutional sites. An example of a well known multi-site repository is OpenAIRE which is a repository that hosts research data and publications collaborating all of the EU countries and more. OpenAIRE promotes open scholarship and seeks to improves discover-ability and re-usability of data. Trusted Digital Repository A repository that seeks to provide reliable, trusted access over a long period of time. The repository can be single or multi-sited but must cooperate with the Reference Model for an Open Archival Information System, as well as adhere to a set of rules or attributes that contribute to its trust such as having persistent financial responsibility, organizational buoyancy, administrative responsibility security and safety. An example of a trusted digital repository is The Digital Repository of Ireland (DRI) which is a multi-site repository that hosts Ireland's humanity and social science data sets. === Cyber Infrastructures === Cyber infrastructures which consists of archive collections which are made available through the system of hardware, technologies, software, policies, services and tools. Cyber infrastructures are geared towards the sharing of data supporting peer-to-peer collaborations and a cultural community. An example of a major cyber-infrastructure is The Canadian Geo-spatial Data Infrastructure which provides access to spatial data in Canada.

    Read more →
  • Symmetric Boolean function

    Symmetric Boolean function

    In mathematics, a symmetric Boolean function is a Boolean function whose value does not depend on the order of its input bits, i.e., it depends only on the number of ones (or zeros) in the input. For this reason they are also known as Boolean counting functions. There are 2n+1 symmetric n-ary Boolean functions. Instead of the truth table, traditionally used to represent Boolean functions, one may use a more compact representation for an n-variable symmetric Boolean function: the (n + 1)-vector, whose i-th entry (i = 0, ..., n) is the value of the function on an input vector with i ones. Mathematically, the symmetric Boolean functions correspond one-to-one with the functions that map n+1 elements to two elements, f : { 0 , 1 , . . . , n } → { 0 , 1 } {\displaystyle f:\{0,1,...,n\}\rightarrow \{0,1\}} . Symmetric Boolean functions are used to classify Boolean satisfiability problems. == Special cases == A number of special cases are recognized: Majority function: their value is 1 on input vectors with more than n/2 ones Threshold functions: their value is 1 on input vectors with k or more ones for a fixed k All-equal and not-all-equal function: their values is 1 when the inputs do (not) all have the same value Exact-count functions: their value is 1 on input vectors with k ones for a fixed k One-hot or 1-in-n function: their value is 1 on input vectors with exactly one one One-cold function: their value is 1 on input vectors with exactly one zero Congruence functions: their value is 1 on input vectors with the number of ones congruent to k mod m for fixed k, m Parity function: their value is 1 if the input vector has odd number of ones The n-ary versions of AND, OR, XOR, NAND, NOR and XNOR are also symmetric Boolean functions. == Properties == In the following, f k {\displaystyle f_{k}} denotes the value of the function f : { 0 , 1 } n → { 0 , 1 } {\displaystyle f:\{0,1\}^{n}\rightarrow \{0,1\}} when applied to an input vector of weight k {\displaystyle k} . === Weight === The weight of the function can be calculated from its value vector: | f | = ∑ k = 0 n ( n k ) f k {\displaystyle |f|=\sum _{k=0}^{n}{\binom {n}{k}}f_{k}} === Algebraic normal form === The algebraic normal form either contains all monomials of certain order m {\displaystyle m} , or none of them; i.e. the Möbius transform f ^ {\displaystyle {\hat {f}}} of the function is also a symmetric function. It can thus also be described by a simple (n+1) bit vector, the ANF vector f ^ m {\displaystyle {\hat {f}}_{m}} . The ANF and value vectors are related by a Möbius relation: f ^ m = ⨁ k 2 ⊆ m 2 f k {\displaystyle {\hat {f}}_{m}=\bigoplus _{k_{2}\subseteq m_{2}}f_{k}} where k 2 ⊆ m 2 {\displaystyle k_{2}\subseteq m_{2}} denotes all the weights k whose base-2 representation is covered by the base-2 representation of m (a consequence of Lucas’ theorem). Effectively, an n-variable symmetric Boolean function corresponds to a log(n)-variable ordinary Boolean function acting on the base-2 representation of the input weight. For example, for three-variable functions: f ^ 0 = f 0 f ^ 1 = f 0 ⊕ f 1 f ^ 2 = f 0 ⊕ f 2 f ^ 3 = f 0 ⊕ f 1 ⊕ f 2 ⊕ f 3 {\displaystyle {\begin{array}{lcl}{\hat {f}}_{0}&=&f_{0}\\{\hat {f}}_{1}&=&f_{0}\oplus f_{1}\\{\hat {f}}_{2}&=&f_{0}\oplus f_{2}\\{\hat {f}}_{3}&=&f_{0}\oplus f_{1}\oplus f_{2}\oplus f_{3}\end{array}}} So the three variable majority function with value vector (0, 0, 1, 1) has ANF vector (0, 0, 1, 0), i.e.: Maj ( x , y , z ) = x y ⊕ x z ⊕ y z {\displaystyle {\text{Maj}}(x,y,z)=xy\oplus xz\oplus yz} === Unit hypercube polynomial === The coefficients of the real polynomial agreeing with the function on { 0 , 1 } n {\displaystyle \{0,1\}^{n}} are given by: f m ∗ = ∑ k = 0 m ( − 1 ) | k | + | m | ( m k ) f k {\displaystyle f_{m}^{}=\sum _{k=0}^{m}(-1)^{|k|+|m|}{\binom {m}{k}}f_{k}} For example, the three variable majority function polynomial has coefficients (0, 0, 1, -2): Maj ( x , y , z ) = ( x y + x z + y z ) − 2 ( x y z ) {\displaystyle {\text{Maj}}(x,y,z)=(xy+xz+yz)-2(xyz)} == Examples ==

    Read more →
  • CHAOS (chess)

    CHAOS (chess)

    CHAOS (Chess Heuristics and Other Stuff) is a chess playing program that was developed by programmers working at the RCA Systems Programming division in the late 1960s. It played competitively in computer chess competitions in the 1970s and 1980s. It differed from other programs of that era in its look-ahead philosophy, choosing to use chess knowledge to evaluate fewer positions and continuations as opposed to simple evaluations that relied on deep look-ahead to avoid bad moves. == Introduction == CHAOS was originally developed by Ira Ruben, Fred Swartz, Victor Berman, Joe Winograd and William Toikka while working at RCA in Cinnaminson, NJ. Its name is an acronym for 'Chess Heuristics and Other Stuff.' Program development moved to the Computing Center of the University of Michigan when Swartz changed jobs, and Mike Alexander joined the development group. Swartz, Alexander and Berman were continuously group members from that point onward in CHAOS' evolution, as others of the original authors left and new members contributed episodically. Chess Senior Master Jack O'Keefe contributed to CHAOS' development from about 1980 onwards. CHAOS was written in Fortran, except for low-level board representation manipulations written in assembly language or C. Due to this portability, it ran on RCA, Univac and IBM-compatible mainframes in its lifetime. CHAOS heralds from the mainframe computing era when only machines of that capacity were able to play at a high level. Consequently, development and testing could only take place at off-peak times for production use of the machine. In a competition, CHAOS had to run on a dedicated mainframe with a telephone link to the match venue. In its later years, CHAOS ran on computers on the machine assembly floor of Amdahl Corporation on MTS. == Background == === Chess and artificial intelligence === Mathematicians Claude Shannon and Alan Turing, working separately, were the first to view playing chess as a challenge to machines. Working for AT&T / Bell Labs with its access to telephone switching equipment, Shannon built a relay-based machine that learned how to work its way through a two-dimensional, 5x5 cell maze in 1949. Shannon viewed this as an analogue of the way that organisms learn things about their natural environment. There is a random element to searching it, a memory element to benefit from the search outcome, and a reward element that reinforces learning when the global outcome is favorable to the organism. Soon afterward, Shannon wrote a mathematical analysis of the game of chess, published in 1950. Like with the maze, he broke down game play into the necessary elements for reinforcement learning. Associated with each board configuration a move will be made from, there is a numerical score. To decide what move to make, a player wants to maximize their own position's score after the move and to minimize their opponent's score (a minimax view). Since there are about 32 possible moves at each of the early stages of the game, and about 40 moves and responses in each game, then there are about 32 80 {\displaystyle 32^{80}} or about 10 120 {\displaystyle 10^{120}} possible games - an impossibly large set to evaluate completely. Therefore, there must be a way to limit the number of moves to look ahead for to find the best one. Reducing the game to these few key elements provided a way to think about human intelligence in general. Shannon became part of a wider group using computing machines to mimic aspects of human intelligence that grew into the general idea of artificial intelligence. (Other members of this group were John McCarthy, Herbert Simon, Allen Newell, Alan Kotok, Alex Bernstein and Richard Greenblatt.) The paradigm that evolved was that there was a quantification of the position on the board into a score, an evaluation method to find favorable outcomes (minimax, later alpha-beta pruning), and a strategy to manage the combinatorial explosion of the look-ahead possibilities. By the early 1960s, there were computer programs that played chess at a rudimentary level. They used very simple evaluation functions for each position and tried to search as far forward as was practical given the time constraints and available compute power. Naturally, programmers optimized their code to use the available computing resources. This led to a major philosophical divide among chess programs: those that tried to evaluate as many positions as possible, and those that tried to evaluate the most promising move sequences as deeply as possible. CHAOS was firmly in the camp believing only the most promising moves should be evaluated in depth. Said Swartz, "The 'brute force people' ... look at every (possible move) no matter what garbage it is. Most moves are just terrible, terrible moves, and most computing time is being spent on pure garbage." The program spent more time evaluating each board position in the expectation that it would find the most promising lines of play to explore in depth. In 1983, the then-fastest chess program (Belle) evaluated 110,000 positions per second, and typical programs 1000–50,000 per second, whereas CHAOS evaluated about 50-100 per second. === Machine learning and strategies to manage search === From about 1949 onward, Arthur Samuel began work for IBM on machine learning, culminating in a checkers-playing program in 1952 and publications on the topic. Concurrently, Christopher Strachey created Checkers, a program to play the board game of checkers in 1951, but it had no capacity to learn from its play. Checkers was chosen by both authors because it was simpler than chess yet contained the basic characteristics of an intellectual activity, and, in Samuel's view, was a test-bed in which heuristic procedures and learning processes could be evaluated quickly. Checker playing programs introduced the notion of the game tree and evaluating play to various depths to choose the best move. The complexity of chess, however, promoted it to the status of an analogue for human intelligence, and it attracted computer scientists' attention, who referred to it as research into artificial intelligence (AI). Like checkers, it required a numerical assessment of each arrangement of chess pieces on a board. It also required looking ahead to future moves to decide how to play the present position. Due to the enormous number of possible moves, there had to be a way to confine the look-ahead search to the most promising lines of play. From these factors, the notion of minimax score evaluation developed and, later, alpha-beta tree pruning to abandon looking at positions worse than any that have already been examined. === Chess search strategies === The AI community viewed artificial intelligence as comprising two parts: a way to symbolically quantify the knowledge in hand (a chess board position), and a set of heuristics to limit look-ahead to the consequences of a move. The early chess playing programs attempted to look forward as far as possible, perhaps to 3 moves ahead by each player, and to choose the best outcome. This led to the horizon effect, whereby a key move 4 or more moves ahead would be unexamined and therefore missed. Consequently, the programs were quite weak and heuristics to manage the search became important in their development. CHAOS used a selective search strategy with iterative widening. As chess programs evolved, they incorporated books of opening lines of play from historic sources. Nowadays, book moves are catalogued in machine-readable form, but originally programmers had to type them in. CHAOS had an extensive book for its time of around 10,000 moves that O'Keefe helped to develop. A problem with play from an opening book is the behavior of the program when the play leaves the book: the positional advantage may be so subtle that the evaluation scheme may be unable to understand it, leading to very wide and shallow searches to establish a line of play. The horizon effect again plagues move selection after leaving the book. CHAOS mitigated these problems by only using book lines that it could understand, and by relying on cached analyses of continuations out of the book made while the opponent's clock was running. == Game Play History == CHAOS played in twelve ACM computer chess tournaments and four World Computer Chess Championships (WCCC). Its debut was the ACM computer chess tournament in 1973, taking 2nd place. In 1974, it again won 2nd place in the WCCC, defeating the tournament favorite Chess 4.0 but losing to Kaissa. CHAOS was close to winning the 1980 WCCC, but lost to Belle in a playoff. The 1985 ACM computer chess tournament was CHAOS' last competition. One of CHAOS' notable victories was over Chess 4.0 at the 1974 WCCC tournament. Chess 4.0 was unbeaten by any other program up until then. Playing as white, CHAOS made a knight sacrifice (16 Nd4-e6!!) that traded material for open lines of attack and eventually won the game. CHAOS’ authors thought the move was due to a

    Read more →
  • Social television

    Social television

    Social television is the union of television and social media. Millions of people now share their TV experience with other viewers on social media such as Twitter and Facebook using smartphones and tablets. TV networks and rights holders are increasingly sharing video clips on social platforms to monetise engagement and drive tune-in. The social TV market covers the technologies that support communication and social interaction around TV as well as companies that study television-related social behavior and measure social media activities tied to specific TV broadcasts – many of which have attracted significant investment from established media and technology companies. The market is also seeing numerous tie-ups between broadcasters and social networking players such as Twitter and Facebook. The market is expected to be worth $256bn by 2017. Social TV was named one of the 10 most important emerging technologies by the MIT Technology Review on Social TV in 2010. And in 2011, David Rowan, the editor of Wired magazine, named Social TV at number three of six in his peek into 2011 and what tech trends to expect to get traction. Ynon Kreiz, CEO of the Endemol Group told the audience at the Digital Life Design (DLD) conference in January 2011: "Everyone says that social television will be big. I think it's not going to be big—it's going to be huge". Much of the investment in the earlier years of social TV went into standalone social TV apps. The industry believed these apps would provide an appealing and complimentary consumer experience which could then be monetized with ads. These apps featured TV listings, check-ins, stickers and synchronised second-screen content but struggled to attract users away from Twitter and Facebook. Most of these companies have since gone out of business or been acquired amid a wave of consolidation and the market has instead focused on the activities of the social media channels themselves – such as Twitter Amplify, Facebook Suggested Videos and Snapchat Discover – and the technologies that support them. == Twitter == Twitter and Facebook are both helping users connect around media, which can provoke strong debate and engagement. Both social platforms want to be the 'digital watercooler' and host conversation around TV because the engagement and data about what media people consume can then be used to generate advertising revenue. As an open platform, conversation on Twitter is closely aligned with real-time events. In May 2013, it launched Twitter Amplify – an advertising product for media and consumer brands. With Amplify, Twitter runs video highlights from major live broadcasts, with advertisers' names and messages playing before the clip. By February 2014, all four major U.S. TV networks had signed up to the Amplify program, bringing a variety of premium TV content onto the social platform in the form of in-tweet real-time video clips. In June 2014, Twitter acquired its Twitter Amplify partner in the U.S. SnappyTV, a company that was helping broadcasters and rights holders to share video content both organically across social and via Twitter's Amplify program. Twitter continues to rely on Grabyo, which has also struck numerous deals with some of the largest broadcasters and rights holders in Europe and North America to share video content across Facebook and Twitter. == Facebook == Facebook made significant changes to its platform in 2014 including updates to its algorithm to enhance how it serves video in users' feeds. It also launched video autoplay to get users to watch the videos in their feeds. It rapidly surpassed Twitter and by the end of 2014 it was enjoying three billion video views a day on its platform and had announced a partnership with the NFL, one of Twitter's most active Twitter Amplify partners. In April 2015, at its F8 Developer Conference, it revealed it was working with Grabyo among other technology partners to bring video onto its platform. Then in July it announced it would be launching Facebook Suggested Videos, bringing related videos and ads to anyone that clicks on a video – a move that not only competed with Twitter's commercial video offering but also put it in direct competition with YouTube. == TV Time == TV Time is a television dedicated social network that allows users to keep track of the television series they watch, as well as films. It also allows them to express their reaction to the media they have seen with episode specific voting for favorite characters and emotional reaction to episodes, as well as commenting in episode restrictive pages. This way users are able to avoid spoilers while also finding a precise audience and community for each of their interactions, as opposed to bigger, non-television dedicated social medias such as Facebook and Twitter where the likelihood of unintentionally reading spoilers is much higher. TV Time offers an analytics service called "TVLytics" where the votes and reactions collected from users can be studied for research and television production purposes. == Advertising == According to Businessinsider.com, there are variety of applications for social TV, including support for TV ad sales, optimizing TV ad buys, making ad buys more efficient, as a complement to audience measurement, and eventually, audience forecasting and real-time optimization. Social TV data can ease access to focus groups and may create a positive feedback loop for generating ultra-sticky TV programming and multi-screen ad campaigns. == In numbers == Viewers share their TV experience on social media in real-time as events unfold: between 88-100m Facebook users login to the platform during the primetime hours of 8pm – 11pm in the US. The volume of social media engagement in TV is also rising – according to Nielsen SocialGuide, there was a 38% increase in tweets about TV in 2013 to 263m. For the 2014 Super Bowl, Twitter reported that a record 24.9 million tweets about the game were sent during the telecast, peaking at 381,605 tweets per minute. Facebook reported that 50 million people discussed the Super Bowl, generating 185 million interactions. The 2014 Oscars generated 5m tweets, viewed by an audience of 37m unique Twitter users and delivering 3.3bn impressions globally as conversation and key moments were shared virally across the platform. In 2014 the All England Lawn Tennis Club (AELTC), hosts of Wimbledon, used Grabyo to share video content across social. The videos were viewed 3.5 million times across Facebook and Twitter. In partnered with Grabyo again in 2015 and the videos generated over 48 million views across Facebook and Twitter. == Television shows with social integration == Here are some examples of how TV executives are integrating social elements with TV shows: C-SPAN streamed tweets from US Senators and Representatives during the quorum call The Voice had the judges of the program tweet during the show and the posts scrolls on the bottom of the screen. The use of Twitter also led to an increase in viewers. "Glee" Entertainment Weekly created a second screen viewing platform for the Glee season 3 premiere. == Related publications == Erika Jonietz. "Making TV Social, Virtually" MIT Technology Review. (January 11, 2010) AmigoTV (Alcatel-Lucent; Coppens et al.) – 2004 www.ist-ipmedianet.org/Alcatel_EuroiTV2004_AmigoTV_short_paper_S4-2.pdf Nextream (MIT Media Lab, Martin et al.) – 2010 Social Interactive Television: Immersive Shared Experiences and Perspectives (P. Cesar, D. Geerts, and K. Chorianopoulos (eds.)) – 2009 Social TV and the Emergence of Interactive TV – Multimedia Research Group – November 2010 Interactive Social TV on Service Oriented Environments: Challenges and Enablers (May 2011) == Systems == Boxee – acquired by Samsung GetGlue – acquired by i.TV Grabyo KIT digital Miso TV Tank Top TV WiO Xbox Live

    Read more →
  • Consumer relationship system

    Consumer relationship system

    Consumer relationship systems (CRS) are specialized customer relationship management (CRM) software applications that are used to handle a company's dealings with its customers. Current consumer relationship systems integrate the software with telephone and call recording systems as well as with corporate systems for input and reporting. Customers can provide input from the company's website directly into the CRS. These systems are popular because they can deliver the 'voice of the consumer' that contributes to product quality improvement and that ultimately increases corporate profits. Consumer relationship systems that provide automated support as well as advanced systems may have artificial intelligence (AI) interfaces that can extract and analyse data collected, or handle basic questions and complaints. == History == The first CRS was developed in the 1980s. In 1981 Michael Wilke and Robert Thornton founded Wilke/Thornton, Inc in Columbus, Ohio, to develop new CRS software.

    Read more →