AI Art For Sale

AI Art For Sale — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Lexxe

    Lexxe is an internet search engine that applies natural language processing in its semantic search technology. Founded in 2005 by Dr. Hong Liang Qiao, Lexxe is based in Sydney, Australia. Today, Lexxe's key focus is sentiment search, following the launch of the news sentiment search site News & Moods (www.newsandmoods.com). Lexxe's search technology has shifted focus several times. It launched its Alpha version in 2005, featuring natural language question answering: users could ask the search engine questions in English in addition to running keyword searches (this feature has been suspended for redevelopment since 2010). It used only algorithms to extract answers from web pages, with no question-answer pair databases prepared in advance. In 2011, Lexxe launched a beta version with a new search technology called Semantic Key. A Semantic Key lets users query with a conceptual keyword (a keyword with a special meaning, hence the term) in order to find instances of the concept, e.g. price → $5.95 or €200; color → red, yellow, white. Example queries are “price: a pound of apples” and “color: ferrari”. With an initial 500 Semantic Keys at the beta launch, Lexxe became the first search engine in the world to offer this search technology. The cost of building Semantic Keys proved too high, however. In 2017, Lexxe launched News & Moods (www.newsandmoods.com), an open platform for news sentiment search and a first step towards a sentiment search feature covering the entire Internet in the Lexxe search engine. News & Moods also comes with smartphone apps for Android and iOS.

  • Strategic Air Command Digital Information Network

    The Strategic Air Command Digital Network (SACDIN) was a United States military computer network that provided computerized record communications, replacing the Data Transmission Subsystem and part of the Data Display Subsystem of the SAC Automated Command and Control System. SACDIN enabled a rapid flow of communications from Headquarters SAC to its fielded forces, such as B-52 bases and ICBM Launch Control Centers.

    == Logistics ==

    Major portions of SACDIN were developed, engineered and installed by the International Telephone and Telegraph (ITT) company, under contract to the Electronic Systems Center.

    == Chronology ==

    1969 - Headquarters SAC submits a request to the Joint Chiefs of Staff to study an expanded communications system, known as the SAC Total Information Network (SATIN). It would interconnect Air Force Satellite Communications (AFSATCOM), Advanced Airborne Command Post (AABNCP), Airborne Command Post (ABNCP), high frequency/single sideband (HF/SSB) radio, SAC Automated Command and Control System (SACCS), Automatic Digital Information Network (AUTODIN), Survivable Low Frequency Communications System (SLFCS) and Command Data Buffer (CDB).

    1977 1 November - SATIN IV was effectively terminated by Congress. The restructured program was renamed SAC Digital Network (SACDIN). It was formulated to meet SAC's minimum essential data communications requirements, while retaining the capability to grow in a modular fashion.

    1986 ?? ??? - SACDIN replaces much of the SAC Automated Command and Control System (SACCS) and the SAC Automated Total Information Network (SATIN).

  • Social media use by businesses

    Social media use by businesses includes a range of applications. Although social media accessed via desktop computers offer a variety of opportunities for companies in a wide range of business sectors, mobile social media, which users can access "on the go" via tablet computers or smartphones, benefit companies because of the location- and time-sensitive awareness of their users. Mobile social media tools can be used for marketing research, communication, sales promotions and discounts, informal employee learning and organizational development, relationship development and loyalty programs, customer service and support, and e-commerce.

    Marketing research: Mobile social media applications provide companies with data about offline consumer movements at a level of detail that was previously accessible to online companies only. These applications allow any business to know the exact time at which a customer who uses social media entered one of its locations, as well as the social media comments made during the visit.

    Communication: Mobile social media communication takes two forms: company-to-consumer (in which a company may establish a connection to a consumer based on its location and provide reviews about locations nearby) and user-generated content. For example, McDonald's offered $5 and $10 gift cards to 100 users randomly selected among those checking in at one of its restaurants. This promotion increased check-ins by 33% (from 2,146 to 2,865), resulted in over 50 articles and blog posts, and prompted several hundred thousand news feeds and Twitter messages.

    Sales promotions and discounts: Although customers have had to use printed coupons in the past, mobile social media allows companies to tailor promotions to specific users at specific times. For example, when launching its California-Cancun service, Virgin America offered users who checked in through Loopt at one of three designated taco trucks in San Francisco or Los Angeles between 11 a.m. and 3 p.m. on 31 August 2010 two tacos for $1 and two flights to Cancun or Cabo for the price of one. This special promotion was only available to people who were at a certain location at a certain time.

    Relationship development and loyalty programs: To strengthen long-term relationships with customers, companies can develop loyalty programs that allow customers who regularly check in via social media at a location to earn discounts or perks. For example, American Eagle Outfitters rewards such customers with a tiered 10%, 15%, or 20% discount on their total purchase.

    Informal employee learning and organizational development: Technologies such as blogs, wiki pages, web forums, social networks and other social media act as technology-enhanced learning (TEL) tools, and their users perceive change in organizational structure, culture and knowledge management. The prerequisite for the successful use of social media is motivated employees who want to use the new technologies. It is therefore central for companies to understand the factors that determine the willingness to use social media.

    Customer service and support: A company can gain cost savings and increase revenue and customer satisfaction by using social media platforms in customer service and support. By using social media tools, companies gain easy, wide-scale contact with their customers and simultaneously increase their brand knowledge.

    E-commerce: Social media sites are increasingly implementing marketing-friendly strategies, creating platforms that are mutually beneficial for users, businesses, and the networks themselves as e-commerce (online purchasing) grows in popularity and accessibility. Users who post comments about a company's product or service benefit because they are able to share their views with their online friends and acquaintances. The company benefits because it obtains insight (positive or negative) into how its product or service is viewed by consumers.
Mobile social media applications such as Amazon.com and Pinterest have started to influence an upward trend in the popularity and accessibility of e-commerce. E-commerce businesses may refer to social media as consumer-generated media (CGM). A common thread running through all definitions of social media is a blending of technology and social interaction for the co-creation of value for the business or organization that is using it. People obtain valuable information, education, news, and other data from electronic and print media. Social media are distinct from industrial and traditional media such as newspapers, magazines, television, and film as they are comparatively inexpensive marketing tools and are highly accessible. They enable anyone, including private individuals, to publish or access information easily. Industrial media generally require significant resources to publish information, and in most cases the articles go through many revisions before being published. This process adds to the cost and the resulting market price. Originally social media was only used by individuals, but now it is used by both businesses and nonprofit organizations and also in government and politics. One characteristic shared by both social and industrial media is the capability to reach small or large audiences; for example, either a blog post or a television show may reach no people or millions of people. Some of the properties that help describe the differences between social and industrial media are: Quality: In industrial (traditional) publishing—mediated by a publisher—the typical range of quality is substantially narrower (skewing to the high quality side) than in niche, unmediated markets like user-generated social media posts. The main challenge posed by the content in social media sites is the fact that the distribution of quality has high variance: from very high-quality items to low-quality, sometimes even abusive or inappropriate content. 
Reach: Both industrial and social media technologies provide scale and are capable of reaching a global audience. Industrial media, however, typically use a centralized framework for organization, production, and dissemination, whereas social media are by their very nature more decentralized, less hierarchical, and distinguished by multiple points of production and utility. Frequency: The number of times users access a type of media per day. Heavy social media users, such as young people, check their social media account numerous times throughout the day. Accessibility: The means of production for industrial media are typically government or corporate (privately owned); social media tools are generally available to the public at little or no cost, or they are supported by advertising revenue. While social media tools are available to anyone with access to Internet and a computer or mobile device, due to the digital divide, the poorest segment of the population lacks access to the Internet and computer. Low-income people may have more access to traditional media (TV, radio, etc.), as an inexpensive TV and aerial or radio costs much less than an inexpensive computer or mobile device. Moreover, in many regions, TV or radio owners can tune into free over the air programming; computer or mobile device owners need Internet access to go to social media sites. Usability: Industrial media production typically requires specialized skills and training. For example, in the 1970s, to record a pop song, an aspiring singer would have to rent time in an expensive professional recording studio and hire an audio engineer. Conversely, most social media activities, such as posting a video of oneself singing a song require only modest reinterpretation of existing skills (assuming a person understands Web 2.0 technologies); in theory, anyone with access to the Internet can operate the means of social media production, and post digital pictures, videos or text online. 
Immediacy: The time lag between communications produced by industrial media can be long (days, weeks, or even months, by the time the content has been reviewed by various editors and fact checkers) compared to social media (which can be capable of virtually instantaneous responses). The immediacy of social media can be seen as a strength, in that it enables regular people to instantly communicate their opinions and information. At the same time, the immediacy of social media can also be seen as a weakness, as the lack of fact checking and editorial "gatekeepers" facilitates the circulation of hoaxes and fake news. Permanence: Industrial media, once created, cannot be altered (e.g., once a magazine article or paper book is printed and distributed, changes cannot be made to that same article in that print run) whereas social media posts can be altered almost instantaneously, when the user decides to edit their post or due to comments from other readers. Community media constitute a hybrid of industrial and social media. Though community-owned, some community radio,

  • Social media and suicide

    Since the rise of social media, there have been numerous cases of individuals being influenced towards committing suicide or self-harm through their use of social media, and even of individuals arranging to broadcast suicide attempts, some successful, on social media. Researchers have studied social media and suicide to determine what, if any, risks social media poses in terms of suicide, and to identify methods of mitigating such risks, if they exist. The search for a correlation has not yet uncovered a clear answer. == Background == Suicide is one of the leading causes of death worldwide, and as of 2020, the second leading cause of death in the United States for those aged 15–34. According to the Center for Disease Control and Prevention, suicide was the third leading cause of death among adolescents in the US, from 1999 to 2006. In 2020, people in the US had a suicide rate of 13.5 per 100,000. Suicide was a leading cause of death in the United States accounting for 48,183 deaths in 2021. Suicide rates increased by 30 per cent from 2000 to 2018 and declined in 2019 and 2020. Suicide remains a significant public health issue worldwide, despite prevention efforts and treatments. Suicide has been identified not only as an individual phenomenon but also as being influenced by social and environmental factors. There is growing evidence that online activity has influenced suicide-related behavior. The use of social media throughout the 21st century has grown exponentially. For this reason, there are a variety of sources that are accessible to the public in various forms, especially social media sites such as Facebook, Instagram, Twitter, YouTube, Snapchat, TikTok and many more. Although these platforms were intended to allow people to connect virtually, these platforms can lead to cyber-bullying, insecurity, and emotional distress, and sometimes may influence a person to attempt suicide. 
Bullying, whether on social media or elsewhere, physical or not, significantly increases victims' risk of suicidal behavior. Since social media was introduced, some people have taken their lives as a result of cyberbullying. Furthermore, suicide rates among teenagers increased from 2010 to 2022, as social media became something that people interact with more throughout their day-to-day lives. Media algorithms tend to popularize videos and posts intended to inform the public of the rising problem, which may give the subject popular appeal among impressionable teenagers. For this reason, social media could pose higher risks through the promotion of different kinds of pro-suicidal sites, message boards, chat rooms, and forums. Moreover, the Internet not only reports suicide incidents but documents suicide methods (for example, suicide pacts, an agreement between two or more people to kill themselves at a particular time and often by the same lethal means). Therefore, the role the Internet plays, particularly social media, in suicide-related behavior is a topic of growing interest.

    == Cyberbullying ==

    There is substantial evidence that the Internet and social media can influence suicide-related behavior. Such evidence includes an increase in exposure to graphic content. A research study conducted by Sameer Hinduja and Justin Patchin found a correlation between cyberbullying and suicide. According to their findings, cyberbullying increases suicidal thoughts by 14.5 percent and suicide attempts by 8.7 percent. Particularly alarming is the fact that children and young people under 25 who are victims of cyberbullying are more than twice as likely to self-harm and engage in suicidal behavior. Overall, teen suicide rates have increased within the past decade. This presents a significant public health concern, with over 40,000 suicides in the United States and nearly one million worldwide annually.
Adolescents involved in cyberbullying often downplay its seriousness by calling it a joke or blaming the victim. These moral disengagement strategies can normalize harmful behavior and reduce feelings of guilt. This normalization may increase emotional distress and contribute to risks like depression and suicidal thoughts. Recent data from the Centers for Disease Control and Prevention reveal that 14.9 per cent of teenagers have experienced online bullying, while 13.6 per cent of teenagers have seriously attempted suicide. Both figures are rising in the United States. Furthermore, in numerous recent incidents, cyberbullying led the victim to commit suicide; this phenomenon is now known as cyberbullicide. Many parents and children are unaware of the dangers and potential legal consequences of cyberbullying. In response, anti-bullying regulations implemented by schools aim to prevent any form of bullying, including through technology, and to protect students from online harassment. While some states have enacted laws against cyberbullying, there are currently no federal regulations addressing this issue.

    == Social media's influence on suicide ==

    The media may portray suicidal behavior or language, which can potentially influence people to act on suicidal ideation. This may include news reports of actual suicides or television shows and films that reenact suicides. Some organizations have proposed guidelines on how the media should report suicide. There is evidence that compliance with the guidelines varies. Some research found it unclear whether the guidelines have successfully reduced the number of suicides. Conversely, other studies reported that the guidelines have worked in some cases.
== Impact of pro-suicidal sites, message boards, chat rooms and forums == Social media platforms have transformed traditional methods of communication by allowing instantaneous and interactive sharing of information created and controlled by individuals, groups, organizations, and governments. As of the third quarter of 2022, Facebook had 266 million monthly active users, between Canada and the US. An immense quantity of information on the topic of suicide is available on the Internet and via social media. The information available on social media on the topic of suicide can influence suicidal behavior, both negatively and positively. The social cognitive theory plays a vital role in suicide attempts influenced through social media. This theory is demonstrated when one is influenced by what they see through various processes that form into modeled behaviors. This can be shown when people post their suicide attempts online or promote suicidal behavior in general. Contributors to these social media platforms may also exert peer pressure and encourage others to take their own lives, idolize those who have killed themselves, and facilitate suicide pacts. These pro-suicidal sites reported the following. For example, on a Japanese message board in 2008, it was shared that people can kill themselves using hydrogen sulfide gas. Shortly afterwards, 220 people attempted suicide in this way, and 208 were successful. Biddle et al. conducted a systematic Web search of 12 suicide-associated terms (e.g., suicide, suicide methods, how to kill yourself, and best suicide methods) to analyze the search results, and found that pro-suicide sites and chat rooms that discussed general issues associated with suicide most often occurred within the first few hits of a search. In another study, 373 suicide-related websites were found using Internet search engines and examined. Among them, 31% were suicide-neutral, 29% were anti-suicide, and 11% were pro-suicide. 
Together, these studies have shown that obtaining pro-suicide information on the Internet, including detailed information on suicide methods, is very easy. While social media has been prevalent in young adult suicide, some young adults find comfort and solace through these platforms. Young adults are making connections with people in like situations that are helping them feel less lonely. Although the public opinion is that message boards are harmful, the following studies show how they point to suicide prevention and have positive influences. A study using content analysis analyzed all of the postings on the AOL Suicide Bulletin Board over 11 months and concluded that most contributions contained positive, empathetic, and supportive postings. Then, a multi-method study was able to demonstrate that the users of such forums experience a great deal of social support and only a small amount of social strain. Lastly, in the survey participants were asked to assess the extent of their suicidal thoughts on a 7-level scale (0, absolutely no suicidal thoughts, to 7, very strong suicidal thoughts) for the time directly before their first forum visit and at the time of the survey. The study found a significant reduction after using the forum. The study however cannot conclude the forum is the only reason for the decrease. Together, these studies show how forums can reduce the number of

  • Pooling layer

    In neural networks, a pooling layer is a kind of network layer that downsamples and aggregates information that is dispersed among many vectors into fewer vectors. It has several uses: it removes redundant information, reducing the amount of computation and memory required; it makes the model more robust to small variations in the input; and it increases the receptive field of neurons in later layers of the network.

    == Convolutional neural network pooling ==

    Pooling is most commonly used in convolutional neural networks (CNNs). Below is a description of pooling in 2-dimensional CNNs. The generalization to n dimensions is immediate. As notation, we consider a tensor $x \in \mathbb{R}^{H \times W \times C}$, where $H$ is the height, $W$ is the width, and $C$ is the number of channels. A pooling layer outputs a tensor $y \in \mathbb{R}^{H' \times W' \times C'}$. We define two variables $f, s$ called the "filter size" (also known as "kernel size") and the "stride". Sometimes it is necessary to use a different filter size and stride for the horizontal and vertical directions. In such cases, we define four variables: $f_H, f_W, s_H, s_W$. The receptive field of an entry in the output tensor $y$ is the set of all entries in $x$ that can affect that entry.

    === Max pooling ===

    Max pooling (MaxPool) is commonly used in CNNs to reduce the spatial dimensions of feature maps. Define $$\mathrm{MaxPool}(x \mid f, s)_{0,0,0} = \max(x_{0:f-1,\,0:f-1,\,0})$$ where $0:f-1$ means the range $0, 1, \dots, f-1$. Note the inclusive upper bound $f-1$, which is where an off-by-one error is easy to make.
The next entry is $$\mathrm{MaxPool}(x \mid f, s)_{1,0,0} = \max(x_{s:s+f-1,\,0:f-1,\,0})$$ and so on. The receptive field of $y_{i,j,c}$ is $x_{is:is+f-1,\,js:js+f-1,\,c}$, so in general, $$\mathrm{MaxPool}(x \mid f, s)_{i,j,c} = \max(x_{is:is+f-1,\,js:js+f-1,\,c})$$ If the horizontal and vertical filter sizes and strides differ, then in general, $$\mathrm{MaxPool}(x \mid f, s)_{i,j,c} = \max(x_{is_H:is_H+f_H-1,\,js_W:js_W+f_W-1,\,c})$$ More succinctly, we can write $y_k = \max(\{x_{k'} \mid k' \text{ in the receptive field of } k\})$. If $H$ is not expressible as $ks+f$ where $k$ is an integer, then when computing the entries of the output tensor on the boundaries, max pooling would attempt to take as inputs variables that lie off the tensor. In this case, how those non-existent variables are handled depends on the padding conditions, illustrated on the right. Global Max Pooling (GMP) is a specific kind of max pooling where the output tensor has shape $\mathbb{R}^{C}$ and the receptive field of $y_c$ is all of $x_{0:H-1,\,0:W-1,\,c}$. That is, it takes the maximum over each entire channel. It is often used just before the final fully connected layers in a CNN classification head.
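
    The max pooling definition above can be sketched in NumPy as follows. This is a minimal illustration rather than a reference implementation: the function names `max_pool` and `global_max_pool` are hypothetical, and the sketch assumes no padding, so that only windows lying fully inside the input are used and the output height is floor((H - f) / s) + 1 (likewise for the width). Note that Python slices exclude their upper bound, so the inclusive range is:is+f-1 from the text becomes `i*s : i*s + f`.

```python
import numpy as np

def max_pool(x, f, s):
    """Max pooling of an (H, W, C) array with a square f-by-f filter and stride s.

    Assumption: no padding, so only windows that fit entirely inside x are used.
    """
    H, W, C = x.shape
    H_out = (H - f) // s + 1
    W_out = (W - f) // s + 1
    y = np.empty((H_out, W_out, C), dtype=x.dtype)
    for i in range(H_out):
        for j in range(W_out):
            # Receptive field of y[i, j, c]: rows is..is+f-1, columns js..js+f-1.
            window = x[i * s : i * s + f, j * s : j * s + f, :]
            y[i, j, :] = window.max(axis=(0, 1))
    return y

def global_max_pool(x):
    """Global max pooling: the maximum over each entire channel."""
    return x.max(axis=(0, 1))

# A 4x4 single-channel input holding the values 0..15 in row-major order.
x = np.arange(16, dtype=float).reshape(4, 4, 1)
y = max_pool(x, f=2, s=2)    # 2x2 output; rows are [5, 7] and [13, 15]
g = global_max_pool(x)       # one value per channel; here 15
```

    With f = s the windows tile the input without overlap, which is the most common configuration; choosing s < f makes neighbouring receptive fields overlap.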
    === Average pooling ===

    Average pooling (AvgPool) is defined similarly: $$\mathrm{AvgPool}(x \mid f, s)_{i,j,c} = \mathrm{average}(x_{is:is+f-1,\,js:js+f-1,\,c}) = \frac{1}{f^{2}} \sum_{k \in is:is+f-1} \sum_{l \in js:js+f-1} x_{k,l,c}$$ Global Average Pooling (GAP) is defined similarly to GMP. It was first proposed in Network-in-Network. Similarly to GMP, it is often used just before the final fully connected layers in a CNN classification head.

    === Interpolations ===

    There are some interpolations of max pooling and average pooling. Mixed pooling is a linear sum of max pooling and average pooling. That is, $$\mathrm{MixedPool}(x \mid f, s, w) = w\,\mathrm{MaxPool}(x \mid f, s) + (1-w)\,\mathrm{AvgPool}(x \mid f, s)$$ where $w \in [0,1]$ is either a hyperparameter, a learnable parameter, or randomly sampled anew every time. Lp pooling is similar to average pooling, but uses the Lp norm average instead of the average: $$y_k = \left(\frac{1}{N} \sum_{k' \text{ in the receptive field of } k} |x_{k'}|^{p}\right)^{1/p}$$ where $N$ is the size of the receptive field, and $p \geq 1$ is a hyperparameter. If all activations are non-negative, then average pooling is the case of $p = 1$, and max pooling is the case of $p \to \infty$. Square-root pooling is the case of $p = 2$. Stochastic pooling samples a random activation $x_{k'}$ from the receptive field with probability $\frac{x_{k'}}{\sum_{k''} x_{k''}}$. Its expected output is $\frac{\sum_{k'} x_{k'}^{2}}{\sum_{k''} x_{k''}}$, an average of the activations weighted by the activations themselves.
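
    As a rough check of the relationships stated above (Lp pooling with p = 1 reduces to average pooling on non-negative inputs, and approaches max pooling as p grows), here is a small NumPy sketch. The helper `pool` and the function names are hypothetical, and the sketch again assumes no padding.

```python
import numpy as np

def pool(x, f, s, reduce_window):
    """Apply reduce_window to every f-by-f window of an (H, W, C) array (no padding)."""
    H, W, C = x.shape
    H_out, W_out = (H - f) // s + 1, (W - f) // s + 1
    y = np.empty((H_out, W_out, C), dtype=float)
    for i in range(H_out):
        for j in range(W_out):
            y[i, j, :] = reduce_window(x[i * s : i * s + f, j * s : j * s + f, :])
    return y

def avg_pool(x, f, s):
    return pool(x, f, s, lambda win: win.mean(axis=(0, 1)))

def max_pool(x, f, s):
    return pool(x, f, s, lambda win: win.max(axis=(0, 1)))

def mixed_pool(x, f, s, w):
    # Linear combination of max pooling and average pooling, with w in [0, 1].
    return w * max_pool(x, f, s) + (1.0 - w) * avg_pool(x, f, s)

def lp_pool(x, f, s, p):
    # Lp-norm average over each window: (mean(|x|^p))^(1/p).
    return pool(x, f, s, lambda win: np.mean(np.abs(win) ** p, axis=(0, 1)) ** (1.0 / p))

x = np.arange(16, dtype=float).reshape(4, 4, 1)   # non-negative activations 0..15
avg = avg_pool(x, 2, 2)          # rows are [2.5, 4.5] and [10.5, 12.5]
lp1 = lp_pool(x, 2, 2, p=1)      # equals avg because all activations are non-negative
lp64 = lp_pool(x, 2, 2, p=64)    # close to, but slightly below, max_pool(x, 2, 2)
mix = mixed_pool(x, 2, 2, w=0.5)
```

    For a window of N entries the Lp average never exceeds the window maximum and is at least N^(-1/p) times it, so the gap to max pooling closes as p increases.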
Softmax pooling is like max pooling, but uses softmax, i.e. $$\frac{\sum_{k'} e^{\beta x_{k'}} x_{k'}}{\sum_{k''} e^{\beta x_{k''}}}$$ where $\beta > 0$. Average pooling is the case of $\beta \downarrow 0$, and max pooling is the case of $\beta \uparrow \infty$. Local importance-based pooling generalizes softmax pooling by $$\frac{\sum_{k'} e^{g(x_{k'})} x_{k'}}{\sum_{k''} e^{g(x_{k''})}}$$ where $g$ is a learnable function.

    === Other poolings ===

    Spatial pyramidal pooling applies max pooling (or any other form of pooling) in a pyramid structure. That is, it applies global max pooling, then applies max pooling to the image divided into 4 equal parts, then 16, etc. The results are then concatenated. It is a hierarchical form of global pooling, and similar to global pooling, it is often used just before a classification head.

    Region of Interest pooling (also known as RoI pooling) is a variant of max pooling used in R-CNNs for object detection. It is designed to take an arbitrarily-sized input matrix, and output a fixed-sized output matrix.

    Covariance pooling computes the covariance matrix of the vectors $\{x_{k,l,0:C-1}\}_{k \in is:is+f-1,\, l \in js:js+f-1}$, which is then flattened to a $C^{2}$-dimensional vector $y_{i,j,0:C^{2}-1}$. Global covariance pooling is used similarly to global max pooling. As average pooling computes the average, which is a first-degree statistic, and covariance is a second-degree statistic, covariance pooling is also called "second-order pooling". It can be generalized to higher-order poolings.

    Blur pooling means applying a blurring method before downsampling.
For example, the Rect-2 blur pooling means taking an average pooling at $f = 2, s = 1$, then taking every second pixel (identity with $s = 2$).

    == Vision Transformer pooling ==

    In Vision Transformers (ViT), the following kinds of pooling are common. BERT-like pooling uses a dummy [CLS] token ("classification"). For classification, the output at [CLS] is the classification token, which is then processed by a LayerNorm-feedforward-softmax module into a probability distribution, which is the network's prediction of the class probabilities. This is the one used by the original ViT and Masked Autoencoder. Global average pooling (GAP) does not use the dummy token, but simply takes the average of all output tokens as the classification token. It was mentioned in the original ViT as being equally good. Multihead attention pooling (MAP) applies a multiheaded attention block to pooling. Specifically, it takes as input a list of vectors $x_1, x_2, \dots, x_n$, which might be thought of as the output vectors of a layer of a ViT. It then applies a feedforward layer $\mathrm{FFN}$ on each vector, resulting in a matrix $V = [\mathrm{FFN}(x_1), \dots, \mathrm{FFN}(x_n)]$. This is then sent to a multiheaded attention, resulting in MultiheadedA

  • Social media marketing

    Social media marketing is the use of social media platforms and websites to promote a product or service. Although the terms e-marketing and digital marketing are still dominant in academia, social media marketing is becoming more popular among practitioners and researchers. Social media platforms such as Facebook, LinkedIn, Instagram, and Twitter, among others, have built-in data analytics tools that companies can use to track the progress, success, and engagement of social media marketing campaigns. Companies address a range of stakeholders through social media marketing, including current and potential customers, current and potential employees, journalists, bloggers, and the general public. On a strategic level, social media marketing includes the management of a marketing campaign, governance, setting the scope (e.g. more active or passive use) and the establishment of a firm's desired social media "culture" and "tone". Firms that use social media marketing can allow customers and Internet users to post user-generated content (e.g., online comments and product reviews), also known as "earned media", rather than use marketer-prepared advertising copy.

    == Purposes and tactics ==

    Social media may be employed in marketing as a communications tool that makes companies accessible to those who are interested in their products and visible to those who are not familiar with them. It is used by companies to create buzz, learn from customers, and target them. Of the top 10 factors that correlate with a strong Google organic search ranking, seven are social media-dependent. This means that brands with little to no social media presence tend to show up less in Google searches.
While platforms such as Twitter, Facebook and (in the past) Google+ have a larger number of monthly users, the visual media-sharing-based mobile platforms garner a higher interaction rate in comparison, have registered the fastest growth, and have changed the ways in which consumers engage with brand content. Instagram has an interaction rate of 1.46% with an average of 130 million users monthly, as opposed to Twitter, which has a 0.03% interaction rate with an average of 210 million monthly users. Unlike traditional media that are often cost-prohibitive to many companies, a social media strategy does not require significant financial investment. To this end, companies make use of platforms such as Facebook, Twitter, YouTube, TikTok and Instagram to reach audiences much wider than through traditional print, television, or radio advertisements alone at a fraction of the cost, as most social networking sites can be used at little or no cost (however, some websites charge companies for premium services). This has changed the ways that companies approach and interact with customers, as a substantial percentage of consumer interactions are now being carried out over online platforms with much higher visibility. Customers can post reviews of products and services, rate customer service, and ask questions or voice concerns directly to companies through social media platforms. According to Measuring Success, over 80% of consumers use the web to research products and services. Thus social media marketing is also used by businesses in order to build relationships of trust with consumers. To this aim, companies may hire personnel specifically to handle these social media interactions, who usually work under the title of online community manager. Handling these interactions in a satisfactory manner can result in an increase of consumer trust.
To this aim, and to repair the public's perception of a company, three steps are taken to address consumer concerns: identifying the extent of the social chatter, engaging the influencers to help, and developing a proportional response. == Strategies == === Passive approach === Social media can be a useful source of market information and a way to hear customers' perspectives. Blogs, content communities, and forums are platforms where individuals share their reviews and recommendations of brands, products, and services. Businesses are able to tap into and analyze customer voices and feedback generated in social media for marketing purposes. In this sense, social media is a relatively inexpensive source of market intelligence which can be used by marketers and managers to track and respond to consumer-identified problems and detect market opportunities. === Active approach === Social media can be used as a public relations tool, a direct marketing tool, and a communication channel to target very specific audiences, with social media influencers and social media personalities as effective customer engagement tools. This tactic is widely known as influencer marketing, which gives brands the opportunity to reach their target audience via a group of selected influencers advertising their product or service. Brands were projected to spend up to $15 billion on influencer marketing by 2022, per Business Insider Intelligence estimates, based on Mediakix data. The use of customer influencers, such as popular bloggers, can be an efficient and cost-effective method to launch new products or services. == Engagement == Engagement with the social web means that customers and stakeholders are active participants rather than passive spectators. Examples include consumer advocacy groups and groups that criticize companies (e.g., lobby groups or advocacy organizations).
The use of Social media in a business or political context allows people to express and share opinions about a company's products, services or business practices, or a government's actions. On social media, each participant becomes part of the marketing department (or a challenge to the marketing effort) as other customers read their comments or reviews. The effectiveness of social media marketing campaigns is dependent on the promotion of online engagement. With the advent of social media marketing, it has become increasingly important to gain customer interest in products and services, which can eventually be translated into buying behavior, or voting and donating behavior in a political context. New online marketing concepts of engagement and loyalty have emerged which aim to build customer participation and brand reputation. Engagement in social media for the purpose of a social media strategy is divided into two parts. The first is proactive, regular posting of new online content, which can be seen through digital photos, digital videos, text, and conversations. It is also represented through sharing of content and information from others via weblinks. The second part is reactive conversations, with social media users responding to those who reach out to others' social media profiles through comments or messages. == Campaigns == === Local businesses === Small businesses use social networking sites as a promotional technique. Businesses can follow individuals' social media usage in their local area and advertise specials and deals, which can be exclusive and in the form of "get a free drink with a copy of this tweet". This type of message encourages other locals to follow the business on their official websites in order to obtain the promotional deal. The business's brand visibility is enhanced in the process. Social networking sites are also used by small businesses to develop their own market research on new products and services. 
By encouraging their customers to give feedback on new product ideas, businesses can gain insights on whether or not a product may be accepted by their target market enough to merit full production. In addition, customers will feel the company has engaged them in the process of co-creation, the process in which the business uses customer feedback to create or modify a product or service to fill a need of the target market. Such feedback can be presented in various forms, such as surveys, contests, and polls. Social networking sites such as LinkedIn also provide opportunities for small businesses to find candidates to fill staff positions. Review sites such as Yelp help small businesses build their reputation beyond brand visibility. Positive customer peer reviews help influence new prospects to purchase goods and services more than company advertising. == Benefits == Social media marketing allows companies to promote themselves to large, diverse audiences that could not be reached through traditional marketing such as phone and email-based advertising. Marketing on most social media platforms also comes at little to no cost, making it accessible to businesses of virtually any size. Social media marketing accommodates personalized and direct marketing that targets specific demographics and markets. Companies can engage with customers directly, allowing them to obtain feedback and resolve issues almost immediately. Another advantage of social media marketing is that it is an ideal environment for a company to conduct market research. It can be used

    Read more →
  • Data transformation (computing)

    Data transformation (computing)

    In computing, data transformation is the process of converting data from one format or structure into another format or structure. It is a fundamental aspect of most data integration and data management tasks such as data wrangling, data warehousing, data integration and application integration. Data transformation can be simple or complex based on the required changes to the data between the source (initial) data and the target (final) data. Data transformation is typically performed via a mixture of manual and automated steps. Tools and technologies used for data transformation can vary widely based on the format, structure, complexity, and volume of the data being transformed. A master data recast is another form of data transformation where the entire database of data values is transformed or recast without extracting the data from the database. All data in a well-designed database is directly or indirectly related to a limited set of master database tables by a network of foreign key constraints. Each foreign key constraint is dependent upon a unique database index from the parent database table. Therefore, when the proper master database table is recast with a different unique index, the directly and indirectly related data are also recast or restated. The directly and indirectly related data may also still be viewed in the original form since the original unique index still exists with the master data. Also, the database recast must be done in such a way as to not impact the applications architecture software. When the data mapping is indirect via a mediating data model, the process is also called data mediation. == Data transformation process == Data transformation can be divided into the following steps, each applicable as needed based on the complexity of the transformation required. 
The steps are data discovery, data mapping, code generation, code execution, and data review. These steps are often the focus of developers or technical data analysts who may use multiple specialized tools to perform their tasks. The steps can be described as follows: Data discovery is the first step in the data transformation process. Typically the data is profiled using profiling tools or sometimes using manually written profiling scripts to better understand the structure and characteristics of the data and decide how it needs to be transformed. Data mapping is the process of defining how individual fields are mapped, modified, joined, filtered, aggregated etc. to produce the final desired output. Developers or technical data analysts traditionally perform data mapping since they work in the specific technologies to define the transformation rules (e.g. visual ETL tools, transformation languages). Code generation is the process of generating executable code (e.g. SQL, Python, R, or other executable instructions) that will transform the data based on the desired and defined data mapping rules. Typically, the data transformation technologies generate this code based on the definitions or metadata defined by the developers. Code execution is the step whereby the generated code is executed against the data to create the desired output. The executed code may be tightly integrated into the transformation tool, or it may require separate steps by the developer to manually execute the generated code. Data review is the final step in the process, which focuses on ensuring the output data meets the transformation requirements. It is typically the business user or final end-user of the data that performs this step. Any anomalies or errors found in the data are communicated back to the developer or data analyst as new requirements to be implemented in the transformation process.
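The five steps can be illustrated with a minimal sketch. It is not taken from any particular transformation tool: the use of SQLite, the table names raw_orders and orders, the column names, and the mapping rules are all hypothetical choices made only to show how profiling, a declarative mapping, generated SQL, execution, and a review check fit together.

```python
import sqlite3

# Hypothetical source rows (name, amount as text, order date); illustrative only.
SOURCE_ROWS = [
    ("alice", "12.50", "2023-01-05"),
    ("bob", None, "2023-01-06"),
    ("carol", "7.25", "2023-01-07"),
]

# Data mapping: target column -> SQL expression over the source columns.
MAPPING = {
    "customer": "UPPER(name)",
    "amount_cents": "CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER)",
    "order_date": "order_date",
}


def discover(conn, table):
    """Data discovery: profile each column (row count and null count)."""
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    profile = {}
    for col in cols:
        nulls = conn.execute(
            f"SELECT COUNT(*) FROM {table} WHERE {col} IS NULL"
        ).fetchone()[0]
        profile[col] = {"rows": total, "nulls": nulls}
    return profile


def generate_sql(mapping, source, target, where):
    """Code generation: turn the mapping metadata into executable SQL."""
    select_list = ", ".join(f"{expr} AS {col}" for col, expr in mapping.items())
    return f"CREATE TABLE {target} AS SELECT {select_list} FROM {source} WHERE {where}"


def review(conn, target):
    """Data review: return rows that violate a simple expectation."""
    return conn.execute(
        f"SELECT * FROM {target} WHERE amount_cents IS NULL OR amount_cents < 0"
    ).fetchall()


def run_pipeline():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_orders (name TEXT, amount TEXT, order_date TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", SOURCE_ROWS)

    profile = discover(conn, "raw_orders")
    sql = generate_sql(MAPPING, "raw_orders", "orders", "amount IS NOT NULL")
    conn.execute(sql)  # code execution: run the generated statement
    rows = conn.execute("SELECT * FROM orders ORDER BY customer").fetchall()
    anomalies = review(conn, "orders")
    conn.close()
    return profile, sql, rows, anomalies
```

In this sketch the discovery step is what reveals the null amount, which in turn motivates the filter passed to the code generator. In a real tool the review step would normally be performed by a business user, not by an automated check.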
== Types of data transformation == === Batch data transformation === Traditionally, data transformation has been a bulk or batch process, whereby developers write code or implement transformation rules in a data integration tool, and then execute that code or those rules on large volumes of data. This process can follow the linear set of steps as described in the data transformation process above. Batch data transformation is the cornerstone of virtually all data integration technologies such as data warehousing, data migration and application integration. When data must be transformed and delivered with low latency, the term "microbatch" is often used. This refers to small batches of data (e.g. a small number of rows or a small set of data objects) that can be processed very quickly and delivered to the target system when needed. === Benefits of batch data transformation === Traditional data transformation processes have served companies well for decades. The various tools and technologies (data profiling, data visualization, data cleansing, data integration etc.) have matured and most (if not all) enterprises transform enormous volumes of data that feed internal and external applications, data warehouses and other data stores. === Limitations of traditional data transformation === This traditional process also has limitations that hamper its overall efficiency and effectiveness. The people who need to use the data (e.g. business users) do not play a direct role in the data transformation process. Typically, users hand over the data transformation task to developers who have the necessary coding or technical skills to define the transformations and execute them on the data. This process leaves the bulk of the work of defining the required transformations to the developer, who often does not have the same domain knowledge as the business user. The developer interprets the business user requirements and implements the related code/logic.
This has the potential of introducing errors into the process (through misinterpreted requirements), and also increases the time to arrive at a solution. This problem has given rise to the need for agility and self-service in data integration (i.e. empowering the user of the data and enabling them to transform the data themselves interactively). There are companies that provide self-service data transformation tools. They are aiming to efficiently analyze, map and transform large volumes of data without the technical knowledge and process complexity that currently exists. While these companies use traditional batch transformation, their tools enable more interactivity for users through visual platforms and easily repeated scripts. Still, there might be some compatibility issues (e.g. new data sources like IoT may not work correctly with older tools) and compliance limitations due to the difference in data governance, preparation and audit practices. === Interactive data transformation === Interactive data transformation (IDT) is an emerging capability that allows business analysts and business users the ability to directly interact with large datasets through a visual interface, understand the characteristics of the data (via automated data profiling or visualization), and change or correct the data through simple interactions such as clicking or selecting certain elements of the data. Although interactive data transformation follows the same data integration process steps as batch data integration, the key difference is that the steps are not necessarily followed in a linear fashion and typically don't require significant technical skills for completion. There are a number of companies that provide interactive data transformation tools, including Trifacta, Alteryx and Paxata. 
They are aiming to efficiently analyze, map and transform large volumes of data while at the same time abstracting away some of the technical complexity and processes which take place under the hood. Interactive data transformation solutions provide an integrated visual interface that combines the previously disparate steps of data analysis, data mapping and code generation/execution and data inspection. That is, if changes are made at one step (like for example renaming), the software automatically updates the preceding or following steps accordingly. Interfaces for interactive data transformation incorporate visualizations to show the user patterns and anomalies in the data so they can identify erroneous or outlying values. Once they've finished transforming the data, the system can generate executable code/logic, which can be executed or applied to subsequent similar data sets. By removing the developer from the process, interactive data transformation systems shorten the time needed to prepare and transform the data, eliminate costly errors in the interpretation of user requirements and empower business users and analysts to control their data and interact with it as needed. == Transformational languages == There are numerous languages available for performing data transformation. Many transformation languages require a grammar to be provided. In many cases, the grammar is structured using something closely resembling Backus–Naur form (BNF). There are numerous languages

    Read more →
  • Initialization vector

    Initialization vector

    In cryptography, an initialization vector (IV) or starting variable is an input to a cryptographic primitive being used to provide the initial state. The IV is typically required to be random or pseudorandom, but sometimes an IV only needs to be unpredictable or unique. Randomization is crucial for some encryption schemes to achieve semantic security, a property whereby repeated usage of the scheme under the same key does not allow an attacker to infer relationships between (potentially similar) segments of the encrypted message. For block ciphers, the use of an IV is described by the modes of operation. Some cryptographic primitives require the IV only to be non-repeating, and the required randomness is derived internally. In this case, the IV is commonly called a nonce (a number used only once), and the primitives (e.g. CBC) are considered stateful rather than randomized. This is because an IV need not be explicitly forwarded to a recipient but may be derived from a common state updated at both sender and receiver side. (In practice, a short nonce is still transmitted along with the message to consider message loss.) An example of stateful encryption schemes is the counter mode of operation, which has a sequence number for a nonce. The IV size depends on the cryptographic primitive used; for block ciphers it is generally the cipher's block-size. In encryption schemes, the unpredictable part of the IV has at best the same size as the key to compensate for time/memory/data tradeoff attacks. When the IV is chosen at random, the probability of collisions due to the birthday problem must be taken into account. Traditional stream ciphers such as RC4 do not support an explicit IV as input, and a custom solution for incorporating an IV into the cipher's key or internal state is needed. Some designs realized in practice are known to be insecure; the WEP protocol is a notable example, and is prone to related-IV attacks. 
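The remark about the birthday problem can be made concrete with a small calculation. The sketch below uses the standard birthday approximation p ≈ 1 − exp(−n(n−1)/(2N)) for n IVs drawn uniformly at random from N = 2^b possible values; the function names are illustrative, and the uniform-random assumption is a simplification that real systems may not satisfy.

```python
import math


def iv_collision_probability(num_messages: int, iv_bits: int) -> float:
    """Approximate probability that at least two of num_messages uniformly
    random iv_bits-bit IVs coincide (birthday approximation)."""
    space = 2.0 ** iv_bits
    exponent = -num_messages * (num_messages - 1) / (2.0 * space)
    # 1 - exp(exponent), computed with expm1 so tiny probabilities keep precision.
    return -math.expm1(exponent)


def messages_for_probability(p: float, iv_bits: int) -> float:
    """Approximate number of random IVs after which the collision
    probability reaches p (inverse of the approximation above)."""
    space = 2.0 ** iv_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


if __name__ == "__main__":
    # A 24-bit IV space compared with a 128-bit one.
    print(iv_collision_probability(5000, 24))
    print(messages_for_probability(0.5, 24))
    print(iv_collision_probability(2 ** 32, 128))
```

Under these assumptions, a 24-bit IV space reaches an even chance of a repeat after only a few thousand messages, whereas a 128-bit random IV keeps the probability negligible even after billions of messages.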
== Motivation == A block cipher is one of the most basic primitives in cryptography, and is frequently used for data encryption. However, by itself, it can only be used to encode a data block of a predefined size, called the block size. For example, a single invocation of the AES algorithm transforms a 128-bit plaintext block into a ciphertext block of 128 bits in size. The key, which is given as one input to the cipher, defines the mapping between plaintext and ciphertext. If data of arbitrary length is to be encrypted, a simple strategy is to split the data into blocks each matching the cipher's block size, and encrypt each block separately using the same key. This method is not secure as equal plaintext blocks get transformed into equal ciphertexts, and a third party observing the encrypted data may easily determine its content even when not knowing the encryption key. To hide patterns in encrypted data while avoiding the re-issuing of a new key after each block cipher invocation, a method is needed to randomize the input data. In 1980, the NIST published a national standard document designated Federal Information Processing Standard (FIPS) PUB 81, which specified four so-called block cipher modes of operation, each describing a different solution for encrypting a set of input blocks. The first mode implements the simple strategy described above, and was specified as the electronic codebook (ECB) mode. In contrast, each of the other modes describes a process where ciphertext from one block encryption step gets intermixed with the data from the next encryption step. To initiate this process, an additional input value is required to be mixed with the first block; this value is referred to as an initialization vector. For example, the cipher-block chaining (CBC) mode requires an unpredictable value, of size equal to the cipher's block size, as additional input. This unpredictable value is added to the first plaintext block before subsequent encryption.
In turn, the ciphertext produced in the first encryption step is added to the second plaintext block, and so on. The ultimate goal for encryption schemes is to provide semantic security: by this property, it is practically impossible for an attacker to draw any knowledge from observed ciphertext. It can be shown that each of the three additional modes specified by the NIST is semantically secure under so-called chosen-plaintext attacks. == Properties == Properties of an IV depend on the cryptographic scheme used. A basic requirement is uniqueness, which means that no IV may be reused under the same key. For block ciphers, repeated IV values devolve the encryption scheme into electronic codebook mode: equal IV and equal plaintext result in equal ciphertext. In stream cipher encryption, uniqueness is crucially important as plaintext may be trivially recovered otherwise. Example: Stream ciphers encrypt plaintext P to ciphertext C by deriving a key stream K from a given key and IV and computing C as C = P xor K. Assume that an attacker has observed two messages C1 and C2 both encrypted with the same key and IV. Then knowledge of either P1 or P2 reveals the other plaintext since C1 xor C2 = (P1 xor K) xor (P2 xor K) = P1 xor P2. Many schemes require the IV to be unpredictable by an adversary. This is effected by selecting the IV at random or pseudo-randomly. In such schemes, the chance of a duplicate IV is negligible, but the effect of the birthday problem must be considered. As with the uniqueness requirement, a predictable IV may allow recovery of (partial) plaintext. Example: Consider a scenario where a legitimate party called Alice encrypts messages using the cipher-block chaining mode. Consider further that there is an adversary called Eve that can observe these encryptions and is able to forward plaintext messages to Alice for encryption (in other words, Eve is capable of a chosen-plaintext attack).
Now assume that Alice has sent a message consisting of an initialization vector IV1 and starting with a ciphertext block CAlice. Let further PAlice denote the first plaintext block of Alice's message, let E denote encryption, and let PEve be Eve's guess for the first plaintext block. Now, if Eve can determine the initialization vector IV2 of the next message she will be able to test her guess by forwarding a plaintext message to Alice starting with (IV2 xor IV1 xor PEve); if her guess was correct this plaintext block will get encrypted to CAlice by Alice. This is because of the following simple observation: CAlice = E(IV1 xor PAlice) = E(IV2 xor (IV2 xor IV1 xor PAlice)). Depending on whether the IV for a cryptographic scheme must be random or only unique the scheme is either called randomized or stateful. While randomized schemes always require the IV chosen by a sender to be forwarded to receivers, stateful schemes allow sender and receiver to share a common IV state, which is updated in a predefined way at both sides. == Block ciphers == Block cipher processing of data is usually described as a mode of operation. Modes are primarily defined for encryption as well as authentication, though newer designs exist that combine both security solutions in so-called authenticated encryption modes. While encryption and authenticated encryption modes usually take an IV matching the cipher's block size, authentication modes are commonly realized as deterministic algorithms, and the IV is set to zero or some other fixed value. == Stream ciphers == In stream ciphers, IVs are loaded into the keyed internal secret state of the cipher, after which a number of cipher rounds are executed prior to releasing the first bit of output. 
For performance reasons, designers of stream ciphers try to keep that number of rounds as small as possible. However, determining the minimal secure number of rounds for stream ciphers is not a trivial task, and other issues, such as entropy loss, are unique to each cipher construction. As a result, related-IV and other IV-related attacks are a known security issue for stream ciphers, which makes IV loading in stream ciphers a serious concern and a subject of ongoing research. == WEP IV == The 802.11 encryption algorithm called WEP (short for Wired Equivalent Privacy) used a short, 24-bit IV, leading to reused IVs with the same key, which led to it being easily cracked. Packet injection allowed for WEP to be cracked in times as short as several seconds. This ultimately led to the deprecation of WEP. == SSL 2.0 IV == In cipher-block chaining mode (CBC mode), the IV need not be secret, but must be unpredictable at encryption time (in particular, for any given plaintext, it must not be possible to predict the IV that will be associated with the plaintext in advance of the generation of the IV). Additionally, for the output feedback mode (OFB mode), the IV must be unique. In particular, the (previously) common practice of re-using the last ciphertext block of a message as the IV for the next message is insecure (for example, this method was used by SSL 2.0). If an attacker knows

    Read more →
  • Enterprise mobile application

    Enterprise mobile application

    The term enterprise mobile application is used in the context of mobile apps created or bought by individual organizations for their workers to carry out the functions required to run the organization. Enterprise mobile application development is the process of building a mobile application for the requirements of an enterprise. An enterprise mobile application belonging to an organization is expected to be used only by the workers of that organization. The definition of enterprise mobile application does not include the mobile apps that an organization creates for its customers or for consumers of the products or services generated by the organization. == Example == An organization, whether for-profit or non-profit, may create a mobile app for its members to track inventory levels of supplies they distribute to their target communities or materials used in product manufacturing. Such a mobile app comes under the definition of enterprise mobile application. However, the same organization may also create another mobile app to sell its products to end users or spread awareness of its services to various communities, and that mobile app would not come under the definition of enterprise mobile application. == Enterprise mobile solution providers == Enterprise mobile solution providers create and develop apps that individual organizations can buy instead of creating the apps themselves. Reasons for organizations buying the apps include time and cost savings and technical expertise. Enterprise mobility now plays a key role in enterprise transformation. Enterprises need to improve productivity quickly, and enterprise mobility solutions help business owners develop their operations progressively.

    Read more →
  • Semi-Automatic Ground Environment

    Semi-Automatic Ground Environment

    The Semi-Automatic Ground Environment (SAGE) was a system of large computers and associated networking equipment that coordinated data from many radar sites and processed it to produce a single unified image of the airspace over a wide area. SAGE directed and controlled the NORAD response to a possible Soviet air attack, operating in this role from the late 1950s into the 1980s. The processing power behind SAGE was supplied by the largest discrete component-based computer ever built, the AN/FSQ-7, manufactured by IBM. Each SAGE Direction Center (DC) housed an FSQ-7 which occupied an entire floor, approximately 22,000 square feet (2,000 m²) not including supporting equipment. The FSQ-7 was actually two computers, "A" side and "B" side. Computer processing was switched from "A" side to "B" side on a regular basis, allowing maintenance on the unused side. Information was fed to the DCs from a network of radar stations as well as readiness information from various defense sites. The computers, based on the raw radar data, developed "tracks" for the reported targets, and automatically calculated which defenses were within range. Operators used light guns to select targets on-screen for further information, select one of the available defenses, and issue commands to attack. These commands would then be automatically sent to the defense site via teleprinter. Connecting the various sites was an enormous network of telephones, modems and teleprinters. Later additions to the system allowed SAGE's tracking data to be sent directly to CIM-10 Bomarc missiles and some of the US Air Force's interceptor aircraft in-flight, directly updating their autopilots to maintain an intercept course without operator intervention. Each DC also forwarded data to a Combat Center (CC) for "supervision of the several sectors within the division" ("each combat center [had] the capability to coordinate defense for the whole nation").
SAGE became operational in the late 1950s and early 1960s at an estimated total cost between 8 and 12 billion dollars, four times the cost of the Manhattan Project. Throughout its development, there were continual concerns about its real ability to deal with large attacks, and the Operation Sky Shield tests showed that only about one-fourth of enemy bombers would have been intercepted. Nevertheless, SAGE was the backbone of NORAD's air defense system into the 1980s, by which time the tube-based FSQ-7s were increasingly costly to maintain and completely outdated. Today the same command and control task is carried out by microcomputers, based on the same basic underlying data. == Background == === Earlier systems === Just prior to World War II, Royal Air Force (RAF) tests with the new Chain Home (CH) radars had demonstrated that relaying information to the fighter aircraft directly from the radar sites was not feasible. The radars determined the map coordinates of the enemy, but could generally not see the fighters at the same time. This meant the fighters had to be able to determine where to fly to perform an interception but were often unaware of their own exact location and unable to calculate an interception while also flying their aircraft. The solution was to send all of the radar information to a central control station where operators collated the reports into single tracks, and then reported these tracks to the airbases, or sectors. The sectors used additional systems to track their own aircraft, plotting both on a single large map. Operators viewing the map could then see what direction their fighters would have to fly to approach their targets and relay that simply by telling them to fly along a certain heading or vector. This Dowding system was the first ground-controlled interception (GCI) system of large scale, covering the entirety of the UK. 
It proved enormously successful during the Battle of Britain, and is credited as being a key part of the RAF's success. The system was slow, often providing information that was up to five minutes out of date. Against propeller driven bombers flying at perhaps 225 miles per hour (362 km/h) this was not a serious concern, but it was clear the system would be of little use against jet-powered bombers flying at perhaps 600 miles per hour (970 km/h). The system was extremely expensive in manpower terms, requiring hundreds of telephone operators, plotters and trackers in addition to the radar operators. This was a serious drain on manpower, making it difficult to expand the network. The idea of using a computer to handle the task of taking reports and developing tracks had been explored beginning late in the war. By 1944, analog computers had been installed at the CH stations to automatically convert radar readings into map locations, eliminating two people. Meanwhile, the Royal Navy began experimenting with the Comprehensive Display System (CDS), another analog computer that took X and Y locations from a map and automatically generated tracks from repeated inputs. Similar systems began development with the Royal Canadian Navy, DATAR, and the US Navy, the Naval Tactical Data System (NTDS). A similar system was also specified for the Nike SAM project, specifically referring to a US version of CDS, coordinating the defense over a battle area so that multiple batteries did not fire on a single target. All of these systems were relatively small in geographic scale, generally tracking within a city-sized area. === Valley Committee === When the Soviet Union tested its first atomic bomb in August 1949, the topic of air defense of the US became important for the first time. A study group, the "Air Defense Systems Engineering Committee", was set up under the direction of Dr. George Valley to consider the problem and is known to history as the "Valley Committee". 
Their December report noted a key problem in air defense using ground-based radars. A bomber approaching a radar station would detect the signals from the radar long before the reflection off the bomber was strong enough to be detected by the station. The committee suggested that when this occurred, the bomber would descend to low altitude, thereby greatly limiting the radar horizon, allowing the bomber to fly past the station undetected. Although flying at low altitude greatly increased fuel consumption, the team calculated that the bomber would only need to do this for about 10% of its flight, making the fuel penalty acceptable. The only solution to this problem was to build a huge number of stations with overlapping coverage. At that point the problem became one of managing the information. Manual plotting was ruled out as too slow, and a computerized solution was the only possibility. To handle this task, the computer would need to be fed information directly, eliminating any manual translation by phone operators, and it would have to be able to analyze that information and automatically develop tracks. A system tasked with defending cities against the predicted future Soviet bomber fleet would have to be dramatically more powerful than the models used in the NTDS or DATAR. The Committee then had to consider whether or not such a computer was possible. The Valley Committee was introduced to Jerome Wiesner, associate director of the Research Laboratory of Electronics at MIT. Wiesner noted that the Servomechanisms Laboratory had already begun development of a machine that might be fast enough. This was the Whirlwind I, originally developed for the Office of Naval Research as a general purpose flight simulator that could simulate any current or future aircraft by changing its software. Wiesner introduced the Valley Committee to Whirlwind's project lead, Jay Forrester, who convinced him that Whirlwind was sufficiently capable. 
In September 1950, an early microwave early-warning radar system at Hanscom Field was connected to Whirlwind using a custom interface developed by Forrester's team. An aircraft was flown past the site, and the system digitized the radar information and successfully sent it to Whirlwind. With this demonstration, the technical concept was proven. Forrester was invited to join the committee. === Project Charles === With this successful demonstration, Louis Ridenour, chief scientist of the Air Force, wrote a memo stating "It is now apparent that the experimental work necessary to develop, test, and evaluate the systems proposals made by ADSEC will require a substantial amount of laboratory and field effort." Ridenour approached MIT President James Killian with the aim of beginning a development lab similar to the war-era Radiation Laboratory that made enormous progress in radar technology. Killian was initially uninterested, desiring to return the school to its peacetime civilian charter. Ridenour eventually convinced Killian the idea was sound by describing the way the lab would lead to the development of a local electronics industry based on the needs of the lab and the students who would leave the lab to start their

    Read more →
  • Personal web page

    Personal web page

    Personal web pages are World Wide Web pages created by an individual to contain content of a personal nature rather than content pertaining to a company, organization or institution. Personal web pages are primarily used for informative or entertainment purposes but can also be used for personal career marketing (by containing a list of the individual's skills, experience and a CV), social networking with other people with shared interests, or as a space for personal expression. These terms do not usually refer to just a single "page" or HTML file, but to a website—a collection of webpages and related files under a common URL or Web address. In strictly technical terms, a site's actual home page (index page) often only contains sparse content with some catchy introductory material and serves mostly as a pointer or table of contents to the more content-rich pages inside, such as résumés, family, hobbies, family genealogy, a web log/diary ("blog"), opinions, online journals and diaries or other writing, examples of written work, digital audio sound clips, digital video clips, digital photos, or information about a user's other interests. Many personal pages only include information of interest to friends and family of the author. However, some webpages set up by hobbyists or enthusiasts of certain subject areas can be valuable topical web directories. == History == In the 1990s, most Internet service providers (ISPs) provided a free small personal, user-created webpage along with free Usenet News service. These were all considered part of full Internet service. Also several free web hosting services such as GeoCities provided free web space for personal web pages. These free web hosting services would typically include web-based site management and a few pre-configured scripts to easily integrate an input form or guestbook script into the user's site. 
Early personal web pages were often called "home pages" and were intended to be set as a default page in a web browser's preferences, usually by their owner. These pages would often contain links, to-do lists, and other information their author found useful. In the days when search engines were in their infancy, these pages (and the links they contained) could be an important resource in navigating the web. Since the early 2000s, the rise of blogging and the development of user friendly web page designing software made it easier for amateur users who did not have computer programming or website designer training to create personal web pages. Some website design websites provided free ready-made blogging scripts, where all the user had to do was input their content into a template. At the same time, a personal web presence became easier with the increased popularity of social networking services, some with blogging platforms such as LiveJournal and Blogger. These websites provided an attractive and easy-to-use content management system for regular users. Most of the early personal websites were Web 1.0 style, in which a static display of text and images or photos was displayed to individuals who came to the page. About the only interaction that was possible on these early websites was signing the virtual "guestbook". With the collapse of the dot-com bubble in the late 1990s, the ISP industry consolidated, and the focus of web hosting services shifted away from the surviving ISP companies to independent Internet hosting services and to ones with other affiliations. For example, many university departments provided personal pages for professors and television broadcasters provided them for their on-air personalities. These free webpages served as a perquisite ("perk") for staff, while at the same time boosting the Web visibility of the parent organization. 
Web hosting companies either charge a monthly fee, or provide service that is "free" (advertising based) for personal web pages. These are priced or limited according to the total size of all files in bytes on the host's hard drive, or by bandwidth, (traffic), or by some combination of both. For those customers who continue to use their ISP for these services, national ISPs commonly continue to provide both disk space and help including ready-made drop-in scripts. With the rise of Web 2.0-style websites, both professional websites and user-created, amateur websites tended to contain interactive features, such as "clickable" links to online newspaper articles or favourite websites, the option to comment on content displayed on the website, the option to "tag" images, videos or links on the site, the option of "clicking" on an image to enlarge it or find out more information, the option of user participation for website guests to evaluate or review the pages, or even the option to create new user-generated content for others to see. A key difference between Web 1.0 personal webpages and Web 2.0 personal pages was while the former tended to be created by hackers, computer programmers and computer hobbyists, the latter were created by a much wider variety of users, including individuals whose main interests lay in hobbies or topics outside of computers (e.g., indie music fans, political activists, and social entrepreneurs). == Motivations == In a study done by Zinkhan, participants had four main reasons to create personal web pages. First, people use personal web pages as a portrayal of self, in a sense marketing themselves, since creators have the freedom to portray their own identities. Second, personal web pages are a way to interact with people who have similar interests as the creator, possible employers, or colleagues. 
Third, personal web pages can help creators gain social acceptance from groups they are interested in, depending on the information they reveal about themselves. Fourth, personal web pages can give creators a sense of connection to the world, since these pages are public and offer a way to introduce oneself to people around the globe. People may maintain personal web pages to showcase their professional or creative skills, or to promote their business, charity or band. The use of personal web pages to display an individual's professional life has become more common in the 21st century. A study by Mary Madden, a researcher on privacy and technology, found that a tenth of American jobs require personal web pages that advertise an individual online. Employers have come to use personal web pages as a source of first impressions of prospective employees. A personal web page can also be used to express opinions on issues ranging from news and politics to movies. Others may use their personal web page as a communication method. For example, an aspiring artist might give out business cards with their personal web page, and invite people to visit their page and see their artwork, "like" their page or sign their guestbook. A personal web page generally gives its owner more control over their presence in search results and over how they are viewed online. It also allows more freedom in types and quantity of content than a social network profile offers, and can link various social media profiles with each other. It can be used to correct the record, or to clear up potential confusion between the owner and someone with the same name. In the 2010s, some amateur writers, bands and filmmakers released digital versions of their stories, songs and short films online, with the aim of gaining an audience and becoming better known. 
While the huge number of aspiring artists posting their work online makes it unlikely for individuals and groups to become popular via the Internet, there are a small number of YouTube stars who were unknown until their online performances garnered them a huge audience. == Sites of academics == Academic professionals (especially at the college and university level), including professors and researchers, are often given online space for creating and storing personal web documents, including personal web pages, CVs and a list of their books, academic papers and conference presentations, on the websites of their employers. This goes back to the early decade of the World Wide Web and its original purpose of providing a quick and easy way for academics to share research papers and data. Researchers may have a personal website to share more information about themselves, about their academic activities and for sharing (unpublished) results of their research. This has been noted as part of the success of open-access repositories such as arXiv.

    Read more →
  • Short Weather Cipher

    Short Weather Cipher

    The Short Weather Cipher (German: Wetterkurzschlüssel, abbreviated WKS), also known as the weather short signal book, was a cipher, presented as a codebook, that was used by the radio telegraphists aboard U-boats of the German Navy (Kriegsmarine) during World War II. It was used to condense weather reports into a short 7-letter message, which was enciphered by using the naval Enigma and transmitted by radiomen to intercept stations on shore, where it was deciphered by Enigma and the 7-letter weather report was reconstructed. == History == During World War II, during various times, different versions of the cipher were in operation. The first issue carried the codename Weimar. It was replaced by the edition Eisenach on 20 January 1942. On 10 March 1943, the third edition of the weather key, bearing the codename Naumburg, entered into force. On May 9, 1941, during Operation Primrose, the operation to occupy Åndalsnes and create a diversion south of Trondheim in Norway as part of the Norwegian Campaign, an intact Naval Enigma (M3) cipher machine, a copy of the "Weimar" version of the short weather cipher and a copy of the short signal book (German: Kurzsignalbuch or Kurzsignale for short) was recovered from the submarine U-110, that was captured in the North Atlantic east of Cape Farewell, Greenland. This enabled the cryptanalysts in Bletchley Park to break the encryption of the M3 and to decipher the German submarine radio messages. The Short Weather Cipher was critical in the cryptanalysis of the Naval Enigma M4 and yielded excellent cribs. On 30 October 1942, a copy of the Wetterkurzschlüssel, the short weather cipher, and of the short signal book, the Kurzsignale, were recovered as part of a daring raid on the U-boat U-559, when three Royal Navy sailors, Lieutenant Anthony Fasson, Able Seaman Colin Grazier and NAAFI canteen assistant Tommy Brown, then boarded the abandoned submarine, and recovered the documents after a 90-minute search. 
They reached the Government Code and Cypher School at Bletchley Park after a three-week delay, on 24 November 1942. The documents, which cost the lives of Fasson and Grazier, proved to be particularly important in breaking the Naval Enigma M4. The version of the short weather cipher recovered was the Eisenach version. Unlike the first version, Weimar, the Eisenach edition did not list the 26 rotor positions, each indicated by a letter, to be used in enciphering weather reports. Thus, Hut 8 cryptanalysts thought that all four rotors were used to encipher weather reports. Testing on the bombes began to surface weather kisses (identical messages in two cryptosystems). On 13 December 1942, a crib obtained using the Short Weather Cipher gave a key with the Naval Enigma M4 rotatable Umkehrwalze (reversing roller or reflector) in the neutral position, making it equivalent to a standard Enigma and thus making B-Dienst messages potentially breakable on existing bombes. Hut 8 learned that the 4-letter indicators for regular U-boat messages were the same as the 3-letter indicators for weather messages on the same day, except for one extra letter. This meant that once the key was found for a weather message on any day, the fourth rotor had to be tested in only 26 positions to find the full 4-letter key. By the end of the day on Sunday 13 December, Rodger Winn of the Submarine Tracking Room at Bletchley Park knew that the Shark Enigma cipher was broken. When the third edition of the Short Weather Cipher was introduced on 10 March 1943, Hut 8 was immediately deprived of cribs. However, by 19 March, Hut 8 personnel were again obtaining cribs, this time from short signal sighting reports. These were reports that U-boats sent on making contact, encoded with the Kurzsignalheft code book. Hut 8 managed to solve Shark for 90 out of 112 days before the end of June. Kurzsignalheft short sighting reports also used M4 in M3 mode. 
By the end of June, four-rotor bombes had entered service at Bletchley Park, and by August had been introduced by the US Navy. From September onwards, Shark was generally solved within 24 hours. == Operation == The U-boat encoded weather reports using the Short Weather Cipher before enciphering them on the Naval Enigma. The shore patrol of the Kriegsmarine deciphered the message and decoded it, then forwarded it to a central meteorological station, which rebroadcast the data as ship synoptics after enciphering it with additive tables, using a cipher that Hut 8 personnel called Germet 3. The short weather cipher coded weather reports using a polyphonic single-letter code with X missing. For temperature, the code ran: A = +28°, B = +27°, C = +26°, D = +25°, ..., W = +6°, Y = +5°, Z = +4°, A = +3°, B = +2°, C = +1°, D = 0°, E = −1°, F = −2°, ..., Z = −21°. In a similar way, water temperature, atmospheric pressure, humidity, wind direction, wind velocity, visibility, degree of cloudiness, geographic latitude, and geographic longitude had to be coded in a prescribed order, so that the weather report consisted of a single short word. Based on approximate knowledge of the submarine's position, the Kriegsmarine telegraphist who received the message could translate the letter "S", which according to the above table could mean 10 °C or −15 °C, back to the correct temperature. 
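The polyphonic temperature code described above can be sketched in a few lines of Python. This is only an illustration built from the table in the text: the function names and the `expected` parameter are hypothetical, and the real codebook covered many more quantities than temperature.

```python
# 25-letter alphabet of the temperature code: the letter X is not used.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWYZ"


def encode_temperature(celsius: int) -> str:
    """Map a temperature in whole degrees Celsius (+28 .. -21) to one letter."""
    if not -21 <= celsius <= 28:
        raise ValueError("temperature outside the range of the table")
    # A = +28, B = +27, ..., Z = +4, then the alphabet repeats: A = +3, ...
    return ALPHABET[(28 - celsius) % len(ALPHABET)]


def decode_temperature(letter: str) -> tuple[int, int]:
    """Return both temperatures that a letter can stand for (warm, cold)."""
    i = ALPHABET.index(letter)
    return 28 - i, 3 - i


def resolve_temperature(letter: str, expected: int) -> int:
    """Pick the candidate closest to a plausible value (hypothetical helper).

    The receiving telegraphist did this from the submarine's rough position;
    here that knowledge is modelled as an expected temperature.
    """
    return min(decode_temperature(letter), key=lambda c: abs(c - expected))


print(encode_temperature(10), encode_temperature(-15))  # both give the same letter
print(decode_temperature("S"))
print(resolve_temperature("S", expected=-10))
```

Because each letter stands for two temperatures 25 degrees apart, the code is not uniquely decodable on its own; the ambiguity is resolved only by context, which is what makes it "polyphonic".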
Similarly, the direction and the type of swell was also coded with only a single letter:

-----------------------------------------------------
Direction from which |        Type of swell
the swell comes      | low | middle high | high |
-----------------------------------------------------
N                    |  a  |      i      |  q   |
NE                   |  b  |      j      |  r   |
E                    |  c  |      k      |  s   |
SE                   |  d  |      l      |  t   |
S                    |  e  |      m      |  u   |
SW                   |  f  |      n      |  v   |
W                    |  g  |      o      |  w   |
NW                   |  h  |      p      |  x   |
No swelling          |     |             |      | y
Intermittent         |     |             |      | z

As an example of the cipher, a weather report for 68° North latitude, 20° West longitude (north of Iceland) with atmospheric pressure 972 millibars, temperature minus 5 °C, wind northwest Force 6 (on the Beaufort scale), 3/10 cirrus cloud cover, visibility 5 nautical miles, would be coded as MZNFPED.

== Publications ==
Bauer, Arthur O. (1997), Funkpeilung als alliierte Waffe gegen deutsche U-Boote 1939–1945 [Direction finding as Allied weapon against German submarines from 1939 to 1945] (in German), Diemen, NL: Selbstverlag, ISBN 978-3-00-002142-8
Bauer, Friedrich L. (2007), Decrypted Secrets. Methods and Maxims of Cryptology (4., rev. and extended ed.), Berlin Heidelberg New York: Springer, ISBN 978-3-540-24502-5
Pfeiffer, Paul N. (October 1998), "Breaking the German Weather Ciphers in the Mediterranean Detachment, 849th Signal Intelligence Service", Cryptologia, 22 (4): 354–369, doi:10.1080/0161-119891886975, ISSN 0161-1194
Ulbricht, Heinz (2005), Die Chiffriermaschine Enigma – Trügerische Sicherheit. Ein Beitrag zur Geschichte der Nachrichtendienste [The Enigma cipher machine – Deceptive security. A contribution to the history of the intelligence services], Dissertation, Fachbereich Mathematik und Informatik, Technische Universität Braunschweig (in German)

    Read more →
  • Feature hashing

    Feature hashing

    In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features, i.e. turning arbitrary features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values as indices directly (after a modulo operation), rather than looking the indices up in an associative array. In addition to its use for encoding non-numeric values, feature hashing can also be used for dimensionality reduction. This trick is often attributed to Weinberger et al. (2009), but there exists a much earlier description of this method published by John Moody in 1989. == Motivation == === Motivating example === In a typical document classification task, the input to the machine learning algorithm (both during learning and classification) is free text. From this, a bag of words (BOW) representation is constructed: the individual tokens are extracted and counted, and each distinct token in the training set defines a feature (independent variable) of each of the documents in both the training and test sets. Machine learning algorithms, however, are typically defined in terms of numerical vectors. Therefore, the bags of words for a set of documents is regarded as a term-document matrix where each row is a single document, and each column is a single feature/word; the entry i, j in such a matrix captures the frequency (or weight) of the j'th term of the vocabulary in document i. (An alternative convention swaps the rows and columns of the matrix, but this difference is immaterial.) Typically, these vectors are extremely sparse—according to Zipf's law. The common approach is to construct, at learning time or prior to that, a dictionary representation of the vocabulary of the training set, and use that to map words to indices. Hash tables and tries are common candidates for dictionary implementation. E.g., the three documents John likes to watch movies. 
Mary likes movies too. John also likes football. can be converted, using the dictionary, to the term-document matrix $$\begin{pmatrix}\textrm{John}&\textrm{likes}&\textrm{to}&\textrm{watch}&\textrm{movies}&\textrm{Mary}&\textrm{too}&\textrm{also}&\textrm{football}\\1&1&1&1&1&0&0&0&0\\0&1&0&0&1&1&1&0&0\\1&1&0&0&0&0&0&1&1\end{pmatrix}$$ (Punctuation was removed, as is usual in document classification and clustering.) The problem with this process is that such dictionaries take up a large amount of storage space and grow in size as the training set grows. On the contrary, if the vocabulary is kept fixed and not increased with a growing training set, an adversary may try to invent new words or misspellings that are not in the stored vocabulary so as to circumvent a machine learned filter. To address this challenge, Yahoo! Research attempted to use feature hashing for their spam filters. Note that the hashing trick isn't limited to text classification and similar tasks at the document level, but can be applied to any problem that involves large (perhaps unbounded) numbers of features. === Mathematical motivation === Mathematically, a token is an element $t$ in a finite (or countably infinite) set $T$. Suppose we only need to process a finite corpus, then we can put all tokens appearing in the corpus into $T$, meaning that $T$ is finite. However, suppose we want to process all possible words made of the English letters, then $T$ is countably infinite. Most neural networks can only operate on real vector inputs, so we must construct a "dictionary" function $\phi : T \to \mathbb{R}^n$. 
When $T$ is finite, of size $|T| = m \leq n$, then we can use one-hot encoding to map it into $\mathbb{R}^n$. First, arbitrarily enumerate $T = \{t_1, t_2, \dots, t_m\}$, then define $\phi(t_i) = e_i$. In other words, we assign a unique index $i$ to each token, then map the token with index $i$ to the unit basis vector $e_i$. One-hot encoding is easy to interpret, but it requires one to maintain the arbitrary enumeration of $T$. Given a token $t \in T$, to compute $\phi(t)$, we must find out the index $i$ of the token $t$. Thus, to implement $\phi$ efficiently, we need a fast-to-compute bijection $h : T \to \{1, \dots, m\}$, then we have $\phi(t) = e_{h(t)}$. In fact, we can relax the requirement slightly: It suffices to have a fast-to-compute injection $h : T \to \{1, \dots, n\}$, then use $\phi(t) = e_{h(t)}$. In practice, there is no simple way to construct an efficient injection $h : T \to \{1, \dots, n\}$. However, we do not need a strict injection, but only an approximate injection. That is, when $t \neq t'$, we should probably have $h(t) \neq h(t')$, so that probably $\phi(t) \neq \phi(t')$. At this point, we have just specified that $h$ should be a hashing function. Thus we reach the idea of feature hashing. == Algorithms == === Feature hashing (Weinberger et al. 2009) === The basic feature hashing algorithm presented in (Weinberger et al. 2009) is defined as follows. 
First, one specifies two hash functions: the kernel hash $h : T \to \{1, 2, \dots, n\}$, and the sign hash $\zeta : T \to \{-1, +1\}$. Next, one defines the feature hashing function: $$\phi : T \to \mathbb{R}^n, \quad \phi(t) = \zeta(t)\, e_{h(t)}$$ Finally, extend this feature hashing function to strings of tokens by $$\phi : T^{*} \to \mathbb{R}^n, \quad \phi(t_1, \dots, t_k) = \sum_{j=1}^{k} \phi(t_j)$$ where $T^{*}$ is the set of all finite strings consisting of tokens in $T$. Equivalently, $$\phi(t_1, \dots, t_k) = \sum_{j=1}^{k} \zeta(t_j)\, e_{h(t_j)} = \sum_{i=1}^{n} \left( \sum_{j : h(t_j) = i} \zeta(t_j) \right) e_i$$ ==== Geometric properties ==== We want to say something about the geometric properties of $\phi$, but $T$ by itself is just a set of tokens: we cannot impose a geometric structure on it except the discrete topology, which is generated by the discrete metric. To make it nicer, we lift it to $T \to \mathbb{R}^T$, and lift $\phi$ from $\phi : T \to \mathbb{R}^n$ to $\phi : \mathbb{R}^T \to \mathbb{R}^n$ by linear extension: $$\phi((x_t)_{t \in T}) = \sum_{t \in T} x_t\, \zeta(t)\, e_{h(t)} = \sum_{i=1}^{n} \left( \sum_{t : h(t) = i} x_t\, \zeta(t) \right) e_i$$ There is an infinite sum there, which must be handled at once. There are essentially only two ways to handle infinities. 
One may impose a metric, then take its completion, to allow well-behaved infinite sums, or one may demand that nothing is actually infinite, only potentially so. Here, we go for the potential-infinity way, by restricting $\mathbb{R}^T$ to contain only vectors with finite support: $\forall (x_t)_{t \in T} \in \mathbb{R}^T$, only finitely many entries of $(x_t)_{t \in T}$ are nonzero. Define an inner product on $\mathbb{R}^T$ in the obvious way: $$\langle e_t, e_{t'} \rangle = \begin{cases} 1, & \text{if } t = t', \\ 0, & \text{else,} \end{cases} \quad \langle x, x' \rangle = \sum_{t, t' \in T} x_t\, x'_{t'}\, \langle e_t, e_{t'} \rangle$$ As a side note, if $T$ is infinite, then the inner product space $\mathbb{R}^T$ is not complete. Taking its completion would get us to a Hilbert space, which allows well-behaved infinite sums. Now we have an inner product space, with enough structure to describe the geometry of the feature hashing function $\phi : \mathbb{R}^T \to \mathbb{R}^n$
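
The definition of the feature hashing function given above, with a kernel hash $h$ and a sign hash $\zeta$, can be sketched as follows. This is a minimal illustration and not the implementation from Weinberger et al.: the use of MD5 from Python's `hashlib`, the salt strings, and the function names are all assumptions made here so that the hashes are deterministic across runs. Indices run from 0 to n-1 rather than 1 to n.

```python
import hashlib


def _digest(token: str, salt: str) -> int:
    """Deterministic integer hash of a token (assumed construction: salted MD5)."""
    data = (salt + token).encode("utf-8")
    return int.from_bytes(hashlib.md5(data).digest()[:8], "big")


def kernel_hash(token: str, n: int) -> int:
    """h: maps a token to an index in 0 .. n-1."""
    return _digest(token, "index:") % n


def sign_hash(token: str) -> int:
    """zeta: maps a token to +1 or -1."""
    return 1 if _digest(token, "sign:") % 2 == 0 else -1


def hash_features(tokens: list[str], n: int) -> list[int]:
    """phi(t_1, ..., t_k): sum of zeta(t_j) * e_{h(t_j)} over all tokens."""
    vec = [0] * n
    for t in tokens:
        vec[kernel_hash(t, n)] += sign_hash(t)
    return vec


doc_a = "John likes to watch movies".split()
doc_b = "Mary likes movies too".split()
print(hash_features(doc_a, n=8))
print(hash_features(doc_b, n=8))
```

No dictionary is stored: an unseen token is simply hashed into one of the n slots. Tokens that collide share a slot, and the sign hash means colliding contributions may partly cancel instead of always adding up.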

    Read more →
  • Storyful

    Storyful

    Storyful (stylized as storyful.) is a social media intelligence company headquartered in Dublin, Ireland that is a subsidiary of News Corp, offering services such as social news monitoring, video licensing, and reputation risk management tools for corporate clients. The startup was launched as the first social media newswire, a content aggregator, verifying news sources and online content in Dublin in 2010 by Mark Little, a former journalist with RTÉ News. Storyful was acquired by News Corp in 2013 for USD$25 million. == Background == Mark Little, who had worked as a television journalist for RTÉ One, founded startup Storyful in Dublin, Ireland, in 2010, as a service that "verified news sources and online content". According to Nieman Lab, Storyful had a reputation for content aggregation as a social news agency—finding, verifying, distributing, licensing, and commercializing user-generated content, social media and online content from social networking services, including videos about stories in the news, such as the Syrian Civil War, Arab Spring protests, as well as "smaller viral moments". Storyful aimed to provide authority through its verification and monitoring tools while providing authenticity through user-generated content. On 20 December 2013 News Corp purchased Storyful for US$25 million and opened a New York office in the same building as Fox News' main studios. Little left Storyful in 2015 and Gavin Sheridan, Storyful's director of innovation left in 2014. News Corp CEO Robert Thomson said that through Storyful, News Corp would "define the opportunities that the digital landscape presents, rather than simply adapt to them." After the acquisition, the company expanded its service to include "commercial and creative work". After Murdoch acquired the company, from 2014 through to February 2018, losses "swelled", requiring a series of cash injections from News Corp. 
During that time the company expanded aggressively worldwide, growing to a staff of about 200 from about 30 in 2014. According to The Guardian, in 2016 Storyful encouraged journalists to use Verify, its social media monitoring software. Once journalists installed Verify's web browser extension on their computers, it would inform them when social media content had been "verified and cleared". The Guardian revealed that through the Verify plugin, dozens of staff in four offices had access to the journalists' browsing activity without their knowledge. This data allowed Storyful to actively monitor its own clients' activities on social media and to "turn it into an internal feed" at Storyful that "updates in real time". In November 2018, when a video circulated by Infowars' Paul Joseph Watson appeared to prove that CNN's Jim Acosta's contact with a White House intern was a physical blow, Storyful was able to prove that the 15-second-long clip had been doctored. According to a 21 January 2019 article in CNN Business, Rob McDonagh, the editor of Storyful's U.S. news team, had shown that one of the viral videos that served as catalysts in the January 2019 Lincoln Memorial confrontation at the 18 January 2019 Indigenous Peoples March was posted by a suspicious account under the handle @2020fight. McDonagh's team attempts to validate each post or video before adding it to its "digest", distinguishing true stories from those that are not. McDonagh reviewed previous content from @2020fight's account, and found it suspicious because it had a high follower count, "highly polarized and yet inconsistent political messaging", an "unusually high rate of tweets", and "the use of someone else's image in the profile photo." 
CNN reporter Donie O'Sullivan said that the @2020fight video that had been posted on 18 January, which had 2.5 million views by 22 January, was the one that "helped frame the news cycle". Currently the website offers a service by which video can be commercially brokered. == Services == Services include a newswire service, one of their "core pillars", and social news monitoring. By February 2018, Storyful was developing "risk and reputation monitoring" services through which they would source and verify social news, fact-checking it and contextualising it for corporate clients. They were "developing tech tools" that their intelligence team could use to "explore obscure or closed networks". They "track deviations in social conversations around brands and organisations and catch potential risks before they blow up. Like an alerts system." The company released a re-booted version of its Newswire platform in 2018. According to FORA, Storyful was developing new tools to combat fake news online. == Clients == When Storyful was acquired by News Corp in 2013, the company already had the Wall Street Journal, the BBC, New York Times, YouTube, ITN and Channel 4 News as clients. By 2018 their clients included CNN, ABC News, Fox News, The New York Times and the Washington Post in the United States, the Australian Broadcasting Corporation, and all of News Corp’s own publications. Most of their "reputation-conscious corporate customers" prefer not to be named.

    Read more →
  • Branch number

    Branch number

    In cryptography, the branch number is a numerical value that characterizes the amount of diffusion introduced by a vectorial Boolean function $F$ that maps an input vector $a$ to an output vector $F(a)$. For the (usual) case of a linear $F$, the value of the differential branch number is produced by: applying nonzero values of $a$ (i.e., values that have at least one nonzero component) to the input of $F$; calculating for each input value $a$ the Hamming weights $W(a)$ and $W(F(a))$ (the number of nonzero components) and adding them together; selecting the smallest combined weight across all nonzero input values: $B_d(F) = \min_{a \neq 0} (W(a) + W(F(a)))$. If both $a$ and $F(a)$ have $s$ components, the result is bounded above by $s+1$, because an input with a single nonzero component has weight 1 and its output has at most $s$ nonzero components (this "perfect" result is achieved when any single nonzero component in $a$ makes all components of $F(a)$ nonzero). A high branch number suggests higher resistance to differential cryptanalysis: small variations of the input produce large changes in the output, and obtaining small variations of the output requires large changes of the input. The term was introduced by Daemen and Rijmen in the early 2000s and quickly became a typical tool to assess the diffusion properties of transformations. 
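The three steps for a linear F can be sketched as a brute-force search. This is only an illustration under stated assumptions: each component is taken to be a single bit, F is given as a 0/1 matrix over GF(2), and the names `weight`, `apply_gf2_matrix`, `branch_number_linear`, `IDENTITY` and `XOR_OTHERS` are hypothetical, not taken from any library or from the original text.

```python
from itertools import product


def weight(vec):
    """Hamming weight W: the number of nonzero components."""
    return sum(1 for x in vec if x != 0)


def apply_gf2_matrix(matrix, a):
    """Multiply a 0/1 matrix by a 0/1 vector, with addition taken mod 2."""
    return tuple(sum(m & x for m, x in zip(row, a)) % 2 for row in matrix)


def branch_number_linear(matrix):
    """min over nonzero a of W(a) + W(F(a)), found by exhaustive search."""
    s = len(matrix[0])
    return min(
        weight(a) + weight(apply_gf2_matrix(matrix, a))
        for a in product((0, 1), repeat=s)
        if weight(a) > 0  # skip the all-zero input
    )


# Assumed toy maps on s = 4 one-bit components.
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# Each output bit is the XOR of the other three input bits.
XOR_OTHERS = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

print(branch_number_linear(IDENTITY))    # 2
print(branch_number_linear(XOR_OTHERS))  # 4
```

The identity map provides no diffusion and gets the minimum possible value 2, while the second map reaches 4, one short of the upper bound s + 1 = 5 for s = 4. The exhaustive search visits every input, so it is only practical for small s.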
== Mathematics == The branch number concept is not limited to linear transformations; Daemen and Rijmen provided two general metrics: the differential branch number, where the minimum is taken over inputs of $F$ that are constructed by independently sweeping all the values of two nonzero and unequal vectors $a$, $b$ ($\oplus$ is a component-by-component exclusive-or): $B_d(F) = \min_{a \neq b} (W(a \oplus b) + W(F(a) \oplus F(b)))$; and the linear branch number, where the candidates $\alpha$ and $\beta$ are swept independently; they should be nonzero and correlated with respect to $F$ (the $LAT(\alpha, \beta)$ coefficient of the linear approximation table of $F$ should be nonzero): $B_l(F) = \min_{\alpha \neq 0,\, \beta,\, LAT(\alpha, \beta) \neq 0} (W(\alpha) + W(\beta))$.
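Both general metrics can be evaluated by exhaustive search on a small lookup table. The sketch below makes several assumptions that are not in the text above: F maps n-bit integers to n-bit integers and is given as a list of length 2^n, every component is a single bit (so W is a bit count), α is treated as the input mask and β as the output mask, and the LAT entry is computed as a signed sum, which is zero exactly when the two masks are uncorrelated. The differential search follows the formula and ranges over all pairs a ≠ b. All function names are hypothetical.

```python
def popcount(x):
    """Number of set bits, used as the Hamming weight W of a bit vector."""
    return bin(x).count("1")


def differential_branch_number(table):
    """min over a != b of W(a xor b) + W(F(a) xor F(b))."""
    n = len(table)
    return min(
        popcount(a ^ b) + popcount(table[a] ^ table[b])
        for a in range(n)
        for b in range(a + 1, n)  # the expression is symmetric in a and b
    )


def lat_entry(table, alpha, beta):
    """Signed sum over x of (-1)^(alpha.x xor beta.F(x)); 0 means uncorrelated."""
    total = 0
    for x, y in enumerate(table):
        parity = (popcount(alpha & x) + popcount(beta & y)) % 2
        total += -1 if parity else 1
    return total


def linear_branch_number(table):
    """min of W(alpha) + W(beta) over alpha != 0 with a nonzero LAT entry."""
    n = len(table)
    return min(
        (
            popcount(alpha) + popcount(beta)
            for alpha in range(1, n)
            for beta in range(n)
            if lat_entry(table, alpha, beta) != 0
        ),
        default=None,  # no correlated pair exists, e.g. for a constant F
    )


# Assumed toy maps on 4 bits, written as lookup tables.
identity = list(range(16))
# Each output bit is the XOR of the other three input bits.
xor_others = [x ^ 0xF if popcount(x) % 2 else x for x in range(16)]

print(differential_branch_number(identity), linear_branch_number(identity))      # 2 2
print(differential_branch_number(xor_others), linear_branch_number(xor_others))  # 4 4
```

Both toy maps are linear, so the general differential formula gives the same values as the simpler minimum over nonzero inputs. For a nonlinear table the two metrics need not coincide.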

    Read more →