AI Art For Sale

AI Art For Sale — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Human-in-the-loop

    Human-in-the-loop

    Human-in-the-loop (HITL) is used in multiple contexts. It can be defined as a model requiring human interaction. HITL is associated with modeling and simulation (M&S) in the live, virtual, and constructive taxonomy. HITL, along with the related human-on-the-loop, are also used in relation to lethal autonomous weapons. Further, HITL is used in the context of machine learning.It is also used in conversational AI to manage complex interactions that require human empathy. == Machine learning == In machine learning, HITL is used in the sense of humans aiding the computer in making the correct decisions in building a model. HITL improves machine learning over random sampling by selecting the most critical data needed to refine the model. == Simulation == In simulation, HITL models may conform to human factors requirements as in the case of a mockup. In this type of simulation, a human is always part of the simulation and consequently influences the outcome in such a way that is difficult if not impossible to reproduce exactly. HITL also readily allows for the identification of problems and requirements that may not be easily identified by other means of simulation. HITL is often referred to as an interactive simulation, which is a special kind of physical simulation in which physical simulations include human operators, such as in a flight or a driving simulator. === Benefits === Human-in-the-loop allows the user to change the outcome of an event or process. The immersion effectively contributes to a positive transfer of acquired skills into the real world. This can be demonstrated by trainees utilizing flight simulators in preparation to become pilots. HITL also allows for the acquisition of knowledge regarding how a new process may affect a particular event. Utilizing HITL allows participants to interact with realistic models and attempt to perform as they would in an actual scenario. HITL simulations bring to the surface issues that would not otherwise be apparent until after a new process has been deployed. A real-world example of HITL simulation as an evaluation tool is its usage by the Federal Aviation Administration (FAA) to allow air traffic controllers to test new automation procedures by directing the activities of simulated air traffic while monitoring the effect of the newly implemented procedures. As with most processes, there is always the possibility of human error, which can only be reproduced using HITL simulation. Although much can be done to automate systems, humans typically still need to take the information provided by a system to determine the next course of action based on their judgment and experience. Intelligent systems can only go so far in certain circumstances to automate a process; only humans in the simulation can accurately judge the final design. Tabletop simulation may be useful in the very early stages of project development for the purpose of collecting data to set broad parameters, but the important decisions require human-in-the-loop simulation. HITL reflects scenarios where human input remains essential despite advances in automation. === Within the virtual simulation taxonomy === Virtual simulations inject HITL in a central role by exercising motor control skills (e.g. flying an airplane), decision making skills (e.g. committing fire control resources to action), or communication skills (e.g. as members of a C4I team). === Examples === Flight simulators Driving simulators Marine simulators Video games Supply chain management simulators Digital puppetry === Misconceptions === Although human-in-the-loop simulation can include a computer simulation in the form of a synthetic environment, computer simulation is not necessarily a form of human-in-the-loop simulation, and is often considered as human-out-of-the loop simulation. In this particular case, a computer model’s behavior is modified according to a set of initial parameters. The results of the model differ from the results stemming from a true human-in-the-loop simulation because the results can easily be replicated time and time again, by simply providing identical parameters. == Weapons == === Taxonomy === Three classifications of the degree of human control of autonomous weapon systems were laid out by Bonnie Docherty in a 2012 Human Rights Watch report. human-in-the-loop: a human must instigate the action of the weapon (in other words not fully autonomous) human-on-the-loop: a human may abort an action human-out-of-the-loop: no human action is involved === Positive human action === In discussions of autonomous weapons and nuclear command and control, the phrase positive human action has been used alongside "human-in-the-loop" to emphasize that a human operator must affirmatively authorize the use of force. Descriptions of the United States Navy's Aegis Combat System have used the phrase in characterizing a requirement for affirmative human action to initiate live firing. A survey of autonomous weapons systems described the Aegis "Auto SM" mode as one in which "the system fully develops the engagement process however engagement requires positive human action". The phrase entered United States federal law in the National Defense Authorization Act for Fiscal Year 2025, which stipulates that artificial intelligence systems not compromise "the principle of requiring positive human actions in execution of decisions by the President with respect to the employment of nuclear weapons".

    Read more →
  • Social film

    Social film

    A social film is a type of interactive film that is presented through the lens of social media. A social film is distributed digitally and integrates with a social networking service, such as Facebook or YouTube. It combines features of web video, social network games and social media. == Key elements == Social films are a more recent phenomenon, and, in turn, there are few precedents for their format. Although there are not many examples of this genre of film, the medium has certain identifiable elements: Casual entertainment Social media User-generated content Game mechanics Using just one of these factors or a combination of them, a social film engages viewers to interact directly with the work. This can be done through usual social media functionality like comments and ranking or adding directly to the narrative itself. Just as with memes, social film distribution relies on the viral spread enabled by social media. This is based on the viral expansion loops model, in which a viewer benefits from sharing the application with friends, exponentially creating new viewers compelled to share the application. == History == One of the first social films to be created was from the YouTube channel lonelygirl15. This social film started in 2006 and was created by Miles Beckett , Mesh Flinders, and Greg Goodfried. They used YouTube posts to create an interactive video series about a fictional character who showcased her life in a vlog format. As the videos went on, more bizarre things would keep happening to the main character, Bree, before she just stopped uploading. This channel was not only the first viral social film, but went on to be one of the first viral YouTube channels to be created. It did take a few years to see any more films in this genre, but 2011 saw many people start to try their hand at making these films. The first social film in this year was a film called Him, Her and Them which was produced and released by Murmur in April 2011. It was distributed exclusively through Facebook and promoted as the first “Facebook film.” The film is composed of short video clips and interactive slideshows, integrating Facebook's Social Graph API. Users participate via text-based additions to the story, which are viewable only by friends within their social network. In May 2011, Canon and Ron Howard teamed up to create Project Imagin8ion, which was a photo contest where photographers submitted photos and the top 8 photos would be the inspiration for a short film. This short film was called "When You Find Me" and could be found exclusively on YouTube. In July 2011, Intel and Toshiba partnered together to create Hollywood's first Social Film experience, a thriller called Inside, directed by D.J. Caruso and starring Emmy Rossum. The project is broken up into several segments across multiple social media platforms including Facebook, YouTube, and Twitter. In this instance, the audience is challenged to help Emmy Rossum's character, Christina, safely make it out of the room she's been trapped in. This particular form of social film is a major undertaking in that it combines social media activity with A-list acting talent to create a user experience that all happens in real time. Although not quite the same idea, Hollywood also started experimenting with the idea of interactive and crowd-sourced films. One of the first examples of this was a short film called "Life In A Day" directed by Kevin Macdonald and produced by Ridley Scott. Kevin asked people from all over the world to submit videos onto YouTube of what they were doing on July 24th, 2010. They combined all of the best videos that were submitted together to create one film of people doing different things all around the world, no matter how boring or simple those things seemed. They took this short to film festivals before releasing it to the public on YouTube in 2011. In August 2012, Intel and Toshiba partnered again to create The Beauty Inside, directed by Drake Doremus, starring Mary Elizabeth Winstead and Topher Grace. It's Hollywood's first social film that gives everyone in the audience a chance to play Alex, the lead role. The experience will be broken up into six filmed episodes interspersed with real-time interactive storytelling that all takes place on Alex's Facebook timeline. In August 2013, Intel and Toshiba released their third entry into the category, The Power Inside, directed by Will Speck and Josh Gordon and starring Harvey Keitel, Analeigh Tipton, and Craig Roberts. It's Hollywood's first social film that asks the audience to audition to help save or destroy the world. The experience is broken up into six filmed episodes interspersed with user-generated content and interactive storytelling on the main character's Facebook timeline. In 2015, Intel partnered with Dell for their fourth entry, What Lives Inside directed by Robert Stromberg and starring Colin Hanks, Catherine O'Hara, and J. K. Simmons. The first of four episodes was released on Hulu on March 25, 2015.

    Read more →
  • Interplanetary Internet

    Interplanetary Internet

    The interplanetary Internet is a conceived computer network in space, consisting of a set of network nodes that can communicate with each other. These nodes are the planet's orbiters and landers, and the Earth ground stations. For example, the orbiters collect the scientific data from the Curiosity rover on Mars through near-Mars communication links, transmit the data to Earth through direct links from the Mars orbiters to the Earth ground stations via the NASA Deep Space Network, and finally the data routed through Earth's internal internet. Interplanetary communication is greatly delayed by interplanetary distances, as data transmission can only go as fast as the speed of light, so a new set of protocols and technologies that are tolerant to large delays and errors are required. The interplanetary Internet has been envisioned as a store and forward network of internets that is often disconnected, has a wireless backbone fraught with error-prone links and delays ranging from tens of minutes to even hours, even when there is a connection. As of 2024 agencies and companies working towards bringing the network to fruition include NASA, ESA, SpaceX and Blue Origin. == Challenges and reasons == In the core implementation of Interplanetary Internet, satellites orbiting a planet communicate to other planet's satellites. Simultaneously, these planets revolve around the Sun with long distances, and thus many challenges face the communications. The reasons and the resultant challenges are: The motion and long distances between planets: The interplanetary communication is greatly delayed due to the interplanetary distances and the motion of the planets. The delay is variable and long, ranging from a couple of minutes (Earth-to-Mars), to a couple of hours (Pluto-to-Earth), depending on their relative positions. The interplanetary communication also suspends due to the solar conjunction, when the sun's radiation hinders the direct communication between the planets. As such, the communication characterizes lossy links and intermittent link connectivity. Low embeddable payload: Satellites can only carry a small payload, which poses challenges to the power, mass, size, and cost for communication hardware design. An asymmetric bandwidth would be the result of this limitation. This asymmetry reaches ratios up to 1000:1 as downlink:uplink bandwidth portion. Absence of fixed infrastructure: The graph of participating nodes in a specific planet-to-planet communication keeps changing over time, due to the constant motion. The routes of the planet-to-planet communication are planned and scheduled rather than being opportunistic. The Interplanetary Internet design must address these challenges to operate successfully and achieve good communication with other planets. It also must use the few available resources efficiently in the system. == Development == Space communication technology has steadily evolved from expensive, one-of-a-kind point-to-point architectures, to the re-use of technology on successive missions, to the development of standard protocols agreed upon by space agencies of many countries. This last phase has gone on since 1982 through the efforts of the Consultative Committee for Space Data Systems (CCSDS), a body composed of the major space agencies of the world. It has 11 member agencies, 32 observer agencies, and over 119 industrial associates. The evolution of space data system standards has gone on in parallel with the evolution of the Internet, with conceptual cross-pollination where fruitful, but largely as a separate evolution. Since the late 1990s, familiar Internet protocols and CCSDS space link protocols have integrated and converged in several ways; for example, the successful FTP file transfer to Earth-orbiting STRV 1B on January 2, 1996, which ran FTP over the CCSDS IPv4-like Space Communications Protocol Specifications (SCPS) protocols. Internet Protocol use without CCSDS has taken place on spacecraft, e.g., demonstrations on the UoSAT-12 satellite, and operationally on the Disaster Monitoring Constellation. Having reached the era where networking and IP on board spacecraft have been shown to be feasible and reliable, a forward-looking study of the bigger picture was the next phase. The Interplanetary Internet study at NASA's Jet Propulsion Laboratory (JPL) was started by a team of scientists at JPL led by internet pioneer Vinton Cerf and the late Adrian Hooke. Cerf was appointed as a distinguished visiting scientist at JPL in 1998, while Hooke was one of the founders and directors of CCSDS. While IP-like SCPS protocols are feasible for short hops, such as ground station to orbiter, rover to lander, lander to orbiter, probe to flyby, and so on, delay-tolerant networking is needed to get information from one region of the Solar System to another. It becomes apparent that the concept of a region is a natural architectural factoring of the Interplanetary Internet. A region is an area where the characteristics of communication are the same. Region characteristics include communications, security, the maintenance of resources, perhaps ownership, and other factors. The Interplanetary Internet is a "network of regional internets". What is needed then, is a standard way to achieve end-to-end communication through multiple regions in a disconnected, variable-delay environment using a generalized suite of protocols. Examples of regions might include the terrestrial Internet as a region, a region on the surface of the Moon or Mars, or a ground-to-orbit region. The recognition of this requirement led to the concept of a "bundle" as a high-level way to address the generalized Store-and-Forward problem. Bundles are an area of new protocol development in the upper layers of the OSI model, above the Transport Layer with the goal of addressing the issue of bundling store-and-forward information so that it can reliably traverse radically dissimilar environments constituting a "network of regional internets". Delay-tolerant networking (DTN) was designed to enable standardized communications over long distances and through time delays. At its core is the Bundle Protocol (BP), which is similar to the Internet Protocol, or IP, that serves as the heart of the Internet here on Earth. The big difference between the regular Internet Protocol (IP) and the Bundle Protocol is that IP assumes a seamless end-to-end data path, while BP is built to account for errors and disconnections — glitches that commonly plague deep-space communications. Bundle Service Layering, implemented as the Bundling protocol suite for delay-tolerant networking, will provide general-purpose delay-tolerant protocol services in support of a range of applications: custody transfer, segmentation and reassembly, end-to-end reliability, end-to-end security, and end-to-end routing among them. The Bundle Protocol was first tested in space on the UK-DMC satellite in 2008. An example of one of these end-to-end applications flown on a space mission is the CCSDS File Delivery Protocol (CFDP), used on the Deep Impact comet mission. CFDP is an international standard for automatic, reliable file transfer in both directions. CFDP should not be confused with Coherent File Distribution Protocol, which has the same acronym and is an IETF-documented experimental protocol for rapidly deploying files to multiple targets in a highly networked environment. In addition to reliably copying a file from one entity (such as a spacecraft or ground station) to another entity, CFDP has the capability to reliably transmit arbitrarily small messages defined by the user, in the metadata accompanying the file, and to reliably transmit commands relating to file system management that are to be executed automatically on the remote end-point entity (such as a spacecraft) upon successful reception of a file. == Protocol == The Consultative Committee for Space Data Systems (CCSDS) packet telemetry standard defines the protocol used for the transmission of spacecraft instrument data over the deep-space channel. Under this standard, an image or other data sent from a spacecraft instrument is transmitted using one or more packets. === CCSDS packet definition === A packet is a block of data with length that can vary between successive packets, ranging from 7 to 65,542 bytes, including the packet header. Packetized data is transmitted via frames, which are fixed-length data blocks. The size of a frame, including frame header and control information, can range up to 2048 bytes. Packet sizes are fixed during the development phase. Because packet lengths are variable but frame lengths are fixed, packet boundaries usually do not coincide with frame boundaries. === Telecom processing notes === Data in a frame is typically protected from channel errors by error-correcting codes. Even when the channel errors exceed the correction capability of the error-correcting code, the presence of errors is nearly always detected by the e

    Read more →
  • Superincreasing sequence

    Superincreasing sequence

    In mathematics, a sequence of positive real numbers ( s 1 , s 2 , . . . ) {\displaystyle (s_{1},s_{2},...)} is called superincreasing if every element of the sequence is greater than the sum of all previous elements in the sequence. Formally, this condition can be written as s n + 1 > ∑ j = 1 n s j {\displaystyle s_{n+1}>\sum _{j=1}^{n}s_{j}} for all n ≥ 1. == Program == The following Python source code tests a sequence of numbers to determine if it is superincreasing: This produces the following output: Sum: 0 Element: 1 Sum: 1 Element: 3 Sum: 4 Element: 6 Sum: 10 Element: 13 Sum: 23 Element: 27 Sum: 50 Element: 52 Is it a superincreasing sequence? True == Examples == (1, 3, 6, 13, 27, 52) is a superincreasing sequence, but (1, 3, 4, 9, 15, 25) is not. The series a^x for a>=2 == Properties == Multiplying a superincreasing sequence by a positive real constant keeps it superincreasing.

    Read more →
  • Kernel density estimation

    Kernel density estimation

    In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a non-parametric method to estimate the probability density function of a random variable based on kernels as weights. KDE answers a fundamental data smoothing problem where inferences about the population are made based on a finite data sample. In some fields such as signal processing and econometrics it is also termed the Parzen–Rosenblatt window method, after Emanuel Parzen and Murray Rosenblatt, who are usually credited with independently creating it in its current form. One of the famous applications of kernel density estimation is in estimating the class-conditional marginal densities of data when using a naive Bayes classifier, which can improve its prediction accuracy. == Definition == Let x = ( x 1 , x 2 , x 3 , . . . ) {\displaystyle \mathbf {x} =\left(x_{1},x_{2},x_{3},...\right)} be independent and identically distributed samples drawn from some univariate distribution with an unknown density f at any given point x. We are interested in estimating the shape of this function f. Its kernel density estimator is f ^ h ( x ) = 1 n ∑ i = 1 n K h ( x − x i ) = 1 n h ∑ i = 1 n K ( x − x i h ) , {\displaystyle {\hat {f}}_{h}(x)={\frac {1}{n}}\sum _{i=1}^{n}K_{h}(x-x_{i})={\frac {1}{nh}}\sum _{i=1}^{n}K{\left({\frac {x-x_{i}}{h}}\right)},} where K is the kernel — a non-negative function — and h > 0 is a smoothing parameter called the bandwidth or simply width. A kernel with subscript h is called the scaled kernel and defined as Kh(x) = ⁠1/h⁠ K(⁠x/h⁠). Intuitively one wants to choose h as small as the data will allow; however, there is always a trade-off between the bias of the estimator and its variance. The choice of bandwidth is discussed in more detail below. A range of kernel functions are commonly used: uniform, triangular, biweight, triweight, Epanechnikov (parabolic), normal, and others. The Epanechnikov kernel is optimal in a mean square error sense, though the loss of efficiency is small for the kernels listed previously. Due to its convenient mathematical properties, the normal kernel is often used, which means K(x) = ϕ(x), where ϕ is the standard normal density function. The kernel density estimator then becomes f ^ h ( x ) = 1 n ∑ i = 1 n 1 h 2 π exp ⁡ ( − ( x − x i ) 2 2 h 2 ) , {\displaystyle {\hat {f}}_{h}(x)={\frac {1}{n}}\sum _{i=1}^{n}{\frac {1}{h{\sqrt {2\pi }}}}\exp \left({\frac {-(x-x_{i})^{2}}{2h^{2}}}\right),} where h {\displaystyle h} is the standard deviation of the sample x {\displaystyle \mathbf {x} } . The construction of a kernel density estimate finds interpretations in fields outside of density estimation. For example, in thermodynamics, this is equivalent to the amount of heat generated when heat kernels (the fundamental solution to the heat equation) are placed at each data point locations xi. Similar methods are used to construct discrete Laplace operators on point clouds for manifold learning (e.g. diffusion map). == Example == Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel. The diagram below based on these 6 data points illustrates this relationship: For the histogram, first, the horizontal axis is divided into sub-intervals or bins which cover the range of the data: In this case, six bins each of width 2. Whenever a data point falls inside this interval, a box of height 1/12 is placed there. If more than one data point falls inside the same bin, the boxes are stacked on top of each other. For the kernel density estimate, normal kernels with a standard deviation of 1.5 (indicated by the red dashed lines) are placed on each of the data points xi. The kernels are summed to make the kernel density estimate (solid blue curve). The smoothness of the kernel density estimate (compared to the discreteness of the histogram) illustrates how kernel density estimates converge faster to the true underlying density for continuous random variables. == Bandwidth selection == The bandwidth of the kernel is a free parameter which exhibits a strong influence on the resulting estimate. To illustrate its effect, we take a simulated random sample from the standard normal distribution (plotted at the blue spikes in the rug plot on the horizontal axis). The grey curve is the true density (a normal density with mean 0 and variance 1). In comparison, the red curve is undersmoothed since it contains too many spurious data artifacts arising from using a bandwidth h = 0.05, which is too small. The green curve is oversmoothed since using the bandwidth h = 2 obscures much of the underlying structure. The black curve with a bandwidth of h = 0.337 is considered to be optimally smoothed since its density estimate is close to the true density. An extreme situation is encountered in the limit h → 0 {\displaystyle h\to 0} (no smoothing), where the estimate is a sum of n delta functions centered at the coordinates of analyzed samples. In the other extreme limit h → ∞ {\displaystyle h\to \infty } the estimate retains the shape of the used kernel, centered on the mean of the samples (completely smooth). The most common optimality criterion used to select this parameter is the expected L2 risk function, also termed the mean integrated squared error: MISE ⁡ ( h ) = E [ ∫ ( f ^ h ( x ) − f ( x ) ) 2 d x ] {\displaystyle \operatorname {MISE} (h)=\operatorname {E} \!\left[\int \!{\left({\hat {f}}\!_{h}(x)-f(x)\right)}^{2}dx\right]} Under weak assumptions on f and K, (f is the, generally unknown, real density function), MISE ⁡ ( h ) = AMISE ⁡ ( h ) + o ( ( n h ) − 1 + h 4 ) {\displaystyle \operatorname {MISE} (h)=\operatorname {AMISE} (h)+{\mathcal {o}}{\left((nh)^{-1}+h^{4}\right)}} where o is the little o notation, and n the sample size (as above). The AMISE is the asymptotic MISE, i. e. the two leading terms, AMISE ⁡ ( h ) = R ( K ) n h + 1 4 m 2 ( K ) 2 h 4 R ( f ″ ) {\displaystyle \operatorname {AMISE} (h)={\frac {R(K)}{nh}}+{\frac {1}{4}}m_{2}(K)^{2}h^{4}R(f'')} where R ( g ) = ∫ g ( x ) 2 d x {\textstyle R(g)=\int g(x)^{2}\,dx} for a function g, m 2 ( K ) = ∫ x 2 K ( x ) d x {\textstyle m_{2}(K)=\int x^{2}K(x)\,dx} and f ″ {\displaystyle f''} is the second derivative of f {\displaystyle f} and K {\displaystyle K} is the kernel. The minimum of this AMISE is the solution to this differential equation ∂ ∂ h AMISE ⁡ ( h ) = − R ( K ) n h 2 + m 2 ( K ) 2 h 3 R ( f ″ ) = 0 {\displaystyle {\frac {\partial }{\partial h}}\operatorname {AMISE} (h)=-{\frac {R(K)}{nh^{2}}}+m_{2}(K)^{2}h^{3}R(f'')=0} or h AMISE = R ( K ) 1 / 5 m 2 ( K ) 2 / 5 R ( f ″ ) 1 / 5 n − 1 / 5 = C n − 1 / 5 {\displaystyle h_{\operatorname {AMISE} }={\frac {R(K)^{1/5}}{m_{2}(K)^{2/5}R(f'')^{1/5}}}n^{-1/5}=Cn^{-1/5}} Neither the AMISE nor the hAMISE formulas can be used directly since they involve the unknown density function f {\displaystyle f} or its second derivative f ″ {\displaystyle f''} . To overcome that difficulty, a variety of automatic, data-based methods have been developed to select the bandwidth. Several review studies have been undertaken to compare their efficacies, with the general consensus that the plug-in selectors and cross validation selectors are the most useful over a wide range of data sets. Substituting any bandwidth h which has the same asymptotic order n−1/5 as hAMISE into the AMISE gives that AMISE(h) = O(n−4/5), where O is the big O notation. It can be shown that, under weak assumptions, there cannot exist a non-parametric estimator that converges at a faster rate than the kernel estimator. Note that the n−4/5 rate is slower than the typical n−1 convergence rate of parametric methods. If the bandwidth is not held fixed, but is varied depending upon the location of either the estimate (balloon estimator) or the samples (pointwise estimator), this produces a particularly powerful method termed adaptive or variable bandwidth kernel density estimation. Bandwidth selection for kernel density estimation of heavy-tailed distributions is relatively difficult. === A rule-of-thumb bandwidth estimator === If Gaussian basis functions are used to approximate univariate data, and the underlying density being estimated is Gaussian, the optimal choice for h (that is, the bandwidth that minimises the mean integrated squared error) is: h = ( 4 σ ^ 5 3 n ) 1 / 5 ≈ 1.06 σ ^ n − 1 / 5 , {\displaystyle h={\left({\frac {4{\hat {\sigma }}^{5}}{3n}}\right)}^{1/5}\approx 1.06\,{\hat {\sigma }}\,n^{-1/5},} An h {\displaystyle h} value is considered more robust when it improves the fit for long-tailed and skewed distributions or for bimodal mixture distributions. This is often done empirically by replacing the standard deviation σ ^ {\displaystyle {\hat {\sigma }}} by the parameter A {\displaystyle A} below: A = min ( σ ^ , I Q R 1.34 ) {\displaystyle A=\min \left({\hat {\sigma }},{\frac {\mathrm {IQR} }{1.34}}\right)} where IQR is the

    Read more →
  • Sysomos

    Sysomos

    Sysomos Inc. is a Toronto-based social media analytics company owned by Outside Insight market leaders Meltwater. The company developed text analytics and machine learning technologies for user generated content, and served 80% of the top agencies and Fortune 500. == History == Sysomos was founded by Nilesh Bansal and Nick Koudas. The company is a spinoff of the University of Toronto research project BlogScope. The BlogScope project, which started in 2005, resulted in creation of the underlying content aggregation and analysis engine commercialized by Sysomos. The company raised venture capital in 2008 and was acquired by Marketwire in 2010. The company's original flagship product, Media Analysis Platform (MAP), mines and analyzes content from social media or user-generated content to create a picture of media coverage. Sysomos launched its flagship offering MAP in Sept 2007, followed by addition of Heartbeat to its product suite in 2009. In addition to the two main products, the company released FourWhere, a free location-based social search service that mashes up Foursquare in March 2010. The company also offers Sysomos Heartbeat which provides social media monitoring and engagement capabilities to communication professionals, brand managers and customer support groups. In 2013, Heartbeat was extended to add publishing components to deliver a complete end-to-end social media marketing platform. On July 6, 2010, it was announced that Marketwire, a press release distribution company, had acquired Sysomos. After the acquisition, Sysomos founders Nick Koudas and Nilesh Bansal, left Sysomos to start Aislelabs. In February 2015, Sysomos split from Marketwired, as an independent company, and appointed Adnan Ahmed as the new CEO. In March 2015, newly independent Sysomos launched a redesign for its Heartbeat product and a new API for its MAP product. In the same year, the company acquired Expion. In September 2016, Peter Heffring was announced as the new CEO. In April 2017, Sysomos showcased a new unified platform offering new insights. In April 2018, media monitoring firm Meltwater announced it had acquired Sysomos. The CEO of Sysomos, Peter Heffring, said the company will continue to operate as an independent unit of Meltwater. Heffring will run the social analytics division of Meltwater. == Reports == Inside Twitter series of reports is the most extensive third-party survey on Twitter's growth and demographics. Another extensive survey regarding the top 5% of most active Twitter users found that over 25% of all tweets are machine created. The report also confirms Twitter's international growth. Inside Facebook Pages report found that only four percent of pages have more than 10,000 fans, 0.76% of pages have more than 100,000 fans, and 0.05% of pages (or 297 in total) have more than a million fans. Inside YouTube reports focus more on video hosting services and YouTube.

    Read more →
  • Social bot

    Social bot

    A social bot, refers to fully or partially automated social media accounts designed to perform most regular users’ actions, such as liking, posting content, and chatting with other users. Although their levels of autonomy vary, and often include a human-in-the-loop, social bots can use artificial intelligence to perform social media actions and can use large language models to mimic human dialogue. Social bots can operate alone or in groups that coordinate messaging as part of a network of coordinated inauthentic behavior. Social bots are often used to perform ad fraud by artificially boosting viewership and engagement metrics and to spread disinformation on social media. == Uses == Social bots are used for a large number of purposes on a variety of social media platforms, including Twitter, Instagram, Facebook, and YouTube. One common use of social bots is to inflate a social media user's apparent popularity, usually by artificially manipulating their engagement metrics with large volumes of fake likes, reposts, or replies. Social bots can similarly be used to artificially inflate a user's follower count with fake followers, creating a false perception of a larger and more influential online following than is the case. The use of social bots to create the impression of a large social media influence allows individuals, brands, and organizations to attract a higher number of human followers and boost their online presence. Fake engagement can be bought and sold in the black market of social media engagement. Corporations typically use automated customer service agents on social media to affordably manage high levels of support requests. Social bots are used to send automated responses to users’ questions, sometimes prompting the user to private message the support account with additional information. The increased use of automated support bots and virtual assistants has led to some companies laying off customer-service staff. Social bots are also often used to influence public opinion. Autonomous bot accounts can flood social media with large numbers of posts expressing support for certain products, companies, or political campaigns, creating the impression of organic grassroots support. This can create a false perception of the number of people who support a certain position, which may also have effects on the direction of stock prices or on elections. Messages with similar content can also influence fads or trends. Many social bots are also used to amplify phishing attacks. These malicious bots are used to trick a social media user into giving up their passwords or other personal data. This is usually accomplished by posting links claiming to direct users to news articles that would in actuality direct to malicious websites containing malware. Scammers often use URL shortening services such as TinyURL and bit.ly to disguise a link's domain address, increasing the likelihood of a user clicking the malicious link. The presence of fake social media followers and high levels of engagement help convince the victim that the scammer is in fact a trusted user. Social bots can be a tool for computational propaganda. Bots can also be used for algorithmic curation, algorithmic radicalization, and/or influence-for-hire, a term that refers to the selling of an account on social media platforms. == History == Bots have coexisted with computer technology since the earliest days of computing. Social bots have their roots in the 1950s with Alan Turing, whose work focused on machine intelligence with the development of the Turing Test. The following decades saw further progress made towards the goal of creating programs capable of mimicking human behavior, notably with Joseph Weizenbaum’s creation of ELIZA. Considered to be one of the first Chatbots, ELIZA could simulate natural conversations with human users through pattern matching. Its most famous script was DOCTOR, a simulation of a Rogerian psychotherapist that was programmed to chat with patients and respond to questions. With the growth of social media platforms in the early 2000s, these bots could be used to interact with much larger user groups in an inconspicuous manner. Early instances of autonomous agents on social media could be found on sites like MySpace, with social bots being used by marketing firms to inflate activity on a user’s page in an effort to make them appear more popular. Social bots have been observed on a large variety of social media websites, with Twitter being one of the most widely observed examples. The creation of Twitter bots is generally against the site’s terms of service when used to post spam or to automatically like and follow other users, but some degree of automation using Twitter’s API may be permitted if used for “entertainment, informational, or novelty purposes.” Other platforms such as Reddit and Discord also allow for the use of social bots as long as they are not used to violate policies regarding harmful content and abusive behavior. Social media platforms have developed their own automated tools to filter out messages that come from bots, although they cannot detect all bot messages. == Legal regulation == Due to the difficulty of recognizing social bots and separating them from "eligible" automation via social media APIs, it is unclear how legal regulation can be enforced. Social bots are expected to play a role in shaping public opinion by autonomously acting as influencers. Some social bots have been used to rapidly spread misinformation, manipulate stock markets, influence opinion on companies and brands, promote political campaigns, and engage in malicious phishing campaigns. In the United States, some states have started to implement legislation in an attempt to regulate the use of social bots. In 2019, California passed the Bolstering Online Transparency Act (the B.O.T. Act) to make it unlawful to use automated software to appear indistinguishable from humans for the purpose of influencing a social media user's purchasing and voting decisions. Other states such as Utah and Colorado have passed similar bills to restrict the use of social bots. The Artificial Intelligence Act (AI Act) in the European Union is the first comprehensive law governing the use of Artificial Intelligence. The law requires transparency in AI to prevent users from being tricked into believing they are communicating with another human. AI-generated content on social media must be clearly marked as such, preventing social bots from using AI in a manner that mimics human behavior. == Detection == The first generation of bots could sometimes be distinguished from real users by their often superhuman capacities to post messages. Later developments have succeeded in imprinting more "human" activity and behavioral patterns in the agent. With enough bots, it might be even possible to achieve artificial social proof. To unambiguously detect social bots as what they are, a variety of criteria must be applied together using pattern detection techniques, some of which are: cartoon figures as user pictures sometimes also random real user pictures are captured (identity fraud) reposting rate temporal patterns sentiment expression followers-to-friends ratio length of user names variability in (re)posted messages engagement rate (like/followers rate) analysis of the time series of social media posts Social bots are always becoming increasingly difficult to detect and understand. The bots' human-like behavior, ever-changing behavior of the bots, and the sheer volume of bots covering every platform may have been a factor in the challenges of removing them. Social media sites, like Twitter, are among the most affected, with CNBC reporting up to 48 million of the 319 million users (roughly 15%) were bots in 2017. Botometer (formerly BotOrNot) is a public Web service that checks the activity of a Twitter account and gives it a score based on how likely the account is to be a bot. The system leverages over a thousand features. An active method for detecting early spam bots was to set up honeypot accounts that post nonsensical content, which may get reposted (retweeted) by the bots. However, bots evolve quickly, and detection methods have to be updated constantly, because otherwise they may get useless after a few years. One method is the use of Benford's Law for predicting the frequency distribution of significant leading digits to detect malicious bots online. This study was first introduced at the University of Pretoria in 2020. Another method is artificial-intelligence-driven detection. Some of the sub-categories of this type of detection would be active learning loop flow, feature engineering, unsupervised learning, supervised learning, and correlation discovery. Some operations of bots work together in a synchronized way. For example, ISIS used Twitter to amplify its Islamic content by numerous orchestrated accounts which further pushed an item to the Hot List news, thus further a

    Read more →
  • Virtual collective consciousness

    Virtual collective consciousness

    Virtual collective consciousness (VCC) is a term rebooted and promoted by two behavioral scientists, Yousri Marzouki and Olivier Oullier in their 2012 Huffington Post article titled: "Revolutionizing Revolutions: Virtual Collective Consciousness and the Arab Spring", after its first appearance in 1999-2000. VCC is now defined as an internal knowledge catalyzed by social media platforms and shared by a plurality of individuals driven by the spontaneity, the homogeneity, and the synchronicity of their online actions. VCC occurs when a large group of persons, brought together by a social media platform think and act with one mind and share collective emotions. Thus, they are able to coordinate their efforts efficiently, and could rapidly spread their word to a worldwide audience. When interviewed about the concept of VCC that appeared in the book - Hyperconnectivity and the Future of Internet Communication - he edited, Professor of Pervasive Computing, Adrian David Cheok mentioned the following: "The idea of a global (collective) virtual consciousness is a bottom-up process and a rather emergent property resulting from a momentum of complex interactions taking place in social networks. This kind of collective behaviour (or intelligence) results from a collision between a physical world and a virtual world and can have a real impact in our life by driving collective action." == Etymology == In 1999-2000, Richard Glen Boire provided a cursory mention and the only occurrence of the term "Virtual collective consciousness" in his text as follows: The trend of technology is to overcome the limitations of the human body. And, the Web has been characterized as a virtual collective consciousness and unconsciousness The recent definition of VCC evolved from the first empirical study that provided a cyberpsychological insight into the contribution of Facebook to the 2011 Tunisian revolution. In this study, the concept was originally called "collective cyberconsciousness". The latter is an extension of the idea of "collective consciousness" coupled with "citizen media" usage. The authors of this study also made a parallel between this original definition of VCC and other comparable concepts such as Durkheim's collective representation, Žižek's "collective mind" or Boguta's "new collective consciousness" that he used to describe the computational history of the Internet shutdown during the Egyptian revolution. Since VCC is the byproduct of the network's successful actions, then these actions must be timely, acute, rapid, domain-specific, and purpose-oriented to successfully achieve their goal. Before reaching a momentum of complexity, each collective behavior starts by a spark that triggers a chain of events leading to a crystallized stance of a tremendous amount of interactions. Thus, VCC is an emergent global pattern from these individual actions. In 2012, the term virtual collective consciousness resurfaced and was brought to light after extending its applications to the Egyptian case and the whole social networking major impact on the success of the so-called Arab Spring. Moreover, the acronym VCC was suggested to identify the theoretical framework covering on-line behaviors leading to a virtual collective consciousness. Hence, online social networks have provided a new and faster way of establishing or modifying "collective consciousness" that was paramount to the 2011 uprisings in the Arab world. == Theoretical underpinnings of VCC == Various theoretical references in fields ranging from sociology to computer science were mentioned in order to account for the key features that render the framework for a virtual collective consciousness. The following list is not exhaustive, but the references it contains are often highlighted: Émile Durkheim's collective representations are at the heart of VCC since collectivity taken decisions according to Durkheim's assumptions will approve or disapprove individuals' actions and help them eventually reach their final goal. Marshall McLuhan's global village: The shrinking of our big world to a small place called cyberspace is made possible by technological extensions of human consciousness. Carl Jung's collective unconscious: When a society witnesses significant changes, the anchoring of archetypal images (e.g., political leaders) seems to be deeply rooted in individuals' collective unconscious that is likely to bias their political choices. Individual memories of public events were also supposed to convey a "collective awareness" that can be subconsciously altered by the instantaneous spread of information through social networking around the world. Daniel Wegner's transactive memory (TM): social-networking platforms such as Facebook during the Tunisian revolution or Twitter during the Egyptian revolution served as placeholders of a VCC where information can be harnessed and steered to the highly specific revolutionary purpose. Although research on TM was originally limited to couples, small groups, and organizations, recent studies strongly suggest that an effective TM can operate on a very large scale too. James Surowiecki's wisdom of crowds Collective influence algorithm: The CI (Collective influence) algorithm is effective in finding influential nodes in a variety of networks, including social networks, communication networks, and biological networks. It has been used to identify influencers on social-media platforms, to identify key nodes in transportation networks, and to identify potential drug-targets in biological networks. == Some illustrations of VCC == Besides the studied effect of social networking on the Tunisian and Egyptian revolutions, the former via Facebook and the latter via Twitter other applications were studied under the prism of VCC framework: The Whitacre's virtual choir: A compelling example of the degree of autonomy and self-identity members of a spontaneously created network through a VCC is Eric Whitacre's unique musical project that involved a collection of singers performing remotely to create a virtual Choir. The effect of all the voices illustrated a genuine virtual collective empathy merging the artist's mind with all the singers through his silent conducting gestures. The Harlem Shake dance: The Bitcoin protocol: It was questioned whether or not the Bitcoin protocol can morph into virtual collective consciousness. The Byzantine generals problem was used as an analogy to understand the behavioral complexity of the community of Bitcoin's users. Artificial Social Networking Intelligence (ASNI): refers to the application of artificial intelligence within social networking services and social media platforms. It encompasses various technologies and techniques used to automate, personalize, enhance, improve, and synchronize users' interactions and experiences within social networks. ASNI is expected to evolve rapidly, influencing how we interact online and shaping our digital experiences. Transparency, ethical considerations, media influence bias, and user control over data will be crucial to ensure responsible development and positive impact.

    Read more →
  • Labeled data

    Labeled data

    Labeled data is a group of samples that have been tagged with one or more labels. Labeling typically takes a set of unlabeled data and augments each piece of it with informative tags called judgments. For example, a data label might indicate whether a photo contains a horse or a cow, which words were uttered in an audio recording, what type of action is being performed in a video, what the topic of a news article is, what the overall sentiment of a tweet is, or whether a dot in an X-ray is a tumor. Labels can be obtained by having humans make judgments about a given piece of unlabeled data. Labeled data is significantly more expensive to obtain than the raw unlabeled data. The quality of labeled data directly influences the performance of supervised machine learning models in operation, as these models learn from the provided labels. == Crowdsourced labeled data == In 2006, Fei-Fei Li, the co-director of the Stanford Human-Centered AI Institute, initiated research to improve the artificial intelligence models and algorithms for image recognition by significantly enlarging the training data. The researchers downloaded millions of images from the World Wide Web and a team of undergraduates started to apply labels for objects to each image. In 2007, Li outsourced the data labeling work on Amazon Mechanical Turk, an online marketplace for digital piece work. The 3.2 million images that were labeled by more than 49,000 workers formed the basis for ImageNet, one of the largest hand-labeled database for outline of object recognition. == Automated data labelling == After obtaining a labeled dataset, machine learning models can be applied to the data so that new unlabeled data can be presented to the model and a likely label can be guessed or predicted for that piece of unlabeled data. == Challenges == === Data-driven bias === Algorithmic decision-making is subject to programmer-driven bias as well as data-driven bias. Training data that relies on bias labeled data will result in prejudices and omissions in a predictive model, despite the machine learning algorithm being legitimate. The labeled data used to train a specific machine learning algorithm needs to be a statistically representative sample to not bias the results. For example, in facial recognition systems underrepresented groups are subsequently often misclassified if the labeled data available to train has not been representative of the population,. In 2018, a study by Joy Buolamwini and Timnit Gebru demonstrated that two facial analysis datasets that have been used to train facial recognition algorithms, IJB-A and Adience, are composed of 79.6% and 86.2% lighter skinned humans respectively. === Human error and inconsistency === Human annotators are prone to errors and biases when labeling data. This can lead to inconsistent labels and affect the quality of the data set. The inconsistency can affect the machine learning model's ability to generalize well. === Domain expertise === Certain fields, such as legal document analysis or medical imaging, require annotators with specialized domain knowledge. Without the expertise, the annotations or labeled data may be inaccurate, negatively impacting the machine learning model's performance in a real-world scenario.

    Read more →
  • Weird SoundCloud

    Weird SoundCloud

    Weird SoundCloud, or SoundClown, is a mashup parody music scene taking place on the online distribution platform SoundCloud. The scene has been described by its producers and music journalists to be a satirical take on electronic dance music, and useless, throwaway internet content. One critic, Audra Schroeder, categorized it as an in-joke that is "deconstructing and reshaping memes and popular music, recontextualizing the sacred texts of millennial chat rooms." == Origins == In a January 2014 interview, DJ Kevin Wang suggested that the Weird SoundCloud has "been around in the last one to two years", but started to gain much more popularity the previous year through electronic dance music internet blogs. Weird SoundCloud producer Ideaot suggested that some in the phenomenon came from the YouTube poop scene. Another producer in the community, DJ @@ (AT-AT), reasoned that producers joining the scene "want to express their musicality, see it as a more mature form of YouTube Poop," or are "just looking for recognition on social media sites." AT-AT said that it was "a fun thing to do, and after I stopped making proper music I felt I needed a bit of an outlet for my creativity. The fact that people enjoyed it and/or treated it as a travesty (Direct quote from one of my tracks) spurs me on." == Characteristics == Weird SoundCloud is a mash-up and parody music genre labeled by journalist Audra Schroeder as an in-joke that is "deconstructing and reshaping memes and popular music, recontextualizing the sacred texts of millennial chat rooms." Most tracks range from around 30 seconds to one minute in length. The people who make weird SoundCloud are known as SoundClowns, a term coined by producer Dicksoak. Ideaot described the weird SoundCloud community as "largely just people who are friends with each other." Noisey critic Ryan Bassil spotlight the variety of music coming out of the weird SoundCloud landscape: "One minute you could be listening to the Seinfeld theme reimagined as an aneurysm inducing dubstep corker, the next, you're recovering from hearing a version of Tenacious D's "Tribute" that's akin to having a stroke." Bassil analyzes that the tracks "often take the past and repurpose it into something that, although not altogether useful, sounds fresh and reflective of the abstract, confusing panoramic that encapsulates the modern internet." Bassil compared the lexicon of SoundClown's track titles to that of Reddit and Twitter users. According to Dicksoak, most works of the style are critiques of EDM or "are just uploaded because they sound funny." However, Bassil disagreed, writing that there are also many tracks that keep repurposing a certain meme, such as "mom's spaghetti" or the re-use of vocals from recordings by hip hop group Death Grips. He describe the scene's re-use of memes as a satirical take on pointless online content that is only on the internet to "do nothing other than fill the void": They're changing the format of the original work's intended message or audience - a technique often employed by top-tier digital media companies - and in doing so they're sarcastically, ironically, taking the piss out of what Web 2.0's turned into - an open arena where the most ridiculous, unashamed, often pointless piggy-back content can rack up thousands and thousands of clicks. == Notable examples == There are mash-ups that "disrupt the flow of popular music", in the words of writer Schroeder, such as a "flutedrop" remix of the Miley Cyrus song "Wrecking Ball" and Shaliek's mashup of music by Bruno Mars and Korn. In November 2013, Wang released a set of mp3 files on SoundCloud named Best Drops Ever, which included tracks like "A Drop So Epic a Bunch of NYU Bros Already Bought a 3-Day Weekend Pass for It" and "A Drop So Crazy You'll Kill Your Family". All of the tracks start as normal electronic dance music build-ups, before they drop into a "bait and switch" audio or film clip such as Filet-O-Fish commercials, the Whitney Houston song "I Will Always Love You" and the film Bambi (1942) that ruins the anticipation. The collection is a parody of the over-importance and over-focus of the drop and lack of care of the overall quality of a song common in the modern electronic dance music scene. Wang has released more than 45 tracks in the weird SoundCloud, some of them receiving around a million plays. Subgenres of Weird SoundCloud include Macklecore, mash-ups and remixes that include the works of American hip-hop recording artist Macklemore, and Biggiewave, which include samples of songs from the album Ready to Die (1994) by The Notorious B.I.G. Common audio and meme sources used include Skrillex, the Martin Garrix track "Animals", Thomas the Tank Engine, Shrek, Macklemore, "Gangnam Style", the Bruno Mars track "Uptown Funk", the Disturbed track "Down with the Sickness", Space Jam, the Childish Gambino track "Bonfire", the Death Grips track "Takyon" and air horn sound effects. == Reception == Bassil praised the SoundClown scene as "loveable and strangely honest", reasoning that it "just reminds me that we're all humans on the internet, all searching for #content that means something, something to connect with, but usually only dredging up bastardised versions of things we've already read, seen, or watched before." Bassil also described the weird SoundCloud as a more successful version of a similar scene known as weird YouTube; the reason for the success of SoundClowns is due to SoundCloud's discovery algorithm: "Small collectives and trends are able to form, and there's an abundance of tracks from artists who are almost forging careers out of it, as opposed to uploading one viral hit." Publications have made lists of weird SoundCloud works, such as BuzzFeed's "23 Of The Weirdest Songs On Soundcloud", Obsev's "Weird SoundCloud Mashups That Must've Been Made While Drunk", and Thump's "9 of the Best and Most Upsetting Soundclowns we Could Find", where writer Isabelle Hellyer called it the "most influential genre of music in human history." A Your EDM writer called it "oddly addicting."

    Read more →
  • Conjugate coding

    Conjugate coding

    Conjugate coding is a cryptographic tool, introduced by Stephen Wiesner in the late 1960s. It is part of the two applications Wiesner described for quantum coding, along with a method for creating fraud-proof banking notes. The application that the concept was based on was a method of transmitting multiple messages in such a way that reading one destroys the others. This is called quantum multiplexing and it uses photons polarized in conjugate bases as "qubits" to pass information. Conjugate coding also is a simple extension of a random number generator. At the behest of Charles Bennett, Wiesner published the manuscript explaining the basic idea of conjugate coding with a number of examples but it was not embraced because it was significantly ahead of its time. Because its publication has been rejected, it was developed to the world of public-key cryptography in the 1980s as oblivious transfer, first by Michael Rabin and then by Shimon Even. It is used in the field of quantum computing. The initial concept of quantum cryptography developed by Bennett and Gilles Brassard was also based on this concept.

    Read more →
  • Data steward

    Data steward

    A data steward is an oversight or data governance role within an organization, and is responsible for ensuring the quality and fitness for purpose of the organization's data assets, including the metadata for those data assets. A data steward may share some responsibilities with a data custodian, such as the awareness, accessibility, release, appropriate use, security and management of data. A data steward would also participate in the development and implementation of data assets. A data steward may seek to improve the quality and fitness for purpose of other data assets their organization depends upon but is not responsible for. Data stewards have a specialist role that utilizes an organization's data governance processes, policies, guidelines and responsibilities for administering an organizations' entire data in compliance with policy and/or regulatory obligations (e.g., GDPR, HIPAA). The overall objective of a data steward is the data quality of the data assets, datasets, data records and data elements. This includes documenting metainformation for the data, such as definitions, related rules/governance, physical manifestation, and related data models (most of these properties being specific to an attribute/concept relationship), identifying owners/custodian's various responsibilities, relations insight pertaining to attribute quality, aiding with project requirement data facilitation and documentation of capture rules. Data stewards begin the stewarding process with the identification of the data assets and elements which they will steward, with the ultimate result being standards, controls and data entry. The steward works closely with business glossary standards analysts (for standards), with data architect/modelers (for standards), with DQ analysts (for controls) and with operations team members (good-quality data going in per business rules) while entering data. Data stewardship roles are common when organizations attempt to exchange data precisely and consistently between computer systems and to reuse data-related resources. Master data management often makes references to the need for data stewardship for its implementation to succeed. Data stewardship must have precise purpose, fit for purpose or fitness. == Data steward responsibilities == A data steward ensures that each assigned data element: Has clear and unambiguous data element definition Does not conflict with other data elements in the metadata registry (removes duplicates, overlap etc.) Has clear enumerated value definitions if it is of type Code Is still being used (remove unused data elements) Is being used consistently in various computer systems Is being used, fit for purpose = Data Fitness Has adequate documentation on appropriate usage and notes Documents the origin and sources of authority on each metadata element Is protected against unauthorised access or change Responsibilities of data stewards vary between different organisations and institutions. For example, at Delft University of Technology, data stewards are perceived as the first contact point for any questions related to research data. They also have subject-specific background allowing them to easily connect with researchers and to contextualise data management problems to take into account disciplinary practices. == Types of data stewards == Depending on the set of data stewardship responsibilities assigned to an individual, there are 4 types (or dimensions of responsibility) of data stewards typically found within an organization: Data object data steward - responsible for managing reference data and attributes of one business data entity Business data steward - responsible for managing critical data, both reference and transactional, created or used by one business function. The data steward may also serve as a liaison between the organization's data users and technical teams, helping to bridge the gap between business needs and technical requirements. They may also play a role in educating others within the organization about best practices for data management, and advocating for data-driven decision-making. Process data steward - responsible for managing data across one business process System data steward - responsible for managing data for at least one IT system == Benefits of data stewardship == Systematic data stewardship can foster: Faster analysis Consistent use of data management resources Easy mapping of data between computer systems and exchange documents Lower costs associated with migration to (for example) service-oriented architecture (SOA) Mitigation of data risk Better control of dangers associated with privacy, legal, errors, etc. Assignment of each data element to a person sometimes seems like an unimportant process. But multiple groups have found that users have greater trust and usage rates in systems where they can contact a person with questions on each data element. == Examples == Delft University of Technology (TU Delft) offers an example of data stewardship implementation at a research institution. In 2017 the Data Stewardship Project was initiated at TU Delft to address research data management needs in a disciplinary manner across the whole campus. Dedicated data stewards with subject-specific background were appointed at every TU Delft faculty to support researchers with data management questions and to act as a linking point with the other institutional support services. The project is coordinated centrally by TU Delft Library, and it has its own website, blog and a YouTube channel. The [1]EPA metadata registry furnishes an example of data stewardship. Note that each data element therein has a "POC" (point of contact). In 2023, ETH Zurich launched the Data Stewardship Network (DSN) to facilitate collaboration among employees engaged in data management, analysis, and code development across research groups. The DSN serves as a platform for networking and knowledge exchange, aiming to professionalize the role of data stewards who support research data management and reproducible workflows. Established by the team for Research Data Management and Digital Curation at the ETH Library, the DSN collaborates with Scientific IT Services to provide expertise in areas such as storage infrastructure and reproducible workflows. == Data stewardship applications == Information stewardship applications are business solutions used by business users acting in the role of information steward (interpreting and enforcing information governance policy, for example). These developing solutions represent, for the most part, an amalgam of a number of disparate, previously IT-centric tools already on the market, but are organized and presented in such a way that information stewards (a business role) can support the work of information policy enforcement as part of their normal, business-centric, day-to-day work in a range of use cases. The initial push for the formation of this new category of packaged software came from operational use cases — that is, use of business data in and between transactional and operational business applications. This is where most of the master data management efforts are undertaken in organizations. However, there is also now a faster-growing interest in the new data lake arena for more analytical use cases.

    Read more →
  • Too Good To Go

    Too Good To Go

    Too Good To Go is a service with a mobile application that connects customers to restaurants and stores that have surplus unsold food. The service covers major European cities, and in October 2020 started operations in North America. As part of the initiatives taken on the International Day of Awareness of Food Loss and Waste to reduce food loss and waste, the app is suggested alongside OLIO among many others. In 2023 Too Good To Go was the fastest-growing sustainable food app startup by number of downloads. As of August 2023, it claimed 164,000 businesses, serving 62 million users, have saved 155 million bags of food. As of March 2023, it claimed to have saved over 200 million meals. == History == The company was created in 2015 in Denmark by Thomas Bjørn Momsen, Klaus Bagge Pedersen, Adam Sigbrand and Brian Christensen. In 2017, Mette Lykke (co-founder of Endomondo) joined as CEO. In February 2019, the company raised an additional 6 million euros in a new round of investment. In August 2019, Too Good To Go was re-launched in Austria. In September 2019, Too Good To Go acquired the Spanish startup weSAVEeat and merged it into its own brand. In November 2019, the offer of Too Good To Go extended to plants through a partnership with the French retail plants company Jardiland. In December 2019, Too Good To Go partnered with the French grocery retail stores Intermarché, and donated 60K euros to the French charity Restaurants du Cœur. In October 2021, Bonnie Wright teamed up with Too Good To Go to drive the initiative to reduce food waste. == Corporate affairs == The key trends for the Danish entity Too Good To Go ApS are (as of the financial year ending December 31): == International expansion == As of March 2026 the company serves the European countries Austria, Belgium, Czechia, Denmark, the Faroe Islands, France, Germany, Ireland, Italy, the Netherlands, Norway, Poland, Portugal, Spain, Sweden, Switzerland, the United Kingdom. Outside of Europe the service is available in Australia, Canada, Japan, New Zealand and the United States. == Purpose == The purpose of Too Good To Go is to reduce food waste worldwide. It developed a mobile application that connects restaurants and stores that have unsold, surplus food, with customers who can then buy whatever food the outlet considers surplus to requirements—without being able to choose—at a much lower price than normal. The food on the app is priced at one-third its original price. The company claims this reduces the waste of food that would otherwise be discarded; food waste is a global problem that affects the environment. In three years active, the app reached more than 9.5 million users. As of 2022, more than 57.7 million users and 154,000 establishments have signed up, and 139 million meals have been collected. In 2019, the company had 350 employees in Europe. As of June 2023 the company was estimated to have 1,289 employees. == Use == Food outlets must notify the TGTG company about what they have available on each day, stating what sort of food they have (baked foods, meals, produce, vegan food), and the price for a 'surprise bag', whose contents they determine; the user cannot choose, but the original prices will be three or more times the TGTG price. Notification is made early based upon the quantity predicted to be left over, not at the end of a selling period. Users must register to use the service. A mobile phone with an Internet connection running Android or iOS is needed. The user runs the TGTG app, which lists outlets available within a chosen distance and time range. The customer can then order and pay for a 'surprise bag'. The supplier can cancel an order at any time if the expected surplus is not available—the purchaser is notified by text message—and the purchaser can cancel with two hours' notice. The phone must be taken to the food supplier in a specified pickup time window, often 30 or 60 minutes long, and the transaction is finalised by swiping the app—connected to the Internet—to confirm collection.

    Read more →
  • Social media use by businesses

    Social media use by businesses

    Social media use by businesses includes a range of applications. Although social media accessed via desktop computers offer an online shopping variety of opportunities for companies in a wide range of business sectors, mobile social media, which users can access when they are "on the go" via tablet computers or smartphones, benefit companies because of the location- and time-sensitive awareness of their users. Mobile social media tools can be used for marketing research, communication, sales promotions/discounts, informal employee learning/organizational development, relationship development/loyalty programs, and e-commerce. Marketing research: Mobile social media applications provide companies data about offline consumer movements at a level of detail that was previously accessible to online companies only. These applications allow any business to know the exact time a customer who uses social media entered one of its locations, as well as know the social media comments made during the visit. Communication: Mobile social media communication takes two forms: company-to-consumer (in which a company may establish a connection to a consumer based on its location and provide reviews about locations nearby) and user-generated content. For example, McDonald's offered $5 and $10 gift-cards to 100 users randomly selected among those checking in at one of its restaurants. This promotion increased check-ins by 33% (from 2,146 to 2,865), resulted in over 50 articles and blog posts, and prompted several hundred thousand news feeds and Twitter messages. Sales promotions and discounts: Although customers have had to use printed coupons in the past, mobile social media allows companies to tailor promotions to specific users at specific times. For example, when launching its California-Cancun service, Virgin America offered users who checked in through Loopt at one of three designated taco trucks in San Francisco or Los Angeles between 11 a.m. and 3 p.m. on 31 August 2010, two tacos for $1 and two flights to Cancun or Cabo for the price of one. This special promotion was only available to people who were at a certain location at a certain time. Relationship development and loyalty programs: In order to increase long-term relationships with customers, companies can develop loyalty programs that allow customers who check-in via social media regularly at a location to earn discounts or perks. For example, American Eagle Outfitters remunerates such customers with a tiered 10%, 15%, or 20% discount on their total purchase. Informal employee learning/organizational development is facilitated by social media. Technologies such as blogs, wiki pages, web forums, social networks and other social media act as technology enhanced learning (TEL) tools, and their users perceive change in organizational structure, culture and knowledge management. The prerequisite for the successful use of social media are motivated employees who want to use the new technologies. It is central for companies to understand the factors that determine the willingness to use social media. Customer service and support: A company can gain cost savings and increase revenue and customer satisfaction by using social media platforms in customer service and support. By using social media tools, company's have easy and widescale contact to its customers and simultaneously increase their brand knowledge. E-commerce: Social media sites are increasingly implementing marketing-friendly strategies, creating platforms that are mutually beneficial for users, businesses, and the networks themselves in the popularity and accessibility of e-commerce, or online purchases. The user who posts their comments about a company's product or service benefits because they are able to share their views with their online friends and acquaintances. The company benefits because it obtains insight (positive or negative) about how their product or service is viewed by consumers. Mobile social media applications such as Amazon.com and Pinterest have started to influence an upward trend in the popularity and accessibility of e-commerce. E-commerce businesses may refer to social media as consumer-generated media (CGM). A common thread running through all definitions of social media is a blending of technology and social interaction for the co-creation of value for the business or organization that is using it. People obtain valuable information, education, news, and other data from electronic and print media. Social media are distinct from industrial and traditional media such as newspapers, magazines, television, and film as they are comparatively inexpensive marketing tools and are highly accessible. They enable anyone, including private individuals, to publish or access information easily. Industrial media generally require significant resources to publish information, and in most cases the articles go through many revisions before being published. This process adds to the cost and the resulting market price. Originally social media was only used by individuals, but now it is used by both businesses and nonprofit organizations and also in government and politics. One characteristic shared by both social and industrial media is the capability to reach small or large audiences; for example, either a blog post or a television show may reach no people or millions of people. Some of the properties that help describe the differences between social and industrial media are: Quality: In industrial (traditional) publishing—mediated by a publisher—the typical range of quality is substantially narrower (skewing to the high quality side) than in niche, unmediated markets like user-generated social media posts. The main challenge posed by the content in social media sites is the fact that the distribution of quality has high variance: from very high-quality items to low-quality, sometimes even abusive or inappropriate content. Reach: Both industrial and social media technologies provide scale and are capable of reaching a global audience. Industrial media, however, typically use a centralized framework for organization, production, and dissemination, whereas social media are by their very nature more decentralized, less hierarchical, and distinguished by multiple points of production and utility. Frequency: The number of times users access a type of media per day. Heavy social media users, such as young people, check their social media account numerous times throughout the day. Accessibility: The means of production for industrial media are typically government or corporate (privately owned); social media tools are generally available to the public at little or no cost, or they are supported by advertising revenue. While social media tools are available to anyone with access to Internet and a computer or mobile device, due to the digital divide, the poorest segment of the population lacks access to the Internet and computer. Low-income people may have more access to traditional media (TV, radio, etc.), as an inexpensive TV and aerial or radio costs much less than an inexpensive computer or mobile device. Moreover, in many regions, TV or radio owners can tune into free over the air programming; computer or mobile device owners need Internet access to go to social media sites. Usability: Industrial media production typically requires specialized skills and training. For example, in the 1970s, to record a pop song, an aspiring singer would have to rent time in an expensive professional recording studio and hire an audio engineer. Conversely, most social media activities, such as posting a video of oneself singing a song require only modest reinterpretation of existing skills (assuming a person understands Web 2.0 technologies); in theory, anyone with access to the Internet can operate the means of social media production, and post digital pictures, videos or text online. Immediacy: The time lag between communications produced by industrial media can be long (days, weeks, or even months, by the time the content has been reviewed by various editors and fact checkers) compared to social media (which can be capable of virtually instantaneous responses). The immediacy of social media can be seen as a strength, in that it enables regular people to instantly communicate their opinions and information. At the same time, the immediacy of social media can also be seen as a weakness, as the lack of fact checking and editorial "gatekeepers" facilitates the circulation of hoaxes and fake news. Permanence: Industrial media, once created, cannot be altered (e.g., once a magazine article or paper book is printed and distributed, changes cannot be made to that same article in that print run) whereas social media posts can be altered almost instantaneously, when the user decides to edit their post or due to comments from other readers. Community media constitute a hybrid of industrial and social media. Though community-owned, some community radio,

    Read more →
  • InfiniBand

    InfiniBand

    InfiniBand (IB) is a computer networking standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also used as either a direct or switched interconnect between servers and storage systems, as well as an interconnect between storage systems. It is designed to be scalable and uses a switched fabric network topology. Between 2014 and June 2016, it was the most commonly used interconnect in the TOP500 list of supercomputers. Mellanox (acquired by Nvidia) manufactures InfiniBand host bus adapters and network switches, which are used by large computer system and database vendors in their product lines. As a computer cluster interconnect, IB competes with Ethernet, Fibre Channel, and Intel Omni-Path. The technology is promoted by the InfiniBand Trade Association. == History == InfiniBand originated in 1999 from the merger of two competing designs: Future I/O and Next Generation I/O (NGIO). NGIO was led by Intel, with a specification released in 1998, and joined by Sun Microsystems and Dell. Future I/O was backed by Compaq, IBM, and Hewlett-Packard. This led to the formation of the InfiniBand Trade Association (IBTA), which included both sets of hardware vendors as well as software vendors such as Microsoft. At the time it was thought some of the more powerful computers were approaching the interconnect bottleneck of the PCI bus, in spite of upgrades like PCI-X. Version 1.0 of the InfiniBand Architecture Specification was released in 2000. Initially the IBTA vision for IB was simultaneously a replacement for PCI in I/O, Ethernet in the machine room, cluster interconnect and Fibre Channel. IBTA also envisaged decomposing server hardware on an IB fabric. Mellanox had been founded in 1999 to develop NGIO technology, but by 2001 shipped an InfiniBand product line called InfiniBridge at 10 Gbit/second speeds. Following the burst of the dot-com bubble there was hesitation in the industry to invest in such a far-reaching technology jump. By 2002, Intel announced that instead of shipping IB integrated circuits ("chips"), it would focus on developing PCI Express, and Microsoft discontinued IB development in favor of extending Ethernet. Sun Microsystems and Hitachi continued to support IB. In 2003, the System X supercomputer built at Virginia Tech used InfiniBand in what was estimated to be the third largest computer in the world at the time. The OpenIB Alliance (later renamed OpenFabrics Alliance) was founded in 2004 to develop an open set of software for the Linux kernel. By February, 2005, the support was accepted into the 2.6.11 Linux kernel. In November 2005 storage devices finally were released using InfiniBand from vendors such as Engenio. Cisco, desiring to keep technology superior to Ethernet off the market, adopted a "buy to kill" strategy. Cisco successfully killed InfiniBand switching companies such as Topspin via acquisition. Of the top 500 supercomputers in 2009, Gigabit Ethernet was the internal interconnect technology in 259 installations, compared with 181 using InfiniBand. In 2010, market leaders Mellanox and Voltaire merged, leaving just one other IB vendor, QLogic, primarily a Fibre Channel vendor. At the 2011 International Supercomputing Conference, links running at about 56 gigabits per second (known as FDR, see below), were announced and demonstrated by connecting booths in the trade show. In 2012, Intel acquired QLogic's InfiniBand technology, leaving only one independent supplier. By 2014, InfiniBand was the most popular internal connection technology for supercomputers, although within two years, 10 Gigabit Ethernet started displacing it. In 2016, it was reported that Oracle Corporation (an investor in Mellanox) might engineer its own InfiniBand hardware. In 2019 Nvidia acquired Mellanox, the last independent supplier of InfiniBand products. == Specification == Specifications are published by the InfiniBand trade association. === Performance === Original names for speeds were single-data rate (SDR), double-data rate (DDR) and quad-data rate (QDR) as given below. Subsequently, other three-letter initialisms were added for even higher data rates. Notes Each link is duplex. Links can be aggregated: most systems use a 4 link/lane connector (QSFP). HDR often makes use of 2x links (aka HDR100, 100 Gb link using 2 lanes of HDR, while still using a QSFP connector). NDR introduced OSFP connectors which host one or two links at 2x (NDR200) or 4x (NDR400). They are not logically configured as a single 8x link, even when connecting switches together with an OSFP cable. InfiniBand provides remote direct memory access (RDMA) capabilities for low CPU overhead. === Topology === InfiniBand uses a switched fabric topology, as opposed to early shared medium Ethernet. All transmissions begin or end at a channel adapter. Each processor contains a host channel adapter (HCA) and each peripheral has a target channel adapter (TCA). These adapters can also exchange information for security or quality of service (QoS). === Messages === InfiniBand transmits data in packets of up to 4 KB that are taken together to form a message. A message can be: a remote direct memory access read or write a channel send or receive a transaction-based operation (that can be reversed) a multicast transmission an atomic operation === Physical interconnection === In addition to a board form factor connection, it can use both active and passive copper (up to 10 meters) and optical fiber cable (up to 10 km). QSFP connectors are used. The InfiniBand Association also specified the CXP connector system for speeds up to 120 Gbit/s over copper, active optical cables, and optical transceivers using parallel multi-mode fiber cables with 24-fiber MPO connectors. === Software interfaces === Mellanox operating system support is available for Solaris, FreeBSD, Red Hat Enterprise Linux, SUSE Linux Enterprise Server (SLES), Windows, HP-UX, VMware ESX, and AIX. InfiniBand has no specific standard application programming interface (API). The standard only lists a set of verbs such as ibv_open_device or ibv_post_send, which are abstract representations of functions or methods that must exist. The syntax of these functions is left to the vendors. Sometimes for reference this is called the verbs API. The de facto standard software is developed by OpenFabrics Alliance and called the Open Fabrics Enterprise Distribution (OFED). It is released under two licenses GPL2 or BSD license for Linux and FreeBSD, and as Mellanox OFED for Windows (product names: WinOF / WinOF-2; attributed as host controller driver for matching specific ConnectX 3 to 5 devices) under a choice of BSD license for Windows. It has been adopted by most of the InfiniBand vendors, for Linux, FreeBSD, and Microsoft Windows. IBM refers to a software library called libibverbs, for its AIX operating system, as well as "AIX InfiniBand verbs". The Linux kernel support was integrated in 2005 into the kernel version 2.6.11. === Ethernet over InfiniBand === Ethernet over InfiniBand, abbreviated to EoIB, is an Ethernet implementation over the InfiniBand protocol and connector technology. EoIB enables multiple Ethernet bandwidths varying on the InfiniBand (IB) version. Ethernet's implementation of the Internet Protocol Suite, usually referred to as TCP/IP, is different in some details compared to the direct InfiniBand protocol in IP over IB (IPoIB).

    Read more →