Key frame

Key frame

In animation and filmmaking, a key frame (or keyframe) is a drawing or shot that defines the starting and ending points of a smooth transition. These are called frames because their position in time is measured in frames on a strip of film or on a digital video editing timeline. A sequence of key frames defines which movement the viewer will see, whereas the position of the key frames on the film, video, or animation defines the timing of the movement. Because only two or three key frames over the span of a second do not create the illusion of movement, the remaining frames are filled with "inbetweens". == Use of key frames as a means to change parameters == In software packages that support animation, especially 3D graphics, there are many parameters that can be changed for any one object. One example of such an object is a light. In 3D graphics, lights function similarly to real-world lights. They cause illumination, cast shadows, and create specular highlights. Lights have many parameters, including light intensity, beam size, light color, and the texture cast by the light. Supposing that an animator wants the beam size to change smoothly from one value to another within a predefined period of time, that could be achieved by using key frames. At the start of the animation, a beam size value is set. Another value is set for the end of the animation. Thus, the software program automatically interpolates the two values, creating a smooth transition. == Video editing == In non-linear digital video editing, as well as in video compositing software, a key frame is a frame used to indicate the beginning or end of a change made to a parameter. For example, a key frame could be set to indicate the point at which audio will have faded up or down to a certain level. == Video compression == In video compression, a key frame, also known as an intra-frame, is a frame in which a complete image is stored in the data stream. In video compression, only changes that occur from one frame to the next are stored in the data stream, in order to greatly reduce the amount of information that must be stored. This technique capitalizes on the fact that most video sources (such as a typical movie) have only small changes in the image from one frame to the next. Whenever a drastic change to the image occurs, such as when switching from one camera shot to another or at a scene change, a key frame must be created. The entire image for the frame must be output when the visual difference between the two frames is so great that representing the new image incrementally from the previous frame would require more data than recreating the whole image. Because video compression only stores incremental changes between frames (except for key frames), it is not possible to fast-forward or rewind to any arbitrary spot in the video stream. That is because the data for a given frame only represents how that frame was different from the preceding one. For that reason, it is beneficial to include key frames at arbitrary intervals while encoding video. For example, a key frame may be output once for each 10 seconds of video, even though the video image does not change enough visually to warrant the automatic creation of the key frame. That would allow seeking within the video stream at a minimum of 10-second intervals. The downside is that the resulting video stream will be larger in disk size because many key frames are added when they are not necessary for the frame's visual representation. This drawback, however, does not produce significant compression loss when the bitrate is already set at a high value for better quality (as in the DVD MPEG-2 format).

Hard sigmoid

In artificial intelligence, especially computer vision and artificial neural networks, a hard sigmoid is non-smooth function used in place of a sigmoid function. These retain the basic shape of a sigmoid, rising from 0 to 1, but using simpler functions, especially piecewise linear functions or piecewise constant functions. These are preferred where speed of computation is more important than precision. == Examples == The most extreme examples are the sign function or Heaviside step function, which go from −1 to 1 or 0 to 1 (which to use depends on normalization) at 0. Other examples include the Theano library, which provides two approximations: ultra_fast_sigmoid, which is a multi-part piecewise approximation and hard_sigmoid, which is a 3-part piecewise linear approximation (output 0, line with slope 0.2, output 1).

Neural architecture search

Neural architecture search (NAS) is a technique for automating the design of artificial neural networks (ANN), a widely used model in the field of machine learning. NAS has been used to design networks that are on par with or outperform hand-designed architectures. Methods for NAS can be categorized according to the search space, search strategy and performance estimation strategy used: The search space defines the type(s) of ANN that can be designed and optimized. The search strategy defines the approach used to explore the search space. The performance estimation strategy evaluates the performance of a possible ANN from its design (without constructing and training it). NAS is closely related to hyperparameter optimization and meta-learning and is a subfield of automated machine learning (AutoML). == Reinforcement learning == Reinforcement learning (RL) can underpin a NAS search strategy. Barret Zoph and Quoc Viet Le applied NAS with RL targeting the CIFAR-10 dataset and achieved a network architecture that rivals the best manually-designed architecture for accuracy, with an error rate of 3.65, 0.09 percent better and 1.05x faster than a related hand-designed model. On the Penn Treebank dataset, that model composed a recurrent cell that outperforms LSTM, reaching a test set perplexity of 62.4, or 3.6 perplexity better than the prior leading system. On the PTB character language modeling task it achieved bits per character of 1.214. Learning a model architecture directly on a large dataset can be a lengthy process. NASNet addressed this issue by transferring a building block designed for a small dataset to a larger dataset. The design was constrained to use two types of convolutional cells to return feature maps that serve two main functions when convoluting an input feature map: normal cells that return maps of the same extent (height and width) and reduction cells in which the returned feature map height and width is reduced by a factor of two. For the reduction cell, the initial operation applied to the cell's inputs uses a stride of two (to reduce the height and width). The learned aspect of the design included elements such as which lower layer(s) each higher layer took as input, the transformations applied at that layer and to merge multiple outputs at each layer. In the studied example, the best convolutional layer (or "cell") was designed for the CIFAR-10 dataset and then applied to the ImageNet dataset by stacking copies of this cell, each with its own parameters. The approach yielded accuracy of 82.7% top-1 and 96.2% top-5. This exceeded the best human-invented architectures at a cost of 9 billion fewer FLOPS—a reduction of 28%. The system continued to exceed the manually-designed alternative at varying computation levels. The image features learned from image classification can be transferred to other computer vision problems. E.g., for object detection, the learned cells integrated with the Faster-RCNN framework improved performance by 4.0% on the COCO dataset. In the so-called Efficient Neural Architecture Search (ENAS), a controller discovers architectures by learning to search for an optimal subgraph within a large graph. The controller is trained with policy gradient to select a subgraph that maximizes the validation set's expected reward. The model corresponding to the subgraph is trained to minimize a canonical cross entropy loss. Multiple child models share parameters, ENAS requires fewer GPU-hours than other approaches and 1000-fold less than "standard" NAS. On CIFAR-10, the ENAS design achieved a test error of 2.89%, comparable to NASNet. On Penn Treebank, the ENAS design reached test perplexity of 55.8. == Evolution == An alternative approach to NAS is based on evolutionary algorithms, which has been employed by several groups. An Evolutionary Algorithm for Neural Architecture Search generally performs the following procedure. First a pool consisting of different candidate architectures along with their validation scores (fitness) is initialised. At each step the architectures in the candidate pool are mutated (e.g.: 3x3 convolution instead of a 5x5 convolution). Next the new architectures are trained from scratch for a few epochs and their validation scores are obtained. This is followed by replacing the lowest scoring architectures in the candidate pool with the better, newer architectures. This procedure is repeated multiple times and thus the candidate pool is refined over time. Mutations in the context of evolving ANNs are operations such as adding or removing a layer, which include changing the type of a layer (e.g., from convolution to pooling), changing the hyperparameters of a layer, or changing the training hyperparameters. On CIFAR-10 and ImageNet, evolution and RL performed comparably, while both slightly outperformed random search. == Bayesian optimization == Bayesian Optimization (BO), which has proven to be an efficient method for hyperparameter optimization, can also be applied to NAS. In this context, the objective function maps an architecture to its validation error after being trained for a number of epochs. At each iteration, BO uses a surrogate to model this objective function based on previously obtained architectures and their validation errors. One then chooses the next architecture to evaluate by maximizing an acquisition function, such as expected improvement, which provides a balance between exploration and exploitation. Acquisition function maximization and objective function evaluation are often computationally expensive for NAS, and make the application of BO challenging in this context. Recently, BANANAS has achieved promising results in this direction by introducing a high-performing instantiation of BO coupled to a neural predictor. == Hill-climbing == Another group used a hill climbing procedure that applies network morphisms, followed by short cosine-annealing optimization runs. The approach yielded competitive results, requiring resources on the same order of magnitude as training a single network. E.g., on CIFAR-10, the method designed and trained a network with an error rate below 5% in 12 hours on a single GPU. == Multi-objective search == While most approaches solely focus on finding architecture with maximal predictive performance, for most practical applications other objectives are relevant, such as memory consumption, model size or inference time (i.e., the time required to obtain a prediction). Because of that, researchers created a multi-objective search. LEMONADE is an evolutionary algorithm that adopted Lamarckism to efficiently optimize multiple objectives. In every generation, child networks are generated to improve the Pareto frontier with respect to the current population of ANNs. Neural Architect is claimed to be a resource-aware multi-objective RL-based NAS with network embedding and performance prediction. Network embedding encodes an existing network to a trainable embedding vector. Based on the embedding, a controller network generates transformations of the target network. A multi-objective reward function considers network accuracy, computational resource and training time. The reward is predicted by multiple performance simulation networks that are pre-trained or co-trained with the controller network. The controller network is trained via policy gradient. Following a modification, the resulting candidate network is evaluated by both an accuracy network and a training time network. The results are combined by a reward engine that passes its output back to the controller network. == One-shot models == RL or evolution-based NAS require thousands of GPU-days of searching/training to achieve state-of-the-art computer vision results as described in the NASNet, mNASNet and MobileNetV3 papers. To reduce computational cost, many recent NAS methods rely on the weight-sharing idea. In this approach, a single overparameterized supernetwork (also known as the one-shot model) is defined. A supernetwork is a very large Directed Acyclic Graph (DAG) whose subgraphs are different candidate neural networks. Thus, in a supernetwork, the weights are shared among a large number of different sub-architectures that have edges in common, each of which is considered as a path within the supernet. The essential idea is to train one supernetwork that spans many options for the final design rather than generating and training thousands of networks independently. In addition to the learned parameters, a set of architecture parameters are learnt to depict preference for one module over another. Such methods reduce the required computational resources to only a few GPU days. More recent works further combine this weight-sharing paradigm, with a continuous relaxation of the search space, which enables the use of gradient-based optimization methods. These approaches are generally referred to as differentiable NAS and have proven very efficient in exploring the search space of ne

With Folded Hands ...

"With Folded Hands ..." is a 1947 science fiction novelette by American writer Jack Williamson (1908–2006). In writing it, Williamson was influenced by the aftermath of World War II, the atomic bombings of Hiroshima and Nagasaki, and his concern that "some of the technological creations we had developed with the best intentions might have disastrous consequences in the long run." The novelette first appeared in the July 1947 issue of Astounding Science Fiction and was later included in The Science Fiction Hall of Fame, Volume Two (1973) after being voted one of the best novellas up to 1965. In 1950, it was the first of several Astounding stories adapted for NBC's radio series Dimension X. == Rewrite and sequel == The 1947 publication was followed by a novel-length rewrite, with a different setting and inventor. At the behest of Astounding editor-in-chief John W. Campbell, a new ending had the robots defeated by means of what Williamson and Campbell would later christen "psionics". This novel was serialized, also in Astounding (March, April, May 1948), as ... And Searching Mind, and finally published in hardback book form as The Humanoids (1949). Much later, in 1980, Williamson followed with another sequel, The Humanoid Touch. == Plot summary == Underhill, a seller of "Mechanicals" (unthinking robots that perform menial tasks) in the small town of Two Rivers, is startled to find a competitor's store on his way home. The competitors are not humans but are small black robots who appear more advanced than anything Underhill has encountered before. They describe themselves as "humanoids". Disturbed at his encounter, Underhill rushes home to discover that his wife has taken in a new lodger, a mysterious old man named Sledge. In the course of the next day, the new Mechanicals have appeared everywhere in town. They state that they only follow the Prime Directive: "to serve and obey and guard men from harm". Offering their services free of charge, they replace humans as police officers, bank tellers, and more, and eventually drive Underhill out of business. Despite the humanoids' benign appearance and mission, Underhill soon realizes that, in the name of their Prime Directive, the mechanicals have essentially taken over every aspect of human life. No humans may engage in any behavior that might endanger them, and every human action is carefully scrutinized. Suicide is prohibited. Humans who resist the Prime Directive are taken away and lobotomized, so that they may live happily under the direction of the humanoids. Underhill learns that his lodger Sledge is the creator of the humanoids and is on the run from them. Sledge explains that 60 years earlier he had discovered the force of "rhodomagnetics" on the planet Wing IV and that his discovery resulted in a war that destroyed his planet. In his grief, Sledge designed the humanoids to help humanity and be invulnerable to human exploitation. However, he eventually realized that they had instead taken control of humanity, in the name of their Prime Directive, to make humans happy. The humanoids are spreading out from Wing IV to every human-occupied planet to implement their Prime Directive. Sledge and Underhill attempt to stop the humanoids by aiming a rhodomagnetic beam at Wing IV, but fail. The humanoids take Sledge away for surgery. He returns with no memory of his prior life, stating that he is now happy under the humanoids' care. Underhill is driven home by the humanoids, sitting "with folded hands," as there is nothing left to do. == Origins == In a 1991 interview, Williamson revealed how the story construction reflected events of his childhood in addition to technological extrapolations: I wrote "With Folded Hands" immediately after World War II, when the shadow of the atomic bomb had just fallen over SF and was just beginning to haunt the imaginations of people in the US. The story grows out of that general feeling that some of the technological creations we had developed with the best intentions might have disastrous consequences in the long run (that idea, of course, still seems relevant today). The notion I was consciously working on specifically came out of a fragment of a story I had worked on for a while about an astronaut in space who is accompanied by a robot obviously superior to him physically—i.e., the robot wasn't hurt by gravity, extremes of temperature, radiation, or whatever. Just looking at the fragment gave me the sense of how inferior humanity is in many ways to mechanical creations. That basic recognition was the essence of the story, and as I wrote it up in my notes the theme was that the perfect machine would prove to be perfectly destructive... It was only when I looked back at the story much later on that I was able to realize that the emotional reach of the story undoubtedly derived from my own early childhood, when people were attempting to protect me from all those hazardous things a kid is going to encounter in the isolated frontier setting I grew up in. As a result, I felt frustrated and over protected by people whom I couldn't hate because I loved them. A sort of psychological trap. Specifically, the first three years of my life were spent on a ranch at the top of the Sierra Madre Mountains on the headwaters of the Yaqui River in Sonora, Mexico. ... [My mother] was terrified by this environment. My father built a crib that became a psychological prison for me, particularly because my mother apparently kept me in it too long, when I needed to get out and crawl on the floor. ... In retrospect, I'm certain I projected my fears and suspicions of this kind of conditioning, and these projections became the governing emotional principle of "With Folded Hands" and The Humanoids. == Reception == In 2024, Robert Silverberg wrote an essay in which he asserted that "With Folded Hands..." is "probably the best story ever written about robots" and suggested that Elon Musk's Optimus Generation 2 is the realization of the "humanoids" along with their worst drawbacks.

Algorithmic accountability

Algorithmic accountability refers to the allocation of responsibility for the consequences of real-world actions influenced by algorithms used in decision-making processes. Ideally, algorithms should be designed to eliminate bias from their decision-making outcomes. This means they ought to evaluate only relevant characteristics of the input data, avoiding distinctions based on attributes that are generally inappropriate in social contexts, such as an individual's ethnicity in legal judgments. However, adherence to this principle is not always guaranteed, and there are instances where individuals may be adversely affected by algorithmic decisions. Responsibility for any harm resulting from a machine's decision may lie with the algorithm itself or with the individuals who designed it, particularly if the decision resulted from bias or flawed data analysis inherent in the algorithm's design. == Algorithm usage == Algorithms are widely utilized across various sectors of society that incorporate computational techniques in their control systems. These applications span numerous industries, including but not limited to medical, transportation, and payment services. In these contexts, algorithms perform functions such as: Approving or denying credit card applications; Approving or denying immigrant visas; Determining which taxpayers will be audited on their income taxes; Managing systems that control self-driving cars on a highway; Scoring individuals as potential criminals for use in legal proceedings; Search engines that match and rank database and internet search results; Recommendation systems that filter which news, entertainment, or purchase items are featured in a feed; Market-making algorithms that match sellers and buyers, such as in transportation (ride-hailing) or financial platforms. However, the implementation of these algorithms can be complex and opaque. Generally, algorithms function as "black boxes," meaning that the specific processes an input undergoes during execution are often not transparent, with users typically only seeing the resulting output. This lack of transparency raises concerns about potential biases within the algorithms, as the parameters influencing decision-making may not be well understood. The outputs generated can lead to perceptions of bias, especially if individuals in similar circumstances receive different results. According to Nicholas Diakopoulos: But these algorithms can make mistakes. They have biases. Yet they sit in opaque black boxes, their inner workings, their inner “thoughts” hidden behind layers of complexity. We need to get inside that black box, to understand how they may be exerting power on us, and to understand where they might be making unjust mistakes == Wisconsin Supreme Court case == Algorithms are prevalent across various fields and significantly influence decisions that affect the population at large. Their underlying structures and parameters often remain unknown to those impacted by their outcomes. A notable case illustrating this issue is a recent ruling by the Wisconsin Supreme Court concerning "risk assessment" algorithms used in criminal justice. The court determined that scores generated by such algorithms, which analyze multiple parameters from individuals, should not be used as a determining factor for arresting an accused individual. Furthermore, the court mandated that all reports submitted to judges must include information regarding the accuracy of the algorithm used to compute these scores. This ruling is regarded as a noteworthy development in how society should manage software that makes consequential decisions, highlighting the importance of reliability, particularly in complex settings like the legal system. The use of algorithms in these contexts necessitates a high degree of impartiality in processing input data. However, experts note that there is still considerable work to be done to ensure the accuracy of algorithmic results. Questions about the transparency of data processing continue to arise, which raises issues regarding the appropriateness of the algorithms and the intentions of their designers. == Controversies == A notable instance of potential algorithmic bias is highlighted in an article by The Washington Post regarding the ride-hailing service Uber. An analysis of collected data revealed that estimated waiting times for users varied based on the neighborhoods in which they resided. Key factors influencing these discrepancies included the predominant ethnicity and average income of the area. Specifically, neighborhoods with a majority white population and higher economic status tended to have shorter waiting times, while those with more diverse ethnic compositions and lower average incomes experienced longer waits. It’s important to clarify that this observation reflects a correlation identified in the data, rather than a definitive cause-and-effect relationship. No value judgments are made regarding the behavior of the Uber app in these cases. In TechCrunch website, Hemant Taneja wrote: Concern about “black box” algorithms that govern our lives has been spreading. New York University’s Information Law Institute hosted a conference on algorithmic accountability, noting: “Scholars, stakeholders, and policymakers question the adequacy of existing mechanisms governing algorithmic decision-making and grapple with new challenges presented by the rise of algorithmic power in terms of transparency, fairness, and equal treatment.” Yale Law School’s Information Society Project is studying this, too. “Algorithmic modeling may be biased or limited, and the uses of algorithms are still opaque in many critical sectors,” the group concluded. == Possible solutions == Discussions among experts have sought viable solutions to understand the operations of algorithms, often referred to as "black boxes." It is generally proposed that companies responsible for developing and implementing these algorithms should ensure their reliability by disclosing the internal processes of their systems. Hemant Taneja, writing for TechCrunch, emphasizes that major technology companies, such as Google, Amazon, and Uber, must actively incorporate algorithmic accountability into their operations. He suggests that these companies should transparently monitor their own systems to avoid stringent regulatory measures. One potential approach is the introduction of regulations in the tech sector to enforce oversight of algorithmic processes. However, such regulations could significantly impact software developers and the industry as a whole. It may be more beneficial for companies to voluntarily disclose the details of their algorithms and decision-making parameters, which could enhance the trustworthiness of their solutions. Another avenue discussed is the possibility of self-regulation by the companies that create these algorithms, allowing them to take proactive steps in ensuring accountability and transparency in their operations. In TechCrunch website, Hemant Taneja wrote: There’s another benefit — perhaps a huge one — to software-defined regulation. It will also show us a path to a more efficient government. The world’s legal logic and regulations can be coded into software and smart sensors can offer real-time monitoring of everything from air and water quality, traffic flows and queues at the DMV. Regulators define the rules, technologist create the software to implement them and then AI and ML help refine iterations of policies going forward. This should lead to much more efficient, effective governments at the local, national and global levels.

Corpus of Linguistic Acceptability

Corpus of Linguistic Acceptability (CoLA) is a dataset the primary purpose of which is to serve as a benchmark for evaluating the ability of artificial neural networks, including large language models, to judge the grammatical correctness of sentences. It consists of 10,657 English sentences from published linguistics literature that were manually labeled either as grammatical or ungrammatical. == Public version == The publicly available version of CoLA contains 9,594 sentences that belong to training and development sets. It excludes 1,063 sentences reserved for a held-out test set.

A.I.s

A.I.s is a themed anthology of science fiction short works edited by American writers Jack Dann and Gardner Dozois. It was first published in paperback by Ace Books in December 2004. It was reissued as an ebook by Baen Books in June 2013. The book collects ten novelettes and short stories by various science fiction authors, together with a preface by the editors. == Contents == "Preface" (Jack Dann and Gardner Dozois) "Antibodies" (Charles Stross) "Trojan Horse" (Michael Swanwick) "Birth Day" (Robert Reed) "The Hydrogen Wall" (Gregory Benford) "The Turing Test" (Chris Beckett) "Dante Dreams" (Stephen Baxter) "The Names of All the Spirits" (J. R. Dunn) "From the Corner of My Eye" (Alexander Glass) "Halfjack" (Roger Zelazny) "Computer Virus" (Nancy Kress)