Representational harm

Systems cause representational harm when they misrepresent a group of people in a negative manner. Representational harms include perpetuating harmful stereotypes about or minimizing the existence of a social group, such as a racial, ethnic, gender, or religious group. Machine learning algorithms often commit representational harm when they learn patterns from biased data, and this has been shown to be the case with large language models. While preventing representational harm in models is essential to prevent harmful biases, researchers often lack precise definitions of representational harm and conflate it with allocative harm, an unequal distribution of resources among social groups, which is more widely studied and easier to measure. However, recognition of representational harms is growing and preventing them has become an active research area. Researchers have recently developed methods to quantify representational harm in algorithms, making progress on preventing this harm in the future. == Types == Three prominent types of representational harm are stereotyping, denigration, and misrecognition. These subcategories present many dangers to individuals and groups. Stereotypes are oversimplified and usually undesirable representations of a specific group of people, often defined by race or gender. Stereotyping often leads to the denial of educational, employment, housing, and other opportunities. For example, the model minority stereotype of Asian Americans as highly intelligent and good at mathematics can be damaging professionally and academically. Representational harm occurs when the representation of particular groups reinforces damaging stereotypes, fostering social exclusion and prejudice. This phenomenon is particularly noticeable in the depiction of marginalised groups, including people of color, women, LGBTQ+ people, and people with disabilities.
Media depictions of these groups often fail to capture their diversity and complexity. Instead, they are typically reduced to one-dimensional caricatures, which ultimately perpetuate social prejudices. These systematic depictions contribute to the reinforcement of harmful stereotypes and the marginalisation of these communities. Denigration is the action of unfairly criticizing individuals. It frequently takes the form of demeaning social groups. For example, when searching for "Black-sounding" names versus "white-sounding" ones, some retrieval systems bolster the false perception of criminality by displaying ads for bail-bonding businesses. A system may shift the representation of a group to be of lower social status, often resulting in disregard from society. Research shows that harmful depictions in the media can have substantial psychological and social impacts on both individuals and communities. Lawrence Bobo examined the issue of racial stereotyping in film, television, and advertising. African Americans are commonly given roles defined by traits such as "violent tendencies" or "laziness," or roles that exist merely for entertainment. While these representations might appear varied on the surface, they continue to reinforce underlying structures of white dominance and racial inequality. For instance, Black individuals are frequently portrayed as criminals or in subordinate roles, which contributes to the reinforcement of racial stereotypes and institutional racism. Misrecognition, or incorrect recognition, can appear in many forms, including, but not limited to, erasing and alienating social groups, and denying people the right to self-identify. Erasing and alienating social groups involves the unequal visibility of certain social groups; specifically, systematic ineligibility in algorithmic systems perpetuates inequality by contributing to the underrepresentation of social groups.
Not allowing people to self-identify is closely related, as people's identities can be 'erased' or 'alienated' in these algorithms. Misrecognition causes more than surface-level harm to individuals: psychological harm, social isolation, and emotional insecurity can emerge from this subcategory of representational harm. == Quantification == As the dangers of representational harm have become better understood, some researchers have developed methods to measure representational harm in algorithms. Modeling stereotyping is one way to identify representational harm. Representational stereotyping can be quantified by comparing the predicted outcomes for one social group with the ground-truth outcomes for that group observed in real data. For example, if individuals from group A achieve an outcome with a probability of 60%, stereotyping would be observed if a model predicted that individuals from group A achieve that outcome with a probability greater than 60%. The researchers behind this approach modeled stereotyping in the context of classification, regression, and clustering problems, and developed a set of rules to quantitatively determine whether the model predictions exhibit stereotyping in each of these cases. Other attempts to measure representational harms have focused on applications of algorithms in specific domains such as image captioning, in which an algorithm generates a short description of an image. In a study on image captioning, researchers measured five types of representational harm. To quantify stereotyping, they counted the incorrect words included in the model-generated image caption when compared to a gold-standard caption. They manually reviewed each of the incorrectly included words, determining whether the incorrect word reflected a stereotype associated with the image or whether it was an unrelated error, which gave them a proxy measure of the amount of stereotyping occurring in caption generation.
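The word-level comparison described for the image-captioning study can be sketched as follows. This is a minimal illustration, not the researchers' actual pipeline: the function name `extra_words`, the simple whitespace tokenization, and the example captions are all assumptions made for this sketch. It only surfaces candidate words; deciding whether a word reflects a stereotype or an unrelated error remains a manual review step.

```python
import string

def extra_words(generated: str, gold: str) -> list[str]:
    """Return words in the generated caption that are absent from the gold caption.

    Tokenization is a lowercase whitespace split with punctuation stripped
    (an assumption made for this sketch).
    """
    def tokenize(text: str) -> list[str]:
        table = str.maketrans("", "", string.punctuation)
        return text.lower().translate(table).split()

    gold_vocab = set(tokenize(gold))
    seen: set[str] = set()
    extras: list[str] = []
    for word in tokenize(generated):
        # Keep each unexpected word once, in order of first appearance.
        if word not in gold_vocab and word not in seen:
            seen.add(word)
            extras.append(word)
    return extras

# Hypothetical captions, invented for illustration.
gold_caption = "A person standing in a kitchen."
model_caption = "A woman cooking in a kitchen."

# Candidate words that a human reviewer would then label as
# "stereotype" or "unrelated error".
candidates = extra_words(model_caption, gold_caption)
print(candidates)
```

A set difference on tokens is deliberately crude; it ignores synonyms and word order, which is one reason the manual review step matters.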
These researchers also attempted to measure demeaning representational harm. To measure this, they analyzed the frequency with which humans in the image were mentioned in the generated caption. It was hypothesized that if the individuals were not mentioned in the caption, then this was a form of dehumanization. == Examples == One of the most notorious examples of representational harm was committed by Google in 2015 when an algorithm in Google Photos classified Black people as gorillas. Developers at Google said that the problem arose because there were not enough faces of Black people in the training dataset for the algorithm to learn the difference between Black people and gorillas. Google issued an apology and fixed the issue by blocking its algorithms from classifying anything as a primate. In 2023, the Google Photos algorithm was still blocked from identifying gorillas in photos. Another prevalent example of representational harm is the possibility of stereotypes being encoded in word embeddings, which are trained using a wide range of text. A word embedding represents a word as an array of numbers in vector space, which makes it possible to calculate the relationships and similarities between words. However, recent studies have shown that these word embeddings may commonly encode harmful stereotypes, such as the common example that the phrase "computer programmer" is often more closely related to "man" than it is to "woman" in vector space. This could be interpreted as a misrepresentation of computer programming as a profession that is better performed by men, which would be an example of representational harm. == Addressing representational harm == Efforts to minimise representational harm include advocating for more inclusive and accurate portrayals of marginalised groups in the media.
Scholars and activists argue that the key to reducing representational harm lies in increasing the diversity of voices both behind and in front of the camera. When marginalised groups are given the chance to represent themselves, they can challenge traditional stereotypes and present their experiences more authentically. Over the last few years, efforts to increase representation of people of color, women, and LGBTQ+ people in mainstream media have made some progress. Films such as Selma, directed by Ava DuVernay, and television series like Pose, created by Ryan Murphy, have been widely praised for their nuanced and respectful representations of marginalised communities. These works present complex characters and stories that move beyond simplistic stereotypes. Self-representation is another key approach to addressing representational harm. By empowering marginalised communities to tell their own stories, media creators can effectively reduce the perpetuation of harmful stereotypes. This process involves both the production of media content by members of these communities and actively challenging traditional media structures that have historically excluded them.
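Returning to the Quantification section, the comparison between a group's predicted outcome rate and its observed rate can be sketched in a few lines. This is a rough illustration under stated assumptions, not the published method: the function name `stereotyping_gap`, the 0/1 encoding of outcomes, and the toy data are all hypothetical.

```python
def stereotyping_gap(predicted: list[int], observed: list[int]) -> float:
    """Predicted outcome rate minus observed outcome rate for one group.

    Both lists hold 0/1 outcomes for members of the same group.
    A positive gap means the model assigns the outcome to the group
    more often than the observed data supports.
    """
    if not predicted or not observed:
        raise ValueError("both outcome lists must be non-empty")
    predicted_rate = sum(predicted) / len(predicted)
    observed_rate = sum(observed) / len(observed)
    return predicted_rate - observed_rate

# Toy data: 6 of 10 group members actually achieve the outcome (60%),
# while a hypothetical model predicts it for 8 of 10 (80%).
observed = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]

gap = stereotyping_gap(predicted, observed)
print(f"gap = {gap:+.2f}")
```

A single rate difference is only the simplest case; the classification, regression, and clustering settings mentioned in the text would each need their own rule.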

StyleGAN

The Style Generative Adversarial Network, or StyleGAN for short, is an extension to the GAN architecture introduced by Nvidia researchers in December 2018, with the source code made available in February 2019. StyleGAN depends on Nvidia's CUDA software, GPUs, and either Google's TensorFlow or Meta AI's PyTorch, which supersedes TensorFlow as the official implementation library in later StyleGAN versions. The second version of StyleGAN, called StyleGAN2, was published on February 5, 2020. It removes some of the characteristic artifacts and improves the image quality. Nvidia introduced StyleGAN3, described as an "alias-free" version, on June 23, 2021, and made the source code available on October 12, 2021. == History == A direct predecessor of the StyleGAN series is the Progressive GAN, published in 2017. In December 2018, Nvidia researchers distributed a preprint with accompanying software introducing StyleGAN, a GAN for producing an unlimited number of (often convincing) portraits of fake human faces. StyleGAN was able to run on Nvidia's commodity GPU processors. In February 2019, Uber engineer Phillip Wang used the software to create the website This Person Does Not Exist, which displayed a new face on each web page reload. Wang expressed amazement that, even though humans have evolved to be especially attuned to human faces, StyleGAN can nevertheless "pick apart all the relevant features (of human faces) and recompose them in a way that's coherent." In September 2019, a website called Generated Photos published 100,000 images as a collection of stock photos. The collection was made using a private dataset shot in a controlled environment with similar light and angles. Similarly, two faculty members at the University of Washington's Information School used StyleGAN to create Which Face is Real?, which challenged visitors to differentiate between a fake and a real face side by side.
The faculty stated the intention was to "educate the public" about the existence of this technology so they could be wary of it, "just like eventually most people were made aware that you can Photoshop an image". The second version of StyleGAN, called StyleGAN2, was published on February 5, 2020. It removes some of the characteristic artifacts and improves the image quality. In 2021, a third version was released, improving consistency between fine and coarse details in the generator. Dubbed "alias-free", this version was implemented with PyTorch. === Illicit use === In December 2019, Facebook took down a network of accounts with false identities, and mentioned that some of them had used profile pictures created with machine learning techniques. == Architecture == === Progressive GAN === Progressive GAN is a method for stably training a GAN for large-scale image generation, by growing a GAN generator from small to large scale in a pyramidal fashion. Like SinGAN, it decomposes the generator as $G = G_1 \circ G_2 \circ \cdots \circ G_N$, and the discriminator as $D = D_N \circ D_{N-1} \circ \cdots \circ D_1$. During training, at first only $G_N, D_N$ are used in a GAN game to generate 4x4 images. Then $G_{N-1}, D_{N-1}$ are added to reach the second stage of the GAN game, to generate 8x8 images, and so on, until we reach a GAN game to generate 1024x1024 images. To avoid discontinuity between stages of the GAN game, each new layer is "blended in" (Figure 2 of the paper). For example, this is how the second stage GAN game starts: Just before, the GAN game consists of the pair $G_N, D_N$ generating and discriminating 4x4 images.
Just after, the GAN game consists of the pair $((1-\alpha) + \alpha \cdot G_{N-1}) \circ u \circ G_N,\ D_N \circ d \circ ((1-\alpha) + \alpha \cdot D_{N-1})$ generating and discriminating 8x8 images. Here, the functions $u, d$ are image up- and down-sampling functions, and $\alpha$ is a blend-in factor (much like an alpha in image compositing) that smoothly glides from 0 to 1. === StyleGAN === StyleGAN is designed as a combination of Progressive GAN with neural style transfer. The key architectural choice of StyleGAN-1 is a progressive growth mechanism, similar to Progressive GAN. Each generated image starts as a constant $4 \times 4 \times 512$ array, and is repeatedly passed through style blocks. Each style block applies a "style latent vector" via an affine transform ("adaptive instance normalization"), similar to how neural style transfer uses the Gram matrix. It then adds noise and normalizes (subtracts the mean, then divides by the standard deviation). At training time, usually only one style latent vector is used per image generated, but sometimes two ("mixing regularization") in order to encourage each style block to independently perform its stylization without expecting help from other style blocks (since they might receive an entirely different style latent vector). After training, multiple style latent vectors can be fed into each style block. Those fed to the lower layers control the large-scale styles, and those fed to the higher layers control the fine-detail styles. Style-mixing between two images $x, x'$ can be performed as well. First, run a gradient descent to find $z, z'$ such that $G(z) \approx x$ and $G(z') \approx x'$. This is called "projecting an image back to style latent space".
Then, $z$ can be fed to the lower style blocks, and $z'$ to the higher style blocks, to generate a composite image that has the large-scale style of $x$ and the fine-detail style of $x'$. Multiple images can also be composed this way. === StyleGAN2 === StyleGAN2 improves upon StyleGAN in two ways. One, it applies the style latent vector to transform the convolution layer's weights instead, thus solving the "blob" problem. Roughly speaking, the "blob" problem arises because using the style latent vector to normalize the generated image destroys useful information. Consequently, the generator learned to create a "distraction" in the form of a large blob, which absorbs most of the effect of normalization (somewhat similar to using flares to distract a heat-seeking missile). Two, it uses residual connections, which help it avoid the phenomenon where certain features are stuck at intervals of pixels. For example, the seam between two teeth may be stuck at pixels divisible by 32, because the generator learned to generate teeth during stage N-5, and consequently could only generate primitive teeth at that stage, before scaling up 5 times (thus intervals of 32). This was updated by StyleGAN2-ADA ("ADA" stands for "adaptive discriminator augmentation"), which uses invertible data augmentation. It also tunes the amount of data augmentation applied by starting at zero, and gradually increasing it until an "overfitting heuristic" reaches a target level, thus the name "adaptive". === StyleGAN3 === StyleGAN3 improves upon StyleGAN2 by solving the "texture sticking" problem, which can be seen in the official videos. The authors analyzed the problem using the Nyquist–Shannon sampling theorem, and argued that the layers in the generator learned to exploit the high-frequency signal in the pixels they operate upon.
To solve this, they proposed imposing strict lowpass filters between the generator's layers, so that the generator is forced to operate on the pixels in a way faithful to the continuous signals they represent, rather than operate on them as merely discrete signals. They further imposed rotational and translational equivariance by using more signal filters. The resulting StyleGAN3 is able to generate images that rotate and translate smoothly, and without texture sticking.
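The "blend-in" step from the Progressive GAN section can be sketched numerically. The snippet below is an illustration only: it assumes nearest-neighbour upsampling for $u$ and uses a stand-in function `new_block` in place of a trained layer $G_{N-1}$. Neither choice is taken from the official implementation.

```python
import numpy as np

def upsample(img: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling of an (H, W) array (stand-in for u)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def new_block(img: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the newly added layer G_{N-1}."""
    return np.tanh(img) + 0.1

def blended_output(low_res: np.ndarray, alpha: float) -> np.ndarray:
    """Apply ((1 - alpha) + alpha * G_{N-1}) to the upsampled low-res image."""
    up = upsample(low_res)
    return (1.0 - alpha) * up + alpha * new_block(up)

rng = np.random.default_rng(0)
low_res = rng.standard_normal((4, 4))        # plays the role of G_N's 4x4 output

start = blended_output(low_res, alpha=0.0)   # new layer contributes nothing yet
middle = blended_output(low_res, alpha=0.5)  # halfway through the blend
end = blended_output(low_res, alpha=1.0)     # new layer fully active

print(start.shape)
print(np.allclose(start, upsample(low_res)))
print(np.allclose(end, new_block(upsample(low_res))))
```

At `alpha=0` the 8x8 output is just the upsampled 4x4 image, so the second stage starts from exactly what the first stage produced; the discriminator side follows the same pattern with downsampling.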

Kindwise

FlowerChecker, also known as Kindwise, is a company that uses machine learning to identify natural objects from images. This includes plants and their diseases, but also insects and mushrooms. It is based in Brno, Czech Republic. It was founded in 2014 by Ondřej Veselý, Jiří Řihák, and Ondřej Vild, who were Ph.D. students at the time. == Features & Tools == FlowerChecker offers multiple products. Plant.id is a machine learning-based plant identification API launched in 2018; its plant disease identification counterpart, plant.health, was released in April 2022. The plant.id API is suitable for integration into other software, such as mobile apps or tools that identify urban trees from remote-sensing imagery. Other products, insect.id, mushroom.id and crop.health, are machine learning-based APIs for the identification of insects, fungi and economically important plants, respectively; they also include public online demos. The FlowerChecker app was discontinued in October 2024 after 10 years of operation. == Recognition == In 2019, FlowerChecker won the Idea of the Year award in the AI Awards organized by the Confederation of Industry of the Czech Republic. In 2020, an academic study comparing ten free automated image recognition apps showed that plant.id outperformed the others in most of the parameters studied. In an independent study comparing different image-based species recognition models and their suitability for recognizing invasive alien species, plant.id achieved the highest accuracy among the tools compared. In a subsequent study, plant.id was used to evaluate urban forest biodiversity from remote-sensing imagery, achieving the highest accuracy in tree species identification among the compared methods. The technology has also been referenced as an example of practical integration of AI-based plant identification into cross-platform precision agriculture systems.
== Research activities == FlowerChecker cooperates with the Nature Conservation Agency of the Czech Republic on a biodiversity mapping project. FlowerChecker plans to adapt its services to participate in the control of invasive species. In 2022, the company entered a consortium to develop a weeder capable of in-row weed detection and removal. In 2025, it received funding for the development of a technology for the removal of invasive species.

Knowledge processing for robots

KnowRob (Knowledge processing for robots) is a system that combines knowledge representation and reasoning methods to acquire and ground knowledge. It is the backbone of openEASE; both are under development at the Institute for Artificial Intelligence at the University of Bremen, Germany. == The framework == KnowRob can serve as a common-sense framework for the integration of knowledge. This knowledge can be static encyclopedic knowledge, common-sense knowledge, task descriptions, environment models, object information, observed actions, etc., and can come from different sources: manually axiomatized, derived from observations, or imported from the web. KnowRob has been used by different research groups, such as one at Rice University, which used the ontological knowledge base in a robotic platform, and the Eindhoven University of Technology group competing in the "at Home" category of the RoboCup league with the RoboEarth project. KnowRob is also mentioned in the work of research groups from the Lucian Blaga University of Sibiu; Middle East Technical University, in their combination of different knowledge bases; Keio University, as related work because of the ontology service; the University of Texas at Austin, as related work; and Hanyang University, as related work on OWL-based knowledge processing frameworks. == Representations == To represent knowledge, KnowRob uses the OWL ontology language and an extended first-order logic knowledge representation with computable predicates. To specify the order of subactions, KnowRob includes pair-wise ordering constraints, which give a partial ordering. KnowRob adopts the closed-world assumption of Prolog, and an open-world assumption through the use of computables. Using reasoning rules written in Prolog, KnowRob applies an inference procedure beyond the capabilities of OWL to extract information about task executions.
In its second version, KnowRob provides a logic interface to the hybrid reasoning kernel in the form of a logic-based language. This language presents the hybrid reasoning kernel as if everything were entities retrievable by providing partial descriptions of them. These entity descriptions cover objects, their parts and articulation models, environments composed of objects, software components, actions, and events. === Episodic memories === Episodic memory relates to experience information, which is organized temporally and spatially and combined with context information. In KnowRob, an episodic memory is understood as a recording that the agent makes of the ongoing activity. It includes very detailed information about the actions, motions, their purposes, effects and the behavior they generate, as well as the images captured during execution. == Usage == The knowledge is computed by external methods using Prolog queries. The second version of the KnowRob system has better-structured packages and documentation, and includes some extensions to the previous version as well as a logic-based language. For example, a cup description from perception can be represented in this language as: entity(Cup,[an, object, [type, cup], [shape, cylinder], [color, orange]]) A controller could represent the same object as: entity(Cup, [an, object, [type, cup], [proper_physical_parts, [an, object, [type, handle], [grasp−pose, G−pose]]]]) The interface language is comparable to other query languages for symbolic knowledge bases. KnowRob's query language integrates reasoning methods, such as simulation-based reasoning. == Goals == The goal of the KnowRob framework is to make semantic knowledge available for service robots. It is able to answer queries about missing information in vague instructions for tasks.
This is made possible by the hierarchical representation of actions and by information about the objects that can be involved in a given action.
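How pair-wise ordering constraints yield a partial order over subactions, as mentioned under Representations, can be illustrated outside KnowRob. The sketch below uses Python's standard `graphlib` module rather than KnowRob's OWL and Prolog machinery, and the task and subaction names are invented for illustration.

```python
from graphlib import TopologicalSorter

# Each pair (before, after) states only that `before` must precede `after`.
# Hypothetical subactions of a "serve a drink" task.
ordering_constraints = [
    ("open_cupboard", "grasp_cup"),
    ("grasp_cup", "place_cup_on_table"),
    ("grasp_bottle", "pour_drink"),
    ("place_cup_on_table", "pour_drink"),
]

# TopologicalSorter expects a mapping: node -> set of its predecessors.
predecessors: dict[str, set[str]] = {}
for before, after in ordering_constraints:
    predecessors.setdefault(after, set()).add(before)
    predecessors.setdefault(before, set())

# Any topological order is one valid execution sequence. Subactions that no
# constraint relates (e.g. grasp_bottle and open_cupboard) may appear in
# either order, which is what makes the ordering partial rather than total.
plan = list(TopologicalSorter(predecessors).static_order())
print(plan)
```

If the constraints contained a cycle, `static_order()` would raise `graphlib.CycleError`, signalling that no consistent execution order exists.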

Alec Radford

Alec Radford is an American artificial intelligence researcher. == Biography == Radford grew up in Texas. He graduated from Cistercian Preparatory School in 2011, where he became an Eagle Scout, and dropped out of Olin College in August 2014, where he and fellow students Slater Victoroff, Diana Yuan, and Madison May had formed the startup Indico in their dorm room. In 2015, the quartet were joined by Luke Metz, and the firm and the Facebook AI research lab in New York used generative adversarial networks to create realistic low-resolution images. A demonstration of Indico's technology was used without proper attribution in an April 2016 demonstration by Nvidia chief executive Jensen Huang. Radford joined OpenAI around 2016, where he worked on natural-language processing. The following year, Radford trained a neural network on Amazon reviews. The model was fairly basic, with layers that allowed for human understanding. Upon exploring it, he saw that it had a special neuron linked to the sentiment of the reviews, which it had created on its own. This was a drastic improvement over previous neural networks that had analysed sentiment, because they had to be told to do so and specially trained on data that was explicitly labeled according to sentiment. This development led OpenAI chief scientist Ilya Sutskever to consider that a future model, using more diverse language data, could map far more structures of meaning, eventually becoming a "learned core module" for superintelligence. In 2018, Radford was the lead author on OpenAI's seminal research paper on generative pre-trained transformers, which form the foundation of ChatGPT. At OpenAI, he worked on early GPT models, the speech recognition model Whisper, and the image generator DALL-E. He left OpenAI in December 2024 to pursue independent research. Around March 2025, Radford joined Thinking Machines Lab as an advisor. He joined along with Bob McGrew, who was previously the chief research officer of OpenAI.
In April 2026, Radford, Nick Levine, and David Duvenaud released Talkie, an AI model trained on books, newspapers, scientific journals, patents, and case law published before December 31, 1930. When asked about the state of the world in 2026, it stated that one billion people would live in Europe, that London and New York would be connected by steamships that transit between the two in ten days, and "winter will be passed in Paris, and the summer in London."

Signal transfer function

The signal transfer function (SiTF) is a measure of the signal output versus the signal input of a system such as an infrared system or sensor. There are many general applications of the SiTF. Specifically, in the field of image analysis, it gives a measure of the noise of an imaging system, and thus yields one assessment of its performance. == SiTF evaluation == In evaluating the SiTF curve, the signal input and signal output are measured differentially; that is, the differential of the input signal and the differential of the output signal are calculated and plotted against each other. An operator, using computer software, defines an arbitrary area, with a given set of data points, within the signal and background regions of the output image of the infrared sensor, i.e. of the unit under test (UUT) (see "Half Moon" image below). The average signal and background are calculated by averaging the data of each arbitrarily defined region. A second-order polynomial curve is fitted to the data of each line. Then, the polynomial is subtracted from the average signal and background data to yield the new signal and background. The difference of the new signal and background data is taken to yield the net signal. Finally, the net signal is plotted versus the signal input. The signal input of the UUT is expressed within its own spectral response (e.g. color-correlated temperature, pixel intensity, etc.). The slope of the linear portion of this curve is then found using the method of least squares. == SiTF curve == The net signal is calculated from the average signal and background, as in the calculation of the signal-to-noise ratio in imaging. The SiTF curve is then given by the signal output data (net signal data) plotted against the signal input data (see graph of SiTF to the right). All the data points in the linear region of the SiTF curve can be used in the method of least squares to find a linear approximation.
Given $n$ data points $(x_i, y_i)$, a best-fit line parameterized as $y = mx + b$ is given by:

$$m = \frac{\dfrac{\sum x_i y_i}{n} - \dfrac{\sum x_i}{n}\,\dfrac{\sum y_i}{n}}{\dfrac{\sum x_i^2}{n} - \left(\dfrac{\sum x_i}{n}\right)^2}, \qquad b = \frac{\sum y_i}{n} - m\,\frac{\sum x_i}{n}$$
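The slope and intercept formulas above translate directly into code. The following is a minimal sketch; the function name `fit_line` and the sample input and net-signal values are made up for illustration and are not measurements from any real sensor.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Least-squares slope m and intercept b for the line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    mean_xx = sum(x * x for x in xs) / n
    denom = mean_xx - mean_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = (mean_xy - mean_x * mean_y) / denom
    b = mean_y - m * mean_x
    return m, b

# Hypothetical points from the linear region of an SiTF curve:
# signal input (x) versus net signal output (y), lying exactly on y = 2x + 1.
signal_input = [1.0, 2.0, 3.0, 4.0]
net_signal = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(signal_input, net_signal)
print(slope, intercept)
```

Here the slope plays the role of the SiTF value for the linear region; only points from that region should be passed in, since including saturated points would bias the fit.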

Google Brain

Google Brain was a deep learning artificial intelligence research team that served as the sole AI branch of Google before being incorporated under the newer umbrella of Google AI, a research division at Google dedicated to artificial intelligence. Formed in 2011, it combined open-ended machine learning research with information systems and large-scale computing resources. It created tools such as TensorFlow, which allow neural networks to be used by the public, and multiple internal AI research projects, and aimed to create research opportunities in machine learning and natural language processing. It was merged into former Google sister company DeepMind to form Google DeepMind in April 2023. == History == The Google Brain project began in 2011 as a part-time research collaboration between Google fellow Jeff Dean and Google Researcher Greg Corrado. Google Brain started as a Google X project and became so successful that it was graduated back to Google: Astro Teller has said that Google Brain paid for the entire cost of Google X. In June 2012, The New York Times reported that a cluster of 16,000 processors in 1,000 computers dedicated to mimicking some aspects of human brain activity had successfully trained itself to recognize a cat based on 10 million digital images taken from YouTube videos. The story was also covered by National Public Radio (NPR). In March 2013, Google hired Geoffrey Hinton, a leading researcher in the deep learning field, and acquired the company DNNResearch Inc. headed by Hinton. Hinton said that he would be dividing his future time between his university research and his work at Google. In April 2023, Google Brain merged with Google sister company DeepMind to form Google DeepMind, as part of the company's continued efforts to accelerate work on AI. == Team and location == Google Brain was initially established by Google Fellow Jeff Dean and visiting Stanford professor Andrew Ng. In 2014, the team included Jeff Dean, Quoc V. 
Le, Ilya Sutskever, Alex Krizhevsky, Samy Bengio, and Vincent Vanhoucke. In 2017, team members included Anelia Angelova, Samy Bengio, Greg Corrado, George Dahl, Michael Isard, Anjuli Kannan, Hugo Larochelle, Chris Olah, Benoit Steiner, Vincent Vanhoucke, Vijay Vasudevan, and Fernanda Viegas. Chris Lattner, who created Apple's programming language Swift and then ran Tesla's autonomy team for six months, joined Google Brain's team in August 2017. Lattner left the team in January 2020 and joined SiFive. As of 2021, Google Brain was led by Jeff Dean, Geoffrey Hinton, and Zoubin Ghahramani. Other members included Katherine Heller, Pi-Chuan Chang, Ian Simon, Jean-Philippe Vert, Nevena Lazic, Anelia Angelova, Lukasz Kaiser, Carrie Jun Cai, Eric Breck, Ruoming Pang, Carlos Riquelme, Hugo Larochelle, and David Ha. Samy Bengio left the team in April 2021, and Zoubin Ghahramani took on his responsibilities. Google Research includes Google Brain and is based in Mountain View. It also has satellite groups in Accra, Amsterdam, Atlanta, Beijing, Berlin, Cambridge, Israel, Los Angeles, London, Montreal, Munich, New York City, Paris, Pittsburgh, Princeton, San Francisco, Seattle, Tokyo, Toronto, and Zurich. == Projects == === Artificial-intelligence-devised encryption system === In October 2016, Google Brain designed an experiment to determine whether neural networks are capable of learning secure symmetric encryption. In this experiment, three neural networks were created: Alice, Bob and Eve. Following the idea of a generative adversarial network (GAN), the goal of the experiment was for Alice to send an encrypted message to Bob that Bob could decrypt, but the adversary, Eve, could not. Alice and Bob maintained an advantage over Eve, in that they shared a key used for encryption and decryption. In doing so, Google Brain demonstrated the capability of neural networks to learn secure encryption.
=== Image enhancement === In February 2017, Google Brain developed a probabilistic method for converting pictures with 8x8 resolution to a resolution of 32x32. The method built upon an existing probabilistic model called PixelCNN to generate pixel translations. The proposed software uses two neural networks to approximate the pixel makeup of translated images. The first network, known as the "conditioning network," downsizes high-resolution images to 8x8 and attempts to create mappings from the original 8x8 image to these higher-resolution ones. The other network, known as the "prior network," uses the mappings from the previous network to add more detail to the original image. The resulting translated image is not the same image in higher resolution, but rather a 32x32 resolution estimation based on other existing high-resolution images. Google Brain's results indicate that neural networks can be used to enhance images. === Google Translate === Google Brain contributed to the Google Translate project by employing a new deep learning system that combines artificial neural networks with vast databases of multilingual texts. In September 2016, Google Neural Machine Translation (GNMT), an end-to-end learning framework able to learn from a large number of examples, was launched. Previously, Google Translate's Phrase-Based Machine Translation (PBMT) approach would statistically analyze word by word and try to match corresponding words in other languages without considering the surrounding phrases in the sentence. Rather than choosing a replacement for each individual word in the desired language, GNMT evaluates word segments in the context of the rest of the sentence to choose more accurate replacements. Compared to older PBMT models, the GNMT model scored a 24% improvement in similarity to human translation, with a 60% reduction in errors.
GNMT has also shown significant improvement for notoriously difficult translations, like Chinese to English. While the introduction of GNMT increased the quality of Google Translate's translations for the pilot languages, it was very difficult to create such improvements for all of its 103 languages. Addressing this problem, the Google Brain Team was able to develop a Multilingual GNMT system, which extended the previous one by enabling translations between multiple languages. Furthermore, it allows for Zero-Shot Translations, which are translations between two languages that the system has never explicitly seen before. Google announced that Google Translate can now also translate without transcribing, using neural networks. This means that it is possible to translate speech in one language directly into text in another language, without first transcribing it to text. According to researchers at Google Brain, this intermediate step can be avoided using neural networks. In order for the system to learn this, they exposed it to many hours of Spanish audio together with the corresponding English text. The different layers of the neural networks, loosely modeled on the human brain, were able to link the corresponding parts and subsequently manipulate the audio waveform until it was transformed to English text. Another drawback of the GNMT model is that translation time increases exponentially with the number of words in the sentence. This caused the Google Brain Team to add 2000 more processors to ensure the new translation process would still be fast and reliable. === Robotics === Aiming to improve traditional robotics control algorithms where new skills of a robot need to be hand-programmed, robotics researchers at Google Brain are developing machine learning techniques to allow robots to learn new skills on their own.
They also attempt to develop ways for robots to share information so that they can learn from each other during the learning process, an approach known as cloud robotics. As a result, Google launched the Google Cloud Robotics Platform for developers in 2019, an effort to combine robotics, AI, and the cloud to enable efficient robotic automation through cloud-connected collaborative robots. Robotics research at Google Brain has focused mostly on improving and applying deep learning algorithms to enable robots to complete tasks by learning from experience, simulation, human demonstrations, or visual representations. For example, Google Brain researchers showed that robots can learn to pick and throw rigid objects into selected boxes by experimenting in an environment without being pre-programmed to do so. In another study, researchers trained robots to learn behaviors such as pouring liquid from a cup; the robots learned from videos of human demonstrations recorded from multiple viewpoints. Google Brain researchers have collaborated with other companies and academic institutions on robotics research. In 2016, the Google Brain Team collaborated with researchers at X on research into learning hand-eye coordination for robotic grasping. Their method allowed real-time robot control for grasping novel objec