Leslie P. Kaelbling

Leslie P. Kaelbling

Leslie Pack Kaelbling is an American roboticist and the Panasonic Professor of Computer Science and Engineering at the Massachusetts Institute of Technology. She is widely recognized for adapting partially observable Markov decision processes from operations research for application in artificial intelligence and robotics. Kaelbling received the IJCAI Computers and Thought Award in 1997 for applying reinforcement learning to embedded control systems and developing programming tools for robot navigation. In 2000, she was elected as a Fellow of the Association for the Advancement of Artificial Intelligence. == Career == Kaelbling received an A. B. in Philosophy in 1983 and a Ph.D. in Computer Science in 1990, both from Stanford University. During this time she was also affiliated with the Center for the Study of Language and Information. She then worked at SRI International and the affiliated robotics spin-off Teleos Research before joining the faculty at Brown University. She left Brown in 1999 to join the faculty at MIT. Her research focuses on decision-making under uncertainty, machine learning, and sensing with applications to robotics. == Journal of Machine Learning Research == In the spring of 2000, she and two-thirds of the editorial board of the Kluwer-owned journal Machine Learning resigned in protest to its pay-to-access archives with simultaneously limited financial compensation for authors. Kaelbling co-founded and served as the first editor-in-chief of the Journal of Machine Learning Research, a peer-reviewed open access journal on the same topics which allows researchers to publish articles for free and retain copyright with its archives freely available online. In response to the mass resignation, Kluwer changed their publishing policy to allow authors to self-archive their papers online after peer-review. Kaelbling responded that this policy was reasonable and would have made the creation of an alternative journal unnecessary, but the editorial board members had made it clear they wanted such a policy and it was only after the threat of resignations and the actual founding of JMLR that the publishing policy finally changed. == Selected works == Reinforcement Learning: A Survey (LP Kaelbling, ML Littman, AW Moore). Journal of Artificial Intelligence Research (JAIR) 4 (1996) 237-285. A highly cited survey on the field of reinforcement learning. Planning and acting in partially observable stochastic domains (LP Kaelbling, ML Littman, AR Cassandra). Artificial Intelligence 101 (1), 99-134. Acting under uncertainty: Discrete Bayesian models for mobile-robot navigation (AR Cassandra, LP Kaelbling, JA Kurien). Intelligent Robots and Systems (2) 963-972. The synthesis of digital machines with provable epistemic properties (SJ Rosenschein, LP Kaelbling). Proceedings of the 1986 Conference on Theoretical Aspects of Reasoning about Knowledge, 83-98. Practical reinforcement learning in continuous spaces (WD Smart, LP Kaelbling). 2000 International Conference on Machine Learning (ICML), 903-910. Hierarchical task and motion planning in the now (LP Kaelbling, T Lozano-Pérez). 2011 IEEE International Conference on Robotics and Automation (ICRA), 1470-1477.

Inductive bias

The inductive bias (also known as learning bias) of a learning algorithm is the set of assumptions that the learner uses to predict outputs of given inputs that it has not encountered. Inductive bias is anything which makes the algorithm learn one pattern instead of another pattern (e.g., step-functions in decision trees instead of continuous functions in linear regression models). Learning involves searching a space of solutions for a solution that provides a good explanation of the data. However, in many cases, there may be multiple equally appropriate solutions. An inductive bias allows a learning algorithm to prioritize one solution (or interpretation) over another, independently of the observed data. In machine learning, the aim is to construct algorithms that are able to learn to predict a certain target output. To achieve this, the learning algorithm is presented some training examples that demonstrate the intended relation of input and output values. Then the learner is supposed to approximate the correct output, even for examples that have not been shown during training. Without any additional assumptions, this problem cannot be solved since unseen situations might have an arbitrary output value. The kind of necessary assumptions about the nature of the target function are subsumed in the phrase inductive bias. A classical example of an inductive bias is Occam's razor, assuming that the simplest consistent hypothesis about the target function is actually the best. Here, consistent means that the hypothesis of the learner yields correct outputs for all of the examples that have been given to the algorithm. Approaches to a more formal definition of inductive bias are based on mathematical logic. Here, the inductive bias is a logical formula that, together with the training data, logically entails the hypothesis generated by the learner. However, this strict formalism fails in many practical cases in which the inductive bias can only be given as a rough description (e.g., in the case of artificial neural networks), or not at all. == Types == The following is a list of common inductive biases in machine learning algorithms. Maximum conditional independence: if the hypothesis can be cast in a Bayesian framework, try to maximize conditional independence. This is the bias used in the Naive Bayes classifier. Minimum cross-validation error: when trying to choose among hypotheses, select the hypothesis with the lowest cross-validation error. Although cross-validation may seem to be free of bias, the "no free lunch" theorems show that cross-validation must be biased, for example assuming that there is no information encoded in the ordering of the data. Maximum margin: when drawing a boundary between two classes, attempt to maximize the width of the boundary. This is the bias used in support vector machines. The assumption is that distinct classes tend to be separated by wide boundaries. Minimum description length: when forming a hypothesis, attempt to minimize the length of the description of the hypothesis. Minimum features: unless there is good evidence that a feature is useful, it should be deleted. This is the assumption behind feature selection algorithms. Nearest neighbors: assume that most of the cases in a small neighborhood in feature space belong to the same class. Given a case for which the class is unknown, guess that it belongs to the same class as the majority in its immediate neighborhood. This is the bias used in the k-nearest neighbors algorithm. The assumption is that cases that are near each other tend to belong to the same class. == Shift of bias == Although most learning algorithms have a static bias, some algorithms are designed to shift their bias as they acquire more data. This does not avoid bias, since the bias shifting process itself must have a bias.

Human image synthesis

Human image synthesis is technology that can be applied to make believable and even photorealistic renditions of human-likenesses, moving or still. It has effectively existed since the early 2000s. Many films using computer generated imagery have featured synthetic images of human-like characters digitally composited onto the real or other simulated film material. Towards the end of the 2010s deep learning artificial intelligence has been applied to synthesize images and video that look like humans, without need for human assistance, once the training phase has been completed, whereas the old school 7D-route required massive amounts of human work. == Timeline of human image synthesis == In 1971 Henri Gouraud made the first CG geometry capture and representation of a human face. Modeling was his wife Sylvie Gouraud. The 3D model was a simple wire-frame model and he applied the Gouraud shader he is most known for to produce the first known representation of human-likeness on computer. The 1972 short film A Computer Animated Hand by Edwin Catmull and Fred Parke was the first time that computer-generated imagery was used in film to simulate moving human appearance. The film featured a computer simulated hand and face (watch film here). The 1976 film Futureworld reused parts of A Computer Animated Hand on the big screen. The 1983 music video for song Musique Non-Stop by German band Kraftwerk aired in 1986. Created by the artist Rebecca Allen, it features non-realistic looking, but clearly recognizable computer simulations of the band members. The 1994 film The Crow was the first film production to make use of digital compositing of a computer simulated representation of a face onto scenes filmed using a body double. Necessity was the muse as the actor Brandon Lee portraying the protagonist was tragically killed accidentally on-stage. In 1999 Paul Debevec et al. of USC captured the reflectance field of a human face with their first version of a light stage. They presented their method at the SIGGRAPH 2000 In 2003 audience debut of photo realistic human-likenesses in the 2003 films The Matrix Reloaded in the burly brawl sequence where up-to-100 Agent Smiths fight Neo and in The Matrix Revolutions where at the start of the end showdown Agent Smith's cheekbone gets punched in by Neo leaving the digital look-alike unnaturally unhurt. The Matrix Revolutions bonus DVD documents and depicts the process in some detail and the techniques used, including facial motion capture and limbal motion capture, and projection onto models. In 2003 The Animatrix: Final Flight of the Osiris a state-of-the-art want-to-be human likenesses not quite fooling the watcher made by Square Pictures. In 2003 digital likeness of Tobey Maguire was made for movies Spider-man 2 and Spider-man 3 by Sony Pictures Imageworks. In 2005 the Face of the Future project was an established. by the University of St Andrews and Perception Lab, funded by the EPSRC. The website contains a "Face Transformer", which enables users to transform their face into any ethnicity and age as well as the ability to transform their face into a painting (in the style of either Sandro Botticelli or Amedeo Modigliani). This process is achieved by combining the user's photograph with an average face. In 2009 Debevec et al. presented new digital likenesses, made by Image Metrics, this time of actress Emily O'Brien whose reflectance was captured with the USC light stage 5 Motion looks fairly convincing contrasted to the clunky run in the Animatrix: Final Flight of the Osiris which was state-of-the-art in 2003 if photorealism was the intention of the animators. In 2009 a digital look-alike of a younger Arnold Schwarzenegger was made for the movie Terminator Salvation though the end result was critiqued as unconvincing. Facial geometry was acquired from a 1984 mold of Schwarzenegger. In 2010 Walt Disney Pictures released a sci-fi sequel entitled Tron: Legacy with a digitally rejuvenated digital look-alike of actor Jeff Bridges playing the antagonist CLU. In SIGGGRAPH 2013 Activision and USC presented a real-time "Digital Ira" a digital face look-alike of Ari Shapiro, an ICT USC research scientist, utilizing the USC light stage X by Ghosh et al. for both reflectance field and motion capture. The end result both precomputed and real-time rendering with the modernest game GPU shown here and looks fairly realistic. In 2014 The Presidential Portrait by USC Institute for Creative Technologies in conjunction with the Smithsonian Institution was made using the latest USC mobile light stage wherein President Barack Obama had his geometry, textures and reflectance captured. In 2014 Ian Goodfellow et al. presented the principles of a generative adversarial network. GANs made the headlines in early 2018 with the deepfakes controversies. For the 2015 film Furious 7 a digital look-alike of actor Paul Walker who died in an accident during the filming was done by Weta Digital to enable the completion of the film. In 2016 techniques which allow near real-time counterfeiting of facial expressions in existing 2D video have been believably demonstrated. In 2016 a digital look-alike of Peter Cushing was made for the Rogue One film where its appearance would appear to be of same age as the actor was during the filming of the original 1977 Star Wars film. In SIGGRAPH 2017 an audio driven digital look-alike of upper torso of Barack Obama was presented by researchers from University of Washington. It was driven only by a voice track as source data for the animation after the training phase to acquire lip sync and wider facial information from training material consisting 2D videos with audio had been completed. Late 2017 and early 2018 saw the surfacing of the deepfakes controversy where porn videos were doctored using deep machine learning so that the face of the actress was replaced by the software's opinion of what another persons face would look like in the same pose and lighting. In 2018 Game Developers Conference Epic Games and Tencent Games demonstrated "Siren", a digital look-alike of the actress Bingjie Jiang. It was made possible with the following technologies: CubicMotion's computer vision system, 3Lateral's facial rigging system and Vicon's motion capture system. The demonstration ran in near real time at 60 frames per second in the Unreal Engine 4. In 2018 at the World Internet Conference in Wuzhen the Xinhua News Agency presented two digital look-alikes made to the resemblance of its real news anchors Qiu Hao (Chinese language) and Zhang Zhao (English language). The digital look-alikes were made in conjunction with Sogou. Neither the speech synthesis used nor the gesturing of the digital look-alike anchors were good enough to deceive the watcher to mistake them for real humans imaged with a TV camera. In September 2018 Google added "involuntary synthetic pornographic imagery" to its ban list, allowing anyone to request the search engine block results that falsely depict them as "nude or in a sexually explicit situation." In February 2019 Nvidia open sources StyleGAN, a novel generative adversarial network. Right after this Phillip Wang made the website ThisPersonDoesNotExist.com with StyleGAN to demonstrate that unlimited amounts of often photo-realistic looking facial portraits of no-one can be made automatically using a GAN. Nvidia's StyleGAN was presented in a not yet peer reviewed paper in late 2018. At the June 2019 CVPR the MIT CSAIL presented a system titled "Speech2Face: Learning the Face Behind a Voice" that synthesizes likely faces based on just a recording of a voice. It was trained with massive amounts of video of people speaking. Since 1 July 2019 Virginia has criminalized the sale and dissemination of unauthorized synthetic pornography, but not the manufacture., as § 18.2–386.2 titled 'Unlawful dissemination or sale of images of another; penalty.' became part of the Code of Virginia. The law text states: "Any person who, with the intent to coerce, harass, or intimidate, maliciously disseminates or sells any videographic or still image created by any means whatsoever that depicts another person who is totally nude, or in a state of undress so as to expose the genitals, pubic area, buttocks, or female breast, where such person knows or has reason to know that he is not licensed or authorized to disseminate or sell such videographic or still image is guilty of a Class 1 misdemeanor.". The identical bills were House Bill 2678 presented by Delegate Marcus Simon to the Virginia House of Delegates on 14 January 2019 and three-day later an identical Senate bill 1736 was introduced to the Senate of Virginia by Senator Adam Ebbin. Since 1 September 2019 Texas senate bill SB 751 amendments to the election code came into effect, giving candidates in elections a 30-day protection period to the elections during which making and distributing digital look-alikes or synthetic fakes of the candidates is an offense. Th

T-vertices

T-vertices is a term used in computer graphics to describe a problem that can occur during mesh refinement or mesh simplification. The most common case occurs in naive implementations of continuous level of detail, where a finer-level mesh is "sewn" together with a coarser-level mesh by simply aligning the finer vertices on the edges of the coarse polygons. The result is a continuous mesh, however due to the nature of the z-buffer and certain lighting algorithms such as Gouraud shading, visual artifacts can often be detected. Some modeling algorithms such as subdivision surfaces will fail when a model contains T-vertices.

Data administration

Data administration or data resource management is an organizational function working in the areas of information systems and computer science that plans, organizes, describes and controls data resources. Data resources are usually stored in databases under a database management system or other software such as electronic spreadsheets. In many smaller organizations, data administration is performed occasionally, or is a small component of the database administrator’s work. In the context of information systems development, data administration ideally begins at system conception, ensuring there is a data dictionary to help maintain consistency, avoid redundancy, and model the database so as to make it logical and usable, by means of data modeling, including database normalization techniques. == Data resource management == According to the Data Management Association (DAMA), data resource management is "the development and execution of architectures, policies, practices and procedures that properly manage the full data lifecycle needs of an enterprise". Data Resource management may be thought of as a managerial activity that applies information system and other data management tools to the task of managing an organization’s data resource to meet a company’s business needs, and the information they provide to their shareholders. From the perspective of database design, it refers to the development and maintenance of data models to facilitate data sharing between different systems, particularly in a corporate context. Data Resource Management is also concerned with both data quality and compatibility between data models. Since the beginning of the information age, businesses need all types of data on their business activity. With each data created, when a business transaction is made, need data is created. With these data, new direction is needed that focuses on managing data as a critical resource of the organization to directly support its business activities. The data resource must be managed with the same intensity and formality that other critical resources are managed. Organizations must emphasize the information aspect of information technology, determine the data needed to support the business, and then use appropriate technology to build and maintain a high-quality data resource that provides that support. Data resource quality is a measure of how well the organization's data resource supports the current and the future business information demand of the organization. The data resource cannot support just the current business information demand while sacrificing the future business information demand. It must support both the current and the future business information demand. The ultimate data resource quality is stability across changing business needs and changing technology. A corporate data resource must be developed within single, organization-wide common data architecture. A data architecture is the science and method of designing and constructing a data resource that is business driven, based on real-world objects and events as perceived by the organization, and implemented into appropriate operating environments. It is the overall structure of a data resource that provides a consistent foundation across organizational boundaries to provide easily identifiable, readily available, high-quality data to support the business information demand. The common data architecture is a formal, comprehensive data architecture that provides a common context within which all data at an organization's disposal are understood and integrated. It is subject oriented, meaning that it is built from data subjects that represent business objects and business events in the real world that are of interest to the organization and about which data are captured and maintained.

Data-centric AI

Data-centric AI is an approach within artificial intelligence that emphasizes on improving the quality, consistency and representativeness of the data used to train machine learning models, rather than focusing primarily on optimizing model architectures or algorithms. This idea has gained traction as researchers and practitioners have come to believe that many performance limitations of machine learning systems stem from issues such as noisy labels, biased datasets, and lack of coverage in the data. Data-centric AI involves disciplined approach to data cleaning, augmentation, labeling, and governance that improves model performance and reliability in applications such as computer vision, natural language processing, and further.

Starlight Information Visualization System

Starlight is a software product originally developed at Pacific Northwest National Laboratory and now by Future Point Systems. It is an advanced visual analysis environment. In addition to using information visualization to show the importance of individual pieces of data by showing how they relate to one another, it also contains a small suite of tools useful for collaboration and data sharing, as well as data conversion, processing, augmentation and loading. The software, originally developed for the intelligence community, allows users to load data from XML files, databases, RSS feeds, web services, HTML files, Microsoft Word, PowerPoint, Excel, CSV, Adobe PDF, TXT files, etc. and analyze it with a variety of visualizations and tools. The system integrates structured, unstructured, geospatial, and multimedia data, offering comparisons of information at multiple levels of abstraction, simultaneously and in near real-time. In addition Starlight allows users to build their own named entity-extractors using a combination of algorithms, targeted normalization lists and regular expressions in the Starlight Data Engineer (SDE). As an example, Starlight might be used to look for correlations in a database containing records about chemical spills. An analyst could begin by grouping records according to the cause of the spill to reveal general trends. Sorting the data a second time, they could apply different colors based on related details such as the company responsible, age of equipment or geographic location. Maps and photographs could be integrated into the display, making it even easier to recognize connections among multiple variables. Starlight has been deployed to both the Iraq and Afghanistan wars and used on a number of large-scale projects. PNNL began developing Starlight in the mid-1990s, with funding from the Land Information Warfare Agency, a part of the Army Intelligence and Security Command and continued developed at the laboratory with funding from the NSA and the CIA. Starlight integrates visual representations of reports, radio transcripts, radar signals, maps and other information. The software system was recently honored with an R&D 100 Award for technical innovation. In 2006 Future Point Systems, a Silicon Valley startup, acquired rights to jointly develop and distribute the Starlight product in cooperation with the Pacific Northwest National Laboratory. The software is now also used outside of the military/intelligence communities in a number of commercial environments.