AI Coding Tools

Explore the best AI Coding Tools — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • LanguageWare

    LanguageWare is a natural language processing (NLP) technology developed by IBM, which allows applications to process natural language text. It comprises a set of Java libraries that provide a range of NLP functions: language identification, text segmentation/tokenization, normalization, entity and relationship extraction, and semantic analysis and disambiguation. The analysis engine uses a finite-state machine approach at multiple levels, which aids its performance characteristics while maintaining a reasonably small footprint. The behaviour of the system is driven by a set of configurable lexico-semantic resources which describe the characteristics and domain of the processed language. A default set of resources comes as part of LanguageWare and these describe the native language characteristics, such as morphology, and the basic vocabulary for the language. Supplemental resources have been created that capture additional vocabularies, terminologies, rules and grammars, which may be generic to the language or specific to one or more domains. A set of Eclipse-based customization tooling, LanguageWare Resource Workbench, is available on IBM's alphaWorks site, and allows domain knowledge to be compiled into these resources and thereby incorporated into the analysis process. LanguageWare can be deployed as a set of UIMA-compliant annotators, Eclipse plug-ins or Web Services.

    Read more →
  • Data annotation

    Data annotation is the process of labeling or tagging relevant metadata within a dataset to enable machines to interpret the data accurately. The dataset can take various forms, including images, audio files, video footage, or text. == Applications == Data is a fundamental component in the development of artificial intelligence (AI). Training AI models, particularly in computer vision and natural language processing, requires large volumes of annotated data. Proper annotation ensures that machine learning algorithms can recognize patterns and make accurate predictions. Common types of data annotation include classification, bounding boxes, semantic segmentation, and keypoint annotation. Data annotation is used in AI-driven fields, including healthcare, autonomous vehicles, retail, security, and entertainment. By accurately labeling data, machine learning models can perform complex tasks such as object detection, sentiment analysis, and speech recognition with greater precision. This growing demand has led to the emergence of specialized sectors and platforms dedicated to AI training and human-in-the-loop workflows, which often utilize Reinforcement Learning from Human Feedback (RLHF) to refine model behavior. == In computer vision == === Image classification === Image classification, also known as image categorization, involves assigning predefined labels to images. Machine learning algorithms trained on classified images can later recognize objects and differentiate between categories. For instance, an AI model trained to recognize furniture styles can distinguish between Georgian and Rococo armchairs. === Semantic segmentation === Semantic segmentation assigns each pixel in an image to a specific class, such as trees, vehicles, humans, or buildings. This type of annotation enables machine learning models to differentiate objects by grouping similar pixels, allowing for a detailed understanding of an image. 
=== Bounding boxes === Bounding box annotation involves drawing rectangular boxes around objects in an image. This technique is commonly used in autonomous driving, security surveillance, and retail analytics to detect and classify objects such as pedestrians, vehicles, and products on store shelves. === 3D cuboids === 3D cuboid annotation enhances traditional bounding boxes by adding depth, enabling models to predict an object's spatial orientation, movement, and size. This method is particularly useful for autonomous vehicles and robotics, where understanding object dimensions and depth is critical. === Polygonal annotation === For objects with irregular shapes, such as curved or multi-sided items, polygonal annotation provides more precise labeling than bounding boxes. This technique is often used in applications that require detailed object recognition, such as medical imaging or aerial mapping. === Keypoint annotation === Keypoint annotation marks specific points on an object, such as facial landmarks or body joints, to enable tracking and motion analysis. This method is widely used in facial recognition, emotion detection, sports analytics, and augmented reality applications.
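As a rough illustration of the bounding-box technique described above, an annotation can be stored as a labelled rectangle in pixel coordinates. The sketch below is only one possible layout: the `BoundingBox` field names and the `iou` helper are assumptions made for this example and are not taken from any particular annotation tool. Intersection over union (IoU) is a common way to measure how closely two boxes agree, for instance when two annotators label the same object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixel coordinates (hypothetical schema)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        # Width times height; inverted or empty boxes are clamped to zero.
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: BoundingBox, b: BoundingBox) -> float:
    """Intersection over union of two boxes, a value between 0 and 1."""
    # Corners of the overlapping rectangle (which may be empty).
    ix_min, iy_min = max(a.x_min, b.x_min), max(a.y_min, b.y_min)
    ix_max, iy_max = min(a.x_max, b.x_max), min(a.y_max, b.y_max)
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


# Two annotators draw slightly different boxes around the same pedestrian.
first = BoundingBox("pedestrian", 10, 10, 50, 90)
second = BoundingBox("pedestrian", 20, 10, 60, 90)
agreement = iou(first, second)
```

A low IoU between two annotators' boxes is one possible signal that a label needs review; the threshold to use would depend on the project.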

    Read more →
  • Perusall

    Perusall is a social web annotation tool intended for use by students at schools and universities. It allows users to annotate the margins of a text in a virtual group setting that is similar to social media, with upvoting, emojis, chat functionality, and notifications. It also includes automatic AI grading. == History == Perusall began as a research project at Harvard University. It later became an educational product for students and teachers. As of 2024, Perusall states that more than 5 million students have used the tool at over 5,000 educational institutions in 112 countries. == Functionality == Perusall integrates with learning management systems such as Moodle, Canvas and Blackboard to aid with collaborative annotation. The tool supports annotation of a range of media including text, images, equations, videos, PDFs and snapshots of webpages.

    Read more →
  • Description logic

    Description logics (DL) are a family of formal knowledge representation languages. Many DLs are more expressive than propositional logic but less expressive than first-order logic. In contrast to the latter, the core reasoning problems for DLs are (usually) decidable, and efficient decision procedures have been designed and implemented for these problems. There are general, spatial, temporal, spatiotemporal, and fuzzy description logics, and each description logic features a different balance between expressive power and reasoning complexity by supporting different sets of mathematical constructors. DLs are used in artificial intelligence to describe and reason about the relevant concepts of an application domain (known as terminological knowledge). It is of particular importance in providing a logical formalism for ontologies and the Semantic Web: the Web Ontology Language (OWL) and its profiles are based on DLs. A major area of application of DLs and OWL is in biomedical informatics, where they assist in the codification of biomedical knowledge. DLs and OWL are also applied in other domains, including defense, climate modeling, and large-scale industrial knowledge graphs. == Introduction == A DL models concepts, roles and individuals, and their relationships. The fundamental modeling concept of a DL is the axiom—a logical statement relating roles and/or concepts. This is a key difference from the frames paradigm where a frame specification declares and completely defines a class. == Nomenclature == === Terminology compared to FOL and OWL === The description logic community uses different terminology than the first-order logic (FOL) community for operationally equivalent notions; some examples are given below. The Web Ontology Language (OWL) uses again a different terminology, also given in the table below. 
=== Naming convention === There are many varieties of description logics and there is an informal naming convention, roughly describing the operators allowed. The expressivity is encoded in the label for a logic starting with one of the following basic logics: Followed by any of the following extensions: ==== Exceptions ==== Some canonical DLs that do not exactly fit this convention are: ==== Examples ==== As an example, $\mathcal{ALC}$ is a centrally important description logic from which comparisons with other varieties can be made. $\mathcal{ALC}$ is simply $\mathcal{AL}$ with complement of any concept allowed, not just atomic concepts. $\mathcal{ALC}$ is used instead of the equivalent $\mathcal{ALUE}$. A further example, the description logic $\mathcal{SHIQ}$ is the logic $\mathcal{ALC}$ plus extended cardinality restrictions, and transitive and inverse roles. The naming conventions aren't purely systematic so that the logic $\mathcal{ALCOIN}$ might be referred to as $\mathcal{ALCNIO}$ and other abbreviations are also made where possible. The Protégé ontology editor supports $\mathcal{SHOIN}^{\mathcal{(D)}}$. Three major biomedical informatics terminology bases, SNOMED CT, GALEN, and GO, are expressible in $\mathcal{EL}$ (with additional role properties). OWL 2 provides the expressiveness of $\mathcal{SROIQ}^{\mathcal{(D)}}$, OWL-DL is based on $\mathcal{SHOIN}^{\mathcal{(D)}}$, and for OWL-Lite it is $\mathcal{SHIF}^{\mathcal{(D)}}$. == History == Description logic was given its current name in the 1980s.
Before that it was called, in chronological order, terminological systems and concept languages. === Knowledge representation === Frames and semantic networks lack formal (logic-based) semantics. DL was first introduced into knowledge representation (KR) systems to overcome this deficiency. The first DL-based KR system was KL-ONE (by Ronald J. Brachman and Schmolze, 1985). During the '80s other DL-based systems using structural subsumption algorithms were developed including KRYPTON (1983), LOOM (1987), BACK (1988), K-REP (1991) and CLASSIC (1991). This approach featured DL with limited expressiveness but relatively efficient (polynomial time) reasoning. In the early '90s, the introduction of a new tableau-based algorithm paradigm allowed efficient reasoning on more expressive DL. DL-based systems using these algorithms, such as KRIS (1991), show acceptable reasoning performance on typical inference problems even though the worst-case complexity is no longer polynomial. From the mid '90s, reasoners were created with good practical performance on very expressive DL with high worst-case complexity. Examples from this period include FaCT, RACER (2001), CEL (2005), and KAON2 (2005). DL reasoners, such as FaCT, FaCT++, RACER, DLP and Pellet, implement the method of analytic tableaux. KAON2 is implemented by algorithms which reduce a SHIQ(D) knowledge base to a disjunctive datalog program. === Semantic web === The DARPA Agent Markup Language (DAML) and Ontology Inference Layer (OIL) ontology languages for the Semantic Web can be viewed as syntactic variants of DL. In particular, the formal semantics and reasoning in OIL use the $\mathcal{SHIQ}$ DL. The DAML+OIL DL was developed as a submission to, and formed the starting point of, the World Wide Web Consortium (W3C) Web Ontology Working Group. In 2004, the Web Ontology Working Group completed its work by issuing the OWL recommendation.
The design of OWL is based on the $\mathcal{SH}$ family of DL with OWL DL and OWL Lite based on $\mathcal{SHOIN}^{\mathcal{(D)}}$ and $\mathcal{SHIF}^{\mathcal{(D)}}$ respectively. The W3C OWL Working Group began work in 2007 on a refinement of, and extension to, OWL. In 2009, this was completed by the issuance of the OWL2 recommendation. OWL2 is based on the description logic $\mathcal{SROIQ}^{\mathcal{(D)}}$. Practical experience demonstrated that OWL DL lacked several key features necessary to model complex domains. == Modeling == === TBox vs Abox === In DL, a distinction is drawn between the so-called TBox (terminological box) and the ABox (assertional box). In general, the TBox contains sentences describing concept hierarchies (i.e., relations between concepts) while the ABox contains ground sentences stating where in the hierarchy, individuals belong (i.e., relations between individuals and concepts). For example, the statement: belongs in the TBox, while the statement: belongs in the ABox. Note that the TBox/ABox distinction is not significant, in the same sense that the two "kinds" of sentences are not treated differently in first-order logic (which subsumes most DL). When translated into first-order logic, a subsumption axiom like (1) is simply a conditional restriction to unary predicates (concepts) with only variables appearing in it. Clearly, a sentence of this form is not privileged or special over sentences in which only constants ("grounded" values) appear like (2). === Motivation for having Tbox and Abox === So why was the distinction introduced? The primary reason is that the separation can be useful when describing and formulating decision-procedures for various DL.
For example, a reasoner might process the TBox and ABox separately, in part because certain key inference problems are tied to one but not the other one ('classification' is related to the TBox, 'instance checking' to the ABox). Another example is that the complexity of the TBox can greatly affect the performance of a given decision-procedure for a certain DL, independently of the ABox. Thus, it is useful to have a way to talk about that specific part of the knowledge base. The secondary reason is that the distinction can make sense from the knowledge base modeler's perspective. It is plausible to distinguish between our conception of terms/concepts in the world (class axioms in the TBox) and particular manifestations of those terms/concepts (instance assertions in the ABox). In the above example: when the hierarchy within a company is the same in every branch but the assignment to employees is different in every department (because there are other people working there), it makes sense to reuse the TBox for different branches that do not use the same ABox. There are two features of description logic that are not shared by most other data description formalisms: DL does not make the unique name assumption (UNA) or the closed-world assumption (CWA). Not having UNA means that two concepts with different names may be allowed by some inference to be shown to be equivalent. Not having CWA, or rather having the open world assumption (OWA) means that
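The TBox/ABox separation described in this excerpt can be sketched in a few lines of Python. This is a deliberately minimal toy, assuming only atomic subsumption axioms and concept assertions; it is not how tableau-based reasoners work. The concept and individual names (`Manager`, `Employee`, `Person`, `alice`, `bob`, `carol`) are hypothetical and chosen only to echo the company example in the text.

```python
from typing import Dict, Set, Tuple

# TBox: atomic axioms "sub is subsumed by sup" (hypothetical concept names).
TBOX: Dict[str, Set[str]] = {
    "Manager": {"Employee"},
    "Employee": {"Person"},
}

# ABox: concept assertions about named individuals (hypothetical names).
ABOX: Set[Tuple[str, str]] = {("alice", "Manager"), ("bob", "Employee")}


def subsumers(concept: str, tbox: Dict[str, Set[str]]) -> Set[str]:
    """All concepts that subsume `concept`, itself included (uses the TBox only)."""
    seen: Set[str] = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(tbox.get(current, ()))
    return seen


def is_instance(individual: str, concept: str,
                tbox: Dict[str, Set[str]], abox: Set[Tuple[str, str]]) -> bool:
    """Instance checking: is `concept(individual)` entailed?

    Under the open-world assumption, False means "not entailed",
    not "known to be false".
    """
    return any(
        name == individual and concept in subsumers(asserted, tbox)
        for name, asserted in abox
    )


# The same TBox can be reused with a different branch's ABox.
other_branch_abox: Set[Tuple[str, str]] = {("carol", "Manager")}
carol_is_person = is_instance("carol", "Person", TBOX, other_branch_abox)
```

Keeping `subsumers` independent of the ABox mirrors the point above that classification concerns the TBox alone, while instance checking needs both.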

    Read more →
  • Automated Mathematician

    The Automated Mathematician (AM) is one of the earliest successful discovery systems. It was created by Douglas Lenat in Lisp, and in 1977 led to Lenat being awarded the IJCAI Computers and Thought Award. AM worked by generating and modifying short Lisp programs which were then interpreted as defining various mathematical concepts; for example, a program that tested equality between the length of two lists was considered to represent the concept of numerical equality, while a program that produced a list whose length was the product of the lengths of two other lists was interpreted as representing the concept of multiplication. The system had elaborate heuristics for choosing which programs to extend and modify, based on the experiences of working mathematicians in solving mathematical problems. == Controversy == Lenat claimed that the system was composed of hundreds of data structures called "concepts", together with hundreds of "heuristic rules" and a simple flow of control: "AM repeatedly selects the top task from the agenda and tries to carry it out. This is the whole control structure!" Yet the heuristic rules were not always represented as separate data structures; some had to be intertwined with the control flow logic. Some rules had preconditions that depended on the history, or otherwise could not be represented in the framework of the explicit rules. What's more, the published versions of the rules often involve vague terms that are not defined further, such as "If two expressions are structurally similar, ..." (Rule 218) or "... replace the value obtained by some other (very similar) value..." (Rule 129). Another source of information is the user, via Rule 2: "If the user has recently referred to X, then boost the priority of any tasks involving X." Thus, it appears quite possible that much of the real discovery work is buried in unexplained procedures. 
Lenat claimed that the system had rediscovered both Goldbach's conjecture and the fundamental theorem of arithmetic. Later critics accused Lenat of over-interpreting the output of AM. In his paper Why AM and Eurisko appear to work, Lenat conceded that any system that generated enough short Lisp programs would generate ones that could be interpreted by an external observer as representing equally sophisticated mathematical concepts. However, he argued that this property was in itself interesting—and that a promising direction for further research would be to look for other languages in which short random strings were likely to be useful. == Successor == This intuition was the basis of AM's successor Eurisko, which attempted to generalize the search for mathematical concepts to the search for useful heuristics.
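The idea of reading short list programs as mathematical concepts, as described earlier in this excerpt, can be illustrated with a small sketch. It is written in Python rather than Lisp, the function names are invented for this example, and it does not reproduce any of AM's actual programs or heuristics. A number is represented by the length of a list.

```python
from itertools import product
from typing import List, Sequence, Tuple


def same_length(xs: Sequence, ys: Sequence) -> bool:
    """Can be read as 'numerical equality': both lists stand for the same number."""
    return len(xs) == len(ys)


def pair_up(xs: Sequence, ys: Sequence) -> List[Tuple]:
    """Can be read as 'multiplication': the result has len(xs) * len(ys) items."""
    return list(product(xs, ys))


three = ["a", "b", "c"]
four = [1, 2, 3, 4]
twelve = pair_up(three, four)
```

Nothing in `pair_up` mentions numbers; the arithmetic reading is supplied by the observer, which is the point the critics raised about interpreting AM's output.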

    Read more →
  • Discovery system (artificial intelligence)

    A discovery system is an artificial intelligence system that attempts to discover new scientific concepts or laws. The aim of discovery systems is to automate scientific data analysis and the scientific discovery process. Ideally, an artificial intelligence system should be able to search systematically through the space of all possible hypotheses and yield the hypothesis, or set of equally likely hypotheses, that best describes the complex patterns in data. During the era known as the second AI summer (approximately 1978–1987), various systems akin to the era's dominant expert systems were developed to tackle the problem of extracting scientific hypotheses from data, with or without interacting with a human scientist. These systems included Autoclass, Automated Mathematician, and Eurisko, which aimed at general-purpose hypothesis discovery, and more specific systems such as Dalton, which uncovers molecular properties from data. The dream of building systems that discover scientific hypotheses was pushed to the background with the second AI winter and the subsequent resurgence of subsymbolic methods such as neural networks. Subsymbolic methods emphasize prediction over explanation, and yield models which work well but are difficult or impossible to explain, which has earned them the name black box AI. A black-box model cannot be considered a scientific hypothesis, and this development has even led some researchers to suggest that the traditional aim of science, to uncover hypotheses and theories about the structure of reality, is obsolete. Other researchers disagree and argue that subsymbolic methods are useful in many cases, just not for generating scientific theories. == Discovery systems from the 1970s and 1980s == Autoclass was a Bayesian classification system written in 1986. Automated Mathematician was one of the earliest successful discovery systems.
It was written in 1977 and worked by generating and modifying small Lisp programs. Eurisko was a sequel to Automated Mathematician written in 1984. Dalton is a still-maintained program capable of calculating various molecular properties, initially launched in 1983 and available as open source since 2017. Glauber is a scientific discovery method written in the context of computational philosophy of science, launched in 1983. == Modern discovery systems (2009–present) == After a couple of decades with little interest in discovery systems, interest in using AI to uncover natural laws and scientific explanations was renewed by the work of Michael Schmidt, then a PhD student in Computational Biology at Cornell University. Schmidt and his advisor, Hod Lipson, invented Eureqa, which they described as a symbolic regression approach to "distilling free-form natural laws from experimental data". This work effectively demonstrated that symbolic regression was a promising way forward for AI-driven scientific discovery. Since 2009, symbolic regression has matured further, and today, various commercial and open source systems are actively used in scientific research. Notable examples include Eureqa (now a part of DataRobot AI Cloud Platform), AI Feynman, and QLattice.
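To give a feel for what symbolic regression tries to do, the toy sketch below scores a handful of candidate formulas against data and keeps the one with the lowest error. This is an assumption-laden simplification: it searches a tiny fixed list exhaustively, whereas real systems explore a much larger, open-ended space of expressions with more sophisticated search. The candidate list, the function names, and the synthetic data are all invented for this example and do not reflect how Eureqa, AI Feynman, or QLattice are implemented.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical candidate laws of the form y = f(x).
CANDIDATES: Dict[str, Callable[[float], float]] = {
    "x": lambda x: x,
    "2*x": lambda x: 2 * x,
    "x**2": lambda x: x ** 2,
    "x**2 + x": lambda x: x ** 2 + x,
    "sqrt(x)": lambda x: math.sqrt(x),
}


def mean_squared_error(f: Callable[[float], float],
                       data: List[Tuple[float, float]]) -> float:
    """Average squared gap between the formula's prediction and the observation."""
    return sum((f(x) - y) ** 2 for x, y in data) / len(data)


def best_expression(data: List[Tuple[float, float]]) -> Tuple[str, float]:
    """Return the name and error of the candidate that fits the data best."""
    scored = [(mean_squared_error(f, data), name) for name, f in CANDIDATES.items()]
    error, name = min(scored)
    return name, error


# Synthetic "measurements" generated from y = x**2 + x (made-up data).
data = [(float(x), float(x * x + x)) for x in range(1, 6)]
law, error = best_expression(data)
```

Unlike a black-box model, the result is a readable formula, which is why the text treats symbolic regression as a route back to explainable hypotheses.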

    Read more →
  • Graphics processing unit

    A graphics processing unit (GPU) is a specialized electronic circuit designed for digital image processing and to accelerate computer graphics, being present either as a component on a discrete graphics card or embedded on motherboards, mobile phones, personal computers, workstations, and game consoles. GPUs are increasingly being used for artificial intelligence (AI) processing due to linear algebra acceleration, which is also used extensively in graphics processing. Although there is no single definition of the term, and it may be used to describe any video display system, in modern use a GPU includes the ability to internally perform the calculations needed for various graphics tasks, like rotating and scaling 3D images, and often the additional ability to run custom programs known as shaders. This contrasts with earlier graphics controllers known as video display controllers which had no internal calculation capabilities, or blitters, which performed only basic memory movement operations. The modern GPU emerged during the 1990s, adding the ability to perform operations like drawing lines and text without CPU help, and later adding 3D functionality. Graphics functions are generally independent and this lends these tasks to being implemented on separate calculation engines. Modern GPUs include hundreds, or thousands, of calculation units. This made them useful for non-graphic calculations involving embarrassingly parallel problems due to their parallel structure. The ability of GPUs to rapidly perform vast numbers of calculations has led to their adoption in diverse fields including artificial intelligence (AI) where they excel at handling data-intensive and computationally demanding tasks. Other non-graphical uses include the training of neural networks and cryptocurrency mining. == History == === 1960s === Dedicated 3D graphics hardware dates back to graphic terminals such as the Adage AGT-30 from 1967 with analog matrix processors. 
In 1969 Evans & Sutherland (E&S) introduced the Line Drawing System-1 (LDS-1), which was the first all-digital system to provide matrix multiplication. Also in 1969, the low-cost graphics terminal IMLAC PDS-1 was introduced. It later saw use as an early 3D gaming machine with the likes of Maze War. === 1970s === In professional hardware, the PLATO IV system became operational at the University of Illinois Urbana-Champaign in 1972. Between around 1973 and 1978, several networked multiplayer wireframe 3D games were implemented and popularized by users of the system. Also in 1972, the E&S Continuous Tone 1 (CT1) "Watkins box" system (consisting of an E&S LDS-2 and Shaded Picture System) was delivered to Case Western Reserve University. It offered the first real-time Gouraud shading. In 1975, a joint effort between Evans & Sutherland Computer Corporation and the University of Utah's computer graphics department resulted in the first ever MOSFET video framebuffer, capable of color and smooth shading. The E&S Continuous Tone 3 (CT3) system was delivered in 1977 to Lufthansa for pilot training using computer simulation. It was the first graphics system capable of real-time texture mapping. Ikonas made graphics systems with 8- and 24-bit graphics and 3D acceleration in the late 1970s. Arcade system boards have used specialized 2D graphics circuits since the 1970s. In early video game hardware, RAM for frame buffers was expensive, so video chips composited data together as the display was being scanned out on the monitor. A specialized barrel shifter circuit helped the CPU animate the framebuffer graphics for various 1970s arcade video games from Midway and Taito, such as Gun Fight (1975), Sea Wolf (1976), and Space Invaders (1978). The Namco Galaxian arcade system in 1979 used specialized graphics hardware that supported RGB color, multi-colored sprites, and tilemap backgrounds.
The Galaxian hardware was widely used during the golden age of arcade video games, by game companies such as Namco, Centuri, Gremlin, Irem, Konami, Midway, Nichibutsu, Sega, and Taito. The Atari 2600 in 1977 used a video shifter called the Television Interface Adaptor. Atari 8-bit computers (1979) had ANTIC, a video processor which interpreted instructions describing a "display list"—the way the scan lines map to specific bitmapped or character modes and where the memory is stored (so there did not need to be a contiguous frame buffer). 6502 machine code subroutines could be triggered on scan lines by setting a bit on a display list instruction. ANTIC also supported smooth vertical and horizontal scrolling independent of the CPU. === 1980s === In the 1980s significant advancements were made in professional 3D graphics hardware. Perhaps most impactful was the 1981 development of the Geometry Engine, a VLSI vector processor ASIC designed by Jim Clark and Marc Hannah at Stanford University. This processor is the forerunner of modern tensor cores and other similar processors marketed for graphics and AI. The Geometry Engine went on to be used in Silicon Graphics workstations for many years. Silicon Graphics's first product, shipped in November 1983, was the IRIS 1000, a terminal with hardware-accelerated 3D graphics based on the Geometry Engine. The Geometry Engine was capable of approximately 6 million operations per second. The 1981 NEC μPD7220 was the first implementation of a personal computer graphics display processor as a single large-scale integration (LSI) integrated circuit chip. This enabled the design of low-cost, high-performance video graphics cards such as those from Number Nine Visual Technology. It became the best-known GPU until the mid-1980s. 
It was the first fully integrated VLSI (very large-scale integration) metal–oxide–semiconductor (NMOS) graphics display processor for PCs, supported up to 1024×1024 resolution, and laid the foundations for the PC graphics market. It was used in a number of graphics cards and was licensed for clones such as the Intel 82720, the first of Intel's graphics processing units. The Williams Electronics arcade games Robotron: 2084, Joust, Sinistar, and Bubbles, all released in 1982, contain custom blitter chips for operating on 16-color bitmaps. In 1984, Hitachi released the ARTC HD63484, the first major CMOS graphics processor for personal computers. The ARTC could display up to 4K resolution when in monochrome mode. It was used in a number of graphics cards and terminals during the late 1980s. In 1985, the Amiga was released with a custom graphics chip called Agnus including a blitter for bitmap manipulation, line drawing, and area fill. It also included a coprocessor with its own simple instruction set, that was capable of manipulating graphics hardware registers in sync with the video beam (e.g. for per-scanline palette switches, sprite multiplexing, and hardware windowing), or driving the blitter. Also in 1985, IBM released the Professional Graphics Controller, designed by later to be Nvidia co-founder Curtis Priem, which was a rudimentary 3D card with 640 × 480 256-color graphics which used a dedicated CPU to draw graphics independently of the main system. It was used as the basis of cards by a number of makers (including Matrox) and its analog RGB signaling led directly to the VGA video standard. Priem later in the 80s worked on the influential Sun Microsystems GX (also known as cgsix) accelerated 2D graphics card. In 1986, Texas Instruments released the TMS34010, the first fully programmable graphics processor. It could run general-purpose code but also had a graphics-oriented instruction set. 
During 1990–1992, this chip became the basis of the Texas Instruments Graphics Architecture ("TIGA") Windows accelerator cards. Following in 1987, the IBM 8514 graphics system was released. It was one of the first video cards for IBM PC compatibles that implemented fixed-function 2D primitives in electronic hardware. Sharp's X68000, released in 1987, used a custom graphics chipset with a 65,536 color palette and hardware support for sprites, scrolling, and multiple playfields. It served as a development machine for Capcom's CP System arcade board. Fujitsu's FM Towns computer, released in 1989, had support for a 16,777,216 color palette. For context, IBM also introduced its Video Graphics Array (VGA) display system in 1987, with a maximum resolution of 640 × 480 pixels. Unlike 8514/A, VGA had no hardware acceleration features. In November 1988, NEC Home Electronics announced its creation of the Video Electronics Standards Association (VESA) to develop and promote a Super VGA (SVGA) computer display standard as a successor to VGA. Super VGA enabled graphics display resolutions up to 800 × 600 pixels, a 56% increase. In 1988 SGI sold IRIS workstation graphics with 10-12 Geometry Engines and introduced the IrisVision add-in board for IBM MicroChannel bus (RS/6000) based on the Geometry Engine as well. In 1988 as well, the first dedicated polygonal 3D graphics boards in arcade machines were introduced wit

    Read more →
  • Neurocomputing (journal)

    Neurocomputing is a peer-reviewed scientific journal covering research on artificial intelligence, machine learning, and neural computation. It was established in 1989 and is published by Elsevier. The editor-in-chief is Zidong Wang (Brunel University London). Independent scientometric studies noted that despite being one of the most productive journals in the field, it has kept its reputation across the years intact and plays an important role in leading the research in the area. The journal is abstracted and indexed in Scopus and Science Citation Index Expanded. According to the Journal Citation Reports, its 2023 impact factor is 5.5.

    Read more →
  • Drush

    Drush (DRUpal SHell) is a shell-based application used to control, manipulate, and administer Drupal websites. == Details == Drush was originally developed by Arto Bendiken for Drupal 4.7. In May 2007, it was partly rewritten and redesigned for Drupal 5 by Franz Heinzmann. Drush is maintained by Moshe Weitzman with the support of Owen Barton, greg.1.anderson, jonhattan, Mark Sonnabaum, Jonathan Hedstrom and Christopher Gervais.

    Read more →
  • Outline of deep learning

    The following outline is provided as an overview of, and topical guide to, deep learning: Deep learning is a subfield of machine learning and artificial intelligence based on artificial neural networks with multiple processing layers. It emphasizes representation learning and is widely used in areas such as computer vision, natural language processing, speech recognition, recommender systems, robotics, and generative artificial intelligence. == Ways to categorize deep learning == A field of study A branch of artificial intelligence A subfield of machine learning A subfield of computer science A form of representation learning A class of methods based on artificial neural networks An approach used in computational statistics == History == === Precursors === Cybernetics Perceptron Connectionism Neocognitron Backpropagation === Milestones === LeNet Long short-term memory Deep belief network AlexNet Sequence to sequence learning Generative adversarial network Residual neural network Transformer BERT Generative pre-trained transformer Diffusion model === Related histories === History of artificial intelligence History of machine learning Timeline of machine learning == Core concepts == == Learning settings == Supervised learning Unsupervised learning Self-supervised learning Semi-supervised learning Reinforcement learning Transfer learning Multitask learning Multimodal learning Online machine learning Continual learning == Common tasks == Image classification Object detection Image segmentation Automatic speech recognition Neural machine translation Question answering Automatic summarization Text-to-image model Protein structure prediction == Architectures == === Feedforward and convolutional architectures === Feedforward neural network Multilayer perceptron Convolutional neural network Radial basis function network Residual neural network U-Net === Recurrent and sequence architectures === Recurrent neural network Long short-term memory Gated recurrent unit Sequence 
to sequence learning Recursive neural network === Representation-learning architectures === Autoencoder Denoising autoencoder Sparse autoencoder Variational autoencoder Restricted Boltzmann machine Deep belief network === Attention and transformer architectures === Attention (machine learning) Transformer BERT Generative pre-trained transformer Vision transformer === Generative and probabilistic architectures === Autoregressive model Diffusion model Energy-based model Generative adversarial network Mixture of experts === Graph and memory architectures === Graph neural network Graph convolutional network Siamese network Neural Turing machine Memory network Echo state network Capsule neural network == Neural network components and techniques == Artificial neuron Activation function Rectified linear unit Sigmoid function Softmax function Embedding Convolution Pooling layer Attention Batch normalization Layer normalization Residual connections == Training and optimization == Backpropagation Gradient descent Stochastic gradient descent Adam optimization Learning rate Loss function Cross-entropy Mean squared error Regularization Dropout Early stopping Batch normalization Data augmentation Transfer learning Knowledge distillation Ensemble learning Curriculum learning == Datasets and benchmarks == CIFAR-10 ImageNet MNIST database Common Objects in Context (COCO) General Language Understanding Evaluation (GLUE) benchmark LibriSpeech SQuAD == Applications == === Computer vision === Computer vision Facial recognition system Image classification Image segmentation Medical imaging Object detection Optical character recognition === Natural language processing === Automatic summarization Chatbot Information retrieval Large language model Natural language processing Neural machine translation Question answering Sentiment analysis === Speech and audio === Automatic speech recognition Music information retrieval Speaker recognition Speech synthesis === Science and medicine === 
Bioinformatics Computational biology Drug discovery Medical diagnosis Protein structure prediction === Robotics and control === Autonomous car Computer game bot Control theory Robotics === Recommendation, search, and forecasting === Anomaly detection Forecasting Fraud detection Recommender system Search engine === Generative artificial intelligence === Deepfake Generative artificial intelligence Large language model Speech synthesis Text-to-image model === Computer graphics and video games === Deep Learning Anti-Aliasing (DLAA) Deep Learning Super Sampling (DLSS) == Hardware == AMD Instinct AMD XDNA Application-specific integrated circuit Deep learning processor, Neural processing unit (NPU), or Neural Engine Field-programmable gate array General-purpose computing on graphics processing units (GPGPU) Graphics processing unit NVIDIA Deep Learning Accelerator (NVDLA) Tensor processing unit Vision processing unit Wafer-scale integration === Supporting software platforms === CUDA Metal ROCm == Software == === Open-source frameworks and libraries === === Neural network software === EDLUT Emergent Encog JOONE Neuroph NeuroSolutions OpenNN Peltarion Synapse SNNS === Platforms, tools, and deployment === Amazon SageMaker Google Colab Hugging Face Kaggle Kubeflow MLflow ONNX OpenVINO TensorFlow Hub == Algorithms for deep learning and neural networks == Backpropagation Conjugate gradient method Generalized Hebbian algorithm Gradient descent Levenberg–Marquardt algorithm Perceptron Quasi-Newton method Wake-sleep algorithm == Methods and related topics == === Representation and metric learning === Contrastive learning Embedding Feature learning Manifold learning Metric learning === Generative modeling === Autoregressive model Diffusion model Generative adversarial network Generative model Variational inference === Efficient and scalable deep learning === Knowledge distillation Low-rank approximation Mixture of experts Quantization Sparsity === Reliability, safety, and 
interpretability === Adversarial machine learning AI alignment Algorithmic bias Catastrophic forgetting Differential privacy Explainable artificial intelligence Federated learning Hallucination (artificial intelligence) == Conferences and workshops == Annual Meeting of the Association for Computational Linguistics Conference on Computer Vision and Pattern Recognition Conference on Neural Information Processing Systems International Conference on Computer Vision International Conference on Learning Representations International Conference on Machine Learning == Organizations == === Research laboratories and institutions === Allen Institute for AI Alberta Machine Intelligence Institute European Laboratory for Learning and Intelligent Systems Google DeepMind Meta AI Mila Microsoft Research Vector Institute === Companies === Anthropic Cerebras Cohere DeepSeek Mistral AI OpenAI Stability AI xAI == Publications == === Books === Deep Learning – Ian Goodfellow and Yoshua Bengio Neural Networks and Deep Learning – Michael Nielsen Perceptrons – Marvin Minsky and Seymour Papert === Journals === IEEE Transactions on Neural Networks and Learning Systems Neural Networks Neural Computation == Influential persons ==

    Read more →
  • Granular computing

    Granular computing

    Granular computing is an emerging computing paradigm of information processing that concerns the processing of complex information entities called "information granules", which arise in the process of data abstraction and derivation of knowledge from information or data. Generally speaking, information granules are collections of entities that usually originate at the numeric level and are arranged together due to their similarity, functional or physical adjacency, indistinguishability, coherency, or the like. At present, granular computing is more a theoretical perspective than a coherent set of methods or principles. As a theoretical perspective, it encourages an approach to data that recognizes and exploits the knowledge present in data at various levels of resolution or scales. In this sense, it encompasses all methods which provide flexibility and adaptability in the resolution at which knowledge or information is extracted and represented. == Types of granulation == As mentioned above, granular computing is not an algorithm or process; there is no particular method that is called "granular computing". It is rather an approach to looking at data that recognizes how different and interesting regularities in the data can appear at different levels of granularity, much as different features become salient in satellite images of greater or lesser resolution. On a low-resolution satellite image, for example, one might notice interesting cloud patterns representing cyclones or other large-scale weather phenomena, while in a higher-resolution image, one misses these large-scale atmospheric phenomena but instead notices smaller-scale phenomena, such as the interesting pattern that is the streets of Manhattan. The same is generally true of all data: At different resolutions or granularities, different features and relationships emerge. 
The aim of granular computing is to take advantage of this fact in designing more effective machine-learning and reasoning systems. There are several types of granularity that are often encountered in data mining and machine learning, and we review them below: === Value granulation (discretization/quantization) === One type of granulation is the quantization of variables. In data mining and machine-learning applications, the resolution of variables often needs to be decreased in order to extract meaningful regularities. An example of this would be a variable such as "outside temperature" (temp), which in a given application might be recorded to several decimal places of precision (depending on the sensing apparatus). However, for purposes of extracting relationships between "outside temperature" and, say, "number of health-club applications" (club), it will generally be advantageous to quantize "outside temperature" into a smaller number of intervals. ==== Motivations ==== There are several interrelated reasons for granulating variables in this fashion: Based on prior domain knowledge, there is no expectation that minute variations in temperature (e.g., the difference between 80 and 80.7 °F (26.7 and 27.1 °C)) could have an influence on behaviors driving the number of health-club applications. For this reason, any "regularity" which our learning algorithms might detect at this level of resolution would have to be spurious, an artifact of overfitting. By coarsening the temperature variable into intervals whose differences we do expect (based on prior domain knowledge) to influence the number of health-club applications, we eliminate the possibility of detecting these spurious patterns. Thus, in this case, reducing resolution is a method of controlling overfitting. By reducing the number of intervals in the temperature variable (i.e., increasing its grain size), we increase the amount of sample data indexed by each interval designation. 
Thus, by coarsening the variable, we increase sample sizes and achieve better statistical estimation. In this sense, increasing granularity provides an antidote to the so-called curse of dimensionality, which relates to the exponential decrease in statistical power with increase in number of dimensions or variable cardinality. Independent of prior domain knowledge, it is often the case that meaningful regularities (i.e., which can be detected by a given learning methodology, representational language, etc.) may exist at one level of resolution and not at another. For example, a simple learner or pattern recognition system may seek to extract regularities satisfying a conditional probability threshold such as $p(Y = y_j \mid X = x_i) \geq \alpha$. In the special case where $\alpha = 1$, this recognition system is essentially detecting logical implication of the form $X = x_i \rightarrow Y = y_j$ or, in words, "if $X = x_i$, then $Y = y_j$". The system's ability to recognize such implications (or, in general, conditional probabilities exceeding threshold) is partially contingent on the resolution with which the system analyzes the variables. As an example of this last point, consider the feature space shown to the right. The variables may each be regarded at two different resolutions. Variable $X$ may be regarded at a high (quaternary) resolution wherein it takes on the four values $\{x_1, x_2, x_3, x_4\}$ or at a lower (binary) resolution wherein it takes on the two values $\{X_1, X_2\}$. 
Similarly, variable $Y$ may be regarded at a high (quaternary) resolution or at a lower (binary) resolution, where it takes on the values $\{y_1, y_2, y_3, y_4\}$ or $\{Y_1, Y_2\}$, respectively. At the high resolution, there are no detectable implications of the form $X = x_i \rightarrow Y = y_j$, since every $x_i$ is associated with more than one $y_j$, and thus, for all $x_i$, $p(Y = y_j \mid X = x_i) < 1$. However, at the low (binary) variable resolution, two bilateral implications become detectable: $X = X_1 \leftrightarrow Y = Y_1$ and $X = X_2 \leftrightarrow Y = Y_2$, since every $X_1$ occurs iff $Y_1$ and $X_2$ occurs iff $Y_2$. Thus, a pattern recognition system scanning for implications of this kind would find them at the binary variable resolution, but would fail to find them at the higher quaternary variable resolution. ==== Issues and methods ==== It is not feasible to exhaustively test all possible discretization resolutions on all variables in order to see which combination of resolutions yields interesting or significant results. Instead, the feature space must be preprocessed (often by an entropy analysis of some kind) so that some guidance can be given as to how the discretization process should proceed. Moreover, one cannot generally achieve good results by naively analyzing and discretizing each variable independently, since this may obliterate the very interactions that we had hoped to discover. 
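The quaternary-versus-binary example described above can be reproduced with a small Python sketch. The sample pairs and the grouping of fine values into coarse granules are hypothetical, chosen only so that each fine-grained x value co-occurs with two y values; they are not taken from the original figure. The helper name `implications` is likewise an illustrative assumption.

```python
from collections import Counter

# Hypothetical samples (x, y): every fine-grained x co-occurs with two y values.
samples = [
    ("x1", "y1"), ("x1", "y2"),
    ("x2", "y1"), ("x2", "y2"),
    ("x3", "y3"), ("x3", "y4"),
    ("x4", "y3"), ("x4", "y4"),
]

# Assumed grouping of the four fine values into two coarse granules.
coarse_x = {"x1": "X1", "x2": "X1", "x3": "X2", "x4": "X2"}
coarse_y = {"y1": "Y1", "y2": "Y1", "y3": "Y2", "y4": "Y2"}


def implications(pairs, alpha=1.0):
    """Return the (x, y) pairs whose estimated p(Y=y | X=x) is at least alpha."""
    joint = Counter(pairs)                     # counts of each (x, y) pair
    marginal = Counter(x for x, _ in pairs)    # counts of each x value
    return sorted(
        (x, y) for (x, y), n in joint.items() if n / marginal[x] >= alpha
    )


# Quaternary resolution: each conditional probability is 0.5, so nothing passes.
fine = implications(samples)

# Binary resolution: map every sample to its coarse granules first.
coarse = implications([(coarse_x[x], coarse_y[y]) for x, y in samples])

print(fine)
print(coarse)
```

With alpha = 1, the fine-grained scan returns no implications, while the coarse scan returns the pairs (X1, Y1) and (X2, Y2). The function checks only the forward direction X → Y; confirming the bilateral form would mean running the same check with the roles of x and y swapped.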
A sample of papers that address the problem of variable discretization in general, and multiple-variable discretization in particular, is as follows: Chiu, Wong & Cheung (1991), Bay (2001), Liu et al. (2002), Wang & Liu (1998), Zighed, Rabaséda & Rakotomalala (1998), Catlett (1991), Dougherty, Kohavi & Sahami (1995), Monti & Cooper (1999), Fayyad & Irani (1993), Chiu, Cheung & Wong (1990), Nguyen & Nguyen (1998), Grzymala-Busse & Stefanowski (2001), Ting (1994), Ludl & Widmer (2000), Pfahringer (1995), An & Cercone (1999), Chiu & Cheung (1989), Chmielewski & Grzymala-Busse (1996), Lee & Shin (1994), Liu & Wellman (2002), Liu & Wellman (2004). === Variable granulation (clustering/aggregation/transformation) === Variable granulation is a term that could describe a variety of techniques, most of which are aimed at reducing dimensionality, redundancy, and storage requirements. We briefly describe some of the ideas here, and present pointers to the literature. ==== Variable transformation ==== A number of classical methods, such as principal component analysis, multidimensional scaling, factor analysis, and structural equation modeling, and their relatives, fall under the genus of "variable transformation." Also in this category are more modern areas of study such as dimensionality reduction, projection pursuit, and independent component analysis. The common goal of these methods in general is to find a representation of the data in terms of new variables, which are a linear or nonlinear transformation of the original variables, and in which important stati

    Read more →
  • Cryptographic module

    Cryptographic module

    A cryptographic module is a component of a computer system that securely implements cryptographic algorithms, typically with some element of tamper resistance. NIST defines a cryptographic module as "The set of hardware, software, and/or firmware that implements security functions (including cryptographic algorithms), holds plaintext keys and uses them for performing cryptographic operations, and is contained within a cryptographic module boundary." Hardware security modules, including secure cryptoprocessors, are one way of implementing cryptographic modules. Standards for cryptographic modules include FIPS 140-3 and ISO/IEC 19790.

    Read more →
  • EfficientNet

    EfficientNet

    EfficientNet is a family of convolutional neural networks (CNNs) for computer vision published by researchers at Google AI in 2019. Its key innovation is compound scaling, which uniformly scales all dimensions of depth, width, and resolution using a single parameter. EfficientNet models have been adopted in various computer vision tasks, including image classification, object detection, and segmentation. == Compound scaling == EfficientNet introduces compound scaling, which, instead of scaling one dimension of the network at a time, such as depth (number of layers), width (number of channels), or resolution (input image size), uses a compound coefficient $\phi$ to scale all three dimensions simultaneously. Specifically, given a baseline network, the depth, width, and resolution are scaled according to the following equations: depth multiplier $d = \alpha^{\phi}$, width multiplier $w = \beta^{\phi}$, resolution multiplier $r = \gamma^{\phi}$, subject to $\alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2$ and $\alpha \geq 1, \beta \geq 1, \gamma \geq 1$. The condition $\alpha \cdot \beta^{2} \cdot \gamma^{2} \approx 2$ ensures that increasing $\phi$ by an amount $\phi_{0}$ multiplies the total FLOPs of running the network on an image by approximately $2^{\phi_{0}}$. The hyperparameters $\alpha$, $\beta$, and $\gamma$ are determined by a small grid search. The original paper suggested 1.2, 1.1, and 1.15, respectively. 
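As a rough illustration of the scaling equations, the following Python sketch computes the depth, width, and resolution multipliers for a given compound coefficient, together with the approximate FLOPs growth factor. It uses the coefficient values 1.2, 1.1, and 1.15 quoted above; the function name `compound_scale` is hypothetical, and the output illustrates the formulas only, not the configurations of any released model.

```python
# Coefficients quoted in the text for alpha (depth), beta (width), gamma (resolution).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Return (depth, width, resolution, flops) multipliers for coefficient phi."""
    depth = alpha ** phi
    width = beta ** phi
    resolution = gamma ** phi
    # FLOPs grow linearly with depth and quadratically with width and resolution.
    flops = (alpha * beta ** 2 * gamma ** 2) ** phi
    return depth, width, resolution, flops


# The constraint value should be close to 2, so FLOPs roughly double per unit of phi.
constraint = ALPHA * BETA ** 2 * GAMMA ** 2
print(f"alpha * beta^2 * gamma^2 = {constraint:.4f}")

for phi in range(4):
    d, w, r, f = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.3f}, width x{w:.3f}, resolution x{r:.3f}, FLOPs x{f:.3f}")
```

With these coefficients the constraint product is about 1.92 rather than exactly 2, so each unit increase in the compound coefficient slightly less than doubles the estimated FLOPs.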
Architecturally, they optimized the choice of modules by neural architecture search (NAS), and found that the inverted bottleneck convolution (which they called MBConv) used in MobileNet worked well. The EfficientNet family is a stack of MBConv layers, with shapes determined by the compound scaling. The original publication consisted of 8 models, from EfficientNet-B0 to EfficientNet-B7, with increasing model size and accuracy. EfficientNet-B0 is the baseline network, and subsequent models are obtained by scaling the baseline network by increasing $\phi$. == Variants == EfficientNet has been adapted for fast inference on edge TPUs and centralized TPU or GPU clusters by NAS. EfficientNet V2 was published in June 2021. The architecture was improved by further NAS with more types of convolutional layers. It also introduced a training method, which progressively increases image size during training, and uses regularization techniques like dropout, RandAugment, and Mixup. The authors claim this approach mitigates accuracy drops often associated with progressive resizing.

    Read more →
  • Lynda Soderholm

    Lynda Soderholm

    Lynda Soderholm is a physical chemist at the U.S. Department of Energy's (DOE) Argonne National Laboratory with a specialty in f-block elements. She is a senior scientist and the lead of the Actinide, Geochemistry & Separation Sciences Theme within Argonne's Chemical Sciences and Engineering Division. Her specific role is the Separation Science group leader within Heavy Element Chemistry and Separation Science (HESS), directing basic research focused on low-energy methods for isolating lanthanide and actinide elements from complex mixtures. She has made fundamental contributions to understanding f-block chemistry and characterizing f-block elements. Soderholm became a Fellow of the American Association for the Advancement of Science (AAAS) in 2013, and is also an Argonne Distinguished Fellow. == Early life and education == Soderholm was awarded her PhD in 1982 by McMaster University under the direction of Prof. John Greedan. Her dissertation focused on characterizing the structural and magnetic properties of a series of ternary f-ion oxides. After graduating, she was awarded a NATO postdoctoral fellowship at the Centre national de la recherche scientifique in France from 1982 until 1985. After a short appointment as an Argonne postdoctoral fellow, she was promoted to staff scientist the same year. Over several years, she moved up the ranks, becoming a senior chemist in 2001. She was also an adjunct professor at the University of Notre Dame from 2003 until 2007. In 2021, Soderholm was appointed interim Division Director for the Chemical Sciences and Engineering Division. == Career and research == === Uncovering structure of Yttrium-123 Superconductor === Early in her career, Soderholm focused on characterizing the magnetic and electronic behavior of compounds containing f-ions (lanthanides and actinides) with a focus on high-Tc materials, compounds that are superconducting at unusually high temperatures. 
She was part of the research group that first determined the structure of YBa2Cu3O7. Their discovery formed the foundation for further developments in the broad field of superconductivity. === Understanding f-ion speciation in solution === Continuing her interest in the f-elements, Soderholm shifted her focus from solid-state materials to nanoparticles and solutions, taking advantage of advances in X-ray structural probes made available by synchrotron facilities. Building on her earlier work using neutron scattering, her team became the first to discover that plutonium exists in solution as tiny, well-defined nanoparticles. This work solved a longstanding problem in understanding transport of plutonium in the environment and resulted in the development of a new, patented approach to separating plutonium during nuclear reprocessing. === Using machine learning to evaluate molecular structures === Soderholm's more recent projects use machine learning to understand the influence of complex molecular structuring in solutions, in connection with low-energy processes for separation of f-block elements from complex mixtures. == Awards and honors == University of Chicago Board of Governors' Distinguished Performance Award, 2009. Fellow of the American Association for the Advancement of Science, 2013. Argonne Distinguished Fellow, 2016. DOE materials sciences research competition for Outstanding Scientific Accomplishments in Solid State Physics, 1987. == Select publications == Beno, M. A.; Soderholm, L.; Capone, D. W., II; Hinks, D. G.; Jorgensen, J. D.; Grace, J. D.; Schuller, I. K.; Segre, C. U.; Zhang, K., Structure of the single-phase high-temperature superconductor yttrium barium copper oxide (YBa2Cu3O7−δ). Appl. Phys. Lett. 1987, 51 (1), 57–9. Soderholm, L.; Zhang, K.; Hinks, D. G.; Beno, M. A.; Jorgensen, J. D.; Segre, C. U.; Schuller, I. K., Incorporation of praseodymium in YBa2Cu3O7−δ: electronic effects on superconductivity. 
Nature (London) 1987, 328 (6131), 604–5. Antonio, M. R.; Williams, C. W.; Soderholm, L., Berkelium redox speciation. Radiochim. Acta 2002, 90 (12), 851–856. Soderholm, L.; Skanthakumar, S.; Neuefeind, J., Determination of actinide speciation in solution using high-energy X-ray scattering. Anal. Bioanal. Chem. 2005, 383 (1), 48–55. Forbes, T. Z.; Burns, P. C.; Skanthakumar, S.; Soderholm, L., Synthesis, structure, and magnetism of Np2O5. J. Am. Chem. Soc. 2007, 129 (10), 2760–2761. Soderholm, L.; Almond, P. M.; Skanthakumar, S.; Wilson, R. E.; Burns, P. C., The structure of the plutonium oxide nanocluster [Pu38O56Cl54(H2O)8]14-. Angew. Chem., Int. Ed. 2008, 47 (2), 298–302. Jensen, M. P.; Gorman-Lewis, D.; Aryal, B.; Paunesku, T.; Vogt, S.; Rickert, P. G.; Seifert, S.; Lai, B.; Woloschak, G. E.; Soderholm, L., An iron-dependent and transferrin-mediated cellular uptake pathway for plutonium. Nat. Chem. Biol. 2011, 7 (8), 560–565. Wilson, R. E.; Skanthakumar, S.; Soderholm, L., Separation of Plutonium Oxide Nanoparticles and Colloids. Angew. Chem., Int. Ed. 2011, 50 (47), 11234–11237. Knope, K. E.; Soderholm, L., Solution and solid-state structural chemistry of actinide hydrates and their hydrolysis and condensation products. Chem. Rev. 2013, 113 (2), 944–994. Luo, G.; Bu, W.; Mihaylov, M.; Kuzmenko, I.; Schlossman, M. L.; Soderholm, L., X-ray reflectivity reveals a nonmonotonic ion-density profile perpendicular to the surface of ErCl3 aqueous solutions. J. Phys. Chem. C 2013, 117 (37), 19082–19090. Jin, G. B.; Lin, J.; Estes, S. L.; Skanthakumar, S.; Soderholm, L., Influence of countercation hydration enthalpies on the formation of molecular complexes: A thorium-nitrate example. J. Am. Chem. Soc. 2017, 139 (49), 18003–18008. == Patents == Solvent extraction system for plutonium colloids and other oxide nano-particles, (2016).

    Read more →