AI Assistant Examples

AI Assistant Examples — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Weak supervision

    Weak supervision (also known as semi-supervised learning) is a paradigm in machine learning whose relevance and notability increased with the advent of large language models, owing to the large amount of data required to train them. It combines a small amount of human-labeled data (the only kind used in the more expensive and time-consuming supervised learning paradigm) with a large amount of unlabeled data (the only kind used in the unsupervised learning paradigm). In other words, the desired output values are provided only for a subset of the training data; the remaining data is unlabeled or imprecisely labeled. Intuitively, the learning problem can be seen as an exam, with the labeled data as sample problems that the teacher solves for the class as an aid in solving another set of problems. In the transductive setting, these unsolved problems act as the exam questions. In the inductive setting, they become practice problems of the sort that will make up the exam.

    == Problem ==

    The acquisition of labeled data for a learning problem often requires a skilled human agent (e.g. to transcribe an audio segment) or a physical experiment (e.g. determining the 3D structure of a protein or determining whether there is oil at a particular location). The cost of the labeling process may therefore make large, fully labeled training sets infeasible, whereas unlabeled data is relatively inexpensive to acquire. In such situations, semi-supervised learning can be of great practical value. Semi-supervised learning is also of theoretical interest in machine learning and as a model for human learning.

    == Technique ==

    More formally, semi-supervised learning assumes that a set of $l$ independently identically distributed examples $x_{1},\dots ,x_{l}\in X$ with corresponding labels $y_{1},\dots ,y_{l}\in Y$ and $u$ unlabeled examples $x_{l+1},\dots ,x_{l+u}\in X$ are processed. Semi-supervised learning combines this information to surpass the classification performance that can be obtained either by discarding the unlabeled data and doing supervised learning or by discarding the labels and doing unsupervised learning.

    Semi-supervised learning may refer to either transductive learning or inductive learning. The goal of transductive learning is to infer the correct labels for the given unlabeled data $x_{l+1},\dots ,x_{l+u}$ only. The goal of inductive learning is to infer the correct mapping from $X$ to $Y$.

    It is unnecessary (and, according to Vapnik's principle, imprudent) to perform transductive learning by way of inferring a classification rule over the entire input space; however, in practice, algorithms formally designed for transduction or induction are often used interchangeably.

    == Assumptions ==

    In order to make any use of unlabeled data, some relationship to the underlying distribution of data must exist. Semi-supervised learning algorithms make use of at least one of the following assumptions:

    === Continuity / smoothness assumption ===

    Points that are close to each other are more likely to share a label. This is also generally assumed in supervised learning and yields a preference for geometrically simple decision boundaries. In the case of semi-supervised learning, the smoothness assumption additionally yields a preference for decision boundaries in low-density regions, so few points are close to each other but in different classes.

    === Cluster assumption ===

    The data tend to form discrete clusters, and points in the same cluster are more likely to share a label (although data that shares a label may spread across multiple clusters). This is a special case of the smoothness assumption and gives rise to feature learning with clustering algorithms.

    === Manifold assumption ===

    The data lie approximately on a manifold of much lower dimension than the input space. In this case, learning the manifold using both the labeled and unlabeled data can avoid the curse of dimensionality. Learning can then proceed using distances and densities defined on the manifold.

    The manifold assumption is practical when high-dimensional data are generated by some process that may be hard to model directly, but which has only a few degrees of freedom. For instance, human voice is controlled by a few vocal folds, and images of various facial expressions are controlled by a few muscles. In these cases, it is better to consider distances and smoothness in the natural space of the generating problem, rather than in the space of all possible acoustic waves or images, respectively.

    == History ==

    The heuristic approach of self-training (also known as self-learning or self-labeling) is historically the oldest approach to semi-supervised learning, with examples of applications starting in the 1960s. The transductive learning framework was formally introduced by Vladimir Vapnik in the 1970s. Interest in inductive learning using generative models also began in the 1970s. A probably approximately correct learning bound for semi-supervised learning of a Gaussian mixture was demonstrated by Ratsaby and Venkatesh in 1995.

    == Methods ==

    === Generative models ===

    Generative approaches to statistical learning first seek to estimate $p(x|y)$, the distribution of data points belonging to each class.
    The probability $p(y|x)$ that a given point $x$ has label $y$ is then proportional to $p(x|y)p(y)$ by Bayes' rule. Semi-supervised learning with generative models can be viewed either as an extension of supervised learning (classification plus information about $p(x)$) or as an extension of unsupervised learning (clustering plus some labels).

    Generative models assume that the distributions take some particular form $p(x|y,\theta)$ parameterized by the vector $\theta$. If these assumptions are incorrect, the unlabeled data may actually decrease the accuracy of the solution relative to what would have been obtained from labeled data alone. However, if the assumptions are correct, then the unlabeled data necessarily improves performance.

    The unlabeled data are distributed according to a mixture of individual-class distributions. In order to learn the mixture distribution from the unlabeled data, it must be identifiable, that is, different parameters must yield different summed distributions. Gaussian mixture distributions are identifiable and commonly used for generative models.

    The parameterized joint distribution can be written as $p(x,y|\theta)=p(y|\theta)p(x|y,\theta)$ by using the chain rule. Each parameter vector $\theta$ is associated with a decision function $f_{\theta}(x)=\underset{y}{\operatorname{argmax}}\ p(y|x,\theta)$.
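
As a rough illustration of the decision function $f_{\theta}(x)$ described above, the following sketch picks the label that maximizes $p(y|\theta)p(x|y,\theta)$ for a one-dimensional input. It assumes Gaussian class-conditional distributions; the names `log_gaussian` and `decide`, the labels and all parameter values are hypothetical and chosen only for illustration.

```python
import math

def log_gaussian(x, mean, std):
    """Log density of a 1-D Gaussian, the assumed form of p(x | y, theta)."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - (x - mean) ** 2 / (2.0 * std ** 2)

def decide(x, theta):
    """Return argmax over y of p(y | x, theta), using p(y | x) proportional to p(x | y) p(y)."""
    def log_joint(y):
        prior, mean, std = theta[y]
        return math.log(prior) + log_gaussian(x, mean, std)
    return max(theta, key=log_joint)

# Hypothetical parameters: label -> (prior p(y), mean, standard deviation).
theta = {"neg": (0.5, -1.0, 1.0), "pos": (0.5, 1.0, 1.0)}

label_a = decide(0.8, theta)   # point closer to the "pos" mean
label_b = decide(-0.3, theta)  # point closer to the "neg" mean
print(label_a, label_b)
```

With equal priors and equal variances, as assumed here, the decision boundary sits halfway between the two class means.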
    The parameter is then chosen based on fit to both the labeled and unlabeled data, weighted by $\lambda$:

    $$\underset{\Theta}{\operatorname{argmax}}\left(\log p(\{x_{i},y_{i}\}_{i=1}^{l}|\theta)+\lambda \log p(\{x_{i}\}_{i=l+1}^{l+u}|\theta)\right)$$

    === Low-density separation ===

    Another major class of methods attempts to place boundaries in regions with few data points (labeled or unlabeled). One of the most commonly used algorithms is the transductive support vector machine, or TSVM (which, despite its name, may be used for inductive learning as well). Whereas support vector machines for supervised learning seek a decision boundary with maximal margin over the labeled data, the goal of TSVM is a labeling of the unlabeled data such that the decision boundary has maximal margin over all of the data. In addition to the standard hinge loss $(1-yf(x))_{+}$ for labeled data, a loss function $(1-|f(x)|)_{+}$ is introduced over the unlabeled data by letting $y=\operatorname{sign}f(x)$. TSVM then selects $f^{*}(x)=h^{*}(x)+b$ from a reproducing kernel Hilbert space $\mathcal{H}$ by minimizing the regularized empirical risk:

    $$f^{*}=\underset{f}{\operatorname{argmin}}\left(\sum_{i=1}^{l}(1-y_{i}f(x_{i}))_{+}+\lambda_{1}\|h\|_{\mathcal{H}}^{2}+\lambda_{2}\sum_{i=l+1}^{l+u}(1-|f(x_{i})|)_{+}\right)$$

    An exact solution is intractable due to the non-convex term $(1-|f(x)|)_{+}$ …
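
To make the TSVM objective above more concrete, here is a minimal sketch that only evaluates the regularized empirical risk for a given candidate function; it does not solve the non-convex minimization. It assumes a linear function $f(x)=w\cdot x+b$, so that the norm of $h$ reduces to the squared Euclidean norm of `w`. The function names, data and weights are hypothetical.

```python
def hinge(z):
    """Positive part (1 - z)_+ used by both loss terms."""
    return max(0.0, 1.0 - z)

def tsvm_objective(w, b, labeled, unlabeled, lam1, lam2):
    """Evaluate the TSVM risk for a linear f(x) = w . x + b (an assumed special case)."""
    def f(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    labeled_loss = sum(hinge(y * f(x)) for x, y in labeled)   # (1 - y f(x))_+
    unlabeled_loss = sum(hinge(abs(f(x))) for x in unlabeled)  # (1 - |f(x)|)_+
    regularizer = sum(wi * wi for wi in w)                     # ||h||^2 for linear h
    return labeled_loss + lam1 * regularizer + lam2 * unlabeled_loss

# Hypothetical 1-D data: two labeled points and two unlabeled points.
labeled = [([2.0], 1), ([-2.0], -1)]
unlabeled = [[0.5], [-3.0]]

value = tsvm_objective(w=[1.0], b=0.0, labeled=labeled, unlabeled=unlabeled,
                       lam1=0.1, lam2=0.5)
print(value)
```

Only the unlabeled point at 0.5 lies inside the margin, so it is the only data point that contributes a loss in this example; the remainder of the objective comes from the regularizer.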

  • Type–token distinction

    The type–token distinction is the difference between a type of objects (analogous to a class) and the individual tokens of that type (analogous to instances). Since each type may be instantiated by multiple tokens, there are generally more tokens than types of an object. For example, the sentence "A rose is a rose is a rose" contains three word types (a, is and rose) but eight word tokens: three tokens of the type a, two tokens of the type is, and three tokens of the type rose. The distinction is important in disciplines such as logic, linguistics, metalogic, typography, and computer programming.

    == Overview ==

    The type–token distinction separates types (abstract descriptive concepts) from tokens (objects that instantiate concepts). For example, in the sentence "the bicycle is becoming more popular" the word bicycle represents the abstract concept of bicycles, and this abstract concept is a type, whereas in the sentence "the bicycle is in the garage" it represents a particular object, and this particular object is a token. Similarly, the word type 'letter' uses only four letter types: L, E, T and R. Nevertheless, it uses both E and T twice. One can say that the word type 'letter' has six letter tokens, with two tokens each of the letter types E and T. Whenever a word type is inscribed, the number of letter tokens created equals the number of letter occurrences in the word type.

    Some logicians consider a word type to be the class of its tokens. Other logicians counter that the word type has a permanence and constancy not found in the class of its tokens. The type remains the same while the class of its tokens is continually gaining new members and losing old members.

    == Typography ==

    In typography, the type–token distinction is used to determine the presence of a text printed by movable type: The defining criteria which a typographic print has to fulfill is that of the type identity of the various letter forms which make up the printed text.
    In other words: each letter form which appears in the text has to be shown as a particular instance ("token") of one and the same type which contains a reverse image of the printed letter.

    == Charles Sanders Peirce ==

    The distinction between using words as types and as tokens was first made by the American logician and philosopher Charles Sanders Peirce in 1906, using terminology that he established. Peirce's type–token distinction applies to words, sentences, paragraphs and so on: to anything in a universe of discourse of character-string theory, or concatenation theory. Peirce's original words are the following:

    A common mode of estimating the amount of matter in a ... printed book is to count the number of words. There will ordinarily be about twenty 'thes' on a page, and, of course, they count as twenty words. In another sense of the word 'word,' however, there is but one word 'the' in the English language; and it is impossible that this word should lie visibly on a page, or be heard in any voice .... Such a ... Form, I propose to term a Type. A Single ... Object ... such as this or that word on a single line of a single page of a single copy of a book, I will venture to call a Token. .... In order that a Type may be used, it has to be embodied in a Token which shall be a sign of the Type, and thereby of the object the Type signifies.
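
The counting examples in this entry (the "rose" sentence and the word 'letter') can be reproduced with a short sketch. It treats lowercase strings as the types, which is a simplifying assumption; the helper name `type_token_counts` is hypothetical.

```python
from collections import Counter

def type_token_counts(tokens):
    """Return (number of tokens, number of types, tokens per type)."""
    per_type = Counter(tokens)
    return len(tokens), len(per_type), per_type

# Word tokens and word types in the sentence, ignoring capitalization.
words = "A rose is a rose is a rose".lower().split()
word_tokens, word_types, word_counts = type_token_counts(words)

# Letter tokens and letter types in the word type 'letter'.
letter_tokens, letter_types, letter_counts = type_token_counts(list("letter"))

print(word_tokens, word_types, dict(word_counts))
print(letter_tokens, letter_types, dict(letter_counts))
```

The sentence yields eight word tokens of three word types, and 'letter' yields six letter tokens of four letter types, matching the counts given in the text.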

  • Eimear Kenny

    Eimear E. Kenny is a researcher in population genetics and translational genomics, the Founding Director of the Institute for Genomic Health, and Endowed Chair and Professor of Genomic Health at the Icahn School of Medicine at Mount Sinai. She is known for novel approaches in computational genomics that advance the study of human genetic variation and its connection to disease risk and diagnosis. Her research has laid the foundation for integrating artificial intelligence (AI) and genomics into precision medicine and routine clinical care. By combining genomics, computer science, and medicine, her work uses genomic sequencing technologies and machine learning algorithms to improve patient care and accelerate genomic data analysis. She has led multiple genomics-based clinical trials, applying computational biology and AI in clinical settings to advance genomic medicine and precision healthcare.

    == Research ==

    A recipient of the Early-Career Award from the American Society of Human Genetics (USA), Kenny, as of 2024, leads a team in genetics, computer science, and medicine at the Institute for Genomic Health, focusing on genetic ancestry, large-scale genomics, clinical trials, and genomic medicine. The lab works to advance understanding of genetic ancestry and its impact on health in order to inform better clinical medicine models. She is recognized for her work to leverage biobanks for translational genomics and for her development of new genetic tests and strategies for health care management. In one study, she and her colleagues investigated genetic disorders that might be under-diagnosed due to insufficient data, and found a variant in a collagen gene associated with Steel syndrome. This syndrome causes short stature and bone and joint issues and was thought to be rare. However, the study revealed that it is common in individuals with Puerto Rican ancestry.
    Three of Kenny's genomic medicine clinical trials assessed how to bring new technology, such as digital apps, or new information, such as polygenic risk scores, into routine clinical care. In the 2010s, Kenny was instrumental in several large-scale sequencing studies, including the 1000 Genomes Project, the Exome Sequencing Project, the Genome Sequencing Project, and the Trans-Omics for Precision Medicine program. In 2012, she led work that discovered the variant responsible for blond hair in Melanesia, work that was featured in the Smithsonian NHGRI Human Genome Exhibit in Washington, D.C. In 2017, her group was one of the first to demonstrate that polygenic risk scores derived in predominantly European populations have reduced accuracy when applied in non-European populations, a limitation now widely acknowledged as a major challenge in the field of genomic risk prediction. As of 2024, she is Principal Investigator in many NIH-funded international consortia focused on computational genomics and genomic medicine, including Electronic Medical Records and Genomics, Polygenic Risk Methods in Diverse Populations, and the Human Pangenome Reference Consortium.

    In 2023, Kenny played a key role in genomics research by helping to map a diverse human pangenome, a major shift from reliance on a single reference genome. Unlike the earlier genetic map, based on one man of mixed European and African ancestry in Buffalo, the pangenome project captures far greater human genetic diversity. As reported by The Washington Post, Kenny's work demonstrates how a more inclusive human genome can drive discoveries in rare genetic diseases and improve genomic medicine.

    Kenny is a co-developer and the current license holder of Random Forest adMIXture (RFMix), patented software for inferring continental and sub-continental ancestry at genomic loci.

    == Education and career ==

    Kenny graduated from Trinity College Dublin with a BA in Biochemistry in 1999 and completed a master's degree in Bioinformatics at Leeds University. She received her PhD in Computational Genomics at Rockefeller University and did her postdoctoral work in the lab of Dr. Carlos D. Bustamante at Stanford University.

    === Academic appointments ===

    As of 2024, at Mount Sinai, she serves as the Endowed Chair and Professor of Genomic Health, Professor at the Department of Medicine and Professor at the Department of Genetics and Genomic Sciences. Since 2018 she has served as the Founding Director of the Institute for Genomic Health, and since 2022 she has also served as the Founding Director of the Center for Translational Genomics. She is also the Director of Translational Research, Division for Genomic Medicine. Former appointments include Assistant Professor at the Department of Genetics and Genomic Sciences and Member at The Charles Bronfman Institute of Personalized Medicine, both at Mount Sinai. She was also a Bioinformatics Programmer at the California Institute of Technology and a research assistant at the Massachusetts Institute of Technology.

    == Publications ==

    As of 2024, Kenny is an advisor to Cell Genomics. Google Scholar reports 50,623 citations, an h-index of 66 and an i10-index of 130. The five most-cited articles she contributed to are:

      • Auton, A; Brooks, LD; Durbin, RM; Garrison, EP; Kang, HM; Korbel, JO; Marchini, JL; McCarthy, S; McVean, GA; Abecasis, GR (2015). "A global reference for human genetic variation". Nature. 526 (7571): 68–74. Bibcode:2015Natur.526...68T. doi:10.1038/nature15393. PMC 4750478. PMID 26432245. Cited by 14847
      • Abecasis, GR; Auton, A; Brooks, LD; DePristo, MA; Durbin, RM; Handsaker, RE; Kang, HM; Marth, GT; McVean, GA (2012). "An integrated map of genetic variation from 1,092 human genomes". Nature. 491 (7422): 56–65. Bibcode:2012Natur.491...56T. doi:10.1038/nature11632. PMC 3498066. PMID 23128226. Cited by 8287
      • Tennessen, Jacob A.; et al. (2012). "Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes". Science. 337: 64–69. doi:10.1126/science.1219240. Cited by 1886
      • Taliun, D.; Harris, D.N.; Kessler, M.D.; et al. (2021). "Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program". Nature. 590 (7845): 290–299. Bibcode:2021Natur.590..290T. doi:10.1038/s41586-021-03205-y. PMC 7875770. PMID 33568819. Cited by 1369
      • Vilhjálmsson, BJ; et al. (2015). "Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores". Am J Hum Genet. 97 (4): 576–92. doi:10.1016/j.ajhg.2015.09.001. PMC 4596916. PMID 26430803. Cited by 1327

  • John M. Jumper

    John Michael Jumper (born 1 January 1985) is an American chemist and computer scientist. Jumper and Demis Hassabis were awarded the 2024 Nobel Prize in Chemistry for protein structure prediction. As of 2025, Jumper serves as a director at Google DeepMind. Jumper and his colleagues created AlphaFold, an artificial intelligence (AI) model that predicts protein structures from their amino acid sequences with high accuracy. The AlphaFold team had released 214 million protein structures as of January 2024. The scientific journal Nature included Jumper as one of the ten "people who mattered" in science in its annual listing, Nature's 10, in 2021.

    == Education ==

    Jumper graduated from Pulaski Academy in 2003. He received a Bachelor of Science with majors in physics and mathematics from Vanderbilt University in 2007; a Master of Philosophy in theoretical condensed matter physics from the University of Cambridge in 2010, where he was a student of St Edmund's College on a Marshall Scholarship; a Master of Science in theoretical chemistry from the University of Chicago in 2012; and a Doctor of Philosophy in theoretical chemistry from the University of Chicago in 2017. His doctoral advisors at the University of Chicago were Tobin R. Sosnick and Karl Freed.

    == Career and research ==

    Jumper's research investigates algorithms for protein structure prediction.

    === AlphaFold ===

    AlphaFold is a deep learning algorithm developed by Jumper and his team at DeepMind, a research lab acquired by Google's parent company Alphabet Inc. It is an artificial intelligence program that predicts protein structure.

    === Awards and honors ===

    In November 2020, AlphaFold was named the winner of the 14th Critical Assessment of Structure Prediction (CASP) competition. This international competition benchmarks algorithms to determine which one can best predict the 3D structure of proteins.
    AlphaFold outperformed the other algorithms, scoring above 90 for around two-thirds of the proteins in CASP's global distance test (GDT). The test measures how similar a computationally predicted structure is to the experimentally determined structure, with 100 being a complete match within the distance cutoff used for calculating GDT.

    In 2021, Jumper was awarded the BBVA Foundation Frontiers of Knowledge Award in the category "Biology and Biomedicine". In 2022, Jumper received the Wiley Prize in Biomedical Sciences, and for 2023 the Breakthrough Prize in Life Sciences, for developing AlphaFold, which accurately predicts the structure of a protein. In 2023, he was awarded the Canada Gairdner International Award and the Albert Lasker Award for Basic Medical Research. In 2024, Jumper and Demis Hassabis shared half of the Nobel Prize in Chemistry for their protein folding predictions; the other half went to David Baker for computational protein design. In 2025, Jumper received the Golden Plate Award of the American Academy of Achievement and the Marshall Medal of the Marshall Aid Commemoration Commission. He was elected a Fellow of the Royal Society (FRS) that same year. In 2026, he was elected a member of the National Academy of Engineering.

  • Spatial embedding

    Spatial embedding is one of the feature learning techniques used in spatial analysis, in which points, lines, polygons or other spatial data types representing geographic locations are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with many dimensions per geographic object to a continuous vector space with a much lower dimension. Such embedding methods allow complex spatial data to be used in neural networks and have been shown to improve performance in spatial analysis tasks.

    == Embedded data types ==

    Geographic data can take many forms: text, images, graphs, trajectories, polygons. Depending on the task, there may be a need to combine multimodal data from different sources. The following subsections describe examples of different types of data and their uses.

    === Text ===

    Geolocated posts on social media can be used to build a library of documents bound to a given place, which can later be transformed into embedded vectors using word embedding techniques.

    === Image ===

    Satellites and aircraft collect digital spatial data in the form of remotely sensed images, which can be used in machine learning. These images are sometimes hard to analyse using basic image analysis methods, and convolutional neural networks can be used to obtain an embedding of images bound to a given geographical object or region.

    === Point ===

    A single point of interest (POI) can be assigned multiple features that can be used in machine learning. These could be demographic, transportation, meteorological, or economic data, for example. When embedding single points, it is common to consider the entire set of available points as nodes in a graph.

    === Line / multiline ===

    Among other things, motion trajectories are represented as lines (multilines). Individual trajectories are embedded taking into account travel time, distances and also features of points visited along the way.
    Embedding trajectories makes it possible to improve performance on tasks such as clustering and categorization.

    === Polygon ===

    The geographic areas analyzed in machine learning are defined either by administrative boundaries or by a top-down division into grids of regular shapes, such as rectangles. Both types are represented as polygons and, like points, can be assigned different demographic, transportation, or economic features. A polygon can also have features related to the size or shape of the area it represents.

    === Graph ===

    An example domain where graph representation is used is the street layout of a city, where vertices can be intersections and edges can be roads. The vertices can also be destination points such as public transport stops or important points in the city, with the edges representing the flow between them. Embedding graphs or single vertices makes it possible to improve the accuracy of analysis methods in which the geographical domain can be represented as a network.

    == Usage ==

      • POI recommendation - generating personalized point of interest recommendations based on user preferences.
      • Next/future location prediction - predicting the next location a person will go to based on their historical trajectory.
      • Zone functions classification - predicting the function of a given area in a city based on the mobility of people or the POI distribution.
      • Crime prediction - estimating the crime rate in different regions of a city.
      • Local event detection - studying spatio-temporal changes in embeddings can provide valuable information for detecting local events occurring in a specific location.
      • Regional mobility popularity prediction - analysis of mobility can show patterns in the popularity of different regions in a city.
      • Shape matching - finding shapes similar to a given polygon, for example finding buildings with the same shape as an input building.
      • Travel time estimation - predicting travel time given current traffic conditions and special events.
      • Time estimation for on-demand food delivery - estimating delivery time when an order is placed through a website.

    == Temporal aspect ==

    Some of the data analyzed has a timestamp associated with it. In some analyses this information is omitted, and in others it is used to divide the data set into groups. The most common divisions are the separation of weekdays from weekends and the division into hours of the day. This is particularly important in the analysis of mobility data, because the characteristics of mobility during the week and at different times of the day differ greatly. Another area in which a time division can be used, for example into individual months, is the analysis of tourism in a given region. To take such a split into account, embedding methods either treat the timestamp specially or develop separate versions of the model for different subgroups of the analyzed set.
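
The weekday/weekend and hour-of-day division described above can be sketched as a simple grouping step applied before any embedding model is trained. This is only an illustration under assumed inputs: the record format (a timestamp paired with an arbitrary payload), the helper names and the sample trips are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def time_bucket(timestamp):
    """Map a timestamp to a (day type, hour of day) group key."""
    day_type = "weekend" if timestamp.weekday() >= 5 else "weekday"
    return day_type, timestamp.hour

def split_by_time(records):
    """Group (timestamp, payload) records by their time bucket."""
    groups = defaultdict(list)
    for timestamp, payload in records:
        groups[time_bucket(timestamp)].append(payload)
    return dict(groups)

# Hypothetical mobility records: two Monday-morning trips and one Saturday trip.
records = [
    (datetime(2024, 3, 4, 8, 15), "trip-a"),
    (datetime(2024, 3, 4, 8, 40), "trip-b"),
    (datetime(2024, 3, 9, 8, 5), "trip-c"),
]

groups = split_by_time(records)
print(groups)
```

Each resulting group could then be used to fit a separate version of a model, or the group key could be passed to a single model as an extra feature; which option fits better depends on the task.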

  • GENESIS (software)

    GENESIS (The General Neural Simulation System) is a simulation environment for constructing realistic models of neurobiological systems at many levels of scale, including sub-cellular processes, individual neurons, networks of neurons, and neuronal systems. These simulations are “computer-based implementations of models whose primary objective is to capture what is known of the anatomical structure and physiological characteristics of the neural system of interest”. GENESIS is intended to quantify the physical framework of the nervous system in a way that allows for easy understanding of the physical structure of the nerves in question. “At present only GENESIS allows parallelized modeling of single neurons and networks on multiple-instruction-multiple-data parallel computers.” Development of GENESIS software spread from its home at Caltech to labs at the University of Texas at San Antonio, the University of Antwerp, the National Centre for Biological Sciences in Bangalore, the University of Colorado, the Pittsburgh Supercomputing Center, the San Diego Supercomputer Center, and Emory University.

    == Neurons and Neural Systems ==

    GENESIS works by creating simulation environments for constructing models of neurons or neural systems. "Nerve cells are capable of communicating with each other in such a highly structured manner as to form neuronal networks. To understand neural networks, it is necessary to understand the ways in which one neuron communicates with another through synaptic connections and the process called synaptic transmission". Neurons have a structure specialized for their function: they "are different from most other cells in the body in that they are polarized and have distinct morphological regions, each with specific functions". The two important regions of a neuron are the dendrite and the axon. "Dendrites are the region where one neuron receives connections from other neurons.
    The cell body or soma contains the nucleus and the other organelles necessary for cellular function. The axon is a key component of nerve cells over which information is transmitted from one part of the neuron (e.g., the cell body) to the terminal regions of the neuron". The third important piece of a neuron is the synapse. "The synapse is the terminal region of the axon this is where one neuron forms a connection with another and conveys information through the process of synaptic transmission".

    Neural networks like the ones simulated with GENESIS software can quickly become highly complex and difficult to understand. "Just a few interconnected neurons (a microcircuit) can perform sophisticated tasks such as mediate reflexes, process sensory information, generate locomotion and mediate learning and memory. Even more complex networks, macrocircuits, consist of multiple embedded microcircuits. Macrocircuits mediate higher brain functions such as object recognition and cognition". GENESIS endeavors to simulate neural systems as they are found in nature. Often, "a neuron can receive contacts from up to 10,000 presynaptic neurons, and, in turn, any one neuron can contact up to 10,000 postsynaptic neurons. The combinatorial possibility could give rise to enormously complex neuronal circuits or network topologies, which might be very difficult to understand".

    == History ==

    GENESIS was developed in the Caltech laboratory of Dr. James M. Bower and first released to the public in 1988, in association with the first Methods in Computational Neuroscience Course at the Marine Biological Laboratory in Woods Hole, MA. Full source code for the software was released in the same year under an open software model for development. It is now supported by the Computational Biology Initiative at the University of Texas at San Antonio and is available free of charge, along with tutorial guides on its use.
    P-GENESIS, a parallel version of GENESIS, was first run in 1990 on the Intel Delta, which was the prototype for the Intel Paragon family of massively parallel supercomputers.

    == How GENESIS Works ==

    GENESIS is useful in creating a simulation environment for constructing models of neurobiological systems, such as:

      • sub-cellular processes
      • individual neurons
      • networks of neurons
      • neuronal systems

    The GENESIS system is complicated, but relatively easy to use. A user can enter commands in one of three ways: script files, the graphical user interface, or the GENESIS command shell. These commands are then processed by the script language interpreter. "The Script Language Interpreter processes commands entered through the keyboard, script files, or the graphical user interface, and passes them to the GENESIS simulation engine. The simulation engine also loads compiled object libraries, reads and writes data files, and interacts with the graphical user interface". Below is a graphical representation of the user input process and a sample GENESIS output.

    == Applications ==

    Most current applications for GENESIS involve realistic simulations of biological systems. It is usually used to simulate the behavior of larger brain structures, for example the cerebral cortex. These studies most often occur in lab courses in neural simulation at Caltech and the Marine Biological Laboratory at Woods Hole, Massachusetts. GENESIS can be used in combination with Yale University’s software called NEURON as a means for scientists to collaborate to construct a physical description of the nervous system. The GENESIS software can also be used with Kinetikit in the modeling of signal transduction pathways.

    GENESIS has been used in many studies. Some of these studies involve research that focuses on the development of software that would be useful across many disciplines. Others are studies of neurons, such as Purkinje cells.
These studies used GENESIS to simulate Purkinje cells and could be useful for the planning and development of later experiments using the GENESIS software. There may also be biomedical applications of the software. For example, St. Jude Medical in Europe has developed an implanted GENESIS device.

    Read more →
  • Dataism

    Dataism

    Dataism is a term that has been used to describe the mindset or philosophy created by the emerging significance of big data. It was first used by David Brooks in The New York Times in 2013. The term has been expanded to describe what historian Yuval Noah Harari, in his book Homo Deus: A Brief History of Tomorrow from 2015, calls an emerging ideology or even a new form of religion, in which "information flow" is the "supreme value". In art, the term was used by Albert-László Barabási to refer to an artistic movement that uses data as its primary source of inspiration. == History == "If you asked me to describe the rising philosophy of the day, I'd say it is Data-ism", wrote David Brooks in The New York Times in February 2013. Brooks argued that in a world of increasing complexity, relying on data could reduce cognitive biases and "illuminate patterns of behavior we haven't yet noticed". In 2015, Steve Lohr's book Data-ism looked at how Big Data is transforming society, using the term to describe the Big Data revolution. In his 2016 book Homo Deus: A Brief History of Tomorrow, Yuval Noah Harari argues that all competing political or social structures can be seen as data processing systems: "Dataism declares that the universe consists of data flows, and the value of any phenomenon or entity is determined by its contribution to data processing" and "we may interpret the entire human species as a single data processing system, with individual humans serving as its chips." According to Harari, a Dataist should want to "maximise dataflow by connecting to more and more media". Harari predicts that the logical conclusion of this process is that, eventually, humans will give algorithms the authority to make the most important decisions in their lives, such as whom to marry and which career to pursue. Harari argues that Aaron Swartz could be called the "first martyr" of Dataism. 
In 2022, Albert-László Barabási coined the term "Dataism" to define an artistic movement that positions data as the central means of understanding nature, society, technology, and human essence. This movement underscores the necessity for art to integrate with data to stay relevant in contemporary society. Dataism responds to the intricacy and interconnectedness of modern social, economic, and technological realms, which exceed individual understanding. Advocating for the use of methodologies from various fields like science, business, and politics in art, Dataism sees this fusion as essential for art to retain its significance and influence. == Criticism == Commenting on Harari's characterisation of Dataism, security analyst Daniel Miessler believes that Dataism does not present the challenge to the ideology of liberal humanism that Harari claims, because humans will simultaneously be able to believe in their own importance and that of data. Harari himself raises some criticisms, such as the problem of consciousness, which Dataism is unlikely to illuminate. Humans may also find out that organisms are not algorithms, he suggests. Dataism implies that all data, even personal data, must be public for the system to work as a whole, a requirement that is already meeting resistance today. Other analysts, such as Terry Ortleib, have looked at the extent to which Dataism poses a dystopian threat to humanity. The Facebook–Cambridge Analytica data scandal showed how political leaders manipulated Facebook's users' data to build specific psychological profiles that went on to manipulate the network. A team of data analysts reproduced the AI technology developed by Cambridge Analytica around Facebook's data and was able to define the following rules: 10 likes enable a machine to know a person as well as a coworker would, 70 likes as well as a friend would, 150 likes as well as a parent would, and 300 likes as well as a lover would; beyond that, it may be possible to know a person better than they know themselves.

    Read more →
  • John Schulman

    John Schulman

    John Schulman (born 1987 or 1988) is an American artificial intelligence researcher and co-founder of OpenAI. In August 2024, he announced he would be joining Anthropic. In February 2025, he announced he was leaving to join Thinking Machines Lab, where he is chief scientist. == Early life and education == Schulman had an interest in science and math from a young age. He enjoyed science fiction, especially the work of Isaac Asimov. When he was in seventh grade, he became deeply interested in the television program BattleBots, which featured combat between remote-controlled robots. In what he said was his first self-directed study, he read extensively in subject areas that would help him design a superior robot, but the robot he and his friends worked on was never built. He attended Great Neck South High School. He was a member of the US Physics Olympiad Team in 2005. In 2010, he graduated from Caltech with a degree in physics. He has a PhD in electrical engineering and computer sciences from the University of California, Berkeley, where he was advised by Pieter Abbeel. == Career == In December 2015, shortly before finishing his PhD, Schulman co-founded OpenAI with Sam Altman, Elon Musk, Ilya Sutskever, Greg Brockman, Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, Pamela Vagata, and Wojciech Zaremba, with Sam Altman and Elon Musk as the co-chairs. There, he led the reinforcement learning team that created ChatGPT. He has been referred to as the "architect" of ChatGPT. In August 2024, Schulman announced he would be joining Anthropic. He stated his move was to allow him to deepen his focus on AI alignment and return to more hands-on technical work. In February 2025, he announced he was leaving to join Thinking Machines Lab, where he is chief scientist. == Awards and honors == In 2025, Schulman received the Mark Bingham Award for Excellence in Achievement by Young Alumni from his alma mater, UC Berkeley.

    Read more →
  • The Visualization Handbook

    The Visualization Handbook

    The Visualization Handbook is a textbook by Charles D. Hansen and Christopher R. Johnson that serves as a survey of the field of scientific visualization by presenting the basic concepts and algorithms in addition to a current review of visualization research topics and tools. It is commonly used as a textbook for scientific visualization graduate courses. It is also commonly cited as a reference for scientific visualization and computer graphics in published papers, with almost 500 citations documented on Google Scholar. == Table of Contents == PART I - Introduction Overview of Visualization - William J. Schroeder and Kenneth M. Martin PART II - Scalar Field Visualization: Isosurfaces Accelerated Isosurface Extraction Approaches -Yarden Livnat Time-Dependent Isosurface Extraction - Han-Wei Shen Optimal Isosurface Extraction - Paolo Cignoni, Claudio Montani, Robert Scopigno, and Enrico Puppo Isosurface Extraction Using Extrema Graphs - Takayuki Itoh and Koji Koyamada Isosurfaces and Level-Sets - Ross Whitaker PART III - Scalar Field Visualization: Volume Rendering Overview of Volume Rendering - Arie E. Kaufman and Klaus Mueller Volume Rendering Using Splatting - Roger Crawfis, Daqing Xue, and Caixia Zhang Multidimensional Transfer Functions for Volume Rendering - Joe Kniss, Gordon Kindlmann, and Charles D. Hansen Pre-Integrated Volume Rendering - Martin Kraus and Thomas Ertl Hardware-Accelerated Volume Rendering - Hanspeter Pfister PART IV - Vector Field Visualization Overview of Flow Visualization - Daniel Weiskopf and Gordon Erlebacher Flow Textures: High-Resolution Flow Visualization - Gordon Erlebacher, Bruno Jobard, and Daniel Weiskopf Detection and Visualization of Vortices - Ming Jiang, Raghu Machiraju, and David Thompson PART V - Tensor Field Visualization Oriented Tensor Reconstruction - Leonid Zhukov and Alan H. 
Barr Diffusion Tensor MRI Visualization - Song Zhang, David Laidlaw, and Gordon Kindlmann Topological Methods for Flow Visualization - Gerik Scheuermann and Xavier Tricoche PART VI - Geometric Modeling for Visualization 3D Mesh Compression - Jarek Rossignac Variational Modeling Methods for Visualization - Hans Hagen and Ingrid Hotz Model Simplification - Jonathan D. Cohen and Dinesh Manocha PART VII - Virtual Environments for Visualization Direct Manipulation in Virtual Reality - Steve Bryson The Visual Haptic Workbench - Milan Ikits and J. Dean Brederson Virtual Geographic Information Systems - William Ribarsky Visualization Using Virtual Reality - R. Bowen Loftin, Jim X. Chen, and Larry Rosenblum PART VIII - Large-Scale Data Visualization Desktop Delivery: Access to Large Datasets - Philip D. Heermann and Constantine Pavlakos Techniques for Visualizing Time-Varying Volume Data - Kwan-Liu Ma and Eric B. Lum Large-Scale Data Visualization and Rendering: A Problem-Driven Approach - Patrick McCormick and James Ahrens Issues and Architectures in Large-Scale Data Visualization - Constantine Pavlakos and Philip D. Heermann Consuming Network Bandwidth with Visapult - Wes Bethel and John Shalf PART IX - Visualization Software and Frameworks The Visualization Toolkit - William J. Schroeder and Kenneth M. Martin Visualization in the SCIRun Problem-Solving Environment - David M. Weinstein, Steven Parker, Jenny Simpson, Kurt Zimmerman, and Greg M. Jones Numerical Algorithms Group IRIS Explorer - Jeremy Walton AVS and AVS/Express - Jean M. Favre and Mario Valle Vis5D, Cave5D, and VisAD - Bill Hibbard Visualization with AVS - W. T. Hewitt, Nigel W. John, Matthew D. Cooper, K. Yien Kwok, George W. Leaver, Joanna M. Leng, Paul G. Lever, Mary J. McDerby, James S. Perrin, Mark Riding, I. Ari Sadarjoen, Tobias M. Schiebeck, and Colin C. 
Venters ParaView: An End-User Tool for Large-Data Visualization - James Ahrens, Berk Geveci, and Charles Law The Insight Toolkit: An Open-Source Initiative in Data Segmentation and Registration - Terry S. Yoo amira: A Highly Interactive System for Visual Data Analysis - Detlev Stalling, Malte Westerhoff, and Hans-Christian Hege PART X - Perceptual Issues in Visualization Extending Visualization to Perceptualization: The Importance of Perception in Effective Communication of Information - David S. Ebert Art and Science in Visualization - Victoria Interrante Exploiting Human Visual Perception in Visualization - Alan Chalmers and Kirsten Cater PART XI - Selected Topics and Applications Scalable Network Visualization - Stephen G. Eick Visual Data-Mining Techniques - Daniel A. Keim, Mike Sips, and Mihael Ankerst Visualization in Weather and Climate Research - Don Middleton, Tim Scheitlin, and Bob Wilhelmson Painting and Visualization - Robert M. Kirby, Daniel F. Keefe, and David Laidlaw Visualization and Natural Control Systems for Microscopy - Russell M. Taylor II, David Borland, Frederick P. Brooks, Jr., Mike Falvo, Kevin Jeffay, Gail Jones, David Marshburn, Stergios J. Papadakis, Lu-Chang Qin, Adam Seeger, F. Donelson Smith, Dianne Sonnenwald, Richard Superfine, Sean Washburn, Chris Weigle, Mary Whitton, Leandra Vicci, Martin Guthold, Tom Hudson, Philip Williams, and Warren Robinett Visualization for Computational Accelerator Physics - Kwan-Liu Ma, Greg Schussman, and Brett Wilson

    Read more →
  • Issue tree

    Issue tree

    An issue tree, also called logic tree, is a graphical breakdown of a question that dissects it into its different components vertically and that progresses into details as it reads to the right. Issue trees are useful in problem solving to identify the root causes of a problem as well as to identify its potential solutions. They also provide a reference point to see how each piece fits into the whole picture of a problem. == Types == According to professor of strategy Arnaud Chevallier, elaborating an approach used at McKinsey & Company, there are two types of issue trees: diagnostic ones and solution ones. Diagnostic trees break down a "why" key question, identifying all the possible root causes for the problem. Solution trees break down a "how" key question, identifying all the possible alternatives to fix the problem. == Rules == Four basic rules can help ensure that issue trees are optimal, according to Chevallier: Consistently answer a "why" or a "how" question Progress from the key question to the analysis as it moves to the right Have branches that are mutually exclusive and collectively exhaustive (MECE) Use an insightful breakdown The requirement for issue trees to be collectively exhaustive implies that divergent thinking is a critical skill. == Applications == === In management interviews === Issue trees are used to answer questions in case interviews for management consulting positions. A quantitative type of question, the market sizing question, requires the interviewee to estimate the size of a data group such as a specific segment of a population, an amount of objects, a company's revenues, or similar. The candidates are expected to use a structured and logical method of arriving at their answer, and using an issue tree provides a diagram to aid the candidate's logical reasoning. Issue trees are used for other types of case interview questions as well.
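As an illustration of how an issue tree can structure a market sizing question, here is a minimal Python sketch. The `Node` class, the example question, and every number in it are hypothetical and chosen only for illustration; the sketch assumes that sibling branches are mutually exclusive, so that segment estimates can simply be added, while the drivers within one segment are multiplied.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Optional


@dataclass
class Node:
    """One question in an issue tree; children break it into sub-questions."""
    question: str
    value: Optional[float] = None      # estimate, set on leaves only
    combine: str = "sum"               # how child estimates roll up: "sum" or "product"
    children: List["Node"] = field(default_factory=list)

    def estimate(self) -> float:
        """Roll leaf estimates up to this node."""
        if not self.children:
            if self.value is None:
                raise ValueError(f"leaf without an estimate: {self.question}")
            return self.value
        parts = [child.estimate() for child in self.children]
        return sum(parts) if self.combine == "sum" else prod(parts)

    def render(self, depth: int = 0) -> str:
        """Text outline: deeper indentation means further to the right in the tree."""
        lines = ["  " * depth + self.question]
        for child in self.children:
            lines.append(child.render(depth + 1))
        return "\n".join(lines)


# Hypothetical question: how many cups of coffee are sold in a city per year?
residents = Node("Cups bought by residents", combine="product", children=[
    Node("Number of residents", 1_000_000),
    Node("Share who buy coffee", 0.5),
    Node("Cups per buyer per year", 200),
])
commuters = Node("Cups bought by commuters", combine="product", children=[
    Node("Number of commuters", 200_000),
    Node("Share who buy coffee", 0.5),
    Node("Cups per buyer per year", 100),
])
# The two segments are meant to be mutually exclusive, so they are summed.
root = Node("Cups of coffee sold per year", combine="sum",
            children=[residents, commuters])

print(root.render())
print(root.estimate())
```

Keeping the roll-up rule on each node makes the structure of the reasoning explicit: a reviewer can challenge a single leaf estimate without redoing the rest of the tree.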

    Read more →
  • Mathematical model

    Mathematical model

    A mathematical model is an abstract description of a concrete system using mathematical concepts and language. The process of developing a mathematical model is termed mathematical modeling. Mathematical models are used in many fields, including applied mathematics, natural sciences, social sciences and engineering. In particular, the field of operations research studies the use of mathematical modelling and related tools to solve problems in business or military operations. A model may help to characterize a system by studying the effects of different components, which may be used to make predictions about behavior or solve specific problems. == Elements of a mathematical model == Mathematical models can take many forms, including dynamical systems, statistical models, differential equations, or game theoretic models. These and other types of models can overlap, with a given model involving a variety of abstract structures. In many cases, the quality of a scientific field depends on how well the mathematical models developed on the theoretical side agree with results of repeatable experiments. Lack of agreement between theoretical mathematical models and experimental measurements often leads to important advances as better theories are developed. In the physical sciences, a traditional mathematical model contains most of the following elements: Governing equations Supplementary sub-models Defining equations Constitutive equations Assumptions and constraints Initial and boundary conditions Classical constraints and kinematic equations == Classifications == Mathematical models are of different types: === Linear vs. nonlinear === If all the operators in a mathematical model exhibit linearity, the resulting mathematical model is defined as linear. All other models are considered nonlinear. The definition of linearity and nonlinearity is dependent on context, and linear models may have nonlinear expressions in them. 
For example, in a statistical linear model, it is assumed that a relationship is linear in the parameters, but it may be nonlinear in the predictor variables. Similarly, a differential equation is said to be linear if it can be written with linear differential operators, but it can still have nonlinear expressions in it. In a mathematical programming model, if the objective functions and constraints are represented entirely by linear equations, then the model is regarded as a linear model. If one or more of the objective functions or constraints are represented with a nonlinear equation, then the model is known as a nonlinear model. Linear structure implies that a problem can be decomposed into simpler parts that can be treated independently or analyzed at a different scale, and therefore that the results will remain valid if the initial is recomposed or rescaled. Nonlinearity, even in fairly simple systems, is often associated with phenomena such as chaos and irreversibility. Although there are exceptions, nonlinear systems and models tend to be more difficult to study than linear ones. A common approach to nonlinear problems is linearization, but this can be problematic if one is trying to study aspects such as irreversibility, which are strongly tied to nonlinearity. === Static vs. dynamic === A dynamic model accounts for time-dependent changes in the state of the system, while a static (or steady-state) model calculates the system in equilibrium, and thus is time-invariant. Dynamic models are typically represented by differential equations or difference equations. === Explicit vs. implicit === If all of the input parameters of the overall model are known, and the output parameters can be calculated by a finite series of computations, the model is said to be explicit. But sometimes it is the output parameters which are known, and the corresponding inputs must be solved for by an iterative procedure, such as Newton's method or Broyden's method. 
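A minimal Python sketch of such an iterative inversion follows. The forward relation `x**3 + x`, the target output, and all function names are hypothetical and chosen only for illustration; the sketch uses Newton's method with an analytic derivative.

```python
def forward_model(x):
    """Explicit direction: a known input x gives the output directly."""
    return x**3 + x


def forward_derivative(x):
    """Derivative of forward_model, needed for the Newton update."""
    return 3 * x**2 + 1


def solve_input(target_y, x0=1.0, tol=1e-12, max_iter=50):
    """Implicit direction: find the input x whose output equals target_y."""
    x = x0
    for _ in range(max_iter):
        residual = forward_model(x) - target_y
        step = residual / forward_derivative(x)
        x -= step                      # Newton update
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


x_solution = solve_input(10.0)
print(x_solution)
```
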
In such a case the model is said to be implicit. For example, a jet engine's physical properties such as turbine and nozzle throat areas can be explicitly calculated given a design thermodynamic cycle (air and fuel flow rates, pressures, and temperatures) at a specific flight condition and power setting, but the engine's operating cycles at other flight conditions and power settings cannot be explicitly calculated from the constant physical properties. === Discrete vs. continuous === A discrete model treats objects as discrete, such as the particles in a molecular model or the states in a statistical model; while a continuous model represents the objects in a continuous manner, such as the velocity field of fluid in pipe flows, temperatures and stresses in a solid, and electric field that applies continuously over the entire model due to a point charge. === Deterministic vs. probabilistic (stochastic) === A deterministic model is one in which every set of variable states is uniquely determined by parameters in the model and by sets of previous states of these variables; therefore, a deterministic model always performs the same way for a given set of initial conditions. Conversely, in a stochastic model—usually called a "statistical model"—randomness is present, and variable states are not described by unique values, but rather by probability distributions. === Deductive, inductive, or floating === A deductive model is a logical structure based on a theory. An inductive model arises from empirical findings and generalization from them. If a model rests on neither theory nor observation, it may be described as a 'floating' model. Application of mathematics in social sciences outside of economics has been criticized for unfounded models. Application of catastrophe theory in science has been characterized as a floating model. === Strategic vs. 
non-strategic === Models used in game theory are distinct in the sense that they model agents with incompatible incentives, such as competing species or bidders in an auction. Strategic models assume that players are autonomous decision makers who rationally choose actions that maximize their objective function. A key challenge of using strategic models is defining and computing solution concepts such as the Nash equilibrium. An interesting property of strategic models is that they separate reasoning about rules of the game from reasoning about behavior of the players. == Construction == In business and engineering, mathematical models may be used to maximize a certain output. The system under consideration will require certain inputs. The system relating inputs to outputs depends on other variables too: decision variables, state variables, exogenous variables, and random variables. Decision variables are sometimes known as independent variables. Exogenous variables are sometimes known as parameters or constants. The variables are not independent of each other as the state variables are dependent on the decision, input, random, and exogenous variables. Furthermore, the output variables are dependent on the state of the system (represented by the state variables). Objectives and constraints of the system and its users can be represented as functions of the output variables or state variables. The objective functions will depend on the perspective of the model's user. Depending on the context, an objective function is also known as an index of performance, as it is some measure of interest to the user. Although there is no limit to the number of objective functions and constraints a model can have, using or optimizing the model becomes more involved (computationally) as the number increases. For example, economists often apply linear algebra when using input–output models. 
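As a small worked illustration of the linear algebra behind an input–output model, the Python sketch below solves a two-sector Leontief system, x = A x + d, for the total output x. The coefficient matrix `A`, the final demand `d`, and the function name are hypothetical values chosen only for illustration.

```python
def solve_two_sector(A, d):
    """Solve (I - A) x = d for a 2x2 input-output model using Cramer's rule."""
    # Build the matrix I - A.
    m11, m12 = 1 - A[0][0], -A[0][1]
    m21, m22 = -A[1][0], 1 - A[1][1]
    det = m11 * m22 - m12 * m21
    if det == 0:
        raise ValueError("I - A is singular; no unique solution")
    x1 = (d[0] * m22 - m12 * d[1]) / det
    x2 = (m11 * d[1] - m21 * d[0]) / det
    return [x1, x2]


# A[i][j]: units of sector i's output needed to make one unit of sector j's output.
A = [[0.2, 0.3],
     [0.4, 0.1]]
d = [100.0, 200.0]   # final demand for each sector

x = solve_two_sector(A, d)
print(x)             # total output each sector must produce
```

Here the two output levels are held in a single list `x`, so one symbol stands for several variables.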
Complicated mathematical models that have many variables may be consolidated by use of vectors where one symbol represents several variables. === A priori information === Mathematical modeling problems are often classified into black box or white box models, according to how much a priori information on the system is available. A black-box model is a system of which there is no a priori information available. A white-box model (also called glass box or clear box) is a system where all necessary information is available. Practically all systems are somewhere between the black-box and white-box models, so this concept is useful only as an intuitive guide for deciding which approach to take. Usually, it is preferable to use as much a priori information as possible to make the model more accurate. Therefore, the white-box models are usually considered easier, because if you have used the information correctly, then the model will behave correctly. Often the a priori information comes in forms of knowing the type of functions relating different variables. For example, if we make a model of how a medicine works in a human system, we know that usually the amount of medicine in the blood is an exponentially decaying function, but we are still left with several unknown parameters; how

    Read more →
  • LIFER/LADDER

    LIFER/LADDER

    LIFER/LADDER was one of the first database natural language processing systems. It was designed as a natural language interface to a database of information about US Navy ships. This system, as described in a paper by Hendrix (1978), used a semantic grammar to parse questions and query a distributed database. It was implemented in Interlisp. The LIFER/LADDER system could only support simple one-table queries or multiple table queries with easy join conditions. Some examples of queries it could accept: What are the length, width, and draft of the Kitty Hawk? When will Reeves achieve readiness rating C2? What is the nearest ship to Naples with a doctor on board? What ships are carrying cargo for the United States? Where are they going? Print the American cruisers’ current positions and states of readiness?
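To give a rough sense of what a semantic grammar does, here is a toy Python sketch. It is not the LIFER implementation (which was written in Interlisp); the pattern, the small lexicons, and the output query format are all hypothetical. The point it illustrates is that the grammar's categories are domain concepts (ship names, ship attributes) rather than syntactic ones (noun, verb).

```python
import re

# Hypothetical domain lexicons: categories are semantic, not syntactic.
ATTRIBUTES = {"length", "width", "draft"}
SHIPS = {"kitty hawk", "reeves"}

# One hypothetical rule: "What is/are the <ATTRIBUTE list> of [the] <SHIP>?"
PATTERN = re.compile(
    r"^what (?:is|are) the (?P<attrs>.+?) of (?:the )?(?P<ship>.+?)\??$",
    re.IGNORECASE,
)


def parse(question):
    """Map a question to a simple query description, or None if it does not fit."""
    match = PATTERN.match(question.strip())
    if not match:
        return None
    # Split "length, width, and draft" into its individual attribute words.
    attrs = [
        a.lower()
        for a in re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", match.group("attrs"))
        if a
    ]
    ship = match.group("ship").lower()
    if ship not in SHIPS or any(a not in ATTRIBUTES for a in attrs):
        return None
    return {"select": attrs, "from": "ship", "where": {"name": ship}}


query = parse("What are the length, width, and draft of the Kitty Hawk?")
print(query)
```

A question that does not fit the rule, or that uses words outside the lexicons, is rejected rather than guessed at.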

    Read more →
  • Shopify

    Shopify

    Shopify Inc., stylized as shopify, is a Canadian multinational e-commerce company headquartered in Ottawa, Ontario that operates a platform for retail point-of-sale systems. The company has over 5 million customers and processed US$292.3 billion in transactions in 2024, of which 57% was in the United States. Major customers include Tesla, LVMH, Nestlé, PepsiCo, AB InBev, Kraft Heinz, Lindt, Whole Foods Market, Red Bull, and Hyatt. The company's software has been praised for its ease of use and reasonable fee structure. It has been described as the "go-to e-commerce platform for startups". However, the company has faced criticism for allegedly inflating their sales data and for associating with controversial sellers. == History == === 2006: Founding === Shopify was founded in 2006 by friends Tobias Lütke, Daniel Weinand and Scott Lake after launching Snowdevil, an online store for snowboarding equipment, in 2004. Dissatisfied with the existing e-commerce products on the market, Lütke, a computer programmer by trade, instead built his own. Lütke used the open source web application framework Ruby on Rails to build Snowdevil's online store and launched it after two months of development. The Snowdevil founders launched the platform as Shopify in June 2006. Shopify created an open-source template language called Liquid, which is written in Ruby and has been used since 2006. In June 2009, Shopify launched an application programming interface (API) platform and App Store. The API allows developers to create applications for Shopify online stores and then sell them on the Shopify App Store. === 2010s === In January 2010, Shopify started its Build-A-Business competition, in which participants create a business using its commerce platform. The winners of the competition received cash prizes and mentorship from entrepreneurs, such as Richard Branson, Eric Ries and others. In April of that year, Shopify launched a free mobile app on the Apple App Store. 
The app allows Shopify store owners to view and manage their stores from iOS mobile devices. In December 2010, Shopify raised $7 million from a series A round from Bessemer Venture Partners, FirstMark Capital, and Felicis Ventures at a $20 million pre-money valuation. At that time, the company had annualized transaction value of $132 million. In October 2011, it raised $15 million in a Series B round. In August 2013, Shopify launched Shopify Payments in partnership with Stripe. Shopify Payments allows merchants to accept payments without requiring a third-party payment gateway. The company also announced the launch of a point of sale system to enable in-person sales in addition to online. The company received $100 million in Series C funding in December 2013. Shopify earned $105 million in revenue in 2014, twice as much as it raised the previous year. In February 2014, Shopify released "Shopify Plus" for large e-commerce businesses seeking access to additional features and support. Shopify went public via an initial public offering on May 21, 2015, raising more than $131 million. In September 2015, Amazon.com closed its Amazon Webstore service for merchants and selected Shopify as the preferred migration provider. In April 2016, Shopify announced Shopify Capital, a cash advance product. Shopify Capital was initially piloted to merchants within the US and allowed merchants to receive an advance on future earnings processed through its payment gateway. Since its launch in 2016, Shopify Capital has provided more than $5.1 billion in funding to Shopify merchants, with a maximum advance of $2 million. On June 7, 2016, Shopify launched its Shopify Plus Partners Program, to help agencies connect with evolving businesses in the e-commerce space. On October 3, 2016, Shopify acquired Boltmade. In November 2016, Shopify partnered with Paystack which allowed Nigerian online retailers to accept payments from customers around the world. 
On November 22, 2016, Shopify launched Frenzy, a mobile app that improves flash sales. In January 2017, Shopify announced integration with Amazon that would allow merchants to sell on Amazon from their Shopify stores. In April 2017, Shopify introduced its Chip & Swipe Reader, a Bluetooth enabled debit and credit card reader for brick and mortar retail purchases. The company has since released additional technology for brick and mortar retailers, including a point-of-sale system with a Dock and Retail Stand similar to that offered by Square, and a tappable chip card reader. Shopify announced a one-click accelerated checkout feature called Shopify Pay in April 2017 as an exclusive feature for merchants using Shopify Payments as their payment processor. Customers can save their shipping and payment information for future purchases from all participating Shopify stores. In November 2017 Shopify announced Arrive, a mobile application to help customers track packages from both Shopify merchants and other e-commerce websites. In September 2018, Shopify announced plans to expand its office space in Toronto's King West neighborhood in 2022 as part of "The Well" complex, jointly owned by Allied Properties REIT and RioCan REIT. In October 2018, Shopify opened its first flagship, a physical space for business owners in Los Angeles. The space offered educational classes, coworking space, a "genius bar" for companies that use Shopify software, and workshops. Online cannabis sales in Ontario, Canada, used Shopify's software when the drug was legalized in October 2018. Shopify's software is also used for in-person cannabis sales in Ontario since becoming legal in 2019. In January 2019, Shopify announced the launch of Shopify Studios, a full-service television and film content and production house. On March 22, 2019, Shopify and email marketing platform Mailchimp ended an integration agreement over disputes involving customer privacy and data collection. 
In April 2019, Shopify announced an integration with Snapchat to allow Shopify merchants to buy and manage Snapchat Story ads directly on the Shopify platform. The company had previously secured similar integration partnerships with Facebook and Google. On August 14, 2019, Shopify launched Shopify Chat, a new native chat function that allows merchants to have real-time conversations with customers visiting Shopify stores online. === 2020s === In January 2020, the company announced plans to hire in Vancouver, Canada. Additionally, the effects of the COVID-19 pandemic contributed to lifting stock prices. On February 21, 2020, Shopify announced plans to join the Diem Association, known as Libra Association at the time. Also that month, Shopify Pay was rebranded as Shop Pay. In April, Arrive was rebranded as Shop, combining both customer-facing features under a single brand. In May, during the COVID-19 pandemic, Shopify announced it would shift most of its global workforce to permanent remote work. It was reported that Shopify's valuation would likely rise on the back of options it had in the company Affirm that was expecting to go public shortly. In November 2020, Shopify announced a partnership with Alipay to support merchants with cross-border payments. Shopify also provided the opportunity for users to connect Alibaba and AliExpress to Shopify through an Alibaba Dropshipping app that could be purchased through the Shopify App Store. Multiple applications launched between 2021 and 2024 allowed customers to connect their Shopify store to their Alibaba account and then import and publish their products. The integration automatically syncs inventory and orders between both platforms so that Alibaba vendors can ship directly to dropshipping customers. As a result of Affirm's January 13, 2021 IPO, Shopify's 8% stake in Affirm was worth $2 billion. About half of Shopify's C-level executives left the company in early 2021. 
On June 29, 2021, Shopify removed the 20% revenue share for app developers that make less than US$1 million per year. On January 18, 2022, Shopify announced a partnership with JD.com to let U.S. merchants expand their operations in China, listing their products on JD's cross-border e-commerce platform JD Worldwide. On March 22, 2022, Shopify introduced Linkpop, a product to create a branded, social marketplace through which merchants can advertise and market their products via links to be added on social media channels. The following month, Shopify, Alphabet Inc., Meta Platforms, McKinsey & Company, and Stripe, Inc. announced a $925 million advance market commitment of carbon dioxide removal (CDR) from companies that are developing CDR technology over the next 9 years. In June 2022, Shopify partnered with Twitter. As a part of the deal, Twitter announced that it would launch a sales channel app for all of Shopify's U.S. merchants through its app store. Shopify also partnered with PayPal to offer Shopify Payments to merchants in France. On July 26, 2022, Lütke announced immediate layoffs totalling roughly 10 percent of its workforce. In

    Read more →
  • Social History and Industrial Classification

    Social History and Industrial Classification

    Social History and Industrial Classification (SHIC) is a classification system used by many British museums for social history and industrial collections. It was first published in 1983. == Purpose == SHIC classifies materials (books, objects, recordings etc.) by their interaction with the people who used them. For example, a carpenter's hammer is classified with other tools of the carpenter, and not with a blacksmith's hammer. In contrast, other classification systems, for example the Dewey Decimal Classification, might class all hammers together and close to the classification for other percussive tools. The specialist subject network, Social History Curator's Group (SHCG), obtained funding in 2012 to develop an online version, now on their website http://www.shcg.org.uk/ == Scheme == Materials are classified under four major category numbers: 1. Community life; 2. Domestic and family life; 3. Personal life; 4. Working life. Further classification within a category is by the use of further numbers after the decimal point. It is permissible to assign more than one classification in cases where the object had more than one use.

    Read more →
  • Semantic data model

    Semantic data model

    A semantic data model (SDM) is a high-level semantics-based database description and structuring formalism (database model) for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of those entities, and the structural interconnections among them. SDM provides a collection of high-level modeling primitives to capture the semantics of an application environment. By accommodating derived information in a database structural specification, SDM allows the same information to be viewed in several ways; this makes it possible to directly accommodate the variety of needs and processing requirements typically present in database applications. The design of the present SDM is based on our experience in using a preliminary version of it. SDM is designed to enhance the effectiveness and usability of database systems. An SDM database description can serve as a formal specification and documentation tool for a database; it can provide a basis for supporting a variety of powerful user interface facilities; it can serve as a conceptual database model in the database design process; and it can be used as the database model for a new kind of database management system. == In software engineering == A semantic data model in software engineering has various meanings: First, it is a conceptual data model in which semantic information is included. This means that the model describes the meaning of its instances. Such a semantic data model is an abstraction that defines how the stored symbols (the instance data) relate to the real world. 
Second, it is a conceptual data model that includes the capability to express and exchange information which enables parties to interpret meaning (semantics) from the instances, without the need to know the meta-model. Such semantic models are fact-oriented (as opposed to object-oriented). Facts are typically expressed by binary relations between data elements, whereas higher order relations are expressed as collections of binary relations. Typically binary relations have the form of triples: Object-RelationType-Object. For example: the Eiffel Tower <is located in> Paris. Typically the instance data of semantic data models explicitly include the kinds of relationships between the various data elements, such as <is located in>. To interpret the meaning of the facts from the instances, the meaning of the kinds of relations (relation types) must be known. Therefore, semantic data models typically standardize such relation types. This means that the second kind of semantic data model enables instances to express facts that include their own meanings. Semantic data models of the second kind are usually meant to create semantic databases. The ability to include meaning in semantic databases facilitates building distributed databases that enable applications to interpret the meaning from the content. This implies that semantic databases can be integrated when they use the same (standard) relation types. This also implies that in general they have a wider applicability than relational or object-oriented databases. == Overview == The logical data structure of a database management system (DBMS), whether hierarchical, network, or relational, cannot totally satisfy the requirements for a conceptual definition of data, because it is limited in scope and biased toward the implementation strategy employed by the DBMS. Therefore, the need to define data from a conceptual view has led to the development of semantic data modeling techniques. 
That is, techniques to define the meaning of data within the context of its interrelationships with other data, as illustrated in the figure. The real world, in terms of resources, ideas, events, etc., is symbolically defined within physical data stores. A semantic data model is an abstraction which defines how the stored symbols relate to the real world. Thus, the model must be a true representation of the real world. According to Klas and Schrefl (1995), the "overall goal of semantic data models is to capture more meaning of data by integrating relational concepts with more powerful abstraction concepts known from the Artificial Intelligence field. The idea is to provide high level modeling primitives as an integral part of a data model in order to facilitate the representation of real world situations". == History == The need for semantic data models was first recognized by the U.S. Air Force in the mid-1970s as a result of the Integrated Computer-Aided Manufacturing (ICAM) Program. The objective of this program was to increase manufacturing productivity through the systematic application of computer technology. The ICAM Program identified a need for better analysis and communication techniques for people involved in improving manufacturing productivity. As a result, the ICAM Program developed a series of techniques known as the IDEF (ICAM Definition) Methods, which included the following: IDEF0: used to produce a “function model”, which is a structured representation of the activities or processes within the environment or system. IDEF1: used to produce an “information model”, which represents the structure and semantics of information within the environment or system. IDEF1X: a semantic data modeling technique used to produce a graphical information model which represents the structure and semantics of information within an environment or system. 
Use of this standard permits the construction of semantic data models which may serve to support the management of data as a resource, the integration of information systems, and the building of computer databases. IDEF2: used to produce a “dynamics model”, which represents the time-varying behavioral characteristics of the environment or system. During the 1990s, the application of semantic modelling techniques resulted in the semantic data models of the second kind. An example is the semantic data model that is standardised as ISO 15926-2 (2002), which was further developed into the semantic modelling language Gellish (2005). The definition of the Gellish language is documented in the form of a semantic data model. Gellish itself is a semantic modelling language that can be used to create other semantic models. Those semantic models can be stored in Gellish Databases, which are semantic databases. == Applications == A semantic data model can be used to serve many purposes. Some key objectives include: Planning of data resources: A preliminary data model can be used to provide an overall view of the data required to run an enterprise. The model can then be analyzed to identify and scope projects to build shared data resources. Building of shareable databases: A fully developed model can be used to define an application-independent view of data which can be validated by users and then transformed into a physical database design for any of the various DBMS technologies. In addition to generating databases which are consistent and shareable, development costs can be drastically reduced through data modeling. Evaluation of vendor software: Since a data model actually represents the infrastructure of an organization, vendor software can be evaluated against a company’s data model in order to identify possible inconsistencies between the infrastructure implied by the software and the way the company actually does business. 
Integration of existing databases: By defining the contents of existing databases with semantic data models, an integrated data definition can be derived. With the proper technology, the resulting conceptual schema can be used to control transaction processing in a distributed database environment. The U.S. Air Force Integrated Information Support System (I2S2) is an experimental development and demonstration of this kind of technology, applied to heterogeneous DBMS environments.
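The fact-oriented form described under "In software engineering" (facts as Object-RelationType-Object triples, with standardized relation types) can be illustrated with a minimal Python sketch. It is not taken from any particular semantic database product: the `Fact` class, the `RELATION_TYPES` set, the helper `objects_related_to`, and the relation names themselves are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical set of standardized relation types. A real semantic model
# would take these from a shared standard rather than define them ad hoc.
RELATION_TYPES = {"is located in", "is classified as"}


@dataclass(frozen=True)
class Fact:
    """One binary relation in the form Object-RelationType-Object."""
    left: str
    relation: str
    right: str

    def __post_init__(self):
        # Reject facts whose relation type is not in the agreed vocabulary,
        # so every stored fact can be interpreted by any reader of the data.
        if self.relation not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {self.relation!r}")


def objects_related_to(facts, relation, right):
    """Return the left-hand objects of all facts matching relation and right."""
    return [f.left for f in facts if f.relation == relation and f.right == right]


facts = [
    Fact("the Eiffel Tower", "is located in", "Paris"),
    Fact("the Eiffel Tower", "is classified as", "tower"),
    Fact("the Louvre", "is located in", "Paris"),
]

# Query the facts without knowing anything beyond the relation type.
in_paris = objects_related_to(facts, "is located in", "Paris")
```

Because each fact carries its relation type explicitly, two such fact collections that share the same relation vocabulary could in principle be concatenated and queried together, which is the integration property the text attributes to semantic databases.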

    Read more →