Ashutosh Saxena is an Indian-American computer scientist, researcher, and entrepreneur known for his contributions to the field of artificial intelligence and large-scale robot learning. His interests include building enterprise AI agents and embodied AI. Saxena is the co-founder and CEO of Caspar.AI, where generative AI parses data from ambient 3D radar sensors to predict 20+ health & wellness markers for pro-active patient care. Prior to Caspar.AI, Ashutosh co-founded Cognical Katapult (NSDQ: KPLT), which provides a no credit required alternative to traditional financing for online and omni-channel retail. Before Katapult, Saxena was an assistant professor in the Computer Science Department and faculty director of the RoboBrain Project (a large-scale AI model for robotics) at Cornell University. == Education == In 2009, with artificial intelligence pioneer Andrew Ng as his advisor, Saxena received both his M.S. and Ph.D. in computer science with an emphasis on artificial intelligence from Stanford University. Saxena received his bachelor's degree in electrical engineering from the Indian Institute of Technology, Kanpur in 2004. == Career == Saxena was the chief scientist of New York-based Holopad, where he worked with Steven Spielberg's team to create walkthroughs and 3D experiences for his movie TinTin. His past experiences include building acoustic AI models at Bose Corporation. Once Ashutosh completed his undergraduate degree, he became a researcher at the Commonwealth Scientific and Industrial Research Organization, where he developed AI models for medical devices. Before Caspar, Saxena pursued other entrepreneurial ventures, such as ZunaVision, an artificial intelligence startup he co-founded with Andrew Ng that uses AI to embed advertising space within videos. Ashutosh served as the CTO of ZunaVision from 2008 to 2010. After ZunaVision, Saxena co-founded Cognical Katapult, which provided financing solutions to nonprime and underbanked consumers powered by artificial intelligence. From 2014 to 2016, Saxena served as the faculty director of the RoboBrain project, which was a joint venture that he started between Stanford University, Cornell University, Brown University, and the University of California, Berkeley that made a knowledge engine for robots. Saxena co-founded Brain of Things in 2015 with David Cheriton, who serves as chief scientist, and was listed as the fastest growing private company reaching an annual recurring revenue of $8 million in three years. It has been widely covered in several outlets including Forbes Japan, and MIT Technology Review. Saxena's work on deep learning won test of time award in 2023 by Robotics Science and Systems. Ashutosh has been recognized for his work by receiving the Alfred P. Sloan Fellow in 2011, Google Faculty Research Award in 2012, Microsoft Faculty Fellowship in 2012, NSF Career award in 2013, One of the Eight Innovators to Watch by the Smithsonian Institution in 2015, and received TR35 Innovator Award by MIT Technology Review in 2018. He was named by San Francisco Business Times as a 40 under 40 young business leader. == Research == Saxena has authored over 100 published papers in the areas of large-scale robot learning and artificial intelligence, with 20,000+ citations. His work in the fields of computer vision and deep learning have been featured in press releases and academic journal reviews. Ashutosh's early work includes the Stanford Artificial Intelligence Robot (STAIR), an AI models that enables to perform tasks such as unload items from a dishwasher, which was covered on the front-page of New York Times. His work on Make3D, was the first work that estimated 3D depth from a single still image. At Cornell University, Ashutosh led the Robot Learning Lab, which used a machine learning approach to train robots to perform tasks in human environments such as generalizing manipulation in 3D point-clouds where robots learn to transfer manipulation trajectories to novel objects utilizing a large sample of demonstrations from crowdsourcing.
Object co-segmentation
In computer vision, object co-segmentation is a special case of image segmentation, which is defined as jointly segmenting semantically similar objects in multiple images or video frames. == Challenges == It is often challenging to extract segmentation masks of a target/object from a noisy collection of images or video frames, which involves object discovery coupled with segmentation. A noisy collection implies that the object/target is present sporadically in a set of images or the object/target disappears intermittently throughout the video of interest. Early methods typically involve mid-level representations such as object proposals. == Dynamic Markov networks-based methods == A joint object discover and co-segmentation method based on coupled dynamic Markov networks has been proposed recently, which claims significant improvements in robustness against irrelevant/noisy video frames. Unlike previous efforts which conveniently assumes the consistent presence of the target objects throughout the input video, this coupled dual dynamic Markov network based algorithm simultaneously carries out both the detection and segmentation tasks with two respective Markov networks jointly updated via belief propagation. Specifically, the Markov network responsible for segmentation is initialized with superpixels and provides information for its Markov counterpart responsible for the object detection task. Conversely, the Markov network responsible for detection builds the object proposal graph with inputs including the spatio-temporal segmentation tubes. == Graph cut-based methods == Graph cut optimization is a popular tool in computer vision, especially in earlier image segmentation applications. As an extension of regular graph cuts, multi-level hypergraph cut is proposed to account for more complex high order correspondences among video groups beyond typical pairwise correlations. With such hypergraph extension, multiple modalities of correspondences, including low-level appearance, saliency, coherent motion and high level features such as object regions, could be seamlessly incorporated in the hyperedge computation. In addition, as a core advantage over co-occurrence based approach, hypergraph implicitly retains more complex correspondences among its vertices, with the hyperedge weights conveniently computed by eigenvalue decomposition of Laplacian matrices. == CNN/LSTM-based methods == In action localization applications, object co-segmentation is also implemented as the segment-tube spatio-temporal detector. Inspired by the recent spatio-temporal action localization efforts with tubelets (sequences of bounding boxes), Le et al. present a new spatio-temporal action localization detector Segment-tube, which consists of sequences of per-frame segmentation masks. This Segment-tube detector can temporally pinpoint the starting/ending frame of each action category in the presence of preceding/subsequent interference actions in untrimmed videos. Simultaneously, the Segment-tube detector produces per-frame segmentation masks instead of bounding boxes, offering superior spatial accuracy to tubelets. This is achieved by alternating iterative optimization between temporal action localization and spatial action segmentation. The proposed segment-tube detector is illustrated in the flowchart on the right. The sample input is an untrimmed video containing all frames in a pair figure skating video, with only a portion of these frames belonging to a relevant category (e.g., the DeathSpirals). Initialized with saliency based image segmentation on individual frames, this method first performs temporal action localization step with a cascaded 3D CNN and LSTM, and pinpoints the starting frame and the ending frame of a target action with a coarse-to-fine strategy. Subsequently, the segment-tube detector refines per-frame spatial segmentation with graph cut by focusing on relevant frames identified by the temporal action localization step. The optimization alternates between the temporal action localization and spatial action segmentation in an iterative manner. Upon practical convergence, the final spatio-temporal action localization results are obtained in the format of a sequence of per-frame segmentation masks (bottom row in the flowchart) with precise starting/ending frames.
GPU switching
GPU switching is a mechanism used on computers with multiple graphic controllers. This mechanism allows the user to either maximize the graphic performance or prolong battery life by switching between the graphic cards. It is mostly used on gaming laptops which usually have an integrated graphic device and a discrete video card. == Basic components == Most computers using this feature contain integrated graphics processors and dedicated graphics cards that applies to the following categories. === Integrated graphics === Also known as: Integrated graphics, shared graphics solutions, integrated graphics processors (IGP) or unified memory architecture (UMA). This kind of graphics processors usually have much fewer processing units and share the same memory with the CPU. Sometimes the graphics processors are integrated onto a motherboard. It is commonly known as: on-board graphics. A motherboard with on-board graphics processors doesn't require a discrete graphics card or a CPU with graphics processors to operate. === Dedicated graphics cards === Also known as: discrete graphics cards. Unlike integrated graphics, dedicated graphics cards have much more processing units and have its own RAM with much higher memory bandwidth. In some cases, a dedicated graphics chip can be integrated onto the motherboards, B150-GP104 for example. Regardless of the fact that the graphics chip is integrated, it is still counted as a dedicated graphics cards system because the graphics chip is integrated with its own memory. == Theory == Most Personal Computers have a motherboard that uses a Southbridge and Northbridge structure. === Northbridge control === The Northbridge is one of the core logic chipset that handles communications between the CPU, GPU, RAM and the Southbridge. The discrete graphics card is usually installed onto the graphics card slot such as PCI-Express and the integrated graphics is integrated onto the CPU itself or occasionally onto the Northbridge. The Northbridge is the most responsible for switching between GPUs. The way how it works usually has the following process (refer to the Figure 1. on the right): The Northbridge receives input from Southbridge through the internal bus. The Northbridge signals to CPU through the Front-side bus. The CPU runs the task assignment application (usually the graphics card driver) to determine which GPU core to use. The CPU passes down the command to the Northbridge. The Northbridge passes down the command to the according GPU core. The GPU core processes the command and returns the rendered data back to the Northbridge. The Northbridge sends the rendered data back to Southbridge. === Southbridge control === The Southbridge is a set of integrated circuits such Intel's I/O Controller Hub (ICH). It handles all of a computer's I/O functions, such as receiving the keyboard input and outputting the data onto the screen. The way how it usually works usually has two steps: Take in the user input and pass it down to the Northbridge. (Optional) Receive the rendered data from the Northbridge and output it. The reason why the second step can be optional is that sometimes the rendered the data is outputted directly from the discrete graphics card which is located on the graphics card slot so there is no need to output the data through the Southbridge. == Main purpose == GPU switching is mostly used for saving energy by switching between graphic cards. The dedicated graphics cards consume much more power than integrated graphics but also provides higher 3D performances, which is needed for a better gaming and CAD experience. Following is a list of the TDPs of the most popular CPU with integrated graphics and dedicated graphics cards. The dedicated graphics cards exhibit much higher power consumption than the integrated graphics on both platforms. Disabling them when no heavy graphics processing is needed can significantly lower the power consumption. == Technologies == === Nvidia Optimus === Nvidia Optimus™ is a computer GPU switching technology created by Nvidia that can dynamically and seamlessly switch between two graphic cards based on running programs. === AMD Enduro === AMD Enduro™ is a collective brand developed by AMD that features many new technologies that can significantly save power. It was previously named as: PowerXpress and Dynamic Switchable Graphics (DSG). This technology implements a sophisticated system to predict the potential usage need for graphics cards and switch between graphics cards based on predicted need. This technology also introduces a new power control plan that allows the discrete graphics cards consume no energy when idling. == Manufacturers == === Integrated graphics === In personal computers, the IGP (integrated graphics processors) are mostly manufactured by Intel and AMD and are integrated onto their CPUs. They are commonly known as: Intel HD and Iris Graphics - also called HD series and Iris series AMD Accelerated Processing Unit (APU) - also formerly known as: fusion === Dedicated graphics cards === The most popular dedicated graphics cards are manufactured by AMD and Nvidia. They are commonly known as: AMD Radeon Nvidia GeForce == Drivers and OS support == Most common operating systems have built-in support for this feature. However, the users may download the updated drivers from Nvidia or AMD for better experience. === Windows support === Windows 7 has built-in support for this feature. The system automatically switches between GPUs depending on the program that's running. However, the user may switch the GPUs manually through device manager or power manager. === Linux === Modern Linux systems handle hybrid graphics in two parts: power/control for the inactive GPU, and optional render offloading for individual applications. vga_switcheroo (in the kernel since 2.6.34) coordinates power and mux control on systems with multiple GPUs. It was designed primarily for muxed designs (hardware display switch), and on muxless laptops it is typically used only for power control. A display server restart is no longer required for offloading on muxless systems. DRI PRIME (Mesa) enables per-process render offload on muxless systems: an app renders on the discrete GPU and the integrated GPU presents the result. Users can opt in via the DRI_PRIME environment variable (e.g., DRI_PRIME=1) or desktop integration. On GNOME, the switcheroo-control service exposes the discrete GPU to the shell, adding a “Launch using Discrete Graphics Card” entry to app menus on supported systems (Wayland or Xorg), which invokes render offload under the hood. With the proprietary Nvidia driver, render offload is provided as PRIME Render Offload (supported since driver 435.xx). Distributions commonly ship a helper like prime-run or desktop menu entries that set the required environment for offloading. ==== Notes and limitations (Linux) ==== On muxless systems the internal display is hard-wired to the integrated GPU; the discrete GPU cannot directly drive that panel and instead renders offscreen for composition by the iGPU. External displays connected to the dGPU may allow direct output depending on the laptop’s wiring. Power-saving behavior varies by driver and distro defaults. Some setups need explicit configuration to power down the inactive GPU when idle. Desktop integrations (e.g., GNOME's menu item) simply opt an app into offload; they do not "auto-switch" the whole session. Users can still launch apps on either GPU as needed.
Polyfill (programming)
In software development, a polyfill is code that implements a new standard feature of a deployment environment within an old version of that environment that does not natively support the feature. Most often, it refers to JavaScript code that implements an HTML5 or CSS web standard, either an established standard (supported by some browsers) on older browsers, or a proposed standard (not supported by any browsers) on existing browsers. Polyfills are also used in PHP and Python. Polyfills allow web developers to use an API regardless of whether or not it is supported by a browser, and usually with minimal overhead. Typically they first check if a browser supports an API, and use it if available, otherwise using their own implementation. Polyfills themselves use other, more supported features, and thus different polyfills may be needed for different browsers. The term is also used as a verb: polyfilling is providing a polyfill for a feature. == Definition == The term is a neologism, coined by Remy Sharp, who required a word that meant "replicate an API using JavaScript (or Flash or whatever) if the browser doesn’t have it natively" while co-writing the book Introducing HTML5 in 2009. Formally, "a shim is a library that brings a new API to an older environment, using only the means of that environment." Polyfills exactly fit this definition; the term shim was also used for early polyfills. However, to Sharp shim connoted non-transparent APIs and workarounds, such as spacer GIFs for layout, sometimes known as shim.gif, and similar terms such as progressive enhancement and graceful degradation were not appropriate, so he invented a new term. The term is based on the multipurpose filling paste brand Polyfilla, a paste used to cover up cracks and holes in walls, and the meaning "fill in holes (in functionality) in many (poly-) ways." The word has since gained popularity, particularly due to its use by Paul Irish and in Modernizr documentation. The distinction that Sharp makes is: What makes a polyfill different from the techniques we have already, like a shim, is this: if you removed the polyfill script, your code would continue to work, without any changes required in spite of the polyfill being removed. This distinction is not drawn by other authors. At times various other distinctions are drawn between shims, polyfills, and fallbacks, but there are no generally accepted distinctions: most consider polyfills a form of shim. The term polyfiller is also occasionally found. == Examples == === core-js === core-js is one of the most popular JavaScript standard library polyfills. Includes polyfills for ECMAScript up to the latest version of the standard: promises, symbols, collections, iterators, typed arrays, many other features, ECMAScript proposals, some cross-platform WHATWG / W3C features and proposals like URL. You can load only required features or use it without global namespace pollution. It can be integrated with Babel, which allows it to automatically inject required core-js modules into your code. === html5shiv === In IE versions prior to 9, unknown HTML elements like
Digital exhibition
Digital Exhibition includes both the projection technologies, such as High Definition, and delivery technologies of a film to a movie theater. Delivery technologies include disk drives, satellite relay, and fiber optics. This can save money in distribution but is usually more expensive overall due to maintenance and standardization of technology. However, there are benefits to digital exhibition, for example it requires less assembly by the exhibitor and can contain the trailers that the distributor wishes.
Multimodal representation learning
Multimodal representation learning is a subfield of representation learning focused on integrating and interpreting information from different modalities, such as text, images, audio, or video, by projecting them into a shared latent space. This allows for semantically similar content across modalities to be mapped to nearby points within that space, facilitating a unified understanding of diverse data types. By automatically learning meaningful features from each modality and capturing their inter-modal relationships, multimodal representation learning enables a unified representation that enhances performance in cross-media analysis tasks such as video classification, event detection, and sentiment analysis. It also supports cross-modal retrieval and translation, including image captioning, video description, and text-to-image synthesis. == Motivation == The primary motivations for multimodal representation learning arise from the inherent nature of real-world data and the limitations of unimodal approaches. Since multimodal data offers complementary and supplementary information about an object or event from different perspectives, it is more informative than relying on a single modality. A key motivation is to narrow the heterogeneity gap that exists between different modalities by projecting their features into a shared semantic subspace. This allows semantically similar content across modalities to be represented by similar vectors, facilitating the understanding of relationships and correlations between them. Multimodal representation learning aims to leverage the unique information provided by each modality to achieve a more comprehensive and accurate understanding of concepts. These unified representations are crucial for improving performance in various cross-media analysis tasks such as video classification, event detection, and sentiment analysis. They also enable cross-modal retrieval, allowing users to search and retrieve content across different modalities. Additionally, it facilitates cross-modal translation, where information can be converted from one modality to another, as seen in applications like image captioning and text-to-image synthesis. The abundance of ubiquitous multimodal data in real-world applications, including understudied areas like healthcare, finance, and human-computer interaction (HCI), further motivates the development of effective multimodal representation learning techniques. == Approaches and methods == === Canonical-correlation analysis based methods === Canonical-correlation analysis (CCA) was first introduced in 1936 by Harold Hotelling and is a fundamental approach for multimodal learning. CCA aims to find linear relationships between two sets of variables. Given two data matrices X ∈ R n × p {\displaystyle X\in \mathbb {R} ^{n\times p}} and Y ∈ R n × q {\displaystyle Y\in \mathbb {R} ^{n\times q}} representing different modalities, CCA finds projection vectors w x ∈ R p {\displaystyle w_{x}\in \mathbb {R} ^{p}} and w y ∈ R q {\displaystyle w_{y}\in \mathbb {R} ^{q}} that maximizes the correlation between the projected variables: ρ = max w x , w y w x ⊤ Σ x y w y w x ⊤ Σ x x w x w y ⊤ Σ y y w y {\displaystyle \rho =\max _{w_{x},w_{y}}{\frac {w_{x}^{\top }\Sigma _{xy}w_{y}}{{\sqrt {w_{x}^{\top }\Sigma _{xx}w_{x}}}{\sqrt {w_{y}^{\top }\Sigma _{yy}w_{y}}}}}} such that Σ x x {\displaystyle \Sigma _{xx}} and Σ y y {\displaystyle \Sigma _{yy}} are the within-modality covariance matrices, and Σ x y {\displaystyle \Sigma _{xy}} is the between-modality covariance matrix. However, standard CCA is limited by its linearity, which led to the development of nonlinear extensions, such as kernel CCA and deep CCA. ==== Kernel CCA ==== Kernel canonical correlation analysis (KCCA) extends traditional CCA to capture nonlinear relationships between modalities by implicitly mapping the data into high dimensional feature spaces using kernel functions. Given kernel functions K x {\displaystyle K_{x}} and K y {\displaystyle K_{y}} with corresponding Gram matrices K x ∈ R n × n {\displaystyle K_{x}\in \mathbb {R} ^{n\times n}} and K y ∈ R n × n {\displaystyle K_{y}\in \mathbb {R} ^{n\times n}} , KCCA seeks coefficients α {\displaystyle \alpha } and β {\displaystyle \beta } that maximize: ρ = max α , β α ⊤ K x K y β α ⊤ K x 2 α β ⊤ K y 2 β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{\top }K_{x}Ky\beta }{{\sqrt {\alpha ^{\top }K_{x}^{2}\alpha }}{\sqrt {\beta ^{\top }K_{y}^{2}\beta }}}}} To prevent overfitting, regularization terms are typically added, resulting in: ρ = max α , β α T K x K y β α T ( K x 2 + λ x K x ) α β T ( K y 2 + λ y K y ) β {\displaystyle \rho =\max _{\alpha ,\beta }{\frac {\alpha ^{T}K_{x}K_{y}\beta }{{\sqrt {\alpha ^{T}\left(K_{x}^{2}+\lambda _{x}K_{x}\right)\alpha }}{\sqrt {\;\beta ^{T}\left(K_{y}^{2}+\lambda _{y}K_{y}\right)\beta }}}}} where λ x {\displaystyle \lambda _{x}} and λ y {\displaystyle \lambda _{y}} are regularization parameters. KCCA has proven effective for tasks such as cross-modal retrieval and semantic analysis, though it faces computational challenges with large datasets due to its O ( n 2 ) {\displaystyle O(n^{2})} memory requirement for sorting kernel matrices. KCCA was proposed independently by several researchers. ==== Deep CCA ==== Deep canonical correlation analysis (DCCA), introduced in 2013, employs neural networks to learn nonlinear transformations for maximizing the correlation between modalities. DCCA uses separate neural networks f x {\displaystyle f_{x}} and f y {\displaystyle f_{y}} for each modality to transform the original data before applying CCA: max W x , W y , θ x , θ y corr ( f x ( X ; θ x ) , f y ( Y ; θ y ) ) {\displaystyle \max _{W_{x},W_{y},\theta _{x},\theta _{y}}\operatorname {corr} \left(f_{x}(X;\theta _{x}),f_{y}(Y;\theta _{y})\right)} where θ x {\displaystyle \theta _{x}} and θ y {\displaystyle \theta _{y}} represent the parameters of the neural networks, and W x {\displaystyle W_{x}} and W y {\displaystyle W_{y}} are the CCA projection matrices. The correlation objective is computed as: corr ( H x , H y ) = tr ( T − 1 / 2 H x T H y S − 1 / 2 ) {\displaystyle \operatorname {corr} (H_{x},H_{y})=\operatorname {tr} \left(T^{-1/2}H_{x}^{T}H_{y}S^{-1/2}\right)} where H x = f x ( X ) {\displaystyle H_{x}=f_{x}(X)} and H y = f y ( Y ) {\displaystyle H_{y}=f_{y}(Y)} are the network outputs, T = H x T H x + r x I {\displaystyle T=H_{x}^{T}H_{x}+r_{x}I} , S = H y T H y + r y I {\displaystyle S=H_{y}^{T}H_{y}+r_{y}I} and r x , r y {\displaystyle r_{x},r_{y}} are the regularization parameters. DCCA overcomes the limitations of linear CCA and kernel CCA by learning complex nonlinear relationships while maintaining computational efficiency for large datasets through mini-batch optimization. === Graph-based methods === Graph-based approaches for multimodal representation learning leverage graph structure to model relationships between entities across different modalities. These methods typically represent each modality as a graph and then learn embedding that preserve cross-modal similarities, enabling more effective joint representation of heterogeneous data. One such method is cross-modal graph neural networks (CMGNNs) that extend traditional graph neural networks (GNNs) to handle data from multiple modalities by constructing graphs that capture both intra-modal and inter-modal relationships. These networks model interactions across modalities by representing them as nodes and their relationships as edges. Other graph-based methods include Probabilistic Graphical Models (PGMs) such as deep belief networks (DBN) and deep Boltzmann machines (DBM). These models can learn a joint representation across modalities, for instance, a multimodal DBN achieves this by adding a shared restricted Boltzmann Machine (RBM) hidden layer on top of modality-specific DBNs. Additionally, the structure of data in some domains like Human-Computer Interaction (HCI), such as the view hierarchy of app screens, can potentially be modeled using graph-like structures. The field of graph representation learning is also relevant, with ongoing progress in developing evaluation benchmarks. === Diffusion maps === Another set of methods relevant to multimodal representation learning are based on diffusion maps and their extensions to handle multiple modalities. ==== Multi-view diffusion maps ==== Multi-view diffusion maps address the challenge of achieving multi-view dimensionality reduction by effectively utilizing the availability of multiple views to extract a coherent low-dimensional representation of the data. The core idea is to exploit both the intrinsic relations within each view and the mutual relations between the different views, defining a cross-view model where a random walk process implicitly hops between objects in different views. A multi-view kernel matrix is constructed by combining these relations, defining a cross-view diffusion process and associ
Hydration (web development)
In web development, hydration or rehydration is a technique in which client-side JavaScript converts a web page that is static from the perspective of the web browser, delivered either through static rendering or server-side rendering, into a dynamic web page by attaching event handlers to the HTML elements in the DOM. Because the HTML is pre-rendered on a server, this allows for a fast "first contentful paint" (when useful data is first displayed to the user), but there is a period of time afterward where the page appears to be fully loaded and interactive, but is not until the client-side JavaScript is executed and event handlers have been attached. Frameworks that use hydration include Next.js and Nuxt. React v16.0 introduced a "hydrate" function, which hydrates an element, in its API. == Variations == === Streaming server-side rendering === Streaming server-side rendering allows one to send HTML in chunks that the browser can progressively render as it is received. This can provide a fast first paint and first contentful paint as HTML markup arrives to users faster. === Progressive rehydration === In progressive rehydration, individual pieces of a server-rendered application are “booted up” over time, rather than the current common approach of initializing the entire application at once. This can help reduce the amount of JavaScript required to make pages interactive, since client-side upgrading of low priority parts of the page can be deferred to prevent blocking the main thread. It can also help avoid one of the most common server-side rendering rehydration pitfalls, where a server-rendered DOM tree gets destroyed and then immediately rebuilt – most often because the initial synchronous client-side render required data that wasn't quite ready, perhaps awaiting Promise resolution. === Partial rehydration === Partial rehydration has proven difficult to implement. This approach is an extension of the idea of progressive rehydration, where the individual pieces (components/views/trees) to be progressively rehydrated are analyzed and those with little interactivity or no reactivity are identified. For each of these mostly-static parts, the corresponding JavaScript code is then transformed into inert references and decorative functionality, reducing their client-side footprint to near-zero. The partial hydration approach comes with its own issues and compromises. It poses some interesting challenges for caching, and client-side navigation means it cannot be assumed that server-rendered HTML for inert parts of the application will be available without a full page load. One framework that supports partial rehydration is Elder.js, which is based on Svelte. === Trisomorphic rendering === Trisomorphic rendering is a technique which uses streaming server-side rendering for initial/non-JavaScript navigations, and then uses service workers to take on rendering of HTML for navigations after it has been installed. This can keep cached components and templates up to date and enables SPA-style navigations for rendering new views in the same session. This approach works best when one can share the same templating and routing code between the server, client page, and service worker.