Higuchi dimension

Higuchi dimension

In fractal geometry, the Higuchi dimension (or Higuchi fractal dimension (HFD)) is an approximate value for the box-counting dimension of the graph of a real-valued function or time series. This value is obtained via an algorithmic approximation so one also talks about the Higuchi method. It has many applications in science and engineering and has been applied to subjects like characterizing primary waves in seismograms, clinical neurophysiology and analyzing changes in the electroencephalogram in Alzheimer's disease. == Formulation of the method == The original formulation of the method is due to T. Higuchi. Given a time series X : { 1 , … , N } → R {\displaystyle X:\{1,\dots ,N\}\to \mathbb {R} } consisting of N {\displaystyle N} data points and a parameter k m a x ≥ 2 {\displaystyle k_{\mathrm {max} }\geq 2} the Higuchi Fractal dimension (HFD) of X {\displaystyle X} is calculated in the following way: For each k ∈ { 1 , … , k m a x } {\displaystyle k\in \{1,\dots ,k_{\mathrm {max} }}\} and m ∈ { 1 , … , k } {\displaystyle m\in \{1,\dots ,k}\} define the length L m ( k ) {\displaystyle L_{m}(k)} by L m ( k ) = N − 1 ⌊ N − m k ⌋ k 2 ∑ i = 1 ⌊ N − m k ⌋ | X N ( m + i k ) − X N ( m + ( i − 1 ) k ) | . {\displaystyle L_{m}(k)={\frac {N-1}{\lfloor {\frac {N-m}{k}}\rfloor k^{2}}}\sum _{i=1}^{\lfloor {\frac {N-m}{k}}\rfloor }|X_{N}(m+ik)-X_{N}(m+(i-1)k)|.} The length L ( k ) {\displaystyle L(k)} is defined by the average value of the k {\displaystyle k} lengths L 1 ( k ) , … , L k ( k ) {\displaystyle L_{1}(k),\dots ,L_{k}(k)} , L ( k ) = 1 k ∑ m = 1 k L m ( k ) . {\displaystyle L(k)={\frac {1}{k}}\sum _{m=1}^{k}L_{m}(k).} The slope of the best-fitting linear function through the data points { ( log ⁡ 1 k , log ⁡ L ( k ) ) } {\displaystyle \left\{\left(\log {\frac {1}{k}},\log L(k)\right)\right\}} is defined to be the Higuchi fractal dimension of the time-series X {\displaystyle X} . == Application to functions == For a real-valued function f : [ 0 , 1 ] → R {\displaystyle f:[0,1]\to \mathbb {R} } one can partition the unit interval [ 0 , 1 ] {\displaystyle [0,1]} into N {\displaystyle N} equidistantly intervals [ t j , t j + 1 ) {\displaystyle [t_{j},t_{j+1})} and apply the Higuchi algorithm to the times series X ( j ) = f ( t j ) {\displaystyle X(j)=f(t_{j})} . This results into the Higuchi fractal dimension of the function f {\displaystyle f} . It was shown that in this case the Higuchi method yields an approximation for the box-counting dimension of the graph of f {\displaystyle f} as it follows a geometrical approach (see Liehr & Massopust 2020). == Robustness and stability == Applications to fractional Brownian functions and the Weierstrass function reveal that the Higuchi fractal dimension can be close to the box-dimension. On the other hand, the method can be unstable in the case where the data X ( 1 ) , … , X ( N ) {\displaystyle X(1),\dots ,X(N)} are periodic or if subsets of it lie on a horizontal line (see Liehr & Massopust 2020).

2024 National Public Data breach

In August 2024, three class-action lawsuits were filed against National Public Data along with over 14 complaints filed in federal court, claiming that the company permitted hackers to steal sensitive private information covering millions of individuals. The theft was alleged to have occurred in April 2024. One of the lawsuits specifically claims that in April, a hacker going by the moniker "USDoD" posted a notice on the dark web, offering the data for sale at the price of US$3.5 million. The information stolen is alleged to include 2.9 billion records containing full names, current and past addresses, Social Security numbers, dates of birth, and telephone numbers. The stolen data contains records for people in the US, UK, and Canada. National Public Data confirmed on August 16, 2024, there was a breach originating from someone trying to breach their systems since December 2023, with the breach occurring from April 2024 and over the next few months. The company also confirmed that 2.9 billion records were obtained, though they were still working to determine how many people were affected by the breach, and were working with law enforcement to identify the hacker. == Jerico Pictures == Jerico Pictures, Inc., doing business as National Public Data, was a data broker company that performed employee background checks. Their primary service was collecting information from public data sources, including criminal records, addresses, and employment history, and offering that information for sale. On October 2, 2024, Jerico Pictures filed for Chapter 11 bankruptcy as it currently faces over a dozen lawsuits over the breach, and is potentially liable "for credit monitoring for hundreds of millions of potentially impacted individuals." In December 2024, National Public Data shut down, showing a closure notice on its website.

PropBank

PropBank is a corpus that is annotated with verbal propositions and their arguments—a "proposition bank". Although "PropBank" refers to a specific corpus produced by Martha Palmer et al., the term propbank is also coming to be used as a common noun referring to any corpus that has been annotated with propositions and their arguments. The PropBank project has played a role in research in natural language processing, and has been used in semantic role labelling. == Comparison == PropBank differs from FrameNet, the resource to which it is most frequently compared, in several ways. PropBank is a verb-oriented resource, while FrameNet is centered on the more abstract notion of frames, which generalizes descriptions across similar verbs (e.g. "describe" and "characterize") as well as nouns and other words (e.g. "description"). PropBank does not annotate events or states of affairs described using nouns. PropBank commits to annotating all verbs in a corpus, whereas the FrameNet project chooses sets of example sentences from a large corpus and only in a few cases has annotated longer continuous stretches of text. PropBank-style annotations often remain close to the syntactic level, while FrameNet-style annotations are sometimes more semantically motivated. From the start, PropBank was developed with the idea of serving as training data for machine learning-based semantic role labeling systems in mind. It requires that all arguments to a verb be syntactic constituents and different senses of a word are only distinguished if the differences bear on the arguments. Due to such differences, semantic role labeling with respect to PropBank is often a somewhat easier task than producing FrameNet-style annotations.

Tesla Dojo

Tesla Dojo is a series of supercomputers designed and built by Tesla for computer vision video processing and recognition. It was used for training Tesla's machine learning models to improve its Full Self-Driving (FSD) advanced driver-assistance system. It went into production in July 2023. Dojo's goal was to efficiently process millions of terabytes of video data captured from real-life driving situations from Tesla's 4+ million cars. This goal led to a considerably different architecture than conventional supercomputer designs. In August 2025, Bloomberg News reported that the Dojo project had been disbanded, though it was restarted in January 2026. == History == Tesla operates several massively parallel computing clusters for developing its Autopilot advanced driver assistance system. Its primary unnamed cluster using 5,760 Nvidia A100 graphics processing units (GPUs) was touted by Andrej Karpathy in 2021 at the fourth International Joint Conference on Computer Vision and Pattern Recognition (CCVPR 2021) to be "roughly the number five supercomputer in the world" at approximately 81.6 petaflops, based on scaling the performance of the Nvidia Selene supercomputer, which uses similar components. However, the performance of the primary Tesla GPU cluster has been disputed, as it was not clear if this was measured using single-precision or double-precision floating point numbers (FP32 or FP64). Tesla also operates a second 4,032 GPU cluster for training and a third 1,752 GPU cluster for automatic labeling of objects. The primary unnamed Tesla GPU cluster has been used for processing one million video clips, each ten seconds long, taken from Tesla Autopilot cameras operating in Tesla cars in the real world, running at 36 frames per second. Collectively, these video clips contained six billion object labels, with depth and velocity data; the total size of the data set was 1.5 petabytes. This data set was used for training a neural network intended to help Autopilot computers in Tesla cars understand roads. By August 2022, Tesla had upgraded the primary GPU cluster to 7,360 GPUs. Dojo was first mentioned by Elon Musk in April 2019 during Tesla's "Autonomy Investor Day". In August 2020, Musk stated it was "about a year away" due to power and thermal issues. Dojo was officially announced at Tesla's Artificial Intelligence (AI) Day on August 19, 2021. Tesla revealed details of the D1 chip and its plans for "Project Dojo", a datacenter that would house 3,000 D1 chips; the first "Training Tile" had been completed and delivered the week before. In October 2021, Tesla released a "Dojo Technology" whitepaper describing the Configurable Float8 (CFloat8) and Configurable Float16 (CFloat16) floating point formats and arithmetic operations as an extension of Institute of Electrical and Electronics Engineers (IEEE) standard 754. At the follow-up AI Day in September 2022, Tesla announced it had built several System Trays and one Cabinet. During a test, the company stated that Project Dojo drew 2.3 megawatts (MW) of power before tripping a local San Jose, California power substation. At the time, Tesla was assembling one Training Tile per day. In August 2023, Tesla powered on Dojo for production use as well as a new training cluster configured with 10,000 Nvidia H100 GPUs. In January 2024, Musk described Dojo as "a long shot worth taking because the payoff is potentially very high. But it's not something that is a high probability." In June 2024, Musk explained that ongoing construction work at Gigafactory Texas is for a computing cluster claiming that it is planned to comprise an even mix of "Tesla AI" and Nvidia/other hardware with a total thermal design power of at first 130 MW and eventually exceeding 500 MW. In August 2025, Bloomberg News reported that the Dojo project was disbanded, though Musk announced it would be restarted in January 2026 with a new chip iteration. == Technical architecture == The fundamental unit of the Dojo supercomputer is the D1 chip, designed by a team at Tesla led by ex-AMD CPU designer Ganesh Venkataramanan, including Emil Talpes, Debjit Das Sarma, Douglas Williams, Bill Chang, and Rajiv Kurian. The D1 chip is manufactured by the Taiwan Semiconductor Manufacturing Company (TSMC) using 7 nanometer (nm) semiconductor nodes, has 50 billion transistors and a large die size of 645 mm2 (1.0 square inch). Updating at Artificial Intelligence (AI) Day in 2022, Tesla announced that Dojo would scale by deploying multiple ExaPODs, in which there would be: 10 Cabinets per ExaPOD (1,062,000 cores, 3,000 D1 chips) 2 System Trays per Cabinet (106,200 cores, 300 D1 chips) 6 Training Tiles per System Tray (53,100 cores, along with host interface hardware) 25 D1 chips per Training Tile (8,850 cores) 354 computing cores per D1 chip According to Venkataramanan, Tesla's senior director of Autopilot hardware, Dojo will have more than an exaflop (a million teraflops) of computing power. For comparison, according to Nvidia, in August 2021, the (pre-Dojo) Tesla AI-training center used 720 nodes, each with eight Nvidia A100 Tensor Core GPUs for 5,760 GPUs in total, providing up to 1.8 exaflops of performance. === D1 chip === Each node (computing core) of the D1 processing chip is a general purpose 64-bit CPU with a superscalar core. It supports internal instruction-level parallelism, and includes simultaneous multithreading (SMT). It doesn't support virtual memory and uses limited memory protection mechanisms. Dojo software/applications manage chip resources. The D1 instruction set supports both 64-bit scalar and 64-byte single instruction, multiple data (SIMD) vector instructions. The integer unit mixes reduced instruction set computer (RISC-V) and custom instructions, supporting 8, 16, 32, or 64 bit integers. The custom vector math unit is optimized for machine learning kernels and supports multiple data formats, with a mix of precisions and numerical ranges, many of which are compiler composable. Up to 16 vector formats can be used simultaneously. ==== Node ==== Each D1 node uses a 32-byte fetch window holding up to eight instructions. These instructions are fed to an eight-wide decoder which supports two threads per cycle, followed by a four-wide, four-way SMT scalar scheduler that has two integer units, two address units, and one register file per thread. Vector instructions are passed further down the pipeline to a dedicated vector scheduler with two-way SMT, which feeds either a 64-byte SIMD unit or four 8×8×4 matrix multiplication units. The network on-chip (NOC) router links cores into a two-dimensional mesh network. It can send one packet in and one packet out in all four directions to/from each neighbor node, along with one 64-byte read and one 64-byte write to local SRAM per clock cycle. Hardware native operations transfer data, semaphores and barrier constraints across memories and CPUs. System-wide double data rate 4 (DDR4) synchronous dynamic random-access memory (SDRAM) memory works like bulk storage. ==== Memory ==== Each core has a 1.25 megabytes (MB) of SRAM main memory. Load and store speeds reach 400 gigabytes (GB) per second and 270 GB/sec, respectively. The chip has explicit core-to-core data transfer instructions. Each SRAM has a unique list parser that feeds a pair of decoders and a gather engine that feeds the vector register file, which together can directly transfer information across nodes. ==== Die ==== Twelve nodes (cores) are grouped into a local block. Nodes are arranged in an 18×20 array on a single die, of which 354 cores are available for applications. The die runs at 2 gigahertz (GHz) and totals 440 MB of SRAM (360 cores × 1.25 MB/core). It reaches 376 teraflops using 16-bit brain floating point (BF16) numbers or using configurable 8-bit floating point (CFloat8) numbers, which is a Tesla proposal, and 22 teraflops at FP32. Each die comprises 576 bi-directional serializer/deserializer (SerDes) channels along the perimeter to link to other dies, and moves 8 TB/sec across all four die edges. Each D1 chip has a thermal design power of approximately 400 watts. === Training Tile === The water-cooled Training Tile packages 25 D1 chips into a 5×5 array. Each tile supports 36 TB/sec of aggregate bandwidth via 40 input/output (I/O) chips - half the bandwidth of the chip mesh network. Each tile supports 10 TB/sec of on-tile bandwidth. Each tile has 11 GB of SRAM memory (25 D1 chips × 360 cores/D1 × 1.25 MB/core). Each tile achieves 9 petaflops at BF16/CFloat8 precision (25 D1 chips × 376 TFLOP/D1). Each tile consumes 15 kilowatts; 288 amperes at 52 volts. === System Tray === Six tiles are aggregated into a System Tray, which is integrated with a host interface. Each host interface includes 512 x86 cores, providing a Linux-based user environment. Previously, the Dojo System Tray was known as the Training Matrix, which includes six Training Tiles, 20 Dojo Interface Processor cards across four host servers, and Ethernet-l

Semantic interpretation

Semantic interpretation is an important component in dialog systems. It is related to natural language understanding, but mostly it refers to the last stage of understanding. The goal of interpretation is binding the user utterance to concept, or something the system can understand. Typically it is creating a database query based on user utterance.

Face Swap Live

Face Swap Live is a mobile app created by Laan Labs that enables users to swap faces with another person in real-time using the device's camera. It was released on December 14, 2015. In addition to swapping faces with another person, the app enables users to create videos using a set of bundled live filters. The app is available on iOS and Android devices. Face Swap Live was named Apple's #2 best-selling paid app in 2016.

Ericom Connect

Ericom Connect is a remote access/application publishing solution produced by Ericom Software that provides secure, centrally managed access to physical or hosted desktops and applications running on Microsoft Windows and Linux systems. == Product overview == Ericom Connect is desktop virtualization and application virtualization software that allows users to run applications remotely, without installing them on the local computer or device. The software is noted for its scalability, ease of deployment, and compatibility with any type of infrastructure, cloud or physical. Ericom Connect uses AccessPad (native client for desktops), AccessToGo (native client for mobile), or AccessNow, one of the first HTML5 RDP solutions to support clientless access to Windows desktops and applications from any device with an HTML5-compatible browser, including Macintosh computers, mobile devices, and Google Chromebooks. Other notable features include performance monitoring, built-in real-time analytics & BI, support for two-factor authentication (using RSA SecurID), multi-tenancy and multi-datacenter support via a single unified web interface, and a “Launch Simulation” feature that allows users to visualize and simulate actual step-by-step user processes directly from within the administration console. In addition to scalability, by distributing configurations, logs, etc., across multiple servers there is no single point of failure, as can be the case if all configuration information is stored on one server. == History == Ericom Connect was introduced in 2015. Ericom Connect is a successor to Ericom PowerTerm Web Connect. PowerTerm Web Connect used an architecture similar to what was then current with Citrix and VMWare, relying on a centralized SQL server, a connection broker, image management for different hypervisors, and a variety of clients. Ericom Connect uses a new grid architecture that provides more scalability, reliability, and flexibility than before.