AI Chatbot Free Unlimited

AI Chatbot Free Unlimited — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Hierarchical Risk Parity

    Hierarchical Risk Parity

    Hierarchical Risk Parity (HRP) is an advanced investment portfolio optimization framework developed in 2016 by Marcos López de Prado at Guggenheim Partners and Cornell University. HRP is a probabilistic graph-based alternative to the prevailing mean-variance optimization (MVO) framework developed by Harry Markowitz in 1952, and for which he received the Nobel Prize in economic sciences. HRP algorithms apply discrete mathematics and machine learning techniques to create diversified and robust investment portfolios that outperform MVO methods out-of-sample. HRP aims to address the limitations of traditional portfolio construction methods, particularly when dealing with highly correlated assets. Following its publication, HRP has been implemented in numerous open-source libraries, and received multiple extensions. == Key features == HRP portfolios have been proposed as a robust alternative to traditional quadratic optimization methods, including the Critical Line Algorithm (CLA) of Markowitz. HRP addresses three central issues commonly associated with quadratic optimizers: numerical instability, excessive concentration in a small number of assets, and poor out-of-sample performance. HRP leverages techniques from graph theory and machine learning to construct diversified portfolios using only the information embedded in the covariance matrix. Unlike quadratic programming methods, HRP does not require the covariance matrix to be invertible. Consequently, HRP remains applicable even in cases where the covariance matrix is ill-conditioned or singular—conditions under which standard optimizers fail. Monte Carlo simulations indicate that HRP achieves lower out-of-sample variance than CLA, despite the fact that minimizing variance is the explicit optimization objective of CLA. Furthermore, HRP portfolios exhibit lower realized risk compared to those generated by traditional risk parity methodologies. Empirical backtests have demonstrated that HRP would have historically outperformed conventional portfolio construction techniques. Algorithms within the HRP framework are characterized by the following features: Machine Learning Approach: HRP employs hierarchical clustering, a machine learning technique, to group similar assets based on their correlations. This allows the algorithm to identify the underlying hierarchical structure of the portfolio, and avoid that errors spread through the entire network. Risk-Based Allocation: The algorithm allocates capital based on risk, ensuring that assets only compete with similar assets for representation in the portfolio. This approach leads to better diversification across different risk sources, while avoiding the instability associated with noisy returns estimates. Covariance Matrix Handling: Unlike traditional methods like Mean-Variance Optimization, HRP does not require inverting the covariance matrix. This makes it more stable and applicable to portfolios with a large number of assets, particularly when the covariance matrix's condition number is high. == The problem: Markowitz's Curse == Portfolio construction is perhaps the most recurrent financial problem. On a daily basis, investment managers must build portfolios that incorporate their views and forecasts on risks and returns. Despite the theoretical elegance of Markowitz's mean-variance framework, its practical implementation is hindered by several limitations that undermine the reliability of solutions derived from the Critical Line Algorithm (CLA). A principal concern is the high sensitivity of optimal portfolios to small perturbations in expected returns: even minor forecasting errors can result in significantly different allocations (Michaud, 1998). Given the inherent difficulty of producing accurate return forecasts, numerous researchers have advocated for approaches that forgo expected returns entirely and instead rely solely on the covariance structure of asset returns. This has given rise to risk-based allocation methods, among which risk parity is a widely cited example (Jurczenko, 2015). While eliminating return forecasts mitigates some instability, it does not eliminate it. Quadratic programming techniques employed in portfolio optimization require the inversion of a positive-definite covariance matrix, meaning all eigenvalues must be strictly positive. When the matrix is numerically ill-conditioned—that is, when the ratio of its largest to smallest eigenvalue (its condition number) is large—matrix inversion becomes unreliable and prone to significant numerical errors (Bailey and López de Prado, 2012). The condition number of a covariance, correlation, or any symmetric (and thus diagonalizable) matrix is defined as the absolute value of the ratio between its largest and smallest eigenvalues in modulus. The figure on the right presents the sorted eigenvalues of several correlation matrices; the condition number is represented by the ratio of the first to last eigenvalues in each sequence. A diagonal correlation matrix, which is equal to its own inverse, exhibits the minimum possible condition number. As the number of correlated (or multicollinear) assets in a portfolio increases, the condition number rises. At high levels, this leads to severe numerical instability, whereby slight modifications in any matrix entry may result in drastically different inverses. This phenomenon, often referred to as Markowitz’s curse, encapsulates the paradox wherein increased correlation among assets heightens the theoretical need for diversification, yet simultaneously increases the likelihood of unstable optimization outcomes. Consequently, the potential benefits of diversification are frequently overshadowed by estimation errors. These problems are exacerbated as the dimensionality of the covariance matrix increases. The estimation of each covariance term consumes degrees of freedom, and in general, a minimum of 1 2 N ( N + 1 ) {\displaystyle {\frac {1}{2}}N(N+1)} independent and identically distributed (IID) observations is required to estimate a non-singular covariance matrix of dimension N {\displaystyle N} . For example, constructing an invertible covariance matrix of dimension 50 necessitates at least five years of daily IID observations. However, empirical evidence suggests that the correlation structure of financial assets is highly unstable over such extended periods. These difficulties are highlighted by the observation that even naïve allocation strategies—such as equally weighted portfolios—have frequently outperformed both mean-variance and risk-based optimizations in out-of-sample tests (De Miguel et al., 2009). == The solution: Hierarchical Risk Parity == The HRP algorithm addresses Markowitz's curse in three steps: Hierarchical Clustering: Assets are grouped into clusters based on their correlations, forming a hierarchical tree structure. Quasi-Diagonalization: The correlation matrix is reordered based on the clustering results, revealing a block diagonal structure. Recursive Bisection: Weights are assigned to assets through a top-down approach, splitting the portfolio into smaller sub-portfolios and allocating capital based on inverse variance. === Step 1: Hierarchical clustering === Given a T × N {\displaystyle T\times N} matrix of asset returns X {\displaystyle X} , where each column represents a time series of returns for one of N {\displaystyle N} assets over T {\displaystyle T} time periods, a hierarchical clustering process can be used to construct a tree-based representation of asset relationships. First, we compute the N × N {\displaystyle N\times N} correlation matrix ρ = ρ i , j i , j = 1 . . . N {\displaystyle \rho ={\rho _{i,j}}\;{i,j=1\;...\;N}} , where ρ i , j = c o r r ( X i , X j ) {\displaystyle \rho _{i,j}=\mathrm {corr} (X_{i},X_{j})} . From this, a pairwise distance matrix D = d i , j {\displaystyle D={d_{i,j}}} is defined using the transformation: d i , j = 1 2 ( 1 − ρ i , j ) {\displaystyle d_{i,j}={\sqrt {{\frac {1}{2}}(1-\rho _{i,j})}}} This distance function defines a proper metric space, satisfying non-negativity, identity of indiscernibles, symmetry, and the triangle inequality. Next, a secondary distance matrix D ~ = d ~ i , j {\displaystyle {\tilde {D}}={{\tilde {d}}_{i,j}}} is computed, where each entry measures the Euclidean distance between the distance profiles of two assets: d ~ i , j = ∑ n = 1 N ( d n , i − d n , j ) 2 {\displaystyle {\tilde {d}}_{i,j}={\sqrt {\sum _{n=1}^{N}(d_{n,i}-d_{n,j})^{2}}}} While d i , j {\displaystyle d_{i,j}} reflects correlation-based proximity between two assets, d ~ i , j {\displaystyle {\tilde {d}}_{i,j}} quantifies dissimilarity across the entire system, as it depends on all pairwise distances. Hierarchical clustering proceeds by identifying the pair ( i , j ) {\displaystyle (i,j)} with the smallest value of d ~ i , j {\displaystyle {\tilde {d}}_{i,j}} (for i ≠ j {\displaystyle i\neq j} ), and forming a new cluster u [ 1 ] = ( i , j ) {\displaystyle u[1]=(i,j)} .

    Read more →
  • MeituPic

    MeituPic

    Meitu Xiu Xiu ("Meitu") (Chinese: 美图秀秀) is an image editing software that is mostly used in Mainland China but is also popular in Hong Kong and Taiwan. It is only available on Google Play and App Store in certain countries. It provides tools for editing photos: filters, retouching, collage, scenes, frames, and photo decorations, as well as generative AI features such as text-to-images, AI removal and AI repainting etc. Meitu is one of the apps developed by Meitu, Inc.; it also produced BeautyCam, Wink and X-Design. == History == Meitu's PC version was created in 2008 by Wu Xinhong, the CEO of Meitu. In 2013, its mobile version became one of the first must-have mobile apps in China. Meitu, Inc. is a photo and video-centered app developer, which was founded in 2008 in Xiamen. Currently, the major revenue source of Meitu is premium subscription. Meitu, Inc. was initially funded by Cai Wensheng, a well-known angel investor. The company has an approximately 250 million monthly active users globally. == Function == === Edit === MeituPic provides a number of photo-editing tools. The major functions are auto enhance, edit, enhance, filters, frames, magic brush, mosaic, text, and blur. Auto enhance focuses on the nature of photos taken, while Edit includes functions of cropping, rotation, sharpening, and adjustment of ratio. For Enhance, users can apply slight adjustment on the photo by controlling the levels of brightness, contrast, colour temperature, saturation, highlight, shadow and smart light. Major types of filters are LOMO, beauty, style as well as art. Different frames can be chosen from poster, simple, and fantasy. Magic brush provides a great variety of brushes with different colours and patterns for users to decorate the photos. Mosaic brush enables users to cover certain parts of the photo. Texts can be added to the photo. Choices of different bubbles, font as well as style of words are available. Blurring effect is also available to make the photo less distinct and clear. === Beauty Retouch === There are seven major functions for retouching a photo: automatic retouch, smooth and whiten skin, remove blemish, make slimmer, remove dark circles and bags under the eyes, make taller, and enhance the eyes. Automatic retouch enhances portraits by lightening the skin tone, brightening the eyes, and simulating a face-lift by tapping on just one button. This helps to remove wrinkles and optimizes the skin tone. Acne, blemishes, and other skin imperfections can also be removed. The face-lift and weight-loss functions in the slimming option can be used to reshape the body. The option to make the subject taller can be used to change the perceived height of the subject and give the impression of slimmer, longer legs. The option to enhance the eyes can enlarge and brighten the eyes. === Collage === Collage has four types: template, freestyle, poster, PicStrip, which all maximize to insert nine photos. Template integrates photos in a vertical rectangle tightly. MeituPic has 15 frames or free download function for users. MeituPic also provides different templates according to number of photos inserted. Freestyle separates photos on a background freely. There are two parts of background: custom and more. For custom, users choose from album. For more, there are plain and picture with 18 choices. Poster makes a poster with photos. Users choose a poster among 8 choices or tap ‘more’ to download a new one. PicStrip combines photos vertically making an elongated file. Users choose a frame from 15 choices. Pinching thumb and forefinger together or apart zooms photos in/out. Putting two fingers and turning hand rotates photos. Pressing moves photos to ideal location. After designing, users tap ‘save/share’ on the upper right corner and the photo made is saved into album automatically. == Awards ==

    Read more →
  • Database virtualization

    Database virtualization

    Database virtualization is the decoupling of the database layer, which lies between the storage and application layers within the application stack. Virtualization of the database layer enables a shift away from the physical, toward the logical or virtual. Virtualization enables compute and storage resources to be pooled and allocated on demand. This enables both the sharing of single server resources for multi-tenancy, as well as the pooling of server resources into a single logical database or cluster. In both cases, database virtualization provides increased flexibility, more granular and efficient allocation of pooled resources, and more scalable computing. == Virtual data partitioning == The act of partitioning data stores as a database grows has been in use for several decades. There are two primary ways that data has been partitioned inside legacy data management systems: Shared-data databases: an architecture that assumes all database cluster nodes share a single partition. Inter-node communications are used to synchronize update activities performed by different nodes on the cluster. Shared-data data management systems are limited to single-digit node clusters. Shared-nothing databases: an architecture in which all data is segregated to internally managed partitions with clear, well-defined data location boundaries. Shared-nothing databases require manual partition management. In virtual partitioning, logical data is abstracted from physical data by autonomously creating and managing large numbers of data partitions (100s to 1000s). Because they are autonomously maintained, the resources required to manage the partitions are minimal. This kind of massive partitioning results in: Partitions that are small, efficiently managed, and load-balanced. Systems that do not require re-partitioning events to define additional partitions, even when the hardware is changed. “Shared-data” and “shared-nothing” architectures allow scalability through multiple data partitions and cross-partition querying and transaction processing without full partition scanning. == Horizontal data partitioning == Partitioning database sources from consumers is a fundamental concept. With greater numbers of database sources, inserting a horizontal data virtualization layer between the sources and consumers helps address this complexity. Rick van der Lans, the author of multiple books on SQL and relational databases, has defined data virtualization as "the process of offering data consumers a data access interface that hides the technical aspects of stored data, such as location, storage structure, API, access language, and storage technology." == Advantages == Added flexibility and agility for existing computing infrastructure. Enhanced database performance. Pooling and sharing computing resources, either splitting them (multi-tenancy) or combining them (clustering). Simplification of administration and management. Increased fault tolerance.

    Read more →
  • SMBGhost

    SMBGhost

    SMBGhost (or SMBleedingGhost or CoronaBlue) is a type of security vulnerability, with wormlike features, that affects Windows 10 computers and was first reported publicly on 10 March 2020. == Security vulnerability == A proof of concept (PoC) exploit code was published 1 June 2020 on GitHub by a security researcher. The code could possibly spread to millions of unpatched computers, resulting in as much as tens of billions of dollars in losses. Microsoft recommends all users of Windows 10 versions 1903 and 1909 and Windows Server versions 1903 and 1909 to install patches, and states, "We recommend customers install updates as soon as possible as publicly disclosed vulnerabilities have the potential to be leveraged by bad actors ... An update for this vulnerability was released in March [2020], and customers who have installed the updates, or have automatic updates enabled, are already protected." Workarounds, according to Microsoft, such as disabling SMB compression and blocking port 445, may help but may not be sufficient. According to the advisory division of Homeland Security, "Malicious cyber actors are targeting unpatched systems with the new [threat], ... [and] strongly recommends using a firewall to block server message block ports from the internet and to apply patches to critical- and high-severity vulnerabilities as soon as possible."

    Read more →
  • Artificial intelligence

    Artificial intelligence

    Artificial intelligence (AI) is the capability of computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of research in engineering, mathematics and computer science that develops and studies methods and software that enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined goals. High-profile applications of AI include advanced web search engines, chatbots, virtual assistants, autonomous vehicles, and play and analysis in strategy games (e.g., chess and Go). Since the 2020s, generative AI has become widely available to generate images, audio, and videos from text prompts. The traditional goals of AI research include learning, reasoning, knowledge representation, planning, natural language processing, and perception, as well as support for robotics. To reach these goals, AI researchers have used techniques including state space search and mathematical optimization, formal logic, artificial neural networks, and methods based on statistics, operations research, and economics. AI also draws upon psychology, linguistics, philosophy, neuroscience, and other fields. Some companies, such as OpenAI, Google DeepMind and Meta, aim to create artificial general intelligence (AGI) – AI that can complete virtually any cognitive task at least as well as a human. Artificial intelligence was founded as an academic discipline in 1956, and the field went through multiple cycles of optimism throughout its history, followed by periods of disappointment and loss of funding, known as AI winters. Funding and interest increased substantially after 2012, when graphics processing units began being used to accelerate neural networks, and deep learning outperformed previous AI techniques. This growth accelerated further after 2017 with the transformer architecture. In the 2020s, an AI boom has coincided with advances in generative AI, which allowed for the creation and modification of media. In addition to AI safety and unintended consequences and harms from the use of AI, ethical concerns, AI's long-term effects, and potential existential risks have prompted discussions of AI regulation. == Goals == The general problem of simulating (or creating) intelligence has been broken into subproblems. These consist of particular traits or capabilities that researchers expect an intelligent system to display. The traits described below have received the most attention and cover the scope of AI research. === Reasoning and problem-solving === Early researchers developed algorithms that imitated step-by-step reasoning that humans use when they solve puzzles or make logical deductions. By the late 1980s and 1990s, methods were developed for dealing with uncertain or incomplete information, employing concepts from probability and economics. Many of these algorithms are insufficient for solving large reasoning problems because they experience a "combinatorial explosion": They become exponentially slower as the problems grow. Even humans rarely use the step-by-step deduction that early AI research could model. They solve most of their problems using fast, intuitive judgments. Accurate and efficient reasoning is an unsolved problem. === Knowledge representation === Knowledge representation and knowledge engineering allow AI programs to answer questions intelligently and make deductions about real-world facts. Formal knowledge representations are used in content-based indexing and retrieval, scene interpretation, clinical decision support, knowledge discovery (mining "interesting" and actionable inferences from large databases), and other areas. A knowledge base is a body of knowledge represented in a form that can be used by a program. An ontology is the set of objects, relations, concepts, and properties used by a particular domain of knowledge. Knowledge bases need to represent things such as objects, properties, categories, and relations between objects; situations, events, states, and time; causes and effects; knowledge about knowledge (what we know about what other people know); default reasoning (things that humans assume are true until they are told differently and will remain true even when other facts are changing); and many other aspects and domains of knowledge. Among the most difficult problems in knowledge representation are the breadth of commonsense knowledge (the set of atomic facts that the average person knows is enormous); and the sub-symbolic form of most commonsense knowledge (much of what people know is not represented as "facts" or "statements" that they could express verbally). There is also the difficulty of knowledge acquisition, the problem of obtaining knowledge for AI applications. === Planning and decision-making === An "agent" is any entity (artificial or not) that perceives and takes actions in the world. A rational agent has goals or preferences and takes actions to make them happen. In automated planning, the agent has a specific goal. In automated decision-making, the agent has preferences—there are some situations it would prefer to be in, and some situations it is trying to avoid. The decision-making agent assigns a number to each situation (called the "utility") that measures how much the agent prefers it. For each possible action, it can calculate the "expected utility": the utility of all possible outcomes of the action, weighted by the probability that the outcome will occur. It can then choose the action with the maximum expected utility. In classical planning, the agent knows exactly what the effect of any action will be. In most real-world problems, however, the agent may not be certain about the situation they are in (it is "unknown" or "unobservable") and it may not know for certain what will happen after each possible action (it is not "deterministic"). It must choose an action by making a probabilistic guess and then reassess the situation to see if the action worked. Alongside thorough testing and improvement based on previous decisions, having an explanation for why the agent took certain decisions is a way to build trust, especially when the decisions have to be relied upon. In some problems, the agent's preferences may be uncertain, especially if there are other agents or humans involved. These can be learned (e.g., with inverse reinforcement learning), or the agent can seek information to improve its preferences. Information value theory can be used to weigh the value of exploratory or experimental actions. The space of possible future actions and situations is typically intractably large, so the agents must take actions and evaluate situations while being uncertain of what the outcome will be. A Markov decision process has a transition model that describes the probability that a particular action will change the state in a particular way and a reward function that supplies the utility of each state and the cost of each action. A policy associates a decision with each possible state. The policy could be calculated (e.g., by iteration), be heuristic, or it can be learned. Game theory describes the rational behavior of multiple interacting agents and is used in AI programs that make decisions that involve other agents. === Learning === Machine learning is the study of programs that can improve their performance on a given task automatically. It has been a part of AI from the beginning. There are several kinds of machine learning. Unsupervised learning analyzes a stream of data and finds patterns and makes predictions without any other guidance. Supervised learning requires labeling the training data with the expected answers, and comes in two main varieties: classification (where the program must learn to predict what category the input belongs in) and regression (where the program must deduce a numeric function based on numeric input). In reinforcement learning, the agent is rewarded for good responses and punished for bad ones. The agent learns to choose responses that are classified as "good". Transfer learning is when the knowledge gained from one problem is applied to a new problem. Deep learning is a type of machine learning that runs inputs through biologically inspired artificial neural networks for all of these types of learning. Computational learning theory can assess learners by computational complexity, by sample complexity (how much data is required), or by other notions of optimization. === Natural language processing === Natural language processing (NLP) allows programs to read, write and communicate in human languages. Specific problems include speech recognition, speech synthesis, machine translation, information extraction, information retrieval and question answering. Early work, based on Noam Chomsky's generative grammar and semantic networks, had difficulty with word-sense disambiguation unless

    Read more →
  • Image-based modeling and rendering

    Image-based modeling and rendering

    In computer graphics and computer vision, image-based modeling and rendering (IBMR) methods rely on a set of two-dimensional images of a scene to generate a three-dimensional model and then render some novel views of this scene. The traditional approach of computer graphics has been used to create a geometric model in 3D and try to reproject it onto a two-dimensional image. Computer vision, conversely, is mostly focused on detecting, grouping, and extracting features (edges, faces, etc.) present in a given picture and then trying to interpret them as three-dimensional clues. Image-based modeling and rendering allows the use of multiple two-dimensional images in order to generate directly novel two-dimensional images, skipping the manual modeling stage. == Light modeling == Instead of considering only the physical model of a solid, IBMR methods usually focus more on light modeling. The fundamental concept behind IBMR is the plenoptic illumination function which is a parametrisation of the light field. The plenoptic function describes the light rays contained in a given volume. It can be represented with seven dimensions: a ray is defined by its position ( x , y , z ) {\displaystyle (x,y,z)} , its orientation ( θ , ϕ ) {\displaystyle (\theta ,\phi )} , its wavelength ( λ ) {\displaystyle (\lambda )} and its time ( t ) {\displaystyle (t)} : P ( x , y , z , θ , ϕ , λ , t ) {\displaystyle P(x,y,z,\theta ,\phi ,\lambda ,t)} . IBMR methods try to approximate the plenoptic function to render a novel set of two-dimensional images from another. Given the high dimensionality of this function, practical methods place constraints on the parameters in order to reduce this number (typically to 2 to 4). == IBMR methods and algorithms == View morphing generates a transition between images Panoramic imaging renders panoramas using image mosaics of individual still images Lumigraph relies on a dense sampling of a scene Space carving generates a 3D model based on a photo-consistency check

    Read more →
  • Kdb+

    Kdb+

    kdb+ is a column-based relational time series database (TSDB) with in-memory (IMDB) abilities, developed and marketed by KX Systems. The database is commonly used in high-frequency trading (HFT) to store, analyze, process, and retrieve large data sets at high speed. kdb+ has the ability to handle billions of records and analyzes data within a database. The database is available in 32-bit and 64-bit versions for several operating systems. Financial institutions use kdb+ to analyze time series data such as stock or commodity exchange data. The database has also been used for other time-sensitive data applications including commodity markets such as energy trading, telecommunications, sensor data, log data, machine and computer network usage monitoring along with real time analytics in Formula One racing. == Overview == kdb+ is a high-performance column-store database that was designed to process and store large amounts of data. Commonly accessed data is pushed into random-access memory (RAM), which is faster to access than data in disk storage. Created with financial institutions in mind, the database was developed as a central repository to store time series data that supports real-time analysis of billions of records. kdb+ has the ability to analyze data over time and responds to queries similar to Structured Query Language (SQL). Columnar databases return answers to some queries in a more efficient way than row-based database management systems. kdb+ dictionaries, tables and nanosecond time stamps are native data types and are used to store time series data. At the core of kdb+ is the built-in programming language, q, a concise, expressive query array language, and dialect of the language APL. Q can manipulate streaming, real-time, and historical data. kdb+ uses q to aggregate and analyze data, perform statistical functions, and join data sets and supports SQL queries The vector language q was built for speed and expressiveness and eliminates most need for looping structures. kdb+ includes interfaces in C, C++, Java, C#, and Python. == History == In 1998, KX released kdb, a database built on the language K written by Arthur Whitney. In 2003, kdb+ was released as a 64-bit version of kdb. In 2004, the kdb+ tick market database framework was released along with kdb+ taq, a loader for the New York Stock Exchange (NYSE) taq data. kdb+ was created by Arthur Whitney, building on his prior work with array languages. In April 2007, KX announced that it was releasing a version of kdb+ for Mac OS X. Then, kdb+ was also available on the operating systems Linux, Windows, and Solaris. In September 2012, version 3.0 was released. It was optimized for Intel's upgraded processors with support for WebSockets, and universally unique identifiers (UUIDs, termed globally unique identifiers (GUID)s in Microsoft software). Intel's Advanced Vector Extensions (AVX) and Streaming SIMD Extensions 4 (SSE4) 4.2 on the Sandy Bridge processors of the time allowed for enhanced support of the kdb+ system. In June 2013, version 3.1 was released, with benchmarks up to 8 times faster than older versions. In March 2020, version 4.0 was released. New features included Multithreaded primitives, Intel Optane DC persistent memory support and Data at Rest Encryption.

    Read more →
  • Patent visualisation

    Patent visualisation

    Patent visualisation is an application of information visualisation. The number of patents has been increasing, encouraging companies to consider intellectual property as a part of their strategy. Patent visualisation, like patent mapping, is used to quickly view a patent portfolio. Software dedicated to patent visualisation began to appear in 2000, for example Aureka from Aurigin (now owned by Thomson Reuters). Many patent and portfolio analytics platforms, such as Questel, Patent Forecast, PatSnap, Patentcloud, Relecura, and Patent iNSIGHT Pro, offer options to visualise specific data within patent documents by creating topic maps, priority maps, IP Landscape reports, etc. Software converts patents into infographics or maps, to allow the analyst to "get insight into the data" and draw conclusions. Also called patinformatics, it is the "science of analysing patent information to discover relationships and trends that would be difficult to see when working with patent documents on a one-and-one basis". Patents contain structured data (like publication numbers) and unstructured text (like title, abstract, claims and visual info). Structured data are processed by data-mining and unstructured data are processed with text-mining. == Data mining == The main step in processing structured information is data-mining, which emerged in the late 1980s. Data mining involves statistics, artificial intelligence, and machine learning. Patent data mining extracts information from the structured data of the patent document. These structured data are bibliographic fields such as location, date or status. === Structured fields === === Advantages === Data mining allows study of filing patterns of competitors and locates main patent filers within a specific area of technology. This approach can be helpful to monitor competitors' environments, moves and innovation trends and gives a macro view of a technology status. == Text-mining == === Principle === Text mining is used to search through unstructured text documents. This technique is widely used on the Internet, it has had success in bioinformatics and now in the intellectual property environment. Text mining is based on a statistical analysis of word recurrence in a corpus. An algorithm extracts words and expressions from title, summary and claims and gathers them by declension. "And" and "if" are labeled as non-information bearing words and are stored in the stopword list. Stoplists can be specialised in order to create an accurate analysis. Next, the algorithm ranks the words by weight, according to their frequency in the patent's corpus and the document frequency containing this word. The score for each word is calculated using a formula such as: W e i g h t = T e r m F r e q u e n c y D o c u m e n t F r e q u e n c y = F r e q u e n c y o f t h e w o r d o r e x p r e s s i o n i n t h e T e x t S e a N u m b e r o f d o c u m e n t s c o n t a i n i n g t h e e x p r e s s i o n o r w o r d {\displaystyle Weight={\frac {Term\ Frequency}{Document\ Frequency}}={\frac {Frequency\ of\ the\ word\ or\ expression\ in\ the\ Text\ Sea}{Number\ of\ documents\ containing\ the\ expression\ or\ word}}} A frequently used word in several documents has less weight than a word used frequently in a few patents. Words under a minimum weight are eliminated, leaving a list of pertinent words or descriptors. Each patent is associated to the descriptors found in the selected document. Further, in the process of clusterisation, these descriptors are used as subsets, in which the patent are regrouped or as tags to place the patents in predetermined categories, for example keywords from International Patent Classifications. Four text parts can be processed with text-mining : Title Abstract Claim Patent Full-Text Software offer different combinations but title, abstract and claim are generally the most used, providing a good balance between interferences and relevancy. === Advantages === Text-mining can be used to narrow a search or quickly evaluate a patent corpus. For instance, if a query produces irrelevant documents, a multi-level clustering hierarchy identifies them in order to delete them and refine the search. Text-mining can also be used to create internal taxonomies specific to a corpus for possible mapping. == Visualisations == Allying patent analysis and informatic tools offers an overview of the environment through value-added visualisations. As patents contain structured and unstructured information, visualisations fall in two categories. Structured data can be rendered with data mining in macrothematic maps and statistical analysis. Unstructured information can be shown in like clouds, cluster maps and 2D keyword maps. === Data mining visualisation === === Text mining visualisation === === Visualisation for both data-mining and text-mining === Mapping visualisations can be used for both text-mining and data-mining results. == Uses == What patent visualisation can highlight: Competitors Partners New innovations Technologic environment description Networks Field application: R&D strategy management Competitive intelligence Licensing Strategy

    Read more →
  • List of software palettes

    List of software palettes

    This is a list of software palettes used by computers. Systems that use a 4-bit or 8-bit pixel depth can display up to 16 or 256 colors simultaneously. Many personal computers in the early 1990s displayed at most 256 different colors, freely selected by software (either by the user or by a program) from their wider hardware's RGB color palette. Usual selections of colors in limited subsets (generally 16 or 256) of the full palette includes some RGB level arrangements commonly used with the 8-bit palettes as master palettes or universal palettes (i.e., palettes for multipurpose uses). These are some representative software palettes, but any selection can be made in such of systems. For specific hardware color palettes, see the list of monochrome and RGB palettes, list of 8-bit computer hardware graphics, the list of 16-bit computer hardware graphics and the list of video game console palettes articles. Each palette is represented by an array of color patches. A one-pixel size version appears below each palette, to make it easy to compare palette sizes. For each unique palette, an image color test chart and sample image (truecolor original follows) rendered with that palette (without dithering) are given. The test chart shows the full 8-bit, 256 levels of the red, green, and blue (RGB) primary colors and cyan, magenta, and yellow complementary colors, along with a full 8-bit, 256 levels grayscale. Gradients of RGB intermediate colors (orange, lime green, sea green, sky blue, violet and fuchsia), and a full hue spectrum are also present. Color charts are not gamma corrected. These elements illustrate the color depth and distribution of the colors of any given palette, and the sample image indicates how the color selection of such palettes could represent real-life images. == System specifics == These are selections of colors officially employed as system palettes in some popular operating systems for personal computers that support 8-bit displays. === Microsoft Windows and IBM OS/2 default 16-color palette === Used by these platforms as a roughly backward compatible palette for the CGA, EGA and VGA text modes, but with colors arranged in a different order. Also, is the default palette for 16 color icons. The corresponding indices into this palette are: === Microsoft Windows default 20-color palette === In 256-color mode, there are four additional standard Windows colors, twenty system reserved colors in total; thus the system leaves 236 palette indexes free for applications to use. The system color entries inside a 256-color palette table are the first ten plus the last ten. In any case, the additional system colors do not seem to add a sharp color richness: they are only some intermediate shades of grayish colors. Since Windows 95, these additional colors can be changed by the system when a color scheme needs custom colors, reducing their utility as static, unchanging palette entries. The complete 20-color Windows system palette is: === Apple Macintosh default 16-color palette === When Apple Computer introduced the Macintosh II in 1987, this 16-color palette was included in System 4.1. === RISC OS default palette === Acorn RISC OS 2.x and 3.x provided this 16-color palette: === Solaris default 16-color palette === Solaris OS used this color palette: == RGB arrangements == These are selections of colors based in evenly ordered RGB levels which provide complete RGB combinations, mainly used as master palettes to display any kind of image within the limitations of the 8-bit pixel depth. === 6 level RGB === Having six levels for every primary, with 6³ = 216 combinations. The index can be addressed by (36×R)+(6×G)+B, with all R, G and B values in a range from 0 to 5. Intended as homogeneous RGB cube, it gives six true grays. Also, there is room for another sorts of 40 colors, so operating systems or programs can add extra colors. Systems that use this software palette are: Web-safe colors Apple Macintosh 256 color default palette. It also contains four gradients of ten shades each for gray, red, green and blue. === 6-7-6 levels RGB === This palette is constructed with six levels for red and blue primaries and seven levels for the green primary, giving 6×7×6 = 252 combinations. The index can be addressed by (42×R)+(6×G)+B, with R and B values in a range from 0 to 5 and G in a range from 0 to 6. The same case as the former, but with an added level of green due to the greater sensibility of the normal human eye to this frequency. It does not provide true grays, but remaining indexes can be filled with four intermediate grays. In any case, there is little room for any other color. === 6-8-5 levels RGB === This palette is constructed with six levels for red, eight levels for green and five levels for the blue primaries, giving 6×8×5 = 240 combinations. The index can be addressed by (40×R)+(5×G)+B, with R ranging from 0 to 5, G from 0 to 7 and B from 0 to 4. Levels are chosen in function of sensibility of the normal human eye to every primary color. Also, it does not provide true grays. Remaining indexes can be filled with sixteen intermediate grays or other fixed colors. In fact, this is the best balanced RGB master software palette, in a compromise between the RGB arrangement based in the human eye's sensibility and a sufficient remaining palette entries for another purposes. === 8-8-4 levels RGB === The 8-8-4 level RGB use eight levels for each of the red and green color components (3+3 high order bits), and four levels (2 low order bits) for the blue component, due to the lesser sensitivity of the normal human eye to this primary color. This results in an 8×8×4 = 256-color palette as follows: This RGB software palette occupies the full 8-bit range of possible palette entries, so there is no room for other fixed colors. Software using this palette must draw their user interface elements with the same colors used to show pictures. Also again, it does not provide true grays. == Other common uses of software palettes == === Grayscale palettes === Simple palette made doing every triplet RGB primaries having equal values as a continuous gradient from black to white through the full available palette entries. Here is the 8-bit, 256 levels palette: Used to display pure grayscale TIFF or JPEG images, for example. === Color gradient palettes === Palettes made of a continuous color gradient from darkest to lightest arbitrary hues. The pixel data is treated as if it were grayscale, but the color table plays with RGB color combinations, not only gray. The relationship between the original luminance and the mapped one can vary, but the lighting scale is preserved along all the palette entries. One very common case of such palettes is the sepia tone palette, which gives an image an old fashioned and aged look (left). Another gradient example, based on blue hues, is presented here (right), but any hue or mixing of hues can be used. Many cell phones with built-in cameras have options to take colorized photos using this technique. === Adaptive palettes === Those whose whole number of available indexes are filled with RGB combinations selected from the statistical order of appearance (usually balanced) of a concrete full true color original image. There exist many algorithms to pick the colors through color quantization; one well known is the Heckbert's median-cut algorithm. Here is the 8-bit, 256 color palette used with the color test chart and the image sample above: Adaptive palettes only work well with a unique image. Trying to display different images with adaptive palettes over an 8-bit display usually results in only one image with correct colors, because the images have different palettes and only one can be displayed at a time. Here is an example of what happens when an indexed color image is displayed with any color palette that is not its own adaptive palette: === False color palettes === Arbitrary gradient color scales, usually 256 shades, with no relationship with real colors of a given image. They are employed to artificially colorize a grayscale image to reveal details and/or to map the pixel level values to amounts of some physical magnitude (potential, temperature, altitude, etc.) Note, in the example above, that new details can be seen as blue over magenta in the background's dark areas of the original photograph. Here is the 8-bit, 256 color gradient palette used with the color test chart and the image sample above: There exist many false color palettes, some of them standardized, used mainly in scientific applications: astronomy and radioastronomy, satellite land imaging, thermography, study of materials, tomography and magnetic resonance imaging in medicine, etc.

    Read more →
  • Snapshot isolation

    Snapshot isolation

    In databases, and transaction processing (transaction management), snapshot isolation is a guarantee that all reads made in a transaction will see a consistent snapshot of the database (in practice it reads the last committed values that existed at the time it started), and the transaction itself will successfully commit only if no updates it has made conflict with any concurrent updates made since that snapshot. Snapshot isolation has been adopted by several major database management systems, such as InterBase, Firebird, Oracle, MySQL, PostgreSQL, SQL Anywhere, MongoDB and Microsoft SQL Server (2005 and later). The main reason for its adoption is that it allows better performance than serializability, yet still avoids most of the concurrency anomalies that serializability avoids (but not all). In practice snapshot isolation is implemented within multiversion concurrency control (MVCC), where generational values of each data item (versions) are maintained: MVCC is a common way to increase concurrency and performance by generating a new version of a database object each time the object is written, and allowing transactions' read operations of several last relevant versions (of each object). Snapshot isolation has been used to criticize the ANSI SQL-92 standard's definition of isolation levels, as it exhibits none of the "anomalies" that the SQL standard prohibited, yet is not serializable (the anomaly-free isolation level defined by ANSI). In spite of its distinction from serializability, snapshot isolation is sometimes referred to as serializable by Oracle. == Definition == A transaction executing under snapshot isolation appears to operate on a personal snapshot of the database, taken at the start of the transaction. When the transaction concludes, it will successfully commit only if the values updated by the transaction have not been changed externally since the snapshot was taken. Such a write–write conflict will cause the transaction to abort. In a write skew anomaly, two transactions (T1 and T2) concurrently read an overlapping data set (e.g. values V1 and V2), concurrently make disjoint updates (e.g. T1 updates V1, T2 updates V2), and finally concurrently commit, neither having seen the update performed by the other. Were the system serializable, such an anomaly would be impossible, as either T1 or T2 would have to occur "first", and be visible to the other. In contrast, snapshot isolation permits write skew anomalies. As a concrete example, imagine V1 and V2 are two balances held by a single person, Phil. The bank will allow either V1 or V2 to run a deficit, provided the total held in both is never negative (i.e. V1 + V2 ≥ 0). Both balances are currently $100. Phil initiates two transactions concurrently, T1 withdrawing $200 from V1, and T2 withdrawing $200 from V2. If the database guaranteed serializable transactions, the simplest way of coding T1 is to deduct $200 from V1, and then verify that V1 + V2 ≥ 0 still holds, aborting if not. T2 similarly deducts $200 from V2 and then verifies V1 + V2 ≥ 0. Since the transactions must serialize, either T1 happens first, leaving V1 = −$100, V2 = $100, and preventing T2 from succeeding (since V1 + (V2 − $200) is now −$200), or T2 happens first and similarly prevents T1 from committing. If the database is under snapshot isolation(MVCC), however, T1 and T2 operate on private snapshots of the database: each deducts $200 from an account, and then verifies that the new total is zero, using the other account value that held when the snapshot was taken. Since neither update conflicts, both commit successfully, leaving V1 = V2 = −$100, and V1 + V2 = −$200. Some systems built using multiversion concurrency control (MVCC) may support (only) snapshot isolation to allow transactions to proceed without worrying about concurrent operations, and more importantly without needing to re-verify all read operations when the transaction finally commits. This is convenient because MVCC maintains a series of recent history consistent states. The only information that must be stored during the transaction is a list of updates made, which can be scanned for conflicts fairly easily before being committed. However, MVCC systems (such as MarkLogic) will use locks to serialize writes together with MVCC to obtain some of the performance gains and still support the stronger "serializability" level of isolation. == Workarounds == Potential inconsistency problems arising from write skew anomalies can be fixed by adding (otherwise unnecessary) updates to the transactions in order to enforce the serializability property. Materialize the conflict Add a special conflict table, which both transactions update in order to create a direct write–write conflict. Promotion Have one transaction "update" a read-only location (replacing a value with the same value) in order to create a direct write–write conflict (or use an equivalent promotion, e.g. Oracle's SELECT FOR UPDATE). In the example above, we can materialize the conflict by adding a new table which makes the hidden constraint explicit, mapping each person to their total balance. Phil would start off with a total balance of $200, and each transaction would attempt to subtract $200 from this, creating a write–write conflict that would prevent the two from succeeding concurrently. However, this approach violates the normal form. Alternatively, we can promote one of the transaction's reads to a write. For instance, T2 could set V1 = V1, creating an artificial write–write conflict with T1 and, again, preventing the two from succeeding concurrently. This solution may not always be possible. In general, therefore, snapshot isolation puts some of the problem of maintaining non-trivial constraints onto the user, who may not appreciate either the potential pitfalls or the possible solutions. The upside to this transfer is better performance. == Terminology == Snapshot isolation is called "serializable" mode in Oracle and PostgreSQL versions prior to 9.1, which may cause confusion with the "real serializability" mode. There are arguments both for and against this decision; what is clear is that users must be aware of the distinction to avoid possible undesired anomalous behavior in their database system logic. == History == Snapshot isolation arose from work on multiversion concurrency control databases, where multiple versions of the database are maintained concurrently to allow readers to execute without colliding with writers. Such a system allows a natural definition and implementation of such an isolation level. InterBase, later owned by Borland, was acknowledged to provide SI rather than full serializability in version 4, and likely permitted write-skew anomalies since its first release in 1985. Unfortunately, the ANSI SQL-92 standard was written with a lock-based database in mind, and hence is rather vague when applied to MVCC systems. Berenson et al. wrote a paper in 1995 critiquing the SQL standard, and cited snapshot isolation as an example of an isolation level that did not exhibit the standard anomalies described in the ANSI SQL-92 standard, yet still had anomalous behaviour when compared with serializable transactions. In 2008, Cahill et al. showed that write-skew anomalies could be prevented by detecting and aborting "dangerous" triplets of concurrent transactions. This implementation of serializability is well-suited to multiversion concurrency control databases, and has been adopted in PostgreSQL 9.1, where it is known as Serializable Snapshot Isolation (SSI). When used consistently, this eliminates the need for the above workarounds. The downside over snapshot isolation is an increase in aborted transactions. This can perform better or worse than snapshot isolation with the above workarounds, depending on workload.

    Read more →
  • DataViva

    DataViva

    DataViva is an information visualization engine created by the Strategic Priorities Office of the government of Minas Gerais. DataViva makes official data about exports, industries, locations and occupations available for the entirety of Brazil through eight apps and more than 100 million possible visualizations. The first set of datum – also available at ALICEWEB – is provided by MDIC (Ministry of Development, Industry and Foreign Trade) / SECEX (Secretariat of Foreign Trade), an official institution of the Government of Brazil and shows foreign trade statistics for all exporting municipalities in the country. The other database, provided by Ministério do Trabalho e Emprego (MTE – Ministry of Labor and Employment), shows information about all the industries and occupations in Brazil (RAIS – Annual Social Information Report). The platform consists of eight core applications, each of which allows different ways of visualizing the data available. Some applications are descriptive, that is, showing data aggregated at various levels in a simple and comparative way, such as Treemapping. Others are prescriptive, using calculations that allow an analytic visualization of the data, based on theories such as the Product Space. All the applications are generated using D3plus, an open source JavaScript library built on top of D3.js by Alexander Simoes and Dave Landry. Inspired by The Observatory of Economic Complexity, DataViva is an open data, open-source, and free to use tool. It was developed in a partnership with Datawheel, co-founded by MIT Media Lab Professor César Hidalgo, and is maintained by the Government of Minas Gerais.

    Read more →
  • Global serializability

    Global serializability

    In concurrency control of databases, transaction processing (transaction management), and other transactional distributed applications, global serializability (or modular serializability) is a property of a global schedule of transactions. A global schedule is the unified schedule of all the individual database (and other transactional object) schedules in a multidatabase environment (e.g., federated database). Complying with global serializability means that the global schedule is serializable, has the serializability property, while each component database (module) has a serializable schedule as well. In other words, a collection of serializable components provides overall system serializability, which is usually incorrect. A need in correctness across databases in multidatabase systems makes global serializability a major goal for global concurrency control (or modular concurrency control). With the proliferation of the Internet, Cloud computing, Grid computing, and small, portable, powerful computing devices (e.g., smartphones), as well as increase in systems management sophistication, the need for atomic distributed transactions and thus effective global serializability techniques, to ensure correctness in and among distributed transactional applications, seems to increase. In a federated database system or any other more loosely defined multidatabase system, which are typically distributed in a communication network, transactions span multiple (and possibly distributed) databases. Enforcing global serializability in such system, where different databases may use different types of concurrency control, is problematic. Even if every local schedule of a single database is serializable, the global schedule of a whole system is not necessarily serializable. The massive communication exchanges of conflict information needed between databases to reach conflict serializability globally would lead to unacceptable performance, primarily due to computer and communication latency. Achieving global serializability effectively over different types of concurrency control has been open for several years. == The global serializability problem == === Problem statement === The difficulties described above translate into the following problem: Find an efficient (high-performance and fault tolerant) method to enforce Global serializability (global conflict serializability) in a heterogeneous distributed environment of multiple autonomous database systems. The database systems may employ different concurrency control methods. No limitation should be imposed on the operations of either local transactions (confined to a single database system) or global transactions (span two or more database systems). === Quotations === Lack of an appropriate solution for the global serializability problem has driven researchers to look for alternatives to serializability as a correctness criterion in a multidatabase environment (e.g., see Relaxing global serializability below), and the problem has been characterized as difficult and open. The following two quotations demonstrate the mindset about it by the end of the year 1991, with similar quotations in numerous other articles: "Without knowledge about local as well as global transactions, it is highly unlikely that efficient global concurrency control can be provided... Additional complications occur when different component DBMSs [Database Management Systems] and the FDBMSs [Federated Database Management Systems] support different concurrency mechanisms... It is unlikely that a theoretically elegant solution that provides conflict serializability without sacrificing performance (i.e., concurrency and/or response time) and availability exists." === Proposed solutions === Several solutions, some partial, have been proposed for the global serializability problem. Among them: Global conflict graph (serializability graph, precedence graph) checking Distributed Two-phase locking (Distributed 2PL) Distributed Timestamp ordering Tickets (local logical timestamps which define local total orders, and are propagated to determine global partial order of transactions) == Relaxing global serializability == Some techniques have been developed for relaxed global serializability (i.e., they do not guarantee global serializability; see also Relaxing serializability). Among them (with several publications each): Quasi serializability Two-level serializability Another common reason nowadays for Global serializability relaxation is the requirement of availability of internet products and services. This requirement is typically answered by large scale data replication. The straightforward solution for synchronizing replicas' updates of a same database object is including all these updates in a single atomic distributed transaction. However, with many replicas such a transaction is very large, and may span several computers and networks that some of them are likely to be unavailable. Thus such a transaction is likely to end with abort and miss its purpose. Consequently, Optimistic replication (Lazy replication) is often utilized (e.g., in many products and services by Google, Amazon, Yahoo, and alike), while global serializability is relaxed and compromised for eventual consistency. In this case relaxation is done only for applications that are not expected to be harmed by it. Classes of schedules defined by relaxed global serializability properties either contain the global serializability class, or are incomparable with it. What differentiates techniques for relaxed global conflict serializability (RGCSR) properties from those of relaxed conflict serializability (RCSR) properties that are not RGCSR is typically the different way global cycles (span two or more databases) in the global conflict graph are handled. No distinction between global and local cycles exists for RCSR properties that are not RGCSR. RCSR contains RGCSR. Typically RGCSR techniques eliminate local cycles, i.e., provide local serializability (which can be achieved effectively by regular, known concurrency control methods); however, obviously they do not eliminate all global cycles (which would achieve global serializability).

    Read more →
  • Computer Law & Security Review

    Computer Law & Security Review

    The Computer Law & Security Review is an international peer-reviewed journal published by Elsevier. It has been published six times a year since 1985 and is indexed in Scopus and SSCI. It is accessible to a wide range of professional legal and IT practitioners, businesses, academics, researchers, libraries and organisations in both the public and private sectors. The journal regularly covers: CLSR Briefing with special emphasis on UK/US developments European Union update National news from 10 European jurisdictions Pacific rim news column Refereed practitioner and academic papers on topics such as Web 2.0, IT security, Identity management, ID cards, RFID, interference with privacy, Internet law, telecoms regulation, online broadcasting, intellectual property, software law, e-commerce, outsourcing, data protection and freedom of information and many other topics. The Journal's Correspondent Panel includes more than 40 specialists in IT law and security. Each issue contains articles, case law analysis and current news on information and communications technology. Special Features High quality peer reviewed papers from internationally renowned practitioner and academic experts Latest developments reported in situ by more than 20 leading law firms from around the world Highly experienced and respected editor and correspondents panel Online access to all 23 volumes of CLSR with embedded web links to primary sources Contact details of all authors A pool of expertise that can collectively identify the key topics that need to be examined.

    Read more →
  • Glow (app)

    Glow (app)

    Glow is a fertility awareness and period-tracking app. It is part of a suite of mobile apps focused on women's reproductive health and childcare, which includes Eve by Glow (a dedicated period tracker), Glow Nurture (a pregnancy tracker), and Glow Baby (a baby development tracker). The Glow company also operates an online shop that sells several fertility-related products, including ovulation test strips, pregnancy tests, and wearable breast pumps. In 2024, Glow was reported to have approximately 25 million users across its various apps and community message boards. == History == Glow debuted in August 2013 as an iOS app. It was founded by Michael Huang and Max Levchin and launched with $6 million in Series A funding from venture capital firms Founders Fund and Andreesen Horowitz. In 2014, Glow raised an additional $17 million in Series B funding, with Formation 8 joining existing investors. In 2015, Glow launched Ruby, an app dedicated to sexual health. That year, Wired reported that the company had added features to their apps allowing men to monitor their fertility. Glow subsequently released an additional set of apps focused on pregnancy tracking and infant development. In 2016, Glow reported that it had a total of approximately 3 million users; by 2018, this had grown to 15 million. Vox described it as one of the “big two” period and fertility tracking apps and the one that had started the “boom” in the femtech space. == Application and features == Glow was initially described as a fertility application that applied data-driven methods to menstrual and ovulation tracking. Core features include cycle logging, ovulation prediction, and symptom tracking. The app also provides educational content related to reproductive health and childcare, as well as a set of online message boards that allow individuals to share experiences and seek peer support. == Privacy and legal issues == Glow has received significant media attention for its privacy and security practices. In 2016, Consumer Reports identified potential exploits in the Glow app that they claimed could have exposed private user data to hackers. Glow subsequently reported that it had fixed the vulnerabilities and told The Washington Post they had no evidence that user data had been compromised. In September 2020, the California Attorney General announced a settlement with Glow related to Consumer Reports’ findings, which included a $250,000 civil penalty. Following the US Supreme Court's 2022 Dobbs v. Jackson ruling, which legalized state-level bans on abortion, Glow (and other fertility trackers, such as Clue and Flo) came under additional scrutiny over concerns that user data on abortions could be reported to law enforcement. After this surge of media interest, a research team affiliated with the University of New South Wales conducted an investigation into the privacy practices of several popular fertility apps, including Glow. Their review of Glow was mixed, noting that they provided several privacy settings and de-identified sensitive data, but that user information could still be disclosed in the future if the app was sold. Glow rejected that claim, telling the Australian Associated Press that it "did not share" personal data. The company also cited several internal security measures it had implemented and its apps' offline data protection setting, which allows users to permanently delete their health-related data. == Reception == In 2014, Fast Company reported that 20,000 women had used Glow to conceive. Later that year, The Guardian included Glow Nurture on its list of the best iPhone apps of 2014. Media coverage often praised Glow's array of menstrual tracking options, although some reviews also noted that fertility apps are not birth control tools and cautioned against relying on them for that purpose. In 2019, Cosmopolitan singled Glow's community of users as one of its standout features.

    Read more →
  • Linked timestamping

    Linked timestamping

    Linked timestamping is a type of trusted timestamping where issued time-stamps are related to each other. Each time-stamp would contain data that authenticates the time-stamp before it, the authentication would be authenticating the entire message, including the previous time-stamps authentication, making a chain. This makes it impossible to add a time-stamp in to the middle of the chain, as any time-stamps afterwards would be different. == Description == Linked timestamping creates time-stamp tokens which are dependent on each other, entangled in some authenticated data structure. Later modification of the issued time-stamps would invalidate this structure. The temporal order of issued time-stamps is also protected by this data structure, making backdating of the issued time-stamps impossible, even by the issuing server itself. The top of the authenticated data structure is generally published in some hard-to-modify and widely witnessed media, like printed newspaper or public blockchain. There are no (long-term) private keys in use, avoiding PKI-related risks. Suitable candidates for the authenticated data structure include: Linear hash chain Merkle tree (binary hash tree) Skip list The simplest linear hash chain-based time-stamping scheme is illustrated in the following diagram: The linking-based time-stamping authority (TSA) usually performs the following distinct functions: Aggregation For increased scalability the TSA might group time-stamping requests together which arrive within a short time-frame. These requests are aggregated together without retaining their temporal order and then assigned the same time value. Aggregation creates a cryptographic connection between all involved requests; the authenticating aggregate value will be used as input for the linking operation. Linking Linking creates a verifiable and ordered cryptographic link between the current and already issued time-stamp tokens. Publishing The TSA periodically publishes some links, so that all previously issued time-stamp tokens depend on the published link and that it is practically impossible to forge the published values. By publishing widely witnessed links, the TSA creates unforgeable verification points for validating all previously issued time-stamps. == Security == Linked timestamping is inherently more secure than the usual, public-key signature based time-stamping. All consequential time-stamps "seal" previously issued ones - hash chain (or other authenticated dictionary in use) could be built only in one way; modifying issued time-stamps is nearly as hard as finding a preimage for the used cryptographic hash function. Continuity of operation is observable by users; periodic publications in widely witnessed media provide extra transparency. Tampering with absolute time values could be detected by users, whose time-stamps are relatively comparable by system design. Absence of secret keys increases system trustworthiness. There are no keys to leak and hash algorithms are considered more future-proof than modular arithmetic based algorithms, e.g. RSA. Linked timestamping scales well - hashing is much faster than public key cryptography. There is no need for specific cryptographic hardware with its limitations. The common technology for guaranteeing long-term attestation value of the issued time-stamps (and digitally signed data) is periodic over-time-stamping of the time-stamp token. Because of missing key-related risks and of the plausible safety margin of the reasonably chosen hash function this over-time-stamping period of hash-linked token could be an order of magnitude longer than of public-key signed token. == Research == === Foundations === Stuart Haber and W. Scott Stornetta proposed in 1990 to link issued time-stamps together into linear hash-chain, using a collision-resistant hash function. The main rationale was to diminish TSA trust requirements. Tree-like schemes and operating in rounds were proposed by Benaloh and de Mare in 1991 and by Bayer, Haber and Stornetta in 1992. Benaloh and de Mare constructed a one-way accumulator in 1994 and proposed its use in time-stamping. When used for aggregation, one-way accumulator requires only one constant-time computation for round membership verification. Surety started the first commercial linked timestamping service in January 1995. Linking scheme is described and its security is analyzed in the following article by Haber and Sornetta. Buldas et al. continued with further optimization and formal analysis of binary tree and threaded tree based schemes. Skip-list based time-stamping system was implemented in 2005; related algorithms are quite efficient. === Provable security === Security proof for hash-function based time-stamping schemes was presented by Buldas, Saarepera in 2004. There is an explicit upper bound N {\displaystyle N} for the number of time stamps issued during the aggregation period; it is suggested that it is probably impossible to prove the security without this explicit bound - the so-called black-box reductions will fail in this task. Considering that all known practically relevant and efficient security proofs are black-box, this negative result is quite strong. Next, in 2005 it was shown that bounded time-stamping schemes with a trusted audit party (who periodically reviews the list of all time-stamps issued during an aggregation period) can be made universally composable - they remain secure in arbitrary environments (compositions with other protocols and other instances of the time-stamping protocol itself). Buldas, Laur showed in 2007 that bounded time-stamping schemes are secure in a very strong sense - they satisfy the so-called "knowledge-binding" condition. The security guarantee offered by Buldas, Saarepera in 2004 is improved by diminishing the security loss coefficient from N {\displaystyle N} to N {\displaystyle {\sqrt {N}}} . The hash functions used in the secure time-stamping schemes do not necessarily have to be collision-resistant or even one-way; secure time-stamping schemes are probably possible even in the presence of a universal collision-finding algorithm (i.e. universal and attacking program that is able to find collisions for any hash function). This suggests that it is possible to find even stronger proofs based on some other properties of the hash functions. At the illustration above hash tree based time-stamping system works in rounds ( t {\displaystyle t} , t + 1 {\displaystyle t+1} , t + 2 {\displaystyle t+2} , ...), with one aggregation tree per round. Capacity of the system ( N {\displaystyle N} ) is determined by the tree size ( N = 2 l {\displaystyle N=2^{l}} , where l {\displaystyle l} denotes binary tree depth). Current security proofs work on the assumption that there is a hard limit of the aggregation tree size, possibly enforced by the subtree length restriction. == Standards == ISO 18014 part 3 covers 'Mechanisms producing linked tokens'. American National Standard for Financial Services, "Trusted Timestamp Management and Security" (ANSI ASC X9.95 Standard) from June 2005 covers linking-based and hybrid time-stamping schemes. There is no IETF RFC or standard draft about linking based time-stamping. RFC 4998 (Evidence Record Syntax) encompasses hash tree and time-stamp as an integrity guarantee for long-term archiving.

    Read more →