Trigrams are a special case of the n-gram, where n is 3. They are often used in natural language processing for performing statistical analysis of texts and in cryptography for control and use of ciphers and codes. See results of analysis of "Letter Frequencies in the English Language". == Frequency == Context is very important, varying analysis rankings and percentages are easily derived by drawing from different sample sizes, different authors; or different document types: poetry, science-fiction, technology documentation; and writing levels: stories for children versus adults, military orders, and recipes. Typical cryptanalytic frequency analysis finds that the 16 most common character-level trigrams in English are: Because encrypted messages sent by telegraph often omit punctuation and spaces, cryptographic frequency analysis of such messages includes trigrams that straddle word boundaries. This causes trigrams such as "edt" to occur frequently, even though it may never occur in any one word of those messages. == Examples == The sentence "the quick red fox jumps over the lazy brown dog" has the following word-level trigrams: the quick red quick red fox red fox jumps fox jumps over jumps over the over the lazy the lazy brown lazy brown dog And the word-level trigram "the quick red" has the following character-level trigrams (where an underscore "_" marks a space): the he_ e_q _qu qui uic ick ck_ k_r _re red
Kernel embedding of distributions
In machine learning, the kernel embedding of distributions (also called the kernel mean or mean map) comprises a class of nonparametric methods in which a probability distribution is represented as an element of a reproducing kernel Hilbert space (RKHS). A generalization of the individual data-point feature mapping done in classical kernel methods, the embedding of distributions into infinite-dimensional feature spaces can preserve all of the statistical features of arbitrary distributions, while allowing one to compare and manipulate distributions using Hilbert space operations such as inner products, distances, projections, linear transformations, and spectral analysis. This learning framework is very general and can be applied to distributions over any space Ω {\displaystyle \Omega } on which a sensible kernel function (measuring similarity between elements of Ω {\displaystyle \Omega } ) may be defined. For example, various kernels have been proposed for learning from data which are: vectors in R d {\displaystyle \mathbb {R} ^{d}} , discrete classes/categories, strings, graphs/networks, images, time series, manifolds, dynamical systems, and other structured objects. The theory behind kernel embeddings of distributions has been primarily developed by Alex Smola, Le Song, Arthur Gretton, and Bernhard Schölkopf. A review of recent works on kernel embedding of distributions can be found in. The analysis of distributions is fundamental in machine learning and statistics, and many algorithms in these fields rely on information theoretic approaches such as entropy, mutual information, or Kullback–Leibler divergence. However, to estimate these quantities, one must first either perform density estimation, or employ sophisticated space-partitioning/bias-correction strategies which are typically infeasible for high-dimensional data. Commonly, methods for modeling complex distributions rely on parametric assumptions that may be unfounded or computationally challenging (e.g. Gaussian mixture models), while nonparametric methods like kernel density estimation (Note: the smoothing kernels in this context have a different interpretation than the kernels discussed here) or characteristic function representation (via the Fourier transform of the distribution) break down in high-dimensional settings. Methods based on the kernel embedding of distributions sidestep these problems and also possess the following advantages: Data may be modeled without restrictive assumptions about the form of the distributions and relationships between variables Intermediate density estimation is not needed Practitioners may specify the properties of a distribution most relevant for their problem (incorporating prior knowledge via choice of the kernel) If a characteristic kernel is used, then the embedding can uniquely preserve all information about a distribution, while thanks to the kernel trick, computations on the potentially infinite-dimensional RKHS can be implemented in practice as simple Gram matrix operations Dimensionality-independent rates of convergence for the empirical kernel mean (estimated using samples from the distribution) to the kernel embedding of the true underlying distribution can be proven. Learning algorithms based on this framework exhibit good generalization ability and finite sample convergence, while often being simpler and more effective than information theoretic methods Thus, learning via the kernel embedding of distributions offers a principled drop-in replacement for information theoretic approaches and is a framework which not only subsumes many popular methods in machine learning and statistics as special cases, but also can lead to entirely new learning algorithms. == Definitions == Let X {\displaystyle X} denote a random variable with domain Ω {\displaystyle \Omega } and distribution P {\displaystyle P} . Given a symmetric, positive-definite kernel k : Ω × Ω → R {\displaystyle k:\Omega \times \Omega \rightarrow \mathbb {R} } the Moore–Aronszajn theorem asserts the existence of a unique RKHS H {\displaystyle {\mathcal {H}}} on Ω {\displaystyle \Omega } (a Hilbert space of functions f : Ω → R {\displaystyle f:\Omega \to \mathbb {R} } equipped with an inner product ⟨ ⋅ , ⋅ ⟩ H {\displaystyle \langle \cdot ,\cdot \rangle _{\mathcal {H}}} and a norm ‖ ⋅ ‖ H {\displaystyle \|\cdot \|_{\mathcal {H}}} ) for which k {\displaystyle k} is a reproducing kernel, i.e., in which the element k ( x , ⋅ ) {\displaystyle k(x,\cdot )} satisfies the reproducing property ⟨ f , k ( x , ⋅ ) ⟩ H = f ( x ) ∀ f ∈ H , ∀ x ∈ Ω . {\displaystyle \langle f,k(x,\cdot )\rangle _{\mathcal {H}}=f(x)\qquad \forall f\in {\mathcal {H}},\quad \forall x\in \Omega .} One may alternatively consider x ↦ k ( x , ⋅ ) {\displaystyle x\mapsto k(x,\cdot )} as an implicit feature mapping φ : Ω → H {\displaystyle \varphi :\Omega \rightarrow {\mathcal {H}}} (which is therefore also called the feature space), so that k ( x , x ′ ) = ⟨ φ ( x ) , φ ( x ′ ) ⟩ H {\displaystyle k(x,x')=\langle \varphi (x),\varphi (x')\rangle _{\mathcal {H}}} can be viewed as a measure of similarity between points x , x ′ ∈ Ω . {\displaystyle x,x'\in \Omega .} While the similarity measure is linear in the feature space, it may be highly nonlinear in the original space depending on the choice of kernel. === Kernel embedding === The kernel embedding of the distribution P {\displaystyle P} in H {\displaystyle {\mathcal {H}}} (also called the kernel mean or mean map) is given by: μ X := E [ k ( X , ⋅ ) ] = E [ φ ( X ) ] = ∫ Ω φ ( x ) d P ( x ) {\displaystyle \mu _{X}:=\mathbb {E} [k(X,\cdot )]=\mathbb {E} [\varphi (X)]=\int _{\Omega }\varphi (x)\ \mathrm {d} P(x)} If P {\displaystyle P} allows a square integrable density p {\displaystyle p} , then μ X = E k p {\displaystyle \mu _{X}={\mathcal {E}}_{k}p} , where E k {\displaystyle {\mathcal {E}}_{k}} is the Hilbert–Schmidt integral operator. A kernel is characteristic if the mean embedding μ : { family of distributions over Ω } → H {\displaystyle \mu :\{{\text{family of distributions over }}\Omega \}\to {\mathcal {H}}} is injective. Each distribution can thus be uniquely represented in the RKHS and all statistical features of distributions are preserved by the kernel embedding if a characteristic kernel is used. === Empirical kernel embedding === Given n {\displaystyle n} training examples { x 1 , … , x n } {\displaystyle \{x_{1},\ldots ,x_{n}\}} drawn independently and identically distributed (i.i.d.) from P , {\displaystyle P,} the kernel embedding of P {\displaystyle P} can be empirically estimated as μ ^ X = 1 n ∑ i = 1 n φ ( x i ) {\displaystyle {\widehat {\mu }}_{X}={\frac {1}{n}}\sum _{i=1}^{n}\varphi (x_{i})} === Joint distribution embedding === If Y {\displaystyle Y} denotes another random variable (for simplicity, assume the co-domain of Y {\displaystyle Y} is also Ω {\displaystyle \Omega } with the same kernel k {\displaystyle k} which satisfies ⟨ φ ( x ) ⊗ φ ( y ) , φ ( x ′ ) ⊗ φ ( y ′ ) ⟩ = k ( x , x ′ ) k ( y , y ′ ) {\displaystyle \langle \varphi (x)\otimes \varphi (y),\varphi (x')\otimes \varphi (y')\rangle =k(x,x')k(y,y')} ), then the joint distribution P ( x , y ) ) {\displaystyle P(x,y))} can be mapped into a tensor product feature space H ⊗ H {\displaystyle {\mathcal {H}}\otimes {\mathcal {H}}} via C X Y = E [ φ ( X ) ⊗ φ ( Y ) ] = ∫ Ω × Ω φ ( x ) ⊗ φ ( y ) d P ( x , y ) {\displaystyle {\mathcal {C}}_{XY}=\mathbb {E} [\varphi (X)\otimes \varphi (Y)]=\int _{\Omega \times \Omega }\varphi (x)\otimes \varphi (y)\ \mathrm {d} P(x,y)} By the equivalence between a tensor and a linear map, this joint embedding may be interpreted as an uncentered cross-covariance operator C X Y : H → H {\displaystyle {\mathcal {C}}_{XY}:{\mathcal {H}}\to {\mathcal {H}}} from which the cross-covariance of functions f , g ∈ H {\displaystyle f,g\in {\mathcal {H}}} can be computed as Cov ( f ( X ) , g ( Y ) ) := E [ f ( X ) g ( Y ) ] − E [ f ( X ) ] E [ g ( Y ) ] = ⟨ f , C X Y g ⟩ H = ⟨ f ⊗ g , C X Y ⟩ H ⊗ H {\displaystyle \operatorname {Cov} (f(X),g(Y)):=\mathbb {E} [f(X)g(Y)]-\mathbb {E} [f(X)]\mathbb {E} [g(Y)]=\langle f,{\mathcal {C}}_{XY}g\rangle _{\mathcal {H}}=\langle f\otimes g,{\mathcal {C}}_{XY}\rangle _{{\mathcal {H}}\otimes {\mathcal {H}}}} Given n {\displaystyle n} pairs of training examples { ( x 1 , y 1 ) , … , ( x n , y n ) } {\displaystyle \{(x_{1},y_{1}),\dots ,(x_{n},y_{n})\}} drawn i.i.d. from P {\displaystyle P} , we can also empirically estimate the joint distribution kernel embedding via C ^ X Y = 1 n ∑ i = 1 n φ ( x i ) ⊗ φ ( y i ) {\displaystyle {\widehat {\mathcal {C}}}_{XY}={\frac {1}{n}}\sum _{i=1}^{n}\varphi (x_{i})\otimes \varphi (y_{i})} === Conditional distribution embedding === Given a conditional distribution P ( y ∣ x ) , {\displaystyle P(y\mid x),} one can define the corresponding RKHS embedding as μ Y ∣ x = E [ φ ( Y ) ∣ X ] = ∫ Ω φ ( y ) d P ( y ∣ x ) {\displaystyle \mu _{Y\mid x}=\mathbb {E} [\varphi (Y)\mid X]=\int _{\Omega
Interim Measures for the Management of Generative AI Services
The Interim Measures for the Management of Generative AI Services (Chinese: 生成式人工智能服务管理暂行办法; pinyin: Shēngchéng shì réngōng zhìnéng fúwù guǎnlǐ zànxíng bànfǎ) are a set of regulations governing public-facing generative artificial intelligence services in China. Issued on 10 July 2023 and effective from 15 August 2023, they were China's first binding regulation specifically targeting generative AI. They have been described as among the earliest such regulations adopted by any country. The measures were jointly issued by the Cyberspace Administration of China (CAC) and six other national bodies: the National Development and Reform Commission, the Ministry of Education, the Ministry of Science and Technology, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the National Radio and Television Administration. Among the measures' most prominent requirements is that generative AI services must uphold Core Socialist Values and must not generate content that could subvert state power, harm national security, or undermine social stability. The measures also require providers of public-facing generative AI services to undergo security assessments and register their algorithms with the CAC. As of December 2025, 748 generative AI services had completed the filing process at the national level. == Background == The Interim Measures build on two earlier sets of regulations targeting specific algorithm applications. The Administrative Provisions on Algorithm Recommendation for Internet Information Services, effective from March 2022, established China's algorithm registry and required providers of recommendation algorithms with "public opinion properties or social mobilization capabilities" to file with the CAC and undergo security assessments. The Administrative Provisions on Deep Synthesis of Internet Information Services, effective from January 2023, extended similar requirements to algorithms used for generating synthetic media such as deepfakes. In April 2023, the CAC released a draft of the generative AI regulation for public comment. The draft included several requirements that attracted attention, including that generated content should "embody Core Socialist Values" and that training data should be "true and accurate". The public consultation period ran until May 2023. The final version, published in July 2023, was substantially revised from the draft. According to an analysis by the Future of Privacy Forum, changes appeared to reflect feedback from industry stakeholders including Baidu, Xiaomi, SenseTime, and others, as well as input from government-affiliated research institutes. The final measures adopted a more permissive tone, with the CAC describing its approach as "inclusive and prudent" (包容审慎) and emphasising "classified and graded" (分类分级) supervision. == Scope == The measures apply to services that use generative AI technology to provide text, images, audio, video, or other content to the public within mainland China (Article 2). They do not apply to organisations that develop or use generative AI internally without offering services to the domestic public, such as industry associations, enterprises, and research institutions. Overseas providers whose services are accessible to users in China are also subject to the measures. == Key provisions == === Content requirements === Article 4 sets out the core content obligations. Providers and users of generative AI services must uphold the Core Socialist Values. The measures prohibit generating content that incites subversion of national sovereignty or the socialist system, endangers national security or the nation's image, incites separatism, promotes terrorism or extremism, promotes ethnic hatred or discrimination, or contains violence, obscenity, or false information prohibited by law. These content prohibitions largely mirror those in Article 12 of the Cybersecurity Law and in prior regulations governing online content. Article 4 also requires that models be designed and trained to avoid discrimination, that services respect intellectual property rights, and that providers take effective measures to improve the transparency and accuracy of generated content. === Training data and labelling === Article 7 requires providers to ensure that training data is of high quality and legitimately sourced, and that it does not infringe upon intellectual property rights. Where personal information is used, consent must be obtained. The final version of this provision removed language from the draft that would have held providers responsible for the "legitimacy" of all pretraining data, replacing it with a requirement to "employ effective measures to improve the quality of training data". Article 8 requires providers to establish labelling rules for training data and to conduct quality assessments of data annotations. Article 12 requires that generated images, videos, and other synthetic content be labelled as AI-generated. === User rights and privacy === Article 11 requires providers to protect user privacy, to minimise the collection and retention of personal data, and to refrain from unlawfully sharing user information. Users have the right to request review, correction, or deletion of their personal information. Article 10 requires providers to take measures to prevent excessive dependence on or addiction to generative AI services by minors. === Security assessment and algorithm filing === Article 17 requires that providers of generative AI services with "public opinion properties or the capacity for social mobilization" (具有舆论属性或者社会动员能力) carry out security assessments and complete algorithm filing procedures in accordance with the Administrative Provisions on Algorithm Recommendation for Internet Information Services. == Implementation == === Algorithm filing process === In practice, the filing requirements under the Interim Measures have developed into a two-tier process. The first tier is the standard algorithm filing (算法备案) under the pre-existing Algorithm Recommendation Provisions, which involves submitting information about an algorithm's design, purpose, and data sources to the CAC. This process is primarily a registration mechanism. For public-facing generative AI products, there is an additional, more rigorous process commonly referred to as the "large model filing" (大模型备案). This involves submitting a security self-assessment report, data annotation rules, a keyword blocking list, and evaluation test question sets. The process includes technical testing at the provincial level, followed by review at the national CAC level. The algorithm filing targets specific algorithms, while the large model filing evaluates the broader system architecture, training data, model parameters, and potential social impact. The CAC publishes lists of generative AI services that have successfully completed the filing process. The first such list was published on 2 April 2024. According to the CAC's year-end announcements, 302 generative AI services had completed national-level filing by the end of 2024 (of which 238 were new that year), alongside 105 applications that completed local-level registration. By the end of 2025, the cumulative total had risen to 748 national-level filings and 435 local-level registrations. === Content compliance and testing === According to the Carnegie Endowment, the CAC has conducted compliance audits of generative AI services with a particular focus on ensuring appropriate responses to queries about politically sensitive topics. The large model filing process requires providers to pass both provincial-level and national-level technical testing before their services can be made available to the public. On 1 March 2024, the National Technical Committee 260 on Cybersecurity (TC260) published TC260-003, the Basic Security Requirements for Generative AI Services (生成式人工智能服务安全基本要求), a technical standard that provides detailed guidance on the security assessments required under the Interim Measures. The standard covers requirements for training data safety, model security, and content safety evaluation, and is used as a reference for the filing process. == Analysis == === Relationship to broader Chinese internet regulation === The content requirements in the Interim Measures extend China's existing framework for online information control to generative AI. Legal scholars have noted that the "Core Socialist Values" provision and the specific content prohibitions are consistent with longstanding requirements imposed on internet platforms under the Cybersecurity Law and related regulations. The Asia Society Policy Institute has described the Chinese government's highest regulatory priority in this area as retaining control of information, noting that content-related obligations receive stricter enforcement than other provisions. === Nature of the filing system === The character of the filing system has been debated by scholars. Angela Huyue Zh
Someday (short story)
"Someday" is a science fiction short story by American writer Isaac Asimov. It was first published in the August 1956 issue of Infinity Science Fiction and reprinted in the collections Earth Is Room Enough (1957), The Complete Robot (1982), Robot Visions (1990), and The Complete Stories, Volume 1 (1990). == Plot summary == The story is set in a future where computers play a central role in organizing society. Humans are employed as computer operators, but they leave most of the thinking to machines. Indeed, whilst binary programming is taught at school, reading and writing have become obsolete. The story concerns a pair of boys who dismantle and upgrade an old Bard, a child's computer whose sole function is to generate random fairy tales. The boys download a book about computers into the Bard's memory in an attempt to expand its vocabulary, but the Bard simply incorporates computers into its standard fairy tale repertoire. The story ends with the boys excitedly leaving the room after deciding to go to the library to learn "squiggles" (writing) as a means of passing secret messages to one another. As they leave, one of the boys accidentally kicks the Bard's on switch. The Bard begins reciting a new story about a poor mistreated and often ignored robot called the Bard, whose sole purpose is to tell stories, which ends with the words: "the little computer knew then that computers would always grow wiser and more powerful until someday—someday—someday—…"
Digital organism
A digital organism is a self-replicating computer program that mutates and evolves. Digital organisms are used as a tool to study the dynamics of Darwinian evolution, and to test or verify specific hypotheses or mathematical models of evolution. The study of digital organisms is closely related to the area of artificial life. == History == Digital organisms can be traced back to the game Darwin, developed in 1961 at Bell Labs, in which computer programs had to compete with each other by trying to stop others from executing . A similar implementation that followed this was the game Core War. In Core War, it turned out that one of the winning strategies was to replicate as fast as possible, which deprived the opponent of all computational resources. Programs in the Core War game were also able to mutate themselves and each other by overwriting instructions in the simulated "memory" in which the game took place. This allowed competing programs to embed damaging instructions in each other that caused errors (terminating the process that read it), "enslaved processes" (making an enemy program work for you), or even change strategies mid-game and heal themselves. Steen Rasmussen at Los Alamos National Laboratory took the idea from Core War one step further in his core world system by introducing a genetic algorithm that automatically wrote programs. However, Rasmussen did not observe the evolution of complex and stable programs. It turned out that the programming language in which core world programs were written was very brittle, and more often than not mutations would completely destroy the functionality of a program. The first to solve the issue of program brittleness was Thomas S. Ray with his Tierra system, which was similar to core world. Ray made some key changes to the programming language such that mutations were much less likely to destroy a program. With these modifications, he observed for the first time computer programs that did indeed evolve in a meaningful and complex way. Later, Chris Adami, Titus Brown, and Charles Ofria started developing their Avida system, which was inspired by Tierra but again had some crucial differences. In Tierra, all programs lived in the same address space and could potentially execute or otherwise interfere with each other's code. In Avida, on the other hand, each program lives in its own address space. Because of this modification, experiments with Avida became much cleaner and easier to interpret than those with Tierra. With Avida, digital organism research has begun to be accepted as a valid contribution to evolutionary biology by a growing number of evolutionary biologists. Evolutionary biologist Richard Lenski of Michigan State University has used Avida extensively in his work. Lenski, Adami, and their colleagues have published in journals such as Nature and the Proceedings of the National Academy of Sciences (USA). In 1996, Andy Pargellis created a Tierra-like system called Amoeba that evolved self-replication from a randomly seeded initial condition. More recently REvoSim - a software package based around binary digital organisms - has allowed evolutionary simulations of large populations that can be run for geological timescales.
Common data model
A common data model (CDM) can refer to any standardised data model which allows for data and information exchange between different applications and data sources. Common data models aim to standardise logical infrastructure so that related applications can "operate on and share the same data", and can be seen as a way to "organize data from many sources that are in different formats into a standard structure". A common data model has been described as one of the components of a "strong information system". A standardised common data model has also been described as a typical component of a well designed agile application besides a common communication protocol. Providing a single common data model within an organisation is one of the typical tasks of a data warehouse. == Examples of common data models == === Border crossings === X-trans.eu was a cross-border pilot project between the Free State of Bavaria (Germany) and Upper Austria with the aim of developing a faster procedure for the application and approval of cross-border large-capacity transports. The portal was based on a common data model that contained all the information required for approval. === Climate data === The Climate Data Store Common Data Model is a common data model set up by the Copernicus Climate Change Service for harmonising essential climate variables from different sources and data providers. === General information technology === Within service-oriented architecture, S-RAMP is a specification released by HP, IBM, Software AG, TIBCO, and Red Hat which defines a common data model for SOA repositories as well as an interaction protocol to facilitate the use of common tooling and sharing of data. Content Management Interoperability Services (CMIS) is an open standard for inter-operation of different content management systems over the internet, and provides a common data model for typed files and folders used with version control. The NetCDF software libraries for array-oriented scientific data implements a common data model called the NetCDF Java common data model, which consists of three layers built on top of each other to add successively richer semantics. === Health === Within genomic and medical data, the Observational Medical Outcomes Partnership (OMOP) research program established under the U.S. National Institutes of Health has created a common data model for claims and electronic health records which can accommodate data from different sources around the world. PCORnet, which was developed by the Patient-Centered Outcomes Research Institute, is another common data model for health data including electronic health records and patient claims. The Sentinel Common Data Model was initially started as Mini-Sentinel in 2008. It is used by the Sentinel Initiative of the USA's Food and Drug Administration. The Generalized Data Model was first published in 2019. It was designed to be a stand-alone data model as well as to allow for further transformation into other data models (e.g., OMOP, PCORNet, Sentinel). It has a hierarchical structure to flexibly capture relationships among data elements. The JANUS clinical trial data repository also provides a common data model which is based on the SDTM standard to represent clinical data submitted to regulatory agencies, such as tabulation datasets, patient profiles, listings, etc. === Logistics === SX000i is a specification developed jointly by the Aerospace and Defence Industries Association of Europe (ASD) and the American Aerospace Industries Association (AIA) to provide information, guidance and instructions to ensure compatibility and the commonality. The associated SX002D specification contains a common data model. === Microsoft Common Data Model === The Microsoft Common Data Model is a collection of many standardised extensible data schemas with entities, attributes, semantic metadata, and relationships, which represent commonly used concepts and activities in various businesses areas. It is maintained by Microsoft and its partners, and is published on GitHub. Microsoft's Common Data Model is used amongst others in Microsoft Dataverse and with various Microsoft Power Platform and Microsoft Dynamics 365 services. === Rail transport === RailTopoModel is a common data model for the railway sector. === Other === There are many more examples of various common data models for different uses published by different sources.
Vehicle infrastructure integration
The Vehicle Infrastructure Integration (VII), also known as "Connected Roadways" or "vehicle-to-everything" (V2X) technology, is a United States Department of Transportation initiative that aims to improve road safety by developing technology that connects road vehicles with their environment. This development draws on several disciplines, including transport engineering, electrical engineering, automotive engineering, telematics, and computer science. Although VII specifically covers road transport, similar technologies are under development for other modes of transport. For example, airplanes may use ground-based beacons for automated guidance, allowing the autopilot to fly the plane without human intervention. == Goals == The goal of VII is to establish a communication link between vehicles (via On-Board Equipment, or OBE) and roadside infrastructure (via Roadside Equipment, or RSE) to enhance the safety, efficiency, and convenience of transportation systems. Two potential approaches are the widespread deployment of a dedicated short-range communications (DSRC) link on the 5.9GHz band, and cellular communication (C-V2X). Either of these methods would allow vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. The initiative has three priorities: Stakeholder evaluation and acceptance of the business model and its deployment schedule, Validation of the technology, with a focus on communications systems, in relation to deployment costs, and Creation of legal structures and policies, especially concerning digital privacy, to improve the system's long-term potential for success. === Safety === Current automotive safety technology relies primarily on vehicle-based radar, lidar, and sonar systems. This technology allows, for instance, a potential reduction in rear-end collisions by monitoring obstacles in front of or behind the vehicle and automatically applying the brakes when necessary. This technology, however, is limited by the sensing range of vehicle-based radar, particularly in angled and left-turn collisions, such as a motorist losing control of the vehicle during an impending head-on collision. The rear-end collisions addressed by current technology are generally less severe than angled, left-turn, or head-on collisions. VII promotes the development of a direct communication link between road vehicles and all other vehicles nearby, allowing for the exchange of information on vehicle speed and orientation or driver awareness and intent. This real-time exchange of information may enable more effective automated emergency maneuvers, such as steering, decelerating, or braking. In addition to nearby vehicle awareness, VII promotes a communication link between vehicles and roadway infrastructure. Such a link may allow for improved real-time traffic information, better queue management, and feedback to vehicles. Existing implementations of VII use vehicle-based sensors that can recognize and respond to roadway markings or signs, automatically adjusting vehicle parameters to follow the recognized instructions. However, this information may also be acquired via roadside beacons or stored in a centralized database accessible to all vehicles. === Efficiency === With a VII system in place, vehicles will be linked together. The headway between vehicles may therefore be reduced so that there is less empty space on the road, increasing the available capacity per lane. More capacity per lane will in turn imply fewer lanes in general, possibly satisfying the community's concerns about the impact of roadway widening. VII will enable precise traffic-signal coordination by tracking vehicle platoons and will benefit from accurate timing by drawing on real-time traffic data covering volume, density, and turning movements. Real-time traffic data can also be used in the design of new roadways or modification of existing systems as the data could be used to provide accurate origin-destination studies and turning-movement counts for uses in transportation forecasting and traffic operations. Such technology would also lead to improvements for transport engineers to address problems whilst reducing the cost of obtaining and compiling data. Tolling is another prospect for VII technology as it could enable roadways to be automatically tolled. Data could be collectively transmitted to road users for in-vehicle display, outlining the lowest cost, shortest distance, and/or fastest route to a destination on the basis of real-time conditions. === Existing applications === To some extent, results along these lines have been achieved in trials performed around the globe, making use of GPS, mobile phone signals, and vehicle registration plates. GPS is becoming standard in many new high-end vehicles and is an option on most new low- and mid-range vehicles. In addition, many users also have mobile phones that transmit trackable signals (and may also be GPS-enabled). Mobile phones can already be traced for purposes of emergency response. GPS and mobile phone tracking, however, do not provide fully reliable data. Furthermore, integrating mobile phones in vehicles may be prohibitively difficult. Data from mobile phones, though useful, might even increase risks to motorists as they tend to look at their phones rather than concentrate on their driving. Automatic registration plate recognition can provide large quantities of data, but continuously tracking a vehicle through a corridor is a difficult task with existing technology. Today's equipment is designed for data acquisition and functions such as enforcement and tolling, not for returning data to vehicles or motorists for response. GPS will nevertheless be one of the key components in VII systems. == Limitations == === Privacy === VII architecture is designed to prevent identification of individual vehicles, with all data exchange between the vehicle and the system occurring anonymously. Exchanges between the vehicles and third parties such as OEMs and toll collectors will occur, but the network traffic will be sent via encrypted tunnels and will therefore not be decipherable by the VII system. Data sharing with law enforcement or Homeland Security was not included in system design as of 2006. === Technical issues === ==== Coordination ==== A major issue facing the deployment of VII is the problem of how to set up the system initially. The costs associated with installing the technology in vehicles and providing communications and power at every intersection are significant. ==== Maintenance ==== Another factor for consideration in regard to the technology's distribution is how to update and maintain the units. Traffic systems are highly dynamic, with new traffic controls implemented every day and roadways constructed or repaired every year. The vehicle-based option could be updated via the internet (preferably wireless) but may subsequently require all users to have access to internet technology. Alternatively, if receivers were placed in all vehicles and the VII system was primarily located along the roadside, information could be stored in a centralized database. This would allow the agency responsible to issue updates at any time. These would then be disseminated to the roadside units for passing motorists. Operationally, this method is currently considered to provide the greatest effectiveness but at a high cost to the authorities. ==== Security ==== Security of the units is another concern, especially in light of the public acceptance issue. Criminals could tamper, remove, or destroy VII units regardless of whether they are installed inside vehicles or along the roadside. Magnets, electric shocks, and malicious software (viruses, hacking, or jamming) could be used to damage VII systems – regardless of whether units are located inside vehicle or along the roadside. == Recent developments == Much of the current research and experimentation is conducted in the United States where coordination is ensured through the Vehicle Infrastructure Integration Consortium; consisting of automobile manufacturers (Ford, General Motors, Daimler Chrysler, Toyota, Nissan, Honda, Volkswagen, BMW), IT suppliers, U.S. Federal and state transportation departments, and professional associations. Trialing is taking place in Michigan and California. The specific applications now being developed under the U.S. initiative are: Warning drivers of unsafe conditions or imminent collisions. Warning drivers if they are about to run off the road or speed around a curve too fast. Informing system operators of real-time congestion, weather conditions and incidents. Providing operators with information on corridor capacity for real-time management, planning and provision of corridor-wide advisories to drivers. In mid-2007, a VII environment covering some 20 square miles (52 km2) near Detroit was used to test 20 prototype VII applications. Several automobile manufacturers are also conducting their own VII research and triali