Multilinear principal component analysis

Multilinear principal component analysis

Multilinear principal component analysis (MPCA) is a multilinear extension of principal component analysis (PCA) that is used to analyze M-way arrays, also informally referred to as "data tensors". M-way arrays may be modeled by linear tensor models, such as CANDECOMP/Parafac, or by multilinear tensor models, such as multilinear principal component analysis (MPCA) or multilinear (tensor) independent component analysis (MICA). In 2005, Vasilescu and Terzopoulos introduced the Multilinear PCA terminology as a way to better differentiate between multilinear data models that employed 2nd order statistics versus higher order statistics to compute a set of independent components for each mode, such as Multilinear ICA Multilinear PCA may be applied to compute the causal factors of data formation, or as signal processing tool on data tensors whose individual observation have either been vectorized, or whose observations are treated as a collection of column/row observations, an "observation as a matrix", and concatenated into a data tensor. The latter approach is suitable for compression and reducing redundancy in the rows, columns and fibers that are unrelated to the causal factors of data formation. Vasilescu and Terzopoulos in their paper "TensorFaces" introduced the M-mode SVD algorithm which are algorithms misidentified in the literature as the HOSVD or the Tucker which employ the power method or gradient descent, respectively. Vasilescu and Terzopoulos framed the data analysis, recognition and synthesis problems as multilinear tensor problems. Data is viewed as the compositional consequence of several causal factors, that are well suited for multi-modal tensor factor analysis. The power of the tensor framework was showcased by analyzing human motion joint angles, facial images or textures in the following papers: Human Motion Signatures (CVPR 2001, ICPR 2002), face recognition – TensorFaces, (ECCV 2002, CVPR 2003, etc.) and computer graphics – TensorTextures (Siggraph 2004). == The algorithm == The MPCA solution follows the alternating least square (ALS) approach. It is iterative in nature. As in PCA, MPCA works on centered data. Centering is a little more complicated for tensors, and it is problem dependent. == Feature selection == MPCA features: Supervised MPCA is employed in causal factor analysis that facilitates object recognition while a semi-supervised MPCA feature selection is employed in visualization tasks. == Extensions == Various extension of MPCA: Robust MPCA (RMPCA) Multi-Tensor Factorization, that also finds the number of components automatically (MTF)

2024 National Public Data breach

In August 2024, three class-action lawsuits were filed against National Public Data along with over 14 complaints filed in federal court, claiming that the company permitted hackers to steal sensitive private information covering millions of individuals. The theft was alleged to have occurred in April 2024. One of the lawsuits specifically claims that in April, a hacker going by the moniker "USDoD" posted a notice on the dark web, offering the data for sale at the price of US$3.5 million. The information stolen is alleged to include 2.9 billion records containing full names, current and past addresses, Social Security numbers, dates of birth, and telephone numbers. The stolen data contains records for people in the US, UK, and Canada. National Public Data confirmed on August 16, 2024, there was a breach originating from someone trying to breach their systems since December 2023, with the breach occurring from April 2024 and over the next few months. The company also confirmed that 2.9 billion records were obtained, though they were still working to determine how many people were affected by the breach, and were working with law enforcement to identify the hacker. == Jerico Pictures == Jerico Pictures, Inc., doing business as National Public Data, was a data broker company that performed employee background checks. Their primary service was collecting information from public data sources, including criminal records, addresses, and employment history, and offering that information for sale. On October 2, 2024, Jerico Pictures filed for Chapter 11 bankruptcy as it currently faces over a dozen lawsuits over the breach, and is potentially liable "for credit monitoring for hundreds of millions of potentially impacted individuals." In December 2024, National Public Data shut down, showing a closure notice on its website.

DVD

DVD (digital video disc or digital versatile disc) is a digital optical disc data storage format. It was invented and developed in 1995 and first released on November 1, 1996, in Japan. The medium can store any kind of digital data and has been widely used to store video programs (watched using DVD players), software and other computer files. DVDs offer significantly higher storage capacity than compact discs (CD) while having the same dimensions. A standard single-layer DVD can store up to 4.7 GB of data, a dual-layer DVD up to 8.5 GB. Dual-layer, double-sided DVDs can store up to a maximum of 17.08 GB. Prerecorded DVDs are mass-produced using molding machines that physically stamp data onto the DVD. Such discs are a form of DVD-ROM because data can only be read and not written or erased. Blank recordable DVD discs (DVD-R and DVD+R) can be recorded once using a DVD recorder and then function as a DVD-ROM. Rewritable DVDs (DVD-RW, DVD+RW, and DVD-RAM) can be recorded and erased many times. DVDs are used in DVD-Video consumer digital video format and less commonly in DVD-Audio consumer digital audio format, as well as for authoring DVD discs written in a special AVCHD format to hold high definition material (often in conjunction with AVCHD format camcorders). DVDs containing other types of information may be referred to as DVD data discs. == Etymology == The Oxford English Dictionary comments that, "In 1995, rival manufacturers of the product initially named digital video disc agreed that, in order to emphasize the flexibility of the format for multimedia applications, the preferred abbreviation DVD would be understood to denote digital versatile disc." The OED also states that in 1995, "The companies said the official name of the format will simply be DVD. Toshiba had been using the name 'digital video disc', but that was switched to 'digital versatile disc' after computer companies complained that it left out their applications." "Digital versatile disc" is the explanation provided in a DVD Forum Primer from 2000 and in the DVD Forum's mission statement, which the purpose is to promote broad acceptance of DVD products on technology, across entertainment, and other industries. Because DVDs became highly popular for the distribution of movies in the 2000s, the term DVD became popularly used in English as a noun to describe specifically a full-length movie released on the format; for example the phrase "to watch a DVD" describes watching a movie on DVD. == History == === Development and launch === Released in 1987, CD Video used analog video encoding on optical discs matching the established standard 120 mm (4.7 in) size of audio CDs. Video CD (VCD) became one of the first formats for distributing digitally encoded films in this format, in 1993. In the same year, two new optical disc storage formats were being developed. One was the Multimedia Compact Disc (MMCD), backed by Philips and Sony (developers of the CD and CD-i), and the other was the Super Density (SD) disc, supported by Toshiba, Time Warner, Matsushita Electric, Hitachi, Mitsubishi Electric, Pioneer, Thomson, and JVC. By the time of the press launches for both formats in January 1995, the MMCD nomenclature had been dropped, and Philips and Sony were referring to their format as Digital Video Disc (DVD). On May 3, 1995, an ad hoc industry technical group formed from five computer companies (IBM, Apple, Compaq, Hewlett-Packard, and Microsoft) issued a press release stating that they would only accept a single format. The group voted to boycott both formats unless the two camps agreed on a single, converged standard. They recruited Lou Gerstner, president of IBM, to pressure the executives of the warring factions. In one significant compromise, the MMCD and SD groups agreed to adopt proposal SD 9, which specified that both layers of the dual-layered disc be read from the same side—instead of proposal SD 10, which would have created a two-sided disc that users would have to turn over. Philips/Sony strongly insisted on the source code, EFMPlus, that Kees Schouhamer Immink had designed for the MMCD, because it makes it possible to apply the existing CD servo technology. Its drawback was a loss from 5 to 4.7 Gigabytes of capacity. As a result, the DVD specification provided a storage capacity of 4.7 GB (4.38 GiB) for a single-layered, single-sided disc and 8.5 GB (7.92 GiB) for a dual-layered, single-sided disc. The DVD specification ended up similar to Toshiba and Matsushita's Super Density Disc, except for the dual-layer option. MMCD was single-sided and optionally dual-layer, whereas SD was two half-thickness, single-layer discs which were pressed separately and then glued together to form a double-sided disc. Philips and Sony decided that it was in their best interests to end the format war, and on September 15, 1995 agreed to unify with companies backing the Super Density Disc to release a single format, with technologies from both. After other compromises between MMCD and SD, the group of computer companies won the day, and a single format was agreed upon. The computer companies also collaborated with the Optical Storage Technology Association (OSTA) on the use of their implementation of the ISO-13346 file system (known as Universal Disk Format) for use on the new DVDs. The format's details were finalized on December 8, 1995. In November 1995, Samsung announced it would start mass-producing DVDs by September 1996. The format launched on November 1, 1996, in Japan, mostly with music video releases. The first major releases from Warner Home Video arrived on December 20, 1996, with four titles being available. The format's release in the U.S. was delayed multiple times, from August 1996, to October 1996, November 1996, before finally settling on early 1997. Players began to be produced domestically that winter, with March 24, 1997, as the U.S. launch date of the format proper in seven test markets. Approximately 32 titles were available on launch day, mainly from the Warner Bros., MGM, and New Line libraries, with the notable inclusion of the 1996 film Twister. However, the launch was planned for the following day (March 25), leading to a distribution change with retailers and studios to prevent similar violations of breaking the street date. The nationwide rollout for the format happened on August 22, 1997. DTS announced in late 1997 that they would be coming onto the format. The sound system company revealed details in a November 1997 online interview, and clarified it would release discs in early 1998. However, this date would be pushed back several times before finally releasing their first titles at the 1999 Consumer Electronics Show. In 2001, blank DVD recordable discs cost the equivalent of $27.34 US dollars in 2022. === Adoption === Movie and home entertainment distributors adopted the DVD format to replace the ubiquitous VHS tape as the primary consumer video distribution format. Immediately following the formal adoption of a unified standard for DVD, two of the four leading video game console companies (Sega and The 3DO Company) said they already had plans to design a gaming console with DVDs as the source medium. Sony stated at the time that they had no plans to use DVD in their gaming systems, despite being one of the developers of the DVD format and eventually the first company to actually release a DVD-based console. Game consoles such as the PlayStation 2, Xbox, and Xbox 360 use DVDs as their source medium for games and other software. Contemporary games for Windows were also distributed on DVD. Early DVDs were mastered using DLT tape, but using DVD-R DL or +R DL eventually became common. TV DVD combos, combining a standard definition CRT TV or an HD flat panel TV with a DVD mechanism under the CRT or on the back of the flat panel, and VCR/DVD combos were also available for purchase. For consumers, DVD soon overtook VHS as the favored choice for home movie releases. In 2001, DVD players outsold VCRs for the first time in the United States. At that time, one in four American households owned a DVD player. By 2007, about 80% of Americans owned a DVD player, a figure that had surpassed VCRs; it was also higher than personal computers or cable television. == Specifications == The DVD specifications created and updated by the DVD Forum are published as so-called DVD Books (e.g. DVD-ROM Book, DVD-Audio Book, DVD-Video Book, DVD-R Book, DVD-RW Book, DVD-RAM Book, DVD-AR (Audio Recording) Book, DVD-VR (Video Recording) Book, etc.). DVD discs are made up of two discs; normally one is blank, and the other contains data. Each disc is 0.6 mm thick, and they are glued together to form a DVD disc. The gluing process must be done carefully to make the disc as flat as possible to avoid both birefringence and "disc tilt", which is when the disc is not perfectly flat, preventing it from being read. Some specifications for mechanical, physical and optical characteristics of DV

Enterprise social software

Enterprise social software (also known as or regarded as a major component of Enterprise 2.0), comprises social software as used in "enterprise" (business/commercial) contexts. It includes social and networked modifications to corporate intranets and other classic software platforms used by large companies to organize their communication. In contrast to traditional enterprise software, which imposes structure prior to use, enterprise social software tends to encourage use prior to providing structure. Carl Frappaolo and Dan Keldsen defined Enterprise 2.0 in a report written for Association for Information and Image Management (AIIM) as "a system of web-based technologies that provide rapid and agile collaboration, information sharing, emergence and integration capabilities in the extended enterprise". == Applications == === Functionality === Social software for an enterprise must (according to Andrew McAfee, Associate Professor, Harvard Business School) have the following functionality to work well: Search: allowing users to search for other users or content Links: grouping similar users or content together Authoring: including blogs and wikis Tags: allowing users to tag content Extensions: recommendations of users; or content based on profile Signals: allowing people to subscribe to users or content with RSS feeds McAfee recommends installing easy-to-use software which does not impose any rigid structure on users. He envisages an informal roll-out, but on a common platform to enable future collaboration between areas. He also recommends strong and visible managerial support to achieve this. In 2007 Dion Hinchcliffe expanded the list above by adding the following four functions: Freeform function: no barriers to authorship (meaning free from a learning curve or from restrictions) Network-oriented function, requiring web-addressable content in all cases Social function: stressing transparency (to access), diversity (in content and community members) and openness (to structure) Emergence function: requiring the provision of approaches that detect and leverage the collective wisdom of the community Enterprise search differs from a typical web search in its focus on "use within an organization by employees seeking information held internally, in a variety of formats and locations, including databases, document management systems, and other repositories". === Criticism === There has been recent criticism that the adaptation of the social paradigm (e.g. openness and altruistic behavior) does not always work well for the enterprise setting, which led some authors to question the proper functioning of enterprise social software. The findings from a novel study suggests that free and non-anonymous sharing of trusted information (beyond marketing or product information) is significantly influenced by concerns from business users.

Web syndication

Web syndication is making content available from one website to other sites. Most commonly, websites are made available to provide either summaries or full renditions of a website's recently added content. The term may also describe other kinds of content licensing for reuse. Contemporary web syndicates include: MSN, Excite, and Yahoo! News. == Motivation == For the subscribing sites, syndication is an effective way of adding greater depth and immediacy of information to their pages, making them more attractive to users. For the provider site, syndication increases exposure. This generates new traffic for the provider site—making syndication an easy and relatively cheap, or even free, form of advertisement. Content syndication has become an effective strategy for link building, as search engine optimization has become an increasingly important topic among website owners and online marketers. Links embedded within the syndicated content are typically optimized around anchor terms that will point an optimized link back to the website that the content author is trying to promote. These links tell the algorithms of the search engines that the website being linked to is an authority for the keyword that is being used as the anchor text. However the rollout of Google Panda's algorithm may not reflect this authority in its SERP rankings based on quality scores generated by the sites linking to the authority. The prevalence of web syndication is also of note to online marketers, since web surfers are becoming increasingly wary of providing personal information for marketing materials (such as signing up for a newsletter) and expect the ability to subscribe to a feed instead. Although the format could be anything transported over HTTP, such as HTML or JavaScript, it is more commonly XML. Web syndication formats include RSS, Atom, and JSON Feed. == History == Syndication first arose in earlier media such as print, radio, and television, allowing content creators to reach a wider audience. In the case of radio, the United States Federal government proposed a syndicate in 1924 so that the country's executives could quickly and efficiently reach the entire population. In the case of television, it is often said that "Syndication is where the real money is." Additionally, syndication accounts for the bulk of TV programming. One predecessor of web syndication is the Meta Content Framework (MCF), developed in 1996 by Ramanathan V. Guha and others in Apple Computer's Advanced Technology Group. Today, millions of online publishers, including newspapers, commercial websites, and blogs, distribute their news headlines, product offers, and blog postings in the news feed. == As a commercial model == Conventional syndication businesses such as Reuters and Associated Press thrive on the internet by offering their content to media partners on a subscription basis, using business models established in earlier media forms. Commercial web syndication can be categorized in three ways: by business models by types of content by methods for selecting distribution partners Commercial web syndication involves partnerships between content producers and distribution outlets. There are different structures of partnership agreements. One such structure is licensing content, in which distribution partners pay a fee to the content creators for the right to publish the content. Another structure is ad-supported content, in which publishers share revenues derived from advertising on syndicated content with that content's producer. A third structure is free, or barter syndication, in which no currency changes hands between publishers and content producers. This requires the content producers to generate revenue from another source, such as embedded advertising or subscriptions. Alternatively, they could distribute content without remuneration. Typically, those who create and distribute content free are promotional entities, vanity publishers, or government entities. Types of content syndicated include RSS or Atom Feeds and full content. With RSS feeds, headlines, summaries, and sometimes a modified version of the original full content is displayed on users' feed readers. With full content, the entire content—which might be text, audio, video, applications/widgets, or user-generated content—appears unaltered on the publisher's site. There are two methods for selecting distribution partners. The content creator can hand-pick syndication partners based on specific criteria, such as the size or quality of their audiences. Alternatively, the content creator can allow publisher sites or users to opt into carrying the content through an automated system. Some of these automated "content marketplace" systems involve careful screening of potential publishers by the content creator to ensure that the material does not end up in an inappropriate environment. Just as syndication is a source of profit for TV producers and radio producers, it also functions to maximize profit for Internet content producers. As the Internet has increased in size it has become increasingly difficult for content producers to aggregate a sufficiently large audience to support the creation of high-quality content. Syndication enables content creators to amortize the cost of producing content by licensing it across multiple publishers or by maximizing the distribution of advertising-supported content. A potential drawback for content creators, however, is that they can lose control over the presentation of their content when they syndicate it to other parties. Distribution partners benefit by receiving content either at a discounted price, or free. One potential drawback for publishers, however, is that because the content is duplicated at other publisher sites, they cannot have an "exclusive" on the content. For users, the fact that syndication enables the production and maintenance of content allows them to find and consume content on the Internet. One potential drawback for them is that they may run into duplicate content, which could be an annoyance. == E-commerce == Web syndication has been used to distribute product content such as feature descriptions, images, and specifications. As manufacturers are regarded as authorities and most sales are not achieved on manufacturer websites, manufacturers allow retailers or dealers to publish the information on their sites. Through syndication, manufacturers may pass relevant information to channel partners. Such web syndication has been shown to increase sales. Web syndication has also been found effective as a search engine optimization technique.

Common data model

A common data model (CDM) can refer to any standardised data model which allows for data and information exchange between different applications and data sources. Common data models aim to standardise logical infrastructure so that related applications can "operate on and share the same data", and can be seen as a way to "organize data from many sources that are in different formats into a standard structure". A common data model has been described as one of the components of a "strong information system". A standardised common data model has also been described as a typical component of a well designed agile application besides a common communication protocol. Providing a single common data model within an organisation is one of the typical tasks of a data warehouse. == Examples of common data models == === Border crossings === X-trans.eu was a cross-border pilot project between the Free State of Bavaria (Germany) and Upper Austria with the aim of developing a faster procedure for the application and approval of cross-border large-capacity transports. The portal was based on a common data model that contained all the information required for approval. === Climate data === The Climate Data Store Common Data Model is a common data model set up by the Copernicus Climate Change Service for harmonising essential climate variables from different sources and data providers. === General information technology === Within service-oriented architecture, S-RAMP is a specification released by HP, IBM, Software AG, TIBCO, and Red Hat which defines a common data model for SOA repositories as well as an interaction protocol to facilitate the use of common tooling and sharing of data. Content Management Interoperability Services (CMIS) is an open standard for inter-operation of different content management systems over the internet, and provides a common data model for typed files and folders used with version control. The NetCDF software libraries for array-oriented scientific data implements a common data model called the NetCDF Java common data model, which consists of three layers built on top of each other to add successively richer semantics. === Health === Within genomic and medical data, the Observational Medical Outcomes Partnership (OMOP) research program established under the U.S. National Institutes of Health has created a common data model for claims and electronic health records which can accommodate data from different sources around the world. PCORnet, which was developed by the Patient-Centered Outcomes Research Institute, is another common data model for health data including electronic health records and patient claims. The Sentinel Common Data Model was initially started as Mini-Sentinel in 2008. It is used by the Sentinel Initiative of the USA's Food and Drug Administration. The Generalized Data Model was first published in 2019. It was designed to be a stand-alone data model as well as to allow for further transformation into other data models (e.g., OMOP, PCORNet, Sentinel). It has a hierarchical structure to flexibly capture relationships among data elements. The JANUS clinical trial data repository also provides a common data model which is based on the SDTM standard to represent clinical data submitted to regulatory agencies, such as tabulation datasets, patient profiles, listings, etc. === Logistics === SX000i is a specification developed jointly by the Aerospace and Defence Industries Association of Europe (ASD) and the American Aerospace Industries Association (AIA) to provide information, guidance and instructions to ensure compatibility and the commonality. The associated SX002D specification contains a common data model. === Microsoft Common Data Model === The Microsoft Common Data Model is a collection of many standardised extensible data schemas with entities, attributes, semantic metadata, and relationships, which represent commonly used concepts and activities in various businesses areas. It is maintained by Microsoft and its partners, and is published on GitHub. Microsoft's Common Data Model is used amongst others in Microsoft Dataverse and with various Microsoft Power Platform and Microsoft Dynamics 365 services. === Rail transport === RailTopoModel is a common data model for the railway sector. === Other === There are many more examples of various common data models for different uses published by different sources.

Blocking of Twitter in Nigeria

Twitter was blocked in Nigeria from 5 June 2021 to 13 January 2022. The government imposed a ban on the social network after it deleted tweets made by, and temporarily suspended, the Nigerian president Muhammadu Buhari, warning the southeastern people of Nigeria, predominantly Igbo people, of a potential repeat of the 1967 Nigerian Civil War due to the ongoing insurgency in Southeastern Nigeria. The Nigerian government claimed that the deletion of the president's tweets factored into their decision, but it was ultimately based on "a litany of problems with the social media platform in Nigeria, where misinformation and fake news spread through it have had real world violent consequences", citing the persistent use of the platform for activities that are capable of undermining Nigeria's corporate existence. In January 2022, Nigeria lifted its blocking of Twitter after the platform agreed to establish a legal entity within the country sometime in the first quarter of 2022. == Background == On 1 June 2021, Nigerian President Muhammadu Buhari posted a tweet threatening a crackdown on regional separatists "in the language they understand". The next day, Twitter deleted the tweet, claiming it was in violation of Twitter rules, but gave no further details. Nigeria's Information Minister Lai Mohammed said that Twitter's actions were part of an unfair double standard, as Twitter had not banned incitement tweets from other groups. During the Nigerian Civil War a majority of deaths resulted from the blockade of Biafra which caused the deaths of millions of civilians from starvation, a fact that was not alluded to in the tweet. The Nigerian government has long held concerns over the use of Twitter in the country. The ongoing local End SARS protest began on Twitter and got amplified in 2020 when it had 48 million tweets in ten days. Buhari's government floated the idea of social media regulation on different occasions prior to banning Twitter. Attempts to pass an anti-social media bill in the past have failed majorly due to massive outcry on Twitter. Days before the ban, the country's minister of information called Twitter's activities in Nigeria suspicious, citing its influence on the End SARS protests. == Aftermath == Three days after Twitter was suspended, it was reported that the move had cost the country over 6 billion naira and would also contribute to the worsening unemployment in the country. ExpressVPN reported an over 200 percent increase in web traffic and searches for VPN spiked across the country. In response, Nigeria's Minister of Justice and Attorney General of the Federation Abubakar Malami at first openly threatened to prosecute citizens who bypass the ban using a VPN but then denied saying so after a screenshot of a Twitter deactivation notification he shared on Facebook showed a VPN logo. Nigeria's cultural minister Lai Mohammed stated the ban would be lifted once Twitter submitted to locally licensing, registration and conditions. "It will be licensed by the broadcasting commission, and must agree not to allow its platform to be used by those who are promoting activities that are inimical to the corporate existence of Nigeria." In late June 2021, Twitter announced it would enter talks with the Nigerian government over the platform's suspension. The talks began in July 2021. On 15 September 2021, Mohammed said the Nigerian government will lift the ban on Twitter in a "few days." The Minister said Twitter gave a progress report of their talks with them, adding that it has been productive and quite respectful. On 1 October 2021, President Muhammadu Buhari in his Independence Day broadcast said Twitter must meet the Nigerian government's five conditions before the suspension of the social media platform will be lifted. The conditions are: Respect for national security and cohesion; registration, physical presence and representation in Nigeria; fair taxation; dispute resolution; local content. == Reactions == The ban was condemned by Amnesty International, the British, Canadian and Swedish diplomatic missions to Nigeria, as well as the United States and the European Union in a joint statement. Two domestic organizations, the Socio-Economic Rights and Accountability Project (SERAP) and the Nigerian Bar Association, indicated intent to challenge the ban in court. Twitter itself called the ban "deeply concerning". Former U.S. President Donald Trump, who was permanently suspended from Twitter following the United States Capitol attack in January, praised the ban, stating "Congratulations to the country of Nigeria, who just banned Twitter because they banned their President", and also called on other countries to ban Twitter and Facebook due to "not allowing free and open speech." == Lifting of the ban == On 12 January 2022, the Nigerian Government lifted the ban after Twitter agreed to pay an "applicable tax" and establish "a legal entity in Nigeria during the first quarter of 2022".