AI Chat Vumc


  • MoltenVK

    MoltenVK

    MoltenVK is a software library which allows Vulkan applications to run on top of Metal on Apple's macOS, iOS, and tvOS operating systems. It is the first software component to be released for the Vulkan Portability Initiative, a project to have a subset of Vulkan run on platforms lacking native Vulkan drivers. There are some limitations compared with a native Vulkan implementation. == History == MoltenVK was first released as a proprietary and commercially licensed product by The Brenwill Workshop on July 27, 2016. On July 31, 2017, Khronos announced the formation of the Vulkan Portability Technical Subgroup. === Open source === On February 26, 2018, Khronos announced that Vulkan became available on macOS and iOS products through the MoltenVK library. Valve announced that Dota 2 would run on macOS using the Vulkan API with the aid of MoltenVK, and that they had made an arrangement with developer The Brenwill Workshop Ltd to release MoltenVK as open-source software under the Apache License version 2.0. On May 30, 2018, Qt was updated with Vulkan for Qt on macOS using MoltenVK. On May 31, 2018, optional Vulkan support for Dota 2 on macOS was released. Benchmarks for the game were available the following day, showing better performance using Vulkan and MoltenVK compared to OpenGL. On July 20, 2018, Wine was updated with Vulkan support on macOS using MoltenVK. On 29 July 2018, the first app using MoltenVK was accepted onto the App Store, after initially being rejected. On 6 August 2018, Google open-sourced Filament, a cross-platform real-time physically based rendering engine with MoltenVK for macOS/iOS. On November 28, 2018, Valve released Artifact, their first Vulkan-only game on macOS using MoltenVK. === Version 1.0 === On 29 January 2019, MoltenVK 1.0.32 was released with an early prototype of the Vulkan Portability Extensions. RPCS3 and Dolphin emulators were updated with Vulkan support on macOS using MoltenVK. 
On 13 April 2019, MoltenVK 1.0.34 was released with support for tessellation. On July 30, 2019, MoltenVK 1.0.36 was released targeting Metal 3.0. On January 23, 2020, MoltenVK was updated to support some of the new features of Vulkan 1.2, as of Vulkan SDK 1.2.121. On July 31, 2020, MoltenVK 1.0.44 was released, adding support for the tvOS platform. === Version 1.1 === On October 1, 2020, MoltenVK 1.1.0 was released, adding full support for Vulkan 1.1, as of Vulkan SDK 1.2.154. On 9 December 2020, MoltenVK 1.1.1 was released, providing support for Vulkan on Apple silicon GPUs and support for the Mac Catalyst platform for porting iOS/iPadOS apps to macOS. === Version 1.2 === On October 18, 2022, MoltenVK 1.2.0 was released, adding full support for Vulkan 1.2 as of Vulkan SDK 1.3.231. In January 2023, MoltenVK 1.2.2 added support for Vulkan as of SDK 1.3.239; that version of the Vulkan SDK also fixed some issues with the interconnectivity with the Metal API. Version 1.2.3 added support for some additional extensions. === Version 1.3 === On May 1, 2025, MoltenVK 1.3 was released with support for Vulkan 1.3. === Version 1.4 === On August 20, 2025, MoltenVK 1.4 was released with support for Vulkan 1.4.

  • SFINKS

    SFINKS

    Sfinks (Polish for "Sphinx") was also the initial name of the Janusz A. Zajdel Award. In cryptography, SFINKS is a stream cipher algorithm developed by An Braeken, Joseph Lano, Nele Mentens, Bart Preneel, and Ingrid Verbauwhede. It includes a message authentication code. It has been submitted to the eSTREAM Project of the eCRYPT network. In 2005, Nicolas T. Courtois noted that, while the cipher is elegant and secure against some simple algebraic attacks, it is vulnerable to more elaborate known attacks.

  • Key (cryptography)

    Key (cryptography)

    A key in cryptography is a piece of information, usually a string of numbers or letters that are stored in a file, which, when processed through a cryptographic algorithm, can encode or decode cryptographic data. Depending on the method used, the key can be different sizes and varieties, but in all cases, the strength of the encryption relies on the security of the key being maintained. A key's security strength is dependent on its algorithm, the size of the key, the generation of the key, and the process of key exchange. == Scope == The key is what is used to encrypt data from plaintext to ciphertext. There are different methods for utilizing keys and encryption. === Symmetric cryptography === Symmetric cryptography refers to the practice of the same key being used for both encryption and decryption. === Asymmetric cryptography === Asymmetric cryptography has separate keys for encrypting and decrypting. These keys are known as the public and private keys, respectively. == Purpose == Since the key protects the confidentiality and integrity of the system, it is important that it be kept secret from unauthorized parties. With public key cryptography, only the private key must be kept secret, but with symmetric cryptography, it is important to maintain the confidentiality of the key. Kerckhoffs's principle states that the entire security of the cryptographic system relies on the secrecy of the key. == Key sizes == Key size is the number of bits in the key defined by the algorithm. This size defines the upper bound of the cryptographic algorithm's security. The larger the key size, the longer it will take before the key is compromised by a brute force attack. Since perfect secrecy is not feasible for key algorithms, research is now more focused on computational security. In the past, keys were required to be a minimum of 40 bits in length; however, as technology advanced, these keys were being broken more and more quickly. 
In response, the minimum sizes required for symmetric keys were increased. Currently, 2048-bit RSA is commonly used, which is sufficient for current systems. However, current RSA key sizes would all be cracked quickly with a powerful quantum computer. "The keys used in public key cryptography have some mathematical structure. For example, public keys used in the RSA system are the product of two prime numbers. Thus public key systems require longer key lengths than symmetric systems for an equivalent level of security. 3072 bits is the suggested key length for systems based on factoring and integer discrete logarithms which aim to have security equivalent to a 128 bit symmetric cipher." == Key generation == To prevent a key from being guessed, keys need to be generated randomly and contain sufficient entropy. The problem of how to safely generate random keys is difficult and has been addressed in many ways by various cryptographic systems. A key can directly be generated by using the output of a Random Bit Generator (RBG), a system that generates a sequence of unpredictable and unbiased bits. An RBG can be used to directly produce either a symmetric key or the random output for an asymmetric key pair generation. Alternatively, a key can also be indirectly created during a key-agreement transaction, from another key or from a password. Some operating systems include tools for "collecting" entropy from the timing of unpredictable operations such as disk drive head movements. For the production of small amounts of keying material, ordinary dice provide a good source of high-quality randomness. == Establishment scheme == The security of a key is dependent on how a key is exchanged between parties. Establishing a secured communication channel is necessary so that outsiders cannot obtain the key. A key establishment scheme (or key exchange) is used to transfer an encryption key among entities. 
Key agreement and key transport are the two types of key exchange scheme used to exchange keys remotely between entities. In a key agreement scheme, a secret key, which is used between the sender and the receiver to encrypt and decrypt information, is established indirectly. All parties exchange information (the shared secret) that permits each party to derive the secret key material. In a key transport scheme, encrypted keying material that is chosen by the sender is transported to the receiver. Either symmetric key or asymmetric key techniques can be used in both schemes. The Diffie–Hellman key exchange and Rivest-Shamir-Adleman (RSA) are the two most widely used key exchange algorithms. In 1976, Whitfield Diffie and Martin Hellman constructed the Diffie–Hellman algorithm, which was the first public key algorithm. The Diffie–Hellman key exchange protocol allows key exchange over an insecure channel by electronically generating a shared key between two parties. On the other hand, RSA is a form of the asymmetric key system which consists of three steps: key generation, encryption, and decryption. Key confirmation delivers an assurance between the key confirmation recipient and provider that the shared keying materials are correct and established. The National Institute of Standards and Technology recommends that key confirmation be integrated into a key establishment scheme to validate its implementations. == Management == Key management concerns the generation, establishment, storage, usage and replacement of cryptographic keys. A key management system (KMS) typically includes three steps of establishing, storing and using keys. The base of security for the generation, storage, distribution, use and destruction of keys depends on successful key management protocols. == Key vs password == A password is a memorized series of characters including letters, digits, and other special symbols that are used to verify identity. 
It is often produced by a human user or by password management software to protect personal and sensitive information or generate cryptographic keys. Passwords are often created to be memorized by users and may contain non-random information such as dictionary words. On the other hand, a key can help strengthen password protection by implementing a cryptographic algorithm that is difficult to guess, or it can replace the password altogether. A key is generated based on random or pseudo-random data and can often be unreadable to humans. A password is less safe than a cryptographic key due to its low entropy, randomness, and human-readable properties. However, the password may be the only secret data that is accessible to the cryptographic algorithm for information security in some applications such as securing information in storage devices. Thus, a deterministic algorithm called a key derivation function (KDF) uses a password to generate the secure cryptographic keying material to compensate for the password's weakness. Various methods such as adding a salt or key stretching may be used in the generation.
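The two routes to a key described above, direct generation from a random bit generator and derivation from a password with a salted, stretched KDF, can be sketched with the Python standard library. This is an illustrative sketch only: the function names, the 32-byte key length, the 16-byte salt and the iteration count are assumptions chosen for the example, and PBKDF2 with HMAC-SHA-256 stands in for whichever KDF a real system would select.

```python
import hashlib
import secrets
from typing import Optional, Tuple

KEY_LEN = 32          # 256-bit key (illustrative choice)
SALT_LEN = 16         # 128-bit salt (illustrative choice)
ITERATIONS = 600_000  # assumed work factor for key stretching


def generate_random_key(length: int = KEY_LEN) -> bytes:
    """Draw key bytes directly from the operating system's CSPRNG."""
    return secrets.token_bytes(length)


def derive_key_from_password(
    password: str,
    salt: Optional[bytes] = None,
    iterations: int = ITERATIONS,
) -> Tuple[bytes, bytes]:
    """Derive a key from a password using PBKDF2-HMAC-SHA256.

    A fresh random salt is generated when none is supplied. The salt is
    not secret, but it must be stored so the same key can be re-derived.
    """
    if salt is None:
        salt = secrets.token_bytes(SALT_LEN)
    key = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=KEY_LEN
    )
    return key, salt


if __name__ == "__main__":
    random_key = generate_random_key()
    derived_key, salt = derive_key_from_password("correct horse battery staple")
    print(len(random_key), len(derived_key), len(salt))
```

Because the KDF is deterministic, calling `derive_key_from_password` again with the same password, salt and iteration count reproduces the same key, while a different salt yields an unrelated key even for an identical password.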

  • Instagram egg

    Instagram egg

    The Instagram egg is a photo of an egg posted by the account @world_record_egg on the social media platform Instagram. It became a global phenomenon and an internet meme within days of its publication on 4 January 2019. It is the second most-liked Instagram post and was the most-liked Instagram post from 14 January 2019 until 20 December 2022, when it was overtaken by Lionel Messi's post showing him and his teammates celebrating after Argentina won the 2022 FIFA World Cup. The owner of the account was revealed to be Chris Godfrey, a British advertising creative, who later worked with his two friends Alissa Khan-Whelan and CJ Brown on a Hulu commercial featuring the egg, intended to raise mental health awareness. == Background == The photo was originally taken by Serghei Platanov, who then posted it to Shutterstock on 23 June 2015 with the title "eggs isolated on white background". == History == On 4 January 2019, the @world_record_egg account was created, and posted an image of a bird egg with the caption, "Let's set a world record together and get the most liked post on Instagram. Beating the current world record held by Kylie Jenner (18 million)! We got this." Jenner's previous record, the first photo of her daughter Stormi, had garnered a total of 18.4 million likes. The post quickly reached 18.4 million likes in just under 10 days, becoming the most-liked Instagram post at the time. It then continued to rise over 45 million likes in the next 48 hours, surpassing the "Despacito" music video and taking the world record for the most-liked online post (on any media platform) in history. After the account became verified on 14 January 2019, the post rose in popularity and likes, which snowballed into coverage in various media outlets. By 18 March 2019, the post had accumulated over 53.3 million likes, nearly three times the previous record of 18.4 million. It posted frequent updates for a few days in the form of Instagram Stories. 
Alongside the like tally, as of January 2023 the post has 3.8 million comments. Several individuals tried to claim that they were the account's creator, the claims being dismissed by "the egg" on Instagram direct messages. On 3 February 2019, the creator of the Instagram egg was revealed by Hulu and The New York Times to be Chris Godfrey, a British advertising creative. Alissa Khan-Whelan, his colleague, was also outed. On 18 January 2019, the account posted a second picture of an egg, almost identical to the first one apart from a small crack at the top left. As of 25 February 2019, the post accumulated 11.8 million likes. On 22 January 2019, the account posted a third picture of an egg, this time having two larger cracks. In less than 25 minutes, the post accumulated 1 million likes, and by 25 February 2019, it had accumulated 9.5 million likes. On 29 January 2019, a fourth picture of an egg was posted to the account which has another large crack on the right hand side, attracting 7.6 million likes by 25 February 2019. On 1 February 2019, a fifth picture of an egg was posted with stitching like that of a football, referencing the upcoming Super Bowl. That post had accumulated 6.5 million likes by 25 February 2019. The account promised that it would reveal what was inside the egg on 3 February, on the subscription video on demand service Hulu. The Hulu Instagram egg reveal was used to promote an animation about a mental health campaign. A caption from the clip read, "Recently I've started to crack, the pressure of social media is getting to me. If you're struggling too, talk to someone." The video was later posted on the @world_record_egg Instagram account, and this post received over 33 million views by May 2019. As of May 2020, it had received over 41 million views. On 16 July 2019, Chris Godfrey (the creator of the account) was listed as one of the top 25 most influential people on the internet. 
On 20 December 2022, the record for the most-liked Instagram post was surpassed by a post from Argentine footballer Lionel Messi, showing him and his teammates celebrating after winning the 2022 FIFA World Cup with their national team. The world record egg responded to being overtaken in likes by Messi with "Today [Lionel Messi] has taken the crown, for now. But I'm still left with one question… Who is the greatest of all time – Cristiano Ronaldo or Leo Messi?" The account was sold to Dubai-based investor Mustafa El Fishawy in April 2024 for an undisclosed seven-figure sum. Reed Smith, who advised Godfrey, Brown, and Khan-Whelan in the transaction, stated they opted to sell it to "focus on new ventures." On 3 June, @world_record_egg posted an egg with the flag of Palestine in support of the country during the Gaza war; the post's caption described it as an "Egg for Peace" and hoped to "set a new world record together and get the most liked post on Instagram for a good cause." == Reception == In response to breaking the world record for the most-liked Instagram post, the account's owner wrote "This is madness. What a time to be alive." Hours later, Jenner posted a video on Instagram of her cracking open an egg and pouring its yolk onto the ground, with the caption: "Take that little egg." Pundits pontificated on the meaning of the egg picture's dominance over social media's "first family". As Vogue observed, tapping a heart pictogram is easy, and eggs are "lovable". More pointedly: [T]he attention economy is a scam based on requiring little to no labor from both producer and consumer despite commanding the most space, and therefore value, in our digital lives... but it very well could be: As a metaphor for the fragility of the influencer ecosystem, the egg has broken the Internet. The significance of the event and its massive republishing are a topic of discussion. 
A University of Westminster researcher of internet memes compared it to the movement to name a scientific research vessel in the United Kingdom Boaty McBoatface. The Instagrammer's success is a rare victory for the unpaid viral campaign on social media. "There is a bit of an anti-celebrity revolt here – 'look what we can do with a simple egg'." The researcher suggests that the accomplishment of becoming such a widely heralded unpaid viral post may become increasingly rare, as social networks rely more on paid and business promotion. The post's spread has been characterized as a populist backlash against "consumerism" and is seen by some as a triumph of community over celebrity. However, propelled by their popular success, the creators promised to release 'egg-centric' memorabilia. Hundreds of games based on the Instagram egg have appeared on Apple's App Store. The creators of the Instagram egg also reached a deal to promote Hulu.

  • BabyCenter

    BabyCenter

    BabyCenter is an online media company based in San Francisco, New York City, Chicago, and Los Angeles that provides information on conception, pregnancy, birth, and early childhood development for parents and expecting parents. BabyCenter operates eight country- and region-specific properties including websites, apps, emails, print publications, and an online community where parents can connect on a variety of topics. Visitors to the website and users of the app can sign up for free weekly email newsletters that guide them through pregnancy and their child's development. In addition to publishing detailed, medically reviewed information about pregnancy and parenting, BabyCenter, under its Mission Motherhood initiative, ran numerous social programs and has participated in public health initiatives in partnership with hospitals, healthcare agencies, nonprofits, NGOs, and government agencies to provide pregnancy and parenting advice. It also publishes an annual list of the most popular baby names. BabyCenter LLC is part of the Everyday Health Group, a division of Ziff Davis. == History == BabyCenter was founded in October 1997 by Stanford University MBA graduates Matt Glickman and Mark Selcow, who recognized a need for information about pregnancy and parenting on the internet. BabyCenter was initially funded through $13.5 million in startup capital funding from venture capital firms, including Bessemer Venture Partners, Intel, and Trinity Ventures. The funds were used to open the BabyCenter Store in October 1998. 
In the early years of its operation, BabyCenter offered multiple resources and services for parents, including a website that provided medically reviewed information and guidance to new and expectant parents on such topics as fertility, labor, and childcare; a weekly email for pregnant women tailored to their week of pregnancy (based on their pregnancy due date); and community groups and chat rooms for pregnant couples and parents to discuss pregnancy and child-rearing strategies. The site grew quickly, and by early 1999 had 175 employees and an annual revenue of $35 million. In April of that year, the two founders sold BabyCenter to another website, eToys.com, for $190 million in stock. Twenty-three months later, in 2001, shortly before declaring bankruptcy, eToys sold the site to Johnson & Johnson for $10 million. During the eToys ownership, BabyCenter launched its first international E-commerce site in the UK during the spring of 2000. Starting in 2005, BabyCenter launched an expansion plan, extending its global network to Australia, Canada and other countries, staffing each outpost with local editors. In 2007, BabyCenter debuted a Mandarin-language site in China, initiated operations in India, launched a Spanish language website, and introduced its first mobile site. BabyCenter released My Pregnancy Today, its first mobile app, to Apple's App Store in August 2010 and to the Android market in April 2011. The app provided daily information, nutrition tips, advice relevant to the user's week of pregnancy, and 3-D animated videos showcasing a baby's development in utero. The My Pregnancy app was joined by a My Baby Today app in October 2011. In 2015, BabyCenter released Mom Feed, its first mobile app for parents of toddlers and older children (ages 1 to 8). Mom Feed offered personalized, stage-based information as well as content from the BabyCenter Community and Blog in a real-time stream. In 2016, BabyCenter launched its web-based Baby Names Finder. 
In 2018, Mom Feed was discontinued and BabyCenter replaced that experience with a separate Child Health content area on its website. Also in 2018, BabyCenter launched its mobile baby name generator, the Baby Names app, which, like the web-based Baby Names Finder, leverages data from hundreds of thousands of parents that culminates in its annual most popular Baby Names Report. In 2019, Johnson & Johnson sold BabyCenter to Everyday Health Group, a division of the New York-based company Ziff Davis, Inc. Neither side disclosed terms of the deal. == Popular research == BabyCenter's list of the most popular baby names is released annually and is often cited by the media. In March 2024, BabyCenter reviewed the app Temu and said that it had found products on the site that have been recalled, could be counterfeit, or circumvent U.S. safety standards and features that are important in preventing issues like choking. In 2025, BabyCenter released a report about the cost of raising a newborn baby in the first year. == Content and products == === Websites === BabyCenter has eight country- and region-specific websites around the world, including sites for the United States, Canada, Australia, Brazil, India, Germany, the United Kingdom, and Latin America. Users can find parenting and pregnancy advice in seven languages: English, Spanish, Portuguese, Arabic, French, German, and Hindi. BabyCenter content for each country- or region-specific site is written by an editorial team based in that country or region. Medical and health content for each site is reviewed by a medical advisory board based there and adheres to that country or region's medical standards. For example, the U.S. site works with and follows the recommendations of such U.S. medical authorities as the American Academy of Pediatrics, the American Congress of Obstetrics & Gynecology and the Society for Maternal-Fetal Medicine. 
BabyCenter regularly conducts research and provides thought leadership on pregnancy and parenting topics, which is widely cited by major media outlets including The Wall Street Journal, Forbes, The Washington Post, BuzzFeed, Insider, MarketWatch, and Axios. === Community, blogs and social === From its earliest days, BabyCenter has had a community area that allows people to join a group of parents with children born in the same month, known as a Birth Club. BabyCenter launched a blog called Momformation in 2007. Eventually, the name was changed to BabyCenter Blog. In April 2021, the BabyCenter Community was identified in a research article within the journal PLOS Computational Biology as facilitating "unobstructed communication" between parents, which avoids the "strong echo chamber phenomena" that can foster and perpetuate vaccine misinformation. === My Pregnancy and Baby Today App === The app is available in six languages, although not all features are supported for every market. Initially the apps only featured pregnancy articles that could be found on the BabyCenter website, but over the years the feature set has expanded to include a growing list of app-specific tools such as weekly fetal development information, a kick tracker, a birth plan worksheet, a contraction timer, a baby growth tracker, a photo journal for pregnant women to record their pregnancy bellies, and a photo journal for documenting a baby's first year. === Mission Motherhood™ === BabyCenter was a cofounder of the Mobile Alliance for Maternal Action (MAMA), a public-private partnership between USAID, Johnson & Johnson, the UN Foundation, and BabyCenter from 2011 to 2015. The MAMA program sparked the creation of MomConnect, an initiative of the South African Department of Health for which BabyCenter developed SMS messages with health information about pregnancy and a child's first year of life. BabyCenter helped develop similar messages for mMitra, a voice messaging program in India. 
A research article in the Maternal and Child Health Journal stated the mMitra program offered strong evidence "that tailored mobile phone voice messages can improve key infant care knowledge and practices that lead to improved infant health outcomes in low-resource settings." BabyCenter's Mission Motherhood Messages were available to qualifying organizations on the BabyCenter website. BabyCenter contributed websites for Free Basics. These websites featured age and stage-based pregnancy and baby articles targeted to low-income, lower-education women who would not otherwise have access to health information. Content developed for this program was also used to support a UNICEF SMS program during the 2016 Zika outbreak. == Awards and recognition == In 1998, BabyCenter won a Webby Award for Best Home Site. Since then, it has been nominated for a Webby Award 19 times and won either a Webby or a People's Choice Webby Award 12 times – including a People's Voice win in 2021 for Lifestyle websites and mobile sites. In 2002, it won the Service Journalism award from the Online Journalism Awards (OJA). In 2015, BabyCenter won five Digital Health Awards for content about autism in children. In 2016, BabyCenter won seven Digital Health Awards: four for videos about the aches and pains of pregnancy, baby sleep, and the walking milestone in child development; two for articles about baby sleep training and sleep apnea in babies; and one for the BabyCenter mobile app My Pregnancy & Baby Today. In 2021, Forbes Health chose My Pregnancy & Baby Today as the best pregnancy app of 2021, and Women's Health identified it

  • Data philanthropy

    Data philanthropy

    Data philanthropy refers to the practice of private companies donating corporate data. This data is usually donated to nonprofits or donation-run organizations that have difficulty keeping up with expensive data collection technology. The concept was introduced through the United Nations Global Pulse initiative in 2011 to explore corporate data assets for humanitarian, academic, and societal causes. For example, anonymized mobile data could be used to track disease outbreaks, or data on consumer actions may be shared with researchers to study public health and economic trends. == Definition == A large portion of data collected from the internet consists of user-generated content, such as blogs, social media posts, and information submitted through lead generation and data forms. Additionally, corporations gather and analyze consumer data to gain insight into customer behavior, identify potential markets, and inform investment decisions. United Nations Global Pulse director Robert Kirkpatrick has referred to this type of data as "massive passive data" or "data exhaust." == Challenges == While data philanthropy can enhance development policies, making users' private data available to various organizations raises concerns regarding privacy, ownership, and the equitable use of data. Different techniques, such as differential privacy and alphanumeric strings of information, can allow access to personal data while ensuring user anonymity. However, even if these algorithms work, re-identification may still be possible. Another challenge is convincing corporations to share their data. The data collected by corporations provides them with market competitiveness and insight regarding consumer behavior. Corporations may fear losing their competitive edge if they share the information they have collected with the public. Numerous moral challenges are also encountered. 
In 2016, Mariarosaria Taddeo, a digital ethics professor at the University of Oxford, proposed an ethical framework to address them. == Sharing strategies == The goal of data philanthropy is to create a global data commons where companies, governments, and individuals can contribute anonymous, aggregated datasets. The United Nations Global Pulse offers four different tactics that companies can use to share their data while preserving consumer anonymity: (1) share aggregated and derived data sets for analysis under nondisclosure agreements (NDAs); (2) allow researchers to analyze data within the private company's own network under NDAs; (3) Real-Time Data Commons: data pooled and aggregated among multiple companies of the same industry to protect competitiveness; and (4) Public/Private Alerting Network: companies mine data behind their own firewalls and share indicators. == Application in various fields == Many corporations take part in data philanthropy, including social networking platforms (e.g., Facebook, Twitter), telecommunications providers (e.g., Verizon, AT&T), and search engines (e.g., Google, Bing). Anonymized, aggregated user-generated data is collected and made available through data-sharing systems to support research, policy development, and social impact initiatives. By participating in such efforts, these organizations contribute to causes regarded as beneficial to society, allowing institutions to give back meaningfully. With the onset of technological advancements, the sharing of data on a global scale and an in-depth analysis of these data structures could mitigate the effects of global issues such as natural disasters and epidemics. Robert Kirkpatrick, the Director of the United Nations Global Pulse, has argued that this aggregated information is beneficial for the common good and can lead to developments in research and data production in a range of varied fields. 
=== Digital disease detection === Health researchers use digital disease detection by collecting data from various sources—such as social media platforms (e.g., Twitter, Facebook), mobile devices (e.g., cell phones, smartphones), online search queries, mobile apps, and sensor data from wearables and environmental sensors—to monitor and predict the spread of infectious diseases. This approach allows them to track and anticipate outbreaks of epidemics (e.g., COVID-19, Ebola), pandemics, vector-borne diseases (e.g., malaria, dengue fever), and respiratory illnesses (e.g., influenza, SARS), improving response and intervention strategies for the spread of diseases. In 2008, Centers for Disease Control and Prevention collaborated with Google and launched Google Flu Trends, a website that tracked flu-related searches and user locations to track the spread of the flu. Users could visit Google Flu Trends to compare the amount of flu-related search activity versus the reported numbers of flu outbreaks on a graphical map. One drawback of this method of tracking was that Google searches are sometimes performed due to curiosity rather than when an individual is suffering from the flu. According to Ashley Fowlkes, an epidemiologist in the CDC Influenza division, "The Google Flu Trends system tries to account for that type of media bias by modeling search terms over time to see which ones remain stable." Google Flu Trends is no longer publishing current flu estimates on the public website; however, visitors to the site can still view and download previous estimates. Current data can be shared with verified researchers. A study from the Harvard School of Public Health (HSPH), published in the October 12, 2012 issue of Science, discussed how phone data helped curb the spread of malaria in Kenya. The researchers mapped phone calls and texts made by 14,816,521 Kenyan mobile phone subscribers. 
When individuals left their primary living location, the destination and length of journey were calculated. This data was then compared to a 2009 malaria prevalence map to estimate the disease's commonality in each location. Combining all this information, the researchers could estimate the probability of an individual carrying malaria and map the movement of the disease. This research can be used to track the spread of similar diseases. === Humanitarian aid === Calling patterns of mobile phone users can determine the socioeconomic standings of the populace, which can be used to deduce "its access to housing, education, healthcare, and basic services such as water and electricity." Researchers from Columbia University and Karolinska Institute used daily SIM card location data from both before and after the 2010 Haiti earthquake to estimate the movement of people both in response to the earthquake and during the related 2010 Haiti cholera outbreak. Their research suggests that mobile phone data can provide rapid and accurate estimates of population movements during disasters and outbreaks of infectious disease. Big data can also provide information on looming disasters and can assist relief organizations in rapid-response and locating displaced individuals. By analyzing specific patterns within this 'big data', governments and NGOs can enhance responses to disruptive events such as natural disasters, disease outbreaks, and global economic crises. Leveraging real-time information enables a deeper understanding of individual well-being, allowing for more effective interventions. Corporations utilize digital services, such as human sensor systems, to detect and solve impending problems within communities. This is a strategy used by the private sector to anonymously share customer information for public benefit, while preserving user privacy. === Impoverished areas === Poverty still remains a worldwide issue, with over 2.5 billion people currently impoverished. 
Statistics indicate the widespread use of mobile phones, even within impoverished communities. Additional data can be collected through Internet access, social media, utility payments and governmental statistics. Data-driven activities can lead to the accumulation of 'big data', which in turn can assist international non-governmental organizations in documenting and evaluating the needs of underprivileged populations. Through data philanthropy, NGOs can distribute information while cooperating with governments and private companies. === Corporate === Data philanthropy incorporates aspects of social philanthropy by allowing corporations to create profound impacts through the act of giving back by dispersing proprietary datasets. The public sector collects and preserves information, considered an essential asset. Companies track and analyze users' online activities to gain insight into their needs related to new products and services. These companies view the welfare of the population as key to business expansion and progression by using their data to highlight global citizens' issues. Experts in the private sector emphasize the importance of integrating diverse data sources—such as retail, mobile, and social media data—to develop essential solutions for global challenges. In Data Philanthropy:

  • Backdoor (computing)

    Backdoor (computing)

    A backdoor is a typically covert method of bypassing normal authentication or encryption in a computer, product, embedded device (e.g. a home router), or its embodiment (e.g. part of a cryptosystem, algorithm, chipset, or even a "homunculus computer"—a tiny computer-within-a-computer such as that found in Intel's AMT technology). Backdoors are most often used for securing remote access to a computer, or obtaining access to plaintext in cryptosystems. From there it may be used to gain access to privileged information like passwords, corrupt or delete data on hard drives, or transfer information within compromised networks. In the United States, the 1994 Communications Assistance for Law Enforcement Act forces internet providers to provide backdoors for government authorities. In 2024, the U.S. government realized that China had been tapping communications in the U.S. using that infrastructure for months, or perhaps longer; China recorded presidential candidate campaign office phone calls—including employees of the then-vice president of the nation, and of the candidates themselves. A backdoor may take the form of a hidden part of a program, a separate program (e.g. Back Orifice may subvert the system through a rootkit), code in the firmware of the hardware, or parts of an operating system such as Windows, for example, device drivers. Trojan horses can be used to create vulnerabilities in a device. A Trojan horse may appear to be an entirely legitimate program, but when executed, it triggers an activity that may install a backdoor. Although some are secretly installed, other backdoors are deliberate and widely known. These kinds of backdoors have "legitimate" uses such as providing the manufacturer with a way to restore user passwords. Many systems that store information within the cloud fail to create accurate security measures. If many systems are connected within the cloud, hackers can gain access to all other platforms through the most vulnerable system. 
Default passwords (or other default credentials) can function as backdoors if they are not changed by the user. Some debugging features can also act as backdoors if they are not removed in the release version. In 1993, the United States government attempted to deploy an encryption system, the Clipper chip, with an explicit backdoor for law enforcement and national security access. The chip was unsuccessful. Recent proposals to counter backdoors include creating a database of backdoors' triggers and then using neural networks to detect them. == Overview == The threat of backdoors surfaced when multiuser and networked operating systems became widely adopted. Petersen and Turn discussed computer subversion in a paper published in the proceedings of the 1967 AFIPS Conference. They noted a class of active infiltration attacks that use "trapdoor" entry points into the system to bypass security facilities and permit direct access to data. The use of the word trapdoor here clearly coincides with more recent definitions of a backdoor. However, since the advent of public key cryptography the term trapdoor has acquired a different meaning (see: Trapdoor function), and so the term "backdoor" is now preferred for this sense. More generally, such security breaches were discussed at length in a RAND Corporation task force report published under DARPA sponsorship by J.P. Anderson and D.J. Edwards in 1970. Backdoor attacks on machine learning (ML) models, which initially targeted the computer vision domain, have expanded to encompass various other domains, including text, audio, ML-based computer-aided design, and ML-based wireless signal classification. Additionally, backdoor vulnerabilities have been demonstrated in deep generative models, reinforcement learning (e.g., AI GO), and deep graph models. These broad-ranging potential risks have prompted concerns from national security agencies regarding their potentially disastrous consequences. 
A backdoor in a login system might take the form of a hard coded user and password combination which gives access to the system. An example of this sort of backdoor was used as a plot device in the 1983 film WarGames, in which the architect of the "WOPR" computer system had inserted a hardcoded password-less account which gave the user access to the system, and to undocumented parts of the system (in particular, a video game-like simulation mode and direct interaction with the artificial intelligence). Although the number of backdoors in systems using proprietary software (software whose source code is not publicly available) is not widely credited, they are nevertheless frequently exposed. Programmers have even succeeded in secretly installing large amounts of benign code as Easter eggs in programs, although such cases may involve official forbearance, if not actual permission. == Examples == === Worms === Many computer worms, such as Sobig and Mydoom, install a backdoor on the affected computer (generally a PC on broadband running Microsoft Windows and Microsoft Outlook). Such backdoors appear to be installed so that spammers can send junk e-mail from the infected machines. Others, such as the Sony/BMG rootkit, placed secretly on millions of music CDs through late 2005, are intended as DRM measures—and, in that case, as data-gathering agents, since both surreptitious programs they installed routinely contacted central servers. A sophisticated attempt to plant a backdoor in the Linux kernel, exposed in November 2003, added a small and subtle code change by subverting the revision control system. In this case, a two-line change appeared to check root access permissions of a caller to the sys_wait4 function, but because it used assignment = instead of equality checking ==, it actually granted permissions to the system. This difference is easily overlooked, and could even be interpreted as an accidental typographical error, rather than an intentional attack. 
In January 2014, a backdoor was discovered in certain Samsung Android products, like the Galaxy devices. The Samsung proprietary Android versions are fitted with a backdoor that provides remote access to the data stored on the device. In particular, the Samsung Android software that is in charge of handling the communications with the modem, using the Samsung IPC protocol, implements a class of requests known as remote file server (RFS) commands, which allow the backdoor operator to perform remote I/O operations via the modem on the device hard disk or other storage. As the modem is running Samsung proprietary Android software, it is likely that it offers over-the-air remote control that could then be used to issue the RFS commands and thus to access the file system on the device. === Object code backdoors === Harder-to-detect backdoors involve modifying object code rather than source code; object code is much harder to inspect, as it is designed to be machine-readable, not human-readable. These backdoors can be inserted either directly in the on-disk object code, or at some point during compilation, assembly, linking, or loading; in the latter case the backdoor never appears on disk, only in memory. Object code backdoors are difficult to detect by inspection of the object code, but are easily detected by simply checking for changes (differences), notably in length or in checksum, and in some cases can be detected or analyzed by disassembling the object code. Further, object code backdoors can be removed (assuming source code is available) by simply recompiling from source on a trusted system. Thus for such backdoors to avoid detection, all extant copies of a binary must be subverted, and any validation checksums must also be compromised, and source must be unavailable, to prevent recompilation. 
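The length-and-checksum check described above can be sketched in a few lines of Python. This is a minimal illustration, not a hardened integrity tool: the function names `fingerprint` and `is_unmodified` are hypothetical, SHA-256 is an assumed choice of checksum, and the trusted size and digest are assumed to have been recorded earlier on a system known to be clean.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


def is_unmodified(path, expected_size, expected_sha256):
    """Return True only if both the length and the checksum match the trusted record.

    `expected_size` and `expected_sha256` are assumed to come from a trusted
    source; if that record is itself subverted, this check proves nothing.
    """
    size, digest = fingerprint(path)
    return size == expected_size and digest == expected_sha256
```

A caller would record `fingerprint(binary)` once on a trusted system and later call `is_unmodified(binary, size, digest)` before running the binary. The check is only as trustworthy as the interpreter, the hashing library, and the stored record it relies on.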
Alternatively, these other tools (length checks, diff, checksumming, disassemblers) can themselves be compromised to conceal the backdoor, for example detecting that the subverted binary is being checksummed and returning the expected value, not the actual value. To conceal these further subversions, the tools must also conceal the changes in themselves—for example, a subverted checksummer must also detect if it is checksumming itself (or other subverted tools) and return false values. This leads to extensive changes in the system and tools being needed to conceal a single change. As object code can be regenerated by recompiling (reassembling, relinking) the original source code, making a persistent object code backdoor (without modifying source code) requires subverting the compiler itself—so that when it detects that it is compiling the program under attack it inserts the backdoor—or alternatively the assembler, linker, or loader. As this requires subverting the compiler, this in turn can be fixed by recompiling the compiler, removing the backdoor insertion code. This defense can in turn be subverted by putting a source meta-backdoor in the compiler, so that when it detects that it is compiling itself

  • Cryptographic Module Testing Laboratory

    Cryptographic Module Testing Laboratory

    Cryptographic Module Testing Laboratory (CMTL) is an information technology (IT) computer security testing laboratory that is accredited to conduct cryptographic module evaluations for conformance to the FIPS 140-2 U.S. Government standard. The National Institute of Standards and Technology (NIST) National Voluntary Laboratory Accreditation Program (NVLAP) accredits CMTLs to meet Cryptographic Module Validation Program (CMVP) standards and procedures. This has been replaced by FIPS 140-2 and the Cryptographic Module Validation Program (CMVP). == CMTL requirements == These laboratories must meet the following requirements: NIST Handbook 150, NVLAP Procedures and General Requirements; NIST Handbook 150-17, Information Technology Security Testing - Cryptographic Module Testing; and the NVLAP Specific Operations Checklist for Cryptographic Module Testing. == FIPS 140-2 in relation to the Common Criteria == A CMTL can also be a Common Criteria (CC) Testing Laboratory (CCTL). The CC and FIPS 140-2 differ in the abstractness and focus of evaluation. FIPS 140-2 testing is against a defined cryptographic module and provides a suite of conformance tests to four FIPS 140 security levels. FIPS 140-2 describes the requirements for cryptographic modules and includes such areas as physical security, key management, self tests, roles and services, etc. The standard was initially developed in 1994, prior to the development of the CC. The CC is an evaluation against a Protection Profile (PP) or security target (ST). Typically, a PP covers a broad range of products. A CC evaluation does not supersede or replace a validation to FIPS 140-1, FIPS 140-2, or FIPS 140-3. The four security levels in FIPS 140-1 and FIPS 140-2 do not map directly to specific CC EALs or to CC functional requirements. A CC certificate cannot be a substitute for a FIPS 140-1 or FIPS 140-2 certificate. 
If the operational environment is a modifiable operational environment, the operating system requirements of the Common Criteria are applicable at FIPS Security Levels 2 and above. FIPS 140-1 required evaluated operating systems that referenced the Trusted Computer System Evaluation Criteria (TCSEC) classes C2, B1 and B2. However, TCSEC is no longer in use and has been replaced by the Common Criteria. Consequently, FIPS 140-2 now references the Common Criteria. FIPS 140-2 or FIPS 140-3 validation efforts can be in some parts reused in Common Criteria evaluations, specifically in areas related to entropy source and cryptographic algorithms.

  • Carrier cloud

    Carrier cloud

    In cloud computing, a carrier cloud is a class of cloud that integrates wide area networks (WAN) and other attributes of communications service providers’ carrier-grade networks to enable the deployment of highly complex applications in the cloud. In contrast, classic cloud computing focuses on the data center and does not address the network connecting data centers and cloud users. This may result in unpredictable response times and security issues when business-critical data are transferred over the Internet. == History == The advent of virtualization technology, cost-effective computing hardware, and ubiquitous Internet connectivity enabled the first wave of cloud services starting in the early years of the 21st century. But many businesses and other organizations hesitated to move more demanding applications from on-premises dedicated hardware to private or public clouds. In response, communications service providers started in the 2010/2011 time frame to develop carrier clouds that address perceived weaknesses in existing cloud services. Cited weaknesses vary but often include possible downtime, security issues, high cost of custom software and data transfer, inflexibility of some cloud apps, poor customer support, and nonfulfillment of service level agreements (SLAs). == Characteristics == To enable the deployment of time-sensitive and business-critical applications in the cloud, the carrier cloud is designed to match or even exceed the characteristics of on-premises deployments. Therefore, the carrier cloud is characterized by some or all of the following items: Configurable, elastic network performance: Typical cloud computing solutions use the best effort of the public Internet to connect cloud users and data centers. This approach provides instant connectivity but does not offer control over network capacities, latencies, and jitter. 
Carrier clouds address these gaps with content delivery networks and/or dedicated virtual private networks (VPN) at OSI layers 1 (optical wavelengths), 2 (data link layer), and 3 (network layer). These VPNs can be configured to offer the desired performance parameters and exhibit the same type of elasticity for the network that regular clouds provide for servers and storage. To achieve the requested performance parameters, such as low latency, cloud applications can be (automatically) allocated to distributed data centers that are close enough to the cloud users. Automatic resource placement: For a cloud with multiple data centers, information about both the data center and the connecting network is relevant for a decision of where to place cloud images and storage volumes. For this decision, carrier clouds can obtain relevant information about the network, e.g., using the Application-Layer Traffic Optimization (ALTO) protocol. High level of security and governance: Cloud application providers are subject to general and domain specific security, privacy, and governance requirements and regulations, such as the European Data Protection Directive and the U.S. Health Insurance Portability and Accountability Act. For added security, the wide area network of the carrier cloud can provide segregated encrypted or unencrypted network links that are not accessible from the general Internet. At the data center, the carrier cloud provides e.g. virtual private servers, management processes, logs, and documentation to fulfill security and governance rules. Location control: Fundamentally, cloud users should not be concerned with the geographic location of their cloud resources. However, privacy and other regulations may mandate that certain types of data must not be sent outside a national jurisdiction or other geographical region. 
Open APIs: Carrier clouds provide graphical user interfaces and Web application programming interfaces that allow cloud application providers to set up, manage, and monitor both the data center and the WAN of their cloud services. == Architecture == Carrier clouds encompass data centers at different network tiers and wide area networks that connect multiple data centers to each other as well as to the cloud users. Links between data centers are used for failover, overflow, backup, and geographic diversity. Carrier clouds can be set up as public, private, or hybrid clouds. The carrier cloud federates these cloud entities by using a single management system to orchestrate, manage, and monitor data center and network resources as a single system.

  • Kurzsignale

    Kurzsignale

    The Short Signal Code, also known as the Short Signal Book (German: Kurzsignalbuch), was a short code system used by the Kriegsmarine (German Navy) during World War II to minimize the transmission duration of messages. == Description == The transmission of radio messages risked revealing a submarine's presence and direction; if decoded, the content was also revealed. Submarines needed to provide information, mostly in standard form (position of convoy to attack and of submarine, weather information), to their bases. Initially Morse code transmissions could be used. To inhibit detection, the duration of messages needed to be minimised; for this, Kurzsignale short-coding was used. To protect the content of intercepted messages, they needed to be encrypted by the Enigma machine. To shorten transmission even further, the message could be sent by a fast machine instead of a human radio operator. For example, the Kurier system (not implemented in time) decreased the time to send a Morse dot from around 50 milliseconds for a human to 1 millisecond. == Short Signal book == The Kurzsignale code was intended to shorten transmission time to below the time required to get a directional fix. It was not primarily intended to hide signal contents; protection was intended to be achieved by encoding with the Enigma machine. A copy of the Kurzsignale code book was captured from German submarine U-110 on 9 May 1941. In August 1941, Dönitz began addressing U-boats by the names of their commanders, instead of boat numbers. The method of defining U-boat meeting points in the Short Signal Book was regarded as compromised, so a method devised by B-Dienst cryptanalysts to disguise positions on the Kriegsmarine German Naval Grid System (German: Gradnetzmeldeverfahren) was introduced and used until the end of the war. == Radio direction finding == Aware of the danger presented by radio direction finding (RDF), the Kriegsmarine developed various systems to speed up broadcast. 
The Kurzsignale code system condensed messages into short codes consisting of short sequences for common terms such as "convoy location" so that additional descriptions would not be needed in the message. The resulting Kurzsignal was then encoded with the Enigma machine and subsequently transmitted as rapidly as possible, typically taking about 20 seconds. Typical length of an information or weather signal was about 25 characters. Conventional RDF needed about a minute to fix the bearing of a radio signal, and the Kurzsignale protected against this. However, the huff-duff system which was in use by the Allies could cope with these short transmissions. The fully automated burst transmission Kurier system, in testing from August 1944, could send a Kurzsignal in not more than 460 milliseconds; this was short enough to prevent location even by huff-duff and, if deployed, would have been a serious setback for Allied anti-submarine and code-breaking activities. By late 1944 the Kurier program was a top priority, but the war ended before the system was operational. == Short Weather cipher == A similar coding system was used for weather reports from U-boats, the Wetterkurzschlüssel (Short Weather Cipher). Code books were captured from U-559 on 30 October 1942.

  • Data analysis

    Data analysis

    Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, and is used in different business, science, and social science domains. In today's business world, data analysis plays an important role in making decisions more scientific and helping businesses operate more effectively. It is widely used in fields such as business analytics, healthcare, and artificial intelligence to extract meaningful insights from data. Data mining is a particular data analysis technique that focuses on statistical modeling and knowledge discovery for predictive rather than purely descriptive purposes, while business intelligence covers data analysis that relies heavily on aggregation, focusing mainly on business information. In statistical applications, data analysis can be divided into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data, while CDA focuses on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical models for predictive forecasting or classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a variety of unstructured data. All of the above are varieties of data analysis. == Data analysis process == Data analysis is a process for obtaining raw data, and subsequently converting it into information useful for decision-making by users. 
Statistician John Tukey defined data analysis in 1961 as: "Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data." There are several phases, and they are iterative, in that feedback from later phases may result in additional work in earlier phases. === Data requirements === Data is necessary as input to the analysis and is specified based upon the requirements of those directing the analytics (or customers, who will use the finished product of the analysis). The general type of entity upon which the data will be collected is referred to as an experimental unit (e.g., a person or population of people). Specific variables regarding a population (e.g., age and income) may be specified and obtained. Data may be numerical or categorical (i.e., a text label for numbers). === Data collection === Data may be collected from a variety of sources. A list of data sources is available for study and research. The requirements may be communicated by analysts to custodians of the data, such as information technology personnel within an organization. Data collection or data gathering is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes. The data may also be collected from sensors in the environment, including traffic cameras, satellites, recording devices, etc. It may also be obtained through interviews, downloads from online sources, or reading documentation. === Data processing === Data integration is a precursor to data analysis: data, when initially obtained, must be processed or organized for analysis. 
For instance, this may involve placing data into rows and columns in a table format (known as structured data) for further analysis, often through the use of a spreadsheet (e.g., Excel) or statistical software. === Data cleaning === Once processed and organized, the data may be incomplete, contain duplicates, or contain errors. The need for data cleaning arises from problems in the way that the data is entered and stored. Data cleaning is the process of preventing and correcting these errors. Common tasks include record matching, identifying inaccuracy of data, assessing the overall quality of existing data, deduplication, and column segmentation. Such data problems can also be identified through a variety of analytical techniques. For example, with financial information, the totals for particular variables may be compared against separately published numbers that are believed to be reliable. Unusual amounts, above or below predetermined thresholds, may also be reviewed. There are several types of data cleaning that are dependent upon the type of data in the set; this could be phone numbers, email addresses, employers, or other values. For quantitative data, outlier-detection methods can be used to get rid of data that appears to have a higher likelihood of being input incorrectly. For text data, spell checkers can be used to reduce the number of mistyped words. However, it is harder to tell if the words are contextually (i.e., semantically and idiomatically) correct. === Exploratory data analysis === Once the datasets are cleaned, they can then begin to be analyzed using exploratory data analysis. The process of data exploration may result in additional data cleaning or additional requests for data; hence the iterative nature of the phases mentioned above. Descriptive statistics, such as the average, median, and standard deviation, are often used to broadly characterize the data. 
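As a small illustration of the descriptive statistics just mentioned, the following sketch uses Python's standard `statistics` module. The helper name `describe` and the sample values are hypothetical and chosen only for the example; the standard deviation shown is the sample (n - 1) form, which is one of two common conventions.

```python
import statistics


def describe(values):
    """Summarize a list of numbers with a few common descriptive statistics."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),      # the average
        "median": statistics.median(values),  # the middle value
        "stdev": statistics.stdev(values),    # sample standard deviation (n - 1)
    }


# Hypothetical measurements, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(sample)
print(summary)
```

If the data represents an entire population and not a sample, `statistics.pstdev` could be substituted for `statistics.stdev`.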
Data visualization is also used, in which the analyst is able to examine the data in a graphical format in order to obtain additional insights about messages within the data. === Modeling and algorithms === Mathematical formulas or mathematical models (supported by algorithms) may be applied to the data in order to identify relationships among the variables; for example, checking for correlation and by determining whether or not there is the presence of causality. In general terms, models may be developed to evaluate a specific variable based on other variable(s) contained within the dataset, with some residual error depending on the implemented model's accuracy (e.g., Data = Model + Error). Inferential statistics utilizes techniques that measure the relationships between particular variables. For example, regression analysis may be used to model whether a change in advertising (independent variable X), provides an explanation for the variation in sales (dependent variable Y), i.e. is Y a function of X? This can be described as (Y = aX + b + error), where the model is designed such that (a) and (b) minimize the error when the model predicts Y for a given range of values of X. === Data product === A data product is a computer application that takes data inputs and generates outputs, feeding them back into the environment. It may be based on a model or algorithm. For instance, an application that analyzes data about customer purchase history, and uses the results to recommend other purchases the customer might enjoy. === Communication === Once data is analyzed, it may be presented in many formats to the users of the analysis to support their requirements. The users may have feedback, which results in additional analysis. When determining how to communicate the results, the analyst may consider implementing a variety of data visualization techniques to help communicate the message more clearly and efficiently to the audience. 
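The regression model described under "Modeling and algorithms" above, Y = aX + b + error, can be sketched as follows. The sketch assumes ordinary least squares as the way to "minimize the error" (the text does not name a specific method), and the function name `fit_line` and the advertising and sales figures are hypothetical.

```python
def fit_line(xs, ys):
    """Fit Y = a*X + b by ordinary least squares and return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of X, and of co-deviations of X and Y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx            # slope
    b = mean_y - a * mean_x  # intercept
    return a, b


# Hypothetical data: advertising spend (X) and sales (Y).
advertising = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_line(advertising, sales)
# The "error" term for each observation is the residual Y - (aX + b).
residuals = [y - (a * x + b) for x, y in zip(advertising, sales)]
print(a, b, residuals)
```

A good fit of this kind shows association between the two variables; on its own it does not establish that advertising causes the change in sales.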
Data visualization uses information displays (graphics such as tables and charts) to help communicate key messages contained in the data. Tables are valuable because they let a user query and focus on specific numbers, while charts (e.g., bar charts or line charts) may help explain the quantitative messages contained in the data. == Quantitative messages == Stephen Few described eight types of quantitative messages that users may attempt to communicate from a set of data, including the associated graphs. Time-series: A single variable is captured over a period of time, such as the unemployment rate over a 10-year period. A line chart may be used to demonstrate the trend. Ranking: Categorical subdivisions are ranked in ascending or descending order, such as a ranking of sales performance (the measure) by salespersons (the category, with each salesperson a categorical subdivision) during a single period. A bar chart may be used to show the comparison across the salespersons. Part-to-whole: Categorical subdivisions are measured as a ratio to the whole (i.e., a percentage out of 100%). A pie chart or bar chart can show the comparison of ratios, such as the market share represented by competitors in a market. Deviation: Categorical subdivisions are compared against a reference, such as a comparison of actual vs. budget expenses for several departments of a business for a given time period. A bar chart can show the comparison of the actual versus the reference amount. Frequency distribution:

  • Data monetization

    Data monetization

    Data monetization, a form of monetization, may refer to the act of generating measurable economic benefits from available data sources (analytics). Less commonly, it may also refer to the act of monetizing data services. In the case of analytics, typically, these benefits accrue as revenue or expense savings, but may also include market share or corporate market value gains. Data monetization leverages data generated through business operations, available exogenous data or content, as well as data associated with individual actors such as that collected via electronic devices and sensors participating in the internet of things. For example, the ubiquity of the internet of things is generating location data and other data from sensors and mobile devices at an ever-increasing rate. When this data is collated against traditional databases, the value and utility of both sources of data increases, leading to tremendous potential to mine data for social good, research and discovery, and achievement of business objectives. Closely associated with data monetization are the emerging data as a service models for transactions involving data by the data item. There are three ethical and regulatory vectors involved in data monetization due to the sometimes conflicting interests of actors involved in the digital supply chain. The individual data creator who generates files and records through his own efforts or owns a device such as a sensor or a mobile phone that generates data has a claim to ownership of data. The business entity that generates data in the course of its operations, such as its transactions with financial institutions or risk factors discovered through feedback from customers also has a claim on data captured through their systems and platforms. However, the person that contributed the data may also have a legitimate claim on the data. 
Internet platforms and service providers, such as Google or Facebook that require a user to forgo some ownership interest in their data in exchange for use of the platform also have a legitimate claim on the data. Thus the practice of data monetization, although common since 2000, is now getting increasing attention from regulators. The European Union and the United States Congress have begun to address these issues. For instance, in the financial services industry, regulations involving data are included in the Gramm–Leach–Bliley Act and Dodd-Frank. Some individual creators of data are shifting to using personal data vaults and implementing vendor relationship management concepts as a reflection of an increasing resistance to their data being federated or aggregated and resold without compensation. Groups such as the Personal Data Ecosystem Consortium, Patient privacy rights, and others are also challenging corporate cooptation of data without compensation. Financial services companies are a relatively good example of an industry focused on generating revenue by leveraging data. Credit card issuers and retail banks use customer transaction data to improve targeting of cross-sell offers. Partners are increasingly promoting merchant based reward programs which leverage a bank’s data and provide discounts to customers at the same time. == Types of data monetization == Internal data monetization - An organization's data is used internally, resulting in economic benefit. This is commonly the case in organizations using analytics to uncover insights, resulting in improved profit, cost savings or the avoidance of risk. Internal data monetization is currently the most common form of monetization, requiring far fewer security, intellectual property, and legal precautions when compared to other types. The potential economic gains from this type of data monetization are limited by the organization's internal structure and situation. 
External data monetization - A person or organization makes data they possess available on a for-fee basis to external parties, or acts as a broker for such data. This type of monetization is less common and requires various methods to distribute the data to potential buyers and consumers. However, the economic gain that results from collecting, packaging, and distributing data can be quite large. == Steps == Identification of available data sources - this includes data currently available for monetization as well as other external data sources that may enhance the value of what’s currently available. Connect, aggregate, attribute, validate, authenticate, and exchange data - this allows data to be converted directly into actionable or revenue-generating insight or services. Set terms and prices and facilitate data trading - methods for data vetting, storage, and access. For example, many global corporations have locked and siloed data storage infrastructures, which hinders efficient access to data and cooperative and real-time exchange. Perform research and analytics - draw predictive insights from existing data as a basis for using data to reduce risk, enhance product development or performance, or improve customer experience or business outcomes. Action and leveraging - the last phase of monetizing data includes determining alternative or improved data-centric products, ideas, or services. Examples may include real-time actionable triggered notifications or enhanced channels such as web or mobile response mechanisms.
== Pricing variables and factors == A fee for: use of a platform to connect buyers and sellers; use of a platform to configure, organize, and otherwise process data included in a data trade; connecting or including a device or sensor into a data supply chain; connecting and credentialing a creator of a data source and a data buyer – often through a federated identity; connecting a data source to other data sources to be included in a data supply chain; use of an internet service or other transmission services for uploading and downloading data – sometimes, for an individual, through a personal cloud; use of encrypted keys to achieve secure data transfer; use of a search algorithm specifically designed to tag data sources that contain data points of value to the data buyer; linking a data creator or generator to a data collection protocol or form; server actions – such as a notification – triggered by an update to a data item or data source included in a data supply chain. A price or exchange or other trade value: assigned by a data creator or generator to a data item or a data source; offered by a data buyer to a data creator; assigned by a data buyer for a data item or a data source formatted according to criteria set by a data buyer. An incremental fee assigned by a data buyer for a data item or a data set scaled to the reputation of the data creator. == Benefits == Improved decision-making that leads to real time crowd sourced research, improved profits, decreased costs, reduced risk and improved compliance. More impactful decisions (e.g., make real-time decisions). More timely (lower latency) decisions (e.g., a vendor making purchase recommendations while the customer is still on the phone or in the store, a customer connecting with multiple vendors to discover the best price, triggered notifications when thresholds are reached for data values). More granular decisions (e.g., localized pricing decisions at an individual or device or sensor level versus larger aggregates).
Targeted Marketing (e.g., Vendors with access to big data can make targeted advertisements to specific customers within a set data pool decreasing costs for the advertiser and reaching most interested customers) == Frameworks == There are a wide variety of industries, firms and business models related to data monetization. The following frameworks have been offered to help understand the types of business models that are used: Roger Ehrenberg of IA Ventures, a venture capital firm that invests in this sector, has defined three basic types of data product firms: Contributory databases. The magic of these businesses is that a customer provides their own data in exchange for receiving a more robust set of aggregated data back that provides insight into the broader marketplace, or provides a vehicle for expressing a view. Give a little, get a lot back in return – a pretty compelling value proposition, and one that frequently results in a payment from the data contributor in exchange for receiving enriched, aggregated data. Once these contributory databases are developed and customers become reliant on their insights, they become extremely valuable and persistent data assets. Data processing platforms. These businesses create barriers through a combination of complex data architectures, proprietary algorithms, and rich analytics to help customers consume data in whatever form they please. Often these businesses have special relationships with key data providers, that when combined with other data and processed as a whole create valuable differentiation and competitive barriers. Bloomberg is an example of a powerful

    Read more →
  • Global serializability

    Global serializability

    In concurrency control of databases, transaction processing (transaction management), and other transactional distributed applications, global serializability (or modular serializability) is a property of a global schedule of transactions. A global schedule is the unified schedule of all the individual database (and other transactional object) schedules in a multidatabase environment (e.g., federated database). Complying with global serializability means that the global schedule is serializable (has the serializability property), while each component database (module) has a serializable schedule as well. In other words, global serializability requires that a collection of serializable components also be serializable as a whole, which does not usually follow from the serializability of the components alone. The need for correctness across databases in multidatabase systems makes global serializability a major goal for global concurrency control (or modular concurrency control). With the proliferation of the Internet, Cloud computing, Grid computing, and small, portable, powerful computing devices (e.g., smartphones), as well as the increase in systems management sophistication, the need for atomic distributed transactions, and thus for effective global serializability techniques to ensure correctness in and among distributed transactional applications, appears to be growing. In a federated database system or any other more loosely defined multidatabase system, which are typically distributed in a communication network, transactions span multiple (and possibly distributed) databases. Enforcing global serializability in such a system, where different databases may use different types of concurrency control, is problematic. Even if every local schedule of a single database is serializable, the global schedule of a whole system is not necessarily serializable.
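A minimal sketch can show why locally serializable schedules need not combine into a globally serializable one. It assumes conflict serializability, where a schedule is serializable exactly when its conflict (precedence) graph has no cycle. The transaction names `T1` and `T2`, the two databases, and the helper `has_cycle` are hypothetical.

```python
def has_cycle(edges):
    """Return True if the directed graph given as (before, after) pairs has a cycle."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set())

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:        # back edge: a cycle exists
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)

# Hypothetical conflict orders observed locally in two autonomous databases.
db1_conflicts = [("T1", "T2")]   # in database 1, T1 precedes T2
db2_conflicts = [("T2", "T1")]   # in database 2, T2 precedes T1

# Each local graph is acyclic, so each local schedule is serializable.
local_ok = not has_cycle(db1_conflicts) and not has_cycle(db2_conflicts)
# The union of the local graphs is the global conflict graph, and it has a cycle.
global_ok = not has_cycle(db1_conflicts + db2_conflicts)
```

With these inputs `local_ok` is true while `global_ok` is false: neither database can see the cycle on its own, which is why detecting it requires exchanging conflict information between databases.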
The massive communication exchanges of conflict information needed between databases to reach conflict serializability globally would lead to unacceptable performance, primarily due to computer and communication latency. Achieving global serializability effectively over different types of concurrency control has been open for several years. == The global serializability problem == === Problem statement === The difficulties described above translate into the following problem: Find an efficient (high-performance and fault tolerant) method to enforce Global serializability (global conflict serializability) in a heterogeneous distributed environment of multiple autonomous database systems. The database systems may employ different concurrency control methods. No limitation should be imposed on the operations of either local transactions (confined to a single database system) or global transactions (span two or more database systems). === Quotations === Lack of an appropriate solution for the global serializability problem has driven researchers to look for alternatives to serializability as a correctness criterion in a multidatabase environment (e.g., see Relaxing global serializability below), and the problem has been characterized as difficult and open. The following two quotations demonstrate the mindset about it by the end of the year 1991, with similar quotations in numerous other articles: "Without knowledge about local as well as global transactions, it is highly unlikely that efficient global concurrency control can be provided... Additional complications occur when different component DBMSs [Database Management Systems] and the FDBMSs [Federated Database Management Systems] support different concurrency mechanisms... It is unlikely that a theoretically elegant solution that provides conflict serializability without sacrificing performance (i.e., concurrency and/or response time) and availability exists." 
=== Proposed solutions === Several solutions, some partial, have been proposed for the global serializability problem. Among them: global conflict graph (serializability graph, precedence graph) checking; distributed two-phase locking (Distributed 2PL); distributed timestamp ordering; and tickets (local logical timestamps which define local total orders, and are propagated to determine a global partial order of transactions). == Relaxing global serializability == Some techniques have been developed for relaxed global serializability (i.e., they do not guarantee global serializability; see also Relaxing serializability). Among them (with several publications each): quasi serializability and two-level serializability. Another common reason nowadays for relaxing global serializability is the requirement of availability of internet products and services. This requirement is typically answered by large-scale data replication. The straightforward solution for synchronizing replicas' updates of the same database object is to include all these updates in a single atomic distributed transaction. However, with many replicas such a transaction is very large, and may span several computers and networks, some of which are likely to be unavailable. Thus such a transaction is likely to end in an abort and fail to achieve its purpose. Consequently, optimistic replication (lazy replication) is often utilized (e.g., in many products and services by Google, Amazon, Yahoo, and the like), while global serializability is relaxed and compromised for eventual consistency. In this case relaxation is done only for applications that are not expected to be harmed by it. Classes of schedules defined by relaxed global serializability properties either contain the global serializability class, or are incomparable with it.
What differentiates techniques for relaxed global conflict serializability (RGCSR) properties from those of relaxed conflict serializability (RCSR) properties that are not RGCSR is typically the different way global cycles (span two or more databases) in the global conflict graph are handled. No distinction between global and local cycles exists for RCSR properties that are not RGCSR. RCSR contains RGCSR. Typically RGCSR techniques eliminate local cycles, i.e., provide local serializability (which can be achieved effectively by regular, known concurrency control methods); however, obviously they do not eliminate all global cycles (which would achieve global serializability).

    Read more →
  • Backdoor (computing)

    Backdoor (computing)

    A backdoor is a typically covert method of bypassing normal authentication or encryption in a computer, product, embedded device (e.g. a home router), or its embodiment (e.g. part of a cryptosystem, algorithm, chipset, or even a "homunculus computer"—a tiny computer-within-a-computer such as that found in Intel's AMT technology). Backdoors are most often used for securing remote access to a computer, or obtaining access to plaintext in cryptosystems. From there it may be used to gain access to privileged information like passwords, corrupt or delete data on hard drives, or transfer information within compromised networks. In the United States, the 1994 Communications Assistance for Law Enforcement Act forces internet providers to provide backdoors for government authorities. In 2024, the U.S. government realized that China had been tapping communications in the U.S. using that infrastructure for months, or perhaps longer; China recorded presidential candidate campaign office phone calls—including employees of the then-vice president of the nation, and of the candidates themselves. A backdoor may take the form of a hidden part of a program, a separate program (e.g. Back Orifice may subvert the system through a rootkit), code in the firmware of the hardware, or parts of an operating system such as Windows, for example, device drivers. Trojan horses can be used to create vulnerabilities in a device. A Trojan horse may appear to be an entirely legitimate program, but when executed, it triggers an activity that may install a backdoor. Although some are secretly installed, other backdoors are deliberate and widely known. These kinds of backdoors have "legitimate" uses such as providing the manufacturer with a way to restore user passwords. Many systems that store information within the cloud fail to create accurate security measures. If many systems are connected within the cloud, hackers can gain access to all other platforms through the most vulnerable system. 
Default passwords (or other default credentials) can function as backdoors if they are not changed by the user. Some debugging features can also act as backdoors if they are not removed in the release version. In 1993, the United States government attempted to deploy an encryption system, the Clipper chip, with an explicit backdoor for law enforcement and national security access. The chip was unsuccessful. Recent proposals to counter backdoors include creating a database of backdoors' triggers and then using neural networks to detect them. == Overview == The threat of backdoors surfaced when multiuser and networked operating systems became widely adopted. Petersen and Turn discussed computer subversion in a paper published in the proceedings of the 1967 AFIPS Conference. They noted a class of active infiltration attacks that use "trapdoor" entry points into the system to bypass security facilities and permit direct access to data. The use of the word trapdoor here clearly coincides with more recent definitions of a backdoor. However, since the advent of public key cryptography the term trapdoor has acquired a different meaning (see: Trapdoor function), and thus the term "backdoor" is now preferred, only after the term trapdoor went out of use. More generally, such security breaches were discussed at length in a RAND Corporation task force report published under DARPA sponsorship by J.P. Anderson and D.J. Edwards in 1970. While initially targeting the computer vision domain, backdoor attacks have expanded to encompass various other domains, including text, audio, ML-based computer-aided design, and ML-based wireless signal classification. Additionally, vulnerabilities in backdoors have been demonstrated in deep generative models, reinforcement learning (e.g., AI GO), and deep graph models. These broad-ranging potential risks have prompted concerns from national security agencies regarding their potentially disastrous consequences. 
A backdoor in a login system might take the form of a hard coded user and password combination which gives access to the system. An example of this sort of backdoor was used as a plot device in the 1983 film WarGames, in which the architect of the "WOPR" computer system had inserted a hardcoded password-less account which gave the user access to the system, and to undocumented parts of the system (in particular, a video game-like simulation mode and direct interaction with the artificial intelligence). Although the number of backdoors in systems using proprietary software (software whose source code is not publicly available) is not widely credited, they are nevertheless frequently exposed. Programmers have even succeeded in secretly installing large amounts of benign code as Easter eggs in programs, although such cases may involve official forbearance, if not actual permission. == Examples == === Worms === Many computer worms, such as Sobig and Mydoom, install a backdoor on the affected computer (generally a PC on broadband running Microsoft Windows and Microsoft Outlook). Such backdoors appear to be installed so that spammers can send junk e-mail from the infected machines. Others, such as the Sony/BMG rootkit, placed secretly on millions of music CDs through late 2005, are intended as DRM measures—and, in that case, as data-gathering agents, since both surreptitious programs they installed routinely contacted central servers. A sophisticated attempt to plant a backdoor in the Linux kernel, exposed in November 2003, added a small and subtle code change by subverting the revision control system. In this case, a two-line change appeared to check root access permissions of a caller to the sys_wait4 function, but because it used assignment = instead of equality checking ==, it actually granted permissions to the system. This difference is easily overlooked, and could even be interpreted as an accidental typographical error, rather than an intentional attack. 
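The assignment-versus-equality trick described above can be imitated in Python with an assignment expression (`:=`). This is only an analogue of the C idiom, not the kernel code itself. The function names, the flag values, and the idea of passing `uid` as a parameter are assumptions made for illustration.

```python
WCLONE, WALL = 0x80000000, 0x40000000   # flag values assumed for illustration
EINVAL = 22                             # stand-in error code

def check_intended(options, uid):
    """Comparison: reject the unusual flag combination when the caller is root."""
    retval = 0
    if options == (WCLONE | WALL) and uid == 0:
        retval = -EINVAL
    return retval, uid

def check_backdoored(options, uid):
    """Assignment: (uid := 0) sets uid to 0 and evaluates to 0, which is falsy.

    The branch is therefore never taken, so no error is reported,
    but the caller's uid has silently become 0 (root).
    """
    retval = 0
    if options == (WCLONE | WALL) and (uid := 0):
        retval = -EINVAL
    return retval, uid

magic = WCLONE | WALL
honest = check_intended(magic, 1000)      # ordinary user, nothing changes
subverted = check_backdoored(magic, 1000) # ordinary user leaves with uid 0
```

The two functions differ by a single character in the condition, and the subverted one only misbehaves for one specific flag combination, which is what makes this kind of change easy to miss in review.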
In January 2014, a backdoor was discovered in certain Samsung Android products, like the Galaxy devices. The Samsung proprietary Android versions are fitted with a backdoor that provides remote access to the data stored on the device. In particular, the Samsung Android software that is in charge of handling the communications with the modem, using the Samsung IPC protocol, implements a class of requests known as remote file server (RFS) commands, which allows the backdoor operator to perform remote I/O operations, via the modem, on the device's hard disk or other storage. As the modem is running Samsung proprietary Android software, it is likely that it offers over-the-air remote control that could then be used to issue the RFS commands and thus to access the file system on the device. === Object code backdoors === Harder-to-detect backdoors involve modifying object code rather than source code; object code is much harder to inspect, as it is designed to be machine-readable, not human-readable. These backdoors can be inserted either directly in the on-disk object code, or at some point during compilation, assembly, linking, or loading; in the latter case the backdoor never appears on disk, only in memory. Object code backdoors are difficult to detect by inspection of the object code, but are easily detected by simply checking for changes (differences), notably in length or in checksum, and in some cases can be detected or analyzed by disassembling the object code. Further, object code backdoors can be removed (assuming source code is available) by simply recompiling from source on a trusted system. Thus for such backdoors to avoid detection, all extant copies of a binary must be subverted, and any validation checksums must also be compromised, and source must be unavailable, to prevent recompilation.
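The length and checksum comparison mentioned above can be sketched as follows. The sketch assumes that a trusted length and SHA-256 digest were recorded from a known-good build; the byte strings and the helper names `sha256_hex` and `is_unmodified` are hypothetical stand-ins for real object files and tooling.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def is_unmodified(binary: bytes, trusted_digest: str, trusted_length: int) -> bool:
    """Compare a binary against the length and digest recorded for a trusted copy."""
    return len(binary) == trusted_length and sha256_hex(binary) == trusted_digest

# Hypothetical stand-in for a known-good object file.
original = b"\x7fELF pretend object code"
trusted_digest = sha256_hex(original)
trusted_length = len(original)

appended = original + b"\x90\x90"     # tampering that changes the length
patched = b"X" + original[1:]         # tampering that keeps the length

checks = {
    "original": is_unmodified(original, trusted_digest, trusted_length),
    "appended": is_unmodified(appended, trusted_digest, trusted_length),
    "patched": is_unmodified(patched, trusted_digest, trusted_length),
}
```

The check is only as trustworthy as the recorded digest and the tool computing it, so both need to come from a system that has not itself been subverted.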
Alternatively, these other tools (length checks, diff, checksumming, disassemblers) can themselves be compromised to conceal the backdoor, for example detecting that the subverted binary is being checksummed and returning the expected value, not the actual value. To conceal these further subversions, the tools must also conceal the changes in themselves—for example, a subverted checksummer must also detect if it is checksumming itself (or other subverted tools) and return false values. This leads to extensive changes in the system and tools being needed to conceal a single change. As object code can be regenerated by recompiling (reassembling, relinking) the original source code, making a persistent object code backdoor (without modifying source code) requires subverting the compiler itself—so that when it detects that it is compiling the program under attack it inserts the backdoor—or alternatively the assembler, linker, or loader. As this requires subverting the compiler, this in turn can be fixed by recompiling the compiler, removing the backdoor insertion code. This defense can in turn be subverted by putting a source meta-backdoor in the compiler, so that when it detects that it is compiling itself

    Read more →
  • Social knowledge management

    Social knowledge management

    Social knowledge management is a business approach that aims to leverage the collective intelligence and social interactions of an organization’s members and stakeholders. It is a branch of knowledge management, which is a multidisciplinary field that deals with the creation, sharing, and use of knowledge in various domains, such as business, economics, psychology, and information management. Knowledge management seeks to enhance organizational performance, innovation, and competitiveness by managing the intangible assets of an organization, such as human capital, know-how, technology, customers, and networks. Social media plays a crucial role in social knowledge management by enhancing communication, collaboration, and learning among individuals and groups, both internally and externally. It offers valuable insights and feedback from customers, partners, and stakeholders, and aids in generating and disseminating new knowledge. In a business context, social media is utilized for various purposes, including sentiment analysis, social learning, social collaboration, and social knowledge management. From a social media strategy point of view, businesses using social media may aim to learn, listen, engage in conversation, measure and refine, develop capabilities, define activities, and prioritize objectives. Social media are not only transforming private communication and interaction; they will also transform how people work. With social media, knowledge work in organizations can be improved considerably, for example through better distribution and sharing of, and access to, knowledge. This will become increasingly important as speed and complexity in today's business world increase dramatically while work environments change constantly.
== Examples of Social KM platforms == Elium, a European software application which combines social tagging, bookmarking and networking paradigms to address internal information management purposes. Sciomino was a startup enterprise social network for Social Knowledge Management.

    Read more →