AI Art Modifier

AI Art Modifier — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Apache Hama

    Apache Hama

    Apache Hama is a distributed computing framework based on bulk synchronous parallel computing techniques for massive scientific computations e.g., matrix, graph and network algorithms. Originally a sub-project of Hadoop, it became an Apache Software Foundation top level project in 2012. It was created by Edward J. Yoon, who named it (short for "Hadoop Matrix Algebra"), and Hama also means hippopotamus in Yoon's native Korean language (하마), following the trend of naming Apache projects after animals and zoology (such as Apache Pig). Hama was inspired by Google's Pregel large-scale graph computing framework described in 2010. When executing graph algorithms, Hama showed a fifty-fold performance increase relative to Hadoop. Retired in April 2020, project resources are made available as part of the Apache Attic. Yoon cited issues of installation, scalability, and a difficult programming model for its lack of adoption. == Architecture == Hama consists of three major components: BSPMaster, GroomServers and Zookeeper. === BSPMaster === BSPMaster is responsible for: Maintaining groom server status Controlling super steps in a cluster Maintaining job progress information Scheduling jobs and assigning tasks to groom servers Disseminating execution class across groom servers Controlling fault Providing users with the cluster control interface. A BSP Master and multiple grooms are started by the script. Then, the bsp master starts up with a RPC server for groom servers. Groom servers starts up with a BSPPeer instance and a RPC proxy to contact the bsp master. After started, each groom periodically sends a heartbeat message that encloses its groom server status, including maximum task capacity, unused memory, and so on. Each time the BSP master receives a heartbeat message, it brings the groom server status up-to-date. The bsp master makes use of groom servers' status in order to assign tasks to idle groom servers - and returns a heartbeat response containing assigned tasks and others actions for a groom server to do. Currently BSP master has a FIFO job scheduler and simple task assignment algorithms. === GroomServer === A groom server (shortly referred to as groom) is a process that performs BSP tasks assigned by BSPMaster. Each groom contacts the BSPMaster, and it takes assigned tasks and reports its status by means of periodical piggybacks with BSPMaster. Each groom is designed to run with HDFS or other distributed storages. Basically, a groom server and a data node should be run on one physical node. === Zookeeper === A Zookeeper is used to manage the efficient barrier synchronisation of the BSPPeers.

    Read more →
  • Protecting Kids From Social Media Act

    Protecting Kids From Social Media Act

    Protecting Kids on Social Media Act or HB 1891 is an American law that was introduced by William Lamberth of Sumner County, Tennessee and was signed into law by Tennessee's governor on May 2, 2024. The bill requires social media websites such as X, YouTube, TikTok, Facebook and others to verify the age of users and if those users are under 18, they must have parental consent. == Progress == The law passed the Tennessee State Legislature with little opposition: the bill had only two no votes in the House from Aftyn Behn and Vincent B. Dixie, and it had zero no votes in the Senate. == Bill summary == Every social media company must verify the age of new users after the law takes effect, and if the user had created an account before the law took effect, they must verify the age of the person attempting to access the account within 14 days. If the new user or the user who originally owned an account is under 18 years of age, they must get parental consent and the third party or social media company must not retain the data from the age verification process or obtaining parental consent. Parents who are account holders of those under 18 can view the privacy settings, set daily time restrictions, and implement breaks during which the minor cannot access the account. The law is enforced by the Attorney General of Tennessee and went into effect on January 1, 2025. == Lawsuit == On October 3, 2024, the trade association NetChoice filed a lawsuit against Tennessee Attorney General Jonathan Skrmetti in the Middle District Court of Tennessee, claiming that the law violates the First Amendment. The Judge for the case is William L. Campbell Jr. An initial case management conference was originally scheduled for December 4, 2024, however it was delayed because of the Supreme Court case United States v. Skrmetti, recommending that the conference be delayed after January 20, 2025. On February 14, 2025, Judge Eli Richardson denied NetChoice's motion for a temporary restraining order because it would disrupt the status quo of the case.

    Read more →
  • Pepper (cryptography)

    Pepper (cryptography)

    In cryptography, a pepper is a secret added to an input such as a password during hashing with a cryptographic hash function. This value differs from a salt in that it is not stored alongside a password hash, but rather the pepper is kept separate using another meachanism, such as a Hardware Security Module. Note that the National Institute of Standards and Technology refers to this value as a secret key rather than a pepper. A pepper is similar in concept to a salt or an encryption key. It is like a salt in that it is a randomized value that is added to a password hash, and it is similar to an encryption key in that it should be kept secret. A pepper performs a comparable role to a salt or an encryption key, but while a salt is not secret (merely unique) and can be stored alongside the hashed output, a pepper is secret and must not be stored with the output. The hash and salt are usually stored in a database, but, if stored, a pepper must be stored separately to prevent it from being obtained by the attacker in case of a database breach. == History == The idea of a site- or service-specific salt (in addition to a per-user salt) has a long history, with Steven M. Bellovin proposing a local parameter in a Bugtraq post in 1995. In 1996 Udi Manber also described the advantages of such a scheme, terming it a secret salt. However, he suggested not storing the value of the secret salt, but instead rediscovering it by trial and error at password verification time. The term pepper has been used, by analogy to salt, but with a variety of meanings. For example, when discussing a challenge-response scheme, pepper has been used for a salt-like quantity, though not used for password storage; it has been used for a data transmission technique where a pepper must be guessed; and even as a part of jokes. The term pepper was proposed for a secret or local parameter stored separately from the password in a discussion of protecting passwords from rainbow table attacks. This usage did not immediately catch on: for example, Fred Wenzel added support to Django password hashing for storage based on a combination of bcrypt and HMAC with separately stored nonces, without using the term. Usage has since become more common. == Types == There are multiple different types of pepper: A shared secret that is common to all users. A randomly-selected number that must be re-discovered on every password input. These mechanisms could be combined with password salting, iterated hashing or even one another. == Shared-secret pepper == Bellovin and Webster suggest prepend a shared secret to the password before hashing, which allows easy use of existing hash functions. For example, consider two users to be added to a database. This table contains two combinations of username and password. The password is not saved, and the 8-byte (64-bit) 44534C70C6883DE2 pepper is saved in a safe place separate from the output values of the hash, in this case SHA256. Unlike the salt, the pepper does not provide protection to users who use the same password, but protects against dictionary attacks, unless the attacker has the pepper value available. Since the same pepper is not shared between different applications, an attacker is unable to reuse the hashes of one compromised database to another. A complete scheme for saving passwords may include both salt and pepper use. For example, it has been suggested to combine the pepper by encrypting salted password hashes, which allows rotation of the pepper. In the case of a shared-secret pepper, a single compromised password (via password reuse or other attack) along with a user's salt can lead to an attack to discover the pepper, rendering it ineffective. If an attacker knows a plaintext password and a user's salt, as well as the algorithm used to hash the password, then discovering the pepper can be a matter of brute forcing the values of the pepper. This is why NIST recommends the secret value be at least 112 bits, so that discovering it by exhaustive search is prohibitively expensive. The pepper must be generated anew for every application it is deployed in, otherwise a breach of one application would result in lowered security of another application. Without knowledge of the pepper, other passwords in the database will be far more difficult to extract from their hashed values, as the attacker would need to guess the password as well as the pepper. A pepper adds security to a database of salts and hashes because unless the attacker is able to obtain the pepper, cracking even a single hash is intractable, no matter how weak the original password. Even with a list of (salt, hash) pairs, an attacker must also guess the secret pepper in order to find the password which produces the hash. The NIST specification for a secret salt suggests using a Password-Based Key Derivation Function (PBKDF) with an approved Pseudorandom Function such as HMAC with SHA-3 as the hash function of the HMAC. The NIST recommendation is also to perform at least 1000 iterations of the PBKDF, and a further minimum 1000 iterations using the secret salt in place of the non-secret salt. == Randomly-selected pepper that must be re-discovered == The aim of this mechanism is to slow down password the password verification step, thus slowing attacks. The aim is similar increasing the iteration count on bcrypt or Argon2, but the mechanism is different. The secret salt or pepper must be rediscovered by the verifier or attacker each time by guessing. In this situation, the password hashing function is calculated using both the password and the pepper. At password storage time, the pepper is chosen randomly from a range between 1 and R, the hash output is calculated using the password and the pepper. The hash output is stored with the username. The pepper is then discarded. At password verification time, the verifier is provided with a username and password to verify. The originally calculated hash is retrieved for the given username, and then the hash of the password and each value between 1 and R is calculated. If any of these hash values match the stored password hash, the password is considered valid. Note, the possible values of the pepper should not be tested in a fixed order known to an attacker, otherwise a timing attack may reveal the pepper. If the password is correct, the correct pepper will be found in R/2 hash evaluations on average. If the password is incorrect, all R values must be tested before the password can be rejected.

    Read more →
  • Social media and identity

    Social media and identity

    Social media can have both positive and negative impacts on a user's identity. Scholars within the fields of psychology and communication study the relationship between social media and identity in order to understand individual behavior, psychological impacts, and social patterns. Communication within political or social groups online can result in practice application, real-world implementation of a concept, of those found identities or the adoption of them as a whole. Young people, defined as emerging adults in or entering college, are especially found to have their identities shaped through social media. Sometimes it seems as though social media is taking over and changing us for the worse. Social media is always changing and can be hard to keep up with. Platforms come and go trends change everyday. What was cool yesterday is lame today. The biggest change from recent years that users are still adjusting to is the name change of Twitter now called X. Since Elon Musk purchased the platform he changed the name but nothing else about the app. Users now feel the need to explain when talking about X. Now it is often referred to as ‘X(Twitter)’ to clarify. == Social Media Usage and Demographics == We know what social media is and how it is used but who uses it? The Pew Research center conducted a 10 year study from 2005-2015 about the demographics of social media usage. While this article is 10 years old the statistics in it are from a very formative time in social media. This is when most people joined and were consistently using social media. Age: While it is no surprise that 90% of young adults use social media they are the main demographic of users. Older adults (65 and older) really hit a boom on social media. In 2005 only 2% of older adults used any form of social media. By 2015 35% of older adults used social media. We can infer that that percentage has grown even more since 2015. Gender: It is known that women tend to use social media more than men. In 2015 it was noted that 65% of women used social media. Men were not far behind, 62% of men were reported to use social media. There are no notable differences of users from various races and ethnicities. The research also shows that more suburban and urban residents use social media over those who live in rural areas. == Young adults == Young adults are especially influenced by social media, where they find social groups to belong to. Research shows that nearly half of teens believe social media platforms has a negative impact on people their age. Psychologists believe that at a time when young adults are coming into adolescence, they are more likely to be influenced by what they see on sites like Instagram or Twitter. Most young adults will widely share, with varying degrees of accuracy, honesty, and openness, information that in the past would have been private or reserved for select individuals. Key questions include whether they accurately portray their identities online and whether the use of social media might impact young adults' identity development. Media Imagery, in particular, is said to be a major influence on the minds of young men and women. Studies have shown that it is even more relevant when it comes to the issue of body image. Social media, in part, has been created to host a safe haven for those who do not claim a solid identity in the material world, but past identities are not easy to escape from since the Internet preserves much of the information that was shared. Social media is an essential part of the social lives of young adults. They rely on it to maintain relationships, create new relationships, and stay up to date with the world around them. Adolescents find social media to be extremely helpful when changing environments, like moving off to university for example. Social media provides students, especially first year students, the opportunity to create the identity they want the world to see. However, it has been seen that these students create online personas that may not reflect their true selves bringing up the issues of impression management. Social media provides young adults with the opportunity to present themselves as something other than their authentic self. Social media providers can help build relationships and community on their platforms. This is something that will create a more positive impact from social media. When young adults interact with each other using social media they are creating something called a social self-identity. Social self identity is what individuals create when they assimilate to being in a group. Social media has gained the reputation of being isolating. If these platforms encourage community then they can help grow users' social self-identity. == Media literacy == The definition of media literacy has evolved over time to encompass a range of experiences that can occur in social media or other digital spaces. The definition of media literacy is also broad and wide ranging in its context. Currently, media literacy is the idea that one is able to analyze, evaluate, and interact with media content in a meaningful way. Educators teach media literacy skills because of the vulnerable relationship that young adults can have with social media. Some examples of media literacy practices, particularly on Twitter, include using hashtags, live tweeting, and sharing information. One of the overall goals of media literacy within the context of social media is to keep young adults aware of potentially violent, graphic, or dangerous content that they may come across on the internet, and how to determine if the content is credible while engaging responsibly with it. In order to be considered media-literate, a person must be able to take in media from online and social platforms and have the correct competencies and context to be able to organize the information. In order to be considered media-literate, the digital information must be given to the user in a way that it can be put into the correct perspective and analyzed, deducted and synthesized.Teenagers and young adults can be vulnerable to specific content online outside of their age-range. Media literacy campaigns and education research shows that targeting those who fall into this age category would be the best way to understand and target their needs as young online users. There are multiple individual studies investigating social media identity relating to media literacy online, however there is a need for much more conclusive information that analyzes multiple studies at a time. Social media literacy is still considered an under-researched topic. Many scholars in media literacy research emphasize the impact of training young adults to consume media in a safe way is the major solution for furthering internet education in children and young adults. The more information the young adults are given on media literacy, the better prepared they are to enter the digital world confidently. One scientific model that has been proposed, known as The Social Media Literacy (SMILE) model is a framework that hypothesizes that at the core of this model it is helping young adults truly know the meaning and display the actions of media literacy online. SMILE is also meant to inspire more research on the subject of media literacy as it relates to social media effects and young adult learning abilities. The model was applied through the lens of a social media positivity bias among adolescents and puts forth five different assumptions about social media and media literacy; Social media literacy as a moderator (what is seen on social media) Social media literacy as a predictor (what is seen for specific individuals on social media) Media literacy within social media is a reciprocal process The development of social media literacy depends on a conditional process of variables affecting other variables Media literacy within social media is a differential learning process, and who teaches it is highly affective of the outcome This model also stresses that human beings learn media literacy (and social media literacy) naturally as they go through life. Research suggests that having young adults taught media literacy from an educator may make them less interested (and therefore less careful) of threats on social media. == Self Presentation == People create images of themselves to present to the public, a process called self presentation. Depending on the demographic, presenting oneself as authentic can result in identity clarity. Methods of self presentation can also be influenced by geography. The framework for this relationship between a user's location and their social media presentation is called the spatial self. Users depict their spatial self in order to include their physical space as a part of their self presentation to an audience. According to a 2018 research paper, patients of plastic surgeons have gone in and asked for specific snapchat "filter" features. This led to a theory of Snap

    Read more →
  • AI Overviews

    AI Overviews

    AI Overviews is an artificial intelligence (AI) feature integrated into Google Search that produces AI-generated summaries of search results. The feature has been criticized for its inaccuracy and for reducing website traffic. == History and development == AI Overviews were first introduced as part of Google's Search Generative Experience (SGE), which was unveiled at the Google I/O conference in May 2023. In May 2024 at Google I/O 2024, the feature was rebranded as AI Overviews and launched in the United States. The introduction of AI Overviews was seen as a strategic move to compete with other generative AI advancements, including OpenAI's ChatGPT. By August 2024, AI Overviews was rolled out to several other countries, including the United Kingdom, India, Japan, Brazil, Mexico, and Indonesia, with support for multiple languages. In October 2024, Google expanded the feature globally, making it available in over 100 countries. In December 2024, Botify x Demandsphere released findings stating that when AI Overviews and featured snippets appear together on the search engine results page, they take up approximately 67.1% of the screen on desktop and 75.7% on mobile. Even if content is ranking in the #1 position, it may not be visible to consumers if other visual elements on the results page are more prominent. In March 2025, Google started testing an "AI Mode", where the search results page is AI-generated. The company was also considering adding advertisements to the AI Mode, as they already exist in AI Overviews. As of May 2025, AI Overviews are available in over 200 countries and territories and in more than 40 languages. As of March 2026, Google AI Overviews appear on more than 48% of total Google Search queries, compared to just 6.49% in the previous year (58% year-over-year growth). == Functionality == The AI Overviews feature uses large language models to generate summaries from web content. The overviews are designed to be concise, providing a snapshot of relevant information about the queried topic. Google allows users to adjust the language complexity in summaries, offering both simplified and detailed options. The overviews also include links to sources. According to a June 2025 study by Semrush, the most cited source is Quora, followed by Reddit. == Reception == The feature has faced criticism for inaccuracies, including instances where erroneous or nonsensical content was generated. Depending on what is searched for, the overview may also consist of hallucinated content, such as when searching for idioms that do not exist. In May 2024, Google temporarily restricted the AI tool after it provided suggestions that were seen as nonsensical and harmful, such as telling users to eat rocks or apply glue on pizza. Concerns were also raised by content publishers, who feared a decline in web traffic as users relied on the summaries instead of visiting source websites. A Google patent from 2026 raised the concern of webmasters that Google could entirely replace the landing page of websites by an AI optimized copy of the website in its results. There is also apprehension about the ethical implications of AI-driven content aggregation, including its impact on intellectual property rights and the visibility of smaller content providers. The European Commission announced in December 2025 that they were investigating whether AI Overviews breached European competition law. In response, Google has stated its commitment to improve content validation and refine the algorithms used to filter unreliable information. Google implemented measures to prioritize link placement within AI Overviews, aiming to balance user convenience with the needs of content creators. In January 2026, Google restricted AI Overviews on certain health-related searches following an investigation by The Guardian. == Lawsuits == On February 24, 2025, Chegg sued Alphabet over the AI Overviews feature, claiming that it was leading to students preferring "low-quality, unverified AI summaries", thus violating antitrust law. Chegg also said it was considering either a sale or a take-private transaction. In September 2025, Penske Media Corporation, the publisher of Rolling Stone and The Hollywood Reporter, sued Google, claiming that AI Overviews illegally regurgitate content from their websites and drive off potential site visitors by always appearing on top of the search results while leaving little incentive to see the linked sources. The company stated that "the future of digital media and [...] its integrity [...] is threatened by Google's current actions", alleging that 20% of searches that link to Penske-owned websites show AI Overviews and that the figure is expected to rise. Google spokesperson José Castañeda called the claims "meritless" and stated that "AI Overviews send traffic to a greater diversity of sites." In 2026, Canadian musician Ashley MacIsaac filed a lawsuit against Google claiming that the AI Overview feature had wrongly stated that MacIsaac had been convicted of numerous criminal offences and was on the sex offender registry. He claims this incorrect information led to the cancellation of a December 2025 gig organized by the Sipekne'katik First Nation.

    Read more →
  • Clustered file system

    Clustered file system

    A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system (only direct attached storage for each node). Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance. == Shared-disk file system == A shared-disk file system uses a storage area network (SAN) to allow multiple computers to gain direct disk access at the block level. Access control and translation from file-level operations that applications use to block-level operations used by the SAN must take place on the client node. The most common type of clustered file system, the shared-disk file system – by adding mechanisms for concurrency control – provides a consistent and serializable view of the file system, avoiding corruption and unintended data loss even when multiple clients try to access the same files at the same time. Shared-disk file-systems commonly employ some sort of fencing mechanism to prevent data corruption in case of node failures, because an unfenced device can cause data corruption if it loses communication with its sister nodes and tries to access the same information other nodes are accessing. The underlying storage area network may use any of a number of block-level protocols, including SCSI, iSCSI, HyperSCSI, ATA over Ethernet (AoE), Fibre Channel, network block device, and InfiniBand. There are different architectural approaches to a shared-disk filesystem. Some distribute file information across all the servers in a cluster (fully distributed). === Examples === == Distributed file systems == Distributed file systems do not share block level access to the same storage but use a network protocol. These are commonly known as network file systems, even though they are not the only file systems that use the network to send data. Distributed file systems can restrict access to the file system depending on access lists or capabilities on both the servers and the clients, depending on how the protocol is designed. The difference between a distributed file system and a distributed data store is that a distributed file system allows files to be accessed using the same interfaces and semantics as local files – for example, mounting/unmounting, listing directories, read/write at byte boundaries, system's native permission model. Distributed data stores, by contrast, require using a different API or library and have different semantics (most often those of a database). === Design goals === Distributed file systems may aim for "transparency" in a number of aspects. That is, they aim to be "invisible" to client programs, which "see" a system which is similar to a local file system. Behind the scenes, the distributed file system handles locating files, transporting data, and potentially providing other features listed below. Access transparency: clients are unaware that files are distributed and can access them in the same way as local files are accessed. Location transparency: a consistent namespace exists encompassing local as well as remote files. The name of a file does not give its location. Concurrency transparency: all clients have the same view of the state of the file system. This means that if one process is modifying a file, any other processes on the same system or remote systems that are accessing the files will see the modifications in a coherent manner. Failure transparency: the client and client programs should operate correctly after a server failure. Heterogeneity: file service should be provided across different hardware and operating system platforms. Scalability: the file system should work well in small environments (1 machine, a dozen machines) and also scale gracefully to bigger ones (hundreds through tens of thousands of systems). Replication transparency: Clients should not have to be aware of the file replication performed across multiple servers to support scalability. Migration transparency: files should be able to move between different servers without the client's knowledge. === History === The Incompatible Timesharing System used virtual devices for transparent inter-machine file system access in the 1960s. More file servers were developed in the 1970s. In 1976, Digital Equipment Corporation created the File Access Listener (FAL), an implementation of the Data Access Protocol as part of DECnet Phase II which became the first widely used network file system. In 1984, Sun Microsystems created the file system called "Network File System" (NFS) which became the first widely used Internet Protocol based network file system. Other notable network file systems are Andrew File System (AFS), Apple Filing Protocol (AFP), NetWare Core Protocol (NCP), and Server Message Block (SMB) which is also known as Common Internet File System (CIFS). In 1986, IBM announced client and server support for Distributed Data Management Architecture (DDM) for the System/36, System/38, and IBM mainframe computers running CICS. This was followed by the support for IBM Personal Computer, AS/400, IBM mainframe computers under the MVS and VSE operating systems, and FlexOS. DDM also became the foundation for Distributed Relational Database Architecture, also known as DRDA. There are many peer-to-peer network protocols for open-source distributed file systems for cloud or closed-source clustered file systems, e. g.: 9P, AFS, Coda, CIFS/SMB, DCE/DFS, WekaFS, Lustre, PanFS, Google File System, Mnet, Chord Project. === Examples === == Network-attached storage == Network-attached storage (NAS) provides both storage and a file system, like a shared disk file system on top of a storage area network (SAN). NAS typically uses file-based protocols (as opposed to block-based protocols a SAN would use) such as NFS (popular on UNIX systems), SMB/CIFS (Server Message Block/Common Internet File System) (used with MS Windows systems), AFP (used with Apple Macintosh computers), or NCP (used with OES and Novell NetWare). == Design considerations == === Avoiding single point of failure === The failure of disk hardware or a given storage node in a cluster can create a single point of failure that can result in data loss or unavailability. Fault tolerance and high availability can be provided through data replication of one sort or another, so that data remains intact and available despite the failure of any single piece of equipment. For examples, see the lists of distributed fault-tolerant file systems and distributed parallel fault-tolerant file systems. === Performance === A common performance measurement of a clustered file system is the amount of time needed to satisfy service requests. In conventional systems, this time consists of a disk-access time and a small amount of CPU-processing time. But in a clustered file system, a remote access has additional overhead due to the distributed structure. This includes the time to deliver the request to a server, the time to deliver the response to the client, and for each direction, a CPU overhead of running the communication protocol software. === Concurrency === Concurrency control becomes an issue when more than one person or client is accessing the same file or block and want to update it. Hence updates to the file from one client should not interfere with access and updates from other clients. This problem is more complex with file systems due to concurrent overlapping writes, where different writers write to overlapping regions of the file concurrently. This problem is usually handled by concurrency control or locking which may either be built into the file system or provided by an add-on protocol. == History == IBM mainframes in the 1970s could share physical disks and file systems if each machine had its own channel connection to the drives' control units. In the 1980s, Digital Equipment Corporation's TOPS-20 and OpenVMS clusters (VAX/ALPHA/IA64) included shared disk file systems.

    Read more →
  • InfiniBand

    InfiniBand

    InfiniBand (IB) is a computer networking standard used in high-performance computing that features very high throughput and very low latency. It is used for data interconnect both among and within computers. InfiniBand is also used as either a direct or switched interconnect between servers and storage systems, as well as an interconnect between storage systems. It is designed to be scalable and uses a switched fabric network topology. Between 2014 and June 2016, it was the most commonly used interconnect in the TOP500 list of supercomputers. Mellanox (acquired by Nvidia) manufactures InfiniBand host bus adapters and network switches, which are used by large computer system and database vendors in their product lines. As a computer cluster interconnect, IB competes with Ethernet, Fibre Channel, and Intel Omni-Path. The technology is promoted by the InfiniBand Trade Association. == History == InfiniBand originated in 1999 from the merger of two competing designs: Future I/O and Next Generation I/O (NGIO). NGIO was led by Intel, with a specification released in 1998, and joined by Sun Microsystems and Dell. Future I/O was backed by Compaq, IBM, and Hewlett-Packard. This led to the formation of the InfiniBand Trade Association (IBTA), which included both sets of hardware vendors as well as software vendors such as Microsoft. At the time it was thought some of the more powerful computers were approaching the interconnect bottleneck of the PCI bus, in spite of upgrades like PCI-X. Version 1.0 of the InfiniBand Architecture Specification was released in 2000. Initially the IBTA vision for IB was simultaneously a replacement for PCI in I/O, Ethernet in the machine room, cluster interconnect and Fibre Channel. IBTA also envisaged decomposing server hardware on an IB fabric. Mellanox had been founded in 1999 to develop NGIO technology, but by 2001 shipped an InfiniBand product line called InfiniBridge at 10 Gbit/second speeds. Following the burst of the dot-com bubble there was hesitation in the industry to invest in such a far-reaching technology jump. By 2002, Intel announced that instead of shipping IB integrated circuits ("chips"), it would focus on developing PCI Express, and Microsoft discontinued IB development in favor of extending Ethernet. Sun Microsystems and Hitachi continued to support IB. In 2003, the System X supercomputer built at Virginia Tech used InfiniBand in what was estimated to be the third largest computer in the world at the time. The OpenIB Alliance (later renamed OpenFabrics Alliance) was founded in 2004 to develop an open set of software for the Linux kernel. By February, 2005, the support was accepted into the 2.6.11 Linux kernel. In November 2005 storage devices finally were released using InfiniBand from vendors such as Engenio. Cisco, desiring to keep technology superior to Ethernet off the market, adopted a "buy to kill" strategy. Cisco successfully killed InfiniBand switching companies such as Topspin via acquisition. Of the top 500 supercomputers in 2009, Gigabit Ethernet was the internal interconnect technology in 259 installations, compared with 181 using InfiniBand. In 2010, market leaders Mellanox and Voltaire merged, leaving just one other IB vendor, QLogic, primarily a Fibre Channel vendor. At the 2011 International Supercomputing Conference, links running at about 56 gigabits per second (known as FDR, see below), were announced and demonstrated by connecting booths in the trade show. In 2012, Intel acquired QLogic's InfiniBand technology, leaving only one independent supplier. By 2014, InfiniBand was the most popular internal connection technology for supercomputers, although within two years, 10 Gigabit Ethernet started displacing it. In 2016, it was reported that Oracle Corporation (an investor in Mellanox) might engineer its own InfiniBand hardware. In 2019 Nvidia acquired Mellanox, the last independent supplier of InfiniBand products. == Specification == Specifications are published by the InfiniBand trade association. === Performance === Original names for speeds were single-data rate (SDR), double-data rate (DDR) and quad-data rate (QDR) as given below. Subsequently, other three-letter initialisms were added for even higher data rates. Notes Each link is duplex. Links can be aggregated: most systems use a 4 link/lane connector (QSFP). HDR often makes use of 2x links (aka HDR100, 100 Gb link using 2 lanes of HDR, while still using a QSFP connector). NDR introduced OSFP connectors which host one or two links at 2x (NDR200) or 4x (NDR400). They are not logically configured as a single 8x link, even when connecting switches together with an OSFP cable. InfiniBand provides remote direct memory access (RDMA) capabilities for low CPU overhead. === Topology === InfiniBand uses a switched fabric topology, as opposed to early shared medium Ethernet. All transmissions begin or end at a channel adapter. Each processor contains a host channel adapter (HCA) and each peripheral has a target channel adapter (TCA). These adapters can also exchange information for security or quality of service (QoS). === Messages === InfiniBand transmits data in packets of up to 4 KB that are taken together to form a message. A message can be: a remote direct memory access read or write a channel send or receive a transaction-based operation (that can be reversed) a multicast transmission an atomic operation === Physical interconnection === In addition to a board form factor connection, it can use both active and passive copper (up to 10 meters) and optical fiber cable (up to 10 km). QSFP connectors are used. The InfiniBand Association also specified the CXP connector system for speeds up to 120 Gbit/s over copper, active optical cables, and optical transceivers using parallel multi-mode fiber cables with 24-fiber MPO connectors. === Software interfaces === Mellanox operating system support is available for Solaris, FreeBSD, Red Hat Enterprise Linux, SUSE Linux Enterprise Server (SLES), Windows, HP-UX, VMware ESX, and AIX. InfiniBand has no specific standard application programming interface (API). The standard only lists a set of verbs such as ibv_open_device or ibv_post_send, which are abstract representations of functions or methods that must exist. The syntax of these functions is left to the vendors. Sometimes for reference this is called the verbs API. The de facto standard software is developed by OpenFabrics Alliance and called the Open Fabrics Enterprise Distribution (OFED). It is released under two licenses GPL2 or BSD license for Linux and FreeBSD, and as Mellanox OFED for Windows (product names: WinOF / WinOF-2; attributed as host controller driver for matching specific ConnectX 3 to 5 devices) under a choice of BSD license for Windows. It has been adopted by most of the InfiniBand vendors, for Linux, FreeBSD, and Microsoft Windows. IBM refers to a software library called libibverbs, for its AIX operating system, as well as "AIX InfiniBand verbs". The Linux kernel support was integrated in 2005 into the kernel version 2.6.11. === Ethernet over InfiniBand === Ethernet over InfiniBand, abbreviated to EoIB, is an Ethernet implementation over the InfiniBand protocol and connector technology. EoIB enables multiple Ethernet bandwidths varying on the InfiniBand (IB) version. Ethernet's implementation of the Internet Protocol Suite, usually referred to as TCP/IP, is different in some details compared to the direct InfiniBand protocol in IP over IB (IPoIB).

    Read more →
  • Kurzsignale

    Kurzsignale

    The Short Signal Code, also known as the Short Signal Book (German: Kurzsignalbuch), was a short code system used by the Kriegsmarine (German Navy) during World War II to minimize the transmission duration of messages. == Description == The transmission of radio messages had the potential risks of revealing the submarine's presence and direction; if decoded the content was also revealed. Submarines need to provide information, mostly in standard form (position of convoy to attack and of submarine, weather information), to their bases. Initially Morse code transmissions could be used. To inhibit detection, the duration of messages needed to be minimised; for this, Kurzsignale short-coding was used. To prevent interception, messages needed to be encrypted by the Enigma machine. To shorten transmission even further, the message could be sent by a fast machine instead of a human radio operator. For example, the Kurier system – not implemented in time – decreased the time to send a Morse dot from around 50 milliseconds for a human to 1 millisecond. == Short Signal book == The Kurzsignale code was intended to shorten transmission time to below the time required to get a directional fix. It was not primarily intended to hide signal contents; protection was intended to be achieved by encoding with the Enigma machine. A copy of the Kurzsignale code book was captured from German submarine U-110 on 9 May 1941. In August 1941, Dönitz began addressing U-boats by the names of their commanders, instead of boat numbers. The method of defining U-boat meeting points in the Short Signal Book was regarded as compromised, so a method was defined by B-Dienst cryptanalysts to disguise their positions on the Kriegsmarine German Naval Grid System (German:Gradnetzmeldeverfahren) was introduced and used until the end of the war == Radio direction finding == Aware of the danger presented by radio direction finding (RDF), the Kriegsmarine developed various systems to speed up broadcast. The Kurzsignale code system condensed messages into short codes consisting of short sequences for common terms such as "convoy location" so that additional descriptions would not be needed in the message. The resulting Kurzsignal was then encoded with the Enigma machine and subsequently transmitted as rapidly as possible, typically taking about 20 seconds. Typical length of an information or weather signal was about 25 characters. Conventional RDF needed about a minute to fix the bearing of a radio signal, and the Kurzsignale protected against this. However, the huff-duff system which was in use by the Allies could cope with these short transmissions. The fully automated burst transmission Kurier system, in testing from August 1944, could send a Kurzsignal in not more than 460 milliseconds; this was short enough to prevent location even by huff-duff and, if deployed, would have been a serious setback for Allied anti-submarine and code-breaking activities. By late 1944 the Kurier program was a top priority, but the war ended before the system was operational. == Short Weather cipher == A similar coding system was used for weather reports from U-boats, the Wetterkurzschlüssel (Short Weather Cipher). Code books were captured from U-559 on 30 October 1942.

    Read more →
  • Spanish Network of Excellence on Cybersecurity Research

    Spanish Network of Excellence on Cybersecurity Research

    The Spanish Network of Excellence on Cybersecurity Research (RENIC), is a research initiative to promote cybersecurity interests in Spain. == Members == === Board of Directors (2018) === President: Universidad de Málaga Vice president: CSIC Treasurer: Universidad Politécnica de Madrid Secretary: Universidad de Granada Vocals: Tecnalia, Universidad de La Laguna and Universidad de Modragón === Board of Directors (2016) === President: Universidad Carlos III de Madrid Vice president: Universidad Politécnica de Madrid Treasurer: Universidad de Granada Secretary: Universidad de León Vocals: Gradiant, Tecnalia, Universidad de Málaga === Founding Members === Centro Andaluz de Innovación y Tecnologías de la Información y las Comunicaciones (CITIC). Consejo Superior de Investigaciones Científicas (CSIC). Centro Tecnolóxico de Telecomunicaciones de Galicia (Gradiant). Instituto Imdea Software. Instituto Nacional de Ciberseguridad (INCIBE). Mondragón Unibertsitatea. Tecnalia. Universidad Carlos III de Madrid. Universidad Castilla la Mancha. Universidad de Granada. Universidad de la Laguna. Universidad de León. Universidad de Málaga. Universidad de Murcia. Universidad de Vigo. Universidad Internacional de la Rioja. Universidad Politécnica de Madrid. Universidad Rey Juan Carlos. === Members === Consejo Superior de Investigaciones Científicas (CSIC). Centro Tecnolóxico de Telecomunicaciones de Galicia (Gradiant). Instituto Imdea Software. Instituto Nacional de Ciberseguridad (INCIBE). Mondragón Unibertsitatea. Tecnalia. Universidad Carlos III de Madrid. Universidad de Castilla-La Mancha. Universidad de Granada. Universidad de la Laguna. Universidad de León. Universidad de Málaga. Universidad de Murcia. Universidad de Vigo. Universidad Politécnica de Madrid. Universidad Rey Juan Carlos. Universitat Oberta de Catalunya. IKERLAN. === Honorary Members === Centre for the Development of Industrial Technology (CDTI). (2017) Instituto Nacional de Ciberseguridad (INCIBE). (2016) == Initiatives and Participations == RENIC is ECSO member, and is also a member of its board of directors. A collaboration agreement between RENIC and the Innovative Business Cluster on Cybersecurity (AEI Cybersecurity) has been signed. RENIC is pleased to sponsor the Cybersecurity Research National Conferences (JNIC) JNIC2017 edition, organized by Universidad Rey Juan Carlos. RENIC is pleased to announce the publication of the online version of the Catalog and knowledge map of cybersecurity research

    Read more →
  • Data monetization

    Data monetization

    Data monetization, a form of monetization, may refer to the act of generating measurable economic benefits from available data sources (analytics). Less commonly, it may also refer to the act of monetizing data services. In the case of analytics, typically, these benefits accrue as revenue or expense savings, but may also include market share or corporate market value gains. Data monetization leverages data generated through business operations, available exogenous data or content, as well as data associated with individual actors such as that collected via electronic devices and sensors participating in the internet of things. For example, the ubiquity of the internet of things is generating location data and other data from sensors and mobile devices at an ever-increasing rate. When this data is collated against traditional databases, the value and utility of both sources of data increases, leading to tremendous potential to mine data for social good, research and discovery, and achievement of business objectives. Closely associated with data monetization are the emerging data as a service models for transactions involving data by the data item. There are three ethical and regulatory vectors involved in data monetization due to the sometimes conflicting interests of actors involved in the digital supply chain. The individual data creator who generates files and records through his own efforts or owns a device such as a sensor or a mobile phone that generates data has a claim to ownership of data. The business entity that generates data in the course of its operations, such as its transactions with financial institutions or risk factors discovered through feedback from customers also has a claim on data captured through their systems and platforms. However, the person that contributed the data may also have a legitimate claim on the data. Internet platforms and service providers, such as Google or Facebook that require a user to forgo some ownership interest in their data in exchange for use of the platform also have a legitimate claim on the data. Thus the practice of data monetization, although common since 2000, is now getting increasing attention from regulators. The European Union and the United States Congress have begun to address these issues. For instance, in the financial services industry, regulations involving data are included in the Gramm–Leach–Bliley Act and Dodd-Frank. Some individual creators of data are shifting to using personal data vaults and implementing vendor relationship management concepts as a reflection of an increasing resistance to their data being federated or aggregated and resold without compensation. Groups such as the Personal Data Ecosystem Consortium, Patient privacy rights, and others are also challenging corporate cooptation of data without compensation. Financial services companies are a relatively good example of an industry focused on generating revenue by leveraging data. Credit card issuers and retail banks use customer transaction data to improve targeting of cross-sell offers. Partners are increasingly promoting merchant based reward programs which leverage a bank’s data and provide discounts to customers at the same time. == Types of data monetization == Internal data monetization - An organization's data is used internally, resulting in economic benefit. This is commonly the case in organizations using analytics to uncover insights, resulting in improved profit, cost savings or the avoidance of risk. Internal data monetization is currently the most common form of monetization, requiring far fewer security, intellectual property, and legal precautions when compared to other types. The potential economic gains from this type of data monetization are limited by the organization's internal structure and situation. External data monetization - A person or organization makes data they possess available on a for-fee basis to external parties, or as a broker for same. This type of monetization is less common and requires various methods to distribute the data to potential buyers and consumers. However, the economic gain that results from collecting data, packaging and distributing it, can be quite large. == Steps == Identification of available data sources – this includes data currently available for monetization as well as other external data sources that may enhance the value of what’s currently available. Connect, aggregate, attribute, validate, authenticate, and exchange data - this allows data to be converted directly into actionable or revenue generating insight or services. Set terms and prices and facilitate data trading - methods for data vetting, storage, and access. For example, many global corporations have locked and siloed data storage infrastructures, which hinders efficient access to data and cooperative and real-time exchange. Perform Research and analytics – draw predictive insights from existing data as a basis for using data for to reduce risk, enhance product development or performance, or improve customer experience or business outcomes. Action and leveraging – the last phase of monetizing data includes determining alternative or improved data centric products, ideas, or services. Examples may include real-time actionable triggered notifications or enhanced channels such as web or mobile response mechanisms. == Pricing variables and factors == A fee for use of a platform to connect buyers and sellers use of a platform to configure, organize, and otherwise process data included in a data trade connecting or including a device or sensor into a data supply chain connecting and credentialing a creator of a data source and a data buyer – often through a federated identity connecting a data source to other data sources to be included in a data supply chain use of an internet service or other transmission services for uploading and downloading data – sometimes, for an individual, through a personal cloud use of encrypted keys to achieve secure data transfer use of a search algorithm specifically designed to tag data sources that contain data points of value to the data buyer linking a data creator or generator to a data collection protocol or form server actions – such as a notification – triggered by an update to a data item or data source included in a data supply chain A price or exchange or other trade value assigned by a data creator or generator to a data item or a data source offered by a data buyer to a data creator assigned by a data buyer for a data item or a data source formatted according to criteria set by a data buyer An incremental fee assigned by a data buyer for a data item or a data set scaled to the reputation of the data creator == Benefits == Improved decision-making that leads to real time crowd sourced research, improved profits, decreased costs, reduced risk and improved compliance More impactful decisions (e.g., make real-time decisions) More timely (lower latency) decisions (e.g., a vendor making purchase recommendations while the customer is still on the phone or in the store, a customer connecting with multiple vendors to discover the best price, triggered notifications when thresholds are reached for data values) More granular decisions (e.g., localized pricing decisions at an individual or device or sensor level versus larger aggregates). Targeted Marketing (e.g., Vendors with access to big data can make targeted advertisements to specific customers within a set data pool decreasing costs for the advertiser and reaching most interested customers) == Frameworks == There are a wide variety of industries, firms and business models related to data monetization. The following frameworks have been offered to help understand the types of business models that are used: Roger Ehrenberg of IA Ventures, a venture capital firm that invests in this sector, has defined three basic types of data product firms: Contributory databases. The magic of these businesses is that a customer provides their own data in exchange for receiving a more robust set of aggregated data back that provides insight into the broader marketplace, or provides a vehicle for expressing a view. Give a little, get a lot back in return – a pretty compelling value proposition, and one that frequently results in a payment from the data contributor in exchange for receiving enriched, aggregated data. Once these contributory databases are developed and customers become reliant on their insights, they become extremely valuable and persistent data assets. Data processing platforms. These businesses create barriers through a combination of complex data architectures, proprietary algorithms, and rich analytics to help customers consume data in whatever form they please. Often these businesses have special relationships with key data providers, that when combined with other data and processed as a whole create valuable differentiation and competitive barriers. Bloomberg is an example of a powerful

    Read more →
  • Semi-Automatic Ground Environment

    Semi-Automatic Ground Environment

    The Semi-Automated Ground Environment (SAGE) was a system of large computers and associated networking equipment that coordinated data from many radar sites and processed it to produce a single unified image of the airspace over a wide area. SAGE directed and controlled the NORAD response to a possible Soviet air attack, operating in this role from the late 1950s into the 1980s. The processing power behind SAGE was supplied by the largest discrete component-based computer ever built, the AN/FSQ-7, manufactured by IBM. Each SAGE Direction Center (DC) housed an FSQ-7 which occupied an entire floor, approximately 22,000 square feet (2,000 m2) not including supporting equipment. The FSQ-7 was actually two computers, "A" side and "B" side. Computer processing was switched from "A" side to "B" side on a regular basis, allowing maintenance on the unused side. Information was fed to the DCs from a network of radar stations as well as readiness information from various defense sites. The computers, based on the raw radar data, developed "tracks" for the reported targets, and automatically calculated which defenses were within range. Operators used light guns to select targets on-screen for further information, select one of the available defenses, and issue commands to attack. These commands would then be automatically sent to the defense site via teleprinter. Connecting the various sites was an enormous network of telephones, modems and teleprinters. Later additions to the system allowed SAGE's tracking data to be sent directly to CIM-10 Bomarc missiles and some of the US Air Force's interceptor aircraft in-flight, directly updating their autopilots to maintain an intercept course without operator intervention. Each DC also forwarded data to a Combat Center (CC) for "supervision of the several sectors within the division" ("each combat center [had] the capability to coordinate defense for the whole nation"). SAGE became operational in the late 1950s and early 1960s at an estimated total cost between 8 and 12 billion dollars, four times the cost of the Manhattan Project. Throughout its development, there were continual concerns about its real ability to deal with large attacks, and the Operation Sky Shield tests showed that only about one-fourth of enemy bombers would have been intercepted. Nevertheless, SAGE was the backbone of NORAD's air defense system into the 1980s, by which time the tube-based FSQ-7s were increasingly costly to maintain and completely outdated. Today the same command and control task is carried out by microcomputers, based on the same basic underlying data. == Background == === Earlier systems === Just prior to World War II, Royal Air Force (RAF) tests with the new Chain Home (CH) radars had demonstrated that relaying information to the fighter aircraft directly from the radar sites was not feasible. The radars determined the map coordinates of the enemy, but could generally not see the fighters at the same time. This meant the fighters had to be able to determine where to fly to perform an interception but were often unaware of their own exact location and unable to calculate an interception while also flying their aircraft. The solution was to send all of the radar information to a central control station where operators collated the reports into single tracks, and then reported these tracks to the airbases, or sectors. The sectors used additional systems to track their own aircraft, plotting both on a single large map. Operators viewing the map could then see what direction their fighters would have to fly to approach their targets and relay that simply by telling them to fly along a certain heading or vector. This Dowding system was the first ground-controlled interception (GCI) system of large scale, covering the entirety of the UK. It proved enormously successful during the Battle of Britain, and is credited as being a key part of the RAF's success. The system was slow, often providing information that was up to five minutes out of date. Against propeller driven bombers flying at perhaps 225 miles per hour (362 km/h) this was not a serious concern, but it was clear the system would be of little use against jet-powered bombers flying at perhaps 600 miles per hour (970 km/h). The system was extremely expensive in manpower terms, requiring hundreds of telephone operators, plotters and trackers in addition to the radar operators. This was a serious drain on manpower, making it difficult to expand the network. The idea of using a computer to handle the task of taking reports and developing tracks had been explored beginning late in the war. By 1944, analog computers had been installed at the CH stations to automatically convert radar readings into map locations, eliminating two people. Meanwhile, the Royal Navy began experimenting with the Comprehensive Display System (CDS), another analog computer that took X and Y locations from a map and automatically generated tracks from repeated inputs. Similar systems began development with the Royal Canadian Navy, DATAR, and the US Navy, the Naval Tactical Data System (NTDS). A similar system was also specified for the Nike SAM project, specifically referring to a US version of CDS, coordinating the defense over a battle area so that multiple batteries did not fire on a single target. All of these systems were relatively small in geographic scale, generally tracking within a city-sized area. === Valley Committee === When the Soviet Union tested its first atomic bomb in August 1949, the topic of air defense of the US became important for the first time. A study group, the "Air Defense Systems Engineering Committee", was set up under the direction of Dr. George Valley to consider the problem and is known to history as the "Valley Committee". Their December report noted a key problem in air defense using ground-based radars. A bomber approaching a radar station would detect the signals from the radar long before the reflection off the bomber was strong enough to be detected by the station. The committee suggested that when this occurred, the bomber would descend to low altitude, thereby greatly limiting the radar horizon, allowing the bomber to fly past the station undetected. Although flying at low altitude greatly increased fuel consumption, the team calculated that the bomber would only need to do this for about 10% of its flight, making the fuel penalty acceptable. The only solution to this problem was to build a huge number of stations with overlapping coverage. At that point the problem became one of managing the information. Manual plotting was ruled out as too slow, and a computerized solution was the only possibility. To handle this task, the computer would need to be fed information directly, eliminating any manual translation by phone operators, and it would have to be able to analyze that information and automatically develop tracks. A system tasked with defending cities against the predicted future Soviet bomber fleet would have to be dramatically more powerful than the models used in the NTDS or DATAR. The Committee then had to consider whether or not such a computer was possible. The Valley Committee was introduced to Jerome Wiesner, associate director of the Research Laboratory of Electronics at MIT. Wiesner noted that the Servomechanisms Laboratory had already begun development of a machine that might be fast enough. This was the Whirlwind I, originally developed for the Office of Naval Research as a general purpose flight simulator that could simulate any current or future aircraft by changing its software. Wiesner introduced the Valley Committee to Whirlwind's project lead, Jay Forrester, who convinced him that Whirlwind was sufficiently capable. In September 1950, an early microwave early-warning radar system at Hanscom Field was connected to Whirlwind using a custom interface developed by Forrester's team. An aircraft was flown past the site, and the system digitized the radar information and successfully sent it to Whirlwind. With this demonstration, the technical concept was proven. Forrester was invited to join the committee. === Project Charles === With this successful demonstration, Louis Ridenour, chief scientist of the Air Force, wrote a memo stating "It is now apparent that the experimental work necessary to develop, test, and evaluate the systems proposals made by ADSEC will require a substantial amount of laboratory and field effort." Ridenour approached MIT President James Killian with the aim of beginning a development lab similar to the war-era Radiation Laboratory that made enormous progress in radar technology. Killian was initially uninterested, desiring to return the school to its peacetime civilian charter. Ridenour eventually convinced Killian the idea was sound by describing the way the lab would lead to the development of a local electronics industry based on the needs of the lab and the students who would leave the lab to start their

    Read more →
  • NYSERNet

    NYSERNet

    NYSERNet, Inc. (New York State Education and Research Network), is a non-profit Internet service provider in New York State. It mainly provides Internet access to universities, colleges, museums, health care facilities, primary and secondary schools, and research institutions. == History == NYSERNet was founded in 1986 in Troy, New York. Its founders compared NYSERNet's network with the Erie Canal and considered it the next step in two centuries to draw the country together. NYSERNet's network reaches from Buffalo to New York City. Completed in 1987, it was the first statewide regional IP network in the United States.[1] Initial speed of 56 kbps was upgraded to T1 in 1989 and T3 in 1994. It was the original assignee of AS174 according to RFC1117. This ASN is used today by Cogent Communications for their global network.

    Read more →
  • Open-source robotics

    Open-source robotics

    Open-source robotics is a branch of robotics where robots are developed with open-source hardware and free and open-source software, publicly sharing blueprints, schematics, and source code. It is thus closely related to the open design movement, the maker movement and open science. == Requirements == Open source robotics means that information about the hardware is easily discerned, so that others can easily rebuild it. In turn, this requires design to use only easily available standard subcomponents and tools, and for the build process to be documented in detail including a bill of materials and detailed ('Ikea style') step-by-step building and testing instructions. (A CAD file alone is not sufficient, as it does not show the steps for performing or testing the build). These requirements are standard to open source hardware in general, and are formalised by various licences, certifications, especially those defined by the peer-reviewed journals Journal of Open Hardware and HardwareX. Licensing requirements for software are the same as for any open source software. But in addition, for software components to be of practical use in real robot systems, they need to be compatible with other software, usually as defined by some robotics middleware community standard. == Hardware systems == Applications to date include: Robot arms, e.g. PARA or Thor Wheeled mobile robots. e.g. OpenScout Four-legged robots such as the Open Dynamic Robot Initiative UAV quadcopters (drones) such as Agilicious Humanoid robots, e.g. iCub, Berkeley Humanoid Lite Self-driving cars, e.g. OpenPodcar (→ Personal rapid transit) Submersible robots, eg. OpenFish Laboratory robotics such as chemical liquid handling Vertical farming Swarm robots, e.g. HeRoSwarm Domestic tasks: vacuum cleaning, floor washing and grass mowing Robot sports including robot combat and autonomous racing Education == Hardware subcomponents == Most open source hardware definitions allow non-open subcomponents to be used in modular design, as long as they are easily available. However many designs try to push openness down into as many subcomponents as possible, with the aim of ultimately reaching fully open designs. Open hardware manual-drive vehicles and their subcomponents, such as from Open Source Ecology, are often used as starting points and extended with automation systems. Open subcomponents can include open-source computing hardware as subcomponents, such as Arduino and RISC-V, as well as open source motors and drivers such as the Open Source Motor Controller and ODrive. Open hardware robotics interface boards can simplify interfacing between middleware software and physical hardware. == Software subcomponents == === Middleware === Robotics middleware is software which links multiple other software components together. In robotics, this specifically means real-time communication systems with standardized message passing protocols. The predominant open source middleware is ROS2, the robot operating system, now as version 2. Other alternatives include ROS1, YARP — used in the iCub, URBI, and Orca. Open source middleware is usually run on an open source operating system, especially the Ubuntu distribution of Linux. === Driver software === Most robot sensors and actuators require software drivers. There is little standardization of open source software at this level, because each hardware device is different. Creating open drivers for closed hardware is difficult as it requires both low level programming and reverse engineering. === Simulation software === Open source robotics simulators include Gazebo, MuJoCo and Webots. Open source 3D game engines such as Godot are also sometimes used as simulators, when equipped with suitable middleware interfaces. === Automation software === At the level of AI, many standard algorithms have open source software implementations, mostly in ROS2. Major components include: Machine vision systems such as the YOLO object detector. 3D photogrammetry Navigation including SLAM and planning such as nav2 Arm inverse kinematics such as moveIt2 == Community == The first signs of the increasing popularity of building and sharing robot designs were found with the maker culture community. What began with small competitions for remote operated vehicles (e.g. Robot combat), soon developed to the building of autonomous telepresence robots such as Sparky and then true robots (being able to take decisions themselves) as the Open Automaton Project. Several commercial companies now also produce kits for making simple robots. The community has adopted open source hardware licenses, certifications, and peer-reviewed publications, which check that source has been made correctly and permanently available under community definitions, and which validate that this has been done. These processes have become critically important due to many historical projects claiming to be open source but them reverting on the promise due to commercialisation or other pressures. As with other forms of open source hardware, the community continues to debate precise criteria for 'ease of build'. A common standard is that designs should be buildable by a technical university student, in a few days, using typical fablab tools, but definitions of all of these subterms can also be debated. Compared to other forms of open source hardware, open source robotics typically includes a large software element, so involves software as well as hardware engineers. Open source concepts are more established in open source software than hardware, so robotics is a field in which those concepts can be shared and transferred from software to hardware. While the community in open source robotics is multi-faceted with a wide range of backgrounds, a sizable sub-community uses the ROS middleware and meets at the ROSCon conferences to discuss development of ROS itself and automation components built on it.

    Read more →
  • Convergent encryption

    Convergent encryption

    Convergent encryption, also known as content hash keying, is a cryptosystem that produces identical ciphertext from identical plaintext files. This has applications in cloud computing to remove duplicate files from storage without the provider having access to the encryption keys. The combination of deduplication and convergent encryption was described in a backup system patent filed by Stac Electronics in 1995. This combination has been used by Farsite, Permabit, Freenet, MojoNation, GNUnet, flud, and the Tahoe Least-Authority File Store. The system gained additional visibility in 2011 when cloud storage provider Bitcasa announced they were using convergent encryption to enable de-duplication of data in their cloud storage service. == Overview == The system computes a cryptographic hash of the plaintext in question. The system then encrypts the plaintext by using the hash as a key. Finally, the hash itself is stored, encrypted with a key chosen by the user. == Known Attacks == Convergent encryption is open to a "confirmation of a file attack" in which an attacker can effectively confirm whether a target possesses a certain file by encrypting an unencrypted, or plain-text, version and then simply comparing the output with files possessed by the target. This attack poses a problem for a user storing information that is non-unique, i.e. also either publicly available or already held by the adversary - for example: banned books or files that cause copyright infringement. An argument could be made that a confirmation of a file attack is rendered less effective by adding a unique piece of data such as a few random characters to the plain text before encryption; this causes the uploaded file to be unique and therefore results in a unique encrypted file. However, some implementations of convergent encryption where the plain-text is broken down into blocks based on file content, and each block then independently convergently encrypted may inadvertently defeat attempts at making the file unique by adding bytes at the beginning or end. Even more alarming than the confirmation attack is the "learn the remaining information attack" described by Drew Perttula in 2008. This type of attack applies to the encryption of files that are only slight variations of a public document. For example, if the defender encrypts a bank form including a ten digit bank account number, an attacker that is aware of generic bank form format may extract defender's bank account number by producing bank forms for all possible bank account numbers, encrypt them and then by comparing those encryptions with defender's encrypted file deduce the bank account number. Note that this attack can be extended to attack a large number of targets at once (all spelling variations of a target bank customer in the example above, or even all potential bank customers), and the presence of this problem extends to any type of form document: tax returns, financial documents, healthcare forms, employment forms, etc. Also note that there is no known method for decreasing the severity of this attack -- adding a few random bytes to files as they are stored does not help, since those bytes can likewise be attacked with the "learn the remaining information" approach. The only effective approach to mitigating this attack is to encrypt the contents of files with a non-convergent secret before storing (negating any benefit from convergent encryption), or to simply not use convergent encryption in the first place.

    Read more →
  • Bus encryption

    Bus encryption

    Bus encryption is the use of encrypted program instructions on a data bus in a computer that includes a secure cryptoprocessor for executing the encrypted instructions. Bus encryption is used primarily in electronic systems that require high security, such as automated teller machines, TV set-top boxes, and secure data communication devices such as two-way digital radios. Bus encryption can also mean encrypted data transmission on a data bus from one processor to another processor. For example, from the CPU to a GPU which does not require input of encrypted instructions. Such bus encryption is used by Windows Vista and newer Microsoft operating systems to protect certificates, BIOS, passwords, and program authenticity. PVP-UAB (Protected Video Path) provides bus encryption of premium video content in PCs as it passes over the PCIe bus to graphics cards to enforce digital rights management. The need for bus encryption arises when multiple people have access to the internal circuitry of an electronic system, either because they service and repair such systems, stock spare components for the systems, own the system, steal the system, or find a lost or abandoned system. Bus encryption is necessary not only to prevent tampering of encrypted instructions that may be easily discovered on a data bus or during data transmission, but also to prevent discovery of decrypted instructions that may reveal security weaknesses that an intruder can exploit. In TV set-top boxes, it is necessary to download program instructions periodically to customer's units to provide new features and to fix bugs. These new instructions are encrypted before transmission, but must also remain secure on data buses and during execution to prevent the manufacture of unauthorized cable TV boxes. This can be accomplished by secure crypto-processors that read encrypted instructions on the data bus from external data memory, decrypt the instructions in the cryptoprocessor, and execute the instructions in the same cryptoprocessor.

    Read more →