AI App Developer Free

AI App Developer Free — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Superquadrics

    Superquadrics

    In mathematics, the superquadrics or super-quadrics (also superquadratics) are a family of geometric shapes defined by formulas that resemble those of ellipsoids and other quadrics, except that the squaring operations are replaced by arbitrary powers. They can be seen as the three-dimensional relatives of the superellipses. The term may refer to the solid object or to its surface, depending on the context. The equations below specify the surface; the solid is specified by replacing the equality signs by less-than-or-equal signs. The superquadrics include many shapes that resemble cubes, octahedra, cylinders, lozenges and spindles, with rounded or sharp corners. Because of their flexibility and relative simplicity, they are popular geometric modeling tools, especially in computer graphics. It becomes an important geometric primitive widely used in computer vision, robotics, and physical simulation. Some authors, such as Alan Barr, define "superquadrics" as including both the superellipsoids and the supertoroids. In modern computer vision literatures, superquadrics and superellipsoids are used interchangeably, since superellipsoids are the most representative and widely utilized shape among all the superquadrics. Comprehensive coverage of geometrical properties of superquadrics and methods of their recovery from range images and point clouds are covered in several computer vision literatures. == Formulas == === Implicit equation === The surface of the basic superquadric is given by | x | r + | y | s + | z | t = 1 {\displaystyle \left|x\right|^{r}+\left|y\right|^{s}+\left|z\right|^{t}=1} where r, s, and t are positive real numbers that determine the main features of the superquadric. Namely: less than 1: a pointy octahedron modified to have concave faces and sharp edges. exactly 1: a regular octahedron. between 1 and 2: an octahedron modified to have convex faces, blunt edges and blunt corners. exactly 2: a sphere greater than 2: a cube modified to have rounded edges and corners. infinite (in the limit): a cube Each exponent can be varied independently to obtain combined shapes. For example, if r=s=2, and t=4, one obtains a solid of revolution which resembles an ellipsoid with round cross-section but flattened ends. This formula is a special case of the superellipsoid's formula if (and only if) r = s. If any exponent is allowed to be negative, the shape extends to infinity. Such shapes are sometimes called super-hyperboloids. The basic shape above spans from -1 to +1 along each coordinate axis. The general superquadric is the result of scaling this basic shape by different amounts A, B, C along each axis. Its general equation is | x A | r + | y B | s + | z C | t = 1. {\displaystyle \left|{\frac {x}{A}}\right|^{r}+\left|{\frac {y}{B}}\right|^{s}+\left|{\frac {z}{C}}\right|^{t}=1.} === Parametric description === Parametric equations in terms of surface parameters u and v (equivalent to longitude and latitude if m equals 2) are x ( u , v ) = A g ( v , 2 r ) g ( u , 2 r ) y ( u , v ) = B g ( v , 2 s ) f ( u , 2 s ) z ( u , v ) = C f ( v , 2 t ) − π 2 ≤ v ≤ π 2 , − π ≤ u < π , {\displaystyle {\begin{aligned}x(u,v)&{}=Ag\left(v,{\frac {2}{r}}\right)g\left(u,{\frac {2}{r}}\right)\\y(u,v)&{}=Bg\left(v,{\frac {2}{s}}\right)f\left(u,{\frac {2}{s}}\right)\\z(u,v)&{}=Cf\left(v,{\frac {2}{t}}\right)\\&-{\frac {\pi }{2}}\leq v\leq {\frac {\pi }{2}},\quad -\pi \leq u<\pi ,\end{aligned}}} where the auxiliary functions are f ( ω , m ) = sgn ⁡ ( sin ⁡ ω ) | sin ⁡ ω | m g ( ω , m ) = sgn ⁡ ( cos ⁡ ω ) | cos ⁡ ω | m {\displaystyle {\begin{aligned}f(\omega ,m)&{}=\operatorname {sgn}(\sin \omega )\left|\sin \omega \right|^{m}\\g(\omega ,m)&{}=\operatorname {sgn}(\cos \omega )\left|\cos \omega \right|^{m}\end{aligned}}} and the sign function sgn(x) is sgn ⁡ ( x ) = { − 1 , x < 0 0 , x = 0 + 1 , x > 0. {\displaystyle \operatorname {sgn}(x)={\begin{cases}-1,&x<0\\0,&x=0\\+1,&x>0.\end{cases}}} === Spherical product === Barr introduces the spherical product which given two plane curves produces a 3D surface. If f ( μ ) = ( f 1 ( μ ) f 2 ( μ ) ) , g ( ν ) = ( g 1 ( ν ) g 2 ( ν ) ) {\displaystyle f(\mu )={\begin{pmatrix}f_{1}(\mu )\\f_{2}(\mu )\end{pmatrix}},\quad g(\nu )={\begin{pmatrix}g_{1}(\nu )\\g_{2}(\nu )\end{pmatrix}}} are two plane curves then the spherical product is h ( μ , ν ) = f ( μ ) ⊗ g ( ν ) = ( f 1 ( μ ) g 1 ( ν ) f 1 ( μ ) g 2 ( ν ) f 2 ( μ ) ) {\displaystyle h(\mu ,\nu )=f(\mu )\otimes g(\nu )={\begin{pmatrix}f_{1}(\mu )\ g_{1}(\nu )\\f_{1}(\mu )\ g_{2}(\nu )\\f_{2}(\mu )\end{pmatrix}}} This is similar to the typical parametric equation of a sphere: x = x 0 + r sin ⁡ θ cos ⁡ φ y = y 0 + r sin ⁡ θ sin ⁡ φ ( 0 ≤ θ ≤ π , 0 ≤ φ < 2 π ) z = z 0 + r cos ⁡ θ {\displaystyle {\begin{aligned}x&=x_{0}+r\sin \theta \;\cos \varphi \\y&=y_{0}+r\sin \theta \;\sin \varphi \qquad (0\leq \theta \leq \pi ,\;0\leq \varphi <2\pi )\\z&=z_{0}+r\cos \theta \end{aligned}}} which give rise to the name spherical product. Barr uses the spherical product to define quadric surfaces, like ellipsoids, and hyperboloids as well as the torus, superellipsoid, superquadric hyperboloids of one and two sheets, and supertoroids. == Plotting code == The following GNU Octave code generates a mesh approximation of a superquadric:

    Read more →
  • Airborne Networking

    Airborne Networking

    An Airborne Network (AN) is the infrastructure owned by the United States Air Force that provides communication transport services through at least one node that is on a platform capable of flight. == Background == === Definition === The intent of the US Air Force's Airborne Network is to expand the Global Information Grid (GIG) to connect the three major domains of warfare: Air, Space, and Terrestrial. The Transformational Satellite Communications System network currently provides connectivity for all communication through space assets. The Combat Information Transport System and Theater Deployable Communications provide terrestrial connectivity for theatre based operations. The Airborne Network is engineered to utilize all airborne assets to connect with space and surface networks building a seamless communications platform across all domains. === Capabilities === The capabilities identified by this type of system are vastly beyond that of our current military. This system will enable the Air Force to provide a transportable network, flexible enough to communicate with any air, space, or ground asset in the area. The network will provide a beyond line-of-sight (LoS) communications infrastructure that can be packed up and moved in and out of the designated battlespace, enabling the military to have a reliable and secure communications network that extends globally. The network is designed to be flexible enough to provide the right communication and network packages for a specific region, mission, or technology. Operationally, The AN is designed to be self-forming, self-organizing, and self-generating, with nodes joining and leaving the network as they enter and exit a specific region. The network consists of dedicated tactical links, wideband air-to-air links, and ad hoc networks constructed by the Joint Tactical Radio System (JTRS) networking services. JTRS is a software-defined radio that will work with many existing military and civilian radios. It includes integrated encryption and Wideband Networking Software to create mobile ad hoc networks. It also provides system performance analysis and fault diagnostics automatically, reducing the demand for human intervention and network maintenance. === Intended Use === The AN was designed as the cornerstone for the new military doctrine known as Network Centric Warfare. This doctrine was developed to use information superiority to equip warfighters with more precise information enabling commanders and shooters to make smarter decisions faster. The AN contributes to Network Centric Warfare by enabling commanders to provide real-time information to warfighters in the air and on the ground. Warfighters can then utilize more information and make more educated decisions about how to act in a particular situation. Once the act has been carried out commanders will have immediate information about the result and can make judgments on how to continue. All-in-all the AN was designed to reduce the time necessary to identify a target, make clear and educated decisions to pull or not to pull the trigger, and assess battle == Topologies == There are four main network topologies that will be deployed and vary based on the placement of backbone and subnet class networks. === Space, Air, Ground Tether === Establishing a direct connection to another aircraft or ground node, via a point-to-point link for nodes within LOS or via a Satellite Communications (SATCOM) link for nodes that are beyond line-of-sight is known as tethering. SATCOM links provide connectivity to a network ground entry point. Strike aircraft that accompany C2 aircraft such as an AWACS are tethered via point-to-point links. Finally, C2 or intelligence, surveillance, and reconnaissnce (ISR) aircraft may connect via a LOS link directly to a network ground entry point. Each of these tethered alternatives works exactly like a hub or switch that has an entry point to a larger network and allows their connected users access to that network. === Flat Ad Hoc === A flat ad hoc topology refers to establishing nonpersistent network connections as needed among AN nodes that are present at a given time. With this network the nodes dynamically “discover” other nodes to which they can interconnect and form the network. The specific interconnections between the nodes are not planned in advance, but are made as opportunities arise. The nodes join and leave the network at will, continually changing connections to neighbor nodes based upon their location and mobility characteristics. === Tiered Ad Hoc === Ad hoc networks can be flat in the sense that all nodes are peers of each other in a single network, as discussed above, or they can dynamically organize themselves into hierarchical tiers such that higher tiers are used to move data between more localized subnets. This network topology can be compared to any conventional deployed network that utilizes routers, switches, and hubs to temporarily connect users. === Persistent Backbone === A network topology characterized by a persistent backbone is established using relatively persistent wideband connections among high-value platforms flying relatively stable orbits. It provides the connectivity between the tactical subnets which are considered edge networks relative to the backbone. This provides concentration points for connectivity to the space backbone as well as to terrestrial networks. This type of network topology is comparable to a conventional permanent network with established data trunks, routers, switches, and hubs to connect users. == Architecture == === Network Management === The platform management system enables operators to manage all on-board network elements. It interfaces and interoperates with the Airborne Network management system to enable operators to manage remote network elements in the airborne network. The network management system monitors the health of the network by passively testing the network for faults and latency. The system will also actively troubleshoot faults with probes to identify and isolate faulty connections, and enables operators to apply network parameters and security changes to all systems based on the status of the network. === Routing/Switching === Routing and switching enables data to be dynamically transmitted over the network to other nodes. Routing protocols must be able to identify nodes transmitted within their own platform and data to be sent to other platforms regardless of the current topology. The routing protocol must also provide seamless roaming by ensuring that no routed packets are lost when a node changes its point of attachment to the network. Maintaining scalability is important in routing as the network is constantly changing. The network must be able to function with numerous levels of platforms, varying numbers of fast moving platforms, and varying amounts of traffic per platform. Routers and switches will use metrics to determine the best paths to take when routing data. The routing protocol utilized for the AN will be an Adaptive Quality of Service routing protocol. === Gateways/Proxies === Gateways and proxies enable the connection numerous technology types regardless of age to communicate across the IP-based network. Gateways and proxies are essential in the operation of this network because so many different technologies are used to communicate in each domain. These systems will facilitate the transition of the legacy on-board infrastructure, transmission systems, tactical data link systems, and user applications to the objective airborne network systems. Therefore, they are only temporary until all platforms use a standardized IP radio for transmission. === Performance Enhancing Proxies === Performance Enhancing Proxies improve the performance of user applications running across the Airborne Network by countering wireless network impairments, such as limited bandwidth, long delays, high loss rates, and disruptions in network connections. Proxy systems are implemented between the user application and the network and can be used to improve performance at the application and transport functional layers of the OSI model. Some techniques that can be employed include: Compression: Data compression or header compression can be used to minimize the number of bits sent over the network. Data bundling: Smaller data packets can be combined (bundled) into a single large packet for transmission over the network. Caching: A local cache can be used to save and provide data objects that are requested multiple times, reducing transmissions over the network (and improving response times). Store and forward: Message queuing can be used to ensure message delivery to users who become disconnected from the network or are unable to connect to the network for a period of time. Once the platform connects, the stored messages are sent. Pipelining: Rather than opening several separate network connections pipelining can be used to share a single networ

    Read more →
  • Shorty Awards

    Shorty Awards

    The Shorty Awards (also known as "The Shortys") are awards for outstanding and innovative work in digital and social media content by brands, advertising agencies, and creators. The awards, which generally focus on short-term content, honor achievements in content creation on Twitter, Facebook, YouTube, Instagram, TikTok, Twitch, and other social networking sites. The Shorty Awards began in 2008 and initially recognized achievements by independent creators on Twitter, with the first formal awards ceremony occurring in February 2009. Since then, the awards, which are now awarded each spring, have shifted their focus to recognize content across numerous platforms. Entrant work is judged on the merits of excellence in creativity, strategy, and engagement by the Real Time Academy, a group of industry professionals selected by the Shorty Awards on the basis of their professional reputations, industry knowledge, and personal achievements (which may include previous Shorty wins). An additional public voting component, known as Audience Honor Voting, is also used to select Shorty Awards contenders. Notable Shorty Award winners include Malala Yousafzai, Trevor Noah, Michelle Obama, Conan O’Brien, Lady Gaga, Bill Nye, Jacob Reed, and Lizzo. Brands and organizations such as Chipotle, Duolingo, Marvel Studios, HBO, Red Bull, Airbnb, Nestle, BMW, UNICEF and the Human Rights Campaign have also been awarded. The Shorty Awards also produces an annual award program called The Shorty Impact Awards, a competition dedicated to showcasing digital and social media-based projects by brands, agencies, and organizations that seek to make the world a better place. == List of ceremonies == == 1st Shorty Awards == The awards were created in 2008 by tech entrepreneurs Greg Galant, Adam Varga, and Lee Semel of Sawhorse Media. They invited Twitter account holders to nominate the best Twitter users in general categories such as humor, news, food, and design. Winners were chosen by more than 30,000 Twitter users during the voting period. The founders of Twitter first heard about the awards after the contest had gotten underway and expressed support for it. The first Shorty Awards ceremony was held on February 11, 2009, at the Galapagos Art Space in Brooklyn, New York. Approximately 300 people attended the event. The event was hosted by CNN anchor Rick Sanchez and featured appearances by prominent Twitter users MC Hammer and Gary Vaynerchuk and a video appearance by Shaquille O'Neal. The awards, in 26 categories, were voted on by Twitter users. == 2nd Shorty Awards == Voting for the second Shorty Awards opened in January 2010 in 26 official categories. A Real-Time Photo of the Year category was added to the list of official categories for the first time, recognizing the best photo posted to services such as Twitpic, Yfrog, or Facebook. The second Shorty Awards competition introduced a panel of judges called the Real-Time Academy of Short Form Arts & Sciences whose members were Craig Newmark, David Pogue, Kurt Andersen, Caterina Fake, Joi Ito, Frank Moss, Alberto Ibargüen, Sreenath Sreenivasan, MC Hammer, Alyssa Milano and Jimmy Wales. After public nominations determined the finalists, the academy decided on the winners. Winners were announced at a ceremony held in the Times Center in The New York Times building in Manhattan that was also streamed online. The ceremony was hosted by CNN anchor Rick Sanchez, who presented awards in the official categories as well as the newly added Real-Time Photo of the Year and a special humanitarian award. == 3rd Shorty Awards == The nomination period for the third annual Shorty Awards opened in January 2011 and ran through February 11, 2011, except for new categories that had extended nomination deadlines. There were 30 official categories and five special categories. In addition to Real-Time Photo of the Year, for the first time the awards accepted nominations for Foursquare Mayor of the Year, Foursquare Location of the Year, Microblog of the Year on Tumblr, and a Connecting People award. The awards also introduced new Shorty Industry Awards to recognize the best uses of social media by brands and agencies. Winners were announced at a ceremony on March 28, 2011, hosted by Aasif Mandvi in the Times Center. Other Shorty Awards presenters were scheduled to include Kiefer Sutherland, Jerry Stiller, Anne Meara, Stephen Wallem, Miss USA Rima Fakih, and Miss Teen USA Kamie Crawford. == 4th Shorty Awards == The 4th Annual Shorty Awards featured Ricky Gervais and Tiffani Thiessen. 1.6 million tweeted nominations were made across all the categories to honor the top users on Twitter, Facebook, Tumblr, Foursquare, YouTube and other internet platforms. == 5th Shorty Awards == The 5th Annual Shorty Awards ceremony featured Felicia Day, James Urbaniak, Kristian Nairn, Hannibal Buress, Carrie Keagan, Chris Hardwick, David Karp and Coco Rocha. 2.4 million tweeted nominations were made across all the categories to honor the top users on Twitter, Facebook, Tumblr, Foursquare, YouTube and other internet sites. == 6th Shorty Awards == The ceremony took place on April 7, 2014, at the New York TimesCenter and was hosted by Comedian Natasha Leggero. The show included appearances by Patton Oswalt, Jamie Oliver, Kristen Bell, Jerry Seinfeld, Moshe Kasher, Julie Klausner, Erin Brady, Guy Kawasaki, Matt Walsh, Retta, Us the Duo, Big Boi, Gilbert Gottfried, Thomas Middleditch, Billie Jean King and Leandra Medine. Winners included Jerry Seinfeld and Will Ferrell. == 7th Shorty Awards == The Seventh Annual Shorty Awards was hosted by comedian Rachel Dratch and took place on April 20, 2015, at The Times Center in NYC. The Real-Time Academy, the judging body of the Shortys, tripled in size for the 7th annual Awards and included Alton Brown, Mamrie Hart, Nikki Glaser, OK Go, The Fine Bros, Debbie Sterling, Dan Savage, Deena Varshavskaya and Palmer Luckey. Panic! at the Disco was the musical guest at the ceremony. On-stage presenters included Kevin Jonas, Bill Nye, Bella Thorne, Wyclef Jean, Emily Kinney and Tyler Oakley. == 8th Shorty Awards == The Eighth Annual Shorty Awards were held in NYC at the TimesCenter on April 11, 2016. They were hosted by YouTuber, Writer and Comedian Mamrie Hart with musical performances from Nico & Vinz. Winners of the night included Bill Wurtz, DJ Khaled, Misty Copeland, Casey Neistat, Dwayne Johnson, Hannah Hart, Troye Sivan, Baddie Winkle, Kevin Hart, Taraji P. Henson, King Bach, and Zach King. == 9th Shorty Awards == The Ninth Annual Shorty Awards were held in NYC at the PlayStation Theater on April 23, 2017. They were hosted by two-time Emmy Award winner Tony Hale with a musical performance by Lizzo. Winners of the night included Bill Nye, Shay Mitchell, Doug the Pug, Gigi Gorgeous, Simone Biles, Mara Wilson, Gaten Matarazzo and Chrissy Teigen. == 10th Shorty Awards == The 10th Annual Shorty Awards, took place on April 15, 2018, at the PlayStation Theater, New York City. The ceremony was hosted by actress, singer, and songwriter Keke Palmer with a musical performance by Betty Who. == 11th Shorty Awards == The 11th Annual Shorty Awards were held on May 5, 2019, in New York City at the PlayStation Theater. The ceremony was hosted by American actress and comedian Kathy Griffin, with a musical performance by Tank and the Bangas. == 12th Shorty Awards == The 12th Annual Shorty Awards were held on May 3, 2020. Due to the COVID-19 pandemic, the ceremony took place online for the first time, with presenters and award winners filming from their own homes. The ceremony was hosted by actor J.B. Smoove and featured a remixed performance of Trap Queen by Fetty Wap. Award winners included Jack Stauber, Supercar Blondie, Rose and Rosie, and Greta Thunberg. == 13th Shorty Awards == The 13th Annual Shorty Awards took place from April 26 to May 14, 2021. The ceremony was hosted on different social media platforms, such as Instagram and Clubhouse, to create a more tailored experience. Winners were announced from May 11 to May 14, with 10 winners being revealed each hour from 1 to 4 p.m. EST on the Shorty Awards Instagram account. == 14th Shorty Awards == The 14th Annual Shorty Awards were held virtually on May 15, 2022, honoring the best in social media and digital content. Hosted by Jay Shetty, the event recognized influencers, brands, and organizations across various categories, celebrating excellence in digital storytelling and innovative online campaigns. Notable winners included Tabitha Brown for her food content and the D'Amelio Family for their contributions to family and parenting content. The event highlighted the role of digital media in connecting and inspiring audiences during challenging times. == 15th Shorty Awards == The 15th Annual Shorty Awards celebrated the best in social media and digital content on May 24, 2023, at Tribeca 360° in New York City. Hosted by Jay Pharoah, the event honored creators, brands, and organizations ac

    Read more →
  • ISO 15765-2

    ISO 15765-2

    ISO 15765-2, or ISO-TP (Transport Layer), is an international standard for sending data packets over a CAN bus. The protocol allows for the transport of messages that exceed the eight byte maximum payload of CAN frames. ISO-TP segments longer messages into multiple frames, adding metadata (CAN-TP Header) that allows the interpretation of individual frames and reassembly into a complete message packet by the recipient. It can carry up to 232-1 (4294967295) bytes of payload per message packet starting from the 2016 version. Prior versions were limited to a maximum payload size of 4095 bytes. In the OSI model, ISO-TP covers the layer 3 (network layer) and 4 (transport layer). The most common application for ISO-TP is the transfer of diagnostic messages with OBD-II equipped vehicles using KWP2000 and UDS, but is used broadly in other application-specific CAN implementations where one might need to send messages longer than what the CAN protocol physical layer allows (eight bytes for CAN, 64 bytes for CAN FD, and 2048 bytes for CAN-XL). ISO-TP can be operated with its own addressing as so-called Extended Addressing or without address using only the CAN ID (so-called Normal Addressing). Extended addressing uses the first data byte of each frame as an additional element of the address, reducing the application payload by one byte. For clarity the protocol description below is based on Normal Addressing with eight byte CAN frames. In total, six types of addressing are allowed by the ISO 15765-2 Protocol. ISO-TP prepends one or more metadata bytes to the payload data in the eight byte CAN frame, reducing the payload to seven or fewer bytes per frame. The metadata is called the Protocol Control Information, or PCI. The PCI is one, two or three bytes. The initial field is four bits indicating the frame type, and implicitly describing the PCI length. ISO 15765-2 is a part of ISO 15765 (headlined Road vehicles — Diagnostic communication over Controller Area Network (DoCAN)), which has the following parts: ISO 15765-1 Part 1: General information and use case definition ISO 15765-2 Part 2: Transport protocol and network layer services ISO 15765-3 Part 3: Implementation of unified diagnostic services (UDS on CAN) – replaced by ISO 14229-3 Road vehicles — Unified diagnostic services ISO 15765-4 Part 4: Requirements for emissions-related systems == List of protocol control information (PCI) field types == The ISO-TP defines four frame types: A message of seven bytes or less is sent in a single frame, with the initial byte containing the type (0) and payload length (1-7 bytes). With the 0 in the type field, this can also pass as a simpler protocol with a length-data format and is often misinterpreted as such. A message longer than 7 bytes requires segmenting the message packet over multiple frames. A segmented transfer starts with a First Frame. The PCI is two bytes in this case, with the first 4 bit field the type (type 1) and the following 12 bits the message length (excluding the type and length bytes). The recipient confirms the transfer with a flow control frame. The flow control frame has three PCI bytes specifying the interval between subsequent frames and how many consecutive frames may be sent (Block Size). For CAN FD, the ISO 15765-2 protocol has been extended for Single and First frame, to allow larger size values, but still backwards compatible with traditional ISO 15765. See CAN FD. The initial byte contains the type (type = 3) in the first four bits, and a flag in the next four bits indicating if the transfer is allowed (0 = Continue To Send, 1 = Wait, 2 = Overflow/abort). The next byte is the block size, the count of frames that may be sent before waiting for the next flow control frame. A value of zero allows the remaining frames to be sent without flow control or delay. The third byte is the minimum Separation Time (STmin), the minimum delay time between frames. STmin values up to 127 (0x7F) specify the minimum number of milliseconds to delay between frames, while values in the range 241 (0xF1) to 249 (0xF9) specify delays increasing from 100 to 900 microseconds. Note that the Separation Time is defined as the minimum time between the end of one frame to the beginning of the next. Robust implementations should be prepared to accept frames from a sender that misinterprets this as the frame repetition rate i.e. from start-of-frame to start-of-frame. Even careful implementations may fail to account for the minor effect of bit-stuffing in the physical layer. The sender transmits the rest of the message using Consecutive Frames. Each Consecutive Frame has a one byte PCI, with a four bit type (type = 2) followed by a 4-bit sequence number. The sequence number starts at 1 and increments with each frame sent (1, 2,..., F, 0, 1,...), with which lost or discarded frames can be detected. Each consecutive frame starts at 0, initially for the first set of data in the first frame will be considered as 0th data. So the first set of CF(Consecutive frames) start from 0x1. There afterwards when it reaches 0x2F, will be started from 0x20 (e.g. 0x21, 0x22, 0x23...0x2F, 0x20, 0x21...). The 12-bit length field (as indicated in the First Frame) allows up to 4095 bytes of user data in a segmented message, but in practice the typical application-specific limit is considerably lower because of receive buffer or hardware limitations. == Timing parameters == Timing parameters, such as P1 and P2 timers, have to be mentioned. == Standards == ISO 15765-2:2016 Road vehicles -- Diagnostic communication over Controller Area Network (DoCAN) -- Part 2: Transport protocol and network layer services

    Read more →
  • Engineering Historical Memory

    Engineering Historical Memory

    Engineering Historical Memory (EHM) is an online database in the digital humanities, serving as an open-access research tool for primary historical materials focused on 11th to 15th century Afro-Eurasia. It adopts computational methods to make historical documents machine-understandable. EHM parses traditional artifacts such as historical maps, travel accounts, chronicles and codices into computer-readable formats, and links them to secondary multi-media references, a process referred to as the "automatic narrative generation". This approach generates cultural narratives and facilitates interaction with the historical artifacts, making them accessible to audiences from various backgrounds. == History == EHM was first theorised in 2007 by researcher Andrea Nanetti when he was a visiting scholar at Princeton University, and the preliminary test results were published between 2008 and 2011. In 2013, the EHM research team was set up in Singapore following Nanetti's professorship at Nanyang Technological University (NTU). Two years later, after receiving several Microsoft research grants, EHM went live on Microsoft Azure. In 2018, the College of Humanities, Arts and Social Sciences (CoHASS) at NTU Singapore formed the Digital Humanities Research Cluster, as part of which, EHM has been an ongoing interdisciplinary research project led by Nanetti. Partnering with international educational and cultural institutions such as Ca' Foscari University of Venice, University of Florence, Taylor & Francis Group, Delft University of Technology (TUDelft), and SenticNet, EHM has been supported by over 130 scholars and engineers. == Applications == Primary historical materials on EHM are curated into several categories, including maps, travel accounts, chronicles, codices, sites, archival documents, and paintings, such as the Morosini Codex (listed under Chronicles) and Pope Gregory X's Privilege for the Holy Monastery of St Catherine of Sinai (listed under Archival Documents). EHM has been adopted by cultural organisations as an exhibition and research tool in the digital humanities field. An example is the publication of a digital interactive edition of Fra Mauro's Map of the World on EHM, a collaboration project between NTU Singapore and the Biblioteca Nazionale Marciana of Venice. The digitisation process of the map on EHM involved transcribing and geo-referencing the textual content in the 15th-century map, followed by creating semantic annotations to connect the map's content with related secondary data sources. The e-map was subsequently adopted and launched online by Museo Galileo in March 2022 and incorporated into the virtual exhibition "Venezia and Suzhou: Water Cities along the Silk Roads" (online, September-December 2022). In 2024, the Fra Mauro's Map of the World application on EHM was awarded the Digital Humanities and Multimedia Studies Prize (DHMS) by the Medieval Academy of America. Image-Based Video Search Engine is another experimental project under the EHM scope led by the research teams at Delft University of Technology (TUDelft) and NTU Singapore. This ongoing project aims to improve the efficiency of retrieving targeted objects from audio-visuals. == Awards == In 2021, EHM won the GLAMi Awards (MuseWeb Conference - Galleries, Libraries, Archives, and Museums Innovation awards) in the "Resources for Scholars and Researchers" category. In the same year, EHM was a Falling Walls finalist for Science Breakthrough of the Year in the category Social Sciences and Humanities after nominated by the School of Advanced Study at the University of London. In April 2022, the Italian National Commission for UNESCO has selected and sent the EHM project to the organisers of the "Jikji Memory of the World" Award for final evaluation. In January 2024, the Medieval Academy of America announced its 2024 Digital Humanities and Multimedia Studies Prize (DHMS) goes to the Fra Mauro's Map of the World application on EHM.

    Read more →
  • Content management

    Content management

    Content management (CM) are a set of processes and technologies that support the collection, managing, and publishing of information in any form or medium. When stored and accessed via computers, this information may be more specifically referred to as digital content, or simply as content. Digital content may take the form of text (such as electronic documents), images, multimedia files (such as audio or video files), or any other file type that follows a content lifecycle requiring management. The process of content development and management is complex enough that various commercial software vendors (large and small), such as Interwoven and Microsoft, offer content management software to control and automate significant aspects of the content lifecycle. == Process == Content management practices and goals vary by mission and by organizational governance structure. News organizations, e-commerce websites, and educational institutions all use content management, but in different ways. This leads to differences in terminology and in the names and number of steps in the process. For example, some digital content is created by one or more authors. Over time that content may be edited. One or more individuals may provide some editorial oversight, approving the content for publication. Publishing may take many forms: it may be the act of "pushing" content out to others, or simply granting digital access rights to certain content to one or more individuals. Later that content may be superseded by another version of the content and thus retired or removed from use (as when this wiki page is modified). Content management is an inherently collaborative process. It often consists of the following basic roles and responsibilities: Creator – responsible for creating and editing content. Editor – responsible for tuning the content message and the style of delivery, including translation and localization. Publisher – responsible for releasing the content for use. Administrator – responsible for managing access permissions to folders, collections and files, usually accomplished by assigning access rights to user groups or roles. Admins may also assist and support users in various ways. Consumer, viewer or guest – the person who reads or otherwise consumes the content after it is published or shared. A critical aspect of content management is the ability to manage versions of content as it evolves (see also version control). Authors and editors often need to restore older versions of edited products due to a process failure or an undesirable series of edits. Time-sensitive content may also require updates as the subject matter evolves over time. Another equally important aspect of content management involves the creation, maintenance, and application of review standards. Each member of the content creation and review process has a unique role and set of responsibilities in the development or publication of the content. Each review team member requires clear and concise review standards. These must be maintained on an ongoing basis to ensure the long-term consistency and health of the knowledge base. A content management system is a set of automated processes that may support the following features: Import and creation of documents and multimedia material Identification of all key users and their roles The ability to assign roles and responsibilities to different instances of content categories or types Definition of workflow tasks often coupled with messaging so that content managers are alerted to changes in content The ability to track and manage multiple versions of a single instance of content The ability to publish the content to a repository to support access The ability to personalize content based on a set of rules Increasingly, the repository is an inherent part of the system, and incorporates enterprise search and retrieval. Content management systems take the following forms: Web content management system—software for web site management (often what content management implicitly means) Output of a newspaper editorial staff organization Workflow for article publication Document management systems Knowledge management software Single source content management system—content stored in chunks within a relational database Variant management system—where personnel tag source content (usually text and graphics) to represent variants stored as single source "master" content modules, resolved to the desired variant at publication (for example: automobile owners manual content for 12 model years stored as single master content files and "called" by model year as needed)—often used in concert with database chunk storage (see above) for large content objects == Governance structures == Content management expert Marc Feldman defines three primary content management governance structures: localized, centralized, and federated—each having its unique strengths and weaknesses. === Localized governance === By putting control in the hands of those closest to the content, the context experts, localized governance models empower and unleash creativity. These benefits come, however, at the cost of a partial-to-total loss of managerial control and oversight. === Centralized governance === When the levers of control are strongly centralized, content management systems are capable of delivering an exceptionally clear and unified brand message. Moreover, centralized content management governance structures allow for a large number of cost-savings opportunities in large enterprises, realized, for example, through (1) the avoidance of duplicated efforts in creating, editing, formatting, repurposing and archiving content; (2) process management and the streamlining of all content related labor; and/or (3) an orderly deployment or updating of the content management system. === Federated governance === Federated governance models potentially realize the benefits of both localized and centralized control while avoiding the weaknesses of both. While content management software systems are inherently structured to enable federated governance models, realizing these benefits can be difficult because it requires, for example, negotiating the boundaries of control with local managers and content creators. In the case of larger enterprises, in particular, the failure to fully implement or realize a federated governance structure equates to a failure to realize the full return on investment and cost savings that content management systems enable. == Implementation == Content management implementations must be able to manage content distributions and digital rights in content life cycle. Content management systems are usually involved with digital rights management in order to control user access and digital rights. In this step, the read-only structures of digital rights management systems force some limitations on content management, as they do not allow authors to change protected content in their life cycle. Creating new content using managed (protected) content is also an issue that gets protected contents out of management controlling systems. A few content management implementations cover all these issues.

    Read more →
  • Thirst trap

    Thirst trap

    A thirst trap is a type of social media post intended to entice viewers sexually. It refers to a viewer's "thirst", a colloquialism likening sexual frustration to dehydration, implying desperation, with the afflicted individual being described as "thirsty". The phrase entered into the lexicon in the late 1990s, but is most related to Internet slang that developed in the early 2010s. Its meaning has changed over time, previously referring to a graceless need for approval, affection or attention. == History == The term thirst trap originated within selfie culture, though its precise origins remain unclear. An early use of the phrase with reference to dehydration appears in the 1999 book Running for Dummies by Florence Griffith Joyner and John Hanc, where it referred to the deceptive sensation of thirst being quenched after initial fluid intake, advising continued hydration to avoid the so-called "thirst trap." The modern usage of thirst trap resurfaced around 2011 on platforms such as Twitter and Urban Dictionary, coinciding with the growing popularity of Snapchat, Instagram, and dating apps like Tinder and Grindr. In 2011, Urban Dictionary defined it as "any statement used to intentionally create attention or 'thirst'." By 2018, the term had entered mainstream discourse, appearing in outlets such as The New York Times and GQ without the need for explanation. == Usage of the term == Often, the term thirst trap describes an attractive picture of an individual that they post online. Thirst trap can also describe a digital heartthrob. For instance, former Canadian prime minister Justin Trudeau has been described as a political thirst trap. It has also been described as a modern form of "fishing for compliments". == Motivation == Thirst trapping may be driven by a variety of motives. Individuals often seek attention through "likes" and comments on social media, which can offer a temporary sense of validation and improved self-esteem. It can also serve as an outlet for expressing one's sexuality or enhancing a personal brand. In some cases, sharing such content may provide financial gain. Others might post thirst traps to cope with emotional distress, such as after breakup, or to spite a former lover. Sharing a thirst trap has also been used as a way to connect in times of social isolation (e.g. COVID-19 pandemic). From a physiological standpoint, endorphins and neurotransmitters like oxytocin and dopamine are released during sexual contact. It has been speculated outside of the academic setting that sharing and engaging with thirst traps may elicit similar pleasure responses. == Methodology == Methodologies have developed to take an optimal thirst trap photo. Reporting for Vice magazine, Graham Isador found several of his social network contacts spent a lot of time considering how to take the best photo and what text they should use. They considered angles and lighting. Sometimes they made use of the self-timer feature available on some cameras. Often, body parts are put on display without being too explicit (e.g. bulges of male genitalia, breast cleavage, abdominal muscles, pectoral muscles, backs, buttocks). Often, the thirst trap is accompanied by a caption. For instance, in October 2019, actress Tracee Ellis Ross posted bikini pictures on Instagram with a caption that included the message: "I've worked so hard to feel good in my skin and to build a life that truly matches me and I'm in it and it feels good. ... No filter, no retouch 47 year old thirst trap! Boom!" On Instagram, #ThirstTrapThursdays is a popular tag. Followers reply in turn after a posting. == Variations == "Gatsbying" is a variation of the thirst trap, where one puts posts on social media to attract the attention of a particular individual. The term alludes to the novel The Great Gatsby where the character Jay Gatsby would throw extravagant parties to attract the attention of his love interest, Daisy. "Instagrandstanding" is an alternative name for this. "Wholesome trapping" has developed, where one posts pictures of more meaningful aspects of life, such as spending time with friends or doing outdoor activities. == Criticism == Psychotherapist Lisa Brateman has criticized thirst traps as an unhealthy method of receiving external validation. This desire for external validation can be addictive. Thirst traps can cause pressure to maintain a good physical appearance, and therefore cause self-esteem issues. Additionally, thirst traps are often highly choreographed and thus present a distorted perception of reality. The manufacturing of thirst traps can be limited when one enters a relationship or with time as the body ages. In some cases, thirst traps can lead to harassment and online bullying. In April 2020, model Chrissy Teigen posted a video of herself wearing a black one-piece swimsuit, and she received a multitude of negative comments that constituted bullying and body shaming.

    Read more →
  • Key (cryptography)

    Key (cryptography)

    A key in cryptography is a piece of information, usually a string of numbers or letters that are stored in a file, which, when processed through a cryptographic algorithm, can encode or decode cryptographic data. Based on the used method, the key can be different sizes and varieties, but in all cases, the strength of the encryption relies on the security of the key being maintained. A key's security strength is dependent on its algorithm, the size of the key, the generation of the key, and the process of key exchange. == Scope == The key is what is used to encrypt data from plaintext to ciphertext. There are different methods for utilizing keys and encryption. === Symmetric cryptography === Symmetric cryptography refers to the practice of the same key being used for both encryption and decryption. === Asymmetric cryptography === Asymmetric cryptography has separate keys for encrypting and decrypting. These keys are known as the public and private keys, respectively. == Purpose == Since the key protects the confidentiality and integrity of the system, it is important to be kept secret from unauthorized parties. With public key cryptography, only the private key must be kept secret, but with symmetric cryptography, it is important to maintain the confidentiality of the key. Kerckhoff's principle states that the entire security of the cryptographic system relies on the secrecy of the key. == Key sizes == Key size is the number of bits in the key defined by the algorithm. This size defines the upper bound of the cryptographic algorithm's security. The larger the key size, the longer it will take before the key is compromised by a brute force attack. Since perfect secrecy is not feasible for key algorithms, researches are now more focused on computational security. In the past, keys were required to be a minimum of 40 bits in length, however, as technology advanced, these keys were being broken quicker and quicker. As a response, restrictions on symmetric keys were enhanced to be greater in size. Currently, 2048 bit RSA is commonly used, which is sufficient for current systems. However, current RSA key sizes would all be cracked quickly with a powerful quantum computer. "The keys used in public key cryptography have some mathematical structure. For example, public keys used in the RSA system are the product of two prime numbers. Thus public key systems require longer key lengths than symmetric systems for an equivalent level of security. 3072 bits is the suggested key length for systems based on factoring and integer discrete logarithms which aim to have security equivalent to a 128 bit symmetric cipher." == Key generation == To prevent a key from being guessed, keys need to be generated randomly and contain sufficient entropy. The problem of how to safely generate random keys is difficult and has been addressed in many ways by various cryptographic systems. A key can directly be generated by using the output of a Random Bit Generator (RBG), a system that generates a sequence of unpredictable and unbiased bits. A RBG can be used to directly produce either a symmetric key or the random output for an asymmetric key pair generation. Alternatively, a key can also be indirectly created during a key-agreement transaction, from another key or from a password. Some operating systems include tools for "collecting" entropy from the timing of unpredictable operations such as disk drive head movements. For the production of small amounts of keying material, ordinary dice provide a good source of high-quality randomness. == Establishment scheme == The security of a key is dependent on how a key is exchanged between parties. Establishing a secured communication channel is necessary so that outsiders cannot obtain the key. A key establishment scheme (or key exchange) is used to transfer an encryption key among entities. Key agreement and key transport are the two types of a key exchange scheme that are used to be remotely exchanged between entities . In a key agreement scheme, a secret key, which is used between the sender and the receiver to encrypt and decrypt information, is set up to be sent indirectly. All parties exchange information (the shared secret) that permits each party to derive the secret key material. In a key transport scheme, encrypted keying material that is chosen by the sender is transported to the receiver. Either symmetric key or asymmetric key techniques can be used in both schemes. The Diffie–Hellman key exchange and Rivest-Shamir-Adleman (RSA) are the most two widely used key exchange algorithms. In 1976, Whitfield Diffie and Martin Hellman constructed the Diffie–Hellman algorithm, which was the first public key algorithm. The Diffie–Hellman key exchange protocol allows key exchange over an insecure channel by electronically generating a shared key between two parties. On the other hand, RSA is a form of the asymmetric key system which consists of three steps: key generation, encryption, and decryption. Key confirmation delivers an assurance between the key confirmation recipient and provider that the shared keying materials are correct and established. The National Institute of Standards and Technology recommends key confirmation to be integrated into a key establishment scheme to validate its implementations. == Management == Key management concerns the generation, establishment, storage, usage and replacement of cryptographic keys. A key management system (KMS) typically includes three steps of establishing, storing and using keys. The base of security for the generation, storage, distribution, use and destruction of keys depends on successful key management protocols. == Key vs password == A password is a memorized series of characters including letters, digits, and other special symbols that are used to verify identity. It is often produced by a human user or a password management software to protect personal and sensitive information or generate cryptographic keys. Passwords are often created to be memorized by users and may contain non-random information such as dictionary words. On the other hand, a key can help strengthen password protection by implementing a cryptographic algorithm which is difficult to guess or replace the password altogether. A key is generated based on random or pseudo-random data and can often be unreadable to humans. A password is less safe than a cryptographic key due to its low entropy, randomness, and human-readable properties. However, the password may be the only secret data that is accessible to the cryptographic algorithm for information security in some applications such as securing information in storage devices. Thus, a deterministic algorithm called a key derivation function (KDF) uses a password to generate the secure cryptographic keying material to compensate for the password's weakness. Various methods such as adding a salt or key stretching may be used in the generation.

    Read more →
  • Airfair

    Airfair

    AirFair was a mobile travel application that checks flights, and shows whether a traveler is owed compensation. == History == AirFair was developed in 2016 by Allay Logic Ltd; a Newcastle-based tech-company. == Services == AirFair offered a free flight check to see if compensation is owed. The app could indicate how much the person is owed within minutes whether the flight was delayed, cancelled or the traveler is refused boarding.

    Read more →
  • POODLE

    POODLE

    POODLE (which stands for "Padding Oracle On Downgraded Legacy Encryption") is a security vulnerability which takes advantage of the fallback to SSL 3.0. If attackers successfully exploit this vulnerability, on average, they only need to make 256 SSL 3.0 requests to reveal one byte of encrypted messages. Bodo Möller, Thai Duong and Krzysztof Kotowicz from the Google Security Team discovered this vulnerability; they disclosed the vulnerability publicly on October 14, 2014 (despite the paper being dated "September 2014"). On December 8, 2014, a variation of the POODLE vulnerability that affected TLS was announced. The CVE-ID associated with the original POODLE attack is CVE-2014-3566. F5 Networks filed for CVE-2014-8730 as well, see POODLE attack against TLS section below. == Prevention == To mitigate the POODLE attack, one approach is to completely disable SSL 3.0 on the client side and the server side. However, some old clients and servers do not support TLS 1.0 and above. Thus, the authors of the paper on POODLE attacks also encourage browser and server implementation of TLS_FALLBACK_SCSV, which will make downgrade attacks impossible. Another mitigation is to implement "anti-POODLE record splitting". It splits the records into several parts and ensures none of them can be attacked. However the problem of the splitting is that, though valid according to the specification, it may also cause compatibility issues due to problems in server-side implementations. A full list of browser versions and levels of vulnerability to different attacks (including POODLE) can be found in the article Transport Layer Security. Opera 25 implemented this mitigation in addition to TLS_FALLBACK_SCSV. Google's Chrome browser and their servers had already supported TLS_FALLBACK_SCSV. Google stated in October 2014 it was planning to remove SSL 3.0 support from their products completely within a few months. Fallback to SSL 3.0 has been disabled in Chrome 39, released in November 2014. SSL 3.0 has been disabled by default in Chrome 40, released in January 2015. Mozilla disabled SSL 3.0 in Firefox 34 and ESR 31.3, which were released in December 2014, and added support of TLS_FALLBACK_SCSV in Firefox 35. Microsoft published a security advisory to explain how to disable SSL 3.0 in Internet Explorer and Windows OS, and on October 29, 2014, Microsoft released a fix which disables SSL 3.0 in Internet Explorer on Windows Vista / Server 2003 and above and announced a plan to disable SSL 3.0 by default in their products and services within a few months. Microsoft disabled fallback to SSL 3.0 in Internet Explorer 11 for Protect Mode sites on February 10, 2015, and for other sites on April 14, 2015. Apple's Safari (on OS X 10.8, iOS 8.1 and later) mitigated against POODLE by removing support for all CBC protocols in SSL 3.0, however, this left RC4 which is also completely broken by the RC4 attacks in SSL 3.0. POODLE was completely mitigated in OS X 10.11 (El Capitan 2015) and iOS 9 (2015). To prevent the POODLE attack, some web services dropped support of SSL 3.0. Examples include CloudFlare and Wikimedia. Network Security Services version 3.17.1 (released on October 3, 2014) and 3.16.2.3 (released on October 27, 2014) introduced support for TLS_FALLBACK_SCSV, and NSS will disable SSL 3.0 by default in April 2015. OpenSSL versions 1.0.1j, 1.0.0o and 0.9.8zc, released on October 15, 2014, introduced support for TLS_FALLBACK_SCSV. LibreSSL version 2.1.1, released on October 16, 2014, disabled SSL 3.0 by default. == POODLE attack against TLS == A new variant of the original POODLE attack was announced on December 8, 2014. This attack exploits implementation flaws of CBC encryption mode in the TLS 1.0 - 1.2 protocols. Even though TLS specifications require servers to check the padding, some implementations fail to validate it properly, which makes some servers vulnerable to POODLE even if they disable SSL 3.0. SSL Pulse showed "about 10% of the servers are vulnerable to the POODLE attack against TLS" before this vulnerability was announced. The CVE-ID for F5 Networks' implementation bug is CVE-2014-8730. The entry in NIST's NVD states that this CVE-ID is to be used only for F5 Networks' implementation of TLS, and that other vendors whose products have the same failure to validate the padding mistake in their implementations like A10 Networks and Cisco Systems need to issue their own CVE-IDs for their implementation errors because this is not a flaw in the protocol but in the implementation. The POODLE attack against TLS was found to be easier to initiate than the initial POODLE attack against SSL. There is no need to downgrade clients to SSL 3.0, meaning fewer steps are needed to execute a successful attack.

    Read more →
  • Radical trust

    Radical trust

    Radical trust is the confidence that any structured organization, such as a government, library, business, religion, or museum, has in collaboration and empowerment within online communities. Specifically, it pertains to the use of blogs, wiki and online social networking platforms by organizations to cultivate relationships with an online community that then can provide feedback and direction for the organization's interest. The organization 'trusts' and uses that input in its management. One of the first appearances of the notion of radical trust appears in an info graphic outlining the base principles of web 2.0 in Tim O'Reilly's weblog post "What is Web 2.0". Radical Trust is listed as the guiding example of trusting the validity of consumer generated media. This concept is considered to be an underlying assumption of Library 2.0. The adoption of radical trust by a library would require its management let go of some of its control over the library and building an organization without an end result in mind. The direction a library would take would be based on input provided by people through online communities. These changes in the organization may merely be anecdotal in nature, making this method of organization management dramatically distinct from data-based or evidence based management. In marketing, Collin Douma further describes the notion of radical trust as a key mindset required for marketers and advertisers to enter the social media marketing space. Conventional marketing dictates and maintains control of messages to cause the greatest persuasion in consumer decisions, but Douma argued that in the social media space, brands would need to cede that control in order to build brand loyalty.

    Read more →
  • Data profiling

    Data profiling

    Data profiling is the process of examining the data available from an existing information source (e.g. a database or a file) and collecting statistics or informative summaries about that data. The purpose of these statistics may be to: Find out whether existing data can be easily used for other purposes Improve the ability to search data by tagging it with keywords, descriptions, or assigning it to a category Assess data quality, including whether the data conforms to particular standards or patterns Assess the risk involved in integrating data in new applications, including the challenges of joins Discover metadata of the source database, including value patterns and distributions, key candidates, foreign-key candidates, and functional dependencies Assess whether known metadata accurately describes the actual values in the source database Understanding data challenges early in any data intensive project, so that late project surprises are avoided. Finding data problems late in the project can lead to delays and cost overruns. Have an enterprise view of all data, for uses such as master data management, where key data is needed, or data governance for improving data quality. == Introduction == Data profiling refers to the analysis of information for use in a data warehouse in order to clarify the structure, content, relationships, and derivation rules of the data. Profiling helps to not only understand anomalies and assess data quality, but also to discover, register, and assess enterprise metadata. The result of the analysis is used to determine the suitability of the candidate source systems, usually giving the basis for an early go/no-go decision, and also to identify problems for later solution design. == How data profiling is conducted == Data profiling utilizes methods of descriptive statistics such as minimum, maximum, mean, mode, percentile, standard deviation, frequency, variation, aggregates such as count and sum, and additional metadata information obtained during data profiling such as data type, length, discrete values, uniqueness, occurrence of null values, typical string patterns, and abstract type recognition. The metadata can then be used to discover problems such as illegal values, misspellings, missing values, varying value representation, and duplicates. Different analyses are performed for different structural levels. E.g. single columns could be profiled individually to get an understanding of frequency distribution of different values, type, and use of each column. Embedded value dependencies can be exposed in a cross-columns analysis. Finally, overlapping value sets possibly representing foreign key relationships between entities can be explored in an inter-table analysis. Normally, purpose-built tools are used for data profiling to ease the process. The computational complexity increases when going from single column, to single table, to cross-table structural profiling. Therefore, performance is an evaluation criterion for profiling tools. == When is data profiling conducted? == According to Kimball, data profiling is performed several times and with varying intensity throughout the data warehouse developing process. A light profiling assessment should be undertaken immediately after candidate source systems have been identified and DW/BI business requirements have been satisfied. The purpose of this initial analysis is to clarify at an early stage if the correct data is available at the appropriate detail level and that anomalies can be handled subsequently. If this is not the case the project may be terminated. Additionally, more in-depth profiling is done prior to the dimensional modeling process in order assess what is required to convert data into a dimensional model. Detailed profiling extends into the ETL system design process in order to determine the appropriate data to extract and which filters to apply to the data set. Additionally, data profiling may be conducted in the data warehouse development process after data has been loaded into staging, the data marts, etc. Conducting data at these stages helps ensure that data cleaning and transformations have been done correctly and in compliance of requirements. == Benefits and examples == Data profiling can improve data quality, shorten the implementation cycle of major projects, and improve users' understanding of data. Discovering business knowledge embedded in data itself is one of the significant benefits derived from data profiling. It can improve data accuracy in corporate databases.

    Read more →
  • 80 Million Tiny Images

    80 Million Tiny Images

    80 Million Tiny Images is a dataset intended for training machine-learning systems constructed by Antonio Torralba, Rob Fergus, and William T. Freeman in a collaboration between MIT and New York University. It was published in 2008. The dataset has size 760 GB. It contains 79,302,017 32×32-pixel color images, scaled down from images scraped from the World Wide Web over 8 months. The images are classified into 75,062 classes. Each class is a non-abstract noun in WordNet. Images may appear in more than one class. The dataset was motivated by non-parametric models of neural activations in the visual cortex upon seeing images. The CIFAR-10 dataset uses a subset of the images in this dataset, but with independently generated labels, as the original labels were not reliable. The CIFAR-10 set has 6000 examples of each of 10 classes, and the CIFAR-100 set has 600 examples of each of 100 non-overlapping classes. == Construction == It was first reported in a technical report in April 2007, during the middle of the construction process, when there were only 73 million images. The full dataset was published in 2008. They began with all 75,846 non-abstract nouns in WordNet, and then for each of these nouns, they scraped 7 image search engines: Altavista, Ask.com, Flickr, Cydral, Google, Picsearch, and Webshots. After 8 months of scraping, they obtained 97,245,098 images. Since they did not have enough storage, they downsized the images to 32×32 as they were scraped. After gathering, they removed images with zero variance and intra-word duplicate images, resulting in the final dataset. Out of the 75,846 nouns, only 75,062 classes had any results, so the other nouns did not appear in the final dataset. The number of images per noun follows a Zipf-like distribution, with 1056 images per noun on average. To prevent a few nouns taking up too many images, they put an upper bound of at most 3000 images per noun. == Retirement == The 80 Million Tiny Images dataset was retired from use by its creators in 2020, after a paper by researchers Abeba Birhane and Vinay Prabhu found that some of the labeling of several publicly available image datasets, including 80 Million Tiny Images, contained racist and misogynistic slurs which were causing models trained on them to exhibit racial and sexual bias. The dataset also contained offensive images. Following the release of the paper, the dataset's creators removed the dataset from distribution, and requested that other researchers not use it for further research and to delete their copies of the dataset.

    Read more →
  • Data profiling

    Data profiling

    Data profiling is the process of examining the data available from an existing information source (e.g. a database or a file) and collecting statistics or informative summaries about that data. The purpose of these statistics may be to: Find out whether existing data can be easily used for other purposes Improve the ability to search data by tagging it with keywords, descriptions, or assigning it to a category Assess data quality, including whether the data conforms to particular standards or patterns Assess the risk involved in integrating data in new applications, including the challenges of joins Discover metadata of the source database, including value patterns and distributions, key candidates, foreign-key candidates, and functional dependencies Assess whether known metadata accurately describes the actual values in the source database Understanding data challenges early in any data intensive project, so that late project surprises are avoided. Finding data problems late in the project can lead to delays and cost overruns. Have an enterprise view of all data, for uses such as master data management, where key data is needed, or data governance for improving data quality. == Introduction == Data profiling refers to the analysis of information for use in a data warehouse in order to clarify the structure, content, relationships, and derivation rules of the data. Profiling helps to not only understand anomalies and assess data quality, but also to discover, register, and assess enterprise metadata. The result of the analysis is used to determine the suitability of the candidate source systems, usually giving the basis for an early go/no-go decision, and also to identify problems for later solution design. == How data profiling is conducted == Data profiling utilizes methods of descriptive statistics such as minimum, maximum, mean, mode, percentile, standard deviation, frequency, variation, aggregates such as count and sum, and additional metadata information obtained during data profiling such as data type, length, discrete values, uniqueness, occurrence of null values, typical string patterns, and abstract type recognition. The metadata can then be used to discover problems such as illegal values, misspellings, missing values, varying value representation, and duplicates. Different analyses are performed for different structural levels. E.g. single columns could be profiled individually to get an understanding of frequency distribution of different values, type, and use of each column. Embedded value dependencies can be exposed in a cross-columns analysis. Finally, overlapping value sets possibly representing foreign key relationships between entities can be explored in an inter-table analysis. Normally, purpose-built tools are used for data profiling to ease the process. The computational complexity increases when going from single column, to single table, to cross-table structural profiling. Therefore, performance is an evaluation criterion for profiling tools. == When is data profiling conducted? == According to Kimball, data profiling is performed several times and with varying intensity throughout the data warehouse developing process. A light profiling assessment should be undertaken immediately after candidate source systems have been identified and DW/BI business requirements have been satisfied. The purpose of this initial analysis is to clarify at an early stage if the correct data is available at the appropriate detail level and that anomalies can be handled subsequently. If this is not the case the project may be terminated. Additionally, more in-depth profiling is done prior to the dimensional modeling process in order assess what is required to convert data into a dimensional model. Detailed profiling extends into the ETL system design process in order to determine the appropriate data to extract and which filters to apply to the data set. Additionally, data profiling may be conducted in the data warehouse development process after data has been loaded into staging, the data marts, etc. Conducting data at these stages helps ensure that data cleaning and transformations have been done correctly and in compliance of requirements. == Benefits and examples == Data profiling can improve data quality, shorten the implementation cycle of major projects, and improve users' understanding of data. Discovering business knowledge embedded in data itself is one of the significant benefits derived from data profiling. It can improve data accuracy in corporate databases.

    Read more →
  • IBM 37xx

    IBM 37xx

    IBM 37xx (or 37x5) is a family of IBM Systems Network Architecture (SNA) programmable front-end processors used mainly in mainframe environments. All members of the family ran one of three IBM-supplied programs. Emulation Program (EP) mimicked the operation of the older IBM 270x non-programmable controllers. Network Control Program (NCP) supported Systems Network Architecture devices. Partitioned Emulation Program (PEP) combined the functions of the two. == Models == === 370x series === 3705 — the oldest of the family, introduced in 1972 to replace the non-programmable IBM 270x family. The 3705 could control up to 352 communications lines. 3704 was a smaller version, introduced in 1973. It supported up to 32 lines. === 371x === The 3710 communications controller was introduced in 1984. === 372x series === The 3725 and the 3720 systems were announced in 1983. The 3725 replaced the hardware line scanners used on previous 370x machines with multiple microcoded processors. The 3725 was a large-scale node and front end processor. The 3720 was a smaller version of the 3725, which was sometimes used as a remote concentrator. The 3726 was an expansion unit for the 3725. With the expansion unit, the 3725 could support up to 256 lines at data rates up to 256 kbit/s, and connect to up to eight mainframe channels. Marketing of the 372x machines was discontinued in 1989. IBM discontinued support for the 3705, 3720, 3725 in 1999. === 374x series === The 3745, announced in 1988, provides up to eight T1 circuits. At the time of the announcement, IBM was estimated to have nearly 85% of the over US$825 million market for communications controllers over rivals such as NCR Comten and Amdahl Corporation. The 3745 is no longer marketed, but still supported and used. The 3746 "Nways Controller" model 900, unveiled in 1992, was an expansion unit for the 3745 supporting additional Token Ring and ESCON connections. A stand-alone model 950 appeared in 1995. == Successors == IBM no longer manufactures 37xx processors. The last models, the 3745/46, were withdrawn from marketing in 2002. Replacement software products are Communications Controller for Linux on System z and Enterprise Extender. == Clones == Several companies produced clones of 37xx controllers, including NCR COMTEN and Amdahl Corporation.

    Read more →