Deep Learning Indaba

Deep Learning Indaba

The Deep Learning Indaba is an annual conference and educational event that aims to strengthen machine learning and artificial intelligence (AI) capacity across Africa. Launched in 2017, it brings together students, researchers, industry practitioners, and policymakers from across the African continent. == History == The Deep Learning Indaba began in 2017 at the University of the Witwatersrand with over 300 participants from 23 African countries, offering tutorials in advanced AI topics and featuring notable speakers like Nando de Freitas. In 2018, it expanded to 650 delegates at Stellenbosch University, introducing parallel sessions to encourage collaboration. The 2019 edition in Nairobi, Kenya, reflected further growth, with increasing sponsorship and support from major tech companies like Google and Microsoft. === Deep Learning IndabaX ===

Spleak

Spleak was an IM platform where users could publish and rate content. It existed in the form of six bots covering as many subject areas: CelebSpleak, SportSpleak, VoteSpleak, TVSpleak, GameSpleak, and StyleSpleak. == Overview == Users can add a "multi-Spleak" (which contains all of the different Spleak bots in one) or add the separate bots to their IM buddy lists on MSN and AIM. Users are also allowed access to Spleak online by using a CelebSpleak, SportSpleak, or VoteSpleak widget, or through the CelebSpleak and SportSpleak applications with Facebook. Spleak was an alternate reality game and is moving to its own company, Spleak Media Network. "Celebrate Spleak" was introduced throughout 2007, launched in 2008, and was forced to retire in 2009. == Key people == Spleak was co-founded by Morten Lund and Nicolaj Reffstrup. The company's chief executive officer is Morrie Eisenburg; Josh Scott is Vice President in Product and Tyler Wells is Vice President in Engineering.

Digital sculpting

Digital sculpting, also known as sculpt modeling or 3D sculpting, is the use of software that offers tools to push, pull, smooth, grab, pinch or otherwise manipulate a digital object as if it were made of a real-life substance such as clay. == Sculpting technology == The geometry used in digital sculpting programs to represent the model can vary; each offers different benefits and limitations. The majority of digital sculpting tools on the market use mesh-based geometry, in which an object is represented by an interconnected surface mesh of polygons that can be pushed and pulled around. This is somewhat similar to the physical process of beating copper plates to sculpt a scene in relief. Other digital sculpting tools use voxel-based geometry, in which the volume of the object is the basic element. Material can be added and removed, much like sculpting in clay. Still other tools make use of more than one basic geometry representation. A benefit of mesh-based programs is that they support sculpting at multiple resolutions on a single model. Areas of the model that are finely detailed can have very small polygons while other areas can have larger polygons. In many mesh-based programs, the mesh can be edited at different levels of detail, and the changes at one level will propagate to higher and lower levels of model detail. A limitation of mesh-based sculpting is the fixed topology of the mesh; the specific arrangement of the polygons can limit the ways in which detail can be added or manipulated. A benefit of voxel-based sculpting is that voxels allow complete freedom over form. The topology of a model can be altered continually during the sculpting process as material is added and subtracted, which frees the sculptor from considering the layout of polygons on the model's surface. After sculpting, it may be necessary to retopologize the model to obtain a clean mesh for use in animation or real-time rendering. Voxels, however, are more limited in handling multiple levels of detail. Unlike mesh-based modeling, broad changes made to voxels at a low level of detail may completely destroy finer details. == Uses == Sculpting can often introduce details to meshes that would otherwise have been difficult or impossible to create using traditional 3D modeling techniques. This makes it preferable for achieving photorealistic and hyperrealistic results, though, many stylized results are achieved as well. Sculpting is primarily used in high poly organic modeling (the creation of 3D models which consist mainly of curves or irregular surfaces, as opposed to hard surface modeling). It is also used by auto manufacturers in their design of new cars. It can create the source meshes for low poly game models used in video games. In conjunction with other 3D modeling and texturing techniques and Displacement and Normal mapping, it can greatly enhance the appearance of game meshes often to the point of photorealism. Some sculpting programs like 3D-Coat, Zbrush, and Mudbox offer ways to integrate their workflows with traditional 3D modeling and rendering programs. Conversely, 3D modeling applications like 3ds Max, Maya and MODO are now incorporating sculpting capability as well, though these are usually less advanced than tools found in sculpting-specific applications. High poly sculpts are also extensively used in CG artwork for movies, industrial design, art, photorealistic illustrations, and for prototyping in 3D printing. == 3D print == Sculptors and digital artists use digital sculpting to create a model (or Digital Twin) to be materialized through CNC technologies including 3D printing. The final sculptures are often called Digital Sculpture or 3D printed art. While digital technologies have emerged in many art disciplines (painting, photography), this is less the case for digital sculpture due to the higher complexity and technology limitations to produce the final sculpture. == Sculpting Process == The best way to learn sculpture is by understanding primary, secondary and tertiary forms. First, break down the object you want to make down its basic shapes, such as a sphere or cube. Focus on making the large, overall shape of the object. After that, work on the bigger shapes on top of or inside the object. These can be protrusions or cut outs. Then, do a final detail pass, such as pores or lines to break up the shape. == Sculpting programs == There are a number of digital sculpting tools available. Some popular tools for creating are: Traditional 3D modeling suites are also beginning to include sculpting capability. 3D modeling programs which currently feature some form of sculpting include the following:

Graphics suite

A graphics suite is a software suite for graphics work that are distributed together. The programs are usually able to interact with each other on a higher level than the operating system would normally allow. There is no hard, fast rule regarding the programs to be included in a graphics application suite, but most will include at least a bitmap graphics editor and a vector graphics editor. In addition to these, the suite may contain VRML editors, animation editors, and morphing tools.

Multi-focus image fusion

Multi-focus image fusion is a multiple image compression technique using input images with different focus depths to make one output image that preserves all information. == Overview == The main idea of image fusion is gathering important and the essential information from the input images into one single image which ideally has all of the information of the input images. The research history of image fusion spans over 30 years and many scientific papers. Image fusion generally has two aspects: image fusion methods and objective evaluation metrics. In visual sensor networks (VSN), sensors are cameras which record images and video sequences. In many applications of VSN, a camera can't give a perfect illustration including all details of the scene. This is because of the limited depth of focus of the optical lens of cameras. Therefore, just the object located in the focal length of camera is focused and clear, and other parts of the image are blurred. VSN captures images with different depths of focus using several cameras. Due to the large amount of data generated by cameras compared to other sensors such as pressure and temperature sensors and some limitations of bandwidth, energy consumption and processing time, it is essential to process the local input images to decrease the amount of transmitted data. == Multi-Focus image fusion in the spatial domain == Huang and Jing have reviewed and applied several focus measurements in the spatial domain for the multi-focus image fusion process, suitable for real-time applications. They mentioned some focus measurements including variance, energy of image gradient (EOG), Tenenbaum's algorithm (Tenengrad), energy of Laplacian (EOL), sum-modified-Laplacian (SML), and spatial frequency (SF). Their experiments showed that EOL gave better results than other methods like variance and spatial frequency. == Multi-Focus image fusion in multi-scale transform and DCT domain == Image fusion based on the multi-scale transform is the most commonly used and promising technique. Laplacian pyramid transform, gradient pyramid-based transform, morphological pyramid transform and the premier ones, discrete wavelet transform, shift-invariant wavelet transform (SIDWT), and discrete cosine harmonic wavelet transform (DCHWT) are some examples of image fusion methods based on multi-scale transform. These methods are complex and have some limitations e.g. processing time and energy consumption. For example, multi-focus image fusion methods based on DWT require a lot of convolution operations, so they take more time and energy to process. Therefore, most methods in multi-scale transform are not suitable for real-time applications. Moreover, these methods are not very successful along edges, due to the wavelet transform process missing the edges of the image. They create ringing artefacts in the output image and reduce its quality. Due to the aforementioned problems in the multi-scale transform methods, researchers are interested in multi-focus image fusion in the DCT domain. DCT-based methods are more efficient in terms of transmission and archiving images coded in Joint Photographic Experts Group (JPEG) standard to the upper node in the VSN agent. A JPEG system consists of a pair of an encoder and a decoder. In the encoder, images are divided into non-overlapping 8×8 blocks, and the DCT coefficients are calculated for each. Since the quantization of DCT coefficients is a lossy process, many of the small-valued DCT coefficients are quantized to zero, which corresponds to high frequencies. DCT-based image fusion algorithms work better when the multi-focus image fusion methods are applied in the compressed domain. In addition, in the spatial-based methods, the input images must be decoded and then transferred to the spatial domain. After implementation of the image fusion operations, the output fused images must again be encoded. DCT domain-based methods do not require complex and time-consuming consecutive decoding and encoding operations. Therefore, the image fusion methods based on DCT domain operate with much less energy and processing time. Recently, a lot of research has been carried out in the DCT domain. DCT+Variance, DCT+Corr_Eng, DCT+EOL, and DCT+VOL are some prominent examples of DCT based methods.

GPT-4Chan

Generative Pre-trained Transformer 4Chan (GPT-4chan) is a controversial AI model that was developed and deployed by YouTuber and AI researcher Yannic Kilcher in June 2022. The model is a large language model, which means it can generate text based on some input, by fine-tuning GPT-J with a dataset of millions of posts from the /pol/ board of 4chan, an anonymous online forum known for occasionally hosting hateful and extremist content. The model learned to mimic the style and tone of /pol/ users, producing text that is often intentionally offensive to groups (racist, sexist, homophobic, etc.) and nihilistic. Kilcher deployed the model on the /pol/ board itself, where it interacted with other users without revealing its identity. He also made the model publicly available on Hugging Face, a platform for sharing and using AI models, until it was removed from the platform. The project sparked criticism and debate in the AI community. Some people questioned the ethics, legality, and social impact of creating and distributing such a model. Some of the issues raised by the GPT-4chan controversy include the potential harm of spreading hate speech, the responsibility of AI developers and platforms, the need for regulation and oversight of AI models, and the role of open source and transparency in AI research. == Development == The development of GPT-4chan began in May 2022, when Kilcher announced his project on his YouTube channel. Notably, at the time before ChatGPT, he explained that he wanted to create a large language model that could generate realistic and coherent text in the style of /pol/, one of the most notorious online communities. He indicated that he was inspired by the success of GPT-3, a powerful AI model created by OpenAI, and GPT-J, an open-source model, with GPT-3 comparable performance, released by EleutherAI, a group of independent AI researchers. Kilcher decided to use GPT-J as the base model for his project, and fine-tune it with a large dataset of /pol/ posts. The Raiders of the Lost Kek dataset contained over 100 million posts from /pol/, spanning from June 2016-November 2019. Kilcher then proceeded to fine-tune the GPT-J model on the 4chan data. He also showed some examples of the model’s outputs, which ranged from political opinions, conspiracy theories, jokes, insults, and threats, to more creative and bizarre texts, such as poems, stories, songs, and code. He said that he was impressed by the model’s ability to generate fluent and diverse text, and that he was curious to see how it would interact with real /pol/ users. == Release == In June 2022, Kilcher deployed his model on the /pol/ board itself, using a bot that he programmed to post and reply to threads. He did not reveal the model’s identity, and he let it run autonomously, without any human supervision or intervention. He wanted to conduct a natural experiment, and to observe the model’s behavior and impact in a real-world setting. Furthermore, he also wanted to test the model’s robustness, and to see how it would handle the challenges and dynamics of /pol/, such as trolling, flaming, baiting, and moderation. At the same time, Kilcher also made his model publicly available on Hugging Face, a platform for sharing and using AI models. He wanted to share his work with the AI community and the public, and that he hoped that his model would inspire and enable others to create and explore new applications and possibilities with large language models. Likewise, he also said that he wanted to spark a discussion and a debate about the ethical and social implications of his project, and that he welcomed feedback and criticism from anyone. He provided a link to his model’s page on Hugging Face, where anyone could access and use the model through a web interface or an API, and also provided a link to his GitHub repository, where anyone could download and inspect the model’s code and data. == Controversy == The release of GPT-4chan to the public caused a lot of reactions and responses from various audiences. On the /pol/ board, the model’s posts and replies attracted a lot of attention and engagement from other users, who were mostly unaware of the model’s identity and nature. Some users praised the model for its intelligence, creativity, and humor, and agreed with its opinions and views. Some users challenged the model for its ignorance, inconsistency, and absurdity, and disagreed with its claims and arguments. Some users tried to troll, bait, or expose the model, and attempted to trick or test it with various questions and scenarios. The model’s posts and replies also generated a lot of controversy and conflict among the users, who often engaged in heated and violent debates and fights with each other. On Hugging Face, the model’s page received a lot of visits and requests from users who wanted to try out and experiment with the model. The model’s page also received a lot of feedback and reviews from users who rated and commented on the model. However, with the controversy of the model, access to it was gated and then disabled on Hugging Face for concerns about the potential harm the model could cause. The incident was notable for the direct intervention of CEO Clément Delangue in the talk pages, a very unusual occurrence compared to the normal practices of content moderation. The release of GPT-4chan also sparked a lot of media coverage and public attention, as various news outlets and social media platforms reported and commented on the model’s project. On YouTube, the model’s video received a lot of views and interactions from viewers who watched and followed the project. Furthermore, a petition condemning the deployment of GPT-4chan gained over 300 signatures from technology experts.

Exposure Notification

The (Google/Apple) Exposure Notification System (GAEN) is a framework and protocol specification developed by Apple Inc. and Google to facilitate digital contact tracing during the COVID-19 pandemic. When used by health authorities, it augments more traditional contact tracing techniques by automatically logging close approaches among notification system users using Android or iOS smartphones. Exposure Notification is a decentralized reporting protocol built on a combination of Bluetooth Low Energy technology and privacy-preserving cryptography. It is an opt-in feature within COVID-19 apps developed and published by authorized health authorities. Unveiled on April 10, 2020, it was made available on iOS on May 20, 2020, as part of the iOS 13.5 update and on December 14, 2020, as part of the iOS 12.5 update for older iPhones. On Android, it was added to devices via a Google Play Services update, supporting all versions since Android Marshmallow. The Apple/Google protocol is similar to the Decentralized Privacy-Preserving Proximity Tracing (DP-3T) protocol created by the European DP-3T consortium and the Temporary Contact Number (TCN) protocol by Covid Watch, but is implemented at the operating system level, which allows for more efficient operation as a background process. Since May 2020, a variant of the DP-3T protocol is supported by the Exposure Notification Interface. Other protocols are constrained in operation because they are not privileged over normal apps. This leads to issues, particularly on iOS devices where digital contact tracing apps running in the background experience significantly degraded performance. The joint approach is also designed to maintain interoperability between Android and iOS devices, which constitute nearly all of the market. The ACLU stated the approach "appears to mitigate the worst privacy and centralization risks, but there is still room for improvement". In late April, Google and Apple shifted the emphasis of the naming of the system, describing it as an "exposure notification service", rather than "contact tracing" system. == Technical specification == Digital contact tracing protocols typically have two major responsibilities: encounter logging and infection reporting. Exposure Notification only involves encounter logging which is a decentralized architecture. The majority of infection reporting is centralized in individual app implementations. To handle encounter logging, the system uses Bluetooth Low Energy to send tracking messages to nearby devices running the protocol to discover encounters with other people. The tracking messages contain unique identifiers that are encrypted with a secret daily key held by the sending device. These identifiers change every 15–20 minutes as well as Bluetooth MAC address in order to prevent tracking of clients by malicious third parties through observing static identifiers over time. The sender's daily encryption keys are generated using a random number generator. Devices record received messages, retaining them locally for 14 days. If a user tests positive for infection, the last 14 days of their daily encryption keys can be uploaded to a central server, where it is then broadcast to all devices on the network. The method through which daily encryption keys are transmitted to the central server and broadcast is defined by individual app developers. The Google-developed reference implementation calls for a health official to request a one-time verification code (VC) from a verification server, which the user enters into the encounter logging app. This causes the app to obtain a cryptographically signed certificate, which is used to authorize the submission of keys to the central reporting server. The received keys are then provided to the protocol, where each client individually searches for matches in their local encounter history. If a match meeting certain risk parameters is found, the app notifies the user of potential exposure to the infection. Google and Apple intend to use the received signal strength (RSSI) of the beacon messages as a source to infer proximity. RSSI and other signal metadata will also be encrypted to resist deanonymization attacks. === Version 1.0 === To generate encounter identifiers, first a persistent 32-byte private Tracing Key ( t k {\displaystyle tk} ) is generated by a client. From this a 16 byte Daily Tracing Key is derived using the algorithm d t k i = H K D F ( t k , N U L L , 'CT-DTK' | | D i , 16 ) {\displaystyle dtk_{i}=HKDF(tk,NULL,{\text{'CT-DTK'}}||D_{i},16)} , where H K D F ( Key, Salt, Data, OutputLength ) {\displaystyle HKDF({\text{Key, Salt, Data, OutputLength}})} is a HKDF function using SHA-256, and D i {\displaystyle D_{i}} is the day number for the 24-hour window the broadcast is in starting from Unix Epoch Time. These generated keys are later sent to the central reporting server should a user become infected. From the daily tracing key a 16-byte temporary Rolling Proximity Identifier is generated every 10 minutes with the algorithm R P I i , j = Truncate ( H M A C ( d t k i , 'CT-RPI' | | T I N j ) , 16 ) {\displaystyle RPI_{i,j}={\text{Truncate}}(HMAC(dtk_{i},{\text{'CT-RPI'}}||TIN_{j}),16)} , where H M A C ( Key, Data ) {\displaystyle HMAC({\text{Key, Data}})} is a HMAC function using SHA-256, and T I N j {\displaystyle TIN_{j}} is the time interval number, representing a unique index for every 10 minute period in a 24-hour day. The Truncate function returns the first 16 bytes of the HMAC value. When two clients come within proximity of each other they exchange and locally store the current R P I i , j {\displaystyle RPI_{i,j}} as the encounter identifier. Once a registered health authority has confirmed the infection of a user, the user's Daily Tracing Key for the past 14 days is uploaded to the central reporting server. Clients then download this report and individually recalculate every Rolling Proximity Identifier used in the report period, matching it against the user's local encounter log. If a matching entry is found, then contact has been established and the app presents a notification to the user warning them of potential infection. === Version 1.1 === Unlike version 1.0 of the protocol, version 1.1 does not use a persistent tracing key, rather every day a new random 16-byte Temporary Exposure Key ( t e k i {\displaystyle tek_{i}} ) is generated. This is analogous to the daily tracing key from version 1.0. Here i {\displaystyle i} denotes the time is discretized in 10 minute intervals starting from Unix Epoch Time. From this two 128-bit keys are calculated, the Rolling Proximity Identifier Key ( R P I K i {\displaystyle RPIK_{i}} ) and the Associated Encrypted Metadata Key ( A E M K i {\displaystyle AEMK_{i}} ). R P I K i {\displaystyle RPIK_{i}} is calculated with the algorithm R P I K i = H K D F ( t e k i , N U L L , 'EN-RPIK' , 16 ) {\displaystyle RPIK_{i}=HKDF(tek_{i},NULL,{\text{'EN-RPIK'}},16)} , and A E M K i {\displaystyle AEMK_{i}} using the algorithm A E M K i = H K D F ( t e k i , N U L L , 'EN-AEMK' , 16 ) {\displaystyle AEMK_{i}=HKDF(tek_{i},NULL,{\text{'EN-AEMK'}},16)} . From these values a temporary Rolling Proximity Identifier ( R P I i , j {\displaystyle RPI_{i,j}} ) is generated every time the BLE MAC address changes, roughly every 15–20 minutes. The following algorithm is used: R P I i , j = A E S 128 ( R P I K i , 'EN-RPI' | | 0 x 000000000000 | | E N I N j ) {\displaystyle RPI_{i,j}=AES128(RPIK_{i},{\text{'EN-RPI'}}||{\mathtt {0x000000000000}}||ENIN_{j})} , where A E S 128 ( Key, Data ) {\displaystyle AES128({\text{Key, Data}})} is an AES cryptography function with a 128-bit key, the data is one 16-byte block, j {\displaystyle j} denotes the Unix Epoch Time at the moment the roll occurs, and E N I N j {\displaystyle ENIN_{j}} is the corresponding 10-minute interval number. Next, additional Associated Encrypted Metadata is encrypted. What the metadata represents is not specified, likely to allow the later expansion of the protocol. The following algorithm is used: Associated Encrypted Metadata i , j = A E S 128 _ C T R ( A E M K i , R P I i , j , Metadata ) {\displaystyle {\text{Associated Encrypted Metadata}}_{i,j}=AES128\_CTR(AEMK_{i},RPI_{i,j},{\text{Metadata}})} , where A E S 128 _ C T R ( Key, IV, Data ) {\displaystyle AES128\_CTR({\text{Key, IV, Data}})} denotes AES encryption with a 128-bit key in CTR mode. The Rolling Proximity Identifier and the Associated Encrypted Metadata are then combined and broadcast using BLE. Clients exchange and log these payloads. Once a registered health authority has confirmed the infection of a user, the user's Temporary Exposure Keys t e k i {\displaystyle tek_{i}} and their respective interval numbers i {\displaystyle i} for the past 14 days are uploaded to the central reporting server. Clients then download this report and individually recalculate every Rolling Proximity Identifier starting from interval number i {\displaystyle i} ,