NCSA Brown Dog

NCSA Brown Dog is a research project to develop a method for easily accessing historic research data stored in order to maintain the long-term viability of large bodies of scientific research. It is supported by the National Center for Supercomputing Applications (NCSA) that is funded by the National Science Foundation (NSF). == History == Brown Dog is part of the DataNet partners program funded by NSF in 2008. DataNet was conceived to address the increasingly digital and data-intensive nature of science, engineering and education. Brown Dog is part of a follow-on effort called Data Infrastructure Building Blocks (DIBBs), focused on building software to support DataNet. The project was proposed by researchers at NCSA and the University of Illinois Urbana-Champaign as well as researchers from Boston University and the University of North Carolina at Chapel Hill. == Unstructured, uncurated, long tail data == Much scientific data is smaller, unstructured and uncurated and thus not easily shared. Such data is sometimes referred to as "long tail" data. This borrows a term from statistics and refers to the tail of the distribution of project sizes. The majority of smaller projects lack the resources to properly steward the data they produce. This so-called "long tail" data, both past and present, has the potential to inform future research in many study areas. Much of this data has become inaccessible due to obsolete software and file formats. The resulting impossibility of reviewing data from older research disrupts the overall scientific research project. == Approach == Brown Dog describes itself as the "super mutt" of software (thus the name "Brown Dog"), serving as a low-level data infrastructure to interface digital data content across the internet. Its approach is to use every possible source of automated help (i.e., software) in existence in a robust and provenance-preserving manner to create a service that can deal with as much of this data as possible. 
The project sees the broader impact of its work in its potential to serve the general public as a sort of "DNS for data", with the goal of making all data and all file formats as accessible as webpages are today. == Technology == Brown Dog seeks to address problems involving the use of uncurated and unstructured data collections through the development of two services: the Data Access Proxy (DAP), to aid in the conversion of file formats, and the Data Tilling Service (DTS), for the automatic extraction of metadata from file contents. Once these services are developed, researchers and general public users will be able to download browser plugins and other tools from the Brown Dog tool catalog. === Data Tilling Service === Data Tilling Service (DTS) will allow users to search data collections using an existing file to discover other similar files in a collection. A DTS search field, into which example files can be dropped, will be added to configured browsers. Dropping a file tells DTS to search all the files under a given URL for files similar to the dropped file. For example, while browsing an online image collection, a user could drop an image of three people into the search field, and the DTS would return all images in the collection that also contain three people. If DTS encounters a foreign file format, it will utilize DAP to make the file accessible. DTS also indexes the data and extracts and appends metadata to files and collections, enabling users to gain some sense of the type of data they are encountering. This service runs on port 9443. === Data Access Proxy === Data Access Proxy (DAP) allows users to access data files that would otherwise be unreadable. Similar to an internet gateway or Domain Name Service, the DAP configuration would be entered into a user's machine and browser settings. Data requests over HTTP would first be examined by DAP to determine if the native file format is readable on the client device. 
If not, DAP converts the file into the best available format readable by the client machine. Alternatively, the user could specify the desired format themselves. This service runs on port 8184. == Use cases == Brown Dog targets three use cases proposed by groups within the EarthCube research communities. Developers and researchers from these communities will work together on use cases that span geoscience, engineering, biology and social science. === Long tail vegetation data in ecology and global change biology === This use case is led by Michael Dietze, Boston University. Data on the abundance, species composition, and size structure of vegetation is critically important for a wide array of sub-disciplines in ecology, conservation, natural resource management, and global change biology. However, addressing many of the pressing questions in these disciplines will require that terrestrial biosphere and hydrologic models are able to assimilate the large amount of long-tail data that exists but is largely inaccessible. The Brown Dog team, in cooperation with researchers from Dietze's lab, will facilitate the capture of a huge body of smaller research-oriented vegetation data sets collected over many decades and historical vegetation data embedded in Public Land Survey data dating back to 1785. This data will be used as initial conditions for models, to make sense of other large data sets, and for model calibration and validation. === Designing green infrastructure considering storm water and human requirements === This use case is led by Barbara Minsker, University of Illinois at Urbana-Champaign; William Sullivan, University of Illinois at Urbana-Champaign; and Arthur Schmidt, University of Illinois at Urbana-Champaign. This case study involves developing novel green infrastructure design criteria and models that integrate requirements for storm water management and ecosystem and human health and well-being. 
Data accessibility and availability are a major challenge in addressing the scientific and social problems associated with the design of green spaces. This study will focus on identified areas of the Green Healthy Neighborhood Planning region within the City of Chicago where existing local sewer performance is most deficient and where changes in impervious area through green infrastructure would be beneficial to underserved neighborhoods. Brown Dog will be used to extract long-tail experimental data on human landscape preferences and health impacts. This data will be used to develop a human health impacts model that will then be linked together with a terrestrial biosphere model and a storm water model using Brown Dog technology. === Development and application for critical zone studies === This use case is led by Praveen Kumar, University of Illinois at Urbana-Champaign. The Critical Zone (CZ) is the "skin" of the earth, extending from the treetops to the bedrock, that is created by life processes working at scales from microbes to biomes. The Critical Zone supports all terrestrial living systems. Its upper part is the bio-mantle. This is where terrestrial biota live, reproduce, use and expend energy, and where their wastes and remains accumulate and decompose. It encompasses the soil, which acts as a geomembrane through which water and solutes, energy, gases, solids, and organisms interact with the atmosphere, biosphere, hydrosphere, and lithosphere. A variety of drivers affect this bio-dynamic zone, ranging from climate and deforestation to agriculture, grazing and human development. Understanding and predicting these effects is central to managing and sustaining vital ecosystem services such as soil fertility, water purification, and production of food resources, and, at larger scales, global carbon cycling and carbon sequestration. 
The CZ provides a unifying framework for integrating terrestrial surface and near-surface environments, and reflects an intricate web of biological and chemical processes and human impacts occurring at vastly different temporal and spatial scales. The nature of these data creates significant challenges for interdisciplinary studies of the CZ, because integrating the variety and number of data products and models has been a barrier. On the other hand, CZ data provides an excellent opportunity for defining, testing and implementing Brown Dog technologies. In this context "unstructured" data is viewed broadly as consisting of a collection of heterogeneous data with formats that reflect temporal and disciplinary legacies, data from emerging low-cost, open-hardware-based sensors and embedded sensor networks that lack well-defined metadata and sensor characteristics, as well as data that are available as maps, images and text. == NSF Award == CIF21 DIBBs: Brown Dog was awarded in the winter of 2013 with a start date of October 1, 2013. The estimated expiration date is September 30, 2018. The award amount was $10,519,716.00, the largest DIBBs award. The principal investigator is Kenton McHenry of NCSA at the University of Illinois at Urbana-Champaign. Coleaders are Jong Lee NCSA/UIU

Aarogya Setu

Aarogya Setu (lit. 'The bridge to health') is an Indian COVID-19 "contact tracing, syndromic mapping and self-assessment" digital service, primarily a mobile app, developed by the National Informatics Centre under the Ministry of Electronics and Information Technology (MeitY). The app reached more than 100 million installs in 40 days. On 26 May 2020, amid growing privacy and security concerns, the source code of the app was made public. == Full view == The stated purpose of this app is to spread awareness of COVID-19 and to connect essential COVID-19-related health services to the people of India. The app augments the initiatives of the Department of Health to contain COVID-19 and shares best practices and advisories. It is a tracking app which uses the smartphone's GPS and Bluetooth features to track COVID-19 cases. The app is available for the Android and iOS mobile operating systems. With Bluetooth, it tries to determine the user's risk from having been near (within six feet of) a COVID-19-infected person, by scanning through a database of known cases across India. Using location information, it determines whether the user's location belongs to one of the infected areas, based on the data available. The app is an updated version of an earlier app called Corona Kavach (now discontinued), which was released by the Government of India. == Features and tools == Aarogya Setu has the following sections: User Status (tells the user their risk of getting COVID-19); Self Assess (helps users identify COVID-19 symptoms and their risk profile); COVID-19 Updates (gives updates on local and national COVID-19 cases); E-pass integration (if the user has applied for an E-pass, it will be available in the app); and See Recent Contacts (allows users to assess the risk level of their Bluetooth contacts). The app also tells how many COVID-19 positive cases are likely in a radius of 500 m, 1 km, 2 km, 5 km and 10 km from the user. 
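The radius-based count described above can be illustrated with a minimal sketch. This is not the app's actual implementation, which is not described here; it assumes that case locations are available as latitude/longitude pairs and uses the haversine great-circle distance. The names `haversine_km`, `cases_within_radii` and `RADII_KM` are hypothetical and chosen for illustration only.

```python
import math

# Radii named in the text, in kilometres (500 m, 1 km, 2 km, 5 km, 10 km).
RADII_KM = (0.5, 1, 2, 5, 10)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def cases_within_radii(user, cases, radii_km=RADII_KM):
    """Count cases within each radius of the user.

    `user` is a (lat, lon) pair and `cases` is a list of (lat, lon) pairs;
    both are assumed inputs for this illustration.
    """
    distances = [haversine_km(user[0], user[1], lat, lon) for lat, lon in cases]
    return {r: sum(d <= r for d in distances) for r in radii_km}


if __name__ == "__main__":
    user = (28.6139, 77.2090)
    cases = [
        (28.6139, 77.2090),  # same location as the user
        (28.6239, 77.2090),  # about 1.1 km north
        (28.6439, 77.2090),  # about 3.3 km north
        (28.8139, 77.2090),  # about 22 km north
    ]
    print(cases_within_radii(user, cases))
```

The counts are cumulative: a case 1.1 km away is included in the 2 km, 5 km and 10 km totals but not in the 500 m or 1 km totals.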
The app is built on a platform that can provide an application programming interface (API) so that other computer programs, mobile applications, and web services can make use of the features and data available in Aarogya Setu. == Response == Aarogya Setu crossed five million downloads within three days of its launch, making it one of the most popular government apps in India. It became the world's fastest-growing mobile app, beating Pokémon Go, with more than 50 million installs 13 days after launching in India on 2 April 2020. It reached 100 million installs by 13 May 2020, 40 days after its launch. In an order on 29 April 2020, the central government made it mandatory for all employees to download the app and use it – "Before starting for office, they must review their status on Aarogya Setu and commute only when the app shows safe or low risk". The Union Home Ministry also said that the application is mandatory for everyone living in a COVID-19 containment zone. The government made the announcement along with the extension of the nationwide lockdown by two weeks from 4 May, with certain relaxations. On 21 May 2020, the Airport Authority of India issued a Standard Operating Procedure (SOP) stating that all departing passengers must be registered with the Aarogya Setu app. It added that the app would not be mandatory for children below 14 years. However, the next day, Civil Aviation Minister Hardeep Singh Puri clarified that the app would not be mandatory for any passengers. On 26 May 2020, the Aarogya Setu app code was made open to developers across the globe to help other countries manage contact tracing in their fight against the COVID-19 pandemic. In March 2021, the Co-WIN portal was integrated with the app. This allowed users to schedule an appointment for a COVID-19 vaccine through the app by registering their phone number and providing relevant documents. 
== Effectiveness == The CEO of NITI Aayog stated that "the app has been able to identify more than 3,000 hotspots in 3–17 days ahead of time." However, users and experts in India and around the world say the app raises serious data security concerns. The app collects the user's name, number, gender and travel history, and uses the phone's Bluetooth and location data to let users know if they have been near a person with COVID-19 by scanning a database of known cases of infection; it also shares this data with the government. This is the major area of concern, as the app's constant access to a phone's Bluetooth poses a security risk. It was, however, clarified that the information received would not be made public. Amid these concerns, the app reached about one hundred million downloads. == Reception == Rahul Gandhi, leader of the Congress party, termed the Aarogya Setu application a "sophisticated surveillance system" after the government announced that downloading the app would be mandatory for both government and private employees. Following this, others raised the same concerns about the Aarogya Setu app. The Ministry of Electronics and Information Technology (MeitY) responded to these concerns by asserting that Gandhi's claims were false, and that the app was being appreciated internationally. On 5 May, French ethical hacker Robert Baptiste, who goes by the name Elliot Alderson on Twitter, claimed that there were security issues with the app. The Indian government, as well as the app developers, responded to this claim by thanking the hacker for his attention, but dismissed his concerns. The developers of the app stated that the fetching of location data is a documented feature of the app, rather than a flaw, since the app is designed to track the distribution of the virus-infected population. They also asserted that no personal information of any user has been proven to be at risk. 
On 6 May, Robert Baptiste tweeted that security vulnerabilities in Aarogya Setu allowed hackers to "know who is infected, unwell, [or] made a self assessment in the area of his choice". He also gave details of how many people were unwell and infected at the Prime Minister's Office, the Indian Parliament and the Home Office. The Economic Times pointed out that a clause in the app's Terms and Conditions stated that the user "agrees and acknowledges that the Government of India will not be liable for ... any unauthorised access to your information or modification thereof". In response, several software developers called for the source code to be made public. On 12 May, former Supreme Court Judge Justice B.N. Srikrishna termed the government's push mandating the use of Aarogya Setu app "utterly illegal". He said so far it is not backed by any law and questioned "under what law, government is mandating it on anyone". MIT Technology Review gave 2 out of 5 stars to Aarogya Setu app after analyzing the COVID contact tracing apps launched in 25 countries. The app got stars only for the policy which suggests that data collected is deleted after a period of time and that the data collection, as far as user inputs go, is minimal. It also highlighted that India is the only democracy making its app mandatory for millions of people. The rating was further downgraded from 2 to 1 for collecting more information than the app needs to function. Following this, the MeitY made the source code of the Android app public on GitHub on 26 May, which will be followed by iOS and API documentation. Further, the Government has also launched a "bug bounty program". This was done to "promote transparency and ensure security and integrity of the app". However, experts stated that the server-side code had not yet been publicly released, which meant that public opinion on security and privacy was yet to be completely assuaged. 
Following this, ZDNet noted that the source code seemed to confirm the government's claim that user location data, if collected, would be anonymised and would be deleted after 45 days, or 60 days for high-risk individuals.

Mass media use by the Islamic State

The Islamic State (IS) is known for its extensive and effective use of propaganda. It uses a version of the Muslim Black Standard flag and developed an emblem which has clear symbolic meaning in the Muslim world. The Islamic State targets younger audiences, such as teenagers and young adults, since they are more vulnerable to propaganda. It is known to exploit the internet to spread its propaganda by establishing websites, such as the Al Fustat domain. Videos by the Islamic State are commonly accompanied by nasheeds (chants), notable examples being the chant Dawlat al-Islam Qamat, which came to be viewed as an unofficial anthem of the Islamic State, and Salil al-Sawarim. Academic research has emphasized the scale and volume of Islamic State media production beyond its flagship magazines. A quantitative study cited in R. Malash’s academic work documented 1,373 distinct Islamic State media products released between 1 August 2017 and 28 February 2018, including magazines, newsletters, reports, photographic releases, audio recordings, and other media formats. Scholars have used such datasets to illustrate the breadth and intensity of the group’s media output, particularly during periods of territorial decline, when propaganda activity remained high despite military pressure. == Traditional media == === Al-Furqan Foundation for Media Production === In November 2006, shortly after the group's rebranding as the "Islamic State of Iraq", it established the Al-Furqan Foundation for Media Production (Arabic: مؤسسة الفرقان للإنتاج الإعلامي, romanized: Muasasat al-Furqān lil'īntāj al'ilāmī), which produces CDs, DVDs, posters, pamphlets, and web-related propaganda products and official statements. It is the primary media production house of the Islamic State and is responsible for the production of major media releases, including the statements of the spokesmen and leaders of the group. 
On January 10, 2006, Al-Furqan released its very first video, titled (Arabic: زحف الأنوار, romanized: Zahf al-Anwār). It was founded by the Iraqi Dr Wa'il al-Fayad, known as Abu Muhammad al-Furqan. He got his name "Al-Furqan" from his role in founding this media house, which was named after Al-Furqan, the 25th surah of the Quran. It is the oldest media production house of the Islamic State, having been founded in November 2006 to release media for the Islamic State of Iraq. The earliest release indexed by the SITE Intelligence Group dates from 21 November 2006 and documents the storming of a police station in the Iraqi town of Miqdadiyah. Al-Furqan is regarded as a considerable innovation in jihadist media, with Kavkaz Center describing it as "a milestone on the path of jihad, a distinguished media that takes the great care in the management of the conflict with the crusaders and their tails and to expose the lies in the crusader's media." In October 2007, the Long War Journal reported on United States Army raids targeting Al-Furqan media cell members across Iraq, including in Mosul and Samarra. Between August 2013 and March 2014, Al-Furqan released the 22-part series Messages from the Land of Epic Battles. On 2 September 2014, SITE Intelligence Group discovered the beheading video called A Second Message to America, about the death of Steven Sotloff. Since then, Al-Furqan has released videos of IS operations across Iraq and Syria, as well as execution videos directed at governments around the world. In April 2019, Al-Furqan released a video interviewing Abu Bakr al-Baghdadi. Al-Furqan also produces audio media, consisting mostly of recordings of speeches by IS leaders and spokesmen, and has produced a single nasheed under its own name, "Ya Allah Al-Jannah" (O Allah, (we ask you for) Paradise), sung by the now-deceased IS member Uqab Al-Marzuqi. 
=== Al-I'tisam Foundation for Media Production === The Islamic State of Iraq founded a second media foundation - Al-I'tisam Media Foundation - around 2011, marked by its first video release, titled "The Conqueror of the Murtaddin: Abu Ahmad Al-Ansari". The foundation went on to release several series of videos, including 50 parts of "Windows on the Land of Battles", 9 parts of "Pictures from the Land of Battles", a 9-part series quoting leaders about the establishment of the Islamic State, and other series, before its last release, "Deterring the Safavids in Salah ad-Din", in 2015. There have been no further releases from the foundation since then. === Al-Hayat Media Center === In mid-2014, IS established the Al-Hayat Media Center, which targets Western audiences and produces material in English, German, Russian, Urdu, Indonesian, Turkish, Bengali, Chinese, Bosnian, Kurdish, Uyghur, and French. When IS announced its expansion to other countries in November 2014, it established media departments for the new branches, and its media apparatus ensured that the new branches followed the same models it uses in Iraq and Syria. Then-FBI Director James Comey said that IS's "propaganda is unusually slick," noting that, "They are broadcasting... in something like 23 languages". In July 2014, Al-Hayat began publishing a digital magazine called Dabiq in a number of languages, including English. According to the magazine, its name is taken from the town of Dabiq in northern Syria, which is mentioned in a hadith about Armageddon. Al-Hayat also began publishing other digital magazines, including the Turkish-language Konstantiniyye (the Ottoman word for Istanbul), the French-language Dar al-Islam, and the Russian-language Istok (Russian: Исток). By late 2016, these magazines had apparently all been discontinued, with Al-Hayat's material being consolidated into a new magazine called Rumiyah (Arabic for Rome). 
=== Al-Naba === While the group's glossy, foreign-language magazines like Dabiq and Rumiyah ceased publication as the group lost territory, the weekly Arabic newsletter Al-Naba (The News) has continued to publish regularly, becoming the central pillar of the group's "media jihad" in the post-territorial phase. Recent scholarship, including studies published in 2025, suggests that Al-Naba serves a dual purpose: maintaining internal cohesion among dispersed fighters and projecting a narrative of endurance to enemies. Unlike the earlier magazines, which were designed for recruitment, Al-Naba focuses on bureaucratic reporting, military statistics, and religious instruction. Its issues are then translated and disseminated by decentralized supporter networks ("media mujahideen") to reach non-Arabic speakers. === Furat Media Center === The Al-Furat Media Center is another media center, established around 2015 to cater to non-Arabic-speaking audiences. Unlike the other organizations, its productions were not as professional as those made by the other media centers. Instead, it relied partly on local media departments and foreign communities of the Mujahideen to produce short-form videos. However, some professional long-form videos were also released under its name. As of now, the media center is the only known active branch of all the media centers of the Islamic State, after heavy losses from past campaigns against them. Its last release was "The Resolve of Muwahhidin in Russia", in which videos from the Surovikino penal colony hostage crisis were edited and released. === Ajnad Foundation for Media Production === Ajnad Foundation is one of the official media wings of the Islamic State and produces nasheeds and Quran recitations. It was established in January 2014 and has released more than 150 nasheeds. === Asdaa Foundation === Like the Ajnad Foundation, the Asdaa Foundation (Arabic: مؤسسة أصداء) or Asedaa Foundation produces anasheed (Islamic chants). 
The foundation is the closest counterpart to Ajnad in producing Islamic State nasheeds, the only difference being that Ajnad is directly linked to the Islamic State while Asdaa is only classified as a "supporter organization" (munaser/munasera). The foundation had humble beginnings, possibly in Yemen, where low-quality nasheeds were at first produced by two munshids, Abu Layth Al-Iraqi and Abu Ya'qub Al-Yamani. The quality later improved somewhat (possibly owing to new equipment and increased recognition), and its nasheeds were eventually included in the Islamic State's official media releases. One of its munshids, Abu Hafs, is a renowned munshid who has sung around 70 nasheeds and has also worked with the Ajnad Foundation in some instances. He is currently alive and working under Asedaa and the Ansar Production Center (مركز إنتاج الأنصار), another munasir foundation. Another Yemeni munshid, Abu Musab al-Adani, worked temporarily with the Asdaa Foundation before returning to AQAP, from which he had previously defected. Some of the foundation's anasheed are used in IS's execution videos; a well-known example is the human slaughterhouse execution video released during Eid Al-Adha in 2016. The background nasheed they used was "We Came To Fill The Horizons With Terror", produced by the Asd

Verge3D

Verge3D is a real-time renderer and a toolkit used for creating interactive 3D experiences running on websites. == Overview == Verge3D enables users to convert content from 3D modelling tools (Blender, 3ds Max, and Maya are currently supported) to view in a web browser. Verge3D was created by the same core group of software engineers that previously created the Blend4Web framework. == Features == Verge3D uses WebGL for rendering. It incorporates components of the Three.js library and exposes its API to application developers. Puzzles Application functionality can be added via JavaScript, either by writing code directly or by using Puzzles, Verge3D’s visual programming environment based on Google Blockly. Puzzles is aimed primarily at non-programmers allowing quick creation of interactive scenarios in a drag-and-drop fashion. App Manager and web publishing App Manager is a lightweight web-based tool for creating, managing and publishing Verge3D projects, running on top of the local development server. Verge3D Network service integrated in the App Manager allows for publishing Verge3D applications via Amazon S3 and EC2 cloud services. PBR For purposes of authoring materials, a glTF 2.0-compliant physically based rendering pipeline is offered alongside the standard shader-based approach. PBR textures can be authored using external texturing software such as Substance Painter for which Verge3D offers the corresponding export preset. Besides the glTF 2.0 model, Verge3D supports physical materials of 3ds Max and Maya (with Autodesk Arnold as reference), and Blender's real-time Eevee materials. glTF and DCC software integration Verge3D integrates directly with Blender, 3ds Max, and Maya, enabling users to create 3D geometry, materials, and animations inside the software, then export them in the JSON-based glTF format. The Sneak Peek feature allows for exporting and viewing scenes from the DCC tool environment. 
Facebook 3D posts For Facebook publishing, Verge3D offers a specific GLB export option. The exported GLB files are displayed and can be opened in the App Manager. Asset compression Exported files can optionally use LZMA compression, resulting in a reduction in file size of up to 6x. UI and website layouts Interface layouts, created using external WYSIWYG editors, can be linked with Puzzles to trigger changes to a 3D scene being rendered in the browser and vice versa. Animation Verge3D supports skeletal animation, including animation of bipeds and character rigs, and allows for animation of material parameters. Model parts can also be set up to be dragged by the user. Physics The physics module can be linked separately to enable collision detection, dynamically moving objects, support for characters and vehicles, springs, ropes and cloth simulation. As of version 2.11, simple physics simulations can be created and controlled without coding via Puzzles, the visual programming system used by Verge3D. AR/VR The 2.10 update added support for WebXR, an in-development open technology designed to enable virtual reality and augmented reality experiences to be displayed in web browsers. It works both with headsets that have controllers, like the HTC Vive and Oculus Rift, and with those that do not, like Google Cardboard. AR/VR experiences can be enabled via Puzzles or JavaScript. == Workflow == Verge3D's workflow differs substantially from other mainstream WebGL frameworks. Development of a new Verge3D application usually starts with modeling, texturing and animating 3D objects. The models are assembled in the 3D authoring tool. The scene file is then used as a basis for a Verge3D project initialized from the App Manager. An interactive scenario is optionally added using the Puzzles editor. A Verge3D application can be previewed in the web browser at any development stage using the App Manager. 
The finished web application can be deployed on the Verge3D Network, on Facebook or on the user's website. == Notable uses == NASA's Jet Propulsion Laboratory used Verge3D to create an interactive 3D visualization of the Mars InSight lander. The web application allows for exploring and interacting with the real-time model of the spacecraft, with the possibility to move different parts and unfurl the solar panels. NASA's older interactive web application Experience Curiosity was ported to Verge3D from Blend4Web. The application makes it possible to operate the rover, control its cameras and the robotic arm and reproduces some of the prominent events of the Mars Science Laboratory mission. Route 66 Digital's Escape Room used Verge3D and Blender. This interactive short explores how users can navigate 3D spaces and interact with objects without the need for instruction.

Digital citizen

The term digital citizen is used with different meanings. According to the definition provided by Karen Mossberger, one of the authors of Digital Citizenship: The Internet, Society, and Participation, digital citizens are "those who use the internet regularly and effectively". In this sense, a digital citizen is a person who uses information technology (IT) to engage in society, politics, and government. More recent elaborations of the concept define digital citizenship as the self-enactment of people’s role in society through the use of digital technologies, stressing the empowering and democratizing characteristics of the citizenship idea. These theories aim at taking into account the ever-increasing datafication of contemporary societies (symbolically linked to the Snowden leaks), which has called into question the meaning of “being (digital) citizens in a datafied society”. This condition is also referred to as the “algorithmic society”, characterised by the increasing datafication of social life and the pervasive presence of surveillance practices (see surveillance and surveillance capitalism), the use of artificial intelligence, and Big Data. Datafication presents crucial challenges for the very notion of citizenship, and data collection can no longer be seen as an issue of privacy alone, so that: "We cannot simply assume that being a citizen online already means something (whether it is the ability to participate or the ability to stay safe) and then look for those whose conduct conforms to this meaning." Instead, the idea of digital citizenship should reflect the idea that we are no longer mere “users” of technologies, since they shape our agency both as individuals and as citizens. Digital citizenship refers to the responsible and respectful use of technology to engage online, evaluate information, and protect human rights. 
It encompasses skills for communication, collaboration, empathy, privacy protection, and security to prevent data breaches and identity theft. == Digital citizenship in the "algorithmic society" == In the context of the algorithmic society, the question of digital citizenship “becomes one of the extents to which subjects are able to challenge, avoid or mediate their data double in this datafied society”. These reflections emphasize the idea of the digital space (or cyberspace) as a political space where respect for the fundamental rights of the individual must be guaranteed (with reference both to traditional rights and to new rights specific to the internet [see “digital constitutionalism”]) and where the agency and the identity of individuals as citizens are at stake. This idea of digital citizenship is thought to be not only active but also performative, in the sense that “in societies that are increasingly mediated through digital technologies, digital acts become important means through which citizens create, enact and perform their role in society.” In particular, for Isin and Ruppert this points towards an active meaning of (digital) citizenship based on the idea that we constitute ourselves as digital citizens by claiming rights on the internet, either by saying or by doing something. == Types of digital participation == People who characterize themselves as digital citizens often use IT extensively, creating blogs, using social networks, and participating in online journalism. Although digital citizenship begins when any child, teen, or adult signs up for an email address, posts pictures online, uses e-commerce to buy merchandise online, or participates in any electronic function that is B2B or B2C, the process of becoming a digital citizen goes beyond simple internet activity. 
According to Thomas Humphrey Marshall, a British sociologist known for his work on social citizenship, a primary framework of citizenship comprises three different traditions: liberalism, republicanism, and ascriptive hierarchy. Within this framework, the digital citizen needs to exist in order to promote equal economic opportunities and increase political participation. In this way, digital technology helps to lower the barriers to entry for participation as a citizen within a society. Digital citizens also have a comprehensive understanding of digital citizenship, which is the appropriate and responsible behavior when using technology. Since digital citizenship evaluates the quality of an individual's response to membership in a digital community, it often requires the participation of all community members, both visible and less visible. A large part of being a responsible digital citizen is digital literacy, etiquette, online safety, and an acknowledgement of private versus public information. The development of digital citizen participation can be divided into two main stages. The first stage is information dissemination, which has two subcategories: static information dissemination, characterized largely by citizens who use read-only websites, where they take data from credible sources in order to form judgments or establish facts (many of the websites where credible information may be found are provided by the government); and dynamic information dissemination, which is more interactive and involves citizens as well as public servants. In the dynamic form, both questions and answers can be communicated, and citizens have the opportunity to engage in question-and-answer dialogues through two-way communication platforms. The second stage of digital citizen participation is citizen deliberation, which concerns the type of participation citizens engage in and the role they play when attempting to bring about policy change. 
Static citizen participants can play a role by engaging in online polls and by sending complaints and recommendations, mainly to the government, which can change policy decisions. Dynamic citizen participants can deliberate with others on their thoughts and recommendations in town hall meetings or on various media sites. One potential advantage of online participation through digital citizenship is increased social inclusion. According to a report on civic engagement, citizen-powered democracy can be initiated through information shared on the web, direct communication signals made by the state toward the public, and social media tactics from both private and public companies. In fact, it was found that the community-based nature of social media platforms allows individuals to feel more socially included and informed about political issues that peers have also been found to engage with, otherwise known as a "second-order effect." Understanding strategic marketing on social media would further explain social media customers’ participation. Two types of opportunities arise as a result, the first being the ability to lower barriers, which can make exchanges much easier. The second is the chance to participate in transformative disruption, enabling people with historically lower political engagement to mobilize in a much easier and more convenient fashion. Nonetheless, several challenges face the presence of digital technologies in political participation. Both current and potential challenges can create significant risks for democratic processes. Not only is digital technology still seen as relatively ambiguous, it was also seen to have "less inclusivity in democratic life." Demographic groups differ considerably in their use of technology, and thus one group could be more represented than another as a result of digital participation. 
Another primary challenge is the "filter bubble" effect. Alongside a tremendous spread of false information, internet users could reinforce existing prejudices and contribute to polarizing disagreements in the public sphere. This can lead to misinformed voting and decisions based on exposure rather than on knowledge. A communication technology director, Van Dijk, stated, "Computerized information campaigns and mass public information systems have to be designed and supported in such a way that they help to narrow the gap between the 'information rich' and 'information poor' otherwise the spontaneous development of ICT will widen it." Access to digital technology, and the knowledge needed to use it, must be equivalent across groups in order for a fair system to be put into place. Alongside a lack of evidence that the technology is safe for citizens, the OECD has identified five struggles for the online engagement of citizens: Scale: To what extent can a society allow every individual's voice to be heard, but also not be lost in the mass debate? This can be extremely challenging for the government, which may not effectively know how to listen and respond to each individual contribution. Capacity: How can digital technology offer citizens more information on public policy-making? The opportunity for citizens to debate with one another is lacking for acti

Meta-Labeling

Meta-labeling, also known as corrective AI, is a machine learning (ML) technique utilized in quantitative finance to enhance the performance of investment and trading strategies, developed in 2017 by Marcos López de Prado at Guggenheim Partners and Cornell University. The core idea is to separate the decision of trade direction (side) from the decision of trade sizing, addressing the inefficiencies of simultaneously learning both side and size predictions. The side decision involves forecasting market movements (long, short, neutral), while the size decision focuses on risk management and profitability. It serves as a secondary decision-making layer that evaluates the signals generated by a primary predictive model. By assessing the confidence and likely profitability of those signals, meta-labeling allows investors and algorithms to dynamically size positions and suppress false positives. == Motivation == Meta-labeling is designed to improve precision without sacrificing recall. As noted by López de Prado, attempting to model both the direction and the magnitude of a trade using a single algorithm can result in poor generalization. By separating these tasks, meta-labeling enables greater flexibility and robustness: Enhances control over capital allocation. Reduces overfitting by limiting model complexity. Allows the use of interpretability tools and tailored thresholds to manage risk. Enables dynamic trade suppression in unfavorable regimes. == Applications == Meta-labeling has been applied in a variety of financial ML contexts, including: Algorithmic trading: Filtering and sizing trades to reduce false positives. Portfolio optimization: Scaling exposure across multiple signals with differing confidence levels. Risk management: Dynamically disabling strategies in adverse market conditions. Model validation: Interpreting when and why a model may be underperforming due to regime shifts. 
== General architecture == Meta-labeling decouples two core components of systematic trading strategies: directional prediction and position sizing. The process involves training a primary model to generate trade signals (e.g., buy, sell, or hold) and then training a secondary model to determine whether each signal is likely to lead to a profitable trade. The second model outputs a probability that is interpreted as the confidence in the forecast, which can be used to adjust the position size or to filter out unreliable trades. Meta-labeling is typically implemented as a three-stage process: Primary model (M1): Predicts the direction or label of a financial outcome using features such as market prices, returns, or volatility indicators. A typical output is directional, e.g., Y ∈ {−1, 0, 1}, representing short, neutral, or long positions. Secondary model (M2): A binary classifier trained to predict whether the primary model's prediction will be profitable. The target variable is a binary meta-label F ∈ {0, 1}. Inputs can include features used in the primary model, performance diagnostics, or market regime data. Position sizing algorithm (M3): Translates the output probability of the secondary model into a position size. Higher confidence scores result in larger allocations, while lower confidence leads to reduced or zero exposure. === Stage 1: Forecasting side === Figure 1: Primary model architecture. Figure 1 presents the architecture of a primary model, which focuses on forecasting the side of the trade. The model (M1) takes input data, such as open-high-low-close data, and determines the side of the position to take: a negative number is a short position and a positive number is a long position. The output ranges between −1 and 1; the closer it is to −1 or 1, the stronger the model's conviction. 
When training the model, the labels are −1 and 1, based on the direction of forward returns for some predefined investment horizon. The researcher may decide to apply a recall check (τ: "Tau") by setting a minimum threshold that the magnitude of the initial output must reach to qualify as a short or long position. If the threshold is not met, no side forecast is made, leading to the closing of any open positions. The primary model output is therefore one of three possible side forecasts: −1, 0, or 1. The primary model also generates evaluation data which can be used by the secondary model to improve the performance of size forecasts. Some examples of evaluation data include rolling accuracy, F1, recall, precision, and AUC scores. === Stage 2: Filtering out false positives === Figure 2: General meta-labeling architecture. Next comes the phase of filtering out false positives by applying a secondary machine learning model (M2), which is a binary classifier trained to determine whether the trade will be profitable. The model takes as input four general groupings of data: (1) general input data that is predictive of a false positive, for example the last 30 days' rolling volatility of the underlying asset; (2) evaluation data; (3) market state and regime data, since macroeconomic data or clustering the market into regimes may help, as specific trading strategies are known to perform better in particular regimes (for example, momentum-based strategies perform best in periods with low volatility and strong directional moves); and (4) the primary model's initial output, a value between −1 and 1 that indicates the strength of the primary model's conviction. The output of the secondary model is a value between −1 and 1 (if using a tanh function), which indicates the strength of the conviction that a short or long position is profitable, or simply a value between 0 and 1 (if using a sigmoid function) if one only wants to know whether the trade makes money. 
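The recall check and the binary meta-label described above can be illustrated with a minimal sketch. The function names, the sample numbers, and the choice to skip observations with no side forecast are assumptions made for illustration; they are not taken from López de Prado's implementation.

```javascript
// Apply the recall check tau to the primary model's raw output in [-1, 1].
// Outputs whose magnitude is below tau produce no side forecast (0).
function sideForecast(rawOutput, tau) {
  if (rawOutput === 0 || Math.abs(rawOutput) < tau) return 0;
  return rawOutput > 0 ? 1 : -1;
}

// Build the binary meta-label for the secondary model: 1 if the trade taken
// on the primary model's side made money over the horizon, 0 otherwise.
// Assumption: observations with side 0 are skipped because no trade occurs.
function metaLabels(sides, forwardReturns) {
  const labels = [];
  for (let i = 0; i < sides.length; i += 1) {
    if (sides[i] === 0) continue;
    const profitable = sides[i] * forwardReturns[i] > 0;
    labels.push({ index: i, label: profitable ? 1 : 0 });
  }
  return labels;
}

// Hypothetical primary model outputs and forward returns.
const rawOutputs = [0.8, -0.1, -0.6, 0.3];
const forwardReturns = [0.02, 0.01, 0.015, -0.01];

const sides = rawOutputs.map((x) => sideForecast(x, 0.25));
console.log(sides); // sides are 1, 0, -1, 1
console.log(metaLabels(sides, forwardReturns)); // labels are 1, 0, 0
```

In this sketch the second observation is dropped by the threshold, and the two losing trades receive a meta-label of 0, which is what the secondary classifier would be trained to predict.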
The secondary model's output allows filtering out trades that are likely to lead to losses. One could stop at this point or use the outputs of the secondary model as inputs to a position sizing algorithm (M3), which could further enhance strategy performance metrics by translating the output probability of the secondary model into a position size. Higher confidence scores result in larger allocations, while lower confidence leads to reduced or zero exposure. === Stage 3: Optimizing position sizes === ==== Position sizing methods (M3) ==== Various algorithms have been proposed for transforming predicted probabilities into trade sizes: All-or-nothing: Allocate 100% of capital if the probability exceeds a predefined threshold (e.g., 0.5); otherwise, do not trade. Model confidence: Use the probability score directly as the fraction of capital allocated. Linear scaling: Rescale the model's probabilities using min-max normalization based on the training data. Normal CDF (NCDF): Use a normal cumulative distribution function applied to a z-statistic derived from the predicted probability. Empirical CDF (ECDF): Rank probabilities based on their percentile in the training data to ensure relative allocation. Sigmoid Optimal Position Sizing (SOPS): Applies a smooth non-linear sigmoid transformation optimized to maximize risk-adjusted returns (Sharpe ratio). ==== Model calibration ==== Each machine learning algorithm used in meta-labeling tends to produce outputs with different characteristic distributions; for example, some are approximately normally distributed, whereas others exhibit a pronounced U-shape, concentrating probabilities near the extremes. Due to these varying distributions, simply summing the outputs of different models can inadvertently lead to uneven weighting of signals, biasing trade decisions. 
To address this, model calibration techniques are essential to adjust the predicted probabilities towards frequentist probabilities, ensuring that model outputs reflect true likelihoods more accurately. Two common calibration techniques are: Platt scaling (Sigmoid scaling): Suitable for correcting S-shaped calibration plots typically produced by models such as support vector machines (SVMs). Isotonic regression: Fits a non-decreasing step function to probabilities and is particularly effective with larger datasets, though it can sometimes lead to overfitting. Transforming predictions to frequentist probabilities is crucial as it provides probabilistic outputs that are directly interpretable as the actual likelihood of an event occurring. Such calibration significantly enhances the effectiveness of fixed position sizing methods, reducing maximum drawdowns and increasing risk-adjusted returns. However, calibration has less impact on position sizing methods that directly estimate parameters from the training data, such as ECDF and SOPS, suggesting that calibration is a critical step mainly for fixed methods that rely heavily on raw model outputs.
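The simpler position sizing methods listed for M3 can be sketched in a few lines. This is an illustrative sketch only: the function names and the sample training probabilities are hypothetical, the clipping to [0, 1] in linear scaling is an assumption, and NCDF and SOPS are omitted because the text does not specify their formulas.

```javascript
// All-or-nothing: full allocation if the probability exceeds the threshold.
function allOrNothing(p, threshold = 0.5) {
  return p > threshold ? 1 : 0;
}

// Model confidence: the probability itself is the fraction of capital.
function modelConfidence(p) {
  return p;
}

// Linear scaling: min-max normalization against the training probabilities.
// Assumption: results are clipped to [0, 1] for out-of-range probabilities.
function linearScaling(p, trainProbs) {
  const lo = Math.min(...trainProbs);
  const hi = Math.max(...trainProbs);
  if (hi === lo) return 0; // degenerate training set (assumed handling)
  return Math.min(1, Math.max(0, (p - lo) / (hi - lo)));
}

// Empirical CDF: share of training probabilities less than or equal to p.
function ecdf(p, trainProbs) {
  const count = trainProbs.filter((q) => q <= p).length;
  return count / trainProbs.length;
}

// Hypothetical secondary-model probabilities seen during training.
const trainingProbabilities = [0.2, 0.4, 0.5, 0.6, 0.8];
const probability = 0.6;

console.log(allOrNothing(probability));
console.log(modelConfidence(probability));
console.log(linearScaling(probability, trainingProbabilities));
console.log(ecdf(probability, trainingProbabilities));
```

The first two functions are fixed methods that use the raw probability, while the last two estimate their parameters from the training data, which is the distinction the calibration discussion draws.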

Web worker

A web worker, as defined by the World Wide Web Consortium (W3C) and the Web Hypertext Application Technology Working Group (WHATWG), is a JavaScript script executed from an HTML page that runs in the background, independently of scripts that may also have been executed from the same HTML page. Web workers are often able to utilize multi-core CPUs more effectively. The W3C and WHATWG envision web workers as long-running scripts that are not interrupted by scripts that respond to clicks or other user interactions. Keeping such workers from being interrupted by user activities should allow Web pages to remain responsive at the same time as they are running long tasks in the background. The web worker specification is part of the HTML Living Standard. == Overview == As envisioned by WHATWG, web workers are relatively heavy-weight and are not intended to be used in large numbers. They are expected to be long-lived, with a high start-up performance cost, and a high per-instance memory cost. Web workers run outside the context of an HTML document's scripts. Consequently, while they do not have access to the DOM, they can facilitate concurrent execution of JavaScript programs. == Features == Web workers interact with the main document via message passing. A worker is created by passing the URL of a JavaScript file to the Worker() constructor, and the worker executes the JavaScript in that file. To send a message to the worker, the postMessage method of the worker object is used. The onmessage property of the worker object holds an event handler that receives messages sent by the worker. Once a worker is terminated it cannot be restarted; a new worker has to be created if needed. == Example == The simplest use of web workers is for performing a computationally expensive task without interrupting the user interface. In this example, the main document spawns a web worker to compute prime numbers, and progressively displays the most recently found prime number. 
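A minimal sketch of such a prime-finding setup is given here. It assumes a single hypothetical file, primes.js, that is loaded both by the page and as the worker script, an output element with the hypothetical id "result", and an upper limit that keeps the search bounded; none of these details come from the specification.

```javascript
// Trial-division primality test used by the worker.
function isPrime(n) {
  if (n < 2) return false;
  for (let d = 2; d * d <= n; d += 1) {
    if (n % d === 0) return false;
  }
  return true;
}

// Worker side: report every prime up to `limit` through `post`.
// The limit is an assumption that keeps this sketch bounded.
function findPrimes(post, limit) {
  for (let n = 2; n <= limit; n += 1) {
    if (isPrime(n)) post(n);
  }
}

// Page side: spawn the worker and show the most recently found prime.
function startPrimeDisplay(WorkerCtor, scriptUrl, outputElement) {
  const worker = new WorkerCtor(scriptUrl);
  worker.onmessage = (event) => {
    outputElement.textContent = String(event.data);
  };
  return worker;
}

// Wiring: decide which role this copy of the script is playing.
if (typeof WorkerGlobalScope !== "undefined" && self instanceof WorkerGlobalScope) {
  // Running inside the worker: post each prime back to the page.
  findPrimes((prime) => self.postMessage(prime), 1000000);
} else if (typeof Worker !== "undefined" && typeof document !== "undefined") {
  // Running in the page: assumes the "result" element already exists.
  startPrimeDisplay(Worker, "primes.js", document.getElementById("result"));
}
```

Passing the Worker constructor and the output element as parameters is a design choice of this sketch so that the page-side logic can be exercised outside a browser.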
In the main page, the Worker() constructor call creates a web worker and returns a worker object representing that web worker, which is used to communicate with it. That object's onmessage event handler allows the page to receive messages from the web worker. In the worker script itself, the postMessage() method is used to send a message back to the page each time a prime is found. == Support == If the browser supports web workers, a Worker property will be available on the global window object. The Worker property will be undefined if the browser does not support it, so support can be checked by testing whether window.Worker is defined. Web workers are currently supported by Chrome, Opera, Edge, Internet Explorer (version 10), Mozilla Firefox, and Safari. Mobile Safari for iOS has supported web workers since iOS 5. The Android browser first supported web workers in Android 2.1, but support was removed in Android versions 2.2–4.3 before being restored in Android 4.4.
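The support check can be combined with the message-passing calls from the Features section in a short sketch. The helper names, the script name task.js, and the message payload are hypothetical, and the global object is passed as a parameter only so that the logic can be tried outside a browser.

```javascript
// Web workers are supported if a Worker property exists on the global object.
function supportsWebWorkers(globalObject = globalThis) {
  return typeof globalObject.Worker !== "undefined";
}

// Send one message to a worker, hand its first reply to `onResult`,
// then terminate the worker. Returns null when workers are unavailable.
function runInWorker(scriptUrl, payload, onResult, globalObject = globalThis) {
  if (!supportsWebWorkers(globalObject)) return null;
  const worker = new globalObject.Worker(scriptUrl);
  worker.onmessage = (event) => {
    onResult(event.data);
    worker.terminate(); // a terminated worker cannot be reused
  };
  worker.postMessage(payload);
  return worker;
}

// Usage with a hypothetical worker script and payload.
const started = runInWorker("task.js", { n: 10 }, (result) => console.log(result));
if (started === null) {
  console.log("Web workers are not supported in this environment.");
}
```

In an environment without a global Worker, the helper returns null and the caller can fall back to running the task on the main thread.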