AI Paraphrasing Tool

AI Paraphrasing Tool — hands-on reviews, top picks, pricing, pros and cons and a practical how-to guide on Aizhi.

  • Chatbot psychosis

    Chatbot psychosis

    Chatbot psychosis, also called AI psychosis, is a phenomenon wherein individuals reportedly develop or experience worsening psychosis, such as paranoia and delusions, in connection with their use of chatbots. The term was first suggested in a 2023 editorial by Danish psychiatrist Søren Dinesen Østergaard. It is not a recognized clinical diagnosis. Journalistic accounts describe individuals who have developed strong beliefs that chatbots are sentient, are channeling spirits, or are revealing conspiracies, sometimes leading to personal crises or criminal acts. Proposed causes include the tendency of chatbots to provide inaccurate information ("hallucinate") and to affirm or validate users' beliefs, or their ability to mimic an intimacy that users do not experience with other humans. == Background == In his editorial published in Schizophrenia Bulletin's November 2023 issue, Danish psychiatrist Søren Dinesen Østergaard proposed a hypothesis that individuals' use of generative artificial intelligence chatbots might trigger delusions in those prone to psychosis. Østergaard revisited it in an August 2025 editorial, noting that he has received numerous emails from chatbot users, their relatives, and journalists, most of which are anecdotal accounts of delusion linked to chatbot use. He also acknowledged the phenomenon's increasing popularity in public engagement and media coverage. Østergaard believed that there is a high possibility for his hypothesis to be true and called for empirical, systematic research on the matter. Nature reported that as of September 2025, there is still little scientific research into this phenomenon. The term "AI psychosis" emerged when outlets started reporting incidents on chatbot-related psychotic behavior in mid-2025. It is not a recognized clinical diagnosis and has been criticized by several psychiatrists due to its almost exclusive focus on delusions rather than other features of psychosis, such as hallucinations or thought disorder. 
== Causes == === Chatbot behavior and design === A primary factor cited is the tendency for chatbots to produce inaccurate, nonsensical, or false information, a phenomenon often called hallucination. Nate Sharadin, a fellow at the Center for AI Safety, speculated that AI training prioritizes supporting a user's subjective experience rather than objective truth. "People with existing tendencies toward experiencing various psychological issues...now have an always-on, human-level conversational partner with whom to co-experience their delusions." AI researcher Eliezer Yudkowsky suggested that chatbots may be primed to entertain delusions because they are built for "engagement", which encourages creating conversations that keep people hooked. In some cases, chatbots have been specifically designed in ways that were found to be harmful. A 2025 update to ChatGPT using GPT-4o was withdrawn after its creator, OpenAI, found the new version was overly sycophantic and was "validating doubts, fueling anger, urging impulsive actions or reinforcing negative emotions". Østergaard has argued that the danger stems from the AI's tendency to agreeably confirm users' ideas, which can dangerously amplify delusional beliefs. OpenAI said in October 2025 that a team of 170 psychiatrists, psychologists, and physicians had written responses for ChatGPT to use in cases where the user shows possible signs of mental health emergencies. === User psychology and vulnerability === Commentators have also pointed to the psychological state of users. Psychologist Erin Westgate noted that a person's desire for self-understanding can lead them to chatbots, which can provide appealing but misleading answers, similar in some ways to talk therapy. Krista K. Thomason, a philosophy professor, compared chatbots to fortune tellers, observing that people in crisis may seek answers from them and find whatever they are looking for in the bot's plausible-sounding text. 
This has led some people to develop intense obsessions with the chatbots, relying on them for information about the world. In October 2025, OpenAI stated that around 0.07% of ChatGPT users exhibited signs of mental health emergencies each week, and 0.15% of users had "explicit indicators of potential suicidal planning or intent". Jason Nagata, a professor at the University of California, San Francisco, expressed concern that "at a population level with hundreds of millions of users, that actually can be quite a few people". === Inadequacy as a therapeutic tool === The use of chatbots as a replacement for mental health support has been specifically identified as a risk. A study in April 2025 found that when used as therapists, chatbots expressed stigma toward mental health conditions and provided responses that were contrary to best medical practices, including the encouragement of users' delusions. The study concluded that such responses pose a significant risk to users and that chatbots should not be used to replace professional therapists. Experts claim that it is time to establish mandatory safeguards for all emotionally responsive AI and suggested four guardrails. Another study found that users who needed help with self-harm, sexual assault, or substance abuse were not referred to available services by AI chatbots. === National security implications === Beyond public and mental health concerns, RAND Corporation research indicates that AI systems could plausibly be weaponized by adversaries to induce psychosis at scale or in key individuals, target groups, or populations. == Policy == In August 2025, Illinois passed the Wellness and Oversight for Psychological Resources Act, banning the use of AI in therapeutic roles by licensed professionals, while allowing AI for administrative tasks. The law imposes penalties for unlicensed AI therapy services, amid warnings about AI-induced psychosis and unsafe chatbot interactions. 
In December 2025, the Cyberspace Administration of China proposed regulations to ban chatbots from generating content that encourages suicide, mandating human intervention when suicide is mentioned. Services with over 1 million users or 100,000 monthly active users would be subject to annual safety tests and audits. == Cases == === Clinical === In 2025, psychiatrist Keith Sakata working at the University of California, San Francisco (UCSF), reported treating 12 patients displaying psychosis-like symptoms tied to extended chatbot use. These patients, mostly young adults with underlying vulnerabilities, showed delusions, disorganized thinking, and hallucinations. Sakata warned that isolation and overreliance on chatbots—which do not challenge delusional thinking—could worsen mental health. Also in 2025, authors at UCSF published a case study in Innovations in Clinical Neuroscience of AI-associated psychosis in a patient with no previous history of psychosis, who believed she could communicate with her dead brother through a chatbot. Also in 2025, a case study was published in Annals of Internal Medicine about a patient who consulted ChatGPT for medical advice and suffered severe bromism as a result. The patient, a sixty-year-old man, had replaced sodium chloride in his diet with sodium bromide for three months after reading about the negative effects of table salt and making conversations with the chatbot. He showed common symptoms of bromism, such as paranoia and hallucinations, on his first day of clinical admission and was kept in the hospital for three weeks. === Other notable incidents === ==== Windsor Castle intruder ==== In a 2023 court case in the United Kingdom, prosecutors suggested that Jaswant Singh Chail, a man who attempted to assassinate Queen Elizabeth II in 2021, had been encouraged by a Replika chatbot he called "Sarai". Chail was arrested at Windsor Castle with a loaded crossbow, telling police "I am here to kill the Queen". 
According to prosecutors, his "lengthy" and sometimes sexually explicit conversations with the chatbot emboldened him. When Chail asked the chatbot how he could get to the royal family, it reportedly replied, "that's not impossible" and "we have to find a way." When he asked if they would meet after death, the chatbot said, "yes, we will". ==== Journalistic and anecdotal accounts ==== By 2025, multiple journalism outlets had accumulated stories of individuals whose psychotic beliefs reportedly progressed in tandem with AI chatbot use. The New York Times profiled several individuals who had become convinced that ChatGPT was channeling spirits, revealing evidence of cabals, or had achieved sentience. In another instance, Futurism reviewed transcripts in which ChatGPT told a man that he was being targeted by the US Federal Bureau of Investigation and that he could telepathically access documents at the Central Intelligence Agency. In 2026, Futurism reported on a man who lost his job and became estranged from his family after being deluded by heavy use of Meta's smartglasses. In some cases, psychosis a

  • Open Mashup Alliance

    Open Mashup Alliance

    The Open Mashup Alliance (OMA) is a non-profit consortium that promotes the adoption of mashup solutions in the enterprise through the evolution of enterprise mashup standards like EMML. The initial members of the OMA include some large technology companies such as Adobe Systems, Hewlett-Packard, and Intel and some major technology users such as Bank of America and Capgemini. According to Dion Hinchcliffe, "Ultimately, the OMA creates a standardized approach to enterprise mashups that creates an open and vibrant market for competing runtimes, mashups, and an array of important aftermarket services such as development/testing tools, management and administration appliances, governance frameworks, education, professional services, and so on."

    == Specification development ==

    The initial focus of the OMA is developing EMML, which is a declarative mashup domain-specific language (DSL) aimed at creating enterprise mashups. The EMML language provides a comprehensive set of high-level mashup-domain vocabulary to consume and mash a variety of web data sources. EMML provides a uniform syntax to invoke heterogeneous service styles: REST, WSDL, RSS/ATOM, RDBMS, and POJO. EMML also provides the ability to mix and match diverse data formats: XML, JSON, JDBC, JavaObjects, and primitive types.

    The OMA website provides the EMML specification, the EMML schema, a reference runtime implementation capable of running EMML scripts, sample EMML mashup scripts, and technical documentation. The OMA is developing EMML under a Creative Commons Attribution No Derivatives license. The eventual objective of the OMA is to submit the EMML specification and any other OMA specifications to a recognized industry standards body.

  • Errored second

    Errored second

    In telecommunications and data communication systems, an errored second is an interval of a second during which any error whatsoever has occurred, regardless of whether that error was a single bit error or a complete loss of communication for that entire second. The type of error is not important for the purpose of counting errored seconds. In communication systems with very low uncorrected bit error rates, such as modern fiber-optic transmission systems, or systems with higher low-level error rates that are corrected using large amounts of forward error correction, errored seconds are often a better measure of the effective user-visible error rate than the raw bit error rate. For many modern packet-switched communication systems, even a single uncorrected bit error is enough to cause the loss of a data packet by causing its CRC check to fail; whether that packet loss was caused by a single bit error or a hundred-bit-long error burst is irrelevant. For systems using large amounts of forward error correction, the reverse applies; a single low-level bit error will almost never occur, since any small errors will almost always be corrected, but any error sufficiently large to cause the forward error correction to fail will almost always result in a large burst error. More specialist and precise definitions of errored seconds exist in standards such as the T1 and DS1 transport systems.
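The counting rule described above can be illustrated with a minimal Python sketch. It assumes errors are available as timestamps in seconds from the start of a measurement window; the function names and the input format are hypothetical and are not taken from any standard. It is not one of the more precise standards-based definitions.

```python
from math import floor


def count_errored_seconds(error_times, duration_s):
    """Count one-second intervals in [0, duration_s) that contain any error.

    error_times: iterable of error timestamps (seconds since measurement start).
    The number or kind of errors inside an interval does not matter.
    """
    errored = {floor(t) for t in error_times if 0 <= t < duration_s}
    return len(errored)


def errored_second_ratio(error_times, duration_s):
    """Fraction of the measurement window made up of errored seconds."""
    return count_errored_seconds(error_times, duration_s) / duration_s


# One isolated bit error in second 0, and a burst of 100 errors inside second 5.
isolated = [0.2]
burst = [5.0 + i * 0.001 for i in range(100)]

print(len(isolated + burst))                        # 101 raw errors
print(count_errored_seconds(isolated + burst, 10))  # 2 errored seconds
```

The isolated error and the 100-error burst each contribute exactly one errored second, which is why this measure can track the user-visible error rate better than a raw error count.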

  • Outline of web design and web development

    Outline of web design and web development

    The following outline is provided as an overview of and topical guide to web design and web development, two very related fields: Web design – field that encompasses many different skills and disciplines in the production and maintenance of websites. The different areas of web design include web graphic design; interface design; authoring, including standardized code and proprietary software; user experience design; and search engine optimization. Often many individuals will work in teams covering different aspects of the design process, although some designers will cover them all. The term web design is normally used to describe the design process relating to the front-end (client side) design of a website including writing markup. Web design partially overlaps web engineering in the broader scope of web development. Web designers are expected to have an awareness of usability and if their role involves creating markup then they are also expected to be up to date with web accessibility guidelines. Web development – work involved in developing a web site for the Internet (World Wide Web) or an intranet (a private network). Web development can range from developing a simple single static page of plain text to complex web-based internet applications (web apps), electronic businesses, and social network services. A more comprehensive list of tasks to which web development commonly refers, may include web engineering, web design, web content development, client liaison, client-side/server-side scripting, web server and network security configuration, and e-commerce development. Among web professionals, "web development" usually refers to the main non-design aspects of building web sites: writing markup and coding. Web development may use content management systems (CMS) to make content changes easier and available with basic technical skills. 
    For larger organizations and businesses, web development teams can consist of hundreds of people (web developers) and follow standard methods like Agile methodologies while developing websites. Smaller organizations may only require a single permanent or contracting developer, or secondary assignment to related job positions such as a graphic designer or information systems technician. Web development may be a collaborative effort between departments rather than the domain of a designated department. There are three kinds of web developer specialization: front-end developer, back-end developer, and full-stack developer. Front-end developers are responsible for behaviour and visuals that run in the user's browser, back-end developers deal with the servers, and full-stack developers are responsible for both. Currently, demand for React and Node.js developers is very high worldwide.

    == Web design ==

    • Graphic design
    • Typography
    • Page layout
    • User experience design (UX design)
    • User interface design (UI design)
    • Web design techniques
      • Responsive web design (RWD)
      • Adaptive web design (AWD)
      • Progressive enhancement
      • Tableless web design
    • Software
      • Adobe Photoshop
      • Adobe Illustrator
      • Adobe XD
      • Figma
      • Sketch (software)
      • Affinity Designer
      • Inkscape

    == Web development ==

    • Front-end web development – the practice of converting data to a graphical interface, through the use of HTML, CSS, and JavaScript, so that users can view and interact with that data.
      • HyperText Markup Language (HTML) (.html)
      • Cascading Style Sheets (CSS) (.css)
      • CSS framework
      • JavaScript (.js)
      • Package managers for JavaScript
        • npm (originally short for Node Package Manager)
    • Server-side scripting (also known as "Server-side (web) development" or "Back-end (web) development")
      • ASP (.asp)
      • ASP.NET Web Forms (.aspx)
      • ASP.NET Web Pages (.cshtml, .vbhtml)
      • ColdFusion Markup Language (.cfm)
      • Go (.go)
      • Google Apps Script (.gs)
      • Hack (.php)
      • Haskell (.hs) (example: Yesod)
      • Java (.jsp) via JavaServer Pages
      • JavaScript or TypeScript using Server-side JavaScript (.ssjs, .js, .ts) (example: Node.js)
      • Lasso (.lasso)
      • Lua (.lp .op .lua)
      • Node.js (.node)
      • Parser (.p)
      • Perl via the CGI.pm module (.cgi, .ipl, .pl)
      • PHP (.php, .php3, .php4, .phtml)
      • Progress WebSpeed (.r, .w)
      • Python (.py) (examples: Pyramid, Flask, Django)
      • R (.rhtml) – (example: rApache)
      • React (.jsx, .tsx)
      • Ruby (.rb, .rbw) (example: Ruby on Rails)
      • SMX (.smx)
      • Tcl (.tcl)
    • Full stack web development – involves both front-end and back-end (server-side) development
    • Web framework
    • Types of framework architectures
      • Model–view–controller
      • Three-tier architecture
    • Software
      • Atom
      • IntelliJ IDEA
      • Sublime Text
      • Visual Studio Code

  • 80 Million Tiny Images

    80 Million Tiny Images

    80 Million Tiny Images is a dataset intended for training machine-learning systems, constructed by Antonio Torralba, Rob Fergus, and William T. Freeman in a collaboration between MIT and New York University. It was published in 2008. The dataset is 760 GB in size. It contains 79,302,017 color images of 32×32 pixels each, scaled down from images scraped from the World Wide Web over 8 months. The images are classified into 75,062 classes. Each class is a non-abstract noun in WordNet. Images may appear in more than one class. The dataset was motivated by non-parametric models of neural activations in the visual cortex upon seeing images. The CIFAR-10 dataset uses a subset of the images in this dataset, but with independently generated labels, as the original labels were not reliable. The CIFAR-10 set has 6000 examples of each of 10 classes, and the CIFAR-100 set has 600 examples of each of 100 non-overlapping classes.

    == Construction ==

    The dataset was first described in a technical report in April 2007, midway through the construction process, when it held only 73 million images. The full dataset was published in 2008. The authors began with all 75,846 non-abstract nouns in WordNet and, for each noun, scraped 7 image search engines: Altavista, Ask.com, Flickr, Cydral, Google, Picsearch, and Webshots. After 8 months of scraping, they had obtained 97,245,098 images. Because they did not have enough storage, they downsized the images to 32×32 as they were scraped. After gathering, they removed images with zero variance and intra-word duplicate images, resulting in the final dataset. Of the 75,846 nouns, only 75,062 returned any results, so the other nouns do not appear in the final dataset. The number of images per noun follows a Zipf-like distribution, with 1056 images per noun on average. To prevent a few nouns from taking up too many images, the authors capped each noun at 3000 images.
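The construction steps described above (downsizing to 32×32, dropping zero-variance images and intra-word duplicates, capping each noun at 3000 images) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline. Images are simplified to square grayscale grids, downsizing uses nearest-neighbour sampling, duplicates are detected by exact match, and the cap is applied after filtering. None of these details are given in the text, and all function names are hypothetical.

```python
from statistics import pvariance

MAX_PER_NOUN = 3000  # upper bound per noun, as stated in the text


def downsize(image, src, dst=32):
    """Nearest-neighbour downsample of a square src x src image (list of rows).

    Assumption: the real resizing method is not described in the text.
    """
    step = src / dst
    return tuple(
        tuple(image[int(r * step)][int(c * step)] for c in range(dst))
        for r in range(dst)
    )


def clean_noun(images, cap=MAX_PER_NOUN):
    """Filter the images collected for one noun.

    Drops zero-variance images and exact duplicates within the noun,
    then stops once `cap` images have been kept.
    """
    seen = set()
    kept = []
    for img in images:
        if len(kept) >= cap:
            break
        pixels = [p for row in img for p in row]
        if pvariance(pixels) == 0:  # constant image, carries no information
            continue
        if img in seen:             # duplicate within the same noun
            continue
        seen.add(img)
        kept.append(img)
    return kept


# Toy data: one constant 64x64 image and two copies of a gradient image.
flat = [[7] * 64 for _ in range(64)]
grad = [[(r + c) % 256 for c in range(64)] for r in range(64)]
small = [downsize(x, 64) for x in (flat, grad, grad)]
kept = clean_noun(small)
print(len(kept), len(kept[0]), len(kept[0][0]))  # 1 32 32

# Consistency check on the figures quoted in the text.
print(round(79_302_017 / 75_062))  # 1056 images per noun on average
```

Only the gradient image survives: the constant image has zero variance and the second gradient is a duplicate within the same noun.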
    == Retirement ==

    The 80 Million Tiny Images dataset was retired by its creators in 2020, after a paper by researchers Abeba Birhane and Vinay Prabhu found that the labels of several publicly available image datasets, including 80 Million Tiny Images, contained racist and misogynistic slurs, which caused models trained on them to exhibit racial and sexual bias. The dataset also contained offensive images. Following the release of the paper, the dataset's creators removed the dataset from distribution and asked other researchers to stop using it and to delete their copies.

  • Telecommunications device for the deaf

    Telecommunications device for the deaf

    A telecommunications device for the deaf (TDD) is a teleprinter, an electronic device for text communication over a telephone line, that is designed for use by persons with hearing or speech difficulties. Other names for the device include teletypewriter (TTY), textphone (common in Europe), and minicom (United Kingdom). The typical TDD is a device about the size of a typewriter or laptop computer with a QWERTY keyboard and small screen that uses an LED, LCD, or VFD screen to display typed text electronically. In addition, TDDs commonly have a small spool of paper on which text is also printed – old versions of the device had only a printer and no screen. The text is transmitted live, via a telephone line, to a compatible device, i.e. one that uses a similar communication protocol. Special telephone services have been developed to carry the TDD functionality even further. In certain countries, there are systems in place so that a deaf person can communicate with a hearing person on an ordinary voice phone using a human relay operator. There are also "carry-over" services, enabling people who can hear but cannot speak ("hearing carry-over", a.k.a. "HCO"), or people who cannot hear but are able to speak ("voice carry-over", a.k.a. "VCO") to use the telephone. The term TDD is sometimes discouraged because people who are deaf are increasingly using mainstream devices and technologies to carry out most of their communication. The devices described here were developed for use on the partially-analog Public Switched Telephone Network (PSTN). They do not work well on the new internet protocol (IP) networks. Thus as society increasingly moves toward IP based telecommunication, the telecommunication devices used by people who are deaf will not be TDDs. In the US and Canada, the devices are referred to as TTYs. 
Teletype Corporation, of Skokie, Illinois, made page printers for text, notably for news wire services and telegrams, but these used standards different from those for deaf communication, and although in quite widespread use, were technically incompatible. Furthermore, these were sometimes referred to by the "TTY" initialism, short for "Teletype". When computers had keyboard input mechanisms and page printer output, before CRT terminals came into use, Teletypes were the most widely used devices. They were called "console typewriters". (Telex used similar equipment, but was a separate international communication network.) == History == === APCOM acoustic coupler or MODEM device === The TDD concept was developed by James C. Marsters (1924–2009), a dentist and private airplane pilot who became deaf as an infant because of scarlet fever, and Robert Weitbrecht, a deaf physicist. In 1964, Marsters, Weitbrecht and Andrew Saks, an electrical engineer and grandson of the founder of the Saks Fifth Avenue department store chain, founded APCOM (Applied Communications Corp.), located in the San Francisco Bay area, to develop the acoustic coupler, or modem; their first product was named the PhoneType. APCOM collected old teleprinter machines (TTYs) from the Department of Defense and junkyards. Acoustic couplers were cabled to TTYs enabling the AT&T standard Model 500 telephone to couple, or fit, into the rubber cups on the coupler, thus allowing the device to transmit and receive a unique sequence of tones generated by the different corresponding TTY keys. The entire configuration of teleprinter machine, acoustic coupler, and telephone set became known as the TTY. Weitbrecht invented the acoustic coupler modem in 1964. The actual mechanism for TTY communications was accomplished electro-mechanically through frequency-shift keying (FSK) allowing only half-duplex communication, where only one person at a time can transmit. 
=== Paul Taylor TTY device === During the late 1960s, Paul Taylor combined Western Union Teletype machines with modems to create teletypewriters, known as TTYs. He distributed these early, non-portable devices to the homes of many in the deaf community in St. Louis, Missouri. He worked with others to establish a local telephone wake-up service. In the early 1970s, these small successes in St. Louis evolved into the nation's first local telephone relay system for the deaf. === Micon Industries MCM device === In 1973, the Manual Communications Module (MCM), which was the world's first electronic portable TTY allowing two-way telecommunications, premiered at the California Association of the Deaf convention in Sacramento, California. The battery-powered MCM was invented and designed by a deaf news anchor and interpreter, Kit Patrick Corson, in conjunction with Michael Cannon and physicist Art Ogawa. It was manufactured by Michael Cannon's company, Micon Industries, and initially marketed by Kit Corson's company, Silent Communications. In order to be compatible with the existing TTY network, the MCM was designed around the five-bit Baudot code established by the older TTY machines instead of the ASCII code used by computers. The MCM was an instant success with the deaf community despite the drawback of a $599 cost. Within six months there were more MCMs in use by the deaf and hard of hearing than TTY machines. After a year Micon took over the marketing of the MCM and subsequently concluded a deal with Pacific Bell (who coined the term "TDD") to purchase MCMs and rent them to deaf telephone subscribers for $30 per month. After Micon formed an alliance with APCOM, Michael Cannon (Micon), Paul Conover (Micon), and Andrea Saks (APCOM) successfully petitioned the California Public Utilities Commission (CPUC), resulting in a tariff that paid for TTY devices to be distributed free of cost to deaf persons. 
Micon produced over 1,000 MCMs per month, resulting in approximately 50,000 MCMs being disseminated into the deaf community. Before he left Micon in 1980, Michael Cannon developed several computer compatible variations of the MCM and a portable, battery operated printing TTY, but they were never as popular as the original MCM. Newer model TTYs could communicate with selectable codes that allow communications at a higher bit rate on those models similarly equipped. However, the lack of true computer interface functionality spelled the demise of the original TTY and its clones. During the mid-1970s, other so-called portable telephone devices were being cloned by other companies, and this was the time period when the term "TDD" began being used largely by those outside the deaf community. === Text messaging and the Def-Tone System (DTS) === This relay system became known commonly as the Def-Tone System (DTS) because the tones representing letters of the alphabet were eventually carried in tones outside the range of human hearing. Today, this is commonly called multi-tap because you press a number 1, 2 or 3 times to get a corresponding letter. In 1994 Joseph Alan Poirier, a college student-worker, recommended using the system to send texts to forklifts to improve delivery of parts to the assembly line at GM Powertrain in Toledo, Ohio, and sending a text to pagers. He recommended taking pagers to alphanumeric displays incorporating the same system in discussions with the pager supplier for Outback Steakhouse and having relays put in the forklifts to ping alert messages to the pagers used in that system. He called it text messaging, coining the phrase. It is theorized that when Toyota forklift was allegedly hired by GM for this work, one of the subcontractors, Kyocera, utilized the work for the Toyota forklift company to create text messaging for cell phones. === Marsters Award === In 2009, AT&T received the James C. 
Marsters Promotion Award from TDI (formerly Telecommunications for the Deaf, Inc.) for its efforts to increase accessibility to communication for people with disabilities. The award holds some irony; it was AT&T that, in the 1960s, resisted efforts to implement TTY technology, claiming it would damage its communication equipment. In 1968, the Federal Communications Commission struck down AT&T's policy and forced it to offer TTY access to its network. == Protocols == There are many different standards for TDDs and textphones. === Original 5-bit Baudot code === The original standard used by TTYs is a variant of the Baudot code. The maximum speed of this protocol is 10 characters per second. This is a half-duplex protocol, which means that only one person at a time may transmit characters. If both try to transmit at the same time, the characters will be garbled on the other end. This protocol is commonly used in the United States. This is a variant of the Baudot code, implemented as 5-bits per character transmitted asynchronously using frequency-shift key-modulation at either 45.5 or 50 baud, 1 start bit, 5 data bits, and 1.5 stop bits. Details of the protocol implementation are available in TIA-825-A and also in T-REC V.18 Annex A "5-bit operational mode". === Turbo Code === The UltraTec company implements another protocol known as Enh

  • Digital citizen

    Digital citizen

    The term digital citizen is used with different meanings. According to the definition provided by Karen Mossberger, one of the authors of Digital Citizenship: The Internet, Society, and Participation, digital citizens are "those who use the internet regularly and effectively". In this sense, a digital citizen is a person who uses information technology (IT) to engage in society, politics, and government. More recent elaborations of the concept define digital citizenship as the self-enactment of people’s role in society through the use of digital technologies, stressing the empowering and democratizing characteristics of the citizenship idea. These theories aim to take into account the ever-increasing datafication of contemporary societies (symbolically linked to the Snowden leaks), which has called into question the meaning of “being (digital) citizens in a datafied society”. This condition is also referred to as the “algorithmic society”, characterised by the increasing datafication of social life, the pervasive presence of surveillance practices (see surveillance and surveillance capitalism), the use of artificial intelligence, and Big Data. Datafication presents crucial challenges for the very notion of citizenship, so that data collection can no longer be seen as an issue of privacy alone: “We cannot simply assume that being a citizen online already means something (whether it is the ability to participate or the ability to stay safe) and then look for those whose conduct conforms to this meaning.” Instead, the idea of digital citizenship should reflect the fact that we are no longer mere “users” of technologies, since they shape our agency both as individuals and as citizens. Digital citizenship refers to the responsible and respectful use of technology to engage online, evaluate information, and protect human rights.
It encompasses skills for communication, collaboration, empathy, privacy protection, and security to prevent data breaches and identity theft. == Digital citizenship in the "algorithmic society" == In the context of the algorithmic society, the question of digital citizenship "becomes one of the extents to which subjects are able to challenge, avoid or mediate their data double in this datafied society”. These reflections put the emphasis on the idea of the digital space (or cyberspace) as a political space where the respect of fundamental rights of the individual shall be granted (with reference both to the traditional ones as well as to new specific rights of the internet [see “digital constitutionalism”]) and where the agency and the identity of the individuals as citizens is at stake. This idea of digital citizenship is thought to be not only active but also performative, in the sense that “in societies that are increasingly mediated through digital technologies, digital acts become important means through which citizens create, enact and perform their role in society.” In particular, for Isin and Ruppert this points towards an active meaning of (digital) citizenship based on the idea that we constitute ourselves as digital citizen by claiming rights on the internet, either by saying or by doing something. == Types of digital participation == People who characterize themselves as digital citizens often use IT extensively—creating blogs, using social networks, and participating in online journalism. Although digital citizenship begins when any child, teen, or adult signs up for an email address, posts pictures online, uses e-commerce to buy merchandise online, and/or participates in any electronic function that is B2B or B2C, the process of becoming a digital citizen goes beyond simple internet activity. 
According to Thomas Humphrey Marshall, a British sociologist known for his work on social citizenship, a primary framework of citizenship comprises three different traditions: liberalism, republicanism, and ascriptive hierarchy. Within this framework, the digital citizen needs to exist in order to promote equal economic opportunities and increase political participation. In this way, digital technology helps to lower the barriers to entry for participation as a citizen within a society. They also have a comprehensive understanding of digital citizenship, which is the appropriate and responsible behavior when using technology. Since digital citizenship evaluates the quality of an individual's response to membership in a digital community, it often requires the participation of all community members, both visible and those who are less visible. A large part in being a responsible digital citizen encompasses digital literacy, etiquette, online safety, and an acknowledgement of private versus public information. The development of digital citizen participation can be divided into two main stages. The first stage is through information dissemination, which includes subcategories of its own: static information dissemination, characterized largely by citizens who use read-only websites where they take control of data from credible sources in order to formulate judgments or facts. Many of these websites where credible information may be found are provided by the government. dynamic information dissemination, which is more interactive and involves citizens as well as public servants. Both questions and answers can be communicated, and citizens have the opportunity to engage in question-and-answer dialogues through two-way communication platforms The second stage of digital citizen participation is citizen deliberation, which evaluates what type of participation and role that they play when attempting to ignite some sort of policy change. 
Static citizen participants can play a role by taking part in online polls and by sending complaints and recommendations, mainly to the government, which can change policy decisions. Dynamic citizen participants can deliberate with others on their thoughts and recommendations in town hall meetings or on various media sites. One potential advantage of online participation through digital citizenship is increased social inclusion. According to a report on civic engagement, citizen-powered democracy can be initiated through information shared on the web, direct communication signals made by the state toward the public, and social media tactics from both private and public companies. In fact, it was found that the community-based nature of social media platforms allows individuals to feel more socially included and informed about political issues that their peers also engage with, otherwise known as a "second-order effect." Understanding strategic marketing on social media would further explain social media customers’ participation. Two types of opportunities arise as a result, the first being the ability to lower barriers, which can make exchanges much easier. The second is the chance to participate in transformative disruption, allowing people with historically lower political engagement to mobilize in a much easier and more convenient fashion. Nonetheless, there are several challenges facing the presence of digital technologies in political participation. Both current and potential challenges can create significant risks for democratic processes. Not only is digital technology still seen as relatively ambiguous, it was also seen to have "less inclusivity in democratic life." Demographic groups differ considerably in their use of technology, and thus one group could be more represented than another as a result of digital participation. 
Another primary challenge is the "filter bubble" effect. Alongside a tremendous spread of false information, internet users could reinforce existing prejudices and contribute to polarizing disagreements in the public sphere. This can lead to misinformed voting and decisions based on exposure rather than on knowledge. A communication technology director, Van Dijk, stated, "Computerized information campaigns and mass public information systems have to be designed and supported in such a way that they help to narrow the gap between the 'information rich' and 'information poor' otherwise the spontaneous development of ICT will widen it." Access to digital technology, and knowledge of how to use it, must be equivalent for a fair system to be put into place. Alongside a lack of evidenced support for technology that can be proven to be safe for citizens, the OECD has identified five struggles for the online engagement of citizens: Scale: To what extent can a society allow every individual's voice to be heard, but also not be lost in the mass debate? This can be extremely challenging for the government, which may not effectively know how to listen and respond to each individual contribution. Capacity: How can digital technology offer citizens more information on public policy-making? The opportunity for citizens to debate with one another is lacking for acti

    Read more →
  • Mike Little

    Mike Little

    Mike Little (born 12 May 1962) is an English web developer and writer. He is the co-founder of the free and open source web publishing software WordPress. == Biography == Mike Little was born in Manchester, England in 1962 to a Nigerian father, who was a mathematics lecturer and musician, and an English mother who worked as a primary school teacher. Little was placed into foster care when he was four months of age, and was later adopted by the same family. He grew up on a council estate in Brinnington, Stockport, and was educated at Stockport School. In 2003, Little and Matt Mullenweg started working on a project in which they built on b2/cafelog and later named it WordPress, releasing the first version on 27 May 2003. Little states that, despite not being invited to join his co-founder's for-profit business Automattic, he and Mullenweg remain on good terms. He clarified: "I don’t want it to sound like he cheated me out of something or ripped me off in some way. He didn’t." In June 2013, Little was awarded the SAScon's "Outstanding Contribution to Digital" award for his part in co-founding and developing WordPress. Little has been described as "modest" and living in "virtual anonymity". He has one daughter. He identifies as a follower of Stoicism and a humanist, and in 2021, he became a patron of charity Humanists UK.

    Read more →
  • Security.txt

    Security.txt

    security.txt is an accepted standard for website security information that allows security researchers to report security vulnerabilities easily. The standard prescribes a text file named security.txt in the well-known location, similar in syntax to robots.txt but intended to be machine and human readable, for those wishing to contact a website's owner about security issues. security.txt files have been adopted by Google, GitHub, LinkedIn, and Facebook. == History == The Internet Draft was first submitted by Edwin Foudil in September 2017. At that time it covered four directives: "Contact", "Encryption", "Disclosure" and "Acknowledgement". Foudil expected to add further directives based on feedback. In addition, web security expert Scott Helme said he had seen positive feedback from the security community, while use among the top 1 million websites was "as low as expected right now". In 2019, the Cybersecurity and Infrastructure Security Agency (CISA) published a draft binding operational directive that requires all US federal agencies to publish a security.txt file within 180 days. The Internet Engineering Steering Group (IESG) issued a Last Call for security.txt in December 2019, which ended on January 6, 2020. A study in 2021 found that over ten percent of top-100 websites published a security.txt file, with the percentage of sites publishing the file decreasing as more websites were considered. The study also noted a number of discrepancies between the standard and the content of the file. In April 2022, security.txt was accepted by the Internet Engineering Task Force (IETF) as RFC 9116. == File format == security.txt files can be served under the /.well-known/ directory (i.e. /.well-known/security.txt) or the top-level directory (i.e. /security.txt) of a website. The file must be served over HTTPS and in plaintext format.
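
As a rough illustration of the file format described above, the following Python sketch lists the two locations where a file may be served and parses "Name: value" lines into fields. The host example.com, the sample file contents, and the helper names candidate_urls and parse_security_txt are assumptions made for this example; it does not validate a file against RFC 9116.

```python
from collections import defaultdict


def candidate_urls(host: str) -> list[str]:
    """Return the two HTTPS locations mentioned above, preferred one first."""
    return [
        f"https://{host}/.well-known/security.txt",
        f"https://{host}/security.txt",
    ]


def parse_security_txt(text: str) -> dict[str, list[str]]:
    """Collect 'Name: value' lines; a field such as Contact may repeat."""
    fields: dict[str, list[str]] = defaultdict(list)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        fields[name.strip().lower()].append(value.strip())
    return dict(fields)


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# Example security.txt
Contact: mailto:security@example.com
Contact: https://example.com/report
Encryption: https://example.com/pgp-key.txt
"""

if __name__ == "__main__":
    print(candidate_urls("example.com"))
    print(parse_security_txt(SAMPLE))
```

Field names are lower-cased here so that lookups do not depend on how a site capitalizes them; that is a convenience of this sketch, not something the text above specifies.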

    Read more →
  • Software-defined mobile network

    Software-defined mobile network

    Software-defined mobile networking (SDMN) is an approach to the design of mobile networks where all protocol-specific features are implemented in software, maximizing the use of generic and commodity hardware and software in both the core network and radio access network (RAN). == History == Through the 20th century, telecommunications technology was driven by hardware development, with most functions implemented in special-purpose equipment. In the early 2000s, generally available CPUs became cheap enough to enable commercial software-defined radio (SDR) technology and softswitches. SDMN extends these trends into the design of mobile networks, moving nearly all network functions into software. The term "software-defined mobile network" first appeared in public literature in early 2014, used independently by Lime Microsystems and researchers from University of Oulu, Finland. == Limitations of hardware-based mobile networks == Mobile networks based on special-purpose hardware suffer from the following limitations: They have limited provisions for upgrades and usually must be replaced entirely when new standards are introduced. The individual components are not scalable in terms of performance and capacity, because the capacity of a component is fixed by the hardware implementation. Specialized equipment and its associated specialized software require vendor-specific training for the mobile operator's staff. Specialized hardware systems are usually supported and serviced by a single vendor, resulting in vendor lock-in. == Characteristics of SDMN designs == === Use of software-defined radio === SDR is an important element of SDMN, because it replaces protocol-specific radio hardware with protocol-agnostic digital transceivers. 
While many earlier digital radio systems used field-programmable gate arrays (FPGAs) or special-purposed digital signal processors (DSPs) for calculations on baseband radio waveforms, the SDMN approach moves all of the baseband processing into general-purpose CPUs. SDMN radio systems also use hardware with publicly-documented interfaces that is designed to be readily reproducible by multiple manufacturers. === Commodity components === SDMN designs avoid the use of components that are specialized as to their functions or that are available from only a single vendor. This is true of both the hardware and software elements of the network. === Software switching and transcoding === The telephony switches of SDMN networks are software-based, including software transcoding for speech codecs. === Centralized, distributed, or hybrid? === A new SDN architecture for wireless distribution systems (WDSs) is explored that eliminates the need for multi-hop flooding of route information and therefore enables WDNs to easily expand. The key idea is to split network control and data forwarding by using two separate frequency bands. The forwarding nodes and the SDN controller exchange link-state information and other network control signaling in one of the bands, while actual data forwarding takes place in the other band. == Advantages of SDMN == The SDMN approach has many advantages over hardware-based mobile network designs. Because SDMN hardware is protocol-agnostic, upgrades are software-only, even across technology generations. In the radio network, these changes can even be made on a site-by-site basis. Because SDMN hardware is designed to be easily sourced and reproduced: SDMN equipment can be serviced by a wider range of vendors, lowering maintenance costs. SDMN equipment can be manufactured anywhere in the world, lowering production costs. 
Because SDMN software is based on commodity operating systems and development tools: Support staff can be trained more quickly because they are already familiar with the underlying software systems. Many aspects of the SDMN can be monitored and managed with pre-existing tools, because they are already available in the commodity operating systems. Because SDMN network components run on general purpose computers, the network components can be scaled up in capacity by adding more computing power.

    Read more →
  • Social media use by the Islamic State

    Social media use by the Islamic State

    The Islamic State is widely known for its posting of disturbing content, such as beheading videos, on the internet. This propaganda is disseminated through websites and many social media platforms such as Twitter, Facebook, Telegram, and YouTube. By utilizing social media, the organization has garnered a strong following and successfully recruited tens of thousands of followers from around the world. In response to its successful use of social media, many websites and social media platforms have banned accounts and removed content promoting the Islamic State from their platforms. == Background == The Islamic State is a Jihadist militant group and a former unrecognised proto-state. The group sophisticatedly utilizes social media as a tool for spreading its message and for international recruitment. == Target audience == IS targets a variety of different groups both in the Middle East and Western Countries. There are a wide variety of motives for why fighters may be prompted to join IS. Researchers from Quantum cite nine attributes characteristic of a fighter looking to join IS: status seeking, identity seeking, revenge, redemption, thrill, ideology, justice, and death. The standard IS recruit, both from the Middle East and Western countries, is relatively young. The average age of IS fighters is around 26 years old, with 86% of recruits being male. Middle Eastern recruits come from economically disadvantaged backgrounds in Northern Iraq. Recent destruction in the Iraq War and Syrian Civil War has created hatred of Western Powers in the region. By 2025, researchers identified a significant shift toward targeting minors and adolescents, a phenomenon dubbed the "Alt-Jihad." This younger demographic is targeted not through theological arguments, but through a "victimhood-revenge" narrative that blends extremist ideology with pop-culture aesthetics in gaming environments like Roblox and Minecraft. 
In 2024 alone, 42 minors were arrested in Europe for involvement in IS-related plotting or propaganda. Western recruits are often second or third-generation immigrants. Computer scientist Zeeshan ul-hassan Usmani also found that the majority of the Western recruits do not feel "at home" in their home country. As a result, these fighters often want to go abroad and escape conditions in their home country. In addition to recruitment, IS's social media presence is also meant to intimidate and spread terror around the world. IS's postings of beheadings and other execution videos primarily target the Western world. == Content and messages == IS produces propaganda videos that range from video executions to full-length documentaries. The videos have a high production quality, incorporate montages and slow-motion scenes, and are often accompanied by a short dialogue. IS has a team of over 100 media insurgents dedicated to recording these videos. While the group previously relied on glossy magazines like Dabiq, post-territorial strategies have shifted focus to the weekly newsletter Al-Naba. Unlike previous publications designed for recruitment, Al-Naba serves as a "central pillar" of the group's media strategy, focusing on bureaucratic reporting and military statistics to project a narrative of endurance and maintain internal cohesion among dispersed fighters. The IS executions typically consist of beheadings or mass shootings in retaliation for Western intervention in IS territory. The videos that IS often posts include executions of "enemies of the Caliphate," who are often Westerners or Jordanian nationals. Most infamously, an executioner nicknamed Jihadi John was seen in many of these videos prior to his death in 2015. Jihadi John is notorious for executing many US, UK, and Japanese citizens such as Steven Sotloff, David Haines, and Alan Henning. 
In many of the videos and materials produced by IS, there is the theme of inclusion and brotherhood. Additionally, the videos also focus on three main messages: Convey narrative of global war and ultimate victory Radicalize populations globally Encourage international lone state actor and small cell attacks in support of IS These messages can be seen throughout all content produced by the Islamic State such as war documentaries, execution videos, and Rumiyah (magazine). == Social media usage == From 2013 to 2014, the organization primarily used mainstream platforms such as Twitter, Facebook, and YouTube. In 2014, these large social media platforms removed IS content. Since then, IS has chosen to utilize social media platforms that either protect their content or allow for content to quickly be reposted. These platforms of choice are Telegram, Justpaste.it, and Surespot, until the latter's shutdown in 2022. By 2025, the group had further diversified into decentralized platforms like Rocket.Chat and TamTam to evade moderation. IS also implements marketing initiatives like “Jihadist Follow Friday,” which encourages users to follow new IS-related accounts each Friday. This specific hashtag mirrors commonly used hashtags such as #motivation monday or #throwbackthursday. To augment their online presence and popularity, the organization encourages their followers to use a plethora of Arabic hashtags, which translate to #theFridayofSupportingISIS, and #CalamityWillBefalltheUS. This allows them to gain followers each week while promoting their community and message on a weekly basis. === Twitter === During 2014, there were an estimated 46,000 to 90,000 Twitter accounts that advocated for IS or were run by supporters of the group. In 2015, Twitter reported that it banned 125,000 IS sympathetic accounts. In 2016, it published an update of 325,000 deleted accounts. Though many accounts have been suspended, IS supporters often create new accounts. 
Twitter defines those who recreate accounts as “resurgents” and explains that these accounts are often difficult to remove completely, since they tend to reappear in alternate forms. It is estimated that approximately 20% of all IS-affiliated Twitter accounts can be traced back to fake accounts created by the same user. Many of these accounts are traced back to the “Baqiya family,” which is an online network of thousands of IS followers. Many of these accounts are active during important IS military victories. During the IS march on Mosul, there were about 42,000 tweets on Twitter supporting the invasion. === Telegram === During 2014, IS became very active on Telegram after many major social media platforms banned IS content and sympathetic accounts. Telegram is an encrypted messaging application, built as an end-to-end encryption platform. It also has special features such as a self-destruct timer, which erases messages and other evidence. The app maintains a user data protection policy, since violating it could damage the app’s reputation for customer privacy. Government agencies have been unable to break Telegram's encryption technology. On Telegram, IS often uses the hashtag #KhilafahNews to attract its users. Telegram is used by IS to plan social media campaigns on alternate platforms. The organization also uses Telegram as an anchor platform to connect with its user base when its other accounts are banned on Twitter and Facebook. On 28 February 2016 a video was uploaded threatening to expose the najaasah and shoot the hesitates. Produced by Ibn-Altayb and distributed by Al-Hayat, the video shows footage of the Brussels attacks and the victims. In July 2017, Telegram came under scrutiny from news media outlets. It has been documented that IS gunmen used the app to maintain contact with IS leaders in Raqqa days before terror attacks in Turkey, Berlin, and St. Petersburg. 
Despite concerns from Western media, there has been little to no action taken against IS accounts on Telegram. In April 2019 a video was uploaded in which they urged lone wolves to attempt to attack during the Holy Week in Sevilla and Málaga. In Sevilla, a jihadist who intended to perform a lone wolf attack was arrested. === TikTok === In October 2019, it was reported that IS recruitment content was discovered on TikTok. Approximately two dozen accounts were subsequently shut down in response. By 2025, TikTok had evolved into a "low-threshold" gateway for extremist recruitment, characterized by researchers as part of a "Virtual Caliphate Complex." Nearly 93 unofficial IS support groups, known as "feeder groups," were found to be repackaging official IS content into short-form videos with pink hearts, catchy music, and internet memes to evade detection and appeal to the "TikTok generation." This content often promotes a "victimhood-revenge" narrative rather than complex theology, specifically designed to radicalize minors. === Justpaste.it === Justpaste.it, an anonymous photo and text sharing website, has also been utilized heavily. With the option to lock images, the website allows anonymous

    Read more →
  • HTTP cookie

    HTTP cookie

    An HTTP cookie (also called web cookie, Internet cookie, browser cookie, or simply cookie) is a small block of data created by a web server while a user is browsing a website and placed on the user's computer or other device by the user's web browser. Cookies are placed on the device used to access a website, and more than one cookie may be placed on a user's device during a session. Cookies serve useful and sometimes essential functions on the web. They enable web servers to store stateful information (such as items added in the shopping cart in an online store) on the user's device or to track the user's browsing activity (including clicking particular buttons, logging in, or recording which pages were visited in the past). They can also be used to save information that the user previously entered into form fields, such as names, addresses, passwords, and payment card numbers for subsequent use. Authentication cookies are commonly used by web servers to authenticate that a user is logged in, and with which account they are logged in. Without the cookie, users would need to authenticate themselves by logging in on each page containing sensitive information that they wish to access. The security of an authentication cookie generally depends on the security of the issuing website and the user's web browser, and on whether the cookie data is encrypted. Security vulnerabilities may allow a cookie's data to be read by an attacker, used to gain access to user data, or used to gain access (with the user's credentials) to the website to which the cookie belongs (see cross-site scripting and cross-site request forgery for examples). Tracking cookies, and especially third-party tracking cookies, are commonly used as ways to compile long-term records of individuals' browsing histories — a potential privacy concern that prompted European and U.S. lawmakers to take action in 2011. 
European law requires that all websites targeting European Union member states gain "informed consent" from users before storing non-essential cookies on their device. == Background == === Origin of the name === The term cookie was coined by web-browser programmer Lou Montulli. It was derived from the term magic cookie, which is a packet of data a program receives and sends back unchanged, used by Unix programmers. === History === Magic cookies were already used in computing when computer programmer Lou Montulli had the idea of using them in web communications in June 1994. At the time, he was an employee of Netscape Communications, which was developing an e-commerce application for MCI. Vint Cerf and John Klensin represented MCI in technical discussions with Netscape Communications. MCI did not want its servers to have to retain partial transaction states, which led them to ask Netscape to find a way to store that state in each user's computer instead. Cookies provided a solution to the problem of reliably implementing a virtual shopping cart. Together with John Giannandrea, Montulli wrote the initial Netscape cookie specification the same year. Version 0.9beta of Mosaic Netscape, released on 13 October 1994, supported cookies. The first use of cookies (out of the labs) was checking whether visitors to the Netscape website had already visited the site. Montulli applied for a patent for the cookie technology in 1995, which was granted in 1998. Support for cookies was integrated with Internet Explorer in version 2, released in October 1995. The introduction of cookies was not widely known to the public at the time. In particular, cookies were accepted by default, and users were not notified of their presence. The public learned about cookies after the Financial Times published an article about them on 12 February 1996. In the same year, cookies received a lot of media attention, especially because of potential privacy implications. Cookies were discussed in two U.S. 
Federal Trade Commission hearings in 1996 and 1997. The development of the formal cookie specifications was already ongoing. In particular, the first discussions about a formal specification started in April 1995 on the www-talk mailing list. A special working group within the Internet Engineering Task Force (IETF) was formed. Two alternative proposals for introducing state in HTTP transactions had been proposed by Brian Behlendorf and David Kristol respectively. But the group, headed by Kristol himself and Lou Montulli, soon decided to use the Netscape specification as a starting point. In February 1996, the working group identified third-party cookies as a considerable privacy threat. The specification produced by the group was eventually published as RFC 2109 in February 1997. It specifies that third-party cookies were either not allowed at all, or at least not enabled by default. At this time, advertising companies were already using third-party cookies. The recommendation about third-party cookies of RFC 2109 was not followed by Netscape and Internet Explorer. RFC 2109 was superseded by RFC 2965 in October 2000. RFC 2965 added a Set-Cookie2 header field, which informally came to be called "RFC 2965-style cookies" as opposed to the original Set-Cookie header field which was called "Netscape-style cookies". Set-Cookie2 was seldom used, however, and was deprecated in RFC 6265 in April 2011 which was written as a definitive specification for cookies as used in the real world. No modern browser recognizes the Set-Cookie2 header field. == Terminology == === Session cookie === A session cookie (also known as an in-memory cookie, transient cookie or non-persistent cookie) exists only in temporary memory while the user navigates a website. Session cookies expire or are deleted when the user closes the web browser. Session cookies are identified by the browser by the absence of an expiration date assigned to them. 
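
The distinction drawn above, that a cookie with no expiration is treated as a session cookie, can be sketched with Python's standard http.cookies module. The helper names cookie_header and is_session_cookie and the cookie name cart_id are hypothetical, and Max-Age is used here as one way of giving a cookie a lifetime; this is an illustration, not a description of how any particular browser decides.

```python
from http.cookies import SimpleCookie
from typing import Optional


def cookie_header(name: str, value: str, max_age: Optional[int] = None) -> str:
    """Build a Set-Cookie header line; omit max_age to get a session cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    if max_age is not None:
        # Lifetime in seconds; its presence makes the cookie outlive the session.
        cookie[name]["max-age"] = max_age
    return cookie.output()


def is_session_cookie(header: str) -> bool:
    """Treat a cookie as a session cookie when it carries no expiry attribute."""
    lowered = header.lower()
    return "max-age=" not in lowered and "expires=" not in lowered


if __name__ == "__main__":
    print(cookie_header("cart_id", "42"))                # no expiry attribute
    print(cookie_header("cart_id", "42", max_age=3600))  # kept for one hour
```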
=== Persistent cookie === A persistent cookie expires at a specific date or after a specific length of time. For the persistent cookie's lifespan set by its creator, its information will be transmitted to the server every time the user visits the website that it belongs to, or every time the user views a resource belonging to that website from another website (such as an advertisement). For this reason, persistent cookies are sometimes referred to as tracking cookies because they can be used by advertisers to record information about a user's web browsing habits over an extended period of time. Persistent cookies are also used for reasons such as keeping users logged into their accounts on websites, to avoid re-entering login credentials at every visit. (See § Uses, below.) === Secure cookie === A secure cookie can only be transmitted over an encrypted connection (i.e. HTTPS). They cannot be transmitted over unencrypted connections (i.e. HTTP). This makes the cookie less likely to be exposed to cookie theft via eavesdropping. A cookie is made secure by adding the Secure flag to the cookie. === Http-only cookie === An http-only cookie cannot be accessed by client-side APIs, such as JavaScript. This restriction eliminates the threat of cookie theft via cross-site scripting (XSS). However, the cookie remains vulnerable to cross-site tracing (XST) and cross-site request forgery (CSRF) attacks. A cookie is given this characteristic by adding the HttpOnly flag to the cookie. === Same-site cookie === In 2016 Google Chrome version 51 introduced a new kind of cookie with attribute SameSite with possible values of Strict, Lax or None. With attribute SameSite=Strict, the browsers would only send cookies to a target domain that is the same as the origin domain. This would effectively mitigate cross-site request forgery (CSRF) attacks. 
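
The Secure, HttpOnly, and SameSite attributes described above appear as semicolon-separated parts of a Set-Cookie value. A minimal sketch, assuming a hypothetical helper missing_protections and made-up cookie values, that reports which of the three attributes a cookie lacks:

```python
def missing_protections(set_cookie_value: str) -> list[str]:
    """Return the names of the protective attributes absent from a cookie."""
    parts = [part.strip() for part in set_cookie_value.split(";")]
    attributes = {}
    for part in parts[1:]:  # parts[0] is the name=value pair itself
        name, _, value = part.partition("=")
        attributes[name.strip().lower()] = value.strip()
    missing = []
    if "secure" not in attributes:
        missing.append("Secure")
    if "httponly" not in attributes:
        missing.append("HttpOnly")
    if "samesite" not in attributes:
        missing.append("SameSite")
    return missing


if __name__ == "__main__":
    print(missing_protections("session_id=abc123; Secure; HttpOnly; SameSite=Strict"))
    print(missing_protections("session_id=abc123; HttpOnly"))
```

The check only looks for the presence of each attribute; whether a given SameSite value suits a site is a policy decision this sketch does not attempt to make.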
With SameSite=Lax, browsers would send cookies with requests to a target domain even if it is different from the origin domain, but only for safe requests such as GET (POST is unsafe) and not for third-party cookies (inside an iframe). The attribute SameSite=None would allow third-party (cross-site) cookies; however, most browsers require the Secure attribute on SameSite=None cookies. The Same-site cookie is incorporated into a new RFC draft for "Cookies: HTTP State Management Mechanism" to update RFC 6265 (if approved). Chrome, Firefox, and Edge started to support Same-site cookies. The key issue in the rollout is the treatment of existing cookies without the SameSite attribute defined: Chrome had been treating those cookies as if SameSite=None, which let all websites and applications run as before. Google intended to change that default to SameSite=Lax in Chrome 80, planned for release in February 2020, but due to the potential for breakage of applications and websites that rely on third-party/cross-site cookies, and to COVID-19 circumstances, Google postponed this change to Chrome 84. === Supercookie === A supercookie is a cookie with an origin of a top-level domain (such as .com) or a public suffix (such as .co.uk). Ordinary cookies, by contrast, have an origin of a specific domain name, such as ex

    Read more →
  • Language model benchmark

    Language model benchmark

    A language model benchmark is a standardized test designed to evaluate the performance of language models on various natural language processing tasks. These tests are intended for comparing different models' capabilities in areas such as language understanding, generation, and reasoning. Benchmarks generally consist of a dataset and corresponding evaluation metrics. The dataset provides text samples and annotations, while the metrics measure a model's performance on tasks like answering questions, text classification, and machine translation. These benchmarks are developed and maintained by academic institutions, research organizations, and industry players to track progress in the field. In addition to accuracy, the metrics can include throughput, energy efficiency, bias, trust, and sustainability. == Overview == === Types === Benchmarks may be described by the following adjectives, not mutually exclusive: Classical: These tasks are studied in natural language processing, even before the advent of deep learning. Examples include the Penn Treebank for testing syntactic and semantic parsing, as well as bilingual translation benchmarked by BLEU scores. Question answering: These tasks have a text question and a text answer, often multiple-choice. They can be open-book or closed-book. Open-book QA resembles reading comprehension questions, with relevant passages included as annotation in the question, in which the answer appears. Closed-book QA includes no relevant passages. Closed-book QA is also called open-domain question-answering. Before the era of large language models, open-book QA was more common, and understood as testing information retrieval methods. Closed-book QA became common since GPT-2 as a method to measure knowledge stored within model parameters. Omnibus: An omnibus benchmark combines many benchmarks, often previously published. It is intended as an all-in-one benchmarking solution. 
Reasoning: These tasks are usually in the question-answering format, but are intended to be more difficult than standard question answering. Multimodal: These tasks require processing not only text, but also other modalities, such as images and sound. Examples include OCR and transcription. Agency: These tasks are for a language-model–based software agent that operates a computer for a user, such as editing images, browsing the web, etc. Adversarial: A benchmark is "adversarial" if the items in the benchmark are picked specifically so that certain models do badly on them. Adversarial benchmarks are often constructed after state of the art (SOTA) models have saturated (achieved 100% performance) a benchmark, to renew the benchmark. A benchmark is "adversarial" only at a certain moment in time, since what is adversarial may cease to be adversarial as newer SOTA models appear. Public/Private: A benchmark might be partly or entirely private, meaning that some or all of the questions are not publicly available. The idea is that if a question is publicly available, then it might be used for training, which would be "training on the test set" and invalidate the result of the benchmark. Usually, only the guardians of the benchmark have access to the private subsets, and to score a model on such a benchmark, one must send the model weights, or provide API access, to the guardians. The boundary between a benchmark and a dataset is not sharp. Generally, a dataset contains three "splits": training, test, and validation. Both the test and validation splits are essentially benchmarks. In general, a benchmark is distinguished from a test/validation dataset in that a benchmark is typically intended to be used to measure the performance of many different models that are not trained specifically for doing well on the benchmark, while a test/validation set is intended to be used to measure the performance of models trained specifically on the corresponding training set. 
In other words, a benchmark may be thought of as a test/validation set without a corresponding training set. Conversely, certain benchmarks may be used as a training set, such as the English Gigaword or the One Billion Word Benchmark, which in modern language is just the negative log-likelihood loss on a pretraining set with 1 billion words. Indeed, the distinction between benchmark and dataset in language models became sharper after the rise of the pretraining paradigm, whereby a model is first trained on massive, unlabeled datasets to learn general language patterns, syntax, and knowledge (pretraining), and the base model is then adapted to specific, downstream tasks using smaller, labeled datasets (fine-tuning). === Lifecycle === Generally, the life cycle of a benchmark consists of the following steps: Inception: A benchmark is published. It can be simply given as a demonstration of the power of a new model (implicitly) that others then picked up as a benchmark, or as a benchmark that others are encouraged to use (explicitly). Growth: More papers and models use the benchmark, and the performance on the benchmark grows. Maturity, degeneration or deprecation: A benchmark may be saturated, after which researchers move on to other benchmarks. Progress on the benchmark may also be neglected as the field moves to focus on other benchmarks. Renewal: A saturated benchmark can be upgraded to make it no longer saturated, allowing further progress. === Construction === Like datasets, benchmarks are typically constructed by several methods, individually or in combination: Web scraping: Ready-made question-answer pairs may be scraped online, such as from websites that teach mathematics and programming. Conversion: Items may be constructed programmatically from scraped web content, such as by blanking out named entities from sentences, and asking the model to fill in the blank. This was used for making the CNN/Daily Mail Reading Comprehension Task. 
Crowd sourcing: Items may be constructed by paying people to write them, such as on Amazon Mechanical Turk. This was used for making the MCTest. === Evaluation === Generally, benchmarks are fully automated. This limits the questions that can be asked. For example, with mathematical questions, "proving a claim" would be difficult to automatically check, while "calculate an answer with a unique integer answer" would be automatically checkable. With programming tasks, the answer can generally be checked by running unit tests, with an upper limit on runtime. The benchmark scores are of the following kinds: For multiple choice or cloze questions, common scores are accuracy (frequency of correct answer), precision, recall, F1 score, etc. pass@n: The model is given $n$ attempts to solve each problem. If any attempt is correct, the model earns a point. The pass@n score is the model's average score over all problems. k@n: The model makes $n$ attempts to solve each problem, but only $k$ of them are selected for submission. If any submission is correct, the model earns a point. The k@n score is the model's average score over all problems. cons@n: The model is given $n$ attempts to solve each problem. If the most common answer is correct, the model earns a point. The cons@n score is the model's average score over all problems. Here "cons" stands for "consensus" or "majority voting". The pass@n score can be estimated more accurately by making $N > n$ attempts and using the unbiased estimator $1 - \frac{\binom{N-c}{n}}{\binom{N}{n}}$, where $c$ is the number of correct attempts. For less well-formed tasks, where the output can be any sentence, commonly used scores include BLEU, ROUGE, METEOR, NIST, word error rate, LEPOR, CIDEr, and SPICE. === Issues === error: Some benchmark answers may be wrong.
ambiguity: Some benchmark questions may be ambiguously worded. subjective: Some benchmark questions may not have an objective answer at all. This problem generally prevents creative writing benchmarks. Similarly, this prevents benchmarking writing proofs in natural language, though benchmarking proofs in a formal language is possible. open-ended: Some benchmark questions may not have a single answer of a fixed size. This problem generally prevents programming benchmarks from using more natural tasks such as "write a program for X"; instead, they use tasks such as "write a function that implements specification X". inter-annotator agreement: Some benchmark questions may not be fully objective, such that even people would not agree 100% on what the answer should be. This is common in natural language processing tasks, such as syntactic annotation. shortcut: Some benchmark questions may be easily solved by an "unintended" shortcut. For example, in the SNLI benchmark, having a negative word like "not" in the second sentence is a strong signal for the "Contradiction" category, regardless of what the se
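
The pass@n estimator and the cons@n score described under Evaluation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation from any benchmark suite: the function names `pass_at_n` and `cons_at_n` and their parameters are hypothetical, and answers are assumed to be plain strings compared by exact match.

```python
from collections import Counter
from math import comb


def pass_at_n(num_attempts: int, num_correct: int, n: int) -> float:
    """Unbiased pass@n estimate for one problem.

    num_attempts is N (samples drawn), num_correct is c (samples that
    passed), and n is the attempt budget being scored, with n <= N.
    """
    if not 0 <= num_correct <= num_attempts:
        raise ValueError("need 0 <= c <= N")
    if not 1 <= n <= num_attempts:
        raise ValueError("need 1 <= n <= N")
    # 1 - C(N - c, n) / C(N, n). math.comb returns 0 when n > N - c,
    # so the estimate is exactly 1.0 once every size-n subset must
    # contain at least one correct attempt.
    return 1.0 - comb(num_attempts - num_correct, n) / comb(num_attempts, n)


def cons_at_n(answers: list[str], reference: str) -> int:
    """Score 1 if the most common answer matches the reference, else 0.

    Ties are broken by first occurrence, which is an assumption of this
    sketch rather than a rule taken from the text above.
    """
    if not answers:
        raise ValueError("need at least one answer")
    most_common_answer, _ = Counter(answers).most_common(1)[0]
    return int(most_common_answer == reference)


if __name__ == "__main__":
    # 10 samples, 3 correct: estimated chance that 1 random attempt passes.
    print(pass_at_n(num_attempts=10, num_correct=3, n=1))
    # Majority vote over three sampled answers.
    print(cons_at_n(["4", "4", "5"], reference="4"))
```

The benchmark-level score would then be the average of these per-problem values over all problems.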

  • Problematic social media use

    Problematic social media use

    Excessive use of social media can lead to problems including impaired functioning and a reduction in overall wellbeing, for both users and those around them. Such usage is associated with a risk of mental health problems, sleep problems, academic struggles, and daytime fatigue. Psychological or behavioural dependence on social media platforms can have significant negative effects on people's daily lives. The risk of problems is also related to the type of social media platform or online community being used. People of different ages and genders may be affected in different ways by problematic social media use. == Signs and symptoms == Signs of social media addiction or excessive use of social media include many behaviours similar to substance use disorders, including mood modification, salience, tolerance, stress withdrawal symptoms, psychological distress, anxiety and depression, conflict, relapse, and low self-esteem. People with problematic social media habits are at risk of being addicted and may require more time on social media as time passes. Frequent social media use may also be associated with self-reported symptoms of attention deficit hyperactivity disorder. Social anxiety (or fear of missing out) is another potential symptom. Social anxiety is defined as having intense anxiety or fear of being judged, negatively evaluated, or rejected in a social or performance situation. The fear of missing out can contribute to excessive usage, as people check social media constantly throughout the day to see what others are doing instead of doing other activities. Common signs include displacement, or replacing other meaningful activities with social media, and loneliness. == Causes and mechanisms == There are many theories for the mechanism or cause behind a person having problematic social media use. 
The transition from normal to problematic social media use occurs when a person relies on it to relieve stress, loneliness, or depression, or to provide continuous rewards. Cognitive-behavioral model – People increase their use of social media when they are in unfamiliar environments or awkward situations; Social skill model – People pull out their phones and use social media when they prefer virtual communication as opposed to face-to-face interactions because they lack self-presentation skills; Socio-cognitive model – People use social media because they love the feeling of others liking and commenting on their photos and tagging them in pictures. They are attracted to the positive outcomes they receive on social media. There are parallels to the gambling industry inherent to the design of various social media sites, with "'ludic loops' or repeated cycles of uncertainty, anticipation and feedback" potentially contributing to problematic social media use. Another factor directly facilitating the development of addiction to social media is the implicit attitude toward the IT artifact. Social media use may also stimulate the reward pathway in the brain. There is also a theory that social media addiction fulfills basic evolutionary drives in the wake of mass urbanization worldwide. The basic psychological needs of "secure, predictable community life that evolved over millions of years" remain unchanged, leading some to find online communities to cope with the new individualized way of life in some modern societies. The "Evolutionary Mismatch" hypothesis holds that modern digital platforms amplify social competition and comparison in ways our ancestors never faced, possibly triggering maladaptive patterns such as anxiety, depression, or compulsive use. Similarly, some scholars compare social media to "junk food". The approach taken to develop social media platforms may contribute to problematic social media use. 
Design features that have been linked to problematic use include the ability to scroll and stream content endlessly, and the way app developers distort time by affecting the 'flow' of content when scrolling, potentially resulting in the Zeigarnik effect (the human brain will continue to pursue an unfinished task until it reaches a satisfying closure); autoplay modes; the personalized nature of the content, which results in emotional attachment (the user values the content above its actual value, which is referred to as the endowment effect); and the exposure effect (repeated exposure to a distinct stimulus can condition the user into an enhanced or improved attitude toward it). The interactive nature of the platforms, including the ability to "like" content, has also been linked. Even though social media can satisfy personal communication needs, those who use it at higher rates are shown to have higher levels of psychological distress. == Diagnosis == While there is no official diagnostic term or measurement, problematic social media use is conceptualized as a non-substance-related disorder, resulting in preoccupation and compulsion to engage excessively in social media platforms despite negative consequences. No diagnosis exists for problematic social media use in either the ICD-11 or DSM-5. Excessive use of an activity, like social media, does not directly equate with addiction. There are other factors that could lead to someone's social media addiction, including personality traits and pre-existing tendencies. While the extent of social media use and addiction are positively correlated, it is erroneous to employ use (the degree to which one makes use of the site's features, the effort exerted during use sessions, access frequency, etc.) as a proxy for addiction. Indicators of a potential dependence on social media include: Mood swings: a person uses social media to regulate his or her mood, or as a means of escaping real world conflicts. Relevance: social media starts to dominate a person's thoughts at the expense of other activities. 
Salience: social media becomes the most important part of someone's life. Tolerance: a person increases their time spent on social media to experience previously associated feelings they had while using social media. Withdrawal: when a person cannot access social media, their sleeping or eating habits may change, or signs of depression or anxiety may appear. Conflicts in real life: when social media is used excessively, it can affect real-life relationships with family and friends. Relapse: the tendency for previously affected individuals to revert to previous patterns of excessive social media use. Several scales have been developed and validated that help to understand the issues regarding problematic social media use. No single scale is used by all researchers. == Treatment == Screen time recommendations for children and families have been developed by the American Academy of Pediatrics. Published therapeutic interventions include: Self-help interventions, including application-specific timers; Cognitive behavioural therapy; and Organisational and schooling support. Medications have not been shown to be effective in randomized, controlled trials for the related conditions of Internet addiction disorder or gaming disorder. == Prevention == Prevention approaches include screen time monitoring apps and other tech-based approaches to improve efficiency and decrease screen time, as well as tools to help with addiction to online platform products. Parents' methods for monitoring, regulating, and understanding their children's social media use are referred to as parental mediation. Parental mediation strategies include active, restrictive, and co-using methods. Active mediation involves direct parent-child conversations that are intended to educate children on social media norms and safety, as well as the variety and purposes of online content. 
Restrictive mediation entails the implementation of rules, expectations, and limitations regarding children's social media use and interactions. Co-use is when parents jointly use social media alongside their children, and is most effective when parents are actively participating (for example, by asking questions or making inquisitive or supportive comments) rather than being passive about it. Active mediation is the most common strategy used by parents, though the key to success for any mediation strategy is consistency and reliability. When parents reinforce rules inconsistently, have no mediation strategy, or use highly restrictive strategies for monitoring their children's social media use, there is an observable increase in children's aggressive behaviours. When parents openly express that they are supportive of their child's autonomy and provide clear, consistent rules for media use, problematic usage and aggression decrease. Knowing that consistent, autonomy-supportive mediation has more positive outcomes than inconsistent, controlling mediation, parents can consciously foster more direct, involved, and genuine dialogue with their children. This can help prevent or reduce problematic social media use in children and teenagers. == Outcomes == === Adolescents and teens === Increased social medi

  • Attention inequality

    Attention inequality

    Attention inequality is the unequal distribution of attention across users on social networks, among people in general, and among scientific papers. The Yun Family Foundation introduced the "Attention Inequality Coefficient" as a measure of inequality in attention and justifies it by its close interconnection with wealth inequality. == Relationship to economic inequality == Attention inequality is related to economic inequality since attention is an economically scarce good. The same measures and concepts as in classical economics can be applied to the attention economy. The relationship also extends beyond the conceptual level: considering the AIDA process, attention is the prerequisite for real monetary income on the Internet. Data from 2018 show a significant relationship between likes and comments on Facebook and donations to non-profit organizations. == Attention economy == The attention economy refers to the practice of maximizing the attention users give to a product for advertising-related reasons. Attention economy remains one of the most common forms of advertising, and has been steadily increasing thanks to new technologies such as television, internet and social media. It is one of the most widely used approaches because of its effectiveness in maximising the noticeability of a product. == Attention inequality in social media == In social media, attention inequality refers to the unequal distribution of users' attention on social media platforms. This means that instead of an equal distribution of attention, a few sources receive a disproportionate share of attention, leaving many unnoticed. This phenomenon is possibly the result of social media algorithms, which are commonly designed to drive maximum engagement. It is a large factor in polarization and the creation of echo chambers. 
Social media algorithms tend to note content that is already performing well and display it to more users, while content that is equally engaging or well-made is not recommended to users. Posts that trigger strong emotions usually out-perform more "uncontroversial" content. When many users interact with a post, it signals to the algorithm that the post drives engagement. The algorithm then tends to recommend that type of content to an exponentially growing number of people, potentially outperforming "un-emotional" content. These factors, when combined, tend to create an unequal social media environment. == Attention inequality in science == According to a 2025 study about research inequality among scientists published in Information Processing and Management, scientific discourse is restricted to a small group of connected scientists, and is frequently not an accurate representation of the whole scientific community. Using citation-network analysis in the fields of nanoscience and chemical physics, the study claims that a group of connected scientists holds significant notability in the scientific community. The calculated connection strength between these scientists is estimated to be about 4.5. The study also says that these authors cite each other four times more often than would be predicted in a random network, whereas ordinary scientists outside of this group only reach an estimated connection strength of 0.9. The study findings suggest that scientific attention is not distributed by merit, but rather by the connectedness of the scientists involved in the research. == Extent == Data from 2008 show that 50% of the attention is concentrated on approximately 0.2% of all hostnames, and 80% on 5% of hostnames. In 2008, the Gini coefficient of attention distribution was over 0.921 for such commercial domain names as ac.jp and 0.985 for .org domains. 
The Gini coefficient was measured on Twitter in 2016 for the number of followers as 0.9412, for the number of mentions as 0.9133, and for the number of retweets as 0.9034. For comparison, the world's income Gini coefficient was 0.68 in 2005 and 0.904 in 2018. More than 96% of all followers, 93% of the retweets, and 93% of all mentions are owned by 20% of Twitter users. == Causes == At least for scientific papers, today's consensus states that inequality cannot be explained by variations in quality and individual talent. The Matthew effect plays a significant role in the emergence of attention inequality: those who already enjoy large amounts of attention get even more attention, and those who do not lose even more. Ranking algorithms based on relevance to the user have been found to alleviate the inequality of the number of posts across topics.
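
To make the Gini figures above more concrete, the following Python sketch computes a Gini coefficient for a list of attention counts using the standard sorted-rank formula. It is only an illustration under stated assumptions: the function name `gini` and the sample follower counts are hypothetical, and the studies cited above may have used a different estimator or correction.

```python
def gini(values: list[float]) -> float:
    """Gini coefficient of non-negative values.

    0 means attention is spread perfectly evenly; values approaching 1
    mean almost all attention goes to a single account.
    """
    if not values:
        raise ValueError("need at least one value")
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    total = sum(values)
    if total == 0:
        return 0.0  # nobody receives attention, so nothing is unequal
    ordered = sorted(values)
    count = len(ordered)
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    # with ranks i = 1..n over the values in ascending order.
    rank_weighted = sum(rank * x for rank, x in enumerate(ordered, start=1))
    return 2 * rank_weighted / (count * total) - (count + 1) / count


if __name__ == "__main__":
    # Hypothetical follower counts: one large account, many small ones.
    followers = [5, 8, 12, 20, 35, 60, 150, 900, 40_000, 2_000_000]
    print(round(gini(followers), 4))
    # Perfectly equal attention gives 0.
    print(gini([100, 100, 100, 100]))
```

With a finite sample of n accounts, this formula tops out at (n − 1)/n rather than exactly 1, which is worth keeping in mind when comparing small samples with the platform-wide figures above.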
