AI Detector Text

AI Detector Text — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Automatic acquisition of sense-tagged corpora

    The knowledge acquisition bottleneck is perhaps the major impediment to solving the word-sense disambiguation (WSD) problem. Unsupervised learning methods rely on knowledge about word senses, which is barely formulated in dictionaries and lexical databases. Supervised learning methods depend heavily on the existence of manually annotated examples for every word sense, a requisite that can so far be met only for a handful of words for testing purposes, as is done in the Senseval exercises.

    == Existing methods ==

    Therefore, one of the most promising trends in WSD research is using the largest corpus ever accessible, the World Wide Web, to acquire lexical information automatically. WSD has traditionally been understood as an intermediate language engineering technology which could improve applications such as information retrieval (IR). In this case, however, the reverse is also true: Web search engines implement simple and robust IR techniques that can be successfully used when mining the Web for information to be employed in WSD. The most direct way of using the Web (and other corpora) to enhance WSD performance is the automatic acquisition of sense-tagged corpora, the fundamental resource to feed supervised WSD algorithms. Although this is far from being commonplace in the WSD literature, a number of different and effective strategies to achieve this goal have already been proposed. Some of these strategies are: acquisition by direct Web searching (searches for monosemous synonyms, hypernyms, hyponyms, parsed gloss words, etc.), the Yarowsky algorithm (bootstrapping), acquisition via Web directories, and acquisition via cross-language meaning evidence.

    == Summary ==

    === Optimistic results ===

    The automatic extraction of examples to train supervised learning algorithms has been, by far, the best explored approach to mining the Web for word-sense disambiguation. Some results are certainly encouraging:

      • In some experiments, the quality of the Web data for WSD equals that of human-tagged examples. This is the case of the monosemous relatives plus bootstrapping with Semcor seeds technique and the examples taken from the ODP Web directories. In the first case, however, Semcor-size example seeds are necessary (and only available for English), and it has only been tested with a very limited set of nouns; in the second case, the coverage is quite limited, and it is not yet clear whether it can be grown without compromising the quality of the examples retrieved.
      • It has been shown that a mainstream supervised learning technique trained exclusively with Web data can obtain better results than all unsupervised WSD systems which participated in Senseval-2.
      • Web examples made a significant contribution to the best Senseval-2 English all-words system.

    === Difficulties ===

    There are, however, several open research issues related to the use of Web examples in WSD:

      • High precision in the retrieved examples (i.e., correct sense assignments for the examples) does not necessarily lead to good supervised WSD results (i.e., the examples are possibly not useful for training).
      • The most complete evaluation of Web examples for supervised WSD indicates that learning with Web data improves over unsupervised techniques, but the results are nevertheless far from those obtained with hand-tagged data, and do not even beat the most-frequent-sense baseline.
      • Results are not always reproducible; the same or similar techniques may lead to different results in different experiments. Compare, for instance, Mihalcea (2002) with Agirre and Martínez (2004), or Agirre and Martínez (2000) with Mihalcea and Moldovan (1999). Results with Web data seem to be very sensitive to small differences in the learning algorithm, to when the corpus was extracted (search engines change continuously), and to small heuristic issues (e.g., differences in filters to discard part of the retrieved examples).
      • Results are strongly dependent on bias (i.e., on the relative frequencies of examples per word sense). It is unclear whether this is simply a problem of Web data, or an intrinsic problem of supervised learning techniques, or just a problem of how WSD systems are evaluated (indeed, testing with rather small Senseval data may overemphasize sense distributions compared to sense distributions obtained from the full Web as corpus).
      • In any case, Web data has an intrinsic bias, because queries to search engines directly constrain the context of the examples retrieved. There are approaches that alleviate this problem, such as using several different seeds/queries per sense or assigning senses to Web directories and then scanning directories for examples; but this problem is nevertheless far from being solved.
      • Once a Web corpus of examples is built, it is not entirely clear whether its distribution is safe from a legal perspective.

    === Future ===

    Besides automatic acquisition of examples from the Web, there are some other WSD experiments that have profited from the Web:

      • The Web as a social network has been successfully used for cooperative annotation of a corpus (OMWE, Open Mind Word Expert project), which has already been used in three Senseval-3 tasks (English, Romanian and Multilingual).
      • The Web has been used to enrich WordNet senses with domain information: topic signatures and Web directories, which have in turn been successfully used for WSD.
      • Also, some research has benefited from the semantic information that Wikipedia maintains on its disambiguation pages.

    It is clear, however, that most research opportunities remain largely unexplored. For instance, little is known about how to use lexical information extracted from the Web in knowledge-based WSD systems; and it is also hard to find systems that use Web-mined parallel corpora for WSD, even though there are already efficient algorithms that use parallel corpora in WSD.
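
The "monosemous relatives" strategy mentioned above can be illustrated with a small sketch. Everything below is a hypothetical toy, not the method of any of the cited papers: the sense inventory, the made-up sense counts, the in-memory corpus standing in for a Web search engine, and the step of substituting the target word back into each snippet are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical toy sense inventory for the ambiguous word "church".
# Each relative is mapped to an invented number of senses; a real system
# would read this information from a lexical database.
SENSE_RELATIVES = {
    "church#building": {"chapel": 1, "house of worship": 1, "place": 16},
    "church#service": {"religious service": 1, "service": 15},
}

# Stand-in for the Web: a real system would query a search engine instead.
TOY_CORPUS = [
    "they repaired the roof of the chapel last summer",
    "the religious service starts at ten",
    "a small chapel stands on the hill",
    "this place is crowded",
]


def toy_search(query):
    """Return every toy document that contains the query string."""
    return [doc for doc in TOY_CORPUS if query in doc]


def monosemous_relatives(relatives):
    """Keep only relatives with exactly one sense; they are unambiguous seeds."""
    return sorted(word for word, n_senses in relatives.items() if n_senses == 1)


def acquire_examples(target, sense_relatives, search):
    """Search for each monosemous relative, put the target word in its place
    (an assumption of this sketch), and tag the snippet with the sense that
    the relative belongs to."""
    tagged = []
    for sense, relatives in sense_relatives.items():
        for relative in monosemous_relatives(relatives):
            for snippet in search(relative):
                tagged.append((snippet.replace(relative, target), sense))
    return tagged


def sense_distribution(tagged):
    """Count examples per sense, the 'bias' that the text says results depend on."""
    return dict(Counter(sense for _, sense in tagged))


examples = acquire_examples("church", SENSE_RELATIVES, toy_search)
for text, sense in examples:
    print(sense, "|", text)
print(sense_distribution(examples))
```

Even in this toy, the acquired corpus ends up with a sense distribution determined by the queries rather than by real usage, which is the intrinsic bias discussed under Difficulties.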

  • Hike Messenger

    Hike Messenger, also known as Hike Sticker Chat, is a multifunctional Indian social media and social networking service offering instant messaging (IM) and Voice over IP (VoIP) services that was launched on December 11, 2012, by Kavin Bharti Mittal. Hike functioned through SMS. The app registration used a standard, one-time password (OTP) based authentication process. It was estimated to be worth $1.4 billion and had more than 100 million registered users. It went defunct on January 6, 2021, as it was unable to compete with global messaging platforms. The app reappeared on the Google Play Store and Apple App Store on 19 September 2025.

    == History ==

    Hike Messenger was launched on December 12, 2012, by its founder, Kavin Bharti Mittal. The majority of users were from India, with 80% under the age of 25. The company purchased startups like TinyMogul and Hoppr in 2015. After buying US-based free voice calling company Zip Phones, Hike provided VoIP calling services. On March 5, 2015, Hike launched the 'Great Indian Sticker Challenge' to create more stickers. In February 2017, Hike acquired the social networking app Pulse. From version 5.0, it became the first social messaging app to start a mobile payment service in India. The timeline feature came back after multiple user requests, alongside the introduction of a personalized digital envelope called Blue Packets for sending monetary gifts through a built-in wallet. In 2017, the acquisition of Bengaluru-based startup Creo was announced to enable third-party developers to build services on top of the Hike platform. In 2018, Hike provided 1 billion users with internet access by targeting smaller cities. In January 2019, the company discarded the previous super-app approach and began launching specialized apps for specific use cases. In May 2019, Hike announced a collaboration with Indraprastha Institute of Information Technology, Delhi (IIIT-D) to develop a variety of machine learning models.

    In April 2019, the company launched its first standalone app, Hike Sticker Chat. A separate content app, Hike News & Content, was also launched. In 2021, Hike shut down its messaging service and shifted focus to gaming and community platforms. It launched Rush, a real-money gaming app featuring casual titles like ludo and carrom, which scaled to over 10 million users and generated more than US$500 million in gross revenue over four years. The company also introduced Vibe, an approval-only community app, as part of its pivot away from the super-app and messaging model. In September 2025, following the passage of the Promotion and Regulation of Online Gaming Act, which banned real-money gaming in India, Hike announced its complete closure. Founder Kavin Bharti Mittal stated that while the company had begun international expansion, scaling globally under the new regulatory regime would require a full reset that was not a viable use of capital or resources. On 19 September 2025, Hike was relaunched on the Play Store and App Store under the name Hike Messenger.

    == Application ==

    === Timeline of Features ===

    On 15 April 2014, Hike introduced unlimited free SMS via a service called Hike Offline, through credits earned by users from regular chatting, as connectivity was still a major issue in many parts of India. In an attempt to appeal to its younger users, Hike introduced features that resonated with the local market, such as Last Seen Privacy and localized sticker packs. It also introduced a two-way chat theme, allowing users to change the chat background for themselves and for their friends simultaneously. The app also started showing live cricket scores in collaboration with Cricbuzz, as well as news, casual games, and social media feeds. Hike also added a file transfer service, allowing files of all formats smaller than 100 MB, with a view to further increasing the size limit to 1 GB.

    With the launch of version 2.9.2.0 in January 2015, Hike implemented support for sending uncompressed images and a "quick upload" feature optimized for 2G speed. Later that month, Hike introduced a voice calling feature for its users. In September 2015, Hike launched free group call support with up to 100 people in a simultaneous conference call environment. In November 2016, Hike announced the launch of a feature called Stories, which allows people to share real-life moments using fun live filters and automatically deletes them after 48 hours, along with a new camera design with localized filters.

    Hike 4.0 launched on 26 August 2015 with the tagline 'Got a Gang? Get on Hike'. Hike 4.0 was an optimization-focused update, increasing the performance of the app on poor networks. It supported photo filters, doodles, and bite-sized news updates in under 100 characters. Hike launched News Feed with Hindi language support on 29 September 2015 to cater for the needs of the non-English population. In December 2015, Hike launched version 3.5, its biggest update for Windows Phone 8.1, which changed the user interface for simpler navigation, supported sending unlimited non-media files and documents of any format, and improved group admin settings. It also included ten brand new chat themes.

    On 8 May 2016, Hike launched a microapp feature that was live for two days as a Mother's Day special, in which users could share images, quotes or messages as a token of love with customized e-cards and stickers on their timeline, not only on Hike but also on other platforms. On 26 October 2016, Hike Messenger rolled out the beta version of a video calling feature, ahead of WhatsApp, starting with Android users. The feature also let recipients preview a video call before deciding to take it and was optimized to work even under 2G conditions.

    On 24 December 2016, Hike rolled out a short 20-second Video Stories feature, developed in collaboration with content creators. Video stories could be shared directly with friends or posted on a public timeline with different filters, with the same 48-hour time limit before being automatically deleted. The Stories feature continued to receive updates adding content, a public story option, private user messaging and geo-tagging.

    In September 2017, Hike launched personalized sticker packs with 20,000+ graphical stickers for over 500 colleges, expanding to around 1,000 colleges across India by December 2018. The stickers can be used across different geographies and are highly customized for users, with availability in 40+ local languages. The app also supports automatic sticker suggestions, in which the application suggests the best reply for any sticker message, and allows users to "nudge", a feature used to ping the receiver. Hike started supporting user comments on friends' posts, and added a specific message reply function, a redesigned camera interface to support front flash, and user mentions with the @ symbol. In December 2017, Hike launched group voting, bill splitting, checklists and event reminders for group chats, which support up to 1,000 users on both iOS and Android.

    Hike announced another feature called Hike Land, a virtual world built inside the Hike Sticker Chat application, with a beta trial to start from March 2020, in which online users can hang out with other users through their Hike Moji digital avatars. It is mainly targeted at, but not restricted to, people aged 16 to 21. Without unveiling much about Hike Land, the company created a separate website with an option to reserve spots by giving details such as name, gender and phone number, which links the reservation to the user's Hike Sticker Chat profile, though having an account is not required.

    ==== Hike Direct ====

    The Hike Direct feature, introduced to users by October 2015, is based on the technology known as WiFi Direct, initially also called WiFi P2P. It enables sharing of files such as music, apps and videos without a live internet connection within a 100-meter radius by creating a wireless network between two or more devices, with a transfer speed of 100 MB per minute. For privacy and security reasons, Hike did not show the recipient's location or proximity, and the feature worked only when two users were connected in the same room and had added one another to their contact lists.

    ==== Hike Wallet ====

    In June 2017, Hike announced the launch of version 5.0 with multiple new features like User Chat Themes, Night Mode and Magic Selfie, along with a built-in Wallet partnered with Yes Bank. This feature was first rolled out to Android users, followed by iOS users at a later stage. By November 2017, Hike had partnered with Airtel Payments Bank to power its digital payment wallet, giving Hike users access to Airtel Payments Bank's merchant and utility payment services and know your customer (KYC) infrastructure, with 5 million transactions coming from services like recharge and P2P. Hike formed a partnership with Ola Cabs to bring a taxi and auto-rickshaw booking facility from 14 February 2018. With Hike Wallet facility users could now book bus tickets with 3

  • Social media use in education

    Social media in education is the use of social media to enhance education. Social media are "a group of Internet-based applications...that allow the creation and exchange of user-generated content". It is also known as the read/write web. As time went on and technology evolved, social media has become an integral part of people's lives, including those of students, scholars, and teachers. However, social media are controversial because, in addition to providing new means of connection, critics claim that they damage self-esteem, shorten attention spans, and increase mental health issues.

    A 2016 dissertation presented surveys that focused on the impact of social media. It reported that 54.6% of students believed that social media affected their studies positively (38% agree, 16.6% strongly agree). About 40% disagreed, and 4.7% of students strongly disagreed. 53% of female students reported that social media negatively impacted their studies. Among male students, 40% agreed that social media had a negative impact on studies, while 59% disagreed.

    A 2023 article examined the brain's reward system in response to social media, comparing the social rewards the brain normally processes with those from social media. Between ages 10 and 12, when most children receive a cell phone, social rewards start to feel more satisfying. Approaching adulthood, people become less reliant on feedback from peers for social rewards. This corresponds to a more mature prefrontal cortex, which enables better management of emotional reactions to these social rewards, meaning a more balanced and controlled reaction.

    == History ==

    A survey from Cambridge International of nearly 20,000 teachers and students (ages 12–19) from 100 countries found that 48% of students use a desktop computer in class, 42% use phones, 33% use interactive whiteboards and 20% use tablets. Desktop computers are used more than tablets. Teachers were abandoning the "no phones at school" rule. A 2024 research survey through Common Sense Education reported that for 54% of those aged 8-12 and 69% of those aged 13-18, social media is an extensive distraction from homework.

    === United States ===

    The long-running technology boom accelerated after the millennium. As of 2018, 95% of US teenage students had access to a smartphone and 45% said they were online almost constantly. In the early days of social media, access to technology was a significant issue, as many students did not own compatible devices and school budgets were often insufficient to purchase devices for student use. In 2011, despite backlash, Missouri passed a law that prohibited teachers from communicating privately with students over social media. Supporters were concerned that online communication between underage students and faculty could lead to inappropriate relationships. Some schools adopted a "Bring Your Own Device" (BYOD) policy, allowing students to bring Internet-accessing devices, such as phones or tablets, to class. During the pandemic, the federal government offered funds that allowed more schools to purchase devices. Over time, more students acquired phones with social media access. Personal devices increased student satisfaction, but reduced teachers' ability to control device use in their classrooms. A 2018 Pew Research study reported that 95% of teenagers had a phone and used social media consistently.

    === Canada ===

    The Peel District School Board (PDSB) in Ontario accepted the use of social media in the classroom. In 2013, the PDSB introduced BYOD and unblocked many social media sites. That was later replaced by a policy that dealt specifically with social media.

    == Uses ==

    === Classroom ===

    In the classroom, social media offers a way to systematically distribute and gather information from students. Teachers can supply documents and audio/video media to students for immediate or later use. One study on higher education reported that devices and social media:

      • created opportunities for interaction
      • provided occasions for collaboration
      • sped up information access
      • offered more ways to learn
      • situated learning

    Frustrations included anti-technology instructors, device challenges, and devices as a distraction. Social media in classrooms can have a negative effect. A Yale University publication reported that students who used laptops in class for non-academic reasons had poorer performance. Students spent most of their time on social media, shopping, and other personal activities. Social media has helped many educators mentor their students more effectively.

    === Outside of class ===

    Social media offer a venue for video calls, stories, feeds, and game playing that can enhance the learning process. Teachers can utilize social media to communicate with their students. Social media can provide students with resources that they can utilize in essays, projects, and presentations. Students can easily access comments made by teachers and peers and offer feedback to teachers. Social media can offer students the opportunity to collaborate by sharing information without requiring face-to-face meetings. Social media can allow students to connect more easily with experts and to go beyond course materials. Instructors in a 2010 study reported that online technologies (social media) can help students become comfortable having discussions outside the classroom better than traditional means. Teachers may face some risk when using social media outside the classroom without appropriate work rules. Studies have explored how college students' engagement with social media platforms influences their communication preferences and habits, particularly in relation to using school email for academic purposes.

    === Professional development ===

    Social media can aid professional development, as teachers become students, enhancing knowledge transfer, skill mastery, and collaboration.

    === Non-academic uses ===

    Schools can use social media to make public announcements. Teachers and administrators can communicate other important information to parents and students and receive feedback from them. Families can keep up with school events and policies.

    === Ecology education ===

    The potential of using social media in ecological, nature and forest education includes:

      • Virtual nature groups can help promote good habits in forest tourism and recreation (nature ethics) when administrators enter general rules in the group regulations, e.g. "DO NOT PICK UP PLANTS UNKNOWN TO US", which protects rare species from pointless picking.
      • Social media activity motivates people to learn about nature in the field, allows them to gain knowledge, dispels popular myths, enables contact with scientists and practitioners, promotes valuable literature and websites, and at the same time reveals distortions and substantive errors in popular news services.
      • Contact is not only virtual. Despite financial barriers and distance, Internet users organize nature conventions. Such meetings are an opportunity not only to make friends, but also to learn about nature together and have fun.
      • The possibility of contact between scientists and nature lovers via Facebook has become a source of cooperation in species inventory, e.g. the online campaign of the NATRIX Herpetological Society, which consists not only of collecting reports of observations of the smooth snake by Internet users, but also of drawing attention to the biology of this species and the threats to it.

    Social media has become a place where ecology education quickly reaches people of different ages and social statuses. The nature groups that have been created, in which nature lovers, biologists, foresters and scientists participate, can have a real impact on the state of knowledge and data collection through citizen science.

    == Apps and services ==

    Social media can allow students to participate in their field by working with organizations outside the classroom. By offering easier access to peers outside the classroom, social media can help students broaden their perspectives and find support resources. Social media aided learning outside of the classroom through collaboration and innovation. One specific study, "Exploring education-related use of social media," called this "audience connectors". Audience connectors bring students together while studying with WhatsApp and Facebook. This study reported that "60 percent [of students in the study] agreed that technology changes education for the better." While social media can promote a beneficial education platform, downsides exist. Students may become skilled at "lifting material from the internet" rather than enhancing their personal understanding. Another downside is that student attention spans decline. Students in this study raised the concern that many use spell-check as a crutch and tend to lose points when spell-check is not an option. Apps like X allowed teachers to make classroom accounts where students can learn about social media in a controlled context. Teachers can post assignments on th

  • Cover (telecommunications)

    In telecommunications and tradecraft, cover is the technique of concealing or altering the characteristics of communications patterns for the purpose of denying an unauthorized receiver information that would be of value. The purpose of cover is not to make the communication secure, but to make it look like noise, rendering it uninteresting and not worth analysis. Even if an attacker recognizes the communication as interesting, cover makes traffic analysis more difficult since he must crack the cover before he can find out to whom it is addressed. Usually, the covered communication is also encrypted. In this way, enemies have no idea you sent a message; friends know you sent a message, but don't know what you said; the intended recipient knows what you said. Technically, cover sometimes refers to the specific process of modulo two additions of a pseudorandom bit stream generated by a cryptographic device with bits from the control message. Source: from Federal Standard 1037C and from MIL-STD-188
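
The modulo-two addition described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any real device: the `keystream_bits` function is a hypothetical stand-in for the cryptographic device (it expands a shared seed with SHA-256 in counter mode), and the seed and message bits are invented for the example.

```python
import hashlib


def keystream_bits(seed, n_bits):
    """Hypothetical stand-in for the cryptographic device: expand a shared
    seed into n_bits pseudorandom bits using SHA-256 in counter mode."""
    bits = []
    counter = 0
    while len(bits) < n_bits:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        for byte in block:
            # Emit the bits of each byte, most significant bit first.
            for shift in range(7, -1, -1):
                bits.append((byte >> shift) & 1)
        counter += 1
    return bits[:n_bits]


def apply_cover(message_bits, seed):
    """Add the pseudorandom stream to the message bits modulo two.
    Modulo-two addition is the same operation as bitwise XOR."""
    stream = keystream_bits(seed, len(message_bits))
    return [(m + k) % 2 for m, k in zip(message_bits, stream)]


seed = b"shared seed"                      # assumed to be known to both ends
control_message = [1, 0, 1, 1, 0, 0, 1, 0]  # invented example bits

covered = apply_cover(control_message, seed)
recovered = apply_cover(covered, seed)      # adding the same stream again undoes it
print(covered)
print(recovered == control_message)
```

Because adding the same bit twice modulo two cancels out, a receiver holding the same seed removes the cover by repeating the operation.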

  • AI literacy

    AI literacy or artificial intelligence literacy is "a set of competencies that enables individuals to critically evaluate AI technologies; communicate and collaborate effectively with AI; and use AI as a tool online, at home, and in the workplace." AI is employed in a variety of applications, including self-driving automobiles, virtual assistants and text generation by generative AI models. Users of these tools should be able to make informed decisions. AI literacy may have an impact on students' future employment prospects. With the rise of generative AI platforms, AI literacy has become a topic of conversation in the field of education. Some think AI literacy is essential for school and college students, while others restrict or prohibit the use of AI in assignments, viewing it as a form of academic dishonesty. However, many researchers and educational institutions promote a more nuanced approach, encouraging critical engagement with AI while developing policies that balance academic integrity with opportunities for learning.

    == Definitions ==

    Other definitions of AI literacy include the ability to understand, use, monitor, and critically reflect on AI applications, and the ability to understand, use, evaluate, and ethically navigate AI. The term usually refers to teaching skills and knowledge to the general public, particularly those who are not adept in AI. As research into AI literacy is still emerging and focused on developing context-specific skills, there is not yet a single, broadly agreed-upon definition. AI literacy is linked to other forms of literacy. AI literacy requires digital literacy, whereas scientific and computational literacy may inform it. Data literacy also significantly overlaps with it.

    == Categories ==

    AI literacy encompasses multiple categories, including a theoretical understanding of how artificial intelligence works, the usage of artificial intelligence technologies, and the critical appraisal of artificial intelligence and its ethics.

    === Know and understand AI ===

    Knowledge and understanding of AI refers to a basic understanding of what artificial intelligence is and how it works. This includes familiarity with machine learning algorithms and the limitations and biases present in AI systems. Users who know and understand AI should be familiar with various technologies that use artificial intelligence, including cognitive systems, robotics and machine learning. This includes recognizing that large language models (LLMs) are machine learning models trained on extensive datasets which generate new text rather than retrieving pre-written responses.

    === Use and apply AI ===

    Using and applying AI refers to the ability to use AI tools to solve problems and perform tasks such as programming and analyzing big data. Some consider prompt engineering, the practice of designing effective prompts to guide generative AI platforms, to be another competency within AI literacy.

    === Evaluate and create AI ===

    Evaluation and creation refers to the ability to critically evaluate the quality and reliability of AI systems. It also refers to designing and building fair and ethical AI systems. To evaluate correctly, users should also learn in which areas AI is strong, and in which areas it is weak.

    === AI ethics ===

    AI ethics refers to understanding the moral implications of AI and making informed decisions regarding the use of AI tools. This area includes considerations such as:

      • Accountability: Hold AI actors accountable for the operation of AI systems and adherence to ethical ideals.
      • Accuracy: Identify and report sources of error and uncertainty in algorithms and data.
      • Auditability: Enable other parties to audit and assess algorithm behavior via transparent information sharing.
      • Explainability: Make sure that algorithmic judgments and the underlying data can be presented in simple language.
      • Fairness: Prevent biases and consider varied viewpoints. To do so, increase the diversity of researchers in the field.
      • Human centricity and well-being: Prioritize human well-being in AI development and deployment.
      • Human rights alignment: Ensure that technology does not infringe internationally recognized human rights.
      • Inclusivity: Make AI accessible to everyone.
      • Progress: Choose high-value initiatives.
      • Responsibility, accountability, and transparency: Foster trust via responsibility, accountability, and fairness.
      • Robustness and security: Make AI systems safe, secure, and resistant to manipulation or data breach.
      • Sustainability: Choose implementations that generate long-term, useful benefits.
      • Environmental implications: Consider how the tool impacts the environment, any applicable restrictions or laws, and whether the benefits are worth that impact.

    === Enabling AI ===

    Support AI by developing associated knowledge and skills such as programming and statistics.

    == Promoting AI literacy ==

    Several governments have recognized the need to promote AI literacy, including among adults. Such programs have been published in the United States, China, Germany and Finland. Programs intended for the general public usually consist of short and easy-to-understand online study units. Programs intended for children are usually project-based. Programs for students at colleges and universities often address the specific professional needs of the student, depending on their field of study. Beyond the education system, AI literacy can also be developed in the community, for example in museums.

    === Schools ===

    Schools use diverse pedagogies to promote AI literacy. These include:

      • Performing a Turing test with an intelligent agent
      • Creating chatbots
      • Building apps using Blockly-based programming
      • Project-based learning
      • Building robots
      • Data visualization
      • Training AI models

    Artificial intelligence curricula can improve students' understanding of topics such as machine learning, neural networks, and deep learning.

    === Higher education ===

    Before the second decade of the 21st century, artificial intelligence was studied mainly in STEM courses. Later, projects emerged to increase artificial intelligence education, specifically to promote AI literacy. Most courses start with one or more study units that deal with basic questions such as what artificial intelligence is, where it comes from, what it can do and what it can't do. Most courses also refer to machine learning and deep learning. Some of the courses deal with moral issues in artificial intelligence. In Ireland, the Higher Education Authority published Generative AI in Higher Education Teaching & Learning: Policy Framework in December 2025, which encouraged higher education institutions to embed AI literacy across programmes as a core graduate attribute.

    ==== Disciplinary policy ====

    As a response to the increase of generative AI use in education, several disciplines formed committees or task forces to examine context-specific approaches toward AI literacy. In spring 2025, the Modern Language Association and Conference on College Composition and Communication Joint Task Force finished development of three working papers, a guide on AI literacy for students, and a collection of resources addressing AI use in writing. The task force emphasized the need for "a culture of critical AI literacy" and included guidelines not only for students but also for educators and institutions, highlighting the need for modeling ethical AI use in planning processes. Similarly, a committee formed by the American Historical Association Council published "Guiding Principles for Artificial Intelligence in History Education", which encouraged "clear and transparent engagement with generative AI." The guidelines demonstrate the value of criticality when working with generative AI in thinking and research.

  • CryptoParty

    CryptoParty

    CryptoParty (Crypto-Party) is a grassroots global endeavour to introduce the basics of practical cryptography such as the Tor anonymity network, I2P, Freenet, key signing parties, disk encryption and virtual private networks to the general public. The project primarily consists of a series of free public workshops. == History == As a successor to the Cypherpunks of the 1990s, CryptoParty was conceived in late August 2012 by the Australian journalist Asher Wolf in a Twitter post following the passing of the Cybercrime Legislation Amendment Bill 2011 and the proposal of a two-year data retention law in that country. The DIY, self-organizing movement immediately went viral, with a dozen autonomous CryptoParties being organized within hours in cities throughout Australia, the US, the UK, and Germany. Many more parties were soon organized or held in Chile, the Netherlands, Hawaii, Asia, etc. Tor usage in Australia itself spiked, and CryptoParty London, with 130 attendees (some of whom were veterans of the Occupy London movement), had to be moved from London Hackspace to the Google campus in east London's Tech City. As of mid-October 2012, some 30 CryptoParties had been held globally, some on a continuing basis, and CryptoParties were held on the same day in Reykjavik, Brussels, and Manila. The first draft of the 442-page CryptoParty Handbook (the hard copy of which is available at cost) was pulled together in three days using the book sprint approach, and was released 2012-10-04 under a CC BY-SA license. === Edward Snowden involvement === In May 2014, Wired reported that Edward Snowden, while employed by Dell as an NSA contractor, organized a local CryptoParty at a small hackerspace in Honolulu, Hawaii on December 11, six months before becoming well known for leaking tens of thousands of secret U.S. government documents. 
During the CryptoParty, Snowden taught 20 Hawaii residents how to encrypt their hard drives and use the Internet anonymously. The event was filmed by Snowden's then-girlfriend, but the video has never been released online. In a follow-up post to the CryptoParty wiki, Snowden pronounced the event a "huge success." == Media response == In 2013, CryptoParty received messages of support from the Electronic Frontier Foundation and (purportedly) AnonyOps, as well as the NSA whistleblower Thomas Drake, WikiLeaks central editor Heather Marsh, and Wired reporter Quinn Norton. Eric Hughes, the author of A Cypherpunk's Manifesto nearly two decades before, delivered the keynote address, Putting the Personal Back in Personal Computers, at the Amsterdam CryptoParty on 2012-09-27. Marcin de Kaminski, founding member of Piratbyrån which in turn founded The Pirate Bay, regarded CryptoParty as the most important civic project in cryptography in 2012, and Cory Doctorow has characterized a CryptoParty as being "like a Tupperware party for learning crypto." Der Spiegel in December 2014 mentioned "crypto parties" in the wake of the Edward Snowden leaks in an article about the NSA.

  • Social media intelligence

    Social media intelligence

    Social media intelligence (SMI or SOCMINT) comprises the collective tools and solutions that allow organizations to analyze conversations, respond to and synchronize social signals, and synthesize social data points into meaningful trends and analysis, based on the user's needs. Social media intelligence allows one to gather intelligence from social media sites, using both intrusive and non-intrusive means, from open and closed social networks. This type of intelligence gathering is one element of OSINT (Open-Source Intelligence). To support both the sensing and seizing of social signals at scale, organisations increasingly rely on dedicated audience intelligence platforms which combine data aggregation, NLP-driven analysis, and cross-platform monitoring. The term 'Social Media Intelligence' was coined in a 2012 paper written by Sir David Omand, Jamie Bartlett and Carl Miller for the Centre for the Analysis of Social Media, at the London-based think tank, Demos. The authors argued that social media is now an important part of intelligence and security work, but that technological, analytical, and regulatory changes are needed before it can be considered a powerful new form of intelligence, including amendments to the United Kingdom Regulation of Investigatory Powers Act 2000. Given the dynamic evolution of social media and social media monitoring, our current understanding of how social media monitoring can help organizations create business value is inadequate. As a result, there is a need to study how organizations can (a) extract and analyze social media data related to their business (Sensing), and (b) utilize external intelligence gained from social media monitoring for specific business initiatives (Seizing). 
== Governmental use == In Thailand, the Technology Crime Suppression Division not only employs a 30-person team to scrutinize social media for content deemed disrespectful to the monarchy, known as lèse-majesté but also encourages citizens to report such content. Particularly targeting the youth, they run a "Cyber Scout" program where participants are rewarded for reporting individuals posting material perceived as detrimental to the monarchy. Instances in Israel involve the arrest of Palestinians by the police for their social media posts. An example includes a 15-year-old girl who posted a Facebook status with the words "forgive me," raising suspicions among Israeli authorities that she might be planning an attack. In Egypt, a leaked 2014 call for tender from the Ministry of Interior reveals efforts to procure a social media monitoring system to identify leading figures and prevent protests before they occur. In the United States, ZeroFOX faced criticism for sharing a report with Baltimore officials showcasing how their social media monitoring tool could track riots following Freddie Gray's funeral. The report labeled 19 individuals, including two prominent figures from the #BlackLivesMatter movement, as "threat actors." In the UK, the Association of Chief Police Officers of England, Wales, and Northern Ireland emphasized the significance of social media in intelligence gathering during anti-fracking protests in 2011. Social media analysis closely monitored protests against the badger cull in 2013, with a 2013 report revealing a team of 17 officers in the National Domestic Extremism Unit scanning public tweets, YouTube videos, Facebook profiles, and other online content from UK citizens. == Effects on political opinion == During the 2016 United States presidential election, the Senate Intelligence Committee released reports containing information about Russia’s use of troll farms to mislead black voters about voting. 
Also, German researchers in 2010 analyzed Twitter messages regarding the German federal election, concluding that Twitter played a role in leading users to a specific political opinion. In a broad sense, social media refers to a conversational, distributed mode of content generation, dissemination, and communication among communities. Different from broadcast-based traditional and industrial media, social media has torn down the boundaries between authorship and readership, while the information consumption and dissemination process is becoming intrinsically intertwined with the process of generating and sharing information. An example of how SOCMINT is used to affect political opinions is the Cambridge Analytica scandal. Cambridge Analytica was a company that purchased data from Facebook about American users without their consent or knowledge. It used this data to build a "psychological warfare tool" to persuade US voters to elect Donald Trump as president in the 2016 election. Christopher Wylie, the whistleblower, reported that personal information was taken in early 2014 and used to build a system that could target US voters with personalized political advertisements. More than 50 million individuals' data was exploited and manipulated. == Law enforcement == In September 2023, the Philadelphia Police Department began using social media to track and stay one step ahead of criminal activity, with the aim of stopping meetups and potential robberies. This approach gives officers another tool for finding new information as quickly as possible. Law enforcement agencies worldwide are increasingly employing social media intelligence to enhance their capabilities in both crime prevention and investigation. By analyzing publicly available data from social platforms such as Facebook, Twitter, and Instagram, police can track criminal activities, identify suspects, and even prevent potential crimes before they occur. 
For instance, the FBI utilizes SOCMINT to monitor threats and investigate criminal activities, including analyzing posts, images, and videos that might signal illegal activities or security concerns. == Marketing == SOCMINT collects data from both organizations and people on an individual level. It serves a variety of purposes, and although its main goal is to improve national security, it has several other benefits as well. This intelligence can identify patterns, predict trends, and gather information in real time. These capabilities have benefited both businesses and law enforcement. Artificial Social Networking Intelligence (ASNI) refers to the application of artificial intelligence within social networking services and social media platforms. It encompasses various technologies and techniques used to automate, personalize, enhance, improve, and synchronize users' interactions and experiences within social networks. ASNI is expected to evolve rapidly, influencing how users interact online and shaping their digital experiences. Transparency, ethical considerations, media influence bias, and user control over data will be crucial to ensure responsible development and positive impact. Google provides many free services and has built an entire media brand with its vast variety of products. Along with data collection, Google also owns two advertising services, Google Ads and Google AdSense. Most of its revenue comes from advertising, not from direct sales of its services or products. Google makes money by selling advertising services to advertisers. It provides ad space on websites and targets ads to consumers of Google services and products. Google can market ads using SOCMINT to collect data from its users and generate revenue. 
Research shows that various social media platforms on the Internet such as Twitter, Tumblr (micro-blogging websites), Facebook (a popular social networking website), YouTube (largest video sharing and hosting website), Blogs and discussion forums are being misused by extremist groups for spreading their beliefs and ideologies, promoting radicalization, recruiting members and creating online virtual communities sharing a common agenda. Popular microblogging websites such as Twitter are being used as a real-time platform for information sharing and communication during the planning and mobilization of civil unrest-related events.

  • Cryptovirology

    Cryptovirology

    Cryptovirology refers to the study of cryptography use in malware, such as ransomware and asymmetric backdoors. Traditionally, cryptography and its applications are defensive in nature, and provide privacy, authentication, and security to users. Cryptovirology employs a twist on cryptography, showing that it can also be used offensively. It can be used to mount extortion based attacks that cause loss of access to information, loss of confidentiality, and information leakage, tasks which cryptography typically prevents. The field was born with the observation that public-key cryptography can be used to break the symmetry between what an antivirus analyst sees regarding malware and what the attacker sees. The antivirus analyst sees a public key contained in the malware, whereas the attacker sees the public key contained in the malware as well as the corresponding private key (outside the malware) since the attacker created the key pair for the attack. The public key allows the malware to perform trapdoor one-way operations on the victim's computer that only the attacker can undo. == Overview == The field encompasses covert malware attacks in which the attacker securely steals private information such as symmetric keys, private keys, PRNG state, and the victim's data. Examples of such covert attacks are asymmetric backdoors. An asymmetric backdoor is a backdoor (e.g., in a cryptosystem) that can be used only by the attacker, even after it is found. This contrasts with the traditional backdoor that is symmetric, i.e., anyone that finds it can use it. Kleptography, a subfield of cryptovirology, is the study of asymmetric backdoors in key generation algorithms, digital signature algorithms, key exchanges, pseudorandom number generators, encryption algorithms, and other cryptographic algorithms. The NIST Dual EC DRBG random bit generator has an asymmetric backdoor in it. 
The EC-DRBG algorithm utilizes the discrete-log kleptogram from kleptography, which by definition makes the EC-DRBG a cryptotrojan. Like ransomware, the EC-DRBG cryptotrojan contains and uses the attacker's public key to attack the host system. The cryptographer Ari Juels indicated that NSA effectively orchestrated a kleptographic attack on users of the Dual EC DRBG pseudorandom number generation algorithm and that, although security professionals and developers have been testing and implementing kleptographic attacks since 1996, "you would be hard-pressed to find one in actual use until now." Due to public outcry about this cryptovirology attack, NIST rescinded the EC-DRBG algorithm from the NIST SP 800-90 standard. Covert information leakage attacks carried out by cryptoviruses, cryptotrojans, and cryptoworms that, by definition, contain and use the public key of the attacker are a major theme in cryptovirology. In "deniable password snatching," a cryptovirus installs a cryptotrojan that asymmetrically encrypts host data and covertly broadcasts it. This makes it available to everyone, noticeable by no one (except the attacker), and only decipherable by the attacker. An attacker caught installing the cryptotrojan claims to be a virus victim. An attacker observed receiving the covert asymmetric broadcast is one of thousands, if not millions, of receivers, and exhibits no identifying information whatsoever. The cryptovirology attack achieves "end-to-end deniability." It is a covert asymmetric broadcast of the victim's data. Cryptovirology also encompasses the use of private information retrieval (PIR) to allow cryptoviruses to search for and steal host data without revealing the data searched for even when the cryptotrojan is under constant surveillance. By definition, such a cryptovirus carries within its own coding sequence the query of the attacker and the necessary PIR logic to apply the query to host systems. 
== History == The first cryptovirology attack and discussion of the concept was by Adam L. Young and Moti Yung, at the time called "cryptoviral extortion" and it was presented at the 1996 IEEE Security & Privacy conference. In this attack, a cryptovirus, cryptoworm, or cryptotrojan contains the public key of the attacker and hybrid encrypts the victim's files. The malware prompts the user to send the asymmetric ciphertext to the attacker who will decipher it and return the symmetric decryption key it contains for a fee. The victim needs the symmetric key to decrypt the encrypted files if there is no way to recover the original files (e.g., from backups). The 1996 IEEE paper predicted that cryptoviral extortion attackers would one day demand e-money, long before Bitcoin even existed. Many years later, the media relabeled cryptoviral extortion as ransomware. In 2016, cryptovirology attacks on healthcare providers reached epidemic levels, prompting the U.S. Department of Health and Human Services to issue a Fact Sheet on Ransomware and HIPAA. The fact sheet states that when electronic protected health information is encrypted by ransomware, a breach has occurred, and the attack therefore constitutes a disclosure that is not permitted under HIPAA, the rationale being that an adversary has taken control of the information. Sensitive data might never leave the victim organization, but the break-in may have allowed data to be sent out undetected. California enacted a law that defines the introduction of ransomware into a computer system with the intent of extortion as being against the law. == Examples == === Tremor virus === While viruses in the wild have used cryptography in the past, the only purpose of such usage of cryptography was to avoid detection by antivirus software. For example, the tremor virus used polymorphism as a defensive technique in an attempt to avoid detection by anti-virus software. 
Though cryptography does assist in such cases to enhance the longevity of a virus, the capabilities of cryptography are not used in the payload. The One-half virus was amongst the first viruses known to have encrypted affected files. === Tro_Ransom.A virus === An example of a virus that informs the owner of the infected machine to pay a ransom is the virus nicknamed Tro_Ransom.A. This virus asks the owner of the infected machine to send $10.99 to a given account through Western Union. Virus.Win32.Gpcode.ag is a classic cryptovirus. This virus partially uses a version of 660-bit RSA and encrypts files with many different extensions. It instructs the owner of the machine to email a given mail ID if the owner desires the decryptor. If contacted by email, the user will be asked to pay a certain amount as ransom in return for the decryptor. === CAPI === It has been demonstrated that using just 8 different calls to Microsoft's Cryptographic API (CAPI), a cryptovirus can satisfy all its encryption needs. == Other uses of cryptography-enabled malware == Apart from cryptoviral extortion, there are other potential uses of cryptoviruses, such as deniable password snatching, cryptocounters, private information retrieval, and in secure communication between different instances of a distributed cryptovirus.

  • AI browser

    AI browser

    An AI browser is a web browser with integrated artificial intelligence capabilities, such as automatically summarizing web page content or answering questions about it. A more specialized type is an agentic browser, based on the concept of agentic AI, which can take actions – such as navigating webpages or filling out forms – on behalf of the user. Several agentic browsers emerged in 2025, including ChatGPT Atlas (macOS only), Comet, and Dia. As of 2025, this is a recent development in the browser market, including new entrants from OpenAI, Opera and Perplexity. The designation of 'AI browser' also includes established browsers that later added non-agentic AI features, such as Microsoft Edge with the Copilot chatbot, Google Chrome with the Gemini chatbot (for Windows desktop users in the US with their language set to English), and Firefox with multiple chatbot providers (such as ChatGPT, Claude, Copilot, Gemini, and Le Chat). AI browsers have been noted to be susceptible to prompt injection attacks. == Browser extensions and integrations == Rather than creating entirely new browsers, some AI browsing solutions integrate with existing browsers through extensions or companion applications. These tools add agentic capabilities to established browsers without requiring users to switch platforms. Examples include Composite, which functions as a cross-browser agent that works with Chrome, Edge, and other browsers to automate web-based tasks for workers. == Cloud-based implementations == Cloud-based implementations of AI browsers allow users to run automated browsing agents without local installation. These systems operate on remote servers using frameworks such as Puppeteer or Playwright. Examples include Browserbase, Browser-use and AI Browser. The AI typically parses the Document Object Model (DOM) to locate and interact with page elements, and may also analyze browser screenshots to interpret layout and structure. 
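The element-location step described above can be illustrated with a minimal sketch. It assumes a static HTML string rather than a live DOM, and it uses Python's standard `html.parser` module as a stand-in for what a framework such as Puppeteer or Playwright would expose; the class name `InteractiveElementFinder`, the helper `find_interactive_elements`, and the sample page are hypothetical and are not taken from any of the products named here.

```python
from html.parser import HTMLParser


class InteractiveElementFinder(HTMLParser):
    """Collects elements an agent could click or type into (illustrative only)."""

    # Assumption: these tags approximate "interactive" elements for the sketch.
    INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; store them with the tag name.
        if tag in self.INTERACTIVE_TAGS:
            self.elements.append({"tag": tag, **dict(attrs)})


def find_interactive_elements(html):
    """Return a list of dicts describing the interactive elements in the markup."""
    finder = InteractiveElementFinder()
    finder.feed(html)
    finder.close()
    return finder.elements


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = """
<form action="/search">
  <input type="text" name="q" placeholder="Search">
  <button type="submit" id="go">Search</button>
</form>
<a href="/about">About</a>
"""

elements = find_interactive_elements(SAMPLE_PAGE)
for element in elements:
    print(element)
```

A real agentic browser would work against the rendered page and would need to treat page text as untrusted input, since that text is the channel through which prompt injection arrives.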
== Criticisms and dangers == AI browsers have been noted to be susceptible to prompt injection attacks, in which the content of websites can be used to hijack control of the browser. Multiple organisations have argued against using AI browsers due to this vulnerability. The United Kingdom's National Cyber Security Centre and Gartner consider them to be too risky for adoption by most organisations. A study by the CISPA Helmholtz Center and Saarland University concluded that this vulnerability makes them easy targets for malware, fraud, automated defamation, disinformation and biased outputs.

  • Visual networking

    Visual networking

    Visual networking refers to an emerging class of user applications that combine digital video and social networking capabilities. It is based upon the premise that visual literacy, "the ability to interpret, negotiate and make meaning from information presented in the form of a moving image", is a powerful force in how humans communicate, entertain and learn. Visual networking is characterized by duality: it subsumes entertainment and communications, professional and personal content, video and other digital media, and data networks and social networks to create immersive experiences when, where and how the user wants them. These applications have changed video content from long-form movies and broadcast television programming to a database of segments or "clips", and social network annotations. The generation and distribution of content also takes on a new dimension with Web 2.0 applications: participatory social networks or communities that facilitate interactive creativity, collaboration and sharing between users. == History == The rise of visual networking is a relatively recent phenomenon driven by the emergence of social networking capabilities and the ability to deliver interactive video over a broadband network. It is a natural evolution of the current social networking phenomena whereby social networking annotations are layered over broadband video to create highly interactive and immersive experiences between individuals and their content. Until early 2005 this was not considered viable due to the lack of web and broadband infrastructure designed to support the transmission of web video and the still nascent stage of social networks like MySpace and Facebook. The introduction of YouTube in February 2005 marked the first significant combination of broadband video and social network systems designed to allow users to share, rate and tag user-generated and premium content. 
From 2006 to 2008 this trend continued to gain steam as individuals and businesses pursued new combinations of video and social networking across a wide range of entertainment, communication and learning applications. == Broadband video takes off == Video has largely been defined by its use as an entertainment medium. Since the commercial availability of the television in the late '30s video has become the dominant entertainment medium far eclipsing audio and text based entertainment both in terms of time and dollars spent. Within the past decade, video use has rapidly evolved across a broader range of devices, multiple locations and user applications. The popularization of the long-tail and user-generated video has further challenged people's ideas of what's possible with video. A key advantage of video relative to other media is its superior ability to communicate ideas and emotions economically. If a picture is worth a thousand words, then a video may be worth a thousand pictures. Video by its very nature is highly experiential, making communications more compelling, informative and memorable. == Social networking meets video == At the core of visual networking is the concept that people can participate in communities of content and communities of interest. A community of interest is defined as a community of people who share a common interest or passion. These people exchange ideas and thoughts about the given passion, but may know (or care) little about each other outside of this area. Participation in a community of interest can be compelling, entertaining and create a ‘sticky’ community where people return frequently and remain for extended periods. The unparalleled potential of the Internet to promote such connections is only now being fully recognized and exploited, through Web-based groups established for that purpose. 
Based on the six degrees of separation concept (the idea that any two people on the planet could make contact through a chain of no more than five intermediaries), social networking establishes interconnected Internet communities (sometimes known as personal networks) that help people make contacts that would be good for them to know, but that they would be unlikely to have met otherwise. == Transition from search to discovery == According to Chris Anderson, he coined the phrase "The Long Tail" in October 2004. Anderson argued that products that are in low demand or have low sales volume can collectively make up a market share that rivals or exceeds the relatively few current bestsellers and blockbusters, if the store or distribution channel is large enough. The Long Tail also has implications for the producers of content, especially those whose products could not, for economic reasons, find a place in pre-Internet information distribution channels controlled by book publishers, record companies, movie studios, and television networks. Looked at from the producers' side, the Long Tail has made possible a flowering of creativity across all fields of human endeavor. One example of this is YouTube, where thousands of diverse videos, whose content, production value or lack of popularity make them inappropriate for traditional television, are easily accessible to a wide range of viewers. The benefit to consumers is that they now have an almost infinite choice of content to select from and are able to create their own specific channels based upon their unique needs. A potential negative side effect of the long tail is the rapidly growing inventory of text, audio and video content. The storage and distribution systems of the past restricted the number of songs, videos, and books, making it easier to search for what was relevant to the individual. 
As the long tail has grown, more and more relevant and irrelevant content passes an individual by without their knowledge. This is especially true for video because, unlike text-based files, which can be searched and indexed for easy finding, video typically has only its title as a clue to what's in it. This lack of comprehensive metadata has limited the applicability of traditional search models. Augmenting traditional search has been the emergence of content-based discovery tools that make people aware of relevant content based upon their participation in communities of interest and/or communities of content. The idea is that users may or may not start out searching for something, but they soon begin reacting to things they find, exploring links on pages they stumble upon and taking cues from fellow surfers about where to go. Instead of the old, passive, lean-back style of watching video, viewers are actively seeking content through discovery. People interact with each other, posting comments on what they just saw. Many sites now allow people to vote on videos, ranking and rating them. Ranking is the result of one of a number of algorithms that measure how many people have watched something or how many sites link to it. == Early examples == YouTube is the best early example of a visual networking experience. YouTube is a video sharing website where users can upload, view and share video clips. Unregistered users can watch most videos on the site, while registered users are permitted to upload an unlimited number of videos. Few statistics are publicly available regarding the number of videos on YouTube. However, in July 2006, the company revealed that more than 100 million videos were being watched every day, and 2.5 billion videos were watched in June 2006. 50,000 videos were being added per day in May 2006, and this increased to 65,000 by July. In January 2008 alone, nearly 79 million users watched over 3 billion videos on YouTube. 
Telepresence refers to a set of technologies which allow a person to feel as if they were present, to give the appearance that they were present, or to have an effect, at a location other than their true location. Telepresence requires that the senses of the user, or users, are provided with such stimuli as to give the feeling of being in that other location. Additionally, the user(s) may be given the ability to affect the remote location. In this case, the user's position, movements, actions, voice, etc. may be sensed, transmitted and duplicated in the remote location to bring about this effect. Therefore, information may be traveling in both directions between the user and the remote location. Critical to creating an in-person experience is the presence of high-definition video perfectly synchronized with stereophonic sound. A minimum system usually includes visual feedback. Ideally, the entire field of view of the user is filled with a view of the remote location, and the viewpoint corresponds to the movement and orientation of the user's head. In this way, it differs from television or cinema, where the viewpoint is out of the control of the viewer. == Other applications == While still in their infancy, visual networking applications are beginning to emerge that span both consumer and business markets. === Mobile video === Proliferation of multi-function mobile devices, particularl

  • Sentiment analysis

    Sentiment analysis

    Sentiment analysis (also known as opinion mining) is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models, such as RoBERTa, more difficult data domains can be analyzed, e.g., news texts where authors typically express their opinion/sentiment less explicitly. == Types == A basic task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature/aspect level, that is, whether the expressed opinion in a document, a sentence or an entity feature/aspect is positive, negative, or neutral. Advanced, "beyond polarity" sentiment classification looks, for instance, at emotional states such as enjoyment, anger, disgust, sadness, fear, and surprise. Precursors to sentiment analysis include the General Inquirer, which provided hints toward quantifying patterns in text and, separately, psychological research that examined a person's psychological state based on analysis of their verbal behavior. Subsequently, the method described in a patent by Volcani and Fogel looked specifically at sentiment and identified individual words and phrases in text with respect to different emotional scales. A current system based on their work, called EffectCheck, presents synonyms that can be used to increase or decrease the level of evoked emotion in each scale. Many other subsequent efforts were less sophisticated, using a mere polar view of sentiment, from positive to negative, such as work by Turney and Pang, who applied different methods for detecting the polarity of product reviews and movie reviews respectively. 
This work is at the document level. One can also classify a document's polarity on a multi-way scale, which was attempted by Pang and Snyder among others: Pang and Lee expanded the basic task of classifying a movie review as either positive or negative to predict star ratings on either a 3- or a 4-star scale, while Snyder performed an in-depth analysis of restaurant reviews, predicting ratings for various aspects of the given restaurant, such as the food and atmosphere (on a five-star scale). First steps to bringing together various approaches (learning, lexical, knowledge-based, etc.) were taken in the 2004 AAAI Spring Symposium, where linguists, computer scientists, and other interested researchers first aligned interests and proposed shared tasks and benchmark data sets for the systematic computational research on affect, appeal, subjectivity, and sentiment in text. Even though most statistical classification methods ignore the neutral class, under the assumption that neutral texts lie near the boundary of the binary classifier, several researchers suggest that, as in every polarity problem, three categories must be identified. Moreover, it can be proven that specific classifiers such as maximum entropy models and SVMs can benefit from the introduction of a neutral class and improve the overall accuracy of the classification. There are in principle two ways of operating with a neutral class. Either the algorithm proceeds by first identifying the neutral language, filtering it out and then assessing the rest in terms of positive and negative sentiments, or it builds a three-way classification in one step. This second approach often involves estimating a probability distribution over all categories (e.g. naive Bayes classifiers as implemented by the NLTK). 
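The one-step, three-way approach can be illustrated with a small multinomial naive Bayes classifier that returns a probability distribution over positive, negative and neutral. This is a minimal pure-Python sketch under stated assumptions, not the NLTK implementation: the training sentences, the function names `train_naive_bayes` and `predict_proba`, and the use of add-one smoothing are all choices made for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy training data, invented for illustration only.
TRAIN = [
    ("great food loved it", "positive"),
    ("excellent friendly service", "positive"),
    ("terrible food hated it", "negative"),
    ("awful rude service", "negative"),
    ("the restaurant opens at noon", "neutral"),
    ("the menu lists ten dishes", "neutral"),
]


def train_naive_bayes(examples):
    """Count documents per label and words per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def predict_proba(model, text):
    """Return a normalized probability for every label in one step."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        log_scores[label] = score
    # Convert log scores to probabilities in a numerically stable way.
    top = max(log_scores.values())
    unnormalized = {lab: math.exp(s - top) for lab, s in log_scores.items()}
    z = sum(unnormalized.values())
    return {lab: value / z for lab, value in unnormalized.items()}


model = train_naive_bayes(TRAIN)
probabilities = predict_proba(model, "loved the excellent food")
best_label = max(probabilities, key=probabilities.get)
```

Because every label receives a probability, a caller can also treat low-confidence predictions as neutral, which is one way of combining the two strategies described above.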
Whether and how to use a neutral class depends on the nature of the data: if the data is clearly clustered into neutral, negative and positive language, it makes sense to filter the neutral language out and focus on the polarity between positive and negative sentiments. If, in contrast, the data are mostly neutral with small deviations towards positive and negative affect, this strategy would make it harder to clearly distinguish between the two poles. A different method for determining sentiment is the use of a scaling system whereby words commonly associated with having a negative, neutral, or positive sentiment are given an associated number on a −10 to +10 scale (most negative up to most positive) or simply from 0 to a positive upper limit such as +4. This makes it possible to adjust the sentiment of a given term relative to its environment (usually on the level of the sentence). When a piece of unstructured text is analyzed using natural language processing, each concept in the specified environment is given a score based on the way sentiment words relate to the concept and its associated score. This allows movement to a more sophisticated understanding of sentiment, because it is now possible to adjust the sentiment value of a concept relative to modifications that may surround it. Words, for example, that intensify, relax or negate the sentiment expressed by the concept can affect its score. Alternatively, texts can be given a positive and negative sentiment strength score if the goal is to determine the sentiment in a text rather than the overall polarity and strength of the text. There are various other types of sentiment analysis, such as aspect-based sentiment analysis, grading sentiment analysis (positive, negative, neutral), multilingual sentiment analysis and detection of emotions. 
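The scaling approach described above can be sketched as a small lexicon-based scorer. The word lists, the numeric weights and the function name `score_sentence` are hypothetical and chosen only for illustration. The sketch also makes a simplifying assumption: an intensifier, relaxer or negator applies to the next sentiment word in the sentence, whatever lies between them.

```python
import re

# Hypothetical lexicon on a -10..+10 scale (values invented for illustration).
LEXICON = {"good": 4, "great": 7, "okay": 0, "bad": -4, "terrible": -8}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}   # strengthen the next sentiment word
RELAXERS = {"slightly": 0.5, "somewhat": 0.75}   # weaken the next sentiment word
NEGATORS = {"not", "never", "no"}                # flip the sign of the next sentiment word


def clamp(value, low=-10.0, high=10.0):
    """Keep an adjusted score inside the scale."""
    return max(low, min(high, value))


def score_sentence(sentence):
    """Return positive strength, negative strength and their sum for one sentence."""
    multiplier = 1.0
    negate = False
    positive = 0.0
    negative = 0.0
    for token in re.findall(r"[a-z']+", sentence.lower()):
        if token in NEGATORS:
            negate = not negate
        elif token in INTENSIFIERS:
            multiplier *= INTENSIFIERS[token]
        elif token in RELAXERS:
            multiplier *= RELAXERS[token]
        elif token in LEXICON:
            value = LEXICON[token] * multiplier
            if negate:
                value = -value
            value = clamp(value)
            if value > 0:
                positive += value
            else:
                negative += value
            # Modifiers are consumed by the sentiment word they adjust.
            multiplier, negate = 1.0, False
    return {"positive": positive, "negative": negative, "overall": positive + negative}


result = score_sentence("The food was very good but not great")
```

Reporting the positive and negative strengths separately, alongside the overall sum, mirrors the alternative mentioned above in which a text receives both a positive and a negative strength score.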
=== Subjectivity/objectivity identification === This task is commonly defined as classifying a given text (usually a sentence) into one of two classes: objective or subjective. This problem can sometimes be more difficult than polarity classification. The subjectivity of words and phrases may depend on their context, and an objective document may contain subjective sentences (e.g., a news article quoting people's opinions). Moreover, as mentioned by Su, results are largely dependent on the definition of subjectivity used when annotating texts. However, Pang showed that removing objective sentences from a document before classifying its polarity helped improve performance. Subjective and objective identification is an emerging subtask of sentiment analysis that uses syntactic and semantic features, together with machine learning, to identify whether a sentence or document contains facts or opinions. Awareness of the distinction between facts and opinions is not recent, having possibly first been presented by Carbonell at Yale University in 1979. The term objective refers to text that carries factual information. Example of an objective sentence: 'To be elected president of the United States, a candidate must be at least thirty-five years of age.' The term subjective describes text that contains non-factual information in various forms, such as personal opinions, judgments, and predictions, also known as 'private states'. The subjective example below reflects a private state of its speaker, 'We Americans'. Moreover, the target entity commented on by the opinions can take several forms, from tangible products to intangible topics, as stated in Liu (2010). Furthermore, three types of attitudes were observed by Liu (2010): 1) positive opinions, 2) neutral opinions, and 3) negative opinions. Example of a subjective sentence: 'We Americans need to elect a president who is mature and who is able to make wise decisions.' This analysis is a classification problem. 
For each class, collections of indicator words or phrases are defined in order to locate the desired patterns in unannotated text. For subjective expressions, a separate word list has been created. Lists of subjective indicators, as words or phrases, have been developed by multiple researchers in linguistics and natural language processing, as stated in Riloff et al. (2003). A dictionary of extraction rules has to be created for measuring given expressions. Over the years, feature extraction in subjectivity detection has progressed from curating features by hand to automated feature learning. At the moment, automated learning methods can be further separated into supervised and unsupervised machine learning. Pattern extraction with machine learning over annotated and unannotated text has been explored extensively by academic researchers. However, researchers have recognized several challenges in developing fixed sets of rules for expressions. Many of the challenges in rule development stem from the nature of textual information. Six challenges have been recognized by several researchers: 1) metaphorical expressions, 2) discrepancies in writing, 3) context sensitivity, 4) words with few usages, 5) time sensitivity, and 6) ever-growing volume. Metaphorical expressions. Text that contains metaphorical expressions may affect the performance of the extraction. Besides, metaphors take different forms, which may have been contribu

  • Cloud Data Management Interface

    Cloud Data Management Interface

    ISO/IEC 17826 Information technology — Cloud Data Management Interface (CDMI) Version 2.0.0 is an international standard that specifies a protocol for self-provisioning, administering and managing access to data stored in cloud storage, object storage, storage area network and network attached storage systems. The CDMI standard is developed and maintained by the Storage Networking Industry Association, which makes a publicly accessible version of the specification available. CDMI defines new resource representations to enable standardized management of any URI-accessible data, and defines RESTful HTTP operations using these representations to discover the capabilities of the storage system, discover stored data, access and update management metadata, specify data storage protocols (such as iSCSI and NFS) through which the stored data is accessed, and provide cross-system and cross-cloud import and export in order to enable data portability. Management functions enabled by CDMI include managing data ownership, identity mapping, access controls and user-specified metadata, and declaratively specifying desired data protection, data retention, constraints on geographic placement, desired quality of service, data versioning and security requirements. CDMI also defines utility services to facilitate data management, such as the ability to query data matching specific criteria, and includes extensions to perform bulk updates using CDMI Jobs. == Capabilities == Compliant implementations must provide access to a set of configuration parameters known as capabilities. These are either boolean values that represent whether or not a system supports things such as queues, export via other protocols, path-based storage and so on, or numeric values expressing system limits, such as how much metadata may be placed on an object. 
As a minimal compliant implementation can be quite small, with few features, clients need to check the cloud storage system for a capability before attempting to use the functionality it represents. Resource allocation assignments limited to the data management interface protocols must possess access bypass capabilities which extend beyond the layered framework. This integral function is vital to the prevention of transport layer session hijacking by unauthorized entities which may circumvent standard interfacing security parameters. == Containers == A CDMI client may access objects, including containers, by either name or object id (OID), assuming the CDMI server supports both methods. When storing objects by name, it is natural to use nested named containers; the resulting structure corresponds exactly to a traditional filesystem directory structure. == Objects == Objects are similar to files in a traditional file system, but are enhanced with an increased amount and capacity for metadata. As with containers, they may be accessed by either name or OID. When accessed by name, clients use URLs that contain the full pathname of objects to create, read, update and delete them. When accessed by OID, the URL specifies an OID string in the cdmi-objectid container; this container presents a flat name space conformant with standard object storage system semantics. Subject to system limits, objects may be of any size or type and have arbitrary user-supplied metadata attached to them. Systems that support query allow arbitrary queries to be run against the metadata. == Domains, Users and Groups == CDMI supports the concept of a domain, similar in concept to a domain in the Windows Active Directory model. Users and groups created in a domain share a common administrative database and are known to each other on a "first name" basis, i.e. without reference to any other domain or system. Domains also function as containers for usage and billing summary data. 
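The capability check described under Capabilities can be sketched as follows. This is an in-memory illustration only: the dictionary stands in for whatever capability document a server returns, and the key names (`queues_supported`, `export_nfs_supported`, `max_metadata_bytes`) and helper functions are hypothetical placeholders, not identifiers taken from the CDMI specification.

```python
import json

# Hypothetical capability set reported by a storage system.
# Booleans say whether a feature exists; numbers express system limits.
SERVER_CAPABILITIES = {
    "queues_supported": False,
    "export_nfs_supported": True,
    "max_metadata_bytes": 64,
}


class UnsupportedCapability(Exception):
    """Raised when a client asks for a feature the system does not advertise."""


def require(capabilities, name):
    """Fail early unless the boolean capability is explicitly true."""
    if capabilities.get(name) is not True:
        raise UnsupportedCapability(name)


def metadata_fits(capabilities, metadata):
    """Check user metadata against the advertised numeric limit.

    A missing limit is treated as zero, the conservative choice for this sketch.
    """
    limit = capabilities.get("max_metadata_bytes", 0)
    size = len(json.dumps(metadata).encode("utf-8"))
    return size <= limit


def create_queue(capabilities, name):
    """Only attempt the operation after the capability check passes."""
    require(capabilities, "queues_supported")
    return {"queue": name}
```

Treating a missing boolean as unsupported keeps the client safe against minimal implementations, which is the situation the text above warns about.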
== Access Control == CDMI exactly follows the ACL and ACE model used for file authorization operations by NFSv4. This also makes it compatible with Microsoft Windows systems. == Metadata == CDMI draws much of its metadata model from the XAM specification. Objects and containers have "storage system metadata", "data system metadata" and arbitrary user-specified metadata, in addition to the metadata maintained by an ordinary filesystem (atime etc.). == Queries == CDMI specifies a way for systems to support arbitrary queries against CDMI containers, with a rich set of comparison operators, including support for regular expressions. == Queues == CDMI supports the concept of persistent FIFO (first-in, first-out) queues. These are useful for job scheduling, order processing and other tasks in which lists of things must be processed in order. == Compliance == Both retention intervals and retention holds are supported by CDMI. A retention interval consists of a start time and a retention period. During this time interval, objects are preserved as immutable and may not be deleted. A retention hold is usually placed on an object because of judicial action and has the same effect: objects may not be changed or deleted until all holds placed on them are removed. == Billing == Summary information suitable for billing clients for on-demand services can be obtained by authorized users from systems that support it. == Serialization == Serialization of objects and containers allows export of all data and metadata on a system and importation of that data into another cloud system. == Foreign protocols == CDMI supports export of containers as NFS or CIFS shares. Clients that mount these shares see the container hierarchy as an ordinary filesystem directory hierarchy, and the objects in the containers as normal files. Metadata outside of ordinary filesystem metadata may or may not be exposed. Provisioning of iSCSI LUNs is also supported. 
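The retention rules in the Compliance section above reduce to a simple predicate: an object is immutable while its retention interval is running or while any hold remains. The sketch below models only that logic. The class and field names are hypothetical and do not reproduce how CDMI represents retention in object metadata.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional, Set


@dataclass
class RetentionState:
    """Hypothetical in-memory model of an object's retention settings."""
    interval_start: Optional[datetime] = None
    retention_period: Optional[timedelta] = None
    holds: Set[str] = field(default_factory=set)


def may_modify_or_delete(state, now):
    """Return True only if no hold exists and no retention interval is active."""
    if state.holds:
        return False  # every hold must be removed first
    if state.interval_start is not None and state.retention_period is not None:
        interval_end = state.interval_start + state.retention_period
        if state.interval_start <= now < interval_end:
            return False  # inside the retention interval
    return True


state = RetentionState(
    interval_start=datetime(2024, 1, 1),
    retention_period=timedelta(days=30),
)
during_interval = may_modify_or_delete(state, datetime(2024, 1, 15))
after_interval = may_modify_or_delete(state, datetime(2024, 2, 15))

state.holds.add("case-1")  # a hold blocks changes even after the interval ends
under_hold = may_modify_or_delete(state, datetime(2024, 2, 15))
```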
== Client SDKs == CDMI Reference Implementation, Droplet, libcdmi-java, libcdmi-python, .NET SDK

  • RFPolicy

    RFPolicy

    The RFPolicy outlines a method for contacting vendors about security vulnerabilities found in their products. It was initially written in 2000 by hacker and security consultant Rain Forest Puppy. It was perhaps the second disclosure policy, following Simple Nomad's. The policy gives the vendor five working days to respond to the reporter of the bug. If the vendor fails to respond within those five working days, or shuts down communication with the reporter, the policy recommends that the reporter disclose the issue to the general community. The reporter should help the vendor reproduce the bug and work out a fix, and should delay notifying the general community about the bug if the vendor provides feasible reasons for requiring a delay. When issuing an alert or fix, the vendor should give the reporter proper credit for reporting the bug. Context for the history of vulnerability disclosure is available in a history article.
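The five-working-day response window can be turned into a concrete date with a short calculation. This sketch rests on assumptions the text above does not settle: working days are taken to be Monday through Friday, public holidays are ignored, and counting starts on the day after first contact. The function name `response_deadline` is hypothetical.

```python
from datetime import date, timedelta


def response_deadline(contact_date, working_days=5):
    """Return the date on which the given number of working days has elapsed."""
    current = contact_date
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0..4 are Monday..Friday
            remaining -= 1
    return current


# A report sent on Friday 5 January 2024 skips the weekend while counting.
deadline = response_deadline(date(2024, 1, 5))
```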

  • Data independence

    Data independence

    Data independence is the type of data transparency that matters for a centralized DBMS. It refers to the immunity of user applications to changes made in the definition and organization of data. Application programs should not, ideally, be exposed to details of data representation and storage. The DBMS provides an abstract view of the data that hides such details. There are two types of data independence: physical and logical data independence. Data independence and operation independence together give the feature of data abstraction. == Logical data independence == The logical structure of the data is known as the 'schema definition'. In general, if a user application operates on a subset of the attributes of a relation, it should not be affected later when new attributes are added to the same relation. Logical data independence indicates that the conceptual schema can be changed without affecting the existing external schemas. == Physical data independence == The physical structure of the data is referred to as "physical data description". Physical data independence deals with hiding the details of the storage structure from user applications. The application should not be involved with these issues since, conceptually, there is no difference in the operations carried out against the data. Data independence can be described at three levels: Logical data independence: The ability to change the logical (conceptual) schema without changing the external schema (user view) is called logical data independence. For example, the addition or removal of entities, attributes, or relationships in the conceptual schema should be possible without having to change existing external schemas or rewrite existing application programs. Physical data independence: The ability to change the physical schema without changing the logical schema is called physical data independence. 
For example, a change to the internal schema, such as using a different file organization, storage structures, storage devices, or indexing strategy, should be possible without having to change the conceptual or external schemas. View level data independence: the view level is always independent, because no level exists above it that could be affected. == Data independence == Data independence can be explained as follows: each higher level of the data architecture is immune to changes of the next lower level of the architecture. The logical schema stays unchanged even though the storage space or type of some data is changed for reasons of optimization or reorganization. In this case the external schema does not change, although changes to the internal schema may be required because the physical schema was reorganized. Physical data independence is present in most database and file environments, in which details such as the hardware storage encoding, the exact location of data on disk, and the merging of records are hidden from the user. == Data independence types == The ability to modify a schema definition at one level without affecting the schema definition at the next higher level is called data independence. There are two levels of data independence: physical data independence and logical data independence. Physical data independence is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance. It means that the physical storage level can be changed without affecting the conceptual or external view of the data. The new changes are absorbed by mapping techniques. Logical data independence is the ability to modify the logical schema without causing application programs to be rewritten. Modifications at the logical level are necessary whenever the logical structure of the database is altered (for example, when money-market accounts are added to a banking system). 
Logical data independence means that if some columns are added to or removed from a table, the user views and programs should not have to change. For example, consider two users, A and B, who both select the fields "EmployeeNumber" and "EmployeeName". If user B adds a new column (e.g. salary) to the shared table, it will not affect the external view for user A, even though the schema of the underlying table has changed for both users. Logical data independence is more difficult to achieve than physical data independence, since application programs are heavily dependent on the logical structure of the data that they access.
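Both kinds of independence can be demonstrated with an in-memory SQLite database from Python's standard library. This is a small sketch rather than a general proof: the table name `employee`, the view name `employee_directory`, the index name and the sample rows are invented for illustration. The view plays the role of the external schema, adding a column is the logical change, and adding an index is the physical change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (EmployeeNumber INTEGER PRIMARY KEY, EmployeeName TEXT)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

# External schema: the only thing the application queries.
conn.execute(
    "CREATE VIEW employee_directory AS "
    "SELECT EmployeeNumber, EmployeeName FROM employee"
)


def directory_rows(connection):
    """The application's query; it is never rewritten below."""
    cursor = connection.execute(
        "SELECT EmployeeNumber, EmployeeName FROM employee_directory "
        "ORDER BY EmployeeNumber"
    )
    return cursor.fetchall()


before = directory_rows(conn)

# Logical change: a new attribute is added to the relation.
conn.execute("ALTER TABLE employee ADD COLUMN Salary REAL")
after_logical_change = directory_rows(conn)

# Physical change: a new access path is added; no schema visible to the user changes.
conn.execute("CREATE INDEX idx_employee_name ON employee (EmployeeName)")
after_physical_change = directory_rows(conn)
```

The application query returns the same rows at all three points. Removing a column that the view selects would break it, which illustrates why logical data independence is the harder of the two to guarantee.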

  • Manufacturing Automation Protocol

    Manufacturing Automation Protocol

    Manufacturing Automation Protocol (MAP) was a computer network standard released in 1982 for interconnection of devices from multiple manufacturers. It was developed by General Motors to combat the proliferation of incompatible communications standards used by suppliers of automation products such as programmable controllers. By 1985 demonstrations of interoperability were carried out and 21 vendors offered MAP products. In 1986 the Boeing corporation merged its Technical Office Protocol with the MAP standard, and the combined standard was referred to as "MAP/TOP". The standard was revised several times between the first issue in 1982 and MAP 3.0 in 1987, with significant technical changes that made interoperation between different revisions of the standard difficult. Although promoted and used by manufacturers such as General Motors, Boeing, and others, it lost market share to the contemporary Ethernet standard and was not widely adopted. Difficulties included changing protocol specifications, the expense of MAP interface links, and the speed penalty of a token-passing network. The token bus network protocol used by MAP became standardized as IEEE standard 802.4 but this committee disbanded in 2004 due to lack of industry attention.
