Information extraction

Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. Typically, this involves processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing, such as automatic annotation and content extraction from images, audio, video and documents, can also be seen as information extraction. Recent advances in NLP techniques have allowed for significantly improved performance compared to previous years. An example is the extraction of corporate mergers from newswire reports, as denoted by the formal relation $\operatorname{MergerBetween}(\mathrm{company}_1, \mathrm{company}_2, \mathrm{date})$, from an online news sentence such as: "Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp." A broad goal of IE is to allow computation to be done on the previously unstructured data. A more specific goal is to allow automated reasoning about the logical form of the input data. Structured data is semantically well-defined data from a chosen target domain, interpreted with respect to category and context. Information extraction is part of a greater puzzle which deals with the problem of devising automatic methods for text management, beyond its transmission, storage and display. The discipline of information retrieval (IR) has developed automatic methods, typically of a statistical flavor, for indexing large document collections and classifying documents. A complementary approach is that of natural language processing (NLP), which has solved the problem of modelling human language processing with considerable success when taking into account the magnitude of the task. In terms of both difficulty and emphasis, IE deals with tasks in between IR and NLP.
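The MergerBetween relation from the example sentence above can be illustrated with a deliberately naive, pattern-based extractor. This is only a sketch under stated assumptions: the names `MergerBetween`, `extract_merger` and the regular expression are hypothetical, the pattern covers just this one sentence shape, and the date slot keeps the raw temporal expression ("Yesterday") because resolving it would require the publication date of the article.

```python
import re
from typing import NamedTuple, Optional


class MergerBetween(NamedTuple):
    """One filled instance of the MergerBetween(company1, company2, date) relation."""
    company1: str
    company2: str
    date: str  # raw temporal expression, not a resolved calendar date


# Assumption: a company name is a run of capitalized words ending in "Inc." or "Corp."
_COMPANY = r"(?:[A-Z]\w* )+(?:Inc|Corp)\."

# Assumption: the sentence starts with a one-word temporal expression and a comma.
_PATTERN = re.compile(
    rf"^(?P<date>\w+), .*?(?P<c1>{_COMPANY}) announced (?:their|its) "
    rf"acquisition of (?P<c2>{_COMPANY})"
)


def extract_merger(sentence: str) -> Optional[MergerBetween]:
    """Return the relation instance found in the sentence, or None if the pattern does not apply."""
    match = _PATTERN.match(sentence)
    if match is None:
        return None
    return MergerBetween(match["c1"], match["c2"], match["date"])


sentence = ("Yesterday, New York based Foo Inc. announced their "
            "acquisition of Bar Corp.")
print(extract_merger(sentence))
```

Hand-written patterns like this break as soon as the wording changes, which is one reason practical IE systems tend to rely on statistical or learned models instead.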
In terms of input, IE assumes the existence of a set of documents in which each document follows a template, i.e. describes one or more entities or events in a manner that is similar to those in other documents but differing in the details. As an example, consider a group of newswire articles on Latin American terrorism, with each article presumed to be based upon one or more terrorist acts. We also define for any given IE task a template, which is a case frame (or a set of case frames) to hold the information contained in a single document. For the terrorism example, a template would have slots corresponding to the perpetrator, victim, and weapon of the terrorist act, and the date on which the event happened. An IE system for this problem is required to "understand" an attack article only enough to find data corresponding to the slots in this template. == History == Information extraction dates back to the late 1970s in the early days of NLP. An early commercial system from the mid-1980s was JASPER, built for Reuters by the Carnegie Group Inc with the aim of providing real-time financial news to financial traders. Beginning in 1987, IE was spurred by a series of Message Understanding Conferences (MUC). MUC was a competition-based conference series that focused on the following domains: MUC-1 (1987), MUC-2 (1989): Naval operations messages. MUC-3 (1991), MUC-4 (1992): Terrorism in Latin American countries. MUC-5 (1993): Joint ventures and microelectronics domain. MUC-6 (1995): News articles on management changes. MUC-7 (1998): Satellite launch reports. Considerable support came from the U.S. Defense Advanced Research Projects Agency (DARPA), which wished to automate mundane tasks performed by government analysts, such as scanning newspapers for possible links to terrorism. == Present significance == The present significance of IE pertains to the growing amount of information available in unstructured form.
Tim Berners-Lee, inventor of the World Wide Web, refers to the existing Internet as the web of documents and advocates that more of the content be made available as a web of data. Until this transpires, the web largely consists of unstructured documents lacking semantic metadata. Knowledge contained within these documents can be made more accessible for machine processing by means of transformation into relational form, or by marking up with XML tags. An intelligent agent monitoring a news data feed requires IE to transform unstructured data into something that can be reasoned with. A typical application of IE is to scan a set of documents written in a natural language and populate a database with the information extracted. == Tasks and subtasks == Applying information extraction to text is linked to the problem of text simplification in order to create a structured view of the information present in free text. The overall goal is to produce text that is more easily machine-readable, so that its sentences can be processed. Typical IE tasks and subtasks include: Template filling: Extracting a fixed set of fields from a document, e.g. extracting perpetrators, victims, time, etc. from a newspaper article about a terrorist attack. Event extraction: Given an input document, output zero or more event templates. For instance, a newspaper article might describe multiple terrorist attacks. Knowledge Base Population: Fill a database of facts given a set of documents. Typically the database is in the form of triplets, (entity 1, relation, entity 2), e.g. (Barack Obama, Spouse, Michelle Obama). Named entity recognition: recognition of known entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions, by employing existing knowledge of the domain or information extracted from other sentences. Typically the recognition task involves assigning a unique identifier to the extracted entity.
A simpler task is named entity detection, which aims at detecting entities without having any existing knowledge about the entity instances. For example, in processing the sentence "M. Smith likes fishing", named entity detection would denote detecting that the phrase "M. Smith" does refer to a person, but without necessarily having (or using) any knowledge about a certain M. Smith who is (or, "might be") the specific person whom that sentence is talking about. Coreference resolution: detection of coreference and anaphoric links between text entities. In IE tasks, this is typically restricted to finding links between previously extracted named entities. For example, "International Business Machines" and "IBM" refer to the same real-world entity. If we take the two sentences "M. Smith likes fishing. But he doesn't like biking", it would be beneficial to detect that "he" is referring to the previously detected person "M. Smith". Relationship extraction: identification of relations between entities, such as: PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.") PERSON located in LOCATION (extracted from the sentence "Bill is in France.") Semi-structured information extraction which may refer to any IE that tries to restore some kind of information structure that has been lost through publication, such as: Table extraction: finding and extracting tables from documents. Table information extraction : extracting information in structured manner from the tables. This task is more complex than table extraction, as table extraction is only the first step, while understanding the roles of the cells, rows, columns, linking the information inside the table and understanding the information presented in the table are additional tasks necessary for table information extraction. 
Comments extraction : extracting comments from the actual content of articles in order to restore the link between authors of each of the sentences Language and vocabulary analysis Terminology extraction: finding the relevant terms for a given corpus Audio extraction Template-based music extraction: finding relevant characteristic in an audio signal taken from a given repertoire; for instance time indexes of occurrences of percussive sounds can be extracted in order to represent the essential rhythmic component of a music piece. Note that this list is not exhaustive and that the exact meaning of IE activities is not commonly accepted and that many approaches combine multiple sub-tasks of IE in order to achieve a wider goal. Machine learning, statistical analysis and/or natural language processing are often used in IE. IE on non-text documents is becoming an increasingly interesting topic in research, and information extracted from multimedia documents can now be expressed in a high level structure as it is done on text. This naturally leads to the fusion of extracted information from multiple kinds of documents and sources. == World Wide Web applications == IE has been the focus of the MUC conferences. The proliferation of the Web, however, intensified the need for developing IE systems that help people

Active learning (machine learning)

Active learning is a special case of machine learning in which a learning algorithm can interactively query a human user (or some other information source) to label new data points with the desired outputs. The human user must possess expertise in the problem domain, including the ability to consult authoritative sources when necessary. In the statistics literature, it is sometimes also called optimal experimental design. The information source is also called teacher or oracle. There are situations in which unlabeled data is abundant but manual labeling is expensive. In such a scenario, learning algorithms can actively query the teacher for labels. Since the learner chooses the examples, the number of examples needed to learn a concept can often be much lower than the number required in normal supervised learning. However, there is a risk that the algorithm is overwhelmed by uninformative examples. Recent developments are dedicated to multi-label active learning, hybrid active learning and active learning in a single-pass (on-line) context, combining concepts from the field of machine learning (e.g. conflict and ignorance) with adaptive, incremental learning policies in the field of online machine learning. Active learning can speed up the development of a machine learning algorithm in cases where comparable updates would otherwise require a quantum computer or supercomputer. Large-scale active learning projects may benefit from crowdsourcing frameworks such as Amazon Mechanical Turk that include many humans in the active learning loop. == Definitions == Let $T$ be the total set of all data under consideration. For example, in a protein engineering problem, $T$ would include all proteins that are known to have a certain interesting activity and all additional proteins that one might want to test for that activity. During each iteration, $i$, $T$ is broken up into three subsets: $\mathbf{T}_{K,i}$: Data points where the label is known.
$\mathbf{T}_{U,i}$: Data points where the label is unknown. $\mathbf{T}_{C,i}$: A subset of $\mathbf{T}_{U,i}$ that is chosen to be labeled. Most of the current research in active learning involves the best method to choose the data points for $\mathbf{T}_{C,i}$. == Scenarios == Pool-based sampling: In this approach, which is the best-known scenario, the learning algorithm attempts to evaluate the entire dataset before selecting data points (instances) for labeling. It is often initially trained on a fully labeled subset of the data using a machine-learning method such as logistic regression or SVM that yields class-membership probabilities for individual data instances. The candidate instances are those for which the prediction is most ambiguous. Instances are drawn from the entire data pool and assigned a confidence score, a measurement of how well the learner "understands" the data. The system then selects the instances for which it is the least confident and queries the teacher for the labels. The theoretical drawback of pool-based sampling is that it is memory-intensive and is therefore limited in its capacity to handle enormous datasets, but in practice, the rate-limiting factor is that the teacher is typically a (fatiguable) human expert who must be paid for their effort, rather than computer memory. Stream-based selective sampling: Here, each consecutive unlabeled instance is examined one at a time, with the machine evaluating the informativeness of each item against its query parameters. The learner decides for itself whether to assign a label or query the teacher for each datapoint. In contrast to pool-based sampling, the obvious drawback of stream-based methods is that the learning algorithm does not have sufficient information, early in the process, to make a sound assign-label-versus-ask-teacher decision, and it does not capitalize as efficiently on the presence of already labeled data.
Therefore, the teacher is likely to spend more effort in supplying labels than with the pool-based approach. Membership query synthesis: This is where the learner generates synthetic data from an underlying natural distribution. For example, if the dataset consists of pictures of humans and animals, the learner could send a clipped image of a leg to the teacher and query whether this appendage belongs to an animal or a human. This is particularly useful if the dataset is small. The challenge here, as with all synthetic-data-generation efforts, is in ensuring that the synthetic data is consistent in terms of meeting the constraints on real data. As the number of variables/features in the input data increases, and strong dependencies between variables exist, it becomes increasingly difficult to generate synthetic data with sufficient fidelity. For example, to create a synthetic data set for human laboratory-test values, the sum of the various white blood cell (WBC) components in a white blood cell differential must equal 100, since the component numbers are really percentages. Similarly, the enzymes alanine transaminase (ALT) and aspartate transaminase (AST) measure liver function (though AST is also produced by other tissues, e.g., lung, pancreas). A synthetic data point with AST at the lower limit of normal range (8–33 units/L) with an ALT several times above normal range (4–35 units/L) in a simulated chronically ill patient would be physiologically impossible. == Query strategies == Algorithms for determining which data points should be labeled can be organized into a number of different categories, based upon their purpose: Balance exploration and exploitation: the choice of examples to label is seen as a dilemma between the exploration and the exploitation over the data space representation. This strategy manages this compromise by modelling the active learning problem as a contextual bandit problem. For example, Bouneffouf et al.
propose a sequential algorithm named Active Thompson Sampling (ATS), which, in each round, assigns a sampling distribution on the pool, samples one point from this distribution, and queries the oracle for this sample point label. Expected model change: label those points that would most change the current model. Expected error reduction: label those points that would most reduce the model's generalization error. Exponentiated Gradient Exploration for Active Learning: In this paper, the author proposes a sequential algorithm named exponentiated gradient (EG)-active that can improve any active learning algorithm by an optimal random exploration. Uncertainty sampling: label those points for which the current model is least certain as to what the correct output should be. Query by committee: a variety of models are trained on the current labeled data, and vote on the output for unlabeled data; label those points for which the "committee" disagrees the most Querying from diverse subspaces or partitions: When the underlying model is a forest of trees, the leaf nodes might represent (overlapping) partitions of the original feature space. This offers the possibility of selecting instances from non-overlapping or minimally overlapping partitions for labeling. Variance reduction: label those points that would minimize output variance, which is one of the components of error. Conformal prediction: predicts that a new data point will have a label similar to old data points in some specified way and degree of the similarity within the old examples is used to estimate the confidence in the prediction. Mismatch-first farthest-traversal: The primary selection criterion is the prediction mismatch between the current model and nearest-neighbour prediction. It targets on wrongly predicted data points. The second selection criterion is the distance to previously selected data, the farthest first. It aims at optimizing the diversity of selected data. 
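Uncertainty sampling, one of the strategies listed here, is simple enough to sketch in a few lines. The following is a minimal illustration rather than a reference implementation: `least_confident` and `toy_proba` are hypothetical names, and the one-dimensional logistic model merely stands in for whatever probabilistic classifier has been trained on the labeled data.

```python
import math
from typing import Callable, List, Sequence


def least_confident(
    pool: Sequence[float],
    predict_proba: Callable[[float], Sequence[float]],
    k: int = 1,
) -> List[int]:
    """Return indices of the k pool items whose most likely class has the lowest probability."""
    scored = []
    for idx, x in enumerate(pool):
        confidence = max(predict_proba(x))  # probability of the predicted class
        scored.append((confidence, idx))
    scored.sort()  # least confident first
    return [idx for _, idx in scored[:k]]


def toy_proba(x: float) -> List[float]:
    """Hypothetical two-class model: logistic curve with its decision boundary at x = 0."""
    p = 1.0 / (1.0 + math.exp(-x))
    return [1.0 - p, p]


# Unlabeled pool: the points closest to the boundary are the most ambiguous.
pool = [-3.0, -0.2, 0.1, 2.5]
to_label = least_confident(pool, toy_proba, k=2)
print(to_label)  # indices of the instances that would be sent to the teacher
```

In a full active learning loop the selected instances would be labeled by the teacher, moved into the labeled set, and the model retrained before the next round of selection.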
User-centered labeling strategies: Learning is accomplished by applying dimensionality reduction to graphs and figures like scatter plots. Then the user is asked to label the compiled data (categorical, numerical, relevance scores, relation between two instances). A wide variety of algorithms have been studied that fall into these categories. While the traditional AL strategies can achieve remarkable performance, it is often challenging to predict in advance which strategy is the most suitable in a particular situation. In recent years, meta-learning algorithms have been gaining in popularity. Some of them have been proposed to tackle the problem of learning AL strategies instead of relying on manually designed strategies. A benchmark which compares 'meta-learning approaches to active learning' to 'traditional heuristic-based Active Learning' may give intuitions if 'Learning active learning' is at the crossroads == Minimum marginal hyperplane == Some active learning algorithms are built upon support-vector machines (SVMs) and exploit the structure of the SVM to determine which data points to label. Such methods usually calculate the margin, W, of each u

Content adaptation

Content adaptation is the action of transforming content to adapt to device capabilities. Content adaptation is usually related to mobile devices, which require special handling because of their limited computational power, small screen size, and constrained keyboard functionality. Content adaptation could roughly be divided into two fields: Media content adaptation that adapts media files. Browsing content adaptation that adapts a website to mobile devices. == Browsing content adaptation == Advances in the capabilities of small, mobile devices such as mobile phones (cell phones) and Personal Digital Assistants have led to an explosion in the number of types of device that can now access the Web. Some commentators refer to the Web that can be accessed from mobile devices as the Mobile Web. The sheer number and variety of Web-enabled devices poses significant challenges for authors of websites who want to support access from mobile devices. The W3C Device Independence Working Group described many of the issues in its report Authoring Challenges for Device Independence. Content adaptation is one approach to a solution. Rather than requiring authors to create pages explicitly for each type of device that might request them, content adaptation transforms an author's materials automatically. For example, content might be converted from a device-independent markup language, such as XDIME, an implementation of the W3C's DIAL specification, into a form suitable for the device, such as XHTML Basic, C-HTML, or WML. Similarly, a suitable device-specific CSS style sheet or a set of in-line styles might be generated from abstract style definitions. Likewise, a device-specific layout might be generated from abstract layout definitions. Once created, the device-specific materials form the response returned to the device from which the request was made. Another approach is responsive web design (RWD), a more recent trend based on CSS.
Content adaptation requires a processor that performs the selection, modification, and generation of materials to form the device-specific result. IBM's Websphere Everyplace Mobile Portal (WEMP), BEA Systems' WebLogic Mobility Server, Morfeo's MyMobileWeb, and Apache Cocoon are examples of such processors. WURFL and WALL are popular open source tools for content adaptation. WURFL is an XML-based Device Description Repository with APIs to access the data in Java and PHP (and other popular programming languages). WALL (Wireless Abstraction Library) lets a developer author mobile pages which look like plain HTML, but converts them to WML, C-HTML, or XHTML Mobile Profile, depending on the capabilities of the device from which the HTTP request originates. GreasySpoon lets the developer build plugins for content editing in JavaScript, Ruby, and other languages, much like the Firefox extension GreaseMonkey. Alembik (Media Transcoding Server) is a Java (J2EE) application providing transcoding services for a variety of clients and for different media types (image, audio, video, etc.). It is fully compliant with OMA's Standard Transcoder Interface specification and is distributed under the LGPL open source license. In 2007, the first large-scale carrier-grade deployments of content transformation on existing mass-market handsets, with no software download required, were rolled out by Vodafone in the UK and globally for Yahoo! oneSearch, using the Novarra Vision solution. Novarra's content adaptation solution had been used in enterprise intranet deployments as early as 2003 (at that time, the platform was named "Engines for Wireless Data"). InfoGin is a 9-year-old content-adaptation company with customers such as Vodafone, Orange, Telefónica and PCCW. Its offerings include the patented "Web to Mobile adaptation", the Mobile Matrix Transcoder, multimedia and document transcoders, and video adaptation.
Launched in 2007, Bytemobile's Web Fidelity Service was another carrier-grade, commercial infrastructure solution, which provided wireless content adaptation to mobile subscribers on their existing mass-market handsets, with no client download required.
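The capability-driven choice of output markup described above for tools such as WALL can be sketched as follows. This is an illustrative assumption rather than the API of WALL, WURFL or any other product named here: the `DeviceCapabilities` record, its two flags, the `choose_markup` function and the preference order are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceCapabilities:
    """Hypothetical capability record; a real device description repository holds far more fields."""
    supports_xhtml_mp: bool
    supports_chtml: bool


def choose_markup(caps: DeviceCapabilities) -> str:
    """Pick the target markup for a response based on what the requesting device can render."""
    # Assumed preference order: richest supported markup first, WML as the fallback.
    if caps.supports_xhtml_mp:
        return "XHTML Mobile Profile"
    if caps.supports_chtml:
        return "C-HTML"
    return "WML"


# Three imaginary handsets with different capabilities.
for caps in (
    DeviceCapabilities(supports_xhtml_mp=True, supports_chtml=True),
    DeviceCapabilities(supports_xhtml_mp=False, supports_chtml=True),
    DeviceCapabilities(supports_xhtml_mp=False, supports_chtml=False),
):
    print(caps, "->", choose_markup(caps))
```

In a deployed processor the capability record would typically be looked up from the request headers, and the chosen markup would then drive the transformation of the author's device-independent content.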

Locative media

Locative media or location-based media (LBM) is a virtual medium of communication functionally bound to a location. The physical implementation of locative media, however, is not bound to the same location to which the content refers. Location-based media delivers multimedia and other content directly to the user of a mobile device dependent upon their location. Location information determined by means such as mobile phone tracking and other emerging real-time locating system technologies like Wi-Fi or RFID can be used to customize media content presented on the device. Locative media are digital media applied to real places and thus triggering real social interactions. While mobile technologies such as the Global Positioning System (GPS), laptop computers and mobile phones enable locative media, they are not the goal for the development of projects in this field. == Description == Media content is managed and organized externally of the device on a standard desktop, laptop, server, or cloud computing system. The device then downloads this formatted content with GPS or other RTLS coordinate-based triggers applied to each media sequence. As the location-aware device enters the selected area, centralized services trigger the assigned media, designed to be of optimal relevance to the user and their surroundings. Use of locative technologies "includes a range of experimental uses of geo-technologies including location-based games, artistic critique of surveillance technologies, experiential mapping, and spatial annotation." Location based media allows for the enhancement of any given environment offering explanation, analysis and detailed commentary on what the user is looking at through a combination of video, audio, images and text. The location-aware device can deliver interpretation of cities, parklands, heritage sites, sporting events or any other environment where location based media is required. 
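The coordinate-based triggering described in this section can be sketched as a simple circular geofence check. This is a minimal illustration under stated assumptions: the `MediaTrigger` record, the function names, the trigger names and the coordinates are all hypothetical, and the check runs locally here even though the text describes centralized services performing it.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class MediaTrigger:
    """Hypothetical trigger: a media item bound to a circular area around a coordinate."""
    name: str
    lat: float
    lon: float
    radius_m: float


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two latitude/longitude points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def triggered(triggers: Sequence[MediaTrigger], lat: float, lon: float) -> List[str]:
    """Return the names of all media items whose area contains the device position."""
    return [
        t.name
        for t in triggers
        if haversine_m(lat, lon, t.lat, t.lon) <= t.radius_m
    ]


# Made-up triggers and device positions.
triggers = [
    MediaTrigger("fountain_audio", 48.0, 11.0, radius_m=100.0),
    MediaTrigger("gate_video", 48.01, 11.0, radius_m=50.0),
]
print(triggered(triggers, 48.0, 11.0))
print(triggered(triggers, 48.005, 11.0))
```

A circular area is only one possible trigger shape; polygonal regions or RFID and Wi-Fi proximity events could play the same role.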
The content production and pre-production are integral to the overall experience that is created and must be performed with careful consideration of the location and the user's position within that location. The media offers a depth to the environment beyond that which is immediately apparent, allowing revelations about background, history and current topical feeds. == Locative, ubiquitous and pervasive computing == The term 'locative media' was coined by Karlis Kalnins. Locative media is closely related to augmented reality (reality overlaid with virtual reality) and pervasive computing (computers everywhere, as in ubiquitous computing). Whereas augmented reality strives for technical solutions, and pervasive computing is interested in embedded computers, locative media concentrates on social interaction with a place and with technology. Many locative media projects have a social, critical or personal (memory) background. Strictly speaking, any kind of link to additional information set up in space (together with the information that a specific place supplies) would make up location-dependent media, but the term locative media is strictly bound to technical projects. Locative media works on locations and yet many of its applications are still location-independent in a technical sense. As in the case of digital media, where the medium itself is not digital but the content is digital, in locative media the medium itself might not be location-oriented, whereas the content is location-oriented. Japanese mobile phone culture embraces location-dependent information and context-awareness. It is projected that in the near future locative media will develop into a significant factor in everyday life. == Enabling technologies == Locative media projects use technology such as Global Positioning System (GPS), laptop computers, the mobile phone, Geographic Information System (GIS), and web map services such as Mapbox, OpenStreetMap, and Google Maps among others.
Whereas GPS allows for the accurate detection of a specific location, mobile computers allow interactive media to be linked to this place. The GIS supplies arbitrary information about the geological, strategic or economic situation of a location. Web maps like Google Maps give a visual representation of a specific place. Another important new technology that links digital data to a specific place is radio-frequency identification (RFID), a successor to barcodes like Semacode. Research that contributes to the field of locative media happens in fields such as pervasive computing, context awareness and mobile technology. The technological background of locative media is sometimes referred to as "location-aware computing". == Creative representation == Place is often seen as central to creativity; in fact, "for some—regional artists, citizen journalists and environmental organizations for example—a sense of place is a particularly important aspect of representation, and the starting point of conversations." Locative media can propel such conversations in its function as a "poetic form of data visualization," as its output often traces how people move in, and by proxy, make sense of, urban environments. Given the dynamism and hybridity of cities and the networks which comprise them, locative media extends the internet landscape to physical environments where people forge social relations and actions which can be "mobile, plural, differentiated, adventurous, innovative, but also estranged, alienated, impersonalized." Moreover, in using locative technologies, users can expand how they communicate and assert themselves in their environment and, in doing so, explore this continuum of urban interactions. Furthermore, users can assume a more active role in constructing the environments they are situated in accordingly. 
In turn, artists have been intrigued with locative media as a means of "user-led mapping, social networking and artistic interventions in which the fabric of the urban environment and the contours of the earth become a 'canvas.'" Such projects demystify how resident behaviors in a given city contribute to the culture and sense of personality that cities are often perceived to take on. Design scholars Anne Galloway and Matthew Ward state that "various online lists of pervasive computing and locative media projects draw out the breadth of current classification schema: everything from mobile games, place-based storytelling, spatial annotation and networked performances to device-specific applications." A prominent use of locative media is in locative art. A sub-category of interactive art or new media art, locative art explores the relationships between the real world and the virtual or between people, places or objects in the real world. == Examples == Notable locative media projects include Bio Mapping by Christian Nold in 2004, locative art projects such as the SpacePlace ZKM/ZKMax bluecasting and participatory urban media access in Munich in 2005 and Britglyph by Alfie Dennen in 2009, and location-based games such as AR Quake by the Wearable Computer Lab at the University of South Australia and Can You See Me Now? in 2001 by Blast Theory in collaboration with the Mixed Reality Lab at the University of Nottingham. In 2005, the Silicon Valley–based collaborators of C5 first exhibited the C5 Landscape Initiative, a suite of four GPS inspired projects that investigate perception of landscape in light of locative media. In William Gibson's 2007 novel Spook Country, locative art is one of the main themes and set pieces in the story. Narrative projects which engage with locative media are sometimes referred to as Location-Aware Fiction, as explored in "Data and Narrative: Location Aware Fiction" a 2003 essay by Kate Armstrong. 
This location-aware fiction is also known as locative literature, where locative stories and poems can be experienced via digital portals, apps, QR codes and e-books, as well as via analogue forms such as labelling tape, Scrabble tiles, fridge magnets or Post-It notes, and these are forms often used by the writer and artist Matt Blackwood. The Transborder Immigrant Tool by the Electronic Disturbance Theater is a locative media project aimed at providing life saving directions to water for people trying to cross the US / Mexico border. The project attracted global media attention in 2009 and 2010. Articles included a Los Angeles Times cover story focusing on Ricardo Dominguez and an AP story interviewing Micha Cárdenas and Brett Stalbaum. The articles focused on concerns over the legality of the project and the ensuing investigations of the group, which are still underway. The Transborder Immigrant Tool has recently been included in a number of major exhibitions including Here, Not There at the Museum of Contemporary Art San Diego and the 2010 California Biennial at the Orange County Museum of Art. Invisible Threads by Stephanie Rothenberg and Jeff Crouse is a locative media project aimed at creating embodied awareness of sweatshops and just-in-time production t

Electronics

Electronics is a scientific and engineering discipline that studies and applies the principles of physics to design, create, and operate devices that manipulate electrons and other electrically charged particles. It is a subfield of physics and electrical engineering which uses active devices such as transistors, diodes, and integrated circuits to control and amplify the flow of electric current and to convert it from one form to another, such as from alternating current (AC) to direct current (DC) or from analog signals to digital signals. Electronic devices have significantly influenced the development of many aspects of modern society, such as telecommunications, entertainment, education, health care, industry, and security. The main driving force behind the advancement of electronics is the semiconductor industry, which continually produces ever-more sophisticated electronic devices and circuits in response to global demand. The semiconductor industry is one of the global economy's largest and most profitable industries, with annual revenues exceeding $481 billion in 2018. The electronics industry also encompasses other branches that rely on electronic devices and systems, such as e-commerce, which generated over $29 trillion in online sales in 2017. == History and development == Karl Ferdinand Braun's development of the crystal detector, the first semiconductor device, in 1874 and the identification of the electron in 1897 by Sir Joseph John Thomson, along with the subsequent invention of the vacuum tube which could amplify and rectify small electrical signals, inaugurated the field of electronics and the electron age. Practical applications started with the invention of the diode by Ambrose Fleming and the triode by Lee De Forest in the early 1900s, which made the detection of small electrical voltages, such as radio signals from a radio antenna, practicable. 
Vacuum tubes (thermionic valves) were the first active electronic components which controlled current flow by influencing the flow of individual electrons, and enabled the construction of equipment that used current amplification and rectification to give us radio, television, radar, long-distance telephony and much more. The early growth of electronics was rapid, and by the 1920s, commercial radio broadcasting and telecommunications were becoming widespread and electronic amplifiers were being used in such diverse applications as long-distance telephony and the music recording industry. The next big technological step took several decades to appear, when the first working point-contact transistor was invented by John Bardeen and Walter Houser Brattain at Bell Labs in 1947. However, vacuum tubes continued to play a leading role in the field of microwave and high power transmission as well as television receivers until the middle of the 1980s. Since then, solid-state devices have all but completely taken over. Vacuum tubes are still used in some specialist applications such as high power RF amplifiers, cathode-ray tubes, specialist audio equipment, guitar amplifiers and some microwave devices. In April 1955, the IBM 608 was the first IBM product to use transistor circuits without any vacuum tubes and is believed to be the first all-transistorized calculator to be manufactured for the commercial market. The 608 contained more than 3,000 germanium transistors. Thomas J. Watson Jr. ordered all future IBM products to use transistors in their design. From that time on, transistors were almost exclusively used for computer logic circuits and peripheral devices. However, early junction transistors were relatively bulky devices that were difficult to manufacture on a mass-production basis, which limited them to a number of specialised applications. The MOSFET was invented at Bell Labs between 1955 and 1960. 
It was the first truly compact transistor that could be miniaturised and mass-produced for a wide range of uses. Its advantages include high scalability, affordability, low power consumption, and high density. It revolutionized the electronics industry, becoming the most widely used electronic device in the world. The MOSFET is the basic element in most modern electronic equipment. As the complexity of circuits grew, problems arose. One problem was the size of the circuit. A complex circuit like a computer was dependent on speed. If the components were large, the wires interconnecting them had to be long. The electric signals took time to travel through the circuit, thus slowing the computer. The invention of the integrated circuit by Jack Kilby and Robert Noyce solved this problem by making all the components and the chip out of the same block (monolith) of semiconductor material. The circuits could be made smaller, and the manufacturing process could be automated. This led to the idea of integrating all components on a single-crystal silicon wafer, which led to small-scale integration (SSI) in the early 1960s, and then medium-scale integration (MSI) in the late 1960s, followed by very-large-scale integration (VLSI). In 2008, billion-transistor processors became commercially available. == Subfields == == Devices and components == An electronic component is any component, either active or passive, in an electronic system or electronic device. Components are connected together, usually by being soldered to a printed circuit board (PCB), to create an electronic circuit with a particular function. Components may be packaged singly or in more complex groups as integrated circuits. Passive electronic components include capacitors, inductors, and resistors, whilst active components are semiconductor devices such as transistors and thyristors, which control current flow at the electron level. == Types of circuits == Electronic circuit functions can be divided into two function groups: analog and digital. 
A particular device may consist of circuitry of either type or a mix of the two. Analog circuits are becoming less common, as many of their functions are being digitized. === Analog circuits === Analog circuits use a continuous range of voltage or current for signal processing, as opposed to the discrete levels used in digital circuits. Analog circuits were common throughout electronic devices in the early years, in devices such as radio receivers and transmitters. Analog electronic computers were valuable for solving problems with continuous variables until digital processing advanced. As semiconductor technology developed, many of the functions of analog circuits were taken over by digital circuits, and modern circuits that are entirely analog are less common; their functions have been replaced by a hybrid approach which, for instance, uses analog circuits at the front end of a device receiving an analog signal and then applies digital processing using microprocessor techniques thereafter. Sometimes it may be difficult to classify some circuits that have elements of both linear and non-linear operation. An example is the voltage comparator, which receives a continuous range of voltage but only outputs one of two levels, as in a digital circuit. Similarly, an overdriven transistor amplifier can take on the characteristics of a controlled switch, having essentially two levels of output. Analog circuits are still widely used for signal amplification, such as in the entertainment industry, and for conditioning signals from analog sensors, such as in industrial measurement and control. === Digital circuits === Digital circuits are electric circuits based on discrete voltage levels. Digital circuits use Boolean algebra and are the basis of all digital computers and microprocessor devices. They range from simple logic gates to large integrated circuits, employing millions of such gates. 
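As an illustration of how Boolean algebra on two discrete levels builds up larger digital functions, the following is a minimal Python sketch that composes AND, OR and XOR gates into a full adder and then chains full adders into a ripple-carry adder. The function names and the default 4-bit width are illustrative assumptions, not part of any standard; real hardware is described in a hardware description language rather than Python.

```python
# Minimal sketch: composing logic gates into an adder.
# All names here are hypothetical and chosen for illustration only.

def and_gate(a: int, b: int) -> int:
    return a & b

def or_gate(a: int, b: int) -> int:
    return a | b

def xor_gate(a: int, b: int) -> int:
    return a ^ b

def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum_bit, carry_bit) for two input bits."""
    return xor_gate(a, b), and_gate(a, b)

def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum_bit, carry_out) for two input bits plus a carry."""
    partial_sum, carry1 = half_adder(a, b)
    sum_bit, carry2 = half_adder(partial_sum, carry_in)
    return sum_bit, or_gate(carry1, carry2)

def ripple_carry_add(x: int, y: int, width: int = 4) -> int:
    """Add two unsigned integers bit by bit using chained full adders.

    The final carry is kept as an extra top bit, so the result can be
    one bit wider than the inputs.
    """
    carry = 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result | (carry << width)
```

Each gate operates only on the values 0 and 1, so the whole adder is expressed in Boolean operations; the loop stands in for the physical chain of full adders through which the carry ripples.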
Digital circuits use a binary system with two voltage levels labelled 0 and 1 to indicate logical status. Often logic 0 will be a lower voltage and referred to as Low while logic 1 is referred to as High. However, some systems use the reverse definition (0 is High) or are current based. Quite often, the logic designer may reverse these definitions from one circuit to the next as they see fit to facilitate their design. The definition of the levels as 0 or 1 is arbitrary. Ternary (three-state) logic has been studied, and some prototype computers have been made, but it has not gained any significant practical acceptance. Universally, computers and digital signal processors are constructed with digital logic circuits using transistors such as MOSFETs in the electronic logic gates to generate binary states. Common building blocks include logic gates, adders, flip-flops, counters, registers, multiplexers, and Schmitt triggers. Highly integrated devices include memory chips, microprocessors, microcontrollers, application-specific integrated circuits (ASICs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs), field-programmable analog arrays (FPAAs), and systems on chip (SoCs). == Design == Electronic systems design deals with the multi-disciplinary design issues of complex electronic devices and systems, such as mob

Weibo

Weibo (Chinese: 微博; pinyin: Wēibó), or Sina Weibo (Chinese: 新浪微博; pinyin: Xīnlàng Wēibó), is a Chinese microblogging (weibo) website. Launched by Sina Corporation on 14 August 2009, it is one of the biggest social media platforms in China, with over 582 million monthly active users (252 million daily active users) as of Q1 2022. The platform has been highly successful but has faced criticism for heavy censorship. Sina had gone public on the Nasdaq in 2000. In March 2014, Sina announced a spinoff of Weibo and filed an IPO under the symbol WB. Sina carved out 11% of Weibo in the IPO, with Alibaba owning 32% post-IPO. The company began trading publicly on 17 April 2014. In March 2017, Sina launched Sina Weibo International Version. In November 2018, Sina Weibo suspended its registration function for minors under the age of 14. In July 2019, Sina Weibo announced that it would launch a two-month campaign to clean up pornographic and vulgar information, named "Project Deep Blue" (蔚蓝计划). On 29 September 2020, the company announced it would go private again due to rising tensions between the US and China. == Name == "Weibo" (微博) is the Chinese word for "microblog". Sina Weibo launched its new domain name weibo.com on 7 April 2011, deactivating and redirecting from the old domain, t.sina.com.cn, to the new one. Due to its popularity, the media sometimes refers to the platform simply as "Weibo", despite the numerous other Chinese microblogging services including Tencent Weibo, Sohu Weibo, and NetEase Weibo. However, the latter three have stopped providing services. == Background == Sina Weibo is a platform based on fostering user relationships to share, disseminate, and receive information. Through the website or the mobile app, users can upload pictures and videos publicly for instant sharing, with other users being able to comment with text, pictures and videos, or use a multimedia instant messaging service. 
The company initially invited a large number of celebrities to join the platform and has since invited many media personalities, government departments, businesses and non-governmental organizations to open accounts for the purpose of publishing and communicating information. To avoid the impersonation of celebrities, Sina Weibo uses verification symbols; celebrity accounts have an orange letter "V" and organizations' accounts have a blue letter "V". Sina Weibo has more than 500 million registered users; out of these, 313 million are monthly active users, 85% use the Weibo mobile app, 70% are college-aged, 50.10% are male and 49.90% are female. There are over 100 million messages posted by users each day. With more than 100 million followers, actress Xie Na holds the record for the most followers on the platform. Despite fierce competition among Chinese social media platforms, Sina Weibo remains the most popular. == History == After the July 2009 Ürümqi riots, China shut down most domestic microblogging services, including Fanfou, the very first weibo service. Many popular non-China-based microblogging services like Twitter, Facebook, and Plurk have since been blocked. Sina Corporation CEO Charles Chao considered this to be an opportunity, and on 14 August 2009, Sina launched a test version of Sina Weibo. Basic functions including messages, private messages, comments and reposting were made available that September. A Sina Weibo–compatible API platform for developing third-party applications was launched on 28 July 2010. On 1 December 2010, the website experienced an outage, which administrators later said was due to the ever-increasing numbers of users and posts. Registered users surpassed 100 million in February 2011. Since 23 March 2011, t.cn has been used as Sina Weibo's official shortened URL in lieu of sinaurl.cn. On 7 April 2011, weibo.com replaced t.sina.com.cn as the new main domain name used by the website. 
The official logo was also updated. In June 2011, Sina announced an English-language version of Sina Weibo would be developed and launched, though content would still be governed by Chinese law. On 11 January 2013, Sina Weibo and Alibaba China (a subsidiary of Alibaba Group) signed a strategic cooperation agreement. With more and more foreign celebrities, especially Korean ones, using Sina Weibo, language translation has become an urgent need for Chinese users who wish to communicate with their idols online. In January 2013, Sina Weibo and NetEase.com announced that they had reached a strategic cooperation agreement. When users browse foreign language content, they can now directly obtain translation results through the YouDao Dictionary. The Sina Weibo financial report in February 2013 showed that its total revenue was approximately US$66 million and that the number of registered users had exceeded the 500 million mark. In April 2013, Sina officially announced that Sina Weibo had signed a strategic cooperation agreement with Alibaba. The two sides conducted in-depth cooperation in areas such as user account interoperability, data exchange, online payment, and internet marketing. At the same time, Sina announced that Alibaba, through its wholly owned subsidiary, had purchased the preferred shares and common shares issued by Sina Weibo Company for US$586 million, which accounted for approximately 18% of Weibo's total shares on a fully diluted basis. === Ownership === On 9 April 2013, Alibaba Group announced that it would acquire 18% of Sina Weibo for US$586 million, with the option to buy up to 30% in the future. Alibaba exercised this option when Weibo was listed on the NASDAQ in April 2014. == Users == According to iResearch's report on 30 March 2011, Sina Weibo had 56.5% of China's microblogging market based on active users and 86.6% based on browsing time over competitors such as Tencent Weibo and Baidu. 
According to research by Sina Corporation, the number of active users reached over 400 million by Q1 2018, making Sina Weibo the 7th platform with at least 400 million active users, and daily usage increased by 21%. As of 2017, approximately 80% of its users were in their 20s and 30s. The top 100 users had over 485 million followers combined. More than 5,000 companies and 2,700 media organizations in China use Sina Weibo. The site is maintained by a growing microblogging department of 200 employees responsible for technology, design, operations, and marketing. Sina executives invited and persuaded many Chinese celebrities to join the platform. Users now include Asian celebrities, movie stars, singers, famous business and media figures, athletes, scholars, artists, organizations, religious figures, government departments, and officials from Hong Kong, Mainland China, Malaysia, Singapore, Taiwan, and Macau, as well as some famous foreign individuals and organizations, including Kevin Rudd, Boris Johnson, David Cameron, Narendra Modi, Toshiba, and the Germany national football team. Sina Weibo has a verification program for known people and organizations. Once an account is verified, a verification badge is added beside the account name. == Features == Many of Sina Weibo's features resemble those of Twitter. A user may post with a 140-character limit (increased to 2,000 as of January 2016 with the exception of reposts and comments). An analysis of 29 million Weibo posts found the median length was 14 characters. Users may mention or talk to other people using "@UserName" formatting, add hashtags, follow other users to make their posts appear in one's own timeline, re-post with "//@UserName" similar to Twitter's retweet function "RT @UserName", select posts for one's favorites list, and verify the account if the user is a celebrity, brand, business or otherwise of public interest. URLs are automatically shortened using the domain name t.cn, akin to Twitter's t.co. 
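The "@UserName" mention and "//@UserName" re-post conventions described above lend themselves to simple pattern matching. The following Python sketch separates direct mentions from the chain of re-posted users in a post's text. The regular expressions, function names, and sample text are assumptions made for illustration; they are not Sina Weibo's actual parsing rules, and the set of characters allowed in real user names may differ.

```python
import re

# Hypothetical patterns: a user name is assumed to consist of word
# characters (including Chinese characters) and hyphens.
REPOST_RE = re.compile(r"//@([\w-]+)")
# A direct mention is an "@UserName" that is not preceded by "//".
MENTION_RE = re.compile(r"(?<!//)@([\w-]+)")

def extract_mentions(text: str) -> list[str]:
    """Return user names addressed with plain '@UserName'."""
    return MENTION_RE.findall(text)

def extract_repost_chain(text: str) -> list[str]:
    """Return user names marked with '//@UserName', in order of appearance."""
    return REPOST_RE.findall(text)

# Illustrative sample text, not taken from a real post.
sample = "Great show @Alice! //@Bob: agreed //@Carol: original post"
mentions = extract_mentions(sample)
chain = extract_repost_chain(sample)
```

The negative lookbehind keeps the two conventions apart, so a name that appears after "//" is counted only as part of the re-post chain and not as a direct mention.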
Official and third-party applications can access Sina Weibo from other websites or platforms. Users may submit up to 18 image or video files in every post; send personal messages to followers; follow others and be followed; post "stories" like on Instagram; react to posts using different emojis; receive monetary rewards that can be used in a digital store linked to Weibo; view posts identified as "hot" or popular; and display the location they post from. Hashtags differ slightly between Sina Weibo and Twitter, using the double-hashtag "#HashName#" format (the lack of spacing between Chinese characters necessitates a closing tag). Users can own a hashtag by requesting hashtag monitoring; the company reviews these requests and responds within one to three days. Once a user owns a hashtag, they have access to a wide variety of functions available only to them on the condition that they remain active (posting less than once per calendar week revokes these privileges). Additionally, comments appear as a list below each post. A commenter can also choose to re-post the comment, quoting the whole original post, to their own page. Unregistered users can only browse a few post

Influence-for-hire

Influence-for-hire, or collective influence, refers to the economy that has emerged around buying and selling influence on social media platforms. == Overview == Companies that engage in the influence-for-hire industry range from content farms to high-end public relations agencies. Traditionally, influence operations have largely been confined to public sector actors like intelligence agencies; in the influence-for-hire industry, the groups conducting the operations are private, with commerce being their primary consideration. However, many of the clients in the influence-for-hire industry are countries or countries acting through proxies. The companies are often located in countries with less expensive digital labor. == History == In May 2021, Facebook took a Ukrainian influence-for-hire network offline. Facebook attributed the network to organizations and consultants linked to Ukrainian politicians including Andriy Derkach. During the COVID-19 pandemic, state-sponsored misinformation was spread through influence-for-hire networks. In August 2021, a report published by the Australian Strategic Policy Institute implicated the Chinese government and the ruling Chinese Communist Party in campaigns of online manipulation conducted against Australia and Taiwan using influence-for-hire.