Web science

Web science

Web science is an emerging interdisciplinary field concerned with the study of large-scale socio-technical systems, particularly the World Wide Web. It considers the relationship between people and technology, the ways that society and technology co-constitute one another and the impact of this co-constitution on broader society. Web Science combines research from disciplines as diverse as sociology, computer science, economics, and mathematics. The Web Science Institute, founded at the University of Southampton by director Wendy Hall and colleagues, describes Web Science as focusing "the analytical power of researchers from disciplines as diverse as mathematics, sociology, economics, psychology, law and computer science to understand and explain the Web. It is necessarily interdisciplinary – as much about social and organizational behaviour as about the underpinning technology." A central pillar of Web science development is Artificial Intelligence or "AI". The current artificial intelligence that in development at the moment is Human-Centered, with goals to further professional development courses as well as influencing public policy. Artificial intelligence developers are focused on the most impactful uses of this technology, while also hoping to expedite the growth and development of the human race. An early definition was given by American computer scientist Ben Shneiderman: "Web Science" is processing the information available on the web in similar terms to those applied to natural environment. == Areas of activity == === Emergent properties === Philip Tetlow, an IBM-based scientist influential in the emergence of web science as an independent discipline, argued for the concept of web life, which considers the Web not as a connected network of computers, as in common interpretations of the Internet, but rather as a sociotechnical machine capable of fusing together individuals and organisations into larger coordinated groups. It argues that unlike the technologies that have come before it, the Web is different in that its phenomenal growth and complexity are starting to outstrip our capability to control it directly, making it impossible for us to grasp its completeness in one go. Tetlow made use of Fritjof Capra's concept of the 'web of life' as a metaphor. == Research groups == There are numerous academic research groups engaged in Web Science research, many of which are members of WSTNet, the Web Science Trust Network of research labs. Health Web Science emerged as a sub-discipline of Web Science that studies the role of the Web's impact on human's health outcomes and how to further utilize the Web to improve health outcomes. These groups focus on the developmental possibilities, provided through Web Science, in areas such as health care and social welfare. Discussion of web science has been widely adopted as a method in which the internet can have a real world impact in the field of medicine, currently coined Medicine 2.0. The World Wide Web acts as a medium for the spread and circulation of knowledge, though these various research groups consider themselves responsible for maintaining verifiable and testable knowledge. Using their knowledge of the healthcare system as well as web science, researchers are focused on formatting and structuring their knowledge in a way that is easily accessible throughout the internet. The World Wide Web is quickly evolving meaning that the information we provide and its formatting must also. Recognizing the overlap between both aspects, the spread of knowledge and development of the internet, allows us to properly display our knowledge in a manner that evolves as quickly as the internet and everyday medical research. The accessibility of the internet and quick development of knowledge must be companied with efficient formatting to allocate successful dissemination of information, as described by these various researcher groups. == Related major conferences == Association for Computing Machinery (ACM), Hypertext Conference (HT) sponsored by SIGWEB ACM SIGCHI Conference on Human Factors in Computing Systems (CHI) International AAAI Conference on Weblogs and Social Media (ICWSM) The Web Conference (WWW) Association for Computing Machinery (ACM) Web Science Conference (WebSci)

KE Software

KE Software is a formerly Australian-owned computer software company based in Manchester, United Kingdom, which specialises in collection management programs for museums, galleries and archives. The Axiell Group acquired the firm in 2014. == History == KE Software had its origins in investigations into electronic systems for managing natural science collections conducted in the late 1970s under a joint program of the University of Melbourne, the then National Museum of Victoria and the Australian Museum, which led to the development of the Titan Database in 1984. Much of the credit for the development of the project was due to the work of Martin Hallett of the Museum of Victoria which evolved into Textpress, and by 2000, the KE EMu database program. KE Software was bought by Axiell in 2014 and the team merged with the Axiell staff. Axiell continues to sell and support EMu. == Products == The firm has two main products: the Ke EMu Electronic Museum management system, a collections management system for museums; and Vitalware Vital Records Management System. The first version of Ke EMu was launched in 1997 and uses the Texpress database engine with client/server architecture on a Windows or Unix/Linux server. Ke Emu is consistent with the Dublin Core / Darwin Core standards for archive and museum catalogue metadata. "The company’s clients include the three largest museums in the world.: == KE EMu == KE EMu is considered one of the more effective and purpose-designed museum cataloguing programs. particularly in the creation of public interfaces to museum catalogue data. KE EMu was further developed in 1997 as a multilingual platform, which has been utilised in bilingual institutions such as the Canadian Museum of Civilisation. Subsequently this evolved into Texpress and KE EMu (standing for Electronic MUseum) in 2000, which is "now used across the world in natural science museums with huge collections'". KE EMu is used by a large number of museums and galleries around the world, including the Smithsonian Anthropological Collection, American Museum of Natural HistoryVancouver Art Gallery, New York Botanical Garden, the University of Chicago Research Archives, the University of Pennsylvania Museum in Philadelphia, the National Museum of Australia, the Australian Museum, Museum of Victoria, University of Melbourne Archives, and the Alexander Turnbull Library, National Library of New Zealand. There are over 300 clients, and more than 5000 users of the EMu software worldwide. The program has been described as providing "...comprehensive museum management (collection management plus other administrative needs for a museum), workflow and project management, flexible metadata, various stats and metrics, and comprehensive web interface with support for mobile devices and kiosks" == KE Vitalware == The firm's vitalware software is used by a number of governments and commercial organisations for managing and accessing large data sets, such as the birth records of the Trinidad and Tobago Registrar General, the Government of Anguilla, Ministry for Infrastructure, Communications, Utility and Housing, and the Mississippi Department of Information Technology Services. == Further development == A specialist tracking component for KE EMu has been developed by Forbes Hawkins of Museum Victoria. This enables locations to be barcoded, and data to be updated as items are moved around the stores, or between venues, display, laboratories and other locations. This system has been considered by Museums around the world. The company has been working with Australian government agencies to digitize birth deaths and marriage registers in order to cross match identity data. The program has also been used for managing the Australian Plant Disease Database and the Australian Plant Pest Database as the program "...has several features that have proven to be invaluable for a plant disease database".

Hardware-based encryption

Hardware-based encryption is the use of computer hardware to assist software, or sometimes replace software, in the process of data encryption. Typically, this is implemented as part of the processor's instruction set. For example, the AES encryption algorithm (a modern cipher) can be implemented using the AES instruction set on the ubiquitous x86 architecture. Such instructions also exist on the ARM architecture. However, more unusual systems exist where the cryptography module is separate from the central processor, instead being implemented as a coprocessor, in particular a secure cryptoprocessor or cryptographic accelerator, of which an example is the IBM 4758, or its successor, the IBM 4764. Hardware implementations can be faster and less prone to exploitation than traditional software implementations, and furthermore can be protected against tampering. == History == Prior to the use of computer hardware, cryptography could be performed through various mechanical or electro-mechanical means. An early example is the Scytale used by the Spartans. The Enigma machine was an electro-mechanical system cipher machine notably used by the Germans in World War II. After World War II, purely electronic systems were developed. In 1987 the ABYSS (A Basic Yorktown Security System) project was initiated. The aim of this project was to protect against software piracy. However, the application of computers to cryptography in general dates back to the 1940s and Bletchley Park, where the Colossus computer was used to break the encryption used by German High Command during World War II. The use of computers to encrypt, however, came later. In particular, until the development of the integrated circuit, of which the first was produced in 1960, computers were impractical for encryption, since, in comparison to the portable form factor of the Enigma machine, computers of the era took the space of an entire building. It was only with the development of the microcomputer that computer encryption became feasible, outside of niche applications. The development of the World Wide Web lead to the need for consumers to have access to encryption, as online shopping became prevalent. The key concerns for consumers were security and speed. This led to the eventual inclusion of the key algorithms into processors as a way of both increasing speed and security. == Implementations == === In the instruction set === ==== x86 ==== The X86 architecture, as a CISC (Complex Instruction Set Computer) Architecture, typically implements complex algorithms in hardware. Cryptographic algorithms are no exception. The x86 architecture implements significant components of the AES (Advanced Encryption Standard) algorithm, which can be used by the NSA for Top Secret information. The architecture also includes support for the SHA Hashing Algorithms through the Intel SHA extensions. Whereas AES is a cipher, which is useful for encrypting documents, hashing is used for verification, such as of passwords (see PBKDF2). ==== ARM ==== ARM processors can optionally support Security Extensions. Although ARM is a RISC (Reduced Instruction Set Computer) architecture, there are several optional extensions specified by ARM Holdings. === As a coprocessor === IBM 4758 – The predecessor to the IBM 4764. This includes its own specialised processor, memory and a Random Number Generator. IBM 4764 and IBM 4765, identical except for the connection used. The former uses PCI-X, while the latter uses PCI-e. Both are peripheral devices that plug into the motherboard. === Proliferation === Advanced Micro Devices (AMD) processors are also x86 devices, and have supported the AES instructions since the 2011 Bulldozer processor iteration. Due to the existence of encryption instructions on modern processors provided by both Intel and AMD, the instructions are present on most modern computers. They also exist on many tablets and smartphones due to their implementation in ARM processors. == Advantages == Implementing cryptography in hardware means that part of the processor is dedicated to the task. This can lead to a large increase in speed. In particular, modern processor architectures that support pipelining can often perform other instructions concurrently with the execution of the encryption instruction. Furthermore, hardware can have methods of protecting data from software. Consequently, even if the operating system is compromised, the data may still be secure (see Software Guard Extensions). == Disadvantages == If, however, the hardware implementation is compromised, major issues arise. Malicious software can retrieve the data from the (supposedly) secure hardware – a large class of method used is the timing attack. This is far more problematic to solve than a software bug, even within the operating system. Microsoft regularly deals with security issues through Windows Update. Similarly, regular security updates are released for Mac OS X and Linux, as well as mobile operating systems like iOS, Android, and Windows Phone. However, hardware is a different issue. Sometimes, the issue will be fixable through updates to the processor's microcode (a low level type of software). However, other issues may only be resolvable through replacing the hardware, or a workaround in the operating system which mitigates the performance benefit of the hardware implementation, such as in the Spectre exploit.

Scandiweb

scandiweb is a web development, digital strategy, AI consultation & implementation agency specializing in the Magento (Adobe Commerce) platform. The company was established in 2003 in Latvia by Antons Sapriko. It has offices in the United States, Sweden, Latvia, and Georgia. scandiweb provides solutions for primarily eCommerce businesses and acts as a strategic partner for IT development focusing on web, mobile, and big data analysis. T == Partnerships == scandiweb is an official Adobe Gold Partner, with the largest team of Adobe Commerce-certified employees. The company holds the Google Premier Partner status for 2025, placing it among top 3% agencies globally. scandiweb is a BigCommerce Certified Partner and a Pimcore Platinum Partner. Since 2016, scandiweb has been collaborating with Oro, Inc., an open-source business application development firm. scandiweb is a Platinum Partner of Hyvä, working with the Magento 2 frontend theme to optimize performance metrics. The company is also a Sanity Agency Partner, assisting with content management through Sanity’s headless CMS.

Web API

A web API is an application programming interface (API) for either a web server or a web browser. As a web development concept, it can be related to a web application's client side (including any web frameworks being used). A server-side web API consists of one or more publicly exposed endpoints to a defined request–response message system, typically expressed in JSON or XML by means of an HTTP-based web server. A server API (SAPI) is not considered a server-side web API, unless it is publicly accessible by a remote web application. == Client side == A client-side web API is a programmatic interface to extend functionality within a web browser or other HTTP client. Originally these were most commonly in the form of native plug-in browser extensions however most newer ones target standardized JavaScript bindings. The Mozilla Foundation created their WebAPI specification which is designed to help replace native mobile applications with HTML5 applications. Google created their Native Client architecture which is designed to help replace insecure native plug-ins with secure native sandboxed extensions and applications. They have also made this portable by employing a modified LLVM AOT compiler. == Server side == A server-side web API consists of one or more publicly exposed endpoints to a defined request–response message system, typically expressed in JSON or XML. The web API is exposed most commonly by means of an HTTP-based web server. Mashups are web applications which combine the use of multiple server-side web APIs. Webhooks are server-side web APIs that take input as a Uniform Resource Identifier (URI) that is designed to be used like a remote named pipe or a type of callback such that the server acts as a client to dereference the provided URI and trigger an event on another server which handles this event thus providing a type of peer-to-peer IPC. === Endpoints === Endpoints are important aspects of interacting with server-side web APIs, as they specify where resources can be accessed by third-party software. Usually the access is via a URI to which HTTP requests are posted, and from which the response is thus expected. Web APIs may be public or private, the latter of which requires an access token. Endpoints need to be static, otherwise the correct functioning of software that interacts with them cannot be guaranteed. If the location of a resource changes (and with it the endpoint) then previously written software will break, as the required resource can no longer be found at the same place. As API providers still want to update their web APIs, many have introduced a versioning system in the URI that points to an endpoint. === Resources versus services === Web 2.0 Web APIs often use machine-based interactions such as REST and SOAP. RESTful web APIs use HTTP methods to access resources via URL-encoded parameters, and use JSON or XML to transmit data. By contrast, SOAP protocols are standardized by the W3C and mandate the use of XML as the payload format, typically over HTTP. Furthermore, SOAP-based Web APIs use XML validation to ensure structural message integrity, by leveraging the XML schemas provisioned with WSDL documents. A WSDL document accurately defines the XML messages and transport bindings of a Web service. === Documentation === Server-side web APIs are interfaces for the outside world to interact with the business logic. For many companies this internal business logic and the intellectual property associated with it are what distinguishes them from other companies, and potentially what gives them a competitive edge. They do not want this information to be exposed. However, in order to provide a web API of high quality, there needs to be a sufficient level of documentation. One API provider that not only provides documentation, but also links to it in its error messages is Twilio. However, there are now directories of popular documented server-side web APIs. === Growth and impact === The number of available web APIs has grown consistently over the past years, as businesses realize the growth opportunities associated with running an open platform, that any developer can interact with. ProgrammableWeb tracks over 24000 Web APIs that were available in 2022, up from 105 in 2005. Web APIs have become ubiquitous. There are few major software applications/services that do not offer some form of web API. One of the most common forms of interacting with these web APIs is via embedding external resources, such as tweets, Facebook comments, YouTube videos, etc. In fact there are very successful companies, such as Disqus, whose main service is to provide embeddable tools, such as a feature-rich comment system. Any website of the TOP 100 Alexa Internet ranked websites uses APIs and/or provides its own APIs, which is a very distinct indicator for the prodigious scale and impact of web APIs as a whole. As the number of available web APIs has grown, open source tools have been developed to provide more sophisticated search and discovery. APIs.json provides a machine-readable description of an API and its operations, and the related project APIs.io offers a searchable public listing of APIs based on the APIs.json metadata format. === Business === ==== Commercial ==== Many companies and organizations rely heavily on their Web API infrastructure to serve their core business clients. In 2014 Netflix received around 5 billion API requests, most of them within their private API. ==== Governmental ==== Many governments collect a lot of data, and some governments are now opening up access to this data. The interfaces through which this data is typically made accessible are web APIs. Web APIs allow for data, such as "budget, public works, crime, legal, and other agency data" to be accessed by any developer in a convenient manner. == Example == An example of a popular web API is the Astronomy Picture of the Day API operated by the American space agency NASA. It is a server-side API used to retrieve photographs of space or other images of interest to astronomers, and metadata about the images. According to the API documentation, the API has one endpoint: https://api.nasa.gov/planetary/apod The documentation states that this endpoint accepts GET requests. It requires one piece of information from the user, an API key, and accepts several other optional pieces of information. Such pieces of information are known as parameters. The parameters for this API are written in a format known as a query string, which is separated by a question mark character (?) from the endpoint. An ampersand (&) separates the parameters in the query string from each other. Together, the endpoint and the query string form a URL that determines how the API will respond. This URL is also known as a query or an API call. In the below example, two parameters are transmitted (or passed) to the API via the query string. The first is the required API key and the second is an optional parameter — the date of the photograph requested. https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY&date=1996-12-03 Visiting the above URL in a web browser will initiate a GET request, calling the API and showing the user a result, known as a return value or as a return. This API returns JSON, a type of data format intended to be understood by computers, but which is somewhat easy for a human to read as well. In this case, the JSON contains information about a photograph of a white dwarf star: The above API return has been reformatted so that names of JSON data items, known as keys, appear at the start of each line. The last of these keys, named url, indicates a URL which points to a photograph: https://apod.nasa.gov/apod/image/9612/ngc2440_hst2.jpg Following the above URL, a web browser user would see this photo: Although this API can be called by an end user with a web browser (as in this example) it is intended to be called automatically by software or by computer programmers while writing software. JSON is intended to be parsed by a computer program, which would extract the URL of the photograph and the other metadata. The resulting photo could be embedded in a website, automatically sent via text message, or used for any other purpose envisioned by a software developer.

Spatial embedding

Spatial embedding is one of feature learning techniques used in spatial analysis where points, lines, polygons or other spatial data types. representing geographic locations are mapped to vectors of real numbers. Conceptually it involves a mathematical embedding from a space with many dimensions per geographic object to a continuous vector space with a much lower dimension. Such embedding methods allow complex spatial data to be used in neural networks and have been shown to improve performance in spatial analysis tasks == Embedded data types == Geographic data can take many forms: text, images, graphs, trajectories, polygons. Depending on the task, there may be a need to combine multimodal data from different sources. The next section describes examples of different types of data and their uses. === Text === Geolocated posts on social media can be used to acquire a library of documents bound to a given place that can be later transformed to embedded vectors using word embedding techniques. === Image === Satellites and aircraft collect digital spatial data acquired from remotely sensed images which can be used in machine learning. They are sometimes hard to analyse using basic image analysis methods and convolutional neural networks can be used to acquire an embedding of images bound to a given geographical object or a region. === Point === A single point of interest (POI) can be assigned multiple features that can be used in machine learning. These could be demographic, transportation, meteorological, or economic data, for example. When embedding single points, it is common to consider the entire set of available points as nodes in a graph. === Line / multiline === Among other things, motion trajectories are represented as lines (multilines). Individual trajectories are embedded taking into account travel time, distances and also features of points visited along the way. Embedding of trajectories allows to improve performance of such tasks as clustering and also categorization. === Polygon === The geographic areas analyzed in machine learning are defined by both administrative boundaries and top-down division into grids of regular shapes such as rectangles, for example. Both types are represented as polygons and, like points, can be assigned different demographic, transportation, or economic features. A polygon can also have features related to the size of the area or shape it represents. === Graph === An example domain where graph representation is used is the street layout in a city, where vertices can be intersections and edges can be roads. The vertices can also be destination points like public transport stops or important points in the city, and the edges represent the flow between them. Embedding graphs or single vertices allows to improve accuracy of analysis methods in which the treated geographical domain can be represented as a network. == Usage == POI recommendation - generating personalized point of interest recommendations based on user preferences. Next/future location prediction - prediction of the next location a person will go to based on their historical trajectory. Zone functions classification - based on different mobility of people or POI distribution a function of a given area in a city can be predicted. Crime prediction - estimation of crime rate in different regions of a city. Local event detection - studying spatio-temporal changes in embeddings can provide valuable information in detection of local event occurring in specific location. Regional mobility popularity prediction - analysis of mobility can show patterns in popularity of different regions in a city. Shape matching - finding a similar shape of given polygon, for example finding building with the same shape as input building. Travel time estimation - predicting estimated travel time given current traffic conditions and special occurring events. Time estimation for on-demand food delivery - estimation of delivery time when placing an order through the website. == Temporal aspect == Some of the data analyzed has a timestamp associated with it. In some cases of data analysis this information is omitted and in others it is used to divide the set into groups. The most common division is the separation of weekdays from weekends or division into hours of the day. This is particularly important in the analysis of mobility data, because the characteristics of mobility during the week and at different times of the day are very different from each other. Another area in which time division into, for example, individual months can be used is in the analysis of tourism of a given region. In order to take such a split into account, embedding methods treat the time stamp specifically or separate versions of the model are developed for different subgroups of the analyzed set.

Control communications

In telecommunications, control communications is the branch of technology devoted to the design, development, and application of communications facilities used specifically for control purposes, such as for controlling (a) industrial processes, (b) movement of resources, (c) electric power generation, distribution, and utilization, (d) communications networks, and (e) transportation systems.