AI For Students Studying

AI For Students Studying — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Audio inpainting

    Audio inpainting

    Audio inpainting (also known as audio interpolation) is an audio restoration task which deals with the reconstruction of missing or corrupted portions of a digital audio signal. Inpainting techniques are employed when parts of the audio have been lost due to various factors such as transmission errors, data corruption or errors during recording. The goal of audio inpainting is to fill in the gaps (i.e., the missing portions) in the audio signal seamlessly, making the reconstructed portions indistinguishable from the original content and avoiding the introduction of audible distortions or alterations. Many techniques have been proposed to solve the audio inpainting problem and this is usually achieved by analyzing the temporal and spectral information surrounding each missing portion of the considered audio signal. Classic methods employ statistical models or digital signal processing algorithms to predict and synthesize the missing or damaged sections. Recent solutions, instead, take advantage of deep learning models, thanks to the growing trend of exploiting data-driven methods in the context of audio restoration. Depending on the extent of the lost information, the inpainting task can be divided in three categories. Short inpainting refers to the reconstruction of few milliseconds (approximately less than 10) of missing signal, that occurs in the case of short distortions such as clicks or clipping. In this case, the goal of the reconstruction is to recover the lost information exactly. In long inpainting instead, with gaps in the order of hundreds of milliseconds or even seconds, this goal becomes unrealistic, since restoration techniques cannot rely on local information. Therefore, besides providing a coherent reconstruction, the algorithms need to generate new information that has to be semantically compatible with the surrounding context (i.e., the audio signal surrounding the gaps). The case of medium duration gaps lays between short and long inpainting. It refers to the reconstruction of tens of millisecond of missing data, a scale where the non-stationary characteristic of audio already becomes important. == Definition == Consider a digital audio signal x {\displaystyle \mathbf {x} } . A corrupted version of x {\displaystyle \mathbf {x} } , which is the audio signal presenting missing gaps to be reconstructed, can be defined as x ~ = m ∘ x {\displaystyle \mathbf {\tilde {x}} =\mathbf {m} \circ \mathbf {x} } , where m {\displaystyle \mathbf {m} } is a binary mask encoding the reliable or missing samples of x {\displaystyle \mathbf {x} } , and ∘ {\displaystyle \circ } represents the element-wise product. Audio inpainting aims at finding x ^ {\displaystyle \mathbf {\hat {x}} } (i.e., the reconstruction), which is an estimation of x {\displaystyle \mathbf {x} } . This is an ill-posed inverse problem, which is characterized by a non-unique set of solutions. For this reason, similarly to the formulation used for the inpainting problem in other domains, the reconstructed audio signal can be found through an optimization problem that is formally expressed as x ^ ∗ = argmin X ^ L ( m ∘ x ^ , x ~ ) + R ( x ^ ) {\displaystyle \mathbf {\hat {x}} ^{}={\underset {\hat {\mathbf {X} }}{\text{argmin}}}~L(\mathbf {m} \circ \mathbf {\hat {x}} ,\mathbf {\tilde {x}} )+R(\mathbf {\hat {x}} )} . In particular, x ^ ∗ {\displaystyle \mathbf {\hat {x}} ^{}} is the optimal reconstructed audio signal and L {\displaystyle L} is a distance measure term that computes the reconstruction accuracy between the corrupted audio signal and the estimated one. For example, this term can be expressed with a mean squared error or similar metrics. Since L {\displaystyle L} is computed only on the reliable frames, there are many solutions that can minimize L ( m ∘ x ^ , x ~ ) {\displaystyle L(\mathbf {m} \circ \mathbf {\hat {x}} ,\mathbf {\tilde {x}} )} . It is thus necessary to add a constraint to the minimization, in order to restrict the results only to the valid solutions. This is expressed through the regularization term R {\displaystyle R} that is computed on the reconstructed audio signal x ^ {\displaystyle \mathbf {\hat {x}} } . This term encodes some kind of a-priori information on the audio data. For example, R {\displaystyle R} can express assumptions on the stationarity of the signal, on the sparsity of its representation or can be learned from data. == Techniques == There exist various techniques to perform audio inpainting. These can vary significantly, influenced by factors such as the specific application requirements, the length of the gaps and the available data. In the literature, these techniques are broadly divided in model-based techniques (sometimes also referred as signal processing techniques) and data-driven techniques. === Model-based techniques === Model-based techniques involve the exploitation of mathematical models or assumptions about the underlying structure of the audio signal. These models can be based on prior knowledge of the audio content or statistical properties observed in the data. By leveraging these models, missing or corrupted portions of the audio signal can be inferred or estimated. An example of a model-based techniques are autoregressive models. These methods interpolate or extrapolate the missing samples based on the neighboring values, by using mathematical functions to approximate the missing data. In particular, in autoregressive models the missing samples are completed through linear prediction. The autoregressive coefficients necessary for this prediction are learned from the surrounding audio data, specifically from the data adjacent to each gap. Some more recent techniques approach audio inpainting by representing audio signals as sparse linear combinations of a limited number of basis functions (as for example in the Short Time Fourier Transform). In this context, the aim is to find the sparse representation of the missing section of the signal that most accurately matches the surrounding, unaffected signal. The aforementioned methods exhibit optimal performance when applied to filling in relatively short gaps, lasting only a few tens of milliseconds, and thus they can be included in the context of short inpainting. However, these signal-processing techniques tend to struggle when dealing with longer gaps. The reason behind this limitation lies in the violation of the stationarity condition, as the signal often undergoes significant changes after the gap, making it substantially different from the signal preceding the gap. As a way to overcome these limitations, some approaches add strong assumptions also about the fundamental structure of the gap itself, exploiting sinusoidal modeling or similarity graphs to perform inpainting of longer missing portions of audio signals. === Data-driven techniques === Data-driven techniques rely on the analysis and exploitation of the available audio data. These techniques often employ deep learning algorithms that learn patterns and relationships directly from the provided data. They involve training models on large datasets of audio examples, allowing them to capture the statistical regularities present in the audio signals. Once trained, these models can be used to generate missing portions of the audio signal based on the learned representations, without being restricted by stationarity assumptions. Data-driven techniques also offer the advantage of adaptability and flexibility, as they can learn from diverse audio datasets and potentially handle complex inpainting scenarios. As of today, such techniques constitute the state-of-the-art of audio inpainting, being able to reconstruct gaps of hundreds of milliseconds or even seconds. These performances are made possible by the use of generative models that have the capability to generate novel content to fill in the missing portions. For example, generative adversarial networks, which are the state-of-the-art of generative models in many areas, rely on two competing neural networks trained simultaneously in a two-player minmax game: the generator produces new data from samples of a random variable, the discriminator attempts to distinguish between generated and real data. During the training, the generator's objective is to fool the discriminator, while the discriminator attempts to learn to better classify real and fake data. In GAN-based inpainting methods the generator acts as a context encoder and produces a plausible completion for the gap only given the available information surrounding it. The discriminator is used to train the generator and tests the consistency of the produced inpainted audio. Recently, also diffusion models have established themselves as the state-of-the-art of generative models in many fields, often beating even GAN-based solutions. For this reason they have also been used to solve the audio inpainting problem, obtaining valid results. These models generate new data instances by inverting the

    Read more →
  • ESign (India)

    ESign (India)

    Aadhaar eSign is an online electronic signature service in India to facilitate an Aadhaar holder to digitally sign a document. The signature service is facilitated by authenticating the Aadhaar holder via the Aadhaar-based e-KYC (electronic Know Your Customer) service. To eSign a document, one has to have an Aadhaar card and a mobile number registered with Aadhaar. With these two things, an Indian citizen can sign a document remotely without being physically present. == Procedure == The notification issued by Government of India in this regard stipulates the following procedure for the e-authentication using Aadhaar e-KYC services. Authentication of an electronic record by e-authentication technique, which shall be done by the applicable use of e-authentication, hash function, and asymmetric cryptosystem techniques, leading to issuance of digital signature certificate by Certifying Authority, a trusted third party service by subscriber's key pair generation, storing of the key pairs on hardware security module and creation of digital signature provided that the trusted third party shall be offered by the certifying authority (the trusted third party shall send application form and certificate signing request to the Certifying Authority for issuing a digital signature certificate to the subscriber), issuance of digital signature certificate by Certifying Authority shall be based on e-authentication, particulars given in the prescribed format, digitally signed verified information from Aadhaar e-KYC services and electronic consent of digital signature certificate applicant, the manner and requirements for e-authentication shall be as issued by the Controller from time to time, the security procedure for creating the subscriber's key pair shall be in accordance with the e-authentication guidelines issued by the Controller, the standards referred to in rule 6 of the Information Technology (Certifying Authorities) Rules, 2000 shall be complied with, in so far as they relate to the certification function of public key of Digital Signature Certificate applicant, and the manner in which information is authenticated by means of digital signature shall comply with the standards specified in rule 6 of the Information Technology (Certifying Authorities) Rules, 2000 in so far as they relate to the creation, storage and transmission of Digital Signature. == eSign Service Providers == Organisations and individuals seeking to obtain the eSigning Service can utilize the services of various service providers. There are empanelled service providers with whom organisations can register as an Application Service Prover after submitting the requisite documents, getting UAT access, building the application around the service and going through an IT Audit by an CERT-IN empanelled auditor. However, the process of registering as an Application Service Provider is cumbersome, and requires huge investments of time, money and resources in complying with the regulations and building a suitable application. Most organisations prefer using services of plug-n-play gateway providers who take the responsibility of complying with the regulations, hence simplifying the process for the market.

    Read more →
  • Change data capture

    Change data capture

    In databases, change data capture (CDC) is a set of software design patterns used to determine and track the data that has changed (the "deltas") so that action can be taken using the changed data. The result is a delta-driven dataset. CDC is an approach to data integration that is based on the identification, capture and delivery of the changes made to enterprise data sources. For instance it can be used for incremental update of data loading. CDC occurs often in data warehouse environments since capturing and preserving the state of data across time is one of the core functions of a data warehouse, but CDC can be utilized in any database or data repository system. == Methodology == System developers can set up CDC mechanisms in a number of ways and in any one or a combination of system layers from application logic down to physical storage. In a simplified CDC context, one computer system has data believed to have changed from a previous point in time, and a second computer system needs to take action based on that changed data. The former is the source, the latter is the target. It is possible that the source and target are the same system physically, but that would not change the design pattern logically. Multiple CDC solutions can exist in a single system. === Timestamps on rows === Tables whose changes must be captured may have a column that represents the time of last change. Names such as LAST_UPDATE, LAST_MODIFIED, etc. are common. Any row in any table that has a timestamp in that column that is more recent than the last time data was captured is considered to have changed. Timestamps on rows are also frequently used for optimistic locking so this column is often available. === Version numbers on rows === Database designers give tables whose changes must be captured a column that contains a version number. Names such as VERSION_NUMBER, etc. are common. One technique is to mark each changed row with a version number. A current version is maintained for the table, or possibly a group of tables. This is stored in a supporting construct such as a reference table. When a change capture occurs, all data with the latest version number is considered to have changed. Once the change capture is complete, the reference table is updated with a new version number. (Do not confuse this technique with row-level versioning used for optimistic locking. For optimistic locking each row has an independent version number, typically a sequential counter. This allows a process to atomically update a row and increment its counter only if another process has not incremented the counter. But CDC cannot use row-level versions to find all changes unless it knows the original "starting" version of every row. This is impractical to maintain.) === Status indicators on rows === This technique can either supplement or complement timestamps and versioning. It can configure an alternative if, for example, a status column is set up on a table row indicating that the row has changed (e.g., a boolean column that, when set to true, indicates that the row has changed). Otherwise, it can act as a complement to the previous methods, indicating that a row, despite having a new version number or a later date, still shouldn't be updated on the target (for example, the data may require human validation). === Time/version/status on rows === This approach combines the three previously discussed methods. As noted, it is not uncommon to see multiple CDC solutions at work in a single system, however, the combination of time, version, and status provides a particularly powerful mechanism and programmers should utilize them as a trio where possible. The three elements are not redundant or superfluous. Using them together allows for such logic as, "Capture all data for version 2.1 that changed between 2005-06-01 00:00 and 2005-07-01 00:00 where the status code indicates it is ready for production." === Triggers on tables === May include a publish/subscribe pattern to communicate the changed data to multiple targets. In this approach, triggers log events that happen to the transactional table into another queue table that can later be "played back". For example, imagine an Accounts table, when transactions are taken against this table, triggers would fire that would then store a history of the event or even the deltas into a separate queue table. The queue table might have schema with the following fields: Id, TableName, RowId, Timestamp, Operation. The data inserted for our Account sample might be: 1, Accounts, 76, 2008-11-02 00:15, Update. More complicated designs might log the actual data that changed. This queue table could then be "played back" to replicate the data from the source system to a target. Data capture offers a challenge in that the structure, contents and use of a transaction log is specific to a database management system. Unlike data access, no standard exists for transaction logs. Most database management systems do not document the internal format of their transaction logs, although some provide programmatic interfaces to their transaction logs (for example: Oracle, DB2, SQL/MP, SQL/MX and SQL Server 2008). Other challenges in using transaction logs for change data capture include: Coordinating the reading of the transaction logs and the archiving of log files (database management software typically archives log files off-line on a regular basis). Translation between physical storage formats that are recorded in the transaction logs and the logical formats typically expected by database users (e.g., some transaction logs save only minimal buffer differences that are not directly useful for change consumers). Dealing with changes to the format of the transaction logs between versions of the database management system. Eliminating uncommitted changes that the database wrote to the transaction log and later rolled back. Dealing with changes to the metadata of tables in the database. CDC solutions based on transaction log files have distinct advantages that include: minimal impact on the database (even more so if one uses log shipping to process the logs on a dedicated host). no need for programmatic changes to the applications that use the database. low latency in acquiring changes. transactional integrity: log scanning can produce a change stream that replays the original transactions in the order they were committed. Such a change stream include changes made to all tables participating in the captured transaction. no need to change the database schema == Confounding factors == As often occurs in complex domains, the final solution to a CDC problem may have to balance many competing concerns. === Unsuitable source systems === Change data capture both increases in complexity and reduces in value if the source system saves metadata changes when the data itself is not modified. For example, some Data models track the user who last looked at but did not change the data in the same structure as the data. This results in noise in the Change Data Capture. === Tracking the capture === Actually tracking the changes depends on the data source. If the data is being persisted in a modern database then Change Data Capture is a simple matter of permissions. Two techniques are in common use: Tracking changes using database triggers Reading the transaction log as, or shortly after, it is written. If the data is not in a modern database, CDC becomes a programming challenge. === Push versus pull === Push: the source process creates a snapshot of changes within its own process and delivers rows downstream. The downstream process uses the snapshot, creates its own subset and delivers them to the next process. Pull: the target that is immediately downstream from the source, prepares a request for data from the source. The downstream target delivers the snapshot to the next target, as in the push model. === Alternatives === Sometimes the slowly changing dimension is used as an alternative method. CDC and SCD are similar in that both methods can detect changes in a data set. The most common forms of SCD are type 1 (overwrite), type 2 (maintain history) or 3 (only previous and current value). SCD 2 can be useful if history is needed in the target system. CDC overwrites in the target system (akin to SCD1), and is ideal when only the changed data needs to arrive at the target, i.e. a delta-driven dataset.

    Read more →
  • Social media use in African politics

    Social media use in African politics

    Since the Egyptian Revolution in 2011 and the Tunisian Revolution, social media, especially Facebook, Twitter, and YouTube, began to gain traction as a political tool in Africa. Various political actors have used social media to pursue a wide range of political objectives. State actors can use social media to encourage political discourse, campaign, or implement censorship and surveillance. Non-state actors, such as civil society organizations and opposition movements, can use social media to address political concerns and to organize widespread uprisings, such as the 2014 Burkinabé uprising. Meanwhile, extremist organizations can use social media to further their propaganda and recruitment. However, social media has been criticized for its limited accessibility and for facilitating the spread of misinformation, causing some skepticism about its effectiveness. Due to low entry barriers and user-generated content, social media provides a platform where people from different social classes can engage and interact with one another. Under traditional media, the public had limited opportunities to voice their political opinions. Social media enables people to both create and consume content. The public has become increasingly comfortable and confident in expressing political opinions online, often away from government scrutiny. Scholars argue that social media use has democratizing effects in African countries. == State actors == === Promoting political discourse === Through social media, the government and its citizens can discuss policy ideas, policy implementation, and political actions. Regardless of geographical location and distance, people are able to voice their opinions to the government. Social media includes citizens who were previously not able to express their discontent or share their ideas to the government. As state actors keep the public informed, social media can increase civic engagement. With more civic engagement, policies can be discussed without politicization. Before the commonplace use of social media, African countries faced weak feedback mechanisms that effectively excluded the average African citizen from policy discourse. In South Africa, the government uses social media to connect with constituencies. The South African president runs an official Twitter, Facebook, YouTube, and Flickr accounts to engage with the public. === Campaigning === Political parties also use social media for political campaigns during election periods. In South Africa, the ANC (African National Congress) and DA (Democratic Alliance) use social media for political purposes. These parties specifically use Facebook as a tool for campaigning and engaging with the public to improve their relationship with citizens. Nigerian President Goodluck Jonathan employed social media to campaign for the presidential election in 2011, which he won. When President Goodluck Jonathan announced his bid for the presidency on social media in 2010, it reached about 217,000 people. As his campaign progressed, President Goodluck Jonathan was able to increase his followers to half a million by early 2011. === Censorship & Surveillance === While state actors can use social media to encourage their party or discourse, social media can be used to censor and surveil citizens. For example, the ANC and DA use Facebook to monitor South Africans. The government is able to track down people who have spoken against the government and translate this information into physical action to stop any possibility of a revolution. Social media platforms can be shut down to manipulate the flow of information. In Chad, citizens cannot access information through online platforms. This censorship blocked "Facebook, Twitter, WhatsApp and Viber". In the Democratic Republic of Congo, the government shut down the internet before contested elections. In Zimbabwe, the government shut down the internet to hide civilian protests against fuel price increases. == Non-state actors == === Civil society organizations (CSOs) === Civil society organizations have also used social media networks in an effort to recruit supporters and communicate with the public. CSOs can use social media to mobilize people to support their cause, such as the Ghanaian Committee for Joint Action (CJA). In 2005 and 2006, the CJA gathered support to protest against the 50% fuel price increase. CSOs can play the role of a counterforce against state actors and state propaganda during times of crises, such as protests and military clashes. In some cases, CSOs release their own videos and photos on social media which challenges traditional forms of media. CSOs have also served to monitor elections to reduce corruption and violence during election day. For instance, the Zambian Bantu Watch started the #bantuwatch social media campaign to monitor the 2011 presidential election. Zambians used Facebook and Twitter to report polling station results to mitigate election fraud and election violence. In South Africa, CSOs created 'amandla.mobi' to campaign for public policies by creating petitions. Through 'amandla.mobi', CSOs are able to circulate petitions on social media to collect signatures. South African CSOs reported how social media helped their organizations to gain support and share ideas. However, CSOs struggle to attract media attention and often have to pay for media coverage. === Opposition forces against the government === Social media is also used by the public or opposition forces against the government. Through horizontal social media, organizing can lead to street protests and revolutions, some of which are successful. For instance, during the Egyptian revolution of 2011, "The Day of the Revolution Against Torture, Poverty, Corruption, and Unemployment" and "We Are All Khaled Said" gathered support against President Hosni Mubarak. In particular, "We Are All Khaled Said" had Egyptian citizens gather around the death of Khaled Said who was brutally tortured and killed by the Egyptian government because Said wanted to uncover government corruption. As unrest erupted into public demonstrations, President Hosni Mubarak was forced to resign. Witnessing the success of social media during the Egyptian revolution, the Tunisian Revolution, or the Jasmine Revolution, mobilized through Facebook and Twitter. Likewise, in South Africa, Malawi, and Mozambique, these countries have used social media as "new protest drums." Due to social media's low entry barrier, opposition forces against the government can facilitate political discourse that can lead to accountability. Whistleblowers and opposition forces are able to expose corruption through social media, where they face less repression while reaching a larger audience. For example, the youth of Zimbabwe and South Africa use Facebook to discuss politics without judgment. Specifically, in Zimbabwe, political youth used Facebook to avoid state surveillance. Social media is used as a supplemental tool for activism. In 2015, South African student activists started the hashtag #RhodesMustFall to push the issue of colonialism and racism at the forefront of the public. === Extremist organizations === Social media is easily accessible and created by user-based content. Therefore, marginalized groups are able to use social media to spread extremist ideas. For instance, Boko Haram created the Media Office of West Africa Province and perpetuated propaganda through Twitter and YouTube. Boko Haram's online propaganda campaign targets and persuades young dissuaded Nigerians to join their cause. It is important to note that social media has also been used against Boko Haram. In April 2014, Boko Haram kidnapped 276 schoolgirls and an international campaign fought for their return through #BringBackOurGirls. Another extremist group, Al-Shabaab, has created an online presence through Twitter and YouTube. Through these social media networks, Al-Shabaab recruits new members to their extremist group through their propaganda which emphasizes the group's successes. Albeit their efforts, Al-Shabaab has not been very successful in coordinating their members but they are successful in financing their group. Furthermore, the Islamic State of Iraq and the Levant (ISIL) use social media to target and recruit individuals to their cause. ISIL's social media usage is more diverse compared to Boko Haram and Al-Shabaab; ISIL uses "Facebook, Twitter, YouTube, WhatsApp, Telegram, JustPaste.it, Kik and Ask.fm." Since ISIL's Twitter accounts kept getting shut down, ISIL uses Telegram and WhatsApp chat rooms to privately conduct meetings. Due to the spread of extremist ideology, Zhuravskaya et al. acknowledge social media's potential to be misused. == Challenges == Although social media can be used as a political tool, it faces challenges in Africa. Due to low literacy rates in Africa, social media networks exclude many of the population members. In addition, lack of access to electricity and the internet can fur

    Read more →
  • Optical granulometry

    Optical granulometry

    Optical granulometry is the process of measuring the different grain sizes in a granular material, based on a photograph. Technology has been created to analyze a photograph and create statistics based on what the picture portrays. This information is vital in maintaining machinery in various trades worldwide. Mining companies can use optical granulometry to analyze inactive or moving rock to quantify the size of these fragments. Forestry companies can zero in on wood chip sizes without stopping the production process, and minimize sizing errors. With more photoanalysis technologies being produced, mining companies have shown an increased interest in these types of systems because of their ability to maintain efficiency throughout the mining process. Companies are saving millions of dollars annually because of this new technology, and are cutting back on maintenance costs on equipment. In order for optical granulometry to be completely successful, an accurate photo must be taken – under sufficient lighting, and using proper technology – to obtain quantified results. If these requirements are met, an image analysis system can be implemented. == The process == Software uses four basic steps in determining the average size of material: See the Wikipedia article on Photoanalysis to see how mining, forestry and agricultural companies are using this technology to improve quality control techniques. == Smartphone-based, segmentation-free estimation of grain size distribution == Recently, a methodology has emerged by which soil grain size distribution can be inferred from optical images acquired with commodity smartphones by training convolutional neural networks to predict parameters of the distribution curve directly from the image, without explicit image segmentation . In this approach, a standardized image of a soil surface is captured under controlled conditions, preprocessed to reduce device-specific variability, and passed to a regression model that outputs the parameters of a cumulative distribution function e.g., a two-parameter Weibull curve. The resulting distribution can be used to derive geotechnical descriptors and class boundaries.

    Read more →
  • Completeness (cryptography)

    Completeness (cryptography)

    In cryptography, a boolean function is said to be complete if the value of each output bit depends on all input bits. This is a desirable property to have in an encryption cipher, so that if one bit of the input (plaintext) is changed, every bit of the output (ciphertext) has an average of 50% probability of changing. The easiest way to show why this is good is the following: consider that if we changed our 8-byte plaintext's last byte, it would only have any effect on the 8th byte of the ciphertext. This would mean that if the attacker guessed 256 different plaintext-ciphertext pairs, he would always know the last byte of every 8byte sequence we send (effectively 12.5% of all our data). Finding out 256 plaintext-ciphertext pairs is not hard at all in the internet world, given that standard protocols are used, and standard protocols have standard headers and commands (e.g. "get", "put", "mail from:", etc.) which the attacker can safely guess. On the other hand, if our cipher has this property (and is generally secure in other ways, too), the attacker would need to collect 264 (~1020) plaintext-ciphertext pairs to crack the cipher in this way.

    Read more →
  • Semi-Automatic Ground Environment

    Semi-Automatic Ground Environment

    The Semi-Automated Ground Environment (SAGE) was a system of large computers and associated networking equipment that coordinated data from many radar sites and processed it to produce a single unified image of the airspace over a wide area. SAGE directed and controlled the NORAD response to a possible Soviet air attack, operating in this role from the late 1950s into the 1980s. The processing power behind SAGE was supplied by the largest discrete component-based computer ever built, the AN/FSQ-7, manufactured by IBM. Each SAGE Direction Center (DC) housed an FSQ-7 which occupied an entire floor, approximately 22,000 square feet (2,000 m2) not including supporting equipment. The FSQ-7 was actually two computers, "A" side and "B" side. Computer processing was switched from "A" side to "B" side on a regular basis, allowing maintenance on the unused side. Information was fed to the DCs from a network of radar stations as well as readiness information from various defense sites. The computers, based on the raw radar data, developed "tracks" for the reported targets, and automatically calculated which defenses were within range. Operators used light guns to select targets on-screen for further information, select one of the available defenses, and issue commands to attack. These commands would then be automatically sent to the defense site via teleprinter. Connecting the various sites was an enormous network of telephones, modems and teleprinters. Later additions to the system allowed SAGE's tracking data to be sent directly to CIM-10 Bomarc missiles and some of the US Air Force's interceptor aircraft in-flight, directly updating their autopilots to maintain an intercept course without operator intervention. Each DC also forwarded data to a Combat Center (CC) for "supervision of the several sectors within the division" ("each combat center [had] the capability to coordinate defense for the whole nation"). SAGE became operational in the late 1950s and early 1960s at an estimated total cost between 8 and 12 billion dollars, four times the cost of the Manhattan Project. Throughout its development, there were continual concerns about its real ability to deal with large attacks, and the Operation Sky Shield tests showed that only about one-fourth of enemy bombers would have been intercepted. Nevertheless, SAGE was the backbone of NORAD's air defense system into the 1980s, by which time the tube-based FSQ-7s were increasingly costly to maintain and completely outdated. Today the same command and control task is carried out by microcomputers, based on the same basic underlying data. == Background == === Earlier systems === Just prior to World War II, Royal Air Force (RAF) tests with the new Chain Home (CH) radars had demonstrated that relaying information to the fighter aircraft directly from the radar sites was not feasible. The radars determined the map coordinates of the enemy, but could generally not see the fighters at the same time. This meant the fighters had to be able to determine where to fly to perform an interception but were often unaware of their own exact location and unable to calculate an interception while also flying their aircraft. The solution was to send all of the radar information to a central control station where operators collated the reports into single tracks, and then reported these tracks to the airbases, or sectors. The sectors used additional systems to track their own aircraft, plotting both on a single large map. Operators viewing the map could then see what direction their fighters would have to fly to approach their targets and relay that simply by telling them to fly along a certain heading or vector. This Dowding system was the first ground-controlled interception (GCI) system of large scale, covering the entirety of the UK. It proved enormously successful during the Battle of Britain, and is credited as being a key part of the RAF's success. The system was slow, often providing information that was up to five minutes out of date. Against propeller driven bombers flying at perhaps 225 miles per hour (362 km/h) this was not a serious concern, but it was clear the system would be of little use against jet-powered bombers flying at perhaps 600 miles per hour (970 km/h). The system was extremely expensive in manpower terms, requiring hundreds of telephone operators, plotters and trackers in addition to the radar operators. This was a serious drain on manpower, making it difficult to expand the network. The idea of using a computer to handle the task of taking reports and developing tracks had been explored beginning late in the war. By 1944, analog computers had been installed at the CH stations to automatically convert radar readings into map locations, eliminating two people. Meanwhile, the Royal Navy began experimenting with the Comprehensive Display System (CDS), another analog computer that took X and Y locations from a map and automatically generated tracks from repeated inputs. Similar systems began development with the Royal Canadian Navy, DATAR, and the US Navy, the Naval Tactical Data System (NTDS). A similar system was also specified for the Nike SAM project, specifically referring to a US version of CDS, coordinating the defense over a battle area so that multiple batteries did not fire on a single target. All of these systems were relatively small in geographic scale, generally tracking within a city-sized area. === Valley Committee === When the Soviet Union tested its first atomic bomb in August 1949, the topic of air defense of the US became important for the first time. A study group, the "Air Defense Systems Engineering Committee", was set up under the direction of Dr. George Valley to consider the problem and is known to history as the "Valley Committee". Their December report noted a key problem in air defense using ground-based radars. A bomber approaching a radar station would detect the signals from the radar long before the reflection off the bomber was strong enough to be detected by the station. The committee suggested that when this occurred, the bomber would descend to low altitude, thereby greatly limiting the radar horizon, allowing the bomber to fly past the station undetected. Although flying at low altitude greatly increased fuel consumption, the team calculated that the bomber would only need to do this for about 10% of its flight, making the fuel penalty acceptable. The only solution to this problem was to build a huge number of stations with overlapping coverage. At that point the problem became one of managing the information. Manual plotting was ruled out as too slow, and a computerized solution was the only possibility. To handle this task, the computer would need to be fed information directly, eliminating any manual translation by phone operators, and it would have to be able to analyze that information and automatically develop tracks. A system tasked with defending cities against the predicted future Soviet bomber fleet would have to be dramatically more powerful than the models used in the NTDS or DATAR. The Committee then had to consider whether or not such a computer was possible. The Valley Committee was introduced to Jerome Wiesner, associate director of the Research Laboratory of Electronics at MIT. Wiesner noted that the Servomechanisms Laboratory had already begun development of a machine that might be fast enough. This was the Whirlwind I, originally developed for the Office of Naval Research as a general purpose flight simulator that could simulate any current or future aircraft by changing its software. Wiesner introduced the Valley Committee to Whirlwind's project lead, Jay Forrester, who convinced him that Whirlwind was sufficiently capable. In September 1950, an early microwave early-warning radar system at Hanscom Field was connected to Whirlwind using a custom interface developed by Forrester's team. An aircraft was flown past the site, and the system digitized the radar information and successfully sent it to Whirlwind. With this demonstration, the technical concept was proven. Forrester was invited to join the committee. === Project Charles === With this successful demonstration, Louis Ridenour, chief scientist of the Air Force, wrote a memo stating "It is now apparent that the experimental work necessary to develop, test, and evaluate the systems proposals made by ADSEC will require a substantial amount of laboratory and field effort." Ridenour approached MIT President James Killian with the aim of beginning a development lab similar to the war-era Radiation Laboratory that made enormous progress in radar technology. Killian was initially uninterested, desiring to return the school to its peacetime civilian charter. Ridenour eventually convinced Killian the idea was sound by describing the way the lab would lead to the development of a local electronics industry based on the needs of the lab and the students who would leave the lab to start their

    Read more →
  • Data profiling

    Data profiling

    Data profiling is the process of examining the data available from an existing information source (e.g. a database or a file) and collecting statistics or informative summaries about that data. The purpose of these statistics may be to: Find out whether existing data can be easily used for other purposes Improve the ability to search data by tagging it with keywords, descriptions, or assigning it to a category Assess data quality, including whether the data conforms to particular standards or patterns Assess the risk involved in integrating data in new applications, including the challenges of joins Discover metadata of the source database, including value patterns and distributions, key candidates, foreign-key candidates, and functional dependencies Assess whether known metadata accurately describes the actual values in the source database Understanding data challenges early in any data intensive project, so that late project surprises are avoided. Finding data problems late in the project can lead to delays and cost overruns. Have an enterprise view of all data, for uses such as master data management, where key data is needed, or data governance for improving data quality. == Introduction == Data profiling refers to the analysis of information for use in a data warehouse in order to clarify the structure, content, relationships, and derivation rules of the data. Profiling helps to not only understand anomalies and assess data quality, but also to discover, register, and assess enterprise metadata. The result of the analysis is used to determine the suitability of the candidate source systems, usually giving the basis for an early go/no-go decision, and also to identify problems for later solution design. == How data profiling is conducted == Data profiling utilizes methods of descriptive statistics such as minimum, maximum, mean, mode, percentile, standard deviation, frequency, variation, aggregates such as count and sum, and additional metadata information obtained during data profiling such as data type, length, discrete values, uniqueness, occurrence of null values, typical string patterns, and abstract type recognition. The metadata can then be used to discover problems such as illegal values, misspellings, missing values, varying value representation, and duplicates. Different analyses are performed for different structural levels. E.g. single columns could be profiled individually to get an understanding of frequency distribution of different values, type, and use of each column. Embedded value dependencies can be exposed in a cross-columns analysis. Finally, overlapping value sets possibly representing foreign key relationships between entities can be explored in an inter-table analysis. Normally, purpose-built tools are used for data profiling to ease the process. The computational complexity increases when going from single column, to single table, to cross-table structural profiling. Therefore, performance is an evaluation criterion for profiling tools. == When is data profiling conducted? == According to Kimball, data profiling is performed several times and with varying intensity throughout the data warehouse developing process. A light profiling assessment should be undertaken immediately after candidate source systems have been identified and DW/BI business requirements have been satisfied. The purpose of this initial analysis is to clarify at an early stage if the correct data is available at the appropriate detail level and that anomalies can be handled subsequently. If this is not the case the project may be terminated. Additionally, more in-depth profiling is done prior to the dimensional modeling process in order assess what is required to convert data into a dimensional model. Detailed profiling extends into the ETL system design process in order to determine the appropriate data to extract and which filters to apply to the data set. Additionally, data profiling may be conducted in the data warehouse development process after data has been loaded into staging, the data marts, etc. Conducting data at these stages helps ensure that data cleaning and transformations have been done correctly and in compliance of requirements. == Benefits and examples == Data profiling can improve data quality, shorten the implementation cycle of major projects, and improve users' understanding of data. Discovering business knowledge embedded in data itself is one of the significant benefits derived from data profiling. It can improve data accuracy in corporate databases.

    Read more →
  • Touch 'n Go eWallet

    Touch 'n Go eWallet

    Touch 'n Go eWallet is a Malaysian digital wallet and online payment platform, established in Kuala Lumpur, Malaysia, in July 2017 as a joint venture between Touch 'n Go and Ant Financial. It allows users to make payments at over 280,000 merchant touch points via QR code, as well as perform peer-to-peer (P2P) money transfers. Since then, the e-wallet further diversified for users to pay for tolls via RFID or PayDirect, street parking and various online payment spanning e-hailing, car-sharing apps or taxis, various overhead bills; top-up for mobile prepaid or in-game currencies; purchases on e-commerce websites; food delivery; renewing motor insurance and other insurance/takaful plans; and even movie, bus, trains or airline tickets. == Background == Prior to the launch of the e-wallet service, Touch 'n Go provided stored-value physical all-in-one contactless card (namely Touch 'n Go cards or "TnG cards") that users can use to pay for toll fares, public transportation and parking lots as well as purchases in some retail stores. In 1999, Touch 'n Go also markets SmartTag devices that allow road users to pass through certain toll booths without the need to unwind the car window. The high entry cost of the device (around RM 100 each) also meant that only few can enjoy the seamless experience. In 2009, Touch 'n Go partnered with Maxis to launch FastTap, a new mobile payment service that utilised Near-Field Communication (NFC). Maxis customers can make payments by placing the phone near the card readers (that also supports physical bank cards and Touch ’N Go cards). However, the venture featured only one phone model, Nokia 6212, which greatly limited the public reach. In July 2012, Touch 'n Go announced another collaboration with CIMB and Maxis to create similar NFC-based online transaction service that runs on compatible smartphones. Touch 'n Go Wallet was launched in February 2017 as an QR code-based e-wallet application, to compete with Samsung Pay that utilizes NFC modules. In the controlled pilot test in Taman Tun Dr Ismail, the correspondents can experience basic functionalities (prepaid mobile service reload, bills payment, movie tickets and flight tickets purchase, transfer of money with another user, and payments at participating stores and restaurants). While the deployed version of the app was generally well-received, the existing process to transfer the balance to the physical TnG card stored value from the app garnered unanimous backlash. Test groups felt that the need to head to a self-service terminal named "Pick Up Device" in person within 24 hours for completion, along with the failure to do so (the balance would be credited back to the wallet after 24 hours), was not divulged clearly and also defeated the purpose of convenience, not to mention there were only 2 such terminals. The feature was eventually suspended. On 15 November 2017, Touch 'n Go was granted permission by the Central Bank of Malaysia to form a joint venture with Ant Financial, a Chinese-based financial company that operates Alipay. The partnership allowed the local e-wallet to learn from and build upon the operational model pioneered by Alipay. In June 2018, it was reported that Touch 'n Go was pilot testing the uses of the Touch 'n Go eWallet in Rapid Transit, as the ticketing system was enabled on the Kelana Jaya line in the Klang Valley. Pilot testing only applied to stations in Kelana Jaya, KL Gateway–Universiti, Kerinchi, KL Sentral, Dang Wangi, KLCC, and Ampang Park. The test was reported to be successful in February 2020 and was planned to be fully deployed on the LRT and MRT. Due to unforeseen circumstances, this feature did not come into fruition, the app merely adds in-app purchase of monthly concession cards called "My50". In August 2018, Touch 'n Go announced that selected drivers may experience first-hand a new RFID-based payment (later rebranded as "myRFID") that serves to replace SmartTag devices on closed toll roads with during pilot testing phase commencing on 3 September 2018. On 2 November 2018, participation in the ongoing pilot programme was expanded, allowing more drivers to sign up ahead of the public rollout of the RFID system. During the same period, Touch 'n Go has discontinued the sales of SmartTAG devices in favor of the RFID-based payment system. Initially, the installation of the RFID chip onto the car could only be done by Touch 'n Go staff at the RFID fitment centers, at no cost. As the pilot testing concluded on 15 February 2020, a self-installation kit are being offered to the public on Lazada and Shopee. Support for taxi-hailing mobile apps was added in November 2018 when Touch 'n Go partnered with EzCab and Public Cab, allowing users to make payments via QR code. This was later expanded to support MULA on 7 January 2020, and later MyCar on 4 April 2020. Touch 'n Go eWallet was also the first eWallet to convert Kuala Lumpur's most famous Ramadan bazaar in Kampong Bahru into "Kampong Kashless", a venue that can accept cashless QR payments. It welcomed more than 250,000 Malaysians including local celebrities and government officials. On 1 October 2019, some e-commerce websites owned by the Alibaba Group (TMall and Taobao) began to support Touch 'n Go eWallet payments, Lazada joined the list on 29 October 2019. Touch 'n Go eWallet was one of the three e-wallet services in Malaysia (the other being Boost and GrabPay) that was eligible for its users to receive an RM 30 credit in conjunction of E-Tunai Rakyat program under the Budget 2020 plan, that further normalizes adoption of cashless and mobile payment among Malaysians. Unlike Boost and GrabPay, whose P2P transfers were completely disabled until users have exhausted the RM 30 first, Touch 'n Go eWallet did not impose such measures. in 2020, Touch 'n Go eWallet joined DuitNow, an electronic transaction ecosystem in Malaysia which allows the funds from Touch 'n Go eWallet to be transferred to other competing services and vice versa, by implementing a standard DuitNow QR code deisgn. Japan become the first country outside Malaysia to support Touch 'n Go eWallet payment via Alipay Connect. During the COVID-19 pandemic and the enforcement of the movement control order, use of eWallets (including Touch 'n Go eWallet) increased tremendously among citizens due to its contactless nature of the payment and increased take-out orders at home; which in turn helped small and medium-sized enterprises to thrive. Touch 'n Go eWallet launched its loyalty programme – The Goal Hunter – in October 2020 where on monthly basis, users collect stamps by paying with the app in exchange for rewards that include lucky draws and other vouchers. == Services == Touch 'n Go eWallet app is available for download on both Google Play and Apple Appstore. It utilizes QR code technology for local in-store payments. The Touch 'n Go eWallet app also diversifies payment types, including but not limited to Utility bills Purchase of motor insurance policy Pay Later facility Prepaid reload and Postpaid payment to telecommunications companies loan repayments for courts, MBSJ payments, zakat and PTPTN payment for car parking P2P transfer airline ticket bookings; movie tickets from TGV Cinemas RFID refuelling at Shell stations (defunct after Shell launched its own payment app in 2024) User can reload the eWallet credit by setting up auto-reload, purchasing reload pins from convenience stores (such as 7-Eleven, KK Super Mart, MyNews, Family Mart etc.), reloading by FPX and credit/debit card. The PayDirect feature allows users to link their physical Touch 'n Go cards into the eWallet, where the toll fare can be debited from the eWallet balance when flashing the card near the sensor. In the circumstance of insufficient balance in the app, the toll fare will be deducted from the physical card's balance instead. This also conveniently allows users to view the card's remaining balance. Touch 'n Go eWallet is the first and only eWallet to offer a money-back guarantee when an unauthorised transaction is made on the user’s eWallet account, subject to Terms & Conditions. Payment via QR code scanning, including Touch 'n Go eWallet, becomes a norm in most of the shops/restaurants across Malaysia, including roadside hawkers/stall owners and automatic vending machines. The merchants usually display their owner's individual QR or Business account that they can apply for in-app. The popularity attributes to the low merchant onboarding cost (Unlike NFC payment and debit/credit card that requires purchase or rental of a payment terminal device at a yearly fee.) The app is also one of the few ewallet that supports bidirectional liquidity (alongside MAE developed by Maybank), where funds can be transferred two-way with bank accounts. This is not possible with the other major ewallets (GrabPay, Boost, ShopeePay etc.) where the money that is reloaded to the wallet cannot be transferred to another bank account, unless through manual req

    Read more →
  • Social media as a public utility

    Social media as a public utility

    Social media as a public utility is a theory postulating that social networking sites (such as Meta - ie:Facebook & Instagram or Alphabet - ie: YouTube & Google, but also independent sites such as Twitter, Tumblr, Snapchat etc.) are essential public services that should be regulated by the government, in a manner similar to how electric and phone utilities are typically government regulated. It is based on the notion that social media platforms have monopoly power and broad social influence. == Background == === Definitions === Social media is defined as "a group of Internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of User Generated Content." Furthermore, the New Zealand Government of Internal Affairs describes it as "a set of online technologies, sites, and practices which are used to share opinions, experiences and perspectives. Fundamentally it is about the conversation. In contrast with traditional media, the nature of social media is to be highly interactive." Moreover, the term social media is described as online tools that let people interact and communicate with each other. This has become a standard word for online cultural exchange and a dominant way for individuals to engage on the internet. By using social media individuals become more closely and strongly connected than ever before. The traditional definition of the term public utility is "an infrastructural necessity for the general public where the supply conditions are such that the public may not be provided with a reasonable service at reasonable prices because of monopoly in the area." Conventional public utilities include water, natural gas, and electricity. In order to secure the interests of the public, utilities are regulated. Public utilities can also be seen as natural monopolies implying that the highest degree of efficiency is accomplished under one operator in the marketplace. Public utility regulation for social media has been largely criticized because people believe it would produce undesirable and indirect effects. However, others say that truly effective government regulation would produce valuable results. Social media as a public utility is a crucial debate because utilities get regulated, so marking social media websites as utilities would require government regulation of various social media websites and platforms such as Facebook, Google, and Twitter. Applying the term public utility to social media implies that social media websites are public necessities, and, consequently, should be regulated by the government. While social media are not as essential for survival as traditional public utilities such as electricity, water, and natural gas, many people believe it has become vital for living in an interconnected world and without it, living a successful life would be difficult. Therefore, many people believe that social media has reached utility status and should be treated as a public utility. However, others believe that this is not true because social media are constantly revolutionizing and giving such platforms "utility status" would result in government regulation, which would consequently hinder innovation. Over the past decade many have debated and questioned whether or not "Internet service providers should be considered essential facilities or natural monopolies and regulated as public utilities." === Monopoly === A monopoly is defined as "a firm that is the only seller of a product or service having no close substitutes." A natural monopoly is when the entire demand within a relevant market can be satisfied at lowest cost by one firm rather than by two or more, and if such a market contains more than one firm then the firms will "quickly shake down to one through mergers or failures, or production will continue to consume more resources than necessary." In a monopoly competition is said to be short-lived, and in a natural monopoly it is said to produce inefficient results." Public utility companies can be regulated to prevent them from gaining monopolistic control. In November 2011 AT&T's proposal for merging with T-Mobile was rejected because it would have "diminished competition," and have led to the company having monopolistic power within the telephone industry. Such regulation is permitted because the telephone industry is a public utility. Similarly, Microsoft has also been prevented from taking various business actions that could result in the company gaining monopolistic power. If social media were a public utility then regulation of Google and Facebook would similarly dictate what they could and could not do. The possibility was raised in 2018 by U.S. Representative Steve King during a House Judiciary hearing on social media filtering practices. == Arguments == Advocates of this theory believe that social media websites already act like public utilities, and therefore regulation is needed. Additionally, advocates say that in the 21st century, using such websites are as necessary for communication as using traditional public utilities such as telephone, water, electricity, and natural gas are for other everyday uses. Specifically, advocates note that Google search should be treated as a public utility and needs to be regulated because it dominates the search engine market and no website can afford to ignore it. There is the position that a social media website such as Google "is a common carrier and should be regulated as such (Newman 2011)." These are reinforced by a perception that social media companies fail to properly maintain fair platforms for discourse. === Individual level === Advocates of regulating social media as a public utility believe that having an Internet presence using social media websites is imperative for individuals to adequately take part in the 21st century. Consequently, they argue that these sites are public utilities that need to be regulated to ensure that the constitutional rights of users are protected. For example, regulation may be needed to protect freedom of speech against risks such as Internet censorship and deplatforming. Social media affects people's behavior. For instance, it plays an important role in shaping its users' decisions and actions pertaining to health. This is demonstrated in a Pew Research Center research, which showed that 72 percent of American adults turned to social media for health information in 2011. Around 70 percent of people with chronic illnesses also use the platform to find cure, diagnoses, and other health answers. This development becomes a public issue as social media are likely to provide wrong medical information. Additionally, social media sites can also facilitate deleterious health behavior such as smoking, drug use, and harmful sexual behavior. === Business level === Advocates of social media as a public utility maintain that social media services dominate the Internet and are mainly owned by three or four companies that have unparalleled power to shape user interaction, and because of this power such businesses need to be regulated as public utilities. Zeynep Tufekci, University of North Carolina Chapel Hill, claims that services on the Internet such as Google, eBay, Facebook, Amazon.com, are all natural monopolies. She has stated that these services "benefit greatly from network externalities[,] which means that the more people on the service, the more useful it is for everyone," and thus it is difficult to replace the market leader. === Government level === Advocates of social media as a public utility believe that the government should impose restrictions on social media websites, such as Google, that are designed to benefit its rivals. Due to the recent substantial growth of social media websites such as Google, advocates claim that such a website "might need search neutrality regulation modeled after net neutrality regulation and that a Federal Search Commission might be needed to enforce such a regime." danah boyd expresses a future issue which the government may have to deal with in her research: Facebook is becoming an international social media website, specifically prevalent in Canada and Europe which are "two regions that love to regulate their utilities." Furthermore, recent books by New America Foundation Senior Fellow Rebecca MacKinnon and law professor Lori Andrews advise society to start considering Facebook and Google as nation-states or the "sovereigns of cyberspace." Overall, advocates of social media as a public utility believe that due to the immense popularity and necessity of social media websites, it is imperative that the Government imposes regulations in the same manner they do for electricity, water, and natural gas. == Counterarguments == Opponents of this theory say that social media websites should not be treated as public utilities because these platforms are changing every year, and because they are not essential services for s

    Read more →
  • NRENum.net

    NRENum.net

    The NRENum.net service is an end-user ENUM service run by TERENA and the participating national research and education networking organisations (NRENs), primarily for academia. NRENum.net is considered as a complementary service and a valid alternative to the Golden ENUM tree. The domain nrenum.net is being populated in order to provide the infrastructure in DNS for storage of E.164 numbers. The NRENum.net service includes the operation of the Tier-0 root Domain Name Server(s) and the delegation of county codes to NRENum.net Registries. NRENum.net is a registered community trademark of TERENA. == Service description == E.164 Telephone Number Mapping (ENUM) is a standard protocol that is the result of work of the Internet Engineering Task Force's Telephone Number Mapping working group. ENUM translates a telephone number into a domain name. This allows users to continue to use the existing phone number formats they are familiar with, while allowing the call to be routed using DNS. This makes ENUM a quick, stable and cheap link between telecommunications systems and the Internet. RFC 3761 discusses the use of the Domain Name System for storage of E.164 numbers. More specifically, how DNS can be used for identifying available services connected to one E.164 number. The RIPE NCC provides DNS operations for e164.arpa (known as Golden ENUM tree) in accordance with the instructions from the Internet Architecture Board. The NRENum.net service is an end-user ENUM service run by TERENA and the participating NRENs primarily for academia. NRENum.net is considered as a complementary service and a valid alternative to the Golden ENUM tree. The domain nrenum.net is being populated in order to provide the infrastructure in DNS for storage of E.164 numbers. The NRENum.net service includes the operation of the Tier-0 root Domain Name Servers and the delegation of county codes to NRENum.net Registries. NRENum.net is a registered community trademark of TERENA. NRENum.net facilitates services such as Voice over IP and videoconferencing. NRENum.net tree refers to the tree structure where: Tier-0 root Domain Name Servers (technically one master and several secondary servers ensuring resilience) are run by the hosting organisations and coordinated by the NRENum.net Operations Team. Tier-1 Domain Name Servers are run by the NRENum.net (national or regional) Registries responsible for the country code(s) delegated. Tier-2 and lower DNS sub-delegations may be implemented, regulated by the national service policies. An NRENum.net Registry is an entity that is authorised by the NRENum.net Operations Team to operate the national or regional Tier-1 Domain Name Server and be responsible for the county code(s) delegated. In many countries there is a National Research and Education Networking organisation (NREN) that acts as the Registry of the country. An NRENum.net Registrar is responsible for the number/block registration in the Tier-1 DNS and a Number Validation Entity is responsible for the validation of the E.164 telephone numbers to be registered. The NREN may at the same time have the role of the NRENum.net Registry, Registrar and Validation Entity for the country code(s) delegated. A Registrant (end user) is an E.164 telephone number holder. Holders of E.164 numbers who want to be listed in the service must contact the appropriate NRENum.net Registrar. Number (block) delegation is the technical process of assigning country codes to national registries, or number blocks under country codes to end users. Number (block) registration is the technical process of configuring DNS and populating it with the appropriate ENUM records (i.e., adding NAPTR records to DNS) via registrars. The ITU-T strictly regulates the number structure of valid E.164 telephone numbers and assigns number blocks to national authorities (telecom regulators) or recently to global entities directly. The national authorities can further delegate the number ranges to local operators within the country or region. A virtual number has either a non-valid E.164 number structure (e.g., longer than 15 digits) or has a valid structure but is not assigned to any national authorities or operators. The number Validation Entity is responsible for checking the numbers to be registered to NRENum.net. == History == The idea for the NRENum.net service was conceived in 2006. NRENum.net became operational in August 2006, and was run by Bernie Höneisen, a staff member of SWITCH, and Kewin Stöckigt, a staff member of AARNet, as a private service, with technical support from SWITCH and the participants in the TERENA Task Force on Enhanced Communication Services (TF-ECS). When that task force completed its activities in 2008, TERENA agreed to take over the coordination of the NRENum.net service. By that time, nine NRENs had joined NRENum.net. The service continued to grow during the next years, and in March 2012 NRENum.net went global when RNP from Brazil joined the service as its 14th partificpant and the first outside Europe. In 2011, the participants decided to migrate the operation of the service's master Domain Name Server to NIIF and the operation of the two secondary DNSs to CARNET and SWITCH. In 2013, Internet2, AARNet and NORDUnet set up additional secondary Domain Name Servers for their regions, thereby completing the global distribution of DNS slaves and bringing the resilience of the NRENum.net infrastructure to a high level. == Governance == TERENA has established a lightweight global governance structure. The Global NRENum.net Governance Committee (GNGC) is the highest-level strategic body responsible for overall NRENum.net service definition, sustainability and long-term strategy. This includes formulating and recommending service governance principles and policies. Its members are nominated by the NRENum.net Registries in the various world regions, and are appointed by TERENA. The GNGC is composed of two members representing Europe, two representing the Asia-Pacific region, and two representing the Americas. The NRENum.net Operations Team is responsible for the day-to-day operations of the Tier-0 root DNSs and the handling of country code delegation requests. It may escalate technical or policy issues to the GNGC for discussion. TERENA is responsible for ensuring the correct and secure operations of the NRENum.net service performed by the NRENum.net Operations Team and governance by the GNGC. TERENA also supports the development of technical improvements to the NRENum.net service and promotes the deployment of NRENum.net worldwide. == Geographical deployment == Thirty-two county codes are delegated in the NRENum.net service. Below these are listed per world region. === Europe === === Asia-Pacific === === North America === +1 United States (Internet2) === Latin America === === Caribbean === === Africa === +262 Réunion, Mayotte (RENATER)

    Read more →
  • Social media newsroom

    Social media newsroom

    A social media newsroom is a company resource, set up to increase the functionality and usability of the traditional online newsroom. Social media newsrooms (SMNs) are intended to encourage dialogue and information sharing. Unlike online newsrooms, content is accessible to more than just journalists, but to all those with whom the company engages such as bloggers, their prospects, customers, business partners and investors. It gives these stakeholders access to news, public relations announcements, images, audio, video and other multimedia files. In addition to posting press releases and corporate news, companies can integrate other social content from sites such as YouTube, Flickr and Slideshow as well as streams from corporate Twitter accounts. Traditional tools for journalists such as corporate fast facts, leadership information, a multimedia library, financial information, awards and other recent media coverage are also included in an SMN. Examples of companies effectively using social media newsrooms include Opel Group, Pressat, First Direct, MyNewsdesk, Scania and Newport Beach.

    Read more →
  • CMU Pronouncing Dictionary

    CMU Pronouncing Dictionary

    The CMU Pronouncing Dictionary (also known as CMUdict) is an open-source pronouncing dictionary originally created by the Speech Group at Carnegie Mellon University (CMU) for use in speech recognition research. CMUdict provides a mapping orthographic/phonetic for English words in their North American pronunciations. It is commonly used to generate representations for speech recognition (ASR), e.g. the CMU Sphinx system, and speech synthesis (TTS), e.g. the Festival system. CMUdict can be used as a training corpus for building statistical grapheme-to-phoneme (g2p) models that will generate pronunciations for words not yet included in the dictionary. The most recent release is 0.7b; it contains over 134,000 entries. An interactive lookup version is available. == Database format == The database is distributed as a plain text file with one entry to a line in the format "WORD " with a two-space separator between the parts. If multiple pronunciations are available for a word, variants are identified using numbered versions (e.g. WORD(1)). The pronunciation is encoded using a modified form of the ARPABET system, with the addition of stress marks on vowels of levels 0, 1, and 2. A line-initial ;;; token indicates a comment. A derived format, directly suitable for speech recognition engines is also available as part of the distribution; this format collapses stress distinctions (typically not used in ASR). The following is a table of phonemes used by CMU Pronouncing Dictionary. == History == == Applications == The Unifon converter is based on the CMU Pronouncing Dictionary. The Natural Language Toolkit contains an interface to the CMU Pronouncing Dictionary. The Carnegie Mellon Logios tool incorporates the CMU Pronouncing Dictionary. PronunDict, a pronunciation dictionary of American English, uses the CMU Pronouncing Dictionary as its data source. Pronunciation is transcribed in IPA symbols. This dictionary also supports searching by pronunciation. Some singing voice synthesizer software like CeVIO Creative Studio and Synthesizer V uses modified version of CMU Pronouncing Dictionary for synthesizing English singing voices. Transcriber, a tool for the full text phonetic transcription, uses the CMU Pronouncing Dictionary 15.ai, a real-time text-to-speech tool using artificial intelligence, uses the CMU Pronouncing Dictionary

    Read more →
  • Squeaky Dolphin

    Squeaky Dolphin

    Squeaky Dolphin is a program developed by the Government Communications Headquarters (GCHQ), a British intelligence and security organization, to collect and analyze data from social media networks. The program was first revealed to the general public on NBC on 27 January 2014 based on documents previously leaked by Edward Snowden. == Scope of surveillance == According to a document of the GCHQ dated August 2012, the program enables broad, real-time surveillance of the following items: YouTube video views The Like button on Facebook. Facebook has since then encrypted the data. Blogspot/Blogger visits Twitter, which has however encrypted its communications since this presentation was made The program can be supplemented with commercially available analytic software to determine which videos are popular among residents of specific cities. The dashboard software chosen was made by Splunk. The presentation, which was originally shown to an NSA audience and was made public by the NBC, contains a note saying the program was "Not interested in individuals just broad trends!". However, "according to other Snowden documents" obtained by NBC, in 2010, "GCHQ exploited unencrypted data from Twitter to identify specific users around the world and target them with propaganda."

    Read more →
  • White-box cryptography

    White-box cryptography

    In cryptography, the white-box model refers to an extreme attack scenario, in which an adversary has full unrestricted access to a cryptographic implementation, most commonly of a block cipher such as the Advanced Encryption Standard (AES). A variety of security goals may be posed (see the section below), the most fundamental being "unbreakability", requiring that any (bounded) attacker should not be able to extract the secret key hardcoded in the implementation, while at the same time the implementation must be fully functional. In contrast, the black-box model only provides an oracle access to the analyzed cryptographic primitive (in the form of encryption and/or decryption queries). There is also a model in-between, the so-called gray-box model, which corresponds to additional information leakage from the implementation, more commonly referred to as side-channel leakage. White-box cryptography is a practice and study of techniques for designing and attacking white-box implementations. It has many applications, including digital rights management (DRM), pay television, protection of cryptographic keys in the presence of malware, mobile payments and cryptocurrency wallets. Examples of DRM systems employing white-box implementations include CSS and Widevine. White-box cryptography is closely related to the more general notions of obfuscation, in particular, to Black-box obfuscation, proven to be impossible, and to Indistinguishability obfuscation, constructed recently under well-founded assumptions but so far being infeasible to implement in practice. As of January 2023, there are no publicly known unbroken white-box designs of standard symmetric encryption schemes. On the other hand, there exist many unbroken white-box implementations of dedicated block ciphers designed specifically to achieve incompressibility (see § Security goals). == Security goals == Depending on the application, different security goals may be required from a white-box implementation. Specifically, for symmetric-key algorithms the following are distinguished: Unbreakability is the most fundamental goal requiring that a bounded attacker should not be able to recover the secret key embedded in the white-box implementation. Without this requirement, all other security goals are unreachable since a successful attacker can simply use a reference implementation of the encryption scheme together with the extracted key. One-wayness requires that a white-box implementation of an encryption scheme can not be used by a bounded attacker to decrypt ciphertexts. This requirement essentially turns a symmetric encryption scheme into a public-key encryption scheme, where the white-box implementation plays the role of the public key associated to the embedded secret key. This idea was proposed already in the famous work of Diffie and Hellman in 1976 as a potential public-key encryption candidate. Code lifting security is an informal requirement on the context, in which the white-box program is being executed. It demands that an attacker can not extract a functional copy of the program. This goal is particularly relevant in the DRM setting. Code obfuscation techniques are often used to achieve this goal. A commonly used technique is to compose the white-box implementation with so-called external encodings. These are lightweight secret encodings that modify the function computed by the white-box part of an application. It is required that their effect is canceled in other parts of the application in an obscure way, using code obfuscation techniques. Alternatively, the canceling counterparts can be applied on a remote server. Incompressibility requires that an attacker can not significantly compress a given white-box implementation. This can be seen as a way to achieve code lifting security (see above), since exfiltrating a large program from a constrained device (for example, an embedded or a mobile device) can be time-consuming and may be easy to detect by a firewall. Examples of incompressible designs include SPACE cipher, SPNbox, WhiteKey and WhiteBlock. These ciphers use large lookup tables that can be pseudorandomly generated from a secret master key. Although this makes the recovery of the master key hard, the lookup tables themselves play the role of an equivalent secret key. Thus, unbreakability is achieved only partially. Traceability (Traitor tracing) requires that each distributed white-box implementation contains a digital watermark allowing identification of the guilty user in case the white-box program is being leaked and distributed publicly. == History == The white-box model with initial attempts of white-box DES and AES implementations were first proposed by Chow, Eisen, Johnson and van Oorshot in 2003. The designs were based on representing the cipher as a network of lookup tables and obfuscating the tables by composing them with small (4- or 8-bit) random encodings. Such protection satisfied a property that each single obfuscated table individually does not contain any information about the secret key. Therefore, a potential attacker has to combine several tables in their analysis. The first two schemes were broken in 2004 by Billet, Gilbert, and Ech-Chatbi using structural cryptanalysis. The attack was subsequently called "the BGE attack". The numerous consequent design attempts (2005-2022) were quickly broken by practical dedicated attacks. In 2016, Bos, Hubain, Michiels and Teuwen showed that an adaptation of standard side-channel power analysis attacks can be used to efficiently and fully automatically break most existing white-box designs. This result created a new research direction about generic attacks (correlation-based, algebraic, fault injection) and protections against them. == Competitions == Four editions of the WhibOx contest were held in 2017, 2019, 2021 and 2024 respectively. These competitions invited white-box designers both from academia and industry to submit their implementation in the form of (possibly obfuscated) C code. At the same time, everyone could attempt to attack these programs and recover the embedded secret key. Each of these competitions lasted for about 4-5 months. WhibOx 2017 / CHES 2017 Capture the Flag Challenge targeted the standard AES block cipher. Among 94 submitted implementations, all were broken during the competition, with the strongest one staying unbroken for 28 days. WhibOx 2019 / CHES 2019 Capture the Flag Challenge again targeted the AES block cipher. Among 27 submitted implementations, 3 programs stayed unbroken throughout the competition, but were broken after 51 days since the publication. WhibOx 2021 / CHES 2021 Capture the Flag Challenge changed the target to ECDSA, a digital signature scheme based on elliptic curves. Among 97 submitted implementations, all were broken within at most 2 days. WhibOx 2024 / CHES 2024 Capture the Flag Challenge again targeted ECDSA. Among 47 submitted implementations, all were broken during the competition, with the strongest one staying unbroken for almost 5 days.

    Read more →