AI Coding Tools

Explore the best AI Coding Tools — independent reviews, comparisons, pricing and step-by-step how-to guides, curated by Aizhi.

  • Scenery generator

    Scenery generator

    A scenery generator (or terrain generator) is a software used to create landscape images, 3D models, and animations. These programs often use procedural generation to generate the landscapes, or sometimes created and rendered by a 3D artist. These programs are often used in video games or movies. Basic elements of landscapes created by scenery generators include terrain, water, foliage, and clouds. The process for basic random generation uses a diamond square algorithm. == Common features == Most scenery generators can create basic heightmaps to simulate the variation of elevation in basic terrain. Common techniques include Simplex noise, fractals, or the diamond-square algorithm, which can generate 2-dimensional heightmaps. A version of scenery generator can be very simplistic. Using a diamond-square algorithm with some extra steps involving fractals, an algorithm for random generation of terrain can be made with only 120 lines of code. The program in example takes a grid and then divides the grid repeatedly. Each smaller grid is then split into squares and diamonds and the algorithm then makes the randomized terrain for each square and diamond. Most programs for creating landscapes also allow for adjustment and editing of the landscape. For example, World Creator allows for terrain sculpting, which uses a similar brush system as Photoshop, and allows for additional terrain enhancement with its procedural techniques such as erosion, sediments, and more. Other tools in the World Creator program include terrain stamping, which allows you to import elevation maps and use them as a base. The programs tend to also allow for additional placement of rocks, trees, etc. These can be done procedurally or by hand depending on the program. Typically the models used for the placement objects are the same as to lessen the amount of work that would be done if the user was to create a multitude of different trees. The terrain generated the computer does a generation of multifractals then integrates them until finally rendering them onto the screen. These techniques are typically done “on-the-fly” which typically for a 128 × 128 resolution terrain would mean 1.5 seconds on a CPU from the early 1990s. == Applications == Scenery generators are commonly used in movies, animations, 3D rendering, and video games. For example, Industrial Light & Magic used E-on Vue to create the fictional environments for Pirates of the Caribbean: Dead Man's Chest. In such live-action cases, a 3D model of the generated environment is rendered and blended with live-action footage. Scenery generated by the software may also be used to create completely computer-generated scenes. In the case of animated movies such as Kung Fu Panda, the raw generation is assisted by hand-painting to accentuate subtle details. Environmental elements not commonly associated with landscapes, such as ocean waves, have also been handled by the software. Scenery generation is used in most 3D based video-games. These typically use either custom or purchased engines that contain their own scenery generators. For some games they tend to use a procedurally generated terrain. These typically use a form of height mapping and use of Perlin noise. This will create a grid that with one point in a 2D coordinate will create the same heightmap as it is pseudorandom, meaning it will result in the same output with the same input. This can then easily be translated into the product 3D image. These can then be changed from the editor tools in most engines if the terrain will be custom built. With recent developments neural networks can be built to create or texture the terrain based on previously suggested artwork or heightmap data. These would be generated using algorithms that have been able to identify images and similarities between them. With the info the machine can take other heightmaps and render a very similar looking image to the style image. This can be used to create similar images in example a Studio Ghibli or Van Gogh art-style. == Software == Most game engines, whether custom or proprietary, will have terrain generation built in. Some terrain generator programs include, Terragen, which can create terrain, water, atmosphere and lighting; L3DT, which provides similar functions to Terragen, and has a 2048 × 2048 resolution limit; and World Creator, which can create terrain, and is fully GPU powered. === List of 3D terrain generation software ===

    Read more →
  • Solid-state electronics

    Solid-state electronics

    Solid-state electronics are semiconductor electronics: electronic equipment that use semiconductor devices such as transistors, diodes and integrated circuits (ICs). The term is also used as an adjective for devices in which semiconductor electronics that have no moving parts replace devices with moving parts, such as the solid-state relay, in which transistor switches are used in place of a moving-arm electromechanical relay, or the solid-state drive (SSD), a type of semiconductor memory used in computers to replace hard disk drives, which store data on rotating disks. == History == The term solid-state became popular at the beginning of the semiconductor era in the 1960s to distinguish this new technology. A semiconductor device works by controlling an electric current consisting of electrons or holes moving within a solid crystalline piece of semiconducting material such as silicon, while the thermionic vacuum tubes it replaced worked by controlling a current of electrons or ions in a vacuum within a sealed tube. Although the first solid-state electronic device was the cat's whisker detector, a crude semiconductor diode invented around 1904, solid-state electronics started with the invention of the transistor in 1947. Before that, all electronic equipment used vacuum tubes, because vacuum tubes were the only electronic components that could amplify—an essential capability in all electronics. The transistor, which was invented by John Bardeen and Walter Houser Brattain while working under William Shockley at Bell Laboratories in 1947, could also amplify, and replaced vacuum tubes. The first transistor hi-fi system was developed by engineers at GE and demonstrated at the University of Philadelphia in 1955. In terms of commercial production, The Fisher TR-1 was the first "all transistor" preamplifier, which became available mid-1956. In 1961, a company named Transis-tronics released a solid-state amplifier, the TEC S-15. The replacement of bulky, fragile, energy-hungry vacuum tubes by transistors in the 1960s and 1970s created a revolution not just in technology but in people's habits, making possible the first truly portable consumer electronics such as the transistor radio, cassette tape player, walkie-talkie and quartz watch, as well as the first practical computers and mobile phones. Other examples of solid state electronic devices are the microprocessor chip, LED lamp, solar cell, charge coupled device (CCD) image sensor used in cameras, and semiconductor laser. Also during the 1960s and 1970s, television set manufacturers switched from vacuum tubes to semiconductors, and advertised sets as "100% solid state" even though the cathode-ray tube (CRT) was still a vacuum tube. It meant only the chassis was 100% solid-state, not including the CRT. Early advertisements spelled out this distinction, but later advertisements assumed the audience had already been educated about it and shortened it to just "100% solid state". LED displays can be said to be truly 100% solid-state.

    Read more →
  • T.38

    T.38

    T.38 is an ITU recommendation for allowing transmission of fax over IP networks (FoIP) in real time. == History == The T.38 fax relay standard was devised in 1998 as a way to transport faxes across IP networks between existing Group 3 (G3) fax terminals. T.4 and related fax standards were published by the ITU in 1980, before the rise of the Internet. In the late 1990s, VoIP, or voice over IP, began to gain ground as an alternative to the conventional public switched telephone network (PSTN). However, because most VoIP systems are optimized (through their use of aggressive lossy bandwidth-saving compression) for voice rather than data calls, conventional fax machines worked poorly or not at all on them due to the network impairments such as delay, jitter, packet loss, and so on. Thus, some way of transmitting fax over IP was needed. == Overview == In practical scenarios, a T.38 fax call has at least part of the call being carried over PSTN, although this is not required by the T.38 definition, and two T.38 devices can send faxes to each other. This particular type of device is called Internet-Aware Fax device, or IAF, and it is capable of initiating or completing a fax call towards the IP network. The typical scenario where T.38 is used is – T.38 fax relay – where a T.30 fax device sends a fax over PSTN to a T.38 fax gateway which converts or encapsulates the T.30 protocol into a T.38 data stream. This is then sent either to a T.38-enabled end point such as fax machine or fax server or another T.38 gateway that converts it back to a PSTN PCM or analog signal and terminates the fax on a T.30 device. The T.38 recommendation defines the use of both TCP and UDP to transport T.38 packets. Implementations tend to use UDP, due to TCP's requirement for acknowledgement packets and resulting retransmission during packet loss, which introduces delays. When using UDP, T.38 copes with packet loss by using redundant data packets. T.38 is not a call setup protocol, thus the T.38 devices need to use standard call setup protocols to negotiate the T.38 call, e.g. H.323, SIP & MGCP. == Operation == There are two primary ways that fax transactions are conveyed across packet networks. The T.37 standard specifies how a fax image is encapsulated in e-mail and transported, ultimately, to the recipient using a store-and-forward process through intermediary entities. T.38, however, defines a protocol that supports the use of the T.30 protocol in both the sender and recipient terminals. (See diagram above.) T.38 lets one transmit a fax across an IP network in real time, just as the original G3 fax standards did for the traditional (time-division multiplexed (TDM)) network, also called the public switched telephone network or PSTN. A special protocol is needed for real-time fax over IP (Internet Protocol) since existing fax terminals only supported PSTN connections, where the information flow was generally smooth and uninterrupted, as opposed to the jittery arrival of IP packets. The trick was to come up with a protocol that makes the IP network “invisible” to the endpoint fax terminals, which would mean the user of a legacy fax terminal need not know that the fax call was traversing an IP network. The network interconnections supported by T.38 are shown above. The two fax terminals on either side of the figure communicate using the T.30 fax protocol published by the ITU in 1980. Interconnection of the PSTN with the IP packet network requires a “gateway” between the PSTN and IP networks. PSTN-IP Gateways support TDM voice on the PSTN side and VoIP and FoIP on the packet side. For voice sessions, the gateway will take in voice packets on the IP side, accumulate a few packets to ensure a smooth flow of TDM data upon their release, and then meter them out over TDM where they eventually are heard by a human or stored on a computer for later playback. The gateway employs packet-management techniques to enhance the quality of the speech in the presence of network errors by taking advantage of the natural ability of a listener to not really hear the occasional missing or repeated packet. But facsimile data are transmitted by modems, which aren't as forgiving as the human ear is for speech. Missing packets will often cause a fax session to fail at worst or create one or more image lines in error at best. So the job of T.38 is to “fool” the terminal into “thinking” that it's communicating directly with another T.30 terminal. It will also correct for network delays with so-called spoofing techniques, and missing or delayed packets with fax-aware buffer-management techniques. Spoofing refers to the logic implemented in the protocol engine of a T.38 relay that modifies the protocol commands and responses on the TDM side to keep network delays on the IP side from causing the transaction to fail. This is done, for example, by padding image lines or deliberately causing a message to be re-transmitted to render network delays transparent to the sending/receiving fax terminals. Networks that do not have packet loss or excessive delay can exhibit acceptable fax performance without T.38, provided the PCM clocks in all gateways are of very high accuracy (explained below). T.38 not only removes the effect of PCM clocks not being synchronized, but also reduces the required network bandwidth by a factor of 10, while it corrects for packet loss and delay. === Bandwidth reduction === As shown in the diagram below, a T.38 gateway is composed of two primary elements: the fax modems and the T.38 subsystem. The fax modems modulate and demodulate the PCM samples of the analog data, turning the sampled-data representation of the fax terminal's analog signal to its binary translation, and vice versa. The PSTN network samples the analog signal of a voice or modem signal (it doesn't know the difference) 8,000 times per second (SPS), and encodes them as 8-bit data bytes. This means 8000 samples-per-second times 8-bits per sample, or 64,000 bits per second (bit/s) to represent the modem (or voice) data in one direction. For both directions the modem transaction consumes 128,000 bits of network bandwidth. However, the typical modem in a fax terminal transmits the image data at 33,600 bit/s, so if the analog data are first converted to the digital content they represent, only 33,600 bits (plus network overhead of a few bytes) are needed. And since T.30 fax is a half-duplex protocol, the network is only needed for one direction at a time. Refer to RFC 3261 === PCM clock synchronization === In the diagram above, there is a sample-rate clock in the fax terminal and one in the gateway's modems that is used to trigger the sampling of the analog line 8,000 times per second. These clocks are usually quite accurate, but in some low-cost terminal adapters (a one or two-line gateway) the PCM clock can be surprisingly inaccurate. If the terminal is sending data to the gateway, and the gateway's clock is too slow, the buffers (jitter buffers) in the gateway will eventually overflow, causing the transaction to fail. Since the difference is often quite small, this problem occurs on long, detailed fax images giving the clocks more time to cause the jitter buffer in gateway to either underflow or overflow, which is just the same as missing or duplicated packets. === Packet loss === T.38 provides facilities to eliminate the effects of packet loss through data redundancy. When a packet is sent, either zero, one, two, three, or even more of the previously sent packets are repeated. (The specification does not impose a limit.) This increases the network bandwidth required (it's still much less than not using T.38) but it allows the receiving gateway to reconstruct the complete packet sequence, even with a fairly high level of packet loss. == Related standards == T.4 is the umbrella specification for fax. It specifies the standard image sizes, two forms of image-data compression (encoding), the image-data format, and references, T.30 and the various modem standards. T.6 specifies a compression scheme that reduces the time required to transmit an image by roughly 50-percent. T.30 specifies the procedures that a sending and receiving terminal use to set up a fax call, determine the image size, encoding, and transfer speed, the demarcation between pages, and the termination of the call. T.30 also references the various modem standards. V.21, V.27ter, V.29, V.17, V.34: ITU modem standards used in facsimile. The first three were ratified prior to 1980, and were specified in the original T.4 and T.30 standards. V.34 was published for fax in 1994. T.37 The ITU standard for sending a fax-image file via e-mail to the intended recipient of a fax. G.711 pass through - this is where the T.30 fax call is carried in a VoIP call encoded as audio. This is sensitive to network packet loss, jitter and clock synchronization. When using voice high-compression encoding techniques such as, but not limited to, G.729, some fax tonal signa

    Read more →
  • Microelectronics

    Microelectronics

    Microelectronics is a subfield of electronics. As the name suggests, microelectronics relates to the study and manufacture (or microfabrication) of very small electronic designs and components. Usually, but not always, this means micrometre-scale or smaller. These devices are typically made from semiconductor materials. Many components of a normal electronic design are available in a microelectronic equivalent. These include transistors, capacitors, inductors, resistors, diodes and (naturally) insulators and conductors can all be found in microelectronic devices. Unique wiring techniques such as wire bonding are also often used in microelectronics because of the unusually small size of the components, leads and pads. This technique requires specialized equipment and is expensive. Digital integrated circuits (ICs) consist of billions of transistors, resistors, diodes, and capacitors. Analog circuits commonly contain resistors and capacitors as well. Inductors are used in some high frequency analog circuits, but tend to occupy larger chip area due to their lower reactance at low frequencies. Gyrators can replace them in many applications. As techniques have improved, the scale of microelectronic components has continued to decrease. At smaller scales, the relative impact of intrinsic circuit properties, such as unintended interactions between components or their parts, may become more significant. These are called parasitic effects, and the goal of the microelectronics design engineer is to find ways to compensate for or to minimize these effects, while delivering smaller, faster, and cheaper devices. Today, microelectronics design is largely aided by electronic design automation (EDA) software.

    Read more →
  • Cyber attribution

    Cyber attribution

    In the area of computer security, cyber attribution is an attribution of cybercrime, i.e., finding who perpetrated a cyberattack. Uncovering a perpetrator may give insights into various security issues, such as infiltration methods, communication channels, etc., and may help in enacting specific countermeasures. Cyber attribution is a costly endeavor requiring considerable resources and expertise in cyber forensic analysis. For governments and other major players dealing with cybercrime would require not only technical solutions, but legal and political ones as well, and for the latter ones cyber attribution is crucial. Attributing a cyberattack is difficult, and of limited interest to companies that are targeted by cyberattacks. In contrast, secret services often have a compelling interest in finding out whether a state is behind the attack. A further challenge in attribution of cyberattacks is the possibility of a false flag attack, where the actual perpetrator makes it appear that someone else caused the attack. Every stage of the attack may leave artifacts, such as entries in log files, that can be used to help determine the attacker's goals and identity. In the aftermath of an attack, investigators often begin by saving as many artifacts as they can find, and then try to determine the attacker.

    Read more →
  • Giditraffic

    Giditraffic

    GidiTraffic (or GIDITRAFFIC) is an online social service started on 23 September 2011. Based primarily on social media, the service employs crowdsourcing as its primary means of providing real-time traffic updates to subscribers on its platform. The service, delivered free of charge, affords its users access to various types of information. Though its broadest category of users is road users and motorists, GIDITRAFFIC lends itself as a platform for answering inquiries from anyone who requires information on any subject of interest. GIDITRAFFIC's core competence is in vehicular traffic reports, however, the service also handles all other forms of traffic (going by the fact that the word traffic also means "the mutual exchange of information"). == Operation == Users of the service log on to its Twitter feed to get up-to-date traffic information or to post a general inquiry, which GIDITRAFFIC then publishes to all subscribers. Through crowdsourced replies, a requester receives numerous responses from other subscribers who have seen the question and can provide a relevant answer. In addition, updates are provided by subscribers to the platform via their mobile devices, thereby making the service effective in delivering traffic updates as they occur, and providing timely answers to other user inquiries. This informs GIDITRAFFIC's motto of "Lending each other an eye", alluding to the collaboration and cooperation between the platform's users in making the service indispensable to its users. == Reception == On Twitter, which is its primary platform, the service caters to over 1,800,000 subscribers, with the number increasing daily. The popularity of the platform stems from the fact that it not only keeps its subscribers abreast of the traffic situation in Lagos, the commercial capital city of Nigeria (well known for its many traffic jams), but users in other parts of the world. For a regular user of the platform, knowing where to avoid getting to a set destination in good time is well worth the two or three minutes it takes to access and scroll through the GIDITRAFFIC feed for updates. Another interesting aspect of this platform is the identity of the person behind it. The sustained anonymity of this individual has sparked many discussions centering on his or her possible identity. Online, GIDITRAFFIC continuously publishes traffic updates and user questions, while keeping up witty interactions with the platform's followers round the clock – adding to the mystery and persona of the GIDITRAFFIC owner. == Awards and recognition == In early 2012, GIDITRAFFIC received a nomination for a Shorty Award in the Life-Saving Hero category. Although this did not translate into a win, it brought recognition and wider exposure for the service from international news outlets such as the BBC, Washington Post. and New York Times. Back home in Nigeria, also in 2012, GIDITRAFFIC was honored with a Future Award for Best Use of New Media in recognition of the huge impact the service has had in terms of helping Lagos residents better manage time spent in traffic. == Mobile Applications == In 2012, GIDITRAFFIC partnered with telecommunications company Nokia to produce a downloadable mobile traffic application (the GIDITRAFFIC application, available for Nokia Asha phones on Nokia's online store). There are plans to extend the application to a wider range of mobile phone platforms. On 4 September 2013, the GIDITRAFFIC application for Nokia Lumia phones using Windows Phone 8 was launched on the Windows App Store.

    Read more →
  • Hype (marketing)

    Hype (marketing)

    Hype in marketing is a strategy of using extreme publicity. Hype as a modern marketing strategy is closely associated with social media. Marketing through hype often uses artificial scarcity to induce demand. Consumers of hyped products often participate as a form of conspicuous consumption to signify characteristics about themselves. Hype allows brands to promote their image above the actual quality of the product. Streetwear brands have collaborated with luxury fashion to justify charging premium prices for their goods. As an example, fashion label Vetements used social media channels to promote a limited-edition hoodie which sold 500 units in hours, recording sales of €445,000. When hype marketing is used to drive demand for limited-edition goods, consumers sometimes attempt resell those good on secondary markets for a profit (comparable to ticket scalping). The resale market is a $24 billion industry. == Method == Luxury brands may release products as a collaborate with ready-made garment brands as a way to build hype. Collaborations have been used by some luxury brands to circumvent fast fashion brands copying their designs. NYU Professor Adam Alter says that for an established brand to create a scarcity frenzy, they need to release a limited number of different products, frequently. Hype is often built via Pop-up retail. Comme des Garçons was one of the first to use this strategy, leasing a short-term vacant shop solved the storage problems of releasing product for quick sale. Hype campaigns also rely on influencer marketing, where brands enlist creators whose parasocial relationships with their followers help convert audience attention into demand for limited releases. == In popular culture == The term 'hypebeast' has been coined to define consumers vulnerable to hype marketing. The origins of the term come from the Hong Kong-based company Hypebeast. The behaviours of the hypebeast define hype marketing; the purchase of popular goods they can't afford to impress others. Hype also manifests itself in queues with brands often retailing hyped products through pop-up stores. Many luxury brands release hyped products via their online shop. This has led to the creation of companies that allow consumers to use bots to guarantee or improve their chances of purchasing a limited-edition product.

    Read more →
  • Line splice

    Line splice

    In electrical engineering and telecommunications, a line splice is a joint directly connecting lengths of electrical cables (electrical splice) or optical fibers (optical splice). The splices are often protected by sleeves. == Splicing of copper wires == The splicing of copper wires happens in the following steps: The cores are laid one above the other at the junction. The core insulation is removed. The wires are wrapped two to three times around each other (twisting). The bare veins on a length of about 3 cm "strangle" or "twist". In some cases, the strangulation is soldered. To isolate the splice, an insulating sleeve made of paper or plastic is pushed over it. The splicing of copper wires is mainly used on paper insulated wires. LSA techniques (LSA: soldering, screwing and stripping free) are used to connect copper wires, making the copper wires faster and easier to connect. LSA techniques include: Wire connection sleeves (AVH = Adernverbindungshülsen) and other crimp connectors. The two wires to be connected are inserted into the AVH without being stripped, which is then compressed with special pliers. The about 2 cm long AVH consist of contact, pressure and insulation. For wire connection strips (AVL = Adernverbindungsleisten) several pairs of wires (10 = AVL10 or 20 = AVL20) are inserted, the strip is then closed with a lid and pressed together with a hydraulic press, which ensures the connection. == Splicing of glass fibers == Fiber-optic cables are spliced using a special arc-splicer, with installation cables connected at their ends to respective "pigtails" - short individual fibers with fiber-optic connectors at one end. The splicer precisely adjusts the light-guiding cores of the two ends of the glass fibers to be spliced. The adjustment is done fully automatically in modern devices, whereas in older models this is carried out manually by means of micrometer screws and microscope. An experienced splicer can precisely position the fiber ends within a few seconds. Subsequently, the fibers are fused together (welded) with an electric arc. Since no additional material is added, such as gas welding or soldering, this is called a "fusion splice". Depending on the quality of the splicing process, attenuation values at the splice points are achieved by 0.3 dB, with good splices also below 0.02 dB. For newer generation devices, alignment is done automatically by motors. Here one differentiates core and jacket centering. At core centering (usually single-mode fibers), the fiber cores are aligned. A possible core offset with respect to the jacket is corrected. In the jacket centering (usually in multimode fibers), the fibers are adjusted to each other by means of electronic image processing in front of the splice. When working with good equipment, the damping value is according to experience at max. 0.1 dB. Measurements are made by means of special measuring devices including optical time-domain reflectometry (OTDR). A good splice should have an attenuation of less than 0.3 dB over the entire distance. Finished fiber optic splices are housed in splice boxes. One differentiates: Fusion splice Adhesive splicing Crimp splice or NENP (no-epoxy no-polish), mechanical splice

    Read more →
  • Language identification

    Language identification

    In natural language processing, language identification or language guessing is the problem of determining which natural language a given content is in. Computational approaches to this problem view it as a special case of text categorization, solved with various statistical methods. == Overview == === Logical approach === A common non-statistical intuitive approach (though highly uncertain) is to look for common letter combinations, or distinctive diacritics or punctuation. === Statistical approach === There are several statistical approaches to language identification. An older statistical method by Grefenstette was based on the frequency of short n-grams, which are often function morphemes. For example, "ing" is more common in English than in French, while the sequence "que" is more common in French. Given a new page found on the Web, one counts the number of occurrences of each such short sequence and picks the language whose frequency table it matches the most. One technique is to compare the compressibility of the text to the compressibility of texts in a set of known languages. This approach is known as mutual information based distance measure. The same technique can also be used to empirically construct family trees of languages which closely correspond to the trees constructed using historical methods. Mutual information based distance measure is essentially equivalent to more conventional model-based methods and is not generally considered to be either novel or better than simpler techniques. Another technique, as described by Cavnar and Trenkle (1994) and Dunning (1994) is to create a language n-gram model from a "training text" for each of the languages. These models can be based on characters (Cavnar and Trenkle) or encoded bytes (Dunning); in the latter, language identification and character encoding detection are integrated. Then, for any piece of text needing to be identified, a similar model is made, and that model is compared to each stored language model. The most likely language is the one with the model that is most similar to the model from the text needing to be identified. This approach can be problematic when the input text is in a language for which there is no model. In that case, the method may return another, "most similar" language as its result. Also problematic for any approach are pieces of input text that are composed of several languages, as is common on the Web. As of 2025, a commonly used baseline method is via the fastText library, which has comparable classification accuracy as deep learning techniques, but much faster. == Identifying similar languages == One of the great bottlenecks of language identification systems is to distinguish between closely related languages. Similar languages like Bulgarian and Macedonian or Indonesian and Malay present significant lexical and structural overlap, making it challenging for systems to discriminate between them. In 2014 the DSL shared task has been organized providing a dataset (Tan et al., 2014) containing 13 different languages (and language varieties) in six language groups: Group A (Bosnian, Croatian, Serbian), Group B (Indonesian, Malaysian), Group C (Czech, Slovak), Group D (Brazilian Portuguese, European Portuguese), Group E (Peninsular Spanish, Argentine Spanish), Group F (American English, British English). The best system reached performance of over 95% results (Goutte et al., 2014). Results of the DSL shared task are described in Zampieri et al. 2014. == Software == Apache OpenNLP includes char n-gram based statistical detector and comes with a model that can distinguish 103 languages Apache Tika contains a language detector for 18 languages

    Read more →
  • Comet (programming)

    Comet (programming)

    Comet is a web application model in which a long-held HTTPS request allows a web server to push data to a browser, without the browser explicitly requesting it. Comet is an umbrella term, encompassing multiple techniques for achieving this interaction. All these methods rely on features included by default in browsers, such as JavaScript, rather than on non-default plugins. The Comet approach differs from the original model of the web, in which a browser requests a complete web page at a time. The use of Comet techniques in web development predates the use of the word Comet as a neologism for the collective techniques. Comet is known by several other names, including Ajax Push, Reverse Ajax, Two-way-web, HTTP Streaming, and HTTP server push among others. The term Comet is not an acronym, but was coined by Alex Russell in his 2006 blog post. In recent years, the standardisation and widespread support of WebSocket and Server-sent events has rendered the Comet model obsolete. == History == === Early Java applets === The ability to embed Java applets into browsers (starting with Netscape Navigator 2.0 in March 1996) made two-way sustained communications possible, using a raw TCP socket to communicate between the browser and the server. This socket can remain open as long as the browser is at the document hosting the applet. Event notifications can be sent in any format – text or binary – and decoded by the applet. === The first browser-to-browser communication framework === The very first application using browser-to-browser communications was Tango Interactive, implemented in 1996–98 at the Northeast Parallel Architectures Center (NPAC) at Syracuse University using DARPA funding. TANGO architecture has been patented by Syracuse University. TANGO framework has been extensively used as a distance education tool. The framework has been commercialized by CollabWorx and used in a dozen or so Command&Control and Training applications in the United States Department of Defense. === First Comet applications === The first set of Comet implementations dates back to 2000, with the Pushlets, Lightstreamer, and KnowNow projects. Pushlets, a framework created by Just van den Broecke, was one of the first open source implementations. Pushlets were based on server-side Java servlets, and a client-side JavaScript library. Bang Networks – a Silicon Valley start-up backed by Netscape co-founder Marc Andreessen – had a lavishly financed attempt to create a real-time push standard for the entire web. In April 2001, Chip Morningstar began developing a Java-based (J2SE) web server which used two HTTP sockets to keep open two communications channels between the custom HTTP server he designed and a client designed by Douglas Crockford; a functioning demo system existed as of June 2001. The server and client used a messaging format that the founders of State Software, Inc. assented to coin as JSON following Crockford's suggestion. The entire system, the client libraries, the messaging format known as JSON and the server, became the State Application Framework, parts of which were sold and used by Sun Microsystems, Amazon.com, EDS and Volkswagen. In March 2006, software engineer Alex Russell coined the term Comet in a post on his personal blog. The new term was a play on Ajax (Ajax and Comet both being common household cleaners in the USA). In 2006, some applications exposed those techniques to a wider audience: Meebo’s multi-protocol web-based chat application enabled users to connect to AOL, Yahoo, and Microsoft chat platforms through the browser; Google added web-based chat to Gmail; JotSpot, a startup since acquired by Google, built Comet-based real-time collaborative document editing. New Comet variants were created, such as the Java-based ICEfaces JSF framework (although they prefer the term "Ajax Push"). Others that had previously used Java-applet based transports switched instead to pure-JavaScript implementations. == Implementations == Comet applications attempt to eliminate the limitations of the page-by-page web model and traditional polling by offering two-way sustained interaction, using a persistent or long-lasting HTTP connection between the server and the client. Since browsers and proxies are not designed with server events in mind, several techniques to achieve this have been developed, each with different benefits and drawbacks. The biggest hurdle is the HTTP 1.1 specification, which states "this specification... encourages clients to be conservative when opening multiple connections". Therefore, holding one connection open for real-time events has a negative impact on browser usability: the browser may be blocked from sending a new request while waiting for the results of a previous request, e.g., a series of images. This can be worked around by creating a distinct hostname for real-time information, which is an alias for the same physical server. This strategy is an application of domain sharding. Specific methods of implementing Comet fall into two major categories: streaming and long polling. === Streaming === An application using streaming Comet opens a single persistent connection from the client browser to the server for all Comet events. These events are incrementally handled and interpreted on the client side every time the server sends a new event, with neither side closing the connection. Specific techniques for accomplishing streaming Comet include the following: ==== Hidden iframe ==== A basic technique for dynamic web application is to use a hidden iframe HTML element (an inline frame, which allows a website to embed one HTML document inside another). This invisible iframe is sent as a chunked block, which implicitly declares it as infinitely long (sometimes called "forever frame"). As events occur, the iframe is gradually filled with script tags, containing JavaScript to be executed in the browser. Because browsers render HTML pages incrementally, each script tag is executed as it is received. Some browsers require a specific minimum document size before parsing and execution is started, which can be obtained by initially sending 1–2 kB of padding spaces. One benefit of the iframes method is that it works in every common browser. Two downsides of this technique are the lack of a reliable error handling method, and the impossibility of tracking the state of the request calling process. ==== XMLHttpRequest ==== The XMLHttpRequest (XHR) object, a tool used by Ajax applications for browser–server communication, can also be pressed into service for server–browser Comet messaging by generating a custom data format for an XHR response, and parsing out each event using browser-side JavaScript; relying only on the browser firing the onreadystatechange callback each time it receives new data. === Ajax with long polling === None of the above streaming transports work across all modern browsers without negative side-effects. This forces Comet developers to implement several complex streaming transports, switching between them depending on the browser. Consequently, many Comet applications use long polling, which is easier to implement on the browser side, and works, at minimum, in every browser that supports XHR. As the name suggests, long polling requires the client to poll the server for an event (or set of events). The browser makes an Ajax-style request to the server, which is kept open until the server has new data to send to the browser, which is sent to the browser in a complete response. The browser initiates a new long polling request in order to obtain subsequent events. IETF RFC 6202 "Known Issues and Best Practices for the Use of Long Polling and Streaming in Bidirectional HTTP" compares long polling and HTTP streaming. Specific technologies for accomplishing long-polling include the following: ==== XMLHttpRequest long polling ==== For the most part, XMLHttpRequest long polling works like any standard use of XHR. The browser makes an asynchronous request of the server, which may wait for data to be available before responding. The response can contain encoded data (typically XML or JSON) or Javascript to be executed by the client. At the end of the processing of the response, the browser creates and sends another XHR, to await the next event. Thus the browser always keeps a request outstanding with the server, to be answered as each event occurs. ==== Script tag long polling ==== While any Comet transport can be made to work across subdomains, none of the above transports can be used across different second-level domains (SLDs), due to browser security policies designed to prevent cross-site scripting attacks. That is, if the main web page is served from one SLD, and the Comet server is located at another SLD (which does not have cross-origin resource sharing enabled), Comet events cannot be used to modify the HTML and DOM of the main page, using those transports. This problem can be sidestepped by creating a proxy server in

    Read more →
  • Supercomputer operating system

    Supercomputer operating system

    A supercomputer operating system is an operating system intended for supercomputers. Since the end of the 20th century, supercomputer operating systems have undergone major transformations, as fundamental changes have occurred in supercomputer architecture. While early operating systems were custom tailored to each supercomputer to gain speed, the trend has been moving away from in-house operating systems and toward some form of Linux, with it running all the supercomputers on the TOP500 list in November 2017. In 2021, top 10 computers run for instance Red Hat Enterprise Linux (RHEL), or some variant of it or other Linux distribution e.g. Ubuntu. Given that modern massively parallel supercomputers typically separate computations from other services by using multiple types of nodes, they usually run different operating systems on different nodes, e.g., using a small and efficient lightweight kernel such as Compute Node Kernel (CNK) or Compute Node Linux (CNL) on compute nodes, but a larger system such as a Linux distribution on server and input/output (I/O) nodes. While in a traditional multi-user computer system job scheduling is in effect a tasking problem for processing and peripheral resources, in a massively parallel system, the job management system needs to manage the allocation of both computational and communication resources, as well as gracefully dealing with inevitable hardware failures when tens of thousands of processors are present. Although most modern supercomputers use the Linux operating system, each manufacturer has made its own specific changes to the Linux distribution they use, and no industry standard exists, partly because the differences in hardware architectures require changes to optimize the operating system to each hardware design. == Context and overview == In the early days of supercomputing, the basic architectural concepts were evolving rapidly, and system software had to follow hardware innovations that usually took rapid turns. In the early systems, operating systems were custom tailored to each supercomputer to gain speed, yet in the rush to develop them, serious software quality challenges surfaced and in many cases the cost and complexity of system software development became as much an issue as that of hardware. In the 1980s the cost for software development at Cray came to equal what they spent on hardware and that trend was partly responsible for a move away from the in-house operating systems to the adaptation of generic software. The first wave in operating system changes came in the mid-1980s, as vendor specific operating systems were abandoned in favor of Unix. Despite early skepticism, this transition proved successful. By the early 1990s, major changes were occurring in supercomputing system software. By this time, the growing use of Unix had begun to change the way system software was viewed. The use of a high level language (C) to implement the operating system, and the reliance on standardized interfaces was in contrast to the assembly language oriented approaches of the past. As hardware vendors adapted Unix to their systems, new and useful features were added to Unix, e.g., fast file systems and tunable process schedulers. However, all the companies that adapted Unix made unique changes to it, rather than collaborating on an industry standard to create "Unix for supercomputers". This was partly because differences in their architectures required these changes to optimize Unix to each architecture. As general purpose operating systems became stable, supercomputers began to borrow and adapt critical system code from them, and relied on the rich set of secondary functions that came with them. However, at the same time the size of the code for general purpose operating systems was growing rapidly. By the time Unix-based code had reached 500,000 lines long, its maintenance and use was a challenge. This resulted in the move to use microkernels which used a minimal set of the operating system functions. Systems such as Mach at Carnegie Mellon University and ChorusOS at INRIA were examples of early microkernels. The separation of the operating system into separate components became necessary as supercomputers developed different types of nodes, e.g., compute nodes versus I/O nodes. Thus modern supercomputers usually run different operating systems on different nodes, e.g., using a small and efficient lightweight kernel such as CNK or CNL on compute nodes, but a larger system such as a Linux-derivative on server and I/O nodes. == Early systems == The CDC 6600, generally considered the first supercomputer in the world, ran the Chippewa Operating System, which was then deployed on various other CDC 6000 series computers. The Chippewa was a rather simple job control oriented system derived from the earlier CDC 3000, but it influenced the later KRONOS and SCOPE systems. The first Cray-1 was delivered to the Los Alamos Lab with no operating system, or any other software. Los Alamos developed the application software for it, and the operating system. The main timesharing system for the Cray 1, the Cray Time Sharing System (CTSS), was then developed at the Livermore Labs as a direct descendant of the Livermore Time Sharing System (LTSS) for the CDC 6600 operating system from twenty years earlier. In developing supercomputers, rising software costs soon became dominant, as evidenced by the 1980s cost for software development at Cray growing to equal their cost for hardware. That trend was partly responsible for a move away from the in-house Cray Operating System to UNICOS system based on Unix. In 1985, the Cray-2 was the first system to ship with the UNICOS operating system. Around the same time, the EOS operating system was developed by ETA Systems for use in their ETA10 supercomputers. Written in Cybil, a Pascal-like language from Control Data Corporation, EOS highlighted the stability problems in developing stable operating systems for supercomputers and eventually a Unix-like system was offered on the same machine. The lessons learned from developing ETA system software included the high level of risk associated with developing a new supercomputer operating system, and the advantages of using Unix with its large extant base of system software libraries. By the middle 1990s, despite the extant investment in older operating systems, the trend was toward the use of Unix-based systems, which also facilitated the use of interactive graphical user interfaces (GUIs) for scientific computing across multiple platforms. The move toward a commodity OS had opponents, who cited the fast pace and focus of Linux development as a major obstacle against adoption. As one author wrote "Linux will likely catch up, but we have large-scale systems now". Nevertheless, that trend continued to gain momentum and by 2005, virtually all supercomputers used some Unix-like OS. These variants of Unix included IBM AIX, the open source Linux system, and other adaptations such as UNICOS from Cray. By the end of the 20th century, Linux was estimated to command the highest share of the supercomputing pie. == Modern approaches == The IBM Blue Gene supercomputer uses the CNK operating system on the compute nodes, but uses a modified Linux-based kernel called I/O Node Kernel (INK) on the I/O nodes. CNK is a lightweight kernel that runs on each node and supports a single application running for a single user on that node. For the sake of efficient operation, the design of CNK was kept simple and minimal, with physical memory being statically mapped and the CNK neither needing nor providing scheduling or context switching. CNK does not even implement file I/O on the compute node, but delegates that to dedicated I/O nodes. However, given that on the Blue Gene multiple compute nodes share a single I/O node, the I/O node operating system does require multi-tasking, hence the selection of the Linux-based operating system. While in traditional multi-user computer systems and early supercomputers, job scheduling was in effect a task scheduling problem for processing and peripheral resources, in a massively parallel system, the job management system needs to manage the allocation of both computational and communication resources. It is essential to tune task scheduling, and the operating system, in different configurations of a supercomputer. A typical parallel job scheduler has a master scheduler which instructs some number of slave schedulers to launch, monitor, and control parallel jobs, and periodically receives reports from them about the status of job progress. Some, but not all supercomputer schedulers attempt to maintain locality of job execution. The PBS Pro scheduler used on the Cray XT3 and Cray XT4 systems does not attempt to optimize locality on its three-dimensional torus interconnect, but simply uses the first available processor. On the other hand, IBM's scheduler on the Blue Gene supercomputers aims to exploit locality a

    Read more →
  • Digital media service

    Digital media service

    A digital media service (DMS) is an online service provider that sells access to digital library of content such as films, software, games, images, literature, etc. While no transfer of property is made, a nearly perfect duplicate of the data (song movie, etc.) is made on a customer's computer. Content is either primarily hosted on a dedicated server, which is owned by the service provider, or it is hosted primarily on the hard drives of its customers using a P2P protocol with, perhaps, a dedicated server to supplement. == History == One example of the older business model is the iTunes Store, which still markets and prices data as individual retail products. There are no examples of the latter business model in operation yet, but one is currently in development by Global Gaming Factory X and expected to begin operation some time after they acquire The Pirate Bay domain on August 27, 2009. A key difference between the two models is that the model which relies on its customer base for offering their bandwidth for other customers to access customer hosted data can operate at significantly lower costs than a company that seeks to limit data access to a per-download fee in order to supplement the cost of using its own hosting and bandwidth. The P2P model holds the potential for companies to offer unlimited access to the largest data library in the history of the internet to its customers for a reasonably low membership rate that is relevant to the cost of operation. While the market is virtually untouched, the P2P supplemented model will need entrepreneurs who are able to overcome a series of challenges in order to compete with the older business model as well as that which is offered for free (and often against the wishes of copyright holders) by hundreds of P2P communities on the internet. These challenges include, but are not limited to: Offering better data quality, speed, convenience and ease of use, protocol, sense of security, indexing and search organization, site up time, data library size, customer support, advertising, artist/copyright holder incentives and compensation, incentives and compensation for customers hosting data and providing bandwidth, guaranteed seeding (available access to indexed data at all times), than competitors.

    Read more →
  • Reflection lines

    Reflection lines

    Engineers use reflection lines to judge a surface's quality. Reflection lines reveal surface flaws, particularly discontinuities in normals indicating that the surface is not C 2 {\displaystyle C^{2}} . Reflection lines may be created and examined on physical surfaces or virtual surfaces with the help of computer graphics. For example, the shiny surface of an automobile body is illuminated with reflection lines by surrounding the car with parallel light sources. Virtually, a surface can be rendered with reflection lines by modulating the surfaces point-wise color according to a simple calculation involving the surface normal, viewing direction and a square wave environment map. == Mathematical definition == Consider a point p {\displaystyle p} on a surface M {\displaystyle M} with (normalized) normal n {\displaystyle n} . If an observer views this point from infinity at view direction v {\displaystyle v} then the reflected view direction r {\displaystyle r} is: r = v − 2 ( n ⋅ v ) n . {\displaystyle r=v-2(n\cdot v)n.} (The vector v {\displaystyle v} is decomposed into its normal part v n = ( n ⋅ v ) v {\displaystyle v_{n}=(n\cdot v)v} and tangential part v t = v − v n {\displaystyle v_{t}=v-v_{n}} . Upon reflection, the tangential part is kept and the normal part is negated.) For reflection lines we consider the surface M {\displaystyle M} surrounded by parallel lines with direction a {\displaystyle a} , representing infinite, non-dispersive light sources. For each point p {\displaystyle p} on M {\displaystyle M} we determine which line is seen from direction v {\displaystyle v} . The position on each line is of no interest. Define the vector r p {\displaystyle r_{p}} to be the reflection direction r {\displaystyle r} projected onto a plane P {\displaystyle P} that is orthogonal to a {\displaystyle a} : r p = r − ( r ⋅ a ) a {\displaystyle r_{p}=r-(r\cdot a)a} and similarly let v p {\displaystyle v_{p}} be the viewing direction projected onto P {\displaystyle P} : v p = v − ( v ⋅ a ) a {\displaystyle v_{p}=v-(v\cdot a)a} Finally, define v o {\displaystyle v_{o}} to be the direction lying in P {\displaystyle P} perpendicular to a {\displaystyle a} and v p {\displaystyle v_{p}} : v o = a × v p {\displaystyle v_{o}=a\times v_{p}} Using these vectors, the reflection line function θ ( p ) : M → ( − π , π ] {\displaystyle \theta (p):M\rightarrow (-\pi ,\pi ]} is a scalar function mapping points p {\displaystyle p} on the surface to angles between v p {\displaystyle v_{p}} and r p {\displaystyle r_{p}} : θ = arctan ⁡ ( r p ⋅ v o , r p ⋅ v p ) {\displaystyle \theta =\arctan {(r_{p}\cdot v_{o},r_{p}\cdot v_{p})}} where a r c t a n ( y , x ) {\displaystyle arctan(y,x)} is the atan2 function producing a number in the range ( − π , π ] {\displaystyle (-\pi ,\pi ]} . ( v p {\displaystyle v_{p}} and v o {\displaystyle v_{o}} can be viewed as a local coordinate system in P {\displaystyle P} with x {\displaystyle x} -axis in direction v p {\displaystyle v_{p}} and y {\displaystyle y} -axis in direction v o {\displaystyle v_{o}} .) Finally, to render the reflection lines positive values θ > 0 {\displaystyle \theta >0} are mapped to a light color and non-positive values to a dark color. == Highlight lines == Highlight lines are a view-independent alternative to reflection lines. Here the projected normal is directly compared against some arbitrary vector x {\displaystyle x} perpendicular to the light source: θ = arctan ⁡ ( n a ⋅ a ⊥ , n a ⋅ x ) {\displaystyle \theta =\arctan {(n_{a}\cdot a^{\perp },n_{a}\cdot x)}} where n a {\displaystyle n_{a}} is the surface normal projected on the light source plane P {\displaystyle P} : n a ^ / | n a ^ | , n a ^ = n − ( n ⋅ a ) a {\displaystyle {\hat {n_{a}}}/|{\hat {n_{a}}}|,{\hat {n_{a}}}=n-(n\cdot a)a} The relationship between reflection lines and highlight lines is likened to that between specular and diffuse shading.

    Read more →
  • The Culture of Connectivity

    The Culture of Connectivity

    The Culture of Connectivity: A Critical History of Social Media is a book by José van Dijck published by Oxford University Press in 2013 on social media platforms and their history. The author considers the histories of five social media platforms: Facebook, Twitter, Flickr, YouTube, and Wikipedia. She focuses on how their technological, social and cultural dimensions contribute to their current status.

    Read more →
  • Military communications

    Military communications

    Military communications or military signals involve all aspects of communications, or conveyance of information, by armed forces. Examples from Jane's Military Communications include text, audio, facsimile, tactical ground-based communications, naval signalling, terrestrial microwave, tropospheric scatter, satellite communications systems and equipment, surveillance and signal analysis, security, direction finding and jamming. The most urgent purposes are to communicate information to commanders and orders from them. Military communications span from pre-history to the present. The earliest military communications were delivered by runners. Later, communications progressed to visual signals. For example, Naval ships would use flag signaling to communicate from ship to ship. These flags are a uniform set of easily identifiable nautical codes that would convey visual messages and codes between ships and from ship to shore. Then militaries discovered methods to use audible signaling to communicate with each other. This way of communicating was possible because of telegraphs. They are an electronic device that is used by a sender and when the sender presses on the telegraph key, they interrupt the current creating an audible pulse that is heard at the receiving station. The receiver then decodes the pulses to decode the messages. Since then, military communication has evolved and advanced much further. Today, there are many perspectives used to examine how troops around the world communicate. Anthony King states how Military sociologists have attempted to explain how military institutions develop and maintain high levels of social cohesion. == History == In past centuries communicating a message usually required someone to go to the destination, bringing the message. Thus, the term communication often implied the ability to transport people and supplies. A place under siege was one that lost communication in both senses. The association between transport and messaging declined in recent centuries. The first military communications involved the use of runners or the sending and receiving of simple signals (sometimes encoded to be unrecognizable). The first distinctive uses of military communications were called semaphore. Modern units specializing in these tactics are usually designated as signal corps. The Roman system of military communication (cursus publicus or cursus vehicularis) is an early example of this. Later, the terms signals and signaller became words referring to a highly-distinct military occupation dealing with general communications methods (similar to those in civil use) rather than with weapons. Present-day military forces of an informational society conduct intense and complicated communicating activities on a daily basis, using modern telecommunications and computing methods. Only a small portion of these activities are directly related to combat actions. Modern concepts of network-centric warfare (NCW) rely on network-oriented methods of communications and control to make existing forces more effective. == Military communications equipment == Drums, horns, flags, and riders on horseback were some of the early methods the military used to send messages over distances. The advent of distinctive signals led to the formation of the signal corps, a group specialized in the tactics of military communications. The signal corps evolved into a distinctive occupation where the signaller became a highly technical job dealing with all available communications methods including civil ones. In the middle 20th century radio equipment came to dominate the field. Many modern pieces of military communications equipment are built to both encrypt and decode transmissions and survive rough treatment in hostile climates. They use different frequencies to send signals to other radio stations to communicate. Radios have played a major role in military communication. Since they are capable of sending radio waves to transmit voice signals over long distances. This can be helpful for communication on the battlefield since it is a good way to send messages undetected over long distances. Radios are also very reliable because even in harsh weather conditions they are still able to help communicate among the soldiers. Militaries still use radios and continue to improve the technology because of their durability and reliability for military communication. Spelling alphabets such as the NATO phonetic alphabet are used to aid radio communications by reducing ambiguity between letters. Military communications – or "comms" – are activities, equipment, techniques, and tactics used by the military in some of the most hostile areas of the earth and in challenging environments such as battlefields, on land (compare radio in a box), underwater and also in air. Military comms include command, control and communications and intelligence and were known as the C3I model before computers were fully integrated. The U.S. Army expanded the model to C4I when it recognized the vital role played by automated computer equipment to send and receive large, bulky amounts of data. In the modern world, most nations attempt to minimize the risk of war caused by miscommunication or inadequate communication. As a result, military communication is intense and complicated and often motivates the development of advanced technology for remote systems such as satellites. Satellites have been improving and are being used more and more for communication. They are being made to have higher transmission capacity to help with their communication abilities. The military is upgrading satellites to be immune to interference during combat operations. This advancement will establish stable, high-quality information highways for long distance communication. Aircraft are also beneficial for communication, both crewed and uncrewed, as well as computers. Computers and their varied applications have revolutionized military comms. Although military communication is designed for warfare, it also supports intelligence-gathering and communication between adversaries, and thus sometimes prevents war. The six categories of military comms are: alert measurement systems cryptography military radio systems command and control signal corps network-centric warfare The alert measurement systems are various states of alertness or readiness for the armed forces used around the world during a state of war, act of terrorism or a military attack against a state. They are known by different acronyms, such as DEFCON, or defense readiness condition, used by the U.S. Armed Forces. Cryptography is the study of methods of converting messages to a form unreadable except to one who knows how to decrypt them. This ancient military comms art gained new importance with the rise of radio systems whose signals traveled far and were easily intercepted. Cryptographic software is also widely used in civilian commerce. == Commercial refile == In United States military communications systems, commercial refile refers to sending a military message via a commercial communications network. The message may come from a military network, such as a tape relay network, a point-to-point telegraph network, a radio-telegraph network, or the Defense Switched Network. Commercial refiling of a message will usually require a reformatting of the message, particularly the heading.

    Read more →