AI Face Generator From Photo Free

AI Face Generator From Photo Free — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Central Equipment Identity Register

    Central Equipment Identity Register

    A Central Equipment Identity Register (CEIR) is a database of mobile equipment identifiers (IMEI – for networks of GSM standard, MEID – for networks of CDMA standard). Such an identifier is assigned to each SIM slot of the mobile device. Different kinds of IMEIs could be, White, for devices that are allowed to register in the cellular network; Black, for devices that are prohibited to register in the cellular network; and Grey, for devices in intermediate status (when it is not yet defined in which of the lists - black or white - the device should be placed). Depending on the rules of mobile equipment registration in a country the CEIR database may contain other lists or fields beside IMEI. For example, the subscriber number (MSISDN), which is bound to the IMEI, the ID of the individual (passport data, National ID, etc.) who registered IMEI in the database, details of the importer who brought the device into the country, etc. == History == Originally abbreviation CEIR stood for IMEI Database, created and provided by GSM Association. It was proposed to blacklist the IMEIs of stolen or lost phones. It was assumed that any MNO would be able to receive this list to block the registration of such devices on their network. Thus, it turns out that a stolen phone, once blacklisted by the GSMA CEIR, cannot be used on a large number of cellular networks, which means that the theft of mobile devices will become meaningless. However, it soon became clear that the MNOs on their initiative were not going to do this because if many phones stopped working in their networks, but works in another, it puts them at a disadvantage and can lead to an outflow of subscribers. It became clear that the blocking of stolen devices should be introduced simultaneously in all mobile networks of the country by legislative measures at the initiative of the communications regulator. In this case, as a rule, a national IMEI database is created, which contains general lists of blocked IMEIs. Since the registration in the cellular operator's network is directly blocked by a network node called EIR (Equipment Identity Register), the system that contains the national IMEI base became known as Central EIR (CEIR). To avoid confusion the database of GSM Association was renamed to IMEI Database - IMEI DB (it was in 2003-2008, see “Document History” at IMEI Database File Format Specification). Also sometimes a common IMEI database for several EIRs is called SEIR (Shared EIR). In each country, the CEIR can interact with IMEI DB differently. National CEIR may not communicate with IMEI DB at all. Firstly, it is separately decided whether CEIR will send information about its blacklist to IMEI DB (which IMEIs are placed in it or removed from there). Secondly, upon receipt of the blacklist from IMEI DB, the regulator decides from which countries it will receive it (IMEI DB stores the information exactly who blacklisted the IMEI). For example, you can get a list from neighboring countries, from countries in your region, from around the world. In addition to the blacklist, the GSMA is developing a list of IMEIs allocated to manufacturers for use in their devices. The manufacturer for each new device model gets at least one TAC (Type Allocation Code) allocated by GSMA, consisting of 8 digits, to which he can add a 6-digit serial number to obtain the IMEI. Thus, with one TAC, a manufacturer can release up to 1 million devices with a unique IMEI. Usually, CEIR receives a list of allocated TACs from the GSMA, since if the first 8 digits of the IMEI of a device are not in this list, this is a sign that it is counterfeit. If the central database of identifiers does not work with GSM networks, but with CDMA, then for the same purposes it is necessary to interact with another worldwide database that contains MEIDs – MEID Database. A system that directly blocks the registration of a mobile device on a cellular network – EIR. Each MNO must have at least one EIR, to which IMEI check requests (CheckIMEI) are sent when registering a device on the network. A typical EIR and CERI interaction scheme: The CEIR accumulates black, white, and grey lists using various data sources and verification methods. These lists are periodically transmitted to all EIRs. EIR uses them when processing every CheckIMEI request to determine whether to allow the device on the network or not. EIR can transmit some data to the CEIR database too. Usually, changes in a grey list – new IMEIs on the network that are not in any list – are transmitted from EIR to CEIR. In addition to synchronizing lists across multiple networks, the main function of CEIR is to implement the scenarios of changes at these lists. This usually requires interaction with various IT systems (databases) of other organizations and/or with subscribers. Еxamples of such scenarios: Whitelisting the IMEI of devices imported by the legal entity Whitelisting the IMEI of devices manufactured domestically Whitelisting the IMEI of devices imported by individual Blacklisting the IMEI of stolen/lost devices Binding IMEI to the subscriber's number and, vice versa, unbinding IMEI from the subscriber == System implementation results == The goals and results of CEIR implementation in a country are usually: Reducing mobile phone theft Reducing the import of devices stolen in other countries Reducing the presence of counterfeit devices on the market (null IMEI, incorrect IMEI, changed IMEI) Reducing illegal imports of mobile devices (increase in the collection of customs duties) Additionally, CEIR most often contributes to the solution of such problems: Combating various mobile fraud schemes Obtaining more accurate statistics on the state of the mobile communications market for the regulator Fight against terrorism (the ability to block the device at once in all mobile networks of the country). Known results achieved in some countries: Great Britain – reducing mobile phone theft. Turkey – reducing mobile phone theft, decreasing the current account deficit of Turkey and maximizing tax revenues. Uzbekistan – preventing black import of mobile devices by 98%, increase in revenues from the import of mobile devices by 700%. Kenya – disposing the market of counterfeit mobile equipment. Azerbaijan – disposing the market of counterfeit mobile equipment. Ukraine – increasing of legally imported mobile devices by 95%, increase in revenues from the import of mobile devices. == CEIR and EIR manufacturers == Some countries have used local developers to implement CEIR for their country (Great Britain, Turkey, India, and Azerbaijan). EIR is a system that is standardized in a 2G-5G networks. Such system may be established at mobile network even it doesn’t use black list and there are no CEIR in a country. Some developers of MNO’s signal core include EIR in a complex solution. However, its standard capabilities are usually lacking for specific requirements when implementing CEIR.

    Read more →
  • Hundred (novel series)

    Hundred (novel series)

    Hundred (ハンドレッド, Handoreddo) is a Japanese light novel series written by Jun Misaki and illustrated by Nekosuke Ōkuma. SB Creative published 16 novels between November 15, 2012, and October 15, 2018, under their GA Bunko imprint. A manga adaptation with art by Sasayuki was serialized in Fujimi Shobo's Monthly Dragon Age magazine. An anime television series adaptation, produced by Production IMS and directed by Tomoki Kobayashi, aired from April to June 2016. == Plot == "Hundreds" are a kind of weapon that get their name from their ability to change into many different forms, and are the only thing that can counter the mysterious life forms called Savage that are attacking Earth. Those who can wield a Hundred are sought out to be made into Slayers, trained individuals who can use them in combat. To become a Slayer, Hayato Kisaragi successfully enrolls in the marine academy city ship Little Garden. However he feels a strange yet familiar sense of incongruity towards Emile Crossford, his roommate who somehow knows him from somewhere. On top of that, shortly after he enters the school, he ends up getting challenged to a duel by the "Queen" and the school's most powerful Slayer, Claire Harvey. == Characters == Hayato Kisaragi (如月 ハヤト, Kisaragi Hayato) Voiced by: Yoshiaki Hasegawa (Japanese); Ricco Fajardo (English) Hayato is the male protagonist of Hundred. Originally from Yamato, Hayato became a Slayer in order to obtain state-of-the-art medical treatment for his sister. His previous encounter with a Savage 10 years ago resulted in him becoming a Variant - one of a very small fraction of people (fewer than 10 in the world, according to Emile) who have survived exposure to the Savages and obtained a greatly increased affinity for Hundreds as a result. He has the highest known compatibility with a Hundred and his Hundred, the Flying Swallow, is a chevalier-type that takes the form of a sword and a shoulder guard. When he first met Emilia he didn't realize that she was really a girl, but upon discovering the truth, he agreed to keep her secret. He is shown to be slightly uncomfortable whenever Emilia was showing him affection and would always blush when around her or other women who show their romantic feelings toward him. Emilia Hermit (エミリア・ハーミット, Emiria Hāmitto) Voiced by: Rumi Ōkubo (Japanese); Mikaela Krantz (English) Emilia is the female protagonist of Hundred. She is a silver-haired girl from the Britannia Empire and Hayato's roommate. She initially poses as a boy under the name Emile Crossfode (エミール・クロスフォード, Emīru Kurosufōdo) with only a few people aware of her secret until she eventually reveals the truth about herself. She and Hayato were survivors from the second Savage attack 10 years earlier, which resulted in her and Hayato becoming Variants. Hayato only has vague recollections of the prior event and it isn't until their encounter with the Savages at Zwei Island that Hayato realizes her true identity. She is a citizen of the Gudenburg Empire by birth and eventually reveals that she is Emilia Gudenburg (エミリア・グーデンブルグ, Emiria Gūdenburugu), the Empire's third princess. Her Hundred is the Arms Shroud that is an innocence type able to change into any form of weapon, something no other Slayer's Hundred can do. Like Hayato, she too is a Variant. Ten years ago she and Hayato where fleeing from the Savages' onslaught when she was attacked by one and almost died. The attack left a potent amount of virus in her gaping wound. Hayato, in an attempt to save her life sucked some of the fluids out, causing him to become a Variant as well. A substantial amount was still left in her system. She is in love with Hayato and is known to be very affectionate towards him and does not care about the rumors circulating about their relationship since everyone assumes them to be gay. Eventually, her status as a princess and girl are revealed to her peers, who were shocked at her heritage and finally understand her feelings to Hayato. Claire Harvey (クレア・ハーヴェイ, Kurea Hāvei) Voiced by: M.A.O (Japanese); Caitlin Glass (English) The highest-ranked Slayer in Little Garden who is from the United States of Liberia, she is called the Queen. The newly-arrived Hayato is forced to duel her to prevent the expulsion of two students who arrived late to the entrance ceremony because they are looking for him at the airport when he arrived. During the duel Hayato accidentally gropes her and she goes all out and defeats him, but the duel is called a draw and the students are allowed to stay. After Hayato saves her from a Savage and, later, accidentally kisses her, she falls in love with him. Her Hundred is a Dragoon Type which utilizes multiple cannons or transforms into a large powerful rifle, in doing so it drains much of her energy. She is also one of the few people who are aware that Emilia is secretly a girl. Karen Kisaragi (如月 カレン, Kisaragi Karen) Voiced by: Kaya Okuno (Japanese); Dawn M. Bennett (English) Hayato's younger sister who is ill. Hayato became a Slayer in order to obtain first-class treatment for her. While staying in the hospital she is often seen playing tarot cards, where she has become sort of a clairvoyant. Unlike her brother, Hayato, she suspected that Emilia was really a girl the moment she met her, until she was later convinced otherwise. She later becomes good friends with popular idol Sakura. Sakura Kirishima (霧島 サクラ, Kirishima Sakura) Voiced by: Mayu Yoshioka (Japanese); Amber Lee Connors (English) She is a popular idol who falls in love with Hayato after seeing him defeat the Trenta Savage at Zwei Island. She originally met Hayato and Karen at a shelter in Gudenberg during the second Savage attack. She remembers Karen but wasn't able to get Hayato's name at the time. After that incident, she lives with her father whom she never meets. When she later falls ill from an unknown illness, her father sells her to the Warslran Research Facility, where subjects like her are injected with vaccines that are developed from the fluids recovered from defeated Savages. She is the only one of the test subjects to have survived and, like Hayato and Emilia, she is also a Variant and a Slayer. Liza Harvey (リザ・ハーヴェイ, Riza Hāvei) Voiced by: Nichika Ōmori (Japanese); Megan Shipman (English) Claire's younger sister. Liddy Steinberg (リディ・スタインバーグ, Ridi Sutainbāgu) Voiced by: Rika Kinugawa (Japanese); Alex Moore (English) Little Garden's student council Vice President who is in charge of enforcement, she is very loyal to Claire and can be very uptight when enforcing the school's rules and regulations. Her Hundred takes the form of a lance and a shield. Erica Candle (エリカ・キャンドル, Erika Kyandoru) Voiced by: Yui Makino (Japanese); Natalie Hoover (English) She is also student council Vice President, however, she is mostly in charge of strategic planning, she has a high admiration for Claire, and it is suggested that she has certain feelings for her. Her Hundred, the Everlasting, is an Arsene type, which takes the form of a massive chained yoyo that she uses for restraining. Unfortunately her Hundred is ineffective against much stronger Savages. She is also one of the few people who became aware of Emilia's secret. Fritz Granz (フリッツ・グランツ, Furittsu Gurantsu) Voiced by: Wataru Hatano (Japanese); Jason Liebrecht (English) Hayato's classmate and Latia's partner. His Hundred takes the form of a sniper rifle. He and Latia were childhood friends, he often pokes fun at her. He is curious about the relationship between Hayato and Emilie and often teases them about their relationship, including sometimes referring to them as a couple on occasion. Latia Saintemilion (レイティア・サンテミリオン, Reitia Santemirion) Voiced by: Yuka Ōtsubo (Japanese); Elizabeth Maxwell (English) She is classmates with Hayato and Emilia, she is also Fritz's partner. Her Hundred is a close quarter melee type. She is Fritz's childhood friend. Charlotte Dimandias (シャーロット・ディマンディウス, Shārotto Dimandiusu) Voiced by: Miyu Matsuki (1st drama CD), Yui Horie (2nd drama CD, anime); Sarah Wiedenheft (English) She is a child prodigy who serves as the Little Garden's only main technical expert and chief researcher on Hundreds. Her authority is equal to that of the student council, that she can go against them or question their decisions. She is best friends with Emilia, and she is one of the characters who knows her secret. Meimei (メイメイ, Meimei) Voiced by: Ayaka Imamura (Japanese); Jill Harris (English) Miharu Kashiwagi (柏木 ミハル, Kashiwagi Miharu) Voiced by: Yuna Yoshino (Japanese); Rachel Glass (English) Miharu is a nurse at the hospital where Karen is staying. She is known for her very sweet demeanor and large breasts. Chris Steinbelt (クリス・シュタインベルト, Kurisu Shutainberuto) Voiced by: Emiri Kato (Japanese); Howard Wang (English) Noa Sheldon (ノア・シェルダン, Noa Sherudan) Voiced by: Yurika Kubo (Japanese); Madeleine Morris (English) Xue-Mei Liu (劉雪梅, Ryū Shuemei) Voiced by: Eri Suzuki (Japanese); Apphia Yu (English) Alphonse Brustad (アルフォ

    Read more →
  • Speech synthesis

    Speech synthesis

    Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition. Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely "synthetic" voice output. The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood clearly. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written words on a home computer. The earliest computer operating system to have included a speech synthesizer was Unix in 1974, through the Unix speak utility. In 2000, Microsoft Sam was the default text-to-speech voice synthesizer used by the narrator accessibility feature, which shipped with all Windows 2000 operating systems, and subsequent Windows XP systems. A text-to-speech system (or "engine") is composed of two parts: a front-end and a back-end. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization. The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end. The back-end—often referred to as the synthesizer—then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations), which is then imposed on the output speech. == History == Long before the invention of electronic signal processing, some people tried to build machines to emulate human speech. There were also legends of the existence of "Brazen Heads", such as those involving Pope Silvester II (d. 1003 AD), Albertus Magnus (1198–1280), and Roger Bacon (1214–1294). In 1779, the German-Danish scientist Christian Gottlieb Kratzenstein won the first prize in a competition announced by the Russian Imperial Academy of Sciences and Arts for models he built of the human vocal tract that could produce the five long vowel sounds (in International Phonetic Alphabet notation: [aː], [eː], [iː], [oː] and [uː]). There followed the bellows-operated "acoustic-mechanical speech machine" of Wolfgang von Kempelen of Pressburg, Hungary, described in a 1791 paper. This machine added models of the tongue and lips, enabling it to produce consonants as well as vowels. In 1837, Charles Wheatstone produced a "speaking machine" based on von Kempelen's design, and in 1846, Joseph Faber exhibited the "Euphonia". In 1923, Paget resurrected Wheatstone's design. In the 1930s, Bell Labs developed the vocoder, which automatically analyzed speech into its fundamental tones and resonances. From his work on the vocoder, Homer Dudley developed a keyboard-operated voice-synthesizer called The Voder (Voice Demonstrator), which he exhibited at the 1939 New York World's Fair. Franklin S. Cooper and his colleagues at Haskins Laboratories built the pattern playback in the late 1940s and completed it in 1950. There were several different versions of this hardware device; only one currently survives. The machine converts pictures of the acoustic patterns of speech in the form of a spectrogram back into sound. Using this device, Alvin Liberman and colleagues discovered acoustic cues for the perception of phonetic segments (consonants and vowels). === Electronic devices === The first computer-based speech-synthesis systems originated in the late 1950s. Noriko Umeda et al. developed the first general English text-to-speech system in 1968, at the Electrotechnical Laboratory in Japan. In 1961, physicist John Larry Kelly, Jr and his colleague Louis Gerstman used an IBM 704 computer to synthesize speech, an event among the most prominent in the history of Bell Labs. Kelly's voice recorder synthesizer (vocoder) recreated the song "Daisy Bell", with musical accompaniment from Max Mathews. Coincidentally, Arthur C. Clarke was visiting his friend and colleague John Pierce at the Bell Labs Murray Hill facility. Clarke was so impressed by the demonstration that he used it in the climactic scene of his screenplay for his novel 2001: A Space Odyssey, where the HAL 9000 computer sings the same song as astronaut Dave Bowman puts it to sleep. Despite the success of purely electronic speech synthesis, research into mechanical speech-synthesizers continues. Linear predictive coding (LPC), a form of speech coding, began development with the work of Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone (NTT) in 1966. Further developments in LPC technology were made by Bishnu S. Atal and Manfred R. Schroeder at Bell Labs during the 1970s. LPC was later the basis for early speech synthesizer chips, such as the Texas Instruments LPC Speech Chips used in the Speak & Spell toys from 1978. In 1975, Fumitada Itakura developed the line spectral pairs (LSP) method for high-compression speech coding, while at NTT. From 1975 to 1981, Itakura studied problems in speech analysis and synthesis based on the LSP method. In 1980, his team developed an LSP-based speech synthesizer chip. LSP is an important technology for speech synthesis and coding, and in the 1990s was adopted by almost all international speech coding standards as an essential component, contributing to the enhancement of digital speech communication over mobile channels and the internet. In 1975, MUSA was released, and was one of the first Speech Synthesis systems. It consisted of a stand-alone computer hardware and a specialized software that enabled it to read Italian. A second version, released in 1978, was also able to sing Italian in an "a cappella" style. Dominant systems in the 1980s and 1990s were the DECtalk system, based largely on the work of Dennis Klatt at MIT, and the Bell Labs system; the latter was one of the first multilingual language-independent systems, making extensive use of natural language processing methods. Handheld electronics featuring speech synthesis began emerging in the 1970s. One of the first was the Telesensory Systems Inc. (TSI) Speech+ portable calculator for the blind in 1976. Other devices had primarily educational purposes, such as the Speak & Spell toy produced by Texas Instruments in 1978. Fidelity released a speaking version of its electronic chess computer in 1979. The first video game to feature speech synthesis was the 1980 shoot 'em up arcade game, Stratovox (known in Japan as Speak & Rescue), from Sun Electronics. The first personal computer game with speech synthesis was Manbiki Shoujo (Shoplifting Girl), released in 1980 for the PET 2001, for which the game's developer, Hiroshi Suzuki, developed a "zero cross" programming technique to produce a synthesized speech waveform. Another early example, the arcade version of Berzerk, also dates from 1980. The Milton Bradley Company produced the first multi-player electronic game using voice synthesis, Milton, in the same year. In 1976, Computalker Consultants released their CT-1 Speech Synthesizer. Designed by D. Lloyd Rice and Jim Cooper, it was an analog synthesizer built to work with microcomputers using the S-100 bus standard. Synthesized voices typically sounded male until 1990, when Ann Syrdal, at AT&T Bell Laboratories, created a female voice. Ray Kurzweil predicted in 2005 that as the cost-performance ratio caused speech synthesizers to become cheaper and more accessible, more people would benefit from the use of text-to-speech programs. === Artificial intelligence === In September 2016, DeepMind released WaveNet, which demonstrated that deep learning models are capable of modeling raw waveforms and generating speech from acoustic features like spectrograms or mel-spectrograms, starting the field of deep learning speech synthesis. Although WaveNet was initially considered to be computationally expensive and slow to be used in consumer products at the time, a year after its

    Read more →
  • Deep learning speech synthesis

    Deep learning speech synthesis

    Deep learning speech synthesis refers to the application of deep learning models to generate natural-sounding human speech from written text (text-to-speech) or spectrum (vocoder). Deep neural networks are trained using large amounts of recorded speech and, in the case of a text-to-speech system, the associated labels and/or input text. == Formulation == Given an input text or some sequence of linguistic units Y {\displaystyle Y} , the target speech X {\displaystyle X} can be derived by X = arg ⁡ max P ( X | Y , θ ) {\displaystyle X=\arg \max P(X|Y,\theta )} where θ {\displaystyle \theta } is the set of model parameters. Typically, the input text will first be passed to an acoustic feature generator, then the acoustic features are passed to the neural vocoder. For the acoustic feature generator, the loss function is typically L1 loss (Mean Absolute Error, MAE) or L2 loss (Mean Square Error, MSE). These loss functions impose a constraint that the output acoustic feature distributions must be Gaussian or Laplacian. In practice, since the human voice band ranges from approximately 300 to 4000 Hz, the loss function will be designed to have more penalty on this range: l o s s = α loss human + ( 1 − α ) loss other {\displaystyle loss=\alpha {\text{loss}}_{\text{human}}+(1-\alpha ){\text{loss}}_{\text{other}}} where loss human {\displaystyle {\text{loss}}_{\text{human}}} is the loss from human voice band and α {\displaystyle \alpha } is a scalar, typically around 0.5. The acoustic feature is typically a spectrogram or Mel scale. These features capture the time-frequency relation of the speech signal, and thus are sufficient to generate intelligent outputs. The Mel-frequency cepstrum feature used in the speech recognition task is not suitable for speech synthesis, as it reduces too much information. == History == In September 2016, DeepMind released WaveNet, which demonstrated that deep learning-based models are capable of modeling raw waveforms and generating speech from acoustic features like spectrograms or mel-spectrograms. Although WaveNet was initially considered to be computationally expensive and slow to be used in consumer products at the time, a year after its release, DeepMind unveiled a modified version of WaveNet known as "Parallel WaveNet," a production model 1,000 faster than the original. This was followed by Google AI's Tacotron 2 in 2018, which demonstrated that neural networks could produce highly natural speech synthesis but required substantial training data—typically tens of hours of audio—to achieve acceptable quality. Tacotron 2 used an autoencoder architecture with attention mechanisms to convert input text into mel-spectrograms, which were then converted to waveforms using a separate neural vocoder. When trained on smaller datasets, such as 2 hours of speech, the output quality degraded while still being able to maintain intelligible speech, and with just 24 minutes of training data, Tacotron 2 failed to produce intelligible speech. In 2019, Microsoft Research introduced FastSpeech, which addressed speed limitations in autoregressive models like Tacotron 2. FastSpeech utilized a non-autoregressive architecture that enabled parallel sequence generation, significantly reducing inference time while maintaining audio quality. Its feedforward transformer network with length regulation allowed for one-shot prediction of the full mel-spectrogram sequence, avoiding the sequential dependencies that bottlenecked previous approaches. The same year saw the release of HiFi-GAN, a generative adversarial network (GAN)-based vocoder that improved the efficiency of waveform generation while producing high-fidelity speech. In 2020, the release of Glow-TTS introduced a flow-based approach that allowed for fast inference and voice style transfer capabilities. In March 2020, the free text-to-speech website 15.ai was launched. 15.ai gained widespread international attention in early 2021 for its ability to synthesize emotionally expressive speech of fictional characters from popular media with minimal amount of data. The creator of 15.ai (known pseudonymously as 15) stated that 15 seconds of training data is sufficient to perfectly clone a person's voice (hence its name, "15.ai"), a significant reduction from the previously known data requirement of tens of hours. 15.ai is credited as the first platform to popularize AI voice cloning in memes and content creation. 15.ai used a multi-speaker model that enabled simultaneous training of multiple voices and emotions, implemented sentiment analysis using DeepMoji, and supported precise pronunciation control via ARPABET. The 15-second data efficiency benchmark was later corroborated by OpenAI in 2024. == Semi-supervised learning == Currently, self-supervised learning has gained much attention through better use of unlabelled data. Research has shown that, with the aid of self-supervised loss, the need for paired data decreases. == Zero-shot speaker adaptation == Zero-shot speaker adaptation is promising because a single model can generate speech with various speaker styles and characteristic. In June 2018, Google proposed to use pre-trained speaker verification models as speaker encoders to extract speaker embeddings. The speaker encoders then become part of the neural text-to-speech models, so that it can determine the style and characteristics of the output speech. This procedure has shown the community that it is possible to use only a single model to generate speech with multiple styles. == Neural vocoder == In deep learning-based speech synthesis, neural vocoders play an important role in generating high-quality speech from acoustic features. The WaveNet model proposed in 2016 achieves excellent performance on speech quality. Wavenet factorised the joint probability of a waveform x = { x 1 , . . . , x T } {\displaystyle \mathbf {x} =\{x_{1},...,x_{T}\}} as a product of conditional probabilities as follows p θ ( x ) = ∏ t = 1 T p ( x t | x 1 , . . . , x t − 1 ) {\displaystyle p_{\theta }(\mathbf {x} )=\prod _{t=1}^{T}p(x_{t}|x_{1},...,x_{t-1})} where θ {\displaystyle \theta } is the model parameter including many dilated convolution layers. Thus, each audio sample x t {\displaystyle x_{t}} is conditioned on the samples at all previous timesteps. However, the auto-regressive nature of WaveNet makes the inference process dramatically slow. To solve this problem, Parallel WaveNet was proposed. Parallel WaveNet is an inverse autoregressive flow-based model which is trained by knowledge distillation with a pre-trained teacher WaveNet model. Since such inverse autoregressive flow-based models are non-auto-regressive when performing inference, the inference speed is faster than real-time. Meanwhile, Nvidia proposed a flow-based WaveGlow model, which can also generate speech faster than real-time. However, despite the high inference speed, parallel WaveNet has the limitation of needing a pre-trained WaveNet model, so that WaveGlow takes many weeks to converge with limited computing devices. This issue has been solved by Parallel WaveGAN, which learns to produce speech through multi-resolution spectral loss and GAN learning strategies.

    Read more →
  • Zamzar

    Zamzar

    Zamzar is an online file converter and compressor, created by brothers Mike and Chris Whyley in England in 2006. It allows users to convert files online, without downloading a software tool, and supports over 1,200 different conversion types. Since its formation, the service has converted over 510 million files for users from 245 different countries. The service supports the conversion of documents, images, audio, video, e-Books, CAD files and compressed file formats. Users can type in a URL or upload one or more files (if they are all of the same format) from their computer; Zamzar will then convert the file(s) to another user-specified format, such as an Adobe PDF file to a Microsoft Word document. Once conversion is complete, users can immediately download the file from their web browser. Users can also choose to receive an email with a link to download the converted file. In February 2021 Zamzar expanded their tool and announced a new file compression service. The compressor is visually similar to the conversion tool with a drag and drop download feature. As with the converter, users have the option to subscribe for a paid plan if they wish to compress multiple or larger files than the free service permits == File conversion API == in 2015 Zamzar launched a file conversion API, allowing users to integrate file conversion capabilities into their own websites and applications. Sample code is provided to allow users to integrate file conversion capabilities in C#, Java, Node.js, PHP, Python and cURL. Zamzar also maintains a project on GitHub which allows users to perform file conversion from the command line on Linux, MacOS or Windows systems. == Email file conversion == It is also possible to send files for conversion by emailing them to Zamzar. Zamzar launched this capability in 2012, allowing users to email files to dedicated email addresses for the file to be automatically converted to a different format. A link is then emailed back to the end user to allow them to download their converted file. == User privilege levels == Zamzar is currently free to use, but there is a limit of two conversions per hour for files up to 100MB. Users can pay a monthly subscription in order to access preferential features, such as unlimited file conversions, online file management, shorter response and queuing times and other benefits. == Name == Its name comes from Franz Kafka's The Metamorphosis. Its main character is called Gregor Samsa and it is from his surname that Zamzar is derived. The founders of the service considered three other names – Konvertieren, Khamailen and Obrogo – before settling on Zamzar.

    Read more →
  • Coronavirus breathalyzer

    Coronavirus breathalyzer

    A coronavirus breathalyzer is a diagnostic medical device enabling the user to test with 90% or greater accuracy the presence of severe acute respiratory syndrome coronavirus 2 in an exhaled breath. As of the first half of 2020, the idea of a practical coronavirus breathalyzer was concomitantly developed by unrelated research groups in Australia, Canada, Finland, Germany, Indonesia, Israel, Netherlands, Poland, Singapore, United Kingdom and the USA. == Australia == In Australia, GreyScan CEO Samantha Ollerton and Prof. Michael Breadmore of the University of Tasmania are basing a coronavirus breathalyzer on existing technology that is used around the world to detect explosives. Another invention published from ABC News; produced by Colin Hickey and Examin Holdings, have released information on a new breathalyzer called the "Queensland Breath test" claiming its function has 98% efficiency, equipped with a replaceable plastic nozzle for reusability (February 2022). a statement in claim by Bruce Thompson, a professor at Swinburne University of Technology, Although this products is reliable, due to insufficient funding, the product is inaccessible. == Canada == Canary Health Technologies, headquartered in Toronto with offices in Cleveland, Ohio, is developing a breathalyzer with disposable nanosensors using AI-powered cloud-based analysis. According to a press release, clinical trials began in India during November 2020. The stated goal is to develop an accurate, reasonably priced screening tool that can be used anywhere and deliver a result in less than a minute. The company postulates that analyzing volatile organic compounds in human breath could potentially detect diseases before the on-set of symptoms, earlier than currently available methods. Moreover, the cloud-based technology is designed to be used as a disease surveillance apparatus. == Finland == By the end of June 2020, Forum Virium Helsinki, in collaboration with Finnish software firm Deep Sensing Algorithms, funded by the Helsinki-Uusimaa Regional Council, announced that testing of their device had begun with a control group in Kazakhstan, with plans to expand to the Netherlands, the United States, South Africa, Brazil and Finland throughout the summer. The efficacy of the Forum Virium Helsinki / Deep Sensing Algorithms device hinges on its AI component. "We are engaged in innovative cooperation with corporations to solve the coronavirus crisis, and we will help firms to use the city as a development platform. We are utilizing artificial intelligence and digitalization," said Forum Virium Helsinki CEO Mika Malin. == Germany == In March 2020, the Singaporean company RAM Global conducted research in Germany in hopes of developing a one-minute breathalyzer test for SARS-CoV-2 based on terahertz time-domain spectroscopy. The company attempted to develop a disposable test kit for direct detection of COVID-19 virion particles in breath, saliva and swab samples. On 31 March, RAM Global completed an initial clinical study on live patients at University Hospital Saarland. In April, the company pursued a small unknown sample study in which hospital doctors provided unknown samples in order to test accuracy in differentiating positive and negative samples. == Indonesia == Since April 2020, a team of researchers from Gadjah Mada University (UGM) has been developing an electronic nose called GeNose C19. The GeNose C19 can be used as a rapid, non-invasive screening tool in less than two minutes. A profiling test was carried out at the Bhayangkara Hospital and the Covid Bambanglipuro Special Field Hospital in Yogyakarta. GeNose C19 consists of gas sensors and an artificial intelligence-based pattern recognition system. The diagnostic test was carried out with the cooperation of nine multi-center hospitals. In the end of December 2020, GeNose C19 received a distribution permit from Indonesia's Health Ministry. Initially, 100 units will be released and each device will be able to perform 120 tests per day. The test is estimated to cost 15,000–25,000 Indonesian rupiah ($1–$1.8) and would take three minutes for the test and another two minutes to yield a result. Researchers hope to manufacture up to 1,000 GeNose C19 units, increasing the country's testing capabilities by 120 thousand subjects per day. Moreover, they aim to manufacture 10,000 units by February 2021. == Israel == In Israel, it is at the photonics lab of Gabby Sarusi, professor at Ben-Gurion University of the Negev, that research is underway as of midsummer 2020. Separately from Sarusi's project, in July 2020, it was reported that Israeli start-up Nanoscent in cooperation with Sheba Medical Center had devised a breathalyzer that Magen David Adom (MDA) is seeking to incorporate into existing drive-thru testing stations located throughout the country. Questionable intellectual property of Gabby Sarusi regarding this project is now under discussion in the court in Israel. == The Netherlands == A breath test with the SpiroNose device, made by the Dutch company Breathomix, has been developed and tested in collaboration with the Leiden University Medical Center (LUMC), Franciscus Gasthuis & Vlietland and the GGD Amsterdam. The breath test has been validated as a pre-screening test for people who have no or mild symptoms of COVID-19. From April 2021, the device was operational in COVID-19 test drive-ins, conferences and events, i.e. Eurovision Song Contest 2021. Subjects must abstain from alcohol for eight hours prior to taking the breath test. The SpiroNose contains four sets of seven different sensors that can measure the mixture of volatile organic compounds (biomarkers) in the exhaled air. These VOCs provide a picture of a person's metabolism. This 'breath profile' is forwarded to an online analysis platform. Here the breath profile is compared with other breath profiles of people with and without a COVID-19 diagnosis and analysed by algorithms. Data-analysis involves advanced signal processing and statistics based on independent t-tests followed by linear discriminant and ROC analysis. The test result is known within minutes. The breath test has a sensitivity/specificity for SARS-CoV-2 infection of 100/78, >99/84, 98/82% in validation, replication and asymptomatic cohorts of patients. The breath test reliably detects who is not infected. Such a subject will receive a test result immediately. Other subjects must promptly conduct a subsequent test, for example a PCR test or LAMP test. The test results can be viewed by the client and are not automatically interfaced to other databases, i.e. for public health surveillance, source and contact tracing, vaccination programs. In July 2021, the ministry stopped the tests with the SpiroNose because, according to the GGD, the device gives unusable results in some cases. Breathomix indicates that this is the result of the way in which the SpiroNose is deployed. The SpiroNose is and remains a reliable instrument for lung diseases. The analysis platform is developed conform the requirements of the standard ISO 27001 (Information Security) and NEN 7510 (Information Security in Health Care). A CE marking has been requested. In the meantime, the Dutch minister has granted a CE marking exemption on 25 January 2021. The device may also be used to detect other diseases, e.g., asthma, COPD, lung cancer, interstitial lung diseases (ILD). == Poland == In February 2021, the President of Poland, Andrzej Duda, announced that ML System S. A., headquartered in Zaczernie, Poland, had successfully developed a means of analyzing a patient's breath to test for the presence of coronavirus. According to an anonymous press release, test subjects exhale into a device in order to determine the presence of the coronavirus. The procedure, similar to that of a police breathalyzer, is said to take less than ten seconds. Independent clinical trials were begun in April 2021. In the first half of May 2021, a brief text concerning partial results was published by ML System, stating that independent clinical trials were successful with specificity (97,15%) and accuracy/sensitivity (86,86%), for CT (Cycle Threshold) assumed at 25, which is in line with the guidelines set out by the World Health Organization. Moreover, ML System in partnership with Rzeszów–Jasionka Airport published a statement indicating their intention to test the device at the airport. Similar plans exist between the manufacturer and the Warsaw Chopin Airport. Two large networks of laboratories in Poland, "Diagnostyka" and "ALAB Laboratoria", have signed a letter of intent with ML System. In agreement with ALAB, the parties declared cooperation in the implementation of the product named "COVID DETECTOR" on the Polish, German and Ukrainian markets. In addition, the companies declared joint activities aimed at extending the diagnosis with the use of "COVID Detector" to include mutations of the SARS-CoV-2 virus, differentiate the stage of the disease and ot

    Read more →
  • 17776

    17776

    17776 (also known as What Football Will Look Like in the Future) is a serialized speculative fiction multimedia narrative by Jon Bois, published online through SB Nation. Set in the distant future in which all humans have become immortal and infertile, the series follows three sapient space probes that watch humanity play an evolved form of American football in which games can be played for millennia over distances of thousands of miles. The series debuted on July 5, 2017, and new chapters were published daily until the series concluded with its twenty-fifth chapter on July 15, 2017. Bois began developing 17776 in 2016. Because the story incorporates text, animated GIFs, still images, and videos hosted on YouTube, new tools were developed to allow it to be hosted efficiently on the SB Nation website. The work explores themes of consciousness, hope, despair, and why humans play sports. 17776 was well received by critics, who praised it for its innovative use of its medium and for the depth of emotion it evoked. In 2018, the story won a National Magazine Award for Digital Innovation and was longlisted for both the Hugo Awards for Best Novella and Best Graphic Story. It is followed by a sequel series: 20020, released from September to October 2020. The sequel series follows a 111-team game of college football on fields spanning 130,000 miles (210,000 km) across the United States. Bois originally intended to follow up with a further series entitled 20021; however, it was postponed indefinitely. In May 2025, Bois announced that the series would be continued with a novel titled 50007: An American Football Odyssey. == Premise == The story takes place on a future Earth where humans stopped dying, aging, and being born on April 7, 2026. All social ills were subsequently eliminated, and technology preventing humans from any injury was developed. In the United States, American football evolved to include new rules, including those that allow fields thousands of miles long, hundreds of in-game players, and games millennia long. Over time, computers gained sentience due to constant exposure to broadcast human data. By the year 17776, the space probe Pioneer 9 (called Nine) has gained sentience and made contact with Pioneer 10 (called Ten) and the Jupiter Icy Moons Explorer (called Juice). As Nine adjusts to a world radically different from that of the 20th century, the three space probes watch multiple football games occurring across the United States: a game using the entirety of Nebraska as a field in which the next point scored wins the game; a game in which players strive to possess every existing football autographed by obscure NFL player Koy Detmer; a game played between the Canadian border and the Mexican border deadlocked for 13,000 years at the bottom of a gorge in Arizona; an NFL regulation game between the Denver Broncos and the Pittsburgh Steelers that changed over 15,000 years into 58 playing teams owning and capitalizing upon portions of Sports Authority Field at Mile High while the ball is lost; a 500 game that results in the destruction of the Centennial Light; and a game in which the possessing player is attempting to score an automatic win by hiding in his team's end zone for 10,000 years. == Format == 17776 is read by scrolling through web pages occupied by large GIF images and colored dialogue text, interspersed with occasional YouTube videos. The story is divided into chapters, which were originally published in daily installments between July 5 and 15, 2017. Much of the GIF and video content of the series uses Google Earth satellite imagery, 3D buildings, and other tools within Google Earth to create animations and visual effects. == Development == Bois wrote and illustrated 17776 for Vox Media's sports news website SB Nation, of which he is creative director. Aside from 17776, Bois produces two other recurring, humorous video essay programs for the site: Pretty Good, which focuses on unusual sports topics and stories, and Chart Party, which focuses on statistics and has an emphasis on Bois' use of visual art in his journalism and storytelling. Bois is also known for the Breaking Madden series, in which he attempted unusual scenarios in the Madden NFL series of video games. In early 2016, Bois began developing an "anti-sci fi" project as a possible sequel to The Tim Tebow CFL Chronicles, an earlier work for SB Nation, and set the story in a year far enough in the future that "nobody ever thinks about it." Although he liked the concept and the visuals, he believed the project would not connect with readers and shelved it. Later, he realized that the story needed a centering character; he wrote one in the form of a small town, AM radio talk show host before coming up with the characters of the probes. Development renewed in May 2016, and the project solidified after SB Nation published its article "The Future of Football." Bois described it as the biggest project he ever attempted. The series was developed by Graham MacAree, who used a Vox Media tool that creates custom packages from standard article sets to give Bois creative leeway and to accommodate the series' weight on the SB Nation website. MacAree found that there were few resources online for achieving the desired effects. == Themes == Bois has stated that he had "conceived [17776] to give the reader a good time," asserting that this "was literally the whole point." William Hughes writing for The A.V. Club described 17776 as concerned with why humans play sports: "That is, given the massive resources, time, and information at our disposal (not to mention those available to our descendants), why does communal game-playing still hold such an important place in society?" He also listed consciousness, hope, and despair as among the work's themes. Beth Elderkin of io9 described it as "a deep thought experiment into what we consider humanly possible". She also felt that Ten and Juice take on the role of angel and devil, and she suggested the two may be unreliable narrators. Ian Crouch of The New Yorker felt that the work had a "tonal echo" of Don DeLillo's 1972 novel End Zone due to thematic similarities "with the way that the order and logic of football might act as a counterbalance to the chaos of the real world". == Reception == According to the communications director at Vox Media, 17776 garnered over 2.3 million pageviews by July 10. Two days later, it had received more than 2.9 million pageviews. Average engagement time was over nine minutes, and 43 percent of readers finished each installment of the series published by July 7. On July 19, Bois claimed that 17776 received 700,000 unique visitors and 4 million total pageviews, with an average engagement time of 11 minutes. Thu-Huong Ha for Quartz described 17776 as "part Italo Calvino, part Peter Heller [author of The Dog Stars], with humor seemingly from within the depths of Reddit," saying that the story would appeal to fans of both sports and literature. Tor.com described the first chapter as full of tension and felt that receiving answers is a "surprisingly heartbreaking" experience "lessened by a gleeful bouncing immaturity" one would not expect from the characters. Beth Elderkin at io9 said the series is "akin to Homestuck" and described it as "weird, complex, and pretty spectacular". William Hughes writing for The A.V. Club felt that 17776 is a "truly innovative piece of work". After reading the first three chapters, Agatha French of the Los Angeles Times stated that she was "impressed and excited by the innovation" of what she saw, and that she was intrigued despite not knowing what the work is or is saying. She felt the work took full advantage of its online medium and suggested that it "may also be a glimpse into the future of reading on the Internet". Ian Crouch of The New Yorker described the series as, "despite its seemingly meagre parts, a thing of startling beauty". Of the chapters published by July 12, he felt "the most striking chapter" to be one that used audio of Verne Lundquist calling the end of a 2013 game between the University of Alabama and Auburn University over a video panning over Earth. He also noted that the series was compared to Homestuck and relayed additional comparisons to Thomas Pynchon novels and "a Reddit thread hijacked by robot trolls". The series won the inaugural National Magazine Award for Digital Innovation from the American Society of Magazine Editors; this was the first National Magazine Award nomination and win for SB Nation. It was described by the judges as "an extraordinary combination of art, fiction and technology, an online acid trip that had to be experienced to be believed." It was also longlisted for the Hugo Awards for Best Novella and Best Graphic Story in 2018, ultimately finishing in 11th place in both categories. == Sequel series == On September 28, 2020, a sequel titled 20020 was launched on Secret Base, a branch of SB Nation; on October 13, it was revea

    Read more →
  • AI Seoul Summit 2024

    AI Seoul Summit 2024

    The AI Seoul Summit 2024 was an event in May 2024 co-hosted by the South Korean and British governments. The Seoul Declaration was adopted to address artificial intelligence technology and related challenges and opportunities. == Background == The AI Seoul Summit is the second such meeting following the AI Safety Summit held in the United Kingdom in November 2023. In the Bletchley Declaration, the participating countries agreed to prioritize identifying AI safety risks of shared concern, a shared concern, but at the Seoul Summit, the leaders also recognized the importance of AI. == Notable attendees == The summit was attended by the leaders of Group of Seven countries, including the United States, Canada, France, and Germany, South Korea, Singapore and Australia, representatives of the United Nations, the Organisation for Economic Co-operation and Development, and the European Union. Also in attendance were representatives of global companies such as Tesla CEO Elon Musk, Samsung Electronics Chairman Lee Jae-yong, ChatGPT maker OpenAI, Google, Microsoft, Meta, and South Korea's top portal operator Naver. == Topics == === South Korean AI safety center === "South Korea will push forward with the establishment of an AI safety research center in Korea and join a network to boost the global safety of AI." Minister of Science, Lee Jong-ho said that South Korea was planning to open an AI Safety Institute in 2024. He also expressed his intention to strengthen cooperation for the development of international standards. === Seoul Declaration for Safe, Innovative and Inclusive AI === The Seoul Declaration was adopted at the summit by leaders representing the EU, the US, the UK, Australia, Canada, Germany, France, Italy, Japan, South Korea, and Singapore. The declaration is a commitment to foster international cooperation to help develop AI governance frameworks that are interoperable between countries, partly by integrating the Hiroshima Process International Code of Conduct for Organizations Developing Advanced AI Systems. It advocates for the development of human-centric AI in collaboration with the private sector, academia, and civil society. === Seoul Ministerial Statement for advancing AI safety === At the ministerial meeting of the summit, the Seoul Ministerial Statement, a joint statement calling for the improvement of the safety, innovation, and inclusivity of AI technologies, was adopted by ministers from Australia, Canada, Chile, France, Germany, India, Indonesia, Israel, Italy, Japan, Kenya, Mexico, the Netherlands, Nigeria, New Zealand, the Philippines, South Korea, Rwanda, Saudi Arabia, Singapore, Spain, Switzerland, Turkey, Ukraine, the United Arab Emirates, the UK, and the US, as well as an EU representative. It aims to develop low-power chips as the AI industry rapidly expands and massive consumption is expected. == Global AI Summit series ==

    Read more →
  • Log shipping

    Log shipping

    Log shipping is the process of automating the backup of transaction log files on a primary (production) database server, and then restoring them onto a standby server. This technique is supported by Microsoft SQL Server, 4D Server, MySQL, and PostgreSQL. Similar to replication, the primary purpose of log shipping is to increase database availability by maintaining a backup server that can replace a production server quickly. Other databases such as Adaptive Server Enterprise and Oracle Database support the technique but require the Database Administrator to write code or scripts to perform the work. Although the actual failover mechanism in log shipping is manual, this implementation is often chosen due to its low cost in human and server resources, and ease of implementation. In comparison, SQL server clusters enable automatic failover, but at the expense of much higher storage costs. Compared to database replication, log shipping does not provide as much in terms of reporting capabilities, but backs up system tables along with data tables, and locks the standby server from users' modifications. A replicated server can be modified (e.g. views) and is therefore unsuitable for failover purposes.

    Read more →
  • Content-based image retrieval

    Content-based image retrieval

    Content-based image retrieval, also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR), is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases (see this survey for a scientific overview of the CBIR field). Content-based image retrieval is opposed to traditional concept-based approaches (see Concept-based image indexing). "Content-based" means that the search analyzes the contents of the image rather than the metadata such as keywords, tags, or descriptions associated with the image. The term "content" in this context might refer to colors, shapes, textures, or any other information that can be derived from the image itself. CBIR is desirable because searches that rely purely on metadata are dependent on annotation quality and completeness. == Comparison with metadata searching == An image meta search requires humans to have manually annotated images by entering keywords or metadata in a large database, which can be time-consuming and may not capture the keywords desired to describe the image. The evaluation of the effectiveness of keyword image search is subjective and has not been well-defined. In the same regard, CBIR systems have similar challenges in defining success. "Keywords also limit the scope of queries to the set of predetermined criteria." and, "having been set up" are less reliable than using the content itself. == History == The term "content-based image retrieval" seems to have originated in 1992 when it was used by Japanese Electrotechnical Laboratory engineer Toshikazu Kato to describe experiments into automatic retrieval of images from a database, based on the colors and shapes present. Since then, the term has been used to describe the process of retrieving desired images from a large collection on the basis of syntactical image features. The techniques, tools, and algorithms that are used originate from fields such as statistics, pattern recognition, signal processing, and computer vision. === QBIC - Query By Image Content === The earliest commercial CBIR system was developed by IBM and was called QBIC (Query By Image Content). Recent network- and graph-based approaches have presented a simple and attractive alternative to existing methods. While the storing of multiple images as part of a single entity preceded the term BLOB (Binary Large OBject), the ability to fully search by content, rather than by description, had to await IBM's QBIC. === VisualRank === == Technical progress == The interest in CBIR has grown because of the limitations inherent in metadata-based systems, as well as the large range of possible uses for efficient image retrieval. Textual information about images can be easily searched using existing technology, but this requires humans to manually describe each image in the database. This can be impractical for very large databases or for images that are generated automatically, e.g. those from surveillance cameras. It is also possible to miss images that use different synonyms in their descriptions. Systems based on categorizing images in semantic classes like "cat" as a subclass of "animal" can avoid the miscategorization problem, but will require more effort by a user to find images that might be "cats", but are only classified as an "animal". Many standards have been developed to categorize images, but all still face scaling and miscategorization issues. Initial CBIR systems were developed to search databases based on image color, texture, and shape properties. After these systems were developed, the need for user-friendly interfaces became apparent. Therefore, efforts in the CBIR field started to include human-centered design that tried to meet the needs of the user performing the search. This typically means inclusion of: query methods that may allow descriptive semantics, queries that may involve user feedback, systems that may include machine learning, and systems that may understand user satisfaction levels. == Techniques == Many CBIR systems have been developed, but as of 2006, the problem of retrieving images on the basis of their pixel content remains largely unsolved. Different query techniques and implementations of CBIR make use of different types of user queries. === Query By Example === QBE (Query By Example) is a query technique that involves providing the CBIR system with an example image that it will then base its search upon. The underlying search algorithms may vary depending on the application, but result images should all share common elements with the provided example. Options for providing example images to the system include: A preexisting image may be supplied by the user or chosen from a random set. The user draws a rough approximation of the image they are looking for, for example with blobs of color or general shapes. This query technique removes the difficulties that can arise when trying to describe images with words. === Semantic retrieval === Semantic retrieval starts with a user making a request like "find pictures of Abraham Lincoln". This type of open-ended task is very difficult for computers to perform - Lincoln may not always be facing the camera or in the same pose. Many CBIR systems therefore generally make use of lower-level features like texture, color, and shape. These features are either used in combination with interfaces that allow easier input of the criteria or with databases that have already been trained to match features (such as faces, fingerprints, or shape matching). However, in general, image retrieval requires human feedback in order to identify higher-level concepts. === Relevance feedback (human interaction) === Combining CBIR search techniques available with the wide range of potential users and their intent can be a difficult task. An aspect of making CBIR successful relies entirely on the ability to understand the user intent. CBIR systems can make use of relevance feedback, where the user progressively refines the search results by marking images in the results as "relevant", "not relevant", or "neutral" to the search query, then repeating the search with the new information. Examples of this type of interface have been developed. === Iterative/machine learning === Machine learning and application of iterative techniques are becoming more common in CBIR. === Other query methods === Other query methods include browsing for example images, navigating customized/hierarchical categories, querying by image region (rather than the entire image), querying by multiple example images, querying by visual sketch, querying by direct specification of image features, and multimodal queries (e.g. combining touch, voice, etc.) == Content comparison using image distance measures == The most common method for comparing two images in content-based image retrieval (typically an example image and an image from the database) is using an image distance measure. An image distance measure compares the similarity of two images in various dimensions such as color, texture, shape, and others. For example, a distance of 0 signifies an exact match with the query, with respect to the dimensions that were considered. As one may intuitively gather, a value greater than 0 indicates various degrees of similarities between the images. Search results then can be sorted based on their distance to the queried image. Many measures of image distance (Similarity Models) have been developed. === Color === Computing distance measures based on color similarity is achieved by computing a color histogram for each image that identifies the proportion of pixels within an image holding specific values. Examining images based on the colors they contain is one of the most widely used techniques because it can be completed without regard to image size or orientation. However, research has also attempted to segment color proportion by region and by spatial relationship among several color regions. === Texture === Texture measures look for visual patterns in images and how they are spatially defined. Textures are represented by texels which are then placed into a number of sets, depending on how many textures are detected in the image. These sets not only define the texture, but also where in the image the texture is located. Texture is a difficult concept to represent. The identification of specific textures in an image is achieved primarily by modeling texture as a two-dimensional gray level variation. The relative brightness of pairs of pixels is computed such that degree of contrast, regularity, coarseness and directionality may be estimated. The problem is in identifying patterns of co-pixel variation and associating them with particular classes of textures such as silky, or rough. Other methods of classifying textures include: Co-occurrence matrix Laws texture energy Wavelet transform Orthogonal transforms (discrete Chebyshev moments) =

    Read more →
  • Computer-assisted legal research

    Computer-assisted legal research

    Computer-assisted legal research (CALR) or computer-based legal research is a mode of legal research that uses databases of court opinions, statutes, court documents, and secondary material. Electronic databases make large bodies of case law easily available. Databases also have additional benefits, such as Boolean searches, evaluating case authority, organizing cases by topic, and providing links to cited material. Databases are available through paid subscription or for free. Subscription-based services include Westlaw, LexisNexis, JustCite, HeinOnline, Bloomberg Law, Lex Intell, VLex and LexEur. As of 2015, the commercial market grossed $8 billion. Free services include OpenJurist, Google Scholar, AltLaw, Ravel Law, WIPO Lex, Law Delta and the databases of the Free Access to Law Movement. == Purposes == Computer-assisted legal research is undertaken by a variety of actors. It is taught as a topic in many law degrees and is used extensively by undergraduate and postgraduate law students in meeting the work requirements of their degree courses. Professors of Law rely on the digitization of primary and secondary sources of law when conducting their research and writing the material that they submit for publication. Professional lawyers rely on computer-assisted legal research in order to properly understand the status of the law and so to act effectively in the best interest of their client. They may also consult the text of case judgements and statutes specifically, as well as wider academic comment, in order to form the basis of (or response to) an appeal. The availability of legal information online differs by type, jurisdiction and subject matter. The types of information available include: Texts of statutes, statutory instruments, civil codes, etc. Explanatory notes and government publications relating to statutes and their operation Texts of governing documents such as constitutions and treaties Case judgements Journals on legal matters or legal theory Dictionaries and legal encyclopedia Legal texts and materials in the form of e-books Current affairs and market information Educational information on the law and its operation == Before the Internet == Prior to the advent and popularization of the World Wide Web, access to digital legal information was largely through the use of CD-ROMs, designed and sold by commercial organizations. Dial-up services were also available from the 1970s. As the use of the Internet spread in the early 1990s, companies such as LexisNexis and Westlaw incorporated Internet connectivity into their software packages. Browser-based legal information started to be published by Legal Information Institutes from 1992. == Publicly available information == The first effort to provide free computer access to legal information was made by two academics, Peter Martin and Tom Bruce, in 1992. Today, the Legal Information Institute freely publishes such resources as the text of the United States Constitution, judgements of the United States Supreme Court, and the text of the United States Code. The Australasian Legal Information Institute (AusLII) was established soon after in 1995. Other legal information institutes, such as those of Great Britain and Ireland (BAILII), Canada (CII) and South Africa (SAfLI) soon followed. LIIs were partially formalized in 2002 following the signing of the Declaration of Free Access to the Law, which has been signed by 54 countries. At the time of writing, the World Legal Information Institute contains in excess of 1800 databases from 123 jurisdictions. Many governments also publish legal information online. For example, UK legislation and statutory instruments have been publicly available online since 2010. Depending on the jurisdiction in question, the decisions of higher appellate courts may also be published online, either by the Legal Information Institute or by the court service directly. Sources of European Union Law are published for free by EUR-Lex in 23 languages, including judgments of the European Courts. Similarly, judgements of the European Court of Human Rights are published on its website.

    Read more →
  • The Machine That Won the War (short story)

    The Machine That Won the War (short story)

    "The Machine That Won the War" is a science fiction short story by American writer Isaac Asimov. The story first appeared in the October 1961 issue of The Magazine of Fantasy & Science Fiction, and was reprinted in the collections Nightfall and Other Stories (1969) and Robot Dreams (1986). It was also printed in a contemporary edition of Reader's Digest, illustrated. It is one of a loosely connected series of such stories concerning a fictional supercomputer called Multivac. == Plot summary == Three influential leaders of the human race meet in the aftermath of a successful war against the Denebians. Discussing how the vast and powerful Multivac computer was a decisive factor in the war, each of the men admits that in fact, he falsified his part of the decision process because he felt that the situation was too complex to follow normal procedures. John Henderson, Multivac's Chief Programmer, admits that he altered the data being fed to Multivac, since the populace could not be trusted to report accurate information in the current situation. Max Jablonski then admits that he altered the data that Multivac produced, since he knew that Multivac was not in good working order due to manpower and spare parts shortage. Finally, Lamar Swift, executive director of the Solar Federation, reveals that he had not trusted the reports produced by Multivac, and had made the final decisions purely on the toss of a coin.

    Read more →
  • Core FTP

    Core FTP

    Core FTP LE is a freeware secure FTP client for Windows, developed by CoreFTP.com. Features include FTP, SSL/TLS, SFTP via SSH, and HTTP/HTTPS support. Secure FTP clients encrypt account information and data transferred across the internet, protecting data from being seen, or sniffed across networks. Core FTP is a traditional FTP client with local files displayed on the left, remote files on the right. Core FTP Server is a secure FTP server for Windows, developed by CoreFTP.com, starting in 2010. == Licensing == CoreFTP LE is free for personal, educational, non-profit, and business use.

    Read more →
  • Smart speaker

    Smart speaker

    A smart speaker is a type of loudspeaker and voice command device with an integrated virtual assistant that offers interactive actions and hands-free activation with the help of one "wake word" (or several "wake words"). Some smart speakers also act as smart home hubs by using Wi-Fi, Bluetooth, Thread, and other protocol standards to extend usage beyond audio playback and control home automation devices connected through a local area network. == History == Early voice-activated devices began in 2013 with MIT's Jasper project, which used multiple microphones and cloud software to power hands-free interactions from across a room. The first commercial smart speaker was the Amazon Echo, which was released in 2014 powered by Alexa and a ring of far-field microphones. Google followed in 2016 with Home, powered by Google Assistant. By 2017, devices like the Echo Show and Home Hub (later called Nest Hub) added touchscreens and video, creating the "smart display" subcategory. In 2018, Apple joined the smart speaker trend by launching the HomePod, which focused on high-quality audio alongside their built-in assistant Siri. ASUS release its own smart Speaker Xiao-Bu in 2019 with Artificial Intelligence, it terminates the Cloud Service on June 1st, 2025, which means all real-time service such as weather, news, currency conversion is affected. Sonos's 1st smart speaker Sonos One released in 2017, powered by Alexa. Invoke by Harman Kardon was powered by Microsoft's intelligent personal assistant, Cortana. In the early 2020s, smart speakers gained on-device voice processing for faster responses and improved privacy. New standards such as Matter and Thread allowed multitudes of smart-home devices (even from completely different brands) to work together. == Features == === Audio and Voice === Smart speakers use multiple microphones along with noise-cancelling software to pick up your voice from across the room, even when music is playing or the assistant is already talking. Noise suppression and echo cancellation is also used by the speaker so it can focus in on who is talking and ignore any background noises. Most smart speaker models can recognize who is speaking by voiceprint, which allows the speaker to grab information from that person's calendar, preferences, or music playlists. Listening to music on a speaker is when importance for good audio quality becomes apparent. Entry-level (cheaper) speakers such as the Home Mini or the Echo Dot have a single full-range driver. These lower-end speakers typically aren't great for listening to music as the audio quality is pretty poor. More advanced units such as the Home Max or Echo Studio have separate tweeters and woofers meant for listening to music in high quality. === Connectivity and smart-home control === Most connect over Wi-Fi or Bluetooth and support hub protocols like Thread and Matter. That lets them not only stream and play music but also allows you to control various brands of smart lights, thermostats, door locks, cameras, and much more-all from one point of control. Each can have its own designated interface and features in-house, usually launched or controlled via application or home automation software. These devices are able to communicate with each other via peer-to-peer connection through mesh networking. These speakers and related smart devices are typically controlled with one smartphone application. === Assistant services and skills === The built-in assistants handle timers, alarms, reminders, news briefings, weather updates, send messages to other smart devices, send texts, make calls, and simple questions. You can combine actions together in what are typically known as routines (for example saying "good morning" turns on lights, starts the coffee, says the weather, and reads the news) and add extra functions known as skills or actions (for things like ordering food or playing trivia games). This hands-free use of smart speakers can help assist those with disabilities. Most other technologies need the user to be able to physically interact with the device. Smart speakers are not bound by these limitations and can serve as an excellent tool for those who are unable to use their arms or legs or have vision issues. Although these tasks can be completed by a phone or computer, consumers tend to lean towards smart speakers due to factors such as their range being much greater than that of a phone and the need to not have to physically interact with the speaker to get the voice assistant as with most smartphones, certain parts of a phone may need to be interacted with to activate the speaking assistant. === Smart displays === Some smart speakers also include a screen to show the user a visual response. A smart speaker with a touchscreen is known as a smart display; these integrate a conversational user interface with display screens to augment voice interaction with images and video. They are powered by one of the common voice assistants and offer additional controls for smart home devices, feature streaming apps, and web browsers with touch controls for selecting content. The first smart displays were introduced in 2017 by Amazon (Amazon Echo Show) and Google (Google/Nest Home Hub). Hotel chain Marriott International partnered with Amazon to install Echo devices in select hotels since 2018. A Taiwanese startup, Aiello, launched the Aiello Voice Assistant (AVA) in the Asian hotel market in 2019, claiming it is powered by a multi-AI model system. Angie by Nomadix, which is similar to the Amazon Echo, launched its first product in 2017, specifically targeting hotel properties in the North America. In May 2019, Angie Hospitality acquired the assets of Roxy, a competitor that also built its own speech-enabled virtual assistant technology for hotels. This acquisition merged two proprietary NLP stacks into the current Nomadix product. === Artificial intelligence === The newest speakers can use on-device AI or cloud-based generative models to allow the smart speaker to carry on much more natural conversations, draft emails or recipes, suggest ideas based on context, or even create short pieces of music or art. This AI evolution allows these speakers to do far more than what they could do before. == Accuracy == According to a study by Proceedings of the National Academy of Sciences of the United States of America released In March 2020, the six biggest tech development companies, Amazon, Apple, Google, Yandex, IBM and Microsoft, have misidentified more words spoken by "black people" than "white people". The systems tested errors and unreadability, with a 19 and 35 percent discrepancy for the former and a 2 and 20 percent discrepancy for the latter. The North American Chapter of the Association for Computational Linguistics (NAACL) also identified a discrepancy between male and female voices. According to their research, Google's speech recognition software is 13 percent more accurate for men than women. It performs better than the systems used by Bing, AT&T, and IBM. == Privacy concerns == The built-in microphone in smart speakers is continuously listening for wake words followed by a command. However, these continuously listening microphones also raise privacy concerns among users. According to a survey taken by 1,007 people in Western Europe, it is clear that privacy is the biggest concern holding consumers back from buying "smart" products. these concerns include what is being recorded, how the data will be used, how it will be protected, and whether it will be used for invasive advertising. Furthermore, an analysis of Amazon Echo Dots showed that 30–38% of "spurious audio recordings were human conversations", suggesting that these devices capture audio other than strictly detection of the wake word. === As a wiretap === There are strong concerns that the ever-listening microphone of smart speakers presents a perfect candidate for wiretapping. In 2017, British security researcher Mark Barnes showed that pre-2017 Echos have exposed pins which allow for a compromised OS to be booted. According to Umar Iqbal, an assistant professor at Washington University in St. Louis, research indicates that data from consumer interactions with Alexa was used to targeted advertisements and products to consumer with over 40% of transmitted data lacking proper encryption raising privacy concerns. Further data indicates that due to the Smart Speakers ability to always capture audio, it begins to pick up on external conversations from consumers not related to commands given to the smart speaker. Things such as other members in the household, consumers on the phone and even TV audio can be picked up by these speakers and stored for future use by companies. === Voice assistance vs privacy === While voice assistants provide a valuable service, there can be some hesitation towards using them in various social contexts, such as in public or around other users. However, only more recently have users begun interac

    Read more →
  • The Sword in the Stoned

    The Sword in the Stoned

    "The Sword in the Stoned" is the fifth episode of the second season of the American fantasy comedy television series Ted. Written by Julius Sharpe, and directed by Seth MacFarlane, it premiered on the American streaming service Peacock, along with the rest of season two, on March 5, 2026. The series acts as a precursor to the Ted film franchise, showcasing the childhood lives of the protagonists. The series, set in 1994, focuses on John Bennett (Max Burkholder), the series' primary protagonist, an awkward high-school aged boy; along with Ted (MacFarlane), the series' titular anthropomorphic teddy bear. The two live with John's family, Susan (Alanna Ubach), his mild mannered mother, and Matty (Scott Grimes), his conservative father. Also residing with the family is Blaire (Giorgia Whigham), his radically liberal cousin whom often clashes with Matty. In the episode, Ted and John join the school play so they can have more extracurricular activities for their college applications, but the latter grows a connection with the school's popular teenager, Erin (Francesca Xuereb). Concurrently, Susan and Matty get a job at Dunkin' Donuts to help with their financial troubles, and Matty is given an opportunity to tell off Bill Clinton. Burkholder wore prop armor during the episode's play scenes. Bill Clinton’s appearance in the episode was portrayed by MacFarlane. After conventional makeup and visual techniques failed to convincingly resemble Clinton, the production used artificial intelligence to digitally replace MacFarlane's face with Clinton's likeness. Upon release, the episode received generally positive reviews from critics, though the use of AI in the Clinton scene was polarizing among audiences and reviewers. == Plot == John tells Ted that he is the last single guy left at their school, to which Ted points out the popular, single cheerleader, Erin, but John dismisses this. At home, Blaire tells John that he needs extracurricular activities to get into college, while Susan and Matty discuss their financial troubles, especially regarding John's college tuition. Looking over their options, they decide to audition for a school production of the play Camelot. Matty takes a job at Dunkin' Donuts, despite being told that nobody will give him a tip, and having to wear an incorrect name tag. Waiting for their auditions, John and Ted watch several poor auditions for the play before seeing Erin's, who delivers a flawless performance; John and Ted do less serious auditions, getting cast as knights, while Erin gets the role of Guinevere. Matty complains about his low salary, and Susan decides to get a job at Dunkin' Donuts beside him to help earn more income. Erin clashes with Lancelot's actor while rehearsing, and John compliments her performance, which she ignores, but, seeing Ted and John give good performances in a repetition exercise, she becomes interested in him, particularly since he treats her better than her stage-partner. Matty and Susan watch an employee training video, explaining how they should treat customers politely, not affecting Matty's nihilistic attitude. The manager announces that Bill Clinton is visiting their Dunkin' Donuts for publicity, and Matty sees this as a chance to tell Bill off. John and Erin practice lines, as she reveals the show is being taped so it can be sent to Emerson College in hopes of her getting in; Erin asks John to go out with her after the show. At dinner, Matty enthusiastically reveals what he plans to tell Bill, as John becomes stressed about the play when Susan tells there will be a large audience. Bill comes to the Dunkin' Donuts, and, seeing Matty is nervously insulting him, stages a private meeting with him, where Bill yells at Matty, calling him a loser before posing for a picture with Matty and subsequently throwing the cold coffee onto him. To ease the pressure, Ted and John take edibles from Blaire, but learn at the show that they contained mushrooms, causing them to stress further. On stage, Ted and John yell nervously that they're on drugs as the latter urinates in his costume, causing Erin to angrily storm off. == Production == "The Sword in the Stoned" was directed by series creator and lead Seth MacFarlane, and written by Julius Sharpe in his third and final writing credit for the series. When Ted and John are doing repetition exercises, they tackle each other to the ground, which required a stuntman named Ashton to play the role of Ted, according to Max Burkholder, who portrays John. Burkholder also recalled that, when Ted was choking John in the scene, he kept making a noise during the choking, which made Bill, the cameraman, laugh, despite being a "stone face" that never laughs, noting that seeing him be amused by the noise he was making assured Burkholder that what he was doing was "hilarious". Burkholder found the filming of the play scenes "weird", as he was put in fake armor with a hose inside his suit—which was filled with water mixed with yellow food coloring—that was made to create the urine stream that comes out of John's armor in the episode; he also noted that it took around 45 minutes to put on and take off the armor. He revealed that he himself had to urinate during the filming, as doing a scene about a character having to do so "really [broke] my brain", with the fact that it took 45 minutes to get the suit off adding to the frustration. Jennifer Ashley Connell, who worked for wardrobe, had to repeatedly go to Burkholder quickly between takes to dry off his pants with two hair dryers to make it look like the fake urine hadn't already streamed down his pants, so they could get as many shots of it as possible. Francesca Xuereb guest stars in the episode as Erin, the cheerleader who stars in the play. Incumbent president Bill Clinton was portrayed by MacFarlane, with artificial intelligence (AI) being used to digitally make MacFarlane's face look like Clinton's during post-production. Before settling on AI, the crew tried to use traditional computer-generated imagery and prosthetics, which made him look "terrifying", resulting in them deciding that AI would give them a more accurate look. One of the original technologies considered was one where, after scanning MacFarlane, a mesh of his head was created, and they had to use computer graphics to replace MacFarlane's face with Clinton's. An issue was faced, however, when they found the archival footage used as reference from the Clinton Library—an official Presidential Library containing information related to Clinton—to be extremely low-quality, making it hard to properly emulate his face, since only still images were of acceptable quality, and there weren't references of his moving face to work off of. A forensic artist was hired to help with this, and they created a 3D model of Clinton's head in ZBrush, based off of his presidential portrait. The model head worked for still frames, but movement was still difficult to do realistically, due to it being made for a "single-point perspective", which made details like the cheekbones or other minor issues more noticeable when using it for the scene. Since this did not work, AI was ultimately chosen through the studio Deep Voodoo, which used large language models to teach the tool how to correctly replicate Clinton's appearance. Defending the episode's use of AI, MacFarlane noted that the crew did not want people to focus on the tool being used, trying to utilize it in a way that wouldn't distract from the humor and narrative. Like the rest of the series, the episode was shot using ViewScreen; MacFarlane was able to act live with the cast as Ted due to ViewScreen, a technology that allows the production crew to visualize what Ted will look like in each scene in real time. == Release and reception == "The Sword in the Stoned" was first released on March 5, 2026, on the American streaming service Peacock, along with the rest of the second season. Nate Richards of Collider highlighted the Dunkin' Donuts subplot as an example of Scott Grimes delivering a "lot of laughs" through his performance as Matty. Dustin Rowles of Pajiba called "The Sword in the Stoned" one of the season's many episodes he'd recommend, particularly for the scenes of Ted and John being high on mushrooms during the play. Oppositely, Nick Valdez of ComicBook.com ranked the episode as the worst of the second season, criticizing it for not having a "huge impact" on the Bennett family dynamic like other episodes of the season do, and Susan and Matty's side story as the main reason he felt it was "[kept] from being great". Valdez noted the episode for likely being an advertisement for Dunkin' Donuts, calling the plot's ending scene involving Clinton the reason "it just all sticks out like a sore thumb". === Response to AI usage === The episode's use of AI for MacFarlane's portrayal of Clinton proved controversial, mainly on social media, where audiences asserted that the crew should have gotten an actor that resembl

    Read more →