AI Avatar Heygen

AI Avatar Heygen — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Model compression

    Model compression

    Model compression is a machine learning technique for reducing the size of trained models. Large models can achieve high accuracy, but often at the cost of significant resource requirements. Compression techniques aim to shrink models without a significant loss in performance. Smaller models require less storage space and consume less memory and compute during inference.

    Compressed models enable deployment on resource-constrained devices such as smartphones, embedded systems, edge computing devices, and consumer electronics. Efficient inference is also valuable for large corporations that serve large models over an API, allowing them to reduce computational costs and improve response times for users.

    Model compression is not to be confused with knowledge distillation, in which a smaller "student" model is trained to imitate the input-output behavior of a larger "teacher" model (as opposed to using the teacher's trained parameters or the teacher's training targets).

    == Techniques ==

    Several techniques are employed for model compression.

    === Pruning ===

    Pruning sparsifies a large model by setting some parameters to exactly zero, which effectively reduces the number of parameters. This allows the use of sparse matrix operations, which are faster than dense matrix operations. Pruning criteria can be based on the magnitudes of parameters, the statistical pattern of neural activations, Hessian values, and so on.

    === Quantization ===

    Quantization reduces the numerical precision of weights and activations. For example, instead of storing weights as 32-bit floating-point numbers, they can be represented using 8-bit integers. Low-precision parameters take up less space and take less compute to perform arithmetic with.

    It is also possible to quantize some parameters more aggressively than others. For example, a less important parameter can have 8-bit precision while another, more important parameter can have 16-bit precision. Inference with such models requires mixed-precision arithmetic.

    Quantized models can also be used during training (rather than after training). PyTorch implements automatic mixed precision (AMP), which performs autocasting and gradient scaling (also called loss scaling).

    === Low-rank factorization ===

    Weight matrices can be approximated by low-rank matrices. Let $W$ be a weight matrix of shape $m \times n$. A low-rank approximation is $W \approx UV^{T}$, where $U$ and $V$ are matrices of shapes $m \times k$ and $n \times k$. When $k$ is small, this both reduces the number of parameters needed to represent $W$ approximately (from $mn$ to $k(m+n)$) and accelerates matrix multiplication by $W$.

    Low-rank approximations can be found by singular value decomposition (SVD). The choice of rank for each weight matrix is a hyperparameter, and can be jointly optimized as a mixed discrete-continuous optimization problem. The rank of weight matrices may also be pruned after training, taking into account the effect of activation functions like ReLU on the implicit rank of the weight matrices.

    == Training ==

    Model compression may be decoupled from training: a model is first trained without regard for how it might be compressed, and is then compressed. However, compression may also be combined with training.

    The "train big, then compress" method trains a large model for a small number of training steps (fewer than it would take to train to convergence), then heavily compresses the model. It has been found that, at the same compute budget, this method results in a better model than small, lightly compressed models.

    In Deep Compression, the compression has three steps.

    First loop (pruning): prune all weights lower than a threshold, then finetune the network, then prune again, and so on.

    Second loop (quantization): cluster the weights, then enforce weight sharing among all weights in each cluster, then finetune the network, then cluster again, and so on.

    Third step: use Huffman coding to losslessly compress the model.

    The SqueezeNet paper reported that Deep Compression achieved a compression ratio of 35 on AlexNet, and a ratio of about 10 on SqueezeNets.
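As a rough illustration of the techniques described above, the following Python sketch applies magnitude-based pruning and symmetric 8-bit quantization to a flat list of weights, and counts the parameters saved by a rank-k factorization. The function names (`magnitude_prune`, `quantize_int8`, `dequantize`, `low_rank_param_count`) and the single per-tensor scale are assumptions made for this example; they are not taken from any particular library or from the Deep Compression paper.

```python
def magnitude_prune(weights, sparsity):
    """Set the smallest-magnitude fraction `sparsity` of the weights to zero."""
    k = int(len(weights) * sparsity)  # number of weights to zero out
    if k == 0:
        return list(weights)
    # Indices of the k weights with the smallest absolute value
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    if not weights:
        return [], 1.0
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


def low_rank_param_count(m, n, k):
    """Parameters in U (m x k) plus V (n x k), and in the dense m x n matrix."""
    return m * k + n * k, m * n


if __name__ == "__main__":
    w = [0.5, -0.1, 0.03, -2.0, 0.7, 0.0]
    pruned = magnitude_prune(w, sparsity=0.5)
    codes, scale = quantize_int8(pruned)
    print(pruned)
    print(codes, scale)
    print(low_rank_param_count(1024, 1024, 64))  # (131072, 1048576)
```

With a single scale per tensor, the reconstruction error of each weight is at most half the scale. Practical systems often use per-channel scales, zero points, or learned codebooks instead, which this sketch does not attempt to model.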

  • Flux (text-to-image model)

    Flux (text-to-image model)

    Flux (also known as FLUX.1 and FLUX.2) is a text-to-image model developed by Black Forest Labs (BFL), based in Freiburg im Breisgau, Germany. Black Forest Labs was founded by former employees of Stability AI. As with other text-to-image models, Flux generates images from natural language descriptions, called prompts.

    == History ==

    Black Forest Labs (BFL) was founded in 2024 by Robin Rombach, Andreas Blattmann, and Patrick Esser, former employees of Stability AI. All three founders had previously researched artificial intelligence image generation at LMU Munich as research assistants under Björn Ommer. They published their research results on image generation in 2022, which led to the creation of Stable Diffusion. Investors in BFL included venture capital firm Andreessen Horowitz, Brendan Iribe, Michael Ovitz, Garry Tan, and Vladlen Koltun. The company received an initial investment of US$31 million.

    In August 2024, Flux was integrated into the Grok chatbot developed by xAI and made available as part of a premium feature on X (formerly Twitter). Grok later switched to its own text-to-image model, Aurora, in December 2024.

    On 18 November 2024, Mistral AI announced that its Le Chat chatbot had integrated Flux Pro as its image generation model.

    On 21 November 2024, BFL announced the release of Flux.1 Tools, a suite of editing tools designed to be used on top of existing Flux models. The tools consist of Flux.1 Fill for inpainting and outpainting, Flux.1 Depth for control based on depth maps extracted from input images and prompts, Flux.1 Canny for control based on Canny edges extracted from input images and prompts, and Flux.1 Redux for mixing existing input images and prompts. Each tool is available in both Pro and Dev models.

    In January 2025, BFL announced a partnership with Nvidia for the inclusion of Flux models as foundation models for Nvidia's Blackwell microarchitecture. The company also announced the release of the Flux Pro Finetuning API, designed for customisation and fine-tuning of Flux-generated images, and a partnership with German media company Hubert Burda Media for the use of Flux Pro in content creation.

    On 29 May 2025, BFL announced Flux.1 Kontext, a suite of models that enable in-context image generation and editing, allowing users to prompt with both text and images. Alongside this, BFL Playground, an interface for testing Flux models, was released.

    On 31 July 2025, BFL announced Flux.1 Krea Dev, a model developed in collaboration with Krea AI that was trained to achieve better performance, more varied aesthetics, and better realism compared to existing text-to-image models.

    In September 2025, Adobe Inc. announced that Photoshop (beta) users could use Flux.1 Kontext Pro as a model for its generative fill tool. BFL collaborated with Meta on Vibes, a video-generation app.

    On 25 November 2025, BFL announced the release of the Flux.2 model series, consisting of Pro, Flex, Dev, and Apache 2.0-licensed Klein (meaning "little" or "small" in German) models, along with the Flux.2 variational autoencoder, which was also released as open-source software under the Apache 2.0 licence. BFL claimed improvements in image reference, photorealism, typography, and prompt understanding for this series.

    == Models ==

    Flux is a series of text-to-image models. The models are based on rectified flow transformer blocks scaled to 12 billion parameters.

    Flux.1 models were released under different licences: Schnell (meaning "fast" or "quick" in German) was released as open-source software under the Apache License, Dev was released as source-available software under a non-commercial licence (users can obtain a self-serve commercial licence for Dev from BFL), and Pro was released as proprietary software, available only as an API that can be licensed by third-party users. Users retain ownership of the resulting output regardless of the model used.

    An improved flagship model, Flux 1.1 Pro, was released on 2 October 2024. Two additional modes were added on 6 November: Ultra, which can generate images at four times higher resolution (up to 4 megapixels) without affecting generation speed, and Raw, which can generate hyper-realistic images in the style of candid photography.

    Flux.1 Kontext is a series with in-context image generation and editing capabilities. It is available in Max, Pro, and Dev models. Max is the highest-quality model and can be used to iteratively modify an existing image using prompts, while Pro is optimized to balance quality and speed of generation. Dev is an open-weight model released under the same non-commercial licence as Flux.1 Dev.

    Flux.2 models are based on a latent flow matching architecture, with Mistral AI's Mistral-3 model (24 billion parameters) as the vision-language model. As with Flux.1, Flux.2 models were released under different licences: Klein was released as open-source software under the Apache License, Dev was released as source-available software under a non-commercial licence (users can obtain a self-serve commercial licence from BFL), and both Flex and Pro were released as proprietary software, available only as an API.

    The models can be used either online or locally through generative AI user interfaces such as ComfyUI, Recraft Studio, and Stable Diffusion WebUI Forge (a fork of Automatic1111 WebUI).

    Related to Flux is a text-to-video model by Black Forest Labs, under development as of February 2026.

    == Reception ==

    According to a test performed by Ars Technica, the outputs generated by Flux.1 Dev and Flux.1 Pro are comparable with DALL-E 3 in terms of prompt fidelity, with photorealism closely matching Midjourney 6, and the models generated human hands more consistently than previous models such as Stable Diffusion XL.

    Flux has been criticised for its very realistic generated images. According to media reports, depictions ranged from an image of Donald Trump posing with guns to disturbing scenes, which triggered discussions about the ethical implications of Flux models. After the release of the model, social media platform X was flooded with Flux-generated images.

    Black Forest Labs has not provided exact details of the data used to train the model. Ars Technica suspected that Flux is based on a large, unauthorised collection of images scraped from the internet, a controversial practice with potential legal consequences.

    According to a test of Flux.1 Kontext performed by Japanese technology news website Gigazine, the model series has a good understanding of the English language and can easily transfer the style of an image from photorealistic to anime-style according to prompts given by the user; however, its ability to understand Japanese is quite poor.

    == Availability ==

    In addition to the official BFL Playground on its website, the Flux models are widely available through various third-party platforms for creative and professional use. These include repositories on platforms like Hugging Face and Replicate.

  • AI-assisted targeting in the Gaza Strip

    AI-assisted targeting in the Gaza Strip

    As part of the Gaza war, the Israel Defense Forces (IDF) have used artificial intelligence to rapidly and automatically perform much of the process of determining what to bomb. Israel has greatly expanded the bombing of the Gaza Strip, which in previous wars had been limited by the Israeli Air Force running out of targets.

    These tools include the Gospel, an AI which automatically reviews surveillance data looking for buildings, equipment and people thought to belong to the enemy and, upon finding them, recommends bombing targets to a human analyst, who may then decide whether to pass them along to the field. Another is Lavender, an "AI-powered database" which lists tens of thousands of Palestinian men linked by AI to Hamas or Palestinian Islamic Jihad, and which is also used for target recommendation.

    Critics have argued that the use of these AI tools puts civilians at risk, blurs accountability, and results in militarily disproportionate violence in violation of international humanitarian law.

    == The Gospel ==

    Israel uses an AI system dubbed "Habsora", "the Gospel", to determine which targets the Israeli Air Force will bomb. It automatically provides a targeting recommendation to a human analyst, who decides whether to pass it along to soldiers in the field. The recommendations can be anything from individual fighters, rocket launchers, and Hamas command posts to private homes of suspected Hamas or Islamic Jihad members.

    AI can process military intelligence far faster than humans. Retired Lt Gen. Aviv Kohavi, head of the IDF until 2023, stated that the system could produce 100 bombing targets in Gaza a day, with real-time recommendations on which ones to attack, whereas human analysts might produce 50 a year. A lecturer interviewed by NPR estimated these figures as 50–100 targets in 300 days for 20 intelligence officers, and 200 targets within 10–12 days for the Gospel.

    === Technological background ===

    The Gospel uses machine learning, in which an AI is tasked with identifying commonalities in vast amounts of data (e.g. scans of cancerous tissue, photos of a facial expression, surveillance of Hamas members identified by human analysts), then looking for those commonalities in new material.

    What information the Gospel uses is not known, but it is thought to combine surveillance data from diverse sources in enormous amounts.

    Recommendations are based on pattern-matching. A person with enough similarities to other people labelled as enemy combatants may be labelled a combatant themselves.

    Regarding the suitability of AIs for the task, NPR cited Heidy Khlaaf, engineering director of AI Assurance at the technology security firm Trail of Bits, as saying "AI algorithms are notoriously flawed with high error rates observed across applications that require precision, accuracy, and safety." Bianca Baggiarini, lecturer at the Australian National University's Strategic and Defence Studies Centre, wrote that AIs are "more effective in predictable environments where concepts are objective, reasonably stable, and internally consistent." She contrasted this with telling the difference between a combatant and a non-combatant, which even humans frequently cannot do.

    Khlaaf went on to point out that such a system's decisions depend entirely on the data it is trained on, and are not based on reasoning, factual evidence or causation, but solely on statistical probability.

    === Operation ===

    The IAF ran out of targets to strike in the 2014 war and the 2021 crisis. In an interview on France 24, investigative journalist Yuval Abraham of +972 Magazine stated that to maintain military pressure, and due to political pressure to continue the war, the military would bomb the same places twice. Since then, the integration of AI tools has significantly sped up the selection of targets. In early November, the IDF stated that more than 12,000 targets in Gaza had been identified by the target administration division that uses the Gospel. NPR wrote on December 14 that it was unclear how many targets from the Gospel had been acted upon, but that the Israeli military said it was currently striking as many as 250 targets a day. The bombing, too, has intensified to what the December 14 article called an astonishing pace: the Israeli military stated at the time that it had struck more than 22,000 targets inside Gaza, at a daily rate more than double that of the 2021 conflict, more than 3,500 of them since the collapse of the truce on December 1. Early in the offensive, the head of the Air Force stated that his forces only struck military targets, but added: "We are not being surgical."

    Once a recommendation is accepted, another AI, Fire Factory, cuts the time needed to assemble the attack from hours to minutes by calculating munition loads, prioritizing and assigning targets to aircraft and drones, and proposing a schedule, according to a pre-war Bloomberg article that described such AI tools as tailored for a military confrontation and proxy war with Iran.

    One change that The Guardian noted is that, since senior Hamas leaders disappear into tunnels at the start of an offensive, systems such as the Gospel have allowed the IDF to locate and attack a much larger pool of more junior Hamas operatives. It cited an official who worked on targeting decisions in previous Gaza operations as saying that while the homes of junior Hamas members had previously not been targeted for bombing, the official believed the houses of suspected Hamas operatives were now targeted regardless of rank. In the France 24 interview, Abraham characterized this as enabling the systematization of dropping a 2000 lb bomb into a home to kill one person and everybody around them, something that had previously been done to a very small group of senior Hamas leaders. NPR cited a report by +972 Magazine and its sister publication Local Call as asserting that the system is being used to manufacture targets so that Israeli military forces can continue to bombard Gaza at an enormous rate, punishing the general Palestinian population. NPR noted it had not verified this; it was unclear how many targets were being generated by AI alone, but there had been a substantial increase in targeting, with an enormous civilian toll.

    In principle, the combination of a computer's speed in identifying opportunities and a human's judgment in evaluating them can enable more precise attacks and fewer civilian casualties. Israeli military and media have emphasized this capacity to minimize harm to non-combatants. Richard Moyes, researcher and head of the NGO Article 36, pointed to "the widespread flattening of an urban area with heavy explosive weapons" to question these claims, while Lucy Suchman, professor emeritus at Lancaster University, described the bombing as "aimed at maximum devastation of the Gaza Strip".

    The Guardian wrote that when a strike was authorized on private homes of those identified as Hamas or Islamic Jihad operatives, target researchers knew in advance the expected number of civilians killed. Each target had a file containing a collateral damage score stipulating how many civilians were likely to be killed in a strike, and, according to a senior Israeli military source, operatives use a "very accurate" measurement of the rate of civilians evacuating a building shortly before a strike. "We use an algorithm to evaluate how many civilians are remaining. It gives us a green, yellow, red, like a traffic signal."

    ==== 2021 use ====

    Kohavi compared the target division using the Gospel to a machine and stated that once the machine was activated in the war of May 2021, it generated 100 targets a day, with half of them being attacked, in contrast with 50 targets in Gaza per year beforehand. Approximately 200 of the 1,500 targets Israel struck in Gaza in that war came from the Gospel, including both static and moving targets, according to the military.

    The Jewish Institute for National Security of America's after-action report identified an issue, stating that the system had data on what was a target, but lacked data on what was not. The system depends entirely on training data, and intelligence that human analysts had examined and deemed not to constitute a target had been discarded, risking bias. The institute's vice president expressed his hope that this had since been rectified.

    === Organization ===

    The Gospel is used by the military's target administration division (also called the Directorate of Targets or Targeting Directorate), which was formed in 2019 in the IDF's intelligence directorate to address the air force running out of targets to bomb, and which Kohavi described as "powered by AI capabilities" and including hundreds of officers and soldiers. In addition to its wartime role, The Guardian wrote that it had helped the IDF build a database of between 30,000 and 40,000 suspected militants in recent years, and that systems such as the Gospel had played a critical role in building lists of individuals authorized to be assassinated.

    The Gospel was developed by Unit 8200 of the Israeli Intelligence C

  • Mobile Fortify

    Mobile Fortify

    Mobile Fortify is a mobile app used by United States Immigration and Customs Enforcement (ICE) on its government-issued phones. The app allows agents to take a photo in order to gather biometrics, including contactless fingerprints and faceprints, for the purpose of identifying an individual and their potential immigration status. The app was created by NEC.

    == History ==

    In June 2025, use of Mobile Fortify by ICE was uncovered through leaked emails and the user manual, as reported by 404 Media. The app was described as internally developed, and details of the parent company and developer were initially unknown. Later that month, several senators demanded transparency around the app and its origins, and that ICE stop using it. A second letter was sent in November, after ICE had not responded to the first.

    In January 2026, the DHS's 2025 AI Use Case Inventory revealed the vendor as NEC Corporation, an international conglomerate with subsidiaries in Argentina, Australia, China, India and Malaysia.

    == Technology ==

    Unlike other facial recognition software, Fortify uses federally linked databases. By contrast, Clearview AI uses public social media databases for biometric scanning. The federal databases include DHS's automated biometric identification system (IDENT), which contains more than 270 million biometric records; Customs and Border Protection's Traveler Verification Service; the State Department's visa and passport photo database; the FBI's National Crime Information Center; the National Law Enforcement Telecommunications System; and CBP's TECS and Seized Assets and Case Tracking System (SEACATS).

    == Oversight ==

    Several senators urged ICE to stop using the app for fear of infringing on Fourth Amendment and First Amendment rights, and requested details on who developed the app, when it was deployed, whether the app was tested for accuracy, and the policies and practices governing its use. In June 2025, they sent an open letter to Todd Lyons, ICE acting director, signed by senators Cory Booker, Chris Van Hollen, Ed Markey, Bernie Sanders, Adam Schiff, Tina Smith, Elizabeth Warren, and Ron Wyden. On November 3, the senators sent a second letter to ICE after receiving no answers to the questions in the previous letter, which had set a response deadline of October 2.

    == Criticism ==

    Mobile Fortify, and ICE's use of similar biometric identification technologies (such as Mobile Identify, a similar app intended for local or regional law enforcement assisting in immigration enforcement), has faced scrutiny from a variety of digital rights organizations, politicians, and news outlets. The criticism has been cited as a possible reason why the Mobile Identify app was pulled from the Google Play Store.

    Facial recognition technologies are known to produce false positives and generally unreliable results, especially for people with darker skin tones. ICE has previously mistakenly arrested a U.S. citizen under the belief that he was in the country illegally, and later stated that he "could be deported based on biometric confirmation of his identity" prior to his release. U.S. representative Bennie Thompson, ranking member of the House Homeland Security Committee, has commented that "ICE officials have told us that an apparent biometric match by Mobile Fortify is a ‘definitive’ determination of a person's status and that an ICE officer may ignore evidence of American citizenship—including a birth certificate—if the app says the person is an alien," and that "Mobile Fortify is a dangerous tool in the hands of ICE, and it puts American citizens at risk of detention and even deportation."

    On January 19, 2026, 404 Media reported on a case in which a woman, identified in court documents as "MJMA", was scanned by Mobile Fortify twice in the same interaction, and the app returned two entirely different names. According to the Innovation Law Lab, whose attorneys are representing MJMA, both names were incorrect.

    ICE has stated that it will not allow people to decline to be scanned by Mobile Fortify, and that photos taken, even those of U.S. citizens, will be stored for 15 years. This has been criticized for three main reasons: ICE has not performed a Privacy Impact Assessment (PIA) for Mobile Fortify; the right to decline other forms of biometric verification by the U.S. government is often available in other circumstances; and the 15-year window is viewed as unnecessarily long.

  • Prism Video Converter

    Prism Video Converter

    Prism is a multi-format video converter developed by NCH Software for Windows and macOS. It offers tools for quick media conversions and can handle large, high-resolution media files. It provides built-in compressor and adjuster settings, allowing users to customize and optimize their videos according to their needs. The software also includes features such as previewing videos and adding effects. Prism offers a free version for non-commercial use as well as a premium version.

    == Features ==

    Prism Video File Converter supports a wide range of file formats. It enables users to convert videos into formats such as AVI, ASF, WMV, MP4, and 3GP. It can also convert DVDs into various formats, and provides tools for adjusting colour and filter options.

    Prism Video File Converter provides several customizable options for tweaking the output files during the conversion process. Users can adjust compression and encoder rates, set the resolution and frame rate, and specify the desired output file size.

    The software also offers various effects such as video rotation, captions, watermarks, and text overlay. It includes a built-in preview feature that enables users to view their videos before and after the conversion process. It supports batch conversion and running conversions in the background.

    == Controversy ==

    Previously, Prism and certain other NCH Software products were bundled with optional browser plugins, including the Google Chrome toolbar and the Conduit toolbar. This resulted in user complaints and raised concerns from antivirus software companies such as Norton and McAfee, which flagged them as potential malware. NCH Software has since removed all toolbars, browsers, and third-party app offerings from all Prism versions.

  • Type-2 fuzzy sets and systems

    Type-2 fuzzy sets and systems

    Type-2 fuzzy sets and systems generalize standard type-1 fuzzy sets and systems so that more uncertainty can be handled. From the beginning of fuzzy sets, criticism was made about the fact that the membership function of a type-1 fuzzy set has no uncertainty associated with it, something that seems to contradict the word fuzzy, since that word has the connotation of much uncertainty. So, what does one do when there is uncertainty about the value of the membership function? The answer to this question was provided in 1975 by the inventor of fuzzy sets, Lotfi A. Zadeh, when he proposed more sophisticated kinds of fuzzy sets, the first of which he called a "type-2 fuzzy set". A type-2 fuzzy set lets us incorporate uncertainty about the membership function into fuzzy set theory, and is a way to address the above criticism of type-1 fuzzy sets head-on. And, if there is no uncertainty, then a type-2 fuzzy set reduces to a type-1 fuzzy set, which is analogous to probability reducing to determinism when unpredictability vanishes. Type1 fuzzy systems are working with a fixed membership function, while in type-2 fuzzy systems the membership function is fluctuating. A fuzzy set determines how input values are converted into fuzzy variables. == Overview == In order to symbolically distinguish between a type-1 fuzzy set and a type-2 fuzzy set, a tilde symbol is put over the symbol for the fuzzy set; so, A denotes a type-1 fuzzy set, whereas à denotes the comparable type-2 fuzzy set. When the latter is done, the resulting type-2 fuzzy set is called a "general type-2 fuzzy set" (to distinguish it from the special interval type-2 fuzzy set). Zadeh didn't stop with type-2 fuzzy sets, because in that 1976 paper he also generalized all of this to type-n fuzzy sets. The present article focuses only on type-2 fuzzy sets because they are the next step in the logical progression from type-1 to type-n fuzzy sets, where n = 1, 2, ... . 
Although some researchers are beginning to explore higher than type-2 fuzzy sets, as of early 2009, this work is in its infancy. The membership function of a general type-2 fuzzy set, Ã, is three-dimensional (Fig. 1), where the third dimension is the value of the membership function at each point on its two-dimensional domain that is called its "footprint of uncertainty"(FOU). For an interval type-2 fuzzy set that third-dimension value is the same (e.g., 1) everywhere, which means that no new information is contained in the third dimension of an interval type-2 fuzzy set. So, for such a set, the third dimension is ignored, and only the FOU is used to describe it. It is for this reason that an interval type-2 fuzzy set is sometimes called a first-order uncertainty fuzzy set model, whereas a general type-2 fuzzy set (with its useful third-dimension) is sometimes referred to as a second-order uncertainty fuzzy set model. The FOU represents the blurring of a type-1 membership function, and is completely described by its two bounding functions (Fig. 2), a lower membership function (LMF) and an upper membership function (UMF), both of which are type-1 fuzzy sets! Consequently, it is possible to use type-1 fuzzy set mathematics to characterize and work with interval type-2 fuzzy sets. This means that engineers and scientists who already know type-1 fuzzy sets will not have to invest a lot of time learning about general type-2 fuzzy set mathematics in order to understand and use interval type-2 fuzzy sets. Work on type-2 fuzzy sets languished during the 1980s and early-to-mid 1990s, although a small number of articles were published about them. People were still trying to figure out what to do with type-1 fuzzy sets, so even though Zadeh proposed type-2 fuzzy sets in 1976, the time was not right for researchers to drop what they were doing with type-1 fuzzy sets to focus on type-2 fuzzy sets. 
This changed in the latter part of the 1990s as a result of Jerry Mendel and his student's works on type-2 fuzzy sets and systems. Since then, more researchers around the world are writing articles about type-2 fuzzy sets and systems. == Interval type-2 fuzzy sets == Interval type-2 fuzzy sets have received the most attention because the mathematics that is needed for such sets—primarily Interval arithmetic—is much simpler than the mathematics that is needed for general type-2 fuzzy sets. The literature about interval type-2 fuzzy sets is large, whereas the literature about general type-2 fuzzy sets is much smaller. Both kinds of fuzzy sets are being actively researched by an ever-growing number of researchers around the world and have resulted in successful employment in a variety of domains such as robot control. Formally, the following have already been worked out for interval type-2 fuzzy sets: Fuzzy set operations: union, intersection and complement Centroid (a very widely used operation by practitioners of such sets, and also an important uncertainty measure for them) Other uncertainty measures [fuzziness, cardinality, variance and skewness and uncertainty bounds Similarity Subsethood Embedded fuzzy sets Fuzzy set ranking Fuzzy rule ranking and selection Type-reduction methods Firing intervals for an interval type-2 fuzzy logic system Fuzzy weighted average Linguistic weighted average Synthesizing an FOU from data that are collected from a group of subject == Interval type-2 fuzzy logic systems == Type-2 fuzzy sets are finding very wide applicability in rule-based fuzzy logic systems (FLSs) because they let uncertainties be modeled by them whereas such uncertainties cannot be modeled by type-1 fuzzy sets. A block diagram of a type-2 FLS is depicted in Fig. 3. 
This kind of FLS is used in fuzzy logic control, fuzzy logic signal processing, rule-based classification, etc., and is sometimes referred to as a function approximation application of fuzzy sets, because the FLS is designed to minimize an error function. The following discussions, about the four components in Fig. 3 rule-based FLS, are given for an interval type-2 FLS, because to-date they are the most popular kind of type-2 FLS; however, most of the discussions are also applicable for a general type-2 FLS. Rules, that are either provided by subject experts or are extracted from numerical data, are expressed as a collection of IF-THEN statements, e.g., IF temperature is moderate and pressure is high, then rotate the valve a bit to the right. Fuzzy sets are associated with the terms that appear in the antecedents (IF-part) or consequents (THEN-part) of rules, and with the inputs to and the outputs of the FLS. Membership functions are used to describe these fuzzy sets, and in a type-1 FLS they are all type-1 fuzzy sets, whereas in an interval type-2 FLS at least one membership function is an interval type-2 fuzzy set. An interval type-2 FLS lets any one or all of the following kinds of uncertainties be quantified: Words that are used in antecedents and consequents of rules—because words can mean different things to different people. Uncertain consequents—because when rules are obtained from a group of experts, consequents will often be different for the same rule, i.e. the experts will not necessarily be in agreement. Membership function parameters—because when those parameters are optimized using uncertain (noisy) training data, the parameters become uncertain. Noisy measurements—because very often it is such measurements that activate the FLS. In Fig. 3, measured (crisp) inputs are first transformed into fuzzy sets in the Fuzzifier block because it is fuzzy sets and not numbers that activate the rules which are described in terms of fuzzy sets and not numbers. 
Three kinds of fuzzifiers are possible in an interval type-2 FLS. When measurements are: Perfect, they are modeled as a crisp set; Noisy, but the noise is stationary, they are modeled as a type-1 fuzzy set; and, Noisy, but the noise is non-stationary, they are modeled as an interval type-2 fuzzy set (this latter kind of fuzzification cannot be done in a type-1 FLS). In Fig. 3, after measurements are fuzzified, the resulting input fuzzy sets are mapped into fuzzy output sets by the Inference block. This is accomplished by first quantifying each rule using fuzzy set theory, and by then using the mathematics of fuzzy sets to establish the output of each rule, with the help of an inference mechanism. If there are M rules then the fuzzy input sets to the Inference block will activate only a subset of those rules, where the subset contains at least one rule and usually way fewer than M rules. The inference is done one rule at a time. So, at the output of the Inference block, there will be one or more fired-rule fuzzy output sets. In most engineering applications of an FLS, a number (and not a fuzzy set) is needed as its final output, e.g., the consequent of the rule given above is "Rotate the valve a bit to the right." No automatic valve will know what this means because "a bit to the right" is a linguistic expression, and a valv
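
The firing interval that the Inference block produces for a single rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular toolbox: the inputs are treated as perfect (crisp) measurements, the minimum is used as the "and" operator, and each antecedent's FOU is bounded by two Gaussians that share a mean but differ in spread. The class name, the set names `moderate` and `high`, and all numeric values are hypothetical.

```python
import math


def gaussian(x, mean, sigma):
    """Type-1 Gaussian membership grade of x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


class IntervalType2Gaussian:
    """Interval type-2 set whose FOU lies between two type-1 Gaussians.

    Assumption for illustration: the mean is known and only the spread is
    uncertain, so the narrow Gaussian is the LMF and the wide one the UMF.
    """

    def __init__(self, mean, sigma_low, sigma_high):
        if not 0 < sigma_low <= sigma_high:
            raise ValueError("need 0 < sigma_low <= sigma_high")
        self.mean = mean
        self.sigma_low = sigma_low
        self.sigma_high = sigma_high

    def membership(self, x):
        """Return the pair (LMF(x), UMF(x))."""
        lower = gaussian(x, self.mean, self.sigma_low)
        upper = gaussian(x, self.mean, self.sigma_high)
        return lower, upper


def firing_interval(antecedents, inputs):
    """Firing interval of one rule for crisp inputs, using the minimum t-norm."""
    grades = [fs.membership(x) for fs, x in zip(antecedents, inputs)]
    lowers, uppers = zip(*grades)
    return min(lowers), min(uppers)


# Hypothetical rule: IF temperature is moderate AND pressure is high THEN ...
moderate = IntervalType2Gaussian(mean=20.0, sigma_low=2.0, sigma_high=4.0)
high = IntervalType2Gaussian(mean=80.0, sigma_low=5.0, sigma_high=10.0)
interval = firing_interval([moderate, high], [24.0, 80.0])
```

Because both bounds are ordinary type-1 membership functions, the rule's firing level becomes an interval rather than a single number; a type-1 FLS corresponds to the special case in which the two bounds coincide.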

  • ICAART

    ICAART

    The International Conference on Agents and Artificial Intelligence (ICAART) is a meeting point for researchers (among others) with an interest in the areas of agents and artificial intelligence. ICAART has two tracks, one related to agents and distributed AI in general and the other focused on topics related to intelligent systems and computational intelligence. The conference program is composed of several different kinds of sessions, such as technical sessions, poster sessions, keynote lectures, tutorials, special sessions, doctoral consortiums, panels and industrial tracks. The papers presented at the conference are published in the conference proceedings and made available in the SCITEPRESS digital library, and some of the best papers are invited to a post-publication with Springer. ICAART's first edition was held in 2009 and featured keynote speakers including Marco Dorigo, Edward H. Shortliffe and Eduard Hovy. Since then, the conference has had several other invited speakers, such as Katia Sycara, Nick Jennings, Robert Kowalski, Boi Faltings and Tim Finin. Bart Selman is one of the names confirmed for the next edition of this conference. Since 2012 the conference has been held in conjunction with two other conferences: the International Conference on Operations Research and Enterprise Systems (ICORES) and the International Conference on Pattern Recognition Applications and Methods (ICPRAM). 
== Areas == === Agents === Agent communication languages Cooperation and Coordination Distributed Problem Solving Economic Agent Models Emotional Intelligence Group Decision Making Intelligent Auctions and Markets Mobile Agents Multi-agent systems Negotiation and Interaction Protocols Nep News Detection Agent Models and Architectures Physical Agents at Work Privacy, Safety and Security Programming Environments and Languages Robot and Multi-Robot Systems Self Organizing Systems Semantic Web Simulation Swarm Intelligence Task Planning and Execution Transparency and Ethical Issues Agent-Oriented Software Engineering Web Intelligence Agent Platforms and Interoperability Autonomous systems Cloud Computing and Its Impact Cognitive robotics Collective Intelligence Conversational Agents === Artificial intelligence === AI and Creativity Deep Learning Evolutionary Computing Fuzzy Systems Hybrid Intelligent Systems Industrial Applications of AI Intelligence and Cybersecurity Intelligent User Interfaces Knowledge Representation and Reasoning Knowledge-Based Systems Ambient Intelligence Machine learning Model-Based Reasoning Natural Language Processing Neural Networks Ontologies Planning and Scheduling Social Network Analysis Soft Computing State Space Search Bayesian Networks Uncertainty in AI Vision and Perception Visualization Big Data Case-Based Reasoning Cognitive Systems Constraint Satisfaction Data Mining Data Science == Editions == === ICAART 2023 – Lisbon, Portugal === === ICAART 2020 – Valletta, Malta === === ICAART 2019 – Prague, Czech Republic === Proceedings - Proceedings of the 11th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-758-350-6 Proceedings - Proceedings of the 11th International Conference on Web Information Systems and Technologies - Volume 2. 
ISBN 978-989-758-350-6 === ICAART 2018 – Funchal, Madeira, Portugal === Proceedings - Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-758-275-2 Proceedings - Proceedings of the 10th International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-758-275-2 === ICAART 2017 – Porto, Portugal === Proceedings - Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-758-219-6 Proceedings - Proceedings of the 9th International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-758-220-2 === ICAART 2016 – Rome, Italy === Proceedings - Proceedings of the 8th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-758-172-4 Proceedings - Proceedings of the 8th International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-758-172-4 === ICAART 2015 – Lisbon, Portugal === Proceedings - Proceedings of the 7th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-758-073-4 Proceedings - Proceedings of the 7th International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-758-074-1 === ICAART 2014 – ESEO, Angers, Loire Valley, France === Proceedings - Proceedings of the 6th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-758-015-4 Proceedings - Proceedings of the 6th International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-758-016-1 === ICAART 2013 – Barcelona, Spain === Proceedings - Proceedings of the 5th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-8565-38-9 Proceedings - Proceedings of the 5th International Conference on Web Information Systems and Technologies - Volume 2. 
ISBN 978-989-8565-39-6 === ICAART 2012 – Vilamoura, Algarve, Portugal === Proceedings - Proceedings of the 4th International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-8425-95-9 Proceedings - Proceedings of the 4th International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-8425-96-6 === ICAART 2011 – Rome, Italy === Proceedings - Proceedings of the 3rd International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-8425-40-9 Proceedings - Proceedings of the 3rd International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-8425-41-6 === ICAART 2010 – Valencia, Spain === Proceedings - Proceedings of the 2nd International Conference on Web Information Systems and Technologies - Volume 1. ISBN 978-989-674-021-4 Proceedings - Proceedings of the 2nd International Conference on Web Information Systems and Technologies - Volume 2. ISBN 978-989-674-022-1 === ICAART 2009 – Porto, Portugal === Proceedings - Proceedings of the 1st International Conference on Web Information Systems and Technologies. ISBN 978-989-8111-66-1

  • Orion's Arm

    Orion's Arm

    The Orion's Arm Universe Project (OA) is a multi-authored online hard science fiction world-building project, first established in 2000 by M. Alan Kazlev, Donna Malcolm Hirsekorn, Bernd Helfert and Anders Sandberg and further co-authored by many people since. Anyone can contribute articles, stories, artwork, or music to the website. The first published Orion's Arm book, a collection of five novellas set within the OA universe called Against a Diamond Sky, was released in September 2009. == Canon == The fictional setting of Orion's Arm is about 10,000 years in the future, when an interstellar civilization has spread across thousands of light-years, with inhabited planets and space habitats. Its inhabitants range from humans to extensively modified human beings, including superhumans with advanced augmentations and internal AI systems, while most people exist as software. Engineered wormholes are used for interstellar travel and transport, although not for time travel. The setting also includes several alien civilizations and evidence of more advanced alien societies in the past. At its highest levels, directed human evolution has produced vast godlike beings linked across interstellar distances, capable of understanding and creating technologies beyond ordinary minds. == Reception == Orion's Arm has been reviewed in the role-playing magazine Knights of the Dinner Table, as well as on Boing Boing by transhumanist science fiction author Cory Doctorow. References to the Encyclopaedia Galactica have been made in a book on overcoming librarian stereotypes. The Orion's Arm website has also been recommended in a children's teaching guide.

  • Cybernetics

    Cybernetics

    Cybernetics is the transdisciplinary study of circular causal processes such as feedback and recursion, where the effects of a system's actions (its outputs) return as inputs to that system, influencing subsequent actions. It is concerned with general principles that are relevant across multiple contexts, including engineering, ecological, economic, biological, cognitive and social systems and also in practical activities such as designing, learning, and managing. Cybernetics' transdisciplinary character means that it intersects with a number of other fields, resulting in a wide influence and diverse interpretations. The field is named after an example of circular causal feedback—that of steering a ship (the ancient Greek κυβερνήτης (kybernḗtēs) refers to the person who steers a ship). In steering a ship, the position of the rudder is adjusted in continual response to the effect it is observed as having, forming a feedback loop through which a steady course can be maintained in a changing environment, responding to disturbances from cross winds and tide. Cybernetics has its origins in exchanges between numerous disciplines during the 1940s. Initial developments were consolidated through meetings such as the Macy conferences and the Ratio Club. Early focuses included purposeful behaviour, neural networks, heterarchy, information theory, and self-organising systems. As cybernetics developed, it became broader in scope to include work in design, family therapy, management and organisation, pedagogy, sociology, the creative arts and the counterculture. == Definitions == Cybernetics has been defined in a variety of ways, reflecting "the richness of its conceptual base". One of the best known definitions is that of the American scientist Norbert Wiener, who characterised cybernetics as concerned with "control and communication in the animal and the machine". 
Another early definition is that of the Macy cybernetics conferences, where cybernetics was understood as the study of "circular causal and feedback mechanisms in biological and social systems". Margaret Mead emphasised the role of cybernetics as "a form of cross-disciplinary thought which made it possible for members of many disciplines to communicate with each other easily in a language which all could understand". Other definitions include: "the art of governing or the science of government" (André-Marie Ampère); "the art of steersmanship" (Ross Ashby); "the study of systems of any nature which are capable of receiving, storing, and processing information so as to use it for control" (Andrey Kolmogorov); and "a branch of mathematics dealing with problems of control, recursiveness, and information, focuses on forms and the patterns that connect" (Gregory Bateson). == Etymology == The Ancient Greek term κυβερνητικός (kubernētikos, '(good at) steering') appears in Plato's Republic and Alcibiades, where the metaphor of a steersman is used to signify the governance of people. The French word cybernétique was also used in 1834 by the physicist André-Marie Ampère to denote the sciences of government in his classification system of human knowledge. According to Norbert Wiener, the word cybernetics was coined by a research group involving himself and Arturo Rosenblueth in the summer of 1947. It has been attested in print since at least 1948 through Wiener's book Cybernetics: Or Control and Communication in the Animal and the Machine. In the book, Wiener states: After much consideration, we have come to the conclusion that all the existing terminology has too heavy a bias to one side or another to serve the future development of the field as well as it should; and as happens so often to scientists, we have been forced to coin at least one artificial neo-Greek expression to fill the gap. 
We have decided to call the entire field of control and communication theory, whether in the machine or in the animal, by the name Cybernetics, which we form from the Greek κυβερνήτης or steersman. Moreover, Wiener explains, the term was chosen to recognize James Clerk Maxwell's 1868 publication on feedback mechanisms involving governors, noting that the term governor is also derived from κυβερνήτης (kubernḗtēs) via a Latin corruption gubernator. Finally, Wiener motivates the choice by steering engines of a ship being "one of the earliest and best-developed forms of feedback mechanisms". == History == === First wave === The initial focus of cybernetics was on parallels between regulatory feedback processes in biological and technological systems. Two foundational articles were published in 1943: "Behavior, Purpose and Teleology" by Arturo Rosenblueth, Norbert Wiener, and Julian Bigelow – based on the research on living organisms that Rosenblueth did in Mexico – and the paper "A Logical Calculus of the Ideas Immanent in Nervous Activity" by Warren McCulloch and Walter Pitts. The foundations of cybernetics were then developed through a series of transdisciplinary conferences funded by the Josiah Macy, Jr. Foundation, between 1946 and 1953. The conferences were chaired by McCulloch and had participants that included Ross Ashby, Gregory Bateson, Heinz von Foerster, Margaret Mead, John von Neumann, and Norbert Wiener. In the UK, similar focuses were explored by the Ratio Club, an informal dining club of young psychiatrists, psychologists, physiologists, mathematicians and engineers that met between 1949 and 1958. Wiener introduced the neologism cybernetics to denote the study of "teleological mechanisms" and popularized it through the book Cybernetics: Or Control and Communication in the Animal and the Machine. During the 1950s, cybernetics was developed as a primarily technical discipline, such as in Qian Xuesen's 1954 "Engineering Cybernetics". 
The text was quickly translated into multiple languages and became a foundational text on automation. In the Soviet Union, Cybernetics was initially considered with suspicion but became accepted from the mid to late 1950s. By the 1960s and 1970s, however, cybernetics' transdisciplinarity fragmented, with technical focuses separating into separate fields. Artificial intelligence (AI) was founded as a distinct discipline at the Dartmouth workshop in 1956, differentiating itself from the broader cybernetics field. After some uneasy coexistence, AI gained funding and prominence. Consequently, cybernetic sciences such as the study of artificial neural networks were downplayed. Similarly, computer science became defined as a distinct academic discipline in the 1950s and early 1960s. === Second wave === The second wave of cybernetics came to prominence from the 1960s onwards, with its focus shifting away from technology toward social, ecological, and philosophical concerns. It was still grounded in biology, notably Maturana and Varela's autopoiesis, and built on earlier work on self-organising systems and the presence of anthropologists Mead and Bateson in the Macy meetings. The Biological Computer Laboratory, founded in 1958 and active until the mid-1970s under the direction of Heinz von Foerster at the University of Illinois at Urbana–Champaign, was a major incubator of this trend in cybernetics research. Focuses of the second wave of cybernetics included management cybernetics, such as Stafford Beer's biologically inspired viable system model; work in family therapy, drawing on Bateson; social systems, such as in the work of Niklas Luhmann; epistemology and pedagogy, such as in the development of radical constructivism. 
Cybernetics' core theme of circular causality was developed beyond goal-oriented processes to concerns with reflexivity and recursion, notably in Mead's invocation at the inaugural meeting of the American Society for Cybernetics (ASC) to apply cybernetics to the activities of the ASC itself. This focus on reflexivity was especially prominent in the development of second-order cybernetics (or the cybernetics of cybernetics), developed and promoted by Heinz von Foerster, which focused on questions of observation, cognition, epistemology, and ethics. The 1960s onwards also saw cybernetics begin to develop exchanges with the creative arts, design, and architecture, notably with the Cybernetic Serendipity exhibition (ICA, London, 1968), curated by Jasia Reichardt, and the unrealised Fun Palace project (London, unrealised, 1964 onwards), where Gordon Pask was consultant to architect Cedric Price and theatre director Joan Littlewood. In 1962, Qian Xuesen recruited Song Jian and Guan Zhaozhi to establish China's first cybernetics laboratory with him. Following the Sino-Soviet split, cybernetics was deemed disreputable in China. The field was again favored in the 1970s and 1980s following Deng Xiaoping's emphasis on modernisation. === Third wave === From the 1990s onwards, there has been a renewed interest in cybernetics from a number of directions. Early cybernetic work on artificial neural networks has been returned to as a paradigm in machine learning and artifi

  • Mark I Perceptron

    Mark I Perceptron

    The Mark I Perceptron was a pioneering supervised image classification learning system developed by Frank Rosenblatt in 1958. It was the first implementation of an artificial intelligence (AI) machine. It differs from the Perceptron, which is a software architecture proposed in 1943 by Warren McCulloch and Walter Pitts, which was also employed in Mark I, and enhancements of which have continued to be an integral part of cutting edge AI technologies like the Transformer. == Architecture == The Mark I Perceptron was organized into three layers: a set of sensory units, which receive optical input; a set of association units, each of which fires based on input from multiple sensory units; and a set of response units, which fire based on input from multiple association units. The connections between sensory units and association units were random. The association units worked very similarly to the response units. Different versions of the Mark I used different numbers of units in each of the layers. == Capabilities == In his 1957 proposal for funding for development of the "Cornell Photoperceptron", Rosenblatt claimed: "Devices of this sort are expected ultimately to be capable of concept formation, language translation, collation of military intelligence, and the solution of problems through inductive logic." With the first version of the Mark I Perceptron, as early as 1958, Rosenblatt demonstrated a simple binary classification experiment, namely distinguishing between sheets of paper marked on the right versus those marked on the left side. One of the later experiments distinguished a square from a circle printed on paper. The shapes were perfect and their sizes fixed; the only variation was in their position and orientation. The Mark I Perceptron achieved 99.8% accuracy on a test dataset with 500 neurons in a single layer. The size of the training dataset was 10,000 example images. It took 3 seconds for the training pipeline to go through a single image. 
Higher accuracy was observed with thick outline figures compared to solid figures, likely because outline figures reduced overfitting. Another experiment distinguished between a square and a diamond for which 100% accuracy was achieved with only 60 training images, with a Perceptron having 1,000 neurons in a single layer. The time taken to process each training input for this larger perceptron was 15 seconds. The only variation was in position of the image, since rotation would have been ambiguous. In that same experiment, it could distinguish between the letters X and E with 100% accuracy when trained with only 20 images (10 images of each letter). Variations in the images included both position and rotation by up to 30 degrees. When variation in rotation was increased to any angle (both in training and test datasets), the accuracy reduced to 90% with 60 training images (30 images of each letter). For distinguishing between the letters E and F, a more challenging problem due to their similarity, the same 1,000 neuron perceptron achieved an accuracy of more than 80% with 60 training images. Variation was only in the position of the image, with no rotation.
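
The three-layer organization described above can be illustrated with a small software sketch. It is a toy model rather than a reconstruction of the Mark I hardware: the sensory-to-association wiring is random and fixed, as the text describes, while the training step uses the standard perceptron error-correction rule, which is an assumption here because the excerpt does not spell out the learning procedure. All function names, unit counts and thresholds are hypothetical.

```python
import random


def make_connections(n_sensory, n_assoc, fan_in, rng):
    """Wire each association unit to `fan_in` randomly chosen sensory units.

    Each connection is excitatory (+1) or inhibitory (-1) and is never trained.
    """
    connections = []
    for _ in range(n_assoc):
        chosen = rng.sample(range(n_sensory), fan_in)
        connections.append([(i, rng.choice((-1, 1))) for i in chosen])
    return connections


def association_outputs(image, connections, threshold=0):
    """An association unit fires (1) when its summed input exceeds the threshold."""
    return [
        1 if sum(sign * image[i] for i, sign in unit) > threshold else 0
        for unit in connections
    ]


def predict(weights, bias, features):
    """Response unit: fires (1) when the weighted sum of its inputs is positive."""
    total = bias + sum(w * f for w, f in zip(weights, features))
    return 1 if total > 0 else 0


def train_response_unit(samples, labels, epochs=100):
    """Adjust association-to-response weights with the error-correction rule."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, target in zip(samples, labels):
            error = target - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                bias += error
                weights = [w + error * f for w, f in zip(weights, features)]
        if mistakes == 0:  # every training sample is classified correctly
            break
    return weights, bias


def marked_sheet(side, width, rng):
    """A 1-D 'sheet' with one mark in the left (side=0) or right (side=1) half."""
    sheet = [0] * width
    half = width // 2
    sheet[rng.randrange(half) + (half if side == 1 else 0)] = 1
    return sheet
```

To mimic the left-versus-right experiment, one would generate sheets with `marked_sheet`, pass each through `association_outputs` using a single fixed set of connections from `make_connections`, and train on the resulting activations. Whether training reaches zero errors depends on the random wiring, so the sketch makes no accuracy claim.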

  • Stable Diffusion

    Stable Diffusion

    Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing AI boom. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt. Its development involved researchers from the CompVis Group at LMU Munich and Runway with a computational donation from Stability and training data from non-profit organizations. Stable Diffusion is a latent diffusion model, a kind of deep generative artificial neural network. Its code and model weights have been released publicly, and an optimized version can run on most consumer hardware equipped with a modest GPU with as little as 2.4 GB VRAM. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services. == Development == Stable Diffusion originated from a project called Latent Diffusion, developed in Germany by researchers at LMU Munich in Munich and Heidelberg University. Four of the original 5 authors (Robin Rombach, Andreas Blattmann, Patrick Esser and Dominik Lorenz) later joined Stability AI and released subsequent versions of Stable Diffusion. The technical license for the model was released by the CompVis group at LMU Munich. Development was led by Patrick Esser of Runway and Robin Rombach of CompVis, who were among the researchers who had earlier invented the latent diffusion model architecture used by Stable Diffusion. Stability AI also credited EleutherAI and LAION (a German nonprofit which assembled the dataset on which Stable Diffusion was trained) as supporters of the project. 
== Technology == === Architecture === Diffusion models, introduced in 2015, are trained with the objective of removing successive applications of Gaussian noise on training images, which can be thought of as a sequence of denoising autoencoders. The name diffusion is from the thermodynamic diffusion, since they were first developed with inspiration from thermodynamics. Models in Stable Diffusion series before SD 3 all used a variant of diffusion models, called latent diffusion model (LDM), developed in 2021 by the CompVis (Computer Vision & Learning) group at LMU Munich. Stable Diffusion consists of 3 parts: the variational autoencoder (VAE), U-Net, and an optional text encoder. The VAE encoder compresses the image from pixel space to a smaller dimensional latent space, capturing a more fundamental semantic meaning of the image. Gaussian noise is iteratively applied to the compressed latent representation during forward diffusion. The U-Net block, composed of a ResNet backbone, denoises the output from forward diffusion backwards to obtain a latent representation. Finally, the VAE decoder generates the final image by converting the representation back into pixel space. The denoising step can be flexibly conditioned on a string of text, an image, or another modality. The encoded conditioning data is exposed to denoising U-Nets via a cross-attention mechanism. For conditioning on text, the fixed, pretrained CLIP ViT-L/14 text encoder is used to transform text prompts to an embedding space. Researchers point to increased computational efficiency for training and generation as an advantage of LDMs. With 860 million parameters in the U-Net and 123 million in the text encoder, Stable Diffusion is considered relatively lightweight by 2022 standards, and unlike other diffusion models, it can run on consumer GPUs, and even CPU-only if using the OpenVINO version of Stable Diffusion. 
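
The forward-diffusion step mentioned above, in which Gaussian noise is applied to the latent representation, can be sketched with the closed-form expression commonly used for diffusion models: z_t = sqrt(ᾱ_t)·z_0 + sqrt(1 − ᾱ_t)·ε, where ᾱ_t is the running product of (1 − β_s). The code below is a plain-Python illustration only. The linear schedule and its endpoint values are assumptions chosen for the example and are not taken from Stable Diffusion's configuration, and the "latent" is a short list of numbers rather than a VAE output.

```python
import math
import random


def linear_beta_schedule(steps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_t rising linearly over the diffusion steps (assumed values)."""
    return [beta_start + (beta_end - beta_start) * t / (steps - 1) for t in range(steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s <= t."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def add_noise(latent, t, alpha_bars, rng):
    """Jump straight to step t: z_t = sqrt(a_bar_t) * z_0 + sqrt(1 - a_bar_t) * eps."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in latent]
    noisy = [signal * z + sigma * e for z, e in zip(latent, noise)]
    return noisy, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
noisy_latent, true_noise = add_noise([1.0, -2.0], 500, alpha_bars, random.Random(0))
```

During training, a denoising network is given the noisy latent and the step index and is asked to predict the noise that was added; knowing that noise is what allows the original latent to be recovered, as the check `z_0 = (z_t − sqrt(1 − ᾱ_t)·ε) / sqrt(ᾱ_t)` shows.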
==== SD XL ==== The XL version uses the same LDM architecture as previous versions, except larger: a larger UNet backbone, a larger cross-attention context, two text encoders instead of one, and training on multiple aspect ratios (not just the square aspect ratio like previous versions). The SD XL Refiner, released at the same time, has the same architecture as SD XL, but it was trained for adding fine details to preexisting images via text-conditional img2img. ==== SD 3.0 ==== The 3.0 version completely changes the backbone: instead of a UNet, it uses a Rectified Flow Transformer, which implements the rectified flow method with a Transformer. The Transformer architecture used for SD 3.0 has three "tracks", for original text encoding, transformed text encoding, and image encoding (in latent space). The transformed text encoding and image encoding are mixed during each transformer block. The architecture is named "multimodal diffusion transformer" (MMDiT), where "multimodal" means that it mixes text and image encodings inside its operations. This differs from previous versions of DiT, where the text encoding affects the image encoding, but not vice versa. === Training data === Stable Diffusion was trained on pairs of images and captions taken from LAION-5B, a publicly available dataset derived from Common Crawl data scraped from the web, where 5 billion image-text pairs were classified based on language and filtered into separate datasets by resolution, a predicted likelihood of containing a watermark, and predicted "aesthetic" score (e.g. subjective visual quality). The dataset was created by LAION, a German non-profit which receives funding from Stability AI. The Stable Diffusion model was trained on three subsets of LAION-5B: laion2B-en, laion-high-resolution, and laion-aesthetics v2 5+. 
A third-party analysis of the model's training data identified that out of a smaller subset of 12 million images taken from the original wider dataset used, approximately 47% of the sample size of images came from 100 different domains, with Pinterest taking up 8.5% of the subset, followed by websites such as WordPress, Blogspot, Flickr, DeviantArt and Wikimedia Commons. An investigation by Bayerischer Rundfunk showed that LAION's datasets, hosted on Hugging Face, contain large amounts of private and sensitive data. === Training procedures === The model was initially trained on the laion2B-en and laion-high-resolution subsets, with the last few rounds of training done on LAION-Aesthetics v2 5+, a subset of 600 million captioned images which the LAION-Aesthetics Predictor V2 predicted that humans would, on average, give a score of at least 5 out of 10 when asked to rate how much they liked them. The LAION-Aesthetics v2 5+ subset also excluded low-resolution images and images which LAION-5B-WatermarkDetection identified as carrying a watermark with greater than 80% probability. Final rounds of training additionally dropped 10% of text conditioning to improve Classifier-Free Diffusion Guidance. The model was trained using 256 Nvidia A100 GPUs on Amazon Web Services for a total of 150,000 GPU-hours, at a cost of $600,000. === Limitations === Stable Diffusion has issues with degradation and inaccuracies in certain scenarios. Initial releases of the model were trained on a dataset that consists of 512×512 resolution images, meaning that the quality of generated images noticeably degrades when user specifications deviate from its "expected" 512×512 resolution; the version 2.0 update of the Stable Diffusion model later introduced the ability to natively generate images at 768×768 resolution. Another challenge is in generating human limbs due to poor data quality of limbs in the LAION database. 
The model is insufficiently trained to replicate human limbs and faces due to the lack of representative features in the database, and prompting the model to generate images of such type can confound the model. In addition to human limbs, Stable Diffusion is unable to generate legible ambigrams and some other forms of text and typography. Stable Diffusion XL (SDXL) version 1.0, released in July 2023, introduced native 1024x1024 resolution and improved generation for limbs and text. Accessibility for individual developers can also be a problem. In order to customize the model for new use cases that are not included in the dataset, such as generating anime characters ("waifu diffusion"), new data and further training are required. Fine-tuned adaptations of Stable Diffusion created through additional retraining have been used for a variety of different use-cases, from medical imaging to algorithmically generated music. However, this fine-tuning process is sensitive to the quality of new data; low resolution images or different resolutions from the original data can not only fail to learn the new task but degrade the overall performance of the model. Even when the model is additionally trained on high quality images, it is difficult for individuals to run models in consumer electronics. For example, the training process for waifu-diffusion requires a minimum 30 GB of VRAM, which exceeds the usual resource provided in such consumer GPUs as Nvidia's GeForce 30 series, w

  • Residuated Boolean algebra

    Residuated Boolean algebra

    In mathematics, a residuated Boolean algebra is a residuated lattice whose lattice structure is that of a Boolean algebra. Examples include Boolean algebras with the monoid taken to be conjunction, the set of all formal languages over a given alphabet $\Sigma$ under concatenation, the set of all binary relations on a given set $X$ under relational composition, and more generally the power set of any equivalence relation, again under relational composition. The original application was to relation algebras as a finitely axiomatized generalization of the binary relation example, but there exist interesting examples of residuated Boolean algebras that are not relation algebras, such as the language example. == Definition == A residuated Boolean algebra is an algebraic structure $(L,\wedge,\vee,\neg,0,1,\bullet,\mathbf{I},/,\backslash)$ such that $(L,\wedge,\vee,\bullet,\mathbf{I},/,\backslash)$ is a residuated lattice and $(L,\wedge,\vee,\neg,0,1)$ is a Boolean algebra. An equivalent signature better suited to the relation algebra application is $(L,\wedge,\vee,\neg,0,1,\bullet,\mathbf{I},\triangleright,\triangleleft)$, where the unary operations $x\backslash$ and $x\triangleright$ are intertranslatable in the manner of De Morgan's laws via $x\backslash y=\neg(x\triangleright\neg y)$, $x\triangleright y=\neg(x\backslash\neg y)$, and dually $/y$ and $\triangleleft y$ as $x/y=\neg(\neg x\triangleleft y)$, $x\triangleleft y=\neg(\neg x/y)$, with the residuation axioms in the residuated lattice article reorganized accordingly (replacing $z$ by $\neg z$) to read $(x\triangleright z)\wedge y=0\ \Leftrightarrow\ (x\bullet y)\wedge z=0\ \Leftrightarrow\ (z\triangleleft y)\wedge x=0$. This De Morgan dual reformulation is motivated and discussed in more detail in the section below on conjugacy. Since residuated lattices and Boolean algebras are each definable with finitely many equations, so are residuated Boolean algebras, whence they form a finitely axiomatizable variety. == Examples == Any Boolean algebra, with the monoid multiplication $\bullet$ taken to be conjunction and both residuals taken to be material implication $x\to y$. Of the remaining 15 binary Boolean operations that might be considered in place of conjunction for the monoid multiplication, only five meet the monotonicity requirement, namely $0$, $1$, $x$, $y$ and $x\vee y$. Setting $y=z=0$ in the residuation axiom $y\leq x\backslash z\ \Leftrightarrow\ x\bullet y\leq z$, we have $0\leq x\backslash 0\ \Leftrightarrow\ x\bullet 0\leq 0$, which is falsified by taking $x=1$ when $x\bullet y=1$, $x$, or $x\vee y$. The dual argument for $z/y$ rules out $x\bullet y=y$. This just leaves $x\bullet y=0$ (a constant binary operation independent of $x$ and $y$), which satisfies almost all the axioms when the residuals are both taken to be the constant operation $x/y=x\backslash y=1$. The axiom it fails is $x\bullet\mathbf{I}=x=\mathbf{I}\bullet x$, for want of a suitable value for $\mathbf{I}$. Hence conjunction is the only binary Boolean operation making the monoid multiplication that of a residuated Boolean algebra. 
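The conclusion of the first example can be checked by brute force in the smallest case. The sketch below covers only the two-element Boolean algebra $\{0,1\}$, so it illustrates the argument rather than proving it for every Boolean algebra. It enumerates all 16 binary operations and keeps those that form a monoid and admit both residuals. The helper names (`all_binary_ops`, `is_monoid`, `has_residuals`) are illustrative and do not come from any library.

```python
from itertools import product

B = (0, 1)                   # the two-element Boolean algebra
PAIRS = list(product(B, B))  # argument pairs in the order (0,0), (0,1), (1,0), (1,1)


def all_binary_ops():
    """Yield each of the 16 binary operations on B as a dict keyed by argument pairs."""
    for outputs in product(B, repeat=4):
        yield dict(zip(PAIRS, outputs))


def is_monoid(op):
    """Check associativity and the existence of a two-sided unit I."""
    assoc = all(op[op[x, y], z] == op[x, op[y, z]]
                for x, y, z in product(B, repeat=3))
    has_unit = any(all(op[x, i] == x == op[i, x] for x in B) for i in B)
    return assoc and has_unit


def has_residuals(op):
    """Search for operations right (x\\z) and left (z/y) satisfying
    y <= x\\z  <=>  x.y <= z  <=>  x <= z/y  for all x, y, z."""
    candidates = list(all_binary_ops())
    for right, left in product(candidates, repeat=2):
        if all((y <= right[x, z]) == (op[x, y] <= z) == (x <= left[z, y])
               for x, y, z in product(B, repeat=3)):
            return True
    return False


# Truth tables (outputs in PAIRS order) of the operations that survive both checks.
survivors = [tuple(op[p] for p in PAIRS)
             for op in all_binary_ops()
             if is_monoid(op) and has_residuals(op)]
print(survivors)
```

The only surviving truth table is that of conjunction. Disjunction, exclusive or and its negation also form monoids on $\{0,1\}$, but each fails the residuation axiom at $z=0$.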
The power set $2^{X^{2}}$, made a Boolean algebra as usual with $\cap$, $\cup$ and complement relative to $X^{2}$, and made a monoid with relational composition. The monoid unit $\mathbf{I}$ is the identity relation $\{(x,x)\mid x\in X\}$. The right residual $R\backslash S$ is defined by $x(R\backslash S)y\ \Leftrightarrow\ \forall z\in X,\ zRx\Rightarrow zSy$. Dually the left residual $S/R$ is defined by $y(S/R)x\ \Leftrightarrow\ \forall z\in X,\ xRz\Rightarrow ySz$. The power set $2^{\Sigma^{*}}$, made a Boolean algebra as in the preceding example, but with language concatenation for the monoid. Here the set $\Sigma$ is used as an alphabet while $\Sigma^{*}$ denotes the set of all finite (including empty) words over that alphabet. The concatenation $LM$ of languages $L$ and $M$ consists of all words $uv$ such that $u\in L$ and $v\in M$. The monoid unit is the language $\{\varepsilon\}$ consisting of just the empty word $\varepsilon$. The right residual $M\backslash L$ consists of all words $w$ over $\Sigma$ such that $Mw\subseteq L$. The left residual $L/M$ is the same with $wM$ in place of $Mw$. == Conjugacy == The De Morgan duals $\triangleright$ and $\triangleleft$ of residuation arise as follows. Among residuated lattices, Boolean algebras are special by virtue of having a complementation operation $\neg$. 
This permits an alternative expression of the three inequalities $y\leq x\backslash z\ \Leftrightarrow\ x\bullet y\leq z\ \Leftrightarrow\ x\leq z/y$ in the axiomatization of the two residuals in terms of disjointness, via the equivalence $x\leq y\ \Leftrightarrow\ x\wedge\neg y=0$. Abbreviating $x\wedge y=0$ to $x\#y$ as the expression of their disjointness, and substituting $\neg z$ for $z$ in the axioms, they become with a little Boolean manipulation $\neg(x\backslash\neg z)\#y\ \Leftrightarrow\ x\bullet y\#z\ \Leftrightarrow\ \neg(\neg z/y)\#x$. Now $\neg(x\backslash\neg z)$ is reminiscent of De Morgan duality, suggesting that $x\backslash$ be thought of as a unary operation $f$, defined by $f(y)=x\backslash y$, that has a De Morgan dual $\neg f(\neg y)$, analogous to $\forall x\,\phi(x)=\neg\exists x\,\neg\phi(x)$. Denoting this dual operation as $x\triangleright$, we define $x\triangleright z$ as $\neg(x\backslash\neg z)$. Similarly we define another operation $z\triangleleft y$ as $\neg(\neg z/y)$. By analogy with $x\backslash$ as the residual operation associated with the operation $x\bullet$, we refer to $x\triangleright$ as the conjugate operation, or simply conjugate, of $x\bullet$. Likewise $\triangleleft y$ is the conjugate of $\bullet y$. 
Unlike residuals, conjugacy is an equivalence relation between operations: if $f$ is the conjugate of $g$ then $g$ is also the conjugate of $f$, i.e. the conjugate of the conjugate of $f$ is $f$. Another advantage of conjugacy is that it becomes unnecessary to speak of right and left conjugates, that distinction now being inherited from the difference between $x\bullet$ and $\bullet x$, which have as their respective conjugates $x\triangleright$ and $\triangleleft x$. (But this advantage accrues also to residuals when $x\backslash$ is taken to be the residual operation to $x\bullet$.) All this yields (along with the Boolean algebra and monoid axioms) the following equivalent axiomatization of a residuated Boolean algebra: $y\#x\triangleright z\ \Leftrightarrow\ x\bullet y\#z\ \Leftrightarrow\ x\#z\triangleleft y$. With this signature it remains the case that this axiomatization can be expressed as

  • Naked Objects for .NET

    Naked Objects for .NET

    Naked Objects for .NET or Naked Objects MVC is a software framework that builds upon the ASP.NET MVC framework. As the name suggests, the framework synthesizes two architectural patterns: naked objects and model–view–controller (MVC). These two patterns have been considered antithetical. However, Trygve Reenskaug (the inventor of the MVC pattern) made it clear in his foreword to Richard Pawson's PhD thesis on the naked objects pattern that he does not see them that way. The Naked Objects MVC framework takes a domain model (written as Plain Old CLR Objects) and renders it as a complete HTML application by means of a small set of generic View and Controller classes, without the need to write any user interface code. The framework uses reflection rather than code generation. The developer may then choose to create customised Views and/or Controllers, using standard ASP.NET MVC patterns, for use where the generic user interface is not suitable.
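
    The idea of rendering a plain domain object through reflection, with no hand-written view code, can be illustrated with a minimal sketch. It is written in Python rather than C# and does not use or imitate the Naked Objects for .NET API. The `Customer` class and the `render` function are hypothetical names invented for this illustration.

```python
import html
import inspect
from dataclasses import dataclass, fields


@dataclass
class Customer:
    """A plain domain object: data plus behaviour, no UI code (hypothetical example)."""
    name: str
    email: str

    def deactivate(self) -> None:
        """A domain action that a generic UI could expose as a button."""


def render(obj) -> str:
    """Build an HTML view of any dataclass instance purely by reflection."""
    # One table row per declared field, discovered at run time.
    rows = "".join(
        f"<tr><th>{html.escape(f.name)}</th>"
        f"<td>{html.escape(str(getattr(obj, f.name)))}</td></tr>"
        for f in fields(obj)
    )
    # Public methods (names without a leading underscore) become action buttons.
    actions = [
        name
        for name, _ in inspect.getmembers(type(obj), inspect.isfunction)
        if not name.startswith("_")
    ]
    buttons = "".join(f"<button>{html.escape(a)}</button>" for a in actions)
    return f"<h1>{type(obj).__name__}</h1><table>{rows}</table>{buttons}"


page = render(Customer(name="Ada & Co", email="ada@example.com"))
print(page)
```

    Because the view is derived from the object's fields and methods at run time, adding a property or an action to the domain class changes the generated page without any further user interface code.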

  • Blackboard system

    Blackboard system

    A blackboard system is an artificial intelligence approach based on the blackboard architectural model, where a common knowledge base, the "blackboard", is iteratively updated by a diverse group of specialist knowledge sources, starting with a problem specification and ending with a solution. Each knowledge source updates the blackboard with a partial solution when its internal constraints match the blackboard state. In this way, the specialists work together to solve the problem. The blackboard model was originally designed as a way to handle complex, ill-defined problems, where the solution is the sum of its parts. == Metaphor == The following scenario provides a simple metaphor that gives some insight into how a blackboard functions: A group of specialists are seated in a room with a large blackboard. They work as a team to brainstorm a solution to a problem, using the blackboard as the workplace for cooperatively developing the solution. The session begins when the problem specifications are written onto the blackboard. The specialists all watch the blackboard, looking for an opportunity to apply their expertise to the developing solution. When someone writes something on the blackboard that allows another specialist to apply their expertise, the second specialist records their contribution on the blackboard, hopefully enabling other specialists to then apply their expertise. This process of adding contributions to the blackboard continues until the problem has been solved. == Components == A blackboard-system application consists of three major components: the knowledge sources, the blackboard, and the control shell. The software specialist modules are called knowledge sources (KSs). Like the human experts at a blackboard, each knowledge source provides specific expertise needed by the application. The blackboard is a shared repository of problems, partial solutions, suggestions, and contributed information. 
The blackboard can be thought of as a dynamic "library" of contributions to the current problem that have been recently "published" by other knowledge sources. The control shell controls the flow of problem-solving activity in the system. Just as the eager human specialists need a moderator to prevent them from trampling each other in a mad dash to grab the chalk, KSs need a mechanism to organize their use in the most effective and coherent fashion. In a blackboard system, this is provided by the control shell. === Learnable Task Modeling Language === The blackboard is the central space in a multi-agent system: it describes the world and serves as a communication platform for the agents. To realize a blackboard in a computer program, a machine-readable notation is needed in which facts can be stored. One option is an SQL database; another is the Learnable Task Modeling Language (LTML). The syntax of the LTML planning language is similar to PDDL, but adds extra features such as control structures and OWL-S models. LTML was developed in 2007 as part of a much larger project called POIROT (Plan Order Induction by Reasoning from One Trial), which is a learning-from-demonstration framework for process mining. In POIROT, plan traces and hypotheses are stored in the LTML syntax for creating semantic web services. Here is a small example: a human user executes a workflow in a computer game. The user presses some buttons and interacts with the game engine. While the user interacts with the game, a plan trace is created; that is, the user's actions are stored in a log file. The log file is transformed into a machine-readable notation that is enriched with semantic attributes. The result is a text file in the LTML syntax, which is put on the blackboard. Agents (software programs in the blackboard system) are able to parse the LTML syntax. 
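
The three components described under Components (knowledge sources, blackboard, control shell) can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not a reconstruction of BB1, GBB or any LTML-based system. The blackboard is a plain dictionary, the control shell simply runs the first knowledge source whose precondition holds, and every name (`KnowledgeSource`, `Blackboard`, `control_shell`, the toy word-counting task) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class KnowledgeSource:
    """A specialist: fires only when its precondition matches the blackboard state."""
    name: str
    precondition: Callable[[dict], bool]
    action: Callable[[dict], None]


@dataclass
class Blackboard:
    """Shared repository of the problem, partial solutions and a contribution log."""
    entries: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


def control_shell(board, sources, max_cycles=20):
    """Moderator loop: pick one applicable knowledge source per cycle until solved."""
    for _ in range(max_cycles):
        if "solution" in board.entries:
            break
        ready = [ks for ks in sources if ks.precondition(board.entries)]
        if not ready:
            break  # no specialist can contribute; the problem stays unsolved
        chosen = ready[0]  # simplest possible scheduling policy (an assumption)
        chosen.action(board.entries)
        board.log.append(chosen.name)
    return board.entries.get("solution")


def count_tokens(entries):
    counts = {}
    for token in entries["tokens"]:
        counts[token] = counts.get(token, 0) + 1
    entries["counts"] = counts


# Toy task: find the most frequent word. Sources are listed out of order on
# purpose; activation depends on the blackboard state, not on list position.
sources = [
    KnowledgeSource(
        "picker",
        lambda e: "counts" in e and "solution" not in e,
        lambda e: e.update(solution=max(e["counts"], key=e["counts"].get)),
    ),
    KnowledgeSource(
        "counter",
        lambda e: "tokens" in e and "counts" not in e,
        count_tokens,
    ),
    KnowledgeSource(
        "tokenizer",
        lambda e: "text" in e and "tokens" not in e,
        lambda e: e.update(tokens=e["text"].split()),
    ),
]

board = Blackboard(entries={"text": "the cat saw the dog"})
result = control_shell(board, sources)
print(result, board.log)
```

A real control shell would rank competing knowledge sources instead of taking the first applicable one; the fixed policy here only keeps the sketch short.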
== Implementations == Two well-known early blackboard systems, BB1 and GBB, are discussed first, followed by more recent implementations and applications. The BB1 blackboard architecture was originally inspired by studies of how humans plan to perform multiple tasks in a trip, which used task planning as a simplified example of tactical planning for the Office of Naval Research. Hayes-Roth & Hayes-Roth found that human planning was more closely modeled as an opportunistic process, in contrast to the primarily top-down planners used at the time: "While not incompatible with successive-refinement models, our view of planning is somewhat different. We share the assumption that planning processes operate in a two-dimensional planning space defined on time and abstraction dimensions. However, we assume that people's planning activity is largely opportunistic. That is, at each point in the process, the planner's current decisions and observations suggest various opportunities for plan development. The planner's subsequent decisions follow up on selected opportunities. Sometimes, these decision-sequences follow an orderly path and produce a neat top-down expansion as described above. However, some decisions and observations might also suggest less orderly opportunities for plan development." A key innovation of BB1 was that it applied this opportunistic planning model to its own control, using the same blackboard model of incremental, opportunistic problem-solving that was applied to solve domain problems. Meta-level reasoning with control knowledge sources could then monitor whether planning and problem-solving were proceeding as expected or had stalled. If stalled, BB1 could switch from one strategy to another as conditions – such as the goals being considered or the time remaining – changed. 
BB1 was applied in multiple domains: construction site planning, inferring 3-D protein structures from X-ray crystallography, intelligent tutoring systems, and real-time patient monitoring. BB1 also allowed domain-general language frameworks to be designed for wide classes of problems. For example, the ACCORD language framework defined a particular approach to solving configuration problems. The problem-solving approach was to incrementally assemble a solution by adding objects and constraints, one at a time. Actions in the ACCORD language framework appear as short English-like commands or sentences for specifying preferred actions, events to trigger KSs, preconditions to run a KS action, and obviation conditions to discard a KS action that is no longer relevant. GBB focused on efficiency, in contrast to BB1, which focused more on sophisticated reasoning and opportunistic planning. GBB improves efficiency by allowing blackboards to be multi-dimensional, where dimensions can be either ordered or not, and by increasing the efficiency of pattern matching. GBB1, one of GBB's control shells, implements BB1's style of control while adding efficiency improvements. Other well-known early academic blackboard systems are the Hearsay II speech recognition system and Douglas Hofstadter's Copycat and Numbo projects. Some more recent examples of deployed real-world applications include: The PLAN component of the Mission Control System for RADARSAT-1, an Earth observation satellite developed by Canada to monitor environmental changes and Earth's natural resources. The GTXImage CAD software by GTX Corporation, which was developed in the early 1990s using a set of rulebases and neural networks as specialists operating on a blackboard system. Adobe Acrobat Capture (now discontinued), which used a blackboard system to decompose and recognize image pages to understand the objects, text, and fonts on the page. 
This function is currently built into the retail version of Adobe Acrobat as "OCR Text Recognition". Details of a similar OCR blackboard for Farsi text are in the public domain. Blackboard systems are used routinely in many military C4ISTAR systems for detecting and tracking objects. Another example of current use is in game AI, where they are considered a standard tool for adding AI to video games. == Recent developments == Blackboard-like systems have been constructed within modern Bayesian machine learning settings, using agents to add and remove Bayesian network nodes. In these "Bayesian Blackboard" systems, the heuristics can acquire more rigorous probabilistic meanings as proposals and acceptances in Metropolis–Hastings sampling through the space of possible structures. Conversely, using these mappings, existing Metropolis–Hastings samplers over structural spaces may be viewed as forms of blackboard systems even when not named as such by the authors. Such samplers are commonly found in musical transcription algorithms, for example. Blackboard systems have also been used to build large-scale intelligent systems for the annotation of media content, automating parts of traditional social science research. In this domain, the problem of integrating various AI algorithms into a single intelligent system arises spontaneously, with blackboards providing a way for a collection of distributed, modular natural language processing algorithm

  • Gundam Build Metaverse

    Gundam Build Metaverse

    Gundam Build Metaverse (Japanese: ガンダムビルドメタバース, Hepburn: Gandamu Birudo Metabāzu) is a Japanese original net animation mini-series produced by Sunrise Beyond, and the fifth series within the Gundam Build Series sub-series. The series celebrates the 10th anniversary of the Gundam Build franchise and includes characters from the previous installments. == Plot == The story is set in the same universe as the Gundam Build series, in an online metaverse space where users can use avatars to move around and interact with other users, including conducting Gunpla (Gundam plastic model) battles with them. The story centers on Rio Hōjō, a boy who lives in Hawaii and learns how to build Gunpla from a local hobbyist named Seria Urutsuki. In the metaverse, a figure known as Mask Lady teaches him the art of Gunpla battling, and he strives to get better at it every day. With his custom Lah Gundam, he seeks out ever stronger opponents. == Characters == === Main characters === Rio Hojo (ホウジョウ・リオ, Hōjō Rio) Voiced by: Chika Anzai A young boy from Hawaii who is an enthusiast of Gunpla Battle and an apprentice of the mysterious Diver "Mask Lady". Rio's Gunpla is the Lah Gundam, modeled after an entry-grade RX-78-2 Gundam from the original Mobile Suit Gundam anime series. Seria Urutsuki (ウルツキ・セリア, Urutsuki Seria) / Mask Lady (マスクレディー, Masuku Reidi) Voiced by: Rio Tsuchiya A clerk at a local hobby shop and the instructor at its Gunpla class, Seria becomes Rio's Gunpla mentor using the alias "Mask Lady". Seria's Gunpla is the ZGMF-X20A-PF Gundam Perfect Strike Freedom Rouge, based on both the MBF-02 Strike Rouge and the GAT-X105+AQM/E-YM1 Perfect Strike Gundam from Mobile Suit Gundam Seed and the ZGMF-X20A Strike Freedom Gundam from Mobile Suit Gundam Seed Destiny. 
=== Returning characters === Fumina Hoshino (ホシノ・フミナ, Hoshino Fumina) Voiced by: Yui Makino A veteran Gunpla Battler from the early days of the sport and the leader of "Team Try Fighters", she works as an advertiser and announcer within the Metaverse realm. Tatsuya Yuuki (ユウキ・タツヤ, Yūki Tatsuya) / Meijin Kawaguchi III (三代目メイジン・カワグチ, Sandaime Meijin Kawaguchi) Voiced by: Takuya Satō A builder and three-time Gunpla Battle world champion who inherited the name of the legendary Meijin Kawaguchi, becoming known as "Meijin Kawaguchi III", and who is still the current title holder. His newest Gunpla is the Gundam Amazing Barbatos Lupus, based on the ASW-G-08 Gundam Barbatos Lupus from Mobile Suit Gundam: Iron-Blooded Orphans. Riku Mikami (ミカミ・リク, Mikami Riku) / Riku (リク) Voiced by: Yūsuke Kobayashi The founder and former leader of the legendary force "Build Divers". His Gunpla is the Gundam 00 Diver Arc, the latest version of the original GN-0000DVR Gundam 00 Diver from Gundam Build Divers, incorporating elements from the 00 Gundam from Mobile Suit Gundam 00 and the Gundam AGE-FX from Mobile Suit Gundam AGE. Sarah (サラ, Sara) Voiced by: Haruka Terui An EL-Diver and member of the Build Divers. Momoka Yashiro (ヤシロ・モモカ, Yashiro Momoka) / Momo (モモ) Voiced by: Nene Hieda Member of Build Divers. Her Gunpla is the MOMOKAPOOL (R×R), an upgraded version of her PEN-01M Momokapool from Gundam Build Divers. Aya Fujisawa (フジサワ・アヤ, Fujisawa Aya) / Ayame (アヤメ) Voiced by: Manami Numakura Member of Build Divers. Her Gunpla is the F-Kunoichi Kai, an SD Gunpla based on the F91 Gundam F91 from Mobile Suit Gundam F91. Sei Iori (イオリ・セイ, Iori Sei) Voiced by: Mikako Komatsu A builder and one-time Gunpla Battle World Champion. His current Gunpla is the GAT-X105B/EG Build Strike Exceed Galaxy, the latest version of the original GAT-X105B Build Strike Gundam from Gundam Build Fighters. 
Aria von Reiji Asuna (アリーア・フォン・レイジ・アスナ, Arīa fon Reiji Asuna) Voiced by: Sachi Kokuryu A prince from the country called Arian that exists within a space colony in another dimension, who became friends with Sei Iori and together with him won the Gunpla Battle World Championship. He somehow manages to log into the metaverse to reunite with his friend, piloting the SB-011 Star Burning Gundam. Sekai Kamiki (カミキ・セカイ, Kamiki Sekai) Voiced by: Kazumi Togashi A veteran builder and former member of Team Try Fighters. He is currently the Japanese national representative champion. In the series he develops a rivalry with Hiroto similar to that of Kyoya and Rommel. His current Gunpla is the Shin Burning Gundam, the latest version of the original KMK-B01 Kamiki Burning Gundam from Gundam Build Fighters Try, which is based on the Burning Gundam and Master Gundam. Hiroto Kuga (クガ・ヒロト, Kuga Hiroto) / Hiroto (ヒロト, Hiroto) Voiced by: Chiaki Kobayashi A veteran Diver, the one responsible for discovering more EL-Divers, and a former member of the legendary force "Avalon", who later joined the unofficial "BUILD DiVERS" and eventually became its current Force Leader, as well as the current holder of the title "Hero of Gunpla". In the third episode he is the only Build Diver member who participates in the tournament, while his fellow force-mates are in the audience rooting for him and Rio. His Gunpla is the Plutine Gundam, which combines his Core Gundam II Plus (upgraded from the Core Gundam II featured in Gundam Build Divers Re:Rise) with the Pluto Armor. Magee (マギー, Magī) Voiced by: Taishi Murata A flamboyant veteran Diver who owns a shop in the metaverse and is an acquaintance of Seria's. Freddie (フレディ, Furedi) Voiced by: Ai Kakuma An alien anthropomorphic dog boy from planet Eldora, a support member to both Build Diver teams, who manages to access the metaverse from his home planet along with his fellow Eldorans. 
Ogre (オーガ, Ōga) Voiced by: Wataru Hatano Kyoya Kisugi (キスギ・キョウヤ, Kisugi Kyōya) / Kyoya Kujo (クジョウ・キョウヤ, Kujō Kyōya) Voiced by: Jun Kasama Leader of the legendary force "Avalon" and the reigning holder of the title "World Champion". He, along with Hiroto Kuga, Maria Urutsuki, and Tatsuya Yuuki, is currently at the top of the entire Gunpla world community. His current Gunpla is a recolored version of his AGE-TRYMAG Gundam TRY AGE Magnum from Gundam Build Divers Re:Rise. Susumu Sazaki (サザキ・ススム, Sazaki Susumu) Voiced by: Ryo Hirohashi Kaoruko Sazaki (サザキ・カオルコ, Sazaki Kaoruko) Voiced by: Ryo Hirohashi Mahiru Shigure (シグレ・マヒル, Shigure Mahiru) Voiced by: Rinko Natsuhi Keiko Sano (サノ・ケイコ, Sano Keiko) Voiced by: Ami Naito === Others === Maria Urutsuki (ウルツキ・マリア, Urutsuki Maria) / Mascarilla (マスカリージャ, Masukarīja) Voiced by: Ai Kakuma A mysterious masked woman with a harsh rivalry with Seria and an avatar similar to hers, she is later revealed to be Seria's younger sister Maria, who began to loathe her sister after Seria gave up on their dream of fighting for the title of Lady Kawaguchi. She later obtains the title, becoming "Lady Kawaguchi VII". Jeff (ジェフさん, Jefu-san) Voiced by: Kenta Miyake A distant relative of Seria and Maria and the owner of the hobby shop where Seria lives. Mellow Neige (メロウ・ネージュ, Merō Nēju) Voiced by: Chikano Ibuki A sentient A.I. who is the current public face of the Gunpla Metaverse. == Episodes ==
