AI Generator Outfit

AI Generator Outfit — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Label noise


    Label noise refers to errors or inaccuracies in the class labels of data instances. This is a widespread issue in machine learning datasets, arising from human annotator mistakes, unclear labeling instructions, automated labeling methods, or adversarial attacks in supervised learning. Label noise can be roughly divided into random noise, where labels are flipped independently of input features, and systematic noise, where mislabeling is dependent on certain patterns or biases in the data. Label noise can be damaging to model performance, especially for complex models that may overfit to noisy labels rather than generalizable patterns. Many approaches have been proposed to deal with the effects of label noise, including robust loss functions, noise-tolerant algorithms, data cleaning methods, and semi-supervised learning approaches. To reduce the impact of wrong labels during training, techniques like label smoothing, sample reweighting and using trusted validation sets are used. The role of noise-robust training paradigms and curriculum learning strategies to improve resilience against mislabeled data is also explored in recent research.
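
    As a rough illustration of two ideas from the paragraph above, the sketch below injects random (feature-independent) label noise into a toy label list and builds a label-smoothed target. The function names `flip_labels` and `smooth_label`, the noise rate, and the smoothing factor `epsilon` are assumptions chosen for this example; they do not come from any particular library.

```python
import random


def flip_labels(labels, num_classes, noise_rate, seed=0):
    """Inject random label noise: each label is flipped with probability
    noise_rate, independently of any input features (hypothetical helper)."""
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Replace the label with a different class chosen uniformly.
            noisy.append(rng.choice([c for c in range(num_classes) if c != y]))
        else:
            noisy.append(y)
    return noisy


def smooth_label(y, num_classes, epsilon=0.1):
    """Label smoothing: mix the one-hot target with a uniform distribution
    so that a possibly wrong label is not trusted with full confidence."""
    uniform = epsilon / num_classes
    return [(1.0 - epsilon) + uniform if c == y else uniform
            for c in range(num_classes)]


clean = [0, 1, 2, 1, 0, 2, 2, 1]
noisy = flip_labels(clean, num_classes=3, noise_rate=0.25)
target = smooth_label(1, num_classes=3, epsilon=0.1)
print(noisy)
print(target)
```

    Systematic noise would instead make the flip probability depend on the input or on the true class, which this sketch deliberately does not model.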

  • Felix, Net i Nika


    Felix, Net i Nika ("Felix, Net and Nika") is a series of Polish-language science fiction books for teenagers, written by Rafał Kosik. It follows the adventures of three friends, Felix Polon, Net Bielecki and Nika Mickiewicz, who attend the fictional Professor Kuszmiński Middle School in Warsaw. As of 2024, eighteen books have been published.

    == Books ==

    There are currently 18 books in the series:

      • Felix, Net and Nika and the Gang of Invisible People - November 2004
      • Felix, Net and Nika and the Theoretically Possible Catastrophe - November 2005
      • Felix, Net and Nika and the Palace of Dreams - November 2006
      • Felix, Net and Nika and the Trap of Immortality - November 2007
      • Felix, Net and Nika and the Orbital Conspiracy - November 2008
      • Felix, Net and Nika and the Orbital Conspiracy 2: Small Army - May 2009
      • Felix, Net and Nika and the Third Cousin - November 2009
      • Felix, Net and Nika and the Rebellion of Machines - March 2011
      • Felix, Net and Nika and the World Zero - November 2011
      • Felix, Net and Nika and the World Zero 2. Alternauts - November 2012
      • Felix, Net and Nika and the Extracurricular Stories - April 2013
      • Felix, Net and Nika and the Secret of Czerwona Hańcza - November 2013
      • Felix, Net and Nika and Curse of McKillian's House - November 2014
      • Felix, Net and Nika and (un)Safe Growing up - November 2015
      • Felix, Net and Nika and The End of The World as We Know It - November 2018
      • Felix, Net and Nika and No Chance - November 2022
      • Felix, Net and Nika and No Chance 2: Other Tomorrow - 2023
      • Felix, Net and Nika and Fantology - June 2024

    == Film ==

    A feature motion picture, Felix, Net i Nika oraz Teoretycznie Możliwa Katastrofa (Felix, Net and Nika and the Theoretically Possible Catastrophe), was released in Poland on September 28, 2012.

    == Main characters ==

    Felix Polon is a foresighted, fair-haired boy with dark brown eyes. He inherited from his father a talent for constructing various things, especially robots, which has saved his friends many times. He can make anything from nothing, always finds a way out of a situation, and almost always has a plan. Felix lives in a small family house on Serdeczna Street with his parents Marlene and Peter, his grandmother Lucy, his dog Caban (a Black Russian Terrier) and Golem Golem, a robot he built.

    Net Bielecki is quite tall and slim, has blue eyes and a high IQ. "Net" is his nickname; his true name is unknown. He is the most trendy and 'awesome' person in his entire class. He is a human calculator and excels at mathematics. He hates dictations and spelling because he is dyslexic. He is also quite lazy, absent-minded and sometimes hysterical or prone to panic. His dark blond hair looks like a heap of hay after a grenade explosion. He is best at ICT and writes many of his own programs. His love interest is Nika Mickiewicz. He lives in a top-floor penthouse apartment with his parents Lila and Mark and their newborn twins, nicknamed Pompek and Prumcia.

    Nika Mickiewicz is a girl with character. She is very brave and mature, and she likes reading books. She has curly red hair, green eyes and a few freckles. She is not very rich; she wears second-hand clothes and her only pair of black Dr. Martens shoes, and she lives in a tiny apartment. She is an orphan, but hides that fact from people for almost 3 years. However, Felix and Net, her best and possibly only friends, find out about it. She also has unusual abilities: she can move distant objects using her powers, ski uphill and knows some things by intuition. In other words, she is telekinetic.

    Manfred is a friendly AI program started and never finished by Net's father, then mastered and programmed further by Net himself. He likes going on adventures and solving mysteries with the trio much more than his actual job, which is controlling the traffic lights. He has helped the three friends many times and is their reliable and faithful friend.

    Morten is also an AI program, but he is the trio's antagonist. He appears in the first six books of the series. In the first book, the trio think they have finished him off for good, but he comes back in the third book. In the fifth and sixth books, he is the mastermind of the Orbital Conspiracy. Morten's logo also appears in all six of those books, and what he has to do with each event remains a mystery.

  • Dreams of Violets


    Dreams of Violets is a film entirely generated by artificial intelligence, produced and directed by brothers Ash and Pooya Koosha. The film will be screened at the Tribeca Film Festival on 10 June 2026. All images and characters in the film were generated using AI-powered video tools and based on journalistic reports, photographs, and eyewitness accounts.

    == Plot ==

    The film is a fictionalized dramatization of the events surrounding the massacre of Iranian civilians in January 2026. International organizations estimate the death toll at over 7,000, amidst protests and state violence that unfolded during a communications blackout.

  • A Very Fatal Murder


    A Very Fatal Murder is a podcast produced by the satirical publication The Onion. A parody of true crime podcasts, A Very Fatal Murder is hosted by fictional New York City reporter David Pascall, who travels to the small town of Bluff Springs, Nebraska, to investigate the murder of prom queen Hayley Price. Pascall is voiced by David Sidorov, who also wrote for the podcast. The podcast premiered on January 23, 2018, and consists of 7 episodes. Season 2 was released in its entirety on May 11, 2019.

    == Production ==

    A Very Fatal Murder satirizes popular true crime podcasts such as Serial, S-Town, and My Favorite Murder. According to head writer Katy Yeiser, the podcast is not meant as a takedown of any particular podcast, but rather as an ode to the genre.

    == Synopsis ==

    The podcast follows fictional investigative reporter David Pascall (voiced by David Sidorov), who is searching for the perfect murder on which to base an award-winning podcast. He is assisted by ETHL (the Extremely Timely Homicide Locator), an MIT-created computer programmed to find "the most interesting, violent, culturally relevant murder cases in America".

    == Reception ==

    The podcast received mostly positive reviews, and was largely praised for attacking true-crime tropes such as the "hot dead girl" and the romanticization of small-town America.

  • Mean shift


    Mean shift is a non-parametric feature-space mathematical analysis technique for locating the maxima of a density function, a so-called mode-seeking algorithm. Application domains include cluster analysis in computer vision and image processing.

    == History ==

    The mean shift procedure is usually credited to work by Fukunaga and Hostetler in 1975. It is, however, reminiscent of earlier work by Schnell in 1964.

    == Overview ==

    Mean shift is a procedure for locating the maxima (the modes) of a density function given discrete data sampled from that function. It is an iterative method that starts from an initial estimate $x$. Let a kernel function $K(x_i - x)$ be given. This function determines the weight of nearby points for re-estimation of the mean. Typically a Gaussian kernel on the distance to the current estimate is used, $K(x_i - x) = e^{-c\|x_i - x\|^2}$. The weighted mean of the density in the window determined by $K$ is

    $$m(x) = \frac{\sum_{x_i \in N(x)} K(x_i - x)\, x_i}{\sum_{x_i \in N(x)} K(x_i - x)}$$

    where $N(x)$ is the neighborhood of $x$, the set of points for which $K(x_i - x) \neq 0$. The difference $m(x) - x$ is called the mean shift in Fukunaga and Hostetler. The mean-shift algorithm now sets $x \leftarrow m(x)$ and repeats the estimation until $m(x)$ converges.

    Although the mean shift algorithm has been widely used in many applications, a rigorous proof of its convergence using a general kernel in a high-dimensional space is still not known. Aliyari Ghassabeh showed the convergence of the mean shift algorithm in one dimension with a differentiable, convex, and strictly decreasing profile function. However, the one-dimensional case has limited real-world applications. The convergence of the algorithm in higher dimensions with a finite number of stationary (or isolated) points has also been proved. However, sufficient conditions for a general kernel function to have finitely many stationary (or isolated) points have not been provided.

    Gaussian mean shift is an expectation–maximization algorithm.

    == Details ==

    Let the data be a finite set $S$ embedded in the $n$-dimensional Euclidean space, $X$. Let $K$ be a flat kernel that is the characteristic function of the $\lambda$-ball in $X$,

    $$K(x) = \begin{cases} 1 & \text{if } \|x\| \leq \lambda \\ 0 & \text{if } \|x\| > \lambda \end{cases}$$

    In each iteration of the algorithm, $s \leftarrow m(s)$ is performed for all $s \in S$ simultaneously. The first question, then, is how to estimate the density function given a sparse set of samples. One of the simplest approaches is to just smooth the data, e.g., by convolving it with a fixed kernel of width $h$,

    $$f(x) = \sum_i K(x - x_i) = \sum_i k\left(\frac{\|x - x_i\|^2}{h^2}\right)$$

    where $x_i$ are the input samples and $k(r)$ is the kernel function (or Parzen window). $h$ is the only parameter in the algorithm and is called the bandwidth. This approach is known as kernel density estimation or the Parzen window technique. Once we have computed $f(x)$ from the equation above, we can find its local maxima using gradient ascent or some other optimization technique. The problem with this "brute force" approach is that, for higher dimensions, it becomes computationally prohibitive to evaluate $f(x)$ over the complete search space. Instead, mean shift uses a variant of what is known in the optimization literature as multiple restart gradient descent. Starting at some guess for a local maximum, $y_k$, which can be a random input data point $x_1$, mean shift computes the gradient of the density estimate $f(x)$ at $y_k$ and takes an uphill step in that direction.

    == Types of kernels ==

    Kernel definition: Let $X$ be the $n$-dimensional Euclidean space, $\mathbb{R}^n$. The squared norm of $x$ is a non-negative number, $\|x\|^2 = x^\top x \geq 0$. A function $K: X \rightarrow \mathbb{R}$ is said to be a kernel if there exists a profile, $k: [0, \infty) \rightarrow \mathbb{R}$, such that $K(x) = k(\|x\|^2)$ and

      • $k$ is non-negative.
      • $k$ is non-increasing: $k(a) \geq k(b)$ if $a < b$.
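
    The update $x \leftarrow m(x)$ described in the overview can be sketched in a few lines. This is a minimal illustration, assuming a Gaussian kernel with constant `c`, a toy two-cluster data set, and a simple stopping tolerance; the function names and parameter values are choices made for this example and are not taken from the article or from any library.

```python
import math


def gaussian_kernel(sq_dist, c=1.0):
    """K(x_i - x) = exp(-c * ||x_i - x||^2), given the squared distance."""
    return math.exp(-c * sq_dist)


def mean_shift_step(x, points, c=1.0):
    """Return m(x): the kernel-weighted mean of all sample points around x."""
    weights = [
        gaussian_kernel(sum((pi - xi) ** 2 for pi, xi in zip(p, x)), c)
        for p in points
    ]
    total = sum(weights)
    return tuple(
        sum(w * p[d] for w, p in zip(weights, points)) / total
        for d in range(len(x))
    )


def mean_shift(x, points, c=1.0, tol=1e-8, max_iter=500):
    """Repeat x <- m(x) until the shift m(x) - x is smaller than tol."""
    for _ in range(max_iter):
        m = mean_shift_step(x, points, c)
        shift = math.sqrt(sum((mi - xi) ** 2 for mi, xi in zip(m, x)))
        x = m
        if shift < tol:
            break
    return x


# Toy data: two well-separated clusters (assumed values for illustration).
points = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
mode_a = mean_shift((0.5, 0.5), points)
mode_b = mean_shift((4.5, 4.5), points)
print(mode_a, mode_b)
```

    Each starting point climbs to the mode of the nearer cluster. Running the same loop from every sample and grouping samples by the mode they reach is one way to use the procedure for clustering.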

  • Sourcegraph


    Sourcegraph Inc. is a company developing code search and code intelligence tools that semantically index and analyze large codebases so that they can be searched across commercial, open-source, local, and cloud-based repositories. The company has two core products: Code Search and Amp. A previous core product, Cody, retains limited legacy support for existing customers. Code Search was initially released in 2013 under the name Sourcegraph, but was rebranded to Code Search when the company unveiled Cody in 2023. As of 2021, the platform had around 800,000 developers and had indexed around 54 billion lines of code. In July 2025, new accounts for Cody were discontinued, and a new AI coding project, Amp, was released. In December 2025, Amp was spun off to become a separate company.

    == History ==

    Sourcegraph Inc. was founded by Stanford graduates Quinn Slack and Beyang Liu to drive the development of a code search and code intelligence tool, formerly called Sourcegraph. It was first released in 2013 but was rebranded to Code Search in 2023. It was partly inspired by Liu's experience using Google Code Search while he was a Google intern. It was designed to "tackle the big code problem" by enabling developers to manage large codebases that span multiple repositories, programming languages, file formats, and projects.

    Code Search was initially self-hosted by each customer on their own infrastructure. Early customers included Uber, Dropbox, and Lyft.

    In 2016, Code Search was criticized for being released under a Fair Source License. The developers explained that "all of Sourcegraph's source code is publicly available and hackable" and that the license was intended to "help open sourcers strike a balance between getting paid and preserving their values". In 2018, Code Search was licensed under the Apache License 2.0, and Sourcegraph OSS has since been released under the Apache License 2.0. The commercial version, Code Search Enterprise, has been released under its own license. In 2023, Code Search was criticized for dropping the Apache license for most of its code, leaving it public but only available under its Enterprise license. In 2024, the main repository was made completely private.

    In 2019, Code Search was integrated into the GitLab codebase, giving GitLab users access to a browser-based developer platform. In 2021, a browser-based portal became available, allowing users to browse open-source projects and personal private code for free. In 2022, Sourcegraph Cloud, a commercial single-tenant cloud solution for organizations with more than 100 developers, was launched.

    Sourcegraph has raised a total of $223 million in financing to date. Its most recent round, a $125 million Series D investment in 2021, valued the company at $2.625 billion, a 300% growth from its previous valuation in 2020.

    In 2023, Sourcegraph Inc. unveiled its new product Cody and rebranded Sourcegraph to Code Search. In 2025, Sourcegraph announced the discontinuation of the Cody Free, Pro, and Enterprise Starter plans, effective July 23, 2025, and launched Amp, a new AI coding agent.

    == Products ==

    The company has three major products: Code Search, Amp, and Cody.

    === Sourcegraph Code Search ===

    The Code Search tool is used to search and summarize code. It supports over 30 programming languages and integrates with GitHub and GitLab for code hosting, Codecov for code coverage, and Jira Software for project management. Sourcegraph's Code Search uses a variant of Google's PageRank algorithm to rank results by relevance. While it was originally launched under the Apache License, on June 13, 2023, it was relicensed to the non-open-source "Sourcegraph Enterprise" license. Then, on August 22, 2024, the source code was moved to a private repository and was thus no longer source-available.

    === Sourcegraph Amp ===

    Launched in 2025, Amp can generate code, generate documentation, write tests, and perform refactoring operations on projects. The tool operates on a credit-based pricing model and is available through web interfaces, command-line tools, and IDE extensions. In December 2025, Sourcegraph announced that Amp would be spun off to become a separate company.

    === Sourcegraph Cody ===

    Cody is an AI coding application for writing and maintaining code. Cody was released in December 2023 and was available for Microsoft Visual Studio Code and most JetBrains IDEs. As of July 2025, the Cody Free, Pro, and Enterprise Starter plans have been discontinued, with only Cody Enterprise remaining available for existing enterprise customers.

  • Artificial intelligence in customer experience


    Artificial intelligence in customer experience is the use and development of artificial intelligence (AI) to aid and improve customer experience (sometimes abbreviated to CX AI). Chatbots are often seen as the first step in the development of AI within the industry, but more tailored offerings are slowly becoming available. The use of artificial intelligence in the space has since become more diverse than simply chatbots, with AI underpinning entire CX cloud platforms now used at major corporations. Contact center as a service (CCaaS) has become a core solution of the CX industry, with the CCaaS market size expected to reach $17.19 billion by 2030 in the United States alone.

    == History ==

    As with many AI applications, early implementation case studies of CX AI have demonstrated that AI can increase the quality of customer interactions and therefore the overall experience that organizations can provide. This in turn has suggested a higher return on investment, higher revenue, or both.

    The revolution in customer experience through machine learning began with chatbots. The use of this type of AI can be traced back to Alan Turing in 1950, when the Church–Turing thesis suggested that computers could use "formal reasoning" to reach conclusions.

    In 2017, Meta produced one of the first breakthroughs for everyday use of AI for customer experience when it allowed Facebook users to create their own messaging bots for free on its Facebook Messenger platform. The main focus of this was to both automate and improve customer experience and interaction.

    In 2023, CCaaS vendors began announcing the integration of ChatGPT’s generative AI into their CX solutions. Generative AI adds a layer of semantics to AI outputs. This was a major breakthrough for conversational AI. Using natural language processing and conversational AI, chatbots could enhance the level of service they provide, speaking to customers in an easy-to-understand and conversational tone.

    == Applications ==

    Currently, CX AI is mainly applied in contact centers. Historically, contact centers were simply known as call centers, but in recent years the two terms have come to be differentiated. Call centers provide phone support, while contact centers also provide support via digital channels in addition to analogue phone systems. Contact centers are therefore seen as a complete customer service solution, whereas call centers cover only one aspect of customer interactions.

    As part of improving CX, AI is also improving the employee experience. AI can automate tasks to free up time for contact center agents to focus on higher-priority tasks. For example, AI can be used for automatic summarization: instead of human agents having to summarize customer interactions, AI can now do it, saving organizations time and money.

  • DARPA Prize Competitions


    Over the years, the U.S. Defense Advanced Research Projects Agency (DARPA) has conducted numerous prize competitions to spur innovation. A prize competition allows DARPA to establish an ambitious goal, opening the door to novel approaches from the public that might otherwise appear too risky for experts in a particular field to pursue.

    == Statutory authorities ==

    In 1999, Congress provided prize competition authority to DARPA in the National Defense Authorization Act for Fiscal Year 2000 (P.L. 106–65), 10 U.S.C. § 4025, formerly 10 U.S.C. § 2374a. DARPA also conducts prize competitions under the America COMPETES Act, 15 U.S.C. § 3719.

    == Recent prize competitions ==

      • DARPA Grand Challenge (2004 and 2005) was a prize competition to spur the development of autonomous vehicle technologies. The $1 million prize went unclaimed as no vehicles could complete the challenging desert route from Barstow, CA, to Primm, NV, on March 13, 2004. A year later, on October 8, 2005, the Stanford Racing Team won the $2 million prize during the second competition of the Grand Challenge in the desert Southwest near the California/Nevada state line.

      • DARPA Urban Challenge (2007) required the competitors to build an autonomous vehicle capable of driving in traffic and performing complex maneuvers such as merging, passing, parking, and negotiating intersections. On November 3, 2007, the Carnegie Mellon Team won the $2 million prize, and its vehicle became the first autonomous vehicle that interacted with both manned and unmanned vehicle traffic in an urban environment.

      • DARPA Network Challenge (Red Balloon Challenge) (2009) explored the roles that the Internet and social networking play in solving broad-scope, time-critical problems. On December 5, 2009, the Massachusetts Institute of Technology team won $40,000 by locating the ten moored, eight-foot, red weather balloons at ten places in the United States within seven hours.

      • DARPA Digital Manufacturing Analysis, Correlation and Estimation Challenge (DMACE) (2010) was a three-month contest to showcase the potential of digital manufacturing of advanced materials. The University of California at Santa Barbara team won a $50,000 prize for crushing 180 digitally manufactured (DM) titanium mesh spheres with the most accurate predictive model of the components’ properties.

      • DARPA Shredder Challenge (2011) was to identify and assess potential capabilities and vulnerabilities to sensitive information in the national security community. Participating teams had to download images of documents shredded into more than 10,000 pieces from the Challenge website, reconstruct the documents, and solve the five puzzles. Of almost 9,000 teams, the San Francisco-based All Your Shreds Are Belong to U.S team won the $50,000 prize.

      • DARPA UAVForge Challenge (2011–2012) aimed to build and test a user-intuitive, backpack-portable unmanned aerial vehicle (UAV) that could quietly fly in and out of critical environments to conduct sustained surveillance for up to three hours. The $100,000 prize was not claimed because none of the 140 teams met the technical matrix.

      • DARPA Cash for Locating & Identifying Quick Response Codes (CLIQR) Quest Challenge (2012) explored the role the Internet and social media played in the timely communication, wide-area team-building, and urgent mobilization required to solve broad-scope, time-critical problems. The challenge offered $40,000 to the first individual or team that could locate seven posters appearing in U.S. cities bearing the DARPA logo and a quick response (QR) code within 15 days. No team found and submitted all seven codes.

      • DARPA Fast Adaptable Next-Generation Ground Vehicle (FANG) Challenge (2012–2013) was to use three competitions for the design of an infantry fighting vehicle, culminating in prototypes. In April 2013, DARPA awarded US$1 million to a three-man team during the first competition. DARPA decided not to proceed with the second and third competitions as originally planned and transitioned the technologies to the defense and commercial industry through the Digital Manufacturing and Design Innovation Institute (DMDII).

      • DARPA Spectrum Challenge (2013–2014) sought to demonstrate how a software-defined radio can use a given communication channel in the presence of other users and interfering signals. Three teams emerged as the overall winners, winning a total of $150,000 in prizes.

      • DARPA Chikungunya (CHIKV) Challenge (2014–2015) was a health-related effort to develop the most accurate predictions of CHIKV cases for all Western Hemisphere countries and territories between September 2014 and March 2015. On May 12, 2015, DARPA awarded $500,000 in prizes to the 11 winners of the competition during a scientific review.

      • DARPA Robotics Challenge (DRC) (2013–2015) aimed to develop semi-autonomous ground robots that could do "complex tasks in dangerous, degraded, human-engineered environments." A South Korean team won the first prize of $2 million, and two U.S. teams won $1 million and $500,000 as second and third winners.

      • DARPA Cyber Grand Challenge (CGC) (2014–2016) was to “create automatic defensive systems capable of reasoning about flaws, formulating patches and deploying them on a network in real time.” The top three winners were awarded prizes of $2 million, $1 million, and $750,000, respectively.

      • DARPA Spectrum Collaboration Challenge (SC2) (2016–2019) aimed to encourage the development of AI-enabled wireless networks to “ensure that the exponentially growing number of military and civilian wireless devices would have full access to the increasingly crowded electromagnetic spectrum.” A team from the University of Florida won the overall top prize of US$2 million at the final SC2 competition.

      • DARPA Subterranean (SubT) Challenge (2017–2021) was to develop robotic technologies to map, navigate, search and exploit complex underground environments. The first-place winners of the system final competition and of the virtual final competition were awarded $2 million and $750,000, respectively, with multiple prizes awarded to the second- and third-place winners.

      • DARPA Launch Challenge (2018–2020) was a $12 million satellite launch challenge to demonstrate responsive and flexible space launch capabilities from small launch providers. It was to culminate in two separate launch competitions in which the competitors had to launch a satellite to low Earth orbit (LEO) within days of each other at different locations in the United States. The competition ended without a winner.

      • DARPA Forecasting Floats in Turbulence (FFT) Challenge (2021) was to spur technologies that could predict the location of sea drifters or floats within 10 days. DARPA awarded $25,000 for first place, with prizes of $15,000 and $10,000 for second place and third place.

      • DARPA Artificial Intelligence Cyber Challenge (AIxCC) (2023–2025) was a two-year challenge that asked competitors to design novel AI systems to secure critical software code on which Americans rely. The total prize money was $29.5 million. In March 2024, the Advanced Research Projects Agency for Health (ARPA-H) partnered with DARPA, contributing an additional $20 million to the competition's prize pool to address software vulnerabilities in medical devices, hospital IT, and biotech equipment. AIxCC collaborated with Google, Microsoft, OpenAI, Anthropic, Linux Foundation, Open Source Security Foundation, Black Hat USA, and DEF CON, all of which provided AIxCC with access to large language models. In August 2024, AIxCC held the semifinal at DEF CON in Las Vegas. DARPA and ARPA-H tested all 42 submissions by running them through various open-source coding projects with deliberately injected vulnerabilities and scored the tools based on their effectiveness in identifying and fixing security flaws. Seven teams, each winning $2 million in the semifinals, competed in the final round of the AIxCC at the August 2025 DEF CON conference. Team Atlanta won first place with a $4 million prize for its cyber reasoning systems, which identified and patched vulnerabilities across 54 million lines of code.

      • DARPA Triage Challenge (2023–2026) aims to spur the development of novel physiological features for medical triage, with total prize money of $7 million. In October 2024, Challenge Event 1 was held in Perry, Georgia, featuring to-scale replicas of disaster sites such as an airplane crash and Hurricane Katrina. Teams competed based on how closely their data aligned with the agency’s official data and how quickly and accurately their autonomous systems could identify individuals most urgently in need of medical care. DARPA concluded the second year of competitions and, in November 2025, named the top performers in systems and data categories, which will advance to the final 2026 competition.

      • The DARPA Lift Challenge (2025–2026) is for participants to design unmanned aerial systems capable of carrying up to four times their own weight, with a minimum payload of 110 pounds.

  • IT baseline protection


    The IT baseline protection (German: IT-Grundschutz) approach from the German Federal Office for Information Security (BSI) is a methodology to identify and implement computer security measures in an organization. The aim is the achievement of an adequate and appropriate level of security for IT systems. To reach this goal the BSI recommends "well-proven technical, organizational, personnel, and infrastructural safeguards". Organizations and federal agencies show their systematic approach to secure their IT systems (e.g. Information Security Management System) by obtaining an ISO/IEC 27001 Certificate on the basis of IT-Grundschutz. == Overview baseline security == The term baseline security signifies standard security measures for typical IT systems. It is used in various contexts with somewhat different meanings. For example: Microsoft Baseline Security Analyzer: Software tool focused on Microsoft operating system and services security Cisco security baseline: Vendor recommendation focused on network and network device security controls Nortel baseline security: Set of requirements and best practices with a focus on network operators ISO/IEC 13335-3 defines a baseline approach to risk management. This standard has been replaced by ISO/IEC 27005, but the baseline approach was not taken over yet into the 2700x series. There are numerous internal baseline security policies for organizations, The German BSI has a comprehensive baseline security standard, that is compliant with the ISO/IEC 27000-series == BSI IT baseline protection == The foundation of an IT baseline protection concept is initially not a detailed risk analysis. It proceeds from overall hazards. Consequently, sophisticated classification according to damage extent and probability of occurrence is ignored. Three protection needs categories are established. With their help, the protection needs of the object under investigation can be determined. 
Based on these, appropriate personnel, technical, organizational and infrastructural security measures are selected from the IT Baseline Protection Catalogs. The Federal Office for Security in Information Technology's IT Baseline Protection Catalogs offer a "cookbook recipe" for a normal level of protection. Besides probability of occurrence and potential damage extents, implementation costs are also considered. By using the Baseline Protection Catalogs, costly security analyses requiring expert knowledge are dispensed with, since overall hazards are worked with in the beginning. It is possible for the relative layman to identify measures to be taken and to implement them in cooperation with professionals. The BSI grants a baseline protection certificate as confirmation for the successful implementation of baseline protection. In stages 1 and 2, this is based on self declaration. In stage 3, an independent, BSI-licensed auditor completes an audit. Certification process internationalization has been possible since 2006. ISO/IEC 27001 certification can occur simultaneously with IT baseline protection certification. (The ISO/IEC 27001 standard is the successor of BS 7799-2). This process is based on the new BSI security standards. This process carries a development price which has prevailed for some time. Corporations having themselves certified under the BS 7799-2 standard are obliged to carry out a risk assessment. To make it more comfortable, most deviate from the protection needs analysis pursuant to the IT Baseline Protection Catalogs. The advantage is not only conformity with the strict BSI, but also attainment of BS 7799-2 certification. Beyond this, the BSI offers a few help aids like the policy template and the GSTOOL. 
One data protection component is available, which was produced in cooperation with the German Federal Commissioner for Data Protection and Freedom of Information and the state data protection authorities and integrated into the IT Baseline Protection Catalog. This component is not considered, however, in the certification process. == Baseline protection process == The following steps are taken pursuant to the baseline protection process during structure analysis and protection needs analysis: the IT network is defined; IT structure analysis is carried out; protection needs determination is carried out; a baseline security check is carried out; IT baseline protection measures are implemented. Creation occurs in the following steps: IT structure analysis (survey); assessment of protection needs; selection of actions; running comparison of nominal and actual. === IT structure analysis === An IT network includes the totality of infrastructural, organizational, personnel, and technical components serving the fulfillment of a task in a particular information processing application area. An IT network can thereby encompass the entire IT landscape of an institution or an individual division, which is partitioned by organizational structures as, for example, a departmental network, or as shared IT applications, for example, a personnel information system. It is necessary to analyze and document the information technology structure in question to generate an IT security concept and especially to apply the IT Baseline Protection Catalogs. Because today's IT systems are usually heavily networked, a network topology plan offers a starting point for the analysis. The following aspects must be taken into consideration: the available infrastructure; the organizational and personnel framework for the IT network; networked and non-networked IT systems employed in the IT network; the communications connections between IT systems and externally; IT applications run within the IT network. 
=== Protection needs determination === The purpose of the protection needs determination is to investigate what protection is sufficient and appropriate for the information and information technology in use. In this connection, the damage to each application and the processed information, which could result from a breach of confidentiality, integrity or availability, is considered. Important in this context is a realistic assessment of the possible follow-on damages. A division into the three protection needs categories "low to medium", "high" and "very high" has proved its value. "Public", "internal" and "secret" are often used for confidentiality. === Modelling === Heavily networked IT systems typically characterize information technology in government and business these days. As a rule, therefore, it is advantageous to consider the entire IT system and not just individual systems within the scope of an IT security analysis and concept. To be able to manage this task, it makes sense to logically partition the entire IT system into parts and to separately consider each part or even an IT network. Detailed documentation about its structure is a prerequisite for the use of the IT Baseline Protection Catalogs on an IT network. This can be achieved, for example, via the IT structure analysis described above. The IT Baseline Protection Catalogs' components must ultimately be mapped onto the components of the IT network in question in a modelling step. === Baseline security check === The baseline security check is an organisational instrument offering a quick overview of the prevailing IT security level. With the help of interviews, the status quo of an existing IT network (as modelled by IT baseline protection) relative to the number of security measures implemented from the IT Baseline Protection Catalogs is investigated. The result is a catalog in which the implementation status "dispensable", "yes", "partly", or "no" is entered for each relevant measure. 
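The catalog produced by the baseline security check can be pictured as a simple mapping from measure to status. The following is a minimal sketch, not BSI tooling: the measure identifiers, the function name `summarize_check`, and the decision to treat "partly" and "no" as open items are assumptions made for illustration.

```python
from collections import Counter

# Implementation statuses named in the text.
STATUSES = ("dispensable", "yes", "partly", "no")


def summarize_check(check):
    """Return (status counts, open measures) for a baseline security check.

    `check` maps a measure identifier to one of STATUSES.
    """
    for measure, status in check.items():
        if status not in STATUSES:
            raise ValueError(f"unknown status {status!r} for {measure}")
    counts = Counter(check.values())
    # Nominal vs. actual comparison: "partly" and "no" still need work
    # (an assumption of this sketch).
    open_measures = sorted(m for m, s in check.items() if s in ("partly", "no"))
    return counts, open_measures


# Hypothetical interview results; the measure IDs are invented for illustration.
example = {
    "M-001": "yes",
    "M-002": "partly",
    "M-003": "no",
    "M-004": "dispensable",
}
counts, open_measures = summarize_check(example)
print(dict(counts))
print(open_measures)
```

Keeping the catalog as plain data makes the nominal vs. actual comparison a one-line filter, and the same structure can be refreshed after each round of interviews.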
By identifying measures that are not yet, or only partially, implemented, improvement options for the security of the information technology in question are highlighted. The baseline security check gives information about measures that are still missing (nominal vs. actual comparison). From this follows what remains to be done to achieve baseline protection. Not all measures suggested by this baseline check need to be implemented; particularities must be taken into account. It could be that several more or less unimportant applications are running on a server, which have lesser protection needs. In their totality, however, these applications are to be provided with a higher level of protection. This is called the cumulation effect. The applications running on a server determine its need for protection. Several IT applications can run on an IT system. When this occurs, the application with the greatest need for protection determines the IT system's protection category. Conversely, it is conceivable that an IT application with great protection needs does not automatically transfer this to the IT system. This may happen because the IT system is configured redundantly, or because only an inconsequential part is running on it. This is called the distribution effect. This is the case, fo

  • Eden: It's an Endless World!

    Eden: It's an Endless World!, also known simply as Eden (stylized in all caps), is a Japanese science fiction manga series written and illustrated by Hiroki Endo. It was serialized in Kodansha's seinen manga magazine Monthly Afternoon from September 1997 to June 2008, with its chapters collected in 18 tankōbon volumes. == Premise == The story is set in the near future, after the "closure virus" pandemic has killed 15 percent of the world's population and crippled or disfigured many more, with catastrophic effect on global politics. Its themes and many character names are taken from Gnostic mythology. == Plot == The series begins with a long introduction, with the characters Ennoia and Hannah living a peaceful life on a remote and isolated island called Eden, with researcher Lane Morris, who is their guardian and a victim of the pandemic. The events that led to this situation are revealed in flashbacks, leading up to the return of Ennoia's father, along with the forces of the Propater Federation. Following this, the story moves forward twenty years and focuses on Ennoia's son, Elijah, the main character, and his own conflict with the powerful and monopolistic Propater Federation to save his sister, Mana Ballard, kidnapped by Propater when he was very young. She is being held to threaten Ennoia Ballard, father of the two characters, who has become a powerful drug lord in South America, feared and despised by many, including, to an extent, his own family. During a terrorist attack, Elijah, aged 15, is separated from his mother Hannah, who is kidnapped along with his sister, and he now has to handle things on his own. Eden is about his coming of age as a man and his attempt to survive both bodily and morally in a world that is too complex for mere "black and white". He encounters many other characters, both allies and enemies, all sharing the same struggle to survive in a post-apocalyptic dystopian world. 
Many stories are included of the people Elijah meets, telling of their pasts or later lives, sometimes volumes later, furthering understanding of the characters and giving increased depth to the world of the book as a whole. Later in the series, the story once again moves forward in time, jumping four more years ahead. The Closure Virus, the cause of the original pandemic, mutates into a form known as "colloid" (or "Disclosure Virus"), this time assimilating non-organic as well as organic matter. The story rejoins Elijah, now 19 years old, as well as many other old characters and some new ones, as the world begins to deal with this new threat, which is swallowing many cities and many people and leaving lakes and craters behind. It is later discovered that the several colloids in the world are linked by a net of underground, self-built "cables", and that the colloid itself stores all the memories of the people it swallows. == Characters == Elijah Ballard (エリヤ・バラード, Eriya Barādo): Elijah is introduced while on the run from Propater. He becomes involved in his father's criminal activities, and undergoes a coming of age into adulthood. Ennoia Ballard (エンノイア・バラード, Ennoia Barādo): Elijah's father. Hannah Mayall (ハナ・メイオール, Hana Meiōru): Elijah's mother. Mana Ballard (マナ・バラード, Mana Barādo): Elijah's sister, who remains in Propater hands whilst her mother is rescued. Elijah's fight to free her is a focus of the later parts of the story. Nazarbaiev Khan (ナザルバイエフ・カーン, Nazarubaiefu Kān): Colonel Khan is an old soldier from Azerbaijan. He leads the Nomad group (including Kenji and Sophia) fleeing Propater at the start of the series. Khan became Kenji's mentor after killing his brother, and the two share a slightly strained but trusting relationship. Sophia Theódores (ソフィア・テオドレス, Sofia Teodoresu): A powerful Greek computer hacker and full-body cyborg. Maya (マーヤ, Māya): A nearly godlike AI, which seems to roughly correspond to the savior of Gnostic mythology. 
Kenji Asai (ケンジ・アサイ): The brother of a low-level yakuza boss. Helena Montoya (ヘレナ・モントーヤ, Herena Montōya): A prostitute now working in a brothel. She has a complex relationship with Elijah and acts as a surrogate big sister. == Media == === Manga === Eden: It's an Endless World! was written and illustrated by Hiroki Endo. The series ran in Kodansha's Monthly Afternoon magazine from September 25, 1997, to June 25, 2008. Kodansha collected its chapters into 18 tankōbon volumes, released from April 21, 1998, to July 23, 2008. In July 2005, Dark Horse Comics announced at San Diego Comic-Con that it had licensed Eden for North American distribution, with publication to begin in November of that year. As of March 2014, 14 volumes had been released in total. == Reception == Eden was named Wizard magazine's best manga of 2007. In his review of another work by Hiroki Endo, titled Hiroki Endo's Tanpenshu, David F. Smith of Newtype USA called Eden one of the best manga American money can buy.

  • Neuroshima

    Neuroshima is a Polish tabletop roleplaying system inspired by such films and games as Mad Max, Fallout, The Matrix, Terminator and Deadlands: Hell on Earth. It is currently available only in Polish. The game's motto is "never trust the machines". Its designers include Michal Oracz and Ignacy Trzewiczek. == Setting == The game describes the United States in the mid-21st century, after a nuclear war started by a cybernetic revolt, which turned the continent into a barren wasteland. The war appears to have been started by a sentient artificial intelligence commonly referred to as Moloch, made up of an interconnected net of military computers, automated factories, military facilities, power plants and the like, which now covers the whole north of the U.S., from Oregon to the Great Lakes. In the south, there is another creation, called the Neojungle, that poses a threat to those who survived the war. It is a semi-intelligent carnivorous vegetation that grows very quickly, advancing north from Latin America. Right in the middle, there are humans. They are surrounded by mutant creatures, some bred by Moloch and hostile towards humans, and some simply animals and humans misshapen by nuclear fallout. On top of that, Moloch's deadly machines lurk to complete the picture. But what is stressed in the book is that the worst enemy of humans is within them: hatred, indifference, greed. === Landscapes of Neuroshima === Car wrecks, ruined towns and villages, collapsed roofs on deserted houses, and broken glass in the windows of abandoned gas stations fill the landscape of the United States in the middle of the 21st century. Technology is history: cars will not start, radios are jammed, and there is no electricity almost anywhere the characters go. Shops and malls are looted, prosperous villages are burned by gangers, and safe places are very sparse. 
=== People in Neuroshima === No one knows how many people survived the war with the machines, but their number is estimated at around 2-3 million. Some people reverted to nomadic lifestyles and live in the deserts, some of them try to build civilisation anew in devastated cities, some of them form gangs of highwaymen (called gangers), some of them just try to make a living by growing crops, and finally, there are those who just wander around the wasteland; the adventuring sort here is mostly represented by player characters. Each village they visit in this world is a discrete microcosm, and there is no certainty as to whether the inhabitants are welcoming or shoot strangers on sight. The continent is full of small, anonymous settlements, but there are places which aspire to become post-nuclear states. === Places in Neuroshima === In this world it is very important where you come from, because people are prejudiced and afraid of strangers. Different places produce different kinds of people, and who you are is determined by where you are from. Examples: The Southern Hegemony - (commonly referred to as 'the Hegemony') - located in what was once Arizona, New Mexico and partially Texas. A place where brute force determines one's place in society. Dominated by gangs and unhampered by Moloch, the Hegemony is a threat to neighbouring lands. Vegas - the only well-lit city in the post-apocalyptic world. Home to many playhouses and casinos, it attracts people from every part of the country. Mother Desert - if you were born in the desert, whenever you go away from civilisation, you feel at home. Many Native Americans still live out there and are doing fine - after all, the warheads did not hit the deserts. Detroit - known for some of the best drivers and racers in the post-nuclear US. Home of many gangs, such as The Shultz (mafia-styled), Hurons (punkers), The League (racers), Parker Lots (gothic assassins) and the Gas Drinkers (mutant barbarians). 
New York - a place which has established a strong government and would like to rebuild America. They maintain schools, factories and railways and send soldiers to fight Moloch. Surprisingly enough, they sometimes succeed. Texas - the healthiest place in America. Actually, the only place where one can find green vegetation. Modern Texans still grow crops, breed horses and herd cattle, like their ancestors in the 19th century did. The Appalachian Federation - a place ruled by feudal lords. They have a social class system, in which people are divided into nobility and peasantry. Thanks to its iron and coal deposits, it's one of the richest places in the post-nuclear U.S. The Outpost - A mobile settlement run by scientists who aim to destroy Moloch. In coalition with New York, they manage an army, which is yet to stop Moloch's advance south. They steal technology from the machines they destroy and apply it to their own advantage. == System == The game uses its own custom system of rules, played with twenty-sided dice (d20). The system does not have an official name and is unconnected to the d20 System; it typically uses three twenty-sided dice. === Four colours === Neuroshima relies on the division of the gameplay into something the authors called Four Colours, namely steel, chrome, rust and mercury. The choice of a particular colour is made by the gamemaster (the decision can be made in consultation with the players in order to enhance the game experience) and determines the mood, atmosphere and the type of events and characters present in the story. The name of the colour itself implies the kind of gameplay it will symbolise. These colours are: Steel - this kind of gameplay is characterised by a slightly optimistic attitude towards the world. The aim is to raise the spirit of the characters by showing them that the war with the machines that is going on may be a difficult one, but it is not unwinnable, and that humans, when strong and united, can build the world anew. 
Example of a story: a unit of soldiers dispatched from the Outpost is sent to build a bunker and establish a relay base far in the north in order to plan a counter-tactic against Moloch's advance south. Chromium - is characterised by a hedonistic attitude. The characters are supposed to enjoy anything that is left from the world after the war and the story is supposed to allow them to do that. Example: the characters are offered a well-paid job by a local ganger boss who extorts wares from local tradesmen. Their job is to drive around the county, pick up the extorted items and trade them for drugs. Rust - a depressing, pessimistic mood. The characters will encounter rust, dilapidation and ruin everywhere they go. All the elements and NPCs of a story played in this mood are supposed to put the characters down and destroy their spirit. Example: the characters, badly wounded after a gunfight and robbed of all their possessions, find refuge in a village which is constantly raided by gangers. The characters' quest is to repel those attacks, but the enemies outnumber them and are well equipped, whereas the characters have nothing to fight with. Mercury (Quicksilver) - the most depressing side of the game; usually stories played in this mood end with the death of all the characters. The aim of this mood is to show that any kind of action undertaken is futile and that the war is already over, hence all the people are already dead, which is a fact they just need to realise. Example: a group of soldiers stationed in a bunker is awaiting an attack by mutants. They are well-armed and trained, but there is a mistake in the intelligence they were given and they do not know yet that they are seriously outnumbered. The attack commences at dusk and it is already too late to retreat, so the characters decide to seal off the bunker, hopeful that the mutants will not be able to get inside and will simply go away. The mutants attack the bunker with chemical weapons instead. 
The characters do not have enough gas masks to go around. As a result, those strong enough will kill the weaker ones to get their masks, not knowing that the mutants will blow up the sealed entrance the following morning. == Official rulebooks and sourcebooks == The current edition is 1.5 [1]. Sourcebooks have been appearing since the release of the game in 2003. The game keeps growing with every add-on, as does the storyline, which is updated in those sourcebooks and in Space Pirate (Polish: Gwiezdny Pirat) magazine, also published by Portal. === List of released rulebooks and sourcebooks === Neuroshima 1.0 - the original edition of the core rulebook (out of print). Neuroshima 1.5 - enhanced and revised core rulebook, with new material added and some material cut out. Wyścig (The Race) - sourcebook dedicated to cars and racing; contains rules concerning building your own vehicle and new character classes connected with driving. Gladiator - sourcebook describing in detail the "Gladiator" character class. Supplement (Supplement) - sourcebook revising the core rulebook. Detroit - sourcebook describing the city of Detroit, its inhabi

  • TCEC Season 14

    The 14th season of the Top Chess Engine Championship took place between 17 November 2018 and 24 February 2019. Stockfish was the defending champion, having defeated Komodo in the previous season's superfinal. The season is notable for two things: the emergence of two strong, new engines, the Komodo variant Komodo Monte Carlo tree search (MCTS) and the neural network engine Leela Chess Zero, and the dramatic superfinal. Komodo MCTS and Leela fought their way from Division 4 and Division 3 respectively to the Premier Division, with Leela further qualifying for the superfinal against Stockfish. The superfinal was a topsy-turvy affair with the lead changing hands several times. It finished as the closest superfinal TCEC has ever seen, with Stockfish winning by a single game, 50.5–49.5 (+10 =81 -9). == Overview == === Structure === The season comprised five divisions: from the lowest Division 4 to the Premier Division. The top two engines of each division are promoted to the division above, while the bottom two engines are relegated. The top two engines of the Premier Division contest a 100-game superfinal. The lengths of the opening books used increase as the divisions progress. The superfinal itself used a custom opening book designed by Jeroen Noomen. === Rules === The TCEC draw and win rules were slightly modified for Season 14. The game is now adjudicated as drawn if, after move 30, both engines have evals within ±0.08 for five consecutive moves, and there are neither pawn moves nor a capture. Win adjudication now occurs if both engines have an eval of ±10 for five consecutive moves. 
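The adjudication rules above can be read as a check over a sliding window of recent moves. The sketch below is one possible reading and not TCEC's actual implementation: the `Ply` record, the convention that both evaluations are expressed from White's point of view, and the treatment of "five consecutive moves" as the last five records are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Ply:
    move_number: int         # full-move number at which this record was made
    eval_white: float        # White engine's eval in pawns, from White's view
    eval_black: float        # Black engine's eval in pawns, from White's view
    pawn_move: bool = False
    capture: bool = False


WINDOW = 5  # "five consecutive moves", read here as five consecutive records


def adjudicate(history):
    """Return "draw", "white wins", "black wins", or None (play continues)."""
    if len(history) < WINDOW:
        return None
    recent = history[-WINDOW:]
    quiet = all(not p.pawn_move and not p.capture for p in recent)
    flat = all(abs(p.eval_white) <= 0.08 and abs(p.eval_black) <= 0.08
               for p in recent)
    # Draw rule: only applies once the whole window lies after move 30.
    if recent[0].move_number > 30 and quiet and flat:
        return "draw"
    # Win rule: both engines must agree on a decisive score for the same side.
    if all(p.eval_white >= 10 and p.eval_black >= 10 for p in recent):
        return "white wins"
    if all(p.eval_white <= -10 and p.eval_black <= -10 for p in recent):
        return "black wins"
    return None


# One engine sees a forced win, the other stays below the threshold.
one_sided = [Ply(60 + i, 153.0, 8.92) for i in range(5)]
print(adjudicate(one_sided))
```

Because the win rule requires both engines to cross the threshold, a game in which only one engine reports a decisive score is not adjudicated. That matches the situation the article describes for game 65, where evaluations of +153 and +8.92 were not enough.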
Following the controversy over DeusX's participation last season, the uniqueness rule for neural networks was modified such that at least two of the following three hallmarks must be unique: The code for training the neural network The neural network (and weights file) itself The engine that executes this network This change meant DeusX did not meet the uniqueness criteria and therefore did not participate. Aside from this change, the season used the standard rules of the TCEC. == Results == === Division 4 === New entrant Komodo MCTS dominated Division 4, winning by a clear four points, although it did lose a game to second-place finisher rofChade. Fellow new entrant Scorpio NN performed badly and finished last, drawing only one game and losing the rest. === Division 3 === The neural network engine Leela Chess Zero had just missed promotion to Division 2 in the previous season. Since its relatively weak performance last season was partly due to hardware problems, and since it had shown a lot of improvement in strength, it was the hot favourite in this division. Leela lived up to its billing by comprehensively defeating everyone else. In a portent of future divisions however, Leela surprisingly dropped a game to third-place Arasan. Komodo MCTS was also improving quickly, and an updated version finished second behind Leela. The gap between second and third was 6.5 points, illustrating the gulf in class. === Division 2 === Although Division 2 engines are significantly stronger than Division 3, Leela and Komodo MCTS continued to dominate the competition, and again finished first and second. Komodo MCTS only lost one game to Leela, while Leela's tendency to occasionally lose to weaker engines saw her losing a game to 4th-placed Booot. Third place finisher Xiphos gave Leela and Komodo MCTS a run for their money, and was in the running up until the final rounds when it lost a crucial game to Leela. This loss left it one point behind Komodo MCTS in the final standings. 
=== Division 1 === Leela and Komodo MCTS's rampage through the lower divisions continued, and they again finished first and second. In a demonstration of how much it had improved, Leela scored 20/28 in this division, the same score it had achieved in Division 2. This was also a TCEC points record for this division. However, Leela dropped a game against fourth-place finisher Chiron. Komodo MCTS, which had yet to lose a game in the lower divisions except to Leela, also conceded its first loss to third-place Fizbo. At the other end of the table, former champions Jonny and Fritz, which had not been updated, found themselves outclassed and finished second-last and last respectively; however with fellow competitor Ginkgo crashing five times (and therefore being disqualified), Jonny managed to stay in the division. The penultimate game for this division set a new TCEC moves record for a decisive game: 308 moves before Leela defeated Fritz. === Premier division === This was the strongest premier division ever, with multiple-time champions Stockfish, Komodo, and Houdini in the mix. Right from the start it became clear that Stockfish was in a league of its own, and it dominated the division, scoring wins against every other engine without losing a game. Second place however was a hotly-contested affair, with Leela, Komodo and Houdini neck-and-neck for most of the division. Houdini took the early lead, but Komodo gained second after winning two games by forfeit when its sibling Komodo MCTS crashed. This led to murmurs of a "Konspiracy". However, when both Komodo and Houdini failed to score more wins against the lower half of the field, Leela was able to take the lead. Halfway through the division the race was upended again when Leela went through a bad streak, losing three games in a row to Stockfish, Komodo, and Fire. This led to Komodo regaining second place, only for Komodo MCTS to crash yet again. 
By TCEC rules this meant Komodo MCTS was disqualified and all its scores were zeroed out, which put Leela back in second place. With three games left, Leela missed a win against Andscacs, which would've more or less secured her a place in the superfinal. Meanwhile, Komodo kept the division interesting by winning two of its last three games. Because Komodo had superior tiebreakers to Leela, this meant Komodo would qualify for the superfinal unless Leela managed to hold Stockfish to a draw with Black in the last game of the division. In a tense final game, Stockfish came close to winning, but missed the winning line. Leela managed to draw and qualified for the superfinal. At the other end of the table, it was quickly apparent that Ethereal and Andscacs were the weakest engines and would likely relegate. However, when Komodo MCTS was disqualified (and therefore relegated), it threw both engines a lifeline, since they could now stay in the division by beating the other. Andscacs was able to score a head-to-head win against Ethereal, but was crushed by Stockfish (+0 =2 -4) and Leela (+0 =3 -3). Ethereal didn't manage to score a win in the entire division, but did manage to score more draws than Andscacs, condemning Andscacs to relegation. === Superfinal === Going into the superfinal expectations were high for Leela: she had received a new network and had just won her first major competition when she defeated Houdini in the second TCEC cup. However, she had won the tournament without having played Stockfish (who had been surprisingly eliminated by Houdini in the semifinals). That, plus the fact that Stockfish dominated Premier Division and had never lost a match to Leela, left it unclear which engine was superior, although most spectators favored Stockfish. The superfinal turned out to be a roller-coaster. It began with Stockfish drawing first blood in game 7, and then scoring another win in game 10. 
Leela hit back with wins in game 11 and 13, but then lost games 20, 21, and 22. This gave Stockfish a 3-point lead. However, in the next 30 games, Leela was the only one to score wins: it first equalized by winning games 25, 27, and 29, and then took the lead by winning games 49 and 53. Stockfish won game 56, but Leela won game 63, maintaining her lead. There followed two dramatic games. In game 65, Leela built up a winning position. Stockfish showed a +153 evaluation, indicating that it had found a forced line leading to an endgame tablebase win; indeed analysis with 7-piece tablebases showed that Leela's position was winning. Under previous seasons' rules, the game would have been adjudicated as a win because Leela's evaluation was above 6.5. However under the new rules, Leela's +8.92 evaluation was not enough to adjudicate. It turned out that Leela could not see the winning line, and shuffled her pieces aimlessly, leading to a 50-move draw. In game 66, Stockfish was given a substantial advantage by the opening, but failed to make the most of it. The evaluations were leveling out to zero when the internet connection to the GPU servers was cut off. By tournament rules, this meant the game was replayed from scratch. After a further internet disconnection and restart, Stockfish handled the opening better and won, leaving Leela with a 1-point lead. In the last third of the superfinal, there followed more drama as Leela often built up strong advantages, but Stockfish showed great resourcefulness in defending inferior positions. Meanwh

  • Retrieval-augmented generation

    Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information from external data sources. With RAG, LLMs first refer to a specified set of documents, then respond to user queries. These documents supplement information from the LLM's pre-existing training data. This allows LLMs to use domain-specific and/or updated information that is not available in the training data. For example, this enables LLM-based chatbots to access internal company data or generate responses based on authoritative sources. RAG improves LLMs by incorporating information retrieval before generating responses. Unlike LLMs that rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources. According to Ars Technica, "RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process to help LLMs stick to the facts." This method helps reduce AI hallucinations, which have caused chatbots to describe policies that don't exist, or recommend nonexistent legal cases to lawyers that are looking for citations to support their arguments. RAG also reduces the need to retrain LLMs with new data, saving on computational and financial costs. Beyond efficiency gains, RAG also allows LLMs to include sources in their responses, so users can verify the cited sources. This provides greater transparency, as users can cross-check retrieved content to ensure accuracy and relevance. The term retrieval-augmented generation (RAG) was introduced in a 2020 paper that described combining a parametric language model with a non-parametric external memory accessed through retrieval at inference time. == RAG and LLM limitations == LLMs can provide incorrect information. 
For example, when Google first demonstrated its LLM tool "Google Bard" (later re-branded to Gemini), the LLM provided incorrect information about the James Webb Space Telescope. This error contributed to a $100 billion decline in Google's stock value. RAG is used to prevent these errors, but it does not solve all the problems. For example, LLMs can generate misinformation even when pulling from factually correct sources if they misinterpret the context. MIT Technology Review gives the example of an AI-generated response stating, "The United States has had one Muslim president, Barack Hussein Obama." The model retrieved this from an academic book rhetorically titled Barack Hussein Obama: America's First Muslim President? The LLM did not "know" or "understand" the context of the title, generating a false statement. LLMs with RAG are programmed to prioritize new information. This technique has been called "prompt stuffing." Without prompt stuffing, the LLM's input is generated by a user; with prompt stuffing, additional relevant context is added to this input to guide the model's response. This approach provides the LLM with key information early in the prompt, encouraging it to prioritize the supplied data over pre-existing training knowledge. == Process == Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating an information-retrieval mechanism that allows models to access and utilize additional data beyond their original training set. Ars Technica notes that "when new information becomes available, rather than having to retrain the model, all that's needed is to augment the model's external knowledge base with the updated information" ("augmentation"). IBM states that "in the generative phase, the LLM draws from the augmented prompt and its internal representation of its training data to synthesize" an answer. 
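The "prompt stuffing" idea described above amounts to string assembly: retrieved text is placed ahead of the user's question. Here is a minimal sketch, in which the function name `stuff_prompt`, the instruction wording, and the sample passages are all hypothetical; real systems differ in how they word and order the prompt.

```python
def stuff_prompt(question, passages):
    """Build an augmented prompt that places retrieved passages before the question."""
    # Number the passages so the model can cite them in its answer.
    context = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer using only the numbered context below and cite passage numbers.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Hypothetical retrieved passages, invented for illustration.
prompt = stuff_prompt(
    "What is the refund window?",
    ["Refunds are accepted within 30 days of purchase.",
     "Gift cards are non-refundable."],
)
print(prompt)
```

Placing the context before the question follows the text's point that key information is supplied early in the prompt. Numbering the passages is a design choice that makes cited sources easier to check.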
=== RAG key stages ===

Typically, the data to be referenced is converted into LLM embeddings: numerical representations in the form of vectors in a large vector space. RAG can be used on unstructured (usually text), semi-structured, or structured data (for example, knowledge graphs). These embeddings are then stored in a vector database to allow for document retrieval.

Given a user query, a document retriever is first called to select the most relevant documents, which will be used to augment the query. Relevance is determined by comparing the query with the stored documents; this comparison can be done using a variety of methods, which depend in part on the type of indexing used.

The system then feeds the retrieved information into the LLM via prompt engineering of the user's original query. Newer implementations (as of 2023) can also incorporate specific augmentation modules with abilities such as expanding queries into multiple domains and using memory and self-improvement to learn from previous retrievals.

Finally, the LLM generates output based on both the query and the retrieved documents. Some models incorporate extra steps to improve output, such as the re-ranking of retrieved information, context selection, and fine-tuning.

== Applications ==

Retrieval-augmented generation is used in applications where generated responses need to be grounded in external or frequently updated information. Commonly cited use cases include search engines, question-answering systems, customer support chatbots, enterprise knowledge assistants, content generation, recommendation systems, retail and e-commerce, and industrial or manufacturing workflows. In healthcare, RAG has been studied as a way to ground large language model outputs in external medical knowledge sources, although reviews have noted continuing challenges around evaluation, ethics, and clinical reliability.

== Improvements ==

Improvements to the basic process above can be applied at different stages in the RAG flow.
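The basic process described above (embed and store the documents, retrieve the most relevant ones, augment the query, then generate) can be sketched in a few lines of Python. This is only an illustrative sketch under loud assumptions: `embed` is a toy bag-of-words counter standing in for a real embedding model, a plain in-memory list stands in for a vector database, and the function names, prompt wording, and sample documents are all hypothetical. The final generation step is left out because it depends on whichever LLM is used.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Stage 1: embed every document and keep (vector, text) pairs in memory."""
    return [(embed(doc), doc) for doc in documents]


def retrieve(index, query, k=2):
    """Stage 2: rank stored documents by similarity to the query, keep the top k."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]


def augment(query, passages):
    """Stage 3: place the retrieved passages ahead of the user's question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {query}"


# Hypothetical sample data, for illustration only.
documents = [
    "The refund window for online orders is 30 days.",
    "Support is available by email on weekdays.",
    "Gift cards cannot be refunded.",
]
index = build_index(documents)
query = "How long is the refund window?"
prompt = augment(query, retrieve(index, query, k=1))
print(prompt)  # Stage 4 would pass this augmented prompt to an LLM
```

A real system would replace `embed` with a learned embedding model and the list scan in `retrieve` with a vector database query; the overall shape of the stages stays the same.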
=== Encoder ===

These methods focus on the encoding of text as either dense or sparse vectors. Sparse vectors, which encode the identity of a word, are typically dictionary-length and contain mostly zeros. Dense vectors, which encode meaning, are more compact and contain fewer zeros.

Various enhancements can improve the way similarities are calculated in the vector stores (databases). Dot products are used for similarity scoring, while approximate nearest neighbor (ANN) searches improve retrieval efficiency over K-nearest neighbors (KNN) searches. Accuracy may be improved with late interactions, which allow the system to compare words more precisely after retrieval. This helps refine document ranking and improve search relevance. Hybrid vector approaches may be used to combine dense vector representations with sparse one-hot vectors, taking advantage of the computational efficiency of sparse dot products over dense vector operations.

Other retrieval techniques focus on improving accuracy by refining how documents are selected. Some retrieval methods combine sparse representations, such as SPLADE, with query expansion strategies to improve search accuracy and recall.

=== Retriever-centric methods ===

These methods aim to enhance the quality of document retrieval in vector databases. The retriever can be pre-trained using the Inverse Cloze Task (ICT), a technique that helps the model learn retrieval patterns by predicting masked text within documents. Supervised retriever optimization aligns retrieval probabilities with the generator model's likelihood distribution. This involves retrieving the top-k vectors for a given prompt, scoring the generated response's perplexity, and minimizing the KL divergence between the retriever's selections and the model's likelihoods to refine retrieval.
Reranking techniques can refine retriever performance by prioritizing the most relevant retrieved documents during training.

=== Language model ===

By redesigning the language model with the retriever in mind, a network 25 times smaller can achieve perplexity comparable to that of its much larger counterparts. Because it is trained from scratch, this method (Retro) incurs the high cost of training runs that the original RAG scheme avoided. The hypothesis is that, by being given domain knowledge during training, Retro needs to focus less on the domain and can devote its smaller weight resources only to language semantics. It has been reported that Retro is not reproducible, so modifications were made to make it so. The more reproducible version is called Retro++ and includes in-context RAG.

=== Chunking ===

Chunking involves various strategies for breaking up the data into chunks that are then vectorized, so the retriever can find details in it. Three types of chunking strategies are:

Fixed length with overlap. This is fast and easy. Overlapping consecutive chunks helps to maintain semantic context across chunks.

Syntax-based chunking. This can break the document up into sentences. Libraries such as spaCy or NLTK can also help.

File format-based chunking. Certain file types have natural chunks built in, and it's best to respect them. For example, code files are best chunked and vectorized as whole functions or classes. HTML files should leave <table> or base64 encoded <img> elements intact.
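The first strategy above, fixed length with overlap, is simple enough to sketch directly. The following is a minimal illustration that splits on characters; the function name `chunk_fixed` and the default sizes are assumptions made for this example, and real pipelines often count tokens rather than characters.

```python
def chunk_fixed(text, size=200, overlap=50):
    """Split text into windows of `size` characters.

    Consecutive windows share `overlap` characters so that context
    cut at a boundary still appears whole in one of the chunks.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_fixed(sample, size=10, overlap=3)
print(chunks)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

In the printed result, each chunk begins with the last three characters of the previous one, which is the overlap that helps preserve context across chunk boundaries.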

    Read more →
  • Vidby

    Vidby

    Vidby AG (stylized in lower-case) is a start-up based in Rotkreuz, Switzerland, specializing in AI language translation for videos. Founded by Alexander Konovalov (Ukrainian: Олександр Коновалов) and Eugen von Rubinberg in September 2021, the company has garnered particular attention for its use in translating speeches given by President Volodymyr Zelenskyy during the Russian invasion of Ukraine.

== History ==

Vidby AG was founded by Alexander Konovalov and Eugen von Rubinberg. Konovalov is a native of Ukraine and retains Ukrainian citizenship; Rubinberg came to Switzerland from Germany and holds German citizenship. Both are residents of Switzerland. Rubinberg founded his first business, a trading company, at age 16.

In 2013, the business partners launched a consumer-oriented video-call translation service called DROTR (Droid Translator) AG. It used an AI-powered language translation technology created by Konovalov that enabled simultaneous translation of messages, voice calls and video calls in 104 written languages, 44 of which were also available in spoken form. This was the world's first video calling app with translation. The technology was pronounced a competitor of Skype and Viber by Forbes and claimed first prize at the "Innovative Breakthrough 2013" Competition.

In 2021, with a new business-oriented focus, DROTR became Vidby, with the former Google technology partners Konovalov and Rubinberg remaining at the helm, each with the title Co-CEO. While headquartered in Switzerland, Vidby's development team is, according to the company's founders, based in Ukraine.

The technology behind Vidby has an accuracy level variously reported as up to 99 percent or 99 to 100 percent, equalling the highest level of human translation. Additionally, the technology is capable of removing the original language while maintaining ambient sounds. The algorithm-based technology currently supports some 70 languages plus 60 dialects.
== Notable use ==

In addition to its use with speeches delivered by Pope Francis, the technology has been provided free of charge to Ukrainian authorities and embassies during the ongoing military conflict with Russia. By July 2022, some 70 speeches given by President Zelenskyy, totalling 650 minutes, had been translated into 30 languages, for a total of over 10,000 minutes of video material. Of its use in translating Zelenskyy's wartime speeches, Konovalov has said, "Like any citizen, I want to help defend my country."

Notable corporate clients of Vidby include Samsung, Siemens, Cisco, Kärcher, Generali and McDonald's Corporation; an academic client is Harvard University. Vidby's status as a Google Cloud Technology Partner was officially confirmed in December 2022 after a six-month audit. Denys Krasnikov, a Vidby co-founder, is responsible for cooperation with Google, YouTube, Microsoft, and other key partners. After the launch of multilingual YouTube channels, Vidby began using AI to translate and dub creators' videos for this new type of channel at the end of February 2023.

== Accolades ==

Vidby headed a list of the five best video translation services as named by TechRadar Deutschland in September 2022. In the same month, Tech Times named Vidby #1 in its list of the five best such services. It similarly topped a list of the five best content translation technologies as judged by European Business Review in October 2022. Prior to these lead-position rankings, in August 2022, it was featured as Business Insider's special start-up recommendation (German: "Unser Lesetipp auf Gründerszene"). In 2023, YouTube recognized Vidby as its recommended vendor.

    Read more →
  • Uncertain inference

    Uncertain inference

    Uncertain inference was first described by C. J. van Rijsbergen as a way to formally define the relationship between a query and a document in information retrieval. This formalization is a logical implication with an attached measure of uncertainty.

== Definitions ==

Van Rijsbergen proposes that the measure of uncertainty of a document d to a query q be the probability of its logical implication, i.e.:

$P(d \to q)$

A user's query can be interpreted as a set of assertions about the desired document. It is the system's task to infer, given a particular document, whether the query assertions are true. If they are, the document is retrieved. In many cases the contents of documents are not sufficient to assert the queries. A knowledge base of facts and rules is needed, but some of them may be uncertain because there may be a probability associated with using them for inference. Therefore, we can also refer to this as plausible inference. The plausibility of an inference $d \to q$ is a function of the plausibility of each query assertion. Rather than retrieving a document that exactly matches the query, we should rank the documents based on their plausibility with regard to that query. Since d and q are both generated by users, they are error prone; thus $d \to q$ is uncertain. This will affect the plausibility of a given query.

Treating retrieval as plausible inference accomplishes two things: it separates the process of revising probabilities from the logic, and it separates the treatment of relevance from the treatment of requests.

Multimedia documents, like images or videos, have different inference properties for each datatype. They are also different from text document properties. The framework of plausible inference allows us to measure and combine the probabilities coming from these different properties. Uncertain inference generalizes the notions of autoepistemic logic, where truth values are either known or unknown, and when known, they are true or false.
== Example ==

If we have a query of the form

$q = A \wedge B \wedge C$

where A, B and C are query assertions, then for a document D we want the probability

$P(D \to (A \wedge B \wedge C))$

If we transform this into the conditional probability $P((A \wedge B \wedge C) \mid D)$, and if the query assertions are independent, we can calculate the overall probability of the implication as the product of the individual assertions' probabilities, $P(A \mid D)\,P(B \mid D)\,P(C \mid D)$.

== Further work ==

Croft and Krovetz applied uncertain inference to an information retrieval system for office documents they called OFFICER. In office documents the independence assumption is valid, since the query will focus on their individual attributes. Besides analysing the content of documents, one can also query about the author, size, topic or collection, for example. They devised methods to compare document and query attributes, infer their plausibility and combine it into an overall rating for each document. In addition, the uncertainty of document and query contents had to be addressed.

Probabilistic logic networks is a system for performing uncertain inference; crisp true/false truth values are replaced not only by a probability, but also by a confidence level, indicating the certitude of the probability.

Markov logic networks allow uncertain inference to be performed; uncertainties are computed using the maximum entropy principle, in analogy to the way that Markov chains describe the uncertainty of finite-state machines.
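The calculation in the example above, where independent query assertions are combined by multiplication, can be illustrated with a short Python sketch. The probabilities below are invented purely for illustration, and `plausibility` is a hypothetical helper name; the sketch only shows how the product rule turns per-assertion probabilities into a document ranking.

```python
from math import prod


def plausibility(assertion_probs):
    """P((A and B and C) | D) under independence: the product of P(x | D)."""
    return prod(assertion_probs.values())


# Hypothetical per-document probabilities that each query assertion holds.
docs = {
    "doc1": {"A": 0.9, "B": 0.8, "C": 0.5},
    "doc2": {"A": 0.6, "B": 0.9, "C": 0.9},
}

# Rank documents by plausibility instead of requiring an exact match.
ranking = sorted(docs, key=lambda name: plausibility(docs[name]), reverse=True)
for name in ranking:
    print(name, round(plausibility(docs[name]), 3))
```

Here `doc2` ranks first even though `doc1` scores highest on assertion A, because a single weakly supported assertion (C for `doc1`) pulls the whole product down.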

    Read more →