AI Data Trainer/annotator

AI Data Trainer/annotator — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Isotropic position

    Isotropic position

    In the fields of machine learning, the theory of computation, and random matrix theory, a probability distribution over vectors is said to be in isotropic position if its covariance matrix is proportional to the identity matrix. == Formal definitions == Let D {\textstyle D} be a distribution over vectors in the vector space R n {\textstyle \mathbb {R} ^{n}} . Then D {\textstyle D} is in isotropic position if, for vector v {\textstyle v} sampled from the distribution, E v v T = I d . {\displaystyle \mathbb {E} \,vv^{\mathsf {T}}=\mathrm {Id} .} A set of vectors is said to be in isotropic position if the uniform distribution over that set is in isotropic position. In particular, every orthonormal set of vectors is isotropic. As a related definition, a convex body K {\textstyle K} in R n {\textstyle \mathbb {R} ^{n}} is called isotropic if it has volume | K | = 1 {\textstyle |K|=1} , center of mass at the origin, and there is a constant α > 0 {\textstyle \alpha >0} such that ∫ K ⟨ x , y ⟩ 2 d x = α 2 | y | 2 , {\displaystyle \int _{K}\langle x,y\rangle ^{2}dx=\alpha ^{2}|y|^{2},} for all vectors y {\textstyle y} in R n {\textstyle \mathbb {R} ^{n}} ; here | ⋅ | {\textstyle |\cdot |} stands for the standard Euclidean norm.

    Read more →
  • Li Sheng (computer scientist)

    Li Sheng (computer scientist)

    Li Sheng (Chinese: 李生; born 1943), is a professor at the School of Computer Science and Engineering, Harbin Institute of Technology (HIT), China. He began his research on Chinese-English machine translation in 1985, making himself one of the earliest Chinese scholars in this field. After that, he pursued in vast topics of natural language processing, including machine translation, information retrieval, question answering and applied artificial intelligence. He was the final review committee member for computer area in NSF China. Born and raised in Heilongjiang province, he graduated in 1965 from the computer specialty of HIT, which is one of the earliest computer specialties in Chinese universities. Then he started to work as a staff in the Computer specialty of HIT, which was finally granted as a department in 1985. Also from 1985, he was appointed to undertake a series administrative positions in HIT, e.g. Dean of Computer Department(1987–1988), Director of R&D Division (1988–1990), Chief R&D Officer and several other key leading positions in HIT. Resigned all his administrative positions in 2004, Li devoted himself as the director of MOE-Microsoft Join Key Lab of NLP& Speech (HIT), making it a leading NLP research group with more than 100 staffs and students working on various aspects of NLP. So far, the lab has already been granted for dozens of technology awards by the ministries of central government and local provincial government of China. Its research progresses are reported annually in top tier conferences including ACL, IJCAI, SIGIR etc. As one of the pioneers in NLP research in China, he contributes NLP in China not only in technology innovations but also in talents education. So far, his research group has graduated more than 60 Ph.D. and almost 200 M.E with NLP major. Most of them are now working as the chief researcher in various NLP groups of universities and companies in China, including several world-known NLP scholars, such as Wang Haifeng of Baidu, Zhou Ming of Microsoft Research, Zhang Min (张民) of Soochow University (China), and Zhao Tiejun (赵铁军) and Liu Ting (刘挺) of HIT. Owing to his contributions in Chinese language processing, Li was elected as the President of Chinese Information Processing Society of China (CIPSC) in 2011. He scaled this top level academic organization in China up to more than 3000 registered members, and promoted NLP into several national projects for research or industry development. In addition, the CIPSC is now enhancing its co-operations with world NLP organizations including ACL. == Machine Intelligence & Translation Laboratory (MI&TLAB) == Originates from Machine Translation Research Group of Computer Science Department, Harbin Institute of Technology, which was started Li in 1985. It is one of the earliest institutions engaged in MT research in China, featured by its investigations into Chinese-English machine translation. It is now running under the Research Center on Language Technology, School of Computer Science and Technology, HIT. Details for staffs and publications can be found at https://mitlab.hit.edu.cn. == MOE-MS Joint Key Lab of Natural Language Processing and Speech (HIT) == In June, 2000, the Joint HIT-Microsoft Machine Translation Lab was founded by MI&T Lab and Microsoft Research (China). It was the third joint lab established by Microsoft Research (China) with Chinese universities, and the only one focusing on Machine Translation. Based on this jointly lab, the cooperation between HIT and Microsoft gradually extended to the areas of machine translation, information retrieval, speech recognition and processing, natural language understanding. In Oct, 2004, the joint key lab was granted as one of the 10 joint key labs supported by the Microsoft Research of Asia and Ministry of Education in China. In July 2006, the Shenzhen extension of the lab was launched. More than 200 staff and students have undertaken research projects, including some sponsored by the National Natural Science Foundation of China and the National 863 program of China. Since 2005, the lab has also been organizing a summer camp in Harbin Institute of Technology, and approximately 150 faculty members and students from universities in China have participated. This summer workshop was organized annually until 2014, when it was organized formally as the summer school series by Chinese Information Processing Society, China. Through the lab, a Microsoft Research of Asia-HIT joint PhD program was implemented in 2012. == CEMT-I MT System == In May 1989, CEMT-I passed the formal project appraisal in Harbin, China. Capable of translating technical paper titles from Chinese to English, it is not only the first MT system completed by Li and his group, but also the first Chinese-English Translation system that passed the technical appraisal by Chinese government according to the public reports. It was then awarded the Second Prize of Ministry Level Technology Innovation by the former National Aerospace Industry Corporation in 1990. == Daya Translation Workstation == Owing to the technical achievements by Li's group in Chinese-English machine translation, the former National Aerospace Industry Corporation of China sponsored a commercial system development of "Daya Translation Station (MT)" in 1993. Designed as a comprehensive English composition aid for Chinese users, this system was finished and put into the market in 1995. And in 1997, this system was awarded the Second Prize of Ministry Level Technology Innovation by the former National Aerospace Industry Corporation. == BT863 MT System == From 1994, the researches in Li's lab were supported by National 863 Hi-tech Research and Development Program. During this period, the BT863 system was explored to employ one engine for both Chinese-English and English-Chinese translation. This system was proved to be the best performance among Chinese-English MT systems in the formal technical evaluation of National 863 program, yielding the Third Prize of Ministry Level Technology Innovation by the former National Aerospace Industry Corporation in 1997. == Next Generation IR == This is a key project granted by NSF China (with a joint sponsorship from MSRA) started form 2008. In contrast to his previous NSF grants for different NLP issues, Li explored in his last PI project on key technologies in personalized IR, together with researchers from Tsinghua University and Institute of Software, Chinese Academy of Science. With impressive publications in top tier journals and conferences (including breakthrough publications in SIGIR of his own group), this projected was approved "A-level" achievements by the NSF China office in 2012.

    Read more →
  • OCR Systems

    OCR Systems

    OCR Systems, Inc., was an American computer hardware manufacturer and software publisher dedicated to optical character recognition technologies. The company's first product, the System 1000 in 1970, was used by numerous large corporations for bill processing and mail sorting. Following a series of pitfalls in the 1970s and early 1980s, founder Theodor Herzl Levine put the company in the hands of Gregory Boleslavsky and Vadim Brikman, the company's vice presidents and recent immigrants from the Soviet Ukraine, who were able to turn OCR System's fortunes around and expand its employee base. The company released the software-based OCR application ReadRight for DOS, later ported to Windows, in the late 1980s. Adobe Inc. bought the company in 1992. == History == OCR Systems was co-founded by Theodor Herzl Levine (c. 1923 – May 30, 2005). Levine served in the U.S. Army Signal Corps during World War II in the Solomon Islands, where he helped develop a sonar to find ejected pilots in the ocean. After the war, Levine spent 22 years at the University of Pennsylvania, earning his bachelor's degree in 1951, his master's degree in electrical engineering in 1957, and his doctorate in 1968. Alongside his studies, Levine taught statistics and calculus at Temple University, Rutgers University, La Salle University and Penn State Abington. Sometime in the 1960s, Levine was hired at Philco. He and two of his co-workers decided to form their own company dedicated to optical character recognition, founding OCR Systems in 1969 in Bensalem, Pennsylvania. OCR Systems's first product, the System 1000, was announced in 1970. OCR Systems entered a partnership with 3M to resell the System 1000 throughout the United States in March 1973. This was 3M's entry into the data entry field, managed by the company's Microfilm Products Division and accompanying 3M's suite of data retrieval systems. It soon found use among Texas Instruments, AT&T, Ricoh, Panasonic and Canon for bill processing and mail sorting. Later in the mid-1970s an unspecified Fortune 500 company reneged on a contract to distribute the System 1000; later still a Canadian company distributing the System 1000 in Canada went defunct. Both incidents led OCR Systems to go nearly bankrupt, although it eventually recovered. By the early 1980s, however, the company was almost insolvent. In 1983 Levine had only $8,000 in his savings and became bedridden with an illness. He left the company in the hands of Gregory Boleslavsky and Vadim Brikman, two Soviet Ukraine expats whom Levine had hired earlier in the 1980s. Boleslavsky was hired as a wire wrapper for the System 1000 and as a programmer and beta tester for ReadRight—a software package developed by Levine implementing patents from Nonlinear Technology, another OCR-centric company from Greenbelt, Maryland. Boleslavsky in turn recommended Brikman to Levine. The two soon became vice presidents of the company while Levine was bedridden; in Boleslavsky's case, he worked 14-hour work days for over half a year in pursuit of the title. The two presented OCR Systems' products to the National Computer Conference in Chicago, where they were massively popular. The company soon gained such clients as Allegheny Energy in Pennsylvania and the postal service of Belgium and received an influx of employees—mostly expats from Russia but also Poland and South Korea, as well as American-born workers. To accommodate the company's employee base, which had grown to over 30 in 1988, Levine moved OCR System's headquarters from Bensalem to the Masons Mill Business Park in Bryn Athyn. Chinon Industries of Japan signed an agreement with OCR Systems in 1987 to distribute OCR's ReadRight 1.0 software with Chinon's scanners, starting with their N-205 overhead scanner. In 1988, OCR opened their agreement to distribute ReadRight to other scanner manufacturers, including Canon, Hewlett-Packard, Skyworld, Taxan, Diamond Flower and Abaton. That year, the company posted a revenue of $3 million. OCR Systems extended their agreement with Chinon in 1989 and introduced version 2.0 of ReadRight. OCR Systems faced stiff competition in the software OCR market in the turn of the 1990s. The Toronto-based software firm Delrina signed a letter of intent to purchase the company in November 1991, expecting the deal to close in December and have OCR software available by Christmas. OCR was to receive $3 million worth of Delrina shares in a stock swap, but the deal collapsed in January 1992. Delrine later marketed its own Extended Character Recognition, or XCR, software package to compete with ReadRight. In July 1992, OCR Systems was purchased by Adobe Inc. for an undisclosed sum. == Products == === System 1000 === The System 1000 was based on the 16-bit Varian Data 620/i minicomputer with 4 KB of core memory. The system used the 620/i for controlling the paper feed, interpreting the format of the documents, the optical character recognition process itself, error detection, sequencing and output. The System was initially programmed to recognize 1428 OCR (used by Selectrics); IBM 407 print; and the full character sets of OCR-A, OCR-B and Farrington 7B; as well as optical marks and handwritten numbers. OCR Systems promised added compatibility with more fonts available down the line—per request—in 1970. The number of fonts supported was limited by the amount of core memory, which was expandable in 4 KB increments up to 32 KB. The System 1000 later supported generalized typewriter and photocopier fonts. The rest of the System 1000 comprised the document transport, one or more scanner elements, a CRT display and a Teletype Model 33 or 35. Pages are fed via friction with a rubber belt. Up to three lines could be scanned per document, while the rest of the scanned document could be laid out in any manner granted there was enough space around the fields to be read. The reader initially supported pages as small as 3.25 in by 3.5 in dimension (later supporting 2.6 in by 3.5 in utility cash stubs) all the way to the standard ANSI letter size (8.5 in by 11 in; later 8.5 in by 12 in as used in stock certificates). The initial System 1000 had a maximum throughput of 420 documents per minute per transport (later 500 documents per minute), contingent on document size and content. A feature unique to the System 1000 over other optical character recognition systems of the time was its ability to alert the operator when a field was unreadable or otherwise invalid. This feature, called Document Referral, placed the document in front of the operator and displayed a blank field on the screen of the included CRT monitor for manual re-entry via keyboard. Once input, data could be output to 7- or 9-track tape, paper tape, punched cards and other mass storage media or to System/360 mainframes for further processing. The complete System 1000 could be purchased for US$69,000. Options for renting were $1,800 per month on a three-year lease or $1,600 per month for five years. Computerworld wrote that it was less than half the cost of its competitors while more capable and user-friendly. Competing systems included the Recognition Equipment Retina, the Scan-Optics IC/20 and the Scan-Data 250/350. === ReadRight === ReadRight processes individual letters topographically: it breaks down the scanned letter into parts—strokes, curves, angles, ascenders and descenders—and follows a tree structure of letters broken down into these parts to determine the corresponding character code. ReadRight was entirely software-based, requiring no expansion card to work. Version 2.01, the last version released for DOS, runs in real mode in under 640 KB of RAM. OCR Systems released the Windows-only version 3.0 in 1991 while offering version 2.01 alongside it. The company unveiled a sister product, ReadRight Personal, dedicated to handheld scanners and for Windows only in October 1991. This version adds real-time scanning—each word is updated to the screen while lines are being scanned. ReadRight proper was later made a Windows-only product with version 3.1 in 1992. The inclusion of ReadRight 2.0 with Canon's IX-12F flatbed scanner led PC Magazine to award it an Editor's Choice rating in 1989. Despite this, reviewer Robert Kendall found qualification with ReadRight's ability to parse proportional typefaces such as Helvetica and Times New Roman. Mitt Jones of the same publication found version 2.01 to have improved its ability to read such typefaces and praised its ease of use and low resource intensiveness. Jones disliked the inability to handle uneven page paragraph column widths and graphics, noting that the manual recommended the user block out graphics with a Post-it Note. Version 3.1 for Windows received mixed reviews. Mike Heck of InfoWorld wrote that its "low cost and rich collection of features are hard to ignore" but rated its speed and accuracy average. Barry Simon of PC Magazine called it economical but inaccurate, unable to correct errors it did

    Read more →
  • Emma Brunskill

    Emma Brunskill

    Emma Patricia Brunskill is an American computer scientist. Her research combines machine learning with human–computer interaction by studying the effects of AI systems in human-centered applications including educational software and healthcare, and the theory of reinforcement learning in situations where mistakes impose high risks or costs. She is an associate professor of computer science at Stanford University, where she also holds a courtesy appointment in the Stanford Graduate School of Education and is an affiliate of the King Center on Global Development. == Education and career == Brunskill grew up in Seattle and Edmonds, Washington, and entered the University of Washington at age 15. She graduated magna cum laude in 2000, with a bachelor's degree in computer engineering and physics. A Rhodes Scholarship took her to Magdalen College, Oxford in England, where she received a master's degree in neuroscience in 2002. After a summer working in Rwanda, she became a graduate student of computer science at the Massachusetts Institute of Technology, where she completed her Ph.D. in 2009. Her doctoral dissertation, Compact parametric models for efficient sequential decision making in high-dimensional, uncertain domains, was supervised by Nicholas Roy. After working as an NSF Postdoctoral Research Fellow at the University of California, Berkeley, she joined Carnegie Mellon University (CMU) in 2011 as an assistant professor of computer science. She moved from CMU to Stanford University in 2017. == Recognition == Brunskill was a 2014 recipient of the National Science Foundation CAREER Award and a 2015 recipient of the Office of Naval Research Young Investigator Award. She was one of two alumni of the University of Washington's Paul G. Allen School of Computer Science and Engineering to be honored in 2020 by the school's Alumni Impact Awards. She was elected as a Fellow of the Association for the Advancement of Artificial Intelligence in 2025, "for significant contributions to the field of reinforcement learning, and applications for societal benefit, in particular AI for education".

    Read more →
  • Plant Nanny

    Plant Nanny

    Plant Nanny is a water tracker mobile application which reminds users to drink water. It was developed by Taiwanese app maker Fourdesire. The app was first released in 2013 and is available on the Apple App Store for iPhones and the Google Play Store for Android devices. == Description == Play Nanny uses a game method that allows users to turn their virtual selves into plants, which grows and thrives as the user drinks more water. The app sends occasional push notifications to remind users to drink water throughout the day. Users can choose from a wide range of plants, including cacti and carnations, and track their water intake. The app uses two resources, How to calculate how much water you should drink by Jennifer Stone (2018) and Human energy requirements by the Food and Agriculture Organization (2004), to calculate the recommended daily water intake for its users. Upon downloading the app, users are prompted to input basic personal information which is then used to calculate the recommended daily water intake and prompts them to drink the appropriate amount. == Accolades ==

    Read more →
  • Karsten Borgwardt

    Karsten Borgwardt

    Karsten Borgwardt (born 1980) is a German computer scientist and biologist specializing in machine learning and computational biology. Since February 2023, he has been a director at the Max Planck Institute of Biochemistry in Martinsried, Germany, where he leads the Department of Machine Learning and Systems Biology. == Education and career == Borgwardt was born in Kaiserslautern. He obtained a Diplom (equivalent to a master’s degree) in computer science from LMU Munich in 2004 and a Master of Science in biology from the University of Oxford in 2003. In 2007, he obtained his PhD from LMU Munich in computer science. Following a postdoctoral position at the University of Cambridge, he became a research group leader for machine learning and computational biology at the Max Planck Institute for Biological Cybernetics and the former Max Planck Institute for Developmental Biology in Tübingen in 2008. In 2011, Borgwardt was appointed professor of data mining in the life sciences at the University of Tübingen. In 2014, he joined ETH Zurich as an associate professor in the Department of Biosystems Science and Engineering (D-BSSE) and was promoted to full professor in 2017. During his tenure at ETH Zurich, he coordinated significant research programs, including two Marie Curie Innovative Training Networks and the Personalized Swiss Sepsis Study, focusing on the prediction of sepsis using machine learning. In 2023, he was appointed as Scientific Member of the Max Planck Society and as Director at the Max Planck Institute of Biochemistry in Martinsried. == Research contributions == Borgwardt’s research integrates big data analysis with biomedical research. He develops novel machine learning algorithms to detect patterns and statistical dependencies in large biological and medical datasets. His work aims to enable the automatic generation of new knowledge from big data and to understand the relationship between the function of biological systems and their molecular properties, which is fundamental for personalized medicine. == Awards and honors == During his studies, he was a scholar of the Stiftung Maximilianeum, and the Bavarian Foundation for the Promotion of the Gifted. Borgwardt received scholarships from the Studienstiftung des deutschen Volkes in 2002 and 2007. His PhD dissertation received the Heinz Schwärtzel Dissertation Award for Foundations of Computer Science in 2007. As a professor in Tübingen, he was awarded the Alfried-Krupp-Förderpreis for Young Professors in 2013. In 2015, he received an SNSF Starting Grant. In 2014, 2015 and 2016, he was listed in “Top 40 under 40” in Germany rankings selected by Capital magazine. In 2018, Borgwardt was named among “25 individuals who have the potential to shape the next 25 years” by Focus magazine. In 2023, Borgwardt received an honorary professorship from LMU Munich by the Faculty of Chemistry and Pharmacy. Publications from Borgwardt's group have received the Outstanding Student Paper Award in NIPS in 2009, the SIB Graduate Paper Award in 2020 and SIB Remarkable Output Awards in 2020 and 2021 from the Swiss Institute of Bioinformatics (SIB). == Selected publications == Weisfeiler-Lehman Graph Kernels (’‘Journal of Machine Learning Research’’, 2011): Introduced an efficient graph kernel based on the Weisfeiler-Lehman algorithm. “Direct antimicrobial resistance prediction from clinical MALDI-TOF mass spectra using machine learning” (’‘Nature Medicine’’, 2022): showcased the feasibility of predicting antimicrobial resistance from readily collected mass spectrometry data in the hospital. The new method is able to identify antibiotic resistance 24 hours earlier than previous methods.

    Read more →
  • Best AI Coding Assistants in 2026

    Best AI Coding Assistants in 2026

    Curious about the best AI coding assistant? An AI coding assistant is software that uses machine learning to help you get more done — it combines speed, accuracy, and an interface that just works. Hands-on testing shows real-world results vary, so a short free trial is the smartest way to decide. Whether you are a beginner or a pro, the right AI coding assistant slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Law and Corpus Linguistics

    Law and Corpus Linguistics

    Law and corpus linguistics (LCL) is an academic sub-discipline that uses large databases of examples of language usage equipped with tools designed by linguists called corpora to better get at the meaning of words and phrases in legal texts (statutes, constitutions, contracts, etc.). Thus, LCL is the application of corpus linguistic tools, theories, and methodologies to issues of legal interpretation in much the same way law and economics is the application of economic tools, theories, and methodologies to various legal issues. == History == A 2005 law review article by Lawrence Solan noted in passing that corpus linguistics had potential for its application to interpreting legal texts. But the first systematic exploration and advocacy of applying the tools and methodologies of corpus linguistics to legal interpretive questions of law and corpus linguistics came in the fall of 2010, when the BYU Law Review published a note by Stephen Mouritsen, entitled The Dictionary is Not a Fortress: Definitional Fallacies and a Corpus-Based Approach to Plain Meaning. The note argued that dictionaries are the primary linguistic tool used by judges to determine the plain or ordinary meaning of words and phrases, and highlighted the deficiencies of such an approach. In its stead, the note proposed using corpus linguistics. And the note would be later cited by Adam Liptak in a New York Times article on statutory construction. Law and corpus linguistics (LCL) gained greater legitimacy in July 2011 with the first judicial opinion in American history utilizing corpus linguistics to determine the meaning of a legal text: In re the Adoption of Baby E.Z. In a concurrence in part and in the judgment, Justice Thomas Lee wrote to put forth an alternative ground for the majority's holding—interpreting the phrase "custody determination" by using corpus linguistics. Justice Lee looked at 500 randomized sample sentences from the Corpus of Contemporary American English (COCA) and found that the most common sense of "custody" was in the context of divorce rather than adoption. Further, he found that "custody" is ten times more likely to co-occur (or collocate) with "divorce" than with "adoption". From that evidence Justice Lee concluded that he "would find that the custody proceedings covered by the Act are limited to proceedings resulting in the modifiable custody orders of a divorce", rather than the broader range of custody proceedings. Other jurisprudence and scholarship would follow. In a 2015 concurrence in State v. Rasabout, Justice Lee used a COCA search to determine that "discharge" when used with a firearm (or one of its synonyms) overwhelmingly referred to a single shot rather than emptying the entire magazine of the weapon. And in 2016, four of the five justices joined a footnote in a majority opinion by Justice Lee commending a party for using corpus linguistics in its briefing even though the Court found it unnecessary to resolve the related question. Finally, in 2016 the Michigan Supreme Court became the first court to use a linguist-designed corpus in a majority opinion (COCA), with both the majority and the dissent turning to COCA to determine the meaning of the word "information". In 2020, courts desiring to bolster the legal theory of original intent have sought the opportunity to undertake analyses of statutes utilizing corpus linguistics. In a Ninth Circuit Court of Appeals case, Jones v. Becerra (No. 20-56174), a case involving the Second Amendment and the constitutionality of a California statute which bans the sale of firearms to individuals under the age of 21, a Ninth Circuit panel requested that the parties address three questions: 1) “What is the original public meaning of the Second Amendment phrases: ‘A well regulated Militia’; ‘the right of the people’; and ‘shall not be infringed’? 2) How does the tool of corpus linguistics help inform the determination of the original public meaning of those Second Amendment phrases?” 3) How do the data yielded from corpus linguistics assist in the interpretation of the constitutionality of age-based restrictions under the Second Amendment? As to scholarship, in 2012, Mouritsen followed up his original work with an article in the Columbia Science and Technology Law Review, where he further refined and promoted the use of corpus-based methods for determining questions of legal ambiguity. Additionally, in 2016 two essays and an article on law and corpus linguistics were published. The Yale Law Journal Forum published Corpus Linguistics & Original Public Meaning: A New Tool to Make Originalism More Empirical. Written by Justice Lee and two co-authors, the essay urged originalists to turn to corpus linguistics to improve the rigor and accuracy of originalist scholarship. And in response, the Forum published an essay by Lawrence Solan (a Brooklyn Law professor with a PhD in linguistics), Can Corpus Linguistics Help Make Originalism Scientific? The Boston University Public Interest Law Journal published The Merciful Corpus: The Rule of Lenity, Ambiguity and Corpus Linguistics by Daniel Ortner. In the article Ortner applied corpus linguistics to determining whether sufficient ambiguity exists to trigger the rule of lenity in five Supreme Court cases. Looking forward, in 2017 two more articles are slated for publication. Lee Strang focuses on corpus linguistics and originalism in the U.C. Davis Law Review, and Lawrence Solan and Tammy Gales explore corpus linguistics in the context of finding ordinary meaning in statutory interpretation in the International Journal of Legal Discourse. Lawyers and journalists have also taken notice of corpus linguistics at it relates to the law. In 2010, Neal Goldfarb filed the first known brief in the Supreme Court using corpus linguistics (COCA) to determine whether the ordinary meaning of "personal" referred to corporations in the case FCC v. AT&T. The amicus brief looked at the top collocates (words that co-occur) of "personal" in COHA as well as BYU's Time Magazine Corpus. And writing for The Atlantic, Ben Zimmer took note of this new trend, referring to corpus linguistics in the courts as "Like Lexis on Steroids". On the academic front, in 2013 BYU Law School started the first class on law and corpus linguistics, co-taught by Mouritsen, Lee, and (now Dean) Gordon Smith. The class is currently in its fourth year. And in February 2016, BYU Law School hosted the inaugural conference on LCL, with over two dozen legal and linguistic scholars from around the country discussing and debating the next steps forward for the growing academic movement. The conference has been held regularly in subsequent years. At the 2016 conference BYU Law School announced its plans and progress on the Corpus of Founding Era American English (COFEA), a corpus that covers 1760–1799 and contains more than 120 million words have been collected from founding era letters, diaries, newspapers, non-fiction books, fiction, sermons, speeches, debates, legal cases, and other legal materials.

    Read more →
  • Super-resolution optical fluctuation imaging

    Super-resolution optical fluctuation imaging

    Super-resolution optical fluctuation imaging (SOFI) is a post-processing method for the calculation of super-resolved images from recorded image time series that is based on the temporal correlations of independently fluctuating fluorescent emitters. SOFI has been developed for super-resolution of biological specimen that are labelled with independently fluctuating fluorescent emitters (organic dyes, fluorescent proteins). In comparison to other super-resolution microscopy techniques such as STORM or PALM that rely on single-molecule localization and hence only allow one active molecule per diffraction-limited area (DLA) and timepoint, SOFI does not necessitate a controlled photoswitching and/ or photoactivation as well as long imaging times. Nevertheless, it still requires fluorophores that are cycling through two distinguishable states, either real on-/off-states or states with different fluorescence intensities. In mathematical terms SOFI-imaging relies on the calculation of cumulants, for what two distinguishable ways exist. For one thing an image can be calculated via auto-cumulants that by definition only rely on the information of each pixel itself, and for another thing an improved method utilizes the information of different pixels via the calculation of cross-cumulants. Both methods can increase the final image resolution significantly although the cumulant calculation has its limitations. Actually SOFI is able to increase the resolution in all three dimensions. == Principle == Likewise to other super-resolution methods SOFI is based on recording an image time series on a CCD- or CMOS camera. In contrary to other methods the recorded time series can be substantially shorter, since a precise localization of emitters is not required and therefore a larger quantity of activated fluorophores per diffraction-limited area is allowed. The pixel values of a SOFI-image of the n-th order are calculated from the values of the pixel time series in the form of a n-th order cumulant, whereas the final value assigned to a pixel can be imagined as the integral over a correlation function. The finally assigned pixel value intensities are a measure of the brightness and correlation of the fluorescence signal. Mathematically, the n-th order cumulant is related to the n-th order correlation function, but exhibits some advantages concerning the resulting resolution of the image. Since in SOFI several emitters per DLA are allowed, the photon count at each pixel results from the superposition of the signals of all activated nearby emitters. The cumulant calculation now filters the signal and leaves only highly correlated fluctuations. This provides a contrast enhancement and therefore a background reduction for good measure. As it is implied in the figure on the left the fluorescence source distribution: ∑ k = 1 N δ ( r → − r → k ) ⋅ ε k ⋅ s k ( t ) {\displaystyle \sum _{k=1}^{N}\delta ({\vec {r}}-{\vec {r}}_{k})\cdot \varepsilon _{k}\cdot s_{k}(t)} is convolved with the system's point spread function (PSF) U(r). Hence the fluorescence signal at time t and position r → {\displaystyle {\vec {r}}} is given by F ( r → , t ) = ∑ k = 1 N U ( r → − r → k ) ⋅ ε k ⋅ s k ( t ) . {\displaystyle F({\vec {r}},t)=\sum _{k=1}^{N}U({\vec {r}}-{\vec {r}}_{k})\cdot \varepsilon _{k}\cdot s_{k}(t).} Within the above equations N is the amount of emitters, located at the positions r → k {\displaystyle {\vec {r}}_{k}} with a time-dependent molecular brightness ε k ⋅ s k {\displaystyle \varepsilon _{k}\cdot s_{k}} where ε k {\displaystyle \varepsilon _{k}} is a variable for the constant molecular brightness and s k ( t ) {\displaystyle s_{k}(t)} is a time-dependent fluctuation function. The molecular brightness is just the average fluorescence count-rate divided by the number of molecules within a specific region. For simplification it has to be assumed that the sample is in a stationary equilibrium and therefore the fluorescence signal can be expressed as a zero-mean fluctuation: δ F ( r → , t ) = F ( r → , t ) − ⟨ F ( r → , t ) ⟩ t {\displaystyle \delta F({\vec {r}},t)=F({\vec {r}},t)-\langle F({\vec {r}},t)\rangle _{t}} where ⟨ ⋯ ⟩ t {\displaystyle \langle \cdots \rangle _{t}} denotes time-averaging. The auto-correlation here e.g. the second-order can then be described deductively as follows for a certain time-lag τ {\displaystyle \tau } : δ F ( r → , t ) = ⟨ δ F ( r → , t + τ ) ⋅ δ F ( r → , t ) ⟩ t {\displaystyle \delta F({\vec {r}},t)=\langle \delta F({\vec {r}},t+\tau )\cdot \delta F({\vec {r}},t)\rangle _{t}} From these equations it follows that the PSF of the optical system has to be taken to the power of the order of the correlation. Thus in a second-order correlation the PSF would be reduced along all dimensions by a factor of 2 {\displaystyle {\sqrt {2}}} . As a result, the resolution of the SOFI-images increases according to this factor. === Cumulants versus correlations === Using only the simple correlation function for a reassignment of pixel values, would ascribe to the independency of fluctuations of the emitters in time in a way that no cross-correlation terms would contribute to the new pixel value. Calculations of higher-order correlation functions would suffer from lower-order correlations for what reason it is superior to calculate cumulants, since all lower-order correlation terms vanish. == Cumulant-calculation == === Auto-cumulants === For computational reasons it is convenient to set all time-lags in higher-order cumulants to zero so that a general expression for the n-th order auto-cumulant can be found: A C n ( r → , τ 1 … n − 1 = 0 ) = ∑ k = 1 N U n ( r → − r → k ) ε k n w k ( 0 ) {\displaystyle AC_{n}({\vec {r}},\tau _{1\ldots n-1}=0)=\sum _{k=1}^{N}U^{n}({\vec {r}}-{\vec {r}}_{k})\varepsilon _{k}^{n}w_{k}(0)} w k {\displaystyle w_{k}} is a specific correlation based weighting function influenced by the order of the cumulant and mainly depending on the fluctuation properties of the emitters. Albeit there is no fundamental limitation in calculating very high orders of cumulants and thereby shrinking the FWHM of the PSF there are practical limitations according to the weighting of the values assigned to the final image. Emitters with a higher molecular brightness will show a strong increase in terms of the pixel cumulant value assigned at higher-orders as well as this performance can be expected from a diverse appearance of fluctuations of different emitters. A wide intensity range of the resulting image can therefore be expected and as a result dim emitters can get masked by bright emitters in higher-order images:. The calculation of auto-cumulants can be realized in a very attractive way in a mathematical sense. The n-th order cumulant can be calculated with a basic recursion from moments K n ( r → ) = μ n ( r → ) − ∑ i = 1 n − 1 ( n − 1 i ) K n − i ( r → ) μ i ( r → ) {\displaystyle K_{n}({\vec {r}})=\mu _{n}({\vec {r}})-\sum _{i=1}^{n-1}{\begin{pmatrix}n-1\\i\end{pmatrix}}K_{n-i}({\vec {r}})\mu _{i}({\vec {r}})} where K is a cumulant of the index's order, likewise μ {\displaystyle \mu } represents the moments. The term within the brackets indicates a binomial coefficient. This way of computation is straightforward in comparison with calculating cumulants with standard formulas. It allows for the calculation of cumulants with only little time of computing and is, as it is well implemented, even suitable for the calculation of high-order cumulants on large images. === Cross-cumulants === In a more advanced approach cross-cumulants are calculated by taking the information of several pixels into account. Cross-cumulants can be described as follows: C C n ( r → , τ 1 … n − 1 = 0 ) = ∏ j < l n U ( r → j − r → l n ) ⋅ ∑ i = 1 N U n ( r → i − ∑ k n r → k n ) ε i n w i ( 0 ) {\displaystyle CC_{n}({\vec {r}},\tau _{1\ldots n-1}=0)=\prod _{j Read more →

  • Machine translation in China

    Machine translation in China

    Machine translation in China is the history of machine translation systems developed in China. China became the fourth country that began machine translation (MT) research following USA, UK, and the Soviet Union. In 1957, the Language Institute of Chinese Academy of Sciences took the initiative in Russian-Chinese MT research program and set up an MT research group. From then on the research activities were directed and applied for academic purposes in Universities. The turning point of MT systems launching initiatives in market began from 1990s. MT systems went into blossom into the market. Among these systems, there were commercialized MT systems. To be more specific, Transtar was the first commercialized MT system and has been constantly upgraded. What's more, IMC/EC MT system which was developed by Computer Institute of Chinese Academy of Sciences has further made great advancement. Meanwhile, the practical MT system MT-IT-EC specific to communication domain was also striking to notice, for it has greatly improved the efficiency and productivity in the issue of publications. Government funding is a critical component and support in the development of market-oriented machine translation in China. It is evident to see that since Chinese opened up to the outside world and joined the WTO, the vigorous import and export trade generate opportunities for machine translation to transfer technical terms of products into the readable target information. Facing the increasing demand of sophisticated state-of -the -art translation technology, the academic area including research institute and universities are even launching bachelors’ and master's programs regarding machine translation. Thus, strong evidence illustrates the promising field of machine translation in the future market of China.

    Read more →
  • How to Choose an AI Website Builder

    How to Choose an AI Website Builder

    Shopping for the best AI website builder? An AI website builder is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI website builder slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • The Best Free AI Marketing Tool for Beginners

    The Best Free AI Marketing Tool for Beginners

    Looking for the best AI marketing tool? An AI marketing tool is software that uses machine learning to help you get more done — it can save you hours every week by automating repetitive work. Most options offer a generous free tier, with paid plans unlocking higher limits, faster processing, and team features. Whether you are a beginner or a pro, the right AI marketing tool slots into your workflow and pays for itself fast. Read on for hands-on impressions, pricing tiers, and the standout features that matter.

    Read more →
  • Software component

    Software component

    A software component is a modular unit of software that encapsulates specific functionality. The desired characteristics of a component are reusability and maintainability. == Value == Components allow software developers to assemble software with reliable parts rather than writing code for every aspect. It makes implementation more like factory assembly than custom building. == Attributes == Desirable attributes of a component include but are not limited to: Cohesive – encapsulates related functionality Reusable Robust Substitutable – can be replaced by another component with the same interface Documented Tested == Third-party == Some components are built in-house by the same organization or team building the software system. Some are third-party, developed elsewhere and assembled into the software system. == Component-based software engineering == For large-scale systems, component-based development encourages a disciplined process to manage complexity. == Framework == Some components conform to a framework technology that allows them to be consumed in a well-known way. Examples include: CORBA, COM, Enterprise JavaBeans, and the .NET Framework. == Modeling == Component design is often modeled visually. In Unified Modeling Language (UML) 2.0 a component is shown as a rectangle, and an interface is shown as a lollipop to indicate a provided interface and as a socket to indicate consumption of an interface. == History == The idea of reusable software components was promoted by Douglas McIlroy in his presentation at the NATO Software Engineering Conference of 1968. (One goal of that conference was to resolve the so-called software crisis of the time.) In the 1970s, McIlroy put this idea into practice with the addition of the pipeline feature to the Unix operating system. Brad Cox refined the concept of a software component in the 1980s. He attempted to create an infrastructure and market for reusable third-party components by inventing the Objective-C programming language. IBM introduced System Object Model (SOM) in the early 1990s. Microsoft introduced Component Object Model (COM) in the early 1990s. Microsoft built many domain-specific component technologies on COM, including Distributed Component Object Model (DCOM), Object Linking and Embedding (OLE), and ActiveX.

    Read more →
  • Alexander Gammerman

    Alexander Gammerman

    Alexander Gammerman (born 2 November 1944) is a British computer scientist, and professor at Royal Holloway University of London. He is the co-inventor of conformal prediction. He is the founding director of the Centre for Machine Learning at Royal Holloway, University of London, and a Fellow of the Royal Statistical Society. == Career == Gammerman's academic career has been pursued in the Soviet Union and the United Kingdom. He started working as a Research Fellow in the Agrophysical Research Institute, St. Petersburg. In 1983, he emigrated to the United Kingdom and was appointed as a lecturer in the Computer Science Department at Heriot-Watt University, Edinburgh. Together with Roger Thatcher, Gammerman published several articles on Bayesian inference. In 1993, he was appointed to the established chair in Computer Science at University of London tenable at Royal Holloway and Bedford New College, where he served as the Head of Computer Science department from 1995 to 2005. In 1998, the Centre for Reliable Machine Learning was established, and Gammerman became the first director of the centre. Gammerman has written 7 books. == Honours and awards == In 1996, Gammerman received the P.W. Allen Award from the Forensic Science Society. In 2006, he became an Honorary Professor, at University College London. In 2009, he became a Distinguished Professor at Complutense University of Madrid, Spain. In 2019, he received a research grant funded by the energy company Centrica about predicting the time to the next failure of equipment. In 2020, he received the Amazon Research Award for the project titled Conformal Martingales for Change-Point Detection == Selected books == Measures of Complexity (2016), Springer, ISBN 3319357786. Algorithmic Learning in a Random World (2005), Springer, ISBN 0387001522. Causal Models and Intelligent Data Management (1999), Springer, ISBN 978-3-642-58648-4. Probabilistic Reasoning and Bayesian Belief Networks (1998), Nelson Thornes Ltd, ISBN 1872474268. Computational Learning and Probabilistic Reasoning (1996), Wiley, ISBN 0471962791.

    Read more →
  • Karl Steinbuch

    Karl Steinbuch

    Karl W. Steinbuch (June 15, 1917 in Stuttgart-Bad Cannstatt – June 4, 2005 in Ettlingen) was a German computer scientist, cyberneticist, and electrical engineer. He was an early and influential researcher in German computer science, and was the developer of the Lernmatrix, an early implementation of artificial neural networks. From the late 1960s onwards the focus of his activity shifted from scientific research to right-wing political activism supporting the Neue Rechte. == Biography == Steinbuch joined the National Socialist German Students' League (NSDStB) and the Nazi Party. Steinbuch studied at the University of Stuttgart and in 1944 he received his PhD in physics. In 1948 he joined Standard Elektrik Lorenz (SEL, part of the ITT group) in Stuttgart, as a computer design engineer and later as a director of research and development, where he filed more than 70 patents. Steinbuch completed the first European fully transistorized computer, the ER 56 marketed by SEL. In 1958 he became professor and director of the Institute of Technology for information processing (ITIV) of the University of Karlsruhe, where he retired in 1980. In 1967 he began publishing books, in which he tried to influence German education policy. Together with books from colleagues like Jean Ziegler from Switzerland, Eric J. Hobsbawm from the UK, and John Naisbitt his books predicted what he regarded as the coming education disaster of the emerging civic lobby society. In 1957, together with Helmut Gröttrup, Steinbuch coined the term Informatik, the German word for computer science, which gave informatics, and the term kybernetische Anthropologie. == Awards and recognition == Wilhelm-Boelsche award - medal in Gold German non-fiction book award Gold medal award of the XXI. International Congresses on Aerospace Medicine Konrad Adenauer award of science Jakob Fugger award medal Medal of merit of the state of Baden-Wuerttemberg member, German Academy of Sciences Leopoldina member, International Academy of Science, Munich. grants from a state government grants program, named "Karl-Steinbuch-Stipendium" Steinbuch Centre for Computing at the Karlsruhe Institute of Technology named after him == Books == Steinbuch wrote several books and articles, including: 1957 Informatik: Automatische Informationsverarbeitung ("Informatics: automatic information processing"). 1963 Learning matrices and their applications (together with U. A. W. Piske) 1965 A critical comparison of two kinds of adaptive classification networks (together with Bernard Widrow) 1966 (1969): Die informierte Gesellschaft. Geschichte und Zukunft der Nachrichtentechnik (The informed society. History and Future of telecommunications) 1989: Die desinformierte Gesellschaft (The disinformed society) 1968: Falsch programmiert. Über das Versagen unserer Gesellschaft in der Gegenwart und vor der Zukunft und was eigentlich geschehen müßte. (as a bestseller listet in: Der Spiegel) (Programmed falsely. About our society's failure in the present and with respect to the future and what should be done.) 1969: Programm 2000. (as a bestseller listet in: Der Spiegel) 1971: Automat und Mensch. Auf dem Weg zu einer kybernetischen Anthropologie (Machine and Man. On the way to a cybernetic anthropology; 4th revised edition) 1971: Mensch Technik Zukunft. Probleme von Morgen (German non-fiction book award) (Man Technology Future. Problems of Tomorrow) 1973: Kurskorrektur (Correcting the Course) 1978: Maßlos informiert. Die Enteignung des Denkens (Excessively informed. The Deprivation of Thinking) 1984: Unsere manipulierte Demokratie. Müssen wir mit der linken Lüge leben? (Our Thought-controlled Democracy. Do we have to live with the leftist lie?)

    Read more →