Multiline optical-character reader

A multiline optical-character reader, or MLOCR, is a type of mail sorting machine that uses optical character recognition (OCR) technology to determine how to route mail through the postal system. An MLOCR works by capturing an image of the front of each letter-sized mailpiece and extracting the entire address from it. The machine looks up the postal code within each address in a master database, prints a barcode representing this information on the mailpiece, and performs an initial sort. All of this occurs in a fraction of a second as the mailpiece passes through the machine. After this point, mail is further sorted by barcode sorters that read this barcode to determine its destination throughout its journey, all the way down to the walk sequence of the mail carrier.
The United States Postal Service has used remote bar coding since 1992. In the United States, if the MLOCR is not able to decode the address, the mailpiece is placed on "hold" by printing a unique fluorescent barcode on its back, and it is set aside for further processing by the Remote Bar Coding System (formerly called Remote Video Encoding). An image of the mailpiece is sent to a Remote Encoding Center, where a human data conversion operator manually inspects the image. The operator converts the information on the mailpiece into abbreviated codes and enters the data into the computer. This data is sent back to the MLOCR site, where it is matched with the unique barcode on the back of the un-coded mailpiece, and a barcode is then printed on the mailpiece as on the rest of the mail. All this effort is invested up front in deciphering the destination of each mailpiece and printing the correct barcode, so that the mailpiece never needs to be manually examined again until it reaches the hands of the letter carrier who will carry it to the final delivery point.
A Delivery Bar Code Sorter is used repeatedly at each point in the USPS system to read the barcode and sort the mailpiece to a tray corresponding to the next leg of its journey towards its final destination. The United States Postal Service is the largest user of these machines; however, large-volume mailers and mail consolidators also have their own MLOCR systems to barcode outgoing mail in order to receive significant postage discounts. An option called FASTforward can be added to an MLOCR to allow it to forward mail automatically to a new address. This additional combination of computer hardware and software looks up decoded addresses in the National Change of Address database to see whether the recipient has recently moved. If so, a POSTNET barcode representing the new address is sprayed on the mailpiece, routing it to the new address although the old address is still visible, a testament to the degree to which mail can be mechanically sorted. Generally, all OCR-equipped letter sorting machines ordered since the late 1980s have been equipped with OCR systems capable of reading multiple lines of address.
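As a small illustration of the barcode step, the sketch below computes the check digit that a POSTNET barcode appends to the ZIP digits: the digit that brings the sum of all digits up to a multiple of 10. The function name `postnet_check_digit` is hypothetical and the code is not taken from any MLOCR implementation; it assumes the input is a plain digit string with any hyphen already removed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns the POSTNET check digit (0-9) for a string
 * of ZIP digits, or -1 if the string is empty or contains a non-digit. */
int postnet_check_digit(const char *digits)
{
    assert(digits != NULL);
    if (digits[0] == '\0')
        return -1;

    int sum = 0;
    for (const char *p = digits; *p != '\0'; ++p) {
        if (!isdigit((unsigned char)*p))
            return -1;              /* reject hyphens, spaces, letters */
        sum += *p - '0';
    }
    /* The check digit makes the total digit sum a multiple of 10. */
    return (10 - sum % 10) % 10;
}
```

For the digits `12345` the digit sum is 15, so the function returns 5; a caller would append that digit before the bars are generated.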

Secure coding

Secure coding is the practice of developing computer software in a way that guards against the accidental introduction of security vulnerabilities. Defects, bugs and logic flaws are consistently the primary cause of commonly exploited software vulnerabilities. Through the analysis of thousands of reported vulnerabilities, security professionals have discovered that most vulnerabilities stem from a relatively small number of common software programming errors. By identifying the insecure coding practices that lead to these errors and educating developers on secure alternatives, organizations can take proactive steps to significantly reduce or eliminate vulnerabilities in software before deployment.
Some scholars have suggested that, in order to confront cybersecurity threats effectively, proper security should be coded or "baked in" to systems. Designing security into the software ensures that there is protection against insider attacks and reduces the threat to application security. Implementing secure coding practices is part of the secure by design approach to security engineering.
== Buffer-overflow prevention ==
Buffer overflows, a common software security vulnerability, happen when a process tries to store data beyond the end of a fixed-length buffer. For example, if there are 8 slots in which to store items, an attempt to store 9 items causes a problem. In computer memory, the overflowed data may overwrite data in the next location, which can result in a security vulnerability (stack smashing) or program termination (segmentation fault).
A C program is prone to a buffer overflow when, for example, it copies user input into a fixed-size destination buffer without checking the length of the input. If the user input is larger than the destination buffer, a buffer overflow will occur. To fix such an unsafe program, use strncpy to limit the number of bytes copied and so prevent a possible buffer overflow. Another secure alternative is to dynamically allocate memory on the heap using malloc.
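The two alternatives just mentioned can be sketched as follows. This is a minimal illustration, not the article's original listing: the helper names `bounded_copy` and `copy_string` are hypothetical, and the variable names `src` and `dst` are chosen to match the surrounding discussion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Bounded copy into a fixed-size buffer. strncpy does not add a terminator
 * when src is too long, so one is written explicitly. */
void bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}

/* Heap-allocated copy sized to fit src. Returns NULL if malloc fails;
 * the caller is responsible for calling free() on the result. */
char *copy_string(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);
    char *dst = malloc(len + 1);      /* +1 for the terminating '\0' */
    if (dst == NULL)
        return NULL;                  /* allocation failed: nothing copied */
    memcpy(dst, src, len + 1);
    return dst;
}
```

Calling `bounded_copy` with an 8-byte buffer and a longer input truncates the input to 7 characters instead of writing past the end of the buffer.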
A program taking this approach copies the contents of src into dst and checks the return value of malloc() to ensure that memory was actually allocated for the destination buffer.
== Format-string attack prevention ==
A format string attack occurs when a malicious user supplies specific inputs that are eventually passed as an argument to a function that performs formatting, such as printf(). The attack involves the adversary reading from or writing to the stack. The C printf function writes output to stdout. If the format parameter of the printf function is not properly controlled, several security bugs can be introduced. A program is vulnerable to a format string attack when, for example, it passes a command-line argument directly to printf() as the format string. A malicious argument passed to such a program could be "%s%s%s%s%s%s%s", which can crash the program through improper memory reads.
== Integer-overflow prevention ==
Integer overflow occurs when an arithmetic operation results in an integer too large to be represented within the available space. A program that does not properly check for integer overflow introduces potential software bugs and exploits. Consider a C++ function that attempts to confirm that the sum of x and y is less than or equal to a defined value MAX by comparing x + y with MAX directly. The problem with such code is that it does not check for integer overflow on the addition operation. If the sum of x and y is greater than the maximum possible value of an unsigned int, the addition will overflow and may produce a value less than or equal to MAX, even though the true sum of x and y is greater than MAX. A corrected function checks for overflow by confirming that the sum is greater than or equal to both x and y: if the sum did overflow, it would be less than x or less than y.
== Path traversal prevention ==
Path traversal is a vulnerability whereby paths provided from an untrusted source are interpreted in such a way that unauthorised file access is possible.
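Returning briefly to the integer-overflow discussion above, the unchecked and checked comparisons can be sketched as follows. This is an illustrative sketch written in C rather than C++; the function names and the value chosen for MAX are assumptions, not part of the original text.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX 1000u   /* hypothetical application-defined limit */

/* Unchecked: x + y may wrap around and appear to be within MAX. */
bool sum_within_max_unchecked(unsigned int x, unsigned int y)
{
    return x + y <= MAX;
}

/* Checked: unsigned addition wraps modulo 2^N, so a wrapped sum is
 * smaller than at least one operand. */
bool sum_within_max(unsigned int x, unsigned int y)
{
    unsigned int sum = x + y;
    if (sum < x || sum < y)
        return false;               /* overflow occurred */
    return sum <= MAX;
}

/* Usage sketch: UINT_MAX + 2 wraps to 1, which fools the unchecked test. */
void demo_overflow_check(void)
{
    assert(sum_within_max_unchecked(UINT_MAX, 2u));
    assert(!sum_within_max(UINT_MAX, 2u));
    assert(sum_within_max(400u, 600u));
}
```

The check relies on unsigned arithmetic, where wraparound is well defined; signed overflow in C is undefined behaviour and would need a different test performed before the addition.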
As an example of path traversal, consider a script that takes a filename as a parameter, then reads and parses that file to serve an article. Such a script might use the following hypothetical URL to retrieve an article about dog food: https://www.example.net/cgi-bin/article.sh?name=dogfood.html
If the script performs no input checking, instead trusting that the filename is always valid, a malicious user could forge a URL to retrieve configuration files from the web server: https://www.example.net/cgi-bin/article.sh?name=../../../../../etc/passwd
Depending on the script, this may expose the /etc/passwd file, which on Unix-like systems contains (among other things) user IDs, their login names, home directory paths and shells. (See SQL injection for a similar attack.)
== Regulatory drivers ==
Secure coding practices are increasingly mandated by regulatory frameworks governing the development and maintenance of software systems that process sensitive data. The Health Insurance Portability and Accountability Act (HIPAA) Security Rule requires covered entities to protect the integrity of protected health information through technical safeguards under 45 CFR 164.312(c)(1) and to implement mechanisms to authenticate electronic protected health information under 45 CFR 164.312(c)(2). The Payment Card Industry Data Security Standard (PCI DSS) version 4.0 Requirement 6.2 mandates that custom software is developed securely, including training developers in secure coding techniques (6.2.2), reviewing custom code for vulnerabilities before release (6.2.3), and addressing common software attacks in development practices (6.2.4).
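For the path-traversal example earlier in this article, one common mitigation is to validate the requested name against an allow-list before opening any file. The sketch below is an assumption about how such a check might look, not a description of any particular script: the function name `is_safe_article_name` and the exact set of permitted characters are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical validator: accepts only simple file names such as
 * "dogfood.html". Path separators never pass the allow-list, and names
 * that start with '.' or contain ".." are rejected outright. */
bool is_safe_article_name(const char *name)
{
    assert(name != NULL);
    if (name[0] == '\0' || name[0] == '.')
        return false;
    if (strstr(name, "..") != NULL)
        return false;
    for (const char *p = name; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;
        if (!isalnum(c) && c != '-' && c != '_' && c != '.')
            return false;           /* '/', '\\' and others are refused */
    }
    return true;
}
```

An allow-list is used here instead of stripping "../" sequences because removing substrings from untrusted input can leave behind a new traversal sequence.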

Predictions of the end of Wikipedia

Various observers have predicted the end of Wikipedia since it rose to prominence, citing potential pitfalls such as a lack of quality control, artificial intelligence, or inconsistencies among contributors. Alternative online encyclopedias have been proposed as replacements for Wikipedia, including WolframAlpha as well as the now-defunct Knol (from Google) and Owl (from AOL). A 2013 review raised alarms regarding Wikipedia's shortcomings on hoaxes, vandalism, imbalance of material, and inadequate quality control of articles. Earlier critiques lamented the vulgar content and the absence of sufficient references in articles. Others suggest that the unwarranted deletion of useful articles from Wikipedia may portend its end, a concern that inspired the creation of the now inactive Deletionpedia. Contrary to such predictions, Wikipedia has constantly grown in both size and influence. Recent developments with artificial intelligence in Wikimedia projects have prompted new predictions that AI applications, which consume free and open content, will replace Wikipedia.
== Personnel ==
Wikipedia is crowdsourced by a few million volunteer editors. Of the millions of registered editors, only tens of thousands contribute the majority of its contents, and a few thousand do quality control and maintenance work. As the encyclopedia expanded in the 2010s, the number of active editors did not grow proportionately. Various sources predicted that Wikipedia would eventually have too few editors to be functional and would collapse from lack of participation. English Wikipedia has 818 volunteer administrators who perform various functions, including functions similar to those carried out by a forum moderator. Critics have described their actions as harsh, bureaucratic, biased, unfair, or capricious and predicted that the resulting outrage would lead to the site's closure. Various 2012 articles reported that a decline in English Wikipedia's recruitment of new administrators could end Wikipedia.
=== Decline in editors (2014–2015) ===
A 2014 trend analysis published in The Economist stated that "The number of editors for the English-language version has fallen by a third in seven years." The Economist described the attrition rate for active editors in English Wikipedia as substantially higher than in other (non-English) Wikipedias. It reported that in other languages, the number of "active editors" (those with at least five edits per month) had been relatively constant since 2008: some 42,000 editors, with narrow seasonal variances of about 2,000 editors up or down. In the English Wikipedia, the number of active editors peaked in 2007 at about 50,000 and fell to 30,000 in 2014. Because the analysis presented the number of active editors for non-English Wikipedias as holding steady at approximately 42,000, the contrast pointed to the effectiveness of Wikipedia in those languages in retaining active editors on a renewable and sustained basis. Though different language versions of Wikipedia have different policies, no commentary identified a particular policy difference as a likely cause of the higher rate of editor attrition in English Wikipedia. The editor count showed a slight uptick a year later, and no clear trend after that.
In a 2013 article, Tom Simonite of MIT Technology Review said that the number of Wikipedia editors had been falling for several years running, and cited the bureaucratic structure and rules as a factor. Simonite alleged that some Wikipedians use the labyrinthine rules and guidelines to dominate others and have a vested interest in keeping the status quo. A January 2016 article in Time by Chris Wilson said Wikipedia might lose many editors because a collaboration of occasional editors and smart software would take the lead.
Andrew Lih and Andrew Brown both maintain that editing Wikipedia with smartphones is difficult and discourages potential new contributors. Lih alleges there is serious disagreement among existing contributors on how to resolve this. In 2015, Lih feared for Wikipedia's long-term future, while Brown feared that problems with Wikipedia would remain and that rival encyclopedias would not replace it.
== Viewers and fundraisers ==
As of 2015, with more viewing by smartphones, there had been a marked decline in the number of people who viewed Wikipedia from their computers, and according to The Washington Post "[people are] far less likely to donate". At the time, the Wikimedia Foundation reported reserves equivalent to one year's budgeted expenditures. On the other hand, the number of paid staff had ballooned, so those expenses increased.
In 2021, Andreas Kolbe, a former co-editor-in-chief of The Signpost, wrote that the Wikimedia Foundation was reaching its 10-year goal of a US$100 million endowment five years earlier than planned, which may surprise donors and users around the world who regularly see Wikipedia fundraising banners. He also said that accounting methods disguise the size of operating surpluses, that top managers earn $300,000–400,000 a year, and that over 40 people work exclusively on fundraising.
== Artificial intelligence ==
Wikipedia faces a decline in human visitors, raising concerns about its long-term sustainability and community participation. The Wikimedia Foundation (WMF), when reporting this decline, attributed it in part to the lack of clicks from users of large language models and search engines that use content from Wikipedia. Data published in August 2025 showed that after the launch of ChatGPT and the rise of other AI-powered search summaries, some types of articles on Wikipedia, especially those that closely resemble the kind of content ChatGPT produces, experienced a noticeable drop in readership.
Overall human pageviews reportedly fell by about 8% between 2024 and 2025, suggesting that AI overviews and chatbots are increasingly being used in place of direct visits to Wikipedia. According to industry web analytics data, ChatGPT's estimated monthly web traffic has exceeded that of Wikipedia since May 2025, as visits to ChatGPT continued to grow while Wikipedia's total site traffic declined.
== Timeline of predictions ==
On the eve of the 20th anniversary of Wikipedia, Joseph Reagle, an associate professor in the Department of Communication Studies at Northeastern University, conducted a retrospective study of numerous "predictions of the ends of Wikipedia" over two decades, divided into chronological waves: "Early growth (2001–2002)", "Nascent identity (2001–2005)", "Production model (2005–2010)", "Contributor attrition (2009–2017)" and the current period "(2020–)". Each wave brought its own distinctive predictions of a fatal outcome, none of which came true; as a result, Reagle concluded Wikipedia was not in danger.
Concern grew in 2023 that the ubiquity and proliferation of artificial intelligence (AI) may adversely affect Wikipedia. Rapid improvements and widespread application of AI may render Wikipedia obsolete or reduce its importance. A 2023 study found that AI, when applied to Wikipedia, works most efficiently for error correction, while Wikipedia still needs to be written by humans.

Logico-linguistic modeling

Logico-linguistic modeling is a method for building knowledge-based systems with a learning capability using conceptual models from soft systems methodology, modal predicate logic, and logic programming languages such as Prolog.
== Overview ==
Logico-linguistic modeling is a six-stage method developed primarily for building knowledge-based systems (KBS), but it also has application in manual decision support systems and information source analysis. Logico-linguistic models have a superficial similarity to John F. Sowa's conceptual graphs; both use bubble-style diagrams, both are concerned with concepts, both can be expressed in logic and both can be used in artificial intelligence. However, logico-linguistic models are very different in both logical form and method of construction.
Logico-linguistic modeling was developed in order to solve theoretical problems found in the soft systems method for information system design. The main thrust of the research has been to show how soft systems methodology (SSM), a method of systems analysis, can be extended into artificial intelligence.
== Background ==
SSM employs three modeling devices: rich pictures, root definitions, and conceptual models of human activity systems. The root definitions and conceptual models are built by the stakeholders themselves in an iterative debate organized by a facilitator. The strengths of this method lie, firstly, in its flexibility, the fact that it can address any problem situation, and, secondly, in the fact that the solution belongs to the people in the organization and is not imposed by an outside analyst.
Information requirements analysis (IRA) took the basic SSM method a stage further and showed how the conceptual models could be developed into a detailed information system design.
IRA calls for the addition of two modeling devices: "Information Categories", which show the required information inputs and outputs from the activities identified in an expanded conceptual model; and the "Maltese Cross", a matrix which shows the inputs and outputs from the information categories and shows where new information processing procedures are required. A completed Maltese Cross is sufficient for the detailed design of a transaction processing system.
The initial impetus for the development of logico-linguistic modeling was a concern with the theoretical problem of how an information system can have a connection to the physical world. This is a problem in both IRA and more established methods (such as SSADM) because none of them base their information system design on models of the physical world. IRA designs are based on a notional conceptual model, and SSADM is based on models of the movement of documents.
The solution to these problems provided a formula that was not limited to the design of transaction processing systems but could be used for the design of KBS with learning capability.
== The six stages of logico-linguistic modeling ==
The logico-linguistic modeling method comprises six stages.
=== 1. Systems analysis ===
In the first stage, logico-linguistic modeling uses SSM for systems analysis. This stage seeks to structure the problem in the client organization by identifying stakeholders, modelling organizational objectives and discussing possible solutions. At this stage it is not assumed that a KBS will be a solution, and logico-linguistic modeling often produces solutions that do not require a computerized KBS.
Expert systems tend to capture the expertise of individuals in different organizations on the same topic. By contrast, a KBS produced by logico-linguistic modeling seeks to capture the expertise of individuals in the same organization on different topics.
The emphasis is on the elicitation of organizational or group knowledge rather than that of individual experts. In logico-linguistic modeling the stakeholders become the experts. The end point of this stage is an SSM-style conceptual model such as figure 1.
=== 2. Language creation ===
According to the theory behind logico-linguistic modeling, the SSM conceptual model building process is a Wittgensteinian language-game in which the stakeholders build a language to describe the problem situation. The logico-linguistic model expresses this language as a set of definitions; see figure 2.
=== 3. Knowledge elicitation ===
After the model of the language has been built, putative knowledge about the real world can be added by the stakeholders. Traditional SSM conceptual models contain only one logical connective (a necessary condition). In order to represent causal sequences, "sufficient conditions" and "necessary and sufficient conditions" are also required. In logico-linguistic modeling this deficiency is remedied by two additional types of connective. The outcome of stage three is an empirical model; see figure 3.
=== 4. Knowledge representation ===
Modal predicate logic (a combination of modal logic and predicate logic) is used as the formal method of knowledge representation. The connectives from the language model are logically true (indicated by the "L" modal operator), and the connectives added at the knowledge elicitation stage are possibly true (indicated by the "M" modal operator). Before proceeding to stage 5, the models are expressed in logical formulae.
=== 5. Computer code ===
Formulae in predicate logic translate easily into the Prolog artificial intelligence language. The modality is expressed by two different types of Prolog rules. Rules taken from the language creation stage of the model building process are treated as incorrigible, while rules from the knowledge elicitation stage are marked as hypothetical rules.
The system is not confined to decision support but has a built-in learning capability.
=== 6. Verification ===
A knowledge-based system built using this method verifies itself. Verification takes place when the KBS is used by the clients. It is an ongoing process that continues throughout the life of the system. If the stakeholders' beliefs about the real world are mistaken, this will be brought out by the addition of Prolog facts that conflict with the hypothetical rules. The system operates in accordance with the classic principle of falsifiability found in the philosophy of science.
== Applications ==
=== Knowledge-based computer systems ===
Logico-linguistic modeling has been used to produce fully operational computerized knowledge-based systems, such as one for the management of diabetes patients in a hospital out-patients department.
=== Manual decision support ===
In other projects, moving into Prolog was considered unnecessary because the printed logico-linguistic models provided an easy-to-use guide to decision making. One example is a system for mortgage loan approval.
=== Information source analysis ===
In some cases a KBS could not be built because the organization did not have all the knowledge needed to support all of its activities. In these cases logico-linguistic modeling showed shortcomings in the supply of information and where more was needed. One example is a planning department in a telecoms company.
== Criticism ==
While logico-linguistic modeling overcomes the problems found in SSM's transition from conceptual model to computer code, it does so at the expense of greater complexity in the models that stakeholders must construct. The benefits of this complexity are questionable, and this modeling method may be much harder to use than other methods. This contention has been exemplified by subsequent research.
An attempt by researchers to model buying decisions across twelve companies using logico-linguistic modeling required simplification of the models and removal of the modal elements.
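The distinction drawn in stages 5 and 6 between incorrigible and hypothetical rules can be illustrated with a small sketch. The method itself uses Prolog; the C code below is only an assumed, simplified rendering in which each rule is a sufficient condition ("if A then B") over named propositions, and all type and function names are hypothetical. A hypothetical rule is marked as falsified when the recorded facts show its antecedent true and its consequent false, while incorrigible rules are never revised.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { RULE_INCORRIGIBLE, RULE_HYPOTHETICAL } RuleKind;

typedef struct {
    const char *if_atom;    /* antecedent proposition */
    const char *then_atom;  /* consequent proposition */
    RuleKind kind;          /* language rule or elicited knowledge */
    bool falsified;         /* set once facts contradict the rule */
} Rule;

typedef struct {
    const char *atom;
    bool value;
} Fact;

/* Finds a recorded fact by name; returns false if nothing is recorded. */
static bool lookup(const Fact *facts, size_t n, const char *atom, bool *value)
{
    for (size_t i = 0; i < n; ++i) {
        if (strcmp(facts[i].atom, atom) == 0) {
            *value = facts[i].value;
            return true;
        }
    }
    return false;
}

/* Marks hypothetical rules contradicted by the facts and returns how many
 * were newly falsified. Incorrigible rules are skipped. */
size_t verify_rules(Rule *rules, size_t n_rules,
                    const Fact *facts, size_t n_facts)
{
    assert(rules != NULL && facts != NULL);
    size_t newly_falsified = 0;
    for (size_t i = 0; i < n_rules; ++i) {
        bool if_val, then_val;
        if (rules[i].kind != RULE_HYPOTHETICAL || rules[i].falsified)
            continue;
        if (lookup(facts, n_facts, rules[i].if_atom, &if_val) &&
            lookup(facts, n_facts, rules[i].then_atom, &then_val) &&
            if_val && !then_val) {
            rules[i].falsified = true;
            ++newly_falsified;
        }
    }
    return newly_falsified;
}
```

Running `verify_rules` each time facts are added mirrors the idea that verification continues throughout the life of the system.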

Interactive activation and competition networks

Interactive activation and competition (IAC) networks are artificial neural networks used to model memory and intuitive generalizations. They are made up of nodes, or artificial neurons, which are arrayed and activated in ways that emulate the behaviors of human memory. The IAC model is used by the parallel distributed processing (PDP) Group and is associated with James L. McClelland and David E. Rumelhart; it is described in detail in their book Explorations in Parallel Distributed Processing: A Handbook of Models, Programs, and Exercises. This model does not contradict any currently known biological data or theories, and its performance is close enough to human performance to warrant further investigation.
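To give a sense of how nodes in such a network are activated, the sketch below performs one synchronous update step of the kind commonly given for IAC networks: a unit's net input is its external input plus the weighted positive activations of the other units, a positive net input drives the activation towards a maximum, a negative one drives it towards a minimum, and a decay term pulls it back to a resting level. This is a simplified sketch under stated assumptions: the parameter values, the omission of separate scaling factors for excitation and inhibition, and the name `iac_step` are all illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative parameter values (assumptions for this sketch). */
#define ACT_MAX    1.0
#define ACT_MIN   (-0.2)
#define ACT_REST  (-0.1)
#define DECAY      0.1
#define MAX_UNITS  64

/* One synchronous update of n unit activations.
 * w[i * n + j] is the weight from unit j to unit i (positive means
 * excitatory, negative means inhibitory); ext[i] is external input. */
void iac_step(double *a, const double *w, const double *ext, size_t n)
{
    assert(a != NULL && w != NULL && ext != NULL && n <= MAX_UNITS);
    double net[MAX_UNITS];

    /* Compute every net input first so all units see the old state. */
    for (size_t i = 0; i < n; ++i) {
        net[i] = ext[i];
        for (size_t j = 0; j < n; ++j) {
            double out = a[j] > 0.0 ? a[j] : 0.0;  /* only positive activity is sent */
            net[i] += w[i * n + j] * out;
        }
    }
    for (size_t i = 0; i < n; ++i) {
        double delta = (net[i] > 0.0)
            ? (ACT_MAX - a[i]) * net[i]    /* excitation pushes towards max */
            : (a[i] - ACT_MIN) * net[i];   /* inhibition pushes towards min */
        delta -= DECAY * (a[i] - ACT_REST);
        a[i] += delta;
        if (a[i] > ACT_MAX) a[i] = ACT_MAX;
        if (a[i] < ACT_MIN) a[i] = ACT_MIN;
    }
}
```

With two mutually inhibiting units and external input to only one of them, repeated calls raise the driven unit's activation and push its competitor below the resting level, which is the competition the name refers to.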

BulSemCor

The Bulgarian Sense-annotated Corpus (BulSemCor) (Bulgarian: Български семантично анотиран корпус (БулСемКор)) is a structured corpus of Bulgarian texts in which each lexical item is assigned a sense tag. BulSemCor was created by the Department of Computational Linguistics at the Institute for Bulgarian Language of the Bulgarian Academy of Sciences.
== Structure ==
BulSemCor was created as part of a nationally funded project titled "BulNet – A lexico-semantic network for the Bulgarian Language" (2005–2010). It follows the general methodology of SemCor combined with some specific principles. The corpus for annotation consists of 101,791 tokens covering an excerpt from the Bulgarian "Brown" Corpus, which is modelled on the Brown Corpus. An important feature of BulSemCor is that the samples are selected using heuristics that provide optimal coverage of ambiguous lexis.
BulSemCor is manually sense-annotated according to the Bulgarian WordNet. Its size is comparable to that of other contemporary semantically annotated corpora. The semantic annotation consists of associating each lexical item in the corpus with exactly one synonym set (synset) in the Bulgarian WordNet, namely the one that best describes its sense in the particular context. The selection of the best match among the suggested candidates is based on a set of criteria, such as the other synset members, the synset gloss (explanatory definition) and the position of a given candidate in the WordNet structure.
== Scale ==
The number of annotated tokens is 99,480 (the difference from the number of tokens in the initial corpus is due to the fact that some tokens are not linguistic items). There are 86,842 simple words and 5,797 multiword expressions (MWE), the latter comprising 12,638 tokens.
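As a quick consistency check on these figures, assuming each simple word counts as one token, the simple words and the tokens inside multiword expressions add up to the annotated total, and the remainder of the initial corpus corresponds to the tokens that are not linguistic items. The average expression length in the last line is derived here and is not stated in the source.

```latex
\begin{align*}
86{,}842 + 12{,}638 &= 99{,}480 && \text{(simple words + MWE tokens = annotated tokens)} \\
101{,}791 - 99{,}480 &= 2{,}311 && \text{(tokens that are not linguistic items)} \\
12{,}638 \div 5{,}797 &\approx 2.18 && \text{(average tokens per multiword expression)}
\end{align*}
```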
== Specific features ==
All words in BulSemCor are assigned a sense, whereas according to established practice only simple content words or content word classes (typically nouns and verbs) are annotated. Since 2000, the development of language resources has broadened to include annotation of function words and multiword expressions covering particular senses or types of words and expressions. In this respect, BulSemCor's annotation is more exhaustive and hence provides greater opportunities for linguistic observations and natural language processing (NLP) applications.
Annotated items inherit the linguistic information associated with the corresponding synset, which along with morphological and semantic tags may include annotation on one or more of the following additional levels: partial information about the syntactic structure of MWE types, particularly information about syntactic heads and their dependents; information about the category of named entities (names, locations, organisations, dates, numbers, etc.); information about the taxonomic category of adverbs, such as time, place, manner, degree and quantity; information about the type of syntactic relationship (coordination or subordination) expressed by conjunctions; information about the original part of speech of substantivised words (non-nouns that act as nouns in a particular context); and stylistic/register, grammatical and other information about synsets or individual synset members.

Civitai

Civitai is an online platform and marketplace for generative artificial intelligence (Gen AI) content, primarily focused on AI-generated images, AI models, and AI-generated videos.
== History ==
Civitai was founded in 2022 by Justin Maier. By January 2023, the site had reached 100,000 registered users, and by November of that year 3 million. In November 2023, Civitai secured funding from venture capital firm Andreessen Horowitz. By April 2024, Civitai had 23.2 million monthly accesses. The company is headquartered in Boise, Idaho.
== Platform ==
Civitai allows users to share and download AI models, particularly those used for image generation. The platform supports various AI models, including Stable Diffusion and Flux, and provides a space for users to showcase and monetize their AI-generated content. Users have profile pages and can comment on other users' models and images. The website also features a virtual currency called Buzz that can be used to generate images on Civitai's servers. Buzz can be bought or earned by engaging with the site. The platform is open source.
== Controversies ==
In 2023, 404 Media reported that Civitai had begun a "Bounties" marketplace where users could commission deepfakes of real or fake people. Users are rewarded with Buzz for completing Bounties. In December 2023, AI provider OctoML announced it had ended its business relationship with Civitai after concerns were raised that users were generating images that “could be categorized as child pornography.”