Sorenson Squeeze was a software video encoding tool used to compress and convert video and audio files on Mac OS X or Windows operating systems. It was sold as a standalone tool and has also long been bundled with Avid Media Composer. == History == Sorenson Squeeze was first announced on July 17, 2001, as the first variable bit rate (VBR) compression application for Mac OS X, and was released on October 29 of that same year. By March 2002, Sorenson Squeeze became available for Windows OS. Sorenson Squeeze was originally released as a tool for encoding videos for the Web and QuickTime playback but began adding new codecs as more versions were released. The software was discontinued by Sorenson in January 2019, and correspondingly was no longer offered as part of Avid Media Composer. == Features == Squeeze included a number of features to improve video & audio quality. Features included: GPU accelerated H.264 encoding, adaptive bitrate encoding, HD encoding and Dolby certified AC3 Audio. Intelligent encoding presets available in Squeeze included: x265 (H.265) MainConcept H.264 and MainConcept H.264 CUDA. Adaptive bitrate encoding allows for optimal bitrate and error resilience based on network conditions, resulting in a dynamic adjustment of the video bitstream being delivered. It encoded to multiple formats including QuickTime, Windows Media, Flash Video, Silverlight, WebM & WMV. It uses multiple codecs, including the Sorenson codecs SV3 Pro and Spark, H.265, H.264, H.263, VP6, VC1, MPEG2, and many others. Squeeze operates on the Apple Macintosh and Microsoft Windows operating systems. Squeeze offers native plugins to Avid, Apple Final Cut Pro and Adobe Premiere (CS4, CS5) NLEs. Each copy of Squeeze included the Dolby Certified AC3 Consumer encoder. Squeeze also included a simplified review and approval process, which allows the user to automatically send secure, password protected videos for immediate review. Instant feedback is received via Web or mobile. == Versions == Sorenson Squeeze was released on October 29, 2001. Sorenson Squeeze for Macromedia Flash MX was released on March 14, 2002. Sorenson Squeeze 3 for MPEG-4 was released in January 2003. Sorenson Squeeze 3 Compression Suite was released in January 2003. Sorenson Squeeze 5 was released on March 31, 2008. Sorenson Squeeze was updated to version 5.1 on May 11, 2009. Sorenson Squeeze 6 was released on November 3, 2009. Sorenson Squeeze 7 was released January 25, 2011. Sorenson Squeeze 11 was released August 27, 2016. == Awards == Streaming Media magazine Readers’ Choice Award for Encoding Software for 2007, 2008, 2009 and 2010. 2008 Vanguard Award from Digital Content Producer magazine == Squeeze 7 system requirements == Windows Pentium IV-based computer or greater Windows XP, Vista or 7 32- and 64-bit compatible (including AVID 64-bit update); Faster performance on 64-bit systems 512 MB RAM 120 MB available hard drive space QuickTime 7.2 or later DirectX 9.0b or later Macintosh Intel-based processor Mac OS 10.4 or later 32- and 64-bit compatible; Faster performance on 64-bit systems 512 MB RAM 120 MB available hard drive space QuickTime 7.2 or later
Freemake Video Converter
Freemake Video Converter is a freemium video editing app developed by Ellora Assets Corporation. Designed primarily for entry-level users, the software offers a range of functionalities including video format conversion, DVD ripping, and the creation of photo slideshows and music visualizations. Additionally, Freemake Video Converter is capable of burning video streams that are compatible with various media, such as DVDs and Blu-ray Discs. It also features direct video uploading capabilities to platforms like YouTube., enhancing its utility for content creators. The application's user-friendly interface and broad compatibility make it accessible for individuals with minimal video editing experience. == Features == Freemake Video Converter can perform simple non-linear video editing tasks, such as cutting, rotating, flipping, and combining multiple videos into one file with transition effects. It can also create photo slideshows with background music. Users are then able to upload these videos to YouTube. Freemake Video Converter can read the majority of video, audio, and image formats, and outputs them to AVI, MP4, WMV, Matroska, FLV, SWF, 3GP, DVD, Blu-ray, MPEG and MP3. The program also prepares videos supported by various multimedia devices, including Apple devices (iPod, iPhone, iPad), Xbox, Sony PlayStation, Samsung, Nokia, BlackBerry, and Android mobile devices. The software is able to perform DVD burning and is able to convert videos, photographs, and music into DVD video. The user interface is based on Windows Presentation Foundation technology. Freemake Video Converter supports NVIDIA CUDA technology for H.264 video encoding (starting with version 1.2.0). == Important updates == Freemake Video Converter 2.0 was a major update that integrated two new functions: ripping video from online portals and Blu-ray disc creation and burning. Version 2.1 implemented suggestions from users, including support for subtitles, ISO image creation, and DVD to DVD/Blu-ray conversion. With version 2.3 (earlier 2.2 Beta), support for DXVA has been added to accelerate conversion (up to 50% for HD content). Version 3.0 added HTML5 video creation support and new presets for smartphones. Version 4.0 (introduced in April 2013) added a freemium "Gold Pack" of extra features that can be added if a "donation" is paid. Starting with version 4.0.4, released on 27 August 2013, the program adds a promotional watermark at the end of every video longer than 5 minutes unless Gold Pack is activated. Version 4.1.9, released on 25 November 2015 added support for drag-and-drop functions that were not available in prior versions. Since at least version 4.1.9.44 (1 May 2017), the Freemake Welcome Screen is added at the beginning of the video, and the big Freemake logo is watermarked in the center of the whole video. This decreases the quality of free outputs, and users are forced to pay money to remove the watermark or stop using it. Version 4.1.9.31 (11 August 2016) does not have this restriction. == Licensing issues == FFmpeg has added Freemake Video Converter v1.3 to its Hall of Shame. An issue tracker entry for this product, opened on 16 December 2010, says it is in violation of the GNU General Public License as it is distributing components of the FFmpeg project without including due credit. Ellora Assets Corporation has not responded yet. == Bundled software from sponsors == Since version 4.0, Freemake Video Converter's installer includes a potentially unwanted search toolbar from Conduit as well as SweetPacks malware. Although users can decline the software during installation, the opt-out option is rendered in gray, which could mistakenly give the impression that it's disabled.
Information extraction
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents and other electronically represented sources. Typically, this involves processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing like automatic annotation and content extraction out of images/audio/video/documents could be seen as information extraction. Recent advances in NLP techniques have allowed for significantly improved performance compared to previous years. An example is the extraction from newswire reports of corporate mergers, such as denoted by the formal relation: MergerBetween ( c o m p a n y 1 , c o m p a n y 2 , d a t e ) {\displaystyle \operatorname {MergerBetween} (\mathrm {company} _{1},\mathrm {company} _{2},\mathrm {date} )} , from an online news sentence such as: "Yesterday, New York based Foo Inc. announced their acquisition of Bar Corp." A broad goal of IE is to allow computation to be done on the previously unstructured data. A more specific goal is to allow automated reasoning about the logical form of the input data. Structured data is semantically well-defined data from a chosen target domain, interpreted with respect to category and context. Information extraction is the part of a greater puzzle which deals with the problem of devising automatic methods for text management, beyond its transmission, storage and display. The discipline of information retrieval (IR) has developed automatic methods, typically of a statistical flavor, for indexing large document collections and classifying documents. Another complementary approach is that of natural language processing (NLP) which has solved the problem of modelling human language processing with considerable success when taking into account the magnitude of the task. In terms of both difficulty and emphasis, IE deals with tasks in between both IR and NLP. In terms of input, IE assumes the existence of a set of documents in which each document follows a template, i.e. describes one or more entities or events in a manner that is similar to those in other documents but differing in the details. An example, consider a group of newswire articles on Latin American terrorism with each article presumed to be based upon one or more terroristic acts. We also define for any given IE task a template, which is a(or a set of) case frame(s) to hold the information contained in a single document. For the terrorism example, a template would have slots corresponding to the perpetrator, victim, and weapon of the terroristic act, and the date on which the event happened. An IE system for this problem is required to "understand" an attack article only enough to find data corresponding to the slots in this template. == History == Information extraction dates back to the late 1970s in the early days of NLP. An early commercial system from the mid-1980s was JASPER built for Reuters by the Carnegie Group Inc with the aim of providing real-time financial news to financial traders. Beginning in 1987, IE was spurred by a series of Message Understanding Conferences. MUC is a competition-based conference that focused on the following domains: MUC-1 (1987), MUC-3 (1989): Naval operations messages. MUC-3 (1991), MUC-4 (1992): Terrorism in Latin American countries. MUC-5 (1993): Joint ventures and microelectronics domain. MUC-6 (1995): News articles on management changes. MUC-7 (1998): Satellite launch reports. Considerable support came from the U.S. Defense Advanced Research Projects Agency (DARPA), who wished to automate mundane tasks performed by government analysts, such as scanning newspapers for possible links to terrorism. == Present significance == The present significance of IE pertains to the growing amount of information available in unstructured form. Tim Berners-Lee, inventor of the World Wide Web, refers to the existing Internet as the web of documents and advocates that more of the content be made available as a web of data. Until this transpires, the web largely consists of unstructured documents lacking semantic metadata. Knowledge contained within these documents can be made more accessible for machine processing by means of transformation into relational form, or by marking-up with XML tags. An intelligent agent monitoring a news data feed requires IE to transform unstructured data into something that can be reasoned with. A typical application of IE is to scan a set of documents written in a natural language and populate a database with the information extracted. == Tasks and subtasks == Applying information extraction to text is linked to the problem of text simplification in order to create a structured view of the information present in free text. The overall goal being to create a more easily machine-readable text to process the sentences. Typical IE tasks and subtasks include: Template filling: Extracting a fixed set of fields from a document, e.g. extract perpetrators, victims, time, etc. from a newspaper article about a terrorist attack. Event extraction: Given an input document, output zero or more event templates. For instance, a newspaper article might describe multiple terrorist attacks. Knowledge Base Population: Fill a database of facts given a set of documents. Typically the database is in the form of triplets, (entity 1, relation, entity 2), e.g. (Barack Obama, Spouse, Michelle Obama) Named entity recognition: recognition of known entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions, by employing existing knowledge of the domain or information extracted from other sentences. Typically the recognition task involves assigning a unique identifier to the extracted entity. A simpler task is named entity detection, which aims at detecting entities without having any existing knowledge about the entity instances. For example, in processing the sentence "M. Smith likes fishing", named entity detection would denote detecting that the phrase "M. Smith" does refer to a person, but without necessarily having (or using) any knowledge about a certain M. Smith who is (or, "might be") the specific person whom that sentence is talking about. Coreference resolution: detection of coreference and anaphoric links between text entities. In IE tasks, this is typically restricted to finding links between previously extracted named entities. For example, "International Business Machines" and "IBM" refer to the same real-world entity. If we take the two sentences "M. Smith likes fishing. But he doesn't like biking", it would be beneficial to detect that "he" is referring to the previously detected person "M. Smith". Relationship extraction: identification of relations between entities, such as: PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.") PERSON located in LOCATION (extracted from the sentence "Bill is in France.") Semi-structured information extraction which may refer to any IE that tries to restore some kind of information structure that has been lost through publication, such as: Table extraction: finding and extracting tables from documents. Table information extraction : extracting information in structured manner from the tables. This task is more complex than table extraction, as table extraction is only the first step, while understanding the roles of the cells, rows, columns, linking the information inside the table and understanding the information presented in the table are additional tasks necessary for table information extraction. Comments extraction : extracting comments from the actual content of articles in order to restore the link between authors of each of the sentences Language and vocabulary analysis Terminology extraction: finding the relevant terms for a given corpus Audio extraction Template-based music extraction: finding relevant characteristic in an audio signal taken from a given repertoire; for instance time indexes of occurrences of percussive sounds can be extracted in order to represent the essential rhythmic component of a music piece. Note that this list is not exhaustive and that the exact meaning of IE activities is not commonly accepted and that many approaches combine multiple sub-tasks of IE in order to achieve a wider goal. Machine learning, statistical analysis and/or natural language processing are often used in IE. IE on non-text documents is becoming an increasingly interesting topic in research, and information extracted from multimedia documents can now be expressed in a high level structure as it is done on text. This naturally leads to the fusion of extracted information from multiple kinds of documents and sources. == World Wide Web applications == IE has been the focus of the MUC conferences. The proliferation of the Web, however, intensified the need for developing IE systems that help people
Chinchilla (language model)
Chinchilla is a family of large language models (LLMs) developed by the research team at Google DeepMind, presented in March 2022. == Models == It is named "chinchilla" because it is a further development over a previous model family named Gopher. Both model families were trained in order to investigate the scaling laws of large language models. It claimed to outperform GPT-3. It considerably simplifies downstream utilization because it requires much less computer power for inference and fine-tuning. Based on the training of previously employed language models, it has been determined that if one doubles the model size, one must also have twice the number of training tokens. This hypothesis has been used to train Chinchilla by DeepMind. Similar to Gopher in terms of cost, Chinchilla has 70B parameters and four times as much data. Chinchilla has an average accuracy of 67.5% on the Measuring Massive Multitask Language Understanding (MMLU) benchmark, which is 7% higher than Gopher's performance. Chinchilla was still in the testing phase as of January 12, 2023. Chinchilla contributes to developing an effective training paradigm for large autoregressive language models with limited compute resources. The Chinchilla team recommends that the number of training tokens is twice for every model size doubling, meaning that using larger, higher-quality training datasets can lead to better results on downstream tasks. It has been used for the Flamingo vision-language model. == Architecture == Both the Gopher family and Chinchilla family are families of transformer models. In particular, they are essentially the same as GPT-2, with different sizes and minor modifications. Gopher family uses RMSNorm instead of LayerNorm; relative positional encoding rather than absolute positional encoding. The Chinchilla family is the same as the Gopher family, but trained with AdamW instead of Adam optimizer. The Gopher family contains six models of increasing size, from 44 million parameters to 280 billion parameters. They refer to the largest one as "Gopher" by default. Similar naming conventions apply for the Chinchilla family. Table 1 of shows the entire Gopher family: Table 4 of compares the 70-billion-parameter Chinchilla with Gopher 280B.
Bigram
A bigram or digram is a sequence of two adjacent elements from a string of tokens, which are typically letters, syllables, or words. A bigram is an n-gram for n=2. The frequency distribution of every bigram in a string is commonly used for simple statistical analysis of text in many applications, including in computational linguistics, cryptography, and speech recognition. Gappy bigrams or skipping bigrams are word pairs which allow gaps (perhaps avoiding connecting words, or allowing some simulation of dependencies, as in a dependency grammar). == Applications == Bigrams, along with other n-grams, are used in most successful language models for speech recognition. Bigram frequency attacks can be used in cryptography to solve cryptograms. See frequency analysis. Bigram frequency is one approach to statistical language identification. Some activities in logology or recreational linguistics involve bigrams. These include attempts to find English words beginning with every possible bigram, or words containing a string of repeated bigrams, such as logogogue. == Bigram frequency in the English language == The frequency of the most common letter bigrams in a large English corpus is: th 3.56% of 1.17% io 0.83% he 3.07% ed 1.17% le 0.83% in 2.43% is 1.13% ve 0.83% er 2.05% it 1.12% co 0.79% an 1.99% al 1.09% me 0.79% re 1.85% ar 1.07% de 0.76% on 1.76% st 1.05% hi 0.76% at 1.49% to 1.05% ri 0.73% en 1.45% nt 1.04% ro 0.73% nd 1.35% ng 0.95% ic 0.70% ti 1.34% se 0.93% ne 0.69% es 1.34% ha 0.93% ea 0.69% or 1.28% as 0.87% ra 0.69% te 1.20% ou 0.87% ce 0.65%
Outlook on the web
Outlook on the web (formerly Outlook Web App and Outlook Web Access) is a personal information manager web app from Microsoft. It is a web-based version of Microsoft Outlook, and is included in Exchange Server and Exchange Online (a component of Microsoft 365). It can be freely accessed from any web browser whether inside or outside an organization's network, and includes a web email client, a calendar tool, a contact manager, and a task manager. It also includes add-in integration, Skype on the web, and alerts as well as unified themes that span across all the web apps. == Purpose == Outlook on the web is available to Microsoft 365 (formerly Office 365) and Exchange Online subscribers, and is included with the on-premises Exchange Server, to enable users to connect to their email accounts via a web browser, without requiring the installation of Microsoft Outlook or other email clients. In case of Exchange Server, it is hosted on a local intranet and requires a network connection to the Exchange Server for users to work with e-mail, address book, calendars and task. The Exchange Online version, which can be bought either independently or through Office 365 licensing program, is hosted on Microsoft servers on the World Wide Web. == History == Outlook Web Access was created in 1995 by Microsoft Program Manager Thom McCann on the Exchange Server team. An early working version was demonstrated by Microsoft Vice President Paul Maritz at Microsoft's famous Internet summit in Seattle on December 27, 1995. The first customer version was shipped as part of the Exchange Server 5.0 release in early 1997. The first component to allow client-side scripts to issue HTTP requests (XMLHTTP) was originally written by the Outlook Web Access team. It soon became a part of Internet Explorer 5. Renamed XMLHttpRequest and standardized by the World Wide Web Consortium, it has since become one of the cornerstones of the Ajax technology used to build advanced web apps. Outlook Web Access was later renamed Outlook Web App in 2010. An update on August 4, 2015, renamed OWA to "Outlook on the web", often referred to in brief as simply "Outlook". == Components == === Mail === Mail is the webmail component of Outlook on the web. The default view is a three column view with folders and groups on the left, an email message list in the middle, and the selected message on the right. With the 2015 update, Microsoft introduced the ability to pin, sweep and archive messages, and undo the last action, as well as richer image editing features. It can connect to other services such as GitHub and Twitter through Office 365 Connectors. Actionable Messages in emails allows a user to complete a task from within the email, such as retweeting a Tweet on Twitter or setting a meeting date on a calendar. Outlook on the web supports S/MIME and includes features for managing calendars, contacts, tasks, documents (used with SharePoint or Office Web Apps), and other mailbox content. In the Exchange 2007 release, Outlook on the web (still called Outlook Web App at the time) also offers read-only access to documents stored in SharePoint sites and network UNC shares. === Calendar === Calendar is the calendaring component of Outlook on the web. With the update, Microsoft added a weather forecast directly in the Calendar, as well as icons (or "charms") as visual cues for an event. In addition, email reminders came to all events, and a special Birthday and Holiday event calendars are created automatically. Calendars can be shared and there are multiple views such as day, week, month, and today. Another view is work week which includes Mondays through Fridays in the calendar view. Calendar's "Board View" feature allows for a customizable calendar with widgets such as Goal, Calendar, Tasks and Tips. Calendar details can be added with HTML and rich-text editing, and files can be attached to calendar events and appointments. === People === People is the contact manager component of Outlook on the web. A user can search and edit existing contacts, as well as create new ones. Contacts can be placed into folders and duplicate contacts can be linked from multiple sources such as LinkedIn or Twitter. In Outlook Mail, a contact can be created by clicking on an email address sender, which pulls down a contact card with an add button to add to Outlook People. Contacts can be imported as well as placed into a list that can be utilized when composing an email in Outlook Mail. People can also sync with friends and connections lists on LinkedIn, Facebook, and Twitter. === To Do === To Do was originally launched as Tasks for Outlook Web App. Microsoft was slowly rolling out a preview of Tasks to its consumer-based Outlook.com service that in May 2015, was announced to be moving to the Office 365 infrastructure. It was initially a part of Calendar as a view. Microsoft has separated the services into its own web app in Outlook on the web. In a post on the Office Blogs in 2015, Microsoft announced that Outlook Web App would be renamed Outlook on the web and that Tasks would move under that brand. A user can create tasks, put them into categories, and move them to another folder. A feature added was the ability to set due days and sort and filter the tasks according to those criteria. The app provides the user with fields such as subject, start and end dates, percent complete, priority, and how much work was put into each task. Rich editing features like bold, italic, underline, numbering, and bullet points were also introduced. Tasks can be edited and categorized according to how the user wishes them to be sorted. == Removed features == Outlook on the web has had two interfaces available: one with a complete feature set (known as Premium) and one with reduced functionality (known as Light or sometimes Lite). Prior to Exchange 2010, the Premium client required Internet Explorer. Exchange 2000 and 2003 require Internet Explorer 5 and later, and Exchange 2007 requires Internet Explorer 6 and later. Exchange 2010 supports a wider range of web browsers: Internet Explorer 7 or later, Firefox 3.01 or later, Chrome, or Safari 3.1 or later. However, Exchange 2010 restricts its Firefox and Safari support to macOS and Linux. In Exchange 2013, these browser restrictions were lifted. In Exchange 2010 and earlier, the Light user interface is rendered for browsers other than Internet Explorer. The basic interface did not support search on Exchange Server 2003. In Exchange Server 2007, the Light interface supported searching mail items; managing contacts and the calendar was also improved. The 2010 version can connect to an external email account. The ability to add new accounts to Outlook on the web using the Connected accounts feature was removed in September 2018 and all connected accounts stopped synchronizing email the following month.
Ed (chatbot)
Ed was a chatbot co-developed by the Los Angeles Unified School District and AllHere Education. Described as a learning acceleration platform, it was the first personal assistant for students in the United States. Part of the district's Individual Acceleration Plan, it was able to interact with students both verbally and visually, offering support in 100 languages. The chatbot was launched on March 20, 2024, as part of the district's plan for academic recovery from the COVID-19 pandemic and to improve overall academic performance. Utilizing artificial intelligence, Ed organizes data and reports on grades, test scores, and attendance, creating individualized plans for each student. After the company behind it, AllHere, collapsed, the district shuttered operations of the chatbot on June 14, 2024. The firm is under investigation by the US Federal Bureau of Investigation. == History == On February 14, 2022, Alberto M. Carvalho became the Superintendent of the Los Angeles Unified School District, pledging to give the district a full academic recovery from the COVID-19 pandemic. In December 2022, he announced the Individual Acceleration Plan for the district, which aimed to provide each student with a unique progress report and help them determine if they were on track to graduate. The district faced criticism from disability advocates for its management of Individualized Education Programs, and in April 2022, the United States Department of Education announced that the district had failed to provide appropriate educational services to students with disabilities during the pandemic. The district had been grappling with significant absenteeism issues since the pandemic, which led to declining academic performance and disengagement among students. On February 17, 2023, the district issued a request for proposals to develop a fully integrated portal system. Later that year, they signed a $6 million, five-year contract with AllHere Education, a Boston-based company founded in 2016. The introduction of Ed follows the public launch of ChatGPT, which has been utilized by both teachers and students in educational settings. On August 4, 2023, during an annual address at the Walt Disney Concert Hall, Carvalho and the Los Angeles Unified School District announced the launch of Ed. The district invested $4 million into the chatbot, with Carvalho noting that this cost would be halved thanks to donor and grant funding. The chatbot was launched on March 20, 2024. Following its launch, a press conference was held to address security and technology concerns. Carvalho stated that the district had collaborated with security companies and incorporated filters to screen for threatening language. Months after its launch, AllHere Education furloughed most of its staff on June 14, citing their “current financial position” on its website as the reason. After learning about the furlough, the district terminated its dealings with AllHere Education. However, it stated its intention to bring the chatbot back in the future once officials determine the best course of action. Carvalho announced that he would appoint an independent task force to review what went wrong with AllHere Education and the chatbot. On February 25, 2026, the FBI served a search warrant on Carvalho’s home and office in connection with AllHere. The FBI also raided the LAUSD's headquarters. == Service == The chatbot was described as a personal assistant and a "one-stop shop for parents and students" who want to see information about a student's attendance and grades, as well as other resources from the district. Additionally, the application can function as an alarm clock, provide daily lunch menus from the school cafeteria, and offer updates on the location of school buses. The chatbot also helps students and parents who do not speak English as their first language by translating displayed information into approximately 100 different languages. The application can also help with submitting applications and give updates on progress and upcoming assignments. The district stated that the primary goal of Ed was to actively motivate students to complete homework and other tasks. == Reception == The chatbot received a mostly positive reception among parents and observers upon its launch. Some parents and teachers expressed caution about the technology, voicing concerns that the district's push for its implementation lacked public accountability. Rob Nelson from the University of Pennsylvania described the district's strategy as risky, saying that the release felt "like the beginning of a Clippy-level disaster". After the chatbot's shutdown, The 74 criticized it for misusing student data. Chris Whiteley, a former software engineer at AllHere Education, alleged that the data collected by the chatbot likely violated the district's data privacy rules.