The usage share of an operating system is the percentage of computers running that operating system (OS). These statistics are estimates as wide scale OS usage data is difficult to obtain and measure. Reliable primary sources are limited and data collection methodology is not formally agreed. Currently devices connected to the internet allow for web data collection to approximately measure OS usage. As of December 2025, Android, which uses the Linux kernel, is the world's most popular operating system with 38.94% of the global market, followed by Windows with 29.99%, iOS with 15.66%, macOS with 2.14%, and other operating systems with 10.78%. This is for all device types excluding embedded devices. For smartphones and other mobile devices, Android has 72% market share, and Apple's iOS has 28%. For desktop computers and laptops, Microsoft Windows has 60.8%, followed by unknown operating systems at 19.7%, Mac OS at 14.4%, desktop Linux at 3.2%, then Google's ChromeOS at 1.6%, as of March 2026. For tablets, Apple's iPadOS (a variant of iOS) has 52% share and Android has 48% worldwide. For the top 500 most powerful supercomputers, Linux distributions have had 100% of the market share since 2017. The global server operating system market share has Linux leading with a 63.1% marketshare, followed by Windows, Unix and other operating systems. Linux is also most used for web servers, and the most common Linux distribution is Ubuntu, followed by Debian. Linux has almost caught up with the second-most popular (desktop) OS, macOS, in some regions, such as in South America, and in Asia it's at 6.4% (7% with ChromeOS) vs 9.7% for macOS. In the US, ChromeOS is third at 5.5%, followed by (desktop) Linux at 4.3%. The most numerous type of device with an operating system are embedded systems. Not all embedded systems have operating systems, instead running their application code on the "bare metal"; of those that do have operating systems, a high percentage are standalone or do not have a web browser, which makes their usage share difficult to measure. Some operating systems used in embedded systems are more widely used than some of those mentioned above; for example, modern Intel microprocessors contain an embedded management processor running a version of the Minix operating system. == Worldwide device shipments == Shipments (to stores) do not necessarily translate to sales to consumers, therefore suggesting the numbers indicate popularity and/or usage could be misleading. Not only do smartphones sell in higher numbers than PCs, but also a lot more by dollar value, with the gap only projected to widen, to well over double. According to Gartner, the following is the worldwide device shipments (referring to wholesale) by operating system from 2012 to 2016, which includes smartphones, tablets, laptops and PCs together. On 27 January 2016, Paul Thurrott summarized the operating system market, the day after Apple announced "one billion devices": Apple's "active installed base" is now one billion devices. [..] Granted, some of those Apple devices were probably sold into the marketplace years ago. But that 1 billion figure can and should be compared to the numbers Microsoft touts for Windows 10 (200 million, most recently) or Windows more generally (1.5 billion active users, a number that hasn’t moved, magically, in years), and that Google touts for Android (over 1.4 billion, as of September). My understanding of iOS is that the user base was previously thought to be around 800 million strong, and when you factor out Macs and other non-iOS Apple devices, that's probably about right. But as you can see, there are three big personal computing platforms. And only one of them is actually declining. We’ll see how Windows 10 fares over the long term, but even if Microsoft hits the 1 billion figure in 1-2 years as promised, it will by then still be the smallest of those three platforms. In 2018, Apple stopped revealing unit sales in its reports. Since 2018, the company have been publishing only revenues per device models which, nonetheless, allowed the analysers to extrapolate the unit sales from the model revenues by applying the wholesale device prices. Other hardware manufacturers usually do not report unit sales. === PC shipments === For 2015 (and earlier), Gartner reports for "the year, worldwide PC shipments declined for the fourth consecutive year, which started in 2012 with the launch of tablets" with an 8% decline in PC sales for 2015 (not including cumulative decline in sales over the previous years). Microsoft backed away from their goal of one billion Windows 10 devices in three years (or "by the middle of 2018") and reported on 26 September 2016 that Windows 10 was running on over 400 million devices, and in March 2019, on more than 800 million. In May 2020, Gartner predicted further decline in all market segments for 2020 due to COVID-19, predicting a decline of 13.6% for all devices. while the "Work from Home Trend Saved PC Market from Collapse", with only a decline of 10.5% predicted for PCs. However, in the end, according to Gartner, PC shipments grew 10.7% in the fourth quarter of 2020 and reached 275 million units in 2020, a 4.8% increase from 2019 and the highest growth in ten years." Apple in 4th place for PCs had the largest growth in shipments for a company in Q4 of 31.3%, while "the fourth quarter of 2020 was another remarkable period of growth for Chromebooks, with shipments increasing around 200% year over year to reach 11.7 million units. In 2020, Chromebook shipments increased over 80% to total nearly 30 million units, largely due to demand from the North American education market." Chromebooks sold more (30 million) than Apple's Macs worldwide (22.5 million) in pandemic year 2020. According to the Catalyst group, the year 2021 had record high PC shipments with total shipments of 341 million units (including Chromebooks), 15% higher than 2020 and 27% higher than 2019, while being the largest shipment total since 2012. According to Gartner, worldwide PC shipments declined by 16.2% in 2022, the largest annual decrease since the mid-1990s, due to geopolitical, economic, and supply chain challenges. In 2024 and 2025, due to lower adoption of Windows 11 and Microsoft ending its support to Windows 10, the number of PCs shipped with pre-installed Windows OS dropped. Pundits attribute the low Windows 11 acceptance to its steep hardware requirements and especially the TPM 2.0 ready chipset requirement and the 2024 CrowdStrike-related IT outages. Meanwhile, the macOS device market share in PC device shipments increased to new heights, with improved numbers seen for Linux devices too. In Q3 2025, the macOS pre-installed device shipments increased by 14.9% year-over-year (YoY), while the overall PC-shipments increased only by 8.1%, in Q2 2025, it grew 21.4% YoY while the global PC-shipments increased only by 6.5%, and in Q1 2025, it grew 7% YoY while the global PC-shipments increased by 4.8%. === Tablet computers shipments === In 2015, eMarketer estimated at the beginning of the year that the tablet installed base would hit one billion for the first time (with China's use at 328 million, which Google Play doesn't serve or track, and the United States's use second at 156 million). At the end of the year, because of cheap tablets – not counted by all analysts – that goal was met (even excluding cumulative sales of previous years) as: Sales quintupled to an expected 1 billion units worldwide this year, from 216 million units in 2014, according to projections from the Envisioneering Group. While that number is far higher than the 200-plus million units globally projected by research firms IDC, Gartner and Forrester, Envisioneering analyst Richard Doherty says the rival estimates miss all the cheap Asian knockoff tablets that have been churning off assembly lines.[..] Forrester says its definition of tablets "is relatively narrow" while IDC says it includes some tablets by Amazon — but not all.[..] The top tech purchase of the year continued to be the smartphone, with an expected 1.5 billion sold worldwide, according to projections from researcher IDC. Last year saw some 1.2 billion sold.[..] Computers didn’t fare as well, despite the introduction of Microsoft's latest software upgrade, Windows 10, and the expected but not realized bump it would provide for consumers looking to skip the upgrade and just get a new computer instead. Some 281 million PCs were expected to be sold, according to IDC, down from 308 million in 2014. Folks tend to be happy with the older computers and keep them for longer, as more of our daily computing activities have moved to the smartphone.[..] While Windows 10 got good reviews from tech critics, only 11% of the 1-billion-plus Windows user base opted to do the upgrade, according to Microsoft. This suggests Microsoft has a ways to go before the software gets "hit" status. Apple's new operating system El Capitan has been
Thinkfree Office
Thinkfree Office is a web-based commercial office productivity suite developed by South Korea-based Thinkfree Inc. It includes Word (a word processor), Spreadsheet (a spreadsheet) and Presentation (a presentation program). They are compatible with Microsoft Office's Word, PowerPoint, and Excel. It also features collaborative editing. The product is hosted on the client's server. == Supported file formats == Thinkfree Office supports ISO/IEC international standard ISO/IEC 26300 Open Document Format for Office Applications (odf, odt, odp, ods, odg). It also supports Microsoft's XML formats (docx, pptx, xlsx) and Microsoft's legacy binary formats (doc, ppt, xls). == Naming == The software was previously marketed under different names, such as Thinkfree Server, Thinkfree Online, Hancom Office Online, and Hancom Office Web. Eventually, the brand was consolidated under the name Thinkfree Office. == History == In June 2000, Thinkfree Inc. released Thinkfree Office, based in Silicon Valley, California. It is recognized as the world's first online office editor (predating Google Docs and Microsoft 365) and attracted significant media coverage, including reports on CNN. In 2001, Microsoft CEO Steve Ballmer highlighted Thinkfree as a significant competitor in a magazine interview, considering it a potential threat to his company, second only to Linux. In November 2003, Hancom, a South Korean office software company, signed a memorandum of understanding and subsequently acquired Thinkfree. In January 2004, Thinkfree expanded into other foreign markets. Subsidiary Haansoft USA, Inc. was created in San Jose, California to begin formal commercial operations in the US market. At the same time, a partnership was established with Riverdeep with the purpose of improving marketshare. In February 2004, expansion into the Japanese market began. A commercial agency agreement was signed with PSI in Shinjuku, Japan, which allowed for localized distribution. In addition, a global agreement was entered into with Yamada Denki, one of the three main computer distributors in Japan, for a total of 180,000 units. In May 2006, Thinkfree Office received the "Product of the Year" award at the Well-Connected Awards, USA. In January 2009, Thinkfree Mobile was launched at CES 2009 in Las Vegas. In April 2009, Thinkfree Live, Korea's first web office service, was launched. In June 2018, a partnership was formed with Amazon Web Services to integrate Thinkfree Office into WorkDocs, an in-house office suite. In October 2023, Hancom split its online office business unit as "Thinkfree Inc.".
Vicarious (company)
Vicarious was an artificial intelligence company based in the San Francisco Bay Area, California. They use the theorized computational principles of the brain to attempt to build software that can think and learn like a human. Vicarious describes its technology as "a turnkey robotics solution integrator using artificial intelligence to automate tasks too complex and versatile for traditional automations". Alphabet Inc acquired the company in 2022 for an undisclosed amount. == Founders == The company was founded in 2010 by D. Scott Phoenix and Dileep George. Before co-founding Vicarious, Phoenix was Entrepreneur in Residence at Founders Fund and CEO of Frogmetrics, a touchscreen analytics company he co-founded through the Y Combinator incubator program. Previously, George was Chief Technology Officer at Numenta, a company he co-founded with Jeff Hawkins and Donna Dubinsky while completing his PhD at Stanford University. == Funding == The company launched in February 2011 with funding from Founders Fund, Dustin Moskovitz, Adam D’Angelo (former Facebook CTO and co-founder of Quora), Felicis Ventures, and Palantir co-founder Joe Lonsdale. In August 2012, in its Series A round of funding, it raised an additional $15 million. The round was led by Good Ventures; Founders Fund, Open Field Capital and Zarco Investment Group also participated. The company received $40 million in its Series B round of funding. The round was led by individuals including Mark Zuckerberg, Elon Musk, and others. An additional undisclosed amount was later contributed by Amazon.com CEO Jeff Bezos, Yahoo! co-founder Jerry Yang, Skype co-founder Janus Friis and Salesforce.com CEO Marc Benioff. == Recursive Cortical Network == Vicarious is developing machine learning software based on the computational principles of the human brain. One such software is a vision system known as the Recursive Cortical Network (RCN), it is a generative graphical visual perception system that interprets the contents of photographs and videos in a manner similar to humans. The system is powered by a balanced approach that takes sensory data, mathematics, and biological plausibility into consideration. On October 22, 2013, beating CAPTCHA, Vicarious announced its model was reliably able to solve modern CAPTCHAs, with character recognition rates of 90% or better when trained on one style. However, Luis von Ahn, a pioneer of early CAPTCHA and founder of reCAPTCHA, expressed skepticism, stating: "It's hard for me to be impressed since I see these every few months." He pointed out that 50 similar claims to that of Vicarious had been made since 2003. Vicarious later published their findings in peer-reviewed journal Science. Vicarious has indicated that its AI was not specifically designed to complete CAPTCHAs and its success at the task is a product of its advanced vision system. Because Vicarious's algorithms are based on insights from the human brain, it is also able to recognize photographs, videos, and other visual data.
Deep image prior
Deep image prior is a type of convolutional neural network used to enhance a given image with no prior training data other than the image itself. A neural network is randomly initialized and used as prior to solve inverse problems such as noise reduction, super-resolution, and inpainting. Image statistics are captured by the structure of a convolutional image generator rather than by any previously learned capabilities. == Method == === Background === Inverse problems such as noise reduction, super-resolution, and inpainting can be formulated as the optimization task x ∗ = m i n x E ( x ; x 0 ) + R ( x ) {\displaystyle x^{}=min_{x}E(x;x_{0})+R(x)} , where x {\displaystyle x} is an image, x 0 {\displaystyle x_{0}} a corrupted representation of that image, E ( x ; x 0 ) {\displaystyle E(x;x_{0})} is a task-dependent data term, and R(x) is the regularizer. Deep neural networks learn a generator/decoder x = f θ ( z ) {\displaystyle x=f_{\theta }(z)} which maps a random code vector z {\displaystyle z} to an image x {\displaystyle x} . The image corruption method used to generate x 0 {\displaystyle x_{0}} is selected for the specific application. === Specifics === In this approach, the R ( x ) {\displaystyle R(x)} prior is replaced with the implicit prior captured by the neural network (where R ( x ) = 0 {\displaystyle R(x)=0} for images that can be produced by a deep neural networks and R ( x ) = + ∞ {\displaystyle R(x)=+\infty } otherwise). This yields the equation for the minimizer θ ∗ = a r g m i n θ E ( f θ ( z ) ; x 0 ) {\displaystyle \theta ^{}=argmin_{\theta }E(f_{\theta }(z);x_{0})} and the result of the optimization process x ∗ = f θ ∗ ( z ) {\displaystyle x^{}=f_{\theta ^{}}(z)} . The minimizer θ ∗ {\displaystyle \theta ^{}} (typically a gradient descent) starts from a randomly initialized parameters and descends into a local best result to yield the x ∗ {\displaystyle x^{}} restoration function. ==== Overfitting ==== A parameter θ may be used to recover any image, including its noise. However, the network is reluctant to pick up noise because it contains high impedance while useful signal offers low impedance. This results in the θ parameter approaching a good-looking local optimum so long as the number of iterations in the optimization process remains low enough not to overfit data. === Deep Neural Network Model === Typically, the deep neural network model for deep image prior uses a U-Net like model without the skip connections that connect the encoder blocks with the decoder blocks. The authors in their paper mention that "Our findings here (and in other similar comparisons) seem to suggest that having deeper architecture is beneficial, and that having skip-connections that work so well for recognition tasks (such as semantic segmentation) is highly detrimental." == Applications == === Denoising === The principle of denoising is to recover an image x {\displaystyle x} from a noisy observation x 0 {\displaystyle x_{0}} , where x 0 = x + ϵ {\displaystyle x_{0}=x+\epsilon } . The distribution ϵ {\displaystyle \epsilon } is sometimes known (e.g.: profiling sensor and photon noise) and may optionally be incorporated into the model, though this process works well in blind denoising. The quadratic energy function E ( x , x 0 ) = | | x − x 0 | | 2 {\displaystyle E(x,x_{0})=||x-x_{0}||^{2}} is used as the data term, plugging it into the equation for θ ∗ {\displaystyle \theta ^{}} yields the optimization problem m i n θ | | f θ ( z ) − x 0 | | 2 {\displaystyle min_{\theta }||f_{\theta }(z)-x_{0}||^{2}} . === Super-resolution === Super-resolution is used to generate a higher resolution version of image x. The data term is set to E ( x ; x 0 ) = | | d ( x ) − x 0 | | 2 {\displaystyle E(x;x_{0})=||d(x)-x_{0}||^{2}} where d(·) is a downsampling operator such as Lanczos that decimates the image by a factor t. === Inpainting === Inpainting is used to reconstruct a missing area in an image x 0 {\displaystyle x_{0}} . These missing pixels are defined as the binary mask m ∈ { 0 , 1 } H × V {\displaystyle m\in \{0,1\}^{H\times V}} . The data term is defined as E ( x ; x 0 ) = | | ( x − x 0 ) ⊙ m | | 2 {\displaystyle E(x;x_{0})=||(x-x_{0})\odot m||^{2}} (where ⊙ {\displaystyle \odot } is the Hadamard product). The intuition behind this is that the loss is computed only on the known pixels in the image, and the network is going to learn enough about the image to fill in unknown parts of the image even though the computed loss doesn't include those pixels. This strategy is used to remove image watermarks by treating the watermark as missing pixels in the image. === Flash–no-flash reconstruction === This approach may be extended to multiple images. A straightforward example mentioned by the author is the reconstruction of an image to obtain natural light and clarity from a flash–no-flash pair. Video reconstruction is possible but it requires optimizations to take into account the spatial differences. == Implementations == A reference implementation rewritten in Python 3.6 with the PyTorch 0.4.0 library was released by the author under the Apache 2.0 license: deep-image-prior A TensorFlow-based implementation written in Python 2 and released under the CC-SA 3.0 license: deep-image-prior-tensorflow A Keras-based implementation written in Python 2 and released under the GPLv3: machine_learning_denoising == Example == See Astronomy Picture of the Day (APOD) of 2024-02-18
Visual descriptor
In computer vision, visual descriptors or image descriptors are descriptions of the visual features of the contents in images, videos, or algorithms or applications that produce such descriptions. They describe elementary characteristics such as the shape, the color, the texture or the motion, among others. == Introduction == As a result of the new communication technologies and the massive use of Internet in our society, the amount of audio-visual information available in digital format is increasing considerably. Therefore, it has been necessary to design some systems that allow us to describe the content of several types of multimedia information in order to search and classify them. The audio-visual descriptors are in charge of the contents description. These descriptors have a good knowledge of the objects and events found in a video, image or audio and they allow the quick and efficient searches of the audio-visual content. This system can be compared to the search engines for textual contents. Although it is relatively easy to find text with a computer, it is much more difficult to find concrete audio and video parts. For instance, imagine somebody searching a scene of a happy person. The happiness is a feeling and it is not evident its shape, color and texture description in images. The description of the audio-visual content is not a superficial task and it is essential for the effective use of this type of archives. The standardization system that deals with audio-visual descriptors is the MPEG-7 (Motion Picture Expert Group - 7). == Types == Descriptors are the first step to find out the connection between pixels contained in a digital image and what humans recall after having observed an image or a group of images after some minutes. Visual descriptors are divided in two main groups: General information descriptors: contain low level descriptors which give a description about color, shape, regions, textures and motion. Specific domain information descriptors: give information about objects and events in the scene. A concrete example would be face recognition. === General information descriptors === General information descriptors consist of a set of descriptors that covers different basic and elementary features like: color, texture, shape, motion, location and others. This description is automatically generated by means of signal processing. ==== Color ==== It's the most basic quality of visual content. Five tools are defined to describe color. The three first tools represent the color distribution and the last ones describe the color relation between sequences or group of images: Dominant color descriptor (DCD) Scalable color descriptor (SCD) Color structure descriptor (CSD) Color layout descriptor (CLD) Group of frame (GoF) or group-of-pictures (GoP) ==== Texture ==== It's an important quality in order to describe an image. The texture descriptors characterize image textures or regions. They observe the region homogeneity and the histograms of these region borders. The set of descriptors is formed by: Homogeneous texture descriptor (HTD) Texture browsing descriptor (TBD) Edge histogram descriptor (EHD) ==== Shape ==== It contains important semantic information due to human's ability to recognize objects through their shape. However, this information can only be extracted by means of a segmentation similar to the one that the human visual system implements. Nowadays, such a segmentation system is not available yet, however there exists a serial of algorithms which are considered to be a good approximation. These descriptors describe regions, contours and shapes for 2D images and for 3D volumes. The shape descriptors are the following ones: Region-based shape descriptor (RSD) Contour-based shape descriptor (CSD) 3-D shape descriptor (3-D SD) ==== Motion ==== It's defined by four different descriptors which describe motion in video sequence. Motion is related to the objects motion in the sequence and to the camera motion. This last information is provided by the capture device, whereas the rest is implemented by means of image processing. The descriptor set is the following one: Motion activity descriptor (MAD) Camera motion descriptor (CMD) Motion trajectory descriptor (MTD) Warping and parametric motion descriptor (WMD and PMD) ==== Location ==== Elements location in the image is used to describe elements in the spatial domain. In addition, elements can also be located in the temporal domain: Region locator descriptor (RLD) Spatio temporal locator descriptor (STLD) === Specific domain information descriptors === These descriptors, which give information about objects and events in the scene, are not easily extractable, even more when the extraction is to be automatically done. Nevertheless, they can be manually processed. As mentioned before, face recognition is a concrete example of an application that tries to automatically obtain this information. == Descriptors applications == Among all applications, the most important ones are: Multimedia documents search engines and classifiers. Digital library: visual descriptors allow a very detailed and concrete search of any video or image by means of different search parameters. For instance, the search of films where a known actor appears, the search of videos containing the Everest mountain, etc. Personalized electronic news service. Possibility of an automatic connection to a TV channel broadcasting a soccer match, for example, whenever a player approaches the goal area. Control and filtering of concrete audiovisual content, like violent or pornographic material. Also, authorization for some multimedia content.
Carrier cloud
In cloud computing, a carrier cloud is a class of cloud that integrates wide area networks (WAN) and other attributes of communications service providers’ carrier-grade networks to enable the deployment of highly-complex applications in the cloud. In contrast, classic cloud computing focuses on the data center and does not address the network connecting data centers and cloud users. This may result in unpredictable response times and security issues when business-critical data are transferred over the Internet. == History == The advent of virtualization technology, cost-effective computing hardware, and ubiquitous Internet connectivity have enabled the first wave of cloud services starting in the early years of the 21st century. But many businesses and other organizations hesitated to move to more demanding applications, from on-premises dedicated hardware to private or public clouds. As a response, communications service providers started in the 2010/2011 time frame to develop carrier clouds that address perceived weaknesses in existing cloud services. Cited weaknesses vary but often include possible downtime, security issues, high cost of custom software and data transfer, inflexibility of some cloud apps, poor customer and nonfulfillment of service level agreements (SLAs). == Characteristics == To enable the deployment of time-sensitive and business critical applications in the cloud, the carrier cloud is designed to match or even exceed the characteristics of on-premises deployments. Therefore, the carrier cloud is characterized by some or all of the following items: Configurable, elastic network performance: Typical cloud computing solutions use the best effort of the public Internet to connect cloud users and data centers. This approach provides instant connectivity but does not offer control over network capacities, latencies, and jitter. Carrier clouds address these gaps with content delivery networks and/or dedicated virtual private networks (VPN) at OSI layers 1 (optical wavelengths), 2 (data link layer), and 3 (network layer). These VPNs can be configured to offer the desired performance parameters and exhibit the same type of elasticity for the network that regular clouds provide for servers and storage. To achieve the requested performance parameters, such as low latency, cloud applications can be (automatically) allocated to distributed data centers that are close enough to the cloud users. Automatic resource placement: For a cloud with multiple data centers, information about both the data center and the connecting network is relevant for a decision of where to place cloud images and storage volumes. For this decision, carrier clouds can obtain relevant information about the network, e.g., using the Application-Layer Traffic Optimization (ALTO) protocol. High level of security and governance: Cloud application providers are subject to general and domain specific security, privacy, and governance requirements and regulations, such as the European Data Protection Directive and the U.S. Health Insurance Portability and Accountability Act. For added security, the wide area network of the carrier cloud can provide segregated encrypted or unencrypted network links that are not accessible from the general Internet. At the data center, the carrier cloud provides e.g. virtual private servers, management processes, logs, and documentation to fulfill security and governance rules. Location control: Fundamentally, cloud users should not be concerned with the geographic location of their cloud resources. However, privacy and other regulations may mandate that certain types of data must not be sent outside a national jurisdiction or other geographical region. Open APIs: Carrier clouds provide graphical user interfaces and Web application programming interfaces that allow cloud application providers to set up, manage, and monitor both, the data center and the WAN, of their cloud services. == Architecture == Carrier clouds encompass data centers at different network tiers and wide area networks that connect multiple data centers to each other as well as to the cloud users. Links between data centers are used for failover, overflow, backup, and geographic diversity. Carrier clouds can be set up as public, private, or hybrid clouds. The carrier cloud federates these cloud entities by using a single management system to orchestrate, manage, and monitor data center and network resources as a single system.
Visual temporal attention
Visual temporal attention is a special case of visual attention that involves directing attention to specific instant of time. Similar to its spatial counterpart visual spatial attention, these attention modules have been widely implemented in video analytics in computer vision to provide enhanced performance and human interpretable explanation of deep learning models. As visual spatial attention mechanism allows human and/or computer vision systems to focus more on semantically more substantial regions in space, visual temporal attention modules enable machine learning algorithms to emphasize more on critical video frames in video analytics tasks, such as human action recognition. In convolutional neural network-based systems, the prioritization introduced by the attention mechanism is regularly implemented as a linear weighting layer with parameters determined by labeled training data. == Application in Action Recognition == Recent video segmentation algorithms often exploits both spatial and temporal attention mechanisms. Research in human action recognition has accelerated significantly since the introduction of powerful tools such as Convolutional Neural Networks (CNNs). However, effective methods for incorporation of temporal information into CNNs are still being actively explored. Motivated by the popular recurrent attention models in natural language processing, the Attention-aware Temporal Weighted CNN (ATW CNN) is proposed in videos, which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is implemented as temporal weighting and it effectively boosts the recognition performance of video representations. Besides, each stream in the proposed ATW CNN framework is capable of end-to-end training, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Experimental results show that the ATW CNN attention mechanism contributes substantially to the performance gains with the more discriminative snippets by focusing on more relevant video segments. == Literature == Seibold VC, Balke J and Rolke B (2023): Temporal attention. Front. Cognit. 2:1168320. doi: 10.3389/fcogn.2023.1168320.