KNIME

KNIME, the Konstanz Information Miner, is a data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining "Building Blocks of Analytics" concept. A graphical user interface and the use of Java Database Connectivity (JDBC) allow assembly of nodes blending different data sources, including preprocessing (extract, transform, load, or ETL), for modeling, data analysis and visualization with minimal, or no, programming. It is free and open-source software released under a GNU General Public License.

Since 2006, KNIME has been used in pharmaceutical research, and in other areas including customer relationship management (CRM) and data analysis, business intelligence, text mining and financial data analysis. Recently, attempts were made to use KNIME as a robotic process automation (RPA) tool.

KNIME's headquarters are in Zurich, with other offices in Konstanz, Berlin, and Austin (USA).

== History ==

Development of KNIME began in January 2004, with a team of software engineers at the University of Konstanz, as an open-source platform. The original team, headed by Michael Berthold, came from a Silicon Valley pharmaceutical industry software company. The initial goal was to create a modular, highly scalable and open data processing platform that allows easy integration of different data loading, processing, transforming, analyzing, and visual exploring modules, without focus on any one application area. The platform was intended for collaboration, research, and for integrating various other data analysis projects.

In 2006, the first version of KNIME was released. Several pharmaceutical companies began using KNIME, and several life science software vendors began integrating their tools into the platform. Later that year, after an article in the German magazine c't, users from a number of other areas joined in. As of 2012, KNIME was in use by over 15,000 actual users (i.e. not counting downloads, but users regularly retrieving updates) in the life sciences and at banks, publishers, car manufacturers, telcos, consulting firms, and various other industries, and by a large number of research groups worldwide. The latest updates to KNIME Server and KNIME Big Data Extensions provide support for Apache Spark 2.3, Parquet and HDFS-type storage.

For the sixth year in a row, KNIME has been placed as a leader for data science and machine learning platforms in Gartner's Magic Quadrant.

== Design philosophy, features ==

KNIME software follows these design principles and offers these features:

Visual, Interactive Framework: KNIME Software prioritizes a user-friendly and intuitive approach to data analysis. This is achieved through a visual and interactive framework where data flows can be combined using a drag-and-drop interface. Users can develop customized and interactive applications by creating simple to advanced and highly automated data pipelines. These may include, for example, access to databases, machine learning libraries, logic for workflow control (e.g., loops, switches, etc.), abstraction (e.g., interactive widgets), invocation, dynamic data apps, integrated deployment, or error handling.

Modularity: processing units and data containers should remain independent of each other. This design choice enables easy distribution of computation and allows for the independent development of different algorithms. Data types within KNIME are encapsulated, meaning no types are predefined. This design choice facilitates adding new data types and integrating them with existing types, while including type-specific renderers and comparators. This principle also enables inspecting results at the end of each single data operation.

Extensibility: KNIME Software is designed to be extensible. Adding new processing nodes or views is made simple through a plug-in mechanism. This mechanism ensures that users can distribute their custom functionalities without the need for complicated install or uninstall procedures.

Interleaving No-Code with Code: the platform supports integrating both visual programming (no-code) and script-based programming (e.g., Python, R, JavaScript) approaches to data analysis. This design principle is termed low-code.

Automation and Scalability: parameterization via flow variables and the encapsulation of workflow segments in components, for example, help reduce manual work and errors in analyses. Further, the scheduling of workflow execution (available in KNIME Business Hub and KNIME Community Hub for Teams) reduces dependency on human resources. In terms of scalability, a few examples include the ability to handle large datasets (millions of rows), execute multiple processes simultaneously out of the box and reuse workflow segments.

Full Usability: because it is open source, KNIME Analytics Platform is free to use in full, with no limited trial periods.

== Internals ==

KNIME allows users to visually create data flows (or pipelines), selectively execute some or all analysis steps, and later inspect the results and models using interactive widgets and views. KNIME is written in Java and based on Eclipse. It makes use of an extension mechanism to add plug-ins providing added functions. The core version includes hundreds of modules for data integration (file input/output (I/O), database nodes supporting all common database management systems through JDBC or native connectors: SQLite, MS-Access, SQL Server, MySQL, Oracle, PostgreSQL, Vertica and H2), data transformation (filter, converter, splitter, combiner, joiner), and the commonly used methods of statistics, data mining, analysis and text analytics. Visualization is supported with the Report Designer extension. KNIME workflows can be used as data sets to create report templates that can be exported to document formats such as doc, ppt, xls, pdf and others.

Other KNIME abilities are:

KNIME's core architecture allows processing of large data volumes that are limited only by the available hard disk space (not by the available RAM). For example, KNIME allows analyzing 300 million customer addresses, 20 million cell images, and 10 million molecular structures.

Added plug-ins allow integrating methods for text mining, image mining, time series analysis, and networking.

KNIME integrates various other open-source projects, e.g., machine learning algorithms from Weka, H2O, Keras, Spark, the R project and LIBSVM; plotly, JFreeChart, ImageJ, and the Chemistry Development Kit.

KNIME is implemented in Java and allows for wrappers calling other code, in addition to providing nodes that allow it to run Java, Python, R, Ruby and other code fragments.

Since 2021, KNIME's Python Integration has used Anaconda for Python distribution and environment management.

== License ==

In 2024, KNIME version 5.3 was released under the same GPLv3 license as previous versions. As of version 2.1, KNIME is released under the GPLv3 license, with an exception that allows commercial software vendors to use the well-defined node application programming interface (API) to add proprietary extensions, or wrappers calling their tools from KNIME.

== Courses ==

KNIME allows data analysis to be performed without programming skills. Several free online courses are provided.

Cross-language information retrieval

Cross-language information retrieval (CLIR) is a subfield of information retrieval dealing with retrieving information written in a language different from the language of the user's query. The term "cross-language information retrieval" has many synonyms, of which the following are perhaps the most frequent: cross-lingual information retrieval, translingual information retrieval, multilingual information retrieval. The term "multilingual information retrieval" refers more generally both to technology for retrieval of multilingual collections and to technology that has been ported from handling material in one language to handling material in another. Multilingual Information Retrieval (MLIR) involves the study of systems that accept queries for information in various languages and return objects (text and other media) in various languages, translated into the user's language. Cross-language information retrieval refers more specifically to the use case where users formulate their information need in one language and the system retrieves relevant documents in another. To do so, most CLIR systems use various translation techniques. CLIR techniques can be classified into different categories based on the translation resources they use:

Dictionary-based CLIR techniques
Parallel corpora-based CLIR techniques
Comparable corpora-based CLIR techniques
Machine translator-based CLIR techniques

CLIR systems have improved so much that the most accurate multilingual and cross-lingual ad hoc information retrieval systems today are nearly as effective as monolingual systems. Other related information access tasks, such as media monitoring, information filtering and routing, sentiment analysis, and information extraction, require more sophisticated models and typically more processing and analysis of the information items of interest. Much of that processing needs to be aware of the specifics of the target languages it is deployed in.
Mostly, the various mechanisms of variation in human language pose coverage challenges for information retrieval systems: texts in a collection may treat a topic of interest but use terms or expressions which do not match the expression of information need given by the user. This can be true even in a monolingual case, but it is especially true in cross-lingual information retrieval, where users may know the target language only to some extent. The benefits of CLIR technology for users with poor to moderate competence in the target language have been found to be greater than for those who are fluent. Specific technologies in place for CLIR services include morphological analysis to handle inflection, decompounding or compound splitting to handle compound terms, and translation mechanisms to translate a query from one language to another.

The first workshop on CLIR was held in Zürich during the SIGIR-96 conference. Workshops have been held yearly since 2000 at the meetings of the Cross Language Evaluation Forum (CLEF). Researchers also convene at the annual Text Retrieval Conference (TREC) to discuss their findings regarding different systems and methods of information retrieval, and the conference has served as a point of reference for the CLIR subfield. Early CLIR experiments were conducted at TREC-6, held at the National Institute of Standards and Technology (NIST) on November 19–21, 1997.

Google Search had a cross-language search feature that was removed in 2013.
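As an illustration of the dictionary-based approach combined with compound splitting, the following is a minimal Python sketch. It is not taken from any particular CLIR system: the toy German-to-English dictionary and the names TOY_DICTIONARY, split_compound and translate_query are all hypothetical, and the greedy splitter ignores linking elements and morphology that a real system would need to handle.

```python
# Toy bilingual dictionary (German -> English). The entries are illustrative
# assumptions, not data from a real translation resource.
TOY_DICTIONARY = {
    "haus": ["house", "home"],
    "tür": ["door"],
    "schlüssel": ["key"],
}


def split_compound(term, vocabulary):
    """Greedily split a compound into known parts.

    Returns [term] unchanged if the term cannot be fully covered by
    entries of the vocabulary.
    """
    parts = []
    start = 0
    while start < len(term):
        # Try the longest known piece starting at this position first.
        for end in range(len(term), start, -1):
            if term[start:end] in vocabulary:
                parts.append(term[start:end])
                start = end
                break
        else:
            # No known piece starts here: give up and keep the term whole.
            return [term]
    return parts


def translate_query(query, dictionary):
    """Translate a query word by word, expanding each word to all its translations."""
    translated = []
    for token in query.lower().split():
        for part in split_compound(token, dictionary):
            # Terms without a dictionary entry (e.g. proper nouns) are kept as-is.
            translated.extend(dictionary.get(part, [part]))
    return translated


if __name__ == "__main__":
    print(translate_query("Haustür Schlüssel Berlin", TOY_DICTIONARY))
    # ['house', 'home', 'door', 'key', 'berlin']
```

Keeping every candidate translation, as done here, trades precision for coverage; the greedy longest-match split can also miss a valid decomposition that a backtracking splitter would find.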

Pippit

Pippit (Chinese: 小云雀; pinyin: Xiǎoyúnquè) is an artificial intelligence content creation platform developed by the Chinese technology company ByteDance. The platform, powered by CapCut, leverages multimodal AI technology to streamline professional-grade video and image production, specifically targeting small and medium-sized enterprises and social media creators.

== History ==

In May 2025, ByteDance officially launched Pippit, which is positioned as an AI video and picture creation tool. In early 2026, Pippit underwent a major architectural overhaul with the integration of Dreamina Seedance 2.0. This technical milestone introduced the "Short Drama Agent" functionality, which enables the end-to-end conversion of scripts up to 100,000 words into fully rendered video productions.

Woken Furies

Woken Furies (2005) is a science fiction novel by British writer Richard Morgan. It is the third novel featuring the anti-hero Takeshi Kovacs and is the sequel to Broken Angels. This addition to the series casts light upon Kovacs' early life, providing information on his post-Envoy activities. Morgan's official website and interviews suggest that Woken Furies could be the last Kovacs novel, although in 2018 (before Netflix cancelled the show) Morgan stated that the Netflix adaptation had "kind of woken it all up again" after all these years, making him possibly reconsider being done with Kovacs.

== Plot ==

Takeshi Kovacs finds himself in a new "sleeve," or human body, back on his home planet of Harlan's World. He is on the run after making numerous attacks against the Knights of the New Revelation, an extremist religious order responsible for the death of his lost love and her daughter. Because she had violated tenets about resleeving, her executioners dropped her and her daughter's cortical stacks in the sea, effectively preventing them from being resleeved (into new bodies). While trying to secure passage after his most recent attack, Kovacs saves a woman named Sylvie from a group of religious zealots. In return, she allows him to take refuge with her mercenary "deCom" crew as they head out to decommission sentient military hardware that has run amok on the island of New Hokkaido (AKA New Hok). Sylvie is the "command head" of her crew, co-ordinating them during missions by using her biologically implanted circuitry and software. During one of these missions, Sylvie collapses; when she regains consciousness, Kovacs realizes that her personality seems to have been replaced by that of long-dead revolutionary leader Quellcrist Falconer.
Harlan's World is surrounded by automated "orbitals" which target flying objects, such as vehicles, with high-energy beam weapons known as "angelfire"; Falconer is believed to have died without a backup of her cortical stack when her getaway aircraft was destroyed by angelfire 300 years prior. When Sylvie's crew returns from New Hok, they discover a younger version of Kovacs has been illegally duplicated into a different body (AKA "double sleeved") and is hunting them on behalf of the Harlan family that rules the planet. Most of Sylvie's crew is killed and Sylvie/Quellcrist is captured. Kovacs schemes to rescue Sylvie by approaching old criminal associates of his, the Little Blue Bugs. The Little Blue Bugs mount a semi-successful attack on a Harlan fortress and rescue Sylvie/Quellcrist. Hiding from Harlan forces in a floating base, the neo-Quellists are sold out by its owner and recaptured. An assault by Kovacs and a single UN Envoy on the base ends badly when Kovacs is betrayed by the Envoy who was actually embedded with several colleagues. However, Sylvie/Quellcrist has established a connection with the orbitals and calls down angelfire, eliminating their captors. The younger Kovacs is killed in the aftermath. Sylvie explains that angelfire is a destructive recording device. Thus, in destroying Quellcrist and the helicopter carrying her, it copied her. When the technology of the deCom crews advanced far enough, her persona was able to insert itself into Sylvie's implants and co-exist in her body. The novel ends with Kovacs, Virginia Vidaura, and Sylvie/Quellcrist waiting to see if they can use Sylvie/Quellcrist's newfound connection to the orbitals and the expansion of a long-dormant genetic virus to turn the population against the ruling oligarchy.

Trevor Paglen

Trevor Paglen (born 1974) is an American artist, geographer, and author whose work covers mass surveillance and data collection. In 2016, Paglen won the Deutsche Börse Photography Foundation Prize and he has also won The Cultural Award from the German Society for Photography. In 2017, he was a recipient of a MacArthur Fellowship. On March 17, 2026, Paglen was awarded the 2026 LG Guggenheim Award (a collaboration between LG and Guggenheim New York).

== Early life and education ==

Paglen earned a B.A. degree in religious studies in 1998 from the University of California at Berkeley, an M.F.A. degree in 2002 from the School of the Art Institute of Chicago, and a Ph.D. in Geography in 2008 from the University of California at Berkeley. While at UC Berkeley, Paglen lived in the Berkeley Student Cooperative, residing in Chateau, Fenwick, and Rochdale co-ops.

== Work ==

Sean O'Hagan, writing in The Guardian in 2015, said that Paglen, whose "ongoing grand project [is] the murky world of global state surveillance and the ethics of drone warfare", "is one of the most conceptually adventurous political artists working today, and has collaborated with scientists and human rights activists on his always ambitious multimedia projects." His visual work, such as his "Limit Telephotography" and "The Other Night Sky" series, has received widespread attention both for its technical innovations and for its conceptual project, which involves simultaneously making and negating documentary-style truth-claims. Paglen’s work relies on contemporary technology in two meaningful ways. First, the views he photographs would be impossible to shoot without media technology, including cameras, microscopes, and even helicopters. Second, the shots would not be possible if not for the existence of the subject. The contrasts between secrecy and revelation, evidence and abstraction distinguish Paglen's work.
With that, the artist presents not so much "evidence" as admonitions to awareness.

He was an Eyebeam Commissioned Artist in 2007. In 2008 the Berkeley Art Museum devoted a comprehensive solo exhibition to his work. The following year, Paglen took part in the Istanbul Biennial, and in 2010 he exhibited at the Vienna Secession.

Autonomy Cube was a project by Paglen and Jacob Appelbaum that placed relays for the anonymous communication network Tor in traditional art museums. He contributed to the Oscar-winning documentary film Citizenfour (2014), directed by Laura Poitras. Paglen features in the nerd-culture documentary Traceroute (2016).

Orbital Reflector was a reflective mylar sculpture by Paglen intended to be the first "purely artistic" object in space. The temporary satellite, containing an inflatable mylar balloon with a reflective surface, launched into space on 3 December 2018.

A mid-career survey in 2018–2019, Trevor Paglen: Sites Unseen, was a traveling exhibition shown at the Smithsonian American Art Museum in Washington DC and the Museum of Contemporary Art San Diego. In September 2020, Pace Gallery in London held an exhibition of Paglen's work, exploring "the weird, partial ways computers look back at us". His work is included in the permanent collections of the San Francisco Museum of Modern Art, the Columbus Museum of Art, and the Metropolitan Museum.

=== Experimental Geography ===

Paglen is credited with coining the term "Experimental Geography" to describe practices coupling experimental cultural production and art-making with ideas from critical human geography about the production of space, materialism, and praxis. The 2009 book Experimental Geography: Radical Approaches to Landscape, Cartography, and Urbanism is largely inspired by Paglen's work.

== Publications ==

Paglen has published a number of books. Torture Taxi (2006), co-authored with investigative journalist A. C. Thompson, was the first book to comprehensively describe the CIA's extraordinary rendition program. I Could Tell You But Then You Would Have to be Destroyed by Me (2007) is a look at the world of black projects through unit patches and memorabilia created for top-secret programs. Blank Spots on the Map: The Dark Geography of the Pentagon's Secret World (2009) is a broader look at secrecy in the United States. The Last Pictures (2012) is a collection of 100 images to be placed on permanent media and launched into space on EchoStar XVI, as a repository available for future civilizations (alien or human) to find.

=== Publications by Paglen ===

I Could Tell You But Then You Would Have to be Destroyed by Me. Brooklyn, NY: Melville House, 2007. ISBN 1-933633-32-8.
Blank Spots on the Map: The Dark Geography of the Pentagon's Secret World. New York: Dutton, 2009. ISBN 9781101011492.
Invisible: Covert Operations and Classified Landscapes, Photographs by Trevor Paglen. New York: Aperture, 2010. ISBN 9781597111300. With an essay by Rebecca Solnit.
The Last Pictures. Oakland, CA: University of California, 2012. ISBN 9780520275003.
Trevor Paglen. London: Phaidon, 2018. ISBN 0714873446. With essays by Lauren Cornell, Julia Bryan-Wilson, Omar Kholeif.

=== Publications co-authored ===

Torture Taxi. Co-authored with A. C. Thompson. Brooklyn, NY: Melville House Publishing, 2006. ISBN 1-933633-09-3. Icon, 2007. ISBN 9781840468304.

=== Publications with contributions by Paglen ===

Experimental Geography: Radical Approaches to Landscape, Cartography, and Urbanism. Brooklyn, NY: Melville House, 2009. ISBN 978-0091636586. Edited by Nato Thompson. With essays by Paglen, Thompson, and Jeffrey Kastner.
Trevor Paglen and Jacob Appelbaum – Autonomy Cube. Revolver, 2016. ISBN 978-3957633026. Essays by Luke Skrebowski and Keller Easterling on Autonomy Cube, a piece of sculpture by Paglen and Jacob Appelbaum. In English and German.

== Exhibitions ==

Bellwether Gallery, New York, November–December 2006
The Other Night Sky, Berkeley Art Museum, 2008
A Compendium of Secrets, Cologne
Still Revolution: Suspended in Time, Museum of Contemporary Canadian Art, Toronto, May–June 2009. Group exhibition with Paglen, Barbara Astman, Walead Beshty, Mat Collishaw, Stan Douglas, Idris Khan, Martha Rosler, and Mikhael Subotzky
A Hidden Landscape, Aksioma, Ljubljana, Slovenia
Geographies of Seeing, Lighthouse, Brighton, England, October–November 2012
The Last Pictures, New York, 2012–13
Trevor Paglen, Altman Siegel gallery, San Francisco, CA, March–May 2015
The Octopus, Frankfurter Kunstverein, Frankfurt am Main, 2015
Autonomy Cube, Edith-Russ-Haus, Oldenburg, Germany, October 2015 – January 2016. Sculpture by Paglen and Jacob Appelbaum.
Deutsche Börse Photography Foundation Prize 2016, The Photographers' Gallery, London, April–July 2016. Deutsche Börse Photography Prize shortlist with Paglen, Erik Kessels, Laura El-Tantawy, and Tobias Zielony.
Radical Landscapes, di Rosa, Napa, February–April 2016
L’Image volée, Americas II, Bahamas Internet Cable System (BICS-1) and Globenet, Fondazione Prada, Milan (group exhibition), 2016
A Study of Invisible Images, Metro Pictures, New York, September–October 2017

== Awards ==

2014: Pioneer Award from the Electronic Frontier Foundation
2015: The Cultural Award from the German Society for Photography (DGPh)
2015: Academy Award as cameraman and director for the documentary film Citizenfour
2016: Deutsche Börse Photography Foundation Prize
2017: MacArthur Fellowship, John D. and Catherine T. MacArthur Foundation, Chicago, IL
2018: Nam June Paik Art Center Prize

== Films about Paglen ==

Unseen Skies (2021)

Acoustic model

An acoustic model is used in automatic speech recognition to represent the relationship between an audio signal and the phonemes or other linguistic units that make up speech. The model is learned from a set of audio recordings and their corresponding transcripts. It is created by taking audio recordings of speech, and their text transcriptions, and using software to create statistical representations of the sounds that make up each word.

== Background ==

Modern speech recognition systems use both an acoustic model and a language model to represent the statistical properties of speech. The acoustic model models the relationship between the audio signal and the phonetic units in the language. The language model is responsible for modeling the word sequences in the language. These two models are combined to get the top-ranked word sequences corresponding to a given audio segment.

Most modern speech recognition systems operate on the audio in small chunks known as frames, with an approximate duration of 10 ms per frame. The raw audio signal from each frame can be transformed by applying the mel-frequency cepstrum. The coefficients from this transformation are commonly known as mel-frequency cepstral coefficients (MFCCs) and are used as an input to the acoustic model along with other features.

Recently, the use of convolutional neural networks has led to major improvements in acoustic modeling.

== Speech audio characteristics ==

Audio can be encoded at different sampling rates (i.e. samples per second, the most common being 8, 16, 32, 44.1, 48, and 96 kHz) and different bits per sample (the most common being 8, 16, 24 or 32 bits). Speech recognition engines work best if the acoustic model they use was trained with speech audio recorded at the same sampling rate and bits per sample as the speech being recognized.
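The framing step described in the Background section can be sketched as follows. This is a minimal illustration in plain Python, assuming non-overlapping 10 ms frames; the function name split_into_frames and its parameters are hypothetical, and practical front ends commonly use overlapping, windowed frames before computing MFCCs.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=10):
    """Split a sequence of audio samples into consecutive, non-overlapping frames.

    A trailing partial frame (shorter than frame_ms) is dropped.
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


if __name__ == "__main__":
    one_second = [0] * 16000  # one second of silence at 16 kHz
    frames = split_into_frames(one_second, sample_rate_hz=16000)
    print(len(frames), len(frames[0]))  # 100 160
```

At 16 kHz a 10 ms frame holds 160 samples, and at 8 kHz it holds 80, which is one reason features computed at one sampling rate do not transfer directly to a model trained at another.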

== Telephony-based speech recognition ==

The limiting factor for telephony-based speech recognition is the bandwidth at which speech can be transmitted. For example, a standard land-line telephone only has a bandwidth of 64 kbit/s at a sampling rate of 8 kHz and 8 bits per sample (8,000 samples per second × 8 bits per sample = 64,000 bit/s). Therefore, for telephony-based speech recognition, acoustic models should be trained with 8 kHz/8-bit speech audio files.

In the case of voice over IP, the codec determines the sampling rate and bits per sample of speech transmission. Codecs with a higher sampling rate or more bits per sample for speech transmission (which improve the sound quality) necessitate acoustic models trained with audio data that matches that sampling rate and bits per sample.

== Desktop-based speech recognition ==

For speech recognition on a standard desktop PC, the limiting factor is the sound card. Most sound cards today can record audio at sampling rates between 16 and 48 kHz, at 8 to 16 bits per sample, and play back at up to 96 kHz.

As a general rule, a speech recognition engine works better with acoustic models trained with speech audio data recorded at higher sampling rates and bits per sample. But using audio with too high a sampling rate or too many bits per sample can slow the recognition engine down. A compromise is needed. Thus for desktop speech recognition, the current standard is acoustic models trained with speech audio data recorded at a sampling rate of 16 kHz and 16 bits per sample.
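The bit-rate arithmetic quoted for telephony, and the corresponding figure for the 16 kHz/16-bit desktop setting, can be checked with a short calculation. The helper name pcm_bit_rate is hypothetical, and the sketch assumes uncompressed single-channel audio.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Return the raw bit rate in bit/s of uncompressed audio."""
    return sample_rate_hz * bits_per_sample * channels


if __name__ == "__main__":
    print(pcm_bit_rate(8000, 8))    # telephony: 64000 bit/s
    print(pcm_bit_rate(16000, 16))  # desktop:  256000 bit/s
```

The desktop setting therefore carries four times as much raw data per second as the telephony setting, which illustrates the trade-off between audio detail and processing cost mentioned above.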

Science Fiction Thinking Machines

Science Fiction Thinking Machines: Robots, Androids, Computers is an anthology of science fiction short stories edited by American anthologist Groff Conklin. It was first published in hardcover by Vanguard Press in May 1954. An abridged paperback edition titled Selections from Science Fiction Thinking Machines was later published by Bantam Books in August 1955 and was reprinted in September 1964. The book consists of twenty-two novelettes and short stories by various science fiction authors, together with an introduction and bibliography by the editor. The stories were previously published from 1899 to 1954 in various science fiction and other magazines.

== Contents ==

Note: stories that also appear in the abridged edition are marked A.

"Introduction" (Groff Conklin)
"Automata: I" (S. Fowler Wright)
"Moxon's Master" (Ambrose Bierce)
"Robbie" (Isaac Asimov) A
"The Scarab" (Raymond Z. Gallun)
"The Mechanical Bride" (Fritz Leiber)
"Virtuoso" (Herbert Goldstone) A
"Automata: II" (S. Fowler Wright)
"Boomerang" (Eric Frank Russell) A
"The Jester" (William Tenn) A
"R. U. R." (Karel Čapek)
"Skirmish" (Clifford D. Simak) A
"Soldier Boy" (Michael Shaara)
"Automata: III" (S. Fowler Wright)
"Men Are Different" (Alan Bloch) A
"Letter to Ellen" (Chan Davis) A
"Sculptors of Life" (Wallace West)
"The Golden Egg" (Theodore Sturgeon) A
"Dead End" (Wallace Macfarlane) A
"Answer" (Hal Clement)
"Sam Hall" (Poul Anderson) A
"Dumb Waiter" (Walter M. Miller Jr.) A
"Problem for Emmy" (Robert Sherman Townes) A
"Selected List of Tales About Robots, Androids, and Computers" (Groff Conklin)