Philipp Koehn

Philipp Koehn

Philipp Koehn (born 1 August 1971 in Erlangen, West Germany) is a computer scientist and researcher in the field of machine translation. His primary research interest is statistical machine translation and he is one of the inventors of a method called phrase based machine translation. This is a sub-field of statistical translation methods that employs sequences of words (or so-called "phrases") as the basis of translation, expanding the previous word based approaches. A 2003 paper which he authored with Franz Josef Och and Daniel Marcu called Statistical phrase-based translation has attracted wide attention in Machine translation community and has been cited over a thousand times. Phrase based methods are widely used in machine translation applications in industry. Philipp Koehn received his PhD in computer science in 2003 from the University of Southern California, where he worked at the Information Sciences Institute advised by Kevin Knight. After a year as a postdoctoral fellow under Michael Collins at the Massachusetts Institute of Technology, he joined the University of Edinburgh as a lecturer in the School of Informatics in 2005. He was appointed reader in 2010 and professor in 2012. In 2014, he was appointed professor at the computer science department of The Johns Hopkins University, where he is affiliated with the Center for Language and Speech Processing. == Moses statistical machine translation decoder == The Moses machine translation decoder is an open source project that was created by and is maintained under the guidance of Philipp Koehn. The Moses decoder is a platform for developing Statistical machine translation systems given a parallel corpus for any language pair. The decoder was mainly developed by Hieu Hoang and Philipp Koehn at the University of Edinburgh and extended during a Johns Hopkins University Summer Workshop and further developed under Euromatrix and GALE project funding. The decoder (which is part of a complete statistical machine translation toolkit) is the de facto benchmark for research in the field. Although Koehn continues to play a major role in the development of Moses, the Moses decoder was supported by the European Framework 6 projects Euromatrix, TC-Star, the European Framework 7 projects EuroMatrixPlus, Let's MT, META-NET and MosesCore and the DARPA GALE project, as well as several universities such as the University of Edinburgh, the University of Maryland, ITC-irst, Massachusetts Institute of Technology, and others. Substantial additional contributors to the Moses decoder include Hieu Hoang, Chris Dyer, Josh Schroeder, Marcello Federico, Richard Zens, and Wade Shen. == Europarl corpus == The Europarl corpus is a set of documents that consists of the proceedings of the European Parliament from 1996 to the present. The corpus has been compiled and expanded by a group of researchers led by Philipp Koehn at University of Edinburgh. The data that makes up the corpus was extracted from the website of the European Parliament and then prepared for linguistic research. The latest release (2012) comprised up to 60 million words per language, with 21 European languages represented: Romanic (French, Italian, Spanish, Portuguese, Romanian), Germanic (English, Dutch, German, Danish, Swedish), Slavic (Bulgarian, Czech, Polish, Slovak, Slovene), Finno-Ugric (Finnish, Hungarian, Estonian), Baltic (Latvian, Lithuanian), and Greek. == Other interests and activities in chronological order == Koehn is a professor at Johns Hopkins University where he continues his research into machine translation through his affiliation with the Center for Language and Speech Processing Koehn is a professor and chair of machine translation at the University of Edinburgh School of Informatics and contributes to its statistical machine translation group which organises workshops, seminars and project related to the subject. Koehn has consulted to SYSTRAN periodically between 2006 and 2011. SYSTRAN was acquired by CLSI, a Korean machine translation company in April 2014. Koehn worked for Facebook/META AI Research from 2018 to 2022. Koehn is also chief scientist for Omniscien Technologies and a shareholder in Omniscien Technologies since 2007. Omniscien Technologies is a private company developing and commercialising machine translation technologies. Koehn authored a book titled "Statistical Machine Translation" in 2009 and a book titled "Neural Machine Translation" in 2020. == Awards and recognition == 2013: One of three finalists in the category of Research for the European Patent Office (EPO) 2013 European Inventor Award. Koehn was recognised for patent EP 1488338 B, Phrase-Based Joint Probability Model for Statistical Machine Translations, a translation model that uses mathematical probabilities to determine the most likely interpretation of chunks of text between foreign languages. 2015: Koehn received the Award of Honor of the International Association for Machine Translation. 2024: Koehn was named Fellow of the Association for Computational Linguistics (ACL).

Visual servoing

Visual servoing, also known as vision-based robot control and abbreviated VS, is a technique which uses feedback information extracted from a vision sensor (visual feedback) to control the motion of a robot. One of the earliest papers that talks about visual servoing was from the SRI International Labs in 1979. == Visual servoing taxonomy == There are two fundamental configurations of the robot end-effector (hand) and the camera: Eye-in-hand, or end-point open-loop control, where the camera is attached to the moving hand and observing the relative position of the target. Eye-to-hand, or end-point closed-loop control, where the camera is fixed in the world and observing the target and the motion of the hand. Visual Servoing control techniques are broadly classified into the following types: Image-based (IBVS) Position/pose-based (PBVS) Hybrid approach IBVS was proposed by Weiss and Sanderson. The control law is based on the error between current and desired features on the image plane, and does not involve any estimate of the pose of the target. The features may be the coordinates of visual features, lines or moments of regions. IBVS has difficulties with motions very large rotations, which has come to be called camera retreat. PBVS is a model-based technique (with a single camera). This is because the pose of the object of interest is estimated with respect to the camera and then a command is issued to the robot controller, which in turn controls the robot. In this case the image features are extracted as well, but are additionally used to estimate 3D information (pose of the object in Cartesian space), hence it is servoing in 3D. Hybrid approaches use some combination of the 2D and 3D servoing. There have been a few different approaches to hybrid servoing 2-1/2-D Servoing Motion partition-based Partitioned DOF Based == Survey == The following description of the prior work is divided into 3 parts Survey of existing visual servoing methods. Various features used and their impacts on visual servoing. Error and stability analysis of visual servoing schemes. === Survey of existing visual servoing methods === Visual servo systems, also called servoing, have been around since the early 1980s , although the term visual servo itself was only coined in 1987. Visual Servoing is, in essence, a method for robot control where the sensor used is a camera (visual sensor). Servoing consists primarily of two techniques, one involves using information from the image to directly control the degrees of freedom (DOF) of the robot, thus referred to as Image Based Visual Servoing (IBVS). While the other involves the geometric interpretation of the information extracted from the camera, such as estimating the pose of the target and parameters of the camera (assuming some basic model of the target is known). Other servoing classifications exist based on the variations in each component of a servoing system , e.g. the location of the camera, the two kinds are eye-in-hand and hand–eye configurations. Based on the control loop, the two kinds are end-point-open-loop and end-point-closed-loop. Based on whether the control is applied to the joints (or DOF) directly or as a position command to a robot controller the two types are direct servoing and dynamic look-and-move. Being one of the earliest works the authors proposed a hierarchical visual servo scheme applied to image-based servoing. The technique relies on the assumption that a good set of features can be extracted from the object of interest (e.g. edges, corners and centroids) and used as a partial model along with global models of the scene and robot. The control strategy is applied to a simulation of a two and three DOF robot arm. Feddema et al. introduced the idea of generating task trajectory with respect to the feature velocity. This is to ensure that the sensors are not rendered ineffective (stopping the feedback) for any the robot motions. The authors assume that the objects are known a priori (e.g. CAD model) and all the features can be extracted from the object. The work by Espiau et al. discusses some of the basic questions in visual servoing. The discussions concentrate on modeling of the interaction matrix, camera, visual features (points, lines, etc..). In an adaptive servoing system was proposed with a look-and-move servoing architecture. The method used optical flow along with SSD to provide a confidence metric and a stochastic controller with Kalman filtering for the control scheme. The system assumes (in the examples) that the plane of the camera and the plane of the features are parallel., discusses an approach of velocity control using the Jacobian relationship s˙ = Jv˙ . In addition the author uses Kalman filtering, assuming that the extracted position of the target have inherent errors (sensor errors). A model of the target velocity is developed and used as a feed-forward input in the control loop. Also, mentions the importance of looking into kinematic discrepancy, dynamic effects, repeatability, settling time oscillations and lag in response. Corke poses a set of very critical questions on visual servoing and tries to elaborate on their implications. The paper primarily focuses the dynamics of visual servoing. The author tries to address problems like lag and stability, while also talking about feed-forward paths in the control loop. The paper also, tries to seek justification for trajectory generation, methodology of axis control and development of performance metrics. Chaumette in provides good insight into the two major problems with IBVS. One, servoing to a local minima and second, reaching a Jacobian singularity. The author show that image points alone do not make good features due to the occurrence of singularities. The paper continues, by discussing the possible additional checks to prevent singularities namely, condition numbers of J_s and Jˆ+_s, to check the null space of ˆ J_s and J^T_s . One main point that the author highlights is the relation between local minima and unrealizable image feature motions. Over the years many hybrid techniques have been developed. These involve computing partial/complete pose from Epipolar Geometry using multiple views or multiple cameras. The values are obtained by direct estimation or through a learning or a statistical scheme. While others have used a switching approach that changes between image-based and position-based on a Lyapnov function. The early hybrid techniques that used a combination of image-based and pose-based (2D and 3D information) approaches for servoing required either a full or partial model of the object in order to extract the pose information and used a variety of techniques to extract the motion information from the image. used an affine motion model from the image motion in addition to a rough polyhedral CAD model to extract the object pose with respect to the camera to be able to servo onto the object (on the lines of PBVS). 2-1/2-D visual servoing developed by Malis et al. is a well known technique that breaks down the information required for servoing into an organized fashion which decouples rotations and translations. The papers assume that the desired pose is known a priori. The rotational information is obtained from partial pose estimation, a homography, (essentially 3D information) giving an axis of rotation and the angle (by computing the eigenvalues and eigenvectors of the homography). The translational information is obtained from the image directly by tracking a set of feature points. The only conditions being that the feature points being tracked never leave the field of view and that a depth estimate be predetermined by some off-line technique. 2-1/2-D servoing has been shown to be more stable than the techniques that preceded it. Another interesting observation with this formulation is that the authors claim that the visual Jacobian will have no singularities during the motions. The hybrid technique developed by Corke and Hutchinson, popularly called portioned approach partitions the visual (or image) Jacobian into motions (both rotations and translations) relating X and Y axes and motions related to the Z axis. outlines the technique, to break out columns of the visual Jacobian that correspond to the Z axis translation and rotation (namely, the third and sixth columns). The partitioned approach is shown to handle the Chaumette Conundrum discussed in. This technique requires a good depth estimate in order to function properly. outlines a hybrid approach where the servoing task is split into two, namely main and secondary. The main task is keep the features of interest within the field of view. While the secondary task is to mark a fixation point and use it as a reference to bring the camera to the desired pose. The technique does need a depth estimate from an off-line procedure. The paper discusses two examples for which depth estimates are obtained from robot odometry and by assuming that all

Cloud Security Alliance

Cloud Security Alliance (CSA) is a not-for-profit organization with the mission to "promote the use of best practices for providing security assurance within cloud computing, artificial intelligence and to provide education on the uses of cloud computing to help secure all other forms of computing." The CSA has over 80,000 individual members worldwide. The CSA gained significant reputability in 2011 when the American Presidential Administration selected the CSA Summit as the venue for announcing the federal government’s cloud computing strategy. == History == The CSA was formed in December 2008 as a coalition by individuals who saw the need to provide objective enterprise user guidance on the adoption and use of cloud computing. Its initial work product, Security Guidance for Critical Areas of Focus in Cloud Computing, was put together in a Wiki-style by dozens of volunteers. In 2014, the Chairman of the Board of the CSA was Dave Cullinane, VP of Global Security and Privacy for Catalina Marketing, St. Petersburg, Florida, and former CISO for eBay. Cullinane has said, "If you have an application exposed to the Internet that will allow people to make money, it will be probed." == Profile == In 2009, the Cloud Security Alliance incorporated in Nevada as a Corporation and achieved US Federal 501(c)6 non-profit status. It is registered as a Foreign Non-Profit Corporation in Washington. == Policy maker support == The CSA works to support a number of global policy makers in their focus on cloud security initiatives including the National Institute of Standards and Technology (NIST), European Commission, Singapore Government, and other data protection authorities. In March 2012, the CSA was selected to partner with three of Europe’s largest research centers (CERN, EMBL and ESA) to launch Helix Nebula – The Science Cloud. == Size == The Cloud Security Alliance employs roughly sixty full-time and contract staff worldwide. It has several thousand active volunteers participating in research, working groups and chapters at any time. == Membership == According to CSA, they are a member-driven organization, chartered with promoting the use of best practices for providing security assurance within Cloud Computing, and providing education on the uses of Cloud Computing to help secure all other forms of computing. === Individuals === Individuals who are interested in cloud computing and have experience to assist in making it more secure receive a complimentary individual membership based on a minimum level of participation. === Chapters === The Cloud Security Alliance has a network of chapters worldwide. Chapters are separate legal entities from the Cloud Security Alliance, but operate within guidelines set down by the Cloud Security Alliance In the United States, Chapters may elect to benefit from the non-profit tax shield that the Cloud Security Alliance has. Chapters are encouraged to hold local meetings and participate in areas of research. Chapter activities are coordinated by the Cloud Security Alliance worldwide. === International scope === There are separate legal entities in Europe and Asia Pacific, called Cloud Security Alliance (Europe), a Scottish company in the United Kingdom, and Cloud Security Alliance Asia Pacific Ltd, in Singapore. Each legal entity is responsible for overseeing all Cloud Security Alliance-related activities in their respective regions. These legal entities operate under an agreement with Cloud Security Alliance that give it oversight power and have separate Boards of Directors. Both are companies Limited By Guarantee. The Managing Directors of each are members of the Executive Team of Cloud Security Alliance. == Areas of research == The Cloud Security Alliance has 25+ active working groups. Key areas of research include cloud standards, certification, education and training, guidance and tools, global reach, and driving innovation. Security Guidance for Critical Areas of Focus in Cloud Computing. Foundational best practices for securing cloud computing. Top Threats to Cloud Computing. Helps organizations make educated risk management decisions regarding their cloud adoption strategies. GRC (Governance, Risk and Compliance) Stack. A toolkit for key stakeholders to instrument and assess clouds against industry established best practices, standards and critical compliance requirements. Cloud Controls Matrix (CCM). Security controls framework for cloud provider and cloud consumers. CloudTrust Protocol. The mechanism by which cloud service consumers ask for and receive information about the elements of transparency as applied to cloud service providers. Consensus Assessments Initiative Research. Tools and processes to perform consistent measurements of cloud providers. Software Defined Perimeter. A proposed security framework that can be deployed to protect application infrastructure from network-based attacks. It will incorporate standards from organizations such as OASIS and NIST and security concepts from organizations like the U.S. DoD into an integrated framework. == Working groups and initiatives == Mobile Working Group Big Data Working Group Security as a Service Working Group Trusted Cloud Initiative CloudAudit CloudCERT CloudSIRT Cloud Metrics Security, Trust and Assurance Registry (STAR) Cloud Data Governance Turbot (business) Blockchain/Distributed Ledger

I-MSCP

i-MSCP (internet Multi Server Control Panel) was a free and open-source software for shared hosting environments management on Linux servers. It comes with a large choice of modules for various services such as Apache2, ProFTPd, Dovecot, Courier, Bind9, and can be easily extended through plugins, or listener files using its events-based API. Latest stable is the 1.5.3 version (build 2018120800) which has been released on 8 December 2018. The i-MSCP is no longer under development, although the developer has repeatedly claimed to be working on a new version, which has never has been published or even shown in any possible way. Whether development occurs or not, the current version of the software is not installable, as it only supports outdated versions of systems for which some of the necessary software to install i-MSCP cannot be installed. == Licensing == i-MSCP has a dual license. A part of the base code is licensed under the Mozilla Public License. All new code, and submissions to i-MSCP are licensed under the GNU Lesser General Public License Version 2.1 (LGPLv2). To solve this license conflict there is work on a complete rewrite for a completely LGPLv2 licensed i-MSCP. == Features == === Supported Linux Distributions === Debian Jessie (8.x), Stretch (9.x), Buster (10.x) Devuan Jessie (1.0), ASCII (2.x) Ubuntu Trusty Thar (14.04 LTS), Bionic Beaver (18.04 LTS) === Supported Daemons / Services === Web server: Apache (ITK, Fcgid and FastCGI/PHP-FPM), Nginx Name server: Bind9 MTA (Mail Transport Agent): Postfix MDA (Mail Delivery Agent): Courier, Dovecot Database: MySQL, MariaDB, Percona FTP-Server: ProFTPD, vsftpd Web statistics: AWStats === Addons === PhpMyAdmin Pydio, formerly AjaXplorer Net2ftp Roundcube Rainloop == Competing software == cPanel DTC Froxlor ISPConfig ispCP OpenPanel hestiacp Plesk SysCP Virtualmin

List of Ruby software and tools

This is a list of software and programming tools for the Ruby programming language, which includes libraries, web frameworks, implementations, tools, and related projects. == Web tools == Capistrano (software) – remote server automation tool Mongrel – Ruby web server Rack – interface between web servers and web applications Ruby on Rails – full-stack web application framework Sinatra – lightweight Ruby web application framework Spree Commerce – e-commerce platform WEBrick – Ruby HTTP server toolkit == Libraries == BioRuby – bioinformatics and computational biology library for Ruby Bogus – Ruby library for creating reliable test doubles with contract verification ERuby – embedded Ruby templating EventMachine – event-driven I/O library Factory Bot – test fixtures library Fat comma – Ruby library for JSON-like hash syntax Geocoder – Ruby library for geocoding and reverse geocoding addresses Haml – HTML templating engine Markaby – HTML generation via Ruby Nokogiri – XML/HTML parsing library RSpec – behavior-driven testing framework for Ruby RubyGems – package manager for Ruby libraries and applications Sass – CSS preprocessor Sidekiq – background job framework for Ruby, used to handle asynchronous tasks. Uconv – Unicode text conversion library Watir – web application testing framework == Ruby implementations == HotRuby – Ruby interpreter implemented in JavaScript, enabling Ruby code to run in web browsers. IronRuby – Ruby for .NET platform JRuby – Ruby on the Java Virtual Machine MacRuby – Ruby implementation for macOS Mod ruby – Apache module that embeds the Ruby interpreter to improve performance of Ruby web applications Mruby – lightweight Ruby interpreter Rubinius – alternative Ruby implementation, based loosely on the Smalltalk-80 Blue Book design. Ruby MRI – the standard Ruby interpreter YARV – "Yet Another Ruby VM," the bytecode interpreter used in modern Ruby implementations == Tools == Homebrew – package manager for macOS and Linux written in Ruby Pry – interactive Ruby shell Rake – build and task management Ruby Version Manager – environment manager RubyCocoa – bridge between Ruby and Cocoa RubyForge – project hosting site RubyMotion – for iOS/macOS development RubySpec – language specification tests == Integrated Development Environments == Aptana Studio — integrated RadRails plugin for Ruby on Rails development Eclipse DLTK Ruby Plugin — Ruby development plugin for Eclipse Eric — open-source Python-based IDE with Ruby support Komodo IDE — commercial cross-platform IDE with Ruby support RubyMine — commercial IDE for Ruby and Rails by JetBrains SlickEdit — commercial cross-platform IDE with Ruby support == List of websites using Ruby on Rails == Airbnb Basecamp Diaspora – decentralized social network application built with Ruby on Rails Discourse – open-source discussion platform built with Ruby on Rails Fiverr GitHub Hulu Shopify SoundCloud Twitch Zendesk

Data Science and Predictive Analytics

The first edition of the textbook Data Science and Predictive Analytics: Biomedical and Health Applications using R, authored by Ivo D. Dinov, was published in August 2018 by Springer. The second edition of the book was printed in 2023. This textbook covers some of the core mathematical foundations, computational techniques, and artificial intelligence approaches used in data science research and applications. By using the statistical computing platform R and a broad range of biomedical case-studies, the 23 chapters of the book first edition provide explicit examples of importing, exporting, processing, modeling, visualizing, and interpreting large, multivariate, incomplete, heterogeneous, longitudinal, and incomplete datasets (big data). == Structure == === First edition table of contents === The first edition of the Data Science and Predictive Analytics (DSPA) textbook is divided into the following 23 chapters, each progressively building on the previous content. === Second edition table of contents === The significantly reorganized revised edition of the book (2023) expands and modernizes the presented mathematical principles, computational methods, data science techniques, model-based machine learning and model-free artificial intelligence algorithms. The 14 chapters of the new edition start with an introduction and progressively build foundational skills to naturally reach biomedical applications of deep learning. Introduction Basic Visualization and Exploratory Data Analytics Linear Algebra, Matrix Computing, and Regression Modeling Linear and Nonlinear Dimensionality Reduction Supervised Classification Black Box Machine Learning Methods Qualitative Learning Methods—Text Mining, Natural Language Processing, and Apriori Association Rules Learning Unsupervised Clustering Model Performance Assessment, Validation, and Improvement Specialized Machine Learning Topics Variable Importance and Feature Selection Big Longitudinal Data Analysis Function Optimization Deep Learning, Neural Networks == Reception == The materials in the Data Science and Predictive Analytics (DSPA) textbook have been peer-reviewed in the Journal of the American Statistical Association, International Statistical Institute’s ISI Review Journal, and the Journal of the American Library Association. Many scholarly publications reference the DSPA textbook. As of January 17, 2021, the electronic version of the book first edition (ISBN 978-3-319-72347-1) is freely available on SpringerLink and has been downloaded over 6 million times. The textbook is globally available in print (hardcover and softcover) and electronic formats (PDF and EPub) in many college and university libraries and has been used for data science, computational statistics, and analytics classes at various institutions.

Discrete skeleton evolution

Discrete Skeleton Evolution (DSE) describes an iterative approach to reducing a morphological or topological skeleton. It is a form of pruning in that it removes noisy or redundant branches (spurs) generated by the skeletonization process, while preserving information-rich "trunk" segments. The value assigned to individual branches varies from algorithm to algorithm, with the general goal being to convey the features of interest of the original contour with a few carefully chosen lines. Usually, clarity for human vision (aka. the ability to "read" some features of the original shape from the skeleton) is valued as well. DSE algorithms are distinguished by complex, recursive decision-making processes with high computational requirements. Pruning methods such as by structuring element (SE) convolution and the Hough transform are general purpose algorithms which quickly pass through an image and eliminate all branches shorter than a given threshold. DSE methods are most applicable when detail retention and contour reconstruction are valued. == Methodology == === Pre-processing === Input images will typical contain more data than is necessary to generate an initial skeleton, and thus must be reduced in some way. Reducing the resolution, converting to grayscale, and then binary by masking or thresholding are common first steps. Noise removal may occur before and/or after converting an image to binary. Morphological operations such as closing, opening, and smoothing of the binary image may also be part of pre-processing. Ideally, the binarized contour should be as noise-free as possible before the skeleton is generated. === Skeletonization === DSE techniques may be applied to an existing skeleton or incorporated as part of the skeleton growing algorithm. Suitable skeletons may be obtained using a variety of methods: Thinning algorithms, such as the Grassfire transform Voronoi diagram Medial Axis Transform or Symmetry Axis Transform Distance Mapping === Significance Measures === DSE and related methods remove entire spurious branches while leaving the main trunk intact. The intended result is typically optimized for visual clarity and retention of information, such that the original contour can be reconstructed from the fully pruned skeleton. The value of various properties must be weighted by the application, and improving the efficiency is an ongoing topic of research in computer vision and image processing. Some significance measures include: Discrete Bisector Function Contour length Bending Potential Ratio Discrete Curve Evolution === Iteration === Each branch is evaluated during a pass through the skeletonized image according to the specific algorithm being used. Low value branches are removed and the process is repeated until a desired threshold of simplicity is reached. === Reconstruction === If all points on the output skeleton are the center points of maximal disks of the image and the radius information is retained, a contour image can be reconstructed == Applications == === Handwriting and text parsing === Variability in hand-written text is an ongoing challenge, simplification makes it somewhat easier for computer vision algorithms to make judgements about intended characters. === Soft body classification (animals) === The maximal disks centered on the skeleton imply roughly spherical masses, the features of the extracted skeleton are relatively unchanged even as the soft body deforms or self-occludes. Skeleton information is one facet of determining whether two animals are the "same" some way, though it must usually be paired with another technique to effectively identify a target. === Medical uses === Investigation of organs, tissue damage and deformation caused by disease.