Theory and Applications of Bayesian Co-Training
Fall 2008 CSE Colloquium Series
Dr. Shipeng YU
Staff Scientist
Siemens Medical Solutions USA, Inc.
Date: October 17
Time: 11:00 AM - 12:00 PM
1260 Anthony Hall
Host: Rong Jin
Abstract
Co-training is a popular algorithm for semi-supervised classification and has been applied to many real-world problems. When the input data have multiple representations, or views (e.g., each web page has its text as one view and the hyperlinks from other pages as another), co-training works by iteratively labeling some of the unlabeled data with a classifier trained on each view and adding the newly labeled examples to the training set. In this talk we present our recent work on Bayesian co-training, an undirected graphical model for co-training. The model clarifies some previously unclear assumptions about co-training and includes standard co-training and many of its extensions (e.g., co-regularization) as special cases. We also introduce a co-training kernel in a Gaussian process (GP) framework, which allows efficient learning with a one-step, globally optimal solution. We then discuss an important application of Bayesian co-training called active sensing, in which we are allowed to select a previously unobserved (data, view) pair to acquire so that overall performance is optimized. Experiments on web page classification and some medical applications will be presented at the end of the talk.
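For readers unfamiliar with the standard co-training loop mentioned in the abstract, the following is a minimal sketch of the iterative labeling idea only. It is not the Bayesian co-training model presented in the talk. The toy nearest-centroid classifier, the one-dimensional views, and all names (fit_centroids, predict, co_train, rounds, per_round) are assumptions made for illustration.

```python
def fit_centroids(xs, ys):
    """Toy classifier: store the per-class mean of a 1-D feature."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


def predict(centroids, x):
    """Return (label, confidence) for x.

    The label is the nearest centroid's class; the confidence is the gap
    between the distances to the two nearest centroids.
    """
    dists = sorted((abs(x - m), c) for c, m in centroids.items())
    label = dists[0][1]
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return label, margin


def co_train(view1, view2, labels, rounds=10, per_round=2):
    """Classic co-training sketch (hypothetical helper, not from the talk).

    view1, view2: lists of 1-D features, one entry per example.
    labels: dict mapping example index -> class for the labeled seed set.
    Each round, a classifier trained on each view labels its most
    confident unlabeled examples, and those join the training set.
    """
    labeled = dict(labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(view1)) if i not in labeled]
        if not unlabeled:
            break
        new = {}
        for view in (view1, view2):
            idx = list(labeled)
            model = fit_centroids([view[i] for i in idx],
                                  [labeled[i] for i in idx])
            scored = sorted(((predict(model, view[i]), i) for i in unlabeled),
                            key=lambda t: -t[0][1])
            for (label, _), i in scored[:per_round]:
                new.setdefault(i, label)  # first view to claim an example wins
        labeled.update(new)
    return labeled


# Toy data: 10 examples per class, two well-separated views.
view1 = [0.1 * k for k in range(10)] + [5.0 + 0.1 * k for k in range(10)]
view2 = [-3.0 - 0.1 * k for k in range(10)] + [3.0 + 0.1 * k for k in range(10)]
truth = [0] * 10 + [1] * 10

# Start from one labeled example per class.
result = co_train(view1, view2, {0: 0, 10: 1})
```

In this sketch each view's classifier contributes its most confident predictions to a shared labeled set, which is the "enlarging the training set" step described above. A real application would replace the toy classifier with a proper model per view.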
Biography
Shipeng YU is currently a staff scientist at Siemens Medical Solutions USA, Inc. He received his B.Sc. and M.Sc. degrees in mathematics from Peking University in 2000 and 2003, respectively, and completed his Ph.D. in computer science at the University of Munich, Germany, in 2006. He has worked in many areas of statistical machine learning, such as Gaussian processes, Dirichlet processes, probabilistic dimensionality reduction, ordinal regression, multi-task learning, and semi-supervised learning. He is also interested in machine learning applications in medical data mining, information and image retrieval, and user modeling.