Dr. Yan Yan
Research Fellow
University of Michigan
Friday, November 10, 2017
11 AM - 12 PM
EB 3105
Abstract:
Multi-task learning, an important branch of machine learning, has developed rapidly over the past decade. Multi-task
learning methods aim to simultaneously learn classification or regression models for a set of related tasks. This typically
leads to better models than a learner that does not account for task relationships. In this talk, we will investigate
a multi-task learning framework for head pose estimation and actor-action segmentation. (1) Head pose estimation from
low-resolution surveillance data has gained importance. However, monocular and multi-view head pose estimation approaches
still perform poorly when the target is in motion, as facial appearance is distorted by changes in camera perspective and scale as a person
moves around. We propose FEGA-MTL, a novel framework based on multi-task learning for classifying the head pose of a person who
moves freely in an environment monitored by multiple large-field-of-view surveillance cameras. Upon partitioning the monitored
scene into a dense uniform spatial grid, FEGA-MTL simultaneously clusters grid partitions into regions with similar facial
appearance, while learning region-specific head pose classifiers. (2) Fine-grained activity understanding in videos has
recently attracted considerable attention, with a shift from action classification to detailed actor and action understanding
that serves the perceptual needs of cutting-edge autonomous systems. However, current methods for detailed
understanding of actor and action have significant limitations: they require large amounts of finely labeled data, and they fail
to capture internal relationships among actors and actions. To address these issues, we propose a novel, robust multi-task
ranking model for weakly-supervised actor-action segmentation where only video-level tags are given for training samples. Our
model is able to share useful information among different actors and actions while learning a ranking matrix to select
representative supervoxels for actors and actions, respectively.
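
As background for the talk, the following is a minimal sketch of the core multi-task idea described above: fitting one model per task while coupling the tasks through a shared regularizer, so that related tasks inform one another. It is an illustrative toy (regularized least squares with each task's weights pulled toward a shared mean), not the speaker's FEGA-MTL or ranking formulation; the function name, parameters, and alternating update scheme are assumptions made for this example.

import numpy as np

def multitask_ridge(Xs, ys, lam=1.0, mu=1.0, n_iter=50):
    """Jointly fit T related linear regressors.
    Xs, ys : per-task design matrices and target vectors.
    lam    : within-task ridge penalty.
    mu     : strength of the cross-task coupling toward the mean task.
    """
    d = Xs[0].shape[1]
    W = np.zeros((len(Xs), d))          # one weight vector per task
    for _ in range(n_iter):
        w_bar = W.mean(axis=0)          # shared "mean task"
        for t, (X, y) in enumerate(zip(Xs, ys)):
            # Minimizer of ||Xw - y||^2 + lam||w||^2 + mu||w - w_bar||^2:
            # (X'X + (lam + mu) I) w = X'y + mu * w_bar
            A = X.T @ X + (lam + mu) * np.eye(d)
            W[t] = np.linalg.solve(A, X.T @ y + mu * w_bar)
    return W

# Toy usage: three tasks whose true weights are small perturbations of each other.
rng = np.random.default_rng(0)
w_true = rng.normal(size=5)
Xs = [rng.normal(size=(20, 5)) for _ in range(3)]
ys = [X @ (w_true + 0.1 * rng.normal(size=5)) for X in Xs]
print(multitask_ridge(Xs, ys).shape)    # (3, 5)

Here mu controls how strongly tasks share information; per the abstract, FEGA-MTL refines this idea by also learning which tasks (spatial grid regions with similar facial appearance) should be grouped together before sharing.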
Biography:
Yan Yan is currently a research fellow in EECS at the University of Michigan, Ann Arbor. He received the Ph.D. degree in
computer science from the University of Trento, Italy, and the M.S. degree from the Georgia Institute of Technology. He was a
visiting scholar at Carnegie Mellon University in 2013 and a visiting research fellow at the Advanced Digital Sciences
Center (ADSC), UIUC, in Singapore in 2015. His research interests include computer vision, machine learning, and multimedia. He
received the Best Student Paper Award at ICPR 2014 and the Best Paper Award at ACM Multimedia 2015. He has published papers at
CVPR, ICCV, ECCV, AAAI, IJCAI, and ACM Multimedia, and in TPAMI. He has served as a program committee member for several major
conferences and as a reviewer for refereed journals in computer vision and multimedia. He has served as a guest editor for
TPAMI, CVIU, and TOMM. He is a member of the
IEEE and the ACM.
Host:
Dr. Xiaoming Liu