Graphs everywhere: Novel methods for summarization and
natural language processing
Date:
Host: J. Chai
Abstract: Graph-based
representations have proven to be a useful tool for natural language
processing and machine learning. Recent successes include Brin and Page's
PageRank method for ranking Web pages and Zhu, Ghahramani, and Lafferty's
semi-supervised learning methods using harmonic functions. In this talk, I will
present a framework (and two demos) for natural language processing using
random walks on graphs. The first part introduces the concept of lexical
centrality, based on random walks on lexical similarity graphs. Lexical
centrality is used to find the most important passages in a collection of
textual documents. The second part will discuss some work in progress on
semi-supervised learning with binary features, with applications to natural
language problems such as parsing. In both cases I will show state-of-the-art
results on competitive challenges. I will also show two publicly available
demos that illustrate the concepts of the talk.
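
As a rough illustration of the lexical centrality idea described in the abstract, the sketch below scores sentences by a random walk on a lexical similarity graph. It is not the speaker's implementation. The bag-of-words cosine similarity, the similarity threshold of 0.1, the damping factor of 0.85, and the function names `cosine` and `lexical_centrality` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexical_centrality(sentences, threshold=0.1, damping=0.85, iterations=100):
    """Score sentences by the stationary distribution of a damped random walk.

    Sentences are nodes; two sentences are linked when their cosine
    similarity exceeds `threshold` (an assumed value).
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    # Unweighted adjacency matrix of the lexical similarity graph.
    adj = [
        [1.0 if i != j and cosine(bags[i], bags[j]) > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Every node receives the uniform "teleport" share first.
        new = [(1.0 - damping) / n] * n
        for i in range(n):
            degree = sum(adj[i])
            if degree == 0:
                # Isolated sentence: spread its mass uniformly over all nodes.
                for j in range(n):
                    new[j] += damping * scores[i] / n
            else:
                # Otherwise split its mass evenly among its neighbours.
                for j in range(n):
                    if adj[i][j]:
                        new[j] += damping * scores[i] / degree
        scores = new
    return scores


if __name__ == "__main__":
    docs = [
        "graph random walks rank passages",
        "random walks on a graph rank sentences",
        "lexical graph random walks",
        "bananas are yellow",
    ]
    for sentence, score in zip(docs, lexical_centrality(docs)):
        print(f"{score:.3f}  {sentence}")
```

Sentences that share vocabulary with many others accumulate more of the walk's probability mass, so the highest-scoring sentences are candidates for the most important passages in the collection.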
Biography: Dragomir R. Radev
is an Associate Professor of Information, Electrical Engineering and Computer
Science, and Linguistics at the
Dr.
Radev's current research on probabilistic and link-based methods for exploiting
very large textual repositories, graph-based methods for natural language
processing, representing and acquiring knowledge of genome regulation, and
semantic entity and relation extraction from Web-scale text document
collections is supported by NSF and NIH. He serves on the HLT-NAACL advisory committee, was recently reelected as
treasurer of NAACL, is a member of the editorial boards of JAIR and Information
Retrieval, and is a four-time finalist at the ACM programming finals (as
contestant in 1993 and as coach in 1995-1997).
Dragomir
received a graduate teaching award at