Wednesday, October 27, 2004, 12:00pm, EB 3105
Speaker: Douglas W. Oard, University of Maryland, College Park
Title: Using Oral History to Learn About Searching Spontaneous Conversational Speech
Abstract:
Recent dramatic improvements in the accuracy of automatic transcription for
spontaneous conversational speech hold the promise to unlock access to vast
quantities of spoken language sources. Realizing that promise requires that we
develop search technology that is tuned to the nature of conversational speech.
In this talk, I will describe a first effort to do just that, leveraging a large
collection of "oral history" interviews for which a uniquely rich collection of
associated metadata is available. I'll briefly describe the status of our work
on speech recognition, topic segmentation, and text classification. I'll then
focus on the process that we have used to build an information retrieval test
collection, and our results from initial experiments with that collection. I'll
conclude by explaining how those results are helping to guide future work on
speech recognition, and our plans for building test collections for languages
other than English. This is joint work with Charles University, the IBM TJ
Watson Research Center, the Johns Hopkins University, the Survivors of the Shoah
Visual History Foundation, and the University of West Bohemia.
About the speaker: Douglas Oard is an Associate Professor at the University of
Maryland, College Park, with a joint appointment in the College of Information
Studies and the Institute for Advanced Computer Studies. He holds a Ph.D. in
Electrical Engineering from the University of Maryland, and his research
interests center on the use of emerging technologies to support information
seeking by end users. Dr. Oard's recent work has focused on cross-language
information retrieval, searching spoken language collections, data mining from
text, and the exchange of ratings by networked users. Additional information is
available at http://www.glue.umd.edu/~oard/.