Entropy Estimation of Network Traffic Flow Data

Mitsunori Ogihara

University of Rochester

Thursday, April 26, 2007
Talk: 11:00 am - 12:00 pm
3105 Engineering

Host: Eric Torng

Abstract:

Analysis of network traffic flow data is a subject of growing importance. It poses many interesting challenges, partly because network traffic data are large and arrive and disappear rapidly. Research in this area is expected to have a significant impact not only on network monitoring and protection against attacks but also on other domains with a high volume of data traffic. Space- and time-efficient algorithms have been proposed for a variety of problems, such as iceberg queries and frequency moments. This talk is concerned with the problem of estimating entropy, whose use has recently been suggested in the networking community for problems such as anomaly detection. Using a result from communication complexity, it can be argued that both approximation and randomness are required to estimate the entropy efficiently. Two algorithms for this problem will be presented. The first uses the idea behind the celebrated frequency-moment estimation algorithm of Alon, Matias, and Szegedy. The second combines the first with the Elephant/Ant approach of Estan and Varghese. The talk will also report an empirical comparison of these algorithms against some standard approaches on real network data sets.
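For readers unfamiliar with sampling-based estimation, the following is a minimal Python sketch of how an Alon-Matias-Szegedy-style sampling idea can be applied to entropy. It is an illustration under stated assumptions, not the algorithm presented in the talk: the function names are hypothetical, the whole stream is held in a list rather than processed in one pass, and plain averaging stands in for any more careful variance reduction. The sketch picks a random position, counts how often that item occurs from that position onward, and turns the count into an unbiased estimate of the empirical entropy.

```python
import math
import random
from collections import Counter


def exact_entropy(stream):
    """Empirical entropy (in bits) of the item distribution in `stream`."""
    m = len(stream)
    counts = Counter(stream)
    return sum((c / m) * math.log2(m / c) for c in counts.values())


def _f(x, m):
    """Contribution of an item with frequency x to the entropy; f(0) = 0."""
    return 0.0 if x == 0 else (x / m) * math.log2(m / x)


def basic_estimate(stream, j):
    """Single-sample estimate from position j (hypothetical helper).

    r is the number of occurrences of stream[j] at positions j..m-1.
    The value m * (f(r) - f(r-1)) telescopes, so its mean over a
    uniformly random j equals the exact empirical entropy.
    """
    m = len(stream)
    item = stream[j]
    r = sum(1 for x in stream[j:] if x == item)
    return m * (_f(r, m) - _f(r - 1, m))


def sampled_entropy_estimate(stream, num_samples, rng):
    """Average `num_samples` independent single-sample estimates."""
    m = len(stream)
    total = 0.0
    for _ in range(num_samples):
        total += basic_estimate(stream, rng.randrange(m))
    return total / num_samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic skewed "flow" stream; the data are made up for illustration.
    stream = [rng.choice("aaaaaaabbbcd") for _ in range(2000)]
    print("exact    :", exact_entropy(stream))
    print("estimated:", sampled_entropy_estimate(stream, 500, rng))
```

In a genuine streaming setting the random position would be chosen by reservoir sampling and the count maintained incrementally, so that only a few counters per sample are stored; the list-based version above trades that efficiency for readability.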

If time permits, the talk will cover a second, fun topic: music information retrieval. Many people listen to music through computers and portable digital music players. The digital music collection of such a listener can be very large and span many genres and styles, which makes it cumbersome to classify, organize, and retrieve music from it. One of the ultimate goals of music information retrieval is to develop efficient algorithms for these tasks by learning from data. The talk will present some recent advances in the area, including the use of wavelet coefficient histograms for genre classification from audio data, the detection of emotions aroused in listeners, and semi-supervised learning of artist groups using features from audio and lyrics data.

This is joint work with Ashwin Lall, Qi Li, Tao Li, Vyas Sekar, Jun Xu, and Hui Zhang.