Data Mining for Networks - The Good and the Bad
Stanley Wasserman

Department of Statistics and Department of Psychological and Brain Sciences
Indiana University


Abstract

Data mining of network data often focuses on classification methods from machine learning, statistics, and pattern recognition perspectives. These techniques have been described by many researchers, yet many of them are unaware of the rich history of classification and clustering techniques originating in social network analysis.

The growth of rich social media, on-line communities, and collectively produced knowledge resources has greatly increased the need for good analytic techniques for social networks. We now have the opportunity to analyze social network data at unprecedented levels of scale and temporal resolution; this has led to a growing body of research at the intersection of computing, statistics, and the social and behavioral sciences.

This talk discusses some of the current challenges in the analysis of large-scale social network data, focusing on the inference of social processes from data. The invasion of network science by computer scientists has produced much interesting research, both good and bad.

Short bio

Stan Wasserman, an applied statistician, joined the Departments of Sociology and Psychology at Indiana University in Bloomington in Fall 2004, as Rudy Professor of Statistics, Psychology, and Sociology. He also has an appointment in the Karl F. Schuessler Institute for Social Research. Prior to moving to Indiana, he held faculty positions at Carnegie-Mellon University, University of Minnesota, and University of Illinois, in the disciplines of Statistics, Psychology, and Sociology. At Illinois, he was also a part-time faculty member in the Beckman Institute for Advanced Science and Technology, and he has had visiting appointments at Columbia University and the University of Melbourne. In 2005, he helped create the new Department of Statistics in Bloomington, and became its first chair in 2006.

Wasserman is best known for his work on statistical models for social networks and for his text, co-authored with Katherine Faust, Social Network Analysis: Methods and Applications. His other books have been published by Sage Publications and Cambridge University Press. He has published widely in sociology, psychology, and statistics journals, and has been elected to a variety of leadership positions in the Classification Society of North America and the American Statistical Association. He teaches courses on applied statistics.

He is a fellow of the Royal Statistical Society, and an honorary fellow of the American Statistical Association and the American Association for the Advancement of Science. He has been an Associate Editor of a variety of statistics and methodological journals (Psychometrika, Journal of the American Statistical Association, Sociological Methodology, to name a few), as well as the Book Review Editor of Chance. His research has been supported over the years by NSF, ONR, ARL, and NIMH.

Wasserman was also Chief Scientist of Visible Path Corporation in Foster City, California, a software firm engaged in developing social network analysis for corporate settings. He currently blogs at http://www.iq.harvard.edu/blog/netgov/. He was educated at the University of Pennsylvania (receiving two degrees in 1973) and Harvard University (Ph.D., in Statistics, 1977).

Website: http://mypage.iu.edu/~stanwass/