ACM Sixteenth Conference on Information and Knowledge Management (CIKM)

CIKM 2007
Lisboa, Portugal

 
 
 

 

Keynote Speakers  
Prabhakar Raghavan
Head of Yahoo! Research
Web Search: From Information Retrieval to Microeconomic Modeling -
In scarcely a decade, web search has gone from simply scaling traditional information retrieval to a groundswell of new opportunities that are changing marketing as we know it. In this lecture we begin by reviewing this progress, pointing out that web search is no longer a purely computer science problem. We then hint at the role of other disciplines in this ongoing revolution and at a number of directions for research.

Prabhakar Raghavan has been Head of Yahoo! Research since July 2005. His research interests include text and web mining, and algorithm design. He is a Consulting Professor of Computer Science at Stanford University and Editor-in-Chief of the Journal of the ACM. Raghavan received his Ph.D. from Berkeley and is a Fellow of the ACM and of the IEEE. Prior to joining Yahoo!, he was Senior Vice-President and Chief Technology Officer at Verity; before that he held a number of technical and managerial positions at IBM Research.

 

Dan Suciu
University of Washington, USA

Management of Data with Uncertainties -
Many applications today need to manage large volumes of uncertain data, such as fuzzy object matchings, uncertain schema mappings, data extracted by IE systems, exploratory queries in databases, and RFID and sensor data. In this talk I will discuss probabilistic databases as a unified framework for managing large volumes of uncertain data. The central problem in probabilistic databases is the query evaluation problem, which is a particular instance of probabilistic inference. I will describe a general approach to evaluating SQL queries over large probabilistic databases that reuses much of the existing query processing and query optimization infrastructure. Then I will discuss future research directions in management of probabilistic data.

Dan Suciu is a professor at the University of Washington. He received his Ph.D. from the University of Pennsylvania in 1995, then was a principal member of the technical staff at AT&T Labs until he joined the University of Washington in 2000. Suciu conducts research in data management, with an emphasis on topics that arise from sharing data on the Internet, such as management of semistructured and heterogeneous data, data security, and managing data with uncertainties. He is a co-author of the book Data on the Web: From Relations to Semistructured Data and XML, holds six US patents, received the 2000 ACM SIGMOD Best Paper Award, and is a recipient of the NSF Career Award and of an Alfred P. Sloan Fellowship.

 

Fernando Pereira
University of Pennsylvania, USA
Learning to Join Everything -
Text, speech, images, video, and DNA sequences provide information about entities that people can recognize when looking at a particular instance. But those entities and their attributes and relationships are not directly accessible to queries that join across types of sources. Information extraction methods based on supervised machine learning recognize mentions of entities and relationships of predefined types in different kinds of sources, which can then be used to answer some useful types of queries. However, supervised learning relies on hand-annotated training sets that are difficult to create and that limit what types of entities and relationships can be joined for new applications. These limitations have prompted research into unsupervised extraction methods that rely on correlations among sources rather than hand-annotated training sets. While these methods are not yet as accurate as those based on supervised learning, they have the potential for a new query-by-example approach to information integration in which seed sets of query answers are expanded into ranked lists of potential answers by learning occurrence patterns from the seed answers. I will give examples of both types of methods from our research on biomedical information extraction, leading to some ideas on a possible convergence of search and databases through machine learning.

Fernando Pereira is the Andrew and Debra Rachleff Professor and chair of the Department of Computer and Information Science, University of Pennsylvania. He received a Ph.D. in Artificial Intelligence from the University of Edinburgh in 1982. Before joining Penn, he held industrial research and management positions at SRI International, at AT&T Labs, where he led the machine learning and information retrieval research department from September 1995 to April 2000, and at WhizBang Labs, a Web information extraction company. His main research interests are in machine-learnable models of language and other natural sequential data such as biological sequences. He made major contributions to advances in finite-state models for speech and text processing now in everyday industrial use. He has over 100 research publications on computational linguistics, speech recognition, machine learning and logic programming, and several issued and pending patents on speech recognition, language processing, and human-computer interfaces. He was elected Fellow of the American Association for Artificial Intelligence in 1991 for his contributions to computational linguistics and logic programming, and he is a past president of the Association for Computational Linguistics.

 

Last update: 12/02/2007