Statistical language modeling for information access

The purpose of this tutorial is to systematically explain the application of statistical language models to information retrieval, with an emphasis on the underlying principles and framework, empirically effective models, and language models developed for a broad range of retrieval tasks, both traditional and non-traditional. Students can expect to learn the major principles and methods of applying statistical language models to information retrieval, become familiar with the outstanding problems in this area, and obtain comprehensive pointers to the research literature.

- Lecture 1: general retrieval modeling and evaluation principles; introduction to language modeling.
- Lecture 2: estimation, smoothing methods, mixture models, and applications to retrieving (semi)structured documents.
- Lecture 3: incorporating symbolic knowledge, lexical relations and context within a language modeling setting.
- Lecture 4: language modeling approaches to tasks at the interface of IR and IE.
- Lecture 5: ongoing developments and prominent research questions.

Using the Lemur Toolkit, students will gain hands-on experience with a popular retrieval platform based on language modeling principles. Students will be expected to bring laptops.

