Pattern Discovery and the Algorithmics of Surprise

Alberto Apostolico
Purdue University and University of Padova

Friday, December 3, 2004
1065 Kemper Hall
3:10-4:00 p.m.


Abstract:

The problem of characterizing and detecting recurrent sequence patterns, such as substrings or motifs, and related associations or rules is variously pursued in order to compress data, unveil structure, infer succinct descriptions, extract and classify features, etc. In molecular biology, such regularities have been implicated in various facets of biological function and structure. The discovery, particularly on a massive scale, of significant patterns and correlations thereof poses interesting methodological and algorithmic problems, and often exposes scenarios in which tables and descriptors grow faster and bigger than the phenomena they are meant to encapsulate.

This talk reviews some results at the crossroads of statistics, pattern matching and combinatorics on words that enable us to control such paradoxes, and presents some related constructions, implementations and empirical results.