Computer Science Seminar: Dr. Wolfgang Gatterbauer from Carnegie Mellon University
Host: Norm Matloff
When: Thursday, April 6th at 3:10pm
Where: 1131 Kemper Hall
Repurposing relational databases for approximate probabilistic inferencef
Performing inference over large uncertain data sets has become a central problem in database management with applications in related fields, such as machine learning, AI and statistics. With recent probabilistic knowledge bases having millions to billions of uncertain tuples, there is a pressing need for general-purpose and scalable solutions. Since general reasoning under uncertainty is highly intractable, many state-of-the-art systems perform approximate inference with sampling today.
In this talk, I show an alternative approach that uses only basic operators of relational database management systems (i.e., no sampling required). The first part develops optimal oblivious bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new probabilities. The second part then uses these bounds to reduce the problem of approximate inference over probabilistic databases to a standard multi-query evaluation problem. I give experimental evidence that this approach can be orders of magnitude faster and also more accurate than sampling-based approaches for ranking top query answers.
Wolfgang Gatterbauer is an Assistant Professor in the Tepper School of Business at Carnegie Mellon University, and by courtesy in the Computer Science Department of Carnegie Mellon University. He received his PhD in Computer Science from Vienna University of Technology and did a Post-Doc with Dan Suciu at University of Washington. Wolfgang’s work focuses on ways to extend the capabilities of modern data management systems to support new forms of data, in particular uncertain data. He is the recipient of a CAREER award from the National Science Foundation and a “best-of-conference” mention from VLDB 2015.
1131 Kemper Hall