Improving query results using answer corroboration

Amelie Marian, Assistant Professor
Computer Science Department at Rutgers University


In many scenarios, answers to user queries are not trustworthy because the underlying information is erroneous, misleading, biased, or outdated. In such cases, the existence of several similar answers can be viewed as corroborating evidence, increasing the quality of the corresponding information. In this talk, I will first discuss the use of corroborating evidence in a data cleaning scenario, where the underlying database(s) contain poor-quality data, caused by erroneous data entry or data integration mismatches and characterized by violations of integrity constraints, such as keys and functional dependencies, within and across databases. Specifically, I will present the Multiple Join Path (MJP) framework, which associates quality scores with candidate answers in two steps: it first scores individual data paths between a pair of field values, taking into account data quality with respect to the specified integrity constraints, and then agglomerates scores across the multiple data paths that serve as corroborating evidence for a candidate answer. I will show novel techniques for finding the top-few (highest-quality) answers in the MJP framework. I will also discuss the use of corroborating evidence to improve web search, saving users from having to rummage through large quantities of information to compare answers, and highlight future research directions in that area.
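
The two-step scoring idea described above (score each data path, then combine the scores of all paths that support the same candidate answer) can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the MJP framework itself: a path is scored as the product of hypothetical per-edge quality values in [0, 1], path scores are combined with a noisy-OR, and the names `path_score`, `corroborated_score`, and `top_few` as well as the sample data are invented. The sketch also ranks candidates by brute force, so it does not represent the top-few techniques presented in the talk.

```python
import heapq
from math import prod


def path_score(edge_qualities):
    """Score one data path as the product of its per-edge qualities (assumed rule)."""
    return prod(edge_qualities)


def corroborated_score(path_scores):
    """Combine path scores with a noisy-OR (assumed rule): 1 - prod(1 - s)."""
    return 1.0 - prod(1.0 - s for s in path_scores)


def top_few(candidates, k):
    """Return the k candidate answers with the highest corroborated score."""
    scored = {
        answer: corroborated_score([path_score(p) for p in paths])
        for answer, paths in candidates.items()
    }
    return heapq.nlargest(k, scored.items(), key=lambda item: item[1])


# Hypothetical data: each candidate answer maps to the data paths that reach it,
# and each path is a list of per-edge quality values.
candidates = {
    "555-0100": [[0.9, 0.8], [0.7, 0.9]],  # two corroborating paths
    "555-0199": [[0.95, 0.9]],             # one strong path
    "555-0142": [[0.5, 0.4]],              # one weak path
}

ranked = top_few(candidates, 2)
for answer, score in ranked:
    print(f"{answer}: {score:.4f}")
```

In this toy data, the answer reached by two moderately reliable paths ranks above the answer reached by a single stronger path, which is the effect corroboration is meant to capture. A different combination rule, such as taking the maximum or the sum of path scores, would change the ranking behavior.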