- Towards Effective Search and Knowledge Discovery in Online Health Forums
We are investigating a patient-centered approach to information extraction,
classification, and integration to support effective search and knowledge
discovery in healthcare forums.
See more
- XSEEK: an Intelligent Search Engine for Semi-Structured Data
We are developing a search engine for databases. We are mapping out the
spectrum of problems in supporting keyword search on structured and
semi-structured data, ranging from evaluation frameworks and generating
high-quality results to helping users analyze results, and we are developing
techniques to address the open challenges. More Information about XSEEK
SIGMOD 2009 Tutorial: Keyword Search on Structured and Semi-structured Data.
Slides: pptx/ppt
ICDE 2011 Tutorial: Keyword-based Search and Exploration on Databases.
Slides: pptx
- Querying Incomplete and Inconsistent Web Databases
We are developing techniques for querying web databases when user queries are
imprecise and the data is inconsistent. More Information about the Project
- ExpertNet: Collaboration Network for Intelligent Social Computing
We are developing computational foundations and quantitative frameworks to
model, optimize, and search collaborative social networks to expedite
problem-solving and innovation. More Information about ExpertNet
- SWAN: Smart Workflow Management
We are developing techniques for workflow management, including workflow
modeling, provenance reasoning, workflow search, and optimization. The work
covers both scientific workflows and business processes, and regular as well
as ad-hoc workflows. More Information about SWAN and its sub-project
SmartFlow, which specifically manages ad-hoc workflows.
Overview:
Traditionally, information extraction systems are implemented as a pipeline of
special-purpose processing modules, which requires extraction to be re-applied
from scratch to the entire text corpus whenever the data, processing modules, or
extraction goals change. We propose an innovative paradigm for information
extraction: the parse trees produced by natural language processing of textual
documents are stored in a database, and extraction is then expressed as queries
in our proposed structured query language on databases. Such a paradigm has
several advantages:
- avoiding writing special-purpose extraction programs,
- leveraging query optimization in databases,
- allowing incremental extraction upon changes.
Furthermore, to allow ordinary users to easily perform information extraction or
keyword search on a corpus without learning the structured query language, we
are investigating techniques that automatically generate structured queries
from the user's keyword query and its pseudo-relevance feedback to obtain
high-quality results.
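
As a rough illustration of the paradigm described above (parse trees stored in a database, extraction expressed as queries), the following minimal sketch stores hand-built parse trees in SQLite and extracts subject-verb-object triples with a single query. It uses plain SQL rather than the project's proposed query language, and the table layout, the sample sentences, and the names `PARSE_NODES`, `build_db`, `add_document`, and `extract_relations` are all assumptions made for illustration only.

```python
import sqlite3

# Hand-built constituency parse trees standing in for real parser output
# (hypothetical sample data). Columns: doc_id, node_id, parent_id, label, word.
# word is None for inner nodes.
PARSE_NODES = [
    # Document 1: "aspirin inhibits prostaglandin"
    (1, 1, None, "S", None),
    (1, 2, 1, "NP", None),
    (1, 3, 2, "NN", "aspirin"),
    (1, 4, 1, "VP", None),
    (1, 5, 4, "VBZ", "inhibits"),
    (1, 6, 4, "NP", None),
    (1, 7, 6, "NN", "prostaglandin"),
    # Document 2: "ibuprofen reduces fever"
    (2, 1, None, "S", None),
    (2, 2, 1, "NP", None),
    (2, 3, 2, "NN", "ibuprofen"),
    (2, 4, 1, "VP", None),
    (2, 5, 4, "VBZ", "reduces"),
    (2, 6, 4, "NP", None),
    (2, 7, 6, "NN", "fever"),
]

# Extraction goal written as a query: S -> (NP -> NN) (VP -> VBZ (NP -> NN)).
EXTRACT_SQL = """
SELECT subj.word, verb.word, obj.word
FROM nodes AS s
JOIN nodes AS np1  ON np1.doc_id  = s.doc_id AND np1.parent_id  = s.node_id   AND np1.label  = 'NP'
JOIN nodes AS subj ON subj.doc_id = s.doc_id AND subj.parent_id = np1.node_id AND subj.label = 'NN'
JOIN nodes AS vp   ON vp.doc_id   = s.doc_id AND vp.parent_id   = s.node_id   AND vp.label   = 'VP'
JOIN nodes AS verb ON verb.doc_id = s.doc_id AND verb.parent_id = vp.node_id  AND verb.label = 'VBZ'
JOIN nodes AS np2  ON np2.doc_id  = s.doc_id AND np2.parent_id  = vp.node_id  AND np2.label  = 'NP'
JOIN nodes AS obj  ON obj.doc_id  = s.doc_id AND obj.parent_id  = np2.node_id AND obj.label  = 'NN'
WHERE s.label = 'S' AND verb.word = ?
ORDER BY s.doc_id
"""


def add_document(conn, rows):
    """Store the parse-tree nodes of newly parsed text; existing rows are untouched."""
    conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?, ?)", rows)


def build_db(rows):
    """Create an in-memory database holding one row per parse-tree node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes ("
        "doc_id INTEGER, node_id INTEGER, parent_id INTEGER, label TEXT, word TEXT)"
    )
    add_document(conn, rows)
    return conn


def extract_relations(conn, verb):
    """Return (subject, verb, object) triples for sentences using the given verb."""
    return conn.execute(EXTRACT_SQL, (verb,)).fetchall()


if __name__ == "__main__":
    conn = build_db(PARSE_NODES)
    print(extract_relations(conn, "inhibits"))
```

In this sketch, changing the extraction goal means changing only the query, and adding a document means calling `add_document` with its parse-tree rows; the stored trees of earlier documents are not re-parsed. A real system would populate the table from a parser and would likely need indexes on `(doc_id, parent_id)` for the joins to scale.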
Publications:
TKDE'12, ICDE'10 (demo), ICDE'06
- Completed Projects
- XML Stream Processing
- XML Databases
- XML Constraints
- Querying Linguistic Databases
A Complete List of Publications