|
Research Interests
|
|
My research interests span several areas of data management:
structured, semistructured, and XML data; evaluation and optimization of
generalized tree-pattern queries on the web; keyword search on XML data;
semantics for keyword queries; views on XML data; data integration; and
the Semantic Web.
-
Flexible querying of XML data sources on the web.
XML data sources include sources with different structures and
sources with complex or only partially known structures. Querying
such sources requires queries that go beyond tree-pattern queries
(TPQs) and encompass keyword-based queries and queries with arbitrary
structural constraints. I consider this class of queries in my
research and refer to them as generalized TPQs (GTPQs).
XML query evaluation has usually been studied in two contexts:
one deals with indexed XML data, the other with (non-indexed)
XML streams. These two contexts place different requirements on
evaluation algorithms and present different challenges. In my
research, I have addressed the evaluation problem in both contexts and
designed efficient algorithms for evaluating GTPQs on XML data.
Details can be found in the following publications:
[SSDBM09] [DASFAA09] [WWW08] [CIKM08] [CIKM07].
|
-
Defining semantics for keyword queries and generalized tree-pattern queries on XML data.
In recent publications [DKE08] [DASFAA07], I devised
an original approach for assigning semantics to GTPQs. The new
semantics applies seamlessly both to keyword queries and to queries
with structural restrictions. Previous approaches identify meaningful
answers by operating locally on the data. In contrast, this
approach operates globally on structural summaries of the data to
compute meaningful TPQs. This global view of the data gives it an
advantage over previous approaches, which is largely confirmed by
the experimental results.
|
-
Answering XML queries using materialized views.
Answering queries using views is a well-established technique in
databases. In this context, two outstanding problems can be
formulated. The first is deciding whether a query can be answered
exclusively using one or multiple materialized views. The second,
given the many alternative ways to compute the query from the
materialized views, is finding the best one. In the realm of XML,
only a limited number of contributions address these problems, owing
to the many limitations associated with the use of materialized views
in traditional XML query evaluation models.
In my research, I adopt a recent evaluation model, called the
inverted lists model, together with holistic algorithms; the
combination has been established as the prominent technique for
evaluating queries on large persistent XML data. Within this model,
I address the two problems above.
I am currently working on inverted-list-based TPQ optimization using
materialized views, as well as on the view selection problem. A
recent publication in this area is [CIKM09].
|
|