CV: A very short CV adapted from my ABET CV.
Overview of my research: I am interested in deriving intelligence from
corpora using text mining, information extraction, natural language processing,
machine learning, and information retrieval approaches. My work has been applied
to distance-learning student performance evaluation, representation of
research expertise for personalized uMining, finding similar people on the web
using a personal web site as a search query, personalized query refinement,
etc.
The following is a list of my current projects. (Current as of June 20, 2013) |
 |
IFME: Information filtering by multiple examples, with Ph.D. student Mingzhu Zhu |
| |
This framework uses multiple representative articles provided by a
user as positive samples to represent a complex information need, so the
user does not have to compose a search query. The system learns from the
user's samples and ranks all documents in a document base (such as a digital
library) by their relevance to the information need that the sample
documents represent, using a semi-supervised Positive and Unlabeled
Learning (PU Learning) approach. To achieve a high level of learning
performance even with very few positive samples, the system uses
under-sampling, which is especially beneficial when relevant documents
similar to the samples are not evenly distributed in the document base.
|
 |
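As a rough illustration of ranking by multiple examples (not the IFME implementation), the sketch below repeatedly under-samples the unlabeled pool, treats each sample as provisional negatives, and averages the resulting scores. The Rocchio-style centroid scorer, the function names, and the parameters `rounds` and `seed` are all assumptions made for the example; the classifier actually used in IFME is not described here.

```python
import math
import random
from collections import Counter


def vectorize(text):
    """Turn a text into an L2-normalised bag-of-words vector (a dict)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {term: c / norm for term, c in counts.items()}


def centroid(vectors):
    """Average a list of sparse vectors (empty list gives an empty vector)."""
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def rank_by_examples(positives, unlabeled, rounds=20, seed=0):
    """Rank unlabeled documents by similarity to the positive samples.

    Assumed scheme: each round under-samples the unlabeled pool down to
    the size of the positive set, treats that sample as provisional
    negatives, and scores every unlabeled document as
    (similarity to positive centroid) - (similarity to negative centroid).
    Scores are summed over all rounds.
    """
    rng = random.Random(seed)
    pos_centroid = centroid([vectorize(d) for d in positives])
    unl_vecs = [vectorize(d) for d in unlabeled]
    k = min(len(positives), len(unl_vecs))
    scores = [0.0] * len(unl_vecs)
    for _ in range(rounds):
        neg_centroid = centroid(rng.sample(unl_vecs, k))
        for i, vec in enumerate(unl_vecs):
            scores[i] += dot(vec, pos_centroid) - dot(vec, neg_centroid)
    order = sorted(range(len(unlabeled)), key=lambda i: scores[i], reverse=True)
    return [unlabeled[i] for i in order]


# Hypothetical usage: two sample articles stand in for the search query.
samples = ["text mining of document corpora",
           "mining text corpora with machine learning"]
library = ["machine learning for text mining",
           "chocolate cake recipe",
           "football match results",
           "spring gardening tips"]
print(rank_by_examples(samples, library))
```

Drawing a fresh under-sample each round keeps the provisional negative set the same size as the positive set, so the few positive samples are not swamped by the much larger unlabeled pool.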
Task-based user profiling for personalized query refinement, with Ph.D. student Chao Xu |
| |
This project uses the user’s prior search sessions to model his or
her evolving task-based search interests with long- and short-term, and
positive and negative, descriptors. To reduce the noise in the dataset,
the clicked pages in the user’s search sessions are represented by their
associated social tags to form a pseudo user representation, from which
the descriptors in the user’s profile are derived.
|
 |
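One possible shape for such a profile is sketched below. It is an assumption-heavy illustration, not the project's model: sessions are taken to be lists of social tags for clicked and skipped pages, negative descriptors are assumed to come from skipped pages, and the names `build_profile`, `short_window`, and `top_n` are hypothetical.

```python
from collections import Counter


def build_profile(sessions, short_window=2, top_n=3):
    """Derive long- and short-term descriptors from tagged search sessions.

    sessions: oldest-first list of dicts, each with 'clicked' and 'skipped'
    lists of social tags (a hypothetical input format).
    """
    def descriptors(subset):
        # Net weight of a tag = times seen on clicked pages
        # minus times seen on skipped pages.
        net = Counter(tag for s in subset for tag in s["clicked"])
        net.subtract(tag for s in subset for tag in s["skipped"])
        ranked = sorted(net.items(), key=lambda kv: (-kv[1], kv[0]))
        positive = [tag for tag, w in ranked if w > 0][:top_n]
        negative = [tag for tag, w in reversed(ranked) if w < 0][:top_n]
        return {"positive": positive, "negative": negative}

    return {
        "long_term": descriptors(sessions),                   # whole history
        "short_term": descriptors(sessions[-short_window:]),  # recent sessions only
    }


# Hypothetical usage: the user's interest drifts from programming to travel.
history = [
    {"clicked": ["python", "pandas"], "skipped": ["snake"]},
    {"clicked": ["python", "numpy"], "skipped": ["snake"]},
    {"clicked": ["travel", "flights"], "skipped": ["python"]},
]
print(build_profile(history, short_window=1))
```

Keeping the short-term view as a sliding window over the most recent sessions is only one way to capture evolving interests; a decay weight per session would be another.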
Intent-based user segmentation with query enhancement for online advertisement, with Ph.D. student Wei Xiong |
| |
This project proposes a query enhancement mechanism that augments a user’s
queries by leveraging the user’s query log, which provides more useful
context for the user’s interests and hence reduces the ambiguity in
the user’s inferred intent.
|
 |
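A minimal sketch of log-based query enhancement follows. It assumes a simple co-occurrence heuristic (add the terms that most often accompany the query's terms in the user's earlier queries); the function name `enhance_query` and the parameter `max_added` are hypothetical, and the project's actual mechanism may differ.

```python
from collections import Counter


def enhance_query(query, query_log, max_added=2):
    """Append the terms that most often co-occur with the query in the log."""
    terms = query.lower().split()
    cooccur = Counter()
    for past in query_log:
        past_terms = past.lower().split()
        # Only past queries that share a term with the current query count.
        if any(t in past_terms for t in terms):
            cooccur.update(t for t in past_terms if t not in terms)
    # Most frequent first; ties broken alphabetically for reproducibility.
    ranked = sorted(cooccur.items(), key=lambda kv: (-kv[1], kv[0]))
    return terms + [t for t, _ in ranked[:max_added]]


# Hypothetical usage: the log suggests "jaguar" means the car, not the animal.
log = ["jaguar car price", "jaguar car dealer",
       "used car loans", "jaguar xf review"]
print(enhance_query("jaguar", log))  # ['jaguar', 'car', 'dealer']
```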
Automatically generating audience level metadata for digital library resources, with Ph.D. student Todd Will |
| |
This project trains a support vector machine classifier to label digital
library resources by subject and reading level automatically.
|
 |
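To make the classification step concrete, here is a small linear SVM sketch for a two-way reading-level label. It is illustrative only: the bag-of-words features, the toy training texts, the labels `ELEMENTARY` and `ADVANCED`, and the Pegasos-style sub-gradient training (cycling through the examples in order rather than sampling them) are all assumptions, not the project's setup. Subject labels would need a multi-class extension.

```python
from collections import Counter

ELEMENTARY, ADVANCED = 1, -1  # hypothetical reading-level labels


def features(text):
    """Bag-of-words term counts."""
    return Counter(text.lower().split())


def score(weights, text):
    """Linear decision value w . x for a text."""
    return sum(weights.get(f, 0.0) * v for f, v in features(text).items())


def train_linear_svm(texts, labels, lam=0.1, epochs=50):
    """Train a linear SVM on the hinge loss with L2 regularisation.

    Pegasos-style update with step size 1 / (lam * t), visiting the
    training examples in a fixed order.
    """
    weights = {}
    t = 0
    for _ in range(epochs):
        for text, y in zip(texts, labels):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * score(weights, text)
            # Regularisation shrink: (1 - eta * lam) == 1 - 1/t.
            weights = {f: (1.0 - 1.0 / t) * w for f, w in weights.items()}
            if margin < 1.0:  # hinge loss is active: move towards the example
                for f, v in features(text).items():
                    weights[f] = weights.get(f, 0.0) + eta * y * v
    return weights


def predict(weights, text):
    return ELEMENTARY if score(weights, text) > 0 else ADVANCED


# Hypothetical toy training set.
train_texts = ["the cat sat on the mat",
               "stochastic gradient optimization converges asymptotically",
               "the dog ran to the park",
               "eigenvalue decomposition of covariance matrices"]
train_labels = [ELEMENTARY, ADVANCED, ELEMENTARY, ADVANCED]
model = train_linear_svm(train_texts, train_labels)
print(predict(model, "the cat ran"))
```

In practice a library implementation with TF-IDF features would be the usual choice; the hand-written trainer is used here only to keep the example self-contained.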
Concept chaining utilizing meronyms in text characterization, with Ph.D. student Lori Watrous-Deversterre |
| |
This project uses semantic and linguistic content categorization to
improve access methods for digital library resources.
|