|
On the Web, there are few ways to find people who share interests or
background with a given person. Current methods, such as general-purpose
search engines, are either ineffective or time-consuming.
This work presents People-Search, a new approach for finding people on the
Web who share similar interests. Finding people similar to a given person
raises two major research issues:
person representation and person matching. In this study, a person is
represented by his or her personal website, which reflects that person's
interests and background. The matching process is designed around this
representation, so that the query is itself a personal website and is
represented in the same way. Based on this person representation method, the
main proposed algorithm integrates the textual content and hyperlink
information of all pages belonging to a personal website to represent a person
and to match persons. Other algorithms, based on different combinations
of content, inlink, and outlink information from either the entire personal
website or only its main page, were also explored and compared with the main
proposed algorithm. Two kinds of evaluation were conducted. In the automatic
evaluation, precision, recall, F, and Kruskal-Goodman γ measures were
used to compare these algorithms. In the human evaluation, the effectiveness
of the main proposed algorithm and of two other important algorithms was
evaluated by human subjects. Results from both evaluations show that the
People-Search algorithm, which integrates the content and link information of
all pages belonging to a personal website, outperformed all other algorithms in
finding similar people on the Web.
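
The measures named for the automatic evaluation can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: F is taken to be the balanced harmonic mean of precision and recall, γ is computed from concordant and discordant pairs of two score lists, and the function names and sample data are hypothetical.

```python
from itertools import combinations


def precision_recall_f(retrieved, relevant):
    """Set-based precision, recall, and balanced F (assumed F1)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f


def goodman_kruskal_gamma(xs, ys):
    """Gamma = (C - D) / (C + D) over all item pairs; tied pairs are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0


# Hypothetical data: websites returned by a matcher vs. a reference set.
p, r, f = precision_recall_f(["a", "b", "c", "d"], ["a", "b", "e"])

# Hypothetical data: algorithm ranks vs. reference ranks for four websites.
gamma = goodman_kruskal_gamma([1, 2, 3, 4], [1, 2, 4, 3])
```

A γ of 1 means the two orderings agree on every untied pair, and a γ of -1 means they disagree on every untied pair.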
|