BIOKDD, 2001

1st Workshop on Data Mining in Bioinformatics

August 26, 2001
San Francisco, CA, USA

in conjunction with

7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
(KDD'2001)

Proceedings of the ACM SIGKDD Workshop on Data Mining in Bioinformatics, edited by Mohammed J. Zaki, Hannu T.T. Toivonen, and Jason T. L. Wang:

Foreword, Mohammed J. Zaki, Hannu T.T. Toivonen, and Jason T. L. Wang, pp. i - ii.

Part I (Invited Paper & Gene Expression)

Invited Paper: Determination of RNA folding pathway functional intermediates using a massively parallel genetic algorithm, Bruce A. Shapiro, David Bengali, and Wojciech Kasprzak, National Cancer Institute, USA, p. 1.

Extracting knowledge from gene expression data: A case study of Batten Disease, Simon M. Lin, Sumeer Dhar, and Rose-Mary N. Boustany, Duke University Medical Center, USA, pp. 2 - 7.

Part II (Microarrays)

Mining microarray expression data for classifier gene-cores, Goutham Kurra, Wen Niu, and Raj Bhatnagar, University of Cincinnati, USA, pp. 8 - 14.

Classification of genes using probabilistic models of microarray expression profiles, Paul Pavlidis, Christopher Tang, and William S. Noble, Columbia University, USA, pp. 15 - 21.

Analysis of an associative memory neural network for pattern identification in gene expression data, Silvio Bicciato, Mario Pandin, Giuseppe Didonè, and Carlo Di Bello, University of Padova and Cittadella Hospital, Italy, pp. 22 - 30.

Part III (Sequence Assembly)

A learning algorithm for string assembly, Mark K. Goldberg, Darren T. Lim, and Malik Magdon-Ismail, RPI, USA, pp. 31 - 37.

A probabilistic approach to sequence assembly validation, Sun Kim, Li Liao, and Jean-Francois Tomb, Indiana University and DuPont, USA, pp. 38 - 43.

Part IV (Invited Paper & Proteins)

Invited Paper: Shared challenges in data mining and computational biology, Charles Elkan, University of California, San Diego, USA, p. 44.

Learning to recognize brain specific proteins based on low-level features from on-line prediction servers, Mikael Huss, Henrik Bostrom, Lars Asker, and Joakim Coster, Virtual Genetics Laboratory, Sweden, pp. 45 - 49.

Investigation of bagging-like effects and decision trees versus neural nets in protein secondary structure prediction, Nitesh Chawla, Thomas E. Moore, Kevin Bowyer, Lawrence O. Hall, Clayton Springer, and Philip Kegelmeyer, University of South Florida and Sandia National Labs., USA, pp. 50 - 59.

Part V (Sequence Modeling & Clustering)

Maximum entropy methods for biological sequence modeling, Eugen C. Buehler and Lyle H. Ungar, University of Pennsylvania, USA, pp. 60 - 64.

Hierarchical cluster analysis of SAGE data for cancer profiling, Raymond T. Ng, Jorg Sander, and Monica C. Sleumer, University of British Columbia, Canada, pp. 65 - 72.

A scalable algorithm for clustering protein sequences, Valerie Guralnik and George Karypis, University of Minnesota, USA, pp. 73 - 80.

WORKSHOP CO-CHAIRS:

Mohammed J. Zaki, Rensselaer Polytechnic Institute (zaki@cs.rpi.edu)

Hannu T.T. Toivonen, University of Helsinki and Nokia Research Center (Hannu.TT.Toivonen@nokia.com)

Jason T. L. Wang, New Jersey Institute of Technology (jason@cis.njit.edu)

PROGRAM COMMITTEE:

Chuck Baldwin, Lawrence Livermore National Laboratory

Chris Bystroff, Rensselaer Polytechnic Institute

Shi-Kuo Chang, University of Pittsburgh

Wesley W. Chu, University of California, Los Angeles

Diane J. Cook, University of Texas at Arlington

Charles Elkan, University of California, San Diego

Janice Glasgow, Queen's University, Canada

Richard Hughey, University of California, Santa Cruz

Hasan Jamil, Mississippi State University

Minoru Kanehisa, Kyoto University

Simon M. Lin, Duke University Medical Center

Jacob V. Maizel, Jr., National Institutes of Health

Sharad Mehrotra, University of California at Irvine

Shinichi Morishita, University of Tokyo

Jane Richardson, Duke University

Isidore Rigoutsos, IBM Thomas J. Watson Research Center

Bruce Shapiro, National Institutes of Health

Vassilis J. Tsotras, University of California, Riverside

Alex Tuzhilin, New York University/Stern School of Business

Jeff Vitter, Duke University

Cathy H. Wu, Georgetown University Medical Center

Michael Zuker, Rensselaer Polytechnic Institute
