Computational Methods for Deciphering Genomic Structures of Bacteria

Dongsheng Che
East Stroudsburg University of Pennsylvania


High-throughput sequencing technologies have generated huge amounts of genomic data. This wealth of data gives computational biologists unprecedented opportunities to unveil the biological machinery encoded in genomes. Characterizing the structure of genomes is one of the important and challenging tasks in this effort, and it represents an essential step towards deciphering the networks and pathways in a biological system. In this talk I will discuss our UNIPOP algorithm, which uses comparative genomics methods to identify operons, that is, sets of genes that are co-transcribed as a single transcriptional unit. I will also discuss our Uber-operon algorithm, which uses comparative genomics methods to group sets of operons into uber-operons, whose member operons are evolutionarily and functionally related. Finally, I will discuss our recent work on using a decision-tree-based ensemble learning approach to detect genomic islands, which are genome segments horizontally transferred from other genomes.