Updown Distance: A New Distance Measure for Comparing Two Phylogenetic Trees

Jason T. L. Wang
Bioinformatics Program
Department of Computer Science
New Jersey Institute of Technology
wangj@njit.edu

Steven Regula
Department of Computer Science
New Jersey Institute of Technology

Dennis Shasha
Courant Institute of Mathematical Sciences
Department of Computer Science
New York University

Lei Hua
Department of Computer Science
New Jersey Institute of Technology

Rekha Patel
Department of Computer Science
New Jersey Institute of Technology


Updown Distance - Input Files Tutorial

This tutorial is an extension to the Updown distance project summary.

To work around command-line length limits, the system accepts parameters that name files containing Newick-formatted input strings. To use this option, first create and save a plain text file using a text editor. The following diagram highlights important details that must be followed when creating input files.

Figure 1: Specific input file details

Note that in this image there is a newline after the Newick-formatted tree. Once the input file has been created, it can be passed to the system with the following command-line parameters.

java phylogeny/UpdownDistance /?

Usage: java UpdownDistance [-p QUERYFILENAME] [-d DATAFILENAME]

For example, if the query tree and the data tree are stored in files named query.txt and data.txt, respectively, the user would type the following to run the Updown distance calculator with input files.

java phylogeny/UpdownDistance -p query.txt -d data.txt
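
Input files can also be generated programmatically instead of in a text editor. The following is a minimal sketch, not part of the Updown distance distribution: the class name NewickFileWriter, the method writeTree, and the two example trees are hypothetical placeholders. It only illustrates the one requirement stated above, namely that the Newick-formatted tree is followed by a newline; consult Figure 1 for any further formatting details.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Updown distance system.
class NewickFileWriter {

    // Writes one Newick string to a plain text file and ends the file
    // with a newline, as the tutorial requires.
    static Path writeTree(String fileName, String newick) throws IOException {
        Path path = Paths.get(fileName);
        // Drop stray surrounding whitespace, then append exactly one line break.
        String content = newick.trim() + System.lineSeparator();
        Files.write(path, content.getBytes(StandardCharsets.US_ASCII));
        return path;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder trees for illustration only.
        writeTree("query.txt", "((a,b),c);");
        writeTree("data.txt", "((a,c),b);");
    }
}
```

After compiling and running this class in the directory that contains the "phylogeny" directory, the generated query.txt and data.txt can be passed to the command shown above.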


To further elaborate, the following screenshots should help users who might encounter difficulty in using the system.

Figure 2: An example of a directory containing the "phylogeny" directory and two text files

Figure 3: The contents of the "phylogeny" directory (note that it contains the class files we will execute)

Figure 4: Back in the initial directory, listing the contents of the file "query.txt" (a Newick formatted query tree)

Figure 5: Back in the initial directory, listing the contents of the file "data.txt" (a Newick formatted data tree)

Figure 6: A sample execution of the system using input files with Newick formatted trees

Finally, note that input files can be located in any directory as long as the full path is specified. The following figure is an example of this feature.

Figure 7: Executing Updown distance with input files located elsewhere in the file system
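
When input files live elsewhere in the file system, it can help to confirm the full path and the trailing newline before running the calculator. The sketch below is an assumption-laden convenience check, not a feature of the Updown distance system: the class name InputFileCheck and its methods are hypothetical, and it verifies only that each file is readable and ends with a newline.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check, not part of the Updown distance system.
class InputFileCheck {

    // Converts a possibly relative file name into a full (absolute) path.
    static Path resolve(String fileName) {
        return Paths.get(fileName).toAbsolutePath().normalize();
    }

    // Returns true if the file is non-empty and its last byte is a line feed.
    // This also holds for Windows-style "\r\n" line endings.
    static boolean endsWithNewline(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        return bytes.length > 0 && bytes[bytes.length - 1] == '\n';
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as the name of one input file.
        for (String name : args) {
            Path full = resolve(name);
            if (!Files.isReadable(full)) {
                System.out.println(full + ": not readable");
            } else if (!endsWithNewline(full)) {
                System.out.println(full + ": missing newline after the tree");
            } else {
                System.out.println(full + ": ok");
            }
        }
    }
}
```

For example, running "java InputFileCheck query.txt data.txt" prints the full path of each file, which can then be supplied to the -p and -d parameters.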


Citation

Jason T. L. Wang, Huiyuan Shan, Dennis Shasha and William H. Piel, "Fast Structural Search in Phylogenetic Databases," Evolutionary Bioinformatics Online, Volume 1, 2005, pp. 37-46.


Links

Newick Notation - Wikipedia Entry

Phyfi - Online tool for drawing and manipulating phylogeny color figures

What is the largest text file that Java can read into a String?


Download Issues

Some browsers open the PDF file, the Web page manuals, and the programs instead of starting a download. If this happens, try right-clicking the link and choosing the option named "Save Target As..." or similar. If a separate window pops up, click "File" in the menu bar at the top of the window and then click "Save As" to save the file.