Xin Zhang
Department of Computer Science
Courant Institute of Mathematical Sciences
New York University
Dennis Shasha
Department of Computer Science
Courant Institute of Mathematical Sciences
New York University
Yang Song
Department of Computer Science
New Jersey Institute of Technology
Jason Wang
Department of Computer Science
New Jersey Institute of Technology
wangj@njit.edu
We present a software tool, called PeakID, for fast elastic peak detection in 2D liquid chromatography-mass spectrometry (LC-MS) data. PeakID takes 2D LC-MS data as input and locates all peaks across multiple window sizes of interest in the input data.
The programs of PeakID are written in C++.
They were compiled and tested on a Dell PC
under the Microsoft Windows XP operating system.
Here, we post the source code of the software and
provide instructions to compile and run the programs.
All input files are plain text files. In the training phase, a state-space algorithm finds the topology and structure of an efficient Shifted Aggregation Tree for PeakID to use. To train the software tool, you need to provide two input files: a sample data file (sample.txt) and a threshold file (thresh.txt). You also need to provide the name of an output file (tree.txt), in which the computed topology and structure of the tree, i.e., the shift, shadow size, and degree of each level, will be stored.
To detect peaks, you need to provide three input files: an input data file (input.txt), a threshold file (thresh.txt), and the Shifted Aggregation Tree structure file (tree.txt). The sample data file (sample.txt) and the input data file (input.txt) have exactly the same format.
For your convenience, we have included a copy of each of the above text files in this package.
The software tool displays the output on the terminal.
You can redirect the output to a file.
The output is a list of tuples of the following form:
Starting_position, Window_size, Sum_of_intensity_values
Each tuple indicates a peak in the time window that begins at
Starting_position and spans Window_size time points;
the sum of the intensity values within this time window equals
Sum_of_intensity_values.
The Sum_of_intensity_values is always greater than or equal to
the threshold associated with the Window_size.
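To make the meaning of the output tuples concrete, the following is a minimal brute-force sketch of the reporting rule described above. It is not the Shifted Aggregation Tree algorithm used by PeakID, and it makes several assumptions that this manual does not state: the data is reduced to a single series of intensity values over time, starting positions are counted from 0, and thresholds are held in a map from window size to threshold. The names Peak and findPeaksNaive are hypothetical and do not appear in the PeakID source.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// One reported peak; the fields mirror the output tuple described above.
struct Peak {
    std::size_t start;   // Starting_position (0-based here; an assumption)
    std::size_t window;  // Window_size, in time points
    double sum;          // Sum_of_intensity_values inside the window
};

// Naive reference: for each window size with an associated threshold, slide a
// window of that size over the series and report every window whose sum is
// greater than or equal to the threshold.
// Hypothetical helper for illustration, not PeakID's actual algorithm.
std::vector<Peak> findPeaksNaive(const std::vector<double>& intensity,
                                 const std::map<std::size_t, double>& thresholds) {
    std::vector<Peak> peaks;

    // Prefix sums: prefix[i] = intensity[0] + ... + intensity[i - 1],
    // so any window sum is a difference of two prefix values.
    std::vector<double> prefix(intensity.size() + 1, 0.0);
    for (std::size_t i = 0; i < intensity.size(); ++i) {
        prefix[i + 1] = prefix[i] + intensity[i];
    }

    // std::map iterates in increasing window size.
    for (const auto& entry : thresholds) {
        const std::size_t window = entry.first;
        const double threshold = entry.second;
        if (window == 0 || window > intensity.size()) {
            continue;  // no window of this size fits in the series
        }
        for (std::size_t start = 0; start + window <= intensity.size(); ++start) {
            const double sum = prefix[start + window] - prefix[start];
            if (sum >= threshold) {
                peaks.push_back({start, window, sum});
            }
        }
    }
    return peaks;
}
```

For example, under these assumptions the intensity series 1, 5, 6, 1 with a threshold of 6 for window size 1 and a threshold of 10 for window size 2 yields two tuples: (2, 1, 6) and (1, 2, 11). The sketch checks every window of every size, which is the cost that the trained tree is meant to reduce.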
SAT.exe must be run from a DOS command prompt.
The following screenshot shows how the training step works:
The following screenshot shows how the peak detection works:
Some browsers open the text files, manuals, and programs in the browser window instead of starting a download. If this happens, right-click the link and choose the option named "Save Target As..." or similar. If the file opens in a separate window, click "File" in the menu bar at the top of that window and then click "Save As" to save the file.