Given two sequences, where one is a substring of the other, we define a substring alignment by matching the substring with the longer sequence and placing gaps everywhere else. For example, if the input is ACCTGTAGG and TGT, then the substring alignment is

ACCTGTAGG
---TGT---

Write a Perl program that prints the substring alignment of two unaligned sequences in a FASTA file. If no substring alignment exists, the program should print "No alignment found".

---------------------------------------

You are given an alignment in ClustalW format. Write a Perl program to convert this to FASTA format and print it to the screen. The ClustalW format begins with the line "CLUSTAL W (1.83) multiple sequence alignment" followed by two blank lines. Following that, the alignment begins, with each sequence on a separate line: the sequence name, followed by spaces, then the sequence. However, the alignment is broken by sequence length into chunks of 60 characters. Below is an example of a ClustalW alignment.

CLUSTAL W (1.83) multiple sequence alignment


Seq1      AAAACAGGGTGATACAGATGCGATGCCCACACACCCCACACGGATTATTATTATATATAT
Seq2      AAAACAGGGTGATACAGATGCTAATGCAACACACCAAACACGGATTATTAAAATCTATAT
Seq3      AAAACAGGGTGATACAGATGCGATGCCCACACACCAAACACGGATTATTACCATGTATAT

Seq1      GGCACGCAAAAATTATAGCGAGATGCTAGCATCAGACACAAAAAAAACGCGCGCTAGTCA
Seq2      GGCACGCAACCCTTATAGCGAGATGCTAGCATCAGACACAAAAACGCCCGCGCGCTAGTC
Seq3      GGCACGCAAGGGTTATAGCGAGATGCTAGCATCAGACACAAAACGGCCGCGCGCTAGTCA

Seq1      AAGCGCGTAGCCC
Seq2      AAGCGAATAGGGG
Seq3      AACCCCGTAGCCC

The resulting FASTA alignment would be

>Seq1
AAAACAGGGTGATACAGATGCGATGCCCACACACCCCACACGGATTATTATTATATATATGGCACGCAAAAATTATAGCGAGATGCTAGCATCAGACACAAAAAAAACGCGCGCTAGTCAAAGCGCGTAGCCC
>Seq2
AAAACAGGGTGATACAGATGCTAATGCAACACACCAAACACGGATTATTAAAATCTATATGGCACGCAACCCTTATAGCGAGATGCTAGCATCAGACACAAAAACGCCCGCGCGCTAGTCAAGCGAATAGGGG
>Seq3
AAAACAGGGTGATACAGATGCGATGCCCACACACCAAACACGGATTATTACCATGTATATGGCACGCAAGGGTTATAGCGAGATGCTAGCATCAGACACAAAACGGCCGCGCGCTAGTCAAACCCCGTAGCCC

-------------------------------------

Write a Perl program to compute the sum-of-pairs score of a multiple alignment with gaps. Your program will take as input a filename containing a FASTA alignment, a filename containing the scoring matrix, and the gap open, gap extension, and endgap penalties. When comparing two sequences, don't forget to ignore sites where both are gaps. As an example, suppose the gap open, extension, and endgap penalties are -1, -0.1, and 0 respectively, so that an internal gap of length n costs -1 for the first position and -0.1 for each of the remaining n-1 positions. Also suppose the match score is 1 and mismatch is 0. Then for the following three-sequence alignment

AACGTTGCAC
--CGTT--A-
AACG----A-

the total score is

3.9 (for sequence 1 and 2) + 3.7 (for sequence 1 and 3) + 1.9 (for sequence 2 and 3) = 9.5

-------------------------------------

Write a Perl subroutine that computes a maximal local aligned segment. The input to the subroutine is

1. $q (query sequence)
2. $t (target sequence)
3. $kmer (a substring of length 3 that is contained in $q and $t)
4. $threshold (defined such that the maximal segment score will reduce by at most threshold if further extended)

For example, suppose the input $q and $t are

$t: ACCTGTAGG
$q: ATGTC

with $kmer="TGT" and $threshold=-2. Also suppose the match score=10 and mismatch=-2. Then the kmer-hit has score 30. But if extended by one position on each side it becomes

CTGTA
ATGTC

which has a reduction of -4 (two mismatches). This is less than the threshold -2, and so the maximal aligned segment is

TGT
TGT

-------------------------------------
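One possible reading of the last exercise is a BLAST-style ungapped extension, sketched below. This is an assumption about the intended procedure, not the reference solution: the sketch extends the k-mer hit to the left and to the right independently, remembers the best-scoring extension in each direction, and stops as soon as the running score falls below that best by more than $threshold allows. The helper name extend_one_way, the constants $MATCH and $MISMATCH, and the choice to use the first occurrence of $kmer in each sequence are all hypothetical.

```perl
use strict;
use warnings;

# Scoring values taken from the worked example (assumed fixed here).
my $MATCH    = 10;
my $MISMATCH = -2;

# Extend without gaps in one direction, starting at positions $i (in $q)
# and $j (in $t). $step is -1 for leftward and +1 for rightward extension.
# Returns (best score gain, number of columns that achieve it).
sub extend_one_way {
    my ($q, $t, $i, $j, $step, $threshold) = @_;
    my ($gain, $best_gain, $len, $best_len) = (0, 0, 0, 0);
    while ($i >= 0 && $j >= 0 && $i < length($q) && $j < length($t)) {
        $gain += (substr($q, $i, 1) eq substr($t, $j, 1)) ? $MATCH : $MISMATCH;
        $len++;
        if ($gain > $best_gain) {
            ($best_gain, $best_len) = ($gain, $len);
        }
        # $threshold is negative: stop once the drop from the best is larger.
        last if $gain - $best_gain < $threshold;
        $i += $step;
        $j += $step;
    }
    return ($best_gain, $best_len);
}

# Returns (score, target segment, query segment), or an empty list
# if $kmer does not occur in both sequences.
sub maximal_segment {
    my ($q, $t, $kmer, $threshold) = @_;
    my $qi = index($q, $kmer);    # first occurrence only (assumption)
    my $ti = index($t, $kmer);
    return if $qi < 0 || $ti < 0;
    my $k = length($kmer);

    my ($lgain, $llen) = extend_one_way($q, $t, $qi - 1,  $ti - 1,  -1, $threshold);
    my ($rgain, $rlen) = extend_one_way($q, $t, $qi + $k, $ti + $k,  1, $threshold);

    my $score = $k * $MATCH + $lgain + $rgain;
    my $len   = $llen + $k + $rlen;
    return ($score, substr($t, $ti - $llen, $len), substr($q, $qi - $llen, $len));
}

my ($score, $tseg, $qseg) = maximal_segment("ATGTC", "ACCTGTAGG", "TGT", -2);
print "$tseg\n$qseg\nscore=$score\n";
```

On the example input this prints TGT, TGT and score=30: each direction meets a single mismatch before the query runs out, so neither extension improves on the k-mer hit. Extending each side separately differs from the example's description, which extends both sides at once, but the two agree on this input.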