\magnification 1120
\parindent 0pt
\parskip 5pt
\hsize 7.2truein
\vsize 9.7truein
\hoffset -0.45truein
\def\u{\vskip 0.4in}
\def\h{\hskip 2in}

Fragment Assembly of DNA Homework

1.) Draw the directed graph corresponding to the following fragments. Find a Hamiltonian path with the largest weight sum. Use this path to determine the original DNA sequence. Is there more than one Hamiltonian path with the largest weight sum?

\vskip 10pt
\centerline{AGCAGTAG \h GAGCGAT}
\u
\centerline{ AGTCGTAG \h GATCACAG}

There are two Hamiltonian paths with the largest weight sum:

GAGCGAT, GATCACAG, AGCAGTAG, AGTCGTAG, giving an original sequence of \hfil \break
GAGCGATCACAGCAGTAGTCGTAG

GAGCGAT, GATCACAG, AGTCGTAG, AGCAGTAG, giving an original sequence of \hfil \break
GAGCGATCACAGTCGTAGCAGTAG

2.) Draw the directed graph corresponding to the following fragments. Find a Hamiltonian path with the largest weight sum. Use this path to determine the original DNA sequence. Is there more than one Hamiltonian path with the largest weight sum?

\vskip 10pt
\centerline{ACTGGATT}
\centerline{ATTATG \h TGTT}
\u
\centerline{TTAACT \h TTCTACT}

There are two Hamiltonian paths with the largest weight sum:

TTAACT, ACTGGATT, ATTATG, TGTT, TTCTACT, giving an original sequence of \hfil \break
TTAACTGGATTATGTTCTACT

TTCTACT, ACTGGATT, ATTATG, TGTT, TTAACT, giving an original sequence of \hfil \break
TTCTACTGGATTATGTTAACT

3.) To find the original DNA sequence, what are you looking for (i.e., what kind of graph theory problem is this)?

We are looking for a Hamiltonian path (here, one with the largest weight sum).

4.) Is the solution always unique?

No. As problems 1 and 2 show, there can be more than one Hamiltonian path with the largest weight sum.

5.) How is this problem similar to and different from the traveling salesperson problem?

i.) TSP is a graph problem, while fragment assembly is a digraph problem.

ii.) In both cases, we are interested in the sum of the weights, but in TSP we usually want the minimum sum (i.e.
minimized cost or distance, unless you want the maximum distance for more frequent-flyer miles), while in fragment assembly we want to maximize the total weight of the arcs (i.e., maximize the overlaps between consecutive segments) in our simplified and altered problem. Note that in real life you may not want to maximize the weights; instead, you may use knowledge about the length of the original sequence.

iii.) In both cases we want to visit each vertex exactly once, but in TSP we also want to return to the starting point (i.e., we are looking for a closed path), while in fragment assembly we do not return to the starting point.

Side note: Both problems can be modified. If the DNA is circular, we are looking for a Hamiltonian cycle, and if the salesperson does not want to return home, we are looking for a Hamiltonian path.

\eject
\end