Sequence for Yourself Part V: Assembly and Finishing
Assembly of 500-Base Segments
Perhaps you noticed in the last section that the human DNA incorporated in
the vector is 4,000 base pairs long and that the 500 bases of human DNA we can
read have to be adjacent to the vector DNA. So how do we read the rest of the
human DNA? The answer is by piecing together overlapping sequences.
Because we have many overlapping pieces, we also have many starting points for
the 4,000-base sequences -- enough to allow us to read every base.
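
To make the idea of piecing together overlapping sequences concrete, here is a
minimal Python sketch. It is a toy illustration, not the software used in real
sequencing projects: the function names (overlap, greedy_assemble), the
min_len parameter, and the short example reads are all hypothetical, and the
simple "merge the best-overlapping pair first" strategy is an assumption
chosen for clarity. It also assumes error-free reads that all come from the
same strand.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two reads that share the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases and returns
    whatever pieces remain (one piece if everything joined up).
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return reads
        # Join the pair, writing the shared bases only once.
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)


# Three short, made-up reads that overlap end to end.
pieces = greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"])
print(pieces)
```

Each read here shares four bases with its neighbor, so the three reads join
into a single longer sequence. Real reads are about 500 bases long, contain
occasional errors, and can come from either strand, so actual assembly
programs are considerably more involved.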
Assembly of 150,000-Base Segments
With the help of computers, we assemble the 500-base sequences into the
150,000-base segments from which they were derived.
Rebuilding the Chromosomes
Finally, we determine which chromosome each 150,000-base segment belongs to
and where along that chromosome it lies. We do this by looking for overlaps
between segments and by matching the banding patterns of the segments to those
of the chromosomes.
Using this approach -- called the "map-based shotgun approach" -- we can
sequence the entire genome. To learn about the differences between this
approach and the "whole genome shotgun" approach, check out the links in our
Resources section.
Final Notes
Although the job of piecing together overlapping fragments sounds
straightforward, the task is a challenging one. Many gaps will need to be
filled in, and there are areas where sequences are repeated many times, making
it almost impossible to determine where some of the fragments belong.
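
The trouble with repeats can be seen in a small Python sketch. This is only an
illustration under simplifying assumptions: the function name (placements) and
the toy sequence are hypothetical, and the example pretends the full sequence
is already known so that we can count how many places a fragment would fit.

```python
def placements(fragment: str, sequence: str) -> list[int]:
    """Return every start position where `fragment` matches `sequence` exactly."""
    hits = []
    start = sequence.find(fragment)
    while start != -1:
        hits.append(start)
        start = sequence.find(fragment, start + 1)
    return hits


# A made-up region containing the same five bases three times in a row.
region = "ACGT" + "GATTA" * 3 + "CCAG"

# A fragment that lies entirely inside the repeat fits in several places.
inside_repeat = placements("GATTA", region)

# A fragment that reaches into the unique flanking bases fits in only one.
spans_edge = placements("GTGATT", region)

print(inside_repeat, spans_edge)
```

A fragment that falls entirely within the repeated stretch matches at several
positions, so the overlaps alone cannot say which copy it came from. A
fragment that extends into the unique neighboring sequence matches only once.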
In this explanation, some of the descriptions of the techniques used were
simplified. Also, for each of the processes described here, researchers can use
a variety of techniques to get the same results.