Building of clone contigs (overlapping series of cloned DNA fragments).
(i) Hybridization approach. The simplest way to build up a contig is to start with one clone from a genomic library. This first clone is selected through hybridization with mapped DNA marker(s). This is followed by chromosome walking to select a second clone, whose insert overlaps that of the first clone, and so on. The successive overlapping clones are identified by a positive signal when the clone in question is hybridized as a probe to screen a genomic library. The main problem with this method arises if the probe contains a repeat sequence, leading to non-specific hybridization. This problem can be partly overcome by pre-hybridization with excess unlabelled genomic DNA.
Sometimes the end of the clone in question is subcloned and used as a probe to eliminate the chance of non-specific hybridization due to repetitive DNA. This method is based on the approach of chromosome walking and was the first method devised for assembly of clone contigs. A major limitation of this approach is the requirement of a high-density DNA marker map of the target genome.
(ii) PCR approach for building contigs. If the end-fragment of the selected clone is sequenced, a pair of primers can be designed from this sequence and the expected size of the amplified product can be predicted. If this pair of primers is used for PCR with every other clone in the library individually, a PCR product of the correct size will identify the overlapping clone, and the exercise can then be repeated from that clone.
However, to speed up the process, groups of clones are pooled for PCR in such a manner that an individual overlapping clone is identified through combinatorial screening. In one example, 960 clones are held in 10 microtitre trays, each tray having 96 wells in an 8 x 12 array with one clone per well. PCR is carried out as follows: (i) samples, one from each of the clones in a row of a microtitre tray, are mixed together and a single PCR is carried out, so that each tray needs eight row PCRs, making a total of 80 row PCRs; (ii) samples, one from each of the clones in a column of a tray, are mixed together and a single PCR is carried out, so that each tray needs 12 column PCRs, making a total of 120 column PCRs; (iii) 10 samples, one from the corresponding well of each of the 10 trays (e.g. well A1, at row A and column 1), are mixed together and a single PCR is carried out, so that the pooled samples from individual wells make 96 PCRs. This third step is needed only when there are two or more positive clones in the same tray.
Overlapping YAC or BAC clones for a region of the genome can be identified using a technique described as ‘STS (sequence tagged site) content mapping’. STS primers are used with pooled DNA samples from BACs, which are grouped for the combinatorial screening described above. Two BACs giving the same size of PCR product with a particular pair of STS primers are assumed to overlap, which can be confirmed by further analysis. However, a large number of STS markers are needed in this approach, which requires sequencing of clones followed by design and synthesis of primers.
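The inference used in STS content mapping (two clones that share an STS are candidates for an overlap) can be sketched in a few lines of Python. This is only an illustration: the clone names, the marker names and the function overlapping_pairs are hypothetical, and the input is assumed to be the set of STS markers that gave a PCR product of the expected size for each clone.

```python
from itertools import combinations


def overlapping_pairs(sts_hits):
    """Return clone pairs that share at least one STS marker.

    sts_hits maps a clone name to the set of STS markers that gave a
    PCR product of the expected size for that clone.
    """
    pairs = {}
    for a, b in combinations(sorted(sts_hits), 2):
        shared = sts_hits[a] & sts_hits[b]
        if shared:
            pairs[(a, b)] = shared
    return pairs


# Hypothetical STS content of four BACs (names and markers are made up).
sts_hits = {
    "BAC1": {"sts01", "sts02"},
    "BAC2": {"sts02", "sts03"},
    "BAC3": {"sts03", "sts04"},
    "BAC4": {"sts09"},
}

pairs = overlapping_pairs(sts_hits)
for (a, b), shared in pairs.items():
    print(a, b, sorted(shared))
```

In this made-up data set BAC1, BAC2 and BAC3 form a chain of candidate overlaps, while BAC4 shares no marker with any other clone. As the text notes, each candidate overlap would still need to be confirmed by further analysis.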
The combinatorial approach discussed above reduces the number of PCR reactions from 960 to 296 (80 + 120 + 96; see above) and gives enough information to identify positive clones. If a tray has a single positive row and a single positive column, the clone can be identified as the well common to that row and that column. In this case, a positive clone is identified without running the PCRs for pooled individual wells, which reduces the number of PCR reactions further from 296 to 200.
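The pooling arithmetic and the decoding of positive clones can be illustrated with a small Python sketch. It assumes an idealized screen with no false positives or false negatives. The function names (pool_counts, screen, identify) and the example positions are hypothetical, not part of any published protocol.

```python
TRAYS, ROWS, COLS = 10, 8, 12  # 10 trays x 96 wells = 960 clones


def pool_counts(trays=TRAYS, rows=ROWS, cols=COLS):
    """Number of row, column and well PCRs for the pooling scheme."""
    return trays * rows, trays * cols, rows * cols


def screen(positives):
    """Simulate which pools are positive.

    positives is a set of (tray, row_letter, column_number) tuples.
    """
    row_hits = {(t, r) for t, r, c in positives}
    col_hits = {(t, c) for t, r, c in positives}
    well_hits = {(r, c) for t, r, c in positives}
    return row_hits, col_hits, well_hits


def identify(positives):
    """Decode positive clones from the pooled results alone."""
    row_hits, col_hits, well_hits = screen(positives)
    found = set()
    for tray in range(1, TRAYS + 1):
        rows = [r for t, r in row_hits if t == tray]
        cols = [c for t, c in col_hits if t == tray]
        candidates = {(r, c) for r in rows for c in cols}
        # Well pools are consulted only when row and column pools
        # leave more than one candidate in the tray.
        if len(candidates) > 1:
            candidates &= well_hits
        found |= {(tray, r, c) for r, c in candidates}
    return found


row_pcrs, col_pcrs, well_pcrs = pool_counts()
print(row_pcrs, col_pcrs, well_pcrs, row_pcrs + col_pcrs + well_pcrs)

single = {(3, "B", 7)}
double = {(5, "A", 1), (5, "B", 2)}
print(identify(single))
print(identify(double))
```

With one positive clone in a tray, the row and column pools are enough (200 PCRs). With two positives in the same tray, the row and column pools give four candidate wells, and the 96 well pools are needed to choose between them. Positives at the same well position in other trays could still leave ambiguity, which this sketch does not try to resolve.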
(iii) Clone fingerprinting. The two approaches described above for building clone contigs involve chromosome walking clone by clone. They are therefore slow and are suitable only when we need to walk from a mapped locus to a gene of interest that is no more than a few Mb away from the locus. These approaches have not been found suitable when clone contigs need to be assembled across the entire genome. STS content mapping is also labour intensive. Therefore, alternative methods have been used. One such method is clone fingerprinting, in which the fingerprints of two clones are compared and an overlap, if any, is inferred from similarity in a part of the banding pattern.
One or more of the following techniques are used for clone fingerprinting leading to the assembly of clone contigs: (i) The clones may be digested with restriction enzymes and electrophoresed separately, and the restriction profiles after ethidium bromide staining are compared; common bands suggest an overlap. (ii) The clones may be subjected to Southern blotting and hybridized with probes for genome-wide repeats. Hybridizing bands that are common suggest overlapping clones. (iii) Repetitive DNA-PCR may be conducted using primers that anneal to genome-wide repeats and amplify the single-copy DNA between two repeats. Similarity in the sizes of the PCR products may suggest overlaps and may be used for assembly of clone contigs. For instance, Alu-PCR of a human BAC carrying an insert of 150kb may give as many as ~40 PCR products, providing a detailed fingerprint. (iv) PCR may be conducted on each clone using primers for each individual STS, so that similar products suggest an overlap, since STS loci are unique sequences.
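The comparison of two fingerprints can be sketched as counting bands of matching size. The Python example below is a minimal illustration under stated assumptions: a fingerprint is a list of fragment sizes in bp, two bands match when their sizes differ by no more than a relative tolerance (2% here, an arbitrary choice), and matching is greedy. The function shared_bands and the fragment sizes are hypothetical.

```python
def shared_bands(fp_a, fp_b, tolerance=0.02):
    """Count bands of fp_a that match a band of fp_b.

    Two bands match when their sizes differ by at most `tolerance`
    (relative to the larger band). Matching is greedy and each band
    of fp_b can be used only once.
    """
    remaining = sorted(fp_b)
    shared = 0
    for size in sorted(fp_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[i]
                break
    return shared


# Hypothetical restriction fingerprints (fragment sizes in bp).
clone_a = [500, 1200, 2300, 4100, 7000]
clone_b = [1210, 2300, 4080, 9000]
clone_c = [650, 3000]

print(shared_bands(clone_a, clone_b))  # 3 matching bands
print(shared_bands(clone_a, clone_c))  # 0 matching bands
```

Here clone_a and clone_b share three bands and would be candidates for an overlap, while clone_a and clone_c share none. How many shared bands are needed before an overlap is accepted depends on the number of bands per fingerprint and is not specified in the text.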
Figure: Four different techniques for clone fingerprinting. (a) Restriction fingerprints; (b) repetitive DNA probing; (c) repetitive DNA PCR fingerprints; (d) STS content mapping.
AP-PCR (arbitrarily primed PCR), involving the use of random/arbitrary primers, may also be used to survey pooled BAC DNAs and individual BAC DNAs. Fingerprints of BACs that share many bands, provided these bands are not common to all clones, help in the identification of overlapping clones. The technique has been used for identifying overlapping BACs of rice at the International Rice Research Institute (IRRI), located at Manila, Philippines. Whole-genome physical maps showing the positions of overlapping BACs in rice and maize were published in 2001-2002.
(iv) The directed shotgun approach. It has been shown that for sequencing of a whole genome, clones totalling about ten times the length of the genome generally need to be sequenced if the conventional shotgun approach discussed above is used. This sequencing effort, adding up to ten times the genome size, will cover over 99.8% of the genome, leaving only a few gaps. These gaps can be closed by one of the methods developed during the H. influenzae project discussed earlier. For human beings, as an illustration, if 70 million individual clones, each 500bp long, are sequenced, this will give a total of 35,000 Mb of sequence, amounting to ten times the genome size of 3,500 Mb.
This should be sufficient to get the whole genome sequence duly assembled, if random clones are used for sequencing, without the help of a map. New automatic sequencers like the MegaBACE 4000, which became available during 2001, can sequence up to 1-2 Mb per day, so that with 100 such machines the task can be achieved within one year. Earlier, in 1998, with 70 ABI 3700 machines, each machine doing 1000 clones (= 0.5 Mb) every day, the task was estimated to take three years.
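The figures quoted above can be checked with simple arithmetic. The Python sketch below uses only the numbers given in the text and assumes a genome size of 3,500 Mb (one tenth of the 35,000 Mb total) and machines running every day without downtime. The function days_needed is a hypothetical helper.

```python
GENOME_MB = 3_500       # assumed human genome size in Mb
READS = 70_000_000      # number of clones sequenced
READ_BP = 500           # sequence obtained per clone, in bp

total_mb = READS * READ_BP / 1_000_000
coverage = total_mb / GENOME_MB


def days_needed(total_mb, machines, mb_per_machine_per_day):
    """Days needed if every machine runs at the stated daily output."""
    return total_mb / (machines * mb_per_machine_per_day)


abi_days = days_needed(total_mb, machines=70, mb_per_machine_per_day=0.5)
mega_days_slow = days_needed(total_mb, machines=100, mb_per_machine_per_day=1.0)
mega_days_fast = days_needed(total_mb, machines=100, mb_per_machine_per_day=2.0)

print(total_mb, coverage)              # 35000.0 Mb, 10.0-fold
print(abi_days, abi_days / 365)        # 1000.0 days, about 2.7 years
print(mega_days_fast, mega_days_slow)  # 175.0 to 350.0 days
```

The results agree with the estimates in the text: roughly three years with 70 machines at 0.5 Mb per day, and under one year with 100 machines at 1-2 Mb per day.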
However, it was soon realized that the above 70 million sequences (each 500bp), generated through conventional shotgun sequencing, cannot be assembled correctly into a whole genome sequence without any reference to a physical map. One of the reasons is the presence of extensive repetitive sequences. The problem was resolved by using the “directed shotgun” approach, in which the 70 million clones to be sequenced were chosen as follows: (i) 60 million clones, each carrying a ~2kb insert in a plasmid vector, and (ii) 10 million clones, each containing a ~10kb insert and cloned in another vector (some sequences are difficult to clone in the above plasmid vectors).
The library of 10kb inserts was needed because a 10kb insert can contain an entire repeat (the average size of a repeat being 5kb), while a 2kb insert will contain only a part of it. The use of 2kb inserts alone can lead to sequence jumps during assembly, which can be avoided if the positions of the two ends of each 10kb insert are checked in the assembled sequence. In the Human Genome Project, STS map data as well as the physical positions of 300,000 BAC clones and their end sequences facilitated the assembly of the whole genome sequence using the shotgun approach.
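The check on the two ends of a 10kb insert can be sketched as follows. This is a minimal Python illustration, not the procedure actually used in any genome project. It assumes that each end sequence has been placed at a position on an assembled contig, and it flags clones whose two ends lie much closer together or much further apart than the insert size. The function check_end_pairs, the 20% slack and the example placements are hypothetical.

```python
def check_end_pairs(placements, insert_bp=10_000, slack=0.2):
    """Return clones whose end sequences are inconsistently placed.

    placements maps a clone name to ((contig, pos), (contig, pos)),
    the assembled positions of its two end sequences.
    """
    flagged = []
    for clone in sorted(placements):
        (contig_a, pos_a), (contig_b, pos_b) = placements[clone]
        if contig_a != contig_b:
            # Ends on different contigs need review; they may simply
            # bridge a gap rather than indicate a misassembly.
            flagged.append(clone)
            continue
        span = abs(pos_b - pos_a)
        if abs(span - insert_bp) > slack * insert_bp:
            flagged.append(clone)
    return flagged


# Hypothetical placements of end sequences in an assembly.
placements = {
    "c1": (("ctg1", 1_000), ("ctg1", 10_900)),  # ~9.9kb apart: consistent
    "c2": (("ctg1", 5_000), ("ctg1", 8_000)),   # 3kb apart: possible jump
    "c3": (("ctg1", 2_000), ("ctg2", 400)),     # ends on different contigs
}

print(check_end_pairs(placements))
```

A clone such as c2, whose ends are only 3kb apart in the assembly, would suggest that sequence between them was skipped, for example by joining two copies of a repeat.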