THE POLYMERASE CHAIN REACTION
(PCR)
1. IN VITRO DNA CLONING BY PCR
The polymerase chain reaction (PCR) consists of a rapid
procedure for the in vitro enzymatic amplification of a specific
segment of DNA (Mullis and
Faloona, 1987). This powerful technique was invented by
Kary Mullis, who conceived it in 1983 and for which he
received the Robert Koch Prize as well as the Nobel
Prize in Chemistry in 1993. Like in vivo molecular cloning,
PCR has paved the way for a wide variety of previously
unthinkable experiments.
PCR applications are extremely numerous and their number is still
growing, particularly in the field of clinical diagnostics. They
include direct cloning from genomic DNA or cDNA, in vitro
mutagenesis and DNA engineering, genetic fingerprinting of
samples analyzed in the context of medico-legal diagnostics,
tests for the presence of infectious agents, diagnosis of
genetic diseases, analysis of variations of allelic sequences,
analysis of the structure of RNA transcripts, genomic
footprinting, and direct nucleotide sequencing starting from
genomic DNA and cDNA. Since the applications are so
numerous, it is desirable to optimize the conditions of
this technique so as to guarantee consistent
yields.
PCR is the most commonly used
method in molecular biology, thanks to its particular sensitivity,
specificity and operational simplicity. The
correct application of this powerful diagnostic tool requires a
solid knowledge of the basic principles of nucleic acid
manipulation procedures, and of the possible drawbacks and
pitfalls encountered during the analytical phases that lead to
the final result.
In order to carry out a PCR
reaction, a small amount of target DNA is added to a buffer
solution containing the enzyme DNA polymerase, two appropriate
oligonucleotides (called primers), the four
deoxynucleotide triphosphates (dNTPs) that make up the DNA and
the cofactor MgCl2 (magnesium chloride). The primers
are designed to delimit the target region to be amplified, so
the ends of the sequence must be known in sufficient detail.
Each primer binds to one of the two DNA strands so that their 3´
ends are directed towards the centre of the region to be
amplified. In the case of transcribed gene sequences, the "sense
primer", whose sequence is identical to that of the RNA, will bind to the
"antisense" DNA strand (or "template strand"). On the other
hand, the "antisense primer" will have a sequence
complementary to that of the mRNA and
will bind to the "sense" DNA strand (or "coding strand").
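The orientation rule just described can be illustrated with a minimal Python sketch. It assumes that the target region is given as the sequence of its coding strand written 5´-3´; the function names, the fixed primer length and the example sequence are purely illustrative and do not come from any protocol.

```python
# Illustrative sketch: derive a primer pair from the sense (coding) strand.
# The example sequence and the primer length are made-up values.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(coding_strand: str, length: int = 20) -> tuple[str, str]:
    """Return (sense_primer, antisense_primer), both written 5'->3'.

    The sense primer copies the 5' end of the coding strand and therefore
    anneals to the template strand; the antisense primer is the reverse
    complement of the 3' end of the coding strand and anneals to the
    coding strand. Their 3' ends point towards each other.
    """
    if len(coding_strand) < 2 * length:
        raise ValueError("target region is shorter than two primers")
    sense = coding_strand[:length].upper()
    antisense = reverse_complement(coding_strand[-length:])
    return sense, antisense


target = "ATGGCCAAGTTGACCAGTGCCGTTCCGGTGCTCACCGCGCGCGACGTCGCCGGAGCGGTCGAG"
fwd, rev = primer_pair(target, length=20)
print("sense primer     5'-" + fwd + "-3'")
print("antisense primer 5'-" + rev + "-3'")
```

A real design would of course also apply the criteria on length, GC content and uniqueness discussed in section 2.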
The PCR procedure has been
automated thanks to two important innovations. The original
PCR procedure involved the use of the Klenow fragment of E.
coli DNA polymerase I, a thermolabile enzyme with optimal activity
at a temperature of 37°C, which was inactivated during the
denaturation phases of PCR. The discovery of thermostable
polymerases resistant to repeated denaturation phases and
with an optimal activity at 72°C (Taq DNA polymerase; Tth DNA polymerase; Pwo DNA polymerase; Vent DNA polymerase) eliminated
the need to add an aliquot of the enzyme to the
reaction mixture after each stage of amplification. The most
used is Taq DNA polymerase, isolated from the bacterium Thermus
aquaticus, capable of withstanding very high temperatures (97°C)
and capable of operating up to temperatures of 65°C.
It should be kept in mind that
natural polymerases typically combine different activities
in the same enzyme:
5´-3´ DNA-dependent DNA polymerase,
responsible for the template-dependent synthesis of a
complementary DNA strand;
5´-3´ DNA exonuclease, able to
remove DNA strands paired to the template in order to proceed
with the DNA synthesis;
3´-5´ DNA exonuclease, which
allows a step back to remove a mismatched nucleotide wrongly
incorporated and to restore the correct one. This activity is
absent in Taq polymerase but is present in the DNA polymerases
of some other thermophiles, such as the archaeon Pyrococcus furiosus (Pfu
polymerase). Pfu polymerase can thus increase the
fidelity of the amplicon sequence, at the cost of a reduced
yield of the product.
In addition, PCR is performed in
instruments (thermal cyclers) in which a computer controls the
repeated changes in temperature and the duration of the
respective phases.
Once prepared, the mixture is
subjected to various cycles, through temperatures that allow the
denaturation of the double-stranded DNA (91-96°C), the pairing
of the primers with the target (50-65°C) and the synthesis of
DNA by the polymerase (72°C) in order to amplify a product of
predefined size and sequence exponentially. In addition to
ensuring the automation of PCR, Taq polymerase has also led to
other advantages. In fact, this enzyme, compared to
non-thermostable polymerases, allows the use of higher
temperatures in the pairing and extension phases, leading to an
increase of the stringency during the hybridization between
primer and template and therefore of the specificity of
the amplification product. This, in turn, results in an
increase in the yield of the desired product.
The amplification reaction involves
three phases:
1. Denaturation: the
double-stranded DNA is denatured at a temperature of about 95°C,
i.e. the two strands are separated, converting it into
single-stranded DNA.
2. Pairing (annealing): the oligonucleotide primers complementary
to the two ends of the sequence to be amplified
hybridize with the two denatured strands, at a temperature
that is approximately 2-5°C lower than the melting temperature
(Tm) of the primers themselves; their sequence is
oriented so as to guide DNA polymerization in the 5´-3´
direction across the stretch between the two regions to which they
anneal.
The first part of this phase, during which the primers explore
the entire starting DNA looking for the complementary sequences
with which to pair, is often called the screening phase.
3. Extension: this phase, at a temperature higher than
the previous one, involves the binding of the DNA polymerase to the
site where the primer is annealed, after which the enzyme extends
the new DNA strand starting from the 3´ end made available
by the primer and using the free dNTPs in solution. The
primers are each extended in the direction of the other but on
two different complementary strands, leading to the synthesis of
two double-stranded DNA molecules, copies of the target region
delimited by the primers.
The first cycle is characterized by products of indeterminate
length, which accumulate linearly with each subsequent
cycle, i.e. the quantity present is linearly proportional
to the number of cycles carried out. From the second cycle
onwards the first "short products" are produced, i.e. products delimited
by the 5´ ends of the two primers, whose amount grows
exponentially with each subsequent amplification cycle.
This growth can amplify the discrete fragment a few million
times over 20-30 cycles.
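The two growth laws can be made explicit with a small bookkeeping sketch in Python. It assumes a perfectly efficient reaction and a single double-stranded template molecule at the start (both simplifications), and counts single strands rather than duplexes; the function name and the strand labels are illustrative.

```python
# Illustrative bookkeeping of strand types during ideal PCR (efficiency = 1).
# Assumption: one double-stranded template molecule (two strands) at the start.


def simulate(cycles: int) -> list[tuple[int, int, int, int]]:
    """Return (cycle, original, long, short) strand counts after each cycle.

    original: strands of the starting template
    long:     strands with one primer-defined 5' end and an indeterminate 3' end
    short:    strands delimited at both ends by the primers
    """
    original, long_, short = 2, 0, 0
    history = []
    for cycle in range(1, cycles + 1):
        new_long = original        # each original strand templates one long product
        new_short = long_ + short  # long and short strands both template short products
        long_ += new_long
        short += new_short
        history.append((cycle, original, long_, short))
    return history


print("cycle original long short")
for cycle, original, long_, short in simulate(5):
    print(cycle, original, long_, short)
```

In this idealized model the long products grow by two strands per cycle, while the short products first appear in the second cycle and then roughly double at every cycle.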
The crucial chemical variable is the net synthesis of the
product during the various cycles; due to this synthesis, the
molecular balance between product, template, DNA polymerase,
primer and deoxynucleotides changes with each cycle. As the
amplification product accumulates, all the enzyme present is
engaged during the extension phase, and the ratio of
primer to template decreases, promoting the self-annealing of
the product strands. Furthermore, since the amplification
product is much longer than the primers, such annealing
can already occur at temperatures much higher than those at which
the primers pair with the template. It will therefore inevitably
tend to occur during the cooling of the reaction mixture after
the denaturation phase. When this reannealing becomes
significant or when the quantity of enzyme is limiting,
the reaction reaches saturation, and exponential
growth ceases; the plateau phase is reached, and
no more product is synthesized.
2. THE PCR PARAMETERS
Many important variables can affect the outcome of PCR.
Primers. The oligonucleotides that act as primers
must be designed following well-defined criteria, which
guarantee maximum pairing efficiency with the template DNA. The
two members of the primer pair may be called in different ways:
Sense primer                      Antisense primer
Direct primer, Forward primer     Reverse primer, Back primer
Left primer                       Right primer
"Bottom" primer                   "Top" primer
Primer 1                          Primer 2
The two regions on the template DNA
that will be the target for the primers should not be more
than about 1.5 kb apart, otherwise the DNA polymerase may not
efficiently complete the
synthesis. Keeping the size of the amplification product (amplicon)
below 700 bp makes it possible to obtain its
whole sequence by standard Sanger sequencing.
When dealing with amplification
from RNA (following its reverse transcription into DNA;
RT-PCR), it is critical that the two primers are designed to
anneal to two different exons, to clearly distinguish
amplicons derived from RNA (with the size expected following the
removal of the intervening intron sequence) from those derived from
contaminating DNA.
"Biochemical" parameters
Any possible complementarity between the "forward" and "reverse"
primer sequence must be avoided, in particular at the 3´ end,
because it could cause the formation of a considerable quantity
of primer dimers which reduce the yield of the desired
product; primers with palindromic sequences and with extended
secondary structures (due to self-complementarity) must also be
avoided.
It is advisable to avoid T or A bases at the 3´ end
because they give a less stable pairing at the
critical site where extension begins. "GC clamp" refers to the presence
of one or more G or C bases at the 3´ end of the primer.
The dissociation temperature of the two primers, which
basically depends on their length and on their content of C and G,
must be roughly the same. From the structural point of view, an
ideal primer must have a length between 18 and 28 nucleotides
and a G+C composition (GC content) between 50% and 60%.
These conditions guarantee a Tm between 50 and 80°C
according to the simplified formula for determining the Tm
from the numbers of G, C, A and T bases in the primer:
Tm (°C) = 4 (G + C) + 2 (A + T)
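As a minimal sketch, the simplified formula can be applied in a few lines of Python. The function names and the example primer are illustrative assumptions; more refined models of Tm exist and may give different values.

```python
# Sketch of the simplified rule Tm = 4 (G + C) + 2 (A + T), result in °C.
# The example primer is a made-up sequence.


def wallace_tm(primer: str) -> int:
    """Tm in °C from the counts of G/C and A/T bases."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("primer contains characters other than A, C, G, T")
    return 4 * gc + 2 * at


def gc_content(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


primer = "AGCTGACCTGAAGTCCATGC"  # 20-mer with 11 G/C and 9 A/T bases
print(len(primer), "nt, GC content", gc_content(primer), ", Tm", wallace_tm(primer), "°C")
```

For this 20-mer the rule gives 4 × 11 + 2 × 9 = 62°C.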
As for the concentration of
primers, this must be the same for both, and 0.1-0.4 μM is
recommended for each (2.5-10 pmol in 25 μL of the reaction
mixture). This concentration ensures that the excess of primer
with respect to the template remains essentially constant.
Higher concentrations, such as the 3 μM specified
in earlier protocols, are not necessary and are also disadvantageous, as they
lead to the formation of primer dimers and to an increased
probability of pairing errors.
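The correspondence between concentration and amount follows from the fact that 1 μM equals 1 pmol per μL. The short Python sketch below reproduces the figures quoted above; the function names and the 10 μM stock concentration used in the example are hypothetical.

```python
# Sketch: primer amounts for a reaction mix (1 μM = 1 pmol/μL).
# The 10 μM stock used in the example is a hypothetical value.


def primer_pmol(conc_um: float, reaction_ul: float) -> float:
    """Amount of primer in pmol for a given final concentration and volume."""
    return conc_um * reaction_ul


def stock_volume_ul(final_um: float, reaction_ul: float, stock_um: float) -> float:
    """Volume of primer stock to add, from C1 * V1 = C2 * V2."""
    if stock_um <= final_um:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_um * reaction_ul / stock_um


for conc in (0.1, 0.4):
    print(conc, "μM in 25 μL =", primer_pmol(conc, 25), "pmol")
print("10 μM stock for 0.2 μM final:", stock_volume_ul(0.2, 25, 10), "μL")
```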
"Biological" parameters
A fundamental characteristic of the primers must be the
uniqueness of their sequence with respect to a complex
eukaryotic genome, i.e. the guarantee that the probability of
hybridising with sequences other than the desired one is
extremely low. This prerogative is generally guaranteed by their
length; in fact, a size of 18 or more nucleotides usually
ensures the uniqueness of the sequence with respect to the
genome.
Repeated sequences, or consecutive
runs of the same base, are to be avoided.
In any case, uniqueness must be checked against sequence databases,
verifying in particular that the
oligonucleotides do not include sequences homologous to Alu repeats or mitochondrial
DNA. Nowadays, numerous DNA sequence analysis software tools
include menus for primer design, and some are available on the web (e.g. Primer-BLAST).
Magnesium Chloride. One of the key variables of PCR
is the concentration of Mg++ ions, usually
between 0.5 and 2.5 mM, which plays an important role both in
terms of yield and specificity. It affects the reaction
differently at high and low concentrations: excessively high
concentrations of Mg++
stabilise the double-stranded DNA and hinder the complete
denaturation of the product at each cycle, reducing the yield.
Furthermore, an excess of Mg++ can also
stabilise incorrect pairing between primer and template DNA,
leading to an accumulation of unwanted product and therefore to
low specificity. On the other hand, too low concentrations of
magnesium (less than 0.5 mM) impair the extension phase, since Mg++
is required as a cofactor for the enzymatic activity
of most DNA polymerases. In addition, some Mg++
ions form soluble complexes with the dNTPs in the reaction
mixture. There is therefore an optimal ionic concentration
in terms of yield and specificity.
Buffer. There are numerous buffers used in PCR, but the
choice must be made according to the reaction conditions:
characteristics of the target DNA and primers, as well as the
reaction cycle. A 50 mM Tris-HCl buffer with a pH between 8.3
and 9 at 25°C is generally used (the pH of the Tris decreases
with increasing temperature). To facilitate the pairing of the
primers to the denatured DNA, up to 50 mM of KCl can be added to
the reaction mixture. At concentrations above 50 mM, however,
KCl can inhibit Taq polymerase.
Deoxynucleotide triphosphates (dNTPs). In most
technical manuals, a concentration of dNTPs of 200 μM each is
recommended, which guarantees an excess for the
duration of the reaction. This concentration is sufficient because
the stability of the dNTPs during repeated PCR cycles is such
that approximately 50% of them remain after 50 cycles. dNTPs act
as chelators of magnesium ions, changing the optimal magnesium
concentration. Moreover, a concentration of dNTPs greater than 200
μM leads to an increase in the incidence of errors by the
polymerase; millimolar concentrations of dNTPs completely
inhibit the enzyme. Low concentrations of deoxynucleotides
minimise phenomena of incorrect pairing towards sites other than
the target, leading to advantages in terms of specificity and
accuracy (fidelity).
It is particularly important that
the concentrations of the four deoxynucleotides are equivalent
to prevent incorporation errors by the polymerase.
Polymerase. Thermostable DNA polymerases, like other
DNA polymerases, catalyze the synthesis of DNA from nucleoside
triphosphates, starting from a primer that has a free 3´ hydroxyl group. In
general, they have maximum catalytic activity between 75°C and
80°C and substantially reduced activity at lower temperatures
due to the change in pH. At 37°C, Taq polymerase has only about
10% of its maximum activity. Taq polymerase has a half-life that
progressively decreases with increasing temperature (greater
than 2 hours at 92.5°C, 40 minutes at 95°C and 5 minutes at
97.5°C).
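To get a feeling for what these half-lives mean over a whole run, the following Python sketch estimates the fraction of enzyme activity left after repeated denaturation steps. It assumes simple first-order (exponential) inactivation and counts only the time spent at the denaturation temperature, ignoring ramp times; both are simplifying assumptions, and the function name is illustrative.

```python
# Half-lives of Taq polymerase quoted in the text, in minutes.
# The text gives "greater than 2 hours" at 92.5 °C; 120 min is used here
# as a lower bound, so the estimate for that temperature is pessimistic.
HALF_LIFE_MIN = {92.5: 120.0, 95.0: 40.0, 97.5: 5.0}


def remaining_activity(temp_c: float, minutes_per_cycle: float, cycles: int) -> float:
    """Fraction of enzyme activity left, assuming first-order inactivation."""
    half_life = HALF_LIFE_MIN[temp_c]
    total_minutes = minutes_per_cycle * cycles
    return 0.5 ** (total_minutes / half_life)


for temp_c in sorted(HALF_LIFE_MIN):
    fraction = remaining_activity(temp_c, minutes_per_cycle=1.0, cycles=30)
    print(f"{temp_c:5.1f} °C: {fraction:.1%} of the initial activity after 30 cycles")
```

Under these assumptions, 30 cycles with 1 min of denaturation at 97.5°C correspond to six half-lives, which is one reason why denaturation temperatures close to 94-95°C are preferred.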
Polymerases that lack 3´-5´ exonuclease activity generally have
higher error rates than enzymes that exhibit this activity. The
total error rate of Taq polymerase is reported to lie
between 1 × 10^-4 and 2 × 10^-5 errors
per base pair.
As for the quantity, an excess of the enzyme could synthesize
DNA from spurious interactions between primer and template. The
Taq concentration suggested in most protocols is 0.5 unit (U)
per 25 μL of reaction. This limiting concentration is required
to control the specificity of the amplification reaction.
Concentrations higher than 2.5 nM (1.25 U per 25 μL of reaction)
must be avoided because they increase the efficiency of the
PCR only up to a certain point. Beyond it, such a high amount
of enzyme can increase the yield of non-specific products, at
the expense of the product of interest. In addition, the cost of
the enzyme should not be underestimated.
DNA template. All the parameters discussed so far
assume a total amount of human genomic DNA of 125 ng per 25 μL of
reaction. The accuracy of the amplification depends on the DNA
polymerase error rate, but it also depends on the number of
initial copies of the template DNA.
It is not necessary for the sequence to be synthesized
enzymatically to be present initially in pure form; it can also
represent a small fraction in a complex mixture, such as the
segment of a single copy gene in the total human genome. The
sequence to be synthesized may initially be present as a
discrete molecule or may be part of a larger molecule. In both
cases, the reaction product will be a discrete double-stranded
DNA molecule with ends corresponding to the 5´ end of the
oligomers used.
Even relatively degraded DNA preparations can serve as useful
templates for generating products of moderate length. The most
important conditions concern purity and quantity. A large number
of contaminants found in DNA preparations can reduce PCR
efficiency. These include heme group, urea, SDS detergent,
sodium acetate and, sometimes, residual components of
purification from agarose gel.
Of course, the amount of template DNA must be sufficient to
allow visualization of the PCR product with ethidium bromide
(EtBr). Usually, 100 ng of genomic DNA is sufficient to reveal a
PCR product from a single copy mammalian gene. Using too much
template DNA might reduce the efficiency of the PCR reaction,
both because MgCl2 and other parameters are already
optimized, and because of the greater quantity of contaminants
that are introduced into the reaction mixture.
The profile of the amplification cycles is also essential for
carrying out the reaction.
Number of cycles. 30 amplification cycles are recommended
for titration, but in some cases, if the yield is very poor,
additional cycles are required. The number of cycles will depend
on the molar ratio between the primer and the template. For a
ratio of 10^6-10^7, 30 cycles should be
adequate to generate sufficient material for viewing on an ethidium
bromide-stained gel. For ratios higher or lower by two orders of
magnitude, this value is respectively increased or decreased by
10 cycles.
Denaturation phase. For target sequences of 1 kb or less,
denature for 1 min at 94°C. For larger fragments, about one
minute is added to the denaturation time for each additional
kilobase. A long denaturation phase is particularly important at
the beginning of the PCR, to ensure total denaturation of the
starting DNA. For GC-rich regions of the genomic DNA
template, this phase is essential.
Annealing phase. This phase is critical for the
specificity of amplification. As a general rule, the
pairing temperature (Ta) is set about 2-5°C lower
than the Tm of the two primers used. If the pairing
temperature is too low, one or both
primers can hybridize with sequences other than the target
sequences (non-specific amplification),
which causes a drop in the
yield of the desired product; conversely, too high a
temperature reduces the hybridization of the
primers to the target DNA and therefore the yield. The
suggested time for pairing is 1 min. However, it can be reduced
to 30 sec in the case of small reaction volumes (10-25 μL), which
quickly reach equilibrium at low temperatures.
It is possible to use the same
temperature profile to amplify different loci, even if the
calculated pairing temperatures differ. Templates
for which the calculated pairing temperature is higher than
that of the standard profile may require a lower concentration of Mg
ions. Conversely, those with lower calculated pairing temperatures
may require a greater quantity of Mg ions to compensate for the
too stringent temperature.
Extension phase. Taq polymerase is highly processive and
extends 2-4 kb per min (35-70 bases per sec). However, this
rate is strongly affected by the buffer pH, the salt
concentration of the medium, as well as the nature of the target
DNA. In general, for targets of up to 1 kb, 1 min is
sufficient for the extension phase. Indeed, the ramp time
from the pairing temperature to the denaturation temperature is
generally long enough to allow the complete synthesis of a
target of 500 bp. For fragments greater than 1 kb, about 1 min
should be added to the extension phase for each additional kb.
It is important to note that for 20-base primers with a GC
content of 60% or higher, the Tm is in the range of
temperatures at which the enzyme performs an efficient
extension. Therefore, in this condition, the annealing and
polymerization phases can be combined in a single step (annealing-extension).
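The timing and temperature rules given for the three phases can be collected in a small Python sketch that builds a cycling profile. Several choices in it are assumptions made only for illustration: each started kilobase above 1 kb is rounded up to a full minute, the pairing temperature is taken 5°C below the lower of the two primer Tm values, and the function names and data layout are hypothetical.

```python
import math

DENATURATION_C = 94.0  # denaturation temperature quoted in the text
EXTENSION_C = 72.0     # extension temperature quoted in the text for Taq


def step_minutes(target_bp: int) -> float:
    """1 min for targets up to 1 kb, plus 1 min per additional kb.

    Rounding each started kilobase up to a full minute is an assumption.
    """
    if target_bp <= 0:
        raise ValueError("target size must be positive")
    extra_kb = max(0, math.ceil((target_bp - 1000) / 1000))
    return 1.0 + extra_kb


def annealing_temperature(tm_forward: float, tm_reverse: float,
                          offset: float = 5.0) -> float:
    """Ta set `offset` °C (2-5 in the text) below the lower primer Tm."""
    return min(tm_forward, tm_reverse) - offset


def cycling_profile(target_bp: int, tm_forward: float, tm_reverse: float,
                    cycles: int = 30) -> dict:
    """Return the cycle count and a list of (phase, °C, minutes) steps."""
    minutes = step_minutes(target_bp)
    steps = [
        ("denaturation", DENATURATION_C, minutes),
        ("annealing", annealing_temperature(tm_forward, tm_reverse), 1.0),
        ("extension", EXTENSION_C, minutes),
    ]
    return {"cycles": cycles, "steps": steps}


profile = cycling_profile(target_bp=2500, tm_forward=62, tm_reverse=60)
for phase, temp_c, minutes in profile["steps"]:
    print(f"{phase:<13}{temp_c:5.1f} °C  {minutes:.0f} min")
```

Such a profile is only a starting point; the text stresses that the conditions must be optimized for each target.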
3. PCR EFFICIENCY
The combination of all the parameters discussed so far
ultimately determines the efficiency of the PCR
reaction. A basic equation of the polymerase chain reaction
describes the trend of amplification (Mullis, 1991):
Y_n = (1 + e)^n
where Y_n is the amplification factor after n cycles (although,
for greater precision, n should be taken as the number of
cycles minus one, given that the
"short fragments" of interest appear, as previously
discussed, only from the second cycle onwards) and e represents
the efficiency of the reaction, which takes values between 0
and 1.
For example, suppose that the starting DNA is made up of 3 × 10^5
(henceforth the 3E5 notation will be used) copies of the human
genome. The observation of a band on a standard gel would require
5E10 molecules of 100 bp length in an aliquot of 10 μL (Mullis,
1991), i.e. 5E11 molecules in a 100 μL reaction. A typical PCR could
start from 3E5 copies of the human genome in a volume of 100 μL and
aim to amplify a 100 bp long fragment of a single copy
human gene. With the formula described above, it is therefore
possible to calculate the number of cycles necessary to amplify
the desired segment and obtain a visible band on gel:
Y_n = 5 × 10^11 / 300,000 = 1.67 × 10^6
So, optimistically assuming an efficiency equal to 1, we obtain:
n = log2(1.67 × 10^6) = 20.7 cycles
If we assume that the reaction took place perfectly, in the last
cycle 2.5 × 10^11 DNA molecules would be synthesized,
which would, therefore, require 2.5 × 10^11 Taq
molecules, corresponding to about 10 units.
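The same calculation can be repeated for other efficiencies by inverting the amplification formula, n = log(Y) / log(1 + e). The Python sketch below does this for the example above; the function name is illustrative and the efficiency of 0.8 is an arbitrary value chosen only for comparison.

```python
import math


def cycles_needed(amplification: float, efficiency: float = 1.0) -> float:
    """Number of cycles n such that (1 + efficiency) ** n equals amplification."""
    if amplification < 1:
        raise ValueError("amplification factor must be at least 1")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return math.log(amplification) / math.log(1 + efficiency)


required = 5e11 / 3e5  # molecules needed / starting copies, as in the example
print(round(cycles_needed(required), 1))        # 20.7 with efficiency 1
print(round(cycles_needed(required, 0.8), 1))   # more cycles at lower efficiency
```

Even a moderate loss of efficiency therefore adds several cycles to the number required to reach the same amount of product.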