THE POLYMERASE CHAIN REACTION (PCR)

1. IN VITRO DNA CLONING BY PCR

The polymerase chain reaction (PCR) is a rapid procedure for the in vitro enzymatic amplification of a specific segment of DNA (Mullis and Faloona, 1987). This powerful technique was invented by Kary Mullis, who conceived it in 1983 and received for it the Robert Koch Prize as well as the 1993 Nobel Prize in Chemistry. Like in vivo molecular cloning, PCR has paved the way for a wide variety of previously unthinkable experiments.

The number of PCR applications seems limitless and is still growing, particularly in the field of clinical diagnostics. They include direct cloning from genomic DNA or cDNA, in vitro mutagenesis and DNA engineering, genetic fingerprinting of samples analyzed in the context of medico-legal diagnostics, tests for the presence of infectious agents, diagnosis of genetic diseases, analysis of variations of allelic sequences, analysis of the structure of RNA transcripts, genomic footprinting, and direct nucleotide sequencing starting from genomic DNA and cDNA. Since the applications are so numerous, it is desirable to optimize the conditions of this technique so as to guarantee consistent yields.

PCR is the most commonly used method in molecular biology, thanks to its particular sensitivity, specificity and operational simplicity. The correct application of this powerful diagnostic tool requires a solid knowledge of the basic principles of nucleic acid manipulation procedures, and of the possible drawbacks and dangers encountered during the analytical phases that lead to the result using the PCR technique.

In order to carry out a PCR reaction, a small amount of target DNA is added to a buffer solution containing the enzyme DNA polymerase, two appropriate oligonucleotides (called primers), the four deoxynucleotide triphosphates (dNTPs) that make up the DNA and the cofactor MgCl2 (magnesium chloride). The primers are designed to delimit the target region to be amplified, so the ends of the sequence must be known in sufficient detail. Each primer binds to one of the two DNA strands so that their 3´ ends are directed towards the centre of the region to be amplified. In the case of transcribed gene sequences, the "sense primer", whose sequence is equal to that of the RNA, will bind to the "antisense" DNA strand (or "template strand"). The "antisense primer", on the other hand, will have a sequence complementary to that of the mRNA and will bind to the "sense" DNA strand (or "coding strand").

The PCR procedure has been automated thanks to two important innovations. The original PCR procedure used the Klenow fragment of E. coli DNA polymerase I, a thermolabile enzyme with optimal activity at 37°C, which was inactivated during the denaturation phases of PCR. The discovery of thermostable polymerases resistant to repeated denaturation phases and with an optimal activity at 72°C (Taq DNA polymerase; Tth DNA polymerase; Pwo DNA polymerase; Vent DNA polymerase) eliminated the need to add an aliquot of the enzyme to the reaction mixture after each amplification cycle. The most widely used is Taq DNA polymerase, isolated from the bacterium Thermus aquaticus, capable of withstanding very high temperatures (97°C) and capable of operating up to temperatures of 65°C.

It should be kept in mind that natural polymerases typically have several different activities associated in the same enzyme:
5´-3´ DNA-dependent DNA polymerase, responsible for the template-dependent synthesis of a complementary DNA strand;
5´-3´ DNA exonuclease, able to remove DNA strands paired to the template in order to proceed with the DNA synthesis;
3´-5´ DNA exonuclease, which allows a step back to remove a wrongly incorporated, mismatched nucleotide and to restore the correct one. This activity is absent in Taq polymerase but is present in the DNA polymerases of some other microorganisms, such as the archaeon Pyrococcus furiosus (Pfu polymerase). Pfu polymerase can thus increase the fidelity of the amplicon sequence, at the cost of a reduced yield of the product.

In addition, PCR is performed in machines (thermal cyclers) in which a computer controls the repeated changes in temperature and the duration of the respective phases.
Once prepared, the mixture is subjected to repeated cycles through temperatures that allow the denaturation of the double-stranded DNA (91-96°C), the pairing of the primers with the target (50-65°C) and the synthesis of DNA by the polymerase (72°C), in order to exponentially amplify a product of predefined size and sequence. In addition to making the automation of PCR possible, Taq polymerase has also brought other advantages. Compared to non-thermostable polymerases, this enzyme allows the use of higher temperatures in the pairing and extension phases, which increases the stringency of the hybridization between primer and template and therefore the specificity of the amplification product. This in turn increases the yield of the desired product.

The amplification reaction involves three phases:
(Animation of PCR at https://dnalc.cshl.edu/resources/animations/pcr.html)

1. Denaturation: the double-stranded DNA is denatured at a temperature of about 95°C, i.e. the two strands are separated, and it is converted into single-chain DNA.
2. Pairing (annealing): the oligonucleotide primers complementary to the two sides of the sequence to be amplified hybridize with the two denatured strands, at a temperature that is approximately 2-5°C lower than the melting temperature (Tm) of the primers themselves; their sequence is oriented so as to guide DNA polymerization in the 5´-3´ direction in the stretch between the two regions to which they bind.
The first part of this phase, during which the primers explore the entire starting DNA looking for the homologous sequences with which to pair, is often called the screening phase.
3. Extension: this phase, at a temperature higher than the previous one, involves anchoring the DNA polymerase to the site where the primer is bound, after which the enzyme extends the target DNA strand starting from the 3´ end made available from the primer and using the free dNTPs in solution. The primers are extended each in the direction of the other but on two different complementary chains leading to the synthesis of two double-stranded DNA molecules, copies of the target region delimited by the primers.

The first cycle is characterized by products of indeterminate length, which tend to accumulate linearly with each subsequent cycle, i.e. the quantity present will be linearly proportional to the number of cycles carried out. From the second cycle onwards the first "short products" are produced, i.e. products delimited by the 5´ ends of the two primers, whose growth follows an exponential trend with each subsequent amplification cycle. This growth can lead to an amplification of the discrete fragment by a few million times over 20-30 cycles.
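The difference between linear and exponential accumulation can be made concrete with a small bookkeeping sketch. It is only an idealized model, assuming that every strand is copied exactly once per cycle (efficiency equal to 1) and that the reaction starts from a single double-stranded template; the function name and the three strand categories are illustrative choices, not part of the cited literature.

```python
def strand_counts(cycles, templates=1):
    """Idealized strand bookkeeping for PCR at 100% efficiency.

    Returns a list of (cycle, original, long, short) strand counts.
    - original: strands of the starting DNA
    - long:     copies of original strands (defined 5' end, indeterminate 3' end)
    - short:    strands delimited at both ends by the primers
    """
    original = 2 * templates  # two strands per double-stranded template
    long_ = 0
    short = 0
    history = [(0, original, long_, short)]
    for cycle in range(1, cycles + 1):
        # Copying a long or a short strand yields a short strand;
        # copying an original strand yields a new long strand.
        short = 2 * short + long_
        long_ = long_ + original
        history.append((cycle, original, long_, short))
    return history


for cycle, original, long_, short in strand_counts(5):
    print(cycle, original, long_, short)
```

In this model the long strands grow by two per cycle, while the short strands first appear at cycle 2 and quickly dominate the mixture.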

The crucial chemical variable is the net synthesis of the product during the various cycles; because of this synthesis, the balance between product, template, DNA polymerase, primers and deoxynucleotides changes with each cycle. As the amplification product accumulates, all the enzyme present is engaged during the extension phase, and the ratio between primer and template decreases, promoting the reannealing of the product strands with each other. Furthermore, since the amplification product is much longer than the primers, such reannealing can already occur at temperatures much higher than those at which the primers pair with the template. It will therefore inevitably tend to occur during the cooling of the reaction mixture after the denaturation phase. When this reannealing becomes significant, or when the quantity of enzyme becomes limiting, the reaction reaches saturation and exponential growth ceases; the plateau phase is reached, and no more product is synthesized.

2. THE PCR PARAMETERS

Many important variables can affect the outcome of PCR.

Primers. The oligonucleotides that act as primers must be designed following well-defined criteria, which guarantee maximum pairing efficiency with the template DNA. The two members of the primer pair may be called in different ways:

Sense primer
Direct primer, Forward primer
Left primer
"Bottom" primer
Primer 1

Antisense primer
Reverse primer, Back primer
Right primer
"Top" primer
Primer 2


The two regions on the template DNA that are targeted by the primers should not be more than about 1.5 kb apart, otherwise the DNA polymerase may not perform the synthesis efficiently. Keeping the size of the amplification product (amplicon) below 700 bp makes it possible to obtain its whole sequence by standard Sanger sequencing.
When dealing with amplification from RNA (following its reverse transcription into DNA; RT-PCR), it is critical that the two primers are designed to anneal to two different exons, so as to clearly distinguish amplicons derived from RNA (with the size expected after removal of the intervening intron sequence) from those derived from contaminating genomic DNA.

"Biochemical" parameters
Any possible complementarity between the "forward" and "reverse" primer sequence must be avoided, in particular at the 3´ end, because it could cause the formation of a considerable quantity of primer dimers which reduce the yield of the desired product; primers with palindromic sequences and with extended secondary structures (due to self-complementarity) must also be avoided.
It is advisable to avoid T or A bases at the 3´ end, because they give a less stable pairing at the critical site where extension begins. "GC clamp" refers to the presence of one or more G or C bases at the 3´ end of the primer.
The dissociation temperature (Tm) of the two primers, which basically depends on their length and on their content of C and G, must be roughly the same. From the structural point of view, an ideal primer has a length between 18 and 28 nucleotides and a G+C composition (GC content) between 50% and 60%. These conditions guarantee a Tm between 50 and 80°C according to the simplified formula that determines the Tm from the number of G/C and A/T pairs:
Tm = 4 (G + C) + 2 (A + T)
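As an illustration, the simplified formula can be applied directly to a primer sequence by counting its bases. The sketch below assumes that G, C, A and T in the formula are the numbers of each base in the primer and that the result is in °C; the 20-mer used is a hypothetical sequence chosen only for its base composition, not a real primer.

```python
def simple_tm(primer):
    """Estimate Tm (in °C) with the simplified rule Tm = 4(G + C) + 2(A + T)."""
    seq = primer.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at != len(seq):
        raise ValueError("primer must contain only A, C, G and T")
    return 4 * gc + 2 * at


def gc_content(primer):
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


example = "AGGTCACTGAGCCTTACGAC"  # hypothetical 20-mer: 11 G/C and 9 A/T
print(simple_tm(example), gc_content(example))
```

For this sequence the rule gives 4 × 11 + 2 × 9 = 62°C with a GC content of 55%, which falls inside the ranges recommended above.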
As for the concentration of the primers, it must be the same for both, and 0.1-0.4 μM is recommended for each (2.5-10 pmol in 25 μL of reaction mixture). This concentration ensures that the excess of primer with respect to the template remains essentially constant. Higher concentrations, such as the 3 μM specified in earlier protocols, are not necessary and are also disadvantageous, as they lead to the formation of primer-dimers and to an increased probability of pairing errors.
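The correspondence between concentration and amount quoted above follows from the fact that μM multiplied by μL gives pmol. A minimal sketch of the conversion is shown below; the 10 μM working stock used in the second function call is a hypothetical value, not one given in the text.

```python
def primer_pmol(final_um, reaction_ul):
    """Amount of primer in pmol: (µmol/L) x (µL) = pmol."""
    return final_um * reaction_ul


def primer_volume_ul(final_um, reaction_ul, stock_um):
    """Volume of primer stock (µL) needed to reach the final concentration."""
    return final_um * reaction_ul / stock_um


print(primer_pmol(0.1, 25), primer_pmol(0.4, 25))  # the 2.5-10 pmol range
print(primer_volume_ul(0.2, 25, stock_um=10))      # stock concentration is assumed
```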

"Biological" parameters
A fundamental characteristic of the primers must be the uniqueness of their sequence with respect to a complex eukaryotic genome, i.e. the guarantee that the probability of hybridising with sequences other than the desired one is extremely low. This prerogative is generally guaranteed by their length; in fact, a size of 18 or more nucleotides usually ensures the uniqueness of the sequence with respect to the genome.
Repeated sequences, or consecutive runs of the same base, are to be avoided.
In any case, uniqueness must be checked against the sequence databases, verifying in particular that the oligonucleotide does not include sequences homologous to Alu repeats or to mitochondrial DNA. Nowadays, numerous DNA sequence analysis software tools include menus for primer design, and some are also available on the web (Primer-BLAST).
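The link between primer length and uniqueness can be illustrated with a rough estimate. The sketch below assumes a random genome with equal base frequencies and a haploid human genome size of about 3.2E9 bp; both are simplifying assumptions not stated in the text, and the model ignores repeated sequences, which is exactly why the database check remains necessary.

```python
def expected_random_hits(primer_len, genome_bp=3.2e9):
    """Expected number of perfect matches of a primer in a random genome.

    A given sequence of length n occurs by chance with probability 1 / 4**n
    at each position; both strands are counted as possible binding sites.
    """
    sites = 2 * genome_bp
    return sites / 4 ** primer_len


for n in (14, 16, 18, 20):
    print(n, expected_random_hits(n))
```

Under these assumptions a 16-mer is still expected to match by chance about once, whereas an 18-mer is expected to match by chance less than once in ten genomes.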

Magnesium Chloride. One of the key variables of PCR is the concentration of Mg++ ions, usually between 0.5 and 2.5 mM, which plays an important role in terms of both yield and specificity. It affects the reaction differently at high and low concentrations: too high a concentration of Mg++ stabilises the double-stranded DNA and hinders the complete denaturation of the product at each cycle, reducing the yield. Furthermore, an excess of Mg++ can also stabilise incorrect pairing between primer and template DNA, leading to an accumulation of unwanted product and therefore to low specificity. On the other hand, too low a concentration of magnesium (less than 0.5 mM) affects the extension phase, since Mg++ is required as a cofactor for the enzymatic activity of most DNA polymerases. Some Mg++ ions also form soluble complexes with the dNTPs in the reaction mixture. There must therefore be an ionic concentration that gives optimal conditions in terms of yield and specificity.

Buffer. There are numerous buffers used in PCR, but the choice must be made according to the reaction conditions: characteristics of the target DNA and primers, as well as the reaction cycle. A 50 mM Tris-HCl buffer with a pH between 8.3 and 9 at 25°C is generally used (the pH of the Tris decreases with increasing temperature). To facilitate the pairing of the primers to the denatured DNA, up to 50 mM of KCl can be added to the reaction mixture. At concentrations above 50 mM, KCl, however, can be a Taq polymerase inhibitor.

Deoxynucleotide triphosphates (dNTPs). In most technical manuals, a concentration of 200 μM for each dNTP is recommended, which guarantees an excess for the duration of the reaction. This concentration is sufficient because the stability of the dNTPs during repeated PCR cycles is such that approximately 50% of them remain after 50 cycles. dNTPs act as chelators of magnesium ions, changing the optimal magnesium concentration. A quantity of dNTPs greater than 200 μM leads to an increase in the incidence of errors by the polymerase; millimolar concentrations of dNTPs completely inhibit the enzyme. Low concentrations of deoxynucleotides minimise incorrect pairing at sites other than the target, with advantages in terms of specificity and accuracy (fidelity).
It is particularly important that the concentrations of the four deoxynucleotides are equivalent to prevent incorporation errors by the polymerase.

Polymerase. Thermostable DNA polymerases, like the other DNA polymerases, catalyze the synthesis of DNA from triphosphate nucleotides from a primer that has a 3´ free hydroxyl group. In general, they have maximum catalytic activity between 75°C and 80°C and activity substantially reduced at lower temperatures due to the change in pH. At 37°C, Taq polymerase has only about 10% of its maximum activity. Taq polymerase has a half-life that progressively decreases with increasing temperature (greater than 2 hours at 92.5°C, 40 minutes at 95°C and 5 minutes at 97.5°C).
Polymerases that lack 3´-5´ exonuclease activity generally have higher error rates than enzymes that exhibit this activity. The total error rate of Taq polymerase is reported to be in the range between 1 × 10^-4 and 2 × 10^-5 errors per base pair.
As for the quantity, an excess of the enzyme could synthesize DNA from spurious interactions between primer and template. The Taq concentration suggested in most protocols is 0.5 units (U) per 25 μL of reaction. This limiting concentration is required to control the specificity of the amplification reaction. Concentrations higher than 2.5 nM (1.25 U per 25 μL of reaction) must be avoided, because increasing the enzyme raises the efficiency of the PCR only up to a certain point. Beyond that point, such a high amount of enzyme can increase the yield of non-specific products, at the expense of the product of interest. In addition, the cost of the enzyme should not be underestimated.
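To give a feeling for what the error rate reported for Taq polymerase means in practice, the following sketch estimates the fraction of strands that are copied without any error in a single extension. It interprets the reported rate as the probability of an error at each base incorporated and treats errors as independent, which are simplifying assumptions; the 500 bp amplicon length is a hypothetical example.

```python
def error_free_fraction(length_bp, error_rate):
    """Probability that one copy of a strand of the given length has no errors,
    assuming independent errors with a fixed per-base probability."""
    return (1.0 - error_rate) ** length_bp


for rate in (1e-4, 2e-5):
    print(rate, error_free_fraction(500, rate))
```

With these assumptions roughly 95% of 500-base strands are copied without error at the higher error rate, and about 99% at the lower one, for each round of synthesis; the errors accumulate over the successive cycles.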

DNA template. All the parameters discussed so far assume a total quantity of human genomic DNA of 125 ng per 25 μL of reaction. The accuracy of the amplification depends on the error rate of the DNA polymerase, but also on the number of initial copies of the template DNA.
It is not necessary for the sequence to be synthesized enzymatically to be present initially in pure form; it can also represent a small fraction in a complex mixture, such as the segment of a single copy gene in the total human genome. The sequence to be synthesized may initially be present as a discrete molecule or may be part of a larger molecule. In both cases, the reaction product will be a discrete double-stranded DNA molecule with ends corresponding to the 5´ end of the oligomers used.
Even relatively degraded DNA preparations can serve as useful templates for generating products of moderate length. The most important conditions concern purity and quantity. A large number of contaminants found in DNA preparations can reduce PCR efficiency. These include heme group, urea, SDS detergent, sodium acetate and, sometimes, residual components of purification from agarose gel.
Of course, the amount of template DNA must be sufficient to allow the PCR product to be visualized with ethidium bromide (EtBr). Usually, 100 ng of genomic DNA is sufficient to reveal a PCR product from a single-copy mammalian gene. Using too much template DNA may reduce the efficiency of the PCR reaction, both because MgCl2 and the other parameters have been optimized for a given amount of template, and because of the greater quantity of contaminants introduced into the reaction mixture.
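The quantities of genomic DNA mentioned in this section can be translated into numbers of template copies. The sketch below assumes a mass of about 3.3 pg for one haploid human genome, a commonly quoted approximation that is not given in the text, so the results should be read as orders of magnitude.

```python
HAPLOID_HUMAN_GENOME_PG = 3.3  # assumed mass of one haploid human genome (pg)


def genome_copies(dna_ng, genome_pg=HAPLOID_HUMAN_GENOME_PG):
    """Approximate number of haploid genome copies in a given mass of DNA."""
    return dna_ng * 1000.0 / genome_pg  # 1 ng = 1000 pg


for ng in (100, 125, 1000):
    print(ng, round(genome_copies(ng)))
```

Under this assumption, 100 ng of human genomic DNA corresponds to roughly 30,000 copies of a single-copy gene, and 1 μg to roughly 300,000 copies.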

The profile of the amplification cycles is also essential for carrying out the reaction.
Number of cycles. 30 amplification cycles are recommended for titration, but in some cases, if the yield is very poor, additional cycles are required. The number of cycles will depend on the molar ratio between the primer and the template. For a ratio of 10^6-10^7, 30 cycles should be adequate to generate sufficient material for visualization on an ethidium bromide-stained gel. For ratios higher or lower by two orders of magnitude, this value is respectively increased or decreased by 10 cycles.
Denaturation phase. For target sequences of 1 Kb or less, denature for 1 min at 94°C. For larger fragments, about one minute is added to the denaturation time for each additional kilobase. A long denaturation phase is particularly important at the beginning of the PCR, to ensure total denaturation of the starting DNA. For regions rich in G and C of the genomic DNA template, this phase is essential.
Annealing phase. It is critical for the specificity of amplification. A general scheme provides for the use of a pairing temperature (Ta) about 2-5°C lower than the Tm of the two primers used. A consequence of the use of a too low pairing temperature is that one or both primers can hybridize with sequences other than the target sequences (non-specific amplification).
This fact causes a drop in the yield of the desired product; on the contrary, a higher temperature leads to a reduction in the hybridization of the primers on the target DNA and therefore in the yield. The suggested time for pairing is 1 min. However, it can be reduced to 30 sec in the case of small reaction volumes (10-25 μL) that quickly reach the reaction equilibrium at low temperatures.
It is possible to use the same temperature profile to amplify different loci, even if the calculated pairing temperatures differ. Templates for which the calculated pairing temperature is higher than that of the standard profile may require a lower concentration of Mg ions. Conversely, those with a lower calculated pairing temperature may require a greater quantity of Mg ions to compensate for the too stringent temperature.
Extension phase. Taq polymerase is highly processive and extends 2-4 Kb per min (35-70 bases per sec). However, this parameter is strongly affected by the pH of the buffer, the salt concentration of the medium and the nature of the target DNA. In general, for targets up to 1 Kb in size, 1 min is sufficient for the extension phase. In practice, the ramp time from the pairing temperature to the denaturation temperature is generally long enough to allow the complete synthesis of a target of 500 bp. For fragments greater than 1 Kb, about 1 min should be added to the extension phase for each additional Kb.
It is important to note that for 20 base primers, with a GC content of 60% or higher, Tm is in the range of temperatures at which the enzyme performs an efficient extension. Therefore, in this condition, the annealing and polymerization phases can be combined in a single step (annealing-extension).
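The rules of thumb given above for the three phases can be collected into a small helper that proposes one amplification cycle. This is only a sketch under stated assumptions: the function and class names are illustrative, partial kilobases are rounded up when adding time, the 72°C extension temperature is the one quoted earlier for Taq, and the Tm passed in is assumed to be that of the lower-melting primer.

```python
import math
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    temp_c: float
    seconds: int


def cycle_profile(amplicon_bp, primer_tm_c, reaction_ul=25.0, ta_offset_c=5.0):
    """Propose one PCR cycle from the rules of thumb in the text."""
    if not 2.0 <= ta_offset_c <= 5.0:
        raise ValueError("the pairing temperature should be about 2-5 °C below Tm")
    # One extra minute per kilobase beyond the first
    # (rounding partial kilobases up is an assumption).
    extra_kb = max(0, math.ceil(amplicon_bp / 1000) - 1)
    denaturation_s = 60 + 60 * extra_kb
    extension_s = 60 + 60 * extra_kb
    # 30 s of pairing for small volumes (10-25 µL), otherwise 1 min.
    annealing_s = 30 if reaction_ul <= 25 else 60
    return [
        Step("denaturation", 94.0, denaturation_s),
        Step("annealing", primer_tm_c - ta_offset_c, annealing_s),
        Step("extension", 72.0, extension_s),
    ]


for step in cycle_profile(amplicon_bp=500, primer_tm_c=62):
    print(step)
```

Any profile obtained in this way is only a starting point and still has to be verified experimentally for each primer pair.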

3. PCR EFFICIENCY

The combination of all the parameters discussed so far determines, ultimately, what is the efficiency of the PCR reaction. A basic equation of the polymerase chain reaction describes the trend of amplification (Mullis, 1991):
Yn = (1 + e)^n
where Yn is the amplification factor after n cycles (although, for greater precision, n should be taken as the number of cycles minus one, given that the "short fragments" of interest begin to appear, as previously discussed, only from the second cycle) and e represents the efficiency of the reaction and takes values between 0 and 1.
For example, if the starting DNA is made up of 3 × 10^5 (henceforth the 3E5 notation will be used) copies of the human genome, the observation of a band on a standard gel would require 5E10 molecules of 100 bp length in an aliquot of 10 μL (Mullis, 1991), i.e. 5E11 molecules in a 100 μL reaction. A typical PCR could start from a quantity of DNA equal to 300,000 copies of the human genome in a volume of 100 μL and could aim to amplify a 100 bp long fragment of a single-copy human gene. With the formula described above, it is therefore possible to calculate the number of cycles necessary to amplify the desired segment so as to obtain a visible band on gel:
Yn = 5E11 / 3E5 = 1.67E6
So, optimistically assuming an efficiency equal to 1, so that n = log2(Yn), we obtain:
n = 20.7 cycles
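The same calculation can be repeated for efficiencies lower than 1 by solving the amplification equation for n, which gives n = log(Yn) / log(1 + e). The sketch below is a minimal illustration; the function name is an arbitrary choice and the efficiency of 0.8 is a hypothetical value used only for comparison.

```python
import math


def cycles_needed(target_molecules, starting_copies, efficiency=1.0):
    """Solve Yn = (1 + e)^n for n, with Yn = target / starting copies."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    amplification = target_molecules / starting_copies
    return math.log(amplification) / math.log(1.0 + efficiency)


# 5E11 product molecules starting from 3E5 template copies
print(round(cycles_needed(5e11, 3e5, efficiency=1.0), 1))
print(round(cycles_needed(5e11, 3e5, efficiency=0.8), 1))
```

With an efficiency of 1 the result is about 20.7 cycles, as in the text; with the hypothetical efficiency of 0.8 about 24.4 cycles would be needed to reach the same amount of product.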

If we assume that the reaction took place perfectly, in the last cycle 2.5E11 DNA molecules would be amplified, which would therefore require 2.5E11 Taq molecules, corresponding to about 10 units.
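The order of magnitude of this figure can be checked with the equivalence quoted earlier for Taq polymerase, 2.5 nM corresponding to 1.25 U in 25 μL of reaction. The sketch below uses only that equivalence and Avogadro's number; the function names are illustrative.

```python
AVOGADRO = 6.022e23  # molecules per mole


def molecules_per_unit(conc_nm=2.5, volume_ul=25.0, units=1.25):
    """Enzyme molecules per unit, from the equivalence 2.5 nM = 1.25 U / 25 µL."""
    moles = conc_nm * 1e-9 * volume_ul * 1e-6  # (mol/L) x (L)
    return moles * AVOGADRO / units


def units_for_molecules(n_molecules):
    """Units of enzyme that contain the given number of molecules."""
    return n_molecules / molecules_per_unit()


print(molecules_per_unit())
print(units_for_molecules(2.5e11))
```

With these figures one unit contains about 3E10 enzyme molecules, so 2.5E11 molecules correspond to a little over 8 units, in line with the rounded value of about 10 units.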