Sometimes the levels of protein expression are low despite the use of strong transcriptional and translational signals. The following approaches can be used to optimize expression levels:
Varying induction conditions. The levels of expression of the target protein can be optimized by varying the time and/or temperature of induction and the concentration of the inducer.
Examining the codon usage of the heterologous protein. Not all 61 mRNA codons are used equally. The so-called major codons are those that occur in highly expressed proteins, whereas the minor or rare codons tend to be in genes expressed at a low level. Which of the codons are the rare ones depends strongly on the organism. The codon usage per organism can be found in the Codon Usage Database. For more information on the low-usage codons per organism see table 1 and table 2.
Usually, the frequency with which a codon is used reflects the abundance of its cognate tRNA. Therefore, when the codon usage of your target protein differs significantly from the average codon usage of the expression host, this could cause problems during expression. The following problems are often encountered:
- Decreased mRNA stability (by slowing down translation)
- Premature termination of transcription and/or translation, which leads to a variety of truncated protein products
- Frameshifts, deletions and misincorporations (e.g. lysine for arginine).
- Inhibition of protein synthesis and cell growth.
As a consequence, the observed levels of expression are often low, or there is no expression at all. Especially when rare codons are present at the 5'-end of the mRNA or occur in clusters, expression levels are low and truncated protein products are found.
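The check described above (rare codons near the 5'-end or in clusters) can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the rare-codon set is simply the set of codons named in this article's strain table, the function names are hypothetical, and the 10-codon window and the "adjacent codons" definition of a cluster are arbitrary assumptions that you should adapt to your host and gene.

```python
# Rare codons named in the strain table of this article, written in the DNA
# alphabet (AUA -> ATA, CUA -> CTA). Assumption: replace with the list that
# applies to your own expression host.
RARE_CODONS = {"AGG", "AGA", "CGG", "ATA", "CTA", "CCC", "GGA"}


def split_codons(cds):
    """Split a coding sequence (DNA or RNA, starting at the start codon) into codons."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def find_rare_codons(cds, rare=RARE_CODONS):
    """Return (codon_index, codon) for every rare codon; index 0 is the start codon."""
    return [(i, c) for i, c in enumerate(split_codons(cds)) if c in rare]


def find_clusters(hits, max_gap=1):
    """Group rare codons whose indices are at most max_gap apart; keep runs of two or more."""
    clusters, run = [], []
    for hit in hits:
        if run and hit[0] - run[-1][0] > max_gap:
            if len(run) > 1:
                clusters.append(run)
            run = []
        run.append(hit)
    if len(run) > 1:
        clusters.append(run)
    return clusters


def near_five_prime(hits, window=10):
    """Rare codons within the first `window` codons (the window size is an arbitrary choice)."""
    return [hit for hit in hits if hit[0] < window]


if __name__ == "__main__":
    example = "ATGAAAAGGAGACTGCCCTAA"  # made-up sequence, for illustration only
    hits = find_rare_codons(example)
    print("rare codons:", hits)
    print("clusters:", find_clusters(hits))
    print("near 5'-end:", near_five_prime(hits))
```

A gene that reports clusters or hits close to the start codon is a candidate for the remedies listed next.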
The expressed levels can be improved by:
- replacing codons that are rarely found in highly expressed E. coli genes with more favourable codons throughout the whole gene. Codons that have been associated with translation problems in E. coli are:
- co-expressing genes encoding tRNAs for a number of the rare codons. Several commercial E. coli strains carry extra copies of such tRNA genes:
|BL21 (DE3) CodonPlus-RIL||arginine (AGG, AGA), isoleucine (AUA) and leucine (CUA)|
|BL21 (DE3) CodonPlus-RP||arginine (AGG, AGA) and proline (CCC)|
|Rosetta or Rosetta (DE3)||arginine (AGG, AGA, CGG), isoleucine (AUA), leucine (CUA), proline (CCC) and glycine (GGA)|
- making changes in the coding sequence that reduce secondary structure in the translation initiation region. This is mainly done by increasing the number of A residues.
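To decide which of the strains listed above fits a given gene, the strain table can be turned into a simple lookup. The sketch below copies the codon lists from the table in this article (in the DNA alphabet) and reports, per strain, which of those codons occur in the gene but are not supplemented. The function name is hypothetical, and treating "rare" as "any codon listed in the table" is an assumption made for illustration.

```python
# Supplemental tRNA codons per strain, copied from the table in this article
# and written in the DNA alphabet (AUA -> ATA, CUA -> CTA).
STRAIN_CODONS = {
    "BL21 (DE3) CodonPlus-RIL": {"AGG", "AGA", "ATA", "CTA"},
    "BL21 (DE3) CodonPlus-RP": {"AGG", "AGA", "CCC"},
    "Rosetta or Rosetta (DE3)": {"AGG", "AGA", "CGG", "ATA", "CTA", "CCC", "GGA"},
}


def uncovered_rare_codons(cds, strains=STRAIN_CODONS):
    """For each strain, list the table codons present in `cds` that the strain does not supply."""
    cds = cds.upper().replace("U", "T")
    present = {cds[i:i + 3] for i in range(0, len(cds) - 2, 3)}
    # Assumption: "rare" means any codon that appears in the strain table.
    all_rare = set().union(*strains.values())
    needed = present & all_rare
    return {name: sorted(needed - supplied) for name, supplied in strains.items()}


if __name__ == "__main__":
    example = "ATGATACCCTAA"  # made-up sequence, for illustration only
    for strain, missing in uncovered_rare_codons(example).items():
        print(strain, "-> not covered:", missing or "none")
```

A strain with an empty list covers every table codon found in the gene. This says nothing about how often each codon occurs, so it complements a codon count rather than replacing it.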
Examining the second codon. In endogenous E. coli proteins, not all codons are used to the same extent in the second triplet (following the N-terminal methionine). The most used is AAA (lysine; 13.9%), while a number of other codons are not used at all. Looman et al. showed that the expression efficiency of a modified lacZ gene varies at least 15-fold, depending on this codon. Thus, choosing the right codon in this position, or changing it into one that is more often used in E. coli, could improve expression levels.
Reference: Looman et al. (1987) EMBO J. 6, 2489-2492.
Minimizing the GC content at the 5'-end. A high GC content in the 5'-end of the gene of interest usually leads to the formation of secondary structure in the mRNA. This could result in interrupted translation and lower levels of expression. Thus, higher expression levels could be obtained by changing G and C residues at the 5'-end of the coding sequence to A and T residues without changing the amino acids.
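The silent G/C-to-A/T substitution described above can be sketched as follows. This is an illustrative example under stated assumptions, not a codon-optimization tool: it uses the standard genetic code, swaps each of the first few codons after the start codon for the synonymous codon with the fewest G/C bases, and skips the rare codons named in this article's strain table. The function names, the 10-codon window and the tie-breaking rule are hypothetical choices, and the sketch ignores host codon usage apart from that exclusion list.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codons to avoid: the rare codons named in this article (DNA alphabet).
AVOID = {"AGG", "AGA", "CGG", "ATA", "CTA", "CCC", "GGA"}


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return sum(b in "GC" for b in seq) / len(seq) if seq else 0.0


def translate(cds):
    """Translate a DNA coding sequence with the standard genetic code."""
    cds = cds.upper().replace("U", "T")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds) - 2, 3))


def lower_gc_5prime(cds, n_codons=10, avoid=AVOID):
    """Recode codons 1..n_codons-1 (the start codon is kept) to synonymous low-GC codons."""
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    out = []
    for i, codon in enumerate(codons):
        aa = CODON_TABLE[codon]
        if 0 < i < n_codons and aa != "*":
            synonyms = [c for c, a in CODON_TABLE.items() if a == aa and c not in avoid]
            # Fewest G/C bases first; on a tie, keep the original codon if possible.
            codon = min(synonyms or [codon],
                        key=lambda c: (sum(b in "GC" for b in c), c != codon))
        out.append(codon)
    return "".join(out)


if __name__ == "__main__":
    original = "ATGGCCCTGCGCAAATAA"  # made-up sequence, for illustration only
    recoded = lower_gc_5prime(original)
    print(original, round(gc_fraction(original), 2))
    print(recoded, round(gc_fraction(recoded), 2))
    print(translate(original) == translate(recoded))
```

Because the lowest-GC synonym is not necessarily a frequently used codon in E. coli, any recoded sequence should still be checked against the host's codon usage before synthesis.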
Addition of a transcription terminator (or an additional one if one is already present).
Addition of a fusion partner. Fusion of the N-terminus of a heterologous protein to the C-terminus of a highly expressed fusion partner often results in high-level expression of the fusion protein.
Using protease-deficient host strains. The use of host strains carrying mutations that eliminate the production of proteases can sometimes enhance accumulation by reducing proteolytic degradation. BL21, the workhorse of E. coli expression, is deficient in two proteases encoded by the lon (cytoplasmic) and ompT (outer membrane) genes.