
Living organisms synthesize a staggering variety of proteins by combining 20 amino acids into chains of any length and order. In the past, to expand protein diversity beyond the scope of these 20 subunits, scientists tweaked the genetic code and designed artificial proteins that carry unconventional amino acids.1 However, these efforts yielded minimal success because cells only incorporated the extra building blocks into a few copies of the desired proteins.

In a recent Science study, synthetic biologists developed an alternative strategy to incorporate novel amino acids into proteins.2 These findings provide a new method to generate proteins with artificial properties at high yields, expanding the scope of synthetic biology.

“This group did a lot of work and was able to get efficiencies above 80 percent, which I think shows that they really went the full distance,” said James Van Deventer, a protein engineer at Tufts University who was not involved with the study.

Ribosomes receive instructions to synthesize proteins from genes in the form of an RNA transcript listing the order in which the 20 amino acids should appear. However, RNA is limited to four letters: adenine (A), cytosine (C), guanine (G), and uracil (U). For the four bases to have the capacity to code for 20 amino acids, the ribosome reads them in triplets known as codons, of which there are 64 possible combinations.3 To add amino acids to the growing protein chain, codons recruit transfer RNA (tRNA) to the ribosome. Each tRNA recognizes a specific codon sequence and delivers the corresponding amino acid to the nascent protein. Three of the 64 codons signal the end of protein synthesis. These stop codons do not pair with any tRNA; instead, they instruct the ribosome to release the polypeptide chain.4
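The codon arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the three stop codons listed (UAA, UAG, and UGA) are those of the standard genetic code and are not named in the article itself.

```python
from itertools import product

BASES = "ACGU"  # the four RNA letters
# Stop codons of the standard genetic code (an addition; the article
# only says that three of the 64 codons signal termination).
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Every possible triplet of four bases: 4 * 4 * 4 combinations.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

# Codons that are read by a tRNA rather than ending translation.
sense_codons = [c for c in codons if c not in STOP_CODONS]

print(len(codons), "codons in total")
print(len(sense_codons), "codons left to encode the 20 amino acids")
```

Because the remaining codons outnumber the 20 amino acids, most amino acids are encoded by more than one codon, which is why some codons can be rarely used.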

In previous attempts to expand the genetic code alphabet, researchers introduced into cells a tRNA that is linked to a non-canonical amino acid and recognizes a stop codon.5 However, this strategy proved inefficient because proteins that trigger the end of protein synthesis bound to the codon more strongly than the modified tRNA, so the stop codons retained their natural function most of the time.6 “Efficiency is typically lower than five percent, so in that case you can rarely do applications based on that,” said Shixian Lin, a synthetic biologist at Zhejiang University and study coauthor. Lin hypothesized that it might be more efficient to give the cells a tRNA that pairs up with a rare codon. Because the cell produces less of the tRNA that corresponds to a rare codon, the synthetic tRNA faces less competition.7

First, Lin’s team had to search for rare codons in human cell lines. Using RNA sequencing, they documented the seven least frequently used triplets. To determine which of the seven they could most efficiently repurpose, they introduced each of the rare codons into the gene for enhanced green fluorescent protein (eGFP) and transfected the modified construct into the cells. When they treated the cultures with synthetic tRNA that carried an easily traceable unconventional amino acid and recognized these rare codons, they found that the TCG triplet introduced the new building block into the protein at the highest yield.
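To give a sense of what ranking codons by usage involves, here is a minimal sketch that counts in-frame triplets across a set of coding sequences and lists the rarest. It is not the authors' pipeline: the function names and the toy sequences are hypothetical, and a real analysis of RNA sequencing data would likely also account for how abundant each transcript is.

```python
from collections import Counter

def count_codons(cds: str) -> Counter:
    """Count in-frame triplets in one coding sequence (DNA alphabet)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def rarest_codons(sequences, n=7):
    """Return the n least-used codons observed across all sequences.

    Only codons that appear at least once are ranked; ties are broken
    alphabetically so the result is deterministic.
    """
    totals = Counter()
    for seq in sequences:
        totals.update(count_codons(seq))
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))[:n]

# Hypothetical toy sequences, for illustration only.
toy_transcripts = ["ATGGCTGCTTCGTAA", "ATGGCTGAAGCTTGA"]
for codon, count in rarest_codons(toy_transcripts, n=3):
    print(codon, count)
```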

Given that the TCG codon could efficiently instruct the ribosome to add the unconventional amino acid to a desired protein, the scientists reasoned that it might also add the new subunit to other proteins in the cell that harbor this rare triplet, potentially altering the function of essential factors. To assess the level of background inclusion, the researchers treated cells with different tRNAs, each of which carried a traceable amino acid and recognized either the TCG codon or one of the other six least used codons. After harvesting all the proteins from the cells and staining them for the traceable building block, the team found that the cells treated with the tRNA that corresponded to the TCG codon produced the faintest signal, revealing that this triplet led to the lowest level of background incorporation in other proteins.

To understand why repurposing the TCG codon minimally affected other proteins in the cell, Lin’s team studied the RNA sequences on either side of the TCG codon. They found that the efficiency of incorporation depended on the surrounding sequences. “We even found that if the upstream codon mutates by one SNP [single nucleotide polymorphism], the recoding efficiency is going to be dramatically different,” Lin said. This means researchers looking to repurpose this codon should carefully consider where to place it in a sequence. “In some cases, we incorporate these rare codons in different positions on the protein, and the recoding efficiency diverges dramatically from five percent to 99 percent,” he noted.
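For readers who want to inspect the sequence context around a recoded position, the sketch below lists the codons immediately upstream and downstream of each in-frame TCG in a coding sequence. The helper name and the example sequence are hypothetical, and the code only reports the neighboring codons; it does not predict recoding efficiency.

```python
def flanking_codons(cds: str, target: str = "TCG"):
    """Return (codon_index, upstream, downstream) for each in-frame target.

    upstream or downstream is None when the target codon sits at the
    very start or end of the sequence.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    hits = []
    for idx, codon in enumerate(codons):
        if codon == target:
            upstream = codons[idx - 1] if idx > 0 else None
            downstream = codons[idx + 1] if idx + 1 < len(codons) else None
            hits.append((idx, upstream, downstream))
    return hits

# Hypothetical example: ATG GCT TCG GAA TAA
print(flanking_codons("ATGGCTTCGGAATAA"))
```

Comparing these neighbors across positions with high and low incorporation is one simple way to start looking for the kind of context effects Lin described.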

Finally, to push the limits of their strategy, the synthetic biologists tried to repurpose the next two least used codons, TAG and TGA, alongside TCG. They successfully modified eGFP to carry three unconventional amino acids in its chain simultaneously. This finding reveals that multiple novel subunits with unique properties can be added to proteins, expanding the ability of researchers to design biomolecules with complex functions.

Engineering proteins to carry unconventional amino acids could have therapeutic applications. Van Deventer suggested that scientists could use this strategy to install an amino acid that acts as a drug conjugate onto antibodies. Getting the cell to incorporate the conjugate during protein synthesis could improve the yield compared to chemically altering ready-made antibodies, he explained.

Researchers could also use this strategy to add building blocks with unusual properties into proteins. “We try to incorporate unnatural amino acids with high chemical reactivity,” Lin said. This could allow the novel amino acid to undergo further modifications following protein synthesis. For example, this approach could allow researchers to incorporate cross-linkable amino acids into proteins, forcing them to physically couple, Van Deventer noted. Such a technology might facilitate the study of the physical interactions between proteins.

Disclosure of Conflicts of Interest: Shixian Lin filed a provisional patent related to this study with the aim of commercializing the technique.

References

1. Tang H, et al. Recent technologies for genetic code expansion and their implications on synthetic biology applications. J Mol Biol. 2022;434(8):167382.
2. Ding W, et al. Rare codon recoding for efficient noncanonical amino acid incorporation in mammalian cells. Science. 2024;384(6700):1134-1142.
3. Komar AA. The Yin and Yang of codon usage. Hum Mol Genet. 2016;25(R2):R77-R85.
4. Brown A, et al. Structural basis for stop codon recognition in eukaryotes. Nature. 2015;524(7566):493-496.
5. Chin J. Expanding and reprogramming the genetic code. Nature. 2017;550(7674):53-60.
6. Lekomtsev SA. Nonstandard genetic codes and translation termination. Mol Biol. 2007;41(6):878-885.
7. Komar AA. A code within a code: How codons fine-tune protein folding in the cell. Biochem Mosc. 2021;86(8):976-991.