Modifying proteins with unnatural amino acids is a wide field with a lot of interesting research areas. Nature has provided us with twenty-odd amino acids (counting some rare ones), but there’s no reason that we have to play the hand that we’re dealt. Modifications of protein transcription and translation machinery have increasingly allowed incorporation of all sorts of man-made amino acids, and you can drop in fluorescent side chains, chemically reactive ones (including “click” handles of various sorts), residues with fluorinated groups, unusual hydrogen-bonding properties or three-dimensional character, what have you.
Of course, a lot of these ideas will involve larger, greasier groups than the standard suite of amino acids provides, and that can lead to solubility problems in the resulting proteins. A wonderfully modified protein for your research project will do you little good if it’s sitting in the bottom of the tube or is tangled up into balls inside the cell. Here’s a look at a couple of bacterial proteins that have had an amino acid with an acridone-bearing side chain (acridonylalanine) swapped in at positions throughout their sequences, in an attempt to see if there are any lessons to be learned.
You’re actually dealing with two different tolerance regimes: the ability of the protein (once made) to deal with the new residue, and the ability of the protein to be expressed with the unnatural amino acid (Uaa) in the first place. There have been many papers that have tried to survey one or the other of these (there’s a good short review of them at the beginning of this new one) and there have been several recommendations: try to put your new residue in a spot that had the closest natural analog in it to start with, for example, or try to pick a spot that already seems tolerant of natural mutations. Both of those make sense.
But as seems to be the case so often, making sense isn’t nearly enough. This team (from Penn) tried to take the expression variability out of the system as much as possible, and generally saw about a fourfold window across a range of mutants, which isn’t bad at all. The amount of actual soluble protein varied much more, as you’d imagine: some of the mutants were fine, and others were disastrous, which is right in line with what others have seen. But do any of the conventional explanations help to understand it? They tried to correlate soluble protein amounts with hydrophobicity, conservation of the wild-type residue, accessibility in the tertiary structure of the protein, and so on, but found nothing useful: none of the recommended properties seem to have much explanatory power. Studying two different proteins was a good move, because things that looked like they might be trends in one were generally useless for the other.
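For anyone who wants to run the same sort of check on their own mutant panel, here is a minimal sketch of it: a rank correlation between each candidate predictor and the soluble yield. This is not the paper's analysis, just the general idea. The site labels and every number below are invented placeholders, and the function names are my own; only the property names (hydrophobicity, conservation, accessibility) come from the discussion above.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical panel: one entry per substituted site. All values are made up
# purely to show the shape of the data, not taken from any experiment.
panel = {
    "site_A": {"hydrophobicity": 1.8, "conservation": 0.9, "accessibility": 0.10, "soluble_yield": 0.4},
    "site_B": {"hydrophobicity": -3.5, "conservation": 0.2, "accessibility": 0.75, "soluble_yield": 0.1},
    "site_C": {"hydrophobicity": 2.8, "conservation": 0.6, "accessibility": 0.30, "soluble_yield": 0.9},
    "site_D": {"hydrophobicity": -0.4, "conservation": 0.4, "accessibility": 0.55, "soluble_yield": 0.7},
    "site_E": {"hydrophobicity": 4.2, "conservation": 0.8, "accessibility": 0.05, "soluble_yield": 0.2},
}

yields = [site["soluble_yield"] for site in panel.values()]
for prop in ("hydrophobicity", "conservation", "accessibility"):
    predictor = [site[prop] for site in panel.values()]
    print(f"{prop}: Spearman rho = {spearman(predictor, yields):+.2f}")
```

A coefficient near zero for every property, in both proteins, is what "no explanatory power" looks like in practice. Rank correlation is my choice here because it does not assume a linear relationship; with only a handful of sites, though, any single coefficient is noisy, which is one more reason that studying two proteins was a good idea.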
So there aren’t any shortcuts. Different proteins have different tolerances to substitution in different structural domains, and you’re going to have to find those out by experimentation. “In the absence of reliable predictors or refined simulation algorithms for Uaa tolerability, a chemical biologist pursuing Uaa incorporation in a new protein, as of now, should broaden, rather than narrow, the types of residues screened for Uaa tolerability whenever possible.” No shortcuts.
Of course, getting decent soluble protein expression can be a pain even when you’re using the good ol’ amino acids that we know and love. One component of directed-evolution protein engineering (this year’s Nobel) is the search for more soluble protein forms that are easier to handle and will have the same function as wild-type. To that end, David Liu’s group at Harvard has applied their phage-evolution idea (blogged about here, see that one for a general explanation, which there’s not time to provide here) to select for solubility. “You get what you select for” is the rule of evolution, and they’ve modified their previous conditions to select for well-folded soluble proteins by making the activity of the T7 phage RNA polymerase (T7 RNAP) dependent on it.
They split the protein into two parts that can associate back into an active enzyme, and fuse the smaller (N-terminal) fragment to the C-terminus of the protein that’s being evolved. The overall folded state of that fusion protein is now a rate-limiting step in the whole process – the key phage protein needed for reproduction will be available if the protein is soluble and stable, and the ones that tend to aggregate or fall apart are selected against. Another modification is the use of a split-intein system for the pIII phage protein, which allows them to select for solubility and protein activity at the same time. Selection stringency is key in these experiments – if you lean on the system too hard, you encourage things (like premature stop codons) that allow the phage to survive without bothering about your silly protein at all. Ideally, you’d want to get close to where such fence-jumping starts to be worthwhile, without really giving it a chance to happen.
The experiments are optimizing scFv proteins, which are fusions of two variable antibody domains with a short peptide chain between them. These things (which come from the phage-display world, also a Nobel subject this year) are usually produced in bacteria, making them a perfect subject for this bacteriophage-evolution technique. The team picked one (the Ωg protein) that had been produced to bind a yeast transcription factor (GCN4), but is notoriously insoluble in the bacteria that produce it. 72 hours of selection in the “phage lagoon” system produced a triple-mutant form of the protein with fivefold better expression, similar to the best-identified variant to date. A similar experiment on another scFv protein (C4, which binds to the first 17 amino acids of the huntingtin protein) also led to variants with up to sixfold improved expression yields while maintaining binding affinity. Mammalian proteins and bacterial proteins also could be improved by the same methods.
This technique could take a lot of tedious work out of the protein-engineering field, outsourcing the work to evolutionary pressures in living bacteria. And the authors suggest that it could also be used on proteins that have been engineered or evolved by other methods, as a second pass to improve their stability and expression (which are two properties that can often take a hit when you’re optimizing for activity in a new enzyme). Now if you want to combine the two topics of today’s blog, you’d set up a directed-evolution system that optimized the incorporation of unnatural amino acids and picked out the most soluble active proteins that incorporated the Uaa somewhere. I offer this idea free of charge – for all I know, it’s being worked on already!