
Biological News

Chemical Biology: Engineering Enzymes

I mentioned directed evolution of enzymes the other day as an example of chemical biology that’s really having an industrial impact. A recent paper in Science from groups at Merck and Codexis really highlights this. The story they tell had been presented at conferences, and had impressed plenty of listeners, so it’s good to have it all in print.
It centers on a reaction that’s used to produce the diabetes therapy Januvia (sitagliptin). There’s a key chiral amine in the molecule, which had been produced by asymmetric hydrogenation of an enamine. On scale, though, that’s not such a great reaction. Hydrogenation itself isn’t the biggest problem, although if you could ditch a pressurized hydrogen step for something that can’t explode, that would be a plus. No, the real problem was that the selectivity wasn’t quite what it should be, and the downstream material was contaminated with traces of rhodium from the catalyst.
So they looked at using a transaminase enzyme instead. That’s a good idea, because transaminases are one of those enzyme classes that do something that we organic chemists generally can’t do very well – in this case, change a ketone to a chiral amino group in one step. (It takes another amine and oxidizes that on the other side of the reaction.) We’ve got chiral reductions of imines and enamines, true, but those almost always need a lot of fiddling around for catalysts and conditions (and, as in this case, can cause their own problems even when they work). And going straight to a primary amine can be, in any case, one of the more difficult transformations. Ammonia itself isn’t too reactive, and you don’t have much of a steric handle to work with.
[Figure: sitagliptin reaction scheme]
But transaminases have their idiosyncrasies (all enzymes do). They will generally accept only methyl ketones as substrates, and that’s what these folks found when they screened all the commercially available enzymes. Looking over the structure (well, a homology model of the structure) of one of these (ATA-117), which would be expected to give the right stereochemistry if it could be made to give anything whatsoever, gave some clues. There’s a large binding pocket on one side of the ketone, which still wasn’t quite large enough for the sitagliptin intermediate, and a small site on the other side, which definitely wasn’t going to take much more than a methyl group.
They went after the large binding pocket first. A less bulky version of the desired substrate (which had been turned, for now, into a methyl ketone) showed only 4% conversion with the starting enzymes. Mutating the various amino acids that looked important for large-pocket binding gave some hope. Changing a serine to proline, for example, cranked up the activity by 11-fold. The other four positions were, as the paper said, “subjected to saturation mutagenesis”, and they also produced a combinatorial library of 216 multi-mutant variations.
Therein lies a tale. Think about the numbers here: according to the supplementary material for the paper, they varied twelve residues in the large binding pocket, with (say) twenty amino acid possibilities per. So you’ve got 240 enzyme variants to make and test. Not fun, but it’s doable if you really want to. But if you’re going to cover all the multi-mutant space, that’s twenty to the 12th, or over four quadrillion enzyme candidates. That’s not going to happen with any technology that I can easily picture right now. And you’re going to want to sample this space, because enzyme amino acid residues most certainly do affect each other. Note, too, that we haven’t even discussed the small pocket, which is going to have to be mutated, too.
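For anyone who wants to check that arithmetic, here is a quick sketch. It assumes exactly what the paragraph above assumes: twelve varied positions and a flat twenty amino acids at each one (which counts the wild-type residue too, so the real single-site number is slightly smaller). The variable names are mine, not the paper's.

```python
# Back-of-the-envelope library sizes for the large binding pocket.
# Assumption: 12 varied positions, 20 amino acid choices at each.
POSITIONS = 12
AMINO_ACIDS = 20

# Changing one position at a time (wild-type residue counted as a choice,
# to match the "twenty possibilities per" shorthand).
single_site_variants = POSITIONS * AMINO_ACIDS

# Every combination across all twelve positions at once.
full_combinatorial = AMINO_ACIDS ** POSITIONS

print(f"single-site variants: {single_site_variants}")
print(f"full multi-mutant space: {full_combinatorial:,}")
```

The gap between those two numbers is the whole problem: the first fits on a few plates, the second does not fit anywhere.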
So there’s got to be some way to cut this problem down to size, and that (to my mind) is one of the things that Codexis is selling. They didn’t, for example, get a darn thing out of the single-point-mutation experiments. But one member of a library of 216 multi-mutant enzymes showed the first activity toward the real sitagliptin ketone precursor. This one had three changes in the small pocket and that one P-for-S in the large, and identifying where to start looking for these is truly the hard part. It appears to have been done through first ruling out the things that were least likely to work at any given residue, followed by an awful lot of computational docking.
It’s not like they had the Wonder Enzyme just yet, although just getting anything to happen at all must have been quite a reason to celebrate. If you loaded two grams/liter of ketone, and put in enzyme at 10 grams/liter (yep, ten grams per liter, holy cow), you got a whopping 0.7% conversion in 24 hours. But as tiny as that is, it’s a huge step up from flat zero.
Next up was a program of several rounds of directed evolution. All the variants that had shown something useful were taken through a round of changes at other residues, and the best of these combinations were taken on further. That statement, while true, gives you no feel at all for what this stuff is like, though. There are passages like this in the experimental details:

At this point in evolution, numerous library strategies were employed and as beneficial mutations were identified they were added into combinatorial libraries. The entire binding pocket was subjected to saturation mutagenesis in round 3. At position 69, mutations TAS and C were improved over G. This is interesting in two aspects. First, V69A was an option in the small pocket combinatorial library, but was less beneficial than V69G. Second, G69T was improved (and found to be the most beneficial in the next round) suggesting that something other than sterics is involved at this position as it was a Val in the starting enzyme. At position 137, Thr was found to be preferred over Ile. Random mutagenesis generated two of the mutations in the round 3 variant: S8P and G215C. S8P was shown to increase expression and G215C is a surface exposed mutation which may be important for stability. Mutations identified from homologous enzymes identified M94I in the dimer interface as a beneficial mutation. In subsequent rounds of evolution the same library strategies were repeated and expanded. Saturation mutagenesis of the secondary sphere identified L61Y, also at the dimer interface, as being beneficial. The repeated saturation mutagenesis of 136 and 137 identified Y136F and T137E as being improved.

There, that wasn’t so easy, was it? This should give you some idea of what it’s like to engineer an enzyme, and what it’s like to go up against a billion years of random mutation. And that’s just the beginning – they ended up doing ten rounds of mutations, and had to backtrack some along the way when some things that looked good turned out to dead-end later on. Changes were taken on to further rounds not only on the basis of increased turnover, but for improved temperature and pH stability, tolerance to DMSO co-solvent, and so on. They ended up, over the entire process, screening a total of 36,480 variations, which is a hell of a lot, but is absolutely infinitesimal compared to the total number of possibilities. Narrowing that down to something feasible is, as I say, what Codexis is selling here.
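To give a feel for the round-by-round logic, here is a deliberately crude toy model. None of it comes from the paper: the "assay" is a made-up score (fraction of positions matching a hidden optimum), the library is plain single-site random mutagenesis, and the selection rule is simply "keep the best variant if it beats the parent". Real campaigns also deal with residues that affect each other, several selection criteria at once, and backtracking, all of which this sketch ignores.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def toy_fitness(seq, target):
    """Hypothetical stand-in for an activity assay: fraction of
    positions that match a hidden 'optimal' sequence."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)

def evolve(start, target, rounds=10, library_size=200, seed=0):
    """Greedy rounds: mutate the current best, screen, keep the winner."""
    rng = random.Random(seed)
    best = start
    screened = 0
    history = [toy_fitness(best, target)]
    for _ in range(rounds):
        library = []
        for _ in range(library_size):
            # One random substitution at one random position.
            pos = rng.randrange(len(best))
            library.append(best[:pos] + rng.choice(AMINO_ACIDS) + best[pos + 1:])
        screened += len(library)
        champion = max(library, key=lambda s: toy_fitness(s, target))
        # Only carry a variant forward if it actually improves on the parent.
        if toy_fitness(champion, target) > toy_fitness(best, target):
            best = champion
        history.append(toy_fitness(best, target))
    return best, history, screened

# Illustrative run on random 27-residue sequences (sizes chosen arbitrarily).
setup = random.Random(1)
target = "".join(setup.choice(AMINO_ACIDS) for _ in range(27))
start = "".join(setup.choice(AMINO_ACIDS) for _ in range(27))
best, history, screened = evolve(start, target)
print(f"screened {screened} variants; score {history[0]:.2f} -> {history[-1]:.2f}")
```

Even in this cartoon, the point survives: a couple of thousand screened variants per campaign can climb steadily without ever enumerating the full space.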
And what came out the other end? Well, recall that the known enzymes all had zero activity, so it’s kind of hard to calculate improvement from that. Comparing to the first mutant that showed anything at all, they ended up with something that was about 27,000 times better. This has 27 mutations from the original known enzyme, so it’s a rather different beast. The final enzyme runs in DMSO/water, at loadings of up to 250 g/liter of starting material at 3 weight per cent enzyme loading, and turns isopropylamine into acetone while it’s converting the prositagliptin ketone to product. It is completely stereoselective (they’ve never seen the other amine), and needless to say involves no hydrogen tanks and furnishes material that is not laced with rhodium metal.
This is impressive stuff. You’ll note, though, the rather large amount of grunt work that had to go into it, although keep in mind, the potential amount of grunt work would be more than the output of the entire human race. To date. Just for laughs, an exhaustive mutational analysis of twenty-seven positions would give you 1.3 times ten to the thirty-fifth possibilities to screen, and that’s if you know already which twenty-seven positions you’re going to want to look at. One microgram of each of them would give you the mass of about twenty Earths, not counting the vials. Not happening.
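The same kind of sanity check works for the twenty-seven-position case. The sketch below assumes twenty choices at each position and one microgram per variant, as in the paragraph above; the Earth mass (about 5.97e24 kg) is a reference value I am plugging in, not something from the paper, and the planetary comparison depends on it.

```python
# Exhaustive space for 27 positions, and what a microgram of each would weigh.
AMINO_ACIDS = 20
POSITIONS = 27
MICROGRAM_IN_KG = 1e-9
EARTH_MASS_KG = 5.97e24   # assumed reference value, not from the post
SCREENED = 36_480         # variants actually screened, per the post

variants = AMINO_ACIDS ** POSITIONS
total_mass_kg = variants * MICROGRAM_IN_KG
earths = total_mass_kg / EARTH_MASS_KG
fraction_screened = SCREENED / variants

print(f"variants:          {variants:.2e}")
print(f"mass at 1 ug each: {total_mass_kg:.2e} kg (~{earths:.0f} Earth masses)")
print(f"fraction screened: {fraction_screened:.1e}")
```

Under those assumptions the pile comes to a couple of dozen Earth masses, and the fraction of the space actually screened is so small that the word "sampling" barely applies.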
Also note that this is the sort of thing that would only be done industrially, in an applied research project. Think about it: why else would anyone go to this amount of trouble? The principle would have been proven a lot earlier in the process, and the improvements even part of the way through still would have been startling enough to get your work published in any journal in the world and all your grants renewed. Academically, you’d have to be out of your mind to carry things to this extreme. But Merck needs to make sitagliptin, and needs a better way to do that, and is willing to pay a lot of money to accomplish that goal. This is the kind of research that can get done in this industry. More of this, please!

33 comments on “Chemical Biology: Engineering Enzymes”

  1. Jim Hu says:

    unclosed italics tag?

  2. InfMP says:

    Buchwald must be so upset – I don’t think Negishis are that useful

  3. p says:

    Haven’t had a chance to read the paper yet (but will).
    But, is the engineered enzyme good at other transaminations? That is, now that you have this enzyme, can it be used for more than one substrate? Assuming some structural similarity to sitagliptin, of course.

  4. @InfMP says:

    Only Buchwald? What about Sonogashira, Kumada, Hartwig, and mechanistic folks like Shibasaki and Overman?
    Negishis are finicky (bone-dry zinc chloride) but can work very well with substrates prone to beta-hydride elimination.
    Don’t know why enzymatic reductive transamination to make Januvia is getting so much press. Is it because it’s from Merck? Birman from WUStL has published some interesting organocatalytic desymmetrization of racemic mixtures and meso substrates.

  5. Will says:

    @ 3
    Merck doesn’t care about other substrates, they only want to make large quantities of the API easily without toxic impurity. At the drug discovery stage, the methods mentioned by Derek at the beginning of the post are perfectly fine

  6. processchemist says:

    Impressive… 4 volumes per mass unit of substrate and 3% of enzyme is something most people working on industrial enzymations only dream of.
    Next step: an enzyme working on a wide variety of substrates. Cheap, if possible.
    BTW, Merck people are for sure aware of the wide variety of metal scavengers on the market…

  7. JMB says:

    Great post! What less-than-desirable reactions would you like to see replaced with enzymes? Maybe along with “Derek’s Laws of the Lab” and “stuff I won’t work with” we need a “I wish there was an enzyme for that” category

  8. anchor says:

    Simply mind boggling and superb work! This opens up research for further investigation and from this point on it can only get better.

  9. HealthyCynic says:

    This is an impressive example of practical bioorganic chemistry. I only hope that Januvia itself is designed better than Avandia. It would be a shame if so much effort went into optimizing the synthesis of a faulty product. I don’t think Merck/Schering-Plough can handle another Vioxx or Vytorin.

  10. Curt F. says:

    Thanks to Derek for bringing this paper to my attention.
    The final enzyme performance characteristics really are amazing. Activity on 250 g/L of starting material in a 50% DMSO starting material means the water concentration in the reaction mixture is well under half of what we would usually consider aqueous.
    This was my favorite paragraph in the paper:
    In comparison with the rhodium-catalyzed process (Fig. 1A), the biocatalytic process provides sitagliptin with a 10 to 13% increase in overall yield, a 53% increase in productivity (kg/l per day), a 19% reduction in total waste, the elimination of all heavy metals, and a reduction in total manufacturing cost; the enzymatic reaction is run in multipurpose vessels, avoiding the need for specialized high-pressure hydrogenation equipment.

  11. JMB says:

    Re: substrate specificity / broadness–I’m sure some of those 36480 screened variants accepted broader ranges of substrates. On the way to making Merck’s pet molecule, Codexis got a whole family of transaminases that they can screen for reactivity on your favorite substrate. I’m sure they’d be happy to optimize it to your desired conditions for the right price.

  12. C says:

    @4 – Kumada may be unhappy, but being dead is likely to stop him from mentioning it too often…

  13. Nate says:

    I work for big pharma in the biocatalysis unit. This sort of enzyme engineering is becoming more and more routine. The biggest reason for this push is that biocatalytic processes can be incredibly cheap. Not only do you not have to worry about protecting groups but the enzymes themselves can be produced recombinantly extremely cheaply, especially on scale.
    As for the question about enzymes working on a wide variety of substrates, what typically happens is that we have enzymes in a particular class from a wide variety of sources. That provides the diversity and the wide substrate scope. We have those enzymes in a 96-well plate format and do simple screening to narrow down the hits. Chemists don’t have a single catalyst that does every asymmetric hydrogenation. The same holds true with biocatalysts.
    The past ten years have provided a huge explosion in the different types of chemical transformations available using biocatalysts. Not that long ago, the field was mostly lipase resolutions and baker’s yeast ketone reductions. Now, the field has asymmetric C=C reductions, transaminases, asymmetric cyanohydrin synthesis, asymmetric epoxide hydrolysis, and mild hydrolysis of cyano groups to acids or amides, just to name a few off the top of my head.
    Biocatalysis (and enzyme engineering) is a great research tool to have in the pocket of a chemist from medicinal discovery to industrial chemical process development. Biocatalysis can’t solve every chemical problem, but if you need chirality or specificity, then a biocatalyst might be the answer.

  14. OrgSynPreppie says:

    @Curt F: You forgot to include the line from the paragraph about this discovery allowing Merck to reduce process and manufacturing headcount.

  15. Curt F. says:

    @OrgSynPreppie: I missed the line in the paper you are referring to. Can you help me find it?
    Whether or not it is discussed in the paper, wouldn’t it be a good thing if the process engineers and manufacturing staff that used to have to devote their lives to running rhodium-catalyzed hydrogenations to produce sitagliptin are now free to pursue other, more worthy endeavors?

  16. Anonymous says:

    Curt, I think what he’s getting at is that Merck laid off thousands of people this year. They didn’t move into other areas, they moved out of Merck entirely.

  17. T says:

    The sitagliptin enzyme turns out to be extremely general. As long as the ketone has a big side and small side. The paper has a couple of examples at the end and more in the supporting info.

  18. Curt F. says:

    @Anonymous #16: Oh, I see. Undoubtedly, the layoffs were a significant upheaval in the lives of those affected. I hope that they eventually do find more worthy endeavors to pursue. In any case, I don’t see how the layoffs are relevant to the paper. We are drifting way off topic here.

  19. MeLikeChemistry says:

    Now if only the Codexis bioengineers can come up with Suzuki, Heck, Huisgen, and olefin metathesis enzymes!

  20. coprolite says:

    I saw a presentation in Nebraska about feed lot genetics, this reminds me of that.

  21. SP says:

    I’m surprised a medicinal chemist is suggesting something so off-base. When there’s a screening hit, are you told to go off and make every possible combination of every modification you can think of, and that’s 10^n so you’re crazy to think you can do it? No, you optimize one part, then optimize another, and you’ll miss some combinations that individually are useless but only work together, but in general you’ll improve things. That’s the way to do directed evolution- mutate everything, take the best of what comes out, and mutate again from there, rinse, repeat. The argument that one can’t possibly make all combinations is irrelevant (and is a common argument of creationists- how could humans have ever evolved when all combinations of a 100 amino acid protein are greater than the mass of the universe?)

  22. @10, 15, 18 says:

    “Thanks to Derek for bringing this paper to my attention.”
    Somebody’s not keeping up with the literature…
    I agree with Nate, though. Not much from the best catalysis (organo or metallo) labs can really beat or even match optimized biosynthetic pathways in terms of turnover, atom economy, ee, or polydispersity.

  23. gyges says:

    Don’t have access to the paper.
    Can anyone tell me how the enzyme is synthesised?
    Amino acid by amino acid; or,
    Has DNA that codes for the protein been produced and is this placed in E Coli; over expressed, fermented, and extracted by conventional means?

  24. Barack Obama says:

    Questions for the folks in the know:
    1. Can you give me a ballpark figure for what big pharma would be willing to pay for a biocatalyst (for a typical synthesis step)?
    2. How many enzymes like this do they pursue each year?

  25. Nate says:

    Enzymes are never synthesized amino acid by amino acid in these applications. It’ll be done by recombinantly introducing the DNA for the gene into E coli. The cells probably won’t be processed much, if at all, after they express the enzyme. You’ll be able to do the reaction using whole cells or broken cells with the insoluble biomass removed (lysate). Anything more than that would make the catalysts expensive.
    As for price, I can’t give you the number we use obviously but it’s extremely cheap. Once you have the E coli prepared and the fermentation growth conditions, the media components are going to be things like yeast extract, corn steep liquor, or glycerol. All of those are extremely cheap.
    As for the biocatalyst toolkit for doing C-C bond forming reactions, there are the various aldolases for doing aldol additions. Benzaldehyde lyase reactions can be used for making various asymmetric heterocoupled benzoins. Oxynitrilases give asymmetric cyanohydrins with aldehydes and ketones. There’s also been some really nice work for glycosylating small molecules directly without protecting groups. That’s what I can remember off the top of my head.
    Biocatalysts can’t do everything. But what they do, they can do very well.

  26. Nate says:

    Codexis talked about how the transaminase mutants can be used for other substrates. One nice thing about enzyme engineering projects is that when you generate the mutants, you’ll be able to use those mutants for other projects.

  27. gippgig says:

    Note that since the genetic code has been modified to add unnatural amino acids you aren’t even limited to the standard ones.

  28. too late too tired says:

    courses i wish were offered to all organic grad students:
    1. biotransformation chemistry
    2. nanotechnology marketing
    3. regulatory affairs
    4. patent examination
    5. six sigma certification

  29. Thomas McEntee says:

    So, what did the Codexis effort take in terms of human resources and time? A colleague who knows far more than I do about molecular biology suggested 3-4 weeks elapsed time for each round of enzyme evolution, including design and construction of the libraries, three tiers of screening, and statistical analysis. Multiply that by the 10 rounds Codexis did. He guesses that the entire operation could be done with a staff of 10, with 5 people handling the evolution rounds. It’s unclear what resources and time were needed for the preliminary rational design work. Perhaps 3 people working over 1-3 years? Whatever the real numbers are, they’re undoubtedly less than what such an effort would have taken 10 or 15 years ago.

  30. PTM says:

    Once we master genetic engineering there will be better ways to evolve enzymes.
    In theory each cell could test one variant, but that would mean no amplification of signal, so it would be more practical to have each new cell perform directed mutagenesis with 33% probability.
    The enzyme system used for mutagenesis should also contain a genetic addressing register which targets mutation so that each time mutagenesis takes place at a different location in the gene. The register should reset once all the gene has been addressed once.
    The cells would also need to have sensors – proteins which would regulate the speed of cell cycle based on how well the mutated enzymes work.
    This system would then ensure that quite a lot of protein space would be explored and that there would be a way to pick up the best candidates – as they would be overrepresented with each new generation due to faster cell cycle.
    This is all theoretically possible but a complex set of specialized proteins would be needed. The addressing register is probably the most tricky part. It could be made from a repetitive sequence of DNA with one recognition motif sequentially shifted once per cell cycle by a pair of enzymes locked in negative feedback loops. A spacer protein would then bind to the motif and position the mutating enzyme at some particular offset from the recognition site. This offset would have to be larger than the size of the gene to be mutated, though, so it’s quite tricky; probably a set of proteins engineered to form a filament which would bind the ds DNA and position and activate the mutating enzyme would do the trick. It doesn’t have to be perfectly precise really, approximate precision will do.
    Or it could be some modified polymerase running not on ATP but on a synthetic polymer – each nucleotide shift would use up one monomer so the offset traveled could be controlled by the length of the polymer. Making the polymer would be tricky though, perhaps some enzyme system which would modify DNA or RNA for the role?
    All in all possibilities are almost endless though our current technological capabilities are still extremely primitive and completely inadequate for such work.

  31. Nate says:

    Codexis did not use unnatural amino acids. They modified the genetic code of the original transaminase to insert different amino acids than the native wild-type sequences. They still used the 20 natural amino acids but just in slightly different combinations.

  32. Paul says:

    Has anyone ever tried to engineer an enzyme entirely from scratch? By that I mean start from some entirely unnatural (perhaps randomly generated) peptide that shows a trace of activity, then do directed evolution from that?

  33. Medical says:

    is the engineered enzyme good at other transaminations?
