Chemical Biology

The Other Guys

Writing the other day about the lipid formulations used in the current mRNA vaccines makes me want to highlight something else that I hit on from time to time around here. When you learn in school about the major classes of biomolecules, you hear about proteins, lipids, carbohydrates, and nucleic acids. That’s a reasonable classification, but you hear a lot more about the first and last items on that list than you ever hear about the middle two.

That’s understandable, because after all, nucleic acids (as DNA) code (through the intermediacy of RNA) for proteins. And it’s those proteins that are involved in either synthesizing the lipids or carbohydrates (through a long list of enzymatic pathways) or bringing them in from our diets, as transporter proteins and the like. So proteins really do take the lead, but those other two categories are still their own worlds. Lipids, of course, make up the cellular membranes that surround (that have to surround) bacteria, archaea, and every single cell with a nucleus. And they’re used on a large scale for things like myelin sheaths around nerve tissue, and on a molecular scale as energy feedstocks (two carbons at a time, being peeled off all those fatty acids) and starting materials for the synthesis of a long list of hormones and other signaling molecules. The carbohydrates are of course also at the center of that cellular metabolism energy landscape, with glucose feeding glycolysis and the Krebs cycle to crank out the common currency of ATP (itself a carbohydrate derivative). But they also decorate the surfaces of nearly every protein in the body, and are intimately involved in regulating their localization and function. The surfaces of entire cells are no different – the immune system’s recognition of self depends on complex carbohydrates. And I haven’t even mentioned how ribose and 2-deoxyribose are crucial to the structures of all those nucleic acids.

No, neither of these compound classes is exactly a junior partner. But they don’t get the attention that the others do, and that goes for the laboratories as well as the popular imagination. The biggest problem is that both lipids and carbohydrates are just intrinsically harder to work with than are proteins and nucleic acids. Both of those biopolymers are hooked up (for the most part) in repetitive ways from a rather small list of common building blocks. That’s given us the chance both to work out ways to read off those sequences and to build machines that will synthesize them for us. Those either co-opt the exquisite cellular machinery (which we’ve adapted to work in vitro) or use automated organic chemistry reactions that we’ve come up with ourselves. The list of Nobels and other awards for all the discoveries in those fields is a very long one indeed, as it should be.

But look at carbohydrates. Those also form long polymeric structures, and the list of building blocks is no shorter than it is for the proteins. But the ways in which they can be linked – now, that’s a headache. Good ol’ glucose has five OH groups (counting the anomeric center at the first carbon) that can be involved in such linkages. In addition, that anomeric center can be “up” or “down” (alpha and beta glycosides), and glucose can tie itself into either a six-membered ring or a five-membered one, depending on conditions.

Now run through the list of all the six-carbon and five-carbon sugars (and the smaller carbohydrates too, can’t forget those), the various oxidation state changes like the sugar alcohols and acids, the keto sugars, and the important variants where one or more of those OH groups they’re all decorated with is missing or replaced with an amine and. . .well, you’re looking at a lot of compounds that can be fitted together into an insanely large number of different compounds. The number of possible proteins that can be assembled out of twenty amino acids gets out of hand pretty quickly, but it’s nothing compared to how fast the number of possible complex carbohydrates will take off on you.
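To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It is a toy model and not a real glycan enumeration: it assumes strictly linear chains, pyranose rings only, and that each glycosidic bond runs from the anomeric carbon of one residue to one of four non-anomeric OH groups on the next, in either the alpha or beta configuration. The function names and the choice of 20 sugar building blocks (picked only to match the twenty amino acids) are illustrative assumptions. Branching, furanose forms, and modified sugars would all push the carbohydrate count higher still.

```python
def linear_peptides(n, residues=20):
    """Number of linear peptide sequences of length n (one amide linkage type)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return residues ** n


def linear_oligosaccharides(n, sugars=20, positions=4, anomers=2):
    """Toy count of linear oligosaccharides of length n.

    Assumptions (not from any real glycan database): each of the n - 1
    glycosidic bonds independently picks one of `positions` acceptor OH
    groups and one of `anomers` configurations (alpha or beta).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    linkage_choices = positions * anomers
    return sugars ** n * linkage_choices ** (n - 1)


if __name__ == "__main__":
    for n in (2, 5, 10):
        p = linear_peptides(n)
        s = linear_oligosaccharides(n)
        print(f"n={n}: peptides={p:.3e}, oligosaccharides={s:.3e}, ratio={s // p}")
```

Even under these deliberately conservative assumptions, every added residue multiplies the carbohydrate count by an extra factor of eight relative to the peptide count, and that is before any branching comes into it.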

And as you’d figure, the chemistry needed to make those things is all over the place, too. With proteins, it’s just amide bond, amide bond, amide bond in an endless series. Don’t get me wrong, that’s bad enough, what with the reactive side chains and the chances of messing up the single chiral center if you use synthetic conditions that are too severe. But even just putting a single substituent at C1 of a given sugar, glycoside formation. . .well, there are entire books full of conditions for the Koenigs-Knorr reaction and Fischer glycosidation, and those are the two simplest and most direct techniques we have. There’s been a lot of work (and a lot of progress) over the years, but there is still no general set-it-and-forget-it carbohydrate synthesis machine, the way that there is for proteins and for nucleic acids. The number of different chemistries (and the number of ways in which they can go wrong) is still too big a problem. That goes for sequencing them back down, too: if you isolate some new complex carbohydrate and want to figure out its composition, you do not pop it into some sort of sequencer and go to lunch. No, you settle down for weeks, months (possibly years) of work. The problem is exacerbated by the huge molecular weight of some of the biological polysaccharides. And as for three-dimensional structures, from what I can see the field is just barely lifting itself up off the ground compared to what we know about protein and oligonucleotide tertiary structure.

Lipids suffer from many of the same problems. They don’t have as much of the multiple-sites-of-attachment thing going, until you consider that large important classes of them are hooked up to (you guessed it) the OH groups of various carbohydrates. So that plunges you right back into the structure determination problems and the synthetic problems we were just talking about. But another tricky part of lipids is that they can be just beastly to handle and characterize via the usual tools. We have a lot of chromatographic tricks to separate out polar compounds (and that includes the proteins and the carbohydrates). But making fine distinctions between different varieties of, well, grease is something else again. The physical interactions are just not as well-defined as you have with (say) hydrogen bonds and charged groups, so you end up chromatographically with broader peaks and worse resolution. Techniques like NMR suffer as well: the long carbon chains of many lipid molecules can be hard to distinguish and assign. Steroid backbones, now those are well worked out and you can find your way around them. But a 22-carbon chain with a double bond somewhere in the middle of it looks an awful lot like a 24-carbon chain with that double bond moved a couple of spaces down, and that goes for NMR, for HPLC, and many other techniques.

It doesn’t necessarily look so similar to your body, though. Think about the state of human nutritional advice, with all the conflicting evidence about the desirability of various saturated, unsaturated, monounsaturated, and polyunsaturated fats. Biochemically, our cells make a lot of fine distinctions between structures that we ourselves have to stare at closely to see any differences at all. The mention of nutrition is to emphasize that we don’t understand a lot of these biochemical effects very well, at either the micro or the macro level.

There are drugs that target lipid and carbohydrate pathways, to be sure, but these are generally small-molecule inhibitors of enzymes that process the compounds. Making compounds that bind to protein sites like those is something we’ve learned a lot about over the years, but compounds that bind directly to complex carbohydrates or to lipids are far more uncommon.

So we have a bit of a blind spot when it comes to both classes of compounds. That blindness is in no small part technological, because it’s been easier to develop tools to analyze, modify, and synthesize proteins and nucleic acids. But we shouldn’t let that make us think that lipids and carbohydrates are somehow ancillary, just because we can’t manipulate them as well. That’s our problem – not theirs.

23 comments on “The Other Guys”

  1. Jason P says:

    Wow! Nicely done! If you ever get tired of the ‘rat race’ of drug discovery, you would do well teaching!

  2. Marko says:

    I’m particularly grateful for the role played by carbohydrates in the fermentation process.

    Well, I’m grateful about their role in SOME fermentation processes, anyway. Some others, I could do without.

    1. Someone says:

      Cheers to that!

      1. Miles says:

        I can trump to that

  3. Jonathan says:

    Yes, let’s hear it for the lipids and carbohydrate chains!

    Certainly, for the big topic of the day, Covid-19 vaccines: the Pfizer and I assume also the Moderna vaccines depend heavily on some novel lipid chemistry. And all those approved so far use strategies to exploit eukaryote cells to put carbohydrate chains in the right place on the spike protein.

    It underlines that we don’t know well enough to mimic the natural cellular process by which glycosylation is directed, we have to use the cells that “know how”.

  4. I wonder, how much of the problem is due to a lack of $$ to pay for development work on lipids and carbohydrates, and how much is due to “we don’t even know enough yet to know what to spend the money on”?

    1. Nesprin says:

      I’d argue that the tech hurdles are significant: nucleic acid sequencing can be done with a kit by most biologists, protein analysis is kit level and proteomics can be done by core research facilities, but carbohydrates and lipids are still the purview of specialists. Because most labs aren’t equipped to attribute specific biological functions to sugars and lipids, the work isn’t done, so there’s no field agreement that lipid x regulates y, so there are no calls for funding.

  5. ccm says:

    This post sets up an analogy between chains of carbohydrates and proteins before contrasting the two. One more thing to note is that carbohydrate chains can be branched (and usually are when attached to proteins) which increases the possible number of combinations immensely. It’s amazing how much ‘space’ this possibility opens up when you usually think about linear polymers.
    From what I can tell, the canonical view in structural biology is that protein glycosylations are highly mobile, not adopting a single structure, and therefore not amenable to visualisation with current techniques. Maybe we will start to see more polysaccharide as complexes are more likely to be purified from a native source, which is much more feasible with modern cryo-EM. This is already happening more and more with lipids, for example here

  6. David says:

    Great post. An example of how not understanding lipid biology can hurt you comes from the COX2 inhibitors, with the unpleasant discovery that blocking COX2 leads to increased risk of cardiovascular disease.

  7. Merkeet says:

    “The problem is exacerbated by the huge molecular weight of some of the biological polysaccharides. And as for three-dimensional structures, from what I can see the field is just barely lifting itself up off the ground compared to what we know about protein and oligonucleotide tertiary structure.”

    How far are we from just doing cryo-EM to solve this problem?

  8. Marko says:

    I’ve been pounding the drums for evaluating mutation clusters of variants for immune escape properties by doing just that – look at the entire cluster, together – rather than relying on evaluation of linear epitopes in isolation. My concern was mainly the effect mutation clusters might have on secondary and tertiary structure of protein antigens that could bury sites from antibody recognition. Glycosylation adds another reason to do so – and a reason to ensure that your pseudotyped variant virus particle is normally glycosylated – since glycosylation is a common mechanism of immune escape. To be really thorough about it, immune escape studies should probably be done using the wild isolates themselves, which only certain high-safety labs can do.

    Here’s a paper describing fingerprinting of spike protein glycosylation, an area that seems sure to receive more attention going forward :

    1. Marko says:

      Speaking of which, they’re doing it right in South Africa :

      “…..We were the first to outgrow two variants of 501Y.V2 from South Africa, designated 501Y.V2.HV001dF and 501Y.V2.HV002. We examined the neutralizing effect of convalescent plasma collected from six adults hospitalized with COVID-19 using a microneutralization assay with live (authentic) virus. ”

  9. Spingos Konstantinos says:

    Pretty insightful! Either the chaos of carbohydrates and lipids is not codifiable at all or we are far from the translation of their language with success comparable to the one of proteins and nucleic acids.

  10. bks says:

    Speaking of carbohydrates of importance: cellulose.

  11. This blog author makes it sound like the fundamental limitations to working on lipids and carbohydrates are on the chemistry side. With all due respect, that’s just false. He or she needs to go read Derek Lowe, who is a very well respected scientist and writes an _actually_ well-researched blog, and who has unequivocally stated that “med-chem is not the problem”: (see link to his blog “In The Pipeline” in my name).

  12. Marko says:

    Breaking! They found the intermediate host! :

    1. sgcox says:

      We all know it was Batman.

      1. Marko says:

        Maybe Bernie IS Batman.

  13. Gedejones says:

    Biden’s “science” picks are so pseudoscientific they make me sick. Check out the trash that he picked as top science adviser. She literally has no bench experience.

    1. CrystalGrower says:

      This comment is terribly imprecise. The tone doesn’t help. If you’re not just trolling, please take the time to say something that can actually be discussed.

    2. Derek Lowe says:

      Disagree strongly. Back this opinion up with something of substance or keep this stuff to yourself, is my advice.

  14. bacillus says:

    Many Gram negative bacteria produce O-antigens with completely novel sugars, making it even more difficult to determine the structures of many of them. A colleague of mine recently came across an O-antigen with 15 different sugars, several of them novel, in each repeat unit. At least with proteins and nucleic acids you already know the entirety of the building blocks.

  15. boronsaur says:

    I’d like to add that the ability of lipids to form a variety of dynamic structures (micelles, vesicles, discs, etc) adds another level of complexity!
