This paper (from two groups at Yale’s chemistry department) addresses several things that fall into the “important irritants” category in synthesis and molecular biology – or maybe that should be “irritatingly important”. We spend a lot of time thinking about proteins in terms of their primary sequence and features of their three-dimensional structure (binding pockets, recognition domains, etc.), and that’s understandable. But it’s always important not to lose sight of the fact that the surfaces of proteins are decorated with all kinds of ornaments that are vital for their function, even if they’re synthetically hard to control.
Chief among those are phosphates, you’d have to say. Kinases, of course, phosphorylate things and phosphatases take those phosphates back off (and as is often the case in biology, those two classes of enzymes are rather unbalanced: there are a lot more specific kinases out there, compared to a relatively small number of jack-of-all-trades phosphatases). Phosphorylation state can be a powerful on/off switch on specific protein residues, but it’s worth remembering that proteins can have a whole long list of phosphorylation sites once you get past the heavy-duty ones. The number of kinase inhibitors out there is proof enough that this method of regulating protein function can be taken advantage of, but making specifically phosphorylated proteins can be a painful process if you don’t have a living cell or at least a tame enzyme to do it for you. That’s particularly true if you’re interested in one of the less-common phosphorylation sites.
Then you have a very different sort of polarity modification, prenylation. Farnesyl (or geranylgeranyl) groups are small lipophilic chains that get attached to proteins to modify their function and/or cellular localization. Proteins also get decorated with fatty acid chains via palmitoylation and myristoylation. It’s believed that these are generally involved in localizing things to membrane sites (as those greasy side chains settle into the lipid membrane). What’s for sure is that these modifications can make proteins rather tricky to work with (as is generally the case when you have a big polar/nonpolar structural separation).
After that you get to the proteins-modified-with-other-proteins class, with ubiquitination at the head of that class. Ubiquitin, as the name implies, is all over the place, and proteins get ubiquitinated (at lysine residues) to mark them for degradation, to mark them not to be degraded, to change their localization and interaction networks, and for who knows what else. It’s a huge topic, since there are lots of varieties of ubiquitin attachment, and if you think we’ve got them all figured out you are sadly mistaken. Then there’s SUMOylation, another add-a-small-protein process that’s clearly important but which we understand even less. It’s less common than ubiquitination (we think!), but there are several different sorts of SUMO proteins and places that they can be attached.
I’m skipping over several other post-translational modifications to get to a big class of them that I haven’t mentioned yet: glycosylations. Sugar molecules (and chains of them) are stapled onto residues like serine and threonine, starting with good old glucose and going on from there. These things can also be linked through nitrogen (as in the N-linked glycans on asparagine) and sometimes even through SH groups, and that is just the beginning. I did my graduate work using carbohydrates as starting materials for organic synthesis, and anyone who’s worked with them will have some appreciation of the complications that are available. With all those open hydroxyls available, carbohydrates have a lot of linking possibilities (some more common than others, naturally), and you have the alpha/beta anomeric centers for variety as well. Two naturally occurring amino acids can give you four dipeptides, but two naturally occurring sugars can give you a whole lot more disaccharides, at least in theory.
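To put a rough number on that peptide-versus-disaccharide comparison, here is a back-of-the-envelope count in Python. It is only a sketch under simplifying assumptions of my own choosing: both sugars are treated as hexopyranoses with free hydroxyls at positions 2, 3, 4, and 6, furanose forms and the free anomeric center of the acceptor are ignored, and the names `Glc`, `Gal`, `Ala`, and `Gly` are just illustrative placeholders.

```python
from itertools import combinations_with_replacement, product

# Illustrative placeholders, not a claim about any particular system.
AMINO_ACIDS = ["Ala", "Gly"]
SUGARS = ["Glc", "Gal"]
ANOMERS = ["alpha", "beta"]
# Assumption: non-anomeric hydroxyls available on a hexopyranose acceptor.
ACCEPTOR_POSITIONS = [2, 3, 4, 6]


def count_dipeptides(residues):
    # Ordered pairs, since a peptide bond has a direction (N- to C-terminus).
    return len(list(product(residues, repeat=2)))


def count_disaccharides(sugars):
    # Reducing disaccharides: the donor's anomeric carbon (alpha or beta)
    # is linked to one of the acceptor's non-anomeric hydroxyls.
    reducing = len(list(product(sugars, ANOMERS, sugars, ACCEPTOR_POSITIONS)))
    # Non-reducing (1->1) disaccharides: both anomeric carbons are tied up,
    # so the two halves are interchangeable and unordered pairs are counted.
    halves = list(product(sugars, ANOMERS))
    non_reducing = len(list(combinations_with_replacement(halves, 2)))
    return reducing + non_reducing


print("dipeptides:", count_dipeptides(AMINO_ACIDS))
print("disaccharides:", count_disaccharides(SUGARS))
```

Under those assumptions the count is 4 dipeptides against 42 disaccharides (32 reducing plus 10 non-reducing), and the real gap is wider once you allow ring-size variants and longer chains with branching.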
These modifications are hugely important in proteins and many other biomolecules (including small-molecule natural products), but when I was in grad school, a lot of this was, chemically and biologically speaking, dark matter. A lot of it still is. Sometimes people would prefer just to ignore that part – even now, you’ll see total syntheses of some natural product as “whateverol aglycon”, that is, minus those pesky sugars. And when it comes to proteins, you almost have to rely on a living cell to glycosylate them properly, because chemically it can be a beast. A lot of chemical glycosylations are variations on the Koenigs-Knorr reaction, with glycosyl halides, pseudohalides, and other leaving groups, and the number of variations on this is surely beyond counting. Holy cow, are there ever a lot of attempts to make glycosylation a better and more predictable reaction. My old “Lowe’s Laws of the Lab List” had one that went “When there are twenty different ways of running a reaction in the literature, it means that there is no good way to run the reaction”, and the Koenigs-Knorr was what I had in mind. But it’s true: every time you vary the substrate being glycosylated or mess with the structure of the sugar part in any way, you can expect to see a new mixture of alpha/beta glycosides at the very least.
The paper I linked to so many paragraphs ago is a good shot, though, at taming this stuff for plain glucose-style glycosylation. The authors have a glycosyl fluoride protocol worked out (calcium hydroxide turns out to be a key addition) that seems to give pretty solid results on tyrosines (and phenols in general). What’s really impressive is that it works under aqueous conditions, using an unprotected sugar, and works on native (unprotected) proteins and peptides. It’s not perfect – you still get some glycosylation on other residues (serine, etc.), but it’s certainly the best I’ve ever seen. It would be interesting to see what happens when you try it on a protein with several possible tyrosines or with competing Cys residues, and I get the impression that the authors are heading there next. But even as it stands, this looks like a significantly better option than anything else out there for fast one-step glycosylation without messing around with protecting groups.
The sorts of tools that are available for synthesizing, manipulating, and analyzing proteins and nucleic acids are (in general) just not available for carbohydrates. They either don’t exist anywhere or haven’t been domesticated from their biochemical forms. Only in recent years has the dream of automated polysaccharide synthesis come close to reality. Even in this paper, it’s worth noting that the authors weren’t really able to tell what all those other minor glycosylated side products were; it would be a major effort to run all those down. But this stuff is important, and every new technique that makes it easier to work in the area is welcome.