There are plenty of useful drugs whose structures are, well, odd-looking. Antibiotics, as a class, have a lot of these: macrocycles, polyenes, polyhydroxylated beasts that don’t fit in with a medicinal chemist’s ideas of what a reasonable compound should look like. The “Rule of Five” metrics have been debated endlessly for what they might say about drug-like properties, but two things can be said without too much fear of contradiction: a majority of marketed drugs fall within these guidelines, and at the same time, there are tremendously active and useful drugs that fall far, far outside them.
Here’s a new paper on how such things might be binding to their targets. The authors (as in their previous work) divide these compounds into two classes. The ones with MW>500 but other properties at least close to rule-of-five space are termed “extended Ro5 compounds” (eRo5), while the ones that not only break the molecular weight range but definitively break one or more of the other rules are the “beyond Ro5 compounds” (bRo5). An upper weight bound of 3000 was set so that you don’t include insulin and the like (although we’re going to have to remember that such things are, in fact, molecules in their own right, and not just some separate class of things called “proteins”). The eRo5 compounds are basically the edges of conventional drug space, the tails of the various distributions, while the bRo5 ones are truly out there. The authors have assembled databases of both these sets, curated to remove contrast agents, polymers, and so on.
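For anyone who wants to sort their own compound list along these lines, here is a minimal sketch of that binning in Python. The 500 and 3000 weight bounds come from the description above, and the logP, donor, and acceptor limits are the standard Rule-of-Five values. Everything else is an assumption of mine: the `Compound` record, the `classify` function, and especially the `*_SLACK` tolerances standing in for "at least close to" are illustrative placeholders, not the cutoffs the authors actually used.

```python
from dataclasses import dataclass

# Standard Rule-of-Five limits.
MW_LIMIT = 500.0
LOGP_LIMIT = 5.0
HBD_LIMIT = 5
HBA_LIMIT = 10

# Upper weight bound mentioned in the post (keeps insulin and the like out).
MW_CEILING = 3000.0

# ASSUMPTION: hypothetical tolerances for "close to Ro5 space".
# These are placeholders, not the paper's actual cutoffs.
LOGP_SLACK = 1.0
HBD_SLACK = 1
HBA_SLACK = 2


@dataclass
class Compound:
    """Hypothetical record holding precomputed properties."""
    mw: float    # molecular weight
    logp: float  # calculated logP
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors


def classify(c: Compound) -> str:
    """Bin a compound as 'Ro5-range', 'eRo5', 'bRo5', or 'excluded'."""
    if c.mw > MW_CEILING:
        return "excluded"   # too large for either set
    if c.mw <= MW_LIMIT:
        return "Ro5-range"  # weight is within the classic limit
    # Above 500: check whether the other properties stay near Ro5 space.
    near_ro5 = (
        c.logp <= LOGP_LIMIT + LOGP_SLACK
        and c.hbd <= HBD_LIMIT + HBD_SLACK
        and c.hba <= HBA_LIMIT + HBA_SLACK
    )
    return "eRo5" if near_ro5 else "bRo5"


if __name__ == "__main__":
    # Made-up property values, for illustration only.
    print(classify(Compound(mw=650.0, logp=4.0, hbd=3, hba=9)))
    print(classify(Compound(mw=1200.0, logp=3.0, hbd=12, hba=20)))
```

In real use you would compute the properties with a cheminformatics toolkit and swap in whatever cutoffs the paper specifies; the point here is only the shape of the decision.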
Of the 475 compounds they found, 93 have protein-ligand structures in the PDB, although there are quite a few redundant ones in that list. The binding sites were further classified (both manually and computationally) as flat, groove, tunnel, pocket, or internal. As you might have guessed, Ro5 drugs tend to go for the latter classes of binding site, and the compounds evaluated here are more biased towards the flat and groove binding sites. The surface areas involved are larger, too, and can approach those known for protein-protein interactions. No real trends were apparent, though, in the types of interactions made by the various sets (hydrogen bonds, pi-pi interactions, etc.).
More than half of the flat/groove binding compounds in the bRo5 set, though, are macrocycles, which presents a sort of chicken-and-egg problem. You could argue that the data set is biased, since it will tend to include active natural products and their derivatives, which are enriched in macrocyclic structures compared to what humans themselves have synthesized, or you could take that further and say that the natural product world is actually enriched in macrocycles for very good reasons that we should be emulating. The authors tried to assess conformational rigidity of the macrocycles versus the other drug structures, but (as far as I can see) without much effect. It doesn’t appear, though, that the macrocycles are significantly more rigid, for what that’s worth. They conclude that “the unique ability of macrocycles to adopt disk- and spherelike shapes that are better suited for binding to flat and groove-shaped sites is an important reason for enrichment of macrocycles in bRo5 space”, but emphasize that other effects (such as on membrane permeability) are acting as well.
The authors believe (and so do I) that as we move into more and more difficult target spaces in med-chem, we’re going to have to learn how to explore these larger chemical spaces. That’s going to be challenging. The sheer size of chemical space up there is unnerving – we think it’s big down in traditional druglike territory, but the number of possible compounds goes totally berserk as you climb up in molecular weight. “Disklike” and “spherelike” aren’t going to be enough as molecular descriptors to narrow things down much. To add to the fun, these molecules are (perforce) more highly functionalized, which makes their synthesis challenging. A person could spend an awful lot of time wandering around out there using current synthetic techniques. The best thing I can think of is a modular approach – mixing and matching sets of subunits, as large as you can stand, that can themselves be linked under fairly general conditions.
And if we’re going to target macrocycles, which doesn’t seem like a bad plan, they have their own peculiarities of synthesis as well. A collection of a couple of million diversely functionalized macrocyclic compounds would be a very interesting screening set indeed, but no one has such a thing yet (not even the DNA-encoded library folks, although they’re certainly working on it). That might be the best tool to start getting a handle on their permeability and stability, because I think our existing data sets are just too small to draw many useful conclusions. I am not, let me note quickly, applying to make such a compound set. But eventually we’ll have one, or more. I think we’ll have to.