More Binding Sites Than Are Dreamt of in Your Philosophy

Cryptic binding sites: now there’s a puzzle for you. When you look at a protein structure, even if you know nothing about its function, you can usually spot small-molecule binding sites without too much trouble. They tend to be pocket-like folds, often with particular polar motifs. (If the protein is an enzyme, finding the binding site/active site is even easier in most cases, since there will be functional residues involved there as well.) Protein-protein binding surfaces are harder, but there are still a number of standard patterns that you can look for.

But there are other binding sites that take people by surprise, often because the protein changes its conformation to cause them to appear. That’s not something that you’re going to be able to eyeball, and it can turn into a major computational problem, too, depending on the protein. Those trees of possible/plausible conformations can branch out pretty wildly, and each leaf on each one of them has to be searched for something that looks like a believable binding site.

There have been several attempts at this problem over the years, and here’s the latest, from a large multicenter academic team. A combination of methods is used simultaneously: sequence conservation, molecular dynamics, fragment docking, etc. Their protocol (“CryptoSite”) seems to accurately predict many non-obvious binding sites that have been shown to exist in different proteins, and if it really is working, it also predicts that there are a lot more of them waiting to be targeted. Looking at a set of about 4000 proteins, 1420 of which are known to be disease-related, they find a high proportion of cryptic binding sites:

In contrast to pockets, cryptic sites were predicted in 72% of the disease-associated proteins, 38% of which have no apparent pockets. However, some of the predictions may be false positives (the sites may in fact not bind any ligands). Moreover, for some sites, it may be very difficult to find a ligand (even if it does exist), and even if the ligand is found, it may not be a drug because it does not target the disease-modifying function of a protein or because it does not meet clinical development criteria. Nevertheless, the prediction of cryptic sites on the disease-associated proteins of known structure indicates that small molecules might be used to target significantly more disease-associated proteins than were previously thought druggable.

That immediately brings up the question, though, of how come we don’t see these things in high-throughput screens very often. Here’s how the authors deal with that one:

It has been shown that small-molecule libraries are biased toward traditional drug targets, such as G-protein-coupled receptors, ion channels, and kinases, while they are not as suitable for antimicrobial targets and those identified from genomic studies. It is conceivable that the existing libraries are also less suitable for cryptic sites. Moreover, cryptic sites may tend to bind ligands more weakly than binding pockets due to the need to compensate for the free energy of site formation and may thus be ranked lower on the high-throughput screening lists. Therefore, different approaches based on larger and more diverse chemical libraries, including small fragments, peptides, peptidomimetics, and natural products, may be needed for more efficient discovery of cryptic site ligands.
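
To put rough numbers on that free-energy argument, here is a minimal sketch. It assumes a simple two-state conformational-selection model, which is my simplification and not anything taken from the CryptoSite paper: the protein flips between a closed state and an open (site-formed) state, and the ligand binds only the open one. The function names and the example penalties are hypothetical, chosen purely for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def apparent_kd(kd_open: float, dg_open_kcal: float, temp_k: float = 298.15) -> float:
    """Apparent Kd when the ligand binds only the open (site-formed) state.

    Assumed two-state model: closed <-> open, with
    K_open = [open]/[closed] = exp(-dG_open / RT).
    The observed dissociation constant is then
    Kd_app = Kd_open * (1 + 1 / K_open).
    """
    k_open = math.exp(-dg_open_kcal / (R_KCAL * temp_k))
    return kd_open * (1.0 + 1.0 / k_open)


def fold_weakening(dg_open_kcal: float, temp_k: float = 298.15) -> float:
    """How many times weaker the apparent binding is than the intrinsic binding."""
    return apparent_kd(1.0, dg_open_kcal, temp_k)


if __name__ == "__main__":
    # Illustrative site-opening penalties in kcal/mol (assumed values, not measured data).
    for dg in (0.0, 1.0, 2.0, 3.0, 4.0):
        print(f"dG_open = {dg:.1f} kcal/mol -> ~{fold_weakening(dg):.0f}-fold weaker apparent binding")
```

Under these assumptions, a site-opening penalty of about 3 kcal/mol costs on the order of a hundredfold in apparent affinity, which would be enough to push a respectable hit well down a screening list.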

Well, that’s a big open question in drug discovery, and those are some of the traditional answers, which may be right. Chemical space is very large indeed, and any given compound library is only going to cover a tiny amount of it, especially since the compound libraries we use are already biased by the kinds of starting materials we can get and the kinds of transformations we can run on them. It’s entirely possible that “nontraditional” targets are still waiting for their princes to ride up for them, but at the same time, this argument has always made me uneasy. Maybe the reason we don’t find hits for Targets Like That is because we just haven’t gotten to the chemical matter that Targets Like That want to see, but that same reasoning can be applied (and misapplied) pretty freely. I’ve never tripped over a bar of gold while walking around, but is that because I just haven’t gone down the right street yet? Given what I know (my priors, in Bayesian terms) about the chances of gold bars on the sidewalk, I think I can reject that hypothesis. But I don’t have enough information about the number of good screening hits for tough targets, so my prior is not very useful, and I know that the number of possible streets to walk down (and their variety) is very large. And that’s why this question is still open, and why that answer to it is still at some unknown point on the scale running from “Silly Evasion” to “Perfectly Logical”.

What I can tell you, though, is that people do get impatient with the conclusion that gosh, the screening library must just be inadequate, because we get no hits (again). I think that the fragment screening aspect is a key here. If you can come up with fragment hits (or even plausible fragment docking, as nervous as I am about that approach), then it does seem to strengthen the case that your weirdo binding site really is a potential binding site, and you just haven’t given it what it wants, you lazy dog. But if you can run a good, solid fragment screen and come up totally dry, that (to me) is a good argument that this “site” is just not going to bind things – or is at least as close to that situation as you need it to be to walk away.

I actually hope that the CryptoSite people are right, and that there are a lot more binding sites than we realize. Anything that expands the field of action like that would be welcome, and I look forward to seeing how this holds up now that it’s out there in the real world.

20 comments on “More Binding Sites Than Are Dreamt of in Your Philosophy”

  1. luysii says:

    There was another paper which looked at this in a different way. Since alpha helices can’t pack together perfectly, holes are formed between them (and between other types of protein secondary structure). These ‘pockets’ were classified into 400 types. They are protein spandrels, things that exist in a structure which is ‘trying’ to do something inherently different. Obviously, this sort of thing makes drug discovery hard.

    For details please see

  2. watcher says:

    And key is the question of which binding sites prompt an effective enough response (inhibition, blocking, promotion) to change the course of a disease. Anything that “looks” like a site will not necessarily be a useful target for small or macro molecules. How to decide? Find “useful” molecules and then see how they work on the target!

  3. Barry says:

    as the authors note:
    “cryptic sites may tend to bind ligands more weakly than binding pockets due to the need to compensate for the free energy of site formation”

    That brings us to the question of just what it means that a site “exists”. Traditionally, we’ve worked on sites that are visible in an X-ray diffraction study, whether of the target alone or (better) in a diffraction structure of a co-crystal with a bound hit molecule. If these cryptosites will only be induced by binding a strong ligand, then they can be said to not exist at all until you’re well into the drug-discovery process, and they offer no guidance to get you there.

  4. PAINS in the neck says:

    Anything’s a binding site if you’re brave enough.

    1. Peter Kenny says:

      A project manager once asked me if I thought we had a lead and my response was, “It depends on how desperate you are”.

  5. Peter Kenny says:

    As Barry notes, one key question for exploitation of cryptic sites is whether they are observed in both apo and ligand-bound protein structures (assuming that there is a ligand-binding site in addition to the cryptic site). Analysis of how the characteristics (and existence) of cryptic sites vary between different protein molecules in the asymmetric unit would also be relevant because it may tell us about the influence of crystal packing. It’s also worth bearing in mind that cryptic sites may be an artifact of the construct (e.g. catalytic domain of protein; no post-translational modification) used for protein structural studies and you need to be very careful when interpreting a lack of observable electron density.

    1. Scandium says:

      I can envisage that this effect, i.e. the cryptic binding site only being revealed when the orthosteric ligand binds or when the protein is in the “active” conformation, could be exploited, for example if it’s a target whose activity you want to attenuate while not interfering with the other binding functions of the endogenous ligand. For example, in a disease state where something becomes constitutively active that’s not meant to be. The inverse situation would apply if the cryptic site is only exposed when the protein is not being “activated” enough by the endogenous ligand.

      1. Peter Kenny says:

        The challenge for the software is to determine whether the cryptic site that it has found is simply a packing imperfection or has physiological function.

  6. Magrinho says:

    Semantics. What’s the fix? Start with even weaker ligands for P38? Only in California.

    And what does this mean?!
    “However, while such sites do provide additional opportunities for drug discovery, they may not ultimately lead to drugs.”

  7. Argon says:

    To what degree might we find these transient binding pockets if we run longer pre-incubations in screens? Is it that these pockets are low affinity, or is it that their states only remain accessible a short proportion of the time or perhaps require additional rearrangements to maximize binding?

  8. AcademicChemist says:

    “I think that the fragment screening aspect is a key here.”

    It seems to me that fragment screening may be particularly ill-suited for finding these sorts of leads. If your target has to completely reorganize itself to bind the ligand, how is a weak-binding fragment going to provide enough binding energy to make that happen? Could you pick up that kind of interaction in an assay, and if so, what type?

    1. Morten G says:

      Well, in fragment-based drug design you often grow your lead from a fragment. So you have a structure with a fragment in and you know which directions not to grow because you’ll crash with the protein and you might have an idea about where to go to pick up more affinity. If you know there’s a cryptic site nearby then that’s probably a good direction to expand your fragment / lead.

  9. Rule (of 5) Breaker says:

    Several people hit on it here: Non-traditional binding sites no doubt exist, but it is a question of biological relevance (as the article itself mentions). From my own personal experience, I generated some ligands that induced a binding pocket on the target – with awesome potency too. What effect did they have? None – wasn’t biologically relevant to the target’s interaction with its binding partners. While I think this effort is scientifically interesting and may lead to a few novel discoveries, I will withhold my excitement that it will lead to a treasure chest of new legitimate opportunities.

  10. Mr. Mxyzptlk says:

    If these cryptic sites are druggable, then the best approach seems to me to be unbiased biophysical screening. So yes, fragment screens. But encoded library screens as well. Along these lines I would point to the latest from GSK around a RIP1K inhibitor. Using a completely unbiased and diverse library, they found single-digit nM inhibitors that, it turns out, do not bind in the hinge. Pretty rare, if not unique, binding mode. And exquisitely selective.

    OK, it might not quite be a “cryptic” site, it still is sitting in the ATP pocket. But it certainly is an unexpected and novel discovery, again, generated by unbiased biophysical screening.

  11. Christophe Verlinde says:

    Ultimately, I would rather see an experimental validation of the cryptic pocket at the level of seeing fragments in the pocket than to rely on the CRYPTOSITE prediction “for our benchmark, the true positive and false positive rates are 73% and 29%, respectively” – these odds are not all that impressive. In addition, CRYPTOSITE does not tell you what the cryptic site will look like.

    1. Anon says:

      There is an experimental validation by NMR screening in the paper.
      Read the paper before you start throwing stones.

  12. tuan says:

    Induced pockets and/or allosteric binding sites are not unheard of – pharmacological relevance is key. The idea of HTS is to apply an unbiased set. With fragment-based screening it is indeed more likely to find unknown sites, because determined binding is followed by X-ray, whereas in an HTS one identifies what one is looking for.
    Another publication …

  13. xglax says:

    The question “how come we don’t see these things in high-throughput screens very often” is answered in the paragraph immediately above, “it does not target the disease-modifying function of a protein”. Simply delete “disease-modifying” and there’s your answer.

    I’ve no doubt there are 1000s of unknown binding sites on proteins, especially for simple fragments (see PNAS 2015, 112(52), p15910). I’ve also seen compounds that are lipophilic enough to open up proteins and shove themselves inside. Most of these won’t make any difference to the target’s activity, hence they won’t show up in any screen except a direct-binding assay. I’d argue that’s a good thing, because why would you want to waste time chasing irrelevant binding sites?

  14. xglax2 says:

    @ Mr. Mxyzptlk

    Re. RIP1K, before the GSK compound came along there were already inhibitors known to bind in the same site, the necrostatins (Structure 2013, 21, 493). These were found by traditional cell-based and enzyme activity assays backed up by crystallography (Nat Chem Biol 2008, 4, 313). I’m not saying that unbiased biophysical screening is a bad thing, but your narrative about such things being unprecedented in RIP1K before its use is unfortunately inaccurate.

  15. Mr. Mxyzptlk says:


    You’re absolutely correct, I should have checked the references more carefully. Still a cool result, and still a nice example of biophysical screening, but you’re right, the binding mode is not unprecedented. Maybe someone from the field could find a better example? Or is it that I’m wrong, and biophysical screening doesn’t lead to a higher proportion of cryptic or unprecedented ligand-protein interactions?
