Here’s a recent paper that bears on the “How many binding pockets are there” question. Or maybe that’s the “How many different types of binding pockets” question, which last came up around here a couple of years ago. That one was a computational approach that suggested that there were around 500 different varieties, a level of detail that still leaves a lot of wiggle room. As I mentioned there, even if every single protein in the human body that had a small-molecule binding site had a completely different one, we’d be looking at what – high tens of thousands? So there’s an upper bound for you. Meanwhile, we have millions of small molecules to screen.
That does not imply that you can then screen those millions of compounds with near-certainty that you’ll get a hit, though. As mentioned here recently, substantial parts of any screening collection go a long time without hitting anything, if they ever do. And many targets have been screened over and over without ever yielding very much. Although, to be sure, when you do get a hit, most of the time you can be sure that it’s going to bind to at least a few other things, if you look hard enough for them. That’s where this new paper, from the Shoichet group at UCSF, comes in. They’ve gone rooting through the PDB and found several dozen examples where the exact same ligand is bound to at least two proteins whose binding sites are built from unrelated folding patterns. Half the time, completely different amino acid residues were involved in binding the ligand, and in half of those cases, it was because a completely different part of the ligand was being bound in the first place.
Even in the other half of the set, where broadly similar binding was taking place, you still couldn’t superimpose the ligand and really see related residues being involved much. They conclude that “There appears to be no single pattern-matching “code” for identifying binding sites in unrelated proteins that bind identical ligands. . .” That fits my own experience more than that other limited-number-of-sites proposal, although the difference between these may be more in the definition of what a different site might be. At the resolution level of PDB structures, everything may well look different, whereas many of these might have been classed as “pretty much the same” by the earlier computational approach.
But in drug discovery, we’re working more in the PDB world, or worse. An X-ray structure, after all, is static, and there are probably (in fact, almost certainly) binding modes in two very similar X-ray structures that were arrived at dynamically by different processes, and would have different SAR against a series of compounds. Helices shift, residues flip, water molecules dart in and out of the binding pockets (and there are a lot of different sorts of water molecules, when you start talking enthalpy versus entropy). For the purposes of drug discovery, I don’t think you can go far wrong by assuming that every binding site is its own separate problem. Take what similarities you can get, and be prepared to use whatever lessons you can draw from them, but don’t be surprised, when you get right down to should-I-put-a-methyl-here, to find them different in the end.