
Drug Assays

Virtual Screening – As Big As It Currently Gets

This new paper on “ultra-large” virtual screening is well worth a look in detail. We find a great many lead compounds in this business by random screening of compound libraries, and virtual screening is (as the name implies) the technique of doing this computationally instead of with hundreds (thousands) of sample plates and tireless robot arms. All of that takes time and effort and money – accumulating such a compound collection, making sure that those compounds are (or are still) what you think they are, dispensing them in a useful form, coming up with an assay that’s strong enough to run in automated fashion and actually getting it done, etc. The idea of doing all this computationally by docking mathematical representations of molecules into mathematical representations of your target has always been appealing, and it gets more so every year as the hardware gets ever more capable. Even if you can’t predict de novo the compounds that will do the job, and we can’t, you can still run huge numbers of them, all varieties, and see which ones come out on top.

This paper (a large multi-center academic collaboration) reports what I believe is the largest publicly disclosed effort of this type. It takes as its starting point 70,000 commercially available building block compounds, and elaborates those using a set of 130 known reactions. This gives you what should be a “make-on-demand” library whose actual synthesis has a good chance of being reasonable. The paper itself screens 99 million compounds against one target (the AmpC enzyme) and 138 million against another (the D4 receptor), and the library has grown much larger since then. Less than 3% of that library is itself commercially available; there are a lot of compounds to make in this world.
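For a rough sense of why these make-on-demand libraries scale the way they do, here is a quick back-of-envelope in Python. The all-ordered-pairs upper bound and the "fraction surviving filters" figure are my own simplifications derived from the numbers above, not the paper's actual enumeration:

```python
# Back-of-envelope for make-on-demand library scaling.
# Numbers from the post: ~70,000 building blocks, ~138 million
# compounds in the realized D4 screening set. The naive pair bound
# and the derived filter fraction are illustrative assumptions.

building_blocks = 70_000
realized_library = 138_000_000

# A naive upper bound if every ordered pair of building blocks could
# be coupled by some two-component reaction: ~4.9 billion products.
naive_pairs = building_blocks ** 2

# The realized library is a small slice of that, because each of the
# 130 reactions only accepts building blocks with compatible
# functional groups.
fraction_realized = realized_library / naive_pairs

print(f"naive pair count: {naive_pairs:.2e}")
print(f"realized library: {realized_library:.2e}")
print(f"fraction surviving compatibility filters: {fraction_realized:.1%}")
```

Even this crude estimate shows why the library can keep growing quickly: adding building blocks grows the product space quadratically, while the synthesis effort per compound stays roughly constant.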

The computational screening of this set is not a trivial exercise – to their credit, the authors did a pretty thorough job. You could play the game of run-a-quick-minimization-and-dock-that-as-if-it-were-rigid on these things, and you’d get through them pretty quickly, but to what end? Reality is more various than that. So for each compound, an average of around 4,000 orientations was checked (basically, which part of the molecule approaches the protein target and at what angle), and for each of those, 280 conformations of the molecule were sampled. That adds up to a number of possibilities in the ten-to-the-thirteenth range, scored with DOCK3.7, which will take you tens of thousands of core hours to chew through on typical hardware. Compounds resembling known ligands for these targets were deliberately filtered out in a search for new chemical matter.
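The pose-count arithmetic above can be sketched in a few lines. The core count and timing below are the paper's figures for the D4 run; note that the straight product of the quoted averages is an upper bound on the pose count, since these are averages rather than exact per-compound sampling counts:

```python
# Rough pose-count and compute arithmetic for the screen described above.
# Averages quoted in the post: ~4,000 orientations per compound and
# ~280 conformations per orientation; 1,500 cores for ~1.2 days is the
# paper's figure for the 138-million-compound D4 run.

compounds = 138_000_000
orientations = 4_000      # average per compound
conformations = 280       # average per orientation

poses_per_compound = orientations * conformations   # ~1.1 million
total_poses = compounds * poses_per_compound        # upper bound, ~1.5e14

core_hours = 1_500 * 1.2 * 24                       # 43,200 core-hours

print(f"poses per compound: {poses_per_compound:,}")
print(f"total scored poses (upper bound): {total_poses:.1e}")
print(f"core-hours: {core_hours:,.0f}")
```

The core-hours figure lands in the "tens of thousands" range mentioned above, and the pose count in the 10^13–10^14 neighborhood depending on how the per-compound sampling actually averaged out.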

Now we get to some interesting numbers. From the AmpC hits, the team picked out 51 top-ranking molecules (each from a different scaffold class) to synthesize, and 44 of those efforts were successful. (These molecules, as with the D4 example coming up next, were selected both by docking scores and by human inspection – see below!) Of those 44, only five showed any activity in the enzyme assay (ranging from about 1 micromolar to about 400 micromolar). The best of that list represents a very good starting point indeed for this enzyme, and synthesizing analogs of its structure led to a 77 nM compound, which appears to be among the most potent non-covalent inhibitors reported for it. A crystal structure of the inhibitor/enzyme complex confirmed the predicted docking pose, which is always good to see.

As for the dopamine D4 effort, this one went a bit more in-depth. The team selected 589 structures, and not just from the top rank of the docking scores, but from the middle and lower parts of the list as well. 549 of these could be synthesized, and 122 of these showed more than 50% radioligand displacement at 10 micromolar. 81 of these were dose-responsed, and showed Ki values of 18 nanomolar to 8 micromolar. Not bad! Most of the potent compounds were full or partial agonists, but there were two antagonists in there as well. One of the potent agonists was synthesized as its four separate diastereomers, and one of those was down to 180 picomolar activity, 2500x selective against the related dopamine receptor subtypes, which is about as good as you’re ever going to get.

There are a lot of interesting take-aways from this work. For one thing, as the authors mention, it would be tempting to just dock representative members of each structural type/cluster, rather than having to do them all. But trying that really demolished the effectiveness of the screen, shedding active hits at an alarming rate. The current docking/scoring technology can get you as far as “compounds that look kind of like this”, but definitely cannot reach in and pick out the best representative of any given class. And even that level of discrimination comes with a lot of effort – note the number of hit compounds in both the examples above that turned out to be completely inactive on synthesis. That definitely argues for setting up these virtual libraries according to expected ease of synthesis, because otherwise you could spend a lot of time making tough compounds that don’t do anything. People have.

This also speaks to the importance of size. The D4 receptor has been the subject of virtual screening before, but not at this scale, and the best compounds here were (of course) not found in those efforts. Nor were any that were as potent as the best ones here. Size matters, and since we can’t zero in on the best compounds, we’d better be prepared to evaluate as many of them as we can stand.

Another point is that high-middle-low effort on the D4 case. The binding assay results compared to the docking scores are shown at right. You can see that the number of potent compounds (better than 50% displacement, below that dashed line) decreases as the scores get worse; the lowest bin doesn’t have any at all. But at the same time, there are a few false-negative outliers with binding activity at pretty low scores, and at the other end of the scale, the top three bins look basically indistinguishable. So the broad strokes are there, but the details are of course smeared out a bit.

There’s also a human-versus-machine comparison in evaluating the hits. The authors took the top 1,000 compounds and selected 124 of them by eyeballing them for what looked like good interactions in the docking pose (not looking at the scoring), and took 114 molecules on the basis of docking scores alone. The hit rates for the two sets were almost identical (about 24%), but the human-selected ones were disproportionately potent – and indeed, in the two campaigns, the human-selected compounds were quite over-represented in the lists of potent compounds. So we have that going for us. But again, note that three-quarters of the compounds selected, even after all this effort, were not active. That’s a huge enhancement over background, which is good news, but it’s not the magic that some outside the field think we can work, either.

One thing to note is that these two binding sites are very well characterized. There are plenty of compounds known for each, and there’s a lot of understanding about their structures bound to the proteins. Trying this against a blue-sky binding site that you don’t know much about is going to be a much different undertaking – but that, of course, is what we’d like to do. Ideally, computational screening will eventually go further still: not only working with compounds that aren’t yet real, but running them against proteins that have never been physically screened before at all. Getting solid, actionable protein structures, though, is far more difficult than running through orientations and conformers for small molecules – as it stands now, screening modeled compounds against real protein structures can (as this paper shows) give you good results, although keep in mind that this report is pretty much at the edge of what we can do with current technology. But screening modeled compounds against modeled proteins runs a substantial risk of giving you a lot of noise. We’ll get there, but we aren’t there yet.

One other sobering note: this paper, as so many virtual screening papers do, starts off by mentioning the estimate for small-molecule chemical space of perhaps 10^63 compounds. There’s room to wonder about that estimate, but since it’s cited here, let’s use it against the paper’s figure of 1.2 calendar days of straight processing time on 1,500 cores to get through the 138 million compounds in the second set. Extrapolating to the Big Set of Everything That Exists, that gives us about 10^55 days of processor time to screen the lot. Unfortunately, it’s only been about 5 x 10^12 calendar days, more or less, since the Big Bang. So even if we allow ourselves more time by turning each day since the universe began into another universe’s current age worth of time (setting off a Big Bang every morning and waiting another 13.8 billion years until counting the next day) and then take each day of that unimaginable stretch and turn every one of those into another universe’s-age worth of time, you would still need nearly a hundred quadrillion of those extra-long intervals to get through the data set. Large numbers are indeed large.
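For anyone who wants to check that arithmetic, here is the whole back-of-envelope as a few lines of Python (universe age taken as 13.8 billion years; everything else is from the figures quoted above):

```python
# Cosmic-timescale check on screening all of chemical space.
# Inputs: 1.2 days on 1,500 cores for 1.38e8 compounds, a chemical
# space of ~1e63 compounds, universe age of 13.8 billion years.

universe_age_days = 13.8e9 * 365.25     # ~5e12 days since the Big Bang
screen_days = 1.2                        # wall-clock, on 1,500 cores
screened = 1.38e8
chemical_space = 1e63

# Wall-clock days (on the same 1,500 cores) to screen everything:
days_needed = screen_days * chemical_space / screened   # ~8.7e54

# One "extra-long interval": each day since the Big Bang becomes a
# universe-age of days, and each day of THAT stretch becomes a
# universe-age again -- three factors of the universe's age in days.
interval_days = universe_age_days ** 3   # ~1.3e38 days

intervals_needed = days_needed / interval_days

print(f"days needed: {days_needed:.1e}")
print(f"one nested interval: {interval_days:.1e} days")
print(f"intervals needed: {intervals_needed:.1e}")
```

The intervals-needed figure comes out around 10^17, which is where the "nearly a hundred quadrillion" above comes from.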

23 comments on “Virtual Screening – As Big As It Currently Gets”

  1. Carla says:

    I wonder why Enamine did not mention their space concept, which is already an order of magnitude larger…

  2. LD says:

    I wonder what’s the bigger challenge right now, though: finding potent chemical matter, or being able to progress it to be more selective (kinases, for example) or non-selective (for beta-lactamase inhibitors, selectivity against AmpC in this case is not really favorable).

  3. Amateur says:

    So based on brief reading, DOCK 3.7 works like this:

    “DOCK3.7 places pre-generated flexible ligands into the binding site by superimposing atoms of each molecule on matching spheres, representing favorable positions for individual ligand atoms. Here, the crystallized nemonapride pose was used to define 45 matching spheres.”

    So does this method only find hits that use pretty much the same binding site as the known ligand? Can it find ligands that extend into new pockets not used by the known ligand?

    1. John Irwin says:

      The spheres (hot spots) in DOCK 3.7 generate the initial translation-rotation matrix to fly the molecule into the binding site. Once placed, the molecule can be rigid-body-minimized well away from its starting pose. We often see docked ligands going well away from precedent, and the spheres themselves.

    2. cc says:

      Docking can target any site in theory. The limitations are

      1) The user needs to know which ones (out of the many indentations on a protein) are important.

      2) The user needs to be able to screen the hits against the chosen site (and eliminate interfering compounds).

      3) Docking without pharmacophoric constraints is a waste of time in my opinion (as a practitioner for over 30 years). If you know nothing about how compounds really bind to your pocket you stand very little chance of finding a true hit.

      4) Typically 1% of hits from docking will turn out to be active in a bioassay, even for targets where the true hit-rate is far lower. Docking hits should be reproducibly active across multiple assay formats but most publications showing hits derived from VS simply don’t do this – they have found artefacts and not bothered to validate them.

  4. MTK says:

    Hey Derek,

    You may have to re-do that processing time calculation. You forgot to factor in Moore’s Law!

  5. Bryan says:

    1500 cores is a very small machine by current leadership standards. Also, did they use any form of acceleration such as GPUs or FPGAs? The problem is also “embarrassingly parallel” so that use of spare cycles obtained via grid computing can be considered. Thus, the computational requirements should not stand in the way if this approach is shown to be an appropriate starting point for finding new drugs.

    1. John Irwin says:

      We use commodity Intel/AMD CPU hardware. We’re highly motivated to make DOCK 3.7 run fast on GPUs, but we just haven’t figured it out yet. DOCK 3.7 runs well on AWS and of course desktops.

  6. curious-mind says:

    Great work and great results, but how can one gain access to that 180 million compound library? Surfing ZINC15 is not very helpful, to be honest. Any direct download links or the like? Or should one simply get in touch with Enamine?

    1. John Irwin says:

      Browse to
      Use 3D. Select what you want, download the scripts. Run them.
      curl, wget or powershell.
      Let me know if you have any trouble.

      It is true the database has grown and we are struggling with the analog-by-catalog searching. We are investing heavily to make it faster.

      There’s a YouTube video about downloading, should that help.

    2. Or alternatively, the current release of the REAL database can be found here:
      You need to register in order to download the database.

  7. John Irwin says:

    Please try the tranche browser:
    Use 3D. Scripts for curl, wget and PowerShell.
    There’s a YouTube video that explains some background, should that help.
    The first version of this reply did not appear, so sorry if this duplicates.

    1. Shradha Suyal says:

      Hello John,
      I browsed ZINC15 and tried downloading the 3D tranches, running them as shown in your tutorial video. But only a few pdbqt.curl files ran, and then I got a message on my terminal saying ..More(0%), and when I tried closing the terminal, it said a job was running.
      I am trying again to download the 3D tranches, but your 3D page is not responding. I am getting a message that “it may take a moment”. Is it a network-related issue or a problem with your website?

  8. Charles says:

    Enamine’s RealDB has been around for quite a while. I’m surprised they didn’t even mention it.

    1. John Irwin says:

      Hi Charles

      The whole paper is basically about the REAL Database which we refer to as make-on-demand throughout the paper. It is my oversight that we did not identify it as RDB. I hope there is not too much confusion.


  9. Peter Kenny says:

    This is certainly a very interesting article and it raises questions about how best to analyze the output of such a massive virtual screen. Although correlations between scores and affinity tend to be weak, there may be advantages in analyzing scores in terms of structural relationships between compounds. One might search for ‘scoring function cliffs’ (analogous to activity cliffs) and matched molecular pairs. Examining pairwise relationships between compounds in a database of this size would clearly be a challenging computational problem and, given that much of the screening library is generated from reagents, it may be practical to determine structural relationships from reagents. Could be an interesting cheminformatics project?

    One very simple analysis that one can do is to normalize the scores by molecular size (or lipophilicity). My favored approach to doing this is to fit the scores to molecular size (and/or lipophilicity) and use the residual as a measure of the extent to which the score for a compound beats the trend in the data. This approach to normalization is discussed in ‘The nature of ligand efficiency’ which I have linked as the URL for this comment.

  10. marcus says:

    Yes. And a good alternative to working with such gigantic pools of ideas could be to mine the even larger, 3.6 bn(!) Enamine REAL _space_ first (that’s not their REAL database!); that takes a few minutes on a laptop. Then go 3D or order with Enamine. The tool is free to download and test. Note: I am biased, we develop it. 🙂

    1. angelo pugliese says:

      A REALSpace search takes just 4 minutes on my PC. Really useful tool for analogue searching, and just 4 weeks for compound delivery. BiosolveIT and Enamine did a good job on this.

  11. Cb says:

    I am curious to hear whether the authors did a virtual screen for thrombin inhibitors and found 2-chloro-thiophene derivatives that interact in the specificity pocket (rivaroxaban-like).

    1. John Irwin says:

      Hi Cb

      We did not do this. It is a fun idea. We’ve lately focused on GPCRs, but thanks for the suggestion.


  12. NeoK says:

    Congratulations to John Irwin and co-workers (including those working on Enamine’s RealDB). I felt tempted to write that this is a landmark study in the area. Then I stopped for a minute and realized that this is only for two targets. Surely you tested your approach on more than two targets, and these are the two providing the best results. John – my question is: what % of all the targets led to impressive results? Furthermore, every time you obtained some good hits by screening say 10% of the library, did you obtain impressive hits by screening 100% of the library? (again over all targets, not only these two, and prospectively). Thanks in advance for your answer.

    1. John Irwin says:

      Hi NeoK

      We have run LSD (150-300 M) against over 10 targets. AmpC and D4 are *not* the two targets with the best results. They were the targets we tried LSD on first because they allowed us to a) test docking by crystallography (AmpC) and b) integrate the curve to estimate total binders (D4). They are model systems.

      We have found picomolar against one other besides D4, single-digit nM against 2 more, sub-uM against 2 more, single-digit uM against 2 more. Hit rates 10-40%.

      What percent of the targets led to impressive results? Docking is hard, and sometimes does not work. As you said, if small scale docking yields hits you like, LSD will likely find more and better hits. If your target is really hard for whatever reason, LSD might not do well. We have had a couple of disappointing experiences, but most of the projects have led to new exciting compounds with chemotypes we would not have found with smaller libraries.


  13. Garry says:

    Wow!! Great article and very informational

Comments are closed.