
One Way to Find Out

Here’s a thing about research (and drug discovery in particular) that makes it a bit different from many other occupations: you can go for extended periods without even being sure that you’re doing what you’re supposed to be doing. This thought came to mind yesterday when (on Twitter) Ash Jogalekar quoted a biotech veteran as saying that the most likely result of a high-throughput screening campaign was nothing and that the second most likely result was crap. Michael Gilman (biotech veteran himself) then chimed in to say that he would definitely prefer the “nothing”, because digging through all the crap was so deadly.

I can endorse both of those viewpoints, but I wanted to add some refinements. First off, it’s actually pretty unusual to get flat-out nothing from a screen. In fact, a real blank would strongly suggest that something went wrong, because you always get some sort of result, even if it’s just noise. That’s why any new screening campaign generally starts off with a test set, X hundred or X thousand compounds that get run as a pilot. If the output is flat zero for every well, something is very likely wrong. On the other end of the scale, if you get a solid 10% hit rate, something is definitely wrong. A more believable raw hit rate is something below 1% for a lot of targets (more if you’re screening a target class that you know the test deck has liked in the past, of course). But a 10% hit rate for what’s supposed to be a randomish selection of compounds is just not going to happen; either your hit cutoff is too permissive or your assay format is just messed up at a fundamental level.
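For a back-of-the-envelope feel for those numbers, here’s a minimal sketch of a pilot-screen sanity check; the three-standard-deviation cutoff and the synthetic plate values are illustrative assumptions, not anyone’s actual workflow:

```python
import numpy as np

def pilot_hit_rate(signals, n_sd=3.0):
    """Flag wells more than n_sd standard deviations above the plate
    mean, then sanity-check the resulting raw hit rate."""
    signals = np.asarray(signals, dtype=float)
    cutoff = signals.mean() + n_sd * signals.std()
    rate = float((signals > cutoff).mean())
    if rate == 0.0:
        print("Flat zero: suspect the assay, not the compound deck.")
    elif rate >= 0.10:
        print("10%+ raw hits from a random-ish deck: suspect the assay.")
    else:
        print(f"Raw hit rate {rate:.2%}: plausible, now start the triage.")
    return rate

# Toy pilot set: ~1000 wells of noise plus a handful of real actives.
rng = np.random.default_rng(0)
plate = np.concatenate([rng.normal(100, 10, 990), rng.normal(170, 10, 10)])
pilot_hit_rate(plate)
```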

“Nothing” in screening terms generally means “nothing more than the usual crap”. Every assay technique is vulnerable to false positives. Some of these are specific to a particular readout (intrinsically fluorescent compounds or fluorescence-interfering ones, for example), and some (aggregation) can mess up a whole range of assays. You should, then, expect to see your old friends. If you’re running a luciferase-driven assay, for example, you surely have some luciferase inhibitors in your collection (every collection has some!). If they don’t show up as “hits”, you have a problem.
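One common first pass at the substructure-level frequent hitters is a PAINS-type filter (used strictly as a flag for follow-up, not a verdict – see the pushback further down in the comments). A minimal sketch using RDKit’s built-in catalog; the azo compound below is assumed to trip an alert:

```python
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Build RDKit's PAINS substructure-alert catalog.
params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS)
catalog = FilterCatalog(params)

def pains_flag(smiles):
    """Return the description of the first PAINS alert matched, or None."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return "unparseable SMILES"
    match = catalog.GetFirstMatch(mol)
    return match.GetDescription() if match else None

for smi in ("Cc1ccccc1N=Nc1ccccc1", "CCO"):  # azobenzene derivative, ethanol
    print(smi, "->", pains_flag(smi))
```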

What about an assay that returns a few hundred compounds even after the frequent hitters for that assay technique have been identified and set aside? That’s the situation I described in the opening paragraph, and that’s research: you have no idea whether those compounds are legitimate hits or a pile of junk generated by some false-positive mechanism that you don’t know about yet. The newer and more exciting the assay technique, the greater the chances are that this has happened. And the wilder and more squirrely the target, the greater the chances for false positives, too, since the intrinsic hit rate is likely quite small.

The kicker is that wilder and trickier targets tend to get screened in newer or more complex assay formats and against weirder compound collections, because they’ve either already been tried in the more normal stuff or there’s nothing else that has a hope of working. So you’re getting it from both directions. But your only choice is to start working your way through the pile, because that’s what you came here for, right? To find a drug lead? This is the situation that Gilman is describing – weeks or even months of chasing things down class by class, compound by compound. Slam the doors, kick the tires, do it again.

There’s another problem that occurs more often at the frontier, too: lack of control compounds. You would like to test your assay(s) with a compound that’s known to do what you’re looking for, to make sure that everything’s working and that you can find such things, but what if no such compounds exist yet? You’re left trying to develop assays in the hopes that they’ll do what you want – I mean, they look like they should work – but you’re never quite sure. In these cases, if all you get is a collection of odds and ends when you run the screen, is that because it’s a hard target and not much was going to show up anyway? Or is it because there’s something wrong with your screen? These are, sadly, not mutually exclusive. Positive controls and validated assays have got to come from somewhere, though. . .
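For what it’s worth, when control compounds do exist, “is the assay working?” is usually quantified with the Z′-factor of Zhang et al. (1999), computed from positive- and negative-control wells – which is exactly the number you can’t get without that known-active compound. A minimal sketch on synthetic control data:

```python
import numpy as np

def z_prime(pos, neg):
    """Z' = 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|.
    Above ~0.5 is conventionally a good assay window; at or below
    zero, the control distributions overlap and the assay can't
    tell its own positives from its negatives."""
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    return 1.0 - 3.0 * (pos.std() + neg.std()) / abs(pos.mean() - neg.mean())

rng = np.random.default_rng(1)
print(z_prime(rng.normal(200, 8, 32), rng.normal(100, 8, 32)))  # ~0.5
```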

The “Are we doing this right?” feeling does not go away after the screen, either – in fact, dealing with it is an important part of being able to do research. Have you picked the right compound series to expand on, or is it a dead end that needs to be abandoned? Are you screening in the appropriate cell lines? How about that in vivo model that you’re heading for – do you trust it? You’re going to have to trust something at that level, because that’s what’s going to recommend a compound to the clinic. Oh God, the clinic. Is the whole idea behind the project even sound enough to have effects in humans? The Phase II failure rate will show that the answer to this question is often “Nope”, and many of those folks thought they’d answered their questions as well as they could, too, before the ground truth landed.

There is, in the end, only one way to answer such questions and deal with such doubts: run the flippin’ experiments. Set them up as well as you can, with all the brainpower and effort you can bring to the task, and then run them and see what you get. There are plenty of jobs where you know what’s going to happen, but you chose this one instead!

29 comments on “One Way to Find Out”

  1. DrunkTheKoolAid says:

    Derek – you bring up a point that highlights the problem with target-based drug development and the advantage of phenotypic screening, namely that (with targets) you are often starting way behind the tee. Ironically, your last post lauded the advantage of using high-content cell-based screening to inform on targets, rather than the real advantage of phenotypic screening: you end up with something that is (if designed reasonably well) somewhat informative of the disease state, as opposed to being sent back to the deconstructionist beginning (i.e., targets).

    1. Derek Lowe says:

      Oh, I freely admit that advantage. A bad phenotypic screen is the worst of both worlds, but a good one is really something to behold. Of course, that cell-imaging technique I mentioned does tend to lead back to targets, rather than directly to phenotypes, since the perturbations being looked at are pretty basic ones.

      1. DrunkTheKoolAid says:

        With good machine learning, you may be able to correlate seemingly basic changes with something useful, especially if there are known drugs in the space. It’s not necessarily a requirement to be able to visually (in human terms) understand the phenotypic changes – having prior examples of those changes can be informative.

      2. Mol Biologist says:

        Yes, Derek, it is getting boring. I remember that a few years ago, at a conference, I dumbfounded a postdoc at a company that was developing INNOVATIVE drugs to treat cardiomyopathy. My question was not naive: I asked why they used neonatal cardiomyocytes for HTS screening. “The devil is in the details” – something might seem simple at first look but will take more time and effort than expected. IMO, before using neonatal cardiomyocytes, you have to ask a very important question: how and why is the fetal gene program reactivated during heart remodeling?

    2. MrRogers says:

      Of course your phenotypic screen has to use a phenotype that actually reflects the pathophysiology of the disease. What you’re really doing is trading a biochemical hypothesis for a cell biological hypothesis. Until you get to Phase II you have to rely on your model (even if it’s whole animal based) being an accurate reflection of the disease. The only way to know that for sure is for Phase III to be successful. Until then you get to wonder why you didn’t go into finance or economics. In those fields you’ll never know if you were right, but at least you’ll be well-compensated.

      1. DrunkTheKoolAid says:

        Not necessarily, especially in early stage discovery. As long as the phenotype is reproducible and complex, it can act as a fingerprint rather than a recapitulation of the disease state. Of course, in this case you would want to follow up with something more translational.

        1. MrRogers says:

          But you still have to be right about what the “fingerprint” should look like. If you hypothesize that type 1 diabetes is a problem with renal glucose transport and use phenotypic assays to develop modulators of that activity, you’ll still fail because your screen didn’t reflect disease pathophysiology.

          1. DrunkTheKoolAid says:

            Again, not necessarily. If you train a model with examples of actives and inactives, what happens to the cells does not have to reflect the disease state; it just has to be consistent for each class.

            Imagine building a fast screen for bank loan risk. If you had pictures of people who were good risks and others who were bad risks, it might be something in the way they dress, how they cut their hair, whatever, that serves as a surrogate for their risk. You can model this.

            Of course, this would only be appropriate as an initial screen. You’d want to follow up with something that is a better measure of being a good therapeutic. But you could do that on a much, much smaller set.

          2. DrunkTheKoolAid says:

            Let me be a bit clearer.

            You would not have to “understand” the fingerprint. The model would figure out what separates the training set actives from inactives.

            Also, if you were studying an effect that you postulate affects renal function, you’d probably want to use renal cells (but that’s not necessarily an absolute requirement). But you probably don’t have to use a disease model either. Normal renal cells may suffice.
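            A minimal sketch of the kind of classifier being described here, assuming each well has already been reduced to a feature vector by high-content image analysis (everything below is a synthetic placeholder):

            ```python
            import numpy as np
            from sklearn.ensemble import RandomForestClassifier
            from sklearn.model_selection import cross_val_score

            rng = np.random.default_rng(42)

            # Stand-in for image-derived well profiles: 200 wells x 50 features.
            # The actives share a consistent shift that no human needs to be
            # able to interpret -- the model only needs class separability.
            X_inactive = rng.normal(0.0, 1.0, (150, 50))
            X_active = rng.normal(0.0, 1.0, (50, 50)) + rng.normal(0.8, 0.1, 50)
            X = np.vstack([X_inactive, X_active])
            y = np.array([0] * 150 + [1] * 50)

            clf = RandomForestClassifier(n_estimators=200, random_state=0)
            print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
            ```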

        2. regularguy says:

          As Derek mentions, positive controls are a challenge for drug discovery targets. It seems it would not be very common to have enough positives to train a target- and phenotype-agnostic learning algorithm on… and if you did, that reduces the worth of discovering new chemical matter.
          Also, I think the proportion of drug targets that will produce a unique and measurable signature, especially in the cell models used for HTS, is pretty low. Most will require some kind of engineering of the system, which is invoking a cellular hypothesis, as others have commented.

  2. John Wayne says:

    “There are plenty of jobs where you know what’s going to happen, but you chose this one instead!”


  3. Wavefunction says:

    I feel like we should have a bumper sticker saying “Honk if you have been misled by HTS hits.”

    1. Daniel Barkalow says:

      Eh, most of the people who’d honk just honk at everyone.

      1. KN says:

        These people are such a PAIN

        1. Anonymous Researcher snaw says:

          My wife says PAINS are the drug discovery equivalent of the Messier Catalog. I replied, “actually I think the Messiest Catalog!”

          Derek will get this since he is an amateur astronomer. Here’s an explanation for those unfamiliar with astronomy. Messier made his catalog because he was a comet hunter and things like nebulae were false positives. So he cataloged them to avoid being fooled when comet hunting.

  4. Uncle Al says:

    R&D evinces HR-staffed labs fail by the book. The awful alternative is hiring autist monomaniacs, tossing in money, and sweating blood. Wrong people succeed because right people exhausted near every reasonable solution. “Reasonable” has orthogonal elasticity.

    Add video gamers to drug discovery. How much fundamental – and working – science was born as a mistake? Ziegler-Natta, nylon, kevlar, NMR, Pyroceram; penicillin, cyclosporin, crown ethers. Somebody screwed the pooch – who then had kittens.

  5. Eric says:

    Good description of the struggles in early drug discovery. It’s often underappreciated in the lay press just how little we really understand. Even in industry there is a strong tendency for revisionist history in successful programs. It’s much more appealing to say we optimized a lead through rational drug design than to admit we tried a bunch of analogs and were surprised by the one that worked so well. Only later did we discover why it was successful and the others weren’t.
    Serendipity drives so much of drug research – so, as you say, just “run the flippin’ experiments.”

  6. Rabia Mateen says:

    Hello all,
    I am a PhD student in polymer science who recently published a paper on using hydrogel-based microarrays for drug screening. I found that performing a simple drug screen (enzyme + inhibitor) at a hydrogel interface prevents false-positive hits from aggregating inhibitors via steric blocking. I have no experience in performing HTS studies and was wondering if any of you could comment on this approach and its potential to mitigate the difficulties caused by aggregators in HTS.

    Thanks!! 🙂

    1. Agg says:

      How do you know there is no false negative?

      1. Rabia Mateen says:

        The assay simply operates on the principle of size exclusion: any drug molecules that self-associate into aggregates will not be able to access the target enzyme. Are there cases of viable drugs that also tend to aggregate?

        1. Derek Lowe says:

          Problem is, aggregation is an extremely assay-specific phenomenon, so “tend to aggregate” is a risky phrase. Once over concentration W in Buffer X at pH Y and ionic strength Z, yes – but under other conditions, maybe not. . .

  7. Peter kenny says:

    Even in the early days of HTS, back in the early nineties, it was clear to the people working up the results that much crap was generated, and the problem has always been to demonstrate that what was thought to be crap really was crap. By about 2005 where I worked (AstraZeneca), the first phase of HTS workup typically consisted of selecting compounds for high-throughput concentration response, and historic assay results would be taken into account when prioritizing hits. In general, I think it is necessary (but not sufficient) to characterize the concentration response in order to claim that one has experimentally demonstrated that a hit is crap (this is one of my challenges to the PAINS cult). Ideally, I would want to be in a position to characterize binding using an affinity assay such as SPR that allows association, dissociation, and stoichiometry to be observed directly. It’s also important to recognize that there are different types of turd in the HTS scatologeome. First, compounds that give a positive readout without affecting target function (e.g. singlet oxygen quenchers in AlphaScreen assays). Second, compounds that affect target function by an undesirable mechanism of action (e.g. colloidal aggregators). Third, compounds that are highly insoluble, unstable, metabolically labile, etc.

    That said, I would echo Derek’s advice and run the experiments. In case it is of interest, I have linked my comments on the ACS assay interference editorial as the URL for this comment.
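    In code, the concentration-response characterization described above usually amounts to fitting a four-parameter logistic (Hill) curve; a minimal sketch with scipy, on synthetic data that assume a well-behaved ~1 µM inhibitor:

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    def four_pl(conc, bottom, top, ic50, hill):
        """Four-parameter logistic: signal as a function of concentration."""
        return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)

    conc = np.logspace(-9, -4, 10)  # 1 nM to 100 uM
    rng = np.random.default_rng(7)
    signal = four_pl(conc, 5.0, 100.0, 1e-6, 1.0) + rng.normal(0, 3, conc.size)

    popt, _ = curve_fit(four_pl, conc, signal, p0=[0.0, 100.0, 1e-6, 1.0])
    print(f"fitted IC50 = {popt[2]:.2e} M, Hill slope = {popt[3]:.2f}")
    # A genuine hit gives a clean sigmoid; many artifacts show up as
    # suspiciously steep slopes, partial curves, or no response at all.
    ```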

    1. 10 Fingers says:

      “It’s also important to recognize that there are different types of turd in the HTS scatologeome.”

      This is the power of orthogonal counterscreens: try and line two or three up for efficiency and just look at the minimal set that survives the running of the gauntlet. The things at the end are either really pernicious artifacts or something real.

      HTS/Co-Crystallography approaches work well for this reason. One can still get an “artifactual” binder at the end (it is amazing how much kinases like garbage planar arrays of aromatic rings), but the signal to noise, in my experience, is still really high.

      1. Peter Kenny says:

        Orthogonal counterscreens will identify HTS turds of the first type that stink (but have no effect on target) but they are less useful for HTS turds of the second type that act by an undesirable mechanism of action. Crystal structures certainly raise confidence levels although SPR would still be my favored option (if it can be applied) because it can deal with HTS turds of the first two types, leaving porcelain gleaming as if straight from the factory.

    2. John Wayne says:


      Brilliant. It looks like the snow day was good for creativity.

      1. Peter Kenny says:

        Thanks. Not really a snow day here in Trinidad, where it was in the mid-80s before I escaped to the hills for an exercise walk.

  8. Chrispy says:

    Unless by “run the flippin’ experiments” you mean Phase III trials, you still won’t know. Every experiment we run is merely an argument to run the next one: no one cares about your mouse experiment once your compound is in humans, but it will never see humans without the mice. The fact that we don’t have reliable mouse models for most of our diseases of interest does not mean that you can skip this step, it just means that you use gamed mouse systems with predictable outcomes. We have a well-mapped path to failing in the clinic.

  9. Barry says:

    On one end of the spectrum are things like bacterial infections, in which the leaps from the first assay (is the well cloudy or clear?) to the first animal (is the guinea pig dead or alive?) to the human disease are small. This is where Paul Ehrlich was at the dawn of the 20th century.
    On the other end, there are diseases like Alzheimer’s, where the leaps from the first assay (a protease? a protein-protein interaction?) to the animal model, and from the animal model to the human disease, are contentious (you might prefer other adjectives).
    There’s still room to get burned in the clinic, even on well-trodden terrain. Abbott abandoned an efficacious, safe macrolide antibiotic ca. 15 years ago because it perverted the patients’ sense of taste. But that’s nothing like the probability of failing in, e.g., Alzheimer’s.

  10. AlmostTrp'edUp says:

    In the “plenty of ways to lose” category: overexpression of your target of interest causes upregulation of an endogenous protein, which then provides “hits”. In this case the authors recognized AITC as a TRPA1 agonist and ran the appropriate follow-up experiments. I wonder how often this situation occurs?

    Buber et al., “Overexpression of Human Transient Receptor Potential M5 Upregulates Endogenous Human Transient Receptor Potential A1 in a Stable HEK Cell Line”, Assay Drug Dev. Technol. 2010, 8, 695.
