
Drug Assays

Drug Assay Numbers, All Over the Place

There’s a truly disturbing paper out in PLoS ONE with potential implications for a lot of assay data out there in the literature. The authors are looking at the results of biochemical assays as a function of how the compounds are dispensed in them, pipet tip versus acoustic, which is the sort of idea that some people might roll their eyes at. But people who’ve actually done a lot of biological assays may well feel a chill at the thought, because this is just the sort of you’re-kidding variable that can make a big difference.

Dispensing and dilution processes may profoundly influence estimates of biological activity of compounds. Published data show Ephrin type-B receptor 4 IC50 values obtained via tip-based serial dilution and dispensing versus acoustic dispensing with direct dilution differ by orders of magnitude with no correlation or ranking of datasets.

Lovely. There have been some alarm bells sounded before about disposable-pipet-tip systems. The sticky-compound problem is always out there, where various substances decide that they like the plastic walls of the apparatus a lot more than they like being in solution. That’ll throw your numbers all over the place. And there have been concerns about bioactive substances leaching out of the plastic. (Those are just two recent examples – this new paper has several other references, if you’re worried about this sort of thing).
This paper seems to have been set off by two recent AstraZeneca patents on the aforementioned EphB4 inhibitors. In the assay data tables, these list assay numbers as determined via both dispensing techniques, and they are indeed all over the place. One of the authors of this new paper is from Labcyte, the makers of the acoustic dispensing apparatus, and it’s reasonable to suppose that their interactions with AZ called their attention to this situation. It’s also reasonable to note that Labcyte itself has an interest in promoting acoustic dispensing technology, but that doesn’t make the numbers any different. The fourteen compounds shown are invariably less potent via the classic pipet method, but by widely varying factors. So, which numbers are right?
The assumption would be that the more potent values have a better chance of being correct, because it’s a lot easier to imagine something messing up the assay system than something making it read out at greater potency. But false positives certainly exist, too, so the authors used the data set to generate a possible pharmacophore for the compound series using both sets of numbers. And it turns out that the one from the acoustic dispensing runs gives you a binding model that matches pretty well with reality, while if you use the pipet data you get something broadly similar, but missing some important contributions from hydrophobic groups. That, plus the fact that the assay data shows a correlation with logP in the acoustic-derived data (but not so much with the pipet-derived numbers) makes it look like the sticky-compound effect might be what’s operating here. But it’s hard to be sure:

No previous publication has analyzed or compared such data (based on tip-based and acoustic dispensing) using computational or statistical approaches. This analysis is only possible in this study because there is data for both dispensing approaches for the compounds in the patents from AstraZeneca that includes molecule structures. We have taken advantage of this small but valuable dataset to perform the analyses described. Unfortunately it is unlikely that a major pharmaceutical company will release 100’s or 1000’s of compounds with molecule structures and data using different dispensing methods to enable a large scale comparison, simply because it would require exposing confidential structures. To date there are only scatter plots on posters and in papers as we have referenced, and critically, none of these groups have reported the effect of molecular properties on these differences between dispensing methods.
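For anyone who wants to run the same sort of comparison on their own numbers, here is a minimal sketch in Python. It is not the authors' code or workflow: the function names, the record layout, and the values in the usage example are all assumptions made for illustration, and the example values are invented placeholders, not data from the AstraZeneca patents. It computes the tip-versus-acoustic fold shift for each compound and the Pearson correlation of logP against pIC50 for each dispensing method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant.
    return sxy / math.sqrt(sxx * syy)


def pic50(ic50_molar):
    """Convert an IC50 in mol/L to pIC50 (higher means more potent)."""
    return -math.log10(ic50_molar)


def compare_dispensing(records):
    """Summarise tip vs acoustic results.

    records: list of dicts with keys 'logp', 'ic50_tip', 'ic50_acoustic'
    (hypothetical layout; IC50 values in mol/L).
    """
    logp = [r["logp"] for r in records]
    p_tip = [pic50(r["ic50_tip"]) for r in records]
    p_aco = [pic50(r["ic50_acoustic"]) for r in records]
    return {
        # Fold shift > 1 means the compound looks weaker with tips.
        "fold_shift": [r["ic50_tip"] / r["ic50_acoustic"] for r in records],
        "r_logp_vs_tip": pearson(logp, p_tip),
        "r_logp_vs_acoustic": pearson(logp, p_aco),
        "r_tip_vs_acoustic": pearson(p_tip, p_aco),
    }


if __name__ == "__main__":
    # Invented placeholder values, NOT data from the paper or the patents.
    made_up = [
        {"logp": 2.1, "ic50_tip": 5e-7, "ic50_acoustic": 4e-8},
        {"logp": 3.0, "ic50_tip": 2e-7, "ic50_acoustic": 9e-9},
        {"logp": 3.8, "ic50_tip": 9e-7, "ic50_acoustic": 2e-9},
        {"logp": 4.4, "ic50_tip": 3e-7, "ic50_acoustic": 1e-9},
    ]
    for key, value in compare_dispensing(made_up).items():
        print(key, value)
```

With only fourteen compounds, as in the patent set, any correlation coefficient from a calculation like this should be read with caution.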

Some of those other references are to posters and meeting presentations, so this seems to be one of those things that floats around in the field without landing explicitly in the literature. One of the paper’s authors was good enough to send along the figure shown, which brings some of these data together, and it’s an ugly sight. This paper is probably doing a real service in getting this potential problem out into the cite-able world: now there’s something to point at.
How many other datasets are hosed up because of this effect? Now there’s an important question, and one that we’re not going to have an answer for any time soon. For some sets of compounds, there may be no problems at all, while others (as that graphic shows) can be a mess. There are, of course, plenty of projects where the assay numbers seem (more or less) to make sense, but there are plenty of others where they don’t. Let the screener beware.
Update: here’s a behind-the-scenes look at how this paper got published. It was not an easy path into the literature, by any means.
Second update: here’s more about this at Nature Methods.
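To see why a sticky compound can give numbers that are wrong but perfectly repeatable, here is a toy model in Python. It rests on a loud assumption that is mine, not the paper's: every tip transfer in a serial dilution loses the same fixed fraction of compound to the plastic, and direct (acoustic) dilution loses none. Real adsorption is unlikely to be this tidy, and the function name and parameters are hypothetical. The point is only that a deterministic loss shifts the apparent IC50 by the same factor on every run.

```python
import math


def apparent_ic50(true_ic50, top_conc, dilution_factor=3.0, loss_per_transfer=0.0):
    """Nominal concentration at which a toy serial dilution reaches the true IC50.

    Assumption (toy model): each transfer multiplies the actual concentration
    by (1 - loss_per_transfer) on top of the intended dilution, so the actual
    concentration drops faster than the nominal one.
    """
    if not 0.0 <= loss_per_transfer < 1.0:
        raise ValueError("loss_per_transfer must be in [0, 1)")
    if dilution_factor <= 1.0:
        raise ValueError("dilution_factor must be greater than 1")
    if not 0.0 < true_ic50 <= top_conc:
        raise ValueError("true_ic50 must be positive and no greater than top_conc")

    # Actual concentration falls by this factor per step.
    effective_factor = dilution_factor / (1.0 - loss_per_transfer)
    # Number of (possibly fractional) steps until actual conc equals the true IC50.
    steps = math.log(top_conc / true_ic50) / math.log(effective_factor)
    # The nominal concentration written on the plate map at that point.
    return top_conc / dilution_factor ** steps


if __name__ == "__main__":
    true_value = 1e-8  # mol/L, arbitrary illustrative value
    for loss in (0.0, 0.1, 0.3):
        shifted = apparent_ic50(true_value, top_conc=1e-5, loss_per_transfer=loss)
        print(f"loss per transfer {loss:.0%}: fold shift {shifted / true_value:.2f}")
```

In this model the fold shift grows with the number of dilution steps needed to reach the IC50, so more potent compounds are hit harder, and running the same protocol again reproduces the same wrong answer.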

47 comments on “Drug Assay Numbers, All Over the Place”

  1. BHip says:

    Speaking as someone on the assay side, I know this is the kind of thing that drives medicinal chemists berserk.
    Something obvious to keep in mind- the tips are not the only potential source of spurious compound binding in the assays (potency shifts due to the plates, beads, etc.??).

  2. SP says:

    You’re more likely to have false positives in cell-based assays, where a pin tool or pipettor dumps a blob of concentrated compound in high percentage DMSO that sinks onto the cell monolayer at the bottom of the well before dispersing.

  3. JeffC says:

    Aint no substitute for repeats. Different days, different biologists. Tends to pick up experimental error. And retesting from solid.
    This emphasises the massive risks in the virtual, outsourced drug discovery model. It might be lean and capital efficient but if your scientists aren’t talking to each other every day face-to-face these types of issues are going to bite you in the ass. Probably once you’ve spent all that lean cash.

  4. ADCchem says:

    Interesting study. My experience with nanomolar to picomolar greasy compounds is quite the opposite. That is, picomolar dilutions tend to be more concentrated, as the greasy compounds stick to the plastic tips and are transferred to the cellular assay at higher than diluted concentrations. In general, greasy compounds’ IC50s tend to be lower than expected at these concentrations. It would be interesting to see how the acoustic based method performs on low concentration assays.

  5. Anonymous says:

    “3. JeffC on May 3, 2013 9:36 AM writes…
    Aint no substitute for repeats. Different days, different biologists. Tends to pick up experimental error. And retesting from solid.”
    This is potentially where you can get caught out. The IC50s by both tip and acoustic dispensers are reproducible, just reproducibly different. So your n=3 might make you happy that you’ve got the right result, but you haven’t. That’s why your blood should run cold.

  6. JeffC says:

    I agree to some extent with what you say, but typically a biologist will use one pipette that is “theirs” but maybe completely different from another person in the same lab doing the same assay. This is a physical effect essentially, and while it’s beautifully repeatable across a single device, if you have biologists using multiple devices I suspect that’s when you’ll see variation and the alarm bells will start ringing. Additionally, some biologists will dilute their samples one way, and another will do it slightly differently. The theoretical concentration is the same but how they get there is different. All of these things will increase the chance of you seeing odd results or things that don’t quite make sense. But only if you are actually talking to the people doing the work. All the time.
    I’m not disagreeing with you. This is yet another way in which things can get messed up. But my point is that a good team that is talking about the results being generated all the time will stand a far better chance of picking this type of thing up before they find the compound tanks in the Phase 2 POC trial.

  7. RD says:

    Well, this explains a lot of pointless team meetings and SAR that went nowhere.

  8. Computationally Entertained says:

    It reminds me of the complaints I had for a recent paper titled “do medicinal chemists learn from activity cliffs? …”

  9. Stephan says:

    I think (hope) everybody in assay development is familiar with the sticky wall and volatile plate issues.
    Our production folks certainly are. If I ask them to package 100µl of 1µg/ml (there aren’t any moles in their lab) detector antibody, they ask if that is for immediate use (fine), for use any time after 2 weeks @ 4°C (fine, they will adjust the conc.), or to be used anywhere between 12h and 14 days after bottling (good luck, you are on your own).

  10. Anonymous says:

    I completely agree with the second bit. And a good dialogue with the screener in the team can really help when the correlations get messy for other reasons too.
    If you really are screening with a biologist and their favourite pipette then you might want to think about changing that. Using standardised automated processes can really help to minimise those kinds of differences.

  11. Cellbio says:

    Agree with ADCchem. Jeff, it is more about sticking to plastic and method than specific device or operator. Worked with a compound that showed 3 orders of magnitude difference in apparent IC50 depending upon whether one changed tips between dilutions, with no tip change appearing more potent. For a really greasy compound, now a marketed drug, dilution in media without serum approached micromolar, low nanomolar with serum, sub-picomolar formulated in micelles. Came to conclude that just as mg/kg dosing is not a valid way to infer in vivo exposure, inferring intrinsic target potency from concentrations expected from dilution schemes is flawed, as many factors such as plastic binding, protein binding etc. are integrated.

  12. darwinsdog says:

    In addition to adhesion to plastic surfaces, I would wonder about aggregation of cmpds (think Brian Shoichet) or other physical effects related to the properties of each cmpd that manifest by diff. dispensation methods.

  13. JC says:

    Never look at the IC50 curves; you will go mad.

  14. Hap says:

    12 – I thought the detergent-added runs were supposed to prevent that – are detergents generally unable to break aggregates already formed?

  15. Cialisize Me says:

    Isn’t the best way to perform biochemical IC50s to do your serial dilutions *first* into straight DMSO, followed by addition of a small amt of these to your reaction well? That way you know that the actual amt of compound got into the well. This works for any method, pipette or acoustic. It solves your sticky pipette problem or crashing out at higher concs. Your enzyme has to be able to function in a percent or two DMSO, but most times this is OK. C-M

  16. Hap says:

    12 – I thought the detergent-added runs were supposed to prevent that – are detergents generally unable to break aggregates already formed?

  17. Algirdas says:

    @14 Hap,
    I wonder exactly how does adding detergent work? I have not done this myself, but I do have an assay featuring a greasy compound, which may benefit from added detergent. But… one might imagine two scenarios:
    i) no detergent. Greasy compound aggregates, effective dissolved conc is tiny, assay shows no effect.
    ii) with detergent, greasy compound partitions into mixed micelles, effective dissolved conc is tiny, assay shows no effect.
    Are improvements seen with the addition of the detergent large? Like, orders of magnitude-large?

  18. Anonymous says:

    @ 17 they can be very large differences. And it depends on how your compounds are acting. It can either help stabilise compounds in solution and so you get an increase in potency or (this is the one I’ve seen more often) your cpds in the absence of detergent are forming aggregates and giving non-specific inhibition, in which case adding detergent gets rid of this activity.
    Brian Shoichet’s papers have a lot on assays with/without detergent.

  19. Hap says:

    17: I haven’t used detergent in drug assays (because I don’t work in a lab), but there has been a lot of discussion about it here. If using detergent doesn’t help with aggregation in the pipet (doesn’t break up already-formed aggregates), you might still have problems with your assays when you think you don’t.

  20. smurf says:

    at 17, 18: orders of magnitude differences are possible.
    Use Pluronics in your assay and solubilisation buffer, e.g. F-127. It is biocompatible, and actually not a detergent. Helps a lot.
    Be careful how you dissolve the solid, the exact path from solid compound to assay data does matter. It’s anal stuff, unfortunately still more art than science: most people don’t care.

  21. darwinsdog says:

    @16,19 First of all the paper in question is a meta analysis of data from other papers. So take it for what it is worth and realize that a lot more may be a variable here than just the dispensation method alone (usual list when changing assay protocols all the way to material lots over time) but it makes for a point to blog about and count as a PLOS publication (for whatever that gets you). But back to the detergent control – I do not know if the original papers included detergent controls in an attempt to address aggregation but I can tell you in general pharma companies (at least the ones I know about first hand) don’t usually dedicate wells in a plate to this type of control (I guess the cmpd libraries are deemed to sufficiently cover the ‘palmolive-space’ :-).

  22. Anonymous says:

    I tend to use tween / triton and am with 15 in dissolving in DMSO then adding a small volume to the assay. I think that’s what they do in the paper Derek blogged about too?

  23. smurf says:

    at 15: yes, that’s the best way. From solid into DMSO, then sonication, then serialisation, then dilution in solubilisation buffer, then assay buffer. Acoustic dispensing works very well, still, for key compounds I used to repeat the data as described, often from different batches or from solid compound sources.
    And stats matters, number of data points matter, multiple repeat experiments at different days matter: managers don’t like it, but it DOES matter. Ideally the assay and screening biologists should be in daily contact with the chemists to ensure that key compounds are repeated again and again to establish a solid data set.

  24. cdsouthan says:

    Time for a CASFEAR competition: “Critical Assessment For Enzyme Assay Reproducibility”
    Just send a few reference compounds (with LogP spread) out to 20 to 50 labs and ask for the IC50s and Ki’s back with some linearity curves thrown in. Then write the paper.

  25. Hap says:

    I can see why it wasn’t in Science, but I wouldn’t have agreed that it belonged in a medicinal chemistry journal.
    How do journals get all the pretty biological data they show? In some cases (maybe a lot), there aren’t exogenous components in the assays or their standards and so aggregation/other issues during dispensing aren’t issues, but in the remaining cases components which may not behave correctly are added exogenously. Without further data, you can’t assume that all compounds behave like this (maybe the set of compounds used was funny in some way that led to the observed behavior), but I’d not be betting the house, its furniture, or even the toys that this doesn’t happen elsewhere. Doesn’t that possibility make an awful lot of biological data look less secure? If it does, that would be important to a lot of people.

  26. MedChemGradStudent says:

    @17: Check out the below linked recent article by Davis and Erlanson on fragment screening. It is mainly focused on fragment based screening concerns, but it has a section on use of detergents and other important general points to keep in mind when it comes to screening (not just for fragment based screening)

  27. Alex says:

    Seems like the authors had some fun getting the paper published, too…

  28. darwinsdog says:

    @25 Well Hap welcome to the world of in vitro screening. To answer your first question – “graphing software”. To answer your second question – “yes, yes it does” but everybody in the field knows this intimately. Medicinal chemists, well all drug devpt scientists for that matter, view the screen as the means to the end or the back-story about how the pharmacophore came to be with the pharmacophore being the thing of interest. So other than in-house use, HTS data is mainly used as claims in patent filings and has all the robustness and validation implied by that. Said another way, if you have a critical decision to make on your own research don’t base it on screening data that someone else has published.

  29. anon says:

    well, you shouldn’t get too excited anyway until you get confirmatory activity in a totally different, low-throughput, manually pipetted assay
    but still it affects what “best” compounds ever go into those

  30. Jose says:

    I think it’s an important finding, but did they seriously think it was Science worthy enough to submit not once, but TWICE?

  31. Lars Duelund says:

    Two things here:
    First I think PLoS One is a good place for studies like this. Everybody can get access to this work, also the people working in small organisation.
    Second, and I know it is a bit OT, but the pipettes with disposable tips are probably the most misused piece of lab equipment around. Just recently I encountered a postdoc complaining to me that when he pipetted 1.5 mL chloroform into a 2 mL vial there was more than 2 mL of solvent… First off, CHCl3 in a single-use plastic tip will not give the right volume, ever. The second mistake was that the knob was pushed all the way down, as for emptying the tip, both when filling and emptying it, hence the more than 2 mL of solvent. And I was also told about the professor with several PNAS papers who had done the same all his career.

  32. Anne says:

    @Lars: I vividly remember the moment, during my very first lab research experience (the summer after freshman year of college), when I realized that I had been pipetting incorrectly all summer, pushing the knob down fully to pull up small quantities of a very expensive drug. This explained why I had run out of the drug so quickly (though as this was 2006 my boss had no problem dropping a few thousand on another batch). I then had to recalculate the actual drug dosages I had used in my experiments, and my data ended up being labeled as “high” and “normal” concentrations… I was profoundly embarrassed and terrified to tell my boss. Luckily he was kind and understanding. I’m just glad I realized this at such an early and low stakes stage of my research career. I can’t imagine making it to postdoc, much less professor, without knowing how to pipet properly…

  33. Sean Ekins says:

    Thank you Derek for highlighting the paper and the blog. Also a thank you to the commenters. I would like to respond to them all but I just do not have the time or the knowledge on some of the issues raised, so I hope this suffices. I have added a bit more analysis behind why this work gives me nightmares on my blog, and answered just a few of the comments relating to the visibility of the “stickiness/leaching” issue and also why we tried “Science”, so rather than repeat myself… Would be happy if people can find any other datasets that we could analyze using multiple dispensing methods. Also, does anyone at AZ want to send us the compounds in the paper, and we can probably find collaborators that would test many of the hypotheses set forth in the comments above? I sense this is just the beginning of trying to get to the cause of what we and others have seen. I regret we are the ones that are getting attention when really those in the figure Derek used should be the ones we look to and thank for bringing the issue to the public attention.

  34. sgcox says:

    ~20 smurf:
    never tried Pluronics before.
    What concentration would you recommend?
    Should it be used in addition to detergent or instead ? thank you.

  35. Joe says:

    A blog from Nature Methods comments on the original PLOSONE paper at

  36. DaveK says:

    Acoustic dispensing makes a difference? Oh no, maybe the homeopaths were onto something with their succussion all along! 😉

  37. cdsouthan says:

    Comparative characterisation of the basic dilution linearity could be useful, before layering on the complications of compounds and proteins. Manual Gilson vs tip vs acoustic robots could be run with a robust chromophoric reagent and even corroborated by uHPLC to check in = out

  38. KA1OS says:

    The somewhat scary thing about dilutions in various solutions is that they can be extremely reproducible. Low variation tells you nothing about whether the results are accurate. What we end up doing is increasing the number of mix cycles until the variation goes away. Without an independent control or way of measuring, you can’t know.
    #15 Cialisize Me, #23 smurf: I agree. If your compounds start in DMSO, keep all manipulations in that solvent as far into the process as possible. Note that acoustic dispensing often still requires three to four intermediate pre-dilution steps to compensate for the limited dispense range of these instruments (acoustic dispensers typically provide 2.5 nL – 250 nL dispense volumes, only a 100-fold working range). The HP ‘inkjet’ dispenser provides a wider range but still cannot cover that needed for a 10-point, half-log curve.
    I am familiar with the AZ studies which compare Echo dispensing against aqueous dilution (the latter case being significantly bad), but if anyone has a comparison against tip-based serial dilutions in DMSO, I’d appreciate any reference. I can say that we have seen differences between acoustic dispensing and serial-dilutions in DMSO *if* there is an aqueous buffer step just prior to final plating.
    I’ve also wondered for years how much pre-dispensed (or dry-dotted) plates created before adding assay reagents may exhibit problems. This would be tested against assay runs where the compounds are added after many of the reagents.
    I suppose the ultimate would be running assays in electrostatically-suspended droplets with no external surfaces to bind compounds.
    – Tim

  39. RB Davis says:

    Decades ago in my analytical chem days (HPLC) I worked with hazardous and toxic industrial and agrichemicals. Some of them very much preferred sticking to a plastic pipetter tip over going into the alcohol, acetonitrile, or MECl2 they were supposed to be dissolved in.
    The GC guys down the hall had long-since standardized on positive displacement micropipetters with a clear glass tube and PTFE piston. They were a little fiddlier to work with than the plastic wonders favored by the bio lab people, but they delivered outstanding accuracy and it was easy to visually verify that the gunk of interest was indeed no longer stuck to the piston or tube before pitching the tube and washing the piston with a squirt bottle.
    The air void plastic pipetters were always wildly inaccurate by our standards and the chemical affinity and contamination issues of the tips made them untrustworthy for our assays (purity, and concentration work, mostly) anyway.

  40. Sylvia says:

    While this is “truly disturbing” and probably means that a lot of time and money has been wasted – and it also explains all the disappointing news on drugs which were thought to be great candidates (until they failed efficacy in clinical trials)…to me this also sounds like the dawn of a new era, sort of a potentially looming gold-rush.
    All those compounds which could have been the desired next block-buster and which have escaped discovery so far due to simply how they were dispensed – they are still out there. Waiting to be identified, to become new drugs.

  41. Andrew Ray says:

    Sylvia – Be careful not to make too many assumptions in the other direction…that is, don’t assume that sticky walls and other pipette-related “errors” explain all efficacy failures. For example, those recent Alzheimer’s antibody failures are probably more likely the result of theoretical problems due to our still-limited understanding of AD. Sometimes, too, you’ll run into 100% water soluble, nonsticky compounds where differences in animal model/human metabolism cause them to impress during preclinical tests, then fail during clinical tests.

  42. Joe says:

    @20, @34. Note that Pluronics, contrary to earlier beliefs, do have biological effects on their own and with other drugs. See some of the papers by Alexander Kabanov on the impact of Pluronics on cells

  43. Joe says:

    @4 I agree that people often believe that hydrophobic compounds are transferred at higher concentrations than expected in serial dilutions. Michael Berger definitely agrees with you. However, the data in the patents from AstraZeneca suggest that tips are doing the exact opposite of this--they are reducing the concentration of the test compound as the dilutions are attenuated. This is also seen in Harris D, Olechno J, Datwani S, Ellson R, Gradient, contact-free volume transfers minimize compound loss in dose-response experiments. J Biomol Screen 2010 15: 86–94. Data from Bristol-Myers Squibb (Spicer T, Fitzgerald Y, Burford N, Matson S, Chatterjee M, Gilchrist M, Myslik J and O’Connell J. 2005 Pharmacological evaluation of different compound dilution and transfer paradigms on an enzyme assay in low volume 384-well format. Drug Discovery Technology. Boston) suggest that the compound that exhibited the greatest loss of material during transfer had a logP value of 2.87.
    I think that dilutions, an area that one might expect to be simple, are actually fraught with complications and deserve more attention. I hope that researchers will read the PLOS ONE article and that pharmaceutical companies (the holders of much similar data) will release the data for subsequent analyses.

  44. Anne Carpenter says:

    In response to the request for other studies of dispensing methods, this one compares acoustically-prepared pre-plated compounds vs pin tool:

  45. ben schenker says:

    I didn’t see anything in the PLoS one publication that references that the compounds may be ‘sticking’ to the tips. It does mention leachates, but then cites 4 pubs which say leachates come from all plasticware (including plates which acoustic devices use for storage and dispensing) and that leachates inhibit biological pathways, leading to greater perceived potency. This would seem to contradict what the authors are suggesting. The major difference would seem to be the use of high volume versus low volume liquid handling for setting up the dilution and assay plates, with the former potentially causing compounds to partially crash out of solution – this is not really a new idea though.

  46. Sarah says:

    Just to point out the ‘nice’ table showing the 4 comparative studies between acoustic and tip-based dispensing (Table 3) in the PLoS ONE paper contains only 1 study that is peer reviewed! In addition the ‘published data’ that are the basis of the whole paper is data from patent applications NOT peer reviewed publications – call me old fashioned but if it isn’t peer reviewed then I don’t take much notice of it….not that all peer reviewed papers are perfect either!

  47. Olechno says:

    In response to Sarah (@46), I am one of the authors of the PLOS ONE paper. I agree that patent data is not always the best source. I have seen patents suggesting that dowsing rods work. But one works with the data one has. As we pointed out in our peer-reviewed paper, we would be very excited to see someone do a comparison of a larger number of compounds. I suppose that it is possible that both AZ and BMS independently generated errors that trend in the same ways despite using different targets, different libraries and different assays. This, however, seems to be inventing reasons why not to believe the work is meaningful. It seems odd to discount the non-peer-reviewed patent data when we show that the direct dilution pharmacophore from that data was very similar to the pharmacophore from X-ray crystallography while the pharmacophore from serial dilution was both very different and non-predictive. I know that I can speak for my co-authors and state that we would be delighted to work with you or any others to run comprehensive tests and to nail down the effect (or non-effect) that we reported. I have asked groups to extend the work presented at the podium or at posters. I have been told that the research groups have made their decisions and see no reason to publish. They seem to see the results as an advantage that they might lose upon publication.
    Concerning peer review, you may be interested in reading the article by John Smith, former editor of BMJ:
    Also, consider that excellent science has appeared in non-peer-reviewed books (De Revolutionibus, Principia, Origin of Species). And the original Einstein papers on relativity did not undergo formal peer review but were printed because the editor liked them. And is particularly open to non-peer-reviewed work.
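The dispense-range point raised in comment 38 can be checked with a little arithmetic. The sketch below is in Python, and its function names are made up for illustration. It takes the figures quoted in that comment (2.5 nL to 250 nL dispense volumes, a 10-point half-log curve) and estimates the minimum number of source concentrations needed if each one can only span the instrument's volume range. It assumes ideal, continuously adjustable volumes and no overlap between stocks, so treat the result as a lower bound; real protocols may well need more, as the comment suggests.

```python
import math


def curve_span_log10(points, log_step):
    """Log10 span between the top and bottom concentration of a dose-response curve."""
    return (points - 1) * log_step


def min_source_concentrations(points, log_step, min_vol_nl, max_vol_nl):
    """Lower bound on the number of source stocks needed to cover the curve.

    Each stock can cover log10(max_vol / min_vol) log units by varying the
    dispensed volume alone (idealised assumption: no overlap, any volume allowed).
    """
    if min_vol_nl <= 0 or max_vol_nl <= min_vol_nl:
        raise ValueError("need 0 < min_vol_nl < max_vol_nl")
    range_log10 = math.log10(max_vol_nl / min_vol_nl)
    span = curve_span_log10(points, log_step)
    # Small tolerance so an exact multiple is not rounded up by float noise.
    return max(1, math.ceil(span / range_log10 - 1e-9))


if __name__ == "__main__":
    span = curve_span_log10(points=10, log_step=0.5)
    print(f"10-point half-log curve spans {span} log units "
          f"(about {10 ** span:,.0f}-fold)")
    print("minimum source concentrations:",
          min_source_concentrations(10, 0.5, min_vol_nl=2.5, max_vol_nl=250.0))
```

Nine half-log steps span 4.5 log units, while a 100-fold volume range covers only 2, so a single stock cannot do the job no matter how the volumes are chosen.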
