Aggregators Are Here to Stay, Unfortunately

When you screen a collection of compounds against a drug target, you have to be ready to deal with the fact that some fraction – and maybe it’s a large fraction – of the hit compounds are not really hits. They’re false positives of one sort or another. All the major techniques for compound screening generate them, and if the target you’re screening against has a low intrinsic hit rate, then almost everything on your hit list is probably in that category.

A very common mechanism underlying these is colloidal aggregation. This has only become appreciated relatively recently, thanks largely to the efforts of Brian Shoichet and his group at UCSF. Basically, the problem is that many compounds will start to pile up with each other in solution, not quite to the point of visibly crashing out in a sticky mess at the bottom of the vial, but enough to sequester your protein target. Taking it out of contention like this under the assay conditions will read out as a hit (lower activity!), so aggregators generally are hard to distinguish, in the first pass, from real hits. There are ways around this, the main one being to add a detergent to the assay system, after first checking which detergents the assay can stand. A small amount can make aggregates much less likely to form, and often makes a rather long list of supposed hits vanish into the mist. You can also analyze individual compounds by dynamic light scattering under the assay conditions, a more retail approach.

The really annoying things about aggregating compounds are that (1) there’s no good way to predict them up front, and (2) the whole phenomenon varies according to the assay conditions. Something that ferociously aggregates in Buffer A may be just fine in Buffer B. And as you’d imagine, this is all concentration-dependent as well. A great many compounds will start showing this behavior if you push them to high concentrations, which is one of the many reasons that you want to be really careful with data generated under those conditions, but the big question is whether a particular compound is aggregating at the concentrations where it appears to be active. If those ranges overlap, it doesn’t absolutely prove that the compound is a false positive, but you should at least put some sort of asterisk next to it to remind you that it certainly could be an artifact, and to look into it more closely if you really get interested.
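That overlap check can be sketched in a few lines of Python. This is only an illustration: the function name, the compound names, and all of the numbers are made up, and it assumes you have measured an onset-of-aggregation concentration (by light scattering, say) under the same buffer conditions as the assay.

```python
def needs_asterisk(ic50_uM, aggregation_onset_uM):
    """Return True if the apparent activity overlaps the aggregating range.

    ic50_uM: apparent IC50 from the assay, in micromolar.
    aggregation_onset_uM: lowest concentration at which aggregation was
        seen under the same assay conditions, or None if none was seen.
    """
    if aggregation_onset_uM is None:
        return False
    # Apparent activity at or above the aggregation onset is suspect.
    return ic50_uM >= aggregation_onset_uM


# Hypothetical compounds: (apparent IC50, aggregation onset), both in uM.
hits = {
    "cmpd_A": (25.0, 10.0),  # active only where it aggregates
    "cmpd_B": (0.5, 10.0),   # active well below the aggregation onset
    "cmpd_C": (5.0, None),   # no aggregation observed
}

flagged = [name for name, (ic50, onset) in hits.items()
           if needs_asterisk(ic50, onset)]
print(flagged)  # ['cmpd_A']
```

A flag here is the asterisk, not a verdict: it says the compound could be an artifact under these particular conditions, and a different buffer could change the answer.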

There’s a new paper in J. Med. Chem. from Shoichet and colleagues with some interesting data on all this. They’re making an aggregator database freely available, with thousands of compounds in it, and it includes similarity searching. That needs to be used with some caution, because (for the reasons mentioned above) aggregation can be a slippery phenomenon (the paper reviews some of the past attempts to do this sort of thing, which have been of limited usefulness). Testing this tool prospectively against molecules from the literature, followed by experimental verification, showed that it is indeed pointing in the right direction – the higher the Tanimoto similarity to known aggregators, the more likely a compound is to be one itself, although there are always false positives and false negatives. Indeed, just looking at the data they’ve generated will show how difficult the problem is, since there are very similar compounds that display quite different behavior.
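For anyone who hasn’t run into it, the Tanimoto coefficient is just the number of fingerprint features two molecules share divided by the number that either one has. Here is a minimal Python sketch of the “closest known aggregator” idea. It is not the paper’s implementation: the fingerprints are hypothetical sets of “on” bit positions, the function names are mine, and a real workflow would generate the fingerprints with a cheminformatics toolkit.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: define the similarity as zero
    return len(a & b) / len(union)


def max_similarity_to_known(query_fp, known_fps):
    """Highest Tanimoto similarity between a query and any known aggregator."""
    return max((tanimoto(query_fp, fp) for fp in known_fps), default=0.0)


# Hypothetical fingerprints (made-up bit positions, for illustration only).
known_aggregators = [{1, 4, 7, 9, 12}, {2, 4, 8, 15}]
query = {1, 4, 7, 9, 13}

print(round(max_similarity_to_known(query, known_aggregators), 2))  # 0.67
```

A high score only raises the odds. Very similar compounds can behave quite differently, so a number like this is a reason to run the experiment, not a replacement for it.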

So there’s no substitute for experiment. But the broad strokes of the data are hard to argue with:

Comparing ChEMBL Version 17, representing compounds tested in the medicinal chemistry literature, to ZINC, representing simply purchasable molecules, over 7% of compounds in the medicinal chemistry literature are predicted to aggregate and 1% are known to aggregate, while 0.73% of purchasable molecules are predicted to aggregate. This suggests that aggregators are actually being prioritized as ligands in medicinal chemistry.

. . .since 1995 the prevalence of plausible aggregators has grown by more than 8-fold in the medicinal chemistry literature (we cannot exclude the effects of the different types of chemotypes that have been explored, for new targets, since 1995, in addition to the advent of target-based screening).

I can believe that. It wouldn’t surprise me a bit if one of the factors pushing that number up was the rise in metal-catalyzed coupling chemistry during that period, which tends to generate molecules with less-than-ideal solubility. But this analysis raises some interesting questions. One possibility, and it seems to be a likely one, is that the false-positive nature of aggregating compounds causes them to be pursued as leads more often than they should be, enriching them in the med-chem literature. Another one, and it’s not mutually exclusive by any means, is that the sorts of structural features that cause compounds to aggregate may also have some overlap with things that protein binding sites actually select for. If that’s the case, then we really are never going to be able to get away from aggregating compounds, because they’re too much like the things we want to find.

It would be interesting to look at a large set of diverse compounds that don’t seem to aggregate under any common assay conditions, and see what the hit rate is for them over a good list of high-throughput screening campaigns. Comparing that figure to the other compounds directly wouldn’t be right, since the other hit rates are surely inflated by just these sorts of aggregating false positives, but what I’d like to know is the comparison between “real false-positive adjusted hit rates across a number of assays for the whole compound collection” versus “real hit rates across a number of assays for the set of compounds that are least likely to aggregate”. I’d be willing to bet that the latter rate would be noticeably lower, unfortunately, but I’m not volunteering to run this study, either.

28 comments on “Aggregators Are Here to Stay, Unfortunately”

  1. MoBio says:

    Very useful…for those of us who do lots of screening this could save some anguish

  2. Ash (Curious Wavefunction) says:

    Has anyone explored scaffold dependence of aggregators? For instance if there’s a known aggregator, what happens to its aggregation propensity as we start putting different kinds of spinach on it? In other words, is there a paper exploring “aggregation SAR” in any detail?

  3. Kelvin says:

    Why not set up initial “control” assays (under the same conditions) to identify and screen out any potential aggregators before running the actual biological assay? At least that would (or should) improve signal-to-noise ratio and reduce the false positive rate in case the actual hit rate is low. One way or another, you have to do this control assay, either before or after the biological assays…

    Or have I missed something?

  4. bhip says:

    I agree with Kelvin. I don’t worry about the “easy” false positives as they will be apparent in the first counter screen using the same assay conditions (buffer, pH, etc) as the original screen. The silent killers are the false negatives of which we are blissfully unaware.

  5. Wondering says:

    Where do we stand on compounds which are detergent sensitive, but crystal structures have been generated of the compound bound to target. Developable, or not?

  6. Sam Adams the Dog says:

    @Wondering: That’s covered by the concentration dependence. Though crystallization might have been done at concentrations where aggregation takes place, the real question is whether aggregation will occur at the concentration appropriate to drug use; say, low nanomolar. If not, you’re OK. Some drugs on the market are in fact known aggregators, as Shoichet showed in his early papers; but presumably, that’s not their mode of inhibition at drug-like concentrations. Interestingly, docking programs often find aggregators. To the extent (admittedly not huge) that docking programs can predict good binding, this demonstrates the point that Derek made that some aggregators are good ligands. What I don’t understand is why HTS isn’t always done with a bit of detergent present, when the assay can stand it. That should get rid of false positives (compounds that appear positive only because they aggregate) without causing false negatives (since aggregators that bind specifically will still show up as active when aggregation is inhibited).

  7. b says:

    @Kelvin – what control assays do you envision? Just doing compounds and reagents alone isn’t enough as you are looking for some sort of enzymatic readout in most cases. Light scattering? You could but it would be a challenge to do on your whole library and not really worth it. Secondary screen? You will be doing that on your false positives in the end anyways. Maybe I am missing something too but I’m not really sure what control assays you are envisioning.

  8. Lynn says:

    All the more reason to do whole-cell phenotypic assays. 🙂

  9. Kelvin says:

    @b: Yes, I was thinking more of light scattering, but to say it’s not worth it doesn’t make sense if people are reporting problems with false positives: Either this is a problem, or it’s not, and this is just one potential solution if it is.

  10. smurf says:

    We used to add Pluronic F-127 to every single one of our buffers. It’s very mild and does seem to do a decent job of preventing aggregation.

  11. HTSguy says:

    @Sam Adams: I agree with your comments on the importance of considering the concentration-dependence. BTW, detergent does reduce the prevalence of aggregators, but doesn’t eliminate the problem. A significant fraction of aggregating compounds are not detergent-sensitive. I think this was discussed in that recent Shoichet group paper.

  12. Peter Kenny says:

    Making the data available is an excellent initiative and congratulations to the authors for doing this. I’ve not looked at the aggregation literature for a while so somebody may have studied whether acids and bases are more or less likely to aggregate than neutrals. Aggregation of ionized species is likely to require incorporation of counter-ions in the aggregate. There is also the question of the extent to which the protein itself can influence aggregation.

    One way to think about aggregation is to ask why self-recognition results in precipitates/crystals in some cases and colloids/aggregates in others. An approach to mining databases like this would be to look for the smallest changes in structure that lead to the largest changes in behaviour (cliffs of sorts but not activity cliffs).

  13. MoMo says:

    John Irwin and Brian Shoichet should be commended on this, saving Pharma from expensive attrition and compound failures while doing basic science. They sure don’t do it themselves.

    Every Pharma reading this should contact them and send them research support money, as I can’t even count the number of times such compounds have been championed, then abandoned because of lack of attention to this primal characteristic of compounds.

    Another trick we find useful is to screen them for non-specific membrane perturbation in mammalian cells using flow or HTS. But Pharma is afraid to do this: they’d end up throwing out half their chemical libraries, and explain THAT to stockholders.

  14. Barney says:

    Presumably added detergent is either lowering the solution concentration of the aggregator by partitioning it into detergent micelles or arresting growth of aggregate nuclei that form (presumably the latter in the Shoichet paper, where it looks like the Triton X-100 concentration is close to but just below the critical micelle concentration?).

    It seems like another option would be to try to modify the protein being screened against to keep it out of aggregates (pegylate it or encapsulate it somehow), though that would probably introduce a whole new set of problems.

  15. Sam Adams the Dog says:

    @HTSguy: Thanks. Actually, I don’t see in the paper Derek quotes that some aggregators don’t respond to detergent. If anyone has a reference that shows this, I’d appreciate it. (Hoping I didn’t just miss it.) The authors do say (first page, first paragraph), “detergent typically only right-shifts concentration-response curves”. But if that’s the extent of it, it might not take a lot of detergent to right-shift the candidate compounds’ critical aggregation concentration to something greater than the concentration of the assay. This would then seem to be a powerful filter, especially when the candidate’s required binding affinity for further consideration is significantly less than the assay concentration. Possibly I’m still missing something, here, but so ISTM, based on my current state of ignorance. (Possible kicker: affinities of different candidates by detergent could be right-shifted by different extents, which really is just interpolation of shades of grey into the assertion that some aggregators don’t respond to detergent. But by how much, and what is the distribution of right-shifts over aggregators? That could be used to quantify the power of the filter.)

  16. anon electrochemist says:

    Just a heads up for those chemists adding detergents like Pluronic. Most ether detergents contain BHT or other free radical scavenger at some arbitrary concentration, which can then go on to influence the assay.

  17. Brian Shoichet says:

    Thanks Derek for highlighting this.

    I would like to second Momo’s comments that we be sent lots and lots of money. Thus far that hasn’t happened but operators are standing by…

    Apropos of papers showing aggregators that don’t respond readily to detergent, in http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2655312/ Kerim Babaoglu found 9 (of 70) HTS hits that were resistant to detergent. Detergent did eventually work to disrupt aggregation, but only at 0.1% Triton X-100, not the more conservative 0.01%. As has been already mentioned, we do try to stay below the CMC of whatever detergent we are using, which one usually is at 0.01% Triton, but one is well-over the CMC at 0.1%.

    Triton, Tween-80 (for membrane proteins), and others are good first passes, they *right shift* many things, but are not infallible guides.

    1. Aled says:

      Again to Momo, don’t listen to Shoichet.

      Please tell the donors to send all their money to Canada, where it will go a lot further (unfortunately). We promise to aggregate all the funding (without detergent) to purchase World Series Champion paraphernalia. We’ll also buy Blue Jays hats for our ex-pat friends at UCSF.

  18. Adam Shapiro says:

    There were additional tests for non-specific enzyme inhibition recommended in papers from Dr. Shoichet’s group, in addition to loss of potency upon adding detergent. Two others were time-dependent potency and enzyme concentration-dependent potency. Non-specific inhibition due to aggregators was lessened by increasing the enzyme concentration and by decreasing the reaction time. In applying these two tests, I found it convenient (and, in fact, unavoidable) to combine them. I compared the IC50 for the normal reaction conditions to the IC50 at higher enzyme concentration and shortened reaction time (detergent was always included in both assays). Taking into account the expected range of experimental error, an acceptable range for the ratio of the IC50s was set, and any compound that was outside that range was considered to be a non-specific inhibitor. Its propensity for aggregation was not investigated, since other reasons for non-specific inhibition were assumed to be possible as well (such as the presence of chemically reactive or chaotropic substances). This methodology allowed non-specific actives to be identified quickly and easily. It was only applied to relatively low-potency inhibitors (µM), since time-dependence of specific nM inhibitors was considered to be fairly plausible.

  19. annon II says:

    Great effort. Need more like it in both pharma and academic probe discovery

  20. tangent says:

    Question from the non-chemist gallery: is predicting aggregation within the range of what modeling software can conceivably be any use at?

    (The comment from Adam Shapiro — it sounds like aggregation takes macroscopic lengths of time to develop, minutes plus? — makes me concerned that any simulation will be hellaciously expensive in compute to reach the aggregated state.)

    1. John Irwin says:

      @tangent : We think aggregation will be very challenging to predict by simulation. Aggregation depends not only on the molecular structure, but also the buffer and solute concentration, so it would be a heck of a simulation.

  21. Stephen Frye says:

    Simple rule of thumb for hits: “Guilty of being an assay interference artifact until proven innocent”. Aggregation is one important generator of artifacts, but since there are many others, the safest bet is to have orthogonal, biophysical assays to prove target engagement. I never believe anything until all the in vitro data and a readout of function or target engagement in cells (e.g. CETSA) line up. Hit-to-lead work rewards the skeptic, not the optimist.

  22. At the risk of boring those who have already seen my presentation on this topic! Feel free to groan or yawn if you are one of them.

    The use of chemical microarrays totally sidesteps this issue. This is one of the numerous attractive advantages of the technology that make it stand out from all other alternatives.

  23. Soon-to-be-former BMS person MOLS says:

    For any HTS campaign there should be well-chosen counterscreens, such as related proteins. When visualized on a phylogenetic tree, your pattern of hits should make sense. If you’re hitting half the kinome, you need to know that.

  24. Berkley Lynch says:

    Back in the 1990s I was involved in screening for SH2 binding inhibitors. We encountered exactly this problem, though not with random screening but in med-chem projects. I finally realized aggregation was likely to be the cause of the inhibition of certain classes of compounds by several methods. One was the addition of low concentration of detergents, which did not perturb phosphotyrosine peptide binding to the SH2 domain, but did eliminate test compound binding. Another related way was that the SH2 domains could tolerate higher concentrations of DMSO… and at the higher concentrations the peptide binding remained, while the med chem compounds no longer bound. A third clue was that the med-chem inhibition/binding was irreversible: running the assay in pure aqueous solution showed inhibition, adding detergent to the wells after the inhibition and re-reading the plates still showed inhibition even though the presence of detergent from the beginning prevented binding. A final clue was that many of the medchem aggregating compounds showed steep Hill slopes.

  25. Peter Lowe says:

    In reply to comments suggesting that it is simple to run control biochemical assays, this is not as easy as it sounds. Controls such as omitting a component in an assay, or with an alternative target or assay system, can work with compounds of a defined binding mechanism (e.g. binding at a single site at a stoichiometry of 1:1). However, it is very difficult to envisage a simple, sure, control assay to eliminate compounds that are active via their aggregation. As mentioned above, there are many ways that one can look for ways to distinguish a single site binding mechanism from the effect of an aggregated compound, such as effect of detergent, changing target concentration, high slopes etc. However, any single one of these alone will not detect every effect caused by an aggregator. Likely, this is because there is not a single “mechanism” for how every aggregator exerts its effects; that is quite different from a single-site 1:1 binding molecule.

    Another important message, which still is sometimes overlooked, is that aggregation itself is not “bad”. The problem comes when the cause of the observed effect is due to aggregation alone, and that the compound acts by a mechanism that is not by simple stoichiometric binding. Hence screening for compound aggregation itself is unhelpful, and positively bad in “throwing out the baby with the bath water”. The issue occurs through the combination of a compound that has the potential to aggregate (which we now know is common), a target that may be sensitive to an aggregate (e.g. able to locally unfold), and the specific assay conditions. So, though many “nuisance” aggregating compounds may be “promiscuous”, this can give the wrong impression. In practice some can appear to show surprising apparent “specificity”. Here, I define “specificity” as causing effect on only a few protein targets, but not “specificity” in the sense of binding stoichiometrically at a single well-defined site on a target.

    To my view, one reason why “aggregating” inhibitors have become problematic in the last decade is due to assay technologies becoming more sensitive. This has resulted in using a lower concentration of target in an assay. Low protein concentrations often promote the undesirable effects of aggregating compounds, and so the advances in screening may have created the problem. Although not always easy to do in practice, comparing at a single compound concentration, or better still dose-response curves, at high and low target concentrations is a powerful way of detecting undesirable aggregators, with the proviso that none of the compounds are sufficiently potent that their IC50 is shifted by the change in protein concentration. I like this method as it is conceptually simple, based on the differences between simple stoichiometric, reversible binding equilibria and the undefined, undesirable mechanism by which an aggregator might act. Note that increasing the target concentration is not the same as addition of protein (e.g. BSA) to an assay with a low concentration of target; though clearly that can also be beneficial.

    The good news is that the work from Shoichet’s lab and others over the last years has laid out facts about such difficult-to-define mechanisms and that the tools are there to prevent compounds being followed up when their mechanism is not the desired one. The bad news is that compounds that work by inappropriate mechanisms are not always rejected soon enough; e.g. this can arise when the compound shows activity in a cellular system, and the biochemistry behind aggregating compounds is conveniently forgotten.

  26. mostafa says:

    Do soluble aggregates still induce chemical shifts in a protein 15N HSQC experiment?
