How Many of Those Compounds Are Crap?

A reader sent along a note about this letter to Nature Medicine earlier in the year. It’s about drug repurposing, and more specifically about the Drug Repurposing Hub at the Broad Institute. This is a collection of nearly 5,000 compounds, curated and annotated with their histories and activities. It was not a straightforward task:

. . .we sought to assemble a comprehensive library of drugs that have reached the clinic. Surprisingly, we found that no such chemical library of approved and clinical trial drugs is available for purchase. In particular, drugs that have been tested in clinical trials but did not reach approval are not readily accessible. Even obtaining a complete list of such drugs and their annotations is challenging. A prior effort led by the US National Institutes of Health (NIH) focused on drugs approved by the US Food and Drug Administration (FDA), but the library has few compounds that have yet to achieve FDA approval7. Some chemical vendors offer a subset of approved drugs, but most of these commercial libraries overlap in their content and include only a small fraction of the approximately 10,000 drugs that have reached the clinic in the United States and Europe. Given that no complete collection exists, we launched a three-step effort to create the Repurposing Library by (i) identifying and purchasing compounds; (ii) comprehensively annotating their known activities and clinical indications; and (iii) experimentally confirming drug identity and purity.

This looks like a very useful resource, and I’m glad to bring it up. But there’s one aspect of its creation that I wanted to highlight, which is what caught my correspondent’s eye. Intensive combing of databases showed that about 10,000 different small-molecule drugs have reached clinical development over the years. Fewer than 6,000 of these are commercially available substances, but the team purchased them all, and from multiple vendors when possible.

Outside of all the functional and historical annotation of these compounds, there’s the question of whether they are what they’re advertised to be. And here’s the part to note:

The final step in creating a drug-screening library is experimental confirmation of compound identity and purity. We therefore tested all compound samples in the Repurposing Library by ultra-performance liquid chromatography–mass spectrometry (UPLC–MS), after receipt of the compound from the vendor. Surprisingly, 2,482 of 8,584 samples (29%) failed QC, defined as a purity of less than 85%, as measured by UPLC absorbance peak area at 210 nm, or by an evaporative light-scattering detector (ELSD) for peaks containing the expected compound mass (Supplementary Fig. 4a). The majority of QC failures were subsequently confirmed by the vendors of the compounds upon checking of the source stocks.
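
For anyone who wants to see that QC criterion spelled out, here is a minimal Python sketch of an area-percent purity check. The 85% cutoff and the sample counts come from the passage above; the `Peak` record, the function names, and the example peak areas are hypothetical, and treating purity as "area of peaks carrying the expected mass over total integrated area" is my assumption about how such a number would typically be computed, not the Broad team's actual pipeline.

```python
from dataclasses import dataclass

# Threshold quoted in the letter: purity below 85% counts as a QC failure.
QC_PURITY_THRESHOLD = 85.0


@dataclass
class Peak:
    """One integrated chromatogram peak (hypothetical record layout)."""
    area: float               # integrated UV (210 nm) or ELSD peak area
    has_expected_mass: bool   # True if the MS trace shows the expected compound mass


def area_percent_purity(peaks):
    """Percent of total peak area belonging to peaks with the expected mass."""
    total = sum(p.area for p in peaks)
    if total <= 0:
        return 0.0  # nothing detected at all: treat as 0% rather than divide by zero
    target = sum(p.area for p in peaks if p.has_expected_mass)
    return 100.0 * target / total


def passes_qc(peaks, threshold=QC_PURITY_THRESHOLD):
    """A sample passes when its purity is at or above the threshold."""
    return area_percent_purity(peaks) >= threshold


def failure_rate(failed, total):
    """Failure rate as a percentage of all samples tested."""
    return 100.0 * failed / total


# Made-up example samples
clean = [Peak(900.0, True), Peak(100.0, False)]     # 90% area purity
degraded = [Peak(800.0, True), Peak(200.0, False)]  # 80% area purity

print(passes_qc(clean), passes_qc(degraded))
print(round(failure_rate(2482, 8584)))  # sample counts from the letter
```

Note that a check like this inherits every blind spot of the detector: a compound with no absorbance at the chosen wavelength, or an isomer with the same mass, will sail through or fail for the wrong reasons.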

Readers will vary in whether or not they are surprised by that. I would have been stunned if they’d reported that everything came back fine, but 29% is probably still a bit higher than I would have guessed (the last time I worked through a big list of purchased compounds, I think we lost about 20%). It appears that many of these QC failures came about through compound storage in DMSO, but not all of them can be blamed on that. In total, 984 compounds had to be excluded from the collection, even though they were putatively commercially available, because there was no source for them with material that passed analysis.

Experienced compound-library users already know this – but this should serve as a warning: don’t trust what’s on the label. Always run a check before you commit a purchased compound to your assays. Always check them again every time one hits from your own archive – it may have been just what it says on the label back in 2004, but it may not be now. When it comes to compound identity and purity, it’s strictly Nullius in verba – on no one’s word – or you’ll come to regret it.

30 comments on “How Many of Those Compounds Are Crap?”

  1. Josh Bittker says:

    We also recognized that, because we used our standard high-throughput analytical method, there are certain compounds that would not be properly detected. As a first pass we informatically flagged organometallics (although some of those were successful with our method) and compounds <100 MW with no rings. We annotated those as "QC-incompatible" on the web app, and if the vendor could provide their own QC data (NMR, LC-MS) beyond just a claim of purity, we accepted them.

    1. Derek Lowe says:

      That’s certainly true – there are things that are invisible to standard conditions. But I note that you used UV/ELSD, which covers a good amount of ground, and that’s what we used in the compound-purchase experience I referred to as well. We did rescue some things, but I think the biggest category of those were diastereomeric mixtures that the software flagged as impurities.

  2. luysii says:

    It’s far worse for the hapless buyer of illicit drugs on the street. I’m sure there is more recent work but a study of 1,163 samples of street drugs represented as stimulants (30 years ago) found that cocaine was absent in 20% and pure only in 60%. Of those said to contain methamphetamine, only 1/3 did. Other stimulants were found such as caffeine, ephedrine, lidocaine and procaine.

    One of the more fascinating cases I saw as an intern was a case of strychnine poisoning. Addicts back then were well aware of quality problems with what they were buying, and since morphine is an alkaloid, basic and bitter, they would taste the drug to make sure it was bitter, so strychnine was added.

    The major effect of strychnine is blockade of glycine receptors, which are inhibitory in the spinal cord – it also blocks the GABA[A] receptor and the nicotinic cholinergic receptor.

    At first it looked like the patient was convulsing. But he was not, just having excruciating muscle spasms while awake. He survived.

  3. nowhere man says:

    An additional QC concern would be isomers of the same mass. There were some problems I remember hearing about where a vendor was selling samples of the kinase inhibitor bosutinib that actually had chlorines on the wrong position of a phenyl ring. Caveat Emptor. If you are going to draw an important conclusion from an experiment you should check all reagents (antibodies too!) carefully.

  4. Imaging guy says:

    Does anyone know a similar website for protein therapeutics especially antibodies?

    1. Antibody guy says:

  5. Barry says:

    We had one interesting pyrimidine emerge from screening (we eventually got a very selective single-digit nanomolar kinase inhibitor from it) that hadn’t been in the library. On storage in DMSO, the dihydropyrimidine (which we thought would be an interesting, non-flat scaffold, but which hit in none of our assays) had oxidized to the biologically active pyrimidine.

    1. Ted says:

      That brings back some memories. We had a series of putative 2-aminopyrimidines that hit in a GPCR assay. The initial SAR was a bit puzzling, until I figured out that our hits were actually 2-chloropyrimidines – probably the synthetic precursors. We then disposed of the next three years of my life trying to turn that sow’s ear into… something. Unfortunately, that something was a ‘former medicinal chemist!’


  6. R says:

    Just for kicks, I searched for my company’s compounds – a few came up. Surprisingly, the structure of one of them is completely wrong. The QC metric that was discussed assumes that the vendor has the correct structure of the compound that they claim to be selling since it is based on matching the observed m/z to the theoretical one based on structure. Apparently that is not always the case.

  7. GladToMoveToProcess says:

    Sounds like Pfaltz and Bauer all over again!

  8. Joe Morrison says:

    Sometimes I like reading this to remind me how simple my life is as a metallurgical engineer. In some applications ensuring that a magnet will stick to the part is adequate quality verification. Although every few years I do have to etch aluminum samples with dilute hydrofluoric acid, most of the time NaOH is an adequate etchant for aluminum.

  9. R says:

    My bad. I take my comment back…Sorry…I blame my oversight on lack of coffee…

  10. anon says:

    Hopefully, quality control in Pharma is better now. In the mid-2000s at Schering-Plough, long overdue quality control analyses of the chemical library reduced the library size from 2 million-ish by ~50%. Disheartening for those tasked with finding hits…

    1. Validated Target says:

      I was involved in library synthesis and had good looks at much of the data from the entire group. Most of the products were mixtures or junk. I pointed out that there was a problem. They were sent out anyway. So was I.

  11. Yossarian says:

    210 nm is a demanding wavelength. At 210 nm small absorbing impurities get attenuated due to wide extinction coefficient differences. 254 nm seems to give more truthful results.

    1. Barry says:

      254nm is the workhorse as long as you’re looking for aromatics (still most drug substances, I believe) but it’s blind to impurities like DMF, and misses some drug substances which lack such a chromophore. Better to follow multiple wavelengths (if not wholly different analytical methods).

    2. Chrispy says:

      What kind of name is Yossarian?

      1. Wavefunction says:

        One word: Catch-22

      2. anon says:

        It’s Yossarian’s name, sir.

      3. Me says:

        It is the kind of name someone named Yossarian would have.

    3. milkshaken says:

      Quite right, the oxidation products typically absorb a lot more at short wavelengths. But it is good they are being extra cautious with that collection: there are quite a few not-so-stable compounds used as drugs – beta lactam antibiotics, indole compounds, even tertiary amines.

      Then there are compounds that cis/trans photoisomerize in solution (retinoids, Sutent-like compounds)

    4. A Nonny Mouse says:

      Yes, 210nm is a bit of a “catch all” but it can be very misleading.

      I had a company ask me to remake a compound that had “gone off”. I resupplied it (together with a CoA which stated the purity). It was supplied and rejected by them. It turns out that they were only doing 210nm for purity while the compound in question had no absorption at that wavelength (the purity was stated at 260nm on the CoA…). I don’t know how many of these compounds would be similarly affected by this.

  12. Papa Francesco says:

    Molecules are like people, they break down over time and slowly degrade and oxidize. They even get freezer burn and respond to love, care and respect.

    Love all molecules, even the degraded ones, as we slowly oxidize…

  13. Scott says:

    Sounds like an incredibly useful library, but curating it sounds like an absolute nightmare.

    What was that Big Data study mentioned here a few years ago where the Big Data guys didn’t even bother to verify their starting “known drug-like compounds” (and I called them on it, despite not being a chemist)?

    I’d really be more interested in the various compounds that failed at some point or another and what was done to see WHY they failed (and what they actually do). Root Causes are far more interesting and important than ‘Oh, this didn’t do what we wanted it to’.

  14. Chrispy says:

    Has anyone ever used the Selleckchem libraries?
    1430 FDA-approved drugs, ready-to-go in 96 well plates. Seems like an interesting set (and no, I don’t work for them!).

    1. Wes says:

      We have used their “bioactive compound library” with some success… it seems to be rich in kinase inhibitors…

  15. Tellussomethingwedontknow says:

    How many of the duds came from China or India?

  16. Annoned says:

    By HPLC/UPLC-MS do you mean RP gradient chromatography and ESI mass spec?
    Does everything separate by RP chromatography? How many components of the sample come off in the void or bond to the column? Does everything ionize by ESI?
    Also the response from ELSDs is nonlinear and can differ greatly depending on the compound and conditions.
    As a first screen HPLC/UPLC-MS is better than nothing, but it is far from perfect.

    1. Professor Electron says:

      From my experience, about 10% of drug-like compounds are problematic with UPLC/MS (pos/neg ESI) but only a very few (maybe 1%) don’t appear at all. The main problems are adduct formation or fragmentation in the ESI source itself – losing water, gaining water, hydrolyzing… My favorite is a well-known fungicide which gives m/z=(M+H-Cl)H+ in ESI. What do others think?
