Clinical Trials

Why Are Clinical Trials So Complicated?

Since we’ve been dealing with a lot of clinical trial data around here, I thought I would field a question from my email that might get to some things that others are thinking about as well. Here (with permission of its sender) is the idea:

. . .it does seem, from the outside, like most of the time and expense of clinical trials really is just bureaucracy. I can easily think of several compounds (some FDA-approved but no longer being made, some approved outside the US, some naturally occurring and sold as supplements) that I could literally run a trial on from my apartment, if the only question were “does this actually work?”. The synthesis is easy enough to do in your kitchen, one or two steps with basic equipment and common reagents. My friends have LC/MS and NMR machines available, or, if that didn’t work, there are analytics labs that I’ve sent samples to for a few hundred bucks. I have an AdWords account that I could use for recruiting, or, heck, there’s 80,000 people on /r/researchchemicals who take experimental compounds just for fun.

Making placebos is easy, Gwern even has a protocol for doing a placebo-controlled trial on yourself, although even that wouldn’t be necessary if you mail people the compound (or placebo) and do the randomization at home. And there are already studies that test drug effects, collect information from subjects, etc. entirely online, with some pretty simple software (since psychedelics are still controlled, in this case the subjects are totally anonymous). Now, granted, doing it on this low of a budget cuts a bunch of corners. And I’m sure the data would be noisier than in a traditional Phase II trial, so you’d need a larger sample size. And lots of drugs are much harder to synthesize, drugs like Spinraza can’t be self-administered, etc., which adds extra complexity. But even so, this really doesn’t seem like it *ought* to require $100 million, a team of lawyers, a lobbying firm and special guanxi for every trial you do. Am I totally off base here?

OK, I can see that some of this query is coming out of the biohacking movement. And I can imagine that from that perspective, a drug-industry-style clinical trial does look like a needlessly complex, expensive tangle. Hear me out, though: there are reasons for doing it the way we do. Some of those reasons are better than others, I’m sure, but the good reasons are in fact really good. I’ll take these suggestions in the order they’re mentioned in the email, and let me say up front that I’m not trying to make fun of these ideas and I’m not trying to put down the idea of asking them – they’re valid questions and deserve good answers, and my thinking is that doing so out here on the blog might be of interest to others.

First off, the part about “the synthesis is easy enough to do in your kitchen”. There are indeed quite a few drugs where the active pharmaceutical ingredient could, in fact, be produced that way, for some values of the verb “produce”. But remdesivir (to pick one that’s in the spotlight) is not one of those, unfortunately. Here’s a look at its manufacturing, and while some of the difficulties come from having to make zillions of doses of the stuff, some of them scale right back down to that kitchen. How much drug are we looking at? The NIH/NIAID trial that is starting to report dosed 286 people with the drug, 200mg the first day and 100mg/day after that for up to ten days. Let’s call it an average of 7 days – over that number of patients you need just over 228 grams of remdesivir to meet the exact amount.
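For anyone who wants to check that arithmetic, here is a quick sketch in Python. The seven-day average is my own round-number assumption from the paragraph above, not a figure reported by the trial, and the function name is just for illustration.

```python
def total_drug_grams(patients, loading_mg, daily_mg, days):
    """Total drug needed: one loading dose on day 1, then a fixed daily dose."""
    per_patient_mg = loading_mg + daily_mg * (days - 1)
    return patients * per_patient_mg / 1000.0


# 286 treated patients, 200 mg on day 1, 100 mg/day after that,
# with an ASSUMED average of 7 days of dosing per patient.
assumed_average = total_drug_grams(286, 200, 100, 7)

# Upper bound: every patient gets the full ten days.
full_course = total_drug_grams(286, 200, 100, 10)

print(assumed_average, full_course)
```

The gap between the assumed average and the full ten-day course is a reminder that even the "exact amount" depends on a guess about how long patients stay on the drug.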

Even if this were a drug that you could make in your kitchen, though, you’d better not just make that amount and call it a day. There will be losses in formulation for starters and more losses right down the line. How much do you need? Well, there are entire companies that exist to help you calculate just how much “overage” you might need in your clinical trial – if you run out of drug, you’ve blown the whole trial, and if you make too much you’re wasting money. Just the existence of such a business niche is food for thought.

But if you look like you’re running out, can’t you just make more? That brings up another big problem: Good Laboratory Practices and the Current Good Manufacturing Practices. Those two are broadly similar, but have different requirements and are used for different purposes. GLP/CGMP does indeed add bureaucracy, because it adds a lot of standardized procedures and a ton of documentation. I’ve never worked under these standards (I do early stage research, not up next to the clinical batches), and I would not enjoy it. It would indeed make things move faster to ditch the stuff. But that’s a bad idea. You cannot wing it when you’re giving a drug to human beings. Outside of the sheer ethical problems of the “go on, take the drug, it’s probably good enough for the likes of you” aspect, there’s the need to make sure that everything is absolutely standardized so that your clinical data themselves have a better chance of meaning something. The amount of noise in human clinical data can defy belief if you haven’t seen it in person. Every part of the process that can be ironed down flat should be, because the parts you can’t iron down (like “does the darn stuff work in humans”) will completely eat your drug development project if you don’t. The idea of GMP is to eliminate all the risk and variability that can be eliminated by testing and standardization – what’s left, ideally, is the risk of the investigational drug itself and God knows that’s still enough.

I won’t go into the details of CGMP, but it’s safe to say that the requirements cannot be met in a kitchen. Impurity profiles, batch documentation, tracking of all ingredients and starting materials, batch-to-batch variability. . .even just setting the right standards for those things before you start measuring them for real is work. So even if your friends have NMR and LC/MS machines available (I have some friends like that, too, and I wouldn’t have it any other way), who keeps the LC/MS running? Who works out the analytical method and shows that it’s optimized for detecting the most likely impurities as well as the less likely ones that could be most worrisome? What standards do you run, and how often? Can you be sure that the substance you’re measuring didn’t get contaminated along the way? If there’s a difference in one batch, can you trace everything back to find out where this occurred and why, so it doesn’t happen again? And so on.

Keep in mind that you’re watching both the manufacturing of the drug and its formulation, and how exactly do you formulate it? If it’s a solid, how do you make sure that you’ve made the same solid form every time? Polymorphs can and do appear seemingly out of nowhere and have ruined development timelines again and again. How fine should your particle size be for the substance, and how do you make sure that it’s the same in every pill? (That can make a substantial difference). What “excipients” should be in the pill as well (other compounds to help disperse the active ingredient in the pill, help it break up and dissolve reproducibly in the gut, and so on)? Those can make a big difference, too, and they need to be the same every time or your data could be trashed. An i.v. formulation like Remdesivir is, as you noted, a whole other order of difficulty: it’s not just hard to self-dose, it’s very hard indeed to manufacture reproducibly under sterile conditions, in a formulation that you’re sure the drug is stable in over time, etc.

We can disagree at this point if you feel that CGMP (whose surface I have not penetrated very far) is just a way for the FDA to flex its power and turn itself into a bottleneck, and there are indeed people who think that – but to be honest, not many of those people have ever actually taken a drug into the clinic, which is also something to think about. I could imagine streamlining some of this stuff in an emergency, but I can’t imagine just ignoring it, because the underlying issues that the whole system is trying to address are very real. The idea is to protect patients and to avoid having the data from the trial getting hashed up, and those two often overlap.

OK, now if you have your drug, it’s time to recruit patients. Well, actually, it’s time to design a trial before you start recruiting, and that’s a whole field in itself. You’d figure that studying a single drug in an infectious disease setting might be one of the more simple trial designs, and that’s actually true, but that’s only simple on the relative scale. First off, what are your endpoints – what are you going to measure to see if the drug works? In the case of a coronavirus trial, do you want to measure viral load, lung function, overall severity of disease, days to discharge, days to death? Probably all of those, sure, but which one do you figure is the most important (primary) endpoint, the one the rest of your design decisions are aimed at answering?

Now, for those decisions. How many doses are you going to check? How do you know you’ve picked the most useful ones? What sorts of patients are you going to recruit? In this case, you have several known variables to consider (or to set aside, if you can make a case for that too): severity of disease, time since onset of symptoms, age, gender, body weight, other medications being taken, known co-morbidities such as hypertension, and so on. While you’re enrolling patients, you’ll need to control for and match up/balance as many of these as you can in each arm of the study. How many arms are you going to have? How many patients do you figure you’ll need for each one, in order for the results to be meaningful? Where will you find them? Who will examine them and collect all the initial data, and who will keep track of all the data collection along the way? Recruiting via AdWords or via the users of a subreddit would indeed be faster, but that unfortunately won’t cut it. Even a Phase I study (normal individuals, no disease, mostly checking blood levels and looking for any wildly overt safety signals) needs to pick its members carefully to make sure that the curtain can come up smoothly in Phase II. And since you’re talking about a “does this actually work” trial, that’s Phase II by definition.
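To give a feel for what "balancing the arms" involves, here is a minimal sketch of one common approach, permuted-block randomization within strata. This is my illustration and not the method of any particular trial: the strata, arm names, block size, and function name are all hypothetical, and a real study would use a validated system with concealed allocation rather than a script like this.

```python
import random
from collections import defaultdict


def make_allocator(arms=("treatment", "control"), block_size=4, seed=0):
    """Return assign(stratum), which draws from shuffled blocks per stratum."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    pending = defaultdict(list)  # stratum -> assignments left in current block

    def assign(stratum):
        if not pending[stratum]:
            # Start a new block holding each arm equally often, in random order.
            block = list(arms) * (block_size // len(arms))
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return assign


# Hypothetical strata: (disease severity, age band).
assign = make_allocator(seed=42)
patients = [("severe", "65+")] * 8 + [("moderate", "<65")] * 8
counts = defaultdict(lambda: defaultdict(int))
for stratum in patients:
    counts[stratum][assign(stratum)] += 1

for stratum, by_arm in counts.items():
    print(stratum, dict(by_arm))
```

Because every completed block contains each arm equally often, the arms stay balanced inside each stratum. Note that each added stratifying variable multiplies the number of strata, which is one reason you can only balance so many factors with a few hundred patients.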

The NIAID study ended up recruiting (and dosing, and monitoring) at dozens of clinical centers on several continents just to enroll a properly matched set of 286 treatment patients and 286 control patients, or as properly matched as they could get them under the current conditions. The amount of noise in human clinical data can defy belief if you haven’t seen it in person, and the last thing you want to do is go to the time, trouble, and expense of experimenting on human beings but not generating useful data at the end of it. Note that the data coming out of the trial are already being argued about, because the mortality data looked like they might be real, but didn’t reach the usual statistical bar that we use. Perhaps the trial should have been larger, more complex, more expensive – dosing hundreds of people around the world just wasn’t enough, in the end?

Another factor: you also have to have people in all these locations whose job it is to dose the patients, unfortunately. Mailing participants the drug and asking them to take it will add a lot of unnecessary scatter to the data, because some people will forget to take it. And some of those will try taking double the amount the next day, and some of them won’t be sure if they did the dose yesterday or not. Some of them will have eaten, and some won’t, and some will suddenly remember as they’re going to bed. The more people that are involved in the trial, the more of this you’ll get, and it will likely run right alongside any advantage of that larger sample size and cancel it right out. I’m not making this up: ask any physician to speak candidly about patient compliance and get ready for an earful. Even things like taking a pill at breakfast with orange juice versus grapefruit juice can make a gigantic difference (which is something that for years wasn’t even realized). But we’re talking Phase II, though, so these patients might be pretty sick, or even full-time hospital patients (as is the case under coronavirus conditions). So you’re automatically talking doctors and nurses, etc, and you have to figure out where you will find such people who are willing to add the burden of assisting your clinical trial to their daily duties. What is the distribution of available medical help compared to the distribution of enrollable patients? Who keeps track of all these folks and makes sure that their record keeping is what it should be?

You also need, for both ethical and practical reasons, another set of people who are watching over the whole thing. These folks, the monitoring committee, are the answer to the ancient question of quis custodiet ipsos custodes, and they will be off to the side of the whole affair, empowered to look at the data being generated at set points during the trial to make sure that things aren’t going off the rails. Most of the time they will say “carry on”, but a reasonable number of trials end because they have said “You have no chance of meeting your endpoints – stop right now, it’s futile and it’s unethical to keep giving people your drug”. Once in a while, they’ll even say “Stop right now – your drug is so damn good that it’s unethical not to give it to everyone in the trial”, but years go by when that does not happen, anywhere in the industry. Either way, you need some outside experts who are not involved in the dosing and day-to-day parts of the trial, and who are not part of a group that has a vested interest in the results. These are the people who got a top Pfizer executive out of the shower on a Sunday morning one time to tell him that a review of the data showed that the company’s biggest drug development project ever (a cardiovascular program to raise HDL) was definitely (and most unexpectedly) killing off the treatment group at a slightly higher rate than the controls in a multi-thousand-patient Phase III trial, which is the sort of thing you want to know about at the first opportunity.

The big overarching difficulty with running a human trial the way you’re proposing to, though, is that you would open yourself up to criminal prosecution by doing so. The FDA wants to hear about your drug before you give it to human beings: what it is, what you expect it to do and why, the details (GMP!) of how you make it and how you’re sure that you’ve made what you think you’ve made, what your trial design is and why you chose that, and what you’re going to look for as it goes on. This is an IND, an Investigational New Drug application, and it is a pain in the rear. Under the pandemic conditions, the agency is indeed speeding that sort of thing up and cutting corners, but they’re not just dropping it, either, for the same reasons that they’re not just dropping CGMP. Again, we can argue about what the appropriate background work is to start dosing human subjects with an investigational drug and what the penalties should be if you decide to ignore them – but those are details and degrees. I would very strongly stick to my position that you need some sort of written statement, plan, rationale, and documentation that you’re going to do this right, though, and that it should be reviewed by an outside team that knows the business as well but isn’t at the same time in the business. And that’s the FDA.

You mention that under your proposal “the data would be noisier than in a traditional Phase II trial, so you’d need a larger sample size”, and I definitely agree about the first part. But as mentioned above, going to a larger sample size under these conditions runs a large risk of just generating ever more and louder noise. The amount of noise in human clinical data can defy belief if you haven’t seen it in person. Even under all the constraints and controls that I’ve been describing here, 90% of our drugs fail in the clinic: can you imagine the failure rate if we tried to go faster and noisier? Even controlling for everything we can imagine, we barely get anything through.
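A rough way to see how expensive noise is: the textbook normal-approximation formula for comparing two means puts the patients needed per arm at about 2 × ((z for alpha + z for power) × SD / effect)². The sketch below uses that formula with entirely made-up numbers (an effect of 1.0 unit, a standard deviation of 2.0), and it treats missed doses as simply shrinking the average effect, which is a deliberate oversimplification on my part.

```python
import math
from statistics import NormalDist


def n_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-sided comparison of two means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# All values below are hypothetical, chosen only to show the scaling.
baseline = n_per_arm(effect=1.0, sd=2.0)
noisier = n_per_arm(effect=1.0, sd=4.0)   # twice the scatter
diluted = n_per_arm(effect=0.7, sd=4.0)   # and the average effect cut by 30%

print(baseline, noisier, diluted)
```

Doubling the scatter roughly quadruples the enrollment you need, and a diluted effect pushes it up again. And that calculation assumes the extra patients are no noisier than the first ones, which is exactly what a mail-it-out trial can't promise.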

Well, I knew this was going to be a lengthy answer, and I hope it didn’t come across too ranty. The procedures that have grown up around drug development are a sort of Chesterton’s Fence: we shouldn’t pull them down without considering the reasons why they were put up. I do understand the argument that human lives are at stake when people are dying from a pandemic. But lives are at stake when you test human beings for their response to a new drug, too, and all of the biggest rules and controls mentioned above are, in fact, built on the hospital beds (and the graves) of patients who were tested in more sloppy ways and for less well-worked-out reasons.

112 comments on “Why Are Clinical Trials So Complicated?”

  1. Todd says:

    I work under cGLP/GMP, and the rules required for them can be picayune and bizarre, even to your typical academic scientist who doesn’t work under those guidelines. However, the best way to describe it is as a return address for everything you do. For everything that can possibly go wrong (and Lord knows, that’s a voluminous and unceasing list), you need to know Why it happened, and explain it in a way that makes the regulators and your bosses happy.

    Even the biohacking movement has *some* regulatory guidelines, even if they’re unseen by end users. Most of those compounds are made under USP guidelines, most of which can be patched into GLP/GMP without too much of an issue. If you do it in your kitchen, you’re one bad dishwashing mistake away from a lawsuit that would put you out of business.

    1. CMCGuy says:

      Unfortunately I have seen some “picayune and bizarre” cGMP interpretations that are promoted as the Gold Standard and thus get followed without true examination of the value they add (or take away) from the process.

    2. milkshake says:

      when you work in the process development at a small biotech, you learn about some incredible shenanigans that happened even with the FDA regulatory oversight. Similar things go on with the API generic manufacturers – who would rather “improve by hand” their quality control records and ship out a bad batch and run the plant floor filthy. Imagine what it would be like if they were not inspected by FDA and bear the risk of de-certification of their plant.

      GMP manufacturing rituals could use some streamlining. The same probably goes for clinical trials (the main beneficiaries of a very expensive system are the “not-for-profit” hospitals running the trials) but the system is generally geared to minimize the patient abuse by unscrupulous biopharma execs.

  2. Tony says:

    Reading this makes me SO glad I went into IT instead.

    1. Anon says:

      cGxP applies to instrument and IT systems as well for many the same reasons. Mountains of data will be collected at each of the steps, and systems that generate, retain and report data are formally validated as being fit for purpose, data integrity, etc. Google ALCOA Principles as an FYI.

      Someday all this may be on the cloud as a service, not quite there yet, so data is not something to be farmed out to your nephew who happens to be good at computers.

    2. Adrian Bunk says:

      Depends on where in IT you are working.

      When the lives of people depend on your software you have a similar circus with functional safety. This is the same mixture of being an absolute pain and being necessary for good reasons while many details of the requirements are debatable.

      As an example, when you are flying in a modern plane the pilots no longer control the plane mechanically – most of that is done by software. When this software crashes, there is a plane full of people that will also crash.

      1. OnboardG1 says:

        Yeah indeed, I’ve worked with DO254 (aviation electronics hardware safety) before and it’s a bit of a chore (particularly when you’re also under ITAR and standard security requirements). I’m glad it exists every time I fly but it takes a certain type of mindset to work well under those restrictions.

  3. Sok Puppette says:

    So, if I haven’t seen it in person, then can the amount of noise in human clinical data defy belief?

    1. WishIcouldwrite says:

      Please do not tease the writer.

    2. tlp says:

      Yes, that’s the take home message.

    3. John Wayne says:

      He said it three times, just like God.

  4. Barry says:

    We shouldn’t let the shop that ground a mirror certify that it’s best for the Hubble.
    We shouldn’t let the manufacturer certify that their jetliner is airworthy.
    We shouldn’t let the Drug Company who has an obvious financial interest in the outcome run Phase III clinical studies on its own product.

    1. Pedwards says:

      The company can run whatever they want and call it a phase 3, but a “successful” phase 3 doesn’t mean that you’ll get approval for your drug. It still has to go through the FDA approval process.

      Yes, the company can improperly influence the FDA into approving something that probably shouldn’t be approved, but that’s a different conversation.

    2. matt says:

      Disagree. The beauty of following the rules in prospective, randomized, controlled, double-blinded clinical trials is the answer doesn’t change for you, no matter how much you want it to. All the FDA has to do is make sure the process was followed, including making sure there was enough sufficiently independent oversight.

      The alternative, turning clinical trials over to some other group, misaligns incentives. No other group will be as motivated to do a trial which determines safety and efficacy with minimal wasted resources including time. The company is motivated because it’s necessary to get the treatment approved. The company is paying for this in the hopes of winning the prize of selling the treatment, which is the beauty of capitalism: it pulls on the rope of human desire to have nice things (like food on the table or a private island), rather than pushing, as government action so often does to little effect.

      If you think all Phase III’s are corrupted by being sponsored by a pharmaceutical company, explain why more than half fail. And most of the rest just barely squeak by. Or explain AD trials, where all have failed, despite a $20 billion/year prize if companies can get an approval. And to my knowledge, it’s pretty rare that Ph3 trials are invalidated (meaning proven to be wrong about efficacy/safety, not just testing an unrepresentative population or insufficiently powered, etc). None of these pieces of information are consistent with trials manipulated by a sponsor.

  5. Nesprin says:

    Our current regulatory apparatus is akin to that old joke about democracy: it’s the worst possible form, except for all the other options.

  6. Shazbot says:

    Thank you for your continued excellent content in the middle of this time where it is of such importance.

  7. intercostal says:

    From the original e-mail
    >>heck, there’s 80,000 people on /r/researchchemicals who take experimental compounds just for fun

    Wait, WHAT???

    Is this even legal?

    1. SirWired says:

      As long as you aren’t messing with controlled substances, the government is quite uninterested in all the different chemicals you choose to give to yourself or why you are doing it. (Okay, they’d prefer you not turn your kitchen into a HazMat site, but that’s not something the FDA cares anything about.)

      They only become interested once you start providing them to others…

      1. Pedwards says:

        It’s not illegal to win a Darwin Award. It’s just illegal to drag others with you.

        (Which, if I recall correctly, would invalidate your candidacy for the Darwin Award anyways)

      2. intercostal says:

        I was thinking about illegal-drug laws (in the recreational, not medical, sense) not the FDA so much.

        But I guess if it’s not a listed compound…

        1. eub says:

          The 1986 Federal Analogue Act as written seems to cover a whole swathe of these ‘research chemicals’ by reference to other substances that are controlled, but as enforced on the ground it has not been too universal.

  8. J. Severs says:

    Well done and well described. I would have added more details re statistical analyses (especially multiple endpoints) and ongoing safety compilations, but that is only because that is where I worked. Very nice.

  9. Stat Reck says:

    Any drug developer who’s happy doing their chemistry in a kitchen is probably happy managing their sites from their car and doing their data analysis in a tavern.

    Barry says:
    We shouldn’t let the Drug Company who has an obvious financial interest in the outcome run Phase III clinical studies on its own product.

    This is just plain wrong, for the precise reason you object to it – finances. Industry does a far better job, on average, of running trials than government or academia because we devote much greater resources, per data point, than they do. As noted, it’s not perfect but that’s why we have multiple layers of review.

    You want clinical trials controlled by the same scientific politics that are perpetuating the ongoing replication crisis in basic biology?

    1. Barry says:

      Oh, Boeing has its very ongoing existence invested in proving that the 737MAX is airworthy. That’s not a sound reason to let Boeing certify its airworthiness.

      1. Stat Reck says:

        Your proposal is not equivalent to saying Boeing shouldn’t be able to certify its planes. It is equivalent to saying they should not be allowed to test them.

  10. myst_05 says:

    “This is an IND, an Investigational New Drug application, and it is a pain in the rear”

    But isn’t this the main complaint? Bureaucracy is rarely a problem if it operates swiftly – e.g. few people complain about using their biometric passport to enter the country via automated passport gates. It’s bureaucracy, but it’s fast and painless.

    Could we radically speed up drug development if we increased the FDA budget and mandated lightning speed SLAs for all their approvals?

    Also, wouldn’t we speed up many trials if we dropped the idea of “do no harm”? This is especially poignant during the COVID crisis – it seems that regulatory authorities refuse to administer human challenge trials even in the midst of a pandemic that could realistically kill 50 million people around the world if left unchecked.

    1. Derek Lowe says:

      Challenge trials already happen with less severe diseases, and they’re being seriously discussed for the coronavirus. But given that most drugs fail, dropping “first do no harm” in general might be roughly equivalent to “get the killing started sooner”.

      1. Stat Reck says:

        Doing NO harm has never been relevant, just ask a surgeon.

        The crux of drug development is figuring out, in a specific context, how to maximize the expected benefit given the associated risk.

        And by “specific context” I mean a doctor or two and a patient (hopefully with the support of a loved one) sitting in a room some time in the future using the information you are planning on collecting now to decide whether your regimen is the right one to try next.

    2. SirWired says:

      Putting together an IND package is a pain in the rear not because the FDA is being obtuse, but because the questions that need to be answered are tough questions; tough questions that are being asked for good reasons. (As Derek mentions at the end, those questions stem from the hospital beds and graves of the patients that didn’t have the benefit of correct answers to those questions.)

      And a lot of corners *are* being cut for the current pandemic. (How else do you think vaccines are already in clinical trials for a disease that didn’t even exist six months ago? This is a process that would normally still be handled by mice, pigs, monkeys, etc.) And short-circuiting the trial process is how conditional approvals have been granted, also in record time, with standards far lower than we’d normally accept.

      1. Heteromeles says:

        Yes, you beat me to the conditional acceptance corner-cutting. Hydroxychloroquine demonstrated the problems with that quite ably.

        While it’s rightly not addressed in this post, we’ve got another example of the hazards of corner-cutting going on with all the Covid19 serology tests being commercialized without their false positive and false negative rates being known. I suspect the noise in the resulting test data is going to return to bite the US population rather hard in coming months, if we have problematic testing, inadequate contact tracing, and strong political pressure to get life back to normal.

        Great article though. Thanks to Derek and the person who thought to ask it!

  11. James d says:

    Fantastic post, thanks.
    Also then after running the gauntlet of a complete sequence of development, we still don’t always know all we need to about the drug…
    Think it also ties to effect size. Most drugs aren’t transforming people’s health, rather are having incremental effects in the right direction. You want Everything to be as same as you possibly can (while knowing it can’t truly be) bar the treatment.

  12. Eugene says:

    My understanding from here and other places is there is a history behind this that goes back to the 19th century and well into the 20th. Patent medicine, Adulterated food, The Poison Squad, Cocaine, Heroin, Opium, Nazi human medical experimentation, Thalidomide are just a few of the highlights. Experimentation on humans is not something that is done casually.

    1. Derek Lowe says:

      Yeah, I didn’t bring those out because immediately going to thalidomide and Dr. Mengele isn’t a fair way of arguing, but. . .

    2. eyesoars says:

      Don’t forget fen-phen, the tryptophan disaster, or vaccine-induced polio and SV-40 infections. Or that Celebrex/Bextra/Vioxx were pulled from the market after approval.

      Some of these weren’t under FDA purview, but there are plenty of traps here, and like aviation rules, most of them are there because they weren’t followed and people died as a result.

  13. And for the clinical end of the development, studies are run where the investigators sign and date some documents “month / day / year” and other documents “day / month / year” and it is a minor infraction if you do the opposite.

    Go figure.


    1. eub says:

      Oh for fuck’s sake ISO 8601 or go home people.

      1. Druid says:

        Tx. I did not know ISO8601. Makes a lot of sense.

        1. Oldfort says:

          Yes, mm-dd-yy is bizarre, like fahrenheit.

          BTW Derek, thank you so much for remaining a beacon of light in these times.

      2. Stat Reck says:

        My QA department won’t accept 8601 format for dates associated with signatures. Argh….it’s unambiguous and sorts correctly in character format. What else do you want?

  14. Kaleberg says:

    I did some programming for aviation. There was a lot of “scar tissue”. “What’s that code about?” “There was a crash in ’93. Pilots should always use full power on that runway.”

    It’s nothing compared to the pharma business. That little reference to unexpected polymorphisms hides a real horror story.

    1. Derek Lowe says:

      “Scar tissue” is a good way of putting it – and a lot of aviation practice is built up in just that way, isn’t it?

      1. Some idiot says:

        There’s the old story about a new mechanic that had just joined the fledgling airline QANTAS, which became Australia’s national airline (stands for Queensland And Northern Territory Aerial Services, which is why there is no “U” after the “Q”). Has (or at least had) a really excellent safety record.
        The story is set back when planes were single (piston) engine planes.

        The junior mechanic had just done a full service on the engine, and then reported back to the chief mechanic that he was finished. “Ok then, take it for a quick flight.” “What?” “I said take it for a quick flight. The rule around here is if you work on it, then you are the first one to fly it.”

        A really good safety record…! Having skin in the game helps… 🙂

        1. Eugene says:

          Worked with a Mechanical Engineer some years back and he related his experience at Disney’s Design studio where ride design took place. After an Engineer finished a design it was prototyped on a test track and the tradition was that the design Engineer would take the first ride!

          1. Jim Hartley says:

            Nuclear submarines too. The captain who supervises the overhaul/refit gets to take it out on sea trials. Or so I have been told by a former captain.

          2. Fortescue Bullrout says:

            An ancient and excellent tradition. When Roman engineers built viaducts and aqueducts they supervised the dismantling of the formwork while standing under the arch.

        2. Jamil says:

          I’m sure there was a time when early discovery medicinal chemists had to taste their creations. Back in the day that small molecules with big effect sizes were still being discovered.

    2. loupgarous says:

      One of the classic essays on the importance of exhaustively documenting how safety-critical computer programs actually work is Peter Amey’s 2001 paper “Logic Versus Magic in Critical Systems”, which touches on how object-oriented programming can too easily take correct functioning of programming objects for granted, unless someone takes the time to verify that everything works as intended. He wrote too early to predict the fatal 2015 crash of an Airbus A400M transport aircraft, when its full authority digital engine control (FADEC) system couldn’t read engine sensors properly due to an accidental file-wipe and three of its four propeller engines remained in “Idle” mode during takeoff, but it was the sort of mishap Amey warned about.

  15. that guy says:

    I started as a process chemist less than 10 years ago. I’ve had to explain to friends and family that developing a drug, even putting aside the enormous scientific achievements required of the research side of the business, is much more than the chemical synthesis. This is a great explanation and I’ll start reviewing it before family reunions.

  16. Giannis says:

    Why use GMP practices instead of testing the final product? If the final batch is up to standard, why do we care how it was produced? Even the Zantac fiasco was not because of manufacturing defects, it was because the drug degraded that way. If good enough remdesivir can be produced without GMP then for whatever is holy please do. If GMP does make a difference, then follow it.

    1. Antoine Bas-de-Plafond says:

      — because you cannot inspect quality into a product. In other words, a 50-100 g sample is not representative of a 100 or 500 kg batch … or, you cannot test 100% of a batch.
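A rough way to see why a small sample says little about a large batch: treat the batch as a number of equal portions, a few of which are off-spec, and ask how likely a random sample is to contain at least one of them. The portion counts below are hypothetical, and real batches are not this tidy (contamination can be localized or below the detection limit), so this is only a sketch of the sampling argument.

```python
from math import comb

def p_detect(n_portions: int, n_bad: int, n_sampled: int) -> float:
    """Chance that at least one bad portion lands in the sample (no replacement)."""
    p_miss = comb(n_portions - n_bad, n_sampled) / comb(n_portions, n_sampled)
    return 1.0 - p_miss

# Hypothetical batch: 1000 equal portions, 10 of them off-spec.
for k in (1, 10, 100):
    print(k, round(p_detect(1000, 10, k), 3))
```

Only testing every portion drives the detection probability to 1, which is the “you cannot test 100% of a batch” problem.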

    2. Some idiot says:

      There are very many answers to your question, but I will just give two…
      1) You mention degradation, but that is quite often very sensitive to the purity (and particle size, morphology etc) of both the API and the excipients. GMP ensures that when the material is produced, it satisfies the criteria that will give a product with a predictable shelf life. Analysis on the final product is not necessarily sufficient, particularly when “almost invisible carry-through” from earlier steps is problematic.
      2) Toxicity. The question of toxicity is a complex one. Not only toxicity of the compound itself, but also of the impurities which are present in the API (and there will _always_ be impurities in the API). The toxicity (or, hopefully, lack thereof) will be “qualified” by tox testing in animals. This sets a minimum purity and maximum for certain impurities. Then, they are “qualified” again when first dosed in man (typically Phase 1). So typically, the bar for purity is set in early tox, then Phase 1. Problem is, this required purity cannot be guaranteed by a set of analyses on the final product (and deciding what analyses need to be done is a story or two in itself). New impurities can crop up under either the API or an existing impurity, giving misleading results. If one of these is toxic (or gives severe side effects), then you have problems. And the origin of the impurity may have been many steps back. Therefore the only way you can guarantee that your API is still “qualified” by the earlier studies is by following the analysis and paper trail all the way through the synthesis.

      1. Brian Lavey says:

        As a follow on- once a major pharma company used ethanol to clean out a reactor rather than the “standard” solvent that had been approved for GMP and tox qualified. Unfortunately, the next chemical step used methanesulfonic acid in the synthesis, which produced ethyl mesylate as an impurity. I “think” that the (very reactive/toxic) ethyl mesylate ended up in the API. It was a couple of years ago- but it is probably in the In the Pipeline files.

    3. Anon says:

      For a number of years FDA has been advocating Quality by Design which through firm understanding of the manufacturing process and monitoring key quality attributes can minimize the testing in of quality after the fact. Blue sky end objective would be real-time batch release.

      Generally we are not there yet for drug products manufactured for many years (let alone something totally novel) which is why extensive QC testing for batch release and years of follow-up stability testing is done under cGMP.

    4. Hap says:

      Because you’d like to know that the stuff you make in six months (or next month, or a year) still works and is identical to the stuff you tested to show it works. You’d also like to know if it stops working, why it stopped working, and how to fix it. Not knowing those things means that the test data you got doesn’t apply to what you’re now making, and you don’t know what you claim to know.

      People get cranky when their computers stop working, and that only prevents them from looking at the Internet (and sometimes working, but mostly looking at the Internet). When your drug can make people blind or deaf, or die, people get more protective. COVID-19 isn’t good, but it doesn’t have the death rates of things people already make drugs for and are expected to keep safe.

    5. Thomas says:

      Valsartan contamination was caused by a different production method (solvent, if I remember correctly).
      But I suppose there is more to it that requires traceability. How far does that traceability go? Across suppliers as well?

    6. CMCguy says:

      Giannis: Although it does involve degradation issues, I seem to recall the Zantac problems arose after a change in the manufacturing process, involving a switch that included use of DMF, which introduced trace impurities that react to produce the cancer-inducing impurities. The first GMP rule I learned is that you cannot test Quality into the product: while proper analytical work is critical, knowledge and control of the entire process are required to assure the product will be as safe as possible.

    7. eyesoars says:

      Derek pointed out polymorphs in this article (and others). That’s only one of the many reasons your suggestion is a very bad idea. If you want another, look up the tryptophan disaster, which killed three dozen young healthy people and crippled about 1500. As Derek points out, every one of those regulations was transcribed from gravestones.

    8. eub says:

      Testing for *what* is the problem, chemists tell me. You can’t just ask “is this 99.9999999% pure y/n”. You can have a rough test of what’s there, but as far as asking what’s *not* there, you need to know what questions to ask. You need to understand the failure modes of the production process.

  17. franko says:

    Can you comment on the study,

    Patterns of COVID-19 Mortality and Vitamin D: An Indonesian Study

    The paper is on SSRN


    The results seem very dramatic. Any medication that showed an effect like this would have people very excited indeed. Is there something wrong with the study? The number of subjects was 780.

    This was a retrospective study that compared the likelihood of death in three patient groups: normal Vitamin D level in the blood, insufficient Vitamin D level, or deficient Vitamin D level:

    Data taken from Table 1.

    Vitamin D level given as ng/ml; Patient death rate calculated as expired/total, %.

    Normal Vitamin D level (> 30 ng/ml); 4% died

    Insufficient Vitamin D level (21-29 ng/ml); 88% died

    Deficient Vitamin D level (< 20 ng/ml); 99% died

    1. Qetzal says:

      Even if Vit D deficiency really does correlate with increased risk of death from COVID, as the study claims, that doesn’t mean that giving Vit D itself will help. Patients with Vit D deficiency probably have other issues as well, eg from generally poor nutrition, etc, that could put them at higher risk of death from COVID. In such a case, giving Vit D wouldn’t help.

      Not saying it doesn’t deserve more study, just that going straight to the idea that Vit D itself would reduce deaths is a pretty big leap.
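The confounding argument can be made concrete with a small worked example. The counts below are entirely made up for illustration and are not taken from the Indonesian study or any other data: within each (hypothetical) frailty group, Vit D status makes no difference to the death rate, yet the pooled numbers show a large apparent effect because deficient patients are mostly in the frail group.

```python
# Made-up counts for illustration only: (patients, deaths) per group.
strata = {
    "frail":  {"deficient": (80, 40), "normal": (20, 10)},
    "robust": {"deficient": (20, 1),  "normal": (80, 4)},
}

def rate(patients: int, deaths: int) -> float:
    """Death rate as a fraction of patients."""
    return deaths / patients

# Within each stratum, vitamin D status makes no difference.
within = {
    name: {status: rate(*counts) for status, counts in groups.items()}
    for name, groups in strata.items()
}

# Pooling the strata creates an apparent vitamin D effect.
pooled = {}
for status in ("deficient", "normal"):
    patients = sum(groups[status][0] for groups in strata.values())
    deaths = sum(groups[status][1] for groups in strata.values())
    pooled[status] = rate(patients, deaths)

print(within)
print(pooled)
```

In this toy setup, supplementing Vit D would change nothing, because frailty (not the vitamin) drives the deaths. A retrospective study alone cannot tell this situation apart from one where the vitamin really matters.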

    2. Marko says:

      Vit. D supplementation for those patients who are deficient on admission would seem to be a no-brainer. Similarly a recommendation to the public for modest supplementation unless and until their vit D status has been determined.

      There’s already plenty of data suggesting a benefit in respiratory disease generally, in addition to the paper you cite.

      Maybe Gilead will run a big clinical trial, out of the goodness of their heart. Haha.

    3. loupgarous says:

      You have to work back from the adverse event involved (in this case, “death”) and consider all disease processes involved before you can say whether or not vitamin deficiency was a clinical marker or the actual cause of death. Vitamin D3 deficiency is associated with diabetes, which can cause all sorts of organ damage, including heart disease of various types. But organ damage can have a number of other causes, too.

      The problem with retrospective studies is that they can be highly suggestive of a link to a given disease process, but those trying to use them must consider confounding factors, not all of which might appear in the patient data (was the patient well-nourished? A smoker? A regular drinker of alcohol?).

      This study’s the kind of thing that might give an NIH center reason to commission more studies and either pay for a larger (thousands of patients) retrospective study, or enroll a number of patients into a study which compiled extensive data on each patient’s medical history, their diet, intake of medications and non-prescribed drugs, level of physical activity, et cetera.

      1. milkshake says:

        correlation is not causation, one has to look also for alternative explanations (what kind of patients get D-vitamin deficiency: are they elderly? Are they poor? do they have chronic conditions, or bad nutrition? Do they drink more, or smoke?). but D-vitamin supplementation trial would be one that is easy to run.

        1. loupgarous says:

          And we still might not be treating the main cause of the illness. Supplementation with D3’s a common adjunct to treatment for diabetes. The D3 helps the diabetic, who still must be treated for the main disease process, diabetes – for which the main measure of success is HbA1C – (a good measure of blood glucose levels over time). The D3 levels and therapy with D3 aren’t central to management of diabetes, regular control of blood glucose is.

          We agree that there are many possible causes of a particular illness, and seizing on a single treatment variable is unsound reasoning (that continues to complicate clinical trials today).

  18. Churlish says:

    Great summary Derek! I believe that thoughtful explanations like yours are really important to address huge misconceptions about therapy development that seem (to me) to be widely held in the general population:
    -If a potential therapy logically ‘should’ work, it will work
    -Drug companies already know everything about their potential therapies right at the start of the development process
    -The time and expense it takes to bring a potential therapy to market represent some combination of bureaucratic red tape, inertia, and greed

    As I assume your blog is getting more visits during the pandemic, your consistent education on these points is more important than ever.

  19. rhodium says:

    I started to read this but I got sidetracked imagining using TMSCN in my kitchen.

    1. Hap says:

      Using chemistry performed by people who don’t necessarily care about the consequences in their end users is likely not a good method by which to make drugs.

      No, you don’t use the porcelain tub for the HF waste.

  20. David says:

    You mention GMP and GLP, as I would expect a chemist to do (no offense). There’s also GCP (good clinical practice) which governs conduct of a clinical trial. Your post covers some of it, but there’s also regulatory oversight of trial conduct, institutional review boards (IRB also called ethics committees), site inspections, data collection standards that ensure that data can be verified by an outside reviewer, and not changed after the fact (databases are locked prior to conducting statistical analyses), built-in checks for subject compliance and tracking study drug consumption amounts to ensure that subjects actually took the drug as directed, safety reporting (most trials have an independent drug safety monitor to review expedited reports of serious adverse events), and on and on.

    There’s a lot of skepticism of phase III trials sponsored by pharmaceutical companies. In fact, they are run in a way that allows thorough outside audit of the results, which the FDA usually does before approval. There are many shady practices in the industry, but I don’t know of very many phase III trials that have been shown to be invalid, after drug approval (at least, in US by FDA). The way such trials are run, it’s very difficult to game the system. Ensuring that level of transparency and enabling verification takes effort, time, and money.

    1. Ian Malone says:

      Another thing GCP covers is ensuring that if somebody in a trial turns up at an A&E department (or something less dramatic, like becomes pregnant, and being unlikely to become pregnant is often a recruitment criterion for that reason) it is possible for their care team to break the blind and find out what they have been given. There are lots of little corners like this that require you to have planned ahead, and to have the right people on board and available with the right information accessible should it become necessary.

  21. loupgarous says:

    Thanks, Derek, for comprehensively answering “why can’t we ditch all the red tape in clinical trials?” I worked in analysis of the data from human trials (number crunching, not “why is this happening?”) for years with various product teams. I got to see a cross-section of the issues involved, from medical experts to tech writers, to statistics (I worked pretty closely with the stats people in most of my projects).

    I didn’t see anyone who was a “telephone sanitizer” (the useless third of the population Douglas Adams made immortal in The Hitchhiker’s Guide to the Galaxy) who could have been rounded up and sent off to another solar system with no loss to society. Everyone there was vital to the task of testing new drugs and being as sure as possible they worked (did what they were supposed to, and nothing else significant) and didn’t hurt the patients we tested them on.

    Back in the 1990s when we were doing this, someone I worked with at Lilly estimated getting a drug through clinical testing to NDA approval cost about $300,000/day. I don’t think that included the cost of building pilot plants to make enough study drug to test (at my group in Lilly, that involved a specialized kind of brewery to make huge cultures of recombinant E.coli which made insulin with the lysine and proline swapped around in one crucial point in its amino acid sequence, making specific changes to the non-recombinant human insulin produced), or the billable legal costs of shepherding the new drug through the maze at FDA.

    But all that money paid for things that needed to be done to reassure ourselves and FDA that we were making something that worked, was better than existing medications for the same disease, and didn’t hurt people who were taking it.

    Every medical problem seen in a patient was registered in the study as an adverse event. We left it to the physicians and scientists in the company and to FDA’s scientists to determine the significance and probable cause of each adverse event. That couldn’t be omitted without exposing patients to unknown risks to their health.

    Post-marketing pharmacovigilance is a free-for-all by comparison – I’ve only worked on relatively few pharma projects where firms paid for post-marketing safety analyses of their drugs (which was great fun, we got to flex our brains on doing data transformations in and out of safety databases). Adverse events have to be life-threatening to fatal before FDA gets involved even to the extent of a “black box warning” to prescribers and users, much less withdrawing a drug from the market once it gains New Drug Approval.

    1. loupgarous says:

      Erratum:“that involved a specialized kind of brewery to make huge cultures of recombinant E.coli which made insulin with the lysine and proline swapped around in one crucial point in its amino acid sequence, making specific changes to the recombinant human insulin produced)”

    2. eub says:

      And boom, nowadays we could really use some telephone sanitizers, of whatever telephones exist. I think hairdressers were another set sent off on the ship to nowhere, and boy I see some people missing their hairdressers. Wheels turn.

    3. A Nonny Mouse says:

      There is certainly extensive “Phase IV” work that goes on, ie, the constant monitoring of adverse events. Here in the UK, there is a direct link of these to a national database which is embedded in the software (which my wife has been doing for 25 years now….).

      1. loupgarous says:

        I’m happy to hear post-approval pharmacovigilance is happening reliably in the UK! Here, we have a voluntary registry of drug-associated AEs to which anyone can contribute. FDA estimates the system captures 10% of post-approval drug-related adverse events. We can do better. Unfortunately, it’ll take a nasty drug-related AE like thalidomide-caused malformations of thousands of children here to galvanize Congress into action.

  22. JRegs says:

    To clarify: Good Laboratory Practice is a regulation that applies to non-clinical (animal) safety testing used to support an IND. Good Clinical Practice covers clinical trials supporting an NDA. Current Good Manufacturing Practice covers manufacture of test items for GLP and GCP use, and commercial sale. /pedant

    They all require what laymen would think as extreme levels of traceability and validation of inputs, instruments and equipment, processes, and suppliers. The intent of this is to ensure the integrity and mutual regulatory acceptance of the supporting data, a consistent level of quality across lots of the same product, and the rights of the clinical and non-clinical test subjects based on lessons learned from atrocities in past medical research.

    Some folks in the UK came up with the idea of Good Clinical Laboratory Practice, which is not a regulation or standard, to cover the analysis of clinical samples in GCP studies, because the GCP regulation doesn’t say much about how this should be done.

    1. JRegs says:

      And to clarify again, clinical = in humans. Non- / pre-clinical = in animals.

  23. enl says:

    One aspect of my career is safety and compliance (think people falling off roofs or entering a confined space that has been argon purged) and the guide there is that for every rule that annoys you, someone died. Usually, several someones, actually.

    The kitchen-chemist developers miss the point that the same holds true in drug development.

    Thank you for making some of these points crystal clear. “there are entire companies that exist to help you calculate just how much “overage” you might need in your clinical trial” makes the point quite well. That large organizations go to outside consultants for this makes it dead clear that the devil is in the detail.

  24. Carl Pham says:

    I still think you might adduce the benefit to the future as to why we are sticklers in the present. Yes, people *right now* are dying of COVID, and any fiddle-faddle about documenting and being sure, and measure twice, cut once, means more of those people will die.

    But people will also get COVID in the future, and since it’s an infinite future, pretty much by definition there are far more COVID patients at risk of dying in the future than in the present. Whatever corner-cutting that we do today means we give the future less data, or less reliable data, on which to make *their* decisions — and that will cost lives, in the future. Many more lives than are sacrificed in the present getting it right.

    People make these decisions all the time, and they understand them. End-stage cancer patients go into a clinical trial even knowing they might get assigned a placebo, because they understand that if that happens, and if the drug works, meaning they die (sooner) even if they needn’t have (had they gotten the drug), their death has real meaning and value — it buys *information* for future patients. It can save somebody else, probably many thousands of somebody else.

    1. Jim Hartley says:

      Well said. Thank you for this insight.

    2. intercostal says:

      This makes a lot of sense, in general.

      Though it may not apply so well in the specific case of a pandemic. If COVID burns through the world this year and then drops off to low endemic levels for another decade, and then is eliminated like smallpox, or reduced to a very limited area of the world like polio, by vaccination, then most of the total COVID deaths may occur this year.

      1. Pedantic Speaker says:

        Unfortunately, due to other problems such as armed conflict, and due to anti-vaxxers even in relatively peaceful places such as America and Europe, it seems to be increasingly difficult to extirpate diseases.
        I remember a decade or more ago, when we were “very close” to eliminating Polio, and we are no closer now than we were then.
        Measles is, if anything, actually increasing its endemic prevalence. We have had a vaccine for it for *decades*, and it *still* put an entire nation into lockdown as recently as last year.

  25. Thomas Lumley says:

    I think it’s useful to divide the regulatory aspects into two components. Part of it is your licence to experiment on sick people, and part of it is your licence to advertise the results.

    [This is my field, too: I’m not a chemist, but I am a biostatistician, and I design clinical trials and sit on data monitoring committees, and teach people to do the design and analysis.]

    Society (in the US and elsewhere) has come up with fairly strict rules for who is allowed to experiment on humans, especially sick ones, starting with the Nuremberg Code. Part of this is getting the informed consent of the participants. Part of it is realising that fever and shortness of breath are not an aid to sober reflection on risks and benefits, so you want to avoid asking people to consent to anything that they wouldn’t or shouldn’t consent to when in their right minds. The history of clinical trials keeps showing that “Don’t be a dick” is actually not sufficient guidance for people doing medical research.

    So, you get regulations and bureaucracy about how you recruit people, what you say to them and don’t say to them. Someone else, not you, gets to decide whether your recruitment pitch qualifies as ‘well-founded enthusiasm’, ‘insufficient care to detail’, or outright lies. Someone else looks at your interim data and decides whether the trial should continue. Someone else is at least available for complaints from participants. And so on.

    The licence to advertise the results is what makes a blockbuster drug more profitable than a trademarked blend of snake oil and grass clippings. In theory (and to some real extent in practice) the FDA exists so that it is more profitable to make drugs that are safe and effective than drugs that aren’t. The nutritional supplement industry — and the history of medicine — shows that this doesn’t happen naturally.

    This is where all the GxP stuff comes in: you need to be sure that your results will allow you to advertise the results. If you’re just doing it in your kitchen you can maybe avoid this, but the ability to convince people of the results will have taken a massive hit. Derek knows a lot more than me about this aspect.

    The two components reconverge at this point. Experimenting on sick people is only ethical if there are benefits that are worth the risks. One of the big benefits in clinical trials is that future patients will get better treatment, and that’s an important reason people volunteer. If your results aren’t going to convince the medical community, the risk/benefit ratio of your trial just took a huge hit.

  26. John Harrold says:

    Is this guy going to send the serum samples back to his kitchen? Is he going to have his friend analyze those for the drug? What about active metabolites? Has this guy considered the impact of hepatic or renal impairment since some of these patients have both? These people are on all kinds of comeds, I’m sure he’s considered the potential for DDIs?

  27. drsnowboard says:

    It is interesting how ready some folk are to ingest new compounds / follow ‘broscience’ to achieve some perceived benefit. I’m thinking the bodybuilding / doping community and the psychedelic experimenters. You could argue some of that latter broscience has sponsored clinical research on microdosing and the former has at least pushed the envelope on the effects of high dose steroids and supplements. Humans are optimists.
    Thanks for the detailed explanation of getting a molecule into patients. For those who want the freedom to do it in their kitchen, I’d ask ‘would you eat food from takeaways / restaurants if we relied on photos of the final plate rather than food hygiene standards…?’

    And yes I have taken compounds I’ve been responsible for, way back in a clinical trial that wouldn’t be allowed now due to conflict of interest and more recently for early taste assessment. My risk assessment was based on the IB and IMPD which I helped collate, with all the quality controls mentioned above. I wouldn’t relax the regulations for a second.

  28. Steven says:

    From a statistical perspective the NIAID remdesivir trial was very small. Over 230,000 people have died of COVID-19 so far. Probably an order of magnitude more have been hospitalized. What percentage of those hospitalized people were in a meaningful clinical trial? I doubt it’s more than a couple of percentage points. Every hospitalized patient is a learning opportunity and we’re throwing most of those learning opportunities away.

    I realize there were logistical and perhaps supply issues, but most of the questions around the NIAID remdesivir study would have been answered had the trial been 10x larger. For instance, this trial showed a mortality rate of 8% in the remdesivir group, which had size 286. If you try to compute a 95% confidence interval, it gives you anything between 4.7% and 11.2%. Want 3 sigma for a 99.7% confidence interval instead? It’s anything from 3.2% to 12.8%. If you had the more sophisticated goal of not just determining whether the p-value is slightly smaller than 0.05, but gauging with reasonable error bars how much better the experimental arm is than the control, you’re completely out of luck.
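The interval arithmetic above looks like a normal-approximation (Wald) interval for a proportion, which is an assumption on my part about how the figures were obtained. A minimal sketch using the 8% mortality and group size of 286 quoted in the comment, at 2 and 3 sigma, and again for a hypothetical trial ten times larger:

```python
from math import sqrt

def wald_interval(p_hat: float, n: int, z: float) -> tuple[float, float]:
    """Normal-approximation interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    se = sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se

# 286 is the group size quoted above; 2860 is a hypothetical 10x larger arm.
for n in (286, 2860):
    for z in (2.0, 3.0):
        lo, hi = wald_interval(0.08, n, z)
        print(f"n={n}, z={z}: {lo:.1%} to {hi:.1%}")
```

This comes out close to the figures quoted. The interval width shrinks with the square root of the sample size, so a 10x larger arm narrows it by a factor of about 3.2. The Wald interval is the simplest choice, not the best one for small event counts; a Wilson or exact interval would be the more careful option.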

    My point is: With tiny trials it’s impossible to measure minority events like mortality rates well. A drug basically has to be a magic bullet that has a profound effect on mortality to even pop up above the statistical noise in a hypothesis test. If you tried to, say, run a political poll with sample sizes this small, the Nate Silvers of the world would deride it as useless, so why are we making some of the most important drug decisions of a generation this way?

    1. Charles H. says:

      NIAID remdesivir trial has not been completed, and I question whether the results so far revealed say very much in favor of the drug. They do seem to indicate that it isn’t killing people, but, IIUC, side effects are still not revealed.

      A prior study (cancelled before completion) seemed to indicate that remdesivir wasn’t useful. The NIAID study seems to indicate that it’s minimally useful. The death rate change was not statistically meaningful, and the hospitalization change was 3 days out of say 14 (that end point wasn’t stable, of course). And it wasn’t for a large number of patients. So weakly statistically significant, but at what cost we don’t know. (Side effects are still not revealed.)

      What it looks like is “probably better than nothing, but not by much, and at a cost”. Possibly this will change when the study is complete. I, personally, think that it was revealed as a political move, so that the government could be seen as doing something.

      1. milkshake says:

        you are probably right, there is lots of political pressure – but at least they are no longer pushing hydroxychloroquine. Plus this trial makes it easier to look at combination therapies. I am sure there will be many more after this.

        By the way, it is highly instructive to look at the history of anti-HIV medications. Zidovudine, the first approved AIDS medication, has on its own remarkably poor efficacy, and many unpleasant side effects.

  29. anon says:

    The unspoken bottom line is that clinical trials are hard because most of the drugs being tested don’t do much. If, as happens rarely, the effect size were large and consistent it would be easy to see. As it is most trials are trying to make a case that some small fraction of the patients lived an extra week or two and the insurance company should therefore pay the $80K the company wants to charge. That’s really hard.

    1. eyesoars says:

      And it’s this that is the dirty little secret of big pharma. Statistically significant is not the same as significant. For a drug that’s truly effective, small studies are often adequate, except for ruling in/out effectiveness for/against unusual outcomes.
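The gap between “statistically significant” and “significant” is easy to show numerically. The sketch below uses a standard pooled two-proportion z-test with entirely hypothetical event counts: a 0.2-point difference in event rate is nowhere near significant with 1,000 patients per arm, becomes highly significant with a million per arm, and a large effect shows up even with 100 per arm.

```python
from math import erfc, sqrt

def two_proportion_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (p1 - p2) / se
    # For a standard normal, 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2.0))

# Hypothetical counts (events, patients) for control vs treated arms.
tiny_effect_small_trial = two_proportion_p_value(100, 1_000, 98, 1_000)
tiny_effect_huge_trial = two_proportion_p_value(100_000, 1_000_000, 98_000, 1_000_000)
big_effect_small_trial = two_proportion_p_value(30, 100, 10, 100)

print(tiny_effect_small_trial, tiny_effect_huge_trial, big_effect_small_trial)
```

The p-value says how surprising the difference would be under no effect. It says nothing about whether a 0.2-point difference matters to a patient.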

  30. Another Guy says:

    Challenges to overcome with the “broscience” clinical trial model (and a suggestion to improve the system where the bros can help):
    1) Unless you are testing placebo vs placebo, you will need licensed physicians to administer investigational substances or devices. The FDA and other regulatory authorities will freak out if you try to bypass this.
    2) It is probably a good idea to have physicians involved in the study, as the investigational substances or device could cause harm to some patients. It is nice to have someone who knows what to do if that “rare” adverse effect happens (and it will, you just need more participants to see it).
    3) From a liability point of view, it is good to have physicians and hospitals involved since they have malpractice insurance, and we can demonstrate we did all we could to make the study as safe as possible. Unavoidably, we get along with that some lengthy negotiations about whose contract template to use, oh, and the hospital lawyer is on sick leave, etc, etc.
    4) The physicians and hospitals know there is money to be made once the new drug is approved, so they want overhead added to the cost of doing the study at their site. 30% to 40% is not uncommon. They have big institutions to heat and air condition, you know.
    5) Human nature: everyone wants a piece of the action.
    6) The reason we have regulatory systems is to avoid repeating the bad-old days (I think thalidomide was mentioned earlier).
    My two-cents: I like the virtual clinical trials ideas proposed in this post, we could still have physicians in the loop monitoring patients virtually and skip the clinical trial toll-booth at the hospital sites.

  31. Doc Harry says:

    As a biotech consultant I have seen lots of early-stage biotech company CEOs (usually academics) whose knowledge of the industry and path to first-in-man is on the same level as this biohack bro. Your explanation can prove out to be very useful with those boneheads…thank you!

  32. JP Leonard says:

    @Another Guy. I’m with you on the same track that MDs can help out by doing trials. Physicians don’t sell meds, but they do prescribe treatments and sign off on the success or failure, along with the patient’s opinion.
    It seems to be half-forgotten that the spectrum of useful evidence is not black and white, with gold standard on one end and anecdotal on the other, and nothing in between.
    There are valid trial models like small crossover trials and even single subject trials, where the patient is their own control and you test different interventions at different times. Short longitudinal trials would be perfect for this pandemic: A. for figuring out what works against Sars-Cov-2, and B. for sharing those results in a more convincing, quantitative format that would foster wider adoption quickly enough to make a difference in real time.
    So we don’t have to suffer these awful trade-offs like patients must die because we will never have clinical trials, or the economy must shipwreck because the only therapy we have is social distancing.
    I have set up a project at // inviting doctors who have successes in treating Covid-19 (or who would like to try promising treatments) to run small crossover trials (patient takes active agent for a few days and placebo or vitamin for a few days, then back on active agent for a few days.) They can then quantify their results so they can be “empirical and verifiable” and believable. Cit. Wikipedia, “Some anecdotal evidence can be both empirical and verifiable, e.g. in the use of case studies in medicine.”
    See also “Reversal or A-B-A. The reversal design is the most powerful of the single-subject research designs showing a strong reversal from baseline (“A”) to treatment (“B”) and back again. If the variable returns to baseline measure without a treatment then resumes its effects when reapplied, the researcher can have greater confidence in the efficacy of that treatment.”
    Makes sense, doesn’t it?

    1. drsnowboard says:

      to be honest, no.
      The plural of anecdotes is not data.
      Compliance for ‘I’m taking a treatment and I’m getting better so I’m going to just take some supplements for a few days’ for a potentially life threatening disease? Scale of assessment?

  33. JP Leonard says:

    drsnowboard, Quick but not accurate. You missed some points.
    Even a single subject study can produce valid data. Having more subjects is better of course. But the real problem with anecdotes is when they aren’t verifiable. The reports have not shared their data, they’ve been more like declarations.
    Also it’s not about supplements or self assessment, it’s about stronger meds and observation under a physician’s care.
    For assessment, nurse or doctor measures symptoms daily. 5-point scales of degree of severity have been shown to be valid. If the effect is there it will show up.

    BTW a note for Derek,
    I know this isn’t on your main topic of drug development, so to try and cheer you up, for just once I’d like to offer an idea along those lines, altho you might need to stretch your imagination a bit.
    What if the war on Covid-19 actually spurred research to develop a cure for the common cold, that’s eluded us all these years?
    It’s been long known that zinc lozenges can reduce cold symptoms. What if you could develop an OTC zinc ionophore to put in them, that would give much greater cold relief? After all, this is based on known research.
    And if it works, there’s probably way more money to be made in it than in drugs for Covid-19. Just a thought.

    1. drsnowboard says:

      Of course, the single subject data is completely applicable to the single subject.
      What distinguishes the medic-verified ‘patient x had a 4 point score reduction in their symptoms for the 2 days they were on med vs the 2 days they were on a meds break’ from ‘my grandmother feels better when she snorts meth outside on a sunny day for her hayfever’, verified by your grandmother?

      1. heteromeles says:

        I agree with you about the problem with an ABA approach, but I’ll put it more verbosely:

        What we have with Covid19 is a disease where we do know that a) most people get better eventually, b) there is an enormous diversity of paths to “getting eventually better,” some of which involve things like limb amputations or dealing with strokes or heart attacks, most of which do not, and c) progression of the disease from mild to serious can be fast in nonlinear and unpredictable ways.

        Onto this polymorphic perversity of possible disease processes and progressions, the ABA protocol introduces a pattern of “try a drug for a day, then don’t, and see if it makes a difference.”

        How in the name of the Flying Spaghetti Monster are we supposed to be able to see the drug signal from the noise? We know that, whether the patient is under treatment or under placebo, every day that patient will be having a different experience due to Covid19. Unless a single day treatment has miraculous curative powers, or (more likely) puts the patient in the ICU or the mortuary, the effect of the single day’s treatment can’t be distinguished from the unpredictable course of the disease in the patient.

        1. JP Leonard says:

          Heteromeles, If you looked at my project, you’d not find any mention of a drug a day. We’re talking about 3 to 5 day periods. And in more serious cases, active agent first. Maybe Active-Placebo-Agent for everybody, patient care comes first. Also consider we are adding on an active agent that almost nobody is getting, so they’re not losing out.
          You do make a point about signal to noise ratio. If the signal isn’t strong enough to stand out from the noise, then it can’t be doing much. All I’m saying is let’s measure it and find out.

          1. heteromeles says:

            1 day or 3-5 days doesn’t particularly matter in this case. When current warnings tell us to worry about what happens on anything between Day 5 and Day 14 of an infection, because somewhere in that time is when the pneumonia can ramp up rapidly, a 3-5 day course of medication won’t do much. For example: you give the medicine to someone on days 5-8 of infection, and they don’t worsen. Was it due to the drug, or were they simply in the 90%(?) who wouldn’t have gotten worse in any case? And then you have the problem if some of the people you gave the drug to on days 5-8 (say, 11%) did end up in the hospital. Did your treatment make them sicker? Or was the treatment useless? Or did it keep them from dying on the way to the hospital? Hard to tell, when most people naturally get better. Same story with those who suffer major injuries or die. Getting enough patients to take the drug on days 5-8 to determine that it’s actually having a measurable effect is non-trivial.

            And this is an example. You could give treatment on days 1-5, or days 10-15 and have similar problems. It’s similar to the problems with a traditional multi-armed study, but there is, I think, a lot less statistical pain in dealing with matched groups of control plus treatment, rather than trying to do normal (or even non-parametric) analyses on a disease with so many possible outcomes.
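The worry about patients who "wouldn’t have gotten worse in any case" can be made concrete with a small binomial calculation. This is only a sketch: the 10% worsening rate is taken from the commenter’s own guessed "90%(?)" figure and is an assumption, and `prob_at_most` is a hypothetical helper name.

```python
from math import comb

def prob_at_most(k, n, p_worsen):
    """Probability that at most k of n patients worsen when each worsens
    independently with probability p_worsen (exact binomial sum)."""
    return sum(
        comb(n, i) * p_worsen**i * (1 - p_worsen) ** (n - i)
        for i in range(k + 1)
    )

P_WORSEN = 0.10  # assumed natural worsening rate, drug or no drug

# Chance that NOBODY worsens in an uncontrolled series on a useless drug.
for n in (10, 20, 50):
    print(n, round(prob_at_most(0, n, P_WORSEN), 3))
```

With these assumed numbers, a completely ineffective drug given to 20 patients still produces a "nobody got worse" result about 12% of the time, so a small uncontrolled series cannot separate the drug from the natural course of the disease.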

  34. JP Leonard says:

    BTW Derek, maybe this is on the topic of drug development after all.
    Everyone understands why rigorous trials are needed before new drugs are introduced.
    But there are other applications.
    For instance, you want to enhance bioavailability of one of your products.
    “Bioavailability studies are usually conducted as crossover studies.”

    1. Crossover trials are interpretable when patients are in some sort of steady state that can, with treatment, be transiently nudged better or worse. For example, some patients are in the steady state of having, say, a seizure or migraine headache every month or so. Such patients can be studied in a treatmentA/B-washout-treatmentB/A-washout-treatmentA/B trial, and one can see whether they have more seizures/migraines during the A periods than the B periods. Crossover trials can work for infections, when the infections are the sort that people get repeatedly, like colds.
      The bane of crossover trials is a time x treatment interaction. When the patients are not in steady state, but are riding out the natural history of a disease, the only convincing endpoints apply to the trial as a whole (mortality, neuro status at 90 days, etc.). Using a crossover trial in such patients provides information, but not the information one really wants. If the ABA patients do better than the BAB patients, that might mean that A is better than B early (or late) in the illness, or that A is especially toxic around the middle of the illness, or that A is generally toxic, but that B is so useful mid-illness that the ill effects of A are outweighed, and so on.
      I’d estimate that of the crossover trials that came across my desk when I worked at FDA, about a third were ruled to be so confounded by treatment x time interactions that their only interpretable data came from the first period, before the first crossover. When that happened, there usually wasn’t enough power left to convince anyone of anything.
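The time x treatment problem described above can be illustrated with a toy calculation. This is a sketch under loud assumptions: a made-up symptom score that falls linearly as the patient recovers, a drug with exactly zero effect, and an off-on-off-on schedule in 3-day blocks. None of these numbers come from a real trial.

```python
from statistics import mean

def symptom_score(day):
    """Assumed natural history: the score drops 0.5 points per day, drug or no drug."""
    return 10 - 0.5 * day

# Off-on-off-on schedule in 3-day blocks over 12 days (days numbered from 0).
off_days = [0, 1, 2, 6, 7, 8]
on_days = [3, 4, 5, 9, 10, 11]

mean_off = mean(symptom_score(d) for d in off_days)
mean_on = mean(symptom_score(d) for d in on_days)

# The drug does nothing here, yet the "on" periods look better simply
# because they fall later in the recovery.
apparent_benefit = mean_off - mean_on
print(mean_off, mean_on, apparent_benefit)
```

Because the "on" blocks sit later in the illness on average, the useless drug appears to lower the score. Real disease courses are noisier and less linear than this, which makes the confounding harder to spot, not easier.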

  35. JP Leonard says:

    Thank you, Robert. Your comment is helpful.
    I went digging some more and realized that crossover trial is a bit of a misnomer. My apologies. What I’m proposing is closer to an off-on-off-on trial. Altho, it’s kind of like a simplified crossover, but with only one constant med and one variable med over a short time.
    Typically the ABAB reversal trials are long drawn out affairs, too, with weeks to return to baseline. Not quite our case either, but the same in principle, just telescoped in time.
    It would be nice to know what the critter is called. For now it will have to answer to off-on-off-on.
    In essence, I’m looking for a way for physicians to quantify their results when they have something that is working, to validate it for third parties, and not just for doctor and patient.
    There might have been a time in history when doctors and scientists were the same people, when there wasn’t the extreme division of labor and profusion of information we have now. But the Internet makes it easier to manage and absorb tons of info, so it might empower physicians in new ways.
    Thanks again for taking the time to explain that.

    1. Another Guy says:

      Interesting to ponder the “on-off-on” clinical trial concept, and thank you JP Leonard for putting it out there. This might work for analgesics studies, where the patient won’t die from the pain, and it would be safe to start and stop treatment at will. Normally these studies use patient questionnaires to assess pain severity rather than a battery of lab tests, so there is not much to lose by doing it over the internet. I’m not so sure about diseases where the patient might die or suffer irreversible harm if they stop treatment just for the sake of seeing what happens next.

      1. What @AnotherGuy & @JPLeonard have been calling on-off trials are just variations on crossovers, with one of the treatments being placebo. When individual clinicians do these with specific patients, they are called n-of-1 trials. The literature can be found with “N of 1” as the search term.

        1. JP Leonard says:

          @Another Guy @Robert R. Fenichel
          Yes, N of 1 is kind of the umbrella term for these related types.
          “List of variation in N of 1 trial” – includes
          “A-B-A-B Experiment Withdrawal design where effects of B phase can be established”
          which is same as off-on-off-on
          Wiki also says of N=1 trials
          “It can be very effective in confirming causality. This can be achieved in many ways. One of the most common procedures is the ABA withdrawal experimental design, where the patient problem is measured before a treatment is introduced (baseline) and then measured again during the treatment and finally when the treatment has terminated. If the problem vanished during the treatment it can be established that the treatment was effective.” Can be, not is.
          ABAB is what I’m proposing at which is more safe and certain than ABA.
          Regarding Another Guy’s objection “I’m not so sure about diseases where the patient might die or suffer irreversible harm if they stop treatment just for the sake of seeing what happens next.”
          That is certainly an important concern. There are a couple defenses.
          One is a stipulation that withdrawal from treatment is reversed (off/on) as soon as patient improvement stops, rather than holding to the fixed time period.
          In the study I’m proposing, with HCQ as the control (currently over 50% of Covid patients are on HCQ), the treatment /high dose zinc/ is an alternative one that very few patients are getting, so they are “lucky” to get it with the physician doing the trial. They’re still getting standard of care.
          Thirdly, participating in the ABAB trial should still be voluntary, by informed patient consent.
          Also if you go with placebos you don’t have to have some on placebo and some not and compare the two groups. To me it seems like this type of design has fewer ethical issues.
          As wiki says there’s room for variation. It’s not rocket science but maybe neither is trial design something physicians have studied since medical school. They might feel unsure about it.
          This is how I tried to make a common sense description of it
          “You just alternate your patients a few days on high-dose zinc and a few days off (all the while on hydroxychloroquine/HCQ), and compare how their symptoms improve.”
          So the main idea can be told in a single sentence, without resort to statistical equations. Because it’s basically kind of common sense.

  36. Stanislav Radl says:


    I have a question regarding the clinical trial involving famotidine titled „A Multi-site, Randomized, Double-Blind, Multi-Arm Historical Control, Comparative Trial of the Safety and Efficacy of Hydroxychloroquine, and the Combination of HCQ and Famotidine for the Treatment of COVID-19“. In a CNN report they stated that „HCQ might not be used in the study moving forward, …“.
    So, my question is: Could they simply use the study without HCQ or should they ask FDA for approval of a new study? In my opinion, such a change would totally change the intended goal of the study and therefore a new study should be designed.
    Stan Radl

    1. RoyT says:

      It might be possible to drop HCQ from the study; however, in that case the sponsors need to plan to analyze the data separately for the two cohorts: the patients who were administered HCQ and those who weren’t. So, in a sense, it would be like 2 separate studies but administered within a single study. The difficulty with such an approach would be the logistics: getting the protocol amended, getting approval from the DSMB and the FDA, and finally getting approval from the separate IRBs where the study is being conducted. My experience as a biostatistician is that getting alignment on such a major change in a protocol from all parties is time-consuming and extremely difficult.

  37. Andrew Molitor says:

    I am inescapably reminded of the story of Dr. Jekyll and Mr. Hyde. The protagonist, as surely all will know, prepared a pharmaceutical compound of great effectiveness. He was tragically unable to duplicate it, despite his careful attempts, because, as he determined in the end and too late, his compound’s efficacy relied NOT on the theoretical basis upon which he had formulated the drug, but rather on an impurity in one of his ingredients.

    I dare say this story has occurred in one form or another in the real world, many times.

    Manufacturing really really matters.

  38. Insilicoconsulting says:

    I have had the job of running an engineering team alongside clinical trial experts to develop a clinical trial monitoring product. Note this
    1. It is highly bureaucratic and SLOW, with heterogeneous siloed data and processes and a high reliance on Excel, SAS and people. There is reluctance to change even when beneficial.

    2. Outsourcing to a CRO is sometimes more akin to “now it’s their problem.”

    3. CROs never willingly share data, even with the sponsor. Especially patient data, not only operational.

    4. Quite a few times, key stakeholders either aren’t aware of or are uninterested in detailed data analysis/risks. We had to point to discrepancies between their protocol and EDC, for example, which would have invalidated their study. There were several other instances.

    5. Very few companies are interested in leveraging their own past clinical trials to improve future designs -> protocols -> inc-exc criteria. Interest is increasing, but it is hardly where one expects it to be, since it’s touted to be the holy grail of evidence-based medicine.

    6. CRAs are badly paid, under tremendous pressure, and increasingly don’t deem it worthwhile. This, among other reasons, is also a factor in risk-based analysis and visits.

    In sum, the RCT might be a smashing way to study and find statistically significant evidence, but given that it catches signals (efficacy/safety) in a limited fashion, to me it seems to be only a necessary evil. We need to find faster, better methods. Not to replace these immediately, but sometime in the future.

  39. Chris Swain says:

    The other major problem is clinical trials data is often not reported despite this being a legal requirement. Nearly 60% are not reported within the 1 year deadline.

    Industry sponsors and large sponsors were most likely to report trial data, whereas universities were the least likely. The sponsor with the lowest compliance was the US government.

  40. JGault says:

    I would love to see a more in depth discussion of Challenge trials in the context of COVID.
    The ethical, timeline, and statistical details would be a fascinating topic for discussion.
