
Clinical Trials

An Update on Anti-Androgen Therapy for the Coronavirus

I blogged earlier this year about some interesting androgen antagonist results in coronavirus therapy. The idea is that the TMPRSS2 serine protease needs to be present at the cell membrane for the entire cell-entry pathway of the virus to work, and that androgen antagonists decrease its expression sharply. As the references in that earlier post show, that’s not a crazy idea per se, and it had gotten some attention from various groups.

What prompted the post was a press release on a trial that had taken place in Brazil. A small company called Kintor had provided an investigational androgen receptor antagonist (proxalutamide) for this work, which looked at over 500 patients and reported quite strong results. A preprint manuscript on this work appeared late in June. But now Science reports that there are doubts about some parts of the study, and it’s apparently having trouble finding any other publication venue.

One problem is that those results were not only strong, they seem too strong to be real. That gets us back to the key concept of effect size, which I’ve blogged about here and here, among other places. You can think of that as the underlying strength of the intervention that you’re studying. If you’re the sort of person (like me) who keeps calling for controlled trials of new therapy ideas, you will get to have people ask you “You wouldn’t run a double-blinded trial of parachutes for people jumping out of a plane, would you?” As a side note, many of the people who ask this one seem to feel that it’s both extremely original and extremely telling, but sadly, it’s neither. The effect size of a parachute intervention during free fall towards the ground is enormous, and in fact could not be larger. And the larger the underlying effect size, the fewer data points you need to be sure that you’re seeing something real.
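To put rough numbers on that last point, here is a minimal sketch of the textbook normal-approximation sample-size formula for comparing two proportions. The function name `patients_per_arm`, the 20% control mortality, and the three relative reductions are all hypothetical, chosen only for illustration; none of them come from the trial discussed in this post.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed in each arm to detect a difference
    between two event rates (two-sided test, normal approximation)."""
    if p_control == p_treated:
        raise ValueError("the two event rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_treated) / 2            # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treated * (1 - p_treated))
    ) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


# Hypothetical scenario: 20% mortality in the control arm, and three
# made-up relative reductions from "modest" to "nearly a parachute".
control = 0.20
for relative_reduction in (0.10, 0.30, 0.90):
    treated = control * (1 - relative_reduction)
    n = patients_per_arm(control, treated)
    print(f"{relative_reduction:.0%} relative reduction: about {n} patients per arm")
```

Under these assumptions the required enrollment falls from thousands of patients per arm for the modest effect to a few dozen for the near-parachute one, which is the whole point: big effects announce themselves in small samples.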

But in drug trials, we just don’t see parachute-level effect sizes. The closest example I can think of is when penicillin was first made available and given to patients with severe bacterial sepsis. Doctors at the time saw people die of bacterial infections all the time, and they knew exactly what it looked like when someone was passing the point of no return. But penicillin could often bring such people back, and those physicians who saw it happen needed no further convincing of its efficacy.

I wish we had a lot more stories like that, but we don’t – you can count them easily on your fingers. What we have instead are a lot more than fifty shades of grey. Most successful trials show that a drug works to some degree in some patients. Some respond pretty strongly, some don’t respond at all, and “pretty strongly” is a phrase that’s on a sliding scale of its own. Even really dramatic interventions like CAR-T therapy for leukemia show this. These engineered T-cell transplant techniques have pulled some people practically out of the grave – oncologists know what it looks like and what it means when someone has been through every possible therapy and the disease is continuing to worsen, but CAR-T has sent some of these people back out into the world without signs of leukemia. But not all of them. Some people get a partial or less durable response, and some people don’t really respond at all (substantial effort is going into trying to figure out how we could identify such patients up front, as you would imagine!).

That’s how it works out here in the real world, as opposed to the last act of a screenplay. But the Brazilian trial showed a 77% reduction in mortality, which is either very impressive or too impressive depending on how you look at it. As the Science article shows, a problem is that the mortality in the control group seems very high (49% of the hospitalized patients!), which could of course inflate any benefit in the treatment group. The enrollment of the patients seems a bit too fast to be plausible as well. And there’s worse:
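A quick back-of-the-envelope sketch shows why the control-arm number matters so much. It treats the reported 77% as a simple relative risk reduction applied to the 49% control mortality, which is my simplifying assumption (the trial may have reported a hazard-based figure), and the 10% control rate in the comparison is purely hypothetical. The function name is likewise just for illustration.

```python
def implied_outcomes(control_mortality, relative_reduction):
    """Return (treated-arm mortality, absolute risk reduction, number needed
    to treat) implied by a control rate and a relative risk reduction."""
    treated_mortality = control_mortality * (1 - relative_reduction)
    absolute_reduction = control_mortality - treated_mortality
    number_needed_to_treat = 1 / absolute_reduction
    return treated_mortality, absolute_reduction, number_needed_to_treat


# Figures quoted above: 49% control mortality, 77% reduction
# (assumed here to be a plain relative risk reduction).
treated, arr, nnt = implied_outcomes(0.49, 0.77)
print(f"implied treated-arm mortality: {treated:.1%}")
print(f"absolute risk reduction: {arr:.1%}, number needed to treat: {nnt:.1f}")

# Hypothetical comparison: the same relative reduction against a
# made-up 10% control mortality.
treated_low, arr_low, nnt_low = implied_outcomes(0.10, 0.77)
print(f"at 10% control mortality: absolute reduction {arr_low:.1%}, NNT {nnt_low:.1f}")
```

On that reading, the treated arm would have had mortality of about 11%, an absolute drop of roughly 38 percentage points. The same relative reduction against a 10% baseline would be worth under 8 points, so an unusually sick (or unusually reported) control group makes everything downstream look spectacular.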

But on 8 June, the Brazilian newspaper O Globo reported that the Brazilian National Research Ethics Commission was investigating the study because the authors failed to report trial deaths as quickly as required by clinical trial rules in Brazil, and at different times reported a total of 170 and more than 200 deaths during the trial. The agency would not confirm the investigation, noting that “all data from the research protocols under analysis are confidential,” but Cadegiani confirms to Science that the commission is expected to issue a report on the trial.

It does not help much that lead investigator Flavio Cadegiani has also been a big proponent of a number of other coronavirus treatments that are (at the very least) unproven. But one way or another, we’re going to know what’s going on, because proxalutamide itself is in another trial here in the US, and other AR antagonists are in trials as well. To be sure, looking at people who are already on androgen depletion therapy has shown no effect on their infection rates with the coronavirus, so there’s room to wonder about the entire idea, as plausible as it might seem mechanistically. But it looks like we’ll at least get the answer.

The whole story, though, illustrates what non-screenplay clinical research is like. Extremely solid-sounding rationales sometimes work and sometimes don’t, and we often don’t know why they didn’t. Trials get run that point one way, and trials get run that point another, and we have to figure out what the differences might have been between them, which ones were intrinsically more likely to be reliable, and what lessons to learn from them taken together. If the effect sizes were higher, we’d have less confusion. But most of the time, they aren’t.

36 comments on “An Update on Anti-Androgen Therapy for the Coronavirus”

  1. Cynic says:

    That “you wouldn’t run a double-blind study of parachutes for people jumping out of a plane, would you?” question confuses me… It’s more like testing parachutes that work on unicorn magic, which seem to generally work better if you’ve had fewer sexual partners, but maybe 30% of people seem to actually hit the ground FASTER than they should have after using them… But 60% of people recover after hitting the ground with these on, and 28% are completely uninjured!

  2. Jonathan B says:

    It doesn’t quite seem right to reference “fifty shades of grey” in an article about anti-androgens…

  3. Another Guy says:

    The famous parachute-jump clinical trial analogy, and a great humorous example of why details matter.

    See: BMJ 2018;363:k5094

    1. En Passant says:

      Thanks for that cleverly hilarious article. It makes me long for the old Journal of Irreproducible Results. And it has a photo with a nicely restored Stearman too.

      This is as good a place as any for another example of Derek’s observation:

      And the larger the underlying effect size, the fewer data points you need to be sure that you’re seeing something real.

      There is a good argument to be made that effect size was a major factor in causing serious investigation of Robin Warren and Barry Marshall’s hypothesis that Helicobacter pylori, not stress or spicy food, caused stomach ulcers.

      Marshall, in good health, drank a beaker of H. pylori culture, and within days had acute gastritis. Endoscopy revealed an H. pylori culture growing in his gut. So the hypothesis began to be investigated seriously instead of dismissed as a crank theory.

      1. Derek Lowe says:

        Oh, absolutely. It wasn’t something that needed 1000 patients to assess the statistical significance, as it turned out. Instant Ulcers, courtesy of H. pylori, got people’s attention very quickly, as did their fast disappearance on antibiotic treatment.

      2. Not-an-epidemiologist says:

        A small clarification — Marshall’s experiment on himself (OHS and ethics these days would have an apoplexy!) didn’t give him an ulcer. He developed gastritis that could be successfully treated with antibiotics, and correctly hypothesised that ulceration could be a long-term consequence if left untreated, but the paper wasn’t actually the lay-down misere demonstration that popular lore would have it.

        You could very well argue that Marshall’s early hypothesis re. H. pylori and ulceration was based for the longest time on the same unsubstantiated correlation == causation arguments that are currently being applied to all sorts of things + covid. (And let’s face it, if one of these correlates ever turns out to be important to covid outcomes, we’ll be re-writing history in just the same way as we did with Barry Marshall.)

    2. Mark says:

      Thank you for this! Made my evening. Special mention for reference 16!

    3. Nice thank you for posting that – very entertaining – and it references a *seminal* parachute trialing paper from 2003 which is located here:!po=2.08333

    4. Nesprin says:

      The best part of the article is hands down the acknowledgements:
      Contributors: RWY had the original idea but was reluctant to say it out loud for years. In a moment of weakness, he shared it with MWY and BKN, both of whom immediately recognized this as the best idea RWY will ever have. RWY and LRV wrote the first draft. CS, DBK, JBS, EAS, and JLH provided critical review. RMD provided subject matter expertise. DSK took this work to another satirical level. All authors suffered substantial abdominal discomfort from laughter. RWY worried that BKN would not keep his mouth shut until the Christmas issue was published. All authors had full access to the data in the study and can take responsibility for the integrity of the data and the accuracy of the data analysis. RWY is the guarantor. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

  4. David says:

    The issue isn’t just effect size. You could compare open-label outcomes for a treatment with a modest effect size to a historical control population, if you had confidence that the populations were truly comparable and the data was obtained without bias.

    Accepting open-label data as proof of efficacy implies that you have a well-founded and accurate understanding of the natural history of the disorder in question. As you point out, for established bacterial sepsis, the mortality rate without antibiotics is very nearly 100%, so that’s a reliable comparison. For COVID-19, if the composition of the group changes a little (age, M:F ratio, comorbidities, severity at entry, quality of other medical care received), the mortality rate varies widely. That’s the fundamental error which led to apparently positive HCQ data.

  5. Another Idiot says:

    Extraordinary claims require extraordinary evidence, as the saying goes.

    A great example of this is when Thomas Starzl presented his kidney transplantation results at a select National Science Council meeting of leading researchers on organ transplantation in the early 60’s. He knew going in that his success rate due to anti-rejection drugs was going to be questioned. His group compiled all of his patient records and organized the data on big rolls of paper. The last two days of the meeting consisted of everyone poring over the rolls after the sh*tstorm that his presentation caused. By the end about half of the attendees changed their flight plans to go to the University of Denver to see the results themselves.

  6. Insilicoconsulting says:

    I don’t always concur with the comments on the ML-AI front, or with the presentation of clinical trials as the gold standard. But talking about trials and effect sizes, confounding, mixed effects, and even open label being right in certain circumstances, as a fellow commentator did, is outstanding.

    1. Another Guy says:

      Bingo. Now imagine a parachute-jump study where a small fraction of the airplanes are flying, but no records exist of exactly how many are flying and whether the “study group” and “placebo group” are evenly matched or not. By chance an equal number of deaths occurred in both groups because one or two participants in the stationary plane tripped and fell on their heads (occurred, but not properly documented). Would you trust the conclusion that parachutes have no effect on reducing mortality, book your next (very real) parachute-jump excursion, and ditch the parachute?

  7. stewart says:

    Having read the preprint (and particularly the limitations section) there seems to me to be a large hole in the randomization process. They were concerned that the remote sites would fail to correctly assign treatment packs to individual patients, with the (I assume) risk that there would be an unrecognized partial treatment arm, depowering the study. In addressing this they ended up with a non-random distribution between sites, with the Manaus sites 75% placebo, and the other sites 88% treatment.

    I don’t see why that might produce a confounder that would produce an effect size of this magnitude. But if we can get our hands on outcome data for non-trial patients at those sites for the same time period we can test the null hypothesis that there is no significant difference in outcomes between sites.

  8. TallDave says:

    eh test enough jellybeans and you’ll find a color that cures cancer

    this is why independent experimental replication is the primary universal scientific standard

    today the fact there are so many studies makes replication all the more relevant

    also as someone who once jumped out of a perfectly good plane with someone whose chute did not open, I feel obliged to point out that even twenty years ago the safety protocols dictated that even if you passed out on the way down, your reserve chute automatically opened upon sensing a variety of factors indicating imminent death would otherwise obtain

    in fact regulars told me the only ones they’d ever seen die (in tens of thousands of jumps) were two people who opened their reserve chutes after the main deployed, either accidentally or (the general feeling) on purpose

    so yes, in fact new chute designs could be (and probably, in fact, have been) tested double-blind

    1. TallDave says:

      oh the other detail about reserves was that even if you accidentally deployed it after the main had already deployed, you were still generally okay unless you happened to have pulled the reserve at the fairly specific range of heights that was high enough to kill you but not high enough for your chute to inflate and slow you down

      which was why “accident” was viewed a bit askance when the unfortunate events occurred

      1. The real problem is if you have an arrangement that lets you pop the reserve without cutting away the main. High chance that you’ll be falling under two streamers.

    2. LineDrive says:

      “ someone who once jumped out of a perfectly good plane with someone whose chute did not open,”

      Well, this is proof that there is Internet in the afterlife.

      I’ve done one jump. Shortly after that, the outfit’s plane crashed. 1/2 of one survivor, all others dead.

  9. Simon says:

    So nice to read about coronavirus without gamers tedious screeching.

    Thanks Derek!

  10. Karl Pfleger says:

    If we don’t limit to “in drug trials” but instead consider the only slightly wider “in clinical medicine”, then the other obvious category of reliable parachute sized effects comes from repletion of clinical deficiency of an essential micronutrient. For example, Vitamin C cures serious scurvy with an effect size similar to the way penicillin cures bacterial sepsis. But the principle applies to deficiencies broadly (just as the point applies to antibiotics broadly).

    It continues to baffle me why public health & clinical medicine does not prioritize taking advantage of these huge effect sizes by stamping down known high-prevalence deficiencies and monitoring the more likely ones to accurately assess these prevalences. We don’t have much scurvy anymore, but based on Palacio & Gonzales (2013) and a couple other refs for Europe & China estimates, the global prevalence of clinical vitamin D deficiency appears to be roughly 50% based on estimates for countries representing ~2/3 of global population. [See for the math and the other 2 papers.] This is higher than commonly discussed estimates, but even if it’s off by a bit and it’s only 1 billion people instead, the point is the same.

    Covid relationship or not, there’s no argument about the effect size to bone health of resolving the deficiency, but in the US despite several high-quality papers showing D deficiency prevalence of 30% or 35% or higher we seem to have no government group charged with monitoring the deficiency prevalence and driving it down to low single-digit %.

  11. Bruce Grant says:

    Chloramphenicol also had a parachute-size effect, arguably even bigger than penicillin’s, due to its broad-spectrum coverage. Unfortunately, the pre-Harris/Kefauver approval regime (abbreviated safety testing, with no large-scale efficacy trials required) failed to reveal its rare side effect of aplastic anemia, condemning hundreds to a slow, gruesome death before Parke-Davis was finally compelled to share the AE reports it had been receiving since shortly after launch. Effect size alone is no substitute for an RCT sufficiently powered to evaluate both efficacy and safety. Just saying.

    1. Dean says:

      Regarding Chloramphenicol:
      Is it possible to determine in advance whether a particular patient would suffer aplastic anemia if they were given Chloramphenicol? If such a determination were quick, reliable, and inexpensive, it would seem that Chloramphenicol could be a useful drug.

      Alternatively, are there infections that could be treated successfully with Chloramphenicol that cannot otherwise be treated? If the likelihood of untreated survival was much lower than the likelihood of aplastic anemia, Chloramphenicol could still look like a good choice.

      As I am neither a medical professional nor a biochemist, I have no clue what the answers to the above questions turn out to be. That said, I’ve always been somewhat puzzled by blanket bans on drugs that are clearly contra-indicated for a subset of all patients. Wouldn’t it make more sense to figure out how to use such drugs only for patients who fall outside the problematic group. Is there a reason Thalidomide couldn’t be useful for people who cannot possibly get pregnant — men, for example?

      1. En Passant says:

        Wouldn’t it make more sense to figure out how to use such drugs only for patients who fall outside the problematic group. Is there a reason Thalidomide couldn’t be useful for people who cannot possibly get pregnant — men, for example?

        I’ve often wondered about that. A double blind RCT is not necessary to show that men will not bear children with drug induced phocomelia.

        Perhaps there is another psychological factor at work in Derek’s effect size factor. There is the folk tale of the cat who once sat on a hot stove and never sat on another stove again, whether hot or cold. The effect size of being burned by a hot stove overwhelmed whatever rationality the cat had to distinguish between hot or cold stoves.

        Maybe the terrible and readily observable consequences of Thalidomide use by pregnant women overwhelm rational thought both among regulators and the general public. So regulators issued a complete ban on the drug.

        To be fair to regulators, some facts are worth noting:

        Some countries’ regulators do permit Thalidomide to be used in treating erythema nodosum leprosum (ENL), a painful complication of leprosy, for which it has been found to be effective. The use includes various restrictions to prevent the drug from use to treat pregnant women. Those regulators include the US FDA in 1998, which approved that use more than 30 years after the drug was shown to be effective against ENL.

        In 2006 the FDA also approved use of Thalidomide in treatment of multiple myeloma. The angiogenesis inhibition side-effect of Thalidomide, which causes birth defects when used by pregnant women, is apparently what makes it effective in treating multiple myeloma.

        1. Rich Rostrom says:

          Not a folk tale.

          “We should be careful to get out of an experience only the wisdom that is in it — and stop there; lest we be like the cat that sits down on a hot stove-lid. She will never sit down on a hot stove-lid again — and that is well; but also she will never sit down on a cold one anymore.”

          — Mark Twain, Pudd’nhead Wilson’s Maxims.

      2. aairfccha says:

        Even with its side effects, Chloramphenicol still/already is a very useful drug. When a drug is on the WHO Model List of Essential Medicines it’s generally for a reason.

  12. Adrian says:

    Phague therapy could also qualify as parachute-level effect in some cases.

    An institute in Georgia (not the US state) has been using it successfully for nearly 100 years, but in Western countries it is not approved and often not even known.

    When a patient has an infection with bacteria resistant to all antibiotics, and doctors have said that there is no hope for improvement, then phague therapy can have the same effect as your penicillin example.

    1. Crocodile Chuck says:

      Phage, correct?

      1. Harvey 6'3.5" says:

        You can find lots of phage therapy studies in the US (see, e.g. Aside from the problems listed in the paper, the personalized nature of the therapy is problematic because a phage that treats one person with a particular disease might not work on another person with the same disease because of differences in the bacteria.

    2. Wallace Grommet says:

      Much of the medical talent from the Soviet Union immigrated to the US and founded a phage research company called Intralytix.

  13. cancer_man says:

    “I wish we had a lot more stories like that, but we don’t – you can count them easily on your fingers.”

    But the natural world provides many examples, like the still mysterious powers of resveratrol.

    1. eub says:

      $720 million sale versus no sale, that’s a big effect size all right.

  14. Lizzy says:

    Testosterone is required for expression of ACE2 and TMPRSS2 via the androgen-receptor signaling, and testosterone has immunosuppressive functions. Therefore you’d expect testosterone to be high in men with more serious Covid 19 infections. But testosterone is a two edged sword.
    Here’s a little article expressing the opposite. European Association of Urology Congress (EAU21), held July 8 to 12, 2021.

  15. Dan Robinson says:

    Moderna and Pfizer showing heart inflammation. what’s with that?

  16. Jack says:

    One thing that disturbs me about this pandemic is this. What if there was a highly effective existing drug. And a few people championed it. But we would never be convinced because it never was funded, never had properly powered trials, or the trials that ran were flawed. One only has to look at how long it took for people to believe in Katalin Karikó.

    The amazing thing is, if there truly was a drug, how long would it take before the world would believe its efficacy? A year? Two years? A decade? By all measures, the next pandemic will overrun us again before we find that drug. With all the world’s resources, and all the world’s spread of disease of a pandemic, we could not coordinate or pool resources into running properly run trials to determine whether certain drugs could be effective.

    Are there any studies that show how the mortality rate has changed since near the beginning of the pandemic when there were no studies of some of the drugs now used to treat COVID patients?

  17. Byrel R Mitchell says:

    A bit late to the party, but another example of the small category of parachute-interventions: pulmonary surfactant in pre-term infants. In the first four trials of an effective surfactant, absolute all-cause mortality rate in week 25-29 infants dropped by an average of 25% (with relative mortality reduction ranging from about 25% to 75% depending on how badly the trial population was doing; week 25-29 neonates died a LOT back then.)
