
Adaptive Trials

There are a lot of clinical trial designs out there, but one thing that most of them have in common is that they’re designed right from the start to run under certain set conditions and to enroll a set number of people (or at least to meet certain thresholds before that enrollment is closed). You set all that down before you even start, run the trial, and see what you’ve got after you analyze the data. There’s another class of trials, though, that use adaptive designs. These can have rolling enrollment, and can even change protocols as the trial goes on (adding or dropping particular treatment groups, changing dosages, etc.). In theory, you can even run with several different interventions under study and change those along the way. This recent article will catch you up on the general ideas.

This is by no means a new idea. I was, in fact, commissioned to write an article on the subject around 15 years ago, and it wasn’t a new idea then, either. But it’s taken a while to catch on. For one thing, these things tend to be more friendly to a Bayesian statistical approach rather than the classic “frequentist” one that we all know and sort of love. That’s a big topic all by itself, and there are people more qualified than I am to discuss it, but the general idea is that a Bayesian framework updates the probability of a particular hypothesis as more data come in to support or refute it. You have a “prior probability” (based on what you knew before the experiment started), and a “likelihood function” is applied to that based on the effect of the new data (which were not used in calculating the prior), which gives you a “posterior probability”. Your hypothesis ends up looking more or less likely after the data are collected and analyzed, and you can keep the process going, updating along the way.
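
(As a toy illustration of that update cycle – my own sketch, not anything from the paper – here’s what it looks like in Python with a conjugate Beta prior on a response rate. Every number here is invented:)

# Toy sketch of Bayesian updating with a Beta prior on a response rate.
# All numbers are invented for illustration.
from scipy import stats

# Prior: before the trial, we think the response rate is around 30%,
# encoded as Beta(3, 7) (roughly ten prior "pseudo-patients").
prior_a, prior_b = 3, 7

# New data: 25 patients enrolled, 12 responders.
responders, n = 12, 25

# Conjugacy makes the update trivial:
# posterior = Beta(prior_a + successes, prior_b + failures).
posterior = stats.beta(prior_a + responders, prior_b + (n - responders))

print(f"Posterior mean response rate: {posterior.mean():.2f}")
print(f"P(response rate > 0.30 | data): {posterior.sf(0.30):.2f}")

Run it again with the next batch of patients folded in, and yesterday’s posterior becomes today’s prior – that’s the whole updating loop.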

There’s nothing weird or spooky about the Bayesian approach, but it does require a different way of working, and it has its own pitfalls along with its own strengths. Traditionally, Bayesian trials have been quite rare in the drug business, so there are fewer people with the relevant experience in setting them up. The first one I remember seeing was a Pfizer cardiovascular trial in the early 2000s, but the literature on the subject has been growing steadily, and the new paper mentioned above notes eight current trials, each with its own design features. There are a lot of choices possible, and (as with any trial) a vital step is to specify up front just what you’re doing, how you’re going to do it, and why.

“Go Bayesian” is not synonymous with “Go wild” – the new design possibilities opened up by adaptive designs also need to be specified up front, in great detail. It’s also important to run plenty of simulations beforehand to see what you might expect as you tweak the starting design, or how the trial might perform as it proceeds and re-weights. One thing this paper mentions is that there aren’t (yet) commonly agreed ways to evaluate or rank the effects of these various design decisions, so you’re a bit on your own in this process (well, you and the relevant regulatory authorities, who have been coming to grips with all these issues themselves). That’s a real issue when you go before an institutional review board – the complexity of some of these trials as they take advantage of all those features can be a barrier to approval (or even to understanding).
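
(To give a flavor of what those up-front simulations look like, here’s a hedged sketch – emphatically not any group’s actual protocol – that Monte-Carlos a simple interim-look rule to estimate its operating characteristics. The thresholds and response rates are invented:)

# Sketch of simulating one design decision: an interim "stop for efficacy"
# look at 50 of 100 patients. All thresholds and rates are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def run_trial(true_rate, interim_n=50, final_n=100, threshold=0.95):
    """Return True if the trial declares efficacy at either look."""
    outcomes = rng.random(final_n) < true_rate
    for n in (interim_n, final_n):
        successes = outcomes[:n].sum()
        # Posterior under a flat Beta(1,1) prior; declare efficacy if
        # P(rate > 0.30 | data) exceeds the threshold.
        if stats.beta(1 + successes, 1 + n - successes).sf(0.30) > threshold:
            return True
    return False

# Operating characteristics: false-positive rate under the null (rate 0.30)
# and power under a hopeful alternative (rate 0.45).
sims = 2000
false_pos = np.mean([run_trial(0.30) for _ in range(sims)])
power = np.mean([run_trial(0.45) for _ in range(sims)])
print(f"False-positive rate: {false_pos:.3f}   Power: {power:.3f}")

You’d then tweak the look schedule, the threshold, and the priors, and re-run until the false-positive rate and power land somewhere that you (and the regulators) can live with.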

Probably the most common adaptive technique is re-weighting of the patient population as the results come in (and as the various probabilities of success get updated). This lets you study more than one sort of patient at the same time, or even merge a traditional Phase II effort into a Phase III one as the trial proceeds. But if you’re going to do that sort of thing, you also have to guard against drift in the way the trial is run and how the data are collected, because things could run for a while. (The same thing is a concern in traditional trials, too, of course.)
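
(One standard way to implement that kind of re-weighting is response-adaptive randomization via Thompson sampling, where each arm’s allocation probability tracks its current posterior. Here’s a toy sketch, with invented response rates:)

# Toy response-adaptive randomization (Thompson sampling). Each new patient
# is assigned to the arm whose posterior draw is highest, so allocation
# drifts toward the better-performing arm. Rates are invented.
import numpy as np

rng = np.random.default_rng(1)
true_rates = {"arm_A": 0.25, "arm_B": 0.40}   # unknown in a real trial
counts = {arm: {"s": 0, "f": 0} for arm in true_rates}

for patient in range(200):
    # One draw from each arm's Beta posterior (flat Beta(1,1) prior):
    draws = {arm: rng.beta(1 + c["s"], 1 + c["f"]) for arm, c in counts.items()}
    arm = max(draws, key=draws.get)
    # Observe the (simulated) outcome and update that arm's counts:
    if rng.random() < true_rates[arm]:
        counts[arm]["s"] += 1
    else:
        counts[arm]["f"] += 1

print(counts)  # most patients end up on the better arm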

The paper under discussion is particularly geared toward adaptive platform trials, which is the sort of thing you might run if you’re comparing several oncology clinical candidates or combinations (for example). That has real appeal in situations where patient enrollment is a limiting factor, because you have an at least theoretically more efficient way of evaluating all of these compared to running separate traditional trials. (On the other hand, if you’re just comparing a couple of possibilities, you’re probably causing yourself greater trouble and expense by setting up an adaptive trial rather than a traditional one.) But as the authors note, even when these trials are well-suited, they “do not lend themselves to traditional funding models”. You don’t know how many people will be involved or how long things might run, which is not what granting agencies or clinical research VPs like to hear. I hope, though, that this work by the Adaptive Platform Trials Coalition helps to move the field forward. There really are a lot of appealing possibilities.

23 comments on “Adaptive Trials”

  1. CMCguy says:

    I am aware that the Adaptive Trial Design concept has been floating around for a long time – I recall sitting in on debates in the late 1990s – so it has indeed taken a long time to become an accepted approach. From what I observed, it was a case of the biostatistical experts struggling to educate the clinical groups and get their buy-in: there was great resistance to leaving the standard, known paths, except among a few who understood the potential and became advocates. There was also much uncertainty about how to deal with the practical aspects of enrollment, site selection, and finance models, but it was typically the great unknown of how the FDA would respond that appeared to hold off implementation. As with many drug-development advances, once the Agency showed signs of being flexible and willing to consider it positively, it became more common. From what I have heard, the application really needs to be evaluated thoroughly on a case-by-case basis, as it may or may not benefit a particular study, and it can be very difficult up front to have clear guidance on the choice.

    1. CMCguy says:

      I should have added that, from a clinical drug supply point of view, supporting adaptive trials is a bigger challenge, with more than the normal number of moving parts to balance. So, as usual, Manufacturing makes its best guess (and back-up options) on limited actual data and then sweats out the progression.

      1. Derek Lowe says:

        Yeah, I should have mentioned that part, which is definitely another complication. Much easier when you know all that stuff going in.

  2. Yvar says:

    I think one big barrier to adaptive trials is that your outcome is always intended to convince the FDA to make an inference that your treatment (drug, combo, etc) will be effective in treating a set of patients. Because humans are generally worse at intuitively understanding the validity of Bayesian statistical conclusions, I think they are more reluctant to make inferences based on those conclusions, which lowers the odds of approval. This applies strongly in my life science research world as well – can’t ever forget that one factor in the outcome of the experiment is the conclusions that human brains draw from it.

    1. Anonymous says:

      I was at a meeting in the 1990s on clinical trials. I don’t specifically remember the term “Adaptive Trials”, but I do remember a general consensus: “We’re not going to try anything different if it’s going to affect how it gets treated by the FDA.” “We’re not going to try anything different until the method is PRE-approved by the FDA.” Et sim.

    2. anon says:

      Humans are also unable to interpret frequentist results. Gigerenzer has lots of empirical data showing this, for example: https://www.researchgate.net/publication/241372934_The_Null_Ritual_What_You_Always_Wanted_to_Know_About_Significance_Testing_but_Were_Afraid_to_Ask

      To me, P(H | D) crisply answers the central business and scientific question: does this drug work? The frequentist P(D | H₀) can’t answer that, and its correct interpretation is awfully confusing to most.
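
      As a toy calculation (numbers invented) of how far apart the two quantities can sit:

      # Toy contrast between P(D | H0) and P(H | D). Invented numbers:
      # 100 patients, 40 responders, null response rate 0.30.
      from scipy import stats

      n, successes = 100, 40
      p_value = stats.binomtest(successes, n, p=0.30, alternative="greater").pvalue

      # Posterior under a flat Beta(1,1) prior on the response rate:
      posterior_prob = stats.beta(1 + successes, 1 + n - successes).sf(0.30)

      print(f"P(data this extreme | H0): {p_value:.3f}")
      print(f"P(rate > 0.30 | data):     {posterior_prob:.3f}")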

    3. loupgarous says:

      There’s precedent for US government decision-makers to dispute or ignore findings of Bayesian analysis.

      When the US Navy (led by undersea engineer John Craven) used Bayesian analysis to locate USS Scorpion’s wreck, the “likelihood” step in the analysis was done by polling a group of salvage experts – asking them to bet bottles of whisky on how likely given findings in the search were to be accurate.

      It worked, largely because the body of data and human expertise Craven was able to tap was huge, even though the Navy’s experts in deep-sea ship recovery disputed Craven’s findings at almost every step. They looked everywhere but where Craven’s calculations showed the wreck would be – which is where it was finally found, in a last-minute search using USNS Mizar.

      While “likelihood” determination is more refined now, it is still potentially both the most powerful part of Bayesian analysis and the riskiest.

      Everyone else here knows better than I do the quality of the data and expert intuition used in drug development to determine likelihoods and shape adaptive analyses.

    1. Derek Lowe says:

      Nope, it’s coming up. . .

    2. Lambchops says:

      “We going to do that through a combination of developing them internally and also partnering them out. It’s something we would do very carefully because we want to make sure that drugs are developed efficiently,” Frey said.

      I read that as – we’re going to blame the chemists if we don’t make it to clinical studies, even if our lead candidate has terrible PK or a high chance of toxicity issues! And if it works after significant time on lead optimisation, well, that’s the genius of our AI!

      But maybe I’m getting cynical in my old age.

      I do think it’s great that targets can be identified like this – but as we’re all aware there’s a lot of work still to be done.

  3. Nesprin says:

    Happy to see I-Spy highlighted in this article – the design of this trial has been groundbreaking and makes my (frequentist) head spin. They deserve a lot of credit for their work.

    1. Djokodal says:

      Great point. Now if only someone would tell Novartis and Microsoft.

  4. Anon20 says:

    The author really needs to start asking the experts, in this case those familiar with clinical trials. Just ask – it’s OK to say you don’t know – but we are not convinced by the Science Translational Medicine header alone.

    1. anonymouse says:

      Derek has worked in pharma for gazillions of years. It’s unclear what you hope to achieve by your comment.

        1. sgcox says:

          Could be more...

          “Giving Bush his daily war briefing, Donald Rumsfeld ended by saying: ‘Yesterday, three Brazilian soldiers were killed.’ ‘Oh no!’, exclaimed Bush. ‘That’s terrible.’ His staff were stunned by this display of emotion. Finally Bush raised his head from his hands and asked: ‘OK, so how many is a Brazillion?'” (Anon)

    2. 20Anons says:

      We also refer to ourselves in the first-person plural.

  5. loupgarous says:

    There was some work toward Bayesian analysis of clinical data at Lilly when I worked there. I’m just sorry I was dismissive of it when it was described to me. Your article and the paper on which it was based made a considerable difference to my attitude toward it, and my previous attitude toward Bayesian analysis was purely a product of my ignorance.

  6. MattF says:

    Maybe I’m just a hopeless frequentist — but why is this not just a fancy way of introducing bias to get the result you want?

    1. Emjeff says:

      Not at all; in fact, you and every other person use Bayesian thought processes every day. Example: your stomach hurts. It could be many things: pancreatic cancer, appendicitis. But you also know you had Mexican food for lunch, and Mexican food always gives you indigestion. So, based on your prior knowledge, you take an antacid instead of heading to the ER. Note that you could be wrong; you could in fact have some other serious illness. However, you pick the most likely cause, and it will probably turn out that you are correct.
      Drug trials are more complex, of course, but I find the idea of building on past knowledge very appealing indeed. Note that there are ways to check whether your prior knowledge is overwhelming your “new” knowledge (usually referred to as the likelihood), and one can also look at results from a “skeptic’s” or “believer’s” point of view.
      Building a repository of knowledge instead of treating every study as a new entity can also result in fewer patients being treated. Why is this desirable? It may speed up the approval process, and may also allow fewer patients on placebo or standard therapy, particularly late in development. For people with serious illnesses (cancer, HIV), this is very desirable.

      As Derek states, it’s not all roses and unicorns – you must pay attention to when you perform interim looks, and you must carefully specify your priors. But, it is important to note that the “gold standard” frequentist approach is far from ideal, and has some very undesirable properties, one of the most serious of which is that you can make almost any trial have a significant p-value simply by increasing the sample size. The other important issue around this type of analysis is that most people, even physicians and scientists, use p-values every day, and yet cannot define what they are. We need to be open to newer approaches, because what we are doing now is not, in fact, the “gold” standard.
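
      Here’s a toy version of that skeptic-vs-believer check (all numbers invented) – if both posteriors land in the same place, the data, not the prior, are driving the conclusion:

      # Compare a skeptical and an enthusiastic prior on the same (invented) data.
      from scipy import stats

      priors = {"skeptic": (2, 8),    # expects a low response rate
                "believer": (8, 2)}   # expects a high response rate
      responders, n = 30, 60

      for name, (a, b) in priors.items():
          post = stats.beta(a + responders, b + n - responders)
          print(f"{name}: P(rate > 0.40 | data) = {post.sf(0.40):.2f}")
      # Similar numbers from both priors mean the likelihood is dominating.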

      1. Earl Boebert says:

        Similar adaptive/Bayesian reasoning occurs in the systems-safety domain with the concept of leading and lagging indicators. Definitions vary, but in general terms leading indicators are analytic and speculative (an accident hasn’t happened yet) and provide the initial, informal estimate of risk through historical data and professional assessment. Lagging indicators are evidential (e.g., near misses) and are used to adjust the initial risk estimate up or down. The assumption is that when the informal assessment of risk gets high enough, action will be taken (e.g., stopping the operation in question).

        As with the Mexican food example, you can get it wrong. An initial assessment based on leading indicators can point to an operation at risk and the operators can get lucky for a while, thereby depriving observers of lagging indicators. Then their luck runs out and the first lagging indicator you see is a catastrophe.

  7. Barry says:

    My faith in a Phase III clinical result is highest when it meets the clinical endpoints established for the Phase II trial, and lowest when the only signal is for an endpoint cobbled together after the data are in.
    This smells like something in between, moving the goalposts during play, but then potentially adding overtime to show that the score is real.

    1. Emjeff says:

      Not really, Barry. No goalposts are being moved. As with all good trials, you must state your endpoints up front. This is simply a way to incorporate your prior knowledge in a formal way when running your next trial.
