by Martin Enserink and Jon Cohen
PARIS—The fog around the largest AIDS vaccine study ever conducted began to lift today, as Thai and U.S. researchers for the first time publicly presented a detailed analysis of their data to over 1000 scientists gathered here at an annual meeting.
The study results, also published online by The New England Journal of Medicine (NEJM) today, received widespread attention 3 weeks ago, when researchers touted them during press conferences in the United States and Thailand as the first success in a real-world test of an AIDS vaccine. But that pronouncement came under intense scrutiny because of concerns that it omitted negative analyses that challenged the upbeat conclusions.
After seeing a more thorough presentation of the data, even some scientists who were initially skeptical about the trial believe the vaccine does offer modest protection from infection with HIV.
"I think there’s something there," says Eric Hunter of the Emory Vaccine Center in Atlanta, one of 22 scientists who argued in a 2004 editorial in Science that the trial lacked a proper rationale and was a waste of money. Colonel Nelson Michael of the Walter Reed Army Institute of Research, who helped lead the $105 million trial that involved more than 16,000 participants, said at the meeting that their findings are a "yes-we-can-moment."
But the two vaccines used in the study and delivered as a one-two punch protected fewer than one-third of the people who received them, the effects appeared short-lived, and it's unclear exactly how the shots work, or even how much each of the vaccines contributes. "I think for the first time we have a vaccine that has an effect," says Mitchell Warren, who, as the executive director of the AIDS Vaccine Advocacy Coalition, monitors the field closely. "Now we need to find out why, and where we go from here."
The “prime-boost” strategy relied on ALVAC-HIV, a crippled canarypox virus engineered to contain three HIV genes, and AIDSVAX, a genetically engineered version of the HIV surface protein gp120. AIDSVAX alone had failed in two previous efficacy trials, and ALVAC-HIV performed poorly in smaller studies. So when researchers from the U.S. Army and the Thai Ministry of Health first reported that the combination offered 31.2% protection, many AIDS vaccine researchers were stunned.
But as ScienceInsider reported on 5 October, some researchers who were shown more of the data in confidential briefings complained that the study team had painted a rosy picture in its press announcement by ignoring two analyses that undermined the positive findings.
| | ITT | mITT | PP |
|---|---|---|---|
| Participants | 16,402 | 16,395 | 12,452 |
| Infections (placebo group) | 76 | 74 | 50 |
| Infections (vaccine group) | 56 | 51 | 36 |
| Vaccine efficacy | 26.4% | 31.2% | 26.2% |
| P-value | 0.08 | 0.04 | 0.16 |
| Significant? | NO | YES | NO |
Analyze this. Of three different analyses, only the one called modified Intention-to-Treat (mITT) produced statistically significant results.
In today's presentations and in the NEJM paper, the team shows the results of the three analyses. One, called Intention-to-Treat (ITT), contains all 16,402 people originally enrolled in the study. In this analysis, the vaccine had a 26.4% efficacy, but the so-called p-value was 0.08, meaning there's an 8% chance that the results were due to chance, well above the generally accepted 5% cutoff. Another analysis, called Per Protocol (PP), excluded almost 4000 study participants who didn't follow the study design to the letter. (The vast majority missed one or more of the six shots or did not receive them at the appropriate time.) This analysis showed a similar protective effect, but a much higher p-value of 0.16.
The most important analysis, the team argues, is the modified Intention-to-Treat (mITT) analysis, which is identical to ITT, except it excludes seven participants who were discovered during the study to have been infected with HIV before receiving the first shot. This analysis, the only one presented on 24 September, showed the highest efficacy—31.2%—and it was also the only one in which the protective effect was statistically significant, although only marginally so, with a p-value of 0.04.
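As a rough check on the efficacy figures, the infection counts in the table can be turned into a crude efficacy estimate. The sketch below is not the trial's own method: it assumes both arms were the same size and were followed for the same time, so the ratio of infection counts stands in for the ratio of infection rates. The function name `crude_efficacy` is hypothetical. Under that assumption the crude values land near the reported ones but need not match them exactly, most visibly in the Per Protocol analysis.

```python
# Infection counts from the table in the article: (placebo, vaccine).
INFECTIONS = {
    "ITT": (76, 56),
    "mITT": (74, 51),
    "PP": (50, 36),
}


def crude_efficacy(placebo_infections, vaccine_infections):
    """Crude efficacy = 1 - (vaccine infections / placebo infections).

    Assumption (not from the article): both arms have the same size and
    follow-up time, so the ratio of counts approximates the ratio of rates.
    """
    return 1.0 - vaccine_infections / placebo_infections


for name, (placebo, vaccine) in INFECTIONS.items():
    print(f"{name}: crude efficacy {crude_efficacy(placebo, vaccine):.1%}")
```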
Focusing on the mITT may look like statistical cherry-picking, but Harvard University biostatistician Victor de Gruttola says this misses the point. “All of the analyses should have been presented initially—they give you a more complete picture of the study—but it’s a mistake to say when you look at three analyses you get a very different picture,” says de Gruttola. "The basic message is it’s a weak signal. It’s not nothing. It’s not compelling.”
Some critics say if the researchers had presented all three analyses on 24 September, they would have avoided much criticism. "If they’d come out with this right from the start, people would have said there’s something interesting here to look over," says John Moore of Weill Cornell Medical College in New York City, another co-signatory of the 2004 Science editorial attacking the study. But Supachai Rerks-Ngarm, a scientist with the Thai Ministry of Health, says the team had promised to present the data to the Thai people first, and it wanted to deliver a clear-cut message. The full analysis "would have been difficult for them to understand," Supachai says.
Researchers hope that careful studies of participants' blood samples may help tease out exactly which immune responses protected people in the vaccine group, information that could be used to design better vaccines. One of the field's major frustrations has been its inability to find so-called correlates of protection, or biological markers for immunity. "This study is now really the only hope we have of finding them," says HIV researcher Joep Lange of the Academic Medical Center in Amsterdam.
The researchers have assembled four teams of experts to advise them on how to proceed with such analyses. And they're going to extraordinary lengths to solicit ideas from colleagues. “We’re in the process of setting up an online submission site so that researchers in the field can submit ideas about how best to use the limited samples available from this trial,” says Mark de Souza of the U.S. Military HIV Research Program based in Bangkok.
A breakdown of the data revealed several trends that, although not statistically significant, deserve further exploration, the study researchers say. The vaccine appeared most protective during the first year, for instance; after that, new infections in both groups occurred at roughly the same rate. The vaccine also seemed to offer more protection to people who put themselves at low or medium risk of contracting HIV.
Ultimately, Emory’s Hunter says the field may have to stage smaller clinical trials with the same vaccine combination in populations that have higher rates of new infections than these communities in Thailand. “Trying to tease out the correlates of protection is going to be a difficult challenge,” says Hunter.
Given the string of disappointments in their field, AIDS vaccine researchers say they're counting their blessings today. "If this were any other vaccine you’d say these are incredibly disappointing results," says Bruce Walker of the Massachusetts General Hospital in Boston. "Here you see a signal that looks to me like it’s marginal—and that’s exciting."

To Chris Kaegi,
These tests are absolutely not independent. To get independent tests, you would have to go out and conduct the entire vaccination trial again. Combining p-values from multiple tests concerning the same set of data in the manner you describe does not produce anything resembling a valid p-value.
I have a different question beyond the implications of which tests were included or not.
The real question trying to be ascertained (I think) is if the vaccine appears to improve health better than the placebo.
Then we run 3 tests, and have 3 p-values: 0.08, 0.04, and 0.16. True, 2 of the tests fail the 5% significance test. However, aren't these 3 sets of events a) testing more or less the same hypothesis, and b) independent of each other?
If so, then isn't the overall p-value (call it Po-value) from the 3 events something more like this:
Po-value = 0.04 * 0.08 * 0.16 = 0.000512.
I'm sure this is a gross oversimplification - with means, I believe this might have a Poisson distribution, or an F-distribution.
My question is this: does anyone know how to study the combined effect of multiple experiments on the hypothesis?
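One standard way to combine p-values from genuinely independent tests is Fisher's method, sketched below for illustration only. As the reply above points out, the three analyses here reuse the same trial data, so the independence assumption does not hold and the combined value is not valid for this trial. The sketch only shows that, even if the tests were independent, multiplying the p-values together overstates the evidence: the raw product is not itself a p-value. The function name `fisher_combined_p` is hypothetical.

```python
from math import exp, factorial, log


def fisher_combined_p(p_values):
    """Combine p-values with Fisher's method (valid only if independent).

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null when the k tests are
    independent. For even degrees of freedom the survival function has the
    closed form exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)
    return exp(-half_x) * sum(half_x ** j / factorial(j) for j in range(k))


p_values = [0.08, 0.04, 0.16]
naive_product = 0.08 * 0.04 * 0.16

print(f"naive product of p-values: {naive_product:.6f}")
print(f"Fisher combined p-value:   {fisher_combined_p(p_values):.4f}")
```

With these inputs the Fisher value comes out roughly 0.02, far larger than the naive product of 0.000512, and it would still only apply to three separate trials rather than three slices of one data set.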
oops...forgot to include a second question in the previous post. What's the skinny on PG9 and PG16...has there been much discussion in Paris re the importance of these two new antibodies?
I'm curious to know what Dennis Burton's assessment of RV144 is, now that he has heard the full results of the trial, as presented in Paris, and had time to formulate a public opinion.
To write "but the so-called p-value was 0.08, meaning there's an 8% chance that the results were due to chance," is incorrect. The p-value means that were the true effect zero, there would be an 8% chance of seeing outcomes as extreme as those observed. The p-value indicates how surprising the results are were there no real effect. The p-value does not tell us how likely it is that there is no effect.
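The definition above can be made concrete with a small calculation. The sketch assumes equal-sized arms, so that if the vaccine had no effect each infection would be equally likely to fall in either arm. It then asks how often a split at least as lopsided as the observed one would occur. This is an exact binomial test, not necessarily the method used in the published analysis, so the values it prints need not match the reported p-values. The function name `two_sided_binomial_p` is hypothetical.

```python
from math import comb


def two_sided_binomial_p(vaccine_infections, placebo_infections):
    """Two-sided exact binomial p-value under a no-effect null.

    Assumption (not from the article): equal-sized arms, so under the null
    each infection lands in either arm with probability 0.5.
    """
    n = vaccine_infections + placebo_infections
    k = min(vaccine_infections, placebo_infections)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double it to count equally lopsided splits in the other direction.
    return min(1.0, 2 * tail)


# (vaccine, placebo) infection counts from the table in the article.
for name, counts in {"ITT": (56, 76), "mITT": (51, 74), "PP": (36, 50)}.items():
    print(f"{name}: p = {two_sided_binomial_p(*counts):.3f}")
```

Each printed value is the probability of data this extreme if the true effect were zero. None of them is the probability that the vaccine has no effect.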
It is inaccurate to write that the vaccine "protected fewer than one-third of the people who received [it]." This is not what a 31.2% efficacy means.
In these days of shaky reporting it is crucial to be accurate and state results properly and fully.
In this regard it would be interesting to know how Supachai is going to explain the new results to the Thai people now that the different analyses are public. I find it rather demeaning and patronising to say that they would not have understood the first time.