After that news of the Stanford professor who underwent just about every “omics” test known, I wrote that I didn’t expect this sort of full-body monitoring to become routine in my own lifetime:
It’s a safe bet, though, that as this sort of thing is repeated, we’ll find all sorts of unsuspected connections. Some of these connections, I should add, will turn out to be spurious nonsense, noise and artifacts, but we won’t know which are which until a lot of people have been studied for a long time. By “lot” I really mean “many, many thousands” – think of how many people we need to establish significance in a clinical trial for something subtle. Now, what if you’re looking at a thousand subtle things all at once? The statistics on this stuff will eat you (and your budget) alive.
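To put rough numbers on the "thousand subtle things at once" problem, here is a minimal Python sketch of the multiple-comparisons arithmetic. It rests on assumptions that are mine, not the IOM report's: 1,000 independent markers, none of which has any real effect, each tested at the conventional p < 0.05 cutoff. The function names are made up for illustration, and the Bonferroni correction is just one simple (and conservative) way to compensate.

```python
import random


def expected_false_positives(n_tests, alpha=0.05):
    """Average number of null markers that pass p < alpha by chance alone."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent null tests looks significant."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test cutoff that holds the family-wise error rate at alpha."""
    return alpha / n_tests


def simulate_null_hits(n_tests, alpha=0.05, seed=0):
    """Count chance "hits" among n_tests markers with no real effect.

    Under the null hypothesis a p-value is uniform on [0, 1), so each
    marker is modeled as a single uniform random draw.
    """
    rng = random.Random(seed)
    return sum(1 for _ in range(n_tests) if rng.random() < alpha)


if __name__ == "__main__":
    n = 1000
    print("expected chance hits:", expected_false_positives(n))
    print("P(at least one chance hit):", prob_at_least_one_false_positive(n))
    print("Bonferroni per-test cutoff:", bonferroni_threshold(n))
    print("chance hits in one simulated study:", simulate_null_hits(n))
```

Under these assumptions you'd expect about fifty "significant" markers that mean nothing at all, and the stricter per-test cutoff needed to suppress them is exactly why the required number of subjects climbs so quickly. Real omics measurements are correlated rather than independent, so treat this as a back-of-the-envelope illustration, not a model of any actual study.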
I can now adduce some evidence for that point of view. The Institute of Medicine has warned that a lot of biomarker work is spurious. The recent Duke University scandal has brought these problems into higher relief, but there are plenty of less egregious (and not even deliberate) examples that are still a problem:
The request for the IOM report stemmed in part from a series of events at Duke University in which researchers claimed that their genomics-based tests were reliable predictors of which chemotherapy would be most effective for specific cancer patients. Failure by many parties to detect or act on problems with key data and computational methods underlying the tests led to the inappropriate enrollment of patients in clinical trials, premature launch of companies, and retraction of dozens of research papers. Five years after they were first made public, the tests were acknowledged to be invalid.
Lack of clearly defined development and evaluation processes has caused several problems, noted the committee that wrote the report. Omics-based tests involve large data sets and complex algorithms, and investigators do not routinely make their data and computational procedures accessible to others who could independently verify them. The regulatory steps that investigators and research institutions should follow may be ignored or misunderstood. As a result, flaws and missteps can go unchecked.
So (Duke aside) the problem isn’t fraud so much as it is wishful thinking. That’s what statistical analysis is supposed to keep in check, but we’ve got to make sure that it’s really happening. And to keep everyone honest, we also have to keep everything out there where multiple sets of eyes can check things over, which isn’t always the case:
Investigators should be required to make the data, computer codes, and computational procedures used to develop their tests publicly accessible for independent review and ensure that their data and steps are presented comprehensibly, the report says. Agencies and companies that fund omics research should require this disclosure and support the cost of independently managed databases to hold the information. Journals also should require researchers to disclose their data and codes at the time of a paper’s submission. The computational procedures of candidate tests should be recorded and “locked down” before the start of analytical validation studies designed to assess their accuracy, the report adds.
This is (and has been for some years) a potentially huge field of medical research, with huge implications. But it hasn’t been moving forward as quickly as everyone thought it would. We have to resist the temptation to speed things up by cutting corners, consciously or unconsciously.