
Biological News

“It Is Not Hard to Peddle Incoherent Math to Biologists”

Here’s a nasty fight going on in molecular biology/bioinformatics. Lior Pachter of Berkeley describes some severe objections he has to published work from the lab of Manolis Kellis at MIT. (His two previous posts on these issues are here and here). I’m going to use a phrase that Pachter hears too often and say that I don’t have the math to address those two earlier posts. But the latest one wraps things up in a form that everyone can understand. After describing what does look like a severe error in one of the Kellis group’s conference presentations, an error Pachter had flagged when he reviewed the work, he says that:

. . .(they) spun the bad news they had received as “resulting from combinatorial connectivity patterns prevalent in larger network structures.” They then added that “…this combinatorial clustering effect brings into question the current definition of network motif” and proposed that “additional statistics…might well be suited to identify larger meaningful networks.” This is a lot like someone claiming to discover a bacteria whose DNA is arsenic-based and upon being told by others that the “discovery” is incorrect – in fact, that very bacteria seeks out phosphorous – responding that this is “really helpful” and that it “raises lots of new interesting open questions” about how arsenate gets into cells. Chutzpah. When you discover your work is flawed, the correct response is to retract it.
I don’t think people read papers very carefully. . .

He goes on to say:

I have to admit that after the Grochow-Kellis paper I was a bit skeptical of Kellis’ work. Not because of the paper itself (everyone makes mistakes), but because of the way he responded to my review. So a year and a half ago, when Manolis Kellis published a paper in an area I care about and am involved in, I may have had a negative prior. The paper was Luke Ward and Manolis Kellis “Evidence for Abundant and Purifying Selection in Humans for Recently Acquired Regulatory Functions”, Science 337 (2012) . Having been involved with the ENCODE pilot, where I contributed to the multiple alignment sub-project, I was curious what comparative genomics insights the full-scale $130 million dollar project revealed. The press releases accompanying the Ward-Kellis paper (e.g. The Nature of Man, The Economist) were suggesting that Ward and Kellis had figured out what makes a human a human; my curiosity was understandably piqued.

But a closer look at the paper, Pachter says, especially a dig into the supplementary material (always a recommended move) shows that the conclusions of the paper were based on what he terms “blatant statistically invalid cherry picking”. See, I told you this was a fight. He also accuses Kellis of several other totally unacceptable actions in his published work, the sorts of things that cannot be brushed off as differences in interpretations or methods. He’s talking fraud. And he has a larger point about how something like this might persist in the computational biology field (emphasis added):

Manolis Kellis’ behavior is part of a systemic problem in computational biology. The cross-fertilization of ideas between mathematics, statistics, computer science and biology is both an opportunity and a danger. It is not hard to peddle incoherent math to biologists, many of whom are literally math phobic. For example, a number of responses I’ve received to the Feizi et al. blog post have started with comments such as
“I don’t have the expertise to judge the math, …”
Similarly, it isn’t hard to fool mathematicians into believing biological fables. Many mathematicians throughout the country were recently convinced by Jonathan Rothberg to donate samples of their DNA so that they might find out “what makes them a genius”. Such mathematicians, and their colleagues in computer science and statistics, take at face value statements such as “we have figured out what makes a human human”. In the midst of such confusion, it is easy for an enterprising “computational person” to take advantage of the situation, and Kellis has.

You can peddle incoherent math to medicinal chemists, too, if you feel the urge. We don’t use much of it day-to-day, although we’ve internalized more than we tend to realize. But if someone really wants to sell me on some bogus graph theory or topology, they’ll almost certainly be able to manage it. I’d at least give them the benefit of the doubt, because I don’t have the expertise to call them on it. Were I so minded, I could probably sell them some pretty shaky organic chemistry and pharmacokinetics.
But I am not so minded. Science is large, and we have to be able to trust each other. I could sit down and get myself up to speed on topology (say) if I had to, but the effort required would probably be better spent doing something else. (I’m not ruling out doing math recreationally; I just can’t justify it as work.) None of us can simultaneously be experts across all our specialities. So if this really is a case of publishing junk because, hey, who’ll catch on, right, then it really needs to be dealt with.
If Pachter is off base, though, then he’s in for a rough ride of his own. Looking over his posts, my money’s on him and not Kellis, but we’ll all have a chance to find out. After this very public calling out, there’s no other outcome.

32 comments on ““It Is Not Hard to Peddle Incoherent Math to Biologists””

  1. Anonymous says:

    You can’t snow the snowman!

  2. “The first principle is that you must not fool yourself, and you are the easiest person to fool” – Richard Feynman.
    I don’t know enough about the original study to comment on it, but the kind of brouhaha described here seems to me to be typical of the birthing pains that interdisciplinary research often suffers from. It’s especially the case in biology, which has recently been invaded by computer scientists, mathematicians, statisticians and other non-biologists. But this is hardly a new trend; after all, there has been a highly successful tradition of physicists in biology, starting with Francis Crick and continuing on through Max Delbruck, Walter Gilbert and Venki Ramakrishnan.
    The fact is that biology is too complex and fascinating to be left to the biologists and in many cases it’s only outsiders who can bring a fresh perspective to a discipline. So this is all good as long as, as you indicated, the concerned parties don’t peddle incoherent principles from their own disciplines to each other.

  3. MB says:

    Ah graph theory and optimization, the one class in my whole life that was able to give me nightmares.

  4. Anonymous says:

    Pachter might be right (or not), and he may be raising important topics relevant for all of systems biology, but he sounds like a very angry, nasty, vindictive individual … and generally such people get ignored as “kooks” … so it will be interesting to see if anyone else picks up on this and seriously examines the Kellis paper. Kellis BTW is generally held in high regard both as a scientist and as a person, so he will get the benefit of the doubt from most people.
    Note that Pachter also attacks another well-known compbio fellow, Barabasi, pretty hard, and again in a nasty way.

  5. luysii says:

    Well, I love both math and organic chemistry, and am far better at the latter. They require very different styles of thought. Basically, organic chemistry works by analogy with fairly ill-defined concepts (SN1, SN2), with most real examples falling somewhere in between. Organic also just doesn’t have that many concepts to master. In math the concepts are extremely sharp, must be remembered exactly to be applied, and there are hordes of them.
    One book said that it took mathematicians 200 years to come up with the correct definition of continuity. This didn’t stop them from using it, but it did lead to several glaring errors even by the greatest of mathematicians — Cauchy got uniform convergence of a sequence of functions wrong.
    For more on this point — see
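    The textbook counterexample behind this point (my own illustration, not from the comment) is the sequence f_n(x) = x^n on [0,1]: every term is continuous, but the pointwise limit is not, which is exactly the kind of subtlety that pre-uniform-convergence analysis kept tripping over.

```latex
% Each f_n(x) = x^n is continuous on [0,1], yet the pointwise limit is not:
f_n(x) = x^n, \qquad
\lim_{n\to\infty} f_n(x) =
\begin{cases}
0, & 0 \le x < 1,\\[2pt]
1, & x = 1.
\end{cases}
% The convergence is not uniform: \sup_{x \in [0,1)} |f_n(x) - 0| = 1 for every n.
```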

  6. Anonymous says:

    Got a bad italics closing bracket-
    I don’t think people read papers very carefully. . .>/i>
    And I think that math, and statistics in particular, really needs to be better understood by the people who rely on it - or even handed off to a third party (possibly as “blind” data). We might also want a better “standard language” for interpreting conclusions, so that less math-savvy readers can more easily call out people who try to twist statistical results.

  7. Anonymous says:

    Kind of funny where that bad bracket was actually 😛

  8. Nanonymous says:

    Chemists seem to be pretty gullible about peddled math as well. A very odd statement of the hairy ball theorem from topology seems to form the basis of this paper:
    This has been pointed out to Science and no correction or retraction issued. Some discussion in the related paper:
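    For reference, the standard statement of the theorem being alluded to (the textbook version, not whatever appeared in the paper) is short:

```latex
% Hairy ball theorem: every continuous tangent vector field on an
% even-dimensional sphere vanishes somewhere. For S^2 in particular:
\text{If } v : S^2 \to \mathbb{R}^3 \text{ is continuous and } v(p) \perp p
\text{ for all } p \in S^2, \text{ then } v(p_0) = 0 \text{ for some } p_0 \in S^2.
```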

  9. anon the II says:

    In my experience about 90% of the people doing computational chemistry are charlatans. The biggest problem is that familiarity with the Unix command line is much more important for admission to the OCC (Organization of Computational Chemists) than fundamental knowledge of structural organic chemistry. Most don’t know the energy difference between gauche and anti butane and don’t even seem to know why they should.
    I gotta believe it’s even worse for computational biologists. It’s so disappointing when useful technology is hijacked by silver tongued idiots.

  10. cirby says:

    When you get right down to it, pretty much every field of science is vulnerable to bad math – sometimes even mathematicians, if they aren’t paying enough attention.

  11. Neo says:

    I agree with #2, but this is not only applicable to biology; it is also the case in chemistry.
    #9, your problem has an easy fix: work with the remaining 10%. I suspect, however, that you cannot tell the difference between the two groups, hence your frustration.

  12. Dr. Manhattan says:

    On a more mundane but just as important math matter in biological/pharmaceutical science, see the Feb. 13 issue of Nature (506, pages 150-152). There is an article entitled “Scientific method: Statistical errors” on how P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume.
    Worthwhile reading for all in that it places P values in their proper perspective!
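    The article’s point, that a single p-value is a much noisier summary than most scientists assume, is easy to see in a quick simulation (a sketch of my own, not from the Nature piece): rerun the same modestly powered two-group experiment many times and look at how widely the p-values scatter.

```python
import math
import random

def two_sample_p(a, b):
    """Approximate two-sided p-value for a difference in means,
    using a normal approximation to keep this dependency-free."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    se = math.sqrt(va / na + vb / nb)
    z = abs(ma - mb) / se
    # Two-sided tail probability of the standard normal
    return math.erfc(z / math.sqrt(2))

random.seed(1)
true_effect = 0.5   # a real but modest effect: half a standard deviation
n = 20              # per-group sample size (typical of many experiments)

pvals = []
for _ in range(1000):
    control = [random.gauss(0.0, 1.0) for _ in range(n)]
    treated = [random.gauss(true_effect, 1.0) for _ in range(n)]
    pvals.append(two_sample_p(control, treated))

significant = sum(p < 0.05 for p in pvals) / len(pvals)
print(f"fraction of runs with p < 0.05: {significant:.2f}")
print(f"p-value range across runs: {min(pvals):.4g} to {max(pvals):.2f}")
```

    Even though the effect is genuinely there in every run, only a minority of replications clear p < 0.05, and the p-values span nearly the whole unit interval, which is exactly the unreliability the article describes.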

  13. Curious Wavefunction says:

    #9: There are two kinds of computational chemists: the ones who are computational chemists and the ones who are also organic or medicinal chemists. Work with the latter.

  14. luysii says:

    #4 Anonymous: I’m not sure who the actual authors are, but if you can get your hands on it (I can’t, but I read it in the past and threw out the journal) look at the American Mathematical Monthly vol. 59 pp. 586 – 599 ’09. I’ve been reading the journal for years (for the most part uncomprehendingly), and this paper was unusual for its hostile tone. I did note that the article was by some heavies at CalTech and Bell Labs — it was a brutal attack on the notion that the world wide web and the internet are scale free networks (have a scale free topology). They reference the work of Grzybowski, Barabasi and Albert, which claimed this.

  15. jrftzgb says:

    As a researcher, if someone approached me with “give me your DNA so I can see what makes you a genius,” I would be more than a little skeptical of their experimental design.
    I’m also not certain that pitch would necessarily get past an ethics board.

  16. Sweden Calling says:

    #9. If chemists knew more math, the rule-of-five, ligand efficiency and the other “quality” metrics wouldn’t have taken off…better to work together than in silos

  17. Anon says:

    @2 (Curious Wavefunction): Goes back farther than Francis Crick. cf. Leo Szilard (at least!)

  18. Helical Investor says:

    Teach statistics before calculus (primary and early secondary education).
    I think it is a fair point that more people would benefit from learning statistics, and from learning it earlier.

  19. dearieme says:

    ” Science is large, and we have to be able to trust each other.” That’s why Climate Scientists get off with it.

  20. Bioorganic Chemist says:

    I am reminded of a discussion I had with someone in the Bioinformatics program here, who said that they actively discourage their students from taking organic chemistry (which, by extension, prevented them from taking introductory biochemistry): “Students don’t need to know the chemical details to interpret the bioinformatics results.”

  21. Anon says:

    “But this is hardly a new trend; after all, there has been a highly successful tradition of physicists in biology, starting with Francis Crick and continuing on through Max Delbruck, Walter Gilbert and Venki Ramakrishnan.”
    Yes, but all of those guys except for Francis Crick did experiments. (It is pretty impressive that Walter Gilbert, who had been a theoretical particle physicist, isolated the lac repressor.) Also, many biologists may not understand how X-ray crystallography works, but I’m sure most of them can grasp the connection between structure and function.
    I came to work (somewhat by accident) in the genomics field. I think it is overall very positive that people with different backgrounds (typically experimental biologists and bioinformaticians with backgrounds in computer science, statistics, etc.) work together as a team. But the gap between the two cultures is very big. As an experimentalist, I am often appalled by how little my bioinformatics colleagues know about biology and experimental methods. And most biologists are really clueless about math. As someone who knows a little more math than the average biologist, I have observed how computational models and a little math that are unimpressive to me are sometimes used to impress math-phobic biologists.
    I might add that even the bioinformaticians, for the most part, are not that knowledgeable about math. Mostly, they just write code and apply pre-existing methods.
    I do think Kellis is smart and one encounter that I had with him was a friendly one. But I’ve also heard complaints from people who actually had to deal with him on a big collaborative project.
    As for Barabasi, I wouldn’t classify him as a compbio guy. It seems to me he is a guy who tries to apply the same idea to as many problems as he can find, and he has gotten incredible mileage out of that. At least Kellis has done a lot of useful work in biology; I don’t think Barabasi’s work has had much practical use. Barabasi is so repetitive it’s as if he’s making a point about being self-similar, like a fractal. (OK, I stole that last one from John Hodgman.)

  22. kemist says:

    How about peddling garbage chemistry to biologists?
    Hydrazones as a stable conjugation linkage (SoluLink)!!??

  23. Dave says:

    “useful technology is hijacked by silver tongued idiots.”
    Ummm doesn’t that kind of describe what’s going on in the entire pharmaceutical industry these days?
    I’m sorry….

  24. RKN says:

    1. It would be a mistake to think a given math is incoherent merely because it can’t be understood by everyone who reads the paper. Indeed, understanding the math is prerequisite for knowing whether or not it’s incoherent.
    2. I work in the area of network biology, using protein interaction networks to identify candidate disease markers. In this application it is essential that any significant result be verified in an appropriate model (cell culture, animal, etc.) and/or cross-validated on a relevant independent data set. Biological networks are noisy, it’s true, but so are the results from many other analytical methods in biology. The goal is to find the signal, use control experiments properly, and hold results to rigorous statistical tests.

  25. SteveM says:

    I actually transitioned from organic chemistry to applied mathematics (operations research). The problem with network models or system dynamics models in general is that the causal effects between nodes are driven by differential equations. And except for relatively simple physical systems, those functional relationships are usually unknown.
    So very elaborate models can be constructed with GIGO (Garbage In – Garbage Out) data sets and assumptions. There is some really nice modeling software out there, but the real application base is very sparse. With good reason.
    To paraphrase, academic food fights are so vicious because the stakes are so low.

  26. a says:

    Flag 26. Spammer. Kill it and its IP address with FIRE.

  27. PPedroso says:

    Manolis has answered back on Pachter’s blog, and by the look of it there’s truth and scientific rigor on his side as well. I don’t have the math to analyse it, but it is nice to see this kind of dynamic play out on blogs.

  28. D.J. says:

    @cirby “When you get right down to it, pretty much every field of science is vulnerable to bad math – sometimes even mathematicians, if they aren’t paying enough attention.”
    Well, there are many different flavors of math, just as I am sure there are many different flavors of biology and chemistry. I am a far weaker topologist than I am a combinatorialist, for instance.

  29. a. nonymaus says:

    Re: 14
    You have the journal name wrong. The article is in Notices of the AMS (American Mathematical Society), a copy hosted by one of the authors can be found here:
    I’d also describe it as a little snarky, but the authors aren’t throwing huge bombs like it’s an argument about higher-order cuprates or something.

  30. luysii says:

    #29 a. nonymaus — Thanks for the link. True it isn’t at the Phil Anderson level of invective, but for the AMS the tone was very unusual. Read it for yourself and decide. They basically say that Barabasi and company didn’t know what they were doing. Interestingly Pachter was not one of the authors.

  31. Wavefunction says:

    #30: Phil Anderson has almost always been right.

  32. Spiny Norman says:

    #21: Crick did experiments. Damned important ones, at that.
