In Silico

Wrong, But Still Convincing

SciTheory has a post, complete with links to the relevant articles in Science, etc., on a recent batch of trouble in structural biology. Geoffrey Chang and his group at Scripps have been working on the structures of transporter proteins, which sit in the cell membrane and actively move nonpermeable molecules in and out. There are a heap of these things, since (as any medicinal chemist will tell you) a lot of reasonable-looking molecules just won’t get into cells without help. It’s even tougher at a physiological level, because (from a chemist’s perspective) many of the things that need to be shuttled around aren’t very reasonable-looking at all – they’re too small and polar or too large and greasy.
Many of these transporters, especially in bacteria, fall into a large group known as the ABC transporters, which have an ATP binding site in them for fuel. (For the non-scientists in the audience, ATP is the molecule used for energy storage in everything living on Earth. Thinking of an ATP-binding site as a NiCad battery pack gets you remarkably close to the real situation). Chang solved the structure of one of these, the bacterial protein MsbA, by X-ray crystallography back in 2001, and it was quite an accomplishment. Getting good X-ray diffraction data on proteins which spend their lives stuck in the cell membrane is rather a black art.
How dark an art is now apparent – here’s the original paper’s abstract in PubMed, but if you look just above the abstract, you’ll see a retraction notice, and it’s not alone. Five papers on various structures have been withdrawn. As SciTheory says, anyone who doubted the original MsbA structure had some real food for thought last year when another bacterial transporter was solved at the ETH in Zurich. These two should have looked more similar than they did, to most ways of thinking, but they were quite divergent.
And now we know why. Chang’s group was done in by some homebrew software which swapped two columns of data. In a structure this large and complicated, you can have such disruptive things happen and still be able to settle down on a final protein picture – it’s just that it’ll be completely wrong. And so it was. The same software seems to have undermined the other determinations, too.
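For the non-crystallographers, the connection between a swapped pair of data columns and a completely wrong structure is just linear algebra: exchanging two coordinate columns is a reflection, and a reflection turns a structure into its mirror image, flipping its handedness. A minimal sketch of that geometry (toy coordinates only, not the actual in-house code, which made the swap during data processing rather than on finished coordinates):

```python
# Hypothetical illustration only -- not the actual in-house software.
# Exchanging two coordinate columns is a reflection (determinant -1),
# so it turns a structure into its mirror image and flips handedness.

def triple_product(a, b, c):
    """Scalar triple product a . (b x c); its sign encodes handedness."""
    bxc = (b[1] * c[2] - b[2] * c[1],
           b[2] * c[0] - b[0] * c[2],
           b[0] * c[1] - b[1] * c[0])
    return a[0] * bxc[0] + a[1] * bxc[1] + a[2] * bxc[2]

def swap_xy(p):
    """The column-swap 'bug': exchange x and y for every point."""
    return (p[1], p[0], p[2])

# A right-handed set of basis vectors...
a, b, c = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
print(triple_product(a, b, c))                             # 1.0
# ...becomes left-handed after the swap: the sign flips.
print(triple_product(swap_xy(a), swap_xy(b), swap_xy(c)))  # -1.0
```

A mirror-image model can still refine to something internally plausible-looking, which is why the error survived all the way to publication.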
This is important (as well as sad and painful) on several levels. For one thing, transporters are essential to understanding resistance to antibiotics and cancer therapies, and they’re vital parts of a lot of poorly understood processes in normal cells. We’re not going to be able to get a handle on the often-inscrutable distribution of drug candidates in living systems until we know more about these proteins, but now some of what we thought we knew has evaporated on us.
Another point that people shouldn’t miss is the trouble with relying too much on computational methods. There’s really no alternative to them in protein crystallography, of course, but there always has to be a final “Does that make sense?” test. The difficulty is that many perfectly valid protein structures show up with odd and surprising features. Alternately, it’s unnerving that the data for these things can be so thoroughly hosed and still give you a valid-looking structure, but that just serves to underline how careful you have to be.
And we’re talking about X-ray data, which (done properly) is considered to be pretty solid stuff. So what does this say about basing research programs on the higher levels of abstraction found in molecular modeling and docking programs?

21 comments on “Wrong, But Still Convincing”

  1. eugene says:

    Hmmm… looks like the spam blocker doesn’t like posts with lots of links. Maybe I should have sent that post as an email since it was more of a question. Basically, it was about the recent controversy in Angewandte and JACS about Aspirin’s second crystal form. Can anyone who’s knowledgeable weigh in on that (without me providing the links)?

  2. jim says:

    > So what does this say about basing research
    > programs on the higher levels of abstraction
    > found in molecular modeling and docking programs
    Not too much, I think. Surely one mistake with a high-end model doesn’t invalidate lesser methods if your perspective is right. I dislike the idea of critiquing modelling as a poor representation of reality. The emphasis of modelling in a pharma environment is producing ideas to help progress projects – not really aspiring to reality, as the tools are too crude. The models should not be pursued in themselves unless they progress the compounds. A good model is a way of stimulating ideas alongside the more traditional approaches. It’s a framework for thinking that should form part of the project strategy, but not the whole of the strategy. The compounds should be pursued, not the model. If the model proves to be useful, it should be used until it breaks. The use of a model should be linked to our confidence in it as a useful tool.
    > Another point that people shouldn’t miss is the
    > trouble with relying too much on computational
    > methods
    Exactly so, but this doesn’t take anything away from molecular modeling and docking programs, it just puts them in perspective. Use them as a tool – a model – nothing more or less. *Use* the models, never rely on them. I get the impression you are a traditionalist and a sceptic, Derek 😉

  3. A-non-y-mous says:

    Modeling and computational stuff certainly has a role in chemistry, but like any sub-field, be it medchem, o-chem, p-chem, etc., garbage in = garbage out.
    This example just reiterates the need to check and double check everything when you come across something new and exciting, just like you did, Derek, with “vial 33.”
    Another point, which I don’t think has been mentioned yet, is that it took over 5 years since the structure was published for someone (was it Chang?) to notice something was wrong. In a synthesis paper, it’s relatively easy to see (and test) if something is screwy. LaClair, anyone?
    The more complex the problem, the more complex the solution, and the more difficult it is to verify if it is indeed correct.
    By the way, the best term I’ve run across for molecular modelers is “Cowboys of the keyboard,” no offense intended modelers.

  4. NJBiologist says:

    Whatever happened to the idea of testing structure models with actual biochemistry data – like using membrane-delimited couplers to test transmembrane domains, crosslinkers to identify nearby domains, etc.? I agree with Jim that models are most useful for generating hypotheses, but hypotheses are only useful if you can (and do) test them.

  5. Slanderer says:

    To #1
    In short: Much ado about nothing.
    Authors A discovered a new polymorph of Aspirin and promptly published it in JACS.
    The new polymorph may have long been sought, and proved elusive, for authors B. Based on the usual but natural egotistical hypothesis that ‘if we could not, nobody can’, they searched for loopholes in the JACS paper. As often happens, B found what they sought: the crystal refinement of A was not up to the present standard. (Unfortunately B forgot that: 1. Crystallography is not the only technique useful to chemists. 2. Good old techniques such as melting point are just as good for distinguishing polymorphs. 3. The highest possible refinement of a crystal is decided by the crystal, not by our free will.)
    Apparently, based on the perception generated by these blogs that ‘fraud is rampant’, B was convinced that the paper of A was rubbish. B published an ACIE paper implying so, emphasizing that at A’s level of refinement, even the common polymorph could be assigned to the new crystal structure. Alas, this ‘theory’ paper was accepted, GOK for what reason. In any case, it makes a good JCE kind of article.
    Since ‘truth never hurt the teller’, A might have marveled at the turn of affairs.
    Upon some revelation, B decided to do A’s experiment anyway (why did they not do such a simple experiment before the ACIE submission?). To the surprise of B, they got the same polymorph as A! This confirmation of the JACS result resulted in another ACIE!
    Moral of the story:
    1. Fraud is not as rampant as many would have wished!
    2. Serendipity does happen.

  6. Rich Y says:

    IANA structural biologist, but I’ve heard that there had been some doubt about those structures in structural biology circles for some time before the retraction came out. Activity data and non-crystallographic supporting data just didn’t match up.
    Also, I got the impression (sorry, no references) that they (don’t ask me who “they” is) have published structures for those proteins that do sync with the supporting data. The ground-level X-ray data was just fine; it was the analysis that was messed up. So “some of what we thought we knew has evaporated on us” is not really true – it’s that certain people have unfortunately had their time wasted on the wrong structure.
    In the end, this just underscores the point not to rely too much on one source of data — independent and supporting confirmation is really important.

  7. eugene says:

    Thanks! I thought that the result was slightly different though. The new polymorph could not be produced cleanly as a separate crystal and only as a pretty dirty mixture of crystals. If B really did confirm the findings of A, why would the two ACIE articles (where they say A is wrong and then the second one where they do their own experiment) be released back to back?
    Also, why is it important in the first place? Is it really true that patent law is so flawed as to allow anyone who can make a different crystal of an active pharmaceutical ingredient to claim a new patent based on the fact that it’s a new physical entity? Half the authors from A worked for TransForm Pharmaceuticals, who seem to be saying as much. Either that, or they are trying to extend patent life for existing drugs by letting companies get away with finding a “new crystal polymorph” and filing for a new patent since obviously, a slightly different crystal will make you feel much better when you take your Aspirin (or whatever the case may be).

  8. JSinger says:

    I give Geoffrey Chang credit for handling this in a classy way, at least, instead of brazening it out or pointing fingers at the creator of the software.

  9. roadnottaken says:

    this is not a problem with x-ray crystallography, it’s a problem with over-interpretation. the original structures were at 4.5 angstrom resolution, which is basically just not good enough to give indisputable models. furthermore, solid biochemistry WAS done by other labs that contradicted these structures, but Chang had the hubris to ignore those results. one could forgive him his error, but it is an open question how many other scientists had crystals which diffracted at 4.5 angstrom resolution and refused to interpret them until further refinement could be accomplished. i think the reviewers and Science magazine also deserve some of the blame, as they should’ve had higher editorial standards and been less greedy to publish such a high-profile (but scientifically weak) result.

  10. xray says:

    Geoffrey Chang had no choice but to retract his paper, after another paper in Nature came out to prove he got the wrong hand of the transporter. There is nothing ‘classy’ about that! In fact, he waited quite a long time after that paper came out.
    The fact that he managed to propagate his error through an additional four papers is almost unbelievable. One of the consequences of getting this wrong is that the helices are left-handed. This is an obvious error that should have been picked up by him, even if someone else is responsible for the ‘jiffy’ that caused the problem. Evidently, he built right-handed helices into an electron density map that shows left-handed ones…
    Nice work!

  11. Wavefunction says:

    I think that it’s wrong to specifically criticise computers in this story. The problem is with blindly trusting ANY kind of data; in this particular case, it’s data that comes from computers.
    Derek, most of the best computational chemists in the world realise the limitations of their method. Perhaps you should talk to someone who has successfully used docking in lead development; someone sensible who will tell you about the dose of salt that he took his results with, without discarding them as inconsequential. It is unfair to blame computers instead of blaming the people who overinterpreted or misinterpreted the data or took it at face value. And you certainly cannot extrapolate from this case to docking and other modeling programs.
    In docking, people are trying to precisely reduce the role of the “higher levels of abstraction” that you are talking about, by trying to mimic the EXPERIMENTAL physical chemistry of protein-ligand interaction as closely as possible. Again, the latest work of the Schrodinger group (Friesner et al.) is instructive. Also, as I have asked before, is experimental science with “lower” levels of abstraction really any more predictable?
    As you know, drug discovery is a complex process and everyone has a role to play. It is better to realise the limitations of a method and use its strengths instead of focusing on the limitations all the time. As William Jorgensen says in his Science review, there is no drug that has been discovered by computer modeling (and I don’t think there ever will be), but it is increasingly difficult to find a blockbuster drug whose development was not helped by computational methods.
    I get the feeling that experimental chemists feel threatened by computational chemists. But that is totally unnecessary. Computational chemists can never ever usurp experimental chemists from the drug discovery process. At the same time, experimental chemists are the ones who can lose a lot if they decide to ignore computational results completely in their work. The two groups need to see each other as complementary, not opposite.

  12. Hap says:

    Computational data is a useful tool – but lots of times (some JACS papers, for example, involving compounds or intermediates never seen and unlikely to be seen), it seems to be followed to its limits in the absence of other data. Single-source data is always going to be problematic, however it is obtained.
    Crystal data seems to be in a set of data which are difficult to confirm by other means and which are highly dependent on the outcome of computational models. What other experiments (until someone gets another crystal structure) could indicate that a protein (or small molecule) crystal structure might be wrong? In this case, a competing crystal structure would have, but biochemical experiments probably wouldn’t. The only reason people realized that Diazonamide A had the wrong crystal structure was because Harran made the proposed structure and found it to be too unstable to be isolated. People tend to follow the data they have, even if it might be wrong, because they don’t know what else to do. Computational data is sometimes taken to be holy writ in the absence of other data, and that is probably a mistake.

  13. DLIB says:

    To the computational chemists out there— What single kind of experimental data would go the farthest in improving computational outcomes (as they relate to reality)? Thermodynamics, kinetics…

  14. JSinger says:

    Geoffrey Chang had no choice but to retract his paper, after another paper in Nature came out to prove he got the wrong hand of the transporter. There is nothing ‘classy’ about that!
    Sure, but it’s hardly unheard of for researchers to brazen things out well past the point you’d figure they “had no choice”…

  15. Anonymous says:

    any comments / insider info from your readers about Abbott’s announcement (getting rid of a couple hundred in drug discovery)?

  16. molecularArchitect says:

    Wavefunction’s reference to William Jorgensen’s Science review caused me to chuckle at a memory from graduate school. I have great respect for Bill Jorgensen but as a 2nd year grad student, I took a course he taught on Theoretical Organic Chemistry. You should have heard the groans from the synthesis grad students when Bill stated that “theoretical methods yield more accurate results than experimental methods”.
    This is not a verbatim quote, and it happened more than two decades ago, but I still laugh at the memory. His comment in Science suggests that his arrogance regarding computational methods has mellowed and that he understands the proper, complementary role they should play alongside experimental studies.

  17. Slanderer says:

    I surmise that Chang is a junk scientist. A scientist’s first job is to understand things, not to get things. Apparently, Chang’s sole aim was to get things and publish them as science.
    To #7
    A mixture of crystals and an intergrowth of crystals are different things, IMHO. At bottom, A got a different form of Aspirin. The rest comes out of some people’s belief that ‘definitions are facts’.
    Related to the patent issue etc., I can only say that the world is often governed by the whims, fancies, and egos of some who wield the power.

  18. Deepak says:

    I don’t think this is an issue with computation per se as others have pointed out, but definitely what I would call sloppy application. I would actually say that the modeling techniques used in crystallography are quite simple (simple force fields, etc), but the constraints available, the electron density, are excellent most of the time, and that makes results relatively reliable.
    The problem I have is this. When we make software too easy to use, the chance of user error goes up. The theory underlying a lot of methods is fairly complex. I have seen examples of MD simulations being run with ridiculous choices in cutoff distances, choice of protonation states etc, especially among casual users.

  19. jim says:

    > this is not a problem with x-ray crystallography
    > it’s a problem with over-interpretation.
    There’s a difference in objectives between academic research (William Jorgensen’s work) and drug discovery research (in pharma). Jorgensen’s work perhaps aspires to modelling reality, to push the field forward and improve the techniques. In pharma you apply the techniques, conscious of the limitations, factoring them into the hypotheses developed. However, the two objectives do get mixed up in both environments, not necessarily in a negative way.
    Perhaps there is a lack of education and some overstatement of the purpose of the models.
    > no drug that has been discovered by computer
    > modeling …but it is increasingly difficult
    > to find a blockbuster drug whose development was
    > not helped by computational methods
    Exactly so. “Molecular modelling and docking” does not discover drugs but can have a big role to play in the R&D of new drugs. After technology hype, you get the technology realisation.

  20. acetogen says:

    Chang’s papers were wrong because he never tested the in-house program against diffraction data from a known structure, which would have revealed the incorrect transformation. There was a single line in the phasing program that caused the mistake. Once the correction was applied, the correct electron density was obtained (the one which agrees with the experimental data). The Swiss group was essentially headed by a former postdoc of D. Rees (the same advisor as Chang, BTW), who simply did not use the same program. Chang was fooled by the program and thought his structures were correct because all 3-4 proteins he was working on produced the same electron density. It is true that many experimentalists were working on the assumption that the wet biochemistry had to be revised based on Chang’s structures, and now all that effort is lost. But I bet there have been worse incorrect pursuits than this case.
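    The regression test acetogen describes is simple in principle: run the in-house code on data whose answer is already known, and refuse to trust it on anything new until the output matches the reference. A minimal sketch of the idea (hypothetical function names throughout; the actual phasing software is not public):

```python
# Sketch of the regression test that was never run -- hypothetical
# function names; the actual phasing software is not public.

def process(data):
    """Stand-in for a correct reduction step: columns pass through unchanged."""
    return [tuple(row) for row in data]

def buggy_process(data):
    """The same step with a one-line column swap, as in the retracted work."""
    return [(row[1], row[0], row[2]) for row in data]

# Reduce a data set whose answer is already known, and insist the output
# matches the reference before trusting the pipeline on new data.
reference_input = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
known_answer = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]

assert process(reference_input) == known_answer        # correct code passes
assert buggy_process(reference_input) != known_answer  # the swap is caught
print("regression check would have flagged the column swap")
```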

  21. rosko says:

    It’s good to see there’s at least someone reading my blog.
    In any case, I agree with roadnottaken that the poor resolution had a lot to do with it. At higher resolution the incorrect handedness of the helices would have been obvious and would have revealed the error long before the stage of publication.
    As for modeling never discovering a drug, I guess it depends on what you mean by “drug”. If you mean a compound that actually makes it on the market for clinical use, then I agree it is unlikely that one was ever discovered directly by modeling–there is always some tweaking of the structure necessary to achieve a favorable in vivo profile. This, however, is true of most high throughput screening leads as well (and also many natural products).
    If, on the other hand, you mean high-affinity binders to a given target, some of those definitely HAVE been invented by structure-based design. And by the way, I have seen models (of other proteins) that looked more plausible than that EmrE structure from Chang’s group (though I’ve also seen lots of horrible-looking models).
