In Silico

MedChemica: When One Compound Collection Isn’t Enough

According to SciBX, here’s another crack at computational solutions for drug discovery: MedChemica, a venture started by several ex-AstraZeneca scientists. They’re going to be working with data from both AZ and Roche, using what sounds like a “matched molecular pair” approach:

Although other algorithms try to relate structure to biological function, most of the analyses look at modifications across a wide array of diverse structures. MedChemica’s approach is to look at modifications in a set of similar structures and see how minor differences affect the compounds’ biological activity.
Al Dossetter, managing director of MedChemica, said the advantage of the company’s platform is the WizePairZ algorithm that looks at pairs of fragments that are similar in structure but differ by a chemical group, such as a change from chlorine to fluorine or the addition of a methyl group.
This platform, he told SciBX, captures the chemical environment of the fragment change. For example, it incorporates the fact that the effect of changing chlorine to fluorine on a molecule will depend on the surrounding structure. The result is a rule that is context dependent.
The MedChemica approach applies to small molecules and uses only partial chemical structures, thus keeping compound identities out of the picture.
Because the platform does not reveal compound identities, AstraZeneca and Roche can share knowledge without disclosing proprietary information.
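
To make the idea concrete, here is a minimal sketch of context-dependent matched-pair bookkeeping. It is not the WizePairZ algorithm, whose details the article does not give. The record layout, the anonymized core identifiers, the context labels, and every number below are hypothetical. The sketch pairs fragments that share a core, a context, and an assay but differ in one substituent, then averages the change in measured value for each (transformation, context) combination.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

# Hypothetical records: (core_id, context, substituent, assay, value).
# core_id is an anonymized stand-in for the shared remainder of the molecule,
# context is a label for the local chemical environment of the change,
# and value is a pIC50-like number. None of this is real data.
RECORDS = [
    ("core1", "aryl", "Cl", "assay_A", 6.0),
    ("core1", "aryl", "F", "assay_A", 6.5),
    ("core2", "aryl", "Cl", "assay_A", 5.0),
    ("core2", "aryl", "F", "assay_A", 5.4),
    ("core3", "alkyl", "Cl", "assay_A", 7.0),
    ("core3", "alkyl", "F", "assay_A", 6.2),
    ("core1", "aryl", "F", "assay_B", 9.0),  # no Cl partner in assay_B: unpaired
]


def matched_pair_deltas(records):
    """Collect value differences for each (from, to, context) transformation."""
    groups = defaultdict(list)
    for core, context, substituent, assay, value in records:
        # Only fragments with the same core, context, and assay can be paired.
        groups[(core, context, assay)].append((substituent, value))

    deltas = defaultdict(list)
    for (_core, context, _assay), members in groups.items():
        # Sorting fixes the direction of each transformation (e.g. Cl -> F).
        for (sub_a, val_a), (sub_b, val_b) in combinations(sorted(members), 2):
            if sub_a != sub_b:
                deltas[(sub_a, sub_b, context)].append(val_b - val_a)
    return deltas


def summarize(deltas):
    """Return {(from, to, context): (number of pairs, mean change)}."""
    return {key: (len(vals), round(mean(vals), 2)) for key, vals in deltas.items()}


if __name__ == "__main__":
    for rule, (n_pairs, mean_change) in summarize(matched_pair_deltas(RECORDS)).items():
        print(rule, n_pairs, mean_change)
```

In this toy data the same chlorine-to-fluorine swap raises the value in the "aryl" context and lowers it in the "alkyl" one, which is the sort of context dependence the quote describes.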

The belief is that neither company’s database on its own gives quite enough statistical power for this approach to work, so they’re trying it on the pooled data:

smaller databases only allow researchers to extract one to five matched pairs, which have a low fidelity of prediction. Ten matched pairs are sufficient to draw a prediction, but reliability increases significantly with 20 matched pairs.
The MedChemica database contains 1.2 million datapoints, each of which represents a single molecule fragment in a single assay. It includes 31 different assays, although more are likely to be added in the future, and not all molecules have been tested in all assays.
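
One way to see why the pair count matters is a back-of-the-envelope sketch. It assumes each matched pair gives an independent estimate of the same underlying shift, with a hypothetical per-pair scatter of 0.5 log units, and it uses a normal approximation (a t-distribution would be wider at small n). Under those assumptions the uncertainty in the average shift falls with the square root of the number of pairs. The article does not say how MedChemica actually scores its rules.

```python
import math


def standard_error(per_pair_sd, n_pairs):
    """Standard error of the mean shift across n independent matched pairs."""
    if n_pairs < 1:
        raise ValueError("need at least one matched pair")
    return per_pair_sd / math.sqrt(n_pairs)


def approx_95_half_width(per_pair_sd, n_pairs):
    """Rough 95% interval half-width using the normal approximation."""
    return 1.96 * standard_error(per_pair_sd, n_pairs)


if __name__ == "__main__":
    ASSUMED_SD = 0.5  # hypothetical per-pair scatter, in log units
    for n in (1, 5, 10, 20):
        print(n, round(approx_95_half_width(ASSUMED_SD, n), 3))
```

Going from 5 pairs to 20 halves the interval under these assumptions, which is at least consistent with the quoted claim that reliability improves noticeably at 20 pairs.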

The article says that AZ and Roche are in discussions with other companies about joining the collaboration. Everyone who joins will get a copy of the pooled database, in addition to being able to share in whatever insights MedChemica comes up with. A limitation is mentioned as well: this is all in vitro data, and its translation to animals or to the clinic provides room to argue.
That’s a real concern, I’d say, although I can certainly see why they’re doing things the way that they are. It’s probably hard enough coming up with in vitro assays across the two companies that are run under similar enough conditions to be usefully paired. In vivo protocols are more varied still, and are notoriously tricky to compare across projects even inside the same company. Just off the top of my head, you have the dosing method (i.v., p.o., etc.), the level of compound given, the vehicle and formulation (a vast source of variability all in itself), the species and strain of animal, the presence of any underlying disease model (versus control animals), what time of day they were dosed and whether they were fed or fasted, whether they were male or female, how old the animals were, and so on and so on. And these factors would be needed just to compare things like PK data, blood levels and so on. If you’re talking about toxicology or other effects, there’s yet another list of stuff to consider. So yes, the earlier assays will be enough to handle for now.
But will they be enough to provide useful information? Here’s where the arguing starts. Limitations of working with only in vitro data aside, you could also say that any trends that are subtle enough to need multi-company-sized pools of data might be too subtle to affect drug discovery very much. The counterargument to that is that some of these rules might still be quite real, but lost in the wilds of chemical diversity space due to lack of effective comparisons. (And the counterargument to that is that if you don’t have very many examples, how are you so sure that it’s a rule?) I’m not sure which side of that one I come down on (“skeptical but willing to listen to data” probably describes me here), but this is the key question that MedChemica will presumably answer, one way or another.
Even so, that in vitro focus is going to be a long-term concern. One of the founders is quoted in the article as saying that the goal is to learn how to predict which compounds shouldn’t be made. Fine, but “shouldn’t have been made” is a characteristic that’s often assigned only after a compound has been dosed in vivo. In the nastier cases, the ones you want to avoid the most, it’s only realized after a compound has been in hundreds or thousands of humans in the clinic. The kinds of rules that MedChemica will come up with won’t have any bearing on efficacy failures (nor are they meant to), but efficacy failures – failures of biological understanding – are depressingly common. Perhaps they’ve got a better chance at cutting down the number of “unexplained tox” failures, but that’s still a very tall order as well as a very worthy goal.
Falling short of that, I worry, will mean that the MedChemica approach might end up, even if it works, only slightly optimizing the shortest and cheapest part of the whole drug discovery process: preclinical med-chem. I sympathize; most of my own big ideas, when I get them, bear only on that part of the business, too. But is it the part that needs to be fixed the most? The hope is that there’s a connection, but it takes quite a while to prove whether one exists.

8 comments on “MedChemica: When One Compound Collection Isn’t Enough”

  1. Pete says:

    Generally, I would ensure that each matched molecular pair came from a single data set when doing this sort of analysis. Matched molecular pair analysis (MMPA) can be seen as a type of local QSAR modeling, and one assumes (correctly or otherwise) that differences in values of properties are less sensitive to assay variation than the values of the properties themselves.

  2. SAR screener says:

    ‘It’s probably hard enough coming up with in vitro assays across the two companies that are run under similar enough conditions to be usefully paired.’
    This is the part I worry about. Screening technology, protein construct, buffer, substrate concentrations, pre-incubation times, etc. can have a huge impact on the measured IC50.

  3. Sweden Calling says:

    Speaking from within the walls…I am amazed. A tool was developed, which never really was used. A few good people were laid off, and they could take the (clunky) tool for free. The tool is then bought back from the laid-off people (in the context of getting noisy data in return). The noisy data will never really be used (and God forbid evaluated). Money well spent?! Just waiting for the same to happen with another of our (external) successes, autoQSAR…which never really is used. Wonder why?

  4. Chrispy says:

    Well, this seems in keeping with the “post chemistry” era of drug discovery we seem to have entered. It used to be that this was exactly the kind of work done by real chemists (who, incidentally, could remake compounds or design new compounds to test SAR theories). Perhaps, too, this signals the beginning of a “post target” era since the data will only be useful for those targets already carpet bombed by not just one but two drug companies. Have we given up on finding anything novel?

  5. Anonymous says:

    Have we given up on finding anything novel?

    Yes, but only because we’ve given up on doing anything novel – everyone’s business model seems to be “Get cheap people to come up with an idea and implement it, buy idea, fire all the people, profit.”

  6. JC says:

    A giant load of tripe.

  7. TX raven says:

    If a given SAR trend works for your chemical series, does it matter whether the trend is a “universal rule” or not?

  8. leeh says:

    This approach is puzzling. The whole point of MMP analysis is to discover activity cliffs. These cliffs are very local, and in the case of a given chemical series in a particular assay, very specific to a particular site on the molecule. These rules are not transferable across chemical series (unless you know how the scaffolds align in the binding site) or across assays (unless you have prior knowledge that a scaffold binds to two binding sites that are essentially identical in the area of that particular part of the molecule). It is possible that particular chemical changes result in a higher than average probability of increasing binding affinity (such as substituting chlorine for hydrogen on an aromatic), but these kinds of rules are rather trivial and tend to be obvious by inspection (especially for a particular assay). If you isolate the change (by obscuring much of the structure) the value is lost.
    This approach is more general for some properties, such as physicochemical properties, but I’m guessing that’s not what these guys are trying to do.
