Chemical News

Automated Discovery

To what extent can scientific discovery be automated? Where are the areas where automation can make the biggest contribution to human efforts? These questions and a number of others are addressed in a very interesting two-part review article on “Automated Discovery in the Chemical Sciences”. The authors, from MIT, are well-equipped (in all senses of the word) to give their perspectives on this – part I of the review defines terms, and introduces a classification scheme for the sorts of automation and the sorts of discoveries under discussion, and reviews what’s been done to date. Part II is more of a look toward the future, with some open questions to be resolved.

As the authors say, “The prospect of a robotic scientist has long been an object of curiosity, optimism, skepticism, and job-loss fear, depending on who is asked.” I know that when I’ve written about such topics here, the comments and emails I receive cover all those viewpoints and more. Most of us are fine with having automated help for the “grunt work” of research – the autosamplers, image-processing and data-analysis software, the plate handlers and assay readers, etc. But the two things that really seem to set off uneasiness are (1) the idea that the output of such machinery might be usefully fed into software that can then reach its own conclusions about the experimental outcomes, and (2) the enablement of discovery through “rapid random” mechanized experimental setups, which (to judge from the comments I’ve gotten) is regarded by a number of people as a lazy or even dishonorable way to do science.

I think that the classification scheme in the paper is a useful one to start to deal with these objections. They divide scientific discoveries impacted by automation into three categories: physical matter (a drug candidate, a new metal alloy, new crystal form, etc.), processes (such as new chemical reactions), and models (new laws, rules of thumb, correlations, and connections). The authors argue that all three of these are fundamentally search problems – they just differ in the knowledge space being searched, which is a process of validation and feedback. That holds whether you’re talking about a hypothesis-first (Popperian) mode of discovery or an observation-first (Baconian) one; the difference between the two is (to a large extent) where you enter that cycle of observation and experimentation. The paper makes the key point that in every example of machine-aided discovery so far, the search space has been far larger than what was (or even could be) explored. When you look closer, it’s human input that has narrowed the terms and the search space. Whether that will eventually change is one of those open questions. The authors also note the three factors that are enabling automation in all of these classes – access to large amounts of data, the increasing computing power to process it all, and the advances in hardware to mechanically manipulate the physical tools of experimentation.

Now one gets to the question of just how automated/autonomous things really are (or really can get):

Here, we propose a set of questions to ask when evaluating the extent to which a discovery process or workflow is autonomous: (i) How broadly is the goal defined? (ii) How constrained is the search/design space? (iii) How are experiments for validation/feedback selected? (iv) How superior to a brute force search is navigation of the design space? (v) How are experiments for validation/feedback performed? (vi) How are results organized and interpreted? (vii) Does the discovery outcome contribute to broader scientific knowledge?

As you’d imagine, existing examples fall all over the place on these scales, and the upper reaches of this scheme are still basically unpopulated. But that is surely not always going to be the case, is it? We humans operate in a pretty unconstrained search space, and that means that you can have situations where the human makes all the decisions and points the machine at the task, where the human uses the machine to narrow down the possibilities and then takes action, and (finally) where the machine narrows down those possibilities and takes action itself.

The how-effective-is-brute-force question applies to the “automated serendipity” style of reaction discovery that has shown up in the literature in recent years. I see that as a sliding scale – on one end you have a machine chugging through every single flippin’ possibility, filling in all the boxes as you hope something interesting hits, and at the other is the ideal of the human scientist, eyes closed and fingers to temples as they stand in front of the whiteboard, in the very act of bringing a creative discovery into being. In truth, most discoveries are in between those two extremes. The machines (as mentioned) have their search space limited by human input, and the humans often have to try and discard a number of possibilities before hitting on the right one.
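To make question (iv) a bit more concrete, here is a minimal sketch of the two ends of that sliding scale on a toy problem. Everything in it is an assumption made up for illustration: the three-variable design space, the `simulated_yield` function standing in for a real experiment, and the `Lab` helper that counts distinct experiments. It is not taken from the review, and the "guided" strategy is plain one-variable-at-a-time optimization, not any method the authors describe.

```python
from itertools import product

# Hypothetical design space: every name and number here is illustrative.
SPACE = {
    "temp_C": [20, 40, 60, 80, 100],
    "catalyst_mol_pct": [1, 2, 5, 10],
    "solvent": ["water", "ethanol", "toluene", "DMF"],
}
SOLVENT_BONUS = {"water": 0, "ethanol": 10, "toluene": 25, "DMF": 15}


def simulated_yield(temp_C, catalyst_mol_pct, solvent):
    """Made-up stand-in for running one experiment (not real chemistry)."""
    return (60 - 0.5 * abs(temp_C - 60)
            - 3 * abs(catalyst_mol_pct - 5)
            + SOLVENT_BONUS[solvent])


class Lab:
    """Runs 'experiments' and remembers each distinct one that was needed."""

    def __init__(self):
        self.results = {}

    def run(self, point):
        if point not in self.results:
            self.results[point] = simulated_yield(*point)
        return self.results[point]


def brute_force(lab):
    """Fill in every box: try each combination once and keep the best."""
    return max(product(*SPACE.values()), key=lab.run)


def one_variable_at_a_time(lab, start):
    """Vary one setting while holding the others fixed; repeat until no gain."""
    axes = list(SPACE.values())
    best = start
    improved = True
    while improved:
        improved = False
        for i, levels in enumerate(axes):
            candidates = [best[:i] + (lvl,) + best[i + 1:] for lvl in levels]
            top = max(candidates, key=lab.run)
            if lab.run(top) > lab.run(best):
                best, improved = top, True
    return best


exhaustive_lab, guided_lab = Lab(), Lab()
best_exhaustive = brute_force(exhaustive_lab)
best_guided = one_variable_at_a_time(guided_lab, start=(20, 1, "water"))

# Compare the winning conditions and the number of distinct experiments run.
print(best_exhaustive, len(exhaustive_lab.results))
print(best_guided, len(guided_lab.results))
```

Both strategies land on the same conditions here, and the guided one runs far fewer experiments. But that only works because the toy landscape was deliberately built to be smooth, with each variable acting independently. On a rugged landscape with interacting variables the shortcut can stall on a local optimum, which is another way of saying that the savings come from assumptions a human put in beforehand.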

I haven’t even made it in this post to the second review paper – we’ll save that for another day! The rest of this one features an extremely comprehensive review of past examples of machine-aided discovery in chemistry (literature mining, reaction prediction, new reaction discovery, property prediction, ligand prediction, optimization of existing reaction conditions, and more). It’s a thorough look at what’s been done – but the next paper goes into what might be accomplished from here on, and what we’ll need in order to do it. . .

21 comments on “Automated Discovery”

  1. Isidore says:

    618 references!!!!!

  2. Charlie Kilian says:

    The fear or discomfort at what might be accomplished with automation is interesting. Especially the idea that it is a lazy or dishonorable way to do science. I wonder how much of this can be ascribed to the desire for credit.

    Every scientist is going to have two competing tensions in herself: On the one hand, a desire for new knowledge. It’s the “cool factor” of knowing a new thing just to know it. On the other hand, a desire for credit. For one’s peers to know that you discovered this new thing, and the admiration that comes with it.

    The precise mix of how these two motivations drive a person is going to be unique in every individual. I wonder if people who are higher on the “knowing a thing is its own reward” axis are more comfortable with the techniques afforded by automation (and the corresponding “accidental” discovery of a new thing) than the people who are stronger on the “I desire the admiration of my peers” axis.

    1. Eric Nuxoll says:

      I’m skeptical of how much “desire for credit” is a motivation of its own in scientific discovery. In many cases, there aren’t very many people who can even understand just how great a particular achievement is, so it’s a relatively narrow and obscure group from which you’d receive the credit. I think many other professions supply much more opportunity.
      Rather than an end, I think “desire for credit” is usually a means–a means to discovering more cool new stuff. Discovering cool new stuff usually takes a lot of resources these days, and people with lots of credit for discovering cool new stuff find it much easier to get the resources to discover more cool new stuff.
      Or maybe I just have an idealistic view of my colleagues.

  3. loupgarous says:

    One could ask “Will exponential (in the case of computing power, asymptotic) increases in automation of the drug development task hinder or help society’s objective of getting medical care to as many people as possible at a cost society can bear?” The two main answers seem to be “who buys this equipment, how efficiently do they use it, and do any incentives for buying this equipment help or hinder people’s access to drugs in general?”

    Crusading, publicity-seeking politicians such as Elizabeth Warren and Alexandria Ocasio-Cortez describe things like drug development as corporate activity intended to increase the profit of capitalists. I haven’t noticed them talking about the sheer amount of this capital expended developing the ~90% of new drugs that fail to gain approval. This is the price of making new drugs as safe and effective as possible.

    There are also politicians who discuss creation of laws regarding drug marketing and development behind closed doors with PhRMA and other industry lobbyists. Lobbyists don’t care about political parties – legislation on drug pricing is now in House subcommittees controlled by people who’ve accepted a lot of money from Big Pharma. Oddly, that didn’t change when the House went from control by one Big Party to control by the other Big Party.

    It’s hard not to see abuse of laws permitting pharma firms to recover development costs for orphan and other small-market drugs to jack the prices up on older drugs with largely known side-effect and toxicity profiles such as Daraprim, colchicine and Thiola as the result of that other extreme of political activity. It’s actually worse if the damage was unintended, because it means no matter what Congress does to “fix things”, they’ll break other things about how Americans get medical treatment.

    That is where my thoughts about “society’s objectives” came up short. Society itself, through what its elected officials do, is not helping in the ways they say they want to.

    Politicians ought to make sure things like laws intended to make newer and safer drugs available to those who need them don’t become ways to loot the pool of health care funding through insurance and social entitlements. They also need to discover and publicize answers to questions such as why insulin analogs cost so much in this country that Americans die from not being able to afford them, when across the border in Canada, they cost so much less.

    “How will the increase in automation of drug development impact the price of drugs to those who use them?” is something we all ought to be interested in. News coverage of abusive pricing of old drugs by clever tweaking of Federal laws and regulation shows that the higher efficiency, quality and quantity of the scientific work made possible by such automation isn’t even the main answers to that question.

    Someone has to pay for all these shiny new tools before they can be used. How will the capital structure needed to buy all this new equipment evolve? Will Congress continue to give Big Pharma and the next Martin Shkreli ways to cheat those who pay for medicines and reduce actual access to life-saving drugs? Will Elizabeth Warren’s next new idea be to fund drug development directly? How long will it take the next Martin Shkrelis of the world to turn that into a way of funneling health care funding into their pockets?

    1. loupgarous says:

      NOTE: the similarity between my “The two main answers seem to be “who buys this equipment, how efficiently do they use it, and do any incentives for buying this equipment help or hinder people’s access to drugs in general?” and Cardinal Fang’s listing of the reasons nobody expects the Spanish Inquisition is stipulated for the record, as are any mismatches between “is”, “are” and the number of their subjects later on.

  4. New PI says:

    The current generation of PIs is a little piece of arrogant trash…..go ahead and ignore 10 postdoc applications tomorrow, each of which is a better scientific mind than you will ever be. Sit on your a$$ and “write a grant”.

  5. Diver Dude says:

    It might have escaped the notice of non-UK readers but the official current policy of the UK main opposition party (who may well be the government in a few weeks’ time) is the creation of a state-owned pharmaceutical company to produce off-patent pharmaceuticals and to use compulsory licensing to produce versions of on-patent pharmaceuticals that it deems too expensive to buy from the patent holder. This proposal has met with exactly no criticism whatsoever from anyone in political circles. I think the pharma biz may be in great danger of jumping the shark, at least in the UK.

    1. Yup says:

      It hasn’t escaped our notice……everyone who is sane in the US knows that the whole rest of, basically, the world is a big freaking idiot that is hellbent on shoving socialism up our crack hole.

      1. Diver Dude says:

        Well, that is one way of looking at it but it is a very US-centric one. Another way might be to say “the global model for drug development no longer serves the interests of the people it is meant to be for (ie the patients), so something fundamental needs to change”. And the pharmaceutical biz seems to be bent on handing all the ammunition in the world to its detractors to support that position.

        1. metaphysician says:

          And then the “rest of the world” will get to enjoy the wonderful state of affairs where *no one* is actually researching new drugs, because there is no money in it. After all, if the US stops paying the costs of new drug development, and none of the rest of the world had been paying those costs for decades. . . that kind of leaves no reason to do new drug development.

          I hope we actually survive the period ( decades, potentially ) before new drug research actually starts again.

          1. Diver Dude says:

            Which begs the rather uncomfortable question “do we actually need new drugs?”. I completely accept that there are conditions that are underserved or not served at all. The latest vogue for immunotherapy-based cancer treatments, for instance, appears to be massively useful for a very few patients and for the rest, not so much. But you pays the same either way. And the even more voguish vogue for developing treatments for vanishingly small populations at enormous prices is not a good look.

            Meanwhile, the really big (western) diseases seem to be relatively well catered for and any improvements appear to be incremental despite their eyewatering prices.

            So given that, Peter Thiel notwithstanding, you *are* going to die of *something*, is it worth bankrupting our societies to add a very limited duration of not very good quality of life?

            We might not like the questions, but governments have a duty to ask them and then propose political solutions.

          2. loupgarous says:

            @DiverDude: The people in BigPharma who make insulin analogs seem to be asking the same question about diabetics living in the US: “Do they need insulin therapy?” I was placed on insulin for poorly-managed NIDDM, but the sheer price of it caused me to go to a different non-insulin medication regime which works better and depends on two generics and one affordable on-patent med. Now, I’m down to pre-diabetic BG levels.

            My point is there’s usually a “happy medium” between government-controlled pricing and robber-baron pharmacapitalism, and it depends on availability of generic medications to all who need them. FDA broke that mechanism by allowing market exclusivity on grandfathered drugs with well-understood PK and side-effect profiles, to the detriment of those who need drugs like Daraprim, colchicine, and other small-market grandfathered drugs. No one, stunningly, seems to have proposed changing that law to do what it was meant to do.

            I’m calling BS on the idea that we needed to do this for those grandfathered drugs – what about aspirin, whose pharmacology could be better understood? The answer seems to be “but… but… but…. the money isn’t right!”.

            If we did to aspirin what we did to, say, colchicine, the odds are we’d learn what we already knew, that we don’t give ASA to kids, and we teach people to always watch out for GI bleeds and allergic reactions – and its price would go up to US$100/bottle. But patients have alternatives that still cost about $0.10/pill like naproxen, and even cheaper but potentially more dangerous ones like APAP. So nobody wants to re-visit FDA approval for aspirin; they’d be out, say, $70-80 million for the clinical studies and regulatory hire, but unlikely to earn that back during market exclusivity.

    2. Lambchops says:

      It’s also the official policy of the current government to leave the EU by the end of October, “do or die.” Is this actually going to happen?

      Reading too much into policy announcements before an inevitable election, particularly in such a febrile political climate, is a fool’s errand.

      Even the most categorical of pre-election policy announcements can run into the rocky grounds of either the changing political climate or reality (Lib Dems not raising tuition fees ring a bell?).

      If anything I’d have said their other policy announcement of ensuring that drugs based on public funded research is slightly more controversial than the state run pharma company (but of course the latter is more evocative of “the commies are coming for you” style imagery so I can see why it is being focused on more in the media!). The sheer amount of effort that is often involved in going from a target found in a public funded study to an actual viable drug has been well covered before on this blog, and there are already mechanisms in place by which public funders can negotiate royalty deals etc.

    3. loupgarous says:

      It’s not just the UK that’s looking at compulsory licensing. In the US, almost 20 members of Congress belonging to the Democratic party have called on the head of the US Department of Health and Human Services to issue compulsory licenses for drugs like Gilead’s Harvoni and AbbVie’s Mavyret combination therapies for hepatitis C.

      “Compulsory licensing” sounds great until, say, Gilead Sciences and AbbVie look at developing drugs that, say, are less toxic than their Harvoni combination therapy, and have to work up business plans to justify fronting the costs for developing the new drug, getting IND approval, then holding clinical trials, the 10% chance of getting marketing approval, with an eventual income stream much smaller than Harvoni’s or Mavyret’s.

      Ayn Rand’s Atlas Shrugged is preachy and doctrinaire, not pleasure reading for most folks. But the novel’s premise, “what if capitalists just stop playing a game rigged against them?”, still is a question worth asking. Capitalists won’t play in a market where the government can decide whether they are compensated for their investment, and how much return, if any, they get on it.

      Big Pharma, after all, isn’t the only place people with investment capital can put their money. If the return on the massive investment new drugs need to be developed, with only a 10% chance they’ll even have a drug to sell after the costs of developing it are sunk isn’t there, Congress and Parliament may get the chance to pay for the entire drug development cycle themselves after they mug drug companies for their patent rights a dozen times.

      Of course, “compulsory service for lawmakers”, with salaries to be determined by poll results and more stringent penalties for taking bribes to make up the difference from their present salaries might get the point across to lawmakers who like compulsory licensing.

      1. Diver Dude says:

        “Capitalists won’t play in a market where the government can decide whether they are compensated for their investment, and how much return, if any, they get on it.”

        Yes they will because they already do. In most of the non-US world that is exactly how things work and yet capitalists invest, profits are made and patients are treated.

        1. loupgarous says:

          “They do not preach that their God will rouse them a little before the nuts work loose.
          They do not teach that His Pity allows them to leave their job when they damn-well choose.
          As in the thronged and the lighted ways, so in the dark and the desert they stand,
          Wary and watchful all their days that their brethren’s days may be long in the land.”

          Those who feel that the people who they send to government know best what returns investors in the drug industry should make should pray long and hard that we never reach the point where our medications are made by government factories, whose managers are political appointees, and possibly political nepots.

          1. Diver Dude says:

            And my point is that, if the effect of our actions over the last 20 years is any guide, we are making that highly undesirable state of affairs much more likely.

          2. loupgarous says:

            @Diver Dude: Possibly so. There’s a middle way that even Trump’s biopolicy guys have talked about – importation of generics. It’s something I hope is also in that bundle of Democratic party drug pricing proposals – it’s a KIND of compulsory licensing we could look at that might not damage drug development economics badly, but serve as a pricing guideline to corporate decision makers.

            Someone’s already discussed its use in procuring antiviral therapy for hepatitis C for more people.

      2. fajensen says:

        Capitalists won’t play in a market where the government can decide whether they are compensated for their investment, and how much return, if any, they get on it.

        Oh, Boy, Oh Boy! Somebody has obviously never been involved with defence procurement or even scratched the paintwork on any of it!

        Here down in the mud, back on old Planet Earth, Capitalists not only love, no, they will positively crave those cost+ contracts that only Government can give: Government Guaranteed Profits. It’s just the best damn thing there is for any serious Capitalist venture, they can even use that certainty to borrow hugely against years of future income, thereby sucking all the earnings into fat returns Today, Now.

        Capitalists, they truly understand and appreciate all those things that Government and especially Regulation can do for them. Which is why they all invest heavily in making any Government work for them and in Changing any government that doesn’t!

        Only fringe ideologues, those with little stable income from sponsors and maybe more ‘globalised’ funding, are seriously fighting ‘Big Government’.

  6. loupgarous says:

    I’ve noticed my comments and those of others have wandered away from topic – “Automated Discovery”. Mea culpa.

    But now we’re where we want to be when talking about automated drug discovery – where does it fit into drug development economics? I’ve worked at one global drug concern that had not one, but at least two Cray supercomputers for examining protein structure and dynamics. I assume they were at least tangential to the project I was assigned to, a recombinant origin version of a human endocrine peptide in which the positions of two nucleotide bases were swapped to make its component parts more bioavailable and improve its utility as a drug.

    How much did this wind up costing customers of the drug in question, an insulin analog? Quite a bit, considering this touched off a drug development war between competing drug companies which didn’t decrease the price of treatment of diabetes by insulin analog. The price went UP.

    Of course, this could have been solely the result of “physician education” by drug company sales reps on how Brand X recombinant human insulin analog reduced insulin resistance (one of the hypotheses floating around where I worked). Someone got to my endocrinologist 12 years ago – she prescribed me a way cool version of vitamin D3 that (to my shock and amazement) cost me $100 a bottle. I get my D3 from Dollar Tree for $1 a bottle, now.

    But my point is, I didn’t have IDDM. I had NIDDM, and probably insulin resistance that didn’t care how the beta-units and C-proteins in my insulin analog were re-arranged (the cold-like symptoms and redness at my injection sites were sort of a giveaway). So we went back to the drawing board and discovered a combination of generic oral hypoglycemic agents and a wonder drug that made me pee glucose more often. That worked for a fraction of the price of insulin analog therapy.

    How much “education” of physicians on the virtues of new medications we owe to automated drug discovery can we expect? And how often will physicians prescribe them when they ought to be considering less-expensive generic therapy and their governments ought to be making those generic versions as available to them as conditions warrant?

  7. eub says:

    “[…] models (new laws, rules of thumb, correlations, and connections). The authors argue that all three of these are fundamentally search problems – they just differ in the knowledge space being searched […]”

    This one is only true if the idea of “model” is impoverished. A really interesting scientific discovery is one where you extend your concept of the search space — “oh, I never would have thought of that!”

    Sure, in a closed conceptual space where you know all of the variables, you can exhaustively search the pairwise causal connections, the higher-valence connections, as you like. But if you didn’t already suspect that, oh, ulcers might be related to bacteria, just how would you construct the search space that could hit on that one? Your search space might include… everything biologically salient.

    Or try to enumerate the search space that’s going to hit on “hey maybe mitochondria are subsumed alphaproteobacteria.”
