
Drug Assays

Images of Machine Learning

Where has machine learning made the most strides in recent years? A lot of people who are into this topic will tell you that it’s image processing, specifically recognition and differentiation of objects. You can see that just by how much more effective reverse image searching on the internet has become (to pick a free example), and in the you-gotta-pay-for-it world there are many more. Some of this has been driven by the amount of effort and funding that has gone into things like facial recognition for security (note the system that Apple uses for their latest round of iPhones), and some of it is due to digital images being a very algorithm-friendly form of data to work with. Compared to the shagginess of other data sets, a pile of image files is already wonderfully homogeneous stuff; some of the hardest curation has already been done up front by forcing things into a grid of defined dimensionality, full of defined pixel chunks, each in a defined color space.

So if you ask me where the most likely applications of machine-learning algorithms in drug discovery will come from, I’d say this is a good place to bet. I’ve seen some impressive presentations on automated histopathology evaluation, and (moving back to the early stages of things) high-content cell imaging assays are another obvious target that a lot of work has gone into. This new paper is a perfect illustration. People have been running cell imaging assays for quite a while now, but often in a targeted fashion: does this compound affect the cell phenotype in the way that we’re looking for? Once in a while you’ll see a more open-ended one: which of these compounds do unusual things to the cells that the others don’t do? This paper tries to fill in that gap. They’re looking at a screening set of over 500,000 compounds that had been run through a high-content cell imaging assay, and then looking at the company database (Janssen/J&J) to see what assays these compounds had been run through.

Turning the machine-learning algorithms loose on the combination of these two data sets allowed the software to construct fingerprints for all sorts of activities. I have just skipped over a good deal of work in the middle of the paper, of course, because the details of how you arrive at these relationships are the real business end of this business. I’m not competent to evaluate the techniques used here, but what I can tell you is that the authors tried the Bayesian matrix factorization method Macau, which throws everything into a high-dimensional vector space and thus avails itself of vector operations to do the processing. That’s about the limit of my guidance. They compared this method with a deep-neural-network one, layers of simulated “neurons” where the top layer gets the imaging files and the ones below it deal with the data according to their own algorithmic specializations and weightings, eventually (several layers later) giving you an output. This is a simplified version of what goes on in the visual cortex, with various layers and groups of neurons that are sensitive to straight lines, contrast features, and so on – tricking the processing routines of such neurons is the basis for optical illusions.
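For anyone who wants a feel for the matrix-factorization idea, here is a minimal sketch. To be clear about what it is not: it is not the Macau code and not the authors’ pipeline. Macau is Bayesian and samples its parameters; this toy version swaps in plain gradient descent with NumPy, runs on synthetic numbers, and every name in it (`factorize_with_side_info`, `observed`, the matrix sizes) is made up for illustration. What it does share with the description above is the general shape of the problem: a compound-by-assay activity matrix with lots of holes in it, per-compound image features as side information, and a low-dimensional vector space that ties the two together.

```python
import numpy as np


def factorize_with_side_info(Y, observed, X, rank=4, lr=0.05, reg=0.1,
                             epochs=1000, seed=0):
    """Fit Y ~ (X @ B) @ V.T using only the observed entries of Y.

    Y        : (compounds x assays) activity matrix (unobserved entries ignored)
    observed : boolean matrix, True where an assay result exists
    X        : (compounds x features) image-derived features (side information)
    Illustrative only: point estimates by gradient descent, not Bayesian sampling.
    """
    rng = np.random.default_rng(seed)
    n_compounds = Y.shape[0]
    B = rng.normal(scale=0.1, size=(X.shape[1], rank))  # features -> latent space
    V = rng.normal(scale=0.1, size=(Y.shape[1], rank))  # one latent vector per assay
    history = []
    for _ in range(epochs):
        U = X @ B                                # latent vector per compound
        resid = observed * (U @ V.T - Y)         # errors on observed entries only
        history.append(float((resid ** 2).sum() / observed.sum()))
        grad_B = (X.T @ (resid @ V) + reg * B) / n_compounds
        grad_V = (resid.T @ U + reg * V) / n_compounds
        B -= lr * grad_B
        V -= lr * grad_V
    return B, V, history


# Synthetic stand-in data (hypothetical sizes, nothing from the paper).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 10))                    # 200 compounds, 10 image features
B_true = rng.normal(size=(10, 3)) / np.sqrt(10)
V_true = rng.normal(size=(6, 3))                  # 6 assays
Y = X @ B_true @ V_true.T + 0.1 * rng.normal(size=(200, 6))
observed = rng.random(Y.shape) < 0.7              # about 70% of results "measured"

B, V, history = factorize_with_side_info(Y, observed, X)
pred = X @ B @ V.T
heldout_mse = float(((pred - Y)[~observed] ** 2).mean())
baseline_mse = float((Y[~observed] ** 2).mean())  # error of always predicting zero
print(f"training loss: {history[0]:.3f} -> {history[-1]:.3f}")
print(f"held-out MSE: {heldout_mse:.3f} (predict-zero baseline {baseline_mse:.3f})")
```

The point of routing the compound vectors through the image features (`X @ B`) is that a compound with no assay history at all can still get a prediction, as long as it has been imaged. That is the part that makes the imaging screen useful as a stand-in for other assays.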

Setting a pretty stiff threshold for significance when comparing the imaging data to 535 various protein/target assays, the Macau method found high-quality models for 31 of them, while the DNN procedure found such for 43. Note that the original imaging screen was for just one target (effects on the glucocorticoid receptor). This suggests that a single high-content imaging screen might be able to replace two or three dozen other screening campaigns. The team put this to the test, looking at a kinase target for an oncology program, and another enzyme target in a CNS program. Comparing the compounds picked out by the imaging screen models with the regular HTS hit rates, the first one was fifty-fold enriched in hits, and the second one was 289-fold enriched. (It should be noted that both hit sets included a number of different chemical scaffolds.) This strongly suggests that they’re on to something.
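For reference, a fold-enrichment figure like the ones above is just a ratio of two hit rates: the hit rate among the compounds the model picked, divided by the hit rate of the ordinary HTS campaign. Here is a small sketch of that arithmetic. The function names are mine, and the counts in the example are hypothetical, chosen only so that the answer lands near fifty; they are not the numbers from the paper.

```python
def hit_rate(hits, tested):
    """Fraction of tested compounds that came up as hits."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return hits / tested


def fold_enrichment(model_hits, model_tested, hts_hits, hts_tested):
    """Hit rate of the model-selected set relative to the baseline HTS hit rate."""
    baseline = hit_rate(hts_hits, hts_tested)
    if baseline == 0:
        raise ValueError("baseline hit rate is zero; enrichment is undefined")
    return hit_rate(model_hits, model_tested) / baseline


# Hypothetical counts: a 0.1% baseline HTS hit rate versus
# 5% among the compounds chosen by the imaging model.
example = fold_enrichment(model_hits=18, model_tested=360,
                          hts_hits=500, hts_tested=500_000)
print(f"{example:.1f}-fold enrichment")
```

Note that a big fold number can come from a very low baseline as much as from a very good model, so the raw hit rates are worth looking at alongside the ratio.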

And as with every other discussion of stuff that’s driven by hardware and software, this is only going to get better and faster. I think the authors would agree with this summary: if you have a large enough compound library that already has a legacy of being tested across many different targets – that is to say, if you’re a big drug company – you should strongly consider unlocking the latent information in there via a selection of high-content cell imaging screens. Here’s how they put it:

We emphasize that our approach relies on a supervised machine-learning method, and hence activity measurements and imaging data must be acquired for a reasonably sized library of compounds to train the model. Subsequently, however, it seems possible to replace many particular assays with the potentially more cost-efficient imaging technology together with machine-learning models. Specifically, one would execute one or a few image screens on the library instead of dozens of target-focused assays. This raises an interesting question of the breadth of drug targets that could be accessed by imaging screens if the screen were optimized for that purpose, or if a combination of screens was used that explored multiple cell lines or sources, culturing conditions, staining of organelles, and/or incubation times.

Indeed. In fact, this paper could be just a first crude step compared to what’s possible in this line. I very much look forward to what comes next. This whole thing, I might add, gives a good framework to think about the role of machine learning and automation in the drug business in general. Instead of our machine overlords coming to eat our lunch (“Unclear-input-we-do-not-require-‘lunch’”), this is an example of our tireless machine servants running off to do completely insane amounts of grunt work that would drive us nuts, but will return high-quality results that we can then use our human brains and powers of decision to act on. Bring ’em on.

31 comments on “Images of Machine Learning”

  1. Belgian PhD student says:

    As soon as I read the words ‘machine learning’ and ‘Janssen’ I had to check if H. Ceulemans was amongst the authors. Turns out he’s senior author. He did his PhD in our lab and two decades later they still tell stories about him.

  2. Anon says:

    Show a two-year old 3 pictures of 3 different dogs, and they’d be able to accurately identify any dog in a fourth picture.

    Now how many thousands – or millions – of images would you have to show an AI program before it could accurately identify a dog?

    That’s how crap AI is right now.

    1. Jeff L says:

      Ask a random two-year old to teach itself chess. How many decades would it take to beat grand masters? That’s how crap two-year olds are.

      A two-year old has seen hundreds of millions of “images” in its life already. It has been pre-trained to do image recognition with an architecture heavily biased toward noticing trends like the one you used as an example. Give the same two-year old three SEM images and ask it to identify any nanoparticle in the fourth and they’re likely to perform just as well as a model.

      This isn’t an insurmountable problem. Right now you can just use pre-trained networks; well-trained models are much more generalizable than a lot of people think. There are some other, more experimental methods as well.

      1. Anon says:

        Oh dear, chess again. How many real-life problems are there to solve where we have all the rules and potential outcomes so clearly defined without having to make any additional experiments or observations?

        1. Jeff L says:

          There are countless. Any time someone builds a model to fit already existing data, that is exactly what they have done. There’s plenty of already existing data of which we have no understanding.

          If you want a “real life” example, try driving a car. Something AI has also surpassed humans in, with the advantage that you don’t need to teach every car how to drive.

      2. Emjeff says:

        Your analogy is imperfect, because chess, as played by computers, largely consists of running simulations of the game conditions on a certain move, discarding those with poor outcomes, and retaining the one(s) with good outcomes. Asking a two year old to do this (or a 40 year old) is a tall order.

        1. Jeff L says:

          That’s exactly my point (that it is difficult for humans to do this). The original poster’s analogy requires you to disregard all the privileges of having a human neural network already pre-trained, without considering any of the limitations.

          1. Pennpenn says:

            Yeah, it’s almost as if being the product of hundreds of millions of years worth of utterly brutal natural selection gives you an edge up on the results of a few decades (being generous) of product development.

      3. Jeff – very well argued. Another example: A neural net can be trained in a few days on a desktop computer using the Linux codebase (about half a gigabyte), and can output made-up code that is very recognizably C-ish – correct syntax (including nested braces) and decent-looking “sentences.” A two year old, or even a six year old, would not be able to do that nearly as well.

        So, for tasks where a human has trained for years on 99% of the knowledge necessary (such as recognizing images of dogs), humans will be better than AI. For other tasks, a well designed AI may learn faster and may perform better in the end.

        1. steve says:

          You guys are arguing chess when Go is a much more complicated game with 3^361 possible states. AlphaGo was programmed in the manner you criticize chess AI programs for – by programming a number of games and their outcomes. It beat the world’s Go master, Lee Sedol, coming up with moves that Sedol said no human would think of. It therefore demonstrated true innovative thought. AlphaGo Zero was only taught the rules of Go and then taught itself, without being programmed with any human games. In 40 days it mastered enough to beat all existing programs including AlphaGo. Recently, AlphaZero took only 8 days of training itself to beat AlphaGo Zero. It also beat a top chess program (Stockfish) and a top Shōgi program (Elmo). So please, spare me the rhetoric about how dumb these programs are and value that you are living in the decade or two before they take over.

    2. joergkurtwegner says:

      Thanks for sharing Derek!

      Macau means you don’t predict one assay, but multiple at the same time, which means that an existing correlation structure is used for improving estimates. You can simply think about this as related targets or phenotypes, the known and unknown ones. This is called multitask learning, and simply formalizes using existing knowledge, everyone in the GPCR and Kinase area knows very well, now it is just formalized and target-class independent!

      I might reformulate the dog analogy.
      The challenge is to find which of a million animals will prevent their owner from harm? Mind that we measure a phenomics compound perturbation (for starters) response in imaging, not the compound directly.
      As industry, do we want more dogs? I like other animals, too ! 😉

      “All Science Is Either Physics or Stamp Collecting” [Ernest Rutherford]

    3. NotAChemist says:

      Yes, humans are smarter than machines. But if you have run half a million compounds through a screen, you cannot ask a human to look for correlations in the output. Nobody smart enough to do that job properly would be willing to do that job. A computer will.

    4. johnnyboy says:

      “That’s how crap AI is right now”. Your comment is about 5 years old. AI is no longer crap. I have 20 years work experience, and deep learning algorithms trained for a few hours can do parts of my job that I couldn’t dream of doing myself. And i’m not trying to recognize dogs.

    5. CheMystery says:

      Everyone needs to read the work of Hubert Dreyfus on AI. Let the philosophers handle it.

  3. Thoryke says:

    Somebody managed to train the current version of Apple’s ‘Photos’ program to recognize/tag “cat”, “building”, and “flower” photos with reasonable accuracy. I don’t think the “AI can’t recognize a dog as well as a 2-year-old” argument has as many legs to stand on as it used to.

    1. Some idiot says:

      4 legs, if it is a dog, I think…?


    2. Some idiot says:

      But sorry, yes, I agree with your point! To use the Apple photos app as an example, quite a while ago (10 years or something?) I was a bit frustrated at how a much earlier version (for MacOS) kept identifying other boys as my son. Now it is very significantly better. And that is just 10 years…

      Hmmm… I wonder how good it would be identifying my dog amongst other dogs (and especially dogs of the same breed)? Hmmm… worth a test… 🙂

  4. Barry says:

    Not too long ago, efforts to automate screening conditions for crystallizing novel proteins still relied on humans to interpret microscope images (images were sent to India for the purpose, I think). Image processing has improved since then. We can’t make crystals grow on our schedule, but machines can now tell us when we’ve succeeded.

    1. Leo says:

      “We can’t make crystals grow on our schedule, but machines can now tell us when we’ve succeeded”

      Do you have some evidence for this claim that you would be willing to share? Please?

  5. Imaging guy says:

    After reading this article, I remember an article criticizing QSAR (1). The same thing will be said of this method (i.e. predicting biological activities from pixels of a fluorescent image) if it becomes popular (a big if) in next 5 or 10 years.
    1) The Trouble with QSAR (or How I Learned To Stop Worrying and Embrace Fallacy) (Johnson SR, 2008, PMID: 18161959)

  6. tlp says:

    These are impressive results! Here’s just a minor nitpicking:
    1) Their method does enrich chemically-similar compounds from initially quite dissimilar set (fig. 5, left) in the new oncology project. So it would be more fair to compare their selected compounds to the ones identified only via similarity search. For CNS project they did a few additional filtering tricks and it’s unclear how it affected similarity. But anyway results are impressive.
    2) I don’t know how often pharma companies develop new assays for new projects (I’d estimate ‘quite often’), so their model’s predictive utility is going to be limited by this factor. Also I don’t know how often pharma companies are interested in knowing activity of new compounds in old assays (I’d estimate ‘not as often as they develop new assays’). So the reverse problem – predicting activity of the old compounds in new assays – would be much more valuable but it is clearly a much higher hanging fruit. Oh, wait, it sounds like docking.

  7. Anonymous Researcher snaw says:

    Google has made a particularly clever use of its image recognition capabilities to make a tool for calling genetic variants in DNA sequencing data. The “standard” way of doing that is feed the data through a pipeline of analysis steps, ending up with a long list of putative variants. Then you generate some visualizations (basically color-coded DNA sequence alignments) and a human scans them to judge which are likely real biological differences versus technical artifacts.

    Well, Google trained an image recognizer to look at such images and classify them as likely real variants versus technical artifacts.

    Here’s a good summary of what Google did:

    1. DeepVariant says:

      The claims made about Google DeepVariant are apparently not received well in the variant calling research community. It is shown to be not superior or in some cases inferior to other methods (GATK, 16GT, Strelka2) in accuracy, and to be much more resource inefficient.

      One article written for general audience:

  8. Imaging guy says:

    Machine learning (supervised learning) does things that humans can do, at a much faster rate. But I doubt it can discover new knowledge (latent information). It is like Excel. A human can analyze a database, but Excel can do it faster and with far fewer errors. For example, machine learning, after being trained on thousands of histology/X-ray images, will be able to correctly classify new images. But this is what pathologists/radiologists do. When someone says their machine learning algorithms can classify sexual orientation or criminality of a person after being trained on thousands of images (i.e. discovering new knowledge/latent information), you should be very doubtful. fMRI data is also analyzed with a type of supervised learning/machine learning method. In contrast to X-ray/histology images, humans cannot analyze fMRI data, and that is the reason fMRI is still “learning” after nearly thirty years.

  9. Chris says:

    We applied a ‘supervised machine learning’ approach through a commercially available high content imaging system to the most basic gross-cell phenotypic changes (does it hold shape or does it turn into tiny bubbles of debris when we dump compound on it) and then ran that through a compound library to look for inhibitory compounds.
    Definitely pros in that it was live-cell (5% CO2 in air, 37 °C, so we measured at a few different time-points), we got images at the end of it, so if need be I could go back now and verify that indeed there’s no cellular matter there (or that perhaps there is but it’s shrunk significantly, or become rounded etc which interestingly didn’t come up). Furthermore it cost us time on a machine, and while it’s not a cheap microscope it was a cheaper option than the robotic resazurin/3H-nucleotide assays we had set up previously despite being similar throughput.
    Definitely cons- data management was something we didn’t put enough thought into early on, and we switched how we stored things half-way through which does make life more difficult when accessing it all now. The commercial application we have wasn’t as sensitive as it seems the one in your paper is, however this is perhaps down to the quality of the software which I imagine will only get better over time. We also couldn’t image whole wells as it took too long and the cells became stressed despite apparently being incubator conditions (maybe the lacking humidity?), and in the end imaged around 1/5th of the well. And lastly getting it to find edges of cells was, for whatever reason, quite problematic.
    Overall it was definitely something different we wanted to try, and something that will only get better and better over time, but we’re not quite there yet.

  10. Kaleberg says:

    I think machine learning using neural networks can be very useful, but then again, so is regression. Machine learning takes a big training set and finds a set of neural net parameters that can produce the same result. Regression takes a set of points and determines the parameters for a curve fitting model. You can use machine learning to save a huge pile of work as long as you remember that you are doing something different in quantity, but not in quality from a regression. Expect to save a bazoodle of hours of work, but don’t expect magic.

  11. Insiliconsulting says:

    supervised drug target/MOA prediction (Multi-class/Multi-label) has been around a long long time. So has tox and adme prediction. Recent improvements in ML especially Deep learning in cheminformatics have to do with structure generation based on known SAR.

    Image processing and ML have been used successfully in allied problems like age prediction based on facial features, skin condition/pathology and several others.

    HCS is certainly ripe for disruption by ml at this stage. Other applications of deep learning in scientific literature mining like finding drug-ae associations for pharmacovigilance are also noteworthy.

  12. Kelvin says:

    Machine learning cannot create new knowledge, it can only connect the dots of existing knowledge, and there is no guarantee that the way it connects the dots is meaningful and predictive of where any new dots may lie.

    The only way to *know* where any new dots are, is to observe them directly!

  13. steve says:

    Sorry, but wrong. As I pointed out above, AlphaGo came up with moves when it won against Lee Sedol – the reigning world Go champion – that Sedol and the Go community said no human would have thought of doing. According to one of its inventors, “AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving.” As I discussed above, AlphaGo was first generation and the second and third are orders of magnitude better. Too many on this thread are drastically underestimating the power of the technology.

    1. fajensen says:

      I’d add that these results are despite that we are, in my opinion, not even using a proper approach to Machine Intelligence. For now, we are just simulating things using vastly complicated hardware and pretty static algorithms.

      From doing some research on “reservoir computing” I suspect that “the ability to perform computing” is a property of many more physical systems than currently assumed, therefore it could be possible to build something in a lab somewhere that is vastly more efficient at AI-type computing than brains and computers are. I don’t see why an AI should not also be creative – especially if some kind of quantum process is involved, the AI’s “computation nodes” will be “entangled” to some degree with the rest of the universe like our brains are. The noise from various quantum effects will cause different computational paths to be set off and the AI can pick from all of them, exactly like we can, and come up with new things.

  14. Just saying says:

    Nothing new under the sun in terms of machine learning, quite standard. Another example of how good it is to have exclusivity of data to achieve impact. PS: I am really hoping that the first post was not sent by one of the authors/friends…
