
Drug Assays

Clouding Biology: That Verb Cuts Both Ways

Here’s Steve Dickman at Forbes with a look at “cloud biology” approaches to medicine and drug discovery. This is an area I’ve written about several times before, and I also recommend Wavefunction’s take on this. I particularly like the quote from Mark Murcko about having to extrapolate from biology that’s “half right and half wrong”.

That’s something I emphasized here as well – biology is not only inherently messy and complicated, but the data that we have to figure it out are in pretty squishy shape, too. There’s the whole reproducibility problem, for one, which has gotten a lot of ink in the last few years. (And there are weird little variables that might be kicking all sorts of assays around.) But beyond that, there are a lot of reproducible but hard-to-interpret data sets out there. If you’re using tool compound X to draw conclusions about pathway Y, and (unbeknownst to you) compound X has off-target activity that has some sort of bounce-shot downstream effect on your pathway Y markers, well . . . other people will probably be able to reproduce those numbers, but no one’s going to be able to make sense of them very easily.

There’s also the interesting problem of Stuff We Just Don’t Know About. Twenty years ago, for example, nobody really knew that all these little RNA pathways existed (RNAi, miRNA, dsRNA, etc.). There are any number of other examples, important processes and pathways that we haven’t even noticed yet. They’re in there, making cells do things in our assays, but all we can do if we’re puzzled is throw up our hands and say “something else must be going on” or try to force interpretations based on what we actually know.

Computer hardware and computer software, I think it’s safe to say, don’t suffer nearly as much from this sort of thing, since (as I never tire of pointing out) humans built them. That’s not to say that weird bugs don’t happen (they certainly do), but you don’t suddenly discover the inverse of a weird bug – something you didn’t suspect and didn’t understand that is actually causing your program to function correctly. (Reminds me of the rarely-if-ever seen diagnosis of inverse paranoia: the irrational conviction that people are sneaking around behind your back to do you favors). Since when do programmers find whole classes of subroutines in their code that they didn’t know existed, or assembly-language-level steps that are functioning in their systems without them ever being aware of them? The very idea seems crazy and surreal to a coder or chip designer, but this sort of thing happens all the time in biology. In fact, it’s going on right now, in the background of those papers that are coming out this week, and we’re just going to have to learn more about what we’re doing before we can make some more sense out of it all.

Like Ash in his post, I still welcome the Valley types, and I welcome the big-data handling and the automation and all the rest of it. Some of this is surely going to help, and we need all the help we can get. But it’s worth remembering that sometimes the end result is going to be generating ever bigger piles of puzzling results, even more quickly than anyone ever has before. I just want folks to be braced for that, mentally, organizationally, and financially.

32 comments on “Clouding Biology: That Verb Cuts Both Ways”

  1. box_disappeared says:

    Reminds me of a nice recent commentary in Science:

  2. KevinH says:

    “…the rarely-if-ever seen diagnosis of inverse paranoia”

    Alan Nelson wrote a short and amusing bit of science fiction back in 1951 titled _Narapoia_. The protagonist suffered repeated delusions of a particular nature.

    “…I keep thinking that I’m following someone. … I keep having this strange feeling that people are plotting to do me good.”

  3. Stu West says:

    I saw a discussion about what the term should be for software that works by accident a few months ago:

    Actually, this could be useful for the next time someone who designs radios for a living decides they’re going to show the world how to do drug discovery right. Just tell them, “Imagine trying to fix malfunctioning software that consists of 99% undocumented behaviour, runs on non-standard hardware and is written in a language you barely understand…”

    1. hacker says:

This sort of thing does happen in software. A well-known (to the experienced) example is code that breaks after an informational print statement is removed: the removal exposes a bug that previously overwrote its data, in a benign way, into the print statement’s stack memory, but now overwrites something required for correct execution.

      I can imagine more complex and obscure “enabling bugs” (as both classes and more specific examples) that could cause this sort of thing, particularly for complex machine learning software systems that use many input data sources (particularly stochastic sources). It wouldn’t be much of a stretch to compare these to complex biological regulation networks (particularly as some of them are simulations of complex biological regulation networks…).
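The “enabling bug” pattern above can be sketched in a toy simulation. This is not real stack memory, just a hypothetical model of the layout: a buggy loop writes one slot past its buffer, and the stray write is harmless only while an unused logging scratch slot happens to sit between the buffer and a value the program actually needs.

```python
# Toy model of an "enabling bug" (all names hypothetical): an off-by-one
# write that is benign only because a debug-print scratch slot happens to
# sit next to the data the program needs.

def run(with_log_buffer: bool) -> int:
    buf = [0, 0, 0, 0]    # 4-slot buffer the code is supposed to fill
    log_scratch = [0]     # scratch slot used only by a debug print
    checksum = [42]       # value required for correct execution

    if with_log_buffer:
        # Layout with the debug print still present:
        # [buf..buf+3][log_scratch][checksum]
        memory = buf + log_scratch + checksum
    else:
        # The debug print (and its scratch slot) was removed:
        # [buf..buf+3][checksum]
        memory = buf + checksum

    # The bug: writes 5 slots into a 4-slot buffer (off by one).
    for i in range(5):
        memory[i] = i

    # The "correct result" depends on the last slot (the checksum).
    return memory[-1]

print(run(with_log_buffer=True))   # stray write lands in log_scratch
print(run(with_log_buffer=False))  # stray write clobbers the checksum
```

With the scratch slot in place the program returns the intact checksum (42); remove the print statement’s slot and the same off-by-one write silently corrupts it.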

    2. Old Pump kicker says:

      To extend your analogy to drug development, you can’t fix the software by removing bad code. You can only add new code to try to block execution of the bug, but it could also block other, correct code which looks similar to the buggy code (based on definitions of similar which may not be known).

  4. Julien says:

    Reverse paranoia is called “erotomania”.

What bothers me the most in biology is: how robust is my protocol? What happens if I change parameter x by, say, 10%? Most often the yield goes from whatever to 0%… Not that robust.

  5. Earl Boebert says:

    We used to call it “feature discovery.” If the program did something off-spec that was interesting, you added that characteristic to the marketing brochure. And of course there’s the old axiom that a program that is written in the absence of a specification can never be wrong, only surprising.

I would be careful about swallowing too much of the big data snake oil. The robot that is munching its way through a cloud of bits has a world-view that is bounded by what’s in the cloud. This prevents it from noticing something that *should* be there but isn’t — AKA the “Silver Blaze Effect”: the dog that did nothing in the night-time. A robot can detect the outline of an elephant in a forest, but it takes a human to smell a rat.

    1. Some idiot says:

      “A robot can detect the outline of an elephant in the forest, but it takes a human to smell a rat.”

      Thanks! That one made my day!!!


  6. JB says:

Derek, I think you’re conflating two different claims. My view of the automation/cloud aspect is that it will standardize the experimental variables and data tracking so that artifacts of biology can more cleanly be separated from artifacts of procedure. People often joke that experimental results sometimes depend on the phase of the moon (or, less jokingly, the season, i.e. ambient temp/humidity). ECL isn’t far off from tracking all of that metadata and making it usable – # of times a column has been used, last calibration time and result, lot #s, etc. How many times have you talked about a synthetic reaction only working with a certain lot due to trace contaminants, and that effect only being discovered through hard work and/or serendipity? What if the effect was reported in a notebook by someone who hasn’t worked there in 8 years, so there’s no way to track down all those minutiae?
    I think those are the useful parts of the cloud and big data- removing or recording variables we can control so we can better understand the ones we don’t yet know about. Now, I don’t know if the venture people are limited to that more realistic view or if they really do have delusions of automated crystallography solving every cellular interaction within 5 years, but I’ll take the useful things I can get from the $100 bills they’re setting on fire.

  7. Matthew says:

    When you try to apply biology to hardware, you get the opposite effect.

  8. Rule (of 5) Breaker says:

    Big data will prove to be another useful tool in the drug discovery toolbox, but hardly the earth-shattering breakthrough some (mostly those in you know, big data) make it out to be. The problem is the unknown unknowns if you will, which tend to be more of a problem than the known unknowns. I am all for big data to find out what light it might shed, but let’s not hold our breath here. History is littered with examples of moderately useful “stuff” that was at one time or another touted as the next huge breakthrough. siRNA anyone? How about the zillion drugs we were going to get from kinase inhibitor efforts? Or maybe the flood of NME’s that were going to come out of epigenetic research. All have utility, but none have been the game changer they were originally made out to be. Big data useful tool – OK. Big data a revolutionary milestone in drug discovery – meh.

    1. Andre says:

I could not agree more with this analysis. We may want to add “organs-on-a-chip” and “mRNA-based therapeutics” as two further examples of “revolutionary tools” in drug discovery….

  9. Peter Kenny says:

This quote may be relevant to the discussion:

    “Open Source Pharma (OSP) is a concept inspired by the Linux model of operation.

    In brief, crowdsourced, computer-driven drug discovery; IT-enabled clinical trials with open data; and generics manufacture.

In four words, Open Source Pharma is “affordable medicine for all.” In three words it is, “Linux for drugs.”

    Adapted to tackling important public health challenges, it hopes to catalyze radical change in the way we do medical R&D and deliver better and more affordable innovation quicker and cheaper to patients”

    1. GCash says:

      Wow, that pretty much wins the buzzword bingo of the month.

      The “Linux model” only works because it’s a very straightforward subject (the Linux OS) needing very little resources (net connection & medium sized PC) to work on. It works because the Linux kernel is a well-designed, well-documented codebase, overseen by a “benign dictator” (Torvalds) who IS NOT afraid to call shit code “shit code” in public and refuse to incorporate it.

      People don’t realize this last point is probably why Linux has succeeded where other large software projects have foundered.

      In contrast, I can’t see any sort of drug discovery or clinical trials taking place without any labwork at all.

      And the “Linux model” doesn’t always work, even for software design. Look at the OpenSSL shambles, where disastrous bugs like Heartbleed have only emerged after decades of being able to look at the code.

      1. tangent says:

        FYI, there are lots of different social/organizational structures for doing open source, and having someone who likes to rant about “shit code” is certainly not a necessity.

  10. anon says:

Computer hardware and computer software do suffer from that sort of thing. People sometimes spend weeks or months finding a bug. In fact, most software developers spend most of their time on debugging.

    I think the solution for medical sciences is that MS and Ph.D. scientists should learn how to code, visualize data and analyze it. There is a gap between people who are software oriented and who are working at the bench and unaware of the computational results.

  11. steve says:

    I think the major difference between biology and engineering is error. Biology thrives on error – mutations are the driving force in evolution. If an engineer had designed biological systems I doubt very much that he or she would have thought that huge error rates and variability from biological unit to biological unit were a desirable feature; they would have tried to eliminate such messiness. That’s why, as mentioned above, most software developers spend most of their time on debugging and why software development is not the same as biology.

  12. Dr. Manhattan says:

    “I think the solution for medical sciences is that MS and Ph.D. scientists should learn how to code, visualize data and analyze it.”

    Many biologists have in fact done that, and among the younger scientists, there are quite a few who are quite good at coding. But I am not sure what you mean by “a solution for medical sciences”. Analysis is great for looking at specific large data sets (RNA-seq, as one example), but if you are talking about the overall process of drug discovery & development, the answers that are needed don’t drop out of analysis of “big data”. Wish it were different, but it’s not.

  13. a. nonymaus says:

    A glossary:

    The cloud – Computers owned by somebody else with their own agenda. To use them, you’ll pay with either cash, your data, analytics about how you use your data, or usually all of the above.

    Crowdsourcing – Asking other people to do unpaid work for you.

  14. There are examples of unplanned benefits in the early days of computer hardware. The IBM 701/704/709/7090/7094 series of computers had instructions with (I am simplifying) 15 bits of memory address and 21 bits that specified what to do. For example, there were 4 or 5 different instruction codes that caused various selections of bits from the central registers to be stored into the addressed location. It was discovered that a variant of these instructions, not contemplated by the machine’s designers, would reliably store zero into the addressed location.

    1. Earl Boebert says:

      I was told that the first bootstrap command (read input and branch to the location where stored) was discovered and not designed in the IBM 701.

  15. Zach says:

    Good post. A few interesting stories come to mind, though, about baffling technology. First was the poor soul trying to troubleshoot why someone could not send email more than 500 miles.

Second, someone told a device to program itself, using processes very close to Darwinian natural selection. It developed an incredibly efficient program using less than half of the logic gates available to it, and some of them weren’t even connected to any other gates, but the program stopped working if any of them were disabled.

    Third is the “magic/more magic” switch on the side of a computer that only had one wire coming from it, that crashed the computer if it was switched to “magic”.

  16. Zach says:

I tried to post a few interesting stories about bewildering technology, but I don’t see them. Did adding links possibly set off a spam filter?

  17. Sok Puppette says:

To add to the things people are saying about programming: in a modern program, your code’s behavior depends on that of many layers of libraries, language runtimes, system services, remote services, the operating system, even the hardware. Those things can and do do things that make your program work, in ways that you don’t understand and that seemingly innocent changes in your own behavior can sometimes break.

    That library you linked against does indeed contain giant classes of subroutines you don’t know about, and some of them are probably actually getting executed because of stuff you’re doing. And, yes, the runtime system that supports your program is doing things in assembly language that you have no clue about. Most programmers don’t even have the training, or sometimes the talent, to understand all the code that’s supporting them even if they go look at it.

    Each PART of the system was engineered, but that doesn’t mean that any one person remotely understands the WHOLE system. It’s not remotely the mess that biology is, especially because the hidden stuff usually actively tries not to surprise people who don’t know about how it does its job. But don’t kid yourself that it’s simple, that weird unknown interactions don’t come up, or that anybody writing a program has a complete understanding of everything the system is doing to make that program run.

    1. Mark Thorson says:

      Richard Stallman does, but after he dies you’re on your own.

  18. gippgig says:

    Look at neural networks. You don’t program them, they learn from examples. Not even the people who designed them understand how they work, but work they do – a neural network just crushed the European Go champion.

  19. chiz says:

Inverse paranoia is common among the religious. That chance encounter with a stranger in the cafe who turned out to have a solution to some problem you had wasn’t just a chance encounter but was arranged for you by some deity, etc.

  20. watcher says:

Having read many articles from Forbes over a long period, I’ve decided that most of the contributors know very little about the actual process and activities involved in discovering and making new drugs. Few of the writers have ever worked as scientists in the drug business; at best they might have a degree in a scientific area, and hence declare themselves experts, give opinions, and be paid accordingly. Seems comparable to all those folks working in “BIG BANKS” who make nothing, provide no real service, but insist on being in the 1% by age 30.

Forbes has both extremely well-versed and utterly clueless writers who venture into the biopharma space, and it is grossly unfair to the former to lump them with the latter. Steve Dickman, who wrote this piece, is an active consultant and VC in the biotech space. Steven Salzberg is an academic scientist who developed many procedures for assembling and analyzing genomes. Matthew Herper doesn’t work in the space, but has invested a lot of time understanding drug discovery, and it shows. John LaMattina is a former Pfizer R&D executive. Almost certainly a few more.

  21. Kaleberg says:

    Some years back I was on a team to do what we called a ‘software inventory’. We were hired to write a specification for an airline’s flight planning system, one currently in use. We found all sorts of amazing stuff. For example, no one working there knew that there was an interface for estimating fuel usage for test flights i.e. take off, fly around for a certain time, land safely. There was also history recorded in the code. For example, for certain aircraft types on a certain runway at the old Hong Kong airport, the program calculated takeoff thrust as if the aircraft was some tons heavier than listed. Luckily there was an old timer around who remembered the crash that inspired this. Aviation has only been around for a century or so. Biological systems have been around for billions of years. I imagine there is a lot more of this kind of thing buried in there.

  22. William says:

    I generally agree with this post, and in general I think the attitude that biology can be treated like software development (or other forms of engineering) is silly and ignores reality. Even the chips that software run on need to be designed in a manner quite different than almost all software.

    I’m not a biochemist, but I do happen to be a chip designer. And it turns out that we do pretty regularly find unexpected effects that cause correct operation (as well as incorrect operation)! Despite the fact that we treat transistors as “digital gates” that are on or off, they are in fact analog devices with some pretty weird properties. I have, several times, discovered that the way we designed a circuit shouldn’t have worked, but some of the unexpected complexity of the circuit interactions causes the circuit to work anyways. This can be especially surprising, and unfortunate, to discover when a circuit that used to work in an old manufacturing process suddenly stops working in a new one!

    I believe that when Intel first discovered strained transistors, they didn’t know why they worked. If I recall the story correctly, it was only after using them for nearly a full process generation that they discovered the underlying mechanism that made the transistors behave so much better than they had expected. Chip design is really quite a messy business!

    But these examples are exactly why I can appreciate how difficult drug discovery must be! Modern transistors are messy and are real world examples of many quantum effects that we can’t ignore any more! And yet I can still, fairly accurately, simulate at least small circuits. I haven’t really heard of something equivalent for biochemistry yet. Not that works at a meaningful scale. And if I can’t imagine reverse engineering a large chip (which was designed by people) how could I possibly imagine that drug discovery could be anything even vaguely resembling simple?

I really don’t understand how Andy Grove, from a similar background, has ended up with as… misinformed an opinion as he has.
