In which I am bested by an algorithm

In many B-movies, machines try to take over the world. And in real life, we often joke about losing our lab jobs to them. As a case in point, three of my five years in graduate school were largely consumed by sequencing a few megabases of DNA. After performing the radioactive chain-termination reactions, I’d carefully clean and tape up the big glass plates, prepare the fresh polyacrylamide gel and run the samples through. When the dye ran off I’d dry down the gel and put it on film. Every day, there was also the previous evening’s film to develop and then – the worst part, as far as I was concerned – staring at this film, with its ladders of tiny horizontal black dash marks, and entering the sequences manually into the computer, running one finger upward as I went so I wouldn’t lose my place. After those three years, the G, C, A and T keys on my computer were visibly faded compared to their neighbors.

Today, the latest production-scale sequencers can analyze millions of base pairs of DNA in less than a day.

Sobering, isn’t it? Viewed in this light, machines become the things that free up our time for work that is a lot more creative. Back in the ’90s, the only way to understand viral evolution was to roll up my sleeves and sequence strains of the same virus over and over, documenting the mutations that occurred in a population during the course of infection. But imagine if I’d been able to see the lay of the land in a few weeks, and could have spent the next three years probing the question in more meaningful ways. How much more would I have achieved?

Of course one could argue that every generation has its tedious chore – for every megalithic task that has become routine, the latest state-of-the-art technique consumes our time in much the same way. I guess I’m living through this right now, as I do high-throughput RNA screens using only minor automation on five-year-old platforms and process hundreds of thousands of images visually. But I’ve seen the future (and have unsuccessfully dodged its o’er-enthusiastic sales reps): wardrobe-sized machines with robotic arms that do your entire screen with no human intervention. Indeed, they practically make you a cup of tea and nip down to the shops to pick up your dry-cleaning.

As far as the automated image analysis goes, it’s still early days with regard to the features that I’m interested in. But I found last week that the future lurks around that corner too. I can’t go into any detail at the moment, but let’s just say I’m involved in a collaboration with some computer scientists who are keen to try out their image algorithms on something completely different. Enter my dataset. We started simply, choosing one morphological characteristic of interest: the tendency of some of my gene knockdowns to turn fried-egg-shaped cells into vaguely geometric objects. Although the human brain can see these differences instantly, it turns out to be surprisingly challenging for a computer to make the same call. But this new algorithm, after a few rounds of training, managed the task with about a 96% success rate. And when I nipped over to the CS department to see where it had gone wrong, I was mortified to find out that my eye had misclassified some fried eggs as triangles, and the programme was actually 100% correct.

OK, this was embarrassing. But it was also fascinating, because the algorithm wasn’t looking at the same thing that I look at. All the pixel intensities had been isolated, plotted on polar coordinates, snipped into patches: strange random-looking patterns of light and dark that would not be out of place hanging in the Tate Modern. I am quite sure that whatever my visual cortex is doing, it is not seeing the data in this warped form. But nevertheless, the correct answer emerged.
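I can’t show you their actual algorithm, and everything below is my own invention rather than theirs – but just to give a flavour of that polar-coordinate trick, here is a toy sketch in Python. It resamples a grayscale image onto a ring-by-angle grid around the cell centre, then uses one made-up feature (how much the intensity varies around each ring) to separate round “fried eggs” from angular shapes:

```python
import numpy as np


def to_polar(image, n_r=16, n_theta=32):
    """Resample a square grayscale image onto an (r, theta) grid centred
    on the image centre, using nearest-neighbour pixel lookup."""
    h, w = image.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    rs = np.linspace(0.0, r_max, n_r)
    thetas = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    r_grid, t_grid = np.meshgrid(rs, thetas, indexing="ij")
    ys = np.clip(np.round(cy + r_grid * np.sin(t_grid)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + r_grid * np.cos(t_grid)).astype(int), 0, w - 1)
    return image[ys, xs]  # shape (n_r, n_theta): one row per ring


def angular_variance(polar):
    """A toy shape feature: mean intensity variance around each ring.
    A filled circle is constant along every ring (variance near zero),
    while an angular shape crosses in and out of the boundary rings."""
    return float(polar.var(axis=1).mean())
```

On synthetic test images, a centred disk scores near zero on this feature while a square of similar size scores clearly higher, so even this crude statistic can make the round-versus-geometric call – which is roughly the spirit, if surely not the substance, of what they did.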

I’m not looking for a new job just yet. But I might think about more creative ways to fill my time.

About Jennifer Rohn

Scientist, novelist, rock chick

23 Responses to In which I am bested by an algorithm

  1. Richard P. Grant says:

    Indeed, they practically make you a cup of tea and nip down to the shops to pick up your dry-cleaning.
    I’d buy one of them.

  2. Bob O'Hara says:

    Yay, another success for statistics! One of the early papers in this field was called “The Statistical Analysis of Dirty Pictures”.

  3. Jennifer Rohn says:

    Is this a joke, Bob? Funny you should mention this, because my collaborators’ usual types of dataset are far more worldly than my humble cells.

  4. Richard P. Grant says:

    Now I have a mental picture of unshaven and greasy cell biologists accosting young, impressionable grad students outside the confocal lab …
    Psst. Do you want to see some special pictures of HeLa cells?

  5. Raf Aerts says:

    Although the human brain can see these differences instantly, it turns out to be surprisingly challenging for a computer to make the same call.
    Reminds me of that reCAPTCHA paper that was published in Science earlier this year (DOI). Apparently, our brain is better at recognizing words in scanned text than the best OCR algorithm there is today.

  6. Jennifer Rohn says:

    Or perhaps Page Three cells, with super-enhanced organelles.

  7. Richard P. Grant says:

    ‘Would you like to see my mitochondria?’

  8. Jennifer Rohn says:

    Raf, that’s interesting. Is this related to the tendency of the brain to fill in gaps? Whenever someone else proofreads my writing, the most common mistake I seem to make is leaving out the word ‘to’. Even though I know I do this, and I am an excellent proofreader, I can’t seem to force my brain not to add in what my eye fails to see is missing.

  9. Richard P. Grant says:

    No, wait–
    ‘Is that a 30 nm filament or are you just pleased to see me?’

  10. Brian Derby says:

    Image analysis has a long history in materials science and was tedious in its early days too. Now we have a number of image correlation tools that are routinely used to map differences (and hence displacements) between two images at different time points.
    Image analysis algorithms to distinguish shapes and other morphological features have been used in many non-conventional areas in the past. A colleague (sadly killed in a car accident) used image analysis software to identify the subtle characteristics of modern art styles from Cubism to abstract art. I remember a talk in which he showed Braque having a distinct tendency to include linear features with a top-left to bottom-right diagonal – as seen in this obvious image below.

  11. Jennifer Rohn says:

    That’s really interesting, Brian. It also reminds me a lot of my cells — we aren’t just looking at individual cells (which might be an easier problem), but at a carpet of cells all growing together with no spaces in between. There is also quite a bit of variation even in the untreated cells. When you look at these fields of cells, you get an instant “feeling” about the morphological trends. But when I was trying to explain this to my collaborator, I realized that I didn’t know precisely which characteristics produced that feeling and led me to classify an image one way or the other.

  12. Henry Gee says:

    wardrobe-sized machines with robotic arms that do your entire screen with no human intervention. Indeed, they practically make you a cup of tea and nip down to the shops to pick up your dry-cleaning.
    Not even my iPhone does that, but then it’s a lot smaller than a wardrobe.
    Too much technology can be intoxicating. When I was doing my PhD and learning about multivariate analysis, I produced lots of analyses, graphs, plots and tables because I could, rather than thinking rigorously through my hypotheses – what, precisely, was I trying to find out? My supervisor, who’d been part of a pioneering effort into multivariate morphometrics back in the 1950s, told me that it had taken days, if not weeks, to do a single ANOVA – using mechanical calculating machines. Time was too precious to waste chasing wild geese.

  13. Jennifer Rohn says:

    Very true. We do screens because we can, and because they are likely to throw up something of interest. But I confess I am looking forward to the post-screen phase when I actually choose a few interesting candidates and start working with actual hypotheses again to figure out whether they are biologically important.

  14. Richard P. Grant says:

    I’m at that stage myself, Jenny. I have lots of data—too much, actually, yet somehow not enough.
    My current ‘robot’ consists of a grad student, who’ll do anything to get a thesis. Hypotheses are cheap in this world; one after another rises only to be struck down by her experiments.

  15. Jennifer Rohn says:

    Is that a gamble, for a PhD student? Is the mass of data guaranteed to yield something of interest, or is it a bit dicey? I ask because lots of PhD students in our place seem to be doing screens, but I recall in ye olde days of graduate school, some of these projects took four years and yielded not much, so post-docs were the preferred robotic platform.

  16. Richard P. Grant says:

    She actually has a shedload of results from other aspects of the project, so even if we don’t find out what this bloody protein actually does she’ll have enough other stuff to be going on with.
    It’s ‘my’ array, and yes, you’re right: this is post-doc work. I keep spinning things off for her to check out though as there is too much for just one person.

  17. Jennifer Rohn says:

    When I was a student I had a collaboration like this with one of the post-docs. Far from feeling put out, I was flattered to be of assistance and it really did a lot for my self-confidence, even though it never made it into my actual thesis.

  18. Eva Amsen says:

    Too much of this post and these comments are related to my thesis project. I don’t even want to talk about it right now. Which is unfortunate, because I have to edit my thesis before Monday so I can get December’s tuition money back.

  19. Jennifer Rohn says:

    Eva, didn’t you have your viva on Tuesday?
    How did it go? Or should we not ask?

  20. Richard P. Grant says:

    ‘Dr’ Eva to you, I believe…

  21. Cath Ennis says:

    Yup, according to Facebook anyway. Congrats again Eva!

  22. Jennifer Rohn says:

    Voel je verschillend? (Do you feel different?)
    Eva, the loose ends bit is always the worst part. You think it’s all over…but no. Just get it out of the way as quickly as you can, and one day you’ll look back and laugh!

  23. Eva Amsen says:

    “How did it go?”
    I don’t really remember much of it, but everyone said I did great. There will be a blog post once I can string together more than 3 sentences. I honestly don’t think I’ve been able to write anything past that length since then.