In which the blind see

What is truth? After long stretches of time in the lab, I often wonder if we’ll ever really know. Science can be loosely defined as the search for truth, but this goal is often illusory. Every experiment we do hides a truth, but the problem is recognizing it when we see it – and more importantly, realizing when, despite all appearances, we actually don’t.

Seeing, you might point out, is believing. Yet as the Hare brothers wrote in their book of essays, Guesses At Truth (1827), “Though, of all our senses, the eyes are the most easily deceived, we believe them in preference to any other evidence.” To get around this problem, some sciences rely on the generation of numbers to reveal truths, and it would be a poor scientist indeed who could persuade himself that 3 is less than 2. But in my particular discipline, cell biology, we are frequently reliant on what the eye can see. Some visual attributes can now be measured in a cold, unbiased fashion by automated image analysis: is X bigger or smaller than Y? Is A brighter or dimmer than B? But for the work I do – trying to understand the ins and outs of why and how cells take up the shapes they do – the questions are far subtler.

The cell is a tiny machine. We can stain it, photograph it, separate its contents like an auto mechanic disassembling an engine into a hundred pieces onto the concrete floor of a garage. We can sometimes be awed by the sheer beauty of the visual output of a cell under the microscope, enhanced by the emerald, ruby and sapphire stains of our fluorescent probes. We can marvel at the feathery protrusions, the intertwined cables, the spidery networks of a cell caught in the act of behaving, preserved for all eternity in a frozen snapshot by our fixation techniques. And we can be bamboozled by the sheer variety: in a field of a hundred cells, all of which are supposed to be identical clones, we can be treated to an infinite array of subtle variation from one cell to its neighbor. Biology’s minute-by-minute interpretations of the genetic instruction manual, and the influence of local environment, can lead to a vast array of possible outcomes.

Awe and appreciation, and even temporary bamboozlement, are fine. But then we have to work out what it all means. Of course our expectations of cell appearance and behavior do not occur in a vacuum. There will be a vast body of general knowledge on this particular topic stretching back for decades and influencing our ideas. And as for the specifics, most observers like me will have been on the hunt tracking an elusive truth for months, if not years. Our modus operandi is pure objectivity, but I don’t know a scientist alive who, like me, doesn’t occasionally get caught up in the drama and excitement of an unproved theory. We shouldn’t want something to be true, but that’s exactly what happens sometimes. Because if our theory is true, suddenly all the pieces will fall into place, forming a picture of exquisite beauty. (Some might even be subconsciously motivated by the practicalities of such resolution: the bolstering of a CV, the enhancement of a future prospect, the funding of the next grant – all of which become more likely when another paper is published.)

And this is where the phenomenon of observer bias steps in. It is well established that when the assessment criteria are subjective, a human being can see what he wants to see. So often I am looking at a field of one hundred cells under a microscope, wanting to know what percentage look “normal” versus perturbed in some fashion. Then I tweak the experiment and see if this proportion changes as a result. At the moment, I’m fixated on spiky cells: cells that abandon their vaguely fried-egg shape to take on a starfish-like appearance. But if you think this is a black and white thing, step into my lab and I’ll show you an infinite gradient of shapes between “fried egg” and “starfish”. You would, I am sure, have just as hard a time as I do scoring cells that fall into the grey area – especially if, in the back of your mind, you were being supremely unscientific and hoping that your particular tweak might lead to a change in the percentage of spiky cells.

When you get to this stage, you have a problem. Most scientists overcompensate for observer bias: they think, “I want there to be more starfish shapes, so I will downplay the evidence that there really are more this time.” Which does almost as much harm to the truth as letting your desires sway you the other way. The strategy that I, and many other scientists, use to get around this is to code the samples so that you actually don’t know which is which: in researcher parlance, you are blinded. Suddenly, you are released from the awful obligations of desire and expectation, and your blind eyes can just see, as best they can, what is actually there. Believe me, it’s an incredible relief, and I travel blind whenever I can.
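
For anyone who prefers to see the bookkeeping spelled out, here is a minimal sketch of what coding the samples can look like. It is only an illustration, not the procedure used in my lab: the function names `blind_samples` and `unblind` and the sample names are hypothetical, and in practice the key would be held by a colleague (or sealed away) until scoring is finished.

```python
import random


def blind_samples(sample_names, seed=None):
    """Give each sample a random code number.

    Returns (codes, key): `codes` is the sorted list of code numbers the
    scorer sees; `key` maps code number -> real sample name and should be
    kept away from the scorer until scoring is complete.
    """
    rng = random.Random(seed)
    codes = list(range(1, len(sample_names) + 1))
    rng.shuffle(codes)  # random pairing of code numbers with samples
    key = dict(zip(codes, sample_names))
    return sorted(key), key


def unblind(scores_by_code, key):
    """Translate scores recorded against code numbers back to sample names."""
    return {key[code]: score for code, score in scores_by_code.items()}


# Hypothetical sample names, for illustration only.
names = ["negative_control", "tweak_A", "tweak_B", "tweak_C"]
codes, key = blind_samples(names)

# The scorer records, say, percent spiky cells against each code number...
scores = {code: 0.0 for code in codes}

# ...and only afterwards is the key used to reveal which sample was which.
results = unblind(scores, key)
```

The only design point is the separation: the person scoring works from `codes` alone, and the `key` is consulted once, at the end.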

Some scientists might deny that what I am talking about ever troubles them. And perhaps it doesn’t: perhaps there are people out there who don’t agonize over observer bias, who just take their readings in blissful ignorance. In fact, I often suspect that in a significant fraction of the sporadic instances of scientific fraud that make the news, subconscious observer bias might be the driving force, as opposed to conscious, and malicious, intent to deceive. I think it’s important, though, to admit that we as scientists are as human as anyone else, and to take the appropriate precautions whenever we are studying material with subjective output. Our natural tendency towards observer bias might be a dirty little secret in the trade, but I believe it should be brought out into the light and discussed, and its tenacious propensities be revealed to our young trainees as soon as possible. Because more important than the narrative we weave around our work to help guide our experiments is the actual truth that underpins it – one way or the other.

About Jennifer Rohn

Scientist, novelist, rock chick

12 Responses to In which the blind see

  1. Eva Amsen says:

    GAH, flashbacks to the lab!
    My boss wanted the blinding to be done by teaching a poor technician in her other lab (who was on the same grant as me, I guess) to do my entire experiment. I fought that because I wouldn’t trust her with ALL my work, and she didn’t have time to do my 14-hour experiments, so in the end we did it this way:
    There were about 24 knockdown constructs, against 11 genes (some had only one, others had two or three). They all had a number from the supplier, but I knew what some of the numbers were. I gave those numbered tubes to the technician, who did NOT know which number was which gene/construct. The tech diluted them all to have the same concentration, pipetted them to new tubes, and gave them random numbers from 1 to 24. She kept the list of which new number corresponded to which old number, and gave me back the tubes. For practical reasons, I did know which sample was the negative control, because I had to make sure that was on every slide as comparison, but I didn’t get the key until I was ready to stop experimenting and write it up. I even gave an entire talk in which I had no clue which sample was which, and just showed that numbers so and so gave a result.

  2. Jennifer Rohn says:

    Your solution was better, because the technician would not have been properly blind.
    Fortunately I’ve reached an age where I no longer require help for the double-blind; my memory is so poor that I’ve forgotten what I’ve transfected into what wells about half an hour after doing it. My experiments are usually 4 to 7 days in duration, and I’ve got 3 or 4 on the go simultaneously, so there’s no hope at all that I will recall where I put the negative controls by then. (I do always vary their positions). I definitely don’t want to know which one is negative, since it tends to have a certain percentage of baseline spikiness.
    I’d like to teach the computer how to score them, but I don’t think it would be able to cope with the subtle variations.

  3. Richard P. Grant says:

    Jenny, I remember you talking about a shiny new piece of kit in your lab that gave a string of numbers instead of pretty pictures of cells. Do you think that automated, computational processes might help us around observer bias–especially as we get better at designing algorithms to do such stuff?
    “Shortly after, the human era will be ended.”

  4. Jennifer Rohn says:

    Definitely. But at the moment, in my opinion, the algorithms aren’t very good at textures. There are some good people working on it, but the best lab in the world had a crack at my dataset and wasn’t able to ‘see’ the trickiest parameters.
    Another potential problem is that for machine learning, the algorithms have to be trained by a human operator. Some bias might creep in that way as well.

  5. Eva Amsen says:

    Algorithms don’t like when cells touch each other. Cells looooove touching each other.
    Re: the negative controls: I needed them to standardize the background variation, so had to put the same one on each four-well slide. If you’re bored or can’t sleep, the details are all in here:
    Ha, re-reading that abstract, I forgot (suppressed?) that the automated image analysis only picked up one of the two I found with my human eyes. Human eyes FTW!

  6. Jennifer Rohn says:

    In the interests of fairness I should report that some algorithms did better than I did. When I went back to check some ‘false positives’, I found that I had actually missed the parameter during my own perusal.

  7. Richard P. Grant says:

    I don’t see why algorithms, eventually, can’t be as good as or better than our eyes. The trick is to figure out the computational processes we perform, I guess, but I am not a neuroscientist, so YMMV.

  8. Jennifer Rohn says:

    Yes, that’s why I said ‘definitely’. We’re a long way off though, in my opinion. With any luck my grandchildren won’t have to score cells under a microscope, leaving more time for doing futuristic things like flying around with jet-packs.

  9. Henry Gee says:

    I had an algorithm once. It was green. Well, I thought it was green. Everyone else was convinced it was blue. I think the moral of this story is to stay well clear of algorithms.

  10. Jennifer Rohn says:

    What did your GP recommend?

  11. Henry Gee says:

    She recommended that I lie down in a darkened room. And to stay away from algorithms of all kinds.
    But seriously, what you have here with the smooth-versus-spiky cell dilemma (and I can quite see why you’re entranced by them, they’re beautiful) is that you are trying, perhaps, to quantify a continuum – well, not a continuum, but to separate two classes of object, both of which show some degree of variation. I expect that this is a problem all scientists have to confront.
    Years ago when the world was young my Ph.D. problem was to come up with ways to separate the leg bones of fossil bison from the leg bones of fossil cattle. Collections from the ice age are full of these, but nobody has really known how to tell the two apart, which is a shame, because the two creatures have different ecologies and so on and so forth, and knowing the difference could tell us a lot about the ever-changing panorama of ice-age faunal relations, community structure and whatnot. Well, after years of measuring bones and coming up with some very fancy discriminant functions (always good for telling things apart that are fuzzy at the edges, have you tried them?) I found the best way of telling the difference between a bison bone and a cow bone was entirely analogue.
    When looked at from behind, the distal end of a bison cannon bone looks like the neck of a bottle of claret (with ‘shoulders’), whereas the distal end of a cow cannon bone looks like a bottle of burgundy.

  12. Jennifer Rohn says:

    Henry, that’s very interesting. I would have thought that automated image analysis would have invaded even that field by now.
