In which I can’t say no

The heroine of my third novel has a problem with filing: she is incapable of committing to a concrete decision. Instead of compartmentalizing her papers into simple, broad categories, she has arranged her reprint collection into a drawer of hundreds of folders, each labelled with a highly specific topic. A set of information annotated like this will be rich in details, but it might be hard to find what you want quickly, or to get a bird’s-eye perspective of what the overall set contains.


The elusiveness of the binary

Normally when I write fictional characters, I don’t see the similarities with my own personality until I step back. It was only the other day, when I was struggling over annotating one of my validation screens, that I saw what I shared with my heroine.

It should be easy, right? An image of genetically altered cells either depicts something that looks like wild type, or something that differs from it: in other words, a phenotype. When you look at cells, it should be possible to say yes, this is a hit, or no, it is not. But somehow, the mind slides away from the inevitability of the task. I can’t help but worry that any ‘no’ might somehow be eliminating an important clue forever. The first time I went through this smaller subset of fly genes, I found myself ranking things in a painfully analogue fashion – a system of stars from none to four, supplemented with lots of prose marginalia. And even that wasn’t enough: in that seemingly infinite space between no stars and one, I couldn’t help, in a few cases, writing ‘maybe’. Occasionally further undermined by a question mark.

This time through, on the repeat, I was determined to make a clean decision: score each phenotype, compare with the previous run-through’s star ranking, and then decide once and for all: an overall yes or no for each gene.

Well, needless to say I failed miserably. I’ve managed to be ruthless enough to pare it down to only two stars and no maybes, but I couldn’t quite commit to that binary decision. It’s times like these when I start to see the appeal of automated image analysis. In this procedure, all the parameters become numbers, and you introduce a cut-off, under which the answer is no. Even the pain of choosing the threshold is taken from you – it’s all down to statistics, which never equivocate or worry about the consequences of false negatives.
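
As a rough illustration of that cut-off idea, here is a minimal Python sketch. It assumes each gene has already been reduced to a single numeric score and that a handful of wild-type control scores are available; the function name `call_hits`, the z-score rule, the default cut-off of 2.0 and all the numbers are hypothetical, not anything taken from the actual screen.

```python
from statistics import mean, stdev

def call_hits(scores, controls, z_cutoff=2.0):
    """Return {gene: True/False}: True when a score lies at least
    z_cutoff control standard deviations away from the control mean."""
    mu = mean(controls)
    sigma = stdev(controls)
    return {gene: abs(s - mu) / sigma >= z_cutoff for gene, s in scores.items()}

# Hypothetical wild-type measurements and per-gene scores.
controls = [0.98, 1.02, 1.00, 0.97, 1.03]
scores = {"geneA": 1.01, "geneB": 1.40, "geneC": 0.60}

hits = call_hits(scores, controls)
for gene, is_hit in hits.items():
    print(gene, "yes" if is_hit else "no")
```

Every gene comes out as a plain yes or no, with no stars and no maybes, although someone still has to choose `z_cutoff`.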

But it’s proving difficult to find any image analysis partners willing to take on my data long-term. I’m already on my third collaborator, but only the other day she broke it off as well – over the phone. It was almost exactly like getting dumped:

“It’s not your dataset – it’s me,” she said. “I’m going in new directions.”

It’s not the size of my dataset that’s scaring them off: it’s the complexity of the textures and structures we want to score. We’ve actually bought a MATLAB license and are thinking about going it alone – but it’s a scary new direction for this squishy biologist.

About Jennifer Rohn

Scientist, novelist, rock chick

68 Responses to In which I can’t say no

  1. Richard P. Grant says:

    It’s the curse of the cell biologist. I once wrote a paper that depended on scoring cells as spread or rounded, and that was tough enough.

  2. Jennifer Rohn says:

    That’s tricky, isn’t it? I mean, how spread is spread? A few tentative protrusions or the full lamellipodial Monty? I feel your pain — retrospectively.

  3. Darren Saunders says:

    How big is your data set and is it just morphology you’re scoring? That picture suggests some fluorescent stains? We have people here that may be able to help; whether they can (or have the time to) I can’t say, but I can suss them out for you with more info…

  4. Jennifer Rohn says:

    Darren: ‘Just’ morphology: I’ve never heard it phrased like that before! We have about 50 vocabulary terms in our morphology annotation protocol.
    The dataset in search of a computationalist at the moment is 3840 samples, three sites photographed per sample, three channels per site (three different fluorescent markers), done in duplicate. That’s about 69,000 images. Each site has about 100-200 cells – so about 10 million segmented objects in total. A drop in the bucket compared to my timelapse screen but still nothing to sneeze at.
    I’d love to hear more – many thanks!
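
The arithmetic behind those figures can be checked in a few lines of Python. This is only a sketch under two assumptions that the comment does not spell out: that every site is imaged in all three channels, and that objects are segmented separately in each image. The variable names are made up for the illustration.

```python
samples = 3840
sites_per_sample = 3
channels = 3        # assumed to apply to every site
replicates = 2      # "done in duplicate"

images = samples * sites_per_sample * channels * replicates

# 100-200 cells per site, counted once per image (assumption).
objects_low = images * 100
objects_high = images * 200

print(f"{images:,} images")
print(f"{objects_low:,} to {objects_high:,} segmented objects")
```

Under those assumptions the image count is 69,120 and the object count falls between roughly 7 and 14 million, which brackets the "about 10 million" quoted above.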

  5. Darren Saunders says:

    Sorry, not to downplay morphology!
    I just sent you a message via email…

  6. Henry Gee says:

    I had an editorial colleague who havered over manuscript decisions until she was almost buried by new submissions. Decisions will always be arbitrary, in the end. My feeling is that you simply have to make a decision quickly, and run the risk that it might be wrong – in the hope that things will always even themselves out in the end.

  7. Richard P. Grant says:

    But Henry, what about the art?
    flounces off, stage left

  8. Jennifer Rohn says:

    Henry, you are so right. Especially in my case – my screen is not to make some Grand Biological Pronouncement, it is just to assemble a shortlist of potentially interesting new genes for further studies. In that case, it’s best to err on the side of caution and have false negatives, not false positives. Because at the end, will it make a difference if you have 20 genes to follow up instead of 15? Not really – because there will only be time to focus on the top two or three anyway.
    At least, that is the stern lecture I give myself every time I sit down at that damned MetaMorph workstation.

  9. Henry Gee says:

    [sympathetic hugs]. I think this problem comes up quite a lot in science, especially messy biology – how to subdivide continua into discrete variables. In my PhD days my task was to come up with a list of features that would distinguish the bones of dead cows (_Bos primigenius_) from dead bison (_Bison priscus_). Those of masochistic tendency can peruse the results. I spent ages wondering whether the fourth trochanter/ glenoid fossa/ acetabular notch/ [select other skeletal feature] was deeper/ shallower/ broader/ more elliptical/ more wiggly in one taxon than another. I must have had hundreds of potential characters but only a handful proved reliable in the end. Ah, the dust, the museum air-con, the smell of organic solvents …

  10. Jennifer Rohn says:

    It sounds exactly the same — except for the solvents. (My depths of Annotation Angst have not yet driven me to sniffing things in the Flammables cabinet, although sometimes it’s tempting.)
    What I really need is an electronic spreadsheet with radio buttons that will ONLY accept yes or no as answers.

  11. Richard P. Grant says:

    Tut, Henry. DOI, please.

  12. Jennifer Rohn says:

    They didn’t have DOIs in the Paleolithic.

  13. Richard P. Grant says:

    They did have pay per view, though. I can’t read Henry’s article.
    Once again, humanity is safe. Thank you, Superman.

  14. Jennifer Rohn says:

    What did it cost you, a couple of bison hides?

  15. Henry Gee says:

    I can has DOI. You can has see it in teh page, O Myopic One.
    10.1002/jqs.3390080107
    It was published in 1993, which in the mayfly minds of cell biologists is coeval with the Pyramids.

  16. Richard P. Grant says:

    What’s the difference between a buffalo and a bison?

  17. Graham Steel says:

    What’s the difference between a buffalo and a bison?
    At the last count, about £1.76 a kilo.

  18. Richard Wintle says:

    Buffalo – thing with horns that lives in Africa. Or extinct variety thereof. Or a bison.
    Bison – thing with horns that lives in America. Or extinct variety thereof.
    All this according to Junior Wintle #1’s National Geographic book of Prehistoric Mammals, as interpreted by me, around about bedtime, yesterday.
    But getting back to Jenny’s ~~delightful~~ problematic images – rather than confusing marginalia, can you not tag each image with a rich set of keywords? Then you can use all the terms you like: round, pointy, shiny, crap, and the like. You could even assemble a defined list of these and arrange them in a nice database with drop-down menus and the like – which would not actually need to also contain the images themselves, just their filename identifiers (although of course a great big database holding the images as well would be ideal).
    From the analysis point of view, I’m sure Darren’s bods would be helpful. People who “do” image analysis (I suggest someone with a lot of experience dealing with MRI scans) eat these kinds of problems for lunch. We currently have one such working on our next-gen sequencing images, which is a fun change from all the conventional molecular biologists around here.
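
One way the keyword idea could look in practice is sketched below, using Python's built-in `sqlite3` module. It is a toy under stated assumptions, not a description of any real lab system: the table names, the helper functions `tag` and `images_with`, and the file names are all hypothetical. Images are stored only as filename identifiers, and new terms can be added to the vocabulary at any time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE image   (id INTEGER PRIMARY KEY, filename TEXT UNIQUE NOT NULL);
CREATE TABLE keyword (id INTEGER PRIMARY KEY, term TEXT UNIQUE NOT NULL);
CREATE TABLE image_keyword (
    image_id   INTEGER NOT NULL REFERENCES image(id),
    keyword_id INTEGER NOT NULL REFERENCES keyword(id),
    PRIMARY KEY (image_id, keyword_id)
);
""")

def tag(filename, term):
    """Attach a keyword to an image, creating either one if it is new."""
    conn.execute("INSERT OR IGNORE INTO image (filename) VALUES (?)", (filename,))
    conn.execute("INSERT OR IGNORE INTO keyword (term) VALUES (?)", (term,))
    conn.execute(
        "INSERT OR IGNORE INTO image_keyword "
        "SELECT i.id, k.id FROM image i, keyword k "
        "WHERE i.filename = ? AND k.term = ?",
        (filename, term),
    )

def images_with(term):
    """Return the filenames tagged with a given keyword, sorted."""
    rows = conn.execute(
        "SELECT i.filename FROM image i "
        "JOIN image_keyword ik ON ik.image_id = i.id "
        "JOIN keyword k ON k.id = ik.keyword_id "
        "WHERE k.term = ? ORDER BY i.filename",
        (term,),
    )
    return [r[0] for r in rows]

# Hypothetical file names and terms.
tag("plate01_A01_s1.tif", "round")
tag("plate01_A01_s1.tif", "pointy")
tag("plate01_A02_s1.tif", "round")
tag("plate01_A02_s1.tif", "round")  # repeated tag is ignored

print(images_with("round"))
```

Keeping the keywords in their own table is what would let a drop-down menu be filled from a defined list rather than from free-text marginalia.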

  19. Richard P. Grant says:

    sigh
    “You can’t wash your hands in a buffalo”

  20. Jennifer Rohn says:

    Richard W: you make it sound so easy. And I suspect you missed my post about this very topic. Read it and weep.
    Yes, in the ideal world I’d have drop-down menus so I didn’t lose control. The problem with that is that it takes going through the entire thing visually to understand the terms required — and every time you go through it, you notice something else. It could go on forever.

  21. Jennifer Rohn says:

    p.s. Richard G., that bison joke only works in Australian.

  22. Richard P. Grant says:

    In Brum, actually.

  23. Ian Brooks says:

    _it’s best to err on the side of caution and have false negatives, not false positives. Because at the end, will it make a difference if you have 20 genes to follow up instead of 15? Not really – because there will only be time to focus on the top two or three anyway._
    My god this is my PhD all over again. Screening flies for a temperature-sensitive mutant phenotype; a variation on rapid TS paralysis. I learned this lesson the very hard way. There’s a blog post in there somewhere…
    Oh, fwiw, I came to NN just now to write a blog post on categorisation etc. You’ve pipped me to the punch. Maybe I can turn it into something for Lablit instead…

  24. Jennifer Rohn says:

    Don’t click on that link, folks. The site is currently down, aside from the main page. Long story, but hopefully normal service will be restored later this evening.
    Ian, I’m tempted to say, isn’t an ion channel just opened or closed, yes or no?

  25. Richard P. Grant says:

    Don’t click on that link, cos it was incomplete. Try LabLit — and it’s live from this angle.
    Jenny: an ion channel is neither open nor closed until someone ~~clicks on it~~ observes it. That’s my story, and I’m sticking with it.

  26. Jennifer Rohn says:

    Yup, fixed in the wee hours!

  27. Richard P. Grant says:

    \o/

  28. Jennifer Rohn says:

    I’ve just realized why I like doing restriction digests. They either cut or they don’t, and the partials are dead obvious too. Fridays are good days to be tangible.

  29. Åsa Karlström says:

    Jenny> I am amazed about how many pics there were to analyze. Then again, it reminds me about the histology comparisons I’m about to look at… that sounds like a walk in the park compared to “slightly round, spread and fuzzy – I’ll give it ***”.
    I, like Ian, focused on “in that case, it’s best to err on the side of caution and have false negatives, not false positives”. I guess it depends on what the point of the assay is. To look for a few markers and see what phenotype they get? Or to fully characterize all the different morphology types you can get? Oh, I guess they might be the same …. hmm…
    Somewhere I remember a sheet with 30 columns for each sample and a yes/no answer that had to be put in the box next to each characteristic. I also remember inventing the ? in there. [And this was bacteria and different biochemistry things too, much easier.]
    However, my experience is that it almost always shows that you need to define the parameters beforehand and then stick to them throughout the whole set. As in, round = a shape where the radius does not vary more than 5% when going around the circle, or something like that. If it doesn’t fit, well, tough, it doesn’t. [hard love]
    I wish you the best of happy thoughts interpreting the data though. And I am quite convinced that my suggestions were nothing new to you. I feel the pain.
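
The "round = radius does not vary more than 5%" rule above can be written down as a small Python sketch. Several things here are assumptions for the sake of illustration: the cell outline is given as a list of (x, y) points spaced evenly around the boundary, the centre is taken as the average of those points, and "vary" is read as (largest radius − smallest radius) / mean radius. The name `is_round` and the test shapes are hypothetical.

```python
import math

def is_round(points, tolerance=0.05):
    """True if the distance from the centre to the outline varies by
    no more than `tolerance` (as a fraction of the mean radius)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    mean_r = sum(radii) / len(radii)
    return (max(radii) - min(radii)) / mean_r <= tolerance

# Two synthetic outlines, 36 points each: a circle and a squashed ellipse.
angles = [2 * math.pi * k / 36 for k in range(36)]
circle = [(10 * math.cos(a), 10 * math.sin(a)) for a in angles]
ellipse = [(10 * math.cos(a), 6 * math.sin(a)) for a in angles]

print(is_round(circle))
print(is_round(ellipse))
```

The circle passes and the ellipse does not. Once the tolerance is fixed in advance, a shape either fits or, well, tough, it doesn't.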

  30. Richard P. Grant says:

    As in, round = a shape where the radius does not vary more than 5%
    Ah… would that all biology was so quantifiable.
    Both in terms of giving in numbers and being able to measure something!

  31. Åsa Karlström says:

    Richard> I was trying to make a point… I mean, isn’t that the beauty of using a computer program that can help instead of manually looking at the cells in a microscope???
    [being obtuse]

  32. Richard P. Grant says:

    You’re not obtuse Åsa, you’re incredibly acute.
    x

  33. Ian Brooks says:

    Ian, I’m tempted to say, isn’t an ion channel just opened or closed, yes or no?
    Do you wanna slap? :p

  34. Richard P. Grant says:

    Your slap-boxing against Jenny’s karate?
    I’m taking bets. And selling tickets.

  35. Cath Ennis says:

    Can an ion channel be slightly ajar?

  36. Richard P. Grant says:

    No, that’s a door, Cath.
    Wait, wrong joke…

  37. Cath Ennis says:

    “My dog’s got no nose”

  38. Richard P. Grant says:

    “Snail varnish”.
    Um, Cath, I think we’re getting stern looks.

  39. Cath Ennis says:

    Is Jenny not OK with this?
    Hang on, Alaska myself.

  40. Richard P. Grant says:

    Jamaica?

  41. Cath Ennis says:

    Terrible!

  42. Jennifer Rohn says:

    I go away to watch The Wire for two seconds and it all goes to pot.
    Actually, Asa, you are absolutely right. Tough love is what is required. What I really need is like those Personal Coaches you can get at the gym – someone to stand over you and tell you exactly what to do, and not to do.
    “Delete that ‘maybe’! Yes, I’m talking to you.”

  43. Åsa Karlström says:

    Jenny: yes, that is the way. I guess that was the prof when you were a lowly undergrad but nowadays there are no PTs outside the gym, just you and your stress level 😉 (I am exactly the same. Well, maybe this would be an interesting dot… hmm… nah… maybe… ehh… that’s why I love the cut off 30% weight loss! It is not to be messed with. Ever.)
    Richard: Jenny said I was right. Nannanannanana ;P

  44. Richard P. Grant says:

    You’re also awesome, Åsa. XX

  45. Henry Gee says:

    A man walks into a bar.
    ‘Ouch’, he says.
    (Courtesy Gee Minima, aged 9)

  46. Richard P. Grant says:

    I see you’ve found it.

  47. Jennifer Rohn says:

    There are definitely still some profs around who push people to do stuff. I think people are scared of me, though! My career path is too weird to mess with.

  48. Richard P. Grant says:

    ‘You are in a twisty maze of career paths, all ~~alike~~ different’

  49. Richard Wintle says:

    Hm. You look away for a few seconds, and it all goes to crap puns.
    But…
    _Richard W: you make it sound so easy. And I suspect you missed my post about this very topic. Read it and weep.
    Yes, in the ideal world I’d have drop-down menus so I didn’t lose control. The problem with that is that it takes going through the entire thing visually to understand the terms required — and every time you go through it, you notice something else. It could go on forever._
    Mm, yes. And I freely admit I did not read your other post on the topic (I figure if I manage about 0.1% of NN blog posts I’m doing well). But the problem you have is that _shudder_ you didn’t have a database person design it from the ground up. Adding keywords as you “discover” new things in your data set should not be an issue, although going back to earlier database entries and applying them where appropriate would be a headache. But none of this should result in the confusing mess of meta-data you describe in that post, if the thing is designed right in the first place. And by “designed right” I don’t mean “postdoc with Filemaker Pro”, I mean “real database developer with loads of relational database design experience, knowledge and understanding of logical and physical datamodels, and l33t SQL chopz, as well as thorough knowledge of the underlying biology”.
    No such animal exists, of course, bison, buffalo, or otherwise. But honestly, if descriptive keywords (round, pointy, looks like Henry Gee’s chicken, etc.) don’t do the job, then you need to apply some “real” metrics instead (like what someone mentioned above, 5% skewness from the obliquity of the ecliptic, or whatever), which is what “real” image analysis bods do all the time.
    All this is terribly hard to implement in the real biological lab world of course, and just thinking about it is making my head hurt, and disabling my usually ~~razor sharp~~ rather lame punning ability too.

  50. Richard Wintle says:

    P.S. “postdoc with Filemaker Pro” was not intended to describe any present company, or anybody that anyone might actually know, living or dead, etc. etc. etc.

  51. Heather Etchevers says:

    How about “skilled but non professional medical doctor with FileMaker Pro”?
    Know anyone or any company offhand who will design such databases on a European lab budget, perchance? We desperately need an overhaul.

  52. Heather Etchevers says:

    Sorry, I meant non-professional in the database design part, not the doctor part. (If she reads this ever.)
    On the other hand, a defined ontology can be a pretty powerful tool. And you wouldn’t really need to peruse the whole list each time when you assign descriptives, as when you type the first letters, if you are lucky in your tool (FM for one), it will fill in the rest of the words, or reduce your choices, as you type.

  53. Jennifer Rohn says:

    God, my kingdom for a tool that would fill in the rest of the term phrase, or a decent drop-down menu you could add to in real time. I didn’t know it was possible.
    I need help.
    {cries}
    Nah, I’m fine really. But if I were doing it again I’d try really hard to get the thing set up properly. No offense to the system we have or the people who put it together – they did a great job considering the tools and time available.

  54. Richard Wintle says:

    I forgot to add that the phrase “no such animal exists” applies not only to the described database developer, but also the mythical database I described… 😛

  55. Jennifer Rohn says:

    Damn, damn, damn!
    My hopes are dashéd.

  56. Åsa Karlström says:

    Richard G> aww. You’re sweet!
    Jenny> it does sound lovely, doesn’t it?! Personally, I think that if nothing else the creation of data bases and key words etc makes it even more crucial to “think the experiment through before you start”. Of course, it is hard to know what kind of shapes you are going to run into before the stuff happens though….
    good luck with all the pics. I am a bit envious, if it helps??? 🙂

  57. Heather Etchevers says:

    Took me a while to get back to this, but (not to advertise for them, since I have a copy of which I’m not particularly proud, if you catch my drift) FileMaker Pro does allow you to make a drop-down list to which you can add progressively, and you can fill in the rest of the term phrase… is this so rare? Richard W? Wouldn’t Access allow one to do the same?

  58. Richard Wintle says:

    Indeed it would, Heather – I think my point was that databases should be designed by database designers, who really understand datamodels and primary keys and the like. Folks like me should not mess around with them (your mileage may vary though).

  59. Richard P. Grant says:

    Databases should be designed by database designers, sure: but user interfaces shouldn’t.
    That’s how you get things like Windows, which were designed by people saying ‘Hey, look at all this neat stuff we can do!’ rather than asking ‘What does the user want to achieve?’

  60. Richard Wintle says:

    Agreed with point one.
    Always with the Windows thing. Used iTunes recently?

  61. Richard P. Grant says:

    I don’t remember the iTunes interface… that’s how good it is.
    And come on. Windows is shit, isn’t it? Quite without regard to anything else.

  62. Jennifer Rohn says:

    My problem with Access and FMP both is that I’ve found it’s too easy to accidentally change a record. Especially the first one!
    To be honest, I gave up on our computerized database and annotated on a piece of paper, then transferred the details to Excel later – from Excel a tabdelimited file could be exported back to the database. With our java-based homemade system there were radio buttons, again very easy to accidentally tick or detick. After obliterating about ten hits, I decided I really enjoyed the pencil and paper interface. Also, when you already have ten windows open, it’s just easier to scribble.
    I know: I’m hopeless.

  63. Pamela Ronald says:

    Hello Jenny
    I am very much enjoying Experimental Heart. It is like reading about a previous life. Now that I have been a professor for 17 years, I no longer drink beer in between experiments and fall in love with the other graduate students (OK, I never really did drink much beer).
    Unfortunately, I don’t think my lab group finds me as cool as Magritte. Nevertheless, I was quite inspired by her character and therefore today offered to help my student and postdoc with an important experiment they are carrying out for a paper that is under revision. I don’t think they are going to take me up on it though, their experiment is too important to risk.
    It seems like you had a great time writing this book – I sure love reading it.
    I can’t wait to read your second and third novels.
    all the best
    Pam

  64. Richard Wintle says:

    @RPG – I actually didn’t mind Windows XP, but Vista really gets on my nerves. Ok, Office 2000-whateverthelatestis is what really irks me, not Vista per se.
    No agreement on the iTunes interface, it’s horrid.
    Jenny – the problem with keeping your data in Excel, and I apologize if I’m stating the bleedin’ obvious (but I’m going with my strengths here!), is that it’s just too darn easy to sort a bunch of columns but omit others, thereby irrevocably #%@%)ing up your data set.
    Not saying how I know this, mind…
    Also, Pamela is right, Experimental Heart was fab. You going to post that ~~excellent~~ rather poorly executed photo I sent? 😉

  65. Richard P. Grant says:

    Ah, yes, photos. There is a lablit flickr account: but it makes little sense until we get the blog running, which depends on trying to find who keeps our CNAME records…
    Yes, Winty: the latest version of Office is unspeakably awful. I’m going to say no more because I quite like the keyboard I’m currently typing on.

  67. Henry Gee says:

    And come on. Windows is shit, isn’t it? Quite without regard to anything else
    I agree with Grant. Completely.
    [Smashes own head against wall, irons own hands]

  68. Oznur Tastan says:

    Hi Jennifer,
    In your very first post, you mentioned ‘the complexity of the textures and structures we want to score.’ Would you mind explaining a little bit more what they are like? I am not an image analysis person, I am just curious..:)
