{"id":649,"date":"2008-05-22T09:48:46","date_gmt":"2008-05-22T09:48:46","guid":{"rendered":"http:\/\/occamstypewriter.org\/mindthegap\/2008\/05\/22\/in_which_i_lose_control_of_my_vocabulary\/"},"modified":"2008-05-22T09:48:46","modified_gmt":"2008-05-22T09:48:46","slug":"in_which_i_lose_control_of_my_vocabulary","status":"publish","type":"post","link":"https:\/\/occamstypewriter.org\/mindthegap\/2008\/05\/22\/in_which_i_lose_control_of_my_vocabulary\/","title":{"rendered":"In which I lose control of my vocabulary"},"content":{"rendered":"<p>Can being a writer actually make you a less efficient scientist? For the past few months I have been knee-deep in a high-throughput RNAi screen for pathways that affect cell shape and the actin cytoskeleton. Automated image analysis has come a long way recently, thanks to the work of people like Anne Carpenter, whose <a href=\"http:\/\/www.cellprofiler.org\/index.htm\">CellProfiler<\/a> program revolutionized the field. But detecting the textures and shapes of subtle morphological characteristics against the backdrop of normal cell variation is still in its infancy, and nothing so far can beat the discriminating power of the human eye. Although we have a few talented computer scientist collaborators working on the question, I&#8217;ve still had to sift through nearly fifty gigabytes worth of images by eye. <\/p>\n<p>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/www.lablit.com\/images\/Hela.jpg\" alt=\"\" width=\"354\" height=\"222\" \/><\/p>\n<p>\n<strong>Worth a thousand words:<\/strong> But only a dozen are permitted<\/p>\n<p>\nNot only do I have to look at these cells, but I have to describe what I see using a controlled vocabulary. From this descriptive annotation, we can boil everything down to numbers, and thereby start to cluster and compare all the various manifestations of our gene knockdowns. Biology to numbers, and numbers back to biology.<\/p>\n<p>\nSo far, so good. 
You would think, as a writer, that describing what I see would be the easy part. My bioinformaticist collaborator assigned a list of about ten features, aspects such as &#8220;cell shape&#8221;, &#8220;actin&#8221;, and &#8220;cell number&#8221;, and told me to tick &#8220;yes&#8221; if what I saw differed from the negative control cells, or &#8220;no&#8221; if it didn&#8217;t. There was also a Notes field at the bottom where I could jot down any additional comments, which the bioinformaticist said was usually left blank.<\/p>\n<p>\nBut soon, as I began sifting through the images, the parade of weird and striking manifestations quite overwhelmed the simple digital system. In the &#8220;cell shape&#8221; category alone, the word &#8220;yes&#8221; seemed insufficient to cover the recurring and surprisingly regular morphological themes that unfolded: triangles; star-shapes; bipolarized elongated spindles; rhomboids. Ruffled lamellae pushing in multiple directions like a fractal pattern. Sharp-edged vertices with precise actin-rich points at their tips. <\/p>\n<p>\nSurely, I reasoned, if these shapes were well represented enough to become familiar to me, it would be a shame not to record the information; and this was true for all the other categories too. If I did it in a way that could be retrieved later, surely we could subdivide the original descriptors in retrospect and make our clusters that much more informative.  The bioinformaticist gave me the green light, so I began to use the Notes section. Copiously. Some of the images were so complicated that I ended up with paragraphs. I was incredibly careful, or so I thought, to use consistent words and phrases and to subdivide these with consistent punctuation for ease of automated text-mining extraction later. <\/p>\n<p>\nOf course it all went horribly wrong.<\/p>\n<p>\n&#8220;You&#8217;ve spelt &#8216;multinucleate&#8217; twenty different ways,&#8221; the bioinformaticist informed me dourly. 
&#8220;I&#8217;d hate to be your editor.&#8221; <\/p>\n<p>\nSpelling was the least of our problems. When he tried to extract phrases separated by commas, it soon became clear that I&#8217;d mistyped or missed out commas in many places, or used them in list series as you would in prose. When he decided to bin the phrases and just take individual words (1274 words with more than two letters, to be exact), the decoupling of adjectives from nouns \u2013 and the mingling of different aspects in the same paragraph \u2013 was deadly. Of course it&#8217;s so clear now I should have stopped annotating early on as soon as I realized the scope of the problem, subdivided all the digital descriptors and started over, but at the time, it had all seemed workable. So I&#8217;m cleaning up the list now, and will need to go back and manually reannotate. The bioinformaticist is trying to restrict me to twenty terms; I am secretly fomenting rebellion.<\/p>\n<p>\nWhen scanning down the spreadsheet of all the nouns, adjectives and adverbs extracted from the Notes section, removed from context, I feel heartily embarrassed. Did I really see fit, over those long weeks, to use terms such as &#8220;beads-on-string&#8221;, &#8220;curviness&#8221;, &#8220;cross-hatched&#8221;, &#8220;raggedy&#8221;, &#8220;stellate&#8221;, &#8220;splotchy&#8221;, or (oh God) &#8220;fried-egg&#8221;? My inner novelist, impossible to avoid. But it did make me start to think how <del>Henry Gee<\/del> the great writers of the ages might have annotated my screen. Perhaps Homer would have seen a &#8220;wine-dark&#8221; quality in particular nucleoli, or considered a wonky spindle to be &#8220;well-greav\u00e9d&#8221;. Robert Browning, when contemplating a wispy actin phenotype, could have evoked the long grass amongst Roman ruins where lovers lie, &#8220;such a carpet as, this summer-time, o&#8217;erspreads\/And embeds&#8221;. <\/p>\n<p>\nAnd I like to think that P.G. 
Wodehouse might have deemed a particularly spectacular set of stress fibers &#8220;dashed decent&#8221;.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Can being a writer actually make you a less efficient scientist? For the past few months I have been knee-deep in a high-throughput RNAi screen for pathways that affect cell shape and the actin cytoskeleton. Automated image analysis has come &hellip; <a href=\"https:\/\/occamstypewriter.org\/mindthegap\/2008\/05\/22\/in_which_i_lose_control_of_my_vocabulary\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":6,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-649","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/posts\/649","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/comments?post=649"}],"version-history":[{"count":0,"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/posts\/649\/revisions"}],"wp:attachment":[{"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/media?parent=649"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/categories?post=649"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/occamstypewriter.org\/mindthegap\/wp-json\/wp\/v2\/tags?post=649"}],"curies":[{"name":"wp","href":"https:\/\/api.w.o
rg\/{rel}","templated":true}]}}