Anatomy of a blog post on the anatomy of a scientific discovery

At the risk of getting uber-meta, here is a blog post about writing my latest blog post at the Guardian. This was an account of a scientific discovery, albeit a minor one, that occurred during the process of shepherding the latest paper from my lab to publication.

Why write about writing this post? Because maybe it will help others, and maybe it will help me to think it through.

I should know better by now but I underestimated how hard it would be. To tell the story, my blog post had to dig into the molecular details of our analysis of the mechanism of the initiation of translation of the RNA genome that is delivered to infected cells by the norovirus. But, as I have discovered when tackling molecular topics in the past, you can’t start digging until the ground is prepared, and all along the way you have to keep stopping to explain this or that piece of molecular jargon. There is a constant battle between narrative momentum and the desire to keep the reader in the picture, without insulting their intelligence.

Writing about DNA is a piece of cake compared to writing about proteins. It is safe to assume that most readers have an image in their mind’s eye of the double helix and a grasp of the idea that it contains coded instructions written in a sequence of bases. It doesn’t really matter if they don’t know what bases are. Most, I believe, are aware that there are four of them: A, C, G and T. Proteins are more complex and sprout jargon from every feature – the peptide bonds that string their constituent amino acids into a polypeptide chain, which starts with an N terminus and ends in a C terminus and folds up into a three-dimensional shape stabilised by non-covalent interactions of various types. And pretty much no-one has ever heard of any of this.

In the present piece I had to dish up all the detail on protein composition and structure before getting into the nuts and bolts of the interaction between norovirus and the protein synthesis machinery of the cell that was central to my tale. This was no picnic, especially since the main actors all had awkward and forgettable names. Ladies and gentlemen, please welcome to the stage VPg, eIF4G, NS6 and – everyone’s favourite – the HEAT-1 domain*.

The first step was the first draft. I have learned just to power through this, whatever the quality. Get the story down on the page and work from there. So that’s what I did. Because the paper I was describing was still fresh in my mind and because of my desire – or is it the instinct or bad habit of the scientist? – to immerse the reader in the flavours and smells of the laboratory, I suspected I might have overdone things. It felt good to have knocked out a draft but I had my doubts and took to Twitter to express them, which provoked a telling reply from physicist Helen Czerski:

Helen is a scientist who is not a structural biologist, so I thought I’d exploit the contact by asking her to read my draft. She was kind enough to agree.

And honest enough to tell me where I was going wrong: too much detail and too many acronyms that were getting in the way and likely to induce the general reader to bail out well before the end.

I took her advice and hacked at the piece, clearing out as much extraneous detail and jargon as I could. Or so I thought.

Not wanting to trouble Helen again I sent version 2 to Jenny Rohn, my co-conspirator here at Occam’s Typewriter and the Guardian. She’s a good editor, with an eye for telling detail and deviations from the rule of “show, don’t tell”. Her annotated version was full of helpful cuts, insertions and comments.

Following Jenny’s suggestions I re-wrote the start and end of the piece and clawed out some more unnecessary detail. I also added figures since I could not find a way to paint pictures with words alone. A failing perhaps, but when operating at the molecular level with engineered proteins that have no correspondents in everyday life there seems to be little alternative. Figures would hopefully provide support for the reader. At the very least, they would break up the text to make the piece seem less formidable. To counter the risk that they would give it the look of a textbook I labelled the images using a font that resembled handwriting.

By this stage I was at version 4 and asked my wife, a non-scientist, to see what she made of it. Most of the problems had been ironed out by then but she picked up one or two issues with sequencing (especially in the paragraph describing protein synthesis from RNA by the ribosome). She questioned the use of “complex” (to describe a cluster of proteins) and was unsure about “precursor”. In the end I got rid of “complex” but felt that an interested reader could make an educated guess about “precursor”.

There was a final polish – I forced myself late in the day to read the post out loud to myself – and then I published.

Am I pleased with the final product? I’m not displeased, and some readers have left approving comments, but I still think I could have done better. I’m no Horace Judson, even if I might aspire in that direction. By the end I was bored of the piece. Fatigued. It would probably have been a good idea to leave it for a few days and then return afresh. The editorial assistance was a huge help but molecular material requires a level of devotion to make it come alive that I did not have time for on this occasion. But I have tried before and this stuff gnaws at me. It is a world worth exploring in words, so no doubt I will try again.


*Parenthetically, HEAT is officially the worst acronym ever. It stands for “Huntingtin, elongation factor 3 (EF3), protein phosphatase 2A (PP2A), and the yeast kinase TOR1”. I kid you not.
