Ten papers for ten years

Scientific paper clip-art

Blogging Beyond is about ten years old now

To celebrate, here are ten papers I like, in chronological order by publication date. Each is accompanied by a short justification for its inclusion in this list. 

  1. Ridge Regression: Biased estimation for nonorthogonal problems (1970) Hoerl and Kennard Technometrics [pdf]

    This paper sets out the statistical technique of ridge regression. This method formed the basis of much of my PhD thesis. I read this paper so many times, and had so many highlighted and wrinkled printouts kicking around, that by the end of my studies I could almost recite it. I learned matrix algebra and the canonical form of the linear model from this paper and related ones. After spending so much time singular value decomposed, when it comes to linear modelling, I think in projections. I find the sums of squares mentality much harder to get my head around. (Those last sentences are for the stats people.)

  2. On being sane in insane places (1973) Rosenhan Science [pdf]

    Often referred to simply as the Rosenhan experiment, this study of what madness is and is not is, of itself, not particularly strong scientifically – I am always a little puzzled as to how it ended up published in Science. Nonetheless I like the literary style, the amusing story, and the message about what madness does and does not mean in different contexts. A personal choice, perhaps somewhat revealing.

  3. Simple mathematical models with very complicated dynamics (1976) May Nature [pdf]

    This elegant self-styled interpretive review discusses the complicated dynamics that can arise in systems described by first-order difference equations. This paper was my entry point into dynamical systems, an area in which I no longer work. I did publish on the topic, albeit tangentially. I do not have many regrets about my academic career but I do regret never submitting the work that comprised my undergraduate thesis for publication. That work related to this topic. At the time I thought the work unworthy of publication. These days I look back and think what a pity that was. Imposter syndrome is a real thing. To be clear, though, my work was not a patch on May’s paper.

  4. Can a Biologist Fix a Radio? — or, What I Learned while Studying Apoptosis (2002) Lazebnik Cancer Cell [pdf]

    I recommend this essay regularly to colleagues who are struggling with the interface of biology and mathematics. Not many of the people to whom I have recommended it have read it. I like it though. It makes me laugh out loud.

  5. Subnets of scale-free networks are not scale-free: Sampling properties of networks (2004) Stumpf, Wiuf and May PNAS [pdf]

    I think I mostly liked this paper because I (thought I) understood it and that made me feel clever. I’ve not read it in years. The May on the author list is the same one who wrote Simple mathematical models with very complicated dynamics.

  6. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls (2007) The Wellcome Trust Case Control Consortium Nature [pdf]

    This ground-breaking paper presented a collection of interconnected GWAS with a large sample size for the time. This is the paper I presented as part of my interview for a place on the PhD program at Imperial, which I later completed. A fun story: candidates were given a free choice of papers to present, and asked to bring multiple printed copies of their chosen paper (seven copies, if memory serves) to give to the interview panel. This paper is long and I spent all my printing credit printing it out multiple times and carefully stapling. None of the interview panel wanted a copy, gesturing that they had already read it. Oh well.

  7. What is a gene, post-ENCODE? History and updated definition (2007) Gerstein et al Genome Research [pdf]

    The ENCODE project, which was to go on to play a significant role in my life for a period, sparked more conversation than ever about regulatory genomics. This perspective article discusses the definition of a gene looking back over more than a century of scientific understanding. The conversation was continued as the ENCODE project unfolded – as discussed here at Nature News.

  8. Trisomy represses ApcMin-mediated tumours in mouse models of Down’s syndrome (2008) Sussan, Yang, Li, Ostrowski & Reeves Nature [journal link]

    When I applied for the place on the Imperial College PhD program (interview mentioned above) I also applied for a place on a similar program at Edinburgh. There, the interview involved presenting one of a choice of several papers, and this paper was one of the options. Being the precocious undergraduate I used to be, I had a subscription to the print edition (remember those?) of Nature at the time, a touching birthday gift from my grandfather. I had already read this paper and heard about it on the Nature podcast [link to transcript]. I emailed Reeves to clarify a couple of points which I had not understood from the paper, and we had a charming email exchange which I will tell you about in person on request. 

  9. Can the flow of medicines be improved? Fundamental pharmacokinetic and pharmacological principles toward improving Phase II survival (2012) Morgan et al Drug Discovery Today [pubmed link] and Lessons learned from the fate of AstraZeneca’s drug pipeline: a five-dimensional framework (2014) Cook et al Nature Reviews Drug Discovery [journal link]

    Since moving to industry I do not blog much about work. These two papers, referred to colloquially as the “three pillars” and “five pillars” papers, discuss just how hard it is to make a medicine.

  10. A reanalysis of mouse ENCODE comparative gene expression data [version 1; referees: 3 approved, 1 approved with reservations] (2015) Gilad Y and Mizrahi-Man O. F1000Research [doi link]

    A second mention for ENCODE. This paper was notable for addressing confounding (an important concept in the design of experiments), for its use of forensic bioinformatics (using file names to reconstruct the design of a study) and for its unconventional route to publication via a reanalysis that was first shared on Twitter. The discussion thread on F1000 is worth a read. Zeitgeisty at the time.
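
For anyone curious about paper 3, here is a minimal sketch of the kind of system May discusses, assuming the logistic map x_{n+1} = r * x_n * (1 - x_n) as the example. The function names, starting value and choices of r below are my own illustrative assumptions, not anything taken from the paper.

```python
def logistic_map(r, x0, n_steps):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the whole trajectory."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def tail(r, x0=0.2, n_steps=1000, keep=8):
    """Return the last `keep` values (rounded) once transients have died away."""
    return [round(x, 4) for x in logistic_map(r, x0, n_steps)[-keep:]]


if __name__ == "__main__":
    # Illustrative values of r: a fixed point, a 2-cycle, a 4-cycle, then chaos.
    for r in (2.8, 3.2, 3.5, 3.9):
        print(r, tail(r))
```

Running it prints the long-run behaviour for each r: a single repeated value, then two alternating values, then four, and finally a sequence with no obvious pattern. One line of arithmetic, very complicated dynamics.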

Happy reading, everybody. 
