On taking a good look at ourselves

Cross-posted from Naturally Selected for added controversy.

Perhaps the most distinctive and powerful thing about Science is its tendency, or rather proclivity, to ask searching, even uncomfortable questions. And unlike belief systems, ideological and political movements, or pseudoscience, it asks those questions of itself. There’s been a fair bit of that going on recently.

An article in the New Yorker looked at the puzzling phenomenon, as yet unnamed, of seemingly solid observations becoming less reproducible over time. This article received two evaluations on F1000 (here and here), and sparked a lively discussion over at Naturally Selected (= the day job, for those who don’t know).

The New York Times reported on a paper in the 4 January issue of the Annals of Internal Medicine, which claims published clinical trials don’t cite previous trials. Now, there might be trivial explanations for that, but in the same vein Sir Iain Chalmers, editor of the James Lind Library, has some harsh words for scientists. He says there are fundamental and systemic things wrong with the way research, particularly clinical research, is done today.

It’s certainly misconduct

Among these, he accuses (some) researchers of not addressing questions that are not of interest to patients and clinicians, of failing to contextualize new findings, and of failing to be clear about what they’ve actually discovered. He also takes aim at the failure of scientists to publish negative or ‘disappointing’ results. In Ann Intern Med* last year there was a paper, recently evaluated at Faculty of 1000, scrutinizing the reliability of, and inherent bias in, clinical trials. And today, Nature published a Correspondence arguing that it’s critical to publish negative results.

Interesting times. Are the criticisms Sir Iain makes fair? If so, is this the fault of the scientists themselves or of the system in which they find themselves working? If the latter, how can it be changed? What about publishing negative results, and reproducibility and publication bias?

Are we questioning Science enough?

*I’m not giving you a link to Ann Intern Med because they don’t do DOIs. Barbarians.

About rpg

Scientist, poet, gadfly
This entry was posted in F1000, Politics.

21 Responses to On taking a good look at ourselves

  1. Jenny says:

    It is very difficult to publish negative results because journals are not interested. I’m sure scientists would leap at the chance to publish more papers if there were actually enough reputable venues at which to deposit these useful data, venues which actually enhanced their CVs. So I don’t think it’s fair to lay blame entirely at the feet of researchers.

  2. Eva says:

    I didn’t know New Yorker pieces could also be reviewed on F1000. Makes sense, I guess, since it’s no different from opinion pieces in peer reviewed journals (which I’ve seen enough of on F1000) but I guess I’m surprised that Faculty Members thought of reviewing Jonah’s article on F1000.

  3. rpg says:

    Definitely. And would those papers ‘count’ towards CV points? There’s no real incentive for scientists to do this, apart from ‘altruistic’ reasons, and that doesn’t help people land jobs, does it?

  4. rpg says:

    Eva, my comment above is to Jenny. Buy me a beer and I’ll tell you about reviewing the NY at F1000, etc.

  5. Steve Caplan says:


    I don’t buy the argument that “seemingly solid observations become less reproducible over time”. That statement, in itself, is inherently unscientific–given that “time” is the only variable changed, if this statement were true, we could just pack it in and forget about science. Try alchemy again, or some of the terrific mystic and homeopathic approaches that seem to be so popular.

    As for “negative results”–again, our scientific method holds that you can’t really prove that A does NOT do B. So while there probably is room (especially in clinical trials) for better data sharing between clinicians, I think basic scientists will continue to shy away from making negative claims.

    By the way, I’m not sure if I didn’t get enough sleep last night, or something in your own “double negative” doesn’t quite resonate: “Among these, he accuses (some) researchers of not addressing questions that are not of interest to patients and clinicians”. Sounds a little bit like the infamous Monty P. Piranha brothers sketch as they try to compose the perfect blackmail scheme…

  6. rpg says:

    Thanks for spotting the error, Steve.

    The problem with that NY article is of course that it’s not subject to the scientific method. Personally, I would welcome firm evidence that says why those reported observations are crap.

    While not disputing Jenny’s point above, I don’t think your second para is right, Steve. Surely the whole point of falsifiable hypotheses is that you can prove something is false. But it’s (almost?) impossible to prove something doesn’t exist–is that what you were thinking of?

  7. Mike says:

    Jenny, there are journals out there, doing their best to encourage researchers to submit their “negative” results, in a variety of fields.

    We had some Correspondence published in Nature about this in 2006 (the doi link doesn’t work, but try here). The Correspondence Richard linked to above has essentially the same message. I would add a comment about this on Nature’s site, but they won’t let me log in. Sigh. Sound familiar anyone?

  8. Steve Caplan says:

    You are correct– “does not exist” is hard/impossible to prove. But can we really say with finality, for example, that 2 proteins don’t interact? Don’t we always resort to the cautionary “By our hands”, “Under these conditions”, etc.? That they do interact–at least in the experiments provided, we can show. That they don’t interact–well, perhaps we need to test more carefully? But I agree with you that common sense is necessary here and it is legitimate to make extensions such as “A and B interact, but under similar conditions, A and C don’t”. Otherwise, you are right, we’d be forever walking on thin ice…

  9. Mike says:

    And I forgot to add that these new “negative results” journals can only gain a reputation if authors are willing to submit their high quality work there. Even blogging about them doesn’t always get that much attention. We want people to submit negative results!

    While I understand that there are a heap of “pay to publish” journals springing up these days of unknown quality, it’s sometimes counterproductive to make ‘reputability’ a prerequisite of submission in these circumstances. Unless you want to check out, e.g., the Editorial board of the journal and find reputable characters like li’l ol’ me there.


  10. rpg says:

    Well, quite.

    Having been asked to show that something happens, when it quite clearly didn’t, I’m a bit sensitive to this. If you use multiple techniques and can’t show an interaction, it’s fair to say it doesn’t happen in any meaningful sense.

  11. Steve Caplan says:

    Agreed–although a more sensitive method may be developed over time and change all this.

  12. rpg says:

    Steve, I doubt it. The point being, biologically, if you can’t detect it, it doesn’t happen. We can detect fleeting interactions down to the mM⁻¹ level: and even then it’s debatable whether they mean anything.

    Of course, this doesn’t mean that an incompetent student’s experiments are law 😉

  13. Mike says:

    Steve said:
    Agreed–although a more sensitive method may be developed over time and change all this.

    Careful now, this is a commonly used pro-homeopathy argument 😉

  14. Steve Caplan says:

    @Mike I was once asked to review a paper for a journal where the explicit criterion was not novelty, but scientific accuracy. The manuscript was neither novel nor scientifically controlled. I rejected the manuscript and thought that was the end of the matter. The editor eventually returned the paper to me with a few cosmetic changes and all of the uncontrolled expts. I re-reviewed and came to the same conclusion–it would have to be an entirely new body of work to be sufficiently scientifically ‘rigorous’ to be worthy of publishing, regardless of the novelty. Back and forth this went with the editor, with tremendous pressure on me to accept pending minor revisions. I finally told the editor that if he wanted to publish the paper, then consider it a non-peer-reviewed paper, and don’t ever send me manuscripts again and waste my time. That’s exactly what he did. It seems to me that the key criterion for publication was the ~$2000 fee for the authors that the journal did not want to miss out on… This is how we all know which journals are respected, and which ones are not.

  15. Mike says:

    Steve, I’ve had a similarly frustrating reviewing experience, except the MS I was reviewing wasn’t being paid for. It was part of a Special Edition of an otherwise ‘reputable’, but relatively new, traditional model journal.

    It so happened that at least one of the authors on the MS I was dealing with was the guest editor for the special edition. This was only made known to me after I’d recommended rejection first, then (at least) major revisions following resubmission. Conflict of interest, anyone?

    There are lots of reasons to remain wary of published research, but at least in JNR-EEB, we look for rigorously carried out work, following the same methodological standards that “traditional” journals would expect. The difference being we hope there’s more to a paper’s story than just the p-value.

  16. chall says:

    Wouldn’t part of the “problem” with publishing negative results (or “not noting significant difference”) be that there are certain hypotheses that are ‘less interesting’ or even wrong? By this I mean that it is harder to distinguish between the relevant negative results and the negative results that might not be relevant*, especially compared to the positive results and the clear interactions with a hypothesis?!

    As for the article you link to, it’s an interesting thought…. with all the various studies on their own not being “stat safe” but put together and viewed as a cohort you get something else.

    *not sure this is the best way of trying to explain what I’m thinking…. late in the afternoon

  17. RPG, in fact the question of whether a publication is of interest is relative in science: it may be of interest to scientists working on the same or a similar topic and not be interesting to others.

  18. antipodean says:

    Hi Richard

    One of the reasons Sir Iain can smell a rat in clinical trials work is because it’s possible to do so in clinical trials work (see point 5).

    But since I actually do clinical trials I might note:

    1. It is fucking impossible to get negative trial data in a top flight journal unless the trial is huge anyway or substantively overturns existing practice. In other words it’s gotta be newsworthy. But it is possible and if you are really tenacious you can get it into a journal somewhere even if it takes a few years.

    2. Trial effects getting smaller. This is not automatically a smelly rat. It is a sign that the first trials try a treatment out in the best patients to get a result (look, it works in somebody!). The subsequent trials try it out in more marginal patient populations, because these populations are of greater interest for clinical decisions (i.e. the ‘we are not sure this is the right treatment here, or whether these patients should be treated at all’ problem). So, because of the additional interest, there are often more trials in these less-likely-to-respond patients. This is why, if you are an ignorant meta-analyst (not accusing Sir Iain of this in any way), you will find a negative correlation between year and size of treatment effect. This is also caused by point 1: it takes longer to get negative trials published, so trial effects look like they are getting smaller.

    3. Clinical trials need to be under the joint control of scientists and clinicians and, most importantly, people who honestly qualify as both (and patient reps). That way they are more likely to investigate problems that patients and clinicians actually care about. But that’s just for investigator initiated trials… If you try to get public funding for some of these sorts of trials you are told to go get industry money instead…

    4. Industry sponsored trials are not automatically conducted in order to be useful to patients or clinicians. They are sometimes conducted in order to get the FDA to say Yes.

    5. Publication bias might be worse in the rest of science. Lab experiments, for instance, don’t have to be registered before they are undertaken. So you never know how much negative stuff is being swept under the carpet.

    6. Sorry everybody. I spent yesterday reviewing stuff….

  19. rpg says:

    That’s a very good point, Alejandro. And of course, in many cases we’re not yet far enough advanced in our knowledge to do clinically ‘interesting’ science.

    Great comment, antipodean. Never apologize for being interesting.

    Chall, yes, and I have stories of my own.

    I should add that I don’t necessarily agree with Sir Iain, nor with those (non-peer reviewed…) journalistic articles. There’s a reason I cross-posted a work post to here.

  20. biochembelle says:

    Certainly some food for thought here with many competing issues at the center.

    Perhaps one reason for failure to contextualize results is the pressure/expectation to publish in high impact journals, and with that, a focus on establishing the novelty of our results. There is a mentality or need to say, “zomg this is the greatest thing since sliced bread, and WE did it!!” Sometimes that doesn’t lend itself well to saying, “We built on the work of _____.”

    Another consideration on contextualizing results is the desire to protect one’s intellectual space. Some may be wary to lay the context out too clearly for fear of generating competition, especially from bigger/richer labs that could do the same work in half the time.

    On the point of systematically looking at what’s been done already, my time in research suggests many causes.

    First, labs often have an inherent level of (dis)trust for groups in their field. Group A (possibly a collaborator) does good work; we take it as the gospel truth (although that may be a poor analogy to use among scientists ;)). Group B (possibly a nemesis) does shoddy work or poor interpretation; we dismiss anything they publish immediately. We pick up these generalizations from lab seniority and accept them without understanding why. While a group may generally do great or shoddy research, we should let each study stand or fall on its own.

    Second, we often fail to perform a systematic review of the original literature in our field. We read reviews and recent literature, but how often do we go back and read the classic papers cited in those? In the digital age, when we can get PDFs immediately, we are loath to trek to the library and photocopy articles we can’t find online. From an experience of being forced to do just that, I now realize that there are sometimes tidbits of data that never make it into the reviews, but those nuances can prove quite important.

    Third, there is so much information to wade through and keyword searches are fickle. In an attempt to thin the stacks of papers, we add in one keyword that might just exclude an important paper (a problem I encountered recently when trying to find a pub on a knockout mouse). This is why publishing more information in the form of negative results might not be as beneficial as hoped. Even if it’s out there, how do we ensure that it’s found in a search alongside those more “interesting” positive results?

  21. rpg says:

    @Steve: from a practical standpoint, I think it’s far more likely that you are able to demonstrate an interaction that isn’t real, rather than fail to show one that is. In other words, you could quite easily ‘prove’ a protein-protein interaction, that turns out to be meaningless in a physiological complex. The opposite case, of missing a real interaction, is much less likely.

    Good points Biochembelle, and thanks for dropping by. I’d be careful of reading too much into the ‘not citing the primary literature’ result, though. All you can reasonably draw from the observation that the primary literature does not get cited is that the primary literature does not get cited. It’s my experience that when one writes a paper, one does go back and find lots of interesting things; even if you end up not citing it (for whatever reason) you’ve still read it, and it has informed your own publication.
