Scooped

Being scooped is every researcher’s worst nightmare.
It happened to me, once; less than three days before submitting my first paper from my postdoctoral lab, I spotted a new paper in press at my target journal, with substantially overlapping conclusions. Luckily I was so close to publication myself that there could be no doubt the work was independent – and there were some conclusions that were unique to my paper. This meant that we did still get the paper published (in a different journal), but not without some very stressful times. One of my labmates was not so lucky a year later; he was completely scooped (by a different lab to the one that got me), and didn’t get a chance to even submit his paper.
So my heart goes out to Professor Laura Bierut at Washington University in St. Louis, the victim of a scoop that made the news section of Science this week.
For those of you who don’t have access to the original article, the story is that Professor Bierut had contributed data to a shared database. The federally funded project had a publication embargo in place, so that while other researchers could access and analyse the data, the contributors would get the first shot at publication. However, a researcher at a different institute breached the terms of this agreement, and submitted a paper based on Professor Bierut’s data a full six months before the embargo was due to expire.
The offending paper has since been retracted, but it can’t be unpublished, and it remains available in the journal’s online archives. The NIH are investigating, and have frozen the other researcher’s access to the shared database until their review is complete.
There have been some discussions of data sharing around these parts recently (see Bob’s recent post for an example). There is obviously a great reluctance that needs to be overcome before free and open data sharing becomes widespread, and incidents like this one (even if unintentional) represent a major obstacle.
So, assuming that data sharing is A Good Thing, who should shoulder the responsibility of protecting the people who generate the data?

  • The government and its funding bodies? Well, it seems as if they tried in this case.
  • The journals? Can they really be expected to vet every single submission to ensure that no embargoes were breached?

A better system for removing retracted papers from the literature might help, but what’s the point of treating the symptoms?
My feeling is that researchers will be keeping their cards close to their chests for the foreseeable future.

About Cath@VWXYNot?

"one of the sillier science bloggers [...] I thought I should give a warning to the more staid members of the community." - Bob O'Hara, December 2010

25 Responses to Scooped

  1. Eva Amsen says:

    That doesn’t look so good for open science…
    I can’t say I’m surprised, I know people will cheat and steal data when it’s so easy, and it was only a matter of time before this happened. It will hopefully push towards different kinds of regulations rather than closing everything up again. (For example, maybe publishing a paper wouldn’t be the key event that secures funding, but publishing the data first could be enough. Data collection, analysis, and summarization could be three different things, with credit for each.

  2. Eva Amsen says:

    Here’s the closing parenthesis.)

  3. Cath Ennis says:

    Nope, not very good at all.
    Your suggestion has potential – although the analysis and communication of the data would probably be given more weight than its collection (which, really, anyone with the right equipment could do). Which doesn’t really protect the contributor much.

  4. Eva Amsen says:

    But they could meanwhile analyze something that someone else contributed – speeding up the whole process of science.

  5. Cath Ennis says:

    Reciprocal scooping? I like it.

  6. Bob O'Hara says:

    I think the responsibility would have to belong to the people archiving the data. They should be able to put in place agreements that everyone has to sign making it clear what can and cannot be done, and what the penalties are.

  7. Jennifer Rohn says:

    _it was only a matter of time before this happened._
    I’m afraid to break your bubble of idealism, but it happens all the time. I’ve seen it happen myself on a number of occasions, and it’s happened once to me personally. This one just got a prominent write-up.
    Yup – my cards and my chest have a pretty healthy relationship these days.

  8. Lee Turnpenny says:

    I think a ‘publication embargo’ is naive and completely nonsensical. We can’t have a system that places scientists under such pressure to publish and not expect some degree of chicanery to occur (which I don’t advocate, by the way). Instead, couldn’t (doesn’t?) the database submission count as a recorded ‘publication’? In which case, if those data are used by others, it must be properly cited as a publication, otherwise something akin to plagiarism is occurring. The extreme version of this would be a kind of scientific utopia, where we all throw our data out there as soon as it’s generated for all to see, analyse and (re)interpret – kind of the whole point, really. But that ain’t gonna happen – not while science remains a human endeavour.

  9. Eva Amsen says:

    Jenny, I’m not that naive; I know scooping has been happening all over the place, but this was the first time I saw a concrete example of this being a direct result of the increasingly popular “data sharing” or “open science” ideal.
    I like the notion of sharing everything, but I also have a tendency to immediately see the downsides and dangers of everything, and this was an obvious one. I was actually talking about this earlier this week, in the context of open notebook science. I don’t think that’s suddenly going to be super-popular, because there’s too much at stake. The risks are too high, because in many fields (esp. life science) a paper is easily three years of work. Sharing data is too scary if it can put you back three years. It would work better if there was a whole different reward system. I’m imagining something where you can say “look, we’re still collectingore data for the past couple of months, but our first set of data has already been analyzed by so-and-so” and you and so-and-so would both look good for progressing knowledge. Now so-and-so steals data, you don’t have a paper, you both get punished, and science gets nowhere.
    And this was way too long to type on an iPod…

  10. Eva Amsen says:

    the weird word is supposed to say “collecting more”

  11. Heather Etchevers says:

    I’m also disappointed (and a naive optimist by nature, so be it). I wonder why the offending author didn’t just hang on and submit the article with an explicit comment to the editor that it could not appear until after expiration of the embargo? Seems like it would be really easy to stay honest and still ahead of the game as it were – the embargo IS really very short, as mentioned in the article. But those who contribute their data are aware of it, and probably plan their analyses accordingly.
    Also, from experience, you can submit data to some databases and not have it even visible to anyone for a set time (ie. a year or until publication) unless they go directly through you and obtain a link. Under those circumstances, I’m with Bob – it’s worthwhile getting an agreement of how and when the data can be used and published, in writing. Once a publication has happened using that primary data, whoever uses it thereafter can and should cite the first publication.
    Perhaps what is hard here is that the agreement, which would be direct and explicit between two research groups in the above situation (but some groups are perhaps more vulnerable than others) is supposed to be vouchsafed by an NIH data access committee. It seems more impersonal, but you’d expect, more even across the board and with more clout. I am glad they took action on behalf of Professor Bierut.
    I don’t think it’s realistic to expect a database to be cited directly, as nice as it would be to have credit for all sorts of other kinds of clear scientific contributions.
    Both you, Cath, and Science magazine write that the article is still available on the PNAS website, but I can only easily find the retraction. And I didn’t get Professor Schekman’s point about how removing online content creates difficulties for librarians. I’d be interested to hear Frank’s input on that.

  12. Frank Norman says:

    Heather –
    I think our problem stems from the principle that publishers should not alter the scholarly record. Departing from this leads to an Orwellian scenario.
    Although it was a very different situation, consider the case of the deleted Human Immunology article a few years ago. As Scott Plutchak wrote after that,

    Once an article was published, it was out in the world, and anyone engaged in damage control had to assume that it would always be out in the world

    If a paper has appeared in print then the print version will continue to exist, so there is a divergence between the print record and online record. If the deletion of an online article happens sometime after original publication, it will already have been disseminated and read, so trying to disinvent it retrospectively seems deceitful.
    I note that the PNAS paper was an epub ahead of print, so the print vs online issue may not arise. I could see a case for deferring publication in this instance.
    But once knowledge has been disseminated, you cannot oblige people to forget it. Imagine reading a paper online one day, telling a colleague about it and they not being able to find the article because it has been deleted. Nature abhors a vacuum. So, it seems, does PNAS.
    Clearly something has gone wrong here, but deleting it would not put things right.

  13. Eva Amsen says:

    Nature abhors a vacuum. So, it seems, does PNAS.
    Badoom-tish

  14. Richard Wintle says:

    This is terribly disappointing. Dr. Bierut was very forthcoming in granting us access to one of her unpublished (and embargoed) datasets in dbGaP. And no, it wasn’t us that scooped her.
    I suspect this will put a crimp in the process, and that fewer people will be willing to grant access pre-publication. Shame.

  15. Heather Etchevers says:

    Imagine reading a paper online one day, telling a colleague about it and they not being able to find the article because it has been deleted.
    No, they wouldn’t be able to find the article because it had been retracted. The reasons for the retraction are explained, and the retraction is easily found. The article is either wrong, unethical, or both. I don’t think anyone will forget the hypothesis or the science in the paper if they read it – but it’s up to the community to prove it true in an acceptable manner.

  16. Frank Norman says:

    A retraction is fine. A retraction means that an article is marked as defective in some way, but it still exists. A deletion is trying to reinvent history, or saying that this happened but you may not see it.
    In general librarians are not happy with the idea that anyone should have that amount of power over the record.

  17. Åsa Karlström says:

    I am not sure I understand what happens next though. If the paper is retracted but Dr Bierut can’t publish her study since “it has been published” what are you supposed to reference then?
    As a comment on the whole story: I personally hope that something good will come out of this. (And maybe that the researcher at the prestigious place will get adequately punished – I’m a little bit vindictive about things like this.) I have a very hard time believing “this was an honest mistake” but sure…. maybe I am too cynical?
    It does make sense to have a less “open” database that can only be searched via links from the people who deposited the data. Then again, sooner or later it will turn into either “sharing data and being at risk of being scooped” or “playing it very close to the chest and only sharing with your best scientist friends in a collaboration”*
    *and a collaboration isn’t really a sure guarantee against being scooped, but at least it is less likely to leave you out completely. Although it is not unheard of for authors to change order etc…

  18. Cath Ennis says:

    “I think the responsibility would have to belong to the people archiving the data. They should be able to put in place agreements that everyone has to sign making it clear what can and cannot be done, and what the penalties are.”
    But it sounds like they did do that – the first part, at least. From the article:
    When an investigator gets permission to access the raw data stored at dbGaP, he or she signs an agreement pledging not to submit any paper before the embargo date. Author Zhang, who signed the access agreement last year, told Science he doesn't want to say anything until NIH concludes its review.
    If someone decides the consequences are worth the reward… how can you stop them, really? In an ideal world someone’s word is their bond, but, as several commenters have pointed out, when publication is everything to a scientist’s career, ethics may go out the window.
    Heather’s experiences with shared data not even being visible for a set amount of time are very interesting – this may be the way to go (but then, is it really sharing?). I don’t have any direct experience with this kind of collaborative project, so I don’t really have a solid grasp of what practices would and would not be acceptable or useful in this context.
    The conversation about “correcting” the literature is an interesting one. I see Frank’s point about not trying to reinvent history, but I just feel that the current retraction system is not ideal. It would certainly be possible that someone (especially a trainee who may not do the most exhaustive literature search) could find the original paper, but not spot the retraction. So “retracted” papers must still get cited, whether accidentally or not. If everything were electronic, it might be easier to ensure that the retraction automatically pops up when someone accesses the original paper. But how do you deal with the hard copies – the ones printed out by individuals and filed away, or the ones in the print journals that libraries are still collecting?

  19. Jennifer Rohn says:

    Jenny, I’m not that naive; I know scooping has been happening all over the place, but this was the first time I saw a concrete example of this being a direct result of the increasingly popular “data sharing” or “open science” ideal.
    Just to clarify, Eva; I was actually referring to data sharing violations.

  20. Eva Amsen says:

    I thought you were talking about the old-fashioned conference-type of data sharing, or direct collaborators? Not so much the specific type of repositories where people can look at your data-in-progress in their own time. I guess it’s all kind of the same, in the end. Although I’ve never signed anything official when a collaborator showed me their data or I saw something at a conference, and these people apparently did (and broke a written agreement on top of an implicit one.)

  21. Jennifer Rohn says:

    I’m sure things are more fraught in my field (genomic screening). Other disciplines don’t do the amount of sharing, yet, that could tempt people.

  22. Mike Fowler says:

    Boooo! Hissssss! Cheaters never win.

    It would certainly be possible that someone (especially a trainee who may not do the most exhaustive literature search) could find the original paper, but not spot the retraction. So “retracted” papers must still get cited, whether accidentally or not.

    But expert reviewers should be familiar enough with the literature to highlight this before acceptance. This is a high profile case, other retractions may not get the same coverage, but if a reviewer is unsure and wants to check these things out, it only takes a quick WoS or other database search.
    A couple of other points:
    (1) As well as the database organisers having a ‘code of conduct’ concerning how & when data can be used & submitted, don’t journals usually have some rules about having the correct authority to use data in their publication agreements? Something like, ‘I promise that I have the consent of all those good folks who supplied data to use it in this way’?
    This, however, means that two safety measures have been breached, through inappropriate conduct by the retracted authors.
    (2) I’m completely with Frank about retraction being more appropriate than deletion. Apart from anything else, paper copies are so last century, digital access makes it easy to follow up a paper’s fate. Again, expert reviewers and Editors should be up to date with the literature. Retractions are rare enough to be worthy of discussion around the coffee table. People get to know about them quickly.
    I’m also with Åsa about being confused why the scooped authors can’t publish their work? Surely the journal that published the retracted paper should step in and publish Bierut’s version, if it meets otherwise normal publication standards. Any news on this?

  23. Cath Ennis says:

    “But expert reviewers should be familiar enough with the literature to highlight this before acceptance.”
    Yes, they should – but I’m sure I’m not the only person here who’s received a 3 line review on a submitted paper.
    And also, someone who’s knowingly broken an agreement they signed with a database, is unlikely to be put off by a journal’s self-disclosure policy.
    I guess I’d just rather have a solution that’s less vulnerable to human error / laziness / malice. Maybe an online “database of databases” that provides a one-stop shop for reviewers/editors who want to make sure that no embargoes are being breached. Maybe an EndNote / Refworks / whatever widget that highlights retracted papers.

  24. Maxine Clarke says:

    A journal can certainly have policies, and many do, but enforcing them can be a problem. Is it the role of journals to be the police force of the scientific community? No, in my view.
    The journals, however (as Frank and others say), are the custodians of the scientific record and their responsibility is to maintain version control and a record of important post-publication errors. Therefore a decent journal will always mark a correction or retraction, not only in its own online archives but also on the A&I services (eg medline, Thomson-Reuters et al) which host abstracts.
    As for journal editors being able to tell if something is a duplicate publication or plagiarised – well, we aren’t magicians. You may have heard of crosscheck.org – many journals (including Nature and the other NPG journals) are part of that – we scan a selection of submitted articles against a database of journals to pick up on possible misconduct of this sort. (Did anyone read that recent Great Beyond post about the science minister’s publications having been plagiarised?)
    I agree with Jenny and others who are drawing attention to the scooping downside to sharing data and other prepublication material. It’s really tough. Databases can provide accession numbers, preprint servers can provide unique identifiers – these provide some degree of protection/version control. But at the end of the day, we are dealing with human nature, here.

  25. Mike Fowler says:

    Cath,

    I guess I’d just rather have a solution that’s less vulnerable to human error / laziness / malice.

    Too true. But then we’re basically left with not making data available to anyone other than the original researcher until the embargo period has passed. If we scientists can’t be trusted to keep our word, don’t give us the opportunity to break it. And three line reviews are a cheat as well!
    Maxine,

    Is it the role of journals to be the police force of the scientific community?

    No, I completely agree with you on that. But if they could carefully police the content that is accepted to appear within their pages, that’d be useful. I would not expect reviewers or editors to check all submitted manuscripts, but any that are accepted for publication could be checked by the editorial team.
    If we are to continue to promote the peer review process as being the best option available, we should be willing to modify it to protect it and honest scientists against (the few) cheats.
