Diversithon – some recipes

Recipe 1

It’s a simple recipe. Gather together some people who want to change the world. Put some inspirational speakers in front of them to get people fired up about diversity in science. Provide cakes and biscuits. Teach some basic skills in editing and writing for Wikipedia, then set them loose on a list of scientists who deserve, but don’t yet have, biographical articles in Wikipedia. The room ignites in a silent flurry of activity and two hours later the cakes and biscuits have been transformed into Wikipedia articles about scientists from black, Asian and minority ethnic backgrounds.

That’s what happened on 9 November 2018. Nearly 40 people gathered to hear our speakers and learn about Wikipedia. Yolanda Ohene (UCL) and Sara Essilfie-Quaye (Imperial) talked about their experiences as black women in academia and their desire to see much-improved diversity and inclusion in science. Jess Wade (Imperial) talked about how increasing visibility through creating Wikipedia articles about women scientists and BAME scientists really can change perceptions, and change the world. No-one hearing her speak can have been left in any doubt about this. (If you want to understand what I mean, try watching Jess’ talk at the Royal Society’s Research Culture conference last month. Her talk starts at 4 hr 25 mins and lasts for about 12 minutes.)

Finally, Alice White (Wellcome) provided simple instructions for creating effective Wikipedia articles, and some tips on how the Wikipedia community works. She was a good teacher and people learnt quickly.

Recipe 2

I’m very keen on Wikipedia (WP), but I am a rather sporadic contributor. I was first introduced to WP editing in 2011, when I attended a workshop at the MRC. The following year I participated remotely in a Royal Society editathon, on Ada Lovelace Day. Then in 2013 I helped organise an editathon at NIMR for women scientists, which gained some attention, and another smaller one in 2014.

I’ve been wanting to arrange a WP workshop at the Crick for a couple of years but lacked the spark to make it happen. Last year I had a chat with Beatrice Mikuzi, the then chair of our PRISM network for black, Asian and minority ethnic (BAME) staff. She commented that Wikipedia editathons for BAME scientists always seemed to focus on USA scientists. That implicit challenge sowed a seed in my head.

This summer I went along to a women-in-science Wikipedia editathon at UCL, and there met Jess Wade and Alice White. I tweeted about the event and then my fellow Crick open science enthusiast Martin Jones saw my tweet and commented:

I think @franknorman and @jesswade would make an amazing
dynamic duo for a future wikithon!!

Well, how could I refuse? The sparks from Beatrice and Martin united and Diversithon was born.  I got to work with Alice and Jess and set a date for the event, then I liaised with the new PRISM co-chairs Karen and Esther. They injected a ton of enthusiasm and organisational skill. Actually Esther and Georgiee (from the engagements team) took on most of the organisation work from then on – they were the real dynamic duo.

The second recipe for success then was: someone keen on Wikipedia, someone with a vision for BAME in science, someone to light a spark and someone to enthuse and organise.

Recipe 3

We came up with an initial list of names of BAME scientists and via Twitter invited people to add to it. Jess Wade used her extensive Twitter following to solicit more names, and in October we added more names thanks to Black History Month as many people were tweeting on the subject.

When it came to the day, 9 Nov, it was a relief to see people enthusiastically turning up to the event. I want to say that our speakers were the stars of the day, but on reflection it was the attendees who deserve the limelight. They absorbed the Wikipedia ways of working – getting to grips with notability criteria (especially for academics) and neutral point of view, and the guidelines for biographies of living people. Then our new editors started reading about the scientists they’d chosen to work on, becoming enthusiastic as they learnt about the achievements of these people.

By the end of the afternoon some articles had been created, or at least started, and I sensed that many new converts to Wikipedia editing had been won. If we could have kept going into the evening I think some people would gladly have done so.

The pages created and page views can be tracked here (though some of those listed were created at other events). The BAME scientists who still need articles to be created are listed at the bottom of this page.

The most important recipe is also the easiest – make sure you have some good stories to tell about scientists of colour who have been overlooked in Wikipedia. It turns out that there are plenty of them, and their stories will inspire you to write about them.


The case for more events such as Diversithon was driven home by this recent salutary reminder in Nature of the need to democratise knowledge.

We are planning to run another event in 2019, and other events are also planned at Imperial College and Cambridge Univ.

Scientific archives workshop 2018

I attended the Second Workshop on Scientific Archives, held at the Carnegie Institution for Science, Washington, D.C., on 13 and 14 August 2018.

The first Workshop on Scientific Archives was held at EMBL in 2016, and was organised entirely by Anne-Flore Laloe, the archivist at EMBL. It was (I think) the first time that archivists working in the scientific area had come together internationally to exchange experiences. I attended it and gave a paper (even though I’m not an archivist). After that first workshop a small international committee was formed (CAST – Committee on Contemporary Archives in Science and Technology). This committee planned the 2018 workshop which featured a good range of topics and attracted about 40 attendees.

The complete programme is here. I learnt something from most papers, but some stood out for me.

Data Management Plans and reasons for keeping data

Jean Deken, SLAC National Accelerator Laboratory, Scientific Data Management Plans in Theory and In Practice

Jean Deken noted that scientists are required to plan for how they manage research data, thanks to funders’ policies. She suspects that archivists’ concerns were not uppermost in policymakers’ minds when they made their rules.

To an archivist, a Data Management Plan (DMP) is a historical document describing the data practices of the experimental collaboration.

Tools exist to help with creating DMPs – e.g. the California Digital Library DMP tool and the Digital Curation Centre DMPonline tool in the UK.

In theory DMPs minimise the risk of data loss and maximise data accessibility, but in reality they leave many questions unanswered. Jean quoted Jeff Rothenberg’s wisecrack “Digital data lasts forever – or five years”.

After the analysis of a dataset is completed there is often no requirement to retain the original data.  Even when it is retained, it may become unusable over time even by the original researchers. Sometimes it’s better or cheaper to do a new experiment.

Here Jean mentioned the National Research Council report in 1995 which highlighted the difference between experimental science and observational science when it comes to data retention. Observational science benefits from long-term data gathering, so it makes sense to hold onto old data. Experimental scientists tend to expect that repeating an experiment in the future with better equipment will give better results, so they’d rather repeat the experiment than hold onto the old data long-term.

This ignores the issue of reproducibility, which was perhaps not so prominent back in 1995.

Record-keeping in science

Juan Ilerbaig, University of Toronto, Integrating Data and Records in Archiving Scientific Research

Ana Margarida Dias da Silva et al., Universidade de Coimbra, The Importance of the Botanic Archive in Contextualizing the Botanic Collections of the University of Coimbra

Juan Ilerbaig gave a very thought-provoking talk about the role of record-keeping in science, and the inter-relationship of different records and objects. This was new ground for me but Juan’s talk made me want to learn more.  Juan noted that the records of science include both the structured ‘minutes of science’ (the published literature) and various less structured records (communications, raw data, records).

Juan referred to the correspondence between records, data and physical objects. A published scientific paper can be seen as a proxy for the research (the data). The data and objects produced by research can be seen as possible sources for future work.

He cited the US archivist Maynard Brichford who wrote in 1969 that “Test and experimental data should be destroyed when the information they contain is condensed into published reports or statistical summaries.” (1)

Juan suggested that this point of view neglects to consider that scientific record making is an active agent in the process of science, not just a passive byproduct. Therefore models of science that rely only on the final publication risk misrepresenting what really happens in research.

To support what he said Juan related an example from Charles Darwin’s voyage of the Beagle. Juan explained that the links between Darwin’s specimens, tags (metadata), published description, labels, notebooks, specimen catalogue and zoological diary (rewritten diary) were all crucial to an understanding of how Darwin came to his conclusions. At first it was not clear to Darwin that the location where he had collected his specimens was important. He had not been gathering location information. When he realised that location was a crucial part of the story he asked the ship’s crew members (many of whom had made their own notes) to provide information to fill in the gaps in his records.

Juan said that the process of recording (writing) and cross-referencing turns private experience into public information and turns itemized knowledge into generalized knowledge. I need to think a bit more about that – I’m not sure I quite grasp it. 

Some of what Juan said chimed with another talk, from Ana Margarida Dias da Silva at the University of Coimbra. She too emphasized that the whole is greater than the sum of its parts, showing how links between her institution’s botanic archive and its plant collections were synergistic. Similarly links between the archives can shed valuable light on objects in the museum collections and on the development of the library collections.

I really appreciate this holistic point of view, and the context provided by different kinds of information and evidence resources.

Archiving websites

Polina Ilieva, University of California, San Francisco, Science Online: Evaluating Appraisal, Usage, and Impact

Polina Ilieva from the UCSF archives explained their approach to archiving websites. She stated that an archive needs to collect more broadly than just records that support the published record. A contemporary scientific archive must also collect many unofficial channels of communication, including electronic information.

Polina made the point strongly that when talking about electronic records, appraisal has to occur soon after creation of the records, not decades later (2). 

At first UCSF only collected websites that linked to existing archive records but then extended their remit to archive the websites of all labs. They invited PIs to nominate websites to be archived (allowing self-nominations). Now they are archiving 128 out of 187 unique lab websites that they have identified. They crawl the websites twice a year, using Archive-It.

Lab websites often represent only the successful side of research, not the failed or rejected work. UCSF is also looking at electronic lab notebooks (ELNs) with a view to archiving these. Because they are proprietary it may not be possible to archive them. Maybe archivists need to start a conversation with ELN service providers.

Polina recommended Lorraine Daston’s recent book – Science in the Archives.


John Faundeen, U.S. Geological Survey, Science and Technology Archives: The Art and Science of Conducting Appraisals

Patrick Shea, Science History Institute, Appraising the Records of 20th century science

I enjoyed the papers from John Faundeen and from Patrick Shea on appraisal, though they were mostly talking about paper records.  This section was instructive for me, a non-archivist.

Appraisal informs the initial decision to ingest records to the archive, and subsequent decisions to retain or discard. One approach is to form an appraisal team, including an archivist, scientist(s), and a research manager.

Both John and Patrick used structured questionnaires to collect facts about the records. John used 44 questions (NARA best practice for federal agencies) while Patrick used 21 questions.

John asks scientists:  are the records somewhere else too? what was the original purpose of these records? what may be the future scientific uses of these records? He has carried out 90 appraisals in 12 years. In that time he has accepted/retained about two thirds of the material appraised.

In his talk Patrick noted that you can’t keep everything. The material’s uniqueness, form, importance, and value all come into the decision. As well as actual archives his institute will collect ephemeral material – e.g. conference proceedings, equipment catalogues.

Scientists often don’t appreciate the importance of anything except the published reports. There are many challenges – not least that Records Management can end up destroying too many records.

“History in the true sense depends on the unvarnished evidence, considering not only what happened, but why it happened, what succeeded, what went wrong” said US archivist Frank Burke.

Archives for a new institute

Laura Outterside, European XFEL, New Science, New Archives: Records Management at European XFEL

Laura Outterside is records manager at the XFEL (European X-Ray Free-Electron Laser Facility). This is a new institute – though it has been some years in the planning. Her focus is on scientific records – records about the administration of science – funding, planning, and everything before the data gathering. She is also considering the need for an XFEL archive.

She noted that XFEL researchers are managing their records already, but they are all doing it differently. Laura is planning to undertake records ‘health checks’ to assess the state of RM across all research groups. She hopes to work towards a central document catalogue.

Now is a good time to focus on RM and archives as XFEL moves from a planning phase to an operational phase. A new chapter is opening, and a new generation of staff is coming in. The current scientific director is retiring. He has been involved from the start of the XFEL project and will have many paper, digital, and email records. Laura plans an oral history interview with him. She is also planning to review procedures for managing records on the departure of key staff.

Laura is starting with the records and working backwards to procedures and policies: a bottom-up, decentralised, flexible approach rather than a compliance-based one. This seems very pragmatic, and it makes sense to me. The good scientific research practice policy has some documentation and publishing guidelines relevant to archives, such as “retain all records safely”. XFEL also has an Asset Management policy which is relevant to RM.

Laura has been inspired by the examples of EMBL, CERN, and SLAC archives. Those archives were created 20-40 years after the creation of the respective institutions. Laura noted that today it is important to consider archival legacy from the start, echoing the point made by Polina that digital archives are more vulnerable than paper archives.

Archives to theatre

Christian Salewski, Alfred-Wegener-Institut Helmholtz Centre for Polar and Marine Research/ Archive for German Polar Research (AGPR), The History of German Polar Research goes Theatre – The Project “Staging Files”

The final paper of the workshop was from Christian Salewski, head of the Archive for German Polar Research.

According to its website “The mission of the AGPR is to secure the written and oral tradition of German polar and maritime research, a 150-year-old scientific venture with deep roots in the federal state of Bremen. Founded in 2011, the AGPR archives records and other material of this research field.”

Christian told us that there is a 100-year-old tradition in Germany of documentary theatre. In 2016 the AGPR decided to create a play about early German polar research, based on their archives. The process was led by a historian, working with a theatre company. Christian taught students from the University of Bremen history department about the history of German polar research. The students were given access to material in the AGPR. Then they wrote essays about the history and these were used by the theatre company to put together a first draft of the play.

The play was developed as a stage reading. It is called Vom Eis gebissen, im Eis vergraben (Bitten by ice, buried in ice) and was put on by the Bremer Shakespeare Company.

The AGPR got great recognition for the play, including from the Institute management. It is a very creative way to exploit archives.

Other points from other papers:

  • It’s always helpful to document choices and decisions when you make them.
  • The importance of established criteria on what to collect.
  • How can technical or technology-related archives become accessible for humanities research?
  • First, persuade owners/creators of existence and significance of archive.
  • People may value the old, but do not realise the value of newer records even if they are very rare.
  • Holding public events for the community helped to change attitudes towards archives.
  • Help records creators to understand significance of things they have, and stop them throwing it away.

More about CAST

The CAST committee has been brought under the umbrella of the International Council on Archives Section for Research and University Archives (ICA-SUV). This opens up some funding streams for future events and helps to bring the workshops to a wider audience. It is planned to continue alternating between Europe and North America, and to hold a workshop every one or two years.

I’m pleased to say that I have recently been invited to become a member of CAST, which is very flattering.  I will be working with the other members of the committee to help plan the 2020 workshop, and look forward to getting involved.


  1. Maynard J. Brichford, Scientific and technological documentation: archival evaluation and processing of university records relating to science and technology.
  2. Terry Cook, http://www.interpares.org/book/interpares_book_l_app03.pdf


Edited 12 Nov 2018

C-CAST has changed its name to CAST, and dropped the word ‘Contemporary’ from its title.

Library day in the life 2018

This post is an account of what I did at work each day from Monday 17 September 2018 through to Friday 21 September 2018. The idea is to give an impression of the range of tasks I engage in. I’ve done it four times previously, starting in 2011. I explain more about ‘Library day in the life’ in my Library day in the life 2016 post.

Monday 17 Sep
I woke up feeling tired from my run yesterday, so I decided against an early start; I stayed in bed till 7am. Unluckily for me there was a faulty train on the Northern line causing long delays. I opted to take a bus for 35 mins and then travelled two stops on the Piccadilly line, getting into work at 9.20. Not a good Monday morning start!

This is my week to look after the @CrickEDI twitter account. Today is the first day for our new cohort of Crick PhD students, so I tweeted a welcome to them, mentioning the Crick staff networks.

After checking my emails and registering for an interesting-looking webinar next week (Bringing Insight to Data: Info Pros’ Role in Text- and Data-Mining), I made a start on clearing up some outstanding tasks.

I sorted out some outstanding expenses claims (tedious but necessary) and busied myself with some administrative preparations for open access week in October. I also sent an email that I’d been putting off. It might frustrate some of the recipients but it needed to be sent.

I work in a research institute so there are always scientific talks and seminars happening, but they are mostly pitched at expert level.  One monthly series of talks is aimed at non-scientists on the staff – the “Introduction to…” series. Today it was “Introduction to malaria” and we were treated to 30 mins of parasitology from one of the Crick postdocs.

One of my regular chores is to check through and edit the weekly publications list, to ensure the entries are all categorised correctly (primary research, reviews etc) and that they are all genuinely our publications. After that I select a few of them to be highlighted internally by our Comms team. It’s a good way to get an overview of what’s being published by Crick scientists.

A monthly chore is to prepare the LIS monthly report. Some of this is generated from various Excel spreadsheets of statistics, but it’s not all as slick as I would like. Making it more automated would really help. I’m a bit late with this month’s report but finished the draft this afternoon, ready for my colleagues to comment on.

It seems to have been a day for chores and clearing up tasks. It’s always a good feeling to get things done.

My final task was to welcome three colleagues from our partner universities.  I meet with representatives from the libraries from time to time to catch up with their news and update them with ours. I was interested to hear about their bibliometrics support services and plans to expand research data management. They were interested to hear about our plans to introduce a CRIS (though we call it a RIMS) and repository. We also talked a bit about funding for OA, Plan S, ORCiD and DORA, and the frustrations of ebooks. These meetings are a good chance for me to talk with fellow librarians.

Tuesday 18 Sep
A year ago the EDIS Symposium was held, addressing Equality, Diversity and Inclusion in science and health (see tweet summary of the event). Since then the organisations behind the symposium (Wellcome, Francis Crick Institute, GSK) have been working hard to create a new organisation called EDIS.

Today EDIS held a round table discussion of interested parties. The aim is to move EDIS forward as a campaigning and support organisation, and the event was held to gather commitments to join (and fund) the organisation. I went along to the meeting this morning, held at the Wellcome Trust, to hear about the plans for EDIS.  It was good to hear strong support for EDIS from the other attendees. There were some quite probing questions from the attendees about the plans but there were good answers for all of them. We also had some discussion about what questions EDIS should try to answer and what activities it should undertake.  EDIS has a website and has just launched a Twitter account if you want to find out more.

I then went into work for the rest of the day – luckily it’s just a short walk. I had a very quick lunch before attending the PRISM network meeting. PRISM is our staff network for black and ethnic minority staff. It is a really active group and they have done some great work in their two years of existence. I was there to brief them about a proposed Wikipedia editathon, to focus on black and ethnic minority scientists.  There was some enthusiasm for the event and some ideas have flowed already. We’re planning to crowdsource a list of names of scientists who need Wikipedia articles, or improved articles.

Later on I had a one-to-one with one of my direct reports, talking about her work on journal subscriptions and some ILL problems.

I did a bit of chasing up some information we need for a project, sent some information about the next Open Research London event to one of the speakers, and answered a question asking if we had any photos of past members of staff. I also helped someone convert a large Outlook distribution list into two columns in Excel – names and email addresses.
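A conversion like that last one can also be scripted. The following is a minimal sketch in Python, not the method used on the day: it assumes the distribution list has been expanded and copied out of Outlook as semicolon-separated text in the form "Name <address>". The function names and the sample addresses are hypothetical, made up for illustration.

```python
import csv
import io
from email.utils import parseaddr


def split_distribution_list(raw):
    """Turn 'Name <email>; Name <email>' text into (name, email) pairs.

    Assumption: entries are separated by semicolons and no display
    name itself contains a semicolon.
    """
    rows = []
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # skip empty pieces, e.g. after a trailing semicolon
        name, address = parseaddr(entry)
        if address:
            rows.append((name, address))
    return rows


def to_csv(rows):
    """Write the pairs as two-column CSV text, which Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Name", "Email"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample input, for illustration only.
sample = (
    'Ada Example <ada@example.org>; '
    '"Example, Ben" <ben@example.org>; '
    'carol@example.org;'
)
rows = split_distribution_list(sample)
print(to_csv(rows))
```

Entries that have an address but no display name come out with an empty first column, so they are easy to spot and fill in by hand.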

I fixed a date to show Paywall the movie and booked our auditorium for the purpose on one lunchtime during OA week. I’m assured that showing a film from a laptop is really easy these days, but I’ve not tried it before so will need some handholding from our AV people.

I’ve been meaning to do some work on ebooks for a while, but I’m a bit anxious that it’s going to prove impossible to get the books we need at an affordable price. The discussion yesterday with my university colleagues made me realise I need to get on with it. If it does prove impossible, then I’ll need to make alternative plans.  I spent a bit of time working on getting a candidate list of titles together, from a preliminary list a colleague prepared for me. I’ll need to do some more work on it to expand it a bit to cover all relevant subjects.

I finished my day by composing a few tweets for the @CrickEDI account, about PRISM and the EDIS meeting.

Wednesday 19 Sep
I got in early this morning to do some preparation for my talk to the new PhD students later on. It’s always hard to know how much to say in these talks. I try to sketch out what we offer, in as broad a way as possible, and make some sort of impression on them so that they will remember that the Library & Information Services team exists. The talk went well enough and I was proud of my segue from ‘how to choose which tools you use’, into ‘sharing and open science’. I did a show of hands to see who uses which citation management tool – Mendeley wins hands down. And I asked if any of them are OA/open science advocates. I was pleased to see two and a half hands go up.

A tweet asked me about a comment I’d made on a 2013 blogpost. It turned out to be something about the MRC Common Cold Unit (CCU). I had commented on the blogpost about the link between the CCU and NIMR, noting that virologist Christopher Andrewes was the original instigator of the Unit. That person was the grandfather of my tweet-correspondent, and she thanked me for adding my comment about her grandfather. That gave me a nice warm feeling.

I put the finishing touches to a co-authored blogpost, after discussion with my colleague and co-author. It’s just a short piece about making deposits into Europe PubMed Central, and the need to develop skills in order to double check the deposits after they’ve been marked up into XML.

Then I spent 45 mins with another of my colleagues doing some guerrilla interviews. These entail standing by the coffee queue in our cafe to ask people a few questions about accessing journal articles.  We’ll also ask some of them to participate in some more in-depth work, but the guerrilla interviews give a first impression of people’s experience. This is part of our UX programme – using UX techniques to improve our services.

Next it was time for a one-to-one with another of my direct reports. We talked about some work on information literacy she’s doing; preparations for OA week; publicising the Open Research London event; setting up Research Buzz Club again (this is like a journal club but for research policy news).

After lunch I sat down with someone from Finance to look at our budget for 2019/20, and review the current and past years’ spending. There are no nasty surprises, yet.

I put the finishing touches to my August monthly report and sent it out then went along to the Crick Lecture. This was a talk at the interface between cancer biology, developmental biology and drug development, and was pitched at a good level for a non-specialist.

I joined in the after-lecture drinks for a while, and had an interesting conversation about research data sharing (both internally and externally). Sometimes those kinds of conversations can be very important.

Thursday 20 September

I did a bit of preparation first thing to collate information about lab protocols. More of that later.

While I was having a cup of coffee someone asked me whether our OA and publishing budget would stretch to paying for a cover image. We’ve had that request before and decided it wouldn’t.

A colleague and I had a meeting with two people from Jisc Collections. They came to explain the new arrangements for affiliate members. As we’re not a higher education institution we will in future have to pay a fee for each Jisc deal we take part in, to help cover the negotiation costs. The fees are not unreasonable.

After that I went upstairs to do my weekly stint sitting by the hub on the fifth floor. There are four lab floors in the building and each one has a central ‘hub’ where lab managers and administrators sit. Each member of my team spends one morning a week sitting adjacent to the hub. The reason for this is to keep in closer contact with the labs, and increase our visibility to the scientists. As we have no physical library space I think it’s important for us to take steps like this to be visible. While on the hub I took a phone call on my laptop, talking to a colleague from Rescolinc about one of their journal deals for 2019 (which also involves Jisc).

After lunch I spent 45 mins talking about lab protocols and methods to our Asst Information Services Specialist. This position was created for an early career librarian and advertised as such, with a promise to provide on-the-job training. I decided we should go beyond that to give a broader perspective too, as a ‘learning development programme’ – a roughly weekly session on some aspect of LIS in a research institute. So far we’ve covered journals and discovery, and are just about to move on to books. Today I endeavoured to explain how protocols are published and list some sources for finding them, but I also talked about OA and reproducibility.

Back at my desk I found an email asking if we could pay for a cover image (a different one, from a different person). The answer was ‘no’ again.

Then I saw a tricky email from someone complaining about our lousy access to journals, giving examples of articles she’d been unable to access. I could see this would need delicate handling. We agreed to get copies of the articles she needed, and I will go and talk to her later.

Then it was time to go to the IT&S Gathering. This is a monthly meeting of all staff in our broader department – Information Technology and Services. We celebrate news and achievements, staff arrivals and departures, and hear short talks from within IT&S and other people in the Institute.

Usually I’d stay and chat to people after the Gathering, but today there was another talk I wanted to attend, organised by the CrAIC (Crick artificial intelligence club). This was given by Elena Lestini, about her career switch from chemistry research to AI in a start-up. The start-up is managed by Chiin-Rui Tan (who was also present).  Elena had realised that as a woman aged 40, and a mother, she was not going to progress in academia. She retrained and with the help of a Daphne Jackson fellowship moved into artificial intelligence. I spent some time afterwards talking to other attendees, and to Elena and Chiin.

Friday 21 September
Courtesy of the Knowledge Quarter I had a ticket for an early morning (well, 8.30 am) view of the “I object” exhibition at the British Museum. This featured objects of dissent, drawn from the BM collection and curated by Ian Hislop of Private Eye. See some photos on my Instagram.

Back at work I crafted a series of tweets about the ORL event on 3 October, and sent emails to the ORL mailing list and several other listservs, as well as some internal lists. It always surprises me how much time this kind of thing can take.

I sent some emails trying to track down contacts for the families of two past NIMR scientists. The old NIMR site at Mill Hill is being redeveloped by Barratt Homes. There will be a few new roads that need naming and several blocks of flats will also need names. Barratts want to name these after scientists associated with NIMR, but they need to check with the families of those people, and I’ve ended up helping to put them in touch with some of them.

I was pleased to receive emails from two of our new PhD students asking to join the OA advocates group. I sent them invites and some blurb.

I replied to an email asking whether we would like an album of photos of the old NIMR, and followed up a request to contact a scientist who used to work at NIMR. It seems to have been a day for history and links to NIMR.

Summing up

I don’t know how many people will have made it through to the end of this post. I chose this week as it looked like I had an interesting variety of things happening – maybe too much variety! There seems to have been quite a bit of EDI and a good deal of Open Research in there, but also a solid bit of journal access, some promotional work, and some management. The number of historical queries was more than usual; I’m not sure why. I didn’t put in much time on some key projects – that will be a priority over the next 12 months. Maybe I’ll try another Library life in the day in September 2019, and see how that compares.


Open access deposits to Europe PubMed Central – building skills

Blogpost by Kate Beeby and Frank Norman.

Our funders’ open access policies mandate deposit of all primary research articles into Europe PubMed Central (ePMC). We opt for the Gold (immediate Open Access) route when we can, but if the publisher offers no Gold option then we have to deposit the paper into ePMC with a 6-month embargo. In some cases, the publisher makes the deposit for us, in other cases it is down to the author.

At the Crick, the OA team offers to make the deposit on the authors’ behalf. They like this and most group leaders do take us up on this offer. This involves obtaining all relevant files, uploading them onto the ePMC site, and associating the correct grant details with the deposit.

A short while after the deposit has been made, it is necessary to check the accuracy of the marked-up XML version of the article that has been created on ePMC. Until the accuracy has been confirmed by the author or their nominee the article is not released for public view in ePMC. The XML version is checked for accuracy around stylistic details (e.g. italicised words, use of symbols), as well as reference formatting (e.g. in-text links to figures, numerical citations) in comparison to the original files that were deposited.

It is important to note that we are not reviewing the content of the manuscript, nor undertaking some of the other checks involved with proofreading (e.g. text formatting, grammatical errors). The main focus of the deposit check is to ensure that the ePMC-created version matches that of the original submitted document.

This task can be challenging and time-consuming for staff at first. The knowledge of how best to do these checks grows with practice – though we don’t have a high throughput of articles requiring manual deposit, so practice opportunities are limited.

It’s difficult to find relevant training in this niche. There are courses from the Society for Editors and Proofreaders, but these are aimed at people looking to start a career in proofreading/editing in the publishing industry. They are too detailed for what we need. There is a useful ebook about proofreading – it’s free but needs Flash installed. It has some helpful information, particularly about the structure of research articles and what to look out for in figures, tables etc. We also wonder whether the project to define the skills that new Scholarly Comms staff need covers these skills.

It would be interesting to learn how other libraries and institutions manage ePMC deposits where required by funders. Do dedicated staff make these deposits on behalf of researchers? Are researchers required to do this task themselves? Are there any tricks you have developed when checking the ePMC version of articles?

If you are making ePMC deposits on behalf of authors and checking them later, have you developed guidance for new staff undertaking this work?  Have you found any training helpful to build these skills?

Please share your thoughts and ideas below in the comments.

Kate Beeby (@ka_be) is Assistant Information Specialist in the Library & Information Services at the Francis Crick Institute.

Thanks also to Patti Biggs for comments on this post.

Frank Norman, Patti Biggs and Kate Beeby are all part of the Open Access Team at the Crick.

To fail is to learn

After leaving school I worked in a library for a year and was in the music and drama section for six months. Towards the end of that time I was trusted enough that they let me prepare some orders for new stock. We needed to buy the orchestral parts for the Requiem Mass by Gabriel Faure. I prepared a standard order – six first violin parts, six second violin parts, four violas, four cellos and two double basses, plus all the brass, woodwind and percussion. It duly arrived and was labelled up ready to be borrowed. As it happens a choir that I sang with was performing the Faure Requiem, and hired the orchestral parts  for the concert. So when it came time to have a rehearsal with the orchestra I went up to our chorus master to proudly let him know that I had supplied the orchestral parts from the library.

Then he told me that there was a bit of a mess up.  The Faure Requiem string orchestration is unusual. It has just one violin part but the violas are split into first and second, and the cellos are too. So my standard order resulted in 12 copies of the violin part, and only two copies of each of the various viola and cello parts! My mouth dried up and I didn’t say anything but slunk away. Next time that I had to order some parts I made sure to check the instrumentation of the piece first.

How do you handle failure? Is it something to feel ashamed of, something that threatens your sense of esteem? Is it something to be concealed at all costs, denied, or blamed on someone else? This is an instinctive response for many of us. Or do you regard it just as a fact – something to be noted and investigated? Something that you can learn from and that helps you to improve? I’ve just read Matthew Syed’s book Black box thinking. In it, he shows how damaging our instinctive response to failure is and how beneficial is the learning approach. The book tries to show how failure is something we must learn from, how it is a necessary part of improving.

Matthew Syed begins by contrasting how failure is treated in healthcare and in aviation. In healthcare there is a culture of blame whereby errors are concealed and seen as marks of shame. Anyone who admits to an error will take the blame. In aviation every error, failure, incident or near miss, is treated as an opportunity (nay, a necessity) to learn and improve. There is no culture of blame. In countless examples Syed shows the harm done by a working culture that does not see failure as an opportunity for improvement.

A teacher once told me that ignorance (specifically the recognition of one’s own ignorance) is the first step towards learning. If you know everything then obviously you can learn nothing more. In a similarly paradoxical way Syed insists that failure is a necessary stage on the way to success.

He explains how our need to conceal errors or to blame others for them is a form of cognitive dissonance. If I am a leading expert on heart surgery then I will struggle to face up to my role in a failed heart operation – it strikes at the core of my expertise, my self. Syed lists several extreme examples of the mental contortions that people go through to avoid having to admit error. His examples come from law enforcement, healthcare, politics, religious cults and pseudoscience. He explains how people can have a ‘closed loop’ way of thinking: “It’s right because it’s right”.

Failure can be useful in many scenarios. It is important for innovation. James Dyson tested more than 5,000 different prototypes before his vacuum cleaner was ready to market. He was initially driven to develop his novel design by the failure of the prevailing models of vacuum cleaners. Syed also describes how a washing machine manufacturer used an evolutionary development process to make improvements to a powder nozzle. They changed something, tested it, then accepted or rejected the change and repeated the process. After many iterations (and failures) this resulted in a much-improved nozzle.
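For the curious, the change-test-accept-or-reject loop described above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not the manufacturer's actual method: the function names, the two-number "design" and the scoring formula are all made up for the example.

```python
import random


def evolve(score, start, steps=200, step_size=0.5, seed=0):
    """Change one thing, test it, and keep the change only if it scores better."""
    rng = random.Random(seed)  # fixed seed so that a run is repeatable
    best = list(start)
    best_score = score(best)
    failures = 0
    for _ in range(steps):
        candidate = list(best)
        i = rng.randrange(len(candidate))  # pick one parameter to tweak
        candidate[i] += rng.uniform(-step_size, step_size)
        candidate_score = score(candidate)  # test the tweaked design
        if candidate_score > best_score:
            # Accept the change: it becomes the new starting point.
            best, best_score = candidate, candidate_score
        else:
            # Reject the change: a failure, but one we learnt from.
            failures += 1
    return best, best_score, failures


def toy_nozzle_score(design):
    """Hypothetical quality measure; the ideal design is assumed to be (3.0, 1.5)."""
    width, angle = design
    return -((width - 3.0) ** 2 + (angle - 1.5) ** 2)


best, best_score, failures = evolve(toy_nozzle_score, [0.0, 0.0])
print(best, best_score, failures)
```

The point of the sketch is that the rejected attempts are not wasted: each one tells the process which direction not to go in, and the design can only stay the same or improve from one iteration to the next.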

In cycling, the Team Sky system of marginal gains relies on a similar system of trial and error. Across the business world randomised controlled trials have been adopted, for instance to improve response rates to letters and emails sent to customers. The colour, font and wording can be tweaked and the response rate from customers measured to see what works best.

Syed also draws on psychological research. He describes how people with a ‘growth mindset’ have been shown to be more open to learning and to attempting challenges beyond their experience, whereas people with a ‘fixed mindset’ almost fail before they try. A ‘fixed mindset’ leads people to believe they cannot learn new skills and to fear blame for failures.

I felt a bit bludgeoned by the book – the key points are made, vividly illustrated, made again, reinforced with further examples. I longed for an executive summary! But the book certainly drives its point home and it pierced my own closed-loop thinking.  It made me question  whether I embrace failure, or I rationalise failures out of existence. Do I blame colleagues for failures? How can I innovate better?

Syed asks:

“Do you fail in your judgements? Do you ever get access to the evidence that shows where you might be going wrong? Are your decisions ever challenged by objective data? If the answer to any of those questions is ‘no’ you are almost certainly not learning”.

The book’s emphasis on trying things out and responding to evidence puts me in mind a bit of the Library UX movement.  I’ve been to a couple of Library UX workshops and the strong message I took away is that it’s wrong to assume anything – we must question what we think we know and test reactions to our services.

Am I brave enough to share my failures here? I certainly do make mistakes, and fail to do things I meant to. I think I have ordered two copies of a book by mistake in the past – forgetting that I’d already ordered it. I failed to get a project off the ground as I had tried to do it all by myself and that just wasn’t feasible. When the internet was new I set up a gopher, then a website, then a better website etc etc, gradually approaching something worthwhile. I learnt something from that about incremental development. Each stage was useful in persuading more people it was a good idea.

I remember an error from much longer ago. When I was 16 I had a summer job working in a bank.  The branch manager asked me to refill the red inkwell on his desk, and told me not to overfill it.  The red ink was in an enormous bottle so I found pouring a small amount of ink into the well rather tricky. I thought I’d done it right but later that afternoon the manager was in a fury as a red stain spread through his handsome wooden desk. His resourceful secretary managed to sort things out and I lay low for a while. I learnt that ink bottles were not my friend.

It would be interesting to hear of any of your failures and lessons learnt, or improvements made through error.

What is open science?

The question

Wikipedia suggests that open science began in the 17th century, with the start of the academic journal. Some say that open science started in 1957 with the establishment of the World Data Center system, for International Geophysical Year. The system was established with agreement for “free and open exchange of [geophysical] data among nations”.  I’d assumed that open science started much later, but it all depends what you mean by ‘open science’.

Kendra Godwin wrote that “open science, while often discussed, is not well understood nor uniformly defined”. Jeroen Bosman and Bianca Kramer also observed that there is a great deal of variation in definitions of what open science is. They explain it thus:

Rather than see this as a problematic lack of focus, or as a sign that it is too early to define what open science is or not, we’d argue that the scope of open science and the variety of actors involved make it not realistic, and even counterproductive, to expect there to be, now or in the future, one definition of open science that fits all.

This has not stopped people from attempting to come up with definitions! Bosman and Kramer categorise five types of definitions, from high-level definitions and all-embracing definitions to practical definitions, plus personal and catchphrase definitions.

I started thinking about this when the Crick announced its ‘open science collaboration’ with pharmaceutical company GSK, three years ago. The press release noted that “research findings from the collaboration will be shared with the broader scientific community, via joint publication in peer-reviewed journals”. I felt uncomfortable with the designation of this as ‘open science’ as it seemed very different from what I understood by that term. One or two tweets at the time showed I was not alone. When I thought about it more I realised that there was no absolute level of openness. This initiative was certainly more open than GSK’s standard practice, but equally clearly less open than research at the Montreal Neurological Institute (the most open research institute I know of).

How can we judge at what point along the spectrum from ‘closed’ to ‘open’ research qualifies for the ‘open science’ badge? It’s a conundrum.

The answer

I thought it would be a good idea therefore to assemble a group of experts to shed more light on open science. A meeting of Open Research London on 3 October 2018 at the Francis Crick Institute will feature four top speakers who will give their thoughts on open science, in particular the relationship between open science and commercial activity:

  • Patrick Vallance
  • Jenny Molloy
  • Wen Hwa Lee
  • Tim Britton

More details are here and an Eventbrite page will go live on 30 August.


To set the scene for the meeting I’ve been doing some reading and thought I’d inflict it on you. I looked at a few definitions of open science. Many of the definitions put sharing research outputs at the heart of open science, but they differ in the details of what is to be shared. Openness and reuse are also key to many, while several definitions refer to the use of digital technology and collaborative tools. Most of the sharing is targeted at the scientific community but some definitions target the broader population too – public engagement and citizen science. Some definitions were pithy: ‘Increased sharing among scientists’, or inscrutable: ‘To make scientific research [outputs] accessible to all levels of an inquiring society’. One is almost a slogan: ‘Open science isn’t a movement, it’s just (good) science’, and one is far-reaching: ‘Carrying out scientific research in a completely transparent manner, and making the results of that research available to everyone’. A longer list of quotes from definitions is at the end of this post in Table 1.

Three extensive reports about open science have recently appeared. These are from the US National Academy of Sciences (NAS), from the League of European Research Universities (LERU) and from an international group of open scholars, led by Jon Tennant.

The LERU document states that “Open Science is not about dogma; it is about greater efficiency and productivity, more transparency and a better response to interdisciplinary research needs”. It also refers to the eight pillars of Open Science identified by the European Commission: the future of scholarly publishing, FAIR data, the European Open Science Cloud, education and skills, rewards and incentives, next-generation metrics, research integrity, and citizen science. These pillars are more all-embracing than most of the definitions above, though more aimed at policy-makers than researchers. The NAS report states that “Openness and sharing of information are fundamental to the progress of science and to the effective functioning of the research enterprise”, raising the stakes to an existential level. Finally, Jon Tennant et al’s open strategy document avoids attempting any definition, saying “it is a holistic term that encompasses many disciplines, practices, and principles”.

What, why and how to share?

Most definitions talk about sharing papers and sharing data, but several also mention software code and methods, and a couple mention sharing of peer reviews too.

Many definitions omit to mention the purpose behind open science. Where a purpose is given, it is typically to do with accelerating and improving science:

  • to boost collaborative progress and bring greater transparency
  • to transform the entire endeavour of science
  • boosting information flow can improve our collective cognition

If open science is about sharing, then to do it you must simply share your outputs, in some kind of repository. Some definitions put more emphasis on collaboration and on digital technology – which is kind of a given I’d have thought. (I don’t see much about open science sharing using stone tablets or parchment scrolls). Open licensing (Creative Commons etc) is also a key to effective sharing. Some definitions talk about sharing throughout the whole research life cycle.

Open and commercial – innovation

Some definitions of open science also refer to open innovation.  As noted above, the pharma company GSK has adopted a number of open innovation strategies. In the pharmaceutical industry, the concept of open innovation has received a fair amount of attention as a possible countermeasure to the general decline in R&D productivity. In a recent paper Bountra, Lee and Lezaun argue that ‘open science approaches represent the most promising path forward’ for drug discovery.

Open and commercial – infrastructure

Much attention has been given recently to the question of commercial infrastructure that supports open science and the conditions that may need to be imposed.  Do commercial companies have a positive role to play in the development of scholarly communications infrastructure? Mark Hahnel warns that universities should be encouraged to take help from outside of academia when developing open science infrastructure, but should be careful in these dealings. Opinion is divided between those like Hahnel who think that there is nothing about open access/open science as such that precludes for-profit provision and those who believe that the profit motive is absolutely misaligned with the core values of academic life (10).

Jefferson Pooley has observed that ‘much of the for-profit scholarly communication ecosystem sits on the value-extraction end of the continuum’, not the value-adding end. Jon Tennant wrote, in a typically hard-hitting opinion piece, about the ‘corrupting’ effect that Elsevier has on open science in Europe.


It seems that no single definition of open science fits all circumstances. I think the concepts of ‘more open’ and ‘less open’ are easier to pin down than the absolute of ‘Open’ with a capital O.  I saw another definition mentioned on Twitter that is quite good, in a different way, but it doesn’t cover every aspect.

I hope that the Open Research London event on 3 October will give us some more ways of thinking about open science. See you there!

Sources linked to in text

  • https://datascience.nih.gov/WhatisOpenScience
  • https://im2punt0.wordpress.com/2017/03/27/defining-open-science-definitions/
  • https://about.hindawi.com/opinion/a-radically-open-approach-to-developing-infrastructure-for-open-science
  • http://blogs.lse.ac.uk/impactofsocialsciences/2017/10/11/open-source-commercial-non-profit-for-profit-what-power-have-you-got/
  • http://blogs.lse.ac.uk/impactofsocialsciences/2017/08/15/scholarly-communications-shouldnt-just-be-open-but-non-profit-too/
  • http://uk.gsk.com/en-gb/research/sharing-our-research/open-innovation/
  • https://www.theguardian.com/science/political-science/2018/jun/29/elsevier-are-corrupting-open-science-in-europe
  • https://www.future-science.com/doi/full/10.4155/fmc.15.122
  • https://www.leru.org/publications/open-science-and-its-role-in-universities-a-roadmap-for-cultural-change
  • https://zenodo.org/record/1323437#.W22mqNhKjOR
  • https://www.nap.edu/catalog/25116/open-science-by-design-realizing-a-vision-for-21st-century
  • https://en.wikipedia.org/wiki/International_Geophysical_Year#World_Data_Centers
  • https://www.crick.ac.uk/news/2015-07-14-gsk
  • http://science.sciencemag.org/content/351/6271/329
  • https://en.wikipedia.org/wiki/Open_science
  • https://www.oxfordmartin.ox.ac.uk/publications/view/2613

Table 1

Phrases used to describe open science | Ref.
Increased sharing among scientists | 1
Reinventing the way we work together in the context of the web | 2
Open Science is open [research outputs] | 2
A new approach to the scientific process | 2
Cooperative work and new ways of diffusing knowledge by using digital technologies and new collaborative tools | 2
To make scientific research [outputs] accessible to all levels of an inquiring society | 2
Right to use, reuse, modify, redistribute scholarly knowledge | 2
Open science isn’t a movement, it’s just (good) science | 2
Open access to [research outputs], and other forms of multi-directional exchange between academic researchers themselves and with the public | 3
Science is purposefully conducted with digital technologies and in collaboration with others… allows for and facilitates the intentional sharing and reuse of all generated products | 4
Free availability of [research outputs]. | 5
Carrying out scientific research in a completely transparent manner, and making the results of that research available to everyone | 6
A system for scholarly communications that is built to maximize the dissemination and reuse of all research outputs throughout the research lifecycle | 7
Sharing expertise, resources, intellectual property and know-how with external researchers and the scientific community | 8
To boost collaborative progress and bring greater transparency | 1
Will transform the entire endeavour of science | 2
Boosting information flow can improve our collective cognition | 2
Using web-based tools to facilitate scientific collaboration | 9
Collaboration with people at different levels and in differing fields | 4
A process rooted in and relying on digital technologies | 4
Science conducted in a way that will allow for sharing and reuse | 4
And involvement with any or all parts of the research life cycle | 4

References in table

  1. http://blogs.plos.org/neuro/2018/01/31/open-science-sharing-is-caring-but-is-privacy-theft-by-david-mehler-and-kevin-weiner/
  2. https://im2punt0.wordpress.com/2017/03/27/defining-open-science-definitions/
  3. https://www.rand.org/blog/2017/10/can-open-science-help-to-make-research-more-accessible.html
  4. https://datascience.nih.gov/WhatisOpenScience
  5. http://www8.nationalacademies.org/onpinews/newsitem.aspx?RecordID=25116
  6. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0669-2
  7. https://about.hindawi.com/opinion/a-radically-open-approach-to-developing-infrastructure-for-open-science
  8. http://uk.gsk.com/en-gb/research/sharing-our-research/open-innovation/
  9. http://openscience.org/what-exactly-is-open-science/

Table 2

Research output | No. of mentions
Papers | 7
Data | 7
Code | 4
Methods | 3
Peer reviews | 2
Materials | 1



Cat Zero – book review

This lablit novel is set in a research institute in north London. The story is centred on a virology research lab and its work.

An old lady dies. A cat dies. More cats die – could it be suspicious? Artie is a young virologist who’s recently started her own lab and is looking to make (more of) a name for herself. An eccentric epidemiologist works just down the corridor from her lab. These are some of the ingredients of Jenny Rohn’s latest lablit novel, Cat Zero.

Cat Zero is an everyday story of virus research, set in a research institute in Mill Hill, north London. Its author, Jenny Rohn, leads a cell biology research lab at University College London.

The book is a great read – a really gripping and engaging novel. It’s a tale full of surprise and detailed plotting, with some characters who you will love and some who will repel you. All of them spring to life in the author’s hands. The setting for the story, a modern but slightly peculiar biological research institute, is less familiar than the backdrop for most novels. The institute is vividly drawn and is almost a character in its own right.

The story moves along at a very good pace and the pages keep turning – it’s difficult to stop reading. The plot takes some directions you might not predict, but is always believable. Perhaps there’s a little bit of what Graham Greene called his ‘entertainments’, where the combination of chance events makes a great tale but stretches the bounds of likelihood.

As the setting is a research institute, the novel includes many insights into life and work in scientific research labs – the joys of discovery and the long hours of preparation involved. Its setting in Mill Hill suggests that the fictional institute may have a bit of NIMR (National Institute for Medical Research) in it. Jenny Rohn has previously worked at the London Research Institute (LRI) so I suspect that it too has contributed some of the atmosphere. The novel is full of characters, oddities, minor politicking, misogyny and of course science. We are introduced to the pressures of a science career and the rewards, and the intense teamwork and collaboration that are inherent to modern research. The manner in which the science is introduced is matter-of-fact, not pedagogic. The story about the science progresses alongside the human relationship elements and the later thriller elements, never overshadowing them, but integral to them. Jenny Rohn skilfully shows that researchers have an element of the detective in them as they sift the evidence and weigh up the possibilities. Hence it has the flavour of a detective story at times, though there are no detectives in it.

I did have one or two pedantic quibbles of fact. On a number of occasions the characters make a journey from Mill Hill to Hampstead by train, as though this is a direct line.  Those two stations are on different branches of the Northern line, though it would indeed be lovely if they were directly connected. Some of the incidental talk about reading research papers seems rooted in a print-on-paper view of journals.  This seemed a bit out-of-date to me (speaking as a librarian). But these are small worries.

This book is a really enjoyable read – I loved it and I’ve spoken to colleagues at the Crick who’ve read and enjoyed the book. One of them tweeted that it is “a fun read with mathematical models playing a key role”. We have a copy of Cat Zero in our library collection and I hope other libraries serving researchers will also acquire copies. You can get it from Amazon.

Jenny Rohn

Jenny’s PhD was on the evolution of feline leukaemia virus, which plays a part in this novel, and since then she has worked in the biotech industry and in science publishing before returning to academic research and starting her own lab. She has written two previous novels, established the LabLit.com genre and website and Fiction Lab (a science book club). She is also a regular and popular blogger and speaker.


Jenny is a friend and fellow blogger on the OT site. That certainly influenced me to read the book, but had no bearing on my enjoyment of it. If I hadn’t liked the book I would just have said nothing!

A version of this blogpost was initially posted on to Amazon as a review of the book.

A new scientific archive – launch and reflections

The event

I recently attended the launch of the EMBL archives, in their new purpose-built facility at the heart of the EMBL Heidelberg campus.  Most of the audience were from EMBL but there were a few scientific archivists there too, admiring what has been achieved.

At the launch event we had a chance to look round the new facility and see some documents and photos from the archive. Then we heard talks by Iain Mattaj (EMBL Director), Giulio Superti-Furga (EMBL alumnus and Scientific Director of CeMM) and Anne-Flore Laloe (EMBL archivist). All three speakers emphasised that the archive is and will continue to be a community-driven effort. EMBL alumni from across the world have submitted material. The archive is all about the people.

Cutting the ribbon to launch the archive. The slide shows the building where the archive is located.

Celebrating in style with jelly/fruit/cake

The archive

The germ of the idea for an archive came from the EMBL Alumni organisation as they prepared to work on the 40th anniversary of EMBL in 2014. They enlisted Sydney Brenner to endorse the idea (he had written to Nature in 2007 about the need for historical archives of science).  Iain Mattaj, the EMBL Director, backed the idea and EMBL management took action to implement it.  Anne-Flore Laloe was recruited in 2015 to be the archivist.

Rolling stacks provide about 600 linear metres of shelving

As well as documents there are old instruments

The director’s view

Iain Mattaj stressed that it is important to keep records and archives of important information on events which help to shape our science and our society. This new archive is primarily about EMBL and EMBO but, because of the major role those organisations have played in the last 40+ years, the archive is also important for European molecular biology more generally.

Iain thanked Jenny Haynes and Jenny Shaw from the Wellcome Library for their input and support, as well as many others who lent support or ideas to the project to create the archive. And, of course, he thanked the archivist Anne-Flore who has taken the ideas forward, developed them and made them a reality.

The alumnus view

Giulio Superti-Furga recalled that several decades ago a small group of physicists and geneticists coalesced around the new field of molecular biology. Now there is a huge community of people working in molecular biology who have built a massive scientific field. Giulio compared the field’s development to someone running up to the top of a hill and then, exhausted, pausing to ask ‘how did we get here? Can I remember? Where are we?’  He said that this is why we need an archive.  The paradox is that it is exactly when things are happening that we have no time to record them. Hence EMBL only thought to create an archive after 40 years of leading research.

Giulio said that the archive is a fantastic, major milestone which also has symbolic value – showing how the molecular biology community in Europe came of age. He hopes that it will inspire other organisations too.

Archive photo of the EMBL site

The archivist’s view

Anne-Flore started in her new role at EMBL with a blank slate, but huge expectations. When she started she had one cardboard box of papers and records.

“Archiving for EMBL’s future”

In the early days she talked to as many people as possible and told us that reactions were mixed. One person said “I’ve never seen an archivist before.” Another said “I thought I was going to meet with the EMBL anarchist!”.  She also held talks with archivists in many other institutions as she turned the initial plans into realistic ambitions. She wrote terms of reference and a collecting policy, and procured a catalogue system. The setup of the archive needs to stand the test of time, be adaptable to future changes and reflect the spirit of EMBL.

She catalogued the first item in mid-2016. Some big themes of the archive thus far are: instrumentation, photos, training, bioinformatics, social aspects of the lab.

Now the need is to ensure that things are collected in a representative fashion, capturing material across all fields and types, including things that are being worked on right now. Historians may be interested in documents related to publications (cf. Darwin’s notebooks; Newton’s letters). Researchers should consider what papers of theirs might be of interest. Anne-Flore has started doing oral interviews too.

Her long-term goal is that no scientist will question the need for, and existence of, science archives.

To learn more about her approach, see her 2017 EMBO Reports article, explaining how “Archives for molecular biology preserve the heritage of science beyond the published record for future scholars”.

Anne-Flore emphasized that the EMBL archive is an accessible resource. Anyone can come in. It is a contemporary archive of science and technology, part of the broader landscape of archives.

Further reflections from Giulio

Two years ago I heard Giulio talk about the EMBL archive and the role of a scientific archive, at the first scientific archives workshop. He had some interesting ideas about the responsibility that scientists bear to future generations and he developed some of these ideas again in his talk at the archive launch.

Giulio believes that scientists should start self-reflecting while they are doing their research.  He said that without historical knowledge, we easily forget how things happened. For example, it is startling to be reminded that until the early 1960s most scientists thought that protein was the carrier of heritable characteristics.

Giulio sees it as a duty for scientists to record the history of their own research. Over 20 years ago he started a habit of documenting his research and his work in a daily journal. He uses black ink on acid-free paper. He has written about 7,000 pages in this journal over the past 20 years.

Giulio’s notebooks

Researchers now are being encouraged to adopt the tenets of ‘Responsible research and innovation’ (RRI) – a kind of extended research ethics framework. The EU defines RRI as covering ethics, societal engagement, gender equality, open access/science and science education. Giulio’s definition of RRI also includes “proper recording of work and intellectual contribution” – going beyond publishing papers to record the context of research.

He then outlined what he’d like to see become mainstream practice:

  • Training young scientists to understand the importance of being responsible and to understand the importance of accountability.
  • Training established scientists to manage their data / reagents / legacy wisely.
  • Fostering the interface between natural sciences and humanities, and exposing young natural scientists to social scientists and historians.
  • Having all major scientific projects and initiatives accompanied by a social scientist (e.g. historian or sociologist).
  • Interviewing veterans and encouraging them to save and contribute their personal archives.

My thoughts

The archives ‘catch-up’ model seen at EMBL (collecting archives 40 years after the organisation began) is similar to what happened at CERN (collecting archives 25 years after the organisation began).  At the EMBL event I talked to the Records Manager of a new research institute. She is considering how to make the case now for defining the approach to archives, and I am in the same situation. Young organisations can be too busy establishing themselves to spend time thinking about archives. It’s good in theory to start collecting materials from the beginning, but it is hard to persuade people to see current documents as part of history.

I love Giulio’s practice of keeping research notebooks, but I suspect few researchers will be tempted to follow suit. I think his idea for attaching a social scientist to major projects is good, but perhaps an ethnographer or anthropologist would be even better. I don’t expect this will become widespread, but even to see some projects adopt the idea would be interesting.

It’s good to see scientific archives becoming more visible – the second workshop on science archives will take place later this month. I’ll be interested to learn there about new ways to capture the record, the process, the history of science.


Preprints in the news

I think Fiona Fox’s recent question about preprints and their impact on science news reporting deserves more consideration. She calls for more discussion of the issue and of possible solutions.

Preprints – good

I’ve invested quite a bit of time in supporting the idea that posting preprints should be a normal research practice in biomedical research. I admit that I was sceptical about preprints initially.  Sure, it worked for physics and economics, and people asked “Why can’t it work for biomedicine?” But the majority of reactions to Harold Varmus’ 1999 E-Biomed proposal showed that there was a good deal of disquiet among biomedical researchers about the idea of preprints. “Biomedical research is different from physics!” they said.

The growth of quantitative biology preprints in the q-bio section of arXiv showed that preprints could work in biology. In 2013 the founders of bioRxiv thought that preprints could be used more widely in biology, and their optimism has been borne out.  In 2015 Stephen Curry said that biologists should ‘Just do it’ and they have been, in increasing numbers.

Contents of Prepubmed preprint search tool, June 2018. Jordan Anaya
Source: http://www.prepubmed.org/monthly_stats/

A recent eLife webinar highlights the excitement about preprints in biology, and the need for more new initiatives. There is still a long way to go before all papers are posted as preprints, but we are seeing a steady increase year on year. Initiatives like ASAPbio highlight the benefits – speed of dissemination, increased feedback and visibility, establishing priority. It would be good to have more data about the benefits to back up the anecdotal accounts.

Infrastructure around preprints has evolved quickly, but I’m not sure it’s on a sustainable basis yet and there is still a need for standards development. I’m looking forward to seeing how bioRxiv develops following the CZI investment and what plans ASAPbio have.  It was good to see recently that Europe PubMed Central is now indexing preprints.

I think things are moving in the right direction for preprints and expect that in ten years’ time we’ll see many beneficial effects resulting from their adoption.

Preprints – problematic?

I was intrigued therefore to see the blogpost about preprints from Fiona Fox on the Science Media Centre website. Fiona has a good deal of experience in science communication and her insights are always worth reading. Evidently many people read her post about preprints as there was quite a backlash on Twitter. Her post made three main points about preprints:

  1. They seem to be beneficial for science
  2. They create a difficulty for science journalists
  3. They’re not peer-reviewed and therefore potentially dangerous when it comes to medicine

On the first point, very few will disagree. On the third point, I think many will partially agree too, but I rather thought this potential problem was well understood and on the way to being sorted. Philip Bourne et al’s ‘Ten simple rules to consider regarding preprint submission’ include this:

This point about potential dangers has been discussed and discussed ad nauseam in the past.  It is being discussed again now as plans for medRxiv are developed and safeguards are being put in place. I don’t think it is an insuperable problem.

Slide from talk by John Inglis, ‘Preprints in Biology and Medicine’
Source: https://www.slideshare.net/BaltimoreNISO/inglis-preprints-in-biology-and-medicine

News reporting, preprints, embargoes

For me the most interesting part of Fiona’s blogpost was her point about the difficulties that preprints pose for journalists trying to cover news of scientific advances.

Once an article is accepted by a journal for publication there is usually a period of time before it is published during which it can be sent to journalists ‘under embargo’.  This allows the journalists time to do some background work and prepare their news story. Supporters of the embargo system maintain that it is necessary to ensure high-quality reporting, and to ensure that science stories are seen as ‘newsworthy’.

Preprints currently are not subject to any embargo, but are posted online as soon as they have been through some basic checks. Hence journalists have no time to prepare. If they take the time to interview other researchers and gather opinions about the research in the preprint, then they run the risk that someone else gets into print sooner than them with a story, and their well-researched story might be discarded by a news editor hungry for hot news.

Embargoes divide opinion, but the Science Media Centre is a strong supporter of embargoes in practice. If you hate the embargo system then you might rejoice that preprints may hasten the end of that system.  If you accept that embargoes are good for science reporting (and that science reporting is good for science), then we need to explore how preprints and good science reporting can work together.

Would it be feasible to have an option of a short embargo period for some preprints – those where press interest is expected / desired? Fiona suggests a number of other possible strategies at the end of her post, but these seem to have been overlooked in the rush to condemn her comments about the dangers of preprints. I think her call for discussion is timely and it’d be good to see proper engagement with it.

Surely we just need to adapt our current approaches to the new kid on the block? Maybe. But I still think we need to use this time to thrash out best practice and agree what the new rules should look like.

Or is there something we have not thought of that could get us round these new realities with minimum adverse effects? 

The changes being made to a part of the system that was not working are set to have profound knock on effects on another part of the system that works and serves science well. The challenge here is to fix one end without losing the gains we have made in reporting findings to the public in an accurate and measured way.

Extracts from: Fiona Fox, 17 July 2018. The preprint dilemma: good for science, bad for the public? A discussion paper for the scientific community.

