Seven sins of science writing

I was never trained as an editor, but a few years ago I found myself as joint editor of our Institute’s annual volume of essays on science, aimed at a lay(-ish) audience. For the first few years I worked on them with a couple of more experienced scientist-editors, observing how one of them in particular could, with a few strokes of his pen, turn a clumsy sentence into an arrow of meaning. For the past five years I have been sole editor. I confess I do struggle sometimes when the topic is unfamiliar and/or the essay is pitched at the wrong level. I can usually find someone in the Institute to help out though, or else give some feedback to the original author. Most authors seem ready to accept the feedback I give.

After putting this year’s essays to bed I found myself reflecting a little on the variety of editing problems that I have encountered over the years and I came up with a list of seven sins of science writing. They are probably neither original nor definitive, but I thought I’d share them anyway.

1. Error
This is the most fundamental sin. If you get some of your facts wrong then you have failed at the first hurdle. How can you explain what you don’t know yourself? If the topic of your essay is just a little outside your expertise, or if parts of the essay stretch past your comfort zone, then you need to check and recheck what you say and ideally run it past an expert. Getting something wrong is embarrassing!

2. Ambiguity
The facts should be set out clearly, leaving no room for doubt. It’s not enough to know something in your mind, you also have to find a way to express it in words that another mind can understand. If you leave your readers wondering what you meant then you have failed.

3. Obscurity
Don’t obscure your message by peppering your prose with technical terms that non-specialists will not understand. In particular you should try to avoid using too many names of entities. I think of this as the War and Peace problem: when the list of participants (genes, proteins, whatever) is too long, the reader can’t remember who is who and quickly becomes lost.

4. Complexity
Long sentences, with multiple clauses and tricky constructions, do not aid comprehension and readability. This doesn’t apply only to science of course, but I think it’s endemic in science. Trying to express something very precisely, especially when that something is inherently complicated, often results in sentences with too many twists and turns.  Cath Ennis memorably described this phenomenon as a gnarly word salad. Don’t be afraid to break the sentence up into smaller sentences. Maybe some of those sub-clauses are not really necessary.

5. Clumsiness
OK, we’re starting to get into personal abuse now but there’s no getting away from it. Some writing is just clumsy; not difficult or obscure, just clumsy. It’s like driving along a road and hitting a pot-hole. You’re reading the sentence and suddenly your flow is interrupted as you have to figure out whether an “it” means the thing last-mentioned or the thing before that. Grammatical slip-ups are another common kind of pot-hole.

6. Ennui
It is important that your reader stays awake long enough to finish reading what you have written. Two things help here: brevity and a light touch. If you’re given a word limit try to stick to it or you may be making unrealistic demands of your reader’s attention span.  Also try to inject some personality into your writing, something to make it interesting – perhaps even some jocularity.  But please, not too many exclamation marks!

7. Irrelevance
Stick to your brief and if you are not sure what your brief is then talk to whoever has commissioned you to write the piece. If what you write is not relevant in some way to the intended audience then it doesn’t matter how good it is.  Perhaps that’s an overstatement, but your choice of topic and the way you decide to treat the topic are something you should give thought to, preferably before you start writing.

While preparing this diatribe I saw a couple of other posts relevant to good writing:

Interesting that they are both positive “tips” rather than my more negative “sins”.  Sins sound more interesting than tips though, so that suggests a final sin that I am guilty of quite often:

Don’t choose a boring title!

Posted in Writing | 5 Comments

Ethical retrieval

It may surprise you to know that librarians have codes of professional ethics. The main UK membership organisation for librarians, CILIP, requires its members to follow its ethical code; the American Library Association have something similar. Subject classification and indexing is one of the more interesting areas where ethical concerns can arise. Headings that seemed fine in earlier ages may no longer seem fit for purpose (a bit like the famous moment when the American Psychiatric Association reclassified homosexuality in DSM-II). The great US cataloguer, Sanford Berman, has been a leader in pressing for bias to be removed from subject headings. See this article summarising his achievements (pdf). Sanitising the catalogue in this way may be seen as politically correct but sometimes it is just common sense (e.g. putting Mark Twain under “English literature” is just wrong!).

In these days of internet search engines and full-text retrieval, library subject headings seem rather arcane and unnecessary.  You can search for whatever term you want in Google, be it abusive or polite, but there are still problems.  Google Scholar is an index of scholarly literature, but the way that it defines and detects what is scholarly has led to some disquiet recently and a petition to remove creationist material from its index. PZ Myers has pointed out that the petition is wrong-headed:

Google Scholar does not index on content; it can’t, it’s just a dumb machine sorting text …The way items get on Google Scholar is based entirely on whether they’re formatted like a scholarly paper.

Google then is not concerned with the content and makes no judgment on the rightness or wrongness of it, rather like the principle of net neutrality which is  in the news right now.

Google does make some value judgements though. There has been a growing wave of complaints that its search service is becoming dominated by spam sites:

Google’s search results [are] full of spammy links that lead to nothing of value… content scrapers, marketers, or sites that consisted of nothing but keywords surrounded by useless crappy content.

Some people were suspicious that the presence of Google ads on a site affected its position in the search rankings. Google have denied this and have now responded with a promise to work harder to remove these so-called ‘content farms’ from search results. The Blekko search engine is taking similar steps; these spam sites are not just a Google problem.

All search engines share the same problems of course – how to find everything relevant and only what is relevant, and to present the most relevant items at the top of the list. They each find their own way to resolve that problem, giving slightly different results.  Or do they?  Danny Sullivan, who blogs about search engines, has reported that Bing, Microsoft’s search engine, has been copying results from Google. In an elaborate sting operation Google created some ‘synthetic’ search terms to seed some false results into its database.  They then searched for these terms using laptops with Internet Explorer and the Bing toolbar installed. Within two weeks the false results were appearing in Bing.  Microsoft have admitted that they do watch how their customers use Google but say that this is not copying Google, and anyway all search engines do the same.

Who would have thought that Search Engine Ethics 101 could be so interesting?  I was surprised that a Google search for search engine ethics brought up quite a few results, including some from the International Review of Information Ethics which was a new one on me.  There is even a book The Blackwell Guide to the Philosophy of Computing and Information if you want to immerse yourself in the topic.

Google of course are famously the company who do no evil. But Siva Vaidhyanathan has just published a book called The Googlization of Everything: And Why We Should Worry. He doesn’t think that Google is evil, but he does think that its dominance and the speed with which it has reached that position are a little worrying.  I confess I haven’t read the book but there is an interesting interview with its author in Publishers Weekly. I think this comment from Vaidhyanathan gets to the core of things:

The assumption for years has been that Google merely aggregates our decisions, perceptions, and our judgments. But it’s not that simple. Google is not without its biases, and I wanted to try to unpack the nature of some of its biases, which, not surprisingly, skew toward what’s new, popular, and tech-savvy. The major realization I had in doing this book is that Google now governs the Web, and more because of the choices it makes than the choices we make. Think back to when Google first started. There were a handful of search engines, and if you went to any of them and typed in common words like “Asian” or “facial,” you’d get porn sites. It was Google that figured out how to make our Web experience better by filtering—not by censoring or blocking access to porn sites. But while Google is officially content-neutral, de facto it’s not, because it filters. For example, it favors certain aspects of page design. That’s a good thing, of course. It has made the Web better. But it is also important that we acknowledge what Google does, and that Google now pretty much runs the Web, albeit with our tacit, implicit consent.

Maybe Google should be signing up to one of those codes of ethics, or recruiting Sanford Berman to advise it?

Posted in Ethics, Searching | 5 Comments

Peering at review

The House of Commons Select Committee on Science & Technology have announced that they will conduct an inquiry into peer review. They list eight points starting with:

the strengths and weaknesses of peer review as a quality control mechanism for scientists, publishers and the public

and ending with:

the impact of IT and greater use of online resources on the peer review process, and possible alternatives to peer review.

This is not the first time the committee has looked at publishing. Back in 2004 it produced a report on open access called Free for all? That report was a reasonable round-up of the then state of play but did not precipitate any major changes. It gave a greenish light to UK Research Councils to define their initial, rather weak, policy on OA.

The 2004 report did have something to say on peer review, coming down in its favour and concluding:

As is the case with any process, peer review is not an infallible system and to a large extent depends on the integrity and competence of the people involved and the degree of editorial oversight and quality assurance of the peer review process itself. Nonetheless we are satisfied that publishers are taking reasonable measures to maintain high standards of peer review. Peer review is an issue of considerable importance and complexity and the Committee plans to pursue it in more detail in a future inquiry.

Well, seven years later, here is that future inquiry.

I had a feeling that POST, the Parliamentary Office for Science & Technology, had produced one of their useful briefing papers on the subject of peer review but was surprised to discover that it was back in 2002, which seems an age ago.

Peer review has been on everyone’s minds recently.  I first started to take notice of peer review in 1999 when Harold Varmus published his E-biomed proposal.  Varmus was head of the NIH at that time, which had recently (1997) launched PubMed as a free index to biomedical scholarly literature. Advised by David Lipman, the genius behind PubMed, and Pat Brown, a leading biomedical scientist from Stanford University, Varmus proposed a wholesale change to biomedical scholarly publishing.

I remember reading this proposal with a sense of disbelief. It did make sense but seemed to ignore the achievements and existing role of publishers (perhaps intentionally?). It proposed a really radical change to the system for publishing research without considering the practical realities of how to achieve that. The system envisaged was summarised thus:

(i) Many reports would be submitted to editorial boards. These boards could be identical to those that represent current print journals or they might be composed of members of scientific societies or other groups approved by the E-biomed Governing Board.

(ii) Other reports would be posted immediately in the E-biomed repository, prior to any conventional peer review, after passing a simple screen for appropriateness.

The proposal attracted some criticism, particularly what it had to say about peer-review.  Stevan Harnad welcomed the proposal but stated:

there is no need whatsoever to tamper with this proven system of  quality control in order to achieve the optimal outcome

Harnad has repeatedly insisted that peer review changes are not necessary to furthering the cause of open access.

The proposal was never realised in its most radical form but as a direct result of it were born the open access publisher Public Library of Science and the NIH repository PubMedCentral. Peer review has had a few tweaks and mini-experiments but I think it has not been seriously threatened since then.

The system that Varmus had proposed back in 1999 still seems radical. Cameron Neylon has recently suggested something similar and discovered that publishing without peer review is far from being a mainstream idea (though Nature Precedings manages to do it). I attended a talk that Cameron gave last October in the Research Information Network series on Research Information in Transition. He asked what an ideal scholarly communications system for today would look like and stated that it needed to address archiving, registration and communication. He pointed out that the present system had its origins in the 17th century, in an age of paper and of centralised production and distribution. It may not be ideal for today with electronic communication and a much more diverse set of published objects.

Cameron suggested that peer review in publishing is too blunt an instrument. I don’t know – I do hear scientists complaining about peer review in practice, but also hear strong support for the existing system from other scientists. I think there is a good deal of worry that attempting a change will wreak havoc on the whole academic enterprise and research careers. Major changes could certainly damage the commercial concerns of journal publishers. What is not clear to me is how significant any problems with peer review are, and whether there is a workable alternative.

I hope the Select Committee inquiry will help to move us closer to finding answers to those questions.

Posted in Journal publishing, Open Access | 8 Comments

Macaroni and Montaigne

My Facebook page is looking very macaronic these days. Forget pasta; I’m talking language here. I am not an expert on Flemish vocal music from the Renaissance period, but I sometimes listen to a CD of music by Ockeghem and Josquin des Pres. That is how I learnt that ‘macaronic’ denotes a text that is a mixture of languages, such as the Latin/German mixture of the 11th/12th century Carmina Burana manuscript (made famous much later by Carl Orff’s musical setting of part of it).

As Steve Caplan said about his love of bilingual jokes, appreciation is helped if you can understand both languages. Unfortunately I am only fluent in English, with just a few smatterings of words and phrases from other languages. Not so my Facebook friends. I have one German friend who mainly posts to Facebook in English but also in German, Spanish and Portuguese from time to time. I see French, Catalan, Italian, and Turkish in wall postings from other friends and occasionally some Sinhalese. My partner is originally from the Philippines so I have many Filipino in-laws and friends. They post and comment sometimes in English but more often in Tagalog, which is the main language of the Philippines, or use one of the other languages or a mixture of languages. My biggest problem is trying to understand some of the younger family members who post in some kind of Filipino textspeak/youthspeak. I think this may be Jejemon but I am not too sure. Suffice to say, some days my Facebook wall is quite impenetrable.

I love it though.  I love to see photos from friends, whether in London or Manila, and odd snippets about what they are up to. I feel some kind of connection that way. It’s also helpful to be reminded when their birthdays are coming up.

Facebook gets a good deal of criticism from tiresome critics to the effect that it isolates people from the real world. Sherry Turkle is the most recent critic. I have some respect for her – I heard her talk some years ago, around the time that her book Life on the Screen came out. She was very persuasive and clearly a thoughtful person. I haven’t read her latest book, Alone Together, and I would not dare to challenge whatever data she brings to bear in it to support the claim that technology is

actually isolating us from real human interactions in a cyber-reality that is a poor imitation of the real world

I only know about myself and my own experience of using technology and how it has affected me.  I admit that sometimes one can get a bit obsessive about it, but mostly it has had a positive impact on my life in the past 20 years. What annoys me about the critics of Facebook and its ilk is their assumption that in the real world everyone is highly sociable and socialised. They seem to think that use of Facebook and other social media is displacing real-world interactions.  How dare they assume that I am at ease in real-world social situations and would be out gallivanting every night if I wasn’t so attached to my technology? Why do they think that sitting in a cafe or pub fiddling with my phone is worse than sitting there reading a book or newspaper?  If you deprive me of technology you don’t miraculously make me into someone who is happy to start talking to strangers; you just take away the communication tools that work best for me.

I read another critique of Facebook recently that didn’t annoy me so much, though I still thought it was wrong. Saul Frampton harked back to Montaigne and his recognition that

our inbuilt capacity for sympathy depends on our physical proximity to others

Intuitively this feels right – sympathy is higher for someone in the room with me than if they are at the end of a phone line. But I can still feel sympathy for someone on the end of the phone, or at the end of an email etc. Often I don’t have the option of being in the room with them, so technology really does help to connect me to that person. I think it is hard to feel sympathy for someone you have never met.

Michel Eyquem de Montaigne-Delecroix, 1533–1592

Jill Foster was a pioneer in the UK of using networks in academia. She started Mailbase, NISP and Netskills among other things and she was partly responsible for getting me interested in such things.  She was always adamant that face-to-face meetings were important as a way to cement relationships and that electronic communication was easier with people you had actually met.  I have found that to be true.

Maybe some social media users are just diving in and having conversations with random online strangers to the exclusion of all real-world experience. That is certainly not my experience and I resent being told that technology is doing me harm.

I think I may be whining too much here. If so then I apologise but I feel I have to defend something that has been important in my working life and now in my life too.

On a more positive note, I haven’t been able to listen to this three-part BBC Radio series yet but it looks fascinating. The secret history of social networking goes right back to the 1970s when “hackers met hippies” in California and the first community bulletin boards were born. You can find it on Facebook of course.

Finally, this cartoon seems apposite.

Posted in Social networking | 12 Comments

Some interesting reading

This is a shameless plug for the annual volume of essays that my Institute puts out: the Mill Hill Essays. My justification is that a) they are interesting, b) they are free to read on the web and we give away print copies, so they are totally non-commercial, and c) I manage and edit them, so I have been living and breathing these essays over a period of several months.

Our tradition of an annual volume of essays was started 15 years ago as an attempt to shed some light for a general audience on scientific issues of topical interest. The essays are written by staff at the Institute, or ex-staff, with occasional guest authors. Sometimes they are topical, usually they are interesting, occasionally both. This year’s collection is bigger than ever, with nine essays plus ten mini-book reviews. I will take you on a canter through them, including some of the images included in the booklet.

The 2009 H1N1 swine flu pandemic: don’t panic but you are all going to die by Peter Coombs.
Peter is a postdoc in our Virology Division, which includes the WHO Influenza Centre that has a key role in influenza surveillance. He describes the course of the 2009 outbreak of pandemic influenza and what was done to contain the pandemic and characterise the virus.

A dangerous occupation by Zhores Medvedev.
Zhores was a geneticist at the Institute in the 70s and 80s, specialising in ageing. He has also written about the Chernobyl disaster and Soviet science. He recently showed me a photo of his twin brother, Roy Medvedev, being presented with an 85th birthday present by Vladimir Putin. In his Mill Hill Essay Zhores describes his early life and education in Soviet Russia during the Second World War and immediate post-war years, and the influence of Trofim Lysenko on Soviet science.

This cartoon by Argentinian artist Roberto Bobrow shows Vavilov, Lysenko and Stalin. See his blog for more of his work.

Courtesy of Roberto Bobrow

Bringing it all back home: next-generation sequencing technology and you by Mike Gilchrist.
Mike is a programme leader in our Division of Systems Biology. He gives a brilliant exposition of how high-throughput sequencing works, what it means and what we can learn from its results.

Immortality and obscurity by Harriet Groom.
Harriet is a postdoc in our Division of Virology. This is an essay-review of Rebecca Skloot’s best-selling book about Henrietta Lacks, which won the 2010 Wellcome Trust Book Prize. Harriet explains why HeLa cells are both remarkable and very useful to science.

This portrait depicts Francisco Pizarro, one of the Spanish conquistadors.

Conquistadores and cot death by Marianne Neary.
Marianne is a PhD student in our Division of Developmental Biology. She describes an interesting link between research into cot death and adaptation to life at high altitude. This essay was shortlisted for the Max Perutz Science Writing Award 2010.

Is immunotherapy the ultimate solution for Alzheimer’s Disease? by Marina Lynch.
Marina is a professor at the Trinity College Institute of Neuroscience in Dublin. She worked at Mill Hill in our Division of Neurophysiology and Neuropharmacology in the 1980s. Her essay explains what Alzheimer’s disease is and how immunological therapies are showing great promise as treatments for Alzheimer’s.

This portrait of Stephen Fry is by Lotte D’Hulster. You can see more of her work on her blog.

Courtesy of Lotte D’Hulster

Lithium, manic depression and beyond by Qiling Xu.
Qiling is a researcher in the Division of Developmental Neurobiology. In her essay she explains the pharmacological effects of lithium, its use in bipolar affective disorder and its effects on major developmental signalling pathways.

Translation: beating scientific swords into medical ploughshares by John Galloway.
John is Head of the Dental Team Studies Unit at the Eastman Dental Hospital. His essay examines what translational research is and what its role is in bridging the gap between basic biomedical science and clinical benefits for patients.

What makes bone marrow such a versatile resource for curing human diseases? by Thomas Elliott.
Thomas is a student at Queen Elizabeths Boys School. This essay won the 2010 NIMR Human Biology Essay Competition for local schools.

An innovation in the 2010 Mill Hill Essays is the inclusion of a series of short book reviews by Institute staff. These cover quite a range, all on scientific topics, from prize-winning popular science books like Life Ascending and The Age of Wonder to books on influenza, medieval science, genetics, and drug research, and ending with a Mikhail Bulgakov novella. We also have a second review of Rebecca Skloot’s book in this section; since someone was kind enough to write it, it seemed rude to refuse! You will also find a few reviews written by me in this section.

Well, that’s it for another year.  I already have some ideas for topics for the next volume, but am always happy to have suggestions for topics we should cover.

Posted in Reading recommendations | 4 Comments

The importance of the ephemeral

I was given a new digital radio for Christmas, and now have radios installed in every room of my flat, meaning I can listen to my beloved Radio 4 wherever I am. This morning I was interested to hear John Lichfield, the Independent newspaper’s France correspondent, talking about France. Well, of course he was actually talking about his new book, a compilation of his essays on all things French.

He explained that while as a correspondent he wrote mainly about the major political stories of the day, he also wrote observational pieces and reports of trivial things that happened in his everyday life in Paris. These more ephemeral pieces often produced bigger postbags as they chimed with readers. He pointed out that no-one now is interested in (or remembers) the beef war of the 1990s and all the petty politicking from year to year, but “what happened to me in the baker the other day” can carry universal truth and experience and endure for years. Thus, the ephemeral is longer-lived than the important news stories of the day.

It struck me that this is very true in blogging too. Sometimes I wish I could chase news stories and blog about them in real time, but I can’t devote the time that would require. A more personal approach, sometimes taking the long (or sideways) view, works better for me.

I think this is perhaps a feature of several bloggers here at Occam’s Typewriter. We celebrate the really important ephemeral tales of life, scientific and otherwise, rather than focusing on the here-today, gone-tomorrow, big-splash news stories.


Posted in Blogology, Froth | 8 Comments

Nature’s new position statement on open access

Nature Publishing Group (NPG) have just issued a new position statement on open access. It aims to give a useful overview of the company’s current activities in open access, and it sets out their policies and viewpoints with respect to open access.

Let’s face it, “self-serving press release” is a tautology.  The statement is a cunning mix of public relations and information.   To be fair to them, NPG do have  liberal policies permitting self-archiving and they have taken some bold steps, like establishing a full-OA journal and a high-level hybrid OA journal.  Last month they also added OA options to several of their academic journals. NPG have cooperated with UKPubMedCentral too and are working with the EU’s PEER project, which is looking into the effect of self-archiving on scholarly communication.

At the core of the statement though is NPG’s self-justifying mantra: “one size does not fit all… Scholarly communication [is best] served by a mix of models”. Hence, when discussing their own high-impact, low-acceptance-rate journals NPG say that “it seems fairer to spread the costs across the large number of readers, rather than the much smaller number of authors”. Note that they say “it seems”; in other words no evidence or rationale is provided, and it is just an article of faith at NPG. Despite the recent discussion about submission charges, these are barely mentioned.

It is interesting to note that there is no mention of last year’s controversy about the price increases charged to the University of California, beyond a veiled statement that “subscription prices can be controversial” and an assurance that NPG “keep prices as low as possible”.

On the other hand, another press release issued yesterday announced some new options for accessing NPG content. Articles from some journals are now available on the DeepDyve platform (see my post for more info about DeepDyve). Low cost access options are also now available on the nature.com iPhone app. Martin Fenner has welcomed both of these options, though takes issue with the pricing.

I was interested to read what the statement says about Nature Communications, NPG’s only Nature-branded open access offering:

It has a higher acceptance rate than other Nature titles, and accepts some manuscripts previously rejected by the Nature research journals (subject to independent editorial review). Along with digital-only publication, this reduces the costs per manuscript published, and so an APC of $5000 is viable. Nature Communications was born-hybrid, and currently 40% of its content is open access, much higher than most other hybrid open access journals at this time.

It had always struck me that Nature Communications was an uneasy compromise between high-impact (with the magic Nature brand in the title) and high-volume (necessary to keep the article charges reasonable).  It seemed to go against their mantra that OA is not possible for high-impact journals, and also seemed a tad hypocritical in view of their earlier comments on PLoS ONE (see also here and here).

The statement ends with its most interesting news, the announcement of another new launch:

In 2011, NPG will expand its open access publishing programme with the launch of Scientific Reports. This will be born-digital, fully open access (with Creative Commons non-commercial licences), with an acceptance rate significantly higher than Nature Communications. Scientific Reports will enjoy all the benefits of the nature.com platform, while offering authors the choice of a highly-affordable open access publishing option.

This looks very like NPG’s answer to PLoS ONE. I look forward to seeing reactions to this announcement.

Posted in Journal publishing, Open Access | 6 Comments

Philosophy and biology

When I was first an undergraduate, studying chemistry many moons ago, I still had some  pretension to being an intellectual.  That quickly evaporated as I discovered this intellectual stuff was, you know, HARD! But in my first year of study I took a course in philosophy and mostly enjoyed it (though trying to read turgid philosophical prose whilst sat in a warm library was the best sleep-enhancer I have ever found).

I recall a series of lectures on metaphysics from Stephan Körner.  These were well-attended, including many students who were not studying philosophy. Professor Körner liked to show how philosophy was important to all areas of knowledge. He would ask members of the audience what their area of study was, then show how philosophy impinged on quantum physics, or history or biology or whatever.  One time when he asked this question I stuck my hand in the air and piped up to say that I was a chemistry student.  The Professor just grimaced a little and passed on to the next hand, with a comment that there was really nothing to say about chemistry and philosophy!

Biology, on the other hand, throws up all kinds of philosophical issues. Cambridge University Press have just launched a new series of short textbooks entitled Cambridge Introductions to Philosophy and Biology. The publisher says that the books are:

short and accessible, offering lively and up-to-date discussions, and are designed to be used by a student readership in conjunction with university courses

Interestingly the first advertised titles, coming in March and September, deal with Paleontology and Agro-technology respectively.  Later titles will deal with human evolution, genetics and organisms.

It’s interesting to see such a series aimed at science students.  I wonder how long it will be before they start another series, on Philosophy and Chemistry?

Posted in Reading recommendations | 8 Comments

The end of academic libraries

This is my first real day back at work after Xmas and New Year, and my jury service.  I abjectly failed to get writing over the festive break so I am cheating and just directing your attention to a short piece in the Chronicle of Higher Education on the death of academic libraries. I do sympathise with much of the writer’s analysis, viz. that librarians’ success in embracing the digital has made their print collections and their own activity redundant.  I agree with some commenters that he overstates things a little, though by how much remains to be seen.

I think the piece is intended as a wake-up call, urging us to start “plotting a realistic path to the future” for library services.  That task is very much on my mind right now.

I hope to be back with a proper post later this week.

Posted in Future of Libraries | 2 Comments

Google

The first Internet search engine I used, back in about 1990, was Archie. This was an index of content hosted across the Internet on FTP servers; mostly software, but there were documents and databases too. Archie didn’t feel much like an information tool, but more like something for computer specialists. Then came Veronica – an index to content hosted on Gopher servers (a kind of forerunner of the web). This did feel more like a way to search for information, though its content was still limited – a very small niche. Once the web came along we saw a succession of web search engines. Each came into being in a blaze of superlatives (“bigger and better”), trumpeted as the solution to searching the web, but each lasted just a couple of years and then slowly faded as the next new thing took over (where are you, HotBot, Lycos, AltaVista?). I never imagined back then that one of these search tools would grow to become an absolutely key part of the academic information environment with a major presence in every part of the information world.

Google has achieved that position. Beyond its dominant presence as a general internet search engine and software development company, the existence of Google Scholar, the Google book digitization project and the recently-launched Google ebook service makes it a core part of the library and information landscape. My theme today is Google Scholar but I will come back to the book projects in another post.

A recent article you may have missed, in the International Journal of Cultural Studies, affirms that Google has become an integral part of everyday life, not least in the academic world. But Google’s instincts are not those of the academic world – it has a tendency to secrecy born of its commercial mission. The press release about the article states:

One of the key points about search engines’ ranking and profiling systems is that these are not open to the same rules as traditional library scholarship methods in the public domain. Automated search systems developed by commercial Internet giants like Google tap into public values scaffolding the library system and yet, when looking beneath this surface, core values such as transparency and openness are hard to find.

Inexperienced users tend to trust proprietary engines as neutral knowledge mediators [but] engine operators use meta-data to interpret collective profiles of groups of searchers.

Another article, in Serials Review, is entitled Google Scholar’s Dramatic Coverage Improvement Five Years after Debut. The author finds that over the five years from 2005 to 2010 Google Scholar improved its coverage of scholarly journals. Coverage varied between subject fields: in 2005 it ranged from 30% to 88%, and in 2010 from 98% to 100%.

Librarians criticised Google Scholar in its early days for its very patchy coverage, and also for its lack of openness – it was very hard to find out exactly what it did cover. The coverage problem seems to have been overcome, though worries over its accuracy remain. For an article in Issues in Science and Technology Librarianship, science researchers at the University of California Santa Cruz were surveyed about their article database use and preferences. Web of Science was the single most used database, selected by 41.6%. Statistically there was no difference between PubMed (21.5%) and Google Scholar (18.7%) as the second most popular database. While Google Scholar is favored for its ease of use and speed, those who prefer Web of Science feel more confident about the quality of their results than do those who prefer Google Scholar. Librarians and faculty alike often assert that “all researchers use Google Scholar.” Based on this study, this is essentially correct. 83% of researchers had used Google Scholar and an additional 13% had not used it but would like to try it. Of those who had used Google Scholar, almost three quarters (73%) found it useful.

In this context I was interested to see that Richard Wintle, one of the guest bloggers on this network, wrote recently about his experience of PubMed, suggesting that sometimes Google Scholar performed better than PubMed.  I think every tool has occasional weaknesses, so it is good to have multiple search tools available.

Peter Jacso, who has followed Google Scholar for some years, wrote in Library Journal about “Google Scholar’s ghost authors” and in Online Information Review about the “Metadata mega mess in Google Scholar”. He highlights a key problem:

Google’s algorithms create phantom authors for millions of papers. They derive false names from options listed on the search menu, such as P Login (for Please Login). Very often, the real authors are relegated to ghost authors deprived of their authorship along with publication and citation counts.

Jacso therefore says that Google Scholar is inappropriate for bibliometric searches and for evaluating the publishing performance and impact of researchers and journals. One of the problems is that Google’s secrecy means that we don’t know how many records are in Google Scholar, and can only guess at the frequency of these errors.

Google Scholar is five years old, so it is still a young child when compared to PubMed (fully launched in 1997) or PubMed’s progenitor Index Medicus (started 1879). But Google Scholar no longer has a “beta” label, so clearly Google think it is a finished product or at least “good enough”.

My advice – be a little cautious whichever search tool you are using, but especially so with Google Scholar.

Posted in Searching | 11 Comments