{"id":154,"date":"2011-02-01T20:33:45","date_gmt":"2011-02-01T20:33:45","guid":{"rendered":"http:\/\/occamstypewriter.org\/trading-knowledge\/?p=154"},"modified":"2011-02-01T20:33:45","modified_gmt":"2011-02-01T20:33:45","slug":"ethical-retrieval","status":"publish","type":"post","link":"https:\/\/occamstypewriter.org\/trading-knowledge\/2011\/02\/01\/ethical-retrieval\/","title":{"rendered":"Ethical retrieval"},"content":{"rendered":"<p>It may surprise you to know that librarians have codes of professional ethics. The main \u00a0UK membership organisation for librarians, \u00a0CILIP, requires its members to follow its <a href=\"http:\/\/www.cilip.org.uk\/get-involved\/policy\/ethics\/pages\/default.aspx\">ethical code<\/a>; the American Library Association have <a href=\"http:\/\/www.ala.org\/ala\/issuesadvocacy\/proethics\/codeofethics\/codeethics.cfm\">something similar<\/a>. \u00a0Subject classification and indexing is one of the more interesting areas where ethical concerns can arise. \u00a0Headings that seemed fine in earlier ages may now seem not fit for purpose (a bit like the famous moment in 1973 when the American Psychiatric Association <a href=\"http:\/\/en.wikipedia.org\/wiki\/Diagnostic_and_Statistical_Manual_of_Mental_Disorders#Political_controversies\">removed homosexuality from the DSM<\/a>). The great US cataloguer, Sanford Berman, has been a leader in pressing for bias to be removed from subject headings. See this <a href=\"http:\/\/www.sanfordberman.org\/biblinks\/knowlton.pdf\">article summarising his achievements<\/a> (pdf). \u00a0Sanitising the catalogue in this way may be seen as politically correct but sometimes it is just common sense (e.g. putting Mark Twain under &#8220;English literature&#8221; is just wrong!).<\/p>\n<p>In these days of internet search engines and full-text retrieval, library subject headings seem rather arcane and unnecessary. 
\u00a0You can search for whatever term you want in Google, be it abusive or polite, but there are still problems. \u00a0Google Scholar is an index of scholarly literature, but the way that it defines and detects what is scholarly has led to some disquiet recently and a <a href=\"http:\/\/www.rationalskepticism.org\/creationism\/google-scholar-stop-including-creationist-sources-petition-t18661.html\">petition<\/a> to remove creationist material from its index. PZ Myers has <a href=\"http:\/\/scienceblogs.com\/pharyngula\/2011\/01\/how_to_game_google_scholar.php\">pointed out<\/a> that the petition is wrong-headed:<\/p>\n<blockquote><p>Google Scholar does not index on content; it can&#8217;t, it&#8217;s just a dumb machine  sorting text &#8230;The way items get on Google Scholar is based entirely on whether they&#8217;re  <em>formatted<\/em> like a scholarly paper.<\/p><\/blockquote>\n<p>Google then is not concerned with the content and makes no judgment on the rightness or wrongness of it, rather like the principle of <a href=\"http:\/\/en.wikipedia.org\/wiki\/Network_neutrality\">net neutrality<\/a> which is \u00a0<a href=\"http:\/\/www.pcworld.com\/businesscenter\/article\/218165\/netflix_egypt_and_the_case_for_net_neutrality.html\">in the news right now<\/a>.<\/p>\n<p>Google does make some value judgements though. 
There has been a growing <a href=\"http:\/\/www.marco.org\/2617546197\">wave of<\/a> <a href=\"http:\/\/techcrunch.com\/2011\/01\/01\/why-we-desperately-need-a-new-and-better-google-2\/\">complaints<\/a> <a href=\"http:\/\/www.blogher.com\/changes-google-will-reduce-spammy-search-engine-results?wrap=blogher-topics\/blogging-social-media-0&amp;crumb=10\">that<\/a> its search service is becoming dominated by spam sites:<\/p>\n<blockquote><p>Google&#8217;s search results [are] full of spammy links that lead to nothing of  value&#8230; content scrapers, marketers, or sites that consisted of nothing but keywords  surrounded by useless crappy content.<\/p><\/blockquote>\n<p>Some people were suspicious that the presence of Google ads on a site affected its position in the search rankings. Google have denied this and\u00a0have now <a href=\"http:\/\/googleblog.blogspot.com\/2011\/01\/google-search-and-search-engine-spam.html\">responded<\/a> with a <a href=\"http:\/\/searchengineland.com\/google-search-quality-with-new-on-page-spam-detection-62031\">promise<\/a> to work harder to remove these so-called &#8216;content farms&#8217; from search results. The Blekko search engine is taking <a href=\"http:\/\/searchengineland.com\/blekko-bans-content-farms-from-their-index-63134\">similar steps<\/a>;\u00a0these spam sites are not just a Google problem.<\/p>\n<p>All search engines share the same problems of course &#8211; how to find everything relevant and only what is relevant, and to present the most relevant items at the top of the list. They each find their own way to resolve that problem, giving slightly different results. \u00a0Or do they? \u00a0Danny Sullivan, who blogs about search engines, has <a href=\"http:\/\/searchengineland.com\/google-bing-is-cheating-copying-our-search-results-62914\">reported<\/a> that Bing, Microsoft&#8217;s search engine, has been copying results from Google. 
In an elaborate sting operation Google created some &#8216;synthetic&#8217; search terms to seed some false results into its database. \u00a0They then searched for these terms using laptops with Internet Explorer and the Bing toolbar installed. Within two weeks the false results were appearing in Bing. \u00a0Microsoft have <a href=\"http:\/\/searchengineland.com\/bing-admits-using-customer-search-data-says-google-pulled-spy-novelesque-stunt-63162\">admitted<\/a> that they do watch how their customers use Google but say that this is not copying Google, and anyway all search engines do the same.<\/p>\n<p>Who would have thought that Search Engine Ethics 101 could be so interesting? \u00a0I was surprised that a Google search for <strong><a href=\"http:\/\/www.google.co.uk\/search?q=search+engine+ethics&amp;rls=com.microsoft:*&amp;ie=UTF-8&amp;oe=UTF-8&amp;startIndex=&amp;startPage=1&amp;rlz=1I7GGLL_en&amp;redir_esc=&amp;ei=nkdITeqENIy2hAfM4fz7BA\">search engine ethics<\/a> <\/strong>brought up quite a few results, including some from the <em>International Review of Information Ethics <\/em>which was a new one on me. \u00a0There is even a book <a href=\"http:\/\/www.amazon.co.uk\/Blackwell-Philosophy-Computing-Information-Guides\/dp\/0631229191\/ref=sr_1_6?s=books&amp;ie=UTF8&amp;qid=1296583903&amp;sr=1-6\">The Blackwell Guide to the Philosophy of Computing and  Information<\/a> if you want to immerse yourself in the topic.<\/p>\n<p>Google of course are famously the company who do no evil. But Siva Vaidhyanathan has just published a book called\u00a0<em><a href=\"http:\/\/www.googlizationofeverything.com\/60about_this_book\/\">The Googlization of Everything: And Why We Should Worry<\/a>. <\/em> He doesn&#8217;t think that Google is evil, but he does think that its dominance and the speed with which it has reached that position are a little worrying. 
\u00a0I confess I haven&#8217;t read the book but there is an interesting <a href=\"http:\/\/www.publishersweekly.com\/pw\/by-topic\/authors\/interviews\/article\/45941-the-googlization-of-books.html\">interview with its author<\/a> in Publishers Weekly. I think this comment from Vaidhyanathan\u00a0gets to the core of things:<\/p>\n<blockquote><p>The assumption for years has been that Google merely aggregates our decisions,  perceptions, and our judgments. But it&#8217;s not that simple. Google is not without  its biases, and I wanted to try to unpack the nature of some of its biases,  which, not surprisingly, skew toward what&#8217;s new, popular, and tech-savvy. The  major realization I had in doing this book is that Google now governs the Web,  and more because of the choices it makes than the choices we make. Think back to  when Google first started. There were a handful of search engines, and if you  went to any of them and typed in common words like &#8220;Asian&#8221; or &#8220;facial,&#8221; you&#8217;d  get porn sites. It was Google that figured out how to make our Web experience  better by filtering\u2014not by censoring or blocking access to porn sites. But while  Google is officially content-neutral, de facto it&#8217;s not, because it filters. For  example, it favors certain aspects of page design. That&#8217;s a good thing, of  course. It has made the Web better. But it is also important that we acknowledge  what Google does, and that Google now pretty much runs the Web, albeit with our  tacit, implicit consent.<\/p><\/blockquote>\n<p>Maybe Google should be signing up to one of those codes of ethics, or recruiting Sanford Berman to advise it?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>It may surprise you to know that librarians have codes of professional ethics. The main \u00a0UK membership organisation for librarians, \u00a0CILIP, requires its members to follow its ethical code; the American Library Association have something similar. 
\u00a0Subject classification and indexing &hellip; <a href=\"https:\/\/occamstypewriter.org\/trading-knowledge\/2011\/02\/01\/ethical-retrieval\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[17,8],"tags":[],"class_list":["post-154","post","type-post","status-publish","format-standard","hentry","category-ethics","category-searching"],"_links":{"self":[{"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/posts\/154","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/comments?post=154"}],"version-history":[{"count":0,"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/posts\/154\/revisions"}],"wp:attachment":[{"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/media?parent=154"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/categories?post=154"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/occamstypewriter.org\/trading-knowledge\/wp-json\/wp\/v2\/tags?post=154"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}