Tuesday, October 14, 2008

Giving chance a chance, or the usefulness of serendipity

A post on The Scholarly Kitchen, entitled ‘Citation Controversy’, and particularly its reference to the principle of least effort, sparked the train of thought leading to this post.

Scientific articles have references, which represent the connection of the article to other articles, and thus to other knowledge. Articles in Wikipedia often have references, too, although it is not rare to see the message “This article or section is missing citations”. The ‘Principle of least effort’ article in Wikipedia carries this message (on the date of posting this), ironically demonstrating the principle, I think. Authors are often quite parsimonious when it comes to adding references to articles, and when references have been added, there is rarely a thorough check on whether they include all, or enough, of the appropriate ones. The omission of obvious references may be picked up by reviewers, but the omission of less obvious ones is easily missed. One of the sad things about omitting references is that it may reduce serendipity.

I have a suggestion for ‘Wikipedians’ who wish to add appropriate references and links to Wikipedia articles, in particular articles in the areas of health and life science, and so encourage serendipitous discovery. I advise them to go to what I informally call ‘wikimore’, an enhancement layer in which the text of Wikipedia articles is enriched with highlighted concepts. By clicking on a number of those highlighted concepts and adding them to a search query, you can search for appropriate articles to refer to in, say, Google Scholar, or in Wikipedia itself, and, when found, add those references to the Wikipedia article, as a good Wikipedian would.
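For those who like to see the mechanics spelled out, below is a minimal sketch, in Python, of what ‘adding selected concepts to a search query’ amounts to. The helper function and the concept list are purely illustrative assumptions of mine – wikimore does not require any of this, and you can of course simply type the terms into the Google Scholar search box by hand.

    # A minimal sketch, not part of wikimore itself, of turning a handful of
    # selected concepts into a Google Scholar query URL; the helper name and
    # the concept list are illustrative assumptions on my part.
    from urllib.parse import quote_plus

    def scholar_query_url(concepts):
        """Combine selected concept terms into a single Google Scholar search URL."""
        # Wrap each concept in quotes so that multi-word terms are searched as phrases.
        query = " ".join(f'"{c}"' for c in concepts)
        return "https://scholar.google.com/scholar?q=" + quote_plus(query)

    if __name__ == "__main__":
        selected = ["information seeking behavior", "design", "library"]
        # Paste the printed URL into a browser to see candidate articles to cite.
        print(scholar_query_url(selected))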

For instance, by clicking on the concepts ‘information seeking behavior’, ‘design’ and ‘library’, and subsequently searching in Google Scholar, I find this article:

Comparing faculty information seeking in teaching and research: Implications for the design of digital libraries, by Christine L. Borgman et al., Journal of the American Society for Information Science and Technology, Vol. 56, No. 6 (2005), pp. 636-657. DOI: 10.1002/asi.20154.

An interesting sentence from that article: “…faculty are more likely to encounter useful teaching resources while seeking research resources than vice versa.” In my view this demonstrates the drawback of a least effort approach (I like to call it the ‘laziness principle’), which by its very nature militates against serendipity. And yet serendipity is one of the most important routes to real breakthroughs in knowledge and understanding. A quote from an article by M.K. Stoskopf: "it should be recognized that serendipitous discoveries are of significant value in the advancement of science and often present the foundation for important intellectual leaps of understanding".

I’m not sure whether the article I found (one among many) would be a good reference to add to the Wikipedia article on the ‘principle of least effort’, but I do hope you can see that wikimore lets you embark, starting from a Wikipedia article, on a journey of serendipitous discovery even better than you already can without its enhancement layer. With wikimore, i.e. the concept web enhancement as applied to Wikipedia, every concept recognized in the text is in itself a link to further information, a ‘reference’, if you wish.

And while you’re at it, you might want to take a look at the ‘knowlet’ of ‘information seeking behavior’, and explore the concepts with which information seeking behavior is connected in the life and medical science area.

Happy exploring!

Jan Velterop

Open Access Day

Though I haven’t posted for a while on The Parachute, today, on Open Access Day, I feel I should.

Unfettered access to scientific research results is in my view one of the ‘infrastructural’ provisions that enables science to function optimally. So why isn’t open access universal and what can be done to make it so?

After all, open access is easy. Just as I am posting this entry on a blog – open and freely available to any reader, anywhere, any time – I can post a scientific article. It is increasingly unlikely that there are many scientific researchers in the world who do not have the possibility of publishing their articles on a blog or in an open repository. And I use the word ‘publishing’ advisedly. The notion that publishing is something that happens in journals has become rather outdated since the emergence of the Web. (Isn’t it interesting, by the way, that our word ‘text’ is derived from the Latin ‘textus’, which means ‘web’?)

Actually, I have to correct myself here. Journals do publish, but they are not needed for the act of publishing itself. Publishing can easily be done by the authors. The significance of journals lies not so much in the scientific content of their articles as in the metadata of those articles. And by metadata I mean not so much the information about volume, issue, page number, et cetera – though that is useful for unambiguous citation – but in particular the information indicating that, and when, the article has been peer-reviewed (and often enough improved) in the course of a given journal’s editorial process. The role of a journal is to formalize an article, to affix the ‘label’ of the journal to it, indicating not only that it has been peer-reviewed, but also slotting it into what might be called a ‘pecking order’ of scientific publications. One only has to consider the weight attributed to a journal’s Impact Factor to get a sense of how important that pecking order is, or at least is perceived to be.
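To make the distinction concrete, here is a small sketch of how one might model the two kinds of metadata side by side; the class and field names are my own illustrative choices, not any existing standard or schema.

    # An illustrative sketch - my own field names, not any existing standard -
    # of the two kinds of metadata distinguished above: the bibliographic details
    # useful for unambiguous citation, and the 'formalization' label that a journal
    # attaches to an article in the course of its editorial process.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class BibliographicMetadata:
        journal: str
        volume: int
        issue: int
        pages: str
        doi: str

    @dataclass
    class FormalizationMetadata:
        peer_reviewed: bool              # has the article passed the journal's review?
        review_completed: Optional[str]  # when the editorial process concluded, if known
        journal_label: str               # the 'brand' that slots it into the pecking order

    @dataclass
    class ArticleRecord:
        title: str
        bibliographic: BibliographicMetadata   # useful for citation
        formalization: FormalizationMetadata   # the part carrying the 'credibility' label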

One of the reasons we do not have universal open access yet is that we keep on confusing the two: publishing (i.e. making public) on the one hand, and formalizing (i.e. affixing a scientific ‘credibility’ label) on the other.

Journal publishers, although still called ‘publishers’, are, in the Web era, mainly in the business of organizing the latter: affixing the label. That is no sinecure, as anyone who has done it will confirm. And as long as it is deemed necessary in the scientific ego-system – in order to get recognition, tenure, funding – it needs to be done. But it should not be confused with making research results openly and freely available.

Journal publishers have been in this business for decades, maybe even centuries. In the print world, publishing and formalizing were completely interwoven, possibly without anyone realizing it. The publishers were paid for their efforts by both readers and authors, though in different ways. Readers paid for access to the information via subscriptions, and authors paid for affixing the journal label to their articles by transferring their copyright exclusively to the publisher. That exclusively transferred copyright was worth a lot, because it enabled publishers to sell access to their journals: anyone who did not hold the copyright – a group which, after transfer, included the authors themselves – was prevented from disseminating the articles, at least on any significant scale.

But we live in the Web world now, no longer in an exclusively print world. The value of copyright to publishers has decreased significantly since authors either started to ignore it – no doubt encouraged by the opportunities the Web offers for wide dissemination – or were forced to limit the exclusivity of their copyright transfer, for instance because of mandates to make their articles openly available within a given period of time (within a year, for instance, in the case of the NIH mandate).

Given that open access is a great good to science and society as a whole (I treat this as an axiom), what to do?

Two options for researchers, not mutually exclusive:
  1. Publish research articles freely and openly on the Web, on blogs, in repositories, et cetera, especially in venues that allow public comments, and let exposure to such public comments take the place of peer review. This option may realistically be available only to tenured, established scientists and to the very young ones with an independent and iconoclastic frame of mind.
  2. Publish in the ‘traditional’ journal system, but choose journals that accept payment for organizing the peer-review and formalization process, and then make the article in question freely available with full open access immediately upon acceptance, and back this up by depositing a copy of the article in an open repository. This option may realistically be available only to funded scientists, but those who are not able to source funding for it can always resort to option 1.
A few remarks to conclude: There are indications – so far anecdotal – that ‘informal’ publications are gradually being taken more seriously by the science community, which helps the popularity of the first option. There are also indications that even just the new and relevant scientific literature is becoming so overwhelming in size in some disciplines that proper, manageable ways to get an overview of the state of knowledge, which progresses daily, need to be found. The analogy, if you wish, is that of a dependable weather report, as opposed to just knowing the general climate supplemented by looking out of the window.

And lastly, isn't it fitting that this week, at the Frankfurt Book Fair, the worldwide publishers' jamboree, the inclusion of open access publishing in the mainstream of science publishing is on display? I'm referring, of course, to the take-over of BioMed Central by the decidedly mainstream publisher Springer.

Jan Velterop