Monday, June 11, 2012

Small publications, large implications

When I recently enjoyed lunch with Steve Pettifer of Manchester University (the ‘father’ of Utopia Documents), the conversation turned to nanopublications. Ah, you want to know what nanopublications are. Nanopublications are machine-readable, single, attributable, scientific assertions.

Steve posed the question “why would any scientist believe a nanopublication, particularly if out of context?” Indeed, why would they? Why should they, well versed as scientists are in the art of critical thinking? They won’t, at least not without seeing the appropriate context.

Herein lies a great opportunity.

Let me explain. Nanopublications, or rather, their core in the form of machine-readable subject-predicate-object triples, can be incorporated in (vast) collections of such triples and used for reasoning, to discover new knowledge, or to make explicit hitherto tacit or hidden knowledge. Triples can therefore be very valuable to science. (The Open PHACTS project is in the process of establishing the value of this approach for drug discovery.) Many, perhaps most, scientific articles contain such single assertions, which could be presented as nanopublications.
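To make the idea concrete, here is a minimal sketch in Python of how a collection of triples can surface an assertion that no single source states. Everything in it is invented for illustration: the gene, protein and disease names, the predicates, and the one chaining rule. Real systems would typically use RDF stores and far richer reasoning.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical assertions, each as it might be extracted from a different article.
triples = {
    Triple("GeneA", "encodes", "ProteinX"),
    Triple("ProteinX", "is_associated_with", "DiseaseY"),
    Triple("GeneB", "encodes", "ProteinZ"),
}


def infer_gene_disease_links(facts):
    """Apply one illustrative rule: if a gene encodes a protein and that
    protein is associated with a disease, propose a gene-disease link."""
    inferred = set()
    for a in facts:
        if a.predicate != "encodes":
            continue
        for b in facts:
            if b.predicate == "is_associated_with" and b.subject == a.obj:
                inferred.add(Triple(a.subject, "may_be_linked_to", b.obj))
    return inferred


new_links = infer_gene_disease_links(triples)
for link in sorted(new_links):
    print(link.subject, link.predicate, link.obj)
```

The proposed link is only a candidate. Whether to accept it depends on whether the two assertions it rests on can be trusted.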

In a recent Nature Genetics commentary called ‘The Value of Data’, Barend Mons et al. addressed this issue with the metaphor of the chicken and the egg. Now that eggs (individual assertions) are being distributed (‘traded’), their value (they all look roughly the same) can only be truly assessed by knowing the parents. Scientists will always want to personally judge whether a crucial connecting assertion in a given hypothesis is one they can accept as valid. The ability to locate where the assertion came from, in which article, in which journal, by which author, and when it was published – in short the ‘provenance’ of individual scientific assertions functioning in computer reasoning – is crucial for that. As is the ability to access the article in question.
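As a rough sketch of what carrying provenance along with an assertion could look like, the Python below attaches a source record to each assertion and looks up where a given assertion came from. The class names, field names, and all the data are hypothetical placeholders, not an actual nanopublication schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    authors: str
    journal: str
    year: int
    article_id: str  # placeholder identifier, not a real DOI


@dataclass(frozen=True)
class Nanopublication:
    subject: str
    predicate: str
    obj: str
    provenance: Provenance


# Invented examples: two assertions, each tied to the article it came from.
nanopubs = [
    Nanopublication("GeneA", "encodes", "ProteinX",
                    Provenance("Author One et al.", "Journal P", 2010, "article-001")),
    Nanopublication("ProteinX", "is_associated_with", "DiseaseY",
                    Provenance("Author Two et al.", "Journal Q", 2011, "article-002")),
]


def sources_for(subject, predicate, obj, collection):
    """Return the provenance of every nanopublication making this assertion."""
    return [n.provenance for n in collection
            if (n.subject, n.predicate, n.obj) == (subject, predicate, obj)]


found = sources_for("ProteinX", "is_associated_with", "DiseaseY", nanopubs)
for p in found:
    print(f"{p.authors}, {p.journal} ({p.year}), {p.article_id}")
```

An assertion with no matching record returns an empty list, which is precisely the case a critical scientist would refuse to accept.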

Scientific publishers should, in their quest to add value to research publications, expose and clearly present the nanopublications contained in the articles they publish, particularly those that are believed (e.g. by the author, or the reviewers) to be unique and new. What’s more, all publishers should make them openly and freely available, as they do with abstracts, including those publishers that are not yet convinced that they should change their business models and make all their content open access. And they should not just make nanopublications open and accessible to human readers, but also to machines, because only machines are able to effectively process large numbers of nanopublications, treating each one as a ‘pixel’ of the larger picture that a researcher is building up.

So what’s the opportunity?

Well, openly accessible nanopublications are very useful for scientific discovery, and they are attributable (to author, article, and journal). Scientists don’t just believe them when they see them, particularly if the assertion is new to them or when they find it in a process of computer-assisted (in silico) reasoning. Researchers will be eager to investigate their source, i.e. check out the article from which the nanopublication comes. They may cite the nanopublication and, in doing so, cite the article. An obvious win-win situation for scientists (in their roles of users and authors) and publishers alike.

What are we waiting for?

Jan Velterop

1 comment:

  1. Maybe we are waiting for the first traditional publisher to take the lead and make a big name and contribution to big data science?