When I recently enjoyed lunch
with Steve Pettifer of Manchester University (the ‘father’ of Utopia Documents), the conversation turned to nanopublications. Ah, you want to know
what nanopublications are. Nanopublications are
machine-readable, single, attributable, scientific assertions.
Steve posed the question “why would any scientist
believe a nanopublication, particularly if out of context?” Indeed, why would
they? Why should they, well versed as scientists are in the art of critical
thinking? They won’t, at least not without seeing the appropriate context.
Herein lies a great opportunity.
Let me explain. Nanopublications, or rather, their
core in the form of machine-readable subject-predicate-object triples, can be
incorporated in (vast) collections of such triples and used for reasoning, to discover
new knowledge, or to make explicit hitherto tacit or hidden knowledge. Triples
can therefore be very valuable to science. (The Open PHACTS project is in the process of establishing the value of this
approach for drug discovery.) Many, perhaps most, scientific articles contain
such single assertions, which could be presented as nanopublications.
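To make the idea of reasoning over triples concrete, here is a minimal sketch in Python. It is an illustration only, not how Open PHACTS or any real triple store works: the entities, the predicates and the single chaining rule are all made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Illustrative assertions only; the entities and predicates are hypothetical.
triples = {
    Triple("compound_X", "inhibits", "protein_P"),
    Triple("protein_P", "is_associated_with", "disease_D"),
    Triple("compound_Y", "inhibits", "protein_Q"),
}


def infer_candidates(store):
    """Chain 'inhibits' and 'is_associated_with' into candidate hypotheses."""
    found = set()
    for a in store:
        if a.predicate != "inhibits":
            continue
        for b in store:
            # Link the two assertions when the object of one is the subject of the other.
            if b.predicate == "is_associated_with" and b.subject == a.obj:
                found.add(Triple(a.subject, "may_affect", b.obj))
    return found


candidates = infer_candidates(triples)
```

Neither input assertion says anything about compound_X and disease_D together; the link only becomes explicit once the two triples are combined. The result is a candidate hypothesis, not a finding, which is exactly why a scientist will want to inspect the assertions it rests on.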
In a recent Nature Genetics commentary called ‘The Value of Data’, Barend Mons et al.
addressed this issue with the metaphor of the chicken and the egg. Now that
eggs (individual assertions) are being distributed (‘traded’), their value
(they all look roughly the same) can only be truly assessed by knowing the
parents. Scientists will always want to personally judge whether a crucial
connecting assertion in a given hypothesis is one they can accept as valid. The
ability to locate where the assertion came from, in which article, in which
journal, by which author, and when it was published – in short the ‘provenance’
of individual scientific assertions functioning in computer reasoning – is
crucial for that. As is the ability to access the article in question.
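As a rough sketch of what carrying provenance along with each assertion could look like, the Python below pairs an assertion with its source and looks up the source of every step in a reasoning chain. The class names, fields and sample values are assumptions made for illustration; they are not the actual nanopublication schema, and the bibliographic details are placeholders.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Provenance:
    author: str
    article: str
    journal: str
    published: str  # ISO date string


@dataclass(frozen=True)
class Nanopublication:
    assertion: Tuple[str, str, str]  # (subject, predicate, object)
    provenance: Provenance


def sources_for(chain, nanopubs):
    """Return the provenance of each assertion in a chain, or None if unknown."""
    index = {n.assertion: n.provenance for n in nanopubs}
    return [index.get(step) for step in chain]


# Placeholder data; none of these values refer to a real publication.
prov = Provenance("A. Author", "Placeholder article", "Placeholder Journal", "2000-01-01")
known = [Nanopublication(("compound_X", "inhibits", "protein_P"), prov)]

chain = [
    ("compound_X", "inhibits", "protein_P"),
    ("protein_P", "is_associated_with", "disease_D"),
]
sources = sources_for(chain, known)
```

Here the first step of the chain can be traced back to an article, while the second comes back as `None`: an egg without known parents, and the one a sceptical reader would want to check first.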
Scientific publishers should, in their quest to add
value to research publications, expose and clearly present the nanopublications
contained in the articles they publish, particularly those that are believed
(e.g. by the author, or the reviewers) to be unique and new. What’s more, they
should make them openly and freely available, as they do with abstracts, even
publishers that are not yet convinced that they should change their business
models and make all their content open access. And they should not just make
nanopublications open and accessible to human readers, but also to machines,
because only machines are able to effectively process large numbers of
nanopublications, treating each one as a ‘pixel’ of the larger picture that a
researcher is building up.
So what’s the opportunity?
Well, openly accessible nanopublications are very
useful for scientific discovery, and they are attributable (to author, article, and
journal). Scientists don’t just believe them when they see them, particularly
if the assertion is new to them or when they find it in a process of computer-assisted
(in silico) reasoning. Researchers
will be eager to investigate their source, i.e. check out the article from
which the nanopublication comes. They may cite the nanopublication, and in
doing so, cite the article. An obvious win-win situation for scientists (in
their roles of users and authors) and publishers alike.
What are we waiting for?
Jan Velterop
Maybe we are waiting for the first traditional publisher to take the lead and make a big name and contribution to big data science?