Tuesday, November 05, 2013

Essence of academic publishing

Let me start with a bit of context, all of which is well known, understood and widely discussed. The blame for the unaffordability of the ever-increasing amount of scholarly literature, be it because of high subscription prices or article processing fees for ‘gold’ open access, is often laid at the door of the publishers.

The blame, however, should lie with the academic preoccupation with the imperative of publisher-mediated prepublication peer review (PPR).

Of course, publishers, subscription-based ones as well as open access outfits, have a business that depends to a very large degree on being the organisers of PPR, and few of them would like to see the imperative disappear. The ‘need’ – real or perceived – for publisher-mediated PPR in the academic ecosystem is the main raison d’être of most publishers. And it is responsible for most of their costs (personnel costs), even though the reviewing itself is actually carried out by academics and not publishers. The technical costs of publishing are but a fraction of that, at least for electronic publishing (print and its distribution are quite expensive, but should be seen as an optional service and not as part of the essence of academic publishing).

Despite it being the imperative in Academia, publisher-mediated PPR has flaws, to say the least. Among the causes for deep concern are its anonymity and general lack of transparency, its highly variable quality, and the unrealistic expectations of what peer review can possibly deliver in the first place. The increasing number of journal articles being submitted is not making the process of finding appropriate reviewers any easier, either.

Originally, PPR was a perfectly rational approach to ensuring that scarce resources were not spent on the expensive business of printing and distributing paper copies of articles that were not deemed to be worth that expense. Unfortunately, the rather subjective judgment needed for that approach led to unwelcome side effects, such as negative results not being published. In the era of electronic communication, with its very low marginal costs of dissemination, prepublication filtering seems anachronistic. Of course, the initial technical costs of publishing each article remain, but the amounts involved are but a fraction of the costs per article of the traditional print-based system, and an even smaller fraction of the average revenues per article many publishers make.

Now, with the publishers’ argument of avoiding excessive costs of publishing largely gone, PPR is often presented as some sort of quality filter, protecting readers against unintentionally spending their valuable time and effort on unworthy literature. Researchers must be a naïve lot, given the protection they seem to need. The upshot of PPR seems to be that anything that is peer reviewed before publication, and does get through the gates, is to be regarded as proper, worthwhile, and relevant material. But is it? Can it be taken as read that everything in peer-reviewed publications is beyond doubt? Should a researcher be reassured by the fact that it has passed a number of filters that purport to keep scientific ‘rubbish’ out?

Of course they should. These filtering mechanisms are there for a reason. They diminish the need for critical thinking. Researchers should just believe what they read in ‘approved’ literature. They shouldn’t just question everything.

Or are these the wrong answers?

Isn’t it time that academics who rely on PPR ‘quality’ filters – and let us hope it’s a minority of them – stopped believing at face value what is presented in the ‘properly peer-reviewed and approved’ literature, and returned to the critical stance that is the hallmark of a true scientist: “why should I believe these results or these assertions?” The fact that an article is peer-reviewed in no way absolves researchers of applying professional skepticism to whatever they are reading. Further review, post-publication, remains necessary. It’s part of the fundamentals of the scientific method.

So, what about this: a system in which authors discuss, in depth and critically, their manuscripts with a few people whom they can identify and accept as their peers, and then ask those people to put their names to the manuscript as ‘endorsers’. As long as some reasonable safeguards are in place to ensure that endorsers are genuine, serious and without undeclared conflicts of interest (e.g. they shouldn’t be recent colleagues at the same institution as the author, or be involved in the same collaborative project, or have been a co-author in, say, the last five years), the value of this kind of peer review – author-mediated PPR, if you wish – is unlikely to be any less than that of publisher-mediated PPR. In fact, it’s likely to offer more value, if only due to its transparency and to the expected reduction in the cost of publishing. It doesn’t mean, of course, that the peer-endorsers should agree with all of the content of an article they endorse. They merely endorse its publication. Steve Pettifer of the University of Manchester once presented a perfect example of this. He showed a quote from Alan Singleton about a peer reviewer’s report[1]:

"This is a remarkable result – in fact, I don’t believe it. However, I have examined the paper and can find no fault in the author’s methods and results. Thus I believe it should be published so that others may assess it and the conclusions and/or repeat the experiment to see whether the same results are achieved."

An author-mediated PPR-ed manuscript could subsequently be properly published, i.e. put in a few robust, preservation-proof formats, properly encoded with Unicode characters, uniquely identified and identifiable, time-stamped, citable in any reference format, suitable for human- and machine-reading, data extraction, reuse, deposit in open repositories, printing, and everything else that one might expect of a professionally produced publication, including a facility for post-publication commenting and review. That will cost, of course, but it will be a fraction of the current costs of publication, be they paid for via subscriptions, article processing charges, or subsidies. Good for the affordability of open access publishing for minimally funded authors, e.g. in the social sciences and humanities, and for the publication of negative results that, though very useful, hardly get a chance in the current system.
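
To make that list of requirements concrete, a publication record along these lines might look as follows; every field name and value here is an illustrative assumption, not an existing schema:

```python
# A hypothetical record for an author-mediated PPR publication,
# reflecting the requirements listed above. Illustrative only.
publication_record = {
    "identifier": "doi:10.1234/example.5678",   # hypothetical unique, citable ID
    "timestamp": "2013-11-05T12:00:00Z",         # time-stamped
    "formats": ["pdf", "xml", "html"],           # robust, preservation-proof formats
    "encoding": "UTF-8",                         # properly encoded Unicode
    "machine_readable": True,                    # data extraction and reuse
    "repository_copies": ["https://repository.example.org/record/5678"],
    "endorsers": ["Peer One", "Peer Two"],       # named, conflict-free endorsers
    "post_publication_reviews": [],              # commenting and review facility
}
```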

Comments welcome.

Jan Velterop


[1] Singleton, A. ‘The Pain of Rejection’, Learned Publishing, 24:162–163. doi:10.1087/20110301

Tuesday, February 05, 2013

Transitions, transitions


Although I am generally very skeptical of any form of exceptionalism, political, cultural, academic, or otherwise, I do think that scholarly publishing is quite different from professional and general non-fiction publishing. The difference is the relationship between authors and readers. That relationship is far more of a two-way affair for scholarly literature than for any other form of publishing.

Broad and open dissemination of research results, knowledge, and insights has always been the hallmark of science. When the Elseviers/Elzevirs (no relation to the current company of the same name, which was started by Mr. Robbers [his last name; I can’t help it] a century and a half after the Elsevier family stopped their business), among the first true ‘publishers’, started to publish scholarship, for example the writings of Erasmus, they used the technology of the day to spread knowledge as widely as was then possible.

In those days, publishing meant ‘to make public’. And ‘openness’ was primarily to do with escaping censorship. (Some members of the Elsevier family went as far as to establish a pseudonymous imprint, Pierre Marteau, in order to secure freedom from censorship). But openness in a wider sense — freedom from censorship as well as broad availability — has, together with peer-review, been a constituent part of what is understood by the notions of scholarship and science since the Enlightenment. Indeed, science can be seen as a process of continuous and open review, criticism, and revision, by people who understand the subject matter: ‘peers’.

The practicalities of dissemination in print dictated that funds be generated to defray the cost of publishing. And pre-publication peer review emerged as a way to limit the waste of precious paper and its distribution cost, by weeding out what wasn’t up to standards of scientific rigour and therefore not worth the expense needed to publish. The physical nature of books and journals, and of their transportation by stagecoach, train, ship, lorry, and the like, made it completely understandable and acceptable that scientific publications had to be paid for. Usually by means of subscriptions. However, scientific information never really was a physical good. It only looked like one, because of the necessary physicality of the information carriers. The essence of science publishing was the service of making public. You paid for the service, though it felt like paying for something tangible.

The new technology of the internet, specifically the development of web browsers (remember Mosaic?), changed the publishing environment fundamentally. The need for carriers that had to be physically transported all but disappeared from the equation. The irresistible possibility of unrestrained openness emerged. But something else happened as well. With the disappearance of physical carriers of information, software, etc., the perception of value changed. The psychology of paying for physical carriers, such as books, journals, CDs and DVDs, is very different from the psychology of paying for intangibles, such as binary strings downloaded from the web, with no other carrier than wire, or optical cable, or even radio waves. The human expectation — need, even — of receiving physical, tangible goods in exchange for payment is very strong, though not necessarily rational, especially where we have been used to receiving physical goods in exchange for money for a very long time. That is not to say that we wouldn’t be prepared to value and to pay for intangibles, like services. We do that all the time. But it has to be clear to us what exactly the value of a service is — something we reportedly often find more difficult to judge than the value of physical goods.

This is a conundrum for science publishers. Carrying on with what they are used to, but now presented as a service and no longer ‘supported’ by physical goods, can look very ‘thin’. Yet it is clear that the assistance publishers provide to the process of science communication is a service par excellence. Mainly to authors (‘publish-or-perish’) and less so to readers (‘read-or-rot’ isn’t a strong adage). Hence the author-side payment pioneered by open access publishers (Article Processing Charges, or APCs).

Although it would be desirable to make the transition to open access electronic publishing swiftly, the reality of inertia in the ‘system’ dictates that there be a transition period and method. This transition is sought in many different ways: new, born-OA journals that gradually attract more authors; hybrid journals that accept OA articles against author-side payment; ‘green’ mandates that require authors to self-archive a copy of their published articles; unmediated, ‘informal’ publishing such as in arXiv; even publishing on blogs.

What may be an underestimated transition — and no doubt a controversial one — is a model (a kind of ‘freemium’ model?) that gradually changes from restrictive to more and more open, extending the ‘free’, ‘open’ element and reducing the features that have to be paid for by the user. I don’t think it is even recognized as a potential transition model at the moment, but that may mean missed opportunities. Let’s take a look at an example. If you don’t have a subscription you can’t see the full text. However, where only a short time ago you saw only the title and the abstract, you now see those, plus keywords and the abbreviations used in the article, its outline in some detail, and all the figures with their captions (hint to authors: put as much of the essence of your paper in the captions). All useful information. It is not a great stretch to imagine that the references are added to what non-subscribers can see (indeed, some publishers already do that), and even the important single scientific assertions in an article, possibly in the form of ‘nanopublications’, on the way to eventual complete openness.
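
For what it’s worth, the progression could be pictured as a sequence of access tiers, each stage unlocking more for non-subscribers. A minimal sketch, with the staging itself an assumption on my part:

```python
# Progressive 'freemium' openness, as described above. The grouping
# into stages is an illustrative assumption, not any publisher's model.
OPEN_ELEMENTS_BY_STAGE = [
    ["title", "abstract"],
    ["keywords", "abbreviations", "outline", "figures_with_captions"],
    ["references"],
    ["nanopublications"],   # single scientific assertions
    ["full_text"],          # eventual complete openness
]

def visible_without_subscription(stage: int) -> list[str]:
    """Everything non-subscribers can see once the model reaches 'stage'."""
    visible: list[str] = []
    for elements in OPEN_ELEMENTS_BY_STAGE[:stage]:
        visible.extend(elements)
    return visible

print(visible_without_subscription(2))
# ['title', 'abstract', 'keywords', 'abbreviations', 'outline',
#  'figures_with_captions']
```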

Of course, it is not the same as full, BOAI-compliant open access, but in areas where ‘ocular’ access is perhaps less important than the ability to use and recombine factual data found in the literature, it may provide important steps during what may otherwise be quite a protracted transition from toll-access to open access, from a model based on physical product analogies to one based on the provision of services that science needs.

Jan Velterop

Saturday, January 19, 2013

On knowledge sharing — #upgoerfive

This post was written with the #upgoerfive text editor, using only the 1000 most common words in English.

At one time there was a man who some people thought was god. Other people thought he was sent to the world by god. This man had two water animals you could eat and five pieces of other food and he wanted the many people who were with him to have enough to eat. But two water animals and five other pieces of food were not enough for the people if they all had to eat. So the man who some people thought was god and others that he was sent by god, made the food last until all the people had had enough to eat. This was a wonder. The people saw this and did not know if they could believe what they saw. But when it seemed true that he had a power that no other men or women had, they believed the man was really god or sent by god, because he could do what other men could never do at all. This story became very well known. And many people believe it is about food.

But I think it is not about food. I think it is about food for thought. About what we know, not about what we eat. Because if we give food that we have to others, we do not have it anymore for us to eat. But if we tell others what we know, they know it, too, and we still know it as well. So we can not share our food and still have it all, but we can share what we know and still have it all. We should share what we know if it is good for us all. Especially people who work on knowing more and more every day, as their job. They are paid by us all to work in their jobs on knowing more and more, and they really should share what they come to know with us, and in such a way that we can understand it, too.

Jan Velterop

Tuesday, January 15, 2013

Imagine if funding bodies did this


There is apparently a widespread fear that if a ‘gold’ (author-side paid) open access model for publishing scientific research is supported by funding bodies, the so-called article processing fees, paid by funders on behalf of authors, might see unbridled increases. This fear is not unwarranted if the issue is not addressed properly. If funders agree to pay whatever publishers charge, they undermine the potential for competition among publishers and provide them with an incentive to maximize their income, while at the same time removing any price sensitivity on the part of the publishing researcher. However, it is not very difficult to address this problem.

In order to avoid untrammeled article processing fee increases, funding bodies should foster competition amongst publishers, and create price sensitivity to article processing charges in researchers publishing their results.

Imagine if they did the following:
  • Require open access publishing of research results;
  • Include in any grants a fixed amount for publishing results in open access journals;
  • Allow researchers to spend either more or less than that amount on article processing charges, any surplus to be used for the research itself, or any shortfall to be paid from the research budget (a small worked sketch follows this list);
  • Require any excess paid over and above the fixed amount to be justified by the researcher to the funder;
  • Provide a fixed amount for more than one publication if the research project warrants that, but so that researchers have an incentive to limit the number of published articles instead of salami-slicing the results into as many articles as possible, again by giving them discretion over how the fixed amounts are spent. 
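The surplus-or-shortfall arithmetic in the third point above is simple enough to spell out. A minimal sketch, with all amounts hypothetical:

```python
# Hypothetical figures, purely to illustrate the incentive mechanism
# described in the list above.
PUBLISHING_ALLOWANCE = 2000   # assumed fixed amount included in the grant

def research_budget_adjustment(apc_paid: float) -> float:
    """Positive: surplus flows to the research itself.
    Negative: shortfall is paid from the research budget."""
    return PUBLISHING_ALLOWANCE - apc_paid

print(research_budget_adjustment(1500))  # +500: extra money for the research
print(research_budget_adjustment(2600))  # -600: paid from the research budget,
                                         # and the excess must be justified
```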
Jan Velterop

Sunday, September 09, 2012

'Pixels of information'

My friend Barend Mons wrote to me, and I think his letter is worth sharing on this blog. I checked with him, and he agreed.
Dear Jan,

I'm writing to you inspired by your remark that "OA is not a goal in itself but one means to an end: more effective knowledge discovery".

What we need for eScience is Open Information to support the Knowledge Discovery process. As eScience can be pictured as 'science that cannot be done without a computer', computer-reasonable information is the most important element to be 'open'.
You're right, Barend. That's why I think CC-BY is a necessary element of open access. 
As we discussed many times before, computer reasoning and 'in silico' knowledge discovery lead essentially to 'hypotheses', not to final discoveries. There are two very important next steps. First, what I would call 'in cerebro' validation: mainly browsing the suggestions provided by computer algorithms mining the literature and 'validating' individual assertions (call them triples if you wish) in their original context. 'Who asserted it, where, based on what experimental evidence, assay...?' etc. In other words, why should I believe (in the context of my knowledge discovery process) this individual element of my 'hypothesis-graph' to be 'true' or 'valid'? Obviously in the end, the entire hypothesis put forward by a computer algorithm and 'pre'-validated by human reasoning based on 'what we collectively already know' needs to be experimentally proven (call it 'in origine' validation).

What I would like to discuss in a bit more depth is the 'in cerebro' part. For practical purposes I here define 'everything we collectively know', or at least what we have 'shared', as the 'explicitome' (I hope Jon Eisen doesn't include that in his 'bad -omes'): essentially a huge dynamic graph of 'nanopublications', or actually rather 'cardinal assertions', where identical, repetitive nanopublications have already been aggregated and assigned an 'evidence factor'. Whenever a given assertion (connecting triple) is not a 'completely established fact' (the sort of assertion you repeat in a new narrative without the need to add a reference/citation), we will, in my opinion, go to the narrative text 'forever' to 'check the validity'.

Major computer power is now exploited for various intelligent ways to infer the 'implicitome' of what we implicitly know (sorry, Jon, should you ever see this!), but triples captured in RDF are certainly no replacement for narrative in terms of reading a good reasoning, why conclusions are warranted, extensive descriptions of materials and methods, etc. So the 'validation' of triples outside their context will be a very important process in eScience for many decades to come. In fact your earlier metaphor of the 'minutes of science' fits perfectly in this model. 'Why would I believe this particular assertion?' ... Well, look in the minutes: by whom, where, and based on what evidence it was made.

Now here is a very relevant part of the OA discussion: The time when some people thought that OA was a sort of charity model for scientific publishing is definitely over, with profitable OA publishers around us. The only real difference is: do we (the authors) pay up front, or do we refuse that (for whatever good reason, see below) and now the reader has to pay 'after the fact'. So let's first agree that there is no 'moral superiority', whatever that is, in OA over the traditional subscription model.  
Not sure if I agree, Barend. OK, let's leave morals out of it, but first of all, articles in subscription journals can also be made open access via the so-called 'green' route of depositing the accepted manuscript in an open repository; and secondly, OA at source, the so-called 'gold' route, is definitely the superior way, practically and in terms of transparency, to share scientific information with anyone who needs or wants it.
We have also seen the downsides of OA, for instance for researchers in developing countries, who may still have great difficulty finding the substantial fees needed to publish in the leading Open Access journals.

I believe however, that we have a great paradigm shift right in front of us. Computer reasoning and ultralight 'RDF graphs' distributing the results to (inter alia) mobile devices will allow global open distribution of such 'pixels of information' at affordable costs, even in developing countries. Obviously, an associated practice will be to 'go and check' the validity of individual assertions in these graphs. That is exactly where the 'classical' narrative article will continue to have its great value. It is clear that reviewing, formatting, cross-linking and sustainably providing the 'minutes of science' is costly, and that the community will have to pay for this via various routes. I feel that it is perfectly defensible that those articles for which the publishing costs have not been paid by the authors, and that are still being provided by classical publishing houses, should continue to 'have a price'. As long as all nanopublications (let's say the assertions representing the 'dry facts' contained in the narrative legacy as well as data in databases) are exposed in Open (RDF) Spaces for people and computers to reason with, the knowledge discovery process will be enormously accelerated. Some people may still resent that they may have to pay (at least for some time to come) for narrative that was published following the 'don't pay now — subscribe later' adage. We obviously believe that the major players from the 'subscription age' have a responsibility, but also a very strong incentive, to develop new methods and business models that allow a smooth transition to eScience-supportive publication without becoming extinct before they can adapt.

Best,
Barend
Your views are certainly worth a serious and in-depth discussion, Barend. I invite readers of this blog to join in and engage in that discussion.

Jan Velterop

Tuesday, August 07, 2012

Open access – gold versus green

Recently, Andrew Adams contributed to the 'gold' vs. 'green' open access discussion and he wrote this on the LIBLICENSE list (edited for typos):
There are on the order of 10,000 research institutions and more than ten times as many journals. Persuading 10,000 institutions to adopt OA deposit mandates seems to me a quicker and more certain route to obtain OA than persuading 100,000 journals to go Gold (and finding more money to bribe them into it, it would appear – money which is going to continue to be demanded by them in perpetuity, not accepted as a transitional fee – there's nothing so permanent as a temporary measure). (Full message here.)
The LIBLICENSE list moderator would not post my response, so I'm giving it here:

10,000 research institutes means, in terms of Harnadian 'green' mandates, a need for 10,000 repositories; 100,000 journals (if there were so many; I've only ever heard numbers in the order of 20-25,000 [recently confirmed as in the order of 28K]) does not mean 100,000 publishers. Besides, there is no existential reason for institutions to have a repository and a 'green' mandate. The fact that others have repositories while it doesn't have one itself does not harm a research institution in the same way that not being 'gold' (or at least not having a 'gold' option) existentially harms journals in an environment of more and more 'gold' journals.

As for costs, there are two things that seem to escape the attention of 'green' advocates (by which I mean those who see no place for 'gold' open access at this stage on the basis that 'green' would be a faster route to OA and would be cheaper):
  1. 'Green' fully depends on the prolongation of the subscription model. Without subscription revenues no journals, hence no peer-reviewed articles, hence nothing to self-archive but manuscripts, arXiv-style. (That would be fine by me, actually, with post-publication peer review mechanisms overlaying arXiv-oids.) The cost of maintaining subscriptions is completely ignored by exclusively 'green' advocates, who always talk about 'green' costing next to nothing. They are talking about the *marginal* cost of 'green', and comparing it to the *integral* cost of 'gold'. (A toy calculation follows this list.)
  2. Exclusively 'green' advocates do not seem to understand that for 'gold' journals, publishers are not in any position to "demand money". They can only offer their services in exchange for a fee if those who would pay the fee are willing to pay it. That's known as 'competition', or as a 'functioning market'. By its very nature, it drives down prices. This in contrast to the monopoloid subscription market, a dysfunctional market, where the price pressures point upwards. Sure, some APCs have increased since the early beginnings of 'gold' OA publishing, when 'gold' publishers found out they couldn't do it for amounts below their costs. But generally, the average APCs per 'gold' article are lower — much lower — than the average publisher revenues per subscription article. And this average per-article subscription price will still have to be coughed up in order to keep 'green' afloat.
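To make the marginal-versus-integral point concrete, here is the toy calculation. Every figure in it is a hypothetical placeholder, chosen only to show the shape of the comparison:

```python
# All numbers are hypothetical placeholders, not data from this post.
subscription_revenue_per_article = 5000  # assumed avg publisher revenue per article
green_marginal_cost = 10                 # assumed cost of one repository deposit
gold_apc = 2000                          # assumed average APC per 'gold' article

# The comparison exclusively 'green' advocates tend to make:
print(green_marginal_cost, "vs", gold_apc)             # 10 vs 2000

# The comparison argued for above: 'green' depends on subscriptions,
# so its integral cost includes the per-article subscription revenue.
print(subscription_revenue_per_article + green_marginal_cost,
      "vs", gold_apc)                                   # 5010 vs 2000
```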
Price-reducing mechanisms would work even faster if and when the denizens of the ivory tower were to reduce the culturalism and anglo-linguism that currently prevail, in which case we could rapidly see science publishing emerge in places like China, India, and other countries keen on establishing their place in a global market, competing on price. APCs could tumble. Some call this 'predatory gold OA publishing'. Few seem to realise that the 'prey' is the subscription model.

The recently published Finch Report expresses a preference for immediate, 'libre', open access, and sees 'gold' as more likely to be able to deliver that than 'green'. Meanwhile, 'green' is a way to deliver OA (albeit delayed and not libre) in cases where 'gold' is not feasible yet. That is an entirely sensible viewpoint, completely compatible with the letter – and I think also the spirit – of the Budapest Open Access Initiative (BOAI). Incidentally, referring to the BOAI is characterised as "fetishism" (sic) by Andrew Adams.

Comparing 'green' and 'gold' is almost, to borrow a phrase from Stevan Harnad, "comparing apples and orang-utans". The Finch report is not mistaken to see 'green' as (in the words of Michael Jubb) an "impoverished type of open access, with embargo periods, access only to an author's manuscript, without links and semantic enrichment; and severe limitations on the rights of use." After all, in the 'green' ID/OA scheme (ID = Immediate Deposit and OA meaning 'Optional Access' here) favoured by Harnad c.s., deposited articles may be made open if and when the publisher permits.

Besides, 'gold' also implies 'green' ('gold' articles can be deposited, without embargo or limits on use, anywhere, and by anyone), whereas 'green' does not imply 'gold'. A Venn diagram might look like this (below).

The Finch group has come to its conclusions because it has clearly learnt the lessons of the last decade. There is nothing — repeat: *nothing* — that prevents academics from eschewing the services of "rent-seeking" (as Adams put it) publishers. They could easily self-organise (though I realise that both the words 'could' and 'easily' are probably misplaced). To expect publishers (for-profit and not-for-profit ones alike) to refuse to provide services that academics are seeking from them is silly.

For the avoidance of doubt, I am not against 'green' OA (in spite of what some 'green'-only advocates assert), especially not where there is no other option. The choice is not so much for or against 'green' or 'gold', but emphatically for full, unimpeded open access, however it is delivered, as long as it is "permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself." You recognise this last phrase? Indeed, the precise wording of the BOAI.

Jan Velterop

Wednesday, August 01, 2012

The triumph of cloud cuckoo land over reality?

It should be abundantly clear that Open Access policies by Finch, RCUK, Wellcome Trust and many others are very important for the development of universal OA, in that they not only indicate practical ways of achieving it, but also signal to the scholarly community and the wider society interested in scientific knowledge and its advance that OA should be the norm.

The 'sin' that RCUK, Finch and the Wellcome Trust committed is that they didn't formulate their policies according to strict Harnadian orthodoxy. It's not that they forbid Harnadian OA (a.k.a. 'green'), oh no. It is that they see the 'gold' route to OA as worthy of support as well. Harnad, as ultimate arbiter of Harnadian OA (though he has acolytes), would like to see funder and institutional OA policies focus entirely and only on Harnadian OA, and would want them, to all intents and purposes, to forbid the 'gold' route. In Harnad's view, the 'gold' route comes into play (as 'downsized gold', whatever that means) only once all scholarly journal literature is OA according to Harnadian rules. These rules are quite specific:
  • articles must be published in peer-reviewed subscription journals; 
  • institutions must mandate their subsequent deposit in an institutional repository (not, for instance in a global subject repository); 
  • there must be no insistence on OA immediately upon publication (his big idea is ID/OA — Immediate Deposit / Optional [sic] Access); 
  • there must be no insistence on CC-BY or equivalent (which would make re-use and text-mining possible — OA in his view should just be ocular access, not machine access).
It must be difficult to comply with these rules, and seeing his recent applause for the RCUK policy, subsequently followed by withdrawal of support, even Harnad himself finds it difficult to assess whether his rules are 'properly' adhered to. It also seems as if his main focus is not OA but mandated deposit in institutional repositories, probably in the hope that that will eventually lead to OA. He would like to see 'gold' OA — OA at source — considered only if and when it is "downsized Gold OA, once Green OA has prevailed globally, making subscriptions unsustainable and forcing journals to downsize." It is the equivalent of opening the parachute only a split second before hitting the ground. It would be the triumph of a dogmatically serial process over a pragmatically parallel one. The triumph of cloud cuckoo land over reality.

Open Access is more than worth having. Different, complementary ways help to achieve it. There are many roads leading to Rome.

Jan Velterop
OA advocate

Monday, June 11, 2012

Small publications, large implications


When I recently enjoyed lunch with Steve Pettifer of Manchester University (the ‘father’ of Utopia Documents), the conversation turned to nanopublications. Ah, you want to know what nanopublications are. Nanopublications are machine-readable, single, attributable, scientific assertions.

Steve posed the question “why would any scientist believe a nanopublication, particularly if out of context?” Indeed, why would they? Why should they, well versed as scientists are in the art of critical thinking? They won’t, at least not without seeing the appropriate context.

Herein lies a great opportunity.

Let me explain. Nanopublications, or rather, their core in the form of machine-readable subject-predicate-object triples, can be incorporated in (vast) collections of such triples and used for reasoning, to discover new knowledge, or to make explicit hitherto tacit or hidden knowledge. Triples can therefore be very valuable to science. (The Open PHACTS project is in the process of establishing the value of this approach for drug discovery.) Many, perhaps most, scientific articles contain such single assertions, which could be presented as nanopublications.

In a recent Nature Genetics commentary called ‘The Value of Data’, Barend Mons et al. addressed this issue with the metaphor of the chicken and the egg. Now that eggs (individual assertions) are being distributed (‘traded’), their value (they all look roughly the same) can only be truly assessed by knowing the parents. Scientists will always want to personally judge whether a crucial connecting assertion in a given hypothesis is one they can accept as valid. The ability to locate where the assertion came from, in which article, in which journal, by which author, and when it was published – in short the ‘provenance’ of individual scientific assertions functioning in computer reasoning – is crucial for that. As is the ability to access the article in question.
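
Both an assertion and its provenance can be expressed as triples. A minimal sketch with rdflib, in which every URI and property name is a hypothetical placeholder rather than an established nanopublication vocabulary:

```python
# One machine-readable assertion plus its provenance: by whom, in which
# article, and when it was made. All names are hypothetical placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import XSD

EX = Namespace("http://example.org/")
g = Graph()

# The assertion itself: a single subject-predicate-object triple.
g.add((EX.geneX, EX.isAssociatedWith, EX.diseaseY))

# Its provenance: what a critical reader needs before believing it.
assertion = EX["assertion/1"]
g.add((assertion, EX.assertedBy, Literal("A. Author")))
g.add((assertion, EX.appearsIn, URIRef("https://doi.org/10.1000/example")))
g.add((assertion, EX.publishedOn, Literal("2012-06-11", datatype=XSD.date)))

print(g.serialize(format="turtle"))
```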

Scientific publishers should, in their quest to add value to research publications, expose and clearly present the nanopublications contained in the articles they publish, particularly those that are believed (e.g. by the author, or the reviewers) to be unique and new. What’s more, they should make them openly and freely available, as they do with abstracts; this goes even for publishers that are not yet convinced that they should change their business models and make all their content open access. And they should not just make nanopublications open and accessible to human readers, but also to machines, because only machines are able to effectively process large numbers of nanopublications, treating each one as a ‘pixel’ of the larger picture that a researcher is building up.

So what’s the opportunity?

Well, openly accessible nanopublications are very useful for scientific discovery; they are attributable (to author, article, and journal), and scientists don’t just believe them when they see them, particularly if the assertion is new to them or when they find it in a process of computer-assisted (in silico) reasoning. Researchers will be eager to investigate their source, i.e. check out the article from which the nanopublication comes. They may cite the nanopublication, and in doing so, cite the article. An obvious win-win situation for scientists (in their roles of users and authors) and publishers alike.

What are we waiting for?

Jan Velterop

Sunday, April 29, 2012

OA not just for institutionalised scientists

On the Global Open Access List, an email list, a thread has developed on 'Open Access Priorities: Peer Access and Public Access'. Of course, true open access means access both for peers (meaning fellow scientists, in this case, not just members of the UK House of Lords) and for the general public at large, so the discussion is really about what is more important and what is the more persuasive argument to get research scientists to make their publications available with open access. And should that argument mainly be quasi-legal, in the form of institutional mandates?

My view is this:

Is it not so that when there is no wide cultural or societal support for a law or mandate, more effort is generally spent on evasion than on compliance, and enforcement turns out to be like mopping up with the tap still running? If one should be taking examples from US politics, the 'war on drugs' is the one to look at.

Forcing scientists into open access via mandates and the like is only ever likely to be truly successful if it is rooted in an already changing culture. An academic culture with an expectation that research results are openly available to all. Reinforced by the shame that researchers will be made to feel in the lab, at dinner parties, or in the pub, if their results are not published with open access. Of course that will still be mainly peer pressure, but changing the hearts and minds of peers is greatly helped if there is a societal substrate in which the open culture can grow. Mandates or not, OA will never happen if scientists aren't convinced from within. An appeal to them as human beings and members of society is more likely to achieve that than mandates, in my view. The latter should back up a general change of heart, not be a substitute for it.

What 'the general public' is should not be misunderstood and construed to mean only those interested in medical literature. It includes all those interested in the other 999 areas as well: ex-scientists, retired scientists, start-ups and SMEs, scientists interested in another discipline or in cross-discipline topics, students, lawyers, reporters, teachers, even hobbyists. Einstein wasn't an institutionalised scientist when he worked on his most important work; he was a patent clerk.

Of course, those OA evangelists who wish to pursue mandates should be pursuing mandates. I encourage them to keep doing just that. But to narrow the efforts of OA evangelism to what is stubbornly being called "the quickest route", in spite of it being no more than a hypothesis which certainly over the last decade and a half hasn't proved itself to be as effective as first thought, is a mistake.

By all means, where there are opportunities to promote mandates, let us do that, but not at the expense of making the moral and societal responsibility case for OA.

Jan Velterop

Wednesday, April 11, 2012

'Enriching' Open Access articles

I've been asked what the relevance of my previous post is to Open Access. The relevance of Utopia Documents to Open Access may not be immediately clear, but it is certainly there. Though Utopia Documents doesn't make open any articles that aren't already, it provides 'article-of-the-future-like' functionality for any PDF, OA or not. It opens them up in terms of Web connectivity, as it were, and it is completely publisher-independent. So PDFs in open repositories – even informal, author-manuscript ones – and from small OA publishers can have the same type of functionality that hitherto only larger publishers could afford to provide, and then only for HTML versions of articles.

PDFs often get a bad press, as you probably know, yet according to statistics from many publishers, PDFs still represent by far the largest share of scientific article downloads. PDFs have great advantages, but until now also disadvantages relative to HTML versions, particularly with regard to the latter's Web connectedness (this – open – article is worth reading: http://dx.doi.org/10.1087/20110309). This digital divide, however, has now been bridged! The Utopia Documents PDF-viewer is built around the concept of connecting hitherto static PDFs to the Web, and it bridges the 'linkability gap' between HTML and PDF, making the latter just as easily connected to whatever the Internet has on offer as the former (as long as you are online, of course).

The new – wholly renewed – version (2.0) of the Utopia Documents scientific PDF-viewer has now been released. It is free and downloads are currently available for Mac and Windows (and a Linux version is expected soon). Version 2.0 automatically shows Altmetrics (see how the article is doing), Mendeley (see related articles available there), Sherpa/RoMEO (check its open archiving status), etcetera, and connects directly to many more scientific and laboratory information resources on the Web, straight from the PDF.

Utopia Documents allows you, if you so wish, to experience dynamically enriched scientific articles – articles from whichever publisher or OA repository, since Utopia Documents is completely publisher-independent, providing enrichment for any modern PDF*, even 'informal' ones made by authors from their manuscripts (e.g. via 'Save as PDF') and deposited in institutional repositories.

'Enrichment' means, among other things, easy Web connectivity, directly from highlighted text in the PDF, to an ever-expanding variety of data sources and scientific information and search tools. It also means the possibility of extracting any tables into a spreadsheet format, and a 'toggle' that converts numerical tables into easy-to-read scatter plots. It means up-to-date Altmetrics, whenever available, that let you see how articles are doing. It means a comments function that lets you carry out relevant discussions that stay right with the paper, rather than necessarily having to go off onto a blog somewhere. It means being able to quickly flick through the images and illustrations in an article. It means that existing PDFs from whatever source are 'converted', as it were, on the fly, to what some publishers call 'articles of the future'. (The original PDF is in no way altered; the 'conversion' is virtual.)

With Utopia Documents, publishers, repositories, libraries, even individuals with PDFs on their personal sites, can offer enriched scientific articles just by encouraging their users to read PDFs with the free Utopia Documents PDF-viewer, and so get more out of the scientific literature at hand than would otherwise be possible. Utopia Documents is indeed truly free; registration is not even needed (except for adding comments).

Utopia Documents is usable in all scientific disciplines, but its default specialist web resources are currently optimised for the biomedical/biochemical spectrum. http://utopiadocs.com

Friday, April 06, 2012

Pee Dee Effing Brilliant

Are you a scientist or student? Life sciences? Do you ever read research literature in PDF format?

Did it ever occur to you that it might be useful, or at least convenient, if scientific articles in PDF format were a bit more 'connected' to the rest of the web? And would enable you, for instance, directly from the text, to:
  • look up more information about a term or phrase you're encountering (e.g. a gene, a protein, etc.)
  • look up the latest related articles (e.g. in PubMed, Mendeley)
  • see, in real time, how the article is doing (Altmetrics)
  • search (NCBI databases, protein databases, Google, Wikipedia, Quertle, etc.)
  • share comments with fellow researchers
Well, all of that – and much more – is now possible. All you have to do is view your PDFs in the new Utopia Documents.

Utopia Documents has been developed by researchers from the University of Manchester, is completely free, and available for Mac, Windows and Linux. It works with all PDFs* irrespective of their origin**.

I invite you – urge you – to try it out, tell your colleagues and friends, and ask them to tell theirs. And tweet and blog about it. Registration is not necessary, except if you want to make use of the public 'comment' function. Feedback is highly appreciated. Either as a comment on this blog, or directly to the Utopia crew. And testimonials, too, obviously.

Disclosure: I work with these guys. A lot. They are brilliant and yet pragmatic. Driven by a desire to make life easier for scientists and students alike.

*With the exception of bitmap-only PDFs (scans)
**From any publisher, and even including 'informal' PDFs such as can be found in repositories, or those that you have created yourself from a manuscript written in Word, for instance

Thursday, February 23, 2012

They’re changing a clause, and even some laws, yet everything stays the way it was.


The title captures the feeling of frustration with the often glacial pace of changes we regard as necessary and inevitable. So we try to influence the speed of change, and one time-honoured tool we take out of the box is the boycott. Boycotts are a way to get things off your chest; even to get some guilt relief, but although there are notable exceptions, they rarely change things fundamentally. Take the Elsevier Science boycott. I understand the feeling behind it, but if their prices were reduced to half of what they are now, or even if they went out of business, would that really be a solution to the problems with which scientific communication wrestles? As many a boycott does, this one, too, is likely to result in ‘changing a clause, changing some laws, yet everything staying the way it was’.

A boycott doesn’t alter the fact that we view publishers as publishers. That's how they view themselves, too. However, that is the underlying problem. Perhaps publishers were publishers, in the past, but they are no longer. Any dissemination of knowledge that results from their activities is not much more than a side effect. No, publishers’ role is not to ‘publish’; it is to feed the need of the scientific ego-system for public approbation, and of its officialdom for proxies for validation and scientific prowess assessment in order to make their decisions about tenure, promotion and grants easier and swifter.

Crazy line of thought, no? Well, maybe, but look at what happens in physics. The actual publishing – dissemination of information and knowledge – takes place by posting in arXiv. Yet a good portion of articles in arXiv – quite possibly the majority, does anyone have the numbers? – are subsequently submitted to journals and formally ‘published’. Why? Well, "peer review" is the stock answer. And acquiring impact factors (even though especially in physics one would expect scientists to pay heed to Einstein’s dictum that “not everything that can be counted counts and not everything that counts can be counted”).

Clearly, officialdom in physics is prepared to pay, to the tune of thousands of dollars per article, for the organization of the peer review ritual and the acquisition of impact factor ‘tags’ that come with formal publication of a ‘version of record’. So be it. If officialdom perceives these things as necessary and is willing to pay, ‘publishers’ are of course happy to provide them.

But one of the biggest problems in science communication, the free flow of information, seems to have been solved in physics, as arXiv is completely open. If arXiv-like platforms were to exist in other disciplines as well, and if a cultural expectation were to emerge that papers be posted on those platforms before submission to journals, and their posting be accepted as a priority claim, we would have achieved free flow of information in those other areas as well.

I suspect that the essence of the Federal Research Public Access Act (FRPAA) is about achieving a situation like the one that exists in physics with arXiv. Given that arXiv has done no discernible damage to publishers (at least as far as I'm aware, and, reportedly, also according to the publishing arms of the American Physical Society and the UK Institute of Physics), pushing for the Research Works Act (RWA) instead of making the case for extending an arXiv-like 'preprint' system to disciplines beyond physics seems an extraordinary lapse of good judgement.

On the other hand, the publishers' concern that the academic community will not for long be willing to pay the sort of money it now does for what is little more than feeding the officialdom monster is a realistic one. Unfortunately for them, stopping the evolution of science communication in its tracks is simply not an option. Perhaps the current boycott is one of the rare successful ones, and perhaps it will spur publishers on to reconsider their role and position. There are definitely ways for a publisher to play a beneficial role. Just a small example: I was told of a recent case where the peer reviewer expressed his frustration with the words “Imagine if before it was sent to me for review a professional editor actually read all 40 pages and discovered the heinous number of basic grammatical issues, spelling errors, and typos, and sent it back to the authors or to an English correction service before I had to spend more time on that, rather than on the actual scientific content.”

Personally, I think open arXiv-like set-ups in disciplines other than physics are the way forward. Publishers should – and truly forward-looking ones may – establish those themselves, if they don’t want to be reduced to an afterthought in the scientific communication game.

We live in hope, though not holding our breath.

Jan Velterop


Sunday, February 05, 2012

Collaborate, don't frustrate


We have seen a fair amount of activity on the web in the last few weeks with regard to protests, even boycotts, aimed at prominent publishers. Most of it seems to be about money. When money is tight, it leads to a fight.

We are in the huge pickle of a dysfunctional system. And that’s certainly not just the publishers’ fault. They just make the most of the system that is there and that is being kept alive by the scientific community at large. See my previous post. All publishers are commercial and all want to optimize their profits, although some, the not-for-profit outfits, optimize their ‘results’ or their ‘surplus’. Same thing, really. It’s just the way the capitalist system works. The system is dysfunctional because there is no competition. The scientific community allows it to exist without competition. Relying on subscriptions for their income makes journals, and their publishers, monopoloid in an environment where content is non-rivalrous. If the only options to get from A to B – and you have to get from A to B – are a train or walking, because there are no roads, then the train company has a hold on you. And on your money. The situation in science publishing is scarcely different.

So the solution is introducing competition. ‘Gold’ Open Access publishing does just that, albeit perhaps in a fairly primitive way, so far. It’s typically a game of new entrants. But in order to be truly successful, the scientific community at large has to buy in to it. Literally ‘buy’ into it. Publishers can lead the horse to the Open Access water, but they can’t make it drink.

I won’t hold my breath. And there is so much else in science publishing, besides money matters, that needs to be improved.

Just one example: fragmentation. Fragmentation is a big, frustrating problem. Particularly for the efficient and effective ingestion of information. But it need not be so bad. Although science publishers are bound by antitrust rules, there are areas of a pre-competitive nature where they are allowed to collaborate. Think standards, think CrossRef. Those forms of collaboration, for the benefit of science, could be expanded. Other standards could be introduced, to do with data linking, for instance, with data representation, computer-readability, interoperability. Things like structured abstracts. Perhaps even ontologies and agreed vocabularies for scientific concepts, analogous to biological and chemical nomenclature. User licences could be standardized, pre-competitively. Et cetera. There are some sophisticated features around, but their wide adoption all too often suffers from the not-invented-here syndrome. Publishers, too, live in an ego-system of their own.

And it is not just in pre-competitive areas where fragmentation could be remedied. There are areas that you could call ‘post-competitive’, where collaborations between publishers and standardisations of practices and technologies could be of tremendous value to the scientific community, without costing the publishers much, or even anything. Take fragmentation again. Even if the subscription system were to be kept alive, publishers could, PubMedCentral-like, deposit all the journal articles they publish in discipline-central global databases, after, say, a year. The vast majority of the realizable economic value of annual subscriptions is realized within a year (that’s why the subscriptions are annual), and although open access after a year is not ideal, it would be a massive improvement over the current situation with very little cost to the publishers. And unlike PubMedCentral, the publishers should, collectively and proactively, set up and organize these open repositories. Asking funding agencies to help support the future maintenance of such repositories should not be too difficult. It's a conservation issue the responsibility for which cannot and should not be put on the shoulders of potentially fickle private enterprise. 

Another area of post-competitive collaboration, or at least cooperation, would be the so-called ‘enrichment’ of journal articles, in their HTML as well as their PDF manifestations. Every publisher seems to have its own ideas, and that’s all very well, but it doesn’t make life easier for researchers. Why not pool these ideas and apply them as widely as possible? There is little, if any, competitive cost to that, and a great deal of potential benefit to the scientific community, the professed aim of virtually all scientific publishers.

It clearly is not beyond the publishers to work together and create something very useful. Just look at CrossRef. It is an example worthy of being the paradigm for publisher attitudes and behaviour with regard to pre-competitive and post-competitive collaborations. 

Jan Velterop

Publishers are not evil


Commercial publishers, as a class, are not evil. To think so is wrong. They have just been doing what the scientific community can't or won't do by itself. And like most businesses, they charge what they can get away with. It’s known as ‘the market’. They can’t be criticised for existing and functioning in a perfectly legal capitalist market and regulatory environment. That doesn’t mean they can’t be criticised. Individual publishers can be criticised for their actions and inactions. As an industry, among the things they can be criticised for are not evolving fast enough, given the environmental change that the web has brought about. But so can the academic community. The reliance on old and now effectively dysfunctional systems and habits from a bygone era is mutual.

Centuries ago, in Europe, non-Christians were forbidden to belong to the guilds, which made it impossible for them to be any kind of craftsman, essentially leaving them with few other options than being an unskilled labourer, trader, or money lender. So some became very wealthy and thus became the target of envy. And accused of usury and the like. Just for doing the only thing they were allowed to do and society needed someone to do. It’s more complicated than that, but it captures the essence.

The relevance of this to science publishing? Well, at a certain point, when science had grown into a sizeable, professional and global pursuit, academics didn’t, or couldn’t, organise publishing properly anymore on that global scale. University presses were, by definition, rather local, and so were scientific societies. Commercial publishers stepped into the breach, some became very wealthy, and they are now the target of envy. Or at least of criticism of their wealth. And accused of greed and the like. Just for doing some of the things the academic community needs, or thinks it needs, in the environment of a ‘market’ (starting in the 1950s with, e.g., the internationalisation of science communication; the abolition of the sort of author charges the scientific societies were levying for their journals; the standardisation of article structures, language, et cetera).

Lesson: if you leave it to outsiders to provide your essential services, because you can’t, or won’t, truly assimilate and embed those outsiders, and provide the services from within your own circles, you risk losing control and you cannot blame the outsiders for taking the opportunities you give them.

Jan Velterop

PS. The first Open Access publisher was a commercial publisher. The largest publisher of Open Access articles today is a commercial publisher. Why are there not more scientist-led initiatives like PLoS?