Incentives/rewards for scientific contributions
The traditional way to gauge a researcher’s scientific prowess is to look at their publication record in peer-reviewed journals, and to use crude, imperfect metrics such as the ISI Impact Factor (IF) as a measure of the quality of those journals. But researchers contribute to science in many ways besides authoring traditional papers. Submissions to biological databases, curation of data in those databases, Web 2.0 activities such as scientific blogging, and online commenting on and rating of scientific papers (pioneered by PLoS) are examples of activities for which researchers currently get little or no credit. Here we explore how a microattribution scheme providing incentives/rewards could help in this regard, and outline some examples.
If the various contributions (such as those listed above) can be tracked and linked to the identity of each researcher via their OpenID (see previous section), a web of publication credit-like (aka ‘microattribution’) information would gradually emerge. This information can be mined and aggregated to produce far more useful metrics of individual scientific contribution than is possible today (see e.g. the Scholar Factor proposed in a recent paper in PLoS[fn]Bourne et al. I am not a scientist, I am a number. PLoS Comput Biol (2008) vol. 4 (12) doi:10.1371/journal.pcbi.1000247[/fn]). These ideas are being further developed in the guise of a BioMedical Resource Impact Factor (BRIF)[fn]Cambon-Thomsen. Assessing the impact of biobanks. Nature Genetics (2003) vol. 34 (1) doi:10.1038/ng0503-25b[/fn], which is heavily centered on the needs and activities of the biobanking community.
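As a rough illustration of the aggregation step, the sketch below groups contribution records by OpenID and derives a simple weighted score. It is not the Scholar Factor or BRIF calculation: the record fields, the contribution kinds, the weights, and the example OpenID URLs are all assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    openid: str  # researcher's OpenID URL (example values are hypothetical)
    kind: str    # e.g. "paper", "db_submission", "curation", "comment"
    target: str  # identifier of the item contributed to (hypothetical)


# Hypothetical weights; a real scheme would have to agree on these.
WEIGHTS = {"paper": 10.0, "db_submission": 3.0, "curation": 2.0, "comment": 0.5}


def aggregate(contributions):
    """Count contributions per researcher, broken down by kind."""
    counts = defaultdict(lambda: defaultdict(int))
    for c in contributions:
        counts[c.openid][c.kind] += 1
    return {openid: dict(kinds) for openid, kinds in counts.items()}


def score(kind_counts, weights=WEIGHTS):
    """Collapse per-kind counts into one number; unknown kinds score zero."""
    return sum(weights.get(kind, 0.0) * n for kind, n in kind_counts.items())


records = [
    Contribution("https://alice.example.org/", "paper", "paper-1"),
    Contribution("https://alice.example.org/", "curation", "record-17"),
    Contribution("https://alice.example.org/", "curation", "record-18"),
    Contribution("https://bob.example.org/", "db_submission", "record-99"),
    Contribution("https://bob.example.org/", "comment", "paper-1"),
]

profile = aggregate(records)
for openid, kinds in profile.items():
    print(openid, kinds, score(kinds))
```

Keeping the per-kind counts separate from the score means different consumers (funders, journals, hiring committees) could apply their own weighting to the same underlying records.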
Submissions to biological databases
Database submissions are often driven by journal and/or funder requirements; i.e. in order to get a manuscript accepted (or research funded), the authors must submit their primary data to the appropriate archives (e.g. DNA sequences to GenBank/EMBL/DDBJ) and then cite submission accession numbers in their manuscript. For some categories of data, mainly DNA sequences and gene expression microarray experiments, this arrangement is now well established and the data are relatively standardized.
But for other kinds of data which have emerged quite recently (e.g. results from genome-wide association studies, see previous section) this is not the case, and papers are published every month where the underlying data are not made available. There can be several reasons for this, but key factors are undoubtedly that i) journals/funders do not yet require the data to be submitted to archives, and ii) researchers are not rewarded for submitting data.
Construction/maintenance of databases
For many kinds of biological databases, maintenance involves manual curation of the data contents. Biocurators verify the correctness of data (e.g. automated gene structure predictions), enhance data by adding related information (e.g. from literature mining), cross-reference records with other databases, and so on. Such work goes largely unnoticed because its output cannot usually be measured in journal publications, prompting calls for a robust infrastructure for recognition of curation work. This will be particularly important for facilitating community curation of large-scale datasets which are too large to be tackled by curator teams[fn]Howe et al. Big data: The future of biocuration. Nature (2008) vol. 455 (7209) doi:10.1038/455047a[/fn].
Additionally, funding for construction, and particularly long-term maintenance, of biological databases is hard to secure[fn]Merali et al. Databases in peril. Nature (2005) vol. 435 (7045) doi:10.1038/nrg2483[/fn]. It would be valuable for database maintainers if they could unequivocally show reviewers how useful their data content is to the community, by way of accurate citation metrics for datasets (possibly right down to the level of individual database records). This would be a far more useful measure of the scientific value of a database than simple website traffic statistics and/or citations to the paper describing the database as a whole.
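One way record-level citation metrics could be approximated is by counting how many papers mention each accession number. The sketch below assumes that full text is available for each paper and uses a deliberately simplified accession pattern (one letter plus five digits, or two letters plus six digits); the paper identifiers and texts are invented for the example, and a real service would need database-specific patterns and validation against the archive.

```python
import re
from collections import Counter

# Simplified, assumed accession pattern; real archives use more formats.
ACCESSION = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6})\b")


def record_citations(papers):
    """Count, for each accession number, the number of papers mentioning it.

    `papers` maps a paper identifier to its full text. A record mentioned
    several times in one paper is counted once for that paper.
    """
    counts = Counter()
    for text in papers.values():
        counts.update(set(ACCESSION.findall(text)))
    return counts


# Invented example texts.
papers = {
    "paper-1": "We used U49845 and AF123456; U49845 was re-annotated.",
    "paper-2": "Sequence U49845 was aligned against our assembly.",
}

counts = record_citations(papers)
print(counts.most_common())
```

Summing these counts over all records in a database would give a usage figure that reflects citations to the data themselves rather than to the paper describing the database.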
Constructing one or more microcredit-tracking systems for incentives/rewards is entirely feasible technically, but success depends heavily on researchers being able to identify themselves uniquely (and securely) to the various online services involved. Furthermore, users should always be able to choose whether or not to use their primary public identity for these activities (e.g. in situations where anonymity is preferred). The last section in this primer will discuss some scenarios where a user-centric system (as previously presented) can underpin a loosely connected network of services which can go a long way towards achieving this goal.