Data archiving progress at Heredity

Heredity's interim policy on data archiving says the following:

"Authors are strongly encouraged to follow established minimum guidelines for the reporting of biological data, wherever appropriate. Guidelines for many relevant data types are available from MIBBI: Minimum Information for Biological and Biomedical Investigations (http://www.mibbi.org/).
DNA sequences published in Heredity must be deposited in a publicly available database, usually EMBL / GenBank / DDBJ, and accession numbers must be included in the final version of the manuscript. Where public databases exist for other data types, such as microarray data (see www.ebi.ac.uk/Databases/microarray.html, for example), they must be used and the relevant reference should be included in the manuscript. Where no public database exists, authors are strongly encouraged to provide the data on which their analyses are based as Electronic Supplementary Information. The data should be formatted for use in a relevant, readily available software package, ideally one which allows data export in a variety of formats (such as CREATE for population genetic data: https://bcrc.bio.umass.edu/pedigreesoftware/node/2). Sufficient metadata (such as sample locations, individual identities, etc.) should be provided to allow easy repetition of analyses presented in the manuscript.

Heredity proposes to make public archiving of data a requirement for publication in the near future and welcomes feedback from authors on this proposal (please address comments to heredity@shef.ac.uk)."

Readers will immediately notice the difference between ‘must be deposited’ for sequence and microarray data and the weaker ‘strongly encouraged’ for other data types. This reflects the uncertainty discussed by our Newsletter Editor, Steve Russell, in his TaxiDriver piece (Issue 61): we have all benefitted enormously from the universal archiving of sequence data in GenBank, and almost all scientists agree that the data on which published papers are based should be available for re-analysis or further study, but we are not sure how far to extend mandatory archiving to other data types.

Heredity introduced this interim policy because there were several concerns about rapidly adopting mandatory archiving. Taking a lead on this issue would be good, but it might be risky if other journals did not adopt similar policies. It is not clear that Supplementary Information provides a suitable archive where data can be stored in a useable form and easily retrieved. A suitable archive should also allow temporary embargoes on some data types, to allow authors to exploit their data more fully before release or to protect sensitive information. An ideal archive should allow easy submission of many data types, with metadata, integrated with the manuscript submission process.

The outlook is now changing rapidly. Several major journals in ecology and evolutionary biology (American Naturalist, Evolution, Molecular Ecology, Journal of Evolutionary Biology) have come together in a joint data archiving initiative. Heredity’s field of interest overlaps substantially with these journals and the policy they propose to adopt fits closely with our planned development. Therefore, we will join the other journals in publishing closely linked statements on data archiving early in 2010. This will make archiving of all data types a requirement for publication. We will be expecting referees and editors to police this requirement: a small extra load, I am afraid, but one that will be well worth the effort.

These journals are also supporting an initiative by NESCENT, the US National Evolutionary Synthesis Center (http://www.nescent.org/), to create a data repository suitable for the very varied forms of data generated in evolutionary biology, including evolutionary genetics. The result is DRYAD (http://datadryad.org/). Dryad operates like a data library: it can hold any data type, but the searches it supports are more limited than those possible for sequence data, for example. Data submission can be linked to the electronic manuscript submission systems of member journals. Embargoes can be requested. There is basic curation of data and metadata files, and the data are given a unique identifier which is linked to the appropriate publication. The joint data archiving initiative will recommend Dryad as a data repository but will not restrict the use of other repositories. The Genetics Society Committee is currently considering whether to join the Dryad Consortium. If you have views on Dryad, or data archiving more generally, please do contact me or any committee member.