Organising and documenting your data
Organising data on a day-to-day basis will enable you to retrieve the data more easily at a later stage. Things to consider are:
- use standard file-naming conventions
- include versioning information on documents and files, or use version control software
- structure your files logically
- consider file formats: can the data files be preserved or shared at the end of the project?
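As an illustration of the naming and versioning points above, here is a minimal Python sketch that builds consistent file names. The pattern it uses (date, project, description, version number) and the function name build_filename are assumptions made for this example, not a convention prescribed by this guidance; adapt them to whatever your group agrees on.

```python
import re
from datetime import date


def build_filename(project, description, version, extension, on=None):
    """Build a name of the form YYYYMMDD_project_description_vNN.ext.

    The pattern is a hypothetical convention chosen for illustration only.
    """
    on = on or date.today()

    def slug(text):
        # Lower-case the text and replace runs of other characters with a hyphen,
        # so names contain no spaces or special characters.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{on:%Y%m%d}_{slug(project)}_{slug(description)}"
        f"_v{version:02d}.{extension.lstrip('.')}"
    )


if __name__ == "__main__":
    print(build_filename("Soil Survey", "pH readings", 2, ".csv", date(2024, 1, 31)))
```

Putting the date first in year-month-day order means files sort chronologically in a directory listing, and a zero-padded version number keeps v02 ahead of v10.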
Documenting how the data was generated is especially important if the data is to be shared. RCUK state that research data should be "supported by sufficient contextual information to enable others to find what research data exists why, where and how it was generated and how to access it."
This metadata (i.e. data about the data) may need to describe the origin, processing, analysis and/or the researcher's management of a dataset. Metadata and documentation can be embedded in the dataset or provided in a supporting file, e.g. a readme.txt file that gives context, instructions or explanation.
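A supporting readme.txt file can be written by hand, but generating it from a small script helps keep the same headings across datasets. The sketch below is one possible layout; the field names (title, creator, description, methods, files) and the function render_readme are assumptions for this example and do not follow any particular metadata standard.

```python
from pathlib import Path


def render_readme(title, creator, description, methods, files):
    """Return the text of a simple readme describing a dataset.

    `files` maps each file name to a short note about its contents.
    The headings are illustrative, not a formal metadata schema.
    """
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        "",
        "Description:",
        description,
        "",
        "How the data was generated:",
        methods,
        "",
        "Files:",
    ]
    lines += [f"  {name} - {note}" for name, note in files.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_readme(
        title="Example dataset",
        creator="A. Researcher",
        description="Placeholder description of what the data contains.",
        methods="Placeholder note on instruments, processing and analysis steps.",
        files={"readings.csv": "raw measurements", "clean.csv": "processed data"},
    )
    # Write the readme next to the data files it describes.
    Path("readme.txt").write_text(text, encoding="utf-8")
```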
If you are using Pure to share/archive your data, the Pure metadata template allows you to create structured metadata to describe the data, and supporting documentation can be uploaded to the record. (See: How to deposit data in Pure)
A persistent identifier for the data, a DOI, can be minted via Pure and used to reference the data, e.g. in a published paper. It is a requirement of many funders that any data supporting published findings should be available for scrutiny, with a persistent identifier linking to the data from the published findings. (See RCUK; EPSRC; Horizon 2020; PLOS)
For how to cite research data, see: Data Citation / Data Availability Statements (information from DataCite)
Managing and sharing data: best practice for researchers. Booklet produced by the UK Data Archive, with useful information on versioning, file formats and naming conventions.
Guidance on best practice in the management of research data (RCUK guidelines)
Mantra Research Data Management Training (University of Edinburgh) has sections on organising and documenting data
The DCC has links to discipline-specific metadata schemas