##Data Reuse Planning Template
Create a text file that includes the information below, as appropriate. Save this file as "DataReusePlan.txt" in the top-level folder of your data set for use by future researchers who want to reuse your data.
What:
Description (abstract): What is your project about? What is the goal? Why are you doing it? How does this data relate to your project?
We are examining the changes in size of rockhopper penguin eggs on the Tristan da Cunha archipelago throughout the past century. we
Title: What do you want to call your data set?
Permanent Identifier: A way for folks to find your data. Do you have a DOI or other permanent identifier?
Data Source: If you included someone else's data in your data set, provide information on where it came from. Just like articles, if you use it, cite it. It is preferable to include a permanent identifier like a DOI.
Subject: In what discipline or subject area is your project?
Related publication: Have you published an article, thesis or some other publication based on this data? Include a full citation and permanent identifier, if available.
Who:
Data Collector: Bond, Alex; McClelland, Greg; Glass, Trevor; Ryan, Peter (2016)
Funder Information: The Royal Society for the Protection of Birds, the UK partner in BirdLife International, and the National Geographic Society.
Collaborators: Ovenstone Agencies (Pty) Ltd and the South African Department of Environmental Affairs (South African National Antarctic Program, SANAP) for logistical support and transport to and from Tristan da Cunha on the MFV Edinburgh, MV Baltic Trader and SA Agulhas II.
Contact person: Who should someone contact for additional information about the data set? Include affiliation and contact information such as phone number, email, and/or physical address.
Where:
Location: One place? Multiple places? Use geographic coordinates if appropriate.
Data were collected in the Tristan da Cunha archipelago in the South Atlantic. Three data collection sites, Tristan da Cunha (37° 06' 39.38" S, 12° 17' 14.73"
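Many mapping tools and metadata formats expect signed decimal degrees rather than degrees, minutes, and seconds. The conversion can be sketched as follows, using the latitude quoted above; the function name `dms_to_decimal` is hypothetical and chosen only for this illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    By convention, South and West are negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# Latitude quoted in the Location entry: 37 deg 06' 39.38" S
latitude = dms_to_decimal(37, 6, 39.38, "S")
print(round(latitude, 6))
```

Recording both forms (the original reading and the decimal value) lets a reader check one against the other.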
Place of publication: Where is the data made publicly available? Include URL.
When:
Temporal Coverage: When was the data collected? On a specific date? At a specific time? Over a range of dates? Use the international standard date format (ISO 8601: YYYY-MM-DD hh:mm:ss) and try to be as specific as possible.
Data were collected from 20142
Publication Date: When was the data made available in the place of publication (above)? Again, use the international standard date format.
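If your dates live in a script or spreadsheet export, the standard date format can be produced directly rather than typed by hand. A minimal sketch using Python's standard library; the dates below are placeholders, not the dates of any real data set.

```python
from datetime import date, datetime

# Placeholder dates for illustration only; substitute your own.
start = date(2001, 2, 3)
end = date(2001, 3, 4)

# A date range written as start/end, each part as YYYY-MM-DD.
coverage = f"{start.isoformat()}/{end.isoformat()}"
print(coverage)  # 2001-02-03/2001-03-04

# A single observation time written as YYYY-MM-DD hh:mm:ss.
observed = datetime(2001, 2, 3, 9, 30, 0)
stamp_text = observed.strftime("%Y-%m-%d %H:%M:%S")
print(stamp_text)  # 2001-02-03 09:30:00
```

If times are recorded, it also helps to state the time zone in the plan, since the format above does not carry one.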
How:
Data collection process: What instruments were used to collect the data? How frequently were the data collected? How were data collection sites selected? If there was a sample population, how was it selected?
Data processing: How did you clean the data? How are missing or null values handled? Did you write code for processing the data, and where can it be found?
File index: A listing of folders and files included in the data set. Explain what files will be found in each folder and the naming conventions used. Additional files could include any questionnaires or other survey instruments used to collect the data. Be sure to include codebooks and data dictionaries or similar README files.
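The raw listing for a file index can be generated rather than typed, which keeps it in step with the folder contents. A minimal sketch, assuming the script is run against the top-level folder of the data set; `build_file_index` is a hypothetical helper name.

```python
import os


def build_file_index(root):
    """Return the sorted relative paths of every file under root."""
    paths = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            full_path = os.path.join(folder, name)
            relative = os.path.relpath(full_path, root)
            # Use forward slashes so the listing reads the same on any system.
            paths.append(relative.replace(os.sep, "/"))
    return sorted(paths)
```

Calling `build_file_index(".")` from the top-level folder returns a list that can be pasted under File index. The explanations of what each folder holds and of the naming conventions still need to be written by hand.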
File format/s: What type/s of files are these? Are there multiple formats? What software is needed to use the file/s?
Avoid proprietary formats if possible! Most data can be reformatted to be communicated in text-based forms.
Standard Metadata: Increasingly, scientific fields are moving towards standard metadata formats (data.json, data.xml, etc.) that pull all the information in the Data Reuse Plan together in a machine-readable format. Machine-readable metadata enables cataloging of data sets on sites like Data.gov and allows others to ask questions and access your data sets using code. For example, open US government data online is required to expose a data.json file in the landing page HTML to be listed on Data.gov, thereby facilitating data discovery. Because not all researchers who are mandated to include data.json files actually do so, Data.gov is incomplete, and simple questions like "what is the total volume of data generated by US federally funded scientists?" are unanswerable.
What is the metadata standard in your field? Not sure? That's normal. Your field may not have a universal standard yet. Check out the Research Data Alliance recommendations and consider getting involved in the creation of metadata standards in your field!
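To make "machine-readable" concrete, here is a minimal sketch that saves a few Data Reuse Plan entries as a JSON file and reads them back. The field names are assumptions loosely modeled on the data.json files used by Data.gov, not a complete or validated schema, and every "TODO" is a placeholder. Check the schema your repository or funder actually requires.

```python
import json

# Assumed field names; every "TODO" is a placeholder, not real project data.
metadata = {
    "title": "TODO: data set title",
    "description": "TODO: abstract from the What section",
    "keyword": ["TODO: subject"],
    "identifier": "TODO: DOI or other permanent identifier",
    "modified": "TODO: YYYY-MM-DD",
    "contactPoint": {"fn": "TODO: contact person", "hasEmail": "mailto:TODO"},
}


def write_metadata(record, path):
    """Save the record as indented UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(record, handle, indent=2, ensure_ascii=False)
        handle.write("\n")


def read_metadata(path):
    """Load a metadata record previously saved with write_metadata."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

For example, `write_metadata(metadata, "data.json")` stores the record next to DataReusePlan.txt, and any program can then query it with `read_metadata("data.json")["title"]`.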
##Note about codebooks and data dictionaries:
Using additional documentation such as codebooks and data dictionaries alongside your DataReusePlan.txt and other data set files makes your data far more reusable and helps you manage your data and your research process as a whole. Like your Data Reuse Plan, these files are stored with the rest of your data files, usually in a top-level folder.
A data dictionary (sometimes used interchangeably with "codebook") is another text file for defining field names and values. The file includes a list of all field names in the data set and a description of each, such as units of measurement, formulas used for calculation, abbreviations, and value ranges, as well as the relationship of fields to one another.
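A data dictionary can be kept as a small tab-delimited text file so that both people and programs can read it. A minimal sketch; the three fields below are hypothetical examples and are not taken from any real data set.

```python
import csv
import io

COLUMNS = ["field", "description", "units", "allowed_values"]

# Hypothetical fields for illustration only.
FIELDS = [
    {"field": "site", "description": "Name of the collection site",
     "units": "", "allowed_values": "free text"},
    {"field": "egg_length_mm", "description": "Maximum egg length",
     "units": "mm", "allowed_values": "> 0"},
    {"field": "collected_on", "description": "Date of measurement",
     "units": "", "allowed_values": "YYYY-MM-DD"},
]


def render_data_dictionary(fields):
    """Return the data dictionary as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(fields)
    return buffer.getvalue()


print(render_data_dictionary(FIELDS))
```

Saving the output as a text file in the top-level folder keeps it alongside DataReusePlan.txt. Add columns for formulas, abbreviations, or relationships between fields as your data set requires.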