Good data documentation is essential for research reproducibility and data reusability. Data documentation provides information about the context, structure, provenance, and content of a dataset (or a file), with the aim of increasing its usefulness. Data documentation is therefore a crucial part of making data FAIR.
Data documentation is sometimes also called metadata, that is, data about data. Metadata describes basic characteristics of the data, such as:
Metadata can be maintained in a data archive or repository, where you describe the characteristics of the data according to the information the repository requires. Alternatively, you can write data documentation in a README file, which contains additional information for the reuse of your data. As a rule, both are recommended: the information in the data repository is machine-readable and can thus be used for meta-analyses, while the README file makes it easier for humans to reuse the data.
Start documenting your data as soon as you begin collecting it. This makes it easier to trace the complete data generation process later and helps you create well-structured data documentation at the time of publication.
Structure the documentation from the start: Your data documentation does not need to be fully structured right away. However, a basic structure helps you gather, from the beginning, all the metadata needed to make your data reusable.
The Stanford Libraries provide a good introduction to this.
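As an illustration of what a basic structure might look like, the following minimal sketch generates a plain-text README skeleton at the start of data collection. The section headings are an assumption: they simply reuse the four aspects of data documentation named at the beginning of this page (context, structure, provenance, content). The function names, the header fields, and the file name `README.txt` are likewise hypothetical and not part of any prescribed template.

```python
from datetime import date
from pathlib import Path

# Hypothetical section headings (an assumption, not a prescribed template):
# one per aspect of data documentation named on this page.
SECTIONS = ("Context", "Structure", "Provenance", "Content")


def build_readme(title: str, creator: str, started: date) -> str:
    """Return a plain-text README skeleton with one empty section per aspect."""
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Date started: {started.isoformat()}",
        "",
    ]
    for section in SECTIONS:
        # Heading, underline of the same length, placeholder, blank line.
        lines += [section.upper(), "-" * len(section), "TODO", ""]
    return "\n".join(lines)


def write_readme(folder: Path, text: str) -> Path:
    """Write the skeleton next to the data files and return its path."""
    path = folder / "README.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

Calling `write_readme(Path("."), build_readme("My dataset", "A. Researcher", date.today()))` would place the skeleton in the current folder, where the `TODO` placeholders can be filled in as the project progresses.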
Use metadata standards: Well-structured metadata or data documentation supports the long-term discoverability, understandability, and preservation of your research data. Discipline-specific repositories typically require highly structured metadata so that the repository can be searched at a fine-grained level.
Metadata standards are also referred to as "schemas". Schemas can be either generic or discipline-specific.
Well-known metadata standards include Dublin Core, a set of 15 terms (such as creator and title), and the Data Documentation Initiative (DDI), which provides an XML-based schema for the content, transport, representation, and archiving of metadata in the social sciences. To find discipline-specific metadata schemas, look at:
Templates for creating data documentation can be found here:
Cornell University's README file is a Word document that asks the most important questions for comprehensive data documentation. From this, you can then generate a PDF and share it together with your data.
The CESSDA Metadata Schema allows you to capture project-level information about your data. To do this, answer the questions under "Project-level documentation".
The DataCite Metadata Generator creates XML-based data documentation for you based on the questions you answer in the generator.
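To give an impression of what XML-based, machine-readable metadata looks like, here is a minimal sketch that serializes a few Dublin Core terms with Python's standard library. It is not the DataCite schema and does not reproduce the output of any of the tools above. The wrapper element name `metadata` and the function name `dublin_core_record` are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 terms of the Dublin Core element set.
DC_TERMS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def dublin_core_record(fields: dict[str, str]) -> str:
    """Serialize term/value pairs as XML; reject terms outside the 15."""
    # "metadata" is a hypothetical wrapper element, not part of Dublin Core.
    root = ET.Element("metadata")
    for term, value in fields.items():
        if term not in DC_TERMS:
            raise ValueError(f"not a Dublin Core term: {term}")
        child = ET.SubElement(root, f"{{{DC_NS}}}{term}")
        child.text = value
    return ET.tostring(root, encoding="unicode")
```

For example, `dublin_core_record({"title": "My dataset", "creator": "A. Researcher"})` returns a single XML string with one `dc:title` and one `dc:creator` element. A repository's own schema will usually require more fields and a different wrapper, so treat this only as a picture of the general idea.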