Introduction

For data to be understood, interpreted, and reused by other researchers, it is essential to explain, comprehensibly and coherently, how the data were created as well as their context, structure, and content. In RDR, this information must be provided in two places: the metadata fields and the README file.

README files

As the name suggests ("read me"), a README file in RDR communicates important information about the dataset and answers likely questions about how the data were created, how they are updated, and how they can be used. A well-written README presents the research behind the dataset concisely. At a minimum, the file should contain the following:

  • Dataset title, DOI, contact information
  • Methods
  • Summary of data and files
  • Specific data information
  • Conditions of reuse

We recommend that you create your README file based on these templates.

Each template is a plain text file organized in blocks. Each block contains information between brackets and indicates the level of obligation for that section.

The README file must be in .txt format. We recommend including as much information as possible, even if it does not fit within the template outline. We also recommend writing the information in English, in addition to the original language, to promote the reusability of the dataset.
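
As an illustration of the minimal README contents listed above, the following Python sketch writes a plain text skeleton with one block per item. The headings, the bracketed placeholders, and the function names are assumptions made for this example; they are not the official RDR templates, which remain the recommended starting point.

```python
from pathlib import Path

# One block per minimal README item. The bracketed placeholders are
# illustrative assumptions, not the wording of the official RDR templates.
SECTIONS = [
    ("DATASET TITLE, DOI, CONTACT INFORMATION",
     ["[Dataset title]", "[DOI]", "[Contact name and email]"]),
    ("METHODS",
     ["[How the data were collected or generated]"]),
    ("SUMMARY OF DATA AND FILES",
     ["[List of files with a one-line description of each]"]),
    ("SPECIFIC DATA INFORMATION",
     ["[Variables, units, and codes used in each file]"]),
    ("CONDITIONS OF REUSE",
     ["[License or terms under which the data may be reused]"]),
]


def build_readme(sections=SECTIONS):
    """Return the README skeleton as a single plain text string."""
    lines = []
    for heading, placeholders in sections:
        lines.append(heading)
        lines.append("-" * len(heading))  # underline the block heading
        lines.extend(placeholders)
        lines.append("")  # blank line between blocks
    return "\n".join(lines)


def write_readme(path="README.txt"):
    """Write the skeleton to a .txt file and return the text written."""
    text = build_readme()
    Path(path).write_text(text, encoding="utf-8")
    return text
```

Calling write_readme() produces a README.txt file whose bracketed placeholders can then be replaced with the actual dataset information.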

Data dictionaries

A data dictionary is a type of metadata that links, in an organized way, the names, definitions, and characteristics of each field or attribute of a dataset. Its aim is to provide a common language between the author of the data and potential users. It also helps users understand and interpret a dataset by providing basic information about the fields or variables it contains. A data dictionary provides the following information:

  • What each field or variable means.
  • What kind of data it contains.
  • What values it can take, and whether it uses a catalog.
  • Whether it contains public, confidential, or reserved information.

Data dictionaries are designed to facilitate understanding and provide meaning; they must therefore document the existence, meaning, and use of each element of the dataset.

Those responsible for the data must keep the contents of the data dictionary up to date, including definitions and values.
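
A data dictionary can be kept as a small table with one row per field. The sketch below shows one possible layout in Python, with a helper that exports the dictionary to CSV and another that reports dataset columns not yet documented, which can help keep the dictionary up to date. The field names, column headings, and example values are hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical entries for an illustrative survey dataset. Each entry records
# the field's meaning, data type, allowed values, and sensitivity level.
DATA_DICTIONARY = [
    {"field": "participant_id",
     "meaning": "Anonymous participant identifier",
     "data_type": "integer",
     "allowed_values": "1-9999",
     "sensitivity": "confidential"},
    {"field": "region",
     "meaning": "Region where the participant lives",
     "data_type": "string",
     "allowed_values": "north|south|east|west",
     "sensitivity": "public"},
]


def to_csv(entries):
    """Serialize the data dictionary to CSV text, one row per field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(entries[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


def undocumented_fields(dataset_columns, entries):
    """Return the dataset columns that have no entry in the data dictionary."""
    documented = {entry["field"] for entry in entries}
    return [column for column in dataset_columns if column not in documented]
```

For example, undocumented_fields(["participant_id", "age"], DATA_DICTIONARY) returns ["age"], signalling that the "age" column still needs an entry.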

Codebooks

A codebook provides information about the structure, content, and layout of a data file. A well-documented codebook contains information that is intended to be complete and self-explanatory for each variable in a data file.

Although codebooks vary widely in the quality and quantity of information provided, a typical codebook includes:

  • Column locations and widths for each variable.
  • Definitions of different record types.
  • Response codes for each variable.
  • Codes used to indicate non-response and missing data.
  • Exact questions and skip patterns used in a survey.
  • Other indications of the content and characteristics of each variable.

The body of a codebook describes the contents of the data file.
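
To illustrate how a codebook's column locations, widths, response codes, and missing-data codes are used in practice, the following Python sketch decodes one line of a fixed-width data file. The variables, column positions, and codes are hypothetical and do not describe any real dataset.

```python
# Hypothetical codebook: 1-based start column, width, response codes, and the
# codes that indicate non-response or missing data for each variable.
CODEBOOK = {
    "respondent_id": {"start": 1, "width": 4, "codes": None, "missing": set()},
    "sex": {"start": 5, "width": 1,
            "codes": {"1": "male", "2": "female"}, "missing": {"9"}},
    "age": {"start": 6, "width": 3, "codes": None, "missing": {"999"}},
}


def decode_record(line, codebook=CODEBOOK):
    """Decode one fixed-width line into a dict of variable values."""
    record = {}
    for name, spec in codebook.items():
        begin = spec["start"] - 1  # convert 1-based column to 0-based index
        raw = line[begin:begin + spec["width"]].strip()
        if raw in spec["missing"]:
            record[name] = None  # non-response or missing data
        elif spec["codes"]:
            # Translate the response code; keep the raw value if it is unknown.
            record[name] = spec["codes"].get(raw, raw)
        else:
            record[name] = raw
    return record
```

With this codebook, decode_record("00172 34") returns {"respondent_id": "0017", "sex": "female", "age": "34"}, while a line such as "00189999" yields None for both "sex" and "age" because the values match the missing-data codes.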
