Research data is data that is collected, observed, or created for analysis in order to produce original research results. The word “data” is used throughout this site to refer to research data. Research data can be generated for different purposes and through different processes, and can be divided into different categories. Each category may require a different type of data management plan.
Research data may include all of the following:
Source: Boston University Libraries: http://www.bu.edu/datamanagement/background/whatisdata/
There are three types of metadata to consider: descriptive, structural, and administrative.
Descriptive metadata describes your dataset. Examples include fields such as "Principal Investigator Name" and "Affiliation". Some of these fields will be determined and made mandatory by the repository in which you manage, share, or publish your dataset. Others may be optional.
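As a rough illustration of mandatory versus optional fields, the sketch below stores a descriptive metadata record as a plain dictionary and reports which required fields are missing. The field names, the placeholder values, and the `missing_required` helper are all assumptions made for this example; a real repository defines its own required fields.

```python
import json

# Hypothetical list of mandatory fields; a real repository sets its own.
REQUIRED_FIELDS = ["principal_investigator_name", "affiliation"]


def missing_required(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in required if not record.get(field)]


# Placeholder values, for illustration only.
record = {
    "principal_investigator_name": "Jane Doe",
    "affiliation": "Example University",
    "title": "Example dataset",  # assumed optional field
}

print(json.dumps(record, indent=2))
print(missing_required(record))
```

Running a check like this before deposit is one way to catch empty mandatory fields early, though the repository's own submission form remains the authority.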
You may choose to use a schema of descriptive metadata that matches your discipline (more information about that is under "Structural").
|Multi-Disciplinary|Metadata standards that have been adopted by many disciplines.|
|Genome Metadata|Descriptive data about single genomes within the Pathosystems Resource Integration Center.|
|Life Sciences|List of and links to various schemas in the field of biology.|
|Earth Sciences|List of and links to various schemas in the field of earth science.|
|Physical Sciences|List of and links to various schemas in the field of physical science.|
|Social Science and Humanities|Standards adopted by the social science and humanities disciplines.|
Administrative metadata is the information about your dataset that allows it to be managed. Examples include file size and file type. This metadata is generally created automatically by the data repository.
Information about copyright, reuse, and other access requirements is also considered administrative. See the "Sharing your data" tab for more on this topic.
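To make the idea of automatically generated administrative metadata concrete, here is a minimal sketch that reads the size and guessed type of a file using only the Python standard library. The `administrative_metadata` function and its field names are hypothetical; repositories collect this information with their own tooling.

```python
import mimetypes
import os
import tempfile


def administrative_metadata(path):
    """Collect basic administrative facts about one file."""
    # guess_type works from the file extension only; it does not inspect contents.
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        "file_name": os.path.basename(path),
        "file_size_bytes": os.path.getsize(path),
        "file_type": mime_type or "unknown",
    }


# Demonstrate on a small temporary CSV file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.csv")
    with open(sample, "w", encoding="utf-8", newline="") as handle:
        handle.write("id,value\n1,10\n")
    info = administrative_metadata(sample)

print(info)
```

Rights and access information (copyright, reuse terms) cannot be derived from the file in this way and still has to be recorded by hand.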
Open formats are preferable because they give others a choice of tools for working with your data, but making an informed decision is what matters most.
Four key questions in choosing formats:
Without good documentation, your research data may be useless. A year down the line you may have forgotten what certain abbreviations or codes mean, or how you synthesized or anonymized your data. Plan for these sorts of documentation, as applicable:
Good data management includes good file management. Naming and versioning your files consistently makes them easier to find and identify. Check out a few of the best practices:
When you are dealing with many different files, software can assist with naming, versioning, and organization.
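A consistent naming and versioning scheme can also be applied with a short script. The sketch below assumes one possible convention, `project_description_YYYYMMDD_vNN.ext`; the convention itself and the `build_filename` helper are illustrative assumptions, not a prescribed standard.

```python
import re
from datetime import date


def build_filename(project, description, day, version, extension):
    """Build a name like 'project_description_YYYYMMDD_v02.ext'."""

    def clean(text):
        # Lowercase, then replace runs of non-alphanumeric characters with hyphens.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lstrip(".")
    return f"{clean(project)}_{clean(description)}_{day:%Y%m%d}_v{version:02d}.{ext}"


name = build_filename("Soil Study", "pH readings", date(2024, 3, 5), 2, ".csv")
print(name)  # soil-study_ph-readings_20240305_v02.csv
```

Putting the date in year-month-day order and zero-padding the version number means that an alphabetical file listing also sorts chronologically and by version.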
Learn more about good file management for research data: https://data.research.cornell.edu/content/file-management
Information security during the research process is vital to project success. Make a security plan early in your data management planning process. The following links will help you successfully secure your data:
You can save a great amount of time in your research if you're able to locate existing data for re-use.
Check re3data.org (an index of over 1500 data repositories) to see whether data already exist that would support your research.