During the data collection or data creation stage of a research project, researchers should practice good data management, not only to improve organization and workflows but also to ensure the integrity of their results. When data are easier to find, understand, and navigate, a research project can more easily be shared and reproduced.
For a comprehensive overview of file naming conventions, stable file formats, and file hierarchies, please visit our Research Data Management Guide.
During the analysis stage of a project, researchers can practice Open Science by using an open source programming language, such as R, and by selecting a code repository in which to create and manage their code as they analyze their data.
Open source refers to source code that is made freely available for possible modification and redistribution. Using open source code makes research more reproducible because anyone can view, run, and modify the code to verify the data analysis.
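As a rough illustration of why scripted analysis supports verification, the short Python sketch below computes summary statistics using only the standard library. The measurement values and the `summarize` function name are hypothetical, invented purely for this example. They do not come from any real project.

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
measurements = [4.1, 3.9, 4.4, 4.0, 4.6]


def summarize(values):
    """Return the count, mean, and sample standard deviation of a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


if __name__ == "__main__":
    # Anyone with the script and the data can rerun this step and compare results.
    print(summarize(measurements))
```

Because each step is written down as code, a reader who has the same data can rerun the script and check whether they obtain the same summary.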
Image by Colin Viebrock, from Wikimedia Commons
The open source programming languages R and Python are useful for a variety of data manipulation, analysis, and visualization tasks. To download and start using one of these languages, explore the links below. For tutorials and training, please see the "Coding Resources" box at the bottom of this page.
Extensive free resources for learning both R and Python are available online.