During the data collection or data creation stage of a research project, it is important that researchers practice data management, not only to improve organization and workflows but also to ensure the integrity of their results. When data are easier to find, understand, and navigate, a research project can more easily be shared and reproduced.
For a comprehensive overview of file naming conventions, stable file formats, and file hierarchies, please visit our Research Data Management Guide.
During the analysis stage of a project, researchers can practice Open Science by using an open source programming language, such as R, and by selecting a code repository in which to create and manage their code as they analyze their data.
Open Source refers to source code that is made freely available for possible modification and redistribution. Using open source code makes research more reproducible since anyone can view and manipulate the code to verify data analysis.
Image by Colin Viebrock, from Wikimedia Commons
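As a small illustration of why openly shared code supports reproducibility, the sketch below shows a complete analysis that anyone could read and rerun to verify the reported numbers. It is written in Python and uses only the standard library. The data values, column names, and the `group_means` function are hypothetical and chosen only for illustration; they are not drawn from any real project.

```python
import csv
import io
import statistics

# Hypothetical measurements, embedded so the script is self-contained.
# In a real project these would be read from a shared data file.
RAW_CSV = """sample_id,group,value
s1,control,4.0
s2,control,6.0
s3,treatment,7.0
s4,treatment,9.0
"""


def group_means(csv_text):
    """Return the mean of the `value` column for each `group` in the CSV text."""
    values_by_group = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        values_by_group.setdefault(row["group"], []).append(float(row["value"]))
    return {group: statistics.mean(vals) for group, vals in values_by_group.items()}


if __name__ == "__main__":
    # Print one line per group so the result is easy to compare across reruns.
    for group, mean in sorted(group_means(RAW_CSV).items()):
        print(f"{group}: mean = {mean:.2f}")
```

Because every step from raw values to summary is in the script, a reader can check the calculation or adapt it rather than relying on a description of the analysis.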
The open source coding languages R and Python are useful for a variety of data manipulation, analysis, and visualization tasks. To download and start using one of these languages, explore the links below. For tutorials and training, please see the "Coding Resources" box at the bottom of this page.
This link allows you to download and install the most updated R version for your operating system.
RStudio is an easy-to-use interface for R. It is free and open source through Posit.co, but R must be installed first for it to function.
Posit also offers an online, browser-supported version of RStudio without any installation required. The individual plan is free, with more extensive versions available on a subscription basis.
Python is an advanced open source programming language with an OSI-approved open source license, making it freely usable and distributable.
GitHub is a developer platform that allows its users to create, manage, and share code for projects. It offers both free and paid plans.
GitLab is a platform that allows users to develop, secure, and operate their own software using AI-powered tools. It offers both free and paid plans.
Extensive free resources are available online for learning both R and Python.
Quick-R is a basic R tutorial for beginners offered by DataCamp.
This free course from Stanford University covers the basics of R programming on the edX platform.
This tutorial provides a basic overview of Python for beginners.
This YouTube channel provides concise videos covering many aspects of Python.
These comprehensive Python lessons created by GeeksforGeeks are well suited for both beginners and experienced programmers.
This guide was created using many resources, many of which are linked throughout the guide. It was also built using information from the UTSA Libraries and Museums guide on Open Science by Rachel Davis.