Data collection is the methodical process of gathering information about a specific subject. It's crucial to ensure that your data is complete during the collection phase and that it's collected legally and ethically. If it isn't, your analysis won't be accurate, and the errors could have far-reaching consequences.
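As one illustration of checking completeness during collection, the minimal sketch below counts empty or absent values in the fields you consider required. The function name `missing_value_report`, the column names, and the sample rows are all hypothetical and are not taken from any dataset mentioned in this guide.

```python
import csv
import io


def missing_value_report(csv_text, required_columns):
    """Return (row_count, {column: number of rows where it is empty or absent})."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {column: 0 for column in required_columns}
    total_rows = 0
    for row in reader:
        total_rows += 1
        for column in required_columns:
            value = row.get(column)
            # A value is treated as missing if the field is absent or blank.
            if value is None or value.strip() == "":
                counts[column] += 1
    return total_rows, counts


# Hypothetical survey export with two incomplete records.
sample = """respondent_id,age,zip_code
1,34,75201
2,,76101
3,51,
"""

total, missing = missing_value_report(sample, ["respondent_id", "age", "zip_code"])
print(total, missing)
```

Running a check like this while collection is still under way makes it possible to follow up on incomplete records before analysis begins. What counts as "required" depends on the question you want to answer.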
In general, there are three types of data:
Before collecting data, there are several factors you need to define:
The data collection method you select should be based on the question you want to answer, the type of data you need, your timeframe, and your budget.
When looking for data, you may also consider:
Secondary data analysis is the analysis of existing data that was collected by others. Public data is data that can be used, reused, or redistributed. Government entities at all levels (municipal, state, federal, and international) produce large amounts of public data. Typically, this data is accessible without restrictions. For some datasets, such as health or education data, you may need to agree to terms of use or obtain approval before gaining access.
Many larger municipalities and counties host open data repositories. To find open data repositories, try searching the municipality name and open data. A few examples include:
Below are a few examples of open data from regional governments in Texas and the State of Texas.
A collection of tabular datasets and geographic information by the North Central Texas Council of Governments (NCTCOG). Major focus areas include population, employment, land use, development, and geospatial data.
Administrative data reported by various departments in the state of Texas.
The Texas Higher Education Data (THED) website is Texas' primary source for statistics on higher education.
This site contains public data and statistics on various health topics.
The U.S. federal government is one of the largest data producers in the world. There are 13 federal statistical agencies, all of which produce and publish data for public use. The resources listed below are only a few of the U.S. public data resources available.
A new platform to access data and digital content from the U.S. Census Bureau. For guidance on using data.census.gov, please see its Resources page: Guidance for Data Use.
Primarily a federal open government data site. Some datasets are not hosted directly on data.gov; instead, the site provides metadata about the data resource for users to explore.
This is a search tool for finding Department of Energy funded research, including its publications and the publicly available scientific datasets from those projects.
Supported by NCES, the data lab provides QuickStats, PowerStats, and TrendStats, which let users create tables and charts and conduct simple statistics, regression analysis, or multiple-year analysis.
Looking for data outside the U.S.? Below are collections from international NGOs and other entities.
In addition to these data resources, many countries are adopting open data policies and publishing their data on the web. Try searching the country's name and open data, and you may find what you're looking for. Here are a few examples of international open data resources. Note that international websites and datasets may be published in the country's language rather than in English.
Find, compare and share the latest OECD data: charts, maps, tables and related publications.
These organizations offer public data but require that you become a credentialed user before accessing the data.
Provides text data of public domain works for research. Follow the researcher agreement and request process on their site to access their data.
Medical Information Mart for Intensive Care III is a database of de-identified health data from intensive care unit patients admitted between June 2001 and October 2012. Researchers seeking to use this free database must formally request access.