Public Data Resources


Locating Estimates in Public Data Sources

With continuing public- and private-sector investment in open data initiatives, more data become available in public repositories every year. Here we briefly describe how to locate estimates in public data sources and list some reliable sources to use. One benefit of publicly available data is that fragments of information relevant to a CEA project may be available before all studies of an intervention's effectiveness have been completed and confirmed. This is because, unlike the peer-reviewed literature, the Web has no mechanism for information quality control, so information is typically published on the Web faster than in the peer-reviewed literature. The flip side is that websites are filled with misleading, inaccurate, outdated, constantly changing, and potentially dangerous information, as well as sales pitches. So, when using public data sources, it is important to use reliable sources and to understand exactly what the data mean (i.e., the measure definition).


Using publicly available data is often called secondary data analysis because the data were not collected for the CEA project per se; they are a by-product of data collected for another primary purpose, such as billing. Thus, the exact data that would be optimal for the CEA are usually not available, but there may be many other data points that are similar enough to serve as a proxy for certain measures. In secondary data analysis, it is important to fully appreciate the meaning of the available data and the gap between what is available and what is ideally desired. For example, for the CEA project you might want the monthly cost of hospitalizations for congestive heart failure, but the only publicly available figure you can find is the annual cost. The key is to recognize that the unit of analysis differs between the available and the desired data, and then to adjust for it in some acceptable way (e.g., assume uniform monthly costs and divide by 12). The unit of analysis has several dimensions, as discussed in the planning steps above, and all of them need to be considered: time (e.g., monthly, annual), space (e.g., city, county, hospital), and entity (e.g., per hospitalization, per patient, per hospital). If the exact data for the CEA are not available, consider looking for similar data that can be used as a proxy or with adjustments (e.g., if cost data for Texas are not available, there may be overall US cost data). Finally, there will often be several sources for one data point, and they will not agree. In these cases, it is important to first identify all sources, then consider their differences, reliability, and applicability to the CEA, and make a sound judgment call on which data are best suited for the CEA.
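The unit-of-analysis adjustment described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the dollar figures are hypothetical, and the uniform-monthly-cost assumption is the simple one mentioned in the text, not a recommended method for every CEA.

```python
MONTHS_PER_YEAR = 12


def annual_to_monthly(annual_cost: float) -> float:
    """Time dimension: convert an annual cost to a monthly cost.

    Assumes the cost is spread uniformly across the 12 months of the year.
    """
    return annual_cost / MONTHS_PER_YEAR


def total_to_per_entity(total_cost: float, n_entities: int) -> float:
    """Entity dimension: convert a total cost to a cost per entity
    (e.g., per hospitalization or per patient)."""
    if n_entities <= 0:
        raise ValueError("n_entities must be a positive count")
    return total_cost / n_entities


# Hypothetical numbers for illustration only, not real estimates.
annual_cost = 120_000.0      # annual cost reported by a public source
n_hospitalizations = 4       # hospitalizations per month in the same source

monthly_cost = annual_to_monthly(annual_cost)
cost_per_hospitalization = total_to_per_entity(monthly_cost, n_hospitalizations)

print(f"Monthly cost: {monthly_cost:.2f}")
print(f"Monthly cost per hospitalization: {cost_per_hospitalization:.2f}")
```

Keeping each adjustment in its own small function makes the assumption behind it explicit, so it can be documented and revisited if a better data source is found.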


We also note that some of these data sources have data use limitations, required training, and/or terms of use that must be acknowledged (e.g., agreeing not to attempt to link the data to other sources). Many are freely available, but some data must be purchased. In this section, we list some reliable health data repositories that include Texas data. We also recommend always citing the information source along with the date the website was accessed, because information on the Web is dynamic (i.e., constantly changing). We recommend citing a URL as follows.


Author’s last name, Initial(s). (Year posted/last updated). Title of work. Retrieved [month day, year], from https://URL
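When many web sources are cited, it can help to generate the citation string consistently. The following is a minimal sketch of a formatter for the template above; the function name `format_web_citation` and its parameters are hypothetical, and the choice to write "n.d." when no posting year is known and to omit the square brackets around the access date are assumptions about how the template is meant to be filled in.

```python
from datetime import date
from typing import Optional

# Month names are spelled out here so the output does not depend on locale.
MONTHS = (
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
)


def format_web_citation(author: str, year: Optional[int], title: str,
                        accessed: date, url: str) -> str:
    """Build a citation in the form:
    Author (Year). Title. Retrieved Month Day, Year, from URL
    """
    year_text = str(year) if year is not None else "n.d."
    retrieved = f"{MONTHS[accessed.month - 1]} {accessed.day}, {accessed.year}"
    return f"{author} ({year_text}). {title}. Retrieved {retrieved}, from {url}"


citation = format_web_citation(
    author="Texas Hospital Association",
    year=None,  # no posting date given on the site
    title="Texas PricePoint",
    accessed=date(2021, 5, 1),
    url="https://www.txpricepoint.org/default",
)
print(citation)
```

Passing the access date as a `date` object rather than free text keeps the retrieval date in one consistent format across all citations.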


Texas Center for Health Statistics (CHS)

The Texas Center for Health Statistics (CHS) has a wealth of health data specific to Texas and may be the best place to start looking for cost or utilization data. Below are three links to get you started on exploring which data may be relevant to your CEA project.


Texas Center for Health Statistics (CHS), Department of State Health Services (DSHS) (n.d.). Texas Health Data. Retrieved [May 1, 2021] from http://healthdata.dshs.texas.gov/


Texas Center for Health Statistics (CHS), Department of State Health Services (DSHS) (2020). Links to Health Related Data. Retrieved [May 1, 2021] from https://www.dshs.state.tx.us/chs/links-to-health-related-data.shtm


Texas Center for Health Statistics (CHS), Department of State Health Services (DSHS) (2021). Texas Health Care Information Collection (THCIC). Retrieved [May 1, 2021] from https://www.dshs.texas.gov/thcic/


Other Texas Health Data Repositories

The Texas Hospital Association and the Dallas Fort Worth Hospital Council (DFWHC) Foundation each maintain a data repository that may be useful for a CEA project in Texas. The Texas PricePoint website has hospital pricing data on the most common inpatient services in Texas, drawn from the same THCIC inpatient data mentioned in the section above. Currently, the repository has data only for the 2017 calendar year. The Healthy North Texas community health website is an initiative of the DFWHC Foundation, carried out through the work of its Community Health Collaborative. It has comprehensive county-level indicator data covering health (e.g., alcohol & drug use, cancer), community (e.g., demographics, domestic violence & abuse), economy (e.g., poverty, housing), education (e.g., literacy, student performance), and environmental health (e.g., built environment, air). Healthy North Texas covers data for X counties in North Texas.


Texas Hospital Association (n.d.). Texas PricePoint. Retrieved [May 1, 2021] from https://www.txpricepoint.org/default


DFWHC Foundation (n.d.). Healthy North Texas. Retrieved [May 1, 2021] from http://www.healthyntexas.org/topiccenter


Federal Government Data Repositories with Health Data

Many federal government health data repositories may be useful for a CEA project. Below we list the main agency data websites, along with some examples that may be more relevant for data on health care cost or utilization. Some datasets are available on multiple federal agency websites.




  • Health Resources & Services Administration (HRSA)


  • The US Census Bureau

        • http://www.census.gov/

        • Survey of Income and Program Participation (SIPP): SIPP collects data on the sources and amounts of income, labor force information, participation in and eligibility for various poverty programs, and general demographic characteristics. It is a series of panel datasets, with monthly data for 2.5 to 4 years per panel, covering 14,000 to 37,000 households. Detailed information is available about household income and assets, and about “in-kind” income received via assistance programs. It also includes good health insurance coverage data and some health services utilization data.


  • Bureau of Labor Statistics (BLS)

        • BLS has detailed national and some state/local data on prices (overall, health care, etc.), wages and employment (by industry or occupation), and occupational injury.

        • https://www.bls.gov/




  • The main federal government open data initiative, managed by the US General Services Administration, which also includes many health datasets.


Other Health Data Repositories

There are many more health data repositories online than can be covered in this document. Below we list two more websites that provide a more comprehensive list of reliable sources that may be helpful in a CEA.

  • Partners in information access for the public health workforce: This is a public-private partnership of over 10 organizations that helps the public health workforce access and use data for public health. It contains many health data at the local, state, national, and global levels.