In March 2020, COVID-19 spread through the United States. To address the lack of data during the pandemic, the BroadStreet team developed the COVID-19 Data Project. Initially, the project included daily numbers. It has since expanded to include data on race and ethnicity (hereafter referred to as Health Equity), policy information, and various other intern-led research projects related to the COVID-19 pandemic. That information can be found here:
The Centers for Disease Control and Prevention (CDC) reports that social inequality and health system issues put racial and ethnic minority groups at increased risk of health and socioeconomic impacts from COVID-19 [1]. Reporting of race data began in early April 2020, with Louisiana the first state to report [2]. Disparities in mortality were noticed immediately, and a June 2020 CDC report confirmed that these disparities were widespread [3]. The need for aggregate collection of race and ethnicity data became apparent, and the Health Equity team for the COVID-19 Data Project was created in mid-June. The process is ongoing, with more counties being added to the dataset each month.
The purpose of the Health Equity team is to collect and report all race and ethnicity data for confirmed cases of COVID-19 across the United States. We publish our data monthly. We report this dataset to capture the disparate impacts of COVID-19 on different racial and ethnic groups.
Since February 2021, we have noticed that many states have been reporting COVID-19 case rates by race and ethnicity less frequently. For example, in March, Oklahoma went from reporting new data daily to reporting it weekly. Some states have stopped reporting race and ethnicity data altogether. As a result, we have decided to stop reporting this data. The available dates for each state are listed in Table 3.
- Race = a social grouping of people who have similar physical or social characteristics that are generally considered by society as forming a distinct group [4]
- Ethnicity = a social group that shares a common and distinctive culture, religion, language, or the like [5]
Each day, volunteer interns on the Health Equity team enter counts of confirmed cases for the United States counties that are reporting race and ethnicity data. Each county has its own column with race and ethnicity data for each day. To limit errors and increase ease of use, weeks are broken up into separate sheets. The team relies on quality assurance by other team members to ensure there are no lapses in data. As an extra layer of assurance, an additional team member checks all counties and keeps a backup of any county that does not keep historic data. For some counties, historic data is available and can be retrieved. Once a month, team leads do a sweep of the data and engage team members to fill all gaps using historic data sets.
Interns collect data from health department sites at both state and county levels, depending on the level of data collection and reporting that the county and state utilize.
Data is collected every day for each county in the United States that is reporting at least some race and ethnicity data. To ensure all reporting counties are included, new recruits sweep for new counties at the start of each month.
Data is collected as a cumulative count rather than a daily count. When encountered, only residential data is recorded. Multiracial and biracial categories are included in the “2+ races” category with a note added denoting this discrepancy in categorization. When counties report “Refused to Answer,” this is recorded in “Other” and a note is added to that county’s data going forward.
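Because the dataset stores cumulative counts, anyone who needs new cases per day has to difference consecutive entries. The following is a minimal sketch of that step, not part of the project's own tooling. It assumes one category's values for one county are held in date order in a plain list, with `None` standing in for days with no data; the function name is hypothetical.

```python
def daily_from_cumulative(cumulative):
    """Convert a date-ordered list of cumulative counts into per-day changes.

    Assumption: None marks a day with no recorded value. The first
    recorded day has no baseline, so its daily value is None.
    After a gap, the difference covers every day since the last
    recorded value, not a single day.
    """
    daily = []
    previous = None
    for value in cumulative:
        if value is None or previous is None:
            daily.append(None)
        else:
            daily.append(value - previous)
        if value is not None:
            previous = value
    return daily


print(daily_from_cumulative([10, 15, 15, 22]))
```

A negative difference is possible when a health department revises its totals downward, so it is worth checking for those before summing daily values.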
When counties report data as a percentage, the percentage for each category is multiplied by the total number of confirmed cases to determine the raw data.
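The conversion described above can be sketched as follows. This is an illustration rather than the team's actual procedure: the function name is hypothetical, percentages are assumed to be on a 0 to 100 scale, and rounding to the nearest whole case is an assumption, since the source does not say how fractional results are handled.

```python
def counts_from_percentages(percentages, total_cases):
    """Estimate raw counts from reported percentages.

    percentages: mapping of category name to percent (0-100 scale, assumed).
    total_cases: total confirmed cases for the county.
    Rounding to whole cases is an assumption of this sketch.
    """
    return {
        category: round(total_cases * percent / 100)
        for category, percent in percentages.items()
    }


reported = {"White": 60.0, "Black/ African American": 25.0, "Other": 15.0}
print(counts_from_percentages(reported, 200))
```

Because each category is rounded separately, the estimated counts may not add up exactly to the reported total.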
When data is not available, often because that category is not collected, a "-" symbol is used to denote this. When 0 cases are reported in a category, a 0 is used as the placeholder.
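When reading the data, the distinction between "not available" and a true zero needs to be preserved. A minimal sketch of a cell parser is shown here, assuming each cell arrives as text; the function name and the handling of thousands separators are assumptions, not something the dataset documents.

```python
def parse_count(cell):
    """Parse one cell of the dataset.

    Returns None when the cell holds the "-" placeholder (data not
    available) and an integer otherwise, so a reported 0 stays 0.
    Stripping commas is an assumption about how large counts may appear.
    """
    text = cell.strip()
    if text == "-":
        return None
    return int(text.replace(",", ""))


print([parse_count(c) for c in ["-", "0", "1,204"]])
```

Keeping missing values as `None` rather than 0 prevents unreported categories from being counted as categories with no cases.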
Variable Name | Variable Description |
White | A person having origins in any of the original peoples of Europe, the Middle East, and North Africa [6] |
Black/ African American | A person having origins in any of the black racial groups of Africa [6] |
Asian | A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent [6] |
American Indian/ Alaska Native | A person having origins in any of the original peoples of North and South America (including Central America) and who maintains tribal affiliation or community attachment [6] |
Native Hawaiian/ Pacific Islander | A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands [6] |
2+ Races | A person with parents from two or more races [7] |
Other | A person identifying as any other race |
Unknown | A person whose race is not known, identified, and/ or recorded |
Variable Name | Variable Description |
Hispanic | A person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race [8] |
Non-Hispanic | A person not of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race |
Not Specified | A person whose ethnicity is not known, identified, and/or recorded |
*Ethnicity data is inclusive of all race categories.
State | Data Lag | Historic Data | Reporting Anomalies | Date Stopped Reporting |
Alabama | Not updated on weekends | No | Race: only white, black, and other | May 12, 2021 |
Arizona | - | Yes | - | May 12, 2021 |
California | Yes | No | Each county updates differently (use county websites) | May 12, 2021 |
Colorado | County Variation: some update irregularly | - | - | May 12, 2021 |
Delaware | - | - | - | June 30, 2021 |
Florida | - | No | Automated | June 1, 2021 |
Georgia | - | No | Automated; does not report American Indian, Native Hawaiian, 2+ races | May 12, 2021 |
Idaho | - | - | - | May 12, 2021 |
Illinois | - | Yes (about 10 days) | Automated; Race: no multiracial; Ethnicity: only Hispanic | May 12, 2021 |
Indiana | - | No | Data Entered through Scripts | May 12, 2021 |
Iowa | - | - | - | |
Kansas | - | - | - | May 12, 2021 |
Kentucky | Yes | Yes | Each county updates differently (use county websites) | May 12, 2021 |
Louisiana | Updates only on Wednesdays | No | Race: only white, black, other, and unknown | May 12, 2021 |
Maryland | County Variation: some update weekly, irregularly, or not at all | No | - | May 12, 2021 |
Michigan | Not updated on Sundays | No | Race: Asian and Pacific Islander are combined; Ethnicity: no data | May 12, 2021 |
Mississippi | - | Yes | Race: No multiracial or Hawaiian; data is separated by ethnicity | May 12, 2021 |
Missouri | Daily updates with a 3-day lag | No | - | May 12, 2021 |
Nebraska | - | - | - | May 12, 2021 |
Nevada | County Variation: some update weekly, irregularly, or not at all | - | - | May 12, 2021 |
New Mexico | County Variation: some update weekly, irregularly, or not at all | No | - | May 12, 2021 |
North Carolina | - | No | Does not report Native Hawaiian or 2+ races | May 12, 2021 |
Ohio | - | No | Race & Ethnicity: Combines Refused to Answer and Unknown into “Unknown” | May 12, 2021 |
Oklahoma | - | No | Race: Multiracial and Other combined into “Other,” no Hawaiian; Ethnicity: no data | May 12, 2021 |
Oregon | - | - | - | May 12, 2021 |
Pennsylvania | County Variation: some update irregularly | No | - | May 12, 2021 |
South Carolina | - | Yes | Race: Does not include multiracial; combines Asian, American Indian, and Native Hawaiian into “Asian” | June 30, 2021 |
Tennessee | County Variation: some update on a weekly basis | No | Updates Per 100,000; American Indian and Alaskan Combined, Other and Multiracial are Combined | June 30, 2021 |
Texas | County Variation: some update on a weekly basis | No | Each county updates differently | May 12, 2021 |
Virginia | - | Yes | Automated; Reports by health district but is converted into county level (see methods below) | May 12, 2021 |
Washington | - | - | - | May 12, 2021 |
West Virginia | - | No | Race: Only white, black, and other | June 30, 2021 |
Wisconsin | - | - | Automated; Does not report native Hawaiian/Pacific Islander or Other | May 12, 2021 |
The first challenge the team encountered was the discrepancies in reporting between counties. This includes the reporting of data as both raw data counts and percentages of the total confirmed cases. In addition, there are variations in categorization and reporting. For instance, some counties report 2+ races as biracial or multiracial or do not report one of our designated categories at all.
Another challenge we encountered is that Virginia reports race and ethnicity case numbers at the health district level rather than the county level. A health district is a combination of multiple counties. When the Virginia Department of Health (VDH) was asked about this, they stated that Sections 32.1-36, 32.1-38, and 32.1-41 of the Code of Virginia require the VDH to protect the anonymity of people.
This was an issue because health districts do not have FIPS codes, which are unique codes that identify U.S. states and counties. FIPS codes are important for data analysis, so we converted the data presented at the health district level to the county level.
To resolve this issue, we determined the proportion of each race and ethnicity present in each county of the health district, using American Community Survey data from 2018. Using these proportions, we calculated the approximate number of cases by race and ethnicity in each county of Virginia. Our uploaded data currently reports Virginia at the county level, but these figures are approximations, which must be taken into consideration when analyzing the data.
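One way to read the apportionment described above is sketched here. It is an interpretation, not the team's confirmed calculation: it assumes that, for each race or ethnicity group, a county receives a share of the district's cases equal to its share of the district's population in that group. The function name is hypothetical and the population and case figures are invented for illustration. The county keys are five-digit county FIPS codes (two-digit state code 51 for Virginia followed by a three-digit county code).

```python
def apportion_district_cases(district_cases, county_population):
    """Split district-level case counts across counties by population share.

    district_cases: {group: cases reported for the whole health district}
    county_population: {county_fips: {group: population of that group}}
    Assumption: a county's share for a group is its population of that
    group divided by the district's total population of that group.
    """
    estimates = {fips: {} for fips in county_population}
    for group, cases in district_cases.items():
        district_total = sum(
            population.get(group, 0) for population in county_population.values()
        )
        for fips, population in county_population.items():
            share = population.get(group, 0) / district_total if district_total else 0.0
            estimates[fips][group] = cases * share
    return estimates


# Invented figures for a two-county district, for illustration only.
cases = {"White": 100, "Black/ African American": 50}
population = {
    "51001": {"White": 300, "Black/ African American": 100},
    "51131": {"White": 100, "Black/ African American": 100},
}
print(apportion_district_cases(cases, population))
```

Under this reading the county estimates for each group sum back to the district total, but they are generally fractional, which reflects that they are approximations rather than observed counts.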
The next challenge interns face is the complications of a technology-based reporting system. Throughout the project, there have been technical difficulties, including sites crashing and unclear reporting times based on test-updates. There is also wide variation in when counties report, ranging from weekly to hourly updates. For that reason, we record data daily and note those counties that report less frequently.
In addition, our data is currently in the long format. We have SAS code available on our GitHub to convert this data into wide format. It can be found under the repository titled “open-source-contributions” under the file named “Race-and-Ethnicity ConvertorCode”.
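For readers who do not use SAS, the long-to-wide reshaping can be sketched in Python. This is not a port of the project's converter code, and the field names (`county_fips`, `date`, `category`, `cases`) are hypothetical stand-ins for whatever the published files actually use.

```python
from collections import defaultdict


def long_to_wide(rows):
    """Pivot long-format rows into one row per county and date.

    rows: iterable of dicts with the (assumed) keys
    "county_fips", "date", "category", and "cases".
    Each distinct category becomes its own column in the output.
    """
    grouped = defaultdict(dict)
    for row in rows:
        key = (row["county_fips"], row["date"])
        grouped[key][row["category"]] = row["cases"]
    return [
        {"county_fips": fips, "date": date, **categories}
        for (fips, date), categories in sorted(grouped.items())
    ]


long_rows = [
    {"county_fips": "51001", "date": "2021-01-02", "category": "White", "cases": 12},
    {"county_fips": "51001", "date": "2021-01-02", "category": "Asian", "cases": 3},
    {"county_fips": "51001", "date": "2021-01-01", "category": "White", "cases": 10},
]
for wide_row in long_to_wide(long_rows):
    print(wide_row)
```

A category that is absent for a given county and date is simply missing from that output row, which keeps it distinct from a reported 0.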
1. Centers for Disease Control and Prevention. Health Equity Considerations and Racial and Ethnic Minority Groups. Centers for Disease Control and Prevention website. Accessed September 15, 2020. https://www.cdc.gov/coronavirus/2019-ncov/community/health-equity/race-ethnicity.html
2. Villarosa L. ‘A Terrible Price’: The Deadly Racial Disparities of COVID-19 in America. The New York Times. April 29, 2020. Accessed September 15, 2020. https://www.nytimes.com/2020/04/29/magazine/racial-disparities-covid-19.html
3. Stokes EK, Zambrano LD, Anderson KN, et al. Coronavirus Disease 2019 Case Surveillance -- United States, January 22-May 30, 2020. MMWR Morb Mortal Wkly Rep 2020;69:759-765. http://dx.doi.org/10.15585/mmwr.mm6924e2
4. Barnshaw J. Race. In: Schaefer RT, ed. Encyclopedia of Race, Ethnicity, and Society. 1. Thousand Oaks, CA: SAGE Publications; 2008:1091.
5. Dictionary.com. Ethnicity. Dictionary.com website. Accessed August 20, 2020. https://www.dictionary.com/browse/ethnicity
6. United States Census Bureau. Race. United States Census Bureau website. Accessed September 15, 2020. https://www.census.gov/topics/population/race/about.html
7. Merriam-Webster. Biracial. Merriam-Webster website. Accessed August 20, 2020. https://www.merriam-webster.com/dictionary/biracial
8. United States Census Bureau. Hispanic or Latino Origin. United States Census Bureau website. Accessed September 18, 2020. https://www.census.gov/quickfacts/fact/note/US/RHI725219
When using data images, downloaded data, or shared document formats, please attribute BroadStreet as well as the original source, when applicable. For examples and more information, review this article which answers the question "How do I cite BroadStreet?"
Tom Schmitt, PhD, Tracy Flood, MD, PhD, Sabrine Benzakour, Aisha Saleem, Sydney Myers. A full list of the BroadStreet COVID-19 Data Project volunteers can be found here: https://covid19dataproject.org/team-2/