Data sources & transparency

To create the most comprehensive COVID-19 dataset, COVID Atlas pulls information from over 150 official government health data sources and verified curated datasets from around the world.

To encourage these official sources to improve the transparency, accessibility, and hygiene of their data, COVID Atlas publishes ratings of each source.

Ratings are about transparency, not accuracy

Because COVID Atlas uses only official data sources, it assumes in good faith that the data they publish is accurate.

COVID Atlas source ratings are based on:

  • Completeness of the data provided – this includes data points for confirmed, hospitalized, discharged, and recovered cases; fatalities; total tests administered; etc.
  • Data granularity – official data provided at the most local possible level (often counties/municipalities)
  • Machine-readability – data is available in JSON or CSV, or at least HTML with one row per locality

Source ratings are not based on:

  • Accuracy of information – we use only official data and assume its accuracy in good faith

Do you administer an official government health data source?

Administrators of official government health data can improve their sources for the benefit of scientists, researchers, developers, and, most importantly, the general public. Below is a brief list of ways we recommend doing so:

  • Publish every bit of verifiable data you can – this includes cumulative or timeseries data for confirmed, hospitalized, discharged, and recovered cases; fatalities; total tests administered; etc.
  • Publish as granularly as possible – the most useful datasets have the highest degree of geographical specificity, including individual columns for each locality
  • Publish only accessible, machine-readable formats – do not publish your data in PDFs, images, or other inaccessible formats. These formats are hostile to the general public's need to understand what's going on, and are in some cases illegal. Use an HTML <table> with one row per locality at the most granular level you have, and if possible, also publish JSON and/or CSV formats
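
As a rough illustration of the last recommendation, the sketch below writes the same records as both CSV and JSON, with one row per locality per date. The field names (date, state, county, cases, deaths, tested) and the sample values are made up for the example; they are not a schema that COVID Atlas requires.

```python
import csv
import io
import json

# Hypothetical column names, chosen only for this example.
FIELDS = ["date", "state", "county", "cases", "deaths", "tested"]

# Made-up sample records: one row per locality per date.
RECORDS = [
    {"date": "2020-04-01", "state": "Example State", "county": "Alpha County",
     "cases": 12, "deaths": 1, "tested": 140},
    {"date": "2020-04-01", "state": "Example State", "county": "Beta County",
     "cases": 3, "deaths": 0, "tested": 55},
]


def to_csv(records):
    """Serialize records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize the same records as a JSON array of objects."""
    return json.dumps(records, indent=2)


def from_csv(text):
    """Parse the CSV text back into records, converting counts to integers."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for key in ("cases", "deaths", "tested"):
            row[key] = int(row[key])
        rows.append(dict(row))
    return rows


if __name__ == "__main__":
    print(to_csv(RECORDS))
    print(to_json(RECORDS))
```

Because each row carries its own date and locality, anyone consuming the data can filter or aggregate it with standard tools and does not need to scrape a document layout.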

We’d like to hear from you and help you make your data better! Please reach out to us on Slack or file an issue on GitHub.