Datasets

Synthetic Data To Support UK-US Prize Challenge For Developing Privacy Enhancing Methods: Predicting Individual Infection Risk During A Pandemic

This dataset is synthetic data, sometimes called a virtual population or digital twin. It is a statistically accurate representation of a real population’s demographics, activities, and social contacts, but does not contain any individual person’s information. The dataset consists of three components: a demographically realistic synthetic population, including synthetic activities for each individual that record which locations they visited, when they visited, and how long each visit lasted; a contact network derived from these activities; and a set of disease states produced by simulating the effects of a COVID-like disease within this population. Two versions of the dataset are available. One represents the state of Virginia (~7.7 million individuals) and the other represents the United Kingdom (~62 million individuals).
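To illustrate how a contact network can be derived from activity records, here is a minimal sketch. It assumes a simple co-location rule: two individuals are in contact if their visits to the same location overlap in time, and the edge weight is the total overlap in minutes. The record layout, the identifiers, and the function name `build_contact_network` are hypothetical and are not taken from the dataset's actual schema or from the method used to build it.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical visit records: (person_id, location_id, start_minute, duration_minutes).
# These field names and values are illustrative assumptions only.
visits = [
    ("p1", "home_A", 0, 480),
    ("p2", "home_A", 0, 420),
    ("p1", "office_X", 540, 480),
    ("p3", "office_X", 600, 300),
    ("p2", "school_Y", 500, 400),
]

def build_contact_network(visits):
    """Return {(person_a, person_b): overlap_minutes} using a co-location rule."""
    # Group visits by location so only people in the same place are compared.
    by_location = defaultdict(list)
    for person, location, start, duration in visits:
        by_location[location].append((person, start, start + duration))

    contacts = defaultdict(int)
    for location_visits in by_location.values():
        for (pa, sa, ea), (pb, sb, eb) in combinations(location_visits, 2):
            if pa == pb:
                continue  # skip a person's own repeat visits
            overlap = min(ea, eb) - max(sa, sb)
            if overlap > 0:
                # Undirected edge: store the pair in sorted order.
                contacts[tuple(sorted((pa, pb)))] += overlap
    return dict(contacts)

network = build_contact_network(visits)
for pair, minutes in sorted(network.items()):
    print(pair, minutes)
```

The pairwise comparison within each location is quadratic in the number of visitors, so a population of millions would likely need an interval-sweep or sub-location partitioning approach instead.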

Synthetic Populations

Synthetic Populations for Regions of the World (SPW) is a collection of data sets, each of which is a synthetic population for a country or a state; both are referred to here as regions. At a high level, a synthetic population of a region captures the people of the region with selected demographic attributes, their organization into households, their assigned activities for a day, and the locations where those activities take place. These locations are where interactions among population members occur, for example the interactions that drive the spread of epidemics.

Virginia County-level NPIs

We gathered data on non-pharmaceutical interventions (NPIs) against COVID-19 from counties and independent cities in Virginia. NPIs are methods for reducing the spread of a disease that do not involve vaccines or drug treatments. Specifically, we looked for the dates when closures or mandates were implemented or lifted in the following five categories: masks, businesses, pre-K-12 schools, colleges, and religious organizations.

Global PatchSim Dataset

We developed this dataset for our project COVID-19 Response Support: Building Synthetic Multi-scale Networks, funded through the NSF RAPID program. This material is based upon work supported by the National Science Foundation under Grant No. OAC-2027541.

Baidu Mobility Data for January - April 2020

This archive contains mobility data made public by Baidu and scraped from their qiangxi.baidu.com web site in February and March, 2020. We have reformatted the data into a more easily computable form: comma-separated value (CSV) files providing the full origin-destination matrix for each day. We are publishing this reformatted version for research purposes under Article 22 of the Copyright Law of the People’s Republic of China (https://wipolex.wipo.int/en/text/466268). We make no representation as to the suitability of the data for any purpose, but nevertheless hope that it may be useful for researchers trying to calibrate models of 2019-nCoV. We wish to thank Baidu especially for making these valuable data available and encourage them to continue to do so. Thanks also to Chunhong Mao for making us aware of this data source and explaining what the data represent, and to James Schlitt for scraping the data.
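As a rough illustration of working with a daily origin-destination matrix stored as CSV, the sketch below reads one day's data into a nested dictionary and sums the outgoing flow per origin. The column names (`origin`, `destination`, `flow`), the long-format layout, and the sample values are assumptions made for this example; the actual files in the archive may be organized differently.

```python
import csv
import io
from collections import defaultdict

# Hypothetical one-day sample in long format; not real Baidu data.
sample_csv = """origin,destination,flow
CityA,CityB,0.12
CityA,CityC,0.05
CityB,CityA,0.20
"""

def load_od_matrix(handle):
    """Read rows of (origin, destination, flow) into {origin: {destination: flow}}."""
    matrix = defaultdict(dict)
    for row in csv.DictReader(handle):
        matrix[row["origin"]][row["destination"]] = float(row["flow"])
    return {origin: dict(dests) for origin, dests in matrix.items()}

# In practice the handle would come from open(path, newline="") for one day's file.
od = load_od_matrix(io.StringIO(sample_csv))

# Total outgoing flow per origin for this day.
outflow = {origin: sum(dests.values()) for origin, dests in od.items()}
print(outflow)
```

A nested dictionary keeps the example dependency-free; for the full set of daily matrices, a sparse array or a data-frame library would probably be a better fit.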