It contains the number of user visits from each census tract to each POI, aggregated by week. It is used to build the metapopulation graph. The dataset covers 70k+ POIs and 200k+ trip pairs per week, collected over 66 weeks.
It contains the weekly case rate for each New York zip code. It serves as the ground truth for the model's predictions.
The key data-cleaning step is building a mapping from census tracts to ZIP Code Tabulation Areas (ZCTAs) using a combined ratio.
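As a rough illustration of the tract-to-ZCTA mapping, the sketch below reallocates tract-level visit counts to ZCTAs with a ratio-weighted crosswalk. It assumes the ratio can be read as the share of a tract assigned to each ZCTA; the function name, the data layout, and the toy identifiers are hypothetical and are not taken from the project code.

```python
from collections import defaultdict


def tract_visits_to_zcta(visits, crosswalk):
    """Reallocate tract-level visit counts to ZCTAs.

    visits:    {(tract_id, poi_id): visit_count}
    crosswalk: {tract_id: [(zcta_id, ratio), ...]}, where the ratios for a
               tract are assumed to sum to 1 (an assumption of this sketch).
    """
    zcta_visits = defaultdict(float)
    for (tract, poi), count in visits.items():
        # Tracts missing from the crosswalk are skipped in this sketch.
        for zcta, ratio in crosswalk.get(tract, []):
            zcta_visits[(zcta, poi)] += count * ratio
    return dict(zcta_visits)


# Toy data: one tract split across two ZCTAs, one tract fully inside a ZCTA.
crosswalk = {
    "tract_A": [("10004", 0.75), ("10005", 0.25)],
    "tract_B": [("10004", 1.0)],
}
visits = {("tract_A", "poi_1"): 40, ("tract_B", "poi_1"): 10}

zcta_visits = tract_visits_to_zcta(visits, crosswalk)
print(zcta_visits)
```

A tract that straddles several ZCTAs contributes a fraction of its visits to each one, so the total visit count is preserved whenever the ratios for a tract sum to 1.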
The main idea behind our architecture is to learn a good representation of the dynamics of each POI and each ZCTA. These representations are then combined with visit counts to forecast the future case rate of each ZCTA as a function of the aggregated effect of mobility from every POI on that ZCTA.
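The aggregation idea can be sketched in a few lines. This is only a toy stand-in, not the actual ZPMNet architecture: it assumes fixed embedding vectors and uses a plain dot product as the POI-ZCTA interaction score, whereas the real model learns both. All names here are hypothetical.

```python
def forecast_case_rate(zcta_vec, poi_vecs, visits):
    """Toy forecast for one ZCTA.

    zcta_vec: embedding of the ZCTA (list of floats)
    poi_vecs: {poi_id: embedding of the POI}
    visits:   {poi_id: visit count from this ZCTA to the POI}

    Each POI contributes visits * interaction(ZCTA, POI); the forecast is
    the sum of those contributions. The dot product is an assumption of
    this sketch, standing in for a learned interaction function.
    """
    total = 0.0
    for poi, count in visits.items():
        interaction = sum(a * b for a, b in zip(zcta_vec, poi_vecs[poi]))
        total += count * interaction
    return total


zcta_vec = [0.5, 1.0]
poi_vecs = {"restaurant": [1.0, 0.5], "school": [0.0, 0.25]}
visits = {"restaurant": 20, "school": 8}

pred = forecast_case_rate(zcta_vec, poi_vecs, visits)
print(pred)
```

The point of the sketch is the structure: visit counts act as edge weights, and the per-POI terms are summed to give a single value per ZCTA.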
The table below lists some common baselines along with their reported accuracies (for r weeks ahead). The plot below shows the predictions for 8 randomly selected ZCTAs.
We observe that ZPMNet is significantly better than all baselines, with average improvements of 52%-220%. As we forecast farther into the future, the performance of the baselines degrades faster than that of our model.
We also observed that for over 63% of all ZCTAs our model performs over 200% better, while the predictions of the other models fail to capture the general trends of ZCTA case rates.
To detect hotspots, we consider a weight as defined below (following the naming conventions defined in the architecture). Essentially, this quantity models each POI's contribution to each region. A POI with a higher value is one that has effectively caused more infections.
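Given such per-region weights, ranking hotspots is a simple aggregation. The sketch below assumes the weights are already available as a mapping from (ZCTA, POI) pairs to values and that a POI's overall score is the sum of its weights across regions; both the layout and the summation rule are assumptions made for illustration, and the names are hypothetical.

```python
def top_hotspots(weights, k=10):
    """Return the k POIs with the largest total weight.

    weights: {(zcta_id, poi_id): contribution of the POI to the ZCTA}
    Summing across ZCTAs is an assumption of this sketch.
    """
    totals = {}
    for (_zcta, poi), w in weights.items():
        totals[poi] = totals.get(poi, 0.0) + w
    # Sort by total weight, largest first, and keep the first k entries.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Toy weights for three POIs across two ZCTAs.
weights = {
    ("10004", "diner"): 3.0,
    ("10005", "diner"): 1.5,
    ("10004", "school"): 2.0,
    ("10005", "club"): 2.5,
}

ranked = top_hotspots(weights, k=2)
print(ranked)
```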
The top 10 POIs by this value are listed in the table below. Most of them are dining spots, which have a high frequency of airborne transmission; the remaining POIs include clubs and schools.