Foursquare Dataset

1. NYC Restaurant Rich Dataset (Check-ins, Tips, Tags)
Location-based social networks (LBSNs) have attracted millions of users and hold a massive volume of their digital footprints. We crawled part of these digital footprints from Foursquare in order to study the problems of personalized location recommendation and search. This dataset includes check-in, tip and tag data for restaurant venues in NYC, collected from Foursquare from 24 October 2011 to 20 February 2012. It contains 3112 users and 3298 venues with 27149 check-ins and 10377 tips (written in English).

Please download the dataset here and check the readme file here

Please cite our papers if you publish material based on this dataset.
  • Dingqi Yang, Daqing Zhang, Zhiyong Yu and Zhiwen Yu. Fine-Grained Preference-Aware Location Search Leveraging Crowdsourced Digital Footprints from LBSNs. In Proceedings of the 2013 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2013), September 8-12, 2013, Zurich, Switzerland. [PDF]
  • Dingqi Yang, Daqing Zhang, Zhiyong Yu and Zhu Wang. A Sentiment-enhanced Personalized Location Recommendation System. In Proceedings of the 24th ACM Conference on Hypertext and Social Media (HT 2013), 1-3 May, 2013, Paris, France. [PDF]
  • Dingqi Yang, Daqing Zhang, Zhiyong Yu, Zhiwen Yu, Djamal Zeghlache. SESAME: Mining User Digital Footprints for Fine-Grained Preference-Aware Social Media Search. ACM Trans. on Internet Technology (TOIT), 14(4), 28, 2014. [PDF]


2. NYC and Tokyo Check-in Dataset
This dataset contains check-ins in NYC and Tokyo collected over about 10 months (from 12 April 2012 to 16 February 2013). It contains 227,428 check-ins in New York City and 573,703 check-ins in Tokyo. Each check-in is associated with its time stamp, its GPS coordinates and its semantic meaning (represented by fine-grained venue categories). This dataset was originally used for studying the spatial-temporal regularity of user activity in LBSNs.
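As an illustration of the kind of spatial-temporal regularity such check-ins can expose, here is a minimal Python sketch that counts check-ins per venue category and local hour of day. The record layout, the sample values and the time-zone offset field are assumptions made for this example only; the actual file format is described in the readme file.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical records, not taken from the dataset:
# (user_id, venue_category, latitude, longitude, utc_time, tz_offset_minutes)
checkins = [
    ("u1", "Coffee Shop", 40.7411, -73.9897,
     datetime(2012, 4, 12, 13, 5, tzinfo=timezone.utc), -240),
    ("u1", "Coffee Shop", 40.7411, -73.9897,
     datetime(2012, 4, 13, 13, 20, tzinfo=timezone.utc), -240),
    ("u2", "Ramen Restaurant", 35.6895, 139.6917,
     datetime(2012, 4, 12, 3, 0, tzinfo=timezone.utc), 540),
]


def local_hour_histogram(records):
    """Count check-ins per (venue category, local hour of day)."""
    counts = Counter()
    for _user, category, _lat, _lon, utc_time, offset_min in records:
        # Shift the UTC time stamp by the (assumed) offset to get local time.
        local_time = utc_time + timedelta(minutes=offset_min)
        counts[(category, local_time.hour)] += 1
    return counts


hist = local_hour_histogram(checkins)
for (category, hour), n in sorted(hist.items()):
    print(f"{category:20s} {hour:02d}:00  {n}")
```

Grouping by local rather than UTC hour keeps daily routines in NYC and Tokyo comparable.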

Please download the dataset here and check the readme file here

Please cite our paper if you publish material based on this dataset.
  • Dingqi Yang, Daqing Zhang, Vincent W. Zheng, Zhiyong Yu. Modeling User Activity Preference by Leveraging User Spatial Temporal Characteristics in LBSNs. IEEE Trans. on Systems, Man, and Cybernetics: Systems (TSMC), 45(1), 129-142, 2015. [PDF]



3. Global-scale Check-in Dataset
This dataset includes long-term (about 18 months, from April 2012 to September 2013) global-scale check-in data collected from Foursquare. It contains 33,278,683 check-ins by 266,909 users on 3,680,126 venues (in 415 cities in 77 countries). These are the 415 cities with the most check-ins by Foursquare users worldwide, each of which contains at least 10K check-ins. Please see the references for more details about data collection and processing.

Please download the dataset here (about 775MB compressed) and check the readme file here

Please cite our papers if you publish material based on this dataset.
  • Dingqi Yang, Daqing Zhang, Bingqing Qu. Participatory Cultural Mapping Based on Collective Behavior Data in Location Based Social Networks. ACM Trans. on Intelligent Systems and Technology (TIST), 2015. [PDF]
  • Dingqi Yang, Daqing Zhang, Longbiao Chen, Bingqing Qu. NationTelescope: Monitoring and Visualizing Large-Scale Collective Behavior in LBSNs. Journal of Network and Computer Applications (JNCA), 55:170-180, 2015. [PDF]


4. User Profile Dataset 
This dataset includes some user profile data for privacy study (i.e., gender, #friends, #followers). It contains 18,201 and 11,874 users who have checked in in New York City and Tokyo, respectively. The corresponding user check-in data can be found in the global-scale check-in dataset described above. The two datasets can be linked by the anonymized user ID (the unique key). Please see the references for more details about data collection and processing.
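Linking on the anonymized user ID can be sketched in Python as follows. This is only an illustration: the field names and sample records are hypothetical, and the real column layout of both datasets is given in their readme files.

```python
from collections import defaultdict

# Hypothetical sample records; field names are assumptions for illustration.
profiles = [
    {"user_id": "u1", "gender": "male", "friends": 12, "followers": 3},
    {"user_id": "u2", "gender": "female", "friends": 40, "followers": 9},
]
checkins = [
    {"user_id": "u1", "venue_id": "v10"},
    {"user_id": "u1", "venue_id": "v11"},
    {"user_id": "u3", "venue_id": "v12"},  # user with no profile record
]


def link_by_user_id(profiles, checkins):
    """Map each profiled user ID to (profile, list of that user's check-ins)."""
    by_user = defaultdict(list)
    for checkin in checkins:
        by_user[checkin["user_id"]].append(checkin)
    # Keep only users that appear in the profile data (a left join on user_id).
    return {p["user_id"]: (p, by_user.get(p["user_id"], [])) for p in profiles}


linked = link_by_user_id(profiles, checkins)
for user_id, (profile, user_checkins) in linked.items():
    print(user_id, profile["gender"], len(user_checkins))
```

Indexing the check-ins by user ID first means the larger dataset is scanned only once, which matters when it holds tens of millions of rows.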

Please download the dataset here and check the readme file here

Please cite our paper if you publish material based on this dataset.
  • Dingqi Yang, Daqing Zhang, Bingqing Qu, Philippe Cudré-Mauroux. PrivCheck: Privacy-Preserving Check-in Data Publishing for Personalized Location Based Services. In Proc. of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp'16), September, 2016, Heidelberg, Germany. [PDF]



    More datasets will come soon.