Location-Based Social Network Data Generation Based on Patterns of Life


Joon-Seok Kim (George Mason University), Hyunjee Jin (George Mason University), Hamdi Kavak (George Mason University), Ovi Chris Rouly (Tulane University), Andrew Crooks (George Mason University), Dieter Pfoser (George Mason University), Carola Wenk (Tulane University), Andreas Züfle (George Mason University)


Location-based social networks (LBSNs) have been studied extensively in recent years. However, real-world LBSN data sets have several weaknesses when used in such studies: they are sparse and small, they raise privacy concerns, and they lack authoritative ground truth. To overcome these weaknesses, we leverage a large-scale geospatial simulation to build a framework that simulates human behavior and creates synthetic but realistic LBSN data based on human patterns of life. Such data captures not only the location of users over time but also their social interactions via their social networks. Patterns of life are simulated by giving agents (i.e., people) an array of "needs" that they aim to satisfy. For instance, agents go home when they are tired, go to restaurants when they are hungry, go to work to fulfill their financial needs, and go to recreational sites to meet friends and satisfy their social needs. While existing real-world LBSN data sets are trivially small, the proposed framework provides a source of massive LBSN benchmark data that closely mimics the real world. As such, it allows us to capture 100% of the (simulated) population without any data uncertainty, privacy-related concerns, or incompleteness, and it allows researchers to see the (simulated) world through the lens of an omniscient entity with perfect data. Our framework is made available to the community. In addition, we provide a series of simulated benchmark LBSN data sets using different real-world urban environments obtained from OpenStreetMap. These data sets, which comprise gigabytes of spatio-temporal and temporal social network data taken at 5-minute intervals, are made available to the research community.
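The needs-driven behavior described above can be illustrated with a minimal sketch. It is not the framework's actual implementation: the need names, the 0-to-1 satisfaction scale, the decay rates, and the `Agent` class are all assumptions made for illustration, and the sketch ignores travel time and the social network entirely.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from each need to the kind of site that satisfies it,
# following the examples given in the abstract.
SITE_FOR_NEED = {
    "sleep": "home",
    "food": "restaurant",
    "money": "workplace",
    "social": "recreation",
}


@dataclass
class Agent:
    # Satisfaction level per need in [0, 1]; 1.0 means fully satisfied
    # (an assumed scale, not taken from the framework).
    needs: dict = field(default_factory=lambda: {n: 1.0 for n in SITE_FOR_NEED})

    def decay(self, rates):
        """Lower every need by its per-tick rate, clamped at 0."""
        for name, rate in rates.items():
            self.needs[name] = max(0.0, self.needs[name] - rate)

    def most_urgent_need(self):
        """Return the need with the lowest satisfaction level."""
        return min(self.needs, key=self.needs.get)

    def step(self, rates):
        """Advance one tick and return the type of site the agent visits."""
        self.decay(rates)
        need = self.most_urgent_need()
        self.needs[need] = 1.0  # simplification: a visit fully satisfies the need
        return SITE_FOR_NEED[need]


# Illustrative decay rates per simulation tick (assumed values).
RATES = {"sleep": 0.125, "food": 0.375, "money": 0.25, "social": 0.0625}

agent = Agent()
trace = [agent.step(RATES) for _ in range(3)]
print(trace)
```

Each tick, every need decays and the agent heads to the site type that satisfies its most urgent need. A fuller model would presumably also account for distance, opening hours, and which friends are present at a site.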

Overview of Location-Based Social Network (LBSN)

Why Synthetic Data?

Publicly available real-world data sets have been the driving force for LBSN research in recent years, but such data sets exhibit certain weaknesses: they are sparse and small, they raise privacy concerns, and they lack authoritative ground truth.

Socio-spatial Simulation Settings

Social network and data visualization

The following demo videos visualize social network evolution over time with basic statistics.

Number of Agents: 1,000 / Maps: TownS (virtual city)

Number of Agents: 1,000 / Maps: TownL (virtual city)

Number of Agents: 1,000 / Maps: NOLA (New Orleans, Louisiana)

Number of Agents: 1,000 / Maps: GMU (George Mason University, Fairfax, Virginia)

Maps used for Location-Based Social Network Simulation

New Orleans, Louisiana (NOLA), Mississippi River, Lake Pontchartrain, and the French Quarter

Maps for NOLA dataset

Virtual City (TownL)

Maps for TownL dataset

George Mason University (GMU), Fairfax, VA.

Maps for GMU dataset

Virtual City (TownS)

Maps for TownS dataset

Analysis of LBSN Datasets

The following graphs compare how the average social network degree changes over time across different scenarios.


GMU scenarios


NOLA scenarios


TownS scenarios


TownL scenarios


1K scenarios


3K scenarios


5K scenarios


All scenarios
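The average degree plotted in the graphs above could be computed from timestamped social network snapshots roughly as follows. This is a sketch under assumptions: the `(timestamp, agent_a, agent_b)` record layout and the function name are hypothetical and do not describe the format of the published data sets.

```python
from collections import defaultdict


def average_degree_over_time(edges, num_agents):
    """Return {timestamp: average degree} for timestamped undirected edges.

    `edges` is an iterable of (timestamp, agent_a, agent_b) tuples, each
    stating that the two agents are connected in that snapshot.
    """
    edges_at = defaultdict(set)
    for t, a, b in edges:
        if a != b:  # ignore self-loops
            # frozenset treats (a, b) and (b, a) as the same undirected edge
            edges_at[t].add(frozenset((a, b)))
    # Each undirected edge adds 2 to the sum of degrees.
    return {t: 2 * len(es) / num_agents for t, es in sorted(edges_at.items())}


# Tiny illustrative input: timestamps in minutes, four agents in total.
snapshots = [
    (0, "a1", "a2"),
    (5, "a1", "a2"),
    (5, "a2", "a3"),
    (5, "a3", "a2"),  # same edge as the previous record, counted once
]
avg = average_degree_over_time(snapshots, num_agents=4)
print(avg)
```

Dividing by the total number of agents, rather than only by the agents that appear in an edge, keeps isolated agents in the average, which matters when comparing scenarios with different population sizes.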


Related Research: