Dataset Description

The public dataset represents the movements of people at the MERL facility for a period of a year, as recorded by a network of over 200 wireless motion sensors.

Raw Motion Data

Individual wireless sensors detect the movement of people using passive infrared motion detectors. An individual or small group walking past the sensor will generate a single activation, recorded as an association between a sensor ID and a timestamp. The sensors are placed densely in the hallways and lobbies of MERL, spaced approximately two meters apart. As people move through the network, a sequence of sensor activations is recorded. This spatio-temporal structure, not the individual sensor activations, gives meaning to the data.

Even when people stand still under a sensor, their small movements will cause the sensor to trigger repeatedly, giving the data a different, distinct spatio-temporal structure.
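
As a rough sketch of how this second kind of structure might be picked out of the raw activations, the following groups activations by sensor and reports runs of closely spaced triggers at a single sensor. The `Activation` record, the use of seconds for timestamps, and the `max_gap` and `min_count` thresholds are illustrative assumptions and are not part of the dataset specification.

```python
from collections import defaultdict
from typing import NamedTuple


class Activation(NamedTuple):
    sensor_id: int
    timestamp: float  # seconds; the unit is an assumption for this sketch


def repeated_trigger_runs(activations, max_gap=5.0, min_count=3):
    """Find runs of closely spaced triggers at a single sensor.

    Returns (sensor_id, start, end, count) tuples sorted by start time.
    max_gap and min_count are hypothetical thresholds, not dataset values.
    """
    by_sensor = defaultdict(list)
    for a in activations:
        by_sensor[a.sensor_id].append(a.timestamp)

    runs = []
    for sensor_id, times in by_sensor.items():
        times.sort()
        start = prev = times[0]
        count = 1
        for t in times[1:]:
            if t - prev <= max_gap:
                count += 1  # still inside the same run
            else:
                if count >= min_count:
                    runs.append((sensor_id, start, prev, count))
                start = t  # begin a new run
                count = 1
            prev = t
        if count >= min_count:
            runs.append((sensor_id, start, prev, count))
    return sorted(runs, key=lambda r: r[1])


# Someone lingers under sensor 1 while others pass sensors 2 and 3.
sample = [Activation(1, 0.0), Activation(2, 1.0), Activation(1, 2.0),
          Activation(3, 3.0), Activation(1, 4.0), Activation(1, 6.0),
          Activation(1, 100.0)]
runs = repeated_trigger_runs(sample)
```

A single pass past a sensor yields one activation and is ignored here, while a person standing in place shows up as a run. Suitable thresholds would need to be tuned against the real data.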

The dataset contains over 30 million activations. It is interesting to consider the details of individual movements, such as in this plot of a fire evacuation (meetings and transit activity are apparent both before the evacuation and after the re-population):

It is also fruitful to consider the larger structures of populations over longer spans of time, such as this plot of a single week showing the differences between days and nights (vertical bands) as well as weekdays and weekends (the five dense center bars versus the sparser end bars):
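
The aggregation behind a weekly view of this kind can be sketched by binning activation timestamps by weekday and hour. This is a minimal illustration, assuming timestamps are Unix epoch seconds interpreted in UTC. The dataset's actual time encoding may differ, and local time would be the more natural choice for separating day from night. The function name `weekly_profile` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone


def weekly_profile(timestamps):
    """Count activations per (weekday, hour) bin; weekday 0 is Monday.

    Assumes each timestamp is Unix epoch seconds (an assumption of this
    sketch, not a documented property of the dataset).
    """
    profile = Counter()
    for ts in timestamps:
        moment = datetime.fromtimestamp(ts, tz=timezone.utc)
        profile[(moment.weekday(), moment.hour)] += 1
    return profile


# Two activations on a Monday morning and one on the following Saturday.
monday_morning = datetime(2007, 1, 1, 9, 30, tzinfo=timezone.utc).timestamp()
profile = weekly_profile([monday_morning,
                          monday_morning + 60,
                          monday_morning + 5 * 86400])
```

Dense weekday daytime bins and sparse night and weekend bins in such a profile would correspond to the bands described above.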

Here is a short introductory video about the data showing a few things we've done with it:

Tracklets

When an individual moves through the space, they create a structured pattern of activations that links parts of the space together in a meaningful way. Tracking is the process of recovering this structure from the raw observations. Here we see a set of activations linked together into a tracklet:

There is an inherent ambiguity in motion sensor data. A one-bit motion sensor cannot identify individuals, or even distinguish between individuals and small groups. It is therefore impossible to track individuals through the space without some degree of ambiguity. A tracklet is a small section of a track that can be recovered unambiguously. This dataset includes a forest of graphs that represents all the known tracklets as well as the ambiguity relationships between them. All true tracks will be embedded in a graph, but each graph may allow many valid interpretations:
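
To make the idea of multiple interpretations concrete, here is a minimal sketch that enumerates every candidate track through one small tracklet graph. The representation (a dictionary mapping each tracklet ID to the tracklets that may follow it) and the function name `enumerate_tracks` are assumptions for illustration, not the dataset's file format. The graph is assumed to be acyclic, since tracklets are ordered in time.

```python
def enumerate_tracks(successors):
    """List every start-to-end path through an acyclic tracklet graph.

    successors maps a tracklet ID to the IDs that may follow it
    (a hypothetical in-memory representation).
    """
    all_nodes = set(successors)
    with_predecessor = set()
    for following in successors.values():
        all_nodes.update(following)
        with_predecessor.update(following)
    # A track can only start at a tracklet that nothing leads into.
    starts = sorted(all_nodes - with_predecessor)

    tracks = []

    def walk(node, path):
        path = path + [node]
        following = successors.get(node, [])
        if not following:
            tracks.append(path)  # reached a track end
            return
        for nxt in following:
            walk(nxt, path)

    for start in starts:
        walk(start, [])
    return tracks


# Two people converge on tracklet C, then the motion splits into D and E.
graph = {"A": ["C"], "B": ["C"], "C": ["D", "E"]}
tracks = enumerate_tracks(graph)
```

In this toy graph there are four candidate tracks (A-C-D, A-C-E, B-C-D, B-C-E), although only two people were present. The sensors alone cannot say which pairing of starts and ends is the true one.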

Reach

It is possible to walk the tracklet graphs to discover the possible pairings of track starts and ends. The dataset includes pre-computed counts of potential trips between all points in the space. One possible use of this data is to estimate the probability that a trip beginning in one location will end at another location by accumulating evidence over a span of time, such as a day or a week. This is an illustration of such an estimate:
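
One simple way to form such an estimate is to sum the potential-trip counts over the chosen span and normalize by the total for each starting location. The sketch below assumes the counts are available as dictionaries keyed by (start, end) pairs, one per day. That layout, the location names, and the function names are hypothetical. Because the counts are of potential trips, the result is an evidence-weighted estimate rather than an observed frequency.

```python
from collections import Counter, defaultdict


def accumulate(daily_counts):
    """Sum per-day {(start, end): count} dictionaries over a span of days."""
    total = Counter()
    for day in daily_counts:
        total.update(day)  # Counter.update adds counts for matching keys
    return total


def destination_probabilities(counts):
    """Estimate P(end | start) by normalizing counts per start location."""
    per_start = defaultdict(int)
    for (start, _end), count in counts.items():
        per_start[start] += count
    return {(start, end): count / per_start[start]
            for (start, end), count in counts.items()
            if per_start[start] > 0}


# Two days of made-up counts for trips that begin in the lobby.
day1 = {("lobby", "kitchen"): 3, ("lobby", "lab"): 1}
day2 = {("lobby", "kitchen"): 1, ("lobby", "lab"): 3}
probs = destination_probabilities(accumulate([day1, day2]))
```

Accumulating over a longer span, such as a week, smooths out the ambiguity in any single day's counts.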

Symbolic Data

The dataset also includes a calibration file that associates the sensor IDs with a map of the lab, grounding the data in the spatial context of our facility. Temporally, the data is given meaning by several calendars. These record the times and locations of various meetings and gatherings, the dates of official holidays, and the number of people who were out of the office on given days. We've also included a daily almanac of the weather conditions in Cambridge, Massachusetts, where our lab is located.

Using the Data

We invite you to download the data and apply your analytic, visualization, and interface tools.

Papers that utilize this dataset must reference this technical report: MERL TR2007-069. You can find other papers that have used the dataset by searching for papers that cite that report.