By Aline Carneiro Viana, Research Director at INRIA Saclay
Introductory text:
Most work in the literature studies human mobility predominantly from GPS datasets. Although such datasets allow a fine-grained investigation of mobility, those collected in large urban scenarios are rarely publicly available. In particular, the experiments performed to collect human mobility data generally involve people carrying GPS-capable devices that regularly record their precise position.
Due to the complexity of those experiments, they tend to be limited in number of participants (e.g., up to 35), duration (i.e., a few weeks), and space, as in university campuses [5.b] or shopping malls [6]. The Lausanne campaign [7] and GeoLife [8] are among the few relatively large experiments, with around 200 participants, that attempt to collect fine-grained human mobility. The dataset collected by the former is not publicly available, while the one from the latter is. Furthermore, human mobility datasets covering large areas tend to rely only on automobile transportation, which is outside the scope here [9, 10].
On the other hand, datasets collected from cellular networks are being considered by the networking research community. Such datasets, named Call Detail Records (CDRs), constitute another source of information on human mobility and content generation (calls, SMS, or Internet traffic). In contrast to the significant body of related work using voice call and text message information from CDRs, we (Eduardo Mucceli and Aline C. Viana) characterized and modeled the data traffic demand generated by smartphone subscribers. Although convenient and frequently used, voice call and SMS records only provide an approximation of users' data consumption. In addition, call activity is sparse in time [4], and subscribers' calling behavior shows strong variations with respect to the time of day and the day of the week [3], which is not the case for data traffic. Also, call records miss the background traffic load automatically generated by current smartphone applications (e.g., email checks, synchronization). Most importantly, since smartphones are now used more for data than for calls [5.a], the use of call records for investigating traffic demands is not enough for dimensioning networks.
For these reasons, we have produced a precise temporal characterization of individual subscribers' traffic behavior, clustered by usage pattern, instead of a network-wide view of data traffic [11, 12, 13]. Note that the high variability of individual subscribers' traffic demands and the use of large-scale datasets make this task complex. We then provided a way to synthetically, yet consistently, reproduce the temporal usage patterns of mobile subscribers (the first work in the literature to do so, to the best of our knowledge). In particular, we model the usage patterns of six subscriber profiles over two different periods of the day: peak and non-peak hours. Our main result is a synthetic, measurement-based mobile data traffic generator, capable of imitating temporal traffic-related activity patterns for six different categories of subscribers, during two time periods of a routine day in their lives [14].
Goals:
The goal of this problem is to attach synthetic traffic, produced by the traffic generator described above, to the positions of the user trajectories in the GeoLife dataset. This amounts to merging a real location dataset with a synthetic traffic dataset. The output will be a new dataset containing users' mobility and the corresponding traffic generated during their movement.
The problems to be tackled are time synchronisation and the duration of the generated traffic with respect to the mobility of users (a traffic session may start at one position and finish at another). The traffic rate is an important piece of information that is not provided but should also be considered, since the traffic session duration depends on it.
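One possible way to approach these two problems is sketched below in Python. It is only an illustration under stated assumptions: the generator's output format is not described here, so the sketch hypothetically assumes that each synthetic session comes as a start time and a volume in bytes, that the student picks a constant traffic rate, and that a trajectory is a time-sorted list of (timestamp, latitude, longitude) fixes on the same clock as the sessions. The names `Session`, `session_duration_s` and `fixes_during_session` are illustrative, not part of any existing tool.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (unix_time_s, latitude, longitude)


@dataclass
class Session:
    """Hypothetical representation of one synthetic traffic session."""
    start_s: float       # session start, on the same clock as the GPS fixes
    volume_bytes: float  # amount of data generated during the session


def session_duration_s(volume_bytes: float, rate_bytes_per_s: float) -> float:
    """Duration implied by a volume and an assumed constant rate."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return volume_bytes / rate_bytes_per_s


def fixes_during_session(fixes: List[Fix], session: Session,
                         rate_bytes_per_s: float) -> List[Fix]:
    """Return the GPS fixes recorded between the session start and end.

    `fixes` must be sorted by timestamp. A session that starts at one
    position and ends at another yields several fixes.
    """
    end_s = session.start_s + session_duration_s(session.volume_bytes,
                                                 rate_bytes_per_s)
    times = [t for t, _, _ in fixes]
    lo = bisect_left(times, session.start_s)   # first fix at or after start
    hi = bisect_right(times, end_s)            # one past last fix at or before end
    return fixes[lo:hi]


# Toy trajectory with one fix every 5 seconds (made-up coordinates).
trajectory = [
    (0.0, 39.90, 116.40),
    (5.0, 39.91, 116.41),
    (10.0, 39.92, 116.42),
    (15.0, 39.93, 116.43),
]
# 800 bytes at an assumed 100 bytes/s lasts 8 s, so the session spans 4 s to 12 s.
session = Session(start_s=4.0, volume_bytes=800.0)
covered = fixes_during_session(trajectory, session, rate_bytes_per_s=100.0)
```

In this toy case the session covers the fixes at 5 s and 10 s, which shows how a single session can be spread over more than one position. A constant rate is the simplest choice. A per-profile or time-varying rate would change only `session_duration_s`.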
The GeoLife dataset is publicly available:
- https://www.microsoft.com/en-us/research/publication/geolife-gps-trajectory-dataset-user-guide/
- https://www.microsoft.com/en-us/download/details.aspx?id=52367
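As a starting point for reading the trajectories, the following Python sketch parses the content of one GeoLife PLT file. It assumes the layout described in the dataset's user guide: six header lines to skip, then comma-separated records whose first two fields are latitude and longitude and whose last two fields are the date and time as strings. This layout, and the choice to treat timestamps as UTC, should be checked against the user guide before use. The function name `parse_plt` is illustrative.

```python
from datetime import datetime, timezone
from typing import List, Tuple

Fix = Tuple[float, float, float]  # (unix_time_s, latitude, longitude)


def parse_plt(text: str) -> List[Fix]:
    """Parse the text of one PLT trajectory file into a list of fixes."""
    fixes: List[Fix] = []
    # Assumption: the first six lines are headers and carry no positions.
    for line in text.splitlines()[6:]:
        parts = line.strip().split(",")
        if len(parts) < 7:
            continue  # skip blank or malformed lines
        lat, lon = float(parts[0]), float(parts[1])
        stamp = datetime.strptime(parts[5] + " " + parts[6],
                                  "%Y-%m-%d %H:%M:%S")
        # Assumption: timestamps are expressed in UTC.
        unix_s = stamp.replace(tzinfo=timezone.utc).timestamp()
        fixes.append((unix_s, lat, lon))
    return fixes


# Small in-memory sample; the header contents are placeholders and are ignored.
sample = "\n".join(
    ["header line %d" % i for i in range(1, 7)]
    + [
        "39.906631,116.385564,0,492,40097.5864583333,2009-10-11,14:04:30",
        "39.906554,116.385625,0,492,40097.5865162037,2009-10-11,14:04:35",
    ]
)
fixes = parse_plt(sample)
```

Converting every record to a Unix timestamp gives the trajectories a single numeric time axis, which makes the later synchronisation with the synthetic traffic trace easier.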
The traffic generator is available here:
- http://macaco.inria.fr/software/
The source code is not available, but the student can use the generator to produce a trace with 200 users (as in GeoLife).
[3] J. Candia, M. Gonzalez, P. Wang, T. Schoenharl, G. Madey, A.-L. Barabasi, Uncovering individual and collective human dynamics from mobile phone records, Journal of Physics A: Mathematical and Theoretical 41.
[4] R. Becker, R. Caceres, K. Hanson, J. Loh, S. Urbanek, A. Varshavsky, C. Volinsky, A tale of one city: Using cellular network data for urban planning, IEEE Pervasive Computing 10 (4) (2011) 18–26.
[5.a] J. Wortham, Cellphones now used more for data than for calls, New York Times.
[5.b] A. Socievole, F. De Rango, A. Caputo, Wireless contacts, facebook friendships and interests: Analysis of a multi-layer social network in an academic environment, in: Wireless Days (WD), 2014 IFIP, 2014, pp. 1–7. doi:10.1109/WD.2014.7020819.
[6] A. Galati, K. Djemame, C. Greenhalgh, A mobility model for shopping mall environments founded on real traces, Networking Science 2 (1-2) (2013) 1–11. doi:10.1007/s13119-012-0011-1.
URL http://dx.doi.org/10.1007/s13119-012-0011-1
[7] N. Kiukkonen, J. Blom, O. Dousse, D. Gatica-Perez, J. Laurila, Towards rich mobile phone datasets: Lausanne data collection campaign, in: Proc. ACM Int. Conf. on Pervasive Services, 2010.
[8] Y. Zheng, X. Xie, W.-Y. Ma, GeoLife: A collaborative social networking service among user, location and trajectory, IEEE Data Eng. Bull. 33 (2) (2010) 32–39.
[9] M. Piorkowski, N. Sarafijanovic-Djukic, M. Grossglauser, A parsimonious model of mobile partitioned networks with clustering, in: Communication Systems and Networks and Workshops, 2009. COMSNETS 2009. First International, IEEE, 2009, pp. 1–10. doi:10.1109/comsnets.2009.4808865.
URL http://dx.doi.org/10.1109/comsnets.2009.4808865
[10] R. Amici, M. Bonola, L. Bracciale, A. Rabuffi, P. Loreti, G. Bianchi, Performance assessment of an epidemic protocol in VANET using real traces, Procedia Computer Science 40 (2014) 92–99, Fourth International Conference on Selected Topics in Mobile & Wireless Networking (MoWNet 2014). doi:10.1016/j.procs.2014.10.035.
[11] D. Naboulsi, R. Stanica, M. Fiore, Classifying call profiles in large-scale mobile traffic datasets, in: Proc. of IEEE INFOCOM 2014.
[12] A. Pawling, N. V. Chawla, G. Madey, Anomaly detection in a mobile communication network, Computational and Mathematical Organization Theory 13 (4) (2007) 407–422.
[13] S. Hoteit, S. Secci, Z. He, C. Ziemlicki, Z. Smoreda, C. Ratti, G. Pujolle, Content consumption cartography of the Paris urban region using cellular probe data, in: Proc. of the 1st Workshop on Urban Networking (ACM UrbaNe), 2012.
[14] E. Mucceli, A. C. Viana, K. P. Naveen, C. Sarraute, Mobile data traffic modeling: Revealing temporal facets, Computer Networks (Elsevier), Vol. 112, pp. 176–193, January 2017.