• Exam datasets for research on crowdsourcing can be found here
We collected the exam dataset from one junior high school and one senior high school in Taiwan.
It is released for research purposes only. The answers provided by students can be viewed as labels on exam questions.

If you use this dataset, please cite the following papers:
[1] P.-Y. Chen, C.-W. Lien, F.-J. Chu, P.-S. Ting, and S.-M. Cheng, “Supervised Collective Classification for Crowdsourcing,” IEEE GLOBECOM Workshop, 2015
[2] P.-Y. Chen, S.-M. Cheng, P.-S. Ting, C.-W. Lien, and F.-J. Chu, “When Crowdsourcing Meets Mobile Sensing: A Social Network Perspective,” IEEE Communications Magazine, 2015


  • Traces of actual lateral movement attacks can be found here
We collected this dataset from a real enterprise network. It contains heterogeneous connectivity patterns in terms of host-application information. The dataset has two files: one contains normal traffic and lateral movement traces, and the other contains the propagation paths of the lateral movements.

If you use this dataset, please cite the following paper:
[1] P.-Y. Chen, S. Choudhury, L. Rodriguez, A. O. Hero, and I. Ray, “Enterprise Cyber Resiliency Against Lateral Movement: A Graph Theoretic Approach,” under review

  • Temporal collaboration network of Jure Leskovec and Andrew Ng (with ground-truth community labels) can be found here
This dataset was collected by Baichuan Zhang. It contains the coauthors of Prof. Jure Leskovec or Prof. Andrew Ng at Stanford University from 1995 to 2014. We partition this 20-year co-authorship record into four 5-year intervals, which yields a 4-layer multilayer graph. In each layer, there is an edge between two researchers if they co-authored at least one paper in that 5-year interval. For every edge in each layer, we use the temporal collaboration strength proposed in [2,3] as the edge weight. We manually label each researcher as either “Leskovec's collaborator” or “Ng's collaborator” based on collaboration frequency, and use these labels as the ground-truth cluster assignment. The ground-truth clusters with researcher names and collaboration strengths are displayed below.

If you use this dataset, please cite the following papers:
[1] P.-Y. Chen and A. O. Hero, “Multilayer Spectral Graph Clustering via Convex Layer Aggregation: Theory and Algorithms,” IEEE Transactions on Signal and Information Processing over Networks, 2017
[2] B. Zhang, T. K. Saha, and M. Al Hasan, “Name disambiguation from link data in a collaboration graph,” in IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2014
[3] T. K. Saha, B. Zhang, and M. Al Hasan, “Name disambiguation from link data in a collaboration graph using temporal and topological features,” Social Network Analysis and Mining, 2015
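
As a rough illustration of how the 4-layer collaboration graph described above could be assembled, the Python sketch below splits co-authorship records into 5-year layers. The input format (a list of (year, author list) tuples) and the function name build_layers are assumptions made for this example and do not reflect the actual file layout of the dataset. The edge weight here is simply the number of co-authored papers in the interval; it is a stand-in for, not an implementation of, the temporal collaboration strength of [2,3].

```python
from collections import Counter
from itertools import combinations


def build_layers(papers, first_year=1995, last_year=2014, width=5):
    """Split co-authorship records into fixed-width time layers.

    papers: iterable of (year, list_of_author_names) tuples
            (hypothetical format, not the dataset's own).
    Returns a list of Counters, one per layer. Each Counter maps an
    undirected edge, stored as a sorted 2-tuple of names, to the number
    of papers the two researchers co-authored in that interval.
    """
    span = last_year - first_year + 1
    n_layers = -(-span // width)  # ceiling division: 20 years / 5 = 4 layers
    layers = [Counter() for _ in range(n_layers)]
    for year, authors in papers:
        if not first_year <= year <= last_year:
            continue  # ignore papers outside the studied period
        idx = (year - first_year) // width
        # One edge per unordered pair of distinct co-authors.
        for u, v in combinations(sorted(set(authors)), 2):
            layers[idx][(u, v)] += 1
    return layers


# Toy records with placeholder names, for illustration only.
papers = [
    (1996, ["A", "B"]),
    (1998, ["B", "A", "C"]),
    (2003, ["A", "C"]),
    (2014, ["C", "D"]),
    (1990, ["A", "D"]),  # outside 1995-2014, skipped
]
layers = build_layers(papers)
for i, layer in enumerate(layers):
    print(f"layer {i}: {dict(layer)}")
```

An edge appears in a layer as soon as its count is at least one, which matches the rule that two researchers are connected if they co-authored at least one paper in the interval.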