
Current Position: Postdoctoral Researcher at Oak Ridge National Laboratory, Oak Ridge, TN.
I received my Ph.D. in Computer Science from Illinois Institute of Technology (IIT), Chicago. At IIT, I worked in the Data-Intensive Computing System Laboratory with Dr. Ioan Raicu.
My research focuses on distributed systems (especially distributed storage systems), cloud computing, and data management. My research experience also applies to distributed systems and applications in general, particularly to performance, scalability, and concurrency issues.


TECHNICAL SKILLS

Operating Systems:

Linux (7 years), Windows (12 years)

Programming Languages:

C (5 years), C++ (3 years), Java (2 years), Shell Script (3 years), Python (1 year)

Network programming:

Sockets with TCP/UDP (3 years)

Concurrent programming:

POSIX Threads (3 years), epoll (1 year)

Supercomputers:

Blue Gene/P (3 years), SiCortex (3 years)

Batch scheduling systems:

Cobalt (2.5 years), Slurm (3 years)

Cloud computing platforms:

Amazon EC2 (3 years), OpenStack (2 years)

Other distributed systems:

Memcached, Cassandra, Amazon DynamoDB 


EDUCATION
  • Ph.D. in Computer Science, Illinois Institute of Technology, Chicago.                                  08/2009 - Present

Research area: Distributed systems, Advisor: Dr. Ioan Raicu

  • M.E. in Computer Software, Xi’an Shiyou University, China.                                                                  07/2009

Master's thesis: A Study on P2P Distributed Database Middleware, Advisor: Dr. Tianshi Liu

  • B.E. in Computer Science, Xi’an Shiyou University, China.                                                                      07/2003

Bachelor's thesis: A Study on Fault Tolerance and Load Balancing in Distributed Computing


WORK EXPERIENCE

  • RA and TA, DataSys Laboratory, Illinois Institute of Technology, Chicago                            01/2011 – 11/2015
  • Research Aide (Internship), Argonne National Laboratory                                                    06/2014 – 08/2014
  • RA and TA, Xi’an Shiyou University, China                                                                              09/2006 – 07/2009
  • Website and database administrator (part time), Xi’an Shiyou University, China               09/2001 – 05/2003         

CURRENT RESEARCH PROJECTS

·     ZHT: a Zero-hop Distributed Key-Value Store                                                               01/2011 – 2015

ZHT is a distributed key-value store designed to be lightweight, high-performance, and fault-tolerant. In a recent evaluation, ZHT scaled up to 8K nodes on an IBM Blue Gene/P supercomputer, with latencies of only 1.1 ms and a throughput of 18M ops/sec. ZHT is written in C++ and is open source. I started ZHT from scratch.

·     ZHT+: Adaptive Key-Value Storage as a Service                                                            08/2014 – present

In this project I propose a new approach to designing and implementing NoSQL storage as a service, which adjusts the handling policy for each request according to the application’s requirements. The prototype system will heuristically optimize overall system performance across multiple metrics. I started ZHT+ based on ZHT.

·     WaggleDB: A Cloud-based Interactive Data Infrastructure for Sensor Networks       06/2014 – present

In this work I design and implement a cloud-based data store called WaggleDB that provides an interactive data infrastructure for sensor networks. It efficiently aggregates and stores data from sensor networks and enables users to query those data sets. Its multi-tier architecture allows each tier to scale independently.

·     FusionFS: a Fully Distributed File System                                                                      09/2010 – present

FusionFS is a new distributed filesystem that co-exists with current parallel filesystems in High-End Computing. It’s a user-level filesystem that runs on the compute resource infrastructure, and enables compute nodes to participate in the metadata and data management. Distributed metadata management is implemented with ZHT. 


PAST RESEARCH PROJECTS

·        Versatile Consistency Model Integration in Distributed Key-Value Store                                    Fall 2013

Because different replication consistency models have very different performance impacts, and applications can tolerate different levels of data inconsistency, we implemented a series of consistency models for ZHT in this project. In ongoing work, we are enabling applications to specify the desired consistency level to optimize performance.

·        FRIEDA-State: Scalable State Management for Scientific Applications in the Cloud                    Summer 2013

This work proposes and implements a state management system (FRIEDA-State) for scientific applications running in cloud environments. Its design addresses the challenges of state management in cloud environments, and the work discusses various configurations.

·        A Survey on the Cost of Cloud Computing and Datacenters                                                        Spring 2012

Collected and analyzed cost data for building a datacenter as a private cloud service and for using a commercial cloud (Amazon EC2).

·        A Study on Checkpoint Files Similarity                                                                                           Summer 2010

This project aimed to understand the similarity between checkpoint files in order to reduce checkpoint/restart overhead in HPC. I analyzed checkpoint data by looking at skeleton data generated by BLCR (Berkeley Lab Checkpoint/Restart), and implemented an analysis tool for checkpoint file similarity based on dynamic programming.



PUBLICATIONS 

·        Tonglin Li, Ioan Raicu, Lavanya Ramakrishnan, Scalable State Management for Scientific Applications in the Cloud, IEEE Proceedings of the 3rd International Congress on Big Data (BigData2014), 2014

·        Tonglin Li, Kate Keahey, Rajesh Sankaran, Pete Beckman, Ioan Raicu, A Cloud-based Interactive Data Infrastructure for Sensor Networks, ACM/IEEE Supercomputing Conference Regular Research Poster, SC2014

·        Tonglin Li, Xiaobing Zhou, Ioan Raicu, et al. ZHT: A Light-weight Reliable Persistent Dynamic Scalable Zero-hop Distributed Hash Table, 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2013.

·        Tonglin Li, Raman Verma, Xi Duan, Hui Jin, Ioan Raicu. Exploring Distributed Hash Tables in High-End Computing, ACM SIGMETRICS Performance Evaluation Review (PER), 2011

·        Tonglin Li, Xiaobing Zhou, Ioan Raicu, et al., A Convergence of Distributed Key-Value Storage in Cloud Computing and Supercomputing, IEEE Transactions on Services Computing, 2014, under review

·        Dongfang Zhao, Zhao Zhang, Xiaobing Zhou, Tonglin Li, Ke Wang, Dries Kimpe, Philip Carns, Robert Ross, and Ioan Raicu. FusionFS: Towards Supporting Data-Intensive Scientific Applications on Extreme-Scale High-Performance Computing Systems, IEEE International Conference on Big Data 2014

·        Ke Wang, Xiaobing Zhou, Tonglin Li, Dongfang Zhao, Michael Lang, Ioan Raicu. Optimizing Load Balancing and Data-Locality with Data-aware Scheduling, IEEE International Conference on Big Data 2014 


Technical Reports