Research

Learning + Decision Making

Dynamic decision making has been exhaustively studied and widely used in many areas such as wireless networking. The typical dynamic decision algorithms require the system statistics. However, in many practical systems, e.g., online advertising and crowdsourcing, such statistics could be unavailable when deploying the system or changing over time. Learning is necessary to collect this information, and exploration-exploitation tradeoff should be carefully design to avoid significantly scarifying performance. We study the techniques of combining learning and decision making algorithms for such cold-start or dynamic systems. In particular, we are interested in efficient algorithms for systems with certain practical issues such as budget constraint and many features.

Publications:

[1] Huasen Wu, R. Srikant, Xin Liu, and Chong Jiang, Algorithms with Logarithmic or Sublinear Regret for Constrained Contextual Bandits, NIPS 2015, Montreal, Canada.

As a starting point, we study the problem with the bandit setup. We propose a framework (see the right figure) for contextual bandits with budget constraints. Simple algorithms based on UCB (Upper-Confidence-Bound) and ALP (Adaptive-Linear-Programming) are proposed and shown to achieve logarithmic regret, which matches the lower bound and is order-optimal. To the best of our knowledge, this is the first work that shows how to achieve logarithmic regret in constrained contextual bandits.

[2] Huasen Wu and Xin Liu, Double Thompson Sampling for Dueling Bandits, NIPS 2016, Barcelona, Spain

We propose the first Thompson Sampling algorithm, referred to as Double Thompson Sampling (D-TS), for general Copeland dueling bandits, including the Condorcet dueling bandit as a special case. This double sampling structure has both practical and theoretical advantages. Practically, this structure enables fully utilization of Thompson sampling so that it can significantly reduce the regret. Moreover, as a randomized policy, DTS is more robust to the preference probabilities as well as feedback delay. Theoretically, this double sampling structure allows us to obtain the first theoretical bounds for Thompson sampling in dueling bandits.

ALP-UCB

D-TS

Task Scheduling in SQL DataSync

This is the project I did when I worked as a Research Intern in Wireless and Networking Group, Microsoft Research Asia. This is a collaborative projected with Microsoft China, Shanghai. My responsibility was to analyze the trace collected in the trial stage for identifying the performance bottleneck. I designed the delay-minimizing scheduling policy without knowledge of processing time and with preemption cost.The team was awarded for Excellence in Collaboration.

Flow-level Scheduling in Wireless Networks

This is another project I did when I worked at Microsoft Research Asia, and is also the theme of my PhD dissertation. We consider scheduling algorithms for for wireless networks with flow-level dynamics and deadlines. This problem is challenging due to the coupling effect brought by the deadline constraints. We proposed a multi-temporal-scale scheduling framework, and design algorithms for the small-scale and large-scale scheduling problems. We investigated the asymptotic behavior in the many-source regime, studied decomposition techniques and designed distributed scheduling policies. Three conference papers and one journal paper were published on this topic and another journal paper is under submission..

Publications:

[1] Huasen Wu, Xin Liu, and Youguang Zhang, “Laxity-based opportunistic scheduling with flow-level dynamics and deadlines,” IEEE WCNC, April 2013, Shanghai.

[2] Huasen Wu, Xiaojun Lin, Xin Liu, and Youguang Zhang, “Application-level scheduling with Deadline Constraints,” IEEE Infocom 2014, Toronto, Canada, 2014.

[3] Huasen Wu, Xiaojun Lin, Xin Liu, Kun Tan, and Yongguang Zhang, “Decompositionof Large Scale MDPs for Wireless Scheduling with Load- and Channel-Awareness,” 2014 Information Theory and Applications (ITA) Workshop, San Diego, 2014 (invited).

[4]Huasen Wu, Xiaojun Lin, Xin Liu and Youguang Zhang, “Application-Level Scheduling with Probabilistic Deadline Constraints”, IEEE/ACM Transactions on Networking, to appear. (Extension of the AppSchd paper at Infocom'14)

[5] Huasen Wu, Xiaojun Lin, Xin Liu, Kun Tan, and Yongguang Zhang, “CoSchd: Coordinated Scheduling with Channel- and Load-Awareness for Alleviating Cellular Congestion,” submitted to IEEE/ACM Transactions on Networking. (Extension of the CoSchd paper at ITA'14)

Random Access Control for M2M communications

This is a project I did when I visited UC Davis in 2011. I studied random access schemes for event-driven M2M communications. I proposed a Fast Adaptive S-ALOHA (FASA) scheme, and investigated the stability of the scheme with drift analysis. The results have been published in IEEE VTC-2012Fall and IEEE/ACM Transactions on Networking.

Publications:

[1] Huasen Wu, Chenxi Zhu, Richard La, Xin Liu, and Youguang Zhang, “Fast adaptive S-ALOHA scheme for event-drivenmachine-to-machine communications,” IEEE VTC 2012 Fall, Sept. 2012.

[2] Huasen Wu, Chenxi Zhu, Richard La, Xin Liu, and Youguang Zhang, “FASA: Accelerated S-ALOHA Using Access History for Event-driven M2M Communications,” IEEE/ACM Trans. on Networking, Vol. 21, no. 6, Dec. 2013.

Simulation Platform for Air-interface Protocol of RFID

This is a project I managed as a graduate research assistant at Beihang University. The project was to support the standardization of RFID air interface protocol in China. I planed the development process, designed the system architecture for the platform based on C++, designed and implemented the subsystem of MAC layer simulation. I learn a lot during this project, including both project management and coding techniques.

I really appreciate the trust of my advisor, Prof. Youguang Zhang, and many thanks to the group members, Jiqiang Tang, Xiaojie Ju, Guangshan Zhang, Bo Zhao, Jingjing Fang, Xiaoya Li, Zhaohao Wang, etc.