Projects

Reinforcement learning in sports scheduling

In this project, we designed a reinforcement learning environment to solve the single round-robin (SRR) tournament scheduling problem. The work is published as:

J. C. Peng, A. D. Clark and A. Dahbura, "Introducing Human Corrective Multi-Team SRR Sports Scheduling via Reinforcement Learning," 2021 IEEE Symposium Series on Computational Intelligence (SSCI), 2021, pp. 1-7, doi: 10.1109/SSCI50451.2021.9660171.

My contributions to this project included:

  • Co-designed the Markov decision process (MDP) formulation of the single round-robin tournament scheduling problem

  • Designed innovative backtracking actions in the action space, which significantly increased robustness

  • Developed a custom reinforcement learning environment for round-robin tournament scheduling problems with OpenAI Gym

  • Trained reinforcement learning agents with the ACKTR algorithm and successfully generated sample schedules with the minimum number of breaks in single round-robin cases
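
To make the notion of a "break" concrete, here is a minimal sketch in plain Python. It is not the environment from the paper: the schedule representation (a list of rounds, each a list of (home, away) pairs), the function names, and the four-team example are all assumptions made for illustration. A break is counted whenever a team plays two consecutive home games or two consecutive away games.

```python
from itertools import combinations


def home_away_patterns(schedule):
    """Map each team to its sequence of 'H'/'A' labels, one per round."""
    patterns = {}
    for round_games in schedule:
        for home, away in round_games:
            patterns.setdefault(home, []).append("H")
            patterns.setdefault(away, []).append("A")
    return patterns


def count_breaks(schedule):
    """Count consecutive home-home or away-away pairs over all teams."""
    return sum(
        1
        for seq in home_away_patterns(schedule).values()
        for prev, cur in zip(seq, seq[1:])
        if prev == cur
    )


def is_single_round_robin(schedule, teams):
    """Check that every team plays once per round and meets every other team once."""
    for round_games in schedule:
        playing = [team for game in round_games for team in game]
        if sorted(playing) != sorted(teams):
            return False
    met = sorted(tuple(sorted(game)) for rnd in schedule for game in rnd)
    return met == sorted(combinations(sorted(teams), 2))


# Hypothetical hand-built schedule for four teams; each pair is (home, away).
EXAMPLE_SCHEDULE = [
    [(1, 4), (3, 2)],
    [(3, 1), (2, 4)],
    [(1, 2), (4, 3)],
]

if __name__ == "__main__":
    print(is_single_round_robin(EXAMPLE_SCHEDULE, [1, 2, 3, 4]))
    print(count_breaks(EXAMPLE_SCHEDULE))
```

A break counter of this kind could serve as a penalty term in a scheduling reward, although the reward actually used in the paper may be defined differently.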

A new effective InsurTech tool: CIBer

In this project, we designed a novel supervised learning model named the Comonotone-Independence Bayesian Classifier (CIBer). The manuscript is attached.

A_new_effective_InsurTech_tool__CIBer.pdf

My contributions to this project included:

  • Co-designed and developed a novel supervised learning model named the Comonotone-Independence Bayesian Classifier (CIBer), whose core techniques include:

    • Using a hierarchical clustering algorithm to search for the optimal partition of the predictive features

    • Using comonotonicity to model the dependence structure and to estimate conditional joint probabilities

  • Co-designed and implemented the Joint Encoding algorithm, which encodes multiple categorical variables as numerical values while preserving their internal relationships

  • Compared CIBer with classical supervised learning models, including logistic regression, SVM, decision tree, MLP, KNN, and naïve Bayes, on several insurance-related datasets, obtaining out-of-sample accuracy more than 10% higher and AUC more than 0.08 higher than any other model on every dataset

  • Extended the single-classifier model to a more robust version based on ensemble learning
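
The comonotonicity idea above can be illustrated with a short sketch. This is not the CIBer implementation: the function names and the toy data are assumptions, and only the comonotonic ingredient is shown, without the clustering step or the class-conditional setup. If features are comonotonic, they can all be written as increasing functions of one shared uniform variable, so the probability that every feature falls in its interval (a_i, b_i] is max(0, min_i F_i(b_i) - max_i F_i(a_i)), where F_i is the marginal CDF of feature i. The independence product, as used by naïve Bayes, is included for comparison.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return the empirical CDF of a one-dimensional sample as a function."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        # Fraction of observations less than or equal to x.
        return bisect_right(ordered, x) / n

    return cdf


def comonotonic_box_probability(cdfs, lowers, uppers):
    """P(lower_i < X_i <= upper_i for all i) under comonotonicity."""
    upper = min(F(b) for F, b in zip(cdfs, uppers))
    lower = max(F(a) for F, a in zip(cdfs, lowers))
    return max(0.0, upper - lower)


def independent_box_probability(cdfs, lowers, uppers):
    """Same probability under independence (product of the marginals)."""
    prob = 1.0
    for F, a, b in zip(cdfs, lowers, uppers):
        prob *= F(b) - F(a)
    return prob


# Toy data (assumed): two features that increase together perfectly.
feature_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
feature_2 = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
cdfs = [empirical_cdf(feature_1), empirical_cdf(feature_2)]

if __name__ == "__main__":
    lowers, uppers = [2, 30], [6, 80]
    print(comonotonic_box_probability(cdfs, lowers, uppers))
    print(independent_box_probability(cdfs, lowers, uppers))
```

In this toy example, three of the ten observations fall in the box, which the comonotonic estimate recovers while the independence product underestimates it. In a classifier, estimates of this kind would be computed separately within each class.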

Approximation to the optimal strategy in the Mozart Cafe problem by simultaneous perturbation stochastic approximation

This is my master's thesis project at Johns Hopkins University. In this project, I designed an approximation algorithm to search for the optimal strategy in the general Mozart Cafe problem. The manuscript is attached.


Thesis.pdf

My contributions to this project included:

  • Designed a novel k-Markovian parameterization technique as a generic representation of any symmetric rendezvous search strategy in the Mozart Cafe problem

  • Developed Monte Carlo simulation programs in Python to estimate the expected rendezvous time of any symmetric rendezvous search strategy parameterized by the k-Markovian technique

  • Applied the Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm to parameter optimization in the k-Markovian model and reproduced the optimal strategies in small-size cases
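
For readers unfamiliar with SPSA, the following is a minimal sketch of the basic algorithm, not the code from the thesis. The function name, the gain constants, and the toy quadratic loss are assumptions; in the thesis setting the loss would instead be a Monte Carlo estimate of the expected rendezvous time. The key property is that each iteration needs only two loss evaluations, regardless of the number of parameters, because all coordinates are perturbed simultaneously by a random ±1 vector.

```python
import random


def spsa_minimize(loss, theta0, iterations=2000, a=0.2, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimize `loss` with basic SPSA, using two evaluations per step."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Rademacher perturbation: each coordinate is +1 or -1.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = loss(plus) - loss(minus)
        # Simultaneous-perturbation gradient estimate, then a descent step.
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta


# Toy stand-in objective (assumed): a quadratic with a known minimizer.
TARGET = [1.0, -2.0, 0.5]


def toy_loss(theta):
    return sum((t - s) ** 2 for t, s in zip(theta, TARGET))


if __name__ == "__main__":
    print(spsa_minimize(toy_loss, [0.0, 0.0, 0.0]))
```

The exponents 0.602 and 0.101 are commonly recommended defaults for the gain sequences; the remaining constants usually need tuning for the problem at hand, especially when the loss is noisy.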