Research Interests: Foundation Models | Transformer++ Architecture | Efficient Pre-training | Knowledge Distillation