Ce Ge (戈策)
About
I am currently a staff member at Tongyi Lab, Alibaba Group. My recent research focuses on building data-centric AI systems and exploring data-efficient, Sora-like video generation, covering topics such as data infrastructure, dataset curation, scaling laws, and multimodal learning.
Previously, I published research in visual recognition, including weakly supervised learning, object detection, and image retrieval. I have also worked on efficient AI methods such as neural architecture search (NAS) and pruning.
Contact:
cege.sci (at) gmail (dot) com
gece (at) foxmail (dot) com
Education
Ph.D., 2016 – 2022, School of Computer Science, Beijing University of Posts and Telecommunications. Supervisor: Prof. Jianxin Liao.
Visiting researcher, 2019 – 2020, SketchX Lab, University of Surrey. Advisor: Prof. Yi-Zhe Song.
B.E., 2012 – 2016, School of Computer Science, Beijing University of Posts and Telecommunications (BUPT).
Projects
Principal developer in the founding team, primarily responsible for multimodal operators, acceleration, and evaluation.
Led and organized the 2nd event in the Data-Juicer for LLMs series, the BetterMixture Challenge on Tianchi.
Use cases include Dianjin (点金) for financial analysis, Zhiwen (智文) for reading assistance, and PAI Designer on Alibaba Cloud.
A toolbox of neural architecture search methods, especially for TinyML (low-resource CPUs and MCUs).
Designed and implemented the PreNAS framework for once-for-all CNN/ViT model search (ICML '23).
Developed the QE-Score and the QBR algorithm for joint quantization and architecture search (NeurIPS '22).
Selected Publications
Data-centric AI
Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development. arXiv:2407.11784. [pdf]
Daoyuan Chen, Haibin Wang, Yilun Huang, Ce Ge, Yaliang Li, Bolin Ding, Jingren Zhou
Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining. arXiv:2405.14908. [pdf]
Ce Ge, Zhijian Ma, Daoyuan Chen, Yaliang Li, Bolin Ding
Data-Juicer: A One-Stop Data Processing System for Large Language Models. SIGMOD '24. [pdf] [code]
Daoyuan Chen, Yilun Huang, Zhijian Ma, Hesen Chen, Xuchen Pan*, Ce Ge* (co-second author), Dawei Gao*, Yuexiang Xie, Zhaoyang Liu, Jinyang Gao, Yaliang Li, Bolin Ding, Jingren Zhou
Efficient AI
PreNAS: Preferred One-Shot Learning Towards Efficient Neural Architecture Search. ICML '23. [pdf] [code]
Haibin Wang*, Ce Ge* (co-first author), Hesen Chen, Xiuyu Sun
Entropy-Driven Mixed-Precision Quantization for Deep Network Design. NeurIPS '22. [pdf] [code]
Zhenhong Sun, Ce Ge, Junyan Wang, Ming C. Lin, Hesen Chen, Hao Li, Xiuyu Sun
OICSR: Out-In-Channel Sparsity Regularization for Compact Deep Neural Networks. CVPR '19. [pdf] [code]
Jiashi Li, Qi Qi, Jingyu Wang, Ce Ge, Yujian Li, Zhangzhang Yue, Haifeng Sun
Visual Recognition (Detection & Retrieval)
Scene-Level Sketch-Based Image Retrieval with Minimal Pairwise Supervision. AAAI '23. [pdf]
Ce Ge, Jingyu Wang, Qi Qi, Haifeng Sun, Tong Xu, Jianxin Liao
Semi-transductive Learning for Generalized Zero-Shot Sketch-Based Image Retrieval. AAAI '23. [pdf]
Ce Ge, Jingyu Wang, Qi Qi, Haifeng Sun, Tong Xu, Jianxin Liao
Exploring Local Detail Perception for Scene Sketch Semantic Segmentation. TIP '22. [pdf] [code]
Ce Ge, Haifeng Sun, Yi-Zhe Song, Zhanyu Ma, Jianxin Liao
DLI-Net: Dual Local Interaction Network for Fine-Grained Sketch-Based Image Retrieval. TCSVT '22. [pdf]
Haifeng Sun, Jiaqing Xu, Jingyu Wang, Qi Qi, Ce Ge, Jianxin Liao
DLA-Net for FG-SBIR: Dynamic Local Aligned Network for Fine-Grained Sketch-Based Image Retrieval. MM '21. [pdf]
Jiaqing Xu, Haifeng Sun, Qi Qi, Jingyu Wang, Ce Ge, Lejian Zhang, Jianxin Liao
Towards Automatic Visual Inspection: A Weakly Supervised Learning Method for Industrial Applicable Object Detection. COMPUT IND '20. [pdf] [code]
Ce Ge, Jing Wang, Jingyu Wang, Qi Qi, Haifeng Sun, Jianxin Liao
Fewer is More: Image Segmentation Based Weakly Supervised Object Detection with Partial Aggregation. BMVC '18. [pdf]
Ce Ge, Jingyu Wang, Qi Qi, Haifeng Sun, Jianxin Liao