Listen, Speak, Read, Write, and Watch
I am currently a Senior Staff Machine Learning Research Engineer at ByteDance Seed, working on multi-modal LLMs. Prior to that, I led the Cross Language Agent team. One of our products, VolcTrans, served as the official machine translation platform for ByteDance, providing large-scale, high-performance MT services for ByteDance products such as TikTok, CapCut, and Lark.
My research focuses on LLMs, multi-modal (speech and video) translation, and their large-scale real-world applications.
I have published peer-reviewed papers at ACL, EMNLP, NAACL, AAAI, and ICLR, among other top-tier conferences.
I have led my teams to win the news translation shared task at the Conference on Machine Translation (WMT) three times, taking first place in 2017, 2018, and 2021.
Email: cshanbo # gmail dot com (replace # with @, dot with .)
Our paper received an Outstanding Paper Award at ACL 2024!
ArXiv link: https://arxiv.org/abs/2405.12915
Pan, X., Huang, L., Kang, L., Liu, Z., Lu, Y., & Cheng, S. (2024). G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation. arXiv preprint arXiv:2405.12915.
We proposed CLASI, a Cross Language Agent for Simultaneous Interpretation, which (almost) achieves human parity on end-to-end simultaneous speech translation.
Details can be found at https://byteresearchcla.github.io/clasi/
Cheng, S., Huang, Z., Ko, T., Li, H., Peng, N., Xu, L., & Zhang, Q. (2024). Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent. arXiv preprint arXiv:2407.21646.
Pan, X., Huang, L., Kang, L., Liu, Z., Lu, Y., & Cheng, S. (2024). G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation. In Proceedings of the Association for Computational Linguistics: ACL 2024.
Cao, Z., Cao, Q., Lu, Y., Peng, N., Huang, L., Cheng, S., & Su, J. (2024). Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs. In Proceedings of the Association for Computational Linguistics: ACL 2024.
Li, J., Cheng, S., Huang, S., & Chen, J. (2024). MT-PATCHER: Selective and Extendable Knowledge Distillation from Large Language Models for Machine Translation. In Proceedings of the North American Chapter of the Association for Computational Linguistics: NAACL 2024.
Liu, Z., Sun, Z., Cheng, S., Huang, S., & Wang, M. (2023, November). Only 5% Attention Is All You Need: Efficient Long-range Document-level Neural Machine Translation. In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 733-743).
Li, J., Zhou, H., Huang, S., Cheng, S., & Chen, J. (2024). Eliciting the translation ability of large language models via multilingual finetuning with translation instructions. Transactions of the Association for Computational Linguistics, 12, 576-592.
Kumar, V. B., Cheng, S., Peng, N., & Zhang, Y. (2023, June). Visual Information Matters for ASR Error Correction. In ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1-5). IEEE.
Wang, Y., Sun, Z., Cheng, S., Zheng, W., & Wang, M. (2023, July). Controlling Styles in Neural Machine Translation with Activation Prompt. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 2606-2620).
Zhu, Y., Sun, Z., Cheng, S., Huang, L., Wu, L., & Wang, M. (2023, July). Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 2679-2697).
Zhu, Y., Wu, L., Cheng, S., & Wang, M. (2022, May). Unified Multimodal Punctuation Restoration Framework for Mixed-Modality Corpus. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7272-7276). IEEE.
Song, Z., Zhou, H., Qian, L., Xu, J., Cheng, S., Wang, M., & Li, L. (2022). switch-GLAT: Multilingual Parallel Machine Translation Via Code-Switch Decoder. In International Conference on Learning Representations.
Qian, L., Zhou, Y., Zheng, Z., Zhu, Y., Lin, Z., Feng, J., Cheng, S., Li, L., Wang, M., & Zhou, H. (2021). The Volctrans GLAT System: Non-autoregressive Translation Meets WMT21. In Proceedings of the Sixth Conference on Machine Translation (pp. 187-196).
Jiang, Q., Wang, M., Cao, J., Cheng, S., Huang, S., & Li, L. (2021, November). Learning Kernel-Smoothed Machine Translation with Retrieved Examples. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (pp. 7280-7290).
Wu, L., Cheng, S., Wang, M., & Li, L. (2021, August). Language Tags Matter for Zero-Shot Neural Machine Translation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 3001-3007).
Zhu, C., Yu, H., Cheng, S., & Luo, W. (2020, July). Language-aware Interlingua for Multilingual Neural Machine Translation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 1650-1655).
Weng, R., Yu, H., Huang, S., Cheng, S., & Luo, W. (2020, April). Acquiring Knowledge from Pre-trained Model to Neural Machine Translation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 34, No. 05, pp. 9266-9273).
Deng, Y., Cheng, S., Lu, J., et al. (2018). Alibaba's Neural Machine Translation Systems for WMT18. In Proceedings of the Third Conference on Machine Translation: Shared Task Papers (pp. 368-376).
Wang, Y., Cheng, S., Jiang, L., et al. (2017). Sogou Neural Machine Translation Systems for WMT17. In Proceedings of the Second Conference on Machine Translation (pp. 410-415).
Cheng, S., Huang, S., Chen, H., et al. (2016). PRIMT: A Pick-Revise Framework for Interactive Machine Translation. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 1240-1249).
Me on Google Scholar
Huang, Z., Ye, R., Ko, T., Dong, Q., Cheng, S., Wang, M., & Li, H. (2023). Speech Translation with Large Language Models: An Industrial Practice. arXiv preprint arXiv:2312.13585.
Li, J., Cheng, S., Sun, Z., Wang, M., & Huang, S. (2022). Better Datastore, Better Translation: Generating Datastores from Pre-Trained Models for Nearest Neural Machine Translation. arXiv preprint arXiv:2212.08822.
Sun, Z., Jiang, Q., Huang, S., Cao, J., Cheng, S., & Wang, M. (2022). Zero-shot Domain Adaptation for Neural Machine Translation with Retrieved Phrase-level Prompts. arXiv preprint arXiv:2209.11409.
Cheng, S., Kuang, S., Weng, R., Yu, H., Zhu, C., & Luo, W. (2020). AR: Auto-Repair the Synthetic Data for Neural Machine Translation. arXiv preprint arXiv:2004.02196.
Tang, X., Cheng, S., Do, L., et al. (2018). Improving Multilingual Semantic Textual Similarity with Shared Sentence Encoder for Low-resource Languages. arXiv preprint arXiv:1810.08740.
Program Committees: reviewer for ACL 2019/2020/2021, EMNLP 2022, NAACL 2021, AAAI 2023, and NeurIPS 2024, among others; Area Chair for IJCNLP-AACL 2023.