Tsz Kin Lam: MT Quality Estimation for e-Commerce: Automatically Assessing the Quality of Machine Translated Titles


We experimented with Siamese network and predictor-estimator models for assessing the quality of machine-translated listing titles in e-Commerce. The Siamese network is a metric-learning approach trained directly on quality estimation (QE) data. In contrast, the predictor-estimator uses additional bitext in training and achieves state-of-the-art results on the WMT 2017 shared task. On in-house e-Commerce data, however, the predictor-estimator model does not outperform the Siamese network, even when pre-trained on a large amount of in-domain bitext. The Siamese-network-based QE model can be improved further by injecting a feature based on the log-likelihood of an in-domain Transformer NMT model, which yields a gain of more than 3% in F1-score on translated listing titles.
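
To make the idea of a Siamese scorer with an injected log-likelihood feature more concrete, here is a minimal sketch. It is not the model from the talk. The hashed character-trigram encoder, the weights `w_sim`, `w_ll` and `bias`, the length normalisation, and the function names are all hypothetical choices made for illustration. A real system would learn the shared encoder and the weights from QE data and would take the log-likelihood from an actual NMT model.

```python
import math
import zlib
from typing import List

DIM = 64  # size of the toy embedding space (assumption, not from the talk)


def encode(text: str) -> List[float]:
    """Toy shared encoder: hashed character-trigram counts, L2-normalised.

    A trained Siamese network would use a learned encoder here; the key
    point is that the SAME function is applied to both inputs.
    """
    vec = [0.0] * DIM
    padded = f"  {text.lower()} "
    for i in range(len(padded) - 2):
        bucket = zlib.crc32(padded[i:i + 3].encode("utf-8")) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two already L2-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))


def qe_score(source: str, translation: str, nmt_logprob: float,
             w_sim: float = 4.0, w_ll: float = 1.0, bias: float = -2.0) -> float:
    """Combine Siamese similarity with an NMT log-likelihood feature.

    nmt_logprob is assumed to be the total log-probability that an NMT
    model assigns to the translation; it is length-normalised here so that
    long titles are not penalised (an assumption of this sketch).
    """
    sim = cosine(encode(source), encode(translation))
    n_tokens = max(len(translation.split()), 1)
    ll_feature = nmt_logprob / n_tokens
    z = w_sim * sim + w_ll * ll_feature + bias
    return 1.0 / (1.0 + math.exp(-z))  # squash to a (0, 1) quality score


if __name__ == "__main__":
    src = "red leather shoes size 42"
    hyp = "rote Lederschuhe Größe 42"
    print(qe_score(src, hyp, nmt_logprob=-3.5))
```

Thresholding the returned score would give a binary good/bad decision, which is the kind of output an F1-score evaluates.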

Paper: [download PDF]


Tsz Kin Lam is a first-year PhD student at StatNLP, Heidelberg University, under the supervision of Prof. Stefan Riezler. He is enthusiastic about human-centered AI, multi-agent and Bayesian learning with applications in NLP tasks. Before starting his PhD, he received a master's degree in Scientific Computing from Heidelberg University and worked at eBay Research in Aachen and at SAP Deutschland in Walldorf.