Deep Transfer Learning for Search and Recommendation

The Web Conference 2020 Tutorial

Tutorial Slides

www2020-DTL-Tutorial

Video Recording

T9_v2.mp4

Tutorial Time

2:00 - 4:30pm on April 21, 2020 (GMT+8)

[WWW 2020 Schedule]

Overview

Training data sparsity is a common problem for many real-world applications in the Search and Recommendation domains. Even applications with a lot of training data usually lack enough labeled data in the cold-start scenario. Transfer Learning is a promising approach to this problem: it bridges the generalization gap between related applications and the new one. Deep neural networks are increasingly adopted in Search and Recommendation applications. Deep models make it easy to share model structures and parameters across domains, and they can capture deep patterns hidden in complex feature interactions that traditional approaches might not be able to represent, so Transfer Learning integrates well with them. As a result, Deep Transfer Learning, which combines Deep Neural Networks with Transfer Learning techniques, has gained a lot of attention recently and has been successfully applied to many real-world applications.

This tutorial offers an overview of Deep Transfer Learning approaches in the Search and Recommendation domains from an industry perspective. We will first introduce the basic concepts of Transfer Learning and Deep Transfer Learning. We will then introduce the three major categories of Deep Transfer Learning techniques and review each category in depth, discussing concepts and examples from recent deep learning research and applications. Next, we will focus on recent developments of Deep Transfer Learning approaches in the context of Search and Recommendation. Finally, we will present an example from our own practice: using Deep Transfer Learning to model user behavior representations at LinkedIn.


Contributors

Yang Yang (Senior Staff Software Engineer at LinkedIn Inc., USA),

Sen Zhou (Senior Software Engineer at LinkedIn Inc., USA),

Jian Qiao (Senior Software Engineer at LinkedIn Inc., USA),

Bo Long (Engineering Director at LinkedIn Inc., USA),

Yanen Li (Engineering Manager at LinkedIn Inc., USA),

Mingyuan Zhong (Staff Software Engineer at LinkedIn Inc., USA)


Tutorial Outline

Part 1. Introduction

  • Overview of Search and Recommendation

  • Transfer Learning in Search and Recommendation

Part 2. Deep Transfer Learning

  • Preliminaries

  • Techniques of Deep Transfer Learning

  • Model Transfer

    • Sequential Training

    • Joint Training

  • Feature Representation Transfer

    • Domain Adaptation

      • Discrepancy-based

      • Adversarial-based

      • Reconstruction-based

  • Instance Transfer

    • Instance Transfer vs. Generative Data Augmentation

    • Instance Transfer with Feature Representation Transfer

Part 3. Deep Transfer Learning in Search and Recommendation

  • Introduction

  • Feature Transfer for Understanding and Representation

  • Model and Instance Transfer for Ranking

Part 4. Applications at LinkedIn

  • An End-to-end Example of Learning User Behavior Representation by Deep Transfer Learning at LinkedIn

Part 5. Conclusion

Presenters' Bios

Yang Yang is a Senior Staff Software Engineer and Tech Lead at LinkedIn. Before joining LinkedIn, Yang worked at Yahoo! Labs as a Scientist. She obtained her Ph.D. degree from the Department of Statistics, University of Michigan. She has produced various papers and patents on applying statistical methods and machine learning approaches to real-world problems involving large-scale data. She has published in conferences and journals including KDD, WWW, PAM, Statistical Analysis and Data Mining, The Canadian Journal of Statistics, IIE Transactions on Healthcare Systems Engineering, and Statistical Analysis for High-Dimensional Data.

Sen Zhou is a Senior Software Engineer at LinkedIn. Before joining LinkedIn, Sen obtained his Ph.D. degree from the Department of EECS, University of California, Irvine, working on data fusion and fault tolerance in wireless sensor networks. He has published in journals including IEEE Transactions on Computers and Service Oriented Computing and Applications.

Jian (Jack) Qiao is a Senior Software Engineer at LinkedIn. He graduated with Bachelor's and Master's degrees from the Department of EECS, University of California, Berkeley. He is a major contributor to multiple machine learning infrastructure projects at LinkedIn and currently works on the AI Features Foundation team, focusing on creating horizontal machine learning features for all of LinkedIn's search and recommendation verticals through state-of-the-art modeling and infrastructure technologies.

Bo Long leads LinkedIn’s AI Foundations team. He has 15 years of experience in data mining and machine learning with applications to web search, recommendation, and social network analysis. He holds dozens of innovations and has published peer-reviewed papers in top conferences and journals including ICML, KDD, ICDM, AAAI, SDM, CIKM, and KAIS. He has served as a reviewer, workshop co-organizer, conference organizing committee member, and area chair for multiple conferences, including KDD, NIPS, SIGIR, ICML, SDM, CIKM, and JSM.


References

Introduction

  • Ahuja, Aman, Nikhil Rao, Sumeet Katariya, Karthik Subbian, and Chandan K. Reddy. "Language-Agnostic Representation Learning for Product Search on E-Commerce Platforms." In Proceedings of the 13th International Conference on Web Search and Data Mining, pp. 7-15. 2020.

  • Pan, Sinno Jialin, and Qiang Yang. "A survey on transfer learning." IEEE Transactions on knowledge and data engineering 22, no. 10 (2009): 1345-1359.

  • Lawrence, Neil D., and John C. Platt. "Learning to learn with the informative vector machine." In Proceedings of the twenty-first international conference on Machine learning, p. 65. 2004.

  • S. J. Pan, I. W. Tsang, J. T. Kwok and Q. Yang, "Domain Adaptation via Transfer Component Analysis," in IEEE Transactions on Neural Networks, vol. 22, no. 2, pp. 199-210, Feb. 2011.

  • Dai, Wenyuan, Qiang Yang, Gui-Rong Xue, and Yong Yu. "Boosting for transfer learning." In Proceedings of the 24th international conference on Machine learning, pp. 193-200. 2007.

  • Zhang, Shuai, et al. "Deep learning based recommender system: A survey and new perspectives." ACM Computing Surveys (CSUR) 52.1 (2019): 1-38.

  • Tan, Chuanqi, et al. "A survey on deep transfer learning." International conference on artificial neural networks. Springer, Cham, 2018.

  • Ruder, Sebastian. "An overview of multi-task learning in deep neural networks." arXiv preprint arXiv:1706.05098 (2017).

  • Wang, Mei, and Weihong Deng. "Deep visual domain adaptation: A survey." Neurocomputing 312 (2018): 135-153.

Deep Transfer Learning

Model Transfer

  • Tan, Chuanqi, Fuchun Sun, Tao Kong, Wenchang Zhang, Chao Yang, and Chunfang Liu. "A survey on deep transfer learning." In International conference on artificial neural networks, pp. 270-279. Springer, Cham, 2018.

  • Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

  • He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Deep residual learning for image recognition." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770-778. 2016.

  • Yosinski, Jason, Jeff Clune, Yoshua Bengio, and Hod Lipson. "How transferable are features in deep neural networks?." In Advances in neural information processing systems, pp. 3320-3328. 2014.

  • Deng, Jia, et al. "Imagenet: A large-scale hierarchical image database." 2009 IEEE conference on computer vision and pattern recognition. Ieee, 2009.

  • Liu, Xiaodong, Pengcheng He, Weizhu Chen, and Jianfeng Gao. "Multi-task deep neural networks for natural language understanding." arXiv preprint arXiv:1901.11504 (2019).

  • Duong, Long, Trevor Cohn, Steven Bird, and Paul Cook. "Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser." In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp. 845-850. 2015.

  • Ma, Jiaqi, Zhe Zhao, Xinyang Yi, Jilin Chen, Lichan Hong, and Ed H. Chi. "Modeling task relationships in multi-task learning with multi-gate mixture-of-experts." In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1930-1939. 2018.

Feature Representation Transfer

  • Sharif Razavian, Ali, Hossein Azizpour, Josephine Sullivan, and Stefan Carlsson. "CNN features off-the-shelf: an astounding baseline for recognition." In Proceedings of the IEEE conference on computer vision and pattern recognition workshops, pp. 806-813. 2014.

  • Tzeng, Eric, Judy Hoffman, Ning Zhang, Kate Saenko, and Trevor Darrell. "Deep domain confusion: Maximizing for domain invariance." arXiv preprint arXiv:1412.3474 (2014).

  • Goodfellow, Ian, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. "Generative adversarial nets." In Advances in neural information processing systems, pp. 2672-2680. 2014.

  • Tzeng, Eric, et al. "Adversarial discriminative domain adaptation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.

  • Liu, Ming-Yu, and Oncel Tuzel. "Coupled generative adversarial networks." In Advances in neural information processing systems, pp. 469-477. 2016.

  • Tzeng, Eric, et al. "Simultaneous deep transfer across domains and tasks." Proceedings of the IEEE International Conference on Computer Vision. 2015.

  • Ganin, Yaroslav, et al. "Domain-adversarial training of neural networks." The Journal of Machine Learning Research 17.1 (2016): 2096-2030.

  • Ghifary, Muhammad, et al. "Deep reconstruction-classification networks for unsupervised domain adaptation." European Conference on Computer Vision. Springer, Cham, 2016.

  • Bousmalis, Konstantinos, et al. "Domain separation networks." Advances in neural information processing systems. 2016.

  • Sun, Baochen, Jiashi Feng, and Kate Saenko. "Return of frustratingly easy domain adaptation." Thirtieth AAAI Conference on Artificial Intelligence. 2016.

  • Long, Mingsheng, et al. "Learning Transferable Features with Deep Adaptation Networks." International Conference on Machine Learning. 2015.

Instance Transfer

  • Qu, Chen, et al. "Learning to selectively transfer: Reinforced transfer learning for deep text matching." Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. 2019.

  • Mino, Ajkel, and Gerasimos Spanakis. "LoGAN: Generating logos with a generative adversarial neural network conditioned on color." 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 2018.

  • Bo Wang, et al. 2019. A Minimax Game for Instance based Selective Transfer Learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19). Association for Computing Machinery, New York, NY, USA, 34–43.

  • Ge, Weifeng, and Yizhou Yu. "Borrowing treasures from the wealthy: Deep transfer learning through selective joint fine-tuning." In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1086-1095. 2017.

  • Chowdhury, Somnath Basu Roy, K. M. Annervaz, and Ambedkar Dukkipati. "Instance-based Inductive Deep Transfer Learning by Cross-Dataset Querying with Locality Sensitive Hashing." arXiv preprint arXiv:1802.05934 (2018).

Deep Transfer Learning For Search and Recommendation

  • Prakash, Abhay, and Dhaval Patel. "Techniques for Deep Query Understanding." arXiv preprint arXiv:1505.05187 (2015).

  • Guorui Zhou, Xiaoqiang Zhu, Chenru Song, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2018. Deep Interest Network for Click-Through Rate Prediction. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’18). Association for Computing Machinery, New York, NY, USA, 1059–1068.

  • deWet, Stephanie, and Jiafan Ou. "Finding Users Who Act Alike: Transfer Learning for Expanding Advertiser Audiences." Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2019.

  • Zhang, Shuai, et al. "Deep learning based recommender system: A survey and new perspectives." ACM Computing Surveys (CSUR) 52.1 (2019): 1-38.

  • Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018).

  • Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

  • Oquab, Maxime, et al. "Learning and transferring mid-level image representations using convolutional neural networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.

  • Misra, Ishan, et al. "Cross-stitch networks for multi-task learning." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.

  • Bansal, Trapit, David Belanger, and Andrew McCallum. "Ask the gru: Multi-task learning for deep text recommendations." Proceedings of the 10th ACM Conference on Recommender Systems. 2016.

  • Zhang, Yongfeng, et al. "Joint representation learning for top-n recommendation with heterogeneous information sources." Proceedings of the 2017 ACM on Conference on Information and Knowledge Management. 2017.

  • Cantador, Iván, et al. "Cross-domain recommender systems." Recommender systems handbook. Springer, Boston, MA, 2015. 919-959.

  • Elkahky, Ali Mamdouh, Yang Song, and Xiaodong He. "A multi-view deep learning approach for cross domain user modeling in recommendation systems." Proceedings of the 24th International Conference on World Wide Web. 2015.

  • Hu, Guangneng, Yu Zhang, and Qiang Yang. "Conet: Collaborative cross networks for cross-domain recommendation." Proceedings of the 27th ACM International Conference on Information and Knowledge Management. 2018.

  • Kanagawa, Heishiro, et al. "Cross-domain recommendation via deep domain adaptation." European Conference on Information Retrieval. Springer, Cham, 2019.

  • Ahuja, Aman, et al. "Language-Agnostic Representation Learning for Product Search on E-Commerce Platforms." Proceedings of the 13th International Conference on Web Search and Data Mining. 2020.

  • Bo Wang, Minghui Qiu, Xisen Wang, Yaliang Li, Yu Gong, Xiaoyi Zeng, Jun Huang, Bo Zheng, Deng Cai, and Jingren Zhou. A Minimax Game for Instance based Selective Transfer Learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD ’19). 2019.