In many real-world applications, collecting sufficient labeled data in a new domain of interest is expensive and time-consuming. Rather than labeling from scratch at great cost, one may prefer to reuse existing, well-explored data from other domains, referred to as “auxiliary domains” or “source domains”, to help the learning task in the new domain (referred to as the “target domain”). However, traditional learning methods cannot be applied directly to learn a precise model for the target domain from source-domain data, because data from different sources may have different statistical properties. Transfer learning (TL) offers a promising solution and has attracted growing attention over the last two decades. In particular, it has been applied successfully in many areas, such as text mining, video event recognition, sensor-based prediction problems, software engineering, image categorization, and so forth.
One of the most challenging problems in TL is how to reduce the difference in data distributions between domains. Many approaches have been proposed in this direction. For instance, some work focuses on the domain adaptation problem, in which the source and target domains have different marginal distributions but share the same conditional distribution. Other work focuses on the inductive transfer learning or multi-task learning problem, in which the conditional distributions of the data or the predictive tasks of the source and target domains are usually different. Still other work addresses further TL scenarios, including multi-source domain adaptation, one-shot learning, zero-shot learning, etc.
With advances in data storage and Internet technology, data have become more massive, noisier, and more complex. For instance, the Internet itself is a vast and rich database. Internet data may carry certain structure (e.g., social network data), may be only weakly labeled (e.g., videos and images crawled with search engines), and may be very large in scale. Moreover, it is desirable to exploit data of different formats and structures from multiple sources to further improve learning in the target domain (e.g., jointly using web images, web videos, and social network data to categorize consumer videos or images). These new settings bring both opportunities and challenges for TL: how to make practical use of massive data and how to deal effectively with data from different domains both need to be addressed in this era of big data.
The main purpose of this workshop is to document recent progress in transfer learning across different real-world applications and to stimulate discussion of potential challenges that may open new directions for TL. We welcome not only manuscripts dedicated to solving traditional transfer learning problems, but also those that discuss approaches and/or theories for handling the new TL issues that arise when exploiting massive data of different formats or structures.