The First International Workshop on Sharing and Reuse of AI Work Products
As more AI technologies are used in industry applications, we anticipate that work products of AI will be shared and reused widely. Building AI-based systems involves creating and generating different types of “AI work products” such as training data sets, pretrained models, and AI-generated art such as paintings and music. These work products are often the result of heavy investment of human, data, and computing resources, and should carry some form of intellectual property. Rules and best practices for sharing and reusing these work products are not yet well agreed upon. For the healthy proliferation of AI technologies in our society, we believe now is the time to start discussing these issues. For example, when Company A applies a machine learning algorithm to a training data set prepared by Company B and generates a pretrained model for recognizing objects in images, who owns this pretrained model? How should Company C, which wants to reuse this pretrained model by fine-tuning it to its own domain, compensate the owner of the original model? These questions are related to various technical, legal, political, and practical issues.
This workshop is intended to be the first in a series of discussions on the issues of sharing and reuse of AI work products.
9:00-9:10 Opening by Prof. Jun-ichi Tsujii, AIST
9:10-10:00 Keynote “Machine Learning Engineering and Reuse of AI Work Products,” by Hiroshi Maruyama, Preferred Networks, Inc. (pdf)
- Abstract: The process of developing machine learning (ML)-based systems differs in many respects from the process of developing conventional software systems, where the accumulated knowledge of software engineering can serve as a guide. As more ML-based systems are developed and deployed, we argue that we should establish a discipline that guides the development process of such systems. This talk is intended to set the goal of this new discipline and discuss the areas of interest, focusing especially on the reuse of the artifacts of ML-based systems.
10:30-12:30 Session I -- Privacy, Intellectual Property, and Infrastructure, chaired by Hiroshi Maruyama
- Invited Talk, Chris Culnan, “The fallacy of de-identification and its impact on data sharing” (slides)
- Abstract: The sharing of data, whether it be open data or commercial, is increasingly reliant on de-identification as a way to protect the privacy of the data subjects. In this talk we discuss why we believe de-identification to be a fallacy, with reference to recent examples of where it has failed, including the MBS/PBS dataset. We will look at how the failure of de-identification could impact on future data exchanges, and discuss how increased awareness of re-identification could impact on the public perception of data sharing. We will discuss some of the techniques that allow data to be safely shared, and how they restrict the type and nature of the data that can be released, and the impact that might have in the future.
- “Copyright Issues on Artificial Intelligence and Machine Learning,” Tatsuhiro Ueno, Waseda Univ. (pdf, slides)
- Abstract: Artificial intelligence (AI) has recently been raising intellectual property (IP) issues, including the copyrightability of AI-generated works, pre-trained models, and training data sets. AI-generated works are not eligible for copyright protection in most countries, including Japan, because they lack a human author’s intellectual creation. Therefore, it is now under discussion in Japan whether it is necessary to introduce some sort of legal protection for AI/computer-generated works in order to protect the investment in them. Also, pre-trained models and training data sets can be protected by copyright, as long as they are creative and are considered not AI-generated works but a human author’s own intellectual creation. Additionally, it should be noted that the Japanese Copyright Act has an explicit provision on a copyright exception for text and data mining (Art. 47septies), under which it is allowed to copy any work for the purpose of machine learning. This provision is quite helpful for machine learning and for facilitating the technological development of AI, since it applies to text and data mining not only for non-commercial purposes but also for commercial purposes. Hence, Japan is a paradise for machine learning.
- “Infrastructures for Sharing and Reusing AI Work Products,” Hideki Asoh, Ryoichi Sugimura, and Jun-ichi Tsujii (pdf, slides)
- Abstract: For a long time, artificial intelligence technologies were unable to achieve sufficient performance because of the knowledge acquisition bottleneck. In recent years, however, the performance of AI has improved dramatically thanks to great advances in machine learning with large amounts of data, and it is strongly expected that AI will be applied to real problems in various social fields. In order to accelerate the implementation of recent AI technologies in society, sharing and/or reusing AI work products such as well-prepared data and good trained models is very important. A powerful computational and engineering infrastructure for producing a variety of work products is also indispensable. In this presentation, we briefly survey the current status of such infrastructures. We also introduce our future plan for constructing a public infrastructure for sharing and reusing AI work products.
14:00-16:00 Session II -- Reuse and Sharing, chaired by Hideki Asoh
- Invited Talk by Viktor Gyenes, "Neural Network Exchange Format for the Deployment of Trained Networks to Inference Engines" (slides)
- Abstract: Neural networks are successfully being used to solve difficult tasks in image, audio and text processing. Several deep learning frameworks are available for the research community to train such networks. Neural networks require massive amounts of computational power; therefore, chip vendors are working on new hardware solutions to accelerate computations, with accompanying libraries to drive their hardware. Low-power embedded hardware solutions are needed in industrial segments, such as autonomous driving, that heavily utilize pattern recognition. Network inference libraries need to be able to digest the products of various deep learning frameworks in order to deploy networks to devices across multiple platforms, and the market is in danger of fragmenting. The Khronos Group, an open consortium of leading hardware and software companies, has authored various industrial compute standards, and is now working on developing a Neural Network Exchange Format to facilitate the deployment of trained neural networks from deep learning frameworks to hardware-accelerated inference engines. The goal of the exchange format is to describe neural network structure and data in a unified way, with standardized semantics, that can be exported from deep learning frameworks and is easy for inference engines to digest. The talk will give an overview of the standardization activities and describe the general design principles of the planned exchange format, along with use cases and expected industry outcomes.
- “GHELIA Federation,” Ryo Shimizu, UEI
- Abstract: GHELIA Federation, jointly developed by Sony Computer Science Laboratories’ Hiroaki Kitano and UEI Corporation’s Ryo Shimizu, offers a P2P-based GPU resource sharing platform. Now available as free, open-source software, its blockchain-esque distributed database design allows efficient sharing of computing resources. In addition, Sony CSL and UEI have founded GHELIA Incorporated in order to further promote this project and combat the shortage of GPU resources worldwide.
- Junichi Tsujii, Director, AI Center at National Institute of Advanced Industrial Science and Technology (AIST)
- Joi Ito, Director of MIT Media Lab
- Hiroaki Kitano, Sony Computer Science Laboratories, Inc.
- Hiroshi Maruyama, CSO, Preferred Networks, Inc.
- Magnus Rattray, Manchester University
- Ryo Shimizu, President and CEO, UEI, Inc.
- Tatsuhiro Ueno, Professor, Waseda Univ.
- Hideki Asoh, AIST
All questions about submissions should be emailed to Hiroshi Maruyama (Program Chair, firstname.lastname@example.org).