Deep Learning Engineering Life Cycle

  • A feedback loop of creating, evaluating, and updating the data and models in a deep learning process.

  • Deep Learning Engineering Life Cycle Management, the process engineering of deep learning, is an emerging systems research topic.

  • Participants: Dongkyu Lee, Le Van Duc, Juhyun Lee, Yunsu Kim, Junha Chun

Related Projects
Development of Video Data Management Platform for Autonomous Vehicles
Multi-Agent System Problems on Edge AI Network
Predictive Maintenance for SK Magic Appliances


Specification

Introduction

Our research has focused on system software for managing data. A classical database management system (DBMS) stores data and executes queries consistently; its defining feature is transactional processing of data. To provide this feature, everything from software protocols to hardware characteristics is exploited: concurrency control, cache-conscious layouts, latch-free algorithms, distributed commit protocols, logging/recovery, and so on. Today, the deep learning approach, which requires large volumes of data, has become the mainstream of AI research, and the process engineering of deep learning can be an emerging systems research topic.

Deep Learning Engineering Life Cycle (DELC)

Figure 1 shows the usual steps of a deep learning process. These steps produce various DL models that solve given tasks using datasets. A model pre-trained on a large dataset is commonly transferred to another task, and the resulting model can in turn be used for further versions or tasks. Our goal is the research and development of system software that addresses the problems arising in each step of the DL process; in that sense, the work falls into the category of process engineering [*]. The key idea of our system is to minimize redundant operations and data transfer by bringing DL Life Cycle operations inside the process boundary of the data management system.

Figure 1. Deep Learning Process

DELC Problems

We are interested in the following research questions.

Data Preparation

Data preparation encompasses collecting and labeling datasets for training and testing deep learning models, as well as data validation and cleaning.

  • Missing data: how can we handle data records that are missing some of their features?

  • Data distribution and noisy data: is the data distributed evenly, and how should noisy samples be handled?

  • Exploration of datasets: which datasets are suitable for training a given new task?

  • Active learning: which of the newly acquired data should be labeled first?
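As one concrete instance of the active-learning question, a common baseline is uncertainty sampling: label the examples the current model is least sure about. The sketch below assumes a hypothetical predictor whose features already map to probabilities:

```python
# Toy sketch of uncertainty-based active learning: pick the unlabeled
# examples whose predicted probability is closest to 0.5.

def predict_proba(x):
    """Hypothetical model confidence for class 1, in [0, 1]."""
    return x  # stand-in: assume the feature already is a probability

def select_for_labeling(pool, budget):
    """Rank the unlabeled pool by uncertainty (|p - 0.5|), take the top-k."""
    ranked = sorted(pool, key=lambda x: abs(predict_proba(x) - 0.5))
    return ranked[:budget]

pool = [0.05, 0.48, 0.93, 0.51, 0.20]
print(select_for_labeling(pool, 2))  # → [0.51, 0.48]
```

Items near 0.5 carry the most information per label, so labeling them first tends to improve the model fastest under a fixed annotation budget.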

Training

  • Scalability: can training operations scale in parallel?

  • Federated learning: how can convergence be guaranteed in a data-parallel, distributed setting?

  • Continual learning: how can we train a model without forgetting previous training phases?
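The federated-learning question can be made concrete with federated averaging (FedAvg): each client trains on its local data, and a server averages the resulting parameters weighted by client dataset size. In this sketch the model is a plain list of floats and the local update rule is a placeholder, not a real optimizer:

```python
# Minimal FedAvg sketch; local_update is a hypothetical stand-in
# for a client's local training step.

def local_update(weights, data):
    """Placeholder local step: nudge each weight by the data mean."""
    mean = sum(data) / len(data)
    return [w + 0.1 * mean for w in weights]

def fed_avg(global_weights, client_datasets):
    """Average client updates, weighted by each client's data size."""
    updates = [(local_update(global_weights, d), len(d))
               for d in client_datasets]
    total = sum(n for _, n in updates)
    return [sum(w[i] * n for w, n in updates) / total
            for i in range(len(global_weights))]

clients = [[1.0, 1.0], [3.0]]          # two clients, uneven data sizes
new_w = fed_avg([0.0, 0.0], clients)   # one round of federated averaging
```

The weighting by dataset size is what makes the aggregate equivalent to a step on the pooled data under idealized assumptions; the open question is convergence when client data are non-IID.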

Test

The testing process determines whether the model is ready to deploy.

  • Model quality: accuracy, training cost, inference cost, model size, robustness, etc.

  • System quality[*]: safety, security, reliability, scalability, performance, fairness, usability, etc.

  • Test dataset generation: how are data evaluated and datasets generated for the target quality? [*]
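A minimal form of the test stage is a quality gate: compute a metric on held-out data and only admit the model to deployment if it clears a threshold. The 0.9 threshold below is purely illustrative:

```python
# Sketch of a test-stage quality gate on held-out predictions.

def accuracy(predictions, labels):
    """Fraction of predictions that match the held-out labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def ready_to_deploy(predictions, labels, threshold=0.9):
    """Gate deployment on a target accuracy (threshold is illustrative)."""
    return accuracy(predictions, labels) >= threshold

preds  = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
labels = [1, 0, 1, 1, 0, 1, 1, 0, 0, 1]
print(accuracy(preds, labels))         # → 0.9
print(ready_to_deploy(preds, labels))  # → True
```

In practice the gate would combine several of the model- and system-quality axes listed above (cost, robustness, fairness, ...), not accuracy alone.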

Deploy

Deployment applies the model to real-world applications, including collecting operational logs for analysis.

  • Model conversion: compressing the neural architecture, quantizing parameters, and changing operations

  • Continuous deployment: deploying to the target devices efficiently and continuously
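To make the model-conversion step concrete, here is a toy version of one of its techniques, post-training quantization: mapping float weights to 8-bit integers with a single linear scale. Real converters are far more involved (per-channel scales, calibration, operator rewriting); this only shows the core arithmetic:

```python
# Toy symmetric linear quantization of a weight vector.

def quantize(weights, num_bits=8):
    """Map floats to signed integers in [-(2^(b-1)-1), 2^(b-1)-1]."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original floats."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.0, 1.27]
q, s = quantize(w)
print(q)                 # integer weights, one byte each at 8 bits
print(dequantize(q, s))  # close to the original floats
```

The payoff is a 4x size reduction versus float32 at the cost of rounding error, which is why quantized models must be re-validated in the test stage before deployment.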

Version Control

Version control is a class of systems responsible for managing changes to computer programs, documents, large web sites, or other collections of information [*].

In the DELC context, data and models are the main targets of version management.

  • Data versions: train/test splits, data corrections, new acquisitions, validation, etc. create new versions of the data

  • Model versions: training, testing, and deployment create new versions of the model

  • Execution log: designing a schema to capture the meta-information about changes to data and models
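One way to think about the execution-log question is to sketch the schema directly. The field names below are illustrative, not a fixed design:

```python
# Illustrative execution-log schema for DELC versioning;
# all field names are hypothetical.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

@dataclass
class DataVersion:
    dataset_id: str
    version: int
    operation: str                       # e.g. "acquisition", "correction"
    parent_version: Optional[int] = None # lineage back to the prior version

@dataclass
class ModelVersion:
    model_id: str
    version: int
    operation: str                       # e.g. "train", "test", "deploy"
    data_version: Optional[int] = None   # which data version produced it

@dataclass
class ExecutionLog:
    entries: List = field(default_factory=list)

    def record(self, entry):
        """Append a timestamped version event."""
        self.entries.append((datetime.now(timezone.utc), entry))

log = ExecutionLog()
log.record(DataVersion("imgs", 1, "acquisition"))
log.record(ModelVersion("detector", 1, "train", data_version=1))
print(len(log.entries))  # → 2
```

Linking each model version to the data version that produced it is what makes the life cycle reproducible: any deployed model can be traced back through the log to the exact data it was trained on.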

References