Window Transformer based Anomaly Detection in Ethereum Validator
: Detects anomalies in a time-series big-data environment by using Attention to learn both the dependencies between windows of the time series and the correlations among multivariate sensors
This research proposes Window Transformer based Anomaly Detection (WTAD), which introduces a Window Attention mechanism to learn dependencies between temporally ordered windows, a Variable Attention mechanism to model inter-variable correlations, and a Window-to-Timepoint (W2T) projection that injects window-level relational information into timepoint embeddings.
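As a rough illustration of the window-attention idea, the following NumPy sketch splits a multivariate series into windows and attends across them; the shapes and the flatten-based window embedding are assumptions for illustration, not WTAD's actual architecture.

```python
import numpy as np

def sliding_windows(series, window, stride):
    # series: (T, D) multivariate time series -> (N, window, D) stack of windows
    starts = range(0, series.shape[0] - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(windows):
    # Flatten each window into a naive embedding and attend across windows,
    # so each window representation mixes in information from related windows.
    n, w, d = windows.shape
    emb = windows.reshape(n, w * d)
    scores = emb @ emb.T / np.sqrt(emb.shape[1])
    attn = softmax(scores, axis=-1)          # (N, N) inter-window attention
    return attn @ emb                        # attended window embeddings

series = np.random.randn(100, 4)             # T=100 timepoints, D=4 sensors
wins = sliding_windows(series, window=10, stride=5)
out = window_attention(wins)
print(wins.shape, out.shape)                 # (19, 10, 4) (19, 40)
```

In the actual model, a learned projection would replace the flattening, and anomaly scores would be derived from reconstruction error on the attended embeddings.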
Deep Learning Training Workload Scheduler for Samsung Electronics Data Center
: Research on scheduling deep learning training jobs that stream in real time into Samsung Electronics' data center, based on a Graph Neural Network
This research uses a heterogeneous graph transformer to match each incoming deep learning job to the most suitable GPU server, given the connection status between deep learning jobs and GPU servers in Samsung Electronics' data center.
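A minimal sketch of the job-to-server matching idea, with entirely hypothetical features (per-job GPU memory demand, per-server free memory) and toy projections standing in for learned GNN layers:

```python
import numpy as np

# Hypothetical features: each job has (gpu_mem_req_gb, est_hours),
# each server has (free_gpu_mem_gb, queue_len). Values are illustrative only.
jobs = np.array([[8.0, 2.0], [24.0, 6.0]])            # 2 incoming jobs
servers = np.array([[16.0, 1.0], [32.0, 0.0], [8.0, 3.0]])

W_j = np.random.randn(2, 4) * 0.1                     # toy learned projections
W_s = np.random.randn(2, 4) * 0.1

def score_edges(jobs, servers):
    # One message-passing-style step: project both node types into a shared
    # space, score every job-server edge by dot product, and mask servers
    # whose free memory cannot hold the job.
    hj, hs = jobs @ W_j, servers @ W_s
    scores = hj @ hs.T                                # (num_jobs, num_servers)
    feasible = servers[:, 0][None, :] >= jobs[:, 0][:, None]
    return np.where(feasible, scores, -np.inf)

assign = score_edges(jobs, servers).argmax(axis=1)    # chosen server per job
print(assign)
```

The 24 GB job can only fit on the 32 GB server, so it is always assigned there; the heterogeneous graph transformer in the actual system replaces the single projection step with type-aware attention layers.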
LAG-Guided Runtime Framework: Block-Level Scheduling and Dynamic Compression for Multi-DNN Environments
: Research on assigning Object Tracking/Detection/Classification models, either original or compressed to varying degrees, to accelerators in an on-device environment
This technique divides DNN models into functional units called blocks, which then serve as execution units. When different models run in parallel, it identifies blocks that actually increase execution time (LAG) and forces them to run sequentially. To minimize execution delays while maintaining accuracy, it applies a dynamic lightweight-replacement technique that swaps blocks with high expected execution delays for lightweight blocks at runtime.
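The serialization of LAG blocks can be sketched with a shared lock: blocks flagged as LAG-inducing take the lock and so never overlap, while other blocks of different models run in parallel. Block names and timings here are illustrative assumptions.

```python
import threading
import time

lag_lock = threading.Lock()   # LAG blocks across models contend on this lock

def run_block(name, is_lag, cost, log):
    if is_lag:
        with lag_lock:        # LAG-flagged blocks are serialized
            time.sleep(cost)
            log.append(name)
    else:
        time.sleep(cost)      # non-LAG blocks run freely in parallel
        log.append(name)

log = []
threads = [
    threading.Thread(target=run_block, args=("m1.conv_lag", True, 0.05, log)),
    threading.Thread(target=run_block, args=("m2.conv_lag", True, 0.05, log)),
    threading.Thread(target=run_block, args=("m2.head", False, 0.01, log)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(log))
```

In the real framework, the decision of which blocks to serialize, and when to substitute a lightweight variant, is made at runtime from measured interference rather than a static flag.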
Scheduling Framework for Accelerating Multiple Detection-Free Object Trackers
: Research on object tracking models that use Transformers and Siamese networks, and on a software-based acceleration framework
We propose a tracker scheduling framework. First, the computation structures of representative trackers are analyzed, and a scheduling unit suited to the execution characteristics of each tracker is derived. Based on this analysis, the decomposed tracker workloads are multi-threaded under the control of the scheduling framework.
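A toy sketch of running decomposed tracker pipelines concurrently on a thread pool; the stage names and per-tracker decomposition are assumptions, since the actual scheduling unit is derived per tracker from its computation structure.

```python
from concurrent.futures import ThreadPoolExecutor

def make_stage(tracker, stage):
    # Hypothetical stage: tags its input so the execution order is visible.
    def run(x):
        return f"{tracker}:{stage}({x})"
    return run

trackers = {
    "siamese": [make_stage("siamese", s) for s in ("backbone", "xcorr", "head")],
    "transformer": [make_stage("transformer", s) for s in ("backbone", "attn", "head")],
}

def run_tracker(stages, frame):
    out = frame
    for stage in stages:          # stages of one tracker stay ordered
        out = stage(out)
    return out

# Different trackers run in parallel threads; each keeps its internal order.
with ThreadPoolExecutor(max_workers=4) as pool:
    futs = {n: pool.submit(run_tracker, s, "f0") for n, s in trackers.items()}
    res = {n: f.result() for n, f in futs.items()}
print(res["siamese"])
```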
Adversarial Defense in Embedded Systems
: Research on defending DNNs in embedded systems against attacks that induce misbehavior, while maximizing computational performance
Deep Neural Networks are vulnerable to adversarial samples that are generated by perturbing correctly classified inputs to cause DNN models to misbehave.
The key challenge is to efficiently orchestrate the simultaneous execution of the target DNN and the detection algorithm or network that identifies adversarial sample attacks.
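The threat model can be illustrated on a toy linear two-class "model" with an FGSM-style perturbation; the weights, input, and perturbation budget below are all illustrative assumptions, not part of the defense itself.

```python
import numpy as np

np.random.seed(0)
W = np.random.randn(3, 2)                 # toy classifier: logits = x @ W

def logits(x):
    return x @ W

x = np.random.randn(3)
y = int(np.argmax(logits(x)))             # clean (correct) prediction
g = W[:, 1 - y] - W[:, y]                 # gradient of the margin w.r.t. x
margin = logits(x)[y] - logits(x)[1 - y]
eps = margin / np.abs(g).sum() + 1e-2     # just enough budget to cross
x_adv = x + eps * np.sign(g)              # FGSM-style adversarial sample
print(y, int(np.argmax(logits(x_adv))))   # prediction flips under attack
```

A detector network watching inputs like `x_adv` must run alongside the target DNN, which is what makes co-scheduling the two workloads on constrained embedded hardware the central systems problem.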
gCFS: completely fair scheduling on multiple GPUs for improved multi‑DNN execution in terms of performance isolation
: Research on fairly scheduling multiple deep learning models in a multi-GPU server environment while reducing model execution times
gCFS inherits the CPU-side fair-share scheduling policy, achieving GPU performance isolation in proportion to priorities.
A smaller scheduling granularity enables more precise control over GPU time slices and queues DNN workloads more densely, reducing GPU idle time.
The length of each DNN workload is elastically adjusted to the given time slice, and the optimal GPU is selected dynamically during scheduling.
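The fair-share core can be sketched CFS-style: each model accumulates virtual runtime inversely proportional to its priority weight, and the model with the smallest vruntime receives the next GPU time slice. Weights and slice counts are assumed values for illustration.

```python
import heapq

def schedule(weights, slices):
    # CFS-style selection: always run the model with the smallest vruntime;
    # a unit time slice advances vruntime by 1 / weight.
    order = []
    heap = [(0.0, m) for m in weights]
    heapq.heapify(heap)
    for _ in range(slices):
        vr, m = heapq.heappop(heap)
        order.append(m)
        heapq.heappush(heap, (vr + 1.0 / weights[m], m))
    return order

order = schedule({"resnet": 2.0, "bert": 1.0}, slices=6)
print(order.count("resnet"), order.count("bert"))   # 4 2 -> 2:1 fair share
```

Over six slices, the 2:1 weights yield a 4:2 split of GPU time, which is the proportional isolation property the prose describes; the real system additionally resizes workloads to fit the slice and picks among multiple GPUs.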
ODMDEF: On-Device Multi-DNN Execution Framework Utilizing Adaptive Layer-Allocation on General Purpose Cores and Accelerators
: Research on improving the computation performance of multiple neural networks inside autonomous vehicles or smart robots by using the CPU and GPU simultaneously
We propose an on-device CPU-GPU co-scheduling framework for multi-DNN execution that removes the performance barrier caused by DNN executions being bound to the GPU.
To cope with irregular arrivals of DNN workloads and to accommodate their fluctuating demands for hardware resources, our framework dynamically selects the best-fit core type by comparing the current availability of the two core types.
During core selection, offline-trained prediction models are used to accurately predict the execution time of the issued layer.
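The allocation decision can be sketched as follows, with a lookup table of hypothetical per-layer CPU/GPU time predictions standing in for the offline-trained models: each layer goes to the core type with the earlier predicted finish time, given the current backlog on each core.

```python
# Hypothetical predicted execution times in ms: (cpu, gpu) per layer.
pred = {
    "conv1": (9.0, 2.0),
    "pool1": (1.0, 1.5),
    "conv2": (8.0, 2.5),
    "fc":    (2.0, 2.0),
}

def allocate(layers, cpu_busy=0.0, gpu_busy=0.0):
    plan = []
    for layer in layers:
        cpu_t, gpu_t = pred[layer]
        # Earlier predicted finish time wins, accounting for queued work.
        if cpu_busy + cpu_t <= gpu_busy + gpu_t:
            cpu_busy += cpu_t
            plan.append((layer, "cpu"))
        else:
            gpu_busy += gpu_t
            plan.append((layer, "gpu"))
    return plan

plan = allocate(["conv1", "pool1", "conv2", "fc"])
print(plan)
```

Note how the cheap `pool1` and `fc` layers migrate to the otherwise idle CPU while the convolutions queue on the GPU, which is the load-balancing effect the framework targets.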
Avoiding Performance Degradation of Object Detection via System Resource Management
: System software research that prevents object recognition inside autonomous vehicles from being disturbed by other applications
The complexity and data requirements of deep neural network operations are rapidly increasing in pursuit of high recognition accuracy.
Neural networks run in embedded systems where system resources are shared with other applications, which can degrade performance when safety-critical functions such as object detection are executed.
We propose an operating-system-level solution that finds all tasks related to object detection, grants them more system resources when performance interference occurs after object detection starts, and restores the original state when it finishes.
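The boost-and-restore cycle can be sketched in pure Python; the task table, names, and nice deltas below are illustrative stand-ins for scanning `/proc` and calling `setpriority` in the actual OS-level implementation.

```python
# pid -> (name, nice): a stand-in for the live process table.
tasks = {
    101: ("camera_io", 0),
    102: ("yolo_infer", 0),
    103: ("media_player", 0),
}
DETECTION = {"camera_io", "yolo_infer"}   # assumed detection-related tasks

def boost(tasks, interference):
    # On interference, lower the nice value (raise priority) of every
    # detection-related task and remember the original values.
    saved = {}
    if interference:
        for pid, (name, nice) in tasks.items():
            if name in DETECTION:
                saved[pid] = nice
                tasks[pid] = (name, nice - 5)
    return saved

def restore(tasks, saved):
    # When detection finishes, return boosted tasks to their original state.
    for pid, nice in saved.items():
        tasks[pid] = (tasks[pid][0], nice)

saved = boost(tasks, interference=True)
boosted_nice = tasks[102][1]              # -5 while detection is protected
restore(tasks, saved)
print(boosted_nice, tasks[102][1])
```

Unrelated tasks such as `media_player` are left untouched throughout, matching the design goal of isolating only the safety-critical pipeline.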