Window Transformer based Anomaly Detection in Ethereum Validator
: Detects anomalies in a time-series big-data environment by using Attention to learn both the dependencies between windows of the time series and the correlations among multivariate sensors
This research proposes Window Transformer based Anomaly Detection (WTAD), which introduces a Window Attention mechanism to learn dependencies between temporally ordered windows, a Variable Attention mechanism to model inter-variable correlations, and a Window-to-Timepoint (W2T) projection that injects window-level relational information into timepoint embeddings.
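As a rough illustration of the window-attention idea, the following NumPy sketch splits a multivariate series into windows and attends across them; the shapes and the flatten-based window embedding are assumptions for illustration, not WTAD's actual architecture.

```python
import numpy as np

def sliding_windows(series, window, stride):
    # series: (T, D) multivariate time series -> (N, window, D) stack of windows
    starts = range(0, series.shape[0] - window + 1, stride)
    return np.stack([series[s:s + window] for s in starts])

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def window_attention(windows):
    # Flatten each window into a naive embedding and attend across windows,
    # so each window representation mixes in information from related windows.
    n, w, d = windows.shape
    emb = windows.reshape(n, w * d)
    scores = emb @ emb.T / np.sqrt(emb.shape[1])
    attn = softmax(scores, axis=-1)          # (N, N) inter-window attention
    return attn @ emb                        # attended window embeddings

series = np.random.randn(100, 4)             # T=100 timepoints, D=4 sensors
wins = sliding_windows(series, window=10, stride=5)
out = window_attention(wins)
print(wins.shape, out.shape)                 # (19, 10, 4) (19, 40)
```

In the actual model, a learned projection would replace the flattening, and anomaly scores would be derived from reconstruction error on the attended embeddings.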
Deep Learning Training Workload Scheduler for Samsung Electronics Data Center
: Research on scheduling deep learning training jobs that stream in real time into Samsung Electronics' data center, based on a Graph Neural Network
This research uses a heterogeneous graph transformer to match each incoming deep learning job to the most suitable GPU server, given the connection status between deep learning jobs and GPU servers in Samsung Electronics' data center.
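A minimal sketch of the job-to-server matching idea, with entirely hypothetical features (per-job GPU memory demand, per-server free memory) and toy projections standing in for learned GNN layers:

```python
import numpy as np

# Hypothetical features: each job has (gpu_mem_req_gb, est_hours),
# each server has (free_gpu_mem_gb, queue_len). Values are illustrative only.
jobs = np.array([[8.0, 2.0], [24.0, 6.0]])            # 2 incoming jobs
servers = np.array([[16.0, 1.0], [32.0, 0.0], [8.0, 3.0]])

W_j = np.random.randn(2, 4) * 0.1                     # toy learned projections
W_s = np.random.randn(2, 4) * 0.1

def score_edges(jobs, servers):
    # One message-passing-style step: project both node types into a shared
    # space, score every job-server edge by dot product, and mask servers
    # whose free memory cannot hold the job.
    hj, hs = jobs @ W_j, servers @ W_s
    scores = hj @ hs.T                                # (num_jobs, num_servers)
    feasible = servers[:, 0][None, :] >= jobs[:, 0][:, None]
    return np.where(feasible, scores, -np.inf)

assign = score_edges(jobs, servers).argmax(axis=1)    # chosen server per job
print(assign)
```

The 24 GB job can only fit on the 32 GB server, so it is always assigned there; the heterogeneous graph transformer in the actual system replaces the single projection step with type-aware attention layers.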
LAG-Guided Runtime Framework: Block-Level Scheduling and Dynamic Compression for Multi-DNN Environments
: Research on assigning Object Tracking/Detection/Classification models, either original or compressed to varying degrees, to accelerators in an on-device environment
This technique divides DNN models into functional units called blocks, which then serve as execution units. When different models run in parallel, it identifies blocks that actually increase execution time (LAG) and forces them to run sequentially. To minimize execution delays while maintaining accuracy, it applies a dynamic lightweight-replacement technique that swaps blocks with high expected execution delays for lightweight blocks at runtime.
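The serialization of LAG blocks can be sketched with a shared lock: blocks flagged as LAG-inducing take the lock and so never overlap, while other blocks of different models run in parallel. Block names and timings here are illustrative assumptions.

```python
import threading
import time

lag_lock = threading.Lock()   # LAG blocks across models contend on this lock

def run_block(name, is_lag, cost, log):
    if is_lag:
        with lag_lock:        # LAG-flagged blocks are serialized
            time.sleep(cost)
            log.append(name)
    else:
        time.sleep(cost)      # non-LAG blocks run freely in parallel
        log.append(name)

log = []
threads = [
    threading.Thread(target=run_block, args=("m1.conv_lag", True, 0.05, log)),
    threading.Thread(target=run_block, args=("m2.conv_lag", True, 0.05, log)),
    threading.Thread(target=run_block, args=("m2.head", False, 0.01, log)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(log))
```

In the real framework, the decision of which blocks to serialize, and when to substitute a lightweight variant, is made at runtime from measured interference rather than a static flag.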
Scheduling Framework for Accelerating Multiple Detection-Free Object Trackers
: Research on object tracking models that use Transformers and Siamese networks, and on a software-based acceleration framework
We propose a tracker scheduling framework. First, the computation structures of representative trackers are analyzed, and a scheduling unit suited to the execution characteristics of each tracker is derived. Based on this analysis, the decomposed tracker workloads are multi-threaded under the control of the scheduling framework.
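A toy sketch of running decomposed tracker pipelines concurrently on a thread pool; the stage names and per-tracker decomposition are assumptions, since the actual scheduling unit is derived per tracker from its computation structure.

```python
from concurrent.futures import ThreadPoolExecutor

def make_stage(tracker, stage):
    # Hypothetical stage: tags its input so the execution order is visible.
    def run(x):
        return f"{tracker}:{stage}({x})"
    return run

trackers = {
    "siamese": [make_stage("siamese", s) for s in ("backbone", "xcorr", "head")],
    "transformer": [make_stage("transformer", s) for s in ("backbone", "attn", "head")],
}

def run_tracker(stages, frame):
    out = frame
    for stage in stages:          # stages of one tracker stay ordered
        out = stage(out)
    return out

# Different trackers run in parallel threads; each keeps its internal order.
with ThreadPoolExecutor(max_workers=4) as pool:
    futs = {n: pool.submit(run_tracker, s, "f0") for n, s in trackers.items()}
    res = {n: f.result() for n, f in futs.items()}
print(res["siamese"])
```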
Adversarial Defense in Embedded Systems
: Research on defending DNNs in embedded systems against attacks that induce misbehavior, while maximizing computational performance
Deep Neural Networks are vulnerable to adversarial samples that are generated by perturbing correctly classified inputs to cause DNN models to misbehave.
The key challenge is to efficiently orchestrate the simultaneous execution of the target DNN and the detection algorithm or network that identifies adversarial sample attacks.
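The threat model can be illustrated on a toy linear two-class "model" with an FGSM-style perturbation; the weights, input, and perturbation budget below are all illustrative assumptions, not part of the defense itself.

```python
import numpy as np

np.random.seed(0)
W = np.random.randn(3, 2)                 # toy classifier: logits = x @ W

def logits(x):
    return x @ W

x = np.random.randn(3)
y = int(np.argmax(logits(x)))             # clean (correct) prediction
g = W[:, 1 - y] - W[:, y]                 # gradient of the margin w.r.t. x
margin = logits(x)[y] - logits(x)[1 - y]
eps = margin / np.abs(g).sum() + 1e-2     # just enough budget to cross
x_adv = x + eps * np.sign(g)              # FGSM-style adversarial sample
print(y, int(np.argmax(logits(x_adv))))   # prediction flips under attack
```

A detector network watching inputs like `x_adv` must run alongside the target DNN, which is what makes co-scheduling the two workloads on constrained embedded hardware the central systems problem.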
gCFS: completely fair scheduling on multiple GPUs for improved multi‑DNN execution in terms of performance isolation
: Research on fairly scheduling multiple deep learning models in a multi-GPU server environment while reducing model execution times
gCFS inherits the CPU-side fair-share scheduling policy, achieving GPU performance isolation in proportion to priorities.
A smaller scheduling granularity enables more precise control over GPU time slices and queues DNN workloads more densely, reducing GPU idle time.
The length of each DNN workload is elastically adjusted to the given time slice, and the optimal GPU is selected dynamically during scheduling.
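The fair-share core can be sketched CFS-style: each model accumulates virtual runtime inversely proportional to its priority weight, and the model with the smallest vruntime receives the next GPU time slice. Weights and slice counts are assumed values for illustration.

```python
import heapq

def schedule(weights, slices):
    # CFS-style selection: always run the model with the smallest vruntime;
    # a unit time slice advances vruntime by 1 / weight.
    order = []
    heap = [(0.0, m) for m in weights]
    heapq.heapify(heap)
    for _ in range(slices):
        vr, m = heapq.heappop(heap)
        order.append(m)
        heapq.heappush(heap, (vr + 1.0 / weights[m], m))
    return order

order = schedule({"resnet": 2.0, "bert": 1.0}, slices=6)
print(order.count("resnet"), order.count("bert"))   # 4 2 -> 2:1 fair share
```

Over six slices, the 2:1 weights yield a 4:2 split of GPU time, which is the proportional isolation property the prose describes; the real system additionally resizes workloads to fit the slice and picks among multiple GPUs.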
ODMDEF: On-Device Multi-DNN Execution Framework Utilizing Adaptive Layer-Allocation on General Purpose Cores and Accelerators
: Research on improving the computation performance of multiple neural networks inside autonomous vehicles or smart robots by using the CPU and GPU simultaneously
We propose an on-device CPU-GPU co-scheduling framework for multi-DNN execution that removes the performance barrier caused by DNN executions being bound to the GPU.
To cope with irregular arrivals of DNN workloads and to accommodate their fluctuating demands for hardware resources, our framework dynamically selects the best-fit core type by comparing the current availability of the two core types.
During core selection, offline-trained prediction models are used to accurately predict the execution time of the issued layer.
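The allocation decision can be sketched as follows, with a lookup table of hypothetical per-layer CPU/GPU time predictions standing in for the offline-trained models: each layer goes to the core type with the earlier predicted finish time, given the current backlog on each core.

```python
# Hypothetical predicted execution times in ms: (cpu, gpu) per layer.
pred = {
    "conv1": (9.0, 2.0),
    "pool1": (1.0, 1.5),
    "conv2": (8.0, 2.5),
    "fc":    (2.0, 2.0),
}

def allocate(layers, cpu_busy=0.0, gpu_busy=0.0):
    plan = []
    for layer in layers:
        cpu_t, gpu_t = pred[layer]
        # Earlier predicted finish time wins, accounting for queued work.
        if cpu_busy + cpu_t <= gpu_busy + gpu_t:
            cpu_busy += cpu_t
            plan.append((layer, "cpu"))
        else:
            gpu_busy += gpu_t
            plan.append((layer, "gpu"))
    return plan

plan = allocate(["conv1", "pool1", "conv2", "fc"])
print(plan)
```

Note how the cheap `pool1` and `fc` layers migrate to the otherwise idle CPU while the convolutions queue on the GPU, which is the load-balancing effect the framework targets.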
Avoiding Performance Degradation of Object Detection via System Resource Management
: System software research that prevents object recognition inside autonomous vehicles from being disturbed by other applications
The complexity and data requirements of deep neural network operations are rapidly increasing in pursuit of high recognition accuracy.
Neural networks run in embedded systems where system resources are shared with other applications, which can degrade performance when safety-critical functions such as object detection are executed.
We propose an operating-system-level solution that finds all tasks related to object detection, grants them more system resources when performance interference occurs after object detection starts, and restores the original state when it finishes.
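The boost-and-restore cycle can be sketched in pure Python; the task table, names, and nice deltas below are illustrative stand-ins for scanning `/proc` and calling `setpriority` in the actual OS-level implementation.

```python
# pid -> (name, nice): a stand-in for the live process table.
tasks = {
    101: ("camera_io", 0),
    102: ("yolo_infer", 0),
    103: ("media_player", 0),
}
DETECTION = {"camera_io", "yolo_infer"}   # assumed detection-related tasks

def boost(tasks, interference):
    # On interference, lower the nice value (raise priority) of every
    # detection-related task and remember the original values.
    saved = {}
    if interference:
        for pid, (name, nice) in tasks.items():
            if name in DETECTION:
                saved[pid] = nice
                tasks[pid] = (name, nice - 5)
    return saved

def restore(tasks, saved):
    # When detection finishes, return boosted tasks to their original state.
    for pid, nice in saved.items():
        tasks[pid] = (tasks[pid][0], nice)

saved = boost(tasks, interference=True)
boosted_nice = tasks[102][1]              # -5 while detection is protected
restore(tasks, saved)
print(boosted_nice, tasks[102][1])
```

Unrelated tasks such as `media_player` are left untouched throughout, matching the design goal of isolating only the safety-critical pipeline.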