Addressing four aspects of Deep Learning (DL) design:
1. Precision on task-specific metrics
2. Hardware performance and efficiency
3. Decision robustness
4. Privacy and security
Design principle: Automatically find the best Deep Learning model using search-based algorithms and cross-stack optimization.
1- Efficient DL Training
SWANN [JETCAS ’21]: we propose a novel transformation that rewires the connections inside a DL model to form a small-world graph, called a SWANN. Small-world architectures have enhanced signal-propagation characteristics, which results in faster convergence to a target accuracy during training. We validate our results in both centralized and federated training scenarios.
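To illustrate the intuition (not SWANN's actual transformation, which is defined in the paper), here is a minimal sketch showing how Watts-Strogatz-style random rewiring shortens the average path length of a regular graph, the property behind the improved signal propagation:

```python
# A minimal sketch, assuming networkx; SWANN rewires DL model
# connectivity, whereas this toy example uses a generic graph.
import networkx as nx

# Regular ring lattice: each node links to its 4 nearest neighbors.
regular = nx.connected_watts_strogatz_graph(n=100, k=4, p=0.0, seed=0)
# Small-world graph: 10% of edges randomly rewired to distant nodes.
small_world = nx.connected_watts_strogatz_graph(n=100, k=4, p=0.1, seed=0)

print(nx.average_shortest_path_length(regular))      # long paths
print(nx.average_shortest_path_length(small_world))  # markedly shorter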
Extracurricular Learning [CVPRW ’21]: we investigate how teacher-model uncertainty affects student accuracy in a knowledge distillation setup. We show that the data used to train the student need not be the same as the teacher's training data; in fact, data points on which the teacher has high uncertainty are more beneficial. We further propose three sources of such high-uncertainty data: data interpolation, generative models, and simply new datasets.
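As a rough sketch of one such source, the snippet below distills on mixup-style interpolated inputs, on which the teacher is typically more uncertain than on clean training points. The paper's exact losses and uncertainty measures differ, and teacher/student are placeholders for arbitrary PyTorch modules producing logits:

```python
# Hedged sketch: distillation on interpolated data, not the paper's method.
import torch
import torch.nn.functional as F

def distill_step(student, teacher, x1, x2, temperature=4.0, alpha=0.5):
    """One distillation step on a mixup-style interpolation of two inputs."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    x_mix = lam * x1 + (1 - lam) * x2              # data interpolation
    with torch.no_grad():
        t_logits = teacher(x_mix)                  # soft, higher-entropy targets
    s_logits = student(x_mix)
    return F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
```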
2- Efficient DL Inference
GeneCAI [GECCO ’20]: we develop an automated model customization technique based on genetic algorithms. GeneCAI solves a constrained optimization problem to compress a large model and find the smallest sub-architecture that satisfies a given accuracy constraint.
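A minimal sketch of such a genetic search, assuming a genome of per-layer compression rates and a user-supplied evaluate callable that returns the compressed model's accuracy; GeneCAI's actual encoding, operators, and fitness function are in the paper:

```python
# Hedged sketch of a genetic algorithm for per-layer compression rates.
import random

def genetic_search(n_layers, evaluate, acc_floor, pop=20, gens=50, seed=0):
    rng = random.Random(seed)
    def fitness(genome):
        if evaluate(genome) < acc_floor:           # hard accuracy constraint
            return -1.0
        return sum(genome) / n_layers              # reward higher compression
    population = [[rng.random() for _ in range(n_layers)] for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop // 2]           # keep the fittest half
        children = []
        while len(children) < pop - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_layers)       # single-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_layers)            # point mutation
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```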
AdaNS [IEEE JSTSP ’20]: we propose a new technique that can solve black-box optimization problems in extremely high-dimensional spaces, such as those arising in DL model customization. Our optimization tool, dubbed AdaNS, uses adaptive non-uniform sampling to locate the maximizers and reconstruct the objective function around them.
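The sketch below conveys the general flavor of adaptive sampling for black-box maximization; it resembles a cross-entropy method, and AdaNS's actual sampling and reconstruction rules are more sophisticated:

```python
# Hedged sketch: adaptive sampling that concentrates around maximizers.
import numpy as np

def adaptive_sample_max(f, dim, iters=30, n=256, elite_frac=0.1, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    for _ in range(iters):
        x = rng.normal(mean, std, size=(n, dim))   # sample candidate points
        scores = np.apply_along_axis(f, 1, x)
        elite = x[np.argsort(scores)[-int(n * elite_frac):]]
        mean = elite.mean(axis=0)                  # refocus the distribution
        std = elite.std(axis=0) + 1e-6             # shrink around maximizers
    return mean

# Toy objective maximized at the all-0.5 vector, in 100 dimensions.
best = adaptive_sample_max(lambda v: -np.sum((v - 0.5) ** 2), dim=100)
```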
EncoDeep [TECS '20]: EncoDeep is an example of DL algorithm and hardware co-design. We propose a nonlinear encoding of DNN weights and activations to reduce the memory footprint on FPGAs, and we design specialized compute kernels for running inference on the encoded models. EncoDeep also includes an automated compiler that configures the quantization bit-widths as well as the hardware parameters; the compiler's design-space exploration is inspired by reinforcement learning, jointly minimizing memory footprint and hardware runtime.
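As a simplified illustration of nonlinear weight encoding, the sketch below learns a small codebook with k-means and stores low-bit indices in place of full-precision weights; EncoDeep's encoder, activation handling, and FPGA kernels are considerably more involved:

```python
# Hedged sketch: codebook-based weight encoding, assuming scikit-learn.
import numpy as np
from sklearn.cluster import KMeans

def encode_weights(w: np.ndarray, bits: int = 3):
    """Map each weight to one of 2**bits codebook values; only the
    low-bit indices and the tiny codebook need to be stored."""
    km = KMeans(n_clusters=2 ** bits, n_init=10, random_state=0)
    idx = km.fit_predict(w.reshape(-1, 1))
    return idx.astype(np.uint8), km.cluster_centers_.ravel()

def decode_weights(idx, codebook, shape):
    return codebook[idx].reshape(shape)

w = np.random.randn(256, 256).astype(np.float32)
idx, cb = encode_weights(w, bits=3)     # 3-bit indices vs. 32-bit floats
w_hat = decode_weights(idx, cb, w.shape)
```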
LiteTransformerSearch [NeurIPS '22]: LTS is the first training-free, hardware-aware Neural Architecture Search (NAS) method for Transformers. The core of this method is an ultra-low-cost proxy that estimates the quality of candidate architectures. By eliminating the need for training, the search can be performed entirely on constrained hardware, allowing us to use real hardware measurements, e.g., peak memory utilization and latency, in the search loop.
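A minimal sketch of such a training-free, hardware-in-the-loop search, assuming hypothetical callables sample_arch (draws a candidate from the search space), proxy_score (the low-cost quality estimate), and measure_latency (an on-device measurement); LTS's actual evolutionary search and Pareto-frontier handling are richer:

```python
# Hedged sketch: proxy-driven search with real hardware in the loop.
def training_free_search(sample_arch, proxy_score, measure_latency,
                         n_candidates, latency_budget_ms):
    """Return the best-scoring candidate that meets the latency budget."""
    best, best_proxy = None, float("-inf")
    for _ in range(n_candidates):
        arch = sample_arch()                        # no training anywhere
        if measure_latency(arch) > latency_budget_ms:
            continue                                # real device measurement
        proxy = proxy_score(arch)                   # ultra-low-cost estimate
        if proxy > best_proxy:
            best, best_proxy = arch, proxy
    return best
```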
3- Robust DL
DeepFense [ICCAD ’18], CURTAIL [TDSC ’20]: adversarial samples are an example of runtime attacks in which the attacker adds imperceptible noise to input samples to misguide the DL model. To mitigate these attacks, DeepFense and its journal extension CURTAIL provide modular redundancies that checkpoint the intermediate layers of a given DL model and mark suspicious samples as adversarial. These checker modules are trained with a loss that enforces disentanglement of features from different classes, which lets them separate adversarial samples from benign ones. An important aspect of DeepFense is that it is accompanied by custom hardware acceleration of the modular redundancies to enable on-the-fly detection.
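The toy checker below captures the flavor of latent-space screening: it flags inputs whose intermediate features fall far from benign class centroids. This is a simplification for illustration only; the actual defender modules are trained end-to-end with the disentanglement loss:

```python
# Hedged sketch: centroid-distance checker on intermediate features.
import torch

class LatentChecker:
    def __init__(self, centers: torch.Tensor, threshold: float):
        self.centers = centers        # (n_classes, d), from benign data
        self.threshold = threshold    # calibrated on a held-out benign set

    def is_adversarial(self, features: torch.Tensor) -> torch.Tensor:
        # Distance from each sample's features to the nearest class center.
        d = torch.cdist(features, self.centers).min(dim=1).values
        return d > self.threshold     # suspicious if outside the benign region
```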
CLEANN [ICCAD '20]: Trojan attacks inject a backdoor into the DL model during training. The backdoor can be triggered during inference by a Trojan pattern, such as a sticky note, to hijack the DL model's decision. To mitigate Trojans, we propose CLEANN, which uses sparse reconstruction and dictionary learning to remove the Trojan effect from signals propagating through the DL model. What is unique about CLEANN is that it recovers the original decision of the DL model without any need for retraining. CLEANN is also accompanied by a hardware accelerator to enable real-time analysis.
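A rough sketch of the underlying idea, assuming scikit-learn: learn a dictionary from benign signals and replace incoming signals with their sparse reconstruction over that dictionary, so out-of-distribution trigger patterns are suppressed. CLEANN's full pipeline and its accelerator are not captured here:

```python
# Hedged sketch: sparse reconstruction over a benign dictionary.
import numpy as np
from sklearn.decomposition import DictionaryLearning

benign = np.random.randn(500, 64)   # stand-in for benign signal patches
dl = DictionaryLearning(n_components=32, transform_algorithm="omp",
                        transform_n_nonzero_coefs=5, max_iter=100,
                        random_state=0)
dl.fit(benign)

def clean(x: np.ndarray) -> np.ndarray:
    """Replace x by its sparse reconstruction over the benign dictionary,
    dropping components (e.g., trigger energy) the dictionary cannot span."""
    return dl.transform(x) @ dl.components_

x_hat = clean(np.random.randn(10, 64))
```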
HASHTAG [ICCAD ’21]: HASHTAG detects hardware faults during DL inference. One example is faults caused by bit-flip attacks, which use row-hammer and similar techniques to change the weight bits stored in DRAM during execution. HASHTAG extracts low-bit hash signatures from the model and uses them to verify that the model is intact. We show that this simple technique achieves near-100% fault detection with extremely low overhead.
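The sketch below shows the sign-then-verify pattern, with SHA-256 standing in purely for illustration; HASHTAG's signatures are low-bit and specifically crafted to be sensitive to bit flips at minimal cost:

```python
# Hedged sketch: signature-based integrity check of model weights.
import hashlib
import numpy as np

def sign(layers: dict) -> dict:
    """Hash each layer's weight bytes into a golden signature (offline)."""
    return {name: hashlib.sha256(w.tobytes()).hexdigest()
            for name, w in layers.items()}

def verify(layers, signatures) -> bool:
    return sign(layers) == signatures      # any bit flip changes a hash

weights = {"fc1": np.ones((4, 4), dtype=np.float32)}
sigs = sign(weights)                       # golden signatures, offline
weights["fc1"].view(np.uint32)[0, 0] ^= 1  # simulate a DRAM bit flip
print(verify(weights, sigs))               # False -> fault detected
```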
4- Privacy-Preserving DL
COINN [CCS ’21]: we design DL models that achieve state-of-the-art performance in the secure domain. This requires optimizing the operations in vanilla DL models to specialize them for the ciphertext domain. Specifically, we perform heterogeneous quantization of activations and model weights, along with weight clustering to enable factored matrix multiplication. We use our black-box optimization tool, AdaNS, to customize the DL models using the identified transformations.
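To show why weight clustering enables factored matrix multiplication, the sketch below computes a dot product with clustered weights using one multiplication per centroid instead of one per weight, which is exactly the saving that matters when multiplications are expensive ciphertext operations; COINN's secure protocols are not shown:

```python
# Hedged sketch: factored dot product with clustered (codebook) weights.
import numpy as np

def factored_dot(x: np.ndarray, idx: np.ndarray, centroids: np.ndarray) -> float:
    """Compute sum_i x[i] * centroids[idx[i]] with one multiply per centroid."""
    sums = np.zeros(len(centroids))
    np.add.at(sums, idx, x)          # cheap additions, grouped by cluster
    return float(sums @ centroids)   # only n_centroids multiplications

x = np.random.randn(1024)
centroids = np.array([-0.5, 0.0, 0.5, 1.0])   # 2-bit clustered weights
idx = np.random.randint(0, 4, size=1024)      # cluster index per weight
assert np.isclose(factored_dot(x, idx, centroids), x @ centroids[idx])
```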