Efficient Machine Learning - Reading Group
Today’s world needs orders of magnitude more efficient ML to address environmental and energy crises, optimize resource consumption, and improve sustainability. With the end of Moore’s Law and Dennard scaling, we can no longer expect more and faster transistors for the same cost and power budget. This is particularly problematic given the growing data volumes collected by ubiquitous sensors and systems, the ever-larger models we train, and the fact that many ML models must run on edge devices to minimize latency, preserve privacy, and save energy. Algorithmic efficiency in deep learning is therefore essential to achieve the desired speedups, alongside efficient hardware implementations and compiler optimizations for common math operations.
ML efficiency is actively investigated across many research communities. This reading group aims to onboard young scientists interested in the topic and offers researchers at all levels a platform for open dialog, fostering collaboration and keeping up with rapid developments in the field of efficient ML. We welcome and discuss fresh research findings published as pre-prints or recently presented at research venues. The list of topics includes, but is not limited to:
Low-resource and low-data ML, scaling laws
Sparsity, efficient network architectures, NAS, meta-learning, transfer learning
Multi-modal, multi-task learning, ensembling, mixture of experts
Invariance, equivariance, data augmentation, generalization and robustness
Continual learning, low-resource adaptation and reconfiguration, on-device learning
Benchmarking ML training and inference methods
Distributed and collaborative learning, edge computing
New resource-efficient ML paradigms
Schedule / Upcoming Talks
Subscribe to the Efficient ML mailing list / import the Efficient ML Events calendar to receive information on how to join the virtual talks.
13. May 2024 @ 5pm CET / 11am EST / 8am PST [timezone converter]
94% on CIFAR-10 in 3.29 Seconds on a Single GPU
Keller Jordan, Independent Researcher
Abstract: CIFAR-10 is among the most widely used datasets in machine learning, facilitating thousands of research projects per year. To accelerate research and reduce the cost of experiments, we introduce training methods for CIFAR-10 which reach 94% accuracy in 3.29 seconds, 95% in 10.4 seconds, and 96% in 46.3 seconds, when run on a single NVIDIA A100 GPU. As one factor contributing to these training speeds, we propose a derandomized variant of horizontal flipping augmentation, which we show improves over the standard method in every case where flipping is beneficial over no flipping at all.
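For readers curious what "derandomized" flipping might look like in practice, below is a minimal PyTorch-style sketch of one natural reading of the idea: rather than flipping each image independently with probability 0.5, each image is flipped on alternating epochs, so every image is seen flipped and unflipped equally often. The function name and the exact alternation scheme are illustrative assumptions, not necessarily the author's implementation; see the paper for the actual method.

```python
import torch

def derandomized_flip(batch: torch.Tensor, indices: torch.Tensor, epoch: int) -> torch.Tensor:
    """Flip each image on alternating epochs instead of at random (sketch).

    batch:   (N, C, H, W) image tensor
    indices: (N,) dataset indices of the images in this batch
    epoch:   current epoch number

    Image i is flipped iff (i + epoch) is odd, so across any two consecutive
    epochs every image is seen exactly once flipped and once unflipped.
    """
    flip_mask = ((indices + epoch) % 2).bool()
    out = batch.clone()
    out[flip_mask] = torch.flip(out[flip_mask], dims=[-1])  # flip along width
    return out
```

Compared with random flipping, a scheme like this removes the sampling variance of the augmentation stream while preserving its overall 50/50 flip statistics.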
27. May 2024 @ 5pm CET / 11am EST / 8am PST [timezone converter]
LF-ViT: Reducing Spatial Redundancy in Vision Transformer for Efficient Image Recognition
Youbing Hu (AIoT Lab, Harbin Institute of Technology) and Yun Cheng (Swiss Data Science Center)
Abstract: The Vision Transformer (ViT) excels in accuracy when handling high-resolution images, yet it confronts the challenge of significant spatial redundancy, leading to increased computational and memory requirements. To address this, we present the Localization and Focus Vision Transformer (LF-ViT). This model operates by strategically curtailing computational demands without impinging on performance. In the Localization phase, a reduced-resolution image is processed; if a definitive prediction remains elusive, our pioneering Neighborhood Global Class Attention (NGCA) mechanism is triggered, effectively identifying and spotlighting class-discriminative regions based on initial findings. Subsequently, in the Focus phase, the designated region is cropped from the original image to enhance recognition. Uniquely, LF-ViT employs consistent parameters across both phases, ensuring seamless end-to-end optimization. Our empirical tests affirm LF-ViT’s prowess: it remarkably decreases DeiT-S’s FLOPs by 63% and concurrently amplifies throughput twofold.
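As a rough illustration of the two-phase control flow described in the abstract, here is a hedged PyTorch-style sketch. It assumes a `vit` callable that returns logits together with a class-attention map; `locate_discriminative_region` is a toy stand-in for the paper's NGCA mechanism, and the threshold, resolutions, and crop heuristic are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn.functional as F

def locate_discriminative_region(attn, full_hw, window=0.5):
    """Toy stand-in for NGCA: center a crop window on the attention peak."""
    H, W = full_hw
    g = attn.shape[-1]                       # attention assumed on a g x g grid
    idx = attn.flatten(-2).argmax(-1).item() # peak cell of the attention map
    cy, cx = (idx // g + 0.5) / g, (idx % g + 0.5) / g
    h, w = int(H * window), int(W * window)
    y0 = min(max(int(cy * H - h / 2), 0), H - h)
    x0 = min(max(int(cx * W - w / 2), 0), W - w)
    return y0, x0, y0 + h, x0 + w

@torch.no_grad()
def two_phase_inference(vit, image, threshold=0.9, low_res=112):
    # Phase 1 (Localization): cheap pass on a downsampled image.
    small = F.interpolate(image, size=(low_res, low_res), mode='bilinear')
    logits, attn = vit(small)
    conf, pred = logits.softmax(dim=-1).max(dim=-1)
    if conf.item() >= threshold:
        return pred                          # early exit: low-res pass sufficed

    # Phase 2 (Focus): crop the class-discriminative region from the
    # full-resolution image and classify it with the same weights.
    y0, x0, y1, x1 = locate_discriminative_region(attn, image.shape[-2:])
    crop = F.interpolate(image[..., y0:y1, x0:x1],
                         size=(low_res, low_res), mode='bilinear')
    logits, _ = vit(crop)
    return logits.argmax(dim=-1)
```

The efficiency gain comes from the early exit: easy images never pay for a full-resolution pass, and hard images pay only for one extra pass over a cropped region.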
15. July 2024 @ 5pm CET / 11am EST / 8am PST [timezone converter]
Subspace-Configurable Networks
Dong Wang, Graz University of Technology, Austria
Abstract: While the deployment of deep learning models on edge devices is increasing, these models often lack robustness when faced with dynamic changes in sensed data. This can be attributed to sensor drift, or variations in the data compared to what was used during offline training due to factors such as specific sensor placement or naturally changing sensing conditions. Hence, achieving the desired robustness necessitates the utilization of either an invariant architecture or specialized training approaches, like data augmentation. Alternatively, input transformations can be treated as a domain shift problem, and solved by post-deployment model adaptation. In this paper, we train a parameterized subspace of configurable networks, where an optimal network for a particular parameter setting is part of this subspace. The obtained subspace is low-dimensional and has a surprisingly simple structure even for complex, non-invertible transformations of the input, leading to an exceptionally high efficiency of subspace-configurable networks (SCNs) when limited storage and computing resources are at stake. We evaluate SCNs on a wide range of standard datasets, architectures, and transformations, and demonstrate their power on resource-constrained IoT devices, where they can take up to 2.4 times less RAM and be 7.6 times faster at inference time than a model that achieves the same test set accuracy, yet is trained with data augmentations to cover the desired range of input transformations.
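To make the subspace idea concrete, here is a minimal PyTorch-style sketch of what a subspace-configurable layer could look like: D base weight sets are mixed by coefficients alpha, produced from the transformation parameter (here a rotation angle) by a tiny hypernetwork. The layer structure, the (cos, sin) encoding, and the softmax over alpha are illustrative choices based on the abstract, not necessarily the paper's exact design.

```python
import torch
import torch.nn as nn

class SubspaceLinear(nn.Module):
    """One layer of a subspace-configurable network (illustrative sketch).

    Stores D base weight matrices; the effective weight is a linear
    combination sum_d alpha_d * W_d, so each alpha picks one network
    out of a D-dimensional subspace of networks.
    """
    def __init__(self, in_f, out_f, D):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(D, out_f, in_f) * 0.02)
        self.bias = nn.Parameter(torch.zeros(D, out_f))

    def forward(self, x, alpha):
        # alpha: (D,) mixing coefficients defining a point in the subspace
        W = torch.einsum('d,doi->oi', alpha, self.weight)
        b = alpha @ self.bias
        return x @ W.t() + b

class AlphaNet(nn.Module):
    """Tiny hypernetwork mapping a transformation parameter to alpha.

    Here the parameter is a rotation angle, encoded as (cos, sin).
    """
    def __init__(self, D, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, D))

    def forward(self, theta):
        enc = torch.stack([torch.cos(theta), torch.sin(theta)])
        return self.net(enc).softmax(dim=-1)

# Usage: alpha is computed once per transformation setting, then reused.
# layer, alphas = SubspaceLinear(64, 10, D=3), AlphaNet(D=3)
# y = layer(x, alphas(torch.tensor(0.7)))  # configure for a 0.7 rad rotation
```

Under this reading, adapting to a new input transformation only requires recomputing D mixing coefficients, which is why reconfiguration can be cheap on resource-constrained devices.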
Past Events and Talks
2024
29. April 2024 @ 5pm CET
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Rahim Entezari, Stability AI
arXiv: https://arxiv.org/pdf/2403.03206.pdf
15. April 2024 @ 5pm CET — ▶️ YouTube
Just Say the Name: Online Continual Learning with Category Names Only via Data Generation
Diganta Misra, Max Planck Institute, Tübingen, Germany
arXiv: https://arxiv.org/pdf/2403.10853.pdf
8. April 2024 @ 5pm CET
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
Guillermo Ortiz-Jimenez, EPFL, Switzerland
arXiv: https://arxiv.org/pdf/2305.12827.pdf
18. March 2024 @ 5pm CET
Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions
M. Jehanzeb Mirza, Graz University of Technology, Austria
arXiv: https://arxiv.org/abs/2305.18953
26. February 2024 @ 5pm CET
Less is More – Towards parsimonious multi-task models using structured sparsity
Richa Upadhyay, Luleå University of Technology, Sweden
OpenReview: https://openreview.net/pdf?id=0VU6Vlh0zy
19. February 2024 @ 5pm CET
How to Prune Your Language Model: Recovering Accuracy on the “Sparsity May Cry” Benchmark
Eldar Kurtic, Institute of Science and Technology Austria
arXiv: https://arxiv.org/abs/2312.13547
13. February 2024 @ 5pm CET
Efficient Continual and On-Device Learning for Edge Computing Platforms
Young D. Kwon, University of Cambridge and Samsung AI, UK
arXiv: https://arxiv.org/abs/2311.11420, https://arxiv.org/abs/2307.09988
29. January 2024 @ 5pm CET
HRBP: Hardware-friendly Regrouping towards Block-based Pruning for Sparse CNN Training
Haoyu Ma, University of California, Irvine, USA
OpenReview: https://openreview.net/pdf?id=VP1Xrdz0Bp
22. January 2024 @ 5pm CET
A U-turn on Double Descent: Rethinking Parameter Counting in Statistical Learning
Alicia Curth and Alan Jeffares, University of Cambridge, UK
arXiv: https://arxiv.org/abs/2310.18988
2023
18. December 2023 @ 5pm CET
MosaicML: A Three-Year Retrospective of the Journey and Technology
Jonathan Frankle, MosaicML / Databricks
11. December 2023 @ 5pm CET
Dual Algorithmic Reasoning
Danilo Numeroso, University of Pisa, Italy
arXiv: https://arxiv.org/abs/2302.04496
27. November 2023 @ 5pm CET
Privacy Side Channels in Machine Learning Systems
Edoardo Debenedetti, ETH Zurich, Switzerland
arXiv: https://arxiv.org/abs/2309.05610
17. November 2023 @ 3pm CET
Layer-wise Linear Mode Connectivity
Linara Adilova, Ruhr University Bochum, Germany
arXiv: https://arxiv.org/abs/2307.06966
6. November 2023 @ 5pm CET
MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
Wei Lin, Institute of Computer Graphics and Vision, Graz University of Technology, Austria
arXiv: https://arxiv.org/abs/2303.08914
30. October 2023 @ 5pm CET
Why Do We Need Weight Decay in Modern Deep Learning?
Maksym Andriushchenko, EPFL, Switzerland
arXiv: https://arxiv.org/abs/2310.04415
27. October 2023 @ 3pm CET
Cost-effective On-device Continual Learning over Memory Hierarchy with Miro
Xinyue Ma, UNIST, South Korea
arXiv: https://arxiv.org/abs/2308.06053
18. September 2023 @ 5pm CET
Localised Adaptive Spatial-Temporal Graph Neural Network
Xiaoxi He, University of Macau, China
arXiv: https://arxiv.org/abs/2306.06930
11. September 2023 @ 5pm CET
AutoCoreset: An Automatic Practical Coreset Construction Framework
Alaa Maalouf, Computer Science and Artificial Intelligence Lab (CSAIL), MIT, USA
arXiv: https://arxiv.org/abs/2305.11980
4. September 2023 @ 5pm CET
SparseGPT: Massive Language Models Can be Accurately Pruned in One-Shot
Elias Frantar, Institute of Science and Technology Austria (ISTA), Austria
arXiv: https://arxiv.org/abs/2301.00774
10. July 2023 @ 5pm CET
Information Plane Analysis for Dropout Neural Networks
Linara Adilova, Ruhr University Bochum, Germany
arXiv: https://arxiv.org/abs/2303.00596
26. June 2023 @ 5pm CET
MiniLearn: On-Device Learning for Low-Power IoT Devices
Marc-Andre Schümann and Olaf Landsiedel, Kiel University, Germany & Chalmers University of Technology, Sweden
PDF: https://ewsn2022.pro2future.at/paper/sessions/ewsn2022-final3.pdf
ACM Digital Library: https://dl.acm.org/doi/10.5555/3578948.3578949
5. June 2023 @ 5pm CET
A Modern Look at the Relationship Between Sharpness and Generalization
Maksym Andriushchenko, EPFL
arXiv: https://arxiv.org/abs/2302.07011
22. May 2023 @ 5pm CET
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Shiwei Liu, VITA group and the Institute for Foundations of Machine Learning (IFML) at UT Austin
arXiv: https://arxiv.org/abs/2303.02141
15. May 2023 @ 5pm CET
Continual Pre-Training Mitigates Forgetting in Language and Vision
Andrea Cossu, University of Pisa
arXiv: https://arxiv.org/abs/2205.09357
17. April 2023 @ 5pm CET
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Keller Jordan, Hive AI
arXiv: https://arxiv.org/abs/2211.08403
3. April 2023 @ 5pm CET
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
Mansheej Paul, Stanford
arXiv link: https://arxiv.org/abs/2210.03044
2022
2nd Workshop on Efficient Machine Learning, 2022, Vienna, Austria
Organizing Team and Contact
Contact us with questions or suggestions (efficientml@gmail.com). Self-nominations to present your freshly published work in the reading group are welcome.