I explore the path toward an ML-specialized OS, ROSA-OS. ROSA-OS rethinks the OS architecture to tailor it specifically to ML workloads, especially in virtualized clouds, which are now widely used to run ML applications. ROSA-OS's envisioned architecture includes (1) a microkernel, Micro-LAKE, that allows kernel-space applications to use the GPU, and (2) an MLaaS (ML as a Service) subsystem that gathers ML models to help Micro-LAKE with memory management and CPU scheduling.
MinatoLoader is a general-purpose data loader for PyTorch that accelerates training and improves GPU utilization in single-server, multi-GPU settings. It continuously prepares data in the background and constructs batches by prioritizing fast-to-process samples, while slower samples are processed in parallel.
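The sketch below (plain C with pthreads, not MinatoLoader's actual PyTorch implementation) illustrates this batching policy under simplifying assumptions: a single background preparer thread, a fixed FAST_THRESHOLD_US cutoff, and an illustrative prepare_sample() routine stand in for the real components, which overlap slow-sample preparation with training across several workers.

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NUM_SAMPLES        64
#define BATCH_SIZE          8
#define FAST_THRESHOLD_US 500      /* samples prepared faster than this are "fast" */

typedef struct { int id; long prep_us; } sample_t;

typedef struct {
    sample_t items[NUM_SAMPLES];
    int head, tail;
    pthread_mutex_t lock;
} queue_t;

static queue_t fast_q = { .lock = PTHREAD_MUTEX_INITIALIZER };
static queue_t slow_q = { .lock = PTHREAD_MUTEX_INITIALIZER };

static void q_push(queue_t *q, sample_t s) {
    pthread_mutex_lock(&q->lock);
    q->items[q->tail++] = s;
    pthread_mutex_unlock(&q->lock);
}

static int q_pop(queue_t *q, sample_t *out) {
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    if (q->head < q->tail) { *out = q->items[q->head++]; ok = 1; }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Illustrative preprocessing: returns how long the sample took to prepare. */
static long prepare_sample(int id) {
    long cost = (id % 5 == 0) ? 5000 : 100;   /* every 5th sample is slow */
    usleep(cost);
    return cost;
}

/* Background preparer: continuously prepares data while training runs. */
static void *preparer(void *arg) {
    (void)arg;
    for (int i = 0; i < NUM_SAMPLES; i++) {
        sample_t s = { .id = i, .prep_us = prepare_sample(i) };
        q_push(s.prep_us <= FAST_THRESHOLD_US ? &fast_q : &slow_q, s);
    }
    return NULL;
}

int main(void) {
    pthread_t bg;
    pthread_create(&bg, NULL, preparer, NULL);

    int consumed = 0;
    while (consumed < NUM_SAMPLES) {
        int n = 0;
        while (n < BATCH_SIZE && consumed + n < NUM_SAMPLES) {
            sample_t s;
            /* Prioritize fast-to-process samples; fall back to slow ones. */
            if (q_pop(&fast_q, &s) || q_pop(&slow_q, &s))
                n++;
            else
                usleep(50);   /* nothing ready yet; wait for the preparer */
        }
        consumed += n;
        printf("batch of %d samples ready for the GPU\n", n);
    }
    pthread_join(bg, NULL);
    return 0;
}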
Data movement is the leading cause of performance degradation and energy consumption in modern data centers. Processing-in-memory (PIM) is an architecture that addresses data movement by bringing computation inside the memory chips. This paper is the first to study the virtualization of PIM devices: we design and implement vPIM, an open-source UPMEM-based virtualization system for the cloud. Our vPIM design satisfies four requirements: (1) compatibility, so that no hardware or hypervisor changes are needed; (2) multiplexing and isolation, for a higher utilization ratio; (3) utilizability and transparency, so that applications written for PIM run efficiently out of the box, enabling rapid adoption; and (4) minimization of the virtualization performance overhead.
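As an illustration of the utilizability and transparency requirement, the sketch below is an ordinary UPMEM host program written against the standard dpu.h API; under vPIM, such a program is meant to run unmodified inside a guest VM. The DPU binary name (./checksum_dpu) and the DPU-side symbols (buffer, result) are illustrative and assume a matching DPU program.

#include <dpu.h>
#include <stdint.h>
#include <stdio.h>

#define NR_DPUS 4

int main(void) {
    struct dpu_set_t set, dpu;
    uint32_t input[NR_DPUS][256] = {{0}};

    DPU_ASSERT(dpu_alloc(NR_DPUS, NULL, &set));          /* reserve DPUs      */
    DPU_ASSERT(dpu_load(set, "./checksum_dpu", NULL));   /* load DPU program  */

    uint32_t i;
    DPU_FOREACH(set, dpu, i) {
        /* Copy one chunk of input into each DPU's on-chip buffer. */
        DPU_ASSERT(dpu_copy_to(dpu, "buffer", 0, input[i], sizeof(input[i])));
    }

    DPU_ASSERT(dpu_launch(set, DPU_SYNCHRONOUS));        /* compute in memory */

    DPU_FOREACH(set, dpu, i) {
        uint32_t result;
        DPU_ASSERT(dpu_copy_from(dpu, "result", 0, &result, sizeof(result)));
        printf("dpu %u -> %u\n", i, result);
    }

    DPU_ASSERT(dpu_free(set));
    return 0;
}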
For virtualized cloud applications, GuaNary is a novel defense against buffer overflows that allows synchronous detection at a low memory-footprint cost. To this end, GuaNary leverages Intel Sub-Page write Permission (SPP), a recent hardware virtualization feature that makes it possible to write-protect guest memory at a granularity of 128B (a sub-page) instead of 4KB. We implement a software stack, LeanGuard, which promotes the use of SPP from inside virtual machines through new secure allocators built on GuaNary. Our evaluation shows that, for the same number of protected buffers, LeanGuard consumes 8.3× less memory than SlimGuard, a state-of-the-art secure allocator. Further, for a given amount of memory, LeanGuard protects 25× more buffers than SlimGuard.
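For intuition, the sketch below shows the classic guard approach at 4KB page granularity: the page following a buffer is write-protected, so the first out-of-bounds write faults synchronously. GuaNary keeps this synchronous-detection principle but relies on the hypervisor programming Intel SPP to write-protect 128B sub-page guards (something mprotect cannot express), which is where the memory savings over page-granularity guards come from. This is an illustrative analogy, not LeanGuard's implementation.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Synchronous detection: the faulting out-of-bounds write lands here. */
static void on_segv(int sig) {
    (void)sig;
    write(STDOUT_FILENO, "overflow detected\n", 18);
    _exit(1);
}

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    signal(SIGSEGV, on_segv);

    /* One writable data page followed by one write-protected guard page. */
    unsigned char *region = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED) { perror("mmap"); return 1; }
    if (mprotect(region + page, page, PROT_READ) != 0) { perror("mprotect"); return 1; }

    unsigned char *buf = region;   /* the user buffer lives in the data page  */
    memset(buf, 0, page);          /* in-bounds writes succeed                */
    buf[page] = 0x41;              /* first byte past the buffer: faults here */

    printf("not reached\n");
    return 0;
}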
Out of Hypervisor (OoH) is a new virtualization research axis. Instead of emulating full virtual hardware inside a VM to support a nested hypervisor, the OoH principle is to individually expose current hypervisor-oriented hardware virtualization features to the guest OS, so that guest processes can also take advantage of them. I illustrate OoH with Intel PML (Page Modification Logging), a feature that enables efficient dirty-page tracking to improve VM live migration. Because dirty-page tracking is at the heart of many essential tasks, including process checkpointing (e.g., CRIU) and concurrent garbage collection (e.g., Boehm GC), OoH exposes PML to accelerate these tasks in the guest. We present two OoH solutions, Shadow PML (SPML) and Extended PML (EPML), which we integrated into CRIU and Boehm GC. Evaluation results show that EPML speeds up CRIU checkpointing by about 13× and Boehm garbage collection by up to 6× compared to SPML, /proc, and userfaultfd, while reducing the monitoring impact on applications by about 16×.
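For context, the sketch below shows the /proc soft-dirty interface used as one of the baselines above: clearing the soft-dirty bits through /proc/self/clear_refs and then reading bit 55 of each /proc/self/pagemap entry reveals which pages were written in the interval. SPML/EPML instead let the guest obtain this information from Intel PML hardware. (Reading pagemap may require additional privileges on some kernels.)

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SOFT_DIRTY_BIT (1ULL << 55)

/* Clear all soft-dirty bits for the current process. */
static void clear_soft_dirty(void) {
    int fd = open("/proc/self/clear_refs", O_WRONLY);
    if (fd < 0 || write(fd, "4", 1) != 1) { perror("clear_refs"); exit(1); }
    close(fd);
}

/* Return 1 if the page containing addr was written since the last clear. */
static int page_is_dirty(void *addr) {
    long page = sysconf(_SC_PAGESIZE);
    uint64_t entry;
    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) { perror("pagemap"); exit(1); }
    off_t off = ((uintptr_t)addr / page) * sizeof(entry);
    if (pread(fd, &entry, sizeof(entry), off) != sizeof(entry)) { perror("pread"); exit(1); }
    close(fd);
    return (entry & SOFT_DIRTY_BIT) != 0;
}

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    char *buf = aligned_alloc(page, 2 * page);
    memset(buf, 0, 2 * page);              /* touch both pages              */

    clear_soft_dirty();                    /* start a new tracking interval */
    buf[0] = 'x';                          /* dirty only the first page     */

    printf("page 0 dirty: %d, page 1 dirty: %d\n",
           page_is_dirty(buf), page_is_dirty(buf + page));
    return 0;
}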