PI: Prof. Sarma Vrudhula (Arizona State University)
Ph.D. Students: Mehdi Ghasemi, Soroush Heidari (Arizona State University)
Collaborators: Dr. Carole-Jean Wu (Facebook), Dr. Young Geun Kim (Soongsil University)
Support: National Science Foundation (CCF-2008244)
Project Abstract:
Humans can seamlessly detect and classify objects from a wide range of complex data sources, and draw inferences to make predictions and decisions. New types of algorithms, known as deep neural networks (DNNs), are being developed to endow computers with these very same capabilities. The present approach of transferring all the data to a remote datacenter and having the algorithms executed there is not sustainable: the amount of data being generated is growing exponentially, and the approach is too slow and can compromise privacy and security. The aim of this project is to enable the execution of complex DNN algorithms at or near the place of data acquisition. Referred to as "AI at the edge", this capability is being pursued by nearly all the leading industries, which are developing varieties of new "edge devices" to be deployed in the field. This project will develop a framework consisting of technology-agnostic software tools that will optimally deploy DNN algorithms on heterogeneous networks of edge devices to maximize their performance and energy efficiency. Domains that will benefit from the outcomes of this project include retail, security, transportation and logistics, factory automation, and healthcare. The project team will include graduate and undergraduate students, and a strong effort will be made to recruit students from underrepresented groups. The team will also vigorously pursue various avenues for commercialization.
The aim of this project is to enable "AI at the Edge" using DNN algorithms, which can be trained on any kind of data, in any number of dimensions, and then used to extract valuable information for automated prediction, classification, and decision making. Sophisticated DNN models can involve hundreds of layers and tens of millions of parameters. Because training is computation and memory intensive, it is performed on servers. For performing inference at the edge, however, industry is building hardware accelerators that implement DNNs in silicon and integrating them with mobile systems-on-chip (SoCs) to be deployed at the edge, each with its own architecture, memory organization, and neuromorphic engine. Furthermore, complex ML applications will be expressed as heterogeneous Networks of Models (NoMs) of DNNs operating on streaming data. The key challenge to be addressed in this project is to determine how to optimally map NoMs, whose structure keeps changing depending on the content of the data, onto a network of heterogeneous edge computing devices. The optimization will involve replicating and pipelining DNN models and deciding, at run-time, on which edge computing device to deploy each instance of a model. This determination will be based on the content of the data stream, the available resources, the characteristics of the communication medium, and the present allocation of models to devices. The outcomes of this project will include technology-agnostic algorithms and software tools for performing this mapping.
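To make the mapping problem concrete, the following is a minimal sketch, not the project's actual tool, of an energy-aware greedy assignment of DNN model instances to heterogeneous edge devices. The names (Device, Model, map_models) and the abstract "work units" cost model are illustrative assumptions; the project's run-time optimization additionally accounts for data content, pipelining, and communication costs.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    capacity: float          # available compute budget (arbitrary work units)
    energy_per_unit: float   # energy cost per unit of work on this device
    load: float = 0.0        # work already placed on this device

@dataclass
class Model:
    name: str
    work: float              # compute demand of one model instance

def map_models(models, devices):
    """Greedily place each model instance on the feasible device with the
    lowest incremental energy cost, largest instances first."""
    placement = {}
    for m in sorted(models, key=lambda m: m.work, reverse=True):
        feasible = [d for d in devices if d.load + m.work <= d.capacity]
        if not feasible:
            raise RuntimeError(f"no device can host {m.name}")
        best = min(feasible, key=lambda d: m.work * d.energy_per_unit)
        best.load += m.work
        placement[m.name] = best.name
    return placement

devices = [Device("gpu", capacity=10.0, energy_per_unit=1.0),
           Device("cpu", capacity=6.0, energy_per_unit=3.0)]
models = [Model("detector", work=5.0), Model("classifier", work=4.0)]
print(map_models(models, devices))  # both fit on the cheaper "gpu" device
```

A real run-time mapper would re-run a decision like this whenever the NoM structure or device availability changes, rather than solving it once offline.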
Publications:
Ghasemi, Mehdi, Soroush Heidari, Young Geun Kim, Aaron Lamb, Carole-Jean Wu, and Sarma Vrudhula. "Energy-Efficient Mapping for a Network of DNN Models at the Edge." In 2021 IEEE International Conference on Smart Computing (SMARTCOMP), pp. 25-30. IEEE, 2021.