Multi-tier GPU virtualization for deep learning in cloud-edge systems

Jason Kennedy, Vishal Sharma, Blesson Varghese, Carlos Reaño

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)
18 Downloads (Pure)

Abstract

Accelerator virtualization offers several advantages in the context of cloud-edge computing. Relatively weak user devices can enhance performance when running workloads by accessing virtualized accelerators available on other resources in the cloud-edge continuum. However, cloud-edge systems are heterogeneous, often leading to compatibility issues arising from various hardware and software stacks present in the system. One mechanism to alleviate this issue is using containers for deploying workloads. Containers isolate applications and their dependencies and store them as images that can run on any device. In addition, user devices may move during the course of application execution, and thus mechanisms such as container migration are required to move running workloads from one resource to another in the network. Furthermore, an optimal destination will need to be determined when migrating between virtual accelerators. Scheduling and placement strategies are incorporated to choose the best possible location depending on the workload requirements. This paper presents AVEC, a framework for accelerator virtualization in cloud-edge computing. The AVEC framework enables the offloading of deep learning workloads for inference from weak user devices to computationally more powerful devices in a cloud-edge network. AVEC incorporates a mechanism that efficiently manages and schedules the virtualization of accelerators. It also supports migration between accelerators to enable stateless container migration. The experimental analysis highlights that AVEC can achieve up to 7x speedup by offloading applications to remote resources. Furthermore, AVEC features a low migration downtime of less than 5 seconds.
Original language: English
Pages (from-to): 2107-2123
Number of pages: 17
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 34
Issue number: 7
Early online date: 10 May 2023
Publication status: Published - Jul 2023

Keywords

  • Edge computing
  • Accelerators
  • Virtualization
  • Containers
  • Migration
