Rapid deployment of DNNs for edge computing via structured pruning at initialization

Bailey Jack Eccles*, Leon Wong, Blesson Varghese

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Edge machine learning (ML) enables localized processing of data on devices and is underpinned by deep neural networks (DNNs). However, DNNs cannot be easily run on devices due to their substantial computing, memory and energy requirements for delivering performance that is comparable to cloud-based ML. Therefore, model compression techniques, such as pruning, have been considered. Existing pruning methods are problematic for edge ML since they: (1) create compressed models that have limited runtime performance benefits (using unstructured pruning) or compromise the final model accuracy (using structured pruning), and (2) require substantial compute resources and time for identifying a suitable compressed DNN model (using neural architecture search). In this paper, we explore a new avenue, referred to as Pruning-at-Initialization (PaI), using structured pruning to mitigate the above problems. We develop Reconvene, a system for rapidly generating pruned models suited for edge deployments using structured PaI. Reconvene systematically identifies and prunes DNN convolution layers that are least sensitive to structured pruning. Reconvene creates pruned DNNs within seconds that are up to 16.21× smaller and 2× faster while maintaining the same accuracy as an unstructured PaI counterpart.
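
The contrast the abstract draws between unstructured and structured pruning can be illustrated with a minimal sketch. This is not Reconvene's method: the function names, the L1-norm filter score, the 50% sparsity and the layer sizes are all assumptions chosen for illustration. The sketch only shows why zeroing individual weights leaves the layer shape unchanged, while removing whole filters from a freshly initialized (untrained) layer yields a smaller dense layer.

```python
import random


def init_conv_weights(out_channels, in_channels, kernel_size, seed=0):
    """Randomly initialise a conv layer as a list of flattened filters."""
    rng = random.Random(seed)
    fan_in = in_channels * kernel_size * kernel_size
    return [[rng.gauss(0.0, 1.0) for _ in range(fan_in)]
            for _ in range(out_channels)]


def unstructured_prune(filters, sparsity):
    """Zero the smallest-magnitude individual weights; shape is unchanged."""
    magnitudes = sorted(abs(w) for f in filters for w in f)
    n_prune = int(len(magnitudes) * sparsity)
    if n_prune == 0:
        return [list(f) for f in filters]
    threshold = magnitudes[n_prune - 1]
    return [[0.0 if abs(w) <= threshold else w for w in f] for f in filters]


def structured_prune(filters, sparsity):
    """Drop whole filters with the smallest L1 norm (an assumed score)."""
    n_keep = len(filters) - int(len(filters) * sparsity)
    ranked = sorted(range(len(filters)),
                    key=lambda i: sum(abs(w) for w in filters[i]),
                    reverse=True)
    keep = sorted(ranked[:n_keep])  # preserve the original filter order
    return [list(filters[i]) for i in keep]


# Pruning happens on untrained weights, i.e. "at initialization".
weights = init_conv_weights(out_channels=8, in_channels=3, kernel_size=3)
sparse = unstructured_prune(weights, sparsity=0.5)
compact = structured_prune(weights, sparsity=0.5)

print(len(sparse), len(sparse[0]))    # still 8 filters of 27 weights
print(len(compact), len(compact[0]))  # only 4 filters of 27 weights remain
```

In this toy setting the unstructured result needs sparse-aware kernels to run faster, whereas the structured result is simply a smaller dense layer, which is the runtime distinction the abstract refers to.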
Original language: English
Title of host publication: 2024 IEEE/ACM 24th International symposium on cluster, cloud and internet computing (CCGrid)
Place of Publication: Los Alamitos
Publisher: IEEE
Pages: 317-326
Number of pages: 10
ISBN (Electronic): 9798350395662
ISBN (Print): 9798350395679
Publication status: Published - 8 Oct 2024
Event: 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing - Philadelphia, United States
Duration: 6 May 2024 to 9 May 2024
Conference number: 24
https://2024.ccgrid-conference.org

Publication series

Name: IEEE International symposium on cluster, cloud and internet computing (CCGrid)
Publisher: IEEE
ISSN (Print): 2376-4414
ISSN (Electronic): 2993-2114

Conference

Conference: 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing
Abbreviated title: CCGrid
Country/Territory: United States
City: Philadelphia
Period: 6/05/24 to 9/05/24

Keywords

  • Deep neural networks
  • Edge computing
  • Model compression
  • Structured pruning
