Abstract
Edge machine learning (ML) enables localized processing of data on devices and is underpinned by deep neural networks (DNNs). However, DNNs cannot easily be run on devices because delivering performance comparable to cloud-based ML demands substantial compute, memory and energy. Model compression techniques, such as pruning, have therefore been considered. Existing pruning methods are problematic for edge ML because they (1) create compressed models that either offer limited runtime performance benefits (unstructured pruning) or compromise final model accuracy (structured pruning), and (2) require substantial compute resources and time to identify a suitable compressed DNN model (neural architecture search). In this paper, we explore a new avenue, structured pruning applied at initialization (Pruning-at-Initialization, PaI), to mitigate the above problems. We develop Reconvene, a system for rapidly generating pruned models suited to edge deployments using structured PaI. Reconvene systematically identifies and prunes the DNN convolution layers that are least sensitive to structured pruning. It creates pruned DNNs within seconds that are up to 16.21× smaller and 2× faster while maintaining the same accuracy as an unstructured PaI counterpart.
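The contrast the abstract draws between unstructured and structured pruning can be illustrated with a minimal sketch. It is not Reconvene's method: the L1-norm filter ranking is a generic stand-in criterion, the layer is modelled as a plain list of flattened filters, and the function names `unstructured_prune` and `structured_prune` are hypothetical. Weights are randomly initialized and never trained, which loosely mirrors the at-initialization setting; the per-layer sensitivity analysis described above is not modelled.

```python
import random


def unstructured_prune(filters, sparsity):
    """Zero the smallest-magnitude individual weights; the layer keeps its shape."""
    magnitudes = sorted(abs(w) for f in filters for w in f)
    k = int(len(magnitudes) * sparsity)
    if k == 0:
        return [list(f) for f in filters]
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in f] for f in filters]


def structured_prune(filters, sparsity):
    """Drop whole filters with the smallest L1 norm; the layer gets smaller."""
    n_keep = max(1, len(filters) - int(len(filters) * sparsity))
    # Rank filter indices by L1 norm (assumed criterion, ascending order).
    ranked = sorted(range(len(filters)),
                    key=lambda i: sum(abs(w) for w in filters[i]))
    keep = sorted(ranked[-n_keep:])  # preserve the original filter order
    return [list(filters[i]) for i in keep]


# Hypothetical layer: 8 filters, each 3 x 3 x 3 = 27 randomly initialized weights.
random.seed(0)
layer = [[random.gauss(0.0, 1.0) for _ in range(27)] for _ in range(8)]

sparse = unstructured_prune(layer, 0.5)
small = structured_prune(layer, 0.5)
print(len(sparse), len(small))  # 8 filters remain vs. 4 filters remain
```

In this sketch the unstructured result still holds 8 filters of 27 weights, so a dense kernel performs the same amount of work unless it exploits sparsity, whereas the structured result holds only 4 filters, which shrinks the layer itself.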
Original language | English |
---|---|
Title of host publication | 2024 IEEE/ACM 24th International symposium on cluster, cloud and internet computing (CCGrid) |
Place of Publication | Los Alamitos |
Publisher | IEEE |
Pages | 317 - 326 |
Number of pages | 10 |
ISBN (Electronic) | 9798350395662 |
ISBN (Print) | 9798350395679 |
DOIs | |
Publication status | Published - 8 Oct 2024 |
Event | 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, Philadelphia, United States. Duration: 6 May 2024 → 9 May 2024. Conference number: 24. https://2024.ccgrid-conference.org |
Publication series
Name | IEEE International symposium on cluster, cloud and internet computing (CCGrid) |
---|---|
Publisher | IEEE |
ISSN (Print) | 2376-4414 |
ISSN (Electronic) | 2993-2114 |
Conference
Conference | 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing |
---|---|
Abbreviated title | CCGrid |
Country/Territory | United States |
City | Philadelphia |
Period | 6/05/24 → 9/05/24 |
Internet address | https://2024.ccgrid-conference.org |
Keywords
- Deep neural networks
- Edge computing
- Model compression
- Structured pruning