Abstract
Automated crowd counting has recently made remarkable progress in computer vision, thanks to the development of convolutional neural networks (CNNs). However, the field has hit a bottleneck because CNNs are, by their nature, limited to local receptive fields and cannot model longer-range dependencies. To address this problem, we introduce a multi-scale transformer-based crowd-counting network, termed Crowd U-Transformer (CUT), which extracts and aggregates semantic and spatial features from multiple levels. In this design, we use crowd segmentation as an attention module to obtain fine-grained features. We also propose a loss function that focuses more closely on counting performance in the foreground area. Experiments on four widely used benchmarks show that our method achieves state-of-the-art performance.
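The abstract does not spell out how the segmentation attention or the foreground-focused loss is formulated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a segmentation branch outputs a per-pixel foreground probability, that attention is a simple element-wise gate, and that the loss is a weighted mean squared error on a predicted density map. The function names, the `fg_weight` parameter, and the 0.5 threshold are all hypothetical.

```python
from typing import List

Grid = List[List[float]]  # a 2-D map stored as nested lists (illustrative only)


def apply_segmentation_attention(features: Grid, seg_prob: Grid) -> Grid:
    """Gate each feature value by the foreground probability at that pixel.

    Assumption: attention is a plain element-wise product; the paper may use
    a different mechanism.
    """
    return [
        [f * p for f, p in zip(f_row, p_row)]
        for f_row, p_row in zip(features, seg_prob)
    ]


def foreground_weighted_mse(
    pred: Grid, target: Grid, fg_mask: Grid, fg_weight: float = 2.0
) -> float:
    """Weighted MSE in which foreground pixels count fg_weight times as much.

    Assumption: pixels with mask value above 0.5 are foreground; the weighting
    scheme is hypothetical and stands in for the loss proposed in the paper.
    """
    total = 0.0
    weight_sum = 0.0
    for p_row, t_row, m_row in zip(pred, target, fg_mask):
        for p, t, m in zip(p_row, t_row, m_row):
            w = fg_weight if m > 0.5 else 1.0
            total += w * (p - t) ** 2
            weight_sum += w
    return total / weight_sum


# Toy 2x2 example: the gate suppresses features where the crowd mask is low.
gated = apply_segmentation_attention(
    [[1.0, 2.0], [3.0, 4.0]],
    [[1.0, 0.0], [0.5, 1.0]],
)

# One foreground error and one background error of equal magnitude.
loss = foreground_weighted_mse(
    pred=[[1.0, 0.0], [0.0, 0.0]],
    target=[[0.0, 0.0], [0.0, 1.0]],
    fg_mask=[[1.0, 0.0], [0.0, 0.0]],
    fg_weight=3.0,
)
```

With `fg_weight=1.0` the loss reduces to an ordinary mean squared error, which makes it easy to check how much influence the foreground weighting has in this sketch.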
| Original language | English |
|---|---|
| Publication status | Published - 2022 |
| Event | 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom. Duration: 21 Nov 2022 → 24 Nov 2022 |
Conference
| Conference | 33rd British Machine Vision Conference Proceedings, BMVC 2022 |
|---|---|
| Country/Territory | United Kingdom |
| City | London |
| Period | 21/11/22 → 24/11/22 |
Fingerprint
Dive into the research topics of 'Segmentation Assisted U-shaped Multi-scale Transformer for Crowd Counting'. Together they form a unique fingerprint.
Student theses
- Towards fully automated analysis of crowd counting in images. Qian, Y. (Author), Donovan, C. R. (Supervisor), 3 Dec 2024. Student thesis: Doctoral Thesis (PhD)