Segmentation Assisted U-shaped Multi-scale Transformer for Crowd Counting

Yifei Qian, Liangfei Zhang, Xiaopeng Hong, Carl R. Donovan, Ognjen Arandjelović

Research output: Contribution to conference › Paper › peer-review

Abstract

Automated crowd counting has made remarkable progress in computer vision in recent years, thanks to the development of CNNs. However, this application area has run into a bottleneck: CNNs, by their nature, are limited to local receptive fields and are incapable of modelling larger-scale dependencies. To address this problem, we introduce a multi-scale transformer-based crowd-counting network, termed Crowd U-Transformer (CUT), which extracts and aggregates semantic and spatial features from multiple levels. In this design, we use crowd segmentation as an attention module to obtain fine-grained features. We also propose a loss function that focuses more strongly on counting performance in the foreground area. Experimental results on four widely used benchmarks show that our method achieves state-of-the-art performance.
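The abstract names two ideas without giving their form: a crowd segmentation map used as attention, and a loss that emphasises the foreground. The sketch below is only an illustration of those two ideas in their simplest possible shape, not the formulation used in the paper. The function names, the element-wise gating, the squared-error form, and the `fg_weight` parameter are all assumptions made for this example.

```python
from typing import List

Grid = List[List[float]]  # a 2-D map stored as rows of floats


def apply_segmentation_attention(features: Grid, seg_prob: Grid) -> Grid:
    """Scale each feature value by the foreground probability at that pixel.

    Assumption: attention is a plain element-wise product; the paper's
    attention module may be designed differently.
    """
    return [
        [f * p for f, p in zip(f_row, p_row)]
        for f_row, p_row in zip(features, seg_prob)
    ]


def foreground_weighted_loss(pred: Grid, target: Grid, fg_mask: Grid,
                             fg_weight: float = 2.0) -> float:
    """Mean squared error in which foreground pixels count fg_weight times.

    Assumption: a hypothetical weighting scheme chosen for illustration,
    not the loss proposed in the paper.
    """
    total = 0.0
    n_pixels = 0
    for p_row, t_row, m_row in zip(pred, target, fg_mask):
        for p, t, m in zip(p_row, t_row, m_row):
            weight = fg_weight if m > 0.5 else 1.0
            total += weight * (p - t) ** 2
            n_pixels += 1
    return total / n_pixels
```

With `fg_weight` greater than 1, an error of the same size costs more inside the crowd region than in the background, which is one simple way to bias training toward foreground counting accuracy.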

Original language: English
Publication status: Published - 2022
Event: 33rd British Machine Vision Conference Proceedings, BMVC 2022 - London, United Kingdom
Duration: 21 Nov 2022 to 24 Nov 2022

Conference

Conference: 33rd British Machine Vision Conference Proceedings, BMVC 2022
Country/Territory: United Kingdom
City: London
Period: 21/11/22 to 24/11/22
