CrowdFormer: Weakly-supervised crowd counting with improved generalizability

Savner, Siddharth Singh; Kanhangad, Vivek

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/12284

Full metadata record

DC Field	Value	Language
dc.contributor.author	Savner, Siddharth Singh	en_US
dc.contributor.author	Kanhangad, Vivek	en_US
dc.date.accessioned	2023-10-18T09:41:22Z	-
dc.date.available	2023-10-18T09:41:22Z	-
dc.date.issued	2023	-
dc.identifier.citation	Savner, S. S., & Kanhangad, V. (2023). CrowdFormer: Weakly-supervised crowd counting with improved generalizability. Journal of Visual Communication and Image Representation, 94, 103853. https://doi.org/10.1016/j.jvcir.2023.103853	en_US
dc.identifier.issn	1047-3203	-
dc.identifier.other	EID(2-s2.0-85160571350)	-
dc.identifier.uri	https://doi.org/10.1016/j.jvcir.2023.103853	-
dc.identifier.uri	https://dspace.iiti.ac.in/handle/123456789/12284	-
dc.description.abstract	Convolutional neural networks (CNNs) have dominated the field of computer vision for nearly a decade. However, due to their limited receptive field, CNNs fail to model the global context. On the other hand, transformers, an attention-based architecture, can model the global context easily. Despite this, there are limited studies that investigate the effectiveness of transformers in crowd counting. In addition, the majority of the existing crowd-counting methods are based on the regression of density maps which requires point-level annotation of each person present in the scene. This annotation task is laborious and also error-prone. This has led to an increased focus on weakly-supervised crowd-counting methods, which require only count-level annotations. In this paper, we propose a weakly-supervised method for crowd counting using a pyramid vision transformer. We have conducted extensive evaluations to validate the effectiveness of the proposed method. Our method achieves state-of-the-art performance. More importantly, it shows remarkable generalizability. © 2023 Elsevier Inc.	en_US
dc.language.iso	en	en_US
dc.publisher	Academic Press Inc.	en_US
dc.source	Journal of Visual Communication and Image Representation	en_US
dc.subject	Crowd counting	en_US
dc.subject	Generalizability	en_US
dc.subject	Vision transformers	en_US
dc.subject	Weakly-supervised method	en_US
dc.title	CrowdFormer: Weakly-supervised crowd counting with improved generalizability	en_US
dc.type	Journal Article	en_US
dc.rights.license	All Open Access, Green	-
Appears in Collections:	Department of Electrical Engineering

Files in This Item:

There are no files associated with this item.
