Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/16325
Title: | Glaucoformer: Dual-domain Global Transformer Network for Generalized Glaucoma Stage Classification |
Authors: | Das, D.;Nayak, D. R.;Pachori, Ram Bilas |
Keywords: | dual-domain transformer layer;fast Fourier transform;fundus image;Glaucoformer;Glaucoma stage |
Issue Date: | 2025 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Citation: | Das, D., Nayak, D. R., & Pachori, R. B. (2025). Glaucoformer: Dual-domain Global Transformer Network for Generalized Glaucoma Stage Classification. IEEE Journal of Biomedical and Health Informatics. https://doi.org/10.1109/JBHI.2025.3574997 |
Abstract: | Classification of glaucoma stages remains challenging due to substantial inter-stage similarities, the presence of irrelevant features, and subtle variations in lesion size, shape, and color in fundus images. To this end, a few efforts have recently been made using traditional machine learning and deep learning models, specifically convolutional neural networks (CNNs). While conventional CNN models capture local contextual features within fixed receptive fields, they fail to exploit global contextual dependencies. Transformers, on the other hand, are capable of modeling global contextual information. However, they lack the ability to capture local contexts and focus solely on performing attention in the spatial domain, ignoring feature analysis in the frequency domain. To address these issues, we present a novel dual-domain global transformer network, Glaucoformer, to effectively classify glaucoma stages. Specifically, we propose a dual-domain global transformer layer (DGTL), consisting of dual-domain channel attention (DCA) and dual-domain spatial attention (DSA) with a Fourier domain feature analyzer (FDFA) as the core component, integrated with a backbone. This helps exploit local and global contextual feature dependencies in both the spatial and frequency domains, thereby learning prominent and discriminative feature representations. A shared key-query scheme is introduced to learn complementary features while reducing the number of parameters. In addition, the DGTL leverages deformable convolution to enable the model to handle complex lesion irregularities. We evaluate our method on a benchmark dataset, and the experimental results and extensive comparisons with existing CNN- and vision transformer-based approaches indicate its effectiveness for glaucoma stage classification. Moreover, results on an unseen dataset demonstrate the generalizability of the model. |
URI: | https://dx.doi.org/10.1109/JBHI.2025.3574997 https://dspace.iiti.ac.in:8080/jspui/handle/123456789/16325 |
ISSN: | 2168-2194 |
Type of Material: | Journal Article |
Appears in Collections: | Department of Electrical Engineering |
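The abstract describes the DGTL as combining spatial-domain attention with frequency-domain feature analysis based on the fast Fourier transform. As a rough, stdlib-only illustration of that dual-branch idea on a 1-D feature vector (this is not the authors' implementation; all function names are hypothetical, a naive DFT stands in for the FFT, and the "attention" is a toy self-similarity reweighting):

```python
import cmath
import math

def dft(x):
    # Naive discrete Fourier transform (stand-in for an FFT).
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    # Inverse DFT, returning the real part of each sample.
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

def softmax(xs):
    m = max(xs)
    e = [math.exp(v - m) for v in xs]
    s = sum(e)
    return [v / s for v in e]

def dual_domain_attention(feat, freq_mask):
    """Toy dual-branch reweighting of a 1-D feature vector.

    feat:      list of feature values
    freq_mask: per-frequency weights (a crude stand-in for a learned
               frequency-domain filter such as the paper's FDFA)
    """
    # Spatial branch: reweight each element by a softmax over its
    # self-similarity (one score per position, echoing a shared
    # key-query scheme where keys and queries coincide).
    scores = softmax([f * f for f in feat])
    spatial = [w * f * len(feat) for w, f in zip(scores, feat)]
    # Frequency branch: filter the spectrum, then transform back.
    spectrum = dft(feat)
    filtered = [m * s for m, s in zip(freq_mask, spectrum)]
    freq = idft(filtered)
    # Fuse the two branches with a simple average.
    return [0.5 * (a + b) for a, b in zip(spatial, freq)]
```

In the paper, both branches operate on 2-D feature maps inside a transformer layer rather than on raw 1-D vectors, and the frequency-domain filtering is learned; the sketch only conveys how a spatial and a Fourier-domain path can process the same features and be fused.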
Files in This Item:
There are no files associated with this item.