Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/16325
Title: | Glaucoformer: Dual-domain Global Transformer Network for Generalized Glaucoma Stage Classification |
Authors: | Das, D.;Nayak, D. R.;Pachori, Ram Bilas |
Keywords: | dual-domain transformer layer;fast Fourier transform;fundus image;Glaucoformer;Glaucoma stage |
Issue Date: | 2025 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
Citation: | Das, D., Nayak, D. R., & Pachori, R. B. (2025). Glaucoformer: Dual-domain Global Transformer Network for Generalized Glaucoma Stage Classification. IEEE Journal of Biomedical and Health Informatics. https://doi.org/10.1109/JBHI.2025.3574997 |
Abstract: | Classification of glaucoma stages remains challenging due to substantial inter-stage similarities, the presence of irrelevant features, and subtle variations in lesion size, shape, and color in fundus images. To this end, a few efforts have recently been made using traditional machine learning and deep learning models, specifically convolutional neural networks (CNNs). While conventional CNN models capture local contextual features within fixed receptive fields, they fail to exploit global contextual dependencies. Transformers, on the other hand, are capable of modeling global contextual information. However, they lack the ability to capture local contexts and focus solely on performing attention in the spatial domain, ignoring feature analysis in the frequency domain. To address these issues, we present a novel dual-domain global transformer network, Glaucoformer, to effectively classify glaucoma stages. Specifically, we propose a dual-domain global transformer layer (DGTL), consisting of dual-domain channel attention (DCA) and dual-domain spatial attention (DSA) with a Fourier domain feature analyzer (FDFA) as the core component, integrated with a backbone. This helps exploit local and global contextual feature dependencies in both the spatial and frequency domains, thereby learning prominent and discriminative feature representations. A shared key-query scheme is introduced to learn complementary features while reducing the number of parameters. In addition, the DGTL leverages deformable convolution to enable the model to handle complex lesion irregularities. We evaluate our method on a benchmark dataset, and the experimental results and extensive comparisons with existing CNN- and vision transformer-based approaches indicate its effectiveness for glaucoma stage classification. Moreover, results on an unseen dataset demonstrate the generalizability of the model. |
URI: | https://dx.doi.org/10.1109/JBHI.2025.3574997 https://dspace.iiti.ac.in:8080/jspui/handle/123456789/16325 |
ISSN: | 2168-2194 |
Type of Material: | Journal Article |
Appears in Collections: | Department of Electrical Engineering |
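The abstract describes the DGTL as combining spatial-domain attention with frequency-domain feature analysis based on the fast Fourier transform. As a rough, stdlib-only illustration of that dual-branch idea on a 1-D feature vector (this is not the authors' implementation; all function names are hypothetical, a naive DFT stands in for the FFT, and the "attention" is a toy self-similarity reweighting):

```python
import cmath
import math

def dft(x):
    # Naive discrete Fourier transform (stand-in for an FFT).
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    # Inverse DFT, returning the real part of each sample.
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)).real / N
            for n in range(N)]

def softmax(xs):
    m = max(xs)
    e = [math.exp(v - m) for v in xs]
    s = sum(e)
    return [v / s for v in e]

def dual_domain_attention(feat, freq_mask):
    """Toy dual-branch reweighting of a 1-D feature vector.

    feat:      list of feature values
    freq_mask: per-frequency weights (a crude stand-in for a learned
               frequency-domain filter such as the paper's FDFA)
    """
    # Spatial branch: reweight each element by a softmax over its
    # self-similarity (one score per position, echoing a shared
    # key-query scheme where keys and queries coincide).
    scores = softmax([f * f for f in feat])
    spatial = [w * f * len(feat) for w, f in zip(scores, feat)]
    # Frequency branch: filter the spectrum, then transform back.
    spectrum = dft(feat)
    filtered = [m * s for m, s in zip(freq_mask, spectrum)]
    freq = idft(filtered)
    # Fuse the two branches with a simple average.
    return [0.5 * (a + b) for a, b in zip(spatial, freq)]
```

In the paper, both branches operate on 2-D feature maps inside a transformer layer rather than on raw 1-D vectors, and the frequency-domain filtering is learned; the sketch only conveys how a spatial and a Fourier-domain path can process the same features and be fused.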
Files in This Item:
There are no files associated with this item.