Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/16782
Title: CLT-MambaSeg: An integrated model of Convolution, Linear Transformer and Multiscale Mamba for medical image segmentation
Authors: Uppal, Dolly
Prakash, Surya
Keywords: Generative Adversarial Network;Mamba;Medical Image Segmentation;State Space Model;Transformer;Computational Efficiency;Convolution;Convolutional Neural Networks;Deep Learning;Image Enhancement;Medical Image Processing;Memory Architecture;Network Architecture;State Space Methods;Adversarial Networks;Global Context;Local Feature;Long-range Dependencies;Skin Imaging;State-space;Image Segmentation;Article;Back Propagation;Computer Vision;Cross Validation;Echomammography;Feature Extraction;Feature Learning (Machine Learning);Gaussian Noise;Human;Machine Learning;Natural Language Processing;Residual Neural Network
Issue Date: 2025
Publisher: Elsevier Ltd
Citation: Uppal, D., & Prakash, S. (2025). CLT-MambaSeg: An integrated model of Convolution, Linear Transformer and Multiscale Mamba for medical image segmentation. Computers in Biology and Medicine, 196. https://doi.org/10.1016/j.compbiomed.2025.110736
Abstract: Recent advances in deep learning have significantly enhanced the performance of medical image segmentation. However, maintaining a balanced integration of feature localization, global context modeling, and computational efficiency remains a critical research challenge. Convolutional Neural Networks (CNNs) effectively capture fine-grained local features through hierarchical convolutions; however, they often struggle to model long-range dependencies due to their limited receptive field. Transformers address this limitation by leveraging self-attention mechanisms to capture global context, but they are computationally intensive and require large-scale data for effective training. The Mamba architecture has emerged as a promising alternative, capturing long-range dependencies while maintaining low computational overhead and high segmentation accuracy. Building on this, we propose CLT-MambaSeg, a method that integrates Convolution, Linear Transformer, and Multiscale Mamba architectures to capture local features, model global context, and improve computational efficiency for medical image segmentation. It utilizes a convolution-based Spatial Representation Extraction (SREx) module to capture intricate spatial relationships and dependencies, and a Mamba Vision Linear Transformer (MVLTrans) module to capture multiscale context, spatial and sequential dependencies, and enhanced global context. In addition, to address the problem of limited data, we propose a novel Memory-Guided Augmentation Generative Adversarial Network (MeGA-GAN) that generates realistic synthetic images to further enhance segmentation performance. We conduct extensive experiments and ablation studies on five benchmark datasets: CVC-ClinicDB, Breast Ultrasound Images (BUSI), PH2, and two datasets from the International Skin Imaging Collaboration (ISIC), namely ISIC-2016 and ISIC-2017. Experimental results demonstrate the efficacy of the proposed CLT-MambaSeg compared to other state-of-the-art methods. © 2025 Elsevier B.V. All rights reserved.
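Illustrative sketch (not part of the item record): the abstract combines three component families, namely convolutional feature extraction, linear (kernelized) attention, and a Mamba-style selective state-space scan. The minimal PyTorch sketch below shows how such components operate on an image feature map. All module names, shapes, and wiring here are assumptions for illustration only; they do not reproduce the authors' SREx, MVLTrans, or MeGA-GAN implementations.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ConvSpatialBlock(nn.Module):
        # Residual 3x3 convolution block: captures fine-grained local
        # features (a loose stand-in for a convolution-based module
        # such as SREx, whose actual design is not given here).
        def __init__(self, channels):
            super().__init__()
            self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
            self.norm = nn.BatchNorm2d(channels)

        def forward(self, x):                      # x: (B, C, H, W)
            return x + F.relu(self.norm(self.conv(x)))

    def linear_attention(q, k, v):
        # O(N) attention: the softmax kernel is replaced by
        # phi(x) = elu(x) + 1, so attention factorizes as
        # phi(q) @ (phi(k)^T v), avoiding the quadratic N x N
        # attention matrix of a standard Transformer.
        q, k = F.elu(q) + 1.0, F.elu(k) + 1.0      # (B, N, D)
        kv = torch.einsum('bnd,bne->bde', k, v)    # (B, D, D)
        z = 1.0 / (torch.einsum('bnd,bd->bn', q, k.sum(dim=1)) + 1e-6)
        return torch.einsum('bnd,bde,bn->bne', q, kv, z)

    class SelectiveScan(nn.Module):
        # Toy diagonal state-space recurrence h_t = a * h_{t-1} + b_t * x_t,
        # y_t = <h_t, c_t>: a simplified stand-in for the Mamba scan,
        # written as an explicit loop rather than the optimized kernel.
        def __init__(self, dim, state=16):
            super().__init__()
            self.decay = nn.Parameter(torch.zeros(dim, state))
            self.b_proj = nn.Linear(dim, state)
            self.c_proj = nn.Linear(dim, state)

        def forward(self, x):                      # x: (B, L, D)
            B, L, D = x.shape
            a = torch.sigmoid(self.decay)          # per-channel decay in (0, 1)
            h = x.new_zeros(B, D, self.decay.shape[1])
            ys = []
            for t in range(L):
                b_t = self.b_proj(x[:, t]).unsqueeze(1)      # (B, 1, N)
                h = a * h + b_t * x[:, t].unsqueeze(-1)      # (B, D, N)
                ys.append((h * self.c_proj(x[:, t]).unsqueeze(1)).sum(-1))
            return torch.stack(ys, dim=1)          # (B, L, D)

    # Usage: local features from convolution, then global context from the
    # linear-attention and state-space branches over the flattened tokens.
    x = torch.randn(2, 32, 64, 64)
    feats = ConvSpatialBlock(32)(x)
    tokens = feats.flatten(2).transpose(1, 2)      # (B, H*W, C)
    global_ctx = linear_attention(tokens, tokens, tokens)
    seq_ctx = SelectiveScan(32)(tokens)

The sketch illustrates why such a combination can be efficient: both the kernelized attention and the diagonal state-space scan cost O(N) in the number of tokens, versus O(N^2) for standard self-attention.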
URI: https://dx.doi.org/10.1016/j.compbiomed.2025.110736
https://dspace.iiti.ac.in:8080/jspui/handle/123456789/16782
ISSN: 1879-0534
0010-4825
Type of Material: Journal Article
Appears in Collections:Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.


