UNFOLD: 3-D U-Net, 3-D CNN, and 3-D Transformer-Based Hyperspectral Image Denoising

Dixit, Aditya; Gupta, Anup Kumar; Gupta, Puneet

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/12862

Title:	UNFOLD: 3-D U-Net, 3-D CNN, and 3-D Transformer-Based Hyperspectral Image Denoising
Authors:	Dixit, Aditya; Gupta, Anup Kumar; Gupta, Puneet
Keywords:	3-D convolutional neural networks (CNNs);3-D transformers;3-D U-Net;hyperspectral imaging denoising;spatial spectral fusion
Issue Date:	2023
Publisher:	Institute of Electrical and Electronics Engineers Inc.
Abstract:	Hyperspectral images (HSIs) encompass data across numerous spectral bands, making them valuable in practical fields such as remote sensing, agriculture, and marine monitoring. Unfortunately, noise inevitably introduced during sensing restricts their applicability, so denoising is needed before they can be used effectively. Existing deep learning (DL)-based denoising methods suffer from various limitations. For instance, convolutional neural networks (CNNs) struggle with long-range dependencies, while vision transformers (ViTs) struggle to capture local details. This article introduces a novel method, UNFOLD, that addresses these limitations by integrating the strengths of 3-D U-Net, 3-D CNN, and 3-D Transformer architectures. Unlike several existing methods that predominantly capture dependencies along either the spatial or the spectral dimension, UNFOLD treats HSI denoising as a 3-D task, combining spatial and spectral information through a 3-D Transformer and a 3-D CNN. It employs the self-attention (SA) mechanism of Transformers to capture global dependencies and model long-range relationships across the spatial and spectral dimensions. To overcome the limitations of the 3-D Transformer in capturing fine-grained local and spatial features, UNFOLD complements it with a 3-D CNN. Moreover, UNFOLD uses a modified form of the 3-D U-Net architecture for HSI denoising, in which a 3-D Transformer-based encoder replaces the conventional 3-D CNN-based encoder. It further capitalizes on the ability of U-Net to integrate features across various scales, thereby enhancing efficacy by preserving intricate structural details. Results from extensive experiments demonstrate that UNFOLD outperforms state-of-the-art HSI denoising methods. © 1980-2012 IEEE.
URI:	https://doi.org/10.1109/TGRS.2023.3328922 https://dspace.iiti.ac.in/handle/123456789/12862
ISSN:	0196-2892
Type of Material:	Journal Article
Appears in Collections:	Department of Computer Science and Engineering

