Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/12862
Full metadata record
DC Field    Value    Language
dc.contributor.author    Dixit, Aditya    en_US
dc.contributor.author    Gupta, Anup Kumar    en_US
dc.contributor.author    Gupta, Puneet    en_US
dc.date.accessioned    2023-12-22T09:16:24Z    -
dc.date.available    2023-12-22T09:16:24Z    -
dc.date.issued    2023    -
dc.identifier.citation    Dixit, A., Gupta, A. K., & Gupta, P. (2023). UNFOLD: 3-D U-Net, 3-D CNN, and 3-D Transformer-Based Hyperspectral Image Denoising. IEEE Transactions on Geoscience and Remote Sensing. Scopus. https://doi.org/10.1109/TGRS.2023.3328922    en_US
dc.identifier.issn    0196-2892    -
dc.identifier.other    EID(2-s2.0-85177039506)    -
dc.identifier.uri    https://doi.org/10.1109/TGRS.2023.3328922    -
dc.identifier.uri    https://dspace.iiti.ac.in/handle/123456789/12862    -
dc.description.abstract    Hyperspectral images (HSIs) encompass data across numerous spectral bands, making them valuable in various practical fields such as remote sensing, agriculture, and marine monitoring. Unfortunately, inevitable noise introduction during sensing restricts their applicability, necessitating denoising for optimal utilization. The existing deep learning (DL)-based denoising methods suffer from various limitations. For instance, convolutional neural networks (CNNs) struggle with long-range dependencies, while vision transformers (ViTs) struggle to capture local details. This article introduces a novel method, UNFOLD, that addresses these inherent limitations by harmoniously integrating the strengths of 3-D U-Net, 3-D CNN, and 3-D Transformer architectures. Unlike several existing methods that predominantly capture dependencies along either the spatial or the spectral dimension, UNFOLD addresses HSI denoising as a 3-D task, synergizing spatial and spectral information through the use of a 3-D Transformer and a 3-D CNN. It employs the self-attention (SA) mechanism of Transformers to capture global dependencies and model long-range relationships across the spatial and spectral dimensions. To overcome the limitations of the 3-D Transformer in capturing fine-grained local and spatial features, UNFOLD complements it by incorporating a 3-D CNN. Moreover, UNFOLD utilizes a modified form of the 3-D U-Net architecture for HSI denoising, wherein it employs a 3-D Transformer-based encoder instead of the conventional 3-D CNN-based encoder. It further capitalizes on the property of U-Net to integrate features across various scales, thereby enhancing efficacy by preserving intricate structural details. Results from extensive experiments demonstrate that UNFOLD outperforms state-of-the-art HSI denoising methods. © 1980-2012 IEEE.    en_US
dc.language.iso    en    en_US
dc.publisher    Institute of Electrical and Electronics Engineers Inc.    en_US
dc.source    IEEE Transactions on Geoscience and Remote Sensing    en_US
dc.subject    3-D convolutional neural networks (CNNs)    en_US
dc.subject    3-D transformers    en_US
dc.subject    3-D U-Net    en_US
dc.subject    hyperspectral imaging denoising    en_US
dc.subject    spatial spectral fusion    en_US
dc.title    UNFOLD: 3-D U-Net, 3-D CNN, and 3-D Transformer-Based Hyperspectral Image Denoising    en_US
dc.type    Journal Article    en_US
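
The abstract above describes a hybrid design: a 3-D U-Net whose encoder uses 3-D Transformer self-attention for global spatial-spectral context, paired with 3-D convolutions for fine local detail, with multi-scale skip connections. The following is a minimal PyTorch sketch of that general idea only, not the authors' UNFOLD implementation; all class names, layer sizes, the residual noise-prediction output, and the use of full (unwindowed) attention are illustrative assumptions.

# Minimal, hypothetical sketch of a 3-D Transformer + 3-D CNN U-Net-style
# denoiser, loosely following the ideas in the abstract above.
# Names and dimensions are illustrative, not the paper's implementation.
import torch
import torch.nn as nn


class Attn3DBlock(nn.Module):
    """Self-attention over all spatial-spectral positions (global context),
    followed by a 3-D convolution to recover fine-grained local detail."""

    def __init__(self, channels, heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.conv = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.GELU(),
        )

    def forward(self, x):                      # x: (B, C, bands, H, W)
        b, c, d, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))   # (B, D*H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        x = x + attn_out.transpose(1, 2).view(b, c, d, h, w)  # global branch
        return x + self.conv(x)                                # local branch


class HybridDenoiser3D(nn.Module):
    """Two-scale 3-D U-Net: attention+conv encoder, conv decoder, skip link."""

    def __init__(self, base=16):
        super().__init__()
        self.stem = nn.Conv3d(1, base, kernel_size=3, padding=1)
        self.enc1 = Attn3DBlock(base)
        self.down = nn.Conv3d(base, base * 2, kernel_size=2, stride=2)
        self.enc2 = Attn3DBlock(base * 2)
        self.up = nn.ConvTranspose3d(base * 2, base, kernel_size=2, stride=2)
        self.dec = nn.Sequential(
            nn.Conv3d(base * 2, base, kernel_size=3, padding=1), nn.GELU(),
            nn.Conv3d(base, 1, kernel_size=3, padding=1),
        )

    def forward(self, x):            # x: (B, 1, bands, H, W), noisy HSI cube
        s1 = self.enc1(self.stem(x))
        s2 = self.enc2(self.down(s1))
        up = self.up(s2)
        noise = self.dec(torch.cat([up, s1], dim=1))  # skip connection
        return x - noise             # sketch choice: predict and subtract noise


if __name__ == "__main__":
    noisy = torch.randn(1, 1, 8, 16, 16)       # toy cube: 8 bands, 16x16 pixels
    denoised = HybridDenoiser3D()(noisy)
    print(denoised.shape)                       # torch.Size([1, 1, 8, 16, 16])

On real HSI cubes, full self-attention over every voxel is memory-prohibitive; windowed or grouped attention is the usual workaround, and the published article should be consulted for the actual UNFOLD design.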
Appears in Collections:Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
