Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/17513
Full metadata record
DC Field | Value | Language
dc.contributor.author | Bansal, Shubhi | en_US
dc.contributor.author | Gowda, Kushaan | en_US
dc.contributor.author | Kumar, Nagendra | en_US
dc.date.accessioned | 2025-12-25T10:56:43Z | -
dc.date.available | 2025-12-25T10:56:43Z | -
dc.date.issued | 2025 | -
dc.identifier.citation | Gao, X., Bansal, S., Gowda, K., Li, Z., Nayak, S., Kumar, N., & Coler, M. (2025). AMuSeD: An Attentive Deep Neural Network for Multimodal Sarcasm Detection Incorporating Bi-modal Data Augmentation. IEEE Transactions on Affective Computing. Scopus. https://doi.org/10.1109/TAFFC.2025.3639406 | en_US
dc.identifier.issn | 1949-3045 | -
dc.identifier.other | EID(2-s2.0-105024449767) | -
dc.identifier.uri | https://dx.doi.org/10.1109/TAFFC.2025.3639406 | -
dc.identifier.uri | https://dspace.iiti.ac.in:8080/jspui/handle/123456789/17513 | -
dc.description.abstract | Detecting sarcasm effectively requires a nuanced understanding of context, including vocal tones and facial expressions. The progression towards multimodal computational methods in sarcasm detection, however, faces challenges due to the scarcity of data. To address this, we present AMuSeD (Attentive deep neural network for MUltimodal Sarcasm dEtection incorporating bi-modal Data augmentation). This approach utilizes the Multimodal Sarcasm Detection Dataset (MUStARD) and introduces a two-phase bimodal data augmentation strategy. The first phase involves generating varied text samples through Back-Translation from several secondary languages. The second phase involves the refinement of a FastSpeech2-based speech synthesis system, tailored specifically for sarcasm to retain sarcastic intonations. Alongside a cloud-based Text-to-Speech (TTS) service, this Fine-tuned FastSpeech2 system produces corresponding audio for the text augmentations. We also evaluate various attention mechanisms for selectively enhancing sarcasm-relevant features, finding self-attention to be the most efficient. Our experiments reveal that the proposed approach achieves a significant F1-score of 81.0% in text-audio modalities, surpassing even models that use three modalities from the MUStARD dataset. © 2010-2012 IEEE. | en_US
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US
dc.source | IEEE Transactions on Affective Computing | en_US
dc.subject | attention mechanisms | en_US
dc.subject | data augmentation | en_US
dc.subject | multimodality | en_US
dc.subject | Sarcasm detection | en_US
dc.title | AMuSeD: An Attentive Deep Neural Network for Multimodal Sarcasm Detection Incorporating Bi-modal Data Augmentation | en_US
dc.type | Journal Article | en_US
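The abstract above describes two ideas concrete enough to sketch: text augmentation via Back-Translation, and self-attention over the text-audio features. As a rough illustration of the first, here is a minimal back-translation sketch using the Hugging Face transformers pipeline; the paper does not name its translation systems or pivot languages, so the Helsinki-NLP MarianMT models and the English-German pivot below are assumptions, not the authors' setup.

```python
from transformers import pipeline

# Assumed models; the record does not specify which translators were used.
# Round-trip through German (en -> de -> en) yields a paraphrased variant.
to_de = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")
to_en = pipeline("translation_de_to_en", model="Helsinki-NLP/opus-mt-de-en")

def back_translate(text: str) -> str:
    """Return a paraphrase of `text` obtained by round-trip translation."""
    german = to_de(text)[0]["translation_text"]
    return to_en(german)[0]["translation_text"]

print(back_translate("Oh great, another meeting. Just what I needed."))
```

Repeating this round trip through several secondary languages, as the abstract describes, multiplies each utterance into lexically varied but semantically close text samples.

For the fusion side, the abstract reports that self-attention over sarcasm-relevant features worked best. A minimal sketch of that idea, assuming 256-dimensional text and audio embeddings and a length-2 modality sequence (the true feature sizes and fusion details are not given in this record), could apply standard multi-head self-attention:

```python
import torch
import torch.nn as nn

EMBED_DIM = 256  # assumed feature size; not specified in this record

attn = nn.MultiheadAttention(embed_dim=EMBED_DIM, num_heads=4, batch_first=True)
classifier = nn.Linear(EMBED_DIM, 2)  # sarcastic vs. non-sarcastic

text_emb = torch.randn(8, EMBED_DIM)   # stand-in text features (batch of 8)
audio_emb = torch.randn(8, EMBED_DIM)  # stand-in audio features

# Stack the two modalities as a sequence of length 2 and self-attend,
# letting each modality re-weight its features against the other.
x = torch.stack([text_emb, audio_emb], dim=1)  # (batch, 2, EMBED_DIM)
attended, _ = attn(x, x, x)                    # self-attention: q = k = v
logits = classifier(attended.mean(dim=1))      # pool over modalities, classify
```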
Appears in Collections: Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
