Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/17513
Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Bansal, Shubhi | en_US |
| dc.contributor.author | Gowda, Kushaan | en_US |
| dc.contributor.author | Kumar, Nagendra | en_US |
| dc.date.accessioned | 2025-12-25T10:56:43Z | - |
| dc.date.available | 2025-12-25T10:56:43Z | - |
| dc.date.issued | 2025 | - |
| dc.identifier.citation | Gao, X., Bansal, S., Gowda, K., Li, Z., Nayak, S., Kumar, N., & Coler, M. (2025). AMuSeD: An Attentive Deep Neural Network for Multimodal Sarcasm Detection Incorporating Bi-modal Data Augmentation. IEEE Transactions on Affective Computing. Scopus. https://doi.org/10.1109/TAFFC.2025.3639406 | en_US |
| dc.identifier.issn | 1949-3045 | - |
| dc.identifier.other | EID(2-s2.0-105024449767) | - |
| dc.identifier.uri | https://dx.doi.org/10.1109/TAFFC.2025.3639406 | - |
| dc.identifier.uri | https://dspace.iiti.ac.in:8080/jspui/handle/123456789/17513 | - |
| dc.description.abstract | Detecting sarcasm effectively requires a nuanced understanding of context, including vocal tones and facial expressions. The progression towards multimodal computational methods in sarcasm detection, however, faces challenges due to the scarcity of data. To address this, we present AMuSeD (Attentive deep neural network for MUltimodal Sarcasm dEtection incorporating bi-modal Data augmentation). This approach utilizes the Multimodal Sarcasm Detection Dataset (MUStARD) and introduces a two-phase bimodal data augmentation strategy. The first phase involves generating varied text samples through Back-Translation from several secondary languages. The second phase involves the refinement of a FastSpeech2-based speech synthesis system, tailored specifically for sarcasm to retain sarcastic intonations. Alongside a cloud-based Text-to-Speech (TTS) service, this Fine-tuned FastSpeech2 system produces corresponding audio for the text augmentations. We also evaluate various attention mechanisms for selectively enhancing sarcasm-relevant features, finding self-attention to be the most efficient. Our experiments reveal that the proposed approach achieves a significant F1-score of 81.0% in text-audio modalities, surpassing even models that use three modalities from the MUStARD dataset. © 2010-2012 IEEE. | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers Inc. | en_US |
| dc.source | IEEE Transactions on Affective Computing | en_US |
| dc.subject | attention mechanisms | en_US |
| dc.subject | data augmentation | en_US |
| dc.subject | multimodality | en_US |
| dc.subject | sarcasm detection | en_US |
| dc.subject | speech synthesis | en_US |
| dc.title | AMuSeD: An Attentive Deep Neural Network for Multimodal Sarcasm Detection Incorporating Bi-modal Data Augmentation | en_US |
| dc.type | Journal Article | en_US |
| Appears in Collections: | Department of Computer Science and Engineering | |
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.