Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/15048
Full metadata record
DC Field | Value | Language
dc.contributor.author | Gupta, Anup Kumar | en_US
dc.contributor.author | Dhamaniya, Ashutosh | en_US
dc.contributor.author | Gupta, Puneet | en_US
dc.date.accessioned | 2024-12-24T05:20:01Z | -
dc.date.available | 2024-12-24T05:20:01Z | -
dc.date.issued | 2024 | -
dc.identifier.citation | Gupta, A. K., Dhamaniya, A., & Gupta, P. (2024). RADIANCE: Reliable and interpretable depression detection from speech using transformer. Computers in Biology and Medicine. Scopus. https://doi.org/10.1016/j.compbiomed.2024.109325 | en_US
dc.identifier.issn | 0010-4825 | -
dc.identifier.other | EID(2-s2.0-85207802232) | -
dc.identifier.uri | https://doi.org/10.1016/j.compbiomed.2024.109325 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/15048 | -
dc.description.abstract | Depression is a common but severe mental disorder that adversely impacts the ability of an individual to function normally in their day-to-day life. A majority of depressed individuals remain undiagnosed due to factors such as social stigma and a shortage of healthcare professionals. Consequently, several Machine Learning and Deep Learning (DL) models based on speech have been proposed for automatic depression detection, with the latter generally outperforming the former. However, DL models are blackbox and offer no transparency. In contrast, healthcare professionals prefer models that provide interpretability besides being accurate. In this direction, we propose a method RADIANCE (Reliable AnD InterpretAble depressioN deteCtion transformErs). RADIANCE incorporates a novel FilterBank VIsion Transformer (FBViT) network, which provides the symptoms of depression as interpretable features. Additionally, we employ a novel loss function that handles the class imbalance issue in the datasets. It also incorporates a penalty term that addresses the hierarchy of misclassification errors. We also propose a reliability predictor based on low-level descriptors that provides a reliability score to indicate the trustworthiness of the prediction by FBViT. Furthermore, in contrast to the conventional averaging and majority pooling, RADIANCE consolidates predictions from multiple clips of the input audio by intricately weighing each prediction based on its reliability score, ensuring a more accurate overall prediction. RADIANCE outperforms the state-of-the-art depression detection methods, achieving an accuracy of 89.36%, 80.36%, and 94.44% over the DAIC-WOZ, E-DAIC, and CMDC datasets, respectively. Further, RADIANCE achieves MAE scores of 3.27 and 5.04 on the DAIC-WOZ and E-DAIC datasets, respectively. © 2024 Elsevier Ltd | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier Ltd | en_US
dc.source | Computers in Biology and Medicine | en_US
dc.subject | Audio modality | en_US
dc.subject | Depression detection | en_US
dc.subject | Interpretability | en_US
dc.subject | Trustworthiness | en_US
dc.subject | Vision transformer | en_US
dc.title | RADIANCE: Reliable and interpretable depression detection from speech using transformer | en_US
dc.type | Journal Article | en_US
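
The abstract contrasts averaging and majority pooling with a consolidation step that weighs each clip-level prediction by its reliability score. The abstract does not give the exact weighting formula, so the following is only a minimal sketch that assumes a normalized weighted average; the function names, the 0.5 decision threshold, and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mean_pool(probs: Sequence[float]) -> float:
    """Conventional averaging of clip-level depression probabilities."""
    if not probs:
        raise ValueError("probs must not be empty")
    return sum(probs) / len(probs)


def majority_pool(probs: Sequence[float], threshold: float = 0.5) -> float:
    """Majority vote: 1.0 if more than half of the clips exceed the threshold."""
    if not probs:
        raise ValueError("probs must not be empty")
    votes = sum(1 for p in probs if p >= threshold)
    return 1.0 if votes * 2 > len(probs) else 0.0


def reliability_weighted_pool(
    probs: Sequence[float], reliabilities: Sequence[float]
) -> float:
    """Assumed form: sum(r_i * p_i) / sum(r_i), with non-negative r_i."""
    if not probs or len(probs) != len(reliabilities):
        raise ValueError("probs and reliabilities must be non-empty and equal length")
    if any(r < 0 for r in reliabilities):
        raise ValueError("reliability scores must be non-negative")
    total = sum(reliabilities)
    if total == 0:
        # No clip is trusted at all; fall back to plain averaging.
        return mean_pool(probs)
    return sum(p * r for p, r in zip(probs, reliabilities)) / total


# Hypothetical clip-level outputs: one confident, reliable clip and two
# unreliable clips that disagree with it.
clip_probs = [0.9, 0.2, 0.3]
clip_reliabilities = [0.9, 0.1, 0.2]

pooled_mean = mean_pool(clip_probs)
pooled_vote = majority_pool(clip_probs)
pooled_weighted = reliability_weighted_pool(clip_probs, clip_reliabilities)
```

In this made-up case the reliable clip dominates the weighted result, while plain averaging and majority voting are both pulled toward the two low-reliability clips.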
Appears in Collections:Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
