Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/13157
Full metadata record
DC Field | Value | Language
dc.contributor.author | Pachori, Ram Bilas | en_US
dc.date.accessioned | 2024-01-31T10:50:36Z | -
dc.date.available | 2024-01-31T10:50:36Z | -
dc.date.issued | 2024 | -
dc.identifier.citation | Radha, K., Bansal, M., & Pachori, R. B. (2024). Speech and speaker recognition using raw waveform modeling for adult and children’s speech: A comprehensive review. Engineering Applications of Artificial Intelligence. Scopus. https://doi.org/10.1016/j.engappai.2023.107661 | en_US
dc.identifier.issn | 0952-1976 | -
dc.identifier.other | EID(2-s2.0-85181676399) | -
dc.identifier.uri | https://doi.org/10.1016/j.engappai.2023.107661 | -
dc.identifier.uri | https://dspace.iiti.ac.in/handle/123456789/13157 | -
dc.description.abstract | Conventionally, the extraction of hand-crafted acoustic features has been separated from the task of establishing robust machine-learning models in speech processing. The manual approach of feature engineering is both time-consuming and necessitates specialist knowledge, posing significant hindrances. Moreover, the resulting features may not be ideal for the desired application. The speech community has adopted raw waveform modeling to enhance performance. These techniques learn an optimized representation of the input automatically. With deep learning (DL) advancements, raw waveform modeling has become valuable for tasks like classification and prediction. The primary aim of this survey is to offer valuable insights and fill a gap in the existing literature by providing a comprehensive review of the state-of-the-art in speech and speaker recognition using raw waveform modeling for both adult and children's speech. The article covers papers from 2013–2023 and is the first to review both adult and children's databases. The article focuses on the advantages of raw waveform models. It presents essential concepts and techniques while discussing the challenges and limitations of using raw waveforms for speech and speaker recognition in both adult and children's speech. The article also evaluates recent progress in DL architectures such as SincNet, ResNet, and RawNet, and outlines future research directions in the field. © 2023 | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier Ltd | en_US
dc.source | Engineering Applications of Artificial Intelligence | en_US
dc.subject | Adult speech | en_US
dc.subject | Automatic speaker recognition | en_US
dc.subject | Automatic speech recognition | en_US
dc.subject | Children's speech | en_US
dc.subject | Deep learning architectures | en_US
dc.subject | Raw waveforms modeling | en_US
dc.title | Speech and speaker recognition using raw waveform modeling for adult and children's speech: A comprehensive review | en_US
dc.type | Short Survey | en_US
Appears in Collections: Department of Electrical Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
