Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/13157
Title: | Speech and speaker recognition using raw waveform modeling for adult and children's speech: A comprehensive review |
Authors: | Pachori, Ram Bilas |
Keywords: | Adult speech;Automatic speaker recognition;Automatic speech recognition;Children's speech;Deep learning architectures;Raw waveforms modeling |
Issue Date: | 2024 |
Publisher: | Elsevier Ltd |
Citation: | Radha, K., Bansal, M., & Pachori, R. B. (2024). Speech and speaker recognition using raw waveform modeling for adult and children’s speech: A comprehensive review. Engineering Applications of Artificial Intelligence. Scopus. https://doi.org/10.1016/j.engappai.2023.107661 |
Abstract: | Conventionally, the extraction of hand-crafted acoustic features has been separated from the task of establishing robust machine-learning models in speech processing. The manual approach of feature engineering is both time-consuming and necessitates specialist knowledge, posing significant hindrances. Moreover, the resulting features may not be ideal for the desired application. The speech community has adopted raw waveform modeling to enhance performance. These techniques learn an optimized representation of the input automatically. With deep learning (DL) advancements, raw waveform modeling has become valuable for tasks like classification and prediction. The primary aim of this survey is to offer valuable insights and fill a gap in the existing literature by providing a comprehensive review of the state-of-the-art in speech and speaker recognition using raw waveform modeling for both adult and children's speech. The article covers papers from 2013–2023 and is the first to review both adult and children's databases. The article focuses on the advantages of raw waveform models. It presents essential concepts and techniques while discussing the challenges and limitations of using raw waveforms for speech and speaker recognition in both adult and children's speech. The article also evaluates recent progress in DL architectures such as SincNet, ResNet, and RawNet, and outlines future research directions in the field. © 2023 |
URI: | https://doi.org/10.1016/j.engappai.2023.107661 https://dspace.iiti.ac.in/handle/123456789/13157 |
ISSN: | 0952-1976 |
Type of Material: | Short Survey |
Appears in Collections: | Department of Electrical Engineering |
Files in This Item:
There are no files associated with this item.