Time-order representation based method for epoch detection from speech signals

Jain, Pooja; Pachori, Ram Bilas

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/6154

Full metadata record

DC Field	Value	Language
dc.contributor.author	Jain, Pooja	en_US
dc.contributor.author	Pachori, Ram Bilas	en_US
dc.date.accessioned	2022-03-17T01:00:00Z	-
dc.date.accessioned	2022-03-17T15:46:46Z	-
dc.date.available	2022-03-17T01:00:00Z	-
dc.date.available	2022-03-17T15:46:46Z	-
dc.date.issued	2012	-
dc.identifier.citation	Jain, P., & Pachori, R. B. (2012). Time-order representation based method for epoch detection from speech signals. Journal of Intelligent Systems, 21(1), 79-95. doi:10.1515/jisys-2012-0003	en_US
dc.identifier.issn	0334-1860	-
dc.identifier.other	EID(2-s2.0-84860119809)	-
dc.identifier.uri	https://doi.org/10.1515/jisys-2012-0003	-
dc.identifier.uri	https://dspace.iiti.ac.in/handle/123456789/6154	-
dc.description.abstract	Epochs present in voiced speech are defined as the time instants of significant excitation of the vocal tract system during the production of speech. The nonstationary nature of the excitation source and the vocal tract system makes accurate identification of epochs a difficult task. Most of the existing methods for epoch detection require prior knowledge of voiced regions and a rough estimate of the pitch frequency. In this paper, we propose a novel method that relies on a time-order representation (TOR) based on the short-time Fourier-Bessel (FB) series expansion, which can be applied to the entire speech signal to detect epochs without any prior information. The proposed method automatically detects voiced regions in the speech signal by computing the marginal energy density with respect to time in the low frequency range (LFR) from the energy distribution in the time-frequency plane. An estimate of the pitch frequency for each detected voiced region is then obtained by computing the marginal energy density with respect to frequency in the LFR from the energy distribution in the time-frequency plane. Epochs are located for each detected voiced region as peaks in the derivative of the low-pass filtered (LPF) signal corresponding to falling edges of peak negative cycles in the LPF signal synthesized from the TOR coefficients corresponding to the LFR. Experimental results obtained by the proposed method on speech signals taken from the CMU-Arctic database are found to be promising. The proposed method detects epochs with high accuracy and reliability. © de Gruyter 2012.	en_US
dc.language.iso	en	en_US
dc.source	Journal of Intelligent Systems	en_US
dc.subject	Energy density	en_US
dc.subject	Energy distributions	en_US
dc.subject	Excitation sources	en_US
dc.subject	Falling edge	en_US
dc.subject	Fourier	en_US
dc.subject	Fourier-Bessel series expansion	en_US
dc.subject	Low frequency range	en_US
dc.subject	Low-pass	en_US
dc.subject	Nonstationary	en_US
dc.subject	Pitch frequencies	en_US
dc.subject	Prior information	en_US
dc.subject	Prior knowledge	en_US
dc.subject	Rough estimation	en_US
dc.subject	Series expansion	en_US
dc.subject	Speech signals	en_US
dc.subject	Time-frequency planes	en_US
dc.subject	Time-order representation	en_US
dc.subject	Vocal-tracts	en_US
dc.subject	Voiced speech	en_US
dc.subject	Electric power distribution	en_US
dc.subject	Fourier series	en_US
dc.subject	Low pass filters	en_US
dc.subject	Signal detection	en_US
dc.subject	Speech recognition	en_US
dc.title	Time-order representation based method for epoch detection from speech signals	en_US
dc.type	Journal Article	en_US
Appears in Collections:	Department of Electrical Engineering

Files in This Item:

There are no files associated with this item.
