New methods based on variational mode decomposition for speech signal analysis

Upadhyay, Abhay

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/560

Title:	New methods based on variational mode decomposition for speech signal analysis
Authors:	Upadhyay, Abhay
Supervisors:	Pachori, Ram Bilas
Keywords:	Electrical Engineering
Issue Date:	8-Nov-2017
Publisher:	Department of Electrical Engineering, IIT Indore
Series/Report no.:	TH093
Abstract:	The field of speech signal processing covers the interpretation, analysis, and processing of speech signals. Its applications include speech signal analysis, speech synthesis, human-computer interaction (HCI), wireless communications, pathological voice identification, mobile phones, global positioning systems, speech-to-text conversion, natural language processing, and language translation. In this thesis, different methodologies have been developed for speech signal analysis, such as voiced and non-voiced detection, instantaneous fundamental frequency (IFF) determination, amplitude-modulated and frequency-modulated (AM-FM) signal model based speech analysis, and speech enhancement, employing the variational mode decomposition (VMD). In speech signal analysis, detecting the voiced and non-voiced regions of a speech signal is suitable for determining the vocal fold activity of the human speech production system. Moreover, the detection of voiced and non-voiced regions is also used in several speech signal processing applications such as multi-rate speech coders, language identification, and modeling of speech signals. In this thesis, the instantaneous voiced and non-voiced regions of speech signals are detected in both clean and noisy environments. The fundamental frequency components (FFCs) of speech signals are used to determine the voiced and non-voiced regions. To this end, the VMD is applied iteratively with suitable convergence criteria to separate the FFC from the speech signal. The envelope of the FFC is computed with the help of an analytical model based on the single degree of freedom (SDOF). An automatic threshold is computed from the envelope values to determine the voiced and non-voiced regions of the speech signal. 
Voiced speech signals are useful in speech signal analysis applications such as IFF detection, speech recognition, speaker recognition, and identification of the glottal closure instants (GCIs). In this work, the voiced speech signals are detected from the speech signals using a VMD based method. The detected voiced speech signals are filtered into the low frequency range (LFR) using a Fourier-Bessel (FB) series expansion based method. The LFR voiced speech signals are divided into smaller segments. Further, the FFC of each segment of the LFR voiced signals is obtained using the VMD method applied iteratively with suitable convergence criteria based on estimated center frequencies and distance metric values. The FFC of the voiced speech signal is then obtained by concatenating the FFCs extracted from all LFR voiced speech segments. Finally, the Hilbert transform (HT) is applied to the obtained FFCs of the voiced speech signals in order to estimate the IFFs. The estimation of the amplitude envelope (AE) and the instantaneous frequency (IF) functions from the formants of speech signals is useful in speech signal processing, for example in speech synthesis, bandwidth compression, encoding, and transmission. The multicomponent AM-FM signal model is useful for the analysis of the formants of speech signals. In this work, a VMD based discrete energy separation algorithm (DESA), termed VMD-DESA, is employed for analyzing speech signals based on the multicomponent AM-FM signal model. A three-component AM-FM sinusoidal signal and a speech signal (/ae/) are used for studying the effectiveness of the proposed VMD-DESA method. First, the monocomponent AM-FM signals are separated from a multicomponent AM-FM signal using the VMD method applied iteratively. Further, the AE and the IF functions of the monocomponent AM-FM signals are estimated by employing DESA. 
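The energy-separation step described above can be illustrated with a minimal sketch. It assumes the DESA-2 variant of the algorithm (the abstract does not say which DESA variant the thesis uses) and a monocomponent input, such as one mode already separated by VMD. The function names teager_energy and desa2 and the sampling-rate parameter fs are hypothetical, introduced only for this illustration; this is not the thesis implementation.

```python
import math


def teager_energy(x):
    """Teager-Kaiser energy operator on interior samples:
    psi[n] = x[n]^2 - x[n-1] * x[n+1].
    The returned list is two samples shorter than x."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def desa2(x, fs):
    """Estimate amplitude envelope and instantaneous frequency (Hz) of a
    monocomponent AM-FM signal with DESA-2 (assumed variant).
    Returns two lists of length len(x) - 4."""
    # Symmetric difference y[n] = x[n+1] - x[n-1]; y[k] aligns with x[k+1].
    y = [x[n + 1] - x[n - 1] for n in range(1, len(x) - 1)]
    psi_x = teager_energy(x)  # psi_x[k] aligns with x[k+1]
    psi_y = teager_energy(y)  # psi_y[k] aligns with x[k+2]
    amp, freq = [], []
    for k, py in enumerate(psi_y):
        px = psi_x[k + 1]  # same time index as psi_y[k]
        if px <= 0.0 or py <= 0.0:
            # Energy operator is not positive here; no valid estimate.
            amp.append(float("nan"))
            freq.append(float("nan"))
            continue
        # Clamp to the valid arccos domain to guard against rounding.
        c = max(-1.0, min(1.0, 1.0 - py / (2.0 * px)))
        omega = 0.5 * math.acos(c)  # radians per sample
        freq.append(omega * fs / (2.0 * math.pi))
        amp.append(2.0 * px / math.sqrt(py))
    return amp, freq
```

For a pure sinusoid with constant amplitude and a frequency below fs/4, these estimates recover the amplitude and frequency up to rounding error; for real speech modes they are approximations that are usually smoothed afterwards.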
The enhancement of noisy speech signals in terms of quality and intelligibility finds applications in hearing aids, mobile communications, speech recognition systems, etc. In this thesis, a new speech enhancement algorithm is developed based on a combination of the modified empirical mode decomposition (EMD) and VMD methods (mEMD-VMD). In the proposed method, the EMD method is applied to the speech signal to decompose it into intrinsic mode functions (IMFs). Then, the VMD method is applied to the sum of selected IMFs. The selection of IMFs is based on the Hurst exponent values. The mEMD-VMD approach is suitable for removing low-frequency as well as high-frequency noise from noisy speech signals to obtain enhanced speech signals. In general, the main aim of this thesis is to explore the VMD method for speech signal analysis. The Wiener filtering structure present in VMD and the proposed iterative use of the VMD method have provided better speech signal analysis than the existing methods in the literature for voiced and non-voiced speech detection, IFF determination, AM-FM model based speech analysis, and speech enhancement applications.
URI:	https://dspace.iiti.ac.in/handle/123456789/560
Type of Material:	Thesis_Ph.D
Appears in Collections:	Department of Electrical Engineering_ETD

Files in This Item:

File	Description	Size	Format
Th_93_Abhay_Upadhyay_1301102002_EE.pdf		9.23 MB	Adobe PDF
