 
 
Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/12148
| Title: | Exploring invertible architecture for speech enhancement | 
| Authors: | Singh, Mansi | 
| Supervisors: | Kanhangad, Vivek | 
| Keywords: | Electrical Engineering | 
| Issue Date: | 6-Jun-2023 | 
| Publisher: | Department of Electrical Engineering, IIT Indore | 
| Series/Report no.: | MT276 | 
| Abstract: | Speech enhancement plays a key role in many user-oriented audio applications, such as telecommunication, assistive hearing, and speech recognition. The task is to recover the clean speech component from a noisy signal. The forward process from clean speech to noisy speech is often well defined, whereas the inverse problem is ambiguous because several parameter sets can map to the same observation; that is, noise can combine with clean speech in different ways to give the same noisy speech signal. To address this uncertainty, it is necessary to determine the complete posterior distribution of the parameters given the measurement. Invertible neural networks (INNs) are particularly suitable for this purpose. INNs focus on learning the forward process, using additional latent output variables to retain crucial information that would otherwise be lost, while implicitly learning a model of the corresponding inverse process. Standard ResNet architectures have been made invertible, enabling the same model to be used for both generative and discriminative tasks. Invertible ResNets (i-ResNets) have been shown to perform competitively with state-of-the-art (SOTA) image classifiers and flow-based generative models. They also bridge the performance gap between generative and discriminative approaches. This work explores the possibility of leveraging i-ResNets for the speech enhancement task. It is the first study to investigate the applicability of i-ResNets to regression tasks in general, and to speech enhancement in particular, with promising results. Experiments on the VoiceBank-DEMAND dataset show that the performance is comparable to that of other related speech enhancement (SE) models. | 
| URI: | https://dspace.iiti.ac.in/handle/123456789/12148 | 
| Type of Material: | Thesis_M.Tech | 
| Appears in Collections: | Department of Electrical Engineering_ETD | 
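
The abstract describes residual networks that are made invertible so that one model can be run in both directions. As an illustration only, and not the thesis's model, the sketch below shows the general i-ResNet idea on a toy example: a block y = x + g(x) is invertible when the residual branch g is contractive (Lipschitz constant below 1), and the input can then be recovered by fixed-point iteration. The branch `g`, the constant `LIPSCHITZ_SCALE`, the iteration count, and the sample values are all assumptions chosen for this example.

```python
import math

# Assumed scale; any value in (0, 1) keeps the residual branch contractive.
LIPSCHITZ_SCALE = 0.5


def g(x):
    """Toy residual branch: elementwise scaled tanh.

    tanh is 1-Lipschitz, so this branch has Lipschitz constant 0.5.
    """
    return [LIPSCHITZ_SCALE * math.tanh(v) for v in x]


def forward(x):
    """Residual block: y = x + g(x)."""
    return [xi + gi for xi, gi in zip(x, g(x))]


def inverse(y, n_iter=60):
    """Recover x from y by iterating x <- y - g(x).

    Because g is contractive, the iteration converges to the unique x
    with x + g(x) = y; the error shrinks by about half per step here.
    """
    x = list(y)  # start the iteration from the observation itself
    for _ in range(n_iter):
        x = [yi - gi for yi, gi in zip(y, g(x))]
    return x


# Hypothetical sample values standing in for a short signal frame.
clean = [0.3, -1.2, 2.0]
observed = forward(clean)
recovered = inverse(observed)
max_error = max(abs(a - b) for a, b in zip(clean, recovered))
```

In this toy setting `recovered` matches `clean` to within floating-point precision. A trained network would replace `g` with learned layers whose Lipschitz constant is constrained during training.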
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| MT_276_Mansi_Singh_2102102006.pdf | | 2.78 MB | Adobe PDF | View/Open | 
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
