Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/3026
Title: Adversarial attack on visual speech recognition models
Authors: Gupta, Anup Kumar
Supervisors: Gupta, Puneet
Keywords: Computer Science and Engineering
Issue Date: 13-Aug-2021
Publisher: Department of Computer Science and Engineering, IIT Indore
Series/Report no.: MSR004
Abstract: Visual speech recognition, or lip reading, is essential for understanding speech in several real-world applications such as surveillance systems and aiding the differently-abled. This has driven extensive research on Automatic Lip Reading (ALR) models, especially in the realm of Deep Learning (DL). Despite their immense success, the applicability of DL models remains restricted as they tend to be vulnerable to adversarial attacks. The study of these attacks opens new research avenues in the development of resilient DL systems and helps us prevent such attacks. Studying the generation of adversarial examples also facilitates the exploration of explainable artificial intelligence. However, the existing attacks on vision models such as image and video classifiers cannot be directly employed on ALR models. ALR models offer an inherent defence against most adversarial attacks because they encompass temporal information and the complex network architectures of sequence modelling, and hence attacking them is more challenging and strenuous than attacking image classification models. Furthermore, the region of interest in ALR models is smaller than in other video classification tasks, leading to perceivable perturbations. Despite these obstructions, our proposed method, FooLing Automatic lip Reading modEls (FLARE), successfully performs adversarial attacks on state-of-the-art ALR models. To the best of our knowledge, we are the first to successfully deceive ALR models for the task of word recognition. We further demonstrate that the efficiency of the proposed attack increases significantly when we incorporate logits instead of probabilities in the loss function. Our comprehensive experiments on a publicly available dataset show that the proposed attack successfully bypasses well-known transformation-based defences.
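Note: The abstract's observation that logits outperform probabilities in the attack loss is consistent with the known saturation of the softmax. The sketch below is a minimal, hypothetical PyTorch illustration of the two kinds of objective for a word-classification ALR model; the function names (probability_loss, logit_margin_loss) and the margin parameter kappa are illustrative assumptions and do not reproduce the actual FLARE loss.

    import torch
    import torch.nn.functional as F

    def probability_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Cross-entropy on softmax probabilities: once the model is confident,
        # the softmax saturates and input gradients become tiny, slowing the attack.
        return F.cross_entropy(logits, target)

    def logit_margin_loss(logits: torch.Tensor, target: torch.Tensor, kappa: float = 0.0) -> torch.Tensor:
        # Carlini-Wagner style margin computed directly on the logits.
        # Logit of the adversarial target word for each example in the batch.
        target_logit = logits.gather(1, target.unsqueeze(1)).squeeze(1)
        # Highest logit among all other words (the current runner-up).
        masked = logits.clone()
        masked.scatter_(1, target.unsqueeze(1), float("-inf"))
        runner_up = masked.max(dim=1).values
        # Minimising this pushes the target logit above every other logit by at
        # least kappa, without passing gradients through the saturating softmax.
        return torch.clamp(runner_up - target_logit + kappa, min=0).mean()

A hypothetical call such as loss = logit_margin_loss(model(video), target_word) would then be minimised over the adversarial perturbation.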
URI: https://dspace.iiti.ac.in/handle/123456789/3026
Type of Material: Thesis_MS Research
Appears in Collections:Department of Computer Science and Engineering_ETD

Files in This Item:
File: MSR004_Anup_Kumar_Gupta_1904101004.pdf (4.3 MB, Adobe PDF)

