Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/10613
Title: | TRIESTE: translation based defense for text classifiers |
Authors: | Gupta, Anup Kumar; Paliwal, Vardhan; Rastogi, Aryan; Gupta, Puneet |
Keywords: | Network security;Sentiment analysis;Translation (languages);Adversarial attack;Adversarial defense;Language processing;Natural language processing;Natural languages;Source language;State of the art;Text classifiers;Transformer;Translation;Classification (of information) |
Issue Date: | 2022 |
Publisher: | Springer Science and Business Media Deutschland GmbH |
Citation: | Gupta, A. K., Paliwal, V., Rastogi, A., & Gupta, P. (2022). TRIESTE: Translation based defense for text classifiers. Journal of Ambient Intelligence and Humanized Computing. https://doi.org/10.1007/s12652-022-03859-0 |
Abstract: | The field of natural language processing (NLP) has significantly evolved with the advent of state-of-the-art models. The discovery of these models has entirely revolutionised how NLP tasks such as machine translation, sentiment analysis and many others are performed. However, despite their high efficacy and meticulous performance, these models are prone to adversarial attacks. Adversarial attacks involve the introduction of perturbations imperceptible to humans, which can severely impact the model’s learning and prediction accuracy. Current defenses on text data include approaches such as spell-checking and adversarial training, which have their limitations against state-of-the-art adversarial attacks. This paper puts forward an effective transformation-based defense, TRIESTE (TRanslatIon basEd defenSe for Text classifiErs). The proposed defense overcomes the shortcomings of existing defenses by translating the input text from the source language to a target language and again back to the source language before providing it to the text classifier. Translation ensures that the sentiment of the translated text is similar to that of the input text by taking the entire text into consideration, which leads to the removal of adversarial perturbations. Rigorous evaluation on publicly available datasets showcases that TRIESTE is successful against state-of-the-art attacks without a significant drop in the classifier accuracy. © 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature. |
URI: | https://doi.org/10.1007/s12652-022-03859-0 https://dspace.iiti.ac.in/handle/123456789/10613 |
ISSN: | 1868-5137 |
Type of Material: | Journal Article |
Appears in Collections: | Department of Computer Science and Engineering |
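Illustrative sketch (not part of the published record): the abstract describes translating the input from the source language to a target language and back before classification. The Python below shows only that control flow, under loud assumptions. The names `make_defended_classifier`, `toy_to_target`, `toy_to_source`, and `toy_classify` are hypothetical, and the dictionary lookup with fuzzy matching is a toy stand-in for a real machine translation model. It is not the authors' implementation.

```python
from difflib import get_close_matches
from typing import Callable

# Hypothetical type aliases for illustration only.
Translator = Callable[[str], str]
Classifier = Callable[[str], str]


def make_defended_classifier(
    to_target: Translator, to_source: Translator, classify: Classifier
) -> Classifier:
    """Wrap a classifier so every input is round-trip translated first."""

    def defended(text: str) -> str:
        pivot = to_target(text)        # source language -> target language
        restored = to_source(pivot)    # target language -> source language
        return classify(restored)      # classify the round-tripped text

    return defended


# Toy stand-ins (assumptions): a tiny English/French word table.
EN_FR = {"the": "le", "movie": "film", "was": "etait",
         "awful": "affreux", "great": "excellent"}
FR_EN = {fr: en for en, fr in EN_FR.items()}


def toy_to_target(text: str) -> str:
    """Map each word to the closest known English word, then to French."""
    out = []
    for word in text.lower().split():
        match = get_close_matches(word, list(EN_FR), n=1, cutoff=0.7)
        out.append(EN_FR[match[0]] if match else word)
    return " ".join(out)


def toy_to_source(text: str) -> str:
    """Map French words back to English; unknown words pass through."""
    return " ".join(FR_EN.get(word, word) for word in text.split())


def toy_classify(text: str) -> str:
    """Keyword classifier that a character-level perturbation can fool."""
    return "negative" if "awful" in text.lower().split() else "positive"


defended = make_defended_classifier(toy_to_target, toy_to_source, toy_classify)
perturbed = "the movie was awfu1"      # 'l' replaced by '1'
print(toy_classify(perturbed))         # undefended prediction
print(defended(perturbed))             # prediction after round-trip translation
```

In this toy setup the round trip restores the perturbed word before the classifier sees it. A real translation model would do this from sentence context rather than by string similarity.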
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.