Malware detection and classification using transformer-based learning

Nassar, Fyse

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/3125

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Hubballi, Neminath	-
dc.contributor.author	Nassar, Fyse	-
dc.date.accessioned	2021-10-27T07:14:33Z	-
dc.date.available	2021-10-27T07:14:33Z	-
dc.date.issued	2021-10-22	-
dc.identifier.uri	https://dspace.iiti.ac.in/handle/123456789/3125	-
dc.description.abstract	Malware detection and classification assumes significance owing to the rapid propagation and proliferation of new variants. Grouping these variants into families with similar traits allows us to develop mitigation techniques that work for an entire family. Machine learning algorithms are used extensively for malware detection and classification tasks, with selected features taken from executable files. Although these methods have shown good performance, the choice of features constrains their ability to detect novel malware samples. To alleviate this limitation, deep learning methods have recently been used to automate the feature engineering task by extracting hidden semantic relationships between elements (raw bytes, opcodes, API calls, etc.) of the file. In this thesis, we describe Transformer-based malware detection and family identification methods. Our first proposed method uses static analysis to extract opcode sequences from executable files, which are used to train a Transformer-based model for Windows malware detection and classification. We show that our proposed method can perform malware detection using only a short sequence of opcodes taken from the portable executable files. More sophisticated malware uses code obfuscation techniques or memory-based attacks to avoid detection. Such memory-based malware resides in RAM to carry out its attacks. To detect obfuscated malware, we use memory dumps obtained through dynamic analysis. We represent these memory dumps as images that are used directly as input to a Transformer-based model. We also compare the detection and classification performance of the Transformer-based model with a few conventional machine learning models. Our extensive experiments with different datasets demonstrate that our proposed techniques achieve better classification performance than recent methods. Malware is a threat not restricted to a single operating system.
Android-based malware attacks have grown rapidly owing to the widespread use of mobile devices. Hence, we extend the previously mentioned technique, which uses short sequences of opcodes, to distinguish benign from malicious Android applications.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering, IIT Indore	en_US
dc.relation.ispartofseries	MSR016	-
dc.subject	Computer Science and Engineering	en_US
dc.title	Malware detection and classification using transformer-based learning	en_US
dc.type	Thesis_MS Research	en_US
Appears in Collections:	Department of Computer Science and Engineering_ETD

Files in This Item:

File	Description	Size	Format
MSR016_Fyse_Nassar_1904101009.pdf		4.01 MB	Adobe PDF
