Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/16533
Title: iMAppGAN: Integrated Motion Appearance Generative Adversarial Networks for Video Anomaly Detection
Authors: Singh, Rituraj K.
Sethi, Anikeit
Bhadada, Mitika
Gautam, Hritika
Saini, Krishanu
Tiwari, Aruna
Keywords: Adversarial learning;Generative adversarial networks (GAN);One-class classification (OCC);Video anomaly detection
Issue Date: 2025
Publisher: Springer Science and Business Media Deutschland GmbH
Citation: Singh, R., Sethi, A., Bhadada, M., Gautam, H., Saini, K., Tiwari, A., Saurav, S., & Singh, S. (2025). iMAppGAN: Integrated Motion Appearance Generative Adversarial Networks for Video Anomaly Detection. In Communications in Computer and Information Science: Vol. 2290 CCIS. https://doi.org/10.1007/978-981-96-6972-1_27
Abstract: The identification of anomalies in videos primarily focuses on detecting rare or inappropriate occurrences within specific contexts. Existing state-of-the-art models often rely on either future frame prediction or reconstruction approaches. However, these methods exhibit limitations, such as inadequate detection caused by little variation between the generated normal and anomalous video frames. To combine the strengths of both future frame prediction and reconstruction models, we propose the Integrated Motion-Appearance Generative Adversarial Networks (iMAppGAN). It is an end-to-end network that first predicts future frames and then reconstructs each predicted frame. Predicting future frames facilitates the detection of abnormal events by emphasizing large reconstruction errors. This is achieved by integrating an autoencoder block and a UNET block within the generator, so that normal video frames are generated faithfully while anomalous frames are distorted. The autoencoder block features a dual-stream encoder that extracts both motion and appearance features, while the UNET block reconstructs the predicted future frames received from the autoencoder. The result is a robust solution for detecting anomalies in video surveillance that overcomes the drawbacks of both the prediction and reconstruction methods and enhances overall performance. Experimental results show that iMAppGAN outperforms existing state-of-the-art models, achieving AUC scores of 97.9% on UCSD Ped2, 90.8% on CUHK Avenue, and 75.3% on the ShanghaiTech dataset. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
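The pipeline described in the abstract (a dual-stream motion-appearance encoder, a decoder that predicts the future frame, a UNET block that reconstructs the prediction, and anomaly scoring by reconstruction error) can be illustrated with a minimal PyTorch sketch. All module names, layer sizes, and the MSE-based scoring rule below are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DualStreamEncoder(nn.Module):
    """Encodes appearance (last observed frame) and motion (frame differences)."""
    def __init__(self, channels=3, feat=64):
        super().__init__()
        self.appearance = nn.Sequential(
            nn.Conv2d(channels, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.motion = nn.Sequential(
            nn.Conv2d(channels, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, frames):                              # frames: (B, T, C, H, W)
        appearance = frames[:, -1]                          # last observed frame
        motion = (frames[:, 1:] - frames[:, :-1]).mean(dim=1)  # mean temporal difference
        return torch.cat([self.appearance(appearance), self.motion(motion)], dim=1)

class FuturePredictor(nn.Module):
    """Decodes fused motion-appearance features into a predicted future frame."""
    def __init__(self, channels=3, feat=64):
        super().__init__()
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(2 * feat, feat, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(feat, channels, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, z):
        return self.decode(z)

class TinyUNet(nn.Module):
    """Reconstructs the predicted frame; one skip connection stands in for a full UNET."""
    def __init__(self, channels=3, feat=32):
        super().__init__()
        self.down = nn.Sequential(nn.Conv2d(channels, feat, 3, stride=2, padding=1), nn.ReLU())
        self.up = nn.Sequential(nn.ConvTranspose2d(feat, channels, 4, stride=2, padding=1), nn.Tanh())

    def forward(self, x):
        return self.up(self.down(x)) + x                    # skip connection

def anomaly_score(reconstruction, target):
    """Higher per-frame error suggests an anomalous event (assumed scoring rule)."""
    return F.mse_loss(reconstruction, target, reduction="none").mean(dim=(1, 2, 3))

if __name__ == "__main__":
    frames = torch.rand(2, 4, 3, 64, 64)                    # 4 observed frames per clip
    target = torch.rand(2, 3, 64, 64)                       # ground-truth future frame
    encoder, predictor, unet = DualStreamEncoder(), FuturePredictor(), TinyUNet()
    predicted = predictor(encoder(frames))                  # step 1: predict the future frame
    reconstructed = unet(predicted)                         # step 2: reconstruct the prediction
    print(anomaly_score(reconstructed, target))             # per-clip anomaly scores

At inference time, a frame whose score exceeds a threshold learned from normal training data would be flagged as anomalous; the adversarial (discriminator) training of the GAN is omitted from this sketch.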
URI: https://dx.doi.org/10.1007/978-981-96-6972-1_27
https://dspace.iiti.ac.in:8080/jspui/handle/123456789/16533
ISSN: 1865-0929
Type of Material: Conference Paper
Appears in Collections:Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.


