Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/6605
Title: Cost-Effective Video Summarization Using Deep CNN with Hierarchical Weighted Fusion for IoT Surveillance Networks
Authors: Tanveer, M.
Keywords: Automatic indexing;Cost effectiveness;Monitoring;Network security;Security systems;Video recording;Constrained resources;Effective solution;Internet of Things (IOT);Shot segmentation;State-of-the-art scheme;Surveillance networks;Surveillance video;Video summarization;Internet of things
Issue Date: 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Citation: Muhammad, K., Hussain, T., Tanveer, M., Sannino, G., & De Albuquerque, V. H. C. (2020). Cost-effective video summarization using deep CNN with hierarchical weighted fusion for IoT surveillance networks. IEEE Internet of Things Journal, 7(5), 4455-4463. doi:10.1109/JIOT.2019.2950469
Abstract: Video summarization (VS) has recently attracted intense attention due to its numerous applications in computer vision, such as video retrieval, indexing, and browsing. Traditional VS research mostly targets the effectiveness of VS algorithms by introducing high-quality features and clusters for selecting representative visual elements. With the increasing density of vision sensor networks, there is a tradeoff between the processing time of VS methods and the representative quality of the generated summaries. Generating a video summary of significant importance while meeting the constrained-resource requirements of Internet of Things (IoT) surveillance networks is a challenging task. This article addresses this problem by proposing a new, computationally effective solution: a deep CNN framework with hierarchical weighted fusion for summarizing surveillance videos captured in IoT settings. The first stage of the framework extracts discriminative, rich features from deep CNNs for shot segmentation. Second, image memorability predicted by a fine-tuned CNN model is employed, along with aesthetic and entropy features, to maintain the interestingness and diversity of the summary. Third, a hierarchical weighted fusion mechanism is proposed to produce an aggregated score from the extracted features. Finally, an attention curve is constructed from the aggregated score to select the outstanding keyframes for the final video summary. Experiments on benchmark data sets validate the importance and effectiveness of the framework, which outperforms other state-of-the-art schemes. © 2014 IEEE.
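Note: The selection stage described in the abstract (per-frame memorability, aesthetic, and entropy scores fused hierarchically into an aggregated score, smoothed into an attention curve, and peak-picked for keyframes) can be illustrated with a minimal sketch. The function names, fusion weights, smoothing window, and peak-picking rule below are illustrative assumptions and do not reproduce the weights, CNN models, or thresholds reported in the paper.

import numpy as np

def hierarchical_weighted_fusion(memorability, aesthetics, entropy,
                                 w_low=(0.5, 0.5), w_high=(0.6, 0.4)):
    """Fuse per-frame scores into one aggregated score.

    Level 1: combine aesthetics and entropy into a diversity score.
    Level 2: combine diversity with memorability into the final score.
    The weights are placeholders, not the paper's values.
    """
    diversity = w_low[0] * aesthetics + w_low[1] * entropy
    return w_high[0] * memorability + w_high[1] * diversity

def attention_curve(scores, window=5):
    """Smooth aggregated scores with a moving average to form an attention curve."""
    kernel = np.ones(window) / window
    return np.convolve(scores, kernel, mode="same")

def select_keyframes(curve, threshold=None):
    """Pick local maxima of the attention curve above a threshold as keyframes."""
    if threshold is None:
        threshold = curve.mean()
    keyframes = []
    for i in range(1, len(curve) - 1):
        if curve[i] >= curve[i - 1] and curve[i] >= curve[i + 1] and curve[i] > threshold:
            keyframes.append(i)
    return keyframes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_frames = 200
    # Placeholder per-frame scores; in the paper these come from a fine-tuned
    # CNN memorability predictor and aesthetic/entropy feature extraction.
    memorability = rng.random(n_frames)
    aesthetics = rng.random(n_frames)
    entropy = rng.random(n_frames)

    scores = hierarchical_weighted_fusion(memorability, aesthetics, entropy)
    curve = attention_curve(scores)
    print("Selected keyframe indices:", select_keyframes(curve))

Only the fusion-and-selection logic is sketched here; the shot segmentation and CNN feature extraction stages of the framework are not represented.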
URI: https://doi.org/10.1109/JIOT.2019.2950469
https://dspace.iiti.ac.in/handle/123456789/6605
ISSN: 2327-4662
Type of Material: Journal Article
Appears in Collections: Department of Mathematics

Files in This Item:
There are no files associated with this item.


