Please use this identifier to cite or link to this item:
https://dspace.iiti.ac.in/handle/123456789/502
Title: | Multi-person pose estimation |
Authors: | Datta, Parual |
Supervisors: | Varadarajan, Srenivas; Kanhangad, Vivek |
Keywords: | Electrical Engineering |
Issue Date: | 4-Jul-2017 |
Publisher: | Department of Electrical Engineering, IIT Indore |
Series/Report no.: | MT038 |
Abstract: | Human pose estimation is an important building block for several computer vision tasks, such as action recognition, human-object interaction recognition, and computing object affordances. A large body of research exists on single-person pose estimation. However, accurate multi-person pose estimation in crowded scenes remains challenging. This work proposes a novel multi-person pose estimation algorithm in which body parts are assigned sequentially, following the human kinematic chain from head to ankle. The body-part relationship graph is systematically sparsified to speed up the algorithm. To decrease the combinatorial complexity of our greedy part assignment algorithm, we reduce the number of part candidates extracted from the initial deep learning step by first estimating the number of individuals in the scene. Due to noise and motion blur, false detections may occur in regions that resemble body parts in appearance. A novel hallucination suppression step is proposed that removes these incorrect detections by considering their spatial relationship to previously discovered parts. We also propose a strategy to enhance the accuracy of the algorithm by spawning a new person cluster from any significant body part. The proposed algorithm achieves an overall precision of 72.2% on the MPII Multi-person dataset [6] and processes up to seven video frames per second on an Intel Core i7 CPU clocked at 1200 MHz with 64 GB RAM. Additionally, our pose algorithm is applied in two applications: (i) identifying groups of people in images and (ii) selecting a display surface through hand gestures for projecting content. For identifying groups of people, the human bounding box and face length derived from our pose algorithm are used to cluster people based on proximity and orientation.
In the latter application, we propose using a roof-mounted fisheye camera to perceive human interaction with multiple surfaces in a room and to select the surface on which content should be displayed. Finally, some future directions for extending this work are discussed. |
URI: | https://dspace.iiti.ac.in/handle/123456789/502 |
Type of Material: | Thesis_M.Tech |
Appears in Collections: | Department of Electrical Engineering_ETD |
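The abstract describes assigning body parts greedily along the kinematic chain after first estimating the number of people. The sketch below illustrates that general idea only; it is not the thesis implementation. The shortened chain `CHAIN`, the function `greedy_assign`, the candidate format `(x, y, score)`, and the distance gate `max_dist` are all assumptions made for illustration. The distance gate is a crude stand-in and does not reproduce the thesis's hallucination suppression step.

```python
import math

# Hypothetical, shortened kinematic chain running from head to ankle.
CHAIN = ["head", "neck", "hip", "knee", "ankle"]


def greedy_assign(candidates, num_people, max_dist=80.0):
    """Greedily attach part candidates to people, one chain link at a time.

    candidates: dict mapping part name -> list of (x, y, score) detections.
    num_people: estimated number of individuals (assumed to be given).
    max_dist:   assumed pixel limit between a part and its parent part.
    """
    # Seed one person per strongest head detection, capped at num_people.
    heads = sorted(candidates.get("head", []), key=lambda c: c[2], reverse=True)
    people = [{"head": (x, y)} for x, y, _ in heads[:num_people]]

    for parent, part in zip(CHAIN, CHAIN[1:]):
        # Collect every (distance, person, candidate) pair within the gate.
        pairs = []
        for p_idx, person in enumerate(people):
            if parent not in person:
                continue  # the chain is broken for this person
            px, py = person[parent]
            for c_idx, (x, y, _) in enumerate(candidates.get(part, [])):
                dist = math.hypot(x - px, y - py)
                if dist <= max_dist:
                    pairs.append((dist, p_idx, c_idx))

        # Greedy step: accept the closest pairs first, each side used once.
        pairs.sort()
        used_people, used_cands = set(), set()
        for _, p_idx, c_idx in pairs:
            if p_idx in used_people or c_idx in used_cands:
                continue
            x, y, _ = candidates[part][c_idx]
            people[p_idx][part] = (x, y)
            used_people.add(p_idx)
            used_cands.add(c_idx)

    return people


if __name__ == "__main__":
    detections = {
        "head": [(0, 0, 0.9), (100, 0, 0.8), (500, 500, 0.1)],
        "neck": [(0, 20, 0.7), (100, 20, 0.6), (300, 300, 0.5)],
    }
    for person in greedy_assign(detections, num_people=2):
        print(person)
```

In this toy input the weakest head detection is dropped because only two people are assumed, and the distant neck candidate stays unassigned because it lies outside the gate of both people.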
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
MT_38_ParualDatta_1502102005.pdf | | 2.63 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.