Multi-person pose estimation

Datta, Parual

Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/502

Title:	Multi-person pose estimation
Authors:	Datta, Parual
Supervisors:	Varadarajan, Srenivas; Kanhangad, Vivek
Keywords:	Electrical Engineering
Issue Date:	4-Jul-2017
Publisher:	Department of Electrical Engineering, IIT Indore
Series/Report no.:	MT038
Abstract:	Human pose estimation is an important building block for several computer vision tasks, such as action recognition, human-object interaction recognition, and computing object affordance. A large body of research exists on single-person pose estimation. However, accurate multi-person pose estimation in crowded scenes remains challenging. This work proposes a novel multi-person pose estimation algorithm in which body parts are assigned sequentially, following the human kinematic chain from head to ankle. The body-part relationship graph is systematically sparsified to speed up the algorithm. To decrease the combinatorial complexity of the greedy part assignment algorithm, we reduce the number of part candidates extracted from the initial deep learning step by first estimating the number of individuals in the scene. Due to noise and motion blur, misdetections may occur in regions that resemble body parts in appearance. A novel hallucination suppression step is proposed that removes these incorrect detections by considering their spatial relationship to previously discovered parts. We also propose a strategy to improve the accuracy of the algorithm by spawning a new person cluster from any significant body part. The proposed algorithm achieves an overall precision of 72.2% on the MPII Multi-person dataset [6] and processes up to seven video frames per second on an Intel Core i7 CPU clocked at 1200 MHz with 64 GB RAM. Additionally, the pose algorithm is applied in two applications: (i) identifying groups of people in images and (ii) selecting a display surface through hand gestures for projecting content. For identifying groups of people, the human bounding box and face length derived from the pose algorithm are used to cluster people based on proximity and orientation. In the latter application, we propose using a roof-mounted fisheye camera to perceive human interaction with numerous surfaces in a room and to select the surface on which content should be displayed. Finally, some future directions for extending this work are discussed in this thesis.
URI:	https://dspace.iiti.ac.in/handle/123456789/502
Type of Material:	Thesis_M.Tech
Appears in Collections:	Department of Electrical Engineering_ETD

Files in This Item:

File	Description	Size	Format
MT_38_ParualDatta_1502102005.pdf		2.63 MB	Adobe PDF
