International Journal of Advanced Research in Computer and Communication Engineering A monthly Peer-reviewed & Refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
IJARCCE adheres to the suggestive parameters outlined by the University Grants Commission (UGC) for peer-reviewed journals, upholding high standards of research quality, ethical publishing, and academic excellence.
Volume 13, Issue 4, April 2024

An Enhanced Method to Detect Hand Key-points in Single Images using Multiview Bootstrapping

Mohammad Hasan, Montasim Al Mamun, Abid Hasan

DOI: 10.17148/IJARCCE.2024.13477

Abstract: Hand key point detection is crucial for natural human-computer interaction. The task is challenging because of complex articulations, diverse viewpoints, self-similar parts, significant self-occlusions, and variations in hand shape and size. To address these challenges, the thesis makes several contributions. First, it introduces an approach that uses a multi-camera system to train precise detectors for key points that are prone to occlusion, such as hand joints. This method, termed multiview bootstrapping, begins with an initial key point detector that generates noisy labels across multiple views of the hand. These noisy detections are then either triangulated in 3D using multiview geometry or marked as outliers. The triangulated points, once reprojected into each view, serve as new labeled training data for refining the detector. The process repeats, yielding additional labeled data with each iteration. The thesis also presents an analytical derivation of the minimum number of views needed to achieve given true-positive and false-positive rates for a given detector. The method is then used to train a hand key point detector for single images. The resulting detector runs in real time on RGB images and achieves accuracy on par with methods that use depth sensors. Triangulating the single-view detector's output over multiple views enables markerless 3D hand motion capture, even during complex object interactions.

Keywords: Convolutional Neural Network, Key point detector, Density Network with a Single Gaussian Model, Mixture Density Network, Degree of Freedom.
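The labeling step of multiview bootstrapping described in the abstract (triangulate noisy per-view detections, reject outliers, reproject the 3D point as new labels) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say how outliers are identified, so the exhaustive search over view pairs below is an assumed stand-in, and the names `project`, `solve3`, `triangulate`, `label_point`, `make_cam`, `tol`, and `min_views`, along with the toy camera setup, are hypothetical.

```python
import math
from itertools import combinations

def project(P, X):
    """Project 3D point X through a 3x4 camera matrix P to a pixel (u, v)."""
    Xh = (X[0], X[1], X[2], 1.0)
    x, y, w = (sum(p * c for p, c in zip(row, Xh)) for row in P)
    return (x / w, y / w)

def solve3(M, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    A = [list(M[i]) + [b[i]] for i in range(3)]
    for c in range(3):
        piv = max(range(c, 3), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(3):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * p for a, p in zip(A[r], A[c])]
    return [A[i][3] / A[i][i] for i in range(3)]

def triangulate(cams, pts):
    """Linear least-squares triangulation of one point from two or more views."""
    rows, rhs = [], []
    for P, (u, v) in zip(cams, pts):
        # Each view gives two equations: (u * P[2] - P[0]) . [X, 1] = 0, same for v.
        for coord, row in ((u, P[0]), (v, P[1])):
            r = [coord * P[2][k] - row[k] for k in range(4)]
            rows.append(r[:3])
            rhs.append(-r[3])
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * t for r, t in zip(rows, rhs)) for i in range(3)]
    return solve3(AtA, Atb)

def label_point(cams, dets, tol=2.0, min_views=3):
    """Return reprojected labels for every view, or None if too few views agree.

    Assumption: outliers are found by trying every pair of views and keeping
    the pair whose triangulation agrees with the most detections.
    """
    best = []
    for i, j in combinations(range(len(cams)), 2):
        X = triangulate([cams[i], cams[j]], [dets[i], dets[j]])
        inliers = []
        for k in range(len(cams)):
            u, v = project(cams[k], X)
            if math.hypot(u - dets[k][0], v - dets[k][1]) < tol:
                inliers.append(k)
        if len(inliers) > len(best):
            best = inliers
    if len(best) < min_views:
        return None  # treat the whole key point as unreliable
    X = triangulate([cams[k] for k in best], [dets[k] for k in best])
    return [project(P, X) for P in cams]  # new labels, including outlier views

def make_cam(tx, ty, f=500.0, cx=320.0, cy=240.0):
    """Toy camera K [I | t] with translation (tx, ty, 0); values are illustrative."""
    return [[f, 0.0, cx, f * tx], [0.0, f, cy, f * ty], [0.0, 0.0, 1.0, 0.0]]

cams = [make_cam(-0.2, 0.0), make_cam(0.0, 0.0), make_cam(0.2, 0.0), make_cam(0.0, 0.2)]
true_X = (0.05, -0.03, 1.0)
dets = [project(P, true_X) for P in cams]
dets[3] = (dets[3][0] + 40.0, dets[3][1] + 40.0)  # simulate one bad detection
labels = label_point(cams, dets)
```

In this toy setup the fourth view's detection is deliberately corrupted, yet the three consistent views recover the 3D point, and its reprojection supplies a corrected label for the fourth view. In a full bootstrapping loop, such labels would be added to the training set before the detector is retrained and the step repeated.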

How to Cite:

[1] Mohammad Hasan, Montasim Al Mamun, Abid Hasan, “An Enhanced Method to Detect Hand Key-points in Single Images using Multiview Bootstrapping,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2024.13477