Latent Regression Forest: Structured Estimation of 3D Hand Poses
Abstract: In this paper we present the latent regression forest (LRF), a novel framework for real-time 3D hand pose estimation from a single depth image. Prior discriminative methods often fall into two categories: holistic and patch-based. Holistic methods are efficient but less flexible due to their nearest-neighbour nature. Patch-based methods can generalise to unseen samples by considering local appearance only. However, they are complex because each pixel needs to be classified or regressed during testing. In contrast to these two baselines, our method can be considered a structured coarse-to-fine search, starting from the centre of mass of a point cloud and proceeding until all the skeletal joints are located. The search is guided by a learnt latent tree model which reflects the hierarchical topology of the hand. Our main contributions can be summarised as follows: (i) Learning the topology of the hand in an unsupervised, data-driven manner. (ii) A new forest-based, discriminative framework for structured search in images, as well as an error regression step to avoid error accumulation. (iii) A new multi-view hand pose dataset containing 180K annotated images from 10 different subjects. Our experiments on two datasets show that the LRF outperforms baselines and prior art in both accuracy and efficiency.
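The coarse-to-fine search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the learnt regression forest is replaced by a hypothetical fixed offset table (`TOY_OFFSETS`), the tree topology is hand-written rather than learnt, the error regression step is omitted, and all names (`LatentNode`, `coarse_to_fine_search`, `predict_offsets`) are assumptions made for illustration. The sketch only shows the control flow: start at the centre of mass of the point cloud, and at each internal node of the latent tree predict offsets to its children until the leaves (skeletal joints) are reached.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass
class LatentNode:
    """A node of a (hypothetical) latent tree; leaves stand for skeletal joints."""
    name: str
    children: List["LatentNode"] = field(default_factory=list)


def centre_of_mass(points: Sequence[Point]) -> Point:
    """Mean of a non-empty 3D point cloud."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def coarse_to_fine_search(
    root: LatentNode,
    points: Sequence[Point],
    predict_offsets: Callable[[LatentNode, Point, Sequence[Point]], List[Point]],
) -> Dict[str, Point]:
    """Walk the tree from the centre of mass down to the leaves.

    `predict_offsets` is a stand-in for a learnt regressor: given an internal
    node and its current position, it returns one 3D offset per child.
    """
    joints: Dict[str, Point] = {}
    stack = [(root, centre_of_mass(points))]
    while stack:
        node, pos = stack.pop()
        if not node.children:
            joints[node.name] = pos  # leaf reached: a joint location
            continue
        offsets = predict_offsets(node, pos, points)
        for child, off in zip(node.children, offsets):
            child_pos = tuple(p + o for p, o in zip(pos, off))
            stack.append((child, child_pos))
    return joints


# Toy stand-in for the learnt model (assumption): fixed offsets per internal node.
TOY_OFFSETS = {
    "hand": [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "fingers": [(0.0, 1.0, 0.0), (0.0, 2.0, 0.0)],
}


def toy_predict_offsets(node, pos, points):
    return TOY_OFFSETS[node.name]


toy_tree = LatentNode("hand", [
    LatentNode("thumb"),
    LatentNode("fingers", [LatentNode("index"), LatentNode("middle")]),
])
toy_cloud = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
toy_joints = coarse_to_fine_search(toy_tree, toy_cloud, toy_predict_offsets)
```

In this sketch only one regressor call is made per internal tree node, rather than one per pixel, which is the kind of saving the abstract contrasts with patch-based methods.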