Articulated Human Detection with Flexible Mixtures of Parts
Citations
Going deeper with convolutions
Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields
Stacked Hourglass Networks for Human Pose Estimation
OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields
References
Histograms of oriented gradients for human detection
The Pascal Visual Object Classes (VOC) Challenge
Object Detection with Discriminatively Trained Part-Based Models
LIBLINEAR: A Library for Large Linear Classification
Pictorial Structures for Object Recognition
Frequently Asked Questions (10)
Q2. How long does it take to process a typical benchmark image?
Their model requires roughly 1 second to process a typical benchmark image, allowing for the possibility of real-time performance with further speedups (such as cascaded [5] or parallelized implementations).
Q3. Why do the authors often resort to global mixture models to capture large appearance changes?
Because translating parts do not deform too much in practice, one often resorts to global mixture models to capture large appearance changes [4].
Q4. How can the authors increase the representational power of local part mixtures?
Their local part mixtures can be composed to generate an exponential number of global mixtures, greatly increasing their representational power without sacrificing computational efficiency.
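As a rough illustration of this counting argument (the function name and numbers below are illustrative, not from the paper): with K parts and T local mixture types per part, the model stores only K*T local templates, yet the number of distinct global mixtures it can express is T^K:

```python
def global_mixture_count(num_parts: int, num_types: int) -> int:
    """Number of distinct global mixtures expressible by composing
    independent local part-type choices: T**K, exponential in K."""
    return num_types ** num_parts

# e.g. a hypothetical 14-part model with 6 types per part stores
# 14 * 6 = 84 local templates but spans 6**14 global mixtures:
print(global_mixture_count(14, 6))  # 78364164096
```

This is why the authors can claim exponential representational power at the cost of only a linear number of local templates.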
Q5. Why do the authors use PCK to perform diagnostic experiments?
Because PCK is easier to interpret and faster to evaluate than APK, the authors use PCK to perform diagnostic experiments exploring different aspects of their model in the next section.
Q6. What is the simplest way to score a configuration of parts?
To score a configuration of parts, the authors first define a compatibility function for part types that factors into a sum of local and pairwise scores:

S(t) = \sum_{i \in V} b_i^{t_i} + \sum_{ij \in E} b_{ij}^{t_i, t_j}   (1)

The parameter b_i^{t_i} favors particular type assignments for part i, while the pairwise parameter b_{ij}^{t_i, t_j} favors particular co-occurrences of part types.
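Equation (1) can be sketched directly in code. This is a minimal illustration under assumed data structures (the names `b_unary`, `b_pair`, and `edges` are hypothetical, not from the paper): unary biases are indexed by part and type, pairwise biases by edge and the two incident types.

```python
def compatibility_score(types, b_unary, b_pair, edges):
    """Sketch of Eq. (1): S(t) = sum_i b_i^{t_i} + sum_{ij in E} b_{ij}^{t_i, t_j}.

    types[i]           -- mixture type assigned to part i
    b_unary[i][t]      -- local bias for part i taking type t
    b_pair[(i, j)][s][t] -- co-occurrence bias for edge (i, j) with types (s, t)
    """
    score = sum(b_unary[i][types[i]] for i in range(len(types)))
    score += sum(b_pair[(i, j)][types[i]][types[j]] for (i, j) in edges)
    return score

# Toy example: two parts with two types each, one edge between them.
b_unary = [[1.0, 0.0], [0.5, 2.0]]
b_pair = {(0, 1): [[0.0, 1.0], [2.0, 0.0]]}
print(compatibility_score([0, 1], b_unary, b_pair, [(0, 1)]))  # 1.0 + 2.0 + 1.0 = 4.0
```

In the full model this type-compatibility term is added to appearance and deformation scores, but the factorization above is what makes inference over type assignments tractable on a tree.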
Q7. What is the main reason for the differences in appearance of limbs?
Limbs vary greatly in appearance due to changes in clothing and body shape, as well as changes in viewpoint manifested in in-plane rotations and foreshortening.
Q8. How much does joint training of orientation-variant parts increase performance?
The authors find that joint training of orientation-variant parts increases performance by nearly a factor of 2, from 39 to 72 percent PCK.
Q9. What does the author find important about the latent updating of mixture labels?
The authors find that the latent updating of mixture labels is not crucial, that a star model definitely hurts performance, and that adding rotated copies of their training images increases performance by a small but noticeable amount.
Q10. How many positive examples are there on the training dataset?
On their training datasets, the number of positive examples varies from 200 to 1,000 and the number of negative images is roughly 1,000.