Understanding images of groups of people
References
Active shape models—their training and application
TextonBoost: joint appearance, shape and context modeling for multi-class object recognition and segmentation
Putting Objects in Perspective
A System for the Notation of Proxemic Behavior
Frequently Asked Questions (16)
Q2. How accurate are the age and gender classifiers?
For classifying age, their contextual features achieve more than double the accuracy of random chance (14.3%), and gender is correctly classified about two-thirds of the time.
Q3. What is the reason why a person of honor is placed closer to the center of the image?
Sometimes a person of honor (e.g. a grandparent) is placed closer to the center of the image as a result of social factors or norms.
Q4. How many evaluations of age and gender were collected?
A total of 13 subjects estimated age and gender for each of the 45 faces for each of the 3 stages, for a total of 1755 evaluations for age and gender.
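The total follows from multiplying the three factors (a quick check; variable names are ours):

```python
# Each of 13 human subjects labeled all 45 faces at each of 3 stages.
subjects, faces, stages = 13, 45, 3
total_evaluations = subjects * faces * stages
print(total_evaluations)  # 1755
```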
Q5. How many age categories are there in the face detection dataset?
The authors labeled each face as being in one of seven age categories: 0-2, 3-7, 8-12, 13-19, 20-36, 37-65, and 66+, roughly corresponding to different life stages.
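A minimal sketch of how such a labeling could be encoded, assuming integer ages; the bin edges follow the seven categories above, while the function name and the 150-year cap are our own:

```python
# Hedged sketch: map an integer age to one of the paper's seven category labels.
AGE_BINS = [(0, 2), (3, 7), (8, 12), (13, 19), (20, 36), (37, 65), (66, 150)]

def age_category(age: int) -> str:
    for lo, hi in AGE_BINS:
        if lo <= age <= hi:
            # The last, open-ended bin is reported as "66+".
            return f"{lo}-{hi}" if hi < 150 else "66+"
    raise ValueError(f"invalid age: {age}")

print(age_category(5))   # "3-7"
print(age_category(70))  # "66+"
```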
Q6. How many columns are learned for age and gender?
Each column of Wa is a vector learned by finding the projection that maximizes the ratio of interclass to intraclass variation (by linear discriminant analysis) for a pair of age categories; the seven age categories yield 21 pairs, and thus 21 columns for Wa.
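A hedged sketch of the pairwise construction on synthetic data: one Fisher projection per pair of the seven categories yields C(7,2) = 21 columns. The feature dimension, sample counts, and data here are made up; only the pairing scheme follows the text:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n_categories, dim = 7, 10
pairs = list(combinations(range(n_categories), 2))  # C(7,2) = 21 pairs

# Toy per-category feature samples (stand-ins for the contextual features).
samples = {c: rng.normal(loc=c, size=(50, dim)) for c in range(n_categories)}

def fisher_direction(Xa, Xb):
    """Projection maximizing interclass / intraclass variance for two classes."""
    Sw = np.cov(Xa.T) + np.cov(Xb.T)                    # within-class scatter
    w = np.linalg.solve(Sw, Xa.mean(0) - Xb.mean(0))    # Fisher direction
    return w / np.linalg.norm(w)

Wa = np.column_stack([fisher_direction(samples[a], samples[b]) for a, b in pairs])
print(Wa.shape)  # (10, 21): one column per category pair
```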
Q7. What horizon estimation error does the group shot face geometry achieve?
Using the group shot face geometry achieves a median horizon estimation error of 4.6%, improving on an error of 17.7% when the horizon is assumed to pass through the image center and 9.5% when the horizon estimate is the mean position over all other labeled images.
Q8. How do the authors solve the face horizon and camera height?
The face horizon yo and camera height Yc are solved using least squares by linearizing (7) and writing it in matrix form, stacking one row of coefficients [E_in e_in] per face into the coefficient matrix for the two unknowns.
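Under the assumption that each face contributes one linearized row [E_n, e_n] and a corresponding right-hand side, the least-squares solve can be sketched with synthetic coefficients (all numbers here are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)
Yc_true, yo_true = 1.6, 0.45                 # ground truth for the demo only
E = rng.uniform(0.5, 2.0, size=20)           # per-face coefficients (synthetic)
e = rng.uniform(0.5, 2.0, size=20)
b = E * Yc_true + e * yo_true + rng.normal(0, 1e-3, size=20)  # noisy RHS

A = np.column_stack([E, e])                  # one row [E_n, e_n] per face
(Yc, yo), *_ = np.linalg.lstsq(A, b, rcond=None)
print(round(Yc, 2), round(yo, 2))            # recovers ~1.6 and ~0.45
```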
Q9. How much does the age classification improve?
Exact age category recognition improves by 4.6%, and when the adjacent age category is also considered correct, the improvement is 6.8%.
Q10. What is the purpose of the fx?
The feature vector fx captures both the pairwise relationships between faces and a sense of the person’s position relative to the global structure of all people in the image.
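A hedged illustration of combining pairwise and global cues into a per-face feature vector; the specific features below (nearest-neighbor distance, offset from the group centroid) are our own stand-ins, not the paper's exact fx:

```python
import numpy as np

# Synthetic normalized face positions for four people in one image.
positions = np.array([[0.2, 0.5], [0.4, 0.5], [0.6, 0.55], [0.8, 0.5]])

def face_features(i, pos):
    others = np.delete(pos, i, axis=0)
    nn_dist = np.min(np.linalg.norm(others - pos[i], axis=1))  # pairwise cue
    centroid_offset = pos[i] - pos.mean(axis=0)                # global cue
    return np.concatenate([[nn_dist], centroid_offset])

fx = np.array([face_features(i, positions) for i in range(len(positions))])
print(fx.shape)  # (4, 3): one 3-dim feature vector per face
```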
Q11. How can the face size be estimated?
The model from (8) could also be used to estimate the size of a face in the face plane, but its objective function minimizes a quantity related to the camera and scene geometry and does not guarantee that the estimated face sizes in the image are optimal.
Q12. What search terms were used to collect the images?
The following three searches were conducted: “wedding+bride+groom+portrait”; “group shot” or “group photo” or “group portrait”; and “family portrait”. A standard set of negative query terms was used to remove undesirable images.
Q13. What is the benefit of this approach?
One benefit of this approach is that a common algorithm and training set are used for both tasks; only the class labels and the pairings for learning discriminative projections are modified.
Q14. How does gender relate to a face's position in the group structure?
Regarding the degree deg(vn) of a face in MST(G), females tend to be more centrally located in a group, and consequently have a higher mean degree in MST(G).
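A small sketch of the MST-degree computation over face positions, assuming Euclidean distances between faces; the positions are synthetic:

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

# Four synthetic face positions; index 2 sits between the others.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 0.1], [0.5, 1.0]])
dist = squareform(pdist(positions))          # pairwise Euclidean distances
mst = minimum_spanning_tree(dist).toarray()  # directed sparse MST as array
adj = (mst + mst.T) > 0                      # symmetrize into an adjacency matrix
degrees = adj.sum(axis=1)
print(degrees)  # the central face (index 2) has the highest degree
```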
Q15. What is the mean distance between a person and her nearest neighbor?
Using the fact that the distance between human adult eye centers is 61±3 mm [9], the mean distance between a person and her nearest neighbor is 306 mm.
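A minimal sketch of the scale conversion this implies, assuming the eye spacing of each face has been measured in pixels; the function name and example numbers are illustrative:

```python
# Mean distance between adult human eye centers, per the text (61 mm).
EYE_DISTANCE_MM = 61.0

def pixels_to_mm(pixel_dist, eye_dist_pixels):
    """Scale an image-plane distance to millimeters via the known eye spacing."""
    return pixel_dist * EYE_DISTANCE_MM / eye_dist_pixels

# E.g. a nearest-neighbor spacing of 150 px when eyes are 30 px apart:
print(pixels_to_mm(150, 30))  # 305.0 mm
```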
Q16. What are the terms used to search for people images?
As Flickr does not explicitly allow searches based on the number of people in the image, the authors created search terms likely to yield images of multiple people.