MatchNet: Unifying feature and metric learning for patch-based matching
Citations
Learning to Compare: Relation Network for Few-Shot Learning
Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources
Unsupervised Learning of Depth and Ego-Motion from Video
Volumetric and Multi-view CNNs for Object Classification on 3D Data
References
ImageNet Classification with Deep Convolutional Neural Networks
Object Recognition from Local Scale-Invariant Features
SURF: Speeded Up Robust Features
ORB: An Efficient Alternative to SIFT or SURF
A Performance Evaluation of Local Descriptors
Frequently Asked Questions (14)
Q2. What are the future works in "Matchnet: unifying feature and metric learning for patch-based matching" ?
The authors also evaluate a suite of architectural variations to study the trade-off between accuracy and storage/computation. This work demonstrates that deep convolutional neural networks can be effective for general wide-baseline patch matching, and it suggests that deep learning approaches, combined with more advanced quantization, can yield even more significant improvements in the accuracy/feature-size trade-off.
Q3. What is the use of the bottleneck layer?
The bottleneck layer can be used to reduce the dimension of the feature representation and to control overfitting of the network.
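A minimal PyTorch sketch of this idea (the dimensions are illustrative choices, not MatchNet's exact configuration): a single linear bottleneck appended after the feature tower shrinks the patch descriptor before it is stored or passed to the metric network.

```python
import torch
import torch.nn as nn

# Illustrative dimensions: the feature tower in the paper outputs a 4096-d
# descriptor, and bottleneck sizes in the experiments range from 64d to 512d.
feature_dim, bottleneck_dim = 4096, 64

# The bottleneck is just a learned linear projection to a lower dimension.
bottleneck = nn.Linear(feature_dim, bottleneck_dim)

patch_features = torch.randn(8, feature_dim)  # a batch of 8 patch descriptors
compact = bottleneck(patch_features)
print(compact.shape)  # torch.Size([8, 64])
```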
Q4. How many bits does the naive representation yield?
Using a naive representation, with 1 bit encoding whether each value is zero and 6 bits quantizing each non-zero value, the features take 64 + 6 × 64 × 0.679 = 324.7 bits on average (0.679 being the average fraction of non-zero entries).
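For reference, a one-line check of that arithmetic (assuming, as the formula implies, that 0.679 is the average fraction of non-zero feature entries):

```python
dim = 64              # feature dimension
nonzero_rate = 0.679  # average fraction of non-zero entries

indicator_bits = dim * 1             # 1 "is-zero" bit per dimension
value_bits = 6 * dim * nonzero_rate  # 6 quantization bits per non-zero entry
print(indicator_bits + value_bits)   # 324.736 -> ~324.7 bits on average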
Q5. What is the model for nSIFT?
Their best model is trained without a bottleneck; it learns a high-dimensional patch representation coupled with a discriminatively trained metric.
Q6. How much error rate is the nSIFT concat.+NNet?
Without discriminative projection, at around 1500d the error rate is still above 9%, more than twice MatchNet’s error rate (3.87%) with a 4096d patch representation.
Q7. What is the purpose of patch-based image matching?
Finding accurate correspondences between patches is instrumental in a broad variety of applications, including wide-baseline stereo (e.g., [14]), object instance recognition (e.g., [13]), fine-grained classification (e.g., [36]), multi-view reconstruction (e.g., [20]), image stitching (e.g., [4]), and structure from motion (e.g., [17]).
Q8. What types of learning algorithms are proposed to find the optimal parameters for [3], [28]?
Different types of learning algorithms have been proposed to find the optimal parameters: Powell minimization for [3], boosting for [28], and convex optimization for [22].
Q9. How many times do the authors go through the whole dataset?
Since the authors go through the whole dataset many times, the network still gets good positive coverage even though only one positive pair is picked from each group per pass, especially when the average group size is small.
Q10. How is the performance of the matchnet evaluated?
The authors train MatchNet using the techniques described in Section 4 and evaluate its performance under different (F, B) combinations, where F and B are the dimensions of the fully-connected layers (F1 and F2) and of the bottleneck layer, respectively.
Q11. What is the way to use the feature tower and the metric network?
The authors can use the feature tower and the metric network separately, in two stages. (Footnote: following [32], if the sampler’s reservoir is not full, the candidate is always added; otherwise, the T-th candidate is added with probability R/T, replacing a random element in the reservoir, and is rejected with probability 1 - R/T.)
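The footnote describes standard reservoir sampling; a self-contained sketch follows (the function and variable names are illustrative, not from the paper):

```python
import random

def reservoir_sample(stream, R):
    """Keep a uniform sample of R items from a stream of unknown length.

    Matches the footnote: while the reservoir is not full, every candidate
    is added; afterwards the T-th candidate is accepted with probability
    R/T and replaces a uniformly random element, else it is rejected.
    """
    reservoir = []
    for T, candidate in enumerate(stream, start=1):
        if len(reservoir) < R:
            reservoir.append(candidate)
        elif random.random() < R / T:
            reservoir[random.randrange(R)] = candidate
    return reservoir

print(reservoir_sample(range(10000), R=5))
```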
Q12. How does the model achieve the performance?
With a 64d bottleneck, their 64-1024×1024 model achieves a 10.94% average error rate vs. [22]’s 10.75% using features of about the same dimension.
Q13. How much improvement in absolute error rate does MatchNet achieve?
On the other hand, with a 512d bottleneck and quantization, MatchNet still outperforms [22]’s PR (<640d) results on 4 out of 6 train-test pairs, with up to 7% improvement in absolute error rate.
Q14. What is the significance of the trade-off between accuracy and feature size?
This suggests that deep learning approaches, combined with more advanced quantization, can yield even more significant improvements in the accuracy/feature-size trade-off.