# 3D object recognition using spin-images for a humanoid stereoscopic vision system

## Summary (2 min read)

### Introduction

- Moreover, if the information is precise enough, it can also be used for grasping behaviour.
- Recent work on 3D object model building makes a description based on geometrical features possible.
- This behaviour consists of two major steps: first, building an internal representation of an object unknown to the robot; second, finding this object in an unknown environment.

### B. Normal computation

- When computing spin-images, the normal computation should be as insensitive as possible to noise.
- This is especially important for vision-based information, where the noise might be significant.
- Using the Stanford Bunny model with added Gaussian noise of 20 percent of the average adjacent edge length, the most stable method found was the centre of gravity of the polygons formed by the neighbours of each point.

### C. Spin-image filling

- Regarding the spin-image filling, Johnson proposes two ways: either a direct accumulation or a bilinear interpolation.
- Direct accumulation makes the spin-image sensitive to noise.
- To solve this problem, a bilinear interpolation smooths the effect of noise by sharing the density information among the 4 points connected to the surface.
- One of the most important features needed in their case is the possibility to perceive the object at different distances, and thus at different resolutions.
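The two filling strategies can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the row/column layout and the parameters `size` and `bin_size` are assumptions, and the cylindrical coordinates follow the usual spin-image definition (radial distance `alpha` to the normal axis, signed height `beta` along the normal).

```python
import math

def spin_coords(p, n, x):
    """Return (alpha, beta) of point x relative to the oriented point (p, n).

    n is assumed to be a unit normal.
    """
    d = [x[k] - p[k] for k in range(3)]
    beta = sum(n[k] * d[k] for k in range(3))      # signed height along the normal
    sq = sum(c * c for c in d) - beta * beta
    alpha = math.sqrt(max(sq, 0.0))                # radial distance to the normal axis
    return alpha, beta

def fill_spin_image(p, n, points, size, bin_size, bilinear=True):
    """Accumulate points into a size x size image (rows: beta, columns: alpha)."""
    img = [[0.0] * size for _ in range(size)]
    half = size * bin_size / 2.0
    for x in points:
        alpha, beta = spin_coords(p, n, x)
        u = (half - beta) / bin_size               # continuous row coordinate
        v = alpha / bin_size                       # continuous column coordinate
        i, j = math.floor(u), math.floor(v)
        if bilinear:
            a, b = u - i, v - j                    # fractional offsets inside the cell
            cells = [(i, j, (1 - a) * (1 - b)), (i + 1, j, a * (1 - b)),
                     (i, j + 1, (1 - a) * b), (i + 1, j + 1, a * b)]
        else:
            cells = [(i, j, 1.0)]                  # direct accumulation: one full vote
        for r, c, w in cells:
            if 0 <= r < size and 0 <= c < size:    # drop votes outside the image
                img[r][c] += w
    return img
```

With direct accumulation a small displacement of a point can move its whole vote to a neighbouring bin, whereas the bilinear variant only shifts part of the weight between the four surrounding bins.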

### A. Computing resolution of an object

- The resolution of the perceived object depends upon the stereoscopic system capabilities, the distance between the robot and the object, and the possible sub-sampling scheme during image processing.
- This error may also be induced by the segmentation used to match two points in the right and the left images, in their case a correlation.
- Those volumes are the intersection of the cones representing the surfaces on the image planes.
- They can also be interpreted as the location error of a 3D point. [8] and [9] proposed an ellipsoid-based approximation of this volume, while [10] proposed a guaranteed bounding box using interval analysis.
- Thus, in order to extract a global resolution from the scene, the average edge length $L_{scene}$ is also used.
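The average edge length used as a global resolution can be computed as in the sketch below. It assumes the scene is available as a triangle mesh given by a vertex list and a list of index triples; the function name and the data layout are illustrative, not taken from the paper.

```python
import math

def average_edge_length(vertices, triangles):
    """Mean length of the unique edges of a triangle mesh."""
    edges = set()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            edges.add((min(u, v), max(u, v)))      # store each shared edge once
    if not edges:
        raise ValueError("mesh has no edges")
    total = sum(math.dist(vertices[u], vertices[v]) for u, v in edges)
    return total / len(edges)
```

Counting each shared edge once is a design choice; averaging over per-triangle edges instead would weight interior edges twice.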

### B. Multi-resolution signature

- The dyadic scheme consists of halving each dimension of the spin-image between two successive resolutions.
- The main question is how to share the information carried by the points which will disappear.
- One can notice that the same quadrant may have several notations depending on the reference point used.
- It should be stressed here that in their current implementation, only the spin-images are subjected to a multi-resolution scheme.
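A dyadic reduction of a spin-image can be sketched as follows. How the information of the disappearing points is shared is exactly the open question raised above; this sketch simply assumes that each coarse bin receives the sum of the 2 x 2 block it replaces, which is one possible choice and not necessarily the authors'. The function names are hypothetical.

```python
def halve_spin_image(img):
    """Divide each dimension by 2 by summing every 2 x 2 block of bins."""
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("both dimensions must be even")
    return [[img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]
             for c in range(cols // 2)]
            for r in range(rows // 2)]

def build_pyramid(img, levels):
    """Return [img, img/2, img/4, ...] with `levels` successive reductions."""
    pyramid = [img]
    for _ in range(levels):
        pyramid.append(halve_spin_image(pyramid[-1]))
    return pyramid
```

Summing preserves the total accumulated density across levels, so the levels stay comparable without renormalising them.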

### A. Selection of the best resolution

- From Section III, the object resolution is the average edge length in the scene.
- Then the resolution for the model’s spin-images is chosen according to Eq. 1.
- Two spin-images $(p, q)$ with the same resolution are compared using the following correlation function as proposed in [3]: $R = N \sum_{i=0}^{N} p_i q_i - \sum_{i=0}^{N} p_i \dots$
- Thus, during the multi-resolution phase, the spin-images are not normalised.
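The comparison of two spin-images can be illustrated with the standard linear correlation coefficient, which is the form usually associated with spin-image matching. The sketch assumes that this is the function of [3] referred to above and that both images are flattened to lists of equal length; the function name is hypothetical.

```python
import math

def spin_correlation(p, q):
    """Linear correlation coefficient between two flattened spin-images."""
    if len(p) != len(q) or not p:
        raise ValueError("spin-images must be non-empty and of equal size")
    n = len(p)
    sp, sq = sum(p), sum(q)
    spq = sum(a * b for a, b in zip(p, q))
    spp = sum(a * a for a in p)
    sqq = sum(b * b for b in q)
    num = n * spq - sp * sq
    den = math.sqrt(max((n * spp - sp * sp) * (n * sqq - sq * sq), 0.0))
    return num / den if den > 0 else 0.0           # constant images carry no signal
```

Because the coefficient is invariant to a global scale and offset of the bin values, two images can be compared even when their raw accumulations have not been normalised.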

### B. Rigid transformation evaluation

- The main rigid transformation is obtained as follows:
- Some points are randomly selected in the scene.
- Their corresponding points in the model are searched by comparing their spin-image to all the model’s spin-images as depicted in Fig.
- If e is the real rigid transformation, then it should project the maximum number of points from the scene to the model.
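The selection criterion above (the true transformation projects the maximum number of scene points onto the model) can be sketched as a simple vote. How candidate transformations are generated from the matched points is not detailed here, so the sketch takes a list of candidates as input; the representation as a rotation matrix `R` and translation `t`, the tolerance `tol`, and the function names are all assumptions.

```python
import math

def apply_transform(R, t, x):
    """Apply the rigid transformation y = R x + t to a 3D point."""
    return tuple(sum(R[r][k] * x[k] for k in range(3)) + t[r] for r in range(3))

def count_projected(R, t, scene, model, tol):
    """Number of scene points that land within tol of some model point."""
    hits = 0
    for x in scene:
        y = apply_transform(R, t, x)
        if any(math.dist(y, m) <= tol for m in model):
            hits += 1
    return hits

def best_transform(candidates, scene, model, tol):
    """Pick the candidate (R, t) that projects the most scene points onto the model."""
    return max(candidates,
               key=lambda Rt: count_projected(Rt[0], Rt[1], scene, model, tol))
```

The nearest-point search here is a brute-force scan; for clouds of tens of thousands of points a spatial index such as a k-d tree would normally replace it.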

### C. Final correlation coefficient

- In order to verify the main rigid transformation, points of the model are chosen randomly and verified against the scene using the proposed main rigid transformation.
- The main correlation coefficient is the average of the best 80 % of the correlation coefficients.
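The averaging step can be written in a few lines. The sketch assumes that the per-point correlation coefficients are already available as a list and that the number of values kept is rounded down; both the function name and the rounding rule are assumptions.

```python
def main_correlation(coefficients, keep=0.8):
    """Average of the best `keep` fraction of the correlation coefficients."""
    if not coefficients:
        raise ValueError("no correlation coefficients given")
    ranked = sorted(coefficients, reverse=True)    # best coefficients first
    k = max(1, int(len(ranked) * keep))            # always keep at least one value
    return sum(ranked[:k]) / k
```

Discarding the worst 20 % makes the score tolerant of model points that have no counterpart in the scene, for example because of occlusion.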

### A. Simulation

- The previously described algorithm was tested in different situations to check its efficiency.
- First, a Stanford Bunny spin-image was tested against a spin-image of the dinosaur represented in Fig.
- The third case intends to simulate a single view of the complete 3D model, and the subsequent self-occlusion as shown in Fig. 10.
- The associated rigid transformation has no rotation and no translation.
- The resulting main correlation coefficient was 0.22.

### B. OpenHRP [11] simulator

- The HRP-2 humanoid robot is simulated inside a house environment.
- The goal of this simulation was to try to cope with different objects present in the scene.
- The Stanford Bunny is above a table, behind chairs, and several objects are present in the background, as depicted in Fig. 11 and Fig.
- Using the previously described scheme, the model is found with a correlation coefficient close to 0.99.
- One can conclude that the other objects in the scene do not decrease the efficiency of the search.

### C. Real data

- The HRP-2 humanoid robot is equipped with a trinoptic vision system.
- Using a correlation method to match points between the left image and the right image, clouds of 3D points are computed using epipolar geometry.
- The object used for this test is a box of cookies depicted in Fig. 12(b).
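The reconstruction of a 3D point from a matched pixel pair can be illustrated for the simplest case of a rectified stereo pair, where the epipolar lines are image rows and depth follows from the horizontal disparity. The rectified setup, the pinhole parameters (`f`, `cx`, `cy` in pixels, `baseline` in metres) and the function name are assumptions for the sake of the example, not a description of the HRP-2 vision system.

```python
def triangulate(u_left, v, u_right, f, baseline, cx, cy):
    """3D point, in the left camera frame, from a match in a rectified stereo pair."""
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = f * baseline / disparity                   # depth from similar triangles
    x = (u_left - cx) * z / f                      # back-project the left pixel
    y = (v - cy) * z / f
    return (x, y, z)
```

In this model a disparity error of one pixel changes the depth by roughly `z * z / (f * baseline)`, so the location error grows quadratically with distance, which is consistent with the perceived resolution depending on the distance between the robot and the object.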

### D. Computation time

- Building the Stanford Bunny model takes 6 minutes and 24 seconds for 34834 points.
- The recognition process takes 32 seconds for a scene, using 100 spin images to compute the rigid transformation.
- Two kinds of improvement are possible: using a compression scheme such as the Principal Component Analysis as proposed in [3], or a Wavelet based approach such as WaveMesh [13].

##### Citations

### Cites methods from "3D object recognition using spin-im..."

...Other works employing the SI for object matching and representation are: Stasse [277] presents a multi-resolution SI approach for object representation and recognition; Assfalg [17] shows a SI variation called Spin Image Signatures (SIS), which is developed under the SI approach with adaptations to support effective retrieval by content; Li [163] demonstrates a framework to identify partial 3D format in 3D CAD parts using the SI as descriptor; Ping [233] proposes the Tsallis entropy use to generate a concise SI representation, called Tsallis Entropy vector of Spin Image (TESI); Choi [47] proposed an improved SI version, which enhances the format discrimination performance, called Angular-Partitioned Spin Images (APSIs)....

### Cites methods from "3D object recognition using spin-im..."

...In [14], spin images were used in a 3D object detection system with the humanoid robot HRP-2 [18]....

...Spin images are shape descriptors which have been applied to surface matching [12], object recognition [13][14], 3D registration [15] and 3D object retrieval [16]....

### Cites methods from "3D object recognition using spin-im..."

...Depending on the task different recognitions can be used, as we have at our disposal either a 3D-edge model [25] or a Spin-Image [26]....

##### References

### "3D object recognition using spin-im..." refers methods in this paper

...Towards the design of a search engine for databases of CAD models, several 3D descriptors have been proposed to build signatures of 3D objects [3], [4], [5]....

### "3D object recognition using spin-im..." refers background in this paper

...The targeted application is a “Treasure hunting” behaviour on an HRP-2 humanoid robot [6]....
