# A Part-Based Skew Estimation Method

## Summary (3 min read)

### Introduction

- Skew estimation is one of the most important preprocessing steps for OCR.
- All three methods assume that the characters in an image are laid out in straight lines; they therefore estimate the text skew by finding text lines, each in its own way, and then measuring the angles of those lines.
- Another example is camera-captured scene images, in which text appears in a scattered manner.
- Furthermore, since local parts can be detected from the input image without binarization, the authors can expect more robustness to occlusions and complex backgrounds than using the full shapes of characters.
- In Section II, the conventional methods are first reviewed briefly.

### II. CONVENTIONAL METHODS

- There are several conventional methods for skew estimation.
- In the following, the three most representative methods (the projection profile method, the Hough transform method, and the nearest-neighbor method) are reviewed briefly.
- A conventional part-based method is also reviewed.

### A. The Projection Profile Method

- The projection profile method (e.g., [3]) uses a histogram obtained by accumulating the number of black pixels (in a binarized image) along parallel sample lines through the document.
- For horizontal writing, the projection profile taken horizontally along rows has the narrowest peaks.
- In the most straightforward method, a projection profile is calculated for each candidate orientation, and the one with the sharpest peaks indicates the skew angle.
- A modified version of the projection profile method, proposed by Akiyama et al. [4], first separates the document into several “swaths”.
- A projection profile is then calculated for each of the swaths.
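
The straightforward variant described above can be sketched in a few lines of Python. This is only an illustration, not the implementation used in [3] or [4]: the sharpness criterion (sum of squared bin counts), the one-unit bin width, and the helper `synthetic_text_lines` are assumptions introduced here.

```python
import math
from collections import Counter


def projection_profile(points, angle_deg):
    """Count foreground points per sample line tilted by angle_deg."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    # -x*s + y*c is the signed distance of (x, y) from the tilted line through
    # the origin; points on the same sample line share the same rounded value.
    return Counter(round(-x * s + y * c) for x, y in points)


def sharpness(profile):
    # Assumed criterion: the sum of squared bin counts, which grows as the
    # same number of points concentrates into fewer bins.
    return sum(v * v for v in profile.values())


def estimate_skew_by_projection(points, candidates_deg):
    """Return the candidate angle whose profile has the sharpest peaks."""
    return max(candidates_deg,
               key=lambda a: sharpness(projection_profile(points, a)))


def synthetic_text_lines(skew_deg, n_lines=5, length=200, spacing=20):
    """Hypothetical test data: points on parallel lines rotated by skew_deg."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    return [(u * c - k * spacing * s, u * s + k * spacing * c)
            for k in range(n_lines) for u in range(length)]


points = synthetic_text_lines(7)
estimated = estimate_skew_by_projection(points, range(-15, 16))
```

On this synthetic input every point of a line falls into a single bin only at the true angle, so that candidate scores highest. Real documents would need binarization first, which is exactly the dependency the part-based method tries to avoid.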

### B. The Hough Transform Method

- If the characters are laid out in straight lines, the centers of gravity of the characters align in straight lines accordingly.
- Amin and Fischer [7] first find connected components in an input image and group them according to the distance between them.
- Each of the grouped areas is then divided into several swaths whose widths are about the same size as a connected component.
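
The basic idea, that centers of gravity lying on a common straight line vote for the same line parameters, can be sketched as follows. This is a simplified illustration and not the grouping-and-swath procedure of [7]; the accumulator resolution `rho_step`, the candidate angle list, and the helper `skewed_centroids` are assumptions made for the example.

```python
import math
from collections import Counter


def hough_skew(centroids, candidates_deg, rho_step=1.0):
    """Return the candidate angle whose best (angle, rho) cell has most votes."""
    best_angle, best_votes = None, -1
    for angle in candidates_deg:
        t = math.radians(angle)
        c, s = math.cos(t), math.sin(t)
        # Each centroid votes for the line of direction `angle` that passes
        # through it, identified by its quantized offset rho from the origin.
        accumulator = Counter(round((-x * s + y * c) / rho_step)
                              for x, y in centroids)
        votes = max(accumulator.values())
        if votes > best_votes:
            best_angle, best_votes = angle, votes
    return best_angle


def skewed_centroids(skew_deg, n_lines=3, per_line=10, char_gap=12, line_gap=30):
    """Hypothetical test data: character centroids on skewed text lines."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    return [(u * char_gap * c - k * line_gap * s,
             u * char_gap * s + k * line_gap * c)
            for k in range(n_lines) for u in range(per_line)]


estimated = hough_skew(skewed_centroids(-4), range(-15, 16))
```

Only at the true angle do all ten centroids of a text line fall into one accumulator cell, so that angle wins. The sketch assumes the centroids are already available, which in practice requires connected-component extraction.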

### C. The Nearest-Neighbor Method

- In the nearest-neighbor method, connected components are determined first.
- Then, for each connected component, its nearest-neighbor components are found.
- The final skew is estimated as the peak of the histogram.
- O’Gorman [9] used not only the single nearest neighbor but the 𝑘 nearest neighbors (where 𝑘 is usually 5) to first make a rough skew estimate.
- After the between-line nearest neighbors are discarded, a more accurate estimate is calculated using only the within-line nearest neighbors.
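
A minimal sketch of the single-nearest-neighbor variant is shown below; it does not implement the 𝑘-nearest-neighbor refinement of [9]. The one-degree bin width, the brute-force neighbor search, and the helper `skewed_centroids` are assumptions chosen for brevity.

```python
import math
from collections import Counter


def nearest_neighbor_skew(centroids, bin_width=1.0):
    """Histogram the directions to each nearest neighbor; return the peak."""
    votes = Counter()
    for i, (x, y) in enumerate(centroids):
        # Brute-force search for the nearest other component.
        nx, ny = min((p for j, p in enumerate(centroids) if j != i),
                     key=lambda p: math.hypot(p[0] - x, p[1] - y))
        angle = math.degrees(math.atan2(ny - y, nx - x))
        # A text-line direction is unsigned, so fold the angle into [-90, 90).
        if angle >= 90.0:
            angle -= 180.0
        elif angle < -90.0:
            angle += 180.0
        votes[round(angle / bin_width)] += 1
    peak_bin, _ = votes.most_common(1)[0]
    return peak_bin * bin_width


def skewed_centroids(skew_deg, n_lines=3, per_line=10, char_gap=10, line_gap=30):
    """Hypothetical test data: character centroids on skewed text lines."""
    t = math.radians(skew_deg)
    c, s = math.cos(t), math.sin(t)
    return [(u * char_gap * c - k * line_gap * s,
             u * char_gap * s + k * line_gap * c)
            for k in range(n_lines) for u in range(per_line)]


estimated = nearest_neighbor_skew(skewed_centroids(5))
```

Because the characters in this toy layout are closer to their in-line neighbors than to the adjacent lines, every vote lands in the bin of the true skew. When line spacing is tighter than character spacing, between-line neighbors pollute the histogram, which is the case the refinement in [9] addresses.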

### D. The Conventional Part-Based Method

- One solution to the problem of the above conventional methods has been proposed in [10].
- Like the proposed method, it is also a part-based skew estimation method.
- Finally, the skew angle of “each character” can be found, and the most frequent local skew angle is chosen as the global skew.
- The drawback of [10] is that it relies entirely on the extraction of connected components.
- Since the process of extracting connected components depends on the quality of binarization, the skew estimation will fail if binarization fails to extract connected components of characters accurately.

### B. Training Step

- First, upright character images (i.e., font images) are prepared as the training dataset.
- If necessary, multiple font images are prepared.
- First, the local part positions detected by SURF are the same regardless of the scale and the skew of the target image.
- Second, SURF can determine the “dominant orientation” at each local part based on the image gradient.
- Third, the SURF feature vector is skew (and scale) invariant.

### C. Local Skew Estimation Step

- The skew angle of each local part of an input image is estimated by referring to the database.
- Specifically, as shown in Fig. 2, the nearest neighbor (measured by Euclidean distance in the feature vector space) for the input local part is first found from the database.
- Because of the invariance of the SURF feature vector, the authors can expect that the input local part and its nearest neighbor are the same local part of a certain character.
- Then, recalling the second property of SURF explained in III-B, the skew angle of the input local part is estimated simply by taking the difference between the dominant orientations of the input local part and its nearest neighbor.
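
The lookup-and-subtract step can be sketched as follows. This is an illustration only: the three-dimensional descriptors are toy stand-ins for real SURF feature vectors, the database layout (a list of descriptor and orientation pairs) is hypothetical, and the sign convention (input orientation minus database orientation) is an assumption, since the text only states that the difference is used.

```python
import math


def wrap_angle(deg):
    """Map an angle in degrees into the interval [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def local_skews(input_parts, database):
    """Estimate one skew angle per input local part.

    Both arguments are lists of (descriptor, dominant_orientation_deg) pairs;
    the database entries come from upright training characters.
    """
    skews = []
    for descriptor, orientation in input_parts:
        # Nearest neighbor by Euclidean distance in feature-vector space.
        _, nn_orientation = min(database,
                                key=lambda entry: math.dist(entry[0], descriptor))
        skews.append(wrap_angle(orientation - nn_orientation))
    return skews


# Toy database built from upright characters (values are made up).
database = [((1.0, 0.0, 0.0), 30.0),
            ((0.0, 1.0, 0.0), 120.0),
            ((0.0, 0.0, 1.0), 175.0)]

# The same parts observed in an image skewed by 10 degrees, with slightly
# perturbed descriptors; the last orientation wraps past 180 degrees.
input_parts = [((0.9, 0.1, 0.0), 40.0),
               ((0.1, 0.95, 0.0), 130.0),
               ((0.0, 0.1, 0.9), -175.0)]

skews = local_skews(input_parts, database)
```

Wrapping the difference matters because dominant orientations are circular quantities: 175 degrees rotated by 10 degrees is reported as -175 degrees, and a plain subtraction would give -350 instead of 10.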

### D. Global Skew Estimation Step

- The global skew angle is finally estimated by aggregating the estimated local skew angles.
- This is because, for example, the nearest neighbor in the database is sometimes a different local part due to the ambiguity of local parts.
- As a robust aggregation method that is not affected by large deviations, the authors use a simple majority voting scheme as shown in Fig.
- The width of each bin is predetermined according to the skew sensitivity of the succeeding character recognition.
- The global skew angle is estimated as the angle of the bin with the maximum votes.
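
The voting scheme can be sketched as a histogram over the local estimates. The one-degree default bin width and the choice to center bins on multiples of the bin width are assumptions for illustration; the source only says the width is set according to the skew sensitivity of the subsequent recognizer.

```python
import math
from collections import Counter


def global_skew(local_skews_deg, bin_width_deg=1.0):
    """Majority vote: return the center of the bin with the most votes."""
    # Bins are centered on multiples of bin_width_deg.
    votes = Counter(math.floor(a / bin_width_deg + 0.5)
                    for a in local_skews_deg)
    best_bin, _ = votes.most_common(1)[0]
    return best_bin * bin_width_deg


# Four consistent local estimates near 10 degrees plus three outliers caused
# by wrong nearest-neighbor matches (values are made up).
local = [10.2, 9.8, 10.4, -75.0, 33.0, 9.9, 120.0]
voted = global_skew(local)
mean = sum(local) / len(local)
```

On this toy input the vote returns the 10-degree bin, while the arithmetic mean is pulled several degrees away by the outliers, which is the reason a vote is preferred over averaging. The sketch ignores wrap-around at ±180 degrees.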

### A. Basic Performance Test

- An experiment was conducted to observe the basic performance of the proposed method on 200 text-only images.
- As the training set, the authors chose the characters 0-9, lowercase a-z, and uppercase A-Z in Times New Roman.
- The proposed method achieved an average error of 0.3 degrees.
- The accuracy of the proposed method could possibly be improved by choosing more suitable features.

### B. Comparison with Conventional Methods

- As a simple example of such text images, mathematical equations were employed in the second experiment.
- Table I shows the accuracies; the proposed method was more accurate than all three conventional methods on the test set.
- In Fig. 5(c), selected text lines are shown in each of the images.
- It is unsurprising, yet interesting, to see that on ‘y’ and ‘2𝑥²’ the blue dots that correspond to the correct skew angle dominate.

### C. Scene Images

- The test set consists of 5 scene images captured with a digital camera and 1 synthetic poster image.
- On the test set, the proposed method was accurate to within ±2 degrees, and the average error was 0.43 degrees.
- In (b) and (e), the method estimated the skew correctly even with complex backgrounds in the images.
- It is very interesting to see that, in the character region, the color corresponding to the correct skew angle dominates the area.

### V. CONCLUSION

- In this paper, the authors proposed a part-based skew estimation method.
- Instead, the method utilizes the local parts of characters as a fundamental unit of skew estimation.
- It is effective on document images without explicit (and long) text lines.
- The experimental results have shown the advantage of the proposed method over the conventional methods.
- The results have also shown that the proposed method is applicable to scene images with varieties of occlusions and complex backgrounds.

##### Citations

### Cites background from "A Part-Based Skew Estimation Method..."

...…boundary (Clark & Mirmehdi, 1999), text lines (Clark & Mirmehdi, 2001), character shape (Liang et al., 2008; Lu et al., 2005), instances which means the pair of origin image and the distortion image (Lu & Tan, 2007; Shiraishi et al., 2012; Uchida et al., 2008) can be used to remove distortion....

### Cites background from "A Part-Based Skew Estimation Method..."

...In addition to a brief overview of initial trials proposed in [7], this paper provides several totally new experimental results for further evaluation....

##### References

### "A Part-Based Skew Estimation Method..." refers methods in this paper

...By simply finding the slope of the lines that pass through those centers of gravity using the Hough transform, the text skew can be estimated [6]....

### "A Part-Based Skew Estimation Method..." refers methods in this paper

...Hough transform can be used to estimate a skew angle (e.g., [5])....

##### Frequently Asked Questions (2)

###### Q2. What have the authors stated as future work in "A part-based skew estimation method"?

In the future, the accuracy of the method can be improved by choosing a more suitable feature detector and descriptor. Extension to nonuniform skew, especially perspective distortion, is also important future work. Another improvement can be made by refining the database, since some of the local parts have less accuracy in their calculated orientation than others.