# Decomposition of Grey-Scale Morphological Structuring Elements in Hardware

I. Andreadis, C. Fyrinides, A. Gasteratos and Y. Boutalis

Section of Electronics & Information Systems Technology, Department of Electrical & Computer Engineering, Democritus University of Thrace, GR-67 100 Xanthi, Greece

E-mail: iandread@ee.duth.gr

**Abstract-** Morphological image processing machines are not capable of handling large-size structuring elements. This paper presents a new architecture for fast execution of the erosion/dilation operations in an arbitrarily shaped image window of up to $9 \times 9$ pixels, through decomposition of the grey-scale morphological structuring element into $3 \times 3$-pixel sub-domains. The proposed hardware structure has also been implemented in VLSI and its throughput rate is 10 Mbytes/sec.

Keywords: Computer vision, mathematical morphology, ASICs.

### 1. Introduction

Mathematical morphology is a methodology for image analysis and image processing based on set theory and topology [5]. The two basic morphological operations are erosion and dilation, from which all other operations and transforms are composed. The structuring element plays a primary role in mathematical morphology: it is translated sequentially across the image and compared with the region it overlays. Various structuring elements are used to perform morphological operations, and several applications, such as textural segmentation and granulometry, require the application of successively larger structuring elements. However, in special-purpose morphological processors the size of the structuring element is restricted to $3 \times 3$ pixels [3]. Thus, applying a large structuring element on a machine that can handle only structuring elements of a fixed size is not a straightforward process. To overcome this problem, the structuring element should be decomposed. This problem has concerned many researchers, both for binary [4, 7] and for grey-scale morphology [1, 2, 6]. One decomposition strategy is to represent the structuring element as successive dilations of smaller structuring elements. Algorithms for optimal structuring element decomposition according to this strategy are described in [1, 4, 7]. Algorithms for decomposing grey-scale structuring elements with rectangular support into horizontal and vertical structuring elements are presented in [2]. Several methods for decomposing structuring elements into combined structures of segmented small components are presented in [6].

This paper presents a new architecture that performs dilation/erosion in a $9 \times 9$-pixel, arbitrarily shaped image window through decomposition of grey-scale morphological structuring elements. According to the decomposition technique, the domain of the structuring element is divided into non-overlapping $3 \times 3$-pixel sub-domains. These sub-domains are used to compute the morphological operations locally, and the global result of the morphological operation is extracted from these local results. The architecture has also been implemented in VLSI. The resulting ASIC can handle images of up to $32 \times 32$ pixels. The maximum frequency of operation (worst case) of the ASIC is approximately 90 MHz and its throughput rate is 10 Mbytes/sec. Its die size is 5.5 mm × 5.5 mm = $30.2\ mm^2$, for a DLM, $0.7\mu$m, CMOS, N-well technology process.

### 2. Basic Definitions

In terms of grey-scale morphology dilation and erosion of a grey-scale image *f* by a grey-scale structuring element *g* are:

$$(f \oplus g)(x) = \max_{\substack{z \in G \\ x-z \in F}} \{f(x - z) + g(z)\} \tag{1}$$

and

$$(f \ominus g)(x) = \min_{\substack{z \in G \\ x+z \in F}} \{ f(x+z) - g(z) \} \tag{2}$$

where $x, z \in E^N$ are the spatial co-ordinates, $f: F \to E$ is the grey-scale image, $g: G \to E$ is the grey-scale structuring element and $F, G \subseteq E^N$ are the domains of the grey-scale image and the structuring element, respectively.
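Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration of the definitions, not the hardware algorithm of the paper. It assumes that the image and the structuring element are stored as dictionaries mapping integer co-ordinate tuples to grey values; the function names `dilate_at` and `erode_at` are hypothetical.

```python
def dilate_at(f, g, x):
    """Grey-scale dilation of f by g at point x, following eqn. (1).

    f and g map co-ordinate tuples to grey values (assumed representation).
    Only offsets z for which x - z lies inside the image domain contribute.
    """
    terms = []
    for z, gz in g.items():
        p = tuple(xi - zi for xi, zi in zip(x, z))  # x - z
        if p in f:
            terms.append(f[p] + gz)
    return max(terms) if terms else None  # None: g does not overlap the image


def erode_at(f, g, x):
    """Grey-scale erosion of f by g at point x, following eqn. (2)."""
    terms = []
    for z, gz in g.items():
        p = tuple(xi + zi for xi, zi in zip(x, z))  # x + z
        if p in f:
            terms.append(f[p] - gz)
    return min(terms) if terms else None


# One-dimensional toy data (illustrative values only).
f = {(0,): 3, (1,): 7, (2,): 5}
g = {(0,): 1, (1,): 2}
print(dilate_at(f, g, (1,)), erode_at(f, g, (1,)))
```

Returning `None` when no offset overlaps the image is a choice made for this sketch; the definitions above leave that case open.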

#### 2.1 Decomposition strategy [6]

The decomposition strategy is as follows:

- i. Divide the large structuring element $g$ into several smaller non-overlapping regions, denoted by $g_1, g_2, ..., g_n$, with domains $G_1, G_2, ..., G_n$, respectively. This procedure is illustrated in Figure 1, where the $9 \times 9$-pixel structuring element is divided into nine $3 \times 3$-pixel structuring elements named $g_1, g_2, ..., g_9$.
- ii. $(f \oplus g_i)(x)$ can be computed by evaluating $(f \oplus l_i)(x - z_i)$, where $l_i: G'_i \to E$ and $l_i(x) = g_i(x + z_i)$. $G'_i$ is $G_i$ shifted by $z_i$ with its origin located at the same location.
- iii. Dilation is obtained through the following formula:

$$(f \oplus g)(x) = \max_{i=1}^{n} \{ (f \oplus l_i)(x - z_i) \} \tag{3}$$

- iv. Erosion is obtained through the following formula:

$$(f \ominus g)(x) = \min_{i=1}^{n} \{(f \ominus l_i)(x - z_i)\} \tag{4}$$
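The strategy can be checked numerically. The following Python sketch rests on several assumptions: images and structuring elements are dictionaries keyed by (row, column), the structuring element origin is at its top-left pixel, and all helper names are hypothetical. Each sub-element keeps its original offsets, so the re-centring by $z_i$ in step ii, which matters for a fixed $3 \times 3$ hardware unit, is not modelled here. The sketch only confirms that the max (or min) over the local results equals the global result.

```python
import random


def dilate_at(f, g, x):
    """Eqn. (1): max of f(x - z) + g(z) over z in G with x - z in F."""
    terms = [f[(x[0] - r, x[1] - c)] + v
             for (r, c), v in g.items()
             if (x[0] - r, x[1] - c) in f]
    return max(terms) if terms else None


def erode_at(f, g, x):
    """Eqn. (2): min of f(x + z) - g(z) over z in G with x + z in F."""
    terms = [f[(x[0] + r, x[1] + c)] - v
             for (r, c), v in g.items()
             if (x[0] + r, x[1] + c) in f]
    return min(terms) if terms else None


def split_into_blocks(g, block=3):
    """Step i: partition g into non-overlapping block x block sub-elements."""
    parts = {}
    for (r, c), v in g.items():
        parts.setdefault((r // block, c // block), {})[(r, c)] = v
    return list(parts.values())


def combine(op, reducer, f, parts, x):
    """Steps iii and iv: reduce the local results with max or min."""
    local = [op(f, gi, x) for gi in parts]
    local = [v for v in local if v is not None]  # sub-domains outside the image
    return reducer(local) if local else None


rng = random.Random(0)  # fixed seed, arbitrary test data
f = {(r, c): rng.randrange(256) for r in range(32) for c in range(32)}
g = {(r, c): rng.randrange(256) for r in range(9) for c in range(9)}
parts = split_into_blocks(g)

for x in [(16, 16), (0, 0), (31, 31), (2, 30)]:
    assert combine(dilate_at, max, f, parts, x) == dilate_at(f, g, x)
    assert combine(erode_at, min, f, parts, x) == erode_at(f, g, x)
print(len(parts), "sub-elements reproduce the results of the full element")
```

The equality holds because max and min are associative, so the order in which the terms of eqns. (1) and (2) are grouped does not affect the result.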



Fig. 1 Decomposition of a $9 \times 9$-pixel structuring element into nine $3 \times 3$-pixel structuring elements.

#### 2.2 Illustrative structuring element decomposition example

Let us consider the following image $f$ and structuring element $g$:

| 13 | 9  | 0  | 12 |   | 7 | 10 |
|----|----|----|----|---|---|----|
| 10 | 15 | 12 | 11 |   | 5 | 7  |
|    | f  |    |    |   | g |    |

Dilation at point (0,0) according to eqn. (1) is:

$$(f \oplus g)(0,0) = \max(18,22,25,15,17,7,16,18) = 25$$

According to the decomposition strategy:

i. The structuring element is divided into four $2 \times 1$-pixel structuring elements:



ii. The following local results are obtained from the above structuring elements: 22, 25, 17, 18 for the first, the second, the third and the fourth  $2 \times 1$ -pixel structuring elements, respectively.

iii. According to eqn. (3) the max of these local results, i.e. 25, is the result of the global dilation at point (0,0).
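The arithmetic of this example can be reproduced with a short Python check. It uses only the eight sums quoted above. Grouping consecutive pairs of sums into the four $2 \times 1$-pixel sub-elements is an assumption; it is consistent with the local results given in step ii.

```python
# The eight sums f(x - z) + g(z) quoted in the text for point (0,0).
terms = [18, 22, 25, 15, 17, 7, 16, 18]

# Assumption: consecutive pairs of sums belong to the same 2 x 1 sub-element.
local_results = [max(terms[i:i + 2]) for i in range(0, len(terms), 2)]

# Eqn. (3): the global dilation is the max of the local results.
global_result = max(local_results)

print(local_results, global_result)  # [22, 25, 17, 18] 25
```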

### 3. VLSI Implementation

A VLSI hardware structure that performs dilation/erosion through structuring element decomposition (for an initial structuring element of up to $9 \times 9$ pixels) is described in this section. The circuit can accommodate images of up to $32 \times 32$ pixels. Both the image and the structuring element have 8-bit resolution. The block diagram of the circuit is shown in Figure 2. Its inputs are: an 8-bit bus (Din), which imports the pixels of the image or of the structuring element; signal LoadSEz, which enables loading of the structuring element; signal Mask, which determines whether a pixel belongs to the structuring element; signal Mode, which selects the operation; signal Startz, which marks the start of image loading; and the clock signal clk1. Initially, signal LoadSEz is set to 0 and the $9 \times 9$-pixel structuring element is loaded serially.

In general, the shape of the structuring element can be other than rectangular and, therefore, the input Mask signal is supplied along with the pixels of the structuring element. More specifically, if Mask is set to 1, then the current pixel belongs to the structuring element. An arbitrarily shaped structuring element and the corresponding Mask signals are shown in Figures 3a and 3b, respectively. The Mask signals are also used to manage the boundary conditions (Figure 4). In this case some of the pixels of the structuring element may lie outside the region of the image and, therefore, these pixels should not be considered.

The module that generates the $3 \times 3$-pixel structuring element windows is shown in Figure 5. The structuring element and the Mask signals are stored in an array of 81 registers, each 9 bits wide. After the structuring element has been loaded, this array is isolated from the Din bus by means of tri-state buffers. The Mask signals are the inputs to an array of 81 $2 \times 1$ multiplexers, whose second inputs are set to 0. The multiplexers are controlled by two identical 5-bit counters and some combinational logic circuitry. Figure 6 shows the outputs (RC0 ... RC8) of the combinational logic circuitry with respect to the outputs of the corresponding counter (either C-C or R-C). C-C and R-C count the columns and the rows of the image, respectively. C-C is controlled by clock clk2, and R-C is triggered by signal RC3, which is extracted from the combinational logic circuitry associated with C-C. The 81 different combinations of the outputs of the two combinational logic circuits, combined through OR gates, control the array of multiplexers. Thus, in every clk2 pulse the pixels of the structuring element that lie outside the region of the image are known. The 9-bit buses that carry the pixels of the structuring element and the Mask signals are reconnected to the outputs of the 81 $2 \times 1$ multiplexers.

The $9 \times 9$-pixel structuring element is divided into $3 \times 3$-pixel sub-domains through a multiplexer that provides the appropriate $3 \times 3$-pixel window to the dilation/erosion unit. The multiplexer is controlled by a mod9 up/down counter. This counter operates in down mode in the case of dilation and in up mode in the case of erosion, according to eqns. (1) and (2).
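The masking and window selection described above can be modelled behaviourally in Python. The sketch below is not a description of the actual circuit, and it rests on several assumptions that the text does not state: the origin of the structuring element is taken to be its centre pixel, the nine sub-domains are numbered in row-major order, and signal Mode is encoded as a string. All function names are hypothetical.

```python
SE_SIZE = 9   # structuring element of up to 9 x 9 pixels
SUB = 3       # side of a sub-domain
IMG = 32      # image of up to 32 x 32 pixels


def effective_mask(mask, row, col, origin=(4, 4)):
    """Force Mask to 0 for structuring element pixels outside the image.

    Assumption: structuring element pixel (i, j) overlays image pixel
    (row + i - origin[0], col + j - origin[1]).
    """
    out = []
    for i in range(SE_SIZE):
        out_row = []
        for j in range(SE_SIZE):
            r, c = row + i - origin[0], col + j - origin[1]
            inside = 0 <= r < IMG and 0 <= c < IMG
            # Behaves like a 2 x 1 multiplexer whose second input is 0.
            out_row.append(mask[i][j] if inside else 0)
        out.append(out_row)
    return out


def sub_window(array9x9, k):
    """Return the k-th (k = 0..8) 3 x 3 sub-window, in row-major block order."""
    br, bc = divmod(k, SUB)
    return [r[bc * SUB:(bc + 1) * SUB]
            for r in array9x9[br * SUB:(br + 1) * SUB]]


def counter_sequence(mode):
    """Order in which a mod-9 up/down counter visits the sub-domains."""
    return list(range(8, -1, -1)) if mode == "dilation" else list(range(9))


full_mask = [[1] * SE_SIZE for _ in range(SE_SIZE)]
corner = effective_mask(full_mask, 0, 0)
print(sum(map(sum, corner)))  # number of structuring element pixels kept at the corner
```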



Fig. 2 Structuring element decomposition architecture.



Fig. 3 (a) Structuring element and (b) corresponding Mask signals.



Fig. 4 Boundary conditions.



Fig. 5 Generator of  $3 \times 3$ -pixel structuring elements.



Fig. 6 Image row/column timing diagram.



Fig. 7 Data loading timing diagram.

After loading the structuring element, signal Startz is set to 0 for one clock period. This signal initialises all the counters of the circuit. In the next clock period the image data is loaded pixel by pixel, at a rate nine times slower than the rate of clk1. The unit that generates the $3 \times 3$-pixel image windows is controlled by clock clk2, which is derived from clk1 by dividing it by nine. The timing diagram of this procedure is shown in Figure 7. This unit consists of an array of registers, i.e. 8 lines of 32 cascaded 8-bit registers and another 9 registers, which can accommodate a $9 \times 9$-pixel image window from the $32 \times 32$-pixel stored image. Image data flows from each register to the next. As in the case of the structuring element, a multiplexer divides the $9 \times 9$-pixel image window into nine $3 \times 3$-pixel windows. The counter that controls the selection input of the multiplexer is an up counter, controlled by clock clk1. In this way, in one clk2 period, nine $3 \times 3$-pixel image data windows become available to the dilation/erosion unit.
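The register array can be modelled as a single shift register of $8 \times 32 + 9 = 265$ stages. The Python sketch below is a behavioural model under assumptions, not the circuit itself: the class name `WindowGenerator` is hypothetical, pixels are assumed to arrive in raster order, and the window is taken to be the $9 \times 9$ block whose bottom-right pixel is the most recent one.

```python
from collections import deque

IMG_W = 32                          # image width in pixels
WIN = 9                             # window side in pixels
DEPTH = (WIN - 1) * IMG_W + WIN     # 8 lines of 32 registers plus 9 registers


class WindowGenerator:
    """Shift-register model: every new pixel moves the others one stage along."""

    def __init__(self):
        self.regs = deque([0] * DEPTH, maxlen=DEPTH)

    def push(self, pixel):
        # regs[0] holds the newest pixel; the oldest one drops off the far end.
        self.regs.appendleft(pixel)

    def window(self):
        """9 x 9 window whose bottom-right pixel is the newest pixel."""
        return [[self.regs[(WIN - 1 - i) * IMG_W + (WIN - 1 - j)]
                 for j in range(WIN)]
                for i in range(WIN)]


gen = WindowGenerator()
img = [[(r * IMG_W + c) % 256 for c in range(IMG_W)] for r in range(IMG_W)]
for r in range(9):
    for c in range(IMG_W):
        gen.push(img[r][c])
        if r == 8 and c == 8:
            first_window = gen.window()  # covers image rows 0..8, columns 0..8
```

In this model one call to `push` corresponds to one clk2 period; selecting the nine $3 \times 3$ sub-windows within that period would be the job of the clk1-driven counter.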

The operations of dilation/erosion in $3 \times 3$-pixel windows are efficiently performed through the hardware module reported in [3]. The inputs to the dilation/erosion unit are: a $3 \times 3$-pixel image data window, the corresponding $3 \times 3$-pixel structuring element, 9 Mask signals (one for each pixel of the structuring element), signal Mode, and clock clk1. In this unit the required additions/subtractions of the image pixels with the appropriate structuring element pixels are executed. The results of these additions/subtractions are clipped when they fall outside the range [0..255]. They are then fed into a max/min unit, which finds the max or min value, depending on signal Mode. To find the max or min value, flag signals, one for each addition/subtraction result, are utilised. These flags are used to reject, in successive stages, the numbers that are not candidates to be the max/min. Thus, in the first stage of this procedure the Mask signals are applied as inputs to the flag signals, in order to reject the numbers that should not be taken into account. The output of the max/min unit, which is the result of the dilation/erosion unit, is stored in 9 cascaded 8-bit registers, triggered by clk1. The outputs of these registers are the inputs to a second max/min unit, controlled by clock clk2 (Figure 2). Thus, in a clk2 period 9 local max/min results are provided simultaneously to this max/min unit.
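The data path of this unit can be summarised in a short Python sketch. It is a functional model only: Python's built-in `max` and `min` stand in for the flag-based rejection scheme described above, signal Mode is encoded as a string, and returning `None` when every pixel is masked out is an assumption, since the text does not specify that case. The function names are hypothetical, and the image window is assumed to arrive already paired pixel by pixel with the structuring element.

```python
def clip(v):
    """Clip an addition/subtraction result to the 8-bit range [0..255]."""
    return max(0, min(255, v))


def local_unit(window, se, mask, mode):
    """One 3 x 3 dilation/erosion step on flattened 9-element inputs.

    mode is 'dilation' or 'erosion' (hypothetical encoding of signal Mode).
    Results whose Mask bit is 0 are rejected before the max/min search.
    """
    if mode == "dilation":
        results = [clip(p + s) for p, s in zip(window, se)]
    else:
        results = [clip(p - s) for p, s in zip(window, se)]
    candidates = [r for r, m in zip(results, mask) if m]
    if not candidates:
        return None  # assumed behaviour when all nine pixels are masked
    return max(candidates) if mode == "dilation" else min(candidates)


def global_unit(local_results, mode):
    """Second max/min stage: combine the nine local results."""
    valid = [r for r in local_results if r is not None]
    if not valid:
        return None
    return max(valid) if mode == "dilation" else min(valid)


window = [10, 200, 30, 40, 50, 60, 70, 80, 90]
se = [5, 100, 5, 5, 5, 5, 5, 5, 5]
all_on = [1] * 9
print(local_unit(window, se, all_on, "dilation"),
      local_unit(window, se, all_on, "erosion"))
```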

For a DLM, $0.7\mu m$, CMOS, N-well technology process, the dimensions of the core of the chip are 5.5 mm $\times$ 5.5 mm = $30.2\ mm^2$. Its maximum frequency of operation (worst case) is approximately 90 MHz and the throughput rate of the proposed ASIC is 10 Mbytes/sec. The functionality of the ASIC was tested with several randomly generated inputs, and the results obtained from the ASIC were compared with the expected ones. No errors were reported during this process.
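The quoted clock frequency and throughput are mutually consistent under one assumption that the text does not state explicitly, namely that one 8-bit result is produced in every clk2 period. As a rough check:

$$f_{clk2} = \frac{f_{clk1}}{9} \approx \frac{90\ \text{MHz}}{9} = 10\ \text{MHz} \quad\Rightarrow\quad \text{throughput} \approx 10\ \text{MHz} \times 1\ \text{byte} = 10\ \text{Mbytes/sec}$$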

### 4. Conclusions

A new architecture for the decomposition of arbitrarily shaped grey-scale morphological structuring elements of up to $9 \times 9$ pixels has been presented in this paper. The structure has also been implemented in VLSI. Its maximum frequency of operation (worst case) is approximately 90 MHz and its throughput rate is 10 Mbytes/sec. The ASIC is intended for time-critical computer vision applications, such as textural segmentation and granulometry, where successively larger structuring elements are required.

### References

[1] O.I. Camps, T. Kanungo and R.M. Haralick. Grey-scale structuring element decomposition. *IEEE Trans. Image Proc.*, IMP-5, 111-120, 1996.

[2] P.D. Gader. Separable decompositions and approximations of grey-scale morphological templates. *CVGIP: Image Understanding*, 53, 288-296, 1991.

[3] A. Gasteratos, I. Andreadis and Ph. Tsalides. Extension and VLSI implementation of the majority-gate algorithm for gray-scale morphological operations. *Optical Engineering*, 36, 857-861, 1997.

[4] H. Park and R.T. Chin. Decomposition of arbitrarily shaped morphological structuring elements. *IEEE Trans. Pattern Analysis & Machine Intelligence*, PAMI-17, 2-15, 1995.

[5] J. Serra. Image Analysis and Mathematical Morphology-Vol. I. Academic Press, London, 1982.

[6] F.Y. Shih and O.R. Mitchell. Decomposition of grey-scale morphological structuring elements. *Pattern Recognition*, 24, 195-203, 1991.

[7] X. Zhuang and R.M. Haralick. Morphological structuring element decomposition. *Computer Vision, Graphics, Image Proc.*, 35, 370-382, 1986.