Iaetsd degraded document image enhancing in




Degraded Document Image Enhancing in Spatial Domain using Adaptive Contrasting and Thresholding

Dr. P V Ramaraju, Professor, Department of ECE, SRKR Engineering College, Bhimavaram, India. pvrraju50@gmail.com

G. Nagaraju, Asst. Professor, Department of ECE, SRKR Engineering College, Bhimavaram, India. bhanu.raj.nikhil@gmail.com

V. Rajasekhar, M.Tech Student, Department of ECE, SRKR Engineering College, Bhimavaram, India. vrm.rajasekhar@gmail.com

Abstract: This paper presents a new adaptive approach for the binarization and enhancement of degraded document images. The proposed method can deal with degradations caused by shadows, non-uniform illumination, low contrast, ink bleed-through, smear and stain. The proposed technique follows several distinct steps. An adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized by using a local threshold that is estimated from the intensities of detected text stroke edge pixels within a local window. Some post-processing is further applied to improve the document binarization quality. The proposed method is simple, robust, and involves minimal parameter tuning.

Keywords: Adaptive contrast map, document image enhancing, adaptive thresholding, degraded document image binarization, pixel classification.

I. INTRODUCTION

Robust binarization makes it possible to correctly extract a sketched line drawing or text from its background. Many algorithms have been implemented for the binarization of images. Thresholding is a sufficiently accurate and fast segmentation approach for monochrome images. This paper describes a modified logical thresholding method for the binarization of seriously degraded, very poor quality gray-scale document images.

In general, there are two types of image thresholding techniques: global and local. In global thresholding, a gray-level image is converted into a binary image using a single intensity value, called the global threshold, which is fixed over the whole image domain, whereas in local thresholding the threshold value can vary from one pixel location to the next. Thus, global thresholding converts an input image I to a binary image G as follows: G(i, j) = 1 for I(i, j) ≥ T, and G(i, j) = 0 for I(i, j) < T, where T is the threshold, G(i, j) = 1 denotes foreground and G(i, j) = 0 denotes background. For a local threshold, in contrast, T is a function over the image domain, i.e., T = T(x, y). In addition, if the algorithm adapts itself to the image intensity values when constructing the threshold value or surface, it is called a dynamic or adaptive threshold. In a general setting, thresholding can be expressed as a test against a function T of the form [1]: T = T[x, y, h, I], where I is the input image and h denotes some local property of the point (x, y), for example the average gray level of a neighborhood centered on (x, y).

Threshold selection depends on the information available in the gray-level histogram of the image. Based on a simple image formation model, an image function I(x, y) can be expressed as the product of a reflectance function and an illumination function. If the illumination component of the image is uniform, the gray-level histogram of the image is clearly bimodal, because the gray levels of object pixels differ significantly from the gray levels of the background: one mode is populated by object pixels and the other by background pixels. Objects can then be partitioned easily by placing a single global threshold at the neck or valley of the histogram. In reality, however, bimodal histograms do not occur very often. Consequently, a fixed threshold level based on the gray-level histogram fails to separate objects from the background. In this scenario we turn our attention to an adaptive local threshold surface, where the threshold value changes over the image domain to fit the spatially changing background and lighting conditions.

Over the years many threshold selection methods have been proposed. Otsu suggested a global image thresholding technique in which the optimal global threshold value is found by maximizing the between-class variance with an exhaustive search [2]. Sahoo et al. [3] claim that Otsu’s method is suitable for real-world applications with regard to uniformity and shape measures. Though Otsu’s method is one of the most popular methods for global thresholding, it does not work well for many real-world images in which the gray-level histograms of the objects and the background overlap significantly because of uneven and poor illumination. As many degraded documents do not have a clear bimodal pattern, global thresholding [4]–[7] is usually not a suitable approach for degraded document binarization. Adaptive thresholding [8]–[14], which estimates a local threshold for each document image pixel, is often a better approach for dealing with the variations within degraded document images. For example, the early window-based adaptive thresholding techniques [12], [13] estimate the local threshold by using the mean and the standard deviation of image pixels within a local neighborhood window.

The local image contrast and the local image gradient are very useful features for segmenting the text from the document background, because the document text usually has a certain image contrast to the neighboring document background. The image gradient is defined as follows

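$$G(x, y) = f_{max}(x, y) - f_{min}(x, y) \qquad (1)$$

The local contrast is defined as follows

$$D(x, y) = \frac{f_{max}(x, y) - f_{min}(x, y)}{f_{max}(x, y) + f_{min}(x, y) + \varepsilon} \qquad (2)$$

where fmax(x, y) and fmin(x, y) denote the maximum and minimum intensities within a local neighborhood window of (x, y), and ε is a very small positive number that is added in case the local maximum is equal to 0.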
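The difference between the local gradient in Equation 1 and the local contrast in Equation 2 can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the authors' implementation: the image is a plain list of rows of 8-bit gray levels, the window is a square of half-width `radius` clipped at the image border, and the names `local_max_min`, `local_gradient` and `local_contrast` are hypothetical.

```python
def local_max_min(img, x, y, radius=1):
    """Return (f_max, f_min) over the square window centred on row y, column x.

    The window is clipped at the image border (an assumption; the paper does
    not say how borders are handled).
    """
    h, w = len(img), len(img[0])
    values = [
        img[j][i]
        for j in range(max(0, y - radius), min(h, y + radius + 1))
        for i in range(max(0, x - radius), min(w, x + radius + 1))
    ]
    return max(values), min(values)


def local_gradient(img, x, y, radius=1):
    """Equation 1: G = f_max - f_min."""
    f_max, f_min = local_max_min(img, x, y, radius)
    return f_max - f_min


def local_contrast(img, x, y, radius=1, eps=1e-6):
    """Equation 2: D = (f_max - f_min) / (f_max + f_min + eps)."""
    f_max, f_min = local_max_min(img, x, y, radius)
    return (f_max - f_min) / (f_max + f_min + eps)


# The same 40-level intensity step on a bright and on a dark background.
bright = [[200, 200, 160] for _ in range(3)]
dark = [[60, 60, 20] for _ in range(3)]

print(local_gradient(bright, 1, 1), local_gradient(dark, 1, 1))
print(local_contrast(bright, 1, 1), local_contrast(dark, 1, 1))
```

Both patches give the same gradient (40), while the normalized contrast is roughly 0.11 for the bright patch and roughly 0.5 for the dark one. The normalization compensates for background brightness, but it also suppresses edges whose surrounding pixels are bright.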

II. PROPOSED METHOD

This section describes the proposed document image binarization technique. Given a degraded document image, an adaptive contrast map is first constructed. The text is then segmented based on the local threshold that is estimated from the detected text stroke edge pixels. Some post-processing is further applied to improve the document binarization quality.

Fig. 1

A. Contrast Image Construction

The image contrast in Equation 2 has one typical limitation: it may not handle document images with bright text properly. This is because a weak contrast will be calculated for stroke edges of the bright text, where the denominator in Equation 2 will be large but the numerator will be small. To overcome this over-normalization problem, we combine the local image contrast with the local image gradient and derive an adaptive local image contrast as follows

$$D_a(x, y) = \alpha D(x, y) + (1 - \alpha)\,(f_{max}(x, y) - f_{min}(x, y)) \qquad (3)$$

where D(x, y) denotes the local contrast in Equation 2 and (fmax(x, y) − fmin(x, y)) refers to the local image gradient, normalized to [0, 1]. The local window size is set to 3 empirically. α is the weight between local contrast and local gradient, and it is controlled by the statistical information of the document image. Ideally, the image contrast is assigned a high weight (i.e., large α) when the document image has significant intensity variation, so that the proposed binarization technique depends more on the local image contrast, which captures the intensity variation well and hence produces good results. Otherwise, the local image gradient is assigned a high weight. We model the mapping from document image intensity variation to α by a power function as follows

$$\alpha = \left(\frac{Std}{128}\right)^{\gamma} \qquad (4)$$

where Std denotes the standard deviation of the document image intensity and γ is a pre-defined parameter. The power function has the nice property that it increases monotonically and smoothly from 0 to 1, and its shape can be easily controlled by γ. γ can be selected from [0, ∞), and the power function becomes a linear function when γ = 1. Therefore, the local image gradient plays the major role in Equation 3 when γ is large, and the local image contrast plays the major role when γ is small. The setting of the parameter γ is discussed in the section on parameter selection.
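As a rough illustration of the adaptive contrast idea in this subsection, the sketch below blends the normalized local contrast with the local gradient using a weight derived from the global intensity standard deviation. It is a minimal sketch under several assumptions that the paper does not state: pixel values are 8-bit, the gradient is normalized to [0, 1] by dividing by 255, the window is clipped at the image border, and the function names `alpha_weight` and `adaptive_contrast` are hypothetical.

```python
from statistics import pstdev


def alpha_weight(img, gamma=1.0):
    """Weight alpha = (Std / 128) ** gamma, with Std the global intensity std."""
    pixels = [p for row in img for p in row]
    return (pstdev(pixels) / 128.0) ** gamma


def adaptive_contrast(img, gamma=1.0, radius=1, eps=1e-6, max_level=255.0):
    """Blend local contrast and normalized local gradient for every pixel.

    radius=1 gives the 3x3 window mentioned in the text. Dividing the
    gradient by max_level is an assumed way to normalize it to [0, 1].
    """
    h, w = len(img), len(img[0])
    alpha = alpha_weight(img, gamma)
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            f_max, f_min = max(window), min(window)
            contrast = (f_max - f_min) / (f_max + f_min + eps)
            gradient = (f_max - f_min) / max_level
            out[y][x] = alpha * contrast + (1.0 - alpha) * gradient
    return out


# A small patch with a 40-level step on a bright background.
sample = [[200, 200, 160] for _ in range(3)]
for gamma in (0.5, 1.0, 2.0):
    print(gamma, alpha_weight(sample, gamma), adaptive_contrast(sample, gamma)[1][1])
```

Because Std / 128 is below 1 for 8-bit images, the weight shrinks as γ grows: γ = 0 gives α = 1, which reduces the blend to the local contrast alone, while a very large γ drives α towards 0 and leaves only the normalized gradient.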





Fig. 2

Fig. 3. Contrast images constructed using (a) local image gradient, (b) local image contrast [15], and (c) our proposed method for the original sample document images shown in Fig. 1 and 2, respectively.

Fig. 3 shows the contrast maps of the sample document images in Fig. 1 and 2 that are created by using the local image gradient, the local image contrast [15] and our proposed method in Equation 3, respectively.

B. Local Threshold Estimation

The text can be extracted from the document background pixels once the high contrast stroke edge pixels are detected properly. Two characteristics can be observed in different kinds of document images [15]. First, the text pixels are close to the detected text stroke edge pixels. Second, there is a distinct intensity difference between the high contrast stroke edge pixels and the surrounding background pixels. The document image text can thus be extracted based on the detected text stroke edge pixels as follows

$$R(x, y) = \begin{cases} 1, & N_e \ge N_{min} \ \text{and} \ I(x, y) \le E_{mean} + E_{std}/2 \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$

where Ne is the number of detected high contrast pixels within the neighborhood window W, Nmin is the minimum number of such pixels required, and Emean and Estd are the mean and the standard deviation of the image intensity of the detected high contrast image pixels (within the original document image) within the neighborhood window. They can be evaluated as follows

$$E_{mean} = \frac{\sum_{W} I(x, y)\,(1 - E(x, y))}{N_e}$$

$$E_{std} = \sqrt{\frac{\sum_{W} \big( (I(x, y) - E_{mean})\,(1 - E(x, y)) \big)^2}{N_e}}$$

where the factor (1 − E(x, y)) is 1 at detected high contrast pixels and 0 elsewhere. The size of the neighborhood window W can be set based on the stroke width of the document image under study.
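The decision rule in Equation 5 can be sketched as follows. This is only an illustrative sketch, not the authors' code. It assumes the stroke edge pixels have already been detected and are passed in as a mask in which 1 marks a detected high contrast pixel (so the mask plays the role of the selecting factor in the sums), it normalizes the standard deviation by the number of selected pixels, and the window half-width `radius`, the default `n_min` and the name `binarize_local` are hypothetical choices.

```python
from math import sqrt


def binarize_local(img, edge, radius=2, n_min=3):
    """Label a pixel as text (1) when enough edge pixels are nearby and the
    pixel is no brighter than E_mean + E_std / 2 of those edge pixels."""
    h, w = len(img), len(img[0])
    result = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Intensities of detected edge pixels inside the local window.
            vals = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
                if edge[j][i]
            ]
            n_e = len(vals)
            if n_e < n_min:
                continue  # too few edge pixels nearby: leave as background
            e_mean = sum(vals) / n_e
            # Normalizing by n_e is an assumption of this sketch.
            e_std = sqrt(sum((v - e_mean) ** 2 for v in vals) / n_e)
            if img[y][x] <= e_mean + e_std / 2:
                result[y][x] = 1
    return result


# A dark vertical stroke (column 2) on a bright background.
stroke_img = [[220, 220, 30, 220, 220] for _ in range(5)]
edge_mask = [[0, 0, 1, 0, 0] for _ in range(5)]
for row in binarize_local(stroke_img, edge_mask):
    print(row)
```

In this toy example only the stroke column is labeled as text, because the bright background pixels exceed the local threshold derived from the edge pixels. With an empty edge mask no pixel reaches `n_min`, so the whole image is labeled as background.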

C. Post-Processing

Once the initial binarization result is derived from Equation 5 as described in the previous subsections, it can be further improved by incorporating certain domain knowledge, as described in Algorithm 1. First, the isolated foreground pixels that do not connect with other foreground pixels are filtered out to make the edge pixel set precise. Second, the two pixels of a neighborhood pixel pair that lies on symmetric sides of a text stroke edge pixel should belong to different classes (i.e., one to the document background and the other to the foreground text). One pixel of the pair is therefore relabeled to the other category if both pixels belong to the same class. Finally, some single-pixel artifacts along the text stroke boundaries are filtered out by using several logical operators as described in [16].

Algorithm 1 Post-Processing Procedure
Require: The input grayscale document image ‘I’, the initial binary result ‘B’ and the corresponding binary text stroke edge image ‘Edge’
Ensure: The final binary result ‘Bf’
1: Obtain the connected components of the stroke edge pixels in ‘Edge’.
2: Remove those pixels that do not connect with other pixels.
3: Removing isolated pixels requires a connectivity check.
4: for each remaining edge pixel (i, j) do
5: Get its neighborhood pairs: (i − 1, j) & (i + 1, j); (i, j − 1) & (i, j + 1)
6: if the pixels in a pair belong to the same class (both text or both background) then
7: Classify the two pixels as foreground and background based on their pixel values.
8: end if
9: end for
10: Remove single-pixel artifacts [16] along the text stroke boundaries after the document thresholding.
11: Store the new binary result in ‘Bf’.
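The first two post-processing ideas can be sketched in a few lines. This is a hedged illustration rather than the authors' procedure. It assumes dark text on a brighter background, so that within a same-class pair the darker pixel is relabeled as text and the brighter one as background; it treats an edge pixel as isolated when none of its 8 neighbors is an edge pixel; it leaves pairs with equal intensities unchanged; and it omits the single-pixel artifact removal of [16]. The names `remove_isolated` and `post_process` are hypothetical.

```python
def remove_isolated(edge):
    """Drop edge pixels that have no edge pixel among their 8 neighbours."""
    h, w = len(edge), len(edge[0])
    cleaned = [row[:] for row in edge]
    for i in range(h):
        for j in range(w):
            if not edge[i][j]:
                continue
            has_neighbor = any(
                edge[a][b]
                for a in range(max(0, i - 1), min(h, i + 2))
                for b in range(max(0, j - 1), min(w, j + 2))
                if (a, b) != (i, j)
            )
            if not has_neighbor:
                cleaned[i][j] = 0
    return cleaned


def post_process(img, binary, edge):
    """Relabel same-class pixel pairs that straddle a stroke edge pixel."""
    h, w = len(img), len(img[0])
    edge = remove_isolated(edge)
    out = [row[:] for row in binary]
    for i in range(h):
        for j in range(w):
            if not edge[i][j]:
                continue
            pairs = (((i - 1, j), (i + 1, j)), ((i, j - 1), (i, j + 1)))
            for (a, b), (c, d) in pairs:
                if not (0 <= a < h and 0 <= c < h and 0 <= b < w and 0 <= d < w):
                    continue  # pair falls outside the image
                if out[a][b] == out[c][d] and img[a][b] != img[c][d]:
                    # Assumption: the darker pixel of the pair is text.
                    first_is_darker = img[a][b] < img[c][d]
                    out[a][b] = 1 if first_is_darker else 0
                    out[c][d] = 0 if first_is_darker else 1
    return out


gray = [[220, 120, 30] for _ in range(3)]
initial = [[0, 1, 1], [1, 1, 1], [0, 1, 1]]   # (1, 0) wrongly marked as text
edges = [[0, 1, 0] for _ in range(3)]
for row in post_process(gray, initial, edges):
    print(row)
```

In this example the bright pixel at row 1, column 0 was initially labeled as text. Because it shares a class with the dark pixel on the opposite side of the edge pixel at row 1, column 1, it is relabeled as background.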





D. Parameter Selection

In the first experiment, we apply different values of γ to obtain different power functions and test their performance. α is close to 1 when γ is small, and the local image contrast D then dominates the adaptive image contrast Da in Equation 3. On the other hand, Da is mainly influenced by the local image gradient when γ is large. At the same time, the variation of α across different document images increases when γ is close to 1. Under such circumstances, the power function becomes more sensitive to the global image intensity variation, and appropriate weights can be assigned to images with different characteristics. The proposed method can therefore assign a more suitable α to different images when γ is closer to 1. Parameter γ should be set around 1, where the adaptability of the proposed technique is maximized and better, more robust binarization results can be derived from different kinds of degraded document images.

III. RESULTS

This section evaluates the results of the proposed document image binarization technique. Given a degraded document image, an adaptive contrast map is first constructed. The text is then segmented based on the local threshold that is estimated from the detected text stroke edge pixels. Some post-processing is further applied to improve the document binarization quality.

Example 1

Fig. 4. Input degraded document image with ink bleed-through effect.

Fig. 5. Contrast image constructed using the proposed adaptive local contrast map.

Fig. 6. Binarized image produced by the proposed local thresholding and post-processing.

Example 2

Fig. 7. Input degraded document image with ink bleed-through effect.

Fig. 8. Contrast image constructed using the proposed adaptive local contrast map.

Fig. 9. Binarized image produced by the proposed local thresholding and post-processing.

In this experiment, we quantitatively compare our proposed method with Otsu’s method (OTSU) [2], Sauvola’s method (SAUV) [12], Niblack’s method (NIBL) [13], Bernsen’s method (BERN) [8], Gatos et al.’s method (GATO) [17], and the LMM methods (LMM [15], BE [16]). The test data are composed of document images that suffer from several common document degradations such as smear, smudge, bleed-through and low contrast.

Example 1

Fig. 10. Binarization results of the sample document image in Fig. 1(b) produced by different methods. (a) OTSU [2]. (b) SAUV [12]. (c) NIBL [13]. (d) BERN [8]. (e) GATO [17]. (f) LMM [15]. (g) BE [16]. (h) Proposed.

Example 2

Fig. 11. Binarization results of the sample document image (PR 06) in DIBCO 2011 dataset produced by different methods. (a) Input Image. (b) OTSU [2]. (c) SAUV [12]. (d) NIBL [13]. (e) BERN [8]. (f) GATO [17]. (g) LMM [15]. (h) BE [16]. (i) LELO [18]. (j) SNUS. (k) HOWE [19]. (l) Proposed.

IV. CONCLUSION

This paper presents a simple and robust method for enhancing degraded document images. The proposed method performs a binarization that is tolerant to different types of document degradation such as non-uniform illumination, ink bleed-through and document smear. The binarization is based on local thresholding combined with adaptive contrast mapping. The proposed method has been tested on various noise-affected document images and binarized them effectively. However, we observed that the performance on the Bickley diary dataset needs to be improved, which we will explore in future work.

REFERENCES

[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Pearson Prentice Hall, 2005.
[2] N. Otsu, “A threshold selection method from gray-level histogram,” IEEE Trans. Syst., Man, Cybern., vol. SMC-9, no. 1, pp. 62–66, 1979.
[3] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. Chen, “A survey of thresholding techniques,” Comput. Vis. Graph. Image Process., vol. 41, pp. 233–260, 1988.
[4] A. Brink, “Thresholding of digital images using two-dimensional entropies,” Pattern Recognit., vol. 25, no. 8, pp. 803–808, 1992.
[5] J. Kittler and J. Illingworth, “On threshold selection using clustering criteria,” IEEE Trans. Syst., Man, Cybern., vol. 15, no. 5, pp. 652–655, Sep.–Oct. 1985.
[6] N. Otsu, “A threshold selection method from gray level histogram,” IEEE Trans. Syst., Man, Cybern., vol. 9, no. 1, pp. 62–66, Jan. 1979.
[7] N. Papamarkos and B. Gatos, “A new approach for multithreshold selection,” Comput. Vis. Graph. Image Process., vol. 56, no. 5, pp. 357–370, 1994.
[8] J. Bernsen, “Dynamic thresholding of gray-level images,” in Proc. Int. Conf. Pattern Recognit., Oct. 1986, pp. 1251–1255.
[9] L. Eikvil, T. Taxt, and K. Moen, “A fast adaptive method for binarization of document images,” in Proc. Int. Conf. Document Anal. Recognit., Sep. 1991, pp. 435–443.
[10] I.-K. Kim, D.-W. Jung, and R.-H. Park, “Document image binarization based on topographic analysis using a water flow model,” Pattern Recognit., vol. 35, no. 1, pp. 265–277, 2002.
[11] J. Parker, C. Jennings, and A. Salkauskas, “Thresholding using an illumination model,” in Proc. Int. Conf. Doc. Anal. Recognit., Oct. 1993, pp. 270–273.
[12] J. Sauvola and M. Pietikainen, “Adaptive document image binarization,” Pattern Recognit., vol. 33, no. 2, pp. 225–236, 2000.
[13] W. Niblack, An Introduction to Digital Image Processing. Englewood Cliffs, NJ: Prentice-Hall, 1986.
[14] J.-D. Yang, Y.-S. Chen, and W.-H. Hsu, “Adaptive thresholding algorithm and its hardware implementation,” Pattern Recognit. Lett., vol. 15, no. 2, pp. 141–150, 1994.
[15] B. Su, S. Lu, and C. L. Tan, “Binarization of historical handwritten document images using local maximum and minimum filter,” in Proc. Int. Workshop Document Anal. Syst., Jun. 2010, pp. 159–166.
[16] S. Lu, B. Su, and C. L. Tan, “Document image binarization using background estimation and stroke edges,” Int. J. Document Anal. Recognit., vol. 13, no. 4, pp. 303–314, Dec. 2010.
[17] B. Gatos, I. Pratikakis, and S. Perantonis, “Adaptive degraded document image binarization,” Pattern Recognit., vol. 39, no. 3, pp. 317–327, 2006.
[18] T. Lelore and F. Bouchara, “Super resolved binarization of text based on the fair algorithm,” in Proc. Int. Conf. Document Anal. Recognit., Sep. 2011, pp. 839–843.
[19] N. Howe, “A Laplacian energy for document binarization,” in Proc. Int. Conf. Doc. Anal. Recognit., Sep. 2011, pp. 6–10.

G. Nagaraju is presently working as an Assistant Professor in the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, Bhimavaram, AP. He received the B.Tech degree from S.R.K.R. Engineering College, Bhimavaram, in 2002, and the M.Tech degree with specialization in Computer Electronics from Govt. College of Engineering, Pune University, in 2004. His current research interests include image processing, digital security systems, biomedical signal processing, signal processing, and VLSI design.

V. Rajasekhar received the B.Tech degree in Electronics and Communication Engineering from Sri Vasavi Engineering College, Tadepalligudem, A.P, India, in 2011. He is currently pursuing the M.Tech degree in Communication Systems at S.R.K.R. Engineering College, Bhimavaram.

Dr. P. V. Rama Raju is working as a Professor in the Department of Electronics and Communication Engineering, S.R.K.R. Engineering College, AP, India. His research interests include biomedical signal processing, signal processing, VLSI design and microwave anechoic chamber design. He is the author of several research studies published in national and international journals and conference proceedings.


