Binary Image Analysis Technique for Preprocessing of Excessively Dilated Characters in Aged Kannada Document Images
Sridevi T.N1, Lalitha Rangarajan2
1Sridevi T. N*, Department of studies in Computer Science, University of Mysore, Mysuru, India.
2Lalitha Rangarajan, Department of studies in Computer Science, University of Mysore, Mysuru, India.
Manuscript received on November 20, 2019. | Revised Manuscript received on November 28, 2019. | Manuscript published on 30 November, 2019. | PP: 6660-6669 | Volume-8 Issue-4, November 2019. | Retrieval Number: D9101118419/2019©BEIESP | DOI: 10.35940/ijrteD9101.118419
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: In this paper, a variety of document image enhancement techniques are applied to remove background noise from degraded document images. The noise removal techniques are applied to different forms of noise, including non-uniform illumination, complex stain marks, user annotations, the show-through effect and the foxing effect. In this work, a Binary Image Analysis (BIA) technique is proposed for removing aging degradation in ancient document images of Kannada literature. The method involves multiple phases: contrast enhancement, Gaussian smoothing, binarization, morphological processing, object detection using connected component analysis, and filtering, followed by marginal noise removal of non-textual regions. The document samples used for experimentation comprise more than 175 aged and highly degraded scanned documents of old Kannada literature and poetry that are heavily affected by noise, and 25 images from the DIBCO datasets collected from 2009 to 2017. The results are satisfactory and suitable for processing the document images in the subsequent stages of OCR. The results are compared with some widely used approaches such as Sauvola, Otsu and Gaussian. It is observed that the proposed method outperforms the other noise removal methods in terms of character retention for extensively degraded and aged document images.
Keywords: Aged Documents, Degradation Removal, Ancient Kannada Documents, Binarization, Connected Component, Illumination Correction.
Scope of the Article: Component-Based Software Engineering.
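Two of the phases listed in the abstract, binarization and connected component filtering, can be illustrated with a minimal sketch. This is not the authors' implementation: Otsu thresholding (named in the abstract only as a comparison baseline) stands in for the binarization step, the function names `otsu_threshold`, `binarize` and `filter_small_components` are illustrative, and the `min_area` cutoff is a hypothetical parameter, since the abstract does not state the filtering criteria. The image is assumed to be a grayscale grid of 0..255 values with dark ink on a light background.

```python
from collections import deque


def otsu_threshold(gray):
    """Return the Otsu threshold for a 2-D list of 0..255 intensities."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * count for i, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is at or below t
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, threshold):
    """Mark pixels at or below the threshold as ink (1), others as 0."""
    return [[1 if v <= threshold else 0 for v in row] for row in gray]


def filter_small_components(binary, min_area):
    """Keep only 8-connected ink components with at least min_area pixels.

    min_area is a hypothetical cutoff chosen for illustration.
    """
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search collects one connected component.
            seen[y][x] = True
            queue = deque([(y, x)])
            component = []
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1
                                and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(component) >= min_area:
                for py, px in component:
                    out[py][px] = 1
    return out


if __name__ == "__main__":
    # Toy page: a 2x2 stroke plus one isolated dark speck.
    page = [
        [200, 200, 200, 200, 200, 200],
        [200, 30, 30, 200, 200, 200],
        [200, 30, 30, 200, 200, 40],
        [200, 200, 200, 200, 200, 200],
    ]
    ink = binarize(page, otsu_threshold(page))
    for row in filter_small_components(ink, min_area=2):
        print(row)
```

In this toy run the isolated speck is discarded as a one-pixel component while the stroke is retained. A real degraded page would need the earlier contrast enhancement and smoothing phases as well, and a size criterion tuned to the script's character dimensions.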