International Journal of Mathematical, Engineering and Management Sciences

ISSN: 2455-7749. Open Access


CNN-Based Optical Character Recognition for Isolated Printed Gujarati Characters and Handwritten Numerals

Sanket B. Suthar
Department of Information Technology, Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology & Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India.

Amit R. Thakkar
Department of Computer Science and Engineering, Chandubhai S. Patel Institute of Technology (CSPIT), Faculty of Technology & Engineering (FTE), Charotar University of Science and Technology (CHARUSAT), Changa, Gujarat, India.

DOI https://doi.org/10.33889/IJMEMS.2022.7.5.042

Received on March 30, 2022; Accepted on June 28, 2022

Abstract

Optical character recognition (OCR) technologies have made significant progress in the field of language recognition. Gujarati is more difficult to recognize than many other languages because of its curves, closed loops, modifiers, and joint characters, so considerable effort in the literature has been devoted to Gujarati OCR. Deep learning-based Convolutional Neural Network (CNN) models have recently been applied to develop OCR for different languages, but they do not yet give satisfactory performance in recognizing Gujarati characters. This paper therefore proposes novel CNN models for recognizing printed Gujarati characters and numerals. Two optimally configured CNNs are presented: CNN-PGC (CNN for Printed Gujarati Character) for printed Gujarati base characters and CNN-HGC (CNN for Handwritten Gujarati Character) for handwritten numerals. The performance of the proposed work is evaluated on specific performance indicators and validated against that of other traditional models and the latest baseline methods. Experimental analysis has been carried out on a newly generated, well-segmented dataset of Gujarati base characters and numerals, which includes 36 consonants, 13 vowels, and 10 handwritten numerals. Variation in the database, such as size, skew, noise, blur, etc., is also taken into consideration during the experiments. Even in the presence of printing irregularities, writing irregularities, and degradations, the proposed method achieves a 98.08% recognition rate for printed characters and a 95.24% recognition rate for handwritten numerals, which is better than other existing models.

Keywords- Optical character recognition (OCR); Convolutional neural network (CNN); Recognition; Gujarati characters and symbols; Handwritten numerals; Printed character classification.

Citation

Suthar, S. B. & Thakkar, A. R. (2022). CNN-Based Optical Character Recognition for Isolated Printed Gujarati Characters and Handwritten Numerals. International Journal of Mathematical, Engineering and Management Sciences, 7(5), 643-655. https://doi.org/10.33889/IJMEMS.2022.7.5.042.