Kinga Pilch
supervisor: Maria Ganzha
Optical character recognition (OCR) is one of the oldest branches of pattern recognition. It refers to the conversion of handwritten letters, or symbols in general, into machine-encoded text.
Much work has been done on OCR for Western scripts and for the scripts of widely used languages such as Chinese, Japanese or the major languages of India. However, the main language of northeastern India has still not been examined in depth.
Assamese is spoken by over 14 million people. There have been some earlier attempts at recognizing its characters, but the researchers achieved an accuracy of only up to 90%. A major limitation of the neural networks used in that work is their inability to capture spatial features in images. This problem can be overcome by using convolutional neural networks (CNNs). CNNs eliminate the need for manual feature extraction, as they learn features directly during training. Since no standard image data set for Assamese characters was available, the co-authors collected and generated a set of over 12,000 images of handwritten samples.
The aim is to create the best possible solution for Assamese character recognition. So far, the dataset has been created and the images have been preprocessed using methods such as Otsu binarization, normalization and image smoothing. Moreover, many models have been tested, both designed by myself and available online (such as DenseNet201). The best result obtained so far is about 94%, but there are many areas where this score can be improved, for example through meta-models or transfer learning.
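
To illustrate the binarization step mentioned above, here is a minimal sketch of Otsu's method in plain Python. It is not the code used in this work: the function names `otsu_threshold` and `binarize`, the 8-bit grayscale input, and the assumption that ink is darker than the paper are all illustrative choices. The threshold is the gray level that maximises the between-class variance of the two resulting pixel groups.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold t for 8-bit gray values (0..255).

    Pixels <= t form the low class and pixels > t the high class; t is the
    level that maximises the between-class variance.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the low class
    sum0 = 0  # sum of gray values in the low class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue      # low class still empty
        w1 = total - w0
        if w1 == 0:
            break         # high class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, ink_is_dark=True):
    """Map a 2-D list of gray values to 1 (ink) and 0 (background).

    Assumption for this sketch: by default the ink is darker than the paper.
    """
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    if ink_is_dark:
        return [[1 if p <= t else 0 for p in row] for row in image]
    return [[1 if p > t else 0 for p in row] for row in image]


# Toy 3x3 "scan": dark stroke pixels on a bright background.
sample = [
    [250, 245, 20],
    [240, 30, 25],
    [255, 250, 10],
]
mask = binarize(sample)
```

In practice a library routine would normally be preferred to a hand-written loop; the sketch only shows what the thresholding step computes before the images are passed to a model.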