Kinga Pilch
supervisor: Maria Ganzha
Optical character recognition (OCR) is one of the oldest branches of pattern recognition. It refers to the conversion of handwritten or printed letters, or symbols in general, into machine-encoded text.
Much work has been done on OCR for Western scripts and for widely used languages such as Chinese, Japanese, and the major Indian languages. However, Assamese, the main language of northeastern India, has still received little attention.
Assamese is spoken by over 14 million people. Some earlier attempts at recognizing its characters exist, but they reached an accuracy of at most 90%. The major limitation of the neural networks used in that work is their inability to capture spatial features in images. This problem can be overcome by using convolutional neural networks (CNNs), which also eliminate the need for manual feature extraction because they learn features directly during training. Since no standard image dataset for Assamese characters was available, the cooperating institute collected and generated a dataset of over 12,000 handwritten character images.
The aim of this work is to create the best possible solution for Assamese character recognition. So far, the dataset has been created and the images have been preprocessed using methods such as Otsu binarization, normalization, and image smoothing. Many models have been tested, both designed by myself and available online (such as DenseNet-201 and U-Net). Meta deep learning models have also been developed. The best result obtained so far is 95.4%, and work on improving it is ongoing.
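As an illustration of the Otsu binarization step mentioned above, the following is a minimal sketch in plain Python, not the preprocessing code used in this work. It assumes 8-bit grayscale pixel values (0 to 255) and maps bright pixels to 1; the function names `otsu_threshold` and `binarize` are hypothetical and chosen only for this example.

```python
def otsu_threshold(pixels):
    """Return the threshold t in 0..255 that maximizes between-class variance.

    `pixels` is a flat list of 8-bit grayscale values (an assumption of
    this sketch). Pixels with value <= t form one class, the rest the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class is still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class is empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Map each pixel to 1 if it is brighter than the threshold, else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


# Toy 2x3 "image" with two clearly separated intensity groups.
image = [[10, 10, 200], [10, 200, 200]]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
mask = binarize(image, t)
```

For dark ink on light paper, the comparison in `binarize` would be inverted so that strokes become the foreground class. In practice a library routine would normally replace this hand-written loop.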