CycleAugment: Efficient data augmentation strategy for handwritten text recognition in historical document images
DOI code
Creator 1. Sarayut Gonwirat
2. Olarik Surinta
Title CycleAugment: Efficient data augmentation strategy for handwritten text recognition in historical document images
Publisher Faculty of Engineering, Khon Kaen University
Publication Year 2565 BE (2022 CE)
Journal Title Engineering and Applied Science Research
Journal Vol. 49
Journal No. 4
Page no. 505-520
Keyword Convolutional recurrent neural network, Handwritten text recognition, Data augmentation, Deep learning, Training strategy
URL Website https://ph01.tci-thaijo.org/index.php/easr/index
Website title Engineering and Applied Science Research
ISSN 2539-6161
Abstract Predicting the sequence pattern of handwritten text images is challenging because of varied writing styles, insufficient training data, and background noise in the images. The convolutional recurrent neural network (CRNN), an architecture that combines a convolutional neural network (CNN) with a recurrent neural network (RNN), is the most successful sequence learning method for handwritten text recognition systems. For handwritten text recognition in historical Thai document images, we first trained nine different CRNN architectures, both from scratch and with transfer learning, to determine which technique was more effective. We found that transfer learning did not significantly outperform training from scratch. Second, we trained the CRNN model with basic transformation-based data augmentation techniques: shifting, rotation, and shearing. Data augmentation yielded higher accuracy than training without it, but the improvement was not significant. The original training strategy aims to find a global minimum and does not always solve the overfitting problem. Third, we proposed a cyclical data augmentation strategy, called CycleAugment, to discover many local minima and prevent overfitting. In each cycle, the training loss decreased rapidly to reach a local minimum. The CycleAugment strategy allowed the CRNN model to learn from input images both with and without data augmentation, and therefore from many input patterns. Hence, the CycleAugment strategy consistently achieved the best performance compared with the other strategies. Finally, we prevented image distortion by applying a simple technique to short word images and achieved better performance on the historical Thai document image dataset.
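The cyclical idea in the abstract can be illustrated with a small scheduling sketch. The abstract does not give the cycle length, the number of augmented epochs per cycle, or the order of the two phases, so `cycle_length`, `augmented_epochs`, and the "augmented epochs first" ordering below are assumptions made purely for illustration, and the function names are hypothetical rather than taken from the paper.

```python
from typing import List


def use_augmentation(epoch: int, cycle_length: int, augmented_epochs: int) -> bool:
    """Decide whether data augmentation is switched on for a 0-indexed epoch.

    Assumption (not from the abstract): each cycle starts with
    `augmented_epochs` epochs trained on augmented images, followed by
    epochs trained on the original images.
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    if not 0 <= augmented_epochs <= cycle_length:
        raise ValueError("augmented_epochs must lie in [0, cycle_length]")
    position_in_cycle = epoch % cycle_length
    return position_in_cycle < augmented_epochs


def build_schedule(total_epochs: int, cycle_length: int, augmented_epochs: int) -> List[bool]:
    """Return one on/off augmentation flag per epoch."""
    return [
        use_augmentation(epoch, cycle_length, augmented_epochs)
        for epoch in range(total_epochs)
    ]


if __name__ == "__main__":
    # Two cycles of three epochs, two of them augmented (illustrative values).
    for epoch, augment in enumerate(build_schedule(6, 3, 2)):
        mode = "augmented (shift/rotate/shear)" if augment else "original images"
        print(f"epoch {epoch}: {mode}")
```

In a training loop, the flag for the current epoch would select between the augmented and the unaugmented data pipeline, so the model sees both kinds of input within every cycle.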