arshjot/Handwritten-Text-Recognition

Handwritten Text Recognition with TensorFlow

Code and model weights for an English handwritten text recognition (HTR) model trained on the IAM Handwriting Database. It is largely a TensorFlow port of Joan Puigcerver's work on HTR. The framework can also be used to build similar models on other datasets. Code is provided for three architectures: BLSTM, CRNN, and STN followed by CRNN.

Inigo Montoya

Guess it's Anigo Montoya now...

Requirements

Steps for predicting on new images

A pre-trained CRNN model (5 Conv2D blocks followed by 5 bidirectional LSTM layers; the hyperparameters and architecture are the same as those used here) is provided. To get predictions on new images, follow the steps below:

  1. Place the images of handwritten text in the samples folder
  2. Download the model weights from here, extract them, and place them under the experiments directory. Ensure that the directory structure below is followed:
    ├── experiments
    │   ├── CRNN_h128
    │   │   ├── best_model
    │   │   ├── checkpoint
    │   │   └── summary
  3. Enter the mains directory and run:
    python predict.py -c ../configs/config.json
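
Before running the prediction command, it can help to confirm that the extracted weights landed where the layout above expects them. The following is a minimal sketch, not part of the repository: the helper name `missing_experiment_entries` is hypothetical, and it only checks that the three entries listed above exist, not that their contents are valid.

```python
from pathlib import Path

# Entries listed under experiments/CRNN_h128 in the layout above.
EXPECTED_ENTRIES = ("best_model", "checkpoint", "summary")


def missing_experiment_entries(root, exp_name="CRNN_h128"):
    """Return the expected paths that are absent under <root>/experiments/<exp_name>.

    Hypothetical helper for illustration; it checks existence only.
    """
    exp_dir = Path(root) / "experiments" / exp_name
    return [exp_dir / name for name in EXPECTED_ENTRIES
            if not (exp_dir / name).exists()]
```

Calling `missing_experiment_entries(".")` from the repository root returns an empty list when all three entries are present.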

Steps for training model from scratch

  1. Download the IAM dataset (you'll need to register on the website) and keep the lines partition in the /data/IAM/ directory as shown below:

    ├── data
    │   ├── IAM
    │   │   ├── lines
    │   │   │   ├── a01-000u-00.png
    │   │   │   ├── a01-000u-01.png
    │   │   │   ├── .
    │   │   │   ├── .
    │   │   │   ├── .
    │   │   ├── lines.txt
  2. If required, modify the /configs/config.json file to change the model architecture, image height, etc.

  3. From the data directory, run:

    python process_images.py -c ../configs/config.json

    This will pre-process the images (add borders, resize, remove skew, etc.) using imgtxtenh and ImageMagick's convert.

  4. From the data directory, run:

    python prepare_IAM.py -c ../configs/config.json

    This will:

    • process the ground-truth labels to remove spaces within words and collapse contractions
    • read each image and create TFRecords files for train, validation and test sets using Aachen's partition
  5. Start model training by running the command below from the mains directory:

    python main.py -c ../configs/config.json
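
The label processing in step 4 can be illustrated with a rough sketch. It is not the code in prepare_IAM.py, and it rests on assumptions: that each non-comment record in lines.txt has eight space-separated metadata fields followed by the transcription, that words in the transcription are separated by `|`, and that contractions appear as split tokens such as `We|'ll` or `do|n't`. The function names and the suffix list are hypothetical, and the sketch covers only contraction collapsing and separator handling.

```python
import re

# Assumed contraction suffixes that appear as separate tokens (e.g. "We|'ll").
CONTRACTION = re.compile(r"\|(n't|'(?:s|d|ll|m|ve|t|re))(?=\||$)", re.IGNORECASE)


def parse_record(record):
    """Split one lines.txt record into (line_id, raw_transcription).

    Assumes eight metadata fields precede the transcription.
    """
    parts = record.strip().split(" ", 8)
    return parts[0], parts[8]


def normalize_label(raw):
    """Glue contraction suffixes to the preceding word, then turn '|' into spaces."""
    collapsed = CONTRACTION.sub(r"\1", raw)
    return collapsed.replace("|", " ")
```

For example, `normalize_label("We|'ll|go")` yields `"We'll go"` under these assumptions.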

Results

The character error rates (CER) achieved by the pre-trained model on the IAM validation and test sets are shown below:

Set          CER (%)
Validation   4.83
Test         7.01
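
For reference, CER is commonly defined as the character-level edit distance between prediction and ground truth, divided by the number of ground-truth characters. The sketch below follows that common definition in plain Python. Whether the repository aggregates edits over the whole set (as done here) or averages per line is an assumption, and the function names are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """CER in percent: total edits over total reference characters (assumed aggregation)."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return 100.0 * total_edits / total_chars
```

Reading "Inigo Montoya" as "Anigo Montoya" is one substitution over 13 characters, so `cer(["Inigo Montoya"], ["Anigo Montoya"])` is about 7.69.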

Notes

  • Please ensure the text is written in black on a white background, similar to the images in the samples folder
  • During training, the character error rate (CER) is computed only every 10 steps, because TensorFlow's ctc_beam_search_decoder would otherwise slow training down
  • Bucketing images by width (to avoid extraneous padding) is supported and can be toggled in the config file
  • Batching images with widely varying widths may slightly lower accuracy because of padding. A workaround is to use a batch size of 1 during inference.
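
The idea behind width bucketing can be shown with a small, framework-free sketch. It is not the repository's implementation, which presumably works on TFRecords inside the TensorFlow input pipeline. The function names and the `bucket_size` and `batch_size` parameters are hypothetical.

```python
from collections import defaultdict


def bucket_by_width(widths, bucket_size=400, batch_size=3):
    """Group image indices into batches of similar width.

    Images whose widths fall in the same bucket_size-wide range share a bucket,
    and each bucket is then cut into batches of at most batch_size.
    """
    buckets = defaultdict(list)
    for idx, width in enumerate(widths):
        buckets[width // bucket_size].append(idx)
    batches = []
    for key in sorted(buckets):
        members = buckets[key]
        for start in range(0, len(members), batch_size):
            batches.append(members[start:start + batch_size])
    return batches


def padding_columns(widths, batches):
    """Total padded pixel columns when every batch is padded to its widest image."""
    return sum(max(widths[i] for i in batch) - widths[i]
               for batch in batches for i in batch)
```

With widths `[120, 980, 150, 1010, 130, 990]`, bucketing groups the three narrow and the three wide images separately, which needs far fewer padded columns than batching them in their original order.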

Citations
