Removed Keras import and updated readme.
SivanKe committed Jan 11, 2017
1 parent 2de054f commit 8bab320
Showing 2 changed files with 13 additions and 18 deletions.
README.md: 27 changes (13 additions, 14 deletions)
@@ -1,24 +1,14 @@
# Attention-OCR
Visual Attention based OCR. The model first runs a sliding CNN on the image (images are resized to height 32 while preserving aspect ratio). Then an LSTM is stacked on top of the CNN. Finally, an attenion model is used as a decoder for producing the final outputs.
Authors: [Qi Guo](http://qiguo.ml) and [Yuntian Deng](https://github.com/da03)

Visual Attention based OCR. The model first runs a sliding CNN on the image (images are resized to height 32 while preserving aspect ratio). Then an LSTM is stacked on top of the CNN. Finally, an attention model is used as a decoder for producing the final outputs.

![example image 0](http://cs.cmu.edu/~yuntiand/OCR-2.jpg)
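
As a rough, illustrative sketch of the pipeline described above (resize to height 32, sliding CNN, LSTM, attention decoder), the preprocessing step could look like the following; the helper name and interpolation choice are assumptions for illustration, not the repository's code:

```python
# Illustrative sketch only: resize an input image to a fixed height of 32 px
# while preserving aspect ratio, as the model description above requires.
from PIL import Image

def resize_keep_aspect(img, target_height=32):
    """Return the image resized to `target_height` with aspect ratio preserved."""
    w, h = img.size
    new_w = max(1, int(round(w * target_height / float(h))))
    return img.resize((new_w, target_height), Image.BILINEAR)

# img = resize_keep_aspect(Image.open("word.png").convert("L"))
# The sliding CNN then turns the (32 x W) image into a sequence of feature
# vectors, an LSTM encodes that sequence, and the attention decoder emits
# the output characters one at a time.
```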

# Prerequisites
Most of our code is built on Tensorflow, but we also use Keras for the convolution part of our model. In addition, we use the Python package `distance` to compute the edit distance for evaluation (this is optional; if `distance` is not installed, we fall back to exact match).
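
The fallback just described can be pictured with a minimal sketch (not the repository's evaluation code; it assumes the `distance` package exposes `levenshtein`):

```python
# Minimal sketch of the optional edit-distance evaluation with exact-match fallback.
def score(prediction, ground_truth):
    try:
        import distance                                     # optional dependency
        d = distance.levenshtein(prediction, ground_truth)
        return 1.0 - d / float(max(len(ground_truth), 1))   # normalized similarity
    except ImportError:
        return 1.0 if prediction == ground_truth else 0.0   # exact match only

print(score("hel1o", "hello"))  # ~0.8 with `distance` installed, 0.0 without
```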

### Tensorflow: [Installation Instructions](https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#download-and-setup)

### Keras: [Installation Instructions](http://keras.io/#installation)

Set Keras backend:

```
export KERAS_BACKEND=tensorflow
```

```
echo 'export KERAS_BACKEND=tensorflow' >> ~/.bashrc
```
### Tensorflow: [Installation Instructions](https://www.tensorflow.org/get_started/os_setup#download-and-setup) (tested on 0.12.1)
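
A quick way to confirm that the installed version matches the tested release noted above (not part of the original instructions):

```python
# Print the installed Tensorflow version; 0.12.1 is the release noted as tested above.
import tensorflow as tf
print(tf.__version__)
```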

### Distance (Optional):

@@ -178,3 +168,12 @@ Image 5 (e/e): ![example image 5](http://cs.cmu.edu/~yuntiand/2evaluation_data_i
* `attn-num-hidden`: Number of hidden units in attention decoder cell.
* `attn-num-layers`: Number of layers in the attention decoder cell. (The encoder's number of hidden units will be `attn-num-hidden` * `attn-num-layers`.)
* `target-vocab-size`: Target vocabulary size. Default is 39 (= 26 + 10 + 3): 0: PADDING, 1: GO, 2: EOS, >2: 0-9, a-z (see the sketch below).
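
The default layout above implies a straightforward character-to-id mapping; the helper below is a hypothetical illustration of that layout, not the project's actual code:

```python
# Hypothetical illustration of the default vocabulary layout:
# ids 0-2 are special tokens, then the 10 digits, then the 26 lowercase letters.
import string

PAD, GO, EOS = 0, 1, 2
CHARSET = string.digits + string.ascii_lowercase        # '0'-'9' then 'a'-'z'
CHAR_TO_ID = {c: i + 3 for i, c in enumerate(CHARSET)}  # ids 3..38
VOCAB_SIZE = len(CHARSET) + 3                           # 26 + 10 + 3 = 39

def encode(text):
    """Map a lowercase alphanumeric string to ids, wrapped in GO/EOS."""
    return [GO] + [CHAR_TO_ID[c] for c in text.lower() if c in CHAR_TO_ID] + [EOS]

print(VOCAB_SIZE)     # 39
print(encode("ab3"))  # [1, 13, 14, 6, 2]
```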


# References

[Convert a formula to its LaTeX source](https://github.com/harvardnlp/im2markup)

[What You Get Is What You See: A Visual Markup Decompiler](https://arxiv.org/pdf/1609.04938.pdf)

[Torch attention OCR](https://github.com/da03/torch-Attention-OCR)
src/model/cnn.py: 4 changes (0 additions, 4 deletions)
@@ -1,11 +1,7 @@
__author__ = 'moonkey'

from keras import models, layers
import logging
import numpy as np
# from src.data_util.synth_prepare import SynthGen

import keras.backend as K
import tensorflow as tf


