From 8bab320e342e0f2401aa5f2b795a0ed27c97ddb2 Mon Sep 17 00:00:00 2001
From: Sivan
Date: Wed, 11 Jan 2017 14:11:41 +0200
Subject: [PATCH] Removed Keras import and updated readme.

---
 README.md        | 27 +++++++++++++--------------
 src/model/cnn.py |  4 ----
 2 files changed, 13 insertions(+), 18 deletions(-)

diff --git a/README.md b/README.md
index 5ad6e803..90d3a3bd 100644
--- a/README.md
+++ b/README.md
@@ -1,24 +1,14 @@
 # Attention-OCR
-Visual Attention based OCR. The model first runs a sliding CNN on the image (images are resized to height 32 while preserving aspect ratio). Then an LSTM is stacked on top of the CNN. Finally, an attenion model is used as a decoder for producing the final outputs.
+Authors: [Qi Guo](http://qiguo.ml) and [Yuntian Deng](https://github.com/da03)
+
+Visual Attention based OCR. The model first runs a sliding CNN on the image (images are resized to height 32 while preserving aspect ratio). Then an LSTM is stacked on top of the CNN. Finally, an attention model is used as a decoder for producing the final outputs.
 
 ![example image 0](http://cs.cmu.edu/~yuntiand/OCR-2.jpg)
 
 # Prerequsites
 Most of our code is written based on Tensorflow, but we also use Keras for the convolution part of our model. Besides, we use python package distance to calculate edit distance for evaluation. (However, that is not mandatory, if distance is not installed, we will do exact match).
 
-### Tensorflow: [Installation Instructions](https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#download-and-setup)
-
-### Keras: [Installation Instructions](http://keras.io/#installation)
-
-Set Keras backend:
-
-```
-export KERAS_BACKEND=tensorflow
-```
-
-```
-echo 'export KERAS_BACKEND=tensorflow' >> ~/.bashrc
-```
+### Tensorflow: [Installation Instructions](https://www.tensorflow.org/get_started/os_setup#download-and-setup) (tested on 0.12.1)
 
 ### Distance (Optional):
 
@@ -178,3 +168,12 @@ Image 5 (e/e): ![example image 5](http://cs.cmu.edu/~yuntiand/2evaluation_data_i
 * `attn-num-hidden`: Number of hidden units in attention decoder cell.
 * `attn-num-layers`: Number of layers in attention decoder cell. (Encoder number of hidden units will be `attn-num-hidden`*`attn-num-layers`).
 * `target-vocab-size`: Target vocabulary size. Default is = 26+10+3 # 0: PADDING, 1: GO, 2: EOS, >2: 0-9, a-z
+
+
+# References
+
+[Convert a formula to its LaTeX source](https://github.com/harvardnlp/im2markup)
+
+[What You Get Is What You See: A Visual Markup Decompiler](https://arxiv.org/pdf/1609.04938.pdf)
+
+[Torch attention OCR](https://github.com/da03/torch-Attention-OCR)
diff --git a/src/model/cnn.py b/src/model/cnn.py
index aee383d6..7eb19d59 100644
--- a/src/model/cnn.py
+++ b/src/model/cnn.py
@@ -1,11 +1,7 @@
 __author__ = 'moonkey'
 
-from keras import models, layers
 import logging
 import numpy as np
 
-# from src.data_util.synth_prepare import SynthGen
-
-import keras.backend as K
 import tensorflow as tf
 
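Note on the README text carried in this patch: it states that input images are resized to height 32 while preserving aspect ratio before the sliding CNN runs. A minimal sketch of that resize computation follows; the `resize_dims` helper is hypothetical, added here only to illustrate the preprocessing described, and is not part of this patch or the repository.

```python
def resize_dims(width, height, target_height=32):
    """Return (new_width, new_height) after scaling an image so its
    height becomes target_height while preserving the aspect ratio."""
    scale = target_height / float(height)
    # Guard against degenerate zero-width results for very tall images.
    new_width = max(1, int(round(width * scale)))
    return new_width, target_height
```

For example, a 320x64 input would be rescaled to 160x32 before being fed to the sliding CNN.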