Visual-Question-Answering

This is an RNN+CNN Visual Question Answering (VQA) model. It uses VGG16 to extract image features and is trained on the VQA dataset.
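As a rough illustration of how an RNN+CNN model can combine its two inputs, the sketch below fuses an image feature vector with a question encoding and scores a set of candidate answers. It is not taken from this repository: the element-wise fusion, the random weights, and the sizes `Q_DIM`, `HIDDEN`, and `NUM_ANSWERS` are all assumptions made for the example. Only the 4096-dimensional image vector reflects VGG16, whose fully connected layers output 4096 values.

```python
import numpy as np

rng = np.random.default_rng(0)

IMG_DIM = 4096      # output size of a VGG16 fully connected layer
Q_DIM = 512         # hypothetical size of the RNN question encoding
HIDDEN = 256        # hypothetical joint embedding size
NUM_ANSWERS = 1000  # hypothetical number of candidate answers


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def fuse_and_classify(img_feat, q_feat, W_img, W_q, W_out):
    """Project both inputs to a shared space, multiply, and score answers."""
    img_emb = np.tanh(W_img @ img_feat)   # image branch (CNN features)
    q_emb = np.tanh(W_q @ q_feat)         # question branch (RNN encoding)
    joint = img_emb * q_emb               # element-wise fusion (an assumption)
    return softmax(W_out @ joint)         # probability per candidate answer


# Random stand-ins for learned weights and for real features.
W_img = rng.normal(scale=0.01, size=(HIDDEN, IMG_DIM))
W_q = rng.normal(scale=0.01, size=(HIDDEN, Q_DIM))
W_out = rng.normal(scale=0.01, size=(NUM_ANSWERS, HIDDEN))
img_feat = rng.normal(size=IMG_DIM)
q_feat = rng.normal(size=Q_DIM)

probs = fuse_and_classify(img_feat, q_feat, W_img, W_q, W_out)
print(probs.shape, int(np.argmax(probs)))
```

In a trained model the weights would be learned and the index of the largest probability would be mapped back to an answer string.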

Dependencies

  1. Keras version 2.0+
  2. TensorFlow 1.2+
  3. spaCy version 2.0+
    • To install or upgrade the GloVe vectors:
      • python -m spacy download en_vectors_web_lg
  4. OpenCV
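A quick way to confirm that an environment meets the minimum versions listed above is sketched below. This is a hypothetical helper, not part of the repository; the package names passed to `importlib.metadata` are assumed to match the names the libraries were installed under, and OpenCV is left out because no minimum version is given for it.

```python
from importlib import metadata

# Minimum versions from the dependency list; the package names are assumptions.
MINIMUMS = {"keras": "2.0", "tensorflow": "1.2", "spacy": "2.0"}


def version_tuple(version):
    """Turn '2.0.8' into (2, 0, 8), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependencies(minimums=MINIMUMS):
    """Return a status string ('ok', 'too old', 'missing') per package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(name, status)
```

The comparison only looks at the leading numeric components, so pre-release suffixes such as `rc1` are ignored.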

Usage

Download my pretrained model from here

To run the pretrained model in Google Colab, click here

To train the model, run:

$ python train.py

Flask web app

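The web app is currently in its initial stages. To ask a question, rename the image file to the question you want to ask. To run the app (the set command below is Windows command-prompt syntax; on Linux or macOS, use export instead):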

$ set FLASK_APP=hello_app.py
$ flask run
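The file-naming convention described in this section could be handled with a small helper like the one below. It is only a sketch: `question_from_filename` is a hypothetical name, and treating underscores as spaces and appending a question mark are assumptions, not behavior confirmed by `hello_app.py`.

```python
from pathlib import Path


def question_from_filename(image_path):
    """Hypothetical helper: derive a question from an image's file name."""
    stem = Path(image_path).stem             # drop directory and extension
    words = stem.replace("_", " ").split()   # treat underscores as spaces
    question = " ".join(words)
    if not question.endswith("?"):
        question += "?"                      # '?' is not allowed in Windows file names
    return question


print(question_from_filename("uploads/what_color_is_the_car.jpg"))
# prints: what color is the car?
```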

Reference

https://github.com/VT-vision-lab/VQA_LSTM_CNN