dozingLee/Preprocess-data

PascalSentenceDataset

This program is a utility for downloading the Pascal Sentence dataset.

Installation

You can install it with the "git clone" command.

git clone https://github.com/rupy/PascalSentenceDataset.git

Dependency

Python 2 or later is required. Install the following Python library with pip:

PyQuery

Usage

To download the dataset, run the program as follows:

python pascal_sentence_dataset.py

You can also use the class from your own code:

# import
from pascal_sentence_dataset import PascalSentenceDataSet

# create instance
dataset = PascalSentenceDataSet()
# download images
dataset.download_images()
# download sentences
dataset.download_sentences()
# create correspondence data by dataset
# dataset.create_correspondence_data()

# create my pair data
dataset.create_pair_data()
# preprocess data
dataset.preprocess_data()

This produces the following files in ./list/:

  • correspondence.csv: 1000 rows, with columns index, image
  • data_pairs.csv: 1000 rows, with columns index, image, text, label
  • train.csv: the training set, with 800 image-text pairs (40 pairs per class)
  • validate.csv: the validation set, with 100 image-text pairs (5 pairs per class)
  • test.csv: the testing set, with 100 image-text pairs (5 pairs per class)
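
The per-class split sizes listed above (40 / 5 / 5) can be illustrated with a minimal sketch. This is not the code in pascal_sentence_dataset.py; the function name split_per_class, the fixed seed, and the synthetic rows are assumptions made for illustration only. The rows imitate the data_pairs.csv columns (index, image, text, label), with 20 classes of 50 pairs each.

```python
import random
from collections import defaultdict


def split_per_class(rows, n_train=40, n_val=5, n_test=5, seed=0):
    """Split rows into train/validate/test with fixed counts per label.

    Hypothetical helper: the real preprocessing may choose rows differently.
    """
    # Group the rows by their class label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row["label"]].append(row)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, validate, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) < n_train + n_val + n_test:
            raise ValueError("not enough rows for label %r" % (label,))
        rng.shuffle(group)
        train.extend(group[:n_train])
        validate.extend(group[n_train:n_train + n_val])
        test.extend(group[n_train + n_val:n_train + n_val + n_test])
    return train, validate, test


# Synthetic stand-in for data_pairs.csv: 20 classes x 50 pairs = 1000 rows.
rows = [
    {"index": c * 50 + i, "image": "img_%d_%d.jpg" % (c, i),
     "text": "sentence %d" % i, "label": c}
    for c in range(20) for i in range(50)
]

train, validate, test = split_per_class(rows)
print(len(train), len(validate), len(test))
```

With the synthetic input, the three lists contain 800, 100 and 100 rows, matching the sizes of train.csv, validate.csv and test.csv. To try it on real data, the rows could be read from ./list/data_pairs.csv with csv.DictReader, assuming the file has a header row with those column names.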
