download the CUB dataset, but there is no train.json in it. #67

michaelwithu · 2024-06-21T09:41:11Z

I downloaded the CUB dataset from https://data.caltech.edu/records/65de6-vp158
but it does not contain a train.json.

Also, the links (https://cornell.box.com/v/vptfgvcsplits; https://drive.google.com/drive/folders/1mnvxTkYxmOr2W9QjcgS64UBpoJ4UmKaM?usp=sharing) do not work.

aba122 · 2024-07-28T19:56:09Z

I have the same issue.

arpita-chowdhury-osu · 2024-08-21T06:29:24Z

It should look like this:

{
   image_name : class_index
}
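
As a rough illustration of that shape, here is a minimal sketch that builds such a mapping and round-trips it through JSON. The file names and class indices are made-up placeholders, and using paths relative to the dataset root as keys is an assumption that mirrors what the script further down in this comment produces; it is not a confirmed spec of what the repo expects.

```python
import json

# Hypothetical entries: keys are image paths relative to the dataset root,
# values are integer class indices. All names here are placeholders.
example = {
    "train/1_className/image_1.jpg": 1,
    "train/2_className/image_1.jpg": 2,
}

# Serialize the mapping the same way a train.json would be written,
# then parse it back to check that the structure survives.
text = json.dumps(example, indent=4)
loaded = json.loads(text)
print(loaded["train/2_className/image_1.jpg"])
```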

You can use the code below to generate the JSON files. It takes the whole train set as training data and uses val as both the validation and the test data. Feel free to randomly take 20% of the train data for train.json and 20% for val.json if needed. I didn't, because for CUB I don't need to search for proper hyperparameters; they did that already.
The code assumes your CUB dataset looks like this:

cub
- train
  - 1_className
    - image_1
    - image_2
  - 2_className
    - image_1
    - ...
- val
  - 1_className
    - image_1

import json
import os
import re

def create_json_files(data_dir):
    """Write train.json, val.json and test.json mapping image path -> class id."""
    json_data = {'train': {}, 'val': {}, 'test': {}}

    for split in ['train', 'val']:
        split_dir = os.path.join(data_dir, split)
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if not os.path.isdir(class_dir):
                continue
            # Folder names start with the class index, e.g. "1_className" or
            # "001.className"; take the leading digits as the class id.
            match = re.match(r'\d+', class_name)
            if match is None:
                raise ValueError(f'No class index in folder name: {class_name}')
            class_id = int(match.group())
            for img_name in sorted(os.listdir(class_dir)):
                # Keys are paths relative to data_dir
                img_path = os.path.join(split, class_name, img_name)
                json_data[split][img_path] = class_id

    # For the test set, we'll assume it uses the same format (and data) as val
    json_data['test'] = json_data['val']

    # Create the JSON files
    for split in ['train', 'val', 'test']:
        json_file_path = os.path.join(data_dir, f'{split}.json')
        with open(json_file_path, 'w') as f:
            json.dump(json_data[split], f, indent=4)

    return json_data

dataset_path = "<path_to_your_dataset>"
create_json_files(dataset_path)
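
If you do want a random hold-out from the train data, one possible reading of the 20% suggestion above is sketched below. The helper name `split_mapping`, its parameters, and the toy file names are all hypothetical and not part of the repo; it only assumes the `{image_path: class_id}` dictionary format described in this comment.

```python
import random


def split_mapping(mapping, val_fraction=0.2, seed=0):
    """Randomly split {image_path: class_id} into (train, val) dicts."""
    keys = sorted(mapping)  # fixed order so the same seed gives the same split
    random.Random(seed).shuffle(keys)
    n_val = int(len(keys) * val_fraction)
    val_keys = set(keys[:n_val])
    train = {k: v for k, v in mapping.items() if k not in val_keys}
    val = {k: v for k, v in mapping.items() if k in val_keys}
    return train, val


# Toy mapping with placeholder names, only to show the call.
full_train = {f"train/1_className/image_{i}.jpg": 1 for i in range(10)}
train_part, val_part = split_mapping(full_train, val_fraction=0.2, seed=0)
print(len(train_part), len(val_part))
```

The two returned dictionaries can then be written out with `json.dump` in the same way as the JSON files above. Note that this split is not stratified by class, so very small classes may end up unevenly represented.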
