Add BRIGHT dataset #2520

Open
wants to merge 14 commits into
base: main

Collaborator

@nilsleh nilsleh commented Jan 18, 2025

This PR adds the BRIGHT dataset.

Dataset Features:

* Pre-disaster optical images from MAXAR, NAIP, NOAA Digital Coast Raster Datasets, and the National Plan for Aerial Orthophotography Spain
* Post-disaster SAR images from Capella Space and Umbra
* High image resolution of 0.3-1 m

Dataset Format:

* Images are in GeoTIFF format with pixel dimensions of 1024x1024
* Pre-disaster images have three channels
* Post-disaster SAR images are single-channel but are repeated to have 3 channels

bright_example

@ChenHongruixuan @olidietrich Thank you for the nice work and for making the dataset public. Could you include the split .txt files found here inside the zip file on Hugging Face, so that everything is in one place? And of course, if you have any other comments about the PR, feel free to let us know below.

@nilsleh nilsleh marked this pull request as draft January 18, 2025 10:19
@github-actions github-actions bot added labels documentation (Improvements or additions to documentation), datasets (Geospatial or benchmark datasets), testing (Continuous integration testing), datamodules (PyTorch Lightning datamodules) Jan 18, 2025
@adamjstewart adamjstewart added this to the 0.7.0 milestone Jan 18, 2025
@github-actions github-actions bot removed the datamodules (PyTorch Lightning datamodules) label Jan 20, 2025
@nilsleh nilsleh marked this pull request as ready for review January 20, 2025 10:48
@nilsleh nilsleh requested a review from isaaccorley January 22, 2025 17:27
torchgeo/datasets/bright.py (resolved)
docs/api/datasets/non_geo_datasets.csv (outdated, resolved)
torchgeo/datasets/bright.py (resolved)
torchgeo/datasets/bright.py (outdated, resolved)
Comment on lines 135 to 137
# https://github.com/ChenHongruixuan/BRIGHT/blob/11b1ffafa4d30d2df2081189b56864b0de4e3ed7/dfc25_benchmark/dataset/make_data_loader.py#L101
# post image is stacked to also have 3 channels
post_image = repeat(post_image, 'c h w -> (repeat c) h w', repeat=3)
Collaborator

Hmm, does this make sense? Is this only because they want to use RGB models pre-trained on ImageNet?

Collaborator Author

I think now that we are implementing this as the "Challenge" version of the dataset it's okay to do?

Collaborator

It depends. If we want to use one of our S1 pretrained models, not duplicating may make more sense. I would personally leave it up to the user to add a transform if they want this.

Collaborator

I think I'm still against this. It would prevent us from using our S1 pretrained models on S1 imagery.
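
For reference, a user-side transform along the lines suggested above might look like the following. This is only a minimal sketch, not code from this PR: the function name `repeat_sar_channels` and the sample key `image_post` are assumptions made for illustration, and `Tensor.repeat` is used in place of `einops.repeat` to avoid the extra dependency.

```python
import torch
from torch import Tensor


def repeat_sar_channels(
    sample: dict[str, Tensor], key: str = "image_post", repeats: int = 3
) -> dict[str, Tensor]:
    """Tile the channel axis of a (C, H, W) tensor `repeats` times.

    The key name "image_post" is an assumption for this sketch.
    """
    out = dict(sample)  # shallow copy so the input sample dict is not modified
    # Same result as einops.repeat(x, 'c h w -> (repeat c) h w', repeat=repeats)
    out[key] = sample[key].repeat(repeats, 1, 1)
    return out


# Dummy sample standing in for a dataset item: 3-channel optical, 1-channel SAR
sample = {
    "image_pre": torch.rand(3, 64, 64),
    "image_post": torch.rand(1, 64, 64),
}
stacked = repeat_sar_channels(sample)
print(stacked["image_post"].shape)
```

If the dataset returned the single-channel SAR image by default, a function like this could be passed as the dataset's `transforms` argument by anyone who needs 3-channel input, while models pretrained on SAR imagery could consume the image unchanged.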

torchgeo/datasets/bright.py (outdated, resolved)
torchgeo/datasets/bright.py (outdated, resolved)
@ChenHongruixuan

ChenHongruixuan commented Jan 25, 2025

Hi @nilsleh ,

Thank you very much for your contribution and efforts in integrating BRIGHT into torchgeo! We are a little hesitant to include the txt files you mentioned in the zip file for torchgeo, as they are only for the DFC25 division. That is why we provide these txt files on GitHub instead of in the original dataset zip file, especially considering that the current BRIGHT is not the final version.

Best,

@nilsleh
Collaborator Author

nilsleh commented Jan 27, 2025

Thanks for getting back, @ChenHongruixuan. No problem, we can also host them on our torchgeo Hugging Face, just to make downloading easier. I am also very much looking forward to the final version of the dataset. Feel free to let me know when it is openly available, so that we can integrate support for it here as well.

docs/api/datasets.rst (outdated, resolved)
@ChenHongruixuan

@nilsleh , thank you so much for your support!!

Labels: datasets (Geospatial or benchmark datasets), documentation (Improvements or additions to documentation), testing (Continuous integration testing)
3 participants