Standardize sample dict keys #985

adamjstewart · 2022-12-27T19:44:23Z

adamjstewart
Dec 27, 2022
Maintainer

Summary

In TorchGeo, each Dataset returns a sample dictionary with key/value pairs for each object. I propose we standardize the names of these keys and the types and dimensions of their corresponding values as follows:

input: Tensor, [C, H, W]
label: Tensor, [1]
mask: Tensor, [H, W]
bbox: Tensor, [N, 4, 2], defined in clockwise order: top-left, top-right, bottom-right and bottom-left, coordinates in x, y order
bbox_xyxy: Tensor, [N, 4], defined in xmin, ymin, xmax, ymax format
bbox_xywh: Tensor, [N, 4], defined in xmin, ymin, width, height format
keypoints: Tensor, [N, 2], coordinates in x, y order

I also propose we standardize batch dictionaries (mini-batches of samples) using the exact same definition, but with an additional batch dimension in front.

Rationale

Our current keys are not uniform (bbox and boxes, label and labels):

$ grep -RPRoh '(sample|batch)\[".*?\"]' | sort | uniq
batch["boxes"]
batch["image"]
batch["label"]
batch["labels"]
batch["mask"]
batch["prediction"]
batch["prediction_boxes"]
batch["prediction_labels"]
batch["prediction_scores"]
sample["bbox"]
sample["boxes"]
sample["chm"]
sample["hsi"]
sample["image"]
sample["label"]
sample["mask"]
sample["num_patches"]
sample["original_shape"]
sample["padding"]
sample["patch_shape"]
sample["prediction"]
sample["prediction_boxes"]
sample["prediction_label"]

This makes it difficult to create dataset-independent trainers that handle multiple datamodules.

We would also like to be able to type check each of these so we don't have a mix of ints/floats and Tensors (e.g., for label).

TorchGeo relies heavily on Kornia for data augmentation. Kornia provides an AugmentationSequential container for composing and applying transforms. Currently, we create our own wrapper around AugmentationSequential that maps our keys to the keys expected by Kornia, but a better solution would be to standardize on the same keys that Kornia uses.

Implementation

Newer versions of Python introduce the TypedDict type. I propose we use this (and add a dependency on typing-extensions for older versions of Python). The code would look like:

from typing import TypedDict

class Sample(TypedDict, total=False):
    input: Tensor
    label: Tensor
    mask: Tensor
    bbox: Tensor
    bbox_xyxy: Tensor
    bbox_xywh: Tensor
    keypoints: Tensor

Batch = Sample

Each dataset __getitem__ would return an object of type Sample, while collation functions would take in List[Sample] and return Batch. This would replace our current Dict[str, Tensor] type hints and be much more strict.

There are a few important caveats to note here:

By specifying total=False, all keys are optional.
Extra keys are disallowed. If a dataset requires a custom key, it will need to be added to this list. If this wasn't the case, we would end up with accidental labels and boxes keys again.
PyTorch does not yet allow for dtype or shape to be typed. When it does, Batch will have its own separate implementation. For now, the difference is only semantic.

Alternatives

The keys allowed by Kornia are defined in the kornia.constants.DataKey enum. Unfortunately, I couldn't find any documentation or discussion on this enum. Kornia does not standardize "label", so we could alternatively call this "target".

Kornia isn't the only library for data augmentation. There is also Albumentations, which seems to standardize on:

image
class_labels
class_categories
mask
bboxes
bboxes_classes
keypoints
keypoints_classes

Torchvision does not use sample dicts, so they don't have a standard of their own. Torchvision uses PIL, while Albumentations uses OpenCV, neither of which support MSI or GPU, so I think we're better off matching the Kornia standard.

The final alternative is to convince Kornia to change their key names to match ours. I think this is unlikely, but @isaaccorley is more optimistic.

Additional information

We will likely need many additional keys for things that don't fall into this list. How strict should we be about this? It would be really nice if PyTorch's default collate function worked. We currently roll our own solely because we need to include things like rasterio.crs.CRS which isn't a type that the default function knows how to handle.

How should we handle predictions? We currently prefix with prediction_.

How should we handle multi-label problems? The dimensions of these would be different. Should we use a different key?

Should we also standardize dtypes? All Kornia transforms require {float16, float32, float64} as input, while all torchmetrics metrics require int? I can't actually find evidence of the latter, but we have several datamodules which assert this. Depending on if labels are for classification or regression, the type can and should change. We could either have separate keys for these, or ignore dtype for these.

isaaccorley · 2022-12-27T20:03:44Z

isaaccorley
Dec 27, 2022
Maintainer

I'm okay with standardizing on input, mask, label, bbox, bbox_xyxy, .... I think some datasets have multiple inputs of different shapes and dtypes, e.g. IDTReeS (chm, hsi, rgb, and label). This is the only edge case I can think of at the moment though. There are some other datasets like RSVQA which have text/image pairs which we haven't thought through yet but would be of interest given the current diffusion model hype.

2 replies

adamjstewart Dec 27, 2022
Maintainer Author

What exactly are the outputs of IDTReeS? I know they have point cloud data so we'll at least need another key for that. Are the rest actually useful? We could always add them as people need them.

isaaccorley Dec 28, 2022
Maintainer

The other outputs could be treated like images. HSI is an image of shape (C,H,W). CHM is like a DSM but for canopy height. It's also (C,H,W) but C=1.

adamjstewart · 2022-12-28T20:19:03Z

adamjstewart
Dec 28, 2022
Maintainer Author

Currently, we create our own wrapper around AugmentationSequential that maps our keys to the keys expected by Kornia, but a better solution would be to standardize on the same keys that Kornia uses.

@isaaccorley correctly pointed out that AugmentationSequential doesn't take sample dicts as input and that the user needs to manually pass the input like so:

augs = AugmentationSequential(...)
augs(sample["image"], sample["mask"], data_keys=["input", "mask"])

so there's actually no reason to force ourselves to match our keys to Kornia's.

I think we should:

Standardize on "image" for now since that's what we almost exclusively use
See if Kornia is willing to support sample dicts as input to AugmentationSequential.forward so that we can do sample = augs(sample): AugmentationSequential: accept sample dict as input kornia/kornia#2119
See if Kornia is willing to add "image" as an alias to "input", it's just an enum so it would be a 1-line code change
Consider changing "image" to "input" in our code iff 2 is approved but 3 is rejected

Does that sound reasonable? I agree that "input" isn't a good variable name, it's a builtin function in Python and it's too generic.

7 replies

MathiasBaumgartinger Oct 1, 2024

Thanks for your your elaboration! As i think this might be relevant information for other users as well, shall we work together to include this in the correspodning tutorial?

adamjstewart Oct 1, 2024
Maintainer Author

Kornia 0.7.4 is supposed to be released quite soon I think (@johnnv1 @edgarriba?). By the time our tutorial is updated, it will probably be out. @ashnair1 is already working on updating the rest of TorchGeo to be compatible with Kornia and remove the need for any hacks, although some of that work is backwards-incompatible and won't make it into a stable release until TorchGeo 0.7.0. I guess I'll let @ashnair1 decide if it's worth clarifying this in the tutorial or if we should just wait until it's no longer an issue.

MathiasBaumgartinger Oct 1, 2024

I see your point. Anyway, I was thinking about a mention of using augs as opposed to transform with the explanation you provided would be helpful. Especially since https://torchgeo.readthedocs.io/en/stable/tutorials/transforms.html does not mention how one would include augmentations in a module at all.

adamjstewart Oct 1, 2024
Maintainer Author

What we really need is an entire tutorial on writing custom data modules.

edgarriba Oct 2, 2024

@adamjstewart the reelase target is next week after eccv

adamjstewart · 2023-01-02T15:29:47Z

adamjstewart
Jan 2, 2023
Maintainer Author

We can actually standardize the dtypes at the same time. I propose:

class Sample(TypedDict, total=False):
    input: FloatTensor
    label: LongTensor
    mask: LongTensor
    bbox: LongTensor
    bbox_xyxy: LongTensor
    bbox_xywh: LongTensor
    keypoints: LongTensor

Kornia still requires everything to be float, but torchmetrics and torchvision/timm seem to require all targets to be long.

1 reply

adamjstewart Feb 28, 2023
Maintainer Author

Or not, see #1149. We may need to see if it's possible for kornia/torchmetrics/torchvision/timm to be less particular about their dtypes.

adamjstewart · 2023-01-02T15:32:27Z

adamjstewart
Jan 2, 2023
Maintainer Author

While we're at it, we should also standardize all transforms/data augmentation:

SampleTransform = Callable[[Sample], Sample]
BatchTransform = Callable[[Batch], Batch]

0 replies

adamjstewart · 2024-04-16T15:27:54Z

adamjstewart
Apr 16, 2024
Maintainer Author

In #1997 we note that some datasets contain both regression and classification labels, but these can't both be returned since they both use the same "label" key. We should consider using a different key for each type of task such that both classification and regression can be supported for the same dataset.

0 replies

adamjstewart · 2024-08-23T11:32:34Z

adamjstewart
Aug 23, 2024
Maintainer Author

Extra keys are disallowed. If a dataset requires a custom key, it will need to be added to this list. If this wasn't the case, we would end up with accidental labels and boxes keys again.

We can make subclasses for these: https://stackoverflow.com/a/71814659/5828163

2 replies

adamjstewart Sep 24, 2024
Maintainer Author

Also possible to disable the specific error raised for these extra keys: python/mypy#4617 (comment). Only problem is that this also requires libraries that depend on us to ignore these mypy warnings too.

adamjstewart Sep 26, 2024
Maintainer Author

Subclasses are annoying because the subclasses must be documented too. I wonder if we can create a custom TypedDict interface with a generic __getitem__(self, key: str) -> Any for all unknown keys.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Standardize sample dict keys #985

{{title}}

Replies: 6 comments 12 replies

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

{{title}}

{{editor}}'s edit

{{editor}}'s edit

{{title}}

Select a reply

Standardize sample dict keys #985

adamjstewart Dec 27, 2022 Maintainer

Summary

Rationale

Implementation

Alternatives

Additional information

Replies: 6 comments · 12 replies

isaaccorley Dec 27, 2022 Maintainer

adamjstewart Dec 27, 2022 Maintainer Author

isaaccorley Dec 28, 2022 Maintainer

adamjstewart Dec 28, 2022 Maintainer Author

MathiasBaumgartinger Oct 1, 2024

adamjstewart Oct 1, 2024 Maintainer Author

MathiasBaumgartinger Oct 1, 2024

adamjstewart Oct 1, 2024 Maintainer Author

edgarriba Oct 2, 2024

adamjstewart Jan 2, 2023 Maintainer Author

adamjstewart Feb 28, 2023 Maintainer Author

adamjstewart Jan 2, 2023 Maintainer Author

adamjstewart Apr 16, 2024 Maintainer Author

adamjstewart Aug 23, 2024 Maintainer Author

adamjstewart Sep 24, 2024 Maintainer Author

adamjstewart Sep 26, 2024 Maintainer Author

adamjstewart
Dec 27, 2022
Maintainer

Replies: 6 comments 12 replies

isaaccorley
Dec 27, 2022
Maintainer

adamjstewart Dec 27, 2022
Maintainer Author

isaaccorley Dec 28, 2022
Maintainer

adamjstewart
Dec 28, 2022
Maintainer Author

adamjstewart Oct 1, 2024
Maintainer Author

adamjstewart Oct 1, 2024
Maintainer Author

adamjstewart
Jan 2, 2023
Maintainer Author

adamjstewart Feb 28, 2023
Maintainer Author

adamjstewart
Jan 2, 2023
Maintainer Author

adamjstewart
Apr 16, 2024
Maintainer Author

adamjstewart
Aug 23, 2024
Maintainer Author

adamjstewart Sep 24, 2024
Maintainer Author

adamjstewart Sep 26, 2024
Maintainer Author