Memap dataset for multimodal data #47
base: main
…tests for image and multimodal data
@ndoll1998, please make sure to enable the pre-commit hooks.
src/modalities/dataloader/codecs.py
Outdated
@@ -0,0 +1,111 @@
from abc import ABC, abstractmethod
from typing import TypeVar, Generic, Optional, Dict, Any
Unused imports
src/modalities/dataloader/codecs.py
Outdated
# write image to buffer
with Image.open(img_file_path) as img:
    img.save(buf, format=self._format)
# retuen buffer content
typo
from typing import TypeVar, Generic, Optional, Dict, Any

from io import BytesIO
from PIL import Image
missing pip install
index = [(4, 24), (28, 40), (68, 12), (80, 4)]  # [(index,len), ...] -> in 4 bytes #lengths: 6,10,3,1
data += codecs_bytes
# index
index = [(16, 24), (40, 28), (68, 12), (80, 16)]  # [(index,len), ...] -> in 4 bytes #lengths: 6,10,3,1
I am unsure whether this index is correct. I think it should be index = [(16, 24), (40, 40), (80, 12), (92, 4)]. Maybe I have not understood the concept. Could you please clarify, @ndoll1998?
The specific values of the index shouldn't matter as long as they follow the schema (begin, length); both index lists do.
Take, for example, the index item (16, 24). It refers to the sequence [0, 1, 2, 3, 4, 5]: it begins at byte 16, right after the header bytes, and has a length of 24 bytes, i.e. 6 integer values of 4 bytes each.
Since this is just a dummy example, the specific sequences in the dataset are not of interest.
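To make the (begin, length) schema discussed above concrete, here is a minimal sketch. It is not the code from this PR: the 16-byte header, the little-endian 4-byte integers, and the helper names build_index and read_sample are all assumptions made for illustration. It computes the index a contiguous layout would yield for sample lengths 6, 10, 3, 1 and reads samples back from a dummy buffer.

```python
import struct

HEADER_SIZE = 16  # assumption: header occupies the first 16 bytes
INT_SIZE = 4      # assumption: each value is stored as a 4-byte integer


def build_index(lengths, header_size=HEADER_SIZE, int_size=INT_SIZE):
    """Hypothetical helper: contiguous (begin, length) pairs, both in bytes."""
    index = []
    offset = header_size
    for n_values in lengths:
        n_bytes = n_values * int_size
        index.append((offset, n_bytes))
        offset += n_bytes
    return index


def read_sample(buf, entry, int_size=INT_SIZE):
    """Hypothetical helper: decode the integers that an index entry points at."""
    begin, length = entry
    count = length // int_size
    # "<" selects little-endian byte order, an assumption of this sketch
    return list(struct.unpack_from(f"<{count}i", buf, begin))


lengths = [6, 10, 3, 1]
values = list(range(sum(lengths)))  # dummy payload: 0..19
data = bytes(HEADER_SIZE) + struct.pack(f"<{len(values)}i", *values)

index = build_index(lengths)
first = read_sample(data, index[0])
```

Under these assumptions, index comes out as [(16, 24), (40, 40), (80, 12), (92, 4)], and first is [0, 1, 2, 3, 4, 5]. An entry such as (40, 28) from the test can also be decoded by read_sample; it simply selects a different 7-value slice of the same buffer, which is consistent with the point that any in-bounds (begin, length) pair follows the schema.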
cb6e816 to 179052b
Will this PR be continued or can it be closed? @spravil @ndoll1998
No description provided.