Memap dataset for multimodal data #47

spravil · 2024-02-05T09:09:01Z

No description provided.

thomaschhh · 2024-02-05T15:00:40Z

@ndoll1998, please make sure to enable the pre-commit hooks.

thomaschhh · 2024-02-05T15:02:13Z

src/modalities/dataloader/codecs.py

@@ -0,0 +1,111 @@
+from abc import ABC, abstractmethod
+from typing import TypeVar, Generic, Optional, Dict, Any


Unused imports

thomaschhh · 2024-02-05T15:03:33Z

src/modalities/dataloader/codecs.py

+        # write image to buffer
+        with Image.open(img_file_path) as img:
+            img.save(buf, format=self._format)
+        # retuen buffer content


thomaschhh · 2024-02-05T15:04:28Z

src/modalities/dataloader/codecs.py

+from typing import TypeVar, Generic, Optional, Dict, Any
+
+from io import BytesIO
+from PIL import Image


missing pip install

manasMauryax · 2024-02-06T08:22:49Z

tests/conftest.py

-    index = [(4, 24), (28, 40), (68, 12), (80, 4)]  # [(index,len), ...] -> in 4 bytes #lengths: 6,10,3,1
+    data += codecs_bytes
+    # index
+    index = [(16, 24), (40, 28), (68, 12), (80, 16)]  # [(index,len), ...] -> in 4 bytes #lengths: 6,10,3,1


I am unsure whether this index is correct. I think it should be index = [(16, 24), (40, 40), (80, 12), (92, 4)]. Maybe I have not understood the concept. Could you please clarify, @ndoll1998?

ndoll1998 · 2024-02-19

The specific values of the index shouldn't matter as long as they follow the schema (begin, length). Both index lists do so.

Take, for example, the index item (16, 24). It refers to the sequence [0, 1, 2, 3, 4, 5]: it begins at byte 16, right after the header bytes, and has a length of 24 bytes, i.e. 6 integer values of 4 bytes each.

Given that this is just a dummy example, the specific sequences in the dataset are not of interest.
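To make the (begin, length) schema discussed above concrete, here is a minimal sketch. It assumes a 16-byte header and 4-byte little-endian signed integers; the byte order and the helper names `read_item` and `build_index` are illustrative assumptions, not code from this PR.

```python
import struct
from typing import List, Tuple

HEADER_SIZE = 16  # assumption: the header occupies the first 16 bytes
INT_SIZE = 4      # each integer is stored in 4 bytes


def read_item(buf: bytes, index_item: Tuple[int, int]) -> List[int]:
    """Decode one sequence from `buf` given an index item (begin, length) in bytes."""
    begin, length = index_item
    if length % INT_SIZE != 0:
        raise ValueError("length must be a multiple of the integer size")
    count = length // INT_SIZE
    # Assumed little-endian signed 32-bit integers ("<i").
    return list(struct.unpack_from(f"<{count}i", buf, begin))


def build_index(lengths: List[int]) -> List[Tuple[int, int]]:
    """Build a contiguous index for sequences with the given numbers of integers."""
    index = []
    offset = HEADER_SIZE
    for n in lengths:
        index.append((offset, n * INT_SIZE))
        offset += n * INT_SIZE
    return index


# Dummy file content: an all-zero header followed by the integers 0..5.
data = bytes(HEADER_SIZE) + struct.pack("<6i", *range(6))

sequence = read_item(data, (16, 24))
print(sequence)  # [0, 1, 2, 3, 4, 5]

# Contiguous layout for sequences of 6, 10, 3 and 1 integers.
print(build_index([6, 10, 3, 1]))  # [(16, 24), (40, 40), (80, 12), (92, 4)]
```

Any index item is readable as long as begin and length stay within the data section, which is why a non-contiguous dummy index can still pass the tests.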

fromm-m · 2024-06-17T12:26:20Z

Will this PR be continued or can it be closed? @spravil @ndoll1998

Niclas Doll added 3 commits February 4, 2024 20:23

refactor: Rewrite class resolver register to be independent of specific app config

ac6c94c

feat: Support multiple modalities in MemMapDataset

19a1d85

test: Adapt test to new memory format of MemMapDataset and implement tests for image and multimodal data

307733b

spravil assigned ndoll1998 Feb 5, 2024

fix: formatting

1feaaf2

thomaschhh reviewed Feb 5, 2024

manasMauryax reviewed Feb 6, 2024

thomaschhh and others added 13 commits February 8, 2024 11:25

feat: add audio codec WIP

a5ae389

feat: add missing fixture

0af4540

fix: audio codec

49d9c1c

refactor: clean-up test fixtures

220493a

refactor: run pre-commit hooks

7191ba1

fix: add pillow to requirements

70747f7

refactor: fix type

19e95d9

chore: remove unused feature

8945b0c

refactor: TorchaudioAudioCodec init

16d0fb0

fix: add soundfile to requirements for audio processing

ccce2ba

fix: manually resolve config type for codecs to avoid pydantic parsing errors

340fab5

feat: Merge branch 'feat/audio-codec' into feat/multimodal_mem_map_dataset

fdb0815

fix: fixed issue that discards all-zero bytes at the end of a read operation

3a16d6e

thomaschhh marked this pull request as ready for review March 5, 2024 14:08

thomaschhh marked this pull request as draft March 5, 2024 14:13

le1nux force-pushed the main branch 3 times, most recently from cb6e816 to 179052b March 13, 2024 22:14

ndoll1998 Feb 19, 2024
