SignPuddle

The Swiss-French SignPuddle includes around 5,000 entries written in SignWriting, each accompanied by an illustration.

Data

We collected the data using collect_data.py.

Issues

Since this data is unstructured, we assume all png files are illustrations, and all jpg files are pictographs.
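The extension-based assumption above can be sketched as follows. This is a minimal illustration, not the logic in collect_data.py; the names `KIND_BY_SUFFIX`, `kind_of`, and `group_by_kind` are hypothetical.

```python
from pathlib import Path

# Assumed mapping, taken from the heuristic stated in this README.
# It is NOT read from collect_data.py.
KIND_BY_SUFFIX = {".png": "illustration", ".jpg": "pictograph"}


def kind_of(path):
    """Return 'illustration', 'pictograph', or None for any other extension."""
    return KIND_BY_SUFFIX.get(Path(path).suffix.lower())


def group_by_kind(paths):
    """Group file paths by assumed kind, skipping unrecognized extensions."""
    groups = {"illustration": [], "pictograph": []}
    for p in paths:
        kind = kind_of(p)
        if kind is not None:
            groups[kind].append(str(p))
    return groups
```

Because the rule looks only at the file extension, it cannot tell a clean illustration from a noisy or corrupted one, which is why the types below still need separate handling.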

This means that we have multiple types of data:

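| Type                    | Example               |
|-------------------------|-----------------------|
| Noise                   | (image: noise)        |
| No-person Illustration  | (image: illustration) |
| Blurry Illustration     | (image: illustration) |
| HD Illustration         | (image: illustration) |
| HD Colored Illustration | (image: illustration) |
| Corrupted               | (image: illustration) |
| Blurry Image + Arrows   | (image: illustration) |
| HD Image                | (image: illustration) |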

Ideally, we would classify these images and add a corresponding "tag" to each image's prompt. Corrupted and noise images should be excluded from training.
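One possible shape for this tagging step is sketched below. It assumes each image already has a type label from the list above, assigned by hand or by a classifier that does not yet exist. The tag format (`<hd-illustration> ...`) and the names `EXCLUDED_TYPES`, `tag_prompt`, and `build_training_set` are hypothetical.

```python
# Types that should never reach training, per the note above.
EXCLUDED_TYPES = {"noise", "corrupted"}


def tag_prompt(prompt, image_type):
    """Prefix the prompt with a tag derived from the image type.

    Returns None when the type is excluded from training.
    The "<tag> prompt" format is an assumption, not an existing convention.
    """
    label = image_type.strip().lower()
    if label in EXCLUDED_TYPES:
        return None
    # "Blurry Image + Arrows" -> "blurry-image-arrows"
    tag = label.replace(" + ", "-").replace(" ", "-")
    return f"<{tag}> {prompt}"


def build_training_set(examples):
    """Tag (prompt, image_type) pairs and drop excluded types."""
    tagged = (tag_prompt(prompt, image_type) for prompt, image_type in examples)
    return [t for t in tagged if t is not None]
```

For example, `build_training_set([("bonjour", "HD Illustration"), ("merci", "Noise")])` keeps only the first entry and returns it with its tag prepended.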