This project provides a framework for creating repeatable infinite streams of data samples, with an emphasis on computer vision data. The main motivation is (of course) deep learning: most deep models must process many samples during training. These samples have to be drawn from a dataset and bundled into batches that can be processed simultaneously on a GPU. Besides sampling, another important concept in deep learning for computer vision is data augmentation, where images are passed through several image processing steps to increase data diversity in a controlled manner.
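To make "repeatable infinite stream" concrete, here is a minimal plain-Python sketch. It does not use the PixelPipes API; the names `sample` and `batch`, the seed-mixing constant, and the toy nested-list "images" are all assumptions made for illustration. The idea it shows is that every sample is a pure function of its position in the stream, so any position can be regenerated exactly.

```python
import random

def sample(index, dataset, crop=2, seed=0):
    """Return the sample at stream position `index` (hypothetical helper).

    The random generator depends only on (seed, index), so the same
    position always yields the same image, crop and flip decision.
    """
    rng = random.Random(seed * 1_000_003 + index)
    image = rng.choice(dataset)                    # sampling step
    top = rng.randint(0, len(image) - crop)        # random crop origin
    left = rng.randint(0, len(image[0]) - crop)
    patch = [row[left:left + crop] for row in image[top:top + crop]]
    if rng.random() < 0.5:                         # augmentation: horizontal flip
        patch = [row[::-1] for row in patch]
    return patch

def batch(start, size, dataset):
    """Bundle `size` consecutive stream positions into one batch."""
    return [sample(i, dataset) for i in range(start, start + size)]

# Three toy 4x4 "images" stored as nested lists of integers.
dataset = [[[16 * k + 4 * r + c for c in range(4)] for r in range(4)]
           for k in range(3)]

first = batch(0, 8, dataset)
again = batch(0, 8, dataset)   # regenerating the same positions
```

Because nothing depends on global state, batches can be produced in any order, or by several workers at once, and still reproduce the same stream.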
PixelPipes combines sampling and augmentation into a single data-generation pipeline. The pipeline is first described as a computational graph in Python. It is then transformed into an operation pipeline that is executed in C++, which avoids the GIL and enables efficient use of multiple threads with shared access to memory structures.
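The describe-then-transform idea can be sketched in a few lines of plain Python. This is not the PixelPipes API and it runs entirely in Python rather than C++; the graph layout, the node names, and the helpers `compile_graph` and `run` are hypothetical. It only shows how a graph of named nodes can be flattened into a linear list of operations that is then executed in order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical graph description: node name -> (function, input node names).
graph = {
    "index":  (lambda: 3, ()),
    "double": (lambda x: 2 * x, ("index",)),
    "offset": (lambda x: x + 1, ("index",)),
    "sum":    (lambda a, b: a + b, ("double", "offset")),
}

def compile_graph(graph):
    """Flatten the graph into a list of operations ordered by dependency."""
    dependencies = {name: set(deps) for name, (_, deps) in graph.items()}
    order = TopologicalSorter(dependencies).static_order()
    return [(name, *graph[name]) for name in order]

def run(pipeline):
    """Execute the flattened operations, storing each node's value by name."""
    values = {}
    for name, function, deps in pipeline:
        values[name] = function(*(values[d] for d in deps))
    return values

pipeline = compile_graph(graph)
result = run(pipeline)
```

Once the graph is a flat list, the executor no longer needs to know how it was described, which is what makes it possible to hand execution to a different runtime.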
The package can be installed as a Python wheel, currently from a testing PyPI-compatible repository:
> pip install pixelpipes -i https://data.vicos.si/lukacu/pypi/
Below is an example of a Python script that constructs a very simple graph for sampling images from a directory and randomly cropping and augmenting them. Other, more complex examples are available in the documentation.
TODO
The documentation is hosted at ReadTheDocs:
- Index
- Quick start
- Tutorials
- API
- Extending
- Development
The development of this package was supported by the Slovenian Research Agency (ARRS) projects Z2-1866, J2-316 and J7-2596.