v0.1.1
🚀 Streaming v0.1.1
Streaming v0.1.1 is released! Install via pip:
pip install --upgrade mosaicml-streaming==0.1.1
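To confirm that the upgrade took effect, the installed version can be read back from package metadata. This is a minimal sketch using only the Python standard library (3.8+); the helper name `installed_version` is hypothetical and not part of Streaming.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed version of the distribution named in the pip command above,
# or None if it is not installed in the current environment.
print(installed_version("mosaicml-streaming"))
```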
What's Changed
- Streaming datasets V2 by @knighton in #2
- Initial Docs Site by @bandish-shah in #3
- Added ADE20K and COCO2017 data conversion scripts by @karan6181 in #5
- Added pre-commit config by @karan6181 in #6
- Added pre-commit config for a License Header by @karan6181 in #7
- Convert relative imports to absolute imports by @karan6181 in #8
- C4 dataset by @knighton in #4
- Add an ADE20K streaming dataset class by @karan6181 in #9
- PyPI mods for setup.py by @bandish-shah in #10
- Disable local shard deletion by @knighton in #12
- Add a COCO streaming dataset class by @karan6181 in #13
- Add docstrings. by @knighton in #14
- Added unittest for Writer and Reader by @karan6181 in #16
- added new streaming logos by @ejyuen in #15
- Update package version code for unification by @karan6181 in #17
- Fix wait-for-unzip race by @knighton in #18
- Added algolia search to streaming docs site by @nqn in #19
- Add a pre-commit GitHub workflow by @karan6181 in #21
- Added pydocstyle and docformatter in pre-commit config by @karan6181 in #20
- Improve algorithmic complexity of sample-to-shard lookup from O(log N) to O(1) by @knighton in #22
- Add enwiki-20200101 streaming dataset by @knighton in #23
- Add submodules to api reference doc by @karan6181 in #24
- Initial Docs site content by @bandish-shah in #11
- Add unittest for compression by @karan6181 in #25
- Fix hang when compression is used but compressed files are not retained by @knighton in #26
- Add long_description for packaging by @bandish-shah in #29
- Update tutorial notebooks to have it run end-to-end by @karan6181 in #30
- Adjustment for last partition bug by @knighton in #27
- Fix preprocessing for English Wikipedia dataset by @knighton in #28
- Fix enwiki dataset by @dskhudia in #31
- Skip pre-commit check for enwiki convert skip to have code parity by @karan6181 in #32
- Update doc and fixed reference links by @karan6181 in #33
- Parallel tfrecord creation, validate sample counts vs MDS by @knighton in #34
- Bump up the version to 0.0.1b by @karan6181 in #35
- Add NLP synthetic dataset jupyter notebook tutorial by @karan6181 in #36
- Add README and CONTRIBUTING guide by @karan6181 in #38
- Typos + copy editing in README by @dblalock in #40
- Re-factor docs tutorials to top-level examples by @bandish-shah in #39
- Fixed typos and update documentation by @karan6181 in #42
- Add CodeQL security scanner and Dependabot workflow by @karan6181 in #43
- Bump gitpython from 3.1.28 to 3.1.29 by @dependabot in #46
- Bump myst-parser from 0.16.1 to 0.18.1 by @dependabot in #47
- Add bug report and feature request template by @karan6181 in #48
- mlperf enwiki conversion code mild cleanup by @knighton in #41
- Add Build publish to PyPI and create GitHub release workflow by @karan6181 in #50
- Added writer unittest and update existing test by @karan6181 in #52
- Bump version to 0.1.0 by @karan6181 in #53
- Fixed dead image link in pypi home page by @karan6181 in #54
- Add TorchVision VisionDataset inheritance. by @knighton in #55
- bump version to 0.1.1b0 by @karan6181 in #56
- Fixed rendering of pypi image by @karan6181 in #59
- Bump version to 0.1.1 by @karan6181 in #60
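The sample-to-shard lookup change in #22 is listed above only by its complexity, O(log N) to O(1). The release notes do not describe how that PR implements it, so the following is a generic illustration of the trade-off, not the library's code: a binary search over cumulative shard sizes versus a precomputed per-sample table. All function names here are hypothetical, and the sketch assumes every shard's sample count is known up front.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(samples_per_shard):
    """Cumulative counts: offsets[i] is the total number of samples in shards 0..i."""
    return list(accumulate(samples_per_shard))


def lookup_bisect(offsets, sample_id):
    """Logarithmic lookup: binary search over the cumulative shard offsets."""
    return bisect_right(offsets, sample_id)


def build_table(samples_per_shard):
    """Precompute one shard index per sample (memory grows with the sample count)."""
    table = []
    for shard_id, count in enumerate(samples_per_shard):
        table.extend([shard_id] * count)
    return table


def lookup_table(table, sample_id):
    """Constant-time lookup: a single list index."""
    return table[sample_id]


# Three shards holding 3, 2, and 4 samples (illustrative numbers only).
sizes = [3, 2, 4]
offsets = build_offsets(sizes)
table = build_table(sizes)

# Both strategies agree on every sample index.
assert all(lookup_bisect(offsets, i) == lookup_table(table, i) for i in range(sum(sizes)))
```

The table variant trades memory for speed: it stores one entry per sample, whereas the binary search stores one entry per shard.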
New Contributors
- @knighton made their first contribution in #2
- @bandish-shah made their first contribution in #3
- @karan6181 made their first contribution in #5
- @ejyuen made their first contribution in #15
- @nqn made their first contribution in #19
- @dskhudia made their first contribution in #31
- @dblalock made their first contribution in #40
- @dependabot made their first contribution in #46
Full Changelog: https://github.com/mosaicml/streaming/commits/v0.1.1