Releases: kakaobrain/torchgpipe
v0.0.7
v0.0.6
v0.0.5
Released on November 29, 2019.
Featured
`@skippable` for efficient skip connections. With this interface, GPipe copies skip tensors directly to the destination device.
Improvements
- Checkpointing deterministically handles randomness managed by PyTorch.
- `balance_by_size()` analyzes parameters as well.
Breaking Changes
- Moved the `torchgpipe_balancing` module to `torchgpipe.balance`.
- Redesigned the interfaces of `balance_by_time()` and `balance_by_size()`.
v0.0.4
Released on October 8, 2019.
- Reduced GPU memory fragmentation by caching CUDA streams for copy.
- Fixed potential GPU memory violation on tuple of multiple tensors.
- Fixed potential GPU memory violation on shifted view tensors. (issue #27366 and pull request #27371 on PyTorch)
v0.0.3
Released on September 30, 2019.
Featured
torchgpipe now overlaps copy and computation using separate CUDA streams. Previously, a GPU could not compute a partition while micro-batches were being copied across GPUs, because both operations ran on the same default CUDA stream.
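The overlap pattern described above can be sketched generically in plain PyTorch; this is not torchgpipe's internal code, just an illustration of issuing a copy on a dedicated stream so the default stream stays free for computation. It falls back to a no-op on CPU-only hosts:

```python
import torch

if torch.cuda.is_available():
    copy_stream = torch.cuda.Stream()
    x_cpu = torch.rand(1024, 1024, pin_memory=True)

    # Issue the host-to-device copy on a dedicated stream.
    with torch.cuda.stream(copy_stream):
        x_gpu = x_cpu.to('cuda', non_blocking=True)

    # Make the default stream wait for the copy before computing on x_gpu.
    torch.cuda.current_stream().wait_stream(copy_stream)
    y = x_gpu @ x_gpu  # computation on the default stream
```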
Other Improvements
- Added support for PyTorch 1.2.
- Redesigned the internal pipeline parallelism to represent dependencies transparently.
- Fixed the hanging issue when an exception is raised in a partition.
- Fixed the unintended size accumulation (#3 by @842974287) in `balance_by_size()`.
Breaking Changes
- No more support for PyTorch 1.0.
- Changed the type of `GPipe.devices` from `tuple` to `list`.
- Removed `current_microbatch()`. This approach turned out to be incompatible with checkpointing.
v0.0.2
Released on June 26, 2019.
- Added support for PyTorch 1.1.
- Refined public APIs.
- Detailed documentation.
- Proper exceptions for invalid usage.
- Provided automatic balancing.
- Provided inspecting utilities: `current_microbatch()` and `is_recomputing()`.
- Reimplemented deferred batch normalization by subclassing.