This repository has been archived by the owner on Aug 10, 2023. It is now read-only.

Releases: hfxunlp/transformer

v0.3.8

10 Aug 06:13
635c75c

support pre-trained models (BERT, RoBERTa, BART, T5, MBART);
add regression loss, bucket relative positional encoding and self-dependency units;
support compression (gz, bz2, xz) and character-level text data processing (see the first sketch after this list);
support Unicode standardization and Chinese desegmentation;
configure BF16/FP16 and inference mode for PyTorch (see the second sketch after this list);
fixes & enhancements.
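
The compression support presumably comes down to picking an open function by file extension; a minimal sketch of that idea (the helper name is illustrative, not the repository's actual API):

```python
import bz2
import gzip
import lzma

# Illustrative helper (not the repository's actual API): choose the open
# function by extension so .gz/.bz2/.xz corpora read like plain text files.
_OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}

def open_text(path, mode="rt", encoding="utf-8"):
    for suffix, opener in _OPENERS.items():
        if path.endswith(suffix):
            return opener(path, mode, encoding=encoding)
    return open(path, mode, encoding=encoding)

# Example: iterate over a gzip-compressed training file.
with open_text("train.en.gz") as f:
    for line in f:
        pass  # tokenize / clean the line here
```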
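
The BF16/FP16 and inference-mode configuration most likely maps onto standard PyTorch facilities; a minimal sketch of how these are typically combined (the model here is a placeholder, not code from this repository):

```python
import torch

# Prefer BF16 on GPUs that support it, otherwise fall back to FP16.
amp_dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float16

model = torch.nn.Linear(512, 512).cuda().eval()   # placeholder for a trained model
x = torch.randn(8, 512, device="cuda")

# inference_mode() disables autograd bookkeeping more aggressively than no_grad(),
# and autocast runs eligible ops in the reduced-precision dtype.
with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=amp_dtype):
    y = model(x)
```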

v0.3.7

07 May 00:53
Pre-release

add constrained decoder;
support GLU;
fixes & enhancements.

Hello 2022 :-)

v0.3.6

23 Dec 11:28
Pre-release

support hard retrieval attention;
fixes & enhancements.

Bye 2021 :-)

v0.3.5

16 Aug 03:50
Pre-release

support multilingual NMT;
support contiguous model parameters;
add sentencepiece (spm) support (see the sketch after this list);
add a C backend for core modules (this saves resources but is slower than the Python backend);
cleanup & enhancements (the class components of transformer.Encoder/Decoder(s) have changed, so model files from previous commits cannot be loaded correctly).
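
The sentencepiece integration presumably follows the usual spm workflow; a rough sketch with the standard Python bindings (file names and options here are illustrative):

```python
import sentencepiece as spm

# Train a subword model on the raw corpus (options are illustrative).
spm.SentencePieceTrainer.train(
    input="train.txt", model_prefix="spm_model",
    vocab_size=32000, character_coverage=0.9995, model_type="unigram",
)

# Encode / decode with the trained model.
sp = spm.SentencePieceProcessor(model_file="spm_model.model")
pieces = sp.encode("Hello world", out_type=str)
text = sp.decode(pieces)
```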

v0.3.4

25 Jun 00:56
Pre-release

support MultiGPUOptimizer;
support word translation probe and MHPLSTM;
disable FFN inside AAN by default;
support cleaning data with many repeated tokens (see the sketch after this list);
cleanup & enhancements.
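
The repeated-token cleaning presumably filters lines dominated by a few tokens; a simple heuristic sketch of that idea (the threshold and helper are illustrative, not the repository's implementation):

```python
from collections import Counter

def too_repetitive(line, max_ratio=0.5):
    """Return True if the most frequent token covers more than
    max_ratio of the line (a simple repetition heuristic)."""
    tokens = line.split()
    if not tokens:
        return True
    most_common = Counter(tokens).most_common(1)[0][1]
    return most_common / len(tokens) > max_ratio

clean = [l for l in open("train.txt", encoding="utf-8") if not too_repetitive(l)]
```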

v0.3.3

22 Feb 00:43
Pre-release

improve decoding efficiency by moving the decoding cache from attention inputs to attention hidden states (see the sketch after this list);
support shared vocabulary pruning of trained models.
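
Caching the projected keys/values (the attention hidden states) rather than the raw inputs avoids re-projecting the whole prefix at every decoding step; a minimal self-attention sketch of the idea (class and variable names are illustrative, not the repository's actual modules):

```python
import torch
import torch.nn as nn

class CachedSelfAttention(nn.Module):
    """Illustrative single-head sketch: cache projected K/V instead of raw inputs."""
    def __init__(self, d_model):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)

    def forward(self, x_step, cache=None):
        # x_step: (batch, 1, d_model), the newest target token only.
        k, v = self.k(x_step), self.v(x_step)
        if cache is not None:
            k = torch.cat([cache[0], k], dim=1)   # reuse previously projected keys
            v = torch.cat([cache[1], v], dim=1)   # reuse previously projected values
        scores = self.q(x_step) @ k.transpose(1, 2) / k.size(-1) ** 0.5
        out = torch.softmax(scores, dim=-1) @ v
        return out, (k, v)                        # the cache holds attention hiddens
```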

v0.3.2

06 Feb 03:58
Pre-release

fix bugs;
support a fast label smoothing loss (see the sketch below).
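
A label-smoothed cross-entropy can be computed directly from log-probabilities, which is presumably what makes it fast; a minimal sketch of the standard formulation (not necessarily the exact code in this repository):

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, target, smoothing=0.1):
    # logits: (batch, vocab), target: (batch,)
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(dim=-1, index=target.unsqueeze(-1)).squeeze(-1)
    smooth = -log_probs.mean(dim=-1)  # uniform part of the smoothed target
    return ((1.0 - smoothing) * nll + smoothing * smooth).mean()
```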

v0.3.1

20 Jan 03:06
Pre-release

In this release, we:
Support AdaBelief optimizer;
Accelerate zero_grad by enabling set_to_none (see the sketch after this list);
Support RealFormer.
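
set_to_none drops the gradient tensors instead of filling them with zeros, which skips one memset per parameter; the corresponding PyTorch call (model and optimizer below are placeholders):

```python
import torch

model = torch.nn.Linear(8, 8)                       # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

loss = model(torch.randn(4, 8)).sum()
loss.backward()
optimizer.step()

# Instead of filling .grad buffers with zeros, drop them entirely;
# the next backward() recreates them.
optimizer.zero_grad(set_to_none=True)
```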

v0.3.0

31 Aug 03:57
Pre-release

In this release, we:
Move AMP support from apex to torch.cuda.amp, introduced in PyTorch 1.6 (see the first sketch after this list);
Support sampling during greedy decode (for back-translation);
Accelerate the Average Attention Network by replacing the matrix multiplication with a cumulative sum (a typo in this release is fixed in commit ed5eb60; see the second sketch after this list);
Add APE support;
Support the Mish activation function.
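
The torch.cuda.amp path replaces apex with the autocast/GradScaler pair shipped in PyTorch 1.6; a minimal training-step sketch (model, data, and loss are placeholders, not this repository's training loop):

```python
import torch

model = torch.nn.Linear(512, 512).cuda()            # placeholder model
optimizer = torch.optim.Adam(model.parameters())
scaler = torch.cuda.amp.GradScaler()

x = torch.randn(8, 512, device="cuda")
y = torch.randn(8, 512, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():                      # run eligible ops in FP16
    loss = torch.nn.functional.mse_loss(model(x), y)
scaler.scale(loss).backward()                        # scale the loss to avoid FP16 underflow
scaler.step(optimizer)
scaler.update()
```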
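
The Average Attention Network needs, at each target position, the running mean over all previous positions; instead of multiplying by a lower-triangular averaging matrix, the same result follows from a cumulative sum divided by the position index. A small sketch of the equivalence:

```python
import torch

x = torch.randn(2, 7, 16)  # (batch, length, dim)
steps = torch.arange(1, x.size(1) + 1, dtype=x.dtype).view(1, -1, 1)

# Cumulative average via cumsum: O(length * dim) instead of a (length x length) matmul.
avg_fast = x.cumsum(dim=1) / steps

# Reference: multiply by an explicit lower-triangular averaging matrix.
mask = torch.tril(torch.ones(x.size(1), x.size(1))) / steps.view(-1, 1)
avg_ref = mask @ x

assert torch.allclose(avg_fast, avg_ref, atol=1e-5)
```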

v0.2.9

12 Jun 01:39
Pre-release

In this release, we:
adapt to PyTorch 1.5;
explicitly support Lipschitz constrained parameter initialization;
incorporate features: n-gram dropout, dynamic batch sizes (see the sketch after this list), and source phrase representation learning.
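
Dynamic batch sizes here presumably mean batching by token count rather than by a fixed number of sentences, so batches of long sentences shrink automatically; a rough sketch of the idea (not the repository's actual batching code):

```python
def batch_by_tokens(sentences, max_tokens=4096):
    # sentences: list of token lists, ideally sorted by length.
    batch, batch_max_len = [], 0
    for sent in sentences:
        batch_max_len = max(batch_max_len, len(sent))
        # Close the batch once the padded token count would exceed the budget.
        if batch and (len(batch) + 1) * batch_max_len > max_tokens:
            yield batch
            batch, batch_max_len = [], len(sent)
        batch.append(sent)
    if batch:
        yield batch
```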

Sorry, we did not include the updated utils in this release; please add utils/comm.py (a.k.a. utils.comm), which is required by parallel/base.py (a.k.a. parallel.base), or use commit 2b6b22094b545e74b05c075f3daac9c14f16414d instead.