[ENH] TemporalFusionTransformer - allow mixed precision training #1518
base: main
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

@@ Coverage Diff @@
##           master    #1518   +/-   ##
=======================================
  Coverage   90.19%   90.19%
=======================================
  Files          30       30
  Lines        4724     4724
=======================================
  Hits         4261     4261
  Misses        463      463
=======================================
Thanks!

From the API perspective, we need to address the problem that this changes default behaviour, and thus may break or change user code downstream.

What we need to do is expose the constant used in `masked_fill` as a parameter of `ScaledDotProductAttention`, and pass it through up to `TemporalFusionTransformer`, which means also exposing it in `InterpretableMultiHeadAttention`.

The default must be left at 1e9 for now (to avoid breaking or changing other people's code), but the new parameter will allow you to set `float("inf")`.

The docstrings of the modules and components should also describe the new parameter. Where classes have no docstring, it would be great if you could add one, but that is not blocking.

I would also appreciate feedback on which value is better as a default - even if we cannot change it right now, we could after a couple of releases, giving users a forewarning that the default will change.
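To make the request concrete, here is a minimal sketch (not the library's actual code) of how the fill constant could be exposed as a constructor argument. The parameter name `masked_fill_value` is an assumption for illustration; in the real change it would also need to be threaded through `InterpretableMultiHeadAttention` and `TemporalFusionTransformer`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ScaledDotProductAttention(nn.Module):
    """Scaled dot-product attention with a configurable mask fill constant (sketch)."""

    def __init__(self, dropout: float = 0.0, masked_fill_value: float = 1e9):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        # Default stays at 1e9 to preserve current behaviour;
        # pass float("inf") to make the module safe for mixed precision.
        self.masked_fill_value = masked_fill_value

    def forward(self, q, k, v, mask=None):
        # attention scores: (batch, len_q, len_k)
        attn = torch.bmm(q, k.transpose(1, 2)) / (k.size(-1) ** 0.5)
        if mask is not None:
            # masked positions are pushed to -masked_fill_value before softmax
            attn = attn.masked_fill(mask, -self.masked_fill_value)
        attn = self.dropout(F.softmax(attn, dim=-1))
        return torch.bmm(attn, v), attn
```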
TemporalFusionTransformer - allow mixed precision training
Description
This PR changes the attention mask fill value in the TFT model from 1e9 to float("inf") to allow PyTorch mixed precision training.
Closes #1325, closes #285
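For context, here is a minimal repro sketch (not part of the PR) of why the constant matters under mixed precision: a scalar of magnitude 1e9 is outside the fp16 range (which tops out around 6.5e4), so filling a half-precision tensor with it typically fails with an overflow error, while infinity is representable and softmax handles it cleanly.

```python
import torch

scores = torch.randn(2, 4, 4, dtype=torch.float16)  # fp16 attention scores
mask = torch.eye(4, dtype=torch.bool)                # mask the diagonal as an example

try:
    # On most PyTorch builds this raises:
    # "value cannot be converted to type at::Half without overflow"
    scores.masked_fill(mask, -1e9)
except RuntimeError as err:
    print("fp16 + 1e9 fill:", err)

# -inf is representable in fp16, so the fill succeeds
masked = scores.masked_fill(mask, -float("inf"))
# cast up only so softmax/print works on older CPU builds; the fill itself is fp16
print(masked.float().softmax(dim=-1))
```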