Copy gates in LSTM, GRU ops before applying activations #193

Merged
merged 2 commits into from
May 20, 2024

Conversation

robertknight
Owner

@robertknight robertknight commented May 20, 2024

When the batch size is > 1, the slices of the gates scratch space that extract individual gates are non-contiguous, which makes `sigmoid_in_place` and `tanh_in_place` much slower. See #192. The GRU op already worked around this for `tanh` by using the non-in-place version, which copies its input, instead. This PR applies the same solution to the sigmoid activation in the GRU op and to all activations in the LSTM op.

This is a workaround until #192 is solved more generally.
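As a rough illustration of the idea, here is a minimal sketch in plain Rust with no tensor library. It assumes the gates scratch space is a row-major `[batch, n_gates * hidden]` buffer, so that one gate's values form a single contiguous run only when `batch` is 1. The names `copy_gate` and `sigmoid_of_gate` are hypothetical and are not part of this project's API; the actual change uses the library's own tensor operations.

```rust
/// Logistic sigmoid of a single value.
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Copy one gate out of a row-major `[batch, n_gates * hidden]` buffer
/// (an assumed layout) into a contiguous `[batch, hidden]` buffer.
///
/// In the source buffer, consecutive rows of the same gate are separated by
/// the other gates' values, so the gate is non-contiguous when `batch > 1`.
fn copy_gate(gates: &[f32], batch: usize, n_gates: usize, hidden: usize, gate: usize) -> Vec<f32> {
    let row_len = n_gates * hidden;
    assert!(row_len > 0, "n_gates and hidden must be non-zero");
    assert!(gate < n_gates, "gate index out of range");
    assert_eq!(gates.len(), batch * row_len, "buffer size mismatch");

    let mut out = Vec::with_capacity(batch * hidden);
    for row in gates.chunks_exact(row_len) {
        // Each row contributes one contiguous run of `hidden` values.
        out.extend_from_slice(&row[gate * hidden..(gate + 1) * hidden]);
    }
    out
}

/// Copy the gate first, then apply the activation to the contiguous copy.
/// The loop runs over a plain slice, which is the fast case.
fn sigmoid_of_gate(gates: &[f32], batch: usize, n_gates: usize, hidden: usize, gate: usize) -> Vec<f32> {
    let mut copied = copy_gate(gates, batch, n_gates, hidden, gate);
    for x in copied.iter_mut() {
        *x = sigmoid(*x);
    }
    copied
}
```

The copy costs one extra pass over `batch * hidden` elements, but the activation then operates on contiguous memory instead of a strided view.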
@robertknight robertknight changed the title Copy update and reset gates in GRU op before applying sigmoid activation Copy gates in LSTM, GRU ops before applying activations May 20, 2024
This change was previously applied in the GRU operator to work around
`sigmoid_in_place` and `tanh_in_place` being slow for non-contiguous inputs,
which will be the case if the batch size is > 1.
@robertknight robertknight merged commit 7f24d0f into main May 20, 2024
2 checks passed
@robertknight robertknight deleted the gru-sigmoid-copy branch May 20, 2024 08:49