Generalize Loop.unstack to unstack with state #202

albertz · 2022-09-06T08:58:18Z

This came up while implementing a time-synchronous transducer.
You would want to define an align-step function with an interface like:

def next_step(..., state: nn.LayerState) -> (..., nn.LayerState)

Inside this function, you need to access the encoder output. Because the model is time-synchronous, you would naturally use something like loop.unstack(enc_out). However, this needs access to the Loop instance and implicitly assumes that there is a loop around it.
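
To illustrate the problem, here is a minimal pure-Python sketch. ToyLoop is a hypothetical stand-in, not the actual Loop class from RETURNN-common; it only shows that the read position lives in the loop object, so the step function cannot be called without one.

```python
from typing import List, Sequence


class ToyLoop:
    """Hypothetical stand-in for a loop context (not the real Loop API)."""

    def __init__(self, num_steps: int):
        self.num_steps = num_steps
        self.i = 0  # current iteration index, owned by the loop

    def unstack(self, source: Sequence[List[float]]) -> List[float]:
        # The position comes from the loop itself, not from any argument.
        return source[self.i]


def next_step(loop: ToyLoop, enc_out: Sequence[List[float]]) -> List[float]:
    # The step function is tied to the surrounding loop instance.
    return loop.unstack(enc_out)


enc_out = [[1.0], [2.0], [3.0]]
loop = ToyLoop(len(enc_out))
outputs = []
while loop.i < loop.num_steps:
    outputs.append(next_step(loop, enc_out))
    loop.i += 1
```

Here next_step cannot be used outside of the while loop, because nothing except the loop knows the current position.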

We want to avoid this, i.e. the assumption that this runs inside a loop. Just the layer state should be enough.

So, can we generalize this?

Such input could already be part of the next_step function.

Loop.unstack is basically a special case of nn.gather. However, it is much more efficient in terms of runtime and memory, so it is very important that it is used whenever possible. This could happen via automatic optimization.

The state would be the current position (default init is 0) and the state update would increase it.
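
A rough sketch of what this could look like, in plain Python rather than the actual nn API. The names UnstackState and unstack_step are made up for illustration, and the increment by one per step is an assumption. The point is only that the position is carried in the state, so no loop object is needed.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class UnstackState:
    """Hypothetical state: just the current read position."""
    pos: int = 0  # default init is 0


def unstack_step(
    source: Sequence[List[float]], state: UnstackState
) -> Tuple[List[float], UnstackState]:
    # Equivalent to a gather at the explicit position state.pos ...
    frame = source[state.pos]
    # ... followed by the state update (assumed here to be +1 per step).
    return frame, UnstackState(pos=state.pos + 1)


enc_out = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
state = UnstackState()
frames = []
for _ in range(len(enc_out)):
    frame, state = unstack_step(enc_out, state)
    frames.append(frame)
```

Because the position starts at 0 and only ever advances by one, this access pattern is exactly what Loop.unstack does, which is what an automatic optimization would need to detect.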

What API to provide for the user?

rec_unstack (actually exists, but without state, so extend or change that)?
Or gather, and position is just explicit?

Where do we want to do the optimization?

RETURNN level. Mostly in the GatherLayer. But also the pos idx state increase (EvalLayer or CombineLayer). Should have tests for proper optimization, both in RETURNN and RETURNN-common.
RETURNN-common. Also in gather?
