There are still a number of open issues which should be done for the first "official" release (or at least addressed, or maybe deliberately postponed) (#32).
Even before that, I think the current functionality is already in a usable state which could simplify certain things for some applications. E.g. some part of your model or maybe of the loss could already be implemented using RETURNN common.
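To illustrate, here is a minimal sketch of what such a model part could look like with the `nn` API. This is only a sketch: the exact names and signatures (e.g. `nn.FeatureDim`, whether `nn.Linear` takes both `in_dim` and `out_dim`) follow one development snapshot and may differ in your checkout, since the API is still experimental.

```python
# Hedged sketch only; names/signatures follow one returnn_common snapshot.
from returnn_common import nn

feat_dim = nn.FeatureDim("feat", 40)
hidden_dim = nn.FeatureDim("hidden", 512)


class FeedForwardPart(nn.Module):
    """A small model part which could live inside a larger, otherwise raw-dict setup."""

    def __init__(self):
        super().__init__()
        # assumed signature: nn.Linear(in_dim, out_dim)
        self.linear = nn.Linear(feat_dim, hidden_dim)

    def __call__(self, x: nn.Tensor) -> nn.Tensor:
        return nn.relu(self.linear(x))
```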
So, what things are currently blocking us from already using it that way?
- Somewhat stable API. It should not break much anymore. I think we already have this, at least for the model definition. (Although, please keep in mind it is still experimental, and not guaranteed.)
- Partial serialization of sub network dicts, including dim tags. Currently it's only for the full network. Although you could also wrap your existing raw net dict and then it would work already, so maybe this is not really needed? (A plain-Python sketch of that wrapping idea follows after this list.) **Edit:** Not really needed.
- How to handle Sisyphus hashes (#51)? We can already do experiments without this and rely on the basic net dict for the hash. However, I expect that the produced net dict will still change, so this will probably cause you some trouble.
- Examples: TransformerDecoder as an example for search, or stochastic layers (stochastic depth, #99) as an example for using the train flag (a conceptual sketch follows after this list). What else?
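On the wrapping idea from the serialization point above: independent of any returnn_common helper, you can always splice a generated sub network dict into an existing raw net dict via RETURNN's standard `subnetwork` layer class. All surrounding names here are made up for illustration, and note that dim tags are exactly what this plain approach does not cover.

```python
# Plain-Python sketch: splice a generated sub net dict into a raw net dict.
# Layer names and dims are made up; "subnetwork" is a standard RETURNN layer class.

raw_net_dict = {
    "encoder": {"class": "linear", "activation": "relu", "n_out": 512, "from": "data"},
    "output": {"class": "softmax", "loss": "ce", "from": "my_part"},
}

generated_sub_net = {  # e.g. produced by some returnn_common module
    "hidden": {"class": "linear", "activation": "tanh", "n_out": 256, "from": "data"},
    "output": {"class": "copy", "from": "hidden"},
}

raw_net_dict["my_part"] = {
    "class": "subnetwork",
    "from": "encoder",
    "subnetwork": generated_sub_net,
}
```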
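And on the stochastic depth example: the underlying idea is simple enough to show framework-free. This is only the concept behind #99, not the returnn_common API; in returnn_common it would additionally need the train flag from the network context.

```python
# Conceptual sketch of stochastic depth: a residual block randomly skipped in training.
import random


def stochastic_depth_block(x, layer_fn, survive_prob, train):
    """Apply `layer_fn` as a residual block with stochastic depth."""
    if train:
        if random.random() < survive_prob:
            return x + layer_fn(x)  # block survives this training step
        return x  # block is skipped entirely
    # At inference time, always apply, scaled by the survival probability.
    return x + survive_prob * layer_fn(x)


# e.g. stochastic_depth_block(1.0, lambda v: 2.0 * v, survive_prob=0.8, train=True)
```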
I now just picked things which I see as critical before this can really be used. For further things, see #32.

I think this can be closed for now. It can already be used, even when not having the Sisyphus hashes solved yet (although you should keep that in mind, of course). E.g. you could handle that manually, or just use the net dict as the hash, but then take into consideration that the hash will likely change in the future.
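To make the "net dict as hash" workaround concrete, here is a minimal sketch using only the stdlib. The helper name is made up, and it assumes all values in the dict have stable reprs, which dim tags may not have.

```python
# Sketch of a manual hash over the generated net dict; not Sisyphus API.
import hashlib
import json


def net_dict_hash(net_dict):
    """Stable digest of a net dict. It changes whenever the generated net dict
    changes, which is exactly the caveat mentioned above."""
    blob = json.dumps(net_dict, sort_keys=True, default=repr).encode("utf8")
    return hashlib.sha256(blob).hexdigest()
```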