Dim tags matching should use `allow_same_spatial_dim=False` #865

Zettelkasten · 2021-12-16T19:04:24Z

This is kind of a follow-up to #666.

I currently have the problem that SpatialDim("a", 1) matches SpatialDim("b", 1) in many cases, e.g. in get_common_data, copy_compatible_to and find_matching_dim_map, which are used by GatherLayer, DotLayer, etc.

Looking at the Dim.is_equal code, allow_same_spatial_dim matches exactly if

- both tags are spatial dims (or feature dims if treat_feature_as_spatial is set, which e.g. get_common_data does), and
- the dimensions of self and other are equal, but not None.
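
A minimal sketch of the condition described above. It uses a hypothetical `ToyDim` class, not RETURNN's actual `Dim`; the field names `kind` and `dimension` and the function name are assumptions made purely for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyDim:
    """Hypothetical stand-in for a dim tag; not RETURNN's actual Dim class."""
    name: str
    kind: str  # "spatial" or "feature"
    dimension: Optional[int]  # None means a dynamic size


def matches_as_same_spatial_dim(a: ToyDim, b: ToyDim, treat_feature_as_spatial: bool = False) -> bool:
    """Relaxed match as described in the text: both spatial, equal static size."""

    def is_spatial(d: ToyDim) -> bool:
        return d.kind == "spatial" or (treat_feature_as_spatial and d.kind == "feature")

    if not (is_spatial(a) and is_spatial(b)):
        return False
    # Dynamic sizes (None) never match via this relaxed rule.
    return a.dimension is not None and a.dimension == b.dimension


a = ToyDim("a", "spatial", 1)
b = ToyDim("b", "spatial", 1)
print(matches_as_same_spatial_dim(a, b))  # True: two unrelated tags still match
```

Under this rule the names of the tags play no role at all, which is exactly the behavior the issue is about.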

For my case, I definitely do not want these axes to match (they are not related in any way).

But this kind of matching is currently necessary for something like this:

      "lin1": {"class": "linear", "activation": "sigmoid", "n_out": 5, "from": "data:data"},
      "lin2": {"class": "linear", "activation": "sigmoid", "n_out": 5, "from": "data:data"},
      "output": {"class": "combine", "mode": "add", "from": ["lin1", "lin2"]},  # would give [B,T,F1,F2] with allow_same_spatial_dim=False!!

This is also because LinearLayer currently does not set derived_from_tag on its output feature dim.
If it did, derived_matches would catch this.
We can fix LinearLayer to do this properly (currently this goes via _base_get_out_data_from_opts; when creating a feature dim tag, I think it should just derive it from its input feature dim tag).
For user-specified out_dims, however, we would not automatically set derived_from_tag to the input dim tag.
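
To see why the combine example above would end up as [B,T,F1,F2] under strict matching, here is a small sketch. `ToyDim` and `common_dims_strict` are hypothetical helpers, not RETURNN code; the only assumption is that two dim tags match if and only if they are the same object:

```python
class ToyDim:
    """Hypothetical dim tag; two tags match only if they are the same object."""

    def __init__(self, name, dimension=None):
        self.name = name
        self.dimension = dimension

    def __repr__(self):
        return self.name


def common_dims_strict(*sources):
    """Union of all dims in order of first appearance, matching by identity only."""
    out = []
    for dims in sources:
        for d in dims:
            if not any(d is o for o in out):
                out.append(d)
    return out


batch, time = ToyDim("B"), ToyDim("T")
f1, f2 = ToyDim("F1", 5), ToyDim("F2", 5)  # two independently created feature dims
print(common_dims_strict([batch, time, f1], [batch, time, f2]))  # [B, T, F1, F2]
print(common_dims_strict([batch, time, f1], [batch, time, f1]))  # [B, T, F1]
```

Only when both layers share the very same feature dim tag does the sum stay three-dimensional.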

For other cases, this is more difficult maybe.
E.g. if you have two inputs with different time axes, and then split into the same dim size.
Then, the user would maybe want these axes to match, but they are in no way derived from the same thing.

Another solution is, similar to what @albertz wrote in #666, to separate user defined dim tags and automatically created ones. Then, user defined ones could have a more restrictive matching logic.

albertz · 2021-12-18T02:59:11Z

derived_from_tag/derived_matches is not the right option here. derived_from_tag means that it is derived from this tag (via some op, like +- or so). We use it as a heuristic to assume in the rec layer, when some dim is derived (but unknown yet due to template construction, also this was before the dim math), that it is the same as the orig dim. E.g. we had the case where some people did some extra padding, then convolution with padding="valid" which removed the padding again such that it became the same, and that inside a rec layer, on the attention encoder axis.

The output feature dim is usually never the same as the input feature dim. And also, it is not derived from the input feature dim in any way. They are completely independent. So derived_from_tag doesn't really make sense here.

Zettelkasten · 2022-01-23T13:17:42Z

Ah yeah, you're right, the derived_* logic is not applicable here.

So really, I think specifying n_out: int is the problem at the moment. In many cases currently, if n_in == n_out, we want the in and output dim tags to match. Like in my example above.
Perhaps we could change it so that if n_out: int is specified and n_out == in_dim.dimension, the newly created output dim tag matches the input dim tag (either by being exactly the same tag, or via some other logic).
I wonder if this covers most cases where we needed allow_same_spatial_dim before.

Logically, this does not make sense 100%, because the in and out dims should logically be different, but we matched them before always anyway, and in the future, we discourage anyone from using n_out: int anyway.
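
A rough sketch of that idea, taking the "exactly the same tag" variant. `ToyDim` and `make_out_dim` are hypothetical names for illustration and do not correspond to existing RETURNN functions:

```python
class ToyDim:
    """Hypothetical dim tag with a name and a static size (None if dynamic)."""

    def __init__(self, name, dimension):
        self.name = name
        self.dimension = dimension


def make_out_dim(in_dim, n_out):
    """Reuse the input tag if the int n_out equals its static size, else create a new tag."""
    if in_dim.dimension is not None and n_out == in_dim.dimension:
        return in_dim
    return ToyDim("out-feature", n_out)


in_dim = ToyDim("in-feature", 5)
same = make_out_dim(in_dim, 5)   # same object as in_dim, so it trivially matches
fresh = make_out_dim(in_dim, 7)  # new, unrelated tag
print(same is in_dim, fresh is in_dim)  # True False
```

This only covers the n_in == n_out case. It does not help the example with two linear layers whose n_out differs from the input feature dim, since each layer would still create its own fresh tag.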

Zettelkasten · 2022-01-23T13:25:18Z

Also the current matching logic is just dangerously broken and depends on the order of axes, e.g. this fails currently:

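# Import paths assumed from the RETURNN test suite of that time.
from nose.tools import assert_equal
from returnn.tf.util.data import Data, SpatialDim

def test_same_spatial_dim():
  foo_dim = SpatialDim("foo", 3)
  bar_dim = SpatialDim("bar", 3)
  x = Data("a", dim_tags=[foo_dim, bar_dim])
  y = Data("b", dim_tags=[bar_dim, foo_dim])
  # test with default is_equal_opts=None
  # expected {0: 1, 1: 0}, but this returns {0: 0, 1: 1} currently
  assert_equal(x.find_matching_dim_map(y, other_axes=[0, 1]), {0: 1, 1: 0})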

We should probably at least throw an error here.
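
The order dependence can be reproduced with a toy model. Everything below (`ToyDim`, `match_by_size_greedy`, `match_or_raise`) is a hypothetical sketch and not how find_matching_dim_map is actually implemented; it only illustrates how a size-based heuristic picks the first candidate, and how raising on ambiguity would avoid silent mismatches:

```python
class ToyDim:
    """Hypothetical dim tag with a name and a static size."""

    def __init__(self, name, dimension):
        self.name = name
        self.dimension = dimension


def match_by_size_greedy(self_dims, other_dims):
    """Map each other-axis to the first unused self-axis of equal size (order dependent)."""
    mapping, used = {}, set()
    for other_axis, od in enumerate(other_dims):
        for self_axis, sd in enumerate(self_dims):
            if self_axis not in used and sd.dimension == od.dimension:
                mapping[other_axis] = self_axis
                used.add(self_axis)
                break
    return mapping


def match_or_raise(self_dims, other_dims):
    """Prefer identical tags; fall back to size only if unique, otherwise raise."""
    mapping = {}
    for other_axis, od in enumerate(other_dims):
        exact = [i for i, sd in enumerate(self_dims) if sd is od]
        if len(exact) == 1:
            mapping[other_axis] = exact[0]
            continue
        by_size = [i for i, sd in enumerate(self_dims) if sd.dimension == od.dimension]
        if len(by_size) != 1:
            raise ValueError("ambiguous or missing match for %r: candidates %r" % (od.name, by_size))
        mapping[other_axis] = by_size[0]
    return mapping


foo, bar = ToyDim("foo", 3), ToyDim("bar", 3)
print(match_by_size_greedy([foo, bar], [bar, foo]))  # {0: 0, 1: 1}
print(match_or_raise([foo, bar], [bar, foo]))        # {0: 1, 1: 0}
```

With a third tag of size 3 that appears in neither list, `match_or_raise` raises a ValueError instead of guessing.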

albertz · 2022-10-24T07:22:12Z

Note that I stumbled upon this (or related) specifically for the DotLayer in #1154, and fixed via #1155.
In #1155, I introduced a new behavior version to trigger the more strict matching for DotLayer. This is anyway only for the case with _auto_var_axes, i.e. when var1 and var2 are not specified explicitly.

But actually, I did not fully think through it, whether this is maybe difficult for users who did not adapt to use dim tags. Maybe not for DotLayer, as you can always specify everything explicitly.

albertz · 2022-10-24T07:45:43Z

I'm now thinking whether my proposed solution from #666 is maybe really the better way, which would have avoided these issues:

As another solution, maybe we can disallow the broadcasting (or other is_equal attribs) if this is an explicit dim tag by the user? To implement that, we would need to go through all places in RETURNN where a static dim tag is created, and add a new flag like auto_created=True or so. And then allow the broadcasting only if auto_created and dimension == 1.

Again we should think about the quality (#634). When comparing a user-generated dim tag to another user-generated dim tag, it should be strict. When comparing an auto-created dim tag to another auto-created dim tag, it can use the is_equal attribs and be more relaxed. When comparing an auto-created dim tag to a user-created dim tag, what then? I think it should also be strict.

But yes, we also should think about having it all really well defined and the code should be clean. See #975.
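
The comparison rule proposed above could look roughly like this. It is a sketch under assumptions: `ToyDim` and `toy_is_equal` are hypothetical and not the actual Dim.is_equal, the flag is named `auto_generated` here, and "relaxed" is reduced to a single same-static-size check:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True, eq=False)
class ToyDim:
    """Hypothetical dim tag; auto_generated=False means created explicitly by the user."""
    name: str
    dimension: Optional[int]
    auto_generated: bool = False


def toy_is_equal(a: ToyDim, b: ToyDim, allow_same_size: bool = True) -> bool:
    """Strict unless both tags are auto-generated; only then apply the relaxed rule."""
    if a is b:
        return True
    if not (a.auto_generated and b.auto_generated):
        return False  # user vs. user and user vs. auto are both strict
    return allow_same_size and a.dimension is not None and a.dimension == b.dimension


user_a = ToyDim("a", 1)
user_b = ToyDim("b", 1)
auto_1 = ToyDim("auto1", 5, auto_generated=True)
auto_2 = ToyDim("auto2", 5, auto_generated=True)
print(toy_is_equal(user_a, user_b))  # False
print(toy_is_equal(auto_1, auto_2))  # True
print(toy_is_equal(user_a, ToyDim("auto3", 1, auto_generated=True)))  # False
```

This keeps old configs that rely on n_out: int working, because their dims are all auto-generated, while explicit user dim tags never match by accident.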

albertz · 2022-11-16T16:47:20Z

We have the auto_generated flag.

Zettelkasten mentioned this issue Feb 16, 2022

Dim auto_generated flag #950

Merged

albertz added the potential-new-behavior Discussions about RETURNN behaviour label Oct 24, 2022

albertz mentioned this issue Oct 24, 2022

DotLayer heuristical dim matching causes problems #1154

Closed

albertz mentioned this issue Oct 24, 2022

Dim internals and API should be refactored #975

Open

This was referenced Nov 16, 2022

GatherLayer resolved wrong common axes #1219

Closed

More restrictive is_equal_opts for dim tag comparison #1220

Closed

albertz added a commit that referenced this issue Nov 16, 2022

Dim is_equal, check auto_generated

2c58b7e

#865

albertz mentioned this issue Nov 16, 2022

Dim is_equal, check auto_generated #1222

Merged

albertz added a commit that referenced this issue Nov 16, 2022

test_reclayer_att_weights_output_layer, fix dim tags

30b2b65

#1027 The way dim tags were used was just wrong. See #865.

albertz added a commit that referenced this issue Nov 16, 2022

test_ScatterNdLayer_RangeLayer, small dim tag fix

153addb

#865

albertz added a commit that referenced this issue Nov 16, 2022

test_GenericAttentionLayer_extra_spatial_multi_head fix dims

277a946

#865

albertz added a commit that referenced this issue Nov 16, 2022

test_extra_scatter_nd_search_train, fix dim

64f0623

#865

albertz closed this as completed in #1222 Nov 16, 2022

albertz added a commit that referenced this issue Nov 16, 2022

Dim is_equal, check auto_generated

7c9f949

#865

albertz added a commit that referenced this issue Nov 16, 2022

test_reclayer_att_weights_output_layer, fix dim tags

ac9e7c0

#1027 The way dim tags were used was just wrong. See #865.

albertz added a commit that referenced this issue Nov 16, 2022

test_ScatterNdLayer_RangeLayer, small dim tag fix

7392a99

#865

albertz added a commit that referenced this issue Nov 16, 2022

test_GenericAttentionLayer_extra_spatial_multi_head fix dims

ab6acc3

#865

albertz added a commit that referenced this issue Nov 16, 2022

test_extra_scatter_nd_search_train, fix dim

4e904c4

#865
