Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "Extend feature extraction module to allow for RASR compatible… #41

Merged
merged 1 commit into from
Dec 7, 2023

Conversation

JackTemaki
Copy link
Contributor

From my understanding it should be possible to use "ModelConfiguration" objects directly in Sisyphus to define the model parameters, and I did so in my setups. Thus the PR in question breaks two of my pipelines. (In the others I still had my own custom feature extraction config object).

I assume others are affected as well, at least @Atticus1806 also is.

… logmel features (#40)"

This reverts commit c363a01.

Previous commit breakes hashes/setups.
@albertz
Copy link
Member

albertz commented Dec 7, 2023

How does it break?

@JackTemaki
Copy link
Contributor Author

JackTemaki commented Dec 7, 2023

It is not clear yet why (Benedikt reported this should not be the case), but the default values also get converted when using asdict, which is used to serialize (and hash) the model config.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

It is not clear yet why (Benedikt reported this should not be the case), but the default values also get converted when using asdict, which is used to serialize (and hash) the model config.

It sounds like we should maybe just fix this instead? Otherwise we could never ever extend an existing config/model by new extra arguments.

Also, to clarify, by "break", you just mean that it changes the hash, right? I initially understood "break" in the way that it does not work anymore, you get a crash, or incorrect output, or so.

@JackTemaki
Copy link
Contributor Author

you get a crash

Well, it also crashed because it writes invalid code:

error: cannot format returnn.config: Cannot parse: 331:273:     **{'model_config_dict': {'feature_extraction_config': {'sample_rate': 16000, 'win_size': 0.025, 'hop_size': 0.01, 'f_min': 60, 'f_max': 7600, 'min_amp': 1e-10, 'num_filters': 80, 'center': False, 'n_fft': 400, 'periodic': True, 'htk': False, 'norm': 'slaney', 'dtype': <class 'numpy.float32'>, 'spectrum_type': <SpectrumType.STFT: 1>}, 'frontend_config': {'in_features': 80, 'conv1_channels': 32, 'conv2_channels': 64, 'conv3_channels': 64, 'conv4_channels': 32, 'conv_kernel_size': (3, 3), 'conv_padding': None, 'pool1_kernel_size': (2, 1), 'pool1_stride': (2, 1), 'pool1_padding': None, 'pool2_kernel_size': (2, 1), 'pool2_stride': (2, 1), 'pool2_padding': None, 'activation': None, 'out_features': 384, 'activation_str': 'ReLU'}, 'specaug_config': {'repeat_per_n_frames': 25, 'max_dim_time': 20, 'num_repeat_feat': 5, 'max_dim_feat': 16}, 'specauc_start_epoch': 11, 'label_target_size': 356, 'conformer_size': 384, 'num_layers': 12, 'num_heads': 4, 'ff_dim': 1536, 'att_weights_dropout': 0.2, 'conv_dropout': 0.2, 'ff_dropout': 0.2, 'mhsa_dropout': 0.2, 'conv_kernel_size': 31, 'final_dropout': 0.2}}

But this is less relevant. While I would like the ModelConfiguration objects to be trivially serializable, we already have some that are not, and I can arrange myself with that. It is just that existing ones that I use should not suddenly break.

@JackTemaki
Copy link
Contributor Author

It is not clear yet why (Benedikt reported this should not be the case), but the default values also get converted when using asdict, which is used to serialize (and hash) the model config.

It sounds like we should maybe just fix this instead? Otherwise we could never ever extend an existing config/model by new extra arguments.

As it works for "vanilla" dataclasses, I would assume the type checking is the culprit.

Independent of that problem, I would have rejected the PR for the same reasons Willi stated as well.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

Well, it also crashed because it writes invalid code:

It's a bit hard to read from your one-line flattened output but it looks like SpectrumType.STFT is not correctly serialized? I would say this is again a separate issue and not relevant here.

Edit Separate issue: #42

@JackTemaki
Copy link
Contributor Author

I would say this is again a separate issue and not relevant here.

I literally said this is not really relevant

@albertz
Copy link
Member

albertz commented Dec 7, 2023

I would have rejected the PR for the same reasons Willi stated as well.

Well, I disagree. The default values are exactly such that you get exactly the old behavior. These are very reasonable default parameters.

I think making a separate class here, or V2, just because of that, just adds complexity for no good reason.

@JackTemaki
Copy link
Contributor Author

I would have rejected the PR for the same reasons Willi stated as well.

Well, I disagree. The default values are exactly such that you get exactly the old behavior. These are very reasonable default parameters.

This is not only about the defaults, but that the class provides too many option combinations that are never used. Which makes it harder to understand and use. I see it exactly the other way around. It makes a class too complex and too configurable. Having one fixed class for the "normal" librosa and one fixed class for RASR should be perfectly fine, and more understandable.

@Atticus1806
Copy link
Contributor

Atticus1806 commented Dec 7, 2023

Actually deleted my last comment, since I just found out its related to the typing. So if you give a type its included into asdict, if you dont its not.
So

@dataclass
class Test:
    no_def: int
    with_def: int = 3

produces

{'no_def': 1, 'with_def': 3}

while

@dataclass
class Test:
    no_def: int
    with_def = 3

produces

asdict(Test(1)) = {'no_def': 1}

@rwth-i6 rwth-i6 deleted a comment from albertz Dec 7, 2023
@albertz
Copy link
Member

albertz commented Dec 7, 2023

This is not only about the defaults, but that the class provides too many option combinations that are never used.

Well, @curufinwe argued to add all the other options. His had valid reasons for it. (See the discussion in the PR, it might be hidden in some of the resolved parts.)

@albertz

This comment was marked as off-topic.

@albertz

This comment was marked as resolved.

@JackTemaki

This comment was marked as resolved.

@JackTemaki
Copy link
Contributor Author

This is not only about the defaults, but that the class provides too many option combinations that are never used.

Well, @curufinwe argued to add all the other options. His had valid reasons for it. (See the discussion in the PR, it might be hidden in some of the resolved parts.)

I only found a discussion about how the arguments should look like, and that he did not like kwargs options. There is no discussion about making a separate class.

@albertz

This comment was marked as resolved.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

Well, @curufinwe argued to add all the other options. His had valid reasons for it. (See the discussion in the PR, it might be hidden in some of the resolved parts.)

I only found a discussion about how the arguments should look like, and that he did not like kwargs options.

I found it: #40 (comment)
@curufinwe said:

Instead of adding arbitrary additional options we could explicitly add all remaining options here. That should be htk, norm and dtype.

@Atticus1806

This comment was marked as resolved.

@JackTemaki
Copy link
Contributor Author

JackTemaki commented Dec 7, 2023

Well, @curufinwe argued to add all the other options. His had valid reasons for it. (See the discussion in the PR, it might be hidden in some of the resolved parts.)

I only found a discussion about how the arguments should look like, and that he did not like kwargs options.

I found it: #40 (comment) @curufinwe said:

Instead of adding arbitrary additional options we could explicitly add all remaining options here. That should be htk, norm and dtype.

Yes exactly, this is about kwargs vs explicit options. He argued against arbitrary undefined options, where I fully agree with Eugen. And has nothing to do with what I mean: Not adding ANY options to the existing class, that are solely there for the purpose of getting one exact other behavior. But to have one class that will give you "exactly" the RASR behavior without any other parameters you could accidentally set wrong. Or am I misunderstanding why the other attributes were added? So are they actually not necessary to get the RASR behavior?

Edit: Ah I think I start do understand now, so some parameters were really only there for completeness, not related to RASR.

But then there is still the issue that this does break setups, and even if there is a workaround on my side this would mean I would have to in future refrain from relying on the stability of ModelConfig Objects. I understand that it is somewhat unpleasant to not easily extend existing classes, but this does not change the fact that the PR has to be reverted as is. So we should first find possible solutions to that issue, and hear out Eugens opinion before continuing on the PR here.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

Yes exactly, this is about kwargs vs explicit options. He argued against arbitrary undefined options, where I fully agree with Eugen.

No, I think this is about consistency. There are some options (f_min, f_max etc) which are directly specified, and then remaining ones in mel_options. To have it consistent, either all options should be inside mel_options or all options should be explicitly listed. As we already had some listed, there was no way without breaking it to go with the more generic mel_options. I think this was the main reason, otherwise having such generic mel_options might have been fine just as well. At least this is how I understood it and why I agreed with @curufinwe's comment on this.

However, I'm not really sure it was actually discussed why mel_options (or the explicit htk, norm and dtype) was added though.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

the issue that this does break setups

You mean that it changes the hash? I would still not call this "break".

But we can just fix this, or not? What is the problem in fixing this?

@albertz
Copy link
Member

albertz commented Dec 7, 2023

Btw, about the dataclass example, actually this is misleading, having just with_def = 3 without the type is not valid anyway, dataclass will ignore it, i.e. you cannot do Test(no_def=1, with_def=4).

@Atticus1806
Copy link
Contributor

without the type is not valid anyway

Oh I did not know this.

You mean that it changes the hash? I would still not call this "break".

I would disagree. I think if we allow changes that change hashes we need to be really careful which ones that are and I would maybe even argue to only have changes do that which fix clear bugs. Otherwise we might again have people work on old commits or own derivations which patch in certain updates.

@JackTemaki
Copy link
Contributor Author

JackTemaki commented Dec 7, 2023

the issue that this does break setups

You mean that it changes the hash? I would still not call this "break".

But we can just fix this, or not? What is the problem in fixing this?

Do you mean from i6_models side or from the side of my setup? From the side of my setup it is annoying but possible. I can make a local copy of the ModelConfig object before the PR, use this to write the config, and then in my model code map this to the new one. But if I would possibly have to do that after each pull, I would stop using the ModelConfig options for serialization completely and always use my custom ones (which is maybe not even bad?)

From the i6_models side, you would have to prohibit dataclasses to write new defaults when calling as_dict. For sure this is possible somehow, I am just not confident this is something we should do.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

You mean that it changes the hash? I would still not call this "break".

I would disagree. I think if we allow changes that change hashes

I never said that we should allow changing hashes. Definitely we should not allow this. I just said that I would not call this "break", as I understand "break" as "does not work anymore" (crash, different incorrect output, whatever). Instead of calling this "break", I would just say explicitly "this changes the hash", nothing more, nothing less. When I read "break", this confused me multiple times here already that I thought you mean sth different.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

You mean that it changes the hash?
But we can just fix this, or not? What is the problem in fixing this?

Do you mean from i6_models side

Yes of course from i6_models side. The user should not need to change any code. We should not allow any changes here which would require changes on the user side as well (at least in most cases, unless there is maybe a very good reason).

From the i6_models side, you would have to prohibit dataclasses to write new defaults when calling as_dict. For sure this is possible somehow, I am just not confident this is something we should do.

I'm not exactly sure I understand what you mean by "new defaults", but yes, sth along this direction would be one solution. There are more potential solutions, e.g. doing sth like we do in i6_core with jobs using __sis_hash_exclude__.

If you don't want this, it means that we never ever can add new options to an existing config/model, and I'm not sure this is a good idea.

@michelwi
Copy link
Contributor

michelwi commented Dec 7, 2023

If you don't want this, it means that we never ever can add new options to an existing config/model, and I'm not sure this is a good idea.

I thought we agreed to deprecate the old and add a new version of the classes instead of doing extensions (that go beyond bugfixing)

@albertz
Copy link
Member

albertz commented Dec 7, 2023

If you don't want this, it means that we never ever can add new options to an existing config/model, and I'm not sure this is a good idea.

I thought we agreed to deprecate the old and add a new version of the classes instead of doing extensions (that go beyond bugfixing)

I thought that there would be a line somewhere between small extensions are ok to be added, big changes would go to a new version. I don't remember that we agreed that we never want any extensions?

@albertz
Copy link
Member

albertz commented Dec 7, 2023

What about e.g. PR #33? It adds an option (violation to not extend existing things) + maybe actually this also changes the hash in the same way as we discussed here?

@JackTemaki
Copy link
Contributor Author

You mean that it changes the hash?
But we can just fix this, or not? What is the problem in fixing this?

Do you mean from i6_models side

Yes of course from i6_models side. The user should not need to change any code. We should not allow any changes here which would require changes on the user side as well

As it seems you agree with me that a PR should not lead to me having to change code, I would propose to merge this revert, and start a new discussion on how to solve this problem more independently from this specific PR. And then re-do the feature extraction PR.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

I would propose to merge this revert

Instead, I would just fix the issue.

Otherwise you also want to revert other PRs now? E.g. #33?

@JackTemaki
Copy link
Contributor Author

I would propose to merge this revert

Instead, I would just fix the issue.

Otherwise you also want to revert other PRs now? E.g. #33?

#33 was never merged.

@albertz
Copy link
Member

albertz commented Dec 7, 2023

Ah sorry, right. So we never had that case so far? Ok then.

@JackTemaki JackTemaki merged commit 933c6c1 into main Dec 7, 2023
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants