FIX: Set file.checksum_md5 non optional #841
I think making this field required is important, but I'm having second thoughts about an empty string being a valid checksum. I think there is also a place for stricter checks here.
A valid MD5 checksum in hex form has 32 characters, 0-9a-f. We should probably make this a validation check, as an invalid checksum compromises data quality.
For my own reminder, and for posterity: under what circumstances are we giving an empty string checksum?
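As a rough illustration of the stricter check suggested in this comment, here is a minimal sketch of a format validator for hex MD5 digests. The helper name `is_valid_md5_hex` is hypothetical and not part of fmu-dataio, and the sketch assumes the checksum is stored as lowercase hex. How such a check would be wired into the schema is left open.

```python
import re

# Hypothetical helper for illustration only; not part of fmu-dataio.
# Assumes checksums are stored as lowercase hexadecimal strings.
_MD5_HEX = re.compile(r"[0-9a-f]{32}")


def is_valid_md5_hex(value: str) -> bool:
    """Return True if value has the shape of a lowercase hex MD5 digest."""
    # fullmatch requires the whole string to match, so the empty string,
    # short strings, and strings with extra characters are all rejected.
    return _MD5_HEX.fullmatch(value) is not None
```

With this check the empty string is rejected, which is the case discussed in this thread. The check only verifies the format, not that the digest matches any particular file.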
The empty string will only occur in the case that a user directly uses fmu-dataio/src/fmu/dataio/dataio.py Line 930 in acb2f25
Before we can really ensure we don't create an empty string in the metadata we need to deprecate the What do you think 🙂 ?
Yes, let's do it 👍 There is no good reason not to generate a checksum. And in 8 months' time, if we change our mind about that and land back on this PR, at least we will have some context for our decision 😄
Could be that this is needed when no files are actually materialized to disk? I.e. in our aggregation service (which at the moment does not use
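On the case where nothing is written to disk: an MD5 digest can still be computed from the in-memory bytes. The sketch below uses only Python's standard `hashlib`. The function name `md5_of_bytes` is hypothetical, and nothing here is taken from fmu-dataio or the aggregation service.

```python
import hashlib


def md5_of_bytes(data: bytes) -> str:
    """Return the lowercase hex MD5 digest of an in-memory byte string."""
    # hexdigest() always yields 32 lowercase hex characters for MD5.
    return hashlib.md5(data).hexdigest()
```

Note that hashing zero bytes still yields a well-formed 32-character digest, so an empty string checksum is never the natural result of hashing, even for empty content.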
PR to set file.checksum_md5 non optional. It was non optional in the legacy schema; the optional typing was introduced in https://github.com/equinor/fmu-dataio/pull/512/files, probably by mistake.
Closes #835