Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unavoidable warning when writing String #208

Closed
johnomotani opened this issue Nov 21, 2024 · 4 comments
Closed

Unavoidable warning when writing String #208

johnomotani opened this issue Nov 21, 2024 · 4 comments

Comments

@johnomotani
Copy link

johnomotani commented Nov 21, 2024

When NCDatasets writes a String, a warning generated by DiskArrays is always printed:

using NCDatasets
f = NCDataset("test.nc", "c")
v = defVar(f, "foo", String, ())
v[] = "bar"

prints

┌ Warning: Can not determine size of element type. Using DiskArrays.fallback_element_size[] = 100 bytes
└ @ DiskArrays ~/.julia/packages/DiskArrays/ny95C/src/chunks.jl:354
┌ Warning: Can not determine size of element type. Using DiskArrays.fallback_element_size[] = 100 bytes
└ @ DiskArrays ~/.julia/packages/DiskArrays/ny95C/src/chunks.jl:354 

In the NCDatasets issue (Alexander-Barth/NCDatasets.jl#269) for this problem, @Alexander-Barth notes that

The warning is not generated by NCDatasets but by a dependency.
We cannot address the problem in NCDatasets using the public API of DiskArrays.

The NCDatasets usage in the example above is correct (NCDatasets is expected to be able to write a String variable), so should not print a warning. It seems some change is needed in DiskArrays, for example to provide some public API that would allow NCDatasets to pick some sensible setting for String variables that would disable this warning?

Edit to add: Maybe it's obvious, but to be explicit, the warning comes from the call to setindex!() implicit in the last line (v[] = "bar") of the example. The implementation of setindex!() used by NCDatasets is entirely in DiskArrays, not NCDatasets.

@felixcremer
Copy link
Contributor

With #202 this warning should only be printed once. Do you think, this behaviour is acceptable or do you want to fully suppress the warning?

@johnomotani
Copy link
Author

Personally I'd be happy enough with the solution in #202 - it's easier to ignore a warning printed once than when it's printed thousands of times (which is what happens for me with the release version I'm using).

In general though, it would be ideal if the warning was not printed at all, because for a user of NCDatasets there is nothing they can do to 'fix' the issue pointed out by the warning (as far as I know, it doesn't need fixing at all in the case of writing a single String), so it is confusing, and liable to waste people's time as they go to figure out what the warning means and how they can get rid of it (only to find out in the end that they can't, which is rather frustrating!).

@meggart
Copy link
Collaborator

meggart commented Nov 22, 2024

For me it would be ok to simply remove the warning, since it not not very useful for most users. It was originally meant as a hint that if the dataset contains very large chunks containing very large strings there might be excessive memory usage, which is something DiskArrays generally tries to avoid. However, I think this case is rare enough so that bothering every user with a warning probably does more harm than helping others.

@johnomotani
Copy link
Author

🎉
Thanks @meggart!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants