Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration with Unitful #54

Open
Datseris opened this issue Jan 7, 2020 · 8 comments
Open

Integration with Unitful #54

Datseris opened this issue Jan 7, 2020 · 8 comments

Comments

@Datseris
Copy link
Contributor

Datseris commented Jan 7, 2020

I have:

julia> field
toa_sw_clr_c_mon (360 × 180 × 228)
  Datatype:    Float32
  Dimensions:  lon × lat × time
  Attributes:
   long_name            = Top of The Atmosphere Shortwave Flux, Clear-Sky (for cloud-free areas of region) conditions, Monthly Means
   standard_name        = TOA Shortwave Flux - Clear-Sky (for cloud-free areas of
region)
   CF_name              = none
   comment              = none
   units                = W m-2
   valid_min            =       0.00000
   valid_max            =       600.000
   _FillValue           = -999.0

and I do:

julia> Array(field) |> summary
"360×180×228 Array{Union{Missing, Float32},3}"

wouldn't it be cool if we at least tried to use Unitful.jl and instead of returning an array of FLoats, try to associate units with them, if the CFVariable have a units field?

@Balinus
Copy link
Contributor

Balinus commented Jan 7, 2020

I did some work on this in ClimateTools and I must admit I finally dropped the idea sadly. There was many case where I needed to program boilerplate code. For example, try averaging temperature in Celsius. Unitful.jl prevent the user to do it and I had to code stripping unit functions in case where temps where in Celsius: 1. if Celsius remove units, 2. do arithmetic, then 3. put back the units. I must say I haven't had the time to properly deal with those and I decided to simply uses Floats.

With that being said, I think that the handling of Unitful in NCDatasets.jl is certainly the place where it should be since the package is not meant to do arithmetics on those fields. Would be important to have the option to have units or not though.

@Datseris
Copy link
Contributor Author

Datseris commented Jan 7, 2020

For example, try averaging temperature in Celsius. Unitful.jl prevent the user to do it

I am confused about this. Why would Unitful prevent averaging a vector of Celcius? I see, you mean that you need to use Kelvin:

julia> mean(rand(1000) .* u"K")
0.5043183459463381 K

The docs of Unitful describe why its a bad idea to use Celcius as unit while doing operations on it. From my perspective your problems would be solved by always using Kelvin as the unit of temperature (and thus if the data on disk have Celcius, convert them to Kelvin).

Would be important to have the option to have units or not though.

Sure. I imagine this as a keyword argument to NCDataset(file; units = true).


Can you crosslink the point in your code where you tried to do this?

@Balinus
Copy link
Contributor

Balinus commented Jan 7, 2020

Indeed, using only Kelvin is the solution but it clashed with some in-house projects that we needed to support and as time goes by, I decided to drop it and come back to it later (which I haven't yet!).

There was a release with Unitful support. It's a 1st pass on the idea: https://github.com/Balinus/ClimateTools.jl/releases/tag/v0.11.0

It was dropped in 0.13: https://github.com/Balinus/ClimateTools.jl/releases/tag/v0.13.0

@Datseris
Copy link
Contributor Author

Datseris commented Jan 7, 2020

I am interested to implement this, so its okay to ask for you to review the PR as well?

@Balinus
Copy link
Contributor

Balinus commented Jan 7, 2020

No problem!

I think it would be a really nice addition and a good baseline to redefine functions correctly in ClimateTools.

@Alexander-Barth
Copy link
Member

Alexander-Barth commented Jan 8, 2020

I like the approach, for adding an optional argument to NCDataset(file; make_units_happen_please_or_something).

There are some alternative ideas (#47 (comment)) what the result type of the variable subset should be in this example:

ds = NCDataset(file);
ncvar = ds["myvariable_name"]
subset = ncvar[:,:,1] 

Now subset is simply a plain Julia array of floats, integers,.... (and missing) A possible approach would be to add a transformation function (or type) as a parameter:

ds = NCDataset(file; transform=add_units);
ncvar = ds["myvariable_name"]
subset = ncvar[:,:,1] 

The data in ncvar[:,:,1] would be passed thought transform (which also get the NCVariable object):

function add_units(ncvar,indices)
    ncunits = u"m/s" # defined from ncvar.attrib["units"]
    return ncvar[indices...] * ncunits
end

(it is not clear to me how I get a unit from a string in a variable - not a constant string literal in Unitful, does somebody know?).

The call to add_units (or whatever the argument transform is) would happen inside NCDatasets.
If this turns out to be type unstable, maybe we should using a type instead of a function.

Other applications this transform function, would it be:

  • return an array with NaN instead of missing
  • return a Dagger.Distribute array (or any other python-dask-like data structure) for out-of-core computing
  • unit conversion

@Datseris
Copy link
Contributor Author

Datseris commented Jan 8, 2020

I like this, and this seems the cleanest one approach. And it also seems logical that the transformation would happen the moment the user attempts to actually load the numerical values. I'll try to do this like this, and thus this will happen at getindex.

@Balinus
Copy link
Contributor

Balinus commented Jan 8, 2020

(it is not clear to me how I get a unit from a string in a variable - not a constant string literal in Unitful, does somebody know?).

Found this discussion that seems relevant and have some suggestions: PainterQubits/Unitful.jl#214

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants