-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API changes 0.9 -> 0.10 #60
Comments
Hi there, I was about to open a new issue regarding Axis array but this fits more. Seems like a lot of the features requested by users like @Datseris , and the issues described in e.g. #48 would be solved by the following extra point for a 1.0 release:
This Julia package targets more the functionality of e.g. Python's Another point I'd strongly recommend to consider for 1.0 is:
Regarding the rest of your points I give 👍 to all. |
AxisArrays seems to already integrate with Unitful.jl, so from our end we would only care about converting the string present in |
My proposed syntax would be: |
Another thing to consider, if I may: some kind of warning should be thrown when accessing an |
I guess, you have seen the discussion at #33. So there are already at least two users who do not like this at all :-) and it is indeed inconsistent with all others arrays in julia (but surprisingly, numpy and python-netCDF4 also copy the data without flatting the array when indexing with [:]). OK, let's deprecate it so that we can change the behaviour.
Would you make such warning unconditionally (I mean even if the user just loads a single element)? What you do think about
|
I have not read #33 yet, thanks for letting me know. Yes, me and Simon raise mostly the same points, and these points fairly come from parallelizing array indexing with base Julia. In the end, the best expectation is that a user is already familiar with using normal Julia arrays, and should/would thus expect similar behavior throughout the Julia ecosystem.
Sorry, I am not familiar with For the actual case where a scalar is indeed written on a NetCDF file, your suggestion of
That is an extremely important point and one I immediately (like literally 5 minutes after installing AxisArrays) encountered after using AxisArrays.jl. It worried me a bit that the community is not yet settled in AxisArray-like data, as is evident by the existence of numerous packages that do the same thing and also by the very worrying existence of https://github.com/JuliaCollections/AxisArraysFuture . This all hinds to instability, which I never like :P Give me some time, at the moment I have two goals for my existence: 1 to establish an axis-array-like conversion scheme in NCDatasets.jl (as I find it so darn useful for day to day work). I will be testing a lot of them, and also opening an issue in discourse asking the community about the status quo. My 2nd goal is to establish a direct plotting scheme that goes from |
I am clueless about what |
The community is far from settled, there are about 5 candidates for named dimension packages being written at the moment. DimensionalData.jl is mine, because AxisArrays is not very well written if you need to extend it in any way. DimensionalData also defines a range of grid behaviours that allow you to track cell properties and grid bounds and other things, so eg the data can be reversed and subsetted and it is still plotted the right way at the right coordinates. These parts will eventually be moved to DimensionalArrayTraits.jl so other packages can use them. You can also use unitful units or DateTime units. It can also handle rotated grids and cordinate transforms (although this is still a WIP). GeoData extends DimensionalData.jl to add spatial types like a GeoStack - that can be either a list of files for different layers, or a netcdf etc that contains multiple layers. This means you can use a lot of data structures without changing any code except your load script. GeoSeries does this at a higher level, when you have arrays or stack files that are distributed along some dimension like time - it lazily loads them as you need them. See this example. What most of the code in GeoData does is implement these types for multiple backends, so you can swap them with the same code. Base/statistics methods work on them as in DimensionalData.jl, and the result is always a GeoData.jl already wraps NCDatasets.jl for load/save and manipulation. I use it in production and its getting fairly stable. I have a whole stack of modelling packages built on this, so it is definitely going somewhere as my work and research depends on it. I should make a first release some time in the next month. My opinion would be not to implement dimensions again here, just use GeoData.jl and help me improve it, instead of every spatial package doing something custom and different. |
Both sad as well as surprised to hear this, especially given how well developed and standarized these things are in other languages, and how fundamentally necessary they are for any geo/climate related work. Since I want better support, I'd better "get my hands dirty" and start contributing even more then.
Sure, at least I'd be very happy to avoid as much duplication as possible and re-use, improve and extend existing code. But no matter how willing I am, at the moment it is not possible... :( As far as I personally can say, the documentation of GeoData.jl is not at a point where anyone besides the main developer can understand what is happening. At the moment the documentation is simply a sequence of documentation strings. This is enough for someone that already knows the reasoning behind the package, as well as the sequence of steps one typically does when using it, and thus does not need aid in understanding. But unfortunately for someone who doesn't already know the package, just listing documentation strings doesn't take them far. I am happy to help with improving, and making the docs more welcoming for people, but this is something that you'd have to start, if you want others to start using, and improving, your creation (at the moment I understand so little that I find myself unable to do any contribution). The best thing to start would be actual examples or code snippets that highlight basic components, as well as guiding the person through the typical use of a package. A good docs is the n1 thing that will attract more people. (DimensionalData.jl is in a similar situation by the way, but since it has much less complexity built on top of it I could at least understand its basic components). To simplify things for my own understanding by the way, is GeoData.jl simply an extension of DimensionalDatasets.jl that (1) adds extra metadata to the array (2) allows directly loading from NCDatasets into GeoArray (3) defines plotting recipes for GeoArray? I'll try to not derail too off-topic here, talking too much about other packages. But this situation calls for a discourse post. |
GeoData.jl isn't released. If you want to do some work on the docs, that would be great - I'll try to get that a little further along. As I've said elsewhere this is one of 15 packages in progress I don't really have time docs, I just need the functionality asap because the ecosystem doesn't provide it elsewhere. Polishing will come later. There was an intial discourse post on DimensionalData and GeoData a few months ago. DimensionalData.jl is published, GeoData.jl isn't - there will be an updated post when it is. As a developer, if you want to understand a package read the tests. The docs are for users. |
Oh, sorry, my bad. Didn't know it wasn't registered, but this statement:
certainly made me interpret it as actually being registered, as only registered packages can be dependencies of NCDatasets.jl (at least at the moment). |
It will be published soon. But you also gave up very quickly on exploring the option and instead pushed the work back on to me. You should at least read the tests. You also need to realise how much work it requires to get it to the current point of generality - pull requests to NCDatasets, ArchGDAL, HDF5, and writing GeoFormatTypes and DimensionalData.jl. I'm aware of the documentation needs, and lack of docs doesn't mean I shouldn't suggest you use GeoData.jl - you asked me to comment here and we are talking about future functionality and working together towards that. |
I've opened this so its easier to communicate and work towards a common goal: https://discourse.julialang.org/t/the-fate-of-dimensionalarrays-axisarrays-in-julia-and-which-to-actually-use/33253 |
What you are missing is that these things have all been discussed at length in multiple github issues, it's not like we are just realising this is a problem. |
@Alexander-Barth another point to consider for 1.0: Dropping the |
@Datseris, of course thanks for reminding me to drop Compat! |
@Alexander-Barth I know this will be a bit "too breaking" or "too heavy", but still I believe it is worth considering for a 1.0 change: "What should the syntax This comment is also inspired by the behavior of typically using |
@Alexander-Barth |
I see |
It would be nice to have a 1.0 release of NCDatasets. There might be some aspect of the API that need to be deprecated:
NCDataset
and deprecateDataset
(but we might keepDataset
as an alias around) for the reasons explained here Improve README, add TOC, add alias NCDataset #44 by @Datserisscale_factor
) in defVar:instead of
Such attributes are
scale_factor
,add_offset
and_FillValue
,unit
andcalendar
(for time variables).Otherwise it is not possible to know the element-type of
ncdata
when it is created (as it can change). I belief that this is necessary address #40. You can also useDict
in the example above, if the order of attribute is not important.Do not use
Union{Float64,Missing}
(or similar type) if a"_FillValue"
attribute is not defined, as element-type. We would just useFloat64
(or similarly Float32, Int8, ...).The constants: NC_FILL_BYTE, NC_FILL_CHAR, NC_FILL_SHORT, NC_FILL_INT, NC_FILL_FLOAT, NC_FILL_DOUBLE, NC_FILL_UBYTE, NC_FILL_USHORT, NC_FILL_UINT, NC_FILL_INT64, NC_FILL_UINT64, NC_FILL_STRING are have to be replace by e.g.
fillvalue(Int8)
.Let me know, if you think that additional breaking changes are necessary before 1.0 by replying to this issue or if you disagree with what I proposed.
In any case, I think we will need a version 0.10 where we test all these changes.
The text was updated successfully, but these errors were encountered: