For the tests, and for development, it's really good to have datasets you can work with without downloading anything.
I think we should have three sets of data / files:
really tiny (probably hard-coded in Python code) examples of various file types, metadata types, etc., for the tests. These could be hand-written, or borrowed from other projects.
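As a rough sketch of what a hard-coded example might look like: the function below describes a two-triangle UGRID-style mesh as a plain dict, laid out in the form that `xarray.Dataset.from_dict` accepts. The function name, the variable names (`mesh`, `face_nodes`, `depth`) and the values are all made up for illustration; only the attribute names (`cf_role`, `topology_dimension`, `node_coordinates`, `face_node_connectivity`, `start_index`) come from the UGRID conventions.

```python
def tiny_ugrid_dict():
    """Return a hypothetical two-triangle UGRID-style mesh as a plain dict.

    Four nodes on the corners of a unit square, split into two triangles.
    """
    return {
        "attrs": {"Conventions": "UGRID-1.0"},
        "coords": {
            "node_lon": {
                "dims": ("node",),
                "data": [0.0, 1.0, 1.0, 0.0],
                "attrs": {"standard_name": "longitude", "units": "degrees_east"},
            },
            "node_lat": {
                "dims": ("node",),
                "data": [0.0, 0.0, 1.0, 1.0],
                "attrs": {"standard_name": "latitude", "units": "degrees_north"},
            },
        },
        "data_vars": {
            # Dummy scalar variable that carries the mesh topology metadata.
            "mesh": {
                "dims": (),
                "data": 0,
                "attrs": {
                    "cf_role": "mesh_topology",
                    "topology_dimension": 2,
                    "node_coordinates": "node_lon node_lat",
                    "face_node_connectivity": "face_nodes",
                },
            },
            # Zero-based node indices for each triangle.
            "face_nodes": {
                "dims": ("face", "max_face_nodes"),
                "data": [[0, 1, 2], [0, 2, 3]],
                "attrs": {"cf_role": "face_node_connectivity", "start_index": 0},
            },
            # Made-up data values on the nodes.
            "depth": {
                "dims": ("node",),
                "data": [1.0, 2.0, 3.0, 4.0],
                "attrs": {"units": "m", "mesh": "mesh", "location": "node"},
            },
        },
    }
```

In a test, something like `xr.Dataset.from_dict(tiny_ugrid_dict())` should turn this into a Dataset without touching the disk or the network.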
I set up git LFS support. I think that means all *.nc files will be stored in LFS, so we can put medium-sized files in there -- but we probably don't want to go over 10MB or so.
There's a bit of a chicken-egg problem there -- how do you make a small file if you don't yet have a tool to subset larger ones with?
But at this point, we do have some working subset code, so I think we could do:
get the subset code working
make a small test file with it
add that file, and build your more comprehensive tests against it.
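To keep the files added this way within the "10MB or so" guideline, a small check could be run in the tests or in CI. This is only a sketch: the function name, the default pattern, and the exact limit are assumptions, not anything that exists in the repo.

```python
from pathlib import Path

# Assumed limit, taken from the "10MB or so" guideline above.
MAX_BYTES = 10 * 1024 * 1024


def oversized_files(root, pattern="*.nc", max_bytes=MAX_BYTES):
    """Return sorted (path, size) pairs for files under root larger than max_bytes."""
    found = []
    for path in Path(root).rglob(pattern):
        if path.is_file():
            size = path.stat().st_size
            if size > max_bytes:
                found.append((path, size))
    return sorted(found)
```

A test could then simply assert that `oversized_files("examples")` is empty (the directory name here is just an example).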
larger test files
These could be stored somewhere else, and downloaded on demand -- one option would be on GitHub as releases or packages, or ???
We can also have test code, etc, that points to what we know are stable resources on S3, etc. -- already some of that in the examples.
Maybe have a file in the repo with a set of links to various datasets?
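One possible shape for the "file with links" idea, combined with download on demand: a JSON file mapping a local file name to a URL, plus a helper that downloads a file only if it is not already cached. This is a standard-library sketch; the registry layout and the names `load_registry` and `fetch` are hypothetical. A library such as pooch covers the same ground more thoroughly (including hash checks).

```python
import json
import shutil
import urllib.request
from pathlib import Path


def load_registry(path):
    """Load a JSON file of the assumed form {"local_name.nc": "https://..."}."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def fetch(name, registry, cache_dir):
    """Return the local path for `name`, downloading it first if not cached."""
    url = registry[name]
    target = Path(cache_dir) / name
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        # Download to a temporary name so an interrupted download
        # is never mistaken for a complete cached file.
        tmp = target.with_name(target.name + ".part")
        with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
            shutil.copyfileobj(response, out)
        tmp.replace(target)
    return target
```

Tests that need a larger file would call `fetch(...)` and could be skipped when there is no network access, so the core test suite still runs without downloading anything.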
@omkar-334: maybe you could put together a small example or two of ADCIRC (STOFS2D).
There are some tiny examples in the gridded project:
https://github.com/NOAA-ORR-ERD/gridded
There are some examples of UGRID and SGRID files and data sets:
https://github.com/NOAA-ORR-ERD/gridded/blob/master/gridded/tests/gen_analytical_datasets.py
https://github.com/NOAA-ORR-ERD/gridded/tree/master/gridded/tests/test_ugrid/files
https://github.com/NOAA-ORR-ERD/gridded/blob/master/gridded/tests/test_pysgrid/write_nc_test_files.py
Example file already in the repo: xarray-subset-grid/examples/example_data/SFBOFS_subset1.nc