-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
97 test example notebooks ps #204
Conversation
The problem this tries to solve is being able to integrate the testing of notebooks into the CI without requiring downloading large files for testing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly this is fine. I don't like it, but I poked around for a while today and couldn't come up with anything substantially better.
There are packages like testbook which help with testing of the contents of a notebook (which overlaps a good bit with your code). But we're working in the opposite direction: we need to patch the filename
variable to be something else, but only during tests.
I couldn't find any prior art that supports this, so I don't think there's a better option than editing the notebook on the fly as you have.
There are a few "cleaner" solutions, but they would require modifying the notebook, and I think making it less clear than it is now.
For example, we could (I think; didn't actually test) use an environmental variable to allow easy swapping:
filename = os.environ.get("DYSH_EXAMPLE_DATA_PATH", "TGBT21A_501_11.raw.vegas.fits")
Which would allow the unit test to set DYSH_EXAMPLE_DATA_PATH
to whatever value we want. But then the notebook itself looks dumb and confusing.
Or we could abstract the "get example data and load it as a GBTFITSLoad
instance" into a function, e.g.
def load_example_data(path: str, base_url=f"http://www.gb.nrao.edu/dysh/example_data/"):
full_url = os.path.join(base_url, path)
wget.download(full_url)
return GBTFITSLoad(os.path.basename(path))
And then we could mock that such that it loads the "test" URL instead.
Even better would be to have these examples work with significantly smaller files, such that we could use them in GitHub. We could then use caching in the Action to prevent repeated unnecessary downloads. But I assume that this isn't feasible for some reason or we would have done it by now
So I say go for it. If we find a better way we can clean it up then.
Possible way of testing notebooks using
nbconvert
andnbformat
.It kind of requires significant work to get started, but it does allow testing the notebooks thoroughly against the data in testdata.
For this to work we have to tag the cells in the example notebooks with meaningful names, and keep the part where we define the
filename
(the SDFITS to be loaded) in a separate cell, sofilename
can be changed to one of the files in testdata. Cells to be tested also need to be tagged so they can be identified and used to check that the notebooks works. If we decide to use this method, I'd be happy to document the process.