Dysh data2 simplify data filenames #394

Merged
merged 6 commits into from
Oct 9, 2024

Conversation

teuben
Collaborator

@teuben teuben commented Oct 2, 2024

Although this will not be the final version, pending some discussion, particularly on how to configure it in production, there was a need to simplify filenames so that they work both at GBO and on a laptop. The current solution is an environment variable, $DYSH_DATA, naming a directory that holds directories (or symlinks) pointing to places like example_data, acceptance_testing, etc.
I'd like this merged into main sooner rather than later, since the nodding code testing (which has 10 examples) now depends on keeping filenames easy to manage across sites.
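
A minimal sketch of the layout described above, assuming $DYSH_DATA names a root directory that holds subdirectories or symlinks such as example_data and acceptance_testing. The helper name `data_root_path` and its `env` parameter are hypothetical, introduced only for illustration; they are not part of dysh.

```python
import os
from pathlib import Path


def data_root_path(subdir, relname, env=None):
    """Join $DYSH_DATA, a subdirectory (or symlink) and a relative filename.

    Hypothetical helper for illustration only; not the dysh API.
    """
    env = os.environ if env is None else env
    root = env.get("DYSH_DATA")
    if root is None:
        raise KeyError("DYSH_DATA is not set")
    # Joining paths does not resolve symlinks, so "example_data" can be a
    # real directory on a laptop and a symlink at GBO without code changes.
    return Path(root) / subdir / relname
```

With this shape, only the value of $DYSH_DATA differs between sites; the relative names used in tests stay the same.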

@astrofle astrofle requested a review from vcatlett October 2, 2024 19:41
@vcatlett
Contributor

vcatlett commented Oct 4, 2024

Can you merge the latest version of main into this PR?

@teuben
Collaborator Author

teuben commented Oct 4, 2024

main is merged; the only changes I saw were in src/dysh/shell/shell.py and src/dysh/spectra/scan.py

@astrofle
Collaborator

astrofle commented Oct 8, 2024

For what it's worth, we did remove wget from the dependencies some PRs ago. Since the code does not really depend on wget this does not seem like an issue.

@teuben
Collaborator Author

teuben commented Oct 8, 2024

The current dysh_data() still has some wget in it, but of course this would only work for actual FITS files, not for datasets with multi-FITS contents.

The example notebooks all use from_url now. My problem with that is that it will often trigger a download, but I have my data in $DYSH_DATA, so I'd rather have a frontend for this from_url function.

For example, in the PS example notebook we currently have

from pathlib import Path
from dysh.util.download import from_url
url = "http://www.gb.nrao.edu/dysh/example_data/positionswitch/data/AGBT05B_047_01/AGBT05B_047_01.raw.acs/AGBT05B_047_01.raw.acs.fits"
savepath = Path.cwd() / "data"  # download into ./data under the current directory
filename = from_url(url, savepath)

and this would be replaced by

from dysh.util.files import dysh_data
filename = dysh_data(example="example_ps") 

though one can also use full example names like

filename = dysh_data(example="positionswitch/data/AGBT05B_047_01/AGBT05B_047_01.raw.acs/AGBT05B_047_01.raw.acs.fits")
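
To illustrate the "frontend" idea, here is a rough sketch of a local-first lookup: check under $DYSH_DATA first and only fall back to the download URL when the file is not there. Everything named here is an assumption for illustration: `locate_example`, the `ALIASES` table, and the `example_data` subdirectory under $DYSH_DATA are hypothetical, and this is not how dysh_data() is actually implemented. The base URL and the example_ps path are the ones quoted above.

```python
import os
from pathlib import Path

# Base URL taken from the notebook example quoted in this thread.
EXAMPLE_URL = "http://www.gb.nrao.edu/dysh/example_data/"

# Hypothetical alias table; "example_ps" is the only alias named in this thread.
ALIASES = {
    "example_ps": "positionswitch/data/AGBT05B_047_01/"
                  "AGBT05B_047_01.raw.acs/AGBT05B_047_01.raw.acs.fits",
}


def locate_example(example, env=None):
    """Return ("local", Path) if the file exists under
    $DYSH_DATA/example_data, otherwise ("remote", url).

    Hypothetical helper for illustration only; not the dysh API.
    """
    env = os.environ if env is None else env
    # Short aliases expand to a full relative name; full names pass through.
    relname = ALIASES.get(example, example)
    root = env.get("DYSH_DATA")
    if root is not None:
        candidate = Path(root) / "example_data" / relname
        if candidate.exists():
            return "local", candidate
    # Nothing found locally: hand back the URL so the caller can download it.
    return "remote", EXAMPLE_URL + relname
```

A caller would pass the URL on to from_url only in the "remote" case, so a site that already has the data under $DYSH_DATA never triggers a download.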

@vcatlett
Contributor

vcatlett commented Oct 9, 2024

Apologies for my delay; I just returned from some sick leave. It looks like there are now some merge conflicts which need to be resolved. Peter, can you take a look at those? I can help you make the changes if needed.

@vcatlett vcatlett merged commit 1528e00 into main Oct 9, 2024
14 checks passed