-
Notifications
You must be signed in to change notification settings - Fork 164
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve subprocess CRDS calls for easier debugging #140
base: master
Are you sure you want to change the base?
Conversation
"subprocess.check_output('crds bestrefs --files ua0605*_c0m.fits --sync-references=1 --update-bestrefs', shell=True, stderr=subprocess.DEVNULL)\n", | ||
"stdout = subprocess.check_output(\n", | ||
" 'crds bestrefs --files ua0605*_c0m.fits --sync-references=1 --update-bestrefs',\n", | ||
" shell=True,\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to do this without shell=True
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't CRDS written in Python? Can't we call its Python API directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CC: CRDS experts @eslavich @jaytmiller
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pllim May I ask: what is the issue with shell=True
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just out of academic curiosity, is there benefit in sorting the glob results first in this usage?
I do not see why would it matter here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, let's say if the bestref is usually named in a way that would appear last in listing, and the function do a simple linear search, then sorted(..., reverse=True)
would make the search faster? Anyway, I have no idea how it works, so I am just thinking aloud. And of course, it is probably out of topic as far as this PR is concerned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The order shouldn't matter -- the glob result is a list of datasets, and we have to make the best references determination for each of them.
Incidentally, I don't think shell=True
is a security concern in this case. We wouldn't want to use it in web server code (for example) that interpolates user input into a command, but here the user is already in an environment where they can execute arbitrary shell commands so it's no more unsafe than a notebook in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not as much as a safety in this case, as to promote best practices. What if someone copy-paste what is here into server code? 😉
And given that it is possible to call Python API directly, there is really no reason to have shell=True
at all in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
p.s. Yeah, just ignore my comment about sorting. Sorry for the noise!
I don't know if it is relevant or useful here or not, but here's an example of using the CRDS API: https://github.com/spacetelescope/mirage/blob/master/mirage/reference_files/crds_tools.py#L141 |
I would say that I just want to improve one aspect of existing notebooks to address a specific issue. If there is desire to re-design the notebooks, that could be done too as a separate PR. |
Recently there were a couple of helpdesk issues (INC0155647 and INC0156501) when users reported errors with running CRDS commands using
subprocess
. One difficulty in understanding the cause of the errors was thatsuprocess
in some notebooks is not set up to output errors. This PR modifies calls tosubprocess.check_output()
so that messages from CRDS can be read in the notebook.