Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CADC-12563 expect the PackageRunner job to be SUSPENDED instead of QUEUED #97

Merged
merged 5 commits into from
Jan 19, 2024

Conversation

jburke-cadc
Copy link
Contributor

No description provided.

Copy link
Member

@pdowler pdowler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for caom2-pkg-server, clients directly invoke the package endpoint so those are going to do the normal PENDING -> QUEUED before being invoked. Unfortunately, that means the PackageRunner needs to tolerate both of these...

You probably need to add something like

protected ExecutionPhase getInitialPhase() {
    return ExecutionPhase.QUEUED;
}

and then have the vospace impl override that.

@jburke-cadc
Copy link
Contributor Author

Forget about the caom2-pkg-server. Added the method to the PackageRunner, and overrode it in the VospacePackageRunner, and the cavern int-test for packages passes.

@@ -206,7 +210,7 @@ private void doIt() {
} catch (Throwable t) {
if (ThrowableUtil.isACause(t, InterruptedException.class)) {
try {
ep = jobUpdater.setPhase(job.getID(), ExecutionPhase.QUEUED, ExecutionPhase.EXECUTING, new Date());
ep = jobUpdater.setPhase(job.getID(), getInitialPhase(), ExecutionPhase.EXECUTING, new Date());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is something that was like this and I never noticed before but it's really broken:

this change from initial to executing in the error handling doesn't make any sense

should be EXECUTING to ERROR with an error message from the throwable and should always call sendError (I don't know what scenario that ABORTED is meant to handle)

out of scope for now: there is also no code here at differentiate the cause and chose an approp status code - that would normally be done by earlier catch blocks and this is the final "unexpected" aka 500 handling... please create an issue for this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the catch for the throwable just be setting the phase to ERROR and calling sendError(), without checking if the throwable cause is a particular exception?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, as long as the fail is not from doing initial->executing you can be certain that the current phase is executing and go to error.

and because this is really catch(Throwable unexpected) the response code should be 500 in this block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the throwable you can get the current job phase when setting the phase to error, which would cover the phase in initial or executing?


}
try {
ep = jobUpdater.setPhase(job.getID(), job.getExecutionPhase(), ExecutionPhase.ERROR, new Date());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

job.getExecutionPhase( ) is not reliably up-to-date (it's probably still SUSPENDED in this case)... I think you can hard-code EXECUTING here because logically nothing happens before SUSPENDED->EXECUTING or after EXECUTING->COMPETED.

there really should be a catch designed to handle those setPhase operations throwing a job-related exception, but because there is a catch(Throwable) your IDE doesn't tell you about what those are... out of scope. Just hard code EXECUTING->ERROR

Also, for the sendError, it was using the two-arg method so that "unexpected failure" would be added to the output. Please revert to that.

@pdowler pdowler merged commit 4aff7e7 into opencadc:master Jan 19, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants