Question about planning ... failures? #63
Comments
Same result when I use the code from |
If it helps, when I did the following cell in the notebook:
I got back:
|
Hrmz...there should be a log in there, ya. Are you able to see anything on the monitor? Same base URL, with the port |
Hm. The server is up ( And looking at a list of my processes, this does not look good:
Looks like these aren't getting reaped? But also, I thought all the action was inside the docker containers? Or is PaaS running singularity on the host machine? |
Listing the running containers, I don't see anything that looks like the monitor (but I don't really know how to read a docker-compose.yml file, so I could be wrong):
|
Indeed, the monitor isn't there. It's all done through the docker containers, but they're bridged on the virtual network. You may be facing a version mismatch similar to #65. Can you try running my version here? It's in #62, waiting for review. Added bonus: it has your frontend port as a parameter now in the |
Aha! I tried
Given that the timeout value in the call (20s) is less than the runtime reported (20.21391299692914s) is there any chance that the server is misinterpreting a timeout as successful plan generation? |
It could be, ya. That'd be my hunch as well... @nirlipo : any recollection on what the status code is meant to be on timeout? I'll start poking around to find it in the meantime... |
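The suspicion above (a reported runtime slightly over the requested limit, with no plan returned) can be checked on the client side. This is a minimal sketch only: the function name and its two parameters are hypothetical, and it assumes the caller already has the reported runtime and the requested limit as numbers in seconds.

```python
def likely_timed_out(reported_runtime_s: float, time_limit_s: float) -> bool:
    """Heuristic: a run that lasted at least as long as its limit was probably cut off.

    This is only a hint. The server's own status code, if it reports one,
    should take precedence.
    """
    return reported_runtime_s >= time_limit_s


# Values quoted in the thread: a 20 s limit and a 20.21391299692914 s runtime.
suspicious = likely_timed_out(20.21391299692914, 20)
```

With the numbers from the thread, `suspicious` is `True`, which is consistent with the hunch that the response describes a timeout rather than a successful run.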
OK, one (last?) update. Is there any chance the monitor has failed?
There is no |
Yep -- can you try my branch? Should hopefully fix it. |
So the celery process gets the time limit + 10 (just to make sure the service ends appropriately):
If this is triggered, then this would have been the response:
The planner itself is given the exact limit:
So the question is what is in the stdout/stderr when the planner completes:
Weird that there would be nothing (the first response you posted). I don't think it's the planning-editor-adapter you want to use. |
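The two-level limit described above (the planner gets the exact limit, the surrounding task gets the limit plus 10 seconds) can be sketched with the standard library alone. This is not the server's actual code: `run_with_limit`, `outer_limit`, and the returned dictionary keys are hypothetical names chosen for illustration, and `subprocess` stands in for however the service really launches planners.

```python
import subprocess
import sys

GRACE_SECONDS = 10  # per the thread: the outer task gets the planner limit + 10


def outer_limit(time_limit: float) -> float:
    """Limit for the surrounding task, so it outlives the planner's own limit."""
    return time_limit + GRACE_SECONDS


def run_with_limit(cmd, time_limit: float) -> dict:
    """Run cmd with a hard limit and report a timeout explicitly.

    The point of the explicit "timed_out" flag is that a timeout can never be
    mistaken for a successful run that simply produced no output.
    """
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=time_limit)
    except subprocess.TimeoutExpired:
        # The child has been killed; any partial output is discarded in this sketch.
        return {"timed_out": True, "returncode": None, "stdout": "", "stderr": ""}
    return {
        "timed_out": False,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # A stand-in "planner" that sleeps longer than its limit.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_limit(slow, time_limit=0.5)["timed_out"])
```

Returning a structured result in both branches means the caller always has something to log, even when stdout and stderr are empty.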
Should I expect the plan to be in

Also, a side issue that I will raise in a separate issue: when one is running rootless and starts a privileged container, it's easy to end up with files created in mounted directories with UIDs that don't exist on the host machine, and which the host machine owner therefore cannot remove, e.g.,

If one must do this, there's some way to sync the UIDs. I will talk to my colleagues and ask them for the trick. But, of course, if one can avoid running as root while writing files in shared directories, that's great. |
Without an adapter specified, I think it's in
This issue has plagued me across many projects. Would love to know an answer to how to treat it more appropriately! |
See #66. We have an approach locally to syncing file ownership, but it involves all kinds of shell scripting to sync up UIDs between the host and dockerfile, where the dockerfile imports a shell script ( |
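One common way to sync UIDs between host and image is to pass the host user's IDs in at build time. The sketch below only assembles the command line. It assumes the Dockerfile declares build arguments named `HOST_UID` and `HOST_GID` and creates its user from them; both names are hypothetical, and nothing in the thread confirms this is the approach the colleagues use. It also assumes a conventional Docker daemon, since under rootless Docker the user-namespace mapping changes which host UID a container UID corresponds to.

```python
import os


def uid_build_args(uid=None, gid=None):
    """Build `--build-arg` options carrying the host user's UID and GID.

    HOST_UID and HOST_GID are hypothetical ARG names that the Dockerfile
    would have to declare and use when creating its non-root user.
    """
    uid = os.getuid() if uid is None else uid
    gid = os.getgid() if gid is None else gid
    return ["--build-arg", f"HOST_UID={uid}", "--build-arg", f"HOST_GID={gid}"]


def docker_build_command(tag, context_dir=".", uid=None, gid=None):
    """Assemble a `docker build` command line without running it."""
    return ["docker", "build", *uid_build_args(uid, gid), "-t", tag, context_dir]
```

Building the command as a list keeps it easy to inspect before it is handed to `subprocess.run`.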
Sorry to be so difficult. I grabbed your new version @haz and now I see this in
Unfortunately, there's no indication of what the user exception

In the client Jupyter notebook, I see this
That's the value returned by this cell:

    package_name = "lama-first"
    service = "solve"
    service_url = "http://" + ip + ":5001/package/" + package_name + "/" + service
    solve_request = requests.post(service_url, json=req_body, headers=headers).json()
    solve_request

I'm at a bit of a dead end here, since when I search the AI-Planning group on GitHub for "This Planutils package" I find nothing. |
Aye, this is the kind of catch-all, and more info really should be logged. Usually it occurs when I try calling a planner that's not installed (or it doesn't have that privileged status to run). See planning-as-a-service/server/api/app.py, line 57 in 8528c9d.
Note that the

If you have a dom/prob pairing you're willing to share, I can make sure I get to a workable state on my side with things? |
I was just using the example in the Jupyter notebook in the planning-as-a-service repo. I will attach my version of that here. |
Oh, yes. It might be reasonable to put the exception in there (presumably one can translate arbitrary exceptions into json?) and possibly even stash the backtrace... |
It may be server internal stuff, and perhaps not the wisest to send back...it fires on any 500 error. But at present it's not even in the server logs, and that should be fixed! |
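The trade-off above (keep internals out of the response, but get them into the server logs) can be sketched as follows. The logger name, the function `error_response`, and the response body are hypothetical; this is not the handler in `app.py`, only an illustration using the standard `logging` and `json` modules.

```python
import json
import logging

# Hypothetical logger name; the real server may configure logging differently.
logger = logging.getLogger("paas.server")


def error_response(exc: BaseException):
    """Log the full traceback server-side and return a generic 500 body.

    The client sees only a fixed message, so internal details do not leak,
    while the log record carries the exception and its traceback.
    """
    logger.error("Unhandled error while serving request", exc_info=exc)
    body = json.dumps({"error": "Internal server error"})
    return body, 500


if __name__ == "__main__":
    try:
        raise FileNotFoundError("planner binary missing")
    except FileNotFoundError as caught:
        print(error_response(caught))
```

The same function could be called from whatever catch-all handler fires on a 500 error, so that every such failure leaves a trace in the logs.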
Are the server logs what I see in |
Random thought: could you grab up any errors raised by the planners themselves and give them a specific type? Then you could send along exceptions of that type and backtraces, and ignore miscellaneous errors that would not be of type |
Problem with that one is that every planner has its own custom way of running and failing. I think one thing we could do is have planner authors implement their own way to run it so that the result is JSON output. Then we just echo that JSON back to the user who invoked it. E.g., the Fast Downward driver (written in Python) could have a top-level try-catch that dumps JSON with failure info when a failure occurs. That also makes it easy to host arbitrary planning software (e.g., domain generators, model acquisition techniques, VAL, etc.). It just needs to be on the individual package maintainer's side, rather than something cross-cutting for all the solvers. |
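The top-level try-catch idea above might look roughly like this on the package maintainer's side. Everything here is an assumption made for illustration: `run_as_json`, the JSON keys, and the stand-in planner functions are hypothetical, and this is not how the Fast Downward driver is actually written.

```python
import json
import traceback


def run_as_json(planner_main, argv) -> str:
    """Call a planner entry point and always return a JSON document.

    On success the planner's result is wrapped under "result"; on failure the
    exception type, message, and traceback are reported instead of being lost.
    """
    try:
        result = planner_main(argv)
        return json.dumps({"status": "ok", "result": result})
    except Exception as exc:
        return json.dumps({
            "status": "error",
            "error_type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        })


def fake_planner(argv):
    """Stand-in planner that returns a fixed plan."""
    return {"plan": ["(pick a)", "(drop a)"]}


def broken_planner(argv):
    """Stand-in planner that fails the way an uninstalled planner might."""
    raise FileNotFoundError("planner binary not found")
```

Because the wrapper returns a string in both cases, the service can echo it back unchanged, which matches the suggestion that the server should not need planner-specific error handling.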
Minor change for better error handling: haz#2 |
I took the jupyter notebook from the repo (which is, I think, the same as the python script in the README) and modified it slightly (print the URL and the response from celery) to test my PaaS server. I'm getting a very odd result:
So no plan, but no error output, either, and celery seems to be claiming my request is no longer PENDING. Is this what I should expect if there's a timeout?
Is there a log I should be looking at to diagnose this?
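While diagnosing this, a client-side polling loop with its own deadline makes it easier to tell "still pending" apart from "finished with an empty result". The sketch below assumes only what the question mentions, namely that the response carries a status that reads `PENDING` while the job runs. The `fetch_status` callable is hypothetical: in the notebook it would wrap whatever HTTP request retrieves the celery result.

```python
import time


def poll_until_done(fetch_status, deadline_s, interval_s=0.5,
                    clock=time.monotonic, sleep=time.sleep):
    """Call fetch_status() until its status is no longer "PENDING".

    fetch_status is a hypothetical callable returning a dict. A TimeoutError
    is raised if the job is still pending once deadline_s seconds have passed,
    so a stuck job is reported instead of being polled forever.
    """
    start = clock()
    while True:
        response = fetch_status()
        if response.get("status", "") != "PENDING":
            return response
        if clock() - start >= deadline_s:
            raise TimeoutError(f"still PENDING after {deadline_s} s")
        sleep(interval_s)
```

Printing the full final response, rather than only the plan field, shows whether the server returned an empty result or an error that the notebook was not displaying.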