Rollback non-decoupled any response on cancel
kthui committed Oct 5, 2023
1 parent a5892b6 commit f4e0ae1
Showing 2 changed files with 9 additions and 7 deletions.
11 changes: 4 additions & 7 deletions README.md
@@ -508,10 +508,8 @@ Supported error codes:
#### Request Cancellation Handling

One or more requests may be cancelled by the client during execution. Starting
-from 23.10, `request.is_cancelled()` returns whether the request is cancelled.
-
-If a request is cancelled, the model may respond with any dummy object in place
-of the normal output tensors on the request. For example:
+from 23.10, `request.is_cancelled()` returns whether the request is cancelled or
+not. For example:

```python
import triton_python_backend_utils as pb_utils
@@ -524,7 +522,8 @@ class TritonPythonModel:

         for request in requests:
             if request.is_cancelled():
-                responses.append(None)
+                responses.append(pb_utils.InferenceResponse(
+                    error=pb_utils.TritonError("Message", pb_utils.TritonError.CANCELLED)))
             else:
                 ...

@@ -600,8 +599,6 @@ full power of what can be achieved from decoupled API. Read
[Decoupled Backends and Models](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/decoupled_models.md)
for more details on how to host a decoupled model.

-#####
-
##### Known Issues

* Currently, decoupled Python models can not make async infer requests.
5 changes: 5 additions & 0 deletions src/pb_stub.cc
@@ -771,6 +771,11 @@ Stub::ProcessRequests(RequestBatch* request_batch_shm_ptr)
std::to_string(response_size) + "\n";
throw PythonBackendException(err);
}
+for (auto& response : responses) {
+  if (!py::isinstance<InferResponse>(response)) {
+    std::string str = py::str(response.get_type());
+  }
+}
for (size_t i = 0; i < response_size; i++) {
// If the model has checked for cancellation and the request is cancelled,
// replace returned type with a cancelled response.