Rollback non-decoupled any response on cancel
kthui committed Oct 5, 2023
1 parent a5892b6 commit 3de5922
Showing 6 changed files with 7 additions and 32 deletions.
11 changes: 4 additions & 7 deletions README.md
@@ -508,10 +508,8 @@ Supported error codes:
 #### Request Cancellation Handling
 
 One or more requests may be cancelled by the client during execution. Starting
-from 23.10, `request.is_cancelled()` returns whether the request is cancelled.
-
-If a request is cancelled, the model may respond with any dummy object in place
-of the normal output tensors on the request. For example:
+from 23.10, `request.is_cancelled()` returns whether the request is cancelled or
+not. For example:
 
 ```python
 import triton_python_backend_utils as pb_utils
@@ -524,7 +522,8 @@ class TritonPythonModel:
 
         for request in requests:
             if request.is_cancelled():
-                responses.append(None)
+                responses.append(pb_utils.InferenceResponse(
+                    error=pb_utils.TritonError("Message", pb_utils.TritonError.CANCELLED)))
             else:
                 ...
 
@@ -600,8 +599,6 @@ full power of what can be achieved from decoupled API. Read
 [Decoupled Backends and Models](https://github.com/triton-inference-server/server/blob/main/docs/user_guide/decoupled_models.md)
 for more details on how to host a decoupled model.
 
-#####
-
 ##### Known Issues
 
 * Currently, decoupled Python models can not make async infer requests.
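The README change above replaces the `None` placeholder with an explicit cancelled `InferenceResponse`. A minimal self-contained sketch of that pattern, using stand-in classes in place of the real `triton_python_backend_utils` API (the real module is only available inside the Triton Python backend, so these mocks are assumptions for illustration only):

```python
# Stand-ins for the real triton_python_backend_utils types (illustration only).
class TritonError:
    CANCELLED = "CANCELLED"

    def __init__(self, message, code=None):
        self.message = message
        self.code = code


class InferenceResponse:
    def __init__(self, output_tensors=None, error=None):
        self.output_tensors = output_tensors or []
        self.error = error


class Request:
    def __init__(self, cancelled):
        self._cancelled = cancelled

    def is_cancelled(self):
        return self._cancelled


def execute(requests):
    # Mirrors the documented pattern: a cancelled request gets an error
    # response with the CANCELLED code instead of a None placeholder.
    responses = []
    for request in requests:
        if request.is_cancelled():
            responses.append(InferenceResponse(
                error=TritonError("Message", TritonError.CANCELLED)))
        else:
            responses.append(InferenceResponse(output_tensors=["dummy"]))
    return responses
```

Returning a real response object for every request, cancelled or not, is what lets the stub drop its special-case rewriting in the C++ changes below.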
6 changes: 0 additions & 6 deletions src/infer_request.cc
@@ -410,12 +410,6 @@ InferRequest::IsCancelled()
   return pb_cancel_->IsCancelled();
 }
 
-bool
-InferRequest::IsCancelledLastResponse()
-{
-  return pb_cancel_->IsCancelledInternalFlag();
-}
-
 std::shared_ptr<ResponseSender>
 InferRequest::GetResponseSender()
 {
1 change: 0 additions & 1 deletion src/infer_request.h
@@ -109,7 +109,6 @@ class InferRequest {
   std::shared_ptr<InferResponse> Exec(const bool is_decoupled);
   std::shared_ptr<ResponseSender> GetResponseSender();
   bool IsCancelled();
-  bool IsCancelledLastResponse();
 #endif
 
   /// Save an Inference Request to shared memory.
6 changes: 0 additions & 6 deletions src/pb_cancel.cc
@@ -54,12 +54,6 @@ PbCancel::ShmPayload()
   return cancel_shm_.data_.get();
 }
 
-bool
-PbCancel::IsCancelledInternalFlag()
-{
-  return is_cancelled_;
-}
-
 bool
 PbCancel::IsCancelled()
 {
2 changes: 0 additions & 2 deletions src/pb_cancel.h
@@ -46,8 +46,6 @@ class PbCancel {
   bi::managed_external_buffer::handle_t ShmHandle();
   IsCancelledMessage* ShmPayload();
 
-  bool IsCancelledInternalFlag();
-
   bool IsCancelled();
   void ReportIsCancelled(bool is_cancelled);
13 changes: 3 additions & 10 deletions src/pb_stub.cc
@@ -771,17 +771,10 @@ Stub::ProcessRequests(RequestBatch* request_batch_shm_ptr)
         std::to_string(response_size) + "\n";
     throw PythonBackendException(err);
   }
-  for (size_t i = 0; i < response_size; i++) {
-    // If the model has checked for cancellation and the request is cancelled,
-    // replace returned type with a cancelled response.
-    if (py_request_list[i].cast<InferRequest*>()->IsCancelledLastResponse()) {
-      responses[i] = std::make_shared<InferResponse>(
-          std::vector<std::shared_ptr<PbTensor>>{},
-          std::make_shared<PbError>("", TRITONSERVER_ERROR_CANCELLED));
-    }
+  for (auto& response : responses) {
     // Check the return type of execute function.
-    else if (!py::isinstance<InferResponse>(responses[i])) {
-      std::string str = py::str(responses[i].get_type());
+    if (!py::isinstance<InferResponse>(response)) {
+      std::string str = py::str(response.get_type());
       throw PythonBackendException(
           std::string("Expected an 'InferenceResponse' object in the execute "
               "function return list, found type '") +
