Clean up ResponseFactory when a final complete flag is set #358
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
// Copyright 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
// Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
// | ||
// Redistribution and use in source and binary forms, with or without | ||
// modification, are permitted provided that the following conditions | ||
|
@@ -46,7 +46,16 @@ class ResponseSender { | |
intptr_t request_address_; | ||
intptr_t response_factory_address_; | ||
std::unique_ptr<SharedMemoryManager>& shm_pool_; | ||
// The flag to indicate if the response sender is closed. It is set to true | ||
// once the TRITONSERVER_RESPONSE_COMPLETE_FINAL flag is set, meaning that the | ||
// response_sender should not be used anymore. This flag is separate from the | ||
// `is_response_factory_cleaned_` flag because errors might occur after | ||
// complete_final flag is set but before the response_factory gets cleaned up. | ||
Review comment: Can you elaborate on the errors that might occur after sending the complete_final flag but before deleting the response_factory? Is it the model that is holding on to the response_sender forever? Reply: When the complete final flag is set, it is still possible that some error occurs before the response actually gets sent to the backend. For example, here and here. I think this bool variable |
||
bool closed_; | ||
std::shared_ptr<PbCancel> pb_cancel_; | ||
// The flag to indicate if the response_factory is already cleaned up in the | ||
// python backend side. If not, the response_factory will be cleaned up in the | ||
// destructor. | ||
bool is_response_factory_cleaned_; | ||
}; | ||
}}} // namespace triton::backend::python |
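The distinction between the two flags can be sketched as follows. This is a minimal illustration, not the actual `ResponseSender` code: the class `SketchSender`, its `Send` signature, and the `fail_before_send` parameter are hypothetical, and only the member names `closed_` and `is_response_factory_cleaned_` come from the diff. It assumes that `closed_` is set as soon as the final flag is seen, while the cleanup flag is set only once the send has actually gone through.

```cpp
#include <stdexcept>

// Hypothetical stand-in for ResponseSender. Only the two member names
// below are taken from the diff; everything else is illustrative.
class SketchSender {
 public:
  // `final_flag` models TRITONSERVER_RESPONSE_COMPLETE_FINAL being set.
  // `fail_before_send` simulates an error raised before the response
  // reaches the backend.
  void Send(bool final_flag, bool fail_before_send) {
    if (closed_) {
      throw std::runtime_error("response sender is closed");
    }
    if (final_flag) {
      // The sender must not be used again, even if the send below fails.
      closed_ = true;
    }
    if (fail_before_send) {
      throw std::runtime_error("error before the response was sent");
    }
    if (final_flag) {
      // Assumption: the receiving side deleted the response factory
      // after it saw the final flag.
      is_response_factory_cleaned_ = true;
    }
  }

  bool closed() const { return closed_; }
  bool factory_cleaned() const { return is_response_factory_cleaned_; }

 private:
  bool closed_ = false;
  bool is_response_factory_cleaned_ = false;
};
```

In the failure path the sender ends up closed but with its factory not yet cleaned, which is the state a single flag could not represent.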
Could we remove `EnqueueCleanupId` (and associated logic on the server side) now that the response factory is deleted when it sees the final flag?

I think we would still need to keep it for rescheduled requests. It's possible that the response sender hasn't sent the final flag yet and the request gets rescheduled. There will be a new response factory created for each request even if the request is a rescheduled one, so I think we would still need to clean up the response factory in the destructor.
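A rough sketch of the destructor fallback described here, under stated assumptions: `CleanupQueue`, `FactoryHandle`, and `MarkCleaned` are hypothetical names, and `CleanupQueue::Enqueue` merely stands in for whatever `EnqueueCleanupId` does in the real code. The idea is that the destructor queues a cleanup only when the factory was not already deleted via the final flag, which covers a request that is rescheduled before the final flag is sent.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical cleanup queue; a stand-in for the mechanism that tells
// the server side which response factories still need to be deleted.
struct CleanupQueue {
  std::vector<std::intptr_t> ids;
  void Enqueue(std::intptr_t id) { ids.push_back(id); }
};

// Hypothetical owner of one response factory address.
class FactoryHandle {
 public:
  FactoryHandle(std::intptr_t address, CleanupQueue& queue)
      : address_(address), queue_(queue) {}

  FactoryHandle(const FactoryHandle&) = delete;
  FactoryHandle& operator=(const FactoryHandle&) = delete;

  // Called once the factory has been deleted after the final flag.
  void MarkCleaned() { is_response_factory_cleaned_ = true; }

  ~FactoryHandle() {
    // Fallback path: the final flag was never sent (for example, the
    // request was rescheduled), so the factory still needs cleanup.
    if (!is_response_factory_cleaned_) {
      queue_.Enqueue(address_);
    }
  }

 private:
  std::intptr_t address_;
  CleanupQueue& queue_;
  bool is_response_factory_cleaned_ = false;
};
```

With this shape, a handle that saw the final flag leaves the queue untouched, while one destroyed early contributes exactly one cleanup entry.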