Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- copy_borrowed_tensor_in_async_mode does not stall for device tensors anymore
- Typechecking moved to compile time
- work_executor optimizations: Pass shared ptrs down to workers, instead of lambda objects. Lock Free Queue is now statically initialized
- launch_op optimization: lambda initialized outside multi-device for loop
- Tensor deallocate optimization: Pass attribute ptr to lambda instead of passing entire tensor object
- System Level Optimizations: Set process priority to 0. Bind CQ reader to core and use CV to toggle its state instead of calling sleep
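The last item (toggling the CQ reader with a condition variable instead of calling sleep) can be illustrated with a minimal sketch. This is not the code from the commit: `ToggleableReader`, `set_active`, and `polls` are hypothetical names, the "poll" is a counter increment standing in for reading a completion queue, and core pinning is left out because it is platform-specific.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a completion-queue reader thread.
// While inactive, the thread blocks on a condition variable and uses no CPU,
// rather than waking up periodically from a sleep call to check a flag.
class ToggleableReader {
public:
    ToggleableReader() : worker_([this] { run(); }) {}

    ~ToggleableReader() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Switch the reader between polling (true) and parked (false).
    void set_active(bool active) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            active_ = active;
        }
        cv_.notify_one();
    }

    // Number of poll iterations performed so far.
    int polls() const { return polls_.load(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        while (true) {
            // Park here until the reader is activated or shut down.
            cv_.wait(lock, [this] { return active_ || stop_; });
            if (stop_) {
                return;
            }
            // Release the lock while doing the (simulated) work.
            lock.unlock();
            polls_.fetch_add(1);  // stand-in for reading one completion entry
            std::this_thread::yield();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool active_ = false;
    bool stop_ = false;
    std::atomic<int> polls_{0};
    std::thread worker_;  // declared last so the members above exist before the thread starts
};
```

A caller would construct the reader once, call `set_active(true)` when work is outstanding, and `set_active(false)` when the queue drains. Compared with a sleep loop, the parked thread wakes only when notified, so there is no polling interval to tune and no wake-up latency tied to a sleep duration.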
1 parent a20cb5c, commit cd0587b
Showing 10 changed files with 271 additions and 139 deletions.