Question about memory domain rcache lru spin lock #9831
Some of our use cases (tag zcopy) always hit the following spin lock code, which seems to be a performance bottleneck: Line 377 in 42b9f55. Since the lock protects the per-context memory domain rcache LRU, it seems that all workers belonging to one UCX context compete with each other for it. So we tried splitting the UCX context so that each worker has its own, and the lock can be omitted. But we are not sure whether this is the right approach (UCX context and worker would be a 1:1 mapping). Any ideas? Thanks!
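To make the contention pattern concrete, here is a minimal toy sketch. It is not UCX code: `toy_context_t`, `toy_lock`, `toy_worker_main`, and `run_workers` are hypothetical names, and a plain counter stands in for the rcache LRU. It only models the idea that workers sharing one context spin on one lock, while workers with a 1:1 context mapping each take an uncontended lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_WORKERS    4
#define OPS_PER_WORKER 100000

/* Hypothetical stand-in for a context: one spin lock guarding one
 * piece of shared state (a counter playing the role of the LRU). */
typedef struct {
    atomic_flag   lock;
    unsigned long lru_updates;
} toy_context_t;

/* Minimal test-and-set spin lock built on C11 atomic_flag. */
static void toy_lock(toy_context_t *ctx)
{
    while (atomic_flag_test_and_set_explicit(&ctx->lock,
                                             memory_order_acquire)) {
        /* busy-wait until the current holder clears the flag */
    }
}

static void toy_unlock(toy_context_t *ctx)
{
    atomic_flag_clear_explicit(&ctx->lock, memory_order_release);
}

/* Each worker thread updates its context's state under the lock. */
static void *toy_worker_main(void *arg)
{
    toy_context_t *ctx = arg;

    for (int i = 0; i < OPS_PER_WORKER; i++) {
        toy_lock(ctx);
        ctx->lru_updates++;
        toy_unlock(ctx);
    }
    return NULL;
}

/* Runs NUM_WORKERS threads. If shared is non-zero, every worker uses
 * ctxs[0] (one lock for all); otherwise worker i uses ctxs[i] (1:1).
 * Returns the total number of updates recorded across all contexts. */
unsigned long run_workers(int shared)
{
    toy_context_t ctxs[NUM_WORKERS];
    pthread_t     threads[NUM_WORKERS];
    unsigned long total = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        atomic_flag_clear(&ctxs[i].lock);
        ctxs[i].lru_updates = 0;
    }
    for (int i = 0; i < NUM_WORKERS; i++) {
        int ret = pthread_create(&threads[i], NULL, toy_worker_main,
                                 shared ? &ctxs[0] : &ctxs[i]);
        assert(ret == 0);
        (void)ret;
    }
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(threads[i], NULL);
    }
    for (int i = 0; i < NUM_WORKERS; i++) {
        total += ctxs[i].lru_updates;
    }
    return total;
}
```

Both `run_workers(1)` and `run_workers(0)` perform the same number of updates; the difference is only in how many threads compete for each lock. Timing the two calls on the target machine would be one way to check whether the shared lock is really the bottleneck there.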
Replies: 4 comments 2 replies
Can you pls post the backtrace of this lock where you see contention?
Thanks for the info! Below is the backtrace captured by perf. The environment is an aarch64 machine; I didn't see this bottleneck on the x64 platform.
The backtrace suggests you are using an older UCX version, which uses protov1 and does not use the lock-less rcache APIs. Can you pls try with the latest master branch, or the v1.16.0 release?
Unfortunately, the change is too big to backport. However, if the application is single-threaded, it should be safe to remove this spin lock as a local patch.
Can you pls post the backtrace of this lock where you see contention?
Anyway, it should be ok to create mul…