[GraphBolt] Refactor NeighborSampler and expose fine-grained datapipes. #6983
Incorporating the concept of Additionally, there is a dependency between |
I am not sure I fully understand your point of view. I guess discussing this in a meeting is the best way forward, as communication over text can be slow and prone to misunderstanding. Thank you very much for the early feedback!
I wanted to separate sampling and the compact operation because implementing the Cooperative Minibatching idea will require us to customize the compact operation later to avoid unnecessary work. Cooperative Minibatching: https://arxiv.org/pdf/2310.12403.pdf
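To make the separation concrete, here is a minimal sketch in plain Python of sampling and compaction as two independent steps. It is not the GraphBolt implementation: `sample_neighbors`, `compact`, and the adjacency-dict graph format are hypothetical names chosen for illustration only.

```python
import random


def sample_neighbors(adj, seeds, fanout, rng):
    """Sampling step (hypothetical): pick up to `fanout` in-neighbors per
    seed and return the edges as (src, dst) pairs in global node IDs."""
    edges = []
    for dst in seeds:
        neighbors = adj.get(dst, [])
        k = min(fanout, len(neighbors))
        for src in rng.sample(neighbors, k):
            edges.append((src, dst))
    return edges


def compact(edges, seeds):
    """Compaction step (hypothetical): relabel global node IDs to contiguous
    local IDs, with the seeds occupying the first local IDs."""
    local_id = {}
    for node in seeds:
        local_id.setdefault(node, len(local_id))
    for src, _ in edges:
        local_id.setdefault(src, len(local_id))
    compacted = [(local_id[src], local_id[dst]) for src, dst in edges]
    original_ids = list(local_id)  # local ID -> global ID, in insertion order
    return compacted, original_ids


# Toy graph: node -> list of in-neighbors (assumed format).
adj = {10: [20, 30], 20: [30]}
seeds = [10, 20]
edges = sample_neighbors(adj, seeds, fanout=5, rng=random.Random(0))
compacted, original_ids = compact(edges, seeds)
```

Because `compact` only consumes the output of `sample_neighbors`, it can be swapped for a customized version without touching the sampling step.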
Just ping me when needed. We can have a short meeting or discuss it in the weekly meeting.
Splitting a large datapipe (stage) into smaller ones (sub-stages) creates opportunities for better pipelining and scheduling, and therefore better performance. It could also benefit from optimizations in specific calls (ops) such as sampling and compact. However, the tradeoff between the scheduling benefits and the overhead incurred should be carefully measured and profiled. I think the scheduling logic (or the top-level DataLoader) is the major part that needs sophisticated design to accommodate the various cases.
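One way to measure the per-stage cost mentioned above is to wrap each sub-stage in a timing generator. The sketch below uses plain Python generators rather than real datapipes; `timed_stage`, `toy_sample`, and `toy_compact` are hypothetical stand-ins, not GraphBolt APIs.

```python
import time
from collections import defaultdict

# Accumulated wall-clock seconds spent inside each named stage.
stage_seconds = defaultdict(float)


def timed_stage(name, fn, upstream):
    """Apply `fn` to every item of `upstream`, recording time spent in `fn`."""
    for item in upstream:
        start = time.perf_counter()
        out = fn(item)
        stage_seconds[name] += time.perf_counter() - start
        yield out


def toy_sample(seeds):
    """Stand-in for a sampling sub-stage: one fake edge (s + 1, s) per seed."""
    return [(s + 1, s) for s in seeds]


def toy_compact(edges):
    """Stand-in for a compaction sub-stage: sorted unique node IDs."""
    return sorted({node for edge in edges for node in edge})


batches = [[0, 1], [2, 3]]
pipeline = timed_stage(
    "compact", toy_compact, timed_stage("sample", toy_sample, iter(batches))
)
results = list(pipeline)
for name, seconds in stage_seconds.items():
    print(f"{name}: {seconds:.6f}s")
```

Comparing these per-stage totals against the end-to-end time of a fused version gives a rough estimate of the overhead that the split introduces.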
@Rhett-Ying the modified example is runnable. I will run the two versions and report if there is any regression. In my experience so far, the runtime was not affected at all; I will provide actual data on it, though.
sample_neighbor:
sample_neighbor2:
I am quite happy with the final refactored version. Quite elegant in my opinion, thanks to the reviewers' help.
LGTM. @peizhou001 please take another look and approve.
Approve with a small comment.
Description
We expose the different steps of the graph sampling operation so that optimizations can be implemented later.
Checklist
Please feel free to remove inapplicable items for your PR.
Changes