
The possibility of applying Polynormer on large-scale datasets #2

Open · xioacd99 opened this issue Aug 16, 2024 · 1 comment

@xioacd99
A simple and elegant work, and it seems to be the state-of-the-art graph transformer for node classification.

I notice that the largest dataset used in your paper is ogbn-products, with about 2 million nodes, and I wonder whether Polynormer can be applied to much larger datasets, such as ogbn-papers100M with about 100 million nodes.

Similar work such as SGFormer reports experimental results on ogbn-arxiv/products/papers100M, so I am curious why Polynormer, which is also a linear transformer, has no experiments on ogbn-papers100M.

@Chenhui1016 (Contributor)

Hi,

Thanks for your interest!

Yes, Polynormer can also be used on larger datasets, including ogbn-papers100M. Polynormer employs a random partition method for mini-batch training, similar to the approach SGFormer uses to scale to large graphs. By adjusting the batch size, Polynormer avoids GPU out-of-memory issues regardless of the size of the underlying graph; the sketch below illustrates the idea. While evaluating on even larger datasets would be ideal, we believe our experiments with 2M-node graphs already demonstrate that Polynormer scales effectively to large graphs through mini-batch training. I hope this answers your question.
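
For readers wondering what random-partition mini-batch training looks like in practice, here is a minimal sketch (not the authors' exact code). It assumes a PyG-style `data` object with `x`, `y`, `edge_index`, and `train_mask`, and a node-level `model` such as Polynormer; the hypothetical `num_parts` parameter controls the per-batch node count and hence peak GPU memory.

```python
# Sketch of random-partition mini-batch training: each epoch, node IDs are
# randomly permuted and split into chunks, and the model trains on the
# subgraph induced by each chunk. Increasing num_parts shrinks each batch.
import torch
import torch.nn.functional as F
from torch_geometric.utils import subgraph

def train_epoch(model, data, optimizer, num_parts=10, device="cuda"):
    model.train()
    # Fresh random partition of all nodes every epoch.
    perm = torch.randperm(data.num_nodes)
    for part in perm.chunk(num_parts):
        # Induced subgraph on this chunk; nodes relabeled to 0..len(part)-1.
        edge_index, _ = subgraph(part, data.edge_index,
                                 relabel_nodes=True,
                                 num_nodes=data.num_nodes)
        x = data.x[part].to(device)
        y = data.y[part].to(device)
        mask = data.train_mask[part].to(device)
        if mask.sum() == 0:  # no labeled training nodes in this chunk
            continue

        optimizer.zero_grad()
        out = model(x, edge_index.to(device))
        loss = F.cross_entropy(out[mask], y[mask])
        loss.backward()
        optimizer.step()
```

Since each batch only materializes one chunk's features and induced edges on the GPU, memory scales with the chunk size rather than the full graph, which is what makes 100M-node datasets feasible.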
