Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default 'remove_self_loops' set to True in heterogeneous link prediction example problematic for use case with multiple node types? #24

Open
anniekmyatt opened this issue May 28, 2021 · 2 comments

Comments

@anniekmyatt
Copy link

anniekmyatt commented May 28, 2021

Hello,
I am working on setting up link prediction on a heterogeneous graph with different node types and used the examples in deepsnap/examples/link_prediction_hetero as a starting point. In these examples, the HeteroSAGEConv layer is used, with its default of setting 'remove_self_loops' to true.

Could this be problematic when I have two different node types, as the edge index for the links between these could have links between, for example, node 0 of one node type and node 0 of the other node type? In fact, this is what happened for me.

As far as I understood the negative sampling algorithm in the hetero_graph/negative_sampling() method expects the indices for both node types to start from zero, rather than have unique indices, so the solution can't be to create unique node indices across the different node types?

If I am correct, it may be worth adding a comment on the need to remove self-loops in careful pre-processing instead of using the default functionality in the HeteroSAGEConv layer?
Another option could be to add a check if the source and destination node type are the same in the HeteroSAGEConv implementation, before running pyg_utils.remove_self_loops(edge_index)? I guess it doesn't really make sense to remove self loops in the case where the source and destination node type is different anyway?

If I have misunderstood the situation, I would very much appreciate to be corrected!

Thank you!

@zechengz
Copy link
Collaborator

Hi,

Yes, the current remove_self_loops=True will be problematic when the node types are different. Thanks for pointing out this problem! I will update the HeteroSAGEConv layer recently.

@anniekmyatt
Copy link
Author

Thank you, I’m glad it is useful! I appreciate that the impact in reality will likely be small but it is still good to not have unintended behaviour.

Thank you for your quick reply!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants