Default 'remove_self_loops' set to True in heterogeneous link prediction example problematic for use case with multiple node types? #24

anniekmyatt · 2021-05-28T08:41:46Z

Hello,
I am working on setting up link prediction on a heterogeneous graph with different node types and used the examples in deepsnap/examples/link_prediction_hetero as a starting point. In these examples, the HeteroSAGEConv layer is used, with its default of setting 'remove_self_loops' to true.

Could this be problematic when I have two different node types? The edge index for the links between them can contain an edge between, for example, node 0 of one node type and node 0 of the other node type, and such an edge would be treated as a self-loop and removed. In fact, this is what happened for me.

As far as I understand, the negative sampling algorithm in the hetero_graph/negative_sampling() method expects the indices for each node type to start from zero rather than to be unique across node types, so the solution can't be to create unique node indices across the different node types?

If I am correct, it may be worth adding a comment that self-loops should be removed during careful pre-processing rather than by relying on the default behaviour of the HeteroSAGEConv layer?
Another option could be to check, in the HeteroSAGEConv implementation, whether the source and destination node types are the same before running pyg_utils.remove_self_loops(edge_index)? I guess it doesn't really make sense to remove self-loops when the source and destination node types are different anyway?
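
To make the concern and the suggested check concrete, here is a minimal sketch in plain Python. It does not use the DeepSNAP or PyG code: `remove_self_loops` below is a simplified stand-in for pyg_utils.remove_self_loops that works on a list of (source, destination) pairs, and `maybe_remove_self_loops`, the node type names, and the example edges are all hypothetical names chosen for illustration.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def remove_self_loops(edges: List[Edge]) -> List[Edge]:
    """Simplified stand-in for pyg_utils.remove_self_loops.

    Drops every edge whose source and destination indices are equal,
    without knowing anything about node types.
    """
    return [(src, dst) for src, dst in edges if src != dst]


def maybe_remove_self_loops(edges: List[Edge], src_type: str, dst_type: str) -> List[Edge]:
    """Hypothetical guard: only strip self-loops when both ends share a node type."""
    if src_type == dst_type:
        return remove_self_loops(edges)
    # Different node types: equal indices refer to different nodes, so keep everything.
    return list(edges)


# Edges between two node types (assumed names "user" and "item").
# Indices are local to each node type and both start from zero.
user_item_edges = [(0, 0), (0, 1), (1, 1), (2, 0)]

# Unconditional removal silently drops (0, 0) and (1, 1), which are real edges here.
unconditional = remove_self_loops(user_item_edges)

# With the guard, edges between different node types are left untouched.
guarded = maybe_remove_self_loops(user_item_edges, "user", "item")

# Within a single node type, true self-loops are still removed.
same_type = maybe_remove_self_loops([(0, 0), (0, 1)], "user", "user")

print(unconditional)
print(guarded)
print(same_type)
```

In this sketch the unconditional call loses two of the four user-to-item edges, while the guarded version keeps all of them and still removes the genuine self-loop in the same-type case.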

If I have misunderstood the situation, I would very much appreciate being corrected!

Thank you!

zechengz · 2021-05-31T07:05:35Z

Hi,

Yes, the current remove_self_loops=True default will be problematic when the node types are different. Thanks for pointing out this problem! I will update the HeteroSAGEConv layer soon.

anniekmyatt · 2021-05-31T09:24:24Z

Thank you, I’m glad it is useful! I appreciate that the impact in reality will likely be small but it is still good to not have unintended behaviour.

Thank you for your quick reply!
