[Discussion] Moving to HuggingFace for some databases #242

Open
Dsantra92 opened this issue Aug 30, 2024 · 2 comments

Comments

@Dsantra92
Collaborator

Some of the (graph) databases that we are trying to support might have one or more of the following problems:

  1. They are hosted on university servers or a non-trusted source that cannot provide proper download speeds throughout the globe.
  2. They aren't hosted anywhere and come with a license.
  3. They are stored in Python-specific formats.

HuggingFace now has a good set of community-maintained graph datasets. If we run into any of the issues above for a dataset, we can add it to HF, pull it from there, and process it as required. I believe this will largely reduce the code needed for integrating and testing new datasets. I am not sure about the planned support for https://github.com/FluxML/HuggingFaceApi.jl, but this seems to me like a better idea than relying on links that can fail without warning.

cc: @CarloLucibello

@CarloLucibello
Member

It would be nice to have HF as official storage. Maybe we can replace the current download links with HF ones without having to resort to HF's API?
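As a rough illustration of what a plain HF link could look like: files in a Hub dataset repository can be fetched over HTTPS through the Hub's `resolve` URL scheme, so a direct link can be built by string formatting alone. The sketch below is in Python only to show the URL shape; the helper name `hf_dataset_url` and the repository and file names are hypothetical, not taken from this thread, and the same URL could be handed to whatever downloader the package already uses.

```python
from urllib.parse import quote

HF_BASE = "https://huggingface.co"


def hf_dataset_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL for one file in a Hub dataset repo.

    Hypothetical helper: it only formats a string and makes no network call.
    """
    # Keep "/" in the repo id ("org/name") and in nested file paths,
    # but fully encode the revision, since it is a single path segment.
    return (
        f"{HF_BASE}/datasets/{quote(repo_id, safe='/')}"
        f"/resolve/{quote(revision, safe='')}/{quote(filename, safe='/')}"
    )


# Hypothetical repo and file names, for illustration only.
print(hf_dataset_url("some-org/some-graph-dataset", "data/edges.csv"))
```

Pinning `revision` to a tag or commit hash instead of `main` would keep a link stable even if the dataset repository is updated later.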

@Dsantra92
Collaborator Author

That sounds nice. I will see if we can bypass calling HF's API.
