Remove Dependency on GHTorrent #20

Open
nuthanmunaiah opened this issue Sep 30, 2020 · 1 comment
@nuthanmunaiah
Member

Description

reaper requires the GHTorrent database to be restored to a MySQL/MariaDB instance. Restoring the full GHTorrent database before running reaper is prohibitively time-consuming (the GHTorrent database dump from 2019-06-01 is over 100 GB in size). Removing the dependency on GHTorrent will require reaper to mine GitHub directly for the repository data and metadata that the GHTorrent project has already mined. On the other hand, users will no longer need to restore data and metadata for several million repositories when all they want to do is analyze a few.
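
As a rough illustration of what mining GitHub directly for a single repository could look like, here is a minimal sketch that calls the GitHub REST API endpoint `GET /repos/{owner}/{repo}`. The function names (`repo_url`, `extract_metadata`, `fetch_metadata`) and the choice of fields are hypothetical and do not reflect reaper's actual code or the GHTorrent schema; which fields reaper really needs is an assumption here.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

API_ROOT = "https://api.github.com"


def repo_url(full_name: str) -> str:
    """Build the REST API URL for a repository given as 'owner/name'."""
    owner, _, name = full_name.partition("/")
    if not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/name', got {full_name!r}")
    return f"{API_ROOT}/repos/{owner}/{name}"


def extract_metadata(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only a few fields from the API response (hypothetical selection)."""
    license_info = payload.get("license") or {}  # 'license' may be null
    return {
        "full_name": payload.get("full_name"),
        "language": payload.get("language"),
        "stars": payload.get("stargazers_count"),
        "forks": payload.get("forks_count"),
        "created_at": payload.get("created_at"),
        "pushed_at": payload.get("pushed_at"),
        "license": license_info.get("spdx_id"),
    }


def fetch_metadata(full_name: str, token: Optional[str] = None) -> Dict[str, Any]:
    """Fetch metadata for one repository; a token raises the rate limit."""
    request = urllib.request.Request(
        repo_url(full_name),
        headers={"Accept": "application/vnd.github+json"},
    )
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return extract_metadata(json.load(response))
```

With something along these lines, analyzing a few repositories costs a few API requests each (for example `fetch_metadata("octocat/Hello-World")`) instead of a full database restore. The trade-off is that API rate limits apply, so mining a large sample would likely need an access token and some form of caching.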

@JEIEJEE

JEIEJEE commented Oct 21, 2020

i think it is convenient for outlier
