SciHub... #16

Open
nickschurch opened this issue Jul 6, 2017 · 2 comments

Comments

@nickschurch

Come on, we know you want to! ;)

Similar to some of the others, it'd require reverse engineering the text from PDF...

@blahah
Member

blahah commented Jul 6, 2017

Some practical considerations:

  • in some jurisdictions this might be unlawful
  • even where it isn't, this might expose users to legal pressure from litigious publishers
  • the number of papers is too large to fit in a single datasource using our current technology (though in future versions, tens of millions of entries will be possible and fast)
  • the diversity of PDF sources makes it particularly hard to reverse engineer XML coherently

And as per #6, I note that we can't officially create or sanction something which might open us up to legal bullying.

@blahah
Member

blahah commented Jul 6, 2017

I will, however, leave this open for discussion: it is a unique resource and deserves to be discussed.

2 participants