Understand and update bibtex/doi connection #3

Open
daaronr opened this issue Apr 3, 2019 · 0 comments


daaronr commented Apr 3, 2019

Notes from Slack:

Katja Abramova [May 28th, 2018 at 5:23 PM]
Bibtex update thread.
17 replies

Katja Abramova [10 months ago]
I’ve implemented several functions using doi and crossref APIs:

  1. For all records in paper_mass, try to fetch a doi based on an exact title match. This misses any paper whose title column has weird formatting or includes a subtitle etc. I opted for such a restrictive version to increase the chance that we’re getting the right dois.

Katja Abramova [10 months ago]
2. Given a doi (for records filled in step 1), fetch a bibtex entry (this can later be used to fill in the citation columns like author, year etc.)

Katja Abramova [10 months ago]
3. Given a doi, fetch the number of citations (this goes into numeric num_citations column).
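
A minimal sketch of the three steps above, not the script used in the thread. It assumes the public Crossref REST API (`api.crossref.org/works`, with the `query.bibliographic` parameter and the `is-referenced-by-count` field) and DOI content negotiation with `Accept: application/x-bibtex`. The function names are hypothetical, and the title normalization (lowercasing, collapsing whitespace) is an assumption about what "exact match" should tolerate.

```python
import json
import re
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works"


def normalize_title(title):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return re.sub(r"\s+", " ", title).strip().lower()


def pick_exact_match(title, items):
    """Return the DOI of the first Crossref item whose title matches exactly, else None."""
    wanted = normalize_title(title)
    for item in items:
        for candidate in item.get("title", []):
            if normalize_title(candidate) == wanted:
                return item.get("DOI")
    return None


def _get(url, accept="application/json"):
    """Fetch a URL and return the decoded body (network access required)."""
    request = urllib.request.Request(url, headers={"Accept": accept})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


def fetch_doi(title):
    """Step 1: search Crossref by title and keep only an exact title match."""
    query = urllib.parse.urlencode({"query.bibliographic": title, "rows": 5})
    items = json.loads(_get(f"{CROSSREF_WORKS}?{query}"))["message"]["items"]
    return pick_exact_match(title, items)


def fetch_bibtex(doi):
    """Step 2: ask the DOI resolver for a BibTeX rendering of the record."""
    url = "https://doi.org/" + urllib.parse.quote(doi)
    return _get(url, accept="application/x-bibtex")


def fetch_num_citations(doi):
    """Step 3: read Crossref's citation count for the DOI."""
    record = json.loads(_get(f"{CROSSREF_WORKS}/{urllib.parse.quote(doi)}"))
    return record["message"]["is-referenced-by-count"]
```

Keeping the matching logic in `pick_exact_match`, separate from the network calls, makes it possible to check the restrictive matching rule against saved responses without hitting the API.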

Katja Abramova [10 months ago]
Similar functions can be implemented for any particular new record - the easiest is to just have a doi, then the rest can be automatically filled in.

Katja Abramova [10 months ago]
(the script is currently running but you should soon see a more populated table)

Katja Abramova [10 months ago]
For the remaining records for which I couldn’t fetch the doi automatically, it would need to be filled in manually.

Katja Abramova [10 months ago]
I’m planning to also write a function to automatically fill in the parencite column given author-year information.
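
One way such a function could look, as a sketch only. The thread does not say what format the parencite column uses, so the "(Author, year)" shape, the "et al." rule for three or more authors, and the name `make_parencite` are all assumptions.

```python
def make_parencite(authors, year):
    """Build a parenthetical author-year string from family names and a year.

    Assumed format (not specified in the thread):
    one author -> "(Smith, 2010)", two -> "(Smith and Jones, 2010)",
    three or more -> "(Smith et al., 2010)".
    """
    if not authors:
        raise ValueError("at least one author is required")
    if len(authors) == 1:
        names = authors[0]
    elif len(authors) == 2:
        names = f"{authors[0]} and {authors[1]}"
    else:
        names = f"{authors[0]} et al."
    return f"({names}, {year})"
```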

daaronr [10 months ago]
that is really good!

Katja Abramova [10 months ago]
I’ve updated the paper_mass table with all the dois I could find (the remaining articles seem not to have one at all), bibtex, citation info (author, year etc.), number of citations and parencite.

Katja Abramova [10 months ago]
There are 167 records remaining that don’t have a doi, so to get bibtex for them I’d need to interface with e.g. google scholar. However, I’ve noticed that, for instance, the number of citations is not the same when checking crossref vs scholar. I’m not sure if it’s ok to mix the two in one table.

daaronr [10 months ago]
we should either use crossref only, or be clear about which one we are referring to. It’s fairly well known that Google overstates citations

Katja Abramova [10 months ago]
yeah, so we can use scholar only for bibtex, not citations

daaronr [10 months ago]
it’s not without any value for citations … it’s better than nothing, but it should be flagged distinctly

daaronr [10 months ago]
i.e., if crossref isn’t available, we can use google scholar but we put an asterisk
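
The fallback rule could be sketched as below. Since num_citations is described as a numeric column, this sketch keeps the count numeric and stores the source separately, adding the asterisk only for display. The function names and the separate source value are assumptions, not something agreed in the thread.

```python
def choose_citations(crossref_count=None, scholar_count=None):
    """Prefer the Crossref count; fall back to Google Scholar and record the source."""
    if crossref_count is not None:
        return crossref_count, "crossref"
    if scholar_count is not None:
        return scholar_count, "scholar"
    return None, None


def display_citations(count, source):
    """Render the count for display, marking Google Scholar figures with an asterisk."""
    if count is None:
        return ""
    return f"{count}*" if source == "scholar" else str(count)
```

Note that a Crossref count of 0 still wins over a Scholar count, because only a missing (None) Crossref value triggers the fallback.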

Katja Abramova [10 months ago]
ok

Katja Abramova [10 months ago]
So I’ve tried to do things with google scholar and it sucks. They don’t provide an official API and it’s not clear what the rate limits are for accessing it via http requests (I’m using a package somebody made). Unfortunately, whatever I managed to produce with this contained a lot of erroneous data, and I also got locked out of scholar for several days even though I was very careful in querying it. So, I’d say until this changes, we should rely on crossref and only use scholar for manual lookup: whenever somebody wants to add a paper that doesn’t have a doi, just copy-paste a bibtex entry from scholar.
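
For the manual route, a pasted entry could be split into fields roughly like this. It is a sketch that assumes simple one-line `field={value}` pairs; it is not a full BibTeX parser, and `parse_pasted_bibtex` is a hypothetical name.

```python
import re

# Matches one-line fields such as: title={An example title},
FIELD = re.compile(r'^\s*(\w+)\s*=\s*[{"](.*?)[}"]\s*,?\s*$', re.MULTILINE)
# Matches the entry header such as: @article{doe2010example,
HEADER = re.compile(r"\s*@(\w+)\s*\{\s*([^,\s]+)")


def parse_pasted_bibtex(entry):
    """Return a dict of lowercased field names, plus the entry type and citation key."""
    fields = {name.lower(): value for name, value in FIELD.findall(entry)}
    header = HEADER.match(entry)
    if header:
        fields["entry_type"] = header.group(1).lower()
        fields["key"] = header.group(2)
    return fields
```

Multi-line values and deeply nested braces are not handled; for those, a dedicated BibTeX library would be the safer choice.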
