Understand and update bibtex/doi connection #3

daaronr · 2019-04-03T15:19:48Z

Notes from Slack:

Katja Abramova [May 28th, 2018 at 5:23 PM]
Bibtex update thread.
17 replies

Katja Abramova [10 months ago]
I’ve implemented several functions using doi and crossref APIs:

1. For all records in paper_mass, try to fetch a doi based on an exact title match. This misses any paper whose title column has unusual formatting, includes a subtitle, etc. I opted for such a restrictive version to increase the chance that we’re getting the right dois.

Katja Abramova [10 months ago]
2. Given a doi (for records filled in step 1), fetch a bibtex entry (this can later be used to fill in the citation columns like author, year etc.)

Katja Abramova [10 months ago]
3. Given a doi, fetch the number of citations (this goes into numeric num_citations column).
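The three steps above could be sketched roughly as follows. This is a minimal sketch, not the script that was actually run: it assumes the Crossref REST API (`/works` with `query.bibliographic`, and the `is-referenced-by-count` field) and bibtex via content negotiation on doi.org. The function names, and the lowercase/whitespace normalisation used for the "exact" match, are assumptions for illustration.

```python
import json
import re
import urllib.parse
import urllib.request

CROSSREF_WORKS = "https://api.crossref.org/works"


def normalize_title(title):
    """Lowercase and collapse whitespace (an assumed notion of 'exact' match)."""
    return re.sub(r"\s+", " ", title).strip().lower()


def pick_exact_match(title, items):
    """Return the first Crossref item whose title equals `title`, or None."""
    wanted = normalize_title(title)
    for item in items:
        # Crossref returns "title" as a list of strings.
        for candidate in item.get("title", []):
            if normalize_title(candidate) == wanted:
                return item
    return None


def search_crossref(title, rows=5):
    """Query Crossref for candidate works by title (network call)."""
    query = urllib.parse.urlencode({"query.bibliographic": title, "rows": rows})
    with urllib.request.urlopen(f"{CROSSREF_WORKS}?{query}", timeout=30) as resp:
        return json.load(resp)["message"]["items"]


def fetch_bibtex(doi):
    """Step 2: fetch a bibtex entry for a doi from doi.org (network call)."""
    req = urllib.request.Request(
        "https://doi.org/" + urllib.parse.quote(doi),
        headers={"Accept": "application/x-bibtex"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8")


def lookup(title, search=search_crossref):
    """Steps 1 and 3: return (doi, num_citations), or (None, None) if no exact match."""
    item = pick_exact_match(title, search(title))
    if item is None:
        return None, None
    return item["DOI"], item.get("is-referenced-by-count")
```

Passing `search` as a parameter keeps the matching logic testable without hitting the API; a title with an extra subtitle simply returns `(None, None)` and is left for manual entry.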

Katja Abramova [10 months ago]
Similar functions can be implemented for any particular new record - the easiest is to just have a doi, then the rest can be automatically filled in.

Katja Abramova [10 months ago]
(the script is currently running but you should soon see a more populated table)

Katja Abramova [10 months ago]
For the remaining records for which I couldn’t fetch the doi automatically, it would need to be filled in manually.

Katja Abramova [10 months ago]
I’m planning to also write a function to automatically fill in the parencite column given author-year information.
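A function like that might look like the sketch below. It assumes the parencite column holds a plain "(Author, Year)" string and that authors arrive as a list of family names; both the format and the name `make_parencite` are assumptions, not something confirmed in this thread.

```python
def make_parencite(authors, year):
    """Build a "(Author, Year)" string from a list of family names and a year."""
    if not authors:
        return ""
    if len(authors) == 1:
        names = authors[0]
    elif len(authors) == 2:
        names = f"{authors[0]} & {authors[1]}"
    else:
        # Three or more authors: abbreviate to the first author.
        names = f"{authors[0]} et al."
    return f"({names}, {year})"
```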

daaronr [10 months ago]
that is really good!

Katja Abramova [10 months ago]
I’ve updated the paper_mass table with all the dois I could find (the remaining articles seem not to have one at all), bibtex, citation info (author, year etc.), number of citations and parencite.

Katja Abramova [10 months ago]
There are 167 records remaining that didn’t have a doi, so to get bibtex I’d need to interface with e.g. google scholar. However, I’ve noticed that, for instance, the number of citations is not the same when checking crossref vs. scholar. I’m not sure if it’s ok to mix the two in one table.

daaronr [10 months ago]
we should either use crossref only, or be clear about which one we are referring to. It’s fairly well known that Google overstates citations

Katja Abramova [10 months ago]
yeah, so we can use scholar only for bibtex, not citations

daaronr [10 months ago]
it’s not without any value for citations … it’s better than nothing, but it should be flagged distinctly

daaronr [10 months ago]
i.e., if crossref isn’t available, we can use google scholar but we put an asterisk

Katja Abramova [10 months ago]
ok
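The rule agreed above (crossref first, scholar only as a flagged fallback) could be sketched as below. Since num_citations is a numeric column, this sketch keeps the source in a separate field and only adds the asterisk when displaying; the names `CitationCount`, `choose_citations` and `display` are hypothetical.

```python
from typing import NamedTuple, Optional


class CitationCount(NamedTuple):
    count: Optional[int]
    source: Optional[str]  # "crossref", "scholar", or None


def choose_citations(crossref=None, scholar=None):
    """Prefer the crossref count; fall back to scholar only if crossref is missing."""
    if crossref is not None:
        return CitationCount(crossref, "crossref")
    if scholar is not None:
        return CitationCount(scholar, "scholar")
    return CitationCount(None, None)


def display(c):
    """Render the count, marking scholar-sourced numbers with an asterisk."""
    if c.count is None:
        return ""
    return f"{c.count}*" if c.source == "scholar" else str(c.count)
```

Checking `is not None` rather than truthiness means a genuine crossref count of 0 is kept instead of being replaced by a scholar number.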

Katja Abramova [10 months ago]
So I’ve tried to do things with google scholar and it sucks. They don’t provide an official API and it’s not clear what the rate limits are for accessing it via http requests (I’m using a package somebody made). Unfortunately, whatever I managed to produce with this had a lot of erroneous data, and I also got locked out of scholar for several days even though I was very careful in querying it. So I’d say that until this changes, we should rely on crossref and only use scholar for manual lookup: whenever somebody wants to add a paper that doesn’t have a doi, just copy-paste a bibtex entry from scholar.
