[citeproc] CSL JSON in-field markup not recognized when using pandoc-zotxt.lua #6722
This sounds like an issue with pandoc-zotxt.lua (about which I know nothing).
From https://github.com/odkr/pandoc-zotxt.lua: “pandoc-zotxt.lua looks up sources of citations in Zotero and adds them either to a document's references metadata field or to a bibliography file, where pandoc-citeproc can pick them up.” Maybe the output from the following might give a clue (again, requires Zotero plus add-ons installed and running, as above):
Output from pandoc 2.10:
Output from pandoc 2.11:
Note that the title that appears ( Question is: When formatting this further, why would pandoc 2.10 parse the in-field markup correctly while 2.11 does not?
Looping in @odkr as pandoc-zotxt.lua maintainer, since I don't know the details of how pandoc-zotxt.lua is doing this, and how the conversion worked before. It may be that pandoc-zotxt.lua is going to need to check pandoc version and behave differently for 2.11+. For pandoc 2.11, you can just use pandoc.read(string, "csljson") to convert a CSL JSON bibliography to a Pandoc, and then just grab the
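A minimal sketch of the suggestion above, assuming pandoc 2.11 or later and code that runs inside a pandoc Lua filter. The helper names supports_csljson and references_from_csljson are hypothetical (they come from neither pandoc nor pandoc-zotxt.lua), and reading the parsed entries from the references metadata field is an assumption about where the csljson reader puts them.

```lua
-- Sketch only: both helper names below are made up for illustration.

-- Returns true if `version` (a numerically indexed list such as
-- PANDOC_VERSION) is at least 2.11, the release discussed in this thread.
local function supports_csljson (version)
    local major, minor = version[1] or 0, version[2] or 0
    return major > 2 or (major == 2 and minor >= 11)
end

-- Parses a CSL JSON string with pandoc's own reader and returns the
-- parsed entries. Assumption: the reader stores them in the `references`
-- metadata field. Must be called inside pandoc's Lua interpreter, where
-- the global `pandoc` is defined.
local function references_from_csljson (json)
    local doc = pandoc.read(json, 'csljson')
    return doc.meta.references
end
```

Inside a filter, one might call references_from_csljson only when supports_csljson(PANDOC_VERSION) is true and fall back to the older code path otherwise.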
I do hope @odkr will weigh in. Still, I might be wrong, but in my mind this seems to be a pandoc, not a The two So something relevant does seem to have changed between pandoc 2.10 and 2.11. But what?
Prima facie, I agree with @njbart.
The key question is: how does it parse them? I've told you how you could parse them reliably using pandoc 2.11+. How were you doing it before? If you can tell me that, maybe I can explain why it stopped working.
It takes the CSL JSON from zotxt and passes it on to lunajson's decode function. The relevant blocks of code are:
function get_source (citekey)
    assert(type(citekey) == 'string', 'given citekey is not a string')
    assert(citekey ~= '', 'given citekey is the empty string')
    local _, reply
    for i = 1, #keytypes do
        local query_url = concat({ZOTXT_QUERY_BASE_URL, keytypes[i], '=', citekey})
        -- This is the HTTP get request to zotxt.
        reply = select(2, fetch(query_url, '.'))
        -- This is the call to lunajson's decode function.
        local ok, data = pcall(decode, reply)
        if ok then
            insert(keytypes, 1, remove(keytypes, i))
            -- This is the number conversion (see below).
            local source = convert_numbers_to_strings(data[1])
            source.id = citekey
            return source
        end
    end
    return nil, reply
end
and
function convert_numbers_to_strings (data, depth)
    if not depth then depth = 1 end
    assert(depth < 512, 'too many recursions')
    local data_type = type(data)
    if data_type == 'table' then
        local s = {}
        for k, v in pairs(data) do
            s[k] = convert_numbers_to_strings(v, depth + 1)
        end
        return s
    elseif data_type == 'number' then
        return tostring(floor(data))
    else
        return data
    end
end
These entries are then added to the I’ll have a look at it and try a call to I take it that
If you're just using a standard JSON parser, you'll get plain strings for the values of the variables (
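To illustrate the point about plain strings, here is a small sketch in plain Lua. It runs the convert_numbers_to_strings function quoted in this thread over a hand-written table that stands in for decoded CSL JSON, so neither lunajson nor Zotero is needed. The sample table decoded is made up for illustration.

```lua
local floor = math.floor

-- Same logic as the function quoted in this thread, made local here.
local function convert_numbers_to_strings (data, depth)
    if not depth then depth = 1 end
    assert(depth < 512, 'too many recursions')
    local data_type = type(data)
    if data_type == 'table' then
        local s = {}
        for k, v in pairs(data) do
            s[k] = convert_numbers_to_strings(v, depth + 1)
        end
        return s
    elseif data_type == 'number' then
        return tostring(floor(data))
    else
        return data
    end
end

-- Hand-written stand-in for what a generic JSON decoder would return.
local decoded = {
    id = 'example',
    type = 'article-journal',
    title = '<i>Woman</i> as a Politically Significant Term',
    issued = {['date-parts'] = {{2016, 2, 1}}}
}

local source = convert_numbers_to_strings(decoded)

-- The in-field markup is still literal text inside an ordinary string.
print(type(source.title), source.title)
-- Numbers have been turned into strings.
print(type(source.issued['date-parts'][1][1]))
```

Nothing in this path interprets the <i> tags, so whatever consumes the table afterwards sees them as ordinary characters unless it parses them itself.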
Note also that the relevant change in pandoc 2.11 is that Like pandoc-citeproc, this library does indeed process CSL JSON differently from
Clarification: if you're directly adding things to metadata (e.g.
Hmm, could it be that
Oh, and while I’m at it. Thanks, @jgm, for taking the time to clear that up!
I'm happy to help further if you have questions while revising
Thanks! That’d be great.
@jgm, I’ve encountered a bug when I updated my test suite:
pandoc --citeproc -t plain <<EOF
---
references:
- {"id":"díaz-león:2013what","type":"article-journal","title":"What Is Social Construction?","container-title":"European Journal of Philosophy","page":"1-16","URL":"http://onlinelibrary.wiley.com/doi/10.1111/ejop.12033/abstract","DOI":"10.1111/ejop.12033","ISSN":"1468-0378","language":"en","author":[{"family":"Díaz-León","given":"Esa"}],"issued":{"date-parts":[[2013,5,1]]},"accessed":{"date-parts":[[2015,11,14]]},"container-title-short":"Eur J Philos"}
- {"id":"díaz-león:2015defence","type":"article-journal","title":"In Defence of Historical Constructivism about Races","container-title":"Ergo, an Open Access Journal of Philosophy","volume":"2","URL":"http://hdl.handle.net/2027/spo.12405314.0002.021","DOI":"10.3998/ergo.12405314.0002.021","ISSN":"2330-4014","author":[{"family":"Díaz-León","given":"Esa"}],"issued":{"date-parts":[[2015]]}}
- {"id":"díaz-león:2016woman","type":"article-journal","title":"<i>Woman</i> as a Politically Significant Term: A Solution to the Puzzle","container-title":"Hypatia","page":"245-258","URL":"http://onlinelibrary.wiley.com.uaccess.univie.ac.at/doi/10.1111/hypa.12234/abstract","DOI":"10.1111/hypa.12234","ISSN":"1527-2001","title-short":"<i>Woman</i> as a Politically Significant Term","language":"en","author":[{"family":"Díaz-León","given":"Esa"}],"issued":{"date-parts":[[2016,2,1]]},"accessed":{"date-parts":[[2016,2,18]]},"container-title-short":"Hypatia"}
---
p [cf. @díaz-león:2013what; @díaz-león:2015defence; @díaz-león:2016woman], \
yet p [@díaz-león:2013what; @díaz-león:2015defence; @díaz-león:2016woman]!
EOF
prints:
That is, Pandoc fails to group (some) references together, but only when they are prefixed. Shall I open a new issue for that?
Yes, why don't you close this, and perhaps we can open up a new issue for the grouping issue.
should NOT become
In your particular case, though, it would be harmless to group, because the position of the prefix isn't affected by sorting. I guess it would be helpful to compare citeproc-js's output.
@odkr, v0.3.18b behaves very well so far. Many thanks.
@njbart, great! Could you close the issue then? (If it turns out that v0.3.18b is buggy after all, you can open an issue at its repo).
@jgm, if it’s intended then I think it’s fine as it is. What struck me as odd was the repetition of the author name. If there’s an easy way to fix this, that’d be good, IMHO. But if there isn’t, it’s fine, again IMHO, for that to be left to users to fix manually.
I think it would be good to open a new issue, though I'm still not sure what to think about this case. Here's what citeproc.js yields:
If the prefix is moved to 2016, citeproc.js yields:
I guess the repetition of the author's name would better be avoided, here, and it would be good to have an issue to track this.
Previously if a citation item had a prefix, it would not be grouped with following citations. See jgm/pandoc#6722 for discussion.
OK, I've pushed a fix to citeproc; now we get
This differs from the citeproc.js output, but I think our output is right -- the cite-group-delimiter should be used here (and it defaults to comma). For relevant discussion see citation-style-language/test-suite#39.
Oh, thanks a lot! I was about to open that issue, but that’s no longer needed, or is it?
It's no longer needed. |
The latest master, self-compiled, identifying as 2.11, does not seem to recognize CSL JSON in-field markup (e.g., <i>...</i>) if biblio data are obtained from Zotero using the pandoc-zotxt.lua filter in its default mode, i.e., not having the filter write an intermediary/cache CSL JSON file to disk.
Note that for using pandoc-zotxt.lua, the Zotero add-ons zotxt and, IIRC, Better BibTeX (the latter for useable citation keys) need to be installed.
By contrast, after adding a line zotero-bibliography: test.json to the markdown document’s metadata (which leads to an intermediary/cache CSL JSON file being written to disk), the in-field markup is recognized, just as it used to be when using pandoc 2.10.1, with or without the intermediary .json file.
My hunch is that the most recent pandoc might not be properly identifying the incoming data from the pandoc-zotxt.lua filter as CSL JSON and hence might not be trying to parse the HTML-like in-field markup.
MWE; prerequisite: Zotero plus add-ons listed above installed and running, a Zotero item containing <i>...</i> in its title field:
Actual output:
Expected output = output from pandoc 2.11 with the zotero-bibliography line uncommented = output from pandoc 2.10.1, regardless of the zotero-bibliography line: