BibTeX ABS export: trailing <P /> #172

golnazads · 2020-06-16T15:50:02Z

Alberto replied:
@Carolyn @golnaz sorry, I neglected to let you know of this possible markup. Please translate <P /> to blank lines and <BR /> to a newline when outputting in a non-XML format. I think this means that for bibtex it would be:
 => \\


golnazads · 2020-06-16T21:53:44Z

@aaccomazzi
This is implemented for BibTeX ABS. Do I need to remove these tags for, for example, the custom format with unicode encoding? I am guessing it is a yes for latex encoding. If it is a yes for unicode, then I guess I need to fix that for the XML and fielded formats too, right? Thank you.

aaccomazzi · 2020-06-17T00:02:28Z

This is the situation with respect to encoding in our json fields (see e.g. 2016ApJ...818L..26F)

abstract and title text have the basic HTML entities encoded (these are &lt;, &gt;, and &amp;)
they may also have some markup in the form of  etc.

When creating custom output, we recognize and support three basic encodings:

HTML: In this case the entities and markup are kept as they are, so &lt; remains &lt;
Latex: in this case the entities and markup are translated according to html -> latex syntax
Unicode: In this case the entities are turned into their unicode equivalent, in this case it's just the three characters above which become <, >, &. The issue of markup for unicode encoding has never been formally defined in our documentation and I had to go check the code of classic to figure out what we are doing here. Turns out classic simply strips the markup:  -> (empty string)

I feel that the unicode handling of markup done by classic is wrong, because we provide a separate formatting option to control the treatment of markup (%ZMarkup:{keep|strip}), as documented here: http://adsabs.github.io/help/actions/export
So I'm in favor of passing markup through as it is, and letting users customize the output via formatting options.
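
A minimal sketch of the three encodings described above, applied to the basic entities only. The function name `encode_entities`, the encoding labels, and the LaTeX replacements are assumptions made for illustration; they are not taken from export_service or from classic.

```python
import html

# Hypothetical LaTeX replacements for the three basic entities
# (an assumption for this sketch, not the actual html -> latex table).
LATEX_ENTITIES = {"&lt;": "$<$", "&gt;": "$>$", "&amp;": r"\&"}


def encode_entities(text, encoding):
    """Handle the basic HTML entities according to the requested encoding."""
    if encoding == "html":
        # Entities and markup are kept exactly as stored.
        return text
    if encoding == "unicode":
        # html.unescape decodes every named or numeric entity,
        # which includes the three basic ones.
        return html.unescape(text)
    if encoding == "latex":
        for entity, replacement in LATEX_ENTITIES.items():
            text = text.replace(entity, replacement)
        return text
    raise ValueError("unknown encoding: %s" % encoding)


sample = "a &lt; b &amp; c"
print(encode_entities(sample, "html"))
print(encode_entities(sample, "unicode"))
print(encode_entities(sample, "latex"))
```

Markup is deliberately left alone here, in line with the proposal to pass it through and let the %ZMarkup option decide.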

golnazads · 2020-06-17T18:14:21Z

just for your information export has the option of markup keep|strip https://github.com/adsabs/export_service/blob/master/exportsrv/formatter/customFormat.py#L702. I can remove it if you want @aaccomazzi .

aaccomazzi · 2020-06-17T21:34:13Z

We should keep the markup option; this way users can control what they do or don't get.
So I think the adjustments to make for unicode encoding are:

<P /> => \n\n (new paragraph)
<BR /> => \n (newline)
&amp;, &gt;, &lt; => &, >, <
All other markup: controlled by %ZMarkup settings
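
The adjustments listed above could be sketched roughly as follows. The function `to_unicode`, its `markup` argument (standing in for %ZMarkup keep|strip), and the `<SUB>` tags in the sample are hypothetical; the actual handling in customFormat.py may differ.

```python
import re


def to_unicode(text, markup="keep"):
    """Apply the proposed unicode-encoding adjustments to abstract/title text."""
    # <P /> becomes a blank line (new paragraph), <BR /> a single newline.
    text = re.sub(r"<P\s*/?>", "\n\n", text, flags=re.IGNORECASE)
    text = re.sub(r"<BR\s*/?>", "\n", text, flags=re.IGNORECASE)
    # All other markup is governed by the keep|strip setting.
    if markup == "strip":
        text = re.sub(r"</?[A-Za-z][^>]*>", "", text)
    # Decode entities after tag handling so a decoded "<" is never mistaken
    # for markup; "&amp;" goes last to avoid decoding "&amp;lt;" twice.
    for entity, char in (("&lt;", "<"), ("&gt;", ">"), ("&amp;", "&")):
        text = text.replace(entity, char)
    return text


sample = "x &lt; y<P />H<SUB>2</SUB>O<BR />end"
print(repr(to_unicode(sample)))
print(repr(to_unicode(sample, markup="strip")))
```

Note that a trailing <P /> would leave trailing blank lines in the output under this sketch; whether to trim them is a separate decision.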

aaccomazzi mentioned this issue Jun 17, 2020

replace html entities <, >, and '&' #173

Open
