DOI breaking and hyphenation #6

Open
bgvoisin opened this issue Apr 13, 2024 · 2 comments

@bgvoisin

Just now I encountered a situation where a DOI, entered with \doi, was cut across lines and a hyphen inserted at the break, turning

10.1016/j.ijheatfluidflow.2008.07.001

into

10.1016/j.ijheatfluid-
flow.2008.07.001

This is because the doi package uses \href from hyperref, which treats its second argument like normal text. Clearly adding a hyphen inside a URL is a bad idea.

The url package works differently: it does not add hyphens and instead defines possible break points. Heiko Oberdiek pointed out that the best of both worlds is obtained with \nolinkurl, which (if I understood correctly) formats the second argument of \href the way \url would.

The attached example (to be typeset with LuaLaTeX for the above hyphenation to occur) compares the outputs of

\doi{<doi>}
\href{https://doi.org/<doi>}{https://doi.org/<doi>}
\url{https://doi.org/<doi>}
\href{https://doi.org/<doi>}{\nolinkurl{https://doi.org/<doi>}}
\href{https://doi.org/<doi>}{\nolinkurl{doi:<doi>}}

The last of these gives the same output as \doi, without the unwanted hyphen. The only difference is the doi: prefix, which is outside the link with \doi and inside it with \href+\nolinkurl (I prefer the latter).
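A minimal sketch of such a comparison follows. It is not the contents of the attached file: the narrow minipage width is an arbitrary assumption chosen to make a line break likely, and whether a hyphen actually appears depends on the engine and the line width.

```latex
\documentclass{article}
\usepackage{hyperref}
\usepackage{doi}
\begin{document}
% Assumption: 4cm is arbitrary, chosen only to push the DOI onto two lines.
\begin{minipage}{4cm}
% Link text set as normal text by \href
\doi{10.1016/j.ijheatfluidflow.2008.07.001}

% Link text set by \nolinkurl, which uses URL-style break points
\href{https://doi.org/10.1016/j.ijheatfluidflow.2008.07.001}{\nolinkurl{doi:10.1016/j.ijheatfluidflow.2008.07.001}}
\end{minipage}
\end{document}
```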

The whole doi package was introduced to deal with DOIs containing nonsensical characters like

10.1175/1520-0469(1983)040<0396:SWALW>2.0.CO;2

At the time, \href couldn't deal with these characters. Now it seems it can, see the second example in the attached file.

Hence the two questions:

  • Is the doi package still necessary? Could its functionality be implemented simply with \href{https://doi.org/<doi>}{\nolinkurl{doi:<doi>}}?
  • Assuming it is still necessary, could the clever breakpoint handling brought by the url package be combined into it?
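The first option could be sketched as a one-line wrapper. The macro name \simpledoi is hypothetical and not part of the doi package, and the sketch assumes a DOI with no characters that are special to TeX or that need percent-encoding in the URL; it does nothing to handle those.

```latex
\documentclass{article}
\usepackage{hyperref}
% \simpledoi is a hypothetical name used for illustration only.
% The same DOI is used for the link target and for the printed text.
\newcommand*{\simpledoi}[1]{%
  \href{https://doi.org/#1}{\nolinkurl{doi:#1}}}
\begin{document}
\simpledoi{10.1016/j.ijheatfluidflow.2008.07.001}
\end{document}
```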

doi-hyphen-example.zip

@u-fischer

> At the time, \href couldn't deal with these characters. Now it seems it can, see the second example in the attached file.

Well, you get two different links in the PDF:

https://doi.org/10.1175/1520-0469\(1983\)040%3C0396:SWALW%3E2.0.CO;2
https://doi.org/10.1175/1520-0469\(1983\)040<0396:SWALW>2.0.CO;2

Acrobat Reader seems not to care and opens both, but Acrobat Pro refuses to follow the second. You can naturally use \href or \url to input the link, but neither tries to encode the URL, change it, or add a prefix or protocol, so whatever you give them must be a complete, correct URL. They only escape characters such as parentheses as needed inside a PDF.
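As an illustration of supplying a complete URL by hand, the sketch below writes the angle brackets in the link target as %3C and %3E (entered as \%3C and \%3E), matching the first link above, while the printed text keeps the literal characters. It assumes \href is used directly in the document body and has not been checked against every viewer.

```latex
\documentclass{article}
\usepackage{hyperref}
\begin{document}
% First argument: link target, with < and > percent-encoded by hand.
% Second argument: printed text, with the DOI as it should appear on the page.
\href{https://doi.org/10.1175/1520-0469(1983)040\%3C0396:SWALW\%3E2.0.CO;2}{\nolinkurl{doi:10.1175/1520-0469(1983)040<0396:SWALW>2.0.CO;2}}
\end{document}
```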

@bgvoisin
Author

> Acrobat Reader seems not to care and opens both, but Acrobat Pro refuses to follow the second.

I can confirm this on the Mac: TeXShop and Preview open the link fine, but Acrobat Pro and Skim don't (nor does a locally compiled mupdf-x11). I had always thought Acrobat Pro to be the laxest PDF viewer.

Redefining \@doi so that \toks0 wraps the link text in \nolinkurl, as follows, seems to achieve the desired result:

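\def\@doi#1{%
  \let\#\relax
  \let\_\relax
  \let\textless\relax
  \let\textgreater\relax
%  \edef\x{\toks0={{#1}}}% previous link text, replaced by the lines below
  % Link text: literal characters, wrapped in \nolinkurl
  \edef\#{\#}% untested: no DOI containing this character was available
  \edef\_{_}%
  \edef\textless{<}%
  \edef\textgreater{>}%
  \edef\x{\toks0={{\noexpand\nolinkurl{#1}}}}%
  \x
  % Link target: percent-encoded characters
  \edef\#{\@percentchar23}%
  \edef\_{_}%
  \edef\textless{\@percentchar3C}% instead of {\string<} for Apple
  \edef\textgreater{\@percentchar3E}% instead of {\string>} for Apple
  \edef\x{\toks2={\noexpand\href{\doiurl#1}}}%
  \x
  % Close the group, then emit \doitext, the \href with its target, and the link text
  \edef\x{\endgroup\doitext\the\toks2 \the\toks0}%
  \x
}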

See the attached file.

It's very possible the above code makes no sense. I can't pretend to actually understand the \doi macro; I just tried many things for hours, until suddenly one seemed to work.

I couldn't find an example of a DOI containing # to test.

doi-hyphen-example-v3.zip
