Arabic: Kashidas displaced by ligatures #258

Open
lueck opened this issue Aug 21, 2023 · 0 comments
lueck commented Aug 21, 2023

There is a whole class of errors in kashida justification that results from ligatures. When two letters that allow a kashida between them form a ligature, the kashida is deferred (displaced) to after the ligature, i.e. after the second (last) letter it consists of. Kashidas can therefore occur where they must not. Roughly 10 out of 100 words in my texts are affected by such errors.

I suggest adding an option to the babel package to turn off kashida insertion after ligatures.

Here is an MWE in which you can see kashidas after ALEFs, at the end of words, and in other places where they should not occur.

\documentclass{book}

\usepackage{luabidi}
\setRTLmain

\usepackage[english,bidi=basic]{babel}[2021/05/16]% version 3.59 or later
% see babel's change log: https://latex3.github.io/babel/#whats-new

\babelprovide[import,main,%
justification=kashida,%
transforms=kashida.plain%
]{arabic}

\babelfont{rm}[Scale=3]{ArabicTypesetting} % {FreeSerif}
% font source: https://arabicfonts.net/fonts/arabic-typesetting-regular


% output a test case with \case{NUMBER}{WORD}{EXPECTATION}
\newcommand*{\case}[3]{%
  \noindent #1 %
  \directlua{Babel.arabic.justify_enabled=false}%
  #2 %
  -- #3 %
  \directlua{Babel.arabic.justify_enabled=true}%
  \hfill%
  \fbox{\makebox[5em][s]{#2}}%
}

%% override default rule from kashida.plain
\babelprehyphenation{arabic}{()ل()[]*[اأإآ]}{kashida = 500}

\begin{document}

\case{1}{لا}{لا}%

\case{2}{بِأَبي}{بِـأَبي}%

\case{3}{بِيَ}{بِيَ}%

\case{4}{فكانَ}{فـكانَ}%

\case{5}{باخِلٌ}{باخِـلٌ}%

\case{6}{له}{له}%


%\case{e}{ل\/ه}{لـه}%

\end{document}

TEX engine: LuaHBTeX, Version 1.17.0 (TeX Live 2023)

babel version: 2023/08/09 v3.92.22182 The Babel package (from github)

Test case 1 is the sequence LAM, ALEF, for which there is a ligature in (almost) every Arabic font. The MWE overrides a rule from the kashida.plain transformation that excludes kashidas between LAM and ALEF.

[Image: kashida-ligature AT]

Above is the result with the font ArabicTypesetting (see the link in the comment in the MWE), which provides many ligatures. There is an error in each of the cases 1 to 6.

[Image: kashida-ligature FS]

Above is the result of the same MWE with the font FreeSerif, which provides only some standard ligatures such as LAM+ALEF and thus produces fewer errors (in fact, only case 1).

As you can infer from the comparison, the false kashidas result from the ligatures.

With an option for turning kashida insertion after ligatures on or off, we would gain the following:

  1. More sensible and transparent justification rules. IMO it feels like an odd workaround to need a rule that forbids kashidas between e.g. LAM and ALEF just to turn off kashidas after the LAM+ALEF ligature.

  2. No need to clutter the set of justification/hyphenation rules with font-specific rules that take care of the ligatures actually present in the font.

  3. Conformance to LuaTeX's idea about hyphenation: "whether or not hyphenation takes place should not depend on the current font, it is a language property" (LuaTeX Reference Manual, sec. 5.5, p. 76).

  4. Since the option can be turned on or off, nothing is lost for those who want a fine-grained rule set and need it when they already type ligatures into the TeX input file.
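
To make the suspected mechanism and the proposed switch concrete, here is a toy model in Python. It is only a sketch and not babel's code: the names `Glyph`, `mark_kashida_points`, `apply_ligatures`, `insert_kashidas` and the flag `after_ligatures` are hypothetical, and the model assumes that kashida weights are attached to glyph positions first, that ligature substitution then merges glyphs while keeping the weight of the first one, and that kashidas are inserted last.

```python
from dataclasses import dataclass

# Code points are written as escapes to avoid bidi confusion in the source.
BEH, YEH = "\u0628", "\u064A"
LAM, ALEF = "\u0644", "\u0627"
KASHIDA = "\u0640"  # ARABIC TATWEEL


@dataclass
class Glyph:
    chars: str                 # input characters this glyph stands for
    kashida_weight: int = 0    # > 0 means a kashida may follow this glyph
    is_ligature: bool = False  # True if the glyph replaced several letters


def mark_kashida_points(word, allowed_pairs, weight=500):
    """Give a weight to the first letter of every pair that allows a kashida."""
    glyphs = [Glyph(c) for c in word]
    for i in range(len(glyphs) - 1):
        if (glyphs[i].chars, glyphs[i + 1].chars) in allowed_pairs:
            glyphs[i].kashida_weight = weight
    return glyphs


def apply_ligatures(glyphs, ligatures):
    """Merge adjacent glyph pairs listed in `ligatures`.

    Assumption of this model: the merged glyph inherits the weight
    of the first glyph it replaces.
    """
    out, i = [], 0
    while i < len(glyphs):
        pair_follows = i + 1 < len(glyphs)
        if pair_follows and (glyphs[i].chars, glyphs[i + 1].chars) in ligatures:
            merged = glyphs[i].chars + glyphs[i + 1].chars
            out.append(Glyph(merged, glyphs[i].kashida_weight, True))
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def insert_kashidas(glyphs, after_ligatures=True):
    """Insert one kashida after every weighted glyph.

    `after_ligatures=False` models the proposed (hypothetical) option:
    weighted glyphs that are ligatures get no kashida.
    """
    parts = []
    for g in glyphs:
        parts.append(g.chars)
        if g.kashida_weight > 0 and (after_ligatures or not g.is_ligature):
            parts.append(KASHIDA)
    return "".join(parts)


if __name__ == "__main__":
    marked = mark_kashida_points(BEH + YEH, {(BEH, YEH)})
    shaped = apply_ligatures(marked, {(BEH, YEH)})
    for label, text in [
        ("no ligature", insert_kashidas(marked)),
        ("ligature, current behaviour", insert_kashidas(shaped)),
        ("ligature, option off", insert_kashidas(shaped, after_ligatures=False)),
    ]:
        print(label, [f"U+{ord(c):04X}" for c in text])
```

In this model, a font without a BEH+YEH ligature puts the kashida between the two letters, a font with the ligature pushes it after YEH, and switching `after_ligatures` off suppresses it without any font-specific rule.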

lueck added a commit to lueck/babel that referenced this issue Aug 21, 2023
…x3#258

Replacing simple characters with ligatures seems to be done before
inserting kashidas. This may result in incorrect kashida insertion.

Suppose we have e.g. the letters BEH and YEH, between which a kashida
can be inserted, but the font also has a BEH+YEH ligature. The glyph at
BEH's position gets a kashida weight, but it has been replaced with the
BEH+YEH ligature. As a result, kashidas are inserted after the
ligature, i.e. after YEH, which is not what we want.

Test case: {بِيَ} with Arabic Typesetting