question about the hyphenator #21

mtrevisan · 2018-07-03T13:09:55Z

Issue type:

Others, questions

I would like to ask whether there is any documentation on the hyphenation algorithm, particularly on the NEXTLEVEL keyword.
I don't understand how it is used. Does it divide the patterns into two groups, where the first is used to hyphenate non-compound words and the second to hyphenate compound words? How can I know whether a word is a compound? I've read https://github.com/hunspell/hyphen/blob/a7255913300734655691fc3e8ce20041d611fbdb/README.compound but I don't quite understand what is going on.

When it is written "Hyphen, apostrophe and other characters may be word boundary characters, but they don't need (extra) hyphenation. [...] Without explicite NEXTLEVEL declaration, Hyphen 2.8 uses the previous settings, plus in UTF-8 encoding, endash (U+2013) and typographical apostrophe (U+2019) are NOHYPHEN characters, too.", does that mean that the hyphen and the apostrophe (and additionally the en dash and the typographical apostrophe, if no NEXTLEVEL keyword is present) define a break point by default, without checking any patterns?

When it is written

"ISO8859-1
NOHYPHEN -,'
1-1
1'1
NEXTLEVEL

Description:
1-1 and 1'1 declare hyphen and apostrophe as word boundary characters
and NOHYPHEN with the comma separated character (or character sequence)
list forbid the (extra) hyphens at the hyphen and apostrophe characters."

What is the meaning of "(extra)"? If I don't include the NOHYPHEN -,' part, will there be an extra hyphen?

When it is written

"The algorithm is recursive: every word parts of a successful
first (compound) level hyphenation will be rehyphenated
by the same (first) pattern set.

Finally, when first level hyphenation is not possible, Hyphen uses
the second level hyphenation for the word or the word parts."

Does that mean that, if the NEXTLEVEL option is present, the algorithm scans the first set twice and, for the "sub-words" that were not split again the second time, the second set is used? Do I understand correctly?
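
To make this reading concrete, here is a minimal Python sketch of the two quoted paragraphs as I interpret them. It is not the Hyphen library's code: `hyphenate`, `split_at`, `toy_first_level`, `toy_second_level` and the `TOY_BREAKS` table are hypothetical names invented for the illustration, and real pattern matching is replaced by toy break finders.

```python
def split_at(word, breaks):
    """Cut word at the given character offsets and return the pieces."""
    parts, start = [], 0
    for pos in sorted(breaks):
        parts.append(word[start:pos])
        start = pos
    parts.append(word[start:])
    return parts


def hyphenate(word, first_level, second_level):
    """One reading of the quoted text: try the first (compound) level;
    if it splits the word, rehyphenate every part with the same first
    level; otherwise fall back to the second level for that part."""
    breaks = first_level(word)
    if breaks:
        result = []
        for part in split_at(word, breaks):
            result.extend(hyphenate(part, first_level, second_level))
        return result
    return split_at(word, second_level(word))


def toy_first_level(word):
    # Assumption: stand-in for the patterns before NEXTLEVEL.
    # Breaks only after a '-' that is not the last character.
    return [i + 1 for i, ch in enumerate(word)
            if ch == "-" and i + 1 < len(word)]


# Assumption: stand-in for the patterns after NEXTLEVEL.
TOY_BREAKS = {"checker": [5]}


def toy_second_level(word):
    return TOY_BREAKS.get(word, [])


print(hyphenate("spell-checker", toy_first_level, toy_second_level))
```

In this sketch the first set is not applied a fixed two times: it is reapplied to each part until it finds no further break, and only then is the second set tried on that part. Whether the library behaves exactly like this is what I am asking.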

Thank you

dimztimz · 2018-07-03T13:17:30Z

You opened the issue in the wrong place. This repository is for spell checking. It seems you have already opened an issue in the right one: hunspell/hyphen#16.
Your best bet is to read the source code and make some sense of it.

mtrevisan · 2018-07-03T13:20:14Z

I was afraid of that answer... Thank you anyway!

dimztimz closed this as completed Jul 3, 2018

dimztimz added the "invalid" and "question" labels Feb 7, 2020
