The esperanto_stemmer is an Elasticsearch filter that provides stemming for the Esperanto language.
- Folding: If you use generic folding (ICU folding conveniently handles both combining and precomposed diacritics), be sure not to fold Ĉ/ĉ, Ĝ/ĝ, Ĥ/ĥ, Ĵ/ĵ, Ŝ/ŝ, and Ŭ/ŭ, which should be kept distinct from C/c, G/g, H/h, J/j, S/s, and U/u. See more about Esperanto orthography on Wikipedia.
- Transliterations: The stemmer does not support H-system or X-system transliterations. This affects stemming exceptions and number recognition.
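
The folding advice above can be illustrated with a small Python sketch. It is not ICU folding and not part of this plugin: it is a stand-in that uses only the standard `unicodedata` module to strip diacritics while leaving the six Esperanto letters intact. The function name `fold_preserving_esperanto` and the constant `ESPERANTO_LETTERS` are hypothetical names chosen for illustration, and unlike ICU folding the sketch does not change case.

```python
import unicodedata

# Esperanto letters that must survive folding (precomposed, NFC form).
# Hypothetical constant, named for this illustration only.
ESPERANTO_LETTERS = set("ĈĉĜĝĤĥĴĵŜŝŬŭ")


def fold_preserving_esperanto(text: str) -> str:
    """Strip diacritics from text, except on the Esperanto letters."""
    # Compose first, so that a combining sequence such as "c" + U+0302
    # becomes the precomposed "ĉ" and is recognized as an Esperanto letter.
    composed = unicodedata.normalize("NFC", text)
    out = []
    for ch in composed:
        if ch in ESPERANTO_LETTERS:
            out.append(ch)
            continue
        # Decompose the character and drop combining marks (category Mn).
        decomposed = unicodedata.normalize("NFD", ch)
        out.append(
            "".join(c for c in decomposed if unicodedata.category(c) != "Mn")
        )
    return "".join(out)
```

With this sketch, `fold_preserving_esperanto("café ĉevalo")` keeps `ĉ` but reduces `é` to `e`. In an actual Elasticsearch analysis chain the same effect would have to come from the folding filter's own configuration, which this sketch does not cover.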
The original implementation in Java is "Esperanto Stemmer", by Declan Whitford Jones.
That stemmer was wrapped into this Elasticsearch plugin by Trey Jones to provide the