xx_sent_ud_sm bad sentence split #12648
lance0108 started this conversation in Language Support
I'm trying xx_sent_ud_sm for sentence segmentation of Chinese text, and the model does not seem to get any of the sentence boundaries right. An example is provided below.
The example contains 10 sentences, and each one clearly ends with a "。". But the model completely failed to split them. Another model, `zh_core_web_sm`, works just fine.

Output:

For comparison, this is the output from `zh_core_web_sm`.
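Since every sentence in the example ends with a "。", a plain rule-based split could serve as a baseline to check either model's output against. The following is only a minimal sketch using the Python standard library. The function name `split_on_full_stop`, the extra terminators "！" and "？", and the sample text are assumptions made for illustration and are not taken from the original post.

```python
import re


def split_on_full_stop(text, terminators="。！？"):
    """Split text after each sentence-final mark, keeping the mark.

    Hypothetical helper for illustration; not part of spaCy.
    """
    # First alternative: a run of non-terminators followed by one or more
    # terminators. Second alternative: a trailing fragment with no terminator.
    pattern = re.compile(
        r"[^{0}]*[{0}]+|[^{0}]+$".format(re.escape(terminators))
    )
    return [s.strip() for s in pattern.findall(text) if s.strip()]


# Made-up sample text, not the example from the post.
sample = "今天天气很好。我们去公园散步。晚上一起吃饭。"
sentences = split_on_full_stop(sample)
for sentence in sentences:
    print(sentence)
```

If the number of sentences a model returns differs a lot from this baseline, that points to the model rather than the input. spaCy also ships a rule-based `sentencizer` pipeline component that splits on punctuation, which may be worth trying as an alternative here.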