Tokenize UK

A simple Python library that tokenizes text into sentences and sentences into words. Small, fast, and robust. Comes with a Ukrainian flavour.

Features

  • Tokenizes a given text into sentences
  • Tokenizes a given sentence into words
  • Works well with accented characters (such as stress marks) and apostrophes
  • Also suitable for other languages
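
To illustrate what the features above mean in practice, here is a minimal sketch built only on the standard `re` module. It is not the library's implementation or API: the names `split_sentences` and `tokenize_words` and both regular expressions are assumptions made for this example. It shows one way to keep a stress mark (combining acute accent, U+0301) and an apostrophe inside a single word token.

```python
import re

# Illustrative sketch only: these helpers are hypothetical and are NOT
# the tokenize_uk API. They show the kind of behaviour the feature list
# describes, using nothing but the standard library.

_APOSTROPHES = "'\u2019\u02bc"   # ASCII, typographic, and modifier-letter apostrophes
_STRESS = "\u0301"               # combining acute accent used as a stress mark

# A word starts with a letter and may continue with letters, stress marks,
# or an apostrophe that is immediately followed by another letter.
# Digit runs and single punctuation characters become their own tokens.
_WORD_RE = re.compile(
    rf"[^\W\d_](?:[^\W\d_]|[{_STRESS}]|[{_APOSTROPHES}](?=[^\W\d_]))*"
    r"|\d+"
    r"|[^\w\s]"
)

# Assumed sentence boundary: terminal punctuation, whitespace, then an
# uppercase Latin or Ukrainian letter. Abbreviations such as "вул." before
# a capitalised name would be split incorrectly by this naive rule.
_SENT_BOUNDARY_RE = re.compile(r"(?<=[.!?…])\s+(?=[A-ZА-ЯІЇЄҐ])")


def split_sentences(text):
    """Split text into sentences at the assumed boundary pattern."""
    return [s for s in _SENT_BOUNDARY_RE.split(text.strip()) if s]


def tokenize_words(sentence):
    """Split a sentence into word, number, and punctuation tokens."""
    return _WORD_RE.findall(sentence)


if __name__ == "__main__":
    text = "Зупини\u0301ся на п'ять хвилин. А потім іди далі!"
    for sentence in split_sentences(text):
        print(tokenize_words(sentence))
```

A plain `\w+` pattern would cut a word at the stress mark or the apostrophe, because neither character counts as a word character in Python's `re`. The sketch therefore lists them explicitly as allowed word-internal characters.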