forked from pudo/normality

A tiny library for Python text normalisation. Useful for ad-hoc text processing.


andkamau/normality

 
 


normality

Normality is a Python micro-package that contains a small set of text normalization functions for easier re-use. These functions accept a snippet of Unicode or UTF-8 encoded text and remove various classes of characters, such as diacritics and punctuation. This is useful as a preparation step for further text analysis.

Example

# coding: utf-8
from normality import normalize, slugify

text = normalize('Nie wieder "Grüne Süppchen" kochen!')
assert text == 'nie wieder grune suppchen kochen'

slug = slugify('My first blog post!')
assert slug == 'my-first-blog-post'

Extended usage

Read the source code; it's twenty lines of stuff.

RTSL

License

normality is open source, licensed under a standard MIT license (included in this repository as LICENSE).
