Commit

Merge pull request #49 from bact/main
Update doc for Python binding
bact authored Nov 8, 2021
2 parents 9753293 + 2088264 commit 849fd0c
Showing 6 changed files with 22 additions and 25 deletions.
4 changes: 2 additions & 2 deletions Cargo.toml
@@ -38,8 +38,8 @@ anyhow = "1.0.45"
binary-heap-plus = "0.4.1"
bytecount = "0.6.2"
lazy_static = "1.4.0"
-rayon = "1.5"
-regex = "1.4.6"
+rayon = "1.5.1"
+regex = "1.5.4"
rustc-hash = "1.1.0"


4 changes: 2 additions & 2 deletions nlpo3-nodejs/Cargo.lock

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion nlpo3-nodejs/Cargo.toml
@@ -10,7 +10,7 @@ exclude = ["index.node"]
crate-type = ["cdylib"]

[dependencies]
-ahash = "0.7.2"
+ahash = "0.7.6"
lazy_static = "1.4.0"
nlpo3 = "1.2.0"

2 changes: 1 addition & 1 deletion nlpo3-python/Cargo.toml
@@ -13,7 +13,7 @@ path = "src/lib.rs"
crate-type = ["cdylib", "rlib"]

[dependencies]
-ahash = "0.7.2"
+ahash = "0.7.6"
lazy_static = "1.4.0"
nlpo3 = "1.3.1"

2 changes: 1 addition & 1 deletion nlpo3-python/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "nlpo3"
-version = "1.1.3"
+version = "1.2.0"
description = "Python binding for nlpO3 Thai language processing library in Rust"
readme = "README.md"
requires-python = ">=3.6"
33 changes: 15 additions & 18 deletions nlpo3-python/setup.cfg
@@ -1,43 +1,40 @@
[metadata]
name = nlpo3
-version = 1.1.3
+version = 1.2.0
description = Python binding for nlpO3 Thai language processing library
long_description =
Python binding for nlpO3, a Thai natural language processing library in Rust.

## Features

- Thai word tokenizer
- use maximal-matching dictionary-based tokenization algorithm and honor Thai Character Cluster boundaries
- use user-supplied dictionary
- 2.5x faster than similar pure Python implementation
- built-in dictionary included (62,000 words, a copy from PyThaiNLP)
- support custom dictionary



## Install

```bash
pip install nlpo3
```

## Usage

Tokenization using default dictionary:
```python
from nlpo3 import segment

segment("สวัสดีครับ") # returns ["สวัสดี", "ครับ"]
```

Load the file `path/to/dict.file` into memory and assign it the name `custom_dict`.
Then tokenize a text with the `custom_dict` dictionary:
```python
from nlpo3 import load_dict, segment

load_dict("path/to/dict.file", "custom_dict")
segment("สวัสดีครับ", "custom_dict")
```


it will return a list of strings:
```python
['สวัสดี', 'ครับ']
```
(result depends on words included in the dictionary)

For more documentation, go to [https://github.com/PyThaiNLP/nlpo3](https://github.com/PyThaiNLP/nlpo3)

long_description_content_type = text/markdown
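
Reviewer note: the feature list in the `setup.cfg` description above mentions a maximal-matching, dictionary-based tokenization algorithm. For readers unfamiliar with the idea, here is a minimal pure-Python sketch. It is not nlpO3's Rust implementation: the function name `maximal_matching` and the toy dictionary are hypothetical, unknown characters are simply emitted one at a time, and Thai Character Cluster boundaries are ignored. It picks the segmentation with the fewest unknown characters, then the fewest tokens.

```python
from typing import List, Set, Tuple


def maximal_matching(text: str, dictionary: Set[str]) -> List[str]:
    """Illustrative only: segment `text` using as few tokens as possible.

    Cost of a segmentation is (unknown characters, token count), compared
    lexicographically, so dictionary words are always preferred.
    """
    n = len(text)
    max_len = max((len(word) for word in dictionary), default=1)

    # best[i] is the lowest cost found for segmenting text[:i];
    # back[i] is the start index of the last token in that segmentation.
    worst: Tuple[int, int] = (n + 1, n + 1)
    best: List[Tuple[int, int]] = [worst] * (n + 1)
    back: List[int] = [0] * (n + 1)
    best[0] = (0, 0)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            unknown, count = best[start]
            if piece in dictionary:
                candidate = (unknown, count + 1)
            elif end - start == 1:
                # Fallback: a single character that is not in the dictionary.
                candidate = (unknown + 1, count + 1)
            else:
                continue
            if candidate < best[end]:
                best[end] = candidate
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the tokens.
    tokens: List[str] = []
    pos = n
    while pos > 0:
        start = back[pos]
        tokens.append(text[start:pos])
        pos = start
    tokens.reverse()
    return tokens


toy_dict = {"สวัสดี", "ครับ"}  # hypothetical two-word dictionary
print(maximal_matching("สวัสดีครับ", toy_dict))  # ['สวัสดี', 'ครับ']
```

With a real dictionary the result depends on the words it contains, as the description notes. A production tokenizer would also need to handle ties between equally short segmentations and avoid splitting inside a character cluster.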