Java JNI wrapper for SentencePiece: unsupervised text tokenizer for Neural Network-based text generation.

cactuscommunications/sentencepiece-jni

 
 


SentencePiece Java Wrapper

A Java wrapper for SentencePiece, built with JNI. This module wraps the sentencepiece::SentencePieceProcessor class with the following modifications:

  • The Encode method is redefined as EncodeAsIds and EncodeAsPieces, and the Decode method as DecodeIds and DecodePieces.
  • The SentencePieceText proto is not supported.

SentencePiece Version

v0.1.92

Build and Install SentencePiece

To build and install the Java wrapper from source, run the following command:

% mvn clean install

Note that gcc, cmake and libnative must be installed before you run the build.

Usage

See SentencePieceProcessorTest for usage examples.
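
As a rough illustration of how the four redefined methods pair up, here is a minimal, self-contained sketch. It does not call the wrapper: PieceCodec and WhitespaceCodec are hypothetical names invented for this example, the Java signatures are assumptions, and the toy implementation simply splits on whitespace instead of running a SentencePiece model. Only the four method names come from this README. Check SentencePieceProcessorTest for the wrapper's actual class, package and signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: the four method names mirror the README,
// but the parameter and return types are assumptions.
interface PieceCodec {
    List<String> encodeAsPieces(String text);
    int[] encodeAsIds(String text);
    String decodePieces(List<String> pieces);
    String decodeIds(int[] ids);
}

// Toy stand-in that splits on whitespace. It does NOT run SentencePiece;
// it only shows that each encode variant has a matching decode variant.
final class WhitespaceCodec implements PieceCodec {
    private final Map<String, Integer> pieceToId = new HashMap<>();
    private final List<String> idToPiece = new ArrayList<>();

    @Override
    public List<String> encodeAsPieces(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    @Override
    public int[] encodeAsIds(String text) {
        List<String> pieces = encodeAsPieces(text);
        int[] ids = new int[pieces.size()];
        for (int i = 0; i < ids.length; i++) {
            String piece = pieces.get(i);
            Integer id = pieceToId.get(piece);
            if (id == null) {
                // Assign ids in order of first appearance.
                id = idToPiece.size();
                pieceToId.put(piece, id);
                idToPiece.add(piece);
            }
            ids[i] = id;
        }
        return ids;
    }

    @Override
    public String decodePieces(List<String> pieces) {
        return String.join(" ", pieces);
    }

    @Override
    public String decodeIds(int[] ids) {
        List<String> pieces = new ArrayList<>();
        for (int id : ids) {
            pieces.add(idToPiece.get(id));
        }
        return decodePieces(pieces);
    }
}

final class CodecDemo {
    public static void main(String[] args) {
        PieceCodec codec = new WhitespaceCodec();
        String text = "hello world hello";
        List<String> pieces = codec.encodeAsPieces(text);
        int[] ids = codec.encodeAsIds(text);
        System.out.println(pieces);
        System.out.println(Arrays.toString(ids));
        System.out.println(codec.decodePieces(pieces));
        System.out.println(codec.decodeIds(ids));
    }
}
```

With the real wrapper, the pieces and ids would come from a trained SentencePiece model rather than from whitespace splitting, so the values would differ; the pairing of encode and decode calls is the only thing this sketch is meant to convey.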
