This project converts Google's LaBSE model from TensorFlow to PyTorch. It also supports converting the smaller-LaBSE model and the LEALLA family of models from TensorFlow to PyTorch.
The models are uploaded to the HuggingFace Model Hub in the HF-compatible PyTorch (original and safetensors), TensorFlow and Flax formats, along with a compatible tokenizer.
To convert and export the models:
poetry install
poetry run convert_labse --output_path /path/to/models
To update the models on the HuggingFace Model Hub:
# Clone the already uploaded models.
cd /path/to/models
git clone https://huggingface.co/setu4993/LaBSE.git
# Export models anew and update.
cd /path/to/repo
poetry install
poetry run convert_labse --output_path /path/to/models/LaBSE --huggingface_path
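The commands above stop at the export step. Publishing the refreshed files presumably still needs a commit and a push from the cloned model repository. The following is a minimal sketch of that last step, not the project's own tooling: `MODEL_DIR`, `DRY_RUN`, `run` and `publish_model` are hypothetical names introduced here, and it assumes the clone is an ordinary git checkout with push access and Git LFS installed for the large weight files.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumptions (not part of the project): MODEL_DIR points at the clone made
# above, and DRY_RUN=1 (the default) only prints the git commands.
MODEL_DIR="${MODEL_DIR:-/path/to/models/LaBSE}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Stage, commit and push everything the export wrote into the clone.
publish_model() {
  run git -C "$MODEL_DIR" add --all
  run git -C "$MODEL_DIR" commit -m "Update exported models"
  run git -C "$MODEL_DIR" push
}

publish_model
```

Running it as-is only prints the three git commands. Setting `DRY_RUN=0` executes them, so it is worth checking `git -C "$MODEL_DIR" status` first.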
- LaBSE:
poetry run convert_labse --output_path /path/to/models/setu4993/LaBSE --huggingface_path
- smaller-LaBSE:
poetry run convert_labse --output_path /path/to/models/setu4993/smaller-LaBSE --smaller --huggingface_path
- LEALLA-base:
poetry run convert_lealla --size base --output_path /path/to/models/setu4993/LEALLA-base --huggingface_path
- LEALLA-small:
poetry run convert_lealla --size small --output_path /path/to/models/setu4993/LEALLA-small --huggingface_path
- LEALLA-large:
poetry run convert_lealla --size large --output_path /path/to/models/setu4993/LEALLA-large --huggingface_path
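The per-model commands above can be combined into one script. This is a rough sketch that reuses only the commands and flags listed above. The `MODELS_ROOT` and `DRY_RUN` variables and the `run` and `convert_all` helpers are hypothetical names added for illustration, and it assumes `poetry install` has already been run in the repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical variables, not part of the project.
MODELS_ROOT="${MODELS_ROOT:-/path/to/models/setu4993}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Convert both LaBSE variants, then each LEALLA size.
convert_all() {
  run poetry run convert_labse --output_path "$MODELS_ROOT/LaBSE" --huggingface_path
  run poetry run convert_labse --output_path "$MODELS_ROOT/smaller-LaBSE" --smaller --huggingface_path
  for size in base small large; do
    run poetry run convert_lealla --size "$size" --output_path "$MODELS_ROOT/LEALLA-$size" --huggingface_path
  done
}

convert_all
```

By default the script only prints the five commands so they can be checked first. Run it with `DRY_RUN=0` from the repository root to perform the conversions.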
See the model-cards directory for a copy of the model cards.
This repository and the conversion code are licensed under the MIT license, but the models are distributed under the Apache-2.0 license.