feat: Support sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 #129

Merged (3 commits) on Feb 21, 2024

geetu040
Contributor

Issue

This PR resolves issue #89

The requested model, sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2, has been added to the package with this configuration:

    {
        "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
        "dim": 384,
        "description": "Sentence Transformer model, paraphrase-multilingual-MiniLM-L12-v2",
        "size_in_GB": 0.46,
        "hf_sources": [
            "qdrant/paraphrase-multilingual-MiniLM-L12-v2-onnx-Q"
        ],
        "compressed_url_sources": []
    },
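
The entry above is one element in a list of model descriptions. As a minimal sketch of how such an entry could be sanity-checked before it is added, the snippet below verifies that the fields shown above are present and plausible. `REQUIRED_KEYS` and `validate_model_entry` are hypothetical names used for illustration only; they are not part of fastembed.

```python
# Hypothetical helper, NOT part of fastembed: checks that a model entry
# carries the fields shown in the configuration above.
REQUIRED_KEYS = {
    "model",
    "dim",
    "description",
    "size_in_GB",
    "hf_sources",
    "compressed_url_sources",
}


def validate_model_entry(entry: dict) -> list:
    """Return a list of problems found in a model entry (empty if none)."""
    problems = []

    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")

    dim = entry.get("dim")
    if not isinstance(dim, int) or dim <= 0:
        problems.append("dim must be a positive integer")

    # Assumption: an entry needs at least one place to download the model from.
    if not entry.get("hf_sources") and not entry.get("compressed_url_sources"):
        problems.append("at least one download source is required")

    return problems


# The entry from this PR, copied from the configuration above.
entry = {
    "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    "dim": 384,
    "description": "Sentence Transformer model, paraphrase-multilingual-MiniLM-L12-v2",
    "size_in_GB": 0.46,
    "hf_sources": ["qdrant/paraphrase-multilingual-MiniLM-L12-v2-onnx-Q"],
    "compressed_url_sources": [],
}

problems = validate_model_entry(entry)
print(problems or "entry looks consistent")
```
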

Usage

sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 can now be used with fastembed:

from fastembed import TextEmbedding

# load the model
embedding_model = TextEmbedding(
	model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)

embeddings = embedding_model.embed(
	# documents
	["hello world", "flag embedding"] * 100,

	# optional params
	batch_size=2,
	parallel=2
)
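
Note that `embed` returns a lazy generator, so nothing is computed until it is consumed. The sketch below, under that assumption, materializes the vectors and compares two of them. `cosine_similarity` and `demo` are helpers written for this example and are not fastembed API; `demo` needs fastembed installed and downloads the model on first use, so it is defined but not called automatically.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length vectors (lists or 1-D arrays)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def demo() -> None:
    # Requires `pip install fastembed`; the model is downloaded on first use.
    from fastembed import TextEmbedding

    model = TextEmbedding(
        model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
    )

    # embed() yields one vector per document; list() forces the computation.
    vectors = list(model.embed(["hello world", "hallo Welt"]))

    # The configuration above declares dim = 384 for this model.
    print(len(vectors[0]))
    print(cosine_similarity(vectors[0], vectors[1]))
```

Calling `demo()` prints the vector length and the similarity of an English and a German greeting, which is a quick way to confirm the multilingual model loaded.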

Changes

The following changes were made to add this model:

  • fastembed/fastembed/models.json: the model configuration was added
  • fastembed/fastembed/text/onnx_embedding.py: the model configuration was added to the variable supported_onnx_models
  • fastembed/fastembed/docs/examples/Supported_Models.ipynb: the notebook was re-run so that its output lists the new model
  • fastembed/tests/test_onnx_embeddings.py and fastembed/tests/test_text_onnx_embeddings.py: the new model was added to the list of test cases, together with the expected output it generates on the hard-coded test documents
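
The test change described in the last item follows a common pattern: store the leading components of a known-good embedding and compare new output against them within a tolerance. The sketch below illustrates that pattern only. Every name in it (`CANONICAL_VECTOR_VALUES`, `stub_embed`, `matches_canonical`, `check_all_models`) is hypothetical, the embedder is a stand-in, and the numbers are placeholders rather than real model output.

```python
import math

# Placeholder reference values for illustration only; NOT real model output.
CANONICAL_VECTOR_VALUES = {
    "example/stub-model": [0.1, 0.2, 0.3],
}


def stub_embed(model_name, docs):
    """Stand-in for a real embedding call; returns a fixed vector per document."""
    return [[0.1, 0.2, 0.3, 0.4] for _ in docs]


def matches_canonical(vector, canonical, atol=1e-3) -> bool:
    """Compare the leading components of `vector` to the stored reference."""
    prefix = vector[: len(canonical)]
    if len(prefix) < len(canonical):
        return False
    return all(math.isclose(v, c, abs_tol=atol) for v, c in zip(prefix, canonical))


def check_all_models(embed=stub_embed) -> bool:
    """Run the comparison for every model that has stored reference values."""
    docs = ["hello world", "flag embedding"]
    for model_name, canonical in CANONICAL_VECTOR_VALUES.items():
        first_vector = embed(model_name, docs)[0]
        if not matches_canonical(first_vector, canonical):
            return False
    return True


print(check_all_models())
```

Only a short prefix of each vector is stored, which keeps the test file small while still catching a wrong model file or a changed preprocessing step.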

Tests

All test cases pass with the new model included.
Screenshot from 2024-02-20 21-12-50

@NirantK NirantK requested a review from Anush008 February 21, 2024 05:32
Member

@Anush008 Anush008 left a comment

Great work @geetu040. Thank you.

@NirantK NirantK merged commit 406f432 into qdrant:main Feb 21, 2024
15 checks passed
3 participants