What happened?
I am currently using the LateInteraction and BM25 models from the fastembed library, but my GPU is not being fully utilized. My provider is set to CUDAExecutionProvider, yet only 4 GB out of 24 GB of GPU memory is being used!
What is the expected behaviour?
All of the available GPU memory should be utilized!
A minimal reproducible example
```python
def _initialize_colbert_model(gpu: bool):
    """Initialize the ColBERT model on GPU or CPU based on the flag."""
    providers = ["CUDAExecutionProvider"] if gpu else ["CPUExecutionProvider"]
    logger.info(f"Initializing ColBERT model with {providers}")
    return LateInteractionTextEmbedding(
        "colbert-ir/colbertv2.0",
        providers=providers,
        cuda=gpu,
        parallel=0,
        local_files_only=LOCAL_FILES_ONLY,
    )


def _initialize_sparse_bm25_model(gpu: bool):
    """Initialize the FastEmbedSparse model on GPU or CPU based on the flag."""
    providers = ["CUDAExecutionProvider"] if gpu else None
    logger.info(f"Initializing FastEmbedSparse model with {'GPU' if gpu else 'CPU'}")
    return FastEmbedSparse(
        providers=providers,
        cuda=gpu,
        parallel=0,
        local_files_only=LOCAL_FILES_ONLY,
    )
```
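The two initializers above repeat the same provider-selection conditional. As a minimal sketch of how that could be factored out (the helper name `select_providers` is mine, not part of fastembed; the CPU fallback entry is a common ONNX Runtime pattern that I've added, whereas the original requests CUDA only):

```python
def select_providers(gpu: bool) -> list[str]:
    """Return the ONNX Runtime execution providers to request.

    When a GPU is requested, CPU is kept as a fallback so the model
    can still load if CUDA initialization fails.
    """
    if gpu:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]
```

Both `_initialize_*` functions could then call `select_providers(gpu)` instead of duplicating the conditional.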
Here is my docker-compose file:
```yaml
version: '3.8'
services:
  web:
    build:
      context: .
      dockerfile: Dockerfile-gpu
    environment:
      - GPU_DEPLOYMENT=TRUE
    restart: always
    ports:
      - "5003:5003"
    volumes:
      - ./logs:/app/logs
    deploy:
      resources:
        limits:
          memory: 15g
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```
What Python version are you on? e.g. python --version
Python 3.12
FastEmbed version
fastembed-gpu==0.4.0
What OS are you seeing the problem on?
Linux
Relevant stack traces and/or logs
No response