[Bug]: GPU not being fully utilized #434

Open
nauyan opened this issue Jan 6, 2025 · 0 comments

nauyan commented Jan 6, 2025

What happened?

I am using the LateInteraction and BM25 models from the fastembed library, but my GPU is not being fully utilized. My provider is set to CUDAExecutionProvider, yet only 4 GB of the 24 GB of GPU memory is in use.
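To make the memory figures above easier to reproduce, they can be read from `nvidia-smi` instead of by eye. This is a minimal sketch, assuming `nvidia-smi` is on the PATH inside the container; the helper names `parse_memory_csv` and `gpu_memory_usage` are hypothetical and not part of fastembed.

```python
import shutil
import subprocess
from typing import List, Tuple

# Ask nvidia-smi for used and total memory in MiB, one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text: str) -> List[Tuple[int, int]]:
    """Parse 'used, total' pairs (MiB), one GPU per line."""
    pairs = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def gpu_memory_usage() -> List[Tuple[int, int]]:
    """Return (used, total) MiB per GPU, or an empty list without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Calling `gpu_memory_usage()` before and after an embedding batch would show whether the memory in use changes with the workload.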

What is the expected behaviour?

All of the available GPU memory should be utilized!

A minimal reproducible example

```python
def _initialize_colbert_model(gpu: bool):
    """Initialize ColBERT model with GPU or CPU based on the flag."""
    provider = ["CUDAExecutionProvider"] if gpu else ["CPUExecutionProvider"]
    use_cuda = True if gpu else False
    logger.info(f"Initializing ColBERT model with {provider}")
    return LateInteractionTextEmbedding(
        "colbert-ir/colbertv2.0",
        providers=provider,
        cuda=use_cuda,
        parallel=0,
        local_files_only=LOCAL_FILES_ONLY,
    )


def _initialize_sparse_bm25_model(gpu: bool):
    """Initialize FastEmbedSparse model with GPU or CPU based on the flag."""
    provider = ["CUDAExecutionProvider"] if gpu else None
    use_cuda = True if gpu else False
    logger.info(f"Initializing FastEmbedSparse model with {'GPU' if gpu else 'CPU'}")
    return FastEmbedSparse(
        providers=provider,
        cuda=use_cuda,
        parallel=0,
        local_files_only=LOCAL_FILES_ONLY,
    )
```
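Since the initializers above request `CUDAExecutionProvider` by name, it may help to confirm that the installed ONNX Runtime build actually exposes it. This is a minimal sketch, assuming `onnxruntime` (or `onnxruntime-gpu`) is importable in the same environment; `cuda_provider_available` is a hypothetical helper, not a fastembed function.

```python
from typing import Optional


def cuda_provider_available() -> Optional[bool]:
    """Return True/False if ONNX Runtime is installed, None if it is not."""
    try:
        import onnxruntime as ort
    except ImportError:
        return None
    # get_available_providers() lists the execution providers in this build.
    return "CUDAExecutionProvider" in ort.get_available_providers()


print("CUDA provider available:", cuda_provider_available())
```

A `False` result would mean the session cannot use the GPU at all, which is a different problem from low memory usage on a working GPU session.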

Here is my docker-compose file:
```yaml
version: '3.8'

services:
  web:
    build:
      context: .
      dockerfile: Dockerfile-gpu
    environment:
      - GPU_DEPLOYMENT=TRUE
    restart: always
    ports:
      - "5003:5003"
    volumes:
      - ./logs:/app/logs
    deploy:
      resources:
        limits:
          memory: 15g
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

What Python version are you on? e.g. python --version

Python 3.12

FastEmbed version

fastembed-gpu==0.4.0

What os are you seeing the problem on?

Linux

Relevant stack traces and/or logs

No response
