
Why can't this run_example.sh script run? #70

Open · wcqy-ye opened this issue Dec 1, 2024 · 9 comments


wcqy-ye commented Dec 1, 2024

[screenshot of the error]

fnlandini (Collaborator) commented:

Hi,
Are you using the latest version? https://github.com/BUTSpeechFIT/VBx/blob/master/VBx/vbhmm.py
The argument should be --xvec-transform, but some older versions have --xvec-tran and --xvec-mean instead.


wcqy-ye commented Dec 1, 2024

Hello, I am not using the code from the master branch; I am using the code from the branch shown in the screenshot below. What should I do next?
[screenshot of the branch in use]

fnlandini (Collaborator) commented:

The run_example script will only work with the master branch, which is the latest version. If you want to use the voxconverse branch, you are welcome to do so, but the run_example script will not work. You should be able to run on the VoxConverse data with that branch.


wcqy-ye commented Dec 8, 2024

Thank you a lot.
I would like to ask how to replace the model. Your project provides several models, and I have trained a few other ONNX and PyTorch (.pt) models with WeSpeaker that I would like to try instead. Also, I noticed that your script uses the CPU for computation. Is it possible to use the GPU for acceleration?


wcqy-ye commented Dec 8, 2024

I successfully used an ONNX model to extract embeddings, but I want to accelerate the process with a GPU. However, I ran into an issue: computation on the GPU is even slower than on the CPU alone. Previously, I used the ONNX model from the WeSpeaker project for this task, and it was very fast, completing the entire embedding extraction in about one minute. Do you have any suggestions for resolving this? I would greatly appreciate any advice.

fnlandini (Collaborator) commented:

Hi @wcqy-ye
It should be possible to run on GPU by replacing DEVICE=cpu with DEVICE=gpu, if I'm not mistaken. I have certainly done the extraction with the models in this repository on a GPU before. But I have not used models from WeSpeaker, so I am not sure whether you need to change something in the scripts.
Have you verified that you can extract on GPU using the models shared in the repository? I have done that before, and it runs faster than on CPU.
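If it helps, one quick sanity check is to confirm which execution provider onnxruntime actually selected, since it silently falls back to CPU when the GPU build or CUDA libraries are missing. A minimal sketch, separate from the scripts in this repository (the model path is just an example):

```python
import onnxruntime as ort

# If onnxruntime-gpu is not installed, or CUDA/cuDNN cannot be loaded,
# the session silently falls back to CPUExecutionProvider even though
# CUDAExecutionProvider was requested.
session = ort.InferenceSession(
    "voxceleb_resnet34_LM.onnx",  # example path
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(ort.get_device())          # "GPU" only if the CUDA build is installed
print(session.get_providers())   # CUDAExecutionProvider should be listed first
```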


wcqy-ye commented Dec 9, 2024

Hello, thank you for your reply. I tried the models shared in the repository and ran the VoxConverse2020_run.sh script directly, comparing the CPU and GPU modes, and found no difference in speed. Additionally, I noticed an issue with how free_gpu.sh checks whether the GPU is idle: in many cases, the "No running processes found" field does not appear even when the GPU is idle. I suspect the lack of speed-up is because no batching is performed; without a batch dimension, the GPU's parallelism goes unused and the CPU-side computation becomes the bottleneck.
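As an aside, a more robust idle check than looking for the "No running processes found" line might be to query nvidia-smi's machine-readable output. A sketch (assuming nvidia-smi is on PATH):

```python
import subprocess

# Sketch of a GPU-idle check that does not depend on the human-readable
# "No running processes found" line, which nvidia-smi does not always print.
out = subprocess.run(
    ["nvidia-smi", "--query-compute-apps=pid", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print("GPU idle" if not out else f"GPU busy, compute PIDs: {out.splitlines()}")
```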

fnlandini (Collaborator) commented:

From what I see in your screenshots, both runs say GPU, but I trust that you ran one of them with the CPU option.
In my previous experience I definitely saw speed-ups when using the GPU, so this is certainly not expected. At the same time, the reported time includes loading the file and computing the features. In principle that should not take long, but you can focus on measuring the specific call to inference.
Regarding the batching, you are right: this code does not fully utilize the resources. For these evaluation datasets with short files, I would normally run the script on several CPUs in parallel, and that was fast enough for our needs. If you want to add more options for GPU, I am afraid you will need to implement them yourself.
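To illustrate both points, a rough sketch of timing only the inference call, once per segment and once batched (the model path and the (frames, feats) input shape are assumptions; inspect session.get_inputs() for the real name and shape, and note that our extraction code does not batch like this):

```python
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "voxceleb_resnet34_LM.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
name = session.get_inputs()[0].name
segments = [np.random.randn(200, 80).astype(np.float32) for _ in range(32)]

session.run(None, {name: segments[0][None]})  # warm-up: first CUDA call is slow

t0 = time.perf_counter()
for seg in segments:                          # one inference call per segment
    session.run(None, {name: seg[None]})
print(f"one-by-one: {time.perf_counter() - t0:.3f}s")

batch = np.stack(segments)                    # equal-length segments stacked
t0 = time.perf_counter()
session.run(None, {name: batch})              # one call for the whole batch
print(f"batched:    {time.perf_counter() - t0:.3f}s")
```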


wcqy-ye commented Dec 10, 2024

Thank you for your response. However, I encountered a serious issue during my testing. The results I obtained with the ONNX model voxceleb_resnet34_LM.onnx were very poor, which suggests a critical problem. After troubleshooting step by step, I found the likely cause: the PLDA model file (plda-file), the x-vector transformation matrix file (xvec-tran), and the mean vector file (xvec-mean) all have to correspond to the embedding model used in the previous step.

This mismatch likely caused the issue. I would like to ask for your advice on how to handle it. Specifically:

1. How can I generate these files for a new embedding model? (My rough understanding of one possible recipe is sketched below.)
2. Since I want to experiment with fusing embeddings from different models, is there any way to avoid these files altogether?

If you have any other suggestions or thoughts, I would greatly appreciate it if you could share them. Thank you very much for your help!
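For question 1, my rough understanding is that the mean and transform would have to be re-estimated on embeddings extracted with the new model, something like the generic numpy sketch below. The file names are hypothetical, and this is surely not the exact structure your transform.h5 stores; please correct me if I am wrong:

```python
import numpy as np

# Generic sketch: estimate a mean vector and a PCA-whitening transform from
# x-vectors extracted with the NEW embedding model. This only illustrates
# that mean/transform/PLDA must all come from the same extractor; the PLDA
# model would likewise need retraining on the transformed embeddings with
# speaker labels.
X = np.load("train_xvectors.npy")   # hypothetical (N, D) array of embeddings
mean = X.mean(axis=0)
Xc = X - mean

# PCA whitening: rotate onto the covariance eigenvectors, scale to unit variance.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
W = eigvecs / np.sqrt(eigvals + 1e-10)

np.save("mean.npy", mean)           # hypothetical output names
np.save("transform.npy", W)
```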
