Description:
The provider stats endpoint misreports GPU utilization for deployments that request GPUs in a service with a count greater than 1. The problem is present in provider version 0.4.7 and Akash Network version 0.26.2.
Issue Details:
The current implementation of the provider stats endpoint does not factor in the 'service count' for deployments that request GPUs. As a result, the reported totals for GPU usage and availability are inaccurate.
Example Scenario:
Consider a GPU deployment consisting of two services:
First service with count: 14 and gpu: 2.
Second service with count: 1 and gpu: 2.
The total GPU usage should be 30 (calculated as 14*2 + 1*2), but this is not reflected in the provider stats.
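The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the provider's code: the `Service` class and both function names are hypothetical, and the count-ignoring variant is an assumption about how the smaller figure could arise, not a confirmed description of the bug.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical stand-in for one service in a deployment."""
    count: int  # number of replicas of the service
    gpu: int    # GPUs requested per replica


def total_gpus(services):
    """Sum GPUs across services, multiplying each request by its count."""
    return sum(s.count * s.gpu for s in services)


def total_gpus_ignoring_count(services):
    """Sum per-replica GPU requests only (assumed faulty behavior)."""
    return sum(s.gpu for s in services)


# The two services from the example scenario.
services = [Service(count=14, gpu=2), Service(count=1, gpu=2)]

print(total_gpus(services))                 # 30
print(total_gpus_ignoring_count(services))  # 4
```

Under that assumption, dropping the count multiplier yields 4 for this deployment, while the count-aware sum yields the expected 30.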
Observed Output:
For the provider at provider.akash-ai.com (akash1c6rsz4f59nkus3s5qauxxh969j2mtkkn2clk2e), the stats endpoint reports only 4 GPUs in use, where 30 are actually in use. The script output is as follows (based on the :8443/stats report you can see below):
Expected Behavior:
The provider stats endpoint should report the actual total number of GPUs in use, multiplying each service's GPU request by its 'service count' for deployments with GPU requests.
Impact:
This inaccurate reporting can mislead operators about resource availability and utilization, potentially affecting scheduling decisions and overall resource management on the Akash Network.
Additional info