Log stats #1423
base: main
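I don't think the GPU and CPU metrics need to be included here. You can start NVIDIA's exporter on its own port to get those metrics, e.g. docker run -d --gpus all --rm -p 9400:9400 nvcr.io/nvidia/k8s/dcgm-exporter:3.1.7-3.1.4-ubuntu20.04
Could token-related performance metrics be added?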
Can we add first token time as well, so that the difference between scheduling time and first token time can be used to estimate prefill time?
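A minimal sketch of the estimate described in this comment, assuming each request records a few timestamps. The `RequestTimestamps` class and its field names are hypothetical and are not taken from this PR:

```python
from dataclasses import dataclass


@dataclass
class RequestTimestamps:
    """Hypothetical per-request timestamps, in seconds."""
    arrival_time: float      # request received by the server
    scheduled_time: float    # request picked up by the scheduler
    first_token_time: float  # first output token produced


def estimate_queue_seconds(ts: RequestTimestamps) -> float:
    """Time spent waiting before the request was scheduled."""
    return ts.scheduled_time - ts.arrival_time


def estimate_prefill_seconds(ts: RequestTimestamps) -> float:
    """Approximate prefill time as first-token time minus scheduling time."""
    return ts.first_token_time - ts.scheduled_time
```

The difference is only an estimate: it also includes any overhead between scheduling and the start of the forward pass.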
@@ -123,6 +123,10 @@ def add_parser_api_server():
                        type=str,
                        default='',
                        help='Qos policy config path')
    parser.add_argument('--log-stats',
Use --metrics instead of --log-stats.
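As an illustration only, a boolean `--metrics` switch could be declared with `argparse` roughly as follows. The `build_parser` helper, the help text, and the `store_true` behaviour are assumptions; the PR's actual argument definition is not shown in full here:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical stand-alone parser; not the project's real CLI."""
    parser = argparse.ArgumentParser(description='api_server sketch')
    # Assumed shape of the flag: off by default, enabled when present.
    parser.add_argument('--metrics',
                        action='store_true',
                        default=False,
                        help='Whether to expose serving metrics')
    return parser
```

Calling `build_parser().parse_args(['--metrics'])` yields a namespace whose `metrics` attribute is `True`.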
Open http://xxxx:23333/metrics/ to view the metrics.
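Besides opening the page in a browser, the endpoint could be read from a script. The sketch below assumes the endpoint returns the Prometheus text exposition format without trailing timestamps. `parse_metrics_text` and `fetch_metrics` are hypothetical helpers, and the URL must be replaced with the real host:

```python
from urllib.request import urlopen


def parse_metrics_text(text: str) -> dict:
    """Parse 'name{labels} value' lines into a dict, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # blank line, or a '# HELP' / '# TYPE' comment
        # Assumes no timestamp: the value is the last space-separated field.
        name, _, value = line.rpartition(' ')
        try:
            samples[name] = float(value)
        except ValueError:
            continue  # ignore lines that do not end in a number
    return samples


def fetch_metrics(url: str, timeout: float = 5.0) -> dict:
    """Download the metrics page and parse it."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_metrics_text(resp.read().decode('utf-8'))
```

For example, `fetch_metrics('http://xxxx:23333/metrics/')` would return a mapping from sample name (including any label set) to its value, once `xxxx` is replaced with the server address.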