one-yolov5/classify/train.py script: nsys report [2023-03-29] #123

Open
ccssu opened this issue Mar 29, 2023 · 0 comments

Comments

Collaborator

ccssu commented Mar 29, 2023

Introduction

I ran two nsys profiles of one-yolov5/classify/train.py.

one-yolo_profile:
03-29-07-10profile.zip

torch-yolo_profile:
torch_03-29-08-37profile.zip

one-yolo test results

tloss = (tloss * i + loss.item()) / (i + 1) # update mean losses

Metric                        one-yolo   torch-yolo
Time spent on the tloss line  99 ms      14 ms
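For context, the tloss update is an incremental mean, so its arithmetic is cheap. The minimal sketch below uses made-up loss values and plain floats to show that the update reproduces the ordinary average. In PyTorch, `loss.item()` on a CUDA tensor has to wait for the device before it can return a Python number. I am assuming OneFlow behaves similarly, which would make the `.item()` call, rather than the arithmetic, the likely place where the time goes.

```python
def running_mean(losses):
    """Apply tloss = (tloss * i + loss) / (i + 1) over per-batch loss values."""
    tloss = 0.0
    for i, loss in enumerate(losses):
        # Same update as the tloss line in train.py, with loss already a float
        tloss = (tloss * i + loss) / (i + 1)
    return tloss


batch_losses = [2.0, 1.0, 0.5, 0.5]  # made-up values, for illustration only
mean = running_mean(batch_losses)
plain_mean = sum(batch_losses) / len(batch_losses)
print(mean, plain_mean)
```

The two printed values agree up to floating-point rounding.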

Notes:

  • flow.__version__='0.9.1.dev20230327+cu117'
  • torch.__version__='1.13.0+cu117'
  • Both runs train in float32.
  • Both launch commands use batch-size=256, epochs=6, and the yolov5s-cls model.
  • Machine: A100

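Conclusion: the nsys analysis shows that the tloss line is markedly slower in one-yolo than in torch-yolo (99 ms vs. 14 ms). Optimizing it should improve training speed considerably.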
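One possible mitigation, which is not the fix the maintainers are working on, is to call `.item()` less often: keep the loss sum as a tensor and convert it to a Python float only every few steps. The sketch below is illustrative only. `FakeLoss` is a hypothetical stand-in for a device tensor so that the example runs without a GPU, and `mean_loss_deferred` and `log_every` are names I made up. In real training code the loss would also need to be detached before it is accumulated.

```python
class FakeLoss:
    """Hypothetical stand-in for a device tensor; counts .item() calls."""
    item_calls = 0

    def __init__(self, value):
        self.value = float(value)

    def __add__(self, other):
        return FakeLoss(self.value + other.value)

    def item(self):
        FakeLoss.item_calls += 1  # in real code this is where the host would wait
        return self.value


def mean_loss_deferred(losses, log_every=50):
    """Accumulate the loss sum and call .item() only every log_every steps."""
    total = None
    tloss = 0.0
    for i, loss in enumerate(losses):
        # With real tensors use loss.detach() here so no autograd graph is kept
        total = loss if total is None else total + loss
        if (i + 1) % log_every == 0 or i + 1 == len(losses):
            tloss = total.item() / (i + 1)  # the only host-side read
    return tloss


FakeLoss.item_calls = 0
losses = [FakeLoss(v) for v in (1.0, 2.0, 3.0, 4.0)]
mean = mean_loss_deferred(losses, log_every=2)
print(mean, FakeLoss.item_calls)
```

The trade-off is that the displayed mean loss refreshes only every `log_every` steps. Whether this closes the gap to torch-yolo would have to be confirmed with another nsys run.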

one-yolov5 project data

Project URL: https://github.com/Oneflow-Inc/one-yolov5
Dataset path: @oneflow-25:/data/home/fengwen/imagenette160
Weights path: @oneflow-25:/data/home/fengwen/weight_v1_2_0

If running nsys produces the following error:
The target application terminated. One or more process it created re-parented.
Waiting for termination of re-parented processes.
Use the `--wait` option to modify this behavior.

comment out the check_git_status() line in train.py.
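As an alternative to editing train.py before every profiling run, the call could be guarded by an environment variable. This is only a sketch: the `SKIP_GIT_CHECK` variable and the `run_startup_checks` helper are hypothetical and do not exist in one-yolov5. The check function is passed in as an argument so the example runs on its own.

```python
import os


def run_startup_checks(check_git_status, env=os.environ):
    """Call check_git_status() unless SKIP_GIT_CHECK=1 is set (hypothetical variable)."""
    if env.get("SKIP_GIT_CHECK") == "1":
        return False  # skipped, e.g. when launched under nsys
    check_git_status()
    return True


calls = []
ran_normal = run_startup_checks(lambda: calls.append("checked"), env={})
ran_profiled = run_startup_checks(lambda: calls.append("checked"),
                                  env={"SKIP_GIT_CHECK": "1"})
print(ran_normal, ran_profiled, calls)
```

With such a guard in place, a profiling run would be started with `SKIP_GIT_CHECK=1` in front of the nsys command, and normal runs would stay unchanged.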

one-yolo detailed test data

one-yolov5 launch command
DATESTR=$(date +"%m-%d-%H-%M")
cd  ~/one-yolov5 
set -e 
# py-spy record -o profile.svg --native --
run_cmd="/usr/local/cuda/bin/nsys profile -o runs/${DATESTR}profile python \
    classify/train.py \
    --model runs/yolov5s-cls.pt \
    --data ../datasets/imagenette160 \
    --img 224 \
    --batch 256 \
    --epochs 6 \
    --project One-YOLOv5_v_1_2_0_train \
    --name yolov5n-default \
    --multi_tensor_optimizer \
    --lr0 0.1 --optimizer SGD"

echo ${run_cmd}
eval ${run_cmd}

one-yolo_profile
03-29-07-10profile.zip

image

torch-yolo_profile
torch_03-29-08-37profile.zip
image

Proposed fix

Work in progress...

Reference material

image
