mmengine - INFO #1489
anaamansari
started this conversation in
General
Hi, I have a question about the output logs of the training script.
What are the units of time, data_time, and memory in the output below?
02/08 06:41:02 - mmengine - INFO - Epoch(train) [1][ 50/1931] lr: 7.9756e-05 eta: 7:52:46 time: 2.4590 data_time: 0.3534 memory: 42306 grad_norm: 3328.0067 loss: 634.0465 loss_heatmap: 613.3816 layer_-1_loss_cls: 7.7629 layer_-1_loss_bbox: 12.9020 matched_ious: 0.0015
02/08 06:42:41 - mmengine - INFO - Epoch(train) [1][ 100/1931] lr: 9.3103e-05 eta: 7:05:07 time: 1.9825 data_time: 0.0618 memory: 42732 grad_norm: 70.0995 loss: 23.1417 loss_heatmap: 7.4082 layer_-1_loss_cls: 5.3415 layer_-1_loss_bbox: 10.3920 matched_ious: 0.0288
Also, nvidia-smi shows that I am using about 78G of memory on an H100 GPU. Please let me know how the memory reported in the log relates to the memory reported by nvidia-smi.
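To compare the logged values against nvidia-smi over a run, it can help to pull the numeric fields out of each log line first. Below is a minimal sketch in plain Python; the helper name `parse_log_fields` is hypothetical and not part of mmengine, and the sketch makes no assumption about the units of any field. Non-numeric fields such as eta are skipped.

```python
import re

# Hypothetical helper, not part of mmengine: matches "key: number" pairs
# where the number is followed by whitespace or the end of the line.
# Values like "7:52:46" (eta) do not match and are skipped.
FIELD_RE = re.compile(
    r"([A-Za-z_][\w\-]*):\s+(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)(?=\s|$)"
)


def parse_log_fields(line: str) -> dict:
    """Return the numeric "key: value" fields of one training log line."""
    return {key: float(value) for key, value in FIELD_RE.findall(line)}


# Shortened copy of the first log line quoted above.
sample = (
    "02/08 06:41:02 - mmengine - INFO - Epoch(train) [1][ 50/1931] "
    "lr: 7.9756e-05 eta: 7:52:46 time: 2.4590 data_time: 0.3534 "
    "memory: 42306 grad_norm: 3328.0067 loss: 634.0465"
)

fields = parse_log_fields(sample)
print(fields["time"], fields["data_time"], fields["memory"])
```

Collecting `fields["memory"]` per logged iteration gives a series that can be set side by side with nvidia-smi readings taken at the same timestamps.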