
Question about reproduce #17

Open
Naive-Bayes opened this issue Jul 28, 2023 · 9 comments

Comments

@Naive-Bayes

Thanks for such excellent work. I want to build on PlanT, so I first want to reproduce the results reported in the paper (DS: 81.36, RC: 93.55, IS: 0.87).

I downloaded the pretrained model and ran the evaluation with "python leaderboard/scripts/run_evaluation.py user=$USER experiments=PlanTmedium3x eval=longest6", exactly as described in the README.md.

However, I get worse results (DS: 70.29, RC: 81.851, IS: 0.866). The lower Route Completion seems to be what drags the score down.

Could you give me some recommendations for reproducing the results from the paper? Thanks a lot!

@RenzKa
Collaborator

RenzKa commented Aug 14, 2023

Hi,
thanks for your interest in our work.
I saw that you had the same problem with transfuser (autonomousvision/carla_garage#11).
Did you already find the reason for the performance drop there?

I will test it again on my machine to double-check the numbers. Can you send me more details about how you run it in the meantime? The .hydra/config.yaml may help.

UPDATE: I reran the evaluation with the uploaded checkpoints and got the following results in three eval runs:

  1. DS: 82.24, RC: 92.53, IS: 0.89
  2. DS: 78.32, RC: 92.79, IS: 0.85
  3. DS: 80.45, RC: 92.16, IS: 0.88

@Naive-Bayes
Author

Sorry for the late reply, I had to work on some other things in the last two weeks.

In carla_garage, we have to set some os.environ variables; with those I now think I can reproduce the carla_garage results.

But those environment variables are not needed for PlanT, and I still cannot get DS 82.

I also found another issue: I retrained PlanT and evaluated the 49th epoch, which gets DS 77.
So in your tests, the 47th epoch is better than the 49th?
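
(A purely illustrative sketch of setting such environment variables in Python before the evaluation code runs. The exact variables carla_garage needs are not listed in this thread; the names below, e.g. CARLA_ROOT, follow common CARLA leaderboard conventions and are assumptions, not confirmed requirements.)

    import os

    # Hypothetical sketch only -- the thread does not say which variables are needed.
    # Adjust the names and paths to your own installation.
    os.environ["CARLA_ROOT"] = "/path/to/carla"
    os.environ["SCENARIO_RUNNER_ROOT"] = "/path/to/scenario_runner"
    os.environ["LEADERBOARD_ROOT"] = "/path/to/leaderboard"

    # Note: os.environ changes are inherited by subprocesses, but variables like
    # PYTHONPATH do not retroactively change the current interpreter's import path.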

@georgeliu233

Sorry for the late reply, I had to work on some other things in the last two weeks.

In carla_garage, we have to set some os.environ variables; with those I now think I can reproduce the carla_garage results.

But those environment variables are not needed for PlanT, and I still cannot get DS 82.

I also found another issue: I retrained PlanT and evaluated the 49th epoch, which gets DS 77. So in your tests, the 47th epoch is better than the 49th?

Hi there,

Can you share how you got the result of DS 82? I am running into the same problem (DS 70, RC 81.8) using the setup the authors released.

Best

@RenzKa
Collaborator

RenzKa commented Oct 16, 2023

Hi,
sorry about those problems.
Could you quickly share your setup (what GPU type are you using, your config, etc.)?

@georgeliu233

Hi, sorry about those problems. Could you quickly share your setup (what GPU type are you using, your config, etc.)?

Hi,

I'm currently using an RTX 3080. The CUDA version is 11.1 and I follow the conda environment,
except for using torch 1.9+cu111 instead of cu102; the rest of the configs are identical for inference.

I'm wondering whether all the environment variables for CARLA are listed in the current repo. Since the drop is in RC, is it possible that some of the variables affecting route or waypoint selection differ from your original experiment?

Thanks again for the quick response!

Best,
Haochen

@georgeliu233

Here is the log of the reproduced result.
It seems that the lower RC score is due to frequent agent blocks (0.253).
Any suggestions on resolving this?

"labels": [
"Avg. driving score",
"Avg. route completion",
"Avg. infraction penalty",
"Collisions with pedestrians",
"Collisions with vehicles",
"Collisions with layout",
"Red lights infractions",
"Stop sign infractions",
"Off-road infractions",
"Route deviations",
"Route timeouts",
"Agent blocked"
],
"sensors": [
"carla_opendrive_map",
"carla_imu",
"carla_gnss",
"carla_speedometer"
],
"values": [
"70.903",
"83.082",
"0.877",
"0.004",
"0.080",
"0.000",
"0.005",
"0.126",
"0.023",
"0.000",
"0.046",
"0.253"
]
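
(A small sketch, not from the thread, for pairing the labels with the values when inspecting such a leaderboard results file. The file name is hypothetical, and the snippet assumes "labels" and "values" are parallel top-level lists, as in the excerpt above.)

    import json

    # Hypothetical path -- use the results file produced by your own evaluation run.
    with open("simulation_results.json") as f:
        results = json.load(f)

    # Print each metric next to its value instead of as two separate lists.
    for label, value in zip(results["labels"], results["values"]):
        print(f"{label}: {value}")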

@RenzKa
Collaborator

RenzKa commented Oct 28, 2023

Hei,
in the past, we encountered problems reproducing results when using RTX 3080/3090 GPUs (not tested in CARLA, but for other evaluations). I think this could be related to the Ampere architecture of the newer Nvidia GPUs and the TF32 mode. However, I do not know whether the drop in RC is related to this, but it might be worth trying.
Here are some related links:
https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices
https://dev-discuss.pytorch.org/t/pytorch-and-tensorfloat32/504

Could you try disabling the tf_32 mode?

    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

If this doesn't help, do you have any chance to test this on e.g. 2080 GPUs?

Best,
Katrin
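
(A minimal self-contained check, not part of the original reply, for whether TF32 actually changes matmul numerics on a given GPU; on non-Ampere hardware the difference should be zero.)

    import torch

    if torch.cuda.is_available():
        a = torch.randn(1024, 1024, device="cuda")
        b = torch.randn(1024, 1024, device="cuda")

        torch.backends.cuda.matmul.allow_tf32 = True   # default in older PyTorch versions
        out_tf32 = a @ b

        torch.backends.cuda.matmul.allow_tf32 = False  # force full-precision fp32 matmul
        out_fp32 = a @ b

        # A noticeably non-zero difference indicates TF32 is affecting the numerics.
        print("max abs difference:", (out_tf32 - out_fp32).abs().max().item())
    else:
        print("No CUDA device found; TF32 does not apply.")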

@georgeliu233

Hei, in the past, we encountered problems reproducing results when using RTX 3080/3090 GPUs (not tested in CARLA, but for other evaluations). I think this could be related to the Ampere architecture of the newer Nvidia GPUs and the TF32 mode. However, I do not know whether the drop in RC is related to this, but it might be worth trying. Here are some related links: https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices https://dev-discuss.pytorch.org/t/pytorch-and-tensorfloat32/504

Could you try disabling the tf_32 mode?

    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

If this doesn't help, do you have any chance to test this on e.g. 2080 GPUs?

Best, Katrin

Thanks for your reply! Yes, I'm currently using a 3090 GPU, and I will try adding this config.

Another problem is that in the current CARLA testing (longest6), the ego vehicle sometimes gets stuck in jammed traffic and is marked as blocked (not moving after 900 steps). Another, less frequent, situation is that after such a jam the ego vehicle remains still and gets blocked. Have you encountered similar situations? I would much appreciate any tips on resolving this issue :)

@JaneFo

JaneFo commented Feb 7, 2024

Thank you for your excellent work. I have the same problem when reproducing the results. I tried adding the following configuration:

    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

but the results are still bad.
Do you have a solution to this problem?
