
Question about reproduce #17

Open
Naive-Bayes opened this issue Jul 28, 2023 · 9 comments

Comments

@Naive-Bayes

Thanks for such excellent work. I want to build on PlanT, so I first want to reproduce the results reported in the paper (DS: 81.36, RC: 93.55, IS: 0.87).

I downloaded the pretrained model and ran the evaluation with "python leaderboard/scripts/run_evaluation.py user=$USER experiments=PlanTmedium3x eval=longest6", exactly as described in the README.md.

However, I get worse results (DS: 70.29, RC: 81.851, IS: 0.866). The lower Route Completion seems to be what drags the score down.

Could you give me some recommendations for reproducing the results from the paper? Thanks a lot!

@RenzKa
Collaborator

RenzKa commented Aug 14, 2023

Hi,
thanks for your interest in our work.
I saw that you had the same problem with transfuser (autonomousvision/carla_garage#11).
Did you already find the reason for the performance drop there?

I will test it again on my machine to double-check the numbers. Can you send me more details about how you run it in the meantime? The .hydra/config.yaml may help.

UPDATE: I reran the evaluation with the uploaded checkpoints and got the following results in three eval runs:

  1. DS: 82.24, RC: 92.53, IS: 0.89
  2. DS: 78.32, RC: 92.79, IS: 0.85
  3. DS: 80.45, RC: 92.16, IS: 0.88

@Naive-Bayes
Author

Sorry for the late reply, I had to work on some other things in the last two weeks.

In carla_garage, we have to set some os.environ variables; with those I now think I can reproduce the carla_garage results.

But those environment variables are not needed for PlanT, and I still cannot get DS 82.

I also found another issue: I retrained PlanT and evaluated the 49th epoch, which gets DS 77.
So in your tests, the 47th epoch is better than the 49th?
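
(A purely illustrative sketch of setting such environment variables in Python before the evaluation code runs. The exact variables carla_garage needs are not listed in this thread; the names below, e.g. CARLA_ROOT, follow common CARLA leaderboard conventions and are assumptions, not confirmed requirements.)

    import os

    # Hypothetical sketch only -- the thread does not say which variables are needed.
    # Adjust the names and paths to your own installation.
    os.environ["CARLA_ROOT"] = "/path/to/carla"
    os.environ["SCENARIO_RUNNER_ROOT"] = "/path/to/scenario_runner"
    os.environ["LEADERBOARD_ROOT"] = "/path/to/leaderboard"

    # Note: os.environ changes are inherited by subprocesses, but variables like
    # PYTHONPATH do not retroactively change the current interpreter's import path.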

@georgeliu233

Sorry for the late reply, I had to work on some other things in the last two weeks.

In carla_garage, we have to set some os.environ variables; with those I now think I can reproduce the carla_garage results.

But those environment variables are not needed for PlanT, and I still cannot get DS 82.

I also found another issue: I retrained PlanT and evaluated the 49th epoch, which gets DS 77. So in your tests, the 47th epoch is better than the 49th?

Hi there,

Can you share how you got the result of DS 82? I am running into the same problem (DS 70, RC 81.8) using the setup the authors released.

Best

@RenzKa
Collaborator

RenzKa commented Oct 16, 2023

Hi,
sorry about those problems.
Could you quickly share your setup (what GPU type are you using, your config, etc.)?

@georgeliu233

Hi, sorry about those problems. Could you quickly share your setup (what GPU type are you using, your config, etc.)?

Hi,

I'm currently using an RTX 3080. The CUDA version is 11.1 and I follow the conda environment,
except for using torch 1.9+cu111 instead of cu102; the rest of the configs are identical for inference.

I'm wondering whether all the environment variables for CARLA are listed in the current repo. Since the drop is in RC, is it possible that some of the variables affecting route or waypoint selection differ from your original experiment?

Thanks again for the quick response!

Best,
Haochen

@georgeliu233

Here is the log of the reproduced result.
It seems that the lower RC score is due to frequent agent blocks (0.253).
Any suggestions on resolving this?

"labels": [
"Avg. driving score",
"Avg. route completion",
"Avg. infraction penalty",
"Collisions with pedestrians",
"Collisions with vehicles",
"Collisions with layout",
"Red lights infractions",
"Stop sign infractions",
"Off-road infractions",
"Route deviations",
"Route timeouts",
"Agent blocked"
],
"sensors": [
"carla_opendrive_map",
"carla_imu",
"carla_gnss",
"carla_speedometer"
],
"values": [
"70.903",
"83.082",
"0.877",
"0.004",
"0.080",
"0.000",
"0.005",
"0.126",
"0.023",
"0.000",
"0.046",
"0.253"
]
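
(A small sketch, not from the thread, for pairing the labels with the values when inspecting such a leaderboard results file. The file name is hypothetical, and the snippet assumes "labels" and "values" are parallel top-level lists, as in the excerpt above.)

    import json

    # Hypothetical path -- use the results file produced by your own evaluation run.
    with open("simulation_results.json") as f:
        results = json.load(f)

    # Print each metric next to its value instead of as two separate lists.
    for label, value in zip(results["labels"], results["values"]):
        print(f"{label}: {value}")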

@RenzKa
Collaborator

RenzKa commented Oct 28, 2023

Hei,
in the past, we encountered problems reproducing results when using RTX 3080/3090 GPUs (not tested in CARLA, but for other evaluations). I think this could be related to the Ampere architecture of the newer Nvidia GPUs and the TF32 mode. However, I do not know whether the drop in RC is related to this, but it might be worth trying.
Here are some related links:
https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices
https://dev-discuss.pytorch.org/t/pytorch-and-tensorfloat32/504

Could you try disabling the tf_32 mode?

    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

If this doesn't help, do you have any chance to test this on e.g. 2080 GPUs?

Best,
Katrin
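
(A minimal self-contained check, not part of the original reply, for whether TF32 actually changes matmul numerics on a given GPU; on non-Ampere hardware the difference should be zero.)

    import torch

    if torch.cuda.is_available():
        a = torch.randn(1024, 1024, device="cuda")
        b = torch.randn(1024, 1024, device="cuda")

        torch.backends.cuda.matmul.allow_tf32 = True   # default in older PyTorch versions
        out_tf32 = a @ b

        torch.backends.cuda.matmul.allow_tf32 = False  # force full-precision fp32 matmul
        out_fp32 = a @ b

        # A noticeably non-zero difference indicates TF32 is affecting the numerics.
        print("max abs difference:", (out_tf32 - out_fp32).abs().max().item())
    else:
        print("No CUDA device found; TF32 does not apply.")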

@georgeliu233

Hei, in the past, we encountered problems reproducing results when using RTX 3080/3090 GPUs (not tested in CARLA, but for other evaluations). I think this could be related to the Ampere architecture of the newer Nvidia GPUs and the TF32 mode. However, I do not know whether the drop in RC is related to this, but it might be worth trying. Here are some related links: https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices https://dev-discuss.pytorch.org/t/pytorch-and-tensorfloat32/504

Could you try disabling the tf_32 mode?

    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

If this doesn't help, do you have any chance to test this on e.g. 2080 GPUs?

Best, Katrin

Thanks for your reply! Yes, I'm currently using a 3090 GPU, and I will try adding this config.

Another problem is that in the current CARLA testing (longest6), the ego vehicle sometimes gets stuck in jammed traffic and is marked as blocked (not moving after 900 steps). Another, less frequent, situation is that after such a jam the ego vehicle remains still and gets blocked. Have you encountered similar situations? I would much appreciate any tips on resolving this issue :)

@JaneFo

JaneFo commented Feb 7, 2024

Thank you for your excellent work. I have the same problem when reproducing the results. I tried adding the following configuration:

    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

but the results are still bad.
Do you have a solution to this problem?
