Some question about "Data Generation" in GitHub #81

Open
JunyongYun-SPA opened this issue Nov 27, 2023 · 8 comments

Comments

@JunyongYun-SPA

I have a few questions.

  1. Do I have to generate data in advance through the repository's Data Generation process in order to train the model? Or is it possible to train the model right away without the Data Generation process?

  2. If I do have to run the Data Generation process, is there a more efficient way to do it? Generating data for all towns and all weathers seems to take too long.

  3. In the town05 long benchmark, the test set is town05 long. Exactly what data is used for the train set?
    As far as I know, there are about 576 (2183) data, covering 21 weathers, 8 towns, and the long, short, and tiny route types.

Thank you.

@deepcs233
Collaborator

Hi!

  1. Yes, you need to collect the dataset first if you want to train the model.
  2. You could start multiple CARLA servers at the same time and run the batch scripts we provide. Data collection may take up to two weeks.
  3. We use the data from the other towns as the train set.
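
A rough sketch of the multi-server idea in answer 2, assuming each CARLA server is started on its own RPC port. The helper name `plan_carla_servers`, the base port, the port spacing, and the executable path are illustrative assumptions, not values taken from the repository; `-carla-rpc-port` is CARLA's command-line option for the client port. The batch scripts shipped with the repository remain the reference for the actual collection runs.

```python
def plan_carla_servers(num_servers, base_port=20000, step=2,
                       executable="./CarlaUE4.sh"):
    """Return one launch command (argv list) per CARLA server.

    Ports are spaced by `step` because a CARLA server also uses the port
    right after its RPC port for streaming by default.
    """
    if num_servers < 1:
        raise ValueError("num_servers must be at least 1")
    if step < 2:
        raise ValueError("step must be at least 2 to avoid port clashes")
    commands = []
    for index in range(num_servers):
        port = base_port + index * step
        commands.append([executable, f"-carla-rpc-port={port}"])
    return commands


if __name__ == "__main__":
    for argv in plan_carla_servers(4):
        # Each argv could be passed to subprocess.Popen on a machine
        # where CARLA is installed; here it is only printed.
        print(" ".join(argv))
```

Each data-collection job would then be pointed at a different port, so the servers can run side by side.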

@soyaCat

soyaCat commented Nov 27, 2023

Thank you for your response. I am a co-worker of the person who asked the first question. I understood the answers to the first and second questions, but I still have some doubts about the third.
For the town05 benchmark, does "data from the other towns as the train set" mean the datasets generated for the long, short, and tiny routes of each town under all 21 weather conditions?

@deepcs233
Collaborator

deepcs233 commented Nov 27, 2023

Hi!
We didn't distinguish the training data by weather because the Town05 benchmark doesn't have this requirement. Besides, "long, short, and tiny" only denote the route length; they share the same town map and traffic scenarios. Our framework takes single-frame data as input, so we also didn't distinguish between data from "long, short, and tiny".

During data collection, we didn't put the collected data into 576 data folders. We simply collected the data and named the folders like "Town05_long_weather13_22_15_14". Then we can choose the data folders we need by name when training. "576 (2183) data" is not the total size of our dataset. Each type (short or long) may have 5-200 routes. For example, "Short Route" doesn't mean one specific route; it means a type of route. You can refer to https://github.com/opendilab/InterFuser/blob/main/leaderboard/data/training_routes/routes_town01_short.xml
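
Choosing folders by name could look roughly like the sketch below. It assumes the folder names follow the layout `<Town>_<route type>_weather<index>_<suffix>`, inferred only from the single example "Town05_long_weather13_22_15_14"; the meaning of the trailing numbers is not stated here, so they are kept as an opaque suffix. The function names and the `allowed_weathers` parameter are hypothetical and are not part of the InterFuser code.

```python
import re

# Assumed layout: <Town>_<route type>_weather<index>_<suffix>.
FOLDER_PATTERN = re.compile(
    r"^(?P<town>Town\d+)_(?P<route_type>long|short|tiny)"
    r"_weather(?P<weather>\d+)_(?P<suffix>.+)$"
)


def parse_folder_name(name):
    """Split a folder name into its fields, or return None if it does not match."""
    match = FOLDER_PATTERN.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    fields["weather"] = int(fields["weather"])
    return fields


def select_training_folders(names, held_out_town="Town05", allowed_weathers=None):
    """Keep folders from every town except the held-out one.

    Route type is ignored on purpose. Weather is filtered only when
    `allowed_weathers` is given.
    """
    selected = []
    for name in names:
        fields = parse_folder_name(name)
        if fields is None or fields["town"] == held_out_town:
            continue
        if allowed_weathers is not None and fields["weather"] not in allowed_weathers:
            continue
        selected.append(name)
    return selected
```

For the Town05 benchmark, `select_training_folders(names)` would keep every non-Town05 folder regardless of route type or weather. Passing a set such as `allowed_weathers={0, 1, 2}` (made-up values) is one way a weather restriction could be applied for another benchmark.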

@JunyongYun-SPA
Author

Thank you for your quick response.

However, I still have some questions.

  1. You said short, long, and tiny only represent the length of the route, so shouldn't town01_long and town01_short include the same number of routes? But town01_long has 10 routes and town01_short has 22 routes. My understanding is that the long and short routes are different routes. Is that correct?

  2. You said you don't distinguish the training data by weather, so doesn't the training data include all 21 weathers? If not, by what criteria did you choose the weather for each town?

  3. I'm sorry, there was a typo in my question. The 576 (2183) I mentioned should be 576 (21x8x3). In other words, if we collect data for all towns (8), all weathers (21), and all types (long, short, and tiny), we expect approximately 576 (21x8x3) folders to be created. But according to you, that is wrong, right?

Thank you.

@deepcs233
Collaborator

Hi!

  1. You can check this folder, which may answer your question: https://github.com/opendilab/InterFuser/blob/main/leaderboard/data/training_routes/
  2. For evaluation on the Town05 benchmark, weather conditions are not restricted, so we use all weathers. If you run other benchmarks, you may need to filter out some weather conditions for training.
  3. Yes. Each type of route (tiny/short/long) contains multiple routes, so collection creates thousands of folders instead.
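
The folder arithmetic behind answer 3 can be sketched as follows. One detail worth noting: 21 x 8 x 3 evaluates to 504, not 576. The second function assumes one folder per individual route per weather, which is an interpretation of the replies rather than a documented rule. The route counts for Town01 (10 long, 22 short) are the ones quoted earlier in this thread; the tiny count is not given, so it is left out.

```python
NUM_WEATHERS = 21
NUM_TOWNS = 8
ROUTE_TYPES = ("long", "short", "tiny")


def folders_one_per_combination():
    """Folder count if each (weather, town, route type) produced one folder."""
    return NUM_WEATHERS * NUM_TOWNS * len(ROUTE_TYPES)


def folders_for_town(routes_per_type, num_weathers=NUM_WEATHERS):
    """Folder count for one town, assuming one folder per route per weather."""
    return sum(routes_per_type.values()) * num_weathers


if __name__ == "__main__":
    print(folders_one_per_combination())                   # 504
    print(folders_for_town({"long": 10, "short": 22}))     # 672
```

Under that assumption, the long and short routes of Town01 alone already give more folders than the one-per-combination estimate for all towns, which is consistent with the total running into the thousands.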

@No4x

No4x commented Dec 8, 2023

Hello, I'm also very interested in the results of data generation. Because I integrated some other sensors, some routes perform worse than ideal: about 2-30% of the routes in a town fail. Is this normal?

@deepcs233
Collaborator

That's OK. What's important is to collect enough data in which the vehicle stays under safe control. The frames from the failed cases can be dropped to improve the data quality if you need to.
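
One possible reading of this advice, as a minimal sketch: drop every route that failed and keep the rest. The record layout (dicts with `name` and `failed` keys) and both function names are hypothetical; how a failure is actually recorded by the collection scripts is not described in this thread.

```python
def split_routes(routes):
    """Separate routes that completed from routes that failed.

    `routes` is a list of dicts with the hypothetical keys "name" and "failed".
    """
    kept = [route for route in routes if not route["failed"]]
    dropped = [route for route in routes if route["failed"]]
    return kept, dropped


def failure_rate(routes):
    """Fraction of routes marked as failed (0.0 for an empty list)."""
    if not routes:
        return 0.0
    return sum(1 for route in routes if route["failed"]) / len(routes)
```

Checking `failure_rate` per town before dropping anything gives a quick view of whether enough usable data remains.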

@No4x

No4x commented Dec 8, 2023

Thanks.
