training method #58
@Re-dot-art No need to do separate training. ConsistentTeacher enables end-to-end training: the labeled data and unlabeled data are fed to the model at the same time, and the teacher is maintained as an exponential moving average (EMA) of the student. Regarding the performance problem, can you specify which config you are using? What are your batch size and GPU count?
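The teacher-as-moving-average update described above can be sketched as follows. This is a minimal illustration of the EMA rule, not the project's actual implementation; the function name, the plain-dict parameter representation, and the momentum value are assumptions (real frameworks update tensors in place):

```python
def ema_update(teacher, student, momentum=0.999):
    """In-place EMA: teacher <- momentum * teacher + (1 - momentum) * student.

    `teacher` and `student` are dicts mapping parameter names to floats;
    real implementations iterate over framework tensors instead.
    """
    for name, t_val in teacher.items():
        teacher[name] = momentum * t_val + (1.0 - momentum) * student[name]
    return teacher

# Toy example: after one update the teacher moves slightly toward the student.
teacher = {"w": 1.0}
student = {"w": 0.0}
ema_update(teacher, student, momentum=0.9)  # teacher["w"] becomes 0.9
```

With a momentum close to 1, the teacher changes slowly, which is what makes its pseudo-labels more stable than the student's raw predictions.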
As mentioned in the README, all experiments in the paper use 8 GPUs × 5 samples per GPU for training. A smaller batch size gives worse results, as expected. But your results seem too low, even worse than the baseline. Did you edit anything?
I did not make any modifications to the code; I only added some comments.
Could you please share your configuration settings, the scripts you're using for execution, and the method you're using to process the dataset? These results are even lower than the baseline trained on labeled data only (no use of unlabeled data), so I suspect something is wrong on your side. I'm here to assist, but I'll need more detailed information to provide effective support.
Okay, thank you. The config file for the experiment is as follows: The dataset is processed according to the method in the README, and the processing results are shown in the following figure:
Do you use wandb to record the training process? If so, could you also share it?
I would suggest following this config to (1) increase the batch size, (2) increase the number of labeled samples within a batch, and (3) lower your learning rate: https://github.com/Adamdad/ConsistentTeacher/blob/main/configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p_2x8.py For 2 GPUs, the original config only has 2 labeled samples in total, so 0.01 is too large for the network to converge. So for 2-GPU training, we use
Okay, thank you very much! I'll give it a try. |
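The batch-size/learning-rate advice above follows the usual linear scaling rule (learning rate proportional to total batch size). A minimal sketch, where the 8 GPU × 5 samples-per-GPU reference and lr = 0.01 come from earlier in the thread, but the 2-GPU numbers are illustrative assumptions rather than the authors' exact values:

```python
def scale_lr(base_lr, base_total_batch, new_total_batch):
    # Linear scaling rule: learning rate proportional to total batch size.
    return base_lr * new_total_batch / base_total_batch

# Reference setting from the thread: 8 GPUs x 5 samples per GPU = 40 total,
# trained with lr = 0.01. A hypothetical 2-GPU run with 5 samples per GPU
# has a total batch of 10, suggesting a roughly 4x smaller learning rate.
new_lr = scale_lr(0.01, 8 * 5, 2 * 5)  # approximately 0.0025
```

This is a rule of thumb, not a guarantee; the linked 2x8 config is the authoritative setting for the 2-GPU case.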
Hello, the result of running the code directly differs significantly from the results reported in your paper. I suspect this is due to a problem with my training method, so I would like to confirm with you: before using the semi-supervised method, do we need to first train the Faster R-CNN network on the 1% or 5% labeled samples, and then continue semi-supervised training from those pre-trained weights? Looking forward to your reply!