Training instance segmentation #28
Comments
@marekjaszuk |
Here are the logs: |
BTW, the original prototxt files for step 1 and step 2 have different num_class values: 1:
2:
How were you able to resume from the weights? (I got an error when trying to load the step1 trained weights into step2.) |
What does the 18 in the first prototxt stand for? Isn't that only 1 class? What if I want to train on 2 or more classes? |
@marekjaszuk @jinfagang |
Step 1 of training went fine, but train_maskyolo_step2.sh refers to a pva_solver.prototxt file. |
@leon-liangwu Thanks for replying. I have one more question. You set the class count to 1, but shouldn't our data have 80 classes stored? How can I specify the only |
@jinfagang You can refer to script/createdata_xxx.py. I select all the person targets to create the lmdb. |
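As a rough illustration of the selection step described above: this is not the code in script/createdata_xxx.py. The function name `select_categories` and the toy annotation dict are assumptions made for the example, which only presumes COCO-style `images`, `annotations`, and `categories` lists.

```python
def select_categories(coco, wanted_names):
    """Keep only annotations whose category name is in wanted_names."""
    # Map the requested category names to their numeric ids.
    wanted_ids = {c["id"] for c in coco["categories"] if c["name"] in wanted_names}
    # Keep annotations of those categories and the images that still have any.
    annotations = [a for a in coco["annotations"] if a["category_id"] in wanted_ids]
    image_ids = {a["image_id"] for a in annotations}
    return {
        "images": [im for im in coco["images"] if im["id"] in image_ids],
        "annotations": annotations,
        "categories": [c for c in coco["categories"] if c["id"] in wanted_ids],
    }


# Hypothetical toy data laid out like a COCO annotation file.
toy = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1},
        {"id": 11, "image_id": 1, "category_id": 3},
        {"id": 12, "image_id": 2, "category_id": 3},
    ],
}

person_only = select_categories(toy, {"person"})
person_and_car = select_categories(toy, {"person", "car"})
```

Passing `{"person", "car"}` instead of `{"person"}` is the multi-class variant asked about in this thread; the class count in the prototxt would still have to be edited by hand.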
@marekjaszuk Yes, just use the solver_step2.prototxt to replace pva_solver.prototxt. That is a mistake. I have updated the model.tgz. |
If I change create_dataxxx.py to select multiple classes, such as car and person, and also edit the number of classes in the prototxt, would that work? |
@jinfagang |
@leon-liangwu But if I use classes other than person, there are no keypoints, only boxes and masks. Is that compatible with using kps_data_layer to load the data? |
@jinfagang You can use scripts/createdata_mask_only.py to generate an lmdb with boxes and masks. |
@leon-liangwu I have tried mask training, but I cannot reproduce your results. I ran 40000 iterations for step1 and 40000 iterations for step2: |
@jinfagang Actually, you need to change batch_size: 1 in KpsBoxData layer and prop_num: 1 in DecodeRois layer. |
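For anyone who prefers to script this edit, here is a minimal sketch that rewrites an integer field in prototxt text. The helper name `set_prototxt_field` and the two-line fragment are hypothetical, and the value 32 is only the one reported to fit in GPU memory later in this thread, not a recommended setting. The regex replaces every occurrence of the field, so it is only safe when the field appears once in the file.

```python
import re


def set_prototxt_field(text, field, value):
    """Replace the integer after `field:` everywhere it occurs in text."""
    pattern = re.compile(r"(\b" + re.escape(field) + r"\s*:\s*)\d+")
    # A function replacement avoids ambiguity between group refs and digits.
    return pattern.sub(lambda m: m.group(1) + str(value), text)


# Hypothetical fragment; the real layer definitions contain more fields.
fragment = "batch_size: 1\nprop_num: 1\n"

updated = set_prototxt_field(fragment, "batch_size", 32)
updated = set_prototxt_field(updated, "prop_num", 32)
print(updated)
```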
@leon-liangwu thank you for the last suggestions. I finally got satisfying results for masking person shapes on data generated by the createdata_mask_kps.py script. Now I'm trying to reproduce the result on data generated by the createdata_mask_only.py script. The script seems to work fine. But after running the training I get the following error: Error in `../../caffe-maskyolo/build/tools/caffe': corrupted size vs. prev_size: 0x00007ed8900fea70 *** |
@marekjaszuk Have you got any mask results? What did you change? The original prototxt definitely cannot produce good results unless you change something like prop_num. |
@jinfagang yes, I used prop_num=32 and the same batch_size. With 64 the GPU memory was not sufficient. I'm trying to run multi-GPU training. I installed NCCL, and rebuilt the program but running the training fails. Were you able to run multi-GPU training? Below are my results of mask training on your image. I got them after 40000 iterations. They are not perfect, but I think longer training would improve them. |
@marekjaszuk That's weird. I am using prop_num=128 and batch_size=2. Why does batch_size affect the result so much? It shouldn't... Did you train step1 for 40000 iterations and step2 for 40000? |
@marekjaszuk |
A batch size of 32 also runs out of memory on a GTX 1080 Ti. |
Ok. I finally ran the multi-GPU training, but with batch_size and prop_num=32. With larger values, like 64, it causes an out-of-GPU-memory error. I have an RTX 2080 Ti (11GB), a Titan Xp (12GB), and a Titan X (12GB), so it seems that running with a larger batch_size would require a GPU with more memory. I was training on the COCO2017 train dataset. |
@marekjaszuk How were you able to use multi-GPU training? Does it support NCCL? |
OK, I was able to build caffe with NCCL and train with multi-GPU support. I am setting batch_size to a maximum of 20 and prop_num to 128. |
I'm still trying to train the network on data generated with the create_mask_only.py script. Unfortunately the training fails. I tried modifying the model prototxt to eliminate the elements related to kps, but this did not improve the situation. Do you have a working model that can be trained on data without kps? |
@marekjaszuk Does instance segmentation need kps information? That would be tricky. If only masks and detections are used for training, are the results bad? |
@jinfagang Instance segmentation does not need kps information at all. |
Hi, I am trying to train a new model for multi-class mask instance segmentation. I got some errors when changing the prototxt. Could you help me find which parts need to change?
These are the places I changed. But I got a shape mismatch in the regionloss layer:
|
@jinfagang I have updated the repo to train segmentation more easily. Thanks. |
@leon-liangwu does it support training on multiple classes now? |
@jinfagang the box, of course, has a category label. The instance only shows a binary mask. So it certainly supports training on tasks with multiple classes. But you need to convert your labels to COCO format if you want to use the scripts provided. |
I mean multiple classes simultaneously in a single mask model. Does the loss function support it? How can I try it? |
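For reference, a minimal sketch of what one label looks like in COCO object-detection format, built from a polygon outline. The helper `make_annotation` and the sample values are assumptions for illustration. Which of these fields the repo's scripts actually read is not confirmed here.

```python
def make_annotation(ann_id, image_id, category_id, polygon):
    """Build one COCO-style annotation from a list of (x, y) vertices."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    # Shoelace formula for the polygon area.
    n = len(polygon)
    twice_area = sum(
        xs[i] * ys[(i + 1) % n] - xs[(i + 1) % n] * ys[i] for i in range(n)
    )
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,  # class label of the box
        # COCO boxes are [x_min, y_min, width, height].
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        # COCO polygons are flat lists [x1, y1, x2, y2, ...].
        "segmentation": [[v for p in polygon for v in p]],
        "area": abs(twice_area) / 2,
        "iscrowd": 0,
    }


# Hypothetical two-class dataset with a single rectangular instance.
dataset = {
    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
    "annotations": [
        make_annotation(1, 1, 3, [(10, 20), (50, 20), (50, 80), (10, 80)])
    ],
}
```

The category lives on the annotation (and therefore on the box), while the segmentation itself carries no class, which matches the description above of a labelled box with a binary instance mask.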
@jinfagang If you are referring to AffordanceNet, it is not supported now. Multi-class object detection with instance mask is supported. |
Oh, yes, mask with boxes. |
AffordanceNet is not supported now.
|
what's the |
num_object is the number of anchors in the region loss and decode ROI layers.
|
@jinfagang Hi, if you have any other problems with this repo, please feel free to let me know. |
@leon-liangwu I will let you know when I start training. I am afraid there will be some issues getting the model to run on multiple classes. |
Yes, please. |
I think # proposals are done in this way: |
Hi,
I'm trying to reproduce your instance segmentation results using the scripts that you provided (train_maskyolo_step1.sh and train_maskyolo_step2.sh). I did everything according to the instructions. The scripts run and produce loss values. The first phase of training gives me the ROI rectangles that identify the persons, but after the second phase of training I get no result (neither ROIs nor masks). What do you think I could be doing wrong? Did you get your results using the same scripts?