
FPN ROI Choosing #5

Open · Max-Fu opened this issue Jul 15, 2017 · 29 comments

Max-Fu commented Jul 15, 2017

Hi there! As I was reading Feature Pyramid Networks for Object Detection, I found a section with a formula for choosing the feature map for an ROI based on the size of the region proposal. Can you show me how you implemented this? I would like to implement FPN on the new Object Detection API provided by TensorFlow.

Max-Fu (Author) commented Jul 15, 2017

k = ⌊k0 + log2(√(wh)/224)⌋, where k0 = 4, k is the pyramid level whose feature map the ROI is pooled from, and w and h are the width and height of the region proposal.
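
For a quick sanity check of the formula, here is a minimal sketch (`roi_level` is just an illustrative name; the clamp to P2~P5 follows the repo snippet quoted in the next comment):

```python
import numpy as np

def roi_level(w, h, k0=4):
    """Pyramid level for a w-by-h proposal, clamped to P2~P5."""
    return min(5, max(2, int(np.floor(k0 + np.log2(np.sqrt(w * h) / 224)))))

print(roi_level(224, 224))  # 4: a 224x224 (ImageNet-sized) box maps to P4
print(roi_level(112, 112))  # 3: half the canonical size, one level down
print(roi_level(896, 896))  # 5: would be k=6, clamped to the top level P5
```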

xmyqsh (Owner) commented Jul 20, 2017

In proposal_target_layer.py, around line 101:

```python
import numpy as np

def calc_level(width, height):
    # k = k0 + log2(sqrt(w * h) / 224) with k0 = 4, clamped to the P2~P5 range
    return min(5, max(2, int(4 + np.log2(np.sqrt(width * height) / 224))))

# roi layout: [batch_idx, x0, y0, x1, y1]
level = lambda roi: calc_level(roi[3] - roi[1], roi[4] - roi[2])

leveled_rois = [[], [], [], []]
leveled_rois[0] = [roi for roi in rois if level(roi) == 2]
leveled_rois[1] = [roi for roi in rois if level(roi) == 3]
leveled_rois[2] = [roi for roi in rois if level(roi) == 4]
leveled_rois[3] = [roi for roi in rois if level(roi) == 5]
```

This logic can be implemented either in proposal_target_layer or in roi_pooling_layer.
Implemented in proposal_target_layer, it needs to call four roi_pooling_layers, but may benefit from CPU and GPU parallelism.
Implemented in roi_pooling_layer, it needs just one roi_pooling_layer and benefits more from GPU acceleration (a rough sketch of this variant follows below).

Do you think the latter is the better choice?
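
For what it's worth, a rough sketch of the single-roi_pooling_layer variant (`pool_one_level` is a placeholder, not a function from this repo): group the ROIs by level, pool each group from its own feature map, then scatter the results back into the original ROI order so they still line up with the labels:

```python
import numpy as np

def calc_level(width, height):
    return min(5, max(2, int(4 + np.log2(np.sqrt(width * height) / 224))))

def multilevel_roi_pool(rois, pool_one_level):
    # rois: (N, 5) array of [batch_idx, x0, y0, x1, y1]
    # pool_one_level(k, rois_k): placeholder that pools rois_k from P{k}
    levels = np.array([calc_level(r[3] - r[1], r[4] - r[2]) for r in rois])
    pooled = [None] * len(rois)
    for k in range(2, 6):
        idxs = np.where(levels == k)[0]
        if idxs.size == 0:
            continue
        feats = pool_one_level(k, rois[idxs])
        for i, j in enumerate(idxs):
            pooled[j] = feats[i]  # restore the original ROI order
    return pooled
```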

Max-Fu (Author) commented Jul 20, 2017

Thank you for answering this question! I just finished my implementation of FPN on top of the new TensorFlow Object Detection API. I implemented this algorithm in the ROI pooling layer.

Max-Fu (Author) commented Jul 20, 2017

Implementing the ROI level selection in the ROI pooling layer is definitely the better choice. I was just confused about where to add this formula.

xmyqsh (Owner) commented Jul 24, 2017

Hey man,
How is your training result?
The rpn_loss in my training run is many times larger than the Fast R-CNN loss.
Do you think I should also add k = k0 + log2(√(wh)/224) into the anchor target layer?
Does the paper mention this? I think it would be a reasonable improvement.
(Set w and h to the width and height of the ground-truth bbox in this layer.)
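
If anyone wants to try that, a minimal sketch of the idea (this is the suggestion above, not something the paper specifies; the ground-truth box layout [x0, y0, x1, y1] is an assumption):

```python
import numpy as np

def gt_level(box, k0=4):
    # Apply the same level formula to a ground-truth box [x0, y0, x1, y1]
    w, h = box[2] - box[0], box[3] - box[1]
    return min(5, max(2, int(k0 + np.log2(np.sqrt(w * h) / 224))))
```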

Max-Fu (Author) commented Jul 25, 2017

The training result was not as good as the one reported in the paper (I only trained for 2 days). The RPN loss was also many times larger than the Fast R-CNN loss (I don't know much about Fast R-CNN though). You can definitely try your method and see if it is correct.

Zehaos commented Jul 26, 2017

Hi, @xmyqsh @Max-Fu
The authors state that they use 4-step training rather than end-to-end training (please refer to Section 5.2.2, Sharing Features).
I implemented FPN in MXNet and tried alternating training. The RPN result is good (8 points higher than the ResNet-50-C4 baseline), but the Fast R-CNN result is quite bad.

xmyqsh (Owner) commented Jul 28, 2017

@Zehaos
Good!
I will try it.
But how do you evaluate the RPN result, AP or AR?
Do you know where the AR in Table 1 is defined? Is it average recall?

Zehaos commented Jul 28, 2017

@xmyqsh
I used average recall for the evaluation (on the VOC dataset). Is Table 1 the eval result from the COCO tools? I'm not sure.

Johere commented Jul 31, 2017

@xmyqsh Hi, you mentioned that the logic of choosing the feature map for an ROI can be implemented either in proposal_target_layer or in roi_pooling_layer. I implemented this algorithm in the ROI pooling layer but got a bad result. However, I find that proposal_target_layer is not used in the 'TEST' stage, while roi_pooling_layer is used in both the 'TRAIN' and 'TEST' stages. So the implementations for these two situations should be different? Is there anything wrong with my understanding?

xmyqsh (Owner) commented Jul 31, 2017

@Johere
You are right.
Among the three layers of the RPN, only the proposal layer is used in the 'TEST' phase.
The anchor target layer generates the deltas of the anchors for RPN training. The proposal target layer generates the deltas of the proposal regions, as well as the proposal regions (ROIs) themselves, for Fast R-CNN training. The proposal layer generates the proposal regions (ROIs). ROI pooling crops the ROIs from the feature map, then pools them into unified 7x7 features.

I implemented the logic of choosing the feature map for each ROI (k = k0 + log2(√(wh)/224)), which might better be called P2~P5-aware ROI selection, in the proposal layer. My proposal layer in the 'TEST' phase outputs P2~P5-aware ROIs, which differs from its output in the 'TRAIN' phase.

Johere commented Jul 31, 2017

@xmyqsh
Thank you very much!
May I ask about your training results? I modified roi_pooling_layer to choose the feature map (P2/P3/P4/P5) before the ROI pooling operation, and the rest of the layer's code remains the same, but the result was bad... How about your implementation?

xmyqsh (Owner) commented Jul 31, 2017

@Johere

I implement the feature map (P2/P3/P4/P5) choosing operation in the proposal layer in the 'TEST' phase, and in the proposal target layer in the 'TRAIN' phase.

If you implement this in roi_pooling_layer, be aware that you must record the mapping between the feature maps (P2/P3/P4/P5) and the ROIs, because the mapping has to be used again in the backward pass (see the sketch after this comment).

If your RPN performance is not as good as the paper reports, and you use only one image per forward/backward pass instead of two as the paper does, I think you should use a lower learning rate than the paper's: there are not enough effective ROIs in the regression loss, so its gradient may be unstable, and a lower learning rate should be the better choice.

I'm still optimizing and testing my RPN performance. My previous end-to-end training result was bad.
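
A sketch of that bookkeeping (a minimal illustration, not this repo's actual layer; `LeveledRoiMapping` is a made-up name): record the ROI-to-level assignment in the forward pass and reuse it in the backward pass, so each pooled feature's gradient is routed back to the feature map it was cropped from.

```python
import numpy as np

class LeveledRoiMapping(object):
    """Remembers which ROI went to which pyramid level."""

    def forward(self, levels):
        # levels: per-ROI pyramid level, e.g. from calc_level()
        levels = np.asarray(levels)
        self.num_rois = len(levels)
        self.idxs = {k: np.where(levels == k)[0] for k in range(2, 6)}
        return self.idxs

    def backward(self, grads_per_level):
        # grads_per_level[k][i]: gradient for the i-th ROI pooled at level k;
        # scatter back so grads[j] matches the original ROI order.
        grads = [None] * self.num_rois
        for k, idxs in self.idxs.items():
            for i, j in enumerate(idxs):
                grads[j] = grads_per_level[k][i]
        return grads
```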

Johere commented Jul 31, 2017

@xmyqsh
OK. Thank you for answering me!

xmyqsh (Owner) commented Aug 2, 2017

@Zehaos
P6 should be included in the RPN head, but I ran into a numerical problem (NaN) during training when I added it.
Have you encountered a similar problem?

Zehaos commented Aug 2, 2017

@xmyqsh
No. I use max pooling to downsample P5 and allow border anchors during training; the training is smooth.

xmyqsh (Owner) commented Aug 2, 2017

@Zehaos
Same here.
What is your max pooling kernel size, 3x3 or 1x1?
And is your learning rate 0.02, as the paper says?

Zehaos commented Aug 2, 2017

@xmyqsh
Kernel size = 2 ... stride = 2.
I used lr = 0.002 due to a smaller batch size (1 img/GPU * 4 GPUs).
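
In TensorFlow 1.x terms, the P6 construction settled on here (a 2x2, stride-2 max pool of P5) would look roughly like this; `p5` and its shape are assumptions for illustration:

```python
import tensorflow as tf

# p5: NHWC feature map (shape assumed for illustration)
p5 = tf.placeholder(tf.float32, [None, None, None, 256])
# P6 = 2x2 max pool of P5 with stride 2, fed only to the RPN head
p6 = tf.nn.max_pool(p5, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
```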

xmyqsh (Owner) commented Aug 2, 2017

@Zehaos
After using kernel size = 2, the NaN disappeared...
Thank you!

Zehaos commented Aug 2, 2017

@xmyqsh You are welcome.

xmyqsh (Owner) commented Aug 7, 2017

@Zehaos
What image_batch_size do you use in the Fast R-CNN stage of alternating training?
Should a larger image_batch_size help training?

Zehaos commented Aug 7, 2017

@xmyqsh
I used image_batch_size = 2, roi_batch_size = 256. A larger image_batch_size should help because of lower ROI correlation.

Feynman27 commented Sep 15, 2017

@xmyqsh Your current implementation for choosing the pyramid level assigns all ROIs to every feature map. For example, you sample 128 ROIs and assign each of them to all 4 pyramid levels (P2~P5), resulting in 512 ROIs per image. Is this deliberate? Shouldn't each ROI be assigned to a unique level of the feature pyramid, given by the formula in the paper?

For example, compare `leveled_idxs` in the two implementations below (they are not the same):

1. RoI indexes end up appended to every level, because `[[]] * 4` creates four references to the same inner list:

```python
leveled_idxs = [[]] * 4
for idx, roi in enumerate(rois):
    level_idx = level(roi) - 2
    leveled_idxs[level_idx].append(idx)
```

2. RoI indexes assigned to distinct levels, determined by k = k0 + log2(√(wh)/224):

```python
leveled_idxs = [[], [], [], []]
for idx, roi in enumerate(rois):
    level_idx = level(roi) - 2
    leveled_idxs[level_idx].append(idx)
```
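
The difference between the two snippets is the classic Python list-aliasing pitfall; a tiny demonstration:

```python
buggy = [[]] * 4          # four references to the SAME inner list
buggy[0].append(7)
print(buggy)              # [[7], [7], [7], [7]]: every index lands in every level

fixed = [[], [], [], []]  # four distinct lists
fixed[0].append(7)
print(fixed)              # [[7], [], [], []]
```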

stillwalker1234 commented
@Feynman27

That's a really subtle error. Have you tried training with that modification?

xmyqsh (Owner) commented Sep 20, 2017

@Feynman27
Good!

Feynman27 commented
Yes, but surprisingly it didn't really change the mAP much. It actually dropped it by about 0.5-1.0 percentage points.

xmyqsh (Owner) commented Sep 20, 2017

@Feynman27
Have you changed the related code in proposal_layer.py and proposal_target_layer.py at the same time?


Feynman27 commented Sep 20, 2017 via email

hhchyer commented Mar 13, 2018

@Feynman27 The formula for choosing the level in the proposal layer should be a balance of speed and accuracy. The proposals could even benefit from the other layers.
