Step by step understanding of the approximate joint training method #192 #8
I don't understand the question. Can you rephrase? Backprop is performed in this function. There is not a complete path back from the detector losses to the image, if I recall correctly. The RPN proposals are where the path is broken. The proposals are generated and fed into the detector stage, but the gradient is not propagated back through the proposals and up into the RPN. This means that the proposals fed into the detector are treated as constants in each step (constants that of course change from step to step) and do not affect the gradient. You can see this in the PyTorch version here, where I detach the proposals from the computational graph. In the TensorFlow version, the equivalent code is here.
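To make the detach point concrete, here is a minimal sketch with toy tensors (not the repo's actual code; the "decode" arithmetic and the shapes are placeholders):

```python
import torch

deltas = torch.randn(10, 4, requires_grad=True)   # stand-in for the RPN regression output
w = torch.randn(4, 4, requires_grad=True)         # stand-in for a detector weight

proposals = (deltas * 16.0 + 8.0).detach()        # placeholder "decode", then cut the graph
loss = (proposals @ w).sum()                      # stand-in for the detector losses
loss.backward()

print(w.grad is not None)   # True:  detector parameters still receive gradients
print(deltas.grad)          # None:  nothing flows back into the RPN output
```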
Thanks for the reply.
I want to find a path that can compute the derivative of the decoder:
I just want to know the gradient flow path at the break point (between the RPN and the detector). Is my understanding correct?
Sorry.
You should look at the function I pointed out and draw a visual diagram. Looking at the code, we can see that the gradient flows back from the detector through the RoI pool layer (which does not have any weights) but is then stopped. Looking forward, the input enters the backbone and then passes into the RPN, which generates proposals, and also flows directly to the detector (for RoI pooling to use alongside the proposal regions generated by the RPN). However, the gradient cannot flow through the proposal generation logic (I believe it is not differentiable?) and we explicitly stop it from doing so. So the backprop path is not symmetric with the forward path: one branch of the network does not support backprop. Does that make sense?
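A toy end-to-end sketch of the asymmetric paths described above. The module sizes, the box "decode", and the single hard-coded region are made up for illustration, and torchvision's `roi_align` stands in for whatever RoI operation the repo actually uses:

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

# Toy stand-ins for the backbone, the RPN's 1x1 regression conv, and the detector head.
backbone = nn.Conv2d(3, 8, kernel_size=3, padding=1)
rpn_reg  = nn.Conv2d(8, 4, kernel_size=1)
detector = nn.Linear(8 * 7 * 7, 4)

image    = torch.randn(1, 3, 32, 32)
features = backbone(image)          # used by BOTH the RPN and the RoI operation
deltas   = rpn_reg(features)        # RPN regression branch

# Placeholder "decode" from deltas to one (batch_index, x1, y1, x2, y2) box,
# then detach: this is the break point between the RPN and the detector.
cx  = 16.0 + deltas.mean()
box = torch.stack([torch.zeros(()), cx - 8, cx - 8, cx + 8, cx + 8]).unsqueeze(0).detach()

pooled = roi_align(features, box, output_size=(7, 7))  # RoI op indexes into `features`
loss   = detector(pooled.flatten(1)).sum()              # stand-in for the detector losses
loss.backward()

print(backbone.weight.grad is not None)  # True:  detector loss reaches the backbone via features
print(rpn_reg.weight.grad)               # None:  the only path back ran through the detached box
```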
Thanks, understood.
In here, the coordinates in the feature map are just indexes.
This is commonly done in all Faster R-CNN implementations. RoI pooling is probably differentiable (?) but I'm guessing that's not the real problem. Look at what happens prior to the detach() call.
I'm guessing that
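For concreteness, here is a toy sketch of the kind of proposal-selection logic that typically sits just before the detach; the repo's exact pipeline may differ, but these are index-selection steps rather than smooth functions of the regression output:

```python
import torch
from torchvision.ops import nms

# Toy version of the selection logic prior to the detach: clip to image bounds,
# filter by score, run NMS, keep the top K. All values here are random placeholders.
boxes  = torch.rand(200, 4) * 200
boxes[:, 2:] += boxes[:, :2]              # make x2 > x1 and y2 > y1
scores = torch.rand(200)

boxes  = boxes.clamp(min=0.0, max=224.0)  # clip to image bounds
keep   = scores > 0.05                    # drop low-scoring boxes
boxes, scores = boxes[keep], scores[keep]

keep      = nms(boxes, scores, iou_threshold=0.7)  # discrete suppression by IoU
proposals = boxes[keep[:16]]                       # top-K survivors become the proposals
```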
I apologize for my many questions, but I am confused and I haven't been able to find an answer in my own research.
I'm really confused.
Have you tried removing the detach statements and observing what happens during training?
No, I did not.
Give it a try and observe. If your hypothesis is correct and the detach is redundant, there should be no difference in training progression and performance. If PyTorch attempts to automatically differentiate through these operations anyway, I would expect training not to proceed smoothly and to have difficulty converging.
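One hypothetical way to wire that experiment (the repo may organize this differently) is a single switch around the detach:

```python
# Train once with True (normal approximate joint training) and once with False,
# then compare loss curves and final mAP.
DETACH_PROPOSALS = True

def proposals_for_detector(proposals):
    # `proposals` is assumed to be the output of the RPN's decode/NMS step.
    return proposals.detach() if DETACH_PROPOSALS else proposals
```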
Thank you for your attention.
I don't exactly understand the approximate joint training method.
I know the RPN and the detector are merged into one network during training.
The forward path starts at the pre-trained conv network, passes through the RPN, and finally arrives at the Fast R-CNN layers, where the loss is computed:
But what is the backpropagation path? Does it run from the detector, through the RPN, and finally into the pre-trained conv network?
In that case, how is differentiation performed in the decoder section of the RPN? The offsets produced by the 1x1 reg-conv layer in the RPN are translated into proposals by the decoder.
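For reference, the decoder in question is usually just the standard Faster R-CNN box-decoding arithmetic. A minimal sketch, assuming corner-format anchors (this is plain arithmetic and therefore differentiable in principle; approximate joint training simply stops the gradient at its output):

```python
import torch

def decode(anchors, deltas):
    """Anchors (N, 4) as x1, y1, x2, y2 and deltas (N, 4) as tx, ty, tw, th
    -> proposal boxes (N, 4) in corner format."""
    wa = anchors[:, 2] - anchors[:, 0]
    ha = anchors[:, 3] - anchors[:, 1]
    xa = anchors[:, 0] + 0.5 * wa
    ya = anchors[:, 1] + 0.5 * ha

    x = xa + deltas[:, 0] * wa        # x = x_a + t_x * w_a
    y = ya + deltas[:, 1] * ha        # y = y_a + t_y * h_a
    w = wa * torch.exp(deltas[:, 2])  # w = w_a * exp(t_w)
    h = ha * torch.exp(deltas[:, 3])  # h = h_a * exp(t_h)

    return torch.stack([x - 0.5 * w, y - 0.5 * h,
                        x + 0.5 * w, y + 0.5 * h], dim=1)
```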