
About vis_rpn_anchors #21

Open
a5372935 opened this issue Mar 16, 2020 · 17 comments

@a5372935

❓ Questions and Help

Which one should I care about, match_anchor or anchor_proposal?

Also, why does an image get two or more bboxes on the same target when predicting with inference_demo.py, and how can I make it output only one bbox?

@mrlooi
Owner

mrlooi commented Mar 16, 2020

anchor_proposal is used to generate the initial proposals for the network, before the RROI layer refines the (rotated) bounding boxes.

You would get two rotated bboxes on the same target if their IoU is below the NMS IoU threshold, since neither suppresses the other. Try decreasing the ROI IoU threshold.
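
To make the suppression behaviour concrete, here is a minimal sketch using torchvision's axis-aligned NMS (the repo applies the same idea to rotated boxes; the coordinates and scores below are made up for illustration):

```python
import torch
from torchvision.ops import nms

# two detections covering the same target; the IoU between them is ~0.76
boxes = torch.tensor([[10., 10., 60., 60.],
                      [14., 12., 66., 62.]])
scores = torch.tensor([0.9, 0.8])

print(nms(boxes, scores, iou_threshold=0.8))  # tensor([0, 1]): both kept (0.76 < 0.8)
print(nms(boxes, scores, iou_threshold=0.5))  # tensor([0]): duplicate suppressed (0.76 > 0.5)
```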

@a5372935
Author

Thanks. Then what does match_anchors mean?

@a5372935
Author

Is it that anchors are first matched (match_anchor), and then the regression is trained on them to produce anchor_proposal?

@mrlooi
Owner

mrlooi commented Mar 16, 2020

It's been a long time since I last looked at the code, but based on the naming, it probably means the anchors with IoU > the RPN IoU threshold. These anchors are fed to the RPN regression layer.
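
For intuition, a minimal sketch of what "matched anchors" typically means in an RPN. This is not the repo's code, and it uses axis-aligned IoU instead of the rotated IoU the repo computes; the anchors, ground-truth box, and threshold are illustrative:

```python
import numpy as np

def iou(a, b):
    # axis-aligned IoU between [x1, y1, x2, y2] boxes; the repo uses a
    # rotated-box IoU, but the matching logic is the same
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

anchors = np.array([[0, 0, 32, 32], [8, 8, 40, 40], [100, 100, 164, 164]])
gt_box = np.array([10, 10, 42, 42])
POS_IOU = 0.7  # a typical RPN positive-match threshold

# only these matched anchors contribute targets to the RPN regression loss
matched = [a for a in anchors if iou(a, gt_box) >= POS_IOU]
print(matched)  # -> [array([ 8,  8, 40, 40])]
```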

@a5372935
Author

I need help with my case.
This is the output config: https://drive.google.com/open?id=1AhByUq5SHwmo8xIadWziPG5_UPRR5vU2
My initial config: https://drive.google.com/open?id=1AhByUq5SHwmo8xIadWziPG5_UPRR5vU2
My log: https://drive.google.com/open?id=1HQfS0Fhqf-ABMcQfg9OOGeyOLTcQcett
My predicted image: https://drive.google.com/open?id=1lupmX2EsgxJ5GA33knmsRusINB8vJ3Do

My training loss is already very low, so why are the results still so bad? Is my parameter tuning bad, or is it just not enough training?

@mrlooi
Owner

mrlooi commented Mar 17, 2020

From the image, the target objects are really small. My guess is that there is a significant class imbalance, where there are many more invalid region proposals (rotated RPN proposals) than valid ones. A possible fix is to remove very large anchor sizes (e.g. 256) or very small ones (e.g. 20) that don't fit the objects in the dataset, and to start with a simpler model (R-50-FPN). It's generally good to reduce the total number of anchors to around 9-15.
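
As a hedged sketch of what that trimming could look like with the upstream maskrcnn_benchmark config keys (MODEL.RPN.ANCHOR_SIZES and MODEL.RPN.ASPECT_RATIOS exist in upstream maskrcnn_benchmark; this rotated fork may name them differently and add an angle dimension):

```python
from maskrcnn_benchmark.config import cfg

# 3 sizes x 3 aspect ratios = 9 anchors per location, within the 9-15
# range suggested above; pick the values from your dataset's object sizes
cfg.merge_from_list([
    "MODEL.RPN.ANCHOR_SIZES", (32, 64, 128),
    "MODEL.RPN.ASPECT_RATIOS", (0.5, 1.0, 2.0),
])
```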

@a5372935
Author

Let me try.

@a5372935
Author

@mrlooi Sometimes I get:

File "/home/lab602/桌面/rotated_maskrcnn-master/maskrcnn_benchmark/modeling/roi_heads/maskiou_head/roi_maskiou_feature_extractors.py", line 66, in forward
    mask_pool = self.max_pool2d(mask)
File "/home/lab602/anaconda3/envs/rotated/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
File "/home/lab602/anaconda3/envs/rotated/lib/python3.6/site-packages/torch/nn/modules/pooling.py", line 146, in forward
    self.return_indices)
File "/home/lab602/anaconda3/envs/rotated/lib/python3.6/site-packages/torch/_jit_internal.py", line 133, in fn
    return if_false(*args, **kwargs)
File "/home/lab602/anaconda3/envs/rotated/lib/python3.6/site-packages/torch/nn/functional.py", line 494, in _max_pool2d
    input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: invalid argument 2: non-empty 3D or 4D input tensor expected but got: [0 x 1 x 28 x 28] at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/generic/SpatialDilatedMaxPooling.cu:37

Why does this happen?

@mrlooi
Owner

mrlooi commented Mar 19, 2020

The error looks to originate from pooling.py. My guess is that the number of initial proposals was small or empty, and after filtering none of the proposals met the passing criterion (likely IoU with the ground truth), so an empty batch reached the pooling layer.
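
A minimal standalone reproduction of that failure mode (max_pool2d on torch versions of that era rejects a zero-sized batch), plus the kind of guard one could add before the pooling call; this is a hypothetical fix, not the repo's code:

```python
import torch
import torch.nn.functional as F

mask = torch.empty(0, 1, 28, 28)  # zero proposals survived filtering
# F.max_pool2d(mask, kernel_size=2)  # -> RuntimeError: non-empty 3D or 4D input tensor expected

if mask.size(0) > 0:
    pooled = F.max_pool2d(mask, kernel_size=2)
else:
    # skip pooling and propagate an empty result of the matching shape
    pooled = mask.new_zeros(0, 1, 14, 14)
```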

@a5372935
Author

@mrlooi Thank you, I understand. I also want to ask a few questions about RRPN Faster:


restore from pretrained_weighs in IMAGE_NET
2020-03-19 10:49:16.049380: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-03-19 10:49:16.199289: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-19 10:49:16.199738: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5564e3590510 executing computations on platform CUDA. Devices:
2020-03-19 10:49:16.199753: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
2020-03-19 10:49:16.221193: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3696000000 Hz
2020-03-19 10:49:16.223662: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x5564e35fb270 executing computations on platform Host. Devices:
2020-03-19 10:49:16.223733: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): <undefined>, <undefined>
2020-03-19 10:49:16.224412: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.8
pciBusID: 0000:01:00.0
totalMemory: 7.76GiB freeMemory: 6.34GiB
2020-03-19 10:49:16.224473: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2020-03-19 10:49:16.230070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-19 10:49:16.230131: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2020-03-19 10:49:16.230159: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2020-03-19 10:49:16.230719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6162 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:From /home/lab602/anaconda3/envs/faster/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
restore model
WARNING:tensorflow:From train.py:170: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the tf.data module.
2020-03-19 10:49:22.027217: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally

When I train RRPN Faster it gets stuck here. Is there anything I haven't changed?

@mrlooi
Owner

mrlooi commented Mar 19, 2020

Hmm, not sure why, but you've posted TensorFlow logs; this repo is PyTorch-based, so those must be from a different codebase.

@NimaDL

NimaDL commented Mar 20, 2020

@mrlooi Thank you. How can I solve @a5372935's problem when the number of initial proposals is small/empty? I got the same error:
RuntimeError: invalid argument 2: non-empty 3D or 4D input tensor expected but got: [0 x 1 x 28 x 28] at /opt/conda/conda-bld/pytorch_1556653099582/work/aten/src/THCUNN/generic/SpatialDilatedMaxPooling.cu:37

@mrlooi
Owner

mrlooi commented Mar 20, 2020

I would recommend starting with good RPN anchors. Use the vis_rpn_anchors.py file to visualize the anchors for your dataset.
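
For anyone without the repo handy, a generic sketch of the idea behind such a visualization: overlay a few anchor sizes/angles on a training image to eyeball the fit. This is not vis_rpn_anchors.py itself; the image path, center, sizes, and angles are placeholders, and rotation_point="center" needs matplotlib >= 3.4:

```python
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

fig, ax = plt.subplots()
ax.imshow(plt.imread("sample_train_image.jpg"))  # placeholder path

cx, cy = 300, 200                  # one anchor center, for illustration
for size in (32, 64, 128):         # candidate anchor sizes
    for angle in (-45, 0, 45):     # candidate anchor angles (degrees)
        ax.add_patch(Rectangle((cx - size / 2, cy - size / 2), size, size,
                               angle=angle, rotation_point="center",
                               fill=False, edgecolor="red"))
plt.show()
```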

@a5372935
Author

@mrlooi I forgot to ask: do the values in brackets after each loss refer to val_loss?

@mrlooi
Owner

mrlooi commented Mar 23, 2020

If I remember correctly, it's the loss for that minibatch.

Actually, I took another look at your log: https://drive.google.com/open?id=1HQfS0Fhqf-ABMcQfg9OOGeyOLTcQcett
The loss values in brackets are certainly way too high; the training was unstable and will not work.

@a5372935
Author

Yes, the loss for that minibatch is really high, but I think the vis_rpn_anchors output all looks correct. Why is this?

@mrlooi
Owner

mrlooi commented Mar 23, 2020

Possibly due to version differences; I used torch 1.0 - 1.1.
Or it could be a faulty dataset issue: the default pipeline does not handle missing, faulty, or empty ground truth very well.
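
One way to guard against that is to pre-filter the annotations, sketched here for a COCO-style annotation file (the file names are placeholders, and the non-degenerate-box check assumes a [x, y, w, h] bbox field; adapt to the actual dataset format):

```python
import json

with open("annotations.json") as f:  # placeholder path
    coco = json.load(f)

# keep only images that have at least one annotation with a non-degenerate box
valid_ids = {a["image_id"] for a in coco["annotations"]
             if a.get("bbox") and a["bbox"][2] > 0 and a["bbox"][3] > 0}
coco["images"] = [im for im in coco["images"] if im["id"] in valid_ids]
coco["annotations"] = [a for a in coco["annotations"] if a["image_id"] in valid_ids]

with open("annotations_filtered.json", "w") as f:
    json.dump(coco, f)
```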
