out of memory #6
Comments
Same problem here. The error occurs after about 5000 steps.
What is your cuDNN version?
CUDA 8.0, cuDNN 5.0.
I have met the same problem. Have you solved it, and how? Thanks.
I found that reducing the number of anchors can somewhat alleviate this problem; you can remove some angles or ratios in
R-DFPN_FPN_Tensorflow/libs/configs/cfgs.py
ANCHOR_ANGLES = [-90, -75, -60, -45, -30, -15]
ANCHOR_RATIOS = [1/5., 5., 1/7., 7., 1/9., 9.]
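For example, a reduced configuration could look like the sketch below (illustrative values only, not the repo's defaults; keep whichever angles and ratios fit your data):

# Sketch of a trimmed anchor setup in libs/configs/cfgs.py.
# Fewer angles/ratios mean fewer rotated anchors per feature-map location,
# which lowers GPU memory use at the cost of some angular/shape coverage.
ANCHOR_ANGLES = [-90, -60, -30]
ANCHOR_RATIOS = [1/5., 5., 1/7., 7.]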
I encountered the CUDA_ERROR_ILLEGAL_ADDRESS error during training when the objects are densely located, so limiting the number of objects in your own dataset (removing some of the really existing objects) can also alleviate this problem, as sketched below. It works, but not all the time.
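As an illustration only (the cap value, helper name, and annotation format here are assumptions, not part of this repo), densely annotated images could be thinned out before training along these lines:

import random

MAX_OBJECTS_PER_IMAGE = 100  # hypothetical cap; tune it to your GPU memory

def cap_annotations(boxes, max_objects=MAX_OBJECTS_PER_IMAGE):
    # Randomly keep at most max_objects ground-truth boxes for one image;
    # boxes is expected to be a list of per-object annotations.
    if len(boxes) <= max_objects:
        return boxes
    return random.sample(boxes, max_objects)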
Thanks a lot! It works.
Sorry for bothering you again. When I train with one 1080 GPU and a batch size of 1, I get the following errors. How can I solve this?
2018-05-10 13:42:49: step247692 image_name:000624.jpg |
rpn_loc_loss:0.189756244421 | rpn_cla_loss:0.214562356472 | rpn_total_loss:0.404318600893 |
fast_rcnn_loc_loss:0.0 | fast_rcnn_cla_loss:0.00815858319402 | fast_rcnn_total_loss:0.00815858319402 |
total_loss:1.17546725273 | per_cost_time:0.65540599823s
out of memory
invalid argument
2018-05-10 13:42:53.349625: E tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:639] failed to record completion event; therefore, failed to create inter-stream dependency
2018-05-10 13:42:53.349637: I tensorflow/stream_executor/stream.cc:4138] stream 0x55cd063dc880 did not memcpy host-to-device; source: 0x7fa30b0da010
2018-05-10 13:42:53.349641: E tensorflow/stream_executor/stream.cc:289] Error recording event in stream: error recording CUDA event on stream 0x55cd063dc950: CUDA_ERROR_ILLEGAL_ADDRESS; not marking stream as bad, as the Event object may be at fault. Monitor for further errors.
2018-05-10 13:42:53.349647: E tensorflow/stream_executor/cuda/cuda_event.cc:49] Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_ADDRESS
2018-05-10 13:42:53.349650: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:203] Unexpected Event status: 1
an illegal memory access was encountered
an illegal memory access was encountered