Error when testing vgg_16_cifar.py #9
Comments
Please use command
Hi, can you post your GPU type name? For instance, K40?
PaddlePaddle 0.8.0b, compiled with GTX Titan X, driver 352.39
@quietsmile Hi, there was no problem when we tested on Tesla K20/K40 with cuda 7.5 and cudnn 5.1 or cudnn 4.0. But we don't have a GTX Titan X environment and weren't able to replicate this problem. We will solve it later.
I have added a change list to fix it.
@quietsmile We have fixed this problem on GTX 980, see 341486d.
Fixed #107, and closing the issue.
Installed successfully on ubuntu 14.04 with cuda 7.5 and cudnn 5.1.5.
But running demo/image_classification/train.sh fails with the following error:
[INFO 2016-08-31 17:20:21,497 layers.py:1430] channels=512 size=8192
[INFO 2016-08-31 17:20:21,497 layers.py:1430] output size for conv_8 is 4
[INFO 2016-08-31 17:20:21,498 layers.py:1430] channels=512 size=8192
[INFO 2016-08-31 17:20:21,499 layers.py:1430] output size for conv_9 is 4
[INFO 2016-08-31 17:20:21,501 layers.py:1490] output size for pool_3 is 2_2
[INFO 2016-08-31 17:20:21,502 layers.py:1490] output size for pool_4 is 1_1
[INFO 2016-08-31 17:20:21,507 networks.py:960] The input order is [image, label]
[INFO 2016-08-31 17:20:21,507 networks.py:963] The output order is [cost_0]
I0831 17:20:21.523936 13974 Trainer.cpp:169] trainer mode: Normal
I0831 17:20:21.546594 13974 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-08-31 17:20:21,682 image_provider.py:52] Image size: 32
[INFO 2016-08-31 17:20:21,682 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-08-31 17:20:21,682 image_provider.py:58] DataProvider Initialization finished
I0831 17:20:21.682675 13974 PyDataProvider2.cpp:219] loading dataprovider image_provider::processData
[INFO 2016-08-31 17:20:21,682 image_provider.py:52] Image size: 32
[INFO 2016-08-31 17:20:21,682 image_provider.py:53] Meta path: data/cifar-out/batches/batches.meta
[INFO 2016-08-31 17:20:21,682 image_provider.py:58] DataProvider Initialization finished
I0831 17:20:21.683006 13974 GradientMachine.cpp:134] Initing parameters..
I0831 17:20:22.312453 13974 GradientMachine.cpp:141] Init parameters done.
.........
I0831 17:20:52.894659 13974 TrainerInternal.cpp:162] Batch=100 samples=12800 AvgCost=2.35864 CurrentCost=2.35864 Eval: classification_error_evaluator=0.833906 CurrentEval: classification_error_evaluator=0.833906
.........
I0831 17:21:00.884374 13974 TrainerInternal.cpp:162] Batch=200 samples=25600 AvgCost=2.15774 CurrentCost=1.95684 Eval: classification_error_evaluator=0.792148 CurrentEval: classification_error_evaluator=0.750391
.........
I0831 17:21:08.731333 13974 TrainerInternal.cpp:162] Batch=300 samples=38400 AvgCost=2.01417 CurrentCost=1.72705 Eval: classification_error_evaluator=0.753672 CurrentEval: classification_error_evaluator=0.676719
.........I0831 17:21:15.873359 13974 TrainerInternal.cpp:179] Pass=0 Batch=391 samples=50048 AvgCost=1.90795 Eval: classification_error_evaluator=0.71814
F0831 17:21:18.497601 13974 hl_cuda_cudnn.cc:779] Check failed: CUDNN_STATUS_SUCCESS == cudnnStat (0 vs. 5) Cudnn Error: CUDNN_STATUS_INVALID_VALUE
*** Check failure stack trace: ***
@ 0x7f609f255daa (unknown)
@ 0x7f609f255ce4 (unknown)
@ 0x7f609f2556e6 (unknown)
@ 0x7f609f258687 (unknown)
@ 0x8a98d4 hl_convolution_forward()
@ 0x5c66fc paddle::CudnnConvLayer::forward()
@ 0x62305c paddle::NeuralNetwork::forward()
@ 0x6b54af paddle::Tester::testOneBatch()
@ 0x6b5dc2 paddle::Tester::testOnePeriod()
@ 0x69a28c paddle::Trainer::trainOnePass()
@ 0x69d687 paddle::Trainer::train()
@ 0x53b0b3 main
@ 0x7f609e461ec5 (unknown)
@ 0x546695 (unknown)
@ (nil) (unknown)
Changing the cudnn version to 5.0.5 or 4.0.4 gives the same error.
Please help!
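
For anyone triaging a similar log, here is a minimal sketch that pulls the training progress lines and the first fatal check failure out of the output above. It is not part of PaddlePaddle: the names `PROGRESS`, `FATAL`, `summarize` and `SAMPLE` are made up for illustration, and the regular expressions only assume the line shapes visible in this log.

```python
import re

# Progress lines printed by TrainerInternal.cpp, e.g.
# "... Batch=200 samples=25600 AvgCost=2.15774 CurrentCost=1.95684 ..."
PROGRESS = re.compile(
    r"Batch=(\d+) samples=(\d+) AvgCost=([\d.]+) CurrentCost=([\d.]+)"
)

# Fatal lines in this log start with "F" + MMDD, then time, pid, file:line].
FATAL = re.compile(r"^F\d{4} \S+\s+\d+ ([^\s:]+):(\d+)\] (.*)$")


def summarize(log_text):
    """Return (progress_rows, first_fatal) parsed from a trainer log."""
    progress, fatal = [], None
    for raw in log_text.splitlines():
        line = raw.strip()
        m = PROGRESS.search(line)
        if m:
            progress.append(
                (int(m.group(1)), int(m.group(2)),
                 float(m.group(3)), float(m.group(4)))
            )
        m = FATAL.match(line)
        if m and fatal is None:
            fatal = {
                "file": m.group(1),
                "line": int(m.group(2)),
                "message": m.group(3),
            }
    return progress, fatal


# Three lines copied from the log in this issue (Eval fields trimmed).
SAMPLE = """\
I0831 17:20:52.894659 13974 TrainerInternal.cpp:162] Batch=100 samples=12800 AvgCost=2.35864 CurrentCost=2.35864
I0831 17:21:00.884374 13974 TrainerInternal.cpp:162] Batch=200 samples=25600 AvgCost=2.15774 CurrentCost=1.95684
F0831 17:21:18.497601 13974 hl_cuda_cudnn.cc:779] Check failed: CUDNN_STATUS_SUCCESS == cudnnStat (0 vs. 5) Cudnn Error: CUDNN_STATUS_INVALID_VALUE
"""

if __name__ == "__main__":
    rows, fatal = summarize(SAMPLE)
    for batch, samples, avg_cost, cur_cost in rows:
        # samples / batch gives the per-batch sample count seen in the log
        print(batch, samples // batch, avg_cost, cur_cost)
    print(fatal)
```

Reading the result next to the stack trace shows that training batches ran normally through pass 0 and that the failure was raised from hl_cuda_cudnn.cc:779 while paddle::Tester::testOneBatch() was calling the cuDNN convolution forward path.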