How to test some samples while training in newest version? #318

Closed

jamestang0219 opened this issue Nov 2, 2016 · 5 comments

@jamestang0219

I updated my Paddle version to 0.8.0b3 and then trained an RNN model.
I found that there is no Tester while training.
When I used the previous version to train the model, the Tester could test the test data every 100 batches.
Here is my train config:

paddle train \
--config=$mod \
--save_dir=./model_desc_full \
--trainer_count=4 \
--log_period=100 \
--num_passes=10 \
--use_gpu=true \
--show_parameter_stats_period=1000 \
--test_all_data_in_one_period=1 \
--config_args=is_predict=0

This config worked in the previous version, but the Tester disappeared in the newest version.
I want to know how to set the config to TEST while TRAINING.

@reyoung (Collaborator) commented Nov 2, 2016

@jamestang0219 Setting --test_period=1000 will run the test job after every 1000 mini-batches.
We changed --test_period's default value to zero, which means testing now runs once after each pass.

It would be very kind of you to submit a PR changing the demos' run.sh if you have time; if you don't, please let us know and we'll take care of it.

@backyes Please update the demos' shell scripts if @jamestang0219 doesn't want to submit a PR, because the default value of test_period has changed.
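
For reference, a sketch of the same training command with the suggested flag added (it simply inserts --test_period=1000 into the flags already listed above; everything else is unchanged):

paddle train \
--config=$mod \
--save_dir=./model_desc_full \
--trainer_count=4 \
--log_period=100 \
--test_period=1000 \
--num_passes=10 \
--use_gpu=true \
--show_parameter_stats_period=1000 \
--test_all_data_in_one_period=1 \
--config_args=is_predict=0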

@reyoung reyoung added Bug and removed question labels Nov 2, 2016
@reyoung reyoung added this to the 0.8.1 milestone Nov 2, 2016
@jamestang0219 (Author)

@reyoung
Here is the log from the previous version:
I1101 07:53:16.276069 119355 TrainerInternal.cpp:162] Batch=8800 samples=281600 AvgCost=0.0174819 CurrentCost=0.0205525 Eval: classification_error_evaluator=0.00603693 CurrentEval: classification_error_evaluator=0.008125
...................................................................................................
I1101 07:53:42.518815 119355 TrainerInternal.cpp:162] Batch=8900 samples=284800 AvgCost=0.0175253 CurrentCost=0.0213492 Eval: classification_error_evaluator=0.00604986 CurrentEval: classification_error_evaluator=0.0071875
...................................................................................................I1101 07:54:08.425493 119355 TrainerInternal.cpp:204] _embedding_0.w0 avg_abs_val=0.106658 max_val=3.63432 avg_abs_grad=1.56957e-07 max_grad=0.00512181
I1101 07:54:08.426228 119355 TrainerInternal.cpp:204] _fc_layer_0.w0 avg_abs_val=0.16739 max_val=0.997636 avg_abs_grad=0.00018073 max_grad=0.00459319
I1101 07:54:08.426569 119355 TrainerInternal.cpp:204] _fc_layer_0.wbias avg_abs_val=0.195303 max_val=0.520971 avg_abs_grad=0.000887678 max_grad=0.0156855
I1101 07:54:08.427664 119355 TrainerInternal.cpp:204] _lstmemory_0.w0 avg_abs_val=0.125715 max_val=1.15954 avg_abs_grad=0.000132394 max_grad=0.0128729
I1101 07:54:08.428009 119355 TrainerInternal.cpp:204] _lstmemory_0.wbias avg_abs_val=0.160678 max_val=0.589867 avg_abs_grad=0.000734516 max_grad=0.0196687
I1101 07:54:08.428349 119355 TrainerInternal.cpp:204] _fc_layer_1.w0 avg_abs_val=0.461006 max_val=1.20577 avg_abs_grad=0.00776989 max_grad=0.0120312
I1101 07:54:08.428617 119355 TrainerInternal.cpp:204] _fc_layer_1.wbias avg_abs_val=0.310792 max_val=0.310793 avg_abs_grad=0.0090016 max_grad=0.00900169

I1101 07:54:08.428644 119355 TrainerInternal.cpp:162] Batch=9000 samples=288000 AvgCost=0.0175542 CurrentCost=0.0201257 Eval: classification_error_evaluator=0.00605903 CurrentEval: classification_error_evaluator=0.006875
I1101 07:54:10.472939 119355 Tester.cpp:111] Test samples=1000 cost=0.466531 Eval: classification_error_evaluator=0.159
...................................................................................................
I1101 07:54:35.642098 119355 TrainerInternal.cpp:162] Batch=9100 samples=291200 AvgCost=0.0175926 CurrentCost=0.0210439 Eval: classification_error_evaluator=0.00606799 CurrentEval: classification_error_evaluator=0.006875
...................................................................................................
I1101 07:55:01.101780 119355 TrainerInternal.cpp:162] Batch=9200 samples=294400 AvgCost=0.01767 CurrentCost=0.0247131 Eval: classification_error_evaluator=0.00611073 CurrentEval: classification_error_evaluator=0.01

It worked well, testing every 1000 batches.

@jamestang0219 (Author)

@reyoung
And here are the logs from the newest version:
I1102 18:22:04.354857 34647 TrainerInternal.cpp:165] Batch=3800 samples=486400 AvgCost=0.107311 CurrentCost=0.0766458 Eval: classification_error_evaluator=0.041801 CurrentEval: classification_error_evaluator=0.0296875
...................................................................................................
I1102 18:23:05.413488 34647 TrainerInternal.cpp:165] Batch=3900 samples=499200 AvgCost=0.10815 CurrentCost=0.140034 Eval: classification_error_evaluator=0.0420613 CurrentEval: classification_error_evaluator=0.0519531
...................................................................................................I1102 18:24:05.555019 34647 TrainerInternal.cpp:207] _embedding_0.w0 avg_abs_val=0.0516422 max_val=0.87552 avg_abs_grad=1.31203e-05 max_grad=0.202811
I1102 18:24:05.556380 34647 TrainerInternal.cpp:207] _fc_layer_0.w0 avg_abs_val=0.133105 max_val=1.14789 avg_abs_grad=0.00251638 max_grad=0.252082
I1102 18:24:05.556483 34647 TrainerInternal.cpp:207] _fc_layer_0.wbias avg_abs_val=0.13841 max_val=0.452819 avg_abs_grad=0.0268143 max_grad=2.08488
I1102 18:24:05.557827 34647 TrainerInternal.cpp:207] _lstmemory_0.w0 avg_abs_val=0.0976409 max_val=0.819044 avg_abs_grad=0.00756949 max_grad=2.52075
I1102 18:24:05.557920 34647 TrainerInternal.cpp:207] _lstmemory_0.wbias avg_abs_val=0.117548 max_val=0.493404 avg_abs_grad=0.0481846 max_grad=5.2654
I1102 18:24:05.558040 34647 TrainerInternal.cpp:207] _fc_layer_1.w0 avg_abs_val=0.109183 max_val=0.359466 avg_abs_grad=1.06527 max_grad=5.85286
I1102 18:24:05.558117 34647 TrainerInternal.cpp:207] _fc_layer_1.wbias avg_abs_val=0.0851114 max_val=0.0851116 avg_abs_grad=3.39263 max_grad=3.39263

I1102 18:24:05.558130 34647 TrainerInternal.cpp:165] Batch=4000 samples=512000 AvgCost=0.110242 CurrentCost=0.191823 Eval: classification_error_evaluator=0.0426113 CurrentEval: classification_error_evaluator=0.0640625
...................................................................................................
I1102 18:25:05.374027 34647 TrainerInternal.cpp:165] Batch=4100 samples=524800 AvgCost=0.115381 CurrentCost=0.320934 Eval: classification_error_evaluator=0.0453944 CurrentEval: classification_error_evaluator=0.156719

I also added the --test_period=1000 flag, but it doesn't test the test data while training.

@jamestang0219 (Author)

@reyoung
One more question: while training, the AvgCost sometimes becomes nan, which causes training to fail.
What could be the problem?
I1102 18:24:05.558130 34647 TrainerInternal.cpp:165] Batch=4000 samples=512000 AvgCost=0.110242 CurrentCost=0.191823 Eval: classification_error_evaluator=0.0426113 CurrentEval: classification_error_evaluator=0.0640625
...................................................................................................
I1102 18:25:05.374027 34647 TrainerInternal.cpp:165] Batch=4100 samples=524800 AvgCost=0.115381 CurrentCost=0.320934 Eval: classification_error_evaluator=0.0453944 CurrentEval: classification_error_evaluator=0.156719
...................................................................................................
I1102 18:26:06.421185 34647 TrainerInternal.cpp:165] Batch=4200 samples=537600 AvgCost=nan CurrentCost=nan Eval: classification_error_evaluator=0.0540532 CurrentEval: classification_error_evaluator=0.469609
...................................................................................................
I1102 18:27:06.688400 34647 TrainerInternal.cpp:165] Batch=4300 samples=550400 AvgCost=nan CurrentCost=nan Eval: classification_error_evaluator=0.0715807 CurrentEval: classification_error_evaluator=0.952656
*** Aborted at 1478082836 (unix time) try "date -d @1478082836" if you are using GNU date ***

PC: @ 0x700e89 paddle::GpuVectorT<>::getAbsMax()

*** SIGFPE (@0x700e89) received by PID 34647 (TID 0x7f7d72fb9700) from PID 7343753; stack trace: ***

@     0x7f7da965a330 (unknown)

@           0x700e89 paddle::GpuVectorT<>::getAbsMax()

@           0x6a5a69 _ZNSt17_Function_handlerIFvPN6paddle9ParameterEEZNS0_15TrainerInternal13trainOneBatchElRKNS0_9DataBatchEPSt6vectorINS0_8ArgumentESaIS9_EEEUlS2_E_E9_M_invokeERKSt9_Any_dataS2_

@           0x6579bc paddle::TrainerThread::doCallback()

@           0x657e91 paddle::TrainerThread::gradCollectThread()

@     0x7f7da8836a60 (unknown)

@     0x7f7da9652184 start_thread

@     0x7f7da7f9e37d (unknown)

@                0x0 (unknown)

/usr/local/paddle/bin//paddle: line 81: 34647 Floating point exception(core dumped) ${DEBUGGER} $MYDIR/../opt/paddle/bin/paddle_trainer ${@:2}

@backyes (Contributor) commented Nov 2, 2016

@reyoung has found some clues about this bug; he will fix it later.

reyoung added a commit to reyoung/Paddle that referenced this issue Nov 2, 2016
* Forget to finishTestPeriod in testOnePeriod.
* Fix PaddlePaddle#318
* Potentially related to PaddlePaddle#310, PaddlePaddle#311
emailweixu pushed a commit that referenced this issue Nov 2, 2016
* Forget to finishTestPeriod in testOnePeriod.
* Fix #318
sanylcs added a commit to sanylcs/Paddle that referenced this issue Dec 21, 2016
* refine sparse momentum api and unittest (PaddlePaddle#126)

* refine sparse momentum api and unittest
* fix unittests bug

* Remove main function in some unittest.

* Update Mac OS X port
* follow comments to fix bugs

* Revise some words in build doc

* Add automatic check AVX in CMake (PaddlePaddle#145)

* Add automatic check AVX in CMake
* Revise table format and some words in build docs

* Fix cmake/FindAVX.cmake

* Update build docs (PaddlePaddle#148)

* Add automatic check AVX in CMake

* Add indent in FindAVX.cmake

* Revise table format and some words in build docs

* Update build docs

* Fix bug when only support AVX 2 (PaddlePaddle#150)

In some situations, for instance in a virtual machine, this could happen.

* add scripts to build ubuntu install package. (PaddlePaddle#132)

* also refine install docs

* some bug fix for sparse matrix (PaddlePaddle#133)

* some bug fix for sparse matrix

* a minor bug fix

* Update build docs (PaddlePaddle#149)

* Add automatic check AVX in CMake

* Add indent in FindAVX.cmake

* Revise table format and some words in build docs

* Update build docs

* Update build docs

* [DOC CHANGE] Rerange Build docs & emphasize them in README.md (PaddlePaddle#151)

* Rerange Build docs & emphasize them in README.md

* Rerange Build docs & emphasize them in README.md

* Update Readme (PaddlePaddle#153)

* Update Readme

* Update readme

* Update readme

* Fix CUDA_VERSION Comparison (PaddlePaddle#165)

* Update readme (PaddlePaddle#155)

* Update readme

* Apache 2.0

* add interface and test of RecurrentGradientMachine (PaddlePaddle#156)

* add interface and unittest of RecurrentGradientMachine for the function of multiple Subsequence inlinks with unequal token length

* bug fix for dataprovider for quick start inference (PaddlePaddle#168)

* Support MAC OS Sierra (PaddlePaddle#169)

* typo in image classification demo (PaddlePaddle#167)

* support rectangle padding, stride, window and input for PoolProjection (PaddlePaddle#115)

* support rectangle padding, stride, window and input for PoolProjection

* Follow comments.
1. Remove start
2. refine img_pool_a/b.conf for test_NetworkCompare
3. Split unit test

* Modify the test in img_layers.py

* Use C++ 11 atomic_flag in MacOS as spin lock (PaddlePaddle#175)

* Use C++ 11 atomic_flag in MacOS as spin lock
* Add unittest for it.

* Read git sha1 when building Paddle, and add it to PADDLE_VERSION macro

* save the model file including git sha1

* add weight for cost layer interface (PaddlePaddle#177)

* Should not compile the two files if -DWITH_AVX=OFF. (PaddlePaddle#163)

* If cmake is configured with -DWITH_AVX=OFF, the files src/hl_math.cc and src/hl_avx_functions.cc should not be compiled.

* Add travis for osx (PaddlePaddle#189)

* set MKL search path with intel64 (PaddlePaddle#188)

* Mnist demo (PaddlePaddle#162)

* added mnist demo

* modified .gitignore for .project files

* normalize pixel in mnist_provider.py and set use_gpu=0

* add interface and unittest for nce layer (PaddlePaddle#180)

* add interface and unittest for nce layer

* follow comments

* Merge internal changes (PaddlePaddle#198)

* fix DataProvider create function args bug

Change-Id: I9e3a1c535c805bf30204a14aea8d5143ff534784

* remove PserverForPython.h which is not used

Change-Id: I2b27f1f3c11a42766a92fc689f0f5f1f73ee1d70

* add internal document script

Change-Id: Ia0fec79456caea0b271f9903cc13e8a3d32e0774

* hierarchical rnn document, add new config example (PaddlePaddle#106)

* hierarchical rnn document, add new config example

* update inputs_type of label

* add check for unsupported config

* refine hierarchical document

* refine doc title

* update docs, fix paddle to PaddlePaddle

* follow comments

* remove some copyfrom in AgentLayer and ExpandLayer, fix warning in seq2seq config (PaddlePaddle#183)

* remove redundant HPPL_TYPE_DOUBLE (PaddlePaddle#200)

* add cost_type constraint to weighted_cost interface (PaddlePaddle#206)

* remove unmerged internal documents (PaddlePaddle#205)

* Add FAQ (PaddlePaddle#128)

* Init commit for doing FAQ

* Add speed up training

* Add graphviz to ci

* Add shared paramter

* Tiny refine

* Fix bug in yield dictionary in DataProvider. (PaddlePaddle#197)

* Fix bug in yield dictionary in DataProvider.
* Also make virtualenv work in Paddle.

* Update docker_instll.rst docker image name (PaddlePaddle#210)

* Fix sparse training for trainer_count=1 (PaddlePaddle#204)

* Fix sparse training for trainer_count=1

For trainer_count=1, the gradient machine is NeuralNetwork, which does not create a parameter buffer for PARAMETER_GRADIENT for sparse update in Parameter::enableType, but the gradient parameter buffer is still used in SgdThreadUpdater.

* Minor update to comment

* Supplement doc for RNN (PaddlePaddle#214)

* Speed up PyDP2, support numpy.float array (PaddlePaddle#207)

* fix bug in some different python environment (PaddlePaddle#220)

* Fix install_docker.rst and data_sources file open mode

* Follow PaddlePaddle#223
* Fix PaddlePaddle#222

* add base class for seqlastin/max/average layer (PaddlePaddle#187)

* Added Bidi-LSTM and DB-LSTM to quick_start demo (PaddlePaddle#226)

* add missing layer_attr (PaddlePaddle#234)

* fix build bug in gcc46 (PaddlePaddle#236)

* error in doc of quick_start (PaddlePaddle#228)

* fix error in doc of quick_start
* There are some warnings when executing preprocess.sh

* add maxout layer, including interface and unittest (PaddlePaddle#229)

* add maxout layer, including interface and unittest

* follow maxout comments

* auto setting channels

* fix unittest bug in test_RecurrentGradientMachine

* remove deprecated start input in img_pool_layer (PaddlePaddle#237)

* Fix dataprovider converter for sparse data

* Fix check type unmatch in MaxOutLayer (PaddlePaddle#242)

Compilation failed on gcc 4.6

* Sequence tagging demo (PaddlePaddle#225)

* Update contribute_to_paddle.md (PaddlePaddle#248)

* add input sparse data check for sparse layer at runtime (PaddlePaddle#247)

* add input sparse data check for sparse layers at runtime,
to avoid invalid data access at the pserver end while doing prefetch

* remote sparse design supports both binary sparse and float sparse

* Python trainer api (PaddlePaddle#193)

* Python trainer API and demo

* Adding missing PaddleAPIPrivate.h

* Adding api_train.sh

* More comments

* Bump up patch version to 0b3

* Change contribute to paddle to fit new branching model (PaddlePaddle#275)

* Change contribute to paddle to fit new branching model

* set test_period default value to 0 (PaddlePaddle#279)

* Make Paddle --save_dir support a directory name  (PaddlePaddle#277)

* Also fix PaddlePaddle#243

* fix interface bug of block_expand_layer and add unittest (PaddlePaddle#265)

* fix interface bug of block_expand_layer and add unittest

* auto compute num_channels

* default value of num_channels is None

* adjust input order of block_expand

* Support empty Param Block in ParameterSever (PaddlePaddle#244)

* Because a cluster may use many machines to train a model, some parameters could be too small for the ParameterServer; then some pservers might not have any ParamBlock.
* Also, if ports_num or ports_num_for_sparse is too large, give a warning at runtime.

* Add bilinear interpolation layer

* fix type unmatch on gcc

* Adding an introduction doc for Paddle to implement simplest linear regression.

* Add default cuda system path (PaddlePaddle#192)

* DYLD_LIBRARY_PATH is disabled after Mac OS X 10.11
* fix clang + gpu compile error on Mac OS
* fix some words and errors in build docs

* Add glog header path to include (PaddlePaddle#295)

* add SpatialPyramidPoolLayer c++ support

* Add job=time in trainer, refine cudnn_conv to reduce gpu memory and speed up training. (PaddlePaddle#218)

* Add benchmark for PaddlePaddle, tensorflow and caffe

* ConvProjection to reduce memory for googlenet

* Add unit test for ConvProjection.
1. unit test in test_LayerGrad.
2. compare the ConvProjection and CudnnConvLayer, also compare the concat_layer+img_conv_layer and concat_layer_conv_projection.

* Reduce cudnn_conv memory and add benchmark document.
1. Use TmpMatrix as the workspace in cudnn_conv to reduce gpu memory. It saves a lot of memory.
2. Add benchmark document.
3. Fix smallnet_mnist_cifar.py in paddle.

* Add job=time and refine cudnn_conv to reduce gpu memory and speed up

* Refine cudnn_conv and shared biases operation in concat_layer and mixed_layer.

* follow comments

* follow comments

* Use unique_ptr to prevent memory leaks in CudnnConvLayer.

* Add some concepts documents to guide user for using paddle (PaddlePaddle#249)

* reuse code of PoolProjection in PoolProjectionLayer

* Add How to build docs (PaddlePaddle#312)

* Bug fix in CudnnConvLayer, which will lead to destruction error. (PaddlePaddle#317)

* Fix a bug in testOnePeriod. (PaddlePaddle#322)

* Forget to finishTestPeriod in testOnePeriod.
* Fix PaddlePaddle#318

* add user_arg to LayerConfig (PaddlePaddle#315)

* install the right python package version (PaddlePaddle#326)

For multiple installations of paddle, there might be multiple versions of the python package at opt/paddle/share/wheels/. We should install the right version.
Ideally, we should remove the wrong versions when installing, but it's not easy to do this with cmake.

Change-Id: Ida8a8d60643ad9e42cf1c85776de9122d5ba1392

* Add matrix inverse (PaddlePaddle#240)

* Add matrix inverse

* report error when use parallel_nn to train recurrent_nn model (PaddlePaddle#335)

* install the right python package version (PaddlePaddle#340)

For multiple installations of paddle, there might be multiple versions of the python package at opt/paddle/share/wheels/. We should install the right version.
Ideally, we should remove the wrong versions when installing, but it's not easy to do this with cmake.

Change-Id: Ida8a8d60643ad9e42cf1c85776de9122d5ba1392

* Fix minor errors in instructions of building Paddle on Mac OS X (PaddlePaddle#347)

* Fix bug and redundant code in hl_dso_loader.cc (PaddlePaddle#306)

* Fix glog check type unmatch in Util.cpp (PaddlePaddle#353)

* Fix glog check type unmatch in Util.cpp

PaddlePaddle#352

* Add code coverage and coveralls (PaddlePaddle#296)

* Add Issue template to guide user submit good issue (PaddlePaddle#354)

* Add issue template

* Update ISSUE_TEMPLATE.md

* Update ISSUE_TEMPLATE.md

* Rename

* Rename

* Typo

* Typo

* Typo

* Typo

* Follow comments

* Follow comments

* Add elementwise math operations (PaddlePaddle#343)

* Add elementwise math operations
This allows us to use expressions like y = log(1 + exp(x))
Also added unittests for ActivationFunction
* Enforce keyword arguments for non-positional arguments
* Add LogActivation to doc

* include mkl_lapacke.h (PaddlePaddle#359)

* Update ISSUE_TEMPLATE.md (PaddlePaddle#357)

* add rdma cmake support (PaddlePaddle#284)

* add rdma cmake support

* move rdma related code to rdma.cmake

* using find_package for swig (PaddlePaddle#334)

* Use diff to compare config unittest (PaddlePaddle#363)

Fix PaddlePaddle#342

* Fix SRL hang when exit. (PaddlePaddle#291)

* Fix SRL hang when exit.

* An error occurred when enabling Async Load in TestDataProvider.
  * It is because DataProvider calls getNextBatchInternal in one thread while DataProvider is destructed in another thread.
  * Add a wait routine in DataProvider destruction.
* Also fix another bug, when destructing TestDataProvider without having read any test data.

Fix PaddlePaddle#286

* Follow comments; using a mutex is cool!

* Follow comments

* Add img_size for unit test

* Fix bilinear interp bug

* revert flags.cmake

* Replace outputH with batchSize

* Follow comments

* Revise one word in ISSUE_TEMPLATE.md (PaddlePaddle#371)

* abstract outputSize function in CNN-related layers (PaddlePaddle#314)

* Add define for double getrf, getri (PaddlePaddle#381)

* Add SumCost

This allows the user to implement any type of cost by summing over the output of non-cost layers.

Change-Id: Ic55aaabbf0c1299e70b8e48a0effcc91f8f5bd29

* Add sum_cost to document

And rebase

Change-Id: I7ea234b3aa8fc70675af15d91db08242c43fb5ff

* Remove Mac OS X build docs (PaddlePaddle#386)

Currently, Paddle on Mac OS X has not been deliberately tested across different versions of Mac OS X and Clang.
Once all these things are done, we will reopen the Mac build docs.

* add python wrap for sppLayer

* Cancelling Travis build with docs updates only. (PaddlePaddle#372)

* fix deadlink in Chinese quick start doc. (PaddlePaddle#389)

* add python-related unittest problem in faq document (PaddlePaddle#377)

* Fix macOS quick start preprocess script. (PaddlePaddle#390)

* Use `gshuf` instead of `shuf` on macOS
* Fix PaddlePaddle#388

* fix floating-point overflow problem of tanh (PaddlePaddle#355)

* py_paddle link zlib(PaddlePaddle#393)

* enable swig unittest in travis-ci (PaddlePaddle#394)

* Init

* Add numpy deps

* Refine

* fix some nvcc compile options (PaddlePaddle#392)

* Follow comments

* modify the format of diff information in protostr (PaddlePaddle#398)

* Fix minor bug
* add patch does not trigger travis ci

* follow comments

* Fix Travis Ci does not build when push patches (PaddlePaddle#399)

* add getSize method for PoolProjection

* Make matrix well-conditioned when unittest inverse

* Implement setDiag() with BaseMatrix::assign()

* Follow comments

* follow comments

* Update FindAVX.cmake (PaddlePaddle#404)

* make AVX_FOUND the default value for WITH_AVX
* let AVX_FLAG always keep the -mavx flag, since the compiler can build binaries with -mavx even if the CPU does not support AVX.

* some tiny fixs (PaddlePaddle#406)

* some tiny fixs

* use VLOG(3)

* [Work in Progress] Update cluster_train.md (PaddlePaddle#391)

Update cluster_train.md for easier understanding

* Fix memory leak in image classification demo, which is caused by dataprovider (PaddlePaddle#323)

* the memory leak is inside one pass.

* Update

* Delelte old protostr

* Follow comments

* add some code comments for SppLayer

* Update

* Fix a bug

* initial take on deconv layers

* added convTrans test and python components

* added more test on convTrans layer and comments

* Refactor ExpandConvTransLayer to share codes with ExpandConvLayer

* refactored ExpandConvLayer and ExpandConvTransLayer with ConvBaseLayerCpu

* fixed a bug in refactoring ExpandConv/TransLayer

* add another small test in test_LayerGrad for convTransLayer

* Revised deconv implementations according to luotao1

* rebase deconv implementation with develop branch and resolve conflicts with pull#218 commit 45c81a4

* deconv layer implementation modification following luotao1 comments

* fix a small bug in ConvTransLayerBase in config_parser.py

* minor changes to the deconv implementation in ConvBaseLayer.cpp and config_parser.py

* minor changes on deconv per luotao1 comments

* Refactored imageSize in ConvBaseLayer to MathUtil

* minor change to convTransLayer test in test_LayerGrad

* minor changes on deconv implementation and add protostr test for deconv layer

* fixed a bug in parse_conv in config_parser.py

* Generate bilinear protostr via Linux

* set mixedlayer output size according to input operator (PaddlePaddle#414)

* set mixedlayer output size according to input operator
* change from num_channel to num_channels for conv_operator (the old one is
really misleading because all the others are num_channels)

* also changed the arg name in projections.py

* change the act.name for LinearActivation() to "linear" so that it won't
fail in hl_activetype; also fix the hasinputsset in submodel

* Revise code

* use yapf to format python code, add style config file

* Add checkout name for Dockerfile

* Because on dockerhub we cannot set the `docker build` running
  directory, we could only use the `git clone` command to get the latest
  code if we put `Dockerfile` in a subdirectory

* But `git clone` will check out the default branch only, so here
  we add an `ENV` in the Dockerfile to check out a special branch or tag in
  the git repo. We could change it to the `V0.9.0` tag when it is released.

* '*' operator overload for LayerOutput

Making '*' support the multiplication between a scalar and LayerOutput

Also changing '+' to support addition between a vector and a scalar.

Change-Id: I7daf35590dc2b2f855a29d9ef43ac57979442e0f

* change hlactivetype instead of act.name

* fix bug in sum_cost

* fix test_layerHelpers unittest error

* change python code style to pep8

* Fix bug in multiple objects in define_py_sources

* Add unittest for split datasource

* Fix PaddlePaddle#436

* multi_binary_cross_entropy when ids vector is provided

* copy the data when createSparseMatrix

* format python code in demo, doc, doc_cn and paddle directories

* format python code in python directory

* modifications according to comments

* Add pre-commit config file.

* Add yapf hook to format python code.
* Add Remove CRLF

* Update pre-commit-config

* Check all files by pre commit hooks

* Bug fix in testing mode.

* Refine clang-format for Paddle style

* fix url of sub-pages

* added resnet lstm architecture from GNMT

* modify document directory structure in model config helpers

* Revert "fix url of sub-pages"

* Add ScalingProjection

out = w * input
where w is a parameter of size 1

Change-Id: Ife682d62323ceb1a20cbbf6269421b20a862d888

* Fix unittest

Change-Id: Ic80845c892c96c37a0df0ddc433fe1aeaa5a9d1c

* Fix forwardTest for ids in python swig.

* unittests need to be added, but fix the bugs first.

* Bumping up version number to v0.9.0a0

* Fix some problems in Debian build scripts.

* Mount local Paddle instead of git clone from remote.
* Use the official Chinese Ubuntu source instead of the 163 mirror.

* Update dockerfile tags

* Add version check for paddle

* Refine ver2num function, add comments

* Fix Debian package name in ubuntu install docs.

* Fix PaddlePaddle#486

* Change demo datafile location by using CDN in baidu.

* merge bugfixes PaddlePaddle#593 and #597 from the develop branch

* Bumping up version number

* Add Release notes

* Refine documentation in RELEASE.md

* fix dead link for quick start

* update

* Fix Travis-CI build for release

* Remove typo in documentation.

* fix typo
thisjiang pushed a commit to thisjiang/Paddle that referenced this issue Oct 28, 2021
gglin001 added a commit to graphcore/Paddle-fork that referenced this issue Mar 17, 2022
AnnaTrainingG pushed a commit to AnnaTrainingG/Paddle that referenced this issue Sep 19, 2022
* add LapStyle Model
qingshui added a commit to qingshui/Paddle that referenced this issue Jun 25, 2023
* add fennel split, fix amp bug, fix node edge not equal (PaddlePaddle#318)

* fix amp (PaddlePaddle#319)
zmxdream pushed a commit to zmxdream/Paddle that referenced this issue Jun 26, 2023
danleifeng pushed a commit to danleifeng/Paddle that referenced this issue Sep 13, 2023