2018 06 14
Tao Luo edited this page Dec 9, 2019 · 1 revision
- OPEN bugfix/add_inference_lib_to_release
- OPEN Feature/pass manager
- OPEN bugfix/trt op with kernel
- OPEN doc/inference api
- MERGED feature/anakin ci
- MERGED loose threshold of TRT for CI in different model
- [WIP] Checkpoint For lookup table
- documentation about checkpoint
- Seq2seq Attention Model:
- Convergence issue only exists on GPU.
- Removed the GPU kernel of sequence_expand_op.
- Debugging sequence_expand_op GPU kernel.
- mkldnn:
- add FLAGS_use_mkldnn to global control use_mkldnn: https://github.com/PaddlePaddle/Paddle/pull/11319
- align the mklml v2 benchmark between v0.11.0 and latest images on 6148 CPU. Test the mklml fluid benchmark using ParallelDo: http://agroup.baidu.com/paddlepaddle/md/article/964808
- OCR (Doing): fluid MKL version will affect the accuracy of Caffe MKL version on the same machine
- refine api docs:
- refine docs of elementwise_op etc: https://github.com/PaddlePaddle/Paddle/pull/11369
- add url of cuda9.0_cudnn7_avx_mkl: https://github.com/PaddlePaddle/Paddle/pull/11417
- code review:
- feature/anakin ci: https://github.com/PaddlePaddle/Paddle/pull/11330
- doc/inference api: https://github.com/PaddlePaddle/Paddle/pull/11332
- bugfix/add_inference_lib_to_release: https://github.com/PaddlePaddle/Paddle/pull/11455
- bugfix/trt op with kernel: https://github.com/PaddlePaddle/Paddle/pull/11408
- Face Detection:
- Train and debug model.
- Make the normalization operator more general and fix bug in l2_normalize.
- Refine the VGG-SSD network. https://github.com/PaddlePaddle/models/pull/977
- Implement a bilinear initializer for transposed convolution to do upsampling.
- Change the transposed conv2d initializer.
- maxout Python API: https://github.com/PaddlePaddle/Paddle/pull/11278
- Others:
- Debug nan in SSD model with @buxingyuan.
- Some work for the image-classification release: how to download the ImageNet dataset and release the trained models.
- Code review:
- SSD doc: https://github.com/PaddlePaddle/models/pull/968#pullrequestreview-127443167
- Image classification: https://github.com/PaddlePaddle/models/pull/979#pullrequestreview-128214829
- Anchor generator: https://github.com/PaddlePaddle/Paddle/pull/11218#pullrequestreview-127454877
- Argsort op: https://github.com/PaddlePaddle/Paddle/pull/11174#pullrequestreview-127448662
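The bilinear initializer for transposed convolution mentioned under Face Detection can be illustrated with a minimal sketch. It assumes the common construction used for learnable upsampling layers, where the kernel is the outer product of a 1-D triangular filter. The function names below are hypothetical and are not the Paddle API, and the actual initializer in the PR may differ.

```python
def bilinear_filter_1d(size):
    """Return a 1-D triangular (bilinear) filter of the given length."""
    factor = (size + 1) // 2
    # The filter center sits on a tap for odd sizes and between taps for even sizes.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(size)]


def bilinear_kernel(size):
    """Return a size x size kernel as the outer product of the 1-D filter."""
    f = bilinear_filter_1d(size)
    return [[a * b for b in f] for a in f]


if __name__ == "__main__":
    # A 4x4 kernel is the usual choice for 2x upsampling with stride 2.
    for row in bilinear_kernel(4):
        print(row)
```

In this sketch the same kernel would be copied into each channel's weights of the transposed convolution, so that the layer starts out as plain bilinear interpolation.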
- Documentation
- BugFix
- Single GPU Performance Profile
- Preprocessing ops:
- image_center_crop: https://github.com/PaddlePaddle/Paddle/pull/11245
- crop_op updating: https://github.com/PaddlePaddle/Paddle/pull/11293
- cast_op updating: https://github.com/PaddlePaddle/Paddle/pull/11442
- API reference:
- [WIP] Profiling: single-GPU performance
- Pull Requests
- Finish argmin and argmax operators (both CPU and GPU kernels)
- Fix identifier errors in benchmark model vgg.py
- Refine ZeroGradFunctor implementation in activation_op.h
- Issues
- Identifier error bug in benchmark model vgg.py and stacked_dynamic_lstm.py
- Issues on paddle installation and benchmark model running
- Code Review
- Review argsort op and make suggestions on CPU kernel implementation
- Metric learning
- triplet loss op[WIP]
- Enhance Print op to print tensors on specified CUDA devices.
- Others
- Fix and merge mean iou op
- Review and validate ICNet
- Review
- PR
- [Merged] Infer multi-threads API Demo and UT https://github.com/PaddlePaddle/Paddle/pull/11247
- [Merged] add initial cpu memory flag in MB for infer https://github.com/PaddlePaddle/Paddle/pull/11392
- [WIP] fix unknown use_mkldnn flag https://github.com/PaddlePaddle/Paddle/pull/11395
- [WIP] scope thread safe https://github.com/PaddlePaddle/Paddle/pull/11258
- code review
- [Merged] MKLDNN layout: Support for convolution operator https://github.com/PaddlePaddle/Paddle/pull/11099
- [Merged] MKLDNN layout: Support for pool operator https://github.com/PaddlePaddle/Paddle/pull/11101
- [Merged] MKLDNN layout: Support for batch norm operator https://github.com/PaddlePaddle/Paddle/pull/11098
- [Merged] Add an interface to set the number of threads for math function, and set the default value to 1 for inference. https://github.com/PaddlePaddle/Paddle/pull/10789
- issue
- [fixed] cpu memory issue https://github.com/PaddlePaddle/Paddle/issues/11272
- [WIP] test fail with fluid only, unknown use_mkldnn https://github.com/PaddlePaddle/Paddle/issues/11393
- mklml
- Reproduced the performance with the v2 API, but fluid ParallelDo cannot match the performance of the v2 API.
- paddle fluid framework
- fix build on mac
- paddle document
- update split_lod_tensor, create_array and array_length doc
- distributed lookup table
- outvar must be created in local scope for prefetch
- Refine prefetch
- Add merge_ids_op
- fix distribute_transpiler
- NMT:
- Refine the beam_search PR and add unittests and docs.
- Experiments on WMT16 en-de dataset (BPE data and weight sharing).
- Compare with Tensor2Tensor and tune the model with new features (BPE data and weight sharing)
- test data: newstest2013, BLEU paper reported: 25.8
- multi_bleu: Fluid: 24.68; T2T: 23.5
- t2t-bleu (uncased, cased): Fluid: (27.41, 26.93); T2T: (26.40, 25.89)
- Review:
- DeepASR:
- Acoustic model training;
- The enhancement & integration of decoder
- Transformer:
- Catch up with the training strategies in T2T;
- Model training of en-de translation model
- Detection: Add GPU implementation and enhance CPU computation in Argsort Op for Faster RCNN https://github.com/PaddlePaddle/Paddle/pull/11174
- face detection:
- refine infer to eval the model https://github.com/PaddlePaddle/models/pull/986
- [WIP] check different shape input error
- reader support test: https://github.com/PaddlePaddle/Paddle/pull/11390
- Documentation:
- Fixes:
- Distribution:
- Add robust cases support into fluid benchmark and add robust case into ce-latest-kpi
- Fix the bug that ParallelExecutor hangs when GPU memory is insufficient
- Update the document of transpiler's parameter split strategy
- Fix the bug that Parameter Server does not add in the control block
- EDL:
- Fix CRD bugs in Dockerfile and edl_controller.yaml, which caused build and startup failures
- cuptiFinalize remove
- Fix activation op doc
- make status thread-safe
- Merge release 0.12.0 and 0.13.0 to master
- fix sparse var in dist train
- memory optimize transpiler (add inplace)
- mixed precision (fp16) training
- eigen memory alignment
- eigen change default device to threadpool
- PR
- Enable CPU on Parallel executor
- Check SSA Graph
- Fuse AllReduce Operator
- Refine multi thread cpu parallel exe
- SE-ResNeXt-152 multi card acceleration ratio tuning process
- Add conv3d/pool3d/conv3d_trans Python API
- Review
- overlap rpc op memcpy in distributed training
- Fix NCCLBcast hang up bug in Parallel Executor
- Add lock to record_event
- doc fix & fix generate doc error
- book v2 fix: machine_translation
- validate models to be merged
- CE duty shift mechanism
- CE model reconstruction scheme
- Celery distributed framework research:
- CE model reconstruction:
- paddle bad code fix
- NMT Transformer distributed training runs locally, but core dumps on paddlecloud.
- PR
- fix bug: get var return null, https://github.com/PaddlePaddle/Paddle/pull/11431
- distribute training script, https://github.com/PaddlePaddle/models/pull/982
- Re-write book chapters (Eng & Chi), demo programs (train.py) and Jupyter Notebook in Fluid High Level API syntax:
- https://github.com/PaddlePaddle/book/pull/545
- https://github.com/PaddlePaddle/book/pull/541
- https://github.com/PaddlePaddle/book/pull/530
- https://github.com/PaddlePaddle/book/commit/928d178c3a1b206eab54ab44047f15e70dd8cf0b
- https://github.com/PaddlePaddle/book/commit/1b3bb17befaf08310d3f0cb1b1d5dc64a06acd84
- https://github.com/PaddlePaddle/book/commit/48f26463b2ec46d0e59eafa8712a6b60589750f3
- https://github.com/PaddlePaddle/book/commit/7fbabdd62394c217c05904f6b76a5c5c30838244
- The new PaddlePaddle official website
- discussed with Beijing team and clarified the requirements
- Review:
- Add plot support in jupyter for book chapter 5 https://github.com/PaddlePaddle/book/pull/546
- Rewrite document and code in jupyter for book chapter 6 sentiment analysis https://github.com/PaddlePaddle/book/pull/539
- Review:
- https://github.com/PaddlePaddle/book/pull/545
- https://github.com/PaddlePaddle/book/pull/544
- PR
- Re-write book chapter image classification: https://github.com/PaddlePaddle/book/pull/542
- Fluid High-level-api can't infer with drop layer: https://github.com/PaddlePaddle/Paddle/pull/11323
- Fix the issue where the paddlepaddle.org docker tool is not working: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/485
- Review and Issues
- https://github.com/PaddlePaddle/Paddle/issues/11299
- https://github.com/PaddlePaddle/book/pull/541#pullrequestreview-126998051
- https://github.com/PaddlePaddle/book/pull/544#pullrequestreview-127776258
- https://github.com/PaddlePaddle/book/pull/539#pullrequestreview-128587195
- https://github.com/PaddlePaddle/PaddlePaddle.org/issues/484#issuecomment-395910304
- https://github.com/PaddlePaddle/PaddlePaddle.org/issues/484#issuecomment-395914398
- Build offline real-time search for all documentation - in WIP PR: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/481
- Work with Synopsys, Intel folks on ONNX issues
- [Merged] continue to Modify Pybind LoDTensor API according to length-based LoD:
- [Merged] continue to Modify lod tensor doc based on new LoDTensor Python API:
- Review
- Fix the incremental build issue
- https://github.com/PaddlePaddle/Paddle/pull/11378
- Read tensorflow design doc and papers
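The length-based LoD mentioned in the LoDTensor items above can be illustrated with a small sketch. It assumes the two usual representations of sequence boundaries: offsets into the flattened data (for example `[[0, 2, 5]]`) and per-sequence lengths (for example `[[2, 3]]`). The helper names are hypothetical and are not part of the Paddle Python API.

```python
def lengths_to_offsets(length_lod):
    """Convert a length-based LoD (one list per level) to offset-based form."""
    offset_lod = []
    for level in length_lod:
        offsets = [0]
        for n in level:
            # Each offset is the running sum of the sequence lengths so far.
            offsets.append(offsets[-1] + n)
        offset_lod.append(offsets)
    return offset_lod


def offsets_to_lengths(offset_lod):
    """Convert an offset-based LoD back to length-based form."""
    return [[b - a for a, b in zip(level, level[1:])] for level in offset_lod]


if __name__ == "__main__":
    # Two sequences of lengths 2 and 3 packed into one tensor of 5 rows.
    print(lengths_to_offsets([[2, 3]]))
    print(offsets_to_lengths([[0, 2, 5]]))
```

The length-based form is easier to write by hand, since each entry is just the size of one sequence, while the offset form gives direct slice boundaries into the packed data.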