accelerate calculation mechanism and accelerate training mechanism #124

Closed
shenhuinuist opened this issue Sep 27, 2016 · 2 comments

@shenhuinuist

According to Paddle's documentation, sparse training is usually used to accelerate computation when the input is sparse, high-dimensional data, and sparse updates are not applicable to dense input.
I see that Paddle speeds up matrix multiplication by calling external math libraries. Is there any mechanism to accelerate computation or training in Paddle, especially when the input is dense? Can you show me more details?
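
For context, a minimal NumPy sketch (not Paddle's implementation) of why sparse updates pay off for sparse, high-dimensional input: only the rows touched by a batch are updated, while a dense update would rewrite the whole parameter matrix.

```python
# A minimal sketch, assuming a plain embedding-style parameter matrix
# (not Paddle's implementation): with sparse, high-dimensional input only
# the rows touched by a batch need a gradient update.
import numpy as np

vocab_size, hidden = 100_000, 128            # high-dimensional sparse input
weight = np.zeros((vocab_size, hidden), dtype=np.float32)

# A batch where each example activates only a handful of feature ids.
batch_ids = np.array([3, 17, 42, 99_999])    # the non-zero input dimensions
grad_rows = np.random.rand(len(batch_ids), hidden).astype(np.float32)

lr = 0.01
# Sparse update: touch 4 rows instead of all 100,000.
weight[batch_ids] -= lr * grad_rows

# A dense update would materialize and subtract a full (vocab_size, hidden)
# gradient even though most of it is zero -- wasteful for sparse input.
```

In the dense case there are no zero rows to skip, so this trick buys nothing; any acceleration has to come from the matrix kernels themselves, as discussed in the reply below.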

@reyoung
Collaborator

reyoung commented Sep 27, 2016

Basically, there is no special optimization for dense matrices.

In detail, some AVX and SSE code has been written for gradient merging. At Baidu, we use MKL for dense matrix computation; it is hard to write code faster than MKL in general.

The open-source version supports MKL too, but you need to buy an MKL license to use it. A student license might be a better fit for academic use.
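
To illustrate the point about external math libraries, here is a minimal sketch (not Paddle-specific code): a dense matmul in NumPy is dispatched to whatever BLAS backend the build links against (MKL, OpenBLAS, ...), so the heavy lifting for dense input is a single highly tuned GEMM call rather than framework-level code.

```python
# A minimal sketch (not Paddle-specific code): a dense matmul in NumPy is
# dispatched to whatever BLAS backend the build links against (MKL, OpenBLAS,
# ...), so the heavy lifting for dense input is a single tuned GEMM call.
import time
import numpy as np

np.show_config()        # shows which BLAS/LAPACK this NumPy build uses

n = 2048
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

start = time.perf_counter()
c = a @ b               # one call into the backend's sgemm kernel
print(f"{n}x{n} float32 matmul: {time.perf_counter() - start:.3f}s")
```

The same idea applies inside Paddle: dense layers ultimately call into the linked BLAS's GEMM, which is why building against MKL is the main dense-path optimization mentioned above.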

@shenhuinuist
Author

@reyoung Thank you so much!

@reyoung reyoung closed this as completed Sep 28, 2016