[API] Optimize `paddle.where` and `paddle.where_` in eager mode #69556

HydrogenSulfate · 2024-11-20T12:07:49Z

PR Category

Performance Optimization

PR Types

Improvements

Description

Pcard-75624

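The original implementation of paddle.where and paddle.where_ performed implicit broadcasting by calling basic binary operators several times so that cond, x, and y ended up with the same shape. After this optimization, the broadcast result shape is first computed with broadcast_shape, which involves no tensor computation, and the inputs are then broadcast with broadcast_to. This greatly simplifies the code logic and removes unnecessary forward and backward operator overhead.
Because fewer operators are now involved, some of the dynamic graph unit tests for where_ were re-adapted, and a few static graph unit tests in which the placeholder shape did not match the shape of the actual data were fixed.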
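
To make the "shape first, then broadcast" idea concrete, here is a minimal pure-Python sketch of how a broadcast result shape can be derived from shapes alone, without touching any tensor data. It assumes the usual NumPy-style broadcasting rules; the function below is a hypothetical stand-in written for illustration and is not Paddle's implementation of `paddle.broadcast_shape`.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the broadcast result shape for the given shapes.

    Illustrative only: follows NumPy-style rules and works purely on
    tuples of ints, so no tensor computation is involved.
    """
    result = []
    # Align shapes from the trailing dimension; missing leading dims count as 1.
    for dims in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        non_one = {d for d in dims if d != 1}
        if len(non_one) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        result.append(non_one.pop() if non_one else 1)
    return tuple(reversed(result))


# Hypothetical shapes for condition, x and y.
cond_shape, x_shape, y_shape = (4, 1), (1, 3), (3,)
target = broadcast_shape(cond_shape, x_shape, y_shape)
print(target)  # (4, 3)
```

Once the target shape is known, each input only needs a single explicit broadcast to that shape, instead of being pushed through several binary operators.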

Note

broadcast_shape and broadcast_to do not yet fully support dynamic shape, so this PR only modifies the dynamic graph branch.

paddle-bot · 2024-11-20T12:07:54Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

zyfncg · 2024-11-22T02:59:54Z

python/paddle/tensor/search.py

+    broadcast_shape = paddle.broadcast_shape(x_shape, y_shape)
+    broadcast_shape = paddle.broadcast_shape(broadcast_shape, condition_shape)
+
+    broadcast_x = x
+    broadcast_y = y
+    broadcast_condition = condition
+
+    if condition_shape != broadcast_shape:
+        broadcast_condition = paddle.broadcast_to(
+            broadcast_condition, broadcast_shape
+        )
+    if x_shape != broadcast_shape:
+        broadcast_x = paddle.broadcast_to(broadcast_x, broadcast_shape)
+    if y_shape != broadcast_shape:
+        broadcast_y = paddle.broadcast_to(broadcast_y, broadcast_shape)
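
The three `if` checks in the diff above skip `broadcast_to` for any input that already has the target shape. A minimal pure-Python sketch of that decision is shown below; the helper name `inputs_to_broadcast` and the example shapes are hypothetical and only illustrate the comparison, not the actual Paddle code path.

```python
def inputs_to_broadcast(shapes, target):
    """Return the names of inputs whose shape differs from the target shape."""
    target = tuple(target)
    return [name for name, shape in shapes.items() if tuple(shape) != target]


# Hypothetical shapes: x already matches the target, so it is left untouched.
shapes = {"condition": (4, 1), "x": (4, 3), "y": (3,)}
print(inputs_to_broadcast(shapes, (4, 3)))  # ['condition', 'y']
```

When all three inputs already share one shape, the list is empty and no broadcast operator would be issued at all.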


Please confirm whether broadcast_shape and broadcast_to support dynamic shape. If they do not, there may be problems in static graph mode.

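I looked into it. The code before this change used add, an operator that supports dynamic shape, to perform the implicit broadcast, so static graph mode supported dynamic shape. The new approach relies on broadcast_shape and broadcast_to, and at least one of these two operators has a bug in how it handles -1, which leads to incorrect runtime results in static graph mode. The temporary solution is therefore to put the new logic inside the dynamic_mode branch, while PIR keeps using the original code. This does not affect existing functionality.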
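
The thread does not say which operator mishandles -1 or in what way, so the following is only a sketch of why a -1 placeholder needs special treatment when a shape is computed ahead of time. It assumes -1 marks a dimension whose size is known only at run time; `naive_dim`, `symbolic_dim`, and `broadcast_shape` are hypothetical helpers and do not reflect Paddle internals.

```python
from itertools import zip_longest

UNKNOWN = -1  # assumed placeholder for a dimension only known at run time


def naive_dim(a, b):
    """Broadcast two dims, treating every value as a concrete size."""
    if a == 1:
        return b
    if b == 1 or a == b:
        return a
    raise ValueError(f"cannot broadcast {a} with {b}")


def symbolic_dim(a, b):
    """Broadcast two dims, propagating UNKNOWN when the size is undecidable."""
    if UNKNOWN in (a, b):
        known = b if a == UNKNOWN else a
        # unknown vs 1 (or unknown) stays unknown; unknown vs n > 1 must be n
        return known if known not in (1, UNKNOWN) else UNKNOWN
    return naive_dim(a, b)


def broadcast_shape(s1, s2, dim_rule):
    """Broadcast two shapes from the trailing dimension using dim_rule."""
    pairs = zip_longest(reversed(s1), reversed(s2), fillvalue=1)
    return tuple(reversed([dim_rule(a, b) for a, b in pairs]))


print(broadcast_shape((UNKNOWN, 3), (4, 3), symbolic_dim))  # (4, 3)
print(broadcast_shape((UNKNOWN, 3), (1, 3), symbolic_dim))  # (-1, 3)
```

With `naive_dim`, the first call would raise a `ValueError`, because -1 is compared against 4 as if it were a real size. An elementwise operator such as add avoids this class of problem by resolving shapes at run time, which is consistent with the decision to keep the original code for the static graph path.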

Summary of this PR:

1. upload DPA-1 related code
2. merge many changes from develop
3. add all eager composite operators except `softmax_grad`, `p_norm_grad`, `split_grad`, and `concat_grad` to the composite operator blacklist (<https://github.com/deepmodeling/deepmd-kit/pull/4414/files#diff-e678abb052b278f8a479f8d13b839a9ec0effd9923478a850bc13758f918e1e9R134-R148>) to significantly improve model execution speed (reducing the time taken from 100% more than PyTorch to about 10% to 15% more).

related PR: lanpa/tensorboardX#728

### Training curve:

![training_curves_comparison_eager_opt](https://github.com/user-attachments/assets/3b71fc99-5abf-4353-a61a-38737d3c7f2c)

### Accuracy test(left: paddle, right: torch):

![image](https://github.com/user-attachments/assets/a42b4bfd-c0f8-4eb8-85eb-ff1adf981dbb)

Related optimization of Paddle framework:

- [x] PaddlePaddle/Paddle#69349
- [x] PaddlePaddle/Paddle#69333
- [x] PaddlePaddle/Paddle#69479
- [x] PaddlePaddle/Paddle#69515
- [x] PaddlePaddle/Paddle#69487
- [x] PaddlePaddle/Paddle#69661
- [x] PaddlePaddle/Paddle#69660
- [x] PaddlePaddle/Paddle#69596
- [x] PaddlePaddle/Paddle#69556

## Summary by CodeRabbit

## Release Notes

- **New Features**
  - Introduced several new classes for molecular descriptors, including `DescrptDPA1`, `DescrptBlockSeAtten`, and `LayerNorm`, enhancing the modeling capabilities for molecular simulations.
  - Added new JSON configuration files for model parameters and multitask models related to water simulations.
  - Implemented new test classes for validating the functionality of the `DPAtomicModel` and various descriptor classes.
  - Added new test classes for evaluating denoising models, including `TestDenoiseModelDPA1` and `TestDenoiseModelDPA2`.
  - Enhanced the `ModelWrapper` class to clarify the handling of model parameters and state management.
- **Bug Fixes**
  - Improved internal logic for handling model state saving and loading, ensuring consistency in outputs.
- **Documentation**
  - Enhanced type hints and return annotations across various classes and methods for better clarity.
- **Tests**
  - Expanded the testing framework with new test cases for denoising models and descriptor functionalities, ensuring robust validation of features.
  - Activated previously skipped tests for energy models, improving test coverage.
  - Enhanced multitask training tests with new configuration handling and test classes.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

Support DPA-2 in paddle backend. This PR will be updated after #4414 is merged.

### Training curve:

![training_curves_comparison_dpa2](https://github.com/user-attachments/assets/29bdeffa-cf2d-4586-afcf-7df0569997c3)

### Accuracy test(left: paddle, right: torch):

![image](https://github.com/user-attachments/assets/5bff55f3-1c39-4b95-93f0-68783e794716)

Related optimization of Paddle framework:

- [x] PaddlePaddle/Paddle#69349
- [x] PaddlePaddle/Paddle#69333
- [x] PaddlePaddle/Paddle#69479
- [x] PaddlePaddle/Paddle#69515
- [x] PaddlePaddle/Paddle#69487
- [x] PaddlePaddle/Paddle#69661
- [x] PaddlePaddle/Paddle#69660
- [x] PaddlePaddle/Paddle#69596
- [x] PaddlePaddle/Paddle#69556

## Summary by CodeRabbit

- **New Features**
  - Introduced new classes for molecular descriptors: `DescrptDPA2`, `DescrptBlockRepformers`, `DescrptSeTTebd`, and `DescrptBlockSeTTebd`.
  - Added new functions for tensor operations and descriptor management, enhancing the capabilities of the module.
  - Updated JSON configurations for multitask models to refine selection criteria and data paths.
- **Bug Fixes**
  - Improved error handling and parameter validation across various descriptor classes.
- **Documentation**
  - Enhanced test coverage for new descriptor functionalities and configurations.
- **Tests**
  - Added new test classes to validate the functionality of `DescrptDPA2` and multitask training scenarios.
  - Expanded test capabilities for descriptor classes based on installed dependencies.
  - Updated existing tests to support new configurations and functionalities.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

optimize where

092d803

fix code

b157596

zyfncg reviewed Nov 22, 2024

split code into dynamic and pir mode

b4396d9

HydrogenSulfate force-pushed the fix_where3 branch from df5af98 to b4396d9 November 22, 2024 06:58

HydrogenSulfate changed the title ~~optimize where~~ [API] Optimize paddle.where and paddle.where_ in eager mode Nov 22, 2024

zyfncg previously approved these changes Nov 22, 2024

HydrogenSulfate mentioned this pull request Nov 25, 2024

pd: support dpa1 deepmodeling/deepmd-kit#4414

Merged

9 tasks

Merge branch 'PaddlePaddle:develop' into fix_where3

b49928c

HydrogenSulfate mentioned this pull request Nov 25, 2024

pd: support dpa2 deepmodeling/deepmd-kit#4418

Merged

9 tasks

fix where for pir/old ir bug

23f1da3

HydrogenSulfate dismissed zyfncg’s stale review via 23f1da3 November 27, 2024 07:35

Merge branch 'PaddlePaddle:develop' into fix_where3

c1dafbf

zyfncg approved these changes Nov 28, 2024

HydrogenSulfate merged commit 510b3ed into PaddlePaddle:develop Nov 28, 2024
28 checks passed

HydrogenSulfate deleted the fix_where3 branch November 28, 2024 07:39
