[API] Optimize paddle.where and paddle.where_ in eager mode #69556
Your PR has been submitted successfully. Thank you for your contribution to the open source project!
python/paddle/tensor/search.py
Outdated
broadcast_shape = paddle.broadcast_shape(x_shape, y_shape)
broadcast_shape = paddle.broadcast_shape(broadcast_shape, condition_shape)

broadcast_x = x
broadcast_y = y
broadcast_condition = condition

if condition_shape != broadcast_shape:
    broadcast_condition = paddle.broadcast_to(
        broadcast_condition, broadcast_shape
    )
if x_shape != broadcast_shape:
    broadcast_x = paddle.broadcast_to(broadcast_x, broadcast_shape)
if y_shape != broadcast_shape:
    broadcast_y = paddle.broadcast_to(broadcast_y, broadcast_shape)
Please confirm whether broadcast_shape and broadcast_to support dynamic shapes. If they do not, this may cause problems under the static graph.
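I took a look. The code before this change used add, an operator that supports dynamic shapes, to broadcast implicitly, so the static graph supported dynamic shapes. The new approach depends on broadcast_shape and broadcast_to, and at least one of these two operators has a bug in its handling of -1, which makes the run-time results under the static graph incorrect. The temporary solution for now is to put the new code logic under dynamic_mode, while PIR keeps using the original code. This way existing functionality is not affected.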
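To make the -1 concern concrete, here is a minimal pure-Python sketch of broadcast-shape inference. It is not Paddle's implementation and does not identify the actual bug; the helper names `broadcast_dim_static`, `broadcast_dim_dynamic`, and `broadcast_shape` below are hypothetical. It only illustrates, under the assumption that -1 marks a dimension unknown until run time, why a rule written for fully known sizes can mishandle such a dimension.

```python
def broadcast_dim_static(da, db):
    """Broadcast two dimension sizes, assuming both are fully known."""
    if da == db or db == 1:
        return da
    if da == 1:
        return db
    raise ValueError(f"cannot broadcast {da} with {db}")


def broadcast_dim_dynamic(da, db):
    """Same rule, but -1 is treated as 'unknown until run time' (an assumption)."""
    if da == -1 and db == -1:
        return -1
    if da == -1 or db == -1:
        known = db if da == -1 else da
        # unknown vs 1 stays unknown; unknown vs n > 1 can only result in n.
        return -1 if known == 1 else known
    return broadcast_dim_static(da, db)


def broadcast_shape(a, b, dim_rule):
    """Right-align two shapes, pad with 1s, and apply dim_rule per dimension."""
    n = max(len(a), len(b))
    a = [1] * (n - len(a)) + list(a)
    b = [1] * (n - len(b)) + list(b)
    return [dim_rule(da, db) for da, db in zip(a, b)]


if __name__ == "__main__":
    # The dynamic-aware rule resolves the unknown dimension symbolically.
    print(broadcast_shape([-1, 4], [3, 1], broadcast_dim_dynamic))
    # The static rule reads -1 as a literal size and rejects the same input.
    try:
        broadcast_shape([-1, 4], [3, 1], broadcast_dim_static)
    except ValueError as exc:
        print("static rule failed:", exc)
```

An operator that only implements the static rule would either reject or miscompute shapes containing -1, which is consistent with keeping the original code path for PIR until dynamic shapes are handled.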
Force-pushed from df5af98 to b4396d9.
Summary of this PR: 1. upload DPA-1 related code 2. merge much develop code 3. add all eager composite operators except `softmax_grad`, `p_norm_grad`, `split_grad`, and `concat_grad` to the composite operator blacklist(<https://github.com/deepmodeling/deepmd-kit/pull/4414/files#diff-e678abb052b278f8a479f8d13b839a9ec0effd9923478a850bc13758f918e1e9R134-R148>) to significantly improve model execution speed (reducing the time taken from 100% more than PyTorch to about 10% to 15% more). related PR: lanpa/tensorboardX#728 ### Training curve: ![training_curves_comparison_eager_opt](https://github.com/user-attachments/assets/3b71fc99-5abf-4353-a61a-38737d3c7f2c) ### Accuracy test(left: paddle, right: torch): ![image](https://github.com/user-attachments/assets/a42b4bfd-c0f8-4eb8-85eb-ff1adf981dbb) Ralated optimization of Paddle framework: - [x] PaddlePaddle/Paddle#69349 - [x] PaddlePaddle/Paddle#69333 - [x] PaddlePaddle/Paddle#69479 - [x] PaddlePaddle/Paddle#69515 - [x] PaddlePaddle/Paddle#69487 - [x] PaddlePaddle/Paddle#69661 - [x] PaddlePaddle/Paddle#69660 - [x] PaddlePaddle/Paddle#69596 - [x] PaddlePaddle/Paddle#69556 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit ## Release Notes - **New Features** - Introduced several new classes for molecular descriptors, including `DescrptDPA1`, `DescrptBlockSeAtten`, and `LayerNorm`, enhancing the modeling capabilities for molecular simulations. - Added new JSON configuration files for model parameters and multitask models related to water simulations. - Implemented new test classes for validating the functionality of the `DPAtomicModel` and various descriptor classes. - Added new test classes for evaluating denoising models, including `TestDenoiseModelDPA1` and `TestDenoiseModelDPA2`. - Enhanced the `ModelWrapper` class to clarify the handling of model parameters and state management. 
- **Bug Fixes** - Improved internal logic for handling model state saving and loading, ensuring consistency in outputs. - **Documentation** - Enhanced type hints and return annotations across various classes and methods for better clarity. - **Tests** - Expanded the testing framework with new test cases for denoising models and descriptor functionalities, ensuring robust validation of features. - Activated previously skipped tests for energy models, improving test coverage. - Enhanced multitask training tests with new configuration handling and test classes. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Support DPA-2 in paddle backend. This PR will be updated after #4414 is merged. ### Training curve: ![training_curves_comparison_dpa2](https://github.com/user-attachments/assets/29bdeffa-cf2d-4586-afcf-7df0569997c3) ### Accuracy test(left: paddle, right: torch): ![image](https://github.com/user-attachments/assets/5bff55f3-1c39-4b95-93f0-68783e794716) Ralated optimization of Paddle framework: - [x] PaddlePaddle/Paddle#69349 - [x] PaddlePaddle/Paddle#69333 - [x] PaddlePaddle/Paddle#69479 - [x] PaddlePaddle/Paddle#69515 - [x] PaddlePaddle/Paddle#69487 - [x] PaddlePaddle/Paddle#69661 - [x] PaddlePaddle/Paddle#69660 - [x] PaddlePaddle/Paddle#69596 - [x] PaddlePaddle/Paddle#69556 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Introduced new classes for molecular descriptors: `DescrptDPA2`, `DescrptBlockRepformers`, `DescrptSeTTebd`, and `DescrptBlockSeTTebd`. - Added new functions for tensor operations and descriptor management, enhancing the capabilities of the module. - Updated JSON configurations for multitask models to refine selection criteria and data paths. - **Bug Fixes** - Improved error handling and parameter validation across various descriptor classes. - **Documentation** - Enhanced test coverage for new descriptor functionalities and configurations. - **Tests** - Added new test classes to validate the functionality of `DescrptDPA2` and multitask training scenarios. - Expanded test capabilities for descriptor classes based on installed dependencies. - Updated existing tests to support new configurations and functionalities. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
PR Category
Performance Optimization
PR Types
Improvements
Description
Pcard-75624
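The original implementation of paddle.where and paddle.where_ relied on implicit broadcasting through repeated calls to basic binary operators to bring cond, x, and y to the same shape. After this optimization, broadcast_shape, which involves no tensor computation, computes the broadcast shape, and broadcast_to then performs the broadcast. This greatly simplifies the code logic and removes unnecessary forward and backward operator overhead.
The PR also updates some dynamic-graph unit tests for where_ and fixes a few static-graph unit tests in which the placeholder shape did not match the shape of the actual data.
Note
Because broadcast_shape and broadcast_to are not yet fully adapted to dynamic shapes, this PR only modifies the dynamic-graph branch.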
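The difference between the two strategies can be sketched with NumPy as a stand-in for Paddle. This is an illustration under assumptions, not the code from this PR: `where_implicit` and `where_explicit` are hypothetical names, the implicit variant is only loosely modelled on the "broadcast by arithmetic" idea described above, and `np.broadcast_shapes` / `np.broadcast_to` play the roles of `paddle.broadcast_shape` / `paddle.broadcast_to`.

```python
import numpy as np


def where_implicit(cond, x, y):
    """Align shapes through arithmetic: every add computes a full-size array."""
    zeros = np.zeros_like(x) + np.zeros_like(y)
    zeros = zeros + np.zeros_like(cond, dtype=zeros.dtype)
    bx = x + zeros
    by = y + zeros
    bcond = (cond.astype(zeros.dtype) + zeros) != 0
    return np.where(bcond, bx, by)


def where_explicit(cond, x, y):
    """Compute the target shape from metadata only, then broadcast if needed."""
    shape = np.broadcast_shapes(cond.shape, x.shape, y.shape)
    bcond = cond if cond.shape == shape else np.broadcast_to(cond, shape)
    bx = x if x.shape == shape else np.broadcast_to(x, shape)
    by = y if y.shape == shape else np.broadcast_to(y, shape)
    return np.where(bcond, bx, by)


cond = np.array([[True], [False], [True]])  # shape (3, 1)
x = np.arange(4.0).reshape(1, 4)            # shape (1, 4)
y = -np.ones(4)                             # shape (4,)

out_implicit = where_implicit(cond, x, y)
out_explicit = where_explicit(cond, x, y)
print(out_explicit.shape)
print(np.array_equal(out_implicit, out_explicit))
```

Both functions return the same values. The explicit variant skips the intermediate additions, and in a framework with autograd it would also avoid recording a backward node for each of them.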