[HOTFIX][WORKAROUND] Revert #982 -- partial W/A for FP16 SSD issue (memory access fault in SWDEV-295434) #1048

Merged: atamazov merged 1 commit into develop from wa-swdev-295434-revert-982 on Jul 23, 2021

atamazov
Contributor

Reverts #982. Partial W/A for FP32 SSD issue (memory access fault in https://ontrack-internal.amd.com/browse/SWDEV-295434).

⚠️ Merging this will cause #980 to appear again!

… 980) (#982)"

This reverts commit 823fd14.

# RESOLVED Conflicts:
#	test/CMakeLists.txt

@shaojiewang
Contributor

> Reverts #982. Partial W/A for FP32 SSD issue (memory access fault in https://ontrack-internal.amd.com/browse/SWDEV-295434).
>
> ⚠️ Merging this will cause #980 to appear again!

Yes, it will cause #980 again. So should we just close this solver?

Contributor

@shaojiewang left a comment

maybe close the solver? or remove the relevant ctest case?

@atamazov
Contributor Author

> maybe close the solver?

This may affect performance. If CQE reports the issue and a full-blown fix is not ready, then we'll consider this possibility.

> or remove the relevant ctest case?

It should have been removed in this PR by reverting #982, but it was not, because the test case had already been disabled as a workaround for issue #995. That is why you do not see changes to test/CMakeLists.txt in this PR.

Please note that removing test cases is generally not the way to go, because camouflaging an issue may do more harm than the issue itself.

@junliume
Collaborator

> maybe close the solver? or remove the relevant ctest case?

@shaojiewang here is the proposed workflow: #1049

@atamazov merged commit 001fbbd into develop on Jul 23, 2021
@atamazov deleted the wa-swdev-295434-revert-982 branch on July 23, 2021 23:55
atamazov pushed a commit that referenced this pull request on Jul 27, 2021
…(part 2/2 of the real fix for SWDEV-295434) (#1045)

* fix compute error and mem fault in ssd. cc SWDEV-295434
* add regression test
* fix bug in asm igemm nchw wrw solver
* fix bug in kernel sel
* update comments
* check groups